Showing 1–13 of 13 results for author: Leong, P H

  1. arXiv:2506.18003  [pdf, ps, other]

    cs.AR

    AMD Versal Implementations of FAM and SSCA Estimators

    Authors: Carol Jingyi Li, Ruilin Wu, Philip H. W. Leong

    Abstract: Cyclostationary analysis is widely used in signal processing, particularly in the analysis of human-made signals, and spectral correlation density (SCD) is often used to characterise cyclostationarity. Unfortunately, for real-time applications, even utilising the fast Fourier transform (FFT), the high computational complexity associated with estimating the SCD limits its applicability. In this wor…

    Submitted 22 June, 2025; originally announced June 2025.

  2. arXiv:2501.05033  [pdf, other]

    cs.AR cs.IT

    Towards High-Performance Network Coding: FPGA Acceleration With Bounded-value Generators

    Authors: Jiaxin Qing, Philip H. W. Leong, Kin Hong Lee, Raymond W. Yeung

    Abstract: Network coding enhances performance in network communications and distributed storage by increasing throughput and robustness while reducing latency. Batched Sparse (BATS) codes are a class of capacity-achieving network codes, but their practical applications are hindered by their structure, computational intensity, and power demands of finite field operations. Most literature focuses on algorithm…

    Submitted 9 January, 2025; originally announced January 2025.

  3. arXiv:2412.06806  [pdf, other]

    physics.optics cs.CV

    A Physics-Inspired Deep Learning Framework with Polar Coordinate Attention for Ptychographic Imaging

    Authors: Han Yue, Jun Cheng, Yu-Xuan Ren, Chien-Chun Chen, Grant A. van Riessen, Philip Heng Wai Leong, Steve Feng Shu

    Abstract: Ptychographic imaging confronts inherent challenges in applying deep learning for phase retrieval from diffraction patterns. Conventional neural architectures, both convolutional neural networks and Transformer-based methods, are optimized for natural images with Euclidean spatial neighborhood-based inductive biases that exhibit geometric mismatch with the concentric coherent patterns characterist…

    Submitted 2 May, 2025; v1 submitted 25 November, 2024; originally announced December 2024.

    Comments: 13 pages, 10 figures

    MSC Class: 68T07; 68U10

  4. arXiv:2406.05999  [pdf, other]

    cs.AR cs.AI cs.LG

    fSEAD: a Composable FPGA-based Streaming Ensemble Anomaly Detection Library

    Authors: Binglei Lou, David Boland, Philip H. W. Leong

    Abstract: Machine learning ensembles combine multiple base models to produce a more accurate output. They can be applied to a range of machine learning problems, including anomaly detection. In this paper, we investigate how to maximize the composability and scalability of an FPGA-based streaming ensemble anomaly detector (fSEAD). To achieve this, we propose a flexible computing architecture consisting of m…

    Submitted 9 June, 2024; originally announced June 2024.

    Comments: The source code for this paper is available at: https://github.com/bingleilou/fSEAD

    Journal ref: ACM Transactions on Reconfigurable Technology and Systems (TRETS), 16, 3, Article 42 (2023). Journal Track of The International Conference on Field Programmable Technology (FPT'22), Hong Kong SAR, China

  5. arXiv:2406.04910  [pdf, other]

    cs.LG cs.AI cs.AR

    PolyLUT-Add: FPGA-based LUT Inference with Wide Inputs

    Authors: Binglei Lou, Richard Rademacher, David Boland, Philip H. W. Leong

    Abstract: FPGAs have distinct advantages as a technology for deploying deep neural networks (DNNs) at the edge. Lookup Table (LUT) based networks, where neurons are directly modeled using LUTs, help maximize this promise of offering ultra-low latency and high area efficiency on FPGAs. Unfortunately, LUT resource usage scales exponentially with the number of inputs to the LUT, restricting PolyLUT to small LU…

    Submitted 15 September, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: The source code for this paper is available at: https://github.com/bingleilou/PolyLUT-Add

    Journal ref: International Conference on Field-Programmable Logic and Applications (FPL2024) in Turin, Italy, from 2nd to 6th September 2024

  6. arXiv:2303.15860  [pdf, other]

    cs.IT cs.LG

    The Wyner Variational Autoencoder for Unsupervised Multi-Layer Wireless Fingerprinting

    Authors: Teng-Hui Huang, Thilini Dahanayaka, Kanchana Thilakarathna, Philip H. W. Leong, Hesham El Gamal

    Abstract: Wireless fingerprinting refers to a device identification method leveraging hardware imperfections and wireless channel variations as signatures. Beyond physical layer characteristics, recent studies demonstrated that user behaviors could be identified through network traffic, e.g., packet length, without decryption of the payload. Inspired by these results, we propose a multi-layer fingerprinting…

    Submitted 28 August, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

  7. NITI: Training Integer Neural Networks Using Integer-only Arithmetic

    Authors: Maolin Wang, Seyedramin Rasoulinezhad, Philip H. W. Leong, Hayden K. H. So

    Abstract: While integer arithmetic has been widely adopted for improved performance in deep quantized neural network inference, training remains a task primarily executed using floating point arithmetic. This is because both high dynamic range and numerical accuracy are central to the success of most modern training algorithms. However, due to its potential for computational, storage and energy advantages i…

    Submitted 11 February, 2022; v1 submitted 28 September, 2020; originally announced September 2020.

  8. LUXOR: An FPGA Logic Cell Architecture for Efficient Compressor Tree Implementations

    Authors: SeyedRamin Rasoulinezhad, Siddhartha, Hao Zhou, Lingli Wang, David Boland, Philip H. W. Leong

    Abstract: We propose two tiers of modifications to FPGA logic cell architecture to deliver a variety of performance and utilization benefits with only minor area overheads. In the first tier, we augment existing commercial logic cell datapaths with a 6-input XOR gate in order to improve the expressiveness of each element, while maintaining backward compatibility. This new architecture is vendor-agnostic, and…

    Submitted 6 March, 2020; originally announced March 2020.

    Comments: In Proceedings of the 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA'20), February 23-25, 2020, Seaside, CA, USA

    ACM Class: B.2.1; C.0

  9. MajorityNets: BNNs Utilising Approximate Popcount for Improved Efficiency

    Authors: Seyedramin Rasoulinezhad, Sean Fox, Hao Zhou, Lingli Wang, David Boland, Philip H. W. Leong

    Abstract: Binarized neural networks (BNNs) have shown exciting potential for utilising neural networks in embedded implementations where area, energy and latency constraints are paramount. With BNNs, multiply-accumulate (MAC) operations can be simplified to XnorPopcount operations, leading to massive reductions in both memory and computation resources. Furthermore, multiple efficient implementations of BNNs…

    Submitted 26 February, 2020; originally announced February 2020.

    Comments: 4 pages

    Journal ref: International Conference on Field-Programmable Technology, FPT 2019, Tianjin, China, December 9-13, 2019

  10. arXiv:1911.08097  [pdf]

    eess.SP cs.AR cs.CV

    AddNet: Deep Neural Networks Using FPGA-Optimized Multipliers

    Authors: Julian Faraone, Martin Kumm, Martin Hardieck, Peter Zipf, Xueyuan Liu, David Boland, Philip H. W. Leong

    Abstract: Low-precision arithmetic operations to accelerate deep-learning applications on field-programmable gate arrays (FPGAs) have been studied extensively, because they offer the potential to save silicon area or increase throughput. However, these benefits come at the cost of a decrease in accuracy. In this article, we demonstrate that reconfigurable constant coefficient multipliers (RCCMs) offer a bet…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: 14 pages

  11. arXiv:1909.04509  [pdf, other]

    eess.SP cs.LG cs.NE eess.IV

    Unrolling Ternary Neural Networks

    Authors: Stephen Tridgell, Martin Kumm, Martin Hardieck, David Boland, Duncan Moss, Peter Zipf, Philip H. W. Leong

    Abstract: The computational complexity of neural networks for large scale or real-time applications necessitates hardware acceleration. Most approaches assume that the network architecture and parameters are unknown at design time, permitting usage in a large number of applications. This paper demonstrates, for the case where the neural network architecture and ternary weight values are known a priori, that…

    Submitted 9 September, 2019; originally announced September 2019.

    Comments: Accepted in TRETS

  12. arXiv:1807.00301  [pdf, other]

    cs.CV

    SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks

    Authors: Julian Faraone, Nicholas Fraser, Michaela Blott, Philip H. W. Leong

    Abstract: Inference for state-of-the-art deep neural networks is computationally expensive, making them difficult to deploy on constrained hardware environments. An efficient way to reduce this complexity is to quantize the weight parameters and/or activations during training by approximating their distributions with a limited entry codebook. For very low precisions, such as binary or ternary networks with…

    Submitted 1 July, 2018; originally announced July 2018.

    Comments: Published as a conference paper at the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

  13. arXiv:1709.06262  [pdf, other]

    cs.CV

    Compressing Low Precision Deep Neural Networks Using Sparsity-Induced Regularization in Ternary Networks

    Authors: Julian Faraone, Nicholas Fraser, Giulio Gambardella, Michaela Blott, Philip H. W. Leong

    Abstract: A low precision deep neural network training technique for producing sparse, ternary neural networks is presented. The technique incorporates hardware implementation costs during training to achieve significant model compression for inference. Training involves three stages: network training using L2 regularization and a quantization threshold regularizer, quantization pruning, and finally retra…

    Submitted 9 October, 2017; v1 submitted 19 September, 2017; originally announced September 2017.

    Comments: To appear as a conference paper at the 24th International Conference On Neural Information Processing (ICONIP 2017)