Search | arXiv e-print repository

Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting

Authors: Chiu-Wai Yan, Shi Quan Foo, Van Hoan Trinh, Dit-Yan Yeung, Ka-Hing Wong, Wai-Kin Wong

Abstract: Deep learning approaches have been widely adopted for precipitation nowcasting in recent years. Previous studies mainly focus on proposing new model architectures to improve pixel-wise metrics. However, they frequently result in blurry predictions which provide limited utility to forecasting operations. In this work, we propose a new Fourier Amplitude and Correlation Loss (FACL) which consists of two novel loss terms: Fourier Amplitude Loss (FAL) and Fourier Correlation Loss (FCL). FAL regularizes the Fourier amplitude of the model prediction and FCL complements the missing phase information. The two loss terms work together to replace the traditional $L_2$ losses such as MSE and weighted MSE for the spatiotemporal prediction problem on signal-based data. Our method is generic, parameter-free and efficient. Extensive experiments using one synthetic dataset and three radar echo datasets demonstrate that our method improves perceptual metrics and meteorology skill scores, with a small trade-off to pixel-wise accuracy and structural similarity. Moreover, to improve the error margin in meteorological skill scores such as Critical Success Index (CSI) and Fractions Skill Score (FSS), we propose and adopt the Regional Histogram Divergence (RHD), a distance metric that considers the patch-wise similarity between signal-based imagery patterns with tolerance to local transforms. Code is available at https://github.com/argenycw/FACL

Submitted 30 October, 2024; originally announced October 2024.

Comments: Accepted by NeurIPS 2024. Camera-ready submission
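The abstract above refers to the Critical Success Index (CSI) without defining it. As a reading aid, here is a minimal sketch of the standard definition, CSI = hits / (hits + misses + false alarms), computed after thresholding both fields. The function name, the toy grids, and the threshold value are illustrative assumptions and are not taken from the paper or its repository.

```python
def critical_success_index(pred, obs, threshold):
    """Return hits / (hits + misses + false alarms) for 2D grids.

    A grid cell counts as an "event" when its value is >= threshold.
    Cells where neither field has an event (correct negatives) are ignored.
    """
    hits = misses = false_alarms = 0
    for pred_row, obs_row in zip(pred, obs):
        for p, o in zip(pred_row, obs_row):
            pred_event = p >= threshold
            obs_event = o >= threshold
            if pred_event and obs_event:
                hits += 1
            elif obs_event:
                misses += 1
            elif pred_event:
                false_alarms += 1
    denom = hits + misses + false_alarms
    return hits / denom if denom else float("nan")


# Hypothetical toy fields (arbitrary intensity units).
obs = [[0, 5, 9], [0, 0, 7]]
sharp = [[0, 6, 8], [0, 0, 0]]   # keeps two of the three intense cells
blurry = [[0, 4, 4], [0, 0, 4]]  # right locations, but smoothed below threshold

csi_sharp = critical_success_index(sharp, obs, threshold=5)
csi_blurry = critical_success_index(blurry, obs, threshold=5)
```

In this toy case the smoothed field places rain in all the right cells yet scores zero at the chosen threshold, which is one way to see why blurry predictions can be penalized by threshold-based skill scores.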

arXiv:2410.21076 [pdf, other]

Accelerated Bayesian parameter estimation and model selection for gravitational waves with normalizing flows

Authors: Alicja Polanska, Thibeau Wouters, Peter T. H. Pang, Kaze K. W. Wong, Jason D. McEwen

Abstract: We present an accelerated pipeline, based on high-performance computing techniques and normalizing flows, for joint Bayesian parameter estimation and model selection and demonstrate its efficiency in gravitational wave astrophysics. We integrate the Jim inference toolkit, a normalizing flow-enhanced Markov chain Monte Carlo (MCMC) sampler, with the learned harmonic mean estimator. Our Bayesian evidence estimates run on $1$ GPU are consistent with traditional nested sampling techniques run on $16$ CPU cores, while reducing the computation time by factors of $5\times$ and $15\times$ for $4$-dimensional and $11$-dimensional gravitational wave inference problems, respectively. Our code is available in well-tested and thoroughly documented open-source packages, ensuring accessibility and reproducibility for the wider research community.

Submitted 31 October, 2024; v1 submitted 28 October, 2024; originally announced October 2024.

Comments: accepted to NeurIPS 2024 workshop on Machine Learning and the Physical Sciences
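For readers unfamiliar with harmonic-mean evidence estimation, the following toy sketch may help. It uses the re-targeted identity E_posterior[φ(θ) / (L(θ) π(θ))] = 1/z, which holds for any normalized target density φ. Everything here is an assumption made for illustration: the 1D Gaussian model, the hand-picked Gaussian target (standing in for a target that would be learned, e.g. by a normalizing flow), and all function names. It does not use the Jim toolkit or any package from the paper.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1D normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def retargeted_harmonic_mean_evidence(samples, likelihood, prior, target):
    """Estimate the evidence z from posterior samples.

    Averages target / (likelihood * prior) over the samples, which
    estimates 1/z, then inverts the result.
    """
    reciprocal = sum(target(t) / (likelihood(t) * prior(t)) for t in samples)
    reciprocal /= len(samples)
    return 1.0 / reciprocal


# Toy model (assumed): prior theta ~ N(0, 1), datum y | theta ~ N(theta, 1).
# Then the posterior is N(y/2, 1/2) and the exact evidence is N(y; 0, 2).
y = 1.0
rng = random.Random(0)
posterior_samples = [rng.gauss(y / 2.0, math.sqrt(0.5)) for _ in range(20000)]

z_hat = retargeted_harmonic_mean_evidence(
    posterior_samples,
    likelihood=lambda t: normal_pdf(y, t, 1.0),
    prior=lambda t: normal_pdf(t, 0.0, 1.0),
    # Target chosen narrower than the posterior so the ratio stays bounded.
    target=lambda t: normal_pdf(t, y / 2.0, 0.25),
)
z_true = normal_pdf(y, 0.0, 2.0)
```

Choosing a target with lighter tails than the posterior keeps the variance of the estimator finite; the original harmonic mean estimator corresponds to using the prior as the target, which is the case that tends to behave badly.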

arXiv:2410.19956 [pdf, other]

Gravitational-Wave Parameter Estimation in non-Gaussian noise using Score-Based Likelihood Characterization

Authors: Ronan Legin, Maximiliano Isi, Kaze W. K. Wong, Yashar Hezaveh, Laurence Perreault-Levasseur

Abstract: Gravitational-wave (GW) parameter estimation typically assumes that instrumental noise is Gaussian and stationary. Obvious departures from this idealization are typically handled on a case-by-case basis, e.g., through bespoke procedures to "clean" non-Gaussian noise transients (glitches), as was famously the case for the GW170817 neutron-star binary. Although effective, manipulating the data in this way can introduce biases in the inference of key astrophysical properties, like binary precession, and compound in unpredictable ways when combining multiple observations; alternative procedures free of the same biases, like joint inference of noise and signal properties, have so far proved too computationally expensive to execute at scale. Here we take a different approach: rather than explicitly modeling individual non-Gaussianities to then apply the traditional GW likelihood, we seek to learn the true distribution of instrumental noise without presuming Gaussianity and stationarity in the first place. Assuming only noise additivity, we employ score-based diffusion models to learn an empirical noise distribution directly from detector data and then combine it with a deterministic waveform model to provide an unbiased estimate of the likelihood function. We validate the method by performing inference on a subset of GW parameters from 400 mock observations, containing real LIGO noise from either the Livingston or Hanford detectors. We show that the proposed method can recover the true parameters even in the presence of loud glitches, and that the inference is unbiased over a population of signals without applying any cleaning to the data. This work provides a promising avenue for extracting unbiased source properties in future GW observations over the coming decade.

Submitted 25 October, 2024; originally announced October 2024.

Comments: 10 pages, 3 figures

arXiv:2410.16565 [pdf, other]

doi 10.3847/1538-4357/adc681

Search for gravitational waves emitted from SN 2023ixf

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, A. Allocca, et al. (1758 additional authors not shown)

Abstract: We present the results of a search for gravitational-wave transients associated with core-collapse supernova SN 2023ixf, which was observed in the galaxy Messier 101 via optical emission on 2023 May 19th, during the LIGO-Virgo-KAGRA 15th Engineering Run. We define a five-day on-source window during which an accompanying gravitational-wave signal may have occurred. No gravitational waves have been identified in data when at least two gravitational-wave observatories were operating, which covered $\sim 14\%$ of this five-day window. We report the search detection efficiency for various possible gravitational-wave emission models. Considering the distance to M101 (6.7 Mpc), we derive constraints on the gravitational-wave emission mechanism of core-collapse supernovae across a broad frequency spectrum, ranging from 50 Hz to 2 kHz where we assume the gravitational-wave emission occurred when coincident data are available in the on-source window. Considering an ellipsoid model for a rotating proto-neutron star, our search is sensitive to gravitational-wave energy $1 \times 10^{-4} M_{\odot} c^2$ and luminosity $2.6 \times 10^{-4} M_{\odot} c^2/s$ for a source emitting at 82 Hz. These constraints are around an order of magnitude more stringent than those obtained so far with gravitational-wave data. The constraint on the ellipticity of the proto-neutron star that is formed is as low as 1.08, at frequencies above 1200 Hz, surpassing past results.

Submitted 11 March, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

Comments: Main paper: 6 pages, 4 figures and 1 table. Total with appendices: 20 pages, 4 figures, and 1 table

Report number: LIGO-P2400125

Journal ref: ApJ 985 183 (2025)

arXiv:2410.15977 [pdf, other]

doi 10.1109/TPAMI.2024.3483654

Enabling Energy-Efficient Deployment of Large Language Models on Memristor Crossbar: A Synergy of Large and Small

Authors: Zhehui Wang, Tao Luo, Cheng Liu, Weichen Liu, Rick Siow Mong Goh, Weng-Fai Wong

Abstract: Large language models (LLMs) have garnered substantial attention due to their promising applications in diverse domains. Nevertheless, the increasing size of LLMs comes with a significant surge in the computational requirements for training and deployment. Memristor crossbars have emerged as a promising solution, which demonstrated a small footprint and remarkably high energy efficiency in computer vision (CV) models. Memristors possess higher density compared to conventional memory technologies, making them highly suitable for effectively managing the extreme model size associated with LLMs. However, deploying LLMs on memristor crossbars faces three major challenges. Firstly, the size of LLMs increases rapidly, already surpassing the capabilities of state-of-the-art memristor chips. Secondly, LLMs often incorporate multi-head attention blocks, which involve non-weight stationary multiplications that traditional memristor crossbars cannot support. Thirdly, while memristor crossbars excel at performing linear operations, they are not capable of executing complex nonlinear operations in LLMs such as softmax and layer normalization. To address these challenges, we present a novel architecture for the memristor crossbar that enables the deployment of state-of-the-art LLMs on a single chip or package, eliminating the energy and time inefficiencies associated with off-chip communication. Our testing on BERT_Large showed negligible accuracy loss. Compared to traditional memristor crossbars, our architecture achieves enhancements of up to 39X in area overhead and 18X in energy consumption. Compared to modern TPU/GPU systems, our architecture demonstrates at least a 68X reduction in the area-delay product and a significant 69% energy consumption reduction.

Submitted 21 October, 2024; originally announced October 2024.

Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (2024 early access)

arXiv:2410.15428 [pdf, other]

Multiset Combinatorial Gray Codes with Application to Proximity Sensor Networks

Authors: Chung Shue Chen, Wing Shing Wong, Yuan-Hsun Lo, Tsai-Lien Wong

Abstract: We investigate coding schemes that map source symbols into multisets of an alphabet set. Such a formulation of source coding is an alternative approach to the traditional framework and is inspired by an object tracking problem over proximity sensor networks. We define a \textit{multiset combinatorial Gray code} as a multiset code with fixed multiset cardinality that possesses the combinatorial Gray code characteristic. For source codes that are organized as a grid, namely an integer lattice, we propose a solution by first constructing a mapping from the grid to the alphabet set; the codes are then defined as the images of rectangular blocks in the grid of fixed dimensions. We refer to the mapping as a \textit{color mapping} and the code as a \textit{color multiset code}. We propose the idea of product multiset code that enables us to construct codes for high dimensional grids based on 1-dimensional (1D) grids. We provide a detailed analysis of color multiset codes on 1D grids, focusing on codes that require the minimal number of colors. To illustrate the application of such a coding scheme, we consider an object tracking problem on 2D grids and show its efficiency, which comes from exploiting transmission parallelism. Some numerical results are presented to conclude the paper.

Submitted 20 October, 2024; originally announced October 2024.

Comments: 30 pages, 4 figures

arXiv:2410.10046 [pdf]

A Hybrid Sampling and Multi-Objective Optimization Approach for Enhanced Software Defect Prediction

Authors: Jie Zhang, Dongcheng Li, W. Eric Wong, Shengrong Wang

Abstract: Accurate early prediction of software defects is essential to maintain software quality and reduce maintenance costs. However, the field of software defect prediction (SDP) faces challenges such as class imbalances, high-dimensional feature spaces, and suboptimal prediction accuracy. To mitigate these challenges, this paper introduces a novel SDP framework that integrates hybrid sampling techniques, specifically Borderline SMOTE and Tomek Links, with a suite of multi-objective optimization algorithms, including NSGA-II, MOPSO, and MODE. The proposed model applies feature fusion through multi-objective optimization, enhancing both the generalization capability and stability of the predictions. Furthermore, the integration of parallel processing for these optimization algorithms significantly boosts the computational efficiency of the model. Comprehensive experiments conducted on datasets from NASA and PROMISE repositories demonstrate that the proposed hybrid sampling and multi-objective optimization approach improves data balance, eliminates redundant features, and enhances prediction accuracy. The experimental results also highlight the robustness of the feature fusion approach, confirming its superiority over existing state-of-the-art techniques in terms of predictive performance and applicability across diverse datasets.

Submitted 13 October, 2024; originally announced October 2024.

arXiv:2410.09963 [pdf, other]

Heterogeneous Graph Neural Network for Cooperative ISAC Beamforming in Cell-Free MIMO Systems

Authors: Zihuan Wang, Vincent W. S. Wong

Abstract: Integrated sensing and communication (ISAC) is one of the usage scenarios for the sixth generation (6G) wireless networks. In this paper, we study cooperative ISAC in cell-free multiple-input multiple-output (MIMO) systems, where multiple MIMO access points (APs) collaboratively provide communication services and perform multi-static sensing. We formulate an optimization problem for the ISAC beamforming design, which maximizes the achievable sum-rate while guaranteeing the sensing signal-to-noise ratio (SNR) requirement and total power constraint. Learning-based techniques are regarded as a promising approach for addressing such a nonconvex optimization problem. By taking the topology of cell-free MIMO systems into consideration, we propose a heterogeneous graph neural network (GNN), namely SACGNN, for ISAC beamforming design. The proposed SACGNN framework models the cell-free MIMO system for cooperative ISAC as a heterogeneous graph and employs a transformer-based heterogeneous message passing scheme to capture the important information of sensing and communication channels and propagate the information through the graph network. Simulation results demonstrate the performance gain of the proposed SACGNN framework over a conventional null-space projection based scheme and a deep neural network (DNN)-based baseline scheme.

Submitted 13 October, 2024; originally announced October 2024.

Comments: This paper has been accepted for publication in Proc. of 3rd ACM MobiCom Workshop on Integrated Sensing and Communications Systems (ISACom), Washington, DC, Nov. 2024

arXiv:2410.09151 [pdf, other]

doi 10.3847/1538-4357/ad8de0

A search using GEO600 for gravitational waves coincident with fast radio bursts from SGR 1935+2154

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, et al. (1758 additional authors not shown)

Abstract: The magnetar SGR 1935+2154 is the only known Galactic source of fast radio bursts (FRBs). FRBs from SGR 1935+2154 were first detected by CHIME/FRB and STARE2 in 2020 April, after the conclusion of the LIGO, Virgo, and KAGRA Collaborations' O3 observing run. Here we analyze four periods of gravitational wave (GW) data from the GEO600 detector coincident with four periods of FRB activity detected by CHIME/FRB, as well as X-ray glitches and X-ray bursts detected by NICER and NuSTAR close to the time of one of the FRBs. We do not detect any significant GW emission from any of the events. Instead, using a short-duration GW search (for bursts $\leq$ 1 s) we derive 50% (90%) upper limits of $10^{48}$ ($10^{49}$) erg for GWs at 300 Hz and $10^{49}$ ($10^{50}$) erg at 2 kHz, and constrain the GW-to-radio energy ratio to $\leq 10^{14} - 10^{16}$. We also derive upper limits from a long-duration search for bursts with durations between 1 and 10 s. These represent the strictest upper limits on concurrent GW emission from FRBs.

Submitted 21 May, 2025; v1 submitted 11 October, 2024; originally announced October 2024.

Comments: 15 pages of text including references, 4 figures, 5 tables

Report number: LIGO-P2400192

Journal ref: ApJ 977 255 (2024)

arXiv:2410.08064 [pdf, other]

doi 10.1142/S0218216525500555

Bounds on the mosaic number of Legendrian Knots

Authors: Margaret Kipe, Samantha Pezzimenti, Leif Schaumann, Luc Ta, Wing Hong Tony Wong

Abstract: Mosaic tiles were first introduced by Lomonaco and Kauffman in 2008 to describe quantum knots, and have since been studied in their own right. Using a modified set of tiles, front projections of Legendrian knots can be built from mosaics as well. In this work, we compute lower bounds on the mosaic number of Legendrian knots in terms of their classical invariants. We also provide a class of examples that imply sharpness of these bounds in certain cases. An additional construction of Legendrian unknots provides an upper bound on the mosaic number of Legendrian unknots. We also adapt a result of Oh, Hong, Lee, and Lee to give an algorithm to compute the number of Legendrian link mosaics of any given size. Finally, we use a computer search to provide an updated census of known mosaic numbers for Legendrian knots, including all Legendrian knots whose mosaic number is 6 or less.

Submitted 10 October, 2024; originally announced October 2024.

Comments: 46 pages, 35 figures

MSC Class: 57K10 (Primary); 57K33 (Secondary)

arXiv:2410.00282 [pdf]

Smart Contract Vulnerability Detection based on Static Analysis and Multi-Objective Search

Authors: Dongcheng Li, W. Eric Wong, Xiaodan Wang, Sean Pan, Liang-Seng Koh

Abstract: This paper introduces a method for detecting vulnerabilities in smart contracts using static analysis and a multi-objective optimization algorithm. We focus on four types of vulnerabilities: reentrancy, call stack overflow, integer overflow, and timestamp dependencies. Initially, smart contracts are compiled into an abstract syntax tree to analyze relationships between contracts and functions, including calls, inheritance, and data flow. These analyses are transformed into static evaluations and intermediate representations that reveal internal relations. Based on these representations, we examine the contract's functions, variables, and data dependencies to detect the specified vulnerabilities. To enhance detection accuracy and coverage, we apply a multi-objective optimization algorithm to the static analysis process. This involves assigning initial numeric values to input data and monitoring changes in statement coverage and detection accuracy. Using coverage and accuracy as fitness values, we calculate the Pareto front and crowding distance values to select the best individuals for the new parent population, iterating until optimization criteria are met. We validate our approach using an open-source dataset collected from Etherscan, containing 6,693 smart contracts. Experimental results show that our method outperforms state-of-the-art tools in terms of coverage, accuracy, efficiency, and effectiveness in detecting the targeted vulnerabilities.

Submitted 30 September, 2024; originally announced October 2024.
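The selection step named in the abstract above (Pareto front plus crowding distance over coverage and accuracy) can be sketched generically as follows. This is an NSGA-II-style illustration under stated assumptions, not the authors' implementation: the function names, the choice to maximize both objectives, and the candidate (coverage, accuracy) pairs are all hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly
    better in at least one (both objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def crowding_distance(front):
    """Crowding distance for each point of a non-empty front.

    Boundary points per objective get infinity; interior points accumulate
    the normalized gap between their two neighbours along each objective.
    """
    n = len(front)
    dist = [0.0] * n
    for m in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][m])
        lo, hi = front[order[0]][m], front[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue  # all values equal: this objective adds no spread
        for k in range(1, n - 1):
            gap = front[order[k + 1]][m] - front[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist


# Hypothetical (coverage, accuracy) fitness pairs for five individuals.
candidates = [(0.60, 0.90), (0.80, 0.80), (0.90, 0.60), (0.70, 0.70), (0.50, 0.50)]
front = pareto_front(candidates)
distances = crowding_distance(front)
```

Here the last two candidates are dominated by (0.80, 0.80) and drop out. Among the survivors, the two extreme points receive infinite distance, so a selection rule that prefers larger crowding distance keeps the ends of the trade-off curve.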

arXiv:2409.20462 [pdf, other]

Nonreciprocal Local-Resonance Induced Complex Band Hybridization

Authors: Wang Tat Yau, Kai Fung Lee, Raymond P. H. Wu, Wai Chun Wong, Jensen Li, C. T. Chan, Kin Hung Fung

Abstract: We study the complex band hybridization induced by nonreciprocal local resonances in photonic crystals. Composed of trimer unit cells, a two-dimensional (2D) magnetophotonic crystal with an analytically obtainable solution is considered. We find that a nonreciprocal spectral gap may appear without nonreciprocal transmission and that the imaginary parts of the complex wavevectors $\text{Im}(\mathbf{k})$ may blow up at resonance to give extreme nonreciprocal transmission. We further show that, for a subwavelength lattice, the isolation ratio for the nonreciprocal transmission is determined solely by $\text{Im}(\mathbf{k})$ instead of the extensively studied real part $\text{Re}(\mathbf{k})$. Our finding contradicts the common belief that "spectral nonreciprocity [$\omega(\mathbf{k})\neq\omega(-\mathbf{k})$] always implies nonreciprocal transmission".

Submitted 1 April, 2025; v1 submitted 30 September, 2024; originally announced September 2024.

Comments: 5 pages, 4 figures

arXiv:2409.19147 [pdf, other]

Training the Next Generation of Seismologists: Delivering Research-Grade Software Education for Cloud and HPC Computing through Diverse Training Modalities

Authors: M. Denolle, C. Tape, E. Bozdağ, Y. Wang, F. Waldhauser, A. A. Gabriel, J. Braunmiller, B. Chow, L. Ding, K. F. Feng, A. Ghosh, N. Groebner, A. Gupta, Z. Krauss, A. McPherson, M. Nagaso, Z. Niu, Y. Ni, R. Örsvuran, G. Pavlis, F. Rodriguez-Cardozo, T. Sawi, N. Schliwa, D. Schneller, Q. Shi, et al. (6 additional authors not shown)

Abstract: With the rise of data volume and computing power, seismological research requires more advanced skills in data processing, numerical methods, and parallel computing. We present the experience of conducting training workshops over various forms of delivery to support the adoption of large-scale High-Performance Computing and Cloud computing to advance seismological research. The seismological foci were on earthquake source parameter estimation in catalogs, forward and adjoint wavefield simulations in 2 and 3 dimensions at local, regional, and global scales, earthquake dynamics, ambient noise seismology, and machine learning. This contribution describes the series of workshops that were delivered as part of research projects, the learning outcomes of the participants, and lessons learned by the instructors. Our curriculum was grounded on open and reproducible science, large-scale scientific computing and data mining, and computing infrastructure (access and usage) for HPC and the cloud. We also describe the types of teaching materials that have proven beneficial to the instruction and the sustainability of the program. We propose guidelines to deliver future workshops on these topics.

Submitted 8 April, 2025; v1 submitted 27 September, 2024; originally announced September 2024.

arXiv:2409.15298 [pdf, other]

Sorbet: A Neuromorphic Hardware-Compatible Transformer-Based Spiking Language Model

Authors: Kaiwen Tang, Zhanglu Yan, Weng-Fai Wong

Abstract: For reasons such as privacy, there are use cases for language models at the edge. This has given rise to small language models (SLMs) targeted for deployment in resource-constrained devices where energy efficiency is a significant concern. Spiking neural networks (SNNs) offer a promising solution due to their energy efficiency, and there are already works on realizing transformer-based models on SNNs. However, key operations like softmax and layer normalization (LN) are difficult to implement on neuromorphic hardware, and many of these early works sidestepped them. To address these challenges, we introduce Sorbet, a transformer-based spiking language model that is more neuromorphic hardware-compatible. Sorbet incorporates a novel shifting-based softmax called PTsoftmax and a power normalization method using bit-shifting (BSPN), both designed to replace the respective energy-intensive operations. By leveraging knowledge distillation and model quantization, Sorbet achieved a highly compressed binary weight model that maintains competitive performance while significantly reducing energy consumption. We validate Sorbet's effectiveness through extensive testing on the GLUE benchmark and a series of ablation studies, demonstrating its potential as an energy-efficient solution for language model inference.

Submitted 4 September, 2024; originally announced September 2024.

arXiv:2409.13902 [pdf]

Enhancing Large Language Models with Domain-specific Retrieval Augment Generation: A Case Study on Long-form Consumer Health Question Answering in Ophthalmology

Authors: Aidan Gilson, Xuguang Ai, Thilaka Arunachalam, Ziyou Chen, Ki Xiong Cheong, Amisha Dave, Cameron Duic, Mercy Kibe, Annette Kaminaka, Minali Prasad, Fares Siddig, Maxwell Singer, Wendy Wong, Qiao Jin, Tiarnan D. L. Keenan, Xia Hu, Emily Y. Chew, Zhiyong Lu, Hua Xu, Ron A. Adelman, Yih-Chung Tham, Qingyu Chen

Abstract: Despite the potential of Large Language Models (LLMs) in medicine, they may generate responses lacking supporting evidence or based on hallucinated evidence. While Retrieval Augment Generation (RAG) is popular to address this issue, few studies implemented and evaluated RAG in downstream domain-specific applications. We developed a RAG pipeline with 70,000 ophthalmology-specific documents that retrieves relevant documents to augment LLMs during inference time. In a case study on long-form consumer health questions, we systematically evaluated the responses including over 500 references of LLMs with and without RAG on 100 questions with 10 healthcare professionals. The evaluation focuses on factuality of evidence, selection and ranking of evidence, attribution of evidence, and answer accuracy and completeness. LLMs without RAG provided 252 references in total. Of these, 45.3% were hallucinated, 34.1% consisted of minor errors, and 20.6% were correct. In contrast, LLMs with RAG significantly improved accuracy (54.5% being correct) and reduced error rates (18.8% with minor hallucinations and 26.7% with errors). 62.5% of the top 10 documents retrieved by RAG were selected as the top references in the LLM response, with an average ranking of 4.9. The use of RAG also improved evidence attribution (increasing from 1.85 to 2.49 on a 5-point scale, P<0.001), albeit with slight decreases in accuracy (from 3.52 to 3.23, P=0.03) and completeness (from 3.47 to 3.27, P=0.17). The results demonstrate that LLMs frequently exhibited hallucinated and erroneous evidence in the responses, raising concerns for downstream applications in the medical domain. RAG substantially reduced the proportion of such evidence but encountered challenges.

Submitted 20 September, 2024; originally announced September 2024.

arXiv:2409.08290 [pdf, ps, other]

Reconsidering the energy efficiency of spiking neural networks

Authors: Zhanglu Yan, Zhenyu Bai, Weng-Fai Wong

Abstract: Spiking Neural Networks (SNNs) promise higher energy efficiency over conventional Quantized Artificial Neural Networks (QNNs) due to their event-driven, spike-based computation. However, prevailing energy evaluations often oversimplify, focusing on computational aspects while neglecting critical overheads like comprehensive data movement and memory access. Such simplifications can lead to misleading conclusions regarding the true energy benefits of SNNs. This paper presents a rigorous re-evaluation. We establish a fair baseline by mapping rate-encoded SNNs with $T$ timesteps to functionally equivalent QNNs with $\lceil \log_2(T+1) \rceil$ bits. This ensures both models have comparable representational capacities, as well as similar hardware requirements, enabling meaningful energy comparisons. We introduce a detailed analytical energy model encompassing core computation and data movement (sparse and dense activations, weights). Using this model, we systematically explore a wide parameter space, including intrinsic network characteristics ($T$, spike rate $s_r$, QNN sparsity $γ$, model size $N$, weight bit-level) and hardware characteristics (memory system and network-on-chip). Our analysis identifies specific operational regimes where SNNs genuinely offer superior energy efficiency. For example, under typical neuromorphic hardware conditions, SNNs with moderate time windows ($T \in [5,10]$) require an average spike rate ($s_r$) below 6.4% to outperform equivalent QNNs. These insights guide the design of genuinely energy-efficient neural network solutions.
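The bit-width mapping quoted in the abstract above can be illustrated with a short calculation. This is only a sketch of the formula $\lceil \log_2(T+1) \rceil$ as stated; the function name `equivalent_qnn_bits` is hypothetical, and the reading that a rate-encoded activation takes one of the $T+1$ spike counts $0, \dots, T$ is an assumption, not something taken from the paper's code.

```python
def equivalent_qnn_bits(timesteps: int) -> int:
    """Return ceil(log2(T + 1)), the bit-width quoted for a T-timestep SNN.

    Assumption: a rate-encoded activation is one of the T + 1 spike
    counts 0..T, so this many bits are enough to index every level.
    """
    if timesteps < 1:
        raise ValueError("timesteps must be a positive integer")
    # For T >= 1, int.bit_length() equals ceil(log2(T + 1)) exactly,
    # which avoids floating-point rounding in math.log2.
    return timesteps.bit_length()


if __name__ == "__main__":
    for t in (1, 3, 5, 7, 10):
        print(f"T = {t:2d} -> {equivalent_qnn_bits(t)} bits")
```

Under this formula the window $T \in [5,10]$ mentioned in the abstract spans 3-bit and 4-bit equivalents.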

Submitted 3 July, 2025; v1 submitted 29 August, 2024; originally announced September 2024.

arXiv:2409.07931 [pdf, other]

Task-Augmented Cross-View Imputation Network for Partial Multi-View Incomplete Multi-Label Classification

Authors: Lian Zhao, Jie Wen, Xiaohuan Lu, Wai Keung Wong, Jiang Long, Wulin Xie

Abstract: In real-world scenarios, multi-view multi-label learning often encounters the challenge of incomplete training data due to limitations in data collection and unreliable annotation processes. The absence of multi-view features impairs the comprehensive understanding of samples, omitting crucial details essential for classification. To address this issue, we present a task-augmented cross-view imputation network (TACVI-Net) for the purpose of handling partial multi-view incomplete multi-label classification. Specifically, we employ a two-stage network to derive highly task-relevant features to recover the missing views. In the first stage, we leverage the information bottleneck theory to obtain a discriminative representation of each view by extracting task-relevant information through a view-specific encoder-classifier architecture. In the second stage, an autoencoder based multi-view reconstruction network is utilized to extract high-level semantic representation of the augmented features and recover the missing data, thereby aiding the final classification task. Extensive experiments on five datasets demonstrate that our TACVI-Net outperforms other state-of-the-art methods.

Submitted 24 March, 2025; v1 submitted 12 September, 2024; originally announced September 2024.

arXiv:2408.10485 [pdf]

Metasurface-enabled quantum holograms with hybrid entanglement

Authors: Hong Liang, Wai Chun Wong, Tailin An, Jensen Li

Abstract: Metasurfaces, with their capability to control all possible dimensions of light, have become integral to quantum optical applications, including quantum state generation, operation, and tomography. In this work, we utilize a metasurface to generate polarization-hologram hybrid entanglement between a signal-idler photon pair to construct a quantum hologram. The properties of the quantum hologram can be revealed by collapsing the polarization degree of freedom of the idler photon, inducing interference between two holographic states of the signal photon, as a meaningful and selective erasure of the holographic content. In contrast, interference disappears when the idler photon is detected without observing polarization. This process can be further interpreted as a quantum holographic eraser, where the erasing action is visualized with erased contents in holograms. Our construction of a polarization-hologram hybrid entangled state with metasurfaces will be useful for quantum communication with enhanced robustness, anti-counterfeiting applications through the additional quantum degrees of freedom, and as an emerging platform for exploring fundamental quantum concepts for entanglement and non-locality.

Submitted 19 August, 2024; originally announced August 2024.

Comments: 12 pages, 6 figures

arXiv:2407.12867 [pdf, other]

Swift-BAT GUANO follow-up of gravitational-wave triggers in the third LIGO-Virgo-KAGRA observing run

Authors: Gayathri Raman, Samuele Ronchini, James Delaunay, Aaron Tohuvavohu, Jamie A. Kennea, Tyler Parsotan, Elena Ambrosi, Maria Grazia Bernardini, Sergio Campana, Giancarlo Cusumano, Antonino D'Ai, Paolo D'Avanzo, Valerio D'Elia, Massimiliano De Pasquale, Simone Dichiara, Phil Evans, Dieter Hartmann, Paul Kuin, Andrea Melandri, Paul O'Brien, Julian P. Osborne, Kim Page, David M. Palmer, Boris Sbarufatti, Gianpiero Tagliaferri, et al. (1797 additional authors not shown)

Abstract: We present results from a search for X-ray/gamma-ray counterparts of gravitational-wave (GW) candidates from the third observing run (O3) of the LIGO-Virgo-KAGRA (LVK) network using the Swift Burst Alert Telescope (Swift-BAT). The search includes 636 GW candidates received in low latency, 86 of which have been confirmed by the offline analysis and included in the third cumulative Gravitational-Wave Transient Catalog (GWTC-3). Targeted searches were carried out on the entire GW sample using the maximum-likelihood NITRATES pipeline on the BAT data made available via the GUANO infrastructure. We do not detect any significant electromagnetic emission that is temporally and spatially coincident with any of the GW candidates. We report flux upper limits in the 15-350 keV band as a function of sky position for all the catalog candidates. For GW candidates where the Swift-BAT false alarm rate is less than 10$^{-3}$ Hz, we compute the GW-BAT joint false alarm rate. Finally, the derived Swift-BAT upper limits are used to infer constraints on the putative electromagnetic emission associated with binary black hole mergers.

Submitted 27 March, 2025; v1 submitted 13 July, 2024; originally announced July 2024.

Comments: Update to version accepted for publication in ApJ. 50 pages, 10 figures, 4 tables

Journal ref: ApJ, Volume 980, 2025, 207

arXiv:2407.10213 [pdf]

Spatio-temporal breather dynamics in microcomb soliton crystals

Authors: Futai Hu, Abhinav Kumar Vinod, Wenting Wang, Hsiao-Hsuan Chin, James F. McMillan, Ziyu Zhan, Yuan Meng, Mali Gong, Chee Wei Wong

Abstract: Solitons, the distinct balance between nonlinearity and dispersion, provide a route toward ultrafast electromagnetic pulse shaping, high-harmonic generation, real-time image processing, and RF photonic communications. Here we explore and observe the spatio-temporal breather dynamics of optical soliton crystals in frequency microcombs, examining spatial breathers, chaos transitions, and dynamical deterministic switching in nonlinear measurements and theory. To understand the breather solitons, we describe their dynamical routes and two example transitional maps of the ensemble spatial breathers, with and without chaos initiation. We elucidate the physical mechanisms of the breather dynamics in the soliton crystal microcombs, in the interaction plane limit cycles and in the domain-wall understanding with parity symmetry breaking from third-order dispersion. We present maps of the accessible nonlinear regions, the breather frequency dependences on third-order dispersion and avoided mode crossing strengths, and the transition between the collective breather spatiotemporal states. Our range of measurements matches well with our first-principles theory and nonlinear modeling. To image these soliton ensembles and their breathers, we further constructed panoramic temporal imaging for simultaneous fast and slow axis two-dimensional mapping of the breathers. In the phase differential sampling, we present two-dimensional evolution maps of soliton crystal breathers, including those with defects, in both stable breathers and breathers with drift. Our fundamental studies contribute to the understanding of nonlinear dynamics in soliton crystal complexes, their spatiotemporal dependences, and their stability-existence zones.

Submitted 14 July, 2024; originally announced July 2024.

arXiv:2407.09089 [pdf]

Lomics: Generation of Pathways and Gene Sets using Large Language Models for Transcriptomic Analysis

Authors: Chun-Ka Wong, Ali Choo, Eugene C. C. Cheng, Wing-Chun San, Kelvin Chak-Kong Cheng, Yee-Man Lau, Minqing Lin, Fei Li, Wei-Hao Liang, Song-Yan Liao, Kwong-Man Ng, Ivan Fan-Ngai Hung, Hung-Fat Tse, Jason Wing-Hon Wong

Abstract: Interrogation of biological pathways is an integral part of omics data analysis. Large language models (LLMs) enable the generation of custom pathways and gene sets tailored to specific scientific questions. These targeted sets are significantly smaller than traditional pathway enrichment analysis libraries, reducing multiple hypothesis testing and potentially enhancing statistical power. Lomics (Large Language Models for Omics Studies) v1.0 is a Python-based bioinformatics toolkit that streamlines the generation of pathways and gene sets for transcriptomic analysis. It operates in three steps: 1) deriving relevant pathways based on the researcher's scientific question, 2) generating valid gene sets for each pathway, and 3) outputting the results as .GMX files. Lomics also provides explanations for pathway selections. Consistency and accuracy are ensured through iterative processes, JSON format validation, and HUGO Gene Nomenclature Committee (HGNC) gene symbol verification. Lomics serves as a foundation for integrating LLMs into omics research, potentially improving the specificity and efficiency of pathway analysis.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2407.01926 [pdf]

Chemical Shift Encoding based Double Bonds Quantification in Triglycerides using Deep Image Prior

Authors: Chaoxing Huang, Ziqiang Yu, Zijian Gao, Qiuyi Shen, Queenie Chan, Vincent Wai-Sun Wong, Winnie Chiu-Wing Chu, Weitian Chen

Abstract: Fatty acids can potentially serve as biomarkers for evaluating metabolic disorders and inflammatory conditions, and quantifying the double bonds is key to revealing fatty acid information. This study presents an assessment of a deep learning approach utilizing Deep Image Prior (DIP) for the quantification of double bonds and methylene-interrupted double bonds of triglyceride derived from chemical-shift encoded multi-echo gradient echo images, all achieved without the necessity for network training. The methodology implemented a cost function grounded in signal constraints to continually refine the neural network's parameters on a single slice of images through iterative processes. Validation procedures encompassed both phantom experiments and in-vivo scans. The outcomes evidenced a concordance between the quantified values and the established reference standards, notably exemplified by a Pearson correlation coefficient of 0.96 (p = 0.0005) derived from the phantom experiments. The results in a water-oil phantom also demonstrate the quantification reliability of the DIP method under the condition of having a relatively low fat signal. Furthermore, the in-vivo assessments yielded consistent quantification results that closely mirrored previously published findings concerning subcutaneous fat. In summary, the study underscores the potential of Deep Image Prior in enabling the quantification of double bonds and methylene-interrupted double bonds from chemical-shift encoded multi-echo magnetic resonance imaging (MRI) data, suggesting potential avenues for future research and clinical applications in the field.

Submitted 29 October, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

Comments: This technical note is accepted by Quantitative Imaging in Medicine and Surgery as a brief report

arXiv:2406.15170 [pdf, other]

Inference for Delay Differential Equations Using Manifold-Constrained Gaussian Processes

Authors: Yuxuan Zhao, Samuel W. K. Wong

Abstract: Dynamic systems described by differential equations often involve feedback among system components. When there are time delays for components to sense and respond to feedback, delay differential equation (DDE) models are commonly used. This paper considers the problem of inferring unknown system parameters, including the time delays, from noisy and sparse experimental data observed from the system. We propose an extension of manifold-constrained Gaussian processes to conduct parameter inference for DDEs, whereas the time delay parameters have posed a challenge for existing methods that bypass numerical solvers. Our method uses a Bayesian framework to impose a Gaussian process model over the system trajectory, conditioned on the manifold constraint that satisfies the DDEs. For efficient computation, a linear interpolation scheme is developed to approximate the values of the time-delayed system outputs, along with corresponding theoretical error bounds on the approximated derivatives. Two simulation examples, based on Hutchinson's equation and the lac operon system, together with a real-world application using Ontario COVID-19 data, are used to illustrate the efficacy of our method.

Submitted 21 June, 2024; originally announced June 2024.

Comments: 42 pages, 8 figures

arXiv:2406.13078 [pdf]

A universal bioluminescence tomography system for pre-clinical image-guided radiotherapy research

Authors: Zhishen Tong, Zijian Deng, Xiangkun Xu, Ciara Newman, Xun Jia, Yuncheng Zhong, Merle Reinhart, Paul Tsouchlos, Tim Devling, Hamid Dehghani, Iulian Iordachita, Debabrata Saha, James Kim, John W. Wong, Ken Kang-Hsin Wang

Abstract: CBCT-guided small animal irradiators encounter challenges in localizing soft-tissue targets due to low imaging contrast. Bioluminescence tomography (BLT) offers a promising solution, but BLT systems have largely remained in laboratory development, limiting accessibility for researchers. In this work, we develop a universal, commercial-grade BLT-guided system (MuriGlo) designed to seamlessly integrate with commercial irradiators and empower researchers for translational studies. We demonstrate its capabilities in supporting in vitro and in vivo studies. The MuriGlo comprises a detachable mouse bed, thermostatic control, mirrors, filters, and CCD, enabling multi-projection and multi-spectral imaging. We verify that the thermostatic control effectively sustains animal temperature at 37°C throughout imaging, and quantify that the system can detect as few as 61 GL261-AkaLuc cells in vitro. To illustrate how the MuriGlo can be utilized for in vivo image-guided research, we present 3 strategies, BLT-guided 5-arc, 2-field box, and BLI-guided single-beam, ranging from complicated high-conformal to simplest high-throughput plans. The high-conformal BLT-guided 5-arc plan fully covers the gross tumor volume (GTV) at prescribed dose with minimal normal tissue exposure (3.9%), while the simplified, high-throughput BLT-guided 2-field box achieves 100% GTV coverage but results in higher normal tissue exposure (13.1%). Moreover, we demonstrate that the localization accuracy of MuriGlo for both widely-used SARRP and SmART irradiators is within 1 mm, and the tumor coverage reaches over 97% with a 0.75 mm margin. The universal BLT-guided system offers seamless integration with commercial irradiators, achieving comparable localization accuracy, and is expected to support high-precision radiation research.

Submitted 26 March, 2025; v1 submitted 18 June, 2024; originally announced June 2024.

arXiv:2406.10438 [pdf, other]

A Fine-grained Analysis of Fitted Q-evaluation: Beyond Parametric Models

Authors: Jiayi Wang, Zhengling Qi, Raymond K. W. Wong

Abstract: In this paper, we delve into the statistical analysis of the fitted Q-evaluation (FQE) method, which focuses on estimating the value of a target policy using offline data generated by some behavior policy. We provide a comprehensive theoretical understanding of FQE estimators under both parametric and nonparametric models on the $Q$-function. Specifically, we address three key questions related to FQE that remain largely unexplored in the current literature: (1) Is the optimal convergence rate for estimating the policy value regarding the sample size $n$ ($n^{-1/2}$) achievable for FQE under a non-parametric model with a fixed horizon ($T$)? (2) How does the error bound depend on the horizon $T$? (3) What is the role of the probability ratio function in improving the convergence of FQE estimators? Specifically, we show that under the completeness assumption of $Q$-functions, which is mild in the non-parametric setting, the estimation errors for policy value using both parametric and non-parametric FQE estimators can achieve an optimal rate in terms of $n$. The corresponding error bounds in terms of both $n$ and $T$ are also established. With an additional realizability assumption on ratio functions, the rate of estimation errors can be improved from $T^{1.5}/\sqrt{n}$ to $T/\sqrt{n}$, which matches the sharpest known bound in the current literature under the tabular setting.

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2406.09317 [pdf, other]

Enhancing Diagnostic Accuracy in Rare and Common Fundus Diseases with a Knowledge-Rich Vision-Language Model

Authors: Meng Wang, Tian Lin, Aidi Lin, Kai Yu, Yuanyuan Peng, Lianyu Wang, Cheng Chen, Ke Zou, Huiyu Liang, Man Chen, Xue Yao, Meiqin Zhang, Binwei Huang, Chaoxin Zheng, Peixin Zhang, Wei Chen, Yilong Luo, Yifan Chen, Honghe Xia, Tingkun Shi, Qi Zhang, Jinming Guo, Xiaolin Chen, Jingcheng Wang, Yih Chung Tham, et al. (24 additional authors not shown)

Abstract: Previous foundation models for fundus images were pre-trained with limited disease categories and knowledge base. Here we introduce a knowledge-rich vision-language model (RetiZero) that leverages knowledge from more than 400 fundus diseases. For RetiZero's pretraining, we compiled 341,896 fundus images paired with texts, sourced from public datasets, ophthalmic literature, and online resources, encompassing a diverse range of diseases across multiple ethnicities and countries. RetiZero exhibits remarkable performance in several downstream tasks, including zero-shot disease recognition, image-to-image retrieval, AI-assisted clinical diagnosis, few-shot fine-tuning, and internal- and cross-domain disease identification. In zero-shot scenarios, RetiZero achieves Top-5 accuracies of 0.843 for 15 diseases and 0.756 for 52 diseases. For image retrieval, it achieves Top-5 scores of 0.950 and 0.886 for the same sets, respectively. AI-assisted clinical diagnosis results show that RetiZero's Top-3 zero-shot performance surpasses the average of 19 ophthalmologists from Singapore, China, and the United States. RetiZero substantially enhances clinicians' accuracy in diagnosing fundus diseases, particularly rare ones. These findings underscore the value of integrating RetiZero into clinical settings, where various fundus diseases are encountered.

Submitted 10 April, 2025; v1 submitted 13 June, 2024; originally announced June 2024.

arXiv:2406.07574 [pdf, other]

Biharmonic Distance of Graphs and its Higher-Order Variants: Theoretical Properties with Applications to Centrality and Clustering

Authors: Mitchell Black, Lucy Lin, Amir Nayyeri, Weng-Keen Wong

Abstract: Effective resistance is a distance between vertices of a graph that is both theoretically interesting and useful in applications. We study a variant of effective resistance called the biharmonic distance. While the effective resistance measures how well-connected two vertices are, we prove several theoretical results supporting the idea that the biharmonic distance measures how important an edge is to the global topology of the graph. Our theoretical results connect the biharmonic distance to well-known measures of connectivity of a graph like its total resistance and sparsity. Based on these results, we introduce two clustering algorithms using the biharmonic distance. Finally, we introduce a further generalization of the biharmonic distance that we call the $k$-harmonic distance. We empirically study the utility of biharmonic and $k$-harmonic distance for edge centrality and graph clustering.

Submitted 17 February, 2025; v1 submitted 4 June, 2024; originally announced June 2024.

Comments: Accepted to ICML 2024. In v2, we correct an error in the definition of electrical flows and, accordingly, the proofs of Lemma 2.2 and Theorem 4.1

arXiv:2406.06543 [pdf, other]

SparrowSNN: A Hardware/software Co-design for Energy Efficient ECG Classification

Authors: Zhanglu Yan, Zhenyu Bai, Tulika Mitra, Weng-Fai Wong

Abstract: Heart disease is one of the leading causes of death worldwide. Given its high risk and often asymptomatic nature, real-time continuous monitoring is essential. Unlike traditional artificial neural networks (ANNs), spiking neural networks (SNNs) are well-known for their energy efficiency, making them ideal for wearable devices and energy-constrained edge computing platforms. However, current energy measurements of SNN implementations for detecting heart diseases typically rely on empirical values, often overlooking hardware overhead. Additionally, the integrate-and-fire activations in SNNs require multiple memory accesses and repeated computations, which can further compromise energy efficiency. In this paper, we propose sparrowSNN, a redesign of the standard SNN workflow from a hardware perspective, and present a dedicated ASIC design for SNNs, optimized for ultra-low power wearable devices used in heartbeat classification. Using the MIT-BIH dataset, our SNN achieves a state-of-the-art accuracy of 98.29% for SNNs, with energy consumption of 31.39 nJ per inference and power usage of 6.1 µW, giving sparrowSNN the highest accuracy with the lowest energy use among comparable systems. We also compare the energy-to-accuracy trade-offs between SNNs and quantized ANNs, offering recommendations on how best to use SNNs.

Submitted 6 May, 2024; originally announced June 2024.

arXiv:2406.03063 [pdf, other]

In-operando microwave scattering-parameter calibrated measurement of a Josephson travelling wave parametric amplifier

Authors: S. H. Shin, M. Stanley, W. N. Wong, T. Sweetnam, A. Elarabi, T. Lindström, N. M. Ridler, S. E. de Graaf

Abstract: Superconducting travelling wave parametric amplifiers (TWPAs) are broadband near-quantum limited microwave amplifiers commonly used for qubit readout and a wide range of other applications in quantum technologies. The performance of these amplifiers depends on achieving impedance matching to minimise reflected signals. Here we apply a microwave calibration technique to extract the S-parameters of a Josephson junction based TWPA in-operando. This enables reflections occurring at the TWPA and its extended network of components to be quantified, and we find that the in-operation performance can be well described by the off-state measured S-parameters.

Submitted 5 June, 2024; originally announced June 2024.

arXiv:2405.18453 [pdf, ps, other]

Tournament completions of bipartite tournaments and their augmented directed cycles

Authors: H. W. Willie Wong

Abstract: A tournament $T$ is a tournament completion of a bipartite tournament $D$ if $D$ is a spanning subdigraph of $T$, i.e., $V(D)=V(T)$ and $A(D)\subseteq A(T)$. If $C$ is a $k$-dicycle (i.e., directed cycle of length $k$) in a tournament completion $T$ of $D$ and $C$ is not a dicycle in $D$, i.e., $A(C)\subseteq A(T)$ and $A(C)\not\subseteq A(D)$, then we call $C$ an augmented $k$-dicycle of $T$. In this paper, we investigate the families of bipartite tournaments for which there exists a tournament completion with exactly one augmented $3$-dicycle and with no augmented $4$-dicycles. Our investigation may be viewed as a variant of the orientation completion problem initiated by Bang-Jensen et al.
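The definition of an augmented $k$-dicycle in the abstract above can be checked by brute force on a small case. The following is a minimal sketch, not the paper's method: the helper names `dicycles` and `augmented_dicycles` and the 4-vertex example are hypothetical, and the enumeration is exponential in $k$, so it is only meant for toy instances.

```python
from itertools import permutations


def dicycles(arcs, vertices, k):
    """List the directed k-cycles of a digraph, each reported exactly once.

    A cycle is returned as a vertex tuple starting at its smallest vertex.
    """
    arcs = set(arcs)
    found = []
    for perm in permutations(vertices, k):
        if perm[0] != min(perm):
            continue  # skip the other rotations of the same cycle
        if all((perm[i], perm[(i + 1) % k]) in arcs for i in range(k)):
            found.append(perm)
    return found


def augmented_dicycles(d_arcs, t_arcs, vertices, k):
    """k-dicycles of the completion T that use at least one arc outside A(D)."""
    d_arcs = set(d_arcs)
    return [
        c for c in dicycles(t_arcs, vertices, k)
        if any((c[i], c[(i + 1) % k]) not in d_arcs for i in range(k))
    ]


# Hypothetical example: bipartite tournament D on parts {0, 1} and {2, 3}.
V = [0, 1, 2, 3]
D = [(0, 2), (2, 1), (1, 3), (3, 0)]
# One tournament completion T: orient the two missing pairs as 0->1 and 2->3.
T = D + [(0, 1), (2, 3)]

aug3 = augmented_dicycles(D, T, V, 3)
aug4 = augmented_dicycles(D, T, V, 4)
print("augmented 3-dicycles:", aug3)
print("augmented 4-dicycles:", aug4)
```

Because a bipartite digraph has no odd cycles, every 3-dicycle of a completion is automatically augmented; the filter on arcs outside $A(D)$ only matters for even $k$.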

Submitted 28 May, 2024; originally announced May 2024.

Comments: 14 pages

MSC Class: 05C20; 05C38 ACM Class: G.2.2

Journal ref: Discrete Math., 347, (2024), Article 114108

arXiv:2405.17940 [pdf, other]

World Models for General Surgical Grasping

Authors: Hongbin Lin, Bin Li, Chun Wai Wong, Juan Rojas, Xiangyu Chu, Kwok Wai Samuel Au

Abstract: Intelligent vision control systems for surgical robots should adapt to unknown and diverse objects while being robust to system disturbances. Previous methods did not meet these requirements because they relied mainly on pose estimation and feature tracking. We propose a world-model-based deep reinforcement learning framework, "Grasp Anything for Surgery" (GAS), that learns a pixel-level visuomotor policy for surgical grasping, enhancing both generality and robustness. In particular, a novel method is proposed to estimate the values and uncertainties of depth pixels for a rigid-link object's inaccurate region based on the empirical prior of the object's size; both depth and mask images of task objects are encoded to a single compact 3-channel image (size: 64x64x3) by dynamically zooming in the mask regions, minimizing the information loss. The learned controller's effectiveness is extensively evaluated in simulation and in a real robot. Our learned visuomotor policy handles: i) unseen objects, including 5 types of target grasping objects and a robot gripper, in unstructured real-world surgery environments, and ii) disturbances in perception and control. Note that we are the first work to achieve a unified surgical control system that grasps diverse surgical objects using different robot grippers on real robots in complex surgery scenes (average success rate: 69%). Our system also demonstrates significant robustness across 6 conditions including background variation, target disturbance, camera pose variation, kinematic control error, image noise, and re-grasping after the gripped target object drops from the gripper. Videos and code can be found on our project page: https://linhongbin.github.io/gas/.

Submitted 28 May, 2024; originally announced May 2024.

Journal ref: Robotics: Science and Systems 2024

arXiv:2405.13491 [pdf, other]

doi 10.1051/0004-6361/202450810

Euclid. I. Overview of the Euclid mission

Authors: Euclid Collaboration, Y. Mellier, Abdurro'uf, J. A. Acevedo Barroso, A. Achúcarro, J. Adamek, R. Adam, G. E. Addison, N. Aghanim, M. Aguena, V. Ajani, Y. Akrami, A. Al-Bahlawan, A. Alavi, I. S. Albuquerque, G. Alestas, G. Alguero, A. Allaoui, S. W. Allen, V. Allevato, A. V. Alonso-Tetilla, B. Altieri, A. Alvarez-Candal, S. Alvi, A. Amara , et al. (1115 additional authors not shown)

Abstract: The current standard model of cosmology successfully describes a variety of measurements, but the nature of its main ingredients, dark matter and dark energy, remains unknown. Euclid is a medium-class mission in the Cosmic Vision 2015-2025 programme of the European Space Agency (ESA) that will provide high-resolution optical imaging, as well as near-infrared imaging and spectroscopy, over about 14,000 deg^2 of extragalactic sky. In addition to accurate weak lensing and clustering measurements that probe structure formation over half of the age of the Universe, its primary probes for cosmology, these exquisite data will enable a wide range of science. This paper provides a high-level overview of the mission, summarising the survey characteristics, the various data-processing steps, and data products. We also highlight the main science objectives and expected performance.

Submitted 24 September, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

Comments: Accepted for publication in the A&A special issue 'Euclid on Sky'

Journal ref: A&A 697, A1 (2025)

arXiv:2405.12386 [pdf, other]

Particle swarm optimization with Applications to Maximum Likelihood Estimation and Penalized Negative Binomial Regression

Authors: Sisi Shao, Junhyung Park, Weng Kee Wong

Abstract: General purpose optimization routines such as nlminb, optim (R) or nlmixed (SAS) are frequently used to estimate model parameters in nonstandard distributions. This paper presents Particle Swarm Optimization (PSO) as an alternative to many of the current algorithms used in statistics. We find that PSO can not only reproduce the same results as the above routines, but can also produce better results or produce results when the other routines cannot converge. In the latter case, it can also identify the source of the problem or problems. We highlight the advantages of PSO with four examples, where: (1) PSO shows that some parameters in a generalized distribution are unidentified when this is not apparent or computationally manifested using routines in R or SAS; (2) PSO can produce estimation results for the log-binomial regressions when current routines may not; (3) PSO provides flexibility in the link function for binomial regression with LASSO penalty, which is unsupported by standard packages like GLM and GENMOD in Stata and SAS, respectively; and (4) PSO provides superior MLE estimates for an EE-IW distribution compared with those from the traditional statistical methods that rely on moments.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2405.07216 [pdf, other]

Magnetic-Guided Flexible Origami Robot toward Long-Term Phototherapy of H. pylori in the Stomach

Authors: Sishen Yuan, Baijia Liang, Po Wa Wong, Mingjing Xu, Chi Hsuan Li, Zhen Li, Hongliang Ren

Abstract: Helicobacter pylori, a pervasive bacterial infection associated with gastrointestinal disorders such as gastritis, peptic ulcer disease, and gastric cancer, impacts approximately 50% of the global population. The efficacy of standard clinical eradication therapies is diminishing due to the rise of antibiotic-resistant strains, necessitating alternative treatment strategies. Photodynamic therapy (PDT) emerges as a promising prospect in this context. This study presents the development and implementation of a magnetically-guided origami robot, incorporating flexible printed circuit units for sustained and stable phototherapy of Helicobacter pylori. Each integrated unit is equipped with wireless charging capabilities, producing an optimal power output that can concurrently illuminate up to 15 LEDs at their maximum intensity. Crucially, these units can be remotely manipulated via a magnetic field, facilitating both translational and rotational movements. We propose an open-loop manual control sequence that allows the formation of a stable, compliant triangular structure through the interaction of internal magnets. This adaptable configuration is uniquely designed to withstand the dynamic squeezing environment prevalent in real-world gastric applications. The research herein represents a significant stride in leveraging technology for innovative medical solutions, particularly in the management of antibiotic-resistant Helicobacter pylori infections.

Submitted 12 May, 2024; originally announced May 2024.

Comments: IEEE ICRA 2024

arXiv:2405.04206 [pdf, other]

NOVA: NoC-based Vector Unit for Mapping Attention Layers on a CNN Accelerator

Authors: Mohit Upadhyay, Rohan Juneja, Weng-Fai Wong, Li-Shiuan Peh

Abstract: Attention mechanisms are becoming increasingly popular, being used in neural network models in multiple domains such as natural language processing (NLP) and vision applications, especially at the edge. However, attention layers are difficult to map onto existing neuro accelerators since they have a much higher density of non-linear operations, which leads to inefficient utilization of today's vector units. This work introduces NOVA, a NoC-based Vector Unit that can perform non-linear operations within the NoC of the accelerators, and can be overlaid onto existing neuro accelerators to map attention layers at the edge. Our results show that the NOVA architecture is up to 37.8x more power-efficient than state-of-the-art hardware approximators when running existing attention-based neural networks.

Submitted 7 May, 2024; originally announced May 2024.

Comments: 6 pages, 8 figures

ACM Class: B.2.4

arXiv:2404.14957 [pdf, other]

doi 10.1126/sciadv.adm9563

Strongly correlated multi-electron bunches from interaction with quantum light

Authors: Suraj Kumar, Jeremy Lim, Nicholas Rivera, Wesley Wong, Yee Sin Ang, Lay Kee Ang, Liang Jie Wong

Abstract: Strongly correlated electron systems are a cornerstone of modern physics, being responsible for groundbreaking phenomena from superconducting magnets to quantum computing. In most cases, correlations in electrons arise exclusively due to Coulomb interactions. In this work, we reveal that free electrons interacting simultaneously with a light field can become highly correlated via mechanisms beyond Coulomb interactions. In the case of two electrons, the resulting Pearson correlation coefficient (PCC) for the joint probability distribution of the output electron energies is enhanced over 13 orders of magnitude compared to that of electrons interacting with the light field in succession (one after another). These highly correlated electrons are the result of momentum and energy exchange between the participating electrons via the external quantum light field. Our findings pave the way to the creation and control of highly correlated free electrons for applications including quantum information and ultra-fast imaging.

Submitted 13 May, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

Comments: 3 figures for Main Text, 4 figures for Supplementary Materials, Supplementary is available at end of Main Text figures

arXiv:2404.05571 [pdf, other]

doi 10.1039/D4SM00346B

Wetting on Silicone Surfaces

Authors: Lukas Hauer, Abhinav Naga, Rodrique G. M. Badr, Jonathan T. Pham, William S. Y. Wong, Doris Vollmer

Abstract: Silicone is frequently used as a model system to investigate and tune wetting on soft materials. Silicone is biocompatible and shows excellent thermal, chemical, and UV stability. Moreover, the mechanical properties of the surface can be easily varied by several orders of magnitude in a controlled manner. Polydimethylsiloxane (PDMS) is a popular choice for coating applications such as lubrication, self-cleaning, and drag reduction, facilitated by low surface energy. The aim of understanding the underlying interactions and forces has motivated numerous and detailed investigations of the static and dynamic wetting behavior of drops on PDMS-based surfaces. Here, we recognize the three most prevalent PDMS surface variants, namely liquid-infused (SLIPS/LIS), elastomeric, and liquid-like (SOCAL) surfaces. To understand, optimize, and tune the wetting properties of these PDMS surfaces, we review and compare their similarities and differences by discussing (i) the chemical and molecular structure, and (ii) the static and dynamic wetting behavior. We also provide (iii) an overview of methods and techniques to characterize PDMS-based surfaces and their wetting behavior. The static and dynamic wetting ridge is given particular attention, as it dominates energy dissipation, adhesion, and friction of sliding drops and influences the durability of the surfaces. We also discuss special features such as cloaking and wetting-induced phase separation. Key challenges and opportunities of these three surface variants are outlined.

Submitted 1 July, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

arXiv:2404.04248 [pdf, other]

doi 10.3847/2041-8213/ad5beb

Observation of Gravitational Waves from the Coalescence of a $2.5\text{-}4.5~M_\odot$ Compact Object and a Neutron Star

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, S. Akçay, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah , et al. (1771 additional authors not shown)

Abstract: We report the observation of a coalescing compact binary with component masses $2.5\text{-}4.5~M_\odot$ and $1.2\text{-}2.0~M_\odot$ (all measurements quoted at the 90% credible level). The gravitational-wave signal GW230529_181500 was observed during the fourth observing run of the LIGO-Virgo-KAGRA detector network on 2023 May 29 by the LIGO Livingston Observatory. The primary component of the source has a mass less than $5~M_\odot$ at 99% credibility. We cannot definitively determine from gravitational-wave data alone whether either component of the source is a neutron star or a black hole. However, given existing estimates of the maximum neutron star mass, we find the most probable interpretation of the source to be the coalescence of a neutron star with a black hole that has a mass between the most massive neutron stars and the least massive black holes observed in the Galaxy. We provisionally estimate a merger rate density of $55^{+127}_{-47}~\text{Gpc}^{-3}\,\text{yr}^{-1}$ for compact binary coalescences with properties similar to the source of GW230529_181500; assuming that the source is a neutron star-black hole merger, GW230529_181500-like sources constitute about 60% of the total merger rate inferred for neutron star-black hole coalescences. The discovery of this system implies an increase in the expected rate of neutron star-black hole mergers with electromagnetic counterparts and provides further evidence for compact objects existing within the purported lower mass gap.

Submitted 26 July, 2024; v1 submitted 5 April, 2024; originally announced April 2024.

Comments: 45 pages (10 pages author list, 13 pages main text, 1 page acknowledgements, 13 pages appendices, 8 pages bibliography), 17 figures, 16 tables. Update to match version published in The Astrophysical Journal Letters. Data products available from https://zenodo.org/records/10845779

Report number: LIGO-P2300352

Journal ref: ApJL 970, L34 (2024)

arXiv:2404.00720 [pdf, other]

Kicking time back in black-hole mergers: Ancestral masses, spins, birth recoils and hierarchical-formation viability of GW190521

Authors: Carlos Araújo Álvarez, Henry W. Y. Wong, Anna Liu, Juan Calderón Bustillo

Abstract: Pair-instability supernova (PISN) prevents black-hole formation from stellar collapse within the approximate mass range $M\in [65,130]M_\odot$. However, such black holes may form hierarchically through merging ancestral black holes, whose properties determine those of the ``child'' one: mass, spin, and recoil velocity. Crucially, the child will leave its host environment if its ``birth recoil'' exceeds the corresponding escape velocity, preventing further mergers. We exploit relations between the final recoil and spin of quasi-circular black-hole mergers to obtain posterior probability distributions for the hypothetical ancestral masses, spins and birth recoils of the component black holes of GW190521. To this end, we present a Bayesian framework applicable to existing estimates for the components of black-hole merger observations. We consider both the quasi-circular (generically spinning) analysis performed by the LIGO-Virgo-KAGRA collaboration and the eccentric (aligned-spin) one performed by Romero-Shaw et al. We evaluate the probability $p_{2g}$ that the GW190521 components inferred by these analyses formed from the merger of stellar-origin black holes and were retained by their environment. For the primary component, which populates the PISN gap, such a scenario is strongly suppressed if GW190521 happened in a Globular Cluster, with $p_{2g} \sim 10^{-3}$, unless it was quasi-circular and its ancestors had aligned spins, uncharacteristic of hierarchical formation channels, or small spins, which yields $p_{2g} \simeq 10^{-2}$.
If GW190521 was eccentric, we obtain $p_{2g} \simeq 0.1$ for any host other than an AGN, and zero for a Globular Cluster. If GW190521 was quasi-circular, a Nuclear-Star Cluster origin is possible with $p_{2g} \in (\sim 0.4, \sim 0.8)$.

Submitted 8 November, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

Comments: 20 pages, 7 Figures, Version accepted for publication in The Astrophysical Journal

Report number: LIGO-DCC P2400073

arXiv:2403.11414 [pdf, other]

doi 10.1145/3626202.3637576

Table-Lookup MAC: Scalable Processing of Quantised Neural Networks in FPGA Soft Logic

Authors: Daniel Gerlinghoff, Benjamin Chen Ming Choong, Rick Siow Mong Goh, Weng-Fai Wong, Tao Luo

Abstract: Recent advancements in neural network quantisation have yielded remarkable outcomes, with three-bit networks reaching state-of-the-art full-precision accuracy in complex tasks. These achievements present valuable opportunities for accelerating neural networks by computing in reduced precision. Implementing it on FPGAs can take advantage of bit-level reconfigurability, which is not available on conventional CPUs and GPUs. Simultaneously, the high data intensity of neural network processing has inspired computing-in-memory paradigms, including on FPGA platforms. By programming the effects of trained model weights as lookup operations in soft logic, the transfer of weight data from memory units can be avoided, alleviating the memory bottleneck. However, previous methods face poor scalability - the high logic utilisation limiting them to small networks/sub-networks of binary models with low accuracy. In this paper, we introduce Table Lookup Multiply-Accumulate (TLMAC) as a framework to compile and optimise quantised neural networks for scalable lookup-based processing. TLMAC clusters and maps unique groups of weights to lookup-based processing elements, enabling highly parallel computation while taking advantage of parameter redundancy. Further place and route algorithms are proposed to reduce LUT utilisation and routing congestion. We demonstrate that TLMAC significantly improves the scalability of previous related works.
Our efficient logic mapping and high degree of reuse enable entire ImageNet-scale quantised models with full-precision accuracy to be implemented using lookup-based computing on one commercially available FPGA.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.05530 [pdf, other]

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1112 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k).
Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.

Submitted 16 December, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.04036 [pdf, other]

Unsupervised Contrastive Learning for Robust RF Device Fingerprinting Under Time-Domain Shift

Authors: Jun Chen, Weng-Keen Wong, Bechir Hamdaoui

Abstract: Radio Frequency (RF) device fingerprinting has been recognized as a potential technology for enabling automated wireless device identification and classification. However, it faces a key challenge due to the domain shift that could arise from variations in the channel conditions and environmental settings, potentially degrading the accuracy of RF-based device classification when testing and training data are collected in different domains. This paper introduces a novel solution that leverages contrastive learning to mitigate this domain shift problem. Contrastive learning, a state-of-the-art self-supervised learning approach from deep learning, learns a distance metric such that positive pairs are closer (i.e. more similar) in the learned metric space than negative pairs. When applied to RF fingerprinting, our model treats RF signals from the same transmission as positive pairs and those from different transmissions as negative pairs. Through experiments on wireless and wired RF datasets collected over several days, we demonstrate that our contrastive learning approach captures domain-invariant features, diminishing the effects of domain-specific variations. Our results show large and consistent improvements in accuracy (10.8% to 27.8%) over baseline models, thus underscoring the effectiveness of contrastive learning in improving device classification under domain shift.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 6 pages, 5 figures, accepted by 2024 IEEE International Conference on Communications (ICC)

arXiv:2403.03004 [pdf, other]

Ultralight vector dark matter search using data from the KAGRA O3GK run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, H. Abe, I. Abouelfettouh, F. Acernese, K. Ackley, C. Adamcewicz, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi , et al. (1778 additional authors not shown)

Abstract: Among the various candidates for dark matter (DM), ultralight vector DM can be probed by laser interferometric gravitational wave detectors through the measurement of oscillating length changes in the arm cavities. In this context, KAGRA has a unique feature due to differing compositions of its mirrors, enhancing the signal of vector DM in the length change in the auxiliary channels. Here we present the result of a search for $U(1)_{B-L}$ gauge boson DM using the KAGRA data from auxiliary length channels during the first joint observation run together with GEO600. By applying our search pipeline, which takes into account the stochastic nature of ultralight DM, upper bounds on the coupling strength between the $U(1)_{B-L}$ gauge boson and ordinary matter are obtained for a range of DM masses. While our constraints are less stringent than those derived from previous experiments, this study demonstrates the applicability of our method to the lower-mass vector DM search, which is made difficult in this measurement by the short observation time compared to the auto-correlation time scale of DM.

Submitted 5 March, 2024; originally announced March 2024.

Comments: 20 pages, 5 figures

Report number: LIGO-P2300250

arXiv:2403.01821 [pdf]

Complete Interband Transitions for Non-Hermitian Spin-Orbit-Coupled Cold-Atom Systems

Authors: Dong Liu, Zejian Ren, Wai Chun Wong, Entong Zhao, Chengdong He, Ka Kwan Pak, Gyu-Boong Jo, Jensen Li

Abstract: Recently, synthetic spin-orbit coupling has been introduced into cold-atom systems for more flexible control of the Hamiltonian, which was further made time-varying through two-photon detuning to achieve dynamic control of the cold-atom state. While an intraband transition can be adiabatically obtained, a complete interband transition, rather than a superposition of different bands, obtained through fast sweeping is usually guaranteed by having the positions of the initial and final states be far away from any band gap in the quasimomentum space. Here, by introducing an additional non-Hermitian parameter through an atom-loss contrast together with two-photon detuning as two controllable external parameters, both intraband and complete interband transitions can be achieved independent of the positions of the initial and final states. In addition, a point-source diagram approach in the 2D external parameter space is developed to visualize and predict the locations of any nonadiabatic transitions. This control protocol can have potential applications in quantum state control and quantum simulations using cold-atom systems.

Submitted 4 March, 2024; originally announced March 2024.

Comments: 21 pages, 4 figures

arXiv:2403.00192 [pdf, other]

Block-MDS QC-LDPC Codes for Information Reconciliation in Key Distribution

Authors: Lev Tauz, Debarnab Mitra, Jayanth Shreekumar, Murat Can Sarihan, Chee Wei Wong, Lara Dolecek

Abstract: Quantum key distribution (QKD) is a popular protocol that provides information-theoretically secure keys to multiple parties. Two important post-processing steps of QKD are 1) the information reconciliation (IR) step, where parties reconcile mismatches in generated keys through classical communication, and 2) the privacy amplification (PA) step, where parties distill their common key into a new secure key that the adversary has little to no information about. In general, these two steps have been abstracted as two distinct problems. In this work, we consider a new technique of performing the IR and PA steps jointly through sampling that relaxes the requirement on the IR step, allowing for more success in key creation. We provide a novel LDPC code construction known as Block-MDS QC-LDPC codes that can utilize the relaxed requirement by creating LDPC codes with pre-defined sub-matrices of full-rank. We demonstrate through simulations that our technique of sampling can provide notable gains in successfully creating secret keys.

Submitted 29 February, 2024; originally announced March 2024.

Comments: 7 pages, 1 figure, submitted to the International Symposium on Information Theory (ISIT) 2024

arXiv:2402.15525 [pdf, other]

Detecting misinformation through Framing Theory: the Frame Element-based Model

Authors: Guan Wang, Rebecca Frederick, Jinglong Duan, William Wong, Verica Rupar, Weihua Li, Quan Bai

Abstract: In this paper, we delve into the rapidly evolving challenge of misinformation detection, with a specific focus on the nuanced manipulation of narrative frames - an under-explored area within the AI community. The potential for Generative AI models to generate misleading narratives underscores the urgency of this problem. Drawing from communication and framing theories, we posit that the presentation or 'framing' of accurate information can dramatically alter its interpretation, potentially leading to misinformation. We highlight this issue through real-world examples, demonstrating how shifts in narrative frames can transmute fact-based information into misinformation. To tackle this challenge, we propose an innovative approach leveraging the power of pre-trained Large Language Models and deep neural networks to detect misinformation originating from accurate facts portrayed under different frames. These advanced AI techniques offer unprecedented capabilities in identifying complex patterns within unstructured data critical for examining the subtleties of narrative frames. The objective of this paper is to bridge a significant research gap in the AI domain, providing valuable insights and methodologies for tackling framing-induced misinformation, thus contributing to the advancement of responsible and trustworthy AI technologies.
Extensive experiments demonstrate the varying impact of the elements of framing theory, supporting the rationale for applying framing theory to improve performance in misinformation detection.

Submitted 19 February, 2024; originally announced February 2024.

Comments: 17 pages, 9 figures, 7 tables

arXiv:2402.13297 [pdf, other]

Integrating Deep Learning and Synthetic Biology: A Co-Design Approach for Enhancing Gene Expression via N-terminal Coding Sequences

Authors: Zhanglu Yan, Weiran Chu, Yuhua Sheng, Kaiwen Tang, Shida Wang, Yanfeng Liu, Weng-Fai Wong

Abstract: N-terminal coding sequence (NCS) influences gene expression by impacting the translation initiation rate. The NCS optimization problem is to find an NCS that maximizes gene expression. The problem is important in genetic engineering. However, current methods for NCS optimization such as rational design and statistics-guided approaches are labor-intensive and yield only relatively small improvements. This paper introduces a deep learning/synthetic biology co-designed few-shot training workflow for NCS optimization. Our method utilizes k-nearest encoding followed by word2vec to encode the NCS, then performs feature extraction using attention mechanisms, before constructing a time-series network for predicting gene expression intensity, and finally a direct search algorithm identifies the optimal NCS with limited training data. We took green fluorescent protein (GFP) expressed by Bacillus subtilis as a reporting protein of NCSs, and employed the fluorescence enhancement factor as the metric of NCS optimization. Within just six iterative experiments, our model generated an NCS (MLD62) that increased average GFP expression by 5.41-fold, outperforming the state-of-the-art NCS designs. Extending our findings beyond GFP, we showed that our engineered NCS (MLD62) can effectively boost the production of N-acetylneuraminic acid by enhancing the expression of the crucial rate-limiting GNA1 gene, demonstrating its practical utility. We have open-sourced our NCS expression database and experimental procedures for public use.

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.13249 [pdf, other]

TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Authors: Liyan Tang, Igor Shalyminov, Amy Wing-mei Wong, Jon Burnsky, Jake W. Vincent, Yu'an Yang, Siffi Singh, Song Feng, Hwanjun Song, Hang Su, Lijia Sun, Yi Zhang, Saab Mansour, Kathleen McKeown

Abstract: Single document news summarization has seen substantial progress on faithfulness in recent years, driven by research on the evaluation of factual consistency, or hallucinations. We ask whether these advances carry over to other text summarization domains. We propose a new evaluation benchmark on topic-focused dialogue summarization, generated by LLMs of varying sizes. We provide binary sentence-level human annotations of the factual consistency of these summaries along with detailed explanations of factually inconsistent sentences. Our analysis shows that existing LLMs hallucinate significant amounts of factual errors in the dialogue domain, regardless of the model's size. On the other hand, when LLMs, including GPT-4, serve as binary factual evaluators, they perform poorly and can be outperformed by prevailing state-of-the-art specialized factuality evaluation metrics. Finally, we conducted an analysis of hallucination types with a curated error taxonomy. We find that there are diverse errors and error distributions in model-generated summaries and that non-LLM based metrics can capture all error types better than LLM-based evaluators.

Submitted 31 March, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

Comments: NAACL 2024; Linguistic annotations available at https://github.com/amazon-science/tofueval

arXiv:2402.11381 [pdf, ps, other]

Paired $(n-1)$-to-$(n-1)$ disjoint path covers in bipartite transposition-like graphs

Authors: Anna Coleman, Gabrielle Fischberg, Charles Gong, Joshua Harrington, Tony W. H. Wong

Abstract: A paired $k$-to-$k$ disjoint path cover of a graph $G$ is a collection of pairwise disjoint path subgraphs $P_1,P_2,\dotsc,P_k$ such that each $P_i$ has prescribed vertices $s_i$ and $t_i$ as endpoints and the union of $P_1,P_2,\dotsc,P_k$ contains all vertices of $G$. In this paper, we introduce bipartite transposition-like graphs, which are inductively constructed from lower ranked bipartite transposition-like graphs. We show that every rank $n$ bipartite transposition-like graph $G$ admits a paired $(n-1)$-to-$(n-1)$ disjoint path cover for all choices of $S=\{s_1,s_2,\dotsc,s_{n-1}\}$ and $T=\{t_1,t_2,\dotsc,t_{n-1}\}$, provided that $S$ is in one partite set of $G$ and $T$ is in the other.

Submitted 17 February, 2024; originally announced February 2024.

MSC Class: 05C45; 05C70; 05C75

arXiv:2402.10456 [pdf, other]

Efficient Generative Modeling via Penalized Optimal Transport Network

Authors: Wenhui Sophia Lu, Chenyang Zhong, Wing Hung Wong

Abstract: The generation of synthetic data with distributions that faithfully emulate the underlying data-generating mechanism holds paramount significance. Wasserstein Generative Adversarial Networks (WGANs) have emerged as a prominent tool for this task; however, due to the delicate equilibrium of the minimax formulation and the instability of Wasserstein distance in high dimensions, WGAN often manifests the pathological phenomenon of mode collapse. This results in generated samples that converge to a restricted set of outputs and fail to adequately capture the tail behaviors of the true distribution. Such limitations can lead to serious downstream consequences. To this end, we propose the Penalized Optimal Transport Network (POTNet), a versatile deep generative model based on the marginally-penalized Wasserstein (MPW) distance. Through the MPW distance, POTNet effectively leverages low-dimensional marginal information to guide the overall alignment of joint distributions. Furthermore, our primal-based framework enables direct evaluation of the MPW distance, thus eliminating the need for a critic network. This formulation circumvents training instabilities inherent in adversarial approaches and avoids the need for extensive parameter tuning. We derive a non-asymptotic bound on the generalization error of the MPW loss and establish convergence rates of the generative distribution learned by POTNet.
Our theoretical analysis together with extensive empirical evaluations demonstrate the superior performance of POTNet in accurately capturing underlying data structures, including their tail behaviors and minor modalities. Moreover, our model achieves orders of magnitude speedup during the sampling stage compared to state-of-the-art alternatives, which enables computationally efficient large-scale synthetic data generation.

Submitted 7 January, 2025; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: 54 pages, 12 figures

Showing 51–100 of 725 results for author: Wong, W