Search | arXiv e-print repository

Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model

Authors: Jiarui Jin, Haoyu Wang, Hongyan Li, Jun Li, Jiahui Pan, Shenda Hong

Abstract: Electrocardiogram (ECG) is essential for the clinical diagnosis of arrhythmias and other heart diseases, but deep learning methods based on ECG often face limitations due to the need for high-quality annotations. Although previous ECG self-supervised learning (eSSL) methods have made significant progress in representation learning from unannotated ECG data, they typically treat ECG signals as ordinary time-series data, segmenting the signals using fixed-size and fixed-step time windows, which often ignore the form and rhythm characteristics and latent semantic relationships in ECG signals. In this work, we introduce a novel perspective on ECG signals, treating heartbeats as words and rhythms as sentences. Based on this perspective, we first designed the QRS-Tokenizer, which generates semantically meaningful ECG sentences from the raw ECG signals. Building on these, we then propose HeartLang, a novel self-supervised learning framework for ECG language processing, learning general representations at form and rhythm levels. Additionally, we construct the largest heartbeat-based ECG vocabulary to date, which will further advance the development of ECG language processing. We evaluated HeartLang across six public ECG datasets, where it demonstrated robust competitiveness against other eSSL methods. Our data and code are publicly available at https://github.com/PKUDigitalHealth/HeartLang.

Submitted 15 February, 2025; originally announced February 2025.

Comments: 21 pages, 8 figures, accepted by International Conference on Learning Representations 2025

arXiv:2502.09346 [pdf, other]

Machine learning for modelling unstructured grid data in computational physics: a review

Authors: Sibo Cheng, Marc Bocquet, Weiping Ding, Tobias Sebastian Finn, Rui Fu, Jinlong Fu, Yike Guo, Eleda Johnson, Siyi Li, Che Liu, Eric Newton Moro, Jie Pan, Matthew Piggott, Cesar Quilodran, Prakhar Sharma, Kun Wang, Dunhui Xiao, Xiao Xue, Yong Zeng, Mingrui Zhang, Hao Zhou, Kewei Zhu, Rossella Arcucci

Abstract: Unstructured grid data are essential for modelling complex geometries and dynamics in computational physics. Yet, their inherent irregularity presents significant challenges for conventional machine learning (ML) techniques. This paper provides a comprehensive review of advanced ML methodologies designed to handle unstructured grid data in high-dimensional dynamical systems. Key approaches discussed include graph neural networks, transformer models with spatial attention mechanisms, interpolation-integrated ML methods, and meshless techniques such as physics-informed neural networks. These methodologies have proven effective across diverse fields, including fluid dynamics and environmental simulations. This review is intended as a guidebook for computational scientists seeking to apply ML approaches to unstructured grid data in their domains, as well as for ML researchers looking to address challenges in computational physics. It places special focus on how ML methods can overcome the inherent limitations of traditional numerical techniques and, conversely, how insights from computational physics can inform ML development. To support benchmarking, this review also provides a summary of open-access datasets of unstructured grid data in computational physics. Finally, emerging directions such as generative models with unstructured data, reinforcement learning for mesh generation, and hybrid physics-data-driven paradigms are discussed to inspire future advancements in this evolving field.

Submitted 13 February, 2025; originally announced February 2025.

arXiv:2502.08940 [pdf, other]

Towards Understanding Why Data Augmentation Improves Generalization

Authors: Jingyang Li, Jiachun Pan, Kim-Chuan Toh, Pan Zhou

Abstract: Data augmentation is a cornerstone technique in deep learning, widely used to improve model generalization. Traditional methods like random cropping and color jittering, as well as advanced techniques such as CutOut, Mixup, and CutMix, have achieved notable success across various domains. However, the mechanisms by which data augmentation improves generalization remain poorly understood, and existing theoretical analyses typically focus on individual techniques without a unified explanation. In this work, we present a unified theoretical framework that elucidates how data augmentation enhances generalization through two key effects: partial semantic feature removal and feature mixing. Partial semantic feature removal reduces the model's reliance on individual features, promoting diverse feature learning and better generalization. Feature mixing, by scaling down original semantic features and introducing noise, increases training complexity, driving the model to develop more robust features. Advanced methods like CutMix integrate both effects, achieving complementary benefits. Our theoretical insights are further supported by experimental results, validating the effectiveness of this unified perspective.

Submitted 12 February, 2025; originally announced February 2025.
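
As a rough illustration of the two effects named in the abstract above, the following Python sketch applies a Cutout-style partial feature removal and a Mixup-style feature mixing to plain lists of numbers. It is not the paper's formulation: the function names (`cutout`, `mixup`, `sample_lambda`) and the Beta(0.2, 0.2) default are assumptions chosen for illustration only.

```python
import random


def cutout(x, start, length):
    """Partial feature removal: zero out a contiguous span of the input."""
    out = list(x)
    # Clamp the span to the valid index range so out-of-bounds spans are safe.
    for i in range(max(start, 0), min(start + length, len(out))):
        out[i] = 0.0
    return out


def mixup(x_a, y_a, x_b, y_b, lam):
    """Feature mixing: blend two examples and their one-hot labels.

    lam is the weight given to example a; example b receives 1 - lam.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def sample_lambda(alpha=0.2, rng=random):
    """Draw a mixing weight from Beta(alpha, alpha); alpha=0.2 is an assumed default."""
    return rng.betavariate(alpha, alpha)


# Example usage with toy two-dimensional inputs and one-hot labels.
masked = cutout([1.0, 2.0, 3.0, 4.0], start=1, length=2)
mixed_x, mixed_y = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam=0.25)
```

In this toy form, `cutout` removes part of one example's features while `mixup` scales both examples down and adds them; a CutMix-style variant would combine the two ideas by pasting a span from one example into another and mixing the labels in proportion to the span length.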

arXiv:2502.08104 [pdf, other]

Homogeneous fermionic Hubbard gases in a flat-top optical lattice

Authors: Yu-Xuan Wang, Hou-Ji Shao, Yan-Song Zhu, De-Zhi Zhu, Hao-Nan Sun, Si-Yuan Chen, Xing-Can Yao, Yu-Ao Chen, Jian-Wei Pan

Abstract: Fermionic atoms in a large-scale, homogeneous optical lattice provide an ideal quantum simulator for investigating the fermionic Hubbard model, yet achieving this remains challenging. Here, by developing a hybrid potential that integrates a flat-top optical lattice with an optical box trap, we successfully realize the creation of three-dimensional, homogeneous fermionic Hubbard gases across approximately $8\times10^5$ lattice sites. This homogeneous system enables us to capture a well-defined energy band occupation that aligns perfectly with the theoretical calculations for a zero-temperature, ideal fermionic Hubbard model. Furthermore, by employing novel radio-frequency spectroscopy, we precisely measure the doublon fraction $D$ as a function of interaction strength $U$ and temperature $T$, respectively. The crossover from metal to Mott insulator is detected, where $D$ smoothly decreases with increasing $U$. More importantly, we observe a non-monotonic temperature dependence in $D$, revealing the Pomeranchuk effect and the development of extended antiferromagnetic correlations.

Submitted 11 February, 2025; originally announced February 2025.

arXiv:2502.08099 [pdf, other]

Feshbach spectroscopy of ultracold mixtures of $^{6}{\rm Li}$ and $^{164}{\rm Dy}$ atoms

Authors: Ke Xie, Xi Li, Yu-Yang Zhou, Ji-Hong Luo, Shuai Wang, Yu-Zhao Nie, Hong-Chi Shen, Yu-Ao Chen, Xing-Can Yao, Jian-Wei Pan

Abstract: We report on the observation of Feshbach resonances in ultracold $^6\mathrm{Li}$-$^{164}\mathrm{Dy}$ mixtures, where $^6\mathrm{Li}$ atoms are respectively prepared in their three lowest spin states, and $^{164}\mathrm{Dy}$ atoms are prepared in their lowest energy state. We observe 21 interspecies scattering resonances over a magnetic field range from 0 to 702 G using atom loss spectroscopy, three of which exhibit relatively broad widths. These broad resonances provide precise control over the interspecies interaction strength, enabling the study of strongly interacting effects in $^6\mathrm{Li}$-$^{164}\mathrm{Dy}$ mixtures. Additionally, we observe a well-isolated interspecies resonance at 700.1 G, offering a unique platform to explore novel impurity physics, where heavy dipolar $^{164}\mathrm{Dy}$ atoms are immersed in a strongly interacting Fermi superfluid of $^6\mathrm{Li}$ atoms.

Submitted 11 February, 2025; originally announced February 2025.

arXiv:2502.06100 [pdf, other]

Col-OLHTR: A Novel Framework for Multimodal Online Handwritten Text Recognition

Authors: Chenyu Liu, Jinshui Hu, Baocai Yin, Jia Pan, Bing Yin, Jun Du, Qingfeng Liu

Abstract: Online Handwritten Text Recognition (OLHTR) has gained considerable attention for its diverse range of applications. Current approaches usually treat OLHTR as a sequence recognition task, employing either a single trajectory or image encoder, or multi-stream encoders, combined with a CTC or attention-based recognition decoder. However, these approaches face several drawbacks: 1) single encoders typically focus on either local trajectories or visual regions, lacking the ability to dynamically capture relevant global features in challenging cases; 2) multi-stream encoders, while more comprehensive, suffer from complex structures and increased inference costs. To tackle this, we propose a Collaborative learning-based OLHTR framework, called Col-OLHTR, that learns multimodal features during training while maintaining a single-stream inference process. Col-OLHTR consists of a trajectory encoder, a Point-to-Spatial Alignment (P2SA) module, and an attention-based decoder. The P2SA module is designed to learn image-level spatial features through trajectory-encoded features and 2D rotary position embeddings. During training, an additional image-stream encoder-decoder is collaboratively trained to provide supervision for P2SA features. At inference, the extra streams are discarded, and only the P2SA module is used and merged before the decoder, simplifying the process while preserving high performance. Extensive experimental results on several OLHTR benchmarks demonstrate the state-of-the-art (SOTA) performance, proving the effectiveness and robustness of our design.

Submitted 9 February, 2025; originally announced February 2025.

Comments: ICASSP 2025

arXiv:2502.04420 [pdf, ps, other]

KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference

Authors: Xing Li, Zeyu Xing, Yiming Li, Linping Qu, Hui-Ling Zhen, Wulong Liu, Yiwu Yao, Sinno Jialin Pan, Mingxuan Yuan

Abstract: KV cache quantization can improve Large Language Model (LLM) inference throughput and latency in long-context and large-batch-size scenarios while preserving LLM effectiveness. However, current methods have three unsolved issues: overlooking layer-wise sensitivity to KV cache quantization, high overhead of online fine-grained decision-making, and low flexibility to different LLMs and constraints. Therefore, we theoretically analyze the inherent correlation of layer-wise transformer attention patterns to KV cache quantization errors and study why the key cache is generally more important than the value cache for quantization error reduction. We further propose a simple yet effective framework, KVTuner, to adaptively search for the optimal hardware-friendly layer-wise KV quantization precision pairs for coarse-grained KV cache with multi-objective optimization and directly utilize the offline searched configurations during online inference. To reduce the computational cost of offline calibration, we utilize intra-layer KV precision pair pruning and inter-layer clustering to reduce the search space. Experimental results show that we can achieve nearly lossless 3.25-bit mixed precision KV cache quantization for LLMs like Llama-3.1-8B-Instruct and 4.0-bit for sensitive models like Qwen2.5-7B-Instruct on mathematical reasoning tasks. The maximum inference throughput can be improved by 21.25% compared with KIVI-KV8 quantization over various context lengths. Our code and searched configurations are available at https://github.com/cmd2001/KVTuner.

Submitted 31 May, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

Comments: Accepted by ICML25. Code: https://github.com/cmd2001/KVTuner

arXiv:2502.04416 [pdf, other]

CMoE: Converting Mixture-of-Experts from Dense to Accelerate LLM Inference

Authors: Zehua Pei, Lancheng Zou, Hui-Ling Zhen, Xianzhi Yu, Wulong Liu, Sinno Jialin Pan, Mingxuan Yuan, Bei Yu

Abstract: Scaling large language models (LLMs) improves performance but dramatically increases inference costs. The feed-forward network (FFN), consuming approximately 70% of inference compute, represents a critical bottleneck, particularly in large batch size scenarios. While mixture-of-experts (MoE) architectures leverage activation sparsity for efficiency, converting existing dense models to MoEs traditionally requires resource-intensive continual pre-training. We present CMoE, a framework that rapidly transforms dense LLMs into MoEs without training. The key innovation lies in analyzing FFN neuron activations to partition them into shared (always active) and routed experts. Routed neurons are clustered using a balanced assignment algorithm, and a differentiable router is constructed analytically from activation statistics, enabling immediate deployment or optional lightweight fine-tuning. Experiments demonstrate that, with an activation ratio of 75%, it achieves remarkable results, delivering lossless precision in terms of perplexity while still maintaining a 5% acceleration. Further experiments reveal that a CMoE configuration activating just 25% of parameters reduces end-to-end latency by 1.5x while preserving usable perplexity without additional training. Moreover, a brief LoRA fine-tuning process (requiring only 1 hour and 2,000 samples) successfully recovers over 76% of the dense model's downstream accuracy. By effectively balancing performance and efficiency, CMoE offers a viable path forward for deploying LLMs in real-world scenarios where computational resources are limited. We make our code publicly available at https://github.com/JarvisPei/CMoE.

Submitted 24 May, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

arXiv:2502.03810 [pdf, other]

DeblurDiff: Real-World Image Deblurring with Generative Diffusion Models

Authors: Lingshun Kong, Jiawei Zhang, Dongqing Zou, Jimmy Ren, Xiaohe Wu, Jiangxin Dong, Jinshan Pan

Abstract: Diffusion models have achieved significant progress in image generation. The pre-trained Stable Diffusion (SD) models are helpful for image deblurring by providing clear image priors. However, directly using a blurry image or pre-deblurred one as a conditional control for SD will either hinder accurate structure extraction or make the results overly dependent on the deblurring network. In this work, we propose a Latent Kernel Prediction Network (LKPN) to achieve robust real-world image deblurring. Specifically, we co-train the LKPN in latent space with conditional diffusion. The LKPN learns a spatially variant kernel to guide the restoration of sharp images in the latent space. By applying element-wise adaptive convolution (EAC), the learned kernel is utilized to adaptively process the input feature, effectively preserving the structural information of the input. This process thereby more effectively guides the generative process of Stable Diffusion (SD), enhancing both the deblurring efficacy and the quality of detail reconstruction. Moreover, the results at each diffusion step are utilized to iteratively estimate the kernels in LKPN to better restore the sharp latent by EAC. This iterative refinement enhances the accuracy and robustness of the deblurring process. Extensive experimental results demonstrate that the proposed method outperforms state-of-the-art image deblurring methods on both benchmark and real-world images.

Submitted 6 February, 2025; originally announced February 2025.

arXiv:2502.02629 [pdf]

Graph Structure Learning for Tumor Microenvironment with Cell Type Annotation from non-spatial scRNA-seq data

Authors: Yu-An Huang, Yue-Chao Li, Hai-Ru You, Jie Pan, Xiyue Cao, Xinyuan Li, Zhi-An Huang, Zhu-Hong You

Abstract: The exploration of cellular heterogeneity within the tumor microenvironment (TME) via single-cell RNA sequencing (scRNA-seq) is essential for understanding cancer progression and response to therapy. Current scRNA-seq approaches, however, lack spatial context and rely on incomplete datasets of ligand-receptor interactions (LRIs), limiting accurate cell type annotation and cell-cell communication (CCC) inference. This study addresses these challenges using a novel graph neural network (GNN) model that enhances cell type prediction and cell interaction analysis. Our study utilized a dataset consisting of 49,020 cells from 19 patients across three cancer types: Leukemia, Breast Invasive Carcinoma, and Colorectal Cancer. The proposed scGSL model demonstrated robust performance, achieving an average accuracy of 84.83%, precision of 86.23%, recall of 81.51%, and an F1 score of 80.92% across all datasets. These metrics represent a significant enhancement over existing methods, which typically exhibit lower performance metrics. Additionally, by reviewing existing literature on gene interactions within the TME, the scGSL model is shown to robustly identify biologically meaningful gene interactions in an unsupervised manner, validated by significant expression differences in key gene pairs across various cancers. The source code and data used in this paper can be found at https://github.com/LiYuechao1998/scGSL.

Submitted 4 February, 2025; originally announced February 2025.

Comments: 29 pages, 6 figures

arXiv:2502.02390 [pdf, other]

CoAT: Chain-of-Associated-Thoughts Framework for Enhancing Large Language Models Reasoning

Authors: Jianfeng Pan, Senyou Deng, Shaomang Huang

Abstract: Research on LLM technologies is rapidly emerging, with most of them employing a 'fast thinking' approach to inference. Most LLMs generate the final result based solely on a single query and the LLM's reasoning capabilities. However, with the advent of OpenAI-o1, 'slow thinking' techniques have garnered increasing attention because their process is closer to the human thought process. Inspired by the human ability to constantly associate and replenish knowledge during thinking, we developed the novel Chain-of-Associated-Thoughts (CoAT) framework, which introduces an innovative synergy between the Monte Carlo Tree Search (MCTS) algorithm and a dynamic mechanism for integrating new key information, termed 'associative memory'. By combining the structured exploration capabilities of MCTS with the adaptive learning capacity of associative memory, CoAT significantly expands the LLM search space, enabling our framework to explore diverse reasoning pathways and dynamically update its knowledge base in real-time. This allows the framework to not only revisit and refine earlier inferences but also adaptively incorporate evolving information, ensuring that the final output is both accurate and comprehensive. To validate the effectiveness of our framework, we conducted extensive experiments across a range of generative and reasoning tasks. These experiments demonstrated that our framework outperforms conventional inference processes on accuracy, coherence, and diversity. The framework's ability to iteratively expand its search space while retaining contextually relevant information results.

Submitted 4 February, 2025; originally announced February 2025.

arXiv:2502.01989 [pdf, ps, other]

VFScale: Intrinsic Reasoning through Verifier-Free Test-time Scalable Diffusion Model

Authors: Tao Zhang, Jia-Shu Pan, Ruiqi Feng, Tailin Wu

Abstract: Inspired by human SYSTEM 2 thinking, LLMs excel at complex reasoning tasks via extended Chain-of-Thought. However, similar test-time scaling for diffusion models to tackle complex reasoning remains largely unexplored. From existing work, two primary challenges emerge in this setting: (i) the dependence on an external verifier, indicating a notable gap from the intrinsic reasoning of human intelligence without any external feedback, and (ii) the lack of an efficient search algorithm. In this paper, we introduce the Verifier-free Test-time Scalable Diffusion Model (VFScale) to achieve scalable intrinsic reasoning, which equips number-of-sample test-time scaling with the intrinsic energy function of diffusion models as the verifier. Concretely, VFScale comprises two key innovations to address the aforementioned challenges. On the training side, VFScale consists of a novel LRNCL loss and a KL regularization to improve the energy landscape, ensuring that the learned energy function itself serves as a reliable verifier. On the inference side, VFScale integrates the denoising process with a novel hybrid Monte Carlo Tree Search (hMCTS) to improve search efficiency. On challenging reasoning tasks of Maze and Sudoku, we demonstrate the effectiveness of VFScale's training objective and scalable inference method. In particular, trained with Maze sizes of up to $6\times6$, our VFScale solves 88% of Maze problems with much larger sizes of $15\times15$, while the standard diffusion model completely fails.

Submitted 31 May, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

Comments: 22 pages, 13 figures

arXiv:2502.01088 [pdf, ps, other]

Study of the mass spectra of doubly heavy $Ξ_{QQ^{\prime}}$ and $Ω_{QQ^{\prime}}$ baryons

Authors: Ji-Hai Pan, Ji-Si Pan

Abstract: The LHCb Collaboration first observed a doubly charmed baryon $Ξ^{++}_{cc}$ in the $Λ^{+}_{c}K^{-}π^{+}π^{+}$ decay with a mass of $3621.40\pm0.78$ MeV. In this paper, we enumerate the mass spectra of the radial and orbital excited states for the doubly heavy $Ξ_{QQ^{\prime}}$ and $Ω_{QQ^{\prime}}$ baryons using the Regge trajectory model and the scaling rules. Our studies suggest that $Ξ^{++}_{cc}$ can be grouped into the $1S$-wave state with the spin-parity quantum number $J^{P} = 1/2^{+}$. On the other hand, the mass of the $Ξ_{cc}$ state with $J^{P} = 3/2^{+}$ is predicted to be $3699.69$ MeV. We also predict the mass spectra of the unknown ground and excited states for the doubly heavy $Ξ_{QQ^{\prime}}$ and $Ω_{QQ^{\prime}}$ baryons, which provide useful references for future experimental tests.

Submitted 27 April, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

Comments: 31 pages

arXiv:2502.00353 [pdf]

Flexible delivery of high-power picosecond laser in purely-single optical mode of anti-resonant hollow-core fiber for micromachining

Authors: Xinshuo Chang, Qinan Jiang, Zhiyuan Huang, Jinyu Pan, Qingwei Zhang, Nan Li, Zhuozhao Luo, Ruochen Yin, Wenbin He, Jiapeng Huang, Yuxin Leng, Xin Jiang, Shanglu Yang, Meng Pang

Abstract: We present the flexible delivery of picosecond laser pulses with up to 20 W average power over a 3-m-long sample of anti-resonant hollow-core fiber (AR-HCF) for laser micromachining applications. Our experiments highlight the importance of optical mode purity of the AR-HCF for the manufacturing precision. We demonstrate that compared with an AR-HCF sample with a capillary to core (d/D) ratio of ~0.5, the AR-HCF with a d/D ratio of ~0.68 exhibits better capability of high-order-mode suppression, giving rise to improved micromachining quality. Moreover, the AR-HCF delivery system exhibits better pointing stability and set-up flexibility than the free-space beam delivery system. These results pave the way to practical applications of AR-HCF in developing advanced equipment for ultrafast laser micromachining.

Submitted 1 February, 2025; originally announced February 2025.

arXiv:2501.16728 [pdf, other]

Optimizing Efficiency of Mixed Traffic through Reinforcement Learning: A Topology-Independent Approach and Benchmark

Authors: Chuyang Xiao, Dawei Wang, Xinzheng Tang, Jia Pan, Yuexin Ma

Abstract: This paper presents a mixed traffic control policy designed to optimize traffic efficiency across diverse road topologies, addressing issues of congestion prevalent in urban environments. A model-free reinforcement learning (RL) approach is developed to manage large-scale traffic flow, using data collected by autonomous vehicles to influence human-driven vehicles. A real-world mixed traffic control benchmark is also released, which includes 444 scenarios from 20 countries, representing a wide geographic distribution and covering a variety of scenarios and road topologies. This benchmark serves as a foundation for future research, providing a realistic simulation environment for the development of effective policies. Comprehensive experiments demonstrate the effectiveness and adaptability of the proposed method, achieving better performance than existing traffic control methods in both intersection and roundabout scenarios. To the best of our knowledge, this is the first project to introduce a real-world complex scenarios mixed traffic control benchmark. Videos and code of our work are available at https://sites.google.com/berkeley.edu/mixedtrafficplus/home

Submitted 28 January, 2025; originally announced January 2025.

Comments: accepted to ICRA 2025

arXiv:2501.14497 [pdf, other]

Evaluating and Improving Graph to Text Generation with Large Language Models

Authors: Jie He, Yijun Yang, Wanqiu Long, Deyi Xiong, Victor Gutierrez-Basulto, Jeff Z. Pan

Abstract: Large language models (LLMs) have demonstrated immense potential across various tasks. However, research for exploring and improving the capabilities of LLMs in interpreting graph structures remains limited. To address this gap, we conduct a comprehensive evaluation of prompting current open-source LLMs on graph-to-text generation tasks. Although we explored the optimal prompting strategies and proposed a novel and effective diversity-difficulty-based few-shot sample selection method, we found that the improvements from tuning-free approaches were incremental, as LLMs struggle with planning on complex graphs, particularly those with a larger number of triplets. To further improve LLMs in planning with graph sequences and grounding in truth, we introduce a new graph-to-text dataset, PlanGTG, annotated with two sub-tasks: reordering and attribution. Through extensive automatic and human evaluations, we demonstrate significant improvements in the quality of generated text from both few-shot learning and fine-tuning perspectives using the PlanGTG dataset. Our study paves the way for new research directions in graph-to-text generation. PlanGTG datasets can be found in https://github.com/probe2/kg_text.

Submitted 14 February, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

Comments: NAACL 2025

arXiv:2501.14249 [pdf, other]

Humanity's Last Exam

Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 2,500 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To inform research and policymaking upon a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai.

Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

Comments: 29 pages, 6 figures

arXiv:2501.12492 [pdf, ps, other]

QuSplit: Achieving Both High Fidelity and Throughput via Job Splitting on Noisy Quantum Computers

Authors: Jinyang Li, Yuhong Song, Yipei Liu, Jianli Pan, Lei Yang, Travis Humble, Weiwen Jiang

Abstract: With the progression into the quantum utility era, computing is shifting toward quantum-centric architectures, where multiple quantum processors collaborate with classical computing resources. Platforms such as IBM Quantum and Amazon Braket exemplify this trend, enabling access to diverse quantum backends. However, efficient resource management remains a challenge, as quantum processors are highly susceptible to noise, which significantly impacts computation fidelity. Additionally, the heterogeneous noise characteristics across different processors add further complexity to scheduling and resource allocation. Existing scheduling strategies typically focus on mapping and scheduling jobs to these heterogeneous backends, which leads to some jobs suffering extremely low fidelity. Targeting quantum optimization jobs (e.g., VQC, VQE, QAOA) - among the most promising quantum applications in the NISQ era - we hypothesize that executing the later stages of a job on a high-fidelity quantum processor can significantly improve overall fidelity. To verify this, we use VQE as a case study and develop a Genetic Algorithm-based scheduling framework that incorporates job splitting to optimize fidelity and throughput. Experimental results demonstrate that our approach consistently maintains high fidelity across all jobs while significantly enhancing system throughput. Furthermore, the proposed algorithm exhibits excellent scalability in handling an increasing number of quantum processors and larger workloads, making it a robust and practical solution for emerging quantum computing platforms. To further substantiate its effectiveness, we conduct experiments on a real quantum processor, IBM Strasbourg, which confirm that job splitting improves fidelity and reduces the number of iterations required for convergence.

Submitted 11 March, 2025; v1 submitted 21 January, 2025; originally announced January 2025.

arXiv:2501.09783 [pdf, other]

GeoManip: Geometric Constraints as General Interfaces for Robot Manipulation

Authors: Weiliang Tang, Jia-Hui Pan, Yun-Hui Liu, Masayoshi Tomizuka, Li Erran Li, Chi-Wing Fu, Mingyu Ding

Abstract: We present GeoManip, a framework to enable generalist robots to leverage essential conditions derived from object and part relationships, as geometric constraints, for robot manipulation. For example, cutting the carrot requires adhering to a geometric constraint: the blade of the knife should be perpendicular to the carrot's direction. By interpreting these constraints through symbolic language representations and translating them into low-level actions, GeoManip bridges the gap between natural language and robotic execution, enabling greater generalizability across diverse, even unseen, tasks, objects, and scenarios. Unlike vision-language-action models that require extensive training, GeoManip operates training-free by utilizing large foundational models: a constraint generation module that predicts stage-specific geometric constraints and a geometry parser that identifies object parts involved in these constraints. A solver then optimizes trajectories to satisfy inferred constraints from task descriptions and the scene. Furthermore, GeoManip learns in-context and provides five appealing human-robot interaction features: on-the-fly policy adaptation, learning from human demonstrations, learning from failure cases, long-horizon action planning, and efficient data collection for imitation learning. Extensive evaluations on both simulations and real-world scenarios demonstrate GeoManip's state-of-the-art performance, with superior out-of-distribution generalization while avoiding costly model training.

Submitted 16 January, 2025; originally announced January 2025.

Comments: 32 pages, 13 figures

arXiv:2501.09655 [pdf, other]

A Survey of Research in Large Language Models for Electronic Design Automation

Authors: Jingyu Pan, Guanglei Zhou, Chen-Chia Chang, Isaac Jacobson, Jiang Hu, Yiran Chen

Abstract: Within the rapidly evolving domain of Electronic Design Automation (EDA), Large Language Models (LLMs) have emerged as transformative technologies, offering unprecedented capabilities for optimizing and automating various aspects of electronic design. This survey provides a comprehensive exploration of LLM applications in EDA, focusing on advancements in model architectures, the implications of varying model sizes, and innovative customization techniques that enable tailored analytical insights. By examining the intersection of LLM capabilities and EDA requirements, the paper highlights the significant impact these models have on extracting nuanced understandings from complex datasets. Furthermore, it addresses the challenges and opportunities in integrating LLMs into EDA workflows, paving the way for future research and application in this dynamic field. Through this detailed analysis, the survey aims to offer valuable insights to professionals in the EDA industry, AI researchers, and anyone interested in the convergence of advanced AI technologies and electronic design.

Submitted 16 January, 2025; originally announced January 2025.

Comments: 21 pages, 2 figures, 3 tables, accepted by TODAES

arXiv:2501.08667 [pdf, ps, other]

TimeFlow: Longitudinal Brain Image Registration and Aging Progression Analysis

Authors: Bailiang Jian, Jiazhen Pan, Yitong Li, Fabian Bongratz, Ruochen Li, Daniel Rueckert, Benedikt Wiestler, Christian Wachinger

Abstract: Predicting future brain states is crucial for understanding healthy aging and neurodegenerative diseases. Longitudinal brain MRI registration, a cornerstone for such analyses, has long been limited by its inability to forecast future developments, reliance on extensive dense longitudinal data, and the need to balance registration accuracy with temporal smoothness. In this work, we present TimeFlow, a novel framework for longitudinal brain MRI registration that overcomes all these challenges. TimeFlow leverages a U-Net architecture with temporal conditioning inspired by diffusion models, enabling accurate registration using only two images as input and facilitating prospective analyses through future image prediction. Unlike traditional methods, TimeFlow eliminates the demand for explicit smoothness regularizers and dense sequential data while maintaining temporal consistency and continuity. Experimental results highlight its superior performance in both future timepoint prediction and registration accuracy compared to state-of-the-art methods. Additionally, TimeFlow supports novel biological brain aging analyses, effectively differentiating neurodegenerative conditions from healthy aging, all without requiring segmentation, thus avoiding non-trivial annotation and inconsistent segmentation flaws. This framework paves the way for accurate, data-efficient, and annotation-free prospective analyses of brain aging and chronic diseases.

Submitted 7 July, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

arXiv:2501.08164 [pdf, other]

doi 10.1103/PhysRevResearch.7.023079

Gapless higher-order topology and corner states in Floquet systems

Authors: Longwen Zhou, Rongtao Wang, Jiaxin Pan

Abstract: Higher-order topological phases (HOTPs) possess localized and symmetry-protected eigenmodes at corners and along hinges in two- and three-dimensional lattices. The numbers of these topological boundary modes will undergo quantized changes at the critical points between different HOTPs. In this work, we reveal unique higher-order topology induced by time-periodic driving at the critical points of topological phase transitions, which has no equilibrium counterparts and also goes beyond the description of gapped topological matter. Using an alternately coupled Creutz ladder and its Floquet-driven descendants as illustrative examples, we analytically characterize and numerically demonstrate the zero and $π$ corner modes that could emerge at the critical points between different Floquet HOTPs. Moreover, we propose a unified scheme of bulk-corner correspondence for both gapless and gapped Floquet HOTPs protected by chiral symmetry in two dimensions. Our work reveals the possibility of corner modes surviving topological transitions in Floquet systems and initializes the study of higher-order Floquet topology at quantum criticality.

Submitted 23 April, 2025; v1 submitted 14 January, 2025; originally announced January 2025.

Comments: 24 pages, 7 figures, accepted version

Journal ref: Phys. Rev. Research 7, 023079 (2025)

arXiv:2501.06482 [pdf, other]

Deep Reinforcement Learning Optimized Intelligent Resource Allocation in Active RIS-Integrated TN-NTN Networks

Authors: Muhammad Ahmed Mohsin, Hassan Rizwan, Muhammad Jazib, Muhammad Iqbal, Muhammad Bilal, Tabinda Ashraf, Muhammad Farhan Khan, Jen-Yi Pan

Abstract: This work explores the deployment of active reconfigurable intelligent surfaces (A-RIS) in integrated terrestrial and non-terrestrial networks (TN-NTN) while utilizing coordinated multipoint non-orthogonal multiple access (CoMP-NOMA). Our system model incorporates a UAV-assisted RIS in coordination with a terrestrial RIS which aims for signal enhancement. We aim to maximize the sum rate for all users in the network using a custom hybrid proximal policy optimization (H-PPO) algorithm by optimizing the UAV trajectory, base station (BS) power allocation factors, active RIS amplification factor, and phase shift matrix. We integrate edge users into NOMA pairs to achieve diversity gain, further enhancing the overall experience for edge users. Exhaustive comparisons are made with passive RIS-assisted networks to demonstrate the superior efficacy of active RIS in terms of energy efficiency, outage probability, and network sum rate.

Submitted 11 January, 2025; originally announced January 2025.

Comments: Accepted to WCNC 2025

arXiv:2501.06063 [pdf, other]

doi 10.1088/1361-6463/ada44c

Bias voltage controlled inversions of tunneling magnetoresistance in van der Waals heterostructures Fe3GaTe2/hBN/Fe3GaTe2

Authors: Lihao Zhang, Miao He, Xiaoyu Wang, Haodong Zhang, Keying Han, Yonglai Liu, Lei Zhang, Yingchun Cheng, Jie Pan, Zhe Qu, Zhe Wang

Abstract: We report the bias voltage controlled inversions of tunneling magnetoresistance (TMR) in magnetic tunnel junctions composed of Fe3GaTe2 electrodes and an hBN tunneling barrier, observed at room temperature. The polarity reversal of TMR occurs consistently at around 0.625 V across multiple devices and temperatures, highlighting the robustness of the effect. To understand this behavior, we developed a theoretical model incorporating spin-resolved density of states (DOS) at high energy levels. By adjusting the DOS weighting at different k points to account for misalignment between the crystal structure of electrodes in experimental devices, we improved agreement between experimental and theoretical inversion voltages. Our results provide valuable insight into the voltage-controlled spin injection and detection in two-dimensional magnetic tunnel junctions, with implications for the development of energy-efficient spintronic devices.

Submitted 10 January, 2025; originally announced January 2025.

Comments: 4 Figures

Journal ref: Journal of Physics D: Applied Physics, 58, 105005 (2025)

arXiv:2501.05734 [pdf, ps, other]

Homogenization of Inhomogeneous Incompressible Navier-Stokes Equations in Domains with Very Tiny Holes

Authors: Yong Lu, Jiaojiao Pan, Peikang Yang

Abstract: In this paper, we study the homogenization problems of $3D$ inhomogeneous incompressible Navier-Stokes system perforated with very tiny holes whose diameters are much smaller than their mutual distances. The key is to establish the equations in the homogeneous domain without holes for the zero extensions of the weak solutions. This allows us to derive time derivative estimates and show the strong convergence of the density and the momentum by Aubin-Lions type argument. For the case of small holes, we finally show the limit equations remain unchanged in the homogenization limit.

Submitted 10 January, 2025; originally announced January 2025.

Comments: 13 pages. arXiv admin note: text overlap with arXiv:2204.01207

MSC Class: 35B27; 76M50; 76N06

arXiv:2501.05153 [pdf, other]

Assisting MoCap-Based Teleoperation of Robot Arm using Augmented Reality Visualisations

Authors: Qiushi Zhou, Antony Chacon, Jiahe Pan, Wafa Johal

Abstract: Teleoperating a robot arm involves the human operator positioning the robot's end-effector or programming each joint. Whereas humans can control their own arms easily by integrating visual and proprioceptive feedback, it is challenging to control an external robot arm in the same way, due to its inconsistent orientation and appearance. We explore teleoperating a robot arm through motion-capture (MoCap) of the human operator's arm with the assistance of augmented reality (AR) visualisations. We investigate how AR helps teleoperation by visualising a virtual reference of the human arm alongside the robot arm to help users understand the movement mapping. We found that the AR overlay of a humanoid arm on the robot in the same orientation helped users learn the control. We discuss findings and future work on MoCap-based robot teleoperation.

Submitted 9 January, 2025; originally announced January 2025.

Comments: 5 pages, 7 figures, accepted to HRI 2025

arXiv:2501.05141 [pdf, other]

OfficeMate: Pilot Evaluation of an Office Assistant Robot

Authors: Jiahe Pan, Sarah Schömbs, Yan Zhang, Ramtin Tabatabaei, Muhammad Bilal, Wafa Johal

Abstract: Office Assistant Robots (OARs) offer a promising solution to proactively provide in-situ support to enhance employee well-being and productivity in office spaces. We introduce OfficeMate, a social OAR designed to assist with practical tasks, foster social interaction, and promote health and well-being. Through a pilot evaluation with seven participants in an office environment, we found that users see potential in OARs for reducing stress and promoting healthy habits and value the robot's ability to provide companionship and physical activity reminders in the office space. However, concerns regarding privacy, communication, and the robot's interaction timing were also raised. The feedback highlights the need to carefully consider the robot's appearance and behaviour to ensure it enhances user experience and aligns with office social norms. We believe these insights will better inform the development of adaptive, intelligent OAR systems for future office space integration.

Submitted 9 January, 2025; originally announced January 2025.

Comments: 5 pages, 1 figure, accepted to HRI 2025

arXiv:2501.04744 [pdf, other]

Exact computation of the color function for triangular element interfaces

Authors: Jieyun Pan, Désir-André Koffi Bi, Ahmed Basil Kottilingal, Serena Costanzo, Jiacai Lu, Yue Ling, Ruben Scardovelli, Grétar Tryggvason, Stéphane Zaleski

Abstract: The calculation of the volume enclosed by curved surfaces discretized into triangular elements and a cube is of great importance in different domains, such as computer graphics and multiphase flow simulations. We propose a robust algorithm, the Front2VOF (F2V) algorithm, to address this problem. The F2V algorithm consists of two main steps. First, it identifies the polygons within the cube by segmenting the triangular elements on the surface, retaining only the portions inside the cube boundaries. Second, it computes the volume enclosed by these polygons in combination with the cube faces. To validate the algorithm's accuracy and robustness, we tested it using a range of synthetic configurations with known analytical solutions.

Submitted 8 January, 2025; originally announced January 2025.

arXiv:2501.02580 [pdf, ps, other]

LP-ICP: General Localizability-Aware Point Cloud Registration for Robust Localization in Extreme Unstructured Environments

Authors: Haosong Yue, Qingyuan Xu, Fei Chen, Jia Pan, Weihai Chen

Abstract: The Iterative Closest Point (ICP) algorithm is a crucial component of LiDAR-based SLAM algorithms. However, its performance can be negatively affected in unstructured environments that lack features and geometric structures, leading to low accuracy and poor robustness in localization and mapping. It is known that degeneracy caused by the lack of geometric constraints can lead to errors in 6-DOF pose estimation along ill-conditioned directions. Therefore, there is a need for a broader and more fine-grained degeneracy detection and handling method. This paper proposes a new point cloud registration framework, LP-ICP, that combines point-to-line and point-to-plane distance metrics in the ICP algorithm, with localizability detection and handling. Rather than relying solely on point-to-plane localizability information, LP-ICP enhances the localizability analysis by incorporating a point-to-line metric, thereby exploiting richer geometric constraints. It consists of a localizability detection module and an optimization module. The localizability detection module performs localizability analysis by utilizing the correspondences between edge points (with low local smoothness) to lines and planar points (with high local smoothness) to planes between the scan and the map. The localizability contribution of individual correspondence constraints can be applied to a broader range. The optimization module adds additional soft and hard constraints to the optimization equations based on the localizability category. This allows the pose to be constrained along ill-conditioned directions. The proposed method is evaluated on simulation and real-world datasets, showing comparable or better accuracy than the state-of-the-art methods in tested scenarios. Observed variations in partially localizable directions suggest the need for further investigation on robustness and generalizability.

Submitted 31 May, 2025; v1 submitted 5 January, 2025; originally announced January 2025.

Comments: 18 Pages, 9 Figures

arXiv:2501.01495 [pdf, other]

doi 10.3847/1538-4357/adb3a0

Search for continuous gravitational waves from known pulsars in the first part of the fourth LIGO-Virgo-KAGRA observing run

Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, A. G. Abac, R. Abbott, I. Abouelfettouh, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, D. Agarwal, M. Agathos, M. Aghaei Abchouyeh, O. D. Aguiar, I. Aguilar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Al-Jodah, C. Alléné, et al. (1794 additional authors not shown)

Abstract: Continuous gravitational-wave (CW) emission from neutron stars carries information about their internal structure and equation of state, and it can provide tests of General Relativity. We present a search for CWs from a set of 45 known pulsars in the first part of the fourth LIGO-Virgo-KAGRA observing run, known as O4a. We conducted a targeted search for each pulsar using three independent analysis methods considering the single-harmonic and the dual-harmonic emission models. We find no evidence of a CW signal in O4a data for either model and set upper limits on the signal amplitude and on the ellipticity, which quantifies the asymmetry in the neutron star mass distribution. For the single-harmonic emission model, 29 targets have the upper limit on the amplitude below the theoretical spin-down limit. The lowest upper limit on the amplitude is $6.4\times10^{-27}$ for the young energetic pulsar J0537-6910, while the lowest constraint on the ellipticity is $8.8\times10^{-9}$ for the bright nearby millisecond pulsar J0437-4715. Additionally, for a subset of 16 targets we performed a narrowband search that is more robust regarding the emission model, with no evidence of a signal. We also found no evidence of non-standard polarizations as predicted by the Brans-Dicke theory.

Submitted 2 January, 2025; originally announced January 2025.

Comments: main paper: 12 pages, 6 figures, 4 tables

Report number: LIGO-P2400315

Journal ref: Astrophys.J. 983 (2025) 2, 99

arXiv:2412.21139 [pdf, ps, other]

Training Software Engineering Agents and Verifiers with SWE-Gym

Authors: Jiayi Pan, Xingyao Wang, Graham Neubig, Navdeep Jaitly, Heng Ji, Alane Suhr, Yizhe Zhang

Abstract: We present SWE-Gym, the first environment for training real-world software engineering (SWE) agents. SWE-Gym contains 2,438 real-world Python task instances, each comprising a codebase with an executable runtime environment, unit tests, and a task specified in natural language. We use SWE-Gym to train language model based SWE agents, achieving up to 19% absolute gains in resolve rate on the popular SWE-Bench Verified and Lite test sets. We also experiment with inference-time scaling through verifiers trained on agent trajectories sampled from SWE-Gym. When combined with our fine-tuned SWE agents, we achieve 32.0% and 26.0% on SWE-Bench Verified and Lite, respectively, reflecting a new state-of-the-art for open-weight SWE agents. To facilitate further research, we publicly release SWE-Gym, models, and agent trajectories.

Submitted 6 June, 2025; v1 submitted 30 December, 2024; originally announced December 2024.

Comments: Accepted at ICML 2025. Code at https://github.com/SWE-Gym/SWE-Gym

arXiv:2412.20318 [pdf, ps, other]

A note on the Cuntz algebra automorphisms

Authors: Junyao Pan

Abstract: Permutative automorphisms of the Cuntz algebras $\mathcal{O}_n$ are in bijection with the stable permutations of $[n]^k$. This bijection can therefore be used to determine the restricted Weyl group of $Aut(\mathcal{O}_n)$ by describing all stable permutations. In this note, we characterize some stable involutions of rank one, and thus we prove Conjecture 12.2 of Brenti and Conti [Adv. Math. 381 (2021), p. 60].

Submitted 28 December, 2024; originally announced December 2024.

MSC Class: 05E16; 05A05; 05A15

arXiv:2412.18882 [pdf, other]

Boosted fusion gates above the percolation threshold for scalable graph-state generation

Authors: Yong-Peng Guo, Geng-Yan Zou, Xing Ding, Qi-Hang Zhang, Mo-Chi Xu, Run-Ze Liu, Jun-Yi Zhao, Zhen-Xuan Ge, Li-Chao Peng, Ke-Mi Xu, Yi-Yang Lou, Zhen Ning, Lin-Jun Wang, Hui Wang, Yong-Heng Huo, Yu-Ming He, Chao-Yang Lu, Jian-Wei Pan

Abstract: Fusing small resource states into a larger, fully connected graph-state is essential for scalable photonic quantum computing. Theoretical analysis reveals that this can only be achieved when the success probability of the fusion gate surpasses a specific percolation threshold of 58.98% by using three-photon GHZ states as resource states. However, such an implementation of a fusion gate has never been experimentally realized before. Here, we successfully demonstrate a boosted fusion gate with a theoretical success probability of 75%, using deterministically generated auxiliary states. The success probability is experimentally measured to be 71.0(7)%. We further demonstrate the effectiveness of the boosted fusion gate by fusing two Bell states with a fidelity of 67(2)%. Our work paves a crucial path toward scalable linear optical quantum computing.

Submitted 25 December, 2024; originally announced December 2024.

Comments: 5 pages, 4 figures

arXiv:2412.18494 [pdf, other]

Topological phases protected by projective PT symmetry in alkaline-earth-like atoms

Authors: Xiaofan Zhou, Suotang Jia, Jian-Song Pan

Abstract: An important aspect in categorizing topological phases is whether the system is spinless or spinful, given that these classes exhibit distinct symmetry algebras, leading to disparate topological classifications. By utilizing the projective presentation strategy, the topological phases of spinless (or spinful) systems can be emulated using spinful (or spinless) systems augmented with gauge fields. In this study, we propose to implement the topological phases safeguarded by the unique projective space-time (PT) symmetry inherent to spinful models, using synthetic spinless alkaline-earth-like atoms. Employing the separation of orbital and nuclear-spin degrees of freedom, the model is configured as a rectangular tube penetrated by a uniform magnetic flux through each plaquette, which simulates a spinless ladder endowed with projective PT symmetry satisfying the algebraic properties of a spinful model. For interacting topological phases with inter-orbital spin-exchange interactions, which also adhere to PT symmetry, the four-fold degeneracy of edge modes is split into two pairs of edge modes with two-fold degeneracy. We map the complete phase diagram in the end and discover that these interacting topological phases ultimately evolve into distinct charge-density-wave phases via spontaneous symmetry breaking.

Submitted 24 December, 2024; originally announced December 2024.

Comments: 7 pages+6 figures

arXiv:2412.18451 [pdf, other]

Interaction-induced inversion of chiral transports

Authors: Li Pan, Qian Liang, Chang-An Yang, Yu Huang, Pengjie Liu, Fanying Xi, Wei Yi, Xiaofan Zhou, Jian-Song Pan

Abstract: We study the chiral transport of interacting bosons in a two-leg flux ladder with on-site interactions. Focusing on the flux-induced chiral current along the two legs, we show that, counter-intuitively, on-site interactions can reverse the direction of the chiral flow. For a Bose-Einstein condensate whose dynamical evolution is driven by the Gross-Pitaevskii equation under the mean-field approximation, this reversal can be understood as an interaction-induced dynamic occupation inversion, under which single-particle band with opposing chirality becomes heavily populated in the dynamics. This chirality inversion also persists in the two-body dynamics with strong quantum fluctuations beyond the mean-field regime, as demonstrated through time-dependent density-matrix renormalization group and exact diagonalization analyses. Herein, besides the band-occupation-inversion mechanism, we find that the formation of two-body bound states with opposite chirality contributes significantly to the reversed chiral transport. Our discovery highlights the significance of correlation effects in quantum transport, and can be readily demonstrated using cold atoms.

Submitted 24 December, 2024; originally announced December 2024.

Comments: 7 pages+3 figures

arXiv:2412.18431 [pdf, ps, other]

GeAR: Graph-enhanced Agent for Retrieval-augmented Generation

Authors: Zhili Shen, Chenxin Diao, Pavlos Vougiouklis, Pascual Merita, Shriram Piramanayagam, Enting Chen, Damien Graux, Andre Melo, Ruofei Lai, Zeren Jiang, Zhongyang Li, YE QI, Yang Ren, Dandan Tu, Jeff Z. Pan

Abstract: Retrieval-augmented Generation (RAG) relies on effective retrieval capabilities, yet traditional sparse and dense retrievers inherently struggle with multi-hop retrieval scenarios. In this paper, we introduce GeAR, a system that advances RAG performance through two key innovations: (i) an efficient graph expansion mechanism that augments any conventional base retriever, such as BM25, and (ii) an agent framework that incorporates the resulting graph-based retrieval into a multi-step retrieval framework. Our evaluation demonstrates GeAR's superior retrieval capabilities across three multi-hop question answering datasets. Notably, our system achieves state-of-the-art results with improvements exceeding 10% on the challenging MuSiQue dataset, while consuming fewer tokens and requiring fewer iterations than existing multi-step retrieval systems. The project page is available at https://gear-rag.github.io.

Submitted 22 June, 2025; v1 submitted 24 December, 2024; originally announced December 2024.

Comments: ACL 2025 Findings

arXiv:2412.18243 [pdf, other]

A Large-Scale IPv6-Based Measurement of the Starlink Network

Authors: Bingsen Wang, Xiaohui Zhang, Shuai Wang, Li Chen, Jinwei Zhao, Jianping Pan, Dan Li, Yong Jiang

Abstract: Low Earth Orbit (LEO) satellite networks have attracted considerable attention for their ability to deliver global, low-latency broadband Internet services. In this paper, we present a large-scale measurement study of the Starlink network, the largest LEO satellite constellation to date. We begin by proposing an efficient method for discovering active Starlink user routers, identifying approximately 3.2 million IPv6 addresses across 102 countries and 123 regions; to the best of our knowledge, this is the most complete list of Starlink user routers' active IPv6 addresses. Based on the discovered user routers, we map the Starlink backbone network, which consists of 33 Points of Presence (PoPs) and 70 connections between them. Furthermore, we conduct a detailed statistical analysis of active Starlink users and PoPs. Finally, we summarize the IPv6 address assignment strategy adopted by the Starlink network. The dataset of the backbone network is publicly available at https://ki3.org.cn/#/starlink-network.

Submitted 26 December, 2024; v1 submitted 24 December, 2024; originally announced December 2024.

Comments: 6 pages

arXiv:2412.18119 [pdf, other]

Age Optimal Sampling for Unreliable Channels under Unknown Channel Statistics

Authors: Hongyi He, Haoyue Tang, Jiayu Pan, Jintao Wang, Jian Song, Leandros Tassiulas

Abstract: In this paper, we study a system in which a sensor forwards status updates to a receiver through an error-prone channel, while the receiver sends the transmission results back to the sensor via a reliable channel. Both channels are subject to random delays. To evaluate the timeliness of the status information at the receiver, we use the Age of Information (AoI) metric. The objective is to design a sampling policy that minimizes the expected time-average AoI, even when the channel statistics (e.g., delay distributions) are unknown. We first review the threshold structure of the optimal offline policy under known channel statistics and then reformulate the design of the online algorithm as a stochastic approximation problem. We propose a Robbins-Monro algorithm to solve this problem and demonstrate that the optimal threshold can be approximated almost surely. Moreover, we prove that the cumulative AoI regret of the online algorithm increases with rate $\mathcal{O}(\ln K)$, where $K$ is the number of successful transmissions. In addition, our algorithm is shown to be minimax order optimal, in the sense that for any online learning algorithm, the cumulative AoI regret up to the $K$-th successful transmission grows at a rate of at least $\Omega(\ln K)$ under the worst-case delay distribution. Finally, we improve the stability of the proposed online learning algorithm through a momentum-based stochastic gradient descent algorithm. Simulation results validate the performance of our proposed algorithm.

Submitted 24 February, 2025; v1 submitted 23 December, 2024; originally announced December 2024.

arXiv:2412.17477 [pdf, other]

Predicting Satisfied User and Machine Ratio for Compressed Images: A Unified Approach

Authors: Qi Zhang, Shanshe Wang, Xinfeng Zhang, Siwei Ma, Jingshan Pan, Wen Gao

Abstract: Nowadays, high-quality images are pursued by both humans for better viewing experience and by machines for more accurate visual analysis. However, images are usually compressed before being consumed, decreasing their quality. It is meaningful to predict the perceptual quality of compressed images for both humans and machines, which guides the optimization for compression. In this paper, we propose a unified approach to address this. Specifically, we create a deep learning-based model to predict Satisfied User Ratio (SUR) and Satisfied Machine Ratio (SMR) of compressed images simultaneously. We first pre-train a feature extractor network on a large-scale SMR-annotated dataset with human perception-related quality labels generated by diverse image quality models, which simulates the acquisition of SUR labels. Then, we propose an MLP-Mixer-based network to predict SUR and SMR by leveraging and fusing the extracted multi-layer features. We introduce a Difference Feature Residual Learning (DFRL) module to learn more discriminative difference features. We further use a Multi-Head Attention Aggregation and Pooling (MHAAP) layer to aggregate difference features and reduce their redundancy. Experimental results indicate that the proposed model significantly outperforms state-of-the-art SUR and SMR prediction methods. Moreover, our joint learning scheme of human and machine perceptual quality prediction tasks is effective at improving the performance of both.

Submitted 23 December, 2024; originally announced December 2024.

arXiv:2412.17254 [pdf, ps, other]

Enhancing Long Video Generation Consistency without Tuning

Authors: Xingyao Li, Fengzhuo Zhang, Jiachun Pan, Yunlong Hou, Vincent Y. F. Tan, Zhuoran Yang

Abstract: Despite the considerable progress achieved in the long video generation problem, there is still significant room to improve the consistency of the generated videos, particularly in terms of their smoothness and transitions between scenes. We address these issues to enhance the consistency and coherence of videos generated with either single or multiple prompts. We propose the Time-frequency based temporal Attention Reweighting Algorithm (TiARA), which judiciously edits the attention score matrix based on the Discrete Short-Time Fourier Transform. This method is supported by a frequency-based analysis, ensuring that the edited attention score matrix achieves improved consistency across frames. It represents the first-of-its-kind for frequency-based methods in video diffusion models. For videos generated by multiple prompts, we further uncover key factors such as the alignment of the prompts affecting prompt interpolation quality. Inspired by our analyses, we propose PromptBlend, an advanced prompt interpolation pipeline that systematically aligns the prompts. Extensive experimental results validate the efficacy of our proposed method, demonstrating consistent and substantial improvements over multiple baselines.

Submitted 7 July, 2025; v1 submitted 22 December, 2024; originally announced December 2024.

Comments: ICML 2025 Workshop on Building Physically Plausible World Models (Best Paper), 32 pages, 17 figures

arXiv:2412.17032 [pdf, other]

MINTQA: A Multi-Hop Question Answering Benchmark for Evaluating LLMs on New and Tail Knowledge

Authors: Jie He, Nan Hu, Wanqiu Long, Jiaoyan Chen, Jeff Z. Pan

Abstract: Large language models (LLMs) have demonstrated impressive capabilities in various reasoning tasks but face significant challenges with complex, knowledge-intensive multi-hop queries, particularly those involving new or long-tail knowledge. Existing benchmarks often fail to fully address these challenges. To bridge this gap, we introduce MINTQA (Multi-hop Question Answering on New and Tail Knowledge), a comprehensive benchmark to evaluate LLMs' capabilities in multi-hop reasoning across four critical dimensions: question handling strategy, sub-question generation, retrieval-augmented generation, and iterative or dynamic decomposition and retrieval. MINTQA comprises 10,479 question-answer pairs for evaluating new knowledge and 17,887 pairs for assessing long-tail knowledge, with each question equipped with corresponding sub-questions and answers. Our systematic evaluation of 22 state-of-the-art LLMs on MINTQA reveals significant limitations in their ability to handle complex knowledge base queries, particularly in handling new or unpopular knowledge. Our findings highlight critical challenges and offer insights for advancing multi-hop reasoning capabilities. The MINTQA benchmark is available at https://github.com/probe2/multi-hop/.

Submitted 28 January, 2025; v1 submitted 22 December, 2024; originally announced December 2024.

arXiv:2412.16039 [pdf, ps, other]

SafeCFG: Controlling Harmful Features with Dynamic Safe Guidance for Safe Generation

Authors: Jiadong Pan, Liang Li, Hongcheng Gao, Zheng-Jun Zha, Qingming Huang, Jiebo Luo

Abstract: Diffusion models (DMs) have demonstrated exceptional performance in text-to-image tasks, leading to their widespread use. With the introduction of classifier-free guidance (CFG), the quality of images generated by DMs is significantly improved. However, one can use DMs to generate more harmful images by maliciously guiding the image generation process through CFG. Existing safe alignment methods aim to mitigate the risk of generating harmful images but often reduce the quality of clean image generation. To address this issue, we propose SafeCFG to adaptively control harmful features with dynamic safe guidance by modulating the CFG generation process. It dynamically guides the CFG generation process based on the harmfulness of the prompts, inducing significant deviations only in harmful CFG generations, achieving high-quality and safe generation. SafeCFG can simultaneously modulate different harmful CFG generation processes, so it can eliminate harmful elements while preserving high-quality generation. Additionally, SafeCFG provides the ability to detect image harmfulness, allowing unsupervised safe alignment on DMs without pre-defined clean or harmful labels. Experimental results show that images generated by SafeCFG achieve both high quality and safety, and safe DMs trained in our unsupervised manner also exhibit good safety performance.

Submitted 29 May, 2025; v1 submitted 20 December, 2024; originally announced December 2024.

arXiv:2412.15482 [pdf]

Tunable ultraviolet dispersive-wave emission driven directly by 40-fs Ti: sapphire laser pulses in hollow capillary fiber

Authors: Tiandao Chen, Zhiyuan Huang, Jinyu Pan, Donghan Liu, Yinuo Zhao, Wenbin He, Jiapeng Huang, Xin Jiang, Meng Pang, Yuxin Leng, Ruxin Li

Abstract: We demonstrate that by using 1-m-long gas-filled hollow capillary fiber (HCF) with a core diameter of 100 μm, tunable ultraviolet (UV) dispersive-wave (DW) pulses can be generated in a compact, single-stage set-up driven directly by 40-fs Ti: sapphire laser pulses. By adjusting the gas type and pressure inside the HCF, the central wavelength of the UV DW can be continuously tuned from 185 nm to ~450 nm. In the experiment, we found that for longer-wavelength (from ~320 to ~450 nm) DW generation, Raman-active gas filled in the HCF can efficiently suppress the pulse splitting effect of the high-order soliton due to the Raman-induced pulse energy dissipation, leading to the high-quality DW generation at these wavelengths with smooth, single-peak spectra. These results provide some useful insights for designing compact, wavelength-tunable ultrafast UV light sources with microjoule-level pulse energies.

Submitted 19 December, 2024; originally announced December 2024.

arXiv:2412.14746 [pdf, other]

Solving Unbalanced Optimal Transport on Point Cloud by Tangent Radial Basis Function Method

Authors: Jiangong Pan, Wei Wan, Chenlong Bao, Zuoqiang Shi

Abstract: In this paper, we solve the unbalanced optimal transport (UOT) problem on surfaces represented by point clouds. Based on the alternating direction method of multipliers algorithm, the original UOT problem can be solved by an iteration consisting of three steps. The key ingredient is solving a Poisson equation on the point cloud, which is done by the tangent radial basis function (TRBF) method. The proposed TRBF method requires only the point cloud and normal vectors to discretize the Poisson equation, which simplifies the computation significantly. Numerical experiments conducted on point clouds with varying geometry and topology demonstrate the effectiveness of the proposed method.

Submitted 21 April, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

arXiv:2412.14647 [pdf]

AI-Enabled Rapid Assembly of Thousands of Defect-Free Neutral Atom Arrays with Constant-time-overhead

Authors: Rui Lin, Han-Sen Zhong, You Li, Zhang-Rui Zhao, Le-Tian Zheng, Tai-Ran Hu, Hong-Ming Wu, Zhan Wu, Wei-Jie Ma, Yan Gao, Yi-Kang Zhu, Zhao-Feng Su, Wan-Li Ouyang, Yu-Chen Zhang, Jun Rui, Ming-Cheng Chen, Chao-Yang Lu, Jian-Wei Pan

Abstract: Assembling increasingly larger-scale defect-free optical tweezer-trapped atom arrays is essential for quantum computation and quantum simulations based on atoms. Here, we propose an AI-enabled, rapid, constant-time-overhead rearrangement protocol, and we experimentally assemble defect-free 2D and 3D atom arrays with up to 2024 atoms with a constant time cost of 60 ms. The AI model calculates the holograms for real-time atom rearrangement. With precise controls over both position and phase, a high-speed spatial light modulator moves all the atoms simultaneously. This protocol can be readily used to generate defect-free arrays of tens of thousands of atoms with current technologies, and can become a useful toolbox for quantum error correction.

Submitted 19 December, 2024; originally announced December 2024.

arXiv:2412.13849 [pdf, other]

99.9%-fidelity in measuring a superconducting qubit

Authors: Can Wang, Feng-Ming Liu, He Chen, Yi-Fei Du, Chong Ying, Jian-Wen Wang, Yong-Heng Huo, Cheng-Zhi Peng, Xiaobo Zhu, Ming-Cheng Chen, Chao-Yang Lu, Jian-Wei Pan

Abstract: Despite the significant progress in superconducting quantum computation over the past years, quantum state measurement still lags nearly an order of magnitude behind quantum gate operations in speed and fidelity. The main challenge is that the strong coupling and readout signal used to probe the quantum state may also introduce additional channels which may cause qubit state transitions. Here, we design a novel architecture to implement the long-sought longitudinal interaction scheme between qubits and resonators. This architecture not only provides genuine longitudinal interaction by eliminating residual transversal couplings, but also introduces proper nonlinearity to the resonator that can further minimize decay error and measurement-induced excitation error. Our experimental results demonstrate a measurement fidelity of 99.8% in 202 ns without the need for any first-stage amplification. After subtracting the residual preparation errors, the pure measurement fidelity is above 99.9%. Our scheme is compatible with the multiplexing readout scheme and can be used for quantum error correction.

Submitted 19 December, 2024; v1 submitted 18 December, 2024; originally announced December 2024.

arXiv:2412.13458 [pdf, other]

doi 10.1103/PhysRevB.110.235418

Inducing Berry Curvature Dipole in Multilayer Graphene through Inhomogeneous Interlayer Sliding

Authors: Jie Pan, Huanhuan Wang, Lin Zou, Haibo Xie, Yi Ding, Yuze Zhang, Aiping Fang, Zhe Wang

Abstract: Breaking lattice symmetry is crucial for generating a nonzero Berry curvature. While manipulating twisting angles between adjacent layers has successfully broken lattice symmetry through strain field and generated nonzero Berry curvature, interlayer sliding in principle offers a promising alternative route. However, realizing uniform interlayer sliding faces experimental challenges due to its energetic instability. In this work, we introduce an experimentally feasible method, using a corrugated substrate to induce an inhomogeneous but energetically more stable interlayer sliding in multilayer graphene. Our simulations demonstrate that inhomogeneous interlayer sliding produces a sizable Berry curvature dipole, which can be further tuned by varying the interlayer sliding distances and potential differences. The resulting Berry curvature dipole magnitude is remarkably up to 100 times greater than the maximum displacement involved in the inhomogeneous sliding. Our results highlight inhomogeneous interlayer sliding as a viable and effective method to induce a significant Berry curvature dipole in graphene systems and propose an experimentally feasible way to realize it.

Submitted 17 December, 2024; originally announced December 2024.

Comments: 9 pages, 7 figures

Journal ref: Physical Review B, 110, 235418 (2024)

arXiv:2412.12839 [pdf, other]

From An LLM Swarm To A PDDL-Empowered HIVE: Planning Self-Executed Instructions In A Multi-Modal Jungle

Authors: Kaustubh Vyas, Damien Graux, Yijun Yang, Sébastien Montella, Chenxin Diao, Wendi Zhou, Pavlos Vougiouklis, Ruofei Lai, Yang Ren, Keshuang Li, Jeff Z. Pan

Abstract: In response to the call for agent-based solutions that leverage the ever-increasing capabilities of the deep models' ecosystem, we introduce Hive -- a comprehensive solution for selecting appropriate models and subsequently planning a set of atomic actions to satisfy the end-users' instructions. Hive operates over sets of models and, upon receiving natural language instructions (i.e. user queries), schedules and executes explainable plans of atomic actions. These actions can involve one or more of the available models to achieve the overall task, while respecting end-users' specific constraints. Notably, Hive handles tasks that involve multi-modal inputs and outputs, enabling it to handle complex, real-world queries. Our system is capable of planning complex chains of actions while guaranteeing explainability, using an LLM-based formal logic backbone empowered by PDDL operations. We introduce the MuSE benchmark in order to offer a comprehensive evaluation of the multi-modal capabilities of agent systems. Our findings show that our framework redefines the state-of-the-art for task selection, outperforming other competing systems that plan operations across multiple models while offering transparency guarantees and fully adhering to user constraints.

Submitted 17 December, 2024; originally announced December 2024.

Comments: Under review

arXiv:2412.12316 [pdf, other]

Estimating HIV Cross-sectional Incidence Using Recency Tests from a Non-representative Sample

Authors: Jianan Pan, Marlena Bannick, Fei Gao

Abstract: Cross-sectional incidence estimation based on recency testing has become a widely used tool in HIV research. Recently, this method has gained prominence in HIV prevention trials to estimate the "placebo" incidence that participants might experience without preventive treatment. The application of this approach faces challenges due to non-representative sampling, as individuals aware of their HIV-positive status may be less likely to participate in screening for an HIV prevention trial. To address this, a recent phase 3 trial excluded individuals based on whether they have had a recent HIV test. To the best of our knowledge, the validity of this approach has yet to be studied. In our work, we investigate the performance of cross-sectional HIV incidence estimation when excluding individuals based on prior HIV tests in realistic trial settings. We develop a statistical framework that incorporates a testing-based criterion and possible non-representative sampling. We introduce a metric we call the effective mean duration of recent infection (MDRI) that mathematically quantifies bias in incidence estimation. We conduct an extensive simulation study to evaluate incidence estimator performance under various scenarios. Our findings reveal that when screening attendance is affected by knowledge of HIV status, incidence estimators become unreliable unless all individuals with recent HIV tests are excluded. Additionally, we identified a trade-off between bias and variability: excluding more individuals reduces bias from non-representative sampling but in many cases increases the variability of incidence estimates. These findings highlight the need for caution when applying testing-based criteria and emphasize the importance of refining incidence estimation methods to improve the design and evaluation of future HIV prevention trials.

Submitted 16 December, 2024; originally announced December 2024.

arXiv:2412.11924 [pdf, other]

Establishing a New Benchmark in Quantum Computational Advantage with 105-qubit Zuchongzhi 3.0 Processor

Authors: Dongxin Gao, Daojin Fan, Chen Zha, Jiahao Bei, Guoqing Cai, Jianbin Cai, Sirui Cao, Xiangdong Zeng, Fusheng Chen, Jiang Chen, Kefu Chen, Xiawei Chen, Xiqing Chen, Zhe Chen, Zhiyuan Chen, Zihua Chen, Wenhao Chu, Hui Deng, Zhibin Deng, Pei Ding, Xun Ding, Zhuzhengqi Ding, Shuai Dong, Yupeng Dong, Bo Fan, et al. (129 additional authors not shown)

Abstract: In the relentless pursuit of quantum computational advantage, we present a significant advancement with the development of Zuchongzhi 3.0. This superconducting quantum computer prototype, comprising 105 qubits, achieves high operational fidelities, with single-qubit gates, two-qubit gates, and readout fidelity at 99.90%, 99.62% and 99.18%, respectively. Our experiments with an 83-qubit, 32-cycle random circuit sampling on Zuchongzhi 3.0 highlight its superior performance, achieving one million samples in just a few hundred seconds. This task is estimated to be infeasible on the most powerful classical supercomputer, Frontier, which would require approximately $6.4\times 10^9$ years to replicate it. This leap in processing power places the classical simulation cost six orders of magnitude beyond Google's SYC-67 and SYC-70 experiments [Nature 634, 328 (2024)], firmly establishing a new benchmark in quantum computational advantage. Our work not only advances the frontiers of quantum computing but also lays the groundwork for a new era where quantum processors play an essential role in tackling sophisticated real-world challenges.

Submitted 16 December, 2024; originally announced December 2024.

Showing 151–200 of 1,708 results for author: Pan, J