Showing 51–100 of 373 results for author: Gao, K

  1. arXiv:2412.06382  [pdf, other]

    cs.LG cs.SE

    PyPulse: A Python Library for Biosignal Imputation

    Authors: Kevin Gao, Maxwell A. Xu, James M. Rehg, Alexander Moreno

    Abstract: We introduce PyPulse, a Python package for imputation of biosignals in both clinical and wearable sensor settings. Missingness is commonplace in these settings and can arise from multiple causes, such as insecure sensor attachment or data transmission loss. PyPulse provides a modular and extendable framework with high ease-of-use for a broad userbase, including non-machine-learning bio…

    Submitted 9 December, 2024; originally announced December 2024.

    Comments: 7 pages, 3 figures. Implementation and documentation are available at https://github.com/rehg-lab/pulseimpute

  2. arXiv:2412.05167  [pdf, other]

    cs.AI cs.CL cs.SD eess.AS

    Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models

    Authors: Kuofeng Gao, Shu-Tao Xia, Ke Xu, Philip Torr, Jindong Gu

    Abstract: Large Audio-Language Models (LALMs) have unlocked audio dialogue capabilities, where audio dialogues are a direct exchange of spoken language between LALMs and humans. Recent advances, such as GPT-4o, have enabled LALMs to engage in back-and-forth audio dialogues with humans. This progression not only underscores the potential of LALMs but also broadens their applicability across a wide range of practical…

    Submitted 6 December, 2024; originally announced December 2024.

  3. arXiv:2412.01564  [pdf, other]

    cs.LG q-bio.BM

    Tokenizing 3D Molecule Structure with Quantized Spherical Coordinates

    Authors: Kaiyuan Gao, Yusong Wang, Haoxiang Guan, Zun Wang, Qizhi Pei, John E. Hopcroft, Kun He, Lijun Wu

    Abstract: The application of language models (LMs) to molecular structure generation using line notations such as SMILES and SELFIES has been well-established in the field of cheminformatics. However, extending these models to generate 3D molecular structures presents significant challenges. Two primary obstacles emerge: (1) the difficulty in designing a 3D line notation that ensures SE(3)-invariant atomic…

    Submitted 2 December, 2024; originally announced December 2024.

    Comments: 17 pages, 6 figures, preprint

  4. arXiv:2412.01295  [pdf, other]

    cs.LG cs.AI cs.DC

    FedAH: Aggregated Head for Personalized Federated Learning

    Authors: Pengzhan Zhou, Yuepeng He, Yijun Zhai, Kaixin Gao, Chao Chen, Zhida Qin, Chong Zhang, Songtao Guo

    Abstract: Recently, Federated Learning (FL) has gained popularity for its privacy-preserving and collaborative learning capabilities. Personalized Federated Learning (PFL), building upon FL, aims to address the issue of statistical heterogeneity and achieve personalization. Personalized-head-based PFL is a common and effective PFL method that splits the model into a feature extractor and a head, where the f…

    Submitted 2 December, 2024; originally announced December 2024.

    Comments: 8 pages, 4 figures

  5. arXiv:2411.19352  [pdf, other]

    cs.AI

    OMuleT: Orchestrating Multiple Tools for Practicable Conversational Recommendation

    Authors: Se-eun Yoon, Xiaokai Wei, Yexi Jiang, Rachit Pareek, Frank Ong, Kevin Gao, Julian McAuley, Michelle Gong

    Abstract: In this paper, we present a systematic effort to design, evaluate, and implement a realistic conversational recommender system (CRS). The objective of our system is to allow users to input free-form text to request recommendations, and then receive a list of relevant and diverse items. While previous work on synthetic queries augments large language models (LLMs) with 1-3 tools, we argue that a mo…

    Submitted 31 December, 2024; v1 submitted 28 November, 2024; originally announced November 2024.

  6. Prediction of high-Tc superconductivity in ternary actinium beryllium hydrides at low pressure

    Authors: Kun Gao, Wenwen Cui, Jingming Shi, Artur P. Durajski, Jian Hao, Silvana Botti, Miguel A. L. Marques, Yinwei Li

    Abstract: Hydrogen-rich superconductors are promising candidates to achieve room-temperature superconductivity. However, the extreme pressures needed to stabilize these structures significantly limit their practical applications. An effective strategy to reduce the external pressure is to add a light element M that binds with H to form MHx units, acting as a chemical precompressor. We exemplify this idea by…

    Submitted 28 November, 2024; originally announced November 2024.

    Journal ref: Physical Review B 109, 014501 (2024)

  7. arXiv:2411.16375  [pdf, other]

    cs.CV

    Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing

    Authors: Kaifeng Gao, Jiaxin Shi, Hanwang Zhang, Chunping Wang, Jun Xiao, Long Chen

    Abstract: With the advance of diffusion models, today's video generation has achieved impressive quality. To extend the generation length and facilitate real-world applications, a majority of video diffusion models (VDMs) generate videos in an autoregressive manner, i.e., generating subsequent clips conditioned on the last frame(s) of the previous clip. However, existing autoregressive VDMs are highly ineff…

    Submitted 21 May, 2025; v1 submitted 25 November, 2024; originally announced November 2024.

    Comments: Accepted by ICML 2025. Code is available: https://github.com/Dawn-LX/CausalCache-VDM

  8. arXiv:2411.08354  [pdf, other]

    physics.geo-ph

    Developing a Foundation Model for Predicting Material Failure

    Authors: Agnese Marcato, Javier E. Santos, Aleksandra Pachalieva, Kai Gao, Ryley Hill, Esteban Rougier, Qinjun Kang, Jeffrey Hyman, Abigail Hunter, Janel Chua, Earl Lawrence, Hari Viswanathan, Daniel O'Malley

    Abstract: Understanding material failure is critical for designing stronger and lighter structures by identifying weaknesses that could be mitigated. Existing full-physics numerical simulation techniques involve trade-offs between speed, accuracy, and the ability to handle complex features like varying boundary conditions, grid types, resolution, and physical models. We present the first foundation model sp…

    Submitted 13 November, 2024; originally announced November 2024.

    Comments: Accepted at NeurIPS 2024 "Foundation Models for Science: Progress, Opportunities, and Challenges" Workshop

  9. arXiv:2411.01215  [pdf, other]

    astro-ph.HE

    Detection of two TeV gamma-ray outbursts from NGC 1275 by LHAASO

    Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, T. L. Chen, et al. (254 additional authors not shown)

    Abstract: The Water Cherenkov Detector Array (WCDA) is one of the components of the Large High Altitude Air Shower Observatory (LHAASO) and can monitor any sources over two-thirds of the sky for up to 7 hours per day with >98% duty cycle. In this work, we report the detection of two outbursts of the Fanaroff-Riley I radio galaxy NGC 1275 that were detected by LHAASO-WCDA between November 2022 and January 2023…

    Submitted 18 April, 2025; v1 submitted 2 November, 2024; originally announced November 2024.

    Comments: 11 pages, 8 figures, 3 tables

  10. arXiv:2410.24022  [pdf, other]

    q-bio.QM cs.AI cs.CL cs.LG

    SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation

    Authors: Liang He, Peiran Jin, Yaosen Min, Shufang Xie, Lijun Wu, Tao Qin, Xiaozhuan Liang, Kaiyuan Gao, Yuliang Jiang, Tie-Yan Liu

    Abstract: Proteins, essential to biological systems, perform functions intricately linked to their three-dimensional structures. Understanding the relationship between protein structures and their amino acid sequences remains a core challenge in protein modeling. While traditional protein foundation models benefit from pre-training on vast unlabeled datasets, they often struggle to capture critical co-evolu…

    Submitted 31 October, 2024; originally announced October 2024.

  11. arXiv:2410.10760  [pdf, other]

    cs.CR cs.CL

    Denial-of-Service Poisoning Attacks against Large Language Models

    Authors: Kuofeng Gao, Tianyu Pang, Chao Du, Yong Yang, Shu-Tao Xia, Min Lin

    Abstract: Recent studies have shown that LLMs are vulnerable to denial-of-service (DoS) attacks, where adversarial inputs like spelling errors or non-semantic prompts trigger endless outputs without generating an [EOS] token. These attacks can potentially cause high latency and make LLM services inaccessible to other users or tasks. However, when there are speech-to-text interfaces (e.g., voice commands to…

    Submitted 14 October, 2024; originally announced October 2024.

  12. arXiv:2410.10735  [pdf, other]

    cs.AI cs.CL

    Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning

    Authors: Kuofeng Gao, Huanqia Cai, Qingyao Shuai, Dihong Gong, Zhifeng Li

    Abstract: Accurate mathematical reasoning with Large Language Models (LLMs) is crucial in revolutionizing domains that heavily rely on such reasoning. However, LLMs often encounter difficulties in certain aspects of mathematical reasoning, leading to flawed reasoning and erroneous results. To mitigate these issues, we introduce a novel mechanism, the Chain of Self-Correction (CoSC), specifically designed to…

    Submitted 8 February, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

  13. arXiv:2410.05759  [pdf, other]

    cs.NE

    3D UAV Trajectory Planning for IoT Data Collection via Matrix-Based Evolutionary Computation

    Authors: Pei-Fa Sun, Yujae Song, Kang-Yu Gao, Yu-Kai Wang, Changjun Zhou, Sang-Woon Jeon, Jun Zhang

    Abstract: UAVs are increasingly becoming vital tools in various wireless communication applications including internet of things (IoT) and sensor networks, thanks to their rapid and agile non-terrestrial mobility. Despite recent research, planning three-dimensional (3D) UAV trajectories over a continuous temporal-spatial domain remains challenging due to the need to solve computationally intensive optimizat…

    Submitted 8 October, 2024; originally announced October 2024.

  14. arXiv:2410.05717  [pdf, other]

    cs.CV

    Advancements in Road Lane Mapping: Comparative Fine-Tuning Analysis of Deep Learning-based Semantic Segmentation Methods Using Aerial Imagery

    Authors: Willow Liu, Shuxin Qiao, Kyle Gao, Hongjie He, Michael A. Chapman, Linlin Xu, Jonathan Li

    Abstract: This research addresses the need for high-definition (HD) maps for autonomous vehicles (AVs), focusing on road lane information derived from aerial imagery. While Earth observation data offers valuable resources for map creation, specialized models for road lane extraction are still underdeveloped in remote sensing. In this study, we perform an extensive comparison of twelve foundational deep lear…

    Submitted 15 October, 2024; v1 submitted 8 October, 2024; originally announced October 2024.

  15. arXiv:2410.05284  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    Psychometrics for Hypnopaedia-Aware Machinery via Chaotic Projection of Artificial Mental Imagery

    Authors: Ching-Chun Chang, Kai Gao, Shuying Xu, Anastasia Kordoni, Christopher Leckie, Isao Echizen

    Abstract: Neural backdoors represent insidious cybersecurity loopholes that render learning machinery vulnerable to unauthorised manipulations, potentially enabling the weaponisation of artificial intelligence with catastrophic consequences. A backdoor attack involves the clandestine infiltration of a trigger during the learning process, metaphorically analogous to hypnopaedia, where ideas are implanted int…

    Submitted 28 September, 2024; originally announced October 2024.

  16. arXiv:2410.04425  [pdf, other]

    astro-ph.HE

    LHAASO detection of very-high-energy gamma-ray emission surrounding PSR J0248+6021

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

    Abstract: We report the detection of an extended very-high-energy (VHE) gamma-ray source coincident with the location of the middle-aged (62.4 kyr) pulsar PSR J0248+6021, by using the LHAASO-WCDA data of live 796 days and LHAASO-KM2A data of live 1216 days. A significant excess of gamma-ray induced showers is observed both by WCDA in energy bands of 1-25 TeV and KM2A in energy bands of > 25 TeV with 7…

    Submitted 3 December, 2024; v1 submitted 6 October, 2024; originally announced October 2024.

    Comments: 12 pages, 10 figures, Accepted by Sci. China-Phys. Mech. Astron

  17. arXiv:2410.03869  [pdf, ps, other]

    cs.CL cs.AI cs.CR cs.CV cs.MM

    Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step

    Authors: Wenxuan Wang, Kuiyi Gao, Youliang Yuan, Jen-tse Huang, Qiuzhi Liu, Shuai Wang, Wenxiang Jiao, Zhaopeng Tu

    Abstract: Text-based image generation models, such as Stable Diffusion and DALL-E 3, hold significant potential in content creation and publishing workflows, making them the focus in recent years. Despite their remarkable capability to generate diverse and vivid images, considerable efforts are being made to prevent the generation of harmful content, such as abusive, violent, or pornographic material. To as…

    Submitted 3 June, 2025; v1 submitted 4 October, 2024; originally announced October 2024.

    Comments: Accepted by ACL 2025 Findings

  18. arXiv:2408.05500  [pdf, other]

    cs.CR cs.AI cs.CV cs.LG

    PointNCBW: Towards Dataset Ownership Verification for Point Clouds via Negative Clean-label Backdoor Watermark

    Authors: Cheng Wei, Yang Wang, Kuofeng Gao, Shuo Shao, Yiming Li, Zhibo Wang, Zhan Qin

    Abstract: Recently, point clouds have been widely used in computer vision, whereas their collection is time-consuming and expensive. As such, point cloud datasets are the valuable intellectual property of their owners and deserve protection. To detect and prevent unauthorized use of these datasets, especially for commercial or open-sourced ones that cannot be sold again or used commercially without permissi…

    Submitted 4 November, 2024; v1 submitted 10 August, 2024; originally announced August 2024.

    Comments: This paper was accepted by IEEE Transactions on Information Forensics and Security (TIFS), 2024. 16 pages

  19. arXiv:2408.03677  [pdf, other]

    cs.CV

    L4DR: LiDAR-4DRadar Fusion for Weather-Robust 3D Object Detection

    Authors: Xun Huang, Ziyu Xu, Hai Wu, Jinlong Wang, Qiming Xia, Yan Xia, Jonathan Li, Kyle Gao, Chenglu Wen, Cheng Wang

    Abstract: LiDAR-based vision systems are integral for 3D object detection, which is crucial for autonomous navigation. However, they suffer from performance degradation in adverse weather conditions due to the quality deterioration of LiDAR point clouds. Fusing LiDAR with the weather-robust 4D radar sensor is expected to solve this problem. However, the fusion of LiDAR and 4D radar is challenging because th…

    Submitted 16 February, 2025; v1 submitted 7 August, 2024; originally announced August 2024.

    Comments: Accepted by AAAI 2025 (Oral)

  20. arXiv:2408.02487  [pdf, other]

    cs.SE cs.AI cs.LG

    LiCoEval: Evaluating LLMs on License Compliance in Code Generation

    Authors: Weiwei Xu, Kai Gao, Hao He, Minghui Zhou

    Abstract: Recent advances in Large Language Models (LLMs) have revolutionized code generation, leading to widespread adoption of AI coding tools by developers. However, LLMs can generate license-protected code without providing the necessary license information, leading to potential intellectual property violations during software production. This paper addresses the critical, yet underexplored, issue of li…

    Submitted 25 February, 2025; v1 submitted 5 August, 2024; originally announced August 2024.

    Comments: The 47th International Conference on Software Engineering (ICSE 2025)

  21. arXiv:2407.11080  [pdf, other]

    eess.SP

    Performance analysis for a rotary compressor at high speed: experimental study and mathematical modeling

    Authors: Chuntai Zheng, Wei Zhao, Benshuai Lyu, Keke Gao, Hongjun Cao, Lei Zhong, Yi Gao, Ren Liao

    Abstract: This paper conducted a comprehensive study on the performance of a rotary compressor over a rotational speed range of 80Hz to 200Hz through experimental tests and mathematical modeling. A compressor performance test rig was designed to conduct the performance tests, with fast-response pressure sensors and displacement sensors capturing the P-V diagram and dynamic motion of the moving components. R…

    Submitted 13 July, 2024; originally announced July 2024.

  22. arXiv:2407.04675  [pdf, other]

    eess.AS cs.SD

    Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition

    Authors: Ye Bai, Jingping Chen, Jitong Chen, Wei Chen, Zhuo Chen, Chuang Ding, Linhao Dong, Qianqian Dong, Yujiao Du, Kepan Gao, Lu Gao, Yi Guo, Minglun Han, Ting Han, Wenchao Hu, Xinying Hu, Yuxiang Hu, Deyu Hua, Lu Huang, Mingkun Huang, Youjia Huang, Jishuo Jin, Fanliu Kong, Zongwei Lan, Tianyu Li, et al. (30 additional authors not shown)

    Abstract: Modern automatic speech recognition (ASR) models are required to accurately transcribe diverse speech signals (from different domains, languages, accents, etc.) given the specific contextual information in various application scenarios. Classic end-to-end models fused with extra language models perform well, but mainly in data matching scenarios and are gradually approaching a bottleneck. In this wor…

    Submitted 10 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

  23. arXiv:2407.02773  [pdf, other]

    cs.MM

    OpenVNA: A Framework for Analyzing the Behavior of Multimodal Language Understanding System under Noisy Scenarios

    Authors: Ziqi Yuan, Baozheng Zhang, Hua Xu, Zhiyun Liang, Kai Gao

    Abstract: We present OpenVNA, an open-source framework designed for analyzing the behavior of multimodal language understanding systems under noisy conditions. OpenVNA serves as an intuitive toolkit tailored for researchers, facilitating convenient batch-level robustness evaluation and on-the-fly instance-level demonstration. It primarily features a benchmark Python library for assessing global model robus…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 10 pages, 4 figures, to be published in ACL 2024 System Demonstration Track

  24. arXiv:2407.02411  [pdf, other]

    cs.CV cs.CR cs.MM

    Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs

    Authors: Jinmin Li, Kuofeng Gao, Yang Bai, Jingyun Zhang, Shu-Tao Xia

    Abstract: The advent of video-based Large Language Models (LLMs) has significantly enhanced video understanding. However, it has also raised some safety concerns regarding data protection, as videos can be more easily annotated, even without authorization. This paper introduces Video Watermarking, a novel technique to protect videos from unauthorized annotations by such video-based LLMs, especially concerni…

    Submitted 2 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2403.13507

  25. arXiv:2406.16129  [pdf]

    cs.CV

    UDHF2-Net: Uncertainty-diffusion-model-based High-Frequency TransFormer Network for Remotely Sensed Imagery Interpretation

    Authors: Pengfei Zhang, Chang Li, Yongjun Zhang, Rongjun Qin, Kyle Gao, Jonathan Li

    Abstract: Remotely sensed imagery interpretation (RSII) faces three major problems: (1) objective representation of spatial distribution patterns; (2) edge uncertainty caused by the downsampling encoder and intrinsic edge noise (e.g., mixed pixels and edge occlusion); and (3) false detection caused by geometric registration error in change detection. To solve the aforementioned problems…

    Submitted 28 May, 2025; v1 submitted 23 June, 2024; originally announced June 2024.

  26. arXiv:2406.12556  [pdf, other]

    cs.NI

    Towards Deep Application-Network Integration: Architectures, Progress and Opportunities

    Authors: Berta Serracanta, Kai Gao, Jordi Ros-Giralt, Alberto Rodriguez-Natal, Luis M. Contreras, Richard Yang, Albert Cabellos

    Abstract: With the rise of a new generation of applications (e.g., virtual and augmented reality, artificial intelligence, etc.) demanding stringent performance requirements, the need for networking solutions and architectures that can enable a higher Quality of Experience (QoE) is becoming increasingly important. While jointly optimizing application and network may increase the applications' QoE and simul…

    Submitted 18 June, 2024; originally announced June 2024.

  27. arXiv:2406.10981  [pdf, other]

    cs.CV

    ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models

    Authors: Kaifeng Gao, Jiaxin Shi, Hanwang Zhang, Chunping Wang, Jun Xiao

    Abstract: With the advance of diffusion models, today's video generation has achieved impressive quality. But generating temporally consistent long videos is still challenging. A majority of video diffusion models (VDMs) generate long videos in an autoregressive manner, i.e., generating subsequent clips conditioned on the last frames of the previous clip. However, existing approaches all involve bidirectional computa…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Code will be available at https://github.com/Dawn-LX/Causal-VideoGen

  28. arXiv:2406.08698  [pdf, other]

    astro-ph.HE hep-ph

    Constraints on Ultra Heavy Dark Matter Properties from Dwarf Spheroidal Galaxies with LHAASO Observations

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

    Abstract: In this work we search for signals generated by ultra-heavy dark matter in the Large High Altitude Air Shower Observatory (LHAASO) data. We look for possible gamma rays from dark matter annihilation or decay in 16 dwarf spheroidal galaxies in the field of view of LHAASO. Dwarf spheroidal galaxies are among the most promising targets for indirect detection of dark matter which have low fluxes…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 17 pages, 12 figures, accepted by PRL

  29. arXiv:2406.05797  [pdf, other]

    q-bio.BM cs.AI cs.CE cs.CL cs.LG

    3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling

    Authors: Qizhi Pei, Rui Yan, Kaiyuan Gao, Jinhua Zhu, Lijun Wu

    Abstract: The integration of molecular and natural language representations has emerged as a focal point in molecular science, with recent advancements in Language Models (LMs) demonstrating significant potential for comprehensive modeling of both domains. However, existing approaches face notable limitations, particularly in their neglect of three-dimensional (3D) information, which is crucial for understa…

    Submitted 18 March, 2025; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: Accepted by ICLR 2025

  30. arXiv:2406.05392  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Deconstructing The Ethics of Large Language Models from Long-standing Issues to New-emerging Dilemmas: A Survey

    Authors: Chengyuan Deng, Yiqun Duan, Xin Jin, Heng Chang, Yijun Tian, Han Liu, Yichen Wang, Kuofeng Gao, Henry Peng Zou, Yiqiao Jin, Yijia Xiao, Shenghao Wu, Zongxing Xie, Weimin Lyu, Sihong He, Lu Cheng, Haohan Wang, Jun Zhuang

    Abstract: Large Language Models (LLMs) have achieved unparalleled success across diverse language modeling tasks in recent years. However, this progress has also intensified ethical concerns, impacting the deployment of LLMs in everyday contexts. This paper provides a comprehensive survey of ethical challenges associated with LLMs, from longstanding issues such as copyright infringement, systematic bias, an…

    Submitted 21 October, 2024; v1 submitted 8 June, 2024; originally announced June 2024.

  31. arXiv:2405.17213  [pdf]

    physics.ao-ph

    Highly inhomogeneous interactions between background climate and urban warming across typical local climate zones in heatwave and non-heatwave days

    Authors: Jing Kong, Yongling Zhao, Kai Gao, Dominik Strebel, Jan Carmeliet, Chengwang Lei

    Abstract: Urban heat island (UHI) in conjunction with heatwave (HW) leads to exacerbation of thermal stress in urban areas. Prior research on UHI and HW has predominantly concentrated on examining the thermal conditions at the surface and near-surface, with few investigations extending to the radiative and dynamical interactions of UHI and HW, particularly with a focus on the inhomogeneities across local cl…

    Submitted 27 May, 2024; originally announced May 2024.

  32. arXiv:2405.15826  [pdf, other]

    cs.CV

    3D Learnable Supertoken Transformer for LiDAR Point Cloud Scene Segmentation

    Authors: Dening Lu, Jun Zhou, Kyle Gao, Linlin Xu, Jonathan Li

    Abstract: 3D Transformers have achieved great success in point cloud understanding and representation. However, there is still considerable scope for further development in effective and efficient Transformers for large-scale LiDAR point cloud scene segmentation. This paper proposes a novel 3D Transformer framework, named 3D Learnable Supertoken Transformer (3DLST). The key contributions are summarized as f…

    Submitted 26 December, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 13 pages, 10 figures, 7 tables

  33. arXiv:2405.12775  [pdf, other]

    cs.MM cs.AI cs.CL

    Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances

    Authors: Hanlei Zhang, Hua Xu, Fei Long, Xin Wang, Kai Gao

    Abstract: Discovering the semantics of multimodal utterances is essential for understanding human language and enhancing human-machine interactions. Existing methods manifest limitations in leveraging nonverbal information for discerning complex semantics in unsupervised scenarios. This paper introduces a novel unsupervised multimodal clustering method (UMC), making a pioneering contribution to this field.…

    Submitted 21 May, 2024; originally announced May 2024.

    Comments: Accepted by ACL 2024, Main Conference, Long Paper

  34. arXiv:2405.11826  [pdf, other]

    astro-ph.IM hep-ex physics.ins-det

    Data quality control system and long-term performance monitor of the LHAASO-KM2A

    Authors: Zhen Cao, F. Aharonian, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, W. Bian, A. V. Bukevich, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, H. X. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. Chen, et al. (263 additional authors not shown)

    Abstract: The KM2A is the largest sub-array of the Large High Altitude Air Shower Observatory (LHAASO). It consists of 5216 electromagnetic particle detectors (EDs) and 1188 muon detectors (MDs). The data recorded by the EDs and MDs are used to reconstruct primary information of cosmic ray and gamma-ray showers. This information is used for physical analysis in gamma-ray astronomy and cosmic ray physics. To…

    Submitted 13 June, 2024; v1 submitted 20 May, 2024; originally announced May 2024.

    Comments: 15 pages, 9 figures

  35. arXiv:2405.11021  [pdf, other]

    cs.CV

    Enhanced 3D Urban Scene Reconstruction and Point Cloud Densification using Gaussian Splatting and Google Earth Imagery

    Authors: Kyle Gao, Dening Lu, Hongjie He, Linlin Xu, Jonathan Li

    Abstract: 3D urban scene reconstruction and modelling is a crucial research area in remote sensing with numerous applications in academia, commerce, industry, and administration. Recent advancements in view synthesis models have facilitated photorealistic 3D reconstruction solely from 2D images. Leveraging Google Earth imagery, we construct a 3D Gaussian Splatting model of the Waterloo region centered on th…

    Submitted 1 June, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

    ACM Class: I.4; I.3

  36. arXiv:2405.10612  [pdf, other]

    cs.CV cs.CR cs.LG

    Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers

    Authors: Sheng Yang, Jiawang Bai, Kuofeng Gao, Yong Yang, Yiming Li, Shu-tao Xia

    Abstract: Given the power of vision transformers, a new learning paradigm, pre-training and then prompting, makes it more efficient and effective to address downstream visual recognition tasks. In this paper, we identify a novel security threat towards such a paradigm from the perspective of backdoor attacks. Specifically, an extra prompt token, called the switch token in this work, can turn the backdoor mo…

    Submitted 17 May, 2024; originally announced May 2024.

  37. arXiv:2405.09981  [pdf, other]

    cs.CV

    Adversarial Robustness for Visual Grounding of Multimodal Large Language Models

    Authors: Kuofeng Gao, Yang Bai, Jiawang Bai, Yong Yang, Shu-Tao Xia

    Abstract: Multi-modal Large Language Models (MLLMs) have recently achieved enhanced performance across various vision-language tasks including visual grounding capabilities. However, the adversarial robustness of visual grounding remains unexplored in MLLMs. To fill this gap, we use referring expression comprehension (REC) as an example task in visual grounding and propose three adversarial attack paradigms…

    Submitted 16 May, 2024; originally announced May 2024.

    Comments: ICLR 2024 Workshop on Reliable and Responsible Foundation Models

  38. Discovery of Very-high-energy Gamma-ray Emissions from the Low Luminosity AGN NGC 4278 by LHAASO

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (255 additional authors not shown)

    Abstract: The first source catalog of the Large High Altitude Air Shower Observatory reported the detection of a very-high-energy gamma-ray source, 1LHAASO J1219+2915. In this paper a further detailed study of the spectral and temporal behavior of this point-like source has been carried out. The best-fit position of the TeV source ($\rm{RA}=185.05^{\circ}\pm0.04^{\circ}$, $\rm{Dec}=29.25^{\circ}\pm0.03^{\circ}$) i…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 11 pages, 5 figures

  39. arXiv:2405.07136  [pdf]

    physics.optics cond-mat.mtrl-sci

    Extremely long transverse optical needle focus for reflective metalens enabled by monolayer MoS$_2$

    Authors: Zhonglin Li, Kangyu Gao, Yingying Wang, Ruitong Bie, Dongliang Yang, Tianze Yu, Renxi Gao, Wenjun Liu, Bo Zhong, Linfeng Sun

    Abstract: Line-scan mode facilitates fast-speed and high-throughput imaging with developing a suitable optical transverse needle focus. Metasurface with periodic structures such as diffractive rings, ellipses, and gratings could enable discrete focus evolving into line focus under momentum conservation, but still face the challenge of extremely low light power utilization brought by inevitably multiple high…

    Submitted 11 May, 2024; originally announced May 2024.

    Comments: 22 pages, 5 figures

  40. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Dengr, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  41. arXiv:2404.19387  [pdf, other]

    eess.SY

    Online Electricity Purchase for Data Center with Dynamic Virtual Battery from Flexibility Aggregation

    Authors: Kekun Gao, Yuejun Yan, Yixuan Liu, Endong Liu, Pengcheng You

    Abstract: As a critical component of modern infrastructure, data centers account for a huge amount of power consumption and greenhouse gas emission. This paper studies the electricity purchase strategy for a data center to lower its energy cost while integrating local renewable generation under uncertainty. To facilitate efficient and scalable decision-making, we propose a two-layer hierarchy where the lowe…

    Submitted 30 April, 2024; originally announced April 2024.

  42. arXiv:2404.16565  [pdf, other]

    cs.SE

    PyRadar: Towards Automatically Retrieving and Validating Source Code Repository Information for PyPI Packages

    Authors: Kai Gao, Weiwei Xu, Wenhao Yang, Minghui Zhou

    Abstract: A package's source code repository records the development history of the package, providing indispensable information for the use and risk monitoring of the package. However, a package release often misses its source code repository due to the separation of the package's development platform from its distribution platform. Existing tools retrieve the release's repository information from its meta…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted at FSE 2024

  43. arXiv:2404.16557  [pdf, other]

    cs.CV cs.AI

    Energy-Latency Manipulation of Multi-modal Large Language Models via Verbose Samples

    Authors: Kuofeng Gao, Jindong Gu, Yang Bai, Shu-Tao Xia, Philip Torr, Wei Liu, Zhifeng Li

    Abstract: Despite the exceptional performance of multi-modal large language models (MLLMs), their deployment requires substantial computational resources. If malicious users induce high energy consumption and latency time (energy-latency cost), they will exhaust computational resources and harm the availability of service. In this paper, we investigate this vulnerability for MLLMs, particularly image-based and…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2401.11170

  44. arXiv:2404.16525  [pdf, other]

    cond-mat.quant-gas

    An efficient method to generate near-ideal hollow beams of different shapes for box potential of quantum gases

    Authors: Tongtong Ren, Yirong Wang, Xiaoyu Dai, Xiaoxu Gao, Guangren Sun, Xue Zhao, Kuiyi Gao, Zhiyue Zheng, Wei Zhang

    Abstract: Ultracold quantum gases are usually prepared in conservative traps for quantum simulation experiments. The atomic density inhomogeneity, together with the consequent position-dependent energy and time scales of cold atoms in traditional harmonic traps, makes it difficult to manipulate and detect the sample at a better level. These problems are partially solved by optical box traps of blue-detuned…

    Submitted 25 April, 2024; originally announced April 2024.

  45. arXiv:2404.14372  [pdf, other]

    cs.CL cs.AI

    Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph

    Authors: Xiaochen Kev Gao, Feng Yao, Kewen Zhao, Beilei He, Animesh Kumar, Vish Krishnan, Jingbo Shang

    Abstract: Model scaling is becoming the default choice for many language tasks due to the success of large language models (LLMs). However, it can fall short in specific scenarios where simple customized methods excel. In this paper, we delve into the patent approval prediction task and show that simple domain-specific graph methods outperform enlarging the model, using the intrinsic dependencies within…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 17 Pages, Under Review

  46. arXiv:2404.11070  [pdf, other]

    cs.CV eess.SP

    Sky-GVIO: an enhanced GNSS/INS/Vision navigation with FCN-based sky-segmentation in urban canyon

    Authors: Jingrong Wang, Bo Xu, Ronghe Jin, Shoujian Zhang, Kefu Gao, Jingnan Liu

    Abstract: Accurate, continuous, and reliable positioning is a critical component of achieving autonomous driving. However, in complex urban canyon environments, the vulnerability of a stand-alone sensor and non-line-of-sight (NLOS) caused by high buildings, trees, and elevated structures seriously affect positioning results. To address these challenges, a sky-view image segmentation algorithm based on Full…

    Submitted 5 August, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Report number: 16.20 (2024): 3785

    Journal ref: Remote Sensing 2024

  47. arXiv:2404.06758  [pdf, other]

    cs.RO

    Toward Holistic Planning and Control Optimization for Dual-Arm Rearrangement

    Authors: Kai Gao, Zihe Ye, Duo Zhang, Baichuan Huang, Jingjin Yu

    Abstract: Long-horizon task and motion planning (TAMP) is notoriously difficult to solve, let alone optimally, due to the tight coupling between the interleaved (discrete) task and (continuous) motion planning phases, where each phase on its own is frequently an NP-hard or even PSPACE-hard computational challenge. In this study, we tackle the even more challenging goal of jointly optimizing task and motion…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: First three authors made equal contributions to this study

  48. arXiv:2404.05211  [pdf, other]

    cs.CV

    Multi-level Graph Subspace Contrastive Learning for Hyperspectral Image Clustering

    Authors: Jingxin Wang, Renxiang Guan, Kainan Gao, Zihao Li, Hao Li, Xianju Li, Chang Tang

    Abstract: Hyperspectral image (HSI) clustering is a challenging task due to its high complexity. Although subspace clustering shows impressive performance for HSI, traditional methods tend to ignore the global-local interaction in HSI data. In this study, we propose a multi-level graph subspace contrastive learning (MLGSC) method for HSI clustering. The model is divided into the following main parts. Graph convolu…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: IJCNN 2024

  49. arXiv:2404.04801  [pdf, ps, other]

    astro-ph.IM astro-ph.HE

    LHAASO-KM2A detector simulation using Geant4

    Authors: Zhen Cao, F. Aharonian, Q. An, Axikegu, Y. X. Bai, Y. W. Bao, D. Bastieri, X. J. Bi, Y. J. Bi, J. T. Cai, Q. Cao, W. Y. Cao, Zhe Cao, J. Chang, J. F. Chang, A. M. Chen, E. S. Chen, Liang Chen, Lin Chen, Long Chen, M. J. Chen, M. L. Chen, Q. H. Chen, S. H. Chen, S. Z. Chen, et al. (254 additional authors not shown)

    Abstract: KM2A is one of the main sub-arrays of LHAASO, working on gamma-ray astronomy and cosmic-ray physics at energies above 10 TeV. Detector simulation is an important foundation for estimating detector performance and for data analysis. It is a big challenge to simulate the KM2A detector in the framework of Geant4 due to the need to track numerous photons from a large number of detector units (>6000) with…

    Submitted 7 April, 2024; originally announced April 2024.

  50. arXiv:2403.20261  [pdf, other]

    q-bio.BM cs.AI cs.LG

    FABind+: Enhancing Molecular Docking through Improved Pocket Prediction and Pose Generation

    Authors: Kaiyuan Gao, Qizhi Pei, Gongbo Zhang, Jinhua Zhu, Kun He, Lijun Wu

    Abstract: Molecular docking is a pivotal process in drug discovery. While traditional techniques rely on extensive sampling and simulation governed by physical principles, these methods are often slow and costly. The advent of deep learning-based approaches has shown significant promise, offering increases in both accuracy and efficiency. Building upon the foundational work of FABind, a model designed with…

    Submitted 24 February, 2025; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: Accepted for presentation at KDD 2025