Showing 1–50 of 145 results for author: Dong, K

  1. arXiv:2506.09234  [pdf, ps, other]

    cs.CE

    Transaction Categorization with Relational Deep Learning in QuickBooks

    Authors: Kaiwen Dong, Padmaja Jonnalagedda, Xiang Gao, Ayan Acharya, Maria Kissa, Mauricio Flores, Nitesh V. Chawla, Kamalika Das

    Abstract: Automatic transaction categorization is crucial for enhancing the customer experience in QuickBooks by providing accurate accounting and bookkeeping. The distinct challenges in this domain stem from the unique formatting of transaction descriptions, the wide variety of transaction categories, and the vast scale of the data involved. Furthermore, organizing transaction data in a relational database…

    Submitted 10 June, 2025; originally announced June 2025.

    Comments: Accepted to ECML-PKDD 2025

  2. arXiv:2506.00967  [pdf, ps, other]

    cs.LG

    Pilot Contamination-Aware Graph Attention Network for Power Control in CFmMIMO

    Authors: Tingting Zhang, Sergiy A. Vorobyov, David J. Love, Taejoon Kim, Kai Dong

    Abstract: Optimization-based power control algorithms are predominantly iterative with high computational complexity, making them impractical for real-time applications in cell-free massive multiple-input multiple-output (CFmMIMO) systems. Learning-based methods have emerged as a promising alternative, and among them, graph neural networks (GNNs) have demonstrated their excellent performance in solving powe…

    Submitted 1 June, 2025; originally announced June 2025.

  3. arXiv:2505.16470  [pdf, other]

    cs.IR cs.CL cs.CV

    Benchmarking Retrieval-Augmented Multimodal Generation for Document Question Answering

    Authors: Kuicai Dong, Yujing Chang, Shijie Huang, Yasheng Wang, Ruiming Tang, Yong Liu

    Abstract: Document Visual Question Answering (DocVQA) faces dual challenges in processing lengthy multimodal documents (text, images, tables) and performing cross-modal reasoning. Current document retrieval-augmented generation (DocRAG) methods remain limited by their text-centric approaches, frequently missing critical visual information. The field also lacks robust benchmarks for assessing multimodal evid…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: preprint. code available at https://mmdocrag.github.io/MMDocRAG/

  4. arXiv:2505.14069  [pdf, ps, other]

    cs.IR

    Process vs. Outcome Reward: Which is Better for Agentic RAG Reinforcement Learning

    Authors: Wenlin Zhang, Xiangyang Li, Kuicai Dong, Yichao Wang, Pengyue Jia, Xiaopeng Li, Yingyi Zhang, Derong Xu, Zhaocheng Du, Huifeng Guo, Ruiming Tang, Xiangyu Zhao

    Abstract: Retrieval-augmented generation (RAG) enhances the text generation capabilities of large language models (LLMs) by integrating external knowledge and up-to-date information. However, traditional RAG systems are limited by static workflows and lack the adaptability required for multistep reasoning and complex task management. To address these limitations, agentic RAG systems (e.g., DeepResearch) hav…

    Submitted 21 May, 2025; v1 submitted 20 May, 2025; originally announced May 2025.

  5. arXiv:2505.10440  [pdf, other]

    hep-ex nucl-ex

    First Results on the Search for Lepton Number Violating Neutrinoless Double Beta Decay with the LEGEND-200 Experiment

    Authors: H. Acharya, N. Ackermann, M. Agostini, A. Alexander, C. Andreoiu, G. R. Araujo, F. T. Avignone III, M. Babicz, W. Bae, A. Bakalyarov, M. Balata, A. S. Barabash, P. S. Barbeau, C. J. Barton, L. Baudis, C. Bauer, E. Bernieri, L. Bezrukov, K. H. Bhimani, V. Biancacci, E. Blalock, S. J. Borden, G. Borghi, F. Borra, B. Bos , et al. (234 additional authors not shown)

    Abstract: The LEGEND collaboration is searching for neutrinoless double beta ($0νββ$) decay by operating high-purity germanium detectors enriched in $^{76}$Ge in a low-background liquid argon environment. Building on key technological innovations from GERDA and the MAJORANA DEMONSTRATOR, LEGEND-200 has performed a first $0νββ$ decay search based on 61 kg yr of data. Over half of this exposure comes from our…

    Submitted 19 May, 2025; v1 submitted 15 May, 2025; originally announced May 2025.

    Comments: Prepared for submission to Physical Review Letters

  6. arXiv:2505.08681  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    A Mamba-based Network for Semi-supervised Singing Melody Extraction Using Confidence Binary Regularization

    Authors: Xiaoliang He, Kangjie Dong, Jingkai Cao, Shuai Yu, Wei Li, Yi Yu

    Abstract: Singing melody extraction (SME) is a key task in the field of music information retrieval. However, existing methods are facing several limitations: firstly, prior models use transformers to capture the contextual dependencies, which requires quadratic computation resulting in low efficiency in the inference stage. Secondly, prior works typically rely on frequency-supervised methods to estimate the…

    Submitted 13 May, 2025; originally announced May 2025.

  7. arXiv:2505.05766  [pdf, ps, other]

    astro-ph.HE

    Measurement of separate electron and positron spectra from 10 GeV to 20 GeV with the geomagnetic field on DAMPE

    Authors: DAMPE Collaboration, F. Alemanno, Q. An, P. Azzarello, F. C. T. Barbato, P. Bernardini, X. J. Bi, H. Boutin, I. Cagnoli, M. S. Cai, E. Casilli, E. Catanzani, J. Chang, D. Y. Chen, J. L. Chen, Z. F. Chen, Z. X. Chen, P. Coppin, M. Y. Cui, T. S. Cui, Y. X. Cui, I. DeMitri, F. dePalma, A. DiGiovanni, T. K. Dong , et al. (127 additional authors not shown)

    Abstract: The cosmic-ray (CR) electrons and positrons in space are of great significance for studying the origin and propagation of cosmic-rays. The satellite-borne experiment DArk Matter Particle Explorer (DAMPE) has been used to measure the separate electron and positron spectra, as well as the positron fraction. In this work, the Earth's magnetic field is used to distinguish CR electrons and positrons, a…

    Submitted 9 May, 2025; originally announced May 2025.

    Comments: 18 pages, 5 figures

  8. arXiv:2504.12221  [pdf, other]

    quant-ph

    Phonon-Coupled Hole-Spin Qubits in High-Purity Germanium: Design and Modeling of a Scalable Architecture

    Authors: D. -M. Mei, S. A. Panamaldeniya, K. Dong, S. Bhattarai, N. Budhathoki, A. Warren

    Abstract: We present a design and modeling of a scalable quantum processor architecture utilizing hole-spin qubits defined in gate-controlled germanium (Ge) quantum dots, where coherent spin-phonon coupling is predicted to facilitate qubit manipulation and long-range interactions. The architecture exploits the strong, electrically tunable spin-orbit interactions intrinsic to hole states in Ge, integrated wi…

    Submitted 19 May, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

    Comments: 16 pages, 13 figures, and 7 tables

  9. arXiv:2503.16650  [pdf, other]

    hep-ph nucl-ex nucl-th

    Virtual Majorana Neutrinos and the Minimum Neutrino Mass Scale in Neutrinoless Double-Beta Decay

    Authors: Dongming Mei, Kunming Dong, Austin Warren, Sanjay Bhattarai

    Abstract: Virtual Majorana neutrinos are indispensable for neutrinoless double-beta (0$νββ$) decay. In this study, we demonstrate that the overlap of the virtual Majorana neutrino wavefunction, predominantly composed of a right-handed antineutrino component with a strongly suppressed left-handed component (with amplitude proportional to the effective Majorana neutrino mass, $|m_{ββ}|$), is crucial for trigge…

    Submitted 20 March, 2025; originally announced March 2025.

    Comments: 11 pages, 7 figures, and 3 tables

  10. arXiv:2503.00751  [pdf, other]

    cs.CL cs.AI

    RAPID: Efficient Retrieval-Augmented Long Text Generation with Writing Planning and Information Discovery

    Authors: Hongchao Gu, Dexun Li, Kuicai Dong, Hao Zhang, Hang Lv, Hao Wang, Defu Lian, Yong Liu, Enhong Chen

    Abstract: Generating knowledge-intensive and comprehensive long texts, such as encyclopedia articles, remains a significant challenge for Large Language Models. It requires not only the precise integration of facts but also the maintenance of thematic coherence throughout the article. Existing methods, such as direct generation and multi-agent discussion, often struggle with issues like hallucinations, topic…

    Submitted 2 March, 2025; originally announced March 2025.

  11. arXiv:2502.12961  [pdf, other]

    cs.AI cs.CL

    Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger

    Authors: Wenjun Li, Dexun Li, Kuicai Dong, Cong Zhang, Hao Zhang, Weiwen Liu, Yasheng Wang, Ruiming Tang, Yong Liu

    Abstract: Large language models (LLMs) have shown remarkable emergent capabilities, transforming the execution of functional tasks by leveraging external tools for complex problems that require specialized processing or real-time data. While existing research expands LLMs' access to diverse tools (e.g., program interpreters, search engines, weather/map apps), the necessity of using these tools is often overl…

    Submitted 18 February, 2025; originally announced February 2025.

  12. arXiv:2502.07158  [pdf, other]

    cs.LG cs.AI

    Early Risk Prediction of Pediatric Cardiac Arrest from Electronic Health Records via Multimodal Fused Transformer

    Authors: Jiaying Lu, Stephanie R. Brown, Songyuan Liu, Shifan Zhao, Kejun Dong, Del Bold, Michael Fundora, Alaa Aljiffry, Alex Fedorov, Jocelyn Grunwell, Xiao Hu

    Abstract: Early prediction of pediatric cardiac arrest (CA) is critical for timely intervention in high-risk intensive care settings. We introduce PedCA-FT, a novel transformer-based framework that fuses tabular view of EHR with the derived textual view of EHR to fully unleash the interactions of high-dimensional risk factors and their dynamics. By employing dedicated transformer modules for each modality v…

    Submitted 20 May, 2025; v1 submitted 10 February, 2025; originally announced February 2025.

    Journal ref: in Proceedings of 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2025)

  13. arXiv:2502.00212  [pdf, other]

    cs.LG cs.AI cs.LO

    STP: Self-play LLM Theorem Provers with Iterative Conjecturing and Proving

    Authors: Kefan Dong, Tengyu Ma

    Abstract: A fundamental challenge in formal theorem proving by LLMs is the lack of high-quality training data. Although reinforcement learning or expert iteration partially mitigates this issue by alternating between LLM generating proofs and finetuning them on correctly generated ones, performance quickly plateaus due to the scarcity of correct proofs (sparse rewards). To keep improving the models with lim…

    Submitted 20 March, 2025; v1 submitted 31 January, 2025; originally announced February 2025.

    Comments: 25 pages, 5 figures

  14. arXiv:2501.12948  [pdf, other]

    cs.CL cs.AI cs.LG

    DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

    Authors: DeepSeek-AI, Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, Xiaokang Zhang, Xingkai Yu, Yu Wu, Z. F. Wu, Zhibin Gou, Zhihong Shao, Zhuoshu Li, Ziyi Gao, Aixin Liu, Bing Xue, Bingxuan Wang, Bochao Wu, Bei Feng, Chengda Lu , et al. (175 additional authors not shown)

    Abstract: We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities. Through RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and intriguing reasoning behaviors. However, it encounters…

    Submitted 22 January, 2025; originally announced January 2025.

  15. arXiv:2501.10669  [pdf]

    cond-mat.soft cond-mat.dis-nn

    Role of Random Interaction Connection in the Order Transition of Active Matter Based on the Vicsek Model

    Authors: Ruizhi Jin, Kejun Dong

    Abstract: Randomness plays a key role in the order transition of active matter but has not yet been explicitly considered in pairwise interaction connection. In this letter, we introduce the perception rate P into the Vicsek model as the probability of the interaction connections and model the connections as superposition states. We show that with increasing P, the polar order number undergoes an order tran…

    Submitted 18 January, 2025; originally announced January 2025.

    Comments: 8 pages, 5 figures

  16. arXiv:2501.08828  [pdf, other]

    cs.IR cs.AI cs.CL cs.CV

    MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents

    Authors: Kuicai Dong, Yujing Chang, Xin Deik Goh, Dexun Li, Ruiming Tang, Yong Liu

    Abstract: Multimodal document retrieval aims to identify and retrieve various forms of multimodal content, such as figures, tables, charts, and layout information from extensive documents. Despite its increasing popularity, there is a notable lack of a comprehensive and robust benchmark to effectively evaluate the performance of systems in such tasks. To address this gap, this work introduces a new benchmar…

    Submitted 20 May, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

    Comments: https://huggingface.co/MMDocIR

  17. arXiv:2501.08567  [pdf, other]

    q-bio.OT

    A new perspective on brain stimulation interventions: Optimal stochastic tracking control of brain network dynamics

    Authors: Kangli Dong, Siya Chen, Ying Dan, Lu Zhang, Xinyi Li, Wei Liang, Yue Zhao, Yu Sun

    Abstract: Network control theory (NCT) has recently been utilized in neuroscience to facilitate our understanding of brain stimulation effects. A particularly useful branch of NCT is optimal control, which focuses on applying theoretical and computational principles of control theory to design optimal strategies to achieve specific goals in neural processes. However, most existing research focuses on optima…

    Submitted 16 January, 2025; v1 submitted 14 January, 2025; originally announced January 2025.

    Comments: Supplementary materials can be found at: https://zjueducn-my.sharepoint.com/:b:/g/personal/dongkl_zju_edu_cn/EbG817wduDFIgqqS3zt2d4gB0OZXM9wt-v18Xr41zXS1Fg?e=XOGNwG

  18. arXiv:2501.02474  [pdf, other]

    cs.CV

    Generalization-Enhanced Few-Shot Object Detection in Remote Sensing

    Authors: Hui Lin, Nan Li, Pengjuan Yao, Kexin Dong, Yuhan Guo, Danfeng Hong, Ying Zhang, Congcong Wen

    Abstract: Remote sensing object detection is particularly challenging due to the high resolution, multi-scale features, and diverse ground object characteristics inherent in satellite and UAV imagery. These challenges necessitate more advanced approaches for effective object detection in such environments. While deep learning methods have achieved remarkable success in remote sensing object detection, they…

    Submitted 5 January, 2025; originally announced January 2025.

    Journal ref: IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2015

  19. arXiv:2501.02461  [pdf, other]

    cs.CV cs.AI

    FedRSClip: Federated Learning for Remote Sensing Scene Classification Using Vision-Language Models

    Authors: Hui Lin, Chao Zhang, Danfeng Hong, Kexin Dong, Congcong Wen

    Abstract: Remote sensing data is often distributed across multiple institutions, and due to privacy concerns and data-sharing restrictions, leveraging large-scale datasets in a centralized training framework is challenging. Federated learning offers a promising solution by enabling collaborative model training across distributed data sources without requiring data centralization. However, current Vision-Lan…

    Submitted 5 January, 2025; originally announced January 2025.

  20. arXiv:2412.19437  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V3 Technical Report

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bing Xue, Bingxuan Wang, Bochao Wu, Chengda Lu, Chenggang Zhao, Chengqi Deng, Chenyu Zhang, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fucong Dai, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Han Bao , et al. (175 additional authors not shown)

    Abstract: We present DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token. To achieve efficient inference and cost-effective training, DeepSeek-V3 adopts Multi-head Latent Attention (MLA) and DeepSeekMoE architectures, which were thoroughly validated in DeepSeek-V2. Furthermore, DeepSeek-V3 pioneers an auxiliary-loss-free strategy for loa…

    Submitted 18 February, 2025; v1 submitted 26 December, 2024; originally announced December 2024.

  21. arXiv:2412.11460  [pdf, other]

    astro-ph.HE hep-ex

    Observation of a spectral hardening in cosmic ray boron spectrum with the DAMPE space mission

    Authors: DAMPE Collaboration, F. Alemanno, C. Altomare, Q. An, P. Azzarello, F. C. T. Barbato, P. Bernardini, X. J. Bi, H. Boutin, I. Cagnoli, M. S. Cai, E. Casilli, E. Catanzani, J. Chang, D. Y. Chen, J. L. Chen, Z. F. Chen, Z. X. Chen, P. Coppin, M. Y. Cui, T. S. Cui, Y. X. Cui, I. De Mitri, F. de Palma, A. Di Giovanni , et al. (121 additional authors not shown)

    Abstract: Secondary cosmic ray fluxes are important probes of the propagation and interaction of high-energy particles in the Galaxy. Recent measurements of primary and secondary cosmic ray nuclei have revealed unexpected spectral features that demand a deeper understanding. In this work we report the direct measurement of the cosmic ray boron spectrum from 10 GeV/n to 8 TeV/n with eight years of data colle…

    Submitted 18 December, 2024; v1 submitted 16 December, 2024; originally announced December 2024.

    Comments: 10 pages, 10 figures, submitted to PRL

  22. arXiv:2412.10302  [pdf, other]

    cs.CV cs.AI cs.CL

    DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

    Authors: Zhiyu Wu, Xiaokang Chen, Zizheng Pan, Xingchao Liu, Wen Liu, Damai Dai, Huazuo Gao, Yiyang Ma, Chengyue Wu, Bingxuan Wang, Zhenda Xie, Yu Wu, Kai Hu, Jiawei Wang, Yaofeng Sun, Yukun Li, Yishi Piao, Kang Guan, Aixin Liu, Xin Xie, Yuxiang You, Kai Dong, Xingkai Yu, Haowei Zhang, Liang Zhao , et al. (2 additional authors not shown)

    Abstract: We present DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL, through two key major upgrades. For the vision component, we incorporate a dynamic tiling vision encoding strategy designed for processing high-resolution images with different aspect ratios. For the language component, we leverage Deep…

    Submitted 13 December, 2024; originally announced December 2024.

  23. arXiv:2412.01291  [pdf, other]

    cs.IR cs.ET

    Global Estimation of Building-Integrated Facade and Rooftop Photovoltaic Potential by Integrating 3D Building Footprint and Spatio-Temporal Datasets

    Authors: Qing Yu, Kechuan Dong, Zhiling Guo, Jiaxing Li, Hongjun Tan, Yanxiu Jin, Jian Yuan, Haoran Zhang, Junwei Liu, Qi Chen, Jinyue Yan

    Abstract: This research tackles the challenges of estimating Building-Integrated Photovoltaics (BIPV) potential across various temporal and spatial scales, accounting for different geographical climates and urban morphology. We introduce a holistic methodology for evaluating BIPV potential, integrating 3D building footprint models with diverse meteorological data sources to account for dynamic shadow effect…

    Submitted 2 December, 2024; originally announced December 2024.

    Comments: 17 pages, 5 figures

  24. arXiv:2411.17162  [pdf, other]

    cond-mat.mtrl-sci

    A Recursive Hybrid Tetrahedron Method for Brillouin-zone Integration

    Authors: Kun Dong, Yihao Lin, Xiaoqiang Liu, Jiechao Feng, Ji Feng

    Abstract: A recursive extension of the hybrid tetrahedron method for Brillouin-zone integration is proposed, allowing iterative tetrahedron refinement and significantly reducing the error from the linear tetrahedron method. The Brillouin-zone integral is expressed as a weighted sum on the initial grid, with integral weights collected recursively from the finest grid. Our method is capable of simultaneously…

    Submitted 26 November, 2024; originally announced November 2024.

    Comments: 15 pages, 6 figures

  25. arXiv:2411.06826  [pdf, other]

    cs.LG cs.IR

    Adaptive Conditional Expert Selection Network for Multi-domain Recommendation

    Authors: Kuiyao Dong, Xingyu Lou, Feng Liu, Ruian Wang, Wenyi Yu, Ping Wang, Jun Wang

    Abstract: Mixture-of-Experts (MOE) has recently become the de facto standard in Multi-domain recommendation (MDR) due to its powerful expressive ability. However, such MOE-based methods typically employ all experts for each instance, leading to scalability issues and low discriminability between domains and experts. Furthermore, the design of commonly used domain-specific networks exacerbates the scalability…

    Submitted 11 November, 2024; originally announced November 2024.

  26. arXiv:2411.02983  [pdf, other]

    cs.AI cs.MA cs.RO

    Autonomous Decision Making for UAV Cooperative Pursuit-Evasion Game with Reinforcement Learning

    Authors: Yang Zhao, Zidong Nie, Kangsheng Dong, Qinghua Huang, Xuelong Li

    Abstract: The application of intelligent decision-making in unmanned aerial vehicle (UAV) is increasing, and with the development of UAV 1v1 pursuit-evasion game, multi-UAV cooperative game has emerged as a new challenge. This paper proposes a deep reinforcement learning-based model for decision-making in multi-role UAV cooperative pursuit-evasion game, to address the challenge of enabling UAV to autonomous…

    Submitted 5 November, 2024; originally announced November 2024.

    Comments: 11 pages, 12 figures, 31 conference

    ACM Class: I.2.6; I.2.8

  27. arXiv:2411.01829  [pdf, other]

    cs.LG

    Formal Theorem Proving by Rewarding LLMs to Decompose Proofs Hierarchically

    Authors: Kefan Dong, Arvind Mahankali, Tengyu Ma

    Abstract: Mathematical theorem proving is an important testbed for large language models' deep and abstract reasoning capability. This paper focuses on improving LLMs' ability to write proofs in formal languages that permit automated proof verification/evaluation. Most previous results provide human-written lemmas to the theorem prover, which is an arguably oversimplified setting that does not sufficiently…

    Submitted 4 November, 2024; originally announced November 2024.

  28. arXiv:2410.12142   

    cs.RO eess.SY

    Design Space Exploration of Embedded SoC Architectures for Real-Time Optimal Control

    Authors: Kris Shengjun Dong, Dima Nikiforov, Widyadewi Soedarmadji, Minh Nguyen, Christopher Fletcher, Yakun Sophia Shao

    Abstract: Empowering resource-limited robots to execute computationally intensive tasks such as locomotion and manipulation is challenging. This project provides a comprehensive design space exploration to determine optimal hardware computation architectures suitable for model-based control algorithms. We profile and optimize representative architectural designs across general-purpose scalar, vector process…

    Submitted 24 October, 2024; v1 submitted 15 October, 2024; originally announced October 2024.

    Comments: This submission has been withdrawn following further internal review and discussions with collaborators, as it was determined that the current version does not meet our intended standards, and will not be updated further. This decision aligns with internal changes and agreements that were finalized post-submission

  29. arXiv:2409.14754  [pdf, other]

    cs.RO

    CushionCatch: A Compliant Catching Mechanism for Mobile Manipulators via Combined Optimization and Learning

    Authors: Bingjie Chen, Keyu Fan, Qi Yang, Yi Cheng, Houde Liu, Kangkang Dong, Chongkun Xia, Liang Han, Bin Liang

    Abstract: Catching flying objects with a cushioning process is a skill commonly performed by humans, yet it remains a significant challenge for robots. In this paper, we present a framework that combines optimization and learning to achieve compliant catching on mobile manipulators (CCMM). First, we propose a high-level capture planner for mobile manipulators (MM) that calculates the optimal capture point a…

    Submitted 4 March, 2025; v1 submitted 23 September, 2024; originally announced September 2024.

  30. Hadronic cross section measurements with the DAMPE space mission using 20 GeV-10 TeV cosmic-ray protons and $^4$He

    Authors: F. Alemanno, Q. An, P. Azzarello, F. C. T. Barbato, P. Bernardini, X. J. Bi, I. Cagnoli, M. S. Cai, E. Casilli, E. Catanzani, J. Chang, D. Y. Chen, J. L. Chen, Z. F. Chen, P. Coppin, M. Y. Cui, T. S. Cui, Y. X. Cui, H. T. Dai, A. De Benedittis, I. De Mitri, F. de Palma, A. Di Giovanni, Q. Ding, T. K. Dong , et al. (126 additional authors not shown)

    Abstract: Precise direct cosmic-ray (CR) measurements provide an important probe to study the energetic particle sources in our Galaxy, and the interstellar environment through which these particles propagate. Uncertainties on hadronic models, ion-nucleon cross sections in particular, are currently the limiting factor towards obtaining more accurate CR ion flux measurements with calorimetric space-based exp…

    Submitted 7 January, 2025; v1 submitted 30 August, 2024; originally announced August 2024.

    Comments: Published in PRD

  31. arXiv:2408.14158  [pdf, other]

    cs.DC cs.AI

    Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning

    Authors: Wei An, Xiao Bi, Guanting Chen, Shanhuang Chen, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du, Wenjun Gao, Kang Guan, Jianzhong Guo, Yongqiang Guo, Zhe Fu, Ying He, Panpan Huang, Jiashi Li, Wenfeng Liang, Xiaodong Liu, Xin Liu, Yiyuan Liu, Yuxuan Liu, Shanghao Lu, Xuan Lu, Xiaotao Nie, Tian Pei , et al. (27 additional authors not shown)

    Abstract: The rapid progress in Deep Learning (DL) and Large Language Models (LLMs) has exponentially increased demands of computational power and bandwidth. This, combined with the high costs of faster computing chips and interconnects, has significantly inflated High Performance Computing (HPC) construction costs. To address these challenges, we introduce the Fire-Flyer AI-HPC architecture, a synergistic…

    Submitted 31 August, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

    Comments: This is the preprint version of the paper accepted for presentation at the 2024 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC'24). © 2024 IEEE. Personal use of this material is permitted. For other uses, permission from IEEE must be obtained. Please refer to IEEE Xplore for the final published version

  32. Collaborative Cross-modal Fusion with Large Language Model for Recommendation

    Authors: Zhongzhou Liu, Hao Zhang, Kuicai Dong, Yuan Fang

    Abstract: Despite the success of conventional collaborative filtering (CF) approaches for recommendation systems, they exhibit limitations in leveraging semantic knowledge within the textual attributes of users and items. Recent focus on the application of large language models for recommendation (LLM4Rec) has highlighted their capability for effective semantic knowledge capture. However, these methods ofte…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: 10 pages, 4 figures, accepted by CIKM 2024

  33. arXiv:2408.07866  [pdf, other]

    eess.SY

    Certifiable Reachability Learning Using a New Lipschitz Continuous Value Function

    Authors: Jingqi Li, Donggun Lee, Jaewon Lee, Kris Shengjun Dong, Somayeh Sojoudi, Claire Tomlin

    Abstract: We propose a new reachability learning framework for high-dimensional nonlinear systems, focusing on reach-avoid problems. These problems require computing the reach-avoid set, which ensures that all its elements can safely reach a target set despite disturbances within pre-specified bounds. Our framework has two main parts: offline learning of a newly designed reach-avoid value function, and post-…

    Submitted 15 February, 2025; v1 submitted 14 August, 2024; originally announced August 2024.

  34. arXiv:2408.03408  [pdf, other]

    cs.AR cs.LG cs.PL

    LLM-Aided Compilation for Tensor Accelerators

    Authors: Charles Hong, Sahil Bhatia, Altan Haan, Shengjun Kris Dong, Dima Nikiforov, Alvin Cheung, Yakun Sophia Shao

    Abstract: Hardware accelerators, in particular accelerators for tensor processing, have many potential application domains. However, they currently lack the software infrastructure to support the majority of domains outside of deep learning. Furthermore, a compiler that can easily be updated to reflect changes at both application and hardware levels would enable more agile development and design space explo…

    Submitted 6 August, 2024; originally announced August 2024.

    Comments: 4 page workshop paper

  35. arXiv:2408.01332  [pdf, other]

    cs.LG

    HMDN: Hierarchical Multi-Distribution Network for Click-Through Rate Prediction

    Authors: Xingyu Lou, Yu Yang, Kuiyao Dong, Heyuan Huang, Wenyi Yu, Ping Wang, Xiu Li, Jun Wang

    Abstract: As the recommendation service needs to address increasingly diverse distributions, such as multi-population, multi-scenario, multi-target, and multi-interest, more and more recent works have focused on multi-distribution modeling and achieved great progress. However, most of them only consider modeling in a single multi-distribution manner, ignoring that mixed multi-distributions often coexist and…

    Submitted 2 August, 2024; originally announced August 2024.

  36. arXiv:2407.02883  [pdf, ps, other]

    cs.IR cs.CL

    CoIR: A Comprehensive Benchmark for Code Information Retrieval Models

    Authors: Xiangyang Li, Kuicai Dong, Yi Quan Lee, Wei Xia, Hao Zhang, Xinyi Dai, Yasheng Wang, Ruiming Tang

    Abstract: Despite the substantial success of Information Retrieval (IR) in various NLP tasks, most IR systems predominantly handle queries and corpora in natural language, neglecting the domain of code retrieval. Code retrieval is critically important yet remains under-explored, with existing methods and benchmarks inadequately representing the diversity of code in various domains and tasks. Addressing this…

    Submitted 5 June, 2025; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: ACL 2025 Main

  37. arXiv:2406.11931  [pdf, other]

    cs.SE cs.AI cs.LG

    DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence

    Authors: DeepSeek-AI, Qihao Zhu, Daya Guo, Zhihong Shao, Dejian Yang, Peiyi Wang, Runxin Xu, Y. Wu, Yukun Li, Huazuo Gao, Shirong Ma, Wangding Zeng, Xiao Bi, Zihui Gu, Hanwei Xu, Damai Dai, Kai Dong, Liyue Zhang, Yishi Piao, Zhibin Gou, Zhenda Xie, Zhewen Hao, Bingxuan Wang, Junxiao Song, Deli Chen , et al. (15 additional authors not shown)

    Abstract: We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT4-Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathe…

    Submitted 17 June, 2024; originally announced June 2024.

  38. arXiv:2406.06777  [pdf, ps, other]

    cs.CV cs.AI

    MolX: Enhancing Large Language Models for Molecular Understanding With A Multi-Modal Extension

    Authors: Khiem Le, Zhichun Guo, Kaiwen Dong, Xiaobao Huang, Bozhao Nan, Roshni Iyer, Xiangliang Zhang, Olaf Wiest, Wei Wang, Ting Hua, Nitesh V. Chawla

    Abstract: Large Language Models (LLMs) with their strong task-handling capabilities have shown remarkable advancements across a spectrum of fields, moving beyond natural language understanding. However, their proficiency within the chemistry domain remains restricted, especially in solving professional molecule-related tasks. This challenge is attributed to their inherent limitations in comprehending molecu…

    Submitted 8 June, 2025; v1 submitted 10 June, 2024; originally announced June 2024.

  39. arXiv:2405.18727  [pdf, other]

    cs.CL cs.AI cs.IR

    CtrlA: Adaptive Retrieval-Augmented Generation via Inherent Control

    Authors: Huanshuo Liu, Hao Zhang, Zhijiang Guo, Jing Wang, Kuicai Dong, Xiangyang Li, Yi Quan Lee, Cong Zhang, Yong Liu

    Abstract: Retrieval-augmented generation (RAG) has emerged as a promising solution for mitigating hallucinations of large language models (LLMs) with retrieved external knowledge. Adaptive RAG enhances this approach by enabling dynamic retrieval during generation, activating retrieval only when the query exceeds LLM's internal knowledge. Existing methods primarily focus on detecting LLM's confidence via sta…

    Submitted 3 October, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 29 pages, 10 figures, 11 tables

  40. arXiv:2405.04434  [pdf, other]

    cs.CL cs.AI

    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

    Authors: DeepSeek-AI, Aixin Liu, Bei Feng, Bin Wang, Bingxuan Wang, Bo Liu, Chenggang Zhao, Chengqi Deng, Chong Ruan, Damai Dai, Daya Guo, Dejian Yang, Deli Chen, Dongjie Ji, Erhang Li, Fangyun Lin, Fuli Luo, Guangbo Hao, Guanting Chen, Guowei Li, H. Zhang, Hanwei Xu, Hao Yang, Haowei Zhang, Honghui Ding, et al. (132 additional authors not shown)

    Abstract: We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference…

    Submitted 19 June, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

  41. arXiv:2404.19624  [pdf, other]

    hep-ph nucl-ex physics.data-an

    Impact of recent updates to neutrino oscillation parameters on the effective Majorana neutrino mass in 0$νββ$ Decay

    Authors: Dongming Mei, Kunming Dong, Austin Warren, Sanjay Bhattarai

    Abstract: We investigate how recent updates to neutrino oscillation parameters and the sum of neutrino masses influence the sensitivity of neutrinoless double-beta (0$νββ$) decay experiments. Incorporating the latest cosmological constraints on the sum of neutrino masses and laboratory measurements on oscillations, we determine the sum of neutrino masses for both the normal hierarchy (NH) and the inverted h…

    Submitted 27 June, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

    Comments: 7 pages, 6 figures

  42. arXiv:2404.15103  [pdf, other]

    cs.CL

    Multi-view Content-aware Indexing for Long Document Retrieval

    Authors: Kuicai Dong, Derrick Goh Xin Deik, Yi Quan Lee, Hao Zhang, Xiangyang Li, Cong Zhang, Yong Liu

    Abstract: Long document question answering (DocQA) aims to answer questions from long documents over 10k words. They usually contain content structures such as sections, sub-sections, and paragraph demarcations. However, the indexing methods of long documents remain under-explored, while existing systems generally employ fixed-length chunking. As they do not consider content structures, the resultant chunks…

    Submitted 23 April, 2024; originally announced April 2024.

  43. arXiv:2404.13600  [pdf, other]

    cs.RO

    Are We Ready for Planetary Exploration Robots? The TAIL-Plus Dataset for SLAM in Granular Environments

    Authors: Zirui Wang, Chen Yao, Yangtao Ge, Guowei Shi, Ningbo Yang, Zheng Zhu, Kewei Dong, Hexiang Wei, Zhenzhong Jia, Jing Wu

    Abstract: So far, planetary surface exploration depends on various mobile robot platforms. The autonomous navigation and decision-making of these mobile robots in complex terrains largely rely on their terrain-aware perception, localization and mapping capabilities. In this paper we release the TAIL-Plus dataset, a new challenging dataset in deformable granular environments for planetary exploration robots,…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: Accepted to the IEEE ICRA Workshop on Field Robotics 2024

  44. arXiv:2404.11032  [pdf, other]

    cs.LG cs.SI

    CORE: Data Augmentation for Link Prediction via Information Bottleneck

    Authors: Kaiwen Dong, Zhichun Guo, Nitesh V. Chawla

    Abstract: Link prediction (LP) is a fundamental task in graph representation learning, with numerous applications in diverse domains. However, the generalizability of LP models is often compromised due to the presence of noisy or spurious information in graphs and the inherent incompleteness of graph data. To address these challenges, we draw inspiration from the Information Bottleneck principle and propose…

    Submitted 16 April, 2024; originally announced April 2024.

  45. arXiv:2404.11019  [pdf, other]

    cs.LG

    You do not have to train Graph Neural Networks at all on text-attributed graphs

    Authors: Kaiwen Dong, Zhichun Guo, Nitesh V. Chawla

    Abstract: Graph structured data, specifically text-attributed graphs (TAG), effectively represent relationships among varied entities. Such graphs are essential for semi-supervised node classification tasks. Graph Neural Networks (GNNs) have emerged as a powerful tool for handling this graph-structured data. Although gradient descent is commonly utilized for training GNNs for node classification, this study…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: preprint

  46. arXiv:2404.10584  [pdf, other]

    cs.CV

    ReWiTe: Realistic Wide-angle and Telephoto Dual Camera Fusion Dataset via Beam Splitter Camera Rig

    Authors: Chunli Peng, Xuan Dong, Tiantian Cao, Zhengqing Li, Kun Dong, Weixin Li

    Abstract: The fusion of images from dual camera systems featuring a wide-angle and a telephoto camera has become a hotspot problem recently. By integrating simultaneously captured wide-angle and telephoto images from these systems, the resulting fused image achieves a wide field of view (FOV) coupled with high-definition quality. Existing approaches are mostly deep learning methods, and predominantly rely o…

    Submitted 29 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  47. arXiv:2404.01356  [pdf, other]

    cs.LG cs.AI cs.CY

    The Double-Edged Sword of Input Perturbations to Robust Accurate Fairness

    Authors: Xuran Li, Peng Wu, Yanting Chen, Xingjun Ma, Zhen Zhang, Kaixiang Dong

    Abstract: Deep neural networks (DNNs) are known to be sensitive to adversarial input perturbations, leading to a reduction in either prediction accuracy or individual fairness. To jointly characterize the susceptibility of prediction accuracy and individual fairness to adversarial perturbations, we introduce a novel robustness definition termed robust accurate fairness. Informally, robust accurate fairness…

    Submitted 1 April, 2024; originally announced April 2024.

  48. arXiv:2404.00702  [pdf, other]

    cs.IR

    LLMTreeRec: Unleashing the Power of Large Language Models for Cold-Start Recommendations

    Authors: Wenlin Zhang, Chuhan Wu, Xiangyang Li, Yuhao Wang, Kuicai Dong, Yichao Wang, Xinyi Dai, Xiangyu Zhao, Huifeng Guo, Ruiming Tang

    Abstract: The lack of training data gives rise to the system cold-start problem in recommendation systems, making them struggle to provide effective recommendations. To address this problem, Large Language Models (LLMs) can model recommendation tasks as language analysis tasks and provide zero-shot results based on their vast open-world knowledge. However, the large scale of the item corpus poses a challeng…

    Submitted 23 December, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Journal ref: COLING 2025

  49. arXiv:2403.16037  [pdf, other]

    cs.IR

    Knowledge-aware Dual-side Attribute-enhanced Recommendation

    Authors: Taotian Pang, Xingyu Lou, Fei Zhao, Zhen Wu, Kuiyao Dong, Qiuying Peng, Yue Qi, Xinyu Dai

    Abstract: Knowledge-aware recommendation methods (KGR) based on graph neural networks (GNNs) and contrastive learning (CL) have achieved promising performance. However, they fall short in modeling fine-grained user preferences and further fail to leverage the preference-attribute connection to make predictions, leading to sub-optimal performance. To address the issue, we…

    Submitted 24 March, 2024; originally announced March 2024.

  50. arXiv:2403.05525  [pdf, other]

    cs.AI

    DeepSeek-VL: Towards Real-World Vision-Language Understanding

    Authors: Haoyu Lu, Wen Liu, Bo Zhang, Bingxuan Wang, Kai Dong, Bo Liu, Jingxiang Sun, Tongzheng Ren, Zhuoshu Li, Hao Yang, Yaofeng Sun, Chengqi Deng, Hanwei Xu, Zhenda Xie, Chong Ruan

    Abstract: We present DeepSeek-VL, an open-source Vision-Language (VL) Model designed for real-world vision and language understanding applications. Our approach is structured around three key dimensions: We strive to ensure our data is diverse, scalable, and extensively covers real-world scenarios including web screenshots, PDFs, OCR, charts, and knowledge-based content, aiming for a comprehensive represe…

    Submitted 11 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: https://github.com/deepseek-ai/DeepSeek-VL