Showing 1–50 of 150 results for author: Qiao, S

  1. arXiv:2505.24090  [pdf, other]

    cs.DB cs.AI

    Searching Clinical Data Using Generative AI

    Authors: Karan Hanswadkar, Anika Kanchi, Shivani Tripathi, Shi Qiao, Rony Chatterjee, Alekh Jindal

    Abstract: Artificial Intelligence (AI) is making a major impact on healthcare, particularly through its application in natural language processing (NLP) and predictive analytics. The healthcare sector has increasingly adopted AI for tasks such as clinical data analysis and medical code assignment. However, searching for clinical information in large and often unorganized datasets remains a manual and error-…

    Submitted 29 May, 2025; originally announced May 2025.

  2. arXiv:2505.18668  [pdf, ps, other]

    cs.CV cs.CL

    ChartGalaxy: A Dataset for Infographic Chart Understanding and Generation

    Authors: Zhen Li, Duan Li, Yukai Guo, Xinyuan Guo, Bowen Li, Lanxi Xiao, Shenyu Qiao, Jiashu Chen, Zijian Wu, Hui Zhang, Xinhuan Shu, Shixia Liu

    Abstract: Infographic charts are a powerful medium for communicating abstract data by combining visual elements (e.g., charts, images) with textual information. However, their visual and structural richness poses challenges for large vision-language models (LVLMs), which are typically trained on plain charts. To bridge this gap, we introduce ChartGalaxy, a million-scale dataset designed to advance the under…

    Submitted 31 May, 2025; v1 submitted 24 May, 2025; originally announced May 2025.

    Comments: 56 pages

  3. arXiv:2505.12082  [pdf, other]

    cs.CL cs.LG

    Model Merging in Pre-training of Large Language Models

    Authors: Yunshui Li, Yiyuan Ma, Shen Yan, Chaoyi Zhang, Jing Liu, Jianqiao Lu, Ziwen Xu, Mengzhao Chen, Minrui Wang, Shiyi Zhan, Jin Ma, Xunhao Lai, Deyi Liu, Yao Luo, Xingyan Bin, Hongbin Ren, Mingji Han, Wenhao Hao, Bairen Yi, LingJun Liu, Bole Ma, Xiaoying Jia, Xun Zhou, Siyuan Qiao, Liang Xiang, et al. (1 additional author not shown)

    Abstract: Model merging has emerged as a promising technique for enhancing large language models, though its application in large-scale pre-training remains relatively unexplored. In this paper, we present a comprehensive investigation of model merging techniques during the pre-training process. Through extensive experiments with both dense and Mixture-of-Experts (MoE) architectures ranging from millions to…

    Submitted 22 May, 2025; v1 submitted 17 May, 2025; originally announced May 2025.

  4. arXiv:2504.16360  [pdf, other]

    cs.LG

    Disentangled Graph Representation Based on Substructure-Aware Graph Optimal Matching Kernel Convolutional Networks

    Authors: Mao Wang, Tao Wu, Xingping Xian, Shaojie Qiao, Weina Niu, Canyixing Cui

    Abstract: Graphs effectively characterize relational data, driving graph representation learning methods that uncover underlying predictive information. As state-of-the-art approaches, Graph Neural Networks (GNNs) enable end-to-end learning for diverse tasks. Recent disentangled graph representation learning enhances interpretability by decoupling independent factors in graph data. However, existing methods…

    Submitted 22 April, 2025; originally announced April 2025.

  5. arXiv:2504.03561  [pdf, ps, other]

    cs.CL cs.AI cs.CV cs.LG cs.MA

    SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

    Authors: Runnan Fang, Xiaobin Wang, Yuan Liang, Shuofei Qiao, Jialong Wu, Zekun Xi, Ningyu Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

    Abstract: In the interaction between agents and their environments, agents expand their capabilities by planning and executing actions. However, LLM-based agents face substantial challenges when deployed in novel environments or required to navigate unconventional action spaces. To empower agents to autonomously explore environments, optimize workflows, and enhance their understanding of actions, we propose…

    Submitted 1 June, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

    Comments: ACL 2025

  6. arXiv:2504.03553  [pdf, ps, other]

    cs.CL cs.AI cs.CV cs.LG cs.MA

    Agentic Knowledgeable Self-awareness

    Authors: Shuofei Qiao, Zhisong Qiu, Baochang Ren, Xiaobin Wang, Xiangyuan Ru, Ningyu Zhang, Xiang Chen, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

    Abstract: Large Language Models (LLMs) have achieved considerable performance across various agentic planning tasks. However, traditional agent planning approaches adopt a "flood irrigation" methodology that indiscriminately injects gold trajectories, external feedback, and domain knowledge into agent models. This practice overlooks the fundamental human cognitive principle of situational self-awareness dur…

    Submitted 29 May, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

    Comments: ACL 2025

  7. arXiv:2504.03438  [pdf, other]

    cs.CV

    ZFusion: An Effective Fuser of Camera and 4D Radar for 3D Object Perception in Autonomous Driving

    Authors: Sheng Yang, Tong Zhan, Shichen Qiao, Jicheng Gong, Qing Yang, Jian Wang, Yanfeng Lu

    Abstract: Reliable 3D object perception is essential in autonomous driving. Owing to its sensing capabilities in all weather conditions, 4D radar has recently received much attention. However, compared to LiDAR, 4D radar provides a much sparser point cloud. In this paper, we propose a 3D object detection method, termed ZFusion, which fuses 4D radar and vision modality. As the core of ZFusion, our proposed FP-…

    Submitted 7 April, 2025; v1 submitted 4 April, 2025; originally announced April 2025.

    Comments: CVPR 2025 WDFM-AD

  8. arXiv:2504.01170  [pdf]

    cs.SI

    Estimating Hourly Neighborhood Population Using Mobile Phone Data in the United States

    Authors: Huan Ning, Zhenlong Li, Manzhu Yu, Shiyan Zhang, Shan Qiao

    Abstract: Traditional population estimation techniques often fail to capture the dynamic fluctuations inherent in urban and rural population movements. Recognizing the need for a high spatiotemporal dynamic population dataset, we propose a method using smartphone-based human mobility data to reconstruct the hourly population for each neighborhood across the US. We quantify population fluctuations on an hour…

    Submitted 1 April, 2025; originally announced April 2025.

  9. arXiv:2503.13436  [pdf, other]

    cs.CV cs.LG

    Unified Autoregressive Visual Generation and Understanding with Continuous Tokens

    Authors: Lijie Fan, Luming Tang, Siyang Qin, Tianhong Li, Xuan Yang, Siyuan Qiao, Andreas Steiner, Chen Sun, Yuanzhen Li, Tao Zhu, Michael Rubinstein, Michalis Raptis, Deqing Sun, Radu Soricut

    Abstract: We present UniFluid, a unified autoregressive framework for joint visual generation and understanding leveraging continuous visual tokens. Our unified autoregressive architecture processes multimodal image and text inputs, generating discrete tokens for text and continuous tokens for image. We find though there is an inherent trade-off between the image generation and understanding task, a careful…

    Submitted 17 March, 2025; originally announced March 2025.

    Comments: Tech report

  10. arXiv:2503.05659  [pdf, other]

    cs.IR

    A Survey of Large Language Model Empowered Agents for Recommendation and Search: Towards Next-Generation Information Retrieval

    Authors: Yu Zhang, Shutong Qiao, Jiaqi Zhang, Tzu-Heng Lin, Chen Gao, Yong Li

    Abstract: Information technology has profoundly altered the way humans interact with information. The vast amount of content created, shared, and disseminated online has made it increasingly difficult to access relevant information. Over the past two decades, recommender systems and search (collectively referred to as information retrieval systems) have evolved significantly to address these challenges. Rec…

    Submitted 11 April, 2025; v1 submitted 7 March, 2025; originally announced March 2025.

  11. arXiv:2503.02151  [pdf, other]

    cs.HC

    YouthCare: Building a Personalized Collaborative Video Censorship Tool to Support Parent-Child Joint Media Engagement

    Authors: Wenxin Zhao, Fangyu Yu, Peng Zhang, Hansu Gu, Lin Wang, Siyuan Qiao, Tun Lu, Ning Gu

    Abstract: To mitigate the negative impacts of online videos on teenagers, existing research and platforms have implemented various parental mediation mechanisms, such as Parent-Child Joint Media Engagement (JME). However, JME generally relies heavily on parents' time, knowledge, and experience. To fill this gap, we aim to design an automatic tool to help parents/children censor videos more effectively and e…

    Submitted 3 March, 2025; originally announced March 2025.

  12. arXiv:2502.15589  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG cs.MM

    LightThinker: Thinking Step-by-Step Compression

    Authors: Jintian Zhang, Yuqi Zhu, Mengshu Sun, Yujie Luo, Shuofei Qiao, Lun Du, Da Zheng, Huajun Chen, Ningyu Zhang

    Abstract: Large language models (LLMs) have shown remarkable performance in complex reasoning tasks, but their efficiency is hindered by the substantial memory and computational costs associated with generating lengthy tokens. In this paper, we propose LightThinker, a novel method that enables LLMs to dynamically compress intermediate thoughts during reasoning. Inspired by human cognitive processes, LightTh…

    Submitted 21 February, 2025; originally announced February 2025.

  13. arXiv:2502.11035  [pdf]

    cond-mat.supr-con cond-mat.mtrl-sci

    Organometallic-Inorganic Hybrid MXenes with Tunable Superconductivity

    Authors: Qi Fan, Tao Bo, Wei Guo, Minghua Chen, Qing Tang, Yicong Yang, Mian Li, Ke Chen, Fangfang Ge, Jialu Li, Sicong Qiao, Changda Wang, Li Song, Lijing Yu, Jinghua Guo, Michael Naguib, Zhifang Chai, Qing Huang, Chaochao Dun, Ning Kang, Yury Gogotsi, Kun Liang

    Abstract: Ti-based two-dimensional transition-metal carbides (MXenes) have attracted attention due to their superior properties and are being explored across various applications. Despite their versatile properties, superconductivity has never been demonstrated, not even predicted, for this important group of 2D materials. In this work, we have introduced an electrochemical intercalation protocol to cons…

    Submitted 16 February, 2025; originally announced February 2025.

  14. arXiv:2412.16485  [pdf, other]

    cs.DS

    Fast Biclique Counting on Bipartite Graphs: A Node Pivot-based Approach

    Authors: Xiaowei Ye, Rong-Hua Li, Longlong Lin, Shaojie Qiao, Guoren Wang

    Abstract: Counting the number of $(p, q)$-bicliques (complete bipartite subgraphs) in a bipartite graph is a fundamental problem which plays a crucial role in numerous bipartite graph analysis applications. However, existing algorithms for counting $(p, q)$-bicliques often face significant computational challenges, particularly on large real-world networks. In this paper, we propose a general biclique count…

    Submitted 20 December, 2024; originally announced December 2024.

  15. arXiv:2412.09048  [pdf, other]

    cs.CY

    Oversight in Action: Experiences with Instructor-Moderated LLM Responses in an Online Discussion Forum

    Authors: Shuying Qiao, Paul Denny, Nasser Giacaman

    Abstract: The integration of large language models (LLMs) into computing education offers many potential benefits to student learning, and several novel pedagogical approaches have been reported in the literature. However, LLMs also present challenges, one of the most commonly cited being that of student over-reliance. This challenge is compounded by the fact that LLMs are always available to provide instant…

    Submitted 12 December, 2024; originally announced December 2024.

    Comments: Accepted to ACE'25

  16. arXiv:2411.12781  [pdf, other]

    cs.CV

    FGP: Feature-Gradient-Prune for Efficient Convolutional Layer Pruning

    Authors: Qingsong Lv, Jiasheng Sun, Sheng Zhou, Xu Zhang, Liangcheng Li, Yun Gao, Sun Qiao, Jie Song, Jiajun Bu

    Abstract: To reduce computational overhead while maintaining model performance, model pruning techniques have been proposed. Among these, structured pruning, which removes entire convolutional channels or layers, significantly enhances computational efficiency and is compatible with hardware acceleration. However, existing pruning methods that rely solely on image features or gradients often result in the r…

    Submitted 19 November, 2024; originally announced November 2024.

  17. arXiv:2411.09410  [pdf, other]

    cs.IR

    LLM-based Bi-level Multi-interest Learning Framework for Sequential Recommendation

    Authors: Shutong Qiao, Chen Gao, Wei Yuan, Yong Li, Hongzhi Yin

    Abstract: Sequential recommendation (SR) leverages users' dynamic preferences, with recent advances incorporating multi-interest learning to model diverse user interests. However, most multi-interest SR models rely on noisy, sparse implicit feedback, limiting recommendation accuracy. Large language models (LLMs) offer robust reasoning on low-quality data but face high computational costs and latency challen…

    Submitted 7 May, 2025; v1 submitted 14 November, 2024; originally announced November 2024.

  18. arXiv:2411.01844  [pdf, other]

    cs.HC cs.AI cs.SI

    DeMod: A Holistic Tool with Explainable Detection and Personalized Modification for Toxicity Censorship

    Authors: Yaqiong Li, Peng Zhang, Hansu Gu, Tun Lu, Siyuan Qiao, Yubo Shu, Yiyang Shao, Ning Gu

    Abstract: Although there have been automated approaches and tools supporting toxicity censorship for social posts, most of them focus on detection. Toxicity censorship is a complex process, wherein detection is just an initial task and a user can have further needs such as rationale understanding and content modification. For this problem, we conduct a needfinding study to investigate people's diverse needs…

    Submitted 4 November, 2024; originally announced November 2024.

    Journal ref: Proceedings of the ACM on Human-Computer Interaction (ACM CSCW 2025)

  19. Differentiable architecture search with multi-dimensional attention for spiking neural networks

    Authors: Yilei Man, Linhai Xie, Shushan Qiao, Yumei Zhou, Delong Shang

    Abstract: Spiking Neural Networks (SNNs) have gained enormous popularity in the field of artificial intelligence due to their low power consumption. However, the majority of SNN methods directly inherit the structure of Artificial Neural Networks (ANN), usually leading to sub-optimal model performance in SNNs. To alleviate this problem, we integrate Neural Architecture Search (NAS) method and propose Multi-…

    Submitted 1 November, 2024; originally announced November 2024.

  20. arXiv:2410.13225  [pdf, ps, other]

    hep-ph hep-ex nucl-ex nucl-th

    Twist-3 contribution in the Drell-Yan process with tensor-polarized deuteron

    Authors: Si-Yi Qiao, Qin-Tao Song

    Abstract: The tensor-polarized structures of the deuteron can be probed through the proton-deuteron Drell-Yan process, where the proton is unpolarized and the deuteron is tensor polarized. This measurement will be conducted at Fermilab in the near future. In this reaction, the twist-3 contribution is not negligible compared to the twist-2 contribution due to the limited invariant mass of the dilepton pair.…

    Submitted 24 March, 2025; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: 8 pages, 4 figures

    Journal ref: Physical Review D 111, 054026 (2025)

  21. arXiv:2410.12194  [pdf, other]

    cs.CL

    Negative-Prompt-driven Alignment for Generative Language Model

    Authors: Shiqi Qiao, Ning Xv, Biao Liu, Xin Geng

    Abstract: Large language models have achieved remarkable capabilities, but aligning their outputs with human values and preferences remains a significant challenge. Existing alignment methods primarily focus on positive examples while overlooking the importance of negative responses in guiding models away from undesirable behaviors. For instance, widely used alignment datasets reveal a scarcity of expl…

    Submitted 15 October, 2024; originally announced October 2024.

  22. arXiv:2410.08568  [pdf, other]

    physics.geo-ph cs.LG

    GPR Full-Waveform Inversion through Adaptive Filtering of Model Parameters and Gradients Using CNN

    Authors: Peng Jiang, Kun Wang, Jiaxing Wang, Zeliang Feng, Shengjie Qiao, Runhuai Deng, Fengkai Zhang

    Abstract: GPR full-waveform inversion optimizes the subsurface property model iteratively to match the entire waveform information. However, the model gradients derived from wavefield continuation often contain errors, such as ghost values and excessively large values at transmitter and receiver points. Furthermore, models updated based on these gradients frequently exhibit unclear characterization of anoma…

    Submitted 11 October, 2024; originally announced October 2024.

    Comments: 16 pages, 6 figures

    MSC Class: 86A22 (Primary) 86A20; 68T07 (Secondary) ACM Class: I.2.8; J.2

  23. arXiv:2410.07869  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG cs.MA

    Benchmarking Agentic Workflow Generation

    Authors: Shuofei Qiao, Runnan Fang, Zhisong Qiu, Xiaobin Wang, Ningyu Zhang, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

    Abstract: Large Language Models (LLMs), with their exceptional ability to handle a wide range of tasks, have driven significant advancements in tackling reasoning and planning tasks, wherein decomposing complex problems into executable workflows is a crucial step in this process. Existing workflow evaluation frameworks either focus solely on holistic performance or suffer from limitations such as restricted…

    Submitted 23 February, 2025; v1 submitted 10 October, 2024; originally announced October 2024.

    Comments: ICLR 2025

  24. arXiv:2410.05717  [pdf, other]

    cs.CV

    Advancements in Road Lane Mapping: Comparative Fine-Tuning Analysis of Deep Learning-based Semantic Segmentation Methods Using Aerial Imagery

    Authors: Willow Liu, Shuxin Qiao, Kyle Gao, Hongjie He, Michael A. Chapman, Linlin Xu, Jonathan Li

    Abstract: This research addresses the need for high-definition (HD) maps for autonomous vehicles (AVs), focusing on road lane information derived from aerial imagery. While Earth observation data offers valuable resources for map creation, specialized models for road lane extraction are still underdeveloped in remote sensing. In this study, we perform an extensive comparison of twelve foundational deep lear…

    Submitted 15 October, 2024; v1 submitted 8 October, 2024; originally announced October 2024.

  25. arXiv:2409.15698  [pdf, other]

    cs.LG cs.SI

    GISExplainer: On Explainability of Graph Neural Networks via Game-theoretic Interaction Subgraphs

    Authors: Xingping Xian, Jianlu Liu, Chao Wang, Tao Wu, Shaojie Qiao, Xiaochuan Tang, Qun Liu

    Abstract: Explainability is crucial for the application of black-box Graph Neural Networks (GNNs) in critical fields such as healthcare, finance, cybersecurity, and more. Various feature attribution methods, especially the perturbation-based methods, have been proposed to indicate how much each node/edge contributes to the model predictions. However, these methods fail to generate connected explanatory subg…

    Submitted 30 December, 2024; v1 submitted 23 September, 2024; originally announced September 2024.

    Comments: 13 pages, 7 figures

  26. Absence of altermagnetic spin splitting character in rutile oxide RuO$_2$

    Authors: Jiayu Liu, Jie Zhan, Tongrui Li, Jishan Liu, Shufan Cheng, Yuming Shi, Liwei Deng, Meng Zhang, Chihao Li, Jianyang Ding, Qi Jiang, Mao Ye, Zhengtai Liu, Zhicheng Jiang, Siyu Wang, Qian Li, Yanwu Xie, Yilin Wang, Shan Qiao, Jinsheng Wen, Yan Sun, Dawei Shen

    Abstract: Rutile RuO$_2$ has been posited as a potential $d$-wave altermagnetism candidate, with a predicted significant spin splitting up to 1.4 eV. Despite accumulating theoretical predictions and transport measurements, direct spectroscopic observation of spin splitting has remained elusive. Here, we employ spin- and angle-resolved photoemission spectroscopy to investigate the band structures and spin po…

    Submitted 8 November, 2024; v1 submitted 20 September, 2024; originally announced September 2024.

    Comments: 7 pages, 4 figures. Published in Physical Review Letters

    Journal ref: Phys. Rev. Lett. 133, 176401 (2024)

  27. arXiv:2408.09786  [pdf, ps, other]

    cs.CV

    Graph-guided Cross-composition Feature Disentanglement for Compositional Zero-shot Learning

    Authors: Yuxia Geng, Runkai Zhu, Jiaoyan Chen, Jintai Chen, Xiang Chen, Zhuo Chen, Shuofei Qiao, Yuxiang Wang, Xiaoliang Xu, Sheng-Jun Huang

    Abstract: Disentanglement of visual features of primitives (i.e., attributes and objects) has shown exceptional results in Compositional Zero-shot Learning (CZSL). However, due to the feature divergence of an attribute (resp. object) when combined with different objects (resp. attributes), it is challenging to learn disentangled primitive features that are general across different compositions. To this end,…

    Submitted 29 May, 2025; v1 submitted 19 August, 2024; originally announced August 2024.

    Comments: Accepted in ACL 2025 findings

  28. arXiv:2407.19555  [pdf]

    cond-mat.str-el cond-mat.supr-con

    Crystal-symmetry-paired spin-valley locking in a layered room-temperature antiferromagnet

    Authors: Fayuan Zhang, Xingkai Cheng, Zhouyi Yin, Changchao Liu, Liwei Deng, Yuxi Qiao, Zheng Shi, Shuxuan Zhang, Junhao Lin, Zhengtai Liu, Mao Ye, Yaobo Huang, Xiangyu Meng, Cheng Zhang, Taichi Okuda, Kenya Shimada, Shengtao Cui, Yue Zhao, Guang-Han Cao, Shan Qiao, Junwei Liu, Chaoyu Chen

    Abstract: Recent theoretical efforts predicted a type of unconventional antiferromagnet characterized by the crystal symmetry C (rotation or mirror), which connects antiferromagnetic sublattices in real space and simultaneously couples spin and momentum in reciprocal space. This results in a unique C-paired spin-valley locking (SVL) and corresponding novel properties such as piezomagnetism and noncollinear…

    Submitted 2 August, 2024; v1 submitted 28 July, 2024; originally announced July 2024.

    Comments: 22 pages, 5 figures

  29. arXiv:2407.15017  [pdf, other]

    cs.CL cs.AI cs.CV cs.HC cs.LG

    Knowledge Mechanisms in Large Language Models: A Survey and Perspective

    Authors: Mengru Wang, Yunzhi Yao, Ziwen Xu, Shuofei Qiao, Shumin Deng, Peng Wang, Xiang Chen, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang

    Abstract: Understanding knowledge mechanisms in Large Language Models (LLMs) is crucial for advancing towards trustworthy AGI. This paper reviews knowledge mechanism analysis from a novel taxonomy including knowledge utilization and evolution. Knowledge utilization delves into the mechanism of memorization, comprehension and application, and creation. Knowledge evolution focuses on the dynamic progression o…

    Submitted 4 December, 2024; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: EMNLP 2024 Findings; 39 pages (v4)

  30. arXiv:2406.13920  [pdf, other]

    cs.LG cs.SI

    Understanding the Robustness of Graph Neural Networks against Adversarial Attacks

    Authors: Tao Wu, Canyixing Cui, Xingping Xian, Shaojie Qiao, Chao Wang, Lin Yuan, Shui Yu

    Abstract: Recent studies have shown that graph neural networks (GNNs) are vulnerable to adversarial attacks, posing significant challenges to their deployment in safety-critical scenarios. This vulnerability has spurred a growing focus on designing robust GNNs. Despite this interest, current advancements have predominantly relied on empirical trial and error, resulting in a limited understanding of the robu…

    Submitted 25 May, 2025; v1 submitted 19 June, 2024; originally announced June 2024.

  31. arXiv:2406.13499  [pdf, other]

    cs.SI cs.LG

    GraphMU: Repairing Robustness of Graph Neural Networks via Machine Unlearning

    Authors: Tao Wu, Xinwen Cao, Chao Wang, Shaojie Qiao, Xingping Xian, Lin Yuan, Canyixing Cui, Yanbing Liu

    Abstract: Graph Neural Networks (GNNs) have demonstrated significant application potential in various fields. However, GNNs are still vulnerable to adversarial attacks. Numerous adversarial defense methods on GNNs are proposed to address the problem of adversarial attacks. However, these methods can only serve as a defense before poisoning, but cannot repair a poisoned GNN. Therefore, there is an urgent need…

    Submitted 19 June, 2024; originally announced June 2024.

  32. arXiv:2406.11229  [pdf, other]

    eess.SY eess.SP

    Low-probability of Intercept/Detect (LPI/LPD) Secure Communications Using Antenna Arrays Employing Rapid Sidelobe Time Modulation

    Authors: Jiahao Zhao, Shichen Qiao, John H. Booske, Nader Behdad

    Abstract: We present an electronically-reconfigurable antenna array offering low probability of intercept/detect (LPI/LPD) and secure communications capabilities simultaneously at the physical layer. This antenna array is designed to provide rapidly time-varying sidelobes and a stationary main lobe. By performing rapid sidelobe time modulation (SLTM), the signal transmitted in the undesired directions (i.e.…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: This work has been submitted to the IEEE for possible publication

  33. arXiv:2405.14205  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG cs.MA

    Agent Planning with World Knowledge Model

    Authors: Shuofei Qiao, Runnan Fang, Ningyu Zhang, Yuqi Zhu, Xiang Chen, Shumin Deng, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen

    Abstract: Recent endeavors towards directly using large language models (LLMs) as agent models to execute interactive planning tasks have shown commendable results. Despite their achievements, however, they still struggle with brainless trial-and-error in global planning and generating hallucinatory actions in local planning due to their poor understanding of the "real" physical world. Imitating humans' m…

    Submitted 3 January, 2025; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: NeurIPS 2024

  34. Large band-splitting in $g$-wave type altermagnet CrSb

    Authors: Jianyang Ding, Zhicheng Jiang, Xiuhua Chen, Zicheng Tao, Zhengtai Liu, Tongrui Li, Jishan Liu, Jianping Sun, Jinguang Cheng, Jiayu Liu, Yichen Yang, Runfeng Zhang, Liwei Deng, Wenchuan Jing, Yu Huang, Yuming Shi, Mao Ye, Shan Qiao, Yilin Wang, Yanfeng Guo, Donglai Feng, Dawei Shen

    Abstract: Altermagnetism (AM), a newly discovered magnetic state, ingeniously integrates the properties of ferromagnetism and antiferromagnetism, representing a significant breakthrough in the field of magnetic materials. Despite experimental verification of some typical AM materials, such as MnTe and MnTe$_2$, the pursuit of AM materials that feature larger spin splitting and higher transition temperature…

    Submitted 15 November, 2024; v1 submitted 21 May, 2024; originally announced May 2024.

    Comments: 7 pages, 4 figures

    Journal ref: Phys. Rev. Lett. 133, 206401 (2024)

  35. arXiv:2405.03928  [pdf, ps, other]

    cond-mat.mtrl-sci cond-mat.mes-hall cond-mat.supr-con

    MSene: A new large family of two-dimensional transition metal sulfide with MXene structure

    Authors: Shu-Xiang Qiao, Yu-Lin Han, Na Jiao, Meng-Meng Zheng, Hong-Yan Lu, Ping Zhang

    Abstract: In this work, we theoretically report a new large family of two-dimensional (2D) transition metal sulfides $M_2$S with MXene structure in 2H and 1T phases, which we name as MSene. Twenty-four out of fifty-eight MSenes are proved to be stable. Notably, this family includes twelve superconducting (SC) materials, seven SC topological metals (SCTMs), four charge density wave (CDW) materials, and f…

    Submitted 9 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: 6 pages, 5 figures

    Journal ref: Physical Review B 111, L041404 (2025)

  36. arXiv:2405.03162  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Advancing Multimodal Medical Capabilities of Gemini

    Authors: Lin Yang, Shawn Xu, Andrew Sellergren, Timo Kohlberger, Yuchen Zhou, Ira Ktena, Atilla Kiraly, Faruk Ahmed, Farhad Hormozdiari, Tiam Jaroensri, Eric Wang, Ellery Wulczyn, Fayaz Jamil, Theo Guidroz, Chuck Lau, Siyuan Qiao, Yun Liu, Akshay Goel, Kendall Park, Arnav Agharwal, Nick George, Yang Wang, Ryutaro Tanno, David G. T. Barrett, Wei-Hung Weng, et al. (22 additional authors not shown)

    Abstract: Many clinical tasks require an understanding of specialized data, such as medical images and genomics, which is not typically found in general-purpose large multimodal models. Building upon Gemini's multimodal models, we develop several models within the new Med-Gemini family that inherit core capabilities of Gemini and are optimized for medical use via fine-tuning with 2D and 3D radiology, histop…

    Submitted 6 May, 2024; originally announced May 2024.

  37. arXiv:2405.01488  [pdf, other]

    cs.LG stat.ML

    Digital Twin Generators for Disease Modeling

    Authors: Nameyeh Alam, Jake Basilico, Daniele Bertolini, Satish Casie Chetty, Heather D'Angelo, Ryan Douglas, Charles K. Fisher, Franklin Fuller, Melissa Gomes, Rishabh Gupta, Alex Lang, Anton Loukianov, Rachel Mak-McCully, Cary Murray, Hanalei Pham, Susanna Qiao, Elena Ryapolova-Webb, Aaron Smith, Dimitri Theoharatos, Anil Tolwani, Eric W. Tramel, Anna Vidovszky, Judy Viduya, Jonathan R. Walsh

    Abstract: A patient's digital twin is a computational model that describes the evolution of their health over time. Digital twins have the potential to revolutionize medicine by enabling individual-level computer simulations of human health, which can be used to conduct more efficient clinical trials or to recommend personalized treatment options. Due to the overwhelming complexity of human biology, machine…

    Submitted 2 May, 2024; originally announced May 2024.

  38. arXiv:2403.19651  [pdf, other]

    cs.CV cs.AI cs.CL cs.IR cs.MM

    MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

    Authors: Kai Zhang, Yi Luan, Hexiang Hu, Kenton Lee, Siyuan Qiao, Wenhu Chen, Yu Su, Ming-Wei Chang

    Abstract: Image retrieval, i.e., finding desired images given a reference image, inherently encompasses rich, multi-faceted search intents that are difficult to capture solely using image-based measures. Recent works leverage text instructions to allow users to more freely express their search intents. However, they primarily focus on image pairs that are visually similar and/or can be characterized by a sm…

    Submitted 24 June, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

    Comments: ICML 2024 (Oral); Project Website: https://open-vision-language.github.io/MagicLens/

  39. arXiv:2403.08357  [pdf]

    cond-mat.mtrl-sci physics.comp-ph

    Geometric and electronic properties of two kinds of CrO2 magnetic monolayers: D3d and D2h phases

    Authors: Yang Zhang, Xianggong Bo, Jimeng Jing, Lixia Wang, Shiqian Qiao, Hong Wu, Yong Pu, Feng Li

    Abstract: Due to the high magnetic coupling strength between the Cr elements, the bulk phase CrO2 is one of several ferromagnetic oxides known to have the highest Curie temperature. When the dimensionality of the material is reduced from 3D to 2D, the 2D CrO2 system material is expected to maintain a high Curie temperature. In this work, we predict two new phases of CrO2 monolayer (D3d and D2h) by using fir…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: 5 pages, 4 figures

  40. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1112 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 16 December, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  41. arXiv:2403.03101  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG cs.MA

    KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents

    Authors: Yuqi Zhu, Shuofei Qiao, Yixin Ou, Shumin Deng, Shiwei Lyu, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen, Ningyu Zhang

    Abstract: Large Language Models (LLMs) have demonstrated great potential in complex reasoning tasks, yet they fall short when tackling more sophisticated challenges, especially when interacting with environments through generating executable actions. This inadequacy primarily stems from the lack of built-in action knowledge in language agents, which fails to effectively guide the planning trajectories durin…

    Submitted 21 February, 2025; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: NAACL 2025 Findings. Project page: https://zjunlp.github.io/project/KnowAgent/ Code: https://github.com/zjunlp/KnowAgent

  42. arXiv:2402.15349  [pdf, other]

    cond-mat.mtrl-sci

    Two-dimensional photonic crystal cavities in ZnSe quantum well structures

    Authors: Siqi Qiao, Nils von den Driesch, Xi Chen, Stefan Trellenkamp, Florian Lentz, Christoph Krause, Benjamin Bennemann, Thorsten Brazda, James M. LeBeau, Alexander Pawlis

    Abstract: ZnSe and related materials like ZnMgSe and ZnCdSe are promising II-VI host materials for optically mediated quantum information technology such as single photon sources or spin qubits. Integrating these heterostructures into photonic crystal (PC) cavities enables further improvements, for example realizing Purcell-enhanced single photon sources with increased quantum efficiency. Here we report on…

    Submitted 23 February, 2024; originally announced February 2024.

  43. arXiv:2402.13840  [pdf, other]

    cs.IR cs.AI

    Multi-view Intent Learning and Alignment with Large Language Models for Session-based Recommendation

    Authors: Shutong Qiao, Wei Zhou, Junhao Wen, Chen Gao, Qun Luo, Peixuan Chen, Yong Li

    Abstract: Session-based recommendation (SBR) methods often rely on user behavior data, which can struggle with the sparsity of session data, limiting performance. Researchers have identified that beyond behavioral signals, rich semantic information in item descriptions is crucial for capturing hidden user intent. While large language models (LLMs) offer new ways to leverage this semantic data, the challenge…

    Submitted 13 April, 2025; v1 submitted 21 February, 2024; originally announced February 2024.

  44. arXiv:2402.03049  [pdf, other]

    cs.CL cs.AI cs.HC cs.IR cs.LG

    EasyInstruct: An Easy-to-use Instruction Processing Framework for Large Language Models

    Authors: Yixin Ou, Ningyu Zhang, Honghao Gui, Ziwen Xu, Shuofei Qiao, Yida Xue, Runnan Fang, Kangwei Liu, Lei Li, Zhen Bi, Guozhou Zheng, Huajun Chen

    Abstract: In recent years, instruction tuning has gained increasing attention and emerged as a crucial technique to enhance the capabilities of Large Language Models (LLMs). To construct high-quality instruction datasets, many instruction processing approaches have been proposed, aiming to achieve a delicate balance between data quantity and data quality. Nevertheless, due to inconsistencies that persist am…

    Submitted 23 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: ACL 2024 System Demonstrations; Project website: https://zjunlp.github.io/project/EasyInstruct Code: https://github.com/zjunlp/EasyInstruct Video: https://youtu.be/rfQOWYfziFo Demo: https://huggingface.co/spaces/zjunlp/EasyInstruct

  45. arXiv:2401.05268  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG cs.MA

    AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning

    Authors: Shuofei Qiao, Ningyu Zhang, Runnan Fang, Yujie Luo, Wangchunshu Zhou, Yuchen Eleanor Jiang, Chengfei Lv, Huajun Chen

    Abstract: Language agents have achieved considerable performance on various complex question-answering tasks by planning with external tools. Despite the incessant exploration in this field, existing language agent systems still struggle with costly, non-reproducible data reliance and face the challenge of compelling a single model for multiple functions. To this end, we introduce AutoAct, an automatic agen…

    Submitted 26 May, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: ACL 2024

  46. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1326 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 9 May, 2025; v1 submitted 18 December, 2023; originally announced December 2023.

  47. arXiv:2312.02725  [pdf, other]

    cs.CV

    R3D-SWIN:Use Shifted Window Attention for Single-View 3D Reconstruction

    Authors: Chenhuan Li, Meihua Xiao, zehuan li, Fangping Chen, Shanshan Qiao, Dingli Wang, Mengxi Gao, Siyi Zhang

    Abstract: Recently, vision transformers have performed well in various computer vision tasks, including voxel 3D reconstruction. However, the windows of the vision transformer are not multi-scale, and there is no connection between the windows, which limits the accuracy of voxel 3D reconstruction. Therefore, we propose a voxel 3D reconstruction network based on shifted window attention. To the best of our k…

    Submitted 6 March, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: Under consideration at Pattern Recognition Letters

  48. arXiv:2311.17072  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM

    IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers

    Authors: Chenglin Yang, Siyuan Qiao, Yuan Cao, Yu Zhang, Tao Zhu, Alan Yuille, Jiahui Yu

    Abstract: Generative training has been demonstrated to be powerful for building visual-language models. However, on zero-shot discriminative benchmarks, there is still a performance gap between models trained with generative and discriminative objectives. In this paper, we aim to narrow this gap by improving the efficacy of generative training on classification tasks, without any finetuning processes or add…

    Submitted 16 July, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: To appear in ECCV 2024

  49. arXiv:2311.15235  [pdf, ps, other]

    math.DS

    Limited bisimulations for nondeterministic fuzzy transition systems

    Authors: Sha Qiao, Jun e Feng, Ping Zhu

    Abstract: The limited version of bisimulation, called limited approximate bisimulation, has recently been introduced to fuzzy transition systems (FTSs). This article extends limited approximate bisimulation to nondeterministic fuzzy transition systems (NFTSs), which are more general structures than FTSs, to introduce a notion of $k$-limited $α$-bisimulation by using an approach of relational lifting, where $k$ is a natural number and $α\in[0,1]$. To…

    Submitted 26 November, 2023; originally announced November 2023.

  50. arXiv:2311.05770  [pdf, other]

    cs.CV

    PolyMaX: General Dense Prediction with Mask Transformer

    Authors: Xuan Yang, Liangzhe Yuan, Kimberly Wilber, Astuti Sharma, Xiuye Gu, Siyuan Qiao, Stephanie Debats, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko, Liang-Chieh Chen

    Abstract: Dense prediction tasks, such as semantic segmentation, depth estimation, and surface normal prediction, can be easily formulated as per-pixel classification (discrete outputs) or regression (continuous outputs). This per-pixel prediction paradigm has remained popular due to the prevalence of fully convolutional networks. However, on the recent frontier of segmentation task, the community has been…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: WACV 2024