
Showing 1–50 of 116 results for author: Shu, K

  1. arXiv:2509.25154  [pdf, ps, other]

    cs.AI

    Who's Your Judge? On the Detectability of LLM-Generated Judgments

    Authors: Dawei Li, Zhen Tan, Chengshuai Zhao, Bohan Jiang, Baixiang Huang, Pingchuan Ma, Abdullah Alnaibari, Kai Shu, Huan Liu

    Abstract: Large Language Model (LLM)-based judgments leverage powerful LLMs to efficiently evaluate candidate content and provide judgment scores. However, the inherent biases and vulnerabilities of LLM-generated judgments raise concerns, underscoring the urgent need for distinguishing them in sensitive scenarios like academic peer reviewing. In this work, we propose and formalize the task of judgment detec…

    Submitted 29 September, 2025; originally announced September 2025.

    Comments: Under review

  2. arXiv:2509.21740  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Self-Speculative Biased Decoding for Faster Live Translation

    Authors: Linxiao Zeng, Haoyun Deng, Kangyuan Shu, Shizhen Wang

    Abstract: Large Language Models (LLMs) have recently demonstrated impressive capabilities in various text generation tasks. However, it remains challenging to use them off-the-shelf in streaming applications (such as live translation), where the output must continually update as the input context expands, while still maintaining a reasonable computational cost to meet the latency requirement. In this work…

    Submitted 25 September, 2025; originally announced September 2025.

  3. arXiv:2509.03540  [pdf, ps, other]

    cs.CL cs.AI

    Improving Factuality in LLMs via Inference-Time Knowledge Graph Construction

    Authors: Shanglin Wu, Lihui Liu, Jinho D. Choi, Kai Shu

    Abstract: Large Language Models (LLMs) often struggle with producing factually consistent answers due to limitations in their parametric memory. Retrieval-Augmented Generation (RAG) methods address this issue by incorporating external knowledge from trusted sources at inference time. However, such methods typically treat knowledge as unstructured text, which limits their ability to support compositional rea…

    Submitted 31 August, 2025; originally announced September 2025.

  4. arXiv:2508.12632  [pdf, ps, other]

    cs.CL

    Prompt-Induced Linguistic Fingerprints for LLM-Generated Fake News Detection

    Authors: Chi Wang, Min Gao, Zongwei Wang, Junwei Yin, Kai Shu, Chenghua Lin

    Abstract: With the rapid development of large language models, the generation of fake news has become increasingly effortless, posing a growing societal threat and underscoring the urgent need for reliable detection methods. Early efforts to identify LLM-generated fake news have predominantly focused on the textual content itself; however, because much of that content may appear coherent and factually consi…

    Submitted 18 August, 2025; originally announced August 2025.

  5. arXiv:2508.07237  [pdf, ps, other]

    cs.CV

    ASM-UNet: Adaptive Scan Mamba Integrating Group Commonalities and Individual Variations for Fine-Grained Segmentation

    Authors: Bo Wang, Mengyuan Xu, Yue Yan, Yuqun Yang, Kechen Shu, Wei Ping, Xu Tang, Wei Jiang, Zheng You

    Abstract: Precise lesion resection depends on accurately identifying fine-grained anatomical structures. While many coarse-grained segmentation (CGS) methods have been successful in large-scale segmentation (e.g., organs), they fall short in clinical scenarios requiring fine-grained segmentation (FGS), which remains challenging due to frequent individual variations in small-scale anatomical structures. Alth…

    Submitted 10 August, 2025; originally announced August 2025.

  6. arXiv:2508.05920  [pdf, ps, other]

    cs.DS math.NA

    Debiasing Polynomial and Fourier Regression

    Authors: Chris Camaño, Raphael A. Meyer, Kevin Shu

    Abstract: We study the problem of approximating an unknown function $f:\mathbb{R}\to\mathbb{R}$ by a degree-$d$ polynomial using as few function evaluations as possible, where error is measured with respect to a probability distribution $μ$. Existing randomized algorithms achieve near-optimal sample complexities to recover a $(1+\varepsilon)$-optimal polynomial but produce biased estimates of the best pol…

    Submitted 7 August, 2025; originally announced August 2025.

    MSC Class: 65F99 ACM Class: G.1.3

  7. arXiv:2508.03098  [pdf, ps, other]

    cs.CL

    Privacy-Aware Decoding: Mitigating Privacy Leakage of Large Language Models in Retrieval-Augmented Generation

    Authors: Haoran Wang, Xiongxiao Xu, Baixiang Huang, Kai Shu

    Abstract: Retrieval-Augmented Generation (RAG) enhances the factual accuracy of large language models (LLMs) by conditioning outputs on external knowledge sources. However, when retrieval involves private or sensitive data, RAG systems are susceptible to extraction attacks that can leak confidential information through generated responses. We propose Privacy-Aware Decoding (PAD), a lightweight, inference-ti…

    Submitted 5 August, 2025; originally announced August 2025.

  8. arXiv:2507.02773  [pdf, ps, other]

    cs.AI cs.LG cs.MA

    KERAP: A Knowledge-Enhanced Reasoning Approach for Accurate Zero-shot Diagnosis Prediction Using Multi-agent LLMs

    Authors: Yuzhang Xie, Hejie Cui, Ziyang Zhang, Jiaying Lu, Kai Shu, Fadi Nahab, Xiao Hu, Carl Yang

    Abstract: Medical diagnosis prediction plays a critical role in disease detection and personalized healthcare. While machine learning (ML) models have been widely adopted for this task, their reliance on supervised training limits their ability to generalize to unseen cases, particularly given the high cost of acquiring large, labeled datasets. Large language models (LLMs) have shown promise in leveraging l…

    Submitted 6 July, 2025; v1 submitted 3 July, 2025; originally announced July 2025.

    Journal ref: American Medical Informatics Association (AMIA) 2025 Annual Symposium, Oral

  9. arXiv:2507.02245  [pdf, ps, other]

    cs.RO

    CoInfra: A Large-Scale Cooperative Infrastructure Perception System and Dataset in Adverse Weather

    Authors: Minghao Ning, Yufeng Yang, Keqi Shu, Shucheng Huang, Jiaming Zhong, Maryam Salehi, Mahdi Rahmani, Yukun Lu, Chen Sun, Aladdin Saleh, Ehsan Hashemi, Amir Khajepour

    Abstract: We present CoInfra, a large-scale cooperative infrastructure perception system and dataset designed to advance robust multi-agent perception under real-world and adverse weather conditions. The CoInfra system includes 14 fully synchronized sensor nodes, each equipped with dual RGB cameras and a LiDAR, deployed across a shared region and operating continuously to capture all traffic participants in…

    Submitted 4 July, 2025; v1 submitted 2 July, 2025; originally announced July 2025.

    Comments: This paper has been submitted to the IEEE Transactions on Robotics for review

  10. arXiv:2506.20606  [pdf, ps, other]

    cs.CL

    Model Editing as a Double-Edged Sword: Steering Agent Ethical Behavior Toward Beneficence or Harm

    Authors: Baixiang Huang, Zhen Tan, Haoran Wang, Zijie Liu, Dawei Li, Ali Payani, Huan Liu, Tianlong Chen, Kai Shu

    Abstract: Agents based on Large Language Models (LLMs) have demonstrated strong capabilities across a wide range of tasks. However, deploying LLM-based agents in high-stakes domains comes with significant safety and ethical risks. Unethical behavior by these agents can directly result in serious real-world consequences, including physical harm and financial loss. To efficiently steer the ethical behavior of…

    Submitted 25 June, 2025; originally announced June 2025.

    Comments: Main paper: 9 pages; total: 18 pages (including appendix). Code, data, results, and additional resources are available at: https://model-editing.github.io

  11. arXiv:2506.12817  [pdf, ps, other]

    eess.AS cs.SD

    Magnetoencephalography (MEG) Based Non-Invasive Chinese Speech Decoding

    Authors: Zhihong Jia, Hongbin Wang, Yuanzhong Shen, Feng Hu, Jiayu An, Kai Shu, Dongrui Wu

    Abstract: As an emerging paradigm of brain-computer interfaces (BCIs), speech BCI has the potential to directly reflect auditory perception and thoughts, offering a promising communication alternative for patients with aphasia. Chinese is one of the most widely spoken languages in the world, whereas there is very limited research on speech BCIs for Chinese language. This paper reports a text-magnetoencephal…

    Submitted 15 June, 2025; originally announced June 2025.

  12. arXiv:2506.11697  [pdf, ps, other]

    cs.SE

    SoK: Automated Vulnerability Repair: Methods, Tools, and Assessments

    Authors: Yiwei Hu, Zhen Li, Kedie Shu, Shenghua Guan, Deqing Zou, Shouhuai Xu, Bin Yuan, Hai Jin

    Abstract: The increasing complexity of software has led to the steady growth of vulnerabilities. Vulnerability repair investigates how to fix software vulnerabilities. Manual vulnerability repair is labor-intensive and time-consuming because it relies on human experts, highlighting the importance of Automated Vulnerability Repair (AVR). In this SoK, we present the systematization of AVR methods through the…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: The full version of "SoK: Automated Vulnerability Repair: Methods, Tools, and Assessments" accepted by the 34th USENIX Security Symposium (USENIX Security 2025)

  13. arXiv:2505.23840  [pdf, ps, other]

    cs.CL

    Measuring Sycophancy of Language Models in Multi-turn Dialogues

    Authors: Jiseung Hong, Grace Byun, Seungone Kim, Kai Shu, Jinho D. Choi

    Abstract: Large Language Models (LLMs) are expected to provide helpful and harmless responses, yet they often exhibit sycophancy--conforming to user beliefs regardless of factual accuracy or ethical soundness. Prior research on sycophancy has primarily focused on single-turn factual correctness, overlooking the dynamics of real-world interactions. In this work, we introduce SYCON Bench, a novel benchmark fo…

    Submitted 25 August, 2025; v1 submitted 28 May, 2025; originally announced May 2025.

    Comments: Accepted to Findings of EMNLP 2025

  14. arXiv:2505.21676  [pdf, ps, other]

    cs.RO cs.NI

    Real-World Deployment of Cloud Autonomous Mobility System Using 5G Networks for Outdoor and Indoor Environments

    Authors: Yufeng Yang, Minghao Ning, Keqi Shu, Aladdin Saleh, Ehsan Hashemi, Amir Khajepour

    Abstract: The growing complexity of both outdoor and indoor mobility systems demands scalable, cost-effective, and reliable perception and communication frameworks. This work presents the real-world deployment and evaluation of a Cloud Autonomous Mobility (CAM) system that leverages distributed sensor nodes connected via 5G networks, which integrates LiDAR- and camera-based perception at infrastructure unit…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: This paper has been submitted to IEEE Intelligent Transportation Systems Magazine

  15. arXiv:2505.19652  [pdf, other]

    cs.HC cs.SD eess.AS

    SACM: SEEG-Audio Contrastive Matching for Chinese Speech Decoding

    Authors: Hongbin Wang, Zhihong Jia, Yuanzhong Shen, Ziwei Wang, Siyang Li, Kai Shu, Feng Hu, Dongrui Wu

    Abstract: Speech disorders such as dysarthria and anarthria can severely impair the patient's ability to communicate verbally. Speech decoding brain-computer interfaces (BCIs) offer a potential alternative by directly translating speech intentions into spoken words, serving as speech neuroprostheses. This paper reports an experimental protocol for Mandarin Chinese speech decoding BCIs, along with the corres…

    Submitted 26 May, 2025; originally announced May 2025.

  16. arXiv:2505.10690  [pdf, ps, other]

    cs.MA cs.GT cs.RO

    Decision Making in Urban Traffic: A Game Theoretic Approach for Autonomous Vehicles Adhering to Traffic Rules

    Authors: Keqi Shu, Minghao Ning, Ahmad Alghooneh, Shen Li, Mohammad Pirani, Amir Khajepour

    Abstract: One of the primary challenges in urban autonomous vehicle decision-making and planning lies in effectively managing intricate interactions with diverse traffic participants characterized by unpredictable movement patterns. Additionally, interpreting and adhering to traffic regulations within rapidly evolving traffic scenarios pose significant hurdles. This paper proposed a rule-based autonomous ve…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: This paper is already accepted on IEEE Transactions on Intelligent Transportation Systems

  17. arXiv:2504.05727  [pdf, other]

    cs.RO

    SAP-CoPE: Social-Aware Planning using Cooperative Pose Estimation with Infrastructure Sensor Nodes

    Authors: Minghao Ning, Yufeng Yang, Shucheng Huang, Jiaming Zhong, Keqi Shu, Chen Sun, Ehsan Hashemi, Amir Khajepour

    Abstract: Autonomous driving systems must operate safely in human-populated indoor environments, where challenges such as limited perception and occlusion sensitivity arise when relying solely on onboard sensors. These factors generate difficulties in the accurate recognition of human intentions and the generation of comfortable, socially aware trajectories. To address these issues, we propose SAP-CoPE, a s…

    Submitted 8 April, 2025; originally announced April 2025.

    Comments: This paper has been submitted to the IEEE Transactions on Industrial Electronics

  18. arXiv:2504.03720  [pdf, other]

    cs.AI cs.LG

    TransNet: Transfer Knowledge for Few-shot Knowledge Graph Completion

    Authors: Lihui Liu, Zihao Wang, Dawei Zhou, Ruijie Wang, Yuchen Yan, Bo Xiong, Sihong He, Kai Shu, Hanghang Tong

    Abstract: Knowledge graphs (KGs) are ubiquitous and widely used in various applications. However, most real-world knowledge graphs are incomplete, which significantly degrades their performance on downstream tasks. Additionally, the relationships in real-world knowledge graphs often follow a long-tail distribution, meaning that most relations are represented by only a few training triplets. To address these…

    Submitted 29 March, 2025; originally announced April 2025.

  19. arXiv:2502.17812  [pdf, other]

    cs.CL cs.LG

    Can Multimodal LLMs Perform Time Series Anomaly Detection?

    Authors: Xiongxiao Xu, Haoran Wang, Yueqing Liang, Philip S. Yu, Yue Zhao, Kai Shu

    Abstract: Large language models (LLMs) have been increasingly used in time series analysis. However, the potential of multimodal LLMs (MLLMs), particularly vision-language models, for time series remains largely under-explored. One natural way for humans to detect time series anomalies is through visualization and textual description. Motivated by this, we raise a critical and practical research question: C…

    Submitted 24 February, 2025; originally announced February 2025.

    Comments: 9 pages for the main content; 32 pages for the full paper including the appendix. More resources on the intersection of multimodal LLMs and time series analysis are on the website https://mllm-ts.github.io

  20. arXiv:2502.14296  [pdf, ps, other]

    cs.CY

    On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

    Authors: Yue Huang, Chujie Gao, Siyuan Wu, Haoran Wang, Xiangqi Wang, Yujun Zhou, Yanbo Wang, Jiayi Ye, Jiawen Shi, Qihui Zhang, Yuan Li, Han Bao, Zhaoyi Liu, Tianrui Guan, Dongping Chen, Ruoxi Chen, Kehan Guo, Andy Zou, Bryan Hooi Kuen-Yew, Caiming Xiong, Elias Stengel-Eskin, Hongyang Zhang, Hongzhi Yin, Huan Zhang, Huaxiu Yao, et al. (41 additional authors not shown)

    Abstract: Generative Foundation Models (GenFMs) have emerged as transformative tools. However, their widespread adoption raises critical concerns regarding trustworthiness across dimensions. This paper presents a comprehensive framework to address these challenges through three key contributions. First, we systematically review global AI governance laws and policies from governments and regulatory bodies, a…

    Submitted 29 September, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

  21. arXiv:2502.14122  [pdf, other]

    cs.CL cs.CY cs.ET

    Benchmarking LLMs for Political Science: A United Nations Perspective

    Authors: Yueqing Liang, Liangwei Yang, Chen Wang, Congying Xia, Rui Meng, Xiongxiao Xu, Haoran Wang, Ali Payani, Kai Shu

    Abstract: Large Language Models (LLMs) have achieved significant advances in natural language processing, yet their potential for high-stake political decision-making remains largely unexplored. This paper addresses the gap by focusing on the application of LLMs to the United Nations (UN) decision-making process, where the stakes are particularly high and political decisions can have far-reaching consequenc…

    Submitted 19 February, 2025; originally announced February 2025.

  22. arXiv:2502.13297  [pdf, other]

    cs.CL cs.AI

    Understanding and Tackling Label Errors in Individual-Level Nature Language Understanding

    Authors: Yunpeng Xiao, Youpeng Zhao, Kai Shu

    Abstract: Natural language understanding (NLU) is a task that enables machines to understand human language. Some tasks, such as stance detection and sentiment analysis, are closely related to individual subjective perspectives, thus termed individual-level NLU. Previously, these tasks are often simplified to text-level NLU tasks, ignoring individual factors. This not only makes inference difficult and unex…

    Submitted 18 February, 2025; originally announced February 2025.

    Comments: 12 pages

  23. arXiv:2412.05672  [pdf, other]

    cs.CL

    Graph with Sequence: Broad-Range Semantic Modeling for Fake News Detection

    Authors: Junwei Yin, Min Gao, Kai Shu, Wentao Li, Yinqiu Huang, Zongwei Wang

    Abstract: The rapid proliferation of fake news on social media threatens social stability, creating an urgent demand for more effective detection methods. While many promising approaches have emerged, most rely on content analysis with limited semantic depth, leading to suboptimal comprehension of news content. To address this limitation, capturing broader-range semantics is essential yet challenging, as it…

    Submitted 6 February, 2025; v1 submitted 7 December, 2024; originally announced December 2024.

  24. arXiv:2412.05206  [pdf, other]

    cs.CL cs.AI cs.IR

    ConQRet: Benchmarking Fine-Grained Evaluation of Retrieval Augmented Argumentation with LLM Judges

    Authors: Kaustubh D. Dhole, Kai Shu, Eugene Agichtein

    Abstract: Computational argumentation, which involves generating answers or summaries for controversial topics like abortion bans and vaccination, has become increasingly important in today's polarized environment. Sophisticated LLM capabilities offer the potential to provide nuanced, evidence-based answers to such questions through Retrieval-Augmented Argumentation (RAArg), leveraging real-world evidence f…

    Submitted 6 December, 2024; originally announced December 2024.

    MSC Class: I.2.7

  25. arXiv:2411.16594  [pdf, ps, other]

    cs.AI cs.CL

    From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge

    Authors: Dawei Li, Bohan Jiang, Liangjie Huang, Alimohammad Beigi, Chengshuai Zhao, Zhen Tan, Amrita Bhattacharjee, Yuxuan Jiang, Canyu Chen, Tianhao Wu, Kai Shu, Lu Cheng, Huan Liu

    Abstract: Assessment and evaluation have long been critical challenges in artificial intelligence (AI) and natural language processing (NLP). Traditional methods, usually matching-based or small model-based, often fall short in open-ended and dynamic scenarios. Recent advancements in Large Language Models (LLMs) inspire the "LLM-as-a-judge" paradigm, where LLMs are leveraged to perform scoring, ranking, or…

    Submitted 29 September, 2025; v1 submitted 25 November, 2024; originally announced November 2024.

    Comments: EMNLP 2025

  26. arXiv:2411.09547  [pdf, other]

    cs.CL cs.AI

    Piecing It All Together: Verifying Multi-Hop Multimodal Claims

    Authors: Haoran Wang, Aman Rangapur, Xiongxiao Xu, Yueqing Liang, Haroon Gharwi, Carl Yang, Kai Shu

    Abstract: Existing claim verification datasets often do not require systems to perform complex reasoning or effectively interpret multimodal evidence. To address this, we introduce a new task: multi-hop multimodal claim verification. This task challenges models to reason over multiple pieces of evidence from diverse sources, including text, images, and tables, and determine whether the combined multimodal e…

    Submitted 12 December, 2024; v1 submitted 14 November, 2024; originally announced November 2024.

    Comments: COLING 2025

  27. arXiv:2411.06469  [pdf, other]

    cs.CL

    ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?

    Authors: Canyu Chen, Jian Yu, Shan Chen, Che Liu, Zhongwei Wan, Danielle Bitterman, Fei Wang, Kai Shu

    Abstract: Large Language Models (LLMs) hold great promise to revolutionize current clinical systems for their superior capacities on medical text processing tasks and medical licensing exams. Meanwhile, traditional ML models such as SVM and XGBoost have still been mainly adopted in clinical prediction tasks. An emerging question is Can LLMs beat traditional ML models in clinical prediction? Thus, we build a…

    Submitted 10 November, 2024; originally announced November 2024.

    Comments: The first two authors contributed equally. 10 pages for main paper, 66 pages including appendix. Project website: https://clinicalbench.github.io

  28. arXiv:2410.16251  [pdf, other]

    cs.CL

    Can Knowledge Editing Really Correct Hallucinations?

    Authors: Baixiang Huang, Canyu Chen, Xiongxiao Xu, Ali Payani, Kai Shu

    Abstract: Large Language Models (LLMs) suffer from hallucinations, referring to the non-factual information in generated content, despite their superior capacities across tasks. Meanwhile, knowledge editing has been developed as a new popular paradigm to correct erroneous factual knowledge encoded in LLMs with the advantage of avoiding retraining from scratch. However, a common issue of existing evaluation…

    Submitted 3 March, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

    Comments: ICLR 2025. Main paper: 10 pages; total: 34 pages (including appendix). The first two authors contributed equally to this work. Code, data, results, and additional resources are available on the project website: https://llm-editing.github.io

  29. arXiv:2410.11855  [pdf, other]

    cs.DC cs.AI cs.AR cs.LG

    Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach

    Authors: Xiongxiao Xu, Solomon Abera Bekele, Brice Videau, Kai Shu

    Abstract: Energy consumption has become a critical design metric and a limiting factor in the development of future computing architectures, from small wearable devices to large-scale leadership computing facilities. The predominant methods in energy management optimization are focused on CPUs. However, GPUs are increasingly significant and account for the majority of energy consumption in heterogeneous hig…

    Submitted 3 October, 2024; originally announced October 2024.

  30. arXiv:2409.03947  [pdf, other]

    cs.CV cs.AI

    FODA-PG for Enhanced Medical Imaging Narrative Generation: Adaptive Differentiation of Normal and Abnormal Attributes

    Authors: Kai Shu, Yuzhuo Jia, Ziyang Zhang, Jiechao Gao

    Abstract: Automatic Medical Imaging Narrative generation aims to alleviate the workload of radiologists by producing accurate clinical descriptions directly from radiological images. However, the subtle visual nuances and domain-specific terminology in medical images pose significant challenges compared to generic image captioning tasks. Existing approaches often neglect the vital distinction between normal…

    Submitted 5 September, 2024; originally announced September 2024.

  31. arXiv:2408.08946  [pdf, other]

    cs.CY

    Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges

    Authors: Baixiang Huang, Canyu Chen, Kai Shu

    Abstract: Accurate attribution of authorship is crucial for maintaining the integrity of digital content, improving forensic investigations, and mitigating the risks of misinformation and plagiarism. Addressing the imperative need for proper authorship attribution is essential to uphold the credibility and accountability of authentic authorship. The rapid advancements of Large Language Models (LLMs) have bl…

    Submitted 9 January, 2025; v1 submitted 16 August, 2024; originally announced August 2024.

    Comments: Accepted to ACM SIGKDD Exploration. 12 pages for the main paper. More resources and a curated list of papers are available and regularly updated at https://llm-authorship.github.io

  32. arXiv:2407.21264  [pdf, other]

    cs.CL

    Model Attribution in LLM-Generated Disinformation: A Domain Generalization Approach with Supervised Contrastive Learning

    Authors: Alimohammad Beigi, Zhen Tan, Nivedh Mudiam, Canyu Chen, Kai Shu, Huan Liu

    Abstract: Model attribution for LLM-generated disinformation poses a significant challenge in understanding its origins and mitigating its spread. This task is especially challenging because modern large language models (LLMs) produce disinformation with human-like quality. Additionally, the diversity in prompting methods used to generate disinformation complicates accurate source attribution. These methods…

    Submitted 14 August, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

    Comments: 10 pages, 2 figures, accepted at DSAA 2024

  33. arXiv:2407.20224  [pdf, other]

    cs.CL

    Can Editing LLMs Inject Harm?

    Authors: Canyu Chen, Baixiang Huang, Zekun Li, Zhaorun Chen, Shiyang Lai, Xiongxiao Xu, Jia-Chen Gu, Jindong Gu, Huaxiu Yao, Chaowei Xiao, Xifeng Yan, William Yang Wang, Philip Torr, Dawn Song, Kai Shu

    Abstract: Knowledge editing has been increasingly adopted to correct the false or outdated knowledge in Large Language Models (LLMs). Meanwhile, one critical but under-explored question is: can knowledge editing be used to inject harm into LLMs? In this paper, we propose to reformulate knowledge editing as a new type of safety threat for LLMs, namely Editing Attack, and conduct a systematic investigation wi…

    Submitted 16 August, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

    Comments: The first two authors contributed equally. 9 pages for main paper, 36 pages including appendix. The code, results, dataset for this paper and more resources are on the project website: https://llm-editing.github.io

  34. arXiv:2406.14043  [pdf, other]

    cs.IR cs.CL

    Taxonomy-Guided Zero-Shot Recommendations with LLMs

    Authors: Yueqing Liang, Liangwei Yang, Chen Wang, Xiongxiao Xu, Philip S. Yu, Kai Shu

    Abstract: With the emergence of large language models (LLMs) and their ability to perform a variety of tasks, their application in recommender systems (RecSys) has shown promise. However, we are facing significant challenges when deploying LLMs into RecSys, such as limited prompt length, unstructured item information, and un-constrained generation of recommendations, leading to sub-optimal performance. To a…

    Submitted 19 February, 2025; v1 submitted 20 June, 2024; originally announced June 2024.

  35. arXiv:2405.18776  [pdf, other]

    cs.CR cs.CL cs.LG

    LMO-DP: Optimizing the Randomization Mechanism for Differentially Private Fine-Tuning (Large) Language Models

    Authors: Qin Yang, Meisam Mohammad, Han Wang, Ali Payani, Ashish Kundu, Kai Shu, Yan Yan, Yuan Hong

    Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) and its variants have been proposed to ensure rigorous privacy for fine-tuning large-scale pre-trained language models. However, they rely heavily on the Gaussian mechanism, which may overly perturb the gradients and degrade the accuracy, especially in stronger privacy regimes (e.g., the privacy budget $ε < 3$). To address such limitations…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 18 pages, 15 figures

  36. arXiv:2404.14757  [pdf, other]

    cs.LG cs.AI

    SST: Multi-Scale Hybrid Mamba-Transformer Experts for Long-Short Range Time Series Forecasting

    Authors: Xiongxiao Xu, Canyu Chen, Yueqing Liang, Baixiang Huang, Guangji Bai, Liang Zhao, Kai Shu

    Abstract: Despite significant progress in time series forecasting, existing forecasters often overlook the heterogeneity between long-range and short-range time series, leading to performance degradation in practical applications. In this work, we highlight the need of distinct objectives tailored to different ranges. We point out that time series can be decomposed into global patterns and local variations,…

    Submitted 22 August, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  37. arXiv:2403.09747  [pdf, other]

    cs.CL cs.AI

    Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors

    Authors: Guanghua Li, Wensheng Lu, Wei Zhang, Defu Lian, Kezhong Lu, Rui Mao, Kai Shu, Hao Liao

    Abstract: The proliferation of fake news has had far-reaching implications on politics, the economy, and society at large. While fake news detection methods have been employed to mitigate this issue, they primarily depend on two essential elements: the quality and relevance of the evidence, and the effectiveness of the verdict prediction mechanism. Traditional methods, which often source information from st…

    Submitted 13 March, 2024; originally announced March 2024.

  38. arXiv:2403.08213  [pdf, other]

    cs.CL

    Can Large Language Models Identify Authorship?

    Authors: Baixiang Huang, Canyu Chen, Kai Shu

    Abstract: The ability to accurately identify authorship is crucial for verifying content authenticity and mitigating misinformation. Large Language Models (LLMs) have demonstrated an exceptional capacity for reasoning and problem-solving. However, their potential in authorship analysis remains under-explored. Traditional studies have depended on hand-crafted stylistic features, whereas state-of-the-art appr…

    Submitted 22 October, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted to EMNLP 2024 Findings. The main paper is 9 pages long, with 16 pages total. The code, results, dataset, and additional resources are available on the project website: https://llm-authorship.github.io/

  39. arXiv:2402.04559  [pdf, other]

    cs.AI cs.CL cs.HC

    Can Large Language Model Agents Simulate Human Trust Behavior?

    Authors: Chengxing Xie, Canyu Chen, Feiran Jia, Ziyu Ye, Shiyang Lai, Kai Shu, Jindong Gu, Adel Bibi, Ziniu Hu, David Jurgens, James Evans, Philip Torr, Bernard Ghanem, Guohao Li

    Abstract: Large Language Model (LLM) agents have been increasingly adopted as simulation tools to model humans in social science and role-playing applications. However, one fundamental question remains: can LLM agents really simulate human behavior? In this paper, we focus on one critical and elemental behavior in human interactions, trust, and investigate whether LLM agents can simulate human trust behavio…

    Submitted 1 November, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted to Proceedings of NeurIPS 2024. The first two authors contributed equally. 10 pages for main paper, 56 pages including appendix. Project website: https://agent-trust.camel-ai.org

  40. arXiv:2401.15496  [pdf, other]

    cs.CL cs.AI cs.LG

    Baichuan2-Sum: Instruction Finetune Baichuan2-7B Model for Dialogue Summarization

    Authors: Jianfei Xiao, Yancan Chen, Yimin Ou, Hanyi Yu, Kai Shu, Yiyong Xiao

    Abstract: Large language models (LLMs) like Llama, Baichuan and Bloom models show remarkable ability with instruction fine-tuning in many natural language tasks. Nevertheless, for the dialogue summarization task, which aims to generate summaries for different roles in dialogue, most of the state-of-the-art methods conduct on small models (e.g., Bart and Bert). Existing methods try to add task specified optimi…

    Submitted 3 April, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

  41. arXiv:2401.09090  [pdf, other]

    cs.CY

    Understanding the concerns and choices of public when using large language models for healthcare

    Authors: Yunpeng Xiao, Kyrie Zhixuan Zhou, Yueqing Liang, Kai Shu

    Abstract: Large language models (LLMs) have shown their potential in biomedical fields. However, how the public uses them for healthcare purposes such as medical Q&A, self-diagnosis, and daily healthcare information seeking is under-investigated. This paper adopts a mixed-methods approach, including surveys (N=214) and interviews (N=17) to investigate how and why the public uses LLMs for healthcare. We fou…

    Submitted 12 September, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: 22 pages

    ACM Class: J.4; K.4.2

  42. arXiv:2401.05561  [pdf, other]

    cs.CL

    TrustLLM: Trustworthiness in Large Language Models

    Authors: Yue Huang, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, et al. (45 additional authors not shown)

    Abstract: Large language models (LLMs), exemplified by ChatGPT, have gained considerable attention for their excellent natural language processing capabilities. Nonetheless, these LLMs present many challenges, particularly in the realm of trustworthiness. Therefore, ensuring the trustworthiness of LLMs emerges as an important topic. This paper introduces TrustLLM, a comprehensive study of trustworthiness in…

    Submitted 30 September, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: This work is still under work and we welcome your contribution

  43. arXiv:2311.11473   

    cs.LG cs.AI

    CSGNN: Conquering Noisy Node labels via Dynamic Class-wise Selection

    Authors: Yifan Li, Zhen Tan, Kai Shu, Zongsheng Cao, Yu Kong, Huan Liu

    Abstract: Graph Neural Networks (GNNs) have emerged as a powerful tool for representation learning on graphs, but they often suffer from overfitting and label noise issues, especially when the data is scarce or imbalanced. Different from the paradigm of previous methods that rely on single-node confidence, in this paper, we introduce a novel Class-wise Selection for Graph Neural Networks, dubbed CSGNN, whic…

    Submitted 14 December, 2023; v1 submitted 19 November, 2023; originally announced November 2023.

    Comments: For the privacy issue

  44. arXiv:2311.09433  [pdf, other]

    cs.CR cs.AI cs.CL

    Trojan Activation Attack: Red-Teaming Large Language Models using Activation Steering for Safety-Alignment

    Authors: Haoran Wang, Kai Shu

    Abstract: To ensure AI safety, instruction-tuned Large Language Models (LLMs) are specifically trained to ensure alignment, which refers to making models behave in accordance with human intentions. While these models have demonstrated commendable results on various safety benchmarks, the vulnerability of their safety alignment has not been extensively studied. This is particularly troubling given the potent…

    Submitted 15 August, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: ACM International Conference on Information and Knowledge Management (CIKM'24)

  45. arXiv:2311.09428  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Beyond Detection: Unveiling Fairness Vulnerabilities in Abusive Language Models

    Authors: Yueqing Liang, Lu Cheng, Ali Payani, Kai Shu

    Abstract: This work investigates the potential of undermining both fairness and detection performance in abusive language detection. In a dynamic and complex digital world, it is crucial to investigate the vulnerabilities of these detection models to adversarial fairness attacks to improve their fairness robustness. We propose a simple yet effective framework FABLE that leverages backdoor attacks as they al…

    Submitted 5 December, 2023; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Under review

  46. arXiv:2311.05656  [pdf, other]

    cs.CY

    Combating Misinformation in the Age of LLMs: Opportunities and Challenges

    Authors: Canyu Chen, Kai Shu

    Abstract: Misinformation such as fake news and rumors is a serious threat to information ecosystems and public trust. The emergence of Large Language Models (LLMs) has great potential to reshape the landscape of combating misinformation. Generally, LLMs can be a double-edged sword in the fight. On the one hand, LLMs bring promising opportunities for combating misinformation due to their profound world knowl…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: 9 pages for the main paper, 35 pages including 656 references, more resources on "LLMs Meet Misinformation" are on the website: https://llm-misinformation.github.io/

  47. arXiv:2310.05253  [pdf, other]

    cs.CL cs.AI cs.LG

    Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models

    Authors: Haoran Wang, Kai Shu

    Abstract: Claim verification plays a crucial role in combating misinformation. While existing works on claim verification have shown promising results, a crucial piece of the puzzle that remains unsolved is to understand how to verify claims without relying on human-annotated data, which is expensive to create at a large scale. Additionally, it is important for models to provide comprehensive explanations t…

    Submitted 19 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023

  48. arXiv:2309.13788  [pdf, other]

    cs.CL cs.AI cs.CR cs.HC cs.LG

    Can LLM-Generated Misinformation Be Detected?

    Authors: Canyu Chen, Kai Shu

    Abstract: The advent of Large Language Models (LLMs) has made a transformative impact. However, the potential that LLMs such as ChatGPT can be exploited to generate misinformation has posed a serious concern to online safety and public trust. A fundamental research question is: will LLM-generated misinformation cause more harm than human-written misinformation? We propose to tackle this question from the pe…

    Submitted 23 April, 2024; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: Accepted to Proceedings of ICLR 2024. 9 pages for main paper, 40 pages including appendix. The code, results, dataset for this paper and more resources on "LLMs Meet Misinformation" have been released on the project website: https://llm-misinformation.github.io/

  49. arXiv:2309.12363  [pdf, other]

    cs.CY cs.AI

    Investigating Online Financial Misinformation and Its Consequences: A Computational Perspective

    Authors: Aman Rangapur, Haoran Wang, Kai Shu

    Abstract: The rapid dissemination of information through digital platforms has revolutionized the way we access and consume news and information, particularly in the realm of finance. However, this digital age has also given rise to an alarming proliferation of financial misinformation, which can have detrimental effects on individuals, markets, and the overall economy. This research paper aims to provide a…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: 32 pages, 2 figures

    ACM Class: A.1; I.m

  50. arXiv:2309.08793  [pdf, other]

    cs.AI cs.CE cs.LG

    Fin-Fact: A Benchmark Dataset for Multimodal Financial Fact Checking and Explanation Generation

    Authors: Aman Rangapur, Haoran Wang, Ling Jian, Kai Shu

    Abstract: Fact-checking in the financial domain is underexplored, and there is a shortage of quality datasets in this domain. In this paper, we propose Fin-Fact, a benchmark dataset for multimodal fact-checking within the financial domain. Notably, it includes professional fact-checker annotations and justifications, providing expertise and credibility. With its multimodal nature encompassing both textual and v…

    Submitted 1 May, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: 8 pages, 4 figures, 4 tables

    ACM Class: I.2; E.m