
Showing 1–50 of 122 results for author: Shu, K

  1. arXiv:2505.23840  [pdf, ps, other]

    cs.CL

    Measuring Sycophancy of Language Models in Multi-turn Dialogues

    Authors: Jiseung Hong, Grace Byun, Seungone Kim, Kai Shu

    Abstract: Large Language Models (LLMs) are expected to provide helpful and harmless responses, yet they often exhibit sycophancy--conforming to user beliefs regardless of factual accuracy or ethical soundness. Prior research on sycophancy has primarily focused on single-turn factual correctness, overlooking the dynamics of real-world interactions. In this work, we introduce SYCON Bench, a novel benchmark fo…

    Submitted 28 May, 2025; originally announced May 2025.

  2. arXiv:2505.21676  [pdf, ps, other]

    cs.RO cs.NI

    Real-World Deployment of Cloud Autonomous Mobility System Using 5G Networks for Outdoor and Indoor Environments

    Authors: Yufeng Yang, Minghao Ning, Keqi Shu, Aladdin Saleh, Ehsan Hashemi, Amir Khajepour

    Abstract: The growing complexity of both outdoor and indoor mobility systems demands scalable, cost-effective, and reliable perception and communication frameworks. This work presents the real-world deployment and evaluation of a Cloud Autonomous Mobility (CAM) system that leverages distributed sensor nodes connected via 5G networks, which integrates LiDAR- and camera-based perception at infrastructure unit…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: This paper has been submitted to IEEE Intelligent Transportation Systems Magazine

  3. arXiv:2505.19652  [pdf, other]

    cs.HC cs.SD eess.AS

    SACM: SEEG-Audio Contrastive Matching for Chinese Speech Decoding

    Authors: Hongbin Wang, Zhihong Jia, Yuanzhong Shen, Ziwei Wang, Siyang Li, Kai Shu, Feng Hu, Dongrui Wu

    Abstract: Speech disorders such as dysarthria and anarthria can severely impair the patient's ability to communicate verbally. Speech decoding brain-computer interfaces (BCIs) offer a potential alternative by directly translating speech intentions into spoken words, serving as speech neuroprostheses. This paper reports an experimental protocol for Mandarin Chinese speech decoding BCIs, along with the corres…

    Submitted 26 May, 2025; originally announced May 2025.

  4. arXiv:2505.10690  [pdf, ps, other]

    cs.MA cs.GT cs.RO

    Decision Making in Urban Traffic: A Game Theoretic Approach for Autonomous Vehicles Adhering to Traffic Rules

    Authors: Keqi Shu, Minghao Ning, Ahmad Alghooneh, Shen Li, Mohammad Pirani, Amir Khajepour

    Abstract: One of the primary challenges in urban autonomous vehicle decision-making and planning lies in effectively managing intricate interactions with diverse traffic participants characterized by unpredictable movement patterns. Additionally, interpreting and adhering to traffic regulations within rapidly evolving traffic scenarios pose significant hurdles. This paper proposed a rule-based autonomous ve…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: This paper has been accepted by IEEE Transactions on Intelligent Transportation Systems

  5. arXiv:2504.05727  [pdf, other]

    cs.RO

    SAP-CoPE: Social-Aware Planning using Cooperative Pose Estimation with Infrastructure Sensor Nodes

    Authors: Minghao Ning, Yufeng Yang, Shucheng Huang, Jiaming Zhong, Keqi Shu, Chen Sun, Ehsan Hashemi, Amir Khajepour

    Abstract: Autonomous driving systems must operate safely in human-populated indoor environments, where challenges such as limited perception and occlusion sensitivity arise when relying solely on onboard sensors. These factors generate difficulties in the accurate recognition of human intentions and the generation of comfortable, socially aware trajectories. To address these issues, we propose SAP-CoPE, a s…

    Submitted 8 April, 2025; originally announced April 2025.

    Comments: This paper has been submitted to the IEEE Transactions on Industrial Electronics

  6. arXiv:2504.03720  [pdf, other]

    cs.AI cs.LG

    TransNet: Transfer Knowledge for Few-shot Knowledge Graph Completion

    Authors: Lihui Liu, Zihao Wang, Dawei Zhou, Ruijie Wang, Yuchen Yan, Bo Xiong, Sihong He, Kai Shu, Hanghang Tong

    Abstract: Knowledge graphs (KGs) are ubiquitous and widely used in various applications. However, most real-world knowledge graphs are incomplete, which significantly degrades their performance on downstream tasks. Additionally, the relationships in real-world knowledge graphs often follow a long-tail distribution, meaning that most relations are represented by only a few training triplets. To address these…

    Submitted 29 March, 2025; originally announced April 2025.

  7. arXiv:2502.17812  [pdf, other]

    cs.CL cs.LG

    Can Multimodal LLMs Perform Time Series Anomaly Detection?

    Authors: Xiongxiao Xu, Haoran Wang, Yueqing Liang, Philip S. Yu, Yue Zhao, Kai Shu

    Abstract: Large language models (LLMs) have been increasingly used in time series analysis. However, the potential of multimodal LLMs (MLLMs), particularly vision-language models, for time series remains largely under-explored. One natural way for humans to detect time series anomalies is through visualization and textual description. Motivated by this, we raise a critical and practical research question: C…

    Submitted 24 February, 2025; originally announced February 2025.

    Comments: 9 pages for the main content; 32 pages for the full paper including the appendix. More resources on the intersection of multimodal LLMs and time series analysis are on the website https://mllm-ts.github.io

  8. arXiv:2502.14296  [pdf, other]

    cs.CY

    On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

    Authors: Yue Huang, Chujie Gao, Siyuan Wu, Haoran Wang, Xiangqi Wang, Yujun Zhou, Yanbo Wang, Jiayi Ye, Jiawen Shi, Qihui Zhang, Yuan Li, Han Bao, Zhaoyi Liu, Tianrui Guan, Dongping Chen, Ruoxi Chen, Kehan Guo, Andy Zou, Bryan Hooi Kuen-Yew, Caiming Xiong, Elias Stengel-Eskin, Hongyang Zhang, Hongzhi Yin, Huan Zhang, Huaxiu Yao, et al. (41 additional authors not shown)

    Abstract: Generative Foundation Models (GenFMs) have emerged as transformative tools. However, their widespread adoption raises critical concerns regarding trustworthiness across dimensions. This paper presents a comprehensive framework to address these challenges through three key contributions. First, we systematically review global AI governance laws and policies from governments and regulatory bodies, a…

    Submitted 11 May, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

  9. arXiv:2502.14122  [pdf, other]

    cs.CL cs.CY cs.ET

    Benchmarking LLMs for Political Science: A United Nations Perspective

    Authors: Yueqing Liang, Liangwei Yang, Chen Wang, Congying Xia, Rui Meng, Xiongxiao Xu, Haoran Wang, Ali Payani, Kai Shu

    Abstract: Large Language Models (LLMs) have achieved significant advances in natural language processing, yet their potential for high-stakes political decision-making remains largely unexplored. This paper addresses the gap by focusing on the application of LLMs to the United Nations (UN) decision-making process, where the stakes are particularly high and political decisions can have far-reaching consequenc…

    Submitted 19 February, 2025; originally announced February 2025.

  10. arXiv:2502.13297  [pdf, other]

    cs.CL cs.AI

    Understanding and Tackling Label Errors in Individual-Level Nature Language Understanding

    Authors: Yunpeng Xiao, Youpeng Zhao, Kai Shu

    Abstract: Natural language understanding (NLU) is a task that enables machines to understand human language. Some tasks, such as stance detection and sentiment analysis, are closely related to individual subjective perspectives, thus termed individual-level NLU. Previously, these tasks are often simplified to text-level NLU tasks, ignoring individual factors. This not only makes inference difficult and unex…

    Submitted 18 February, 2025; originally announced February 2025.

    Comments: 12 pages

  11. arXiv:2412.06731  [pdf, other]

    math.OC

    Beyond Minimax Optimality: A Subgame Perfect Gradient Method

    Authors: Benjamin Grimmer, Kevin Shu, Alex L. Wang

    Abstract: The study of unconstrained convex optimization has historically been concerned with worst-case a priori convergence rates. The development of the Optimized Gradient Method (OGM), due to Drori and Teboulle, Kim and Fessler, marked a major milestone in this study, as OGM achieves the optimal worst-case convergence rate among all gradient-span first-order methods. However, this notion of worst-case o…

    Submitted 27 January, 2025; v1 submitted 9 December, 2024; originally announced December 2024.

  12. arXiv:2412.05672  [pdf, other]

    cs.CL

    Graph with Sequence: Broad-Range Semantic Modeling for Fake News Detection

    Authors: Junwei Yin, Min Gao, Kai Shu, Wentao Li, Yinqiu Huang, Zongwei Wang

    Abstract: The rapid proliferation of fake news on social media threatens social stability, creating an urgent demand for more effective detection methods. While many promising approaches have emerged, most rely on content analysis with limited semantic depth, leading to suboptimal comprehension of news content. To address this limitation, capturing broader-range semantics is essential yet challenging, as it…

    Submitted 6 February, 2025; v1 submitted 7 December, 2024; originally announced December 2024.

  13. arXiv:2412.05206  [pdf, other]

    cs.CL cs.AI cs.IR

    ConQRet: Benchmarking Fine-Grained Evaluation of Retrieval Augmented Argumentation with LLM Judges

    Authors: Kaustubh D. Dhole, Kai Shu, Eugene Agichtein

    Abstract: Computational argumentation, which involves generating answers or summaries for controversial topics like abortion bans and vaccination, has become increasingly important in today's polarized environment. Sophisticated LLM capabilities offer the potential to provide nuanced, evidence-based answers to such questions through Retrieval-Augmented Argumentation (RAArg), leveraging real-world evidence f…

    Submitted 6 December, 2024; originally announced December 2024.

    MSC Class: I.2.7

  14. arXiv:2411.16594  [pdf, other]

    cs.AI cs.CL

    From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge

    Authors: Dawei Li, Bohan Jiang, Liangjie Huang, Alimohammad Beigi, Chengshuai Zhao, Zhen Tan, Amrita Bhattacharjee, Yuxuan Jiang, Canyu Chen, Tianhao Wu, Kai Shu, Lu Cheng, Huan Liu

    Abstract: Assessment and evaluation have long been critical challenges in artificial intelligence (AI) and natural language processing (NLP). However, traditional methods, whether matching-based or embedding-based, often fall short of judging subtle attributes and delivering satisfactory results. Recent advancements in Large Language Models (LLMs) inspire the "LLM-as-a-judge" paradigm, where LLMs are levera…

    Submitted 5 February, 2025; v1 submitted 25 November, 2024; originally announced November 2024.

    Comments: v6: add new citations; 36 pages, 5 figures

  15. arXiv:2411.09547  [pdf, other]

    cs.CL cs.AI

    Piecing It All Together: Verifying Multi-Hop Multimodal Claims

    Authors: Haoran Wang, Aman Rangapur, Xiongxiao Xu, Yueqing Liang, Haroon Gharwi, Carl Yang, Kai Shu

    Abstract: Existing claim verification datasets often do not require systems to perform complex reasoning or effectively interpret multimodal evidence. To address this, we introduce a new task: multi-hop multimodal claim verification. This task challenges models to reason over multiple pieces of evidence from diverse sources, including text, images, and tables, and determine whether the combined multimodal e…

    Submitted 12 December, 2024; v1 submitted 14 November, 2024; originally announced November 2024.

    Comments: COLING 2025

  16. arXiv:2411.06469  [pdf, other]

    cs.CL

    ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?

    Authors: Canyu Chen, Jian Yu, Shan Chen, Che Liu, Zhongwei Wan, Danielle Bitterman, Fei Wang, Kai Shu

    Abstract: Large Language Models (LLMs) hold great promise to revolutionize current clinical systems for their superior capacities on medical text processing tasks and medical licensing exams. Meanwhile, traditional ML models such as SVM and XGBoost have still been mainly adopted in clinical prediction tasks. An emerging question is: Can LLMs beat traditional ML models in clinical prediction? Thus, we build a…

    Submitted 10 November, 2024; originally announced November 2024.

    Comments: The first two authors contributed equally. 10 pages for main paper, 66 pages including appendix. Project website: https://clinicalbench.github.io

  17. arXiv:2410.16251  [pdf, other]

    cs.CL

    Can Knowledge Editing Really Correct Hallucinations?

    Authors: Baixiang Huang, Canyu Chen, Xiongxiao Xu, Ali Payani, Kai Shu

    Abstract: Large Language Models (LLMs) suffer from hallucinations, referring to the non-factual information in generated content, despite their superior capacities across tasks. Meanwhile, knowledge editing has been developed as a new popular paradigm to correct erroneous factual knowledge encoded in LLMs with the advantage of avoiding retraining from scratch. However, a common issue of existing evaluation…

    Submitted 3 March, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

    Comments: ICLR 2025. Main paper: 10 pages; total: 34 pages (including appendix). The first two authors contributed equally to this work. Code, data, results, and additional resources are available on the project website: https://llm-editing.github.io

  18. arXiv:2410.16249  [pdf, other]

    math.OC

    Composing Optimized Stepsize Schedules for Gradient Descent

    Authors: Benjamin Grimmer, Kevin Shu, Alex L. Wang

    Abstract: Recent works by Altschuler and Parrilo and the authors have shown that it is possible to accelerate the convergence of gradient descent on smooth convex functions, even without momentum, just by picking special stepsizes. In this paper, we provide a general theory for composing stepsize schedules capturing all recent advances in this area and more. We propose three notions of "composable" stepsi…

    Submitted 21 October, 2024; originally announced October 2024.

  19. arXiv:2410.11855  [pdf, other]

    cs.DC cs.AI cs.AR cs.LG

    Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach

    Authors: Xiongxiao Xu, Solomon Abera Bekele, Brice Videau, Kai Shu

    Abstract: Energy consumption has become a critical design metric and a limiting factor in the development of future computing architectures, from small wearable devices to large-scale leadership computing facilities. The predominant methods in energy management optimization are focused on CPUs. However, GPUs are increasingly significant and account for the majority of energy consumption in heterogeneous hig…

    Submitted 3 October, 2024; originally announced October 2024.

  20. arXiv:2409.03947  [pdf, other]

    cs.CV cs.AI

    FODA-PG for Enhanced Medical Imaging Narrative Generation: Adaptive Differentiation of Normal and Abnormal Attributes

    Authors: Kai Shu, Yuzhuo Jia, Ziyang Zhang, Jiechao Gao

    Abstract: Automatic Medical Imaging Narrative generation aims to alleviate the workload of radiologists by producing accurate clinical descriptions directly from radiological images. However, the subtle visual nuances and domain-specific terminology in medical images pose significant challenges compared to generic image captioning tasks. Existing approaches often neglect the vital distinction between normal…

    Submitted 5 September, 2024; originally announced September 2024.

  21. arXiv:2408.08946  [pdf, other]

    cs.CY

    Authorship Attribution in the Era of LLMs: Problems, Methodologies, and Challenges

    Authors: Baixiang Huang, Canyu Chen, Kai Shu

    Abstract: Accurate attribution of authorship is crucial for maintaining the integrity of digital content, improving forensic investigations, and mitigating the risks of misinformation and plagiarism. Addressing the imperative need for proper authorship attribution is essential to uphold the credibility and accountability of authentic authorship. The rapid advancements of Large Language Models (LLMs) have bl…

    Submitted 9 January, 2025; v1 submitted 16 August, 2024; originally announced August 2024.

    Comments: Accepted to ACM SIGKDD Exploration. 12 pages for the main paper. More resources and a curated list of papers are available and regularly updated at https://llm-authorship.github.io

  22. arXiv:2407.21264  [pdf, other]

    cs.CL

    Model Attribution in LLM-Generated Disinformation: A Domain Generalization Approach with Supervised Contrastive Learning

    Authors: Alimohammad Beigi, Zhen Tan, Nivedh Mudiam, Canyu Chen, Kai Shu, Huan Liu

    Abstract: Model attribution for LLM-generated disinformation poses a significant challenge in understanding its origins and mitigating its spread. This task is especially challenging because modern large language models (LLMs) produce disinformation with human-like quality. Additionally, the diversity in prompting methods used to generate disinformation complicates accurate source attribution. These methods…

    Submitted 14 August, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

    Comments: 10 pages, 2 figures, accepted at DSAA 2024

  23. arXiv:2407.20224  [pdf, other]

    cs.CL

    Can Editing LLMs Inject Harm?

    Authors: Canyu Chen, Baixiang Huang, Zekun Li, Zhaorun Chen, Shiyang Lai, Xiongxiao Xu, Jia-Chen Gu, Jindong Gu, Huaxiu Yao, Chaowei Xiao, Xifeng Yan, William Yang Wang, Philip Torr, Dawn Song, Kai Shu

    Abstract: Knowledge editing has been increasingly adopted to correct the false or outdated knowledge in Large Language Models (LLMs). Meanwhile, one critical but under-explored question is: can knowledge editing be used to inject harm into LLMs? In this paper, we propose to reformulate knowledge editing as a new type of safety threat for LLMs, namely Editing Attack, and conduct a systematic investigation wi…

    Submitted 16 August, 2024; v1 submitted 29 July, 2024; originally announced July 2024.

    Comments: The first two authors contributed equally. 9 pages for main paper, 36 pages including appendix. The code, results, dataset for this paper and more resources are on the project website: https://llm-editing.github.io

  24. arXiv:2407.11739  [pdf, other]

    math.OC

    A Strengthened Conjecture on the Minimax Optimal Constant Stepsize for Gradient Descent

    Authors: Benjamin Grimmer, Kevin Shu, Alex L. Wang

    Abstract: Drori and Teboulle [4] conjectured that the minimax optimal constant stepsize for N steps of gradient descent is given by the stepsize that balances performance on Huber and quadratic objective functions. This was numerically supported by semidefinite program (SDP) solves of the associated performance estimation problems up to $N\approx 100$. This note presents a strengthened version of the initia…

    Submitted 16 July, 2024; originally announced July 2024.

  25. arXiv:2406.14043  [pdf, other]

    cs.IR cs.CL

    Taxonomy-Guided Zero-Shot Recommendations with LLMs

    Authors: Yueqing Liang, Liangwei Yang, Chen Wang, Xiongxiao Xu, Philip S. Yu, Kai Shu

    Abstract: With the emergence of large language models (LLMs) and their ability to perform a variety of tasks, their application in recommender systems (RecSys) has shown promise. However, we are facing significant challenges when deploying LLMs into RecSys, such as limited prompt length, unstructured item information, and un-constrained generation of recommendations, leading to sub-optimal performance. To a…

    Submitted 19 February, 2025; v1 submitted 20 June, 2024; originally announced June 2024.

  26. arXiv:2405.18776  [pdf, other]

    cs.CR cs.CL cs.LG

    LMO-DP: Optimizing the Randomization Mechanism for Differentially Private Fine-Tuning (Large) Language Models

    Authors: Qin Yang, Meisam Mohammad, Han Wang, Ali Payani, Ashish Kundu, Kai Shu, Yan Yan, Yuan Hong

    Abstract: Differentially Private Stochastic Gradient Descent (DP-SGD) and its variants have been proposed to ensure rigorous privacy for fine-tuning large-scale pre-trained language models. However, they rely heavily on the Gaussian mechanism, which may overly perturb the gradients and degrade the accuracy, especially in stronger privacy regimes (e.g., the privacy budget $ε < 3$). To address such limitations…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 18 pages, 15 figures

  27. arXiv:2404.14757  [pdf, other]

    cs.LG cs.AI

    SST: Multi-Scale Hybrid Mamba-Transformer Experts for Long-Short Range Time Series Forecasting

    Authors: Xiongxiao Xu, Canyu Chen, Yueqing Liang, Baixiang Huang, Guangji Bai, Liang Zhao, Kai Shu

    Abstract: Despite significant progress in time series forecasting, existing forecasters often overlook the heterogeneity between long-range and short-range time series, leading to performance degradation in practical applications. In this work, we highlight the need for distinct objectives tailored to different ranges. We point out that time series can be decomposed into global patterns and local variations,…

    Submitted 22 August, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  28. arXiv:2403.14045  [pdf, ps, other]

    math.OC

    Accelerated Objective Gap and Gradient Norm Convergence for Gradient Descent via Long Steps

    Authors: Benjamin Grimmer, Kevin Shu, Alex L. Wang

    Abstract: This work considers gradient descent for L-smooth convex optimization with stepsizes larger than the classic regime where descent can be ensured. The stepsize schedules considered are similar to but differ slightly from the recent silver stepsizes of Altschuler and Parrilo. For one of our stepsize sequences, we prove an $O\left(N^{-1.2716\dots}\right)$ convergence rate in terms of objective gap de…

    Submitted 12 April, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

  29. arXiv:2403.09747  [pdf, other]

    cs.CL cs.AI

    Re-Search for The Truth: Multi-round Retrieval-augmented Large Language Models are Strong Fake News Detectors

    Authors: Guanghua Li, Wensheng Lu, Wei Zhang, Defu Lian, Kezhong Lu, Rui Mao, Kai Shu, Hao Liao

    Abstract: The proliferation of fake news has had far-reaching implications on politics, the economy, and society at large. While fake news detection methods have been employed to mitigate this issue, they primarily depend on two essential elements: the quality and relevance of the evidence, and the effectiveness of the verdict prediction mechanism. Traditional methods, which often source information from st…

    Submitted 13 March, 2024; originally announced March 2024.

  30. arXiv:2403.08213  [pdf, other]

    cs.CL

    Can Large Language Models Identify Authorship?

    Authors: Baixiang Huang, Canyu Chen, Kai Shu

    Abstract: The ability to accurately identify authorship is crucial for verifying content authenticity and mitigating misinformation. Large Language Models (LLMs) have demonstrated an exceptional capacity for reasoning and problem-solving. However, their potential in authorship analysis remains under-explored. Traditional studies have depended on hand-crafted stylistic features, whereas state-of-the-art appr…

    Submitted 22 October, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted to EMNLP 2024 Findings. The main paper is 9 pages long, with 16 pages total. The code, results, dataset, and additional resources are available on the project website: https://llm-authorship.github.io/

  31. arXiv:2402.04559  [pdf, other]

    cs.AI cs.CL cs.HC

    Can Large Language Model Agents Simulate Human Trust Behavior?

    Authors: Chengxing Xie, Canyu Chen, Feiran Jia, Ziyu Ye, Shiyang Lai, Kai Shu, Jindong Gu, Adel Bibi, Ziniu Hu, David Jurgens, James Evans, Philip Torr, Bernard Ghanem, Guohao Li

    Abstract: Large Language Model (LLM) agents have been increasingly adopted as simulation tools to model humans in social science and role-playing applications. However, one fundamental question remains: can LLM agents really simulate human behavior? In this paper, we focus on one critical and elemental behavior in human interactions, trust, and investigate whether LLM agents can simulate human trust behavio…

    Submitted 1 November, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Accepted to Proceedings of NeurIPS 2024. The first two authors contributed equally. 10 pages for main paper, 56 pages including appendix. Project website: https://agent-trust.camel-ai.org

  32. arXiv:2401.15496  [pdf, other]

    cs.CL cs.AI cs.LG

    Baichuan2-Sum: Instruction Finetune Baichuan2-7B Model for Dialogue Summarization

    Authors: Jianfei Xiao, Yancan Chen, Yimin Ou, Hanyi Yu, Kai Shu, Yiyong Xiao

    Abstract: Large language models (LLMs) like Llama, Baichuan and Bloom models show remarkable ability with instruction fine-tuning in many natural language tasks. Nevertheless, for the dialogue summarization task, which aims to generate summaries for different roles in dialogue, most of the state-of-the-art methods are conducted on small models (e.g., BART and BERT). Existing methods try to add task specified optimi…

    Submitted 3 April, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

  33. arXiv:2401.09090  [pdf, other]

    cs.CY

    Understanding the concerns and choices of public when using large language models for healthcare

    Authors: Yunpeng Xiao, Kyrie Zhixuan Zhou, Yueqing Liang, Kai Shu

    Abstract: Large language models (LLMs) have shown their potential in biomedical fields. However, how the public uses them for healthcare purposes such as medical Q&A, self-diagnosis, and daily healthcare information seeking is under-investigated. This paper adopts a mixed-methods approach, including surveys (N=214) and interviews (N=17) to investigate how and why the public uses LLMs for healthcare. We fou…

    Submitted 12 September, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

    Comments: 22 pages

    ACM Class: J.4; K.4.2

  34. arXiv:2401.05561  [pdf, other]

    cs.CL

    TrustLLM: Trustworthiness in Large Language Models

    Authors: Yue Huang, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Yuan Li, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bertie Vidgen, Bhavya Kailkhura, Caiming Xiong, Chaowei Xiao, Chunyuan Li, Eric Xing, Furong Huang, Hao Liu, Heng Ji, Hongyi Wang, et al. (45 additional authors not shown)

    Abstract: Large language models (LLMs), exemplified by ChatGPT, have gained considerable attention for their excellent natural language processing capabilities. Nonetheless, these LLMs present many challenges, particularly in the realm of trustworthiness. Therefore, ensuring the trustworthiness of LLMs emerges as an important topic. This paper introduces TrustLLM, a comprehensive study of trustworthiness in…

    Submitted 30 September, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

    Comments: This work is still under work and we welcome your contribution

  35. arXiv:2311.11473   

    cs.LG cs.AI

    CSGNN: Conquering Noisy Node labels via Dynamic Class-wise Selection

    Authors: Yifan Li, Zhen Tan, Kai Shu, Zongsheng Cao, Yu Kong, Huan Liu

    Abstract: Graph Neural Networks (GNNs) have emerged as a powerful tool for representation learning on graphs, but they often suffer from overfitting and label noise issues, especially when the data is scarce or imbalanced. Different from the paradigm of previous methods that rely on single-node confidence, in this paper, we introduce a novel Class-wise Selection for Graph Neural Networks, dubbed CSGNN, whic…

    Submitted 14 December, 2023; v1 submitted 19 November, 2023; originally announced November 2023.

    Comments: For the privacy issue

  36. arXiv:2311.09433  [pdf, other]

    cs.CR cs.AI cs.CL

    Trojan Activation Attack: Red-Teaming Large Language Models using Activation Steering for Safety-Alignment

    Authors: Haoran Wang, Kai Shu

    Abstract: To ensure AI safety, instruction-tuned Large Language Models (LLMs) are specifically trained to ensure alignment, which refers to making models behave in accordance with human intentions. While these models have demonstrated commendable results on various safety benchmarks, the vulnerability of their safety alignment has not been extensively studied. This is particularly troubling given the potent…

    Submitted 15 August, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: ACM International Conference on Information and Knowledge Management (CIKM'24)

  37. arXiv:2311.09428  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Beyond Detection: Unveiling Fairness Vulnerabilities in Abusive Language Models

    Authors: Yueqing Liang, Lu Cheng, Ali Payani, Kai Shu

    Abstract: This work investigates the potential of undermining both fairness and detection performance in abusive language detection. In a dynamic and complex digital world, it is crucial to investigate the vulnerabilities of these detection models to adversarial fairness attacks to improve their fairness robustness. We propose a simple yet effective framework FABLE that leverages backdoor attacks as they al…

    Submitted 5 December, 2023; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Under review

  38. arXiv:2311.05656  [pdf, other]

    cs.CY

    Combating Misinformation in the Age of LLMs: Opportunities and Challenges

    Authors: Canyu Chen, Kai Shu

    Abstract: Misinformation such as fake news and rumors is a serious threat to information ecosystems and public trust. The emergence of Large Language Models (LLMs) has great potential to reshape the landscape of combating misinformation. Generally, LLMs can be a double-edged sword in the fight. On the one hand, LLMs bring promising opportunities for combating misinformation due to their profound world knowl…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: 9 pages for the main paper, 35 pages including 656 references, more resources on "LLMs Meet Misinformation" are on the website: https://llm-misinformation.github.io/

  39. arXiv:2310.08761  [pdf]

    physics.atom-ph hep-ex

    Laser cooling of positronium

    Authors: K. Shu, Y. Tajima, R. Uozumi, N. Miyamoto, S. Shiraishi, T. Kobayashi, A. Ishida, K. Yamada, R. W. Gladen, T. Namba, S. Asai, K. Wada, I. Mochizuki, T. Hyodo, K. Ito, K. Michishio, B. E. O'Rourke, N. Oshima, K. Yoshioka

    Abstract: When laser radiation is skilfully applied, atoms and molecules can be cooled allowing precise measurements and control of quantum systems. This is essential in fundamental studies of physics as well as practical applications such as precision spectroscopy, quantum-statistical-property manifesting ultracold gases, and quantum computing. In laser cooling, repeated cycles of laser photon absorption a…

    Submitted 15 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

    Journal ref: Nature 633, 793 (2024)

  40. arXiv:2310.05253  [pdf, other]

    cs.CL cs.AI cs.LG

    Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models

    Authors: Haoran Wang, Kai Shu

    Abstract: Claim verification plays a crucial role in combating misinformation. While existing works on claim verification have shown promising results, a crucial piece of the puzzle that remains unsolved is to understand how to verify claims without relying on human-annotated data, which is expensive to create at a large scale. Additionally, it is important for models to provide comprehensive explanations t…

    Submitted 19 October, 2023; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023

  41. arXiv:2309.13788  [pdf, other]

    cs.CL cs.AI cs.CR cs.HC cs.LG

    Can LLM-Generated Misinformation Be Detected?

    Authors: Canyu Chen, Kai Shu

    Abstract: The advent of Large Language Models (LLMs) has made a transformative impact. However, the potential that LLMs such as ChatGPT can be exploited to generate misinformation has posed a serious concern to online safety and public trust. A fundamental research question is: will LLM-generated misinformation cause more harm than human-written misinformation? We propose to tackle this question from the pe…

    Submitted 23 April, 2024; v1 submitted 24 September, 2023; originally announced September 2023.

    Comments: Accepted to Proceedings of ICLR 2024. 9 pages for main paper, 40 pages including appendix. The code, results, dataset for this paper and more resources on "LLMs Meet Misinformation" have been released on the project website: https://llm-misinformation.github.io/

  42. arXiv:2309.12363  [pdf, other]

    cs.CY cs.AI

    Investigating Online Financial Misinformation and Its Consequences: A Computational Perspective

    Authors: Aman Rangapur, Haoran Wang, Kai Shu

    Abstract: The rapid dissemination of information through digital platforms has revolutionized the way we access and consume news and information, particularly in the realm of finance. However, this digital age has also given rise to an alarming proliferation of financial misinformation, which can have detrimental effects on individuals, markets, and the overall economy. This research paper aims to provide a…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: 32 pages, 2 figures

    ACM Class: A.1; I.m

  43. arXiv:2309.09961  [pdf, other]

    math.OC

    Accelerated Gradient Descent via Long Steps

    Authors: Benjamin Grimmer, Kevin Shu, Alex L. Wang

    Abstract: Recently, Grimmer [1] showed that, for smooth convex optimization, utilizing longer steps periodically improves gradient descent's textbook $LD^2/2T$ convergence guarantees by constant factors, and conjectured that an accelerated rate strictly faster than $O(1/T)$ could be possible. Here we prove such a big-O gain, establishing gradient descent's first accelerated convergence rate in this setting. Name…

    Submitted 26 September, 2023; v1 submitted 18 September, 2023; originally announced September 2023.

  44. arXiv:2309.08793  [pdf, other]

    cs.AI cs.CE cs.LG

    Fin-Fact: A Benchmark Dataset for Multimodal Financial Fact Checking and Explanation Generation

    Authors: Aman Rangapur, Haoran Wang, Ling Jian, Kai Shu

    Abstract: Fact-checking in the financial domain is underexplored, and there is a shortage of quality datasets in this domain. In this paper, we propose Fin-Fact, a benchmark dataset for multimodal fact-checking within the financial domain. Notably, it includes professional fact-checker annotations and justifications, providing expertise and credibility. With its multimodal nature encompassing both textual and v…

    Submitted 1 May, 2024; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: 8 pages, 4 figures, 4 tables

    ACM Class: I.2; E.m

  45. arXiv:2308.09653  [pdf, ps, other]

    math.AG math.OC

    Symmetric Hyperbolic Polynomials

    Authors: Grigoriy Blekherman, Julia Lindberg, Kevin Shu

    Abstract: Hyperbolic polynomials have been of recent interest due to applications in a wide variety of fields. We seek to better understand these polynomials in the case when they are symmetric, i.e. invariant under all permutations of variables. We give a complete characterization of the set of symmetric hyperbolic polynomials of degree 3, and a large class of symmetric hyperbolic polynomials of degree 4.…

    Submitted 18 August, 2023; originally announced August 2023.

    Comments: 15 pages

    MSC Class: 14P10; 15A04; 90C22

  46. arXiv:2308.00877  [pdf, other]

    physics.optics hep-ex physics.atom-ph

    Development of an optimal laser for chirp cooling of positronium based on chirped pulse-train generator

    Authors: Kenji Shu, Naoki Miyamoto, Yuto Motohashi, Ryosuke Uozumi, Yohei Tajima, Kosuke Yoshioka

    Abstract: We report the development and characterization of a pulsed 243 nm laser that is optimal for the cooling of positronium (Ps). The laser, which is based on the recent chirped pulse-train generator (CPTG) demonstrated by K. Yamada et al. (Phys. Rev. Appl. 16, 014009 (2021)), was designed to output a train of pulses with linewidths of 10 GHz, and with the center frequency of each pulse shifting upward…

    Submitted 1 August, 2023; originally announced August 2023.

    Comments: 11 pages, 11 figures

    Journal ref: Phys. Rev. A 109, 043520 (2024)

  47. arXiv:2306.15231  [pdf, other]

    cs.CL

    Emulating Reader Behaviors for Fake News Detection

    Authors: Junwei Yin, Min Gao, Kai Shu, Zehua Zhao, Yinqiu Huang, Jia Wang

    Abstract: The wide dissemination of fake news has affected our lives in many aspects, making fake news detection important and attracting increasing attention. Existing approaches make substantial contributions in this field by modeling news from a single-modal or multi-modal perspective. However, these modal-based methods can result in sub-optimal outcomes as they ignore reader behaviors in news consumptio…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: 12 pages

  48. MUSER: A MUlti-Step Evidence Retrieval Enhancement Framework for Fake News Detection

    Authors: Hao Liao, Jiaohao Peng, Zhanyi Huang, Wei Zhang, Guanghua Li, Kai Shu, Xing Xie

    Abstract: The ease of spreading false information online enables individuals with malicious intent to manipulate public opinion and undermine social stability. Recently, fake news detection based on evidence retrieval has gained popularity in an effort to identify fake news reliably and reduce its impact. Evidence retrieval-based methods can improve the reliability of fake news detection by computing the…

    Submitted 23 June, 2023; originally announced June 2023.

    Comments: 12 pages, 5 figures, accepted by KDD '23, ADS track

    Journal ref: KDD 2023

  49. Data Augmentation for Seizure Prediction with Generative Diffusion Model

    Authors: Kai Shu, Le Wu, Yuchang Zhao, Aiping Liu, Ruobing Qian, Xun Chen

    Abstract: Data augmentation (DA) can significantly strengthen electroencephalogram (EEG)-based seizure prediction methods. However, existing DA approaches are just linear transformations of the original data and cannot explore the feature space to increase diversity effectively. Therefore, we propose a novel diffusion-based DA method called DiffEEG. DiffEEG can fully explore data distribution and genera…

    Submitted 9 December, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 15 pages, 9 figures

  50. arXiv:2305.19552  [pdf, other]

    cs.CY

    Investigating Gender Euphoria and Dysphoria on TikTok: Characterization and Comparison

    Authors: SJ Dillon, Yueqing Liang, H. Russell Bernard, Kai Shu

    Abstract: With the emergence of short video-sharing platforms, engagement with social media sites devoted to opinion and knowledge dissemination has rapidly increased. Among these platforms, TikTok is one of the most popular globally and has become the platform of choice for transgender and nonbinary individuals, who have formed a large community to mobilize personal experience and exchange information. The…

    Submitted 11 March, 2025; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: ASONAM 2024