Showing 1–8 of 8 results for author: Beigi, M

  1. arXiv:2502.17516  [pdf, other]

    cs.LG cs.AI

    A Survey on Mechanistic Interpretability for Multi-Modal Foundation Models

    Authors: Zihao Lin, Samyadeep Basu, Mohammad Beigi, Varun Manjunatha, Ryan A. Rossi, Zichao Wang, Yufan Zhou, Sriram Balasubramanian, Arman Zarei, Keivan Rezaei, Ying Shen, Barry Menglong Yao, Zhiyang Xu, Qin Liu, Yuxiang Zhang, Yan Sun, Shilong Liu, Li Shen, Hongxuan Li, Soheil Feizi, Lifu Huang

    Abstract: The rise of foundation models has transformed machine learning research, prompting efforts to uncover their inner workings and develop more efficient and reliable applications for better control. While significant progress has been made in interpreting Large Language Models (LLMs), multimodal foundation models (MMFMs) - such as contrastive vision-language models, generative vision-language models,…

    Submitted 22 February, 2025; originally announced February 2025.

    Comments: 30 pages, 4 Figures, 10 Tables

  2. arXiv:2411.07317  [pdf, other]

    cs.LG

    SynRL: Aligning Synthetic Clinical Trial Data with Human-preferred Clinical Endpoints Using Reinforcement Learning

    Authors: Trisha Das, Zifeng Wang, Afrah Shafquat, Mandis Beigi, Jason Mezey, Jacob Aptekar, Jimeng Sun

    Abstract: Each year, hundreds of clinical trials are conducted to evaluate new medical interventions, but sharing patient records from these trials with other institutions can be challenging due to privacy concerns and federal regulations. To help mitigate privacy concerns, researchers have proposed methods for generating synthetic patient data. However, existing approaches for generating synthetic clinical…

    Submitted 17 February, 2025; v1 submitted 11 November, 2024; originally announced November 2024.

  3. arXiv:2410.20199  [pdf, other]

    cs.AI

    Rethinking the Uncertainty: A Critical Review and Analysis in the Era of Large Language Models

    Authors: Mohammad Beigi, Sijia Wang, Ying Shen, Zihao Lin, Adithya Kulkarni, Jianfeng He, Feng Chen, Ming Jin, Jin-Hee Cho, Dawei Zhou, Chang-Tien Lu, Lifu Huang

    Abstract: In recent years, Large Language Models (LLMs) have become fundamental to a broad spectrum of artificial intelligence applications. As the use of LLMs expands, precisely estimating the uncertainty in their predictions has become crucial. Current methods often struggle to accurately identify, measure, and address the true uncertainty, with many focusing primarily on estimating model confidence. This…

    Submitted 26 October, 2024; originally announced October 2024.

  4. arXiv:2409.07089  [pdf, other]

    cs.LG

    TrialSynth: Generation of Synthetic Sequential Clinical Trial Data

    Authors: Chufan Gao, Mandis Beigi, Afrah Shafquat, Jacob Aptekar, Jimeng Sun

    Abstract: Analyzing data from past clinical trials is part of the ongoing effort to optimize the design, implementation, and execution of new clinical trials and more efficiently bring life-saving interventions to market. While there have been recent advances in the generation of static context synthetic clinical trial data, due to both limited patient availability and constraints imposed by patient privacy…

    Submitted 12 December, 2024; v1 submitted 11 September, 2024; originally announced September 2024.

  5. arXiv:2408.15302  [pdf]

    cs.AR

    Corrigendum to: A Systematic Study of DDR4 DRAM Faults in the Field

    Authors: Majed Valad Beigi, Yi Cao, Sudhanva Gurumurthi, Charles Recchia, Andrew Walton, Vilas Sridharan

    Abstract: This paper is a corrigendum to the paper by Beigi et al. published at HPCA 2023 https://doi.org/10.1109/HPCA56546.2023.10071066. The HPCA paper presented a detailed field data analysis of faults observed at scale in DDR4 DRAM from two different memory vendors. This analysis included a breakdown of fault patterns or modes. Upon further study of the data, we found a bug in how we decoded errors base…

    Submitted 27 August, 2024; originally announced August 2024.

  6. arXiv:2406.12053  [pdf, other]

    cs.CL

    InternalInspector $I^2$: Robust Confidence Estimation in LLMs through Internal States

    Authors: Mohammad Beigi, Ying Shen, Runing Yang, Zihao Lin, Qifan Wang, Ankith Mohan, Jianfeng He, Ming Jin, Chang-Tien Lu, Lifu Huang

    Abstract: Despite their vast capabilities, Large Language Models (LLMs) often struggle with generating reliable outputs, frequently producing high-confidence inaccuracies known as hallucinations. Addressing this challenge, our research introduces InternalInspector, a novel framework designed to enhance confidence estimation in LLMs by leveraging contrastive learning on internal states including attention st…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 8 pages

  7. arXiv:2402.11122  [pdf, other]

    cs.CL cs.AI

    Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models

    Authors: Zihao Lin, Mohammad Beigi, Hongxuan Li, Yufan Zhou, Yuxiang Zhang, Qifan Wang, Wenpeng Yin, Lifu Huang

    Abstract: Memory Editing (ME) has emerged as an efficient method to modify erroneous facts or inject new facts into Large Language Models (LLMs). Two mainstream ME methods exist: parameter-modifying ME and parameter-preserving ME (integrating extra modules while preserving original parameters). Regrettably, previous studies on ME evaluation have two critical limitations: (i) evaluating LLMs with single edit…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: preprint, 15 pages

  8. arXiv:1105.5675  [pdf]

    cs.MM cs.CV

    Scale-Invariant Local Descriptor for Event Recognition in 1D Sensor Signals

    Authors: Jierui Xie, Mandis S. Beigi

    Abstract: In this paper, we introduce a shape-based, time-scale invariant feature descriptor for 1-D sensor signals. The time-scale invariance of the feature allows us to use feature from one training event to describe events of the same semantic class which may take place over varying time scales such as walking slow and walking fast. Therefore it requires less training set. The descriptor takes advantage…

    Submitted 27 May, 2011; originally announced May 2011.

    Journal ref: IEEE International Conference on Multimedia & Expo (ICME), Page(s): 1226-1229, 2009