Showing 1–50 of 53 results for author: Palangi, H

  1. arXiv:2506.08249  [pdf, other]

    cs.DB cs.CL

    RADAR: Benchmarking Language Models on Imperfect Tabular Data

    Authors: Ken Gu, Zhihan Zhang, Kate Lin, Yuwei Zhang, Akshay Paruchuri, Hong Yu, Mehran Kazemi, Kumar Ayush, A. Ali Heydari, Maxwell A. Xu, Girish Narayanswamy, Yun Liu, Ming-Zher Poh, Yuzhe Yang, Mark Malhotra, Shwetak Patel, Hamid Palangi, Xuhai Xu, Daniel McDuff, Tim Althoff, Xin Liu

    Abstract: Language models (LMs) are increasingly being deployed to perform autonomous data analyses. However, their data awareness -- the ability to recognize, reason over, and appropriately handle data artifacts such as missing values, outliers, and logical inconsistencies -- remains underexplored. These artifacts are especially common in real-world tabular data and, if mishandled, can significantly compro…

    Submitted 9 June, 2025; originally announced June 2025.

  2. arXiv:2506.02175  [pdf, ps, other]

    cs.CL

    AI Debate Aids Assessment of Controversial Claims

    Authors: Salman Rahman, Sheriff Issaka, Ashima Suvarna, Genglin Liu, James Shiffer, Jaeyoung Lee, Md Rizwan Parvez, Hamid Palangi, Shi Feng, Nanyun Peng, Yejin Choi, Julian Michael, Liwei Jiang, Saadia Gabriel

    Abstract: As AI grows more powerful, it will increasingly shape how we understand the world. But with this influence comes the risk of amplifying misinformation and deepening social divides, especially on consequential topics like public health where factual accuracy directly impacts well-being. Scalable Oversight aims to ensure AI truthfulness by enabling humans to supervise systems that may exceed human ca…

    Submitted 2 June, 2025; originally announced June 2025.

  3. arXiv:2504.13203  [pdf, other]

    cs.CR cs.AI cs.CL cs.LG cs.MA

    X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents

    Authors: Salman Rahman, Liwei Jiang, James Shiffer, Genglin Liu, Sheriff Issaka, Md Rizwan Parvez, Hamid Palangi, Kai-Wei Chang, Yejin Choi, Saadia Gabriel

    Abstract: Multi-turn interactions with language models (LMs) pose critical safety risks, as harmful intent can be strategically spread across exchanges. Yet, the vast majority of prior work has focused on single-turn safety, while adaptability and diversity remain among the key challenges of multi-turn red-teaming. To address these challenges, we present X-Teaming, a scalable framework that systematically e…

    Submitted 15 April, 2025; originally announced April 2025.

  4. arXiv:2504.01931  [pdf, other]

    cs.CL

    Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection

    Authors: Souradip Chakraborty, Mohammadreza Pourreza, Ruoxi Sun, Yiwen Song, Nino Scherrer, Furong Huang, Amrit Singh Bedi, Ahmad Beirami, Jindong Gu, Hamid Palangi, Tomas Pfister

    Abstract: While AI agents have shown remarkable performance at various tasks, they still struggle with complex multi-modal applications, structured generation and strategic planning. Improvements via standard fine-tuning are often impractical, as solving agentic tasks usually relies on black box API access without control over model parameters. Inference-time methods such as Best-of-N (BON) sampling offer a…

    Submitted 5 April, 2025; v1 submitted 2 April, 2025; originally announced April 2025.

  5. arXiv:2504.01081  [pdf, other]

    cs.CV cs.CL eess.IV

    ShieldGemma 2: Robust and Tractable Image Content Moderation

    Authors: Wenjun Zeng, Dana Kurniawan, Ryan Mullins, Yuchi Liu, Tamoghna Saha, Dirichi Ike-Njoku, Jindong Gu, Yiwen Song, Cai Xu, Jingjing Zhou, Aparna Joshi, Shravan Dheep, Mani Malek, Hamid Palangi, Joon Baek, Rick Pereira, Karthik Narasimhan

    Abstract: We introduce ShieldGemma 2, a 4B parameter image content moderation model built on Gemma 3. This model provides robust safety risk predictions across the following key harm categories: Sexually Explicit, Violence & Gore, and Dangerous Content for synthetic images (e.g. output of any image generation model) and natural images (e.g. any image input to a Vision-Language Model). We evaluated on both…

    Submitted 8 April, 2025; v1 submitted 1 April, 2025; originally announced April 2025.

  6. arXiv:2503.08026  [pdf, other]

    cs.CL cs.AI

    In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents

    Authors: Zhen Tan, Jun Yan, I-Hung Hsu, Rujun Han, Zifeng Wang, Long T. Le, Yiwen Song, Yanfei Chen, Hamid Palangi, George Lee, Anand Iyer, Tianlong Chen, Huan Liu, Chen-Yu Lee, Tomas Pfister

    Abstract: Large Language Models (LLMs) have made significant progress in open-ended dialogue, yet their inability to retain and retrieve relevant information from long-term interactions limits their effectiveness in applications requiring sustained personalization. External memory mechanisms have been proposed to address this limitation, enabling LLMs to maintain conversational continuity. However, existing…

    Submitted 11 March, 2025; originally announced March 2025.

  7. arXiv:2503.07826  [pdf, other]

    cs.CL

    Magnet: Multi-turn Tool-use Data Synthesis and Distillation via Graph Translation

    Authors: Fan Yin, Zifeng Wang, I-Hung Hsu, Jun Yan, Ke Jiang, Yanfei Chen, Jindong Gu, Long T. Le, Kai-Wei Chang, Chen-Yu Lee, Hamid Palangi, Tomas Pfister

    Abstract: Large language models (LLMs) have exhibited the ability to effectively utilize external tools to address user queries. However, their performance may be limited in complex, multi-turn interactions involving users and multiple tools. To address this, we propose Magnet, a principled framework for synthesizing high-quality training trajectories to enhance the function calling capability of large lang…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: 12 pages, 3 figures, 4 tables

  8. arXiv:2502.17955  [pdf, other]

    cs.CL cs.AI

    Language Models' Factuality Depends on the Language of Inquiry

    Authors: Tushar Aggarwal, Kumar Tanmay, Ayush Agrawal, Kumar Ayush, Hamid Palangi, Paul Pu Liang

    Abstract: Multilingual language models (LMs) are expected to recall factual knowledge consistently across languages, yet they often fail to transfer knowledge between languages even when they possess the correct information in one of the languages. For example, we find that an LM may correctly identify Rashed Al Shashai as being from Saudi Arabia when asked in Arabic, but consistently fails to do so when as…

    Submitted 25 February, 2025; originally announced February 2025.

  9. arXiv:2502.16111  [pdf, other]

    cs.AI cs.CL

    PlanGEN: A Multi-Agent Framework for Generating Planning and Reasoning Trajectories for Complex Problem Solving

    Authors: Mihir Parmar, Xin Liu, Palash Goyal, Yanfei Chen, Long Le, Swaroop Mishra, Hossein Mobahi, Jindong Gu, Zifeng Wang, Hootan Nakhost, Chitta Baral, Chen-Yu Lee, Tomas Pfister, Hamid Palangi

    Abstract: Recent agent frameworks and inference-time algorithms often struggle with complex planning problems due to limitations in verifying generated plans or reasoning and varying complexity of instances within a single task. Many existing methods for these tasks either perform task-level verification without considering constraints or apply inference-time algorithms without adapting to instance-level co…

    Submitted 22 February, 2025; originally announced February 2025.

    Comments: 30 pages

  10. arXiv:2502.04510  [pdf, other]

    cs.CL

    Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems

    Authors: Shangbin Feng, Zifeng Wang, Palash Goyal, Yike Wang, Weijia Shi, Huang Xia, Hamid Palangi, Luke Zettlemoyer, Yulia Tsvetkov, Chen-Yu Lee, Tomas Pfister

    Abstract: We propose Heterogeneous Swarms, an algorithm to design multi-LLM systems by jointly optimizing model roles and weights. We represent multi-LLM systems as directed acyclic graphs (DAGs) of LLMs with topological message passing for collaborative generation. Given a pool of LLM experts and a utility function, Heterogeneous Swarms employs two iterative steps: role-step and weight-step. For role-step,…

    Submitted 6 February, 2025; originally announced February 2025.

  11. arXiv:2502.02533  [pdf, other]

    cs.LG cs.AI cs.CL cs.MA

    Multi-Agent Design: Optimizing Agents with Better Prompts and Topologies

    Authors: Han Zhou, Xingchen Wan, Ruoxi Sun, Hamid Palangi, Shariq Iqbal, Ivan Vulić, Anna Korhonen, Sercan Ö. Arık

    Abstract: Large language models, employed as multiple agents that interact and collaborate with each other, have excelled at solving complex tasks. The agents are programmed with prompts that declare their functionality, along with the topologies that orchestrate interactions across agents. Designing prompts and topologies for multi-agent systems (MAS) is inherently complex. To automate the entire design pr…

    Submitted 4 February, 2025; originally announced February 2025.

    Comments: 11 pages, 7 figures, 1 table (30 pages, 9 figures, 5 tables including references and appendices)

  12. arXiv:2411.19865  [pdf, other]

    cs.CL cs.AI cs.LG

    Reverse Thinking Makes LLMs Stronger Reasoners

    Authors: Justin Chih-Yao Chen, Zifeng Wang, Hamid Palangi, Rujun Han, Sayna Ebrahimi, Long Le, Vincent Perot, Swaroop Mishra, Mohit Bansal, Chen-Yu Lee, Tomas Pfister

    Abstract: Reverse thinking plays a crucial role in human reasoning. Humans can reason not only from a problem to a solution but also in reverse, i.e., start from the solution and reason towards the problem. This often enhances overall reasoning performance as it enables consistency checks between their forward and backward thinking. To enable Large Language Models (LLMs) to perform reverse thinking, we intr…

    Submitted 7 March, 2025; v1 submitted 29 November, 2024; originally announced November 2024.

    Comments: Accepted to NAACL 2025

  13. arXiv:2410.11163  [pdf, ps, other]

    cs.CL

    Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence

    Authors: Shangbin Feng, Zifeng Wang, Yike Wang, Sayna Ebrahimi, Hamid Palangi, Lesly Miculicich, Achin Kulshrestha, Nathalie Rauschmayr, Yejin Choi, Yulia Tsvetkov, Chen-Yu Lee, Tomas Pfister

    Abstract: We propose Model Swarms, a collaborative search algorithm to adapt LLMs via swarm intelligence, the collective behavior guiding individual systems. Specifically, Model Swarms starts with a pool of LLM experts and a utility function. Guided by the best-found checkpoints across models, diverse LLM experts collaboratively move in the weight space and optimize a utility function representing model ada…

    Submitted 31 May, 2025; v1 submitted 14 October, 2024; originally announced October 2024.

    Comments: ICML 2025

  14. arXiv:2409.18216  [pdf, other]

    cs.AI cs.CL cs.LG

    MMMT-IF: A Challenging Multimodal Multi-Turn Instruction Following Benchmark

    Authors: Elliot L. Epstein, Kaisheng Yao, Jing Li, Xinyi Bai, Hamid Palangi

    Abstract: Evaluating instruction following capabilities for multimodal, multi-turn dialogue is challenging. With potentially multiple instructions in the input model context, the task is time-consuming for human raters and we show LLM-based judges are biased towards answers from the same model. We propose MMMT-IF, an image-based multi-turn Q&A evaluation set with added global instructions between questio…

    Submitted 26 September, 2024; originally announced September 2024.

    Comments: 24 pages, 16 figures

    ACM Class: I.2

  15. arXiv:2409.10566  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Eureka: Evaluating and Understanding Large Foundation Models

    Authors: Vidhisha Balachandran, Jingya Chen, Neel Joshi, Besmira Nushi, Hamid Palangi, Eduardo Salinas, Vibhav Vineet, James Woffinden-Luey, Safoora Yousefi

    Abstract: Rigorous and reproducible evaluation is critical for assessing the state of the art and for guiding scientific advances in Artificial Intelligence. Evaluation is challenging in practice due to several reasons, including benchmark saturation, lack of transparency in methods used for measurement, development challenges in extracting measurements for generative tasks, and, more generally, the extensi…

    Submitted 13 September, 2024; originally announced September 2024.

    ACM Class: I.2

  16. arXiv:2402.08225  [pdf, other]

    cs.LG

    Improving Black-box Robustness with In-Context Rewriting

    Authors: Kyle O'Brien, Nathan Ng, Isha Puri, Jorge Mendez, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi, Thomas Hartvigsen

    Abstract: Machine learning models for text classification often excel on in-distribution (ID) data but struggle with unseen out-of-distribution (OOD) inputs. Most techniques for improving OOD robustness are not applicable to settings where the model is effectively a black box, such as when the weights are frozen, retraining is costly, or the model is leveraged via an API. Test-time augmentation (TTA) is a s…

    Submitted 4 August, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

  17. arXiv:2402.06120  [pdf, other]

    cs.CL

    Exploring Group and Symmetry Principles in Large Language Models

    Authors: Shima Imani, Hamid Palangi

    Abstract: Large Language Models (LLMs) have demonstrated impressive performance across a wide range of applications; however, assessing their reasoning capabilities remains a significant challenge. In this paper, we introduce a framework grounded in group and symmetry principles, which have played a crucial role in fields such as physics and mathematics, and offer another way to evaluate their capabilities.…

    Submitted 5 September, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

  18. arXiv:2312.02073  [pdf, other]

    cs.CL cs.AI cs.LG

    A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia

    Authors: Giovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kıcıman, Hamid Palangi, Barun Patra, Robert West

    Abstract: Large language models (LLMs) have an impressive ability to draw on novel information supplied in their context. Yet the mechanisms underlying this contextual grounding remain unknown, especially in situations where contextual information contradicts factual knowledge stored in the parameters, which LLMs also excel at recalling. Favoring the contextual information is critical for retrieval-augmente…

    Submitted 10 June, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: Accepted at ACL 2024 (main conference)

  19. arXiv:2311.11045  [pdf, other]

    cs.AI

    Orca 2: Teaching Small Language Models How to Reason

    Authors: Arindam Mitra, Luciano Del Corro, Shweti Mahajan, Andres Codas, Clarisse Simoes, Sahaj Agarwal, Xuxi Chen, Anastasia Razdaibiedina, Erik Jones, Kriti Aggarwal, Hamid Palangi, Guoqing Zheng, Corby Rosset, Hamed Khanpour, Ahmed Awadallah

    Abstract: Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We…

    Submitted 21 November, 2023; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: Added URL to model weights; fixed typo in author name

  20. arXiv:2310.17750  [pdf, other]

    cs.CL

    A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications

    Authors: Ahmed Magooda, Alec Helyar, Kyle Jackson, David Sullivan, Chad Atalla, Emily Sheng, Dan Vann, Richard Edgar, Hamid Palangi, Roman Lutz, Hongliang Kong, Vincent Yun, Eslam Kamal, Federico Zarfati, Hanna Wallach, Sarah Bird, Mei Chen

    Abstract: We present a framework for the automated measurement of responsible AI (RAI) metrics for large language models (LLMs) and associated products and services. Our framework for automatically measuring harms from LLMs builds on existing technical and sociotechnical expertise and leverages the capabilities of state-of-the-art LLMs, such as GPT-4. We use this framework to run through several case studie…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: This is a living document

  21. arXiv:2310.07088  [pdf, other]

    cs.CL cs.AI

    Diversity of Thought Improves Reasoning Abilities of LLMs

    Authors: Ranjita Naik, Varun Chandrasekaran, Mert Yuksekgonul, Hamid Palangi, Besmira Nushi

    Abstract: Large language models (LLMs) are documented to struggle in settings that require complex reasoning. Nevertheless, instructing the model to break down the problem into smaller reasoning steps, or ensembling various generations through modifying decoding steps boosts performance. However, these methods assume that the input prompt is fixed and expect the decoding strategies to introduce the diversit…

    Submitted 23 February, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

  22. arXiv:2310.06827  [pdf, other]

    cs.CL cs.LG

    Teaching Language Models to Hallucinate Less with Synthetic Tasks

    Authors: Erik Jones, Hamid Palangi, Clarisse Simões, Varun Chandrasekaran, Subhabrata Mukherjee, Arindam Mitra, Ahmed Awadallah, Ece Kamar

    Abstract: Large language models (LLMs) frequently hallucinate on abstractive summarization tasks such as document-based question-answering, meeting summarization, and clinical report generation, even though all necessary information is included in context. However, optimizing LLMs to hallucinate less on these tasks is challenging, as hallucination is hard to efficiently evaluate at each optimization step. I…

    Submitted 7 November, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

  23. arXiv:2309.15129  [pdf, other]

    cs.AI cs.CL cs.LG

    Evaluating Cognitive Maps and Planning in Large Language Models with CogEval

    Authors: Ida Momennejad, Hosein Hasanbeig, Felipe Vieira, Hiteshi Sharma, Robert Osazuwa Ness, Nebojsa Jojic, Hamid Palangi, Jonathan Larson

    Abstract: Recently an influx of studies claim emergent cognitive abilities in large language models (LLMs). Yet, most rely on anecdotes, overlook contamination of training sets, or lack systematic evaluation involving multiple tasks, control conditions, multiple iterations, and statistical robustness tests. Here we make two major contributions. First, we propose CogEval, a cognitive science-inspired protoco…

    Submitted 24 September, 2023; originally announced September 2023.

  24. arXiv:2309.15098  [pdf, other]

    cs.CL cs.AI cs.LG

    Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

    Authors: Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi

    Abstract: We investigate the internal behavior of Transformer-based Large Language Models (LLMs) when they generate factually incorrect text. We propose modeling factual queries as constraint satisfaction problems and use this framework to investigate how the LLM interacts internally with factual constraints. We find a strong positive relationship between the LLM's attention to constraint tokens and the fac…

    Submitted 17 April, 2024; v1 submitted 26 September, 2023; originally announced September 2023.

    Comments: Published at ICLR 2024

  25. arXiv:2307.10522  [pdf, other]

    cs.CL

    Gender-tuning: Empowering Fine-tuning for Debiasing Pre-trained Language Models

    Authors: Somayeh Ghanbarzadeh, Yan Huang, Hamid Palangi, Radames Cruz Moreno, Hamed Khanpour

    Abstract: Recent studies have revealed that the widely-used Pre-trained Language Models (PLMs) propagate societal biases from the large unmoderated pre-training corpora. Existing solutions require debiasing training processes and datasets for debiasing, which are resource-intensive and costly. Furthermore, these methods hurt the PLMs' performance on downstream tasks. In this study, we propose Gender-tuning,…

    Submitted 19 July, 2023; originally announced July 2023.

    Journal ref: ACL 2023

  26. arXiv:2307.10457  [pdf, other]

    cs.CL

    Improving the Reusability of Pre-trained Language Models in Real-world Applications

    Authors: Somayeh Ghanbarzadeh, Hamid Palangi, Yan Huang, Radames Cruz Moreno, Hamed Khanpour

    Abstract: The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their generalization problem, where their performance drastically decreases when evaluated on examples that differ from the training dataset, known as Out-of-Distribution (OOD)/unseen examples. This limitation arises from PLMs' reliance on spurious correlations, which work well for frequent example types but…

    Submitted 8 August, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted as a long paper and awarded Best Research Paper at IEEE IRI'23 (IEEE 24th International Conference on Information Reuse and Integration for Data Science)

  27. arXiv:2306.02707  [pdf, other]

    cs.CL cs.LG

    Orca: Progressive Learning from Complex Explanation Traces of GPT-4

    Authors: Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi, Ahmed Awadallah

    Abstract: Recent research has focused on enhancing the capability of smaller models through imitation learning, drawing on the outputs generated by large foundation models (LFMs). A number of issues impact the quality of these models, ranging from limited imitation signals from shallow LFM outputs; small scale homogeneous training data; and most notably a lack of rigorous evaluation resulting in overestimat…

    Submitted 5 June, 2023; originally announced June 2023.

  28. arXiv:2304.03916  [pdf, other]

    cs.LG cs.AI

    Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning

    Authors: Yu Yang, Besmira Nushi, Hamid Palangi, Baharan Mirzasoleiman

    Abstract: Spurious correlations that degrade model generalization or lead the model to be right for the wrong reasons are one of the main robustness concerns for real-world deployments. However, mitigating these correlations during pre-training for large-scale models can be costly and impractical, particularly for those without access to high-performance computing resources. This paper proposes a novel appr…

    Submitted 30 May, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

  29. arXiv:2303.12712  [pdf, other]

    cs.CL cs.AI

    Sparks of Artificial General Intelligence: Early experiments with GPT-4

    Authors: Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang

    Abstract: Artificial intelligence (AI) researchers have been developing and refining large language models (LLMs) that exhibit remarkable capabilities across a variety of domains and tasks, challenging our understanding of learning and cognition. The latest model developed by OpenAI, GPT-4, was trained using an unprecedented scale of compute and data. In this paper, we report on our investigation of an earl…

    Submitted 13 April, 2023; v1 submitted 22 March, 2023; originally announced March 2023.

  30. arXiv:2301.09211  [pdf, other]

    cs.CL cs.AI

    An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models

    Authors: Saghar Hosseini, Hamid Palangi, Ahmed Hassan Awadallah

    Abstract: Large-scale Pre-Trained Language Models (PTLMs) capture knowledge from massive human-written data which contains latent societal biases and toxic content. In this paper, we leverage the primary task of PTLMs, i.e., language modeling, and propose a new metric to quantify manifested implicit representational harms in PTLMs towards 13 marginalized demographics. Using this metric, we conducted an emp…

    Submitted 22 January, 2023; originally announced January 2023.

    Comments: 17 pages

    ACM Class: I.2.7

  31. arXiv:2212.10015  [pdf, other]

    cs.CV cs.AI cs.CL

    Benchmarking Spatial Relationships in Text-to-Image Generation

    Authors: Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang

    Abstract: Spatial understanding is a fundamental aspect of computer vision and integral for human-level reasoning about images, making it an important component for grounded language understanding. While recent text-to-image synthesis (T2I) models have shown unprecedented improvements in photorealism, it is unclear whether they have reliable spatial understanding capabilities. We investigate the ability of…

    Submitted 27 October, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: preprint; Code and Data at https://github.com/microsoft/VISOR and https://huggingface.co/datasets/tgokhale/sr2d_visor

  32. arXiv:2211.11109  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Deep Learning on a Healthy Data Diet: Finding Important Examples for Fairness

    Authors: Abdelrahman Zayed, Prasanna Parthasarathi, Goncalo Mordido, Hamid Palangi, Samira Shabanian, Sarath Chandar

    Abstract: Data-driven predictive solutions predominant in commercial applications tend to suffer from biases and stereotypes, which raises equity concerns. Prediction models may discover, use, or amplify spurious correlations based on gender or other protected personal characteristics, thus discriminating against marginalized groups. Mitigating gender bias has become an important research focus in natural l…

    Submitted 24 November, 2022; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: In Proceedings of AAAI 2023

  33. arXiv:2211.11031  [pdf, other]

    cs.LG

    Aging with GRACE: Lifelong Model Editing with Discrete Key-Value Adaptors

    Authors: Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi

    Abstract: Deployed language models decay over time due to shifting inputs, changing user needs, or emergent world-knowledge gaps. When such problems are identified, we want to make targeted edits while avoiding expensive retraining. However, current model editors, which modify such behaviors of pre-trained models, degrade model performance quickly across multiple, sequential edits. We propose GRACE, a lifel…

    Submitted 17 October, 2023; v1 submitted 20 November, 2022; originally announced November 2022.

    Comments: Accepted to NeurIPS 2023

  34. arXiv:2211.04364  [pdf, other]

    cs.CL

    NaturalAdversaries: Can Naturalistic Adversaries Be as Effective as Artificial Adversaries?

    Authors: Saadia Gabriel, Hamid Palangi, Yejin Choi

    Abstract: While a substantial body of prior work has explored adversarial example generation for natural language understanding tasks, these examples are often unrealistic and diverge from the real-world data distributions. In this work, we introduce a two-stage adversarial example generation framework (NaturalAdversaries), for designing adversaries that are effective at fooling a given classifier and demon…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: Findings of EMNLP 2022

  35. arXiv:2208.06061  [pdf, other]

    cs.CL

    Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages

    Authors: Paul Soulos, Sudha Rao, Caitlin Smith, Eric Rosen, Asli Celikyilmaz, R. Thomas McCoy, Yichen Jiang, Coleman Haley, Roland Fernandez, Hamid Palangi, Jianfeng Gao, Paul Smolensky

    Abstract: Machine translation has seen rapid progress with the advent of Transformer-based models. These models have no explicit linguistic structure built into them, yet they may still implicitly learn structured relationships by attending to relevant tokens. We hypothesize that this structural learning could be made more robust by explicitly endowing Transformers with a structural bias, and we investigate…

    Submitted 11 August, 2022; originally announced August 2022.

    Comments: Revised edition to 4th Workshop on Technologies for MT of Low Resource Languages

    Journal ref: Proceedings of the 4th Workshop on Technologies for MT of Low Resource Languages (LoResMT2021)

  36. arXiv:2207.02159  [pdf, other]

    cs.CV cs.MM

    Robustness Analysis of Video-Language Models Against Visual and Language Perturbations

    Authors: Madeline C. Schiappa, Shruti Vyas, Hamid Palangi, Yogesh S. Rawat, Vibhav Vineet

    Abstract: Joint visual and language modeling on large-scale datasets has recently shown good progress in multi-modal tasks when compared to single modal learning. However, robustness of these approaches against real-world perturbations has not been studied. In this work, we perform the first extensive robustness study of video-language models against various real-world perturbations. We focus on text-to-vid…

    Submitted 18 July, 2023; v1 submitted 5 July, 2022; originally announced July 2022.

    Comments: NeurIPS 2022 Datasets and Benchmarks Track. The project's webpage is located at https://bit.ly/3CNOly4

    Journal ref: Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (2022)

  37. arXiv:2207.01398  [pdf, other]

    cs.CV eess.IV

    Large-scale Robustness Analysis of Video Action Recognition Models

    Authors: Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh Rawat

    Abstract: We have seen great progress in video action recognition in recent years. There are several models based on convolutional neural networks (CNNs) and some recent transformer-based approaches which provide top performance on existing benchmarks. In this work, we perform a large-scale robustness analysis of these existing models for video action recognition. We focus on robustness against real-world d…

    Submitted 7 April, 2023; v1 submitted 4 July, 2022; originally announced July 2022.

    Comments: Accepted in 2023 Conference on Computer Vision and Pattern Recognition (CVPR)

  38. arXiv:2203.09509  [pdf, other]

    cs.CL

    ToxiGen: A Large-Scale Machine-Generated Dataset for Adversarial and Implicit Hate Speech Detection

    Authors: Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, Maarten Sap, Dipankar Ray, Ece Kamar

    Abstract: Toxic language detection systems often falsely flag text that contains minority group mentions as toxic, as those groups are often the targets of online hate. Such over-reliance on spurious correlations also causes systems to struggle with detecting implicitly toxic language. To help mitigate these issues, we create ToxiGen, a new large-scale and machine-generated dataset of 274k toxic and benign…

    Submitted 14 July, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Published as a long paper at ACL 2022. Code: https://github.com/microsoft/TOXIGEN

  39. arXiv:2106.01317  [pdf, other]

    cs.CL cs.AI cs.LG

    Enriching Transformers with Structured Tensor-Product Representations for Abstractive Summarization

    Authors: Yichen Jiang, Asli Celikyilmaz, Paul Smolensky, Paul Soulos, Sudha Rao, Hamid Palangi, Roland Fernandez, Caitlin Smith, Mohit Bansal, Jianfeng Gao

    Abstract: Abstractive summarization, the task of generating a concise summary of input documents, requires: (1) reasoning over the source document to determine the salient pieces of information scattered across the long document, and (2) composing a cohesive text by reconstructing these salient facts into a shorter summary that faithfully reflects the complex relations connecting these facts. In this paper,…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: NAACL 2021 (14 pages)

  40. arXiv:2105.08961  [pdf, other]

    cs.LG cs.AI cs.CL

    Compositional Processing Emerges in Neural Networks Solving Math Problems

    Authors: Jacob Russin, Roland Fernandez, Hamid Palangi, Eric Rosen, Nebojsa Jojic, Paul Smolensky, Jianfeng Gao

    Abstract: A longstanding question in cognitive science concerns the learning mechanisms underlying compositionality in human cognition. Humans can infer the structured relationships (e.g., grammatical rules) implicit in their sensory observations (e.g., auditory speech), and use this knowledge to guide the composition of simpler meanings into complex wholes. Recent progress in artificial neural networks has…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: 7 pages, 2 figures, Accepted to CogSci 2021 for poster presentation

  41. arXiv:2011.09530  [pdf, other]

    cs.CV cs.AI eess.IV

    Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language

    Authors: Hassan Akbari, Hamid Palangi, Jianwei Yang, Sudha Rao, Asli Celikyilmaz, Roland Fernandez, Paul Smolensky, Jianfeng Gao, Shih-Fu Chang

    Abstract: Neuro-symbolic representations have proved effective in learning structure information in vision and language. In this paper, we propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning. Our approach uses a dictionary learning-based method of learning relations between videos and their paired text descriptions. We refer to these relations as rel…

    Submitted 18 November, 2020; originally announced November 2020.

  42. arXiv:2006.11524  [pdf, other]

    cs.LG cs.AI cs.CV cs.NE cs.SC stat.ML

    Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"

    Authors: Saeed Amizadeh, Hamid Palangi, Oleksandr Polozov, Yichen Huang, Kazuhito Koishida

    Abstract: Visual reasoning tasks such as visual question answering (VQA) require an interplay of visual perception with reasoning about the question semantics grounded in perception. However, recent advances in this area are still primarily driven by perception improvements (e.g. scene graph generation) rather than reasoning. Neuro-symbolic models such as Neural Module Networks bring the benefits of composi…

    Submitted 25 August, 2020; v1 submitted 20 June, 2020; originally announced June 2020.

    Comments: Published in Proceedings of the 37th International Conference on Machine Learning (ICML), Online, PMLR 119, 2020

  43. arXiv:2005.11406  [pdf, other]

    cs.CV

    Novel Human-Object Interaction Detection via Adversarial Domain Generalization

    Authors: Yuhang Song, Wenbo Li, Lei Zhang, Jianwei Yang, Emre Kiciman, Hamid Palangi, Jianfeng Gao, C. -C. Jay Kuo, Pengchuan Zhang

    Abstract: We study in this paper the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios. The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations. As a result, most existing HOI methods heavily re…

    Submitted 22 May, 2020; originally announced May 2020.

  44. arXiv:1910.12647  [pdf, other]

    cs.CL cs.LG stat.ML

    HUBERT Untangles BERT to Improve Transfer across NLP Tasks

    Authors: Mehrad Moradshahi, Hamid Palangi, Monica S. Lam, Paul Smolensky, Jianfeng Gao

    Abstract: We introduce HUBERT which combines the structured-representational power of Tensor-Product Representations (TPRs) and BERT, a pre-trained bidirectional Transformer language model. We show that there is shared structure between different NLP datasets that HUBERT, but not BERT, is able to learn and leverage. We validate the effectiveness of our model on the GLUE benchmark and HANS dataset. Our exper…

    Submitted 25 April, 2021; v1 submitted 25 October, 2019; originally announced October 2019.

  45. arXiv:1910.02339  [pdf, other]

    cs.CL cs.LG

    Mapping Natural-language Problems to Formal-language Solutions Using Structured Neural Representations

    Authors: Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Kenneth D. Forbus, Jianfeng Gao

    Abstract: Generating formal-language programs represented by relational tuples, such as Lisp programs or mathematical operations, to solve problems stated in natural language is a challenging task because it requires explicitly capturing discrete symbolic structural information implicit in the input. However, most general neural sequence models do not explicitly capture such structural information, limiting…

    Submitted 1 August, 2020; v1 submitted 5 October, 2019; originally announced October 2019.

  46. arXiv:1909.11059  [pdf, other]

    cs.CV

    Unified Vision-Language Pre-Training for Image Captioning and VQA

    Authors: Luowei Zhou, Hamid Palangi, Lei Zhang, Houdong Hu, Jason J. Corso, Jianfeng Gao

    Abstract: This paper presents a unified Vision-Language Pre-training (VLP) model. The model is unified in that (1) it can be fine-tuned for either vision-language generation (e.g., image captioning) or understanding (e.g., visual question answering) tasks, and (2) it uses a shared multi-layer transformer network for both encoding and decoding, which differs from many existing methods where the encoder and d…

    Submitted 4 December, 2019; v1 submitted 24 September, 2019; originally announced September 2019.

    Comments: AAAI 2020 camera-ready version. The code and the pre-trained models are available at https://github.com/LuoweiZhou/VLP

  47. arXiv:1909.09953  [pdf, other]

    cs.CV cs.AI

    Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators

    Authors: Kuang-Huei Lee, Hamid Palangi, Xi Chen, Houdong Hu, Jianfeng Gao

    Abstract: Grounding language to visual relations is critical to various language-and-vision applications. In this work, we tackle two fundamental language-and-vision tasks: image-text matching and image captioning, and demonstrate that neural scene graph generators can learn effective visual relation features to facilitate grounding language to visual relations and subsequently improve the two end applicati…

    Submitted 22 September, 2019; originally announced September 2019.

  48. arXiv:1803.09848  [pdf, other]

    eess.SP

    Epileptic Seizure Detection: A Deep Learning Approach

    Authors: Ramy Hussein, Hamid Palangi, Rabab Ward, Z. Jane Wang

    Abstract: Epilepsy is the second most common brain disorder after migraine. Automatic detection of epileptic seizures can considerably improve the patients' quality of life. Current Electroencephalogram (EEG)-based seizure detection systems encounter many challenges in real-life situations. The EEGs are non-stationary signals and seizure patterns vary across patients and recording sessions. Moreover, EEG da…

    Submitted 26 March, 2018; originally announced March 2018.

    Comments: 12 pages, 8 figures

  49. arXiv:1705.08432  [pdf, other]

    cs.CL

    Question-Answering with Grammatically-Interpretable Representations

    Authors: Hamid Palangi, Paul Smolensky, Xiaodong He, Li Deng

    Abstract: We introduce an architecture, the Tensor Product Recurrent Network (TPRN). In our application of TPRN, internal representations learned by end-to-end optimization in a deep neural network performing a textual question-answering (QA) task can be interpreted using basic concepts from linguistic theory. No performance penalty need be paid for this increased interpretability: the proposed model perfor…

    Submitted 25 September, 2017; v1 submitted 23 May, 2017; originally announced May 2017.

  50. Distributed Compressive Sensing: A Deep Learning Approach

    Authors: Hamid Palangi, Rabab Ward, Li Deng

    Abstract: Various studies that address the compressed sensing problem with Multiple Measurement Vectors (MMVs) have recently been carried out. These studies assume the vectors of the different channels to be jointly sparse. In this paper, we relax this condition. Instead we assume that these sparse vectors depend on each other but that this dependency is unknown. We capture this dependency by computing the cond…

    Submitted 11 May, 2016; v1 submitted 20 August, 2015; originally announced August 2015.

    Comments: To appear in IEEE Transactions on Signal Processing

    Journal ref: IEEE Transactions on Signal Processing, Volume: 64, Issue: 17, pp. 4504-4518, 2016