Showing 1–13 of 13 results for author: Ziyadi, M

  1. arXiv:2506.21506  [pdf, ps, other]

    cs.AI cs.CL

    Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge

    Authors: Boyu Gou, Zanming Huang, Yuting Ning, Yu Gu, Michael Lin, Weijian Qi, Andrei Kopanev, Botao Yu, Bernal Jiménez Gutiérrez, Yiheng Shu, Chan Hee Song, Jiaman Wu, Shijie Chen, Hanane Nour Moussa, Tianshu Zhang, Jian Xie, Yifei Li, Tianci Xue, Zeyi Liao, Kai Zhang, Boyuan Zheng, Zhaowei Cai, Viktor Rozgic, Morteza Ziyadi, Huan Sun, et al. (1 additional author not shown)

    Abstract: Agentic search such as Deep Research systems, where large language models autonomously browse the web, synthesize information, and return comprehensive citation-backed answers, represents a major shift in how users interact with web-scale information. While promising greater efficiency and cognitive offloading, the growing complexity and open-endedness of agentic search have outpaced existing eval…

    Submitted 26 June, 2025; originally announced June 2025.

    Comments: Project Homepage: https://osu-nlp-group.github.io/Mind2Web2/

  2. arXiv:2506.12103  [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  3. arXiv:2506.00789  [pdf, ps, other]

    cs.CL

    RARE: Retrieval-Aware Robustness Evaluation for Retrieval-Augmented Generation Systems

    Authors: Yixiao Zeng, Tianyu Cao, Danqing Wang, Xinran Zhao, Zimeng Qiu, Morteza Ziyadi, Tongshuang Wu, Lei Li

    Abstract: Retrieval-Augmented Generation (RAG) enhances recency and factuality in answers. However, existing evaluations rarely test how well these systems cope with real-world noise, conflicts between internal and external retrieved contexts, or fast-changing facts. We introduce Retrieval-Aware Robustness Evaluation (RARE), a unified framework and large-scale benchmark that jointly stress-tests query and…

    Submitted 31 May, 2025; originally announced June 2025.

  4. arXiv:2505.23823  [pdf, ps, other]

    cs.CL

    RAGPPI: RAG Benchmark for Protein-Protein Interactions in Drug Discovery

    Authors: Youngseung Jeon, Ziwen Li, Thomas Li, JiaSyuan Chang, Morteza Ziyadi, Xiang 'Anthony' Chen

    Abstract: Retrieving the biological impacts of protein-protein interactions (PPIs) is essential for target identification (Target ID) in drug development. Given the vast number of proteins involved, this process remains time-consuming and challenging. Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) frameworks have supported Target ID; however, no benchmark currently exists for identify…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: 17 pages, 4 figures, 8 tables

  5. arXiv:2503.21699  [pdf, other]

    cs.MM cs.AI cs.CV cs.SD eess.AS

    MAVERIX: Multimodal Audio-Visual Evaluation Reasoning IndeX

    Authors: Liuyue Xie, George Z. Wei, Avik Kuthiala, Ce Zheng, Ananya Bal, Mosam Dabhi, Liting Wen, Taru Rustagi, Ethan Lai, Sushil Khyalia, Rohan Choudhury, Morteza Ziyadi, Xu Zhang, Hao Yang, László A. Jeni

    Abstract: Frontier models have either been language-only or have primarily focused on vision and language modalities. Although recent advancements in models with vision and audio understanding capabilities have shown substantial progress, the field lacks a standardized evaluation framework for thoroughly assessing their cross-modality perception performance. We introduce MAVERIX (Multimodal Audio-Visual Eva…

    Submitted 27 March, 2025; originally announced March 2025.

  6. arXiv:2405.18780  [pdf, other]

    cs.AI cs.LG

    Certifying Counterfactual Bias in LLMs

    Authors: Isha Chaudhary, Qian Hu, Manoj Kumar, Morteza Ziyadi, Rahul Gupta, Gagandeep Singh

    Abstract: Large Language Models (LLMs) can produce biased responses that can cause representational harms. However, conventional studies are insufficient to thoroughly evaluate biases across LLM responses for different demographic groups (a.k.a. counterfactual bias), as they do not scale to a large number of inputs and do not provide guarantees. Therefore, we propose the first framework, LLMCert-B, that certif…

    Submitted 21 April, 2025; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: Published at ICLR 2025

  7. arXiv:2403.01615  [pdf, other]

    cs.LG cs.DC

    Partial Federated Learning

    Authors: Tiantian Feng, Anil Ramakrishna, Jimit Majmudar, Charith Peris, Jixuan Wang, Clement Chung, Richard Zemel, Morteza Ziyadi, Rahul Gupta

    Abstract: Federated Learning (FL) is a popular algorithm to train machine learning models on user data constrained to edge devices (for example, mobile phones) due to privacy concerns. Typically, FL is trained with the assumption that no part of the user data can be egressed from the edge. However, in many production settings, specific data-modalities/meta-data are limited to be on device while others are n…

    Submitted 3 March, 2024; originally announced March 2024.

  8. arXiv:2201.11473  [pdf, other]

    cs.CL cs.AI cs.SC

    Reasoning Like Program Executors

    Authors: Xinyu Pi, Qian Liu, Bei Chen, Morteza Ziyadi, Zeqi Lin, Qiang Fu, Yan Gao, Jian-Guang Lou, Weizhu Chen

    Abstract: Reasoning over natural language is a long-standing goal for the research community. However, studies have shown that existing language models are inadequate in reasoning. To address the issue, we present POET, a novel reasoning pre-training paradigm. Through pre-training language models with programs and their execution results, POET empowers language models to harvest the reasoning knowledge poss…

    Submitted 22 October, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

    Comments: To appear in EMNLP 2022 main conference. The first two authors contributed equally

  9. arXiv:2107.07653  [pdf, other]

    cs.CL cs.AI

    TAPEX: Table Pre-training via Learning a Neural SQL Executor

    Authors: Qian Liu, Bei Chen, Jiaqi Guo, Morteza Ziyadi, Zeqi Lin, Weizhu Chen, Jian-Guang Lou

    Abstract: Recent progress in language model pre-training has achieved great success by leveraging large-scale unstructured textual data. However, it is still a challenge to apply pre-training on structured tabular data due to the absence of large-scale high-quality tabular data. In this paper, we propose TAPEX to show that table pre-training can be achieved by learning a neural SQL executor over a synthe…

    Submitted 14 March, 2022; v1 submitted 15 July, 2021; originally announced July 2021.

    Comments: ICLR 2022 camera ready version

  10. arXiv:2008.10570  [pdf, other]

    cs.CL cs.IR

    Example-Based Named Entity Recognition

    Authors: Morteza Ziyadi, Yuting Sun, Abhishek Goswami, Jade Huang, Weizhu Chen

    Abstract: We present a novel approach to named entity recognition (NER) in the presence of scarce data that we call example-based NER. Our train-free few-shot learning approach takes inspiration from question-answering to identify entity spans in a new and unseen domain. In comparison with the current state-of-the-art, the proposed method performs significantly better, especially when using a low number of…

    Submitted 24 August, 2020; originally announced August 2020.

    Comments: 15 pages, 6 figures, 5 tables with appendix

  11. arXiv:2001.08904  [pdf, other]

    cs.CL cs.LG stat.ML

    MT-BioNER: Multi-task Learning for Biomedical Named Entity Recognition using Deep Bidirectional Transformers

    Authors: Muhammad Raza Khan, Morteza Ziyadi, Mohamed AbdelHady

    Abstract: Conversational agents such as Cortana, Alexa and Siri are continuously working on increasing their capabilities by adding new domains. The support of a new domain includes the design and development of a number of NLU components for domain classification, intent classification and slot tagging (including named entity recognition). Each component only performs well when trained on a large amount…

    Submitted 24 January, 2020; originally announced January 2020.

  12. arXiv:1605.08842  [pdf]

    physics.optics

    Spatial Phase and Amplitude Structuring of Beams Using a Combination of Multiple Orthogonal Spatial Functions with Complex Coefficients

    Authors: Guodong Xie, Cong Liu, Long Li, Yongxiong Ren, Zhe Zhao, Yan Yan, Nisar Ahmed, Zhe Wang, Asher J. Willner, Changjing Bao, Yinwen Cao, Morteza Ziyadi, Ahmed Almaiman, Solyman Ashrafi, Moshe Tur, Alan E. Willner

    Abstract: Analogous to time signals that can be composed of multiple frequency functions, we use uniquely structured orthogonal spatial modes to create different beam shapes. We tailor the spatial structure by judiciously choosing a weighted combination of multiple modal states within an orthogonal basis set, and we can tunably create beam phase and intensity "shapes" that are not otherwise readily achievab…

    Submitted 28 May, 2016; originally announced May 2016.

    Comments: 15 pages, 5 figures

  13. arXiv:1301.2015  [pdf, ps, other]

    stat.ML cs.LG

    Heteroscedastic Relevance Vector Machine

    Authors: Daniel Khashabi, Mojtaba Ziyadi, Feng Liang

    Abstract: In this work we propose a heteroscedastic generalization of the RVM, a fast Bayesian framework for regression, based on some recent similar works. We use variational approximation and expectation propagation to tackle the problem. The work is still in progress; we are examining the results and comparing them with previous works.

    Submitted 9 January, 2013; originally announced January 2013.