
Showing 1–27 of 27 results for author: Zhong, V

  1. arXiv:2504.19452

    cs.LG physics.comp-ph

    Geometry-Informed Neural Operator Transformer

    Authors: Qibang Liu, Vincient Zhong, Hadi Meidani, Diab Abueidda, Seid Koric, Philippe Geubelle

    Abstract: Machine-learning-based surrogate models offer significant computational efficiency and faster simulations compared to traditional numerical methods, especially for problems requiring repeated evaluations of partial differential equations. This work introduces the Geometry-Informed Neural Operator Transformer (GINOT), which integrates the transformer architecture with the neural operator framework…

    Submitted 27 May, 2025; v1 submitted 27 April, 2025; originally announced April 2025.

  2. arXiv:2411.07763

    cs.CL cs.AI cs.DB

    Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows

    Authors: Fangyu Lei, Jixuan Chen, Yuxiao Ye, Ruisheng Cao, Dongchan Shin, Hongjin Su, Zhaoqing Suo, Hongcheng Gao, Wenjing Hu, Pengcheng Yin, Victor Zhong, Caiming Xiong, Ruoxi Sun, Qian Liu, Sida Wang, Tao Yu

    Abstract: Real-world enterprise text-to-SQL workflows often involve complex cloud or local data across various database systems, multiple SQL queries in various dialects, and diverse operations from data transformation to analytics. We introduce Spider 2.0, an evaluation framework comprising 632 real-world text-to-SQL workflow problems derived from enterprise-level database use cases. The databases in Spide…

    Submitted 17 March, 2025; v1 submitted 12 November, 2024; originally announced November 2024.

    Comments: ICLR 2025 Oral

  3. arXiv:2407.10956

    cs.AI cs.CL

    Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

    Authors: Ruisheng Cao, Fangyu Lei, Haoyuan Wu, Jixuan Chen, Yeqiao Fu, Hongcheng Gao, Xinzhuang Xiong, Hanchong Zhang, Yuchen Mao, Wenjing Hu, Tianbao Xie, Hongshen Xu, Danyang Zhang, Sida Wang, Ruoxi Sun, Pengcheng Yin, Caiming Xiong, Ansong Ni, Qian Liu, Victor Zhong, Lu Chen, Kai Yu, Tao Yu

    Abstract: Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VLM-based agents could potentially automate these workflows by generating SQL queries, Python code, and GUI operations. This automation can improve the productivit…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 34 pages, 14 figures, 10 tables

  4. arXiv:2404.07972

    cs.AI cs.CL

    OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

    Authors: Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu

    Abstract: Autonomous agents that accomplish complex computer tasks with minimal human interventions have the potential to transform human-computer interaction, significantly enhancing accessibility and productivity. However, existing benchmarks either lack an interactive environment or are limited to environments specific to certain applications or domains, failing to reflect the diverse and complex nature…

    Submitted 30 May, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

    Comments: 51 pages, 21 figures

  5. arXiv:2402.07876

    cs.LG cs.AI cs.CL

    Policy Improvement using Language Feedback Models

    Authors: Victor Zhong, Dipendra Misra, Xingdi Yuan, Marc-Alexandre Côté

    Abstract: We introduce Language Feedback Models (LFMs) that identify desirable behaviour - actions that help achieve tasks specified in the instruction - for imitation learning in instruction following. To train LFMs, we obtain feedback from Large Language Models (LLMs) on visual trajectories verbalized to language descriptions. First, by using LFMs to identify desirable behaviour to imitate, we improve in…

    Submitted 9 October, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: NeurIPS 2024

  6. arXiv:2309.11489

    cs.LG cs.AI cs.CL cs.RO

    Text2Reward: Reward Shaping with Language Models for Reinforcement Learning

    Authors: Tianbao Xie, Siheng Zhao, Chen Henry Wu, Yitao Liu, Qian Luo, Victor Zhong, Yanchao Yang, Tao Yu

    Abstract: Designing reward functions is a longstanding challenge in reinforcement learning (RL); it requires specialized knowledge or domain data, leading to high costs for development. To address this, we introduce Text2Reward, a data-free framework that automates the generation and shaping of dense reward functions based on large language models (LLMs). Given a goal described in natural language, Text2Rew…

    Submitted 25 May, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: ICLR 2024 camera ready, 37 pages, 12 figures

  7. arXiv:2212.10511

    cs.CL cs.AI cs.LG

    When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories

    Authors: Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Daniel Khashabi, Hannaneh Hajishirzi

    Abstract: Despite their impressive performance on diverse tasks, large language models (LMs) still struggle with tasks requiring rich world knowledge, implying the limitations of relying solely on their parameters to encode a wealth of world knowledge. This paper aims to understand LMs' strengths and limitations in memorizing factual knowledge, by conducting large-scale knowledge probing experiments of 10 m…

    Submitted 2 July, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL 2023; Code and data available at https://github.com/AlexTMallen/adaptive-retrieval

  8. arXiv:2210.14353

    cs.CL

    RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering

    Authors: Victor Zhong, Weijia Shi, Wen-tau Yih, Luke Zettlemoyer

    Abstract: We introduce RoMQA, the first benchmark for robust, multi-evidence, multi-answer question answering (QA). RoMQA contains clusters of questions that are derived from related constraints mined from the Wikidata knowledge graph. RoMQA evaluates robustness of QA models to varying constraints by measuring worst-case performance within each question cluster. Compared to prior QA datasets, RoMQA has more…

    Submitted 15 November, 2022; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: The source code and evaluation for RoMQA are at https://github.com/facebookresearch/romqa

  9. arXiv:2210.07370

    cs.CL

    M2D2: A Massively Multi-domain Language Modeling Dataset

    Authors: Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer

    Abstract: We present M2D2, a fine-grained, massively multi-domain corpus for studying domain adaptation in language models (LMs). M2D2 consists of 8.5B tokens and spans 145 domains extracted from Wikipedia and Semantic Scholar. Using ontologies derived from Wikipedia and ArXiv categories, we organize the domains in each data source into 22 groups. This two-level hierarchy enables the study of relationships…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022

  10. arXiv:2210.00066

    cs.LG cs.AI cs.CL

    Improving Policy Learning via Language Dynamics Distillation

    Authors: Victor Zhong, Jesse Mu, Luke Zettlemoyer, Edward Grefenstette, Tim Rocktäschel

    Abstract: Recent work has shown that augmenting environments with language descriptions improves policy learning. However, for environments with complex language abstractions, learning how to ground language to observations is difficult due to sparse, delayed rewards. We propose Language Dynamics Distillation (LDD), which pretrains a model to predict environment dynamics given demonstrations with language d…

    Submitted 30 September, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022. 16 pages, 12 figures

  11. Conspiracy Brokers: Understanding the Monetization of YouTube Conspiracy Theories

    Authors: Cameron Ballard, Ian Goldstein, Pulak Mehta, Genesis Smothers, Kejsi Take, Victoria Zhong, Rachel Greenstadt, Tobias Lauinger, Damon McCoy

    Abstract: Conspiracy theories are increasingly a subject of research interest as society grapples with their rapid growth in areas such as politics or public health. Previous work has established YouTube as one of the most popular sites for people to host and discuss different theories. In this paper, we present an analysis of monetization methods of conspiracy theorist YouTube creators and the types of adv…

    Submitted 31 May, 2022; originally announced May 2022.

    Journal ref: WWW 2022 Proceedings of the ACM Web Conference, April 2022, Pages 2707-2718

  12. arXiv:2202.08938

    cs.LG cs.AI cs.CL

    Improving Intrinsic Exploration with Language Abstractions

    Authors: Jesse Mu, Victor Zhong, Roberta Raileanu, Minqi Jiang, Noah Goodman, Tim Rocktäschel, Edward Grefenstette

    Abstract: Reinforcement learning (RL) agents are particularly hard to train when rewards are sparse. One common solution is to use intrinsic rewards to encourage agents to explore their environment. However, recent intrinsic exploration methods often use state-based novelty measures which reward low-level exploration and may not scale to domains requiring more abstract skills. Instead, we explore natural la…

    Submitted 21 November, 2022; v1 submitted 17 February, 2022; originally announced February 2022.

    Comments: NeurIPS 2022

  13. arXiv:2201.05966

    cs.CL

    UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

    Authors: Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu

    Abstract: Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases. Since the inputs and outputs of SKG tasks are heterogeneous, they have been studied separately by different communities, which limits systematic and compatible research on SKG. In this paper, we overcome this limitation…

    Submitted 18 October, 2022; v1 submitted 15 January, 2022; originally announced January 2022.

    Comments: EMNLP 2022

  14. arXiv:2110.10661

    cs.CL cs.AI cs.LG

    SILG: The Multi-environment Symbolic Interactive Language Grounding Benchmark

    Authors: Victor Zhong, Austin W. Hanjie, Sida I. Wang, Karthik Narasimhan, Luke Zettlemoyer

    Abstract: Existing work in language grounding typically studies single environments. How do we build unified models that apply across multiple environments? We propose the multi-environment Symbolic Interactive Language Grounding benchmark (SILG), which unifies a collection of diverse grounded language learning environments under a common interface. SILG consists of grid-world environments that require genera…

    Submitted 24 January, 2022; v1 submitted 20 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021. 14 pages, 8 figures

  15. arXiv:2105.08206

    cs.CL

    LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer

    Authors: Machel Reid, Victor Zhong

    Abstract: Many types of text style transfer can be achieved with only small, precise edits (e.g. sentiment transfer from "I had a terrible time..." to "I had a great time..."). We propose a coarse-to-fine editor for style transfer that transforms text using Levenshtein edit operations (e.g. insert, replace, delete). Unlike prior single-span edit methods, our method concurrently edits multiple spans in the sourc…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: ACL-IJCNLP 2021 (Findings)

  16. arXiv:2101.07393

    cs.CL cs.AI cs.LG

    Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning

    Authors: Austin W. Hanjie, Victor Zhong, Karthik Narasimhan

    Abstract: We investigate the use of natural language to drive the generalization of control policies and introduce the new multi-task environment Messenger with free-form text manuals describing the environment dynamics. Unlike previous work, Messenger does not assume prior knowledge connecting text and state observations: the control policy must simultaneously ground the game manual to entity symbols an…

    Submitted 11 June, 2021; v1 submitted 18 January, 2021; originally announced January 2021.

    Comments: Accepted to ICML 2021. Note author list and name changes from previous version

  17. arXiv:2009.07396

    cs.CL cs.AI cs.DB cs.LG

    Grounded Adaptation for Zero-shot Executable Semantic Parsing

    Authors: Victor Zhong, Mike Lewis, Sida I. Wang, Luke Zettlemoyer

    Abstract: We propose Grounded Adaptation for Zero-shot Executable Semantic Parsing (GAZP) to adapt an existing semantic parser to new environments (e.g. new database schemas). GAZP combines a forward semantic parser with a backward utterance generator to synthesize data (e.g. utterances and SQL queries) in the new environment, then selects cycle-consistent examples to adapt the parser. Unlike data-augmentat…

    Submitted 1 February, 2021; v1 submitted 15 September, 2020; originally announced September 2020.

    Comments: EMNLP 2020 long paper. 14 pages, 5 figures

  18. arXiv:1910.08210

    cs.CL cs.AI cs.LG

    RTFM: Generalising to Novel Environment Dynamics via Reading

    Authors: Victor Zhong, Tim Rocktäschel, Edward Grefenstette

    Abstract: Obtaining policies that can generalise to new environments in reinforcement learning is challenging. In this work, we demonstrate that language understanding via a reading policy learner is a promising vehicle for generalisation to new environments. We propose a grounded policy learning problem, Read to Fight Monsters (RTFM), in which the agent must jointly reason over a language goal, relevant dy…

    Submitted 1 February, 2021; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: ICLR 2020; 17 pages, 13 figures

  19. arXiv:1906.05373

    cs.CL cs.AI cs.LG

    E3: Entailment-driven Extracting and Editing for Conversational Machine Reading

    Authors: Victor Zhong, Luke Zettlemoyer

    Abstract: Conversational machine reading systems help users answer high-level questions (e.g. determine if they qualify for particular government benefits) when they do not know the exact rules by which the determination is made (e.g. whether they need certain income levels or veteran status). The key challenge is that these rules are only provided in the form of a procedural text (e.g. guidelines from gover…

    Submitted 13 February, 2020; v1 submitted 12 June, 2019; originally announced June 2019.

    Comments: Published at the Annual Meeting of the Association for Computational Linguistics (ACL) 2019. Source code: https://github.com/vzhong/e3. 10 pages, 5 figures

  20. arXiv:1906.02916

    cs.CL cs.AI

    Multi-hop Reading Comprehension through Question Decomposition and Rescoring

    Authors: Sewon Min, Victor Zhong, Luke Zettlemoyer, Hannaneh Hajishirzi

    Abstract: Multi-hop Reading Comprehension (RC) requires reasoning and aggregation across several paragraphs. We propose a system for multi-hop RC that decomposes a compositional question into simpler sub-questions that can be answered by off-the-shelf single-hop RC models. Since annotations for such decomposition are expensive, we recast sub-question generation as a span prediction problem and show that our…

    Submitted 30 June, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

    Comments: Published as a conference paper at ACL 2019 (long). Code available at https://github.com/shmsw25/DecompRC

  21. arXiv:1901.00603

    cs.CL cs.AI

    Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering

    Authors: Victor Zhong, Caiming Xiong, Nitish Shirish Keskar, Richard Socher

    Abstract: End-to-end neural models have made significant progress in question answering; however, recent studies show that these models implicitly assume that the answer and evidence appear close together in a single document. In this work, we propose the Coarse-grain Fine-grain Coattention Network (CFC), a new question answering model that combines information from evidence across multiple documents. The CF…

    Submitted 13 May, 2019; v1 submitted 2 January, 2019; originally announced January 2019.

    Comments: ICLR 2019; 9 pages, 7 figures

  22. arXiv:1805.09655

    cs.CL cs.AI

    Global-Locally Self-Attentive Dialogue State Tracker

    Authors: Victor Zhong, Caiming Xiong, Richard Socher

    Abstract: Dialogue state tracking, which estimates user goals and requests given the dialogue context, is an essential part of task-oriented dialogue systems. In this paper, we propose the Global-Locally Self-Attentive Dialogue State Tracker (GLAD), which learns representations of the user utterance and previous system actions with global-local modules. Our model uses global modules to share parameters betw…

    Submitted 6 September, 2018; v1 submitted 19 May, 2018; originally announced May 2018.

    Comments: ACL 2018. 10 pages, 5 figures. Source code: https://github.com/salesforce/glad

  23. arXiv:1805.08092

    cs.CL

    Efficient and Robust Question Answering from Minimal Context over Documents

    Authors: Sewon Min, Victor Zhong, Richard Socher, Caiming Xiong

    Abstract: Neural models for question answering (QA) over documents have achieved significant performance improvements. Although effective, these models do not scale to large corpora due to their complex modeling of interactions between the document and the question. Moreover, recent work has shown that such models are sensitive to adversarial inputs. In this paper, we study the minimal context required to a…

    Submitted 21 May, 2018; originally announced May 2018.

    Comments: Published as a conference paper at ACL 2018 (long paper)

  24. arXiv:1711.00106

    cs.CL cs.AI

    DCN+: Mixed Objective and Deep Residual Coattention for Question Answering

    Authors: Caiming Xiong, Victor Zhong, Richard Socher

    Abstract: Traditional models for question answering optimize using cross entropy loss, which encourages exact answers at the cost of penalizing nearby or overlapping answers that are sometimes equally accurate. We propose a mixed objective that combines cross entropy loss with self-critical policy learning. The objective uses rewards derived from word overlap to solve the misalignment between evaluation met…

    Submitted 10 November, 2017; v1 submitted 31 October, 2017; originally announced November 2017.

    Comments: 10 pages, 6 figures

  25. arXiv:1709.00103

    cs.CL cs.AI

    Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning

    Authors: Victor Zhong, Caiming Xiong, Richard Socher

    Abstract: A significant amount of the world's knowledge is stored in relational databases. However, the ability for users to retrieve facts from a database is limited due to a lack of understanding of query languages such as SQL. We propose Seq2SQL, a deep neural network for translating natural language questions to corresponding SQL queries. Our model leverages the structure of SQL queries to significantly…

    Submitted 9 November, 2017; v1 submitted 31 August, 2017; originally announced September 2017.

    Comments: 12 pages, 5 figures

  26. arXiv:1611.01604

    cs.CL cs.AI

    Dynamic Coattention Networks For Question Answering

    Authors: Caiming Xiong, Victor Zhong, Richard Socher

    Abstract: Several deep learning models have been proposed for question answering. However, due to their single-pass nature, they have no way to recover from local maxima corresponding to incorrect answers. To address this problem, we introduce the Dynamic Coattention Network (DCN) for question answering. The DCN first fuses co-dependent representations of the question and the document in order to focus on r…

    Submitted 6 March, 2018; v1 submitted 5 November, 2016; originally announced November 2016.

    Comments: 14 pages, 7 figures, International Conference on Learning Representations 2017

  27. arXiv:1506.07285

    cs.CL cs.LG cs.NE

    Ask Me Anything: Dynamic Memory Networks for Natural Language Processing

    Authors: Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, Richard Socher

    Abstract: Most tasks in natural language processing can be cast into question answering (QA) problems over language input. We introduce the dynamic memory network (DMN), a neural network architecture which processes input sequences and questions, forms episodic memories, and generates relevant answers. Questions trigger an iterative attention process which allows the model to condition its attention on the…

    Submitted 5 March, 2016; v1 submitted 24 June, 2015; originally announced June 2015.