Showing 1–22 of 22 results for author: Ni, A

Searching in archive cs.
  1. arXiv:2407.10956

    cs.AI cs.CL

    Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?

    Authors: Ruisheng Cao, Fangyu Lei, Haoyuan Wu, Jixuan Chen, Yeqiao Fu, Hongcheng Gao, Xinzhuang Xiong, Hanchong Zhang, Yuchen Mao, Wenjing Hu, Tianbao Xie, Hongshen Xu, Danyang Zhang, Sida Wang, Ruoxi Sun, Pengcheng Yin, Caiming Xiong, Ansong Ni, Qian Liu, Victor Zhong, Lu Chen, Kai Yu, Tao Yu

    Abstract: Data science and engineering workflows often span multiple stages, from warehousing to orchestration, using tools like BigQuery, dbt, and Airbyte. As vision language models (VLMs) advance in multimodal understanding and code generation, VLM-based agents could potentially automate these workflows by generating SQL queries, Python code, and GUI operations. This automation can improve the productivit…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 34 pages, 14 figures, 10 tables

  2. arXiv:2406.15976

    cs.NE

    Effective Adaptive Mutation Rates for Program Synthesis

    Authors: Andrew Ni, Lee Spector

    Abstract: The problem-solving performance of many evolutionary algorithms, including genetic programming systems used for program synthesis, depends on the values of hyperparameters including mutation rates. The mutation method used to produce some of the best results to date on software synthesis benchmark problems, Uniform Mutation by Addition and Deletion (UMAD), adds new genes into a genome at a predete…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: 12 pages, 4 figures. Accepted at GECCO'24

  3. arXiv:2404.14662

    cs.LG cs.CL cs.PL cs.SE

    NExT: Teaching Large Language Models to Reason about Code Execution

    Authors: Ansong Ni, Miltiadis Allamanis, Arman Cohan, Yinlin Deng, Kensen Shi, Charles Sutton, Pengcheng Yin

    Abstract: A fundamental skill among human developers is the ability to understand and reason about program execution. As an example, a programmer can mentally simulate code execution in natural language to debug and repair code (a.k.a. rubber duck debugging). However, large language models (LLMs) of code are typically trained on the surface textual form of programs, and thus may lack a semantic understanding of h…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: 35 pages

  4. arXiv:2404.12750

    cs.NE

    Leveraging Symbolic Regression for Heuristic Design in the Traveling Thief Problem

    Authors: Andrew Ni, Lee Spector

    Abstract: The Traveling Thief Problem is an NP-hard combination of the well-known traveling salesman and knapsack packing problems. In this paper, we use symbolic regression to learn useful features of near-optimal packing plans, which we then use to design efficient metaheuristic genetic algorithms for the traveling thief algorithm. By using symbolic regression again to initialize the metaheuristic GA with…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 23 pages

  5. arXiv:2403.04811

    cs.SE cs.CL cs.LG

    Quantifying Contamination in Evaluating Code Generation Capabilities of Language Models

    Authors: Martin Riddell, Ansong Ni, Arman Cohan

    Abstract: While large language models have achieved remarkable performance on various code generation benchmarks, there have been growing concerns regarding potential contamination of these benchmarks as they may be leaked into pretraining and finetuning data. While recent work has investigated contamination in natural language generation and understanding tasks, there has been less extensive research into…

    Submitted 6 March, 2024; originally announced March 2024.

  6. arXiv:2401.12424

    cs.NE cs.LG

    DALex: Lexicase-like Selection via Diverse Aggregation

    Authors: Andrew Ni, Li Ding, Lee Spector

    Abstract: Lexicase selection has been shown to provide advantages over other selection algorithms in several areas of evolutionary computation and machine learning. In its standard form, lexicase selection filters a population or other collection based on randomly ordered training cases that are considered one at a time. This iterated filtering process can be time-consuming, particularly in settings with la…

    Submitted 8 February, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 15 pages, 4 figures. Accepted at EuroGP'24

  7. arXiv:2309.17446

    cs.CL cs.LG cs.PL cs.SE

    L2CEval: Evaluating Language-to-Code Generation Capabilities of Large Language Models

    Authors: Ansong Ni, Pengcheng Yin, Yilun Zhao, Martin Riddell, Troy Feng, Rui Shen, Stephen Yin, Ye Liu, Semih Yavuz, Caiming Xiong, Shafiq Joty, Yingbo Zhou, Dragomir Radev, Arman Cohan

    Abstract: Recently, large language models (LLMs), especially those that are pretrained on code, have demonstrated strong capabilities in generating programs from natural language inputs in a few-shot or even zero-shot manner. Despite promising results, there is a notable lack of a comprehensive evaluation of these models' language-to-code generation capabilities. Existing studies often focus on specific task…

    Submitted 2 October, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: Project Website: https://l2c-eval.github.io/

  8. arXiv:2302.08468

    cs.LG cs.CL cs.PL cs.SE

    LEVER: Learning to Verify Language-to-Code Generation with Execution

    Authors: Ansong Ni, Srini Iyer, Dragomir Radev, Ves Stoyanov, Wen-tau Yih, Sida I. Wang, Xi Victoria Lin

    Abstract: The advent of large language models trained on code (code LLMs) has led to significant progress in language-to-code generation. State-of-the-art approaches in this area combine LLM decoding with sample pruning and reranking using test cases or heuristics based on the execution results. However, it is challenging to obtain test cases for many real-world language-to-code applications, and heuristics…

    Submitted 1 September, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: ICML'23; code available at https://github.com/niansong1996/lever

  9. arXiv:2211.16740

    cs.CL

    Explicit Knowledge Transfer for Weakly-Supervised Code Generation

    Authors: Zhangir Azerbayev, Ansong Ni, Hailey Schoelkopf, Dragomir Radev

    Abstract: Large language models (LLMs) can acquire strong code-generation capabilities through few-shot learning. In contrast, supervised fine-tuning is still needed for smaller models to achieve good performance. Such fine-tuning demands a large number of task-specific NL-code pairs, which are expensive to obtain. In this paper, we attempt to transfer the code generation ability of an LLM to a smaller mode…

    Submitted 7 June, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: Updated on Jun 7, 2023 with ICLR workshop header

  10. arXiv:2209.00840

    cs.CL

    FOLIO: Natural Language Reasoning with First-Order Logic

    Authors: Simeng Han, Hailey Schoelkopf, Yilun Zhao, Zhenting Qi, Martin Riddell, Wenfei Zhou, James Coady, David Peng, Yujie Qiao, Luke Benson, Lucy Sun, Alex Wardle-Solano, Hannah Szabo, Ekaterina Zubova, Matthew Burtell, Jonathan Fan, Yixin Liu, Brian Wong, Malcolm Sailor, Ansong Ni, Linyong Nan, Jungo Kasai, Tao Yu, Rui Zhang, Alexander R. Fabbri, et al. (10 additional authors not shown)

    Abstract: Large language models (LLMs) have achieved remarkable performance on a variety of natural language understanding tasks. However, existing benchmarks are inadequate in measuring the complex logical reasoning capabilities of a model. We present FOLIO, a human-annotated, logically complex and diverse dataset for reasoning in natural language (NL), equipped with first-order logic (FOL) annotations. FO…

    Submitted 11 October, 2024; v1 submitted 2 September, 2022; originally announced September 2022.

  11. arXiv:2205.14318

    cs.LG cs.PL

    Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions

    Authors: Ansong Ni, Jeevana Priya Inala, Chenglong Wang, Oleksandr Polozov, Christopher Meek, Dragomir Radev, Jianfeng Gao

    Abstract: Pretrained language models have shown superior performance on many natural language processing tasks, yet they still struggle at multi-step formal reasoning tasks like grade school math problems. One key challenge of finetuning them to solve such math reasoning problems is that many existing datasets only contain one reference solution for each problem, despite the fact that there are often altern…

    Submitted 17 February, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: Accepted to ICLR 2023

  12. arXiv:2205.12476

    cs.CL

    Leveraging Locality in Abstractive Text Summarization

    Authors: Yixin Liu, Ansong Ni, Linyong Nan, Budhaditya Deb, Chenguang Zhu, Ahmed H. Awadallah, Dragomir Radev

    Abstract: Neural attention models have achieved significant improvements on many natural language processing tasks. However, the quadratic memory complexity of the self-attention module with respect to the input length hinders their applications in long text summarization. Instead of designing more efficient attention modules, we approach this problem by investigating if models with a restricted context can…

    Submitted 30 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: Accepted to EMNLP 2022

  13. arXiv:2201.05966

    cs.CL

    UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

    Authors: Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu

    Abstract: Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases. Since the inputs and outputs of SKG tasks are heterogeneous, they have been studied separately by different communities, which limits systematic and compatible research on SKG. In this paper, we overcome this limitation…

    Submitted 18 October, 2022; v1 submitted 15 January, 2022; originally announced January 2022.

    Comments: EMNLP 2022

  14. arXiv:2110.10150

    cs.CL

    Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents

    Authors: Yusen Zhang, Ansong Ni, Ziming Mao, Chen Henry Wu, Chenguang Zhu, Budhaditya Deb, Ahmed H. Awadallah, Dragomir Radev, Rui Zhang

    Abstract: Text summarization helps readers capture salient information from documents, news, interviews, and meetings. However, most state-of-the-art pretrained language models (LM) are unable to efficiently process long text for many summarization tasks. In this paper, we propose Summ$^N$, a simple, flexible, and effective multi-stage framework for input texts that are longer than the maximum context lengt…

    Submitted 13 April, 2022; v1 submitted 16 October, 2021; originally announced October 2021.

    Comments: ACL 2022

  15. arXiv:2110.08168

    cs.CL

    DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization

    Authors: Ziming Mao, Chen Henry Wu, Ansong Ni, Yusen Zhang, Rui Zhang, Tao Yu, Budhaditya Deb, Chenguang Zhu, Ahmed H. Awadallah, Dragomir Radev

    Abstract: Transformer-based models have achieved state-of-the-art performance on short-input summarization. However, they still struggle with summarizing longer text. In this paper, we present DYLE, a novel dynamic latent extraction approach for abstractive long-input summarization. DYLE jointly trains an extractor and a generator and treats the extracted text snippets as the latent variable, allowing dynam…

    Submitted 24 April, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ACL 2022

  16. arXiv:2109.04609

    cs.CL

    An Exploratory Study on Long Dialogue Summarization: What Works and What's Next

    Authors: Yusen Zhang, Ansong Ni, Tao Yu, Rui Zhang, Chenguang Zhu, Budhaditya Deb, Asli Celikyilmaz, Ahmed Hassan Awadallah, Dragomir Radev

    Abstract: Dialogue summarization helps readers capture salient information from long conversations in meetings, interviews, and TV series. However, real-world dialogues pose a great challenge to current summarization models, as the dialogue length typically exceeds the input limits imposed by recent transformer-based pre-trained models, and the interactive nature of dialogues makes relevant information more…

    Submitted 9 September, 2021; originally announced September 2021.

    Comments: Findings of EMNLP 2021

  17. arXiv:2108.12738

    cs.CL

    SummerTime: Text Summarization Toolkit for Non-experts

    Authors: Ansong Ni, Zhangir Azerbayev, Mutethia Mutuma, Troy Feng, Yusen Zhang, Tao Yu, Ahmed Hassan Awadallah, Dragomir Radev

    Abstract: Recent advances in summarization provide models that can generate summaries of higher quality. Such models now exist for a number of summarization tasks, including query-based summarization, dialogue summarization, and multi-document summarization. While such models and tasks are rapidly growing in the research field, it has also become challenging for non-experts to keep track of them. To make su…

    Submitted 10 September, 2021; v1 submitted 28 August, 2021; originally announced August 2021.

    Comments: EMNLP 2021 Demo Track

  18. arXiv:2103.12235

    cs.CL

    Mitigating False-Negative Contexts in Multi-document Question Answering with Retrieval Marginalization

    Authors: Ansong Ni, Matt Gardner, Pradeep Dasigi

    Abstract: Question Answering (QA) tasks requiring information from multiple documents often rely on a retrieval model to identify relevant information for reasoning. The retrieval model is typically trained to maximize the likelihood of the labeled supporting evidence. However, when retrieving from large text corpora such as Wikipedia, the correct answer can often be obtained from multiple evidence candidat…

    Submitted 8 September, 2021; v1 submitted 22 March, 2021; originally announced March 2021.

    Comments: Accepted to EMNLP 2021 (main conference)

  19. arXiv:2102.06726

    cs.SE

    SOAR: A Synthesis Approach for Data Science API Refactoring

    Authors: Ansong Ni, Daniel Ramos, Aidan Yang, Inês Lynce, Vasco Manquinho, Ruben Martins, Claire Le Goues

    Abstract: With the growth of the open-source data science community, both the number of data science libraries and the number of versions for the same library are increasing rapidly. To match the evolving APIs from those libraries, open-source organizations often have to exert manual effort to refactor the APIs used in the code base. Moreover, due to the abundance of similar open-source libraries, data scie…

    Submitted 12 February, 2021; originally announced February 2021.

  20. arXiv:1911.12986

    cs.CL cs.LG

    Merging Weak and Active Supervision for Semantic Parsing

    Authors: Ansong Ni, Pengcheng Yin, Graham Neubig

    Abstract: A semantic parser maps natural language commands (NLs) from the users to executable meaning representations (MRs), which are later executed in a certain environment to obtain user-desired results. The fully-supervised training of such a parser requires NL/MR pairs, annotated by domain experts, which makes them expensive to collect. However, weakly-supervised semantic parsers are learnt only from pairs…

    Submitted 29 November, 2019; originally announced November 2019.

    Comments: AAAI 2020 Main Track [Oral] (To appear)

  21. arXiv:1905.04561

    math.OC cs.LG

    Linear Range in Gradient Descent

    Authors: Angxiu Ni, Chaitanya Talnikar

    Abstract: This paper defines linear range as the range of parameter perturbations which lead to approximately linear perturbations in the states of a network. We compute linear range from the difference between actual perturbations in states and the tangent solution. Linear range is a new criterion for estimating the effectiveness of gradients and thus has many possible applications. In particular, we pro…

    Submitted 23 May, 2019; v1 submitted 11 May, 2019; originally announced May 2019.

    Comments: 9 pages, 4 figures

  22. arXiv:1611.00880

    physics.comp-ph cs.CE math.NA nlin.CD

    Sensitivity analysis on chaotic dynamical system by Non-Intrusive Least Square Shadowing (NILSS)

    Authors: Angxiu Ni, Qiqi Wang

    Abstract: This paper develops the non-intrusive formulation of the Least-squares shadowing (LSS) method, for computing the sensitivity of long-time averaged objectives in chaotic dynamical systems. This non-intrusive formulation constrains the computation to only the unstable subspace, greatly reducing the cost of LSS for many problems; moreover, it reparametrizes the LSS problem, requiring only minor modif…

    Submitted 21 June, 2019; v1 submitted 3 November, 2016; originally announced November 2016.

    Comments: 26 pages, 10 figures

    Journal ref: Journal of Computational Physics, Volume 347, Page 56-77, 2017