Showing 1–50 of 54 results for author: Sundaresan, N

  1. arXiv:2503.07832  [pdf, other]

    cs.AI cs.CL cs.LG cs.SE

    RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code

    Authors: Dhruv Gautam, Spandan Garg, Jinu Jang, Neel Sundaresan, Roshanak Zilouchian Moghaddam

    Abstract: Recent advances in language model (LM) agents and function calling have enabled autonomous, feedback-driven systems to solve problems across various digital domains. To better understand the unique limitations of LM agents, we introduce RefactorBench, a benchmark consisting of 100 large handcrafted multi-file refactoring tasks in popular open-source repositories. Solving tasks within RefactorBench…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: ICLR 2025 Camera Ready

    ACM Class: I.2.5

  2. arXiv:2502.15034  [pdf, other]

    quant-ph

    Randomized benchmarking of a high-fidelity remote CNOT gate over a meter-scale microwave interconnect

    Authors: Kentaro Heya, Timothy Phung, Moein Malekakhlagh, Rachel Steiner, Marco Turchetti, William Shanks, John Mamin, Wen-Sen Lu, Yadav Prasad Kandel, Neereja Sundaresan, Jason Orcutt

    Abstract: In the modular superconducting quantum processor architecture, high-fidelity, meter-scale microwave interconnect between processor modules is a key technology for extending system size beyond constraints imposed by device manufacturing equipment, yield, and signal delivery. While there have been many demonstrations of remote state transfer between modules, these relied on tomographic experiments f…

    Submitted 20 February, 2025; originally announced February 2025.

  3. arXiv:2412.14308   

    cs.SE cs.LG

    Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation

    Authors: Benjamin Steenhoek, Michele Tufano, Neel Sundaresan, Alexey Svyatkovskiy

    Abstract: Software testing is a crucial but time-consuming aspect of software development, and recently, Large Language Models (LLMs) have gained popularity for automated test case generation. However, because LLMs are trained on vast amounts of open-source code, they often generate test cases that do not adhere to best practices and may even contain test smells (anti-patterns). To address this issue, we pr…

    Submitted 6 January, 2025; v1 submitted 18 December, 2024; originally announced December 2024.

    Comments: This work was intended as a replacement of arXiv:2310.02368 and any subsequent updates will appear there

  4. arXiv:2409.04634  [pdf, other]

    quant-ph

    Mechanically-intermixed indium superconducting connections for microwave quantum interconnects

    Authors: Yves Martin, Neereja Sundaresan, Jae-woong Nah, Rachel Steiner, Marco Turchetti, Kevin Stawiasz, Chi Xiong, Jason S. Orcutt

    Abstract: Superconducting coaxial cables represent critical communication channels for interconnecting superconducting quantum processors. Here, we report mechanically-intermixed indium joins to aluminum coaxial cables for low loss quantum interconnects. We describe an ABCD matrix formalism to characterize the total resonator internal quality factor ($Q_i$) and any contact ($R_{cont}$) or shunt resistance (…

    Submitted 6 September, 2024; originally announced September 2024.

    Comments: 6 pages, 5 figures

  5. arXiv:2404.08885  [pdf, other]

    cs.PL cs.CL cs.LG

    Is Next Token Prediction Sufficient for GPT? Exploration on Code Logic Comprehension

    Authors: Mengnan Qi, Yufan Huang, Yongqiang Yao, Maoquan Wang, Bin Gu, Neel Sundaresan

    Abstract: Large language models (LLMs) have experienced exponential growth and demonstrate remarkable performance across various tasks. Notwithstanding, contemporary research primarily centers on enhancing the size and quality of pretraining data, still utilizing the next token prediction task on autoregressive transformer model structure. The efficacy of this task in truly facilitating the model's compreh…

    Submitted 12 April, 2024; originally announced April 2024.

  6. arXiv:2403.08299  [pdf, other]

    cs.SE cs.AI

    AutoDev: Automated AI-Driven Development

    Authors: Michele Tufano, Anisha Agarwal, Jinu Jang, Roshanak Zilouchian Moghaddam, Neel Sundaresan

    Abstract: The landscape of software development has witnessed a paradigm shift with the advent of AI-powered assistants, exemplified by GitHub Copilot. However, existing solutions are not leveraging all the potential capabilities available in an IDE such as building, testing, executing code, git operations, etc. Therefore, they are constrained by their limited capabilities, primarily focusing on suggesting…

    Submitted 13 March, 2024; originally announced March 2024.

  7. arXiv:2402.14261  [pdf, other]

    cs.SE cs.AI

    Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming

    Authors: Anisha Agarwal, Aaron Chan, Shubham Chandel, Jinu Jang, Shaun Miller, Roshanak Zilouchian Moghaddam, Yevhen Mohylevskyy, Neel Sundaresan, Michele Tufano

    Abstract: The integration of Large Language Models (LLMs) into Development Environments (IDEs) has become a focal point in modern software development. LLMs such as OpenAI GPT-3.5/4 and Code Llama offer the potential to significantly augment developer productivity by serving as intelligent, chat-driven programming assistants. However, utilizing LLMs out of the box is unlikely to be optimal for any given sce…

    Submitted 21 February, 2024; originally announced February 2024.

  8. arXiv:2401.09663  [pdf, other]

    quant-ph

    Enhanced Quantum State Transfer and Bell State Generation over Long-Range Multimode Interconnects via Superadiabatic Transitionless Driving

    Authors: Moein Malekakhlagh, Timothy Phung, Daniel Puzzuoli, Kentaro Heya, Neereja Sundaresan, Jason Orcutt

    Abstract: Achieving high-fidelity direct two-qubit gates over meter-scale long quantum interconnects is challenging in part due to the multimode nature of such systems. One alternative scheme is to combine local operations with remote quantum state transfer or remote entanglement. Here, we study quantum state transfer and entanglement generation for two distant qubits, equipped with tunable interactions, ov…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: 14 pages, 12 figures, 4 appendices

  9. arXiv:2312.11508  [pdf, other]

    cs.CL cs.AI

    Rethinking the Instruction Quality: LIFT is What You Need

    Authors: Yang Xu, Yongqiang Yao, Yufan Huang, Mengnan Qi, Maoquan Wang, Bin Gu, Neel Sundaresan

    Abstract: Instruction tuning, a specialized technique to enhance large language model (LLM) performance via instruction datasets, relies heavily on the quality of employed data. Existing quality improvement methods alter instruction data through dataset expansion or curation. However, the expansion method risks data redundancy, potentially compromising LLM performance, while the curation approach confines t…

    Submitted 27 December, 2023; v1 submitted 11 December, 2023; originally announced December 2023.

  10. arXiv:2310.14209  [pdf, other]

    cs.SE cs.LG

    SUT: Active Defects Probing for Transcompiler Models

    Authors: Mengnan Qi, Yufan Huang, Maoquan Wang, Yongqiang Yao, Zihan Liu, Bin Gu, Colin Clement, Neel Sundaresan

    Abstract: Automatic program translation has enormous application value and hence has been attracting significant interest from AI researchers. However, we observe that current program translation models still make elementary syntax errors, particularly when the target language does not have syntax elements in the source language. Metrics like BLEU, CodeBLEU and computation accuracy may not expose these iss…

    Submitted 22 October, 2023; originally announced October 2023.

  11. arXiv:2310.11476  [pdf, other]

    cs.SE cs.LG

    Program Translation via Code Distillation

    Authors: Yufan Huang, Mengnan Qi, Yongqiang Yao, Maoquan Wang, Bin Gu, Colin Clement, Neel Sundaresan

    Abstract: Software version migration and program translation are an important and costly part of the lifecycle of large codebases. Traditional machine translation relies on parallel corpora for supervised translation, which is not feasible for program translation due to a dearth of aligned data. Recent unsupervised neural machine translation techniques have overcome data limitations by including techniques s…

    Submitted 17 October, 2023; originally announced October 2023.

  12. arXiv:2310.02368  [pdf, other]

    cs.SE cs.LG

    Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation

    Authors: Benjamin Steenhoek, Michele Tufano, Neel Sundaresan, Alexey Svyatkovskiy

    Abstract: Software testing is a crucial aspect of software development, and the creation of high-quality tests that adhere to best practices is essential for effective maintenance. Recently, Large Language Models (LLMs) have gained popularity for code generation, including the automated creation of test cases. However, these LLMs are often trained on vast amounts of publicly available code, which may includ…

    Submitted 6 January, 2025; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: Accepted to DeepTest 2025 (ICSE Workshop). Previously this version appeared as arXiv:2412.14308 which was submitted as a new work by accident

  13. arXiv:2307.13383  [pdf, other]

    cs.SE cs.AI

    Predicting Code Coverage without Execution

    Authors: Michele Tufano, Shubham Chandel, Anisha Agarwal, Neel Sundaresan, Colin Clement

    Abstract: Code coverage is a widely used metric for quantifying the extent to which program elements, such as statements or branches, are executed during testing. Calculating code coverage is resource-intensive, requiring code building and execution with additional overhead for the instrumentation. Furthermore, computing coverage of any snippet of code requires the whole program context. Using Machine Learn…

    Submitted 25 July, 2023; originally announced July 2023.

  14. arXiv:2306.17077  [pdf, other]

    cs.SE cs.AI

    RAPGen: An Approach for Fixing Code Inefficiencies in Zero-Shot

    Authors: Spandan Garg, Roshanak Zilouchian Moghaddam, Neel Sundaresan

    Abstract: Performance bugs are non-functional bugs that can even manifest in well-tested commercial products. Fixing these performance bugs is an important yet challenging problem. In this work, we address this challenge and present a new approach called Retrieval-Augmented Prompt Generation (RAPGen). Given a code snippet with a performance issue, RAPGen first retrieves a prompt instruction from a pre-const…

    Submitted 8 January, 2025; v1 submitted 29 June, 2023; originally announced June 2023.

  15. arXiv:2306.01754  [pdf, other]

    cs.CR cs.AI cs.LG

    Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning?

    Authors: Aaron Chan, Anant Kharkar, Roshanak Zilouchian Moghaddam, Yevhen Mohylevskyy, Alec Helyar, Eslam Kamal, Mohamed Elkamhawy, Neel Sundaresan

    Abstract: Software vulnerabilities bear enterprises significant costs. Despite extensive efforts in research and development of software vulnerability detection methods, uncaught vulnerabilities continue to put software owners and users at risk. Many current vulnerability detection methods require that code snippets can compile and build before attempting detection. This, unfortunately, introduces a long la…

    Submitted 22 May, 2023; originally announced June 2023.

  16. Encoding a magic state with beyond break-even fidelity

    Authors: Riddhi S. Gupta, Neereja Sundaresan, Thomas Alexander, Christopher J. Wood, Seth T. Merkel, Michael B. Healy, Marius Hillenbrand, Tomas Jochym-O'Connor, James R. Wootton, Theodore J. Yoder, Andrew W. Cross, Maika Takita, Benjamin J. Brown

    Abstract: To run large-scale algorithms on a quantum computer, error-correcting codes must be able to perform a fundamental set of operations, called logic gates, while isolating the encoded information from noise. We can complete a universal set…

    Submitted 13 March, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: 19 pages, 13 figures, 3 tables, comments welcome; v2 - Updated draft including new appendices following peer review. Includes a section on injecting the encoded magic state into larger codes (explicitly studying the surface code, the heavy-hex code and the color code) and a numerical section interrogating the fault-tolerant properties of the circuit

    Journal ref: Nature 625, 259 (2024)

  17. arXiv:2305.05383  [pdf, other]

    cs.PL cs.AI cs.CL cs.SE

    Code Execution with Pre-trained Language Models

    Authors: Chenxiao Liu, Shuai Lu, Weizhu Chen, Daxin Jiang, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan, Nan Duan

    Abstract: Code execution is a fundamental aspect of programming language semantics that reflects the exact behavior of the code. However, most pre-trained models for code intelligence ignore the execution trace and only rely on source code and syntactic structures. In this paper, we investigate how well pre-trained models can understand and perform code execution. We develop a mutation-based data augmentati…

    Submitted 8 May, 2023; originally announced May 2023.

    Comments: Accepted to the Findings of ACL 2023

  18. arXiv:2303.07263  [pdf, other]

    cs.SE

    InferFix: End-to-End Program Repair with LLMs

    Authors: Matthew Jin, Syed Shahriar, Michele Tufano, Xin Shi, Shuai Lu, Neel Sundaresan, Alexey Svyatkovskiy

    Abstract: Software development life cycle is profoundly influenced by bugs: their introduction, identification, and eventual resolution account for a significant portion of software cost. This has motivated software engineering researchers and practitioners to propose different approaches for automating the identification and repair of software defects. Large language models have been adapted to the program…

    Submitted 13 March, 2023; originally announced March 2023.

  19. arXiv:2208.13928  [pdf, other]

    cs.SE cs.CL cs.LG

    Exploring and Evaluating Personalized Models for Code Generation

    Authors: Andrei Zlotchevski, Dawn Drain, Alexey Svyatkovskiy, Colin Clement, Neel Sundaresan, Michele Tufano

    Abstract: Large Transformer models achieved the state-of-the-art status for Natural Language Understanding tasks and are increasingly becoming the baseline model architecture for modeling source code. Transformers are usually pre-trained on large unsupervised corpora, learning token representations and transformations relevant to modeling generally available text, and are then fine-tuned on a particular dow…

    Submitted 19 September, 2022; v1 submitted 29 August, 2022; originally announced August 2022.

    Comments: Accepted to the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022), Industry Track - Singapore, November 14-18, 2022, to appear 9 pages

  20. arXiv:2206.13619  [pdf, other]

    cs.SE cs.AI cs.PF

    DeepPERF: A Deep Learning-Based Approach For Improving Software Performance

    Authors: Spandan Garg, Roshanak Zilouchian Moghaddam, Colin B. Clement, Neel Sundaresan, Chen Wu

    Abstract: Improving software performance is an important yet challenging part of the software development cycle. Today, the majority of performance inefficiencies are identified and patched by performance experts. Recent advancements in deep learning approaches and the wide-spread availability of open source data create a great opportunity to automate the identification and patching of performance problems…

    Submitted 27 June, 2022; originally announced June 2022.

  21. arXiv:2205.11023  [pdf, other]

    cs.SE cs.CL

    AdaptivePaste: Code Adaptation through Learning Semantics-aware Variable Usage Representations

    Authors: Xiaoyu Liu, Jinu Jang, Neel Sundaresan, Miltiadis Allamanis, Alexey Svyatkovskiy

    Abstract: In software development, it is common for programmers to copy-paste or port code snippets and then adapt them to their use case. This scenario motivates the code adaptation task -- a variant of program repair which aims to adapt variable identifiers in a pasted snippet of code to the surrounding, preexisting source code. However, no existing approach has been shown to effectively address this task…

    Submitted 6 October, 2023; v1 submitted 22 May, 2022; originally announced May 2022.

  22. arXiv:2204.12648  [pdf, other]

    cs.SE cs.AI cs.LG

    Generating Examples From CLI Usage: Can Transformers Help?

    Authors: Roshanak Zilouchian Moghaddam, Spandan Garg, Colin B. Clement, Yevhen Mohylevskyy, Neel Sundaresan

    Abstract: Continuous evolution in modern software often causes documentation, tutorials, and examples to be out of sync with changing interfaces and frameworks. Relying on outdated documentation and examples can lead programs to fail or be less efficient or even less secure. In response, programmers need to regularly turn to other resources on the web such as StackOverflow for examples to guide them in writ…

    Submitted 26 April, 2022; originally announced April 2022.

  23. Methods2Test: A dataset of focal methods mapped to test cases

    Authors: Michele Tufano, Shao Kun Deng, Neel Sundaresan, Alexey Svyatkovskiy

    Abstract: Unit testing is an essential part of the software development process, which helps to identify issues with source code in early stages of development and prevent regressions. Machine learning has emerged as a viable approach to help software developers generate automated unit tests. However, generating reliable unit test cases that are semantically correct and capable of catching software bugs or un…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Accepted for publication in the proceedings of The 2022 Mining Software Repositories Conference (MSR 2022) - Data and Tool track

  24. Learning to Reduce False Positives in Analytic Bug Detectors

    Authors: Anant Kharkar, Roshanak Zilouchian Moghaddam, Matthew Jin, Xiaoyu Liu, Xin Shi, Colin Clement, Neel Sundaresan

    Abstract: Due to increasingly complex software design and rapid iterative development, code defects and security vulnerabilities are prevalent in modern software. In response, programmers rely on static analysis tools to regularly scan their codebases and find potential bugs. In order to maximize coverage, however, these tools generally tend to report a significant number of false positives, requiring devel…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: Accepted for publication at ICSE 2022

  25. arXiv:2203.09095  [pdf, other]

    cs.SE cs.AI

    Automating Code Review Activities by Large-Scale Pre-training

    Authors: Zhiyu Li, Shuai Lu, Daya Guo, Nan Duan, Shailesh Jannu, Grant Jenks, Deep Majumder, Jared Green, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan

    Abstract: Code review is an essential part of the software development lifecycle, since it aims at guaranteeing the quality of code. Modern code review activities necessitate developers viewing, understanding and even running the programs to assess logic, functionality, latency, style and other factors. It turns out that developers have to spend far too much time reviewing the code of their peers. Accordingly,…

    Submitted 11 October, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: ESEC/FSE 2022, camera-ready version

  26. Matching and maximum likelihood decoding of a multi-round subsystem quantum error correction experiment

    Authors: Neereja Sundaresan, Theodore J. Yoder, Youngseok Kim, Muyuan Li, Edward H. Chen, Grace Harper, Ted Thorbeck, Andrew W. Cross, Antonio D. Córcoles, Maika Takita

    Abstract: Quantum error correction offers a promising path for performing quantum computations with low errors. Although a fully fault-tolerant execution of a quantum algorithm remains unrealized, recent experimental developments, along with improvements in control electronics, are enabling increasingly advanced demonstrations of the necessary operations for applying quantum error correction. Here, we perfo…

    Submitted 19 April, 2022; v1 submitted 14 March, 2022; originally announced March 2022.

    Comments: 15 pages, 6 figures, 5 tables

    Journal ref: Nat Commun 14, 2852 (2023)

  27. arXiv:2201.12901  [pdf, other]

    cs.LG cs.SE

    Training and Evaluating a Jupyter Notebook Data Science Assistant

    Authors: Shubham Chandel, Colin B. Clement, Guillermo Serrato, Neel Sundaresan

    Abstract: We study the feasibility of a Data Science assistant powered by a sequence-to-sequence transformer by training a new model JuPyT5 on all publicly available Jupyter Notebook GitHub repositories and developing a new metric: Data Science Problems (DSP). DSP is a collection of 1119 problems curated from 306 pedagogical notebooks with 92 dataset dependencies, natural language and Markdown problem descr…

    Submitted 30 January, 2022; originally announced January 2022.

  28. Calibrated decoders for experimental quantum error correction

    Authors: Edward H. Chen, Theodore J. Yoder, Youngseok Kim, Neereja Sundaresan, Srikanth Srinivasan, Muyuan Li, Antonio D. Córcoles, Andrew W. Cross, Maika Takita

    Abstract: Arbitrarily long quantum computations require quantum memories that can be repeatedly measured without being corrupted. Here, we preserve the state of a quantum memory, notably with the additional use of flagged error events. All error events were extracted using fast, mid-circuit measurements and resets of the physical qubits. Among the error decoders we considered, we introduce a perfect matchin…

    Submitted 8 October, 2021; originally announced October 2021.

    Comments: 16 pages, 14 figures, 5 tables, for peer-review

    MSC Class: 81P73 (Primary) 81P73 (Secondary) ACM Class: J.2

    Journal ref: Phys. Rev. Lett. 128, 110504 (2022)

  29. arXiv:2109.08780  [pdf, other]

    cs.LG cs.SE

    Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy

    Authors: Colin B. Clement, Shuai Lu, Xiaoyu Liu, Michele Tufano, Dawn Drain, Nan Duan, Neel Sundaresan, Alexey Svyatkovskiy

    Abstract: Statistical language modeling and translation with transformers have found many successful applications in program understanding and generation tasks, setting high benchmarks for tools in modern software development environments. The finite context window of these neural models means, however, that they will be unable to leverage the entire relevant context of large files and packages for any give…

    Submitted 17 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 camera ready

  30. Program Merge Conflict Resolution via Neural Transformers

    Authors: Alexey Svyatkovskiy, Sarah Fakhoury, Negar Ghorbani, Todd Mytkowicz, Elizabeth Dinella, Christian Bird, Jinu Jang, Neel Sundaresan, Shuvendu Lahiri

    Abstract: Collaborative software development is an integral part of the modern software development life cycle, essential to the success of large-scale software projects. When multiple developers make concurrent changes around the same lines of code, a merge conflict may occur. Such conflicts stall pull requests and continuous integration pipelines for hours to several days, seriously hurting developer prod…

    Submitted 29 November, 2022; v1 submitted 31 August, 2021; originally announced September 2021.

    Comments: ESEC/FSE '22 camera ready version. 12 pages, 4 figures, online appendix

  31. Scalable mitigation of measurement errors on quantum computers

    Authors: Paul D. Nation, Hwajung Kang, Neereja Sundaresan, Jay M. Gambetta

    Abstract: We present a method for mitigating measurement errors on quantum computing platforms that does not form the full assignment matrix, or its inverse, and works in a subspace defined by the noisy input bit-strings. This method accommodates both uncorrelated and correlated errors, and allows for computing accurate error bounds. Additionally, we detail a matrix-free preconditioned iterative solution me…

    Submitted 27 August, 2021; originally announced August 2021.

    Comments: 9 pages, 8 figures, 1 table

    Journal ref: PRX Quantum 2, 040326 (2021)

  32. arXiv:2108.03322  [pdf, other]

    cs.IR cs.LG cs.SE

    Distilling Transformers for Neural Cross-Domain Search

    Authors: Colin B. Clement, Chen Wu, Dawn Drain, Neel Sundaresan

    Abstract: Pre-trained transformers have recently clinched top spots in the gamut of natural language tasks and pioneered solutions to software engineering tasks. Even information retrieval has not been immune to the charm of the transformer, though their large size and cost is generally a barrier to deployment. While there has been much work in streamlining, caching, and modifying transformer architectures…

    Submitted 6 August, 2021; originally announced August 2021.

    Comments: 4 pages, 1 figure, emnlp formatting

  33. Quantum crosstalk cancellation for fast entangling gates and improved multi-qubit performance

    Authors: K. X. Wei, E. Magesan, I. Lauer, S. Srinivasan, D. F. Bogorin, S. Carnevale, G. A. Keefe, Y. Kim, D. Klaus, W. Landers, N. Sundaresan, C. Wang, E. J. Zhang, M. Steffen, O. E. Dial, D. C. McKay, A. Kandala

    Abstract: Quantum computers built with superconducting artificial atoms already stretch the limits of their classical counterparts. While the lowest energy states of these artificial atoms serve as the qubit basis, the higher levels are responsible for both a host of attractive gate schemes as well as generating undesired interactions. In particular, when coupling these atoms to generate entanglement, the h…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: 8 pages, 5 figures plus Supplementary Information (8 pages, 7 figures)

    Journal ref: Phys. Rev. Lett. 129, 060501 (2022)

  34. arXiv:2105.09352  [pdf, other]

    cs.SE cs.LG

    DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletons

    Authors: Dawn Drain, Colin B. Clement, Guillermo Serrato, Neel Sundaresan

    Abstract: The joint task of bug localization and program repair is an integral part of the software development process. In this work we present DeepDebug, an approach to automated debugging using large, pretrained transformers. We begin by training a bug-creation model on reversed commit data for the purpose of generating synthetic bugs. We apply these synthetic bugs toward two ends. First, we directly tra…

    Submitted 19 May, 2021; originally announced May 2021.

  35. Generating Bug-Fixes Using Pretrained Transformers

    Authors: Dawn Drain, Chen Wu, Alexey Svyatkovskiy, Neel Sundaresan

    Abstract: Detecting and fixing bugs are two of the most important yet frustrating parts of the software development cycle. Existing bug detection tools are based mainly on static analyzers, which rely on mathematical logic and symbolic reasoning about the program execution to detect common types of bugs. Fixing bugs is typically left out to the developer. In this work we introduce DeepDebug: a data-driven p…

    Submitted 28 April, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

  36. arXiv:2104.05310  [pdf, other]

    cs.IR cs.PL

    Generating Code with the Help of Retrieved Template Functions and Stack Overflow Answers

    Authors: Dawn Drain, Changran Hu, Chen Wu, Mikhail Breslav, Neel Sundaresan

    Abstract: We approach the important challenge of code autocompletion as an open-domain task, in which a sequence-to-sequence code generator model is enhanced with the ability to attend to reference code snippets supplied by a semantic code search engine. In this work, we present a novel framework to precisely retrieve template functions as well as intent-snippet pairs and effectively train such a retrieval-…

    Submitted 12 April, 2021; v1 submitted 12 April, 2021; originally announced April 2021.

    Comments: 8 pages

  37. arXiv:2102.04664  [pdf, other]

    cs.SE cs.CL

    CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

    Authors: Shuai Lu, Daya Guo, Shuo Ren, Junjie Huang, Alexey Svyatkovskiy, Ambrosio Blanco, Colin Clement, Dawn Drain, Daxin Jiang, Duyu Tang, Ge Li, Lidong Zhou, Linjun Shou, Long Zhou, Michele Tufano, Ming Gong, Ming Zhou, Nan Duan, Neel Sundaresan, Shao Kun Deng, Shengyu Fu, Shujie Liu

    Abstract: Benchmark datasets have a significant impact on accelerating research in programming language tasks. In this paper, we introduce CodeXGLUE, a benchmark dataset to foster machine learning research for program understanding and generation. CodeXGLUE includes a collection of 10 tasks across 14 datasets and a platform for model evaluation and comparison. CodeXGLUE also features three baseline systems,…

    Submitted 16 March, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

    Comments: 14 pages; Revise CodeBLEU scores for all models on text-to-code task

  38. arXiv:2012.08475  [pdf, other]

    quant-ph

    High-fidelity superconducting quantum processors via laser-annealing of transmon qubits

    Authors: Eric J. Zhang, Srikanth Srinivasan, Neereja Sundaresan, Daniela F. Bogorin, Yves Martin, Jared B. Hertzberg, John Timmerwilke, Emily J. Pritchett, Jeng-Bang Yau, Cindy Wang, William Landers, Eric P. Lewandowski, Adinath Narasgond, Sami Rosenblatt, George A. Keefe, Isaac Lauer, Mary Beth Rothwell, Douglas T. McClure, Oliver E. Dial, Jason S. Orcutt, Markus Brink, Jerry M. Chow

    Abstract: Scaling the number of qubits while maintaining high-fidelity quantum gates remains a key challenge for quantum computing. Presently, superconducting quantum processors with >50-qubits are actively available. For such systems, fixed-frequency transmons are attractive due to their long coherence and noise immunity. However, scaling fixed-frequency architectures proves challenging due to precise rela…

    Submitted 15 December, 2020; originally announced December 2020.

    Comments: 9 pages, 8 figures, Supplementary Information

  39. arXiv:2010.03150  [pdf, other]

    cs.LG cs.SE

    PyMT5: multi-mode translation of natural language and Python code with transformers

    Authors: Colin B. Clement, Dawn Drain, Jonathan Timcheck, Alexey Svyatkovskiy, Neel Sundaresan

    Abstract: Simultaneously modeling source code and natural language has many exciting applications in automated software development and understanding. Pursuant to achieving such technology, we introduce PyMT5, the Python method text-to-text transfer transformer, which is trained to translate between all pairs of Python method feature combinations: a single model that can both predict whole methods from natu…

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: 14 pages, 7 figures, 5 tables, EMNLP 2020 camera ready version

  40. arXiv:2009.10297  [pdf, other]

    cs.SE cs.CL

    CodeBLEU: a Method for Automatic Evaluation of Code Synthesis

    Authors: Shuo Ren, Daya Guo, Shuai Lu, Long Zhou, Shujie Liu, Duyu Tang, Neel Sundaresan, Ming Zhou, Ambrosio Blanco, Shuai Ma

    Abstract: Evaluation metrics play a vital role in the growth of an area, as they define the standard for distinguishing between good and bad models. In the area of code synthesis, the commonly used evaluation metric is BLEU or perfect accuracy, but they are not suitable enough to evaluate codes, because BLEU is originally designed to evaluate the natural language, neglecting important syntactic and semantic fe…

    Submitted 27 September, 2020; v1 submitted 21 September, 2020; originally announced September 2020.

    Comments: 8 pages, 6 figures

  41. arXiv:2009.08366  [pdf, other]

    cs.SE cs.CL

    GraphCodeBERT: Pre-training Code Representations with Data Flow

    Authors: Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou

    Abstract: Pre-trained models for programming languages have achieved dramatic empirical improvements on a variety of code-related tasks such as code search, code completion, and code summarization. However, existing pre-trained models regard a code snippet as a sequence of tokens, while ignoring the inherent structure of code, which provides crucial code semantics and would enhance the code understanding pr…

    Submitted 13 September, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

    Comments: Accepted by ICLR2021

  42. Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers

    Authors: Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Neel Sundaresan

    Abstract: Unit testing represents the foundational basis of the software testing pyramid, beneath integration and end-to-end testing. Automated software testing researchers have proposed a variety of techniques to assist developers in this time-consuming task. In this paper we present an approach to support developers in writing unit test cases by generating accurate and useful assert statements. Our approa…

    Submitted 11 September, 2020; originally announced September 2020.

  43. arXiv:2009.05617  [pdf, other]

    cs.SE cs.CL cs.LG

    Unit Test Case Generation with Transformers and Focal Context

    Authors: Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Shao Kun Deng, Neel Sundaresan

    Abstract: Automated unit test case generation tools facilitate test-driven development and support developers by suggesting tests intended to identify flaws in their code. Existing approaches are usually guided by the test coverage criteria, generating synthetic test cases that are often difficult for developers to read or understand. In this paper we propose AthenaTest, an approach that aims to generate un…

    Submitted 20 May, 2021; v1 submitted 11 September, 2020; originally announced September 2020.

  44. Demonstration of quantum volume 64 on a superconducting quantum computing system

    Authors: Petar Jurcevic, Ali Javadi-Abhari, Lev S. Bishop, Isaac Lauer, Daniela F. Bogorin, Markus Brink, Lauren Capelluto, Oktay Günlük, Toshinari Itoko, Naoki Kanazawa, Abhinav Kandala, George A. Keefe, Kevin Krsulich, William Landers, Eric P. Lewandowski, Douglas T. McClure, Giacomo Nannicini, Adinath Narasgond, Hasan M. Nayfeh, Emily Pritchett, Mary Beth Rothwell, Srikanth Srinivasan, Neereja Sundaresan, Cindy Wang, Ken X. Wei, et al. (6 additional authors not shown)

    Abstract: We improve the quality of quantum circuits on superconducting quantum computing systems, as measured by the quantum volume, with a combination of dynamical decoupling, compiler optimizations, shorter two-qubit gates, and excited state promoted readout. This result shows that the path to larger quantum volume systems requires the simultaneous increase of coherence, control gate fidelities, measurem…

    Submitted 4 September, 2020; v1 submitted 19 August, 2020; originally announced August 2020.

    Comments: Fixed typo in author list. Added references [38], [49] and [52]

    Journal ref: Quantum Sci. Technol. 6 025020 (2021)

  45. Reducing unitary and spectator errors in cross resonance with optimized rotary echoes

    Authors: Neereja Sundaresan, Isaac Lauer, Emily Pritchett, Easwar Magesan, Petar Jurcevic, Jay M. Gambetta

    Abstract: We present an improvement to the cross resonance gate realized with the addition of resonant, target rotary pulses. These pulses, applied directly to the target qubit, are simultaneous to and in phase with the echoed cross resonance pulses. Using specialized Hamiltonian error amplifying tomography, we confirm a reduction of error terms with target rotary -- directly translating to improved two-qub…

    Submitted 6 July, 2020; originally announced July 2020.

    Journal ref: PRX Quantum 1, 020318 (2020)

  46. arXiv:2005.08025  [pdf, other]

    cs.CL cs.SE

    IntelliCode Compose: Code Generation Using Transformer

    Authors: Alexey Svyatkovskiy, Shao Kun Deng, Shengyu Fu, Neel Sundaresan

    Abstract: In software development through integrated development environments (IDEs), code completion is one of the most widely used features. Nevertheless, the majority of integrated development environments only support completion of methods and APIs, or arguments. In this paper, we introduce IntelliCode Compose $-$ a general-purpose multilingual code completion tool which is capable of predicting sequences…

    Submitted 29 October, 2020; v1 submitted 16 May, 2020; originally announced May 2020.

    Comments: Accepted for publication at ESEC/FSE conference

  47. Pythia: AI-assisted Code Completion System

    Authors: Alexey Svyatkovskiy, Ying Zhao, Shengyu Fu, Neel Sundaresan

    Abstract: In this paper, we propose a novel end-to-end approach for AI-assisted code completion called Pythia. It generates ranked lists of method and API recommendations which can be used by software developers at edit time. The system is currently deployed as part of the IntelliCode extension in the Visual Studio Code IDE. Pythia exploits state-of-the-art large-scale deep learning models trained on code contexts…

    Submitted 28 November, 2019; originally announced December 2019.

    Comments: Published in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD '19)

  48. Verifying Multipartite Entangled GHZ States via Multiple Quantum Coherences

    Authors: Ken X. Wei, Isaac Lauer, Srikanth Srinivasan, Neereja Sundaresan, Douglas T. McClure, David Toyli, David C. McKay, Jay M. Gambetta, Sarah Sheldon

    Abstract: The ability to generate and verify multipartite entanglement is an important benchmark for near-term quantum devices. We develop a scalable entanglement metric based on multiple quantum coherences, and demonstrate experimentally on a 20-qubit superconducting device - the IBM Q System One. We report a state fidelity of 0.5165$\pm$0.0036 for an 18-qubit GHZ state, indicating multipartite ent…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: 7+4 pages, comments welcome

    Journal ref: Phys. Rev. A 101, 032343 (2020)

  49. Interacting Qubit-Photon Bound States with Superconducting Circuits

    Authors: Neereja M. Sundaresan, Rex Lundgren, Guanyu Zhu, Alexey V. Gorshkov, Andrew A. Houck

    Abstract: Qubits strongly coupled to a photonic crystal give rise to many exotic physical scenarios, beginning with single and multi-excitation qubit-photon dressed bound states comprising induced spatially localized photonic modes, centered around the qubits, and the qubits themselves. The localization of these states changes with qubit detuning from the band-edge, offering an avenue of in situ control of…

    Submitted 30 January, 2018; originally announced January 2018.

    Journal ref: Phys. Rev. X 9, 011021 (2019)

  50. arXiv:1607.06895  [pdf, other]

    quant-ph cond-mat.mes-hall

    Observation of a dissipative phase transition in a one-dimensional circuit QED lattice

    Authors: Mattias Fitzpatrick, Neereja M. Sundaresan, Andy C. Y. Li, Jens Koch, A. A. Houck

    Abstract: Condensed matter physics has been driven forward by significant experimental and theoretical progress in the study and understanding of equilibrium phase transitions based on symmetry and topology. However, nonequilibrium phase transitions have remained a challenge, in part due to their complexity in theoretical descriptions and the additional experimental difficulties in systematically controllin…

    Submitted 23 July, 2016; originally announced July 2016.

    Journal ref: Phys. Rev. X 7, 011016 (2017)