Showing 1–7 of 7 results for author: Lucchetti, F

  1. arXiv:2502.01584  [pdf, other]

    cs.AI cs.LG

    PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

    Authors: Zixuan Wu, Francesca Lucchetti, Aleksander Boruch-Gruszecki, Jingmiao Zhao, Carolyn Jane Anderson, Joydeep Biswas, Federico Cassano, Molly Q Feldman, Arjun Guha

    Abstract: Existing benchmarks for frontier models often test specialized, "PhD-level" knowledge that is difficult for non-experts to grasp. In contrast, we present a benchmark with 594 problems based on the NPR Sunday Puzzle Challenge that requires only general knowledge. Our benchmark is challenging for both humans and models; however, correct solutions are easy to verify, and models' mistakes are easy to s…

    Submitted 31 March, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

  2. arXiv:2410.19792  [pdf, other]

    cs.CY cs.LG

    Substance Beats Style: Why Beginning Students Fail to Code with LLMs

    Authors: Francesca Lucchetti, Zixuan Wu, Arjun Guha, Molly Q Feldman, Carolyn Jane Anderson

    Abstract: Although LLMs are increasing the productivity of professional programmers, existing work shows that beginners struggle to prompt LLMs to solve text-to-code tasks. Why is this the case? This paper explores two competing hypotheses about the cause of student-LLM miscommunication: (1) students simply lack the technical vocabulary needed to write good prompts, and (2) students do not understand the ex…

    Submitted 15 October, 2024; originally announced October 2024.

  3. arXiv:2407.14561  [pdf, other]

    cs.LG cs.AI

    NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals

    Authors: Jaden Fiotto-Kaufman, Alexander R. Loftus, Eric Todd, Jannik Brinkmann, Koyena Pal, Dmitrii Troitskii, Michael Ripa, Adam Belfki, Can Rager, Caden Juang, Aaron Mueller, Samuel Marks, Arnab Sen Sharma, Francesca Lucchetti, Nikhil Prakash, Carla Brodley, Arjun Guha, Jonathan Bell, Byron C. Wallace, David Bau

    Abstract: We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of the representations and computations learned by very large neural networks. NNsight is an open-source system that extends PyTorch to introduce deferred remote execution. The National Deep Inference Fabric (NDIF) is a scalable inference service that executes NNsight requests, allowing users to share GPU re…

    Submitted 1 April, 2025; v1 submitted 18 July, 2024; originally announced July 2024.

    Comments: Code at https://nnsight.net

  4. arXiv:2404.01903  [pdf, other]

    cs.CL cs.LG cs.PL

    Understanding How CodeLLMs (Mis)Predict Types with Activation Steering

    Authors: Francesca Lucchetti, Arjun Guha

    Abstract: CodeLLMs are transforming software development as we know it. This is especially true for tasks where rule-based approaches fall short, like type prediction. The type prediction task consists in adding a new type annotation to a partially typed program, such that the resulting program is closer to being fully typed. The intractability of rule-based approaches and high cost of manual annotation mak…

    Submitted 13 September, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: 14 pages, 7 figures

  5. Deploying and Evaluating LLMs to Program Service Mobile Robots

    Authors: Zichao Hu, Francesca Lucchetti, Claire Schlesinger, Yash Saxena, Anders Freeman, Sadanand Modak, Arjun Guha, Joydeep Biswas

    Abstract: Recent advancements in large language models (LLMs) have spurred interest in using them for generating robot programs from natural language, with promising initial results. We investigate the use of LLMs to generate programs for service mobile robots leveraging mobility, perception, and human interaction skills, and where accurate sequencing and ordering of actions is crucial for success. We contr…

    Submitted 21 February, 2024; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: 8 pages, Accepted at IEEE Robotics and Automation Letters (RA-L)

    Journal ref: IEEE Robotics and Automation Letters, vol. 9, no. 3, pp. 2853-2860, March 2024

  6. arXiv:2308.09895  [pdf, other]

    cs.PL cs.LG

    Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs

    Authors: Federico Cassano, John Gouwar, Francesca Lucchetti, Claire Schlesinger, Anders Freeman, Carolyn Jane Anderson, Molly Q Feldman, Michael Greenberg, Abhinav Jangda, Arjun Guha

    Abstract: Over the past few years, Large Language Models of Code (Code LLMs) have started to have a significant impact on programming practice. Code LLMs are also emerging as building blocks for research in programming languages and software engineering. However, Code LLMs produce impressive results on programming languages that are well represented in their training data (e.g., Java, Python, or JavaScript)…

    Submitted 21 September, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

  7. arXiv:2204.11017  [pdf, other]

    cs.LG cs.DC

    Federated Geometric Monte Carlo Clustering to Counter Non-IID Datasets

    Authors: Federico Lucchetti, Jérémie Decouchant, Maria Fernandes, Lydia Y. Chen, Marcus Völp

    Abstract: Federated learning allows clients to collaboratively train models on datasets that are acquired in different locations and that cannot be exchanged because of their size or regulations. Such collected data is increasingly non-independent and non-identically distributed (non-IID), negatively affecting training accuracy. Previous works tried to mitigate the effects of non-IID datasets on training ac…

    Submitted 23 April, 2022; originally announced April 2022.