Showing 1–50 of 61 results for author: Molly

  1. arXiv:2506.10077  [pdf, other]

    cs.CL cs.AI cs.IR cs.IT

    A quantum semantic framework for natural language processing

    Authors: Christopher J. Agostino, Quan Le Thien, Molly Apsel, Denizhan Pak, Elina Lesyk, Ashabari Majumdar

    Abstract: Semantic degeneracy represents a fundamental property of natural language that extends beyond simple polysemy to encompass the combinatorial explosion of potential interpretations that emerges as semantic expressions increase in complexity. Large Language Models (LLMs) and other modern NLP systems face inherent limitations precisely because they operate within natural language itself, making them…

    Submitted 11 June, 2025; originally announced June 2025.

    Comments: 12 pages, 2 figures, accepted submission to Quantum AI and NLP 2025

  2. "I Would Have Written My Code Differently": Beginners Struggle to Understand LLM-Generated Code

    Authors: Yangtian Zi, Luisa Li, Arjun Guha, Carolyn Jane Anderson, Molly Q Feldman

    Abstract: Large language models (LLMs) are being increasingly adopted for programming work. Prior work shows that while LLMs accelerate task completion for professional programmers, beginning programmers struggle to prompt models effectively. However, prompting is just half of the code generation process -- when code is generated, it must be read, evaluated, and integrated (or rejected). How accessible are…

    Submitted 26 April, 2025; originally announced April 2025.

    Comments: To appear in 33rd ACM International Conference on the Foundations of Software Engineering (FSE Companion '25), June 23-28, 2025, Trondheim, Norway

  3. arXiv:2503.16674  [pdf, other]

    cs.CL

    Through the LLM Looking Glass: A Socratic Probing of Donkeys, Elephants, and Markets

    Authors: Molly Kennedy, Ayyoob Imani, Timo Spinde, Hinrich Schütze

    Abstract: While detecting and avoiding bias in LLM-generated text is becoming increasingly important, media bias often remains subtle and subjective, making it particularly difficult to identify and mitigate. In this study, we assess media bias in LLM-generated content and LLMs' ability to detect subtle ideological bias. We conduct this evaluation using two datasets, PoliGen and EconoLex, covering political…

    Submitted 22 May, 2025; v1 submitted 20 March, 2025; originally announced March 2025.

  4. Privacy Ethics Alignment in AI: A Stakeholder-Centric Based Framework for Ethical AI

    Authors: Ankur Barthwal, Molly Campbell, Ajay Kumar Shrestha

    Abstract: The increasing integration of Artificial Intelligence (AI) in digital ecosystems has reshaped privacy dynamics, particularly for young digital citizens navigating data-driven environments. This study explores evolving privacy concerns across three key stakeholder groups, digital citizens (ages 16-19), parents/educators, and AI professionals, and assesses differences in data ownership, trust, trans…

    Submitted 20 March, 2025; v1 submitted 14 March, 2025; originally announced March 2025.

    Comments: Submitted to peer-reviewed venue

    Journal ref: Systems 2025, 13, 455

  5. arXiv:2503.11947  [pdf]

    cs.CY cs.AI cs.LG

    Ethical AI for Young Digital Citizens: A Call to Action on Privacy Governance

    Authors: Austin Shouli, Ankur Barthwal, Molly Campbell, Ajay Kumar Shrestha

    Abstract: The rapid expansion of Artificial Intelligence (AI) in digital platforms used by youth has created significant challenges related to privacy, autonomy, and data protection. While AI-driven personalization offers enhanced user experiences, it often operates without clear ethical boundaries, leaving young users vulnerable to data exploitation and algorithmic biases. This paper presents a call to act…

    Submitted 14 March, 2025; originally announced March 2025.

    Comments: Preprint Version | To be submitted to peer-reviewed venue

  6. arXiv:2502.01584  [pdf, other]

    cs.AI cs.LG

    PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

    Authors: Zixuan Wu, Francesca Lucchetti, Aleksander Boruch-Gruszecki, Jingmiao Zhao, Carolyn Jane Anderson, Joydeep Biswas, Federico Cassano, Molly Q Feldman, Arjun Guha

    Abstract: Existing benchmarks for frontier models often test specialized, "PhD-level" knowledge that is difficult for non-experts to grasp. In contrast, we present a benchmark with 594 problems based on the NPR Sunday Puzzle Challenge that requires only general knowledge. Our benchmark is challenging for both humans and models; however correct solutions are easy to verify, and models' mistakes are easy to s…

    Submitted 31 March, 2025; v1 submitted 3 February, 2025; originally announced February 2025.

  7. arXiv:2501.18782  [pdf, other]

    eess.IV cs.CV

    PSO-Net: Development of an automated psoriasis assessment system using attention-based interpretable deep neural networks

    Authors: Sharif A. Kamran, Molly V. Lucas, Brendon Lutnick, Chaitanya Parmar, Basudha Pal, Asha Patel Shah, David Apfel, Steven Fakharzadeh, Lloyd Miller, Stephen Yip, Kristopher Standish, Gabriela Oana Cula

    Abstract: Psoriasis is a chronic skin condition that requires long-term treatment and monitoring. Although the Psoriasis Area and Severity Index (PASI) is utilized as a standard measurement to assess psoriasis severity in clinical trials, it has many drawbacks such as (1) patient burden for in-person clinic visits for assessment of psoriasis, (2) time required for investigator scoring and (3) variability o…

    Submitted 30 January, 2025; originally announced January 2025.

    Comments: Accepted to IEEE ISBI 2025. 5 Pages, 3 figures, 2 tables

  8. arXiv:2501.16701  [pdf, other]

    physics.soc-ph cs.DL

    Understanding the importance of SHAPE to the UK research ecosystem

    Authors: Hélène Draux, Briony Fane, Daniel W. Hook, Juergen Wastl, Philip Lewis, Molly Morgan Jones, Pablo Roblero, James R. Wilsdon

    Abstract: The UK has a long-established reputation for excellence in research across a broad range of fields, but in recent years, there has been greater emphasis on STEM investment and greater recognition of the UK's success in STEM. This paper examines the relative strengths of SHAPE disciplines and demonstrates that the UK's SHAPE research portfolio outperforms the UK's STEM research, for each internatio…

    Submitted 27 January, 2025; originally announced January 2025.

    Comments: 13 pages, 9 figures

  9. Investigation of the Privacy Concerns in AI Systems for Young Digital Citizens: A Comparative Stakeholder Analysis

    Authors: Molly Campbell, Ankur Barthwal, Sandhya Joshi, Austin Shouli, Ajay Kumar Shrestha

    Abstract: The integration of Artificial Intelligence (AI) systems into technologies used by young digital citizens raises significant privacy concerns. This study investigates these concerns through a comparative analysis of stakeholder perspectives. A total of 252 participants were surveyed, with the analysis focusing on 110 valid responses from parents/educators and 100 from AI professionals after data cl…

    Submitted 22 January, 2025; originally announced January 2025.

    Comments: To appear in the 2025 IEEE 14th Annual Computing and Communication Workshop and Conference (CCWC) proceedings

    Journal ref: 2025 IEEE 15th Annual Computing and Communication Workshop and Conference (CCWC)

  10. arXiv:2412.16369  [pdf]

    cs.CY cs.LG

    Navigating AI to Unpack Youth Privacy Concerns: An In-Depth Exploration and Systematic Review

    Authors: Ajay Kumar Shrestha, Ankur Barthwal, Molly Campbell, Austin Shouli, Saad Syed, Sandhya Joshi, Julita Vassileva

    Abstract: This systematic literature review investigates perceptions, concerns, and expectations of young digital citizens regarding privacy in artificial intelligence (AI) systems, focusing on social media platforms, educational technology, gaming systems, and recommendation algorithms. Using a rigorous methodology, the review started with 2,000 papers, narrowed down to 552 after initial screening, and fin…

    Submitted 20 December, 2024; originally announced December 2024.

    Comments: To appear in the 2024 IEEE Annual Information Technology, Electronics and Mobile Communication Conference proceedings

  11. arXiv:2412.02650  [pdf, other]

    cs.RO

    Bridging Hard and Soft: Mechanical Metamaterials Enable Rigid Torque Transmission in Soft Robots

    Authors: Molly Carton, Jakub F. Kowalewski, Jiani Guo, Jacob F. Alpert, Aman Garg, Daniel Revier, Jeffrey Ian Lipton

    Abstract: Torque and continuous rotation are fundamental methods of actuation and manipulation in rigid robots. Soft robot arms use soft materials and structures to mimic the passive compliance of biological arms that bend and extend. This use of compliance prevents soft arms from continuously transmitting and exerting torques to interact with their environment. Here, we show how relying on patterning struc…

    Submitted 3 December, 2024; originally announced December 2024.

  12. arXiv:2411.01111  [pdf, other]

    cs.AI

    Rule Based Rewards for Language Model Safety

    Authors: Tong Mu, Alec Helyar, Johannes Heidecke, Joshua Achiam, Andrea Vallone, Ian Kivlichan, Molly Lin, Alex Beutel, John Schulman, Lilian Weng

    Abstract: Reinforcement learning based fine-tuning of large language models (LLMs) on human preferences has been shown to enhance both their capabilities and safety behavior. However, in cases related to safety, without precise instructions to human annotators, the data collected may cause the model to become overly cautious, or to respond in an undesirable style, such as being judgmental. Additionally, as…

    Submitted 1 November, 2024; originally announced November 2024.

    Comments: Accepted at NeurIPS 2024

  13. arXiv:2410.21276  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY cs.LG cs.SD eess.AS

    GPT-4o System Card

    Authors: OpenAI, Aaron Hurst, Adam Lerer, Adam P. Goucher, Adam Perelman, Aditya Ramesh, Aidan Clark, AJ Ostrow, Akila Welihinda, Alan Hayes, Alec Radford, Aleksander Mądry, Alex Baker-Whitcomb, Alex Beutel, Alex Borzunov, Alex Carney, Alex Chow, Alex Kirillov, Alex Nichol, Alex Paino, Alex Renzin, Alex Tachard Passos, Alexander Kirillov, Alexi Christakis, et al. (395 additional authors not shown)

    Abstract: GPT-4o is an autoregressive omni model that accepts as input any combination of text, audio, image, and video, and generates any combination of text, audio, and image outputs. It's trained end-to-end across text, vision, and audio, meaning all inputs and outputs are processed by the same neural network. GPT-4o can respond to audio inputs in as little as 232 milliseconds, with an average of 320 mil…

    Submitted 25 October, 2024; originally announced October 2024.

  14. arXiv:2410.19792  [pdf, other]

    cs.CY cs.LG

    Substance Beats Style: Why Beginning Students Fail to Code with LLMs

    Authors: Francesca Lucchetti, Zixuan Wu, Arjun Guha, Molly Q Feldman, Carolyn Jane Anderson

    Abstract: Although LLMs are increasing the productivity of professional programmers, existing work shows that beginners struggle to prompt LLMs to solve text-to-code tasks. Why is this the case? This paper explores two competing hypotheses about the cause of student-LLM miscommunication: (1) students simply lack the technical vocabulary needed to write good prompts, and (2) students do not understand the ex…

    Submitted 15 October, 2024; originally announced October 2024.

  15. arXiv:2409.06941  [pdf, other]

    cs.DC cs.AI

    FreeRide: Harvesting Bubbles in Pipeline Parallelism

    Authors: Jiashu Zhang, Zihan Pan, Molly Xu, Khuzaima Daudjee, Sihang Liu

    Abstract: The occurrence of bubbles in pipeline parallelism is an inherent limitation that can account for more than 40% of the large language model (LLM) training time and is one of the main reasons for the underutilization of GPU resources in LLM training. Harvesting these bubbles for GPU side tasks can increase resource utilization and reduce training costs but comes with challenges. First, because bubbl…

    Submitted 27 April, 2025; v1 submitted 10 September, 2024; originally announced September 2024.

  16. arXiv:2407.21037  [pdf, other]

    cs.CL cs.AI

    An Application of Large Language Models to Coding Negotiation Transcripts

    Authors: Ray Friedman, Jaewoo Cho, Jeanne Brett, Xuhui Zhan, Ningyu Han, Sriram Kannan, Yingxiang Ma, Jesse Spencer-Smith, Elisabeth Jäckel, Alfred Zerres, Madison Hooper, Katie Babbit, Manish Acharya, Wendi Adair, Soroush Aslani, Tayfun Aykaç, Chris Bauman, Rebecca Bennett, Garrett Brady, Peggy Briggs, Cheryl Dowie, Chase Eck, Igmar Geiger, Frank Jacob, Molly Kern, et al. (33 additional authors not shown)

    Abstract: In recent years, Large Language Models (LLM) have demonstrated impressive capabilities in the field of natural language processing (NLP). This paper explores the application of LLMs in negotiation transcript analysis by the Vanderbilt AI Negotiation Lab. Starting in September 2022, we applied multiple strategies using LLMs, from zero-shot learning to fine-tuning models to in-context learning. The…

    Submitted 18 July, 2024; originally announced July 2024.

  17. arXiv:2407.04906  [pdf, other]

    cs.CR

    Privacy or Transparency? Negotiated Smartphone Access as a Signifier of Trust in Romantic Relationships

    Authors: Periwinkle Doerfler, Kieron Ivy Turk, Chris Geeng, Damon McCoy, Jeffrey Ackerman, Molly Dragiewicz

    Abstract: In this work, we analyze two large-scale surveys to examine how individuals think about sharing smartphone access with romantic partners as a function of trust in relationships. We find that the majority of couples have access to each others' devices, but may have explicit or implicit boundaries on how this access is to be used. Investigating these boundaries and related social norms, we find that…

    Submitted 5 July, 2024; originally announced July 2024.

  18. arXiv:2405.06058  [pdf, other]

    cs.AI cs.CL cs.CY cs.HC

    Large Language Models Show Human-like Social Desirability Biases in Survey Responses

    Authors: Aadesh Salecha, Molly E. Ireland, Shashanka Subrahmanya, João Sedoc, Lyle H. Ungar, Johannes C. Eichstaedt

    Abstract: As Large Language Models (LLMs) become widely used to model and simulate human behavior, understanding their biases becomes critical. We developed an experimental framework using Big Five personality surveys and uncovered a previously undetected social desirability bias in a wide range of LLMs. By systematically varying the number of questions LLMs were exposed to, we demonstrate their ability to…

    Submitted 21 November, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

    Comments: 3 pages, 2 figures, accepted at PNAS Nexus

  19. arXiv:2405.05382  [pdf, other]

    cs.CY

    DrawL: Understanding the Effects of Non-Mainstream Dialects in Prompted Image Generation

    Authors: Joshua N. Williams, Molly FitzMorris, Osman Aka, Sarah Laszlo

    Abstract: Text-to-image models are now easy to use and ubiquitous. However, prior work has found that they are prone to recapitulating harmful Western stereotypes. For example, requesting that a model generate an "African person and their house," may produce a person standing next to a straw hut. In this example, the word "African" is an explicit descriptor of the person that the prompt is seeking to depict…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: 12 pages, 3 figures in main text, 2 tables in main text, 4 figures in appendix, 7 tables in appendix

  20. How Users Experience Closed Captions on Live Television: Quality Metrics Remain a Challenge

    Authors: Mariana Arroyo Chavez, Molly Feanny, Matthew Seita, Bernard Thompson, Keith Delk, Skyler Officer, Abraham Glasser, Raja Kushalnagar, Christian Vogler

    Abstract: This paper presents a mixed methods study on how deaf, hard of hearing and hearing viewers perceive live TV caption quality with captioned video stimuli designed to mirror TV captioning experiences. To assess caption quality, we used four commonly-used quality metrics focusing on accuracy: word error rate, weighted word error rate, automated caption evaluation (ACE), and its successor ACE2. We cal…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: To appear in Proceedings of the Conference on Human Factors in Computing Systems CHI 24, May 11-16, Honolulu, HI, USA, 16 pages. https://doi.org/10.1145/3613904.3641988

  21. arXiv:2403.04526  [pdf, other]

    cs.LG cs.AI cs.CV

    Hyperspectral unmixing for Raman spectroscopy via physics-constrained autoencoders

    Authors: Dimitar Georgiev, Álvaro Fernández-Galiana, Simon Vilms Pedersen, Georgios Papadopoulos, Ruoxiao Xie, Molly M. Stevens, Mauricio Barahona

    Abstract: Raman spectroscopy is widely used across scientific domains to characterize the chemical composition of samples in a non-destructive, label-free manner. Many applications entail the unmixing of signals from mixtures of molecular species to identify the individual components present and their proportions, yet conventional methods for chemometrics often struggle with complex mixture scenarios encoun…

    Submitted 7 March, 2024; originally announced March 2024.

    Journal ref: Proceedings of the National Academy of Sciences, 2024, 121(44), e2321305121

  22. arXiv:2402.17704  [pdf, other]

    q-bio.QM cs.LG stat.ML

    Transfer Learning Bayesian Optimization to Design Competitor DNA Molecules for Use in Diagnostic Assays

    Authors: Ruby Sedgwick, John P. Goertz, Molly M. Stevens, Ruth Misener, Mark van der Wilk

    Abstract: With the rise in engineered biomolecular devices, there is an increased need for tailor-made biological sequences. Often, many similar biological sequences need to be made for a specific application meaning numerous, sometimes prohibitively expensive, lab experiments are necessary for their optimization. This paper presents a transfer learning design of experiments workflow to make this developmen…

    Submitted 22 October, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

  23. How Beginning Programmers and Code LLMs (Mis)read Each Other

    Authors: Sydney Nguyen, Hannah McLean Babe, Yangtian Zi, Arjun Guha, Carolyn Jane Anderson, Molly Q Feldman

    Abstract: Generative AI models, specifically large language models (LLMs), have made strides towards the long-standing goal of text-to-code generation. This progress has invited numerous studies of user interaction. However, less is known about the struggles and strategies of non-experts, for whom each step of the text-to-code problem presents challenges: describing their intent in natural language, evaluat…

    Submitted 7 July, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

    Comments: Published in CHI 2024

  24. arXiv:2312.11359  [pdf, other]

    cs.HC

    Combining Game Design and Data Visualization to Inform Plastics Policy: Fostering Collaboration between Science, Decision-Makers, and Artificial Intelligence

    Authors: A Samuel Pottinger, Nivedita Biyani, Roland Geyer, Douglas J McCauley, Magali de Bruyn, Molly R Morse, Neil Nathan, Kevin Koy, Ciera Martinez

    Abstract: This multi-disciplinary case study details how a public web application combines information and game design to visualize effects of user-defined policies intended to reduce plastic waste. Contextualizing this open source software within a broader lineage of digital media research, this user experience exploration outlines potential directions for facilitating conversation between artificial intel…

    Submitted 19 December, 2023; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: 29 pages of which 8 are citations, 4 figures, latex generated from markdown via Pandoc (https://pandoc.org/) for Arxiv

  25. arXiv:2312.00023  [pdf, other]

    cs.CR

    Hypergraph Topological Features for Autoencoder-Based Intrusion Detection for Cybersecurity Data

    Authors: Bill Kay, Sinan G. Aksoy, Molly Baird, Daniel M. Best, Helen Jenne, Cliff Joslyn, Christopher Potvin, Gregory Henselman-Petrusek, Garret Seppala, Stephen J. Young, Emilie Purvine

    Abstract: In this position paper, we argue that when hypergraphs are used to capture multi-way local relations of data, their resulting topological features describe global behaviour. Consequently, these features capture complex correlations that can then serve as high fidelity inputs to autoencoder-driven anomaly detection pipelines. We propose two such potential pipelines for cybersecurity data, one that…

    Submitted 9 November, 2023; originally announced December 2023.

    MSC Class: 55N31

  26. arXiv:2310.05597  [pdf, other]

    cs.CL

    Can language models learn analogical reasoning? Investigating training objectives and comparisons to human performance

    Authors: Molly R. Petersen, Lonneke van der Plas

    Abstract: While analogies are a common way to evaluate word embeddings in NLP, it is also of interest to investigate whether or not analogical reasoning is a task in itself that can be learned. In this paper, we test several ways to learn basic analogical reasoning, specifically focusing on analogies that are more typical of what is used to evaluate analogical reasoning in humans than those in commonly used…

    Submitted 3 May, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

  27. arXiv:2309.06640  [pdf, other]

    cs.SE

    REVIS: An Error Visualization Tool for Rust

    Authors: Ruochen Wang, Molly Maclaren, Michael Coblenz

    Abstract: Rust is a programming language that uses a concept of ownership to guarantee memory safety without the use of a garbage collector. However, some error messages related to ownership can be difficult to understand and fix, particularly those that depend on value lifetimes. To help developers fix lifetime-related errors, we developed REVIS, a VSCode extension that visualizes lifetime-related Rust com…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: Presented at HATRA 2023

  28. arXiv:2308.09895  [pdf, other]

    cs.PL cs.LG

    Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs

    Authors: Federico Cassano, John Gouwar, Francesca Lucchetti, Claire Schlesinger, Anders Freeman, Carolyn Jane Anderson, Molly Q Feldman, Michael Greenberg, Abhinav Jangda, Arjun Guha

    Abstract: Over the past few years, Large Language Models of Code (Code LLMs) have started to have a significant impact on programming practice. Code LLMs are also emerging as building blocks for research in programming languages and software engineering. However, Code LLMs produce impressive results on programming languages that are well represented in their training data (e.g., Java, Python, or JavaScript)…

    Submitted 21 September, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

  29. arXiv:2308.01320  [pdf, other]

    cs.LG cs.AI cs.CL

    DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales

    Authors: Zhewei Yao, Reza Yazdani Aminabadi, Olatunji Ruwase, Samyam Rajbhandari, Xiaoxia Wu, Ammar Ahmad Awan, Jeff Rasley, Minjia Zhang, Conglong Li, Connor Holmes, Zhongzhu Zhou, Michael Wyatt, Molly Smith, Lev Kurilenko, Heyang Qin, Masahiro Tanaka, Shuai Che, Shuaiwen Leon Song, Yuxiong He

    Abstract: ChatGPT-like models have revolutionized various applications in artificial intelligence, from summarization and coding to translation, matching or even surpassing human performance. However, the current landscape lacks an accessible, efficient, and cost-effective end-to-end RLHF (Reinforcement Learning with Human Feedback) training pipeline for these powerful models, particularly when training at…

    Submitted 2 August, 2023; originally announced August 2023.

    Comments: 14 pages, 7 figures

  30. arXiv:2307.13650  [pdf, other]

    cond-mat.mtrl-sci cs.MS physics.data-an

    RamanSPy: An open-source Python package for integrative Raman spectroscopy data analysis

    Authors: Dimitar Georgiev, Simon Vilms Pedersen, Ruoxiao Xie, Álvaro Fernández-Galiana, Molly M. Stevens, Mauricio Barahona

    Abstract: Raman spectroscopy is a non-destructive and label-free chemical analysis technique, which plays a key role in the analysis and discovery cycle of various branches of science. Nonetheless, progress in Raman spectroscopic analysis is still impeded by the lack of software, methodological and data standardisation, and the ensuing fragmentation and lack of reproducibility of analysis workflows thereof.…

    Submitted 5 July, 2023; originally announced July 2023.

    Journal ref: Anal. Chem. 2024, 96, 21, 8492-8500

  31. arXiv:2306.04556  [pdf, other]

    cs.LG cs.HC cs.SE

    StudentEval: A Benchmark of Student-Written Prompts for Large Language Models of Code

    Authors: Hannah McLean Babe, Sydney Nguyen, Yangtian Zi, Arjun Guha, Molly Q Feldman, Carolyn Jane Anderson

    Abstract: Code LLMs are being rapidly deployed and there is evidence that they can make professional programmers more productive. Current benchmarks for code generation measure whether models generate correct programs given an expert prompt. In this paper, we present a new benchmark containing multiple prompts per problem, written by a specific population of non-expert prompters: beginning programmers. Stud…

    Submitted 7 June, 2023; originally announced June 2023.

  32. arXiv:2303.08774  [pdf, other]

    cs.CL cs.AI

    GPT-4 Technical Report

    Authors: OpenAI, Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, Red Avila, Igor Babuschkin, Suchir Balaji, Valerie Balcom, Paul Baltescu, Haiming Bao, Mohammad Bavarian, Jeff Belgum, Irwan Bello, Jake Berdine, Gabriel Bernadett-Shapiro, Christopher Berner, Lenny Bogdonoff, Oleg Boiko, et al. (256 additional authors not shown)

    Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based mo…

    Submitted 4 March, 2024; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: 100 pages; updated authors list; fixed author names and added citation

  33. arXiv:2301.10531  [pdf, other]

    cs.CV cs.AI

    3D Tooth Mesh Segmentation with Simplified Mesh Cell Representation

    Authors: Ananya Jana, Hrebesh Molly Subhash, Dimitris N. Metaxas

    Abstract: Manual tooth segmentation of 3D tooth meshes is tedious and there are variations among dentists. Several deep learning based methods have been proposed to perform automatic tooth mesh segmentation. Many of the proposed tooth mesh segmentation algorithms summarize the mesh cell as - the cell center or barycenter, the normal at barycenter…

    Submitted 25 January, 2023; originally announced January 2023.

    Comments: accepted at IEEE ISBI 2023 International Symposium on Biomedical Imaging

  34. arXiv:2212.13638  [pdf, other]

    cs.SI stat.AP

    Battling the Coronavirus Infodemic Among Social Media Users in Kenya and Nigeria

    Authors: Molly Offer-Westort, Leah R. Rosenzweig, Susan Athey

    Abstract: How can we induce social media users to be discerning when sharing information during a pandemic? An experiment on Facebook Messenger with users from Kenya (n = 7,498) and Nigeria (n = 7,794) tested interventions designed to decrease intentions to share COVID-19 misinformation without decreasing intentions to share factual posts. The initial stage of the study incorporated: (i) a factorial design…

    Submitted 15 September, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

    Comments: 52 pages including appendix, 9 figures

  35. arXiv:2211.07357  [pdf, other]

    cs.LG cs.AI eess.SY

    Controlling Commercial Cooling Systems Using Reinforcement Learning

    Authors: Jerry Luo, Cosmin Paduraru, Octavian Voicu, Yuri Chervonyi, Scott Munns, Jerry Li, Crystal Qian, Praneet Dutta, Jared Quincy Davis, Ningjia Wu, Xingwei Yang, Chu-Ming Chang, Ted Li, Rob Rose, Mingyan Fan, Hootan Nakhost, Tinglin Liu, Brian Kirkman, Frank Altamura, Lee Cline, Patrick Tonker, Joel Gouker, Dave Uden, Warren Buddy Bryan, Jason Law, et al. (11 additional authors not shown)

    Abstract: This paper is a technical overview of DeepMind and Google's recent work on reinforcement learning for controlling commercial cooling systems. Building on expertise that began with cooling Google's data centers more efficiently, we recently conducted live experiments on two real-world facilities in partnership with Trane Technologies, a building management system provider. These live experiments ha…

    Submitted 14 December, 2022; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: 27 pages, 11 figures

  36. arXiv:2209.08132  [pdf, other]

    cs.CV

    Automatic Tooth Segmentation from 3D Dental Model using Deep Learning: A Quantitative Analysis of what can be learnt from a Single 3D Dental Model

    Authors: Ananya Jana, Hrebesh Molly Subhash, Dimitris Metaxas

    Abstract: 3D tooth segmentation is an important task for digital orthodontics. Several Deep Learning methods have been proposed for automatic tooth segmentation from 3D dental models or intraoral scans. These methods require annotated 3D intraoral scans. Manually annotating 3D intraoral scans is a laborious task. One approach is to devise self-supervision methods to reduce the manual labeling effort. Compar…

    Submitted 16 September, 2022; originally announced September 2022.

    Comments: accepted to SIPAIM 2022

  37. arXiv:2208.08227  [pdf, other]

    cs.LG cs.PL

    MultiPL-E: A Scalable and Extensible Approach to Benchmarking Neural Code Generation

    Authors: Federico Cassano, John Gouwar, Daniel Nguyen, Sydney Nguyen, Luna Phipps-Costin, Donald Pinckney, Ming-Ho Yee, Yangtian Zi, Carolyn Jane Anderson, Molly Q Feldman, Arjun Guha, Michael Greenberg, Abhinav Jangda

    Abstract: Large language models have demonstrated the ability to generate both natural language and programming language text. Such models open up the possibility of multi-language code generation: could code generation models generalize knowledge from one language to another? Although contemporary code generation models can generate semantically correct Python code, little is known about their abilities wi…

    Submitted 19 December, 2022; v1 submitted 17 August, 2022; originally announced August 2022.

  38. arXiv:2207.11357  [pdf, other]

    cs.HC

    PREPRINT: Found Object Puppeteering as a Tool for Rapid Movement Sketching in 3D Animation

    Authors: Molly Jane Nicholas, Eric Paulos

    Abstract: Both expert and novice animators have a need to engage in movement sketching -- low-cost, rapid iteration on a character's movement style -- especially early on in the ideation process. Yet animation tools currently focus on low-level character control mechanisms rather than encouraging engagement with and deep observation of movement. We identify Found Object puppeteering -- where puppeteers mani…

    Submitted 22 July, 2022; originally announced July 2022.

  39. arXiv:2205.03231  [pdf, other]

    eess.SP cs.LG

    Side-aware Meta-Learning for Cross-Dataset Listener Diagnosis with Subjective Tinnitus

    Authors: Yun Li, Zhe Liu, Lina Yao, Molly Lucas, Jessica J. M. Monaghan, Yu Zhang

    Abstract: With the development of digital technology, machine learning has paved the way for the next generation of tinnitus diagnoses. Although machine learning has been widely applied in EEG-based tinnitus analysis, most current models are dataset-specific. Each dataset may be limited to a specific range of symptoms, overall disease severity, and demographic attributes; further, dataset formats may differ…

    Submitted 2 May, 2022; originally announced May 2022.

  40. arXiv:2202.03868  [pdf, other]

    cs.CV cs.LG

    Mapping DNN Embedding Manifolds for Network Generalization Prediction

    Authors: Molly O'Brien, Julia Bukowski, Mathias Unberath, Aria Pezeshk, Greg Hager

    Abstract: Understanding Deep Neural Network (DNN) performance in changing conditions is essential for deploying DNNs in safety critical applications with unconstrained environments, e.g., perception for self-driving vehicles or medical image analysis. Recently, the task of Network Generalization Prediction (NGP) has been proposed to predict how a DNN will generalize in a new operating domain. Previous NGP a…

    Submitted 3 February, 2022; originally announced February 2022.

    Comments: 11 pages, 5 figures

  41. arXiv:2112.08460  [pdf]

    cs.HC

    Friendscope: Exploring In-the-Moment Experience Sharing on Camera Glasses via a Shared Camera

    Authors: Molly Jane Nicholas, Brian A. Smith, Rajan Vaish

    Abstract: We introduce Friendscope, an instant, in-the-moment experience sharing system for lightweight commercial camera glasses. Friendscope explores a new concept called a shared camera. This concept allows a wearer to share control of their camera with a remote friend, making it possible for both people to capture photos/videos from the camera in the moment. Through a user study with 48 participants, we…

    Submitted 15 December, 2021; originally announced December 2021.

    Comments: ACM CSCW 2022

  42. arXiv:2110.12501  [pdf, other]

    cs.CL cs.LG

    Abstractified Multi-instance Learning (AMIL) for Biomedical Relation Extraction

    Authors: William Hogan, Molly Huang, Yannis Katsis, Tyler Baldwin, Ho-Cheol Kim, Yoshiki Vazquez Baeza, Andrew Bartko, Chun-Nan Hsu

    Abstract: Relation extraction in the biomedical domain is a challenging task due to a lack of labeled data and a long-tail distribution of fact triples. Many works leverage distant supervision which automatically generates labeled data by pairing a knowledge graph with raw textual data. Distant supervision produces noisy labels and requires additional techniques, such as multi-instance learning (MIL), to de…

    Submitted 24 October, 2021; originally announced October 2021.

    Comments: 14 pages, 3 figures, submitted to Automated Knowledge Base Construction (2021)

    Report number: 13

    Journal ref: 3rd Conference on Automated Knowledge Base Construction (2021)

  43. arXiv:2108.07399  [pdf, other]

    cs.CV

    Network Generalization Prediction for Safety Critical Tasks in Novel Operating Domains

    Authors: Molly O'Brien, Mike Medoff, Julia Bukowski, Greg Hager

    Abstract: It is well known that Neural Network (network) performance often degrades when a network is used in novel operating domains that differ from its training and testing domains. This is a major limitation, as networks are being integrated into safety critical, cyber-physical systems that must work in unconstrained environments, e.g., perception for autonomous vehicles. Training networks that generali…

    Submitted 16 August, 2021; originally announced August 2021.

  44. arXiv:2105.01006  [pdf, other]

    cs.RO cs.LG

    Robotic Surgery With Lean Reinforcement Learning

    Authors: Yotam Barnoy, Molly O'Brien, Will Wang, Gregory Hager

    Abstract: As surgical robots become more common, automating away some of the burden of complex direct human operation becomes ever more feasible. Model-free reinforcement learning (RL) is a promising direction toward generalizable automated surgical performance, but progress has been slowed by the lack of efficient and realistic learning environments. In this paper, we describe adding reinforcement learning…

    Submitted 3 May, 2021; originally announced May 2021.

  45. arXiv:2104.10034  [pdf, other]

    cs.CR

    On Generating and Labeling Network Traffic with Realistic, Self-Propagating Malware

    Authors: Molly Buchanan, Jeffrey W. Collyer, Jack W. Davidson, Saikat Dey, Mark Gardner, Jason D. Hiser, Jeffry Lang, Alastair Nottingham, Alina Oprea

    Abstract: Research and development of techniques which detect or remediate malicious network activity require access to diverse, realistic, contemporary data sets containing labeled malicious connections. In the absence of such data, said techniques cannot be meaningfully trained, tested, and evaluated. Synthetically produced data containing fabricated or merged network traffic is of limited value as it is…

    Submitted 27 May, 2022; v1 submitted 20 April, 2021; originally announced April 2021.

    Comments: 4+2 pages, 3 figures, 1 table, for AI4CS-SDM21

  46. Improved Diagnosis of Tibiofemoral Cartilage Defects on MRI Images Using Deep Learning

    Authors: Gergo Merkely, Alireza Borjali, Molly Zgoda, Evan M. Farina, Simon Gortz, Orhun Muratoglu, Christian Lattermann, Kartik M. Varadarajan

    Abstract: Background: MRI is the modality of choice for cartilage imaging; however, its diagnostic performance is variable and significantly lower than the gold standard diagnostic knee arthroscopy. In recent years, deep learning has been used to automatically interpret medical images to improve diagnostic accuracy and speed. Purpose: The primary purpose of this study was to evaluate whether deep learning a…

    Submitted 30 November, 2020; originally announced December 2020.

    Journal ref: https://doi.org/10.1016/j.jcjp.2021.100009

  47. arXiv:2011.10575  [pdf, other]

    q-bio.QM cs.LG stat.ML

    Design of Experiments for Verifying Biomolecular Networks

    Authors: Ruby Sedgwick, John Goertz, Molly Stevens, Ruth Misener, Mark van der Wilk

    Abstract: There is a growing trend in molecular and synthetic biology of using mechanistic (non machine learning) models to design biomolecular networks. Once designed, these networks need to be validated by experimental results to ensure the theoretical network correctly models the true system. However, these experiments can be expensive and time consuming. We propose a design of experiments approach for v…

    Submitted 25 November, 2020; v1 submitted 20 November, 2020; originally announced November 2020.

    Comments: Comment: Updated to correct typo "that that" => "that"

  48. arXiv:2010.00704  [pdf, other]

    cs.LG cs.CV

    BCNN: A Binary CNN with All Matrix Ops Quantized to 1 Bit Precision

    Authors: Arthur J. Redfern, Lijun Zhu, Molly K. Newquist

    Abstract: This paper describes a CNN where all CNN style 2D convolution operations that lower to matrix matrix multiplication are fully binary. The network is derived from a common building block structure that is consistent with a constructive proof outline showing that binary neural networks are universal function approximators. 71.24% top 1 accuracy on the 2012 ImageNet validation set was achieved with a…

    Submitted 5 March, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

  49. arXiv:2009.13318  [pdf]

    eess.IV cs.CV physics.med-ph

    High-throughput molecular imaging via deep learning enabled Raman spectroscopy

    Authors: Conor C. Horgan, Magnus Jensen, Anika Nagelkerke, Jean-Phillipe St-Pierre, Tom Vercauteren, Molly M. Stevens, Mads S. Bergholt

    Abstract: Raman spectroscopy enables non-destructive, label-free imaging with unprecedented molecular contrast but is limited by slow data acquisition, largely preventing high-throughput imaging applications. Here, we present a comprehensive framework for higher-throughput molecular imaging via deep learning enabled Raman spectroscopy, termed DeepeR, trained on a large dataset of hyperspectral Raman images,…

    Submitted 28 September, 2020; originally announced September 2020.

  50. arXiv:2006.15292  [pdf, other]

    cs.HC

    Project Calico: Wearable Chemical Sensors for Environmental Monitoring

    Authors: Alex Mariakakis, Sifang Chen, Bichlien Nguyen, Kirsten Bray, Molly Blank, Jonathan Lester, Lauren Ryan, Paul Johns, Gonzalo Ramos, Asta Roseway

    Abstract: Environmental hazards often go unnoticed because they are invisible to the naked eye, posing risks to our health over time. Project Calico aims to raise awareness of these risks by augmenting everyday fashion with color-changing chemical sensors that can be observed at a glance or captured by a smartphone camera. Project Calico leverages existing cosmetic and fabrication processes to democratize e…

    Submitted 6 July, 2020; v1 submitted 27 June, 2020; originally announced June 2020.

    Comments: 9 pages, 6 figures, 1 table