Showing 1–12 of 12 results for author: Naidu, S

Searching in archive cs.
  1. arXiv:2506.06806

    cs.CL cs.AI cs.LG

    Label-semantics Aware Generative Approach for Domain-Agnostic Multilabel Classification

    Authors: Subhendu Khatuya, Shashwat Naidu, Saptarshi Ghosh, Pawan Goyal, Niloy Ganguly

    Abstract: The explosion of textual data has made manual document classification increasingly challenging. To address this, we introduce a robust, efficient domain-agnostic generative model framework for multi-label text classification. Instead of treating labels as mere atomic symbols, our approach utilizes predefined label descriptions and is trained to generate these descriptions based on the input text.…

    Submitted 7 June, 2025; originally announced June 2025.

    Comments: This work has been accepted to appear at the Association for Computational Linguistics (ACL), 2025

  2. arXiv:2502.04273

    math.NA cs.LG

    Electrical Impedance Tomography for Anisotropic Media: a Machine Learning Approach to Classify Inclusions

    Authors: Romina Gaburro, Patrick Healy, Shraddha Naidu, Clifford Nolan

    Abstract: We consider the problem in Electrical Impedance Tomography (EIT) of identifying one or multiple inclusions in a background-conducting body $\Omega\subset\mathbb{R}^2$, from the knowledge of a finite number of electrostatic measurements taken on its boundary $\partial\Omega$ and modelled by the Dirichlet-to-Neumann (D-N) matrix. Once the presence of one inclusion in $\Omega$ is established, our model, combined wi…

    Submitted 6 February, 2025; originally announced February 2025.

    Comments: 27 pages, 17 figures

    MSC Class: 65N21; 35R30; 68T99

  3. arXiv:2501.03988

    cs.CL

    Semantically Cohesive Word Grouping in Indian Languages

    Authors: N J Karthika, Adyasha Patra, Nagasai Saketh Naidu, Arnab Bhattacharya, Ganesh Ramakrishnan, Chaitali Dangarikar

    Abstract: Indian languages are inflectional and agglutinative and typically follow clause-free word order. The structure of sentences across most major Indian languages are similar when their dependency parse trees are considered. While some differences in the parsing structure occur due to peculiarities of a language or its preferred natural way of conveying meaning, several apparent differences are simply…

    Submitted 7 January, 2025; originally announced January 2025.

  4. arXiv:2408.04820

    cs.SE cs.AI cs.CL cs.HC cs.LG

    Natural Language Outlines for Code: Literate Programming in the LLM Era

    Authors: Kensen Shi, Deniz Altınbüken, Saswat Anand, Mihai Christodorescu, Katja Grünwedel, Alexa Koenings, Sai Naidu, Anurag Pathak, Marc Rasi, Fredde Ribeiro, Brandon Ruffin, Siddhant Sanyam, Maxim Tabachnyk, Sara Toth, Roy Tu, Tobias Welp, Pengcheng Yin, Manzil Zaheer, Satish Chandra, Charles Sutton

    Abstract: We propose using natural language outlines as a novel modality and interaction surface for providing AI assistance to developers throughout the software development process. An NL outline for a code function comprises multiple statements written in concise prose, which partition the code and summarize its main ideas in the style of literate programming. Crucially, we find that modern LLMs can gene…

    Submitted 17 April, 2025; v1 submitted 8 August, 2024; originally announced August 2024.

    Comments: Accepted to FSE'25 Industry Track

  5. arXiv:2406.19314

    cs.CL cs.AI cs.LG

    LiveBench: A Challenging, Contamination-Limited LLM Benchmark

    Authors: Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Sreemanti Dey, Shubh-Agrawal, Sandeep Singh Sandha, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, Micah Goldblum

    Abstract: Test set contamination, wherein test data from a benchmark ends up in a newer model's training set, is a well-documented obstacle for fair LLM evaluation and can quickly render benchmarks obsolete. To mitigate this, many recent benchmarks crowdsource new prompts and evaluations from human or LLM judges; however, these can introduce significant biases, and break down when scoring hard questions. In…

    Submitted 18 April, 2025; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: ICLR 2025 Spotlight

  6. arXiv:2404.16183

    cs.LG cs.AI

    ABCD: Trust enhanced Attention based Convolutional Autoencoder for Risk Assessment

    Authors: Sarala Naidu, Ning Xiong

    Abstract: Anomaly detection in industrial systems is crucial for preventing equipment failures, ensuring risk identification, and maintaining overall system efficiency. Traditional monitoring methods often rely on fixed thresholds and empirical rules, which may not be sensitive enough to detect subtle changes in system health and predict impending failures. To address this limitation, this paper proposes, a…

    Submitted 24 April, 2024; originally announced April 2024.

  7. arXiv:2404.16179

    cs.LG

    S2DEVFMAP: Self-Supervised Learning Framework with Dual Ensemble Voting Fusion for Maximizing Anomaly Prediction in Timeseries

    Authors: Sarala Naidu, Ning Xiong

    Abstract: Anomaly detection plays a crucial role in industrial settings, particularly in maintaining the reliability and optimal performance of cooling systems. Traditional anomaly detection methods often face challenges in handling diverse data characteristics and variations in noise levels, resulting in limited effectiveness. And yet traditional anomaly detection often relies on application of single mode…

    Submitted 24 April, 2024; originally announced April 2024.

  8. arXiv:2402.13228

    cs.CL cs.AI cs.LG

    Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive

    Authors: Arka Pal, Deep Karkhanis, Samuel Dooley, Manley Roberts, Siddartha Naidu, Colin White

    Abstract: Direct Preference Optimisation (DPO) is effective at significantly improving the performance of large language models (LLMs) on downstream tasks such as reasoning, summarisation, and alignment. Using pairs of preferred and dispreferred data, DPO models the relative probability of picking one response over another. In this work, first we show theoretically that the standard DPO loss can lead to a r…

    Submitted 3 July, 2024; v1 submitted 20 February, 2024; originally announced February 2024.

  9. Translating Legalese: Enhancing Public Understanding of Court Opinions with Legal Summarizers

    Authors: Elliott Ash, Aniket Kesari, Suresh Naidu, Lena Song, Dominik Stammbach

    Abstract: Judicial opinions are written to be persuasive and could build public trust in court decisions, yet they can be difficult for non-experts to understand. We present a pipeline for using an AI assistant to generate simplified summaries of judicial opinions. Compared to existing expert-written summaries, these AI-generated simple summaries are more accessible to the public and more easily understood…

    Submitted 2 March, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

    Comments: published in proceedings of CSLAW 2024: Symposium on Computer Science and Law

  10. arXiv:2311.01933

    cs.LG

    ForecastPFN: Synthetically-Trained Zero-Shot Forecasting

    Authors: Samuel Dooley, Gurnoor Singh Khurana, Chirag Mohapatra, Siddartha Naidu, Colin White

    Abstract: The vast majority of time-series forecasting approaches require a substantial training dataset. However, many real-life forecasting applications have very little initial observations, sometimes just 40 or fewer. Thus, the applicability of most forecasting methods is restricted in data-sparse commercial applications. While there is recent work in the setting of very limited initial data (so-called…

    Submitted 3 November, 2023; originally announced November 2023.

    Journal ref: Thirty-seventh Conference on Neural Information Processing Systems, 2023

  11. arXiv:2308.10882

    cs.AI cs.CL

    Giraffe: Adventures in Expanding Context Lengths in LLMs

    Authors: Arka Pal, Deep Karkhanis, Manley Roberts, Samuel Dooley, Arvind Sundararajan, Siddartha Naidu

    Abstract: Modern large language models (LLMs) that rely on attention mechanisms are typically trained with fixed context lengths which enforce upper limits on the length of input sequences that they can handle at evaluation time. To use these models on sequences longer than the train-time context length, one might employ techniques from the growing family of context length extrapolation methods -- most of w…

    Submitted 21 August, 2023; originally announced August 2023.

  12. arXiv:2005.04232

    cs.CL cs.LG stat.ML

    Text-Based Ideal Points

    Authors: Keyon Vafa, Suresh Naidu, David M. Blei

    Abstract: Ideal point models analyze lawmakers' votes to quantify their political positions, or ideal points. But votes are not the only way to express a political position. Lawmakers also give speeches, release press statements, and post tweets. In this paper, we introduce the text-based ideal point model (TBIP), an unsupervised probabilistic topic model that analyzes texts to quantify the political positi…

    Submitted 21 July, 2020; v1 submitted 8 May, 2020; originally announced May 2020.

    Comments: Appeared in Proceedings of the 2020 Conference of the Association for Computational Linguistics (ACL 2020)