Showing 1–50 of 235 results for author: Sinha, S

Searching in archive cs.
  1. arXiv:2509.26204 [pdf, ps, other]

    cs.SE

    Hamster: A Large-Scale Study and Characterization of Developer-Written Tests

    Authors: Rangeet Pan, Tyler Stennett, Raju Pavuluri, Nate Levin, Alessandro Orso, Saurabh Sinha

    Abstract: Automated test generation (ATG), which aims to reduce the cost of manual test suite development, has been investigated for decades and has produced countless techniques based on a variety of approaches: symbolic analysis, search-based, random and adaptive-random, learning-based, and, most recently, large-language-model-based approaches. However, despite this large body of research, there is still…

    Submitted 30 September, 2025; originally announced September 2025.

  2. arXiv:2509.16204 [pdf, ps, other]

    cs.CE cs.HC cs.RO

    Toward Engineering AGI: Benchmarking the Engineering Design Capabilities of LLMs

    Authors: Xingang Guo, Yaxin Li, Xiangyi Kong, Yilan Jiang, Xiayu Zhao, Zhihua Gong, Yufan Zhang, Daixuan Li, Tianle Sang, Beixiao Zhu, Gregory Jun, Yingbing Huang, Yiqi Liu, Yuqi Xue, Rahul Dev Kundu, Qi Jian Lim, Yizhou Zhao, Luke Alexander Granger, Mohamed Badr Younis, Darioush Keivan, Nippun Sabharwal, Shreyanka Sinha, Prakhar Agarwal, Kojo Vandyck, Hanlin Mai, et al. (40 additional authors not shown)

    Abstract: Today, industry pioneers dream of developing general-purpose AI engineers capable of designing and building humanity's most ambitious projects--from starships that will carry us to distant worlds to Dyson spheres that harness stellar energy. Yet engineering design represents a fundamentally different challenge for large language models (LLMs) compared to traditional textbook-style problem solving…

    Submitted 1 July, 2025; originally announced September 2025.

  3. arXiv:2509.15886 [pdf, ps, other]

    cs.CV

    RangeSAM: Leveraging Visual Foundation Models for Range-View represented LiDAR segmentation

    Authors: Paul Julius Kühn, Duc Anh Nguyen, Arjan Kuijper, Holger Graf, Dieter Fellner, Saptarshi Neil Sinha

    Abstract: Point cloud segmentation is central to autonomous driving and 3D scene understanding. While voxel- and point-based methods dominate recent research due to their compatibility with deep architectures and ability to capture fine-grained geometry, they often incur high computational cost, irregular memory access, and limited real-time efficiency. In contrast, range-view methods, though relatively und…

    Submitted 19 September, 2025; originally announced September 2025.

  4. arXiv:2509.07122 [pdf, ps, other]

    cs.AI cs.CL cs.SC

    Neuro-Symbolic Frameworks: Conceptual Characterization and Empirical Comparative Analysis

    Authors: Sania Sinha, Tanawan Premsri, Danial Kamali, Parisa Kordjamshidi

    Abstract: Neurosymbolic (NeSy) frameworks combine neural representations and learning with symbolic representations and reasoning. Combining the reasoning capacities, explainability, and interpretability of symbolic processing with the flexibility and power of neural computing allows us to solve complex problems with more reliability while being data-efficient. However, this recently growing topic poses a c…

    Submitted 8 September, 2025; originally announced September 2025.

  5. arXiv:2508.21197 [pdf, ps, other]

    cs.CV

    GCAV: A Global Concept Activation Vector Framework for Cross-Layer Consistency in Interpretability

    Authors: Zhenghao He, Sanchit Sinha, Guangzhi Xiong, Aidong Zhang

    Abstract: Concept Activation Vectors (CAVs) provide a powerful approach for interpreting deep neural networks by quantifying their sensitivity to human-defined concepts. However, when computed independently at different layers, CAVs often exhibit inconsistencies, making cross-layer comparisons unreliable. To address this issue, we propose the Global Concept Activation Vector (GCAV), a novel framework that u…

    Submitted 9 September, 2025; v1 submitted 28 August, 2025; originally announced August 2025.

    Comments: Accepted at ICCV 2025

  6. arXiv:2507.17025 [pdf]

    cs.CL cs.AI

    Evolutionary Feature-wise Thresholding for Binary Representation of NLP Embeddings

    Authors: Soumen Sinha, Shahryar Rahnamayan, Azam Asilian Bidgoli

    Abstract: Efficient text embedding is crucial for large-scale natural language processing (NLP) applications, where storage and computational efficiency are key concerns. In this paper, we explore how using binary representations (barcodes) instead of real-valued features can be used for NLP embeddings derived from machine learning models such as BERT. Thresholding is a common method for converting continuo…

    Submitted 22 July, 2025; originally announced July 2025.

  7. arXiv:2507.01853 [pdf, ps, other]

    cs.CL

    Eka-Eval : A Comprehensive Evaluation Framework for Large Language Models in Indian Languages

    Authors: Samridhi Raj Sinha, Rajvee Sheth, Abhishek Upperwal, Mayank Singh

    Abstract: The rapid advancement of Large Language Models (LLMs) has intensified the need for evaluation frameworks that address the requirements of linguistically diverse regions, such as India, and go beyond English-centric benchmarks. We introduce EKA-EVAL, a unified evaluation framework that integrates over 35+ benchmarks (including 10 Indic benchmarks) across nine major evaluation categories. The framew…

    Submitted 12 July, 2025; v1 submitted 2 July, 2025; originally announced July 2025.

  8. arXiv:2506.12103 [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  9. arXiv:2506.04237 [pdf, ps, other]

    cs.LG cs.AI

    A Comprehensive Survey on the Risks and Limitations of Concept-based Models

    Authors: Sanchit Sinha, Aidong Zhang

    Abstract: Concept-based Models are a class of inherently explainable networks that improve upon standard Deep Neural Networks by providing a rationale behind their predictions using human-understandable `concepts'. With these models being highly successful in critical applications like medical diagnosis and financial risk prediction, there is a natural push toward their wider adoption in sensitive domains t…

    Submitted 24 May, 2025; originally announced June 2025.

  10. arXiv:2505.22291 [pdf, ps, other]

    cs.CV cs.AI

    Neural Restoration of Greening Defects in Historical Autochrome Photographs Based on Purely Synthetic Data

    Authors: Saptarshi Neil Sinha, P. Julius Kuehn, Johannes Koppe, Arjan Kuijper, Michael Weinmann

    Abstract: The preservation of early visual arts, particularly color photographs, is challenged by deterioration caused by aging and improper storage, leading to issues like blurring, scratches, color bleeding, and fading defects. Despite great advances in image restoration and enhancement in recent years, such systematic defects often cannot be restored by current state-of-the-art software features as avail…

    Submitted 20 August, 2025; v1 submitted 28 May, 2025; originally announced May 2025.

  11. arXiv:2505.21495 [pdf, ps, other]

    cs.RO

    CLAMP: Crowdsourcing a LArge-scale in-the-wild haptic dataset with an open-source device for Multimodal robot Perception

    Authors: Pranav N. Thakkar, Shubhangi Sinha, Karan Baijal, Yuhan Bian, Leah Lackey, Ben Dodson, Heisen Kong, Jueun Kwon, Amber Li, Yifei Hu, Alexios Rekoutis, Tom Silver, Tapomayukh Bhattacharjee

    Abstract: Robust robot manipulation in unstructured environments often requires understanding object properties that extend beyond geometry, such as material or compliance, properties that can be challenging to infer using vision alone. Multimodal haptic sensing provides a promising avenue for inferring such properties, yet progress has been constrained by the lack of large, diverse, and realistic haptic dat…

    Submitted 27 May, 2025; originally announced May 2025.

  12. arXiv:2503.24306 [pdf, other]

    cs.CV

    Point Tracking in Surgery--The 2024 Surgical Tattoos in Infrared (STIR) Challenge

    Authors: Adam Schmidt, Mert Asim Karaoglu, Soham Sinha, Mingang Jang, Ho-Gun Ha, Kyungmin Jung, Kyeongmo Gu, Ihsan Ullah, Hyunki Lee, Jonáš Šerých, Michal Neoral, Jiří Matas, Rulin Zhou, Wenlong He, An Wang, Hongliang Ren, Bruno Silva, Sandro Queirós, Estêvão Lima, João L. Vilaça, Shunsuke Kikuchi, Atsushi Kouno, Hiroki Matsuzaki, Tongtong Li, Yulu Chen, et al. (15 additional authors not shown)

    Abstract: Understanding tissue motion in surgery is crucial to enable applications in downstream tasks such as segmentation, 3D reconstruction, virtual tissue landmarking, autonomous probe-based scanning, and subtask autonomy. Labeled data are essential to enabling algorithms in these downstream tasks since they allow us to quantify and train algorithms. This paper introduces a point tracking challenge to a…

    Submitted 31 March, 2025; originally announced March 2025.

  13. arXiv:2502.20975 [pdf, other]

    cs.CL

    Set-Theoretic Compositionality of Sentence Embeddings

    Authors: Naman Bansal, Yash Mahajan, Sanjeev Sinha, Santu Karmaker

    Abstract: Sentence encoders play a pivotal role in various NLP tasks; hence, an accurate evaluation of their compositional properties is paramount. However, existing evaluation methods predominantly focus on goal task-specific performance. This leaves a significant gap in understanding how well sentence embeddings demonstrate fundamental compositional properties in a task-independent context. Leveraging cla…

    Submitted 28 February, 2025; originally announced February 2025.

  14. arXiv:2502.19414 [pdf, other]

    cs.LG cs.SE

    Can Language Models Falsify? Evaluating Algorithmic Reasoning with Counterexample Creation

    Authors: Shiven Sinha, Shashwat Goel, Ponnurangam Kumaraguru, Jonas Geiping, Matthias Bethge, Ameya Prabhu

    Abstract: There is growing excitement about the potential of Language Models (LMs) to accelerate scientific discovery. Falsifying hypotheses is key to scientific progress, as it allows claims to be iteratively refined over time. This process requires significant researcher effort, reasoning, and ingenuity. Yet current benchmarks for LMs predominantly assess their ability to generate solutions rather than ch…

    Submitted 26 February, 2025; originally announced February 2025.

    Comments: Technical Report

  15. arXiv:2502.18712 [pdf, other]

    cs.AI cs.SI

    TrajLLM: A Modular LLM-Enhanced Agent-Based Framework for Realistic Human Trajectory Simulation

    Authors: Chenlu Ju, Jiaxin Liu, Shobhit Sinha, Hao Xue, Flora Salim

    Abstract: This work leverages Large Language Models (LLMs) to simulate human mobility, addressing challenges like high costs and privacy concerns in traditional models. Our hierarchical framework integrates persona generation, activity selection, and destination prediction, using real-world demographic and psychological data to create realistic movement patterns. Both physical models and language models are…

    Submitted 25 February, 2025; originally announced February 2025.

    Comments: Accepted WWW2025 Demo Paper

  16. arXiv:2502.17289 [pdf]

    cs.AI cs.CV

    A novel approach to navigate the taxonomic hierarchy to address the Open-World Scenarios in Medicinal Plant Classification

    Authors: Soumen Sinha, Tanisha Rana, Susmita Ghosh, Rahul Roy

    Abstract: In this article, we propose a novel approach for plant hierarchical taxonomy classification by posing the problem as an open class problem. It is observed that existing methods for medicinal plant classification often fail to perform hierarchical classification and accurately identifying unknown species, limiting their effectiveness in comprehensive plant taxonomy classification. Thus we address t…

    Submitted 4 August, 2025; v1 submitted 24 February, 2025; originally announced February 2025.

    Comments: We want to do some modifications and add more experiments

  17. arXiv:2502.05368 [pdf, ps, other]

    cs.SE cs.LG

    Otter: Generating Tests from Issues to Validate SWE Patches

    Authors: Toufique Ahmed, Jatin Ganhotra, Rangeet Pan, Avraham Shinnar, Saurabh Sinha, Martin Hirzel

    Abstract: While there has been plenty of work on generating tests from existing code, there has been limited work on generating tests from issues. A correct test must validate the code patch that resolves the issue. This paper focuses on the scenario where that code patch does not yet exist. Doing so supports two major use-cases. First, it supports TDD (test-driven development), the discipline of "test firs…

    Submitted 30 May, 2025; v1 submitted 7 February, 2025; originally announced February 2025.

    Comments: Accepted to the main technical track of the International Conference on Machine Learning (ICML), 2025

  18. arXiv:2502.04144 [pdf, other]

    cs.CV

    HD-EPIC: A Highly-Detailed Egocentric Video Dataset

    Authors: Toby Perrett, Ahmad Darkhalil, Saptarshi Sinha, Omar Emara, Sam Pollard, Kranti Parida, Kaiting Liu, Prajwal Gatti, Siddhant Bansal, Kevin Flanagan, Jacob Chalk, Zhifan Zhu, Rhodri Guerrier, Fahd Abdelazim, Bin Zhu, Davide Moltisanti, Michael Wray, Hazel Doughty, Dima Damen

    Abstract: We present a validation dataset of newly-collected kitchen-based egocentric videos, manually annotated with highly detailed and interconnected ground-truth labels covering: recipe steps, fine-grained actions, ingredients with nutritional values, moving objects, and audio annotations. Importantly, all annotations are grounded in 3D through digital twinning of the scene, fixtures, object locations,…

    Submitted 25 March, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

    Comments: Accepted at CVPR 2025. Project Webpage and Dataset: http://hd-epic.github.io

  19. arXiv:2501.18012 [pdf, ps, other]

    cs.LG cond-mat.dis-nn

    Growing Neural Networks: Dynamic Evolution through Gradient Descent

    Authors: Anil Radhakrishnan, John F. Lindner, Scott T. Miller, Sudeshna Sinha, William L. Ditto

    Abstract: In contrast to conventional artificial neural networks, which are structurally static, we present two approaches for evolving small networks into larger ones during training. The first method employs an auxiliary weight that directly controls network size, while the second uses a controller-generated mask to modulate neuron participation. Both approaches optimize network size through the same grad…

    Submitted 25 July, 2025; v1 submitted 29 January, 2025; originally announced January 2025.

    Comments: 11 pages, 9 figures; adding scaling results, revised introduction, abstract, and title

    Journal ref: Proceedings of the Royal Society A, volume 481, issue 2318, pages 20250222(1-15) (16 July 2025)

  20. arXiv:2501.09221 [pdf, other]

    cs.CV cs.LG

    ASCENT-ViT: Attention-based Scale-aware Concept Learning Framework for Enhanced Alignment in Vision Transformers

    Authors: Sanchit Sinha, Guangzhi Xiong, Aidong Zhang

    Abstract: As Vision Transformers (ViTs) are increasingly adopted in sensitive vision applications, there is a growing demand for improved interpretability. This has led to efforts to forward-align these models with carefully annotated abstract, human-understandable semantic entities - concepts. Concepts provide global rationales to the model predictions and can be quickly understood/intervened on by domain…

    Submitted 3 February, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

  21. arXiv:2501.08600 [pdf, other]

    cs.SE cs.AI

    AutoRestTest: A Tool for Automated REST API Testing Using LLMs and MARL

    Authors: Tyler Stennett, Myeongsoo Kim, Saurabh Sinha, Alessandro Orso

    Abstract: As REST APIs have become widespread in modern web services, comprehensive testing of these APIs is increasingly crucial. Because of the vast search space of operations, parameters, and parameter values, along with their dependencies and constraints, current testing tools often achieve low code coverage, resulting in suboptimal fault detection. To address this limitation, we present AutoRestTest, a…

    Submitted 3 March, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

    Comments: To be published in the 47th IEEE/ACM International Conference on Software Engineering - Demonstration Track (ICSE-Demo 2025)

  22. arXiv:2501.08598 [pdf, other]

    cs.SE cs.AI

    LlamaRestTest: Effective REST API Testing with Small Language Models

    Authors: Myeongsoo Kim, Saurabh Sinha, Alessandro Orso

    Abstract: Modern web services rely heavily on REST APIs, typically documented using the OpenAPI specification. The widespread adoption of this standard has resulted in the development of many black-box testing tools that generate tests based on OpenAPI specifications. Although Large Language Models (LLMs) have shown promising test-generation abilities, their application to REST API testing remains mostly un…

    Submitted 3 April, 2025; v1 submitted 15 January, 2025; originally announced January 2025.

    Comments: To be published in the ACM International Conference on the Foundations of Software Engineering (FSE 2025)

  23. arXiv:2501.02618 [pdf, other]

    cs.CV

    Identifying Surgical Instruments in Pedagogical Cataract Surgery Videos through an Optimized Aggregation Network

    Authors: Sanya Sinha, Michal Balazia, Francois Bremond

    Abstract: Instructional cataract surgery videos are crucial for ophthalmologists and trainees to observe surgical details repeatedly. This paper presents a deep learning model for real-time identification of surgical instruments in these videos, using a custom dataset scraped from open-access sources. Inspired by the architecture of YOLOV9, the model employs a Programmable Gradient Information (PGI) mechani…

    Submitted 5 January, 2025; originally announced January 2025.

    Comments: Preprint. Full paper accepted at the IEEE International Conference on Image Processing Applications and Systems (IPAS), Lyon, France, Jan 2025. 6 pages

    MSC Class: 68T05; 68T10 ACM Class: I.5

  24. arXiv:2501.01933 [pdf]

    cs.CL cs.AI

    Abstractive Text Summarization for Contemporary Sanskrit Prose: Issues and Challenges

    Authors: Shagun Sinha

    Abstract: This thesis presents Abstractive Text Summarization models for contemporary Sanskrit prose. The first chapter, titled Introduction, presents the motivation behind this work, the research questions, and the conceptual framework. Sanskrit is a low-resource inflectional language. The key research question that this thesis investigates is what the challenges in developing an abstractive TS for Sanskri…

    Submitted 3 January, 2025; originally announced January 2025.

    Comments: PhD Thesis

  25. arXiv:2412.02883 [pdf, other]

    cs.SE cs.CL cs.LG

    TDD-Bench Verified: Can LLMs Generate Tests for Issues Before They Get Resolved?

    Authors: Toufique Ahmed, Martin Hirzel, Rangeet Pan, Avraham Shinnar, Saurabh Sinha

    Abstract: Test-driven development (TDD) is the practice of writing tests first and coding later, and the proponents of TDD expound its numerous benefits. For instance, given an issue on a source code repository, tests can clarify the desired behavior among stake-holders before anyone writes code for the agreed-upon fix. Although there has been a lot of work on automated test generation for the practice "wri…

    Submitted 3 December, 2024; originally announced December 2024.

  26. arXiv:2411.17945 [pdf, other]

    cs.CV cs.AI cs.GR cs.LG

    MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation

    Authors: Sankalp Sinha, Mohammad Sadil Khan, Muhammad Usama, Shino Sam, Didier Stricker, Sk Aziz Ali, Muhammad Zeshan Afzal

    Abstract: Generating high-fidelity 3D content from text prompts remains a significant challenge in computer vision due to the limited size, diversity, and annotation depth of the existing datasets. To address this, we introduce MARVEL-40M+, an extensive dataset with 40 million text annotations for over 8.9 million 3D assets aggregated from seven major 3D datasets. Our contribution is a novel multi-stage ann…

    Submitted 26 March, 2025; v1 submitted 26 November, 2024; originally announced November 2024.

  27. arXiv:2411.07098 [pdf, other]

    cs.SE cs.AI

    A Multi-Agent Approach for REST API Testing with Semantic Graphs and LLM-Driven Inputs

    Authors: Myeongsoo Kim, Tyler Stennett, Saurabh Sinha, Alessandro Orso

    Abstract: As modern web services increasingly rely on REST APIs, their thorough testing has become crucial. Furthermore, the advent of REST API documentation languages, such as the OpenAPI Specification, has led to the emergence of many black-box REST API testing tools. However, these tools often focus on individual test elements in isolation (e.g., APIs, parameters, values), resulting in lower coverage and…

    Submitted 21 January, 2025; v1 submitted 11 November, 2024; originally announced November 2024.

    Comments: To be published in the 47th IEEE/ACM International Conference on Software Engineering (ICSE 2025)

  28. arXiv:2410.24117 [pdf, ps, other]

    cs.SE cs.LG

    AlphaTrans: A Neuro-Symbolic Compositional Approach for Repository-Level Code Translation and Validation

    Authors: Ali Reza Ibrahimzada, Kaiyao Ke, Mrigank Pawagi, Muhammad Salman Abid, Rangeet Pan, Saurabh Sinha, Reyhaneh Jabbarvand

    Abstract: Code translation transforms programs from one programming language (PL) to another. Several rule-based transpilers have been designed to automate code translation between different pairs of PLs. However, the rules can become obsolete as the PLs evolve and cannot generalize to other PLs. Recent studies have explored the automation of code translation using Large Language Models (LLMs). One key obse…

    Submitted 19 June, 2025; v1 submitted 31 October, 2024; originally announced October 2024.

    Comments: Published in FSE 2025

  29. arXiv:2410.15491 [pdf, other]

    cs.LG stat.ME

    Structural Causality-based Generalizable Concept Discovery Models

    Authors: Sanchit Sinha, Guangzhi Xiong, Aidong Zhang

    Abstract: The rising need for explainable deep neural network architectures has utilized semantic concepts as explainable units. Several approaches utilizing disentangled representation learning estimate the generative factors and utilize them as concepts for explaining DNNs. However, even though the generative factors for a dataset remain fixed, concepts are not fixed entities and vary based on downstream…

    Submitted 20 October, 2024; originally announced October 2024.

  30. arXiv:2410.13685 [pdf, other]

    cs.CV

    Label-free prediction of fluorescence markers in bovine satellite cells using deep learning

    Authors: Sania Sinha, Aarham Wasit, Won Seob Kim, Jongkyoo Kim, Jiyoon Yi

    Abstract: Assessing the quality of bovine satellite cells (BSCs) is essential for the cultivated meat industry, which aims to address global food sustainability challenges. This study aims to develop a label-free method for predicting fluorescence markers in isolated BSCs using deep learning. We employed a U-Net-based CNN model to predict multiple fluorescence signals from a single bright-field microscopy i…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: 11 pages, 4 figures

  31. arXiv:2410.13007 [pdf, other]

    cs.SE

    Codellm-Devkit: A Framework for Contextualizing Code LLMs with Program Analysis Insights

    Authors: Rahul Krishna, Rangeet Pan, Raju Pavuluri, Srikanth Tamilselvam, Maja Vukovic, Saurabh Sinha

    Abstract: Large Language Models for Code (or code LLMs) are increasingly gaining popularity and capabilities, offering a wide array of functionalities such as code completion, code generation, code summarization, test generation, code translation, and more. To leverage code LLMs to their full potential, developers must provide code-specific contextual information to the models. These are typically derived a…

    Submitted 16 October, 2024; originally announced October 2024.

  32. arXiv:2410.12665 [pdf, other]

    cond-mat.soft cond-mat.stat-mech cs.AI math.DS math.OC

    Hamiltonian bridge: A physics-driven generative framework for targeted pattern control

    Authors: Vishaal Krishnan, Sumit Sinha, L. Mahadevan

    Abstract: Patterns arise spontaneously in a range of systems spanning the sciences, and their study typically focuses on mechanisms to understand their evolution in space-time. Increasingly, there has been a transition towards controlling these patterns in various functional settings, with implications for engineering. Here, we combine our knowledge of a general class of dynamical laws for pattern formation…

    Submitted 16 October, 2024; originally announced October 2024.

    Comments: 29 pages, 8 figures

  33. arXiv:2410.10017 [pdf, other]

    cs.RO cs.CV cs.GR

    REPeat: A Real2Sim2Real Approach for Pre-acquisition of Soft Food Items in Robot-assisted Feeding

    Authors: Nayoung Ha, Ruolin Ye, Ziang Liu, Shubhangi Sinha, Tapomayukh Bhattacharjee

    Abstract: The paper presents REPeat, a Real2Sim2Real framework designed to enhance bite acquisition in robot-assisted feeding for soft foods. It uses `pre-acquisition actions' such as pushing, cutting, and flipping to improve the success rate of bite acquisition actions such as skewering, scooping, and twirling. If the data-driven model predicts low success for direct bite acquisition, the system initiates…

    Submitted 13 October, 2024; originally announced October 2024.

  34. arXiv:2410.04723 [pdf, other]

    cs.LG cs.AI stat.ML

    ProtoNAM: Prototypical Neural Additive Models for Interpretable Deep Tabular Learning

    Authors: Guangzhi Xiong, Sanchit Sinha, Aidong Zhang

    Abstract: Generalized additive models (GAMs) have long been a powerful white-box tool for the intelligible analysis of tabular data, revealing the influence of each feature on the model predictions. Despite the success of neural networks (NNs) in various domains, their application as NN-based GAMs in tabular data analysis remains suboptimal compared to tree-based ones, and the opacity of encoders in NN-GAMs…

    Submitted 6 October, 2024; originally announced October 2024.

  35. arXiv:2409.17106 [pdf, other]

    cs.CV cs.GR

    Text2CAD: Generating Sequential CAD Models from Beginner-to-Expert Level Text Prompts

    Authors: Mohammad Sadil Khan, Sankalp Sinha, Talha Uddin Sheikh, Didier Stricker, Sk Aziz Ali, Muhammad Zeshan Afzal

    Abstract: Prototyping complex computer-aided design (CAD) models in modern softwares can be very time-consuming. This is due to the lack of intelligent systems that can quickly generate simpler intermediate parts. We propose Text2CAD, the first AI framework for generating text-to-parametric CAD models using designer-friendly instructions for all skill levels. Furthermore, we introduce a data annotation pipe…

    Submitted 25 September, 2024; originally announced September 2024.

    Comments: Accepted in NeurIPS 2024 (Spotlight)

  36. On the Effectiveness of Neural Operators at Zero-Shot Weather Downscaling

    Authors: Saumya Sinha, Brandon Benton, Patrick Emami

    Abstract: Machine learning (ML) methods have shown great potential for weather downscaling. These data-driven approaches provide a more efficient alternative for producing high-resolution weather datasets and forecasts compared to physics-based numerical simulations. Neural operators, which learn solution operators for a family of partial differential equations (PDEs), have shown great success in scientific…

    Submitted 18 February, 2025; v1 submitted 20 September, 2024; originally announced September 2024.

    Journal ref: Environ. Data Science 4 (2025) e21

  37. arXiv:2409.03093 [pdf, other]

    cs.SE

    ASTER: Natural and Multi-language Unit Test Generation with LLMs

    Authors: Rangeet Pan, Myeongsoo Kim, Rahul Krishna, Raju Pavuluri, Saurabh Sinha

    Abstract: Implementing automated unit tests is an important but time-consuming activity in software development. To assist developers in this task, many techniques for automating unit test generation have been developed. However, despite this effort, usable tools exist for very few programming languages. Moreover, studies have found that automatically generated tests suffer poor readability and do not resem…

    Submitted 15 January, 2025; v1 submitted 4 September, 2024; originally announced September 2024.

    Comments: Accepted at ICSE-SEIP, 2025

  38. arXiv:2408.06975 [pdf, other]

    cs.CV cs.AI cs.GR

    SpectralGaussians: Semantic, spectral 3D Gaussian splatting for multi-spectral scene representation, visualization and analysis

    Authors: Saptarshi Neil Sinha, Holger Graf, Michael Weinmann

    Abstract: We propose a novel cross-spectral rendering framework based on 3D Gaussian Splatting (3DGS) that generates realistic and semantically meaningful splats from registered multi-view spectrum and segmentation maps. This extension enhances the representation of scenes with multiple spectra, providing insights into the underlying materials and segmentation. We introduce an improved physically-based rend…

    Submitted 13 August, 2024; originally announced August 2024.

    ACM Class: I.2.10; I.3.7; I.4.8; I.4.1

  39. arXiv:2407.19300 [pdf, other]

    cs.LG cs.AI

    CoLiDR: Concept Learning using Aggregated Disentangled Representations

    Authors: Sanchit Sinha, Guangzhi Xiong, Aidong Zhang

    Abstract: Interpretability of Deep Neural Networks using concept-based models offers a promising way to explain model behavior through human-understandable concepts. A parallel line of research focuses on disentangling the data distribution into its underlying generative factors, in turn explaining the data generation process. While both directions have received extensive attention, little work has been don…

    Submitted 27 July, 2024; originally announced July 2024.

    Comments: KDD 2024

  40. Shape2.5D: A Dataset of Texture-less Surfaces for Depth and Normals Estimation

    Authors: Muhammad Saif Ullah Khan, Sankalp Sinha, Didier Stricker, Marcus Liwicki, Muhammad Zeshan Afzal

    Abstract: Reconstructing texture-less surfaces poses unique challenges in computer vision, primarily due to the lack of specialized datasets that cater to the nuanced needs of depth and normals estimation in the absence of textural information. We introduce "Shape2.5D," a novel, large-scale dataset designed to address this gap. Comprising 1.17 million frames spanning over 39,772 3D models and 48 unique obje…

    Submitted 5 November, 2024; v1 submitted 22 June, 2024; originally announced June 2024.

    Comments: Accepted for publication in IEEE Access

  41. arXiv:2406.10764  [pdf, other]

    cs.CL

    GNOME: Generating Negotiations through Open-Domain Mapping of Exchanges

    Authors: Darshan Deshpande, Shambhavi Sinha, Anirudh Ravi Kumar, Debaditya Pal, Jonathan May

    Abstract: Language Models have previously shown strong negotiation capabilities in closed domains where the negotiation strategy prediction scope is constrained to a specific setup. In this paper, we first show that these models are not generalizable beyond their original training domain despite their wide-scale pretraining. Following this, we propose an automated framework called GNOME, which processes exi…

    Submitted 15 June, 2024; originally announced June 2024.

  42. arXiv:2406.10247  [pdf, other]

    cs.CL cs.AI

    QCQA: Quality and Capacity-aware grouped Query Attention

    Authors: Vinay Joshi, Prashant Laddha, Shambhavi Sinha, Om Ji Omer, Sreenivas Subramoney

    Abstract: Excessive memory requirements of key and value features (KV-cache) present significant challenges in the autoregressive inference of large language models (LLMs), restricting both the speed and length of text generation. Approaches such as Multi-Query Attention (MQA) and Grouped Query Attention (GQA) mitigate these challenges by grouping query heads and consequently reducing the number of correspo…

    Submitted 8 June, 2024; originally announced June 2024.

  43. arXiv:2406.08787  [pdf, other]

    cs.AI

    A Survey on Compositional Learning of AI Models: Theoretical and Experimental Practices

    Authors: Sania Sinha, Tanawan Premsri, Parisa Kordjamshidi

    Abstract: Compositional learning, mastering the ability to combine basic concepts and construct more intricate ones, is crucial for human cognition, especially in human language comprehension and visual perception. This notion is tightly connected to generalization over unobserved situations. Despite its integral role in intelligence, there is a lack of systematic theoretical and experimental research metho…

    Submitted 20 November, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Journal ref: Transactions of Machine Learning Research, 2024

  44. arXiv:2405.19653  [pdf, other]

    cs.LG cs.CL eess.SY

    SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems

    Authors: Patrick Emami, Zhaonan Li, Saumya Sinha, Truc Nguyen

    Abstract: Surrogate models are used to predict the behavior of complex energy systems that are too expensive to simulate with traditional numerical methods. Our work introduces the use of language descriptions, which we call "system captions" or SysCaps, to interface with such surrogates. We argue that interacting with surrogates through text, particularly natural language, makes these models more accessi…

    Submitted 18 April, 2025; v1 submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted at ICLR 2025. 23 pages. Updated with final camera ready version

  45. arXiv:2405.11446  [pdf, other]

    cs.CL cs.LG

    MAML-en-LLM: Model Agnostic Meta-Training of LLMs for Improved In-Context Learning

    Authors: Sanchit Sinha, Yuguang Yue, Victor Soto, Mayank Kulkarni, Jianhua Lu, Aidong Zhang

    Abstract: Adapting large language models (LLMs) to unseen tasks with in-context training samples without fine-tuning remains an important research problem. To learn a robust LLM that adapts well to unseen tasks, multiple meta-training approaches have been proposed such as MetaICL and MetaICT, which involve meta-training pre-trained LLMs on a wide variety of diverse tasks. These meta-training approaches esse…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: KDD 2024, 11 pages(9 main, 2 ref, 1 App) Openreview https://openreview.net/forum?id=JwecLNhWDy&referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DKDD.org%2F2024%2FResearch_Track%2FAuthors%23your-submissions)

  46. arXiv:2405.03660  [pdf, other]

    cs.CV

    CICA: Content-Injected Contrastive Alignment for Zero-Shot Document Image Classification

    Authors: Sankalp Sinha, Muhammad Saif Ullah Khan, Talha Uddin Sheikh, Didier Stricker, Muhammad Zeshan Afzal

    Abstract: Zero-shot learning has been extensively investigated in the broader field of visual recognition, attracting significant interest recently. However, the current work on zero-shot learning in document image classification remains scarce. The existing studies either focus exclusively on zero-shot inference, or their evaluation does not align with the established criteria of zero-shot evaluation in th…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 18 pages, 4 figures; accepted at ICDAR 2024

  47. arXiv:2405.00349  [pdf, other]

    cs.LG

    A Self-explaining Neural Architecture for Generalizable Concept Learning

    Authors: Sanchit Sinha, Guangzhi Xiong, Aidong Zhang

    Abstract: With the wide proliferation of Deep Neural Networks in high-stake applications, there is a growing demand for explainability behind their decision-making process. Concept learning models attempt to learn high-level 'concepts' - abstract entities that align with human understanding, and thus provide interpretability to DNN architectures. However, in this paper, we demonstrate that present SOTA conc…

    Submitted 5 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: IJCAI 2024. 16 pages (7 main content, 2 references, 7 Appendix) Code available at https://github.com/sanchit97/secl

  48. arXiv:2404.06405  [pdf, other]

    cs.AI cs.CG cs.CL cs.LG

    Wu's Method can Boost Symbolic AI to Rival Silver Medalists and AlphaGeometry to Outperform Gold Medalists at IMO Geometry

    Authors: Shiven Sinha, Ameya Prabhu, Ponnurangam Kumaraguru, Siddharth Bhat, Matthias Bethge

    Abstract: Proving geometric theorems constitutes a hallmark of visual reasoning combining both intuitive and logical skills. Therefore, automated theorem proving of Olympiad-level geometry problems is considered a notable milestone in human-level automated reasoning. The introduction of AlphaGeometry, a neuro-symbolic model trained with 100 million synthetic samples, marked a major breakthrough. It solved 2…

    Submitted 11 April, 2024; v1 submitted 9 April, 2024; originally announced April 2024.

    Comments: Work in Progress. Released for wider feedback

  49. arXiv:2403.18074  [pdf, other]

    cs.CV eess.IV

    Every Shot Counts: Using Exemplars for Repetition Counting in Videos

    Authors: Saptarshi Sinha, Alexandros Stergiou, Dima Damen

    Abstract: Video repetition counting infers the number of repetitions of recurring actions or motion within a video. We propose an exemplar-based approach that discovers visual correspondence of video exemplars across repetitions within target videos. Our proposed Every Shot Counts (ESCounts) model is an attention-based encoder-decoder that encodes videos of varying lengths alongside exemplars from the same…

    Submitted 13 October, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted at Asian Conference on Computer Vision (ACCV) 2024, project page: https://sinhasaptarshi.github.io/escounts, and code: https://github.com/sinhasaptarshi/EveryShotCounts

  50. arXiv:2402.15589  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    LLMs as Meta-Reviewers' Assistants: A Case Study

    Authors: Eftekhar Hossain, Sanjeev Kumar Sinha, Naman Bansal, Alex Knipper, Souvika Sarkar, John Salvador, Yash Mahajan, Sri Guttikonda, Mousumi Akter, Md. Mahadi Hassan, Matthew Freestone, Matthew C. Williams Jr., Dongji Feng, Santu Karmaker

    Abstract: One of the most important yet onerous tasks in the academic peer-reviewing process is composing meta-reviews, which involves assimilating diverse opinions from multiple expert peers, formulating one's self-judgment as a senior expert, and then summarizing all these perspectives into a concise holistic overview to make an overall recommendation. This process is time-consuming and can be compromised…

    Submitted 8 February, 2025; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Accepted to NAACL 2025, 41 pages

    ACM Class: I.2.7