Showing 1–50 of 580 results for author: Chakraborty, S

  1. arXiv:2509.22876  [pdf, ps, other]

    cs.CL cs.LG

    HEART: Emotionally-driven test-time scaling of Language Models

    Authors: Gabriela Pinto, Palash Goyal, Yiwen Song, Souradip Chakraborty, Zifeng Wang, Tomas Pfister, Hamid Palangi

    Abstract: Test-time scaling has shown considerable success in improving the performance of language models on complex reasoning tasks without requiring fine-tuning. However, current strategies such as self-reflection primarily focus on logical or structural refinement. They do not leverage the guiding potential of affective feedback. Inspired by psychological research showing that emotions can modulate cogn…

    Submitted 26 September, 2025; originally announced September 2025.

  2. Lost in Transition: The Struggle of Women Returning to Software Engineering Research after Career Breaks

    Authors: Shalini Chakraborty, Sebastian Baltes

    Abstract: The IT industry provides supportive pathways such as returnship programs, coding boot camps, and buddy systems for women re-entering their job after a career break. Academia, however, offers limited opportunities to motivate women to return. We propose a diverse multicultural research project investigating the challenges faced by women with software engineering (SE) backgrounds re-entering academi…

    Submitted 25 September, 2025; originally announced September 2025.

    Comments: 3 pages, published in the Proceedings of the 18th International Conference on Cooperative and Human Aspects of Software Engineering (CHASE 2025)

  3. arXiv:2509.17289  [pdf, ps, other]

    cs.CL

    Automated Knowledge Graph Construction using Large Language Models and Sentence Complexity Modelling

    Authors: Sydney Anuyah, Mehedi Mahmud Kaushik, Krishna Dwarampudi, Rakesh Shiradkar, Arjan Durresi, Sunandan Chakraborty

    Abstract: We introduce CoDe-KG, an open-source, end-to-end pipeline for extracting sentence-level knowledge graphs by combining robust coreference resolution with syntactic sentence decomposition. Using our model, we contribute a dataset of over 150,000 knowledge triples, which is open source. We also contribute a training corpus of 7248 rows for sentence complexity, 190 rows of gold human annotations for c…

    Submitted 21 September, 2025; originally announced September 2025.

  4. arXiv:2509.13896  [pdf, ps, other]

    cs.SE

    Mind the Ethics! The Overlooked Ethical Dimensions of GenAI in Software Modeling Education

    Authors: Shalini Chakraborty, Lola Burgueño, Nathalie Moreno, Javier Troya, Paula Muñoz

    Abstract: Generative Artificial Intelligence (GenAI) is rapidly gaining momentum in software modeling education, embraced by both students and educators. As GenAI assists with interpreting requirements, formalizing models, and translating students' mental models into structured notations, it increasingly shapes core learning outcomes such as domain comprehension, diagrammatic thinking, and modeling fluency…

    Submitted 17 September, 2025; originally announced September 2025.

    Comments: 8 pages, Educators Symposium at MODELS 2025

  5. arXiv:2509.06731  [pdf, ps, other]

    math.CO cs.CG

    No Infinite $(p,q)$-Theorem for Piercing Compact Convex Sets with Lines in $\mathbb{R}^3$

    Authors: Sutanoya Chakraborty, Arijit Ghosh

    Abstract: An infinite $(p,q)$-theorem, or an $(\aleph_0,q)$-theorem, involving two families $\mathcal{F}$ and $\mathcal{G}$ of sets, states that if in every infinite subset of $\mathcal{F}$, there are $q$ sets that are intersected by some set in $\mathcal{G}$, then there is a finite set $S_{\mathcal{F}}\subseteq\mathcal{G}$ such that for every $C\in\mathcal{F}$, there is a $B\in S_{\mathcal{F}}$ with…

    Submitted 11 September, 2025; v1 submitted 8 September, 2025; originally announced September 2025.

  6. arXiv:2509.04784  [pdf, ps, other]

    cs.CL cs.AI

    Enhancing Diversity in Large Language Models via Determinantal Point Processes

    Authors: Yilei Chen, Souradip Chakraborty, Lorenz Wolf, Ioannis Ch. Paschalidis, Aldo Pacchiano

    Abstract: Supervised fine-tuning and reinforcement learning are two popular methods for post-training large language models (LLMs). While improving the model's performance on downstream tasks, they often reduce the model's output diversity, leading to narrow, canonical responses. Existing methods to enhance diversity are limited, either by operating at inference time or by focusing on lexical differences. W…

    Submitted 4 September, 2025; originally announced September 2025.

  7. arXiv:2508.20616  [pdf, ps, other]

    cs.LG stat.ML

    Dimension Agnostic Testing of Survey Data Credibility through the Lens of Regression

    Authors: Debabrota Basu, Sourav Chakraborty, Debarshi Chanda, Buddha Dev Das, Arijit Ghosh, Arnab Ray

    Abstract: Assessing whether a sample survey credibly represents the population is a critical question for ensuring the validity of downstream research. Generally, this problem reduces to estimating the distance between two high-dimensional distributions, which typically requires a number of samples that grows exponentially with the dimension. However, depending on the model used for data analysis, the concl…

    Submitted 28 August, 2025; originally announced August 2025.

    Comments: 30 pages, 8 figures, 6 Tables

  8. arXiv:2508.19466  [pdf, ps, other]

    cs.LG cs.AI

    Incentivized Lipschitz Bandits

    Authors: Sourav Chakraborty, Amit Kiran Rege, Claire Monteleoni, Lijun Chen

    Abstract: We study incentivized exploration in multi-armed bandit (MAB) settings with infinitely many arms modeled as elements in continuous metric spaces. Unlike classical bandit models, we consider scenarios where the decision-maker (principal) incentivizes myopic agents to explore beyond their greedy choices through compensation, but with the complication of reward drift--biased feedback arising due to t…

    Submitted 26 August, 2025; originally announced August 2025.

  9. arXiv:2508.15220  [pdf, ps, other]

    cs.LG cs.AI cs.LO

    Locally Pareto-Optimal Interpretations for Black-Box Machine Learning Models

    Authors: Aniruddha Joshi, Supratik Chakraborty, S Akshay, Shetal Shah, Hazem Torfah, Sanjit Seshia

    Abstract: Creating meaningful interpretations for black-box machine learning models involves balancing two often conflicting objectives: accuracy and explainability. Exploring the trade-off between these objectives is essential for developing trustworthy interpretations. While many techniques for multi-objective interpretation synthesis have been developed, they typically lack formal guarantees on the Paret…

    Submitted 21 August, 2025; originally announced August 2025.

    Comments: This work has been accepted at ATVA'25

  10. arXiv:2508.11942  [pdf, ps, other]

    cs.SI cs.CY physics.soc-ph

    Trust@Health: A Trust-Based Multilayered Network for Scalable Healthcare Service Management

    Authors: Avijit Gayen, Somyajit Chakraborty, Joydeep Chakraborty, Angshuman Jana

    Abstract: We study the intricate relationships within healthcare systems, focusing on interactions among doctors, departments, and hospitals. Leveraging an evolutionary graph framework, the proposed model emphasizes both intra-layer and inter-layer trust relationships to better understand and optimize healthcare services. The trust-based network facilitates the identification of key healthcare entities by i…

    Submitted 16 August, 2025; originally announced August 2025.

    Comments: Paper submitted to IEEE Access under review

    Report number: 10.1109/ACCESS.2025.3613326

    MSC Class: 91D30; 68R10

    Journal ref: IEEE Access 2025

  11. arXiv:2508.11909  [pdf, ps, other]

    math.CO cs.IT math.NT

    Higher and extended Jacobi polynomials for codes

    Authors: Himadri Shekhar Chakraborty, Tsuyoshi Miezaki

    Abstract: In this paper, we introduce Jacobi polynomial generalizations of several classical invariants in coding theory over finite fields, specifically, the higher and extended weight enumerators, and we establish explicit correspondences between the resulting Jacobi polynomials. Moreover, we present the Jacobi analogue of MacWilliams identity for both higher and extended weight enumerators. We also prese…

    Submitted 16 August, 2025; originally announced August 2025.

    Comments: 23 pages

    MSC Class: Primary 11T71; Secondary 94B05; 11F11

  12. arXiv:2508.09627  [pdf, ps, other]

    cs.LG

    Physics- and geometry-aware spatio-spectral graph neural operator for time-independent and time-dependent PDEs

    Authors: Subhankar Sarkar, Souvik Chakraborty

    Abstract: Solving partial differential equations (PDEs) efficiently and accurately remains a cornerstone challenge in science and engineering, especially for problems involving complex geometries and limited labeled data. We introduce a Physics- and Geometry- Aware Spatio-Spectral Graph Neural Operator ($π$G-Sp$^2$GNO) for learning the solution operators of time-independent and time-dependent PDEs. The prop…

    Submitted 13 August, 2025; originally announced August 2025.

  13. arXiv:2508.09623  [pdf, ps, other]

    stat.ML cs.LG

    Scalable h-adaptive probabilistic solver for time-independent and time-dependent systems

    Authors: Akshay Thakur, Sawan Kumar, Matthew Zahr, Souvik Chakraborty

    Abstract: Solving partial differential equations (PDEs) within the framework of probabilistic numerics offers a principled approach to quantifying epistemic uncertainty arising from discretization. By leveraging Gaussian process regression and imposing the governing PDE as a constraint at a finite set of collocation points, probabilistic numerics delivers mesh-free solutions at arbitrary locations. However,…

    Submitted 14 August, 2025; v1 submitted 13 August, 2025; originally announced August 2025.

  14. arXiv:2508.08488  [pdf, ps, other]

    cs.CV

    MuGa-VTON: Multi-Garment Virtual Try-On via Diffusion Transformers with Prompt Customization

    Authors: Ankan Deria, Dwarikanath Mahapatra, Behzad Bozorgtabar, Mohna Chakraborty, Snehashis Chakraborty, Sudipta Roy

    Abstract: Virtual try-on seeks to generate photorealistic images of individuals in desired garments, a task that must simultaneously preserve personal identity and garment fidelity for practical use in fashion retail and personalization. However, existing methods typically handle upper and lower garments separately, rely on heavy preprocessing, and often fail to preserve person-specific cues such as tattoos…

    Submitted 11 August, 2025; originally announced August 2025.

  15. arXiv:2508.08479  [pdf, ps, other]

    cs.DC cs.LG

    Benchmarking Federated Learning for Throughput Prediction in 5G Live Streaming Applications

    Authors: Yuvraj Dutta, Soumyajit Chatterjee, Sandip Chakraborty, Basabdatta Palit

    Abstract: Accurate and adaptive network throughput prediction is essential for latency-sensitive and bandwidth-intensive applications in 5G and emerging 6G networks. However, most existing methods rely on centralized training with uniformly collected data, limiting their applicability in heterogeneous mobile environments with non-IID data distributions. This paper presents the first comprehensive benchmarki…

    Submitted 11 August, 2025; originally announced August 2025.

    Comments: 14 pages, 24 figures, submitted to IEEE TNET

    MSC Class: 14J60

    ACM Class: F.2.2; I.2.7

  16. arXiv:2508.08052  [pdf, ps, other]

    cs.LG cs.AI

    On Understanding of the Dynamics of Model Capacity in Continual Learning

    Authors: Supriyo Chakraborty, Krishnan Raghavan

    Abstract: The stability-plasticity dilemma, closely related to a neural network's (NN) capacity-its ability to represent tasks-is a fundamental challenge in continual learning (CL). Within this context, we introduce CL's effective model capacity (CLEMC) that characterizes the dynamic behavior of the stability-plasticity balance point. We develop a difference equation to model the evolution of the interplay…

    Submitted 14 August, 2025; v1 submitted 11 August, 2025; originally announced August 2025.

  17. arXiv:2508.07207  [pdf, ps, other]

    cs.LO cs.AI

    Presburger Functional Synthesis: Complexity and Tractable Normal Forms

    Authors: S. Akshay, A. R. Balasubramanian, Supratik Chakraborty, Georg Zetzsche

    Abstract: Given a relational specification between inputs and outputs as a logic formula, the problem of functional synthesis is to automatically synthesize a function from inputs to outputs satisfying the relation. Recently, a rich line of work has emerged tackling this problem for specifications in different theories, from Boolean to general first-order logic. In this paper, we launch an investigation of…

    Submitted 10 August, 2025; originally announced August 2025.

    Comments: Full version of conference paper at KR 2025 (22nd International Conference on Principles of Knowledge Representation and Reasoning)

  18. arXiv:2508.06783  [pdf, ps, other]

    cs.LG cs.AI cs.CR cs.IT

    PROPS: Progressively Private Self-alignment of Large Language Models

    Authors: Noel Teku, Fengwei Tian, Payel Bhattacharjee, Souradip Chakraborty, Amrit Singh Bedi, Ravi Tandon

    Abstract: Alignment is a key step in developing Large Language Models (LLMs) using human feedback to ensure adherence to human values and societal norms. Dependence on human feedback raises privacy concerns about how much a labeler's preferences may reveal about their personal values, beliefs, and personality traits. Existing approaches, such as Differentially Private SGD (DP-SGD), provide rigorous privacy…

    Submitted 8 August, 2025; originally announced August 2025.

  19. arXiv:2508.04819  [pdf, ps, other]

    quant-ph cs.IT math-ph math.OA

    Hybrid oscillator-qudit quantum processors: stabilizer states and symplectic operations

    Authors: Sayan Chakraborty, Victor V. Albert

    Abstract: We construct stabilizer states and error-correcting codes on combinations of discrete- and continuous-variable systems, generalizing the Gottesman-Kitaev-Preskill (GKP) quantum lattice formalism. Our framework absorbs the discrete phase space of a qudit into a hybrid phase space parameterizable entirely by the continuous variables of a harmonic oscillator. The unit cell of a hybrid quantum lattice…

    Submitted 6 August, 2025; originally announced August 2025.

    Comments: 17 pages + appendices, 4 figures

  20. arXiv:2508.01668  [pdf, ps, other]

    eess.IV cs.CV

    Measuring and Predicting Where and When Pathologists Focus their Visual Attention while Grading Whole Slide Images of Cancer

    Authors: Souradeep Chakraborty, Ruoyu Xue, Rajarsi Gupta, Oksana Yaskiv, Constantin Friedman, Natallia Sheuka, Dana Perez, Paul Friedman, Won-Tak Choi, Waqas Mahmud, Beatrice Knudsen, Gregory Zelinsky, Joel Saltz, Dimitris Samaras

    Abstract: The ability to predict the attention of expert pathologists could lead to decision support systems for better pathology training. We developed methods to predict the spatio-temporal (where and when) movements of pathologists' attention as they grade whole slide images (WSIs) of prostate cancer. We characterize a pathologist's attention trajectory by their x, y, and m (magnification) movements of a…

    Submitted 3 August, 2025; originally announced August 2025.

    Comments: Accepted to Medical Image Analysis (MEDIA), Elsevier, 2025. This is the accepted manuscript version; the final published article link will be updated when available

  21. Private key and password protection by steganographic image encryption

    Authors: Debesh Choudhury, Sujoy Chakraborty

    Abstract: We propose a technique to protect and preserve a private key or a passcode in an encrypted two-dimensional graphical image. The plaintext private key or the passcode is converted into an encrypted QR code and embedded into a real-life color image with a steganographic scheme. The private key or the passcode is recovered from the stego color image by first extracting the encrypted QR code from the…

    Submitted 5 June, 2025; originally announced July 2025.

    Comments: 5 pages, 3 figures, Applications of Digital Image Processing XLV, SPIE Optical Engineering + Applications 2022, Proc. SPIE 12226

    Journal ref: Applications of Digital Image Processing XLV, 1222619 (3 October 2022)

  22. arXiv:2507.20976  [pdf, ps, other]

    cs.CV

    Adapting Vehicle Detectors for Aerial Imagery to Unseen Domains with Weak Supervision

    Authors: Xiao Fang, Minhyek Jeon, Zheyang Qin, Stanislav Panev, Celso de Melo, Shuowen Hu, Shayok Chakraborty, Fernando De la Torre

    Abstract: Detecting vehicles in aerial imagery is a critical task with applications in traffic monitoring, urban planning, and defense intelligence. Deep learning methods have provided state-of-the-art (SOTA) results for this application. However, a significant challenge arises when models trained on data from one geographic region fail to generalize effectively to other areas. Variability in factors such a…

    Submitted 28 July, 2025; originally announced July 2025.

    Comments: ICCV 2025

  23. arXiv:2507.14374  [pdf, ps, other]

    cs.CL

    Error-Aware Curriculum Learning for Biomedical Relation Classification

    Authors: Sinchani Chakraborty, Sudeshna Sarkar, Pawan Goyal

    Abstract: Relation Classification (RC) in biomedical texts is essential for constructing knowledge graphs and enabling applications such as drug repurposing and clinical decision-making. We propose an error-aware teacher--student framework that improves RC through structured guidance from a large language model (GPT-4o). Prediction failures from a baseline student model are analyzed by the teacher to classi…

    Submitted 18 July, 2025; originally announced July 2025.

    Comments: 16 pages, 2 figures

  24. arXiv:2507.13670  [pdf, ps, other]

    quant-ph cond-mat.stat-mech cs.CC cs.CR

    Fast computational deep thermalization

    Authors: Shantanav Chakraborty, Soonwon Choi, Soumik Ghosh, Tudor Giurgică-Tiron

    Abstract: Deep thermalization refers to the emergence of Haar-like randomness from quantum systems upon partial measurements. As a generalization of quantum thermalization, it is often associated with high complexity and entanglement. Here, we introduce computational deep thermalization and construct the fastest possible dynamics exhibiting it at infinite effective temperature. Our circuit dynamics produce…

    Submitted 18 July, 2025; originally announced July 2025.

    Comments: 22 pages, 1 figure

  25. arXiv:2507.11655  [pdf, ps, other]

    cs.LO cs.AI

    Counting Answer Sets of Disjunctive Answer Set Programs

    Authors: Mohimenul Kabir, Supratik Chakraborty, Kuldeep S Meel

    Abstract: Answer Set Programming (ASP) provides a powerful declarative paradigm for knowledge representation and reasoning. Recently, counting answer sets has emerged as an important computational problem with applications in probabilistic reasoning, network reliability analysis, and other domains. This has motivated significant research into designing efficient ASP counters. While substantial progress has…

    Submitted 15 July, 2025; originally announced July 2025.

    Comments: Under consideration in Theory and Practice of Logic Programming (TPLP)

  26. arXiv:2507.11574  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Distribution-Free Uncertainty-Aware Virtual Sensing via Conformalized Neural Operators

    Authors: Kazuma Kobayashi, Shailesh Garg, Farid Ahmed, Souvik Chakraborty, Syed Bahauddin Alam

    Abstract: Robust uncertainty quantification (UQ) remains a critical barrier to the safe deployment of deep learning in real-time virtual sensing, particularly in high-stakes domains where sparse, noisy, or non-collocated sensor data are the norm. We introduce the Conformalized Monte Carlo Operator (CMCO), a framework that transforms neural operator-based virtual sensing with calibrated, distribution-free pr…

    Submitted 15 July, 2025; originally announced July 2025.

  27. arXiv:2507.07036  [pdf, ps, other]

    cs.SI cs.AI

    Modeling Heterogeneity across Varying Spatial Extents: Discovering Linkages between Sea Ice Retreat and Ice Shelve Melt in the Antarctic

    Authors: Maloy Kumar Devnath, Sudip Chakraborty, Vandana P. Janeja

    Abstract: Spatial phenomena often exhibit heterogeneity across spatial extents and in proximity, making them complex to model-especially in dynamic regions like ice shelves and sea ice. In this study, we address this challenge by exploring the linkages between sea ice retreat and Antarctic ice shelf (AIS) melt. Although atmospheric forcing and basal melting have been widely studied, the direct impact of sea…

    Submitted 18 June, 2025; originally announced July 2025.

  28. arXiv:2507.02907  [pdf, ps, other]

    cs.LG

    Scaling Transformers for Time Series Forecasting: Do Pretrained Large Models Outperform Small-Scale Alternatives?

    Authors: Sanjay Chakraborty, Ibrahim Delibasoglu, Fredrik Heintz

    Abstract: Large pre-trained models have demonstrated remarkable capabilities across domains, but their effectiveness in time series forecasting remains understudied. This work empirically examines whether pre-trained large-scale time series models (LSTSMs) trained on diverse datasets can outperform traditional non-pretrained small-scale transformers in forecasting tasks. We analyze state-of-the-art (SOTA) p…

    Submitted 24 June, 2025; originally announced July 2025.

  29. arXiv:2506.15906  [pdf, ps, other]

    stat.ML cs.LG

    From Local Interactions to Global Operators: Scalable Gaussian Process Operator for Physical Systems

    Authors: Sawan Kumar, Tapas Tripura, Rajdip Nayek, Souvik Chakraborty

    Abstract: Operator learning offers a powerful paradigm for solving parametric partial differential equations (PDEs), but scaling probabilistic neural operators such as the recently proposed Gaussian Processes Operators (GPOs) to high-dimensional, data-intensive regimes remains a significant challenge. In this work, we introduce a novel, scalable GPO, which capitalizes on sparsity, locality, and structural i…

    Submitted 18 June, 2025; originally announced June 2025.

  30. arXiv:2506.12921  [pdf, ps, other]

    cs.DM

    Shortest Paths in a Weighted Simplicial Complex

    Authors: Sukrit Chakraborty, Prasanta Choudhury, Arindam Mukherjee

    Abstract: Simplicial complexes are extensively studied in the field of algebraic topology. They have gained attention in recent time due to their applications in fields like theoretical distributed computing and simplicial neural networks. Graphs are mono-dimensional simplicial complex. Graph theory has application in topics like theoretical computer science, operations research, bioinformatics and social s…

    Submitted 15 June, 2025; originally announced June 2025.

    Comments: Comments are welcome. 11 pages, 5 figures

    MSC Class: 68R01; 68R10; 68Q25; 68W01

  31. arXiv:2506.12061  [pdf, ps, other]

    stat.CO cs.LO

    Assessing the Quality of Binomial Samplers: A Statistical Distance Framework

    Authors: Uddalok Sarkar, Sourav Chakraborty, Kuldeep S. Meel

    Abstract: Randomized algorithms depend on accurate sampling from probability distributions, as their correctness and performance hinge on the quality of the generated samples. However, even for common distributions like Binomial, exact sampling is computationally challenging, leading standard library implementations to rely on heuristics. These heuristics, while efficient, suffer from approximation and syst…

    Submitted 31 May, 2025; originally announced June 2025.

    Comments: The full version of the conference paper to be published at CAV-25

  32. arXiv:2506.10217  [pdf]

    cs.CY

    Data-Centric Safety and Ethical Measures for Data and AI Governance

    Authors: Srija Chakraborty

    Abstract: Datasets play a key role in imparting advanced capabilities to artificial intelligence (AI) foundation models that can be adapted to various downstream tasks. These downstream applications can introduce both beneficial and harmful capabilities -- resulting in dual use AI foundation models, with various technical and regulatory approaches to monitor and manage these risks. However, despite the cruc…

    Submitted 30 June, 2025; v1 submitted 11 June, 2025; originally announced June 2025.

    Comments: Paper accepted and presented at the AAAI 2025 Workshop on Datasets and Evaluators of AI Safety https://sites.google.com/view/datasafe25/home

  33. arXiv:2506.05447  [pdf, ps, other]

    cs.LG cs.AI

    Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning

    Authors: Andrei Mircea, Supriyo Chakraborty, Nima Chitsazan, Milind Naphade, Sambit Sahu, Irina Rish, Ekaterina Lobacheva

    Abstract: This work aims to understand how scaling improves language models, specifically in terms of training dynamics. We find that language models undergo loss deceleration early in training; an abrupt slowdown in the rate of loss improvement, resulting in piecewise linear behaviour of the loss curve in log-log space. Scaling up the model mitigates this transition by (1) decreasing the loss at which dece…

    Submitted 14 July, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

    Comments: Published as a conference paper at ACL 2025

    ACM Class: I.2.7

  34. arXiv:2506.04210  [pdf, ps, other]

    cs.AI cs.CL

    Does Thinking More always Help? Understanding Test-Time Scaling in Reasoning Models

    Authors: Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy, Yifu Lu, Mengdi Wang, Dinesh Manocha, Furong Huang, Mohammad Ghavamzadeh, Amrit Singh Bedi

    Abstract: Recent trends in test-time scaling for reasoning models (e.g., OpenAI o1, DeepSeek R1) have led to a popular belief that extending thinking traces using prompts like "Wait" or "Let me rethink" can improve performance. This raises a natural question: Does thinking more at test-time truly lead to better reasoning? To answer this question, we perform a detailed empirical study across models and bench…

    Submitted 13 June, 2025; v1 submitted 4 June, 2025; originally announced June 2025.

  35. arXiv:2506.00419  [pdf, ps, other]

    cs.CR cs.SE

    Teaching an Old LLM Secure Coding: Localized Preference Optimization on Distilled Preferences

    Authors: Mohammad Saqib Hasan, Saikat Chakraborty, Santu Karmaker, Niranjan Balasubramanian

    Abstract: LLM generated code often contains security issues. We address two key challenges in improving secure code generation. First, obtaining high quality training data covering a broad set of security issues is critical. To address this, we introduce a method for distilling a preference dataset of insecure and secure code pairs from frontier LLMs, along with a security reasoning that explains the issues…

    Submitted 10 September, 2025; v1 submitted 31 May, 2025; originally announced June 2025.

    Comments: Accepted to ACL 2025 (Main)

  36. arXiv:2505.23729  [pdf, ps, other]

    cs.CL cs.AI

    Bounded Rationality for LLMs: Satisficing Alignment at Inference-Time

    Authors: Mohamad Chehade, Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy, Dinesh Manocha, Hao Zhu, Amrit Singh Bedi

    Abstract: Aligning large language models with humans is challenging due to the inherently multifaceted nature of preference feedback. While existing approaches typically frame this as a multi-objective optimization problem, they often overlook how humans actually make decisions. Research on bounded rationality suggests that human decision making follows satisficing strategies-optimizing primary objectives w…

    Submitted 31 May, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

    Comments: Accepted at ICML 2025

  37. arXiv:2505.21689  [pdf, other]

    cs.CL cs.AI cs.LG

    LLMPR: A Novel LLM-Driven Transfer Learning based Petition Ranking Model

    Authors: Avijit Gayen, Somyajit Chakraborty, Mainak Sen, Soham Paul, Angshuman Jana

    Abstract: The persistent accumulation of unresolved legal cases, especially within the Indian judiciary, significantly hampers the timely delivery of justice. Manual methods of prioritizing petitions are often prone to inefficiencies and subjective biases further exacerbating delays. To address this issue, we propose LLMPR (Large Language Model-based Petition Ranking), an automated framework that utilizes t…

    Submitted 27 May, 2025; originally announced May 2025.

    Comments: 28 pages, 5 figures, journal paper, submitted to AI and Law

  38. arXiv:2505.20207  [pdf, ps, other]

    cs.LO cs.PL cs.SE

    GPUMC: A Stateless Model Checker for GPU Weak Memory Concurrency

    Authors: Soham Chakraborty, S. Krishna, Andreas Pavlogiannis, Omkar Tuppe

    Abstract: GPU computing is embracing weak memory concurrency for performance improvement. However, compared to CPUs, modern GPUs provide more fine-grained concurrency features such as scopes, have additional properties like divergence, and thereby follow different weak memory consistency models. These features and properties make concurrent programming on GPUs more complex and error-prone. To this end, we p…

    Submitted 26 May, 2025; originally announced May 2025.

  39. arXiv:2505.18179  [pdf, ps, other]

    cs.LG cs.AI

    GAIA: A Foundation Model for Operational Atmospheric Dynamics

    Authors: Ata Akbari Asanjan, Olivia Alexander, Tom Berg, Clara Zhang, Matt Yang, Jad Makki, Disha Shidham, Srija Chakraborty, William Bender, Stephen Peng, Arun Ravindran, Olivier Raiman, David Potere, David Bell

    Abstract: We present the GAIA (Geospatial Artificial Intelligence for Atmospheres) Foundation Model, a novel model that combines masked autoencoders (MAE) and self-DIstillation with NO labels (DINO) for analyzing global atmospheric patterns in satellite imagery. By integrating these complementary self-supervised learning approaches, our model simultaneously captures both local features and global dependenci…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: 14 pages, 7 figures

  40. arXiv:2505.11628  [pdf, ps, other]

    cs.CL cs.LG

    Critique-Guided Distillation for Efficient and Robust Language Model Reasoning

    Authors: Berkcan Kapusuzoglu, Supriyo Chakraborty, Chia-Hsuan Lee, Sambit Sahu

    Abstract: Supervised fine-tuning (SFT) with expert demonstrations often suffers from the imitation problem, where models reproduce correct responses without internalizing the underlying reasoning. We propose Critique-Guided Distillation (CGD), a multi-stage training framework that augments SFT with teacher-generated explanatory critiques and refined responses. Instead of directly imitating teacher outputs,…

    Submitted 26 September, 2025; v1 submitted 16 May, 2025; originally announced May 2025.

  41. arXiv:2505.07830  [pdf, other]

    cs.AI cs.CY

    An Optimized Evacuation Plan for an Active-Shooter Situation Constrained by Network Capacity

    Authors: Joseph Lavalle-Rivera, Aniirudh Ramesh, Subhadeep Chakraborty

    Abstract: A total of more than 3400 public shootings have occurred in the United States between 2016 and 2022. Among these, 25.1% of them took place in an educational institution, 29.4% at the workplace including office buildings, 19.6% in retail store locations, and 13.4% in restaurants and bars. During these critical scenarios, making the right decisions while evacuating can make the difference between li…

    Submitted 29 April, 2025; originally announced May 2025.

    Comments: 21 pages, 18 figures

  42. arXiv:2504.21564  [pdf, ps, other]

    quant-ph cs.DS

    Simulating quantum collision models with Hamiltonian simulations using early fault-tolerant quantum computers

    Authors: Kushagra Garg, Zeeshan Ahmed, Subhadip Mitra, Shantanav Chakraborty

    Abstract: We develop randomized quantum algorithms to simulate quantum collision models, also known as repeated interaction schemes, which provide a rich framework to model various open-system dynamics. The underlying technique involves composing time evolutions of the total (system, bath, and interaction) Hamiltonian and intermittent tracing out of the environment degrees of freedom. This results in a unif…

    Submitted 19 August, 2025; v1 submitted 30 April, 2025; originally announced April 2025.

    Comments: 23+8 pages. 5 figures. Close to the accepted version

    Journal ref: Physical Review A 112, 022425 (2025)

  43. arXiv:2504.21211  [pdf, other]

    cs.LG cs.AI

    A Cost-Effective LLM-based Approach to Identify Wildlife Trafficking in Online Marketplaces

    Authors: Juliana Barbosa, Ulhas Gondhali, Gohar Petrossian, Kinshuk Sharma, Sunandan Chakraborty, Jennifer Jacquet, Juliana Freire

    Abstract: Wildlife trafficking remains a critical global issue, significantly impacting biodiversity, ecological stability, and public health. Despite efforts to combat this illicit trade, the rise of e-commerce platforms has made it easier to sell wildlife products, putting new pressure on wild populations of endangered and threatened species. The use of these platforms also opens a new opportunity: as cri…

    Submitted 29 April, 2025; originally announced April 2025.

  44. arXiv:2504.14495  [pdf, other]

    cs.RO

    RadarTrack: Enhancing Ego-Vehicle Speed Estimation with Single-chip mmWave Radar

    Authors: Argha Sen, Soham Chakraborty, Soham Tripathy, Sandip Chakraborty

    Abstract: In this work, we introduce RadarTrack, an innovative ego-speed estimation framework utilizing a single-chip millimeter-wave (mmWave) radar to deliver robust speed estimation for mobile platforms. Unlike previous methods that depend on cross-modal learning and computationally intensive Deep Neural Networks (DNNs), RadarTrack utilizes a novel phase-based speed estimation approach. This method effect…

    Submitted 20 April, 2025; originally announced April 2025.

    Comments: 8 pages, 9 figures

  45. arXiv:2504.12463  [pdf, other]

    cs.LG cs.AI

    Dense Backpropagation Improves Training for Sparse Mixture-of-Experts

    Authors: Ashwinee Panda, Vatsal Baherwani, Zain Sarwar, Benjamin Therien, Supriyo Chakraborty, Tom Goldstein

    Abstract: Mixture of Experts (MoE) pretraining is more scalable than dense Transformer pretraining, because MoEs learn to route inputs to a sparse set of their feedforward parameters. However, this means that MoEs only receive a sparse backward update, leading to training instability and suboptimal performance. We present a lightweight approximation method that gives the MoE router a dense gradient update w…

    Submitted 17 April, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

  46. arXiv:2504.12461  [pdf, ps, other]

    cs.SE

    On the Need to Rethink Trust in AI Assistants for Software Development: A Critical Review

    Authors: Sebastian Baltes, Timo Speith, Brenda Chiteri, Seyedmoein Mohsenimofidi, Shalini Chakraborty, Daniel Buschek

    Abstract: Trust is a fundamental concept in human decision-making and collaboration that has long been studied in philosophy and psychology. However, software engineering (SE) articles often use the term trust informally: providing an explicit definition or embedding results in established trust models is rare. In SE research on AI assistants, this practice culminates in equating trust with the likelihood of…

    Submitted 5 August, 2025; v1 submitted 16 April, 2025; originally announced April 2025.

    Comments: 17 pages, 3 figures, 3 tables, currently under review

  47. arXiv:2504.11750  [pdf, other]

    cs.DC cs.AI cs.AR cs.PF

    Characterizing and Optimizing LLM Inference Workloads on CPU-GPU Coupled Architectures

    Authors: Prabhu Vellaisamy, Thomas Labonte, Sourav Chakraborty, Matt Turner, Samantika Sury, John Paul Shen

    Abstract: Large language model (LLM)-based inference workloads increasingly dominate data center costs and resource utilization. Therefore, understanding the inference workload characteristics on evolving CPU-GPU coupled architectures is crucial for optimization. This paper presents an in-depth analysis of LLM inference behavior on loosely-coupled (PCIe A100/H100) and closely-coupled (GH200) systems. We ana…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: Accepted for ISPASS 2025

  48. arXiv:2504.03997  [pdf, other]

    cs.IR

    Towards Robust Offline Evaluation: A Causal and Information Theoretic Framework for Debiasing Ranking Systems

    Authors: Seyedeh Baharan Khatami, Sayan Chakraborty, Ruomeng Xu, Babak Salimi

    Abstract: Evaluating retrieval-ranking systems is crucial for developing high-performing models. While online A/B testing is the gold standard, its high cost and risks to user experience require effective offline methods. However, relying on historical interaction data introduces biases (such as selection, exposure, conformity, and position biases) that distort evaluation metrics, driven by the Missing-Not-At…

    Submitted 4 April, 2025; originally announced April 2025.

  49. arXiv:2504.02385  [pdf, ps, other]

    quant-ph cs.DS

    Quantum singular value transformation without block encodings: Near-optimal complexity with minimal ancilla

    Authors: Shantanav Chakraborty, Soumyabrata Hazra, Tongyang Li, Changpeng Shao, Xinzhao Wang, Yuxin Zhang

    Abstract: We develop new algorithms for Quantum Singular Value Transformation (QSVT), a unifying framework that encapsulates most known quantum algorithms and serves as the foundation for new ones. Existing implementations of QSVT rely on block encoding, incurring an intrinsic $O(\log L)$ ancilla overhead and circuit depth $\widetilde{O}(L d \lambda)$ for polynomial transformations of a Hamiltonian…

    Submitted 3 September, 2025; v1 submitted 3 April, 2025; originally announced April 2025.

    Comments: This article has been split into two parts. This version contains the first part and is about QSVT without using block encoding with one ancilla and near optimal circuit depth. The other part is about randomized QSVT and will be available on arXiv soon

  50. arXiv:2504.01931  [pdf, ps, other]

    cs.CL

    On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows

    Authors: Souradip Chakraborty, Mohammadreza Pourreza, Ruoxi Sun, Yiwen Song, Nino Scherrer, Furong Huang, Amrit Singh Bedi, Ahmad Beirami, Jindong Gu, Hamid Palangi, Tomas Pfister

    Abstract: Agentic AI workflows (systems that autonomously plan and act) are becoming widespread, yet their task success rate on complex tasks remains low. A promising solution is inference-time alignment, which uses extra compute at test time to improve performance. Inference-time alignment relies on three components: sampling, evaluation, and feedback. While most prior work studies sampling and automatic e…

    Submitted 7 July, 2025; v1 submitted 2 April, 2025; originally announced April 2025.