Showing 1–50 of 112 results for author: Prasad, S

Searching in archive cs.
  1. arXiv:2506.12154  [pdf, ps, other]

    cs.SD eess.AS

    Adapting Whisper for Streaming Speech Recognition via Two-Pass Decoding

    Authors: Haoran Zhou, Xingchen Song, Brendan Fahy, Qiaochu Song, Binbin Zhang, Zhendong Peng, Anshul Wadhawan, Denglin Jiang, Apurv Verma, Vinay Ramesh, Srivas Prasad, Michele M. Franceschini

    Abstract: OpenAI Whisper is a family of robust Automatic Speech Recognition (ASR) models trained on 680,000 hours of audio. However, its encoder-decoder architecture, trained with a sequence-to-sequence objective, lacks native support for streaming ASR. In this paper, we fine-tune Whisper for streaming ASR using the WeNet toolkit by adopting a Unified Two-pass (U2) structure. We introduce an additional Conn…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: Accepted to INTERSPEECH 2025

  2. arXiv:2506.04539  [pdf, other]

    cs.RO cs.ET cs.LG eess.SY

    Olfactory Inertial Odometry: Sensor Calibration and Drift Compensation

    Authors: Kordel K. France, Ovidiu Daescu, Anirban Paul, Shalini Prasad

    Abstract: Visual inertial odometry (VIO) is a process for fusing visual and kinematic data to understand a machine's state in a navigation task. Olfactory inertial odometry (OIO) is an analog to VIO that fuses signals from gas sensors with inertial data to help a robot navigate by scent. Gas dynamics and environmental factors introduce disturbances into olfactory navigation tasks that can make OIO difficult…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: Published as a full conference paper at the 2025 IEEE International Symposium on Inertial Sensors & Systems

  3. arXiv:2505.19184  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    When Two LLMs Debate, Both Think They'll Win

    Authors: Pradyumna Shyama Prasad, Minh Nhat Nguyen

    Abstract: Can LLMs accurately adjust their confidence when facing opposition? Building on previous studies measuring calibration on static fact-based question-answering tasks, we evaluate Large Language Models (LLMs) in a dynamic, adversarial debate setting, uniquely combining two realistic factors: (a) a multi-turn format requiring models to update beliefs as new information emerges, and (b) a zero-sum str…

    Submitted 9 June, 2025; v1 submitted 25 May, 2025; originally announced May 2025.

  4. arXiv:2505.13687  [pdf, ps, other]

    cs.GT econ.TH

    Revenue-Optimal Efficient Mechanism Design with General Type Spaces

    Authors: Siddharth Prasad, Maria-Florina Balcan, Tuomas Sandholm

    Abstract: We derive the revenue-optimal efficient (welfare-maximizing) mechanism in a general multidimensional mechanism design setting when type spaces -- that is, the underlying domains from which agents' values come -- can capture arbitrarily complex informational constraints about the agents. Type spaces can encode information about agents representing, for example, machine learning predictions of…

    Submitted 19 May, 2025; originally announced May 2025.

  5. arXiv:2505.13680  [pdf, other]

    cs.GT econ.TH math.OC

    Weakest Bidder Types and New Core-Selecting Combinatorial Auctions

    Authors: Siddharth Prasad, Maria-Florina Balcan, Tuomas Sandholm

    Abstract: Core-selecting combinatorial auctions are popular auction designs that constrain prices to eliminate the incentive for any group of bidders -- with the seller -- to renegotiate for a better deal. They help overcome the low-revenue issues of classical combinatorial auctions. We introduce a new class of core-selecting combinatorial auctions that leverage bidder information available to the auction d…

    Submitted 19 May, 2025; originally announced May 2025.

  6. arXiv:2505.13545  [pdf, ps, other]

    cs.IR cs.AI

    Know Or Not: a library for evaluating out-of-knowledge base robustness

    Authors: Jessica Foo, Pradyumna Shyama Prasad, Shaun Khoo

    Abstract: While the capabilities of large language models (LLMs) have progressed significantly, their use in high-stakes applications has been limited due to risks of hallucination. One key approach in reducing hallucination is retrieval-augmented generation (RAG), but even in such setups, LLMs may still hallucinate when presented with questions outside of the knowledge base. Such behavior is unacceptable…

    Submitted 18 May, 2025; originally announced May 2025.

  7. arXiv:2505.13159  [pdf, ps, other]

    cs.AR

    MXDOTP: A RISC-V ISA Extension for Enabling Microscaling (MX) Floating-Point Dot Products

    Authors: Gamze İslamoğlu, Luca Bertaccini, Arpan Suravi Prasad, Francesco Conti, Angelo Garofalo, Luca Benini

    Abstract: Fast and energy-efficient low-bitwidth floating-point (FP) arithmetic is essential for Artificial Intelligence (AI) systems. Microscaling (MX) standardized formats have recently emerged as a promising alternative to baseline low-bitwidth FP formats, offering improved accuracy with a block-wise shared exponent scale combined with per-element values. However, efficiently executing the key linear alg…

    Submitted 19 May, 2025; originally announced May 2025.

    Comments: Accepted at the 36th IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP 2025)

  8. arXiv:2505.01558  [pdf, ps, other]

    cs.CV

    A Sensor Agnostic Domain Generalization Framework for Leveraging Geospatial Foundation Models: Enhancing Semantic Segmentation via Synergistic Pseudo-Labeling and Generative Learning

    Authors: Anan Yaghmour, Melba M. Crawford, Saurabh Prasad

    Abstract: Remote sensing enables a wide range of critical applications such as land cover and land use mapping, crop yield prediction, and environmental monitoring. Advances in satellite technology have expanded remote sensing datasets, yet high-performance segmentation models remain dependent on extensive labeled data, challenged by annotation scarcity and variability across sensors, illumination, and geog…

    Submitted 2 May, 2025; originally announced May 2025.

    Comments: Accepted in the 2025 CVPR Workshop on Foundation and Large Vision Models in Remote Sensing, to appear in CVPR 2025 Workshop Proceedings

  9. arXiv:2502.17459  [pdf, other]

    eess.SP cs.LG

    Study on Downlink CSI compression: Are Neural Networks the Only Solution?

    Authors: K. Sai Praneeth, Anil Kumar Yerrapragada, Achyuth Sagireddi, Sai Prasad, Radha Krishna Ganti

    Abstract: Massive Multi Input Multi Output (MIMO) systems enable higher data rates in the downlink (DL) with spatial multiplexing achieved by forming narrow beams. The higher DL data rates are achieved by effective implementation of spatial multiplexing and beamforming which is subject to availability of DL channel state information (CSI) at the base station. For Frequency Division Duplexing (FDD) systems,…

    Submitted 10 February, 2025; originally announced February 2025.

  10. arXiv:2502.02700  [pdf, other]

    cs.LG

    Scalable Higher Resolution Polar Sea Ice Classification and Freeboard Calculation from ICESat-2 ATL03 Data

    Authors: Jurdana Masuma Iqrah, Younghyun Koo, Wei Wang, Hongjie Xie, Sushil K. Prasad

    Abstract: ICESat-2 (IS2) by NASA is an Earth-observing satellite that measures high-resolution surface elevation. The IS2's ATL07 and ATL10 sea ice elevation and freeboard products consist of 10m-200m segments which aggregate 150 signal photons from the raw ATL03 (geolocated photon) data. These aggregated products can potentially overestimate local sea surface height, thus underestimating the calculations of freeb…

    Submitted 4 February, 2025; originally announced February 2025.

  11. arXiv:2502.02349  [pdf, other]

    cs.AR cs.DC cs.OS cs.PF

    Random Adaptive Cache Placement Policy

    Authors: Vrushank Ahire, Pranav Menon, Aniruddh Muley, Abhinandan S. Prasad

    Abstract: This paper presents a new hybrid cache replacement algorithm that combines random allocation with a modified V-Way cache implementation. Our RAC adapts to complex cache access patterns and optimizes cache usage by improving the utilization of cache sets, unlike traditional cache policies. The algorithm utilizes a 16-way set-associative cache with 2048 sets, incorporating dynamic allocation and fle…

    Submitted 4 February, 2025; originally announced February 2025.

  12. arXiv:2501.06177  [pdf, other]

    cs.ET cs.CY cs.HC

    ScooterLab: A Programmable and Participatory Sensing Research Testbed using Micromobility Vehicles

    Authors: Ubaidullah Khan, Raveen Wijewickrama, Buddhi Ashan M. K., A. H. M. Nazmus Sakib, Khoi Trinh, Christina Duthie, Nima Najafian, Ahmer Patel, R. N. Molina, Anindya Maiti, Sushil K. Prasad, Greg P. Griffin, Murtuza Jadliwala

    Abstract: Micromobility vehicles, such as e-scooters, are increasingly popular in urban communities but present significant challenges in terms of road safety, user privacy, infrastructure planning, and civil engineering. Addressing these critical issues requires a large-scale and easily accessible research infrastructure to collect diverse mobility and contextual data from micromobility users in realistic…

    Submitted 10 January, 2025; originally announced January 2025.

  13. arXiv:2501.05108  [pdf, other]

    cs.CV

    Optimizing Multitask Industrial Processes with Predictive Action Guidance

    Authors: Naval Kishore Mehta, Arvind, Shyam Sunder Prasad, Sumeet Saurav, Sanjay Singh

    Abstract: Monitoring complex assembly processes is critical for maintaining productivity and ensuring compliance with assembly standards. However, variability in human actions and subjective task preferences complicate accurate task anticipation and guidance. To address these challenges, we introduce the Multi-Modal Transformer Fusion and Recurrent Units (MMTFRU) Network for egocentric activity anticipation…

    Submitted 9 January, 2025; originally announced January 2025.

  14. arXiv:2501.02785   

    cs.CV cs.AI cs.LG

    Hybrid deep convolution model for lung cancer detection with transfer learning

    Authors: Sugandha Saxena, S. N. Prasad, Ashwin M Polnaya, Shweta Agarwala

    Abstract: Advances in healthcare research have significantly enhanced our understanding of disease mechanisms, diagnostic precision, and therapeutic options. Yet, lung cancer remains one of the leading causes of cancer-related mortality worldwide due to challenges in early and accurate diagnosis. While current lung cancer detection models show promise, there is considerable potential for further improving t…

    Submitted 5 June, 2025; v1 submitted 6 January, 2025; originally announced January 2025.

    Comments: Authors realized mistake in the model. Also some data was misinterpreted

  15. arXiv:2412.09230  [pdf, other]

    cs.CV cs.AI

    Foundation Models and Adaptive Feature Selection: A Synergistic Approach to Video Question Answering

    Authors: Sai Bhargav Rongali, Mohamad Hassan N C, Ankit Jha, Neha Bhargava, Saurabh Prasad, Biplab Banerjee

    Abstract: This paper tackles the intricate challenge of video question-answering (VideoQA). Despite notable progress, current methods fall short of effectively integrating questions with video frames and semantic object-level abstractions to create question-aware video representations. We introduce Local-Global Question Aware Video Embedding (LGQAVE), which incorporates three major innovations to integrate…

    Submitted 12 December, 2024; originally announced December 2024.

    Journal ref: WACV2025

  16. arXiv:2412.03310  [pdf, other]

    cs.CL cs.PL

    Grounded Language Design for Lightweight Diagramming for Formal Methods

    Authors: Siddhartha Prasad, Ben Greenman, Tim Nelson, Shriram Krishnamurthi

    Abstract: Model finding, as embodied by SAT solvers and similar tools, is used widely, both in embedding settings and as a tool in its own right. For instance, tools like Alloy target SAT to enable users to incrementally define, explore, verify, and diagnose sophisticated specifications for a large number of complex systems. These tools critically include a visualizer that lets users graphically explore t…

    Submitted 4 December, 2024; originally announced December 2024.

    ACM Class: D.3.1; D.2.4; D.3.2

  17. arXiv:2411.16794  [pdf, other]

    cs.CV

    Phase-Informed Tool Segmentation for Manual Small-Incision Cataract Surgery

    Authors: Bhuvan Sachdeva, Naren Akash, Tajamul Ashraf, Simon Mueller, Thomas Schultz, Maximilian W. M. Wintergerst, Niharika Singri Prasad, Kaushik Murali, Mohit Jain

    Abstract: Cataract surgery is the most common surgical procedure globally, with a disproportionately higher burden in developing countries. While automated surgical video analysis has been explored in general surgery, its application to ophthalmic procedures remains limited. Existing works primarily focus on Phaco cataract surgery, an expensive technique not accessible in regions where cataract treatment is…

    Submitted 3 December, 2024; v1 submitted 25 November, 2024; originally announced November 2024.

  18. arXiv:2411.00254  [pdf, other]

    eess.IV cs.CV cs.LG

    A Novel Breast Ultrasound Image Augmentation Method Using Advanced Neural Style Transfer: An Efficient and Explainable Approach

    Authors: Lipismita Panigrahi, Prianka Rani Saha, Jurdana Masuma Iqrah, Sushil Prasad

    Abstract: Clinical diagnosis of breast malignancy (BM) is a challenging problem in the recent era. In particular, Deep learning (DL) models have continued to offer important solutions for early BM diagnosis but their performance experiences overfitting due to the limited volume of breast ultrasound (BUS) image data. Further, large BUS datasets are difficult to manage due to privacy and legal concerns. Hence…

    Submitted 31 October, 2024; originally announced November 2024.

  19. arXiv:2410.19436  [pdf, other]

    eess.SP cs.LG

    On the Application of Deep Learning for Precise Indoor Positioning in 6G

    Authors: Sai Prasanth Kotturi, Anil Kumar Yerrapragada, Sai Prasad, Radha Krishna Ganti

    Abstract: Accurate localization in indoor environments is a challenge due to the Non Line of Sight (NLoS) nature of the signaling. In this paper, we explore the use of AI/ML techniques for positioning accuracy enhancement in Indoor Factory (InF) scenarios. The proposed neural network, which we term LocNet, is trained on measurements such as Channel Impulse Response (CIR) and Reference Signal Received Power…

    Submitted 25 October, 2024; originally announced October 2024.

    Comments: 6 Pages, 6 Figures

  20. arXiv:2410.17361  [pdf, other]

    cs.CR cs.LG cs.NI

    Characterizing Robocalls with Multiple Vantage Points

    Authors: Sathvik Prasad, Aleksandr Nahapetyan, Bradley Reaves

    Abstract: Telephone spam has been among the highest network security concerns for users for many years. In response, industry and government have deployed new technologies and regulations to curb the problem, and academic and industry researchers have provided methods and measurements to characterize robocalls. Have these efforts borne fruit? Are the research characterizations reliable, and have the prevent…

    Submitted 22 October, 2024; originally announced October 2024.

    Comments: Accepted for publication at the 46th IEEE Symposium on Security and Privacy, 2025

  21. arXiv:2410.13295  [pdf, other]

    cs.LG cs.AI cs.CV physics.optics

    PiLocNet: Physics-informed neural network on 3D localization with rotating point spread function

    Authors: Mingda Lu, Zitian Ao, Chao Wang, Sudhakar Prasad, Raymond H. Chan

    Abstract: For the 3D localization problem using point spread function (PSF) engineering, we propose a novel enhancement of our previously introduced localization neural network, LocNet. The improved network is a physics-informed neural network (PINN) that we call PiLocNet. Previous works on the localization problem may be categorized separately into model-based optimization and neural network approaches. Ou…

    Submitted 9 February, 2025; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: 13 pages, 6 figures

  22. arXiv:2409.09244  [pdf, other]

    cs.CV

    Investigation of Hierarchical Spectral Vision Transformer Architecture for Classification of Hyperspectral Imagery

    Authors: Wei Liu, Saurabh Prasad, Melba Crawford

    Abstract: In the past three years, there has been significant interest in hyperspectral imagery (HSI) classification using vision Transformers for analysis of remotely sensed data. Previous research predominantly focused on the empirical integration of convolutional neural networks (CNNs) to augment the network's capability to extract local feature information. Yet, the theoretical justification for vision…

    Submitted 13 September, 2024; originally announced September 2024.

    Comments: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

  23. arXiv:2409.02839  [pdf, other]

    cs.CR cs.CY cs.NI

    Jäger: Automated Telephone Call Traceback

    Authors: David Adei, Varun Madathil, Sathvik Prasad, Bradley Reaves, Alessandra Scafuro

    Abstract: Unsolicited telephone calls that facilitate fraud or unlawful telemarketing continue to overwhelm network users and the regulators who prosecute them. The first step in prosecuting phone abuse is traceback -- identifying the call originator. This fundamental investigative task currently requires hours of manual effort per call. In this paper, we introduce Jäger, a distributed secure call traceback…

    Submitted 17 September, 2024; v1 submitted 4 September, 2024; originally announced September 2024.

    Comments: In Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security (CCS '24), October 14–18, 2024, Salt Lake City, UT, USA. ACM, New York, NY, USA, 24 pages

  24. ContextQ: Generated Questions to Support Meaningful Parent-Child Dialogue While Co-Reading

    Authors: Griffin Dietz Smith, Siddhartha Prasad, Matt J. Davidson, Leah Findlater, R. Benjamin Shapiro

    Abstract: Much of early literacy education happens at home with caretakers reading books to young children. Prior research demonstrates how having dialogue with children during co-reading can develop critical reading readiness skills, but most adult readers are unsure if and how to lead effective conversations. We present ContextQ, a tablet-based reading application to unobtrusively present auto-generated d…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: ACM Interaction Design and Children (IDC) 2024

  25. arXiv:2405.03103  [pdf, other]

    cs.LG cs.CV

    Learning from Students: Applying t-Distributions to Explore Accurate and Efficient Formats for LLMs

    Authors: Jordan Dotzel, Yuzong Chen, Bahaa Kotb, Sushma Prasad, Gang Wu, Sheng Li, Mohamed S. Abdelfattah, Zhiru Zhang

    Abstract: The increasing size of large language models (LLMs) traditionally requires low-precision integer formats to meet strict latency and power demands. Yet recently, alternative formats such as Normal Float (NF4) have increased model accuracy at the cost of increased chip area. In this work, we first conduct a large-scale analysis of LLM weights and activations across 30 networks and conclude that most…

    Submitted 10 June, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

    Comments: Accepted to ICML 2024

  26. arXiv:2405.02664  [pdf, other]

    cs.AI cs.IR

    MedPromptExtract (Medical Data Extraction Tool): Anonymization and Hi-fidelity Automated data extraction using NLP and prompt engineering

    Authors: Roomani Srivastava, Suraj Prasad, Lipika Bhat, Sarvesh Deshpande, Barnali Das, Kshitij Jadhav

    Abstract: Introduction: The labour-intensive nature of data extraction from sources like discharge summaries (DS) poses significant obstacles to the digitisation of medical records particularly for low- and middle-income countries (LMICs). In this paper we present a completely automated method MedPromptExtract to efficiently extract data from DS while maintaining confidentiality. Methods: The source of data…

    Submitted 6 September, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

  27. arXiv:2404.13770  [pdf, other]

    cs.CV cs.AI cs.LG

    EncodeNet: A Framework for Boosting DNN Accuracy with Entropy-driven Generalized Converting Autoencoder

    Authors: Hasanul Mahmud, Kevin Desai, Palden Lama, Sushil K. Prasad

    Abstract: Image classification is a fundamental task in computer vision, and the quest to enhance DNN accuracy without inflating model size or latency remains a pressing concern. We make a couple of advances in this regard, leading to a novel EncodeNet design and training framework. The first advancement involves Converting Autoencoders, a novel approach that transforms images into an easy-to-classify image…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 15 pages

  28. arXiv:2403.15022  [pdf, other]

    cs.LG

    Insights into the Lottery Ticket Hypothesis and Iterative Magnitude Pruning

    Authors: Tausifa Jan Saleem, Ramanjit Ahuja, Surendra Prasad, Brejesh Lall

    Abstract: The lottery ticket hypothesis for deep neural networks emphasizes the importance of the initialization used to re-train the sparser networks obtained using the iterative magnitude pruning process. An explanation for why the specific initialization proposed by the lottery ticket hypothesis tends to work better in terms of generalization (and training) performance has been lacking. Moreover, the underlying…

    Submitted 25 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

  29. arXiv:2403.13135  [pdf, other]

    cs.CV cs.DC cs.LG

    A Parallel Workflow for Polar Sea-Ice Classification using Auto-labeling of Sentinel-2 Imagery

    Authors: Jurdana Masuma Iqrah, Wei Wang, Hongjie Xie, Sushil Prasad

    Abstract: The observation of the advancing and retreating pattern of polar sea ice cover stands as a vital indicator of global warming. This research aims to develop a robust, effective, and scalable system for classifying polar sea ice as thick/snow-covered, young/thin, or open water using Sentinel-2 (S2) images. Since the S2 satellite is actively capturing high-resolution imagery over the earth's surface,…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted in the 25th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC 2024), May 2024. arXiv admin note: substantial text overlap with arXiv:2303.12719

  30. arXiv:2403.07036  [pdf, other]

    cs.LG cs.CV cs.DC

    A Converting Autoencoder Toward Low-latency and Energy-efficient DNN Inference at the Edge

    Authors: Hasanul Mahmud, Peng Kang, Kevin Desai, Palden Lama, Sushil Prasad

    Abstract: Reducing inference time and energy usage while maintaining prediction accuracy has become a significant concern for deep neural networks (DNN) inference on resource-constrained edge devices. To address this problem, we propose a novel approach based on "converting" autoencoder and lightweight DNNs. This improves upon recent work such as early-exiting framework and DNN partitioning. Early-exiting f…

    Submitted 11 March, 2024; originally announced March 2024.

    Comments: 8 Pages, 8 Figures

  31. arXiv:2403.07003  [pdf, other]

    cs.AI cs.CY cs.LG cs.NI

    Evacuation Management Framework towards Smart City-wide Intelligent Emergency Interactive Response System

    Authors: Anuj Abraham, Yi Zhang, Shitala Prasad

    Abstract: A smart city solution toward future 6G network deployment allows small and medium sized enterprises (SMEs), industry, and government entities to connect with the infrastructures and play a crucial role in enhancing emergency preparedness with advanced sensors. The objective of this work is to propose a set of coordinated technological solutions to transform an existing emergency response system in…

    Submitted 7 March, 2024; originally announced March 2024.

  32. arXiv:2403.04931  [pdf, other]

    cs.AI cs.CL cs.HC

    A Survey on Human-AI Teaming with Large Pre-Trained Models

    Authors: Vanshika Vats, Marzia Binta Nizam, Minghao Liu, Ziyuan Wang, Richard Ho, Mohnish Sai Prasad, Vincent Titterton, Sai Venkat Malreddy, Riya Aggarwal, Yanwen Xu, Lei Ding, Jay Mehta, Nathan Grinnell, Li Liu, Sijia Zhong, Devanathan Nallur Gandamani, Xinyi Tang, Rohan Ghosalkar, Celeste Shen, Rachel Shen, Nafisa Hussain, Kesav Ravichandran, James Davis

    Abstract: In the rapidly evolving landscape of artificial intelligence (AI), the collaboration between human intelligence and AI systems, known as Human-AI (HAI) Teaming, has emerged as a cornerstone for advancing problem-solving and decision-making processes. The advent of Large Pre-trained Models (LPtM) has significantly transformed this landscape, offering unprecedented capabilities by leveraging vast am…

    Submitted 26 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  33. arXiv:2403.02909  [pdf, other]

    cs.CV cs.HC eess.IV

    Gaze-Vector Estimation in the Dark with Temporally Encoded Event-driven Neural Networks

    Authors: Abeer Banerjee, Naval K. Mehta, Shyam S. Prasad, Himanshu, Sumeet Saurav, Sanjay Singh

    Abstract: In this paper, we address the intricate challenge of gaze vector prediction, a pivotal task with applications ranging from human-computer interaction to driver monitoring systems. Our innovative approach is designed for the demanding setting of extremely low-light conditions, leveraging a novel temporal event encoding scheme, and a dedicated neural network architecture. The temporal encoding metho…

    Submitted 5 March, 2024; originally announced March 2024.

  34. arXiv:2401.13773  [pdf, other]

    math.OC cs.DM cs.DS

    New Sequence-Independent Lifting Techniques for Cutting Planes and When They Induce Facets

    Authors: Siddharth Prasad, Ellen Vitercik, Maria-Florina Balcan, Tuomas Sandholm

    Abstract: Sequence-independent lifting is a procedure for strengthening valid inequalities of an integer program. We generalize the sequence-independent lifting method of Gu, Nemhauser, and Savelsbergh (GNS lifting) for cover inequalities and correct an error in their proposed generalization. We obtain a new sequence-independent lifting technique -- piecewise-constant (PC) lifting -- with a number of intere…

    Submitted 24 January, 2024; originally announced January 2024.

  35. Conceptual Mutation Testing for Student Programming Misconceptions

    Authors: Siddhartha Prasad, Ben Greenman, Tim Nelson, Shriram Krishnamurthi

    Abstract: Context: Students often misunderstand programming problem descriptions. This can lead them to solve the wrong problem, which creates frustration, obstructs learning, and imperils grades. Researchers have found that students can be made to better understand the problem by writing examples before they start programming. These examples are checked against correct and wrong implementations -- analogou…

    Submitted 28 December, 2023; originally announced January 2024.

    Journal ref: The Art, Science, and Engineering of Programming, 2024, Vol. 8, Issue 2, Article 7

  36. arXiv:2312.14750  [pdf, other]

    cs.AR

    Siracusa: A 16 nm Heterogenous RISC-V SoC for Extended Reality with At-MRAM Neural Engine

    Authors: Arpan Suravi Prasad, Moritz Scherer, Francesco Conti, Davide Rossi, Alfio Di Mauro, Manuel Eggimann, Jorge Tómas Gómez, Ziyun Li, Syed Shakib Sarwar, Zhao Wang, Barbara De Salvo, Luca Benini

    Abstract: Extended reality (XR) applications are Machine Learning (ML)-intensive, featuring deep neural networks (DNNs) with millions of weights, tightly latency-bound (10-20 ms end-to-end), and power-constrained (low tens of mW average power). While ML performance and efficiency can be achieved by introducing neural engines within low-power systems-on-chip (SoCs), system-level power for nontrivial DNNs dep…

    Submitted 14 April, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

    Comments: Final accepted manuscript pre-print submitted to the IEEE Journal of Solid-State Circuits

  37. arXiv:2312.14199  [pdf, other]

    cs.CR

    Report on 2023 CyberTraining PI Meeting, 26-27 September 2023

    Authors: Geoffrey Fox, Mary P Thomas, Sajal Bhatia, Marisa Brazil, Nicole M Gasparini, Venkatesh Mohan Merwade, Henry J. Neeman, Jeff Carver, Henri Casanova, Vipin Chaudhary, Dirk Colbry, Lonnie Crosby, Prasun Dewan, Jessica Eisma, Nicole M Gasparini, Ahmed Irfan, Kate Kaehey, Qianqian Liu, Zhen Ni, Sushil Prasad, Apan Qasem, Erik Saule, Prabha Sundaravadivel, Karen Tomko

    Abstract: This document describes a two-day meeting held for the Principal Investigators (PIs) of NSF CyberTraining grants. The report covers invited talks, panels, and six breakout sessions. The meeting involved over 80 PIs and NSF program managers (PMs). The lessons recorded in detail in the report are a wealth of information that could help current and future PIs, as well as NSF PMs, understand the futur…

    Submitted 28 December, 2023; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: 38 pages, 3 main sections and 2 Appendix sections, 2 figures, 19 tables; updated version: author corrections

  38. arXiv:2311.13878  [pdf, other]

    cs.CL cs.AI

    Minimizing Factual Inconsistency and Hallucination in Large Language Models

    Authors: Muneeswaran I, Shreya Saxena, Siva Prasad, M V Sai Prakash, Advaith Shankar, Varun V, Vishal Vaddina, Saisubramaniam Gopalakrishnan

    Abstract: Large Language Models (LLMs) are widely used in critical fields such as healthcare, education, and finance due to their remarkable proficiency in various language-related tasks. However, LLMs are prone to generating factually incorrect responses or "hallucinations," which can lead to a loss of credibility and trust among users. To address this issue, we propose a multi-stage framework that generat…

    Submitted 23 November, 2023; originally announced November 2023.

  39. arXiv:2309.15782  [pdf]

    cs.CV

    Joint-YODNet: A Light-weight Object Detector for UAVs to Achieve Above 100fps

    Authors: Vipin Gautam, Shitala Prasad, Sharad Sinha

    Abstract: Small object detection via UAV (Unmanned Aerial Vehicle) images captured from drones and radar is a complex task with several formidable challenges. This domain encompasses numerous complexities that impede the accurate detection and localization of small objects. To address these challenges, we propose a novel method called JointYODNet for UAVs to detect small objects, leveraging a joint loss fun…

    Submitted 27 September, 2023; originally announced September 2023.

  40. arXiv:2309.15780  [pdf]

    cs.CV

    AaP-ReID: Improved Attention-Aware Person Re-identification

    Authors: Vipin Gautam, Shitala Prasad, Sharad Sinha

    Abstract: Person re-identification (ReID) is a well-known problem in the field of computer vision. The primary objective is to identify a specific individual within a gallery of images. However, this task is challenging due to various factors, such as pose variations, illumination changes, obstructions, and the presence of confusing backgrounds. Existing ReID methods often fail to capture discriminative feat…

    Submitted 27 September, 2023; originally announced September 2023.

  41. arXiv:2309.13387  [pdf]

    cs.CV

    YOLORe-IDNet: An Efficient Multi-Camera System for Person-Tracking

    Authors: Vipin Gautam, Shitala Prasad, Sharad Sinha

    Abstract: The growing need for video surveillance in public spaces has created a demand for systems that can track individuals across multiple camera feeds in real time. While existing tracking systems have achieved impressive performance using deep learning models, they often rely on pre-existing images of suspects or historical data. However, this is not always feasible in cases where suspicious individu…

    Submitted 23 September, 2023; originally announced September 2023.

  42. arXiv:2309.00940  [pdf, other]

    cs.MA cs.AI cs.GT cs.IR

    Content Prompting: Modeling Content Provider Dynamics to Improve User Welfare in Recommender Ecosystems

    Authors: Siddharth Prasad, Martin Mladenov, Craig Boutilier

    Abstract: Users derive value from a recommender system (RS) only to the extent that it is able to surface content (or items) that meet their needs/preferences. While RSs often have a comprehensive view of user preferences across the entire user base, content providers, by contrast, generally have only a local view of the preferences of users that have interacted with their content. This limits a provider's…

    Submitted 2 September, 2023; originally announced September 2023.

  43. arXiv:2307.02740  [pdf, other]

    cs.IR cs.CL

    Dense Retrieval Adaptation using Target Domain Description

    Authors: Helia Hashemi, Yong Zhuang, Sachith Sri Ram Kothur, Srivas Prasad, Edgar Meij, W. Bruce Croft

    Abstract: In information retrieval (IR), domain adaptation is the process of adapting a retrieval model to a new domain whose data distribution is different from the source domain. Existing methods in this area focus on unsupervised domain adaptation where they have access to the target document collection or supervised (often few-shot) domain adaptation where they additionally have access to (limited) labe…

    Submitted 5 July, 2023; originally announced July 2023.

  44. arXiv:2306.05376  [pdf, other]

    cs.CV cs.LG

    Anomaly Detection in Satellite Videos using Diffusion Models

    Authors: Akash Awasthi, Son Ly, Jaer Nizam, Samira Zare, Videet Mehta, Safwan Ahmed, Keshav Shah, Ramakrishna Nemani, Saurabh Prasad, Hien Van Nguyen

    Abstract: The definition of anomaly detection is the identification of an unexpected event. Real-time detection of extreme events such as wildfires, cyclones, or floods using satellite data has become crucial for disaster management. Although several earth-observing satellites provide information about disasters, satellites in the geostationary orbit provide data at intervals as frequent as every minute, ef…

    Submitted 25 May, 2023; originally announced June 2023.

  45. arXiv:2303.12719  [pdf, other]

    cs.CV cs.LG eess.IV

    Toward Polar Sea-Ice Classification using Color-based Segmentation and Auto-labeling of Sentinel-2 Imagery to Train an Efficient Deep Learning Model

    Authors: Jurdana Masuma Iqrah, Younghyun Koo, Wei Wang, Hongjie Xie, Sushil Prasad

    Abstract: Global warming is an urgent issue that is generating catastrophic environmental changes, such as the melting of sea ice and glaciers, particularly in the polar regions. The melting pattern and retreat of polar sea ice cover is an essential indicator of global warming. The Sentinel-2 satellite (S2) captures high-resolution optical imagery over the polar regions. This research aims at developing a r…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: 2nd Annual AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE), February 2023

  46. arXiv:2302.14234  [pdf, other]

    cs.GT econ.TH

    Bicriteria Multidimensional Mechanism Design with Side Information

    Authors: Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm

    Abstract: We develop a versatile methodology for multidimensional mechanism design that incorporates side information about agents to generate high welfare and high revenue simultaneously. Side information sources include advice from domain experts, predictions from machine learning models, and even the mechanism designer's gut instinct. We design a tunable mechanism that integrates side information with an…

    Submitted 9 October, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: An early version of this paper appeared in NeurIPS 2023

  47. Large-Scale Knowledge Synthesis and Complex Information Retrieval from Biomedical Documents

    Authors: Shreya Saxena, Raj Sangani, Siva Prasad, Shubham Kumar, Mihir Athale, Rohan Awhad, Vishal Vaddina

    Abstract: Recent advances in the healthcare industry have led to an abundance of unstructured data, making it challenging to perform tasks such as efficient and accurate information retrieval at scale. Our work offers an all-in-one scalable solution for extracting and exploring complex information from large-scale research documents, which would otherwise be tedious. First, we briefly explain our knowledge…

    Submitted 14 February, 2023; originally announced February 2023.

  48. arXiv:2208.09901  [pdf]

    cs.DC

    Scalable mRMR feature selection to handle high dimensional datasets: Vertical partitioning based Iterative MapReduce framework

    Authors: Yelleti Vivek, P. S. V. S. Sai Prasad

    Abstract: While building machine learning models, feature selection (FS) stands out as an essential preprocessing step used to handle the uncertainty and vagueness in the data. Recently, the minimum Redundancy and Maximum Relevance (mRMR) approach has proven to be effective in obtaining the irredundant feature subset. Owing to the generation of voluminous datasets, it is essential to design scalable solutio…

    Submitted 24 July, 2024; v1 submitted 21 August, 2022; originally announced August 2022.

    Comments: 20 pages, 3 Figures, 5 Tables

  49. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  50. Private Eye: On the Limits of Textual Screen Peeking via Eyeglass Reflections in Video Conferencing

    Authors: Yan Long, Chen Yan, Shilin Xiao, Shivan Prasad, Wenyuan Xu, Kevin Fu

    Abstract: Using mathematical modeling and human subjects experiments, this research explores the extent to which emerging webcams might leak recognizable textual and graphical information gleaming from eyeglass reflections captured by webcams. The primary goal of our work is to measure, compute, and predict the factors, limits, and thresholds of recognizability as webcam technology evolves in the future. Ou…

    Submitted 16 January, 2023; v1 submitted 8 May, 2022; originally announced May 2022.

    Journal ref: 2023 IEEE Symposium on Security and Privacy