Showing 1–30 of 30 results for author: Sahoo, A

Searching in archive cs.
  1. arXiv:2506.08002  [pdf, ps, other]

    cs.CV

    Aligning Text, Images, and 3D Structure Token-by-Token

    Authors: Aadarsh Sahoo, Vansh Tibrewal, Georgia Gkioxari

    Abstract: Creating machines capable of understanding the world in 3D is essential in assisting designers that build and edit 3D environments and robots navigating and interacting within a three-dimensional space. Inspired by advances in language and image modeling, we investigate the potential of autoregressive models for a new modality: structured 3D scenes. To this end, we propose a unified LLM framework…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: Project webpage: https://glab-caltech.github.io/kyvo/

  2. arXiv:2506.02945  [pdf, ps, other]

    cs.CL cs.LG

    Quantitative LLM Judges

    Authors: Aishwarya Sahoo, Jeevana Kruthi Karnuthala, Tushar Parmanand Budhwani, Pranchal Agarwal, Sankaran Vaidyanathan, Alexa Siu, Franck Dernoncourt, Jennifer Healey, Nedim Lipka, Ryan Rossi, Uttaran Bhattacharya, Branislav Kveton

    Abstract: LLM-as-a-judge is a framework in which a large language model (LLM) automatically evaluates the output of another LLM. We propose quantitative LLM judges, which align evaluation scores of existing LLM judges to human scores in a given domain using regression models. The models are trained to improve the score of the original judge by using the judge's textual evaluation and score. We present four…

    Submitted 3 June, 2025; originally announced June 2025.

  3. arXiv:2505.24763  [pdf, ps, other]

    eess.SP cs.NI

    Detecting Airborne Objects with 5G NR Radars

    Authors: Steve Blandino, Nada Golmie, Anirudha Sahoo, Thao Nguyen, Tanguy Ropitault, David Griffith, Amala Sonny

    Abstract: The integration of sensing capabilities into 5G New Radio (5G NR) networks offers an opportunity to enable the detection of airborne objects without the need for dedicated radars. This paper investigates the feasibility of using standardized Positioning Reference Signals (PRS) to detect UAVs in Urban Micro (UMi) and Urban Macro (UMa) propagation environments. A full 5G NR radar processing chain is…

    Submitted 30 May, 2025; originally announced May 2025.

  4. arXiv:2505.05755  [pdf, other]

    cs.CL cs.LG

    Insertion Language Models: Sequence Generation with Arbitrary-Position Insertions

    Authors: Dhruvesh Patel, Aishwarya Sahoo, Avinash Amballa, Tahira Naseem, Tim G. J. Rudner, Andrew McCallum

    Abstract: Autoregressive models (ARMs), which predict subsequent tokens one-by-one "from left to right," have achieved significant success across a wide range of sequence generation tasks. However, they struggle to accurately represent sequences that require satisfying sophisticated constraints or whose sequential dependencies are better addressed by out-of-order generation. Masked Diffusion Models (MDMs)…

    Submitted 15 May, 2025; v1 submitted 8 May, 2025; originally announced May 2025.

    Comments: Corrected a typo in author names

  5. arXiv:2503.05718  [pdf, other]

    cs.CY cs.CE cs.DC cs.LG

    zScore: A Universal Decentralised Reputation System for the Blockchain Economy

    Authors: Himanshu Udupi, Ashutosh Sahoo, Akshay S. P., Gurukiran S., Parag Paul, Petrus C. Martens

    Abstract: Modern society functions on trust. The onchain economy, however, is built on the founding principles of trustless peer-to-peer interactions in an adversarial environment without a centralised body of trust and needs a verifiable system to quantify credibility to minimise bad economic activity. We provide a robust framework titled zScore, a core primitive for reputation derived from a wallet's onch…

    Submitted 17 February, 2025; originally announced March 2025.

    ACM Class: K.4.4; I.2.11; C.2.4; K.4.2; H.3.5

  6. arXiv:2502.03086  [pdf, other]

    cs.ET cs.AI cs.LG cs.NE quant-ph

    Implementing Large Quantum Boltzmann Machines as Generative AI Models for Dataset Balancing

    Authors: Salvatore Sinno, Markus Bertl, Arati Sahoo, Bhavika Bhalgamiya, Thomas Groß, Nicholas Chancellor

    Abstract: This study explores the implementation of large Quantum Restricted Boltzmann Machines (QRBMs), a key advancement in Quantum Machine Learning (QML), as generative models on D-Wave's Pegasus quantum hardware to address dataset imbalance in Intrusion Detection Systems (IDS). By leveraging Pegasus's enhanced connectivity and computational capabilities, a QRBM with 120 visible and 120 hidden units was…

    Submitted 5 February, 2025; originally announced February 2025.

    Comments: Accepted at IEEE International Conference on Next Generation Information System Engineering

  7. arXiv:2501.11538  [pdf, other]

    cs.LG

    DenoMAE: A Multimodal Autoencoder for Denoising Modulation Signals

    Authors: Atik Faysal, Taha Boushine, Mohammad Rostami, Reihaneh Gh. Roshan, Huaxia Wang, Nikhil Muralidhar, Avimanyu Sahoo, Yu-Dong Yao

    Abstract: We propose Denoising Masked Autoencoder (Deno-MAE), a novel multimodal autoencoder framework for denoising modulation signals during pretraining. DenoMAE extends the concept of masked autoencoders by incorporating multiple input modalities, including noise as an explicit modality, to enhance cross-modal learning and improve denoising performance. The network is pre-trained using unlabeled noisy mo…

    Submitted 20 January, 2025; originally announced January 2025.

  8. arXiv:2501.09051  [pdf, other]

    cs.CV cs.AI

    Polyp detection in colonoscopy images using YOLOv11

    Authors: Alok Ranjan Sahoo, Satya Sangram Sahoo, Pavan Chakraborty

    Abstract: Colorectal cancer (CRC) is one of the most commonly diagnosed cancers all over the world. It starts as a polyp in the inner lining of the colon. To prevent CRC, early polyp detection is required. Colonoscopy is used for the inspection of the colon. Generally, the images taken by the camera placed at the tip of the endoscope are analyzed by the experts manually. Various traditional machine learning…

    Submitted 15 January, 2025; originally announced January 2025.

  9. arXiv:2412.10529  [pdf, other]

    cs.LG cs.CL

    Solving the Inverse Alignment Problem for Efficient RLHF

    Authors: Shambhavi Krishna, Aishwarya Sahoo

    Abstract: Collecting high-quality preference datasets for reinforcement learning from human feedback (RLHF) is resource-intensive and challenging. As a result, researchers often train reward models on extensive offline datasets which aggregate diverse generation sources and scoring/alignment policies. We hypothesize that this aggregation has an averaging effect on reward model scores, which limits signal an…

    Submitted 13 December, 2024; originally announced December 2024.

  10. arXiv:2411.06263  [pdf, other]

    cs.LG cs.AI cs.CR

    Federated Split Learning for Human Activity Recognition with Differential Privacy

    Authors: Josue Ndeko, Shaba Shaon, Aubrey Beal, Avimanyu Sahoo, Dinh C. Nguyen

    Abstract: This paper proposes a novel intelligent human activity recognition (HAR) framework based on a new design of Federated Split Learning (FSL) with Differential Privacy (DP) over edge networks. Our FSL-DP framework leverages both accelerometer and gyroscope data, achieving significant improvements in HAR accuracy. The evaluation includes a detailed comparison between traditional Federated Learning (FL…

    Submitted 9 November, 2024; originally announced November 2024.

    Comments: Accepted to IEEE Consumer Communications and Networking Conference (CCNC), 6 pages

  11. arXiv:2403.19825  [pdf, other]

    cs.NI

    Sensing Performance of the IEEE 802.11bf Protocol and Its Impact on Data Communication

    Authors: Anirudha Sahoo, Tanguy Ropitault, Steve Blandino, Nada Golmie

    Abstract: Wi-Fi sensing has been used to detect and track movements in an environment, resulting in the emergence of several innovative applications. Wi-Fi sensing can detect movement and locate objects by analyzing variations in the Wi-Fi signal due to its interaction with moving objects. Until recently, Wi-Fi sensing has been primarily available through proprietary solutions, which has limited its adoptio…

    Submitted 31 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  12. arXiv:2403.18456  [pdf, other]

    cs.RO

    Inverse kinematics learning of a continuum manipulator using limited real time data

    Authors: Alok Ranjan Sahoo, Pavan Chakraborty

    Abstract: Data driven control of a continuum manipulator requires a lot of data for training but generating sufficient amount of real time data is not cost efficient. Random actuation of the manipulator can also be unsafe sometimes. Meta learning has been used successfully to adapt to a new environment. Hence, this paper tries to solve the above mentioned problem using meta learning. We consider two cases f…

    Submitted 27 March, 2024; originally announced March 2024.

  13. arXiv:2403.09819  [pdf, other]

    cs.NI

    An Admission Control Algorithm for Isochronous and Asynchronous Traffic in IEEE 802.11ad MAC

    Authors: Anirudha Sahoo

    Abstract: Due to availability of large amount of bandwidth in the 60 GHz band and support of contention-free channel access called Service Period (SP), the IEEE 802.11ad/ay Wi-Fi standard is well suited for low latency and high data rate applications. IEEE 802.11ad supports two types of SP user traffic: isochronous and asynchronous. These user traffic need guaranteed SP duration before their respective dead…

    Submitted 14 March, 2024; originally announced March 2024.

  14. arXiv:2402.18599  [pdf, other]

    cs.LG cs.AI

    Meta-Task: A Method-Agnostic Framework for Learning to Regularize in Few-Shot Learning

    Authors: Mohammad Rostami, Atik Faysal, Huaxia Wang, Avimanyu Sahoo

    Abstract: Overfitting is a significant challenge in Few-Shot Learning (FSL), where models trained on small, variable datasets tend to memorize rather than generalize to unseen tasks. Regularization is crucial in FSL to prevent overfitting and enhance generalization performance. To address this issue, we introduce Meta-Task, a novel, method-agnostic framework that leverages both labeled and unlabeled data to…

    Submitted 26 February, 2025; v1 submitted 27 February, 2024; originally announced February 2024.

  15. arXiv:2402.10026  [pdf, other]

    eess.IV cs.CV

    Hybrid CNN Bi-LSTM neural network for Hyperspectral image classification

    Authors: Alok Ranjan Sahoo, Pavan Chakraborty

    Abstract: Hyper spectral images have drawn the attention of the researchers for its complexity to classify. It has nonlinear relation between the materials and the spectral information provided by the HSI image. Deep learning methods have shown superiority in learning this nonlinearity in comparison to traditional machine learning methods. Use of 3-D CNN along with 2-D CNN have shown great success for learn…

    Submitted 15 February, 2024; originally announced February 2024.

  16. arXiv:2401.12671  [pdf, other]

    cs.CL

    Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context

    Authors: Somnath Banerjee, Amruit Sahoo, Sayan Layek, Avik Dutta, Rima Hazra, Animesh Mukherjee

    Abstract: In the continuously advancing AI landscape, crafting context-rich and meaningful responses via Large Language Models (LLMs) is essential. Researchers are becoming more aware of the challenges that LLMs with fewer parameters encounter when trying to provide suitable answers to open-ended questions. To address these hurdles, the integration of cutting-edge strategies, augmentation of rich external d…

    Submitted 15 October, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

    Comments: Accepted at EMNLP 2024

  17. arXiv:2312.05626  [pdf, other]

    cs.SE cs.AI

    Redefining Developer Assistance: Through Large Language Models in Software Ecosystem

    Authors: Somnath Banerjee, Avik Dutta, Sayan Layek, Amruit Sahoo, Sam Conrad Joyce, Rima Hazra

    Abstract: In this paper, we delve into the advancement of domain-specific Large Language Models (LLMs) with a focus on their application in software development. We introduce DevAssistLlama, a model developed through instruction tuning, to assist developers in processing software-related natural language queries. This model, a variant of instruction tuned LLM, is particularly adept at handling intricate tec…

    Submitted 15 March, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

    Comments: Under review

  18. arXiv:2310.16314  [pdf, other]

    cs.LG

    Understanding Code Semantics: An Evaluation of Transformer Models in Summarization

    Authors: Debanjan Mondal, Abhilasha Lodha, Ankita Sahoo, Beena Kumari

    Abstract: This paper delves into the intricacies of code summarization using advanced transformer-based language models. Through empirical studies, we evaluate the efficacy of code summarization by altering function and variable names to explore whether models truly understand code semantics or merely rely on textual cues. We have also introduced adversaries like dead code and commented code across three pr…

    Submitted 26 October, 2023; v1 submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted at GenBench, EMNLP 2023. All authors are co-first authors and have equal contributions

  19. arXiv:2310.14239  [pdf, other]

    cs.CV cs.LG

    Guidance system for Visually Impaired Persons using Deep Learning and Optical flow

    Authors: Shwetang Dubey, Alok Ranjan Sahoo, Pavan Chakraborty

    Abstract: Visually impaired persons find it difficult to know about their surroundings while walking on a road. Walking sticks used by them can only give them information about the obstacles in the stick's proximity. Moreover, it is mostly effective in static or very slow-paced environments. Hence, this paper introduces a method to guide them in a busy street. To create such a system it is very important to…

    Submitted 22 October, 2023; originally announced October 2023.

  20. arXiv:2310.13085  [pdf, other]

    cs.LG cs.AI

    Unsupervised Representation Learning to Aid Semi-Supervised Meta Learning

    Authors: Atik Faysal, Mohammad Rostami, Huaxia Wang, Avimanyu Sahoo, Ryan Antle

    Abstract: Few-shot learning or meta-learning leverages the data scarcity problem in machine learning. Traditionally, training data requires a multitude of samples and labeling for supervised learning. To address this issue, we propose a one-shot unsupervised meta-learning to learn the latent representation of the training samples. We use augmented samples as the query set during the training phase of the un…

    Submitted 19 October, 2023; originally announced October 2023.

  21. arXiv:2309.05035  [pdf, other]

    cs.IR cs.SE cs.SI

    Duplicate Question Retrieval and Confirmation Time Prediction in Software Communities

    Authors: Rima Hazra, Debanjan Saha, Amruit Sahoo, Somnath Banerjee, Animesh Mukherjee

    Abstract: Community Question Answering (CQA) in different domains is growing at a large scale because of the availability of several platforms and huge shareable information among users. With the rapid growth of such online platforms, a massive amount of archived data makes it difficult for moderators to retrieve possible duplicates for a new question and identify and confirm existing question pairs as dupl…

    Submitted 5 March, 2024; v1 submitted 10 September, 2023; originally announced September 2023.

    Comments: Full paper accepted at ASONAM 2023: The 2023 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining

  22. arXiv:2211.05829  [pdf, other]

    cs.CY cs.LG

    A Machine Learning system to monitor student progress in educational institutes

    Authors: Bibhuprasad Mahakud, Bibhuti Parida, Ipsit Panda, Souvik Maity, Arpita Sahoo, Reeta Sharma

    Abstract: In order to track and comprehend the academic achievement of students, both private and public educational institutions devote a significant amount of resources and labour. One of the difficult issues that institutes deal with on a regular basis is understanding the exam shortcomings of students. The performance of a student is influenced by a variety of factors, including attendance, attentivenes…

    Submitted 2 November, 2022; originally announced November 2022.

    Comments: 9 pages, 7 figures

  23. arXiv:2110.15128  [pdf, other]

    cs.CV

    Contrast and Mix: Temporal Contrastive Video Domain Adaptation with Background Mixing

    Authors: Aadarsh Sahoo, Rutav Shah, Rameswar Panda, Kate Saenko, Abir Das

    Abstract: Unsupervised domain adaptation which aims to adapt models trained on a labeled source domain to a completely unlabeled target domain has attracted much attention in recent years. While many domain adaptation techniques have been proposed for images, the problem of unsupervised domain adaptation in videos remains largely underexplored. In this paper, we introduce Contrast and Mix (CoMix), a new con…

    Submitted 28 October, 2021; originally announced October 2021.

    Comments: Accepted to NeurIPS 2021. Project page: https://cvir.github.io/projects/comix

  24. arXiv:2012.03358  [pdf, other]

    cs.CV

    Select, Label, and Mix: Learning Discriminative Invariant Feature Representations for Partial Domain Adaptation

    Authors: Aadarsh Sahoo, Rameswar Panda, Rogerio Feris, Kate Saenko, Abir Das

    Abstract: Partial domain adaptation which assumes that the unknown target label space is a subset of the source label space has attracted much attention in computer vision. Despite recent progress, existing methods often suffer from three key problems: negative transfer, lack of discriminability, and domain invariance in the latent space. To alleviate the above issues, we develop a novel 'Select, Label, and…

    Submitted 3 January, 2023; v1 submitted 6 December, 2020; originally announced December 2020.

    Comments: Accepted to WACV 2023. Project page: https://cvir.github.io/projects/slm.html

  25. arXiv:2008.05524  [pdf, other]

    cs.CV

    Mitigating Dataset Imbalance via Joint Generation and Classification

    Authors: Aadarsh Sahoo, Ankit Singh, Rameswar Panda, Rogerio Feris, Abir Das

    Abstract: Supervised deep learning methods are enjoying enormous success in many practical applications of computer vision and have the potential to revolutionize robotics. However, the marked performance degradation to biases and imbalanced data questions the reliability of these methods. In this work we address these questions from the perspective of dataset imbalance resulting out of severe under-represe…

    Submitted 12 August, 2020; originally announced August 2020.

    Comments: Accepted in ECCV2020 Workshop on Imbalance Problems in Computer Vision (IPCV)

  26. Retrofitting Parallelism onto OCaml

    Authors: KC Sivaramakrishnan, Stephen Dolan, Leo White, Sadiq Jaffer, Tom Kelly, Anmol Sahoo, Sudha Parimala, Atul Dhiman, Anil Madhavapeddy

    Abstract: OCaml is an industrial-strength, multi-paradigm programming language, widely used in industry and academia. OCaml is also one of the few modern managed system programming languages to lack support for shared memory parallel programming. This paper describes the design, a full-fledged implementation and evaluation of a mostly-concurrent garbage collector (GC) for the multicore extension of the OCam…

    Submitted 2 July, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

    Comments: Accepted to ICFP 2020

    ACM Class: D.3.4

  27. arXiv:1901.01153  [pdf, other]

    cs.CV

    Demystifying Multi-Faceted Video Summarization: Tradeoff Between Diversity, Representation, Coverage and Importance

    Authors: Vishal Kaushal, Rishabh Iyer, Khoshrav Doctor, Anurag Sahoo, Pratik Dubal, Suraj Kothawade, Rohan Mahadev, Kunal Dargan, Ganesh Ramakrishnan

    Abstract: This paper addresses automatic summarization of videos in a unified manner. In particular, we propose a framework for multi-faceted summarization for extractive, query base and entity summarization (summarization at the level of entities like objects, scenes, humans and faces in the video). We investigate several summarization models which capture notions of diversity, coverage, representation and…

    Submitted 3 January, 2019; originally announced January 2019.

    Comments: Accepted to WACV 2019. arXiv admin note: substantial text overlap with arXiv:1704.01466, arXiv:1809.08846

  28. arXiv:1805.11191  [pdf, other]

    cs.CV cs.LG stat.ML

    Learning From Less Data: Diversified Subset Selection and Active Learning in Image Classification Tasks

    Authors: Vishal Kaushal, Anurag Sahoo, Khoshrav Doctor, Narasimha Raju, Suyash Shetty, Pankaj Singh, Rishabh Iyer, Ganesh Ramakrishnan

    Abstract: Supervised machine learning based state-of-the-art computer vision techniques are in general data hungry and pose the challenges of not having adequate computing resources and of high costs involved in human labeling efforts. Training data subset selection and active learning techniques have been proposed as possible solutions to these challenges respectively. A special class of subset selection f…

    Submitted 28 May, 2018; originally announced May 2018.

    Comments: 15 pages, 7 figures

  29. arXiv:1704.01466  [pdf, other]

    cs.CV cs.DM

    A Unified Multi-Faceted Video Summarization System

    Authors: Anurag Sahoo, Vishal Kaushal, Khoshrav Doctor, Suyash Shetty, Rishabh Iyer, Ganesh Ramakrishnan

    Abstract: This paper addresses automatic summarization and search in visual data comprising of videos, live streams and image collections in a unified manner. In particular, we propose a framework for multi-faceted summarization which extracts key-frames (image summaries), skims (video summaries) and entity summaries (summarization at the level of entities like objects, scenes, humans and faces in the video…

    Submitted 4 April, 2017; originally announced April 2017.

    Comments: 18 pages, 11 Figures

  30. arXiv:1401.0875  [pdf]

    cs.NI

    Determining the Possibilities and Certainties in Network Participation for MANETS

    Authors: Anoop J. Sahoo, Md. Amir Khusru Akhtar

    Abstract: A mobile ad hoc network is a self organized cooperative network that works without any permanent infrastructure. This infrastructure less design makes it complex compared to other wireless networks. Lot of attacks and misbehavior obstruct the growth and implementation. The majority of attacks and misbehavior can be handled by existing protocols. But these protocols reduce the total strength of nod…

    Submitted 5 January, 2014; originally announced January 2014.

    Comments: 10 pages. International Journal of Computer Engineering and Applications, 2013