Showing 1–50 of 83 results for author: Mehta, N

Searching in archive cs.
  1. arXiv:2507.01835  [pdf, ps, other]

    cs.CV

    Modulate and Reconstruct: Learning Hyperspectral Imaging from Misaligned Smartphone Views

    Authors: Daniil Reutsky, Daniil Vladimirov, Yasin Mamedov, Georgy Perevozchikov, Nancy Mehta, Egor Ershov, Radu Timofte

    Abstract: Hyperspectral reconstruction (HSR) from RGB images is a fundamentally ill-posed problem due to severe spectral information loss. Existing approaches typically rely on a single RGB image, limiting reconstruction accuracy. In this work, we propose a novel multi-image-to-hyperspectral reconstruction (MI-HSR) framework that leverages a triple-camera smartphone system, where two lenses are equipped wit…

    Submitted 2 July, 2025; originally announced July 2025.

  2. arXiv:2507.00743  [pdf, ps, other]

    eess.IV cs.CV eess.SP

    Tunable Wavelet Unit based Convolutional Neural Network in Optical Coherence Tomography Analysis Enhancement for Classifying Type of Epiretinal Membrane Surgery

    Authors: An Le, Nehal Mehta, William Freeman, Ines Nagel, Melanie Tran, Anna Heinke, Akshay Agnihotri, Lingyun Cheng, Dirk-Uwe Bartsch, Hung Nguyen, Truong Nguyen, Cheolhong An

    Abstract: In this study, we developed a deep learning-based method to classify the type of surgery performed for epiretinal membrane (ERM) removal, either internal limiting membrane (ILM) removal or ERM-alone removal. Our model, based on the ResNet18 convolutional neural network (CNN) architecture, utilizes postoperative optical coherence tomography (OCT) center scans as inputs. We evaluated the model using b…

    Submitted 1 July, 2025; originally announced July 2025.

  3. arXiv:2505.23093  [pdf, ps, other]

    cs.CV

    LeMoRe: Learn More Details for Lightweight Semantic Segmentation

    Authors: Mian Muhammad Naeem Abid, Nancy Mehta, Zongwei Wu, Radu Timofte

    Abstract: Lightweight semantic segmentation is essential for many downstream vision tasks. Unfortunately, existing methods often struggle to balance efficiency and performance due to the complexity of feature modeling. Many of these existing approaches are constrained by rigid architectures and implicit representation learning, often characterized by parameter-heavy designs and a reliance on computationally…

    Submitted 29 May, 2025; originally announced May 2025.

    Comments: Accepted at IEEE ICIP 2025

  4. arXiv:2504.10752  [pdf, other]

    cs.LG q-bio.NC

    Time-varying EEG spectral power predicts evoked and spontaneous fMRI motor brain activity

    Authors: Neil Mehta, Ines Goncalves, Alberto Montagna, Mathis Fleury, Gustavo Caetano, Ines Esteves, Athanasios Vourvopoulos, Pulkit Grover, Patricia Figueiredo

    Abstract: Simultaneous EEG-fMRI recordings are increasingly used to investigate brain activity by leveraging the complementary high spatial and high temporal resolution of fMRI and EEG signals respectively. It remains unclear, however, to what degree these two imaging modalities capture shared information about neural activity. Here, we investigate whether it is possible to predict both task-evoked and spon…

    Submitted 14 April, 2025; originally announced April 2025.

  5. arXiv:2504.07400  [pdf, other]

    cs.CL

    Talking Point based Ideological Discourse Analysis in News Events

    Authors: Nishanth Nakshatri, Nikhil Mehta, Siyi Liu, Sihao Chen, Daniel J. Hopkins, Dan Roth, Dan Goldwasser

    Abstract: Analyzing ideological discourse even in the age of LLMs remains a challenge, as these models often struggle to capture the key elements that shape real-world narratives. Specifically, LLMs fail to focus on characteristic elements driving dominant discourses and lack the ability to integrate contextual information required for understanding abstract ideological views. To address these limitations,…

    Submitted 9 April, 2025; originally announced April 2025.

  6. arXiv:2504.06667  [pdf, other]

    cs.IR cs.AI

    Toward Holistic Evaluation of Recommender Systems Powered by Generative Models

    Authors: Yashar Deldjoo, Nikhil Mehta, Maheswaran Sathiamoorthy, Shuai Zhang, Pablo Castells, Julian McAuley

    Abstract: Recommender systems powered by generative models (Gen-RecSys) extend beyond classical item ranking by producing open-ended content, which simultaneously unlocks richer user experiences and introduces new risks. On one hand, these systems can enhance personalization and appeal through dynamic explanations and multi-turn dialogues. On the other hand, they might venture into unknown territory-halluci…

    Submitted 9 April, 2025; originally announced April 2025.

  7. arXiv:2503.11781  [pdf, other]

    cs.CV

    Color Matching Using Hypernetwork-Based Kolmogorov-Arnold Networks

    Authors: Artem Nikonorov, Georgy Perevozchikov, Andrei Korepanov, Nancy Mehta, Mahmoud Afifi, Egor Ershov, Radu Timofte

    Abstract: We present cmKAN, a versatile framework for color matching. Given an input image with colors from a source color distribution, our method effectively and accurately maps these colors to match a target color distribution in both supervised and unsupervised settings. Our framework leverages the spline capabilities of Kolmogorov-Arnold Networks (KANs) to model the color matching between source and ta…

    Submitted 14 March, 2025; originally announced March 2025.

  8. arXiv:2503.07748  [pdf, other]

    eess.IV cs.CV

    AdaptSR: Low-Rank Adaptation for Efficient and Scalable Real-World Super-Resolution

    Authors: Cansu Korkmaz, Nancy Mehta, Radu Timofte

    Abstract: Recovering high-frequency details and textures from low-resolution images remains a fundamental challenge in super-resolution (SR), especially when real-world degradations are complex and unknown. While GAN-based methods enhance realism, they suffer from training instability and introduce unnatural artifacts. Diffusion models, though promising, demand excessive computational resources, often requi…

    Submitted 10 March, 2025; originally announced March 2025.

    Comments: 11 pages including 3 pages of references, 7 figures and 7 tables

  9. arXiv:2502.11483  [pdf, other]

    cs.LG cs.GT stat.ML

    No-regret incentive-compatible online learning under exact truthfulness with non-myopic experts

    Authors: Junpei Komiyama, Nishant A. Mehta, Ali Mortazavi

    Abstract: We study an online forecasting setting in which, over $T$ rounds, $N$ strategic experts each report a forecast to a mechanism, the mechanism selects one forecast, and then the outcome is revealed. In any given round, each expert has a belief about the outcome, but the expert wishes to select its report so as to maximize the total number of times it is selected. The goal of the mechanism is to obta…

    Submitted 17 February, 2025; originally announced February 2025.

    Comments: 44 pages

  10. arXiv:2502.08143  [pdf, ps, other]

    cs.LG

    Data-dependent Bounds with $T$-Optimal Best-of-Both-Worlds Guarantees in Multi-Armed Bandits using Stability-Penalty Matching

    Authors: Quan Nguyen, Shinji Ito, Junpei Komiyama, Nishant A. Mehta

    Abstract: Existing data-dependent and best-of-both-worlds regret bounds for multi-armed bandit problems have limited adaptivity as they are either data-dependent but not best-of-both-worlds (BOBW), BOBW but not data-dependent, or have a sub-optimal $O(\sqrt{T\ln{T}})$ worst-case guarantee in the adversarial regime. To overcome these limitations, we propose real-time stability-penalty matching (SPM), a new met…

    Submitted 12 February, 2025; originally announced February 2025.

  11. arXiv:2501.19255  [pdf, other]

    cs.CV

    ContextFormer: Redefining Efficiency in Semantic Segmentation

    Authors: Mian Muhammad Naeem Abid, Nancy Mehta, Zongwei Wu, Radu Timofte

    Abstract: Semantic segmentation assigns labels to pixels in images, a critical yet challenging task in computer vision. Convolutional methods, although capturing local dependencies well, struggle with long-range relationships. Vision Transformers (ViTs) excel in global context capture but are hindered by high computational demands, especially for high-resolution inputs. Most research optimizes the encoder a…

    Submitted 9 March, 2025; v1 submitted 31 January, 2025; originally announced January 2025.

  12. arXiv:2501.05936  [pdf, other]

    cs.CV

    A Multimodal Dataset for Enhancing Industrial Task Monitoring and Engagement Prediction

    Authors: Naval Kishore Mehta, Arvind, Himanshu Kumar, Abeer Banerjee, Sumeet Saurav, Sanjay Singh

    Abstract: Detecting and interpreting operator actions, engagement, and object interactions in dynamic industrial workflows remains a significant challenge in human-robot collaboration research, especially within complex, real-world environments. Traditional unimodal methods often fall short of capturing the intricacies of these unstructured industrial settings. To address this gap, we present a novel Multim…

    Submitted 10 January, 2025; originally announced January 2025.

    Comments: Accepted at the 20th International Conference on Human-Robot Interaction (HRI) 2025

  13. arXiv:2501.05108  [pdf, other]

    cs.CV

    Optimizing Multitask Industrial Processes with Predictive Action Guidance

    Authors: Naval Kishore Mehta, Arvind, Shyam Sunder Prasad, Sumeet Saurav, Sanjay Singh

    Abstract: Monitoring complex assembly processes is critical for maintaining productivity and ensuring compliance with assembly standards. However, variability in human actions and subjective task preferences complicate accurate task anticipation and guidance. To address these challenges, we introduce the Multi-Modal Transformer Fusion and Recurrent Units (MMTFRU) Network for egocentric activity anticipation…

    Submitted 9 January, 2025; originally announced January 2025.

  14. arXiv:2501.04982  [pdf, other]

    cs.RO cs.AI cs.LG

    CuRLA: Curriculum Learning Based Deep Reinforcement Learning for Autonomous Driving

    Authors: Bhargava Uppuluri, Anjel Patel, Neil Mehta, Sridhar Kamath, Pratyush Chakraborty

    Abstract: In autonomous driving, traditional Computer Vision (CV) agents often struggle in unfamiliar situations due to biases in the training data. Deep Reinforcement Learning (DRL) agents address this by learning from experience and maximizing rewards, which helps them adapt to dynamic environments. However, ensuring their generalization remains challenging, especially with static training environments. A…

    Submitted 9 January, 2025; originally announced January 2025.

    Comments: To be published in the 17th International Conference on Agents and Artificial Intelligence (ICAART), Feb 2025

  15. arXiv:2501.04336  [pdf, other]

    cs.CV

    Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

    Authors: Zeyi Huang, Yuyang Ji, Xiaofang Wang, Nikhil Mehta, Tong Xiao, Donghyun Lee, Sigmund Vanvalkenburgh, Shengxin Zha, Bolin Lai, Licheng Yu, Ning Zhang, Yong Jae Lee, Miao Liu

    Abstract: Long-form video understanding with Large Vision Language Models is challenged by the need to analyze temporally dispersed yet spatially concentrated key moments within limited context windows. In this work, we introduce VideoMindPalace, a new framework inspired by the "Mind Palace", which organizes critical video moments into a topologically structured semantic graph. VideoMindPalace organizes key…

    Submitted 8 January, 2025; originally announced January 2025.

  16. arXiv:2412.12166  [pdf]

    cs.CL cs.AI

    Performance of a large language model-Artificial Intelligence based chatbot for counseling patients with sexually transmitted infections and genital diseases

    Authors: Nikhil Mehta, Sithira Ambepitiya, Thanveer Ahamad, Dinuka Wijesundara, Yudara Kularathne

    Abstract: Introduction: Global burden of sexually transmitted infections (STIs) is rising out of proportion to specialists. Current chatbots like ChatGPT are not tailored for handling STI-related concerns out of the box. We developed Otiz, an Artificial Intelligence-based (AI-based) chatbot platform designed specifically for STI detection and counseling, and assessed its performance. Methods: Otiz employs a…

    Submitted 11 December, 2024; originally announced December 2024.

    Comments: 18 pages, 1 table

  17. arXiv:2412.10360  [pdf, other]

    cs.CV cs.AI

    Apollo: An Exploration of Video Understanding in Large Multimodal Models

    Authors: Orr Zohar, Xiaohan Wang, Yann Dubois, Nikhil Mehta, Tong Xiao, Philippe Hansen-Estruch, Licheng Yu, Xiaofang Wang, Felix Juefei-Xu, Ning Zhang, Serena Yeung-Levy, Xide Xia

    Abstract: Despite the rapid integration of video perception capabilities into Large Multimodal Models (LMMs), the underlying mechanisms driving their video understanding remain poorly understood. Consequently, many design decisions in this domain are made without proper justification or analysis. The high computational cost of training and evaluating such models, coupled with limited open research, hinders…

    Submitted 13 December, 2024; originally announced December 2024.

    Comments: https://apollo-lmms.github.io

  18. arXiv:2412.01027  [pdf, other]

    cs.CV

    Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

    Authors: Bolin Lai, Felix Juefei-Xu, Miao Liu, Xiaoliang Dai, Nikhil Mehta, Chenguang Zhu, Zeyi Huang, James M. Rehg, Sangmin Lee, Ning Zhang, Tong Xiao

    Abstract: Text-guided image manipulation has experienced notable advancement in recent years. In order to mitigate linguistic ambiguity, few-shot learning with visual examples has been applied for instructions that are underrepresented in the training set, or difficult to describe purely in language. However, learning from visual prompts requires strong reasoning capability, which diffusion models are strug…

    Submitted 2 December, 2024; v1 submitted 1 December, 2024; originally announced December 2024.

    Comments: 18 pages, 16 figures, 5 tables

  19. arXiv:2411.18466  [pdf, other]

    cs.CV

    Complexity Experts are Task-Discriminative Learners for Any Image Restoration

    Authors: Eduard Zamfir, Zongwei Wu, Nancy Mehta, Yuedong Tan, Danda Pani Paudel, Yulun Zhang, Radu Timofte

    Abstract: Recent advancements in all-in-one image restoration models have revolutionized the ability to address diverse degradations through a unified framework. However, parameters tied to specific tasks often remain inactive for other tasks, making mixture-of-experts (MoE) architectures a natural extension. Despite this, MoEs often show inconsistent behavior, with some experts unexpectedly generalizing ac…

    Submitted 13 March, 2025; v1 submitted 27 November, 2024; originally announced November 2024.

    Comments: Accepted at CVPR 2025

  20. arXiv:2411.12615  [pdf, other]

    cs.CV cs.LG

    A Multimodal Approach Combining Structural and Cross-domain Textual Guidance for Weakly Supervised OCT Segmentation

    Authors: Jiaqi Yang, Nitish Mehta, Xiaoling Hu, Chao Chen, Chia-Ling Tsai

    Abstract: Accurate segmentation of Optical Coherence Tomography (OCT) images is crucial for diagnosing and monitoring retinal diseases. However, the labor-intensive nature of pixel-level annotation limits the scalability of supervised learning with large datasets. Weakly Supervised Semantic Segmentation (WSSS) provides a promising alternative by leveraging image-level labels. In this study, we propose a nov…

    Submitted 19 November, 2024; originally announced November 2024.

    Comments: 21 pages, 9 figures, 8 tables

  21. arXiv:2410.16458  [pdf, other]

    cs.IR cs.AI cs.LG

    STAR: A Simple Training-free Approach for Recommendations using Large Language Models

    Authors: Dong-Ho Lee, Adam Kraft, Long Jin, Nikhil Mehta, Taibai Xu, Lichan Hong, Ed H. Chi, Xinyang Yi

    Abstract: Recent progress in large language models (LLMs) offers promising new approaches for recommendation system tasks. While the current state-of-the-art methods rely on fine-tuning LLMs to achieve optimal results, this process is costly and introduces significant engineering complexities. Conversely, methods that directly use LLMs without additional fine-tuning result in a large drop in recommendation…

    Submitted 19 February, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

  22. arXiv:2410.00690  [pdf, other]

    cs.LG cs.AI math.OC

    Beyond Minimax Rates in Group Distributionally Robust Optimization via a Novel Notion of Sparsity

    Authors: Quan Nguyen, Nishant A. Mehta, Cristóbal Guzmán

    Abstract: The minimax sample complexity of group distributionally robust optimization (GDRO) has been determined up to a $\log(K)$ factor, where $K$ is the number of groups. In this work, we venture beyond the minimax perspective via a novel notion of sparsity that we dub $(λ, β)$-sparsity. In short, this condition means that at any parameter $θ$, there is a set of at most $β$ groups whose risks at $θ$ all…

    Submitted 30 January, 2025; v1 submitted 1 October, 2024; originally announced October 2024.

    Comments: 44 pages. V2: updated a semi-adaptive approach and experimental results

  23. arXiv:2407.21783  [pdf, other]

    cs.AI cs.CL cs.CV

    The Llama 3 Herd of Models

    Authors: Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, et al. (536 additional authors not shown)

    Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical…

    Submitted 23 November, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

  24. Neural Network-Based Bandit: A Medium Access Control for the IIoT Alarm Scenario

    Authors: Prasoon Raghuwanshi, Onel Luis Alcaraz López, Neelesh B. Mehta, Hirley Alves, Matti Latva-aho

    Abstract: Efficient Random Access (RA) is critical for enabling reliable communication in Industrial Internet of Things (IIoT) networks. Herein, we propose a deep reinforcement learning based distributed RA scheme, entitled Neural Network-Based Bandit (NNBB), for the IIoT alarm scenario. In such a scenario, the devices may detect a common critical event, and the goal is to ensure the alarm information is de…

    Submitted 22 November, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

  25. arXiv:2406.18630  [pdf, other]

    cs.LG cs.AI stat.ML

    Improving Hyperparameter Optimization with Checkpointed Model Weights

    Authors: Nikhil Mehta, Jonathan Lorraine, Steve Masson, Ramanathan Arunachalam, Zaid Pervaiz Bhat, James Lucas, Arun George Zachariah

    Abstract: When training deep learning models, the performance depends largely on the selected hyperparameters. However, hyperparameter optimization (HPO) is often one of the most expensive parts of model design. Classical HPO methods treat this as a black-box optimization problem. However, gray-box HPO methods, which incorporate more information about the setup, have emerged as a promising direction for mor…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: See the project website at https://research.nvidia.com/labs/toronto-ai/FMS/

    MSC Class: 68T05

    ACM Class: I.2.6; G.1.6; D.2.8

  26. arXiv:2406.00969  [pdf, other]

    cs.CL

    Using RL to Identify Divisive Perspectives Improves LLMs Abilities to Identify Communities on Social Media

    Authors: Nikhil Mehta, Dan Goldwasser

    Abstract: The large-scale usage of social media, combined with its significant impact, has made it increasingly important to understand it. In particular, identifying user communities can be helpful for many downstream tasks. However, particularly when models are trained on past data and tested on future data, doing this is difficult. In this paper, we hypothesize to take advantage of Large Language Models (L…

    Submitted 2 June, 2024; originally announced June 2024.

  27. arXiv:2405.15475  [pdf, other]

    cs.CV

    Efficient Degradation-aware Any Image Restoration

    Authors: Eduard Zamfir, Zongwei Wu, Nancy Mehta, Danda Pani Paudel, Yulun Zhang, Radu Timofte

    Abstract: Reconstructing missing details from degraded low-quality inputs poses a significant challenge. Recent progress in image restoration has demonstrated the efficacy of learning large models capable of addressing various degradations simultaneously. Nonetheless, these approaches introduce considerable computational overhead and complex learning paradigms, limiting their practical utility. In response,…

    Submitted 1 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  28. arXiv:2405.02178  [pdf, other]

    cs.CL cs.AI

    Assessing and Verifying Task Utility in LLM-Powered Applications

    Authors: Negar Arabzadeh, Siqing Huo, Nikhil Mehta, Qinqyun Wu, Chi Wang, Ahmed Awadallah, Charles L. A. Clarke, Julia Kiseleva

    Abstract: The rapid development of Large Language Models (LLMs) has led to a surge in applications that facilitate collaboration among multiple agents, assisting humans in their daily tasks. However, a significant gap remains in assessing to what extent LLM-powered applications genuinely enhance user experience and task execution efficiency. This highlights the need to verify utility of LLM-powered applicat…

    Submitted 12 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.09015

  29. arXiv:2404.10700  [pdf, other]

    eess.IV cs.CV cs.LG

    Rawformer: Unpaired Raw-to-Raw Translation for Learnable Camera ISPs

    Authors: Georgy Perevozchikov, Nancy Mehta, Mahmoud Afifi, Radu Timofte

    Abstract: Modern smartphone camera quality heavily relies on the image signal processor (ISP) to enhance captured raw images, utilizing carefully designed modules to produce final output images encoded in a standard color space (e.g., sRGB). Neural-based end-to-end learnable ISPs offer promising advancements, potentially replacing traditional ISPs with their ability to adapt without requiring extensive tuni…

    Submitted 15 July, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: Accepted by ECCV 2024

    Journal ref: https://eccv.ecva.net/Conferences/2024

  30. arXiv:2404.10343  [pdf, other]

    cs.CV eess.IV

    The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, et al. (109 additional authors not shown)

    Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such…

    Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

  31. arXiv:2404.05155  [pdf, other]

    cs.LG cs.GT stat.ML

    On the price of exact truthfulness in incentive-compatible online learning with bandit feedback: A regret lower bound for WSU-UX

    Authors: Ali Mortazavi, Junhao Lin, Nishant A. Mehta

    Abstract: In one view of the classical game of prediction with expert advice with binary outcomes, in each round, each expert maintains an adversarially chosen belief and honestly reports this belief. We consider a recently introduced, strategic variant of this problem with selfish (reputation-seeking) experts, where each expert strategically reports in order to maximize their expected future reputation bas…

    Submitted 7 April, 2024; originally announced April 2024.

    Comments: Accepted to AISTATS 2024

  32. arXiv:2404.00245  [pdf, other]

    cs.IR

    Aligning Large Language Models with Recommendation Knowledge

    Authors: Yuwei Cao, Nikhil Mehta, Xinyang Yi, Raghunandan Keshavan, Lukasz Heldt, Lichan Hong, Ed H. Chi, Maheswaran Sathiamoorthy

    Abstract: Large language models (LLMs) have recently been used as backbones for recommender systems. However, their performance often lags behind conventional methods in standard tasks like retrieval. We attribute this to a mismatch between LLMs' knowledge and the knowledge crucial for effective recommendations. While LLMs excel at natural language reasoning, they cannot model complex user-item interactions…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: Accepted to the NAACL 2024 Findings

  33. arXiv:2403.02909  [pdf, other]

    cs.CV cs.HC eess.IV

    Gaze-Vector Estimation in the Dark with Temporally Encoded Event-driven Neural Networks

    Authors: Abeer Banerjee, Naval K. Mehta, Shyam S. Prasad, Himanshu, Sumeet Saurav, Sanjay Singh

    Abstract: In this paper, we address the intricate challenge of gaze vector prediction, a pivotal task with applications ranging from human-computer interaction to driver monitoring systems. Our innovative approach is designed for the demanding setting of extremely low-light conditions, leveraging a novel temporal event encoding scheme, and a dedicated neural network architecture. The temporal encoding metho…

    Submitted 5 March, 2024; originally announced March 2024.

  34. arXiv:2403.01315  [pdf, ps, other]

    cs.LG stat.ML

    Near-optimal Per-Action Regret Bounds for Sleeping Bandits

    Authors: Quan Nguyen, Nishant A. Mehta

    Abstract: We derive near-optimal per-action regret bounds for sleeping bandits, in which both the sets of available arms and their losses in every round are chosen by an adversary. In a setting with $K$ total arms and at most $A$ available arms in each round over $T$ rounds, the best known upper bound is $O(K\sqrt{TA\ln{K}})$, obtained indirectly via minimizing internal sleeping regrets. Compared to the min…

    Submitted 29 May, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

    Comments: V2: corrected Theorem 8 (FTARL's high probability bound) from log(1/delta) to log(K/delta)

  35. arXiv:2402.03412  [pdf, other]

    eess.IV cs.CV

    See More Details: Efficient Image Super-Resolution by Experts Mining

    Authors: Eduard Zamfir, Zongwei Wu, Nancy Mehta, Yulun Zhang, Radu Timofte

    Abstract: Reconstructing high-resolution (HR) images from low-resolution (LR) inputs poses a significant challenge in image super-resolution (SR). While recent approaches have demonstrated the efficacy of intricate operations customized for various objectives, the straightforward stacking of these disparate operations can result in a substantial computational burden, hampering their practical utility. In re…

    Submitted 6 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted at ICML 2024

  36. arXiv:2312.15086  [pdf, other]

    cs.LG cs.CV

    HyperMix: Out-of-Distribution Detection and Classification in Few-Shot Settings

    Authors: Nikhil Mehta, Kevin J Liang, Jing Huang, Fu-Jen Chu, Li Yin, Tal Hassner

    Abstract: Out-of-distribution (OOD) detection is an important topic for real-world machine learning systems, but settings with limited in-distribution samples have been underexplored. Such few-shot OOD settings are challenging, as models have scarce opportunities to learn the data distribution before being tasked with identifying OOD samples. Indeed, we demonstrate that recent state-of-the-art OOD methods f…

    Submitted 22 December, 2023; originally announced December 2023.

  37. arXiv:2312.01167  [pdf, other]

    cs.CV cs.LG stat.ML

    Meta-Learned Attribute Self-Interaction Network for Continual and Generalized Zero-Shot Learning

    Authors: Vinay K Verma, Nikhil Mehta, Kevin J Liang, Aakansha Mishra, Lawrence Carin

    Abstract: Zero-shot learning (ZSL) is a promising approach to generalizing a model to categories unseen during training by leveraging class attributes, but challenges remain. Recently, methods using generative models to combat bias towards classes seen during training have pushed state of the art, but these generative models can be slow or computationally expensive to train. Also, these generative models as…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: Accepted in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024. arXiv admin note: substantial text overlap with arXiv:2102.11856

  38. arXiv:2309.14966  [pdf, other]

    cs.CL cs.AI cs.SI

    Interactively Learning Social Media Representations Improves News Source Factuality Detection

    Authors: Nikhil Mehta, Dan Goldwasser

    Abstract: The rise of social media has enabled the widespread propagation of fake news, text that is published with an intent to spread misinformation and sway beliefs. Rapidly detecting fake news, especially as new events arise, is important to prevent misinformation. While prior works have tackled this problem using supervised learning systems, automatically modeling the complexities of the social media l…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Accepted at Findings of IJCNLP-AACL 2023

  39. arXiv:2309.07384  [pdf, other]

    cs.CL

    An Interactive Framework for Profiling News Media Sources

    Authors: Nikhil Mehta, Dan Goldwasser

    Abstract: The recent rise of social media has led to the spread of large amounts of fake and biased news, content published with the intent to sway beliefs. While detecting and profiling the sources that spread this news is important to maintain a healthy society, it is challenging for automated systems. In this paper, we propose an interactive framework for news media profiling. It combines the strengths…

    Submitted 26 April, 2024; v1 submitted 13 September, 2023; originally announced September 2023.

    Comments: NAACL 2024 Main Conference

  40. arXiv:2308.01563  [pdf, other]

    cs.IR

    Density Weighting for Multi-Interest Personalized Recommendation

    Authors: Nikhil Mehta, Anima Singh, Xinyang Yi, Sagar Jain, Lichan Hong, Ed H. Chi

    Abstract: Using multiple user representations (MUR) to model user behavior instead of a single user representation (SUR) has been shown to improve personalization in recommendation systems. However, the performance gains observed with MUR can be sensitive to the skewness in the item and/or user interest distribution. When the data distribution is highly skewed, the gains observed by learning multiple repres…

    Submitted 3 August, 2023; originally announced August 2023.

  41. arXiv:2306.08121  [pdf, other]

    cs.IR cs.LG

    Better Generalization with Semantic IDs: A Case Study in Ranking for Recommendations

    Authors: Anima Singh, Trung Vu, Nikhil Mehta, Raghunandan Keshavan, Maheswaran Sathiamoorthy, Yilin Zheng, Lichan Hong, Lukasz Heldt, Li Wei, Devansh Tandon, Ed H. Chi, Xinyang Yi

    Abstract: Randomly-hashed item ids are used ubiquitously in recommendation models. However, the learned representations from random hashing prevents generalization across similar items, causing problems of learning unseen and long-tail items, especially when item corpus is large, power-law distributed, and evolving dynamically. In this paper, we propose using content-derived features as a replacement for ra…

    Submitted 30 May, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

  42. arXiv:2305.06474  [pdf, other]

    cs.IR cs.LG

    Do LLMs Understand User Preferences? Evaluating LLMs On User Rating Prediction

    Authors: Wang-Cheng Kang, Jianmo Ni, Nikhil Mehta, Maheswaran Sathiamoorthy, Lichan Hong, Ed Chi, Derek Zhiyuan Cheng

    Abstract: Large Language Models (LLMs) have demonstrated exceptional capabilities in generalizing to new tasks in a zero-shot or few-shot manner. However, the extent to which LLMs can comprehend user preferences based on their previous behavior remains an emerging and still unclear research question. Traditionally, Collaborative Filtering (CF) has been the most effective method for these tasks, predominantl…

    Submitted 10 May, 2023; originally announced May 2023.

  43. arXiv:2305.05065  [pdf, other]

    cs.IR cs.LG

    Recommender Systems with Generative Retrieval

    Authors: Shashank Rajput, Nikhil Mehta, Anima Singh, Raghunandan H. Keshavan, Trung Vu, Lukasz Heldt, Lichan Hong, Yi Tay, Vinh Q. Tran, Jonah Samost, Maciej Kula, Ed H. Chi, Maheswaran Sathiamoorthy

    Abstract: Modern recommender systems perform large-scale retrieval by first embedding queries and item candidates in the same unified space, followed by approximate nearest neighbor search to select top candidates given a query embedding. In this paper, we propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates. To that end,…

    Submitted 3 November, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: To appear in The 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  44. arXiv:2305.04093  [pdf, other]

    cs.LG

    An improved regret analysis for UCB-N and TS-N

    Authors: Nishant A. Mehta

    Abstract: In the setting of stochastic online learning with undirected feedback graphs, Lykouris et al. (2020) previously analyzed the pseudo-regret of the upper confidence bound-based algorithm UCB-N and the Thompson Sampling-based algorithm TS-N. In this note, we show how to improve their pseudo-regret analysis. Our improvement involves refining a key lemma of the previous analysis, allowing a $\log(T)$ f…

    Submitted 6 May, 2023; originally announced May 2023.

    Comments: 5 pages

  45. arXiv:2304.10750  [pdf, other]

    cs.CL cs.AI

    Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents Through Help Feedback

    Authors: Nikhil Mehta, Milagro Teruel, Patricio Figueroa Sanz, Xin Deng, Ahmed Hassan Awadallah, Julia Kiseleva

    Abstract: Many approaches to Natural Language Processing (NLP) tasks often treat them as single-step problems, where an agent receives an instruction, executes it, and is evaluated based on the final outcome. However, human language is inherently interactive, as evidenced by the back-and-forth nature of human conversations. In light of this, we posit that human-AI collaboration should also be interactive, w…

    Submitted 5 February, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: Findings of EACL 2024

  46. arXiv:2304.06703  [pdf, other]

    cs.CV

    Gated Multi-Resolution Transfer Network for Burst Restoration and Enhancement

    Authors: Nancy Mehta, Akshay Dudhane, Subrahmanyam Murala, Syed Waqas Zamir, Salman Khan, Fahad Shahbaz Khan

    Abstract: Burst image processing is becoming increasingly popular in recent years. However, it is a challenging task since individual burst images undergo multiple degradations and often have mutual misalignments resulting in ghosting and zipper artifacts. Existing burst restoration methods usually do not consider the mutual correlation and non-local contextual information among burst frames, which tends to…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: Accepted at CVPR 2023

  47. arXiv:2303.00123  [pdf, ps, other]

    quant-ph cs.MS cs.PF

    QCLAB++: Simulating Quantum Circuits on GPUs

    Authors: Roel Van Beeumen, Daan Camps, Neil Mehta

    Abstract: We introduce qclab++, a light-weight, fully-templated C++ package for GPU-accelerated quantum circuit simulations. The code offers a high degree of portability as it has no external dependencies and the GPU kernels are generated through OpenMP offloading. qclab++ is designed for performance and numerical stability through highly optimized gate simulation algorithms for 1-qubit, controlled 1-qubit,…

    Submitted 28 February, 2023; originally announced March 2023.

    Comments: 13 pages, 10 figures

  48. arXiv:2301.04268  [pdf, other]

    cs.LG cs.AI stat.ML

    Adversarial Online Multi-Task Reinforcement Learning

    Authors: Quan Nguyen, Nishant A. Mehta

    Abstract: We consider the adversarial online multi-task reinforcement learning setting, where in each of $K$ episodes the learner is given an unknown task taken from a finite set of $M$ unknown finite-horizon MDP models. The learner's objective is to minimize its regret with respect to the optimal policy for each task. We assume the MDPs in $\mathcal{M}$ are well-separated under a notion of $λ$-separability…

    Submitted 10 January, 2023; originally announced January 2023.

    Comments: To appear at the 34th International Conference on Algorithmic Learning Theory (ALT 2023)

  49. arXiv:2210.12818  [pdf, other]

    cs.CV

    Pushing the Efficiency Limit Using Structured Sparse Convolutions

    Authors: Vinay Kumar Verma, Nikhil Mehta, Shijing Si, Ricardo Henao, Lawrence Carin

    Abstract: Weight pruning is among the most popular approaches for compressing deep convolutional neural networks. Recent work suggests that in a randomly initialized deep neural network, there exist sparse subnetworks that achieve performance comparable to the original network. Unfortunately, finding these subnetworks involves iterative stages of training and pruning, which can be computationally expensive.…

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: Accepted at the IEEE Winter Conference on Applications of Computer Vision, WACV 2023

  50. arXiv:2210.09132  [pdf, other]

    cs.CL

    Pseudo-OOD training for robust language models

    Authors: Dhanasekar Sundararaman, Nikhil Mehta, Lawrence Carin

    Abstract: While pre-trained large-scale deep models have garnered attention as an important topic for many downstream natural language processing (NLP) tasks, such models often make unreliable predictions on out-of-distribution (OOD) inputs. As such, OOD detection is a key component of a reliable machine-learning model for any industry-scale application. Common approaches often assume access to additional O…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: Work in progress