Showing 1–50 of 62 results for author: Patil, P

Searching in archive cs.
  1. arXiv:2506.05739  [pdf, ps, other]

    cs.CR cs.AI

    To Protect the LLM Agent Against the Prompt Injection Attack with Polymorphic Prompt

    Authors: Zhilong Wang, Neha Nagaraja, Lan Zhang, Hayretdin Bahsi, Pawan Patil, Peng Liu

    Abstract: LLM agents are widely used as agents for customer support, content generation, and code assistance. However, they are vulnerable to prompt injection attacks, where adversarial inputs manipulate the model's behavior. Traditional defenses like input sanitization, guard models, and guardrails are either cumbersome or ineffective. In this paper, we propose a novel, lightweight defense mechanism called…

    Submitted 6 June, 2025; originally announced June 2025.

    Comments: To appear in the Industry Track of the 55th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2025)

  2. arXiv:2504.14582  [pdf, other]

    cs.CV

    NTIRE 2025 Challenge on Image Super-Resolution ($\times$4): Methods and Results

    Authors: Zheng Chen, Kai Liu, Jue Gong, Jingkai Wang, Lei Sun, Zongwei Wu, Radu Timofte, Yulun Zhang, Xiangyu Kong, Xiaoxuan Yu, Hyunhee Park, Suejin Han, Hakjae Jeon, Dafeng Zhang, Hyung-Ju Chun, Donghun Ryou, Inju Ha, Bohyung Han, Lu Zhao, Yuyi Zhang, Pengyu Yan, Jiawei Hu, Pengwei Liu, Fengjun Guo, Hongyuan Yu, et al. (86 additional authors not shown)

    Abstract: This paper presents the NTIRE 2025 image super-resolution ($\times$4) challenge, one of the associated competitions of the 10th NTIRE Workshop at CVPR 2025. The challenge aims to recover high-resolution (HR) images from low-resolution (LR) counterparts generated through bicubic downsampling with a $\times$4 scaling factor. The objective is to develop effective network designs or solutions that ach…

    Submitted 28 April, 2025; v1 submitted 20 April, 2025; originally announced April 2025.

    Comments: NTIRE 2025 webpage: https://www.cvlai.net/ntire/2025. Code: https://github.com/zhengchen1999/NTIRE2025_ImageSR_x4

  3. arXiv:2504.10686  [pdf, other]

    cs.CV eess.IV

    The Tenth NTIRE 2025 Efficient Super-Resolution Challenge Report

    Authors: Bin Ren, Hang Guo, Lei Sun, Zongwei Wu, Radu Timofte, Yawei Li, Yao Zhang, Xinning Chai, Zhengxue Cheng, Yingsheng Qin, Yucai Yang, Li Song, Hongyuan Yu, Pufan Xu, Cheng Wan, Zhijuan Huang, Peng Guo, Shuyuan Cui, Chenjun Li, Xuehai Hu, Pan Pan, Xin Zhang, Heng Zhang, Qing Luo, Linyan Jiang, et al. (122 additional authors not shown)

    Abstract: This paper presents a comprehensive review of the NTIRE 2025 Challenge on Single-Image Efficient Super-Resolution (ESR). The challenge aimed to advance the development of deep models that optimize key computational metrics, i.e., runtime, parameters, and FLOPs, while achieving a PSNR of at least 26.90 dB on the $\operatorname{DIV2K\_LSDIR\_valid}$ dataset and 26.99 dB on the…

    Submitted 14 April, 2025; originally announced April 2025.

    Comments: Accepted by CVPR2025 NTIRE Workshop, Efficient Super-Resolution Challenge Report. 50 pages

  4. arXiv:2503.13762  [pdf, other]

    cs.SE

    Do Unit Proofs Work? An Empirical Study of Compositional Bounded Model Checking for Memory Safety Verification

    Authors: Paschal C. Amusuo, Owen Cochell, Taylor Le Lievre, Parth V. Patil, Aravind Machiry, James C. Davis

    Abstract: Memory safety defects pose a major threat to software reliability, enabling cyberattacks, outages, and crashes. To mitigate these risks, organizations adopt Compositional Bounded Model Checking (BMC), using unit proofs to formally verify memory safety. However, methods for creating unit proofs vary across organizations and are inconsistent within the same project, leading to errors and missed defe…

    Submitted 17 March, 2025; originally announced March 2025.

    Comments: 13 pages

    ACM Class: D.2.4; F.3.1

  5. arXiv:2412.18972  [pdf, other]

    cs.LG cs.AI cs.SE

    Recommending Pre-Trained Models for IoT Devices

    Authors: Parth V. Patil, Wenxin Jiang, Huiyun Peng, Daniel Lugo, Kelechi G. Kalu, Josh LeBlanc, Lawrence Smith, Hyeonwoo Heo, Nathanael Aou, James C. Davis

    Abstract: The availability of pre-trained models (PTMs) has enabled faster deployment of machine learning across applications by reducing the need for extensive training. Techniques like quantization and distillation have further expanded PTM applicability to resource-constrained IoT hardware. Given the many PTM options for any given task, engineers often find it too costly to evaluate each model's suitabil…

    Submitted 25 December, 2024; originally announced December 2024.

    Comments: Accepted at SERP4IOT'25

  6. arXiv:2411.13595  [pdf]

    cs.CV cs.LG

    Towards Accessible Learning: Deep Learning-Based Potential Dysgraphia Detection and OCR for Potentially Dysgraphic Handwriting

    Authors: Vydeki D, Divyansh Bhandari, Pranav Pratap Patil, Aarush Anand Kulkarni

    Abstract: Dysgraphia is a learning disorder that affects handwriting abilities, making it challenging for children to write legibly and consistently. Early detection and monitoring are crucial for providing timely support and interventions. This study applies deep learning techniques to address the dual tasks of dysgraphia detection and optical character recognition (OCR) on handwriting samples from childre…

    Submitted 18 November, 2024; originally announced November 2024.

  7. arXiv:2410.14818  [pdf, other]

    cs.SE

    A Unit Proofing Framework for Code-level Verification: A Research Agenda

    Authors: Paschal C. Amusuo, Parth V. Patil, Owen Cochell, Taylor Le Lievre, James C. Davis

    Abstract: Formal verification provides mathematical guarantees that software is correct. Design-level verification tools ensure software specifications are correct, but they do not expose defects in actual implementations. For this purpose, engineers use code-level tools. However, such tools struggle to scale to large software. The process of "Unit Proofing" mitigates this by decomposing the software and…

    Submitted 30 April, 2025; v1 submitted 18 October, 2024; originally announced October 2024.

    Comments: 5 pages, 2 figures

    ACM Class: D.2.4; F.3.1

  8. arXiv:2410.05222  [pdf, other]

    cs.LG cs.CL cs.CV stat.AP

    Precise Model Benchmarking with Only a Few Observations

    Authors: Riccardo Fogliato, Pratik Patil, Nil-Jana Akpinar, Mathew Monfort

    Abstract: How can we precisely estimate a large language model's (LLM) accuracy on questions belonging to a specific topic within a larger question-answering dataset? The standard direct estimator, which averages the model's accuracy on the questions in each subgroup, may exhibit high variance for subgroups (topics) with small sample sizes. Synthetic regression modeling, which leverages the model's accuracy…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: To appear at EMNLP 2024

  9. arXiv:2410.04363  [pdf, other]

    cs.DC

    Multi Armed Bandit Algorithms Based Virtual Machine Allocation Policy for Security in Multi-Tenant Distributed Systems

    Authors: Pravin Patil, Geetanjali Kale, Tanmay Karmarkar, Ruturaj Ghatage

    Abstract: This work proposes a secure and dynamic VM allocation strategy for multi-tenant distributed systems using the Thompson sampling approach. The method proves more effective and secure compared to epsilon-greedy and upper confidence bound methods, showing lower regret levels. Initially, VM allocation was static, but the unpredictable nature of attacks necessitated a dynamic approach. Historical VM da…

    Submitted 6 October, 2024; originally announced October 2024.

  10. arXiv:2410.01259  [pdf, other]

    stat.ML cs.LG math.ST

    Revisiting Optimism and Model Complexity in the Wake of Overparameterized Machine Learning

    Authors: Pratik Patil, Jin-Hong Du, Ryan J. Tibshirani

    Abstract: Common practice in modern machine learning involves fitting a large number of parameters relative to the number of observations. These overparameterized models can exhibit surprising generalization behavior, e.g., "double descent" in the prediction error curve when plotted against the raw number of model parameters, or another simplistic notion of complexity. In this paper, we revisit model comp…

    Submitted 2 October, 2024; originally announced October 2024.

    Comments: 59 pages, 17 figures

  11. arXiv:2408.15784  [pdf, other]

    cs.LG math.ST stat.ML

    Implicit Regularization Paths of Weighted Neural Representations

    Authors: Jin-Hong Du, Pratik Patil

    Abstract: We study the implicit regularization effects induced by (observation) weighting of pretrained features. For weight and feature matrices of bounded operator norms that are infinitesimally free with respect to (normalized) trace functionals, we derive equivalence paths connecting different weighting matrices and ridge regularization levels. Specifically, we show that ridge estimators trained on weig…

    Submitted 28 August, 2024; originally announced August 2024.

    Comments: 19 pages for main and 19 pages for appendix

  12. arXiv:2408.09236  [pdf]

    cs.IR cs.AI

    Hybrid Semantic Search: Unveiling User Intent Beyond Keywords

    Authors: Aman Ahluwalia, Bishwajit Sutradhar, Karishma Ghosh, Indrapal Yadav, Arpan Sheetal, Prashant Patil

    Abstract: This paper addresses the limitations of traditional keyword-based search in understanding user intent and introduces a novel hybrid search approach that leverages the strengths of non-semantic search engines, Large Language Models (LLMs), and embedding models. The proposed system integrates keyword matching, semantic vector embeddings, and LLM-generated structured queries to deliver highly relevan…

    Submitted 6 September, 2024; v1 submitted 17 August, 2024; originally announced August 2024.

  13. arXiv:2407.18423  [pdf, other]

    cs.LG cs.AI

    HDL-GPT: High-Quality HDL is All You Need

    Authors: Bhuvnesh Kumar, Saurav Nanda, Ganapathy Parthasarathy, Pawan Patil, Austin Tsai, Parivesh Choudhary

    Abstract: This paper presents Hardware Description Language Generative Pre-trained Transformers (HDL-GPT), a novel approach that leverages the vast repository of open-source Hardware Description Language (HDL) codes to train superior quality large code models. The core premise of this paper is the hypothesis that high-quality HDL is all you need to create models with exceptional performance and broad zero-shot g…

    Submitted 25 July, 2024; originally announced July 2024.

    Comments: DAC 2024 Invited Paper

  14. arXiv:2406.07320  [pdf, other]

    cs.CV stat.AP

    A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation

    Authors: Riccardo Fogliato, Pratik Patil, Mathew Monfort, Pietro Perona

    Abstract: Model performance evaluation is a critical and expensive task in machine learning and computer vision. Without clear guidelines, practitioners often estimate model accuracy using a one-time completely random selection of the data. However, by employing tailored sampling and estimation strategies, one can obtain more precise estimates and reduce annotation costs. In this paper, we propose a statist…

    Submitted 18 July, 2024; v1 submitted 11 June, 2024; originally announced June 2024.

    Comments: To appear at ECCV 2024

  15. arXiv:2405.06859  [pdf, other]

    cs.LG cs.AI cs.CV

    Reimplementation of Learning to Reweight Examples for Robust Deep Learning

    Authors: Parth Patil, Ben Boardley, Jack Gardner, Emily Loiselle, Deerajkumar Parthipan

    Abstract: Deep neural networks (DNNs) have been used to create models for many complex analysis problems like image recognition and medical diagnosis. DNNs are a popular tool within machine learning due to their ability to model complex patterns and distributions. However, the performance of these networks is highly dependent on the quality of the data used to train the models. Two characteristics of these…

    Submitted 10 May, 2024; originally announced May 2024.

  16. arXiv:2404.01233  [pdf, other]

    math.ST cs.LG stat.ML

    Optimal Ridge Regularization for Out-of-Distribution Prediction

    Authors: Pratik Patil, Jin-Hong Du, Ryan J. Tibshirani

    Abstract: We study the behavior of optimal ridge regularization and optimal ridge risk for out-of-distribution prediction, where the test distribution deviates arbitrarily from the train distribution. We establish general conditions that determine the sign of the optimal regularization level under covariate and regression shifts. These conditions capture the alignment between the covariance and signal struc…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 59 pages, 14 figures

  17. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1112 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 16 December, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  18. arXiv:2402.16793  [pdf, other]

    math.ST cs.LG stat.ML

    Failures and Successes of Cross-Validation for Early-Stopped Gradient Descent

    Authors: Pratik Patil, Yuchen Wu, Ryan J. Tibshirani

    Abstract: We analyze the statistical properties of generalized cross-validation (GCV) and leave-one-out cross-validation (LOOCV) applied to early-stopped gradient descent (GD) in high-dimensional least squares regression. We prove that GCV is generically inconsistent as an estimator of the prediction risk of early-stopped GD, even for a well-specified linear model with isotropic features. In contrast, we sh…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 76 pages, 27 figures

  19. Parallel Approximate Maximum Flows in Near-Linear Work and Polylogarithmic Depth

    Authors: Arpit Agarwal, Sanjeev Khanna, Huan Li, Prathamesh Patil, Chen Wang, Nathan White, Peilin Zhong

    Abstract: We present a parallel algorithm for the $(1-ε)$-approximate maximum flow problem in capacitated, undirected graphs with $n$ vertices and $m$ edges, achieving $O(ε^{-3}\text{polylog} n)$ depth and $O(m ε^{-3} \text{polylog} n)$ work in the PRAM model. Although near-linear time sequential algorithms for this problem have been known for almost a decade, no parallel algorithms that simultaneously achi…

    Submitted 22 February, 2024; originally announced February 2024.

  20. arXiv:2402.14031  [pdf, ps, other]

    eess.SY cs.LG

    Autoencoder with Ordered Variance for Nonlinear Model Identification

    Authors: Midhun T. Augustine, Parag Patil, Mani Bhushan, Sharad Bhartiya

    Abstract: This paper presents a novel autoencoder with ordered variance (AEO) in which the loss function is modified with a variance regularization term to enforce order in the latent space. Further, the autoencoder is modified using ResNets, which results in a ResNet AEO (RAEO). The paper also illustrates the effectiveness of AEO and RAEO in extracting nonlinear relationships among input variables in an un…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 14 pages, 8 figures

  21. arXiv:2402.14013  [pdf, ps, other]

    cs.LG cs.DS

    Misalignment, Learning, and Ranking: Harnessing Users Limited Attention

    Authors: Arpit Agarwal, Rad Niazadeh, Prathamesh Patil

    Abstract: In digital health and EdTech, recommendation systems face a significant challenge: users often choose impulsively, in ways that conflict with the platform's long-term payoffs. This misalignment makes it difficult to effectively learn to rank items, as it may hinder exploration of items with greater long-term payoffs. Our paper tackles this issue by utilizing users' limited attention spans. We prop…

    Submitted 21 February, 2024; originally announced February 2024.

  22. Maximum Likelihood Quantum Error Mitigation for Algorithms with a Single Correct Output

    Authors: Dror Baron, Hrushikesh Pramod Patil, Huiyang Zhou

    Abstract: Quantum error mitigation is an important technique to reduce the impact of noise in quantum computers. With more and more qubits being supported on quantum computers, there are two emerging fundamental challenges. First, the number of shots required for quantum algorithms with large numbers of qubits needs to increase in order to obtain a meaningful distribution or expected value of an observable.…

    Submitted 18 February, 2024; originally announced February 2024.

    Comments: 10 pages, 1 figure

    Journal ref: 2024 IEEE International Conference on Quantum Computing and Engineering (QCE)

  23. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1326 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 9 May, 2025; v1 submitted 18 December, 2023; originally announced December 2023.

  24. arXiv:2312.06585  [pdf, other]

    cs.LG

    Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

    Authors: Avi Singh, John D. Co-Reyes, Rishabh Agarwal, Ankesh Anand, Piyush Patil, Xavier Garcia, Peter J. Liu, James Harrison, Jaehoon Lee, Kelvin Xu, Aaron Parisi, Abhishek Kumar, Alex Alemi, Alex Rizkowsky, Azade Nova, Ben Adlam, Bernd Bohnet, Gamaleldin Elsayed, Hanie Sedghi, Igor Mordatch, Isabelle Simpson, Izzeddin Gur, Jasper Snoek, Jeffrey Pennington, Jiri Hron, et al. (16 additional authors not shown)

    Abstract: Fine-tuning language models (LMs) on human-generated data remains a prevalent practice. However, the performance of such models is often limited by the quantity and diversity of high-quality human data. In this paper, we explore whether we can go beyond human data on tasks where we have access to scalar feedback, for example, on math problems where one can verify correctness. To do so, we investig…

    Submitted 17 April, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted to TMLR. Camera-ready version. First three authors contributed equally

  25. Enhancing Low Resource NER Using Assisting Language And Transfer Learning

    Authors: Maithili Sabane, Aparna Ranade, Onkar Litake, Parth Patil, Raviraj Joshi, Dipali Kadam

    Abstract: Named Entity Recognition (NER) is a fundamental task in NLP that is used to locate the key information in text and is primarily applied in conversational and search systems. In commercial applications, NER or comparable slot-filling methods have been widely deployed for popular languages. NER is used in applications such as human resources, customer service, search engines, content classification,…

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: Accepted at International Conference on Applied Artificial Intelligence and Computing (ICAAIC) 2023

  26. arXiv:2306.01198  [pdf, other]

    stat.ME cs.CV stat.ML

    Confidence Intervals for Error Rates in 1:1 Matching Tasks: Critical Statistical Analysis and Recommendations

    Authors: Riccardo Fogliato, Pratik Patil, Pietro Perona

    Abstract: Matching algorithms are commonly used to predict matches between items in a collection. For example, in 1:1 face verification, a matching algorithm predicts whether two face images depict the same person. Accurately assessing the uncertainty of the error rates of such algorithms can be challenging when data are dependent and error rates are low, two aspects that have been often overlooked in the l…

    Submitted 26 April, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

  27. arXiv:2305.18496  [pdf, other]

    math.ST cs.LG stat.ML

    Generalized equivalences between subsampling and ridge regularization

    Authors: Pratik Patil, Jin-Hong Du

    Abstract: We establish precise structural and risk equivalences between subsampling and ridge regularization for ensemble ridge estimators. Specifically, we prove that linear and quadratic functionals of subsample ridge estimators, when fitted with different ridge regularization levels $λ$ and subsample aspect ratios $ψ$, are asymptotically equivalent along specific paths in the $(λ,ψ)$-plane (where $ψ$ is…

    Submitted 17 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: Fixed typos; added a figure to illustrate the risk monotonicity of optimal ridge

  28. arXiv:2304.13016  [pdf, other]

    math.ST cs.LG stat.ML

    Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation

    Authors: Jin-Hong Du, Pratik Patil, Arun Kumar Kuchibhotla

    Abstract: We study subsampling-based ridge ensembles in the proportional asymptotics regime, where the feature size grows proportionally with the sample size such that their ratio converges to a constant. By analyzing the squared prediction risk of ridge ensembles as a function of the explicit penalty $λ$ and the limiting subsample aspect ratio $φ_s$ (the ratio of the feature size to the subsample size), we…

    Submitted 16 July, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

    Comments: 47 pages, 11 figures; this version fixes minor typos. arXiv admin note: text overlap with arXiv:2210.11445

  29. arXiv:2301.07341  [pdf, other]

    cs.CL cs.AI

    KILDST: Effective Knowledge-Integrated Learning for Dialogue State Tracking using Gazetteer and Speaker Information

    Authors: Hyungtak Choi, Hyeonmok Ko, Gurpreet Kaur, Lohith Ravuru, Kiranmayi Gandikota, Manisha Jhawar, Simma Dharani, Pranamya Patil

    Abstract: Dialogue State Tracking (DST) is core research in dialogue systems and has received much attention. In addition, it is necessary to define a new problem that can deal with dialogue between users as a step toward the conversational AI that extracts and recommends information from the dialogue between users. So, we introduce a new task - DST from dialogue between users about scheduling an event (DST…

    Submitted 18 January, 2023; originally announced January 2023.

  30. arXiv:2211.03751  [pdf, other]

    math.NA cs.DS math.ST

    Asymptotics of the Sketched Pseudoinverse

    Authors: Daniel LeJeune, Pratik Patil, Hamid Javadi, Richard G. Baraniuk, Ryan J. Tibshirani

    Abstract: We take a random matrix theory approach to random sketching and show an asymptotic first-order equivalence of the regularized sketched pseudoinverse of a positive semidefinite matrix to a certain evaluation of the resolvent of the same matrix. We focus on real-valued regularization and extend previous results on an asymptotic equivalence of random matrices to the real setting, providing a precise…

    Submitted 6 October, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

    Comments: 45 pages, 9 figures

    MSC Class: 15B52; 46L54; 62J07

  31. arXiv:2209.02438  [pdf]

    cs.CV

    Threat Detection In Self-Driving Vehicles Using Computer Vision

    Authors: Umang Goenka, Aaryan Jagetia, Param Patil, Akshay Singh, Taresh Sharma, Poonam Saini

    Abstract: On-road obstacle detection is an important field of research that falls in the scope of intelligent transportation infrastructure systems. The use of vision-based approaches results in an accurate and cost-effective solution to such systems. In this research paper, we propose a threat detection mechanism for autonomous self-driving cars using dashcam videos to ensure the presence of any unwanted o…

    Submitted 6 September, 2022; originally announced September 2022.

    Comments: Presented at the 3rd International Conference on Machine Learning, Image Processing, Network Security and Data Sciences (MIND-2021)

  32. arXiv:2209.00627  [pdf, other]

    q-bio.NC cs.LG

    Classification of Electroencephalograms during Mathematical Calculations Using Deep Learning

    Authors: Umang Goenka, Param Patil, Kush Gosalia, Aaryan Jagetia

    Abstract: Classifying Electroencephalogram (EEG) signals helps in understanding Brain-Computer Interface (BCI). EEG signals are vital in studying how the human mind functions. In this paper, we have used an Arithmetic Calculation dataset consisting of Before Calculation Signals (BCS) and During Calculation Signals (DCS). The dataset consisted of 36 participants. In order to understand the functioning of neur…

    Submitted 31 August, 2022; originally announced September 2022.

    Comments: Paper presented at the IEEE 23rd International Conference on Information Reuse and Integration for Data Science

  33. arXiv:2207.04588  [pdf, other]

    stat.ML cs.LG

    Multi-Study Boosting: Theoretical Considerations for Merging vs. Ensembling

    Authors: Cathy Shyr, Pragya Sur, Giovanni Parmigiani, Prasad Patil

    Abstract: Cross-study replicability is a powerful model evaluation criterion that emphasizes generalizability of predictions. When training cross-study replicable prediction models, it is critical to decide between merging and treating the studies separately. We study boosting algorithms in the presence of potential heterogeneity in predictor-outcome relationships across studies and compare two multi-study…

    Submitted 12 July, 2022; v1 submitted 10 July, 2022; originally announced July 2022.

  34. arXiv:2206.07633  [pdf, other]

    cs.DS cs.LG

    Sublinear Algorithms for Hierarchical Clustering

    Authors: Arpit Agarwal, Sanjeev Khanna, Huan Li, Prathamesh Patil

    Abstract: Hierarchical clustering over graphs is a fundamental task in data mining and machine learning with applications in domains such as phylogenetics, social network analysis, and information retrieval. Specifically, we consider the recently popularized objective function for hierarchical clustering due to Dasgupta. Previous algorithms for (approximately) minimizing this objective function require line…

    Submitted 15 June, 2022; originally announced June 2022.

  35. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  36. arXiv:2205.12937  [pdf, other]

    math.ST cs.LG stat.ML

    Mitigating multiple descents: A model-agnostic framework for risk monotonization

    Authors: Pratik Patil, Arun Kumar Kuchibhotla, Yuting Wei, Alessandro Rinaldo

    Abstract: Recent empirical and theoretical analyses of several commonly used prediction procedures reveal a peculiar risk behavior in high dimensions, referred to as double/multiple descent, in which the asymptotic risk is a non-monotonic function of the limiting aspect ratio of the number of features or parameters to the sample size. To mitigate this undesirable behavior, we develop a general framework for…

    Submitted 25 May, 2022; originally announced May 2022.

    Comments: 110 pages, 15 figures

  37. arXiv:2205.00984  [pdf, ps, other]

    cs.LG stat.ML

    A Sharp Memory-Regret Trade-Off for Multi-Pass Streaming Bandits

    Authors: Arpit Agarwal, Sanjeev Khanna, Prathamesh Patil

    Abstract: The stochastic $K$-armed bandit problem has been studied extensively due to its applications in various domains ranging from online advertising to clinical trials. In practice however, the number of arms can be very large resulting in large memory requirements for simultaneously processing them. In this paper we consider a streaming setting where the arms are presented in a stream and the algorith…

    Submitted 2 May, 2022; originally announced May 2022.

  38. arXiv:2204.06029  [pdf, other]

    cs.CL cs.LG

    L3Cube-MahaNER: A Marathi Named Entity Recognition Dataset and BERT models

    Authors: Parth Patil, Aparna Ranade, Maithili Sabane, Onkar Litake, Raviraj Joshi

    Abstract: Named Entity Recognition (NER) is a basic NLP task and finds major applications in conversational and search systems. It helps us identify key entities in a sentence used for the downstream application. NER or similar slot filling systems for popular languages have been heavily used in commercial applications. In this work, we focus on Marathi, an Indian language, spoken prominently by the people…

    Submitted 12 April, 2022; originally announced April 2022.

  39. Mono vs Multilingual BERT: A Case Study in Hindi and Marathi Named Entity Recognition

    Authors: Onkar Litake, Maithili Sabane, Parth Patil, Aparna Ranade, Raviraj Joshi

    Abstract: Named entity recognition (NER) is the process of recognising and classifying important information (entities) in text. Proper nouns, such as a person's name, an organization's name, or a location's name, are examples of entities. The NER is one of the important modules in applications like human resources, customer support, search engines, content classification, and academia. In this work, we con…

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: Accepted at ICMISC 2022

  40. arXiv:2111.13813  [pdf]

    cs.CV cs.AI

    Video Content Classification using Deep Learning

    Authors: Pradyumn Patil, Vishwajeet Pawar, Yashraj Pawar, Shruti Pisal

    Abstract: Video content classification is an important research content in computer vision, which is widely used in many fields, such as image and video retrieval, computer vision. This paper presents a model that is a combination of Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) which develops, trains, and optimizes a deep learning network that can identify the type of video content…

    Submitted 26 November, 2021; originally announced November 2021.

    Comments: For the associated dataset, see https://github.com/coin-dataset/annotations

  41. Blockchain-based Security Services for Fog Computing

    Authors: Arvind W. Kiwelekar, Pramod Patil, Laxman D. Netak, Sanjay U Waikar

    Abstract: Fog computing is a paradigm for distributed computing that enables sharing of resources such as computing, storage and network services. Unlike cloud computing, fog computing platforms primarily support {\em non-functional properties} such as location awareness, mobility and reduced latency. This emerging paradigm has many potential applications in domains such as smart grids, smart cities, and tr…

    Submitted 16 February, 2021; originally announced February 2021.

    Comments: This is a pre-print of the following Chapter: Arvind W. Kiwelekar, Pramod Patil, Laxman D. Netak and Sanjay U Waikar, {\em Blockchain-Based Security Services for Fog Computing} accepted and final version is published in Chang W., Wu J. (eds) Fog/Edge Computing For Security, Privacy, and Applications. Advances in Information Security, vol 83. Springer

  42. arXiv:2009.14108  [pdf, other]

    cs.LG cs.AI stat.ML

    Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution

    Authors: Vihang P. Patil, Markus Hofmarcher, Marius-Constantin Dinu, Matthias Dorfer, Patrick M. Blies, Johannes Brandstetter, Jose A. Arjona-Medina, Sepp Hochreiter

    Abstract: Reinforcement learning algorithms require many samples when solving complex hierarchical tasks with sparse and delayed rewards. For such complex tasks, the recently proposed RUDDER uses reward redistribution to leverage steps in the Q-function that are associated with accomplishing sub-tasks. However, often only few episodes with high rewards are available as demonstrations since current explorati…

    Submitted 28 June, 2022; v1 submitted 29 September, 2020; originally announced September 2020.

    Comments: Github: https://github.com/ml-jku/align-rudder, YouTube: https://youtu.be/HO-_8ZUl-UY

  43. arXiv:2008.10901  [pdf, ps, other]

    cs.IT eess.SP

    Uplink-Downlink Duality Between Multiple-Access and Broadcast Channels with Compressing Relays

    Authors: Liang Liu, Ya-Feng Liu, Pratik Patil, Wei Yu

    Abstract: Uplink-downlink duality refers to the fact that under a sum-power constraint, the capacity regions of a Gaussian multiple-access channel and a Gaussian broadcast channel with Hermitian transposed channel matrices are identical. This paper generalizes this result to a cooperative cellular network, in which remote access-points are deployed as relays in serving the users under the coordination of a…

    Submitted 26 August, 2021; v1 submitted 25 August, 2020; originally announced August 2020.

    Comments: Accepted in IEEE Transactions on Information Theory; 34 pages

  44. arXiv:2008.04849  [pdf, other]

    q-bio.PE cs.OH physics.soc-ph q-bio.QM

    City-Scale Agent-Based Simulators for the Study of Non-Pharmaceutical Interventions in the Context of the COVID-19 Epidemic

    Authors: Shubhada Agrawal, Siddharth Bhandari, Anirban Bhattacharjee, Anand Deo, Narendra M. Dixit, Prahladh Harsha, Sandeep Juneja, Poonam Kesarwani, Aditya Krishna Swamy, Preetam Patil, Nihesh Rathod, Ramprasad Saptharishi, Sharad Shriram, Piyush Srivastava, Rajesh Sundaresan, Nidhin Koshy Vaidhiyan, Sarath Yasodharan

    Abstract: We highlight the usefulness of city-scale agent-based simulators in studying various non-pharmaceutical interventions to manage an evolving pandemic. We ground our studies in the context of the COVID-19 pandemic and demonstrate the power of the simulator via several exploratory case studies in two metropolises, Bengaluru and Mumbai. Such tools become common-place in any city administration's tool…

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: 56 pages

    Journal ref: Journal of the Indian Institute of Science, volume 100, pages 809-847, 2020

  45. arXiv:2006.11478  [pdf, ps, other]

    cs.LG stat.ML

    Representation via Representations: Domain Generalization via Adversarially Learned Invariant Representations

    Authors: Zhun Deng, Frances Ding, Cynthia Dwork, Rachel Hong, Giovanni Parmigiani, Prasad Patil, Pragya Sur

    Abstract: We investigate the power of censoring techniques, first developed for learning {\em fair representations}, to address domain generalization. We examine {\em adversarial} censoring techniques for learning invariant representations from multiple "studies" (or domains), where each study is drawn according to a distribution on domains. The mapping is used at test time to classify instances from a new…

    Submitted 19 June, 2020; originally announced June 2020.

  46. arXiv:2006.03375  [pdf, other]

    q-bio.PE cs.OH physics.soc-ph q-bio.QM

    COVID-19 Epidemic Study II: Phased Emergence From the Lockdown in Mumbai

    Authors: Prahladh Harsha, Sandeep Juneja, Preetam Patil, Nihesh Rathod, Ramprasad Saptharishi, A. Y. Sarath, Sharad Sriram, Piyush Srivastava, Rajesh Sundaresan, Nidhin Koshy Vaidhiyan

    Abstract: The nation-wide lockdown starting 25 March 2020, aimed at suppressing the spread of the COVID-19 disease, was extended until 31 May 2020 in three subsequent orders by the Government of India. The extended lockdown has had significant social and economic consequences and `lockdown fatigue' has likely set in. Phased reopening began from 01 June 2020 onwards. Mumbai, one of the most crowded cities in…

    Submitted 5 June, 2020; originally announced June 2020.

    Comments: 34 pages

  47. arXiv:2002.09943  [pdf, other]

    cs.LG cs.SI stat.ML

    Network Clustering Via Kernel-ARMA Modeling and the Grassmannian: The Brain-Network Case

    Authors: Cong Ye, Konstantinos Slavakis, Pratik V. Patil, Johan Nakuci, Sarah F. Muldoon, John Medaglia

    Abstract: This paper introduces a clustering framework for networks with nodes annotated with time-series data. The framework addresses all types of network-clustering problems: State clustering, node clustering within states (a.k.a. topology identification or community detection), and even subnetwork-state-sequence identification/tracking. Via a bottom-up approach, features are first extracted from the raw…

    Submitted 18 February, 2020; originally announced February 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1906.02292

  48. arXiv:1906.02292  [pdf, other]

    cs.LG eess.SP stat.ML

    Brain-Network Clustering via Kernel-ARMA Modeling and the Grassmannian

    Authors: Cong Ye, Konstantinos Slavakis, Pratik V. Patil, Sarah F. Muldoon, John Medaglia

    Abstract: Recent advances in neuroscience and in the technology of functional magnetic resonance imaging (fMRI) and electro-encephalography (EEG) have propelled a growing interest in brain-network clustering via time-series analysis. Notwithstanding, most of the brain-network clustering methods revolve around state clustering and/or node clustering (a.k.a. community detection or topology inference) within s…

    Submitted 5 June, 2019; originally announced June 2019.

  49. arXiv:1905.07382  [pdf, other]

    stat.ML cs.LG

    Merging versus Ensembling in Multi-Study Prediction: Theoretical Insight from Random Effects

    Authors: Zoe Guan, Giovanni Parmigiani, Prasad Patil

    Abstract: A critical decision point when training predictors using multiple studies is whether studies should be combined or treated separately. We compare two multi-study prediction approaches in the presence of potential heterogeneity in predictor-outcome relationships across datasets: 1) merging all of the datasets and training a single learner, and 2) multi-study ensembling, which involves training a se…

    Submitted 12 December, 2024; v1 submitted 17 May, 2019; originally announced May 2019.

  50. arXiv:1806.00673  [pdf, ps, other]

    cs.IT

    Hybrid Data-Sharing and Compression Strategy for Downlink Cloud Radio Access Network

    Authors: Pratik Patil, Binbin Dai, Wei Yu

    Abstract: This paper studies transmission strategies for the downlink of a cloud radio access network, in which the base stations are connected to a centralized cloud-computing based processor with digital fronthaul or backhaul links. We provide a system-level performance comparison of two fundamentally different strategies, namely the data-sharing strategy and the compression strategy, that differ in the w…

    Submitted 2 June, 2018; originally announced June 2018.

    Comments: 15 pages, 8 figures, to appear in IEEE Transactions on Communications