Showing 1–50 of 153 results for author: Khashayar

  1. arXiv:2506.20786  [pdf]

    cs.CV

    AI-Driven MRI-based Brain Tumour Segmentation Benchmarking

    Authors: Connor Ludwig, Khashayar Namdar, Farzad Khalvati

    Abstract: Medical image segmentation has greatly aided medical diagnosis, with U-Net based architectures and nnU-Net providing state-of-the-art performance. There have been numerous general promptable models and medical variations introduced in recent years, but there is currently a lack of evaluation and comparison of these models across a variety of prompt qualities on a common medical dataset. This resea…

    Submitted 25 June, 2025; originally announced June 2025.

  2. arXiv:2506.04295  [pdf]

    cs.LO

    Logical Inferentialism & Attacks on Classical Logic

    Authors: Khashayar Irani

    Abstract: This paper undertakes a foundational inquiry into logical inferentialism with particular emphasis on the normative standards it establishes and the implications these pose for classical logic. The central question addressed herein is: 'What is Logical Inferentialism & How do its Standards challenge Classical Logic?' In response, the study begins with a survey of the three principal proof systems t…

    Submitted 4 June, 2025; originally announced June 2025.

    Comments: Draft

  3. arXiv:2505.23549  [pdf, ps, other]

    cs.SE

    LLM-based Property-based Test Generation for Guardrailing Cyber-Physical Systems

    Authors: Khashayar Etemadi, Marjan Sirjani, Mahshid Helali Moghadam, Per Strandberg, Paul Pettersson

    Abstract: Cyber-physical systems (CPSs) are complex systems that integrate physical, computational, and communication subsystems. The heterogeneous nature of these systems makes their safety assurance challenging. In this paper, we propose a novel automated approach for guardrailing cyber-physical systems using property-based tests (PBTs) generated by Large Language Models (LLMs). Our approach employs an LL…

    Submitted 13 June, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

  4. arXiv:2505.09486  [pdf, other]

    cs.LG cs.AI

    Preserving Plasticity in Continual Learning with Adaptive Linearity Injection

    Authors: Seyed Roozbeh Razavi Rohani, Khashayar Khajavi, Wesley Chung, Mo Chen, Sharan Vaswani

    Abstract: Loss of plasticity in deep neural networks is the gradual reduction in a model's capacity to incrementally learn and has been identified as a key obstacle to learning in non-stationary problem settings. Recent work has shown that deep linear networks tend to be resilient towards loss of plasticity. Motivated by this observation, we propose Adaptive Linearization (AdaLin), a general approach that d…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: Accepted in 4th Conference on Lifelong Learning Agents (CoLLAs), 2025

  5. arXiv:2505.06070  [pdf, other]

    eess.SY

    Zero Dynamics Attack Detection and Isolation in Cyber-Physical Systems with Event-triggered Communication

    Authors: Ali Eslami, Khashayar Khorasani

    Abstract: This paper investigates the problem of Zero Dynamics (ZD) cyber-attack detection and isolation in Cyber-Physical Systems (CPS). By utilizing the notion of auxiliary systems with event-based communications, we will develop a detection mechanism capable of detecting and isolating the ZD cyber-attack even when the attackers have full knowledge of the dynamics of the auxiliary system and can launch Fa…

    Submitted 9 May, 2025; originally announced May 2025.

    Comments: 9 pages, 7 figures

  6. arXiv:2505.04994  [pdf, other]

    cs.CL cs.AI

    Rethinking Invariance in In-context Learning

    Authors: Lizhe Fang, Yifei Wang, Khashayar Gatmiry, Lei Fang, Yisen Wang

    Abstract: In-Context Learning (ICL) has emerged as a pivotal capability of auto-regressive large language models, yet it is hindered by a notable sensitivity to the ordering of context examples regardless of their mutual independence. To address this issue, recent studies have introduced several variant algorithms of ICL that achieve permutation invariance. However, many of these do not exhibit comparable p…

    Submitted 8 May, 2025; originally announced May 2025.

  7. arXiv:2505.04350  [pdf, other]

    math.NA

    On the one-dimensional SPH approximation of fractional-order operators

    Authors: Khashayar Ghorbani, Fabio Semperlotti

    Abstract: This work presents a theoretical formalism and the corresponding numerical techniques to obtain the approximation of fractional-order operators over a 1D domain via the smoothed particle hydrodynamics (SPH) method. The method is presented for both constant- and variable-order operators, in either integral or differential forms. Several numerical examples are presented in order to validate the theo…

    Submitted 7 May, 2025; originally announced May 2025.

  8. arXiv:2505.00467  [pdf, ps, other]

    cs.CL cs.AI

    Red Teaming Large Language Models for Healthcare

    Authors: Vahid Balazadeh, Michael Cooper, David Pellow, Atousa Assadi, Jennifer Bell, Mark Coastworth, Kaivalya Deshpande, Jim Fackler, Gabriel Funingana, Spencer Gable-Cook, Anirudh Gangadhar, Abhishek Jaiswal, Sumanth Kaja, Christopher Khoury, Amrit Krishnan, Randy Lin, Kaden McKeen, Sara Naimimohasses, Khashayar Namdar, Aviraj Newatia, Allan Pang, Anshul Pattoo, Sameer Peesapati, Diana Prepelita, Bogdana Rakova, et al. (10 additional authors not shown)

    Abstract: We present the design process and findings of the pre-conference workshop at the Machine Learning for Healthcare Conference (2024) entitled Red Teaming Large Language Models for Healthcare, which took place on August 15, 2024. Conference participants, comprising a mix of computational and clinical expertise, attempted to discover vulnerabilities -- realistic clinical prompts for which a large lang…

    Submitted 1 May, 2025; originally announced May 2025.

  9. arXiv:2504.19318  [pdf, other]

    cs.RO eess.SY

    Unscented Particle Filter for Visual-inertial Navigation using IMU and Landmark Measurements

    Authors: Khashayar Ghanizadegan, Hashim A. Hashim

    Abstract: This paper introduces a geometric Quaternion-based Unscented Particle Filter for Visual-Inertial Navigation (QUPF-VIN) specifically designed for a vehicle operating with six degrees of freedom (6 DoF). The proposed QUPF-VIN technique is quaternion-based, capturing the inherently nonlinear nature of true navigation kinematics. The filter fuses data from a low-cost inertial measurement unit (IMU) and…

    Submitted 27 April, 2025; originally announced April 2025.

  10. arXiv:2503.14986  [pdf]

    eess.SY

    Enhancing Fault Detection and Isolation in an All-Electric Auxiliary Power Unit (APU) Gas Generator by Utilizing Starter/Generator Signal

    Authors: Haotian Mao, Khashayar Khorasani, Yingqing Guo

    Abstract: This study proposes a novel paradigm for enhancing fault detection and isolation (FDI) of gas generators in an all-electric auxiliary power unit (APU) by utilizing shaft power information from the starter/generator. First, we conduct a pioneering investigation into the challenges and opportunities for FDI brought about by APU electrification. Our analysis reveals that the electrification of the APU opens…

    Submitted 19 March, 2025; originally announced March 2025.

  11. DeepUKF-VIN: Adaptively-tuned Deep Unscented Kalman Filter for 3D Visual-Inertial Navigation based on IMU-Vision-Net

    Authors: Khashayar Ghanizadegan, Hashim A. Hashim

    Abstract: This paper addresses the challenge of estimating the orientation, position, and velocity of a vehicle operating in three-dimensional (3D) space with six degrees of freedom (6-DoF). A Deep Learning-based Adaptation Mechanism (DLAM) is proposed to adaptively tune the noise covariance matrices of Kalman-type filters for the Visual-Inertial Navigation (VIN) problem, leveraging IMU-Vision-Net. Subseque…

    Submitted 12 March, 2025; v1 submitted 1 February, 2025; originally announced February 2025.

  12. Quaternion-based Unscented Kalman Filter for 6-DoF Vision-based Inertial Navigation in GPS-denied Regions

    Authors: Khashayar Ghanizadegan, Hashim A. Hashim

    Abstract: This paper investigates the orientation, position, and linear velocity estimation problem of a rigid body moving in three-dimensional (3D) space with six degrees of freedom (6 DoF). The highly nonlinear navigation kinematics are formulated to ensure global representation of the navigation problem. A computationally efficient Quaternion-based Navigation Unscented Kalman Filter (QNUKF) is proposed o…

    Submitted 3 December, 2024; originally announced December 2024.

    Comments: IEEE Transactions on Instrumentation and Measurement

  13. arXiv:2411.19275  [pdf, other]

    cs.SE

    VeCoGen: Automating Generation of Formally Verified C Code with Large Language Models

    Authors: Merlijn Sevenhuijsen, Khashayar Etemadi, Mattias Nyberg

    Abstract: Large language models have demonstrated impressive capabilities in generating code, yet they often produce programs with flaws or deviations from intended behavior, limiting their suitability for safety-critical applications. To address this limitation, this paper introduces VECOGEN, a novel tool that combines large language models with formal verification to automate the generation of formally ve…

    Submitted 7 April, 2025; v1 submitted 28 November, 2024; originally announced November 2024.

  14. arXiv:2411.00287  [pdf, other]

    cs.LG cs.AI cs.CE math.NA stat.ML

    MBExplainer: Multilevel bandit-based explanations for downstream models with augmented graph embeddings

    Authors: Ashkan Golgoon, Ryan Franks, Khashayar Filom, Arjun Ravi Kannan

    Abstract: In many industrial applications, it is common that the graph embeddings generated from training GNNs are used in an ensemble model where the embeddings are combined with other tabular features (e.g., original node or edge features) in a downstream ML task. The tabular features may even arise naturally if, e.g., one tries to build a graph such that some of the node or edge features are stored in a…

    Submitted 31 October, 2024; originally announced November 2024.

    MSC Class: 68T01; ACM Class: I.2

  15. arXiv:2410.21698  [pdf, other]

    cs.LG math.ST stat.ML

    On the Role of Depth and Looping for In-Context Learning with Task Diversity

    Authors: Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi, Stefanie Jegelka, Sanjiv Kumar

    Abstract: The intriguing in-context learning (ICL) abilities of deep Transformer models have lately garnered significant attention. By studying in-context linear regression on unimodal Gaussian data, recent empirical and theoretical works have argued that ICL emerges from Transformers' abilities to simulate learning algorithms like gradient descent. However, these works fail to capture the remarkable abilit…

    Submitted 28 October, 2024; originally announced October 2024.

  16. arXiv:2410.17336  [pdf, other]

    cs.LG cs.DS cs.GT math.ST stat.ML

    Computing Optimal Regularizers for Online Linear Optimization

    Authors: Khashayar Gatmiry, Jon Schneider, Stefanie Jegelka

    Abstract: Follow-the-Regularized-Leader (FTRL) algorithms are a popular class of learning algorithms for online linear optimization (OLO) that guarantee sub-linear regret, but the choice of regularizer can significantly impact dimension-dependent factors in the regret bound. We present an algorithm that takes as input convex and symmetric action sets and loss sets for a specific OLO instance, and outputs a…

    Submitted 22 October, 2024; originally announced October 2024.

  17. arXiv:2410.16401  [pdf, other]

    cs.LG math.ST stat.ML

    Simplicity Bias via Global Convergence of Sharpness Minimization

    Authors: Khashayar Gatmiry, Zhiyuan Li, Sashank J. Reddi, Stefanie Jegelka

    Abstract: The remarkable generalization ability of neural networks is usually attributed to the implicit bias of SGD, which often yields models with lower complexity using simpler (e.g. linear) and low-rank features. Recent works have provided empirical and theoretical evidence for the bias of particular variants of SGD (such as label noise SGD) toward flatter regions of the loss landscape. Despite the folk…

    Submitted 21 October, 2024; originally announced October 2024.

  18. AGN feeding along a one-armed spiral in NGC 4593: A study using ALMA CO(2-1) observations

    Authors: K. Kianfar, P. Andreani, J. A. Fernández-Ontiveros, F. Combes, L. Spinoglio, E. Hatziminaoglou, C. Ricci, A. Bewketu-Belete, M. Imanishi, M. Pereira-Santaella, R. Slater, M. Malheiro

    Abstract: We investigate active galactic nuclei (AGN) feeding through the molecular gas (CO(2-1) emission) properties of the local Seyfert 1 galaxy NGC 4593, using Atacama Large Millimeter Array (ALMA) observations and other multi-wavelength data. Our study aims to understand the interplay between the AGN and the interstellar medium (ISM) in this galaxy, examining the role of the AGN in steering gas dynamic…

    Submitted 13 October, 2024; originally announced October 2024.

    Comments: 15 pages, 10 figures

  19. arXiv:2410.08292  [pdf, other]

    cs.LG cs.AI stat.ML

    Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?

    Authors: Khashayar Gatmiry, Nikunj Saunshi, Sashank J. Reddi, Stefanie Jegelka, Sanjiv Kumar

    Abstract: The remarkable capability of Transformers to do reasoning and few-shot learning, without any fine-tuning, is widely conjectured to stem from their ability to implicitly simulate multi-step algorithms -- such as gradient descent -- with their weights in a single forward pass. Recently, there has been progress in understanding this complex phenomenon from an expressivity point of view, by demonstr…

    Submitted 10 October, 2024; originally announced October 2024.

  20. arXiv:2409.13074  [pdf, other]

    cs.LG cs.CV stat.ML

    What does guidance do? A fine-grained analysis in a simple setting

    Authors: Muthu Chidambaram, Khashayar Gatmiry, Sitan Chen, Holden Lee, Jianfeng Lu

    Abstract: The use of guidance in diffusion models was originally motivated by the premise that the guidance-modified score is that of the data distribution tilted by a conditional likelihood raised to some power. In this work we clarify this misconception by rigorously proving that guidance fails to sample from the intended tilted distribution. Our main result is to give a fine-grained characterization of…

    Submitted 19 September, 2024; originally announced September 2024.

  21. arXiv:2407.11215  [pdf, other]

    cs.LG cs.AI cs.CE cs.CL math.NA

    Mechanistic interpretability of large language models with applications to the financial services industry

    Authors: Ashkan Golgoon, Khashayar Filom, Arjun Ravi Kannan

    Abstract: Large Language Models such as GPTs (Generative Pre-trained Transformers) exhibit remarkable capabilities across a broad spectrum of applications. Nevertheless, due to their intrinsic complexity, these models present substantial challenges in interpreting their internal decision-making processes. This lack of transparency poses critical challenges when it comes to their adaptation by financial inst…

    Submitted 15 October, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    MSC Class: 68T01; ACM Class: I.2.7

    Journal ref: 5th ACM International Conference on AI in Finance (ICAIF 2024)

  22. arXiv:2407.00571  [pdf, ps, other]

    cs.LG

    Adversarial Online Learning with Temporal Feedback Graphs

    Authors: Khashayar Gatmiry, Jon Schneider

    Abstract: We study a variant of prediction with expert advice where the learner's action at round $t$ is only allowed to depend on losses on a specific subset of the rounds (where the structure of which rounds' losses are visible at time $t$ is provided by a directed "feedback graph" known to the learner). We present a novel learning algorithm for this setting based on a strategy of partitioning the losses…

    Submitted 29 June, 2024; originally announced July 2024.

  23. arXiv:2406.10375  [pdf, other]

    cs.SE

    Mokav: Execution-driven Differential Testing with LLMs

    Authors: Khashayar Etemadi, Bardia Mohammadi, Zhendong Su, Martin Monperrus

    Abstract: It is essential to detect functional differences in various software engineering tasks, such as automated program repair, mutation testing, and code refactoring. The problem of detecting functional differences between two programs can be reduced to searching for a difference exposing test (DET): a test input that results in different outputs on the subject programs. In this paper, we propose Mokav…

    Submitted 14 June, 2024; originally announced June 2024.

  24. arXiv:2406.08878  [pdf, other]

    cs.LG

    CIMRL: Combining IMitation and Reinforcement Learning for Safe Autonomous Driving

    Authors: Jonathan Booher, Khashayar Rohanimanesh, Junhong Xu, Vladislav Isenbaev, Ashwin Balakrishna, Ishan Gupta, Wei Liu, Aleksandr Petiushko

    Abstract: Modern approaches to autonomous driving rely heavily on learned components trained with large amounts of human driving data via imitation learning. However, these methods require large amounts of expensive data collection and even then face challenges with safely handling long-tail scenarios and compounding errors over time. At the same time, pure Reinforcement Learning (RL) methods can fail to le…

    Submitted 11 November, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

  25. arXiv:2405.06760  [pdf]

    cs.CL cs.AI

    Opportunities for Persian Digital Humanities Research with Artificial Intelligence Language Models; Case Study: Forough Farrokhzad

    Authors: Arash Rasti Meymandi, Zahra Hosseini, Sina Davari, Abolfazl Moshiri, Shabnam Rahimi-Golkhandan, Khashayar Namdar, Nikta Feizi, Mohamad Tavakoli-Targhi, Farzad Khalvati

    Abstract: This study explores the integration of advanced Natural Language Processing (NLP) and Artificial Intelligence (AI) techniques to analyze and interpret Persian literature, focusing on the poetry of Forough Farrokhzad. Utilizing computational methods, we aim to unveil thematic, stylistic, and linguistic patterns in Persian poetry. Specifically, the study employs AI models including transformer-based…

    Submitted 10 May, 2024; originally announced May 2024.

  26. arXiv:2404.18869  [pdf, ps, other]

    cs.LG cs.DS math.PR math.ST stat.ML

    Learning Mixtures of Gaussians Using Diffusion Models

    Authors: Khashayar Gatmiry, Jonathan Kelner, Holden Lee

    Abstract: We give a new algorithm for learning mixtures of $k$ Gaussians (with identity covariance in $\mathbb{R}^n$) to TV error $\varepsilon$, with quasi-polynomial ($O(n^{\text{poly\,log}\left(\frac{n+k}{\varepsilon}\right)})$) time and sample complexity, under a minimum weight assumption. Our results extend to continuous mixtures of Gaussians where the mixing distribution is supported on a union of $k$…

    Submitted 4 March, 2025; v1 submitted 29 April, 2024; originally announced April 2024.

  27. arXiv:2402.15650  [pdf, ps, other]

    cs.LG cs.AI

    Uniformly Safe RL with Objective Suppression for Multi-Constraint Safety-Critical Applications

    Authors: Zihan Zhou, Jonathan Booher, Khashayar Rohanimanesh, Wei Liu, Aleksandr Petiushko, Animesh Garg

    Abstract: Safe reinforcement learning tasks are a challenging domain despite being very common in the real world. The widely adopted CMDP model constrains the risks in expectation, which makes room for dangerous behaviors in long-tail states. In safety-critical domains, such behaviors could lead to disastrous outcomes. To address this issue, we first describe the problem with a stronger Uniformly Constraine…

    Submitted 28 August, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  28. arXiv:2402.06598  [pdf, other]

    cs.SE cs.LG

    CigaR: Cost-efficient Program Repair with LLMs

    Authors: Dávid Hidvégi, Khashayar Etemadi, Sofia Bobadilla, Martin Monperrus

    Abstract: Large language models (LLMs) have proven to be effective at automated program repair (APR). However, using LLMs can be costly, with companies invoicing users by the number of tokens. In this paper, we propose CigaR, the first LLM-based APR tool that focuses on minimizing the repair cost. CigaR works in two major steps: generating a first plausible patch and multiplying plausible patches. CigaR opti…

    Submitted 18 April, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

  29. arXiv:2402.03547  [pdf]

    eess.IV cs.CV q-bio.QM

    Improving Pediatric Low-Grade Neuroepithelial Tumors Molecular Subtype Identification Using a Novel AUROC Loss Function for Convolutional Neural Networks

    Authors: Khashayar Namdar, Matthias W. Wagner, Cynthia Hawkins, Uri Tabori, Birgit B. Ertl-Wagner, Farzad Khalvati

    Abstract: Pediatric Low-Grade Neuroepithelial Tumors (PLGNT) are the most common pediatric cancer type, accounting for 40% of brain tumors in children, and identifying PLGNT molecular subtype is crucial for treatment planning. However, the gold standard to determine the PLGNT subtype is biopsy, which can be impractical or dangerous for patients. This research improves the performance of Convolutional Neural…

    Submitted 5 February, 2024; originally announced February 2024.

  30. arXiv:2401.17626  [pdf]

    cs.SE cs.AI cs.LG

    Generative AI to Generate Test Data Generators

    Authors: Benoit Baudry, Khashayar Etemadi, Sen Fang, Yogya Gamage, Yi Liu, Yuxin Liu, Martin Monperrus, Javier Ron, André Silva, Deepika Tiwari

    Abstract: Generating fake data is an essential dimension of modern software testing, as demonstrated by the number and significance of data faking libraries. Yet, developers of faking libraries cannot keep up with the wide range of data to be generated for different natural languages and domains. In this paper, we assess the ability of generative AI for generating test data in different domains. We design t…

    Submitted 14 June, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

    Journal ref: IEEE Software, 2024

  31. arXiv:2310.19086  [pdf, other]

    physics.flu-dyn

    Scalar mixing and entrainment in an axisymmetric jet subjected to external turbulence

    Authors: Khashayar F. Kohan, Susan J. Gaskin

    Abstract: The present study aims to understand the process of turbulent entrainment into a jet, as affected by background turbulence, using scalar statistics. Planar-laser-induced fluorescence was employed to capture the orthogonal cross sections of the jet at a fixed downstream station with varying background turbulence intensities and length scales. The conditional scalar profiles revealed that the thickn…

    Submitted 16 September, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

    Comments: 19 pages, 7 figures

    Journal ref: Physics of Fluids, Volume 36 (10), 2024

  32. arXiv:2308.11518  [pdf, ps, other]

    cs.LG stat.ML

    EM for Mixture of Linear Regression with Clustered Data

    Authors: Amirhossein Reisizadeh, Khashayar Gatmiry, Asuman Ozdaglar

    Abstract: Modern data-driven and distributed learning frameworks deal with diverse massive data generated by clients spread across heterogeneous environments. Indeed, data heterogeneity is a major bottleneck in scaling up many distributed learning paradigms. In many settings however, heterogeneous data may be generated in clusters with shared structures, as is the case in several applications such as federa…

    Submitted 22 August, 2023; originally announced August 2023.

  33. arXiv:2307.11655  [pdf, other]

    cs.LG cs.AI cs.GT

    Preferences Evolve And So Should Your Bandits: Bandits with Evolving States for Online Platforms

    Authors: Khashayar Khosravi, Renato Paes Leme, Chara Podimata, Apostolis Tsorvantzis

    Abstract: We propose a model for learning with bandit feedback while accounting for deterministically evolving and unobservable states that we call Bandits with Deterministically Evolving States ($B$-$DES$). The workhorse applications of our model are learning for recommendation systems and learning for online ads. In both cases, the reward that the algorithm obtains at each round is a function of the short…

    Submitted 28 January, 2025; v1 submitted 21 July, 2023; originally announced July 2023.

  34. arXiv:2306.13853  [pdf, other]

    cs.LG

    A Unified Approach to Controlling Implicit Regularization via Mirror Descent

    Authors: Haoyuan Sun, Khashayar Gatmiry, Kwangjun Ahn, Navid Azizan

    Abstract: Inspired by the remarkable success of large neural networks, there has been significant interest in understanding the generalization performance of over-parameterized models. Substantial efforts have been invested in characterizing how optimization algorithms impact generalization through their "preferred" solutions, a phenomenon commonly referred to as implicit regularization. In particular, it h…

    Submitted 11 January, 2024; v1 submitted 23 June, 2023; originally announced June 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2205.12808

  35. arXiv:2306.13239  [pdf, other]

    cs.LG

    The Inductive Bias of Flatness Regularization for Deep Matrix Factorization

    Authors: Khashayar Gatmiry, Zhiyuan Li, Ching-Yao Chuang, Sashank Reddi, Tengyu Ma, Stefanie Jegelka

    Abstract: Recent works on over-parameterized neural networks have shown that the stochasticity in optimizers has the implicit regularization effect of minimizing the sharpness of the loss function (in particular, the trace of its Hessian) over the family of zero-loss solutions. More explicit forms of flatness regularization also empirically improve the generalization performance. However, it remains unclear wh…

    Submitted 22 June, 2023; originally announced June 2023.

  36. arXiv:2306.11121  [pdf, ps, other]

    math.OC cs.LG

    Projection-Free Online Convex Optimization via Efficient Newton Iterations

    Authors: Khashayar Gatmiry, Zakaria Mhammedi

    Abstract: This paper presents new projection-free algorithms for Online Convex Optimization (OCO) over a convex domain $\mathcal{K} \subset \mathbb{R}^d$. Classical OCO algorithms (such as Online Gradient Descent) typically need to perform Euclidean projections onto the convex set $\mathcal{K}$ to ensure feasibility of their iterates. Alternative algorithms, such as those based on the Frank-Wolfe method, swap poten…

    Submitted 19 June, 2023; originally announced June 2023.

  37. arXiv:2306.07698  [pdf, other]

    quant-ph cs.CR

    Public-Key Encryption with Quantum Keys

    Authors: Khashayar Barooti, Alex B. Grilo, Loïs Huguenin-Dumittan, Giulio Malavolta, Or Sattath, Quoc-Huy Vu, Michael Walter

    Abstract: In the framework of Impagliazzo's five worlds, a distinction is often made between two worlds, one where public-key encryption exists (Cryptomania), and one in which only one-way functions exist (MiniCrypt). However, the boundaries between these worlds can change when quantum information is taken into account. Recent work has shown that quantum variants of oblivious transfer and multi-party comput…

    Submitted 20 June, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: This submission subsumes arXiv:2303.01143 and arXiv:2303.05368

  38. arXiv:2304.04724  [pdf, ps, other]

    stat.CO cs.CC stat.ML

    When does Metropolized Hamiltonian Monte Carlo provably outperform Metropolis-adjusted Langevin algorithm?

    Authors: Yuansi Chen, Khashayar Gatmiry

    Abstract: We analyze the mixing time of Metropolized Hamiltonian Monte Carlo (HMC) with the leapfrog integrator to sample from a distribution on $\mathbb{R}^d$ whose log-density is smooth, has Lipschitz Hessian in Frobenius norm and satisfies isoperimetry. We bound the gradient complexity to reach $ε$ error in total variation distance from a warm start by $\tilde O(d^{1/4}\text{polylog}(1/ε))$ and demonstra…

    Submitted 8 June, 2023; v1 submitted 10 April, 2023; originally announced April 2023.

    Comments: 43 pages

  39. arXiv:2304.04095  [pdf, ps, other]

    stat.ML cs.CC cs.LG stat.CO

    A Simple Proof of the Mixing of Metropolis-Adjusted Langevin Algorithm under Smoothness and Isoperimetry

    Authors: Yuansi Chen, Khashayar Gatmiry

    Abstract: We study the mixing time of Metropolis-Adjusted Langevin algorithm (MALA) for sampling a target density on $\mathbb{R}^d$. We assume that the target density satisfies $ψ_μ$-isoperimetry and that the operator norm and trace of its Hessian are bounded by $L$ and $Υ$ respectively. Our main result establishes that, from a warm start, to achieve $ε$-total variation distance to the target density, MALA…

    Submitted 8 June, 2023; v1 submitted 8 April, 2023; originally announced April 2023.

    Comments: 17 pages

  40. arXiv:2303.10216  [pdf, other]

    cs.LG math.PR

    Approximation of group explainers with coalition structure using Monte Carlo sampling on the product space of coalitions and features

    Authors: Konstandinos Kotsiopoulos, Alexey Miroshnikov, Khashayar Filom, Arjun Ravi Kannan

    Abstract: In recent years, many Machine Learning (ML) explanation techniques have been designed using ideas from cooperative game theory. These game-theoretic explainers suffer from high complexity, hindering their exact computation in practical settings. In our work, we focus on a wide class of linear game values, as well as coalitional values, for the marginal game based on a given ML model and predictor…

    Submitted 18 April, 2024; v1 submitted 17 March, 2023; originally announced March 2023.

    Comments: 31 pages, 6 figures

  41. arXiv:2303.02622  [pdf, other]

    cs.CR cs.NI

    A Multi-Agent Adaptive Deep Learning Framework for Online Intrusion Detection

    Authors: Mahdi Soltani, Khashayar Khajavi, Mahdi Jafari Siavoshani, Amir Hossein Jahangir

    Abstract: Network security analyzers use intrusion detection systems (IDSes) to distinguish malicious traffic from benign traffic. Deep learning-based IDSes have been proposed to auto-extract high-level features and eliminate the time-consuming and costly signature extraction process. However, this new generation of IDSes still suffers from a number of challenges. One of the main issues of an IDS is facing t…

    Submitted 5 March, 2023; originally announced March 2023.

  42. arXiv:2303.02080  [pdf, other]

    quant-ph

    Nonlocality under Computational Assumptions

    Authors: Khashayar Barooti, Alexandru Gheorghiu, Grzegorz Głuch, Marc-Olivier Renou

    Abstract: Nonlocality and its connections to entanglement are fundamental features of quantum mechanics that have found numerous applications in quantum information science. A set of correlations is said to be nonlocal if it cannot be reproduced by spacelike-separated parties sharing randomness and performing local operations. An important practical consideration is that the runtime of the parties has to be…

    Submitted 28 November, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: 65 pages

  43. arXiv:2303.01143  [pdf, ps, other]

    quant-ph cs.CR

    A Simple Construction of Quantum Public-Key Encryption from Quantum-Secure One-Way Functions

    Authors: Khashayar Barooti, Giulio Malavolta, Michael Walter

    Abstract: Quantum public-key encryption [Gottesman; Kawachi et al., Eurocrypt'05] generalizes public-key encryption (PKE) by allowing the public keys to be quantum states. Prior work indicated that quantum PKE can be constructed from assumptions that are potentially weaker than those needed to realize its classical counterpart. In this work, we show that quantum PKE can be constructed from any quantum-secur…

    Submitted 2 March, 2023; originally announced March 2023.

  44. arXiv:2303.00480  [pdf, other]

    cs.DS cs.LO math.FA math.NA stat.ML

    Sampling with Barriers: Faster Mixing via Lewis Weights

    Authors: Khashayar Gatmiry, Jonathan Kelner, Santosh S. Vempala

    Abstract: We analyze Riemannian Hamiltonian Monte Carlo (RHMC) for sampling a polytope defined by $m$ inequalities in $\mathbb{R}^n$ endowed with the metric defined by the Hessian of a convex barrier function. The advantage of RHMC over Euclidean methods such as the ball walk, hit-and-run and the Dikin walk is in its ability to take longer steps. However, in all previous work, the mixing rate has a linear dependenc…

    Submitted 19 April, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

  45. On marginal feature attributions of tree-based models

    Authors: Khashayar Filom, Alexey Miroshnikov, Konstandinos Kotsiopoulos, Arjun Ravi Kannan

    Abstract: Due to their power and ease of use, tree-based machine learning models, such as random forests and gradient-boosted tree ensembles, have become very popular. To interpret them, local feature attributions based on marginal expectations, e.g. marginal (interventional) Shapley, Owen or Banzhaf values, may be employed. Such methods are true to the model and implementation invariant, i.e. dependent onl…

    Submitted 5 May, 2024; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: Minor corrections. 30 pages+appendix (64 pages in total), 10 figures. To appear in Foundations of Data Science

    MSC Class: Primary: 68T01; 91A12; 91A80; 05A19; Secondary: 91A68; 91A06; 05C05

  46. arXiv:2302.07940  [pdf, other]

    cs.RO cs.AI cs.LG

    Online Tool Selection with Learned Grasp Prediction Models

    Authors: Khashayar Rohanimanesh, Jake Metzger, William Richards, Aviv Tamar

    Abstract: Deep learning-based grasp prediction models have become an industry standard for robotic bin-picking systems. To maximize pick success, production environments are often equipped with several end-effector tools that can be swapped on-the-fly, based on the target object. Tool-change, however, takes time. Choosing the order of grasps to perform, and corresponding tool-change actions, can improve sys…

    Submitted 15 February, 2023; originally announced February 2023.

    Comments: 14 pages (including the cover page), 5 Figures, Technical Report, OSARO Inc

  47. arXiv:2212.13669  [pdf, ps, other]

    cs.LG math.OC

    Near-Optimal Algorithms for Group Distributionally Robust Optimization and Beyond

    Authors: Tasuku Soma, Khashayar Gatmiry, Sharut Gupta, Stefanie Jegelka

    Abstract: Distributionally robust optimization (DRO) can improve the robustness and fairness of learning methods. In this paper, we devise stochastic algorithms for a class of DRO problems including group DRO, subpopulation fairness, and empirical conditional value at risk (CVaR) optimization. Our new algorithms achieve faster convergence rates than existing algorithms for multiple DRO settings. We also pro…

    Submitted 31 January, 2025; v1 submitted 27 December, 2022; originally announced December 2022.

    Comments: 4 tables, 2 figures

  48. Augmenting Diffs With Runtime Information

    Authors: Khashayar Etemadi, Aman Sharma, Fernanda Madeiral, Martin Monperrus

    Abstract: Source code diffs are used on a daily basis as part of code review, inspection, and auditing. To facilitate understanding, they are typically accompanied by explanations that describe the essence of what is changed in the program. As manually crafting high-quality explanations is a cumbersome task, researchers have proposed automatic techniques to generate code diff explanations. Existing explanat…

    Submitted 30 June, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Journal ref: IEEE Transactions on Software Engineering, 2023

  49. arXiv:2211.14396  [pdf]

    cs.CV cs.LG q-bio.QM

    Non-invasive Liver Fibrosis Screening on CT Images using Radiomics

    Authors: Jay J. Yoo, Khashayar Namdar, Sean Carey, Sandra E. Fischer, Chris McIntosh, Farzad Khalvati, Patrik Rogalla

    Abstract: Objectives: To develop and evaluate a radiomics machine learning model for detecting liver fibrosis on CT of the liver. Methods: For this retrospective, single-centre study, radiomic features were extracted from Regions of Interest (ROIs) on CT images of patients who underwent simultaneous liver biopsy and CT examinations. Combinations of contrast, normalization, machine learning model, and feat…

    Submitted 26 February, 2024; v1 submitted 25 November, 2022; originally announced November 2022.

  50. arXiv:2211.14122  [pdf]

    eess.IV cs.CV cs.LG

    Automating Cobb Angle Measurement for Adolescent Idiopathic Scoliosis using Instance Segmentation

    Authors: Chaojun Chen, Khashayar Namdar, Yujie Wu, Shahob Hosseinpour, Manohar Shroff, Andrea S. Doria, Farzad Khalvati

    Abstract: Scoliosis is a three-dimensional deformity of the spine, most often diagnosed in childhood. It affects 2-3% of the population, which is approximately seven million people in North America. Currently, the reference standard for assessing scoliosis is based on the manual assignment of Cobb angles at the site of the curvature center. This manual process is time consuming and unreliable as it is affec…

    Submitted 25 November, 2022; originally announced November 2022.