Showing 1–32 of 32 results for author: Ramamurthy, K N

  1. arXiv:2412.03906  [pdf, other]

    cs.LG stat.ML

    Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods

    Authors: Dennis Wei, Inkit Padhi, Soumya Ghosh, Amit Dhurandhar, Karthikeyan Natesan Ramamurthy, Maria Chang

    Abstract: Training data attribution (TDA) is the task of attributing model behavior to elements in the training data. This paper draws attention to the common setting where one has access only to the final trained model, and not the training algorithm or intermediate information from training. To serve as a gold standard for TDA in this "final-model-only" setting, we propose further training, with appropria…

    Submitted 5 December, 2024; originally announced December 2024.

    Comments: 28 pages, 8 figures

  2. arXiv:2402.08871  [pdf, other]

    cs.LG stat.ML

    Position: Topological Deep Learning is the New Frontier for Relational Learning

    Authors: Theodore Papamarkou, Tolga Birdal, Michael Bronstein, Gunnar Carlsson, Justin Curry, Yue Gao, Mustafa Hajij, Roland Kwitt, Pietro Liò, Paolo Di Lorenzo, Vasileios Maroulas, Nina Miolane, Farzana Nasrin, Karthikeyan Natesan Ramamurthy, Bastian Rieck, Simone Scardapane, Michael T. Schaub, Petar Veličković, Bei Wang, Yusu Wang, Guo-Wei Wei, Ghada Zamzmi

    Abstract: Topological deep learning (TDL) is a rapidly evolving field that uses topological features to understand and design deep learning models. This paper posits that TDL is the new frontier for relational learning. TDL may complement graph representation learning and geometric deep learning by incorporating topological concepts, and can thus provide a natural choice for various machine learning setting…

    Submitted 6 August, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  3. arXiv:2402.02441  [pdf, other]

    cs.LG cs.AI cs.MS stat.CO

    TopoX: A Suite of Python Packages for Machine Learning on Topological Domains

    Authors: Mustafa Hajij, Mathilde Papillon, Florian Frantzen, Jens Agerberg, Ibrahem AlJabea, Rubén Ballester, Claudio Battiloro, Guillermo Bernárdez, Tolga Birdal, Aiden Brent, Peter Chin, Sergio Escalera, Simone Fiorellino, Odin Hoff Gardaa, Gurusankar Gopalakrishnan, Devendra Govil, Josef Hoppe, Maneel Reddy Karri, Jude Khouja, Manuel Lecha, Neal Livesay, Jan Meißner, Soham Mukherjee, Alexander Nikitin, Theodore Papamarkou, et al. (18 additional authors not shown)

    Abstract: We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order…

    Submitted 8 December, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

  4. arXiv:2312.11862  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Topo-MLP : A Simplicial Network Without Message Passing

    Authors: Karthikeyan Natesan Ramamurthy, Aldo Guzmán-Sáenz, Mustafa Hajij

    Abstract: Due to their ability to model meaningful higher order relations among a set of entities, higher order network models have emerged recently as a powerful alternative for graph-based network models which are only capable of modeling binary relationships. Message passing paradigm is still dominantly used to learn representations even for higher order network models. While powerful, message passing ca…

    Submitted 19 December, 2023; originally announced December 2023.

  5. arXiv:2301.12616  [pdf, other]

    cs.LG stat.ME

    Active Sequential Two-Sample Testing

    Authors: Weizhi Li, Prad Kadambi, Pouria Saidi, Karthikeyan Natesan Ramamurthy, Gautam Dasarathy, Visar Berisha

    Abstract: A two-sample hypothesis test is a statistical procedure used to determine whether the distributions generating two samples are identical. We consider the two-sample testing problem in a new scenario where the sample measurements (or sample features) are inexpensive to access, but their group memberships (or labels) are costly. To address the problem, we devise the first active sequential two…

    Submitted 27 June, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

  6. arXiv:2206.00606  [pdf, other]

    cs.LG cs.CV cs.SI math.AT stat.ML

    Topological Deep Learning: Going Beyond Graph Data

    Authors: Mustafa Hajij, Ghada Zamzmi, Theodore Papamarkou, Nina Miolane, Aldo Guzmán-Sáenz, Karthikeyan Natesan Ramamurthy, Tolga Birdal, Tamal K. Dey, Soham Mukherjee, Shreyas N. Samaga, Neal Livesay, Robin Walters, Paul Rosen, Michael T. Schaub

    Abstract: Topological deep learning is a rapidly growing field that pertains to the development of deep learning models for data supported on topological domains such as simplicial complexes, cell complexes, and hypergraphs, which generalize many domains encountered in scientific computations. In this paper, we present a unifying deep learning framework built upon a richer data structure that includes widel…

    Submitted 19 May, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

  7. arXiv:2111.08861  [pdf, other]

    cs.LG stat.ML

    A label-efficient two-sample test

    Authors: Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

    Abstract: Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are easily measured whereas sample labels are unknown and costly to obtain. Accordingly, we devise a three-stage framework in service of performing an effective two…

    Submitted 19 July, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: Accepted to the 38th conference on Uncertainty in Artificial Intelligence (UAI2022)

  8. arXiv:2110.02491  [pdf, ps, other]

    cs.LG cs.NE math.CT stat.ML

    Data-Centric AI Requires Rethinking Data Notion

    Authors: Mustafa Hajij, Ghada Zamzmi, Karthikeyan Natesan Ramamurthy, Aldo Guzman Saenz

    Abstract: The transition towards data-centric AI requires revisiting data notions from mathematical and implementational standpoints to obtain unified data-centric machine learning packages. Towards this end, this work proposes unifying principles offered by categorical and cochain notions of data, and discusses the importance of these principles in data-centric AI transition. In the categorical notion, dat…

    Submitted 2 December, 2021; v1 submitted 6 October, 2021; originally announced October 2021.

    Journal ref: Conference: 35th Conference on Neural Information Processing Systems (NeurIPS 2021) At: NEURIPS DATA-CENTRIC AI WORKSHOP

  9. arXiv:2005.00060  [pdf, other]

    cs.LG cs.CV stat.ML

    Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

    Authors: Pu Zhao, Pin-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, Xue Lin

    Abstract: Mode connectivity provides novel geometric insights on analyzing loss landscapes and enables building high-accuracy pathways between well-trained neural networks. In this work, we propose to employ mode connectivity in loss landscapes to study the adversarial robustness of deep neural networks, and provide novel methods for improving this robustness. Our experiments cover various types of adversar…

    Submitted 2 July, 2020; v1 submitted 30 April, 2020; originally announced May 2020.

    Comments: accepted by ICLR 2020

  10. arXiv:2003.06005  [pdf, other]

    cs.LG cs.AI stat.ML

    Model Agnostic Multilevel Explanations

    Authors: Karthikeyan Natesan Ramamurthy, Bhanukiran Vinzamuri, Yunfeng Zhang, Amit Dhurandhar

    Abstract: In recent years, post-hoc local instance-level and global dataset-level explainability of black-box models has received a lot of attention. Much less attention has been given to obtaining insights at intermediate or group levels, which is a need outlined in recent works that study the challenges in realizing the guidelines in the General Data Protection Regulation (GDPR). In this paper, we propose…

    Submitted 12 March, 2020; originally announced March 2020.

    Comments: 21 pages, 9 figures, 1 table

    Journal ref: NeurIPS 2020

  11. arXiv:1911.07819  [pdf, other]

    cs.CL cs.LG stat.ML

    Drug Repurposing for Cancer: An NLP Approach to Identify Low-Cost Therapies

    Authors: Shivashankar Subramanian, Ioana Baldini, Sushma Ravichandran, Dmitriy A. Katz-Rogozhnikov, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Kush R. Varshney, Annmarie Wang, Pradeep Mangalath, Laura B. Kleiman

    Abstract: More than 200 generic drugs approved by the U.S. Food and Drug Administration for non-cancer indications have shown promise for treating cancer. Due to their long history of safe patient use, low cost, and widespread availability, repurposing of generic drugs represents a major opportunity to rapidly improve outcomes for cancer patients and reduce healthcare costs worldwide. Evidence on the effica…

    Submitted 5 December, 2019; v1 submitted 18 November, 2019; originally announced November 2019.

  12. arXiv:1911.01509  [pdf, ps, other]

    cs.LG cs.CY stat.ML

    Understanding racial bias in health using the Medical Expenditure Panel Survey data

    Authors: Moninder Singh, Karthikeyan Natesan Ramamurthy

    Abstract: Over the years, several studies have demonstrated that there exist significant disparities in health indicators in the United States population across various groups. Healthcare expense is used as a proxy for health in algorithms that drive healthcare systems and this exacerbates the existing bias. In this work, we focus on the presence of racial bias in health indicators in the publicly available…

    Submitted 4 November, 2019; originally announced November 2019.

    Comments: 8 pages, 8 tables

  13. arXiv:1906.02299  [pdf, other]

    cs.LG cs.AI stat.ML

    Teaching AI to Explain its Decisions Using Embeddings and Multi-Task Learning

    Authors: Noel C. F. Codella, Michael Hind, Karthikeyan Natesan Ramamurthy, Murray Campbell, Amit Dhurandhar, Kush R. Varshney, Dennis Wei, Aleksandra Mojsilović

    Abstract: Using machine learning in high-stakes applications often requires predictions to be accompanied by explanations comprehensible to the domain user, who has ultimate responsibility for decisions and outcomes. Recently, a new framework for providing explanations, called TED, has been proposed to provide meaningful explanations for predictions. This framework augments training data to include explanat…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: presented at 2019 ICML Workshop on Human in the Loop Learning (HILL 2019), Long Beach, USA. arXiv admin note: substantial text overlap with arXiv:1805.11648

  14. arXiv:1906.00066  [pdf, other]

    cs.LG cs.IT stat.ML

    Optimized Score Transformation for Consistent Fair Classification

    Authors: Dennis Wei, Karthikeyan Natesan Ramamurthy, Flavio du Pin Calmon

    Abstract: This paper considers fair probabilistic binary classification where the outputs of primary interest are predicted probabilities, commonly referred to as scores. We formulate the problem of transforming scores to satisfy fairness constraints that are linear in conditional means of scores while minimizing a cross-entropy objective. The formulation can be applied directly to post-process classifier o…

    Submitted 29 October, 2021; v1 submitted 31 May, 2019; originally announced June 2019.

    Comments: 78 pages, 16 figures. Published in Journal of Machine Learning Research. Earlier version published at the 2020 International Conference on Artificial Intelligence and Statistics (AISTATS)

  15. arXiv:1812.06135  [pdf, other]

    cs.LG cs.CY stat.ML

    Bias Mitigation Post-processing for Individual and Group Fairness

    Authors: Pranay K. Lohia, Karthikeyan Natesan Ramamurthy, Manish Bhide, Diptikalyan Saha, Kush R. Varshney, Ruchir Puri

    Abstract: Whereas previous post-processing approaches for increasing the fairness of predictions of biased classifiers address only group fairness, we propose a method for increasing both individual and group fairness. Our novel framework includes an individual bias detector used to prioritize data samples in a bias mitigation algorithm aiming to improve the group fairness measure of disparate impact. We sh…

    Submitted 14 December, 2018; originally announced December 2018.

    Comments: 5 pages, 4 figures

  16. arXiv:1806.05362  [pdf, other]

    stat.AP

    Financial Forecasting and Analysis for Low-Wage Workers

    Authors: Wenyu Zhang, Raya Horesh, Karthikeyan N. Ramamurthy, Lingfei Wu, Jinfeng Yi, Kryn Anderson, Kush R. Varshney

    Abstract: Despite the plethora of financial services and products on the market nowadays, there is a lack of such services and products designed especially for the low-wage population. Approximately 30% of the U.S. working population engage in low-wage work, and many of them lead a paycheck-to-paycheck lifestyle. Financial planning advice needs to explicitly address their financial instability. We propose…

    Submitted 22 September, 2018; v1 submitted 14 June, 2018; originally announced June 2018.

    Comments: Presented at the Data For Good Exchange 2018

  17. arXiv:1805.09949  [pdf, other]

    stat.ML cs.LG

    Topological Data Analysis of Decision Boundaries with Application to Model Selection

    Authors: Karthikeyan Natesan Ramamurthy, Kush R. Varshney, Krishnan Mody

    Abstract: We propose the labeled Čech complex, the plain labeled Vietoris-Rips complex, and the locally scaled labeled Vietoris-Rips complex to perform persistent homology inference of decision boundaries in classification tasks. We provide theoretical conditions and analysis for recovering the homology of a decision boundary from samples. Our main objective is quantification of deep neural network complexi…

    Submitted 24 May, 2018; originally announced May 2018.

    Comments: Reproducible software available, 17 pages, 10 figures, 12 tables

  18. arXiv:1804.10961  [pdf, other]

    stat.ML cs.LG

    Simultaneous Parameter Learning and Bi-Clustering for Multi-Response Models

    Authors: Ming Yu, Karthikeyan Natesan Ramamurthy, Addie Thompson, Aurélie Lozano

    Abstract: We consider multi-response and multitask regression models, where the parameter matrix to be estimated is expected to have an unknown grouping structure. The groupings can be along tasks, or features, or both, the last one indicating a bi-cluster or "checkerboard" structure. Discovering this grouping structure along with parameter inference makes sense in several applications, such as multi-respon…

    Submitted 29 April, 2018; originally announced April 2018.

    Comments: 15 pages, 15 figures

  19. arXiv:1712.07106  [pdf, other]

    stat.ML cs.LG

    Exploring High-Dimensional Structure via Axis-Aligned Decomposition of Linear Projections

    Authors: Jayaraman J. Thiagarajan, Shusen Liu, Karthikeyan Natesan Ramamurthy, Peer-Timo Bremer

    Abstract: Two-dimensional embeddings remain the dominant approach to visualize high dimensional data. The choice of embeddings ranges from highly non-linear ones, which can capture complex relationships but are difficult to interpret quantitatively, to axis-aligned projections, which are easy to interpret but are limited to bivariate relationships. Linear projection can be considered as a compromise between co…

    Submitted 19 December, 2017; v1 submitted 19 December, 2017; originally announced December 2017.

  20. arXiv:1711.01514  [pdf, ps, other]

    stat.ML cs.LG

    Distribution-Preserving k-Anonymity

    Authors: Dennis Wei, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

    Abstract: Preserving the privacy of individuals by protecting their sensitive attributes is an important consideration during microdata release. However, it is equally important to preserve the quality or utility of the data for at least some targeted workloads. We propose a novel framework for privacy preservation based on the k-anonymity model that is ideally suited for workloads that require preserving t…

    Submitted 4 November, 2017; originally announced November 2017.

    Comments: Portions of this work were first presented at the 2015 SIAM International Conference on Data Mining

  21. arXiv:1710.01788  [pdf, other]

    stat.ML

    Multitask Learning using Task Clustering with Applications to Predictive Modeling and GWAS of Plant Varieties

    Authors: Ming Yu, Addie M. Thompson, Karthikeyan Natesan Ramamurthy, Eunho Yang, Aurélie C. Lozano

    Abstract: Inferring predictive maps between multiple input and multiple output variables or tasks has innumerable applications in data science. Multi-task learning attempts to learn the maps to several output tasks simultaneously with information sharing between them. We propose a novel multi-task learning framework for sparse linear regression, where a full task hierarchy is automatically inferred from the…

    Submitted 4 October, 2017; originally announced October 2017.

  22. arXiv:1708.00069  [pdf, other]

    stat.ML cs.CV cs.LG

    Learning Robust Representations for Computer Vision

    Authors: Peng Zheng, Aleksandr Y. Aravkin, Karthikeyan Natesan Ramamurthy, Jayaraman Jayaraman Thiagarajan

    Abstract: Unsupervised learning techniques in computer vision often require learning latent representations, such as low-dimensional linear and non-linear subspaces. Noise and outliers in the data can frustrate these approaches by obscuring the latent spaces. Our main goal is deeper understanding and new development of robust approaches for representation learning. We provide a new interpretation for exis…

    Submitted 31 July, 2017; originally announced August 2017.

    Comments: 8 pages, 7 pages

  23. arXiv:1706.01865  [pdf, other]

    stat.ML

    Estimating Shape Parameters of Piecewise Linear-Quadratic Problems

    Authors: Peng Zheng, Aleksandr Y. Aravkin, Karthikeyan Natesan Ramamurthy

    Abstract: Piecewise Linear-Quadratic (PLQ) penalties are widely used to develop models in statistical inference, signal processing, and machine learning. Common examples of PLQ penalties include least squares, Huber, Vapnik, 1-norm, and their asymmetric generalizations. Properties of these estimators depend on the choice of penalty and its shape parameters, such as degree of asymmetry for the quantile loss,…

    Submitted 31 December, 2020; v1 submitted 6 June, 2017; originally announced June 2017.

    Comments: 15 pages, 6 figures

  24. arXiv:1704.03354  [pdf, other]

    stat.ML cs.CY cs.IT

    Optimized Data Pre-Processing for Discrimination Prevention

    Authors: Flavio P. Calmon, Dennis Wei, Karthikeyan Natesan Ramamurthy, Kush R. Varshney

    Abstract: Non-discrimination is a recognized objective in algorithmic decision making. In this paper, we introduce a novel probabilistic formulation of data pre-processing for reducing discrimination. We propose a convex optimization for learning a data transformation with three goals: controlling discrimination, limiting distortion in individual data samples, and preserving utility. We characterize the imp…

    Submitted 11 April, 2017; originally announced April 2017.

  25. arXiv:1612.09007  [pdf, ps, other]

    stat.ML cs.LG

    A Deep Learning Approach To Multiple Kernel Fusion

    Authors: Huan Song, Jayaraman J. Thiagarajan, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy, Andreas Spanias

    Abstract: Kernel fusion is a popular and effective approach for combining multiple features that characterize different aspects of data. Traditional approaches for Multiple Kernel Learning (MKL) attempt to learn the parameters for combining the kernels through sophisticated optimization procedures. In this paper, we propose an alternative approach that creates dense embeddings for data using the kernel simi…

    Submitted 28 December, 2016; originally announced December 2016.

  26. arXiv:1612.04875  [pdf, other]

    stat.ML

    Robust Local Scaling using Conditional Quantiles of Graph Similarities

    Authors: Jayaraman J. Thiagarajan, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy, Bhavya Kailkhura

    Abstract: Spectral analysis of neighborhood graphs is one of the most widely used techniques for exploratory data analysis, with applications ranging from machine learning to social sciences. In such applications, it is typical to first encode relationships between the data samples using an appropriate similarity function. Popular neighborhood construction techniques such as k-nearest neighbor (k-NN) graphs…

    Submitted 14 December, 2016; originally announced December 2016.

  27. arXiv:1611.07429  [pdf, other]

    stat.ML cs.LG

    TreeView: Peeking into Deep Neural Networks Via Feature-Space Partitioning

    Authors: Jayaraman J. Thiagarajan, Bhavya Kailkhura, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy

    Abstract: With the advent of highly predictive but opaque deep learning models, it has become more important than ever to understand and explain the predictions of such models. Existing approaches define interpretability as the inverse of complexity and achieve interpretability at the cost of accuracy. This introduces a risk of producing interpretable but misleading explanations. As humans, we are prone to…

    Submitted 22 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

  28. arXiv:1511.03990  [pdf, other]

    stat.ML

    Automatic Inference of the Quantile Parameter

    Authors: Karthikeyan Natesan Ramamurthy, Aleksandr Y. Aravkin, Jayaraman J. Thiagarajan

    Abstract: Supervised learning is an active research area, with numerous applications in diverse fields such as data analytics, computer vision, speech and audio processing, and image understanding. In most cases, the loss functions used in machine learning assume symmetric noise models, and seek to estimate the unknown function parameters. However, loss functions such as quantile and quantile Huber generali…

    Submitted 12 November, 2015; originally announced November 2015.

    Comments: 8 pages

  29. arXiv:1510.04905  [pdf, other]

    stat.ML cs.LG

    Robust Partially-Compressed Least-Squares

    Authors: Stephen Becker, Ban Kawas, Marek Petrik, Karthikeyan N. Ramamurthy

    Abstract: Randomized matrix compression techniques, such as the Johnson-Lindenstrauss transform, have emerged as an effective and practical way for solving large-scale problems efficiently. With a focus on computational efficiency, however, forsaking solutions quality and accuracy becomes the trade-off. In this paper, we investigate compressed least-squares problems and propose new models and algorithms tha…

    Submitted 16 October, 2015; originally announced October 2015.

  30. arXiv:1403.6706  [pdf, other]

    stat.ML cs.CV cs.LG math.OC

    Beyond L2-Loss Functions for Learning Sparse Models

    Authors: Karthikeyan Natesan Ramamurthy, Aleksandr Y. Aravkin, Jayaraman J. Thiagarajan

    Abstract: Incorporating sparsity priors in learning tasks can give rise to simple, and interpretable models for complex high dimensional data. Sparse models have found widespread use in structure discovery, recovering data from corruptions, and a variety of large scale unsupervised and supervised learning problems. Assuming the availability of sufficient data, these methods infer dictionaries for sparse rep…

    Submitted 26 March, 2014; originally announced March 2014.

    Comments: 10 pages, 6 figures

    ACM Class: I.2.6; G.1.6

  31. arXiv:1303.4694  [pdf, other]

    math.NA cs.LG stat.ML

    Recovering Non-negative and Combined Sparse Representations

    Authors: Karthikeyan Natesan Ramamurthy, Jayaraman J. Thiagarajan, Andreas Spanias

    Abstract: The non-negative solution to an underdetermined linear system can be uniquely recovered sometimes, even without imposing any additional sparsity constraints. In this paper, we derive conditions under which a unique non-negative solution for such a system can exist, based on the theory of polytopes. Furthermore, we develop the paradigm of combined sparse representations, where only a part of the co…

    Submitted 20 September, 2013; v1 submitted 12 March, 2013; originally announced March 2013.

  32. arXiv:1303.0448  [pdf, other]

    cs.CV stat.ML

    Learning Stable Multilevel Dictionaries for Sparse Representations

    Authors: Jayaraman J. Thiagarajan, Karthikeyan Natesan Ramamurthy, Andreas Spanias

    Abstract: Sparse representations using learned dictionaries are being increasingly used with success in several data processing and machine learning applications. The availability of abundant training data necessitates the development of efficient, robust and provably good dictionary learning algorithms. Algorithmic stability and generalization are desirable characteristics for dictionary learning algorithm…

    Submitted 25 September, 2013; v1 submitted 2 March, 2013; originally announced March 2013.