Showing 1–50 of 76 results for author: Chakraborty, A

  1. arXiv:2505.10952  [pdf]

    stat.AP

    Age-stratified clustering of multiple long-term conditions

    Authors: Anirban Chakraborty, Bruce Guthrie, Sohan Seth

    Abstract: Background: Most people with any long-term condition have multiple long-term conditions, but our understanding of how conditions cluster is limited. Many clustering studies identify clusters in the whole population, but the clusters that occur in people of different ages may be distinct. The aim of this paper was to explore similarities and differences in clusters found in different age-groups.… ▽ More

    Submitted 16 May, 2025; originally announced May 2025.

  2. arXiv:2411.06741  [pdf, other]

    stat.AP cs.LG stat.ML

    Methane projections from Canada's oil sands tailings using scientific deep learning reveal significant underestimation

    Authors: Esha Saha, Oscar Wang, Amit K. Chakraborty, Pablo Venegas Garcia, Russell Milne, Hao Wang

    Abstract: Bitumen extraction for the production of synthetic crude oil in Canada's Athabasca Oil Sands industry has recently come under spotlight for being a significant source of greenhouse gas emission. A major cause of concern is methane, a greenhouse gas produced by the anaerobic biodegradation of hydrocarbons in oil sands residues, or tailings, stored in settle basins commonly known as oil sands tailin… ▽ More

    Submitted 11 November, 2024; originally announced November 2024.

    Comments: 19 pages, 8 figures, 2 tables

  3. arXiv:2409.19208  [pdf, other]

    stat.CO stat.AP stat.ML

    Learning non-Gaussian spatial distributions via Bayesian transport maps with parametric shrinkage

    Authors: Anirban Chakraborty, Matthias Katzfuss

    Abstract: Many applications, including climate-model analysis and stochastic weather generators, require learning or emulating the distribution of a high-dimensional and non-Gaussian spatial field based on relatively few training samples. To address this challenge, a recently proposed Bayesian transport map (BTM) approach consists of a triangular transport map with nonparametric Gaussian-process (GP) compon… ▽ More

    Submitted 1 February, 2025; v1 submitted 27 September, 2024; originally announced September 2024.

  4. arXiv:2409.10771  [pdf, ps, other]

    stat.ME

    Flexible survival regression with variable selection for heterogeneous population

    Authors: Abhishek Mandal, Abhisek Chakraborty

    Abstract: Survival regression is widely used to model time-to-events data, to explore how covariates may influence the occurrence of events. Modern datasets often encompass a vast number of covariates across many subjects, with only a subset of the covariates significantly affecting survival. Additionally, subjects often belong to an unknown number of latent groups, where covariate effects on survival diffe… ▽ More

    Submitted 16 September, 2024; originally announced September 2024.

  5. arXiv:2409.09512  [pdf, other]

    stat.ME

    Doubly robust and computationally efficient high-dimensional variable selection

    Authors: Abhinav Chakraborty, Jeffrey Zhang, Eugene Katsevich

    Abstract: The variable selection problem is to discover which of a large set of predictors is associated with an outcome of interest, conditionally on the other predictors. This problem has been widely studied, but existing approaches lack either power against complex alternatives, robustness to model misspecification, computational efficiency, or quantification of evidence against individual hypotheses. We… ▽ More

    Submitted 14 September, 2024; originally announced September 2024.

  6. arXiv:2408.09582  [pdf, other]

    stat.CO stat.AP stat.ME stat.ML

    A Likelihood-Free Approach to Goal-Oriented Bayesian Optimal Experimental Design

    Authors: Atlanta Chakraborty, Xun Huan, Tommie Catanach

    Abstract: Conventional Bayesian optimal experimental design seeks to maximize the expected information gain (EIG) on model parameters. However, the end goal of the experiment often is not to learn the model parameters, but to predict downstream quantities of interest (QoIs) that depend on the learned parameters. And designs that offer high EIG for parameters may not translate to high EIG for QoIs. Goal-orie… ▽ More

    Submitted 18 August, 2024; originally announced August 2024.

  7. arXiv:2407.13678  [pdf, other]

    stat.ME

    Joint modelling of time-to-event and longitudinal response using robust skew normal-independent distributions

    Authors: Srimanti Dutta, Arindom Chakraborty, Dipankar Bandyopadhyay

    Abstract: Joint modelling of longitudinal observations and event times continues to remain a topic of considerable interest in biomedical research. For example, in HIV studies, the longitudinal bio-marker such as CD4 cell count in a patient's blood over follow up months is jointly modelled with the time to disease progression, death or dropout via a random intercept term mostly assumed to be Gaussian. Howev… ▽ More

    Submitted 18 July, 2024; originally announced July 2024.

  8. arXiv:2406.20088  [pdf, other]

    math.ST stat.ME stat.ML

    Minimax And Adaptive Transfer Learning for Nonparametric Classification under Distributed Differential Privacy Constraints

    Authors: Arnab Auddy, T. Tony Cai, Abhinav Chakraborty

    Abstract: This paper considers minimax and adaptive transfer learning for nonparametric classification under the posterior drift model with distributed differential privacy constraints. Our study is conducted within a heterogeneous framework, encompassing diverse sample sizes, varying privacy parameters, and data heterogeneity across different servers. We first establish the minimax misclassification rate,… ▽ More

    Submitted 28 June, 2024; originally announced June 2024.

    MSC Class: 62G08; 62G20

  9. arXiv:2406.06755  [pdf, other]

    math.ST cs.LG stat.ML

    Optimal Federated Learning for Nonparametric Regression with Heterogeneous Distributed Differential Privacy Constraints

    Authors: T. Tony Cai, Abhinav Chakraborty, Lasse Vuursteen

    Abstract: This paper studies federated learning for nonparametric regression in the context of distributed samples across different servers, each adhering to distinct differential privacy constraints. The setting we consider is heterogeneous, encompassing both varying sample sizes and differential privacy constraints across servers. Within this framework, both global and pointwise estimation are considered,… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 49 pages total, consisting of an article (24 pages) and a supplement (25 pages)

    MSC Class: 62G08; 62C20; 68P27; 62F30

  10. arXiv:2406.06749  [pdf, other]

    math.ST cs.LG stat.ML

    Federated Nonparametric Hypothesis Testing with Differential Privacy Constraints: Optimal Rates and Adaptive Tests

    Authors: T. Tony Cai, Abhinav Chakraborty, Lasse Vuursteen

    Abstract: Federated learning has attracted significant recent attention due to its applicability across a wide range of settings where data is collected and analyzed across disparate locations. In this paper, we study federated nonparametric goodness-of-fit testing in the white-noise-with-drift model under distributed differential privacy (DP) constraints. We first establish matching lower and upper bound… ▽ More

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 77 pages total; consisting of a main article (28 pages) and supplement (49 pages)

    MSC Class: 62G10; 62C20; 68P27; 62F30

  11. arXiv:2406.02794  [pdf, other]

    stat.ME cs.SI math.ST

    PriME: Privacy-aware Membership profile Estimation in networks

    Authors: Abhinav Chakraborty, Sayak Chatterjee, Sagnik Nandy

    Abstract: This paper presents a novel approach to estimating community membership probabilities for network vertices generated by the Degree Corrected Mixed Membership Stochastic Block Model while preserving individual edge privacy. Operating within the $\varepsilon$-edge local differential privacy framework, we introduce an optimal private algorithm based on a symmetric edge flip mechanism and spectral clu… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

  12. arXiv:2404.17763  [pdf, other]

    stat.ME stat.CO stat.ML

    Likelihood Based Inference in Fully and Partially Observed Exponential Family Graphical Models with Intractable Normalizing Constants

    Authors: Yujie Chen, Anindya Bhadra, Antik Chakraborty

    Abstract: Probabilistic graphical models that encode an underlying Markov random field are fundamental building blocks of generative modeling to learn latent representations in modern multivariate data sets with complex dependency structures. Among these, the exponential family graphical models are especially popular, given their fairly well-understood statistical properties and computational scalability to… ▽ More

    Submitted 1 April, 2025; v1 submitted 26 April, 2024; originally announced April 2024.

  13. arXiv:2404.08893  [pdf, other]

    cs.LG math.DS q-bio.PE stat.AP

    Early detection of disease outbreaks and non-outbreaks using incidence data

    Authors: Shan Gao, Amit K. Chakraborty, Russell Greiner, Mark A. Lewis, Hao Wang

    Abstract: Forecasting the occurrence and absence of novel disease outbreaks is essential for disease management. Here, we develop a general model, with no real-world training data, that accurately forecasts outbreaks and non-outbreaks. We propose a novel framework, using a feature-based time series classification method to forecast outbreaks and non-outbreaks. We tested our methods on synthetic data from a… ▽ More

    Submitted 12 April, 2024; originally announced April 2024.

  14. arXiv:2404.03152  [pdf, other]

    stat.ME

    Orthogonal calibration via posterior projections with applications to the Schwarzschild model

    Authors: Antik Chakraborty, Jonelle B. Walsh, Louis Strigari, Bani K. Mallick, Anirban Bhattacharya

    Abstract: The orbital superposition method originally developed by Schwarzschild (1979) is used to study the dynamics of growth of a black hole and its host galaxy, and has uncovered new relationships between the galaxy's global characteristics. Scientists are specifically interested in finding optimal parameter choices for this model that best match physical measurements along with quantifying the uncertai… ▽ More

    Submitted 11 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

  15. arXiv:2403.16233  [pdf, other]

    cs.LG q-bio.PE stat.AP

    An early warning indicator trained on stochastic disease-spreading models with different noises

    Authors: Amit K. Chakraborty, Shan Gao, Reza Miry, Pouria Ramazi, Russell Greiner, Mark A. Lewis, Hao Wang

    Abstract: The timely detection of disease outbreaks through reliable early warning signals (EWSs) is indispensable for effective public health mitigation strategies. Nevertheless, the intricate dynamics of real-world disease spread, often influenced by diverse sources of noise and limited data in the early stages of outbreaks, pose a significant challenge in developing reliable EWSs, as the performance of e… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

  16. arXiv:2402.13890  [pdf, other]

    stat.ME stat.ML

    A unified Bayesian framework for interval hypothesis testing in clinical trials

    Authors: Abhisek Chakraborty, Megan H. Murray, Ilya Lipkovich, Yu Du

    Abstract: The American Statistical Association (ASA) statement on statistical significance and P-values (Wasserstein and Lazar, 2016) cautioned statisticians against making scientific decisions solely on the basis of traditional P-values. The statement delineated key issues with P-values, including a lack of transparency, an inability to quantify evidence in support of the null hypothesis, and an inability to m… ▽ More

    Submitted 21 February, 2024; originally announced February 2024.

  17. arXiv:2401.16596  [pdf, other]

    stat.ME cs.CR cs.SI math.ST stat.ML

    PrIsing: Privacy-Preserving Peer Effect Estimation via Ising Model

    Authors: Abhinav Chakraborty, Anirban Chatterjee, Abhinandan Dalal

    Abstract: The Ising model, originally developed as a spin-glass model for ferromagnetic elements, has gained popularity as a network-based model for capturing dependencies in agents' outputs. Its increasing adoption in healthcare and the social sciences has raised privacy concerns regarding the confidentiality of agents' responses. In this paper, we present a novel $(\varepsilon,\delta)$-differentially private a… ▽ More

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: To Appear in AISTATS 2024

  18. arXiv:2401.15502  [pdf, other]

    stat.ML cs.CR cs.LG

    Differentially private Bayesian tests

    Authors: Abhisek Chakraborty, Saptati Datta

    Abstract: Differential privacy has emerged as a significant cornerstone in the realm of scientific hypothesis testing utilizing confidential data. In reporting scientific discoveries, Bayesian tests are widely adopted since they effectively circumnavigate the key criticisms of P-values, namely, lack of interpretability and inability to quantify evidence in support of the competing hypotheses. We present a… ▽ More

    Submitted 1 May, 2024; v1 submitted 27 January, 2024; originally announced January 2024.

  19. arXiv:2310.12447  [pdf, other]

    stat.ML cs.LG

    Constrained Reweighting of Distributions: an Optimal Transport Approach

    Authors: Abhisek Chakraborty, Anirban Bhattacharya, Debdeep Pati

    Abstract: We commonly encounter the problem of identifying an optimally weight adjusted version of the empirical distribution of observed data, adhering to predefined constraints on the weights. Such constraints often manifest as restrictions on the moments, tail behaviour, shapes, number of modes, etc., of the resulting weight adjusted empirical distribution. In this article, we substantially enhance the f… ▽ More

    Submitted 16 January, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

    Comments: arXiv admin note: text overlap with arXiv:2303.10085

  20. arXiv:2309.07882  [pdf, other]

    stat.CO stat.AP stat.ML

    Scalable Model-Based Gaussian Process Clustering

    Authors: Anirban Chakraborty, Abhisek Chakraborty

    Abstract: Gaussian process is an indispensable tool in clustering functional data, owing to its flexibility and inherent uncertainty quantification. However, when the functional data is observed over a large grid (say, of length $p$), Gaussian process clustering quickly renders itself infeasible, incurring $O(p^2)$ space complexity and $O(p^3)$ time complexity per iteration; and thus prohibiting its natur… ▽ More

    Submitted 14 September, 2023; originally announced September 2023.

  21. arXiv:2309.05132  [pdf, other]

    cs.CV cs.LG stat.ML

    DAD++: Improved Data-free Test Time Adversarial Defense

    Authors: Gaurav Kumar Nayak, Inder Khatri, Shubham Randive, Ruchit Rawal, Anirban Chakraborty

    Abstract: With the increasing deployment of deep neural networks in safety-critical applications such as self-driving cars, medical imaging, anomaly detection, etc., adversarial robustness has become a crucial concern in the reliability of these networks in real-world scenarios. A plethora of works based on adversarial training and regularization-based techniques have been proposed to make these deep networ… ▽ More

    Submitted 10 September, 2023; originally announced September 2023.

    Comments: IJCV Journal (Under Review)

  22. arXiv:2305.17557  [pdf, other]

    stat.ML cs.CY cs.LG

    Fair Clustering via Hierarchical Fair-Dirichlet Process

    Authors: Abhisek Chakraborty, Anirban Bhattacharya, Debdeep Pati

    Abstract: The advent of ML-driven decision-making and policy formation has led to an increasing focus on algorithmic fairness. As clustering is one of the most commonly used unsupervised machine learning approaches, there has naturally been a proliferation of literature on fair clustering. A popular notion of fairness in clustering mandates the clusters to be balanced, i.e., each level of a prot… ▽ More

    Submitted 27 May, 2023; originally announced May 2023.

  23. arXiv:2303.10085  [pdf, other]

    stat.ME cs.LG stat.ML

    Robust probabilistic inference via a constrained transport metric

    Authors: Abhisek Chakraborty, Anirban Bhattacharya, Debdeep Pati

    Abstract: Flexible Bayesian models are typically constructed using limits of large parametric models with a multitude of parameters that are often uninterpretable. In this article, we offer a novel alternative by constructing an exponentially tilted empirical likelihood carefully designed to concentrate near a parametric family of distributions of choice with respect to a novel variant of the Wasserstein me… ▽ More

    Submitted 17 March, 2023; originally announced March 2023.

  24. arXiv:2211.14698  [pdf, other]

    stat.ME math.ST

    Reconciling model-X and doubly robust approaches to conditional independence testing

    Authors: Ziang Niu, Abhinav Chakraborty, Oliver Dukes, Eugene Katsevich

    Abstract: Model-X approaches to testing conditional independence between a predictor and an outcome variable given a vector of covariates usually assume exact knowledge of the conditional distribution of the predictor given the covariates. Nevertheless, model-X methodologies are often deployed with this conditional distribution learned in sample. We investigate the consequences of this choice through the le… ▽ More

    Submitted 8 February, 2023; v1 submitted 26 November, 2022; originally announced November 2022.

  25. arXiv:2205.02604  [pdf, other]

    cs.CV cs.HC cs.LG stat.ML

    Holistic Approach to Measure Sample-level Adversarial Vulnerability and its Utility in Building Trustworthy Systems

    Authors: Gaurav Kumar Nayak, Ruchit Rawal, Rohit Lal, Himanshu Patil, Anirban Chakraborty

    Abstract: Adversarial attack perturbs an image with an imperceptible noise, leading to incorrect model prediction. Recently, a few works showed inherent bias associated with such attack (robustness bias), where certain subgroups in a dataset (e.g. based on class, gender, etc.) are less robust than others. This bias not only persists even after adversarial training, but often results in severe performance di… ▽ More

    Submitted 5 May, 2022; originally announced May 2022.

    Comments: Accepted in CVPR Workshop 2022 on Human-centered Intelligent Services: Safe and Trustworthy

  26. arXiv:2203.09782  [pdf, other]

    stat.ME stat.AP stat.CO

    Modularized Bayesian analyses and cutting feedback in likelihood-free inference

    Authors: Atlanta Chakraborty, David J. Nott, Christopher Drovandi, David T. Frazier, Scott A. Sisson

    Abstract: There has been much recent interest in modifying Bayesian inference for misspecified models so that it is useful for specific purposes. One popular modified Bayesian inference method is "cutting feedback" which can be used when the model consists of a number of coupled modules, with only some of the modules being misspecified. Cutting feedback methods represent the full posterior distribution in t… ▽ More

    Submitted 18 March, 2022; originally announced March 2022.

  27. arXiv:2202.09993  [pdf, other]

    stat.ME stat.CO

    Weakly informative priors and prior-data conflict checking for likelihood-free inference

    Authors: Atlanta Chakraborty, David J. Nott, Michael Evans

    Abstract: Bayesian likelihood-free inference, which is used to perform Bayesian inference when the likelihood is intractable, enjoys an increasing number of important scientific applications. However, many aspects of a Bayesian analysis become more challenging in the likelihood-free setting. One example of this is prior-data conflict checking, where the goal is to assess whether the information in the data… ▽ More

    Submitted 21 February, 2022; originally announced February 2022.

  28. arXiv:2202.08107  [pdf, other]

    stat.AP

    Estimating Software Reliability Using Size-biased Modelling

    Authors: Soumen Dey, Ashis Kumar Chakraborty

    Abstract: Software reliability estimation is one of the most active areas of research in software testing. Since time between failures (TBF) has often been challenging to record, software testing data are commonly recorded as test-case-wise in a discrete set up. We have developed a Bayesian generalised linear mixed model (GLMM) based on software testing detection data and a size-biased strategy which not… ▽ More

    Submitted 20 April, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: 14 pages. This work has been submitted to the IEEE for possible publication

  29. arXiv:2110.14215  [pdf, other]

    cs.CV cs.LG stat.ML

    Beyond Classification: Knowledge Distillation using Multi-Object Impressions

    Authors: Gaurav Kumar Nayak, Monish Keswani, Sharan Seshadri, Anirban Chakraborty

    Abstract: Knowledge Distillation (KD) utilizes training data as a transfer set to transfer knowledge from a complex network (Teacher) to a smaller network (Student). Several works have recently identified many scenarios where the training data may not be available due to data privacy or sensitivity concerns and have proposed solutions under this restrictive constraint for the classification task. Unlike exi… ▽ More

    Submitted 27 October, 2021; originally announced October 2021.

    Comments: Accepted in BMVC 2021

  30. arXiv:2110.07062  [pdf, other]

    stat.CO

    Ordered conditional approximation of Potts models

    Authors: Anirban Chakraborty, Matthias Katzfuss, Joseph Guinness

    Abstract: Potts models, which can be used to analyze dependent observations on a lattice, have seen widespread application in a variety of areas, including statistical mechanics, neuroscience, and quantum computing. To address the intractability of Potts likelihoods for large spatial fields, we propose fast ordered conditional approximations that enable rapid inference for observed and hidden Potts models.… ▽ More

    Submitted 13 October, 2021; originally announced October 2021.

  31. arXiv:2106.02127  [pdf, ps, other]

    stat.ME stat.CO

    Bayesian inference on high-dimensional multivariate binary responses

    Authors: Antik Chakraborty, Rihui Ou, David B. Dunson

    Abstract: It has become increasingly common to collect high-dimensional binary response data; for example, with the emergence of new sampling techniques in ecology. In smaller dimensions, multivariate probit (MVP) models are routinely used for inferences. However, algorithms for fitting such models face issues in scaling up to high dimensions due to the intractability of the likelihood, involving an integra… ▽ More

    Submitted 24 October, 2022; v1 submitted 3 June, 2021; originally announced June 2021.

  32. arXiv:2101.06069  [pdf, other]

    cs.CV cs.LG stat.ML

    Mining Data Impressions from Deep Models as Substitute for the Unavailable Training Data

    Authors: Gaurav Kumar Nayak, Konda Reddy Mopuri, Saksham Jain, Anirban Chakraborty

    Abstract: Pretrained deep models hold their learnt knowledge in the form of model parameters. These parameters act as "memory" for the trained models and help them generalize well on unseen data. However, in absence of training data, the utility of a trained model is merely limited to either inference or better initialization towards a target task. In this paper, we go further and extract synthetic data by… ▽ More

    Submitted 30 August, 2021; v1 submitted 15 January, 2021; originally announced January 2021.

    Comments: Accepted in TPAMI, 2021. arXiv admin note: text overlap with arXiv:1905.08114

  33. arXiv:2011.07142  [pdf, other]

    stat.ML cs.CV cs.LG math.OC

    Sparse Representations of Positive Functions via First and Second-Order Pseudo-Mirror Descent

    Authors: Abhishek Chakraborty, Ketan Rajawat, Alec Koppel

    Abstract: We consider expected risk minimization problems when the range of the estimator is required to be nonnegative, motivated by the settings of maximum likelihood estimation (MLE) and trajectory optimization. To facilitate nonlinear interpolation, we hypothesize that the search space is a Reproducing Kernel Hilbert Space (RKHS). We develop first and second-order variants of stochastic mirror descent e… ▽ More

    Submitted 3 May, 2022; v1 submitted 13 November, 2020; originally announced November 2020.

  34. arXiv:2011.04470  [pdf, other]

    math.ST stat.ME

    High dimensional PCA: a new model selection criterion

    Authors: Abhinav Chakraborty, Soumendu Sundar Mukherjee, Arijit Chakrabarti

    Abstract: Given a random sample from a multivariate population, estimating the number of large eigenvalues of the population covariance matrix is an important problem in Statistics with wide applications in many areas. In the context of Principal Component Analysis (PCA), the linear combinations of the original variables having the largest amounts of variation are determined by this number. In this paper, w… ▽ More

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: 37 pages, 6 figures, 2 tables

    MSC Class: 62H12; 62H25

  35. arXiv:2010.12622  [pdf, other]

    cs.LG cs.CV stat.ML

    S2cGAN: Semi-Supervised Training of Conditional GANs with Fewer Labels

    Authors: Arunava Chakraborty, Rahul Ragesh, Mahir Shah, Nipun Kwatra

    Abstract: Generative adversarial networks (GANs) have been remarkably successful in learning complex high-dimensional real-world distributions and generating realistic samples. However, they provide limited control over the generation process. Conditional GANs (cGANs) provide a mechanism to control the generation process by conditioning the output on a user-defined input. Although training GANs requires only… ▽ More

    Submitted 23 October, 2020; originally announced October 2020.

  36. arXiv:2008.03943  [pdf, other]

    stat.AP

    A robust and non-parametric model for prediction of dengue incidence

    Authors: Atlanta Chakraborty, Vijay Chandru

    Abstract: Disease surveillance is essential not only for the prior detection of outbreaks but also for monitoring trends of the disease in the long run. In this paper, we aim to build a tactical model for the surveillance of dengue, in particular. Most existing models for dengue prediction exploit its known relationships between climate and socio-demographic factors with the incidence counts, however they a… ▽ More

    Submitted 10 August, 2020; originally announced August 2020.

    Comments: 2 figures, 11 pages, Presented at the 51st Annual Convention of the Operational Research Society of India (ORSI), IIT Mumbai, 2018

  37. arXiv:2004.08309  [pdf, other]

    stat.ME stat.AP

    Bayesian semiparametric long memory models for discretized event data

    Authors: Antik Chakraborty, Otso Ovaskainen, David B. Dunson

    Abstract: We introduce a new class of semiparametric latent variable models for long memory discretized event data. The proposed methodology is motivated by a study of bird vocalizations in the Amazon rain forest; the timings of vocalizations exhibit self-similarity and long range dependence ruling out models based on Poisson processes. The proposed class of FRActional Probit (FRAP) models is based on thres… ▽ More

    Submitted 30 June, 2021; v1 submitted 17 April, 2020; originally announced April 2020.

  38. arXiv:2003.07953  [pdf, ps, other]

    stat.ME stat.CO stat.ML

    Nearest Neighbor Dirichlet Mixtures

    Authors: Shounak Chattopadhyay, Antik Chakraborty, David B. Dunson

    Abstract: There is a rich literature on Bayesian methods for density estimation, which characterize the unknown density as a mixture of kernels. Such methods have advantages in terms of providing uncertainty quantification in estimation, while being adaptive to a rich variety of densities. However, relative to frequentist locally adaptive kernel methods, Bayesian approaches can be slow and unstable to imple… ▽ More

    Submitted 23 February, 2023; v1 submitted 17 March, 2020; originally announced March 2020.

    Journal ref: Journal of Machine Learning Research, 24(261), 1-46 (2023)

  39. arXiv:2002.08860  [pdf, other]

    cs.LG eess.SY stat.ML

    Dissipative SymODEN: Encoding Hamiltonian Dynamics with Dissipation and Control into Deep Learning

    Authors: Yaofeng Desmond Zhong, Biswadip Dey, Amit Chakraborty

    Abstract: In this work, we introduce Dissipative SymODEN, a deep learning architecture which can infer the dynamics of a physical system with dissipation from observed state trajectories. To improve prediction accuracy while reducing network size, Dissipative SymODEN encodes the port-Hamiltonian dynamics with energy dissipation and external input into the design of its computation graph and learns the dynam… ▽ More

    Submitted 29 April, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

    Comments: Published at ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations (DeepDiffEq)

  40. arXiv:1912.11960  [pdf, other]

    cs.LG cs.CV stat.ML

    DeGAN : Data-Enriching GAN for Retrieving Representative Samples from a Trained Classifier

    Authors: Sravanti Addepalli, Gaurav Kumar Nayak, Anirban Chakraborty, R. Venkatesh Babu

    Abstract: In this era of digital information explosion, an abundance of data from numerous modalities is being generated as well as archived everyday. However, most problems associated with training Deep Neural Networks still revolve around lack of data that is rich enough for a given task. Data is required not only for training an initial model, but also for future learning tasks such as Model Compression… ▽ More

    Submitted 26 December, 2019; originally announced December 2019.

    Comments: Accepted at AAAI-2020

  41. arXiv:1911.09722  [pdf, other]

    stat.ML cs.LG eess.IV

    EvAn: Neuromorphic Event-based Anomaly Detection

    Authors: Lakshmi Annamalai, Anirban Chakraborty, Chetan Singh Thakur

    Abstract: Event-based cameras are bio-inspired novel sensors that asynchronously record changes in illumination in the form of events, thus resulting in significant advantages over conventional cameras in terms of low power utilization, high dynamic range, and no motion blur. Moreover, such cameras, by design, encode only the relative motion between the scene and the sensor (and not the static background) t… ▽ More

    Submitted 15 February, 2020; v1 submitted 21 November, 2019; originally announced November 2019.

  42. arXiv:1911.08556  [pdf, other]

    cs.LG stat.ML

    Towards Reducing Bias in Gender Classification

    Authors: Komal K. Teru, Aishik Chakraborty

    Abstract: Societal bias towards certain communities is a big problem that affects a lot of machine learning systems. This work aims at addressing the racial bias present in many modern gender recognition systems. We learn race invariant representations of human faces with an adversarially trained autoencoder model. We show that such representations help us achieve less biased performance in gender classific… ▽ More

    Submitted 16 November, 2019; originally announced November 2019.

    Comments: arXiv admin note: text overlap with arXiv:1706.00409 by other authors

  43. arXiv:1910.02133  [pdf, other]

    eess.IV cond-mat.mtrl-sci cs.LG stat.ML

    A Conditional Generative Model for Predicting Material Microstructures from Processing Methods

    Authors: Akshay Iyer, Biswadip Dey, Arindam Dasgupta, Wei Chen, Amit Chakraborty

    Abstract: Microstructures of a material form the bridge linking processing conditions - which can be controlled, to the material property - which is the primary interest in engineering applications. Thus a critical task in material design is establishing the processing-structure relationship, which requires domain expertise and techniques that can model the high-dimensional material microstructure. This wor… ▽ More

    Submitted 4 October, 2019; originally announced October 2019.

  44. arXiv:1909.12077  [pdf, other]

    cs.LG eess.SY physics.comp-ph stat.ML

    Symplectic ODE-Net: Learning Hamiltonian Dynamics with Control

    Authors: Yaofeng Desmond Zhong, Biswadip Dey, Amit Chakraborty

    Abstract: In this paper, we introduce Symplectic ODE-Net (SymODEN), a deep learning framework which can infer the dynamics of a physical system, given by an ordinary differential equation (ODE), from observed state trajectories. To achieve better generalization with fewer training samples, SymODEN incorporates appropriate inductive bias by designing the associated computation graph in a physics-informed man…

    Submitted 29 February, 2024; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: Published as a Conference Paper at ICLR 2020

    Journal ref: International Conference on Learning Representations (ICLR 2020); https://openreview.net/forum?id=ryxmb1rKDS

  45. Bayesian Neural Tree Models for Nonparametric Regression

    Authors: Tanujit Chakraborty, Gauri Kamat, Ashis Kumar Chakraborty

    Abstract: Frequentist and Bayesian methods differ in many aspects, but share some basic optimal properties. In real-life classification and regression problems, situations exist in which a model based on one of the methods is preferable based on some subjective criterion. Nonparametric classification and regression techniques, such as decision trees and neural networks, have frequentist (classification and…

    Submitted 27 July, 2020; v1 submitted 1 September, 2019; originally announced September 2019.

    Journal ref: Australian and New Zealand Journal of Statistics, 2023

  46. arXiv:1908.00307  [pdf, ps, other]

    stat.ME

    Optimum Testing Time of Software using Size-Biased Concepts

    Authors: Ashis Kumar Chakraborty, Parna Chatterjee, Poulami Chakraborty, Aleena Chanda

    Abstract: Optimum software release time problem has been an interesting area of research for several decades now. We introduce here a new concept of size-biased modelling to solve for the optimum software release time. Bayesian approach is used to solve the problem. We also discuss about the applicability of the model for a specific data set, though we believe that the model is applicable to all kind of sof…

    Submitted 1 August, 2019; originally announced August 2019.

    Comments: Communicated

  47. arXiv:1906.08843  [pdf, other]

    stat.ME math.ST stat.AP

    On Statistical Properties of A Veracity Scoring Method for Spatial Data

    Authors: Arnab Chakraborty, Soumendra N. Lahiri

    Abstract: Measuring veracity or reliability of noisy data is of utmost importance, especially in the scenarios where the information are gathered through automated systems. In a recent paper, Chakraborty et. al. (2019) have introduced a veracity scoring technique for geostatistical data. The authors have used a high-quality `reference' data to measure the veracity of the varying-quality observations and inc…

    Submitted 20 June, 2019; originally announced June 2019.

    Comments: 37 pages, 4 figures, 6 tables, submitted to JRSS-B

  48. arXiv:1905.08114  [pdf, other]

    cs.LG cs.CV stat.ML

    Zero-Shot Knowledge Distillation in Deep Networks

    Authors: Gaurav Kumar Nayak, Konda Reddy Mopuri, Vaisakh Shaj, R. Venkatesh Babu, Anirban Chakraborty

    Abstract: Knowledge distillation deals with the problem of training a smaller model (Student) from a high capacity source model (Teacher) so as to retain most of its performance. Existing approaches use either the training data or meta-data extracted from it in order to train the Student. However, accessing the dataset on which the Teacher has been trained may not always be feasible if the dataset is very l…

    Submitted 20 May, 2019; originally announced May 2019.

    Comments: Accepted in ICML 2019, codes will be available at https://github.com/vcl-iisc/ZSKD

  49. arXiv:1904.02092  [pdf, other]

    hep-ph hep-ex stat.ML

    Interpretable Deep Learning for Two-Prong Jet Classification with Jet Spectra

    Authors: Amit Chakraborty, Sung Hak Lim, Mihoko M. Nojiri

    Abstract: Classification of jets with deep learning has gained significant attention in recent times. However, the performance of deep neural networks is often achieved at the cost of interpretability. Here we propose an interpretable network trained on the jet spectrum $S_{2}(R)$ which is a two-point correlation function of the jet constituents. The spectrum can be derived from a functional Taylor series o…

    Submitted 26 March, 2020; v1 submitted 3 April, 2019; originally announced April 2019.

    Comments: 32 pages, 21 figures, published in JHEP

    Report number: KEK-TH-2117

    Journal ref: J. High Energ. Phys. 2019, 135 (2019)

  50. arXiv:1903.04925  [pdf, other]

    q-bio.GN cs.DS cs.LG stat.ML

    conLSH: Context based Locality Sensitive Hashing for Mapping of noisy SMRT Reads

    Authors: Angana Chakraborty, Sanghamitra Bandyopadhyay

    Abstract: Single Molecule Real-Time (SMRT) sequencing is a recent advancement of Next Gen technology developed by Pacific Bio (PacBio). It comes with an explosion of long and noisy reads demanding cutting edge research to get most out of it. To deal with the high error probability of SMRT data, a novel contextual Locality Sensitive Hashing (conLSH) based algorithm is proposed in this article, which can effe…

    Submitted 11 March, 2019; originally announced March 2019.

    Comments: arXiv admin note: text overlap with arXiv:1705.03933