
Showing 1–8 of 8 results for author: Berisha, V

Searching in archive stat.
  1. arXiv:2501.03568  [pdf, other]

    cs.LG stat.ME

    Advanced Tutorial: Label-Efficient Two-Sample Tests

    Authors: Weizhi Li, Visar Berisha, Gautam Dasarathy

    Abstract: Hypothesis testing is a statistical inference approach used to determine whether data supports a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are from identical distributions. This test is widely used, such as by clinical researchers comparing treatment effectiveness. This tutorial explores two-sample testing in a context where an a…

    Submitted 7 January, 2025; originally announced January 2025.

  2. arXiv:2301.12616  [pdf, other]

    cs.LG stat.ME

    Active Sequential Two-Sample Testing

    Authors: Weizhi Li, Prad Kadambi, Pouria Saidi, Karthikeyan Natesan Ramamurthy, Gautam Dasarathy, Visar Berisha

    Abstract: A two-sample hypothesis test is a statistical procedure used to determine whether the distributions generating two samples are identical. We consider the two-sample testing problem in a new scenario where the sample measurements (or sample features) are inexpensive to access, but their group memberships (or labels) are costly. To address the problem, we devise the first active sequential two…

    Submitted 27 June, 2024; v1 submitted 29 January, 2023; originally announced January 2023.

  3. arXiv:2111.08861  [pdf, other]

    cs.LG stat.ML

    A label-efficient two-sample test

    Authors: Weizhi Li, Gautam Dasarathy, Karthikeyan Natesan Ramamurthy, Visar Berisha

    Abstract: Two-sample tests evaluate whether two samples are realizations of the same distribution (the null hypothesis) or two different distributions (the alternative hypothesis). We consider a new setting for this problem where sample features are easily measured whereas sample labels are unknown and costly to obtain. Accordingly, we devise a three-stage framework in service of performing an effective two…

    Submitted 19 July, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: Accepted to the 38th conference on Uncertainty in Artificial Intelligence (UAI2022)

  4. arXiv:2001.01900  [pdf, other]

    cs.LG stat.ML

    Regularization via Structural Label Smoothing

    Authors: Weizhi Li, Gautam Dasarathy, Visar Berisha

    Abstract: Regularization is an effective way to promote the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network by softening the ground-truth labels in the training data in an attempt to penalize overconfident outputs. Existing approaches typically use cross-validation to…

    Submitted 4 July, 2020; v1 submitted 7 January, 2020; originally announced January 2020.

  5. arXiv:1808.01535  [pdf, other]

    eess.AS cs.CL cs.LG stat.ML

    Triplet Network with Attention for Speaker Diarization

    Authors: Huan Song, Megan Willi, Jayaraman J. Thiagarajan, Visar Berisha, Andreas Spanias

    Abstract: In automatic speech processing systems, speaker diarization is a crucial front-end component to separate segments from different speakers. Inspired by the recent success of deep neural networks (DNNs) in semantic inferencing, triplet loss-based architectures have been successfully used for this problem. However, existing work utilizes conventional i-vectors as the input representation and builds s…

    Submitted 4 August, 2018; originally announced August 2018.

    Comments: Interspeech2018

  6. Direct estimation of density functionals using a polynomial basis

    Authors: Alan Wisler, Visar Berisha, Andreas Spanias, Alfred O. Hero

    Abstract: A number of fundamental quantities in statistical signal processing and information theory can be expressed as integral functions of two probability density functions. Such quantities are called density functionals as they map density functions onto the real line. For example, information divergence functions measure the dissimilarity between two probability density functions and are useful in a n…

    Submitted 20 November, 2017; v1 submitted 21 February, 2017; originally announced February 2017.

    Comments: Under review for IEEE Transactions on Signal Processing

  7. arXiv:1412.6534  [pdf, other]

    cs.IT stat.ML

    Empirically Estimable Classification Bounds Based on a New Divergence Measure

    Authors: Visar Berisha, Alan Wisler, Alfred O. Hero, Andreas Spanias

    Abstract: Information divergence functions play a critical role in statistics and information theory. In this paper we show that a non-parametric f-divergence measure can be used to provide improved bounds on the minimum binary classification probability of error for the case when the training and test data are drawn from the same distribution and for the case where there exists some mismatch between traini…

    Submitted 10 February, 2015; v1 submitted 19 December, 2014; originally announced December 2014.

    Comments: 12 pages, 5 figures

  8. arXiv:1408.1182  [pdf, other]

    stat.CO cs.IT stat.ML

    Empirical non-parametric estimation of the Fisher Information

    Authors: Visar Berisha, Alfred O. Hero

    Abstract: The Fisher information matrix (FIM) is a foundational concept in statistical signal processing. The FIM depends on the probability distribution, assumed to belong to a smooth parametric family. Traditional approaches to estimating the FIM require estimating the probability distribution function (PDF), or its parameters, along with its gradient or Hessian. However, in many practical situations the…

    Submitted 16 November, 2014; v1 submitted 6 August, 2014; originally announced August 2014.

    Comments: 12 pages