Showing 1–2 of 2 results for author: Tvedebrink, T
-
Detecting Outliers in High-dimensional Data with Mixed Variable Types using Conditional Gaussian Regression Models
Authors: Mads Lindskou, Torben Tvedebrink, Poul Svante Eriksen, Niels Morling
Abstract:
Outlier detection has gained increasing interest in recent years, due to newly emerging technologies and the huge amount of high-dimensional data that are now available. Outlier detection can help practitioners to identify unwanted noise and/or locate interesting abnormal observations. To address this, we developed a novel method for outlier detection for use in, possibly high-dimensional, datasets with both discrete and continuous variables. We exploit the family of decomposable graphical models in order to model the relationship between the variables and use this to form an exact likelihood ratio test for an observation that is considered an outlier. We show that our method outperforms the state-of-the-art Isolation Forest algorithm on a real data example.
Submitted 19 May, 2021; v1 submitted 3 March, 2021; originally announced March 2021.
-
The multivariate Dirichlet-multinomial distribution and its application in forensic genetics to adjust for sub-population effects using the θ-correction
Authors: Torben Tvedebrink, Poul Svante Eriksen, Niels Morling
Abstract:
In this paper, we discuss the construction of a multivariate generalisation of the Dirichlet-multinomial distribution. An example from forensic genetics in the statistical analysis of DNA mixtures motivates the study of this multivariate extension.
In forensic genetics, adjustment of the match probabilities due to remote ancestry in the population is often done using the so-called θ-correction. This correction increases the probability of observing multiple copies of rare alleles and thereby reduces the weight of the evidence for rare genotypes.
By numerical examples, we show how the θ-correction incorporated by the use of the multivariate Dirichlet-multinomial distribution affects the weight of evidence. Furthermore, we demonstrate how the θ-correction can be incorporated in a Markov structure needed to make efficient computations in a Bayesian network.
Submitted 4 November, 2014; v1 submitted 25 June, 2014; originally announced June 2014.