Generalized Naive Bayes
Authors:
Edith Alice Kovács,
Anna Ország,
Dániel Pfeifer,
András Benczúr
Abstract:
In this paper we introduce the so-called Generalized Naive Bayes structure as an extension of the Naive Bayes structure. We give a new greedy algorithm that finds a well-fitting Generalized Naive Bayes (GNB) probability distribution. We prove that this distribution fits the data at least as well as the probability distribution determined by the classical Naive Bayes (NB). Then, under a not very restrictive condition, we give a second algorithm that provably finds the optimal GNB probability distribution, i.e. the best-fitting structure in the sense of KL divergence. Both algorithms are constructed to maximize the information content and aim to minimize redundancy. Based on these algorithms, new methods for feature selection are introduced. We discuss the similarities and differences to other related algorithms in terms of structure, methodology, and complexity. Experimental results show that the introduced algorithms outperform the related algorithms in many cases.
Submitted 28 August, 2024;
originally announced August 2024.
Matrix and graph representations of vine copula structures
Authors:
Dániel Pfeifer,
Edith Alice Kovács
Abstract:
Vine copulas can efficiently model multivariate probability distributions. This paper focuses on a more thorough understanding of their structures, since vine copula representations in the literature are often ambiguous. The graph representations include the original, cherry, and chordal graph sequence structures, which we show to be equivalent. Importantly, we also show a new result, namely that when a perfect elimination ordering of a vine structure is given, the structure can always be uniquely represented by a matrix. O. M. Nápoles has shown a way to represent vines in a matrix; we give an algorithmic formulation of this previous approach and also present a new method for constructing such a matrix through cherry tree sequences. We also calculate the runtime of these algorithms. Lastly, we prove that these two matrix-building algorithms are equivalent if the same perfect elimination ordering is used.
Submitted 10 March, 2023; v1 submitted 10 May, 2022;
originally announced May 2022.