-
Strongly Consistent of Kullback-Leibler Divergence Estimator and Tests for Model Selection Based on a Bias Reduced Kernel Density Estimator
Authors:
Papa Ngom,
Freedath Djibril Moussa,
Jean de Dieu Nkurunziza
Abstract:
In this paper, we study the strong consistency of a bias-reduced kernel density estimator and derive a strongly consistent Kullback-Leibler divergence (KLD) estimator. As an application, we formulate a goodness-of-fit test and an asymptotically standard normal test for model selection. Monte Carlo simulations show the effectiveness of the proposed estimation methods and statistical tests.
Submitted 18 May, 2018;
originally announced May 2018.
-
Relationship between the Bregman divergence and beta-divergence and their Applications
Authors:
Macoumba Ndour,
Mactar Ndaw,
Papa Ngom
Abstract:
The Bregman divergences have been the subject of several studies. We do not undertake an exhaustive study of their subclasses, but propose a proof that the β-divergences are particular cases of the Bregman divergences. We also propose algorithms and their applications to show the consistency of our approach. This is of interest for numerous applications, since these divergences are widely used, for instance in non-negative matrix factorization (NMF).
Submitted 18 May, 2018;
originally announced May 2018.
-
The Alpha-Beta-Symetric Divergence and their Positive Definite Kernel
Authors:
Mactar Ndaw,
Macoumba Ndour,
Papa Ngom
Abstract:
In this article we study Hilbertian metrics and positive definite (pd) kernels on probability measures, which are of real interest in kernel methods. First, starting from the Alpha-Beta-divergence, we obtain a Hilbertian metric by constructing a symmetric version of this divergence, the Alpha-Beta-Symmetric-divergence (ABS-divergence); we study its properties and propose the kernels associated with it. Second, we carry out numerical studies incorporating all the proposed metrics/kernels into a support vector machine (SVM). Finally, we present an algorithm for image classification using our divergence.
Submitted 15 September, 2018; v1 submitted 1 March, 2018;
originally announced March 2018.
-
Variable selection via Group LASSO Approach : Application to the Cox Regression and frailty model
Authors:
Jean Claude Utazirubanda,
Tomas Leon,
Papa Ngom
Abstract:
In the analysis of survival outcomes supplemented with both clinical information and high-dimensional gene expression data, the traditional Cox proportional hazards model (1972) fails to meet some emerging needs in biomedical research. First, the number of covariates is generally much larger than the sample size. Second, predicting an outcome based on individual gene expression is inadequate because multiple biological processes and functional pathways regulate the expression associated with a gene. Another challenge is that the Cox model assumes that populations are homogeneous, implying that all individuals have the same risk of death, which is rarely true due to unmeasured risk factors among populations. In this paper we propose group LASSO with gamma-distributed frailty for variable selection in Cox regression, extending previous scholarship to account for heterogeneity among group structures related to exposure and susceptibility. The consistency property of the proposed method is established. This method is appropriate for addressing a wide variety of research questions, from genetics to air pollution. Simulated analysis shows promising performance by group LASSO compared with other methods, including group SCAD and group MCP. Future directions include expanding the use of frailty with adaptive group LASSO and sparse group LASSO.
Submitted 23 February, 2018;
originally announced February 2018.
-
Discriminating between two models based on Bregman divergence in small samples
Authors:
Papa Ngom,
Jean de Dieu Nkurunziza,
Carlos Simplice Ogouyandjou
Abstract:
Recently in [1, 2], Ali-Akbar Bromideh introduced the Kullback-Leibler Divergence (KLD) test statistic for discriminating between two models. It was found that the Ratio of Minimized Kullback-Leibler Divergence (RMKLD) works better than the Ratio of Maximized Likelihood (RML) for small sample sizes. The aim of this paper is to generalize the work of Ali-Akbar Bromideh by proposing a hypothesis test based on the Bregman divergence in order to improve the model selection process. Our approach differs from his. After observing n data points from an unknown density f, we first measure the closeness between the bias-reduced kernel density estimator and the first estimated candidate model, and then between the bias-reduced kernel density estimator and the second estimated candidate model. In both cases we use the Bregman Divergence (BD) and the bias-reduced kernel estimator [3], which focuses on improving the convergence rates of kernel density estimators. Our testing procedure for model selection is thus based on comparing the value of the model selection test statistic to critical values from a standard normal table. We establish the asymptotic properties of the Bregman divergence estimator and deduce approximations of the power functions. The multi-step MLE process is used to estimate the parameters of the models. We illustrate the applicability of the BD with a real data set and with the data generating process (DGP). Monte Carlo simulation and numerical analysis are then used to interpret the results.
Submitted 29 September, 2017;
originally announced September 2017.
-
Overlap Coefficients Based on Kullback-Leibler Divergence: Exponential Populations Case
Authors:
Hamza Dhaker,
Papa Ngom,
Malick Mbodj
Abstract:
This article is devoted to the study of overlap measures of the densities of two exponential populations. Various overlapping coefficients are considered, namely Matusita's measure $ρ$, Morisita's measure $λ$ and Weitzman's measure $Δ$. A new overlap measure $Λ$ based on the Kullback-Leibler measure is proposed. The invariance property and a method of statistical inference for these coefficients are also presented. Taylor series approximations are used to construct confidence intervals for the overlap measures. The bias and mean square error properties of the estimators are studied through a simulation study.
Submitted 9 April, 2017;
originally announced April 2017.
-
Uniform-in-bandwidth consistency for nonparametric estimation of divergence measures
Authors:
Papa Ngom,
Hamza Dhaker,
Pierre Mendy,
El Hadji Deme
Abstract:
We propose nonparametric estimation of divergence measures between continuous distributions. Our approach is based on plug-in kernel-type estimators of density functions. We establish uniform-in-bandwidth consistency for the proposed estimators. As a consequence, their asymptotic 100% confidence intervals are also provided.
Submitted 23 June, 2014;
originally announced June 2014.
-
Comparaison between the two models : new approach using the $α$-divergence
Authors:
Hamza Dhaker,
Papa Ngom,
Pierre Mendy
Abstract:
We propose new nonparametric Rényi-$α$ and Tsallis-$α$ divergence estimators for continuous distributions. We discuss this approach with a view to model selection (random and autoregressive AR(1) models). The estimators are based on an underlying kernel density estimator. Nevertheless, we are able to prove that the estimators are consistent under certain conditions. We also describe how to apply these estimators and demonstrate their effectiveness through numerical experiments.
Submitted 21 January, 2014;
originally announced January 2014.
-
Model selection of stochastic simulation algorithm based on generalized divergence measures
Authors:
Papa Ngom,
Badiassiatta Don Bosco Diatta
Abstract:
MCMC (Markov chain Monte Carlo) methods are a class of methods used to simulate from a probability distribution $P$. They are often used when it is difficult to sample directly from $P$. This distribution is then treated as a target, and a Markov chain $(X_n)_{n\in\mathbb{N}}$ is generated such that, when $n$ is large, $X_n\sim P$. MCMC methods comprise several simulation strategies, including the Independent Sampler (IS), the Random Walk Metropolis-Hastings (RWMH), the Gibbs sampler, the Adaptive Metropolis (AM) and the Metropolis Within Gibbs (MWG) strategies. Each of these strategies generates a Markov chain and is associated with a convergence speed. For a given target law, it is of interest to compare several simulation strategies in order to determine the best one. Chauveau and Vandekerkhove \cite{Chauv2007} compared the IS and RWMH strategies using the Kullback-Leibler divergence measure. In this article we compare the five simulation methods mentioned above using generalized divergence measures. These divergence measures are taken from the family of $α$-divergence measures \cite{Cichocki2010}, with parameter $α$: the Rényi divergence, the Tsallis divergence and the $D_α$ divergence.
Submitted 20 January, 2014;
originally announced January 2014.
-
Minimum penalized Hellinger distance for model selection in small samples
Authors:
Papa Ngom,
Bertrand Ntep
Abstract:
In statistical modeling, the Akaike information criterion (AIC) is a widely known and extensively used tool for model choice. The φ-divergence test statistic is a recently developed tool for statistical model selection. The popularity of the divergence criterion is, however, tempered by its known lack of robustness in small samples. In this paper, penalized minimum Hellinger distance type statistics are considered and some of their properties are established. The limit laws of the estimates and test statistics are given under both the null and the alternative hypotheses, and approximations of the power functions are deduced. A model selection criterion based on these divergence measures is developed for parametric inference. Our interest is in the problem of testing to choose between two models using informational type statistics when independent samples are drawn from a discrete population. We discuss the asymptotic properties and the performance of the new test procedures and investigate their small-sample behavior.
Submitted 27 October, 2011; v1 submitted 14 October, 2011;
originally announced October 2011.