-
Profile Likelihood via Optimisation and Differential Equations
Authors:
Yves Deville
Abstract:
Profile likelihood provides a general framework for inference on a scalar parameter of a statistical model. A confidence interval is obtained by numerically finding the two abscissas where the profile log-likelihood curve intersects a horizontal line. An alternative derivation for this interval can be obtained by solving a constrained optimisation problem which can broadly be described as: maximise or minimise the parameter of interest under the constraint that the log-likelihood is high enough. This formulation allows nice geometrical interpretations; it can be used for inference on a parameter or on a known scalar function of the parameter, such as a quantile. Widely available routines for constrained optimisation can be used for this task, as well as Markov Chain Monte Carlo samplers. When the interest is in a smooth function depending on an extra continuous variable, the constrained optimisation framework can be used to derive an Ordinary Differential Equation (ODE) for the confidence limits. This is illustrated with the return levels of Extreme Value models based on the Generalised Extreme Value distribution. Moreover, the same ODE-based technique also applies to the derivation of profile likelihood contours for pairs of parameters. The initial value of the ODE used in the determination of the interval or the contour can itself be obtained by another auxiliary ODE with known initial value, obtained by using the confidence level as the extra continuous variable.
Submitted 3 April, 2024;
originally announced April 2024.
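As a rough illustration of the interval described in the abstract above, the sketch below finds the two abscissas where a log-likelihood curve crosses a horizontal line. It is not the paper's ODE-based method. It assumes a one-parameter exponential model (so the profile log-likelihood coincides with the log-likelihood), a made-up sample, and a hard-coded 0.95 chi-square quantile with one degree of freedom; the function names are hypothetical.

```python
import math

# 0.95 quantile of the chi-square distribution with 1 degree of freedom
CHI2_95_DF1 = 3.841458820694124


def loglik(rate, n, total):
    """Exponential log-likelihood for n observations whose sum is `total`."""
    return n * math.log(rate) - rate * total


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def likelihood_interval(sample, q=CHI2_95_DF1):
    """Return (lower, mle, upper) for the rate of an exponential sample."""
    n, total = len(sample), sum(sample)
    mle = n / total
    # Horizontal line: maximal log-likelihood minus half the chi-square quantile
    level = loglik(mle, n, total) - 0.5 * q

    def g(rate):
        return loglik(rate, n, total) - level

    # Expand brackets on each side of the MLE until the curve is below the line
    lo = mle
    while g(lo) > 0:
        lo /= 2.0
    hi = mle
    while g(hi) > 0:
        hi *= 2.0
    return bisect(g, lo, mle), mle, bisect(g, mle, hi)


# Illustrative (made-up) sample
sample = [0.8, 1.7, 0.3, 2.4, 1.1, 0.6, 3.0, 0.9]
lower, mle, upper = likelihood_interval(sample)
print(lower, mle, upper)
```

With several parameters, the same interval would instead come from maximising and minimising the parameter of interest subject to the log-likelihood staying above `level`, which is the constrained formulation the abstract refers to.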
-
Beyond the density operator and Tr(ρA): Exploiting the higher-order statistics of random-coefficient pure states for quantum information processing
Authors:
Yannick Deville,
Alain Deville
Abstract:
Two types of states are widely used in quantum mechanics, namely (deterministic-coefficient) pure states and statistical mixtures. A density operator can be associated with each of them. We here address a third type of states, that we previously introduced in a more restricted framework. These states generalize pure ones by replacing each of their deterministic ket coefficients by a random variable. We therefore call them Random-Coefficient Pure States, or RCPS. We analyze their properties and their relationships with both types of usual states. We show that RCPS contain much richer information than the density operator and mean of observables that we associate with them. This occurs because the latter operator only exploits the second-order statistics of the random state coefficients, whereas their higher-order statistics contain additional information. That information can be accessed in practice with the multiple-preparation procedure that we propose for RCPS, by using second-order and higher-order statistics of associated random probabilities of measurement outcomes. Exploiting these higher-order statistics opens the way to a very general approach for performing advanced quantum information processing tasks. We illustrate the relevance of this approach with a generic example, dealing with the estimation of parameters of a quantum process and thus related to quantum process tomography. This parameter estimation is performed in the non-blind (i.e. supervised) or blind (i.e. unsupervised) mode. We show that this problem cannot be solved by using only the density operator ρ of an RCPS and the associated mean value Tr(ρA) of the operator A that corresponds to the considered physical quantity. We succeed in solving this problem by exploiting a fourth-order statistical parameter of state coefficients, in addition to second-order statistics. Numerical tests validate this result.
Submitted 21 April, 2022;
originally announced April 2022.
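The contrast between second-order and higher-order statistics in the abstract above can be sketched in a few lines of notation. The symbols below (basis kets $|k\rangle$, random coefficients $c_k$, outcome probabilities $p_k$) and the definition of $\rho$ as an expectation are assumptions made for illustration, not necessarily the paper's own notation.

```latex
% Random-coefficient pure state: the coefficients c_k are random variables
|\psi\rangle = \sum_k c_k \, |k\rangle ,
\qquad \sum_k |c_k|^2 = 1 .

% Associated density operator: depends only on second-order moments of the c_k
\rho = \mathbb{E}\!\left[\, |\psi\rangle\langle\psi| \,\right],
\qquad \rho_{k\ell} = \mathbb{E}\!\left[ c_k \, c_\ell^{*} \right],
\qquad \operatorname{Tr}(\rho A) = \sum_{k,\ell} \rho_{k\ell} \, A_{\ell k} .

% Random probability of outcome k when measuring in the basis {|k>}
p_k = |c_k|^2 ,
\qquad \mathbb{E}[p_k] = \rho_{kk} ,
\qquad \mathbb{E}[p_k^2] = \mathbb{E}\!\left[ |c_k|^4 \right] .
```

Under these assumptions, $\mathbb{E}[p_k^2]$ is a fourth-order moment of the coefficients and is not fixed by $\rho$, which is the kind of extra information the abstract says higher-order statistics provide.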
-
Single-preparation unsupervised quantum machine learning: concepts and applications
Authors:
Yannick Deville,
Alain Deville
Abstract:
The term "machine learning" especially refers to algorithms that derive mappings, i.e. input/output transforms, by using numerical data that provide information about considered transforms. These transforms appear in many problems, related to classification/clustering, regression, system identification, system inversion and input signal restoration/separation. We here first analyze the connections between all these problems, in the classical and quantum frameworks. We then focus on their most challenging versions, involving quantum data and/or quantum processing means, and unsupervised, i.e. blind, learning. Moreover, we propose the quite general concept of SIngle-Preparation Quantum Information Processing (SIPQIP). The resulting methods only require a single instance of each state, whereas usual methods have to very accurately create many copies of each fixed state. We apply our SIPQIP concept to various tasks, related to system identification (blind quantum process tomography or BQPT, blind Hamiltonian parameter estimation or BHPE, blind quantum channel identification/estimation, blind phase estimation), system inversion and state estimation (blind quantum source separation or BQSS, blind quantum entangled state restoration or BQSR, blind quantum channel equalization) and classification. Numerical tests show that our framework moreover yields much more accurate estimation than the standard multiple-preparation approach. Our methods are especially useful in a quantum computer, that we propose to more briefly call a "quamputer": BQPT and BHPE simplify the characterization of the gates of quamputers; BQSS and BQSR allow one to design quantum gates that may be used to compensate for the non-idealities that alter states stored in quantum registers, and they open the way to the much more general concept of self-adaptive quantum gates (see longer version of abstract in paper).
Submitted 5 January, 2021;
originally announced January 2021.
-
Quantum process tomography with unknown single-preparation input states
Authors:
Yannick Deville,
Alain Deville
Abstract:
Quantum Process Tomography (QPT) methods aim at identifying, i.e. estimating, a given quantum process. QPT is a major quantum information processing tool, since it especially allows one to characterize the actual behavior of quantum gates, which are the building blocks of quantum computers. However, usual QPT procedures are complicated, since they set several constraints on the quantum states used as inputs of the process to be characterized. In this paper, we extend QPT so as to avoid two such constraints. On the one hand, usual QPT methods require one to know, hence to precisely control (i.e. prepare), the specific quantum states used as inputs of the considered quantum process, which is cumbersome. We therefore propose a Blind, or unsupervised, extension of QPT (i.e. BQPT), which means that this approach uses input quantum states whose values are unknown and arbitrary, except that they are required to meet some general known properties (and this approach exploits the output states of the considered quantum process). On the other hand, usual QPT methods require one to be able to prepare many copies of the same (known) input state, which is constraining. On the contrary, we propose "single-preparation methods", i.e. methods which can operate with only one instance of each considered input state. These two new concepts are here illustrated with practical BQPT methods which are numerically validated, in the case when: i) random pure states are used as inputs and their required properties are especially related to the statistical independence of the random variables that define them, ii) the considered quantum process is based on cylindrical-symmetry Heisenberg spin coupling. These concepts may be extended to a much wider class of processes and to BQPT methods based on other input quantum state properties.
Submitted 18 September, 2019;
originally announced September 2019.
-
Group kernels for Gaussian process metamodels with categorical inputs
Authors:
Olivier Roustant,
Esperan Padonou,
Yves Deville,
Aloïs Clément,
Guillaume Perrin,
Jean Giorla,
Henry Wynn
Abstract:
Gaussian processes (GP) are widely used as a metamodel for emulating time-consuming computer codes. We focus on problems involving categorical inputs, with a potentially large number L of levels (typically several tens), partitioned into G << L groups of various sizes. Parsimonious covariance functions, or kernels, can then be defined by block covariance matrices T with constant covariances between pairs of blocks and within blocks. We study the positive definiteness of such matrices to encourage their practical use. The hierarchical group/level structure, equivalent to a nested Bayesian linear model, provides a parameterization of valid block matrices T. The same model can then be used when the assumption within blocks is relaxed, giving a flexible parametric family of valid covariance matrices with constant covariances between pairs of blocks. The positive definiteness of T is equivalent to the positive definiteness of a smaller matrix of size G, obtained by averaging each block. The model is applied to a problem in nuclear waste analysis, where one of the categorical inputs is atomic number, which has more than 90 levels.
Submitted 24 July, 2018; v1 submitted 7 February, 2018;
originally announced February 2018.
-
Inertia-Constrained Pixel-by-Pixel Nonnegative Matrix Factorisation: a Hyperspectral Unmixing Method Dealing with Intra-class Variability
Authors:
Charlotte Revel,
Yannick Deville,
Véronique Achard,
Xavier Briottet
Abstract:
Blind source separation is a common processing tool to analyse the constitution of pixels of hyperspectral images. Such methods usually suppose that pure pixel spectra (endmembers) are the same throughout the image for each class of materials. In the framework of remote sensing, such an assumption is no longer valid in the presence of intra-class variabilities due to illumination conditions, weathering, slight variations of the pure materials, etc. In this paper, we first describe the results of investigations highlighting intra-class variability measured in real images. Considering these results, a new formulation of the linear mixing model is presented leading to two new methods. Unconstrained Pixel-by-pixel NMF (UP-NMF) is a new blind source separation method based on the assumption of a linear mixing model, which can deal with intra-class variability. To overcome UP-NMF limitations an extended method is proposed, named Inertia-constrained Pixel-by-pixel NMF (IP-NMF). For each sensed spectrum, these extended versions of NMF extract a corresponding set of source spectra. A constraint is set to limit the spreading of each source's estimates in IP-NMF. The methods are tested on a semi-synthetic data set built with spectra extracted from a real hyperspectral image and then numerically mixed. We thus demonstrate the interest of our methods for realistic source variabilities. Finally, IP-NMF is tested on a real data set and it is shown to yield better performance than state-of-the-art methods.
Submitted 24 February, 2017;
originally announced February 2017.
-
Correction to: "Blind maximum likelihood separation of a linear-quadratic mixture"
Authors:
Shahram Hosseini,
Yannick Deville
Abstract:
An error occurred in the computation of a gradient in our paper entitled "Blind maximum likelihood separation of a linear-quadratic mixture", presented in ICA'2004. The equations (20) in Appendix and (17) in the text were not correct. The current paper presents the correct version of these equations.
Submitted 6 January, 2010;
originally announced January 2010.
-
Effect of indirect dependencies on "A mutual information minimization approach for a class of nonlinear recurrent separating systems"
Authors:
Yannick Deville,
Alain Deville,
Shahram Hosseini
Abstract:
In a recent paper [4], Duarte and Jutten investigated the Blind Source Separation (BSS) problem, for the nonlinear mixing model that they introduced in that paper. They proposed to solve this problem by using information-theoretic tools, more precisely by minimizing the mutual information (MI) of the outputs of the separating structure. When applying the MI approach to BSS problems, one usually determines the analytical expressions of the derivatives of the MI with respect to the parameters of the considered separating model. In the literature, these calculations were mainly reported for linear mixtures up to now. They are more complex for nonlinear mixtures, due to dependencies between the considered quantities. Moreover, the notations commonly employed by the BSS community in such calculations may become misleading when using them for nonlinear mixtures, due to the above-mentioned dependencies. We claim that the calculations reported in [4] contain an error, because they did not take into account all these dependencies. In this document, we therefore explain this phenomenon, by showing the effect of indirect dependencies on the application of the MI approach to the mixing and separating models considered in [4]. We thus introduce a corrected expression of the gradient of the considered BSS criterion based on MI. This correct gradient may then e.g. be used to optimize the adaptive coefficients of the considered separating system by means of the well-known gradient descent algorithm. As explained hereafter, this investigation has some similarities with an analysis that we previously reported in another arXiv document [3]. However, these two investigations concern different problems (mixture and separating structure, mathematical tools: see paper).
Submitted 25 October, 2009; v1 submitted 23 October, 2009;
originally announced October 2009.
-
Effect of indirect dependencies on "Maximum likelihood blind separation of two quantum states (qubits) with cylindrical-symmetry Heisenberg spin coupling"
Authors:
Yannick Deville,
Alain Deville
Abstract:
In a previous paper [1], we investigated the Blind Source Separation (BSS) problem, for the nonlinear mixing model that we introduced in that paper. We proposed to solve this problem by using a maximum likelihood (ML) approach. When applying the ML approach to BSS problems, one usually determines the analytical expressions of the derivatives of the log-likelihood with respect to the parameters of the considered mixing model. In the literature, these calculations were mainly considered for linear mixtures up to now. They are more complex for nonlinear mixtures, due to dependencies between the considered quantities. Moreover, the notations commonly employed by the BSS community in such calculations may become misleading when using them for nonlinear mixtures, due to the above-mentioned dependencies. In this document, we therefore explain this phenomenon, by showing the effect of indirect dependencies on the application of the ML approach to the mixing model considered in [1]. This yields the explicit expression of the complete derivative of the log-likelihood associated with that mixing model.
Submitted 30 May, 2009;
originally announced June 2009.
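The distinction between a partial and a complete derivative mentioned in the abstract above can be illustrated generically. The symbols below ($\mathcal{L}$ for the log-likelihood, $\theta$ for a model parameter, $u_i(\theta)$ for intermediate quantities that themselves depend on $\theta$) are assumptions chosen for illustration and are not the paper's equations.

```latex
% Log-likelihood depending on theta directly and through u_1(theta), ..., u_m(theta)
\mathcal{L} = \mathcal{L}\bigl(\theta,\, u_1(\theta), \dots, u_m(\theta)\bigr)

% Complete (total) derivative: direct term plus the indirect dependencies
\frac{d\mathcal{L}}{d\theta}
  = \frac{\partial \mathcal{L}}{\partial \theta}
  + \sum_{i=1}^{m} \frac{\partial \mathcal{L}}{\partial u_i}\,
    \frac{d u_i}{d\theta}
```

Keeping only the first term amounts to ignoring the indirect dependencies, which is harmless when the $u_i$ do not vary with $\theta$ but gives an incorrect gradient otherwise.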