-
Universum GANs: Improving GANs through contradictions
Authors:
Sauptik Dhar,
Javad Heydari,
Samarth Tripathi,
Unmesh Kurup,
Mohak Shah
Abstract:
Limited availability of labeled data makes any supervised learning problem challenging. Alternative learning settings like semi-supervised and universum learning alleviate the dependency on labeled data, but still require a large amount of unlabeled data, which may be unavailable or expensive to acquire. GAN-based data generation methods have recently shown promise by generating synthetic samples to improve learning. However, most existing GAN-based approaches either provide poor discriminator performance under limited labeled-data settings or result in low-quality generated data. In this paper, we propose a Universum GAN game which provides improved discriminator accuracy under limited data settings, while generating high-quality realistic data. We further propose an evolving discriminator loss which improves its convergence and generalization performance. We derive the theoretical guarantees and provide empirical results in support of our approach.
Submitted 20 September, 2022; v1 submitted 18 June, 2021;
originally announced June 2021.
-
Active Sampling for the Quickest Detection of Markov Networks
Authors:
Javad Heydari,
Ali Tajer,
H. Vincent Poor
Abstract:
Consider $n$ random variables forming a Markov random field (MRF). The true model of the MRF is unknown, and it is assumed to belong to a binary set. The objective is to sequentially sample the random variables (one-at-a-time) such that the true MRF model can be detected with the fewest number of samples, while in parallel, the decision reliability is controlled. The core element of an optimal decision process is a rule for selecting and sampling the random variables over time. Such a process, at every time instant and adaptively to the collected data, selects the random variable that is expected to be most informative about the model, rendering an overall minimized number of samples required for reaching a reliable decision. The existing studies on detecting MRF structures generally sample the entire network at the same time and focus on designing optimal detection rules without regard to the data-acquisition process. This paper characterizes the sampling process for general MRFs, which, in conjunction with the sequential probability ratio test, is shown to be optimal in the asymptote of large $n$. The critical insight in designing the sampling process is devising an information measure that captures the decisions' inherent statistical dependence over time. Furthermore, when the MRFs can be modeled by acyclic probabilistic graphical models, the sampling rule is shown to take a computationally simple form. Performance analysis for the general case is provided, and the results are interpreted in several special cases: Gaussian MRFs, non-asymptotic regimes, the connection to Chernoff's rule for controlled (active) sensing, and the problem of cluster detection.
Submitted 2 August, 2020; v1 submitted 12 November, 2017;
originally announced November 2017.
-
Quickest Localization of Anomalies in Power Grids: A Stochastic Graphical Framework
Authors:
Javad Heydari,
Ali Tajer
Abstract:
Agile localization of anomalous events plays a pivotal role in enhancing the overall reliability of the grid and avoiding cascading failures. This is especially significant in large-scale grids due to their geographical expanse and the large volume of data generated. This paper proposes a stochastic graphical framework that aims to localize anomalies with the minimum amount of data. This framework capitalizes on the strong correlation structures observed among the measurements collected from different buses. The proposed approach, at its core, collects the measurements sequentially and progressively updates its decision about the location of the anomaly. The process continues until the location of the anomaly can be identified with the desired reliability. We provide a general theory for the quickest anomaly localization and also investigate its application to quickest line outage localization. Simulations in the IEEE 118-bus model are provided to establish the gains of the proposed approach.
Submitted 6 February, 2017;
originally announced February 2017.
-
Bayesian hierarchical modelling for inferring genetic interactions in yeast
Authors:
Jonathan Heydari,
Conor Lawless,
David A. Lydall,
Darren J. Wilkinson
Abstract:
Quantitative Fitness Analysis (QFA) is a high-throughput experimental and computational methodology for measuring the growth of microbial populations. QFA screens can be used to compare the health of cell populations with and without a mutation in a query gene in order to infer genetic interaction strengths genome-wide, examining thousands of separate genotypes. We introduce Bayesian, hierarchical models of population growth rates and genetic interactions that better reflect QFA experimental design than current approaches. Our new approach models population dynamics and genetic interaction simultaneously, thereby avoiding passing information between models via a univariate fitness summary. Matching experimental structure more closely, Bayesian hierarchical approaches use data more efficiently and find new evidence for genes which interact with yeast telomeres within a published dataset.
Submitted 14 August, 2015;
originally announced August 2015.
-
Bayesian hierarchical modelling for inferring genetic interactions in yeast
Authors:
Jonathan Heydari
Abstract:
Identifying genetic interactions for a given microorganism such as yeast is difficult. Quantitative Fitness Analysis (QFA) is a high-throughput experimental and computational methodology for quantifying the fitness of microbial cultures. QFA can be used to compare fitness observations for different genotypes and thereby infer genetic interaction strengths. Current "naive" frequentist statistical approaches used in QFA do not model between-genotype variation or differences in genotype variation under different conditions. In this thesis, a Bayesian approach is introduced to evaluate hierarchical models that better reflect the structure or design of QFA experiments. First, a two-stage approach is presented: a hierarchical logistic model is fitted to microbial culture growth curves and then a hierarchical interaction model is fitted to fitness summaries inferred for each genotype. Next, a one-stage Bayesian approach is presented: a joint hierarchical model which does not require a univariate summary of fitness to pass information between models. The new hierarchical approaches are then compared using a dataset examining the effect of telomere defects on yeast. By better describing the experimental structure, new evidence is found for genes and complexes which interact with the telomere cap. Various extensions of these models, including models for data transformation, batch effects, and intrinsically stochastic growth models, are also considered.
Submitted 27 May, 2014;
originally announced May 2014.
-
Fast Bayesian parameter estimation for stochastic logistic growth models
Authors:
Jonathan Heydari,
Conor Lawless,
David A. Lydall,
Darren J. Wilkinson
Abstract:
The transition density of a stochastic, logistic population growth model with multiplicative intrinsic noise is analytically intractable. Inferring model parameter values by fitting such stochastic differential equation (SDE) models to data therefore requires relatively slow numerical simulation. Where such simulation is prohibitively slow, an alternative is to use model approximations which do have an analytically tractable transition density, enabling fast inference. We introduce two such approximations, with either multiplicative or additive intrinsic noise, each derived from the linear noise approximation of the logistic growth SDE. After Bayesian inference we find that our fast LNA models, using Kalman filter recursion for computation of marginal likelihoods, give similar posterior distributions to slow arbitrarily exact models. We also demonstrate that simulations from our LNA models better describe the characteristics of the stochastic logistic growth models than a related approach. Finally, we demonstrate that our LNA model with additive intrinsic noise and measurement error best describes an example set of longitudinal observations of microbial population size taken from a typical, genome-wide screening experiment.
Submitted 26 October, 2013; v1 submitted 21 October, 2013;
originally announced October 2013.