-
Scalable Meta-Learning with Gaussian Processes
Authors:
Petru Tighineanu,
Lukas Grossberger,
Paul Baireuther,
Kathrin Skubch,
Stefan Falkner,
Julia Vinogradska,
Felix Berkenkamp
Abstract:
Meta-learning is a powerful approach that exploits historical data to quickly solve new tasks from the same distribution. In the low-data regime, methods based on the closed-form posterior of Gaussian processes (GP) together with Bayesian optimization have achieved high performance. However, these methods are either computationally expensive or introduce assumptions that hinder a principled propagation of uncertainty between task models. This may disrupt the balance between exploration and exploitation during optimization. In this paper, we develop ScaML-GP, a modular GP model for meta-learning that is scalable in the number of tasks. Our core contribution is a carefully designed multi-task kernel that enables hierarchical training and task scalability. Conditioning ScaML-GP on the meta-data exposes its modular nature yielding a test-task prior that combines the posteriors of meta-task GPs. In synthetic and real-world meta-learning experiments, we demonstrate that ScaML-GP can learn efficiently both with few and many meta-tasks.
Submitted 1 December, 2023;
originally announced December 2023.
-
Transfer Learning with Gaussian Processes for Bayesian Optimization
Authors:
Petru Tighineanu,
Kathrin Skubch,
Paul Baireuther,
Attila Reiss,
Felix Berkenkamp,
Julia Vinogradska
Abstract:
Bayesian optimization is a powerful paradigm to optimize black-box functions based on scarce and noisy data. Its data efficiency can be further improved by transfer learning from related tasks. While recent transfer models meta-learn a prior based on a large amount of data, in the low-data regime, methods that exploit the closed-form posterior of Gaussian processes (GPs) have an advantage. In this setting, several analytically tractable transfer-model posteriors have been proposed, but the relative advantages of these methods are not well understood. In this paper, we provide a unified view on hierarchical GP models for transfer learning, which allows us to analyze the relationship between methods. As part of the analysis, we develop a novel closed-form boosted GP transfer model that fits between existing approaches in terms of complexity. We evaluate the performance of the different approaches in large-scale experiments and highlight strengths and weaknesses of the different transfer-learning methods.
Submitted 15 March, 2022; v1 submitted 22 November, 2021;
originally announced November 2021.
-
Multi-Class Multi-Instance Count Conditioned Adversarial Image Generation
Authors:
Amrutha Saseendran,
Kathrin Skubch,
Margret Keuper
Abstract:
Image generation has rapidly evolved in recent years. Modern architectures for adversarial training make it possible to generate even high-resolution images with remarkable quality. At the same time, more and more effort is dedicated towards controlling the content of generated images. In this paper, we take one further step in this direction and propose a conditional generative adversarial network (GAN) that generates images with a defined number of objects from given classes. This entails two fundamental abilities: (1) being able to generate high-quality images given a complex constraint and (2) being able to count object instances per class in a given image. Our proposed model modularly extends the successful StyleGAN2 architecture with a count-based conditioning as well as with a regression sub-network to count the number of generated objects per class during training. In experiments on three different datasets, we show that the proposed model learns to generate images according to the given multiple-class count condition even in the presence of complex backgrounds. In particular, we propose a new dataset, CityCount, which is derived from the Cityscapes street scenes dataset, to evaluate our approach in a challenging and practically relevant scenario.
Submitted 31 March, 2021;
originally announced March 2021.
-
Core forging and local limit theorems for the k-core of random graphs
Authors:
Amin Coja-Oghlan,
Oliver Cooley,
Mihyun Kang,
Kathrin Skubch
Abstract:
We establish a multivariate local limit theorem for the order and size as well as several other parameters of the k-core of the Erdős-Rényi graph. The proof is based on a novel approach to the k-core problem that replaces the meticulous analysis of the peeling process by a generative model of graphs with a core of a given order and size. The generative model, which is inspired by the Warning Propagation message passing algorithm, facilitates the direct study of properties of the core and its connections with the mantle and should therefore be of interest in its own right.
Submitted 1 September, 2017; v1 submitted 12 July, 2017;
originally announced July 2017.
-
Limits of discrete distributions and Gibbs measures on random graphs
Authors:
Amin Coja-Oghlan,
Will Perkins,
Kathrin Skubch
Abstract:
Building upon the theory of graph limits and the Aldous-Hoover representation and inspired by Panchenko's work on asymptotic Gibbs measures (Annals of Probability 2013), we construct continuous embeddings of discrete probability distributions. We show that the theory of graph limits induces a meaningful notion of convergence and derive a corresponding version of the Szemerédi regularity lemma. Moreover, complementing recent work (Bapst et al. 2015), we apply these results to Gibbs measures induced by sparse random factor graphs and verify the "replica symmetric solution" predicted in the physics literature under the assumption of non-reconstruction.
Submitted 21 December, 2015;
originally announced December 2015.
-
The core in random hypergraphs and local weak convergence
Authors:
Kathrin Skubch
Abstract:
The degree of a vertex in a hypergraph is defined as the number of edges incident to it. In this paper we study the $k$-core, defined as the maximal induced subhypergraph of minimum degree $k$, of the random $r$-uniform hypergraph $H_r(n,p)$ for $r\geq 3$. We consider the case $k\geq 2$ and $p=d/n^{r-1}$ for which every vertex has fixed average degree $d>0$. We derive a multi-type branching process that describes the local structure of the $k$-core together with the mantle, i.e. the vertices outside the core.
Submitted 12 November, 2017; v1 submitted 6 November, 2015;
originally announced November 2015.
-
The minimum bisection in the planted bisection model
Authors:
Amin Coja-Oghlan,
Oliver Cooley,
Mihyun Kang,
Kathrin Skubch
Abstract:
In the planted bisection model a random graph $G(n,p_+,p_- )$ with $n$ vertices is created by partitioning the vertices randomly into two classes of equal size (up to $\pm1$). Any two vertices that belong to the same class are linked by an edge with probability $p_+$ and any two that belong to different classes with probability $p_- <p_+$ independently. The planted bisection model has been used extensively to benchmark graph partitioning algorithms. If $p_{\pm} =2d_{\pm} /n$ for numbers $0\leq d_- <d_+ $ that remain fixed as $n\to\infty$, then w.h.p. the ``planted'' bisection (the one used to construct the graph) will not be a minimum bisection. In this paper we derive an asymptotic formula for the minimum bisection width under the assumption that $d_+ -d_- >c\sqrt{d_+ \ln d_+ }$ for a certain constant $c>0$.
Submitted 12 May, 2015;
originally announced May 2015.
-
How does the core sit inside the mantle?
Authors:
Amin Coja-Oghlan,
Oliver Cooley,
Mihyun Kang,
Kathrin Skubch
Abstract:
The $k$-core, defined as the largest subgraph of minimum degree $k$, of the random graph $G(n,p)$ has been studied extensively. In a landmark paper Pittel, Wormald and Spencer [JCTB 67 (1996) 111--151] determined the threshold $d_k$ for the appearance of an extensive $k$-core. Here we derive a multi-type Galton-Watson branching process that describes precisely how the $k$-core is embedded into the random graph for any $k\geq3$ and any fixed average degree $d=np>d_k$. This generalises prior results on, e.g., the internal structure of the $k$-core.
Submitted 31 March, 2015;
originally announced March 2015.