-
Genetic prediction of quantitative traits: a machine learner's guide focused on height
Authors:
Lucie Bourguignon,
Caroline Weis,
Catherine R. Jutzeler,
Michael Adamer
Abstract:
Machine learning and deep learning have achieved many successes in applications to biological problems, especially in the domain of protein folding. Another equally complex and important question has received relatively little attention from the machine learning community: the prediction of complex traits from genetics. Tackling this problem requires in-depth knowledge of the related genetics literature and awareness of various subtleties associated with genetic data. In this guide, we provide an overview for the machine learning community of current state-of-the-art models and the associated subtleties that need to be taken into consideration when developing new models for phenotype prediction. We use height as an example of a continuous-valued phenotype and provide an introduction to benchmark datasets, confounders, feature selection, and common metrics.
Submitted 6 October, 2023;
originally announced October 2023.
-
The magnitude vector of images
Authors:
Michael F. Adamer,
Edward De Brouwer,
Leslie O'Bray,
Bastian Rieck
Abstract:
The magnitude of a finite metric space has recently emerged as a novel invariant quantity that measures the effective size of a metric space. Despite encouraging first results demonstrating the descriptive abilities of the magnitude, such as being able to detect the boundary of a metric space, the potential use cases of magnitude remain under-explored. In this work, we investigate the properties of the magnitude on images, an important data modality in many machine learning applications. By endowing each individual image with its own metric space, we are able to define the concept of magnitude on images and analyse the individual contribution of each pixel with the magnitude vector. In particular, we theoretically show that the previously known properties of boundary detection translate to edge detection abilities in images. Furthermore, we demonstrate practical use cases of magnitude for machine learning applications and propose a novel magnitude model that consists of a computationally efficient magnitude computation and a learnable metric. By doing so, we address the computational hurdle that used to make magnitude impractical for many applications and open the way for the adoption of magnitude in machine learning research.
Submitted 7 October, 2022; v1 submitted 28 October, 2021;
originally announced October 2021.
-
Algebraic Analysis of Rotation Data
Authors:
Michael F. Adamer,
András C. Lőrincz,
Anna-Laura Sattelberger,
Bernd Sturmfels
Abstract:
We develop algebraic tools for statistical inference from samples of rotation matrices. This rests on the theory of D-modules in algebraic analysis. Noncommutative Gröbner bases are used to design numerical algorithms for maximum likelihood estimation, building on the holonomic gradient method of Sei, Shibata, Takemura, Ohara, and Takayama. We study the Fisher model for sampling from rotation matrices, and we apply our algorithms to data from the applied sciences. On the theoretical side, we generalize the underlying equivariant D-modules from SO(3) to arbitrary Lie groups. For compact groups, our D-ideals encode the normalizing constant of the Fisher model.
Submitted 1 December, 2019;
originally announced December 2019.
-
Complexity of Model Testing for Dynamical Systems with Toric Steady States
Authors:
Michael F Adamer,
Martin Helmer
Abstract:
In this paper we investigate the complexity of model selection and model testing for dynamical systems with toric steady states. Such systems frequently arise in the study of chemical reaction networks. We do this by formulating these tasks as a constrained optimization problem in Euclidean space. This optimization problem is known as a Euclidean distance problem; the complexity of solving this problem is measured by an invariant called the Euclidean distance (ED) degree. We determine closed-form expressions for the ED degree of the steady states of several families of chemical reaction networks with toric steady states and arbitrarily many reactions. To illustrate the utility of this work we show how the ED degree can be used as a tool for estimating the computational cost of solving the model testing and model selection problems.
Submitted 6 December, 2019; v1 submitted 24 July, 2017;
originally announced July 2017.