Aggregation as Unsupervised Learning and its Evaluation
Authors:
Maria Ulan,
Welf Löwe,
Morgan Ericsson,
Anna Wingkvist
Abstract:
Regression uses supervised machine learning to find a model that combines several independent variables to predict a dependent variable based on ground truth (labeled) data, i.e., tuples of independent and dependent variables (labels). Similarly, aggregation also combines several independent variables into a dependent variable. The dependent variable should preserve properties of the independent variables, e.g., the ranking or relative distance of the independent variable tuples, and/or represent a latent ground truth that is a function of these independent variables. However, ground truth data is not available for finding the aggregation model. Consequently, aggregation models are either data-agnostic or can only be derived with unsupervised machine learning approaches.
We introduce a novel unsupervised aggregation approach based on intrinsic properties of unlabeled training data, such as the cumulative probability distributions of the single independent variables and their mutual dependencies.
We present an empirical evaluation framework that allows assessing the proposed approach against other aggregation approaches from two perspectives: (i) how well the aggregation output represents properties of the input tuples, and (ii) how well the aggregated output can predict a latent ground truth. To this end, we use data sets for assessing supervised regression approaches that contain explicit ground truth labels. The ground truth is not used for deriving the aggregation models, but it allows for the assessment from perspective (ii). More specifically, we use regression data sets from the UCI machine learning repository and benchmark several data-agnostic and unsupervised approaches for aggregation against ours.
The benchmark results indicate that our approach outperforms the other data-agnostic and unsupervised aggregation approaches. It is almost on par with linear regression.
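As a rough illustration of what aggregation based on cumulative probability distributions could look like, the following minimal sketch maps each independent variable to its empirical CDF value and averages the results per tuple. This is not the authors' method: the function names `ecdf_scores` and `aggregate`, the equal weighting, and the choice to ignore mutual dependencies between variables are all simplifying assumptions made for this example.

```python
import numpy as np

def ecdf_scores(column):
    """Empirical CDF value of each entry: fraction of entries <= that entry."""
    sorted_col = np.sort(column)
    return np.searchsorted(sorted_col, column, side="right") / len(column)

def aggregate(data):
    """Average the per-variable ECDF scores into one value per row.

    Assumption: equal weights and no dependency modelling (hypothetical
    simplification, not taken from the paper).
    """
    scores = np.column_stack(
        [ecdf_scores(data[:, j]) for j in range(data.shape[1])]
    )
    return scores.mean(axis=1)

# Unlabeled toy data: three tuples of two independent variables.
data = np.array([[1.0, 10.0],
                 [2.0, 30.0],
                 [3.0, 20.0]])
agg = aggregate(data)
print(agg)
```

Because the ECDF transform puts every variable on a common [0, 1] scale, the averaged score does not depend on the original units of the inputs, which is one reason a distribution-based aggregation may be preferred over a plain mean.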
Submitted 28 October, 2021;
originally announced October 2021.
Integrability of geodesics of totally geodesic metrics
Authors:
Radosław A. Kycia,
Maria Ułan
Abstract:
An analysis of the geodesics in the space of signature $(1,3)$ that splits into two-dimensional distributions resulting from the Weyl tensor eigenspaces - hyperbolic and elliptic ones - described in [V. Lychagin, V. Yumaguzhin, \emph{Differential invariants and exact solutions of the Einstein equations}, Anal.Math.Phys. 1664-235X 1-9 (2016)] is presented. Cases in which the geodesic equations are integrable are identified. A similar analysis is performed for the same model coupled to electromagnetism, described in [V. Lychagin, V. Yumaguzhin, \emph{Differential invariants and exact solutions of the Einstein-Maxwell equation}, Anal.Math.Phys. 1, 19--29, (2017)].
Submitted 1 October, 2018;
originally announced October 2018.