-
Implicit Gaussian process representation of vector fields over arbitrary latent manifolds
Authors:
Robert L. Peach,
Matteo Vinao-Carl,
Nir Grossman,
Michael David,
Emma Mallas,
David Sharp,
Paresh A. Malhotra,
Pierre Vandergheynst,
Adam Gosztolai
Abstract:
Gaussian processes (GPs) are popular nonparametric statistical models for learning unknown functions and quantifying the spatiotemporal uncertainty in data. Recent works have extended GPs to model scalar and vector quantities distributed over non-Euclidean domains, including smooth manifolds appearing in numerous fields such as computer vision, dynamical systems, and neuroscience. However, these approaches assume that the manifold underlying the data is known, limiting their practical utility. We introduce RVGP, a generalisation of GPs for learning vector signals over latent Riemannian manifolds. Our method uses positional encoding with eigenfunctions of the connection Laplacian, associated with the tangent bundle, readily derived from a common graph-based approximation of the data. We demonstrate that RVGP possesses global regularity over the manifold, which allows it to super-resolve and inpaint vector fields while preserving singularities. Furthermore, we use RVGP to reconstruct high-density neural dynamics derived from low-density EEG recordings in healthy individuals and Alzheimer's patients. We show that vector field singularities are important disease markers and that their reconstruction leads to a classification accuracy of disease states comparable to that of high-density recordings. Thus, our method overcomes a significant practical limitation in experimental and clinical applications.
Submitted 17 January, 2024; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Modeling Inter-Dependence Between Time and Mark in Multivariate Temporal Point Processes
Authors:
Govind Waghmare,
Ankur Debnath,
Siddhartha Asthana,
Aakarsh Malhotra
Abstract:
Temporal Point Processes (TPP) are probabilistic generative frameworks. They model discrete event sequences localized in continuous time. Generally, real-life events reveal descriptive information, known as marks. Marked TPPs model time and marks of the event together for practical relevance. Conditioned on past events, marked TPPs aim to learn the joint distribution of the time and the mark of the next event. For simplicity, conditionally independent TPP models assume time and marks are independent given event history. They factorize the conditional joint distribution of time and mark into the product of individual conditional distributions. This structural limitation in the design of TPP models hurts the predictive performance on entangled time and mark interactions. In this work, we model the conditional inter-dependence of time and mark to overcome the limitations of conditionally independent models. We construct a multivariate TPP conditioning the time distribution on the current event mark in addition to past events. Besides the conventional intensity-based models for conditional joint distribution, we also draw on flexible intensity-free TPP models from the literature. The proposed TPP models outperform conditionally independent and dependent models in standard prediction tasks. Our experimentation on various datasets with multiple evaluation metrics highlights the merit of the proposed approach.
Submitted 23 November, 2024; v1 submitted 27 October, 2022;
originally announced October 2022.
-
A hybrid econometric-machine learning approach for relative importance analysis: Prioritizing food policy
Authors:
Akash Malhotra
Abstract:
A measure of relative importance of variables is often desired by researchers when the explanatory aspects of econometric methods are of interest. To this end, the author briefly reviews the limitations of conventional econometrics in constructing a reliable measure of variable importance. The author highlights the relative stature of explanatory and predictive analysis in economics and the emergence of fruitful collaborations between econometrics and computer science. Learning lessons from both, the author proposes a hybrid approach based on conventional econometrics and advanced machine learning (ML) algorithms, which are otherwise used in predictive analytics. The purpose of this article is two-fold: to propose a hybrid approach to assess relative importance and demonstrate its applicability in addressing policy priority issues with an example of food inflation in India, and, more broadly, to introduce the possibility of conflation of ML and conventional econometrics to an audience of researchers in economics and social sciences in general.
Submitted 22 August, 2020; v1 submitted 9 June, 2018;
originally announced June 2018.
-
Understanding food inflation in India: A Machine Learning approach
Authors:
Akash Malhotra,
Mayank Maloo
Abstract:
Over the past decade, the stellar growth of the Indian economy has been challenged by persistently high levels of inflation, particularly in food prices. The primary reason behind this stubborn food inflation is a mismatch between supply and demand, as domestic agricultural production has failed to keep up with rising demand owing to a number of proximate factors. The relative significance of these factors in determining the change in food prices has been analysed using gradient boosted regression trees (BRT), a machine learning technique. The results from BRT indicate that all predictor variables are fairly significant in explaining the change in food prices, with MSP and farm wages being relatively more important than others. International food prices were found to have limited relevance in explaining the variation in domestic food prices. The challenge of ensuring food and nutritional security for a growing Indian population with rising incomes needs to be addressed through resolute policy reforms.
Submitted 30 January, 2017;
originally announced January 2017.
-
A Statistical Analysis of Bowling Performance in Cricket
Authors:
Akash Malhotra,
Shailesh Krishna
Abstract:
There is a widespread notion in the cricketing world that the performance of a bowler improves with increasing pace. Additionally, many commentators believe lower order batters to be more vulnerable to pace. The present study puts these two ubiquitous notions to the test by statistically analysing the differences in performance of bowlers from three subpopulations based on average release velocities. Results from one-way ANOVA reveal that faster bowlers perform better in terms of Average and Strike-rate, but show no significant differences in the case of Economy rate and CBR. Lower and Middle order batsmen were found to be more vulnerable against faster bowling. However, there was no statistically significant difference in performance of Fast and Fast-Medium bowlers against a top-order batter.
Submitted 16 January, 2017;
originally announced January 2017.
-
A simulations approach for meta-analysis of genetic association studies based on additive genetic model
Authors:
Majnu John,
Todd Lencz,
Anil K Malhotra,
Christoph U Correll,
Jian-Ping Zhang
Abstract:
Genetic association studies are becoming an important component of medical research. To cite one instance, pharmacogenomics, which is gaining prominence as a useful tool for personalized medicine, is heavily reliant on results from genetic association studies. Meta-analysis of genetic association studies is being increasingly used to assess phenotypic differences between genotype groups. When the underlying genetic model is assumed to be dominant or recessive, assessing the phenotype differences based on summary statistics, reported for individual studies in a meta-analysis, is a valid strategy. However, when the genetic model is additive, a similar strategy based on summary statistics will lead to biased results. This fact about the additive model is one of the things that we establish in this paper, using simulations. The main goal of this paper is to present an alternate strategy for the additive model based on simulating data for the individual studies. We show that the alternate strategy is far superior to the strategy based on summary statistics.
Submitted 29 December, 2016;
originally announced December 2016.