-
Probabilistic Modelling of Multiple Long-Term Condition Onset Times
Authors:
Kieran Richards,
Kelly Fleetwood,
Regina Prigge,
Paolo Missier,
Michael Barnes,
Nick J. Reynolds,
Bruce Guthrie,
Sohan Seth
Abstract:
The co-occurrence of multiple long-term conditions (MLTC), or multimorbidity, in an individual can reduce their lifespan and severely impact their quality of life. Exploring the longitudinal patterns, e.g. clusters, of disease accrual can help better understand the genetic and environmental drivers of multimorbidity, and potentially identify individuals who may benefit from early targeted intervention. We introduce $\textit{probabilistic modelling of onset times}$, or $\texttt{ProMOTe}$, for clustering and forecasting MLTC trajectories. $\texttt{ProMOTe}$ seamlessly learns from incomplete and unreliable disease trajectories that are commonplace in Electronic Health Records but often ignored in existing longitudinal clustering methods. We analyse data from 150,000 individuals in the UK Biobank and identify 50 clusters showing patterns of disease accrual that have also been reported by some recent studies. We further discuss the forecasting capabilities of the model given the history of disease accrual.
Submitted 10 December, 2024;
originally announced December 2024.
-
Imitating Language via Scalable Inverse Reinforcement Learning
Authors:
Markus Wulfmeier,
Michael Bloesch,
Nino Vieillard,
Arun Ahuja,
Jorg Bornschein,
Sandy Huang,
Artem Sokolov,
Matt Barnes,
Guillaume Desjardins,
Alex Bewley,
Sarah Maria Elisabeth Bechtle,
Jost Tobias Springenberg,
Nikola Momchev,
Olivier Bachem,
Matthieu Geist,
Martin Riedmiller
Abstract:
The majority of language model training builds on imitation learning. It covers pretraining, supervised fine-tuning, and affects the starting conditions for reinforcement learning from human feedback (RLHF). The simplicity and scalability of maximum likelihood estimation (MLE) for next token prediction led to its role as the predominant paradigm. However, the broader field of imitation learning can more effectively utilize the sequential structure underlying autoregressive generation. We focus on investigating the inverse reinforcement learning (IRL) perspective on imitation, extracting rewards and directly optimizing sequences instead of individual token likelihoods, and evaluate its benefits for fine-tuning large language models. We provide a new angle, reformulating inverse soft-Q-learning as a temporal difference regularized extension of MLE. This creates a principled connection between MLE and IRL and allows trading off added complexity with increased performance and diversity of generations in the supervised fine-tuning (SFT) setting. We find clear advantages for IRL-based imitation, in particular for retaining diversity while maximizing task performance, rendering IRL a strong alternative on fixed SFT datasets even without online data generation. Our analysis of IRL-extracted reward functions further indicates benefits for more robust reward functions via tighter integration of supervised and preference-based LLM post-training.
Submitted 9 December, 2024; v1 submitted 2 September, 2024;
originally announced September 2024.
-
Mo' States Mo' Problems: Emergency Stop Mechanisms from Observation
Authors:
Samuel Ainsworth,
Matt Barnes,
Siddhartha Srinivasa
Abstract:
In many environments, only a relatively small subset of the complete state space is necessary in order to accomplish a given task. We develop a simple technique using emergency stops (e-stops) to exploit this phenomenon. Using e-stops significantly improves sample complexity by reducing the amount of required exploration, while retaining a performance bound that efficiently trades off the rate of convergence with a small asymptotic sub-optimality gap. We analyze the regret behavior of e-stops and present empirical results in discrete and continuous settings demonstrating that our reset mechanism can provide order-of-magnitude speedups on top of existing reinforcement learning methods.
Submitted 3 December, 2019;
originally announced December 2019.
-
Imitation Learning as $f$-Divergence Minimization
Authors:
Liyiming Ke,
Sanjiban Choudhury,
Matt Barnes,
Wen Sun,
Gilwoo Lee,
Siddhartha Srinivasa
Abstract:
We address the problem of imitation learning with multi-modal demonstrations. Instead of attempting to learn all modes, we argue that in many tasks it is sufficient to imitate any one of them. We show that state-of-the-art methods such as GAIL and behavior cloning, due to their choice of loss function, often incorrectly interpolate between such modes. Our key insight is to minimize the right divergence between the learner and the expert state-action distributions, namely the reverse KL divergence or I-projection. We propose a general imitation learning framework for estimating and minimizing any $f$-divergence. By plugging in different divergences, we are able to recover existing algorithms such as Behavior Cloning (Kullback-Leibler), GAIL (Jensen-Shannon) and DAgger (Total Variation). Empirical results show that our approximate I-projection technique is able to imitate multi-modal behaviors more reliably than GAIL and behavior cloning.
Submitted 31 May, 2020; v1 submitted 30 May, 2019;
originally announced May 2019.
-
On the Interaction Effects Between Prediction and Clustering
Authors:
Matt Barnes,
Artur Dubrawski
Abstract:
Machine learning systems increasingly depend on pipelines of multiple algorithms to provide high quality and well structured predictions. This paper argues interaction effects between clustering and prediction (e.g. classification, regression) algorithms can cause subtle adverse behaviors during cross-validation that may not be initially apparent. In particular, we focus on the problem of estimating the out-of-cluster (OOC) prediction loss given an approximate clustering with probabilistic error rate $p_0$. Traditional cross-validation techniques exhibit significant empirical bias in this setting, and the few attempts to estimate and correct for these effects are intractable on larger datasets. Further, no previous work has been able to characterize the conditions under which these empirical effects occur, and if they do, what properties they have. We precisely answer these questions by providing theoretical properties which hold in various settings, and prove that expected out-of-cluster loss behavior rapidly decays with even minor clustering errors. Fortunately, we are able to leverage these same properties to construct hypothesis tests and scalable estimators necessary for correcting the problem. Empirical results on benchmark datasets validate our theoretical results and demonstrate how scaling techniques provide solutions to new classes of problems.
Submitted 28 December, 2018; v1 submitted 17 July, 2018;
originally announced July 2018.
-
Performance Bounds for Graphical Record Linkage
Authors:
Rebecca C. Steorts,
Matt Barnes,
Willie Neiswanger
Abstract:
Record linkage involves merging records in large, noisy databases to remove duplicate entities. It has become an important area because of its widespread occurrence in bibliometrics, public health, official statistics production, political science, and beyond. Traditional linkage methods directly linking records to one another are computationally infeasible as the number of records grows. As a result, it is increasingly common for researchers to treat record linkage as a clustering task, in which each latent entity is associated with one or more noisy database records. We critically assess performance bounds using the Kullback-Leibler (KL) divergence under a Bayesian record linkage framework, making connections to Kolchin partition models. We provide an upper bound using the KL divergence and a lower bound on the minimum probability of misclassifying a latent entity. We give insights for when our bounds hold using simulated data and provide practical user guidance.
Submitted 7 March, 2017;
originally announced March 2017.
-
Clustering on the Edge: Learning Structure in Graphs
Authors:
Matt Barnes,
Artur Dubrawski
Abstract:
With the recent popularity of graphical clustering methods, there has been an increased focus on the information between samples. We show how learning cluster structure using edge features naturally and simultaneously determines the most likely number of clusters and addresses data scale issues. These results are particularly useful in instances where (a) there are a large number of clusters and (b) we have some labeled edges. Applications in this domain include image segmentation, community discovery and entity resolution. Our model is an extension of the planted partition model and our solution uses results of correlation clustering, which achieves a partition O(log(n))-close to the log-likelihood of the true clustering.
Submitted 5 May, 2016;
originally announced May 2016.
-
A Practioner's Guide to Evaluating Entity Resolution Results
Authors:
Matt Barnes
Abstract:
Entity resolution (ER) is the task of identifying records belonging to the same entity (e.g. individual, group) across one or multiple databases. Ironically, it has multiple names: deduplication and record linkage, among others. In this paper we survey metrics used to evaluate ER results in order to iteratively improve performance and guarantee sufficient quality prior to deployment. Some of these metrics are borrowed from multi-class classification and clustering domains, though some key differences exist differentiating entity resolution from general clustering. Menestrina et al. empirically showed rankings from these metrics often conflict with each other, thus our primary motivation for studying them. This paper provides practitioners the basic knowledge to begin evaluating their entity resolution results.
Submitted 14 September, 2015;
originally announced September 2015.
-
Performance Bounds for Pairwise Entity Resolution
Authors:
Matt Barnes,
Kyle Miller,
Artur Dubrawski
Abstract:
One significant challenge to scaling entity resolution algorithms to massive datasets is understanding how performance changes after moving beyond the realm of small, manually labeled reference datasets. Unlike traditional machine learning tasks, when an entity resolution algorithm performs well on small hold-out datasets, there is no guarantee this performance holds on larger hold-out datasets. We prove simple bounding properties between the performance of a match function on a small validation set and the performance of a pairwise entity resolution algorithm on arbitrarily sized datasets. Thus, our approach enables optimization of pairwise entity resolution algorithms for large datasets, using a small set of labeled data.
Submitted 10 September, 2015;
originally announced September 2015.