-
Learning from data with structured missingness
Authors:
Robin Mitra,
Sarah F. McGough,
Tapabrata Chakraborti,
Chris Holmes,
Ryan Copping,
Niels Hagenbuch,
Stefanie Biedermann,
Jack Noonan,
Brieuc Lehmann,
Aditi Shenvi,
Xuan Vinh Doan,
David Leslie,
Ginestra Bianconi,
Ruben Sanchez-Garcia,
Alisha Davies,
Maxine Mackintosh,
Eleni-Rosalina Andrinopoulou,
Anahid Basiri,
Chris Harbron,
Ben D. MacArthur
Abstract:
Missing data are an unavoidable complication in many machine learning tasks. When data are 'missing at random' there exists a range of tools and techniques to deal with the issue. However, as machine learning studies become more ambitious, and seek to learn from ever-larger volumes of heterogeneous data, an increasingly encountered problem arises in which missing values exhibit an association or structure, either explicitly or implicitly. Such 'structured missingness' raises a range of challenges that have not yet been systematically addressed, and presents a fundamental hindrance to machine learning at scale. Here, we outline the current literature and propose a set of grand challenges in learning from data with structured missingness.
Submitted 3 April, 2023;
originally announced April 2023.
-
Where the Bee Sucks -- A Dynamic Bayesian Network Approach to Decision Support for Pollinator Abundance Strategies
Authors:
Martine J. Barons,
Aditi Shenvi
Abstract:
For policymakers wishing to make evidence-based decisions, one of the challenges is how to combine the relevant information and evidence in a coherent and defensible manner in order to formulate and evaluate candidate policies. Policymakers often need to rely on experts with disparate fields of expertise when making policy choices in complex, multi-faceted, dynamic environments such as those dealing with ecosystem services. The pressures affecting the survival and pollination capabilities of honey bees (Apis mellifera), wild bees and other pollinators are well documented, though the evidence remains incomplete. In order to estimate the potential effectiveness of various candidate policies to support pollination services, there is an urgent need to quantify the effect of various combinations of variables on the pollination ecosystem service, utilising available information, models and expert judgement. In this paper, we present a new application of the integrating decision support system methodology for combining inputs from multiple panels of experts to evaluate policies to support an abundant pollinator population.
Submitted 5 December, 2022;
originally announced December 2022.
-
A Bayesian decision support system for counteracting activities of terrorist groups
Authors:
Aditi Shenvi,
F. Oliver Bunnin,
Jim Q. Smith
Abstract:
Activities of terrorist groups present a serious threat to the security and well-being of the general public. Counterterrorism authorities aim to identify and frustrate the plans of terrorist groups before they are put into action. Whilst the activities of terrorist groups are likely to be hidden and disguised, the members of such groups need to communicate and coordinate to organise their activities. Such observable behaviour and communications data can be utilised by the authorities to estimate the threat posed by a terrorist group. However, to be credible, any such statistical model needs to fold in the level of threat posed by each member of the group. Unlike in other benign forms of social networks, considering the members of terrorist groups as exchangeable gives an incomplete picture of the combined capacity of the group to do harm. Here we develop a Bayesian integrating decision support system that can bring together information relating to each of the members of a terrorist group as well as the combined activities of the group.
Submitted 16 December, 2021; v1 submitted 8 July, 2020;
originally announced July 2020.
-
Propagation for Dynamic Continuous Time Chain Event Graphs
Authors:
Aditi Shenvi,
Jim Q. Smith
Abstract:
Chain Event Graphs (CEGs) are a family of event-based graphical models that represent context-specific conditional independences typically exhibited by asymmetric state space problems. The class of continuous time dynamic CEGs (CT-DCEGs) provides a factored representation of longitudinally evolving trajectories of a process in continuous time. Temporal evidence in a CT-DCEG introduces dependence between its transition and holding time distributions. We present a tractable exact inferential scheme analogous to the scheme in Kjærulff (1992) for discrete Dynamic Bayesian Networks (DBNs) which employs standard junction tree inference by "unrolling" the DBN. To enable this scheme, we present an extension of the standard CEG propagation algorithm (Thwaites et al., 2008). Interestingly, the CT-DCEG benefits from simplification of its graph on observing compatible evidence while preserving the still-relevant symmetries within the asymmetric network. Our results indicate that the CT-DCEG is preferred to DBNs and continuous time BNs in contexts involving significant asymmetry and a natural total ordering of the process evolution.
Submitted 29 June, 2020;
originally announced June 2020.
-
Constructing a Chain Event Graph from a Staged Tree
Authors:
Aditi Shenvi,
Jim Q. Smith
Abstract:
Chain Event Graphs (CEGs) are a recent family of probabilistic graphical models - a generalisation of Bayesian Networks - providing an explicit representation of structural zeros, structural missing values and context-specific conditional independences within their graph topology. A CEG is constructed from an event tree through a sequence of transformations beginning with the colouring of the vertices of the event tree to identify one-step transition symmetries. This coloured event tree, also known as a staged tree, is the output of the learning algorithms used for this family. Surprisingly, no general algorithm has yet been devised that automatically transforms any staged tree into a CEG representation. In this paper we provide a simple iterative backward algorithm for this transformation. Additionally, we show that no information is lost from transforming a staged tree into a CEG. Finally, we demonstrate that with an optimal stopping criterion, our algorithm is more efficient than the generalisation of a special case presented in Silander and Leong (2013). We also provide Python code using this algorithm to obtain a CEG from any staged tree along with the functionality to add edges with sampling zeros.
Submitted 16 December, 2021; v1 submitted 29 June, 2020;
originally announced June 2020.