
Inference of epidemic networks: the effect of different data types

Authors: Oscar Fajardo-Fontiveros, Carl J. E. Suster, Eduardo G. Altmann

Abstract: We investigate how the properties of epidemic networks change depending on the availability of different types of data on a disease outbreak. This is achieved by introducing mathematical and computational methods that estimate the probability of transmission trees by combining generative models that jointly determine the number of infected hosts, the probability of infection between them depending on location and genetic information, and their time of infection and sampling. We introduce a suitable Markov Chain Monte Carlo method that we show to sample trees according to their probability. Statistics performed over the sampled trees lead to probabilistic estimations of network properties and other quantities of interest, such as the number of unobserved hosts and the depth of the infection tree. We confirm the validity of our approach by comparing the numerical results with analytically solvable examples. Finally, we apply our methodology to data from COVID-19 in Australia. We find that network properties that are important for the management of the outbreak depend sensitively on the type of data used in the inference.

Submitted 1 September, 2025; originally announced September 2025.

Comments: 15 pages, 8 figures

arXiv:2506.00602 [pdf, ps, other]

Assessing Honey Bee Colony Health Using Temperature Time Series

Authors: Karina Arias-Calluari, Theotime Colin, Tanya Latty, Mary Myerscough, Eduardo G. Altmann

Abstract: Honey bees face an increasing number of stressors that disrupt the natural behaviour of colonies and, in extreme cases, can lead to their collapse. Quantifying the status and resilience of colonies is essential to measure the impact of stressors and to identify colonies at risk. In this manuscript, we present and apply new methodologies to efficiently diagnose the status of a honey bee colony from widely available time series of hive and environmental temperature. Healthy hives have a remarkable ability to control temperature near the brood area. Our method exploits this fact and quantifies the status of a hive by measuring how resilient they are to extreme environmental temperatures, which act as natural stressors. Analysing 22 hives during different times of the year, including 3 hives that collapsed, we find the statistical signatures of stress that reveal whether honeybees are doing well or are at risk of failure. Based on these analyses, we propose a simple scale of hive status (stable, warning, and collapse) that can be determined based on a few temperature measurements. Our approach offers a lower-cost and practical bee-monitoring solution, providing a non-invasive way to track hive conditions and trigger interventions to save the hives from collapse.

Submitted 31 May, 2025; originally announced June 2025.

Comments: 14 pages, 7 figures and 1 repository

MSC Class: 92Bxx ACM Class: J.2.8; I.6.4

arXiv:2211.02335 [pdf, other]

Modelling daily weight variation in honey bee hives

Authors: Karina Arias-Calluari, Theotime Colin, Tanya Latty, Mary Myerscough, Eduardo G. Altmann

Abstract: A quantitative understanding of the dynamics of bee colonies is important to support global efforts to improve bee health and enhance pollination services. Traditional approaches focus either on theoretical models or data-centred statistical analyses. Here we argue that the combination of these two approaches is essential to obtain interpretable information on the state of bee colonies and show how this can be achieved in the case of time series of intra-day weight variation. We model how the foraging and food processing activities of bees affect global hive weight through a set of ordinary differential equations and show how to estimate reliable ranges for the ten parameters of this model from measurements on a single day. Our analysis of 10 hives at different times shows that crucial indicators of the health of honey bee colonies are estimated robustly and fall in ranges compatible with previously reported results. The indicators include the amount of food collected (foraging success) and the number of active foragers, which may be used to develop early warning indicators of colony failure.

Submitted 4 November, 2022; originally announced November 2022.

Comments: 26 pages with 9 figures

MSC Class: 92-XX (Primary); 92Bxx (Secondary) ACM Class: G.3; J.2

arXiv:2106.15821 [pdf, other]

doi 10.1140/epjds/s13688-021-00288-5

Multilayer Networks for Text Analysis with Multiple Data Types

Authors: Charles C. Hyland, Yuanming Tao, Lamiae Azizi, Martin Gerlach, Tiago P. Peixoto, Eduardo G. Altmann

Abstract: We are interested in the widespread problem of clustering documents and finding topics in large collections of written documents in the presence of metadata and hyperlinks. To tackle the challenge of accounting for these different types of datasets, we propose a novel framework based on Multilayer Networks and Stochastic Block Models. The main innovation of our approach over other techniques is that it applies the same non-parametric probabilistic framework to the different sources of datasets simultaneously. The key difference to other multilayer complex networks is the strong unbalance between the layers, with the average degree of different node types scaling differently with system size. We show that the latter observation is due to generic properties of text, such as Heaps' law, and strongly affects the inference of communities. We present and discuss the performance of our method in different datasets (hundreds of Wikipedia documents, thousands of scientific papers, and thousands of E-mails) showing that taking into account multiple types of information provides a more nuanced view on topic- and document-clusters and increases the ability to predict missing links.

Submitted 30 June, 2021; originally announced June 2021.

Comments: 17 pages, 6 figures

Journal ref: EPJ Data Science volume 10, Article number: 33 (2021)

arXiv:1708.01677 [pdf, other]

doi 10.1126/sciadv.aaq1360

A network approach to topic models

Authors: Martin Gerlach, Tiago P. Peixoto, Eduardo G. Altmann

Abstract: One of the main computational and scientific challenges in the modern age is to extract useful information from unstructured texts. Topic models are one popular machine-learning approach which infers the latent topical structure of a collection of documents. Despite their success --- in particular of its most widely used variant called Latent Dirichlet Allocation (LDA) --- and numerous applications in sociology, history, and linguistics, topic models are known to suffer from severe conceptual and practical problems, e.g. a lack of justification for the Bayesian priors, discrepancies with statistical properties of real texts, and the inability to properly choose the number of topics. Here we obtain a fresh view on the problem of identifying topical structures by relating it to the problem of finding communities in complex networks. This is achieved by representing text corpora as bipartite networks of documents and words. By adapting existing community-detection methods -- using a stochastic block model (SBM) with non-parametric priors -- we obtain a more versatile and principled framework for topic modeling (e.g., it automatically detects the number of topics and hierarchically clusters both the words and documents). The analysis of artificial and real corpora demonstrates that our SBM approach leads to better topic models than LDA in terms of statistical model selection. More importantly, our work shows how to formally relate methods from community detection and topic modeling, opening the possibility of cross-fertilization between these two fields.

Submitted 19 July, 2018; v1 submitted 4 August, 2017; originally announced August 2017.

Comments: 22 pages, 10 figures, code available at https://topsbm.github.io/

Journal ref: Science Advances 4, eaaq1360 (2018)

Showing 1–5 of 5 results for author: Altmann, E G