-
Autoregressive hidden Markov models for high-resolution animal movement data
Authors:
Ferdinand V. Stoye,
Annika Hoyer,
Roland Langrock
Abstract:
New types of high-resolution animal movement data allow for increasingly comprehensive biological inference, but method development to meet the statistical challenges associated with such data is lagging behind. In this contribution, we extend the commonly applied hidden Markov models for step lengths and turning angles to address the specific requirements posed by high-resolution movement data, in particular the very strong within-state correlation induced by the momentum in the movement. The models feature autoregressive components of general order in both the step length and the turning angle variable, with the possibility to automate the selection of the autoregressive degree using a lasso approach. In a simulation study, we identify potential for improved inference when using the new model instead of the commonly applied basic hidden Markov model in cases where there is strong within-state autocorrelation. The practical use of the model is illustrated using high-resolution movement tracks of terns foraging near an anthropogenic structure causing turbulent water flow features.
Submitted 16 December, 2024;
originally announced December 2024.
-
An Extension of Greenwood's Formula to Variances
Authors:
J. Rodenkirchen,
A. Hoyer
Abstract:
In this article, we introduce an estimator for the asymptotic variance of the Greenwood variance estimator, where the latter is crucial for assessing the accuracy of the Kaplan-Meier survival estimator. The result indicates that the asymptotic variance of the Greenwood variance estimator is considerably smaller than that of the Kaplan-Meier variance estimator. This finding emphasizes the robustness of the Greenwood estimator.
Submitted 12 June, 2024;
originally announced June 2024.
-
A non-parametric proportional risk model to assess a treatment effect in time-to-event data
Authors:
Lucia Ameis,
Oliver Kuß,
Annika Hoyer,
Kathrin Möllenhoff
Abstract:
Time-to-event analysis often relies on prior parametric assumptions, or, if a non-parametric approach is chosen, Cox's model. This is inherently tied to the assumption of proportional hazards, with the analysis potentially invalidated if this assumption is not fulfilled. In addition, most interpretations focus on the hazard ratio, which is often misinterpreted as the relative risk. In this paper, we introduce an alternative to current methodology for assessing a treatment effect in a two-group situation, not relying on the proportional hazards assumption but assuming proportional risks. Specifically, we propose a new non-parametric model to directly estimate the relative risk of two groups to experience an event under the assumption that the risk ratio is constant over time. In addition to this relative measure, our model allows for calculating the number needed to treat as an absolute measure, providing the possibility of an easy and holistic interpretation of the data. We demonstrate the validity of the approach by means of a simulation study and present an application to data from a large randomized controlled trial investigating the effect of dapagliflozin on the risk of first hospitalization for heart failure.
Submitted 13 March, 2023;
originally announced March 2023.
-
Importance of diagnostic accuracy in big data: False-positive diagnoses of type 2 diabetes in health insurance claims data of 70 million Germans
Authors:
Ralph Brinks,
Thaddaeus Toennies,
Annika Hoyer
Abstract:
Large data sets comprising diagnoses of chronic conditions are becoming increasingly available for research purposes. In Germany, it is planned that aggregated claims data including medical diagnoses from the statutory health insurance with roughly 70 million insurants will be published on a regular basis. Validity of the diagnoses in such big data sets can hardly be assessed. If the data set comprises prevalence, incidence and mortality, it is possible to estimate the proportion of false positive diagnoses using mathematical relations from the illness-death model. We apply the method to age-specific aggregated claims data from 70 million Germans about type 2 diabetes in Germany stratified by sex and report the findings in terms of the ratio of false positive diagnoses of type 2 diabetes (FPR) in the data set. The age-specific FPR for men and women changes with age. In men, the FPR increases linearly from 1 to 3 per mil between ages 30 and 50. Between 50 and 80 years of age, the FPR remains below 4 per mil. After 80 years of age, it increases to about 5 per mil. In women, we find a steep increase from age 30 to 60; the peak FPR of about 12 per mil is reached between 60 and 70 years of age. After age 70, the FPR of women drops sharply. In all age groups, the FPR is higher in women than in men. In terms of absolute numbers, we find that there are 217 thousand people with a false-positive diagnosis in the data set (95% confidence interval, CI: 204 to 229), the vast majority women (172 thousand, 95% CI: 162 to 180). Our work indicates that possible false positive (and negative) diagnoses should appropriately be dealt with in claims data, e.g., by inclusion of age- and sex-specific error terms in statistical models, to avoid potentially biased or wrong conclusions.
Submitted 26 February, 2022;
originally announced February 2022.
-
Multiple-point statistical simulation for hydrogeological models: 3-D training image development and conditioning strategies
Authors:
Anne-Sophie Høyer,
Giulio Vignoli,
Thomas Mejer Hansen,
Le Thanh Vu,
Donald A. Keefer,
Flemming Jørgensen
Abstract:
Most studies on the application of geostatistical simulations based on multiple-point statistics (MPS) to hydrogeological modelling focus on relatively fine-scale models and on the estimation of facies-level structural uncertainty. Less attention is paid to the input data and the construction of Training Images (TIs). For example, even though the TI should capture a set of spatial geological characteristics, the majority of the research still relies on 2D or quasi-3D training images. Here, we demonstrate a novel strategy for 3D MPS modelling characterized by (i) realistic 3D TIs and (ii) an effective workflow for incorporating a diverse group of geological and geophysical data sets. The study covers 2810 km^2 in southern Denmark. MPS simulations are performed on a subset of the geological succession (the lower to middle Miocene sediments) which is characterized by relatively uniform structures and dominated by sand and clay. The simulated domain is large and each of the geostatistical realizations contains approximately 45 x 10^6 voxels with size 100 m x 100 m x 5 m. Data used for the modelling include water well logs, seismic data, and a previously published 3D geological model. We apply a series of different strategies for the simulations based on data quality and develop a novel method to effectively create observed spatial trends. The TI is constructed as a relatively small 3D voxel model covering an area of 90 km^2. We use an iterative training image development strategy and find that even slight modifications in the TI create significant changes in simulations. Thus, this study shows how to include both the geological environment and the type and quality of input information in order to achieve optimal results from MPS modelling. We present a practical workflow to build the TI and effectively handle different types of input information to perform large-scale geostatistical modelling.
Submitted 21 November, 2020;
originally announced November 2020.
-
Numerical considerations about the SIR epidemic model with infection age
Authors:
Ralph Brinks,
Annika Hoyer
Abstract:
We analyse the infection-age-dependent SIR model from a numerical point of view. First, we present an algorithm for calculating the solution of the infection-age-structured SIR model without demography of the background host. Second, we examine how and under which conditions the conventional SIR model (without infection age) serves as a practical approximation to the infection-age SIR model. Special emphasis is placed on the effective reproduction number.
Submitted 26 June, 2020;
originally announced June 2020.
-
Estimation of the actual disease occurrence based on official case numbers during a COVID outbreak in Germany 2020
Authors:
Ralph Brinks,
Annika Hoyer
Abstract:
Since the beginning of March 2020, the cumulative numbers of cases of infection with the novel coronavirus SARS-CoV-2 in Germany have been reported on a daily basis. The reports originate from national laws, according to which positive test findings must be submitted to the Federal Health Authorities, the Robert Koch Institute, via the local health authorities. Since an enormous number of unreported cases can be expected, the question of how widespread the disease has been in the population cannot be answered based on these administrative reports. Using mathematical modeling, however, estimates can be made. These estimates indicate that the small numbers of diagnostic tests carried out at the beginning of the outbreak overlooked considerable parts of the infection. In order to cover the initial phase of future waves of the disease, widespread and comprehensive testing is recommended.
Submitted 7 May, 2020;
originally announced May 2020.
-
Four Edge-Independent Spanning Trees
Authors:
Alexander Hoyer,
Robin Thomas
Abstract:
We prove an ear-decomposition theorem for $4$-edge-connected graphs and use it to prove that for every $4$-edge-connected graph $G$ and every $r\in V(G)$, there is a set of four spanning trees of $G$ with the following property. For every vertex in $G$, the unique paths back to $r$ in each tree are edge-disjoint. Our proof implies a polynomial-time algorithm for constructing the trees.
Submitted 21 November, 2017; v1 submitted 2 May, 2017;
originally announced May 2017.
-
The Gyori-Lovasz theorem
Authors:
Alexander Hoyer,
Robin Thomas
Abstract:
Gyori and Lovasz independently proved the following beautiful theorem. Let $k\ge2$ be an integer, let $G$ be a $k$-connected graph on $n$ vertices, let $v_1,v_2,\ldots,v_k$ be distinct vertices of $G$ and let $n_1,n_2,\ldots,n_k$ be positive integers with $n_1+n_2+\cdots+n_k=n$. Then $G$ has disjoint connected subgraphs $G_1,G_2,\ldots,G_k$ such that for $i=1,2,\ldots,k$ the graph $G_i$ has $n_i$ vertices and $v_i\in V(G_i)$. We give a self-contained exposition of Gyori's proof.
Submitted 22 June, 2016; v1 submitted 4 May, 2016;
originally announced May 2016.
-
Current-induced spin polarization in topological insulator-graphene heterostructures
Authors:
Kristina Vaklinova,
Alexander Hoyer,
Marko Burghard,
Klaus Kern
Abstract:
Further development of the field of all-electric spintronics requires the successful integration of spin transport channels with spin injector/generator elements. While with the advent of graphene and related 2D materials high performance spin channel materials are available, the use of nanostructured spin generators remains a major challenge. Especially promising for the latter purpose are 3D topological insulators, whose 2D surface states host massless Dirac fermions with spin-momentum locking. Here, we demonstrate injection of spin-polarized current from a topological insulator into graphene, enabled by its intimate coupling to an ultrathin Bi2Te2Se nanoplatelet within a van der Waals epitaxial heterostructure. The spin switching signal, whose magnitude scales inversely with temperature, is detectable up to ~15 K. Our findings establish topological insulators as prospective future components of spintronic devices wherein spin manipulation is achieved by purely electrical means.
Submitted 29 March, 2016; v1 submitted 4 November, 2015;
originally announced November 2015.