-
Comment on "On the Extraction of Purely Motor EEG Neural Correlates during an Upper Limb Visuomotor Task"
Authors:
Patrick Ofner,
Joana Pereira,
Reinmar Kobler,
Andreas Schwarz,
Gernot R. Müller-Putz
Abstract:
Bibián et al. show in their recent paper (Bibián et al. 2021) that eye and head movements can affect EEG-based classification in a reaching motor task. These movements can generate artefacts that lead to an overoptimistic estimate of the classification accuracy. They speculate that such artefacts jeopardise the interpretation of the results from several motor decoding studies, including our study (Ofner et al. 2017). While we endorse their warning about artefacts in general, we doubt that their work supports such a statement with respect to our study. In this commentary, we provide a more nuanced contextualization of our work presented in Ofner et al. and of the type of artefacts investigated in Bibián et al.
Submitted 22 July, 2022;
originally announced July 2022.
-
BITES: Balanced Individual Treatment Effect for Survival data
Authors:
Stefan Schrod,
Andreas Schäfer,
Stefan Solbrig,
Robert Lohmayer,
Wolfram Gronwald,
Peter J. Oefner,
Tim Beißbarth,
Rainer Spang,
Helena U. Zacharias,
Michael Altenbuchinger
Abstract:
Estimating the effects of interventions on patient outcome is one of the key aspects of personalized medicine. Their inference is often challenged by the fact that the training data comprises only the outcome for the administered treatment, and not for alternative treatments (the so-called counterfactual outcomes). Several methods have been suggested for this scenario based on observational data, i.e., data where the intervention was not applied randomly, for both continuous and binary outcome variables. However, patient outcome is often recorded in terms of time-to-event data, comprising right-censored event times if an event does not occur within the observation period. Despite their enormous importance, time-to-event data are rarely used for treatment optimization.
We suggest an approach named BITES (Balanced Individual Treatment Effect for Survival data), which combines a treatment-specific semi-parametric Cox loss with a treatment-balanced deep neural network; i.e., we regularize differences between treated and non-treated patients using Integral Probability Metrics (IPM). We show in simulation studies that this approach outperforms the state of the art. Further, we demonstrate in an application to a cohort of breast cancer patients that hormone treatment can be optimized based on six routine parameters. We successfully validated this finding in an independent cohort. BITES is provided as an easy-to-use Python implementation.
Submitted 5 January, 2022;
originally announced January 2022.
-
State-Space Constraints Improve the Generalization of the Differentiable Neural Computer in some Algorithmic Tasks
Authors:
Patrick Ofner,
Roman Kern
Abstract:
Memory-augmented neural networks (MANNs) can solve algorithmic tasks like sorting. However, they often do not generalize to lengths of input sequences not seen in the training phase. Therefore, we introduce two approaches constraining the state-space of the network controller to improve the generalization to out-of-distribution-sized input sequences: state compression and state regularization. We show that both approaches can improve the generalization capability of a particular type of MANN, the differentiable neural computer (DNC), and compare our approaches to a stateful and a stateless controller on a set of algorithmic tasks. Furthermore, we show that especially the combination of both approaches can enable a pre-trained DNC to be extended post hoc with a larger memory. Thus, our approaches make it possible to train a DNC on shorter input sequences and thereby save computational resources. Moreover, we observed that the capability for generalization is often accompanied by loop structures in the state-space, which could correspond to looping constructs in algorithms.
Submitted 18 October, 2021;
originally announced October 2021.
-
Lessons Learned from the 1st ARIEL Machine Learning Challenge: Correcting Transiting Exoplanet Light Curves for Stellar Spots
Authors:
Nikolaos Nikolaou,
Ingo P. Waldmann,
Angelos Tsiaras,
Mario Morvan,
Billy Edwards,
Kai Hou Yip,
Giovanna Tinetti,
Subhajit Sarkar,
James M. Dawson,
Vadim Borisov,
Gjergji Kasneci,
Matej Petkovic,
Tomaz Stepisnik,
Tarek Al-Ubaidi,
Rachel Louise Bailey,
Michael Granitzer,
Sahib Julka,
Roman Kern,
Patrick Ofner,
Stefan Wagner,
Lukas Heppe,
Mirko Bunse,
Katharina Morik
Abstract:
The last decade has witnessed a rapid growth of the field of exoplanet discovery and characterisation. However, several big challenges remain, many of which could be addressed using machine learning methodology. For instance, the most prolific method for detecting exoplanets and inferring several of their characteristics, transit photometry, is very sensitive to the presence of stellar spots. The current practice in the literature is to identify the effects of spots visually and correct for them manually or discard the affected data. This paper explores a first step towards fully automating the efficient and precise derivation of transit depths from transit light curves in the presence of stellar spots. The methods and results we present were obtained in the context of the 1st Machine Learning Challenge organized for the European Space Agency's upcoming Ariel mission. We first present the problem and the simulated Ariel-like data, and outline the Challenge while identifying best practices for organizing similar challenges in the future. Finally, we present the solutions obtained by the top-5 winning teams, provide their code and discuss their implications. Successful solutions either construct highly non-linear (w.r.t. the raw data) models with minimal preprocessing (deep neural networks and ensemble methods) or amount to obtaining meaningful statistics from the light curves and constructing linear models on them, which yields comparably good predictive performance.
Submitted 29 October, 2020;
originally announced October 2020.
-
Fully integrative data analysis of NMR metabolic fingerprints with comprehensive patient data: a case report based on the German Chronic Kidney Disease (GCKD) study
Authors:
Helena U. Zacharias,
Michael Altenbuchinger,
Stefan Solbrig,
Andreas Schäfer,
Mustafa Buyukozkan,
Ulla T. Schultheiß,
Fruzsina Kotsis,
Anna Köttgen,
Jan Krumsiek,
Fabian J. Theis,
Rainer Spang,
Peter J. Oefner,
Wolfram Gronwald,
GCKD study investigators
Abstract:
Omics data facilitate the gain of novel insights into the pathophysiology of diseases and, consequently, their diagnosis, treatment, and prevention. To that end, it is necessary to integrate omics data with other data types such as clinical, phenotypic, and demographic parameters of categorical or continuous nature. Here, we exemplify this data integration issue for a study on chronic kidney disease (CKD), where complex clinical and demographic parameters were assessed together with one-dimensional (1D) ${}^1$H NMR metabolic fingerprints. Routine analysis screens for associations of single metabolic features with clinical parameters, which requires taking into account confounding variables that are typically chosen based on expert knowledge. This knowledge can be incomplete or unavailable. The results of this article are manifold. We introduce a framework for data integration that intrinsically adjusts for confounding variables. We give its mathematical and algorithmic foundation, provide a state-of-the-art implementation, and give several sanity checks. In particular, we show that the discovered associations remain significant after variable adjustment based on expert knowledge. In contrast, we illustrate that the discovery of associations in routine analysis can be biased by incorrect or incomplete expert knowledge in univariate screening approaches. Finally, we exemplify how our data integration approach reveals important associations between CKD comorbidities and metabolites. Moreover, we evaluate the predictive performance of the estimated models on independent validation data and contrast the results with a naive screening approach.
Submitted 8 October, 2018;
originally announced October 2018.
-
Loss-function learning for digital tissue deconvolution
Authors:
Franziska Görtler,
Stefan Solbrig,
Tilo Wettig,
Peter J. Oefner,
Rainer Spang,
Michael Altenbuchinger
Abstract:
The gene expression profile of a tissue averages the expression profiles of all cells in this tissue. Digital tissue deconvolution (DTD) addresses the following inverse problem: Given the expression profile $y$ of a tissue, what is the cellular composition $c$ of that tissue? If $X$ is a matrix whose columns are reference profiles of individual cell types, the composition $c$ can be computed by minimizing $\mathcal L(y-Xc)$ for a given loss function $\mathcal L$. Current methods use predefined all-purpose loss functions. They successfully quantify the dominating cells of a tissue, while often falling short in detecting small cell populations.
Here we learn the loss function $\mathcal L$ along with the composition $c$. This allows us to adapt to application-specific requirements such as focusing on small cell populations or distinguishing phenotypically similar cell populations. Our method quantifies large cell fractions as accurately as existing methods and significantly improves the detection of small cell populations and the distinction of similar cell types.
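As an illustration of the minimization of $\mathcal L(y-Xc)$ described in this abstract, the following is a minimal sketch for two cell types. It assumes a weighted squared-error loss $\sum_i g_i\,(y_i - (Xc)_i)^2$ with fixed per-gene weights `g`; this loss family, the function name `deconvolve`, and the toy numbers are assumptions made for the example and are not taken from the paper, and the weights are simply given here rather than learned.

```python
def deconvolve(X, y, g=None):
    """Estimate the composition c = [c1, c2] that minimizes
    sum_i g[i] * (y[i] - X[i][0]*c1 - X[i][1]*c2)**2.

    X: list of rows, one per gene, each with two reference-profile values.
    y: list of measured tissue expression values, one per gene.
    g: optional per-gene weights (hypothetical; defaults to all ones).
    """
    n = len(y)
    if g is None:
        g = [1.0] * n
    # Weighted normal equations A c = b with A = X^T G X and b = X^T G y.
    a11 = sum(g[i] * X[i][0] * X[i][0] for i in range(n))
    a12 = sum(g[i] * X[i][0] * X[i][1] for i in range(n))
    a22 = sum(g[i] * X[i][1] * X[i][1] for i in range(n))
    b1 = sum(g[i] * X[i][0] * y[i] for i in range(n))
    b2 = sum(g[i] * X[i][1] * y[i] for i in range(n))
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("reference profiles are not linearly independent")
    # Closed-form solution of the 2x2 system (Cramer's rule).
    c1 = (b1 * a22 - a12 * b2) / det
    c2 = (a11 * b2 - a12 * b1) / det
    return [c1, c2]


# Toy data: three genes, two cell types, noise-free mixture of 70% / 30%.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [0.7, 0.3, 1.0]
c_unweighted = deconvolve(X, y)
c_weighted = deconvolve(X, y, g=[1.0, 1.0, 0.0])
```

With noise-free data both calls recover the same composition; the weights only matter once the measurements deviate from the linear model, which is the setting in which adapting the loss could change which genes drive the estimate.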
Submitted 25 January, 2018;
originally announced January 2018.
-
Scale-invariant biomarker discovery in urine and plasma metabolite fingerprints
Authors:
Helena U. Zacharias,
Thorsten Rehberg,
Sebastian Mehrl,
Daniel Richtmann,
Tilo Wettig,
Peter J. Oefner,
Rainer Spang,
Wolfram Gronwald,
Michael Altenbuchinger
Abstract:
Motivation: Metabolomics data is typically scaled to a common reference like a constant volume of body fluid, a constant creatinine level, or a constant area under the spectrum. Such normalization of the data, however, may affect the selection of biomarkers and the biological interpretation of results in unforeseen ways.
Results: First, we study how the outcome of hypothesis tests for differential metabolite concentration is affected by the choice of scale. Furthermore, we observe this interdependence also for different classification approaches. Second, to overcome this problem and establish a scale-invariant biomarker discovery algorithm, we extend linear zero-sum regression to the logistic regression framework and show in two applications to ${}^1$H NMR-based metabolomics data how this approach overcomes the scaling problem.
Availability: Logistic zero-sum regression is available as an R package as well as a high-performance computing implementation that can be downloaded at https://github.com/rehbergT/zeroSum
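The scale invariance mentioned in this abstract can be illustrated with a small sketch. It assumes that the features are log-transformed concentrations and that the regression coefficients sum to zero; the function name `zero_sum_probability` and all numbers are hypothetical and do not reflect the interface of the zeroSum package.

```python
import math


def zero_sum_probability(x, beta, beta0=0.0):
    """Logistic-regression probability on log-transformed features.

    x: positive metabolite intensities for one sample.
    beta: coefficients, assumed here to sum to zero.
    beta0: intercept.
    """
    score = beta0 + sum(b * math.log(v) for b, v in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-score))


beta = [1.5, -0.5, -1.0]          # illustrative coefficients, sum is zero
sample = [2.0, 4.0, 8.0]          # illustrative intensities
rescaled = [3.0 * v for v in sample]  # same sample under a different scaling

p_original = zero_sum_probability(sample, beta)
p_rescaled = zero_sum_probability(rescaled, beta)
```

Multiplying every intensity by a common factor s adds log(s) times the sum of the coefficients to the score, which vanishes when the coefficients sum to zero, so both probabilities agree up to floating-point rounding.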
Submitted 22 March, 2017;
originally announced March 2017.