Search | arXiv e-print repository

Deep Learning-based Alignment Measurement in Knee Radiographs

Authors: Zhisen Hu, Dominic Cullen, Peter Thompson, David Johnson, Chang Bian, Aleksei Tiulpin, Timothy Cootes, Claudia Lindner

Abstract: Radiographic knee alignment (KA) measurement is important for predicting joint health and surgical outcomes after total knee replacement. Traditional methods for KA measurements are manual, time-consuming and require long-leg radiographs. This study proposes a deep learning-based method to measure KA in anteroposterior knee radiographs via automatically localized knee anatomical landmarks. Our method builds on hourglass networks and incorporates an attention gate structure to enhance robustness and focus on key anatomical features. To our knowledge, this is the first deep learning-based method to localize over 100 knee anatomical landmarks to fully outline the knee shape while integrating KA measurements on both pre-operative and post-operative images. It provides highly accurate and reliable anatomical varus/valgus KA measurements using the anatomical tibiofemoral angle, achieving mean absolute differences ~1° when compared to clinical ground truth measurements. Agreement between automated and clinical measurements was excellent pre-operatively (intra-class correlation coefficient (ICC) = 0.97) and good post-operatively (ICC = 0.86). Our findings demonstrate that KA assessment can be automated with high accuracy, creating opportunities for digitally enhanced clinical workflows.
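As a rough illustration of how an angle such as the anatomical tibiofemoral angle can be derived once landmarks are available, the sketch below computes the signed angle between a femoral axis and a tibial axis, plus the mean absolute difference between automated and manual readings. The four axis points, the function names, and the sign convention are assumptions made for this example; the abstract does not specify how the axes are built from the 100+ landmarks.

```python
import math


def axis_angle_deg(femur_prox, femur_dist, tibia_prox, tibia_dist):
    """Signed angle in degrees between two 2D axes.

    Each argument is a hypothetical (x, y) landmark. The sign depends on
    image orientation and knee side, so it is not mapped to varus/valgus here.
    """
    fx, fy = femur_dist[0] - femur_prox[0], femur_dist[1] - femur_prox[1]
    tx, ty = tibia_dist[0] - tibia_prox[0], tibia_dist[1] - tibia_prox[1]
    cross = fx * ty - fy * tx  # proportional to sin(angle)
    dot = fx * tx + fy * ty    # proportional to cos(angle)
    return math.degrees(math.atan2(cross, dot))


def mean_absolute_difference(automated, manual):
    """Mean absolute difference between paired angle measurements."""
    if len(automated) != len(manual) or not automated:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(a - m) for a, m in zip(automated, manual)) / len(automated)


# Made-up coordinates: a vertical femoral axis and a slightly tilted tibial axis.
angle = axis_angle_deg((0, 0), (0, 100), (0, 100), (5, 200))
print(round(angle, 2))
```

Using `atan2` on the cross and dot products avoids the domain errors that `acos` can hit when rounding pushes a cosine slightly past 1.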

Submitted 22 June, 2025; originally announced June 2025.

Comments: Accepted to MICCAI 2025

arXiv:2505.14917 [pdf, ps, other]

ConspEmoLLM-v2: A robust and stable model to detect sentiment-transformed conspiracy theories

Authors: Zhiwei Liu, Paul Thompson, Jiaqi Rong, Sophia Ananiadou

Abstract: Despite the many benefits of large language models (LLMs), they can also cause harm, e.g., through automatic generation of misinformation, including conspiracy theories. Moreover, LLMs can also "disguise" conspiracy theories by altering characteristic textual features, e.g., by transforming their typically strong negative emotions into a more positive tone. Although several studies have proposed automated conspiracy theory detection methods, they are usually trained using human-authored text, whose features can vary from LLM-generated text. Furthermore, several conspiracy detection models, including the previously proposed ConspEmoLLM, rely heavily on the typical emotional features of human-authored conspiracy content. As such, intentionally disguised content may evade detection. To combat such issues, we firstly developed an augmented version of the ConDID conspiracy detection dataset, ConDID-v2, which supplements human-authored conspiracy tweets with versions rewritten by an LLM to reduce the negativity of their original sentiment. The quality of the rewritten tweets was verified by combining human and LLM-based assessment. We subsequently used ConDID-v2 to train ConspEmoLLM-v2, an enhanced version of ConspEmoLLM. Experimental results demonstrate that ConspEmoLLM-v2 retains or exceeds the performance of ConspEmoLLM on the original human-authored content in ConDID, and considerably outperforms both ConspEmoLLM and several other baselines when applied to sentiment-transformed tweets in ConDID-v2.
The project will be available at https://github.com/lzw108/ConspEmoLLM.

Submitted 20 May, 2025; originally announced May 2025.

Comments: work in progress

arXiv:2504.15267 [pdf, other]

Diffusion Bridge Models for 3D Medical Image Translation

Authors: Shaorong Zhang, Tamoghna Chattopadhyay, Sophia I. Thomopoulos, Jose-Luis Ambite, Paul M. Thompson, Greg Ver Steeg

Abstract: Diffusion tensor imaging (DTI) provides crucial insights into the microstructure of the human brain, but it can be time-consuming to acquire compared to more readily available T1-weighted (T1w) magnetic resonance imaging (MRI). To address this challenge, we propose a diffusion bridge model for 3D brain image translation between T1w MRI and DTI modalities. Our model learns to generate high-quality DTI fractional anisotropy (FA) images from T1w images and vice versa, enabling cross-modality data augmentation and reducing the need for extensive DTI acquisition. We evaluate our approach using perceptual similarity, pixel-level agreement, and distributional consistency metrics, demonstrating strong performance in capturing anatomical structures and preserving information on white matter integrity. The practical utility of the synthetic data is validated through sex classification and Alzheimer's disease classification tasks, where the generated images achieve comparable performance to real data. Our diffusion bridge model offers a promising solution for improving neuroimaging datasets and supporting clinical decision-making, with the potential to significantly impact neuroimaging research and clinical practice.

Submitted 21 April, 2025; originally announced April 2025.

arXiv:2411.19617 [pdf, other]

Materials Learning Algorithms (MALA): Scalable Machine Learning for Electronic Structure Calculations in Large-Scale Atomistic Simulations

Authors: Attila Cangi, Lenz Fiedler, Bartosz Brzoza, Karan Shah, Timothy J. Callow, Daniel Kotik, Steve Schmerler, Matthew C. Barry, James M. Goff, Andrew Rohskopf, Dayton J. Vogel, Normand Modine, Aidan P. Thompson, Sivasankaran Rajamanickam

Abstract: We present the Materials Learning Algorithms (MALA) package, a scalable machine learning framework designed to accelerate density functional theory (DFT) calculations suitable for large-scale atomistic simulations. Using local descriptors of the atomic environment, MALA models efficiently predict key electronic observables, including local density of states, electronic density, density of states, and total energy. The package integrates data sampling, model training and scalable inference into a unified library, while ensuring compatibility with standard DFT and molecular dynamics codes. We demonstrate MALA's capabilities with examples including boron clusters, aluminum across its solid-liquid phase boundary, and predicting the electronic structure of a stacking fault in a large beryllium slab. Scaling analyses reveal MALA's computational efficiency and identify bottlenecks for future optimization. With its ability to model electronic structures at scales far beyond standard DFT, MALA is well suited for modeling complex material systems, making it a versatile tool for advanced materials research.

Submitted 29 November, 2024; originally announced November 2024.

arXiv:2411.09618 [pdf, other]

doi 10.59275/j.melba.2024-9c68

MICCAI-CDMRI 2023 QuantConn Challenge Findings on Achieving Robust Quantitative Connectivity through Harmonized Preprocessing of Diffusion MRI

Authors: Nancy R. Newlin, Kurt Schilling, Serge Koudoro, Bramsh Qamar Chandio, Praitayini Kanakaraj, Daniel Moyer, Claire E. Kelly, Sila Genc, Jian Chen, Joseph Yuan-Mou Yang, Ye Wu, Yifei He, Jiawei Zhang, Qingrun Zeng, Fan Zhang, Nagesh Adluru, Vishwesh Nath, Sudhir Pathak, Walter Schneider, Anurag Gade, Yogesh Rathi, Tom Hendriks, Anna Vilanova, Maxime Chamberland, Tomasz Pieciak, et al. (11 additional authors not shown)

Abstract: White matter alterations are increasingly implicated in neurological diseases and their progression. International-scale studies use diffusion-weighted magnetic resonance imaging (DW-MRI) to qualitatively identify changes in white matter microstructure and connectivity. Yet, quantitative analysis of DW-MRI data is hindered by inconsistencies stemming from varying acquisition protocols. There is a pressing need to harmonize the preprocessing of DW-MRI datasets to ensure the derivation of robust quantitative diffusion metrics across acquisitions. In the MICCAI-CDMRI 2023 QuantConn challenge, participants were provided raw data from the same individuals collected on the same scanner but with two different acquisitions and tasked with preprocessing the DW-MRI to minimize acquisition differences while retaining biological variation. Submissions are evaluated on the reproducibility and comparability of cross-acquisition bundle-wise microstructure measures, bundle shape features, and connectomics. The key innovations of the QuantConn challenge are that (1) we assess bundles and tractography in the context of harmonization for the first time, (2) we assess connectomics in the context of harmonization for the first time, and (3) we have 10x additional subjects over prior harmonization challenge, MUSHAC and 100x over SuperMUDI.
We find that bundle surface area, fractional anisotropy, connectome assortativity, betweenness centrality, edge count, modularity, nodal strength, and participation coefficient measures are most biased by acquisition and that machine learning voxel-wise correction, RISH mapping, and NeSH methods effectively reduce these biases. In addition, microstructure measures AD, MD, RD, bundle length, connectome density, efficiency, and path length are least biased by these acquisition differences.

Submitted 14 November, 2024; originally announced November 2024.

Comments: Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2024/019

Journal ref: Machine.Learning.for.Biomedical.Imaging. 2 (2024)

arXiv:2405.15081 [pdf, other]

doi 10.1145/3637528.3671590

Distributed Harmonization: Federated Clustered Batch Effect Adjustment and Generalization

Authors: Bao Hoang, Yijiang Pang, Siqi Liang, Liang Zhan, Paul Thompson, Jiayu Zhou

Abstract: Independent and identically distributed (i.i.d.) data is essential to many data analysis and modeling techniques. In the medical domain, collecting data from multiple sites or institutions is a common strategy that guarantees sufficient clinical diversity, determined by the decentralized nature of medical data. However, data from various sites are easily biased by the local environment or facilities, thereby violating the i.i.d. rule. A common strategy is to harmonize the site bias while retaining important biological information. The ComBat is among the most popular harmonization approaches and has recently been extended to handle distributed sites. However, when faced with situations involving newly joined sites in training or evaluating data from unknown/unseen sites, ComBat lacks compatibility and requires retraining with data from all the sites. The retraining leads to significant computational and logistic overhead that is usually prohibitive. In this work, we develop a novel Cluster ComBat harmonization algorithm, which leverages cluster patterns of the data in different sites and greatly advances the usability of ComBat harmonization. We use extensive simulation and real medical imaging data from ADNI to demonstrate the superiority of the proposed approach. Our codes are provided in https://github.com/illidanlab/distributed-cluster-harmonization.

Submitted 7 August, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

Comments: 11 pages, 7 figures, accepted to KDD2024-ADS

arXiv:2405.13190 [pdf, other]

Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation

Authors: Haoteng Tang, Guodong Liu, Siyuan Dai, Kai Ye, Kun Zhao, Wenlu Wang, Carl Yang, Lifang He, Alex Leow, Paul Thompson, Heng Huang, Liang Zhan

Abstract: The MRI-derived brain network serves as a pivotal instrument in elucidating both the structural and functional aspects of the brain, encompassing the ramifications of diseases and developmental processes. However, prevailing methodologies, often focusing on synchronous BOLD signals from functional MRI (fMRI), may not capture directional influences among brain regions and rarely tackle temporal functional dynamics. In this study, we first construct the brain-effective network via the dynamic causal model. Subsequently, we introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE). This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks via an ordinary differential equation (ODE) model, which characterizes spatial-temporal brain dynamics. Our framework is validated on several clinical phenotype prediction tasks using two independent publicly available datasets (HCP and OASIS). The experimental results clearly demonstrate the advantages of our model compared to several state-of-the-art methods.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2403.06765 [pdf, other]

doi 10.3233/FAIA241060

ConspEmoLLM: Conspiracy Theory Detection Using an Emotion-Based Large Language Model

Authors: Zhiwei Liu, Boyang Liu, Paul Thompson, Kailai Yang, Sophia Ananiadou

Abstract: The internet has brought both benefits and harms to society. A prime example of the latter is misinformation, including conspiracy theories, which flood the web. Recent advances in natural language processing, particularly the emergence of large language models (LLMs), have improved the prospects of accurate misinformation detection. However, most LLM-based approaches to conspiracy theory detection focus only on binary classification and fail to account for the important relationship between misinformation and affective features (i.e., sentiment and emotions). Driven by a comprehensive analysis of conspiracy text that reveals its distinctive affective features, we propose ConspEmoLLM, the first open-source LLM that integrates affective information and is able to perform diverse tasks relating to conspiracy theories. These tasks include not only conspiracy theory detection, but also classification of theory type and detection of related discussion (e.g., opinions towards theories). ConspEmoLLM is fine-tuned based on an emotion-oriented LLM using our novel ConDID dataset, which includes five tasks to support LLM instruction tuning and evaluation. We demonstrate that when applied to these tasks, ConspEmoLLM largely outperforms several open-source general domain LLMs and ChatGPT, as well as an LLM that has been fine-tuned using ConDID, but which does not use affective features. This project will be released on https://github.com/lzw108/ConspEmoLLM/.

Submitted 12 August, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

Comments: Work in progress

arXiv:2402.01505 [pdf, other]

Code-Switched Language Identification is Harder Than You Think

Authors: Laurie Burchell, Alexandra Birch, Robert P. Thompson, Kenneth Heafield

Abstract: Code switching (CS) is a very common phenomenon in written and spoken communication but one that is handled poorly by many natural language processing applications. Looking to the application of building CS corpora, we explore CS language identification (LID) for corpus building. We make the task more realistic by scaling it to more languages and considering models with simpler architectures for faster inference. We also reformulate the task as a sentence-level multi-label tagging problem to make it more tractable. Having defined the task, we investigate three reasonable models for this task and define metrics which better reflect desired performance. We present empirical evidence that no current approach is adequate and finally provide recommendations for future work in this area.

Submitted 2 February, 2024; originally announced February 2024.

Comments: EACL 2024

arXiv:2311.11046 [pdf]

Classification of Major Depressive Disorder Using Vertex-Wise Brain Sulcal Depth, Curvature, and Thickness with a Deep and a Shallow Learning Model

Authors: Roberto Goya-Maldonado, Tracy Erwin-Grabner, Ling-Li Zeng, Christopher R. K. Ching, Andre Aleman, Alyssa R. Amod, Zeynep Basgoze, Francesco Benedetti, Bianca Besteher, Katharina Brosch, Robin Bülow, Romain Colle, Colm G. Connolly, Emmanuelle Corruble, Baptiste Couvy-Duchesne, Kathryn Cullen, Udo Dannlowski, Christopher G. Davey, Annemiek Dols, Jan Ernsting, Jennifer W. Evans, Lukas Fisch, Paola Fuentes-Claramonte, Ali Saffet Gonul, Ian H. Gotlib, et al. (62 additional authors not shown)

Abstract: Major depressive disorder (MDD) is a complex psychiatric disorder that affects the lives of hundreds of millions of individuals around the globe. Even today, researchers debate if morphological alterations in the brain are linked to MDD, likely due to the heterogeneity of this disorder. The application of deep learning tools to neuroimaging data, capable of capturing complex non-linear patterns, has the potential to provide diagnostic and predictive biomarkers for MDD. However, previous attempts to demarcate MDD patients and healthy controls (HC) based on segmented cortical features via linear machine learning approaches have reported low accuracies. Here, we used globally representative data from the ENIGMA-MDD working group containing 7,012 participants from 30 sites (N=2,772 MDD and N=4,240 HC), which allows a comprehensive analysis with generalizable results. Based on the hypothesis that integration of vertex-wise cortical features can improve classification performance, we evaluated the classification of a DenseNet and a Support Vector Machine (SVM), with the expectation that the former would outperform the latter. We found that both classifiers exhibited close to chance performance (balanced accuracy DenseNet: 51%; SVM: 53%), when estimated on unseen sites. Slightly higher classification performance (balanced accuracy DenseNet: 58%; SVM: 55%) was found when the cross-validation folds contained subjects from all sites, indicating site effect.
In conclusion, the integration of vertex-wise morphometric features and the use of the non-linear classifier did not lead to the differentiability between MDD and HC. Our results support the notion that MDD classification on this combination of such features and classifiers is unfeasible. Perhaps more sophisticated integration of multimodal information may lead to a higher performance in this diagnostic task.
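The balanced accuracy figures quoted in this abstract are easier to read with the metric's definition in hand: it is the mean of per-class recall, so a classifier that always predicts the majority class scores 50% on a two-class problem however imbalanced the groups are. The following is a minimal, generic sketch of that definition, not the authors' evaluation code; the function name and the toy labels are made up for illustration.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label lists of equal length")
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Toy example: 2 patients (1) and 8 controls (0), classifier always says "control".
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # 0.8
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

Here plain accuracy looks respectable only because of the class imbalance, which is why values near 50% are described as chance-level performance.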

Submitted 24 January, 2025; v1 submitted 18 November, 2023; originally announced November 2023.

Comments: arXiv admin note: text overlap with arXiv:2206.08122

arXiv:2311.00671 [pdf, other]

doi 10.1016/j.inffus.2024.102300

Emotion Detection for Misinformation: A Review

Authors: Zhiwei Liu, Tianlin Zhang, Kailai Yang, Paul Thompson, Zeping Yu, Sophia Ananiadou

Abstract: With the advent of social media, an increasing number of netizens are sharing and reading posts and news online. However, the huge volumes of misinformation (e.g., fake news and rumors) that flood the internet can adversely affect people's lives, and have resulted in the emergence of rumor and fake news detection as a hot research topic. The emotions and sentiments of netizens, as expressed in social media posts and news, constitute important factors that can help to distinguish fake news from genuine news and to understand the spread of rumors. This article comprehensively reviews emotion-based methods for misinformation detection. We begin by explaining the strong links between emotions and misinformation. We subsequently provide a detailed analysis of a range of misinformation detection methods that employ a variety of emotion, sentiment and stance-based features, and describe their strengths and weaknesses. Finally, we discuss a number of ongoing challenges in emotion-based misinformation detection based on large language models and suggest future research directions, including data collection (multi-platform, multilingual), annotation, benchmark, multimodality, and interpretability.

Submitted 1 November, 2023; originally announced November 2023.

Comments: 30 pages, 11 figures

arXiv:2310.15761 [pdf]

doi 10.13140/RG.2.2.29274.52168

Agent-based models of social behaviour and communication in evacuations: A systematic review

Authors: Anne Templeton, Hui Xie, Steve Gwynne, Aoife Hunt, Pete Thompson, Gerta Köster

Abstract: Most modern agent-based evacuation models involve interactions between evacuees. However, the assumed reasons for interactions and portrayal of them may be overly simple. Research from social psychology suggests that people interact and communicate with one another when evacuating and evacuee response is impacted by the way information is communicated. Thus, we conducted a systematic review of agent-based evacuation models to identify 1) how social interactions and communication approaches between agents are simulated, and 2) what key variables related to evacuation are addressed in these models. We searched Web of Science and ScienceDirect to identify articles that simulated information exchange between agents during evacuations, and social behaviour during evacuations. From the final 70 included articles, we categorised eight types of social interaction that increased in social complexity from collision avoidance to social influence based on strength of social connections with other agents. In the 17 models which simulated communication, we categorised four ways that agents communicate information: spatially through information trails or radii around agents, via social networks and via external communication. Finally, the variables either manipulated or measured in the models were categorised into the following groups: environmental condition, personal attributes of the agents, procedure, and source of information.
We discuss promising directions for agent-based evacuation models to capture the effects of communication and group dynamics on evacuee behaviour. Moreover, we demonstrate how communication and group dynamics may impact the variables commonly used in agent-based evacuation models.

Submitted 24 October, 2023; originally announced October 2023.

Comments: Pre-print submitted to Safety Science special issue following the 2023 Pedestrian and Evacuation Dynamics conference

arXiv:2309.07352 [pdf]

Tackling the dimensions in imaging genetics with CLUB-PLS

Authors: Andre Altmann, Ana C Lawry Aguila, Neda Jahanshad, Paul M Thompson, Marco Lorenzi

Abstract: A major challenge in imaging genetics and similar fields is to link high-dimensional data in one domain, e.g., genetic data, to high dimensional data in a second domain, e.g., brain imaging data. The standard approach in the area are mass univariate analyses across genetic factors and imaging phenotypes. That entails executing one genome-wide association study (GWAS) for each pre-defined imaging measure. Although this approach has been tremendously successful, one shortcoming is that phenotypes must be pre-defined. Consequently, effects that are not confined to pre-selected regions of interest or that reflect larger brain-wide patterns can easily be missed. In this work we introduce a Partial Least Squares (PLS)-based framework, which we term Cluster-Bootstrap PLS (CLUB-PLS), that can work with large input dimensions in both domains as well as with large sample sizes. One key factor of the framework is to use cluster bootstrap to provide robust statistics for single input features in both domains. We applied CLUB-PLS to investigating the genetic basis of surface area and cortical thickness in a sample of 33,000 subjects from the UK Biobank. We found 107 genome-wide significant locus-phenotype pairs that are linked to 386 different genes. We found that a vast majority of these loci could be technically validated at a high rate: using classic GWAS or Genome-Wide Inferred Statistics (GWIS) we found that 85 locus-phenotype pairs exceeded the genome-wide suggestive (P<1e-05) threshold.
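For readers unfamiliar with the term, a cluster bootstrap in its generic textbook form resamples whole groups of observations with replacement rather than individual rows, so that dependence within a group is preserved in every replicate. The sketch below shows only that resampling step. How CLUB-PLS defines its clusters and combines the replicates is not stated in the abstract, so the cluster labels and the function name here are purely hypothetical.

```python
import random
from collections import defaultdict


def cluster_bootstrap_indices(cluster_ids, rng):
    """Draw one bootstrap replicate by resampling whole clusters.

    cluster_ids: one (hypothetical) cluster label per row.
    rng: a random.Random instance, passed in for reproducibility.
    Returns row indices; a cluster drawn k times contributes its rows k times.
    """
    members = defaultdict(list)
    for row, cluster in enumerate(cluster_ids):
        members[cluster].append(row)
    clusters = list(members)
    # Draw as many clusters as exist, with replacement.
    drawn = [rng.choice(clusters) for _ in clusters]
    return [row for cluster in drawn for row in members[cluster]]


labels = ["a", "a", "b", "c", "c", "c"]
replicate = cluster_bootstrap_indices(labels, random.Random(0))
print(replicate)
```

A statistic of interest would then be recomputed on each replicate, and its spread across replicates used as a robustness estimate.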

Submitted 19 September, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

Comments: 12 pages, 4 Figures, 2 Tables

arXiv:2309.04651 [pdf]

Video and Synthetic MRI Pre-training of 3D Vision Architectures for Neuroimage Analysis

Authors: Nikhil J. Dhinagar, Amit Singh, Saket Ozarkar, Ketaki Buwa, Sophia I. Thomopoulos, Conor Owens-Walton, Emily Laltoo, Yao-Liang Chen, Philip Cook, Corey McMillan, Chih-Chien Tsai, J-J Wang, Yih-Ru Wu, Paul M. Thompson

Abstract: Transfer learning represents a recent paradigm shift in the way we build artificial intelligence (AI) systems. In contrast to training task-specific models, transfer learning involves pre-training deep learning models on a large corpus of data and minimally fine-tuning them for adaptation to specific tasks. Even so, for 3D medical imaging tasks, we do not know if it is best to pre-train models on natural images, medical images, or even synthetically generated MRI scans or video data. To evaluate these alternatives, here we benchmarked vision transformers (ViTs) and convolutional neural networks (CNNs), initialized with varied upstream pre-training approaches. These methods were then adapted to three unique downstream neuroimaging tasks with a range of difficulty: Alzheimer's disease (AD) and Parkinson's disease (PD) classification, "brain age" prediction. Experimental tests led to the following key observations: 1. Pre-training improved performance across all tasks including a boost of 7.4% for AD classification and 4.6% for PD classification for the ViT and 19.1% for PD classification and reduction in brain age prediction error by 1.26 years for CNNs, 2. Pre-training on large-scale video or synthetic MRI data boosted performance of ViTs, 3. CNNs were robust in limited-data settings, and in-domain pretraining enhanced their performances, 4. Pre-training improved generalization to out-of-distribution datasets and sites.
Overall, we benchmarked different vision architectures, revealing the value of pre-training them with emerging datasets for model initialization. The resulting pre-trained models can be adapted to a range of downstream neuroimaging tasks, even when training data for the target task is limited.

Submitted 8 September, 2023; originally announced September 2023.

arXiv:2309.04607 [pdf]

Linking Symptom Inventories using Semantic Textual Similarity

Authors: Eamonn Kennedy, Shashank Vadlamani, Hannah M Lindsey, Kelly S Peterson, Kristen Dams OConnor, Kenton Murray, Ronak Agarwal, Houshang H Amiri, Raeda K Andersen, Talin Babikian, David A Baron, Erin D Bigler, Karen Caeyenberghs, Lisa Delano-Wood, Seth G Disner, Ekaterina Dobryakova, Blessen C Eapen, Rachel M Edelstein, Carrie Esopenko, Helen M Genova, Elbert Geuze, Naomi J Goodrich-Hunsaker, Jordan Grafman, Asta K Haberg, Cooper B Hodges, et al. (57 additional authors not shown)

Abstract: An extensive library of symptom inventories has been developed over time to measure clinical symptoms, but this variety has led to several long-standing issues. Most notably, results drawn from different settings and studies are not comparable, which limits reproducibility. Here, we present an artificial intelligence (AI) approach using semantic textual similarity (STS) to link symptoms and scores across previously incongruous symptom inventories. We tested the ability of four pre-trained STS models to screen thousands of symptom description pairs for related content, a challenging task typically requiring expert panels. Models were tasked to predict symptom severity across four different inventories for 6,607 participants drawn from 16 international data sources. The STS approach achieved 74.8% accuracy across five tasks, outperforming other models tested. This work suggests that incorporating contextual, semantic information can assist expert decision-making processes, yielding gains for both general and disease-specific clinical assessment.

Submitted 8 September, 2023; originally announced September 2023.

arXiv:2308.10654 [pdf, other]

doi 10.4204/EPTCS.383.3

Algebraic Reasoning About Timeliness

Authors: Seyed Hossein Haeri, Peter W. Thompson, Peter Van Roy, Magne Haveraaen, Neil J. Davies, Mikhail Barash, Kevin Hammond, James Chapman

Abstract: Designing distributed systems to have predictable performance under high load is difficult because of resource exhaustion, non-linearity, and stochastic behaviour. Timeliness, i.e., delivering results within defined time bounds, is a central aspect of predictable performance. In this paper, we focus on timeliness using the DELTA-Q Systems Development paradigm (DELTA-QSD, developed by PNSol), which computes timeliness by modelling systems observationally using so-called outcome expressions. An outcome expression is a compositional definition of a system's observed behaviour in terms of its basic operations. Given the behaviour of the basic operations, DELTA-QSD efficiently computes the stochastic behaviour of the whole system including its timeliness. This paper formally proves useful algebraic properties of outcome expressions w.r.t. timeliness. We prove the different algebraic structures that the set of outcome expressions forms with the different DELTA-QSD operators and demonstrate why those operators do not form richer structures. We prove or disprove the set of all possible distributivity results on outcome expressions. On our way to disproving 8 of those distributivity results, we develop a technique called properisation, which gives rise to the first body of maths for improper random variables. Finally, we also prove 14 equivalences that have been used in the past in the practice of DELTA-QSD. An immediate benefit is rewrite rules that can be used for design exploration under established timeliness equivalence. This work is part of an ongoing project to disseminate and build tool support for DELTA-QSD. The ability to rewrite outcome expressions is essential for efficient tool support.

Submitted 21 August, 2023; originally announced August 2023.

Comments: In Proceedings ICE 2023, arXiv:2308.08920

ACM Class: B.8.2; C.4; D.2.4; D.2.8; F.3.2; F.3.1; F.4.1; F.4.3; I.1.1

Journal ref: EPTCS 383, 2023, pp. 35-54

arXiv:2305.16222 [pdf, ps, other]

Incomplete Multimodal Learning for Complex Brain Disorders Prediction

Authors: Reza Shirkavand, Liang Zhan, Heng Huang, Li Shen, Paul M. Thompson

Abstract: Recent advancements in the acquisition of various brain data sources have created new opportunities for integrating multimodal brain data to assist in early detection of complex brain disorders. However, current data integration approaches typically need a complete set of biomedical data modalities, which may not always be feasible, as some modalities are only available in large-scale research cohorts and are prohibitive to collect in routine clinical practice. Especially in studies of brain diseases, research cohorts may include both neuroimaging data and genetic data, but for practical clinical diagnosis, we often need to make disease predictions based only on neuroimages. As a result, it is desirable to design machine learning models that can use all available data (different data could provide complementary information) during training but conduct inference using only the most common data modality. We propose a new incomplete multimodal data integration approach that employs transformers and generative adversarial networks to effectively exploit auxiliary modalities available during training in order to improve the performance of a unimodal model at inference. We apply our new method to predict cognitive degeneration and disease outcomes using the multimodal imaging genetic data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort. Experimental results demonstrate that our approach outperforms the related machine learning and deep learning methods by a significant margin.

Submitted 25 May, 2023; originally announced May 2023.

arXiv:2304.00134 [pdf]

A Surface-Based Federated Chow Test Model for Integrating APOE Status, Tau Deposition Measure, and Hippocampal Surface Morphometry

Authors: Jianfeng Wu, Yi Su, Yanxi Chen, Wenhui Zhu, Eric M. Reiman, Richard J. Caselli, Kewei Chen, Paul M. Thompson, Junwen Wang, Yalin Wang

Abstract: Background: Alzheimer's Disease (AD) is the most common type of age-related dementia, affecting 6.2 million people aged 65 or older according to CDC data. It is commonly agreed that discovering an effective AD diagnosis biomarker could have enormous public health benefits, potentially preventing or delaying up to 40% of dementia cases. Tau neurofibrillary tangles are the primary driver of downstream neurodegeneration and subsequent cognitive impairment in AD, resulting in structural deformations such as hippocampal atrophy that can be observed in magnetic resonance imaging (MRI) scans. Objective: To build a surface-based model to 1) detect differences between APOE subgroups in patterns of tau deposition and hippocampal atrophy, and 2) use the extracted surface-based features to predict cognitive decline. Methods: Using data obtained from different institutions, we develop a surface-based federated Chow test model to study the synergistic effects of APOE, a previously reported significant risk factor of AD, and tau on hippocampal surface morphometry. Results: We illustrate that the APOE-specific morphometry features correlate with AD progression and better predict future AD conversion than other MRI biomarkers. For example, a strong association between atrophy and abnormal tau was identified in hippocampal subregion cornu ammonis 1 (CA1 subfield) and subiculum in the e4 homozygote cohort. Conclusion: Our model allows for identifying MRI biomarkers for AD and cognitive decline prediction and may uncover a corner of the neural mechanism of the influence of APOE and tau deposition on hippocampal morphology.

Submitted 31 March, 2023; originally announced April 2023.

arXiv:2303.08224 [pdf]

Few-Shot Classification of Autism Spectrum Disorder using Site-Agnostic Meta-Learning and Brain MRI

Authors: Nikhil J. Dhinagar, Vignesh Santhalingam, Katherine E. Lawrence, Emily Laltoo, Paul M. Thompson

Abstract: For machine learning applications in medical imaging, the availability of training data is often limited, which hampers the design of radiological classifiers for subtle conditions such as autism spectrum disorder (ASD). Transfer learning is one method to counter this problem of low training data regimes. Here we explore the use of meta-learning for very low data regimes in the context of having prior data from multiple sites, an approach we term site-agnostic meta-learning. Inspired by the effectiveness of meta-learning for optimizing a model across multiple tasks, here we propose a framework to adapt it to learn across multiple sites. We tested our meta-learning model for classifying ASD versus typically developing controls in 2,201 T1-weighted (T1-w) MRI scans collected from 38 imaging sites as part of Autism Brain Imaging Data Exchange (ABIDE) [age: 5.2-64.0 years]. The method was trained to find a good initialization state for our model that can quickly adapt to data from new unseen sites by fine-tuning on the limited data that is available. The proposed method achieved an ROC-AUC=0.857 on 370 scans from 7 unseen sites in ABIDE using a few-shot setting of 2-way 20-shot, i.e., 20 training samples per site. Our results outperformed a transfer learning baseline by generalizing across a wider range of sites as well as other related prior work. We also tested our model in a zero-shot setting on an independent test site without any additional fine-tuning. Our experiments show the promise of the proposed site-agnostic meta-learning framework for challenging neuroimaging tasks involving multi-site heterogeneity with limited availability of training data.

Submitted 14 March, 2023; originally announced March 2023.

arXiv:2303.08216 [pdf]

Efficiently Training Vision Transformers on Structural MRI Scans for Alzheimer's Disease Detection

Authors: Nikhil J. Dhinagar, Sophia I. Thomopoulos, Emily Laltoo, Paul M. Thompson

Abstract: Neuroimaging of large populations is valuable to identify factors that promote or resist brain disease, and to assist diagnosis, subtyping, and prognosis. Data-driven models such as convolutional neural networks (CNNs) have increasingly been applied to brain images to perform diagnostic and prognostic tasks by learning robust features. Vision transformers (ViT) - a new class of deep learning architectures - have emerged in recent years as an alternative to CNNs for several computer vision applications. Here we tested variants of the ViT architecture for a range of desired neuroimaging downstream tasks based on difficulty, in this case for sex and Alzheimer's disease (AD) classification based on 3D brain MRI. In our experiments, two vision transformer architecture variants achieved an AUC of 0.987 for sex and 0.892 for AD classification, respectively. We independently evaluated our models on data from two benchmark AD datasets. We achieved a performance boost of 5% and 9-10% upon fine-tuning vision transformer models pre-trained on synthetic (generated by a latent diffusion model) and real MRI scans, respectively. Our main contributions include testing the effects of different ViT training strategies including pre-training, data augmentation and learning rate warm-ups followed by annealing, as pertaining to the neuroimaging domain. These techniques are essential for training ViT-like models for neuroimaging applications where training data is usually limited. We also analyzed the effect of the amount of training data utilized on the test-time performance of the ViT via data-model scaling curves.

Submitted 14 March, 2023; originally announced March 2023.

arXiv:2303.01491 [pdf, other]

Transferring Models Trained on Natural Images to 3D MRI via Position Encoded Slice Models

Authors: Umang Gupta, Tamoghna Chattopadhyay, Nikhil Dhinagar, Paul M. Thompson, Greg Ver Steeg, The Alzheimer's Disease Neuroimaging Initiative

Abstract: Transfer learning has remarkably improved computer vision. These advances also promise improvements in neuroimaging, where training set sizes are often small. However, various difficulties arise in directly applying models pretrained on natural images to radiologic images, such as MRIs. In particular, a mismatch in the input space (2D images vs. 3D MRIs) restricts the direct transfer of models, often forcing us to consider only a few MRI slices as input. To this end, we leverage the 2D-Slice-CNN architecture of Gupta et al. (2021), which embeds all the MRI slices with 2D encoders (neural networks that take 2D image input) and combines them via permutation-invariant layers. With the insight that the pretrained model can serve as the 2D encoder, we initialize the 2D encoder with ImageNet pretrained weights that outperform those initialized and trained from scratch on two neuroimaging tasks -- brain age prediction on the UK Biobank dataset and Alzheimer's disease detection on the ADNI dataset. Further, we improve the modeling capabilities of 2D-Slice models by incorporating spatial information through position embeddings, which can improve the performance in some cases.

Submitted 2 March, 2023; originally announced March 2023.

Comments: To appear at IEEE International Symposium on Biomedical Imaging 2023 (ISBI 2023). Code is available at https://github.com/umgupta/2d-slice-set-networks

arXiv:2302.13631 [pdf]

Curriculum Based Multi-Task Learning for Parkinson's Disease Detection

Authors: Nikhil J. Dhinagar, Conor Owens-Walton, Emily Laltoo, Christina P. Boyle, Yao-Liang Chen, Philip Cook, Corey McMillan, Chih-Chien Tsai, J-J Wang, Yih-Ru Wu, Ysbrand van der Werf, Paul M. Thompson

Abstract: There is great interest in developing radiological classifiers for diagnosis, staging, and predictive modeling in progressive diseases such as Parkinson's disease (PD), a neurodegenerative disease that is difficult to detect in its early stages. Here we leverage severity-based meta-data on the stages of disease to define a curriculum for training a deep convolutional neural network (CNN). Typically, deep learning networks are trained by randomly selecting samples in each mini-batch. By contrast, curriculum learning is a training strategy that aims to boost classifier performance by starting with examples that are easier to classify. Here we define a curriculum to progressively increase the difficulty of the training data corresponding to the Hoehn and Yahr (H&Y) staging system for PD (total N=1,012; 653 PD patients, 359 controls; age range: 20.0-84.9 years). Even with our multi-task setting using pre-trained CNNs and transfer learning, PD classification based on T1-weighted (T1-w) MRI was challenging (ROC AUC: 0.59-0.65), but curriculum training boosted performance (by 3.9%) compared to our baseline model. Future work with multimodal imaging may further boost performance.

Submitted 27 February, 2023; originally announced February 2023.

Comments: Accepted for publication at the 20th IEEE International Symposium on Biomedical Imaging, ISBI 2023

arXiv:2211.05235 [pdf]

Improved Prediction of Beta-Amyloid and Tau Burden Using Hippocampal Surface Multivariate Morphometry Statistics and Sparse Coding

Authors: Jianfeng Wu, Yi Su, Wenhui Zhu, Negar Jalili Mallak, Natasha Lepore, Eric M. Reiman, Richard J. Caselli, Paul M. Thompson, Kewei Chen, Yalin Wang

Abstract: Background: Beta-amyloid (Aβ) plaques and tau protein tangles in the brain are the defining 'A' and 'T' hallmarks of Alzheimer's disease (AD), and together with structural atrophy detectable on brain magnetic resonance imaging (MRI) scans as one of the neurodegenerative ('N') biomarkers comprise the "ATN framework" of AD. Current methods to detect Aβ/tau pathology include cerebrospinal fluid (CSF; invasive), positron emission tomography (PET; costly and not widely available), and blood-based biomarkers (BBBM; promising but mainly still in development). Objective: To develop a non-invasive and widely available structural MRI-based framework to quantitatively predict the amyloid and tau measurements. Methods: With MRI-based hippocampal multivariate morphometry statistics (MMS) features, we apply our Patch Analysis-based Surface Correntropy-induced Sparse coding and max-pooling (PASCS-MP) method combined with the ridge regression model to individual amyloid/tau measure prediction. Results: We evaluate our framework on amyloid PET/MRI and tau PET/MRI datasets from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Each subject has one pair consisting of a PET image and MRI scan, collected at about the same time. Experimental results suggest that amyloid/tau measurements predicted with our PASCS-MP representations are closer to the real values than the measures derived from other approaches, such as hippocampal surface area, volume, and shape morphometry features based on spherical harmonics (SPHARM). Conclusion: The MMS-based PASCS-MP is an efficient tool that can bridge hippocampal atrophy with amyloid and tau pathology and thus help assess disease burden, progression, and treatment effects.

Submitted 27 October, 2022; originally announced November 2022.

Comments: 34 pages, 5 figures, 1 table, accepted by the Journal of Alzheimer's Disease

MSC Class: 65U05

arXiv:2208.11669 [pdf, other]

Towards Sparsified Federated Neuroimaging Models via Weight Pruning

Authors: Dimitris Stripelis, Umang Gupta, Nikhil Dhinagar, Greg Ver Steeg, Paul Thompson, José Luis Ambite

Abstract: Federated training of large deep neural networks can often be restrictive due to the increasing costs of communicating the updates with increasing model sizes. Various model pruning techniques have been designed in centralized settings to reduce inference times. Combining centralized pruning techniques with federated training seems intuitive for reducing communication costs -- by pruning the model parameters right before the communication step. Moreover, such a progressive model pruning approach during training can also reduce training times/costs. To this end, we propose FedSparsify, which performs model pruning during federated training. In our experiments in centralized and federated settings on the brain age prediction task (estimating a person's age from their brain MRI), we demonstrate that models can be pruned up to 95% sparsity without affecting performance even in challenging federated learning environments with highly heterogeneous data distributions. One surprising benefit of model pruning is improved model privacy. We demonstrate that models with high sparsity are less susceptible to membership inference attacks, a type of privacy attack.

Submitted 24 August, 2022; originally announced August 2022.

Comments: Accepted to 3rd MICCAI Workshop on Distributed, Collaborative and Federated Learning (DeCaF, 2022)

arXiv:2205.07854 [pdf, other]

Functional2Structural: Cross-Modality Brain Networks Representation Learning

Authors: Haoteng Tang, Xiyao Fu, Lei Guo, Yalin Wang, Scott Mackin, Olusola Ajilore, Alex Leow, Paul Thompson, Heng Huang, Liang Zhan

Abstract: MRI-based modeling of brain networks has been widely used to understand functional and structural interactions and connections among brain regions, and factors that affect them, such as brain development and disease. Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases. Since brain networks derived from functional and structural MRI describe the brain topology from different perspectives, exploring a representation that combines these cross-modality brain networks is non-trivial. Most current studies aim to extract a fused representation of the two types of brain network by projecting the structural network to the functional counterpart. Since the functional network is dynamic and the structural network is static, mapping a static object to a dynamic object is suboptimal. However, mapping in the opposite direction is not feasible due to the non-negativity requirement of current graph learning techniques. Here, we propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder that, from an opposite perspective, learns the cross-modality representations by projecting the functional network to the structural counterpart. We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets (HCP and OASIS). The experimental results clearly demonstrate the advantages of our model compared to several state-of-the-art methods.

Submitted 5 May, 2022; originally announced May 2022.

arXiv:2205.05249 [pdf, other]

Secure & Private Federated Neuroimaging

Authors: Dimitris Stripelis, Umang Gupta, Hamza Saleem, Nikhil Dhinagar, Tanmay Ghai, Rafael Chrysovalantis Anastasiou, Armaghan Asghar, Greg Ver Steeg, Srivatsan Ravi, Muhammad Naveed, Paul M. Thompson, Jose Luis Ambite

Abstract: The amount of biomedical data continues to grow rapidly. However, collecting data from multiple sites for joint analysis remains challenging due to security, privacy, and regulatory concerns. To overcome this challenge, we use Federated Learning, which enables distributed training of neural network models over multiple data sources without sharing data. Each site trains the neural network over its private data for some time, then shares the neural network parameters (i.e., weights, gradients) with a Federation Controller, which in turn aggregates the local models, sends the resulting community model back to each site, and the process repeats. Our Federated Learning architecture, MetisFL, provides strong security and privacy. First, sample data never leaves a site. Second, neural network parameters are encrypted before transmission and the global neural model is computed under fully-homomorphic encryption. Finally, we use information-theoretic methods to limit information leakage from the neural model to prevent a curious site from performing model inversion or membership attacks. We present a thorough evaluation of the performance of secure, private federated learning in neuroimaging tasks, including for predicting Alzheimer's disease and estimating BrainAGE from magnetic resonance imaging (MRI) studies, in challenging, heterogeneous federated environments where sites have different amounts of data and statistical distributions.

Submitted 28 August, 2023; v1 submitted 10 May, 2022; originally announced May 2022.

Comments: 18 pages, 13 figures, 2 tables

ACM Class: I.2; I.5.1; J.3

arXiv:2110.10709 [pdf]

Predicting Tau Accumulation in Cerebral Cortex with Multivariate MRI Morphometry Measurements, Sparse Coding, and Correntropy

Authors: Jianfeng Wu, Wenhui Zhu, Yi Su, Jie Gui, Natasha Lepore, Eric M. Reiman, Richard J. Caselli, Paul M. Thompson, Kewei Chen, Yalin Wang

Abstract: Biomarker-assisted diagnosis and intervention in Alzheimer's disease (AD) may be the key to prevention breakthroughs. One of the hallmarks of AD is the accumulation of tau plaques in the human brain. However, current methods to detect tau pathology are either invasive (lumbar puncture) or quite costly and not widely available (Tau PET). In our previous work, structural MRI-based hippocampal multivariate morphometry statistics (MMS) showed superior performance as an effective neurodegenerative biomarker for preclinical AD, and Patch Analysis-based Surface Correntropy-induced Sparse coding and max-pooling (PASCS-MP) showed an excellent ability to generate low-dimensional representations with strong statistical power for brain amyloid prediction. In this work, we apply this framework together with ridge regression models to predict Tau deposition in Braak12 and Braak34 brain regions separately. We evaluate our framework on 925 subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Each subject has one pair consisting of a PET image and MRI scan, which were collected at about the same time. Experimental results suggest that the representations from our MMS and PASCS-MP have stronger predictive power and their predicted Braak12 and Braak34 are closer to the real values compared to the measures derived from other approaches such as hippocampal surface area and volume, and shape morphometry features based on spherical harmonics (SPHARM).

Submitted 20 October, 2021; originally announced October 2021.

Comments: 10 pages, 5 figures, 17th International Symposium on Medical Information Processing and Analysis

arXiv:2108.03437 [pdf, other]

Secure Neuroimaging Analysis using Federated Learning with Homomorphic Encryption

Authors: Dimitris Stripelis, Hamza Saleem, Tanmay Ghai, Nikhil Dhinagar, Umang Gupta, Chrysovalantis Anastasiou, Greg Ver Steeg, Srivatsan Ravi, Muhammad Naveed, Paul M. Thompson, Jose Luis Ambite

Abstract: Federated learning (FL) enables distributed computation of machine learning models over various disparate, remote data sources, without requiring the transfer of any individual data to a centralized location. This results in an improved generalizability of models and efficient scaling of computation as more sources and larger datasets are added to the federation. Nevertheless, recent membership attacks show that private or sensitive personal data can sometimes be leaked or inferred when model parameters or summary statistics are shared with a central site, requiring improved security solutions. In this work, we propose a framework for secure FL using fully-homomorphic encryption (FHE). Specifically, we use the CKKS construction, an approximate, floating point compatible scheme that benefits from ciphertext packing and rescaling. In our evaluation on large-scale brain MRI datasets, we use our proposed secure FL framework to train a deep learning model to predict a person's age from distributed MRI scans, a common benchmarking task, and demonstrate that there is no degradation in the learning performance between the encrypted and non-encrypted federated models.

Submitted 9 November, 2021; v1 submitted 7 August, 2021; originally announced August 2021.

Comments: 9 pages, 3 figures, 1 algorithm

arXiv:2105.02866 [pdf, other]

Membership Inference Attacks on Deep Regression Models for Neuroimaging

Authors: Umang Gupta, Dimitris Stripelis, Pradeep K. Lam, Paul M. Thompson, José Luis Ambite, Greg Ver Steeg

Abstract: Ensuring the privacy of research participants is vital, even more so in healthcare environments. Deep learning approaches to neuroimaging require large datasets, and this often necessitates sharing data between multiple sites, which is antithetical to the privacy objectives. Federated learning is a commonly proposed solution to this problem. It circumvents the need for data sharing by sharing parameters during the training process. However, we demonstrate that allowing access to parameters may leak private information even if data is never directly shared. In particular, we show that it is possible to infer if a sample was used to train the model given only access to the model prediction (black-box) or access to the model itself (white-box) and some leaked samples from the training data distribution. Such attacks are commonly referred to as Membership Inference attacks. We show realistic Membership Inference attacks on deep learning models trained for 3D neuroimaging tasks in a centralized as well as decentralized setup. We demonstrate feasible attacks on brain age prediction models (deep learning models that predict a person's age from their brain MRI scan). We correctly identified whether an MRI scan was used in model training with a 60% to over 80% success rate depending on model complexity and security assumptions.

Submitted 3 June, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

Comments: To appear at Medical Imaging with Deep Learning 2021 (MIDL 2021)

arXiv:2103.12420 [pdf, other]

HSEarch: semantic search system for workplace accident reports

Authors: Emrah Inan, Paul Thompson, Tim Yates, Sophia Ananiadou

Abstract: Semantic search engines, which integrate the output of text mining (TM) methods, can significantly increase the ease and efficiency of finding relevant documents and locating important information within them. We present a novel search engine for the construction industry, HSEarch (http://www.nactem.ac.uk/hse/), which uses TM methods to provide semantically-enhanced, faceted search over a repository of workplace accident reports. Compared to previous TM-driven search engines for the construction industry, HSEarch provides a more interactive means for users to explore the contents of the repository, to review documents more systematically and to locate relevant knowledge within them.

Submitted 23 March, 2021; originally announced March 2021.

Comments: Accepted to appear in ECIR 2021

arXiv:2102.10503 [pdf, ps, other]

Predicting Future Cognitive Decline with Hyperbolic Stochastic Coding

Authors: J. Zhang, Q. Dong, J. Shi, Q. Li, C. M. Stonnington, B. A. Gutman, K. Chen, E. M. Reiman, R. J. Caselli, P. M. Thompson, J. Ye, Y. Wang

Abstract: Hyperbolic geometry has been successfully applied in modeling brain cortical and subcortical surfaces with general topological structures. However, such approaches, similar to other surface-based brain morphology analysis methods, usually generate high-dimensional features. This limits their statistical power in cognitive decline prediction research, especially in datasets with limited subject numbers. To address this limitation, we propose a novel framework termed hyperbolic stochastic coding (HSC). Our preliminary experimental results show that our algorithm achieves superior results on various classification tasks. Our work may enrich surface-based brain imaging research tools and potentially result in a diagnostic and prognostic indicator useful in individualized treatment strategies.

Submitted 20 February, 2021; originally announced February 2021.

arXiv:2102.08440 [pdf, other]

Scaling Neuroscience Research using Federated Learning

Authors: Dimitris Stripelis, Jose Luis Ambite, Pradeep Lam, Paul Thompson

Abstract: The amount of biomedical data continues to grow rapidly. However, the ability to analyze these data is limited due to privacy and regulatory concerns. Machine learning approaches that require data to be copied to a single location are hampered by the challenges of data sharing. Federated Learning is a promising approach to learn a joint model over data silos. This architecture does not share any subject data across sites, only aggregated parameters, often in encrypted environments, thus satisfying privacy and regulatory requirements. Here, we describe our Federated Learning architecture and training policies. We demonstrate our approach on a brain age prediction model on structural MRI scans distributed across multiple sites with diverse amounts of data and subject (age) distributions. In these heterogeneous environments, our Semi-Synchronous protocol provides faster convergence.

Submitted 16 February, 2021; originally announced February 2021.

Comments: To appear at IEEE International Symposium on Biomedical Imaging 2021 (ISBI 2021)

MSC Class: 68T07 ACM Class: I.5.4

arXiv:2102.04438 [pdf, other]

Improved Brain Age Estimation with Slice-based Set Networks

Authors: Umang Gupta, Pradeep K. Lam, Greg Ver Steeg, Paul M. Thompson

Abstract: Deep Learning for neuroimaging data is a promising but challenging direction. The high dimensionality of 3D MRI scans makes this endeavor compute and data-intensive. Most conventional 3D neuroimaging methods use 3D-CNN-based architectures with a large number of parameters and require more time and data to train. Recently, 2D-slice-based models have received increasing attention as they have fewer parameters and may require fewer samples to achieve comparable performance. In this paper, we propose a new architecture for BrainAGE prediction. The proposed architecture works by encoding each 2D slice in an MRI with a deep 2D-CNN model. Next, it combines the information from these 2D-slice encodings using set networks or permutation invariant layers. Experiments on the BrainAGE prediction problem, using the UK Biobank dataset, showed that the model with the permutation invariant layers trains faster and provides better predictions compared to other state-of-the-art approaches.

Submitted 9 February, 2021; v1 submitted 8 February, 2021; originally announced February 2021.

Comments: To appear at IEEE International Symposium on Biomedical Imaging 2021 (ISBI 2021). Code is available at https://git.io/JtazG

arXiv:2011.12875 [pdf, other]

Rapid Exploration of Optimization Strategies on Advanced Architectures using TestSNAP and LAMMPS

Authors: Rahulkumar Gayatri, Stan Moore, Evan Weinberg, Nicholas Lubbers, Sarah Anderson, Jack Deslippe, Danny Perez, Aidan P. Thompson

Abstract: The exascale race is at an end with the announcement of the Aurora and Frontier machines. This next generation of supercomputers utilizes diverse hardware architectures to achieve their compute performance, providing an added onus on the performance portability of applications. An expanding fragmentation of programming models would provide a compounding optimization challenge were it not for the evolution of performance-portable frameworks, providing unified models for mapping abstract hierarchies of parallelism to diverse architectures. Kokkos is one such performance portable programming model for C++ applications, providing back-end implementations for each major HPC platform. Even with a performance portable framework, restructuring algorithms to expose higher degrees of parallelism is non-trivial. The Spectral Neighbor Analysis Potential (SNAP) is a machine-learned inter-atomic potential utilized in cutting-edge molecular dynamics simulations. Previous implementations of the SNAP calculation showed a downward trend in their performance relative to peak on newer-generation CPUs and low performance on GPUs. In this paper we describe the restructuring and optimization of SNAP as implemented in the Kokkos CUDA backend of the LAMMPS molecular dynamics package, benchmarked on NVIDIA GPUs. We identify novel patterns of hierarchical parallelism, facilitating a minimization of memory access overheads and pushing the implementation into a compute-saturated regime. Our implementation via Kokkos enables recompile-and-run efficiency on upcoming architectures. We find a $\sim$22x time-to-solution improvement relative to an existing implementation as measured on an NVIDIA Tesla V100-16GB for an important benchmark.

Submitted 25 November, 2020; originally announced November 2020.

Comments: Submitted to IPDPS 2021, October 19, 2020

arXiv:2010.04905 [pdf, other]

doi 10.1103/PhysRevB.104.035120

Accelerating Finite-temperature Kohn-Sham Density Functional Theory with Deep Neural Networks

Authors: J. Austin Ellis, Lenz Fiedler, Gabriel A. Popoola, Normand A. Modine, J. Adam Stephens, Aidan P. Thompson, Attila Cangi, Sivasankaran Rajamanickam

Abstract: We present a numerical modeling workflow based on machine learning (ML) which reproduces the total energies produced by Kohn-Sham density functional theory (DFT) at finite electronic temperature to within chemical accuracy at negligible computational cost. Based on deep neural networks, our workflow yields the local density of states (LDOS) for a given atomic configuration. From the LDOS, spatially-resolved, energy-resolved, and integrated quantities can be calculated, including the DFT total free energy, which serves as the Born-Oppenheimer potential energy surface for the atoms. We demonstrate the efficacy of this approach for both solid and liquid metals and compare results between independent and unified machine-learning models for solid and liquid aluminum. Our machine-learning density functional theory framework opens up the path towards multiscale materials modeling for matter under ambient and extreme conditions at a computational scale and cost that is unattainable with current algorithms.

Submitted 9 July, 2021; v1 submitted 10 October, 2020; originally announced October 2020.

Journal ref: Phys. Rev. B 104, 035120 (2021)

arXiv:2007.14787 [pdf, ps, other]

Parameter identifiability and input-output equations

Authors: Alexey Ovchinnikov, Gleb Pogudin, Peter Thompson

Abstract: Structural parameter identifiability is a property of a differential model with parameters that allows for the parameters to be determined from the model equations in the absence of noise. One of the standard approaches to assessing this problem is via input-output equations and, in particular, characteristic sets of differential ideals. The precise relation between identifiability and input-output identifiability is subtle. The goal of this note is to clarify this relation. The main results are: 1) identifiability implies input-output identifiability; 2) these notions coincide if the model does not have rational first integrals; 3) the field of input-output identifiable functions is generated by the coefficients of a "minimal" characteristic set of the corresponding differential ideal. We expect that some of these facts may be known to the experts in the area, but we are not aware of any articles in which these facts are stated precisely and rigorously proved.

Submitted 27 December, 2020; v1 submitted 27 July, 2020; originally announced July 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1910.03960

arXiv:2007.09777 [pdf, other]

Deep Representation Learning For Multimodal Brain Networks

Authors: Wen Zhang, Liang Zhan, Paul Thompson, Yalin Wang

Abstract: Applying network science approaches to investigate the functions and anatomy of the human brain is prevalent in modern medical imaging analysis. Due to the complex network topology, for an individual brain, mining a discriminative network representation from the multimodal brain networks is non-trivial. The recent success of deep learning techniques on graph-structured data suggests a new way to model the non-linear cross-modality relationship. However, current deep brain network methods either ignore the intrinsic graph topology or require a network basis shared within a group. To address these challenges, we propose a novel end-to-end deep graph representation learning (Deep Multimodal Brain Networks - DMBN) to fuse multimodal brain networks. Specifically, we decipher the cross-modality relationship through a graph encoding and decoding process. The higher-order network mappings from brain structural networks to functional networks are learned in the node domain. The learned network representation is a set of node features that are informative to induce brain saliency maps in a supervised manner. We test our framework in both synthetic and real image data. The experimental results show the superiority of the proposed method over some other state-of-the-art deep brain network models.

Submitted 19 July, 2020; originally announced July 2020.

Comments: 11 pages, 3 figures, MICCAI 2020

arXiv:2006.00139 [pdf, other]

Multi-fidelity machine-learning with uncertainty quantification and Bayesian optimization for materials design: Application to ternary random alloys

Authors: Anh Tran, Julien Tranchida, Tim Wildey, Aidan P. Thompson

Abstract: We present a scale-bridging approach based on a multi-fidelity (MF) machine-learning (ML) framework leveraging Gaussian processes (GP) to fuse atomistic computational model predictions across multiple levels of fidelity. Through the posterior variance of the MFGP, our framework naturally enables uncertainty quantification, providing estimates of confidence in the predictions. We used Density Functional Theory as the high-fidelity prediction, while an ML interatomic potential is used as the low-fidelity prediction. Practical materials design efficiency is demonstrated by reproducing the ternary composition dependence of a quantity of interest (bulk modulus) across the full aluminum-niobium-titanium ternary random alloy composition space. The MFGP is then coupled to a Bayesian optimization procedure and the computational efficiency of this approach is demonstrated by performing an on-the-fly search for the global optimum of bulk modulus in the ternary composition space. The framework presented in this manuscript is the first application of MFGP to atomistic materials simulations fusing predictions between Density Functional Theory and classical interatomic potential calculations.

Submitted 5 August, 2020; v1 submitted 29 May, 2020; originally announced June 2020.

arXiv:2006.00115 [pdf, other]

Overview of Scanner Invariant Representations

Authors: Daniel Moyer, Greg Ver Steeg, Paul M. Thompson

Abstract: Pooled imaging data from multiple sources is subject to bias from each source. Studies that do not correct for these scanner/site biases at best lose statistical power, and at worst leave spurious correlations in their data. Estimation of the bias effects is non-trivial due to the paucity of data with correspondence across sites, so-called "traveling phantom" data, which is expensive to collect. Nevertheless, numerous solutions leveraging direct correspondence have been proposed. In contrast to this, Moyer et al. (2019) propose an unsupervised solution using invariant representations, one which does not require correspondence and thus does not require paired images. By leveraging the data processing inequality, an invariant representation can then be used to create an image reconstruction that is uninformative of its original source, yet still faithful to the underlying structure. In the present abstract we provide an overview of this method.

Submitted 29 May, 2020; originally announced June 2020.

Comments: Accepted as a short paper in MIDL 2020. In accordance with the MIDL 2020 Call for Papers, this short paper is an overview of an already published work arXiv:1904.05375, and was submitted to MIDL in order to allow presentation and discussion at the meeting

Report number: MIDL/2020/ExtendedAbstract/yqm9RD_XHT

arXiv:1910.03960 [pdf, ps, other]

doi 10.1109/TAC.2022.3145571

Input-output equations and identifiability of linear ODE models

Authors: Alexey Ovchinnikov, Gleb Pogudin, Peter Thompson

Abstract: Structural identifiability is a property of a differential model with parameters that allows for the parameters to be determined from the model equations in the absence of noise. The method of input-output equations is one method for verifying structural identifiability. This method stands out in its importance because the additional insights it provides can be used to analyze and improve models. However, its complete theoretical grounds and applicability are still to be established. A subtlety and key for this method to work correctly is knowing whether the coefficients of these equations are identifiable. In this paper, to address this, we prove identifiability of the coefficients of input-output equations for types of differential models that often appear in practice, such as linear models with one output and linear compartment models in which, from each compartment, one can reach either a leak or an input. This shows that checking identifiability via input-output equations for these models is legitimate and, as we prove, that the field of identifiable functions is generated by the coefficients of the input-output equations. Finally, we exploit a connection between input-output equations and the transfer function matrix to show that, for a linear compartment model with an input and strongly connected graph, the field of all identifiable functions is generated by the coefficients of the transfer function matrix even if the initial conditions are generic.

Submitted 27 January, 2022; v1 submitted 9 October, 2019; originally announced October 2019.

MSC Class: 12H05; 34A55; 92B05; 93C15; 93B25; 93B30

arXiv:1904.06288 [pdf, other]

Outlier-robust estimation of a sparse linear model using $\ell_1$-penalized Huber's $M$-estimator

Authors: Arnak S. Dalalyan, Philip Thompson

Abstract: We study the problem of estimating a $p$-dimensional $s$-sparse vector in a linear model with Gaussian design and additive noise. In the case where the labels are contaminated by at most $o$ adversarial outliers, we prove that the $\ell_1$-penalized Huber's $M$-estimator based on $n$ samples attains the optimal rate of convergence $(s/n)^{1/2} + (o/n)$, up to a logarithmic factor. For more general design matrices, our results highlight the importance of two properties: the transfer principle and the incoherence property. These properties with suitable constants are shown to yield the optimal rates, up to log-factors, of robust estimation with adversarial contamination.

Submitted 19 November, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

Comments: This is a follow up paper of arXiv:1805.08020

arXiv:1904.05375 [pdf, other]

Scanner Invariant Representations for Diffusion MRI Harmonization

Authors: Daniel Moyer, Greg Ver Steeg, Chantal M. W. Tax, Paul M. Thompson

Abstract: Purpose: In the present work we describe the correction of diffusion-weighted MRI for site and scanner biases using a novel method based on invariant representation. Theory and Methods: Pooled imaging data from multiple sources are subject to variation between the sources. Correcting for these biases has become very important as imaging studies increase in size and multi-site cases become more common. We propose learning an intermediate representation invariant to site/protocol variables, a technique adapted from information theory-based algorithmic fairness; by leveraging the data processing inequality, such a representation can then be used to create an image reconstruction that is uninformative of its original source, yet still faithful to underlying structures. To implement this, we use a deep learning method based on variational auto-encoders (VAE) to construct scanner invariant encodings of the imaging data. Results: To evaluate our method, we use training data from the 2018 MICCAI Computational Diffusion MRI (CDMRI) Challenge Harmonization dataset. Our proposed method shows improvements on independent test data relative to a recently published baseline method on each subtask, mapping data from three different scanning contexts to and from one separate target scanning context. Conclusion: As imaging studies continue to grow, the use of pooled multi-site imaging will similarly increase. Invariant representation presents a strong candidate for the harmonization of these data.

Submitted 31 January, 2020; v1 submitted 10 April, 2019; originally announced April 2019.

arXiv:1810.08553 [pdf, other]

Federated Learning in Distributed Medical Databases: Meta-Analysis of Large-Scale Subcortical Brain Data

Authors: Santiago Silva, Boris Gutman, Eduardo Romero, Paul M Thompson, Andre Altmann, Marco Lorenzi

Abstract: At this moment, databanks worldwide contain brain images of previously unimaginable numbers. Combined with developments in data science, these massive data provide the potential to better understand the genetic underpinnings of brain diseases. However, different datasets, which are stored at different institutions, cannot always be shared directly due to privacy and legal concerns, thus limiting the full exploitation of big data in the study of brain disorders. Here we propose a federated learning framework for securely accessing and meta-analyzing any biomedical data without sharing individual information. We illustrate our framework by investigating brain structural relationships across diseases and clinical cohorts. The framework is first tested on synthetic data and then applied to multi-centric, multi-database studies including ADNI, PPMI, MIRIAD and UK Biobank, showing the potential of the approach for further applications in distributed analysis of multi-centric cohorts.

Submitted 28 January, 2025; v1 submitted 19 October, 2018; originally announced October 2018.

Comments: Federated learning, distributed databases, PCA, SVD, meta-analysis, brain disease

arXiv:1806.04634 [pdf, other]

Measures of Tractography Convergence

Authors: Daniel Moyer, Paul M. Thompson, Greg Ver Steeg

Abstract: In the present work, we use information theory to understand the empirical convergence rate of tractography, a widely-used approach to reconstruct anatomical fiber pathways in the living brain. Based on diffusion MRI data, tractography is the starting point for many methods to study brain connectivity. Of the available methods to perform tractography, most reconstruct a finite set of streamlines, or 3D curves, representing probable connections between anatomical regions, yet relatively little is known about how the sampling of this set of streamlines affects downstream results, and how exhaustive the sampling should be. Here we provide a method to measure the information theoretic surprise (self-cross entropy) for tract sampling schema. We then empirically assess four streamline methods. We demonstrate that the relative information gain is very low after a moderate number of streamlines have been generated for each tested method. The results give rise to several guidelines for optimal sampling in brain connectivity analyses.

Submitted 12 June, 2018; originally announced June 2018.

Comments: 11 pages

arXiv:1805.01049 [pdf, other]

Large-Scale Unsupervised Deep Representation Learning for Brain Structure

Authors: Ayush Jaiswal, Dong Guo, Cauligi S. Raghavendra, Paul Thompson

Abstract: Machine Learning (ML) is increasingly being used for computer aided diagnosis of brain related disorders based on structural magnetic resonance imaging (MRI) data. Most of such work employs biologically and medically meaningful hand-crafted features calculated from different regions of the brain. The construction of such highly specialized features requires a considerable amount of time, manual oversight and careful quality control to ensure the absence of errors in the computational process. Recent advances in Deep Representation Learning have shown great promise in extracting highly non-linear and information-rich features from data. In this paper, we present a novel large-scale deep unsupervised approach to learn generic feature representations of structural brain MRI scans, which requires no specialized domain knowledge or manual intervention. Our method produces low-dimensional representations of brain structure, which can be used to reconstruct brain images with very low error and exhibit performance comparable to FreeSurfer features on various classification tasks.

Submitted 2 May, 2018; originally announced May 2018.

arXiv:1711.05766 [pdf, other]

Fast Predictive Simple Geodesic Regression

Authors: Zhipeng Ding, Greg Fleishman, Xiao Yang, Paul Thompson, Roland Kwitt, Marc Niethammer

Abstract: Deformable image registration and regression are important tasks in medical image analysis. However, they are computationally expensive, especially when analyzing large-scale datasets that contain thousands of images. Hence, cluster computing is typically used, making the approaches dependent on such computational infrastructure. Even larger computational resources are required as study sizes increase. This limits the use of deformable image registration and regression for clinical applications and as component algorithms for other image analysis approaches. We therefore propose using a fast predictive approach to perform image registrations. In particular, we employ these fast registration predictions to approximate a simplified geodesic regression model to capture longitudinal brain changes. The resulting method is orders of magnitude faster than the standard optimization-based regression model and hence facilitates large-scale analysis on a single graphics processing unit (GPU). We evaluate our results on 3D brain magnetic resonance images (MRI) from the ADNI datasets.

Submitted 15 November, 2017; originally announced November 2017.

Comments: 19 pages, 10 figures, 13 tables

arXiv:1709.03645 [pdf, other]

Identifying Genetic Risk Factors via Sparse Group Lasso with Group Graph Structure

Authors: Tao Yang, Paul Thompson, Sihai Zhao, Jieping Ye

Abstract: Genome-wide association studies (GWA studies or GWAS) investigate the relationships between genetic variants such as single-nucleotide polymorphisms (SNPs) and individual traits. Recently, incorporating biological priors together with machine learning methods in GWA studies has attracted increasing attention. However, in practice, nucleotide-level bio-priors have not been well studied to date. Alternatively, studies at the gene level, for example, of protein-protein interactions and pathways, are more rigorous and legitimate, and it is potentially beneficial to utilize such gene-level priors in GWAS. In this paper, we propose a novel two-level structured sparse model, called Sparse Group Lasso with Group-level Graph structure (SGLGG), for GWAS. It can be considered as a sparse group Lasso along with a group-level graph Lasso. Essentially, SGLGG penalizes the nucleotide-level sparsity and takes advantage of gene-level priors (both gene groups and networks) to identify phenotype-associated risk SNPs. We employ the alternating direction method of multipliers algorithm to optimize the proposed model. Our experiments on the Alzheimer's Disease Neuroimaging Initiative whole genome sequence data and neuroimage data demonstrate the effectiveness of SGLGG. As a regression model, it is competitive with state-of-the-art sparse models; as a variable selection method, SGLGG is promising for identifying Alzheimer's disease-related risk SNPs.

Submitted 11 September, 2017; originally announced September 2017.

arXiv:1708.04789 [pdf, other]

revisit: a Workflow Tool for Data Science

Authors: Norman Matloff, Reed Davis, Laurel Beckett, Paul Thompson

Abstract: In recent years there has been widespread concern in the scientific community over a reproducibility crisis. One of the major causes that has been identified is statistical: in much scientific research, the statistical analysis (including data preparation) suffers from a lack of transparency and from methodological problems, both major obstructions to reproducibility. The revisit package aims to remedy this problem by generating a "software paper trail" of the statistical operations applied to a dataset. This record can be "replayed" for verification purposes, and can also be modified to enable alternative analyses. The software also issues warnings of certain kinds of potential errors in statistical methodology, again related to the reproducibility issue.

Submitted 16 August, 2017; originally announced August 2017.

arXiv:1706.06031 [pdf, other]

Evaluating 35 Methods to Generate Structural Connectomes Using Pairwise Classification

Authors: Dmitry Petrov, Alexander Ivanov, Joshua Faskowitz, Boris Gutman, Daniel Moyer, Julio Villalon, Neda Jahanshad, Paul Thompson

Abstract: There is no consensus on how to construct structural brain networks from diffusion MRI. How variations in pre-processing steps affect network reliability and its ability to distinguish subjects remains opaque. In this work, we address this issue by comparing 35 structural connectome-building pipelines. We vary diffusion reconstruction models, tractography algorithms and parcellations. Next, we classify structural connectome pairs as either belonging to the same individual or not. Connectome weights and eight topological derivative measures form our feature set. For experiments, we use three test-retest datasets from the Consortium for Reliability and Reproducibility (CoRR) comprised of a total of 105 individuals. We also compare pairwise classification results to a commonly used parametric test-retest measure, Intraclass Correlation Coefficient (ICC).

Submitted 19 June, 2017; originally announced June 2017.

Comments: Accepted for MICCAI 2017, 8 pages, 3 figures

arXiv:1705.10312 [pdf]

Classification of Major Depressive Disorder via Multi-Site Weighted LASSO Model

Authors: Dajiang Zhu, Brandalyn C. Riedel, Neda Jahanshad, Nynke A. Groenewold, Dan J. Stein, Ian H. Gotlib, Matthew D. Sacchet, Danai Dima, James H. Cole, Cynthia H. Y. Fu, Henrik Walter, Ilya M. Veer, Thomas Frodl, Lianne Schmaal, Dick J. Veltman, Paul M. Thompson

Abstract: Large-scale collaborative analysis of brain imaging data, in psychiatry and neurology, offers a new source of statistical power to discover features that boost accuracy in disease classification, differential diagnosis, and outcome prediction. However, due to data privacy regulations or limited accessibility to large datasets across the world, it is challenging to efficiently integrate distributed information. Here we propose a novel classification framework through multi-site weighted LASSO: each site performs an iterative weighted LASSO for feature selection separately. Within each iteration, the classification result and the selected features are collected to update the weighting parameters for each feature. This new weight is used to guide the LASSO process at the next iteration. Only the features that help to improve the classification accuracy are preserved. In tests on data from five sites (299 patients with major depressive disorder (MDD) and 258 normal controls), our method boosted classification accuracy for MDD by 4.9% on average. This result shows the potential of the proposed new strategy as an effective and practical collaborative platform for machine learning on large scale distributed imaging and biobank data.

Submitted 3 June, 2017; v1 submitted 26 May, 2017; originally announced May 2017.

Comments: Accepted by MICCAI 2017

Showing 1–50 of 60 results for author: Thompson, P