-
FAST functional connectivity implicates P300 connectivity in working memory deficits in Alzheimer's disease
Authors:
Om Roy,
Yashar Moshfeghi,
Agustin Ibanez,
Francisco Lopera,
Mario A Parra,
Keith M Smith
Abstract:
Measuring transient functional connectivity is an important challenge in Electroencephalogram (EEG) research. Here, the rich potential for insightful, discriminative information of brain activity offered by high temporal resolution is confounded by the inherent noise of the medium and the spurious nature of correlations computed over short temporal windows. We propose a novel methodology to overcome these problems called Filter Average Short-Term (FAST) functional connectivity. First, long-term, stable, functional connectivity is averaged across an entire study cohort for a given pair of Visual Short Term Memory (VSTM) tasks. The resulting average connectivity matrix, containing information on the strongest general connections for the tasks, is used as a filter to analyse the transient high temporal resolution functional connectivity of individual subjects. In simulations, we show that this method accurately discriminates differences in noisy Event-Related Potentials (ERPs) between two conditions where standard connectivity and other comparable methods fail. We then apply this to analyse activity related to visual short-term memory binding deficits in two cohorts of familial and sporadic Alzheimer's disease. Reproducible significant differences were found in the binding task with no significant difference in the shape task in the P300 ERP range. This allows new sensitive measurements of transient functional connectivity, which can be implemented to obtain results of clinical significance.
Submitted 9 February, 2025; v1 submitted 28 February, 2024;
originally announced February 2024.
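The filtering idea described in the abstract above can be illustrated with a toy sketch. The abstract does not say how the cohort-averaged matrix is applied as a filter, so everything here is an assumption: the sketch keeps the strongest edges of the averaged matrix as a binary mask and averages a subject's short-window connectivity over those edges only. The function names (`cohort_average`, `strongest_edge_mask`, `filtered_connectivity`) and the toy matrices are hypothetical and are not taken from the paper.

```python
def cohort_average(matrices):
    """Element-wise mean of equally sized connectivity matrices (lists of lists)."""
    n = len(matrices[0])
    m = len(matrices)
    return [[sum(mat[i][j] for mat in matrices) / m for j in range(n)]
            for i in range(n)]


def strongest_edge_mask(avg, n_edges):
    """Assumed filter: the n_edges strongest upper-triangle edges of the average."""
    n = len(avg)
    edges = sorted(((avg[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
                   reverse=True)
    return {(i, j) for _, i, j in edges[:n_edges]}


def filtered_connectivity(window_matrix, mask):
    """Mean short-window connectivity restricted to the masked edges."""
    return sum(window_matrix[i][j] for i, j in mask) / len(mask)


# Toy long-term connectivity for two hypothetical subjects (3 channels).
subject_a = [[0.0, 0.8, 0.2], [0.8, 0.0, 0.4], [0.2, 0.4, 0.0]]
subject_b = [[0.0, 0.6, 0.2], [0.6, 0.0, 0.2], [0.2, 0.2, 0.0]]
average = cohort_average([subject_a, subject_b])
mask = strongest_edge_mask(average, n_edges=1)

# Toy short-window matrix: the large (0, 2) value is ignored by the mask.
window = [[0.0, 0.5, 0.9], [0.5, 0.0, 0.1], [0.9, 0.1, 0.0]]
value = filtered_connectivity(window, mask)
```

The point of the design is that edges which are strong only in a single short window, and are therefore more likely to be spurious, do not enter the measurement unless they are also strong in the stable cohort average.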
-
Evaluation of large language models for discovery of gene set function
Authors:
Mengzhou Hu,
Sahar Alkhairy,
Ingoo Lee,
Rudolf T. Pillich,
Dylan Fong,
Kevin Smith,
Robin Bachelder,
Trey Ideker,
Dexter Pratt
Abstract:
Gene set analysis is a mainstay of functional genomics, but it relies on curated databases of gene functions that are incomplete. Here we evaluate five Large Language Models (LLMs) for their ability to discover the common biological functions represented by a gene set, substantiated by supporting rationale, citations and a confidence assessment. Benchmarking against canonical gene sets from the Gene Ontology, GPT-4 confidently recovered the curated name or a more general concept (73% of cases), while benchmarking against random gene sets correctly yielded zero confidence. Gemini-Pro and Mixtral-Instruct showed ability in naming but were falsely confident for random sets, whereas Llama2-70b had poor performance overall. In gene sets derived from 'omics data, GPT-4 identified novel functions not reported by classical functional enrichment (32% of cases), which independent review indicated were largely verifiable and not hallucinations. The ability to rapidly synthesize common gene functions positions LLMs as valuable 'omics assistants.
Submitted 1 April, 2024; v1 submitted 7 September, 2023;
originally announced September 2023.
-
Machine Learning Methods Applied to Cortico-Cortical Evoked Potentials Aid in Localizing Seizure Onset Zones
Authors:
Ian G. Malone,
Kaleb E. Smith,
Morgan E. Urdaneta,
Tyler S. Davis,
Daria Nesterovich Anderson,
Brian J. Phillip,
John D. Rolston,
Christopher R. Butson
Abstract:
Epilepsy affects millions of people, reducing quality of life and increasing risk of premature death. One-third of epilepsy cases are drug-resistant and require surgery for treatment, which necessitates localizing the seizure onset zone (SOZ) in the brain. Attempts have been made to use cortico-cortical evoked potentials (CCEPs) to improve SOZ localization but none have been successful enough for clinical adoption. Here, we compare the performance of ten machine learning classifiers in localizing SOZ from CCEP data. This preliminary study validates a novel application of machine learning, and the results establish our approach as a promising line of research that warrants further investigation. This work also serves to facilitate discussion and collaboration with fellow machine learning and/or epilepsy researchers.
Submitted 14 November, 2022;
originally announced November 2022.
-
Power contours: optimising sample size and precision in experimental psychology and human neuroscience
Authors:
Daniel H. Baker,
Greta Vilidaite,
Freya A. Lygo,
Anika K. Smith,
Tessa R. Flack,
Andre D. Gouws,
Timothy J. Andrews
Abstract:
When designing experimental studies with human participants, experimenters must decide how many trials each participant will complete, as well as how many participants to test. Most discussion of statistical power (the ability of a study design to detect an effect) has focussed on sample size, and assumed sufficient trials. Here we explore the influence of both factors on statistical power, represented as a two-dimensional plot on which iso-power contours can be visualised. We demonstrate the conditions under which the number of trials is particularly important, i.e. when the within-participant variance is large relative to the between-participants variance. We then derive power contour plots using existing data sets for eight experimental paradigms and methodologies (including reaction times, sensory thresholds, fMRI, MEG, and EEG), and provide example code to calculate estimates of the within- and between-participant variance for each method. In all cases, the within-participant variance was larger than the between-participants variance, meaning that the number of trials has a meaningful influence on statistical power in commonly used paradigms. An online tool is provided (https://shiny.york.ac.uk/powercontours/) for generating power contours, from which the optimal combination of trials and participants can be calculated when designing future studies.
Submitted 4 February, 2020; v1 submitted 16 February, 2019;
originally announced February 2019.
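The trade-off between trials and participants described in the abstract above can be sketched numerically. This is a minimal illustration, not the authors' code or their online tool: it assumes a two-sided one-sample z-test on participant means (a normal approximation rather than a t-test), and the function names and parameter values are hypothetical. Each participant's mean has variance `sd_between**2 + sd_within**2 / n_trials`, so extra trials only shrink the within-participant part.

```python
from statistics import NormalDist


def approx_power(effect, sd_between, sd_within, n_participants, n_trials,
                 alpha=0.05):
    """Approximate power of a two-sided one-sample z-test on participant means."""
    # Standard error of the group mean: between-participant variance plus
    # within-participant variance averaged over trials, divided by N.
    se = ((sd_between ** 2 + sd_within ** 2 / n_trials) / n_participants) ** 0.5
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect / se
    # Probability that the test statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def power_grid(effect, sd_between, sd_within, participants, trials):
    """Power for every (participants, trials) pair; rows index participants."""
    return [[approx_power(effect, sd_between, sd_within, n, k) for k in trials]
            for n in participants]


# Hypothetical values with within-participant SD larger than between-participant SD.
grid = power_grid(effect=0.5, sd_between=1.0, sd_within=4.0,
                  participants=[10, 20, 40], trials=[10, 50, 200])
```

Drawing contour lines through a grid like this at fixed power values (for example 0.8) gives the iso-power curves; many combinations of trials and participants lie on the same curve.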
-
Nanopore occlusion: A biophysical mechanism for bipolar cancellation in cell membranes
Authors:
Thiruvallur R. Gowrishankar,
Julie V. Stern,
Kyle C. Smith,
James C. Weaver
Abstract:
Extraordinarily large but short electric field pulses are reported by many experiments to cause bipolar cancellation (BPC). This unusual cell response occurs if a first pulse is followed by a second pulse with opposite polarity. Possibly universal, BPC presently lacks a mechanistic explanation. Multiple versions of the "standard model" of cell electroporation (EP) fail to account for BPC. Here we show, for the first time, how an extension of the standard model can account for a key experimental observation that essentially defines BPC: the amount of a tracer that enters a cell, and how tracer influx can be decreased by the second part of a bipolar pulse. The extended model can also account for the recovery of BPC wherein the extent of BPC is diminished if the spacing between the first and second pulses is increased. Our approach is reverse engineering, meaning that we identify and introduce an additional biophysical mechanism that allows pore transport to change. We hypothesize that occluding molecules from outside the membrane enter or relocate within a pore. Significantly, the additional mechanism is fundamental and general, involving a combination of partitioning and hindrance. Molecules near the membrane can enter pores to block transport of tracer molecules while still passing small ions (+/- 1) that govern electrical behavior. Accounting for such behavior requires an extension of the standard model.
Submitted 3 July, 2018;
originally announced July 2018.
-
The cognitive roots of regularization in language
Authors:
Vanessa Ferdinand,
Simon Kirby,
Kenny Smith
Abstract:
Regularization occurs when the output a learner produces is less variable than the linguistic data they observed. In an artificial language learning experiment, we show that there exist at least two independent sources of regularization bias in cognition: a domain-general source based on cognitive load and a domain-specific source triggered by linguistic stimuli. Both of these factors modulate how frequency information is encoded and produced, but only the production-side modulations result in regularization (i.e. cause learners to eliminate variation from the observed input). We formalize the definition of regularization as the reduction of entropy and find that entropy measures are better at identifying regularization behavior than frequency-based analyses. Using our experimental data and a model of cultural transmission, we generate predictions for the amount of regularity that would develop in each experimental condition if the artificial language were transmitted over several generations of learners. Here we find that the effect of cognitive constraints can become more complex when put into the context of cultural evolution: although learning biases certainly carry information about the course of language evolution, we should not expect a one-to-one correspondence between the micro-level processes that regularize linguistic datasets and the macro-level evolution of linguistic regularity.
Submitted 18 October, 2018; v1 submitted 9 March, 2017;
originally announced March 2017.
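The definition of regularization as a reduction of entropy, given in the abstract above, can be made concrete with a small sketch. It uses the standard Shannon entropy of variant frequencies; the function names and the toy variant counts are hypothetical and do not come from the paper's data.

```python
import math
from collections import Counter


def shannon_entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)


def entropy_change(observed, produced):
    """Entropy of produced variants minus entropy of observed variants.

    A negative value means the output is less variable than the input,
    which is regularization under the entropy-based definition.
    """
    h_in = shannon_entropy(list(Counter(observed).values()))
    h_out = shannon_entropy(list(Counter(produced).values()))
    return h_out - h_in


# Toy example: the learner sees a 60/40 mix of two variants and produces 90/10.
observed = ["a"] * 6 + ["b"] * 4
produced = ["a"] * 9 + ["b"] * 1
delta = entropy_change(observed, produced)  # negative: variation was reduced
```

Unlike a comparison of the majority variant's frequency alone, the entropy difference summarises the whole distribution in one number and extends directly to more than two variants.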
-
Accounting for the Complex Hierarchical Topology of EEG Phase-Based Functional Connectivity in Network Binarisation
Authors:
Keith Smith,
Daniel Abasalo,
Javier Escudero
Abstract:
Research into binary network analysis of brain function faces a methodological challenge in selecting an appropriate threshold to binarise edge weights. For EEG phase-based functional connectivity, we test the hypothesis that such binarisation should take into account the complex hierarchical structure found in functional connectivity. We explore the density range suitable for such structure and provide a comparison of state-of-the-art binarisation techniques, the recently proposed Cluster-Span Threshold (CST), minimum spanning trees, efficiency-cost optimisation and union of shortest path graphs, with arbitrary proportional thresholds and weighted networks. We test these techniques on weighted complex hierarchy models by contrasting model realisations with small parametric differences. We also test the robustness of these techniques to random and targeted topological attacks. We find that the CST performs consistently well in state-of-the-art modelling of EEG network topology, robustness to topological network attacks, and in three real datasets, agreeing with our hypothesis of hierarchical complexity. This provides interesting new evidence into the relevance of considering a large number of edges in EEG functional connectivity research to provide informational density in the topology.
Submitted 29 September, 2017; v1 submitted 20 October, 2016;
originally announced October 2016.
-
Locating Temporal Functional Dynamics of Visual Short-Term Memory Binding using Graph Modular Dirichlet Energy
Authors:
Keith Smith,
Benjamin Ricaud,
Nauman Shahid,
Stephen Rhodes,
John M. Starr,
Agustin Ibanez,
Mario A. Parra,
Javier Escudero,
Pierre Vandergheynst
Abstract:
Visual short-term memory binding tasks are a promising early marker for Alzheimer's disease (AD). To uncover functional deficits of AD in these tasks it is meaningful to first study unimpaired brain function. Electroencephalogram recordings were obtained from encoding and maintenance periods of tasks performed by healthy young volunteers. We probe the task's transient physiological underpinnings by contrasting shape only (Shape) and shape-colour binding (Bind) conditions, displayed in the left and right sides of the screen, separately. Particularly, we introduce and implement a novel technique named Modular Dirichlet Energy (MDE) which allows robust and flexible analysis of the functional network with unprecedented temporal precision. We find that connectivity in the Bind condition is less integrated with the global network than in the Shape condition in occipital and frontal modules during the encoding period of the right screen condition. Using MDE we are able to discern driving effects in the occipital module between 100-140ms, coinciding with the P100 visually evoked potential, followed by a driving effect in the frontal module between 140-180ms, suggesting that the differences found constitute an information processing difference between these modules. This provides temporally precise information over a heterogeneous population in promising tasks for the detection of AD.
Submitted 8 September, 2016; v1 submitted 8 June, 2016;
originally announced June 2016.
-
Comparison of Network Analysis Approaches on EEG Connectivity in Beta during Visual Short-Term Memory Binding Tasks
Authors:
Keith Smith,
Hamed Azami,
Mario A. Parra,
Javier Escudero,
John M. Starr
Abstract:
We analyse the electroencephalogram signals in the beta band of working memory representation recorded from young healthy volunteers performing several different Visual Short-Term Memory (VSTM) tasks which have proven useful in the assessment of clinical and preclinical Alzheimer's disease. We compare network analysis using Maximum Spanning Trees (MSTs) with network analysis obtained using 20% and 25% connection thresholds on the VSTM data. MSTs are a promising method of network analysis negating the more classical use of thresholds which are so far chosen arbitrarily. However, we find that the threshold analyses outperform MSTs for detection of functional network differences. Particularly, MSTs fail to find any significant differences. Further, the thresholds detect significant differences between shape and shape-colour binding tasks when these are tested in the left side of the display screen, but no such differences are detected when these tasks are tested in the right side of the display screen. This provides evidence that contralateral activity is a significant factor in sensitivity for detection of cognitive task differences.
Submitted 8 April, 2016;
originally announced April 2016.
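The two binarisation approaches compared in the abstract above can be contrasted on a toy weighted network. This is a minimal sketch, not the authors' pipeline: the weight matrix is made up, the 50% density is chosen only because the toy graph has six edges (the abstract uses 20% and 25%), and the function names are hypothetical. The maximum spanning tree is built with Kruskal's algorithm on edges sorted by descending weight.

```python
def upper_edges(weights):
    """All upper-triangle edges as (weight, i, j), strongest first."""
    n = len(weights)
    return sorted(((weights[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
                  reverse=True)


def proportional_threshold(weights, density):
    """Keep the strongest `density` fraction of edges."""
    edges = upper_edges(weights)
    keep = round(density * len(edges))
    return {(i, j) for _, i, j in edges[:keep]}


def maximum_spanning_tree(weights):
    """Kruskal's algorithm with union-find, taking the strongest edges first."""
    parent = list(range(len(weights)))

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = set()
    for _, i, j in upper_edges(weights):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the edge joins two components, so no cycle
            parent[root_i] = root_j
            tree.add((i, j))
    return tree


# Toy symmetric connectivity matrix: node 3 is only weakly connected.
W = [[0.0, 0.9, 0.8, 0.1],
     [0.9, 0.0, 0.7, 0.2],
     [0.8, 0.7, 0.0, 0.3],
     [0.1, 0.2, 0.3, 0.0]]
thresholded = proportional_threshold(W, density=0.5)
tree = maximum_spanning_tree(W)
```

In this toy case the threshold keeps the dense triangle among nodes 0, 1 and 2 and leaves node 3 isolated, whereas the tree always connects every node with exactly n - 1 edges. The tree removes the arbitrary density choice but also discards the clustering structure that a threshold can retain.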
-
Cluster-Span Threshold: An unbiased threshold for binarising weighted complete networks in functional connectivity analysis
Authors:
Keith Smith,
Hamed Azami,
Mario A. Parra,
John M. Starr,
Javier Escudero
Abstract:
We propose a new unbiased threshold for network analysis named the Cluster-Span Threshold (CST). This is based on the clustering coefficient, C, following the logic that a balance of 'clustering' to 'spanning' triples results in a useful topology for network analysis and that the product of complementing properties has a unique value only when perfectly balanced. We threshold networks by fixing C at this balanced value, rather than fixing connection density at an arbitrary value, as has been the trend. On an electroencephalogram data set of volunteers performing visual short-term memory tasks, we compare results from the CST with those from other thresholds, including maximum spanning trees. We find that the CST holds as a sensitive threshold for distinguishing differences in the functional connectivity between tasks. This provides a sensitive and objective method for setting a threshold on weighted complete networks which may prove influential on the future of functional connectivity research.
Submitted 8 April, 2016;
originally announced April 2016.
-
The Complex Hierarchical Topology of EEG Functional Connectivity
Authors:
Keith Smith,
Javier Escudero
Abstract:
Understanding the complex hierarchical topology of functional brain networks is a key aspect of functional connectivity research. Such topics are obscured by the widespread use of sparse binary network models which are fundamentally different to the complete weighted networks derived from functional connectivity. We introduce two techniques to probe the hierarchical complexity of topologies. Firstly, a new metric to measure hierarchical complexity; secondly, a Weighted Complex Hierarchy (WCH) model. To thoroughly evaluate our techniques, we generalise sparse binary network archetypes to weighted forms and explore the main topological features of brain networks (integration, regularity and modularity) using curves over density. By controlling the parameters of our model, the highest complexity is found to arise between a random topology and a strict 'class-based' topology. Further, the model has equivalent complexity to EEG phase-lag networks at peak performance. Hierarchical complexity attains greater magnitude and range of differences between different networks than the previous commonly used complexity metric and our WCH model offers a much broader range of network topology than the standard scale-free and small-world models at a full range of densities. Our metric and model provide a rigorous characterisation of hierarchical complexity. Importantly, our framework shows a scale of complexity arising between 'all nodes are equal' topologies at one extreme and 'strict class-based' topologies at the other.
Submitted 9 November, 2016; v1 submitted 6 April, 2016;
originally announced April 2016.
-
High-resolution transcriptome analysis with long-read RNA sequencing
Authors:
Hyunghoon Cho,
Joe Davis,
Xin Li,
Kevin S. Smith,
Alexis Battle,
Stephen B. Montgomery
Abstract:
RNA sequencing (RNA-seq) enables characterization and quantification of individual transcriptomes as well as detection of patterns of allelic expression and alternative splicing. Current RNA-seq protocols depend on high-throughput short-read sequencing of cDNA. However, as ongoing advances are rapidly yielding increasing read lengths, a technical hurdle remains in identifying the degree to which differences in read length influence various transcriptome analyses. In this study, we generated two paired-end RNA-seq datasets of differing read lengths (2x75 bp and 2x262 bp) for lymphoblastoid cell line GM12878 and compared the effect of read length on transcriptome analyses, including read-mapping performance, gene and transcript quantification, and detection of allele-specific expression (ASE) and allele-specific alternative splicing (ASAS) patterns. Our results indicate that, while the current long-read protocol is considerably more expensive than short-read sequencing, there are important benefits that can only be achieved with longer read length, including lower mapping bias and reduced ambiguity in assigning reads to genomic elements, such as mRNA transcripts. We show that these benefits ultimately lead to improved detection of cis-acting regulatory and splicing variation effects within individuals.
Submitted 28 May, 2014;
originally announced May 2014.
-
Stochastic dynamics of lexicon learning in an uncertain and nonuniform world
Authors:
Rainer Reisenauer,
Kenny Smith,
Richard A. Blythe
Abstract:
We study the time taken by a language learner to correctly identify the meaning of all words in a lexicon under conditions where many plausible meanings can be inferred whenever a word is uttered. We show that the most basic form of cross-situational learning - whereby information from multiple episodes is combined to eliminate incorrect meanings - can perform badly when words are learned independently and meanings are drawn from a nonuniform distribution. If learners further assume that no two words share a common meaning, we find a phase transition between a maximally-efficient learning regime, where the learning time is reduced to the shortest it can possibly be, and a partially-efficient regime where incorrect candidate meanings for words persist at late times. We obtain exact results for the word-learning process through an equivalence to a statistical mechanical problem of enumerating loops in the space of word-meaning mappings.
Submitted 31 May, 2013; v1 submitted 22 February, 2013;
originally announced February 2013.
-
Short and Long Range Population Dynamics of the Monarch
Authors:
Komi Messan,
Kyle Smith,
Shawn Tsosie,
Shuchen Zhu,
Sergei Suslov
Abstract:
The monarch butterfly annually migrates from central Mexico to southern Canada. During recent decades, its population has been reduced due to human interaction with its habitat. We examine the effect of herbicide usage on the monarch butterfly's population by creating a system of linear and non-linear ordinary differential equations that describe the interaction between the monarch's population and its environment at various stages of migration: spring migration, summer loitering, and fall migration. The model has various stages that are used to describe the dynamics of the monarch butterfly population over multiple generations. In Stage 1, we propose a system of coupled ordinary differential equations that model the populations of the monarch butterflies and larvae during spring migration. In Stage 2, we propose a predator-prey model with age structure to model the population dynamics at the summer breeding site. In Stages 3 and 4, we propose exponential decay functions to model the monarch butterfly's fall migration to central Mexico and its time at the overwintering site. The model is used to analyze the long-term behavior of the monarch butterflies through numerical analysis, given data available in the research literature.
Submitted 16 December, 2011;
originally announced December 2011.