Search | arXiv e-print repository

Inkspot: A stress-resilient, anthocyanin rich, dwarf tomato variant for off-world cultivation

Authors: Sarah Lang, A'nya Buckner, Solomon Jones, Gabriel Erwin, Sally Lee, Rafael Loureiro

Abstract: As humanity prepares for sustained off-world habitation, the development of regolith-based agriculture (RBA) is essential for achieving self-sufficiency in space crop production. However, lunar regolith's alkaline pH, poor water retention, and high metal content pose severe physiological and biochemical challenges to plant growth. This study evaluates the performance of Solanum lycopersicum 'Inkspot', a stress-adaptive, anthocyanin-rich tomato variant, in comparison to its progenitor 'Tiny Tim', under control and simulated lunar regolith (LHS-2) conditions. A randomized complete block design was used to assess germination dynamics, morphology, fruit quality, antioxidant activity, and root architecture across 80 replicates over 65 days in controlled chambers. Inkspot maintained high germination rates (85% in regolith) with low variation (CV = 14%) and showed only moderate reductions in height and biomass, while Tiny Tim suffered a 45% biomass reduction and 60% fruit yield loss. Inkspot fruits increased anthocyanin content 2.5-fold in regolith, functioning as a stress-response mechanism and potential bioindicator. Physiological assessments revealed greater retention of chlorophyll, Fv/Fm efficiency, and stomatal conductance in Inkspot, correlated with higher SOD and CAT enzyme activity and lower lipid peroxidation. Root imaging showed Inkspot developed a significantly larger, more complex root system, while Tiny Tim's roots contracted under stress. These findings highlight Inkspot's abiotic stress tolerance and potential as a candidate for closed-loop life support and in-situ resource utilization strategies in RBA systems.

Submitted 22 May, 2025; originally announced May 2025.

Comments: 16 pages, 4 figures. Submitted to ASGSR 2024 Student Research Track. Includes biochemical stress responses and physiological performance

arXiv:2504.08016 [pdf, other]

Emergence of psychopathological computations in large language models

Authors: Soo Yong Lee, Hyunjin Hwang, Taekwan Kim, Yuyeong Kim, Kyuri Park, Jaemin Yoo, Denny Borsboom, Kijung Shin

Abstract: Can large language models (LLMs) implement computations of psychopathology? An effective approach to the question hinges on addressing two factors. First, for conceptual validity, we require a general and computational account of psychopathology that is applicable to computational entities without biological embodiment or subjective experience. Second, mechanisms underlying LLM behaviors need to be studied for better methodological validity. Thus, we establish a computational-theoretical framework to provide an account of psychopathology applicable to LLMs. To ground the theory for empirical analysis, we also propose a novel mechanistic interpretability method alongside a tailored empirical analytic framework. Based on the frameworks, we conduct experiments demonstrating three key claims: first, that distinct dysfunctional and problematic representational states are implemented in LLMs; second, that their activations can spread and self-sustain to trap LLMs; and third, that dynamic, cyclic structural causal models encoded in the LLMs underpin these patterns. In concert, the empirical results corroborate our hypothesis that network-theoretic computations of psychopathology have already emerged in LLMs. This suggests that certain LLM behaviors mirroring psychopathology may not be a superficial mimicry but a feature of their internal processing. Thus, our work alludes to the possibility of AI systems with psychopathological behaviors in the near future.

Submitted 10 April, 2025; originally announced April 2025.

Comments: pre-print

arXiv:2504.00052 [pdf]

Assessing Validity of ICD-10 Administrative Data in Coding Comorbidities

Authors: Jie Pan, Seungwon Lee, Cheligeer Cheligeer, Bing Li, Guosong Wu, Catherine A Eastwood, Yuan Xu, Hude Quan

Abstract: Objectives: Administrative data is commonly used to inform chronic disease prevalence and support health informatics research. This study assessed the validity of coding comorbidity in the International Classification of Diseases, 10th Revision (ICD-10) administrative data. Methods: We analyzed three chart review cohorts (4,008 patients in 2003, 3,045 in 2015, and 9,024 in 2022) in Alberta, Canada. Nurse reviewers assessed the presence of 17 clinical conditions using a consistent protocol. The reviews were linked with administrative data using unique identifiers. We compared the accuracy in coding comorbidity by ICD-10, using chart review data as the reference standard. Results: Our findings showed that the mean difference in prevalence between chart reviews and ICD-10 for these 17 conditions was 2.1% in 2003, 7.6% in 2015, and 6.3% in 2022. Some conditions were relatively stable, such as diabetes (1.9%, 2.1%, and 1.1%) and metastatic cancer (0.3%, 1.1%, and 0.4%). For these 17 conditions, the sensitivity ranged from 39.6-85.1% in 2003, 1.3-85.2% in 2015, and 3.0-89.7% in 2022. The C-statistics for predicting in-hospital mortality using comorbidities by ICD-10 were 0.84 in 2003, 0.81 in 2015, and 0.78 in 2022. Discussion: The under-coding could be primarily due to the increase of hospital patient volumes and the limited time allocated to coders. There is a potential to develop artificial intelligence methods based on electronic health records to support coding practices and improve coding quality. Conclusion: Comorbidities were increasingly under-coded over 20 years. The validity of ICD-10 decreased but remained relatively stable for certain conditions mandated for coding. The under-coding exerted minimal impact on in-hospital mortality prediction.

Submitted 31 March, 2025; originally announced April 2025.

arXiv:2501.15208 [pdf]

Advancing Understanding of Long COVID Pathophysiology Through Quantum Walk-Based Network Analysis

Authors: Jaesub Park, Woochang Hwang, Seokjun Lee, Hyun Chang Lee, Méabh MacMahon, Matthias Zilbauer, Namshik Han

Abstract: Long COVID is a multisystem condition characterized by persistent symptoms such as fatigue, cognitive impairment, and systemic inflammation, following COVID-19 infection, yet its mechanisms remain poorly understood. In this study, we applied quantum walk (QW), a computational approach leveraging quantum interference, to explore large-scale SARS-CoV-2-induced protein (SIP) networks. Compared to the conventional random walk with restart (RWR) method, QW demonstrated superior capacity to traverse deeper regions of the network, uncovering proteins and pathways implicated in Long COVID. Key findings include mitochondrial dysfunction, thromboinflammatory responses, and neuronal inflammation as central mechanisms. QW uniquely identified the CDGSH iron-sulfur domain-containing protein family and VDAC1, a mitochondrial calcium transporter, as critical regulators of these processes. VDAC1 emerged as a potential biomarker and therapeutic target, supported by FDA-approved compounds such as cannabidiol. These findings highlight QW as a powerful tool for elucidating complex biological systems and identifying novel therapeutic targets for conditions like Long COVID.

Submitted 29 January, 2025; v1 submitted 25 January, 2025; originally announced January 2025.

Comments: 25 pages, 6 figures and 3 tables

arXiv:2501.14790 [pdf, other]

Towards Dynamic Neural Communication and Speech Neuroprosthesis Based on Viseme Decoding

Authors: Ji-Ha Park, Seo-Hyun Lee, Soowon Kim, Seong-Whan Lee

Abstract: Decoding text, speech, or images from human neural signals holds promising potential both as neuroprosthesis for patients and as innovative communication tools for general users. Although neural signals contain various information on speech intentions, movements, and phonetic details, generating informative outputs from them remains challenging, with mostly focusing on decoding short intentions or producing fragmented outputs. In this study, we developed a diffusion model-based framework to decode visual speech intentions from speech-related non-invasive brain signals, to facilitate face-to-face neural communication. We designed an experiment to consolidate various phonemes to train visemes of each phoneme, aiming to learn the representation of corresponding lip formations from neural signals. By decoding visemes from both isolated trials and continuous sentences, we successfully reconstructed coherent lip movements, effectively bridging the gap between brain signals and dynamic visual interfaces. The results highlight the potential of viseme decoding and talking face reconstruction from human neural signals, marking a significant step toward dynamic neural communication systems and speech neuroprosthesis for patients.

Submitted 8 January, 2025; originally announced January 2025.

Comments: 5 pages, 5 figures, 1 table, Name of Conference: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing

arXiv:2410.20660 [pdf, other]

TurboHopp: Accelerated Molecule Scaffold Hopping with Consistency Models

Authors: Kiwoong Yoo, Owen Oertell, Junhyun Lee, Sanghoon Lee, Jaewoo Kang

Abstract: Navigating the vast chemical space of druggable compounds is a formidable challenge in drug discovery, where generative models are increasingly employed to identify viable candidates. Conditional 3D structure-based drug design (3D-SBDD) models, which take into account complex three-dimensional interactions and molecular geometries, are particularly promising. Scaffold hopping is an efficient strategy that facilitates the identification of similar active compounds by strategically modifying the core structure of molecules, effectively narrowing the wide chemical space and enhancing the discovery of drug-like products. However, the practical application of 3D-SBDD generative models is hampered by their slow processing speeds. To address this bottleneck, we introduce TurboHopp, an accelerated pocket-conditioned 3D scaffold hopping model that merges the strategic effectiveness of traditional scaffold hopping with rapid generation capabilities of consistency models. This synergy not only enhances efficiency but also significantly boosts generation speeds, achieving up to 30 times faster inference speed as well as superior generation quality compared to existing diffusion-based models, establishing TurboHopp as a powerful tool in drug discovery. Supported by faster inference speed, we further optimize our model, using Reinforcement Learning for Consistency Models (RLCM), to output desirable molecules. We demonstrate the broad applicability of TurboHopp across multiple drug discovery scenarios, underscoring its potential in diverse molecular settings.

Submitted 1 February, 2025; v1 submitted 27 October, 2024; originally announced October 2024.

Comments: 22 pages, 11 figures, 8 tables. Presented at NeurIPS 2024

arXiv:2409.20013 [pdf]

Single-shot reconstruction of three-dimensional morphology of biological cells in digital holographic microscopy using a physics-driven neural network

Authors: Jihwan Kim, Youngdo Kim, Hyo Seung Lee, Eunseok Seo, Sang Joon Lee

Abstract: Recent advances in deep learning-based image reconstruction techniques have led to significant progress in phase retrieval using digital in-line holographic microscopy (DIHM). However, existing deep learning-based phase retrieval methods have technical limitations in generalization performance and three-dimensional (3D) morphology reconstruction from a single-shot hologram of biological cells. In this study, we propose a novel deep learning model, named MorpHoloNet, for single-shot reconstruction of 3D morphology by integrating physics-driven and coordinate-based neural networks. By simulating the optical diffraction of coherent light through a 3D phase shift distribution, the proposed MorpHoloNet is optimized by minimizing the loss between the simulated and input holograms on the sensor plane. Compared to existing DIHM methods that face challenges with twin image and phase retrieval problems, MorpHoloNet enables direct reconstruction of 3D complex light field and 3D morphology of a test sample from its single-shot hologram without requiring multiple phase-shifted holograms or angle scanning. The performance of the proposed MorpHoloNet is validated by reconstructing 3D morphologies and refractive index distributions from synthetic holograms of ellipsoids and experimental holograms of biological cells. The proposed deep learning model is utilized to reconstruct spatiotemporal variations in 3D translational and rotational behaviors and morphological deformations of biological cells from consecutive single-shot holograms captured using DIHM. MorpHoloNet would pave the way for advancing label-free, real-time 3D imaging and dynamic analysis of biological cells under various cellular microenvironments in biomedical and engineering fields.

Submitted 30 September, 2024; originally announced September 2024.

Comments: 35 pages, 7 figures, 1 table

arXiv:2409.07806 [pdf]

Investigation of Electrical Conductivity Changes during Brain Functional Activity in 3T MRI

Authors: Kyu-Jin Jung, Chuanjiang Cui, Soo-Hyung Lee, Chan-Hee Park, Ji-Won Chun, Dong-Hyun Kim

Abstract: Blood oxygenation level-dependent (BOLD) functional magnetic resonance imaging (fMRI) is widely used to visualize brain activation regions by detecting hemodynamic responses associated with increased metabolic demand. While alternative MRI methods have been employed to monitor functional activities, the investigation of in-vivo electrical property changes during brain function remains limited. In this study, we explored the relationship between fMRI signals and electrical conductivity (measured at the Larmor frequency) changes using phase-based electrical properties tomography (EPT). Our results revealed consistent patterns: conductivity changes showed negative correlations, with conductivity decreasing in the functionally active regions whereas B1 phase mapping exhibited positive correlations around activation regions. These observations were consistent across both motor and visual cortex activations. To further substantiate these findings, we conducted electromagnetic radio-frequency simulations that modeled activation states with varying conductivity, which demonstrated trends similar to our in-vivo results for both B1 phase and conductivity. These findings suggest that in-vivo electrical conductivity changes can indeed be measured during brain activity. However, further investigation is needed to fully understand the underlying mechanisms driving these measurements.

Submitted 12 September, 2024; originally announced September 2024.

arXiv:2406.08140 [pdf]

Functional voxel hierarchy and afferent capacity revealed mental state transition on dynamic correlation resting-state fMRI

Authors: Dong Soo Lee, Hyun Joo Kim, Youngmin Huh, Yeon Koo Kang, Wonseok Whi, Hyekyoung Lee, Hyejin Kang

Abstract: Voxel hierarchy on dynamic brain graphs is produced by k core percolation on functional dynamic amplitude correlation of resting-state fMRI. Directed graphs and their afferent/efferent capacities are produced by Markov modeling of the universal cover of undirected graphs simultaneously with the calculation of volume entropy. Positive and unsigned negative brain graphs were analyzed separately on sliding-window representation to underpin the visualization and quantitation of mental dynamic states with their transitions. Voxel hierarchy animation maps of positive graphs revealed abrupt changes in coreness k and kmaxcore, which we called mental state transitions. Afferent voxel capacities of the positive graphs also revealed transient modules composed of dominating voxels/independent components and their exchanges representing mental state transitions. Animation and quantification plots of voxel hierarchy and afferent capacity corroborated each other in underpinning mental state transitions and afferent module exchange on the positive directed functional connectivity graphs. We propose the use of spatiotemporal trajectories of voxels on positive dynamic graphs to construct hierarchical structures by k core percolation and quantified in- and out-flows of information of voxels by volume entropy/directed graphs to subserve diverse resting mental state transitions on resting-state fMRI graphs in normal human individuals.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2406.07843 [pdf, other]

Self-Attention-Based Contextual Modulation Improves Neural System Identification

Authors: Isaac Lin, Tianye Wang, Shang Gao, Shiming Tang, Tai Sing Lee

Abstract: Convolutional neural networks (CNNs) have been shown to be state-of-the-art models for visual cortical neurons. Cortical neurons in the primary visual cortex are sensitive to contextual information mediated by extensive horizontal and feedback connections. Standard CNNs integrate global contextual information to model contextual modulation via two mechanisms: successive convolutions and a fully connected readout layer. In this paper, we find that self-attention (SA), an implementation of non-local network mechanisms, can improve neural response predictions over parameter-matched CNNs in two key metrics: tuning curve correlation and peak tuning. We introduce peak tuning as a metric to evaluate a model's ability to capture a neuron's top feature preference. We factorize networks to assess each context mechanism, revealing that information in the local receptive field is most important for modeling overall tuning, but surround information is critically necessary for characterizing the tuning peak. We find that self-attention can replace posterior spatial-integration convolutions when learned incrementally, and is further enhanced in the presence of a fully connected readout layer, suggesting that the two context mechanisms are complementary. Finally, we find that decomposing receptive field learning and contextual modulation learning in an incremental manner may be an effective and robust mechanism for learning surround-center interactions.

Submitted 28 February, 2025; v1 submitted 11 June, 2024; originally announced June 2024.

Comments: ICLR 2025

arXiv:2405.06663 [pdf, other]

Protein Representation Learning by Capturing Protein Sequence-Structure-Function Relationship

Authors: Eunji Ko, Seul Lee, Minseon Kim, Dongki Kim

Abstract: The goal of protein representation learning is to extract knowledge from protein databases that can be applied to various protein-related downstream tasks. Although protein sequence, structure, and function are the three key modalities for a comprehensive understanding of proteins, existing methods for protein representation learning have utilized only one or two of these modalities due to the difficulty of capturing the asymmetric interrelationships between them. To account for this asymmetry, we introduce our novel asymmetric multi-modal masked autoencoder (AMMA). AMMA adopts (1) a unified multi-modal encoder to integrate all three modalities into a unified representation space and (2) asymmetric decoders to ensure that sequence latent features reflect structural and functional information. The experiments demonstrate that the proposed AMMA is highly effective in learning protein representations that exhibit well-aligned inter-modal relationships, which in turn makes it effective for various downstream protein-related tasks.

Submitted 29 April, 2024; originally announced May 2024.

Comments: ICLR 2024 MLGenX Workshop (Spotlight)

arXiv:2405.01974 [pdf, other]

Multitask Extension of Geometrically Aligned Transfer Encoder

Authors: Sung Moon Ko, Sumin Lee, Dae-Woong Jeong, Hyunseung Kim, Chanhui Lee, Soorin Yim, Sehui Han

Abstract: Molecular datasets often suffer from a lack of data. It is well-known that gathering data is difficult due to the complexity of experimentation or simulation involved. Here, we leverage mutual information across different tasks in molecular data to address this issue. We extend an algorithm that utilizes the geometric characteristics of the encoding space, known as the Geometrically Aligned Transfer Encoder (GATE), to a multi-task setup. Thus, we connect multiple molecular tasks by aligning the curved coordinates onto locally flat coordinates, ensuring the flow of information from source tasks to support performance on target data.

Submitted 3 May, 2024; originally announced May 2024.

Comments: 7 pages, 3 figures, 2 tables

arXiv:2405.01554 [pdf, other]

Early-stage detection of cognitive impairment by hybrid quantum-classical algorithm using resting-state functional MRI time-series

Authors: Junggu Choi, Tak Hur, Daniel K. Park, Na-Young Shin, Seung-Koo Lee, Hakbae Lee, Sanghoon Han

Abstract: Following the recent development of quantum machine learning techniques, the literature has reported several quantum machine learning algorithms for disease detection. This study explores the application of a hybrid quantum-classical algorithm for classifying region-of-interest time-series data obtained from resting-state functional magnetic resonance imaging in patients with early-stage cognitive impairment based on the importance of cognitive decline for dementia or aging. Classical one-dimensional convolutional layers are used together with quantum convolutional neural networks in our hybrid algorithm. In the classical simulation, the proposed hybrid algorithms showed higher balanced accuracies than classical convolutional neural networks under the similar training conditions. Moreover, a total of nine brain regions (left precentral gyrus, right superior temporal gyrus, left rolandic operculum, right rolandic operculum, left parahippocampus, right hippocampus, left medial frontal gyrus, right cerebellum crus, and cerebellar vermis) among 116 brain regions were found to be relatively effective brain regions for the classification based on the model performances. The associations of the selected nine regions with cognitive decline, as found in previous studies, were additionally validated through seed-based functional connectivity analysis. We confirmed both the improvement of model performance with the quantum convolutional neural network and neuroscientific validities of brain regions from our hybrid quantum-classical model.

Submitted 16 March, 2024; originally announced May 2024.

Comments: 28 pages, 10 figures

arXiv:2311.09354 [pdf]

doi 10.1063/5.0189222

Nondestructive, quantitative viability analysis of 3D tissue cultures using machine learning image segmentation

Authors: Kylie J. Trettner, Jeremy Hsieh, Weikun Xiao, Jerry S. H. Lee, Andrea M. Armani

Abstract: Ascertaining the collective viability of cells in different cell culture conditions has typically relied on averaging colorimetric indicators and is often reported out in simple binary readouts. Recent research has combined viability assessment techniques with image-based deep-learning models to automate the characterization of cellular properties. However, further development of viability measurements to assess the continuity of possible cellular states and responses to perturbation across cell culture conditions is needed. In this work, we demonstrate an image processing algorithm for quantifying cellular viability in 3D cultures without the need for assay-based indicators. We show that our algorithm performs similarly to a pair of human experts in whole-well images over a range of days and culture matrix compositions. To demonstrate potential utility, we perform a longitudinal study investigating the impact of a known therapeutic on pancreatic cancer spheroids. Using images taken with a high content imaging system, the algorithm successfully tracks viability at the individual spheroid and whole-well level. The method we propose reduces analysis time by 97% in comparison to the experts. Because the method is independent of the microscope or imaging system used, this approach lays the foundation for accelerating progress in and for improving the robustness and reproducibility of 3D culture analysis across biological and clinical research.

Submitted 11 March, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

Comments: 52 total pages, Main text and SI included, 35 figures (5 main text, 30 supplemental), 9 tables, 6 datasets (provided on linked GitHub), linked image files on Zenodo

arXiv:2311.08735 [pdf, other]

Neurophysiological Response Based on Auditory Sense for Brain Modulation Using Monaural Beat

Authors: Ha-Na Jo, Young-Seok Kweon, Gi-Hwan Shin, Heon-Gyu Kwak, Seong-Whan Lee

Abstract: Brain modulation is a modification process of brain activity through external stimulations. However, which condition can induce the activation is still unclear. Therefore, we aimed to identify brain activation conditions using 40 Hz monaural beat (MB). Under this stimulation, auditory sense status which is determined by frequency and power range is the condition to consider. Hence, we designed five sessions to compare; no stimulation, audible (AB), inaudible in frequency, inaudible in power, and inaudible in frequency and power. Ten healthy participants underwent each stimulation session for ten minutes with electroencephalogram (EEG) recording. For analysis, we calculated the power spectral density (PSD) of EEG for each session and compared them in frequency, time, and five brain regions. As a result, we observed the prominent power peak at 40 Hz in only AB. The induced EEG amplitude increase started at one minute and increased until the end of the session. These results of AB had significant differences in frontal, central, temporal, parietal, and occipital regions compared to other stimulations. From the statistical analysis, the PSD of the right temporal region was significantly higher than the left. We figure out the role that the auditory sense is important to lead brain activation. These findings help to understand the neurophysiological principle and effects of auditory stimulation.

Submitted 15 November, 2023; originally announced November 2023.

Comments: Accepted to EMBC 2023

arXiv:2311.08703 [pdf, other]

Impact of Nap on Performance in Different Working Memory Tasks Using EEG

Authors: Gi-Hwan Shin, Young-Seok Kweon, Heon-Gyu Kwak, Ha-Na Jo, Seong-Whan Lee

Abstract: Electroencephalography (EEG) has been widely used to study the relationship between naps and working memory, yet the effects of naps on distinct working memory tasks remain unclear. Here, participants performed word-pair and visuospatial working memory tasks pre- and post-nap sessions. We found marked differences in accuracy and reaction time between tasks performed pre- and post-nap. In order to identify the impact of naps on performance in each working memory task, we employed clustering to classify participants as high- or low-performers. Analysis of sleep architecture revealed significant variations in sleep onset latency and rapid eye movement (REM) proportion. In addition, the two groups exhibited prominent differences, especially in the delta power of the Non-REM 3 stage linked to memory. Our results emphasize the interplay between nap-related neural activity and working memory, underlining specific EEG markers associated with cognitive performance.

Submitted 15 November, 2023; originally announced November 2023.

Comments: Submitted to 2024 12th IEEE International Winter Conference on Brain-Computer Interface

arXiv:2311.08433

Clinical Characteristics and Laboratory Biomarkers in ICU-admitted Septic Patients with and without Bacteremia

Authors: Sangwon Baek, Seung Jun Lee

Abstract: Few studies have investigated the diagnostic utilities of biomarkers for predicting bacteremia among septic patients admitted to intensive care units (ICU). Therefore, this study evaluated the prediction power of laboratory biomarkers to utilize those markers with high performance to optimize the predictive model for bacteremia. This retrospective cross-sectional study was conducted at the ICU department of Gyeongsang National University Changwon Hospital in 2019. Adult patients qualifying SEPSIS-3 (increase in sequential organ failure score greater than or equal to 2) criteria with at least two sets of blood culture were selected. Collected data was initially analyzed independently to identify the significant predictors, which was then used to build the multivariable logistic regression (MLR) model. A total of 218 patients with 48 cases of true bacteremia were analyzed in this research. Both CRP and PCT showed a substantial area under the curve (AUC) value for discriminating bacteremia among septic patients (0.757 and 0.845, respectively). To further enhance the predictive accuracy, we combined PCT, bilirubin, neutrophil lymphocyte ratio (NLR), platelets, lactic acid, erythrocyte sedimentation rate (ESR), and Glasgow Coma Scale (GCS) score to build the predictive model with an AUC of 0.907 (95% CI, 0.843 to 0.956). In addition, a high association between bacteremia and mortality rate was discovered through the survival analysis (0.004).
While PCT is certainly a useful index for distinguishing patients with and without bacteremia by itself, our MLR model indicates that the accuracy of bacteremia prediction substantially improves by the combined use of PCT, bilirubin, NLR, platelets, lactic acid, ESR, and GCS score.

Submitted 16 November, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

Comments: This article is not the right fit to be published as preprint in arXiv

arXiv:2311.07962 [pdf, other]

Relationship Between Mood, Sleepiness, and EEG Functional Connectivity by 40 Hz Monaural Beats

Authors: Ha-Na Jo, Young-Seok Kweon, Gi-Hwan Shin, Heon-Gyu Kwak, Seong-Whan Lee

Abstract: Monaural beats are known to modulate brain and personal states. However, which changes in brain waves are related to changes in state is still unclear. Therefore, we aimed to investigate the effects of monaural beats and find the relationship between them. Ten participants took part in five separate random sessions, which included a baseline session and four sessions with monaural beat stimulation: one audible session and three inaudible sessions. Electroencephalograms (EEG) were recorded and participants completed pre- and post-stimulation questionnaires assessing mood and sleepiness. As a result, the audible session led to increased arousal and positive mood compared to other conditions. From the neurophysiological analysis, statistical differences in frontal-central, central-central, and central-parietal connectivity were observed only in the audible session. Furthermore, a significant correlation was identified between sleepiness and EEG power in the temporal and occipital regions. These results suggested a more detailed correlation for stimulation to change its personal state. These findings have implications for applications in areas such as cognitive enhancement, mood regulation, and sleep management.

Submitted 20 November, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

arXiv:2309.03227

Learning a Patent-Informed Biomedical Knowledge Graph Reveals Technological Potential of Drug Repositioning Candidates

Authors: Yongseung Jegal, Jaewoong Choi, Jiho Lee, Ki-Su Park, Seyoung Lee, Janghyeok Yoon

Abstract: Drug repositioning, a promising strategy for discovering new therapeutic uses for existing drugs, has been increasingly explored in the computational science literature using biomedical databases. However, the technological potential of drug repositioning candidates has often been overlooked. This study presents a novel protocol to comprehensively analyse various sources such as pharmaceutical patents and biomedical databases, and identify drug repositioning candidates with both technological potential and scientific evidence. To this end, first, we constructed a scientific biomedical knowledge graph (s-BKG) comprising relationships between drugs, diseases, and genes derived from biomedical databases. Our protocol involves identifying drugs that exhibit limited association with the target disease but are closely located in the s-BKG, as potential drug candidates. We constructed a patent-informed biomedical knowledge graph (p-BKG) by adding pharmaceutical patent information. Finally, we developed a graph embedding protocol to ascertain the structure of the p-BKG, thereby calculating the relevance scores of those candidates with target disease-related patents to evaluate their technological potential. Our case study on Alzheimer's disease demonstrates its efficacy and feasibility, while the quantitative outcomes and systematic methods are expected to bridge the gap between computational discoveries and successful market applications in drug repositioning research.

Submitted 24 July, 2024; v1 submitted 3 September, 2023; originally announced September 2023.

Comments: We are sorry to withdraw this paper. We found some critical errors in the introduction and results sections. Specifically, we found that the first author have wrongly inserted citations on background works and he made mistakes in the graph embedding methods and relevant results are wrongly calculated. In this regard, we tried to revise this paper and withdraw the current version. Thank you

arXiv:2307.10181 [pdf, other]

Community-Aware Transformer for Autism Prediction in fMRI Connectome

Authors: Anushree Bannadabhavi, Soojin Lee, Wenlong Deng, Xiaoxiao Li

Abstract: Autism spectrum disorder (ASD) is a lifelong neurodevelopmental condition that affects social communication and behavior. Investigating functional magnetic resonance imaging (fMRI)-based brain functional connectome can aid in the understanding and diagnosis of ASD, leading to more effective treatments. The brain is modeled as a network of brain Regions of Interest (ROIs), and ROIs form communities and knowledge of these communities is crucial for ASD diagnosis. On the one hand, Transformer-based models have proven to be highly effective across several tasks, including fMRI connectome analysis to learn useful representations of ROIs. On the other hand, existing transformer-based models treat all ROIs equally and overlook the impact of community-specific associations when learning node embeddings. To fill this gap, we propose a novel method, Com-BrainTF, a hierarchical local-global transformer architecture that learns intra and inter-community aware node embeddings for ASD prediction task. Furthermore, we avoid over-parameterization by sharing the local transformer parameters for different communities but optimize unique learnable prompt tokens for each community. Our model outperforms state-of-the-art (SOTA) architecture on ABIDE dataset and has high interpretability, evident from the attention module. Our code is available at https://github.com/ubc-tea/Com-BrainTF.

Submitted 24 June, 2023; originally announced July 2023.

Comments: Accepted by 26th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2023)

arXiv:2307.00932 [pdf]

A large calcium-imaging dataset reveals a systematic V4 organization for natural scenes

Authors: Tianye Wang, Haoxuan Yao, Tai Sing Lee, Jiayi Hong, Yang Li, Hongfei Jiang, Ian Max Andolina, Shiming Tang

Abstract: The visual system evolved to process natural scenes, yet most of our understanding of the topology and function of visual cortex derives from studies using artificial stimuli. To gain deeper insights into visual processing of natural scenes, we utilized widefield calcium-imaging of primate V4 in response to many natural images, generating a large dataset of columnar-scale responses. We used this dataset to build a digital twin of V4 via deep learning, generating a detailed topographical map of natural image preferences at each cortical position. The map revealed clustered functional domains for specific classes of natural image features. These ranged from surface-related attributes like color and texture to shape-related features such as edges, curvature, and facial features. We validated the model-predicted domains with additional widefield calcium-imaging and single-cell resolution two-photon imaging. Our study illuminates the detailed topological organization and neural codes in V4 that represent natural scenes.

Submitted 23 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

Comments: 39 pages, 14 figures

arXiv:2305.03799 [pdf]

Detecting disruption of HER2 membrane protein organization in cell membranes with nanoscale precision

Authors: Yasaman Moradi, Jerry SH Lee, Andrea M. Armani

Abstract: The spatio-temporal organization of proteins within the cell membrane can affect numerous biological functions, including cell signaling, communication, and transportation. Deviations from normal spatial arrangements have been observed in various diseases, and better understanding this process is a key stepping-stone to advancing development of clinical interventions. However, given the nanometer length scales involved, detecting these subtle changes has primarily relied on complex super resolution and single molecule imaging methods. In this work, we demonstrate an alternative fluorescent imaging strategy for detecting protein organization based on a material that exhibits a unique photophysical behavior known as aggregation induced emission (AIE). Organic AIE molecules have an increase in emission signal when they are in close proximity and the molecular motion is restricted. This property simultaneously addresses the high background noise and low detection signal that limit conventional widefield fluorescent imaging. To demonstrate the potential of this approach, the fluorescent molecule sensor is conjugated to a human epidermal growth factor receptor 2 (HER2) specific antibody and used to investigate the spatio-temporal behavior of HER2 clustering in the membrane of HER2-overexpressing breast cancer cells. Notably, the disruption of HER2 clusters in response to an FDA-approved monoclonal antibody therapeutic (Trastuzumab) is successfully detected using a simple widefield fluorescent microscope.
While the sensor demonstrated here is optimized for sensing HER2 clustering, it is an easily adaptable platform. Moreover, given the compatibility with widefield imaging, the system has the potential to be used with high-throughput imaging techniques, accelerating investigations into membrane protein spatio-temporal organization.

Submitted 23 October, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

arXiv:2303.11563 [pdf, other]

Dynamic Healthcare Embeddings for Improving Patient Care

Authors: Hankyu Jang, Sulyun Lee, D. M. Hasibul Hasan, Philip M. Polgreen, Sriram V. Pemmaraju, Bijaya Adhikari

Abstract: As hospitals move towards automating and integrating their computing systems, more fine-grained hospital operations data are becoming available. These data include hospital architectural drawings, logs of interactions between patients and healthcare professionals, prescription data, procedures data, and data on patient admission, discharge, and transfers. This has opened up many fascinating avenues for healthcare-related prediction tasks for improving patient care. However, in order to leverage off-the-shelf machine learning software for these tasks, one needs to learn structured representations of entities involved from heterogeneous, dynamic data streams. Here, we propose DECENT, an auto-encoding heterogeneous co-evolving dynamic neural network, for learning heterogeneous dynamic embeddings of patients, doctors, rooms, and medications from diverse data streams. These embeddings capture similarities among doctors, rooms, patients, and medications based on static attributes and dynamic interactions. DECENT enables several applications in healthcare prediction, such as predicting mortality risk and case severity of patients, adverse events (e.g., transfer back into an intensive care unit), and future healthcare-associated infections.
The results of using the learned patient embeddings in predictive modeling show that DECENT has a gain of up to 48.1% on the mortality risk prediction task, 12.6% on the case severity prediction task, 6.4% on the medical intensive care unit transfer task, and 3.8% on the Clostridioides difficile (C.diff) Infection (CDI) prediction task over the state-of-the-art baselines. In addition, case studies on the learned doctor, medication, and room embeddings show that our approach learns meaningful and interpretable embeddings.

Submitted 20 March, 2023; originally announced March 2023.

Comments: To be published in IEEE/ACM ASONAM 2022

arXiv:2302.14684 [pdf, other]

doi 10.1088/2632-072X/acef9d

Exploring 3D community inconsistency in human chromosome contact networks

Authors: Dolores Bernenko, Sang Hoon Lee, Ludvig Lizana

Abstract: Researchers developed chromosome capture methods such as Hi-C to better understand DNA's 3D folding in nuclei. The Hi-C method captures contact frequencies between DNA segment pairs across the genome. When analyzing Hi-C data sets, it is common to group these pairs using standard bioinformatics methods (e.g., PCA). Other approaches handle Hi-C data as weighted networks, where connected nodes represent DNA segments in 3D proximity. In this representation, one can leverage community detection techniques developed in complex network theory to group nodes into mesoscale communities containing similar connection patterns. While there are several successful attempts to analyze Hi-C data in this way, it is common to report and study the most typical community structure. But in reality, there are often several valid candidates. Therefore, depending on algorithm design, different community detection methods focusing on slightly different connectivity features may have differing views on the ideal node groupings. In fact, even the same community detection method may yield different results if using a stochastic algorithm. This ambiguity is fundamental to community detection and shared by most complex networks whenever interactions span all scales in the network. This is known as community inconsistency. This paper explores this inconsistency of 3D communities in Hi-C data for all human chromosomes. We base our analysis on two inconsistency metrics, one local and one global, and quantify the network scales where the community separation is most variable.
For example, we find that TADs are less reliable than A/B compartments and that nodes with highly variable node-community memberships are associated with open chromatin. Overall, our study provides a helpful framework for data-driven researchers and increases awareness of some inherent challenges when clustering Hi-C data into 3D communities.

Submitted 28 February, 2023; originally announced February 2023.

Comments: 10 pages, 7 figures

Journal ref: J. Phys. Complex. 4 035004 (2023)

arXiv:2302.13693 [pdf, other]

Learning Topology-Specific Experts for Molecular Property Prediction

Authors: Su Kim, Dongha Lee, SeongKu Kang, Seonghyeon Lee, Hwanjo Yu

Abstract: Recently, graph neural networks (GNNs) have been successfully applied to predicting molecular properties, which is one of the most classical cheminformatics tasks with various applications. Despite their effectiveness, we empirically observe that training a single GNN model for diverse molecules with distinct structural patterns limits its prediction performance. In this paper, motivated by this observation, we propose TopExpert to leverage topology-specific prediction models (referred to as experts), each of which is responsible for each molecular group sharing similar topological semantics. That is, each expert learns topology-specific discriminative features while being trained with its corresponding topological group. To tackle the key challenge of grouping molecules by their topological patterns, we introduce a clustering-based gating module that assigns an input molecule into one of the clusters and further optimizes the gating module with two different types of self-supervision: topological semantics induced by GNNs and molecular scaffolds, respectively. Extensive experiments demonstrate that TopExpert has boosted the performance for molecular property prediction and also achieved better generalization for new molecules with unseen scaffolds than baselines. The code is available at https://github.com/kimsu55/ToxExpert.

Submitted 11 March, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

Comments: 11 pages with 8 figures

MSC Class: 68T05

ACM Class: I.2.1; J.3

Journal ref: The 37th AAAI conference on artificial intelligence (AAAI 2023)

arXiv:2302.03137 [pdf, other]

Predicting Development of Chronic Obstructive Pulmonary Disease and its Risk Factor Analysis

Authors: Soojin Lee, Ingu Sean Lee, Samuel Kim

Abstract: Chronic Obstructive Pulmonary Disease (COPD) is an irreversible airway obstruction with a high societal burden. Although smoking is known to be the biggest risk factor, additional components need to be considered. In this study, we aim to identify COPD risk factors by applying machine learning models that integrate sociodemographic, clinical, and genetic data to predict COPD development.

Submitted 6 February, 2023; originally announced February 2023.

Comments: submitted to EMBC 2023

arXiv:2301.11669 [pdf, other]

Predator Extinction arose from Chaos of the Prey: the Chaotic Behavior of a Homomorphic Two-Dimensional Logistic Map in the Form of Lotka-Volterra Equations

Authors: Wei Shan Lee, Hou Fai Chan, Ka Ian Im, Kuan Ieong Chan, U Hin Cheang

Abstract: A two-dimensional homomorphic logistic map that preserves features of the Lotka-Volterra equations was proposed. To examine chaos, iteration plots of the population, Lyapunov exponents calculated from Jacobian eigenvalues of the $2$D logistic mapping, and from time series algorithms of Rosenstein and Eckmann et al. were calculated. Bifurcation diagrams may be divided into four categories depending on topological shapes. Our model not only recovered the $1$D logistic map, which exhibits flip bifurcation, for the prey when there is a nonzero initial predator population, but it can also simulate normal competition between two species with equal initial populations. Despite the possibility for two species to go into chaos simultaneously, where the Neimark-Sacker bifurcation was observed, it is also possible that with the same interspecies parameters as normal but with a predator population $10$ times more than that of the prey, the latter becomes chaotic, while the former dramatically reduces to zero with only a few iterations, indicating total annihilation of the predator species. Interpreting humans as predators and natural resources as prey in the ecological system, the above-mentioned conclusion may imply that not only excessive consumption of natural resources, but its chaotic state triggered by an overpopulation of humans may backfire in a manner of total extinction of the human species. Fortunately, there is little chance for the survival of the human race, as isolated fixed points in the bifurcation diagram of the predator reveal.
Finally, two possible applications of the phenomenon of chaotic extinction are proposed: one is to inhibit viruses or pests by initiating the chaotic states of the prey on which the viruses or pests rely for existence, and the other is to achieve the superconducting state with the chaotic state of the applied magnetic field.

Submitted 24 February, 2024; v1 submitted 27 January, 2023; originally announced January 2023.

Comments: Paper abstract presented at the 33rd Annual Conference of the Society for Chaos Theory in Psychology & Life Sciences, The Fields Institute, U of Toronto, Aug. 4, 2023

arXiv:2301.09987 [pdf, other]

Chemical Integration of ODEs using Idealized Abstract Solutions

Authors: Su Hyeong Lee

Abstract: In this work, we propose a general inversion framework to non-uniquely invert a very large class of ordinary differential equations (ODEs) into chemical reaction networks. A thorough treatment of the relevant chemical reaction network theory from the literature is given. Various simulation results are provided to augment the selection procedure for the inverse framework, where a previously known kineticization strategy is shown to be deterministically excellent but undesirable in chemical simulations. The utility of the framework is verified by simulating reaction network forms of meaningful ODE systems, and their time series are analyzed. In particular, we provide simulations of deterministic chaotic attractors whose newly discovered reaction networks are non-equivalent with any existing chemical interpretations within the literature, as well as presenting exemplary figures which may form a roadmap to the successful biochemical implementation of the integration of ODE systems.

Submitted 23 January, 2023; originally announced January 2023.

arXiv:2210.07145 [pdf, other]

Accurate, reliable and interpretable solubility prediction of druglike molecules with attention pooling and Bayesian learning

Authors: Seongok Ryu, Sumin Lee

Abstract: In drug discovery, aqueous solubility is an important pharmacokinetic property which affects the absorption and assay availability of a drug. Thus, in silico prediction of solubility has been studied for its utility in virtual screening and lead optimization. Recently, machine learning (ML) methods using experimental data have been popular because physics-based methods like quantum mechanics and molecular dynamics are not suitable for high-throughput tasks due to their computational costs. However, ML methods can exhibit over-fitting in data-deficient conditions, and this is the case for most chemical property datasets. In addition, ML methods are regarded as black box functions in that it is difficult to interpret the contribution of hidden features to outputs, hindering analysis and modification of structure-activity relationships. To deal with these issues, we developed Bayesian graph neural networks (GNNs) with a self-attention readout layer. Unlike most GNNs using self-attention in node updates, self-attention applied at the readout layer enabled a model to improve prediction performance as well as to identify atom-wise importance, which can help lead optimization as exemplified for three FDA-approved drugs. Also, Bayesian inference enables us to separate more or less accurate results according to uncertainty in the solubility prediction task. We expect that our accurate, reliable and interpretable model can be used for more careful decision-making and various applications in the development of drugs.

Submitted 29 September, 2022; originally announced October 2022.

arXiv:2208.11853 [pdf, other]

Data-driven Discovery of Chemotactic Migration of Bacteria via Machine Learning

Authors: Yorgos M. Psarellis, Seungjoon Lee, Tapomoy Bhattacharjee, Sujit S. Datta, Juan M. Bello-Rivas, Ioannis G. Kevrekidis

Abstract: E. coli chemotactic motion in the presence of a chemoattractant field has been extensively studied using wet laboratory experiments, stochastic computational models as well as partial differential equation-based models (PDEs). The most challenging step in bridging these approaches, is establishing a closed form of the so-called chemotactic term, which describes how bacteria bias their motion up chemonutrient concentration gradients, as a result of a cascade of biochemical processes. Data-driven models can be used to learn the entire evolution operator of the chemotactic PDEs (black box models), or, in a more targeted fashion, to learn just the chemotactic term (gray box models). In this work, data-driven Machine Learning approaches for learning the underlying model PDEs are (a) validated through the use of simulation data from established continuum models and (b) used to infer chemotactic PDEs from experimental data. Even when the data at hand are sparse (coarse in space and/or time), noisy (due to inherent stochasticity in measurements) or partial (e.g. lack of measurements of the associated chemoattractant field), we can attempt to learn the right-hand-side of a closed PDE for an evolving bacterial density. In fact we show that data-driven PDEs including a short history of the bacterial density field (e.g. in the form of higher-order in time PDEs in terms of the measurable bacterial density) can be successful in predicting further bacterial density evolution, and even possibly recovering estimates of the unmeasured chemonutrient field.
The main tool in this effort is the effective low-dimensionality of the dynamics (in the spirit of the Whitney and Takens embedding theorems). The resulting data-driven PDE can then be simulated to reproduce/predict computational or experimental bacterial density profile data, and estimate the underlying (unmeasured) chemonutrient field evolution. △ Less

Submitted 24 August, 2022; originally announced August 2022.

Comments: 44 pages, 20 figures, 3 tables

arXiv:2206.07632 [pdf, other]

Exploring Chemical Space with Score-based Out-of-distribution Generation

Authors: Seul Lee, Jaehyeong Jo, Sung Ju Hwang

Abstract: A well-known limitation of existing molecular generative models is that the generated molecules highly resemble those in the training set. To generate truly novel molecules that may have even better properties for de novo drug discovery, more powerful exploration in the chemical space is necessary. To this end, we propose Molecular Out-Of-distribution Diffusion (MOOD), a score-based diffusion scheme that incorporates out-of-distribution (OOD) control in the generative stochastic differential equation (SDE) with simple control of a hyperparameter, and thus requires no additional cost. Since some novel molecules may not meet the basic requirements of real-world drugs, MOOD performs conditional generation by utilizing the gradients from a property predictor that guides the reverse-time diffusion process to high-scoring regions according to target properties such as protein-ligand interactions, drug-likeness, and synthesizability. This allows MOOD to search for novel and meaningful molecules rather than generating unseen yet trivial ones. We experimentally validate that MOOD is able to explore the chemical space beyond the training distribution, generating molecules that outscore ones found with existing methods, and even the top 0.01% of the original training pool. Our code is available at https://github.com/SeulLee05/MOOD.

Submitted 3 June, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

Comments: ICML 2023

arXiv:2205.13545 [pdf, other]

Learning black- and gray-box chemotactic PDEs/closures from agent based Monte Carlo simulation data

Authors: Seungjoon Lee, Yorgos M. Psarellis, Constantinos I. Siettos, Ioannis G. Kevrekidis

Abstract: We propose a machine learning framework for the data-driven discovery of macroscopic chemotactic Partial Differential Equations (PDEs) -- and the closures that lead to them -- from high-fidelity, individual-based stochastic simulations of E. coli bacterial motility. The fine scale, detailed, hybrid (continuum - Monte Carlo) simulation model embodies the underlying biophysics, and its parameters are informed from experimental observations of individual cells. We exploit Automatic Relevance Determination (ARD) within a Gaussian Process framework for the identification of a parsimonious set of collective observables that parametrize the law of the effective PDEs. Using these observables, in a second step we learn effective, coarse-grained "Keller-Segel class" chemotactic PDEs using machine learning regressors: (a) (shallow) feedforward neural networks and (b) Gaussian Processes. The learned laws can be black-box (when no prior knowledge about the PDE law structure is assumed) or gray-box when parts of the equation (e.g. the pure diffusion part) are known and "hardwired" in the regression process. We also discuss data-driven corrections (both additive and functional) of analytically known, approximate closures.

Submitted 25 May, 2022; originally announced May 2022.

Comments: 33 pages, 5 figures, 1 table

arXiv:2204.07362 [pdf, other]

Decoding Neural Correlation of Language-Specific Imagined Speech using EEG Signals

Authors: Keon-Woo Lee, Dae-Hyeok Lee, Sung-Jin Kim, Seong-Whan Lee

Abstract: Speech impairments due to cerebral lesions and degenerative disorders can be devastating. For humans with severe speech deficits, imagined speech in the brain-computer interface has been a promising hope for reconstructing the neural signals of speech production. However, studies in the EEG-based imagined speech domain still have some limitations due to high variability in spatial and temporal information and low signal-to-noise ratio. In this paper, we investigated the neural signals for two groups of native speakers with two tasks with different languages, English and Chinese. Our assumption was that English, a non-tonal and phonogram-based language, would have spectral differences in neural computation compared to Chinese, a tonal and ideogram-based language. The results showed a significant difference in the relative power spectral density between English and Chinese in specific frequency band groups. Also, the spatial evaluation of Chinese native speakers in the theta band was distinctive during the imagination task. Hence, this paper would suggest the key spectral and spatial information of word imagination with specialized language while decoding the neural signals of speech.

Submitted 15 April, 2022; originally announced April 2022.

Comments: Accepted in EMBC 2022

arXiv:2203.05707 [pdf]

doi 10.3233/JAD-220021

Machine Learning Based Multimodal Neuroimaging Genomics Dementia Score for Predicting Future Conversion to Alzheimer's Disease

Authors: Ghazal Mirabnahrazam, Da Ma, Sieun Lee, Karteek Popuri, Hyunwoo Lee, Jiguo Cao, Lei Wang, James E Galvin, Mirza Faisal Beg, the Alzheimer's Disease Neuroimaging Initiative

Abstract: Background: The increasing availability of databases containing both magnetic resonance imaging (MRI) and genetic data allows researchers to utilize multimodal data to better understand the characteristics of dementia of Alzheimer's type (DAT). Objective: The goal of this study was to develop and analyze novel biomarkers that can help predict the development and progression of DAT. Methods: We used feature selection and an ensemble learning classifier to develop an image/genotype-based DAT score that represents a subject's likelihood of developing DAT in the future. Three feature types were used: MRI only, genetic only, and combined multimodal data. We used a novel data stratification method to better represent different stages of DAT. Using a pre-defined 0.5 threshold on DAT scores, we predicted whether or not a subject would develop DAT in the future. Results: Our results on the Alzheimer's Disease Neuroimaging Initiative (ADNI) database showed that dementia scores using genetic data could better predict future DAT progression for currently normal control subjects (Accuracy=0.857) compared to MRI (Accuracy=0.143), while MRI can better characterize subjects with stable mild cognitive impairment (Accuracy=0.614) compared to genetics (Accuracy=0.356). Combining MRI and genetic data showed improved classification performance in the remaining stratified groups. Conclusion: MRI and genetic data can contribute to DAT prediction in different ways.
MRI data reflects anatomical changes in the brain, while genetic data can detect the risk of DAT progression prior to the symptomatic onset. Combining information from multimodal data in the right way can improve prediction performance.

Submitted 10 March, 2022; originally announced March 2022.

Journal ref: J Alzheimers Dis 1 Jan. (2022) 1-21

arXiv:2203.03920 [pdf, ps, other]

doi 10.1016/j.jtbi.2022.111202

A second-order stability analysis for the continuous model of indirect reciprocity

Authors: Sanghun Lee, Yohsuke Murase, Seung Ki Baek

Abstract: Reputation is one of the key mechanisms to maintain human cooperation, but its analysis gets complicated if we consider the possibility that reputation does not reach consensus because of erroneous assessment. The difficulty is alleviated if we assume that reputation and cooperation do not take binary values but have continuous spectra so that disagreement over reputation can be analysed in a perturbative way. In this work, we carry out the analysis by expanding the dynamics of reputation to the second order of perturbation under the assumption that everyone initially cooperates with good reputation. The second-order theory clarifies the difference between Image Scoring and Simple Standing in that punishment for defection against a well-reputed player should be regarded as good for maintaining cooperation. Moreover, comparison among the leading eight shows that the stabilizing effect of justified punishment weakens if cooperation between two ill-reputed players is regarded as bad. Our analysis thus explains how Simple Standing achieves a high level of stability by permitting justified punishment and also by disregarding irrelevant information in assessing cooperation. This observation suggests which factors affect the stability of a social norm when reputation can be perturbed by noise.

Submitted 11 July, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

Comments: 21 pages, 2 figures

Journal ref: J. Theor. Biol. 548, 111202 (2022)

arXiv:2112.05240 [pdf]

doi 10.34133/2022/9786242

Label-free virtual HER2 immunohistochemical staining of breast tissue using deep learning

Authors: Bijie Bai, Hongda Wang, Yuzhu Li, Kevin de Haan, Francesco Colonnese, Yujie Wan, Jingyi Zuo, Ngan B. Doan, Xiaoran Zhang, Yijie Zhang, Jingxi Li, Wenjie Dong, Morgan Angus Darrow, Elham Kamangar, Han Sung Lee, Yair Rivenson, Aydogan Ozcan

Abstract: The immunohistochemical (IHC) staining of the human epidermal growth factor receptor 2 (HER2) biomarker is widely practiced in breast tissue analysis, preclinical studies and diagnostic decisions, guiding cancer treatment and investigation of pathogenesis. HER2 staining demands laborious tissue treatment and chemical processing performed by a histotechnologist, which typically takes one day to prepare in a laboratory, increasing analysis time and associated costs. Here, we describe a deep learning-based virtual HER2 IHC staining method using a conditional generative adversarial network that is trained to rapidly transform autofluorescence microscopic images of unlabeled/label-free breast tissue sections into bright-field equivalent microscopic images, matching the standard HER2 IHC staining that is chemically performed on the same tissue sections. The efficacy of this virtual HER2 staining framework was demonstrated by quantitative analysis, in which three board-certified breast pathologists blindly graded the HER2 scores of virtually stained and immunohistochemically stained HER2 whole slide images (WSIs) to reveal that the HER2 scores determined by inspecting virtual IHC images are as accurate as their immunohistochemically stained counterparts.
A second quantitative blinded study performed by the same diagnosticians further revealed that the virtually stained HER2 images exhibit a comparable staining quality in the level of nuclear detail, membrane clearness, and absence of staining artifacts with respect to their immunohistochemically stained counterparts. This virtual HER2 staining framework bypasses the costly, laborious, and time-consuming IHC staining procedures in the laboratory, and can be extended to other types of biomarkers to accelerate the IHC tissue staining used in life sciences and biomedical workflow.

Submitted 8 December, 2021; originally announced December 2021.

Comments: 26 Pages, 5 Figures

Journal ref: BME Frontiers (2022)

arXiv:2111.06583 [pdf, other]

doi 10.3938/NPSM.70.1077

Mesoscale properties of mutualistic networks in ecosystems

Authors: Sang Hoon Lee

Abstract: Uncovering structural properties of ecological networks is a crucial starting point of studying the system's stability in response to various types of perturbations. We analyze pollination and seed dispersal networks, which are representative examples of mutualistic networks in ecosystems, at various scales. In particular, we examine mesoscale properties such as the nested structure, the core-periphery structure, and the community structure by statistically investigating their interrelationships with real network data. As a result of community detection at different scales, we find the absence of meaningful hierarchy between networks, and the negative correlation between the modularity and the two other structures (nestedness and core-periphery-ness), which themselves are highly positively correlated. In addition, no characteristic scale of communities is perceivable from the community-inconsistency analysis. Therefore, the community structures, which are the most widely studied mesoscale structures of networks, are not in fact adequate to characterize the mutualistic networks of this scale in ecosystems.

Submitted 14 November, 2021; v1 submitted 12 November, 2021; originally announced November 2021.

Comments: 9 pages, 4 figures, in Korean

Journal ref: New Phys.: Sae Mulli 70, 1077 (2020)

arXiv:2110.06492 [pdf, other]

Understanding of a brain spatial map based on threshold-free function dendrogramization

Authors: Hyekyoung Lee, Hyejin Kang, Youngmin Huh, Hongyoon Choi, Dong Soo Lee

Abstract: Linear matrix factorizations (LMFs) such as independent component analysis (ICA), principal component analysis (PCA), and their extensions, have been widely used for finding relevant spatial maps in brain imaging data. The last step of an LMF before interpretation is usually to extract the activated brain regions from the map by thresholding. However, it is difficult to determine an appropriate threshold level. Thresholding can remove the underlying properties of spatial maps and their features imposed by the model. In this study, we propose a threshold-free activated region extraction method which involves simplifying a brain spatial map to a dendrogram through Morse filtration. Since a dendrogram is related to the change of clustering structure in Rips filtration, we first show the relationship between the Rips filtration of a graph and the Morse filtration of a function. Then, we dendrogramize a spatial map in order to visualize the activated brain regions and the range of their importance in a spatial map. The proposed method can be applied to any spatial maps that a user wants to threshold and interpret. In experiments, we applied the proposed method to independent component maps (ICMs) obtained from resting-state fMRI data, and the dominant subnetworks obtained by the PCA of a correlation-based functional connectivity of FDG PET Alzheimer's disease neuroimaging initiative (ADNI) data. We found that dendrogramization can help to understand a brain spatial map without thresholding.

Submitted 13 October, 2021; originally announced October 2021.

arXiv:2110.05987 [pdf]

doi 10.1126/science.abm7530

Getting Genetic Ancestry Right for Science and Society

Authors: Anna C. F. Lewis, Santiago J. Molina, Paul S Appelbaum, Bege Dauda, Anna Di Rienzo, Agustin Fuentes, Stephanie M. Fullerton, Nanibaa' A. Garrison, Nayanika Ghosh, Evelynn M. Hammonds, David S. Jones, Eimear E. Kenny, Peter Kraft, Sandra S. -J. Lee, Madelyn Mauro, John Novembre, Aaron Panofsky, Mashaal Sohail, Benjamin M. Neale, Danielle S. Allen

Abstract: There is a scientific and ethical imperative to embrace a multidimensional, continuous view of ancestry and move away from continental ancestry categories.

Submitted 14 October, 2021; v1 submitted 12 October, 2021; originally announced October 2021.

arXiv:2110.01219 [pdf, other]

Hit and Lead Discovery with Explorative RL and Fragment-based Molecule Generation

Authors: Soojung Yang, Doyeong Hwang, Seul Lee, Seongok Ryu, Sung Ju Hwang

Abstract: Recently, utilizing reinforcement learning (RL) to generate molecules with desired properties has been highlighted as a promising strategy for drug design. A molecular docking program - a physical simulation that estimates protein-small molecule binding affinity - can be an ideal reward scoring function for RL, as it is a straightforward proxy of the therapeutic potential. Still, two imminent challenges exist for this task. First, the models often fail to generate chemically realistic and pharmacochemically acceptable molecules. Second, the docking score optimization is a difficult exploration problem that involves many local optima and less smooth surfaces with respect to molecular structure. To tackle these challenges, we propose a novel RL framework that generates pharmacochemically acceptable molecules with large docking scores. Our method - Fragment-based generative RL with Explorative Experience replay for Drug design (FREED) - constrains the generated molecules to a realistic and qualified chemical space and effectively explores the space to find drugs by coupling our fragment-based generation method and a novel error-prioritized experience replay (PER). We also show that our model performs well on both de novo and scaffold-based schemes. Our model produces molecules of higher quality compared to existing methods while achieving state-of-the-art performance on two of three targets in terms of the docking scores of the generated molecules.
We further show with ablation studies that our method, predictive error-PER (FREED(PE)), significantly improves the model performance.

Submitted 26 October, 2021; v1 submitted 4 October, 2021; originally announced October 2021.

Comments: To be published in NeurIPS 2021

arXiv:2106.16154 [pdf]

Ultra-Sharp Nanowire Arrays Natively Permeate, Record, and Stimulate Intracellular Activity in Neuronal and Cardiac Networks

Authors: Ren Liu, Jihwan Lee, Youngbin Tchoe, Deborah Pre, Andrew M. Bourhis, Agnieszka D'Antonio-Chronowska, Gaelle Robin, Sang Heon Lee, Yun Goo Ro, Ritwik Vatsyayan, Karen J. Tonsfeldt, Lorraine A. Hossain, M. Lisa Phipps, Jinkyoung Yoo, John Nogan, Jennifer S. Martinez, Kelly A. Frazer, Anne G. Bang, Shadi A. Dayeh

Abstract: Intracellular access with high spatiotemporal resolution can enhance our understanding of how neurons or cardiomyocytes regulate and orchestrate network activity, and how this activity can be affected with pharmacology or other interventional modalities. Nanoscale devices often employ electroporation to transiently permeate the cell membrane and record intracellular potentials, which tend to decrease rapidly to extracellular potential amplitudes with time. Here, we report innovative scalable, vertical, ultra-sharp nanowire arrays that are individually addressable to enable long-term, native recordings of intracellular potentials. We report large action potential amplitudes that are indicative of intracellular access from 3D tissue-like networks of neurons and cardiomyocytes across recording days and that do not decrease to extracellular amplitudes for the duration of the recording of several minutes. Our findings are validated with cross-sectional microscopy, pharmacology, and electrical interventions. Our experiments and simulations demonstrate that individual electrical addressability of nanowires is necessary for high-fidelity intracellular electrophysiological recordings. This study advances our understanding of and control over high-quality multi-channel intracellular recordings, and paves the way toward predictive, high-throughput, and low-cost electrophysiological drug screening platforms.

Submitted 5 July, 2021; v1 submitted 30 June, 2021; originally announced June 2021.

Comments: Main manuscript: 33 pages, 4 figures, Supporting information: 43 pages, 27 figures, Submitted to Advanced Materials

arXiv:2106.04026 [pdf, other]

Subject-Independent Brain-Computer Interface for Decoding High-Level Visual Imagery Tasks

Authors: Dae-Hyeok Lee, Dong-Kyun Han, Sung-Jin Kim, Ji-Hoon Jeong, Seong-Whan Lee

Abstract: Brain-computer interface (BCI) is used for communication between humans and devices by recognizing status and intention of humans. Communication between humans and a drone using electroencephalogram (EEG) signals is one of the most challenging issues in the BCI domain. In particular, the control of drone swarms (the direction and formation) has more advantages compared to the control of a drone. The visual imagery (VI) paradigm is that subjects visually imagine specific objects or scenes. Reduction of the variability among EEG signals of subjects is essential for practical BCI-based systems. In this study, we proposed the subepoch-wise feature encoder (SEFE) to improve the performances in the subject-independent tasks by using the VI dataset. This study is the first attempt to demonstrate the possibility of generalization among subjects in the VI-based BCI. We used the leave-one-subject-out cross-validation for evaluating the performances. We obtained higher performances when including our proposed module than when excluding it. The DeepConvNet with SEFE showed the highest performance of 0.72 among six different decoding models. Hence, we demonstrated the feasibility of decoding the VI dataset in the subject-independent task with robust performances by using our proposed module.

Submitted 16 August, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

Comments: 6 pages, 3 figures

arXiv:2104.11364 [pdf]

A field guide to cultivating computational biology

Authors: Anne E Carpenter, Casey S Greene, Piero Carnici, Benilton S Carvalho, Michiel de Hoon, Stacey Finley, Kim-Anh Le Cao, Jerry SH Lee, Luigi Marchionni, Suzanne Sindi, Fabian J Theis, Gregory P Way, Jean YH Yang, Elana J Fertig

Abstract: Biomedical research centers can empower basic discovery and novel therapeutic strategies by leveraging their large-scale datasets from experiments and patients. This data, together with new technologies to create and analyze it, has ushered in an era of data-driven discovery which requires moving beyond the traditional individual, single-discipline investigator research model. This interdisciplinary niche is where computational biology thrives. It has matured over the past three decades and made major contributions to scientific knowledge and human health, yet researchers in the field often languish in career advancement, publication, and grant review. We propose solutions for individual scientists, institutions, journal publishers, funding agencies, and educators.

Submitted 22 April, 2021; originally announced April 2021.

arXiv:2104.02881 [pdf, other]

doi 10.1038/s41598-021-93598-7

Local stability of cooperation in a continuous model of indirect reciprocity

Authors: Sanghun Lee, Yohsuke Murase, Seung Ki Baek

Abstract: Reputation is a powerful mechanism to enforce cooperation among unrelated individuals through indirect reciprocity, but it suffers from disagreement originating from private assessment, noise, and incomplete information. In this work, we investigate stability of cooperation in the donation game by regarding each player's reputation and behaviour as continuous variables. Through perturbative calculation, we derive a condition that a social norm should satisfy to give penalties to its close variants, provided that everyone initially cooperates with a good reputation, and this result is supported by numerical simulation. A crucial factor of the condition is whether a well-reputed player's donation to an ill-reputed co-player is appreciated by other members of the society, and the condition can be reduced to a threshold for the benefit-cost ratio of cooperation which depends on the reputational sensitivity to a donor's behaviour as well as on the behavioural sensitivity to a recipient's reputation. Our continuum formulation suggests how indirect reciprocity can work beyond the dichotomy between good and bad even in the presence of inhomogeneity, noise, and incomplete information.

Submitted 9 July, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

Comments: 13 pages, 3 figures

Journal ref: Sci. Rep. 11, 14225 (2021)

arXiv:2009.09514 [pdf, other]

Early Indicators of COVID-19 Spread Risk Using Digital Trace Data of Population Activities

Authors: Xinyu Gao, Chao Fan, Yang Yang, Sanghyeon Lee, Qingchun Li, Mikel Maron, Ali Mostafavi

Abstract: The spread of pandemics such as COVID-19 is strongly linked to human activities. The objective of this paper is to specify and examine early indicators of disease spread risk in cities during the initial stages of outbreak based on patterns of human activities obtained from digital trace data. In this study, the Venables distance (D_v), and the activity density (D_a) are used to quantify and evaluate human activities for 193 US counties, whose cumulative number of confirmed cases was greater than 100 as of March 31, 2020. Venables distance provides a measure of the agglomeration of the level of human activities based on the average distance of human activities across a city or a county (less distance could lead to a greater contact risk). Activity density provides a measure of the overall activity level in a county or a city (more activity could lead to a greater risk). Accordingly, Pearson correlation analysis is used to examine the relationship between the two human activity indicators and the basic reproduction number in the following weeks. The results show statistically significant correlations between the indicators of human activities and the basic reproduction number in all counties, as well as a significant leader-follower relationship (time lag) between them. The results also show one to two weeks' lag between the change in activity indicators and the decrease in the basic reproduction number.
This result implies that the human activity indicators provide effective early indicators for the spread risk of the pandemic during the early stages of the outbreak. Hence, the results could be used by the authorities to proactively assess the risk of disease spread by monitoring the daily Venables distance and activity density.

Submitted 20 September, 2020; originally announced September 2020.

Comments: 12 pages, 8 figures

arXiv:2008.05377 [pdf]

Network reinforcement driven drug repurposing for COVID-19 by exploiting disease-gene-drug associations

Authors: Yonghyun Nam, Jae-Seung Yun, Seung Mi Lee, Ji Won Park, Ziqi Chen, Brian Lee, Anurag Verma, Xia Ning, Li Shen, Dokyoon Kim

Abstract: Currently, the number of patients with COVID-19 has significantly increased. Thus, there is an urgent need for developing treatments for COVID-19. Drug repurposing, which is the process of reusing already-approved drugs for new medical conditions, can be a good way to solve this problem quickly and broadly. Many clinical trials for COVID-19 patients using treatments for other diseases are already in place or will be performed at clinical sites in the near future. Additionally, patients with comorbidities such as diabetes mellitus, obesity, liver cirrhosis, kidney diseases, hypertension, and asthma are at higher risk for severe illness from COVID-19. Thus, the relationship of comorbid diseases with COVID-19 may help to find repurposable drugs. To reduce trial and error in finding treatments for COVID-19, we propose building a network-based drug repurposing framework to prioritize repurposable drugs. First, we utilized knowledge of COVID-19 to construct a disease-gene-drug network (DGDr-Net) representing a COVID-19-centric interactome with components for diseases, genes, and drugs. DGDr-Net consisted of 592 diseases, 26,681 human genes and 2,173 drugs, and medical information for 18 common comorbidities. The DGDr-Net recommended candidate repurposable drugs for COVID-19 through network reinforcement driven scoring algorithms. The scoring algorithms determined the priority of recommendations by utilizing graph-based semi-supervised learning. From the predicted scores, we recommended 30 drugs, including dexamethasone, resveratrol, methotrexate, indomethacin, and quercetin, as repurposable drugs for COVID-19, and the results were verified with drugs that have been under clinical trials. The list of drugs obtained via a data-driven computational approach could help reduce trial and error in finding treatments for COVID-19.

Submitted 12 August, 2020; originally announced August 2020.

Comments: 4 figures

arXiv:2007.04578 [pdf, other]

On the Reliability and Generalizability of Brain-inspired Reinforcement Learning Algorithms

Authors: Dongjae Kim, Jee Hang Lee, Jae Hoon Shin, Minsu Abel Yang, Sang Wan Lee

Abstract: Although deep RL models have shown great potential for solving various types of tasks with minimal supervision, several key challenges remain in terms of learning from limited experience, adapting to environmental changes, and generalizing learning from a single task. Recent evidence in decision neuroscience has shown that the human brain has an innate capacity to resolve these issues, leading to optimism regarding the development of neuroscience-inspired solutions toward sample-efficient and generalizable RL algorithms. We show that the computational model combining model-based and model-free control, which we term the prefrontal RL, reliably encodes the information of high-level policy that humans learned, and this model can generalize the learned policy to a wide range of tasks. First, we trained the prefrontal RL and deep RL algorithms on 82 subjects' data, collected while human participants were performing two-stage Markov decision tasks, in which we manipulated the goal, state-transition uncertainty and state-space complexity. In the reliability test, which includes the latent behavior profile and the parameter recoverability test, we showed that the prefrontal RL reliably learned the latent policies of the humans, while all the other models failed. Second, to test the ability to generalize what these models learned from the original task, we situated them in the context of environmental volatility. Specifically, we ran large-scale simulations with 10 Markov decision tasks, in which latent context variables change over time. Our information-theoretic analysis showed that the prefrontal RL had the highest level of adaptability and episodic encoding efficacy. This is the first attempt to formally test the possibility that computational models mimicking the way the brain solves general problems can lead to practical solutions to key challenges in machine learning.

Submitted 9 July, 2020; originally announced July 2020.

arXiv:2006.11843 [pdf]

Unsupervised Learning of Deep-Learned Features from Breast Cancer Images

Authors: Sanghoon Lee, Colton Farley, Simon Shim, Yanjun Zhao, Wookjin Choi, Wook-Sung Yoo

Abstract: Detecting cancer manually in whole slide images is a laborious process that requires significant time and effort. Recent advances in whole slide image analysis have stimulated the growth and development of machine learning-based approaches that improve the efficiency and effectiveness of cancer diagnosis. In this paper, we propose an unsupervised learning approach for detecting cancer in breast invasive carcinoma (BRCA) whole slide images. The proposed method is fully automated and does not require human involvement during the unsupervised learning procedure. We demonstrate the effectiveness of the proposed approach for cancer detection in BRCA and show how the machine can choose the most appropriate clusters during the unsupervised learning procedure. Moreover, we present a prototype application that enables users to select relevant groups mapping all regions related to the groups in whole slide images.

Submitted 21 June, 2020; originally announced June 2020.

Comments: 7 pages for IEEE BIBE

arXiv:2006.01054 [pdf, other]

Effects of Population Co-location Reduction on Cross-county Transmission Risk of COVID-19 in the United States

Authors: Chao Fan, Sanghyeon Lee, Yang Yang, Bora Oztekin, Qingchun Li, Ali Mostafavi

Abstract: The rapid spread of COVID-19 in the United States has imposed a major threat to public health, the real economy, and human well-being. In the absence of effective vaccines, the preventive actions of social distancing and travel reduction are recognized as essential non-pharmacologic approaches to control the spread of COVID-19. Prior studies demonstrated that human movement and mobility drove the spatiotemporal distribution of COVID-19 in China. Little is known, however, about the patterns and effects of co-location reduction on cross-county transmission risk of COVID-19. This study utilizes Facebook co-location data for all counties in the United States from March to early May 2020. The analysis examines the synchronicity and time lag between travel reduction and pandemic growth trajectory to evaluate the efficacy of social distancing in reducing population co-location probabilities, and subsequently the growth in weekly new cases. The results show that the mitigation effects of co-location reduction appear in the growth of weekly new cases with one week of delay. Furthermore, significant segregation is found among different county groups, which are categorized based on numbers of cases. The results suggest that within-group co-location probabilities remain stable, and social distancing policies primarily resulted in reduced cross-group co-location probabilities (due to travel reduction from counties with large numbers of cases to counties with low numbers of cases). These findings could have important practical implications for local governments to inform their intervention measures for monitoring and reducing the spread of COVID-19, as well as for adoption in future pandemics. Public policy, economic forecasting, and epidemic modeling need to account for population co-location patterns in evaluating transmission risk of COVID-19 across counties.

Submitted 1 June, 2020; originally announced June 2020.

Comments: 12 pages, 7 figures

arXiv:2005.08701 [pdf, other]

doi 10.1007/s40042-021-00056-8

Machine learning for the diagnosis of early stage diabetes using temporal glucose profiles

Authors: Woo Seok Lee, Junghyo Jo, Taegeun Song

Abstract: Machine learning shows remarkable success in recognizing patterns in data. Here we apply machine learning (ML) to the diagnosis of early stage diabetes, which is known as a challenging task in medicine. Blood glucose levels are tightly regulated by two counter-regulatory hormones, insulin and glucagon, and the failure of glucose homeostasis leads to the common metabolic disease, diabetes mellitus. It is a chronic disease that has a long latent period that complicates detection of the disease at an early stage. The vast majority of diabetes cases result from diminished effectiveness of insulin action. Insulin resistance must modify the temporal profile of blood glucose. Thus we propose to use ML to detect the subtle change in the temporal pattern of glucose concentration. Time series data of blood glucose with sufficient resolution is currently unavailable, so we confirm the proposal using synthetic data of glucose profiles produced by a biophysical model that considers the glucose regulation and hormone action. Multi-layered perceptrons, convolutional neural networks, and recurrent neural networks all identified the degree of insulin resistance with high accuracy above $85\%$.

Submitted 18 May, 2020; originally announced May 2020.

Comments: 4 pages, 2 figures

Showing 1–50 of 104 results for author: Lee, S