-
Multi Source COVID-19 Detection via Kernel-Density-based Slice Sampling
Authors:
Chia-Ming Lee,
Bo-Cheng Qiu,
Ting-Yao Chen,
Ming-Han Sun,
Fang-Ying Lin,
Jung-Tse Tsai,
I-An Tsai,
Yu-Fan Lin,
Chih-Chung Hsu
Abstract:
We present our solution for the Multi-Source COVID-19 Detection Challenge, which classifies chest CT scans from four distinct medical centers. To address multi-source variability, we employ the Spatial-Slice Feature Learning (SSFL) framework with Kernel-Density-based Slice Sampling (KDS). Our preprocessing pipeline combines lung region extraction, quality control, and adaptive slice sampling to select eight representative slices per scan. We compare EfficientNet and Swin Transformer architectures on the validation set. The EfficientNet model achieves an F1-score of 94.68%, compared to the Swin Transformer's 93.34%. The results demonstrate the effectiveness of our KDS-based pipeline on multi-source data and highlight the importance of dataset balance in multi-institutional medical imaging evaluation.
Submitted 2 July, 2025;
originally announced July 2025.
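The abstract above names Kernel-Density-based Slice Sampling but does not spell out the procedure. The following is only a minimal sketch of one plausible reading, not the authors' implementation: it assumes a per-slice lung-area score is available, smooths it with a Gaussian kernel over the slice index, and picks slices at evenly spaced quantiles of the resulting density. The function name `kde_slice_sample`, the `bandwidth` parameter, and the use of lung area as the weight are all assumptions made for illustration.

```python
import math


def kde_slice_sample(lung_areas, n_slices=8, bandwidth=2.0):
    """Pick n_slices indices at evenly spaced quantiles of a smoothed density.

    lung_areas: hypothetical per-slice scores (e.g. segmented lung area).
    bandwidth:  Gaussian kernel width, in units of slice index (assumed).
    """
    n = len(lung_areas)
    # Gaussian kernel density over slice positions, weighted by lung area.
    density = [
        sum(a * math.exp(-0.5 * ((i - j) / bandwidth) ** 2)
            for j, a in enumerate(lung_areas))
        for i in range(n)
    ]
    total = sum(density)
    if total <= 0:
        raise ValueError("lung_areas must contain at least one positive value")

    # Cumulative distribution over slice indices.
    cdf, acc = [], 0.0
    for d in density:
        acc += d / total
        cdf.append(acc)

    # One slice per quantile bin midpoint: (k + 0.5) / n_slices.
    picks = []
    for k in range(n_slices):
        q = (k + 0.5) / n_slices
        picks.append(next((i for i, c in enumerate(cdf) if c >= q), n - 1))
    return picks
```

Under these assumptions, slices cluster where the lung region is large and the empty ends of the scan are skipped; very short scans can yield repeated indices.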
-
Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
Authors:
Pengxiang Li,
Shilin Yan,
Joey Tsai,
Renrui Zhang,
Ruichuan An,
Ziyu Guo,
Xiaowei Gao
Abstract:
Classifier-Free Guidance (CFG) significantly enhances controllability in generative models by interpolating conditional and unconditional predictions. However, standard CFG often employs a static unconditional input, which can be suboptimal for iterative generation processes where model uncertainty varies dynamically. We introduce Adaptive Classifier-Free Guidance (A-CFG), a novel method that tailors the unconditional input by leveraging the model's instantaneous predictive confidence. At each step of an iterative (masked) diffusion language model, A-CFG identifies tokens in the currently generated sequence for which the model exhibits low confidence. These tokens are temporarily re-masked to create a dynamic, localized unconditional input. This focuses CFG's corrective influence precisely on areas of ambiguity, leading to more effective guidance. We integrate A-CFG into a state-of-the-art masked diffusion language model and demonstrate its efficacy. Experiments on diverse language generation benchmarks show that A-CFG yields substantial improvements over standard CFG, achieving, for instance, a 3.9 point gain on GPQA. Our work highlights the benefit of dynamically adapting guidance mechanisms to model uncertainty in iterative generation.
Submitted 26 May, 2025;
originally announced May 2025.
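The re-masking step described in the A-CFG abstract above can be illustrated with a small sketch. It is not the authors' code: the helper names, the fixed `num_remask` budget, and the guidance formula `uncond + scale * (cond - uncond)` (one common way of writing CFG) are assumptions chosen for illustration.

```python
def remask_low_confidence(tokens, confidences, mask_id, num_remask):
    """Return a copy of `tokens` with the least-confident tokens re-masked.

    The result plays the role of the dynamic "unconditional" input.
    """
    # Only positions that currently hold a generated (non-mask) token qualify.
    candidates = [i for i, t in enumerate(tokens) if t != mask_id]
    candidates.sort(key=lambda i: confidences[i])
    chosen = set(candidates[:num_remask])
    return [mask_id if i in chosen else t for i, t in enumerate(tokens)]


def guided_logits(cond, uncond, scale):
    """Combine conditional and unconditional logits (assumed CFG form)."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


MASK = -1  # hypothetical mask token id
tokens = [5, 7, MASK, 9, 3]
confidences = [0.9, 0.2, 0.0, 0.4, 0.95]
uncond_input = remask_low_confidence(tokens, confidences, MASK, num_remask=2)
```

In this toy run the two least-confident generated tokens (positions 1 and 3) are re-masked, so guidance would act mainly on those positions.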
-
CrossLMM: Decoupling Long Video Sequences from LMMs via Dual Cross-Attention Mechanisms
Authors:
Shilin Yan,
Jiaming Han,
Joey Tsai,
Hongwei Xue,
Rongyao Fang,
Lingyi Hong,
Ziyu Guo,
Ray Zhang
Abstract:
The advent of Large Multimodal Models (LMMs) has significantly enhanced Large Language Models (LLMs) to process and interpret diverse data modalities (e.g., image and video). However, as input complexity increases, particularly with long video sequences, the number of required tokens has grown significantly, leading to quadratically growing computational costs. This has made the efficient compression of video tokens in LMMs, while maintaining performance integrity, a pressing research challenge. In this paper, we introduce CrossLMM, decoupling long video sequences from LMMs via a dual cross-attention mechanism, which substantially reduces visual token quantity with minimal performance degradation. Specifically, we first implement a significant token reduction from pretrained visual encoders through a pooling methodology. Then, within LLM layers, we employ a visual-to-visual cross-attention mechanism, wherein the pooled visual tokens function as queries against the original visual token set. This module enables more efficient token utilization while retaining fine-grained informational fidelity. In addition, we introduce a text-to-visual cross-attention mechanism, in which the text tokens are enhanced through interaction with the original visual tokens, enriching the visual comprehension of the text tokens. Comprehensive empirical evaluation demonstrates that our approach achieves comparable or superior performance across diverse video-based LMM benchmarks, despite utilizing substantially fewer computational resources.
Submitted 22 May, 2025;
originally announced May 2025.
-
PreCare: Designing AI Assistants for Advance Care Planning (ACP) to Enhance Personal Value Exploration, Patient Knowledge, and Decisional Confidence
Authors:
Yu Lun Hsu,
Yun-Rung Chou,
Chiao-Ju Chang,
Yu-Cheng Chang,
Zer-Wei Lee,
Rokas Gipiškis,
Rachel Li,
Chih-Yuan Shih,
Jen-Kuei Peng,
Hsien-Liang Huang,
Jaw-Shiun Tsai,
Mike Y. Chen
Abstract:
Advance Care Planning (ACP) allows individuals to specify their preferred end-of-life life-sustaining treatments before they become incapacitated by injury or terminal illness (e.g., coma, cancer, dementia). While online ACP offers high accessibility, it lacks key benefits of clinical consultations, including personalized value exploration and immediate clarification of decision consequences. To bridge this gap, we conducted two formative studies: 1) shadowed and interviewed 3 ACP teams consisting of physicians, nurses, and social workers (18 patients total), and 2) interviewed 14 users of ACP websites. Building on these insights, we designed PreCare in collaboration with 6 ACP professionals. PreCare is a website with 3 AI-driven assistants designed to guide users through exploring personal values, gaining ACP knowledge, and supporting informed decision-making. A usability study (n=12) showed that PreCare achieved a System Usability Scale (SUS) rating of excellent. A comparative evaluation (n=12) showed that PreCare's AI assistants significantly improved exploration of personal values, knowledge, and decisional confidence, and that PreCare was preferred by 92% of participants.
Submitted 13 May, 2025;
originally announced May 2025.
-
Introducing JIRIAF: A Virtual Kubelet Integration for Optimizing HPC Resource Provisioning
Authors:
Vardan Gyurjyan,
Graham Heyes,
Christopher Larrieu,
David Lawrence,
Jeng-Yuan Tsai
Abstract:
The JIRIAF (JLab Integrated Research Infrastructure Across Facilities) framework is designed to streamline resource management and optimize high-performance computing (HPC) workloads across heterogeneous environments. Central to JIRIAF is the JIRIAF Resource Manager (JRM), which effectively leverages Kubernetes and Virtual Kubelet to manage resources dynamically, even in environments with restricted user privileges. By operating in userspace, JRM facilitates the execution of user applications as containers across diverse computing sites, ensuring unified control and monitoring. The framework's effectiveness is demonstrated through a case study involving the deployment of data-stream processing pipelines on the Perlmutter system at NERSC, showcasing its capability to manage large-scale HPC applications efficiently. Additionally, we discuss the integration of a digital twin model for a simulated queue system related to a streaming system, using a Dynamic Bayesian Network (DBN) to enhance real-time monitoring and control, providing valuable insights into system performance and optimization strategies.
Submitted 25 February, 2025;
originally announced February 2025.
-
LITA: An Efficient LLM-assisted Iterative Topic Augmentation Framework
Authors:
Chia-Hsuan Chang,
Jui-Tse Tsai,
Yi-Hang Tsai,
San-Yih Hwang
Abstract:
Topic modeling is widely used for uncovering thematic structures within text corpora, yet traditional models often struggle with specificity and coherence in domain-focused applications. Guided approaches, such as SeededLDA and CorEx, incorporate user-provided seed words to improve relevance but remain labor-intensive and static. Large language models (LLMs) offer potential for dynamic topic refinement and discovery, yet their application often incurs high API costs. To address these challenges, we propose the LLM-assisted Iterative Topic Augmentation framework (LITA), an LLM-assisted approach that integrates user-provided seeds with embedding-based clustering and iterative refinement. LITA identifies a small number of ambiguous documents and employs an LLM to reassign them to existing or new topics, minimizing API costs while enhancing topic quality. Experiments on two datasets across topic quality and clustering performance metrics demonstrate that LITA outperforms five baseline models, including LDA, SeededLDA, CorEx, BERTopic, and PromptTopic. Our work offers an efficient and adaptable framework for advancing topic modeling and text clustering.
Submitted 21 May, 2025; v1 submitted 16 December, 2024;
originally announced December 2024.
-
Graph Transformer Networks for Accurate Band Structure Prediction: An End-to-End Approach
Authors:
Weiyi Gong,
Tao Sun,
Hexin Bai,
Jeng-Yuan Tsai,
Haibin Ling,
Qimin Yan
Abstract:
Predicting electronic band structures from crystal structures is crucial for understanding structure-property correlations in materials science. First-principles approaches are accurate but computationally intensive. In recent years, machine learning (ML) has been extensively applied to this field, but existing ML models predominantly focus on band gap predictions or indirect band structure estimation via solving predicted Hamiltonians. An end-to-end model to predict band structure accurately and efficiently is still lacking. Here, we introduce a graph Transformer-based end-to-end approach that directly predicts band structures from crystal structures with high accuracy. Our method leverages the continuity of the k-path and treats continuous bands as a sequence. We demonstrate that our model not only provides accurate band structure predictions but also can derive other properties (such as band gap, band center, and band dispersion) with high accuracy. We verify the model performance on large and diverse datasets.
Submitted 25 November, 2024;
originally announced November 2024.
-
Social Media Algorithms Can Shape Affective Polarization via Exposure to Antidemocratic Attitudes and Partisan Animosity
Authors:
Tiziano Piccardi,
Martin Saveski,
Chenyan Jia,
Jeffrey T. Hancock,
Jeanne L. Tsai,
Michael Bernstein
Abstract:
There is widespread concern about the negative impacts of social media feed ranking algorithms on political polarization. Leveraging advancements in large language models (LLMs), we develop an approach to re-rank feeds in real-time to test the effects of content that is likely to polarize: expressions of antidemocratic attitudes and partisan animosity (AAPA). In a preregistered 10-day field experiment on X/Twitter with 1,256 consented participants, we increase or decrease participants' exposure to AAPA in their algorithmically curated feeds. We observe more positive outparty feelings when AAPA exposure is decreased and more negative outparty feelings when AAPA exposure is increased. Exposure to AAPA content also results in an immediate increase in negative emotions, such as sadness and anger. The interventions do not significantly impact traditional engagement metrics such as re-post and favorite rates. These findings highlight a potential pathway for developing feed algorithms that mitigate affective polarization by addressing content that undermines the shared values required for a healthy democracy.
Submitted 21 November, 2024;
originally announced November 2024.
-
Site-Specific Color Features of Green Coffee Beans
Authors:
Shu-Min Tan,
Shih-Hsun Hung,
Je-Chiang Tsai
Abstract:
Coffee is one of the most valuable primary commodities. Despite this, the common selection technique for green coffee beans relies on visual inspection by personnel, which is labor-intensive and subjective. Therefore, an efficient way to evaluate the quality of beans is needed. In this paper, we demonstrate a site-independent approach to find site-specific color features of the seed coat in qualified green coffee beans. We then propose two evaluation schemes for green coffee beans based on this site-specific color feature of qualified beans. Due to the site-specific properties of these color features, machine learning classifiers indicate that, compared with the existing evaluation schemes of beans, our evaluation schemes have the advantages of being simple, having lower computational costs, and having universal applicability. Finally, this site-specific color feature can distinguish qualified beans from different growing sites. Moreover, this function can prevent cheating in the coffee business and is unique to our evaluation scheme of beans.
Submitted 6 September, 2024;
originally announced September 2024.
-
Reranking Social Media Feeds: A Practical Guide for Field Experiments
Authors:
Tiziano Piccardi,
Martin Saveski,
Chenyan Jia,
Jeffrey Hancock,
Jeanne L. Tsai,
Michael S. Bernstein
Abstract:
Social media plays a central role in shaping public opinion and behavior, yet performing experiments on these platforms and, in particular, on feed algorithms is becoming increasingly challenging. This article offers practical recommendations to researchers developing and deploying field experiments focused on real-time re-ranking of social media feeds. This article is organized around two contributions. First, we overview an experimental method using web browser extensions that intercepts and re-ranks content in real-time, enabling naturalistic re-ranking field experiments. We then describe feed interventions and measurements that this paradigm enables on participants' actual feeds, without requiring the involvement of social media platforms. Second, we offer concrete technical recommendations for intercepting and re-ranking social media feeds with minimal user-facing delay, and provide an open-source implementation. This document aims to summarize lessons learned, provide concrete implementation details, and foster the ecosystem of independent social media research.
Submitted 27 June, 2024;
originally announced June 2024.
-
Event-Based Eye Tracking. AIS 2024 Challenge Survey
Authors:
Zuowen Wang,
Chang Gao,
Zongwei Wu,
Marcos V. Conde,
Radu Timofte,
Shih-Chii Liu,
Qinyu Chen,
Zheng-jun Zha,
Wei Zhai,
Han Han,
Bohao Liao,
Yuliang Wu,
Zengyu Wan,
Zhong Wang,
Yang Cao,
Ganchao Tan,
Jinze Chen,
Yan Ru Pei,
Sasskia Brüers,
Sébastien Crouzet,
Douglas McLelland,
Oliver Coenen,
Baoheng Zhang,
Yizhao Gao,
Jingyuan Li
, et al. (14 additional authors not shown)
Abstract:
This survey reviews the AIS 2024 Event-Based Eye Tracking (EET) Challenge. The task of the challenge focuses on processing eye movement recorded with event cameras and predicting the pupil center of the eye. The challenge emphasizes efficient eye tracking with event cameras to achieve a good trade-off between task accuracy and efficiency. During the challenge period, 38 participants registered for the Kaggle competition, and 8 teams submitted a challenge factsheet. The novel and diverse methods from the submitted factsheets are reviewed and analyzed in this survey to advance future event-based eye tracking research.
Submitted 17 April, 2024;
originally announced April 2024.
-
NotNets: Accelerating Microservices by Bypassing the Network
Authors:
Peter Alvaro,
Matthew Adiletta,
Adrian Cockroft,
Frank Hady,
Ramesh Illikkal,
Esteban Ramos,
James Tsai,
Robert Soulé
Abstract:
Remote procedure calls are the workhorse of distributed systems. However, as software engineering trends, such as micro-services and serverless computing, push applications towards ever finer-grained decompositions, the overhead of RPC-based communication is becoming too great to bear. In this paper, we argue that point solutions that attempt to optimize one aspect of RPC logic are unlikely to mitigate these ballooning communication costs. Rather, we need a dramatic reappraisal of how we provide communication. Towards this end, we propose to emulate message-passing RPCs by sharing message payloads and metadata on CXL 3.0-backed far memory. We provide initial evidence of feasibility and analyze the expected benefits.
Submitted 9 April, 2024;
originally announced April 2024.
-
How Culture Shapes What People Want From AI
Authors:
Xiao Ge,
Chunchen Xu,
Daigo Misaki,
Hazel Rose Markus,
Jeanne L Tsai
Abstract:
There is an urgent need to incorporate the perspectives of culturally diverse groups into AI developments. We present a novel conceptual framework for research that aims to expand, reimagine, and reground mainstream visions of AI using independent and interdependent cultural models of the self and the environment. Two survey studies support this framework and provide preliminary evidence that people apply their cultural models when imagining their ideal AI. Compared with European American respondents, Chinese respondents viewed it as less important to control AI and more important to connect with AI, and were more likely to prefer AI with capacities to influence. Reflecting both cultural models, findings from African American respondents resembled both European American and Chinese respondents. We discuss study limitations and future directions and highlight the need to develop culturally responsive and relevant AI to serve a broader segment of the world population.
Submitted 8 March, 2024;
originally announced March 2024.
-
Interactive Shape Sonification for Tumor Localization in Breast Cancer Surgery
Authors:
Laura Schütz,
Trishia El Chemaly,
Emmanuelle Weber,
Anh Thien Doan,
Jacqueline Tsai,
Christoph Leuze,
Bruce Daniel,
Nassir Navab
Abstract:
About 20 percent of patients undergoing breast-conserving surgery require reoperation due to cancerous tissue remaining inside the breast. Breast cancer localization systems utilize auditory feedback to convey the distance between a localization probe and a small marker (seed) implanted into the breast tumor prior to surgery. However, no information on the location of the tumor margin is provided. To reduce the reoperation rate by improving the usability and accuracy of the surgical task, we developed an auditory display using shape sonification to assist with tumor margin localization. Accuracy and usability of the interactive shape sonification were determined on models of the female breast in three user studies with both breast surgeons and non-clinical participants. The comparative studies showed a significant increase in usability (p<0.05) and localization accuracy (p<0.001) of the shape sonification over the auditory feedback currently used in surgery.
Submitted 28 January, 2024; v1 submitted 26 December, 2023;
originally announced December 2023.
-
CDGraph: Dual Conditional Social Graph Synthesizing via Diffusion Model
Authors:
Jui-Yi Tsai,
Ya-Wen Teng,
Ho Chiok Yew,
De-Nian Yang,
Lydia Y. Chen
Abstract:
Social graphs synthesized by generative models are increasingly in demand due to data scarcity and concerns over user privacy. One of the key performance criteria for generating social networks is the fidelity to specified conditionals, such as users with certain membership and financial status. While recent diffusion models have shown remarkable performance in generating images, their effectiveness in synthesizing graphs has not yet been explored in the context of conditional social graphs. In this paper, we propose the first conditional diffusion model of its kind for social networks, CDGraph, which trains and synthesizes graphs based on two specified conditions. We propose the co-evolution dependency in the denoising process of CDGraph to capture the mutual dependencies between the dual conditions and further incorporate social homophily and social contagion to preserve the connectivity between nodes while satisfying the specified conditions. Moreover, we introduce a novel classifier loss, which guides the training of the diffusion process through the mutual dependency of dual conditions. We evaluate CDGraph against four existing graph generative methods, i.e., SPECTRE, GSM, EDGE, and DiGress, on four datasets. Our results show that the generated graphs from CDGraph achieve much higher dual-conditional validity and lower discrepancy in various social network metrics than the baselines, thus demonstrating its proficiency in generating dual-conditional social graphs.
Submitted 5 November, 2023; v1 submitted 3 November, 2023;
originally announced November 2023.
-
Semi-automated extraction of research topics and trends from NCI funding in radiological sciences from 2000-2020
Authors:
Mark Nguyen,
Peter Beidler,
Joseph Tsai,
August Anderson,
Daniel Chen,
Paul Kinahan,
John Kang
Abstract:
Investigators, funders, and the public desire knowledge on topics and trends in publicly funded research, but current efforts in manual categorization are limited in scale and understanding. We developed a semi-automated approach to extract and name research topics, and applied this to $1.9B of NCI funding over 21 years in the radiological sciences to determine micro- and macro-scale research topics and funding trends. Our method relies on sequential clustering of existing biomedical-based word embeddings, naming using subject matter experts, and visualization to discover trends at a macroscopic scale above individual topics. We present results using 15 and 60 cluster topics, where we found that 2D projection of grant embeddings reveals two dominant axes: physics-biology and therapeutic-diagnostic. For our dataset, we found that funding for therapeutics- and physics-based research has outpaced diagnostics- and biology-based research, respectively. We hope these results may (1) give insight to funders on the appropriateness of their funding allocation, (2) assist investigators in contextualizing their work and exploring neighboring research domains, and (3) allow the public to review where their tax dollars are being allocated.
Submitted 22 June, 2023;
originally announced June 2023.
-
Associations Between Natural Language Processing (NLP) Enriched Social Determinants of Health and Suicide Death among US Veterans
Authors:
Avijit Mitra,
Richeek Pradhan,
Rachel D Melamed,
Kun Chen,
David C Hoaglin,
Katherine L Tucker,
Joel I Reisman,
Zhichao Yang,
Weisong Liu,
Jack Tsai,
Hong Yu
Abstract:
Importance: Social determinants of health (SDOH) are known to be associated with increased risk of suicidal behaviors, but few studies utilized SDOH from unstructured electronic health record (EHR) notes.
Objective: To investigate associations between suicide and recent SDOH, identified using structured and unstructured data.
Design: Nested case-control study.
Setting: EHR data from the US Veterans Health Administration (VHA).
Participants: 6,122,785 Veterans who received care in the US VHA between October 1, 2010, and September 30, 2015.
Exposures: Occurrence of SDOH over a maximum span of two years compared with no occurrence of SDOH.
Main Outcomes and Measures: Cases of suicide deaths were matched with 4 controls on birth year, cohort entry date, sex, and duration of follow-up. We developed an NLP system to extract SDOH from unstructured notes. Structured data, NLP on unstructured data, and combining them yielded six, eight and nine SDOH respectively. Adjusted odds ratios (aORs) and 95% confidence intervals (CIs) were estimated using conditional logistic regression.
Results: In our cohort, 8,821 Veterans committed suicide during 23,725,382 person-years of follow-up (incidence rate 37.18/100,000 person-years). Our cohort was mostly male (92.23%) and white (76.99%). Across the five common SDOH as covariates, NLP-extracted SDOH, on average, covered 80.03% of all SDOH occurrences. All SDOH, measured by structured data and NLP, were significantly associated with increased risk of suicide. The SDOH with the largest effect was legal problems (aOR=2.66, 95% CI=.46-2.89), followed by violence (aOR=2.12, 95% CI=1.98-2.27). NLP-extracted and structured SDOH were also associated with suicide.
Conclusions and Relevance: NLP-extracted SDOH were always significantly associated with increased risk of suicide among Veterans, suggesting the potential of NLP in public health studies.
Submitted 28 December, 2022; v1 submitted 11 December, 2022;
originally announced December 2022.
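The incidence rate quoted in the Results of the abstract above follows directly from the reported counts. A short check, with a helper name chosen only for illustration:

```python
def incidence_rate_per_100k(events, person_years):
    """Events per 100,000 person-years of follow-up."""
    return events / person_years * 100_000


# Counts as reported in the abstract: 8,821 suicide deaths
# over 23,725,382 person-years of follow-up.
rate = incidence_rate_per_100k(8_821, 23_725_382)
```

Rounded to two decimals, `rate` matches the 37.18 per 100,000 person-years stated in the abstract.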
-
Automated Identification of Eviction Status from Electronic Health Record Notes
Authors:
Zonghai Yao,
Jack Tsai,
Weisong Liu,
David A. Levy,
Emily Druhl,
Joel I Reisman,
Hong Yu
Abstract:
Objective: Evictions are important social and behavioral determinants of health. Evictions are associated with a cascade of negative events that can lead to unemployment, housing insecurity/homelessness, long-term poverty, and mental health problems. In this study, we developed a natural language processing system to automatically detect eviction status from electronic health record (EHR) notes.
Materials and Methods: We first defined eviction status (eviction presence and eviction period) and then annotated eviction status in 5000 EHR notes from the Veterans Health Administration (VHA). We developed a novel model, KIRESH, that has been shown to substantially outperform other state-of-the-art models, such as fine-tuned pre-trained language models like BioBERT and BioClinicalBERT. Moreover, we designed a novel prompt to further improve model performance by using the intrinsic connection between the two sub-tasks of eviction presence and period prediction. Finally, we applied Temperature Scaling-based Calibration to our KIRESH-Prompt method to avoid over-confidence issues arising from the imbalanced dataset.
Results: KIRESH-Prompt substantially outperformed strong baseline models including fine-tuning the BioClinicalBERT model to achieve 0.74672 MCC, 0.71153 Macro-F1, and 0.83396 Micro-F1 in predicting eviction period and 0.66827 MCC, 0.62734 Macro-F1, and 0.7863 Micro-F1 in predicting eviction presence. We also conducted additional experiments on a benchmark social determinants of health (SBDH) dataset to demonstrate the generalizability of our methods.
Conclusion and Future Work: KIRESH-Prompt has substantially improved eviction status classification. We plan to deploy KIRESH-Prompt to the VHA EHRs as an eviction surveillance system to help address the US Veterans' housing insecurity.
Submitted 20 May, 2023; v1 submitted 6 December, 2022;
originally announced December 2022.
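The abstract above mentions temperature scaling as a calibration step against over-confidence. As a rough illustration of the general technique (not the authors' implementation), the sketch below divides logits by a scalar temperature `T` before the softmax and picks `T` by minimizing negative log-likelihood on held-out data. The grid search, the toy logits, and all function names are assumptions made for this example; published implementations usually fit `T` with a gradient-based optimizer.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits / temperature (numerically stabilized)."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def nll(logits, labels, temperature):
    """Mean negative log-likelihood of the true labels at a given temperature."""
    probs = softmax(logits, temperature)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))


def fit_temperature(logits, labels, grid=None):
    """Pick the temperature on a grid that minimizes held-out NLL (assumed search strategy)."""
    if grid is None:
        grid = np.linspace(0.5, 5.0, 91)
    losses = [nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])


# Toy held-out logits (hypothetical): confident predictions, one of four is wrong.
val_logits = np.array([[4.0, 0.0], [3.5, 0.5], [0.0, 4.0], [4.0, 0.0]])
val_labels = np.array([0, 0, 1, 1])

T = fit_temperature(val_logits, val_labels)
calibrated = softmax(val_logits, T)
print("fitted temperature:", T)
print("max confidence before/after:", softmax(val_logits).max(), calibrated.max())
```

Because dividing by a positive scalar does not change the argmax, this kind of calibration softens the predicted probabilities without changing which class is predicted.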
-
Live Multi-Streaming and Donation Recommendations via Coupled Donation-Response Tensor Factorization
Authors:
Hsu-Chao Lai,
Jui-Yi Tsai,
Hong-Han Shuai,
Jiun-Long Huang,
Wang-Chien Lee,
De-Nian Yang
Abstract:
In contrast to traditional online videos, live multi-streaming supports real-time social interactions between multiple streamers and viewers, such as donations. However, donation and multi-streaming channel recommendations are challenging due to complicated streamer and viewer relations, asymmetric communications, and the tradeoff between personal interests and group interactions. In this paper, we introduce Multi-Stream Party (MSP) and formulate a new multi-streaming recommendation problem, called Donation and MSP Recommendation (DAMRec). We propose Multi-stream Party Recommender System (MARS) to extract latent features via socio-temporal coupled donation-response tensor factorization for donation and MSP recommendations. Experimental results on Twitch and Douyu manifest that MARS significantly outperforms existing recommenders by at least 38.8% in terms of hit ratio and mean average precision.
Submitted 5 October, 2021;
originally announced October 2021.
-
Graph Neural Network for Hamiltonian-Based Material Property Prediction
Authors:
Hexin Bai,
Peng Chu,
Jeng-Yuan Tsai,
Nathan Wilson,
Xiaofeng Qian,
Qimin Yan,
Haibin Ling
Abstract:
The development of next-generation electronic devices calls for the discovery of quantum materials hosting novel electronic, magnetic, and topological properties. Traditional electronic structure methods are expensive in computation time and memory, so a fast and accurate prediction model is increasingly desirable. Representing the interactions among atomic orbitals in any material, a material Hamiltonian provides all the essential elements that control the structure-property correlations in inorganic compounds. Effective learning of the material Hamiltonian through machine learning methodologies therefore offers a transformative approach to accelerate the discovery and design of quantum materials. With this motivation, we present and compare several different graph convolution networks that are able to predict the band gap of inorganic materials. The models are developed to incorporate two different features: the information of each orbital itself and the interactions between orbitals. The information of each orbital includes the name, the relative coordinates with respect to the center of the supercell, and the atom number, while the interactions between orbitals are represented by the Hamiltonian matrix. The results show that our model achieves promising prediction accuracy under cross-validation.
Submitted 27 May, 2020;
originally announced May 2020.
-
MIDI-Sheet Music Alignment Using Bootleg Score Synthesis
Authors:
Thitaree Tanprasert,
Teerapat Jenrungrot,
Meinard Mueller,
T. J. Tsai
Abstract:
MIDI-sheet music alignment is the task of finding correspondences between a MIDI representation of a piece and its corresponding sheet music images. Rather than using optical music recognition to bridge the gap between sheet music and MIDI, we explore an alternative approach: projecting the MIDI data into pixel space and performing alignment in the image domain. Our method converts the MIDI data into a crude representation of the score that only contains rectangular floating notehead blobs, a process we call bootleg score synthesis. Furthermore, we project sheet music images into the same bootleg space by applying a deep watershed notehead detector and filling in the bounding boxes around each detected notehead. Finally, we align the bootleg representations using a simple variant of dynamic time warping. On a dataset of 68 real scanned piano scores from IMSLP and corresponding MIDI performances, our method achieves a 97.3% accuracy at an error tolerance of one second, outperforming several baseline systems that employ optical music recognition.
Submitted 21 April, 2020;
originally announced April 2020.
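The final alignment step in the abstract above relies on a variant of dynamic time warping (DTW). The sketch below shows plain textbook DTW on two 1-D sequences with an absolute-difference cost. It is not the paper's variant and does not operate on bootleg score features; the function name and toy inputs are assumptions for illustration.

```python
import numpy as np


def dtw(x, y):
    """Classic DTW: return (total cost, warping path) for two 1-D sequences."""
    n, m = len(x), len(y)
    # cost[i, j] = best cumulative cost aligning x[:i] with y[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])

    # Backtrack from the end to recover which elements were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    path.reverse()
    return float(cost[n, m]), path


# Toy example: the second sequence repeats one value, so it is "played slower".
total, path = dtw([1, 2, 3], [1, 2, 2, 3])
print(total, path)
```

The warping path pairs indices of the first sequence with indices of the second, which is the kind of correspondence an alignment task needs: once both inputs live in the same feature space, the path maps positions in one to positions in the other.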
-
Y-net: Multi-scale feature aggregation network with wavelet structure similarity loss function for single image dehazing
Authors:
Hao-Hsiang Yang,
Chao-Han Huck Yang,
Yi-Chang James Tsai
Abstract:
Single image dehazing is an ill-posed two-dimensional signal reconstruction problem. Recently, deep convolutional neural networks (CNNs) have been successfully used in many computer vision problems. In this paper, we propose Y-net, which is named for its structure. This network reconstructs clear images by aggregating multi-scale feature maps. Additionally, we propose a Wavelet Structure SIMilarity (W-SSIM) loss function for the training step. In the proposed loss function, discrete wavelet transforms are applied repeatedly to divide the image into differently sized patches with different frequencies and scales. The proposed loss function is the accumulation of the SSIM losses of the various patches, weighted by their respective ratios. Extensive experimental results demonstrate that the proposed Y-net with the W-SSIM loss function restores high-quality clear images and outperforms state-of-the-art algorithms. Code and models are available at https://github.com/dectrfov/Y-net.
Submitted 30 March, 2020;
originally announced March 2020.
-
Interpretable Self-Attention Temporal Reasoning for Driving Behavior Understanding
Authors:
Yi-Chieh Liu,
Yung-An Hsieh,
Min-Hung Chen,
Chao-Han Huck Yang,
Jesper Tegner,
Yi-Chang James Tsai
Abstract:
Performing driving behaviors based on causal reasoning is essential to ensure driving safety. In this work, we investigated how state-of-the-art 3D Convolutional Neural Networks (CNNs) perform on classifying driving behaviors based on causal reasoning. We proposed a perturbation-based visual explanation method to inspect the models' performance visually. By examining the video attention saliency, we found that existing models could not precisely capture the causes (e.g., traffic light) of the specific action (e.g., stopping). Therefore, the Temporal Reasoning Block (TRB) was proposed and introduced to the models. With the TRB models, we achieved an accuracy of 86.3%, which outperforms the state-of-the-art 3D CNNs from previous works. The attention saliency also demonstrated that TRB helped models focus on the causes more precisely. With both numerical and visual evaluations, we concluded that our proposed TRB models were able to provide accurate driving behavior prediction by learning the causal reasoning of the behaviors.
Submitted 5 November, 2019;
originally announced November 2019.
-
Synthesizing New Retinal Symptom Images by Multiple Generative Models
Authors:
Yi-Chieh Liu,
Hao-Hsiang Yang,
Chao-Han Huck Yang,
Jia-Hong Huang,
Meng Tian,
Hiromasa Morikawa,
Yi-Chang James Tsai,
Jesper Tegner
Abstract:
Age-Related Macular Degeneration (AMD) is an asymptomatic retinal disease which may result in loss of vision. There is limited access to high-quality relevant retinal images and poor understanding of the features defining sub-classes of this disease. Motivated by recent advances in machine learning, we specifically explore the potential of generative modeling, using Generative Adversarial Networks (GANs) and style transfer, to facilitate clinical diagnosis and disease understanding by feature extraction. We design an analytic pipeline which first generates synthetic retinal images from clinical images; a subsequent verification step is then applied. In the synthesizing step we merge GANs (DCGAN and WGAN architectures) and style transfer for the image generation, whereas the verification step controls the accuracy of the generated images. We find that the generated images contain sufficient pathological details to facilitate ophthalmologists' task of disease classification and the discovery of disease-relevant features. In particular, our system predicts the drusen and geographic atrophy sub-classes of AMD. Furthermore, the performance using CFP images for GANs outperforms the classification based on using only the original clinical dataset. Our results are evaluated using an existing classifier of retinal diseases and class activation maps, supporting the predictive power of the synthetic images and their utility for feature extraction. Our code examples are available online.
Submitted 11 February, 2019;
originally announced February 2019.
-
When Causal Intervention Meets Adversarial Examples and Image Masking for Deep Neural Networks
Authors:
Chao-Han Huck Yang,
Yi-Chieh Liu,
Pin-Yu Chen,
Xiaoli Ma,
Yi-Chang James Tsai
Abstract:
Discovering and exploiting the causality in deep neural networks (DNNs) are crucial challenges for understanding and reasoning causal effects (CE) on an explainable visual model. "Intervention" has been widely used for recognizing a causal relation ontologically. In this paper, we propose a causal inference framework for visual reasoning via do-calculus. To study the intervention effects on pixel-level features for causal reasoning, we introduce pixel-wise masking and adversarial perturbation. In our framework, CE is calculated using features in a latent space and perturbed prediction from a DNN-based model. We further provide the first look into the characteristics of discovered CE of adversarially perturbed images generated by gradient-based methods (code: https://github.com/jjaacckkyy63/Causal-Intervention-AE-wAdvImg). Experimental results show that CE is a competitive and robust index for understanding DNNs when compared with conventional methods such as class-activation mappings (CAMs) on the Chest X-Ray-14 dataset for human-interpretable feature(s) (e.g., symptom) reasoning. Moreover, CE holds promises for detecting adversarial examples as it possesses distinct characteristics in the presence of adversarial perturbations.
Submitted 25 June, 2019; v1 submitted 9 February, 2019;
originally announced February 2019.
-
Design And Fabrication of High Numerical Aperture And Low Aberration Bi-Convex Micro Lens Array
Authors:
Jhy-Cherng Tsai,
Ming-Fong Chen,
Hsiharng Yang
Abstract:
Micro lens arrays are crucial in various kinds of optical and electronic applications. A micro lens array with high numerical aperture (NA) and low aberration is in particular needed. This research aims to design and fabricate such a micro lens array with a simple structure while keeping the same NA as a hemispherical lens of the same diameter. A bi-convex semispherical micro lens array made of PDMS, with a corresponding NA of 0.379, is first designed and analyzed. Experiments are then conducted to fabricate the designed micro lens array by the thermal reflow process. The formed profile is then sputtered with copper to serve as the mold. The front and the rear micro lens arrays are fabricated by plating PDMS to the mold and then assembled to form the designed micro lens array.
Submitted 7 May, 2008;
originally announced May 2008.
-
Design And Fabrication of Condenser Microphone Using Wafer Transfer And Micro-electroplating Technique
Authors:
Zhen-Zhun Shu,
Ming-Li Ke,
Guan-Wei Chen,
Ray Hua Horng,
Chao-Chih Chang,
Jean-Yih Tsai,
Chung-Ching Lai,
Ji-Liang Chen
Abstract:
A novel fabrication process, which uses wafer transfer and micro-electroplating technique, has been proposed and tested. In this paper, the effects of the diaphragm thickness and stress, the air-gap thickness, and the area ratio of acoustic holes to backplate on the sensitivity of the condenser microphone have been demonstrated since the performance of the microphone depends on these parameters. The microphone diaphragm has been designed with a diameter and thickness of 1.9 mm and 0.6 μm, respectively, an air-gap thickness of 10 μm, and a 24% area ratio of acoustic holes to backplate. To obtain a lower initial stress, the material used for the diaphragm is polyimide. The measured sensitivities of the microphone at the bias voltages of 24 V and 12 V are -45.3 and -50.2 dB/Pa (at 1 kHz), respectively. The fabricated microphone shows a flat frequency response extending to 20 kHz.
Submitted 7 May, 2008;
originally announced May 2008.
-
Micro-Ball Lens Array Fabrication in Photoresist Using Ptfe Hydrophobic Effect
Authors:
Ruey-Fang Shyu,
Hsiharng Yang,
Wen-Ren Tsai,
Jhy-Cherng Tsai
Abstract:
This paper presents a simple method to fabricate a micro-ball lens and its array. The key technology is to use the hydrophobic characteristics of a polytetrafluoroethylene (PTFE) substrate. The high contact angle between the melted photoresist pattern and PTFE can generate a micro-ball lens and its array. A PTFE thin film was spun onto a silicon wafer and dried in an oven. Photoresist AZ4620 was used to pattern micro-columns with diameters of 60, 70 and 80 μm. A thermal reflow process was then applied to melt these micro-column patterns, resulting in a micro-ball lens array. A micro-ball lens array with a diameter of 98 μm was fabricated using patterns 80 μm in diameter. This method provides a simple fabrication process and low material cost.
Submitted 21 November, 2007;
originally announced November 2007.