-
Predicting Generalization of AI Colonoscopy Models to Unseen Data
Authors:
Joel Shor,
Carson McNeil,
Yotam Intrator,
Joseph R Ledsam,
Hiro-o Yamano,
Daisuke Tsurumaru,
Hiroki Kayama,
Atsushi Hamabe,
Koji Ando,
Mitsuhiko Ota,
Haruei Ogino,
Hiroshi Nakase,
Kaho Kobayashi,
Masaaki Miyo,
Eiji Oki,
Ichiro Takemasa,
Ehud Rivlin,
Roman Goldenberg
Abstract:
$\textbf{Background}$: Generalizability of AI colonoscopy algorithms is important for wider adoption in clinical practice. However, current techniques for evaluating performance on unseen data require expensive and time-intensive labels.
$\textbf{Methods}$: We use a "Masked Siamese Network" (MSN) to identify novel phenomena in unseen data and predict polyp detector performance. MSN is trained to predict masked out regions of polyp images, without any labels. We test whether MSN, trained only on data from Israel, can detect unseen techniques, narrow-band imaging (NBI) and chromoendoscopy (CE), in colonoscopies from Japan (354 videos, 128 hours). We also test MSN's ability to predict performance of Computer Aided Detection (CADe) of polyps on colonoscopies from both countries, even though MSN is not trained on data from Japan.
$\textbf{Results}$: MSN correctly identifies NBI and CE as less similar to Israel whitelight than Japan whitelight (bootstrapped z-test, |z| > 496, p < $10^{-8}$ for both) using the label-free Fréchet distance. MSN detects NBI with 99% accuracy, predicts CE better than our heuristic (90% vs 79% accuracy) despite being trained only on whitelight, and is the only method that is robust to noisy labels. MSN predicts CADe polyp detector performance on in-domain Israel and out-of-domain Japan colonoscopies (r=0.79, 0.37 respectively). With few examples of Japan detector performance to train on, MSN prediction of Japan performance improves (r=0.56).
$\textbf{Conclusion}$: Our technique can identify distribution shifts in clinical data and can predict CADe detector performance on unseen data, without labels. Our self-supervised approach can help detect when data encountered in practice differs from the training data, for example between hospitals or after the data has meaningfully shifted. MSN has potential for application to medical image domains beyond colonoscopy.
Submitted 22 March, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
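The abstract above compares sets of frames with a label-free Fréchet distance. As a rough illustration only, the sketch below computes the squared Fréchet distance between Gaussian fits of two embedding sets under the simplifying assumption of diagonal covariances; the general form uses full covariance matrices and a matrix square root. The function name and the toy vectors are hypothetical and are not taken from the paper.

```python
import math
from statistics import fmean, pvariance


def diagonal_frechet_distance(xs, ys):
    """Squared Frechet distance between diagonal-Gaussian fits of two sets.

    Assumption (not from the paper): covariances are diagonal, so the
    distance reduces to a per-dimension sum of
    (mean difference)^2 + (standard deviation difference)^2.
    """
    dims = len(xs[0])
    total = 0.0
    for d in range(dims):
        col_x = [row[d] for row in xs]
        col_y = [row[d] for row in ys]
        mu_x, mu_y = fmean(col_x), fmean(col_y)
        sd_x = math.sqrt(pvariance(col_x, mu_x))
        sd_y = math.sqrt(pvariance(col_y, mu_y))
        total += (mu_x - mu_y) ** 2 + (sd_x - sd_y) ** 2
    return total


# Toy 2-D "embeddings": the second set is the first shifted by 3 along axis 0.
reference = [[0.0, 0.0], [2.0, 2.0]]
shifted = [[3.0, 0.0], [5.0, 2.0]]
print(diagonal_frechet_distance(reference, reference))  # identical sets
print(diagonal_frechet_distance(reference, shifted))    # shifted set
```

A larger value indicates that the second set of embeddings is distributed less like the first, which is the sense in which a distance of this kind can flag unfamiliar data without labels.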
-
RDF-star2Vec: RDF-star Graph Embeddings for Data Mining
Authors:
Shusaku Egami,
Takanori Ugai,
Masateru Oota,
Kyoumoto Matsushita,
Takahiro Kawamura,
Kouji Kozaki,
Ken Fukuda
Abstract:
Knowledge Graphs (KGs) such as Resource Description Framework (RDF) data represent relationships between various entities through the structure of triples (<subject, predicate, object>). Knowledge graph embedding (KGE) is crucial in machine learning applications, specifically in node classification and link prediction tasks. KGE remains a vital research topic within the semantic web community. RDF-star introduces the concept of a quoted triple (QT), a specific form of triple employed either as the subject or object within another triple. Moreover, RDF-star permits a QT to act as compositional entities within another QT, thereby enabling the representation of recursive, hyper-relational KGs with nested structures. However, existing KGE models fail to adequately learn the semantics of QTs and entities, primarily because they do not account for RDF-star graphs containing multi-leveled nested QTs and QT-QT relationships. This study introduces RDF-star2Vec, a novel KGE model specifically designed for RDF-star graphs. RDF-star2Vec introduces graph walk techniques that enable probabilistic transitions between a QT and its compositional entities. Feature vectors for QTs, entities, and relations are derived from generated sequences through the structured skip-gram model. Additionally, we provide a dataset and a benchmarking framework for data mining tasks focused on complex RDF-star graphs. Evaluative experiments demonstrated that RDF-star2Vec yielded superior performance compared to recent extensions of RDF2Vec in various tasks including classification, clustering, entity relatedness, and QT similarity.
Submitted 25 December, 2023;
originally announced December 2023.
-
The unreasonable effectiveness of AI CADe polyp detectors to generalize to new countries
Authors:
Joel Shor,
Hiro-o Yamano,
Daisuke Tsurumaru,
Yotami Intrator,
Hiroki Kayama,
Joe Ledsam,
Atsushi Hamabe,
Koji Ando,
Mitsuhiko Ota,
Haruei Ogino,
Hiroshi Nakase,
Kaho Kobayashi,
Eiji Oki,
Roman Goldenberg,
Ehud Rivlin,
Ichiro Takemasa
Abstract:
$\textbf{Background and aims}$: Artificial Intelligence (AI) Computer-Aided Detection (CADe) is commonly used for polyp detection, but data seen in clinical settings can differ from model training. Few studies evaluate how well CADe detectors perform on colonoscopies from countries not seen during training, and none are able to evaluate performance without collecting expensive and time-intensive labels.
$\textbf{Methods}$: We trained a CADe polyp detector on Israeli colonoscopy videos (5004 videos, 1106 hours) and evaluated on Japanese videos (354 videos, 128 hours) by measuring the True Positive Rate (TPR) versus false alarms per minute (FAPM). We introduce a colonoscopy dissimilarity measure called "MAsked mediCal Embedding Distance" (MACE) to quantify differences between colonoscopies, without labels. We evaluated CADe on all Japan videos and on those with the highest MACE.
$\textbf{Results}$: MACE correctly quantifies that narrow-band imaging (NBI) and chromoendoscopy (CE) frames are less similar to Israel data than Japan whitelight (bootstrapped z-test, |z| > 690, p < $10^{-8}$ for both). Despite differences in the data, CADe performance on Japan colonoscopies was non-inferior to Israel ones without additional training (TPR at 0.5 FAPM: 0.957 and 0.972 for Israel and Japan; TPR at 1.0 FAPM: 0.972 and 0.989 for Israel and Japan; superiority test t > 45.2, p < $10^{-8}$). Despite the detector not being trained on NBI or CE, TPR on those subsets was non-inferior to Japan overall (non-inferiority test t > 47.3, p < $10^{-8}$, $δ$ = 1.5% for both).
$\textbf{Conclusion}$: Differences that prevent CADe detectors from performing well in non-medical settings do not degrade the performance of our AI CADe polyp detector when applied to data from a new country. MACE can help medical AI models internationalize by identifying the most "dissimilar" data on which to evaluate models.
Submitted 17 December, 2023; v1 submitted 11 December, 2023;
originally announced December 2023.
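The abstract above reports the True Positive Rate (TPR) at fixed false alarms per minute (FAPM). The sketch below shows one plausible way such an operating point could be computed from scored detections. The abstract does not give the matching or counting rules, so everything here is an assumption: each detection is a hypothetical `(score, is_true_polyp)` pair, each true detection is assumed to match a distinct polyp, and `tpr_at_fapm` is an illustrative name.

```python
def tpr_at_fapm(detections, n_polyps, minutes, target_fapm):
    """Best TPR over score thresholds whose false-alarm rate fits the budget.

    detections: list of (score, is_true_polyp) pairs (hypothetical format).
    n_polyps:   number of ground-truth polyps (assumed one match per polyp).
    minutes:    total video duration in minutes.
    """
    best_tpr = 0.0
    # Try every observed score as a threshold, from strictest to loosest.
    for threshold in sorted({score for score, _ in detections}, reverse=True):
        kept = [is_true for score, is_true in detections if score >= threshold]
        true_positives = sum(kept)
        false_alarms = len(kept) - true_positives
        if false_alarms / minutes <= target_fapm:
            best_tpr = max(best_tpr, true_positives / n_polyps)
    return best_tpr


# Toy data: 5 detections over 2 minutes of video containing 4 polyps.
toy = [(0.9, True), (0.8, False), (0.7, True), (0.6, False), (0.5, True)]
print(tpr_at_fapm(toy, n_polyps=4, minutes=2.0, target_fapm=0.5))  # 0.5
print(tpr_at_fapm(toy, n_polyps=4, minutes=2.0, target_fapm=1.0))  # 0.75
```

Loosening the false-alarm budget from 0.5 to 1.0 FAPM admits a lower threshold, which is why the second TPR is higher, mirroring the pattern of the two operating points quoted in the abstract.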
-
Balance Measures Derived from Insole Sensor Differentiate Prodromal Dementia with Lewy Bodies
Authors:
Masatomo Kobayashi,
Yasunori Yamada,
Kaoru Shinkawa,
Miyuki Nemoto,
Miho Ota,
Kiyotaka Nemoto,
Tetsuaki Arai
Abstract:
Dementia with Lewy bodies is the second most common type of neurodegenerative dementia, and identification at the prodromal stage$-$i.e., mild cognitive impairment due to Lewy bodies (MCI-LB)$-$is important for providing appropriate care. However, MCI-LB is often underrecognized because of its diversity in clinical manifestations and similarities with other conditions such as mild cognitive impairment due to Alzheimer's disease (MCI-AD). In this study, we propose a machine learning-based automatic pipeline that helps identify MCI-LB by exploiting balance measures acquired with an insole sensor during a 30-s standing task. An experiment with 98 participants (14 MCI-LB, 38 MCI-AD, 46 cognitively normal) showed that the resultant models could discriminate MCI-LB from the other groups with up to 78.0% accuracy (AUC: 0.681), which was 6.8% better than the accuracy of a reference model based on demographic and clinical neuropsychological measures. Our findings may open up a new approach for timely identification of MCI-LB, enabling better care for patients.
Submitted 11 September, 2023;
originally announced September 2023.
-
Smartwatch-derived Acoustic Markers for Deficits in Cognitively Relevant Everyday Functioning
Authors:
Yasunori Yamada,
Kaoru Shinkawa,
Masatomo Kobayashi,
Miyuki Nemoto,
Miho Ota,
Kiyotaka Nemoto,
Tetsuaki Arai
Abstract:
Detection of subtle deficits in everyday functioning due to cognitive impairment is important for early detection of neurodegenerative diseases, particularly Alzheimer's disease. However, current standards for assessment of everyday functioning are based on qualitative, subjective ratings. Speech has been shown to provide good objective markers for cognitive impairments, but the association with cognition-relevant everyday functioning remains uninvestigated. In this study, we demonstrate the feasibility of using a smartwatch-based application to collect acoustic features as objective markers for detecting deficits in everyday functioning. We collected voice data during the performance of cognitive tasks and daily conversation, as possible application scenarios, from 54 older adults, along with a measure of everyday functioning. Machine learning models using acoustic features could detect individuals with deficits in everyday functioning with up to 77.8% accuracy, which was higher than the 68.5% accuracy with standard neuropsychological tests. We also identified common acoustic features for robustly discriminating deficits in everyday functioning across both types of voice data (cognitive tasks and daily conversation). Our results suggest that common acoustic features extracted from different types of voice data can be used as markers for deficits in everyday functioning.
Submitted 11 September, 2023;
originally announced September 2023.
-
Automated Analysis of Drawing Process for Detecting Prodromal and Clinical Dementia
Authors:
Yasunori Yamada,
Masatomo Kobayashi,
Kaoru Shinkawa,
Miyuki Nemoto,
Miho Ota,
Kiyotaka Nemoto,
Tetsuaki Arai
Abstract:
Early diagnosis of dementia, particularly in the prodromal stage (i.e., mild cognitive impairment, or MCI), has become a research and clinical priority but remains challenging. Automated analysis of the drawing process has been studied as a promising means for screening prodromal and clinical dementia, providing multifaceted information encompassing features such as drawing speed, pen posture, writing pressure, and pauses. We examined the feasibility of using these features not only for detecting prodromal and clinical dementia but also for predicting the severity of cognitive impairments assessed using the Mini-Mental State Examination (MMSE) as well as the severity of neuropathological changes assessed by medial temporal lobe (MTL) atrophy. We collected drawing data with a digitizing tablet and pen from 145 older adults who were cognitively normal (CN) or had MCI or dementia. The nested cross-validation results indicate that the combination of drawing features could be used to classify CN, MCI, and dementia with an AUC of 0.909 and 75.1% accuracy (CN vs. MCI: 82.4% accuracy; CN vs. dementia: 92.2% accuracy; MCI vs. dementia: 80.3% accuracy) and predict MMSE scores with an $R^2$ of 0.491 and severity of MTL atrophy with an $R^2$ of 0.293. Our findings suggest that automated analysis of the drawing process can provide information about cognitive impairments and neuropathological changes due to dementia, which can help identify prodromal and clinical dementia as a digital biomarker.
Submitted 16 November, 2022;
originally announced November 2022.
-
Approximation and parameterized algorithms to find balanced connected partitions of graphs
Authors:
Phablo F. S. Moura,
Matheus J. Ota,
Yoshiko Wakabayashi
Abstract:
Partitioning a connected graph into $k$~vertex-disjoint connected subgraphs of similar (or given) orders is a classical problem that has been intensively investigated since the late seventies. Given a connected graph $G=(V,E)$ and a weight function $w : V \to \mathbb{Q}_\geq$, a connected $k$-partition of $G$ is a partition of $V$ such that each class induces a connected subgraph. The balanced connected $k$-partition problem consists in finding a connected $k$-partition in which every class has roughly the same weight. To model this concept of balance, one may seek connected $k$-partitions that either maximize the weight of a lightest class $(\text{max-min BCP}_k)$ or minimize the weight of a heaviest class $(\text{min-max BCP}_k)$. Such problems are equivalent when $k=2$, but they are different when $k\geq 3$. In this work, we propose a simple pseudo-polynomial $\frac{k}{2}$-approximation algorithm for $\text{min-max BCP}_k$ which runs in time $\mathcal{O}(W|V||E|)$, where $W = \sum_{v \in V} w(v)$. Based on this algorithm and using a scaling technique, we design a (polynomial) $(\frac{k}{2} +\varepsilon)$-approximation for the same problem with running-time $\mathcal{O}(|V|^3|E|/\varepsilon)$, for any fixed $\varepsilon>0$. Additionally, we propose a fixed-parameter tractable algorithm based on integer linear programming for the unweighted $\text{max-min BCP}_k$ parameterized by the size of a vertex cover.
Submitted 23 August, 2021;
originally announced August 2021.
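To make the two objectives in the abstract above concrete, the sketch below enumerates every connected $k$-partition of a tiny graph by brute force and reports the optimal max-min and min-max values. This is an exponential-time illustration of the problem definition only, not the approximation or fixed-parameter algorithms the paper proposes; the function names and the example graph are hypothetical.

```python
from itertools import product


def is_connected(vertices, adjacency):
    """Return True if the non-empty vertex set induces a connected subgraph."""
    vertices = set(vertices)
    if not vertices:
        return False  # every class of a partition must be non-empty
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adjacency[v]:
            if u in vertices and u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == vertices


def brute_force_bcp(adjacency, weight, k):
    """Return (max-min value, min-max value) over all connected k-partitions.

    Tries all k^|V| labelings, so it is only usable on very small graphs.
    """
    nodes = sorted(adjacency)
    best_lightest, best_heaviest = None, None
    for labels in product(range(k), repeat=len(nodes)):
        classes = [[v for v, c in zip(nodes, labels) if c == i] for i in range(k)]
        if not all(is_connected(cls, adjacency) for cls in classes):
            continue
        class_weights = [sum(weight[v] for v in cls) for cls in classes]
        lightest, heaviest = min(class_weights), max(class_weights)
        if best_lightest is None or lightest > best_lightest:
            best_lightest = lightest
        if best_heaviest is None or heaviest < best_heaviest:
            best_heaviest = heaviest
    return best_lightest, best_heaviest


# Path a - b - c - d with weights 1, 2, 3, 4 (total weight W = 10).
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
w = {"a": 1, "b": 2, "c": 3, "d": 4}
print(brute_force_bcp(path, w, k=2))  # (4, 6)
```

For $k=2$ the two classes' weights always sum to $W$, so the partition that maximizes the lightest class also minimizes the heaviest one; in the example the optimal values 4 and 6 sum to 10, consistent with the abstract's remark that the two problems are equivalent when $k=2$.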
-
Integer Programming Approaches to Balanced Connected $k$-Partition
Authors:
Flávio K. Miyazawa,
Phablo F. S. Moura,
Matheus J. Ota,
Yoshiko Wakabayashi
Abstract:
We address the problem of partitioning a vertex-weighted connected graph into $k$ connected subgraphs that have similar weights, for a fixed integer $k\geq 2$. This problem, known as the \emph{balanced connected $k$-partition problem} ($BCP_k$), is defined as follows. Given a connected graph $G$ with nonnegative weights on the vertices, find a partition $\{V_i\}_{i=1}^k$ of $V(G)$ such that each class $V_i$ induces a connected subgraph of $G$, and the weight of a class with the minimum weight is as large as possible. It is known that $BCP_k$ is $NP$-hard even on bipartite graphs and on interval graphs. It has been largely investigated under different approaches and perspectives. On the practical side, $BCP_k$ is used to model many applications arising in police patrolling, image processing, cluster analysis, operating systems and robotics. We propose three integer linear programming formulations for the balanced connected $k$-partition problem. The first one contains only binary variables and a potentially large number of constraints that are separable in polynomial time. Some polyhedral results on this formulation, when all vertices have unit weight, are also presented. The other formulations are based on flows and have a polynomial number of constraints and variables. Preliminary computational experiments have shown that the proposed formulations outperform the other formulations presented in the literature.
Submitted 13 November, 2019;
originally announced November 2019.