-
Domain Adaptive Multiple Instance Learning for Instance-level Prediction of Pathological Images
Authors:
Shusuke Takahama,
Yusuke Kurose,
Yusuke Mukuta,
Hiroyuki Abe,
Akihiko Yoshizawa,
Tetsuo Ushiku,
Masashi Fukayama,
Masanobu Kitagawa,
Masaru Kitsuregawa,
Tatsuya Harada
Abstract:
Pathological image analysis is an important process for detecting abnormalities such as cancer from cell images. However, since the image size is generally very large, the cost of providing detailed annotations is high, which makes it difficult to apply machine learning techniques. One way to improve the performance of identifying abnormalities while keeping the annotation cost low is to use only labels for each slide, or to use information from another dataset that has already been labeled. However, such weak supervisory information often does not provide sufficient performance. In this paper, we propose a new task setting to improve the classification performance of the target dataset without increasing annotation costs. To solve this problem, we propose a pipeline that uses multiple instance learning (MIL) and domain adaptation (DA) methods. Furthermore, in order to combine the supervisory information of both methods effectively, we propose a method to create pseudo-labels with high confidence. We conducted experiments on the pathological image dataset we created for this study and showed that the proposed method significantly improves the classification performance compared to existing methods.
Submitted 7 April, 2023;
originally announced April 2023.
-
Evolution of the public opinion on COVID-19 vaccination in Japan
Authors:
Yuri Nakayama,
Yuka Takedomi,
Towa Suda,
Takeaki Uno,
Takako Hashimoto,
Masashi Toyoda,
Naoki Yoshinaga,
Masaru Kitsuregawa,
Luis E. C. Rocha,
Ryota Kobayashi
Abstract:
Vaccines are promising tools to control the spread of COVID-19. An effective vaccination campaign requires government policies and community engagement, sharing experiences for social support, and voicing concerns about vaccine safety and efficiency. The increasing use of online social platforms allows us to trace large-scale communication and infer public opinion in real time. We collected more than 100 million vaccine-related tweets posted by 8 million users and used the Latent Dirichlet Allocation model to perform automated topic modeling of tweet texts during the vaccination campaign in Japan. We identified 15 topics grouped into 4 themes on Personal issue, Breaking news, Politics, and Conspiracy and humour. The evolution of the popularity of themes revealed a shift in public opinion, initially sharing the attention over personal issues (individual aspect), collecting information from the news (knowledge acquisition), and government criticisms, towards personal experiences once confidence in the vaccination campaign was established. An interrupted time series regression analysis showed that the Tokyo Olympic Games affected public opinion more than other critical events but not the course of the vaccination. Public opinion on politics was significantly affected by various events, positively shifting the attention in the early stages of the vaccination campaign and negatively later. Tweets about personal issues were mostly retweeted when the vaccination reached the younger population. The associations between the vaccination campaign stages and tweet themes suggest that the public engagement in the social platform contributed to speeding up vaccine uptake by reducing anxiety via social learning and support.
Submitted 22 July, 2022;
originally announced July 2022.
-
A System for Worldwide COVID-19 Information Aggregation
Authors:
Akiko Aizawa,
Frederic Bergeron,
Junjie Chen,
Fei Cheng,
Katsuhiko Hayashi,
Kentaro Inui,
Hiroyoshi Ito,
Daisuke Kawahara,
Masaru Kitsuregawa,
Hirokazu Kiyomaru,
Masaki Kobayashi,
Takashi Kodama,
Sadao Kurohashi,
Qianying Liu,
Masaki Matsubara,
Yusuke Miyao,
Atsuyuki Morishima,
Yugo Murawaki,
Kazumasa Omura,
Haiyue Song,
Eiichiro Sumita,
Shinji Suzuki,
Ribeka Tanaka,
Yu Tanaka,
Masashi Toyoda,
et al. (4 additional authors not shown)
Abstract:
The global pandemic of COVID-19 has made the public pay close attention to related news, covering various domains, such as sanitation, treatment, and effects on education. Meanwhile, the COVID-19 condition is very different among the countries (e.g., policies and development of the epidemic), and thus citizens would be interested in news in foreign countries. We build a system for worldwide COVID-19 information aggregation containing reliable articles from 10 regions in 7 languages sorted by topics. Our reliable COVID-19 related website dataset collected through crowdsourcing ensures the quality of the articles. A neural machine translation module translates articles in other languages into Japanese and English. A BERT-based topic-classifier trained on our article-topic pair dataset helps users find information of interest efficiently by putting articles into different categories.
Submitted 11 October, 2020; v1 submitted 27 July, 2020;
originally announced August 2020.
-
Vocabulary Adaptation for Distant Domain Adaptation in Neural Machine Translation
Authors:
Shoetsu Sato,
Jin Sakuma,
Naoki Yoshinaga,
Masashi Toyoda,
Masaru Kitsuregawa
Abstract:
Neural network methods exhibit strong performance only in a few resource-rich domains. Practitioners, therefore, employ domain adaptation from resource-rich domains that are, in most cases, distant from the target domain. Domain adaptation between distant domains (e.g., movie subtitles and research papers), however, cannot be performed effectively due to mismatches in vocabulary; it will encounter many domain-specific words (e.g., "angstrom") and words whose meanings shift across domains (e.g., "conductor"). In this study, aiming to solve these vocabulary mismatches in domain adaptation for neural machine translation (NMT), we propose vocabulary adaptation, a simple method for effective fine-tuning that adapts embedding layers in a given pre-trained NMT model to the target domain. Prior to fine-tuning, our method replaces the embedding layers of the NMT model by projecting general word embeddings induced from monolingual data in a target domain onto a source-domain embedding space. Experimental results indicate that our method improves the performance of conventional fine-tuning by 3.86 and 3.28 BLEU points in En-Ja and De-En translation, respectively.
Submitted 31 October, 2020; v1 submitted 30 April, 2020;
originally announced April 2020.
-
Learning to Describe Phrases with Local and Global Contexts
Authors:
Shonosuke Ishiwatari,
Hiroaki Hayashi,
Naoki Yoshinaga,
Graham Neubig,
Shoetsu Sato,
Masashi Toyoda,
Masaru Kitsuregawa
Abstract:
When reading a text, it is common to become stuck on unfamiliar words and phrases, such as polysemous words with novel senses, rarely used idioms, internet slang, or emerging entities. If we humans cannot figure out the meaning of those expressions from the immediate local context, we consult dictionaries for definitions or search documents or the web to find other global context to help in interpretation. Can machines help us do this work? Which type of context is more important for machines to solve the problem? To answer these questions, we undertake a task of describing a given phrase in natural language based on its local and global contexts. To solve this task, we propose a neural description model that consists of two context encoders and a description decoder. In contrast to the existing methods for non-standard English explanation [Ni+ 2017] and definition generation [Noraset+ 2017; Gadetsky+ 2018], our model appropriately takes important clues from both local and global contexts. Experimental results on three existing datasets (including WordNet, Oxford and Urban Dictionaries) and a dataset newly created from Wikipedia demonstrate the effectiveness of our method over previous work.
Submitted 10 April, 2019; v1 submitted 1 November, 2018;
originally announced November 2018.
-
Fast and Exact Top-k Search for Random Walk with Restart
Authors:
Yasuhiro Fujiwara,
Makoto Nakatsuji,
Makoto Onizuka,
Masaru Kitsuregawa
Abstract:
Graphs are fundamental data structures and have been employed for centuries to model real-world systems and phenomena. Random walk with restart (RWR) provides a good proximity score between two nodes in a graph, and it has been successfully used in many applications such as automatic image captioning, recommender systems, and link prediction. The goal of this work is to find nodes that have top-k highest proximities for a given node. Previous approaches to this problem find nodes efficiently at the expense of exactness. The main motivation of this paper is to answer, in the affirmative, the question, "Is it possible to improve the search time without sacrificing the exactness?" Our solution, K-dash, is based on two ideas: (1) It computes the proximity of a selected node efficiently by sparse matrices, and (2) It skips unnecessary proximity computations when searching for the top-k nodes. Theoretical analyses show that K-dash guarantees result exactness. We perform comprehensive experiments to verify the efficiency of K-dash. The results show that K-dash can find top-k nodes significantly faster than the previous approaches while it guarantees exactness.
Submitted 31 January, 2012;
originally announced January 2012.