-
A simple DNN regression for the chemical composition in essential oil
Authors:
Yuki Harada,
Shuichi Maeda,
Masato Kiyama,
Shinichiro Nakamura
Abstract:
Although experimental design and methodological surveys for mono-molecular activity/property has been extensively investigated, those for chemical composition have received little attention, with the exception of a few prior studies. In this study, we configured three simple DNN regressors to predict essential oil property based on chemical composition. Despite showing overfitting due to the small…
▽ More
Although experimental design and methodological surveys for mono-molecular activity/property has been extensively investigated, those for chemical composition have received little attention, with the exception of a few prior studies. In this study, we configured three simple DNN regressors to predict essential oil property based on chemical composition. Despite showing overfitting due to the small size of dataset, all models were trained effectively in this study.
△ Less
Submitted 17 December, 2024;
originally announced December 2024.
-
Emulating Clinical Quality Muscle B-mode Ultrasound Images from Plane Wave Images Using a Two-Stage Machine Learning Model
Authors:
Reed Chen,
Courtney Trutna Paley,
Wren Wightman,
Lisa Hobson-Webb,
Yohei Harada,
Felix Jin,
Ouwen Huang,
Mark Palmeri,
Kathryn Nightingale
Abstract:
Research ultrasound scanners such as the Verasonics Vantage often lack the advanced image processing algorithms used by clinical systems. Image quality is even lower in plane wave imaging - often used for shear wave elasticity imaging (SWEI) - which sacrifices spatial resolution for temporal resolution. As a result, delay-and-summed images acquired from SWEI have limited interpretability. In this…
▽ More
Research ultrasound scanners such as the Verasonics Vantage often lack the advanced image processing algorithms used by clinical systems. Image quality is even lower in plane wave imaging - often used for shear wave elasticity imaging (SWEI) - which sacrifices spatial resolution for temporal resolution. As a result, delay-and-summed images acquired from SWEI have limited interpretability. In this project, a two-stage machine learning model was trained to enhance single plane wave images of muscle acquired with a Verasonics Vantage system. The first stage of the model consists of a U-Net trained to emulate plane wave compounding, histogram matching, and unsharp masking using paired images. The second stage consists of a CycleGAN trained to emulate clinical muscle B-modes using unpaired images. This two-stage model was implemented on the Verasonics Vantage research ultrasound scanner, and its ability to provide high-speed image formation at a frame rate of 28.5 +/- 0.6 FPS from a single plane wave transmit was demonstrated. A reader study with two physicians demonstrated that these processed images had significantly greater structural fidelity and less speckle than the original plane wave images.
△ Less
Submitted 7 December, 2024;
originally announced December 2024.
-
PaperWave: Listening to Research Papers as Conversational Podcasts Scripted by LLM
Authors:
Yuchi Yahagi,
Rintaro Chujo,
Yuga Harada,
Changyo Han,
Kohei Sugiyama,
Takeshi Naemura
Abstract:
Listening to audio content, such as podcasts and audiobooks, is one way for people to engage with knowledge. Listening affords people more mobility than reading by seeing, thereby broadening their learning opportunities. This study explores the potential applications of large language models (LLMs) to adapt text documents to audio content and addresses the lack of listening-friendly materials for…
▽ More
Listening to audio content, such as podcasts and audiobooks, is one way for people to engage with knowledge. Listening affords people more mobility than reading by seeing, thereby broadening their learning opportunities. This study explores the potential applications of large language models (LLMs) to adapt text documents to audio content and addresses the lack of listening-friendly materials for niche content, such as research papers. LLMs can generate scripts of audio content in various styles tailored to specific needs, such as full-content duration or speech types (monologue or dialogue). To explore this potential, we developed PaperWave as a prototype that transforms academic paper PDFs into conversational podcasts. Our two-month investigation, involving 11 participants (including the authors), employed an autobiographical design, a field study, and a design workshop. The findings highlight the importance of considering listener interaction with their environment when designing document-to-audio systems.
△ Less
Submitted 21 January, 2025; v1 submitted 19 October, 2024;
originally announced October 2024.
-
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs
Authors:
LLM-jp,
:,
Akiko Aizawa,
Eiji Aramaki,
Bowen Chen,
Fei Cheng,
Hiroyuki Deguchi,
Rintaro Enomoto,
Kazuki Fujii,
Kensuke Fukumoto,
Takuya Fukushima,
Namgi Han,
Yuto Harada,
Chikara Hashimoto,
Tatsuya Hiraoka,
Shohei Hisada,
Sosuke Hosokawa,
Lu Jie,
Keisuke Kamata,
Teruhito Kanazawa,
Hiroki Kanezashi,
Hiroshi Kataoka,
Satoru Katsumata,
Daisuke Kawahara,
Seiya Kawano
, et al. (58 additional authors not shown)
Abstract:
This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its…
▽ More
This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its activities, and technical reports on the LLMs developed by LLM-jp. For the latest activities, visit https://llm-jp.nii.ac.jp/en/.
△ Less
Submitted 30 December, 2024; v1 submitted 4 July, 2024;
originally announced July 2024.
-
A New Deep State-Space Analysis Framework for Patient Latent State Estimation and Classification from EHR Time Series Data
Authors:
Aya Nakamura,
Ryosuke Kojima,
Yuji Okamoto,
Eiichiro Uchino,
Yohei Mineharu,
Yohei Harada,
Mayumi Kamada,
Manabu Muto,
Motoko Yanagita,
Yasushi Okuno
Abstract:
Many diseases, including cancer and chronic conditions, require extended treatment periods and long-term strategies. Machine learning and AI research focusing on electronic health records (EHRs) have emerged to address this need. Effective treatment strategies involve more than capturing sequential changes in patient test values. It requires an explainable and clinically interpretable model by cap…
▽ More
Many diseases, including cancer and chronic conditions, require extended treatment periods and long-term strategies. Machine learning and AI research focusing on electronic health records (EHRs) have emerged to address this need. Effective treatment strategies involve more than capturing sequential changes in patient test values. It requires an explainable and clinically interpretable model by capturing the patient's internal state over time.
In this study, we propose the "deep state-space analysis framework," using time-series unsupervised learning of EHRs with a deep state-space model. This framework enables learning, visualizing, and clustering of temporal changes in patient latent states related to disease progression.
We evaluated our framework using time-series laboratory data from 12,695 cancer patients. By estimating latent states, we successfully discover latent states related to prognosis. By visualization and cluster analysis, the temporal transition of patient status and test items during state transitions characteristic of each anticancer drug were identified. Our framework surpasses existing methods in capturing interpretable latent space. It can be expected to enhance our comprehension of disease progression from EHRs, aiding treatment adjustments and prognostic determinations.
△ Less
Submitted 21 July, 2023;
originally announced July 2023.
-
Gaussian Process Classification Bandits
Authors:
Tatsuya Hayashi,
Naoki Ito,
Koji Tabata,
Atsuyoshi Nakamura,
Katsumasa Fujita,
Yoshinori Harada,
Tamiki Komatsuzaki
Abstract:
Classification bandits are multi-armed bandit problems whose task is to classify a given set of arms into either positive or negative class depending on whether the rate of the arms with the expected reward of at least h is not less than w for given thresholds h and w. We study a special classification bandit problem in which arms correspond to points x in d-dimensional real space with expected re…
▽ More
Classification bandits are multi-armed bandit problems whose task is to classify a given set of arms into either positive or negative class depending on whether the rate of the arms with the expected reward of at least h is not less than w for given thresholds h and w. We study a special classification bandit problem in which arms correspond to points x in d-dimensional real space with expected rewards f(x) which are generated according to a Gaussian process prior. We develop a framework algorithm for the problem using various arm selection policies and propose policies called FCB and FTSV. We show a smaller sample complexity upper bound for FCB than that for the existing algorithm of the level set estimation, in which whether f(x) is at least h or not must be decided for every arm's x. Arm selection policies depending on an estimated rate of arms with rewards of at least h are also proposed and shown to improve empirical sample complexity. According to our experimental results, the rate-estimation versions of FCB and FTSV, together with that of the popular active learning policy that selects the point with the maximum variance, outperform other policies for synthetic functions, and the version of FTSV is also the best performer for our real-world dataset.
△ Less
Submitted 26 December, 2022;
originally announced December 2022.
-
Real-world Video Anomaly Detection by Extracting Salient Features in Videos
Authors:
Yudai Watanabe,
Makoto Okabe,
Yasunori Harada,
Naoji Kashima
Abstract:
We propose a lightweight and accurate method for detecting anomalies in videos. Existing methods used multiple-instance learning (MIL) to determine the normal/abnormal status of each segment of the video. Recent successful researches argue that it is important to learn the temporal relationships among segments to achieve high accuracy, instead of focusing on only a single segment. Therefore we ana…
▽ More
We propose a lightweight and accurate method for detecting anomalies in videos. Existing methods used multiple-instance learning (MIL) to determine the normal/abnormal status of each segment of the video. Recent successful researches argue that it is important to learn the temporal relationships among segments to achieve high accuracy, instead of focusing on only a single segment. Therefore we analyzed the existing methods that have been successful in recent years, and found that while it is indeed important to learn all segments together, the temporal orders among them are irrelevant to achieving high accuracy. Based on this finding, we do not use the MIL framework, but instead propose a lightweight model with a self-attention mechanism to automatically extract features that are important for determining normal/abnormal from all input segments. As a result, our neural network model has 1.3\% of the number of parameters of the existing method. We evaluated the frame-level detection accuracy of our method on three benchmark datasets (UCF-Crime, ShanghaiTech, and XD-Violence) and demonstrate that our method can achieve the comparable or better accuracy than state-of-the-art methods.
△ Less
Submitted 14 September, 2022;
originally announced September 2022.
-
Log-based Anomaly Detection of CPS Using a Statistical Method
Authors:
Yoshiyuki Harada,
Yoriyuki Yamagata,
Osamu Mizuno,
Eun-Hye Choi
Abstract:
Detecting anomalies of a cyber physical system (CPS), which is a complex system consisting of both physical and software parts, is important because a CPS often operates autonomously in an unpredictable environment. However, because of the ever-changing nature and lack of a precise model for a CPS, detecting anomalies is still a challenging task. To address this problem, we propose applying an out…
▽ More
Detecting anomalies of a cyber physical system (CPS), which is a complex system consisting of both physical and software parts, is important because a CPS often operates autonomously in an unpredictable environment. However, because of the ever-changing nature and lack of a precise model for a CPS, detecting anomalies is still a challenging task. To address this problem, we propose applying an outlier detection method to a CPS log. By using a log obtained from an actual aquarium management system, we evaluated the effectiveness of our proposed method by analyzing outliers that it detected. By investigating the outliers with the developer of the system, we confirmed that some outliers indicate actual faults in the system. For example, our method detected failures of mutual exclusion in the control system that were unknown to the developer. Our method also detected transient losses of functionalities and unexpected reboots. On the other hand, our method did not detect anomalies that were too many and similar. In addition, our method reported rare but unproblematic concurrent combinations of operations as anomalies. Thus, our approach is effective at finding anomalies, but there is still room for improvement.
△ Less
Submitted 12 January, 2017;
originally announced January 2017.
-
An Approximation Algorithm for Maximum Internal Spanning Tree
Authors:
Zhi-Zhong Chen,
Youta Harada,
Lusheng Wang
Abstract:
Given a graph G, the {\em maximum internal spanning tree problem} (MIST for short) asks for computing a spanning tree T of G such that the number of internal vertices in T is maximized. MIST has possible applications in the design of cost-efficient communication networks and water supply networks and hence has been extensively studied in the literature. MIST is NP-hard and hence a number of polyno…
▽ More
Given a graph G, the {\em maximum internal spanning tree problem} (MIST for short) asks for computing a spanning tree T of G such that the number of internal vertices in T is maximized. MIST has possible applications in the design of cost-efficient communication networks and water supply networks and hence has been extensively studied in the literature. MIST is NP-hard and hence a number of polynomial-time approximation algorithms have been designed for MIST in the literature. The previously best polynomial-time approximation algorithm for MIST achieves a ratio of 3/4. In this paper, we first design a simpler algorithm that achieves the same ratio and the same time complexity as the previous best. We then refine the algorithm into a new approximation algorithm that achieves a better ratio (namely, 13/17) with the same time complexity. Our new algorithm explores much deeper structure of the problem than the previous best. The discovered structure may be used to design even better approximation or parameterized algorithms for the problem in the future.
△ Less
Submitted 31 July, 2016;
originally announced August 2016.