-
Avatar Communication Provides More Efficient Online Social Support Than Text Communication
Authors:
Masanori Takano,
Kenji Yokotani,
Takahiro Kato,
Nobuhito Abe,
Fumiaki Taka
Abstract:
Online communication via avatars provides a richer online social experience than text communication. This reinforces the importance of online social support. Online social support is effective for people who lack social resources because of the anonymity of online communities. We aimed to understand online social support via avatars and their social relationships to provide better social support to avatar users. Therefore, we administered a questionnaire to three avatar communication service users (Second Life, ZEPETO, and Pigg Party) and three text communication service users (Facebook, X, and Instagram) (N=8,947). There was no duplication of users for each service. By comparing avatar and text communication users, we examined the amount of online social support, stability of online relationships, and the relationships between online social support and offline social resources (e.g., offline social support). We observed that avatar communication service users received more online social support, had more stable relationships, and had fewer offline social resources than text communication service users. However, the positive association between online and offline social support for avatar communication users was more substantial than for text communication users. These findings highlight the significance of realistic online communication experiences through avatars, including nonverbal and real-time interactions with co-presence. The findings also highlighted avatar communication service users' problems in the physical world, such as the lack of offline social resources. This study suggests that enhancing online social support through avatars can address these issues. This could help resolve social resource problems, both online and offline in future metaverse societies.
Submitted 1 May, 2025;
originally announced May 2025.
-
Market, power, gift, and concession economies: Comparison using four-mode primitive network models
Authors:
Takeshi Kato,
Junichi Miyakoshi,
Misa Owa,
Ryuji Mine
Abstract:
Reducing wealth inequality is a global challenge, and the problems of capitalism stem from the enclosure of the commons and the breakdown of the community. According to previous studies by Polanyi, Karatani, and Graeber, economic modes can be divided into capitalist market economy (enclosure and exchange), power economy (de-enclosure and redistribution), gift economy (obligation to return and reciprocity), and concession economy (de-obligation to return). The concession economy reflects Graeber's baseline communism (from each according to their abilities, to each according to their needs) and Deguchi's We-turn philosophy (the "I" as an individual has a "fundamental incapability" and the subject of physical action, responsibility, and freedom is "We" as a multi-agent system, including the "I"). In this study, we constructed novel network models for these four modes and compared their properties (cluster coefficient, graph density, reciprocity, assortativity, centrality, and Gini coefficient). From the calculation results, it became clear that the market economy leads to inequality; the power economy mitigates inequality but cannot eliminate it; the gift and concession economies lead to a healthy and equal economy; and the concession economy, free from the ties of obligation to return, is possible without guaranteeing reciprocity. We intend to promote the transformation from a capitalist economy to a concession economy through activities that disseminate baseline communism and the We-turn philosophy that promotes concession, that is, developing a cooperative platform to support concession through information technology and empirical research through fieldwork.
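As a point of reference for the Gini coefficient listed among the compared properties, the following is a minimal sketch of how it can be computed from a list of non-negative wealth values. The function name `gini` and the sample values are illustrative assumptions; the abstract does not state which formula or data the authors used.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = equal, near 1 = concentrated)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical wealth distributions over four nodes.
equal = gini([1, 1, 1, 1])         # everyone holds the same wealth
concentrated = gini([0, 0, 0, 10])  # one node holds everything
print(equal, concentrated)
```

With four nodes, the fully concentrated case yields 0.75 rather than 1, because the maximum of this estimator for n values is (n - 1) / n.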
Submitted 8 April, 2025;
originally announced April 2025.
-
Scaling Laws for Upcycling Mixture-of-Experts Language Models
Authors:
Seng Pei Liew,
Takuya Kato,
Sho Takase
Abstract:
Pretraining large language models (LLMs) is resource-intensive, often requiring months of training time even with high-end GPU clusters. There are two approaches to mitigating such computational demands: reusing smaller models to train larger ones (upcycling), and training computationally efficient models like mixture-of-experts (MoE). In this paper, we study the upcycling of LLMs to MoE models, whose scaling behavior remains underexplored. Through extensive experiments, we identify empirical scaling laws that describe how performance depends on dataset size and model configuration. In particular, we show that, while scaling these factors improves performance, there is a novel interaction term between the dense and upcycled training datasets that limits the efficiency of upcycling at large computational budgets. Based on these findings, we provide guidance for scaling upcycling and establish conditions under which upcycling outperforms training from scratch within budget constraints.
Submitted 16 June, 2025; v1 submitted 5 February, 2025;
originally announced February 2025.
-
Linearly Convergent Mixup Learning
Authors:
Gakuto Obi,
Ayato Saito,
Yuto Sasaki,
Tsuyoshi Kato
Abstract:
Learning in the reproducing kernel Hilbert space (RKHS) such as the support vector machine has been recognized as a promising technique. It continues to be highly effective and competitive in numerous prediction tasks, particularly in settings where there is a shortage of training data or computational limitations exist. These methods are especially valued for their ability to work with small datasets and their interpretability. To address the issue of limited training data, mixup data augmentation, widely used in deep learning, has remained challenging to apply to learning in RKHS due to the generation of intermediate class labels. Although gradient descent methods handle these labels effectively, dual optimization approaches are typically not directly applicable. In this study, we present two novel algorithms that extend to a broader range of binary classification models. Unlike gradient-based approaches, our algorithms do not require hyperparameters like learning rates, simplifying their implementation and optimization. Both the number of iterations to converge and the computational cost per iteration scale linearly with respect to the dataset size. The numerical experiments demonstrate that our algorithms achieve faster convergence to the optimal solution compared to gradient descent approaches, and that mixup data augmentation consistently improves the predictive performance across various loss functions.
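To illustrate why mixup produces intermediate class labels, here is a minimal sketch of the standard mixup interpolation for one pair of examples. The function name `mixup`, the ±1 label encoding, and the sample values are assumptions made for illustration; this is not the authors' algorithm, which concerns how to optimize with such labels.

```python
def mixup(x_i, y_i, x_j, y_j, lam):
    """Convex combination of two labeled examples with mixing weight lam in [0, 1]."""
    x = [lam * a + (1 - lam) * b for a, b in zip(x_i, x_j)]
    y = lam * y_i + (1 - lam) * y_j
    return x, y


# Two hypothetical binary-classification examples with labels +1 and -1.
x_mixed, y_mixed = mixup([0.0, 2.0], +1, [4.0, 0.0], -1, lam=0.75)
print(x_mixed, y_mixed)  # [1.0, 1.5] 0.5
```

The mixed label 0.5 is neither +1 nor -1, which is the kind of intermediate label the abstract refers to.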
Submitted 13 January, 2025;
originally announced January 2025.
-
Visual-Based Forklift Learning System Enabling Zero-Shot Sim2Real Without Real-World Data
Authors:
Koshi Oishi,
Teruki Kato,
Hiroya Makino,
Seigo Ito
Abstract:
Forklifts are used extensively in various industrial settings and are in high demand for automation. In particular, counterbalance forklifts are highly versatile and employed in diverse scenarios. However, efforts to automate these processes are lacking, primarily owing to the absence of a safe and performance-verifiable development environment. This study proposes a learning system that combines a photorealistic digital learning environment with a 1/14-scale robotic forklift environment to address this challenge. Inspired by the training-based learning approach adopted by forklift operators, we employ an end-to-end vision-based deep reinforcement learning approach. The learning is conducted in a digitalized environment created from CAD data, making it safe and eliminating the need for real-world data. In addition, we safely validate the method in a physical setting utilizing a 1/14-scale robotic forklift with a configuration similar to that of a real forklift. We achieved a 60% success rate in pallet loading tasks in real experiments using a robotic forklift. Our approach demonstrates zero-shot sim2real with a simple method that does not require heuristic additions. This learning-based approach is considered a first step towards the automation of counterbalance forklifts.
Submitted 6 May, 2025; v1 submitted 16 December, 2024;
originally announced December 2024.
-
Analyzing Correlations Between Intrinsic and Extrinsic Bias Metrics of Static Word Embeddings With Their Measuring Biases Aligned
Authors:
Taisei Katô,
Yusuke Miyao
Abstract:
We examine the abilities of intrinsic bias metrics of static word embeddings to predict whether Natural Language Processing (NLP) systems exhibit biased behavior. A word embedding is one of the fundamental NLP technologies that represents the meanings of words through real vectors, and problematically, it also learns social biases such as stereotypes. An intrinsic bias metric measures bias by examining a characteristic of vectors, while an extrinsic bias metric checks whether an NLP system trained with a word embedding is biased. A previous study found that a common intrinsic bias metric usually does not correlate with extrinsic bias metrics. However, the intrinsic and extrinsic bias metrics did not measure the same bias in most cases, which makes us question whether the lack of correlation is genuine. In this paper, we extract characteristic words from datasets of extrinsic bias metrics and analyze correlations with intrinsic bias metrics with those words to ensure both metrics measure the same bias. We observed moderate to high correlations with some extrinsic bias metrics but little to no correlations with the others. This result suggests that intrinsic bias metrics can predict biased behavior in particular settings but not in others. Experiment codes are available at GitHub.
Submitted 13 September, 2024;
originally announced September 2024.
-
Environment-Centric Active Inference
Authors:
Kanako Esaki,
Tadayuki Matsumura,
Takeshi Kato,
Shunsuke Minusa,
Yang Shao,
Hiroyuki Mizuno
Abstract:
To handle unintended changes in the environment by agents, we propose environment-centric active inference (EC-AIF), in which the Markov Blanket of active inference is defined starting from the environment. In normal active inference, the Markov Blanket is defined starting from the agent. That is, first the agent is defined as the entity that performs the "action," such as a robot or a person; then the environment is defined as other people or objects that are directly affected by the agent's "action"; and the boundary between the agent and the environment is defined as the Markov Blanket. This agent-centric definition does not allow the agent to respond to unintended changes in the environment caused by factors outside of the defined environment. In the proposed EC-AIF, there is no entity corresponding to an agent. The environment includes all observable things, including people and things conventionally considered to be the environment, as well as entities that perform "actions" such as robots and people. Accordingly, all states, including robots and people, are included in inference targets, eliminating unintended changes in the environment. The EC-AIF was applied to a robot arm and validated with an object transport task by the robot arm. The results showed that the robot arm successfully transported objects while responding to changes in the target position of the object and to changes in the orientation of another robot arm.
Submitted 22 August, 2024;
originally announced August 2024.
-
Designing Unit Ising Models for Logic Gate Simulation through Integer Linear Programming
Authors:
Shunsuke Tsukiyama,
Koji Nakano,
Xiaotian Li,
Yasuaki Ito,
Takumi Kato,
Yuya Kawamata
Abstract:
An Ising model is defined by a quadratic objective function known as the Hamiltonian, composed of spin variables that can take values of either $-1$ or $+1$. The goal is to assign spin values to these variables in a way that minimizes the value of the Hamiltonian. Ising models are instrumental in tackling many combinatorial optimization problems, leading to significant research in developing solvers for them. Notably, D-Wave Systems has pioneered the creation of quantum annealers, programmable solvers based on quantum mechanics, for these models. This paper introduces unit Ising models, where all non-zero coefficients of linear and quadratic terms are either $-1$ or $+1$. Due to the limited resolution of quantum annealers, unit Ising models are more suitable for quantum annealers to find optimal solutions. We propose a novel design methodology for unit Ising models to simulate logic circuits computing Boolean functions through integer linear programming. By optimizing these Ising models with quantum annealers, we can compute Boolean functions and their inverses. With a fixed unit Ising model for a logic circuit, we can potentially design Application-Specific Unit Quantum Annealers (ASUQAs) for computing the inverse function, which is analogous to Application-Specific Integrated Circuits (ASICs) in digital circuitry. For instance, if we apply this technique to a multiplication circuit, we can design an ASUQA for factorization of two numbers. Our findings suggest a powerful new method for compromising the RSA cryptosystem by leveraging ASUQAs in factorization.
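The definitions above can be made concrete with a small brute-force sketch. It minimizes a Hamiltonian whose only non-zero coefficient is +1, so it qualifies as a unit Ising model in the sense described, and its ground states encode a NOT gate. The function names, the dictionary layout for the coefficients, and the choice of a NOT gate are assumptions for illustration; the paper's integer-linear-programming design method is not reproduced here.

```python
from itertools import product


def energy(spins, h, J):
    """Ising Hamiltonian: sum_i h[i]*s_i + sum_(i,j) J[(i,j)]*s_i*s_j."""
    linear = sum(c * spins[i] for i, c in h.items())
    quadratic = sum(c * spins[i] * spins[j] for (i, j), c in J.items())
    return linear + quadratic


def ground_states(n, h, J):
    """Enumerate all 2**n spin assignments and keep those with minimum energy."""
    best, states = None, []
    for spins in product((-1, 1), repeat=n):
        e = energy(spins, h, J)
        if best is None or e < best:
            best, states = e, [spins]
        elif e == best:
            states.append(spins)
    return best, states


# Unit Ising model H = s0 * s1: minimized exactly when s1 = -s0 (a NOT gate).
h = {}            # no linear terms
J = {(0, 1): 1}   # single quadratic coefficient, +1
best, states = ground_states(2, h, J)
print(best, states)  # -1 [(-1, 1), (1, -1)]
```

Because the ground states list every consistent input/output pair, fixing the output spin and reading off the input spin computes the inverse function, which is the idea the abstract extends to larger circuits.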
Submitted 26 June, 2024;
originally announced June 2024.
-
Large Vocabulary Size Improves Large Language Models
Authors:
Sho Takase,
Ryokan Ri,
Shun Kiyono,
Takuya Kato
Abstract:
This paper empirically investigates the relationship between subword vocabulary size and the performance of large language models (LLMs) to provide insights on how to define the vocabulary size. Experimental results show that larger vocabulary sizes lead to better performance in LLMs. Moreover, we consider a continual training scenario where a pre-trained language model is trained on a different target language. We introduce a simple method to use a new vocabulary instead of the pre-defined one. We show that using the new vocabulary outperforms the model with the vocabulary used in pre-training.
Submitted 27 May, 2025; v1 submitted 24 June, 2024;
originally announced June 2024.
-
JBBQ: Japanese Bias Benchmark for Analyzing Social Biases in Large Language Models
Authors:
Hitomi Yanaka,
Namgi Han,
Ryoma Kumon,
Jie Lu,
Masashi Takeshita,
Ryo Sekizawa,
Taisei Kato,
Hiromi Arai
Abstract:
With the development of large language models (LLMs), social biases in these LLMs have become a pressing issue. Although there are various benchmarks for social biases across languages, the extent to which Japanese LLMs exhibit social biases has not been fully investigated. In this study, we construct the Japanese Bias Benchmark dataset for Question Answering (JBBQ) based on the English bias benchmark BBQ, with analysis of social biases in Japanese LLMs. The results show that while current open Japanese LLMs with more parameters show improved accuracies on JBBQ, their bias scores increase. In addition, prompts with a warning about social biases and chain-of-thought prompting reduce the effect of biases in model outputs, but there is room for improvement in extracting the correct evidence from contexts in Japanese. Our dataset is available at https://github.com/ynklab/JBBQ_data.
Submitted 13 June, 2025; v1 submitted 4 June, 2024;
originally announced June 2024.
-
A Study on Unsupervised Anomaly Detection and Defect Localization using Generative Model in Ultrasonic Non-Destructive Testing
Authors:
Yusaku Ando,
Miya Nakajima,
Takahiro Saitoh,
Tsuyoshi Kato
Abstract:
In recent years, the deterioration of artificial materials used in structures has become a serious social issue, increasing the importance of inspections. Non-destructive testing is in increasing demand due to its capability to inspect for defects and deterioration in structures while preserving their functionality. Among such methods, Laser Ultrasonic Visualization Testing (LUVT) stands out because it allows the visualization of ultrasonic propagation. This makes it visually straightforward to detect defects, thereby enhancing inspection efficiency. With the increasing number of deteriorating structures, challenges such as a shortage of inspectors and increased workload in non-destructive testing have become more apparent. Efforts to address these challenges include exploring automated inspection using machine learning. However, the lack of anomalous data with defects poses a barrier to improving the accuracy of automated inspection through machine learning. Therefore, in this study, we propose a method for automated LUVT inspection using an anomaly detection approach with a diffusion model that can be trained solely on negative examples (defect-free data). We experimentally confirmed that our proposed method improves defect detection and localization compared to general object detection algorithms used previously.
Submitted 26 May, 2024;
originally announced May 2024.
-
Wealth inequality and utility: Effect evaluation of redistribution and consumption morals using the macro-econophysical coupled approach
Authors:
Takeshi Kato,
Yosuke Tanabe,
Mohammad Rezoanul Hoque
Abstract:
Reducing wealth inequality and increasing utility are critical issues. This study reveals the effects of redistribution and consumption morals on wealth inequality and utility. To this end, we present a novel approach that couples the dynamic model of capital, consumption, and utility in macroeconomics with the interaction model of joint business and redistribution in econophysics. With this approach, we calculate the capital (wealth), the utility based on consumption, and the Gini index of their inequality using redistribution and consumption thresholds as moral parameters. The results show that under-redistribution and waste exacerbate inequality; conversely, over-redistribution and stinginess reduce utility; and a balanced, moderate moral achieves both reduced inequality and increased utility. These findings provide renewed economic and numerical support for the moral importance known from philosophy, anthropology, and religion. The revival of redistribution and consumption morals should promote the transformation to a human mutual-aid economy, as indicated by philosophers and anthropologists, instead of the capitalist economy that has produced the current inequality. The practical challenge is to implement bottom-up social business, on a foothold of worker coops and platform cooperatives as a community against the state and the market, with moral consensus and its operation.
Submitted 17 April, 2025; v1 submitted 22 May, 2024;
originally announced May 2024.
-
ProgrammableGrass: A Shape-Changing Artificial Grass Display Adapted for Dynamic and Interactive Display Features
Authors:
Kojiro Tanaka,
Akito Mizuno,
Toranosuke Kato,
Masahiko Mikawa,
Makoto Fujisawa
Abstract:
There are various proposals for employing grass materials as a green landscape-friendly display. However, it is difficult for current techniques to display smooth animations using 8-bit images and to adjust display resolution, similar to conventional displays. We present ProgrammableGrass, an artificial grass display with scalable resolution, capable of swiftly controlling grass color at 8-bit levels. This grass display can control grass colors linearly at the 8-bit level, similar to an LCD display, and can display not only 8-bit-based images but also videos. This display enables pixel-by-pixel color transitions from yellow to green using fixed-length yellow and adjustable-length green grass. We designed a grass module that can be connected to other modules. Utilizing proportional-derivative control, the grass colors are manipulated to display animations at approximately 10 fps. Since the relationship between grass lengths and colors is nonlinear, we developed a calibration system for ProgrammableGrass. Through experiments under multiple conditions, we showed that this calibration system allows ProgrammableGrass to linearly control grass colors at 8-bit levels. Lastly, we demonstrate that ProgrammableGrass can show smooth animations with 8-bit grayscale images. Moreover, we show several application examples to illustrate the potential of ProgrammableGrass. With the advancement of this technology, users will be able to treat grass as a green-based interactive display device.
Submitted 18 March, 2024;
originally announced March 2024.
-
Time preference, wealth and utility inequality: A microeconomic interaction and dynamic macroeconomic model connection approach
Authors:
Takeshi Kato
Abstract:
Based on interactions between individuals and others and references to social norms, this study reveals the impact of heterogeneity in time preference on wealth distribution and inequality. We present a novel approach that connects the interactions between microeconomic agents that generate heterogeneity to the dynamic equations for capital and consumption in macroeconomic models. Using this approach, we estimate the impact of changes in the discount rate due to microeconomic interactions on capital, consumption and utility and the degree of inequality. The results show that intercomparisons with others regarding consumption significantly affect capital, i.e. wealth inequality. Furthermore, the impact on utility is never small and social norms can reduce this impact. Our supporting evidence shows that the quantitative results of inequality calculations correspond to survey data from cohort and cross-cultural studies. This study's micro-macro connection approach can be deployed to connect microeconomic interactions, such as exchange, interest and debt, redistribution, mutual aid and time preference, to dynamic macroeconomic models.
Submitted 13 February, 2024;
originally announced February 2024.
-
WE economy: Potential of mutual aid distribution based on moral responsibility and risk vulnerability
Authors:
Takeshi Kato
Abstract:
Reducing wealth inequality and disparity is a global challenge. The economic system is mainly divided into (1) gift and reciprocity, (2) power and redistribution, (3) market exchange, and (4) mutual aid without reciprocal obligations. The current inequality stems from a capitalist economy consisting of (2) and (3). To sublimate (1), which is the human economy, to (4), the concept of a "mixbiotic society" has been proposed in the philosophical realm. This is a society in which free and diverse individuals, "I," mix with each other, recognize their respective "fundamental incapability" and sublimate them into "WE" solidarity. The economy in this society must have moral responsibility as a coadventurer and consideration for vulnerability to risk. Therefore, I focus on two factors of mind perception: moral responsibility and risk vulnerability, and propose a novel model of wealth distribution following an econophysical approach. Specifically, I developed a joint-venture model, a redistribution model in the joint-venture model, and a "WE economy" model. A simulation comparison of a combination of the joint ventures and redistribution with the WE economies reveals that WE economies are effective in reducing inequality and resilient in normalizing wealth distribution as advantages, and susceptible to free riders as disadvantages. However, this disadvantage can be compensated for by fostering consensus and fellowship, and by complementing it with joint ventures. This study essentially presents the effectiveness of moral responsibility, the complementarity between the WE economy and the joint economy, and the direction of the economy toward reducing inequality. Future challenges are to develop the WE economy model based on real economic analysis and psychology, as well as to promote WE economy fieldwork for worker coops and platform cooperatives to realize a desirable mixbiotic society.
Submitted 11 December, 2023;
originally announced December 2023.
-
Investigating the gaze control ability of VALORANT players using a Python based tool
Authors:
Inhyeok Jeong,
Takuma Nobuto,
Naotsugu Kaneko,
Takaaki Kato,
Kimitaka Nakazawa
Abstract:
The current study investigated the gaze movements of FPS gamers in actual game environments. We developed a low-cost analysis tool using Python to identify gaze movements in real-world gaming environments. In Experiment 1, 11 middle-skilled and ten high-skilled FPS gamers performed a task under the experimental condition. Gaze position, reaction time, and accuracy were calculated during the task.…
▽ More
The current study investigated the gaze movements of FPS gamers in actual game environments. We developed a low-cost analysis tool using Python to identify gaze movements in real-world gaming environments. In Experiment 1, 11 middle-skilled and ten high-skilled FPS gamers performed a task under the experimental condition. Gaze position, reaction time, and accuracy were calculated during the task. Reaction time exhibited a significant positive correlation with task accuracy, suggesting that speed and accuracy were associated with higher game performance. The middle-skilled gamers had a significantly wider horizontal gaze distribution than the high-skilled gamers, and gaze distribution and reaction time showed a negative correlation. These results suggested that high-skilled players utilize peripheral vision during gameplay. In Experiment 2, 15 middle-skilled and 12 high-skilled FPS gamers performed an actual FPS game match. The gaze distribution, kill/death/assist ratio (KDA), and percentage of gaze on game information were calculated. In experiment 2, gaze locations in less important areas were positively correlated with KDA. Thus, performance was determined by the important areas where the gaze was focused rather than by the coordination of gaze position alone. Therefore, a broader range of environments is necessary to comprehend the superior performance of FPS gamers.
Submitted 24 October, 2023;
originally announced October 2023.
-
Evaluation of Cross-Lingual Bug Localization: Two Industrial Cases
Authors:
Shinpei Hayashi,
Takashi Kobayashi,
Tadahisa Kato
Abstract:
This study reports the results of applying the cross-lingual bug localization approach proposed by Xia et al. to industrial software projects. To realize cross-lingual bug localization, we applied machine translation to non-English descriptions in the source code and bug reports, unifying them into English-based texts, to which an existing English-based bug localization technique was applied. In addition, a prototype tool based on BugLocator was implemented and applied to two Japanese industrial projects, which resulted in a slightly different performance from that of Xia et al.
Submitted 3 October, 2023;
originally announced October 2023.
-
Lower Gravity Demonstratable Testbed for Space Robot Experiments
Authors:
Kentaro Uno,
Kazuki Takada,
Keita Nagaoka,
Takuya Kato,
Arthur Candalot,
Kazuya Yoshida
Abstract:
In developing mobile robots for exploration on the planetary surface, it is crucial to evaluate the robot's performance, demonstrating the harsh environment in which the robot will actually be deployed. Repeatable experiments in a controlled testing environment that can reproduce various terrain and gravitational conditions are essential. This paper presents the development of a minimal and space-saving indoor testbed, which can simulate steep slopes, uneven terrain, and lower gravity, employing a three-dimensional target tracking mechanism (active xy and passive z) with a counterweight.
Submitted 22 October, 2023; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Generative AI trial for nonviolent communication mediation
Authors:
Takeshi Kato
Abstract:
Aiming for a mixbiotic society that combines freedom and solidarity among people with diverse values, I focused on nonviolent communication (NVC) that enables compassionate giving in various situations of social division and conflict, and tried a generative AI for it. Specifically, ChatGPT was used in place of the traditional certified trainer to test the possibility of mediating (modifying) input sentences in four processes: observation, feelings, needs, and requests. The results indicate that there is potential for the application of generative AI, although not yet at a practical level. Suggested improvement guidelines included adding model responses, relearning revised responses, specifying appropriate terminology for each process, and re-asking for required information. The use of generative AI will be useful initially to assist certified trainers, to prepare for and review events and workshops, and in the future to support consensus building and cooperative behavior in digital democracy, platform cooperatives, and cyber-human social co-operating systems. It is hoped that the widespread use of NVC mediation using generative AI will lead to the early realization of a mixbiotic society.
Submitted 7 August, 2023;
originally announced August 2023.
-
Avatar Fusion Karaoke: Research and development on multi-user music play VR experience in the metaverse
Authors:
Alexandre Berthault,
Takuma Kato,
Akihiko Shirai
Abstract:
This paper contributes to building a standard process of research and development (R&D) for new user experiences (UX) in metaverse services. We tested this R&D process on a new UX proof of concept (PoC) for Meta Quest head-mounted displays (HMDs) consisting of a school-life karaoke experience with the hypothesis that it is possible to design the avatars with only the necessary functions and rendering costs. The school-life metaverse is a relevant subject for discovering issues and problems in this type of simultaneous connection. To qualitatively evaluate the potential of a multi-person metaverse experience, this study investigated subjects where each avatar requires expressive skills. While avatar play experiences feature artistic expressions, such as dancing, playing musical instruments, and drawing, and these can be used to evaluate their operability and expressive capabilities qualitatively, the Quest's tracking capabilities are insufficient for full-body performance and graphical art expression. Considering such hardware limitations, this study evaluated the Quest, focusing primarily on UX simplicity using AI Fusion techniques and expressiveness in instrumental scenes played by approximately four avatars. This research reported methods for multiuser metaverse communication and its supporting technologies, such as head-mounted devices and their graphics performance, special interaction techniques, and complementary tools, as well as the importance of PoC development, its evaluation, and its iterations. The result is remarkable for further research; these expressive technologies in a multi-user context are directly related to the quality of communication within the metaverse and the value of the user-generated content (UGC) produced there.
Submitted 4 August, 2023;
originally announced August 2023.
-
Dual-Matrix Domain-Wall: A Novel Technique for Generating Permutations by QUBO and Ising Models with Quadratic Sizes
Authors:
Koji Nakano,
Shunsuke Tsukiyama,
Yasuaki Ito,
Takashi Yazane,
Junko Yano,
Takumi Kato,
Shiro Ozaki,
Rie Mori,
Ryota Katsuki
Abstract:
The Ising model is defined by an objective function using a quadratic formula of qubit variables. The problem of an Ising model aims to determine the qubit values of the variables that minimize the objective function, and many optimization problems can be reduced to this problem. In this paper, we focus on optimization problems related to permutations, where the goal is to find the optimal permutation out of the $n!$ possible permutations of $n$ elements. To represent these problems as Ising models, a commonly employed approach is to use a kernel that utilizes one-hot encoding to find any one of the $n!$ permutations as the optimal solution. However, this kernel contains a large number of quadratic terms and high absolute coefficient values. The main contribution of this paper is the introduction of a novel permutation encoding technique called dual-matrix domain-wall, which significantly reduces the number of quadratic terms and the maximum absolute coefficient values in the kernel. Surprisingly, our dual-matrix domain-wall encoding reduces the quadratic term count and maximum absolute coefficient values from $n^3-n^2$ and $2n-4$ to $6n^2-12n+4$ and $2$, respectively. We also demonstrate the applicability of our encoding technique to partial permutations and Quadratic Unconstrained Binary Optimization (QUBO) models. Furthermore, we discuss a family of permutation problems that can be efficiently implemented using Ising/QUBO models with our dual-matrix domain-wall encoding.
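As a rough illustration of the baseline this abstract compares against, the sketch below builds the conventional one-hot permutation penalty in QUBO form, $\sum_i (\sum_j x_{ij} - 1)^2 + \sum_j (\sum_i x_{ij} - 1)^2$, and counts its quadratic terms. The function names are hypothetical, and the penalty is the textbook one-hot form, assumed here to correspond to the kernel the abstract mentions; the dual-matrix domain-wall encoding itself is not reproduced.

```python
from itertools import combinations


def one_hot_quadratic_pairs(n):
    """Collect variable pairs that receive a quadratic term in the
    textbook one-hot permutation penalty (assumed form, see lead-in).

    Variables are indexed as (row, column) of an n x n binary matrix.
    """
    pairs = set()
    for i in range(n):
        for j, k in combinations(range(n), 2):
            pairs.add(((i, j), (i, k)))  # two entries in the same row i
            pairs.add(((j, i), (k, i)))  # two entries in the same column i
    return pairs


def one_hot_energy(x):
    """Penalty value for a square 0/1 matrix x; zero iff x is a permutation matrix."""
    n = len(x)
    row_penalty = sum((sum(x[i]) - 1) ** 2 for i in range(n))
    col_penalty = sum((sum(x[i][j] for i in range(n)) - 1) ** 2 for j in range(n))
    return row_penalty + col_penalty


if __name__ == "__main__":
    n = 4
    print(len(one_hot_quadratic_pairs(n)), n**3 - n**2)
    print(one_hot_energy([[1, 0, 0], [0, 0, 1], [0, 1, 0]]))
```

Each row and each column contributes $n(n-1)/2$ pairs, so the count is $n^2(n-1) = n^3 - n^2$, which agrees with the one-hot figure quoted in the abstract.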
Submitted 1 November, 2023; v1 submitted 2 August, 2023;
originally announced August 2023.
-
Mixbiotic society measures: Comparison of organizational structures based on communication simulation
Authors:
Takeshi Kato,
Jyunichi Miyakoshi,
Tadayuki Matsumura,
Yasuyuki Kudo,
Ryuji Mine,
Hiroyuki Mizuno,
Yasuo Deguchi
Abstract:
The philosophical world has proposed the concept of "mixbiotic society," in which individuals with freedom and diverse values mix and mingle to recognize each other's respective "fundamental incapability" and sublimate into solidarity, toward solving the issues of social isolation and fragmentation. Based on this concept, the mixbiotic society measures have been proposed to evaluate dynamic communication patterns with reference to classification in cellular automata and particle reaction-diffusion that simulate living phenomena. In this paper, we applied these measures to five typologies of organizational structure (Red: impulsive, Amber: adaptive, Orange: achievement, Green: pluralistic, and Teal: evolutionary) and evaluated their features. Specifically, we formed star, tree, tree+jumpers, tree+more jumpers, and small-world type networks corresponding to each of the five typologies, conducted communication simulations on these networks, and calculated values for the mixbiotic society measures. The results showed that the Teal organization has the highest value of the mixism measure among the mixbiotic society measures, i.e., it balances similarity (mixing) and dissimilarity (mingling) in communication, and is living and mixbiotic between order and chaos. Measures other than mixism showed that in the Teal organization, information is not concentrated in a central leader and that communication takes place among various members. This evaluation of organizational structures shows that the mixbiotic society measures are also useful for assessing organizational change. In the future, these measures will be used not only in business organizations, but also in digital democratic organizations and platform cooperatives in conjunction with information technology.
Submitted 28 July, 2023;
originally announced July 2023.
-
Mixbiotic society measures: Assessment of community well-going as living system
Authors:
Takeshi Kato,
Jyunichi Miyakoshi,
Tadayuki Matsumura,
Ryuji Mine,
Hiroyuki Mizuno,
Yasuo Deguchi
Abstract:
Social isolation is caused by the impoverishment of community (atomism) and fragmentation is caused by the enlargement of in-group (mobism), both of which can be viewed as social problems related to communication. To solve these problems, the philosophical world has proposed the concept of "mixbiotic society," in which individuals with freedom and diverse values mix and mingle to recognize each other's respective "fundamental incapability" and sublimate into solidarity. Based on this concept, this study proposes new mixbiotic society measures to evaluate dynamic communication patterns with reference to classification in cellular automata and particle reaction-diffusion that simulate living phenomena. Specifically, the hypothesis of measures corresponding to the four classes was formulated, and the hypothesis was validated by simulating the generation and disappearance of communication. As a result, considering communication patterns as multidimensional vectors, it was found that the mean of Euclidean distance for "mobism," the variance of the relative change in distance for "atomism," the composite measure that multiplies the mean and variance of cosine similarity for "mixism," which corresponds to the well-going of mixbiotic society, and the almost zero measures for "nihilism," are suitable. Then, evaluating seven real-society datasets using these measures, we showed that the mixism measure is useful for assessing the livingness of communication, and that it is possible to typify communities based on plural measures. The measures established in this study are superior to conventional analysis in that they can evaluate dynamic patterns, they are simple to calculate, and their meanings are easy to interpret. As a future development, the mixbiotic society measures will be used in the fields of digital democracy and platform cooperativism toward a desirable society.
Submitted 24 July, 2023; v1 submitted 21 July, 2023;
originally announced July 2023.
-
Subjective-objective policy making approach: Coupling of resident-values multiple regression analysis with value-indices, multi-agent-based simulation
Authors:
Misa Owa,
Junichi Miyakoshi,
Takeshi Kato
Abstract:
Given the concerns around the existing subjective and objective policy evaluation approaches, this study proposes a new combined subjective-objective policy evaluation approach to choose better policies that reflect the will of citizens and are backed up by objective facts. Subjective approaches, such as the Life Satisfaction Approach and the Contingent Valuation Method, convert subjectivity into economic value, raising the question of whether a higher economic value really accords with what citizens want. Objective policy evaluation approaches, such as Evidence Based Policy Making and Multi-Agent-Based Simulation (MABS), do not take subjectivity into account, making it difficult to choose from diverse and pluralistic candidate policies. The proposed approach establishes a subjective target function based on a multiple regression analysis of the results of a residents' questionnaire survey, and uses MABS to calculate the objective evaluation indices for a number of candidate policies. Next, a new subjective-objective coupling target function, combining the explanatory variables of the subjective target function with objective evaluation indices, is set up and optimized to select the preferred policies from numerous candidates. To evaluate this approach, we conducted a verification of renewable energy introduction policies at Takaharu Town in Miyazaki Prefecture, Japan. The results show a good potential for using a new subjective-objective coupling target function to select policies consistent with the residents' values for well-being from 20,000 policy candidates for social, ecological, and economic values obtained in MABS. Using the new approach to compare several policies enables concrete expression of the will of stakeholders with diverse values, and contributes to constructive discussions and consensus-building.
Submitted 13 June, 2023;
originally announced June 2023.
-
Simulation-Aided Deep Learning for Laser Ultrasonic Visualization Testing
Authors:
Miya Nakajima,
Takahiro Saitoh,
Tsuyoshi Kato
Abstract:
In recent years, laser ultrasonic visualization testing (LUVT) has attracted much attention because of its ability to efficiently perform non-contact ultrasonic non-destructive testing. Despite many reported successes of deep learning-based image analysis in a wide range of areas, attempts to apply deep learning to defect detection in LUVT images face the difficulty of preparing a large dataset of LUVT images, which is too expensive to scale. To compensate for the scarcity of such training data, we propose a data augmentation method that generates artificial LUVT images by simulation and applies a style transfer to the simulated LUVT images. The experimental results showed that data augmentation based on the style-transformed simulated images improved defect prediction performance more than directly using the raw simulated images for data augmentation.
Submitted 29 May, 2023;
originally announced May 2023.
-
A Study on Deep CNN Structures for Defect Detection From Laser Ultrasonic Visualization Testing Images
Authors:
Miya Nakajima,
Takahiro Saitoh,
Tsuyoshi Kato
Abstract:
The importance of ultrasonic nondestructive testing has been increasing in recent years, and there are high expectations for the potential of laser ultrasonic visualization testing (LUVT), which combines laser ultrasonic testing with scattered wave visualization technology. Even if scattered waves are visualized, inspectors still need to carefully inspect the images. To automate this, this paper proposes a deep neural network for automatic defect detection and localization in LUVT images. To explore the structure of a neural network suitable for this task, we compared the LUVT image analysis problem with the generic object detection problem. Numerical experiments using real-world data from a SUS304 flat plate showed that the proposed method is more effective than the general object detection model in terms of prediction performance. We also show that the computational time required for prediction is shorter than that of the general object detection model.
Submitted 23 May, 2023;
originally announced May 2023.
-
Well-being policy evaluation methodology based on WE pluralism
Authors:
Takeshi Kato
Abstract:
Methodologies for evaluating and selecting policies that contribute to the well-being of diverse populations need clarification. To bridge the gap between objective indicators and policies related to well-being, this study shifts from constitutive pluralism based on objective indicators to conceptual pluralism that emphasizes subjective context, develops from subject-object pluralism through individual-group pluralism to WE pluralism, and presents a new policy evaluation method that combines joint fact-finding based on policy plurality. First, to evaluate policies involving diverse stakeholders, I develop from individual subjectivity-objectivity to individual subjectivity and group intersubjectivity, and then move to a narrow-wide WE pluralism in the gradation of I-family-community-municipality-nation-world. Additionally, by referring to some functional forms of well-being, I formulate the dependence of well-being on narrow-wide WE. Finally, given that policies themselves have a plurality of social, ecological, and economic values, I define a set of policies for each of the narrow-wide WE and consider a mapping between the two to provide an evaluation basis. Furthermore, by combining well-being and joint fact-finding on the narrow-wide WE consensus, the policy evaluation method is formulated. The fact-value combined parameter system, combined policy-making approach, and combined impact evaluation are disclosed as examples of implementation. This paper contributes to the realization of a well-being society by bridging philosophical theory and policies based on WE pluralism and presenting a new method of policy evaluation based on subjective context and consensus building.
Submitted 8 May, 2023;
originally announced May 2023.
-
Multi-Target Tobit Models for Completing Water Quality Data
Authors:
Yuya Takada,
Tsuyoshi Kato
Abstract:
Monitoring microbiological behaviors in water is crucial to manage public health risk from waterborne pathogens, although quantifying the concentrations of microbiological organisms in water is still challenging because concentrations of many pathogens in water samples may often be below the quantification limit, producing censored data. To enable statistical analysis based on quantitative values, the true values of non-detected measurements must be estimated with high precision. The Tobit model is a well-known linear regression model for analyzing censored data. One drawback of the Tobit model is that only the target variable is allowed to be censored. In this study, we devised a novel extension of the classical Tobit model, called the multi-target Tobit model, to handle multiple censored variables simultaneously by introducing multiple target variables. For fitting the new model, a numerically stable optimization algorithm was developed based on elaborate theories. Experiments conducted using several real-world water quality datasets provided evidence that estimating multiple columns jointly gains a great advantage over estimating them separately.
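For orientation, the classical single-target Tobit model that this abstract extends can be sketched as a log-likelihood with left censoring: observations above the quantification limit contribute a Gaussian density term, and censored observations contribute the probability mass below the limit. The function names, and the convention that any value at or below `limit` counts as censored, are assumptions made for illustration; the multi-target extension and its optimization algorithm are not shown.

```python
import math


def normal_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(y, mu, sigma, limit):
    """Log-likelihood of a left-censored Gaussian (classical Tobit) model.

    y     : observed values; entries <= limit are treated as censored (assumption)
    mu    : predicted means, e.g. the linear predictor x_i . beta for each sample
    sigma : noise standard deviation (> 0)
    limit : quantification limit
    """
    total = 0.0
    for yi, mi in zip(y, mu):
        if yi <= limit:
            # Censored: only know the latent value lies below the limit.
            total += math.log(normal_cdf((limit - mi) / sigma))
        else:
            # Fully observed: ordinary Gaussian log density.
            total += normal_logpdf((yi - mi) / sigma) - math.log(sigma)
    return total


if __name__ == "__main__":
    print(tobit_loglik([0.0, 1.2, 2.5], [0.3, 1.0, 2.0], 1.0, 0.0))
```

Maximizing this function over the regression coefficients and `sigma` gives the usual Tobit fit; a single-target model of this kind censors only `y`, which is the limitation the abstract describes.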
Submitted 21 February, 2023;
originally announced February 2023.
-
Wealth Redistribution and Mutual Aid: Comparison using Equivalent/Nonequivalent Exchange Models of Econophysics
Authors:
Takeshi Kato
Abstract:
Given the wealth inequality worldwide, there is an urgent need to identify the mode of wealth exchange through which it arises. To address the research gap regarding models that combine equivalent exchange and redistribution, this study compares an equivalent market exchange with redistribution based on power centers and a nonequivalent exchange with mutual aid using the Polanyi, Graeber, and Karatani modes of exchange. Two new exchange models based on multi-agent interactions are reconstructed following an econophysics approach for evaluating the Gini index (inequality) and total exchange (economic flow). Exchange simulations indicate that the evaluation parameter of the total exchange divided by the Gini index can be expressed by the same saturated curvilinear approximate equation using the wealth transfer rate and time period of redistribution and the surplus contribution rate of the wealthy and the saving rate. However, considering the coercion of taxes and its associated costs and independence based on the morality of mutual aid, a nonequivalent exchange without return obligation is preferred. This is oriented toward Graeber's baseline communism and Karatani's mode of exchange D, with implications for alternatives to the capitalist economy.
Submitted 30 December, 2022;
originally announced January 2023.
-
Composite Consensus-Building Process: Permissible Meeting Analysis and Compromise Choice Exploration
Authors:
Yasuhiro Asa,
Takeshi Kato,
Ryuji Mine
Abstract:
In solving today's social issues, it is necessary to determine solutions that are acceptable to all stakeholders and collaborate to apply them. The conventional technology of "permissible meeting analysis" derives a consensusable choice that falls within everyone's permissible range through mathematical analyses; however, it tends to be biased toward the majority in a group, making it difficult to reach a consensus when a conflict arises. To support consensus building (defined here as an acceptable compromise that not everyone rejects), we developed a composite consensus-building process. The developed process addresses this issue by combining permissible meeting analysis with a new "compromise choice-exploration" technology, which presents a consensusable choice that emphasizes fairness and equality among everyone when permissible meeting analysis fails to do so. When neither permissible meeting analysis nor compromise choice exploration arrives at a consensus, a facility is provided to create a sublated choice among those provided by them. The trial experimental results confirmed that permissible meeting analysis and compromise choice exploration are sufficiently useful for deriving consensusable choices. Furthermore, we found that compromise choice exploration is characterized by its ability to derive choices that control the balance between compromise and fairness. Our proposed composite consensus-building approach could be applied in a wide range of situations, from local issues in municipalities and communities to international issues such as environmental protection and human rights issues. It could also aid in developing digital democracy and platform cooperativism.
Submitted 15 November, 2022;
originally announced November 2022.
-
Free energy model of emotional valence in dual-process perceptions
Authors:
Hideyoshi Yanagisawa,
Xiaoxiang Wu,
Kazutaka Ueda,
Takeo Kato
Abstract:
An appropriate level of arousal induces positive emotions, and a high arousal potential may provoke negative emotions. To explain the effect of arousal on emotional valence, we propose a novel mathematical framework of arousal potential variations in the dual process of human cognition: automatic and controlled. A suitable mathematical formulation to explain the emotions in the dual process is still absent. Our model associates free energy with arousal potential and its variations to explain emotional valence. Decreasing and increasing free energy consequently induce positive and negative emotions, respectively. We formalize a transition from the automatic to the controlled process in the dual process as a change of Bayesian prior. Further, we model emotional valence using free energy increase (FI) when one tries changing one's Bayesian prior and its reduction (FR) when one succeeds in recognizing the same stimuli with a changed prior, and define three emotions, "interest," "confusion," and "boredom," using the variations. The results of our mathematical analysis comparing various Gaussian model parameters reveal the following: 1) prediction error (PR) increases FR (representing "interest") when the first prior variance is greater than the second prior variance, 2) PR decreases FR when the first prior variance is less than the second prior variance, and 3) the distance between priors' means always increases FR. We also discuss the association of the outcomes with emotions in the controlled process. The proposed mathematical model provides a general framework for predicting and controlling emotional valence in the dual process that varies with viewpoint and stimuli, as well as for understanding the contradictions in the effects of arousal on the valence.
Submitted 21 October, 2022; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Islamic and capitalist economies: Comparison using econophysics models of wealth exchange and redistribution
Authors:
Takeshi Kato
Abstract:
Islamic and capitalist economies have several differences, the most fundamental being that the Islamic economy is characterized by the prohibition of interest (riba) and speculation (gharar) and the enforcement of Shariah-compliant profit-loss sharing (mudaraba, murabaha, salam, etc.) and wealth redistribution (waqf, sadaqah, and zakat). In this study, I apply new econophysics models of wealth exchange and redistribution to quantitatively compare these characteristics to those of capitalism and evaluate wealth distribution and disparity using a simulation. Specifically, regarding exchange, I propose a loan interest model representing finance capitalism and riba and a joint venture model representing shareholder capitalism and mudaraba; regarding redistribution, I create a transfer model representing inheritance tax and waqf. As exchanges are repeated from an initial uniform distribution of wealth, wealth distribution approaches a power-law distribution more quickly for the loan interest than the joint venture model; and the Gini index, representing disparity, rapidly increases. The joint venture model's Gini index increases more slowly, but eventually, the wealth distribution in both models becomes a delta distribution, and the Gini index gradually approaches 1. Next, when both models are combined with the transfer model to redistribute wealth in every given period, the loan interest model has a larger Gini index than the joint venture model, but both converge to a Gini index of less than 1. These results quantitatively reveal that in the Islamic economy, disparity is restrained by prohibiting riba and promoting reciprocal exchange in mudaraba and redistribution through waqf. Comparing Islamic and capitalist economies provides insights into the benefits of economically embracing the ethical practice of mutual aid and suggests guidelines for an alternative to capitalism.
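The Gini index used in this abstract as the disparity measure can be sketched with the standard mean-absolute-difference formula, $G = \sum_i \sum_j |w_i - w_j| / (2 n \sum_i w_i)$. The function name is hypothetical, and this is the generic definition rather than the paper's own code; the exchange and redistribution models themselves are not reproduced here.

```python
def gini(wealth):
    """Gini index of a list of non-negative wealth values.

    Uses the mean-absolute-difference definition:
    G = sum_i sum_j |w_i - w_j| / (2 * n * sum_i w_i).
    Returns 0.0 for an empty list or zero total wealth (a convention chosen here).
    """
    n = len(wealth)
    total = sum(wealth)
    if n == 0 or total == 0:
        return 0.0
    diff_sum = sum(abs(a - b) for a in wealth for b in wealth)
    return diff_sum / (2.0 * n * total)


if __name__ == "__main__":
    print(gini([1, 1, 1, 1]))      # uniform initial distribution
    print(gini([0, 0, 0, 1]))      # one agent holds everything
    print(gini([0] * 999 + [1]))   # same, with many agents
```

A uniform distribution gives 0, and when a single agent among $n$ holds all the wealth the value is $(n-1)/n$, which tends to 1 as $n$ grows. That is the limiting behaviour the abstract describes for the delta-like distribution reached without redistribution.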
Submitted 23 September, 2022; v1 submitted 11 June, 2022;
originally announced June 2022.
-
Social Co-OS: Cyber-Human Social Co-Operating System
Authors:
Takeshi Kato,
Yasuyuki Kudo,
Junichi Miyakoshi,
Misa Owa,
Yasuhiro Asa,
Takashi Numata,
Ryuji Mine,
Hiroyuki Mizuno
Abstract:
The novel concept of a Cyber-Human Social System (CHSS) and a diverse and pluralistic 'mixed-life society' is proposed, wherein cyber and human societies commit to each other. This concept enhances the Cyber-Physical System (CPS), which is associated with the current Society 5.0, a social vision realised through the fusion of cyber (virtual) and physical (real) spaces following information society (Society 4.0 and Industry 4.0). Moreover, the CHSS enhances the Human-CPS, the Human-in-the-Loop CPS (HiLCPS), and the Cyber-Human System by intervening in individual behaviour pro-socially and supporting consensus building. As a form of architecture that embodies the CHSS concept, the Cyber-Human Social Co-Operating System (Social Co-OS) that combines cyber and human societies is shown. In this architecture, the cyber and human systems cooperate through the fast loop (operation and administration) and slow loop (consensus and politics). Furthermore, the technical content and current implementation of the basic functions of the Social Co-OS are described. These functions consist of individual behavioural diagnostics, interventions in the fast loop, group decision diagnostics and consensus building in the slow loop. Subsequently, this system will contribute to mutual aid communities and platform cooperatives.
Submitted 21 September, 2022; v1 submitted 5 October, 2021;
originally announced October 2021.
-
Physical Activity Analysis of College Students During the COVID-19 Pandemic Using Smartphones
Authors:
Yuuki Nishiyama,
Yuui Kakino,
Enishi Naka,
Yuka Noda,
Satsuki Hashiba,
Yusuke Yamada,
Wataru Sasaki,
Tadashi Okoshi,
Jin Nakazawa,
Masaki Mori,
Hisashi Mizutori,
Kotomi Shiota,
Tomohisa Nagano,
Yuko Tokairin,
Takaaki Kato
Abstract:
Owing to the pandemic caused by the coronavirus disease of 2019 (COVID-19), several universities have closed their campuses to prevent the spread of infection. Consequently, university classes are being held over the Internet, and students attend these classes from their homes. While the COVID-19 pandemic is expected to be prolonged, the online-centric lifestyle has raised concerns about secondary health issues caused by reduced physical activity (PA). However, the actual status of PA among university students has not yet been examined in Japan. Hence, in this study, we collected daily PA data (including the number of steps taken and data associated with six types of activities) using smartphones and analyzed the changes in the PA of university students. The PA data were collected over a period of ten weeks from 305 first-year university students who were attending a mandatory physical education class at the university. The results indicate that, compared to the average number of steps taken before the COVID-19 pandemic (6474.87 steps), the average number of steps taken after the COVID-19 pandemic (3522.5 steps) has decreased by 45.6%. Furthermore, the decrease in commuting time (7 AM to 10 AM), classroom time, and extracurricular activity time (11 AM to 12 AM) has led to a decrease in PA on weekdays owing to reduced unplanned exercise opportunities and has caused an increase in the time spent stationary in daily life.
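The reported 45.6% reduction follows directly from the two step averages quoted in the abstract. A short check, where the helper name `percent_decrease` is an illustrative assumption:

```python
def percent_decrease(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Average daily step counts quoted in the abstract.
steps_before = 6474.87
steps_after = 3522.5

decrease = percent_decrease(steps_before, steps_after)
print(f"{decrease:.1f}%")
```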
Submitted 11 March, 2021;
originally announced March 2021.
-
Learning Sign-Constrained Support Vector Machines
Authors:
Kenya Tajima,
Takahiko Henmi,
Kohei Tsuchida,
Esmeraldo Ronnie R. Zara,
Tsuyoshi Kato
Abstract:
Domain knowledge is useful for improving the generalization performance of learning machines. Sign constraints are a handy representation for combining domain knowledge with learning machines. In this paper, we consider constraining the signs of the weight coefficients in learning the linear support vector machine, and develop two optimization algorithms for minimizing the empirical risk under the sign constraints. The first algorithm is based on the projected gradient method, in which each iteration takes $O(nd)$ computational cost and sublinear convergence of the objective error is guaranteed. The second algorithm is based on the Frank-Wolfe method, which also converges sublinearly and possesses a clear termination criterion. We show that each iteration of the Frank-Wolfe method also requires $O(nd)$ cost. Furthermore, we derive an explicit expression for the minimal iteration number to ensure an $ε$-accurate solution by analyzing the curvature of the objective function. Finally, we empirically demonstrate that sign constraints are a promising technique when similarities to the training examples compose the feature vector.
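To make the projected gradient idea concrete, the sketch below shows only the projection step: the Euclidean projection onto sign constraints is a coordinate-wise clip. This is a generic illustration, not the authors' implementation. The encoding of constraints as +1 (non-negative), -1 (non-positive) and 0 (unconstrained), and both function names, are assumptions, and the gradient of the empirical risk is taken as given by the caller.

```python
def project_onto_sign_constraints(w, signs):
    """Clip each weight so it satisfies its sign constraint."""
    projected = []
    for weight, sign in zip(w, signs):
        if sign > 0:        # weight must be >= 0
            projected.append(max(weight, 0.0))
        elif sign < 0:      # weight must be <= 0
            projected.append(min(weight, 0.0))
        else:               # no constraint on this coordinate
            projected.append(weight)
    return projected


def projected_gradient_step(w, grad, signs, step_size):
    """One iteration: plain gradient step followed by the projection."""
    moved = [wi - step_size * gi for wi, gi in zip(w, grad)]
    return project_onto_sign_constraints(moved, signs)


print(project_onto_sign_constraints([1.0, -2.0, 3.0, -4.0], [1, 1, -1, 0]))
```

The projection itself costs O(d) for d weights, so in a scheme like this the per-iteration cost is dominated by evaluating the gradient over the training set.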
Submitted 5 January, 2021;
originally announced January 2021.
-
Proof of Authenticity of Logistics Information with Passive RFID Tags and Blockchain
Authors:
Hiroshi Watanabe,
Kenji Saito,
Satoshi Miyazaki,
Toshiharu Okada,
Hiroyuki Fukuyama,
Tsuneo Kato,
Katsuo Taniguchi
Abstract:
In tracing the (robotically automated) logistics of large quantities of goods, inexpensive passive RFID tags are preferred for cost reasons. Accordingly, among the many issues of RFID, security between such tags and readers has primarily been studied. However, the authenticity of data cannot be guaranteed if logistics services can give false information. Although the use of blockchain is often discussed, it is simply a recording system, so there is a risk that false records may be written to it.
As a solution, we propose a design in which a digitally signing, location-constrained and tamper-evident reader atomically writes evidence to the blockchain along with its reading and writing of a tag.
By semi-formal modeling, we confirmed that the confidentiality and integrity of the information can be maintained throughout the system, and digitally signed data can be verified later despite possible compromise of private keys or signature algorithms, or expiration of public key certificates. We also introduce a prototype design to show that our proposal is viable.
This makes it possible to trace authentic logistics information using inexpensive passive RFID tags. Furthermore, by abstracting the reader/writer as a sensor/actuator, this model can be extended to IoT in general.
Submitted 10 November, 2020;
originally announced November 2020.
-
Frank-Wolfe algorithm for learning SVM-type multi-category classifiers
Authors:
Kenya Tajima,
Yoshihiro Hirohashi,
Esmeraldo Ronnie Rey Zara,
Tsuyoshi Kato
Abstract:
Multi-category support vector machine (MC-SVM) is one of the most popular machine learning algorithms. There are many variants of MC-SVM, but different optimization algorithms have been developed for different learning machines. In this study, we developed a new optimization algorithm that can be applied to many MC-SVM variants. The algorithm is based on the Frank-Wolfe framework, which requires solving two subproblems, direction finding and line search, in each iteration. The contribution of this study is the discovery that both subproblems have a closed-form solution if the Frank-Wolfe framework is applied to the dual problem. Additionally, closed-form solutions for both the direction finding and the line search exist even for the Moreau envelopes of the loss functions. We use several large datasets to demonstrate that the proposed optimization algorithm converges rapidly and thereby improves the pattern recognition performance.
Submitted 20 August, 2020;
originally announced August 2020.
-
Positionality-Weighted Aggregation Methods for Cumulative Voting
Authors:
Takeshi Kato,
Yasuhiro Asa,
Misa Owa
Abstract:
Respecting minority opinions is vital in solving social problems. However, minority opinions are often ignored under general majority rules. To build consensus on pluralistic values and make social choices that consider minority opinions, we propose aggregation methods that weight the minority's positionality in cardinal cumulative voting. Based on quadratic and linear voting, we formulated three weighted aggregation methods that differ in the ratio of votes to cumulative points and the weighting of the minority to all members, and, assuming that the distributions of votes follow normal distributions, we calculated the frequency distributions of the aggregation results. We found that minority opinions are more likely to be reflected proportionately to the average of the distribution in two of the three methods. This implies that Sen and Gotoh's idea of considering the social position of unfortunate people on ordinal ranking in welfare economics was illustrated by weighting the minority's positionality on cardinal voting. In addition, it is possible to visualize the number and positionality of the minority from the analysis of the aggregation results. These results will be useful for promoting mutual understanding between the majority and minority by interactively visualizing the contents of the proposed aggregation methods in the consensus-building process. With the further development of information technology, consensus building based on big data will become necessary. We recommend the use of our proposed aggregation methods to make social choices for pluralistic values such as social, environmental, and economic values.
Submitted 22 February, 2021; v1 submitted 19 August, 2020;
originally announced August 2020.
-
Adaptive Signal Variances: CNN Initialization Through Modern Architectures
Authors:
Takahiko Henmi,
Esmeraldo Ronnie Rey Zara,
Yoshihiro Hirohashi,
Tsuyoshi Kato
Abstract:
Deep convolutional neural networks (CNNs) have earned unwavering confidence in their performance on image processing tasks. The CNN architecture comprises a variety of layer types, including the convolution layer and the max-pooling layer. CNN practitioners widely understand that the stability of learning depends on how the model parameters in each layer are initialized. Nowadays, it is widely accepted that the de facto standard initialization scheme is the so-called Kaiming initialization developed by He et al. The Kaiming scheme was derived from a much simpler model than the CNN structures currently in use, which have evolved since the scheme emerged. The Kaiming model consists only of the convolution and fully connected layers, ignoring the max-pooling layer and the global average pooling layer. In this study, we rederived the initialization scheme not from the simplified Kaiming model but directly from modern CNN architectures, and empirically investigated how the new initialization method performs compared to the de facto standard ones that are widely used today.
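For reference, the baseline that the abstract calls Kaiming initialization draws weights from a zero-mean Gaussian with standard deviation sqrt(2 / fan_in) for ReLU networks. The sketch below illustrates only this baseline, not the new scheme derived in the paper; the function names and the flat weight list are illustrative assumptions.

```python
import math
import random


def conv_fan_in(in_channels, kernel_height, kernel_width):
    """Number of inputs feeding one output unit of a convolution layer."""
    return in_channels * kernel_height * kernel_width


def kaiming_std(fan_in):
    """Standard deviation of the Kaiming (He) normal initializer for ReLU."""
    return math.sqrt(2.0 / fan_in)


def init_conv_weights(out_channels, in_channels, kernel_height, kernel_width, seed=0):
    """Sample a flat list of convolution weights from N(0, 2 / fan_in)."""
    rng = random.Random(seed)
    std = kaiming_std(conv_fan_in(in_channels, kernel_height, kernel_width))
    count = out_channels * in_channels * kernel_height * kernel_width
    return [rng.gauss(0.0, std) for _ in range(count)]


weights = init_conv_weights(out_channels=8, in_channels=3, kernel_height=3, kernel_width=3)
print(len(weights), kaiming_std(conv_fan_in(3, 3, 3)))
```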
Submitted 29 August, 2020; v1 submitted 16 August, 2020;
originally announced August 2020.
-
Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition
Authors:
Takuma Kato,
Kaori Abe,
Hiroki Ouchi,
Shumpei Miyawaki,
Jun Suzuki,
Kentaro Inui
Abstract:
In general, the labels used in sequence labeling consist of different types of elements. For example, IOB-format entity labels, such as B-Person and I-Person, can be decomposed into span (B and I) and type information (Person). However, while most sequence labeling models do not consider such label components, the shared components across labels, such as Person, can be beneficial for label prediction. In this work, we propose to integrate label component information as embeddings into models. Through experiments on English and Japanese fine-grained named entity recognition, we demonstrate that the proposed method improves performance, especially for instances with low-frequency labels.
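The label decomposition described in the abstract can be illustrated with a few lines of code. This is a minimal sketch of splitting IOB-format labels into span and type components, not the authors' model; the function name `split_label` and the handling of the outside label `O` are assumptions.

```python
def split_label(label):
    """Split an IOB-format label into (span, type) components."""
    if label == "O":
        # The outside label carries no entity type.
        return ("O", None)
    # Split on the first hyphen only, so hyphenated type names stay intact.
    span, _, entity_type = label.partition("-")
    return (span, entity_type)


for label in ["B-Person", "I-Person", "O"]:
    print(label, split_label(label))
```

Here `B-Person` and `I-Person` share the type component `Person`, which is the kind of shared information the abstract proposes to exploit through component embeddings.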
Submitted 4 June, 2020; v1 submitted 1 June, 2020;
originally announced June 2020.
-
Solving tiling puzzles with quantum annealing
Authors:
Asa Eagle,
Takumi Kato,
Yuichiro Minato
Abstract:
To solve tiling puzzles, such as "pentomino" or "tetromino" puzzles, we need to find the correct solutions out of numerous combinations of rotations or piece locations. Solving this kind of combinatorial optimization problem is very difficult in computational science, and quantum computing is expected to play an important role in this field. In this article, we propose a method and obtain specific formulas to find solutions for tetromino tiling puzzles using a quantum annealer. In addition, we evaluate these formulas using a simulator and the actual DW2000Q hardware.
Submitted 3 April, 2019;
originally announced April 2019.
-
Pebble Exchange Group of Graphs
Authors:
Tatsuoki Kato,
Tomoki Nakamigawa,
Tadashi Sakuma
Abstract:
A graph puzzle ${\rm Puz}(G)$ of a graph $G$ is defined as follows. A configuration of ${\rm Puz}(G)$ is a bijection from the set of vertices of a board graph to the set of vertices of a pebble graph, both graphs being isomorphic to some input graph $G$. A move of pebbles is defined as exchanging two pebbles which are adjacent on both a board graph and a pebble graph. For a pair of configurations $f$ and $g$, we say that $f$ is equivalent to $g$ if $f$ can be transformed into $g$ by a finite sequence of moves.
Let ${\rm Aut}(G)$ be the automorphism group of $G$, and let ${\rm 1}_G$ be the unit element of ${\rm Aut}(G)$. The pebble exchange group of $G$, denoted by ${\rm Peb}(G)$, is defined as the set of all automorphisms $f$ of $G$ such that ${\rm 1}_G$ and $f$ are equivalent to each other.
In this paper, some basic properties of ${\rm Peb}(G)$ are studied. Among other results, it is shown that for any connected graph $G$, all automorphisms of $G$ are contained in ${\rm Peb}(G^2)$, where $G^2$ is a square graph of $G$.
Submitted 29 March, 2021; v1 submitted 31 March, 2019;
originally announced April 2019.
-
Landscape of IoT Patterns
Authors:
Hironori Washizaki,
Nobukazu Yoshioka,
Atsuo Hazeyama,
Takehisa Kato,
Haruhiko Kaiya,
Shinpei Ogata,
Takao Okubo,
Eduardo B. Fernandez
Abstract:
Patterns are encapsulations of problems and solutions under specific contexts. As the industry is realizing many successes (and failures) in IoT systems development and operations, many IoT patterns have been published such as IoT design patterns and IoT architecture patterns. Because these patterns are not well classified, their adoption does not live up to their potential. To understand the reasons, this paper analyzes an extensive set of published IoT architecture and design patterns according to several dimensions and outlines directions for improvements in publishing and adopting IoT patterns.
Submitted 25 February, 2019;
originally announced February 2019.
-
Investigating context features hidden in End-to-End TTS
Authors:
Kohki Mametani,
Tsuneo Kato,
Seiichi Yamamoto
Abstract:
Recent studies have introduced end-to-end TTS, which integrates the production of context and acoustic features in statistical parametric speech synthesis. As a result, a single neural network replaced laborious feature engineering with automated feature learning. However, little is known about what types of context information end-to-end TTS extracts from text input before synthesizing speech, and the previous knowledge about context features is barely utilized. In this work, we first point out the model similarity between end-to-end TTS and parametric TTS. Based on the similarity, we evaluate the quality of encoder outputs from an end-to-end TTS system against eight criteria that are derived from a standard set of context information used in parametric TTS. We conduct experiments using an evaluation procedure that has been newly developed in the machine learning literature for quantitative analysis of neural representations, while adapting it to the TTS domain. Experimental results show that the encoder outputs reflect both linguistic and phonetic contexts, such as vowel reduction at phoneme level, lexical stress at syllable level, and part-of-speech at word level, possibly due to the joint optimization of context and acoustic features.
Submitted 25 February, 2019; v1 submitted 4 November, 2018;
originally announced November 2018.
-
Filling Missing Paths: Modeling Co-occurrences of Word Pairs and Dependency Paths for Recognizing Lexical Semantic Relations
Authors:
Koki Washio,
Tsuneaki Kato
Abstract:
Recognizing lexical semantic relations between word pairs is an important task for many applications of natural language processing. One of the mainstream approaches to this task is to exploit the lexico-syntactic paths connecting two target words, which reflect the semantic relations of word pairs. However, this method requires that the considered words co-occur in a sentence. This requirement is hardly satisfied because of Zipf's law, which states that most content words occur very rarely. In this paper, we propose novel methods with a neural model of $P(path|w_1, w_2)$ to solve this problem. Our proposed model of $P(path|w_1, w_2)$ can be learned in an unsupervised manner and can generalize the co-occurrences of word pairs and dependency paths. This model can be used to augment the path data of word pairs that do not co-occur in the corpus, and extract features capturing relational information from word pairs. Our experimental results demonstrate that our methods improve on previous neural approaches based on dependency paths and successfully solve the focused problem.
Submitted 10 September, 2018;
originally announced September 2018.
-
Neural Latent Relational Analysis to Capture Lexical Semantic Relations in a Vector Space
Authors:
Koki Washio,
Tsuneaki Kato
Abstract:
Capturing the semantic relations of words in a vector space contributes to many natural language processing tasks. One promising approach exploits lexico-syntactic patterns as features of word pairs. In this paper, we propose a novel model of this pattern-based approach, neural latent relational analysis (NLRA). NLRA can generalize co-occurrences of word pairs and lexico-syntactic patterns, and obtain embeddings of the word pairs that do not co-occur. This overcomes the critical data sparseness problem encountered in previous pattern-based models. Our experimental results on measuring relational similarity demonstrate that NLRA outperforms the previous pattern-based models. In addition, when combined with a vector offset model, NLRA achieves a performance comparable to that of the state-of-the-art model that exploits additional semantic relational data.
Submitted 10 September, 2018;
originally announced September 2018.
-
Decreasing the size of the Restricted Boltzmann machine
Authors:
Yohei Saito,
Takuya Kato
Abstract:
We propose a method to decrease the number of hidden units of the restricted Boltzmann machine while avoiding decrease of the performance measured by the Kullback-Leibler divergence. Then, we demonstrate our algorithm by using numerical simulations.
Submitted 12 December, 2018; v1 submitted 9 July, 2018;
originally announced July 2018.
-
Parametric Models for Mutual Kernel Matrix Completion
Authors:
Rachelle Rivero,
Tsuyoshi Kato
Abstract:
Recent studies utilize multiple kernel learning to deal with the incomplete-data problem. In this study, we introduce new methods that not only complete multiple incomplete kernel matrices simultaneously, but also allow control of the flexibility of the model by parameterizing the model matrix. By imposing restrictions on the model covariance, overfitting of the data is avoided. A limitation of kernel matrix estimation via optimization of an objective function is that the positive definiteness of the result is not guaranteed. In view of this limitation, our proposed methods employ the LogDet divergence, which ensures the positive definiteness of the resulting inferred kernel matrix. We empirically show that our proposed restricted covariance models, employed with LogDet divergence, yield significant improvements in generalization performance over previous completion methods.
Submitted 17 April, 2018;
originally announced April 2018.
-
Threshold Auto-Tuning Metric Learning
Authors:
Yuya Onuma,
Rachelle Rivero,
Tsuyoshi Kato
Abstract:
It has been reported repeatedly that discriminative learning of a distance metric boosts pattern recognition performance. A weak point of ITML-based methods is that the distance threshold for similarity/dissimilarity constraints must be determined manually and generalization performance is sensitive to it, although ITML-based methods enjoy the advantage that the Bregman projection framework can be applied to optimize the distance metric. In this paper, we present a new formulation of the metric learning algorithm in which the distance threshold is optimized together with the metric. Since the optimization is still in the Bregman projection framework, the Dykstra algorithm can be applied for optimization. A nonlinear equation has to be solved to project the solution onto a half-space in each iteration. A naïve method takes $O(LMn^{3})$ computational time to solve the nonlinear equation. In this study, we found an efficient technique that can solve the nonlinear equation in $O(Mn^{3})$. We have proved that the root exists and is unique. We empirically show that the pattern recognition accuracy of the proposed metric learning algorithm is comparable to that of existing metric learning methods, while the distance threshold is tuned automatically.
Submitted 12 February, 2018; v1 submitted 6 January, 2018;
originally announced January 2018.
-
Reinforcement Learning of Speech Recognition System Based on Policy Gradient and Hypothesis Selection
Authors:
Taku Kato,
Takahiro Shinozaki
Abstract:
Speech recognition systems have achieved high recognition performance for several tasks. However, the performance of such systems is dependent on the tremendously costly development work of preparing vast amounts of task-matched transcribed speech data for supervised training. The key problem here is the cost of transcribing speech data. The cost is repeatedly required to support new languages and new tasks. Assuming broad network services for transcribing speech data for many users, a system would become more self-sufficient and more useful if it possessed the ability to learn from very light feedback from the users without annoying them. In this paper, we propose a general reinforcement learning framework for speech recognition systems based on the policy gradient method. As a particular instance of the framework, we also propose a hypothesis selection-based reinforcement learning method. The proposed framework provides a new view for several existing training and adaptation methods. The experimental results show that the proposed method improves the recognition performance compared to unsupervised adaptation.
Submitted 9 November, 2017;
originally announced November 2017.