Bridging Predictive Coding and MDL: A Two-Part Code Framework for Deep Learning

Authors: Benjamin Prada, Shion Matsumoto, Abdul Malik Zekri, Ankur Mali

Abstract: We present the first theoretical framework that connects predictive coding (PC), a biologically inspired local learning rule, with the minimum description length (MDL) principle in deep networks. We prove that layerwise PC performs block-coordinate descent on the MDL two-part code objective, thereby jointly minimizing empirical risk and model complexity. Using Hoeffding's inequality and a prefix-code prior, we derive a novel generalization bound of the form $R(θ) \le \hat{R}(θ) + \frac{L(θ)}{N}$, capturing the tradeoff between fit and compression. We further prove that each PC sweep monotonically decreases the empirical two-part codelength, yielding tighter high-probability risk bounds than unconstrained gradient descent. Finally, we show that repeated PC updates converge to a block-coordinate stationary point, providing an approximate MDL-optimal solution. To our knowledge, this is the first result offering formal generalization and convergence guarantees for PC-trained deep models, positioning PC as a theoretically grounded and biologically plausible alternative to backpropagation.

Submitted 20 May, 2025; originally announced May 2025.

Comments: 24 pages, 2 figures

arXiv:2505.00763

JFlow: Model-Independent Spherical Jeans Analysis using Equivariant Continuous Normalizing Flows

Authors: Sung Hak Lim, Kohei Hayashi, Shun'ichi Horigome, Shigeki Matsumoto, Mihoko M. Nojiri

Abstract: The kinematics of stars in dwarf spheroidal galaxies have been studied to understand the structure of dark matter halos. However, the kinematic information of these stars is often limited to celestial positions and line-of-sight velocities, making full phase space analysis challenging. Conventional methods rely on projected analytic phase space density models with several parameters and infer dark matter halo structures by solving the spherical Jeans equation. In this paper, we introduce an unsupervised machine learning method for solving the spherical Jeans equation in a model-independent way as a first step toward model-independent analysis of dwarf spheroidal galaxies. Using equivariant continuous normalizing flows, we demonstrate that spherically symmetric stellar phase space densities and velocity dispersions can be estimated without model assumptions. As a proof of concept, we apply our method to Gaia challenge datasets for spherical models and measure dark matter mass densities for given velocity anisotropy profiles. Our method can identify halo structures accurately, even with a small number of tracer stars.

Submitted 2 June, 2025; v1 submitted 1 May, 2025; originally announced May 2025.

Comments: 10 pages, 3 figures, 1 table, revised version for the journal submission

Report number: CTPU-PTC-25-15

arXiv:2504.18150

Toward Automated Test Generation for Dockerfiles Based on Analysis of Docker Image Layers

Authors: Yuki Goto, Shinsuke Matsumoto, Shinji Kusumoto

Abstract: Docker has gained attention as a lightweight container-based virtualization platform. The process for building a Docker image is defined in a text file called a Dockerfile. A Dockerfile can be considered as a kind of source code that contains instructions on how to build a Docker image. Its behavior should be verified through testing, as is done for source code in a general programming language. For source code in languages such as Java, search-based test generation techniques have been proposed. However, existing automated test generation techniques cannot be applied to Dockerfiles. Since a Dockerfile does not contain branches, the coverage metric, typically used as an objective function in existing methods, becomes meaningless. In this study, we propose an automated test generation method for Dockerfiles based on processing results rather than processing steps. The proposed method determines which files should be tested and generates the corresponding tests based on an analysis of Dockerfile instructions and Docker image layers. The experimental results show that the proposed method can reproduce over 80% of the tests created by developers.

Submitted 25 April, 2025; originally announced April 2025.

Comments: The paper has been peer-reviewed and accepted for publication in the proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering (EASE 2025)

arXiv:2305.13469

MAILEX: Email Event and Argument Extraction

Authors: Saurabh Srivastava, Gaurav Singh, Shou Matsumoto, Ali Raz, Paulo Costa, Joshua Poore, Ziyu Yao

Abstract: In this work, we present the first dataset, MailEx, for performing event extraction from conversational email threads. To this end, we first proposed a new taxonomy covering 10 event types and 76 arguments in the email domain. Our final dataset includes 1.5K email threads and ~4K emails, which are annotated with totally ~8K event instances. To understand the task challenges, we conducted a series of experiments comparing three types of approaches, i.e., fine-tuned sequence labeling, fine-tuned generative extraction, and few-shot in-context learning. Our results showed that the task of email event extraction is far from being addressed, due to challenges lying in, e.g., extracting non-continuous, shared trigger spans, extracting non-named entity arguments, and modeling the email conversational history. Our work thus suggests more future investigations in this domain-specific event extraction task.

Submitted 20 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

Comments: Accepted at EMNLP 2023

arXiv:2203.10218

Shared Control in Human Robot Teaming: Toward Context-Aware Communication

Authors: Sachiko Matsumoto, Laurel D. Riek

Abstract: In the field of Human-Robot Interaction (HRI), many researchers study shared control systems. Shared control is when a person and agent both contribute to the performance of a task in a collaborative way, often by providing control inputs for a robot. One of the most important things in shared control is the nature of the communication between the person and robot, which could help the human-robot team handle a variety of challenging situations. In this paper, we identify key challenges in shared control, in which better communication design could be useful, including when encountering novel situations and contexts, resolving tensions between preferences and performance, and alleviating cognitive burden and interruptions. Through the use of four exemplar shared control scenarios, we explore how well-designed human-robot communication strategies could help address each challenge. We hope this paper will draw attention to these challenges and assist researchers in advancing the development of shared control systems.

Submitted 18 March, 2022; originally announced March 2022.

Comments: Appears in Proceedings of the AAAI SSS-22 Symposium "Closing the Assessment Loop: Communicating Proficiency and Intent in Human-Robot Teaming"

arXiv:2203.06393

doi 10.1007/978-3-030-62466-8_11

G2GML: Graph to Graph Mapping Language for Bridging RDF and Property Graphs

Authors: Hirokazu Chiba, Ryota Yamanaka, Shota Matsumoto

Abstract: How can we maximize the value of accumulated RDF data? Whereas the RDF data can be queried using the SPARQL language, even the SPARQL-based operation has a limitation in implementing traversal or analytical algorithms. Recently, a variety of database implementations dedicated to analyses on the property graph (PG) model have emerged. Importing RDF datasets into these graph analysis engines provides access to the accumulated datasets through various application interfaces. However, the RDF model and the PG model are not interoperable. Here, we developed a framework based on the Graph to Graph Mapping Language (G2GML) for mapping RDF graphs to PGs to make the most of accumulated RDF data. Using this framework, accumulated graph data described in the RDF model can be converted to the PG model, which can then be loaded to graph database engines for further analysis. For supporting different graph database implementations, we redefined the PG model and proposed its exchangeable serialization formats. We demonstrate several use cases, where publicly available RDF data are extracted and converted to PGs. This study bridges RDF and PGs and contributes to interoperable management of knowledge graphs, thereby expanding the use cases of accumulated RDF data.

Submitted 12 March, 2022; originally announced March 2022.

Journal ref: The Semantic Web - ISWC 2020 pp. 160-175

arXiv:1907.03936

Property Graph Exchange Format

Authors: Hirokazu Chiba, Ryota Yamanaka, Shota Matsumoto

Abstract: Recently, a variety of database implementations adopting the property graph model have emerged. However, interoperable management of graph data on these implementations is challenging due to the differences in data models and formats. Here, we redefine the property graph model incorporating the differences in the existing models and propose interoperable serialization formats for property graphs. The model is independent of specific implementations and provides a basis of interoperable management of property graph data. The proposed serialization is not only general but also intuitive, thus it is useful for creating and maintaining graph data. To demonstrate the practical use of our model and serialization, we implemented converters from our serialization into existing formats, which can then be loaded into various graph databases. This work provides a basis of an interoperable platform for creating, exchanging, and utilizing property graph data.

Submitted 8 July, 2019; originally announced July 2019.

Comments: 4 pages

arXiv:1904.12958

Predictive Situation Awareness for Ebola Virus Disease using a Collective Intelligence Multi-Model Integration Platform: Bayes Cloud

Authors: Cheol Young Park, Shou Matsumoto, Jubyung Ha, YoungWon Park

Abstract: The humanity has been facing a plethora of challenges associated with infectious diseases, which kill more than 6 million people a year. Although continuous efforts have been applied to relieve the potential damages from such misfortunate events, it is unquestionable that there are many persisting challenges yet to overcome. One related issue we particularly address here is the assessment and prediction of such epidemics. In this field of study, traditional and ad-hoc models frequently fail to provide proper predictive situation awareness (PSAW), characterized by understanding the current situations and predicting the future situations. Comprehensive PSAW for infectious disease can support decision making and help to hinder disease spread. In this paper, we develop a computing system platform focusing on collective intelligence causal modeling, in order to support PSAW in the domain of infectious disease. Analyses of global epidemics require integration of multiple different data and models, which can be originated from multiple independent researchers. These models should be integrated to accurately assess and predict the infectious disease in terms of holistic view. The system shall provide three main functions: (1) collaborative causal modeling, (2) causal model integration, and (3) causal model reasoning. These functions are supported by subject-matter expert and artificial intelligence (AI), with uncertainty treatment. Subject-matter experts, as collective intelligence, develop causal models and integrate them as one joint causal model. The integrated causal model shall be used to reason about: (1) the past, regarding how the causal factors have occurred; (2) the present, regarding how the spread is going now; and (3) the future, regarding how it will proceed. Finally, we introduce one use case of predictive situation awareness for the Ebola virus disease.

Submitted 4 May, 2019; v1 submitted 29 April, 2019; originally announced April 2019.

arXiv:1812.01801

Mapping RDF Graphs to Property Graphs

Authors: Shota Matsumoto, Ryota Yamanaka, Hirokazu Chiba

Abstract: Increasing amounts of scientific and social data are published in the Resource Description Framework (RDF). Although the RDF data can be queried using the SPARQL language, even the SPARQL-based operation has a limitation in implementing traversal or analytical algorithms. Recently, a variety of graph database implementations dedicated to analyses on the property graph model have emerged. However, the RDF model and the property graph model are not interoperable. Here, we developed a framework based on the Graph to Graph Mapping Language (G2GML) for mapping RDF graphs to property graphs to make the most of accumulated RDF data. Using this framework, graph data described in the RDF model can be converted to the property graph model and can be loaded to several graph database engines for further analysis. Future works include implementing and utilizing graph algorithms to make the most of the accumulated data in various analytical engines.

Submitted 4 December, 2018; originally announced December 2018.

Comments: 4 pages, 4 figures

arXiv:1806.02415

doi 10.3390/app9102055

Gaussian Mixture Reduction for Time-Constrained Approximate Inference in Hybrid Bayesian Networks

Authors: Cheol Young Park, Kathryn Blackmond Laskey, Paulo C. G. Costa, Shou Matsumoto

Abstract: Hybrid Bayesian Networks (HBNs), which contain both discrete and continuous variables, arise naturally in many application areas (e.g., image understanding, data fusion, medical diagnosis, fraud detection). This paper concerns inference in an important subclass of HBNs, the conditional Gaussian (CG) networks, in which all continuous random variables have Gaussian distributions and all children of continuous random variables must be continuous. Inference in CG networks can be NP-hard even for special-case structures, such as poly-trees, where inference in discrete Bayesian networks can be performed in polynomial time. Therefore, approximate inference is required. In approximate inference, it is often necessary to trade off accuracy against solution time. This paper presents an extension to the Hybrid Message Passing inference algorithm for general CG networks and an algorithm for optimizing its accuracy given a bound on computation time. The extended algorithm uses Gaussian mixture reduction to prevent an exponential increase in the number of Gaussian mixture components. The trade-off algorithm performs pre-processing to find optimal run-time settings for the extended algorithm. Experimental results for four CG networks compare performance of the extended algorithm with existing algorithms and show the optimal settings for these CG networks.

Submitted 6 June, 2018; originally announced June 2018.

Journal ref: Appl. Sci. 2019, 9, 2055

arXiv:1506.03392

Designing a Global Authentication Infrastructure

Authors: Stephanos Matsumoto, Raphael M. Reischuk, Pawel Szalachowski, Tiffany Hyun-Jin Kim, Adrian Perrig

Abstract: We address the problem of scaling authentication for naming, routing, and end-entity certification to a global environment in which authentication policies and users' sets of trust roots vary widely. The current mechanisms for authenticating names (DNSSEC), routes (BGPSEC), and end-entity certificates (TLS) do not support a coexistence of authentication policies, affect the entire Internet when compromised, cannot update trust root information efficiently, and do not provide users with the ability to make flexible trust decisions. We propose a Scalable Authentication Infrastructure for Next-generation Trust (SAINT), which partitions the Internet into groups with common, local trust roots, and isolates the effects of a compromised trust root. SAINT requires groups with direct routing connections to cross-sign each other for authentication purposes, allowing diverse authentication policies while keeping all entities globally verifiable. SAINT makes trust root management a central part of the network architecture, enabling trust root updates within seconds and allowing users to make flexible trust decisions. SAINT operates without a significant performance penalty and can be deployed alongside existing infrastructures.

Submitted 12 June, 2015; v1 submitted 10 June, 2015; originally announced June 2015.

arXiv:1405.7178

Artificial Wrestling: A Dynamical Formulation of Autonomous Agents Fighting in a Coupled Inverted Pendula Framework

Authors: Katsutoshi Yoshida, Shigeki Matsumoto, Yoichi Matsue

Abstract: We develop autonomous agents fighting with each other, inspired by human wrestling. For this purpose, we propose a coupled inverted pendula (CIP) framework in which: 1) tips of two inverted pendulums are linked by a connection rod, 2) each pendulum is primarily stabilized by a PD-controller, 3) and is additionally equipped with an intelligent controller. Based on this framework, we dynamically formulate an intelligent controller designed to store dynamical correspondence from initial states to final states of the CIP model, to receive state vectors of the model, and to output impulsive control forces to produce desired final states of the model. Developing a quantized and reduced order design of this controller, we have a practical control procedure based on an off-line learning method. We then conduct numerical simulations to investigate individual performance of the intelligent controller, showing that the performance can be improved by adding a delay element into the intelligent controller. The result shows that the performance depends not only on quantization resolutions of learning data but also on delay time of the delay element. Finally, we install the intelligent controllers into both pendulums in the proposed framework to demonstrate autonomous competitive behavior between inverted pendulums.

Submitted 19 October, 2015; v1 submitted 28 May, 2014; originally announced May 2014.

Comments: The 12th International Conference on Motion and Vibration Control (MOVIC 2014), August 3-7, 2014, Sapporo, Japan. This article was selected as an article of Mechanical Engineering Journal after minor revisions; the final version is available at http://dx.doi.org/10.1299/mej.14-00518

Showing 1–12 of 12 results for author: Matsumoto, S