Showing 1–50 of 86 results for author: Sakaguchi, K

  1. arXiv:2505.04231  [pdf, other]

    cs.RO cs.MA eess.SY

    Multi-Agent Reinforcement Learning-based Cooperative Autonomous Driving in Smart Intersections

    Authors: Taoyuan Yu, Kui Wang, Zongdian Li, Tao Yu, Kei Sakaguchi

    Abstract: Unsignalized intersections pose significant safety and efficiency challenges due to complex traffic flows. This paper proposes a novel roadside unit (RSU)-centric cooperative driving system leveraging global perception and vehicle-to-infrastructure (V2I) communication. The core of the system is an RSU-based decision-making module using a two-stage hybrid reinforcement learning (RL) framework. At f…

    Submitted 7 May, 2025; originally announced May 2025.

    Comments: 7 pages

  2. arXiv:2504.20542  [pdf, other]

    cs.OH

    Digital Twin-Empowered Cooperative Autonomous Car-sharing Services: Proof-of-Concept

    Authors: Kazuma Nonomura, Kui Wang, Zongdian Li, Tao Yu, Kei Sakaguchi

    Abstract: This paper presents a digital twin-empowered real-time optimal delivery system specifically validated through a proof-of-concept (PoC) demonstration of a real-world autonomous car-sharing service. This study integrates real-time data from roadside units (RSUs) and connected and autonomous vehicles (CAVs) within a digital twin of a campus environment to address the dynamic challenges of urban traff…

    Submitted 29 April, 2025; originally announced April 2025.

    Comments: The paper was accepted by the 36th IEEE Intelligent Vehicles Symposium (IEEE IV 2025)

  3. arXiv:2503.23899  [pdf, other]

    cs.CL

    Rubrik's Cube: Testing a New Rubric for Evaluating Explanations on the CUBE dataset

    Authors: Diana Galvan-Sosa, Gabrielle Gaudeau, Pride Kavumba, Yunmeng Li, Hongyi Gu, Zheng Yuan, Keisuke Sakaguchi, Paula Buttery

    Abstract: The performance and usability of Large-Language Models (LLMs) are driving their use in explanation generation tasks. However, despite their widespread adoption, LLM explanations have been found to be unreliable, making it difficult for users to distinguish good from bad explanations. To address this issue, we present Rubrik's CUBE, an education-inspired rubric and a dataset of 26k explanations, wr…

    Submitted 31 March, 2025; originally announced March 2025.

    Comments: 9 main pages (21 appendix pages), 7 figures, submitted to ACL 2025

    ACM Class: I.2.7

  4. arXiv:2503.03590  [pdf, other]

    cs.NI eess.SY

    Digital Twin-Enabled Blockage-Aware Dynamic mmWave Multi-Hop V2X Communication

    Authors: Supat Roongpraiwan, Zongdian Li, Tao Yu, Kei Sakaguchi

    Abstract: Millimeter wave (mmWave) technology in vehicle-to-everything (V2X) communication offers unprecedented data rates and low latency, but faces significant reliability challenges due to signal blockages and limited range. This paper introduces a novel system for managing dynamic multi-hop mmWave V2X communications in complex blocking environments. We present a system architecture that integrates a mob…

    Submitted 17 March, 2025; v1 submitted 5 March, 2025; originally announced March 2025.

  5. arXiv:2502.00344  [pdf, other]

    cs.CL

    FinchGPT: a Transformer based language model for birdsong analysis

    Authors: Kosei Kobayashi, Kosuke Matsuzaki, Masaya Taniguchi, Keisuke Sakaguchi, Kentaro Inui, Kentaro Abe

    Abstract: The long-range dependencies among the tokens, which originate from hierarchical structures, are a defining hallmark of human language. However, whether similar dependencies exist within the sequential vocalization of non-human animals remains a topic of investigation. Transformer architectures, known for their ability to model long-range dependencies among tokens, provide a powerful tool for inves…

    Submitted 1 February, 2025; originally announced February 2025.

    Comments: 12 pages, 4 figures

  6. arXiv:2501.15754  [pdf, other]

    cs.CL

    Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference

    Authors: Go Kamoda, Benjamin Heinzerling, Tatsuro Inaba, Keito Kudo, Keisuke Sakaguchi, Kentaro Inui

    Abstract: According to the stages-of-inference hypothesis, early layers of language models map their subword-tokenized input, which does not necessarily correspond to a linguistically meaningful segmentation, to more meaningful representations that form the model's "inner vocabulary". Prior analysis of this detokenization stage has predominantly relied on probing and interventions such as path patching, whi…

    Submitted 10 February, 2025; v1 submitted 26 January, 2025; originally announced January 2025.

    Comments: 22 pages, 14 figures, to appear in NAACL Findings 2025

  7. arXiv:2412.01113  [pdf, other]

    cs.CL

    Think-to-Talk or Talk-to-Think? When LLMs Come Up with an Answer in Multi-Step Arithmetic Reasoning

    Authors: Keito Kudo, Yoichi Aoki, Tatsuki Kuribayashi, Shusaku Sone, Masaya Taniguchi, Ana Brassard, Keisuke Sakaguchi, Kentaro Inui

    Abstract: This study investigates the internal reasoning process of language models during arithmetic multi-step reasoning, motivated by the question of when they internally form their answers during reasoning. Particularly, we inspect whether the answer is determined before or after chain-of-thought (CoT) begins to determine whether models follow a post-hoc Think-to-Talk mode or a step-by-step Talk-to-Thin…

    Submitted 17 April, 2025; v1 submitted 1 December, 2024; originally announced December 2024.

  8. arXiv:2411.06387  [pdf, other]

    cs.LG cs.AI cs.CL

    Self-Training Meets Consistency: Improving LLMs' Reasoning with Consistency-Driven Rationale Evaluation

    Authors: Jaehyeok Lee, Keisuke Sakaguchi, JinYeong Bak

    Abstract: Self-training approach for large language models (LLMs) improves reasoning abilities by training the models on their self-generated rationales. Previous approaches have labeled rationales that produce correct answers for a given question as appropriate for training. However, a single measure risks misjudging rationale quality, leading the models to learn flawed reasoning patterns. To address this…

    Submitted 6 February, 2025; v1 submitted 10 November, 2024; originally announced November 2024.

    Comments: Accepted to NAACL 2025

  9. arXiv:2410.12163  [pdf, other]

    eess.SY

    Augmented Intelligence in Smart Intersections: Local Digital Twins-Assisted Hybrid Autonomous Driving

    Authors: Kui Wang, Kazuma Nonomura, Zongdian Li, Tao Yu, Kei Sakaguchi, Omar Hashash, Walid Saad, Changyang She, Yonghui Li

    Abstract: Vehicle-road collaboration is a promising approach for enhancing the safety and efficiency of autonomous driving by extending the intelligence of onboard systems to smart roadside infrastructures. The introduction of digital twins (DTs), particularly local DTs (LDTs) at the edge, in smart mobility presents a new embodiment of augmented intelligence, which could enhance information exchange and ext…

    Submitted 18 October, 2024; v1 submitted 15 October, 2024; originally announced October 2024.

    Comments: 14 pages, 9 figures

  10. arXiv:2409.07232  [pdf, other]

    cs.DC cs.CE

    Optimizing the Weather Research and Forecasting Model with OpenMP Offload and Codee

    Authors: Chayanon Wichitrnithed, Woo-Sun Yang, Yun He, Brad Richardson, Koichi Sakaguchi, Manuel Arenaz, William I. Gustafson Jr., Jacob Shpund, Ulises Costi Blanco, Alvaro Goldar Dieste

    Abstract: Currently, the Weather Research and Forecasting model (WRF) utilizes shared memory (OpenMP) and distributed memory (MPI) parallelisms. To take advantage of GPU resources on the Perlmutter supercomputer at NERSC, we port parts of the computationally expensive routines of the Fast Spectral Bin Microphysics (FSBM) microphysical scheme to NVIDIA GPUs using OpenMP device offloading directives. To facil…

    Submitted 11 September, 2024; originally announced September 2024.

  11. arXiv:2409.00040  [pdf, other]

    cs.NI

    Digital Twin-Empowered Routing Management for Reliable Multi-Hop Millimeter Wave V2X

    Authors: Supat Roongpraiwan, Zongdian Li, Tao Yu, Kei Sakaguchi

    Abstract: Digital twin (DT) technology can replicate physical entities in cyberspace. A mobility DT digitalizes connected and autonomous vehicles (CAVs) and their surrounding traffic environment, allowing to monitor the maneuvering and distribution of CAVs in real-time, which is crucial for managing vehicle-to-everything (V2X) connectivity, especially when millimeter wave (mmWave) is adopted. MmWave V2X rel…

    Submitted 18 August, 2024; originally announced September 2024.

  12. arXiv:2408.03554  [pdf, other]

    cs.CL cs.CR cs.LG

    Empirical Analysis of Large Vision-Language Models against Goal Hijacking via Visual Prompt Injection

    Authors: Subaru Kimura, Ryota Tanaka, Shumpei Miyawaki, Jun Suzuki, Keisuke Sakaguchi

    Abstract: We explore visual prompt injection (VPI) that maliciously exploits the ability of large vision-language models (LVLMs) to follow instructions drawn onto the input image. We propose a new VPI method, "goal hijacking via visual prompt injection" (GHVPI), that swaps the execution task of LVLMs from an original task to an alternative task designated by an attacker. The quantitative analysis indicates…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: 8 pages, 6 figures, Accepted to NAACL 2024 SRW

  13. arXiv:2407.03963  [pdf, other]

    cs.CL cs.AI

    LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

    Authors: LLM-jp: Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, et al. (58 additional authors not shown)

    Abstract: This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its…

    Submitted 30 December, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  14. arXiv:2406.16078  [pdf, other]

    cs.CL

    First Heuristic Then Rational: Dynamic Use of Heuristics in Language Model Reasoning

    Authors: Yoichi Aoki, Keito Kudo, Tatsuki Kuribayashi, Shusaku Sone, Masaya Taniguchi, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Multi-step reasoning instruction, such as chain-of-thought prompting, is widely adopted to explore better language models (LMs) performance. We report on the systematic strategy that LMs employ in such a multi-step reasoning process. Our controlled experiments reveal that LMs rely more heavily on heuristics, such as lexical overlap, in the earlier stages of reasoning, where more reasoning steps re…

    Submitted 7 October, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

    Comments: This paper is accepted at EMNLP 2024

  15. arXiv:2406.06032  [pdf, other]

    cs.CL

    The Curse of Popularity: Popular Entities have Catastrophic Side Effects when Deleting Knowledge from Language Models

    Authors: Ryosuke Takahashi, Go Kamoda, Benjamin Heinzerling, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Language models (LMs) encode world knowledge in their internal parameters through training. However, LMs may learn personal and confidential information from the training data, leading to privacy concerns such as data leakage. Therefore, research on knowledge deletion from LMs is essential. This study focuses on the knowledge stored in LMs and analyzes the relationship between the side effects of…

    Submitted 10 June, 2024; originally announced June 2024.

  16. arXiv:2405.04818  [pdf, other]

    cs.CL

    ACORN: Aspect-wise Commonsense Reasoning Explanation Evaluation

    Authors: Ana Brassard, Benjamin Heinzerling, Keito Kudo, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Evaluating the quality of free-text explanations is a multifaceted, subjective, and labor-intensive task. Large language models (LLMs) present an appealing alternative due to their potential for consistency, scalability, and cost-efficiency. In this work, we present ACORN, a new dataset of 3,500 free-text explanations and aspect-wise quality ratings, and use it to evaluate how LLMs rate explanatio…

    Submitted 1 September, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

    Comments: 18 pages, 7 figures, accepted to COLM 2024. Data available here: https://github.com/a-brassard/ACORN

  17. arXiv:2405.03935  [pdf, other]

    eess.SY

    Roadside Units Assisted Localized Automated Vehicle Maneuvering: An Offline Reinforcement Learning Approach

    Authors: Kui Wang, Changyang She, Zongdian Li, Tao Yu, Yonghui Li, Kei Sakaguchi

    Abstract: Traffic intersections present significant challenges for the safe and efficient maneuvering of connected and automated vehicles (CAVs). This research proposes an innovative roadside unit (RSU)-assisted cooperative maneuvering system aimed at enhancing road safety and traveling efficiency at intersections for CAVs. We utilize RSUs for real-time traffic data acquisition and train an offline reinforc…

    Submitted 17 September, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: 6 pages, 6 figures

  18. arXiv:2403.08173  [pdf, other]

    cs.LO cs.DS cs.PL

    A bargain for mergesorts (functional pearl) -- How to prove your mergesort correct and stable, almost for free

    Authors: Cyril Cohen, Kazuhiko Sakaguchi

    Abstract: We present a novel characterization of stable mergesort functions using relational parametricity, and show that it implies the correctness of mergesort. As a result, one can prove the correctness of several variations of mergesort (e.g., top-down, bottom-up, tail-recursive, non-tail-recursive, smooth, and non-smooth mergesorts) by proving the characterization property for each variation. To furthe…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: The supplementary material is available at https://github.com/pi8027/stablesort

  19. arXiv:2402.14411  [pdf, other]

    cs.CL

    J-UniMorph: Japanese Morphological Annotation through the Universal Feature Schema

    Authors: Kosuke Matsuzaki, Masaya Taniguchi, Kentaro Inui, Keisuke Sakaguchi

    Abstract: We introduce a Japanese Morphology dataset, J-UniMorph, developed based on the UniMorph feature schema. This dataset addresses the unique and rich verb forms characteristic of the language's agglutinative nature. J-UniMorph distinguishes itself from the existing Japanese subset of UniMorph, which is automatically extracted from Wiktionary. On average, the Wiktionary Edition features around 12 infl…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: 14 pages, 4 figures

  20. Smart Mobility Digital Twin Based Automated Vehicle Navigation System: A Proof of Concept

    Authors: Kui Wang, Zongdian Li, Kazuma Nonomura, Tao Yu, Kei Sakaguchi, Omar Hashash, Walid Saad

    Abstract: Digital twins (DTs) have driven major advancements across various industrial domains over the past two decades. With the rapid advancements in autonomous driving and vehicle-to-everything (V2X) technologies, integrating DTs into vehicular platforms is anticipated to further revolutionize smart mobility systems. In this paper, a new smart mobility DT (SMDT) platform is proposed for the control of c…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 15 pages, 10 figures

  21. arXiv:2401.08654  [pdf, other]

    cs.RO eess.SY

    Smart Mobility Digital Twin for Automated Driving: Design and Proof-of-Concept

    Authors: Kui Wang, Zongdian Li, Tao Yu, Kei Sakaguchi

    Abstract: During the past decade, smart mobility and intelligent vehicles have attracted increasing attention, because they promise to create a highly efficient and safe transportation system in the future. Meanwhile, digital twin, as an emerging technology, will play an important role in automated driving and intelligent transportation systems. This technology is applied in this paper to design a platform…

    Submitted 24 December, 2023; originally announced January 2024.

  22. arXiv:2401.08653  [pdf, other]

    cs.NI

    Digital Twins for Autonomous Driving: A Comprehensive Implementation and Demonstration

    Authors: Kui Wang, Tao Yu, Zongdian Li, Kei Sakaguchi, Omar Hashash, Walid Saad

    Abstract: The concept of a digital twin (DT) plays a pivotal role in the ongoing digital transformation and has achieved significant strides for various wireless applications in recent years. In particular, the field of autonomous vehicles is a domain that is ripe for exploiting the concept of DT. Nevertheless, there are many challenges that include holistic consideration and integration of hardware, softwa…

    Submitted 24 December, 2023; originally announced January 2024.

    Comments: 7 pages, 8 figures

  23. Internet of Federated Digital Twins (IoFDT): Connecting Twins Beyond Borders for Society 5.0

    Authors: Tao Yu, Zongdian Li, Kei Sakaguchi, Omar Hashash, Walid Saad, Merouane Debbah

    Abstract: The concept of digital twin (DT), which enables the creation of a programmable, digital representation of physical systems, is expected to revolutionize future industries and will lie at the heart of the vision of a future smart society, namely, Society 5.0, in which high integration between cyber (digital) and physical spaces is exploited to bring economic and societal advancements. However, the…

    Submitted 27 October, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Journal ref: IEEE Internet of Things Magazine, vol.7, no.5, pp.64-71, Sept. 2024

  24. arXiv:2312.00334  [pdf, other]

    cs.NI eess.SP

    UAV-Aided Lifelong Learning for AoI and Energy Optimization in Non-Stationary IoT Networks

    Authors: Zhenzhen Gong, Omar Hashash, Yingze Wang, Qimei Cui, Wei Ni, Walid Saad, Kei Sakaguchi

    Abstract: In this paper, a novel joint energy and age of information (AoI) optimization framework for IoT devices in a non-stationary environment is presented. In particular, IoT devices that are distributed in the real-world are required to efficiently utilize their computing resources so as to balance the freshness of their data and their energy consumption. To optimize the performance of IoT devices in s…

    Submitted 30 November, 2023; originally announced December 2023.

    Comments: 15 pages, 14 figures

  25. arXiv:2310.17121  [pdf, other]

    cs.CL

    Test-time Augmentation for Factual Probing

    Authors: Go Kamoda, Benjamin Heinzerling, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Factual probing is a method that uses prompts to test if a language model "knows" certain world knowledge facts. A problem in factual probing is that small changes to the prompt can lead to large changes in model output. Previous work aimed to alleviate this problem by optimizing prompts via text mining or fine-tuning. However, such approaches are relation-specific and do not generalize to unseen…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: 12 pages, 4 figures, accepted to EMNLP 2023 Findings (short paper)

  26. arXiv:2305.19472  [pdf, other]

    cs.CL cs.AI cs.LG

    PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning

    Authors: Faeze Brahman, Chandra Bhagavatula, Valentina Pyatkin, Jena D. Hwang, Xiang Lorraine Li, Hirona J. Arai, Soumya Sanyal, Keisuke Sakaguchi, Xiang Ren, Yejin Choi

    Abstract: Procedural planning, which entails decomposing a high-level goal into a sequence of temporally ordered steps, is an important yet intricate task for machines. It involves integrating common-sense knowledge to reason about complex and often contextualized situations, e.g. "scheduling a doctor's appointment without a phone". While current approaches show encouraging results using large language mo…

    Submitted 18 September, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: ICLR 2024 version, 31 pages

  27. arXiv:2304.10282  [pdf, other]

    cs.IT cs.AI

    The Seven Worlds and Experiences of the Wireless Metaverse: Challenges and Opportunities

    Authors: Omar Hashash, Christina Chaccour, Walid Saad, Tao Yu, Kei Sakaguchi, Merouane Debbah

    Abstract: The wireless metaverse will create diverse user experiences at the intersection of the physical, digital, and virtual worlds. These experiences will enable novel interactions between the constituents (e.g., extended reality (XR) users and avatars) of the three worlds. However, remarkably, to date, there is no holistic vision that identifies the full set of metaverse worlds, constituents, and exper…

    Submitted 20 April, 2023; originally announced April 2023.

  28. arXiv:2303.18027  [pdf, other]

    cs.CL

    Evaluating GPT-4 and ChatGPT on Japanese Medical Licensing Examinations

    Authors: Jungo Kasai, Yuhei Kasai, Keisuke Sakaguchi, Yutaro Yamada, Dragomir Radev

    Abstract: As large language models (LLMs) gain popularity among speakers of diverse languages, we believe that it is crucial to benchmark them to better understand model behaviors, failures, and limitations in languages beyond English. In this work, we evaluate LLM APIs (ChatGPT, GPT-3, and GPT-4) on the Japanese national medical licensing examinations from the past five years, including the current year. O…

    Submitted 5 April, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: Added results from the March 2023 exam

  29. arXiv:2303.15381  [pdf, other]

    cs.CL

    Causal schema induction for knowledge discovery

    Authors: Michael Regan, Jena D. Hwang, Keisuke Sakaguchi, James Pustejovsky

    Abstract: Making sense of familiar yet new situations typically involves making generalizations about causal schemas, stories that help humans reason about event sequences. Reasoning about events includes identifying cause and effect relations shared across event instances, a process we refer to as causal schema induction. Statistical schema induction systems may leverage structural knowledge encoded in dis…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: 8 pages, appendix

  30. arXiv:2303.14342  [pdf, other]

    cs.CL

    Analyzing the Performance of GPT-3.5 and GPT-4 in Grammatical Error Correction

    Authors: Steven Coyne, Keisuke Sakaguchi, Diana Galvan-Sosa, Michael Zock, Kentaro Inui

    Abstract: GPT-3 and GPT-4 models are powerful, achieving high performance on a variety of Natural Language Processing tasks. However, there is a relative lack of detailed published analysis of their performance on the task of grammatical error correction (GEC). To address this, we perform experiments testing the capabilities of a GPT-3.5 model (text-davinci-003) and a GPT-4 model (gpt-4-0314) on major GEC b…

    Submitted 30 May, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

  31. arXiv:2302.08148  [pdf, other]

    cs.AI cs.CL

    Empirical Investigation of Neural Symbolic Reasoning Strategies

    Authors: Yoichi Aoki, Keito Kudo, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Neural reasoning accuracy improves when generating intermediate reasoning steps. However, the source of this improvement is yet unclear. Here, we investigate and factorize the benefit of generating intermediate steps for symbolic reasoning. Specifically, we decompose the reasoning strategy w.r.t. step granularity and chaining strategy. With a purely symbolic numerical reasoning dataset (e.g., A=1,…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: This paper was accepted to the Findings of EACL 2023; an earlier (non-archival) version of this work received the Best Paper Award at the Student Research Workshop of AACL 2022

  32. arXiv:2302.07866  [pdf, other]

    cs.CL cs.AI

    Do Deep Neural Networks Capture Compositionality in Arithmetic Reasoning?

    Authors: Keito Kudo, Yoichi Aoki, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, Kentaro Inui

    Abstract: Compositionality is a pivotal property of symbolic reasoning. However, how well recent neural models capture compositionality remains underexplored in the symbolic reasoning tasks. This study empirically addresses this question by systematically examining recently published pre-trained seq2seq models with a carefully controlled dataset of multi-hop arithmetic symbolic reasoning. We introduce a ski…

    Submitted 15 February, 2023; originally announced February 2023.

    Comments: accepted by EACL 2023

  33. arXiv:2212.09246  [pdf, other]

    cs.CL

    I2D2: Inductive Knowledge Distillation with NeuroLogic and Self-Imitation

    Authors: Chandra Bhagavatula, Jena D. Hwang, Doug Downey, Ronan Le Bras, Ximing Lu, Lianhui Qin, Keisuke Sakaguchi, Swabha Swayamdipta, Peter West, Yejin Choi

    Abstract: Commonsense capabilities of pre-trained language models dramatically improve with scale, leading many to believe that scale is the only winning recipe. But is it? Here, we investigate an alternative that a priori seems impossible: can smaller language models (e.g., GPT-2) win over models that are orders of magnitude larger and better (e.g., GPT-3), if powered with novel commonsense distillation al…

    Submitted 26 May, 2023; v1 submitted 18 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  34. arXiv:2211.14686  [pdf, other]

    cs.IT cs.NI

    Towards a Decentralized Metaverse: Synchronized Orchestration of Digital Twins and Sub-Metaverses

    Authors: Omar Hashash, Christina Chaccour, Walid Saad, Kei Sakaguchi, Tao Yu

    Abstract: Accommodating digital twins (DTs) in the metaverse is essential to achieving digital reality. This need for integrating DTs into the metaverse while operating them at the network edge has increased the demand for a decentralized edge-enabled metaverse. Hence, to consolidate the fusion between real and digital entities, it is necessary to harmonize the interoperability between DTs and the metaverse…

    Submitted 26 November, 2022; originally announced November 2022.

  35. arXiv:2211.02295  [pdf]

    cs.IT

    Experiment of Multi-UAV Full-Duplex System Equipped with Directional Antennas

    Authors: Tao Yu, Kento Kajiwara, Kiyomichi Araki, Kei Sakaguchi

    Abstract: One of the key enablers for the realization of a variety of unmanned aerial vehicle (UAV)-based systems is the high-performance communication system linking many UAVs and ground station. We have proposed a spectrum-efficient full-duplex directional-antennas-equipped multi-UAV communication system with low hardware complexity to address the issues of low spectrum efficiency caused by co-channel int…

    Submitted 4 November, 2022; originally announced November 2022.

    Comments: The paper was accepted by IEEE Consumer Communications & Networking Conference (CCNC) 2023

  36. arXiv:2207.13332  [pdf, other]

    cs.CL

    RealTime QA: What's the Answer Right Now?

    Authors: Jungo Kasai, Keisuke Sakaguchi, Yoichi Takahashi, Ronan Le Bras, Akari Asai, Xinyan Yu, Dragomir Radev, Noah A. Smith, Yejin Choi, Kentaro Inui

    Abstract: We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). REALTIME QA inquires about the current world, and QA systems need to answer questions about novel events or information. It therefore challenges static, conventional assumptions in open-domain QA datasets and pursues instantaneous applicat…

    Submitted 28 February, 2024; v1 submitted 27 July, 2022; originally announced July 2022.

    Comments: RealTime QA Website: https://realtimeqa.github.io/

  37. arXiv:2205.11484  [pdf, other]

    cs.CL

    Towards Automated Document Revision: Grammatical Error Correction, Fluency Edits, and Beyond

    Authors: Masato Mita, Keisuke Sakaguchi, Masato Hagiwara, Tomoya Mizumoto, Jun Suzuki, Kentaro Inui

    Abstract: Natural language processing technology has rapidly improved automated grammatical error correction tasks, and the community begins to explore document-level revision as one of the next challenges. To go beyond sentence-level automated grammatical error correction to NLP-based document-level revision assistant, there are two major obstacles: (1) there are few public corpora with document-level revi…

    Submitted 23 May, 2022; originally announced May 2022.

    Comments: 14 pages

  38. arXiv:2205.09273  [pdf, other]

    cs.CL

    Twist Decoding: Diverse Generators Guide Each Other

    Authors: Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Hao Peng, Ximing Lu, Dragomir Radev, Yejin Choi, Noah A. Smith

    Abstract: Many language generation models are now available for a wide range of generation tasks, including machine translation and summarization. Combining such diverse models may lead to further progress, but ensembling generation models is challenging during inference: conventional ensembling methods (e.g., shallow fusion) require that the models share vocabulary/tokenization schemes. We introduce Twist…

    Submitted 28 October, 2022; v1 submitted 18 May, 2022; originally announced May 2022.

    Comments: Proc. of EMNLP 2022

  39. arXiv:2205.00395  [pdf, other]

    cs.CL

    ELQA: A Corpus of Metalinguistic Questions and Answers about English

    Authors: Shabnam Behzad, Keisuke Sakaguchi, Nathan Schneider, Amir Zeldes

    Abstract: We present ELQA, a corpus of questions and answers in and about the English language. Collected from two online forums, the >70k questions (from English learners and others) cover wide-ranging topics including grammar, meaning, fluency, and etymology. The answers include descriptions of general properties of English vocabulary and grammar as well as explanations about specific (correct and incorre…

    Submitted 3 July, 2023; v1 submitted 1 May, 2022; originally announced May 2022.

    Comments: Accepted to ACL 2023

  40. arXiv:2204.05424  [pdf, other]

    cs.CL

    A Call for Clarity in Beam Search: How It Works and When It Stops

    Authors: Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Dragomir Radev, Yejin Choi, Noah A. Smith

    Abstract: Text generation with beam search has proven successful in a wide range of applications. We point out that, though largely overlooked in the literature, the commonly-used implementation of beam decoding (e.g., Hugging Face Transformers and fairseq) uses a first come, first served heuristic: it keeps a set of already completed sequences over time steps and stops when the size of this set reaches the…

    Submitted 28 February, 2024; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: LREC-COLING 2024

  41. arXiv:2202.04330  [pdf, other]

    cs.PL cs.LO

    Reflexive tactics for algebra, revisited

    Authors: Kazuhiko Sakaguchi

    Abstract: Computational reflection allows us to turn verified decision procedures into efficient automated reasoning tools in proof assistants. The typical applications of such methodology include mathematical structures that have decidable theory fragments, e.g., equational theories of commutative rings and lattices. However, such existing tools are known not to cooperate with packed classes, a methodology…

    Submitted 9 February, 2022; originally announced February 2022.

    Comments: Under review

  42. Context-Based MEC Platform for Augmented-Reality Services in 5G Networks

    Authors: Yue Wang, Tao Yu, Kei Sakaguchi

    Abstract: Augmented reality (AR) has drawn great attention in recent years. However, current AR devices have drawbacks, e.g., weak computation ability and large power consumption. To solve the problem, mobile edge computing (MEC) can be introduced as a key technology to offload data and computation from AR devices to MEC servers via 5th Generation Mobile Communication Technology (5G) networks. To this end,…

    Submitted 3 February, 2022; originally announced February 2022.

    Comments: Accepted in VTC 2021 Fall

  43. arXiv:2202.00177  [pdf

    cs.IT

    Spectrum Sharing between Directional-Antenna-Equipped UAV System and Terrestrial Systems

    Authors: Tao Yu, Kento Kajiwara, Kiyomichi Araki, Kei Sakaguchi

    Abstract: Unmanned aerial vehicles (UAVs)-based applications, such as surveillance systems and wireless relays, are attracting increasing attention from academia and industrial fields. The high-performance aerial communication system is one of the key enablers for them. However, due to the low attenuation of radio waves in the air-to-ground channels, the interference between aerial and terrestrial communica…

    Submitted 31 January, 2022; originally announced February 2022.

    Comments: This paper was accepted by IEEE Annual Computing and Communication Workshop and Conference (CCWC) 2022

  44. arXiv:2202.00176  [pdf

    cs.IT

    Full-Duplex Aerial Communication System for Multiple UAVs with Directional Antennas

    Authors: Tao Yu, Kiyomichi Araki, Kei Sakaguchi

    Abstract: UAV-based wireless systems, such as wireless relay and remote sensing, have attracted great attention from academia and industry. To realize them, a high-performance wireless aerial communication system, which bridges UAVs and ground stations, is one of the key enablers. However, there are still issues hindering its development, such as the severe co-channel interference among UAVs, and the limit…

    Submitted 31 January, 2022; originally announced February 2022.

    Comments: The paper was accepted by IEEE Consumer Communications & Networking Conference (CCNC) 2022

  45. arXiv:2202.00175  [pdf

    cs.IT

    Ground Experiment of Full-Duplex Multi-UAV System Enabled by Directional Antennas

    Authors: Tao Yu, Kiyomichi Araki, Kei Sakaguchi

    Abstract: A high performance multi-UAV communication system, which bridges multiple UAVs and ground station, is one of the key enablers to realize a variety of UAV-based systems. To address the issues such as the low spectrum efficiency caused by the co-channel interference, we have proposed a spectrum-efficient full-duplex multi-UAV communication system with low hardware complexity. In this paper, on-grou…

    Submitted 31 January, 2022; originally announced February 2022.

    Comments: This paper was accepted by IEEE Annual Computing and Communication Workshop and Conference (CCWC) 2022

  46. arXiv:2112.07867  [pdf, other

    cs.AI

    Interscript: A dataset for interactive learning of scripts through error feedback

    Authors: Niket Tandon, Aman Madaan, Peter Clark, Keisuke Sakaguchi, Yiming Yang

    Abstract: How can an end-user provide feedback if a deployed structured prediction model generates inconsistent output, ignoring the structural complexity of human language? This is an emerging topic with recent progress in synthetic or constrained settings, and the next big leap would require testing and tuning models in real-world settings. We present a new dataset, Interscript, containing user feedback o…

    Submitted 15 December, 2021; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: AAAI'22-Workshop on Interactive Machine Learning

  47. arXiv:2112.04139  [pdf, other

    cs.CL

    Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand

    Authors: Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Lavinia Dunagan, Jacob Morrison, Alexander R. Fabbri, Yejin Choi, Noah A. Smith

    Abstract: Natural language processing researchers have identified limitations of evaluation methodology for generation tasks, with new questions raised about the validity of automatic metrics and of crowdworker judgments. Meanwhile, efforts to improve generation models tend to depend on simple n-gram overlap metrics (e.g., BLEU, ROUGE). We argue that new advances on models and metrics should each more direc…

    Submitted 18 May, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: Proc. of NAACL 2022

  48. arXiv:2111.08940  [pdf, other

    cs.CL cs.CV

    Transparent Human Evaluation for Image Captioning

    Authors: Jungo Kasai, Keisuke Sakaguchi, Lavinia Dunagan, Jacob Morrison, Ronan Le Bras, Yejin Choi, Noah A. Smith

    Abstract: We establish THumB, a rubric-based human evaluation protocol for image captioning models. Our scoring rubrics and their definitions are carefully developed based on machine- and human-generated captions on the MSCOCO dataset. Each caption is evaluated along two main dimensions in a tradeoff (precision and recall) as well as other aspects that measure the text quality (fluency, conciseness, and inc…

    Submitted 18 May, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

    Comments: Proc. of NAACL 2022

  49. arXiv:2111.02609  [pdf, ps, other

    cond-mat.quant-gas

    Hydrodynamic generation of skyrmions in a two-component Bose-Einstein condensate

    Authors: Kyoshiro Sakaguchi, Keisuke Jimbo, Hiroki Saito

    Abstract: When an obstacle is moved in a superfluid faster than a critical velocity, quantized vortices are generated behind the obstacle. Here we propose a method to create more complicated topological excitations, three-dimensional skyrmions, behind a moving obstacle. We numerically show that, in a two-component Bose-Einstein condensate, component-dependent obstacle potentials can generate skyrmions in th…

    Submitted 3 November, 2021; originally announced November 2021.

    Comments: 7 pages, 5 figures, 8 movies

  50. arXiv:2110.07574  [pdf, other

    cs.CL

    Can Machines Learn Morality? The Delphi Experiment

    Authors: Liwei Jiang, Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jenny Liang, Jesse Dodge, Keisuke Sakaguchi, Maxwell Forbes, Jon Borchardt, Saadia Gabriel, Yulia Tsvetkov, Oren Etzioni, Maarten Sap, Regina Rini, Yejin Choi

    Abstract: As AI systems become increasingly powerful and pervasive, there are growing concerns about machines' morality or a lack thereof. Yet, teaching morality to machines is a formidable task, as morality remains among the most intensely debated questions in humanity, let alone for AI. Existing AI systems deployed to millions of users, however, are already making decisions loaded with moral implications,…

    Submitted 12 July, 2022; v1 submitted 14 October, 2021; originally announced October 2021.