Search | arXiv e-print repository

Foundation Models for CPS-IoT: Opportunities and Challenges

Authors: Ozan Baris, Yizhuo Chen, Gaofeng Dong, Liying Han, Tomoyoshi Kimura, Pengrui Quan, Ruijie Wang, Tianchen Wang, Tarek Abdelzaher, Mario Bergés, Paul Pu Liang, Mani Srivastava

Abstract: Methods from machine learning (ML) have transformed the implementation of Perception-Cognition-Communication-Action loops in Cyber-Physical Systems (CPS) and the Internet of Things (IoT), replacing mechanistic and basic statistical models with those derived from data. However, the first generation of ML approaches, which depend on supervised learning with annotated data to create task-specific models, faces significant limitations in scaling to the diverse sensor modalities, deployment configurations, application tasks, and operating dynamics characterizing real-world CPS-IoT systems. The success of task-agnostic foundation models (FMs), including multimodal large language models (LLMs), in addressing similar challenges across natural language, computer vision, and human speech has generated considerable enthusiasm for and exploration of FMs and LLMs as flexible building blocks in CPS-IoT analytics pipelines, promising to reduce the need for costly task-specific engineering. Nonetheless, a significant gap persists between the current capabilities of FMs and LLMs in the CPS-IoT domain and the requirements they must meet to be viable for CPS-IoT applications. In this paper, we analyze and characterize this gap through a thorough examination of the state of the art and our research, which extends beyond it in various dimensions. Based on the results of our analysis and research, we identify essential desiderata that CPS-IoT domain-specific FMs and LLMs must satisfy to bridge this gap. We also propose actions by CPS-IoT researchers to collaborate in developing key community resources necessary for establishing FMs and LLMs as foundational tools for the next generation of CPS-IoT systems.

Submitted 4 February, 2025; v1 submitted 22 January, 2025; originally announced January 2025.

arXiv:2404.02461

On the Efficiency and Robustness of Vibration-based Foundation Models for IoT Sensing: A Case Study

Authors: Tomoyoshi Kimura, Jinyang Li, Tianshi Wang, Denizhan Kara, Yizhuo Chen, Yigong Hu, Ruijie Wang, Maggie Wigness, Shengzhong Liu, Mani Srivastava, Suhas Diggavi, Tarek Abdelzaher

Abstract: This paper demonstrates the potential of vibration-based Foundation Models (FMs), pre-trained with unlabeled sensing data, to improve the robustness of run-time inference in (a class of) IoT applications. A case study is presented featuring a vehicle classification application using acoustic and seismic sensing. The work is motivated by the success of foundation models in the areas of natural language processing and computer vision, leading to generalizations of the FM concept to other domains as well, where significant amounts of unlabeled data exist that can be used for self-supervised pre-training. One such domain is IoT applications. Foundation models for selected sensing modalities in the IoT domain can be pre-trained in an environment-agnostic fashion using available unlabeled sensor data and then fine-tuned to the deployment at hand using a small amount of labeled data. The paper shows that the pre-training/fine-tuning approach improves the robustness of downstream inference and facilitates adaptation to different environmental conditions. More specifically, we present a case study in a real-world setting to evaluate a simple (vibration-based) FM-like model, called FOCAL, demonstrating its superior robustness and adaptation, compared to conventional supervised deep neural networks (DNNs). We also demonstrate its superior convergence over supervised solutions. Our findings highlight the advantages of vibration-based FMs (and FM-inspired self-supervised models in general) in terms of inference robustness, runtime efficiency, and model adaptation (via fine-tuning) in resource-limited IoT settings.

Submitted 3 April, 2024; originally announced April 2024.

arXiv:2403.11992

Sub-photon accuracy noise reduction of single shot coherent diffraction pattern with atomic model trained autoencoder

Authors: Takuto Ishikawa, Yoko Takeo, Kai Sakurai, Kyota Yoshinaga, Noboru Furuya, Yuichi Inubushi, Kensuke Tono, Yasumasa Joti, Makina Yabashi, Takashi Kimura, Kazuyoshi Yoshimi

Abstract: Single-shot imaging with femtosecond X-ray lasers is a powerful measurement technique that can achieve both high spatial and temporal resolution. However, its accuracy has been severely limited by the difficulty of applying conventional noise-reduction processing. This study uses deep learning to validate noise reduction techniques, with autoencoders serving as the learning model. Focusing on the diffraction patterns of nanoparticles, we simulated a large dataset treating the nanoparticles as composed of many independent atoms. Three neural network architectures are investigated: neural network, convolutional neural network and U-net, with U-net showing superior performance in noise reduction and subphoton reproduction. We also extended our models to apply to diffraction patterns of particle shapes different from those in the simulated data. We then applied the U-net model to a coherent diffractive imaging study, wherein a nanoparticle in a microfluidic device is exposed to a single X-ray free-electron laser pulse. After noise reduction, the reconstructed nanoparticle image improved significantly even though the nanoparticle shape was different from the training data, highlighting the importance of transfer learning.

Submitted 18 March, 2024; originally announced March 2024.

Comments: 17 pages, 10 figures

arXiv:2103.11789

Time-Domain Hybrid PAM for Data-Rate and Distance Adaptive UWOC System

Authors: T. Kodama, M. Aizat, F. Kobori, T. Kimura, Y. Inoue, M. Jinno

Abstract: The challenge for next-generation underwater optical wireless communication systems is to develop optical transceivers that can operate with low power consumption by maximizing the transmission capacity according to the transmission distance between transmitters and receivers. This study proposes an underwater wireless optical communication (UWOC) system using an optical transceiver with an optimum transmission rate for the deep sea with near-pure water properties. As a method for actualizing an optical transceiver with an optimum transmission rate in a UWOC system, time-domain hybrid pulse amplitude modulation (PAM) (TDHP) using a transmission rate and distance-adaptive intensity modulation/direct detection optical transceiver is considered. In the TDHP method, variable transmission capacity is actualized while changing the generation ratio of two intensity-modulated signals with different noise immunities in the time domain. Three different color laser diodes (LDs), red, blue, and green are used in an underwater channel transmission transceiver that comprises the LD and a photodiode. The maximum transmission distance while changing the incidence of PAM 2 and PAM 4 signals that calibrate the TDHP in a pure transmission line and how the maximum transmission distance changes when the optical transmitter/receiver spatial optical system is altered from the optimum conditions are clarified based on numerical calculation and simulation. To the best knowledge of the authors, there is no other research on data-rate and distance adaptive UWOC system that applies the TDHP signal with power optimization between two modulation formats.

Submitted 8 March, 2021; originally announced March 2021.
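
Note: the abstract above describes TDHP as varying the ratio of PAM 2 and PAM 4 symbols in the time domain. A minimal sketch of the resulting rate arithmetic follows. It relies only on the fact that a PAM 2 symbol carries 1 bit and a PAM 4 symbol carries 2 bits; the function names, the `pam4_fraction` parameter, and the example symbol rate are illustrative assumptions, not values from the paper.

```python
def tdhp_bits_per_symbol(pam4_fraction: float) -> float:
    """Average bits per symbol for a time-interleaved PAM 2 / PAM 4 stream.

    pam4_fraction is the (assumed) share of symbol slots carrying PAM 4;
    the remaining slots carry PAM 2.
    """
    if not 0.0 <= pam4_fraction <= 1.0:
        raise ValueError("pam4_fraction must lie in [0, 1]")
    pam2_bits = 1  # log2(2) bits per PAM 2 symbol
    pam4_bits = 2  # log2(4) bits per PAM 4 symbol
    return (1.0 - pam4_fraction) * pam2_bits + pam4_fraction * pam4_bits


def tdhp_bit_rate(symbol_rate_baud: float, pam4_fraction: float) -> float:
    """Raw bit rate (bit/s) at a given symbol rate and PAM 4 share."""
    return symbol_rate_baud * tdhp_bits_per_symbol(pam4_fraction)


if __name__ == "__main__":
    # Hypothetical 1 GBd link swept from pure PAM 2 to pure PAM 4.
    for fraction in (0.0, 0.25, 0.5, 1.0):
        print(fraction, tdhp_bit_rate(1e9, fraction))
```

Sweeping `pam4_fraction` from 0 to 1 moves the raw rate continuously between the PAM 2 and PAM 4 extremes. This is the capacity side of the rate-versus-distance trade-off the abstract refers to; the noise-immunity side is not modeled here.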

arXiv:2002.05339

Distributed Collaborative 3D-Deployment of UAV Base Stations for On-Demand Coverage

Authors: Tatsuaki Kimura, Masaki Ogura

Abstract: Deployment of unmanned aerial vehicles (UAVs) performing as flying aerial base stations (BSs) has a great potential of adaptively serving ground users during temporary events, such as major disasters and massive events. However, planning an efficient, dynamic, and 3D deployment of UAVs in adaptation to dynamically and spatially varying ground users is a highly complicated problem due to the complexity in air-to-ground channels and interference among UAVs. In this paper, we propose a novel distributed 3D deployment method for UAV-BSs in a downlink network for on-demand coverage. Our method consists mainly of the following two parts: sensing-aided crowd density estimation and distributed push-sum algorithm. The first part estimates the ground user density from its observation through on-ground sensors, thereby allowing us to avoid the computationally intensive process of obtaining the positions of all the ground users. On the basis of the estimated user density, in the second part, each UAV dynamically updates its 3D position in collaboration with its neighboring UAVs for maximizing the total coverage. We prove the convergence of our distributed algorithm by employing a distributed push-sum algorithm framework. Simulation results demonstrate that our method can improve the overall coverage with a limited number of ground sensors. We also demonstrate that our method can be applied to a dynamic network in which the density of ground users varies temporally.

Submitted 12 February, 2020; originally announced February 2020.

Comments: to appear in IEEE International Conference on Computer Communications 2020 (INFOCOM2020)
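
Note: the abstract above builds on a distributed push-sum framework. As background only, the sketch below shows the generic push-sum average-consensus iteration on a small directed graph. It is not the authors' 3D deployment method; the function name `push_sum_average`, the equal-share weighting, and the example graph are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def push_sum_average(
    values: Sequence[float],
    out_neighbors: Dict[int, List[int]],
    rounds: int = 200,
) -> List[float]:
    """Estimate the network-wide average at every node via push-sum.

    Each node i keeps a pair (x_i, w_i). In every round it splits both
    quantities equally among itself and its out-neighbors; the ratio
    x_i / w_i converges to the average of the initial values when the
    directed graph is strongly connected.
    """
    n = len(values)
    x = [float(v) for v in values]
    w = [1.0] * n
    for _ in range(rounds):
        new_x = [0.0] * n
        new_w = [0.0] * n
        for i in range(n):
            targets = [i] + list(out_neighbors.get(i, []))
            share = 1.0 / len(targets)  # equal split keeps total mass constant
            for j in targets:
                new_x[j] += share * x[i]
                new_w[j] += share * w[i]
        x, w = new_x, new_w
    return [xi / wi for xi, wi in zip(x, w)]


if __name__ == "__main__":
    # Hypothetical 4-node directed graph (strongly connected, not regular).
    graph = {0: [1, 3], 1: [2], 2: [0], 3: [2]}
    print(push_sum_average([1.0, 2.0, 3.0, 6.0], graph))
```

Only the sums of `x` and `w` are preserved across rounds, so the individual weights `w_i` drift away from 1 on an irregular graph. Dividing by `w_i` corrects for that drift, which is why push-sum works on directed networks where plain neighbor averaging would be biased.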

arXiv:1802.06882

Theoretical Framework for Estimating Target-Object Shape by Using Location-Unknown Mobile Distance Sensors

Authors: Hiroshi Saito, Tatsuaki Kimura

Abstract: This paper proposes a theoretical framework for estimating a target-object shape, the location of which is not given, by using mobile distance sensors the locations of which are also unknown. Typically, mobile sensors are mounted on vehicles. Each sensor continuously measures the distance from it to the target object. The proposed framework does not require any positioning function, anchor-location information, or additional mechanisms to obtain side information such as angle of arrival of signal. Under the assumption of a convex polygon target object, each edge length and vertex angle and their combinations are estimated and finally the shape of the target object is estimated. To the best of our knowledge, this is the first result in which a target-object shape was estimated by using the data of distance sensors without using their locations.

Submitted 4 March, 2018; v1 submitted 15 February, 2018; originally announced February 2018.

Showing 1–6 of 6 results for author: Kimura, T