-
MoiréTac: A Dual-Mode Visuotactile Sensor for Multidimensional Perception Using Moiré Pattern Amplification
Authors:
Kit-Wa Sou,
Junhao Gong,
Shoujie Li,
Chuqiao Lyu,
Ziwu Song,
Shilong Mu,
Wenbo Ding
Abstract:
Visuotactile sensors typically employ sparse marker arrays that limit spatial resolution and lack clear analytical force-to-image relationships. To solve this problem, we present MoiréTac, a dual-mode sensor that generates dense interference patterns via overlapping micro-gratings within a transparent architecture. When two gratings overlap with misalignment, they create moiré patterns that amplify microscopic deformations. The design preserves optical clarity for vision tasks while producing continuous moiré fields for tactile sensing, enabling simultaneous 6-axis force/torque measurement, contact localization, and visual perception. We combine physics-based features (brightness, phase gradient, orientation, and period) from moiré patterns with deep spatial features. These are mapped to 6-axis force/torque measurements, enabling interpretable regression through end-to-end learning. Experimental results demonstrate three capabilities: force/torque measurement with R^2 > 0.98 across tested axes; sensitivity tuning through geometric parameters (threefold gain adjustment); and vision functionality for object classification despite moiré overlay. Finally, we integrate the sensor into a robotic arm for cap removal with coordinated force and torque control, validating its potential for dexterous manipulation.
Submitted 16 September, 2025;
originally announced September 2025.
-
Energy-Efficient Secure Communications via Joint Optimization of UAV Trajectory and Movable-Antenna Array Beamforming
Authors:
Sanghyeok Kim,
Jinu Gong,
Joonhyuk Kang
Abstract:
This paper investigates the potential of unmanned aerial vehicles (UAVs) equipped with movable-antenna (MA) arrays to strengthen security in wireless communication systems. We propose a novel framework that jointly optimizes the UAV trajectory and the reconfigurable beamforming of the MA array to maximize secrecy energy efficiency, while ensuring reliable communication with legitimate users. By exploiting the spatial degrees of freedom enabled by the MA array, the system can form highly directional beams and deep nulls, thereby significantly improving physical layer security. Numerical results demonstrate that the proposed approach achieves superior secrecy energy efficiency, attributed to the enhanced spatial flexibility provided by the movable antenna architecture.
Submitted 27 July, 2025;
originally announced July 2025.
-
Debunking Optimization Myths in Federated Learning for Medical Image Classification
Authors:
Youngjoon Lee,
Hyukjoon Lee,
Jinu Gong,
Yang Cao,
Joonhyuk Kang
Abstract:
Federated Learning (FL) is a collaborative learning method that enables decentralized model training while preserving data privacy. Despite its promise in medical imaging, recent FL methods are often sensitive to local factors such as optimizers and learning rates, limiting their robustness in practical deployments. In this work, we revisit vanilla FL to clarify the impact of edge device configurations, benchmarking recent FL methods on colorectal pathology and blood cell classification tasks. We numerically show that the choice of local optimizer and learning rate has a greater effect on performance than the specific FL method. Moreover, we find that increasing local training epochs can either enhance or impair convergence, depending on the FL method. These findings indicate that appropriate edge-specific configuration is more crucial than algorithmic complexity for achieving effective FL.
Submitted 26 July, 2025;
originally announced July 2025.
-
Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model
Authors:
Ailin Huang,
Bingxin Li,
Bruce Wang,
Boyong Wu,
Chao Yan,
Chengli Feng,
Heng Wang,
Hongyu Zhou,
Hongyuan Wang,
Jingbei Li,
Jianjian Sun,
Joanna Wang,
Mingrui Chen,
Peng Liu,
Ruihang Miao,
Shilei Jiang,
Tian Fei,
Wang You,
Xi Chen,
Xuerui Yang,
Yechang Huang,
Yuxiang Zhang,
Zheng Ge,
Zheng Gong,
Zhewei Huang
, et al. (51 additional authors not shown)
Abstract:
Large Audio-Language Models (LALMs) have significantly advanced intelligent human-computer interaction, yet their reliance on text-based outputs limits their ability to generate natural speech responses directly, hindering seamless audio interactions. To address this, we introduce Step-Audio-AQAA, a fully end-to-end LALM designed for Audio Query-Audio Answer (AQAA) tasks. The model integrates a dual-codebook audio tokenizer for linguistic and semantic feature extraction, a 130-billion-parameter backbone LLM, and a neural vocoder for high-fidelity speech synthesis. Our post-training approach employs interleaved token-output of text and audio to enhance semantic coherence and combines Direct Preference Optimization (DPO) with model merging to improve performance. Evaluations on the StepEval-Audio-360 benchmark demonstrate that Step-Audio-AQAA excels especially in speech control, outperforming state-of-the-art LALMs in key areas. This work contributes a promising solution for end-to-end LALMs and highlights the critical role of the token-based vocoder in enhancing overall performance for AQAA tasks.
Submitted 13 June, 2025; v1 submitted 10 June, 2025;
originally announced June 2025.
-
ACE-Step: A Step Towards Music Generation Foundation Model
Authors:
Junmin Gong,
Sean Zhao,
Sen Wang,
Shengyuan Xu,
Joe Guo
Abstract:
We introduce ACE-Step, a novel open-source foundation model for music generation that overcomes key limitations of existing approaches and achieves state-of-the-art performance through a holistic architectural design. Current methods face inherent trade-offs between generation speed, musical coherence, and controllability. For example, LLM-based models (e.g. Yue, SongGen) excel at lyric alignment but suffer from slow inference and structural artifacts. Diffusion models (e.g. DiffRhythm), on the other hand, enable faster synthesis but often lack long-range structural coherence. ACE-Step bridges this gap by integrating diffusion-based generation with Sana's Deep Compression AutoEncoder (DCAE) and a lightweight linear transformer. It also leverages MERT and m-hubert to align semantic representations (REPA) during training, allowing rapid convergence. As a result, our model synthesizes up to 4 minutes of music in just 20 seconds on an A100 GPU (15x faster than LLM-based baselines) while achieving superior musical coherence and lyric alignment across melody, harmony, and rhythm metrics. Moreover, ACE-Step preserves fine-grained acoustic details, enabling advanced control mechanisms such as voice cloning, lyric editing, remixing, and track generation (e.g. lyric2vocal, singing2accompaniment). Rather than building yet another end-to-end text-to-music pipeline, our vision is to establish a foundation model for music AI: a fast, general-purpose, efficient yet flexible architecture that makes it easy to train subtasks on top of it. This paves the way for the development of powerful tools that seamlessly integrate into the creative workflows of music artists, producers, and content creators. In short, our goal is to build a stable diffusion moment for music. The code, the model weights and the demo are available at: https://ace-step.github.io/.
Submitted 28 May, 2025;
originally announced June 2025.
-
Improving Generalizability of Kolmogorov-Arnold Networks via Error-Correcting Output Codes
Authors:
Youngjoon Lee,
Jinu Gong,
Joonhyuk Kang
Abstract:
Kolmogorov-Arnold Networks (KAN) offer universal function approximation using univariate spline compositions without nonlinear activations. In this work, we integrate Error-Correcting Output Codes (ECOC) into the KAN framework to transform multi-class classification into multiple binary tasks, improving robustness via Hamming distance decoding. Our proposed KAN with ECOC framework outperforms vanilla KAN on a challenging blood cell classification dataset, achieving higher accuracy across diverse hyperparameter settings. Ablation studies further confirm that ECOC consistently enhances performance across FastKAN and FasterKAN variants. These results demonstrate that ECOC integration significantly boosts KAN generalizability in critical healthcare AI applications. To the best of our knowledge, this is the first work to combine ECOC with KAN to enhance multi-class medical image classification performance.
Submitted 17 September, 2025; v1 submitted 9 May, 2025;
originally announced May 2025.
-
A Unified Benchmark of Federated Learning with Kolmogorov-Arnold Networks for Medical Imaging
Authors:
Youngjoon Lee,
Jinu Gong,
Joonhyuk Kang
Abstract:
Federated Learning (FL) enables model training across decentralized devices without sharing raw data, thereby preserving privacy in sensitive domains like healthcare. In this paper, we evaluate Kolmogorov-Arnold Networks (KAN) architectures against traditional MLP across six state-of-the-art FL algorithms on a blood cell classification dataset. Notably, our experiments demonstrate that KAN can effectively replace MLP in federated environments, achieving superior performance with simpler architectures. Furthermore, we analyze the impact of key hyperparameters (grid size and network architecture) on KAN performance under varying degrees of Non-IID data distribution. In addition, our ablation studies reveal that optimizing KAN width while maintaining minimal depth yields the best performance in federated settings. As a result, these findings establish KAN as a promising alternative for privacy-preserving medical imaging applications in distributed healthcare. To the best of our knowledge, this is the first comprehensive benchmark of KAN in FL settings for medical imaging tasks.
Submitted 17 September, 2025; v1 submitted 28 April, 2025;
originally announced April 2025.
-
Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction
Authors:
Ailin Huang,
Boyong Wu,
Bruce Wang,
Chao Yan,
Chen Hu,
Chengli Feng,
Fei Tian,
Feiyu Shen,
Jingbei Li,
Mingrui Chen,
Peng Liu,
Ruihang Miao,
Wang You,
Xi Chen,
Xuerui Yang,
Yechang Huang,
Yuxiang Zhang,
Zheng Gong,
Zixin Zhang,
Hongyu Zhou,
Jianjian Sun,
Brian Li,
Chengting Feng,
Changyi Wan,
Hanpeng Hu
, et al. (120 additional authors not shown)
Abstract:
Real-time speech interaction, serving as a fundamental interface for human-machine collaboration, holds immense potential. However, current open-source models face limitations such as high costs in voice data collection, weakness in dynamic control, and limited intelligence. To address these challenges, this paper introduces Step-Audio, the first production-ready open-source solution. Key contributions include: 1) a 130B-parameter unified speech-text multi-modal model that achieves unified understanding and generation, with the Step-Audio-Chat version open-sourced; 2) a generative speech data engine that establishes an affordable voice cloning framework and produces the open-sourced lightweight Step-Audio-TTS-3B model through distillation; 3) an instruction-driven fine control system enabling dynamic adjustments across dialects, emotions, singing, and RAP; 4) an enhanced cognitive architecture augmented with tool calling and role-playing abilities to manage complex tasks effectively. Based on our new StepEval-Audio-360 evaluation benchmark, Step-Audio achieves state-of-the-art performance in human evaluations, especially in terms of instruction following. On open-source benchmarks like LLaMA Question, it shows a 9.3% average performance improvement, demonstrating our commitment to advancing the development of open-source multi-modal language technologies. Our code and models are available at https://github.com/stepfun-ai/Step-Audio.
Submitted 18 February, 2025; v1 submitted 17 February, 2025;
originally announced February 2025.
-
Engineering-Oriented Design of Drift-Resilient MTJ Random Number Generator via Hybrid Control Strategies
Authors:
Ran Zhang,
Caihua Wan,
Yingqian Xu,
Xiaohan Li,
Raik Hoffmann,
Meike Hindenberg,
Shiqiang Liu,
Dehao Kong,
Shilong Xiong,
Shikun He,
Alptekin Vardar,
Qiang Dai,
Junlu Gong,
Yihui Sun,
Zejie Zheng,
Thomas Kämpfe,
Guoqiang Yu,
Xiufeng Han
Abstract:
Magnetic Tunnel Junctions (MTJs) have shown great promise as hardware sources for true random number generation (TRNG) due to their intrinsic stochastic switching behavior. However, practical deployment remains challenged by drift in switching probability caused by thermal fluctuations, device aging, and environmental instability. This work presents an engineering-oriented, drift-resilient MTJ-based TRNG architecture, enabled by a hybrid control strategy that combines self-stabilizing feedback with pulse width modulation. A key component is the Downcalibration-2 scheme, which updates the control parameter every two steps using only integer-resolution timing, ensuring excellent statistical quality without requiring bit discarding, pre-characterization, or external calibration. Extensive experimental measurements and numerical simulations demonstrate that this approach maintains stable randomness under dynamic temperature drift, using only simple digital logic. The proposed architecture offers high throughput, robustness, and scalability, making it well-suited for secure hardware applications, embedded systems, and edge computing environments.
Submitted 19 April, 2025; v1 submitted 25 January, 2025;
originally announced January 2025.
-
Generative AI-Powered Plugin for Robust Federated Learning in Heterogeneous IoT Networks
Authors:
Youngjoon Lee,
Jinu Gong,
Joonhyuk Kang
Abstract:
Federated learning enables edge devices to collaboratively train a global model while maintaining data privacy by keeping data localized. However, the Non-IID nature of data distribution across devices often hinders model convergence and reduces performance. In this paper, we propose a novel plugin for federated optimization techniques that approximates Non-IID data distributions to IID through generative AI-enhanced data augmentation and a balanced sampling strategy. The key idea is to synthesize additional data for underrepresented classes on each edge device, leveraging generative AI to create a more balanced dataset across the FL network. Additionally, a balanced sampling approach at the central server selectively includes only the most IID-like devices, accelerating convergence while maximizing the global model's performance. Experimental results validate that our approach significantly improves convergence speed and robustness against data imbalance, establishing a flexible, privacy-preserving FL plugin that is applicable even in data-scarce environments.
Submitted 25 April, 2025; v1 submitted 31 October, 2024;
originally announced October 2024.
-
RSMA Assisted ISAC With Hybrid Beamforming
Authors:
Zhuohui Yao,
Wenchi Cheng,
Liping Liang,
Tao Zhang,
Jun Gong
Abstract:
The harsh environment and scarce resources post-disaster drive the equipment to be miniaturized and portable. Based on this, integrated sensing and communication (ISAC) systems play a significant role in providing emergency wireless networks. In order to reduce the hardware cost, a hybrid beamforming (HBF) assisted millimeter-wave (mmWave) ISAC system, which exploits the limited number of radio frequency (RF) chains, is considered in this paper. However, the HBF structure reduces the spatial degrees of freedom, thus leading to increased interference among communication users and radar sensing. To solve this problem, a rate-splitting multiple access (RSMA) strategy is adopted to enhance the emergency mmWave-ISAC system. We formulate the weighted sum rate (WSR) maximization objective by jointly designing common rate allocation and HBF. Then, we propose the penalty dual decomposition (PDD) coupled with the weighted minimum mean squared error (WMMSE) method to solve this high-dimensional non-convex problem. Numerical results demonstrate the effectiveness of the proposed algorithm and show that the RSMA-ISAC scheme outperforms other benchmark schemes.
Submitted 1 September, 2025; v1 submitted 7 June, 2024;
originally announced June 2024.
-
Transfer Learning-Enhanced Instantaneous Multi-Person Indoor Localization by CSI
Authors:
Zhiyuan He,
Ke Deng,
Jiangchao Gong,
Yi Zhou,
Desheng Wang
Abstract:
Passive indoor localization, integral to smart buildings, emergency response, and indoor navigation, has traditionally been limited by a focus on single-target localization and reliance on multi-packet CSI. We introduce a novel Multi-target loss, notably enhancing multi-person localization. Utilizing this loss function, our instantaneous CSI-ResNet achieves an impressive 99.21% accuracy at 0.6m precision with single-timestamp CSI. A preprocessing algorithm is implemented to counteract WiFi-induced variability, thereby augmenting robustness. Furthermore, we incorporate Nuclear Norm-Based Transfer Pre-Training, ensuring adaptability in diverse environments, which provides a new paradigm for indoor multi-person localization. Additionally, we have developed an extensive dataset, surpassing existing ones in scope and diversity, to underscore the efficacy of our method and facilitate future fingerprint-based localization research.
Submitted 2 March, 2024;
originally announced March 2024.
-
Intelligent Agricultural Greenhouse Control System Based on Internet of Things and Machine Learning
Authors:
Cangqing Wang,
Jiangchuan Gong
Abstract:
This study endeavors to conceptualize and execute a sophisticated agricultural greenhouse control system grounded in the amalgamation of the Internet of Things (IoT) and machine learning. Through meticulous monitoring of intrinsic environmental parameters within the greenhouse and the integration of machine learning algorithms, the conditions within the greenhouse are aptly modulated. The envisaged outcome is an enhancement in crop growth efficiency and yield, accompanied by a reduction in resource wastage. In the backdrop of escalating global population figures and the escalating exigencies of climate change, agriculture confronts unprecedented challenges. Conventional agricultural paradigms have proven inadequate in addressing the imperatives of food safety and production efficiency. Against this backdrop, greenhouse agriculture emerges as a viable solution, proffering a controlled milieu for crop cultivation to augment yields, refine quality, and diminish reliance on natural resources [b1]. Nevertheless, greenhouse agriculture contends with a gamut of challenges. Traditional greenhouse management strategies, often grounded in experiential knowledge and predefined rules, lack targeted personalized regulation, thereby resulting in resource inefficiencies. The exigencies of real-time monitoring and precise control of the greenhouse's internal environment gain paramount importance with the burgeoning scale of agriculture. To redress this challenge, the study introduces IoT technology and machine learning algorithms into greenhouse agriculture, aspiring to institute an intelligent agricultural greenhouse control system conducive to augmenting the efficiency and sustainability of agricultural production.
Submitted 20 March, 2025; v1 submitted 14 February, 2024;
originally announced February 2024.
-
Radar detection of wake vortex behind the aircraft: the detection range problem
Authors:
Jiangkun Gong,
Jun Yan,
Deyong Kong,
Deren Li
Abstract:
In this study, we showcased the detection of the wake vortex produced by a medium aircraft at distances exceeding 10 km using an X-band pulse-Doppler radar. We analyzed radar signals within the range profiles behind a Boeing 737 aircraft on February 7, 2021, within the airspace of the Runway Protection Zone (RPZ) at Tianhe Airport, Wuhan, China. The findings revealed that the wake vortex extended up to 6 km from the aircraft, which is 10 km from the radar, displaying distinct stages characterized by scattering patterns and Doppler signatures. Despite the wake vortex exhibiting a scattering power approximately 10 dB lower than that of the aircraft, its Doppler Signal-to-Clutter Ratio (DSCR) values were only 5 dB lower, indicating a notably strong scattering power within a single radar bin. Additionally, certain radar parameters proved inconsistent in the stable detection and tracking of wake vortex, aligning with our earlier concept of cognitive micro-Doppler radar.
△ Less
Submitted 27 December, 2023;
originally announced December 2023.
-
An introduction to radar Automatic Target Recognition (ATR) technology in ground-based radar systems
Authors:
Jiangkun Gong,
Jun Yan,
Deyong Kong,
Deren Li
Abstract:
This paper presents a brief examination of Automatic Target Recognition (ATR) technology within ground-based radar systems. It offers a lucid comprehension of the ATR concept, delves into its historical milestones, and categorizes ATR methods according to different scattering regions. By incorporating ATR solutions into radar systems, this study demonstrates the expansion of radar detection ranges and the enhancement of tracking capabilities, leading to superior situational awareness. Drawing insights from the Russo-Ukrainian War, the paper highlights three pressing radar applications that urgently necessitate ATR technology: detecting stealth aircraft, countering small drones, and implementing anti-jamming measures. Anticipating the next wave of radar ATR research, the study predicts a surge in cognitive radar and machine learning (ML)-driven algorithms. These emerging methodologies aspire to confront challenges associated with system adaptation, real-time recognition, and environmental adaptability. Ultimately, ATR stands poised to revolutionize conventional radar systems, ushering in an era of 4D sensing capabilities.
Submitted 23 October, 2023;
originally announced October 2023.
-
Formation Wing-Beat Modulation (FWM): A Tool for Quantifying Bird Flocks Using Radar Micro-Doppler Signals
Authors:
Jiangkun Gong,
Jun Yan,
Deyong Kong,
Ruizhi Chen,
Deren Li
Abstract:
Radar echoes from bird flocks contain modulation signals, which we find are produced by the flapping gaits of birds in the flock, resulting in a group of spectral peaks with similar amplitudes spaced at a specific interval. We call this the formation wing-beat modulation (FWM) effect. FWM signals are micro-Doppler modulated by flapping wings and are related to the bird number, wing-beat frequency, and flight phasing strategy. Our X-band radar data show that FWM signals exist in radar signals of a seagull flock, providing tools for quantifying the bird number and estimating the mean wingbeat rate of birds. This new finding could aid in research on the quantification of bird migration numbers and estimation of bird flight behavior in radar ornithology and aero-ecology.
Submitted 27 September, 2023;
originally announced September 2023.
-
Resilient Controller Synthesis Against DoS Attacks for Vehicular Platooning in Spatial Domain
Authors:
Jian Gong,
Carlos Murguia,
Anggera Bayuwindra,
Jinde Cao
Abstract:
This paper proposes a vehicular platoon control approach under Denial-of-Service (DoS) attacks and external disturbances. DoS attacks increase the service time on the communication network and cause additional transmission delays, which consequently increase the risk of rear-end collisions of vehicles in the platoon. To counter DoS attacks, we propose a resilient control scheme that exploits polytopic overapproximations of the closed-loop dynamics under DoS attacks. This scheme allows synthesizing robust controllers that guarantee tracking of both the desired spacing policy and spatially varying reference velocity for all space-varying DoS attacks satisfying a hard upper bound on the attack duration. In addition, L2 string stability conditions are derived to ensure that external perturbations do not grow as they propagate through the platoon, thus ensuring the string stability. Numerical simulations illustrate the effectiveness of the proposed control method.
Submitted 28 July, 2023;
originally announced July 2023.
-
Introduction to Drone Detection Radar with Emphasis on Automatic Target Recognition (ATR) technology
Authors:
Jiangkun Gong,
Jun Yan,
Deyong Kong,
Deren Li
Abstract:
This paper discusses the challenges of detecting and categorizing small drones with radar automatic target recognition (ATR) technology. The authors suggest integrating ATR capabilities into drone detection radar systems to improve performance and manage emerging threats. The study focuses primarily on drones in Group 1 and 2. The paper highlights the need to consider kinetic features and signal signatures, such as micro-Doppler, in ATR techniques to efficiently recognize small drones. The authors also present a comprehensive drone detection radar system design that balances detection and tracking requirements, incorporating parameter adjustment based on scattering region theory. They offer an example of a performance improvement achieved using feedback and situational awareness mechanisms with the integrated ATR capabilities. Furthermore, the paper examines challenges related to one-way attack drones and explores the potential of cognitive radar as a solution. The integration of ATR capabilities transforms a 3D radar system into a 4D radar system, resulting in improved drone detection performance. These advancements are useful in military, civilian, and commercial applications, and ongoing research and development efforts are essential to keep radar systems effective and ready to detect, track, and respond to emerging threats.
Submitted 19 July, 2023;
originally announced July 2023.
-
Noise-to-Norm Reconstruction for Industrial Anomaly Detection and Localization
Authors:
Shiqi Deng,
Zhiyu Sun,
Ruiyan Zhuang,
Jun Gong
Abstract:
Anomaly detection has a wide range of applications and is especially important in industrial quality inspection. Currently, many top-performing anomaly-detection models rely on feature-embedding methods. However, these methods do not perform well on datasets with large variations in object locations. Reconstruction-based methods use reconstruction errors to detect anomalies without considering positional differences between samples. In this study, a reconstruction-based method using the noise-to-norm paradigm is proposed, which avoids the invariant reconstruction of anomalous regions. Our reconstruction network is based on M-net and incorporates multiscale fusion and residual attention modules to enable end-to-end anomaly detection and localization. Experiments demonstrate that the method is effective in reconstructing anomalous regions into normal patterns and achieving accurate anomaly detection and localization. On the MPDD and VisA datasets, our proposed method achieved more competitive results than the latest methods, and it set a new state-of-the-art standard on the MPDD dataset.
Submitted 6 July, 2023;
originally announced July 2023.
-
FOCUS : A framework for energy system optimization from prosumer to district and city scale
Authors:
Jingyu Gong,
Yi Nie,
Jonas van Ouwerkerk,
Felix Wege,
Mauricio Celi Cortés,
Christoph von Oy,
Jonas Brucksch,
Christian Bußar,
Thomas Schreiber,
Dirk Uwe Sauer,
Dirk Müller,
Antonello Monti
Abstract:
Decarbonizing the energy sector is one of the main challenges in combating the climate crisis. Cities play an important role in reaching climate neutrality, as more than 70% of global CO2 emissions originate from urban areas. Decarbonization of energy supply systems can be achieved through various means, including the use of renewable energy sources, improving the efficiency of technologies, the coupling of different energy sectors, and the use of flexibility considering individual prosumer behaviour. This leads to an increasingly decentralized energy system, which is challenging to operate in a robust and cost-effective way. Technologies and subsystems can only be evaluated from the perspective of the system in which they are embedded, and the evaluation is highly dependent on their networking and application scenarios. Therefore, the design and operation of energy systems require adequate computation and evaluation tools, which offer a holistic view of all interconnected components. The currently available optimization tools have limitations, such as limited scope of technologies and sectors, high data requirements, high computational cost, and difficulty in handling multi-objective optimization. To overcome these limitations, a software framework called FOCUS for the flexible and dynamic modeling of any urban sector-coupled energy system is developed. The framework includes a library containing models for different technologies and offers a variety of parameter sets for each technology. FOCUS can handle multi-objective problems by returning Pareto-optimal fronts, which helps users discover the trade-offs between criteria and objectives. The developed tool can identify new flexibility potentials in the energy system, actively support companies in the respective field in optimizing urban energy system planning solutions, and determine possible threats to the stable operation of such systems.
Submitted 14 April, 2023;
originally announced April 2023.
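To illustrate what a Pareto-optimal front means in the FOCUS abstract above, here is a minimal sketch. It is not FOCUS's implementation or API. The function `pareto_front`, the `designs` list, and the (cost, emissions) values are hypothetical, and every objective is assumed to be minimized.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Return the non-dominated points, assuming every objective is minimized."""

    def dominates(a: Point, b: Point) -> bool:
        # a dominates b if it is no worse in every objective
        # and strictly better in at least one.
        no_worse = all(x <= y for x, y in zip(a, b))
        strictly_better = any(x < y for x, y in zip(a, b))
        return no_worse and strictly_better

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (cost, emissions) pairs for candidate system designs.
designs = [(100.0, 50.0), (120.0, 30.0), (110.0, 60.0), (150.0, 20.0), (130.0, 30.0)]
front = pareto_front(designs)
print(front)
```

The points left in `front` are the trade-off candidates: improving one objective on any of them requires giving up something on the other.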
-
Compressed Particle-Based Federated Bayesian Learning and Unlearning
Authors:
Jinu Gong,
Osvaldo Simeone,
Joonhyuk Kang
Abstract:
Conventional frequentist FL schemes are known to yield overconfident decisions. Bayesian FL addresses this issue by allowing agents to process and exchange uncertainty information encoded in distributions over the model parameters. However, this comes at the cost of a larger per-iteration communication overhead. This letter investigates whether Bayesian FL can still provide advantages in terms of calibration when constraining communication bandwidth. We present compressed particle-based Bayesian FL protocols for FL and federated "unlearning" that apply quantization and sparsification across multiple particles. The experimental results confirm that the benefits of Bayesian FL are robust to bandwidth constraints.
Submitted 19 September, 2022; v1 submitted 14 September, 2022;
originally announced September 2022.
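The abstract above mentions applying quantization and sparsification across multiple particles. The sketch below shows one generic way to do that: top-k sparsification followed by symmetric uniform quantization of each particle vector. The specific scheme, the function names, and the sample values are assumptions for illustration, not the protocol from the letter.

```python
from typing import List


def top_k_sparsify(vec: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries and set the rest to zero."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def uniform_quantize(vec: List[float], bits: int) -> List[float]:
    """Round entries to a symmetric uniform grid that contains zero.

    Assumes bits >= 2; the grid spans [-m, m] with m = max |v|.
    """
    m = max(abs(v) for v in vec)
    if m == 0.0:
        return list(vec)
    scale = m / (2 ** (bits - 1) - 1)
    return [round(v / scale) * scale for v in vec]


def compress_particles(particles: List[List[float]], k: int, bits: int) -> List[List[float]]:
    """Apply sparsification, then quantization, to each particle independently."""
    return [uniform_quantize(top_k_sparsify(p, k), bits) for p in particles]


# Hypothetical particles (samples of a model parameter vector).
particles = [[0.9, -0.05, 0.4, 0.02, -0.7], [0.1, 0.8, -0.3, 0.0, 0.2]]
compressed = compress_particles(particles, k=3, bits=3)
print(compressed)
```

Because the quantization grid contains zero, entries removed by sparsification stay exactly zero after quantization, so both steps reduce the payload an agent would need to send.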
-
Forget-SVGD: Particle-Based Bayesian Federated Unlearning
Authors:
Jinu Gong,
Osvaldo Simeone,
Rahif Kassab,
Joonhyuk Kang
Abstract:
Variational particle-based Bayesian learning methods have the advantage of not being limited by the bias affecting more conventional parametric techniques. This paper proposes to leverage the flexibility of non-parametric Bayesian approximate inference to develop a novel Bayesian federated unlearning method, referred to as Forget-Stein Variational Gradient Descent (Forget-SVGD). Forget-SVGD builds on SVGD - a particle-based approximate Bayesian inference scheme using gradient-based deterministic updates - and on its distributed (federated) extension known as Distributed SVGD (DSVGD). Upon the completion of federated learning, when one or more participating agents request that their data be "forgotten", Forget-SVGD carries out local SVGD updates at the agents whose data need to be "unlearned", interleaved with communication rounds with a parameter server. The proposed method is validated via performance comparisons with non-parametric schemes that train from scratch by excluding data to be forgotten, as well as with existing parametric Bayesian unlearning methods.
Submitted 23 November, 2021;
originally announced November 2021.
-
Deep Learning Adapted Acceleration for Limited-view Photoacoustic Computed Tomography
Authors:
Hengrong Lan,
Jiali Gong,
Fei Gao
Abstract:
Photoacoustic imaging (PAI) is a non-invasive imaging modality that detects the ultrasound signal generated from tissue under light excitation. Photoacoustic computed tomography (PACT) uses unfocused large-area light to illuminate the target and an ultrasound transducer array to detect the PA signal. In PACT, the limited-view problem, caused by constraints on the detection geometry, can lead to low-quality images. Model-based methods with different regularization terms are used to address this problem. To achieve fast and high-quality reconstruction of limited-view PA data, this paper proposes a model-based method that combines a mathematical variational model with deep learning to speed up and regularize the unrolled reconstruction procedure. A deep neural network is designed to adapt the step of the gradient update term of data consistency in the gradient descent procedure, which can obtain a high-quality PA image with only a few iterations. Note that all parameters and priors are automatically learned during the offline training stage. In experiments, we show that this method outperforms the other methods on half-view (180 degrees) simulation and real data. The comparison of different model-based methods shows that our proposed scheme has superior performance (over 0.05 for SSIM) with the same number of iterations (3). Furthermore, unseen data are used to validate the generalization of the different methods. Finally, we find that our method obtains superior results (SSIM of 0.94 in vivo) with high robustness and accelerated reconstruction.
Submitted 7 November, 2021;
originally announced November 2021.
-
Computational Imaging and Artificial Intelligence: The Next Revolution of Mobile Vision
Authors:
Jinli Suo,
Weihang Zhang,
Jin Gong,
Xin Yuan,
David J. Brady,
Qionghai Dai
Abstract:
Signal capture stands at the forefront of perceiving and understanding the environment, and thus imaging plays a pivotal role in mobile vision. Recent explosive progress in Artificial Intelligence (AI) has shown great potential for developing advanced mobile platforms with new imaging devices. Traditional imaging systems based on the "capturing images first and processing afterwards" mechanism cannot meet this unprecedented demand. In contrast, Computational Imaging (CI) systems are designed to capture high-dimensional data in an encoded manner to provide more information for mobile vision systems. Thanks to AI, CI can now be used in real systems by integrating deep learning algorithms into the mobile vision platform to achieve a closed loop of intelligent acquisition, processing and decision making, thus leading to the next revolution of mobile vision. Starting from the history of mobile vision using digital cameras, this work first introduces the advances of CI in diverse applications and then conducts a comprehensive review of current research topics combining CI and AI. Motivated by the fact that most existing studies only loosely connect CI and AI (usually using AI to improve the performance of CI, with only limited works connecting them deeply), we propose a framework to deeply integrate CI and AI, using the example of self-driving vehicles with high-speed communication, edge computing and traffic planning. Finally, we look ahead to the future of CI plus AI by investigating new materials, brain science and new computing techniques to shed light on new directions for mobile vision systems.
Submitted 18 September, 2021;
originally announced September 2021.
-
Sequential Point Cloud Prediction in Interactive Scenarios: A Survey
Authors:
Haowen Wang,
Zirui Li,
Jianwei Gong
Abstract:
Point clouds have been widely used in the field of autonomous driving since they can provide a more comprehensive three-dimensional representation of the environment than 2D images. Point-wise prediction based on point cloud sequences (PCS) is an essential part of environment understanding, which can assist in the decision-making and motion-planning of autonomous vehicles. However, PCS prediction has not been deeply researched in the literature. This paper presents a brief review of sequential point cloud prediction methods, focusing on interactive scenarios. Firstly, we define the PCS prediction problem and introduce commonly-used frameworks. Secondly, by reviewing non-predictive problems, we analyze and summarize the spatio-temporal feature extraction methods based on PCS. On this basis, we review two types of PCS prediction tasks, scene flow estimation (SFE) and point cloud location prediction (PCLP), highlighting their connections and differences. Finally, we discuss some open issues and point out some potential research directions.
Submitted 15 September, 2021;
originally announced September 2021.
-
Life-Long Multi-Task Learning of Adaptive Path Tracking Policy for Autonomous Vehicle
Authors:
Cheng Gong,
Jianwei Gong,
Chao Lu,
Zhe Liu,
Zirui Li
Abstract:
This paper proposes a life-long adaptive path tracking policy learning method for autonomous vehicles that can self-evolve and self-adapt with multi-task knowledge. Firstly, the proposed method can learn a model-free control policy for path tracking directly from historical driving experience, so that the properties of the vehicle dynamics and the corresponding control strategy are learned simultaneously. Secondly, by utilizing the life-long learning method, the proposed method can learn the policy with task-incremental knowledge without encountering catastrophic forgetting. Thus, with continual multi-task knowledge learned, the policy can iteratively adapt to new tasks and improve its performance with knowledge from new tasks. Thirdly, a memory evaluation and updating method is applied to optimize the memory structure for life-long learning, which enables the policy to learn toward selected directions. Experiments are conducted using a high-fidelity vehicle dynamic model on a complex curvy road to evaluate the performance of the proposed method. Results show that the proposed method can effectively evolve with continual multi-task knowledge and adapt to the new environment, and that its performance can surpass two commonly used baseline methods after evolving.
Submitted 15 September, 2021;
originally announced September 2021.
-
Decision-Making in Driver-Automation Shared Control: A Review and Perspectives
Authors:
Wenshuo Wang,
Xiaoxiang Na,
Dongpu Cao,
Jianwei Gong,
Junqiang Xi,
Yang Xi,
Fei-Yue Wang
Abstract:
Shared control schemes allow a human driver to work with an automated driving agent in driver-vehicle systems while retaining the driver's ability to control the vehicle. The human driver, as an essential agent in driver-vehicle shared control systems, should be precisely modeled with regard to cognitive processes, control strategies, and decision-making processes. Designing the interaction strategy between drivers and automated driving agents poses a significant challenge for human-centric driver assistance systems due to the inherent characteristics of humans. Many open-ended questions arise, such as: what role should human drivers play in a shared control scheme? How can intelligent decisions be made that balance the benefits of the agents in shared control systems? Given this attention and these questions, it is desirable to present a survey on decision-making between human drivers and highly automated vehicles, to understand their architectures, human driver modeling, and interaction strategies under driver-vehicle shared schemes. Finally, we give a further discussion on the key future challenges and opportunities, which are likely to shape new research directions.
Submitted 3 July, 2020;
originally announced July 2020.
-
Stealth UAV through Coanda Effect
Authors:
Dongyoon Shin,
Hyeji Kim,
Jihyuk Gong,
Uijeong Jeong,
Yeeun Jo,
Eric Matson
Abstract:
This paper uses the Coanda Effect to reduce the number of motors, which are the source of noise, and identifies low-noise materials that provide sufficient lift force, with the aim of achieving acoustically stealthy UAVs. According to NASA research [1], the noise of UAVs is more readily heard by people. However, there are situations in which drones need to operate quietly, so how can the noise be reduced? Previous research has made steady attempts to build UAVs using the Coanda Effect, but has not tried to achieve acoustic stealth with Coanda UAVs. A Coanda Effect UAV uses only one motor and is structurally quiet. We therefore searched for quiet designs (materials, structures) that still generate sufficient lift through the Coanda Effect. The hypothesis was verified experimentally. The control group used the most common type of quadcopter, and we tested various structures and materials under the same conditions and measured their noise to determine whether the hypothesis is correct. UAVs using the Coanda Effect are not restricted to a fixed shape or structure, and their internal space is empty. For these reasons, the Coanda Effect UAV we present can be improved through follow-up research and could open up a new frontier for stealth UAVs.
Submitted 29 April, 2020;
originally announced May 2020.
-
Age of Processing: Age-driven Status Sampling and Processing Offloading for Edge Computing-enabled Real-time IoT Applications
Authors:
Rui Li,
Qian Ma,
Jie Gong,
Zhi Zhou,
Xu Chen
Abstract:
The freshness of status information is of great importance for time-critical Internet of Things (IoT) applications. A metric measuring status freshness is the age-of-information (AoI), which captures the time elapsed from the status being generated at the source node (e.g., a sensor) to the latest status update. However, in intelligent IoT applications such as video surveillance, the status information is revealed only after computation-intensive and time-consuming data processing operations, which affect the status freshness. In this paper, we propose a novel metric, age-of-processing (AoP), to quantify such status freshness; it captures the time elapsed since the newest received processed status data was generated. Compared with AoI, AoP further takes the data processing time into account. Since an IoT device has limited computation and energy resources, the device can choose to offload the data processing to a nearby edge server under a constrained status sampling frequency. We aim to minimize the average AoP in a long-term process by jointly optimizing the status sampling frequency and processing offloading policy. We formulate this online problem as an infinite-horizon constrained Markov decision process (CMDP) with an average reward criterion. We then transform the CMDP problem into an unconstrained Markov decision process (MDP) by leveraging a Lagrangian method, and propose a Lagrangian transformation framework for the original CMDP problem. Furthermore, we integrate the framework with perturbation-based refinement to achieve the optimal policy of the CMDP problem. Extensive numerical evaluations show that the proposed algorithm outperforms the benchmarks, with an average AoP reduction of up to 30%.
Submitted 24 March, 2020;
originally announced March 2020.
-
Joint Transmission and Computing Scheduling for Status Update with Mobile Edge Computing
Authors:
Jie Gong,
Qiaobin Kuang,
Xiang Chen
Abstract:
Age of Information (AoI), defined as the time elapsed since the generation of the latest received update, is a promising performance metric to measure data freshness for real-time status monitoring. In many applications, status information needs to be extracted through computing, which can be processed at an edge server enabled by mobile edge computing (MEC). In this paper, we aim to minimize the average AoI within a given deadline by jointly scheduling the transmissions and computations of a series of update packets with deterministic transmission and computing times. The main analytical results are summarized as follows. Firstly, the minimum deadline to guarantee the successful transmission and computing of all packets is given. Secondly, a no-wait computing policy which intuitively attains the minimum AoI is introduced, and the feasibility condition of the policy is derived. Finally, a closed-form optimal scheduling policy is obtained on the condition that the deadline exceeds a certain threshold. The behavior of the optimal transmission and computing policy is illustrated by numerical results with different values of the deadline, which validates the analytical results.
Submitted 22 February, 2020;
originally announced February 2020.
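The AoI definition in the abstract above (time elapsed since the generation of the latest received update) can be made concrete with a small sketch that integrates the resulting sawtooth curve. The function `average_aoi`, the sample timestamps, and the assumption that a fresh update is available at time 0 are illustrative choices, not part of the paper. In a transmission-plus-computing setting, `received_at` would be the time at which computing of the packet finishes.

```python
from typing import List, Tuple


def average_aoi(updates: List[Tuple[float, float]], horizon: float) -> float:
    """Time-average AoI over [0, horizon].

    updates: (generated_at, received_at) pairs.
    Assumes an initial update generated and received at time 0.
    """
    area = 0.0
    t_prev = 0.0      # time of the last processed reception
    latest_gen = 0.0  # generation time of the freshest received update
    for gen, recv in sorted(updates, key=lambda u: u[1]):
        if recv > horizon:
            break
        # Age grows linearly between receptions: add the trapezoid area.
        area += (recv - t_prev) * ((t_prev - latest_gen) + (recv - latest_gen)) / 2
        t_prev = recv
        latest_gen = max(latest_gen, gen)  # age drops to recv - latest_gen
    # Final segment from the last reception to the end of the horizon.
    area += (horizon - t_prev) * ((t_prev - latest_gen) + (horizon - latest_gen)) / 2
    return area / horizon


# Hypothetical packets: generated at t=1 and t=3, available at t=2 and t=5.
avg = average_aoi([(1.0, 2.0), (3.0, 5.0)], horizon=6.0)
print(avg)
```

Scheduling policies such as those discussed in the abstract change the `(generated_at, received_at)` pairs, and this average is the quantity they aim to reduce.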
-
Analysis on Computation-Intensive Status Update in Mobile Edge Computing
Authors:
Qiaobin Kuang,
Jie Gong,
Xiang Chen,
Xiao Ma
Abstract:
In status update scenarios, the freshness of information is measured in terms of age-of-information (AoI), which essentially reflects the timeliness with which real-time applications transmit status update messages to a remote controller. For some applications, computationally expensive and time-consuming data processing is inevitable before the status information of messages can be displayed. Mobile edge servers are equipped with adequate computation resources and are placed close to users. Thus, mobile edge computing (MEC) can be a promising technology to reduce AoI for computation-intensive messages. In this paper, we study the AoI for computation-intensive messages with MEC, and consider three computing schemes: local computing, remote computing at the MEC server, and partial computing, in which some of the computing tasks are performed locally and the rest are executed at the MEC server. A zero-wait policy is adopted in all three schemes. Specifically, in local computing, a new message is generated immediately after the previous one is revealed by computing, while in remote computing and partial computing, a new message is generated once the previous one is received by the remote MEC server. With infinite queue size and exponentially distributed transmission time, the closed-form average AoI for exponentially distributed computing time is derived for the three computing schemes. For deterministic computing time, the average AoI is analyzed numerically. Simulation results show that by carefully partitioning the computing tasks, the average AoI in partial computing is the smallest compared to local computing and remote computing. The results also indicate numerically the conditions under which remote computing attains a smaller average AoI than local computing.
Submitted 15 February, 2020;
originally announced February 2020.
-
A Time Efficient Approach for Decision-Making Style Recognition in Lane-Change Behavior
Authors:
Sen Yang,
Wenshuo Wang,
Chao Lu,
Jianwei Gong,
Junqiang Xi
Abstract:
Quickly recognizing a driver's decision-making style when changing lanes plays a pivotal role in safety-oriented and personalized vehicle control system design. This paper presents a time-efficient recognition method that integrates k-means clustering (k-MC) with K-nearest neighbor (KNN), called kMC-KNN. Mathematical morphology is used to automatically label the decision-making data into three styles (moderate, vague, and aggressive), while the integration of kMC and KNN helps to improve the recognition speed and accuracy. Our mathematical morphology-based clustering algorithm is then validated by comparison with agglomerative hierarchical clustering. Experimental results demonstrate that the developed kMC-KNN method, in comparison to the traditional KNN, can shorten the recognition time by over 72.67% with a recognition accuracy of 90%-98%. In addition, the kMC-KNN method also outperforms the support vector machine (SVM) in recognition accuracy and stability. The developed time-efficient recognition approach has great application potential for in-vehicle embedded solutions with restricted design specifications.
Submitted 8 November, 2018;
originally announced December 2018.