Showing 1–50 of 51 results for author: Fang, S

Searching in archive eess.
  1. arXiv:2506.12263  [pdf, ps, other]

    cs.LG cs.AI eess.SY

    A Survey of Foundation Models for IoT: Taxonomy and Criteria-Based Analysis

    Authors: Hui Wei, Dong Yoon Lee, Shubham Rohal, Zhizhang Hu, Shiwei Fang, Shijia Pan

    Abstract: Foundation models have gained growing interest in the IoT domain due to their reduced reliance on labeled data and strong generalizability across tasks, which address key limitations of traditional machine learning approaches. However, most existing foundation model based methods are developed for specific IoT tasks, making it difficult to compare approaches across IoT domains and limiting guidanc…

    Submitted 13 June, 2025; originally announced June 2025.

    Comments: Preprint. Under Submission

  2. arXiv:2505.22069  [pdf, ps, other]

    cs.SD eess.AS

    Delayed-KD: Delayed Knowledge Distillation based CTC for Low-Latency Streaming ASR

    Authors: Longhao Li, Yangze Li, Hongfei Xue, Jie Liu, Shuai Fang, Kai Wang, Lei Xie

    Abstract: CTC-based streaming ASR has gained significant attention in real-world applications but faces two main challenges: accuracy degradation in small chunks and token emission latency. To mitigate these challenges, we propose Delayed-KD, which applies delayed knowledge distillation on CTC posterior probabilities from a non-streaming to a streaming model. Specifically, with a tiny chunk size, we introdu…

    Submitted 28 May, 2025; originally announced May 2025.

    Comments: Accepted by Interspeech2025

  3. arXiv:2505.22025  [pdf, ps, other]

    cs.CV eess.IV

    Learnable Burst-Encodable Time-of-Flight Imaging for High-Fidelity Long-Distance Depth Sensing

    Authors: Manchao Bao, Shengjiang Fang, Tao Yue, Xuemei Hu

    Abstract: Long-distance depth imaging holds great promise for applications such as autonomous driving and robotics. Direct time-of-flight (dToF) imaging offers high-precision, long-distance depth sensing, yet demands ultra-short pulse light sources and high-resolution time-to-digital converters. In contrast, indirect time-of-flight (iToF) imaging often suffers from phase wrapping and low signal-to-noise rat…

    Submitted 28 May, 2025; originally announced May 2025.

  4. arXiv:2505.16220  [pdf, ps, other]

    eess.AS cs.CL

    Meta-PerSER: Few-Shot Listener Personalized Speech Emotion Recognition via Meta-learning

    Authors: Liang-Yeh Shen, Shi-Xin Fang, Yi-Cheng Lin, Huang-Cheng Chou, Hung-yi Lee

    Abstract: This paper introduces Meta-PerSER, a novel meta-learning framework that personalizes Speech Emotion Recognition (SER) by adapting to each listener's unique way of interpreting emotion. Conventional SER systems rely on aggregated annotations, which often overlook individual subtleties and lead to inconsistent predictions. In contrast, Meta-PerSER leverages a Model-Agnostic Meta-Learning (MAML) appr…

    Submitted 22 May, 2025; originally announced May 2025.

    Comments: Accepted by INTERSPEECH 2025. 7 pages, including 2 pages of appendix

  5. arXiv:2411.05361  [pdf, ps, other]

    cs.CL eess.AS

    Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

    Authors: Chien-yu Huang, Wei-Chih Chen, Shu-wen Yang, Andy T. Liu, Chen-An Li, Yu-Xiang Lin, Wei-Cheng Tseng, Anuj Diwan, Yi-Jen Shih, Jiatong Shi, William Chen, Chih-Kai Yang, Wenze Ren, Xuanjun Chen, Chi-Yuan Hsiao, Puyuan Peng, Shih-Heng Wang, Chun-Yi Kuan, Ke-Han Lu, Kai-Wei Chang, Fabian Ritter-Gutierrez, Kuan-Po Huang, Siddhant Arora, You-Kuan Lin, Ming To Chuang, et al. (55 additional authors not shown)

    Abstract: Multimodal foundation models, such as Gemini and ChatGPT, have revolutionized human-machine interactions by seamlessly integrating various forms of data. Developing a universal spoken language model that comprehends a wide range of natural language instructions is critical for bridging communication gaps and facilitating more intuitive interactions. However, the absence of a comprehensive evaluati…

    Submitted 9 June, 2025; v1 submitted 8 November, 2024; originally announced November 2024.

    Comments: ICLR 2025

  6. arXiv:2409.19647  [pdf, other]

    cs.RO cs.AI eess.SY

    Fine-Tuning Hybrid Physics-Informed Neural Networks for Vehicle Dynamics Model Estimation

    Authors: Shiming Fang, Kaiyan Yu

    Abstract: Accurate dynamic modeling is critical for autonomous racing vehicles, especially during high-speed and agile maneuvers where precise motion prediction is essential for safety. Traditional parameter estimation methods face limitations such as reliance on initial guesses, labor-intensive fitting procedures, and complex testing setups. On the other hand, purely data-driven machine learning methods st…

    Submitted 29 September, 2024; originally announced September 2024.

  7. arXiv:2409.01695  [pdf, other]

    cs.SD cs.AI eess.AS

    USTC-KXDIGIT System Description for ASVspoof5 Challenge

    Authors: Yihao Chen, Haochen Wu, Nan Jiang, Xiang Xia, Qing Gu, Yunqi Hao, Pengfei Cai, Yu Guan, Jialong Wang, Weilin Xie, Lei Fang, Sian Fang, Yan Song, Wu Guo, Lin Liu, Minqiang Xu

    Abstract: This paper describes the USTC-KXDIGIT system submitted to the ASVspoof5 Challenge for Track 1 (speech deepfake detection) and Track 2 (spoofing-robust automatic speaker verification, SASV). Track 1 showcases a diverse range of technical qualities from potential processing algorithms and includes both open and closed conditions. For these conditions, our system consists of a cascade of a frontend f…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: ASVspoof5 workshop paper

  8. arXiv:2407.05928  [pdf, other]

    eess.SP

    CA-FedRC: Codebook Adaptation via Federated Reservoir Computing in 5G NR

    Authors: Ziqiang Ye, Sikai Liao, Yulan Gao, Shu Fang, Yue Xiao, Ming Xiao, Saviour Zammit

    Abstract: With the burgeoning deployment of the fifth-generation new radio (5G NR) networks, the codebook plays a crucial role in enabling the base station (BS) to acquire the channel state information (CSI). Different 5G NR codebooks incur varying overheads and exhibit performance disparities under diverse channel conditions, necessitating codebook adaptation based on channel conditions to reduce feedback ove…

    Submitted 8 July, 2024; originally announced July 2024.

  9. arXiv:2407.01939  [pdf, other]

    eess.AS eess.SP

    Unsupervised Face-Masked Speech Enhancement Using Generative Adversarial Networks With Human-in-the-Loop Assessment Metrics

    Authors: Syu-Siang Wang, Jia-Yang Chen, Bo-Ren Bai, Shih-Hau Fang, Yu Tsao

    Abstract: The utilization of face masks is an essential healthcare measure, particularly during times of pandemics, yet it can present challenges in communication in our daily lives. To address this problem, we propose a novel approach known as the human-in-the-loop StarGAN (HL-StarGAN) face-masked speech enhancement method. HL-StarGAN comprises discriminator, classifier, metric assessment predictor, and ge…

    Submitted 20 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

    Comments: face-mask speech enhancement, generative adversarial networks, StarGAN, human-in-the-loop, unsupervised learning

    Journal ref: IEEE/ACM Transactions on Audio, Speech and Language Processing, 2024

  10. arXiv:2402.14136  [pdf, other]

    cs.RO cs.LG eess.SP

    GDTM: An Indoor Geospatial Tracking Dataset with Distributed Multimodal Sensors

    Authors: Ho Lyun Jeong, Ziqi Wang, Colin Samplawski, Jason Wu, Shiwei Fang, Lance M. Kaplan, Deepak Ganesan, Benjamin Marlin, Mani Srivastava

    Abstract: Constantly locating moving objects, i.e., geospatial tracking, is essential for autonomous building infrastructure. Accurate and robust geospatial tracking often leverages multimodal sensor fusion algorithms, which require large datasets with time-aligned, synchronized data from various sensor types. However, such datasets are not readily available. Hence, we propose GDTM, a nine-hour dataset for…

    Submitted 21 February, 2024; originally announced February 2024.

  11. arXiv:2309.07765  [pdf, other]

    cs.SD cs.CL eess.AS

    Echotune: A Modular Extractor Leveraging the Variable-Length Nature of Speech in ASR Tasks

    Authors: Sizhou Chen, Songyang Gao, Sen Fang

    Abstract: The Transformer architecture has proven to be highly effective for Automatic Speech Recognition (ASR) tasks, becoming a foundational component for a plethora of research in the domain. Historically, many approaches have leaned on fixed-length attention windows, which becomes problematic for varied speech samples in duration and complexity, leading to data over-smoothing and neglect of essential lo…

    Submitted 7 April, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

  12. arXiv:2308.11557  [pdf, other]

    eess.IV cs.CV

    Open Set Synthetic Image Source Attribution

    Authors: Shengbang Fang, Tai D. Nguyen, Matthew C. Stamm

    Abstract: AI-generated images have become increasingly realistic and have garnered significant public attention. While synthetic images are intriguing due to their realism, they also pose an important misinformation threat. To address this new threat, researchers have developed multiple algorithms to detect synthetic images and identify their source generators. However, most existing source attribution tech…

    Submitted 22 August, 2023; originally announced August 2023.

  13. arXiv:2307.15898  [pdf, other]

    cs.SD cs.AI eess.AS

    UniBriVL: Robust Universal Representation and Generation of Audio Driven Diffusion Models

    Authors: Sen Fang, Bowen Gao, Yangjian Wu, Teik Toe Teoh

    Abstract: Multimodal large models have been recognized for their advantages in various performance and downstream tasks. The development of these models is crucial towards achieving general artificial intelligence in the future. In this paper, we propose a novel universal language representation learning method called UniBriVL, which is based on Bridging-Vision-and-Language (BriVL). Universal BriVL embeds a…

    Submitted 9 September, 2023; v1 submitted 29 July, 2023; originally announced July 2023.

    Comments: Voice-Text fusion input; The first work of audio driven diffusion model. arXiv admin note: text overlap with arXiv:2303.04585

  14. arXiv:2307.11778  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Transsion TSUP's speech recognition system for ASRU 2023 MADASR Challenge

    Authors: Xiaoxiao Li, Gaosheng Zhang, An Zhu, Weiyong Li, Shuming Fang, Xiaoyue Yang, Jianchao Zhu

    Abstract: This paper presents a speech recognition system developed by the Transsion Speech Understanding Processing Team (TSUP) for the ASRU 2023 MADASR Challenge. The system focuses on adapting ASR models for low-resource Indian languages and covers all four tracks of the challenge. For tracks 1 and 2, the acoustic model utilized a squeezeformer encoder and bidirectional transformer decoder with joint CTC…

    Submitted 19 July, 2023; originally announced July 2023.

  15. arXiv:2303.04585  [pdf, other]

    cs.SD cs.AI eess.AS

    Exploring Efficient-Tuned Learning Audio Representation Method from BriVL

    Authors: Sen Fang, Yangjian Wu, Bowen Gao, Jingwen Cai, Teik Toe Teoh

    Abstract: Recently, researchers have gradually realized that in some cases, the self-supervised pre-training on large-scale Internet data is better than that of high-quality/manually labeled data sets, and multimodal/large models are better than single or bimodal/small models. In this paper, we propose a robust audio representation learning method WavBriVL based on Bridging-Vision-and-Language (BriVL). WavB…

    Submitted 28 July, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: 13 pages, 2023.3 Finished

  16. arXiv:2301.01592  [pdf, other]

    cs.NI cs.AI cs.HC eess.SY

    CarFi: Rider Localization Using Wi-Fi CSI

    Authors: Sirajum Munir, Hongkai Chen, Shiwei Fang, Mahathir Monjur, Shan Lin, Shahriar Nirjon

    Abstract: With the rise of hailing services, people are increasingly relying on shared mobility (e.g., Uber, Lyft) drivers to pick up for transportation. However, such drivers and riders have difficulties finding each other in urban areas as GPS signals get blocked by skyscrapers, in crowded environments (e.g., in stadiums, airports, and bars), at night, and in bad weather. It wastes their time, creates a b…

    Submitted 21 December, 2022; originally announced January 2023.

    ACM Class: C.3

  17. arXiv:2211.12314  [pdf, other]

    eess.IV cs.CV cs.LG eess.SP

    Attacking Image Splicing Detection and Localization Algorithms Using Synthetic Traces

    Authors: Shengbang Fang, Matthew C Stamm

    Abstract: Recent advances in deep learning have enabled forensics researchers to develop a new class of image splicing detection and localization algorithms. These algorithms identify spliced content by detecting localized inconsistencies in forensic traces using Siamese neural networks, either explicitly during analysis or implicitly during training. At the same time, deep learning has enabled new forms of…

    Submitted 22 November, 2022; originally announced November 2022.

  18. arXiv:2211.08146  [pdf, other]

    eess.IV cs.CV cs.LG

    Encoding feature supervised UNet++: Redesigning Supervision for liver and tumor segmentation

    Authors: Jiahao Cui, Ruoxin Xiao, Shiyuan Fang, Minnan Pei, Yixuan Yu

    Abstract: Liver tumor segmentation in CT images is a critical step in the diagnosis, surgical planning and postoperative evaluation of liver disease. An automatic liver and tumor segmentation method can greatly relieve physicians of the heavy workload of examining CT images and better improve the accuracy of diagnosis. In the last few decades, many modifications based on U-Net model have been proposed in th…

    Submitted 15 November, 2022; originally announced November 2022.

  19. arXiv:2209.12940  [pdf, other]

    cs.RO cs.CV cs.LG eess.SP

    ERASE-Net: Efficient Segmentation Networks for Automotive Radar Signals

    Authors: Shihong Fang, Haoran Zhu, Devansh Bisla, Anna Choromanska, Satish Ravindran, Dongyin Ren, Ryan Wu

    Abstract: Among various sensors for assisted and autonomous driving systems, automotive radar has been considered as a robust and low-cost solution even in adverse weather or lighting conditions. With the recent development of radar technologies and open-sourced annotated data sets, semantic segmentation with radar signals has become very promising. However, existing methods are either computationally expen…

    Submitted 24 February, 2023; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: accepted by ICRA 2023

  20. arXiv:2208.06457  [pdf, other]

    cs.IT eess.SP

    Intelligent Omni Surface-Assisted Self-Interference Cancellation for Full-Duplex MISO System

    Authors: Sisai Fang, Gaojie Chen, Pei Xiao, Kai-Kit Wong, Rahim Tafazolli

    Abstract: The full-duplex (FD) communication can achieve higher spectrum efficiency than conventional half-duplex (HD) communication; however, self-interference (SI) is the key hurdle. This paper is the first work to propose the intelligent Omni surface (IOS)-assisted FD multi-input single-output (MISO) FD communication systems to mitigate SI, which solves the frequency-selectivity issue. In particular, two…

    Submitted 12 August, 2022; originally announced August 2022.

    Comments: 30 pages, 8 figures

  21. arXiv:2202.10777  [pdf, other]

    eess.AS cs.AI cs.SD q-bio.QM

    Continuous Speech for Improved Learning Pathological Voice Disorders

    Authors: Syu-Siang Wang, Chi-Te Wang, Chih-Chung Lai, Yu Tsao, Shih-Hau Fang

    Abstract: Goal: Numerous studies had successfully differentiated normal and abnormal voice samples. Nevertheless, further classification had rarely been attempted. This study proposes a novel approach, using continuous Mandarin speech instead of a single vowel, to classify four common voice disorders (i.e. functional dysphonia, neoplasm, phonotrauma, and vocal palsy). Methods: In the proposed framework, aco…

    Submitted 22 February, 2022; originally announced February 2022.

  22. arXiv:2201.09717  [pdf, other]

    cs.CV eess.IV

    Keeping Deep Lithography Simulators Updated: Global-Local Shape-Based Novelty Detection and Active Learning

    Authors: Hao-Chiang Shao, Hsing-Lei Ping, Kuo-shiuan Chen, Weng-Tai Su, Chia-Wen Lin, Shao-Yun Fang, Pin-Yian Tsai, Yan-Hsiu Liu

    Abstract: Learning-based pre-simulation (i.e., layout-to-fabrication) models have been proposed to predict the fabrication-induced shape deformation from an IC layout to its fabricated circuit. Such models are usually driven by pairwise learning, involving a training set of layout patterns and their reference shape images after fabrication. However, it is expensive and time-consuming to collect the referenc…

    Submitted 24 January, 2022; originally announced January 2022.

  23. arXiv:2112.02538  [pdf, ps, other]

    eess.AS cs.SD

    Toward Real-World Voice Disorder Classification

    Authors: Heng-Cheng Kuo, Yu-Peng Hsieh, Huan-Hsin Tseng, Chi-Te Wang, Shih-Hau Fang, Yu Tsao

    Abstract: Objective: Voice disorders significantly compromise individuals' ability to speak in their daily lives. Without early diagnosis and treatment, these disorders may deteriorate drastically. Thus, automatic classification systems at home are desirable for people who are inaccessible to clinical disease assessments. However, the performance of such systems may be weakened due to the constrained resour…

    Submitted 26 April, 2023; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: Accepted by IEEE TBME (under an IEEE Open Access publishing Agreement)

  24. arXiv:2102.02730  [pdf, other]

    cs.IT cs.LG eess.SP eess.SY math.OC

    Feedback Capacity of Parallel ACGN Channels and Kalman Filter: Power Allocation with Feedback

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this paper, we relate the feedback capacity of parallel additive colored Gaussian noise (ACGN) channels to a variant of the Kalman filter. By doing so, we obtain lower bounds on the feedback capacity of such channels, as well as the corresponding feedback (recursive) coding schemes, which are essentially power allocation policies with feedback, to achieve the bounds. The results are seen to red…

    Submitted 15 February, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: arXiv admin note: text overlap with arXiv:2001.03108

  25. Optimal Energy Scheduling and Sensitivity Analysis for Integrated Power-Water-Heat Systems

    Authors: Sidun Fang, Chenxu Wang, Yashen Lin, Changhong Zhao

    Abstract: The conventionally independent power, water, and heating networks are becoming more tightly connected, which motivates their joint optimal energy scheduling to improve the overall efficiency of an integrated energy system. However, such a joint optimization is known as a challenging problem with complex network constraints and couplings of electric, hydraulic, and thermal models that are nonlinear…

    Submitted 25 November, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

    Comments: 12 pages, 13 figures, accepted to IEEE Systems Journal

  26. arXiv:2101.00957  [pdf, ps, other]

    eess.SY cs.RO

    Relativistic Rocket Control (Relativistic Space-Travel Flight Control): Feedback Control of Relativistic Dynamics Propelled by Ejecting Mass

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this short note, we investigate the feedback control of relativistic dynamics propelled by mass ejection, modeling, e.g., the relativistic rocket control or the relativistic (space-travel) flight control. As an extreme case, we also examine the control of relativistic photon rockets which are propelled by ejecting photons.

    Submitted 11 January, 2021; v1 submitted 29 December, 2020; originally announced January 2021.

    Comments: arXiv admin note: text overlap with arXiv:1912.03367

  27. arXiv:2012.12174  [pdf, other]

    eess.SY cs.IT cs.LG cs.RO math.OC

    Fundamental Limits of Controlled Stochastic Dynamical Systems: An Information-Theoretic Approach

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this paper, we examine the fundamental performance limitations in the control of stochastic dynamical systems; more specifically, we derive generic $\mathcal{L}_p$ bounds that hold for any causal (stabilizing) controllers and any stochastic disturbances, by an information-theoretic analysis. We first consider the scenario where the plant (i.e., the dynamical system to be controlled) is linear t…

    Submitted 3 June, 2021; v1 submitted 22 December, 2020; originally announced December 2020.

    Comments: Note that this is an extended version of the original submission "Fundamental Limits on the Maximum Deviations in Control Systems: How Short Can Distribution Tails be Made by Feedback?"; arXiv admin note: text overlap with arXiv:1912.05541

  28. Blind Monaural Source Separation on Heart and Lung Sounds Based on Periodic-Coded Deep Autoencoder

    Authors: Kun-Hsi Tsai, Wei-Chien Wang, Chui-Hsuan Cheng, Chan-Yen Tsai, Jou-Kou Wang, Tzu-Hao Lin, Shih-Hau Fang, Li-Chin Chen, Yu Tsao

    Abstract: Auscultation is the most efficient way to diagnose cardiovascular and respiratory diseases. To reach accurate diagnoses, a device must be able to recognize heart and lung sounds from various clinical situations. However, the recorded chest sounds are mixed by heart and lung sounds. Thus, effectively separating these two sounds is critical in the pre-processing stage. Recent advances in machine lea…

    Submitted 11 December, 2020; originally announced December 2020.

    Comments: 13 pages, 11 figures, Accepted by IEEE Journal of Biomedical and Health Informatics

  29. arXiv:2012.04023  [pdf, ps, other]

    math.ST cs.LG eess.SP math.PR stat.ML

    The Spectral-Domain $\mathcal{W}_2$ Wasserstein Distance for Elliptical Processes and the Spectral-Domain Gelbrich Bound

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this short note, we introduce the spectral-domain $\mathcal{W}_2$ Wasserstein distance for elliptical stochastic processes in terms of their power spectra. We also introduce the spectral-domain Gelbrich bound for processes that are not necessarily elliptical.

    Submitted 6 January, 2021; v1 submitted 7 December, 2020; originally announced December 2020.

  30. arXiv:2012.03809  [pdf, ps, other]

    math.ST cs.AI cs.LG eess.SP stat.ML

    Independent Elliptical Distributions Minimize Their $\mathcal{W}_2$ Wasserstein Distance from Independent Elliptical Distributions with the Same Density Generator

    Authors: Song Fang, Quanyan Zhu

    Abstract: This short note is on a property of the $\mathcal{W}_2$ Wasserstein distance which indicates that independent elliptical distributions minimize their $\mathcal{W}_2$ Wasserstein distance from given independent elliptical distributions with the same density generators. Furthermore, we examine the implications of this property in the Gelbrich bound when the distributions are not necessarily elliptic…

    Submitted 7 December, 2020; originally announced December 2020.

  31. arXiv:2012.02009  [pdf, other]

    eess.SY cs.CR cs.IT eess.SP math.OC

    Fundamental Stealthiness-Distortion Tradeoffs in Dynamical Systems under Injection Attacks: A Power Spectral Analysis

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this paper, we analyze the fundamental stealthiness-distortion tradeoffs of linear Gaussian dynamical systems under data injection attacks using a power spectral analysis, whereas the Kullback-Leibler (KL) divergence is employed as the stealthiness measure. Particularly, we obtain explicit formulas in terms of power spectra that characterize analytically the stealthiness-distortion tradeoffs as…

    Submitted 11 May, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

  32. arXiv:2011.02560  [pdf, ps, other]

    cs.IT cs.LG eess.SP eess.SY math.ST

    Independent Gaussian Distributions Minimize the Kullback-Leibler (KL) Divergence from Independent Gaussian Distributions

    Authors: Song Fang, Quanyan Zhu

    Abstract: This short note is on a property of the Kullback-Leibler (KL) divergence which indicates that independent Gaussian distributions minimize the KL divergence from given independent Gaussian distributions. The primary purpose of this note is for the referencing of papers that need to make use of this property entirely or partially.

    Submitted 3 December, 2020; v1 submitted 4 November, 2020; originally announced November 2020.

  33. arXiv:2011.00718  [pdf, other]

    cs.IT cs.CR cs.LG eess.SP eess.SY

    Fundamental Limits of Obfuscation for Linear Gaussian Dynamical Systems: An Information-Theoretic Approach

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this paper, we study the fundamental limits of obfuscation in terms of privacy-distortion tradeoffs for linear Gaussian dynamical systems via an information-theoretic approach. Particularly, we obtain analytical formulas that capture the fundamental privacy-distortion tradeoffs when privacy masks are to be added to the outputs of the dynamical systems, while indicating explicitly how to design…

    Submitted 29 October, 2020; originally announced November 2020.

    Comments: arXiv admin note: text overlap with arXiv:2008.04893

  34. arXiv:2008.04893  [pdf, ps, other]

    cs.IT cs.CR cs.LG eess.SP math.ST

    Channel Leakage, Information-Theoretic Limitations of Obfuscation, and Optimal Privacy Mask Design for Streaming Data

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this paper, we first introduce the notion of channel leakage as the minimum mutual information between the channel input and channel output. As its name indicates, channel leakage quantifies the minimum information leakage to the malicious receiver. In a broad sense, it can be viewed as a dual concept of channel capacity, which characterizes the maximum information transmission to the targeted…

    Submitted 29 September, 2020; v1 submitted 11 August, 2020; originally announced August 2020.

    Comments: The title was changed from "Channel Leakage and Information Theoretic Privacy-Distortion Tradeoffs for Streaming Data" to the current one on 29th September 2020

  35. Modeling, Analysis, and Optimization of Grant-Free NOMA in Massive MTC via Stochastic Geometry

    Authors: Jiaqi Liu, Gang Wu, Xiaoxu Zhang, Shu Fang, Shaoqian Li

    Abstract: Massive machine-type communications (mMTC) is a crucial scenario to support booming Internet of Things (IoTs) applications. In mMTC, although a large number of devices are registered to an access point (AP), very few of them are active with uplink short packet transmission at the same time, which requires novel design of protocols and receivers to enable efficient data transmission and accurate mu…

    Submitted 5 April, 2020; originally announced April 2020.

    Comments: This paper is submitted to IEEE Internet Of Things Journal

    Journal ref: IEEE Internet Of Things Journal, VOL. 8, NO. 6, MARCH 15, 2021, pp. 4389~4402, https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9207733

  36. From IC Layout to Die Photo: A CNN-Based Data-Driven Approach

    Authors: Hao-Chiang Shao, Chao-Yi Peng, Jun-Rei Wu, Chia-Wen Lin, Shao-Yun Fang, Pin-Yen Tsai, Yan-Hsiu Liu

    Abstract: We propose a deep learning-based data-driven framework consisting of two convolutional neural networks: i) LithoNet that predicts the shape deformations on a circuit due to IC fabrication, and ii) OPCNet that suggests IC layout corrections to compensate for such shape deformations. By learning the shape correspondences between pairs of layout design patterns and their scanning electron microscope…

    Submitted 6 August, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

    Comments: 14 pages, 16 figures

  37. arXiv:2001.03108  [pdf, other]

    cs.IT cs.LG eess.SP eess.SY math.OC

    Feedback Capacity and a Variant of the Kalman Filter with ARMA Gaussian Noises: Explicit Bounds and Feedback Coding Design

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this paper, we relate a feedback channel with any finite-order autoregressive moving-average (ARMA) Gaussian noises to a variant of the Kalman filter. In light of this, we obtain relatively explicit lower bounds on the feedback capacity for such colored Gaussian noises, and the bounds are seen to be consistent with various existing results in the literature. Meanwhile, this variant of the Kalma…

    Submitted 3 June, 2021; v1 submitted 9 January, 2020; originally announced January 2020.

    Comments: Note that this is an extended version of the original submission "A Connection between Feedback Capacity and Kalman Filter for Colored Gaussian Noises"

  38. arXiv:1912.05541  [pdf, other]

    eess.SY cs.IT cs.LG cs.RO stat.ML

    Information-Theoretic Performance Limitations of Feedback Control: Underlying Entropic Laws and Generic $\mathcal{L}_{p}$ Bounds

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this paper, we utilize information theory to study the fundamental performance limitations of generic feedback systems, where both the controller and the plant may be any causal functions/mappings while the disturbance can be with any distributions. More specifically, we obtain fundamental $\mathcal{L}_p$ bounds on the control error, which are shown to be completely characterized by the conditi…

    Submitted 6 May, 2021; v1 submitted 11 December, 2019; originally announced December 2019.

    Comments: arXiv admin note: text overlap with arXiv:1912.02628

  39. arXiv:1912.03367  [pdf, ps, other]

    eess.SY cs.RO math.DS physics.acc-ph physics.app-ph

    Relativistic Control: Feedback Control of Relativistic Dynamics

    Authors: Song Fang, Quanyan Zhu

    Abstract: Strictly speaking, Newton's second law of motion is only an approximation of the so-called relativistic dynamics, i.e., Einstein's modification of the second law based on his theory of special relativity. Although the approximation is almost exact when the velocity of the dynamical system is far less than the speed of light, the difference will become larger and larger (and will eventually go to i…

    Submitted 13 January, 2021; v1 submitted 6 December, 2019; originally announced December 2019.

  40. arXiv:1912.02628  [pdf, ps, other]

    cs.LG cs.IT eess.SP math.ST stat.ML

    Fundamental Limitations in Sequential Prediction and Recursive Algorithms: $\mathcal{L}_{p}$ Bounds via an Entropic Analysis

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this paper, we obtain fundamental $\mathcal{L}_{p}$ bounds in sequential prediction and recursive algorithms via an entropic analysis. Both classes of problems are examined by investigating the underlying entropic relationships of the data and/or noises involved, and the derived lower bounds may all be quantified in a conditional entropy characterization. We also study the conditions to achieve…

    Submitted 11 May, 2021; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1910.06742. text overlap with arXiv:1912.05541

  41. arXiv:1911.08153  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    Distributed Microphone Speech Enhancement based on Deep Learning

    Authors: Syu-Siang Wang, Yu-You Liang, Jeih-weih Hung, Yu Tsao, Hsin-Min Wang, Shih-Hau Fang

    Abstract: Speech-related applications deliver inferior performance in complex noise environments. Therefore, this study primarily addresses this problem by introducing speech-enhancement (SE) systems based on deep neural networks (DNNs) applied to a distributed microphone architecture, and then investigates the effectiveness of three different DNN-model structures. The first system constructs a DNN model fo…

    Submitted 24 May, 2020; v1 submitted 19 November, 2019; originally announced November 2019.

    Comments: deep neural network, multi-channel speech enhancement, distributed microphone architecture, diffuse noise environment

  42. arXiv:1910.06742  [pdf, ps, other]

    cs.LG cs.IT eess.SP math.ST stat.ML

    Generic Bounds on the Maximum Deviations in Sequential Prediction: An Information-Theoretic Analysis

    Authors: Song Fang, Quanyan Zhu

    Abstract: In this paper, we derive generic bounds on the maximum deviations in prediction errors for sequential prediction via an information-theoretic approach. The fundamental bounds are shown to depend only on the conditional entropy of the data point to be predicted given the previous data points. In the asymptotic case, the bounds are achieved if and only if the prediction error is white and uniformly…

    Submitted 11 May, 2021; v1 submitted 11 October, 2019; originally announced October 2019.

    Comments: arXiv admin note: text overlap with arXiv:1904.04765. text overlap with arXiv:2001.03813

  43. arXiv:1909.01999  [pdf, other]

    eess.SY cs.IT cs.RO math.OC

    Two-Way Coding and Attack Decoupling in Control Systems Under Injection Attacks

    Authors: Song Fang, Karl Henrik Johansson, Mikael Skoglund, Henrik Sandberg, Hideaki Ishii

    Abstract: In this paper, we introduce the concept of two-way coding, which originates in communication theory characterizing coding schemes for two-way channels, into control theory, particularly to facilitate the analysis and design of feedback control systems under injection attacks. Moreover, we propose the notion of attack decoupling, and show how the controller and the two-way coding can be co-designed…

    Submitted 4 September, 2019; originally announced September 2019.

    Comments: arXiv admin note: text overlap with arXiv:1901.05420

  44. arXiv:1904.04765  [pdf, ps, other]

    cs.IT cs.LG eess.SP math.ST stat.ML

    Generic Variance Bounds on Estimation and Prediction Errors in Time Series Analysis: An Entropy Perspective

    Authors: Song Fang, Mikael Skoglund, Karl Henrik Johansson, Hideaki Ishii, Quanyan Zhu

    Abstract: In this paper, we obtain generic bounds on the variances of estimation and prediction errors in time series analysis via an information-theoretic approach. It is seen in general that the error bounds are determined by the conditional entropy of the data point to be estimated or predicted given the side information or past observations. Additionally, we discover that in order to achieve the predict…

    Submitted 11 May, 2021; v1 submitted 9 April, 2019; originally announced April 2019.

  45. arXiv:1901.05420  [pdf, other]

    eess.SY eess.SP math.OC

    Two-Way Coding in Control Systems Under Injection Attacks: From Attack Detection to Attack Correction

    Authors: Song Fang, Karl Henrik Johansson, Mikael Skoglund, Henrik Sandberg, Hideaki Ishii

    Abstract: In this paper, we introduce the method of two-way coding, a concept originating in communication theory characterizing coding schemes for two-way channels, into (networked) feedback control systems under injection attacks. We first show that the presence of two-way coding can distort the perspective of the attacker on the control system. In general, the distorted viewpoint on the attacker side as…

    Submitted 17 January, 2019; v1 submitted 16 January, 2019; originally announced January 2019.

  46. arXiv:1811.10376  [pdf, other]

    cs.LG cs.SD eess.AS stat.ML

    Robustness against the channel effect in pathological voice detection

    Authors: Yi-Te Hsu, Zining Zhu, Chi-Te Wang, Shih-Hau Fang, Frank Rudzicz, Yu Tsao

    Abstract: Many people are suffering from voice disorders, which can adversely affect the quality of their lives. In response, some researchers have proposed algorithms for automatic assessment of these disorders, based on voice signals. However, these signals can be sensitive to the recording devices. Indeed, the channel effect is a pervasive problem in machine learning for healthcare. In this study, we pro…

    Submitted 2 December, 2018; v1 submitted 26 November, 2018; originally announced November 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:1811.07216

    Report number: ML4H/2018/200

  47. arXiv:1807.08604  [pdf, other]

    eess.SY cs.RO eess.SP math.OC

    A Frequency-Domain Characterization of Optimal Error Covariance for the Kalman-Bucy Filter

    Authors: Song Fang, Hideaki Ishii, Jie Chen, Karl Henrik Johansson

    Abstract: In this paper, we discover that the trace of the division of the optimal output estimation error covariance over the noise covariance attained by the Kalman-Bucy filter can be explicitly expressed in terms of the plant dynamics and noise statistics in a frequency-domain integral characterization. Towards this end, we examine the algebraic Riccati equation associated with Kalman-Bucy filtering usin…

    Submitted 23 July, 2018; originally announced July 2018.

  48. arXiv:1705.00945  [pdf]

    eess.SY

    Adaptive Noise Cancellation Using Deep Cerebellar Model Articulation Controller

    Authors: Yu Tsao, Hao-Chun Chu, Shih-Wei Lan, Shih-Hau Fang, Junghsi Lee, Chih-Min Lin

    Abstract: This paper proposes a deep cerebellar model articulation controller (DCMAC) for adaptive noise cancellation (ANC). We expand upon the conventional CMAC by stacking single-layer CMAC models into multiple layers to form a DCMAC model and derive a modified backpropagation training algorithm to learn the DCMAC parameters. Compared with conventional CMAC, the DCMAC can characterize nonlinear transfor…

    Submitted 2 May, 2017; originally announced May 2017.

  49. arXiv:1412.1056   

    eess.SY cs.IT math.OC

    Three Laws of Multivariable Feedback Systems, Extended Spectral Flatness (Extended Wiener Entropy), 'Uncertainty Principles' in Variance Minimization, and Performance Limitations in Minimum Variance Estimation/Filtering

    Authors: Song Fang

    Abstract: In this paper, three laws are obtained for multiple-input multiple-output feedback systems, which are in entropy domain, frequency domain, and time domain, respectively. The system setup is that with causal plants and causal controllers. Those laws characterize the performance limitations of such systems imposed by the feedback mechanism. Some new notions are proposed to facilitate the analysis: n…

    Submitted 19 December, 2014; v1 submitted 1 December, 2014; originally announced December 2014.

    Comments: This paper has been withdrawn by the author due to personal reasons

  50. arXiv:1411.0825   

    eess.SY cs.IT math.OC

    Limitations of state estimation: absolute lower bound of minimum variance estimation/filtering, Gaussianity-whiteness measure (joint Shannon-Wiener entropy), and Gaussianing-whitening filter (maximum Gaussianity-whiteness measure principle)

    Authors: Song Fang

    Abstract: This paper aims at obtaining performance limitations of state estimation in terms of variance minimization (minimum variance estimation and filtering) using information theory. Two new notions, negentropy rate and Gaussianity-whiteness measure (joint Shannon-Wiener entropy), are proposed to facilitate the analysis. Topics such as Gaussianing-whitening filter (the maximum Gaussianity-whiteness meas…

    Submitted 19 December, 2014; v1 submitted 4 November, 2014; originally announced November 2014.

    Comments: This paper has been withdrawn by the author due to personal reasons