Showing 1–50 of 59 results for author: Bai, H

Searching in archive eess. Search in all archives.
  1. arXiv:2505.24160  [pdf, ps, other]

    eess.IV cs.CV

    Beyond the LUMIR challenge: The pathway to foundational registration models

    Authors: Junyu Chen, Shuwen Wei, Joel Honkamaa, Pekka Marttinen, Hang Zhang, Min Liu, Yichao Zhou, Zuopeng Tan, Zhuoyuan Wang, Yi Wang, Hongchao Zhou, Shunbo Hu, Yi Zhang, Qian Tao, Lukas Förner, Thomas Wendler, Bailiang Jian, Benedikt Wiestler, Tim Hable, Jin Kim, Dan Ruan, Frederic Madesta, Thilo Sentker, Wiebke Heyer, Lianrui Zuo, et al. (11 additional authors not shown)

    Abstract: Medical image challenges have played a transformative role in advancing the field, catalyzing algorithmic innovation and establishing new performance standards across diverse clinical applications. Image registration, a foundational task in neuroimaging pipelines, has similarly benefited from the Learn2Reg initiative. Building on this foundation, we introduce the Large-scale Unsupervised Brain MRI…

    Submitted 29 May, 2025; originally announced May 2025.

  2. arXiv:2505.19206  [pdf, other]

    cs.CL cs.LG cs.SD eess.AS

    SpeakStream: Streaming Text-to-Speech with Interleaved Data

    Authors: Richard He Bai, Zijin Gu, Tatiana Likhomanenko, Navdeep Jaitly

    Abstract: The latency bottleneck of traditional text-to-speech (TTS) systems fundamentally hinders the potential of streaming large language models (LLMs) in conversational AI. These TTS systems, typically trained and inferenced on complete utterances, introduce unacceptable delays, even with optimized inference speeds, when coupled with streaming LLM outputs. This is particularly problematic for creating r…

    Submitted 25 May, 2025; originally announced May 2025.

  3. arXiv:2505.04657  [pdf, other]

    eess.IV cs.MM

    EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events

    Authors: Shuoyan Wei, Feng Li, Shengeng Tang, Yao Zhao, Huihui Bai

    Abstract: Continuous space-time video super-resolution (C-STVSR) endeavors to upscale videos simultaneously at arbitrary spatial and temporal scales, which has recently garnered increasing interest. However, prevailing methods struggle to yield satisfactory videos at out-of-distribution spatial and temporal scales. On the other hand, event streams characterized by high temporal resolution and high dynamic r…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: 19 pages, 11 figures, 11 tables. Accepted to CVPR 2025 (Highlight)

  4. arXiv:2504.20927  [pdf, other]

    eess.SY cs.LG cs.MA math.OC

    Exploiting inter-agent coupling information for efficient reinforcement learning of cooperative LQR

    Authors: Shahbaz P Qadri Syed, He Bai

    Abstract: Developing scalable and efficient reinforcement learning algorithms for cooperative multi-agent control has received significant attention over the past years. Existing literature has proposed inexact decompositions of local Q-functions based on empirical information structures between the agents. In this paper, we exploit inter-agent coupling information and propose a systematic approach to exact…

    Submitted 29 April, 2025; originally announced April 2025.

    Comments: Accepted at Learning for Dynamics and Control (L4DC), 2025

  5. arXiv:2503.23772  [pdf, other]

    eess.IV

    TransVFC: A Transformable Video Feature Compression Framework for Machines

    Authors: Yuxiao Sun, Yao Zhao, Meiqin Liu, Chao Yao, Huihui Bai, Chunyu Lin, Weisi Lin

    Abstract: Nowadays, more and more video transmissions primarily aim at downstream machine vision tasks rather than humans. While widely deployed Human Visual System (HVS) oriented video coding standards like H.265/HEVC and H.264/AVC are efficient, they are not the optimal approaches for Video Coding for Machines (VCM) scenarios, leading to unnecessary bitrate expenditure. The academic and technical explorat…

    Submitted 31 March, 2025; originally announced March 2025.

    Comments: This paper has been submitted to Elsevier's journal Pattern Recognition

  6. Simultaneous Automatic Picking and Manual Picking Refinement for First-Break

    Authors: Haowen Bai, Zixiang Zhao, Jiangshe Zhang, Yukun Cui, Chunxia Zhang, Zhenbo Guo, Yongjun Wang

    Abstract: First-break picking is a pivotal procedure in processing microseismic data for geophysics and resource exploration. Recent advancements in deep learning have catalyzed the evolution of automated methods for identifying first-break. Nevertheless, the complexity of seismic data acquisition and the requirement for detailed, expert-driven labeling often result in outliers and potential mislabeling wit…

    Submitted 3 February, 2025; originally announced February 2025.

    Journal ref: IEEE Transactions on Geoscience and Remote Sensing (TGRS) (Volume: 62), May 14, 2024, Article Sequence Number: 5916112

  7. arXiv:2412.17943  [pdf, other]

    eess.IV

    Optimizing Prompt Strategies for SAM: Advancing lesion Segmentation Across Diverse Medical Imaging Modalities

    Authors: Yuli Wang, Victoria Shi, Wen-Chi Hsu, Yuwei Dai, Sophie Yao, Zhusi Zhong, Zishu Zhang, Jing Wu, Aaron Maxwell, Scott Collins, Zhicheng Jiao, Harrison X. Bai

    Abstract: Purpose: To evaluate various Segment Anything Model (SAM) prompt strategies across four lesion datasets and to subsequently develop a reinforcement learning (RL) agent to optimize SAM prompt placement. Materials and Methods: This retrospective study included patients with four independent ovarian, lung, renal, and breast tumor datasets. Manual segmentation and SAM-assisted segmentation were per…

    Submitted 28 December, 2024; v1 submitted 23 December, 2024; originally announced December 2024.

  8. arXiv:2412.10044  [pdf, other]

    eess.SP

    Data-Driven Quantification of Battery Degradation Modes via Critical Features from Charging

    Authors: Yuanhao Cheng, Hanyu Bai, Yichen Liang, Xiaofan Cui, Weiren Jiang, Ziyou Song

    Abstract: Battery degradation modes influence the aging behavior of Li-ion batteries, leading to accelerated capacity loss and potential safety issues. Quantifying these aging mechanisms poses challenges for both online and offline diagnostics in charging station applications. Data-driven algorithms have emerged as effective tools for addressing state-of-health issues by learning hard-to-model electrochemic…

    Submitted 13 December, 2024; originally announced December 2024.

  9. arXiv:2412.04201  [pdf, other]

    cs.CV eess.IV

    Hipandas: Hyperspectral Image Joint Denoising and Super-Resolution by Image Fusion with the Panchromatic Image

    Authors: Shuang Xu, Zixiang Zhao, Haowen Bai, Chang Yu, Jiangjun Peng, Xiangyong Cao, Deyu Meng

    Abstract: Hyperspectral images (HSIs) are frequently noisy and of low resolution due to the constraints of imaging devices. Recently launched satellites can concurrently acquire HSIs and panchromatic (PAN) images, enabling the restoration of HSIs to generate clean and high-resolution imagery through fusing PAN images for denoising and super-resolution. However, previous studies treated these two tasks as in…

    Submitted 5 December, 2024; originally announced December 2024.

  10. arXiv:2411.17690  [pdf, other]

    cs.MM cs.CV cs.SD eess.AS

    Visatronic: A Multimodal Decoder-Only Model for Speech Synthesis

    Authors: Akshita Gupta, Tatiana Likhomanenko, Karren Dai Yang, Richard He Bai, Zakaria Aldeneh, Navdeep Jaitly

    Abstract: The rapid progress of foundation models and large language models (LLMs) has fueled significant improvement in the capabilities of machine learning systems that benefit from multimodal input data. However, existing multimodal models are predominantly built on top of pre-trained LLMs, which can limit accurate modeling of temporal dependencies across other modalities and thus limit the model's abi…

    Submitted 29 May, 2025; v1 submitted 26 November, 2024; originally announced November 2024.

  11. arXiv:2410.02231  [pdf, other]

    cs.AI cs.LG eess.SY

    SEAL: SEmantic-Augmented Imitation Learning via Language Model

    Authors: Chengyang Gu, Yuxin Pan, Haotian Bai, Hui Xiong, Yize Chen

    Abstract: Hierarchical Imitation Learning (HIL) is a promising approach for tackling long-horizon decision-making tasks. However, it is a challenging task due to the lack of detailed supervisory labels for sub-goal learning and the reliance on hundreds to thousands of expert demonstrations. In this work, we introduce SEAL, a novel framework that leverages Large Language Models' (LLMs') powerful semantic and world…

    Submitted 3 October, 2024; originally announced October 2024.

    Comments: 18 pages, 5 figures, in submission

  12. arXiv:2409.10788  [pdf, other]

    eess.AS cs.SD

    Exploring Prediction Targets in Masked Pre-Training for Speech Foundation Models

    Authors: Li-Wei Chen, Takuya Higuchi, He Bai, Ahmed Hussen Abdelaziz, Alexander Rudnicky, Shinji Watanabe, Tatiana Likhomanenko, Barry-John Theobald, Zakaria Aldeneh

    Abstract: Speech foundation models, such as HuBERT and its variants, are pre-trained on large amounts of unlabeled speech data and then used for a range of downstream tasks. These models use a masked prediction objective, where the model learns to predict information about masked input segments from the unmasked context. The choice of prediction targets in this framework impacts their performance on downstr…

    Submitted 17 January, 2025; v1 submitted 16 September, 2024; originally announced September 2024.

    Comments: ICASSP 2025

  13. Content-driven Magnitude-Derivative Spectrum Complementary Learning for Hyperspectral Image Classification

    Authors: Huiyan Bai, Tingfa Xu, Huan Chen, Peifu Liu, Jianan Li

    Abstract: Extracting discriminative information from complex spectral details in hyperspectral images (HSIs) for HSI classification is pivotal. While current prevailing methods rely on spectral magnitude features, they could cause confusion in certain classes, resulting in misclassification and decreased accuracy. We find that the derivative spectrum proves more adept at capturing concealed information, there…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: accepted by TGRS

  14. arXiv:2407.15835  [pdf, other]

    cs.CL cs.AI cs.SD eess.AS

    dMel: Speech Tokenization made Simple

    Authors: Richard He Bai, Tatiana Likhomanenko, Ruixiang Zhang, Zijin Gu, Zakaria Aldeneh, Navdeep Jaitly

    Abstract: Large language models have revolutionized natural language processing by leveraging self-supervised pretraining on vast textual data. Inspired by this success, researchers have investigated various compression-based speech tokenization methods to discretize continuous speech signals, enabling the application of language modeling techniques to discrete tokens. However, an audio compressor introduces a…

    Submitted 21 May, 2025; v1 submitted 22 July, 2024; originally announced July 2024.

    Comments: preprint

  15. arXiv:2407.13179  [pdf, other]

    eess.IV cs.CV

    Learned HDR Image Compression for Perceptually Optimal Storage and Display

    Authors: Peibei Cao, Haoyu Chen, Jingzhe Ma, Yu-Chieh Yuan, Zhiyong Xie, Xin Xie, Haiqing Bai, Kede Ma

    Abstract: High dynamic range (HDR) capture and display have seen significant growth in popularity driven by the advancements in technology and increasing consumer demand for superior image quality. As a result, HDR image compression is crucial to fully realize the benefits of HDR imaging without suffering from large file sizes and inefficient data handling. Conventionally, this is achieved by introducing a…

    Submitted 18 July, 2024; originally announced July 2024.

  16. arXiv:2407.04888  [pdf, other]

    eess.IV cs.CV

    Unraveling Radiomics Complexity: Strategies for Optimal Simplicity in Predictive Modeling

    Authors: Mahdi Ait Lhaj Loutfi, Teodora Boblea Podasca, Alex Zwanenburg, Taman Upadhaya, Jorge Barrios, David R. Raleigh, William C. Chen, Dante P. I. Capaldi, Hong Zheng, Olivier Gevaert, Jing Wu, Alvin C. Silva, Paul J. Zhang, Harrison X. Bai, Jan Seuntjens, Steffen Löck, Patrick O. Richard, Olivier Morin, Caroline Reinhold, Martin Lepage, Martin Vallières

    Abstract: Background: The high dimensionality of radiomic feature sets, the variability in radiomic feature types and potentially high computational requirements all underscore the need for an effective method to identify the smallest set of predictive features for a given clinical problem. Purpose: Develop a methodology and tools to identify and explain the smallest set of predictive radiomic features. Mat…

    Submitted 5 July, 2024; originally announced July 2024.

  17. arXiv:2406.00758  [pdf, other]

    eess.IV cs.CV cs.MM

    Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption

    Authors: Anqi Li, Feng Li, Yuxi Liu, Runmin Cong, Yao Zhao, Huihui Bai

    Abstract: Although recent generative image compression methods have demonstrated impressive potential in optimizing the rate-distortion-perception trade-off, they still face the critical challenge of flexible rate adaption to diverse compression necessities and scenarios. To overcome this challenge, this paper proposes a Controllable Generative Image Compression framework, termed Control-GIC, the first capa…

    Submitted 4 December, 2024; v1 submitted 2 June, 2024; originally announced June 2024.

  18. arXiv:2405.15216  [pdf, other]

    cs.LG cs.CL cs.SD eess.AS

    Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition

    Authors: Zijin Gu, Tatiana Likhomanenko, He Bai, Erik McDermott, Ronan Collobert, Navdeep Jaitly

    Abstract: Language models (LMs) have long been used to improve results of automatic speech recognition (ASR) systems, but they are unaware of the errors that ASR systems make. Error correction models are designed to fix ASR errors; however, they showed little improvement over traditional LMs mainly due to the lack of supervised training data. In this paper, we present Denoising LM (DLM), which is a…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: under review

  19. arXiv:2405.14905  [pdf, other]

    eess.IV cs.AI cs.CL

    Structural Entities Extraction and Patient Indications Incorporation for Chest X-ray Report Generation

    Authors: Kang Liu, Zhuoqi Ma, Xiaolu Kang, Zhusi Zhong, Zhicheng Jiao, Grayson Baird, Harrison Bai, Qiguang Miao

    Abstract: The automated generation of imaging reports proves invaluable in alleviating the workload of radiologists. A clinically applicable report generation algorithm should demonstrate its effectiveness in producing reports that accurately describe radiology findings and attend to patient-specific indications. In this paper, we introduce a novel method, Structural Entities extraction a…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: The code is available at https://github.com/mk-runner/SEI-Temp or https://github.com/mk-runner/SEI

  20. arXiv:2405.14113  [pdf, other]

    eess.IV cs.CV

    Multi-modality Regional Alignment Network for Covid X-Ray Survival Prediction and Report Generation

    Authors: Zhusi Zhong, Jie Li, John Sollee, Scott Collins, Harrison Bai, Paul Zhang, Terrence Healey, Michael Atalay, Xinbo Gao, Zhicheng Jiao

    Abstract: In response to the worldwide COVID-19 pandemic, advanced automated technologies have emerged as valuable tools to aid healthcare professionals in managing an increased workload by improving radiology report generation and prognostic analysis. This study proposes Multi-modality Regional Alignment Network (MRANet), an explainable model for radiology report generation and survival prediction that foc…

    Submitted 22 May, 2024; originally announced May 2024.

  21. arXiv:2405.13476  [pdf, other]

    eess.SY

    Restricting Voltage Deviation of DC Microgrids with Critical and Ordinary Nodes

    Authors: Handong Bai, Peng Li, Hongwei Zhang

    Abstract: Restricting bus voltage deviation is crucial for normal operation of multi-bus DC microgrids, yet it has received insufficient attention due to the conflict between two main control objectives in DC microgrids, i.e., voltage regulation and current sharing. By revealing a necessary and sufficient condition for achieving these two objectives, this paper proposes a compromised distributed control alg…

    Submitted 22 May, 2024; originally announced May 2024.

  22. arXiv:2403.18707  [pdf, other]

    math.OC eess.SY

    On the Reachability of 3-Dimensional Paths with a Prescribed Curvature Bound

    Authors: Juho Bae, Ji Hoon Bai, Byung-Yoon Lee, Jun-Yong Lee, Chang-Hun Lee

    Abstract: This paper presents the reachability analysis of curves in $\mathbb{R}^3$ with a prescribed curvature bound. Based on the Pontryagin Maximum Principle, we leverage the existing knowledge on the structure of solutions to minimum-time problems, or the Markov-Dubins problem, to reachability considerations. Based on this development, two types of reachability are discussed. First, we prove that any boundary p…

    Submitted 26 March, 2025; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted for publication in Automatica

  23. arXiv:2403.00628  [pdf, other]

    cs.CV eess.IV

    Region-Adaptive Transform with Segmentation Prior for Image Compression

    Authors: Yuxi Liu, Wenhan Yang, Huihui Bai, Yunchao Wei, Yao Zhao

    Abstract: Learned Image Compression (LIC) has shown remarkable progress in recent years. Existing works commonly employ CNN-based or self-attention-based modules as transform methods for compression. However, there is no prior research on neural transform that focuses on specific regions. In response, we introduce the class-agnostic segmentation masks (i.e. semantic masks without category labels) for extrac…

    Submitted 24 September, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

    Comments: Accepted to ECCV 2024

  24. arXiv:2401.14304  [pdf, other]

    eess.SY

    Constraint-Aware Mesh Refinement Method by Reachability Set Envelope of Curvature Bounded Paths

    Authors: Juho Bae, Ji Hoon Bai, Byung-Yoon Lee, Jun-Yong Lee

    Abstract: This paper presents an enhanced direct-method-based approach for the real-time solution of optimal control problems to handle path constraints, such as obstacles. The principal contributions of this work are twofold: first, the existing methods for constructing reachability sets in the literature are extended to derive the envelope of these sets, which determines the region swept by all feasible t…

    Submitted 4 March, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Preprint submitted to Automatica

  25. PIPO-Net: A Penalty-based Independent Parameters Optimization Deep Unfolding Network

    Authors: Xiumei Li, Zhijie Zhang, Huang Bai, Ljubiša Stanković, Junpeng Hao, Junmei Sun

    Abstract: Compressive sensing (CS) has been widely applied in signal and image processing fields. Traditional CS reconstruction algorithms have a complete theoretical foundation but suffer from high computational complexity, while fashionable deep network-based methods can achieve high-accuracy reconstruction of CS but lack interpretability. These facts motivate us to develop a deep unfolding ne…

    Submitted 4 November, 2023; originally announced November 2023.

  26. A Sigmoid-based car-following model to improve acceleration stability in traffic oscillation and following failure in free flow

    Authors: Xingyu Chen, Haijian Bai

    Abstract: This paper proposes an improved Intelligent driving model (Sigmoid-IDM) to address the problems of excessive acceleration in traffic oscillation and following failure in free flow. The Sigmoid-IDM uses a Sigmoid function to enhance the start-following characteristics, improve the output strategy of the spacing term, and stabilize the steady-state velocity in free flow. Moreover, the model asymmetr…

    Submitted 3 September, 2023; originally announced September 2023.

    Comments: 15 pages, 51 figures

  27. arXiv:2308.06786  [pdf, other]

    eess.SY

    Challenges and Opportunities for Second-life Batteries: A Review of Key Technologies and Economy

    Authors: Xubo Gu, Hanyu Bai, Xiaofan Cui, Juner Zhu, Weichao Zhuang, Zhaojian Li, Xiaosong Hu, Ziyou Song

    Abstract: Due to the increasing volume of Electric Vehicles in automotive markets and the limited lifetime of onboard lithium-ion batteries (LIBs), the large-scale retirement of LIBs is imminent. The battery packs retired from Electric Vehicles still retain 70%-80% of the initial capacity, thus having the potential to be utilized in scenarios with lower energy and power requirements to maximize the value of LI…

    Submitted 13 August, 2023; originally announced August 2023.

  28. arXiv:2306.15561  [pdf, other]

    cs.CV cs.MM eess.IV

    You Can Mask More For Extremely Low-Bitrate Image Compression

    Authors: Anqi Li, Feng Li, Jiaxin Han, Huihui Bai, Runmin Cong, Chunjie Zhang, Meng Wang, Weisi Lin, Yao Zhao

    Abstract: Learned image compression (LIC) methods have experienced significant progress during recent years. However, these methods are primarily dedicated to optimizing the rate-distortion (R-D) performance at medium and high bitrates (> 0.1 bits per pixel (bpp)), while research on extremely low bitrates is limited. Besides, existing methods fail to explicitly explore the image structure and texture compon…

    Submitted 27 June, 2023; originally announced June 2023.

    Comments: Under review

  29. arXiv:2306.08941  [pdf, other]

    eess.IV cs.CV

    Exploring Resolution Fields for Scalable Image Compression with Uncertainty Guidance

    Authors: Dongyi Zhang, Feng Li, Man Liu, Runmin Cong, Huihui Bai, Meng Wang, Yao Zhao

    Abstract: Recently, there have been significant advancements in learning-based image compression methods surpassing traditional coding standards. Most of them prioritize achieving the best rate-distortion performance for a particular compression rate, which limits their flexibility and adaptability in various applications with complex and varying constraints. In this work, we explore the potential of resolution f…

    Submitted 15 June, 2023; originally announced June 2023.

  30. arXiv:2302.10864  [pdf, other]

    eess.SY

    Reinforcement Learning-based Control of Nonlinear Systems using Carleman Approximation: Structured and Unstructured Designs

    Authors: Jishnudeep Kar, He Bai, Aranya Chakrabortty

    Abstract: We develop data-driven reinforcement learning (RL) control designs for input-affine nonlinear systems. We use Carleman linearization to express the state-space representation of the nonlinear dynamical model in the Carleman space, and develop a real-time algorithm that can learn nonlinear state-feedback controllers using state and input measurements in the infinite-dimensional Carleman space. Ther…

    Submitted 7 August, 2024; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: 18 pages, extended version, preprint version

  31. arXiv:2211.03545  [pdf, other]

    eess.AS cs.CL cs.SD

    ERNIE-SAT: Speech and Text Joint Pretraining for Cross-Lingual Multi-Speaker Text-to-Speech

    Authors: Xiaoran Fan, Chao Pang, Tian Yuan, He Bai, Renjie Zheng, Pengfei Zhu, Shuohuan Wang, Junkun Chen, Zeyu Chen, Liang Huang, Yu Sun, Hua Wu

    Abstract: Speech representation learning has improved both speech understanding and speech synthesis tasks for a single language. However, its ability in cross-lingual scenarios has not been explored. In this paper, we extend the pretraining method for cross-lingual multi-speaker speech synthesis tasks, including cross-lingual multi-speaker voice cloning and cross-lingual multi-speaker speech editing. We prop…

    Submitted 4 December, 2022; v1 submitted 7 November, 2022; originally announced November 2022.

  32. arXiv:2206.08023  [pdf, other]

    eess.IV cs.CV cs.LG

    AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation

    Authors: Yuanfeng Ji, Haotian Bai, Jie Yang, Chongjian Ge, Ye Zhu, Ruimao Zhang, Zhen Li, Lingyan Zhang, Wanling Ma, Xiang Wan, Ping Luo

    Abstract: Despite the considerable progress in automatic abdominal multi-organ segmentation from CT/MRI scans in recent years, a comprehensive evaluation of the models' capabilities is hampered by the lack of a large-scale benchmark from diverse clinical scenarios. Constrained by the high cost of collecting and labeling 3D medical data, most of the deep learning models to date are driven by datasets with a l…

    Submitted 1 September, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

  33. arXiv:2206.06401  [pdf]

    cs.NI eess.SY

    GoAutoBash: Golang-based Multi-Thread Automatic Pull-Execute Framework with GitHub Webhooks And Queuing Strategy

    Authors: Hao Bai

    Abstract: Recently, more and more server tasks are done using full automation, including grading tasks for students in college courses, integrating tasks for programmers in big projects and server-based transactions, and visualization tasks for researchers in a data-dense topic. Using automation on servers provides a great possibility for reducing the burden on manual tasks. Although server tools like C…

    Submitted 13 June, 2022; originally announced June 2022.

    Comments: Accepted by EPCE'22

  34. arXiv:2203.09690  [pdf, other]

    eess.AS cs.CL cs.SD

    A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

    Authors: He Bai, Renjie Zheng, Junkun Chen, Xintong Li, Mingbo Ma, Liang Huang

    Abstract: Recently, speech representation learning has improved many speech-related tasks such as speech recognition, speech classification, and speech-to-text translation. However, all the above tasks are in the direction of speech understanding, but for the inverse direction, speech synthesis, the potential of representation learning is yet to be realized, due to the challenging nature of generating high-…

    Submitted 18 June, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Accepted by ICML 2022, 12 pages, 10 figures

  35. arXiv:2201.04962  [pdf, other]

    cs.MA cs.AI cs.LG eess.SY math.OC

    Distributed Cooperative Multi-Agent Reinforcement Learning with Directed Coordination Graph

    Authors: Gangshan Jing, He Bai, Jemin George, Aranya Chakrabortty, Piyush. K. Sharma

    Abstract: Existing distributed cooperative multi-agent reinforcement learning (MARL) frameworks usually assume undirected coordination graphs and communication graphs while estimating a global reward via consensus algorithms for policy evaluation. Such a framework may induce expensive communication costs and exhibit poor scalability due to the requirement of global consensus. In this work, we study MARLs with d…

    Submitted 9 January, 2022; originally announced January 2022.

  36. arXiv:2112.09574  [pdf]

    eess.IV cs.CV cs.LG

    Super-resolution reconstruction of cytoskeleton image based on A-net deep learning network

    Authors: Qian Chen, Haoxin Bai, Bingchen Che, Tianyun Zhao, Ce Zhang, Kaige Wang, Jintao Bai, Wei Zhao

    Abstract: To date, live-cell imaging at the nanometer scale remains challenging. Even though super-resolution microscopy methods have enabled visualization of subcellular structures below the optical resolution limit, the spatial resolution is still far from enough for the structural reconstruction of biomolecules in vivo (i.e. ~24 nm thickness of microtubule fiber). In this study, we proposed an A-net netw…

    Submitted 17 December, 2021; originally announced December 2021.

    Comments: The manuscript has 17 pages, 10 figures and 58 references

  37. arXiv:2107.12416  [pdf, other]

    eess.SY cs.AI cs.LG math.OC

    Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent

    Authors: Gangshan Jing, He Bai, Jemin George, Aranya Chakrabortty, Piyush K. Sharma

    Abstract: Recently introduced distributed zeroth-order optimization (ZOO) algorithms have shown their utility in distributed reinforcement learning (RL). Unfortunately, in the gradient estimation process, almost all of them require random samples with the same dimension as the global variable and/or require evaluation of the global cost function, which may induce high estimation variance for large-scale net…

    Submitted 2 May, 2024; v1 submitted 26 July, 2021; originally announced July 2021.

    Comments: The arXiv version contains proofs of Lemma 3 and Lemma 5, which are missing in the published version

  38. Variance Reduction of Quadcopter Trajectory Tracking in Turbulent Wind

    Authors: Asma Tabassum, Rohit K. S. S. Vuppala, He Bai, Kursat Kara

    Abstract: We consider a quadcopter operating in a turbulent windy environment. The turbulent environment may be imposed on a quadcopter by structures, landscapes, terrains and most importantly by the unique physical phenomena in the lower atmosphere. Turbulence can negatively impact a quadcopter's performance and operations. Modeling turbulence as a stochastic random input, we investigate control designs that…

    Submitted 25 August, 2021; v1 submitted 20 April, 2021; originally announced April 2021.

  39. Dynamic Control Allocation between Onboard and Delayed Remote Control for Unmanned Aircraft System Detect-and-Avoid

    Authors: Asma Tabassum, He Bai

    Abstract: This paper develops and evaluates the performance of an allocation agent to be potentially integrated into the onboard Detect and Avoid (DAA) computer of an Unmanned Aircraft System (UAS). We consider a UAS that can be fully controlled by the onboard DAA system and by a remote human pilot. With a communication channel prone to latency, we consider a mixed initiative interaction environment, where…

    Submitted 13 March, 2021; originally announced March 2021.

  40. arXiv:2103.04480  [pdf, other]

    eess.SY math.OC

    Learning Distributed Stabilizing Controllers for Multi-Agent Systems

    Authors: Gangshan Jing, He Bai, Jemin George, Aranya Chakrabortty, Piyush K. Sharma

    Abstract: We address the problem of model-free distributed stabilization of heterogeneous multi-agent systems using reinforcement learning (RL). Two algorithms are developed. The first algorithm solves a centralized linear quadratic regulator (LQR) problem without knowing any initial stabilizing gain in advance. The second algorithm builds upon the results of the first algorithm, and extends it to distribut…

    Submitted 7 March, 2021; originally announced March 2021.

    Comments: This paper proposes model-free RL algorithms for deriving stabilizing gains of continuous-time multi-agent systems

  41. arXiv:2012.09154  [pdf]

    eess.IV cs.CV physics.optics

    Exploration of Whether Skylight Polarization Patterns Contain Three-dimensional Attitude Information

    Authors: Huaju Liang, Hongyang Bai, Tong Zhou

    Abstract: Our previous work has demonstrated that the Rayleigh model, which is widely used in polarized skylight navigation to describe skylight polarization patterns, does not contain three-dimensional (3D) attitude information [1]. However, it is still necessary to further explore whether the skylight polarization patterns contain 3D attitude information. So, in this paper, a social spider optimization (SSO)…

    Submitted 30 November, 2020; originally announced December 2020.

  42. Online Observer-Based Inverse Reinforcement Learning

    Authors: Ryan Self, Kevin Coleman, He Bai, Rushikesh Kamalapurkar

    Abstract: In this paper, a novel approach to the output-feedback inverse reinforcement learning (IRL) problem is developed by casting the IRL problem, for linear systems with quadratic cost functions, as a state estimation problem. Two observer-based techniques for IRL are developed, including a novel observer method that re-uses previous state estimates via history stacks. Theoretical guarantees for conver…

    Submitted 17 July, 2023; v1 submitted 3 November, 2020; originally announced November 2020.

    Comments: 7 pages, 3 figures

  43. arXiv:2010.08615  [pdf, other

    eess.SY cs.AI math.OC

    Decomposability and Parallel Computation of Multi-Agent LQR

    Authors: Gangshan Jing, He Bai, Jemin George, Aranya Chakrabortty

    Abstract: Individual agents in a multi-agent system (MAS) may have decoupled open-loop dynamics, but a cooperative control objective usually results in coupled closed-loop dynamics thereby making the control design computationally expensive. The computation time becomes even higher when a learning strategy such as reinforcement learning (RL) needs to be applied to deal with the situation when the agents dyn…

    Submitted 7 March, 2021; v1 submitted 16 October, 2020; originally announced October 2020.

    Comments: This paper contains proofs of all the theorems in the conference paper "Decomposability and Parallel Computation of Multi-Agent LQR"

  44. arXiv:2008.06604  [pdf, other

    eess.SY cs.MA math.OC

    Model-Free Optimal Control of Linear Multi-Agent Systems via Decomposition and Hierarchical Approximation

    Authors: Gangshan Jing, He Bai, Jemin George, Aranya Chakrabortty

    Abstract: Designing the optimal linear quadratic regulator (LQR) for a large-scale multi-agent system (MAS) is time-consuming since it involves solving a large-size matrix Riccati equation. The situation is further exacerbated when the design needs to be done in a model-free way using schemes such as reinforcement learning (RL). To reduce this computational complexity, we decompose the large-scale LQR desig…

    Submitted 16 March, 2021; v1 submitted 14 August, 2020; originally announced August 2020.

    Comments: This paper proposes a hierarchical learning and control framework for model-free LQR of heterogeneous linear multi-agent systems

  45. arXiv:2007.14186  [pdf, ps, other

    eess.SY math.DS

    Hierarchical Control of Multi-Agent Systems using Online Reinforcement Learning

    Authors: He Bai, Jemin George, Aranya Chakrabortty

    Abstract: We propose a new reinforcement learning based approach to designing hierarchical linear quadratic regulator (LQR) controllers for heterogeneous linear multi-agent systems with unknown state-space models and separated control objectives. The separation arises from grouping the agents into multiple non-overlapping groups, and defining the control goal as two distinct objectives. The first objective…

    Submitted 28 July, 2020; originally announced July 2020.

  46. arXiv:2006.11667  [pdf, other

    eess.SP

    Emulating UAV Motion by Utilizing Robotic Arm for mmWave Wireless Channel Characterization

    Authors: Amit Kachroo, Collin A. Thornton, Md Arifur Rahman Sarker, Wooyeol Choi, He Bai, Ickhyun Song, John O'Hara, Sabit Ekin

    Abstract: In this paper, millimeter wave (mmWave) wireless channel characteristics (Doppler spread and path loss modeling) for Unmanned Aerial Vehicle (UAV) assisted communication are analyzed and studied by emulating the real UAV motion using a robotic arm. The motion considers the actual turbulence caused by the wind gusts to the UAV in the atmosphere, which is statistically modeled by the widely used Dr…

    Submitted 21 March, 2021; v1 submitted 20 June, 2020; originally announced June 2020.

    Comments: 12 pages, 17 figures, accepted for publication in IEEE Transactions on Antennas and Propagation

  47. Reduced-Dimensional Reinforcement Learning Control using Singular Perturbation Approximations

    Authors: Sayak Mukherjee, He Bai, Aranya Chakrabortty

    Abstract: We present a set of model-free, reduced-dimensional reinforcement learning (RL) based optimal control designs for linear time-invariant singularly perturbed (SP) systems. We first present a state-feedback and output-feedback based RL control design for a generic SP system with unknown state and input matrices. We take advantage of the underlying time-scale separation property of the plant to learn…

    Submitted 29 April, 2020; originally announced April 2020.

    Journal ref: Automatica 2021 (full version with proofs)

  48. arXiv:2001.03851  [pdf, other

    cs.CV eess.IV

    Deep Optimized Multiple Description Image Coding via Scalar Quantization Learning

    Authors: Lijun Zhao, Huihui Bai, Anhong Wang, Yao Zhao

    Abstract: In this paper, we introduce a deep multiple description coding (MDC) framework optimized by minimizing multiple description (MD) compressive loss. First, an MD multi-scale-dilated encoder network generates multiple description tensors, which are discretized by scalar quantizers, while these quantized tensors are decompressed by MD cascaded-ResBlock decoder networks. To greatly reduce the total amount…

    Submitted 12 January, 2020; originally announced January 2020.

    Comments: 14 pages, 10 figures

  49. arXiv:1912.02057  [pdf, other

    cs.LG eess.SP

    RTN: Reparameterized Ternary Network

    Authors: Yuhang Li, Xin Dong, Sai Qian Zhang, Haoli Bai, Yuanpeng Chen, Wei Wang

    Abstract: To deploy deep neural networks on resource-limited devices, quantization has been widely explored. In this work, we study extremely low-bit networks, which offer tremendous speed-up and memory savings with quantized activations and weights. We first bring up three omitted issues in extremely low-bit networks: the squashing range of quantized values; the gradient vanishing during backpropagation and t…

    Submitted 12 December, 2019; v1 submitted 4 December, 2019; originally announced December 2019.

    Comments: To appear at AAAI-20

  50. arXiv:1910.12258  [pdf, ps, other

    cs.LG eess.SP stat.ML

    Compressed Sensing with Probability-based Prior Information

    Authors: Q. Jiang, S. Li, Z. Zhu, H. Bai, X. He, R. C. de Lamare

    Abstract: This paper deals with the design of a sensing matrix along with a sparse recovery algorithm by utilizing the probability-based prior information for a compressed sensing system. With the knowledge of the probability for each atom of the dictionary being used, a diagonal weighted matrix is obtained and then the sensing matrix is designed by minimizing a weighted function such that the Gram of the equ…

    Submitted 27 October, 2019; originally announced October 2019.

    Comments: 13 pages, 9 figures