Showing 1–33 of 33 results for author: Yuan, L

Searching in archive eess.
  1. arXiv:2507.03315  [pdf, ps, other]

    eess.IV cs.CV

    Towards Interpretable PolSAR Image Classification: Polarimetric Scattering Mechanism Informed Concept Bottleneck and Kolmogorov-Arnold Network

    Authors: Jinqi Zhang, Fangzhou Han, Di Zhuang, Lamei Zhang, Bin Zou, Li Yuan

    Abstract: In recent years, Deep Learning (DL) based methods have received extensive attention in the field of PolSAR image classification and show excellent performance. However, due to the "black-box" nature of DL methods, the interpretation of the high-dimensional features extracted and the backtracking of the decision-making process based on the features are still unresolved problems.…

    Submitted 4 July, 2025; originally announced July 2025.

  2. arXiv:2506.17540  [pdf, ps, other]

    eess.IV cs.CV cs.LG

    MTSIC: Multi-stage Transformer-based GAN for Spectral Infrared Image Colorization

    Authors: Tingting Liu, Yuan Liu, Jinhui Tang, Liyin Yuan, Chengyu Liu, Chunlai Li, Xiubao Sui, Qian Chen

    Abstract: Thermal infrared (TIR) images, acquired through thermal radiation imaging, are unaffected by variations in lighting conditions and atmospheric haze. However, TIR images inherently lack color and texture information, limiting downstream tasks and potentially causing visual fatigue. Existing colorization methods primarily rely on single-band images with limited spectral information and insufficient…

    Submitted 20 June, 2025; originally announced June 2025.

  3. arXiv:2503.17708  [pdf, other]

    cs.NI eess.SP

    RAISE: Optimizing RIS Placement to Maximize Task Throughput in Multi-Server Vehicular Edge Computing

    Authors: Yanan Ma, Zhengru Fang, Longzhi Yuan, Yiqin Deng, Xianhao Chen, Yuguang Fang

    Abstract: Given the limited computing capabilities on autonomous vehicles, onboard processing of large volumes of latency-sensitive tasks presents significant challenges. While vehicular edge computing (VEC) has emerged as a solution, offloading data-intensive tasks to roadside servers or other vehicles is hindered by large obstacles like trucks/buses and the surge in service demands during rush hours. To a…

    Submitted 22 March, 2025; originally announced March 2025.

    Comments: 14 pages, 10 figures

  4. arXiv:2502.08455  [pdf, other]

    cs.MA eess.SY

    Resilient Quantized Consensus in Multi-Hop Relay Networks

    Authors: Liwei Yuan, Hideaki Ishii

    Abstract: We study resilient quantized consensus in multi-agent systems, where some agents may malfunction. The network consists of agents taking integer-valued states, and the agents' communication is subject to asynchronous updates and time delays. We utilize the quantized weighted mean subsequence reduced algorithm where agents communicate with others through multi-hop relays. We prove necessary and suff…

    Submitted 12 February, 2025; originally announced February 2025.

    Comments: 13 pages

  5. arXiv:2411.09954  [pdf, other]

    cs.MA eess.SY

    Reaching Resilient Leader-Follower Consensus in Time-Varying Networks via Multi-Hop Relays

    Authors: Liwei Yuan, Hideaki Ishii

    Abstract: We study resilient leader-follower consensus of multi-agent systems (MASs) in the presence of adversarial agents, where agents' communication is modeled by time-varying topologies. The objective is to develop distributed algorithms for the nonfaulty/normal followers to track an arbitrary reference value propagated by a set of leaders while they are in interaction with the unknown adversarial agent…

    Submitted 15 November, 2024; originally announced November 2024.

    Comments: 15 pages

  6. arXiv:2409.01199  [pdf, other]

    cs.CV eess.IV

    OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model

    Authors: Liuhan Chen, Zongjian Li, Bin Lin, Bin Zhu, Qian Wang, Shenghai Yuan, Xing Zhou, Xinhua Cheng, Li Yuan

    Abstract: The Variational Autoencoder (VAE), which compresses videos into latent representations, is a crucial preceding component of Latent Video Diffusion Models (LVDMs). With the same reconstruction quality, the more sufficient the VAE's compression for videos is, the more efficient the LVDMs are. However, most LVDMs utilize a 2D image VAE, whose compression for videos is only in the spatial dimension and often ign…

    Submitted 9 September, 2024; v1 submitted 2 September, 2024; originally announced September 2024.

    Comments: https://github.com/PKU-YuanGroup/Open-Sora-Plan

  7. arXiv:2405.18752  [pdf, other]

    cs.MA eess.SY

    Resilient Average Consensus with Adversaries via Distributed Detection and Recovery

    Authors: Liwei Yuan, Hideaki Ishii

    Abstract: We study the problem of resilient average consensus in multi-agent systems where some of the agents are subject to failures or attacks. The objective of resilient average consensus is for non-faulty/normal agents to converge to the average of their initial values despite the erroneous effects from malicious agents. To this end, we propose a successful distributed iterative resilient average consen…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 16 pages

  8. EvaNet: Elevation-Guided Flood Extent Mapping on Earth Imagery (Extended Version)

    Authors: Mirza Tanzim Sami, Da Yan, Saugat Adhikari, Lyuheng Yuan, Jiao Han, Zhe Jiang, Jalal Khalil, Yang Zhou

    Abstract: Accurate and timely mapping of flood extent from high-resolution satellite imagery plays a crucial role in disaster management such as damage assessment and relief activities. However, current state-of-the-art solutions are based on U-Net, which cannot segment the flood pixels accurately due to the ambiguous pixels (e.g., tree canopies, clouds) that prevent a direct judgement from only the spectr…

    Submitted 25 September, 2024; v1 submitted 27 April, 2024; originally announced April 2024.

    Comments: Published at the International Joint Conference on Artificial Intelligence (IJCAI, 2024)

  9. arXiv:2404.17170  [pdf, other]

    cs.CV eess.IV

    Image Quality Assessment With Compressed Sampling

    Authors: Ronghua Liao, Chen Hui, Lang Yuan, Haiqi Zhu, Feng Jiang

    Abstract: No-Reference Image Quality Assessment (NR-IQA) aims at estimating image quality in accordance with subjective human perception. However, most methods focus on exploring increasingly complex networks to improve the final performance, accompanied by limitations on input images. Especially when applied to high-resolution (HR) images, these methods often have to adjust the size of the original image to mee…

    Submitted 11 September, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

  10. arXiv:2404.00352  [pdf]

    eess.IV

    Dependability Evaluation of Stable Diffusion with Soft Errors on the Model Parameters

    Authors: Zhen Gao, Lini Yuan, Pedro Reviriego, Shanshan Liu, Fabrizio Lombardi

    Abstract: Stable Diffusion is a popular Transformer-based model for image generation from text; it applies an image information creator to the input text and the visual knowledge is added in a step-by-step fashion to create an image that corresponds to the input text. However, this diffusion process can be corrupted by errors from the underlying hardware, which are especially relevant for implementations at…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 6 pages, 16 figures

  11. arXiv:2403.12852  [pdf, other]

    eess.IV cs.CV

    Generative Enhancement for 3D Medical Images

    Authors: Lingting Zhu, Noel Codella, Dongdong Chen, Zhenchao Jin, Lu Yuan, Lequan Yu

    Abstract: The limited availability of 3D medical image datasets, due to privacy concerns and high collection or annotation costs, poses significant challenges in the field of medical imaging. While a promising alternative is the use of synthesized medical data, there are few solutions for realistic 3D medical image synthesis due to difficulties in backbone design and fewer 3D training samples compared to 2D…

    Submitted 24 May, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 20 pages, 8 figures

  12. arXiv:2403.07640  [pdf, other]

    cs.MA eess.SY

    Asynchronous Approximate Byzantine Consensus: A Multi-hop Relay Method and Tight Graph Conditions

    Authors: Liwei Yuan, Hideaki Ishii

    Abstract: We study a multi-agent resilient consensus problem, where some agents are of the Byzantine type and try to prevent the normal ones from reaching consensus. In our setting, normal agents communicate with each other asynchronously over multi-hop relay channels with delays. To solve this asynchronous Byzantine consensus problem, we develop the multi-hop weighted mean subsequence reduced (MW-MSR) algo…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: text overlap with arXiv:2201.03214

  13. arXiv:2402.19085  [pdf, other]

    cs.CL cs.AI eess.SY

    Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment

    Authors: Yiju Guo, Ganqu Cui, Lifan Yuan, Ning Ding, Zexu Sun, Bowen Sun, Huimin Chen, Ruobing Xie, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun

    Abstract: Alignment in artificial intelligence pursues the consistency between model responses and human preferences as well as values. In practice, the multifaceted nature of human preferences inadvertently introduces what is known as the "alignment tax": a compromise where enhancements in alignment within one objective (e.g., harmlessness) can diminish performance in others (e.g., helpfulness). However, exi…

    Submitted 11 October, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

    Comments: EMNLP 2024 main conference

  14. arXiv:2401.13995  [pdf, other]

    eess.SP

    Knowledge Graph Driven UAV Cognitive Semantic Communication Systems for Efficient Object Detection

    Authors: Xi Song, Lu Yuan, Zhibo Qu, Fuhui Zhou, Qihui Wu, Tony Q. S. Quek, Rose Qingyang Hu

    Abstract: Unmanned aerial vehicles (UAVs) are widely used for object detection. However, existing UAV-based object detection systems face a serious challenge, namely finite computation, energy, and communication resources, which limit the achievable detection performance. In order to overcome this challenge, a UAV cognitive semantic communication system is proposed by exploiting knowled…

    Submitted 21 February, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

  15. Human Sensing via Passive Spectrum Monitoring

    Authors: Huaizheng Mu, Liangqi Yuan, Jia Li

    Abstract: Human sensing is significantly improving our lifestyle in many fields such as elderly healthcare and public safety. Research has demonstrated that human activity can alter the passive radio frequency (PRF) spectrum, which represents the passive reception of RF signals in the surrounding environment without actively transmitting a target signal. This paper proposes a novel passive human sensing met…

    Submitted 27 June, 2023; originally announced June 2023.

  16. arXiv:2305.12311  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG eess.AS

    i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

    Authors: Ziyi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

    Abstract: The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence; however, the current Vision-Language-Speech landscape is dominated by encoder-only models which lack generative abilities. We propose closing this gap with i-Code V2, the first model capable of generating natural language from any combination of Vision, Language, and Speech data. i-Code V2 is a…

    Submitted 20 May, 2023; originally announced May 2023.

  17. arXiv:2305.11367  [pdf, other]

    cs.CV cs.HC cs.LG eess.SP

    Smart Pressure e-Mat for Human Sleeping Posture and Dynamic Activity Recognition

    Authors: Liangqi Yuan, Yuan Wei, Jia Li

    Abstract: With the emphasis on healthcare, early childhood education, and fitness, non-invasive measurement and recognition methods have received more attention. Pressure sensing has been extensively studied because of its advantages of simple structure, easy access, visualization application, and harmlessness. This paper introduces a Smart Pressure e-Mat (SPeM) system based on piezoresistive material, Velo…

    Submitted 19 November, 2024; v1 submitted 18 May, 2023; originally announced May 2023.

  18. arXiv:2304.06513  [pdf, other]

    eess.SP cs.AI eess.SY

    Passive Radio Frequency-based 3D Indoor Positioning System via Ensemble Learning

    Authors: Liangqi Yuan, Houlin Chen, Robert Ewing, Jia Li

    Abstract: Passive radio frequency (PRF)-based indoor positioning systems (IPS) have attracted researchers' attention due to their low price, easy and customizable configuration, and non-invasive design. This paper proposes a PRF-based three-dimensional (3D) indoor positioning system (PIPS), which is able to use signals of opportunity (SoOP) for positioning and also capture a scenario signature. PIPS passive…

    Submitted 25 March, 2023; originally announced April 2023.

    Comments: DDDAS 2022

  19. arXiv:2205.01818  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV eess.AS

    i-Code: An Integrative and Composable Multimodal Learning Framework

    Authors: Ziyi Yang, Yuwei Fang, Chenguang Zhu, Reid Pryzant, Dongdong Chen, Yu Shi, Yichong Xu, Yao Qian, Mei Gao, Yi-Ling Chen, Liyang Lu, Yujia Xie, Robert Gmyr, Noel Codella, Naoyuki Kanda, Bin Xiao, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

    Abstract: Human intelligence is multimodal; we integrate visual, linguistic, and acoustic signals to maintain a holistic worldview. Most current pretraining methods, however, are limited to one or two modalities. We present i-Code, a self-supervised pretraining framework where users may flexibly combine the modalities of vision, speech, and language into unified and general-purpose vector representations. I…

    Submitted 5 May, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

  20. Event-triggered Approximate Byzantine Consensus with Multi-hop Communication

    Authors: Liwei Yuan, Hideaki Ishii

    Abstract: In this paper, we consider a resilient consensus problem for the multi-agent network where some of the agents are subject to Byzantine attacks and may transmit erroneous state values to their neighbors. In particular, we develop an event-triggered update rule to tackle this problem as well as reduce the communication for each agent. Our approach is based on the mean subsequence reduced (MSR) algor…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: arXiv admin note: text overlap with arXiv:2201.03214

  21. arXiv:2202.06948  [pdf]

    cs.NE cs.HC cs.LG eess.SP

    Towards Best Practice of Interpreting Deep Learning Models for EEG-based Brain Computer Interfaces

    Authors: Jian Cui, Liqiang Yuan, Zhaoxiang Wang, Ruilin Li, Tianzi Jiang

    Abstract: As deep learning has achieved state-of-the-art performance for many tasks of EEG-based BCI, many efforts have been made in recent years to understand what has been learned by the models. This is commonly done by generating a heatmap indicating to what extent each pixel of the input contributes to the final classification for a trained model. Despite the wide use, it is not yet understood…

    Submitted 17 April, 2023; v1 submitted 12 February, 2022; originally announced February 2022.

  22. arXiv:2201.03214  [pdf, other]

    cs.MA cs.DC eess.SY

    Resilient Consensus with Multi-hop Communication

    Authors: Liwei Yuan, Hideaki Ishii

    Abstract: In this paper, we study the problem of resilient consensus for a multi-agent network where some of the nodes might be adversarial, attempting to prevent consensus by transmitting faulty values. Our approach is based on that of the so-called weighted mean subsequence reduced (W-MSR) algorithm with a special emphasis on its use in agents capable of communicating with multi-hop neighbors. The MSR algor…

    Submitted 10 January, 2022; originally announced January 2022.

  23. arXiv:2112.12386  [pdf, other]

    eess.IV cs.CV

    KFWC: A Knowledge-Driven Deep Learning Model for Fine-grained Classification of Wet-AMD

    Authors: Haihong E, Jiawen He, Tianyi Hu, Lifei Wang, Lifei Yuan, Ruru Zhang, Meina Song

    Abstract: Automated diagnosis using deep neural networks can help ophthalmologists detect the blinding eye disease wet Age-related Macular Degeneration (AMD). Wet-AMD has two similar subtypes, Neovascular AMD and Polypoidal Choroidal Vessels (PCV). However, due to the difficulty in data collection and the similarity between images, most studies have only achieved the coarse-grained classification of wet-AMD…

    Submitted 23 December, 2021; originally announced December 2021.

  24. Automatic Modulation Classification Using Involution Enabled Residual Networks

    Authors: Hao Zhang, Lu Yuan, Guangyu Wu, Fuhui Zhou, Qihui Wu

    Abstract: Automatic modulation classification (AMC) is of crucial importance for realizing wireless intelligence communications. Many deep learning based models, especially convolutional neural networks (CNNs), have been proposed for AMC. However, the computation cost is very high, which makes them inappropriate for beyond the fifth generation wireless communication networks that have stringent requirements on…

    Submitted 23 August, 2021; originally announced August 2021.

    Journal ref: IEEE Wireless Communications Letters, 2021

  25. arXiv:2104.11178  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM eess.IV

    VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text

    Authors: Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, Boqing Gong

    Abstract: We present a framework for learning multimodal representations from unlabeled data using convolution-free Transformer architectures. Specifically, our Video-Audio-Text Transformer (VATT) takes raw signals as inputs and extracts multimodal representations that are rich enough to benefit a variety of downstream tasks. We train VATT end-to-end from scratch using multimodal contrastive losses and eval…

    Submitted 6 December, 2021; v1 submitted 22 April, 2021; originally announced April 2021.

    Comments: Published in the 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  26. arXiv:2103.08888  [pdf, other]

    eess.SY

    AutoFlow: Hotspot-Aware, Dynamic Load Balancing for Distributed Stream Processing

    Authors: Pengqi Lu, Liang Yuan, Yunquan Zhang, Hang Cao, Kun Li

    Abstract: Stream applications are widely deployed on the cloud. While modern distributed streaming systems like Flink and Spark Streaming can schedule and execute them efficiently, streaming dataflows are often dynamically changing, which may cause computation imbalance and backpressure. We introduce AutoFlow, an automatic, hotspot-aware dynamic load balancing system for streaming dataflows. It incorporates a…

    Submitted 16 March, 2021; originally announced March 2021.

  27. arXiv:2101.05087  [pdf, other]

    eess.SY

    Secure Consensus with Distributed Detection via Two-hop Communication

    Authors: Liwei Yuan, Hideaki Ishii

    Abstract: In this paper, we consider a multi-agent resilient consensus problem, where some of the nodes may behave maliciously. The approach is to equip all nodes with a scheme to detect neighboring nodes when they behave in an abnormal fashion. To this end, the nodes exchange not only their own states but also information regarding their neighbor nodes. Such two-hop communication has long been studied in f…

    Submitted 13 January, 2021; originally announced January 2021.

    Comments: 15 pages

  28. arXiv:2012.15564  [pdf, other]

    eess.IV cs.CV

    Exploiting Shared Knowledge from Non-COVID Lesions for Annotation-Efficient COVID-19 CT Lung Infection Segmentation

    Authors: Yichi Zhang, Qingcheng Liao, Lin Yuan, He Zhu, Jiezhen Xing, Jicong Zhang

    Abstract: The novel Coronavirus disease (COVID-19) is a highly contagious virus and has spread all over the world, posing an extremely serious threat to all countries. Automatic lung infection segmentation from computed tomography (CT) plays an important role in the quantitative analysis of COVID-19. However, the major challenge lies in the inadequacy of annotated COVID-19 datasets. Currently, there are sev…

    Submitted 27 July, 2021; v1 submitted 31 December, 2020; originally announced December 2020.

    Comments: 12 pages

  29. arXiv:2011.05623  [pdf, other]

    q-bio.NC cs.CV cs.NE eess.IV

    Fooling the primate brain with minimal, targeted image manipulation

    Authors: Li Yuan, Will Xiao, Giorgia Dellaferrera, Gabriel Kreiman, Francis E. H. Tay, Jiashi Feng, Margaret S. Livingstone

    Abstract: Artificial neural networks (ANNs) are considered the current best models of biological vision. ANNs are the best predictors of neural activity in the ventral stream; moreover, recent work has demonstrated that ANN models fitted to neuronal activity can guide the synthesis of images that drive pre-specified response patterns in small neuronal populations. Despite the success in predicting and steer…

    Submitted 30 March, 2022; v1 submitted 11 November, 2020; originally announced November 2020.

  30. arXiv:2007.00191  [pdf]

    physics.app-ph eess.SY

    Ultrasonic and Electromagnetic Sensors for Downhole Reservoir Characterization

    Authors: K. Wang, H. T. Chien, S. Liao, L. P. Yuan, S. H. Sheen, S. Bakhtiari, A. C. Raptis

    Abstract: The current work covers the evaluation of ultrasonic and electromagnetic (EM) techniques applied to temperature measurement and flow characterization for Enhanced Geothermal System (EGS). We have evaluated both ultrasonic techniques and microwave radiometry for temperature gradient and profile measurements. A waveguide-based ultrasonic probe was developed to measure the temperature gradient. A sta…

    Submitted 30 June, 2020; originally announced July 2020.

    Comments: 7 pages, 5 figures

    Journal ref: PROCEEDINGS, Thirty-Sixth Workshop on Geothermal Reservoir Engineering Stanford University, Stanford, California, January 31 - February 2, 2011 SGP-TR-191

  31. arXiv:2005.00762  [pdf, other]

    cs.CV eess.IV

    Projection Inpainting Using Partial Convolution for Metal Artifact Reduction

    Authors: Lin Yuan, Yixing Huang, Andreas Maier

    Abstract: In computed tomography, due to the presence of metal implants in the patient's body, reconstructed images will suffer from metal artifacts. In order to reduce metal artifacts, metals are typically removed in projection images. Therefore, the metal corrupted projection areas need to be inpainted. For deep learning inpainting methods, convolutional neural networks (CNNs) are widely used, for example,…

    Submitted 2 May, 2020; originally announced May 2020.

    Comments: Technical Report

  32. arXiv:2004.05571  [pdf, other]

    cs.CV cs.GR eess.IV

    Cross-domain Correspondence Learning for Exemplar-based Image Translation

    Authors: Pan Zhang, Bo Zhang, Dong Chen, Lu Yuan, Fang Wen

    Abstract: We present a general framework for exemplar-based image translation, which synthesizes a photo-realistic image from the input in a distinct domain (e.g., semantic segmentation mask, or edge map, or pose keypoints), given an exemplar image. The output has the style (e.g., color, texture) in consistency with the semantically corresponding objects in the exemplar. We propose to jointly learn the cros…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: Accepted as a CVPR 2020 oral paper

    Journal ref: CVPR 2020

  33. arXiv:1909.11591  [pdf, other]

    cs.LG cs.AI cs.LO eess.SY stat.ML

    Modular Deep Reinforcement Learning with Temporal Logic Specifications

    Authors: Lim Zun Yuan, Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening

    Abstract: We propose an actor-critic, model-free, and online Reinforcement Learning (RL) framework for continuous-state continuous-action Markov Decision Processes (MDPs) when the reward is highly sparse but encompasses a high-level temporal structure. We represent this temporal structure by a finite-state machine and construct an on-the-fly synchronised product with the MDP and the finite machine. The temp…

    Submitted 22 November, 2019; v1 submitted 23 September, 2019; originally announced September 2019.

    Comments: arXiv admin note: text overlap with arXiv:1902.00778