-
HPGN: Hybrid Priors-Guided Network for Compressed Low-Light Image Enhancement
Authors:
Hantang Li,
Jinhua Hao,
Lei Xiong,
Shuyuan Zhu
Abstract:
In practical applications, conventional methods generate large volumes of low-light images that require compression for efficient storage and transmission. However, most existing methods either disregard the removal of potential compression artifacts during the enhancement process or fail to establish a unified framework for joint task enhancement of images with varying compression qualities. To s…
▽ More
In practical applications, conventional methods generate large volumes of low-light images that require compression for efficient storage and transmission. However, most existing methods either disregard the removal of potential compression artifacts during the enhancement process or fail to establish a unified framework for joint task enhancement of images with varying compression qualities. To solve this problem, we propose the hybrid priors-guided network (HPGN), which enhances compressed low-light images by integrating both compression and illumination priors. Our approach fully utilizes the JPEG quality factor (QF) and DCT quantization matrix (QM) to guide the design of efficient joint task plug-and-play modules. Additionally, we employ a random QF generation strategy to guide model training, enabling a single model to enhance images across different compression levels. Experimental results confirm the superiority of our proposed method.
△ Less
Submitted 3 April, 2025;
originally announced April 2025.
-
State-Aware Perturbation Optimization for Robust Deep Reinforcement Learning
Authors:
Zongyuan Zhang,
Tianyang Duan,
Zheng Lin,
Dong Huang,
Zihan Fang,
Zekai Sun,
Ling Xiong,
Hongbin Liang,
Heming Cui,
Yong Cui
Abstract:
Recently, deep reinforcement learning (DRL) has emerged as a promising approach for robotic control. However, the deployment of DRL in real-world robots is hindered by its sensitivity to environmental perturbations. While existing whitebox adversarial attacks rely on local gradient information and apply uniform perturbations across all states to evaluate DRL robustness, they fail to account for te…
▽ More
Recently, deep reinforcement learning (DRL) has emerged as a promising approach for robotic control. However, the deployment of DRL in real-world robots is hindered by its sensitivity to environmental perturbations. While existing whitebox adversarial attacks rely on local gradient information and apply uniform perturbations across all states to evaluate DRL robustness, they fail to account for temporal dynamics and state-specific vulnerabilities. To combat the above challenge, we first conduct a theoretical analysis of white-box attacks in DRL by establishing the adversarial victim-dynamics Markov decision process (AVD-MDP), to derive the necessary and sufficient conditions for a successful attack. Based on this, we propose a selective state-aware reinforcement adversarial attack method, named STAR, to optimize perturbation stealthiness and state visitation dispersion. STAR first employs a soft mask-based state-targeting mechanism to minimize redundant perturbations, enhancing stealthiness and attack effectiveness. Then, it incorporates an information-theoretic optimization objective to maximize mutual information between perturbations, environmental states, and victim actions, ensuring a dispersed state-visitation distribution that steers the victim agent into vulnerable states for maximum return reduction. Extensive experiments demonstrate that STAR outperforms state-of-the-art benchmarks.
△ Less
Submitted 26 March, 2025;
originally announced March 2025.
-
Unsupervised CP-UNet Framework for Denoising DAS Data with Decay Noise
Authors:
Tianye Huang,
Aopeng Li,
Xiang Li,
Jing Zhang,
Sijing Xian,
Qi Zhang,
Mingkong Lu,
Guodong Chen,
Liangming Xiong,
Xiangyun Hu
Abstract:
Distributed acoustic sensor (DAS) technology leverages optical fiber cables to detect acoustic signals, providing cost-effective and dense monitoring capabilities. It offers several advantages including resistance to extreme conditions, immunity to electromagnetic interference, and accurate detection. However, DAS typically exhibits a lower signal-to-noise ratio (S/N) compared to geophones and is…
▽ More
Distributed acoustic sensor (DAS) technology leverages optical fiber cables to detect acoustic signals, providing cost-effective and dense monitoring capabilities. It offers several advantages including resistance to extreme conditions, immunity to electromagnetic interference, and accurate detection. However, DAS typically exhibits a lower signal-to-noise ratio (S/N) compared to geophones and is susceptible to various noise types, such as random noise, erratic noise, level noise, and long-period noise. This reduced S/N can negatively impact data analyses containing inversion and interpretation. While artificial intelligence has demonstrated excellent denoising capabilities, most existing methods rely on supervised learning with labeled data, which imposes stringent requirements on the quality of the labels. To address this issue, we develop a label-free unsupervised learning (UL) network model based on Context-Pyramid-UNet (CP-UNet) to suppress erratic and random noises in DAS data. The CP-UNet utilizes the Context Pyramid Module in the encoding and decoding process to extract features and reconstruct the DAS data. To enhance the connectivity between shallow and deep features, we add a Connected Module (CM) to both encoding and decoding section. Layer Normalization (LN) is utilized to replace the commonly employed Batch Normalization (BN), accelerating the convergence of the model and preventing gradient explosion during training. Huber-loss is adopted as our loss function whose parameters are experimentally determined. We apply the network to both the 2-D synthetic and filed data. Comparing to traditional denoising methods and the latest UL framework, our proposed method demonstrates superior noise reduction performance.
△ Less
Submitted 18 February, 2025;
originally announced February 2025.
-
Baichuan-Omni-1.5 Technical Report
Authors:
Yadong Li,
Jun Liu,
Tao Zhang,
Tao Zhang,
Song Chen,
Tianpeng Li,
Zehuan Li,
Lijun Liu,
Lingfeng Ming,
Guosheng Dong,
Da Pan,
Chong Li,
Yuanbo Fang,
Dongdong Kuang,
Mingrui Wang,
Chenglin Zhu,
Youwei Zhang,
Hongyu Guo,
Fengyu Zhang,
Yuran Wang,
Bowen Ding,
Wei Song,
Xu Li,
Yuqi Huo,
Zheng Liang
, et al. (68 additional authors not shown)
Abstract:
We introduce Baichuan-Omni-1.5, an omni-modal model that not only has omni-modal understanding capabilities but also provides end-to-end audio generation capabilities. To achieve fluent and high-quality interaction across modalities without compromising the capabilities of any modality, we prioritized optimizing three key aspects. First, we establish a comprehensive data cleaning and synthesis pip…
▽ More
We introduce Baichuan-Omni-1.5, an omni-modal model that not only has omni-modal understanding capabilities but also provides end-to-end audio generation capabilities. To achieve fluent and high-quality interaction across modalities without compromising the capabilities of any modality, we prioritized optimizing three key aspects. First, we establish a comprehensive data cleaning and synthesis pipeline for multimodal data, obtaining about 500B high-quality data (text, audio, and vision). Second, an audio-tokenizer (Baichuan-Audio-Tokenizer) has been designed to capture both semantic and acoustic information from audio, enabling seamless integration and enhanced compatibility with MLLM. Lastly, we designed a multi-stage training strategy that progressively integrates multimodal alignment and multitask fine-tuning, ensuring effective synergy across all modalities. Baichuan-Omni-1.5 leads contemporary models (including GPT4o-mini and MiniCPM-o 2.6) in terms of comprehensive omni-modal capabilities. Notably, it achieves results comparable to leading models such as Qwen2-VL-72B across various multimodal medical benchmarks.
△ Less
Submitted 25 January, 2025;
originally announced January 2025.
-
HGNET: A Hierarchical Feature Guided Network for Occupancy Flow Field Prediction
Authors:
Zhan Chen,
Chen Tang,
Lu Xiong
Abstract:
Predicting the motion of multiple traffic participants has always been one of the most challenging tasks in autonomous driving. The recently proposed occupancy flow field prediction method has shown to be a more effective and scalable representation compared to general trajectory prediction methods. However, in complex multi-agent traffic scenarios, it remains difficult to model the interactions a…
▽ More
Predicting the motion of multiple traffic participants has always been one of the most challenging tasks in autonomous driving. The recently proposed occupancy flow field prediction method has shown to be a more effective and scalable representation compared to general trajectory prediction methods. However, in complex multi-agent traffic scenarios, it remains difficult to model the interactions among various factors and the dependencies among prediction outputs at different time steps. In view of this, we propose a transformer-based hierarchical feature guided network (HGNET), which can efficiently extract features of agents and map information from visual and vectorized inputs, modeling multimodal interaction relationships. Second, we design the Feature-Guided Attention (FGAT) module to leverage the potential guiding effects between different prediction targets, thereby improving prediction accuracy. Additionally, to enhance the temporal consistency and causal relationships of the predictions, we propose a Time Series Memory framework to learn the conditional distribution models of the prediction outputs at future time steps from multivariate time series. The results demonstrate that our model exhibits competitive performance, which ranks 3rd in the 2024 Waymo Occupancy and Flow Prediction Challenge.
△ Less
Submitted 1 July, 2024;
originally announced July 2024.
-
Evolving Testing Scenario Generation Method and Intelligence Evaluation Framework for Automated Vehicles
Authors:
Yining Ma,
Wei Jiang,
Lingtong Zhang,
Junyi Chen,
Hong Wang,
Chen Lv,
Xuesong Wang,
Lu Xiong
Abstract:
Interaction between the background vehicles (BVs) and automated vehicles (AVs) in scenario-based testing plays a critical role in evaluating the intelligence of the AVs. Current testing scenarios typically employ predefined or scripted BVs, which inadequately reflect the complexity of human-like social behaviors in real-world driving scenarios, and also lack a systematic metric for evaluating the…
▽ More
Interaction between the background vehicles (BVs) and automated vehicles (AVs) in scenario-based testing plays a critical role in evaluating the intelligence of the AVs. Current testing scenarios typically employ predefined or scripted BVs, which inadequately reflect the complexity of human-like social behaviors in real-world driving scenarios, and also lack a systematic metric for evaluating the comprehensive intelligence of AVs. Therefore, this paper proposes an evolving scenario generation method that utilizes deep reinforcement learning (DRL) to create human-like BVs for testing and intelligence evaluation of AVs. Firstly, a class of driver models with human-like competitive, cooperative, and mutual driving motivations is designed. Then, utilizing an improved "level-k" training procedure, the three distinct driver models acquire game-based interactive driving policies. And these models are assigned to BVs for generating evolving scenarios in which all BVs can interact continuously and evolve diverse contents. Next, a framework including safety, driving efficiency, and interaction utility are presented to evaluate and quantify the intelligence performance of 3 systems under test (SUTs), indicating the effectiveness of the evolving scenario for intelligence testing. Finally, the complexity and fidelity of the proposed evolving testing scenario are validated. The results demonstrate that the proposed evolving scenario exhibits the highest level of complexity compared to other baseline scenarios and has more than 85% similarity to naturalistic driving data. This highlights the potential of the proposed method to facilitate the development and evaluation of high-level AVs in a realistic and challenging environment.
△ Less
Submitted 12 June, 2023;
originally announced June 2023.
-
An Ontology-based Method to Identify Triggering Conditions for Perception Insufficiency of Autonomous Vehicles
Authors:
Xingyu Xing,
Tong Jia,
Junyi Chen,
Lu Xiong,
Zhuoping Yu
Abstract:
The autonomous vehicle (AV) is a safety-critical system relying on complex sensors and algorithms. The AV may confront risk conditions if these sensors and algorithms misunderstand the environment and situation, even though all components are fault-free. The ISO 21448 defined the safety of the intended functionality (SOTIF), aiming to enhance the AV's safety by specifying AV's development and vali…
▽ More
The autonomous vehicle (AV) is a safety-critical system relying on complex sensors and algorithms. The AV may confront risk conditions if these sensors and algorithms misunderstand the environment and situation, even though all components are fault-free. The ISO 21448 defined the safety of the intended functionality (SOTIF), aiming to enhance the AV's safety by specifying AV's development and validation process. As required in the ISO 21448, the triggering conditions, which may lead to the vehicle's functional insufficiencies, should be analyzed and verified. However, there is not yet a method to realize a comprehensive and systematic identification of triggering conditions so far. This paper proposed an analysis framework of triggering conditions for the perception system based on the propagation chain of events model, which consists of triggering source, influenced perception stage, and triggering effect. According to the analysis framework, ontologies of triggering source and perception stage were constructed, and the relationships between concepts in ontologies are defined. According to these ontologies, triggering conditions can be generated comprehensively and systematically. The proposed method was applied on an L3 autonomous vehicle, and 20 from 87 triggering conditions identified were tested in the field, among which eight triggering conditions triggered risky behaviors of the vehicle.
△ Less
Submitted 16 October, 2022;
originally announced October 2022.