Search | arXiv e-print repository

ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation

Authors: Chenchen Zhang, Yuhang Li, Can Xu, Jiaheng Liu, Ao Liu, Shihui Hu, Dengpeng Wu, Guanhua Huang, Kejiao Li, Qi Yi, Ruibin Xiong, Haotian Zhu, Yuanxing Zhang, Yuhao Jiang, Yue Zhang, Zenan Xu, Bohui Zhai, Guoxiang He, Hebin Li, Jie Zhao, Le Zhang, Lingyun Tan, Pengyu Guo, Xianshu Pang, Yang Ruan, et al. (7 additional authors not shown)

Abstract: The generative capabilities of Large Language Models (LLMs) are rapidly expanding from static code to dynamic, interactive visual artifacts. This progress is bottlenecked by a critical evaluation gap: established benchmarks focus on algorithmic correctness and are blind to the visual fidelity and interactive integrity that define modern user experiences. To bridge this gap, we introduce ArtifactsBench, a new benchmark and paradigm for the automated, multimodal evaluation of visual code generation. Our framework programmatically renders each generated artifact and captures its dynamic behavior through temporal screenshots. This visual evidence, alongside the source code, is then assessed by a Multimodal LLM (MLLM)-as-Judge, which is rigorously guided by a fine-grained, per-task checklist to ensure holistic and reproducible scoring. We construct a new benchmark of 1,825 diverse tasks and evaluate over 30 leading LLMs. Our automated evaluation achieves a striking 94.4% ranking consistency with WebDev Arena, the gold standard for human preference in web development, and over 90% pairwise agreement with human experts. This establishes ArtifactsBench as the first framework to reliably automate the assessment of human-perceived quality at scale. Our analysis provides a high-resolution map of the current SOTA, revealing that generalist models often outperform domain-specific ones. We open-source ArtifactsBench, including the benchmark, evaluation harness, and baseline results at https://artifactsbenchmark.github.io/, to provide the community with a scalable and accurate tool to accelerate the development of user-centric generative models.

Submitted 7 July, 2025; originally announced July 2025.

arXiv:2506.01430 [pdf, ps, other]

DNAEdit: Direct Noise Alignment for Text-Guided Rectified Flow Editing

Authors: Chenxi Xie, Minghan Li, Shuai Li, Yuhui Wu, Qiaosi Yi, Lei Zhang

Abstract: Leveraging the powerful generation capability of large-scale pretrained text-to-image models, training-free methods have demonstrated impressive image editing results. Conventional diffusion-based methods, as well as recent rectified flow (RF)-based methods, typically reverse synthesis trajectories by gradually adding noise to clean images, during which the noisy latent at the current timestep is used to approximate that at the next timesteps, introducing accumulated drift and degrading reconstruction accuracy. Considering the fact that in RF the noisy latent is estimated through direct interpolation between Gaussian noise and clean images at each timestep, we propose Direct Noise Alignment (DNA), which directly refines the desired Gaussian noise in the noise domain, significantly reducing the error accumulation in previous methods. Specifically, DNA estimates the velocity field of the interpolated noised latent at each timestep and adjusts the Gaussian noise by computing the difference between the predicted and expected velocity field. We validate the effectiveness of DNA and reveal its relationship with existing RF-based inversion methods. Additionally, we introduce a Mobile Velocity Guidance (MVG) to control the target prompt-guided generation process, balancing image background preservation and target object editability. DNA and MVG collectively constitute our proposed method, namely DNAEdit. Finally, we introduce DNA-Bench, a long-prompt benchmark, to evaluate the performance of advanced image editing models. Experimental results demonstrate that our DNAEdit achieves superior performance to state-of-the-art text-guided editing methods. Code and benchmark will be available at https://xiechenxi99.github.io/DNAEdit/.
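
The "direct interpolation" mentioned in the abstract can be written out under one common rectified-flow convention. This is a sketch under assumptions, not the paper's own notation: the symbols $x_0$ (clean image latent), $\epsilon$ (Gaussian noise), $z_t$ (noisy latent), and the time direction are assumed here, and the abstract does not give the exact form of the DNA noise update, so none is shown.

```latex
% Assumed convention: t = 0 is the clean latent, t = 1 is pure noise.
\[
  z_t = (1 - t)\, x_0 + t\, \epsilon,
  \qquad \epsilon \sim \mathcal{N}(0, I),
  \qquad t \in [0, 1].
\]
% Along this straight path the velocity is constant in t:
\[
  \frac{\mathrm{d} z_t}{\mathrm{d} t} = \epsilon - x_0 .
\]
```

Under this convention, each $z_t$ is obtained directly from $x_0$ and $\epsilon$ rather than from the latent at the previous timestep, which is the property the abstract contrasts with step-by-step inversion. The constant velocity $\epsilon - x_0$ is a plausible reading of the "expected" velocity that the model's prediction is compared against.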

Submitted 2 June, 2025; originally announced June 2025.

Comments: Project URL: https://xiechenxi99.github.io/DNAEdit

arXiv:2506.01394 [pdf, ps, other]

NTIRE 2025 the 2nd Restore Any Image Model (RAIM) in the Wild Challenge

Authors: Jie Liang, Radu Timofte, Qiaosi Yi, Zhengqiang Zhang, Shuaizheng Liu, Lingchen Sun, Rongyuan Wu, Xindong Zhang, Hui Zeng, Lei Zhang

Abstract: In this paper, we present a comprehensive overview of the NTIRE 2025 challenge on the 2nd Restore Any Image Model (RAIM) in the Wild. This challenge established a new benchmark for real-world image restoration, featuring diverse scenarios with and without reference ground truth. Participants were tasked with restoring real-captured images suffering from complex and unknown degradations, where both perceptual quality and fidelity were critically evaluated. The challenge comprised two tracks: (1) the low-light joint denoising and demosaicing (JDD) task, and (2) the image detail enhancement/generation task. Each track included two sub-tasks. The first sub-task involved paired data with available ground truth, enabling quantitative evaluation. The second sub-task dealt with real-world yet unpaired images, emphasizing restoration efficiency and subjective quality assessed through a comprehensive user study. In total, the challenge attracted nearly 300 registrations, with 51 teams submitting more than 600 results. The top-performing methods advanced the state of the art in image restoration and received unanimous recognition from all 20+ expert judges. The datasets used in Track 1 and Track 2 are available at https://drive.google.com/drive/folders/1Mgqve-yNcE26IIieI8lMIf-25VvZRs_J and https://drive.google.com/drive/folders/1UB7nnzLwqDZOwDmD9aT8J0KVg2ag4Qae, respectively. The official challenge pages for Track 1 and Track 2 can be found at https://codalab.lisn.upsaclay.fr/competitions/21334#learn_the_details and https://codalab.lisn.upsaclay.fr/competitions/21623#learn_the_details.

Submitted 2 June, 2025; originally announced June 2025.

arXiv:2506.01277 [pdf, ps, other]

GeoLocSFT: Efficient Visual Geolocation via Supervised Fine-Tuning of Multimodal Foundation Models

Authors: Qiang Yi, Lianlei Shan

Abstract: Accurately determining the geographic location where a single image was taken, visual geolocation, remains a formidable challenge due to the planet's vastness and the deceptive similarity among distant locations. We introduce GeoLocSFT, a framework that demonstrates how targeted supervised fine-tuning (SFT) of a large multimodal foundation model (Gemma 3) using a small, high-quality dataset can yield highly competitive geolocation performance. GeoLocSFT is trained with only 2700 carefully selected image-GPS pairs from our geographically diverse MR600k dataset. Despite this limited data, our SFT-centric approach substantially improves over baseline models and achieves robust results on standard benchmarks such as Im2GPS-3k and YFCC-4k, as well as on our newly proposed and challenging MR40k benchmark, aimed specifically at sparsely populated regions. Further, we explore multi-candidate inference and aggregation strategies but find that the core gains are already realized at the SFT stage. Our findings highlight the power of high-quality supervision and efficient SFT for planet-scale image geolocation, especially when compared to prior methods that require massive databases or complex pipelines. To foster further research, we publicly release the MR40k benchmark dataset.

Submitted 1 June, 2025; originally announced June 2025.

Comments: 29 pages, 14 figures

arXiv:2505.18197 [pdf, ps, other]

A Novel Benchmark and Dataset for Efficient 3D Gaussian Splatting with Gaussian Point Cloud Compression

Authors: Kangli Wang, Shihao Li, Qianxi Yi, Wei Gao

Abstract: Recently, immersive media and autonomous driving applications have significantly advanced through 3D Gaussian Splatting (3DGS), which offers high-fidelity rendering and computational efficiency. Despite these advantages, 3DGS as a display-oriented representation requires substantial storage due to its numerous Gaussian attributes. Current compression methods have shown promising results but typically neglect the compression of Gaussian spatial positions, creating unnecessary bitstream overhead. We conceptualize Gaussian primitives as point clouds and propose leveraging point cloud compression techniques for more effective storage. AI-based point cloud compression demonstrates superior performance and faster inference compared to MPEG Geometry-based Point Cloud Compression (G-PCC). However, direct application of existing models to Gaussian compression may yield suboptimal results, as Gaussian point clouds tend to exhibit globally sparse yet locally dense geometric distributions that differ from conventional point cloud characteristics. To address these challenges, we introduce GausPcgc for Gaussian point cloud geometry compression along with a specialized training dataset GausPcc-1K. Our work pioneers the integration of AI-based point cloud compression into Gaussian compression pipelines, achieving superior compression ratios. The framework complements existing Gaussian compression methods while delivering significant performance improvements. All code, data, and pre-trained models will be publicly released to facilitate further research advances in this field.

Submitted 21 May, 2025; originally announced May 2025.

Comments: 22 pages, 13 figures

arXiv:2505.15431 [pdf, ps, other]

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

Authors: Tencent Hunyuan Team, Ao Liu, Botong Zhou, Can Xu, Chayse Zhou, ChenChen Zhang, Chengcheng Xu, Chenhao Wang, Decheng Wu, Dengpeng Wu, Dian Jiao, Dong Du, Dong Wang, Feng Zhang, Fengzong Lian, Guanghui Xu, Guanwei Zhang, Hai Wang, Haipeng Luo, Han Hu, Huilin Xu, Jiajia Wu, Jianchen Zhu, Jianfeng Yan, Jiaqi Zhu, et al. (230 additional authors not shown)

Abstract: As Large Language Models (LLMs) rapidly advance, we introduce Hunyuan-TurboS, a novel large hybrid Transformer-Mamba Mixture of Experts (MoE) model. It synergistically combines Mamba's long-sequence processing efficiency with Transformer's superior contextual understanding. Hunyuan-TurboS features an adaptive long-short chain-of-thought (CoT) mechanism, dynamically switching between rapid responses for simple queries and deep "thinking" modes for complex problems, optimizing computational resources. Architecturally, this 56B activated (560B total) parameter model employs 128 layers (Mamba2, Attention, FFN) with an innovative AMF/MF block pattern. Faster Mamba2 ensures linear complexity, Grouped-Query Attention minimizes KV cache, and FFNs use an MoE structure. Pre-trained on 16T high-quality tokens, it supports a 256K context length and is the first industry-deployed large-scale Mamba model. Our comprehensive post-training strategy enhances capabilities via Supervised Fine-Tuning (3M instructions), a novel Adaptive Long-short CoT Fusion method, Multi-round Deliberation Learning for iterative improvement, and a two-stage Large-scale Reinforcement Learning process targeting STEM and general instruction-following. Evaluations show strong performance: overall top 7 rank on LMSYS Chatbot Arena with a score of 1356, outperforming leading models like Gemini-2.0-Flash-001 (1352) and o4-mini-2025-04-16 (1345). TurboS also achieves an average of 77.9% across 23 automated benchmarks. Hunyuan-TurboS balances high performance and efficiency, offering substantial capabilities at lower inference costs than many reasoning models, establishing a new paradigm for efficient large-scale pre-trained models.

Submitted 4 July, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

arXiv:2505.06192 [pdf, other]

GECAM Discovery of Peculiar Oscillating Particle Precipitation Events

Authors: Chenwei Wang, Shaolin Xiong, Yi Zhao, Wei Xu, Gaopeng Lu, Xuzhi Zhou, Xiaocheng Guo, Wenya Li, Xiaochao Yang, Qinghe Zhang, Xinqiao Li, Zhenxia Zhang, Zhenghua An, Ce Cai, Peiyi Feng, Yue Huang, Min Gao, Ke Gong, Dongya Guo, Haoxuan Guo, Bing Li, Xiaobo Li, Yaqing Liu, Jiacong Liu, Xiaojing Liu, et al. (30 additional authors not shown)

Abstract: Charged particle precipitation typically manifests as a gradual increase and decrease of flux observed by space detectors. Cases with rapid flux variation are very rare. Periodic events are even more extraordinary. These oscillating particle precipitation (OPP) events are usually attributed to the bounce motion of electrons, which are induced by lightning. Owing to observational limitations, there has been debate regarding whether these oscillations originate from temporal flux evolution or spatial structure evolution. Here we report three peculiar charged particle precipitation events detected by GECAM during a geomagnetic storm on March 21, 2024, with two exhibiting significant periodicity. These events were observed around the same region during three consecutive orbits. Through comprehensive temporal and spectral analyses, we revealed that one of the OPP events exhibited a transition in spectral lag of mini-pulses, shifting from "softer-earlier" to "softer-later" while showing no significant time evolution in overall frequency characteristics. No association was found between these two OPP events and lightning activity. Several possible scenarios are discussed to explain these charged particles with a lifetime of more than 3.5 hours, but the nature of these three events remains an enigma. We suggest that these GECAM-detected OPP events may represent a new type of particle precipitation event or a peculiar type of Lightning-induced Electron Precipitation (LEP).

Submitted 9 May, 2025; originally announced May 2025.

arXiv:2504.16354 [pdf, other]

VeriFix: Verifying Your Fix Towards An Atomicity Violation

Authors: Zhuang Li, Qiuping Yi, Jeff Huang

Abstract: Atomicity violation is one of the most serious types of bugs in concurrent programs. Synchronizations are commonly used to enforce atomicity. However, it is very challenging to place synchronizations correctly and sufficiently due to complex thread interactions and a large input space. This paper presents VeriFix, a new approach for verifying atomicity violation fixes. Given a buggy trace that exposes an atomicity violation and a corresponding fix, VeriFix effectively verifies if the fix introduces sufficient synchronizations to repair the atomicity violation without introducing new deadlocks. The key idea is that VeriFix transforms the fix verification problem into a property verification problem, in which both the observed atomicity violation and potential deadlocks are encoded as a safety property, and both the inputs and schedules are encoded as symbolic constraints. By reasoning about the conjoined constraints with an SMT solver, VeriFix systematically explores all reachable paths and verifies if there exists a concrete schedule+input combination to manifest the intended atomicity or any new deadlocks. We have implemented and evaluated VeriFix on a collection of real-world C/C++ programs. The results show that VeriFix significantly outperforms the state of the art.

Submitted 22 April, 2025; originally announced April 2025.

arXiv:2504.12711 [pdf, other]

NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images: Methods and Results

Authors: Xin Li, Yeying Jin, Xin Jin, Zongwei Wu, Bingchen Li, Yufei Wang, Wenhan Yang, Yu Li, Zhibo Chen, Bihan Wen, Robby T. Tan, Radu Timofte, Qiyu Rong, Hongyuan Jing, Mengmeng Zhang, Jinglong Li, Xiangyu Lu, Yi Ren, Yuting Liu, Meng Zhang, Xiang Chen, Qiyuan Guan, Jiangxin Dong, Jinshan Pan, Conglin Gou, et al. (112 additional authors not shown)

Abstract: This paper reviews the NTIRE 2025 Challenge on Day and Night Raindrop Removal for Dual-Focused Images. This challenge received a wide range of impressive solutions, which were developed and evaluated using our collected real-world Raindrop Clarity dataset. Unlike existing deraining datasets, our Raindrop Clarity dataset is more diverse and challenging in degradation types and contents, which include day raindrop-focused, day background-focused, night raindrop-focused, and night background-focused degradations. This dataset is divided into three subsets for competition: 14,139 images for training, 240 images for validation, and 731 images for testing. The primary objective of this challenge is to establish a new and powerful benchmark for the task of removing raindrops under varying lighting and focus conditions. A total of 361 participants took part in the competition, with 32 teams submitting valid solutions and fact sheets for the final testing phase. These submissions achieved state-of-the-art (SOTA) performance on the Raindrop Clarity dataset. The project can be found at https://lixinustc.github.io/CVPR-NTIRE2025-RainDrop-Competition.github.io/.

Submitted 19 April, 2025; v1 submitted 17 April, 2025; originally announced April 2025.

Comments: Challenge Report of CVPR NTIRE 2025; 26 pages; Methods from 32 teams

arXiv:2504.04422 [pdf, other]

LeakGuard: Detecting Memory Leaks Accurately and Scalably

Authors: Hongliang Liang, Luming Yin, Guohao Wu, Yuxiang Li, Qiuping Yi, Lei Wang

Abstract: Memory leaks are prevalent in various real-world software projects, thereby leading to serious attacks like denial-of-service. Though prior methods for detecting memory leaks have made significant advances, they often suffer from low accuracy and weak scalability when testing large and complex programs. In this paper we present LeakGuard, a memory leak detection tool which provides a satisfactory balance of accuracy and scalability. For accuracy, LeakGuard analyzes the behaviors of library and developer-defined memory allocation and deallocation (MAD) functions in a path-sensitive manner and generates function summaries for them in a bottom-up approach. Additionally, we develop a pointer escape analysis technique to model the transfer of pointer ownership. For scalability, LeakGuard examines each function of interest independently by using its function summary and an under-constrained symbolic execution technique, which effectively mitigates the path explosion problem. Our extensive evaluation on 18 real-world software projects and standard benchmark datasets demonstrates that LeakGuard achieves significant advancements in multiple aspects: it exhibits superior MAD function identification capability compared to Goshawk, outperforms five state-of-the-art methods in defect detection accuracy, and successfully identifies 129 previously undetected memory leak bugs, all of which have been independently verified and confirmed by the respective development teams.

Submitted 6 April, 2025; originally announced April 2025.

Comments: 21 pages, 5 figures, conference paper on memory leak detection

arXiv:2503.23512 [pdf, ps, other]

SCORE: Story Coherence and Retrieval Enhancement for AI Narratives

Authors: Qiang Yi, Yangfan He, Jianhui Wang, Xinyuan Song, Shiyao Qian, Xinhang Yuan, Li Sun, Yi Xin, Jingqun Tang, Keqin Li, Kuan Lu, Menghao Huo, Jiaqi Chen, Tianyu Shi

Abstract: Large Language Models (LLMs) can generate creative and engaging narratives from user-specified input, but maintaining coherence and emotional depth throughout these AI-generated stories remains a challenge. In this work, we propose SCORE, a framework for Story Coherence and Retrieval Enhancement, designed to detect and resolve narrative inconsistencies. By tracking key item statuses and generating episode summaries, SCORE uses a Retrieval-Augmented Generation (RAG) approach, incorporating TF-IDF and cosine similarity to identify related episodes and enhance the overall story structure. Results from testing multiple LLM-generated stories demonstrate that SCORE significantly improves the consistency and stability of narrative coherence compared to baseline GPT models, providing a more robust method for evaluating and refining AI-generated narratives.
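
The retrieval step named in the abstract (TF-IDF plus cosine similarity over episode summaries) can be illustrated with a minimal standard-library sketch. This is not the SCORE implementation: the function names `tokenize`, `tfidf_vectors`, `cosine`, and `most_related`, the whitespace tokenizer, the smoothed IDF formula, and the sample summaries are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # Smoothed IDF (an assumed variant): ln((1 + N) / (1 + df)) + 1.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_related(summaries, query, top_k=2):
    """Return indices of the top_k summaries most similar to the query."""
    vectors = tfidf_vectors(summaries + [query])
    query_vec = vectors[-1]
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(vectors[:-1])]
    # Highest similarity first; ties broken by episode order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:top_k]]


if __name__ == "__main__":
    # Hypothetical episode summaries, invented for this example.
    episodes = [
        "the knight finds a silver key in the ruined chapel",
        "a storm delays the merchant caravan at the river",
        "the knight loses the silver key while crossing the river",
    ]
    print(most_related(episodes, "where is the silver key now"))
```

In a RAG setup, the retrieved summaries would then be passed to the LLM as context, for example when checking whether a new episode contradicts the recorded status of an item. A production system would more likely use a library vectorizer and a proper tokenizer; the hand-rolled version above only keeps the sketch dependency-free.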

Submitted 12 June, 2025; v1 submitted 30 March, 2025; originally announced March 2025.

arXiv:2503.03161 [pdf, other]

doi 10.1007/s11433-024-2544-3

The GECAM Ground Search System for Gamma-ray Transients

Authors: Ce Cai, Yan-Qiu Zhang, Shao-Lin Xiong, Ping Wang, Jian-Hui Li, Xiao-Bo Li, Cheng-Kui Li, Yue Huang, Shi-Jie Zheng, Li-Ming Song, Shuo Xiao, Qi-Bin Yi, Yi Zhao, Sheng-Lun Xie, Rui Qiao, Yan-Qi Du, Zhi-Wei Guo, Wang-Chen Xue, Chao Zheng, Jia-Cong Liu, Chen-Wei Wang, Wen-Jun Tan, Yue Wang, Jin-Peng Zhang, Chao-Yang Li, et al. (13 additional authors not shown)

Abstract: In the era of time-domain, multi-messenger astronomy, the detection of transient events on the high-energy electromagnetic sky has become more important than ever. The Gravitational wave high-energy Electromagnetic Counterpart All-sky Monitor (GECAM) is a dedicated mission to monitor gamma-ray transients, launched in December 2020. A real-time on-board trigger and location software, using the traditional signal-to-noise ratio (SNR) method for blind search, is constrained to relatively bright signals due to the limitations in on-board computing resources and the need for real-time search. In this work, we developed a ground-based pipeline for GECAM to search for various transients, especially for weak bursts missed by on-board software. This pipeline includes both automatic and manual modes, offering options for blind search and targeted search. The targeted search is specifically designed to search for interesting weak bursts, such as gravitational wave-associated gamma-ray bursts (GRBs). From the ground search of the data in the first year, GECAM has been triggered by 54 GRBs and other transients, including soft gamma-ray repeaters, X-ray binaries, solar flares, and terrestrial gamma-ray flashes. We report the properties of each type of trigger, such as trigger time and light curves. With this search pipeline and assuming a soft Band spectrum, the GRB detection sensitivity of GECAM is increased to about 1.1 × 10⁻⁸ erg cm⁻² s⁻¹ (10 keV - 1000 keV, burst duration of 20 s). These results demonstrate that the GECAM ground search system (both blind search and targeted search) is a versatile pipeline to recover true astrophysical signals which were too weak to be found in the on-board search.

Submitted 4 March, 2025; originally announced March 2025.

Comments: Accepted by SCIENCE CHINA Physics, Mechanics & Astronomy (SCPMA)

Journal ref: The GECAM ground search system for gamma-ray transients. Sci. China-Phys. Mech. Astron. Volume 68, article number 239511, (2025)

arXiv:2502.15267 [pdf, ps, other]

doi 10.3847/1538-4357/abf4c4

New insight into the Rapid Burster by Insight-HXMT

Authors: Y. P. Chen, S. Zhang, S. N. Zhang, L. Ji, L. D. Kong, P. J. Wang, L. Tao, M. Y. Ge, C. Z. Liu, F. J. Lu, J. L. Qu, T. P. Li, Y. P. Xu, X. L. Cao, Y. Chen, Q. C. Bu, C. Cai, Z. Chang, G. Chen, L. Chen, T. X. Chen, W. W. Cui, Y. Y. Du, G. H. Gao, H. Gao, et al. (70 additional authors not shown)

Abstract: We report the timing and spectral analyses of the type II X-ray bursts from the Rapid Burster (MXB 1730-335) observed by Insight-HXMT and Swift/XRT. By stacking the long-duration bursts, we find for the first time that the hard X-rays lag the soft X-rays by 3 seconds. However, such a lag is not visible for the short-duration bursts, probably because of the poor statistics. For all bursts the energy spectrum is found to be non-thermal, thanks to the broad band coverage of Insight-HXMT. These findings provide new insights into the type II bursts and require a temporally showing-up corona for possible interpretation.

Submitted 21 February, 2025; originally announced February 2025.

Journal ref: 2021,ApJ,913,150

arXiv:2502.15217 [pdf, other]

FormalSpecCpp: A Dataset of C++ Formal Specifications created using LLMs

Authors: Madhurima Chakraborty, Peter Pirkelbauer, Qing Yi

Abstract: FormalSpecCpp is a dataset designed to fill the gap in standardized benchmarks for verifying formal specifications in C++ programs. To the best of our knowledge, this is the first comprehensive collection of C++ programs with well-defined preconditions and postconditions. It provides a structured benchmark for evaluating specification inference tools and testing the accuracy of generated specifications. Researchers and developers can use this dataset to benchmark specification inference tools, fine-tune Large Language Models (LLMs) for automated specification generation, and analyze the role of formal specifications in improving program verification and automated testing. By making this dataset publicly available, we aim to advance research in program verification, specification inference, and AI-assisted software development. The dataset and the code are available at https://github.com/MadhuNimmo/FormalSpecCpp.

Submitted 21 February, 2025; originally announced February 2025.

Comments: Accepted at the 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR)

arXiv:2502.02113 [pdf, ps, other]

Mathematical analysis and numerical simulation of coupled nonlinear space-fractional Ginzburg-Landau equations

Authors: Hengfei Ding, Yuxin Zhang, Qian Yi

Abstract: The coupled nonlinear space fractional Ginzburg-Landau (CNLSFGL) equations with the fractional Laplacian have been widely used to model the dynamical processes in fractal media with fractional dispersion. Due to the existence of fractional power derivatives and strong nonlinearity, it is extremely difficult to mathematically analyze the CNLSFGL equations and construct efficient numerical algorithms. For this reason, this paper aims to investigate the theoretical results about the considered system and construct a novel high-order numerical scheme for this coupled system. We prove rigorously an a priori estimate of the solution to the coupled system and the well-posedness of its weak solution. Then, to develop the efficient numerical algorithm, we construct a fourth-order numerical differential formula to approximate the fractional Laplacian. Based on this formula, we construct a high-order implicit difference scheme for the coupled system. Furthermore, the unique solvability and convergence of the established algorithm are proved in detail. To implement the implicit algorithm efficiently, an iterative algorithm is designed in the numerical simulation. Extensive numerical examples are reported to further demonstrate the correctness of the theoretical analysis and the efficiency of the proposed numerical algorithm.

Submitted 4 February, 2025; originally announced February 2025.

Comments: 42 pages

arXiv:2501.12174 [pdf, other]

BiMarker: Enhancing Text Watermark Detection for Large Language Models with Bipolar Watermarks

Authors: Zhuang Li, Qiuping Yi, Zongcheng Ji, Yijian Lu, Yanqi Li, Keyang Xiao, Hongliang Liang

Abstract: The rapid growth of Large Language Models (LLMs) raises concerns about distinguishing AI-generated text from human content. Existing watermarking techniques, like KGW, struggle with low watermark strength and stringent false-positive requirements. Our analysis reveals that current methods rely on coarse estimates of non-watermarked text, limiting watermark detectability. To address this, we propose Bipolar Watermark (BiMarker), which splits generated text into positive and negative poles, enhancing detection without requiring additional computational resources or knowledge of the prompt. Theoretical analysis and experimental results demonstrate BiMarker's effectiveness and compatibility with existing optimization techniques, providing a new optimization dimension for watermarking in LLM-generated content.

Submitted 21 May, 2025; v1 submitted 21 January, 2025; originally announced January 2025.

arXiv:2412.12547 [pdf, other]

A MARL Based Multi-Target Tracking Algorithm Under Jamming Against Radar

Authors: Ziang Wang, Lei Wang, Qi Yi, Yimin Liu

Abstract: Unmanned aerial vehicles (UAVs) have played an increasingly important role in military operations and social life. Among all application scenarios, multi-target tracking tasks accomplished by UAV swarms have received extensive attention. However, when UAVs use radar to track targets, the tracking performance can be severely compromised by jammers. To track targets in the presence of jammers, UAVs can use passive radar to locate the jammer. This paper proposes a system where a UAV swarm selects the radar's active or passive work mode to track multiple differently located and potentially jammer-carrying targets. After presenting the optimization problem and proving its solving difficulty, we use a multi-agent reinforcement learning algorithm to solve this control problem. We also propose a mechanism based on a simulated annealing algorithm to avoid cases where UAV actions violate constraints. Simulation experiments demonstrate the effectiveness of the proposed algorithm.

Submitted 17 December, 2024; originally announced December 2024.

arXiv:2412.03017 [pdf, other]

Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach

Authors: Lingchen Sun, Rongyuan Wu, Zhiyuan Ma, Shuaizheng Liu, Qiaosi Yi, Lei Zhang

Abstract: Diffusion prior-based methods have shown impressive results in real-world image super-resolution (SR). However, most existing methods entangle pixel-level and semantic-level SR objectives in the training process, struggling to balance pixel-wise fidelity and perceptual quality. Meanwhile, users have varying preferences on SR results, so there is a need for an adjustable SR model that can be tailored to different fidelity-perception preferences during inference without re-training. We present Pixel-level and Semantic-level Adjustable SR (PiSA-SR), which learns two LoRA modules upon the pre-trained stable-diffusion (SD) model to achieve improved and adjustable SR results. We first formulate the SD-based SR problem as learning the residual between the low-quality input and the high-quality output, then show that the learning objective can be decoupled into two distinct LoRA weight spaces: one is characterized by the $\ell_2$-loss for pixel-level regression, and another is characterized by the LPIPS and classifier score distillation losses to extract semantic information from pre-trained classification and SD models. In its default setting, PiSA-SR can be performed in a single diffusion step, achieving leading real-world SR results in both quality and efficiency. By introducing two adjustable guidance scales on the two LoRA modules to control the strengths of pixel-wise fidelity and semantic-level details during inference, PiSA-SR can offer flexible SR results according to user preference without re-training. Codes and models can be found at https://github.com/csslc/PiSA-SR.

Submitted 3 April, 2025; v1 submitted 3 December, 2024; originally announced December 2024.

arXiv:2409.05313 [pdf, other]

Influences of non-standard interactions on PeV neutrino events with and without a $L_α-L_β$ symmetry

Authors: Qiu-Xia Yi, Ya-Ru Wang, Shu-Jun Rong

Abstract: The recently reported astrophysical neutrino events in the TeV-PeV energy range open a window to explore new physics at energy frontiers. In this paper, we examine the effects of non-standard interactions (NSIs) on the PeV neutrino events. We consider NSIs with and without a gauge symmetry $L_α$ - $L_β$. We find that, for typical $μ^{\pm}$ damping and $π^{\pm}$ decay sources, the NSI with an extra gauge symmetry has more noticeable effects on the PeV events. Therefore, the detection of the events in the upcoming experiments could set stringent constraints on the NSI parameters in the $L_α$ - $L_β$ symmetric case.

Submitted 14 January, 2025; v1 submitted 8 September, 2024; originally announced September 2024.

Comments: 18 pages, 9 figures

arXiv:2408.10145 [pdf, other]

Multi-Scale Representation Learning for Image Restoration with State-Space Model

Authors: Yuhong He, Long Peng, Qiaosi Yi, Chen Wu, Lu Wang

Abstract: Image restoration endeavors to reconstruct a high-quality, detail-rich image from a degraded counterpart, which is a pivotal process in photography and various computer vision systems. In real-world scenarios, different types of degradation can cause the loss of image details at various scales and degrade image contrast. Existing methods predominantly rely on CNNs and Transformers to capture multi-scale representations. However, these methods are often limited by the high computational complexity of Transformers and the constrained receptive field of CNNs, which hinder them from achieving superior performance and efficiency in image restoration. To address these challenges, we propose a novel Multi-Scale State-Space Model-based approach (MS-Mamba) for efficient image restoration that enhances the capacity for multi-scale representation learning through our proposed global and regional SSM modules. Additionally, an Adaptive Gradient Block (AGB) and a Residual Fourier Block (RFB) are proposed to improve the network's detail extraction capabilities by capturing gradients in various directions and facilitating learning details in the frequency domain. Extensive experiments on nine public benchmarks across four classic image restoration tasks (image deraining, dehazing, denoising, and low-light enhancement) demonstrate that our proposed method achieves new state-of-the-art performance while maintaining low computational complexity. The source code will be publicly available.

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2407.05245 [pdf, ps, other]

doi 10.1088/2053-1583/ada0b8

Electrical magnetochiral anisotropy and quantum metric in chiral conductors

Authors: Yiyang Jiang, Qinyan Yi, Binghai Yan

Abstract: Electrical magnetochiral anisotropy (EMCA) refers to the chirality- and current-dependent nonlinear magnetoresistance in chiral conductors and is commonly interpreted in a semiclassical picture. In this work, we reveal a quantum geometry origin of EMCA using a chiral rectangular lattice model that resembles a chiral organic conductor (DM-EDT-TTF)${}_2$ClO${}_4$ studied for EMCA recently and exhibits symmetry-protected Dirac bands similar to those of graphene. Compared to the semiclassical term, we find that Dirac states contribute significantly to both traditional longitudinal EMCA and the unconventional transverse EMCA via the quantum metric when Fermi energy is close to the Dirac point. Besides, we discover that a topological insulator state can emerge once spin-orbit coupling (SOC) is added to our chiral model lattice. Our work paves a path toward understanding quantum geometry in the magnetotransport of chiral materials.

Submitted 6 June, 2025; v1 submitted 6 July, 2024; originally announced July 2024.

Comments: 14 pages, 4 figures

Journal ref: 2D Mater. 12 015020 (2025)

arXiv:2406.03250 [pdf, other]

Prompt-based Visual Alignment for Zero-shot Policy Transfer

Authors: Haihan Gao, Rui Zhang, Qi Yi, Hantao Yao, Haochen Li, Jiaming Guo, Shaohui Peng, Yunkai Gao, QiCheng Wang, Xing Hu, Yuanbo Wen, Zihao Zhang, Zidong Du, Ling Li, Qi Guo, Yunji Chen

Abstract: Overfitting has become one of the main obstacles to applications of reinforcement learning (RL). Existing methods do not provide an explicit semantic constraint for the feature extractor, hindering the agent from learning a unified cross-domain representation and resulting in performance degradation on unseen domains. Besides, abundant data from multiple domains are needed. To address these issues, in this work, we propose prompt-based visual alignment (PVA), a robust framework to mitigate the detrimental domain bias in the image for zero-shot policy transfer. Inspired by the observation that a Visual-Language Model (VLM) can serve as a bridge to connect both text space and image space, we leverage the semantic information contained in a text sequence as an explicit constraint to train a visual aligner. Thus, the visual aligner can map images from multiple domains to a unified domain and achieve good generalization performance. To better depict semantic information, prompt tuning is applied to learn a sequence of learnable tokens. With explicit constraints of semantic information, PVA can learn a unified cross-domain representation under limited access to cross-domain data and achieves great zero-shot generalization ability in unseen domains. We verify PVA on a vision-based autonomous driving task with the CARLA simulator. Experiments show that the agent generalizes well on unseen domains under limited access to multi-domain data.

Submitted 5 June, 2024; originally announced June 2024.

Comments: This paper has been accepted by ICML2024

arXiv:2405.09923 [pdf, other]

NTIRE 2024 Restore Any Image Model (RAIM) in the Wild Challenge

Authors: Jie Liang, Radu Timofte, Qiaosi Yi, Shuaizheng Liu, Lingchen Sun, Rongyuan Wu, Xindong Zhang, Hui Zeng, Lei Zhang

Abstract: In this paper, we review the NTIRE 2024 challenge on Restore Any Image Model (RAIM) in the Wild. The RAIM challenge constructed a benchmark for image restoration in the wild, including real-world images with/without reference ground truth in various scenarios from real applications. The participants were required to restore the real-captured images from complex and unknown degradation, where generative perceptual quality and fidelity are desired in the restoration result. The challenge consisted of two tasks. Task one employed real referenced data pairs, where quantitative evaluation is available. Task two used unpaired images, and a comprehensive user study was conducted. The challenge attracted more than 200 registrations, 39 of which submitted results, with more than 400 submissions in total. Top-ranked methods improved the state-of-the-art restoration performance and obtained unanimous recognition from all 18 judges. The proposed datasets are available at https://drive.google.com/file/d/1DqbxUoiUqkAIkExu3jZAqoElr_nu1IXb/view?usp=sharing and the homepage of this challenge is at https://codalab.lisn.upsaclay.fr/competitions/17632.

Submitted 16 May, 2024; originally announced May 2024.

arXiv:2402.06865 [pdf]

A spin-torque nano-oscillator based on interlayer-coupled meron-skyrmion pairs with a fixed orbit

Authors: Qiyun Yi, Ting Han, Jinyi Jiang, Xiangjun Xing

Abstract: In recent years, magnetic skyrmion-based spin-torque nano-oscillators (STNOs) have attracted considerable interest for their prospects in future-generation communication and spintronic technologies. However, some critical issues that hamper their practical applications, e.g., the long start-up time and variable skyrmion gyration orbit, remain to be resolved. Here, we numerically demonstrate a realization of a fixed-orbit STNO, which is based on an interlayer-coupled meron-skyrmion (MS) pair rather than a magnetic skyrmion. In this STNO, the MS pair possesses a structurally defined, fixed orbit within a broad range of driving current, even in the presence of random defects. The output frequency range of the STNO based on an MS pair far exceeds that of the STNO typically based on a single skyrmion. Moreover, the output frequency of this STNO can be further elevated if more MS pairs are incorporated. Our results reveal the nontrivial dynamics of the interlayer-coupled MS pair, opening perspectives for the design and optimization of fundamental spintronic devices.

Submitted 9 February, 2024; originally announced February 2024.

Comments: 24 pages, 7 figures

arXiv:2402.06306 [pdf, other]

Multi-Modal Concurrent Transmission

Authors: Majid Nasiri Khormuji, Alberto Giuseppe Perotti, Qin Yi, Branislav Popovic

Abstract: This paper introduces a novel physical-layer method labelled as Multi-Modal Concurrent Transmission (MMCT) for efficient transmission of multiple data streams with different reliability-latency performance requirements. The MMCT arranges data from multiple streams within the same physical-layer transport block wherein stream-specific modulation and coding scheme (MCS) selection is combined with joint mapping of modulated codewords to Multiple-Input Multiple-Output spatial layers and frequency resources. Mapping to spatial-frequency resources with higher Signal-to-Noise Ratios (SNRs) provides the required performance boost for the more demanding streams. In tactile internet applications, wherein haptic feedback/actuation and audio-video streams flow in parallel, the method provides significant SNR and spectral efficiency enhancements compared to conventional 3GPP New Radio (NR) transmission methods.

Submitted 9 February, 2024; originally announced February 2024.

Comments: 6 pages, 4 figures, 1 table

Journal ref: 2024 IEEE Wireless Communications and Networking Conference

arXiv:2312.06162 [pdf, other]

Textual Prompt Guided Image Restoration

Authors: Qiuhai Yan, Aiwen Jiang, Kang Chen, Long Peng, Qiaosi Yi, Chunjie Zhang

Abstract: Image restoration has always been a cutting-edge topic in the academic and industrial fields of computer vision. Since degradation signals are often random and diverse, "all-in-one" models that can do blind image restoration have attracted attention in recent years. Early works require training specialized headers and tails to handle each degradation of concern, which is cumbersome to do manually. Recent works focus on learning visual prompts from data distribution to identify the degradation type. However, the prompts employed in most models are non-text, lacking sufficient emphasis on the importance of human-in-the-loop. In this paper, an effective textual prompt guided image restoration model is proposed. In this model, a task-specific BERT is fine-tuned to accurately understand the user's instructions and generate textual prompt guidance. Depth-wise multi-head transposed attentions and gated convolution modules are designed to bridge the gap between textual prompts and visual features. The proposed model introduces semantic prompts into the low-level visual domain. It highlights the potential to provide a natural, precise, and controllable way to perform image restoration tasks. Extensive experiments have been done on public denoising, dehazing and deraining datasets. The experimental results demonstrate that, compared with popular state-of-the-art methods, the proposed model can obtain superior performance, achieving accurate recognition and removal of degradation without increasing the model's complexity. Related source codes and data will be publicly available on GitHub at https://github.com/MoTong-AI-studio/TextPromptIR.

Submitted 11 December, 2023; originally announced December 2023.

Comments: 12 pages, 10 figures

arXiv:2311.04474 [pdf, other]

Emergent Communication for Rules Reasoning

Authors: Yuxuan Guo, Yifan Hao, Rui Zhang, Enshuai Zhou, Zidong Du, Xishan Zhang, Xinkai Song, Yuanbo Wen, Yongwei Zhao, Xuehai Zhou, Jiaming Guo, Qi Yi, Shaohui Peng, Di Huang, Ruizhi Chen, Qi Guo, Yunji Chen

Abstract: Research on emergent communication between deep-learning-based agents has received extensive attention due to its inspiration for linguistics and artificial intelligence. However, previous attempts have hovered around emerging communication under perception-oriented environmental settings, which force agents to describe low-level perceptual features within image or symbol contexts. In this work, inspired by the classic human reasoning test (namely Raven's Progressive Matrix), we propose the Reasoning Game, a cognition-oriented environment that encourages agents to reason and communicate high-level rules, rather than perceived low-level contexts. Moreover, we propose 1) an unbiased dataset (namely rule-RAVEN) as a benchmark to avoid overfitting, and 2) a two-stage curriculum agent training method as a baseline for more stable convergence in the Reasoning Game, where contexts and semantics are bilaterally drifting. Experimental results show that, in the Reasoning Game, a semantically stable and compositional language emerges to solve reasoning problems. The emerged language helps agents apply the extracted rules to the generalization of unseen context attributes, and to the transfer between different context attributes or even tasks.

Submitted 8 November, 2023; originally announced November 2023.

arXiv:2311.03695 [pdf, other]

Context Shift Reduction for Offline Meta-Reinforcement Learning

Authors: Yunkai Gao, Rui Zhang, Jiaming Guo, Fan Wu, Qi Yi, Shaohui Peng, Siming Lan, Ruizhi Chen, Zidong Du, Xing Hu, Qi Guo, Ling Li, Yunji Chen

Abstract: Offline meta-reinforcement learning (OMRL) utilizes pre-collected offline datasets to enhance the agent's generalization ability on unseen tasks. However, the context shift problem arises due to the distribution discrepancy between the contexts used for training (from the behavior policy) and testing (from the exploration policy). The context shift problem leads to incorrect task inference and further deteriorates the generalization ability of the meta-policy. Existing OMRL methods either overlook this problem or attempt to mitigate it with additional information. In this paper, we propose a novel approach called Context Shift Reduction for OMRL (CSRO) to address the context shift problem with only offline datasets. The key insight of CSRO is to minimize the influence of policy in context during both the meta-training and meta-test phases. During meta-training, we design a max-min mutual information representation learning mechanism to diminish the impact of the behavior policy on task representation. In the meta-test phase, we introduce the non-prior context collection strategy to reduce the effect of the exploration policy. Experimental results demonstrate that CSRO significantly reduces the context shift and improves the generalization ability, surpassing previous methods across various challenging domains.

Submitted 6 November, 2023; originally announced November 2023.

arXiv:2311.02104 [pdf, other]

Efficient Symbolic Policy Learning with Differentiable Symbolic Expression

Authors: Jiaming Guo, Rui Zhang, Shaohui Peng, Qi Yi, Xing Hu, Ruizhi Chen, Zidong Du, Xishan Zhang, Ling Li, Qi Guo, Yunji Chen

Abstract: Deep reinforcement learning (DRL) has led to a wide range of advances in sequential decision-making tasks. However, the complexity of neural network policies makes them difficult to understand and to deploy with limited computational resources. Currently, employing compact symbolic expressions as symbolic policies is a promising strategy to obtain simple and interpretable policies. Previous symbolic policy methods usually involve complex training processes and pre-trained neural network policies, which are inefficient and limit the application of symbolic policies. In this paper, we propose an efficient gradient-based learning method named Efficient Symbolic Policy Learning (ESPL) that learns the symbolic policy from scratch in an end-to-end way. We introduce a symbolic network as the search space and employ a path selector to find the compact symbolic policy. By doing so we represent the policy with a differentiable symbolic expression and train it in an off-policy manner which further improves the efficiency. In addition, in contrast with previous symbolic policies which only work in single-task RL because of complexity, we extend ESPL to meta-RL to generate symbolic policies for unseen tasks. Experimentally, we show that our approach generates symbolic policies with higher performance and greatly improves data efficiency for single-task RL. In meta-RL, we demonstrate that compared with neural network policies the proposed symbolic policy achieves higher performance and efficiency and shows the potential to be interpretable.

Submitted 1 November, 2023; originally announced November 2023.

Comments: Accepted by NeurIPS2023

arXiv:2311.01771 [pdf, other]

Efficient Generalized Low-Rank Tensor Contextual Bandits

Authors: Qianxin Yi, Yiyang Yang, Shaojie Tang, Jiapeng Liu, Yao Wang

Abstract: In this paper, we aim to build a novel bandit algorithm that is capable of fully harnessing the power of multi-dimensional data and the inherent non-linearity of reward functions to provide highly usable and accountable decision-making services. To this end, we introduce a generalized low-rank tensor contextual bandits model in which an action is formed from three feature vectors, and thus can be represented by a tensor. In this formulation, the reward is determined through a generalized linear function applied to the inner product of the action's feature tensor and a fixed but unknown parameter tensor with a low tubal rank. To effectively achieve the trade-off between exploration and exploitation, we introduce a novel algorithm called "Generalized Low-Rank Tensor Exploration Subspace then Refine" (G-LowTESTR). This algorithm first collects raw data to explore the intrinsic low-rank tensor subspace information embedded in the decision-making scenario, and then converts the original problem into an almost lower-dimensional generalized linear contextual bandits problem. Rigorous theoretical analysis shows that the regret bound of G-LowTESTR is superior to those in vectorization and matricization cases. We conduct a series of simulations and real data experiments to further highlight the effectiveness of G-LowTESTR, leveraging its ability to capitalize on the low-rank tensor structure for enhanced learning.

Submitted 17 January, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

arXiv:2311.01075 [pdf, other]

Contrastive Modules with Temporal Attention for Multi-Task Reinforcement Learning

Authors: Siming Lan, Rui Zhang, Qi Yi, Jiaming Guo, Shaohui Peng, Yunkai Gao, Fan Wu, Ruizhi Chen, Zidong Du, Xing Hu, Xishan Zhang, Ling Li, Yunji Chen

Abstract: In the field of multi-task reinforcement learning, the modular principle, which involves specializing functionalities into different modules and combining them appropriately, has been widely adopted as a promising approach to prevent the negative transfer problem, i.e., performance degradation due to conflicts between tasks. However, most of the existing multi-task RL methods only combine shared modules at the task level, ignoring that there may be conflicts within the task. In addition, these methods do not take into account that, without constraints, some modules may learn similar functions, restricting the expressiveness and generalization capability of modular methods. In this paper, we propose the Contrastive Modules with Temporal Attention (CMTA) method to address these limitations. CMTA constrains the modules to be different from each other by contrastive learning and combines shared modules at a finer granularity than the task level with temporal attention, alleviating the negative transfer within the task and improving the generalization ability and the performance for multi-task RL. We conducted the experiment on Meta-World, a multi-task RL benchmark containing various robotics manipulation tasks. Experimental results show that CMTA outperforms learning each task individually for the first time and achieves substantial performance improvements over the baselines.

Submitted 2 November, 2023; originally announced November 2023.

Comments: This paper has been accepted at NeurIPS 2023 as a poster

arXiv:2310.10522 [pdf, other]

Observation of GRB 221009A early afterglow in X/$γ$-ray energy band

Authors: Chao Zheng, Yan-Qiu Zhang, Shao-Lin Xiong, Cheng-Kui Li, He Gao, Wang-Chen Xue, Jia-Cong Liu, Chen-Wei Wang, Wen-Jun Tan, Wen-Xi Peng, Zheng-Hua An, Ce Cai, Ming-Yu Ge, Dong-Ya Guo, Yue Huang, Bing Li, Ti-Pei Li, Xiao-Bo Li, Xin-Qiao Li, Xu-Fang Li, Jin-Yuan Liao, Cong-Zhan Liu, Fang-Jun Lu, Xiang Ma, Rui Qiao , et al. (23 additional authors not shown)

Abstract: The early afterglow of a Gamma-ray burst (GRB) can provide critical information on the jet and progenitor of the GRB. The extreme brightness of GRB 221009A allows us to probe its early afterglow in unprecedented detail. In this letter, we report comprehensive observation results of the early afterglow of GRB 221009A (from $T_0$+660 s to $T_0$+1860 s, where $T_0$ is the Insight-HXMT/HE trigger time) in X/$γ$-ray energy band (from 20 keV to 20 MeV) by Insight-HXMT/HE, GECAM-C and Fermi/GBM. We find that the spectrum of the early afterglow in 20 keV-20 MeV could be well described by a cutoff power-law with an extra power-law which dominates the low and high energy bands respectively. The cutoff power-law $E_{\rm peak}$ is $\sim$ 30 keV and the power-law photon index is $\sim$ 1.8 throughout the early afterglow phase. By fitting the light curves in different energy bands, we find that a significant achromatic break (from keV to TeV) is required at $T_0$ + 1246$^{+27}_{-26}$ s (i.e. 1021 s since the afterglow starting time $T_{\rm AG}$=$T_0$+225 s), providing compelling evidence of a jet break. Interestingly, both the pre-break and post-break decay slopes vary with energy, and these two slopes become closer in the lower energy band, making the break less identifiable. Intriguingly, the spectrum of the early afterglow experienced a slight hardening before the break and a softening after the break. These results provide new insights into the understanding of this remarkable GRB.

Submitted 19 January, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: Accepted for publication in ApJ Letters on 19-Jan-2024, 11 pages, 7 figures and 2 tables

arXiv:2310.07205 [pdf, other]

Evidence of mini-jet emission in a large emission zone from a magnetically-dominated gamma-ray burst jet

Authors: S. -X. Yi, C. -W. Wang, X. -Y. Shao, R. Moradi, H. Gao, B. Zhang, S. -L. Xiong, S. -N. Zhang, W. -J. Tan, J. -C. Liu, W. -C. Xue, Y. -Q. Zhang, C. Zheng, Y. Wang, P. Zhang, Z. -H. An, C. Cai, P. -Y. Feng, K. Gong, D. -Y. Guo, Y. Huang, B. Li, X. -B. Li, X. -Q. Li, X. -J. Liu , et al. (21 additional authors not shown)

Abstract: The second brightest GRB in history, GRB230307A, provides an ideal laboratory to study the mechanism of GRB prompt emission thanks to its extraordinarily high photon statistics and its single episode activity. Here we demonstrate that the rapidly variable components of its prompt emission compose an overall broad single pulse-like profile. Although these individual rapid components are aligned in time across all energy bands, this overall profile conspires to show a well-defined energy-dependent behavior which is typically seen in single GRB pulses. Such a feature demonstrates that the prompt emission of this burst is from many individual emitting units that are causally linked in an emission site at a large distance from the central engine. Such a scenario is in natural consistency with the internal-collision-induced magnetic reconnection and turbulence framework, which invokes many mini-jets due to local magnetic reconnection that constantly appear and disappear in a global magnetically-dominated jet.

Submitted 21 April, 2025; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: 16 pages, 19 figures, 4 tables. Accepted for publication in ApJ. :)

arXiv:2309.01352 [pdf, other]

Self-driven Grounding: Large Language Model Agents with Automatical Language-aligned Skill Learning

Authors: Shaohui Peng, Xing Hu, Qi Yi, Rui Zhang, Jiaming Guo, Di Huang, Zikang Tian, Ruizhi Chen, Zidong Du, Qi Guo, Yunji Chen, Ling Li

Abstract: Large language models (LLMs) show their powerful automatic reasoning and planning capability with a wealth of semantic knowledge about the human world. However, the grounding problem still hinders the applications of LLMs in the real-world environment. Existing studies try to fine-tune the LLM or utilize pre-defined behavior APIs to bridge the LLMs and the environment, which not only costs huge human efforts to customize for every single task but also weakens the generality strengths of LLMs. To autonomously ground the LLM onto the environment, we proposed the Self-Driven Grounding (SDG) framework to automatically and progressively ground the LLM with self-driven skill learning. SDG first employs the LLM to propose the hypothesis of sub-goals to achieve tasks and then verify the feasibility of the hypothesis via interacting with the underlying environment. Once verified, SDG can then learn generalized skills with the guidance of these successfully grounded subgoals. These skills can be further utilized to accomplish more complex tasks which fail to pass the verification phase. Verified in the famous instruction following task set-BabyAI, SDG achieves comparable performance in the most challenging tasks compared with imitation learning methods that cost millions of demonstrations, proving the effectiveness of learned skills and showing the feasibility and efficiency of our framework.

Submitted 4 September, 2023; originally announced September 2023.

arXiv:2308.11362 [pdf, other]

Calibration of the Timing Performance of GECAM-C

Authors: Shuo Xiao, Ya-Qing Liu, Ke Gong, Zheng-Hua An, Shao-Lin Xiong, Xin-Qiao Li, Xiang-Yang Wen, Wen-Xi Peng, Da-Li Zhang, You-Li Tuo, Shi-Jie Zheng, Li-Ming Song, Ping Wang, Xiao-Yun Zhao, Yue Huang, Xiang Ma, Xiao-Jing Liu, Rui Qiao, Yan-Bing Xu, Sheng Yang, Fan Zhang, Yue Wang, Yan-Qiu Zhang, Wang-Chen Xue, Jia-Cong Liu , et al. (13 additional authors not shown)

Abstract: As a new member of the Gravitational wave high-energy Electromagnetic Counterpart All-sky Monitor (GECAM) after GECAM-A and GECAM-B, GECAM-C (originally called HEBS), which was launched on board the SATech-01 satellite on July 27, 2022, aims to monitor and localize X-ray and gamma-ray transients from $\sim$ 6 keV to 6 MeV. GECAM-C utilizes a similar design to GECAM but operates in a more complex orbital environment. In this work, we utilize the secondary particles simultaneously produced by the cosmic-ray events on orbit and recorded by multiple detectors, to calibrate the relative timing accuracy between all detectors of GECAM-C. We find the result is 0.1 $μ\rm s$, which is the highest time resolution among all GRB detectors ever flown and very helpful in timing analyses such as minimum variable timescale and spectral lags, as well as in time delay localization. Besides, we calibrate the absolute time accuracy using the one-year Crab pulsar data observed by GECAM-C and Fermi/GBM, as well as GECAM-C and GECAM-B. The results are $2.02\pm 2.26\ μ\rm s$ and $5.82\pm 3.59\ μ\rm s$, respectively. Finally, we investigate the spectral lag between the different energy bands of Crab pulsar observed by GECAM and GBM, which is $\sim -0.2\ {\rm μs\ keV^{-1}}$.

Submitted 22 August, 2023; originally announced August 2023.

Comments: submitted

arXiv:2307.06608 [pdf, other]

MF-CLIP: Leveraging CLIP as Surrogate Models for No-box Adversarial Attacks

Authors: Jiaming Zhang, Lingyu Qiu, Qi Yi, Yige Li, Jitao Sang, Changsheng Xu, Dit-Yan Yeung

Abstract: The vulnerability of Deep Neural Networks (DNNs) to adversarial attacks poses a significant challenge to their deployment in safety-critical applications. While extensive research has addressed various attack scenarios, the no-box attack setting where adversaries have no prior knowledge, including access to training data of the target model, remains relatively underexplored despite its practical relevance. This work presents a systematic investigation into leveraging large-scale Vision-Language Models (VLMs), particularly CLIP, as surrogate models for executing no-box attacks. Our theoretical and empirical analyses reveal a key limitation in the execution of no-box attacks stemming from insufficient discriminative capabilities for direct application of vanilla CLIP as a surrogate model. To address this limitation, we propose MF-CLIP: a novel framework that enhances CLIP's effectiveness as a surrogate model through margin-aware feature space optimization. Comprehensive evaluations across diverse architectures and datasets demonstrate that MF-CLIP substantially advances the state-of-the-art in no-box attacks, surpassing existing baselines by 15.23% on standard models and achieving a 9.52% improvement on adversarially trained models. Our code will be made publicly available to facilitate reproducibility and future research in this direction.

Submitted 24 March, 2025; v1 submitted 13 July, 2023; originally announced July 2023.

arXiv:2306.10255 [pdf, other]

doi 10.1029/2022GL102325

The First GECAM Observation Results on Terrestrial Gamma-ray Flashes and Terrestrial Electron Beams

Authors: Y. Zhao, J. C. Liu, S. L. Xiong, W. C. Xue, Q. B. Yi, G. P. Lu, W. Xu, F. C. Lyu, J. C. Sun, W. X. Peng, C. Zheng, Y. Q. Zhang, C. Cai, S. Xiao, S. L. Xie, C. W. Wang, W. J. Tan, Z. H. An, G. Chen, Y. Q. Du, Y. Huang, M. Gao, K. Gong, D. Y. Guo, J. J. He , et al. (37 additional authors not shown)

Abstract: Gravitational-wave high-energy Electromagnetic Counterpart All-sky Monitor (GECAM) is a space-borne instrument dedicated to monitoring high-energy transients, including Terrestrial Gamma-ray Flashes (TGFs) and Terrestrial Electron Beams (TEBs). We implemented a TGF/TEB search algorithm for GECAM, with which 147 bright TGFs, 2 typical TEBs and 2 special TEB-like events are identified during an effective observation time of $\sim$9 months. We show that, with gamma-ray and charged particle detectors, GECAM can effectively identify and distinguish TGFs and TEBs, and measure their temporal and spectral properties in detail. A very high TGF-lightning association rate of $\sim$80\% is obtained between GECAM and GLD360 in east Asia region.

Submitted 17 June, 2023; originally announced June 2023.

Comments: The paper was accepted by Geophysical Research Letters on June 16th, 2023

arXiv:2306.07307 [pdf, other]

Online Prototype Alignment for Few-shot Policy Transfer

Authors: Qi Yi, Rui Zhang, Shaohui Peng, Jiaming Guo, Yunkai Gao, Kaizhao Yuan, Ruizhi Chen, Siming Lan, Xing Hu, Zidong Du, Xishan Zhang, Qi Guo, Yunji Chen

Abstract: Domain adaptation in reinforcement learning (RL) mainly deals with the changes of observation when transferring the policy to a new environment. Many traditional approaches of domain adaptation in RL manage to learn a mapping function between the source and target domain in explicit or implicit ways. However, they typically require access to abundant data from the target domain. Besides, they often rely on visual clues to learn the mapping function and may fail when the source domain looks quite different from the target domain. To address these problems, we propose a novel framework Online Prototype Alignment (OPA) to learn the mapping function based on the functional similarity of elements and is able to achieve the few-shot policy transfer within only several episodes. The key insight of OPA is to introduce an exploration mechanism that can interact with the unseen elements of the target domain in an efficient and purposeful manner, and then connect them with the seen elements in the source domain according to their functionalities (instead of visual clues). Experimental results show that when the target domain looks visually different from the source domain, OPA can achieve better transfer performance even with much fewer samples from the target domain, outperforming prior methods.

Submitted 12 June, 2023; originally announced June 2023.

Comments: This paper has been accepted at ICML2023

arXiv:2305.15368 [pdf]

doi 10.1016/j.compositesb.2023.111048

Directional eddy current probe configuration for in-line detection of out-of-plane wrinkles

Authors: Meirbek Mussatayev, Qiuji Yi, Mark Fitzgerald, Vincent K. Maes, Paul Wilcox, Robert Hughes

Abstract: Real-time monitoring of carbon fibre composites during Automated Fibre Placement (AFP) manufacturing remains a challenge for non-destructive evaluation (NDE) techniques. A directional eddy-current (EC) probe with asymmetric transmit and differential receive (Tx-dRx) coils is designed, constructed and characterized to evaluate the detectability of out-of-plane wrinkles. Initial studies were conducted to determine suitable excitation frequencies and to analyse the impact of relative orientations of driver and pickup coils on wrinkle detectability. The probe configurations are evaluated experimentally and employ a new finite element modelling approach to better understand the relationship between eddy-current density and defect detection. The findings indicate that a probe configuration with an asymmetric driver coil normal to the material surface and aligned with the fibre directions, and with differential pickup coils 90 degrees to the scanning direction, shows the best capability for out-of-plane wrinkle detection, with SNR >20 for wrinkles over 1.3 mm in amplitude.

Submitted 20 October, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: [2024] Elsevier. This manuscript version is made available under the CC BY-NC-ND 4.0 license. [https://doi.org/10.1016/j.compositesb.2023.111048]

Journal ref: Compos. Part B Eng., vol. 268, p. 111048, 2024

arXiv:2303.05069 [pdf, other]

Conceptual Reinforcement Learning for Language-Conditioned Tasks

Authors: Shaohui Peng, Xing Hu, Rui Zhang, Jiaming Guo, Qi Yi, Ruizhi Chen, Zidong Du, Ling Li, Qi Guo, Yunji Chen

Abstract: Despite the broad application of deep reinforcement learning (RL), transferring and adapting the policy to unseen but similar environments is still a significant challenge. Recently, the language-conditioned policy is proposed to facilitate policy transfer through learning the joint representation of observation and text that catches the compact and invariant information across environments. Existing studies of language-conditioned RL methods often learn the joint representation as a simple latent layer for the given instances (episode-specific observation and text), which inevitably includes noisy or irrelevant information and causes spurious correlations that are dependent on instances, thus hurting generalization performance and training efficiency. To address this issue, we propose a conceptual reinforcement learning (CRL) framework to learn the concept-like joint representation for language-conditioned policy. The key insight is that concepts are compact and invariant representations in human cognition through extracting similarities from numerous instances in the real world. In CRL, we propose a multi-level attention encoder and two mutual information constraints for learning compact and invariant concepts. Verified in two challenging environments, RTFM and Messenger, CRL significantly improves the training efficiency (up to 70%) and generalization ability (up to 30%) to the new environment dynamics.

Submitted 9 March, 2023; originally announced March 2023.

Comments: Accepted by AAAI 2023

arXiv:2303.01203 [pdf, other]

Insight-HXMT and GECAM-C observations of the brightest-of-all-time GRB 221009A

Authors: Zheng-Hua An, S. Antier, Xing-Zi Bi, Qing-Cui Bu, Ce Cai, Xue-Lei Cao, Anna-Elisa Camisasca, Zhi Chang, Gang Chen, Li Chen, Tian-Xiang Chen, Wen Chen, Yi-Bao Chen, Yong Chen, Yu-Peng Chen, Michael W. Coughlin, Wei-Wei Cui, Zi-Gao Dai, T. Hussenot-Desenonges, Yan-Qi Du, Yuan-Yuan Du, Yun-Fei Du, Cheng-Cheng Fan, Filippo Frontera, He Gao , et al. (153 additional authors not shown)

Abstract: GRB 221009A is the brightest gamma-ray burst ever detected since the discovery of this kind of energetic explosions. However, an accurate measurement of the prompt emission properties of this burst is very challenging due to its exceptional brightness. With joint observations of \textit{Insight}-HXMT and GECAM-C, we made an unprecedentedly accurate measurement of the emission during the first $\sim$1800 s of GRB 221009A, including its precursor, main emission (ME, which dominates the burst in flux), flaring emission and early afterglow, in the hard X-ray to soft gamma-ray band from $\sim$ 10 keV to $\sim$ 6 MeV. Based on the GECAM-C unsaturated data of the ME, we measure a record-breaking isotropic equivalent energy ($E_{\rm iso}$) of $\bf \sim 1.5 \times 10^{55}$ erg, which is about eight times the total rest-mass energy of the Sun. The early afterglow data require a significant jet break between 650 s and 1100 s, most likely at $\sim950$ s from the afterglow starting time $T_{AG}$, which corresponds to a jet opening angle of $\sim {0.7^\circ} \ (η_γn)^{1/8}$, where $n$ is the ambient medium density in units of $\rm cm^{-3}$ and $η_γ$ is the ratio between $γ$-ray energy and afterglow kinetic energy. The beaming-corrected total $γ$-ray energy $E_γ$ is $\sim 1.15 \times10^{51} \ (η_γn)^{1/4}$ erg, which is typical for long GRBs. These results suggest that this GRB may have a special central engine, which could launch and collimate a very narrowly beamed jet with an ordinary energy budget, leading to exceptionally luminous gamma-ray radiation per unit solid angle. Alternatively, more GRBs might have such a narrow and bright beam, which are missed by an unfavorable viewing angle or have been detected without distance measurement.

Submitted 3 March, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

Comments: Submitted to National Science Review. This paper is under press embargo, contact the corresponding author for details
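
The energy figures quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' analysis: the solar mass and speed of light are standard constants that do not appear in the abstract, the beaming factor is assumed to be the usual one-sided form $1-\cos θ$, and the scaling term $η_γ n$ is set to 1. The function names are hypothetical.

```python
import math

# Standard physical constants in CGS units (assumed; not given in the abstract).
M_SUN_G = 1.989e33   # solar mass in grams
C_CM_S = 2.998e10    # speed of light in cm/s


def solar_rest_mass_ratio(e_iso_erg):
    """Express an energy in units of the Sun's rest-mass energy M_sun * c^2."""
    return e_iso_erg / (M_SUN_G * C_CM_S ** 2)


def beaming_corrected_energy(e_iso_erg, theta_deg):
    """Scale an isotropic-equivalent energy by the assumed factor (1 - cos(theta))."""
    theta = math.radians(theta_deg)
    return e_iso_erg * (1.0 - math.cos(theta))


if __name__ == "__main__":
    e_iso = 1.5e55  # erg, value quoted in the abstract
    print(f"E_iso / (M_sun c^2) = {solar_rest_mass_ratio(e_iso):.2f}")
    print(f"E_gamma (0.7 deg jet) = {beaming_corrected_energy(e_iso, 0.7):.3e} erg")
```

Under these assumptions the ratio comes out a little above eight and the beaming-corrected energy is of order $10^{51}$ erg, in line with the quoted values; the small difference from $1.15 \times 10^{51}$ erg is plausibly due to the rounding of the opening angle to $0.7^\circ$.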

arXiv:2303.00698 [pdf, other]

Cross calibration of gamma-ray detectors (GRD) of GECAM-C

Authors: Yan-Qiu Zhang, Shao-Lin Xiong, Rui Qiao, Dong-Ya Guo, Wen-Xi Peng, Xin-Qiao Li, Wang-Chen Xue, Chao Zheng, Jia-Cong Liu, Wen-Jun Tan, Chen-Wei Wang, Peng Zhang, Ping Wang, Ce Cai, Shuo Xiao, Yue Huang, Pei-Yi Feng, Xiao-Bo Li, Li-Ming Song, Qi-Bin Yi, Yi Zhao, Zhi-Wei Guo, Jian-Jian He, Chao-Yang Li, Ya-Qing Liu , et al. (20 additional authors not shown)

Abstract: The gamma-ray detectors (GRDs) of GECAM-C onboard SATech-01 satellite are designed to monitor gamma-ray transients all over the sky from 6 keV to 6 MeV. The energy response matrix is the key to do spectral measurements of bursts, which is usually generated from GEANT4 simulation and partially verified by the ground calibration. In this work, energy response matrix of GECAM-C GRD is cross-calibrated with Fermi/GBM and Swift/BAT using a sample of Gamma-Ray Bursts (GRBs) and Soft Gamma-Ray Repeaters (SGRs). The calibration results show there is a good agreement between GECAM-C and other reasonably well calibrated instruments (i.e. Fermi/GBM and Swift/BAT). We also find that different GRD detectors of GECAM-C show consistency with each other. All these results indicate that GECAM-C GRD can provide reliable spectral measurements.

Submitted 1 March, 2023; originally announced March 2023.

Comments: preliminary version, will be updated soon

arXiv:2303.00687 [pdf, other]

Ground calibration of Gamma-Ray Detectors of GECAM-C

Authors: Chao Zheng, Zheng-Hua An, Wen-Xi Peng, Da-Li Zhang, Shao-Lin Xiong, Rui. Qiao, Yan-Qiu Zhang, Wang-Chen Xue, Jia-Cong Liu, Pei-Yi Feng, Ce. Cai, Min Gao, Ke Gong, Dong-Ya Guo, Dong-Jie Hou, Gang Li, Xin-Qiao Li, Yan-Guo Li, Mao-Shun Li, Xiao-Hua Liang, Ya-Qing Liu, Xiao-Jing Liu, Li-Ming Song, Xi-Lei Sun, Wen-Jun Tan , et al. (13 additional authors not shown)

Abstract: As a new member of GECAM mission, GECAM-C (also named High Energy Burst Searcher, HEBS) was launched onboard the SATech-01 satellite on July 27th, 2022, which is capable of monitoring gamma-ray transients from $\sim$ 6 keV to 6 MeV. As the main detector, there are 12 gamma-ray detectors (GRDs) equipped for GECAM-C. In order to verify the GECAM-C GRD detector performance and to validate the Monte Carlo simulations of detector response, comprehensive on-ground calibration experiments have been performed using X-ray beam and radioactive sources, including Energy-Channel relation, energy resolution, detection efficiency, SiPM voltage-gain relation and the non-uniformity of positional response. In this paper, the detailed calibration campaigns and data analysis results for GECAM-C GRDs are presented, demonstrating the excellent performance of GECAM-C GRD detectors.

Submitted 30 May, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Comments: third version

arXiv:2303.00537 [pdf, other]

The performance of SiPM-based gamma-ray detector (GRD) of GECAM-C

Authors: Dali Zhang, Chao Zheng, Jiacong Liu, Zhenghua An, Chenwei Wang, Xiangyang Wen, Xinqiao Li, Xilei Sun, Ke Gong, Yaqing Liu, Xiaojing Liu, Sheng Yang, Wenxi Peng, Rui Qiao, Dongya Guo, Peiyi Feng, Yanqiu Zhang, Wangchen Xue, Wenjun Tan, Ce Cai, Shuo Xiao, Qibin Yi, Yanbing Xu, Min Gao, Jinzhou Wang , et al. (20 additional authors not shown)

Abstract: As a new member of GECAM mission, the GECAM-C (also called High Energy Burst Searcher, HEBS) is a gamma-ray all-sky monitor onboard SATech-01 satellite, which was launched on July 27th, 2022 to detect gamma-ray transients from 6 keV to 6 MeV, such as Gamma-Ray Bursts (GRBs), high energy counterpart of Gravitational Waves (GWs) and Fast Radio Bursts (FRBs), and Soft Gamma-ray Repeaters (SGRs). Together with GECAM-A and GECAM-B launched in December 2020, GECAM-C will greatly improve the monitoring coverage, localization, as well as temporal and spectral measurements of gamma-ray transients. GECAM-C employs 12 SiPM-based Gamma-Ray Detectors (GRDs) to detect gamma-ray transients. In this paper, we firstly give a brief description of the design of GECAM-C GRDs, and then focus on the on-ground tests and in-flight performance of GRDs. We also did the comparison study of the SiPM in-flight performance between GECAM-C and GECAM-B. The results show GECAM-C GRD works as expected and is ready to make scientific observations.

Submitted 7 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Comments: 18 pages, 16 figures

arXiv:2302.11755 [pdf, other]

doi 10.1093/mnras/stac3075

Burst search method based on likelihood ratio in Poisson Statistics

Authors: Ce Cai, Shao-Lin Xiong, Wang-Chen Xue, Yi Zhao, Shuo Xiao, Qi-Bin Yi, Zhi-Wei Guo, Jia-Cong Liu, Yan-Qiu Zhang, Chao Zheng, Sheng-Lun Xie, Yan-Qi Du, Xiao-Yun Zhao, Cheng-Kui Li, Ping Wang, Wen-Xi Peng, Shi-Jie Zheng, Li-Ming Song, Xin-Qiao Li, Xiang-Yang Wen, Fan Zhang

Abstract: Searching for X-ray and gamma-ray bursts, including Gamma-ray bursts (GRBs), Soft Gamma-ray Repeaters (SGRs) and high energy transients associated with Gravitational wave (GW) events or Fast radio bursts (FRBs), is of great importance in the multi-messenger and multi-wavelength era. Although a coherent search based on the likelihood ratio and Gaussian statistics has been established and utilized in many studies, this Gaussian-based method could be problematic for weak and short bursts which usually have very few counts. To deal with all bursts including weak ones, here we propose the coherent search in Poisson statistics. We studied the difference between Poisson-based and Gaussian-based search methods by Monte Carlo simulations, and find that the Poisson-based search method has advantages compared to the Gaussian case especially for weak bursts. Our results show that, for very weak bursts with very low number of counts, the Poisson-based search can provide higher significance than the Gaussian-based search, and its likelihood ratio (for background fluctuation) still generally follows the $\chi^2$ distribution, making the significance estimation of searched bursts very convenient. Thus, we suggest that the coherent search based on Poisson likelihood ratio is more appropriate in the search for generic transients including very weak ones.

Submitted 22 February, 2023; originally announced February 2023.

Comments: 10 pages, 10 figures,

Journal ref: MNRAS,2023
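
To make the idea of a likelihood-ratio burst statistic in the abstract above more concrete, here is a minimal single-bin sketch. It is not the coherent multi-detector search described by the authors; it assumes one time bin, a known expected background `b`, and a non-negative signal fitted by maximum likelihood. The function names are hypothetical, and the Gaussian variant assumes a variance equal to the background, which is only one of several possible conventions.

```python
import math


def poisson_lr(n, b):
    """Single-bin Poisson log-likelihood-ratio statistic.

    n: observed counts, b: expected background counts (b > 0).
    The signal is constrained to be non-negative, so the best-fit
    signal is max(n - b, 0) and the statistic is 0 when n <= b.
    """
    if n <= b:
        return 0.0
    return 2.0 * (n * math.log(n / b) - (n - b))


def gaussian_lr(n, b):
    """Single-bin Gaussian counterpart, assuming variance equal to b."""
    if n <= b:
        return 0.0
    return (n - b) ** 2 / b


if __name__ == "__main__":
    # Compare the two statistics at low and high count levels.
    for n, b in [(5, 1.0), (20, 10.0), (1100, 1000.0)]:
        print(n, b, round(poisson_lr(n, b), 3), round(gaussian_lr(n, b), 3))
```

In this toy form the two statistics converge when counts are large and diverge when counts are few, which is the regime the abstract singles out. How each statistic maps to a significance depends on the full search setup and is not reproduced here.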

arXiv:2301.01429 [pdf, other]

doi 10.1088/1674-1056/aca7ed

Atlas of dynamic spectra of fast radio burst FRB 20201124A

Authors: Bo-Jun Wang, Heng Xu, Jin-Chen Jiang, Jiang-Wei Xu, Jia-Rui Niu, Ping Chen, Ke-Jia Lee, Bing Zhang, Wei-Wei Zhu, Su-Bo Dong, Chun-Feng Zhang, Hai Fu, De-Jiang Zhou, Yong-Kun Zhang, Pei Wang, Yi Feng, Ye Li, Dong-Zi Li, Wen-Bin Lu, Yuan-Pei Yang, R. N. Caballero, Ce Cai, Mao-Zheng Chen, Zi-Gao Dai, A. Esamdin , et al. (42 additional authors not shown)

Abstract: Fast radio bursts (FRBs) are highly dispersed millisecond-duration radio bursts, of which the physical origin is still not fully understood. FRB 20201124A is one of the most actively repeating FRBs. In this paper, we present the collection of 1863 burst dynamic spectra of FRB 20201124A measured with the Five-hundred-meter Aperture Spherical radio Telescope (FAST). The current collection, taken from the observation during the FRB active phase from April to June 2021, is the largest burst sample detected in any FRB so far. The standard PSRFITS format is adopted, including dynamic spectra of the burst and the time information of the dynamic spectra. In addition, mask files that help readers identify the pulse positions are also provided.

Submitted 3 January, 2023; originally announced January 2023.

arXiv:2301.01217 [pdf, other]

Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples

Authors: Jiaming Zhang, Xingjun Ma, Qi Yi, Jitao Sang, Yu-Gang Jiang, Yaowei Wang, Changsheng Xu

Abstract: There is a growing interest in developing unlearnable examples (UEs) against visual privacy leaks on the Internet. UEs are training samples added with invisible but unlearnable noise, which have been found can prevent unauthorized training of machine learning models. UEs typically are generated via a bilevel optimization framework with a surrogate model to remove (minimize) errors from the original samples, and then applied to protect the data against unknown target models. However, existing UE generation methods all rely on an ideal assumption called label-consistency, where the hackers and protectors are assumed to hold the same label for a given sample. In this work, we propose and promote a more practical label-agnostic setting, where the hackers may exploit the protected data quite differently from the protectors. For example, an m-class unlearnable dataset held by the protector may be exploited by the hacker as an n-class dataset. Existing UE generation methods are rendered ineffective in this challenging setting. To tackle this challenge, we present a novel technique called Unlearnable Clusters (UCs) to generate label-agnostic unlearnable examples with cluster-wise perturbations. Furthermore, we propose to leverage Vision-and-Language Pre-trained Models (VLPMs) like CLIP as the surrogate model to improve the transferability of the crafted UCs to diverse domains. We empirically verify the effectiveness of our proposed approach under a variety of settings with different datasets, target models, and even commercial platforms Microsoft Azure and Baidu PaddlePaddle. Code is available at \url{https://github.com/jiamingzhang94/Unlearnable-Clusters}.

Submitted 23 March, 2023; v1 submitted 30 December, 2022; originally announced January 2023.

Comments: CVPR2023

arXiv:2211.15570 [pdf, other]

doi 10.3847/1538-4365/acafeb

GECAM Localization of High Energy Transients and the Systematic Error

Authors: Yi Zhao, Wang-Chen Xue, Shao-Lin Xiong, Yuan-Hao Wang, Jia-Cong Liu, Qi Liuo, Yan-Qiu Zhang, Jian-Chao Sun, Xiao-Yun Zhao, Ce Cai, Shuo Xiao, Yue Huang, Xiao-Bo Li, Zhen Zhang, Jin-Yuan Liao, Sheng Yang, Rui Qiao, Dong-Ya Guo, Chao Zheng, Qi-Bin Yi, Sheng-Lun Xie, Zhi-Wei Guo, Chao-Yang Li, Chen-Wei Wang, Wen-Jun Tan , et al. (41 additional authors not shown)

Abstract: Gravitational wave high-energy Electromagnetic Counterpart All-sky Monitor (GECAM) is a pair of microsatellites (i.e. GECAM-A and GECAM-B) dedicated to monitoring gamma-ray transients, including gravitational-wave high-energy electromagnetic counterparts, Gamma-ray Bursts, Soft Gamma-ray Repeaters, Solar Flares and Terrestrial Gamma-ray Flashes. Since its launch in December 2020, GECAM-B has detected hundreds of astronomical and terrestrial events. For these bursts, localization is key to burst identification and classification as well as to multi-wavelength follow-up observations. Here, we propose a Bayesian localization method with Poisson data with Gaussian background profile likelihood to localize GECAM bursts based on the burst counts distribution in detectors with different orientations. We demonstrate that this method can work well for all kinds of bursts, especially for extremely short ones. In addition, we propose a new method to estimate the systematic error of localization based on a confidence level test, which can overcome some problems of the existing method in the literature. We validate this method by Monte Carlo simulations, and then apply it to a burst sample with accurate locations and find that the mean value of the systematic error of GECAM-B localization is $\sim 2.5^{\circ}$. By considering this systematic error, we can obtain a reliable localization probability map for GECAM bursts. Our methods can be applied to other gamma-ray monitors.

Submitted 23 December, 2022; v1 submitted 28 November, 2022; originally announced November 2022.

Comments: The paper has been accepted by Astrophysical Journal Supplement Series

arXiv:2210.07802 [pdf, other]

Object-Category Aware Reinforcement Learning

Authors: Qi Yi, Rui Zhang, Shaohui Peng, Jiaming Guo, Xing Hu, Zidong Du, Xishan Zhang, Qi Guo, Yunji Chen

Abstract: Object-oriented reinforcement learning (OORL) is a promising way to improve the sample efficiency and generalization ability over standard RL. Recent works that try to solve OORL tasks without additional feature engineering mainly focus on learning the object representations and then solving tasks via reasoning based on these object representations. However, none of these works tries to explicitly model the inherent similarity between different object instances of the same category. Objects of the same category should share similar functionalities; therefore, the category is the most critical property of an object. Following this insight, we propose a novel framework named Object-Category Aware Reinforcement Learning (OCARL), which utilizes the category information of objects to facilitate both perception and reasoning. OCARL consists of three parts: (1) Category-Aware Unsupervised Object Discovery (UOD), which discovers the objects as well as their corresponding categories; (2) Object-Category Aware Perception, which encodes the category information and is also robust to the incompleteness of (1) at the same time; (3) Object-Centric Modular Reasoning, which adopts multiple independent and object-category-specific networks when reasoning based on objects. Our experiments show that OCARL can improve both the sample efficiency and generalization in the OORL domain.

Submitted 13 October, 2022; originally announced October 2022.

Comments: This paper is to be published in NeurIPS 2022

arXiv:2210.06964 [pdf, other]

Causality-driven Hierarchical Structure Discovery for Reinforcement Learning

Authors: Shaohui Peng, Xing Hu, Rui Zhang, Ke Tang, Jiaming Guo, Qi Yi, Ruizhi Chen, Xishan Zhang, Zidong Du, Ling Li, Qi Guo, Yunji Chen

Abstract: Hierarchical reinforcement learning (HRL) effectively improves agents' exploration efficiency on tasks with sparse reward, with the guidance of high-quality hierarchical structures (e.g., subgoals or options). However, how to automatically discover high-quality hierarchical structures is still a great challenge. Previous HRL methods can hardly discover the hierarchical structures in complex environments because the randomness-driven exploration paradigm they rely on has low exploration efficiency. To address this issue, we propose CDHRL, a causality-driven hierarchical reinforcement learning framework, leveraging causality-driven discovery instead of randomness-driven exploration to effectively build high-quality hierarchical structures in complicated environments. The key insight is that the causalities among environment variables are a natural fit for modeling reachable subgoals and their dependencies and can guide the construction of high-quality hierarchical structures. The results in two complex environments, 2D-Minecraft and Eden, show that CDHRL significantly boosts exploration efficiency with the causality-driven paradigm.

Submitted 13 October, 2022; originally announced October 2022.

Comments: Accepted by NeurIPS 2022

Showing 1–50 of 115 results for author: Yi, Q