Search | arXiv e-print repository

Collimated Hard X-Rays from Hybrid Laser and Plasma Wakefield Accelerators

Authors: Hong Zhang, Jianmeng Wei, Mengyuan Chu, Jiale Zheng, Zhiheng Lou, Ruoxuan Ma, Xizhuan Chen, Hao Wang, Gaojie Zeng, Hang Guo, Yinlong Zheng, Hai Jiang, Yanjie Ge, Kangnan Jiang, Runshu Hu, Jiayi Qian, Jiacheng Zhu, Zongxin Zhang, Yi Xu, Yuxin Leng, Song Li, Ke Feng, Wentao Wang, Ruxin Li

Abstract: We report a synergistic enhancement of betatron radiation based on the hybrid laser and plasma wakefield acceleration scheme. Quasi-phase-stable acceleration in an up-ramp plasma density first generates GeV-energy electron beams that act as a drive beam for PWFA, which then further accelerates the witness beam to GeV energies, enhancing both photon energy and flux. A full width at half maximum divergence $(6.1 \pm 1.9)\times(5.8\pm 1.6) $ mrad$^2$ of betatron radiation, a critical energy of $71 \pm 8$ keV, and an average flux of more than $10^{14}$ photons per steradian above 5 keV were all experimentally obtained thanks to this scheme, which was an order of magnitude higher than the previous reports. Quasi-three-dimensional particle-in-cell simulations were used to model the acceleration and radiation of the electrons in our experimental conditions, establishing a new paradigm for compact collimated hard X-ray sources.

Submitted 12 June, 2025; v1 submitted 7 June, 2025; originally announced June 2025.

Comments: 7 pages, 6 figures

arXiv:2505.00125 [pdf, other]

Roadmap on Advancements of the FHI-aims Software Package

Authors: Joseph W. Abbott, Carlos Mera Acosta, Alaa Akkoush, Alberto Ambrosetti, Viktor Atalla, Alexej Bagrets, Jörg Behler, Daniel Berger, Björn Bieniek, Jonas Björk, Volker Blum, Saeed Bohloul, Connor L. Box, Nicholas Boyer, Danilo Simoes Brambila, Gabriel A. Bramley, Kyle R. Bryenton, María Camarasa-Gómez, Christian Carbogno, Fabio Caruso, Sucismita Chutia, Michele Ceriotti, Gábor Csányi, William Dawson, Francisco A. Delesma, et al. (177 additional authors not shown)

Abstract: Electronic-structure theory is the foundation of the description of materials including multiscale modeling of their properties and functions. Obviously, without sufficient accuracy at the base, reliable predictions are unlikely at any level that follows. The software package FHI-aims has proven to be a game changer for accurate free-energy calculations because of its scalability, numerical precision, and its efficient handling of density functional theory (DFT) with hybrid functionals and van der Waals interactions. It treats molecules, clusters, and extended systems (solids and liquids) on an equal footing. Besides DFT, FHI-aims also includes quantum-chemistry methods, descriptions for excited states and vibrations, and calculations of various types of transport. Recent advancements address the integration of FHI-aims into an increasing number of workflows and various artificial intelligence (AI) methods. This Roadmap describes the state-of-the-art of FHI-aims and advancements that are currently ongoing or planned.

Submitted 5 June, 2025; v1 submitted 30 April, 2025; originally announced May 2025.

Comments: arXiv admin note: Includes articles arXiv:2502.02460, arXiv:2501.02550, arXiv:2411.01680, arXiv:2501.16091, arXiv:2411.04951

arXiv:2504.17787 [pdf, other]

The Fourth Monocular Depth Estimation Challenge

Authors: Anton Obukhov, Matteo Poggi, Fabio Tosi, Ripudaman Singh Arora, Jaime Spencer, Chris Russell, Simon Hadfield, Richard Bowden, Shuaihang Wang, Zhenxin Ma, Weijie Chen, Baobei Xu, Fengyu Sun, Di Xie, Jiang Zhu, Mykola Lavreniuk, Haining Guan, Qun Wu, Yupei Zeng, Chao Lu, Huanran Wang, Guangyuan Zhou, Haotian Zhang, Jianxiong Wang, Qiang Rao, et al. (32 additional authors not shown)

Abstract: This paper presents the results of the fourth edition of the Monocular Depth Estimation Challenge (MDEC), which focuses on zero-shot generalization to the SYNS-Patches benchmark, a dataset featuring challenging environments in both natural and indoor settings. In this edition, we revised the evaluation protocol to use least-squares alignment with two degrees of freedom to support disparity and affine-invariant predictions. We also revised the baselines and included popular off-the-shelf methods: Depth Anything v2 and Marigold. The challenge received a total of 24 submissions that outperformed the baselines on the test set; 10 of these included a report describing their approach, with most leading methods relying on affine-invariant predictions. The challenge winners improved the 3D F-Score over the previous edition's best result, raising it from 22.58% to 23.05%.

Submitted 24 April, 2025; originally announced April 2025.

Comments: To appear in CVPRW2025
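
Note on the entry above: the abstract mentions least-squares alignment with two degrees of freedom. A minimal sketch of what such an alignment can look like is given here, assuming the two degrees of freedom are a scale and a shift fitted between predicted and reference values. The function names, the flat-list input, and the absence of any validity masking are illustrative assumptions and are not taken from the challenge's evaluation code.

```python
def align_scale_shift(pred, target):
    """Fit scale s and shift t that minimize sum((s * p + t - g) ** 2).

    pred and target are equally long sequences of floats (hypothetical
    flattened prediction and reference values).
    """
    n = len(pred)
    if n != len(target) or n < 2:
        raise ValueError("need two equally sized sequences with at least 2 values")
    mean_p = sum(pred) / n
    mean_g = sum(target) / n
    # Closed-form solution of the 2-parameter linear least-squares problem.
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0:
        raise ValueError("prediction is constant; the scale is undetermined")
    cov_pg = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, target))
    scale = cov_pg / var_p
    shift = mean_g - scale * mean_p
    return scale, shift


def apply_alignment(pred, scale, shift):
    """Map predictions into the reference frame before computing metrics."""
    return [scale * p + shift for p in pred]
```

With this kind of alignment, a prediction that is correct only up to an affine transform is not penalized for its unknown scale and offset, which is why it suits affine-invariant methods.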

arXiv:2504.09946 [pdf, other]

Assessing Judging Bias in Large Reasoning Models: An Empirical Study

Authors: Qian Wang, Zhanzhi Lou, Zhenheng Tang, Nuo Chen, Xuandong Zhao, Wenxuan Zhang, Dawn Song, Bingsheng He

Abstract: Large Reasoning Models (LRMs) like DeepSeek-R1 and OpenAI-o1 have demonstrated remarkable reasoning capabilities, raising important questions about their biases in LLM-as-a-judge settings. We present a comprehensive benchmark comparing judging biases between LLMs and LRMs across both subjective preference-alignment datasets and objective fact-based datasets. Through investigation of bandwagon, authority, position, and distraction biases, we uncover four key findings: (1) despite their advanced reasoning capabilities, LRMs remain susceptible to the above biases; (2) LRMs demonstrate better robustness than LLMs specifically on fact-related datasets; (3) LRMs exhibit notable position bias, preferring options in later positions; and (4) we identify a novel "superficial reflection bias" where phrases mimicking reasoning (e.g., "wait, let me think...") significantly influence model judgments. To address these biases, we design and evaluate three mitigation strategies: specialized system prompts that reduce judging biases by up to 19% in preference alignment datasets and 14% in fact-related datasets, in-context learning that provides up to 27% improvement on preference tasks but shows inconsistent results on factual tasks, and a self-reflection mechanism that reduces biases by up to 10% in preference datasets and 16% in fact-related datasets, with self-reflection proving particularly effective for LRMs. Our work provides crucial insights for developing more reliable LLM-as-a-Judge frameworks, especially as LRMs become increasingly deployed as automated judges.

Submitted 17 April, 2025; v1 submitted 14 April, 2025; originally announced April 2025.

arXiv:2504.06205 [pdf, other]

HRMedSeg: Unlocking High-resolution Medical Image segmentation via Memory-efficient Attention Modeling

Authors: Qing Xu, Zhenye Lou, Chenxin Li, Xiangjian He, Rong Qu, Tesema Fiseha Berhanu, Yi Wang, Wenting Duan, Zhen Chen

Abstract: High-resolution segmentation is critical for precise disease diagnosis by extracting micro-imaging information from medical images. Existing transformer-based encoder-decoder frameworks have demonstrated remarkable versatility and zero-shot performance in medical segmentation. While beneficial, they usually incur huge memory costs when handling large-size segmentation mask predictions, which makes them expensive to apply in real-world scenarios. To address this limitation, we propose a memory-efficient framework for high-resolution medical image segmentation, called HRMedSeg. Specifically, we first devise a lightweight gated vision transformer (LGViT) as our image encoder to model long-range dependencies with linear complexity. Then, we design an efficient cross-multiscale decoder (ECM-Decoder) to generate high-resolution segmentation masks. Moreover, we utilize feature distillation during pretraining to unleash the potential of our proposed model. Extensive experiments reveal that HRMedSeg outperforms state-of-the-art methods in diverse high-resolution medical image segmentation tasks. In particular, HRMedSeg uses only 0.59GB GPU memory per batch during fine-tuning, demonstrating low training costs. Besides, when HRMedSeg is combined with the Segment Anything Model (SAM), our HRMedSegSAM uses only 0.61% of the parameters of SAM-H. The code is available at https://github.com/xq141839/HRMedSeg.

Submitted 8 April, 2025; originally announced April 2025.

Comments: Under Review

arXiv:2504.04236 [pdf, other]

The effects of temperature and viscosity on the metachronal swimming of crustaceans

Authors: Adrian Herrera-Amaya, Nils B. Tack, Zhipeng Lou, Chengyu Li, Monica M. Wilhelmus

Abstract: Temperature changes as small as $3 ^\circ$C have been observed to significantly impact how self-propelled organisms move through their environment, especially for those inhabiting the transitional flow regime in which both viscous and inertial effects are important. Nonetheless, many oceanic species can successfully migrate across temperature changes on the order of $20 ^\circ$C, corresponding to $40 \%$ differences in viscosity, via metachronal propulsion, suggesting that this propulsion mechanism is resilient to drastic changes in water column properties. We investigate marsh grass shrimp (\textit{Palaemon vulgaris}) as a model organism to explore the combined physical and physiological effects on their locomotion at natural seasonal temperature extremes ($6^\circ - 20^\circ$C). Experimentally, we manipulate temperature and viscosity independently to isolate physical and physiological effects. We then use the shrimp morphology and gait data to inform a computational fluid dynamics parametric study to estimate the force-to-power ratios of varying viscosity and beat frequencies through naturally occurring extremes. Our research demonstrates that shrimp do not modify their gait parameters in response to naturally occurring viscosity changes, and their swimming performance is impacted by less than $9 \%$. The robustness of the metachronal gait is evidence of the ecological success of shrimp-like organisms in all climates, from the tropics to polar waters and inland freshwater.

Submitted 5 April, 2025; originally announced April 2025.

arXiv:2503.10136 [pdf, ps, other]

A Max-Min problem on spectral radius and connectedness of graphs

Authors: Zhenzhen Lou, Changxiang He

Abstract: In the past decades, many scholars have asked which edge-extremal problems have spectral analogues. Recently, Wang, Kang and Xue showed an interesting result on $F$-free graphs [J. Combin. Theory Ser. B 159 (2023) 20--41]. In this paper, we study the above problem on critical graphs. Let $P$ be a property defined on a family $\mathbb{G}$ of graphs. A graph $G$ in $\mathbb{G}$ is said to be $P$-critical if it has the property $P$ but $G-e$ no longer has it for any edge $e\in E(G)$. In particular, a graph is minimally $k$-(edge)-connected if it is $k$-connected (respectively, $k$-edge-connected) and deleting an arbitrary edge always leaves a graph which is not $k$-connected (respectively, $k$-edge-connected). An interesting Max-Min problem asks: what is the maximal spectral radius of an $n$-vertex minimally $k$-(edge)-connected graph? In 2019, Chen and Guo [Discrete Math. 342 (2019) 2092--2099] gave the answer for $k=2$. In 2021, Fan, Goryainov and Lin [Discrete Appl. Math. 305 (2021) 154--163] determined the extremal spectral radius for minimally $3$-connected graphs. We obtain some structural properties of minimally $k$-(edge)-connected graphs. Furthermore, we solve the above Max-Min problem for $k\geq3$, which implies that every minimally $k$-(edge)-connected graph with maximal spectral radius also has the maximal number of edges. Finally, a general problem is posed for further research.

Submitted 13 March, 2025; originally announced March 2025.

arXiv:2503.05140 [pdf, other]

Mixed norm estimates for dilated averages over planar curves

Authors: Junfeng Li, Zengjian Lou, Haixia Yu

Abstract: In this paper, we investigate the mixed norm estimates for the operator $T$ associated with a dilated plane curve $(ut, uγ(t))$, defined by \[ Tf(x, u) := \int_{0}^{1} f(x_1 - ut, x_2 - uγ(t)) \, dt, \] where $x := (x_1, x_2)$ and $γ$ is a general plane curve satisfying appropriate smoothness and curvature conditions. Our results partially address a problem posed by Hickman [J. Funct. Anal. 2016] in the two-dimensional setting. More precisely, we establish the $L_x^p(\mathbb{R}^2) \rightarrow L_x^q L_u^r(\mathbb{R}^2 \times [1, 2])$ (space-time) estimates for $T$, whenever $(\frac{1}{p},\frac{1}{q})$ satisfies \[ \max\left\{0, \frac{1}{2p} - \frac{1}{2r}, \frac{3}{p} - \frac{r+2}{r}\right\} < \frac{1}{q} \leq \frac{1}{p} < \frac{r+1}{2r} \] and \[ 1 + (1 + ω)\left(\frac{1}{q} - \frac{1}{p}\right) > 0, \] where $r \in [1, \infty]$ and $ω := \limsup_{t \rightarrow 0^+} \frac{\ln|γ(t)|}{\ln t}$. These results are sharp, except for certain borderline cases. Additionally, we examine the $L_x^p(\mathbb{R}^2) \rightarrow L_u^r L_x^q(\mathbb{R}^2 \times [1, 2])$ (time-space) estimates for $T$, which, in particular, are almost sharp when $p=2$.

Submitted 6 March, 2025; originally announced March 2025.

arXiv:2502.09941 [pdf, other]

A Lightweight and Effective Image Tampering Localization Network with Vision Mamba

Authors: Kun Guo, Gang Cao, Zijie Lou, Xianglin Huang, Jiaoyun Liu

Abstract: Current image tampering localization methods primarily rely on Convolutional Neural Networks (CNNs) and Transformers. While CNNs suffer from limited local receptive fields, Transformers offer global context modeling at the expense of quadratic computational complexity. Recently, the state space model Mamba has emerged as a competitive alternative, enabling linear-complexity global dependency modeling. Inspired by it, we propose a lightweight and effective FORensic network based on vision MAmba (ForMa) for blind image tampering localization. First, ForMa captures multi-scale global features, achieving efficient global dependency modeling with linear complexity. Then the pixel-wise localization map is generated by a lightweight decoder, which employs a parameter-free pixel shuffle layer for upsampling. Additionally, a noise-assisted decoding strategy is proposed to integrate complementary manipulation traces from tampered images, boosting decoder sensitivity to forgery cues. Experimental results on 10 standard datasets demonstrate that ForMa achieves state-of-the-art generalization ability and robustness, while maintaining the lowest computational complexity. Code is available at https://github.com/multimediaFor/ForMa.

Submitted 14 February, 2025; originally announced February 2025.
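
Note on the entry above: the abstract refers to a parameter-free pixel shuffle layer for upsampling. The sketch below illustrates the general idea on plain nested lists, assuming the common sub-pixel layout in which a tensor of shape (C·r², H, W) is rearranged into (C, H·r, W·r). The channel ordering and the function name are assumptions for illustration and are not taken from the ForMa code.

```python
def pixel_shuffle(features, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    No weights are involved: the upsampling only moves values from the
    channel dimension into the spatial dimensions.
    """
    channels_in = len(features)
    if r < 1 or channels_in % (r * r) != 0:
        raise ValueError("channel count must be a multiple of r * r")
    c_out = channels_in // (r * r)
    h, w = len(features[0]), len(features[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # vertical offset inside each r x r cell
            for j in range(r):      # horizontal offset inside each r x r cell
                src = features[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out
```

For example, four 1x1 channels holding 1, 2, 3, 4 become a single 2x2 channel `[[1, 2], [3, 4]]` when `r = 2`.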

arXiv:2501.17472 [pdf, other]

A Heliocentric-orbiting Objects Processing System (HOPS) for the Wide Field Survey Telescope: Architecture, Processing Workflow, and Preliminary Results

Authors: Shao-Han Wang, Bing-Xue Fu, Jun-Qiang Lu, LuLu Fan, Min-Xuan Cai, Ze-Lin Xu, Xu Kong, Haibin Zhao, Bin Li, Ya-Ting Liu, Qing-feng Zhu, Xu Zhou, Zhen Wan, Jingquan Cheng, Ji-an Jiang, Feng Li, Ming Liang, Hao Liu, Wentao Luo, Zhen Lou, Hairen Wang, Jian Wang, Tinggui Wang, Yongquan Xue, Hongfei Zhang, et al. (1 additional author not shown)

Abstract: Wide-field surveys have markedly enhanced the discovery and study of solar system objects (SSOs). The 2.5-meter Wide Field Survey Telescope (WFST) represents the foremost facility dedicated to optical time-domain surveys in the northern hemisphere. To fully exploit WFST's capabilities for SSO detection, we have developed a heliocentric-orbiting objects processing system (HOPS) tailored for identifying these objects. This system integrates HelioLinC3D, an algorithm well suited for the WFST survey cadence, characterized by revisiting the same sky field twice on the majority of nights. In this paper, we outline the architecture and processing flow of our SSO processing system. The application of the system to the WFST pilot survey data collected between March and May 2024 demonstrates exceptional performance in terms of both temporal efficiency and completeness. A total of 658,489 observations encompassing 38,520 known asteroids have been documented, and 241 newly discovered asteroids have been assigned provisional designations. In particular, 27% of these new discoveries were achieved using merely two observations per night on three nights. The preliminary results not only illuminate the effectiveness of integrating HelioLinC3D within the SSO processing system, but also emphasize the considerable potential contributions of WFST to the field of solar system science.

Submitted 29 January, 2025; originally announced January 2025.

Comments: 23 pages, 6 figures, submitted to AAS journal

arXiv:2501.16741 [pdf, other]

Quantum Geometric Origin of Strain-Tunable Giant Second-Harmonic Generation in Bi$_2$O$_2$X (X=S, Se, Te)

Authors: Zhefeng Lou, Zhihao Gong, Ziye Zhu, Wenbin Li, Xiao Lin, Hua Wang

Abstract: Two-dimensional (2D) materials with giant nonlinear optical (NLO) responses are essential for the development of advanced on-chip NLO devices. Using first-principles calculations, we predict a remarkable strain-induced enhancement of second-harmonic generation (SHG) in the high-performance 2D semiconductors Bi$_2$O$_2$X (X = S, Se, Te). The SHG susceptibilities of Bi$_2$O$_2$X under strain are on the order of 1 nm/V, rivalling the highest values reported among 2D materials. This giant SHG response originates from gauge-invariant geometric quantities, including the quantum metric, shift vector, and triple phase product. The strain also induces a bandgap variation in Bi$_2$O$_2$X. Intriguingly, in Bi$_2$O$_2$Te, strain-induced bandgap tuning drives a transition from a semiconductor to a half-metal, and ultimately to a polar metal. Our findings present a unique platform that combines strain-tunable bandgap engineering with exceptional NLO properties, while also highlighting the crucial role of quantum geometry in enhancing SHG.

Submitted 28 January, 2025; originally announced January 2025.

Comments: 8 pages, 4 figures

arXiv:2501.09466 [pdf, other]

DEFOM-Stereo: Depth Foundation Model Based Stereo Matching

Authors: Hualie Jiang, Zhiqiang Lou, Laiyan Ding, Rui Xu, Minglang Tan, Wenjie Jiang, Rui Huang

Abstract: Stereo matching is a key technique for metric depth estimation in computer vision and robotics. Real-world challenges like occlusion and non-texture hinder accurate disparity estimation from binocular matching cues. Recently, monocular relative depth estimation has shown remarkable generalization using vision foundation models. Thus, to facilitate robust stereo matching with monocular depth cues, we incorporate a robust monocular relative depth model into the recurrent stereo-matching framework, building a new framework for depth foundation model-based stereo-matching, DEFOM-Stereo. In the feature extraction stage, we construct the combined context and matching feature encoder by integrating features from conventional CNNs and DEFOM. In the update stage, we use the depth predicted by DEFOM to initialize the recurrent disparity and introduce a scale update module to refine the disparity at the correct scale. DEFOM-Stereo is verified to have much stronger zero-shot generalization compared with SOTA methods. Moreover, DEFOM-Stereo achieves top performance on the KITTI 2012, KITTI 2015, Middlebury, and ETH3D benchmarks, ranking $1^{st}$ on many metrics. In the joint evaluation under the robust vision challenge, our model simultaneously outperforms previous models on the individual benchmarks, further demonstrating its outstanding capabilities.

Submitted 23 April, 2025; v1 submitted 16 January, 2025; originally announced January 2025.

Comments: https://insta360-research-team.github.io/DEFOM-Stereo/

Journal ref: CVPR 2025

arXiv:2412.11214 [pdf, other]

Image Forgery Localization with State Space Models

Authors: Zijie Lou, Gang Cao, Kun Guo, Shaowei Weng, Lifang Yu

Abstract: Pixel dependency modeling from tampered images is pivotal for image forgery localization. Current approaches predominantly rely on Convolutional Neural Networks (CNNs) or Transformer-based models, which often either lack sufficient receptive fields or entail significant computational overheads. Recently, State Space Models (SSMs), exemplified by Mamba, have emerged as a promising approach. They not only excel in modeling long-range interactions but also maintain a linear computational complexity. In this paper, we propose LoMa, a novel image forgery localization method that leverages selective SSMs. Specifically, LoMa initially employs atrous selective scan to traverse the spatial domain and convert the tampered image into ordered patch sequences, and subsequently applies multi-directional state space modeling. In addition, an auxiliary convolutional branch is introduced to enhance local feature extraction. Extensive experimental results validate the superiority of LoMa over CNN-based and Transformer-based state-of-the-art methods. To the best of our knowledge, this is the first image forgery localization model built on an SSM-based model. We aim to establish a baseline and provide valuable insights for the future development of more efficient and effective SSM-based forgery localization models. Code is available at https://github.com/multimediaFor/LoMa.

Submitted 14 February, 2025; v1 submitted 15 December, 2024; originally announced December 2024.

arXiv:2410.05672 [pdf, ps, other]

Embedding derivatives and derivative Area operators of Hardy spaces into Lebesgue spaces

Authors: Xiaosong Liu, Zengjian Lou, Zixing Yuan, Ruhan Zhao

Abstract: We characterize the compactness of embedding derivatives from Hardy space $H^p$ into Lebesgue space $L^q(μ)$. We also completely characterize the boundedness and compactness of derivative area operators from $H^p$ into $L^q(\mathbb{S}_n)$, $0<p, q<\infty$. Some of the tools used in the proof of the one-dimensional case are not available in higher dimensions, such as the strong factorization of Hardy spaces. Therefore, we need the theory of tent spaces which was established by Coifman, Meyer and Stein in 1985.

Submitted 8 October, 2024; originally announced October 2024.

Comments: 28 pages

MSC Class: Primary 47B38; Secondary 32A35; 32A37

ACM Class: F.2.2; I.2.7

arXiv:2408.11787 [pdf, other]

NuSegDG: Integration of Heterogeneous Space and Gaussian Kernel for Domain-Generalized Nuclei Segmentation

Authors: Zhenye Lou, Qing Xu, Zekun Jiang, Xiangjian He, Zhen Chen, Yi Wang, Chenxin Li, Maggie M. He, Wenting Duan

Abstract: Domain-generalized nuclei segmentation refers to the generalizability of models to unseen domains based on knowledge learned from source domains and is challenged by various image conditions, cell types, and stain strategies. Recently, the Segment Anything Model (SAM) has achieved great success in universal image segmentation through interactive prompt modes (e.g., point and box). Despite its strengths, the original SAM presents limited adaptation to medical images. Moreover, SAM requires manual bounding box prompts for each object to produce satisfactory segmentation masks, so it is laborious in nuclei segmentation scenarios. To address these limitations, we propose a domain-generalizable framework for nuclei image segmentation, abbreviated to NuSegDG. Specifically, we first devise a Heterogeneous Space Adapter (HS-Adapter) to learn multi-dimensional feature representations of different nuclei domains by injecting a small number of trainable parameters into the image encoder of SAM. To alleviate the labor-intensive requirement of manual prompts, we introduce a Gaussian-Kernel Prompt Encoder (GKP-Encoder) to generate density maps driven by a single point, which guides segmentation predictions by mixing position prompts and semantic prompts. Furthermore, we present a Two-Stage Mask Decoder (TSM-Decoder) to effectively convert semantic masks to instance maps without the manual demand for morphological shape refinement. Based on our experimental evaluations, the proposed NuSegDG demonstrates state-of-the-art performance in nuclei instance segmentation, exhibiting superior domain generalization capabilities. The source code is available at https://github.com/xq141839/NuSegDG.

Submitted 24 August, 2024; v1 submitted 21 August, 2024; originally announced August 2024.

Comments: Under Review
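
Note on the entry above: the abstract describes density maps driven by a single point prompt. As a rough illustration only, the sketch below builds an isotropic Gaussian map centered on one point. The GKP-Encoder in the paper is a learned module whose details are not given in the abstract, so the function name, the fixed `sigma`, and the unnormalized peak value of 1.0 are all assumptions.

```python
import math


def gaussian_density_map(height, width, point, sigma=2.0):
    """Return a height x width map that peaks at 1.0 on `point` (row, col).

    sigma is a hypothetical kernel width in pixels; larger values spread
    the prompt over a wider neighborhood.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    row0, col0 = point
    return [
        [
            math.exp(-((r - row0) ** 2 + (c - col0) ** 2) / (2.0 * sigma ** 2))
            for c in range(width)
        ]
        for r in range(height)
    ]
```

A map like this turns a single click into a soft spatial prior, which can then be combined with image features instead of requiring a bounding box per nucleus.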

arXiv:2408.02929 [pdf, other]

Segmenting Small Stroke Lesions with Novel Labeling Strategies

Authors: Liang Shang, Zhengyang Lou, Andrew L. Alexander, Vivek Prabhakaran, William A. Sethares, Veena A. Nair, Nagesh Adluru

Abstract: Deep neural networks have demonstrated exceptional efficacy in stroke lesion segmentation. However, the delineation of small lesions, critical for stroke diagnosis, remains a challenge. In this study, we propose two straightforward yet powerful approaches that can be seamlessly integrated into a variety of networks: Multi-Size Labeling (MSL) and Distance-Based Labeling (DBL), with the aim of enhancing the segmentation accuracy of small lesions. MSL divides lesion masks into various categories based on lesion volume while DBL emphasizes the lesion boundaries. Experimental evaluations on the Anatomical Tracings of Lesions After Stroke (ATLAS) v2.0 dataset showcase that an ensemble of MSL and DBL achieves consistently better or equal performance on recall (3.6% and 3.7%), F1 (2.4% and 1.5%), and Dice scores (1.3% and 0.0%) compared to the top-1 winner of the 2022 MICCAI ATLAS Challenge on both the subset only containing small lesions and the entire dataset, respectively. Notably, on the mini-lesion subset, a single MSL model surpasses the previous best ensemble strategy, with enhancements of 1.0% and 0.3% on F1 and Dice scores, respectively. Our code is available at: https://github.com/nadluru/StrokeLesSeg.

Submitted 5 August, 2024; originally announced August 2024.

arXiv:2407.18458 [pdf, other]

Phase engineering of giant second harmonic generation in Bi$_2$O$_2$Se

Authors: Zhefeng Lou, Yingjie Zhao, Zhihao Gong, Ziye Zhu, Mengqi Wu, Tao Wang, Jialu Wang, Haoyu Qi, Huakun Zuo, Zhuokai Xu, Jichuang Shen, Zhiwei Wang, Lan Li, Shuigang Xu, Wei Kong, Wenbin Li, Xiaorui Zheng, Hua Wang, Xiao Lin

Abstract: Two-dimensional (2D) materials with remarkable second-harmonic generation (SHG) hold promise for future on-chip nonlinear optics. Relevant materials with both giant SHG response and environmental stability are long-sought targets. Here, we demonstrate the enormous SHG from the phase engineering of a high-performance semiconductor, Bi$_2$O$_2$Se (BOS), under uniaxial strain. SHG signals captured in strained 20 nm-BOS films exceed those of NbOI$_2$ and NbOCl$_2$ of similar thickness by a factor of 10, and are four orders of magnitude higher than monolayer-MoS$_2$, resulting in a significant second-order nonlinear susceptibility on the order of 1 nm V$^{-1}$. Intriguingly, the strain enables continuous adjustment of the ferroelectric phase transition across room temperature. Consequently, an exceptionally large tunability of SHG, approximately six orders of magnitude, is achieved through strain or thermal modulation. This colossal SHG, originating from the geometric phase of Bloch wave functions and coupled with sensitive tunability through multiple approaches in this air-stable 2D semiconductor, opens new possibilities for designing chip-scale, switchable nonlinear optical devices.

Submitted 25 July, 2024; originally announced July 2024.

arXiv:2406.17628 [pdf, other]

Video Inpainting Localization with Contrastive Learning

Authors: Zijie Lou, Gang Cao, Man Lin

Abstract: Deep video inpainting is typically used as malicious manipulation to remove important objects for creating fake videos. It is significant to identify the inpainted regions blindly. This letter proposes a simple yet effective forensic scheme for Video Inpainting LOcalization with ContrAstive Learning (ViLocal). Specifically, a 3D Uniformer encoder is applied to the video noise residual for learning effective spatiotemporal forensic features. To enhance the discriminative power, supervised contrastive learning is adopted to capture the local inconsistency of inpainted videos through attracting/repelling the positive/negative pristine and forged pixel pairs. A pixel-wise inpainting localization map is yielded by a lightweight convolution decoder with a specialized two-stage training strategy. To prepare enough training samples, we build a video object segmentation dataset of 2500 videos with pixel-level annotations per frame. Extensive experimental results validate the superiority of ViLocal over state-of-the-arts. Code and dataset will be available at https://github.com/multimediaFor/ViLocal.

Submitted 25 June, 2024; originally announced June 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2406.13576

arXiv:2406.13576 [pdf, other]

Trusted Video Inpainting Localization via Deep Attentive Noise Learning

Authors: Zijie Lou, Gang Cao, Man Lin

Abstract: Digital video inpainting techniques have been substantially improved with deep learning in recent years. Although inpainting is originally designed to repair damaged areas, it can also be used as malicious manipulation to remove important objects for creating false scenes and facts. As such it is significant to identify inpainted regions blindly. In this paper, we present a Trusted Video Inpainting Localization network (TruVIL) with excellent robustness and generalization ability. Observing that high-frequency noise can effectively unveil the inpainted regions, we design deep attentive noise learning in multiple stages to capture the inpainting traces. Firstly, a multi-scale noise extraction module based on 3D High Pass (HP3D) layers is used to create the noise modality from input RGB frames. Then the correlation between these two complementary modalities is explored by a cross-modality attentive fusion module to facilitate mutual feature learning. Lastly, spatial details are selectively enhanced by an attentive noise decoding module to boost the localization performance of the network. To prepare enough training samples, we also build a frame-level video object segmentation dataset of 2500 videos with pixel-level annotation for all frames. Extensive experimental results validate the superiority of TruVIL compared with the state-of-the-arts. In particular, both quantitative and qualitative evaluations on various inpainted videos verify the remarkable robustness and generalization ability of our proposed TruVIL. Code and dataset will be available at https://github.com/multimediaFor/TruVIL.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2406.13565 [pdf, other]

Exploring Multi-view Pixel Contrast for General and Robust Image Forgery Localization

Authors: Zijie Lou, Gang Cao, Kun Guo, Haochen Zhu, Lifang Yu

Abstract: Image forgery localization, which aims to segment tampered regions in an image, is a fundamental yet challenging digital forensic task. While some deep learning-based forensic methods have achieved impressive results, they directly learn pixel-to-label mappings without fully exploiting the relationship between pixels in the feature space. To address such deficiency, we propose a Multi-view Pixel-wise Contrastive algorithm (MPC) for image forgery localization. Specifically, we first pre-train the backbone network with the supervised contrastive loss to model pixel relationships from the perspectives of within-image, cross-scale and cross-modality. That is aimed at increasing intra-class compactness and inter-class separability. Then the localization head is fine-tuned using the cross-entropy loss, resulting in a better pixel localizer. The MPC is trained on three different scale training datasets to make a comprehensive and fair comparison with existing image forgery localization algorithms. Extensive experiments on the small, medium and large scale training datasets show that the proposed MPC achieves higher generalization performance and robustness against post-processing than the state-of-the-arts. Code will be available at https://github.com/multimediaFor/MPC.

Submitted 19 June, 2024; originally announced June 2024.

arXiv:2405.02911 [pdf, other]

Multimodal Sense-Informed Prediction of 3D Human Motions

Authors: Zhenyu Lou, Qiongjie Cui, Haofan Wang, Xu Tang, Hong Zhou

Abstract: Predicting future human pose is a fundamental application for machine intelligence, which drives robots to plan their behavior and paths ahead of time to seamlessly accomplish human-robot collaboration in real-world 3D scenarios. Despite encouraging results, existing approaches rarely consider the effects of the external scene on the motion sequence, leading to pronounced artifacts and physical implausibilities in the predictions. To address this limitation, this work introduces a novel multi-modal sense-informed motion prediction approach, which conditions high-fidelity generation on two modal information: external 3D scene, and internal human gaze, and is able to recognize their salience for future human activity. Furthermore, the gaze information is regarded as the human intention, and combined with both motion and scene features, we construct a ternary intention-aware attention to supervise the generation to match where the human wants to reach. Meanwhile, we introduce semantic coherence-aware attention to explicitly distinguish the salient point clouds and the underlying ones, to ensure a reasonable interaction of the generated sequence with the 3D scene. On two real-world benchmarks, the proposed method achieves state-of-the-art performance both in 3D human pose and trajectory prediction.

Submitted 5 May, 2024; originally announced May 2024.

arXiv:2404.04505 [pdf, other]

Exploring UAV Networking from the Terrain Information Completeness Perspective: A Tutorial

Authors: Zhengying Lou, Ruibo Wang, Baha Eddine Youcef Belmekki, Mustafa A. Kishk, Mohamed-Slim Alouini

Abstract: Terrain information is a crucial factor affecting the performance of unmanned aerial vehicle (UAV) networks. As a tutorial, this article provides a unique perspective on the completeness of terrain information, summarizing and enhancing the research on terrain-based UAV deployment. In the presence of complete terrain information, two highly discussed topics are UAV-aided map construction and dynamic trajectory design based on maps. We propose a case study illustrating the mutually reinforcing relationship between them. When terrain information is incomplete, and only terrain-related feature parameters are available, we discuss how existing models map terrain features to blockage probabilities. By introducing the application of this model with stochastic geometry, a case study is proposed to analyze the accuracy of the model. When no terrain information is available, UAVs gather terrain information during the real-time networking process and determine the next position by collected information. This real-time search method is currently limited to relay communication. In the case study, we extend it to a multi-user scenario and summarize three trade-offs of the method. Finally, we conduct a qualitative analysis to assess the impact of three factors that have been overlooked in terrain-based UAV deployment.

Submitted 9 April, 2024; v1 submitted 6 April, 2024; originally announced April 2024.

arXiv:2403.01686 [pdf, other]

doi 10.3847/2041-8213/ad319

AT2023lli: A Tidal Disruption Event with Prominent Optical Early Bump and Delayed Episodic X-ray Emission

Authors: Shifeng Huang, Ning Jiang, Jiazheng Zhu, Yibo Wang, Tinggui Wang, Shan-Qin Wang, Wen-Pei Gan, En-Wei Liang, Yu-Jing Qin, Zheyu Lin, Lin-Na Xu, Min-Xuan Cai, Ji-An Jiang, Xu Kong, Jiaxun Li, Long Li, Jian-Guo Wang, Ze-Lin Xu, Yongquan Xue, Ye-Fei Yuan, Jingquan Cheng, Lulu Fan, Jie Gao, Lei Hu, Weida Hu , et al. (20 additional authors not shown)

Abstract: High-cadence, multiwavelength observations have continuously revealed the diversity of tidal disruption events (TDEs), thus greatly advancing our knowledge and understanding of TDEs. In this work, we conducted an intensive optical-UV and X-ray follow-up campaign of TDE AT2023lli, and found a remarkable month-long bump in its UV/optical light curve nearly two months prior to maximum brightness. The bump represents the longest separation time from the main peak among known TDEs to date. The main UV/optical outburst declines as $t^{-4.10}$, making it one of the fastest decaying optically selected TDEs. Furthermore, we detected sporadic X-ray emission 30 days after the UV/optical peak, accompanied by a reduction in the period of inactivity. It is proposed that the UV/optical bump could be caused by the self-intersection of the stream debris, whereas the primary peak is generated by the reprocessed emission of the accretion process. In addition, our results suggest that episodic X-ray radiation during the initial phase of decline may be due to the patched obscurer surrounding the accretion disk, a phenomenon associated with the inhomogeneous reprocessing process. The double TDE scenario, in which two stars are disrupted in sequence, is also a possible explanation for producing the observed early bump and main peak. We anticipate that the multicolor light curves of TDEs, especially in the very early stages, and the underlying physics can be better understood in the near future with the assistance of dedicated surveys such as the deep high-cadence survey of the 2.5-meter Wide Field Survey Telescope (WFST).

Submitted 26 March, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

Comments: 14 pages, 8 figures, accepted for publication by ApJL

arXiv:2402.16121 [pdf, other]

Towards Accurate Post-training Quantization for Reparameterized Models

Authors: Luoming Zhang, Yefei He, Wen Fei, Zhenyu Lou, Weijia Wu, YangWei Ying, Hong Zhou

Abstract: Model reparameterization is a widely accepted technique for improving inference speed without compromising performance. However, current Post-training Quantization (PTQ) methods often lead to significant accuracy degradation when applied to reparameterized models. This is primarily caused by channel-specific and sample-specific outliers, which appear only at specific samples and channels and affect the selection of quantization parameters. To address this issue, we propose RepAPQ, a novel framework that preserves the accuracy of quantized reparameterization models. Different from previous frameworks using Mean Squared Error (MSE) as a measurement, we utilize Mean Absolute Error (MAE) to mitigate the influence of outliers on quantization parameters. Our framework comprises two main components: Quantization Protecting Reparameterization and Across-block Calibration. For effective calibration, Quantization Protecting Reparameterization combines multiple branches into a single convolution with an affine layer. During training, the affine layer accelerates convergence and amplifies the output of the convolution to better accommodate samples with outliers. Additionally, Across-block Calibration leverages the measurement of stage output as supervision to address the gradient problem introduced by MAE and enhance the interlayer correlation with quantization parameters. Comprehensive experiments demonstrate the effectiveness of RepAPQ across various models and tasks. Our framework outperforms previous methods by approximately 1% for 8-bit PTQ and 2% for 6-bit PTQ, showcasing its superior performance. The code is available at https://github.com/ilur98/DLMC-QUANT.

Submitted 25 February, 2024; originally announced February 2024.

arXiv:2402.10186 [pdf, other]

Self-consistent Validation for Machine Learning Electronic Structure

Authors: Gengyuan Hu, Gengchen Wei, Zekun Lou, Philip H. S. Torr, Wanli Ouyang, Han-sen Zhong, Chen Lin

Abstract: Machine learning has emerged as a significant approach to efficiently tackle electronic structure problems. Despite its potential, there is little guarantee that a model will generalize to unseen data, which hinders its application in real-world scenarios. To address this issue, a technique has been proposed to estimate the accuracy of the predictions. This method integrates machine learning with self-consistent field methods to achieve both low validation cost and interpretability. This, in turn, enables exploration of the model's ability with active learning and instills confidence in its integration into real-world studies.

Submitted 15 February, 2024; originally announced February 2024.

Comments: 6 pages, 4 figures

arXiv:2401.16040 [pdf, other]

$L_x^p\rightarrow L^q_{x,u}$ estimates for dilated averages over planar curves

Authors: Junfeng Li, Naijia Liu, Zengjian Lou, Haixia Yu

Abstract: In this paper, we consider the $L_x^p(\mathbb{R}^2)\rightarrow L_{x,u}^q(\mathbb{R}^2\times [1,2])$ estimate for the operator $T$ along a dilated plane curve $(ut,u\gamma(t))$, where $$Tf(x,u):=\int_{0}^{1}f(x_1-ut,x_2-u\gamma(t))\,\textrm{d}t,$$ $x:=(x_1,x_2)$ and $\gamma$ is a general plane curve satisfying some suitable smoothness and curvature conditions. We show that $T$ is $L_x^p(\mathbb{R}^2)$ to $L_{x,u}^q(\mathbb{R}^2\times [1,2])$ bounded whenever $(\frac{1}{p},\frac{1}{q})\in \square \cup \{(0,0)\}\cup \{(\frac{2}{3},\frac{1}{3})\}$ and $1+(1+\omega)(\frac{1}{q}-\frac{1}{p})>0$, where the trapezium $\square:=\{(\frac{1}{p},\frac{1}{q}):\ \frac{2}{p}-1\leq\frac{1}{q}\leq \frac{1}{p}, \frac{1}{q}>\frac{1}{3p}, \frac{1}{q}>\frac{1}{p}-\frac{1}{3}\}$ and $\omega:=\limsup_{t\rightarrow 0^{+}}\frac{\ln|\gamma(t)|}{\ln t}$. This result is sharp except for some borderline cases. On the other hand, in a smaller $(\frac{1}{p},\frac{1}{q})$ region, we also obtain the almost sharp estimate $T : L_x^p(\mathbb{R}^2)\rightarrow L_{x}^q(\mathbb{R}^2)$ uniformly for $u\in [1,2]$. These results imply that the operator $T$ has the so-called local smoothing phenomenon, i.e., the $L^q$ integral about $u$ on $[1,2]$ extends the region of $(\frac{1}{p},\frac{1}{q})$ in uniform estimate $T : L_x^p(\mathbb{R}^2)\rightarrow L_{x}^q(\mathbb{R}^2)$.

Submitted 29 January, 2024; originally announced January 2024.

arXiv:2401.09346 [pdf, other]

High Confidence Level Inference is Almost Free using Parallel Stochastic Optimization

Authors: Wanrong Zhu, Zhipeng Lou, Ziyang Wei, Wei Biao Wu

Abstract: Uncertainty quantification for estimation through stochastic optimization solutions in an online setting has gained popularity recently. This paper introduces a novel inference method focused on constructing confidence intervals with efficient computation and fast convergence to the nominal level. Specifically, we propose to use a small number of independent multi-runs to acquire distribution information and construct a t-based confidence interval. Our method requires minimal additional computation and memory beyond the standard updating of estimates, making the inference process almost cost-free. We provide a rigorous theoretical guarantee for the confidence interval, demonstrating that the coverage is approximately exact with an explicit convergence rate and allowing for high confidence level inference. In particular, a new Gaussian approximation result is developed for the online estimators to characterize the coverage properties of our confidence intervals in terms of relative errors. Additionally, our method also allows for leveraging parallel computing to further accelerate calculations using multiple cores. It is easy to implement and can be integrated with existing stochastic algorithms without the need for complicated modifications.

Submitted 17 January, 2024; originally announced January 2024.

arXiv:2312.02468 [pdf, other]

Terrain-Based UAV Deployment: Providing Coverage for Outdoor Users

Authors: Zhengying Lou, Ruibo Wang, Baha Eddine Youcef Belmekki, Mustafa A. Kishk, Mohamed-Slim Alouini

Abstract: Deploying unmanned aerial vehicle (UAV) networks to provide coverage for outdoor users has attracted great attention during the last decade. However, outdoor coverage is challenging due to the high mobility of crowds and the diverse terrain configurations causing building blockage. Most studies use stochastic channel models to characterize the impact of building blockage on user performance and do not take into account terrain information. On the other hand, real-time search methods use terrain information, but they are only practical when a single UAV serves a single user. In this paper, we put forward two methods to avoid building blockage in a multi-user system by collecting prior terrain information and using real-time search. We proposed four algorithms related to the combinations of the above methods and their performances are evaluated and compared in different scenarios. By adjusting the height of the UAV based on terrain information collected before networking, the performance is significantly enhanced compared to the one when no terrain information is available. The algorithm based on real-time search further improves the coverage performance by avoiding the shadow of buildings. During the execution of the real-time search algorithm, the search distance is reduced using the collected terrain information.

Submitted 6 December, 2023; v1 submitted 4 December, 2023; originally announced December 2023.

arXiv:2312.01938 [pdf, other]

DSText V2: A Comprehensive Video Text Spotting Dataset for Dense and Small Text

Authors: Weijia Wu, Yiming Zhang, Yefei He, Luoming Zhang, Zhenyu Lou, Hong Zhou, Xiang Bai

Abstract: Recently, video text detection, tracking, and recognition in natural scenes are becoming very popular in the computer vision community. However, most existing algorithms and benchmarks focus on common text cases (e.g., normal size, density) and single scenario, while ignoring extreme video text challenges, i.e., dense and small text in various scenarios. In this paper, we establish a video text reading benchmark, named DSText V2, which focuses on Dense and Small text reading challenges in the video with various scenarios. Compared with the previous datasets, the proposed dataset mainly includes three new challenges: 1) Dense video texts, a new challenge for video text spotters to track and read. 2) High-proportioned small texts, coupled with the blurriness and distortion in the video, will bring further challenges. 3) Various new scenarios, e.g., Game, Sports, etc. The proposed DSText V2 includes 140 video clips from 7 open scenarios, supporting three tasks, i.e., video text detection (Task 1), video text tracking (Task 2), and end-to-end video text spotting (Task 3). In this article, we describe detailed statistical information of the dataset, tasks, evaluation protocols, and the results summaries. Most importantly, a thorough investigation and analysis targeting three unique challenges derived from our dataset are provided, aiming to provide new insights. Moreover, we hope the benchmark will promote video text research in the community. DSText V2 is built upon DSText V1, which was previously introduced to organize the ICDAR 2023 competition for dense and small video text.

Submitted 29 November, 2023; originally announced December 2023.

Comments: arXiv admin note: text overlap with arXiv:2304.04376

Journal ref: Pattern Recognition 2023/2024

arXiv:2310.09659 [pdf, other]

HAPS in the Non-Terrestrial Network Nexus: Prospective Architectures and Performance Insights

Authors: Zhengying Lou, Baha Eddine Youcef Belmekki, Mohamed-Slim Alouini

Abstract: High altitude platform stations (HAPS) have recently emerged as a new key stratospheric player in non-terrestrial networks (NTN) alongside satellites and low-altitude platforms. In this paper, we present the main communication links between HAPS and other NTN platforms, their advantages, and their challenges. Then, prospective network architectures in which HAPS plays an indispensable role in the future NTNs are presented such as ad-hoc, cell-free, and integrated access and backhaul. To showcase the importance of HAPS in the NTN, we provide comprehensive performance insights when using HAPS in the prospective architectures with the most suitable communication link. The insights show the HAPS' ability to interconnect the NTN nexus as well as their versatility by incorporating different metrics into the analysis such as routing latency, energy efficiency, coverage probability, and channel capacity. Depending on the architecture, HAPS will play different roles in NTN, such as a UAV network center, satellite relay, and ground network extension. Finally, the performance gain provided by HAPS usage in NTN is further highlighted by comparing the results when no HAPS are used.

Submitted 14 October, 2023; originally announced October 2023.

arXiv:2310.04836 [pdf, other]

Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM

Authors: Luoming Zhang, Wen Fei, Weijia Wu, Yefei He, Zhenyu Lou, Hong Zhou

Abstract: Large Language Models (LLMs) pose significant hardware challenges related to memory requirements and computational ability. There are two mainstream quantization schemes for LLMs: coarse-grained (e.g., channel-wise) quantization and fine-grained (e.g., group-wise) quantization. Fine-grained quantization has smaller quantization loss, consequently achieving superior performance. However, when applied to weight-activation quantization, it disrupts continuous integer matrix multiplication, leading to inefficient inference. In this paper, we introduce Dual Grained Quantization (DGQ), a novel A8W4 quantization for LLM that maintains superior performance while ensuring fast inference speed. DGQ dequantizes the fine-grained INT4 weight into coarse-grained INT8 representation and performs matrix multiplication using INT8 kernels. Besides, we develop a two-phase grid search algorithm to simplify the determination of fine-grained and coarse-grained quantization scales. We also devise a percentile clipping schema for smoothing the activation outliers without the need for complex optimization techniques. Experimental results demonstrate that DGQ consistently outperforms prior methods across various LLM architectures and a wide range of tasks. Remarkably, with our efficient CUTLASS kernel implementation, we achieve $1.12\times$ memory reduction and $3.24\times$ speed gains compared with the A16W4 implementation. These advancements enable efficient deployment of A8W4 LLMs for real-world applications.

Submitted 7 October, 2023; originally announced October 2023.

Comments: 15 pages, 2 figures

arXiv:2309.10243 [pdf]

Transferable Adversarial Attack on Image Tampering Localization

Authors: Yuqi Wang, Gang Cao, Zijie Lou, Haochen Zhu

Abstract: It is significant to evaluate the security of existing digital image tampering localization algorithms in real-world applications. In this paper, we propose an adversarial attack scheme to reveal the reliability of such tampering localizers, which would be fooled and fail to predict altered regions correctly. Specifically, the adversarial examples based on optimization and gradient are implemented for white/black-box attacks. Correspondingly, the adversarial example is optimized via reverse gradient propagation, and the perturbation is added adaptively in the direction of gradient rising. The black-box attack is achieved by relying on the transferability of such adversarial examples to different localizers. Extensive evaluations verify that the proposed attack sharply reduces the localization accuracy while preserving high visual quality of the attacked images.

Submitted 18 September, 2023; originally announced September 2023.

arXiv:2309.09482 [pdf, other]

Spatio-temporal Co-attention Fusion Network for Video Splicing Localization

Authors: Man Lin, Gang Cao, Zijie Lou

Abstract: Digital video splicing has become easy and ubiquitous. Malicious users copy some regions of a video and paste them to another video for creating realistic forgeries. It is significant to blindly detect such forgery regions in videos. In this paper, a spatio-temporal co-attention fusion network (SCFNet) is proposed for video splicing localization. Specifically, a three-stream network is used as an encoder to capture manipulation traces across multiple frames. The deep interaction and fusion of spatio-temporal forensic features are achieved by the novel parallel and cross co-attention fusion modules. A lightweight multilayer perceptron (MLP) decoder is adopted to yield a pixel-level tampering localization map. A new large-scale video splicing dataset is created for training the SCFNet. Extensive tests on benchmark datasets show that the localization and generalization performances of our SCFNet outperform the state-of-the-art. Code and datasets will be available at https://github.com/multimediaFor/SCFNet.

Submitted 18 September, 2023; originally announced September 2023.

arXiv:2309.00264 [pdf, other]

doi 10.1038/s41586-024-07431-y

Superconducting diode effect and interference patterns in Kagome CsV3Sb5

Authors: Tian Le, Zhiming Pan, Zhuokai Xu, Jinjin Liu, Jialu Wang, Zhefeng Lou, Xiaohui Yang, Zhiwei Wang, Yugui Yao, Congjun Wu, Xiao Lin

Abstract: The interplay among frustrated lattice geometry, nontrivial band topology and correlation yields rich quantum states of matter in Kagome systems. A series of recent members in this family, AV3Sb5 (A= K, Rb, Cs), exhibit a cascade of symmetry-breaking transitions, involving the 3Q chiral charge ordering, electronic nematicity, roton pair-density-wave and superconductivity. The nature of the superconducting order is yet to be resolved. Here, we report an indication of chiral superconducting domains with boundary supercurrents in intrinsic CsV3Sb5 flakes. Magnetic field-free superconducting diode effect is observed with polarity modulated by thermal histories, suggesting dynamical superconducting order domains in a spontaneous time-reversal symmetry breaking background. Strikingly, the critical current exhibits the double-slit superconducting interference patterns when subjected to an external magnetic field. Characteristics of the patterns are modulated by thermal cycling. These phenomena are proposed as a consequence of periodically modulated supercurrents flowing along certain domain boundaries constrained by fluxoid quantization. Our results imply a chiral superconducting order, opening a potential for exploring exotic physics, e.g. Majorana zero modes, in this intriguing topological Kagome system.

Submitted 15 May, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

Comments: 17 pages, 13 figures

Journal ref: Nature (2024)

arXiv:2308.09034 [pdf]

doi 10.1007/s11433-024-2457-7

Annealing-induced long-range charge density wave order in magnetic kagome FeGe: fluctuations and disordered structure

Authors: Chenfei Shi, Yi Liu, Bishal Baran Maity, Qi Wang, Surya Rohith Kotla, Sitaram Ramakrishnan, Claudio Eisele, Harshit Agarwal, Leila Noohinejad, Qian Tao, Baojuan Kang, Zhefeng Lou, Xiaohui Yang, Yanpeng Qi, Xiao Lin, Zhu-An Xu, A. Thamizhavel, Guang-Han Cao, Sander van Smaalen, Shixun Cao, Jin-Ke Bao

Abstract: Charge density wave (CDW) in kagome materials with the geometric frustration is able to carry unconventional characteristics. Recently, a CDW has been observed below the antiferromagnetic order in kagome FeGe, in which magnetism and CDW are intertwined to form an emergent quantum ground state. However, the CDW is only short-ranged and the structural modulation originating from it has yet to be determined experimentally. Here we realize a long-range CDW order by post-annealing process, and resolve the structure model through single crystal x-ray diffraction. Occupational disorder of Ge resulting from short-range CDW correlations above $T_\mathrm{CDW}$ is identified from structure refinements. The partial dimerization of Ge along the $c$ axis is unveiled to be the dominant distortion for the CDW. Occupational disorder of Ge is also proved to exist in the CDW phase due to the random selection of partially dimerized Ge sites. Our work provides useful insights for understanding the unconventional nature of the CDW in FeGe.

Submitted 23 July, 2024; v1 submitted 17 August, 2023; originally announced August 2023.

Comments: 4 figures and 1 table in the main text. Supplementary materials included. To be published in SCIENCE CHINA Physics, Mechanics & Astronomy (SCPMA)

Journal ref: Sci. China Phys. Mech. Astron. 67, 117012 (2024)

arXiv:2308.05872 [pdf, other]

Vision Backbone Enhancement via Multi-Stage Cross-Scale Attention

Authors: Liang Shang, Yanli Liu, Zhengyang Lou, Shuxue Quan, Nagesh Adluru, Bochen Guan, William A. Sethares

Abstract: Convolutional neural networks (CNNs) and vision transformers (ViTs) have achieved remarkable success in various vision tasks. However, many architectures do not consider interactions between feature maps from different stages and scales, which may limit their performance. In this work, we propose a simple add-on attention module to overcome these limitations via multi-stage and cross-scale interactions. Specifically, the proposed Multi-Stage Cross-Scale Attention (MSCSA) module takes feature maps from different stages to enable multi-stage interactions and achieves cross-scale interactions by computing self-attention at different scales based on the multi-stage feature maps. Our experiments on several downstream tasks show that MSCSA provides a significant performance boost with modest additional FLOPs and runtime.

Submitted 14 August, 2023; v1 submitted 10 August, 2023; originally announced August 2023.

arXiv:2308.02918 [pdf, other]

Spectral Ranking Inferences based on General Multiway Comparisons

Authors: Jianqing Fan, Zhipeng Lou, Weichen Wang, Mengxin Yu

Abstract: This paper studies the performance of the spectral method in the estimation and uncertainty quantification of the unobserved preference scores of compared entities in a general and more realistic setup. Specifically, the comparison graph consists of hyper-edges of possible heterogeneous sizes, and the number of comparisons can be as low as one for a given hyper-edge. Such a setting is pervasive in real applications, circumventing the need to specify the graph randomness and the restrictive homogeneous sampling assumption imposed in the commonly used Bradley-Terry-Luce (BTL) or Plackett-Luce (PL) models. Furthermore, in scenarios where the BTL or PL models are appropriate, we unravel the relationship between the spectral estimator and the Maximum Likelihood Estimator (MLE). We discover that a two-step spectral method, where we apply the optimal weighting estimated from the equal weighting vanilla spectral method, can achieve the same asymptotic efficiency as the MLE. Given the asymptotic distributions of the estimated preference scores, we also introduce a comprehensive framework to carry out both one-sample and two-sample ranking inferences, applicable to both fixed and random graph settings. It is noteworthy that this is the first time effective two-sample rank testing methods have been proposed. Finally, we substantiate our findings via comprehensive numerical simulations and subsequently apply our developed methodologies to perform statistical inferences for statistical journals and movie rankings.

Submitted 1 March, 2024; v1 submitted 5 August, 2023; originally announced August 2023.

Comments: 62 pages, 4 figures

arXiv:2307.15605 [pdf, ps, other]

Disproof of a conjecture on the minimum spectral radius and the domination number

Authors: Yarong Hu, Zhenzhen Lou, Qiongxiang Huang

Abstract: Let $G_{n,γ}$ be the set of all connected graphs on $n$ vertices with domination number $γ$. A graph is called a minimizer graph if it attains the minimum spectral radius among $G_{n,γ}$. Very recently, Liu, Li and Xie [Linear Algebra and its Applications 673 (2023) 233--258] proved that the minimizer graph over all graphs in $\mathbb{G}_{n,γ}$ must be a tree. Moreover, they determined the minimizer graph among $G_{n,\lfloor\frac{n}{2}\rfloor}$ for even $n$, and posed the conjecture on the minimizer graph among $G_{n,\lfloor\frac{n}{2}\rfloor}$ for odd $n$. In this paper, we disprove the conjecture and completely determine the unique minimizer graph among $G_{n,\lfloor\frac{n}{2}\rfloor}$ for odd $n$.

Submitted 28 July, 2023; originally announced July 2023.

arXiv:2306.07590 [pdf, other]

doi 10.1007/s11433-023-2197-5

Sciences with the 2.5-meter Wide Field Survey Telescope (WFST)

Authors: WFST Collaboration, Tinggui Wang, Guilin Liu, Zhenyi Cai, Jinjun Geng, Min Fang, Haoning He, Ji-an Jiang, Ning Jiang, Xu Kong, Bin Li, Ye Li, Wentao Luo, Zhizheng Pan, Xuefeng Wu, Ji Yang, Jiming Yu, Xianzhong Zheng, Qingfeng Zhu, Yi-Fu Cai, Yuanyuan Chen, Zhiwei Chen, Zigao Dai, Lulu Fan, Yizhong Fan , et al. (38 additional authors not shown)

Abstract: The Wide Field Survey Telescope (WFST) is a dedicated photometric surveying facility being built jointly by the University of Science and Technology of China and the Purple Mountain Observatory. It is equipped with a 2.5-meter diameter primary mirror, an active optics system, and a mosaic CCD camera with 0.73 gigapixels on the primary focal plane for high-quality image capture over an FOV of 6.5-square-degree. It is anticipated that WFST will be set up at the Lenghu site in the summer of 2023 and begin to observe the northern sky in four optical bands (u, g, r, and i) with a range of cadences, from hourly/daily in the Deep High-Cadence Survey (DHS) program to semiweekly in the Wide-Field Survey (WFS) program, three months later. During a photometric night, a nominal 30 s exposure in the WFS program will reach a depth of 22.27, 23.32, 22.84, and 22.31 (AB magnitudes) in these four bands, respectively, allowing for the detection of a tremendous amount of transients in the low-z universe and a systematic investigation of the variability of Galactic and extragalactic objects. In the DHS program, intranight 90 s exposures as deep as 23 (u) and 24 mag (g), in combination with target of opportunity follow-ups, will provide a unique opportunity to explore energetic transients in demand for high sensitivities, including the electromagnetic counterparts of gravitational wave events, supernovae within a few hours of their explosions, tidal disruption events and fast, luminous optical transients even beyond a redshift of unity.
In addition, the final 6-year co-added images, anticipated to reach g=25.8 mag in WFS or 1.5 mags deeper in DHS, will be of fundamental importance to general Galactic and extragalactic science. The highly uniform legacy surveys of WFST will serve as an indispensable complement to those of LSST that monitor the southern sky.

Submitted 14 September, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

Comments: 48 pages

Journal ref: SCPMA-Vol. 66 No. 10: 109512 (2023)

arXiv:2305.16481 [pdf, other]

SimHaze: game engine simulated data for real-world dehazing

Authors: Zhengyang Lou, Huan Xu, Fangzhou Mu, Yanli Liu, Xiaoyu Zhang, Liang Shang, Jiang Li, Bochen Guan, Yin Li, Yu Hen Hu

Abstract: Deep models have demonstrated recent success in single-image dehazing. Most prior methods consider fully supervised training and learn from paired clean and hazy images, where a hazy image is synthesized based on a clean image and its estimated depth map. This paradigm, however, can produce low-quality hazy images due to inaccurate depth estimation, resulting in poor generalization of the trained models. In this paper, we explore an alternative approach for generating paired clean-hazy images by leveraging computer graphics. Using a modern game engine, our approach renders crisp clean images and their precise depth maps, based on which high-quality hazy images can be synthesized for training dehazing models. To this end, we present SimHaze: a new synthetic haze dataset. More importantly, we show that training with SimHaze alone allows the latest dehazing models to achieve significantly better performance in comparison to previous dehazing datasets. Our dataset and code will be made publicly available.

Submitted 25 May, 2023; originally announced May 2023.

Comments: Submitted to ICIP 2023

arXiv:2303.10021 [pdf, other]

Coverage Analysis of Hybrid RF/THz Networks With Best Relay Selection

Authors: Zhengying Lou, Baha Eddine Youcef Belmekki, Mohamed-Slim Alouini

Abstract: Utilizing terahertz (THz) transmission to enhance coverage has proven various benefits compared to traditional radio frequency (RF) counterparts. This letter proposes a dual-hop decode-and-forward (DF) routing protocol in a hybrid RF and THz relay network named hybrid relay selection (HRS). The coverage probability of the HRS protocol is derived. The HRS protocol prioritizes THz relays for higher data rates or short source-destination distances; and RF relays for lower data rates or large source-destination distances. The proposed HRS protocol offers nearly the same performance as the optimal selection protocol, which requires complete instantaneous channel state information (CSI) of all the nodes.

Submitted 17 March, 2023; originally announced March 2023.

arXiv:2303.02882 [pdf, other]

doi 10.1002/adma.202300450

Achieving ferroelectricity in a centrosymmetric high-performance semiconductor by strain engineering

Authors: Mengqi Wu, Zhefeng Lou, Chen-Min Dai, Tao Wang, Jiaqi Wang, Ziye Zhu, Zhuokai Xu, Tulai Sun, Wenbin Li, Xiaorui Zheng, Xiao Lin

Abstract: Phase engineering by strains in 2D semiconductors is of great importance for a variety of applications. Here, we present a study of strain induced ferroelectric (FE) transition on bismuth oxyselenide (Bi$_2$O$_2$Se) films, a high-performance (HP) semiconductor for next-generation electronics. Bi$_2$O$_2$Se is non-FE at ambient. Upon a loading force $\gtrsim 400$ nN, piezoelectric force responses exhibit butterfly loops on magnitude and 180$^\textrm{o}$ phase switching. By carefully ruling out extrinsic factors, these features are attributed to a transition to FE phase. The transition is further proved by the appearance of a sharp peak on optical second harmonic generation under a uniaxial strain. Fundamentally, solids that are paraelectric at ambient and FE under strains are scarce. FE transition is discussed with the help of first-principle calculations and theoretical simulations. The switching of FE polarization acts as a knob for Schottky barrier engineering at contacts and serves as basis for a memristor with a huge switching ratio of 10$^6$. Our work endows a new degree of freedom to an HP electronic/optoelectronic semiconductor, and the integration of FE and HP semiconductivity paves the way for multiple exciting functionalities, including HP neuromorphic computation and bulk piezophotovoltaic.

Submitted 5 March, 2023; originally announced March 2023.

Comments: 12 pages, 5 figures

arXiv:2302.13287 [pdf, ps, other]

Reducibility of linear quasi-periodic Hamiltonian derivative wave equations and half-wave equations under the Brjuno conditions

Authors: Zhaowei Lou

Abstract: In this paper, we prove the reducibility for some linear quasi-periodic Hamiltonian derivative wave and half-wave equations under the Brjuno-Rüssmann non-resonance conditions. This generalizes KAM theory by Pöschel in [38] from the finite dimensional Hamiltonian systems to Hamiltonian PDEs.

Submitted 26 February, 2023; originally announced February 2023.

MSC Class: 37K55; 35L05; 35Q55

arXiv:2302.12111 [pdf, other]

Communication-Efficient Distributed Estimation and Inference for Cox's Model

Authors: Pierre Bayle, Jianqing Fan, Zhipeng Lou

Abstract: Motivated by multi-center biomedical studies that cannot share individual data due to privacy and ownership concerns, we develop communication-efficient iterative distributed algorithms for estimation and inference in the high-dimensional sparse Cox proportional hazards model. We demonstrate that our estimator, even with a relatively small number of iterations, achieves the same convergence rate as the ideal full-sample estimator under very mild conditions. To construct confidence intervals for linear combinations of high-dimensional hazard regression coefficients, we introduce a novel debiased method, establish central limit theorems, and provide consistent variance estimators that yield asymptotically valid distributed confidence intervals. In addition, we provide valid and powerful distributed hypothesis tests for any coordinate element based on a decorrelated score test. We allow time-dependent covariates as well as censored survival times. Extensive numerical experiments on both simulated and real data lend further support to our theory and demonstrate that our communication-efficient distributed estimators, confidence intervals, and hypothesis tests improve upon alternative methods.

Submitted 23 June, 2024; v1 submitted 23 February, 2023; originally announced February 2023.

arXiv:2302.11721 [pdf, other]

doi 10.1038/s41535-023-00579-2

Clues to potential dipolar-Kondo and RKKY interactions in a polar metal

Authors: Xiaohui Yang, Wanghua Hu, Jialu Wang, Zhuokai Xu, Tao Wang, Zhefeng Lou, Xiao Lin

Abstract: The coexistence of electric dipoles and itinerant electrons in a solid was postulated decades ago, before being experimentally established in several 'polar metals' during the last decade. Here, we report a concentration-driven polar-to-nonpolar phase transition in electron-doped BaTiO_3. Comparing our case with other polar metals, we find a particular threshold concentration (n*) linked to the dipole density (n_d). The universal ratio n_d/n*=8(0.6) suggests a common mechanism across different polar systems, possibly explained by a dipolar Ruderman-Kittel-Kasuya-Yosida theory. Moreover, in BaTiO_3, we observe enhanced thermopower and upturn on resistivity at low temperatures near n*, resembling the Kondo effect. We argue that local electric dipoles act as two-level-systems, whose fluctuations couple with surrounding electron clouds, giving rise to a potential dipolar-counterpart of the Kondo effect. Our findings unveil a mostly uncharted territory for exploring emerging physics associated with electron-dipole correlations, encouraging further theoretical work on dipolar-RKKY and Kondo interactions.

Submitted 10 October, 2023; v1 submitted 22 February, 2023; originally announced February 2023.

Comments: 8 pages, 4 figures

arXiv:2302.11090 [pdf, ps, other]

Duality for $α$-Möbius invariant Besov spaces

Authors: Guanlong Bao, Zengjian Lou, Xiaojing Zhou

Abstract: For $1\leq p\leq \infty$ and $α>0$, Besov spaces $B^p_α$ play a key role in the theory of $α$-Möbius invariant function spaces. In some sense, $B^1_α$ is the minimal $α$-Möbius invariant function space, $B^2_α$ is the unique $α$-Möbius invariant Hilbert space, and $B^\infty_α$ is the maximal $α$-Möbius invariant function space. In this paper, under the $α$-Möbius invariant pairing and by the space $B^\infty_α$, we identify the predual and dual spaces of $B^1_α$. In particular, the corresponding identifications are isometric isomorphisms. The duality theorem via the $α$-Möbius invariant pairing for $B^p_α$ with $p>1$ is also given.

Submitted 21 February, 2023; originally announced February 2023.

arXiv:2301.04209 [pdf, other]

High Dimensional Analysis of Variance in Multivariate Linear Regression

Authors: Zhipeng Lou, Xianyang Zhang, Wei Biao Wu

Abstract: In this paper, we develop a systematic theory for high dimensional analysis of variance in multivariate linear regression, where the dimension and the number of coefficients can both grow with the sample size. We propose a new \emph{U}~type test statistic to test linear hypotheses and establish a high dimensional Gaussian approximation result under fairly mild moment assumptions. Our general framework and theory can be applied to deal with the classical one-way multivariate ANOVA and the nonparametric one-way MANOVA in high dimensions. To implement the test procedure in practice, we introduce a sample-splitting based estimator of the second moment of the error covariance and discuss its properties. A simulation study shows that our proposed test outperforms some existing tests in various settings.

Submitted 10 January, 2023; originally announced January 2023.

arXiv:2212.06524 [pdf, other]

SST: Real-time End-to-end Monocular 3D Reconstruction via Sparse Spatial-Temporal Guidance

Authors: Chenyangguang Zhang, Zhiqiang Lou, Yan Di, Federico Tombari, Xiangyang Ji

Abstract: Real-time monocular 3D reconstruction is a challenging problem that remains unsolved. Although recent end-to-end methods have demonstrated promising results, tiny structures and geometric boundaries are hardly captured due to their insufficient supervision neglecting spatial details and oversimplified feature fusion ignoring temporal cues. To address the problems, we propose an end-to-end 3D reconstruction network SST, which utilizes Sparse estimated points from visual SLAM system as additional Spatial guidance and fuses Temporal features via a novel cross-modal attention mechanism, achieving more detailed reconstruction results. We propose a Local Spatial-Temporal Fusion module to exploit more informative spatial-temporal cues from multi-view color information and sparse priors, as well as a Global Spatial-Temporal Fusion module to refine the local TSDF volumes with the world-frame model from coarse to fine. Extensive experiments on ScanNet and 7-Scenes demonstrate that SST outperforms all state-of-the-art competitors, whilst keeping a high inference speed at 59 FPS, enabling real-world applications with real-time requirements.

Submitted 24 July, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

Comments: ICME 2023 (oral)

Report number: camera ready for ICME 2023

arXiv:2211.11959 [pdf, ps, other]

Robust High-dimensional Tuning Free Multiple Testing

Authors: Jianqing Fan, Zhipeng Lou, Mengxin Yu

Abstract: A stylized feature of high-dimensional data is that many variables have heavy tails, and robust statistical inference is critical for valid large-scale statistical inference. Yet, the existing developments such as Winsorization, Huberization and median of means require the bounded second moments and involve variable-dependent tuning parameters, which hamper their fidelity in applications to large-scale problems. To liberate these constraints, this paper revisits the celebrated Hodges-Lehmann (HL) estimator for estimating location parameters in both the one- and two-sample problems, from a non-asymptotic perspective. Our study develops Berry-Esseen inequality and Cramér type moderate deviation for the HL estimator based on newly developed non-asymptotic Bahadur representation, and builds data-driven confidence intervals via a weighted bootstrap approach. These results allow us to extend the HL estimator to large-scale studies and propose \emph{tuning-free} and \emph{moment-free} high-dimensional inference procedures for testing global null and for large-scale multiple testing with false discovery proportion control. It is convincingly shown that the resulting tuning-free and moment-free methods control false discovery proportion at a prescribed level. The simulation studies lend further support to our developed theory.

Submitted 23 November, 2022; v1 submitted 21 November, 2022; originally announced November 2022.

Comments: In this paper, we develop tuning-free and moment-free high dimensional inference procedures;

arXiv:2211.11957 [pdf, other]

Ranking Inferences Based on the Top Choice of Multiway Comparisons

Authors: Jianqing Fan, Zhipeng Lou, Weichen Wang, Mengxin Yu

Abstract: This paper considers ranking inference of $n$ items based on the observed data on the top choice among $M$ randomly selected items at each trial. This is a useful modification of the Plackett-Luce model for $M$-way ranking with only the top choice observed and is an extension of the celebrated Bradley-Terry-Luce model that corresponds to $M=2$. Under a uniform sampling scheme in which any $M$ distinguished items are selected for comparisons with probability $p$ and the selected $M$ items are compared $L$ times with multinomial outcomes, we establish the statistical rates of convergence for underlying $n$ preference scores using both $\ell_2$-norm and $\ell_\infty$-norm, with the minimum sampling complexity. In addition, we establish the asymptotic normality of the maximum likelihood estimator that allows us to construct confidence intervals for the underlying scores. Furthermore, we propose a novel inference framework for ranking items through a sophisticated maximum pairwise difference statistic whose distribution is estimated via a valid Gaussian multiplier bootstrap. The estimated distribution is then used to construct simultaneous confidence intervals for the differences in the preference scores and the ranks of individual items. They also enable us to address various inference questions on the ranks of these items. Extensive simulation studies lend further support to our theoretical results. A real data application illustrates the usefulness of the proposed methods convincingly.

Submitted 5 January, 2023; v1 submitted 21 November, 2022; originally announced November 2022.

Comments: In this paper, we build simultaneous confidence intervals for ranks through multiway comparisons

Showing 1–50 of 103 results for author: Lou, Z