-
Precision measurement of the branching fraction for the decay $ψ(2S)\rightarrowτ^{+}τ^{-}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai
, et al. (691 additional authors not shown)
Abstract:
Using $(2259.3 \pm 11.1)\times10^{6}$ $ψ(2S)$ events acquired with the BESIII detector, the branching fraction of $ψ(2S)\rightarrowτ^{+}τ^{-}$ is measured with improved precision to be $\mathcal{B}_{ψ(2S)\rightarrowτ^{+}τ^{-}}=(3.240~\pm~0.023~\pm~0.081)\times 10^{-3}$, where the first and second uncertainties are statistical and systematic, respectively, which is consistent with the world average…
▽ More
Using $(2259.3 \pm 11.1)\times10^{6}$ $ψ(2S)$ events acquired with the BESIII detector, the branching fraction of $ψ(2S)\rightarrowτ^{+}τ^{-}$ is measured with improved precision to be $\mathcal{B}_{ψ(2S)\rightarrowτ^{+}τ^{-}}=(3.240~\pm~0.023~\pm~0.081)\times 10^{-3}$, where the first and second uncertainties are statistical and systematic, respectively, which is consistent with the world average value within one standard deviation. This value, along with those for the branching fractions of the $ψ(2S)$ decaying into $e^{+}e^{-}$ and $μ^{+}μ^{-}$, is in good agreement with the relation predicted by the sequential lepton hypothesis. Combining the branching fraction values with the leptonic width of the $ψ(2S)$, the total width of the $ψ(2S)$ is determined to be (287 $\pm$ 9) keV.
△ Less
Submitted 27 February, 2025;
originally announced February 2025.
-
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference
Authors:
Xiangyu Zhao,
Shengyuan Ding,
Zicheng Zhang,
Haian Huang,
Maosong Cao,
Weiyun Wang,
Jiaqi Wang,
Xinyu Fang,
Wenhai Wang,
Guangtao Zhai,
Haodong Duan,
Hua Yang,
Kai Chen
Abstract:
Recent advancements in open-source multi-modal large language models (MLLMs) have primarily focused on enhancing foundational capabilities, leaving a significant gap in human preference alignment. This paper introduces OmniAlign-V, a comprehensive dataset of 200K high-quality training samples featuring diverse images, complex questions, and varied response formats to improve MLLMs' alignment with…
▽ More
Recent advancements in open-source multi-modal large language models (MLLMs) have primarily focused on enhancing foundational capabilities, leaving a significant gap in human preference alignment. This paper introduces OmniAlign-V, a comprehensive dataset of 200K high-quality training samples featuring diverse images, complex questions, and varied response formats to improve MLLMs' alignment with human preferences. We also present MM-AlignBench, a human-annotated benchmark specifically designed to evaluate MLLMs' alignment with human values. Experimental results show that finetuning MLLMs with OmniAlign-V, using Supervised Fine-Tuning (SFT) or Direct Preference Optimization (DPO), significantly enhances human preference alignment while maintaining or enhancing performance on standard VQA benchmarks, preserving their fundamental capabilities. Our datasets, benchmark, code and checkpoints have been released at https://github.com/PhoenixZ810/OmniAlign-V.
△ Less
Submitted 28 February, 2025; v1 submitted 25 February, 2025;
originally announced February 2025.
-
High quality superconducting tantalum resonators with beta phase defects
Authors:
Ritika Dhundhwal,
Haoran Duan,
Lucas Brauch,
Soroush Arabi,
Dirk Fuchs,
Amir-Abbas Haghighirad,
Alexander Welle,
Florentine Scharwaechter,
Sudip Pal,
Marc Scheffler,
José Palomo,
Zaki Leghtas,
Anil Murani,
Horst Hahn,
Jasmin Aghassi-Hagmann,
Christian Kübel,
Wulf Wulfhekel,
Ioan M. Pop,
Thomas Reisinger
Abstract:
For practical superconducting quantum processors, orders of magnitude improvement in coherence is required, motivating efforts to optimize hardware design and explore new materials. Among the latter, the coherence of superconducting transmon qubits has been shown to improve by forming the qubit capacitor pads from $α$-tantalum, avoiding the meta-stable $β$-phase that forms when depositing tantalum…
▽ More
For practical superconducting quantum processors, orders of magnitude improvement in coherence is required, motivating efforts to optimize hardware design and explore new materials. Among the latter, the coherence of superconducting transmon qubits has been shown to improve by forming the qubit capacitor pads from $α$-tantalum, avoiding the meta-stable $β$-phase that forms when depositing tantalum at room temperature, and has been previously identified to be a source of microwave losses. In this work, we show lumped element resonators containing $β$-phase tantalum in the form of inclusions near the metal-substrate interface with internal quality factors ($Q_\text{i}$) up to $(5.0 \pm 2.5) \times 10^6$ in the single photon regime. They outperform resonators with no sign of the $β$-phase in x-ray diffraction and thermal quasi-particle loss. Our results indicate that small concentrations of $β$-phase can be beneficial, enhancing critical magnetic fields and potentially, for improving coherence in tantalum based superconducting circuits.
△ Less
Submitted 24 February, 2025;
originally announced February 2025.
-
Multi-Dimensional Quality Assessment for Text-to-3D Assets: Dataset and Model
Authors:
Kang Fu,
Huiyu Duan,
Zicheng Zhang,
Xiaohong Liu,
Xiongkuo Min,
Jia Wang,
Guangtao Zhai
Abstract:
Recent advancements in text-to-image (T2I) generation have spurred the development of text-to-3D asset (T23DA) generation, leveraging pretrained 2D text-to-image diffusion models for text-to-3D asset synthesis. Despite the growing popularity of text-to-3D asset generation, its evaluation has not been well considered and studied. However, given the significant quality discrepancies among various te…
▽ More
Recent advancements in text-to-image (T2I) generation have spurred the development of text-to-3D asset (T23DA) generation, leveraging pretrained 2D text-to-image diffusion models for text-to-3D asset synthesis. Despite the growing popularity of text-to-3D asset generation, its evaluation has not been well considered and studied. However, given the significant quality discrepancies among various text-to-3D assets, there is a pressing need for quality assessment models aligned with human subjective judgments. To tackle this challenge, we conduct a comprehensive study to explore the T23DA quality assessment (T23DAQA) problem in this work from both subjective and objective perspectives. Given the absence of corresponding databases, we first establish the largest text-to-3D asset quality assessment database to date, termed the AIGC-T23DAQA database. This database encompasses 969 validated 3D assets generated from 170 prompts via 6 popular text-to-3D asset generation models, and corresponding subjective quality ratings for these assets from the perspectives of quality, authenticity, and text-asset correspondence, respectively. Subsequently, we establish a comprehensive benchmark based on the AIGC-T23DAQA database, and devise an effective T23DAQA model to evaluate the generated 3D assets from the aforementioned three perspectives, respectively.
△ Less
Submitted 24 February, 2025;
originally announced February 2025.
-
Single Inclusive $π^\pm$ and $K^\pm$ Production in $e^+e^-$ Annihilation at center-of-mass Energies from 2.000 to 3.671GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai
, et al. (707 additional authors not shown)
Abstract:
Using data samples with a total integrated luminosity of 253 $\rm pb^{-1}$ collected by the BESIII detector operating at the BEPCII collider, the differential cross-sections of inclusive $π^\pm$ and $K^\pm$ production, as a function of momentum and normalized by the total hadronic cross-section, are measured at center-of-mass energies from 2.000 to 3.671 GeV. The measured $π^{\pm}$ cross sections…
▽ More
Using data samples with a total integrated luminosity of 253 $\rm pb^{-1}$ collected by the BESIII detector operating at the BEPCII collider, the differential cross-sections of inclusive $π^\pm$ and $K^\pm$ production, as a function of momentum and normalized by the total hadronic cross-section, are measured at center-of-mass energies from 2.000 to 3.671 GeV. The measured $π^{\pm}$ cross sections are consistent with the previously reported $π^{0}$ cross-sections by BESIII, while the $K^{\pm}$ cross sections are systematically higher than the $K^0_S$ cross sections by a factor of approximately 1.4. These new results are in agreement with state-of-the-art QCD analyses at next-to-next-to-leading order accuracy, particularly in the large hadron momentum region at energy scales down to 3 GeV. These findings support the validity of isospin symmetry in parton fragmentation processes.
△ Less
Submitted 22 February, 2025;
originally announced February 2025.
-
Amplitude analysis of $ψ(3686)\to γK_S^0 K_S^0 $
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai
, et al. (704 additional authors not shown)
Abstract:
Using $(2712\pm14)\times10^6$ $ψ(3686)$ events collected with the BESIII detector, we perform the first amplitude analysis of the radiative decay $ψ(3686)\to γK_S^0 K_S^0$ within the mass region $M_{K_S^0 K_S^0 }<2.8$ GeV/$c^2$. Employing a one-channel K-matrix approach for the description of the dynamics of the $K^0_S K^0_S$ system, the data sample is well described with four poles for the $f_0$-…
▽ More
Using $(2712\pm14)\times10^6$ $ψ(3686)$ events collected with the BESIII detector, we perform the first amplitude analysis of the radiative decay $ψ(3686)\to γK_S^0 K_S^0$ within the mass region $M_{K_S^0 K_S^0 }<2.8$ GeV/$c^2$. Employing a one-channel K-matrix approach for the description of the dynamics of the $K^0_S K^0_S$ system, the data sample is well described with four poles for the $f_0$-wave and three poles for the $f_2$-wave. The determined pole positions are consistent with those of well-established resonance states. The observed $f_0$ and $f_{2}$ states are found to be in agreement with those produced in radiative $J/ψ$ decays. The production behaviors of $f_0$ and $f_2$ poles in $ψ(3686)\toγK_S^0 K_S^0$ are qualified with their residues and the converted branching fractions. By comparing with $J/ψ\toγK_S^0 K_S^0$ decay, the ratios $\frac{\mathcal{B}(ψ(3686)\toγf_{0,2})}{\mathcal{B}(J/ψ\toγf_{0,2})}$ are determined, which provides crucial experimental inputs on the internal structure of the $f_{0,2}$ states, especially their potential mixing with glueball components.
△ Less
Submitted 7 May, 2025; v1 submitted 19 February, 2025;
originally announced February 2025.
-
Progress of the TianQin project
Authors:
Jun Luo,
Shaojun Bai,
Yan-Zheng Bai,
Lin Cai,
Hao Dang,
Qijia Dong,
Hui-Zong Duan,
Yuanbo Du,
Lei Fan,
Xinju Fu,
Yong Gao,
Xingyu Gou,
Changlei Guo,
Wei Hong,
Bin Hu,
Heran Hu,
Ming Hu,
Yi-Ming Hu,
Fa Peng Huang,
Defeng Gu,
Xin Ji,
Yuan-Ze Jiang,
En-Kun Li,
Hongyin Li,
Ming Li
, et al. (76 additional authors not shown)
Abstract:
TianQin is a future space-based gravitational wave observatory targeting the frequency window of $10^{-4}$ Hz $\sim 1$ Hz. A large variety of gravitational wave sources are expected in this frequency band, including the merger of massive black hole binaries, the inspiral of extreme/intermediate mass ratio systems, stellar-mass black hole binaries, Galactic compact binaries, and so on. TianQin will…
▽ More
TianQin is a future space-based gravitational wave observatory targeting the frequency window of $10^{-4}$ Hz $\sim 1$ Hz. A large variety of gravitational wave sources are expected in this frequency band, including the merger of massive black hole binaries, the inspiral of extreme/intermediate mass ratio systems, stellar-mass black hole binaries, Galactic compact binaries, and so on. TianQin will consist of three Earth orbiting satellites on nearly identical orbits with orbital radii of about $10^5$ km. The satellites will form a normal triangle constellation whose plane is nearly perpendicular to the ecliptic plane. The TianQin project has been progressing smoothly following the ``0123" technology roadmap. In step ``0", the TianQin laser ranging station has been constructed and it has successfully ranged to all the five retro-reflectors on the Moon. In step ``1", the drag-free control technology has been tested and demonstrated using the TianQin-1 satellite. In step ``2", the inter-satellite laser interferometry technology will be tested using the pair of TianQin-2 satellites. The TianQin-2 mission has been officially approved and the satellites will be launched around 2026. In step ``3", i.e., the TianQin-3 mission, three identical satellites will be launched around 2035 to form the space-based gravitational wave detector, TianQin, and to start gravitational wave detection in space.
△ Less
Submitted 16 February, 2025;
originally announced February 2025.
-
Search for the Cabibbo-suppressed decays $Λ_c^{+}\toΣ^0K^{+}π^{0}$ and $Λ_c^{+}\toΣ^0K^{+}π^{+}π^{-}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai
, et al. (687 additional authors not shown)
Abstract:
Utilizing 4.5 $fb^-$ of $e^+e^-$ annihilation data collected at center-of-mass energies ranging from 4599.53 MeV to 4698.82 MeV by the BESIII detector at the BEPCII collider, we search for the singly Cabibbo-suppressed hadronic decays $Λ_{c}^{+}\toΣ^{0} K^{+}π^{0}$ and $Λ_{c}^{+}\toΣ^{0}K^{+}π^+π^-$ with a single-tag method. No significant signals are observed for both decays. The upper limits on…
▽ More
Utilizing 4.5 $fb^-$ of $e^+e^-$ annihilation data collected at center-of-mass energies ranging from 4599.53 MeV to 4698.82 MeV by the BESIII detector at the BEPCII collider, we search for the singly Cabibbo-suppressed hadronic decays $Λ_{c}^{+}\toΣ^{0} K^{+}π^{0}$ and $Λ_{c}^{+}\toΣ^{0}K^{+}π^+π^-$ with a single-tag method. No significant signals are observed for both decays. The upper limits on the branching fractions at the $90\%$ confidence level are determined to be $5.0\times 10^{-4}$ for $Λ_{c}^{+}\toΣ^{0} K^{+}π^{0}$ and $6.5\times 10^{-4}$ for $Λ_c^{+}\toΣ^0K^{+}π^{+}π^{-}$.
△ Less
Submitted 16 February, 2025;
originally announced February 2025.
-
Adaptive Multi-Objective Bayesian Optimization for Capacity Planning of Hybrid Heat Sources in Electric-Heat Coupling Systems of Cold Regions
Authors:
Ruizhe Yang,
Zhongkai Yi,
Ying Xu,
Guiyu Chen,
Haojie Yang,
Rong Yi,
Tongqing Li,
Miaozhe ShenJin Li,
Haoxiang Gao,
Hongyu Duan
Abstract:
The traditional heat-load generation pattern of combined heat and power generators has become a problem leading to renewable energy source (RES) power curtailment in cold regions, motivating the proposal of a planning model for alternative heat sources. The model aims to identify non-dominant capacity allocation schemes for heat pumps, thermal energy storage, electric boilers, and combined storage…
▽ More
The traditional heat-load generation pattern of combined heat and power generators has become a problem leading to renewable energy source (RES) power curtailment in cold regions, motivating the proposal of a planning model for alternative heat sources. The model aims to identify non-dominant capacity allocation schemes for heat pumps, thermal energy storage, electric boilers, and combined storage heaters to construct a Pareto front, considering both economic and sustainable objectives. The integration of various heat sources from both generation and consumption sides enhances flexibility in utilization. The study introduces a novel optimization algorithm, the adaptive multi-objective Bayesian optimization (AMBO). Compared to other widely used multi-objective optimization algorithms, AMBO eliminates predefined parameters that may introduce subjectivity from planners. Beyond the algorithm, the proposed model incorporates a noise term to account for inevitable simulation deviations, enabling the identification of better-performing planning results that meet the unique requirements of cold regions. What's more, the characteristics of electric-thermal coupling scenarios are captured and reflected in the operation simulation model to make sure the simulation is close to reality. Numerical simulation verifies the superiority of the proposed approach in generating a more diverse and evenly distributed Pareto front in a sample-efficient manner, providing comprehensive and objective planning choices.
△ Less
Submitted 13 February, 2025;
originally announced February 2025.
-
Precise Measurement of the $χ_{c0}$ Resonance Parameters and Branching Fractions of $χ_{c0,c2}\toπ^+π^-/K^+K^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (648 additional authors not shown)
Abstract:
By analyzing a $ψ(3686)$ data sample containing $(107.7\pm0.6)\times10^{6}$ events taken with the BESIII detector at the BEPCII storage ring in 2009, the $χ_{c0}$ resonance parameters are precisely measured using $χ_{c0,c2} \to π^+π^-/K^+K^-$ events. The mass of $χ_{c0}$ is determined to be $M(χ_{c0})=(3415.67\pm0.07\pm0.06\pm0.07$)~MeV/$c^2$, and its full width is…
▽ More
By analyzing a $ψ(3686)$ data sample containing $(107.7\pm0.6)\times10^{6}$ events taken with the BESIII detector at the BEPCII storage ring in 2009, the $χ_{c0}$ resonance parameters are precisely measured using $χ_{c0,c2} \to π^+π^-/K^+K^-$ events. The mass of $χ_{c0}$ is determined to be $M(χ_{c0})=(3415.67\pm0.07\pm0.06\pm0.07$)~MeV/$c^2$, and its full width is $Γ(χ_{c0})=(12.44\pm0.12\pm0.12)~{\rm MeV}$, where the first uncertainty is statistical, the second systematic, and the third for mass comes from $χ_{c2}$ mass uncertainty. These measurements improve the precision of $χ_{c0}$ mass by a factor of four and width by one order of magnitude over the previous individual measurements, and significantly boost our knowledge about the charmonium spectrum. Together with additional $(345.4\pm2.6)\times10^{6}$ $ψ(3686)$ data events taken in 2012, the decay branching fractions of $χ_{c0,c2}\toπ^+π^-/K^+K^-$ are measured as well, with precision improved by a factor of three compared to previous measurements. These $χ_{c0}$ decay branching fractions provide important inputs for the study of glueballs.
△ Less
Submitted 12 February, 2025;
originally announced February 2025.
-
Joint Modelling Histology and Molecular Markers for Cancer Classification
Authors:
Xiaofei Wang,
Hanyu Liu,
Yupei Zhang,
Boyang Zhao,
Hao Duan,
Wanming Hu,
Yonggao Mou,
Stephen Price,
Chao Li
Abstract:
Cancers are characterized by remarkable heterogeneity and diverse prognosis. Accurate cancer classification is essential for patient stratification and clinical decision-making. Although digital pathology has been advancing cancer diagnosis and prognosis, the paradigm in cancer pathology has shifted from purely relying on histology features to incorporating molecular markers. There is an urgent ne…
▽ More
Cancers are characterized by remarkable heterogeneity and diverse prognosis. Accurate cancer classification is essential for patient stratification and clinical decision-making. Although digital pathology has been advancing cancer diagnosis and prognosis, the paradigm in cancer pathology has shifted from purely relying on histology features to incorporating molecular markers. There is an urgent need for digital pathology methods to meet the needs of the new paradigm. We introduce a novel digital pathology approach to jointly predict molecular markers and histology features and model their interactions for cancer classification. Firstly, to mitigate the challenge of cross-magnification information propagation, we propose a multi-scale disentangling module, enabling the extraction of multi-scale features from high-magnification (cellular-level) to low-magnification (tissue-level) whole slide images. Further, based on the multi-scale features, we propose an attention-based hierarchical multi-task multi-instance learning framework to simultaneously predict histology and molecular markers. Moreover, we propose a co-occurrence probability-based label correlation graph network to model the co-occurrence of molecular markers. Lastly, we design a cross-modal interaction module with the dynamic confidence constrain loss and a cross-modal gradient modulation strategy, to model the interactions of histology and molecular markers. Our experiments demonstrate that our method outperforms other state-of-the-art methods in classifying glioma, histology features and molecular markers. Our method promises to promote precise oncology with the potential to advance biomedical research and clinical applications. The code is available at https://github.com/LHY1007/M3C2
△ Less
Submitted 11 February, 2025;
originally announced February 2025.
-
Search for $e^+e^-\to K_S^0 K_S^0 h_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (642 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data at 13 center-of-mass energies ranging from 4.600 to 4.950 GeV collected with the BESIII detector, we search for the unmeasured $e^+e^-\to K_S^0 K_S^0 h_c$ process . No significant signal is observed, and the upper limits of the Born cross sections at each center-of-mass energy are presented.
Using $e^+e^-$ collision data at 13 center-of-mass energies ranging from 4.600 to 4.950 GeV collected with the BESIII detector, we search for the unmeasured $e^+e^-\to K_S^0 K_S^0 h_c$ process . No significant signal is observed, and the upper limits of the Born cross sections at each center-of-mass energy are presented.
△ Less
Submitted 11 February, 2025;
originally announced February 2025.
-
VideoRoPE: What Makes for Good Video Rotary Position Embedding?
Authors:
Xilin Wei,
Xiaoran Liu,
Yuhang Zang,
Xiaoyi Dong,
Pan Zhang,
Yuhang Cao,
Jian Tong,
Haodong Duan,
Qipeng Guo,
Jiaqi Wang,
Xipeng Qiu,
Dahua Lin
Abstract:
While Rotary Position Embedding (RoPE) and its variants are widely adopted for their long-context capabilities, the extension of the 1D RoPE to video, with its complex spatio-temporal structure, remains an open challenge. This work first introduces a comprehensive analysis that identifies four key characteristics essential for the effective adaptation of RoPE to video, which have not been fully co…
▽ More
While Rotary Position Embedding (RoPE) and its variants are widely adopted for their long-context capabilities, the extension of the 1D RoPE to video, with its complex spatio-temporal structure, remains an open challenge. This work first introduces a comprehensive analysis that identifies four key characteristics essential for the effective adaptation of RoPE to video, which have not been fully considered in prior work. As part of our analysis, we introduce a challenging V-NIAH-D (Visual Needle-In-A-Haystack with Distractors) task, which adds periodic distractors into V-NIAH. The V-NIAH-D task demonstrates that previous RoPE variants, lacking appropriate temporal dimension allocation, are easily misled by distractors. Based on our analysis, we introduce \textbf{VideoRoPE}, with a \textit{3D structure} designed to preserve spatio-temporal relationships. VideoRoPE features \textit{low-frequency temporal allocation} to mitigate periodic oscillations, a \textit{diagonal layout} to maintain spatial symmetry, and \textit{adjustable temporal spacing} to decouple temporal and spatial indexing. VideoRoPE consistently surpasses previous RoPE variants, across diverse downstream tasks such as long video retrieval, video understanding, and video hallucination. Our code will be available at \href{https://github.com/Wiselnn570/VideoRoPE}{https://github.com/Wiselnn570/VideoRoPE}.
△ Less
Submitted 27 April, 2025; v1 submitted 7 February, 2025;
originally announced February 2025.
-
Observation of $D\to \bar{K}_{1}(1270)μ^+ν_μ$ and test of lepton flavor universality with $D\to \bar{K}_1(1270) \ell^{+} ν_{\ell}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (646 additional authors not shown)
Abstract:
By analyzing 7.93 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at the center-of-mass energy of 3.773 GeV with the BESIII detector operated at the BEPCII collider, we report the observation of the semimuonic decays of $D^+\to \bar K_1(1270)^0μ^+ν_μ$ and $D^0\to K_1(1270)^-μ^+ν_μ$ with statistical significances of $12.5σ$ and $6.0σ$, respectively. Their decay branching fractions are determined…
▽ More
By analyzing 7.93 $\rm fb^{-1}$ of $e^+e^-$ collision data collected at the center-of-mass energy of 3.773 GeV with the BESIII detector operated at the BEPCII collider, we report the observation of the semimuonic decays of $D^+\to \bar K_1(1270)^0μ^+ν_μ$ and $D^0\to K_1(1270)^-μ^+ν_μ$ with statistical significances of $12.5σ$ and $6.0σ$, respectively. Their decay branching fractions are determined to be ${\mathcal B}[D^{+}\to \bar{K}_1(1270)^0 μ^{+}ν_μ]=(2.36\pm0.20^{+0.18}_{-0.27}\pm 0.48)\times10^{-3}$ and ${\mathcal B}[D^{0}\to K_1(1270)^{-} μ^{+}ν_μ]=(0.78\pm0.11^{+0.05}_{-0.09}\pm 0.15)\times10^{-3}$, where the first and second uncertainties are statistical and systematic, respectively, and the third originates from the input branching fraction of $\bar K_{1}(1270)^0\to K^- π^+π^0$ or $K_1(1270)^-\to K^-π^+π^-$. Combining our branching fractions with the previous measurements of ${\mathcal B}[D^+\to \bar K_1(1270)^0e^+ν_{e}]$ and ${\mathcal B}[D^0\to K_1(1270)^-e^+ν_{e}]$, we determine the branching fraction ratios to be ${\mathcal B}[D^+\to \bar K_1(1270)^0μ^+ν_μ]/{\mathcal B}[D^+\to \bar K_1(1270)^0e^+ν_{e}]=1.03 \pm 0.14 \substack{+0.11\\-0.15}$ and ${\mathcal B}[D^0\to K_1(1270)^-μ^+ν_μ]/{\mathcal B}[D^0\to K_1(1270)^-e^+ν_{e}]=0.74\pm 0.13 \substack{+0.08\\-0.13}$. Using the branching fractions measured in this work and the world-average lifetimes of the $D^+$ and $D^0$ mesons, we determine the semimuonic partial decay width ratio to be $Γ[D^+\to \bar K_1(1270)^0 μ^+ν_μ]/Γ[D^0\to K_1(1270)^- μ^+ν_μ]=1.22\pm 0.10\substack{+0.06\\-0.09}$, which is consistent with unity as predicted by isospin conservation.
△ Less
Submitted 18 April, 2025; v1 submitted 6 February, 2025;
originally announced February 2025.
-
Laser: Efficient Language-Guided Segmentation in Neural Radiance Fields
Authors:
Xingyu Miao,
Haoran Duan,
Yang Bai,
Tejal Shah,
Jun Song,
Yang Long,
Rajiv Ranjan,
Ling Shao
Abstract:
In this work, we propose a method that leverages CLIP feature distillation, achieving efficient 3D segmentation through language guidance. Unlike previous methods that rely on multi-scale CLIP features and are limited by processing speed and storage requirements, our approach aims to streamline the workflow by directly and effectively distilling dense CLIP features, thereby achieving precise segme…
▽ More
In this work, we propose a method that leverages CLIP feature distillation, achieving efficient 3D segmentation through language guidance. Unlike previous methods that rely on multi-scale CLIP features and are limited by processing speed and storage requirements, our approach aims to streamline the workflow by directly and effectively distilling dense CLIP features, thereby achieving precise segmentation of 3D scenes using text. To achieve this, we introduce an adapter module and mitigate the noise issue in the dense CLIP feature distillation process through a self-cross-training strategy. Moreover, to enhance the accuracy of segmentation edges, this work presents a low-rank transient query attention mechanism. To ensure the consistency of segmentation for similar colors under different viewpoints, we convert the segmentation task into a classification task through label volume, which significantly improves the consistency of segmentation in color-similar areas. We also propose a simplified text augmentation strategy to alleviate the issue of ambiguity in the correspondence between CLIP features and text. Extensive experimental results show that our method surpasses current state-of-the-art technologies in both training speed and performance. Our code is available on: https://github.com/xingy038/Laser.git.
△ Less
Submitted 31 January, 2025;
originally announced January 2025.
-
Observation of $h_{c}$ radiative decays to multiple light hadrons and the tensor state $f_2(1270)$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai
, et al. (666 additional authors not shown)
Abstract:
Using $ψ(3686)\rightarrow π^{0} h_{c}$ decays from a data sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider, $h_c$ radiative decays to $γπ^{+}π^{-},~γπ^{+}π^{-}η,~\gamma2(π^{+}π^{-})$, and $γp\bar{p}$ are observed for the first time, each with a significance greater than $5σ$. The corresponding branching fractions are measured. Furtherm…
▽ More
Using $ψ(3686)\rightarrow π^{0} h_{c}$ decays from a data sample of $(27.12\pm0.14)\times10^{8}$ $ψ(3686)$ events collected by the BESIII detector at the BEPCII collider, $h_c$ radiative decays to $γπ^{+}π^{-},~γπ^{+}π^{-}η,~\gamma2(π^{+}π^{-})$, and $γp\bar{p}$ are observed for the first time, each with a significance greater than $5σ$. The corresponding branching fractions are measured. Furthermore, intermediate states below 2.8 GeV/$c^{2}$ are investigated, leading to the first observation of the decay process of $h_c\rightarrowγf_{2}(1270)\rightarrowγπ^{+}π^{-}$ with a significance of $5.5\,σ$. This observation represents the first instance of $h_c$ radiative decay to a tensor state.
△ Less
Submitted 26 January, 2025;
originally announced January 2025.
-
Cross section measurement of $e^{+}e^{-} \to f_{1}(1285)π^{+}π^{-}$ at center-of-mass energies between $3.808$ and $4.951\rm GeV$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Using data samples collected by the \mbox{BESIII} detector located at the Beijing Electron Positron Collider, the cross sections of the process $e^+e^-\to f_{1}(1285)π^+π^-$ are measured at forty-five center-of-mass energies from $3.808$ to $4.951 {\rm GeV}$. An investigation on the cross section line shape is performed, and no significant structure is observed.
Using data samples collected by the \mbox{BESIII} detector located at the Beijing Electron Positron Collider, the cross sections of the process $e^+e^-\to f_{1}(1285)π^+π^-$ are measured at forty-five center-of-mass energies from $3.808$ to $4.951 {\rm GeV}$. An investigation on the cross section line shape is performed, and no significant structure is observed.
△ Less
Submitted 23 January, 2025;
originally announced January 2025.
-
Redundancy Principles for MLLMs Benchmarks
Authors:
Zicheng Zhang,
Xiangyu Zhao,
Xinyu Fang,
Chunyi Li,
Xiaohong Liu,
Xiongkuo Min,
Haodong Duan,
Kai Chen,
Guangtao Zhai
Abstract:
With the rapid iteration of Multi-modality Large Language Models (MLLMs) and the evolving demands of the field, the number of benchmarks produced annually has surged into the hundreds. The rapid growth has inevitably led to significant redundancy among benchmarks. Therefore, it is crucial to take a step back and critically assess the current state of redundancy and propose targeted principles for…
▽ More
With the rapid iteration of Multi-modality Large Language Models (MLLMs) and the evolving demands of the field, the number of benchmarks produced annually has surged into the hundreds. The rapid growth has inevitably led to significant redundancy among benchmarks. Therefore, it is crucial to take a step back and critically assess the current state of redundancy and propose targeted principles for constructing effective MLLM benchmarks. In this paper, we focus on redundancy from three key perspectives: 1) Redundancy of benchmark capability dimensions, 2) Redundancy in the number of test questions, and 3) Cross-benchmark redundancy within specific domains. Through the comprehensive analysis over hundreds of MLLMs' performance across more than 20 benchmarks, we aim to quantitatively measure the level of redundancy lies in existing MLLM evaluations, provide valuable insights to guide the future development of MLLM benchmarks, and offer strategies to refine and address redundancy issues effectively.
△ Less
Submitted 20 January, 2025;
originally announced January 2025.
-
InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model
Authors:
Yuhang Zang,
Xiaoyi Dong,
Pan Zhang,
Yuhang Cao,
Ziyu Liu,
Shengyuan Ding,
Shenxi Wu,
Yubo Ma,
Haodong Duan,
Wenwei Zhang,
Kai Chen,
Dahua Lin,
Jiaqi Wang
Abstract:
Despite the promising performance of Large Vision Language Models (LVLMs) in visual understanding, they occasionally generate incorrect outputs. While reward models (RMs) with reinforcement learning or test-time scaling offer the potential for improving generation quality, a critical gap remains: publicly available multi-modal RMs for LVLMs are scarce, and the implementation details of proprietary…
▽ More
Despite the promising performance of Large Vision Language Models (LVLMs) in visual understanding, they occasionally generate incorrect outputs. While reward models (RMs) with reinforcement learning or test-time scaling offer the potential for improving generation quality, a critical gap remains: publicly available multi-modal RMs for LVLMs are scarce, and the implementation details of proprietary models are often unclear. We bridge this gap with InternLM-XComposer2.5-Reward (IXC-2.5-Reward), a simple yet effective multi-modal reward model that aligns LVLMs with human preferences. To ensure the robustness and versatility of IXC-2.5-Reward, we set up a high-quality multi-modal preference corpus spanning text, image, and video inputs across diverse domains, such as instruction following, general understanding, text-rich documents, mathematical reasoning, and video understanding. IXC-2.5-Reward achieves excellent results on the latest multi-modal reward model benchmark and shows competitive performance on text-only reward model benchmarks. We further demonstrate three key applications of IXC-2.5-Reward: (1) Providing a supervisory signal for RL training. We integrate IXC-2.5-Reward with Proximal Policy Optimization (PPO) yields IXC-2.5-Chat, which shows consistent improvements in instruction following and multi-modal open-ended dialogue; (2) Selecting the best response from candidate responses for test-time scaling; and (3) Filtering outlier or noisy samples from existing image and video instruction tuning training data. To ensure reproducibility and facilitate further research, we have open-sourced all model weights and training recipes at https://github.com/InternLM/InternLM-XComposer
△ Less
Submitted 21 January, 2025;
originally announced January 2025.
-
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
Authors:
Maosong Cao,
Taolin Zhang,
Mo Li,
Chuyu Zhang,
Yunxin Liu,
Haodong Duan,
Songyang Zhang,
Kai Chen
Abstract:
The quality of Supervised Fine-Tuning (SFT) data plays a critical role in enhancing the conversational capabilities of Large Language Models (LLMs). However, as LLMs become more advanced, the availability of high-quality human-annotated SFT data has become a significant bottleneck, necessitating a greater reliance on synthetic training data. In this work, we introduce Condor, a novel two-stage syn…
▽ More
The quality of Supervised Fine-Tuning (SFT) data plays a critical role in enhancing the conversational capabilities of Large Language Models (LLMs). However, as LLMs become more advanced, the availability of high-quality human-annotated SFT data has become a significant bottleneck, necessitating a greater reliance on synthetic training data. In this work, we introduce Condor, a novel two-stage synthetic data generation framework that incorporates World Knowledge Tree and Self-Reflection Refinement to produce high-quality SFT data at scale. Our experimental results demonstrate that a base model fine-tuned on only 20K Condor-generated samples achieves superior performance compared to counterparts. The additional refinement stage in Condor further enables iterative self-improvement for LLMs at various scales (up to 72B), validating the effectiveness of our approach. Furthermore, our investigation into the scaling for synthetic data in post-training reveals substantial unexplored potential for performance improvements, opening promising avenues for future research.
△ Less
Submitted 21 January, 2025;
originally announced January 2025.
-
You Can't Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense
Authors:
Wuyuao Mai,
Geng Hong,
Pei Chen,
Xudong Pan,
Baojun Liu,
Yuan Zhang,
Haixin Duan,
Min Yang
Abstract:
With the rise of generative large language models (LLMs) like LLaMA and ChatGPT, these models have significantly transformed daily life and work by providing advanced insights. However, as jailbreak attacks continue to circumvent built-in safety mechanisms, exploiting carefully crafted scenarios or tokens, the safety risks of LLMs have come into focus. While numerous defense strategies--such as pr…
▽ More
With the rise of generative large language models (LLMs) like LLaMA and ChatGPT, these models have significantly transformed daily life and work by providing advanced insights. However, as jailbreak attacks continue to circumvent built-in safety mechanisms, exploiting carefully crafted scenarios or tokens, the safety risks of LLMs have come into focus. While numerous defense strategies--such as prompt detection, modification, and model fine-tuning--have been proposed to counter these attacks, a critical question arises: do these defenses compromise the utility and usability of LLMs for legitimate users? Existing research predominantly focuses on the effectiveness of defense strategies without thoroughly examining their impact on performance, leaving a gap in understanding the trade-offs between LLM safety and performance. Our research addresses this gap by conducting a comprehensive study on the utility degradation, safety elevation, and exaggerated-safety escalation of LLMs with jailbreak defense strategies. We propose USEBench, a novel benchmark designed to evaluate these aspects, along with USEIndex, a comprehensive metric for assessing overall model performance. Through experiments on seven state-of-the-art LLMs, we found that mainstream jailbreak defenses fail to ensure both safety and performance simultaneously. Although model-finetuning performs the best overall, their effectiveness varies across LLMs. Furthermore, vertical comparisons reveal that developers commonly prioritize performance over safety when iterating or fine-tuning their LLMs.
△ Less
Submitted 21 January, 2025;
originally announced January 2025.
-
Study of $η\rightarrowπ^+π^-l^+l^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (637 additional authors not shown)
Abstract:
Using a sample of $(10087\pm44)\times10^{6}$ $J/ψ$ events accumulated with the BESIII detector, we analyze the decays $η\rightarrowπ^+π^-l^+l^-$ ($l=e$ or $μ$) via the process $J/ψ\rightarrowγη$. The branching fraction of $η\rightarrowπ^+π^-e^+e^-$ is measured to be $\mathcal{B}(η\rightarrowπ^+π^-e^+e^-)=(3.07\pm0.12_{\rm{stat.}}\pm0.19_{\rm{syst.}}) \times10^{-4}$. No signal events are observed f…
▽ More
Using a sample of $(10087\pm44)\times10^{6}$ $J/ψ$ events accumulated with the BESIII detector, we analyze the decays $η\rightarrowπ^+π^-l^+l^-$ ($l=e$ or $μ$) via the process $J/ψ\rightarrowγη$. The branching fraction of $η\rightarrowπ^+π^-e^+e^-$ is measured to be $\mathcal{B}(η\rightarrowπ^+π^-e^+e^-)=(3.07\pm0.12_{\rm{stat.}}\pm0.19_{\rm{syst.}}) \times10^{-4}$. No signal events are observed for the $η\rightarrowπ^{+}π^{-}μ^{+}μ^{-}$ decay, leading to an upper limit on the branching fraction of $\mathcal{B}(η\rightarrowπ^{+}π^{-}μ^{+}μ^{-})<4.0\times10^{-7}$ at the 90\% confidence level. Furthermore, the $CP$-violation asymmetry parameter is found to be $\mathcal{A}_{CP}(η\rightarrowπ^{+}π^{-}e^{+}e^{-})=(-4.04\pm4.69_{\rm{stat.}}\pm0.14_{\rm{syst.}})\%$, showing no evidence of $CP$-violation with current statistics. Additionally, we extract the transition form factor from the decay amplitude of $η\rightarrowπ^+π^-e^+e^-$. Finally, axion-like particles are searched for via the decay $η\rightarrowπ^+π^-a, a\rightarrow e^+e^-$, and upper limits on this branching fraction relative to that of $η\rightarrowπ^+π^-e^+e^-$ are presented as a function of the axion-like particle mass in the range $5-200\ \mathrm{MeV}/c^{2}$.
△ Less
Submitted 17 January, 2025;
originally announced January 2025.
-
Temporal refraction and reflection in modulated mechanical metabeams: theory and physical observation
Authors:
Shaoyun Wang,
Nan Shao,
Hui Chen,
Jiaji Chen,
Honghua Qian,
Qian Wu,
Huiling Duan,
Andrea Alu,
Guoliang Huang
Abstract:
Wave reflection and refraction at a time interface follow different conservation laws compared to conventional scattering at a spatial interface. This study presents the experimental demonstration of refraction and reflection of flexural waves across a temporal boundary in a continuum based mechanical metabeam, and unveils opportunities that emerge by tailoring temporal scattering phenomena for ph…
▽ More
Wave reflection and refraction at a time interface follow different conservation laws compared to conventional scattering at a spatial interface. This study presents the experimental demonstration of refraction and reflection of flexural waves across a temporal boundary in a continuum based mechanical metabeam, and unveils opportunities that emerge by tailoring temporal scattering phenomena for phononic applications. We observe these phenomena in an elastic beam attached to an array of piezoelectric patches that can vary in time the effective elastic properties of the beam. Frequency conversion and phase conjugation are observed upon a single temporal interface. These results are consistent with the temporal Snell law and Fresnel equations for temporal interfaces. Further, we illustrate the manipulation of amplitude and frequency spectra of flexural wave temporal refraction and reflection through multi stepped temporal interfaces. Finally, by implementing a smooth time variation of wave impedance, we numerically and experimentally demonstrate the capabilities of the temporal metabeam to realize waveform morphing and information coding. Our findings lay the foundation for developing time mechanical metamaterials and time phononic crystals, offering new avenues for advanced phonon manipulation in both wave amplitude and frequency
△ Less
Submitted 17 January, 2025;
originally announced January 2025.
-
Search for the FCNC charmonium decay $J/ψ\to D^0 μ^+ μ^- + \text{c.c.}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai
, et al. (680 additional authors not shown)
Abstract:
Based on a data sample of $(10087 \pm 44) \times 10^6$ $J/ψ$ events taken with the BESIII detector, we search for the flavor-changing neutral current charmonium decay $J/ψ\to D^{0} μ^{+} μ^{-} + \text{c.c.}$. No significant signal above the background is observed, and the upper limit on its branching fraction is set to be $\mathcal{B}(J/ψ\to D^{0}μ^{+}μ^{-} + \text{c.c.} ) < 1.1 \times 10^{-7}$ at…
▽ More
Based on a data sample of $(10087 \pm 44) \times 10^6$ $J/ψ$ events taken with the BESIII detector, we search for the flavor-changing neutral current charmonium decay $J/ψ\to D^{0} μ^{+} μ^{-} + \text{c.c.}$. No significant signal above the background is observed, and the upper limit on its branching fraction is set to be $\mathcal{B}(J/ψ\to D^{0}μ^{+}μ^{-} + \text{c.c.} ) < 1.1 \times 10^{-7}$ at the 90% confidence level. This marks the first search for a flavor-changing neutral current charmonium decay involving muons in the final state.
△ Less
Submitted 14 February, 2025; v1 submitted 14 January, 2025;
originally announced January 2025.
-
SP-SLAM: Neural Real-Time Dense SLAM With Scene Priors
Authors:
Zhen Hong,
Bowen Wang,
Haoran Duan,
Yawen Huang,
Xiong Li,
Zhenyu Wen,
Xiang Wu,
Wei Xiang,
Yefeng Zheng
Abstract:
Neural implicit representations have recently shown promising progress in dense Simultaneous Localization And Mapping (SLAM). However, existing works have shortcomings in terms of reconstruction quality and real-time performance, mainly due to inflexible scene representation strategy without leveraging any prior information. In this paper, we introduce SP-SLAM, a novel neural RGB-D SLAM system tha…
▽ More
Neural implicit representations have recently shown promising progress in dense Simultaneous Localization And Mapping (SLAM). However, existing works have shortcomings in terms of reconstruction quality and real-time performance, mainly due to inflexible scene representation strategy without leveraging any prior information. In this paper, we introduce SP-SLAM, a novel neural RGB-D SLAM system that performs tracking and mapping in real-time. SP-SLAM computes depth images and establishes sparse voxel-encoded scene priors near the surfaces to achieve rapid convergence of the model. Subsequently, the encoding voxels computed from single-frame depth image are fused into a global volume, which facilitates high-fidelity surface reconstruction. Simultaneously, we employ tri-planes to store scene appearance information, striking a balance between achieving high-quality geometric texture mapping and minimizing memory consumption. Furthermore, in SP-SLAM, we introduce an effective optimization strategy for mapping, allowing the system to continuously optimize the poses of all historical input frames during runtime without increasing computational overhead. We conduct extensive evaluations on five benchmark datasets (Replica, ScanNet, TUM RGB-D, Synthetic RGB-D, 7-Scenes). The results demonstrate that, compared to existing methods, we achieve superior tracking accuracy and reconstruction quality, while running at a significantly faster speed.
△ Less
Submitted 11 January, 2025;
originally announced January 2025.
-
Search for $K^0_S$ invisible decays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (642 additional authors not shown)
Abstract:
Based on $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events collected with the BESIII detector at the BEPCII $e^+e^-$ storage ring, we search for $K_{S}^{0}$ invisible decays via the $J/ψ\to φK_{S}^{0} K_{S}^{0}$ process. No significant signal is observed, and the upper limit of the branching fraction of these invisible decays is set at 8.4 $\times$ $10^{-4}$ at the 90\% confidence level. This is the f…
▽ More
Based on $(1.0087\pm0.0044)\times10^{10}$ $J/ψ$ events collected with the BESIII detector at the BEPCII $e^+e^-$ storage ring, we search for $K_{S}^{0}$ invisible decays via the $J/ψ\to φK_{S}^{0} K_{S}^{0}$ process. No significant signal is observed, and the upper limit of the branching fraction of these invisible decays is set at 8.4 $\times$ $10^{-4}$ at the 90\% confidence level. This is the first experimental search for $K^0_S$ invisible decays.
△ Less
Submitted 10 January, 2025;
originally announced January 2025.
-
OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?
Authors:
Yifei Li,
Junbo Niu,
Ziyang Miao,
Chunjiang Ge,
Yuanhang Zhou,
Qihao He,
Xiaoyi Dong,
Haodong Duan,
Shuangrui Ding,
Rui Qian,
Pan Zhang,
Yuhang Zang,
Yuhang Cao,
Conghui He,
Jiaqi Wang
Abstract:
Temporal Awareness, the ability to reason dynamically based on the timestamp when a question is raised, is the key distinction between offline and online video LLMs. Unlike offline models, which rely on complete videos for static, post hoc analysis, online models process video streams incrementally and dynamically adapt their responses based on the timestamp at which the question is posed. Despite…
▽ More
Temporal Awareness, the ability to reason dynamically based on the timestamp when a question is raised, is the key distinction between offline and online video LLMs. Unlike offline models, which rely on complete videos for static, post hoc analysis, online models process video streams incrementally and dynamically adapt their responses based on the timestamp at which the question is posed. Despite its significance, temporal awareness has not been adequately evaluated in existing benchmarks. To fill this gap, we present OVO-Bench (Online-VideO-Benchmark), a novel video benchmark that emphasizes the importance of timestamps for advanced online video understanding capability benchmarking. OVO-Bench evaluates the ability of video LLMs to reason and respond to events occurring at specific timestamps under three distinct scenarios: (1) Backward tracing: trace back to past events to answer the question. (2) Real-time understanding: understand and respond to events as they unfold at the current timestamp. (3) Forward active responding: delay the response until sufficient future information becomes available to answer the question accurately. OVO-Bench comprises 12 tasks, featuring 644 unique videos and approximately human-curated 2,800 fine-grained meta-annotations with precise timestamps. We combine automated generation pipelines with human curation. With these high-quality samples, we further developed an evaluation pipeline to systematically query video LLMs along the video timeline. Evaluations of nine Video-LLMs reveal that, despite advancements on traditional benchmarks, current models struggle with online video understanding, showing a significant gap compared to human agents. We hope OVO-Bench will drive progress in video LLMs and inspire future research in online video reasoning. Our benchmark and code can be accessed at https://github.com/JoeLeelyf/OVO-Bench.
△ Less
Submitted 27 March, 2025; v1 submitted 9 January, 2025;
originally announced January 2025.
-
Search for the leptonic decay $D^{+}\to e^{+}ν_{e}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (646 additional authors not shown)
Abstract:
We search for the leptonic decay $D^+\to e^+ν_{e}$ using an $e^+e^-$ collision data sample with an integrated luminosity of 20.3~fb$^{-1}$ collected with the BESIII detector at the center-of-mass energy of 3.773~GeV. No significant signal is observed and an upper limit on the branching fraction of $D^+\to e^+ν_{e}$ is set as $9.7 \times 10^{-7}$, at the 90\% confidence level. Our upper limit is an…
▽ More
We search for the leptonic decay $D^+\to e^+ν_{e}$ using an $e^+e^-$ collision data sample with an integrated luminosity of 20.3~fb$^{-1}$ collected with the BESIII detector at the center-of-mass energy of 3.773~GeV. No significant signal is observed and an upper limit on the branching fraction of $D^+\to e^+ν_{e}$ is set as $9.7 \times 10^{-7}$, at the 90\% confidence level. Our upper limit is an order of magnitude smaller than the previous limit for this decay mode.
△ Less
Submitted 8 January, 2025;
originally announced January 2025.
-
Observation of the $W$-annihilation process $D_s^+ \to ωρ^+$ and measurement of $D_s^+ \to φρ^+$ in $D^+_s\to π^+π^+π^-π^0π^0$ decays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (642 additional authors not shown)
Abstract:
We present the first amplitude analysis and branching fraction measurement of the decay $D^+_s\to π^+π^+π^-π^0π^0$, using $e^+e^-$ collision data collected with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV corresponding to an integrated luminosity of 7.33 fb$^{-1}$, and report the first observation of the pure $W$-annihilation decay $D_s^+ \to ωρ^+$ with a branching f…
▽ More
We present the first amplitude analysis and branching fraction measurement of the decay $D^+_s\to π^+π^+π^-π^0π^0$, using $e^+e^-$ collision data collected with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV corresponding to an integrated luminosity of 7.33 fb$^{-1}$, and report the first observation of the pure $W$-annihilation decay $D_s^+ \to ωρ^+$ with a branching fraction of $(0.99\pm0.08_{\rm stat}\pm0.07_{\rm syst})\%$. In comparison to the low significance of the $\mathcal{D}$ wave in the decay $D_s^+ \to φρ^+$, the dominance of the $\mathcal{D}$ wave over the $\mathcal{S}$ and $\mathcal{P}$ waves, with a fraction of $(51.85\pm7.28_{\rm stat}\pm7.90_{\rm syst})\%$ observed in the decay, provides crucial information for the``polarization puzzle", as well as for the understanding of charm meson decays. The branching fraction of $D^+_s\to π^+π^+π^-π^0π^0$ is measured to be $(4.41\pm0.15_{\rm stat}\pm0.13_{\rm syst})\%$. Moreover, the branching fraction of $D_s^+ \to φρ^+$ is measured to be $(3.98\pm0.33_{\rm stat}\pm0.21_{\rm syst})\%$, and the $R_φ= {\mathcal{B}(φ\toπ^+π^-π^0)}/{\mathcal{B}(φ\to K^+K^-)}$ is determined to be $(0.222\pm0.019_{\rm stat}\pm0.016_{\rm syst}$), which is consistent with the previous measurement based on charm meson decays, but deviates from the results from $e^+e^-$ annihilation and $K$-$N$ scattering experiments by more than 3$σ$.
△ Less
Submitted 8 January, 2025;
originally announced January 2025.
-
Study of the electromagnetic Dalitz decay $J/ψ\to e^+e^- π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
We study the electromagnetic Dalitz decay $J/ψ\to e^+e^- π^0$ using $(10087 \pm 44) \times 10^6$ $J/ψ$ events collected by the \bes detector. The di-electron-invariant-mass dependent transition form factor of this decay is explored for the first time. A significant resonant structure corresponding to the $ρ/ω$ resonance is observed, which cannot be described by existing theoretical models, due to…
▽ More
We study the electromagnetic Dalitz decay $J/ψ\to e^+e^- π^0$ using $(10087 \pm 44) \times 10^6$ $J/ψ$ events collected by the \bes detector. The di-electron-invariant-mass dependent transition form factor of this decay is explored for the first time. A significant resonant structure corresponding to the $ρ/ω$ resonance is observed, which cannot be described by existing theoretical models, due to contributions from the isospin-conserving $J/ψ\to ρπ^0$ and isospin-volating $J/ψ\to ωπ^0$ decays. The observed $ρ$--$ω$ interference is consistent with that of the pion form factor but features a relatively narrow $ρ$ peak. By taking into account the contribution of this resonant structure, the branching fraction of $J/ψ\to e^+e^- π^0$ in the full $e^+e^-$ invariant mass spectrum range is also measured for the first time to be $(8.06 \pm 0.31 (\rm{stat}) \pm 0.38 (\rm{syst}))\times 10^{-7}$, which is two times larger than the prediction of the Vector Meson Dominance model due to the observed resonant contribution of $ρ/ω$ resonances.
△ Less
Submitted 8 January, 2025;
originally announced January 2025.
-
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
Authors:
Beichen Zhang,
Yuhong Liu,
Xiaoyi Dong,
Yuhang Zang,
Pan Zhang,
Haodong Duan,
Yuhang Cao,
Dahua Lin,
Jiaqi Wang
Abstract:
Large language models (LLMs) have demonstrated impressive ability in solving complex mathematical problems with multi-step reasoning and can be further enhanced with well-designed in-context learning (ICL) examples. However, this potential is often constrained by two major challenges in ICL: granularity mismatch and irrelevant information. We observe that while LLMs excel at decomposing mathematic…
▽ More
Large language models (LLMs) have demonstrated impressive ability in solving complex mathematical problems with multi-step reasoning and can be further enhanced with well-designed in-context learning (ICL) examples. However, this potential is often constrained by two major challenges in ICL: granularity mismatch and irrelevant information. We observe that while LLMs excel at decomposing mathematical problems, they often struggle with reasoning errors in fine-grained steps. Moreover, ICL examples retrieved at the question level may omit critical steps or even mislead the model with irrelevant details. To address this issue, we propose BoostStep, a method that enhances reasoning accuracy through step-aligned ICL, a novel mechanism that carefully aligns retrieved reference steps with the corresponding reasoning steps. Additionally, BoostStep incorporates an effective "first-try" strategy to deliver exemplars highly relevant to the current state of reasoning. BoostStep is a flexible and powerful method that integrates seamlessly with chain-of-thought (CoT) and tree search algorithms, refining both candidate selection and decision-making. Empirical results show that BoostStep improves GPT-4o's CoT performance by 4.6% across mathematical benchmarks, significantly surpassing traditional few-shot learning's 1.2%. Moreover, it can achieve an additional 7.5\% gain combined with tree search. Surprisingly, it enhances state-of-the-art LLMs to solve challenging math problems using simpler examples. It improves DeepSeek-R1-671B's performance on AIME by 2.2%, leveraging simple examples only from the MATH dataset.
△ Less
Submitted 17 February, 2025; v1 submitted 6 January, 2025;
originally announced January 2025.
-
Observation of $ψ(3686) \to K^{-}Λ(1520)\barΞ^{+} + c.c.$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (642 additional authors not shown)
Abstract:
Based on $(2712.4 \pm 14.3)\times 10^6$ $ψ(3686)$ events collected at the BESIII detector operating at the BEPCII collider, we present the first observation of the decay $ψ(3686) \to K^{-}Λ(1520)\barΞ^{+} + c.c.$. The product branching fraction ${\cal B}[ψ(3686) \to K^{-}Λ(1520)\barΞ^{+} + c.c.] \times {\cal B}[Λ(1520) \to pK^{-}]$ is measured to be $(9.5 \pm 0.8 \pm 1.1) \times 10^{-7}$, where th…
▽ More
Based on $(2712.4 \pm 14.3)\times 10^6$ $ψ(3686)$ events collected at the BESIII detector operating at the BEPCII collider, we present the first observation of the decay $ψ(3686) \to K^{-}Λ(1520)\barΞ^{+} + c.c.$. The product branching fraction ${\cal B}[ψ(3686) \to K^{-}Λ(1520)\barΞ^{+} + c.c.] \times {\cal B}[Λ(1520) \to pK^{-}]$ is measured to be $(9.5 \pm 0.8 \pm 1.1) \times 10^{-7}$, where the first uncertainty is statistical and the second systematic.
△ Less
Submitted 5 January, 2025;
originally announced January 2025.
-
Facial Attractiveness Prediction in Live Streaming: A New Benchmark and Multi-modal Method
Authors:
Hui Li,
Xiaoyu Ren,
Hongjiu Yu,
Huiyu Duan,
Kai Li,
Ying Chen,
Libo Wang,
Xiongkuo Min,
Guangtao Zhai,
Xu Liu
Abstract:
Facial attractiveness prediction (FAP) has long been an important computer vision task, which could be widely applied in live streaming for facial retouching, content recommendation, etc. However, previous FAP datasets are either small, closed-source, or lack diversity. Moreover, the corresponding FAP models exhibit limited generalization and adaptation ability. To overcome these limitations, in t…
▽ More
Facial attractiveness prediction (FAP) has long been an important computer vision task, which could be widely applied in live streaming for facial retouching, content recommendation, etc. However, previous FAP datasets are either small, closed-source, or lack diversity. Moreover, the corresponding FAP models exhibit limited generalization and adaptation ability. To overcome these limitations, in this paper we present LiveBeauty, the first large-scale live-specific FAP dataset, in a more challenging application scenario, i.e., live streaming. 10,000 face images are collected from a live streaming platform directly, with 200,000 corresponding attractiveness annotations obtained from a well-devised subjective experiment, making LiveBeauty the largest open-access FAP dataset in the challenging live scenario. Furthermore, a multi-modal FAP method is proposed to measure the facial attractiveness in live streaming. Specifically, we first extract holistic facial prior knowledge and multi-modal aesthetic semantic features via a Personalized Attractiveness Prior Module (PAPM) and a Multi-modal Attractiveness Encoder Module (MAEM), respectively, then integrate the extracted features through a Cross-Modal Fusion Module (CMFM). Extensive experiments conducted on both LiveBeauty and other open-source FAP datasets demonstrate that our proposed method achieves state-of-the-art performance. Dataset will be available soon.
△ Less
Submitted 12 March, 2025; v1 submitted 5 January, 2025;
originally announced January 2025.
-
Search for $η_c(2S)\to p\bar{p}K^+K^-$ and measurement of $χ_{cJ}\to p\bar{p}K^+K^-$ in $ψ(3686)$ radiative decays
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai
, et al. (639 additional authors not shown)
Abstract:
A search for $η_c(2S)\to p\bar{p}K^+K^-$, together with measurement of branching fractions of $χ_{cJ(J=0,1,2)}\to p\bar{p}K^+K^-$ in the $ψ(3686) \to γη_c(2S)$ and the $ψ(3686) \to γχ_{cJ}$ radiative decays, is performed with $(2712.4\pm14.3)\times 10^6$ $ψ(3686)$ events collected with the BESIII detector at the BEPCII collider. An evidence for $η_c(2S)\to p\bar{p}K^+K^-$ is found, with a signific…
▽ More
A search for $η_c(2S)\to p\bar{p}K^+K^-$, together with measurement of branching fractions of $χ_{cJ(J=0,1,2)}\to p\bar{p}K^+K^-$ in the $ψ(3686) \to γη_c(2S)$ and the $ψ(3686) \to γχ_{cJ}$ radiative decays, is performed with $(2712.4\pm14.3)\times 10^6$ $ψ(3686)$ events collected with the BESIII detector at the BEPCII collider. An evidence for $η_c(2S)\to p\bar{p}K^+K^-$ is found, with a significance of $3.3σ$. The product branching fraction of $\mathcal{B}[ψ(3686)\toγη_c(2S)]\cdot\mathcal{B}[η_c(2S)\to p\bar{p}K^+K^-]$ is determined to be $(1.98\mkern 2mu\pm\mkern 2mu0.41_{\text{stat.}}\mkern 2mu\pm\mkern 2mu0.99_{\text{syst.}})\times 10^{-7}$. The product branching fractions of $\mathcal{B}[ψ(3686)\toγχ_{cJ}]\cdot\mathcal{B}[χ_{cJ}\to p\bar{p}K^+K^-]$ are measured to be $(2.49\mkern 2mu\pm\mkern 2mu 0.03_{\text{stat.}}\mkern 2mu\pm\mkern 2mu 0.15_{\text{syst.}})\times 10^{-5}$, $(1.83\mkern 2mu \pm\mkern 2mu 0.02_{\text{stat.}}\mkern 2mu \pm\mkern 2mu 0.11_{\text{syst.}})\times 10^{-5}$, and $(2.43\mkern 2mu\pm\mkern 2mu 0.02_{\text{stat.}}\mkern 2mu\pm\mkern 2mu 0.15_{\text{syst.}})\times 10^{-5}$, for $J=0,\ 1$, and 2, respectively.
△ Less
Submitted 3 January, 2025;
originally announced January 2025.
-
HarmonyIQA: Pioneering Benchmark and Model for Image Harmonization Quality Assessment
Authors:
Zitong Xu,
Huiyu Duan,
Guangji Ma,
Liu Yang,
Jiarui Wang,
Qingbo Wu,
Xiongkuo Min,
Guangtao Zhai,
Patrick Le Callet
Abstract:
Image composition involves extracting a foreground object from one image and pasting it into another image through Image harmonization algorithms (IHAs), which aim to adjust the appearance of the foreground object to better match the background. Existing image quality assessment (IQA) methods may fail to align with human visual preference on image harmonization due to the insensitivity to minor co…
▽ More
Image composition involves extracting a foreground object from one image and pasting it into another image through Image harmonization algorithms (IHAs), which aim to adjust the appearance of the foreground object to better match the background. Existing image quality assessment (IQA) methods may fail to align with human visual preference on image harmonization due to the insensitivity to minor color or light inconsistency. To address the issue and facilitate the advancement of IHAs, we introduce the first Image Quality Assessment Database for image Harmony evaluation (HarmonyIQAD), which consists of 1,350 harmonized images generated by 9 different IHAs, and the corresponding human visual preference scores. Based on this database, we propose a Harmony Image Quality Assessment (HarmonyIQA), to predict human visual preference for harmonized images. Extensive experiments show that HarmonyIQA achieves state-of-the-art performance on human visual preference evaluation for harmonized images, and also achieves competing results on traditional IQA tasks. Furthermore, cross-dataset evaluation also shows that HarmonyIQA exhibits better generalization ability than self-supervised learning-based IQA methods. Both HarmonyIQAD and HarmonyIQA will be made publicly available upon paper publication.
△ Less
Submitted 2 January, 2025;
originally announced January 2025.
-
ESVQA: Perceptual Quality Assessment of Egocentric Spatial Videos
Authors:
Xilei Zhu,
Huiyu Duan,
Liu Yang,
Yucheng Zhu,
Xiongkuo Min,
Guangtao Zhai,
Patrick Le Callet
Abstract:
With the rapid development of eXtended Reality (XR), egocentric spatial shooting and display technologies have further enhanced immersion and engagement for users. Assessing the quality of experience (QoE) of egocentric spatial videos is crucial to ensure a high-quality viewing experience. However, the corresponding research is still lacking. In this paper, we use the embodied experience to highli…
▽ More
With the rapid development of eXtended Reality (XR), egocentric spatial shooting and display technologies have further enhanced immersion and engagement for users. Assessing the quality of experience (QoE) of egocentric spatial videos is crucial to ensure a high-quality viewing experience. However, the corresponding research is still lacking. In this paper, we use the embodied experience to highlight this more immersive experience and study the new problem, i.e., embodied perceptual quality assessment for egocentric spatial videos. Specifically, we introduce the first Egocentric Spatial Video Quality Assessment Database (ESVQAD), which comprises 600 egocentric spatial videos and their mean opinion scores (MOSs). Furthermore, we propose a novel multi-dimensional binocular feature fusion model, termed ESVQAnet, which integrates binocular spatial, motion, and semantic features to predict the perceptual quality. Experimental results demonstrate the ESVQAnet outperforms 16 state-of-the-art VQA models on the embodied perceptual quality assessment task, and exhibits strong generalization capability on traditional VQA tasks. The database and codes will be released upon the publication.
△ Less
Submitted 29 December, 2024;
originally announced December 2024.
-
Measurement of Born cross section of $e^+e^-\toΣ^0\barΣ^0$ at $\sqrt{s} = 3.50-4.95$ GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (649 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at thirty-two center-of-mass energies from 3.50 to 4.95 GeV, corresponding to an integrated luminosity of 25 $\rm{fb^{-1}}$, we measure the Born cross section of the $e^+e^-\toΣ^0\barΣ^0$ reaction and the effective form factor. No significant charmonium(-like) state, i.e., $ψ(3770)$, $ψ(4040)$, $ψ(4160)$,…
▽ More
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at thirty-two center-of-mass energies from 3.50 to 4.95 GeV, corresponding to an integrated luminosity of 25 $\rm{fb^{-1}}$, we measure the Born cross section of the $e^+e^-\toΣ^0\barΣ^0$ reaction and the effective form factor. No significant charmonium(-like) state, i.e., $ψ(3770)$, $ψ(4040)$, $ψ(4160)$, $ψ(4230)$, $ψ(4360)$, $ψ(4415)$, or $ψ(4660)$, decaying into the $Σ^0\barΣ^0$ final state is observed by fitting the $e^+e^- \to Σ^0\barΣ^0$ dressed cross section. The upper limits for the product of the branching fraction and the electronic partial width at the 90% confidence level are provided for each assumed charmonium(-like) state. In addition, the ratios of the Born cross section and the effective form factor between the $e^+e^-\toΣ^0\barΣ^0$ and the $e^+e^-\toΣ^+\barΣ^-$ reactions are provided, which can be used to validate the prediction of the vector meson dominance model.
△ Less
Submitted 14 March, 2025; v1 submitted 28 December, 2024;
originally announced December 2024.
-
Search for the double Dalitz decays $η/η' \to e^+e^-μ^+μ^-$ and $η' \to μ^+μ^-μ^+μ^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (648 additional authors not shown)
Abstract:
Using a data sample of $(10087 \pm 44) \times {10^{6}}$ $J/ψ$ events collected with the BESIII detector, we search for the decays $η/η'\to e^+e^-μ^+μ^-$ and $η' \to μ^+μ^-μ^+μ^-$ via the radiative decays $J/ψ\toγη$/$γη'$. No excess of events over expected background is observed for any of the decays of interest. At 90% confidence level, we report the first upper limits on the branching fractions o…
▽ More
Using a data sample of $(10087 \pm 44) \times {10^{6}}$ $J/ψ$ events collected with the BESIII detector, we search for the decays $η/η'\to e^+e^-μ^+μ^-$ and $η' \to μ^+μ^-μ^+μ^-$ via the radiative decays $J/ψ\toγη$/$γη'$. No excess of events over expected background is observed for any of the decays of interest. At 90% confidence level, we report the first upper limits on the branching fractions of $η' \to e^{+}e^{-}μ^{+}μ^{-}$ and $η' \to μ^{+}μ^{-}μ^{+}μ^{-}$ to be $ 1.75 \times {10^{-6}}$ and $5.28 \times {10^{-7}}$, respectively. In addition, we set an upper limit on the branching fraction of $η\to e^{+}e^{-}μ^{+}μ^{-}$ to be $6.88 \times {10^{-6}}$, which improves the previous result by about two orders of magnitude.
△ Less
Submitted 27 December, 2024;
originally announced December 2024.
-
FineVQ: Fine-Grained User Generated Content Video Quality Assessment
Authors:
Huiyu Duan,
Qiang Hu,
Jiarui Wang,
Liu Yang,
Zitong Xu,
Lu Liu,
Xiongkuo Min,
Chunlei Cai,
Tianxiao Ye,
Xiaoyun Zhang,
Guangtao Zhai
Abstract:
The rapid growth of user-generated content (UGC) videos has produced an urgent need for effective video quality assessment (VQA) algorithms to monitor video quality and guide optimization and recommendation procedures. However, current VQA models generally only give an overall rating for a UGC video, which lacks fine-grained labels for serving video processing and recommendation applications. To a…
▽ More
The rapid growth of user-generated content (UGC) videos has produced an urgent need for effective video quality assessment (VQA) algorithms to monitor video quality and guide optimization and recommendation procedures. However, current VQA models generally only give an overall rating for a UGC video, which lacks fine-grained labels for serving video processing and recommendation applications. To address the challenges and promote the development of UGC videos, we establish the first large-scale Fine-grained Video quality assessment Database, termed FineVD, which comprises 6104 UGC videos with fine-grained quality scores and descriptions across multiple dimensions. Based on this database, we propose a Fine-grained Video Quality assessment (FineVQ) model to learn the fine-grained quality of UGC videos, with the capabilities of quality rating, quality scoring, and quality attribution. Extensive experimental results demonstrate that our proposed FineVQ can produce fine-grained video-quality results and achieve state-of-the-art performance on FineVD and other commonly used UGC-VQA datasets.
△ Less
Submitted 26 April, 2025; v1 submitted 26 December, 2024;
originally announced December 2024.
-
Exemplar-condensed Federated Class-incremental Learning
Authors:
Rui Sun,
Yumin Zhang,
Varun Ojha,
Tejal Shah,
Haoran Duan,
Bo Wei,
Rajiv Ranjan
Abstract:
We propose Exemplar-Condensed federated class-incremental learning (ECoral) to distil the training characteristics of real images from streaming data into informative rehearsal exemplars. The proposed method eliminates the limitations of exemplar selection in replay-based approaches for mitigating catastrophic forgetting in federated continual learning (FCL). The limitations particularly related t…
▽ More
We propose Exemplar-Condensed federated class-incremental learning (ECoral) to distil the training characteristics of real images from streaming data into informative rehearsal exemplars. The proposed method eliminates the limitations of exemplar selection in replay-based approaches for mitigating catastrophic forgetting in federated continual learning (FCL). The limitations particularly related to the heterogeneity of information density of each summarized data. Our approach maintains the consistency of training gradients and the relationship to past tasks for the summarized exemplars to represent the streaming data compared to the original images effectively. Additionally, our approach reduces the information-level heterogeneity of the summarized data by inter-client sharing of the disentanglement generative model. Extensive experiments show that our ECoral outperforms several state-of-the-art methods and can be seamlessly integrated with many existing approaches to enhance performance.
△ Less
Submitted 25 December, 2024;
originally announced December 2024.
-
NoiseHGNN: Synthesized Similarity Graph-Based Neural Network For Noised Heterogeneous Graph Representation Learning
Authors:
Xiong Zhang,
Cheng Xie,
Haoran Duan,
Beibei Yu
Abstract:
Real-world graph data environments intrinsically exist noise (e.g., link and structure errors) that inevitably disturb the effectiveness of graph representation and downstream learning tasks. For homogeneous graphs, the latest works use original node features to synthesize a similarity graph that can correct the structure of the noised graph. This idea is based on the homogeneity assumption, which…
▽ More
Real-world graph data environments intrinsically exist noise (e.g., link and structure errors) that inevitably disturb the effectiveness of graph representation and downstream learning tasks. For homogeneous graphs, the latest works use original node features to synthesize a similarity graph that can correct the structure of the noised graph. This idea is based on the homogeneity assumption, which states that similar nodes in the homogeneous graph tend to have direct links in the original graph. However, similar nodes in heterogeneous graphs usually do not have direct links, which can not be used to correct the original noise graph. This causes a significant challenge in noised heterogeneous graph learning. To this end, this paper proposes a novel synthesized similarity-based graph neural network compatible with noised heterogeneous graph learning. First, we calculate the original feature similarities of all nodes to synthesize a similarity-based high-order graph. Second, we propose a similarity-aware encoder to embed original and synthesized graphs with shared parameters. Then, instead of graph-to-graph supervising, we synchronously supervise the original and synthesized graph embeddings to predict the same labels. Meanwhile, a target-based graph extracted from the synthesized graph contrasts the structure of the metapath-based graph extracted from the original graph to learn the mutual information. Extensive experiments in numerous real-world datasets show the proposed method achieves state-of-the-art records in the noised heterogeneous graph learning tasks. In highlights, +5$\sim$6\% improvements are observed in several noised datasets compared with previous SOTA methods. The code and datasets are available at https://github.com/kg-cc/NoiseHGNN.
△ Less
Submitted 24 December, 2024;
originally announced December 2024.
-
A Tale of Three: Magnetic Fields along the Orion Integral-Shaped Filament as Revealed by JCMT BISTRO survey
Authors:
Jintai Wu,
Keping Qiu,
Frederick Poidevin,
Pierre Bastien,
Junhao Liu,
Tao-Chung Ching,
Tyler L. Bourke,
Derek Ward-Thompson,
Kate Pattle,
Doug Johnstone,
Patrick M. Koch,
Doris Arzoumanian,
Chang Won Lee,
Lapo Fanciullo,
Takashi Onaka,
Jihye Hwang,
Valentin J. M. Le Gouellec,
Archana Soam,
Motohide Tamura,
Mehrnoosh Tahani,
Chakali Eswaraiah,
Hua-Bai Li,
David Berry,
Ray S. Furuya,
Simon Coude
, et al. (130 additional authors not shown)
Abstract:
As part of the BISTRO survey, we present JCMT 850 $μ$m polarimetric observations towards the Orion Integral-Shaped Filament (ISF) that covers three portions known as OMC-1, OMC-2, and OMC-3. The magnetic field threading the ISF seen in the JCMT POL-2 map appears as a tale of three: pinched for OMC-1, twisted for OMC-2, and nearly uniform for OMC-3. A multi-scale analysis shows that the magnetic fi…
▽ More
As part of the BISTRO survey, we present JCMT 850 $μ$m polarimetric observations towards the Orion Integral-Shaped Filament (ISF) that covers three portions known as OMC-1, OMC-2, and OMC-3. The magnetic field threading the ISF seen in the JCMT POL-2 map appears as a tale of three: pinched for OMC-1, twisted for OMC-2, and nearly uniform for OMC-3. A multi-scale analysis shows that the magnetic field structure in OMC-3 is very consistent at all the scales, whereas the field structure in OMC-2 shows no correlation across different scales. In OMC-1, the field retains its mean orientation from large to small scales, but shows some deviations at small scales. Histograms of relative orientations between the magnetic field and filaments reveal a bimodal distribution for OMC-1, a relatively random distribution for OMC-2, and a distribution with a predominant peak at 90$^\circ$ for OMC-3. Furthermore, the magnetic fields in OMC-1 and OMC-3 both appear to be aligned perpendicular to the fibers, which are denser structures within the filament, but the field in OMC-2 is aligned along with the fibers. All these suggest that gravity, turbulence, and magnetic field are each playing a leading role in OMC-1, 2, and 3, respectively. While OMC-2 and 3 have almost the same gas mass, density, and non-thermal velocity dispersion, there are on average younger and fewer young stellar objects in OMC-3, providing evidence that a stronger magnetic field will induce slower and less efficient star formation in molecular clouds.
△ Less
Submitted 23 December, 2024;
originally announced December 2024.
-
ErasableMask: A Robust and Erasable Privacy Protection Scheme against Black-box Face Recognition Models
Authors:
Sipeng Shen,
Yunming Zhang,
Dengpan Ye,
Xiuwen Shi,
Long Tang,
Haoran Duan,
Jiacheng Deng,
Ziyi Liu
Abstract:
While face recognition (FR) models have brought remarkable convenience in face verification and identification, they also pose substantial privacy risks to the public. Existing facial privacy protection schemes usually adopt adversarial examples to disrupt face verification of FR models. However, these schemes often suffer from weak transferability against black-box FR models and permanently damag…
▽ More
While face recognition (FR) models have brought remarkable convenience in face verification and identification, they also pose substantial privacy risks to the public. Existing facial privacy protection schemes usually adopt adversarial examples to disrupt face verification of FR models. However, these schemes often suffer from weak transferability against black-box FR models and permanently damage the identifiable information that cannot fulfill the requirements of authorized operations such as forensics and authentication. To address these limitations, we propose ErasableMask, a robust and erasable privacy protection scheme against black-box FR models. Specifically, via rethinking the inherent relationship between surrogate FR models, ErasableMask introduces a novel meta-auxiliary attack, which boosts black-box transferability by learning more general features in a stable and balancing optimization strategy. It also offers a perturbation erasion mechanism that supports the erasion of semantic perturbations in protected face without degrading image quality. To further improve performance, ErasableMask employs a curriculum learning strategy to mitigate optimization conflicts between adversarial attack and perturbation erasion. Extensive experiments on the CelebA-HQ and FFHQ datasets demonstrate that ErasableMask achieves the state-of-the-art performance in transferability, achieving over 72% confidence on average in commercial FR systems. Moreover, ErasableMask also exhibits outstanding perturbation erasion performance, achieving over 90% erasion success rate.
△ Less
Submitted 29 December, 2024; v1 submitted 22 December, 2024;
originally announced December 2024.
-
Revealing the Black Box of Device Search Engine: Scanning Assets, Strategies, and Ethical Consideration
Authors:
Mengying Wu,
Geng Hong,
Jinsong Chen,
Qi Liu,
Shujun Tang,
Youhao Li,
Baojun Liu,
Haixin Duan,
Min Yang
Abstract:
In the digital age, device search engines such as Censys and Shodan play crucial roles by scanning the internet to catalog online devices, aiding in the understanding and mitigation of network security risks. While previous research has used these tools to detect devices and assess vulnerabilities, there remains uncertainty regarding the assets they scan, the strategies they employ, and whether th…
▽ More
In the digital age, device search engines such as Censys and Shodan play crucial roles by scanning the internet to catalog online devices, aiding in the understanding and mitigation of network security risks. While previous research has used these tools to detect devices and assess vulnerabilities, there remains uncertainty regarding the assets they scan, the strategies they employ, and whether they adhere to ethical guidelines. This study presents the first comprehensive examination of these engines' operational and ethical dimensions. We developed a novel framework to trace the IP addresses utilized by these engines and collected 1,407 scanner IPs. By uncovering their IPs, we gain deep insights into the actions of device search engines for the first time and gain original findings. By employing 28 honeypots to monitor their scanning activities extensively in one year, we demonstrate that users can hardly evade scans by blocklisting scanner IPs or migrating service ports. Our findings reveal significant ethical concerns, including a lack of transparency, harmlessness, and anonymity. Notably, these engines often fail to provide transparency and do not allow users to opt out of scans. Further, the engines send malformed requests, attempt to access excessive details without authorization, and even publish personally identifiable information (PII) and screenshots on search results. These practices compromise user privacy and expose devices to further risks by potentially aiding malicious entities. This paper emphasizes the urgent need for stricter ethical standards and enhanced transparency in the operations of device search engines, offering crucial insights into safeguarding against invasive scanning practices and protecting digital infrastructures.
△ Less
Submitted 20 December, 2024;
originally announced December 2024.
-
Measurement of the Branching Fraction for the Decay $χ_{cJ}\to p\bar{p}ηπ^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (642 additional authors not shown)
Abstract:
Using $(2712.4\pm 14.3)\times10^6 ψ(3686)$ events collected by the BESIII detector operating at the BEPCII collider, we present the first observations of the decays $χ_{cJ}(J=0,1,2)\to p\bar{p}ηπ^{0}$. Their decay branching fractions are determined to be ${\cal B}(χ_{c0}\to p\bar{p}ηπ^{0})=({2.41 \pm 0.07 \pm 0.19}) \times 10^{-4}$,…
▽ More
Using $(2712.4\pm 14.3)\times10^6 ψ(3686)$ events collected by the BESIII detector operating at the BEPCII collider, we present the first observations of the decays $χ_{cJ}(J=0,1,2)\to p\bar{p}ηπ^{0}$. Their decay branching fractions are determined to be ${\cal B}(χ_{c0}\to p\bar{p}ηπ^{0})=({2.41 \pm 0.07 \pm 0.19}) \times 10^{-4}$, ${\cal B}(χ_{c1}\to p\bar{p}ηπ^{0})=({1.95 \pm 0.05 \pm 0.12}) \times 10^{-4}$, and ${\cal B}(χ_{c2}\to p\bar{p}ηπ^{0})=({1.31 \pm 0.05 \pm 0.08}) \times 10^{-4}$, where the first uncertainties are statistical and the second systematic.
△ Less
Submitted 18 December, 2024; v1 submitted 18 December, 2024;
originally announced December 2024.
-
User-Generated Content and Editors in Games: A Comprehensive Survey
Authors:
Yuyue Liu,
Haihan Duan,
Wei Cai
Abstract:
User-Generated Content (UGC) refers to any form of content, such as posts and images, created by users rather than by professionals. In recent years, UGC has become an essential part of the evolving video game industry, influencing both game culture and community dynamics. The ability for users to actively contribute to the games they engage with has shifted the landscape of gaming from a one-dire…
▽ More
User-Generated Content (UGC) refers to any form of content, such as posts and images, created by users rather than by professionals. In recent years, UGC has become an essential part of the evolving video game industry, influencing both game culture and community dynamics. The ability for users to actively contribute to the games they engage with has shifted the landscape of gaming from a one-directional entertainment experience into a collaborative, user-driven ecosystem. Therefore, this growing trend highlights the urgent need for summarizing the current UGC development in game industry. Our conference paper has systematically classified the existing UGC in games and the UGC editors separately into four types. However, the previous survey lacks the depth and precision necessary to capture the wide-ranging and increasingly complex nature of UGC. To this end, as an extension of previous work, this paper presents a refined and expanded classification of UGC and UGC editors within video games, offering a more robust and comprehensive framework with representative cases that better reflects the diversity and nuances of contemporary user-generated contributions. Moreover, we provide our insights on the future of UGC, involving game culture, game genre and user creative tendencies, artificial intelligence, its potential ethical considerations, and relationship between games, users and communities.
△ Less
Submitted 18 December, 2024;
originally announced December 2024.
-
F-Bench: Rethinking Human Preference Evaluation Metrics for Benchmarking Face Generation, Customization, and Restoration
Authors:
Lu Liu,
Huiyu Duan,
Qiang Hu,
Liu Yang,
Chunlei Cai,
Tianxiao Ye,
Huayu Liu,
Xiaoyun Zhang,
Guangtao Zhai
Abstract:
Artificial intelligence generative models exhibit remarkable capabilities in content creation, particularly in face image generation, customization, and restoration. However, current AI-generated faces (AIGFs) often fall short of human preferences due to unique distortions, unrealistic details, and unexpected identity shifts, underscoring the need for a comprehensive quality evaluation framework f…
▽ More
Artificial intelligence generative models exhibit remarkable capabilities in content creation, particularly in face image generation, customization, and restoration. However, current AI-generated faces (AIGFs) often fall short of human preferences due to unique distortions, unrealistic details, and unexpected identity shifts, underscoring the need for a comprehensive quality evaluation framework for AIGFs. To address this need, we introduce FaceQ, a large-scale, comprehensive database of AI-generated Face images with fine-grained Quality annotations reflecting human preferences. The FaceQ database comprises 12,255 images generated by 29 models across three tasks: (1) face generation, (2) face customization, and (3) face restoration. It includes 32,742 mean opinion scores (MOSs) from 180 annotators, assessed across multiple dimensions: quality, authenticity, identity (ID) fidelity, and text-image correspondence. Using the FaceQ database, we establish F-Bench, a benchmark for comparing and evaluating face generation, customization, and restoration models, highlighting strengths and weaknesses across various prompts and evaluation dimensions. Additionally, we assess the performance of existing image quality assessment (IQA), face quality assessment (FQA), AI-generated content image quality assessment (AIGCIQA), and preference evaluation metrics, manifesting that these standard metrics are relatively ineffective in evaluating authenticity, ID fidelity, and text-image correspondence. The FaceQ database will be publicly available upon publication.
△ Less
Submitted 17 December, 2024;
originally announced December 2024.
-
Observation of the charmonium decay $η_c\toγγ$ in $J/ψ\toγη_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai
, et al. (658 additional authors not shown)
Abstract:
Using $(2712.4\pm14.3)\times10^{6}$ $ψ(3686)$ events collected with the BESIII detector at the BEPCII collider, the decay $η_c\toγγ$ in $J/ψ\toγη_c$ is observed. We determine the product branching fraction $\mathcal{B}(J/ψ\toγη_c)\times\mathcal{B}(η_c\toγγ)=(5.23\pm0.26_{\rm{stat.}}\pm0.30_{\rm{syst.}})\times10^{-6}$. This result is consistent with the LQCD calculation…
▽ More
Using $(2712.4\pm14.3)\times10^{6}$ $ψ(3686)$ events collected with the BESIII detector at the BEPCII collider, the decay $η_c\toγγ$ in $J/ψ\toγη_c$ is observed. We determine the product branching fraction $\mathcal{B}(J/ψ\toγη_c)\times\mathcal{B}(η_c\toγγ)=(5.23\pm0.26_{\rm{stat.}}\pm0.30_{\rm{syst.}})\times10^{-6}$. This result is consistent with the LQCD calculation $(5.34\pm0.16)\times10^{-6}$ from HPQCD in 2023. By using the world-average values of $\mathcal{B}(J/ψ\toγη_c)$ and the total decay width of $η_c$, the partial decay width $Γ(η_c\toγγ)$ is determined to be $(11.30\pm0.56_{\rm{stat.}}\pm0.66_{\rm{syst.}}\pm1.14_{\rm{ref.}})~\rm{keV}$, which deviates from the corresponding world-average value by $3.4σ$.
△ Less
Submitted 2 April, 2025; v1 submitted 17 December, 2024;
originally announced December 2024.
-
Amplitude analysis and branching fraction measurement of the Cabibbo-favored decay $D^+ \to K^-π^+π^+π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (651 additional authors not shown)
Abstract:
An amplitude analysis of the Cabibbo-favored decay $D^+ \to K^-π^+π^+π^0$ is performed, using 7.93 $\rm{fb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector at the center-of-mass energy of 3.773 GeV. The branching fractions of the intermediate processes are measured, with the dominant contribution $D^+ \to \bar{K}^{*}(892)^0ρ(770)^+$ observed to have a branching fraction of…
▽ More
An amplitude analysis of the Cabibbo-favored decay $D^+ \to K^-π^+π^+π^0$ is performed, using 7.93 $\rm{fb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector at the center-of-mass energy of 3.773 GeV. The branching fractions of the intermediate processes are measured, with the dominant contribution $D^+ \to \bar{K}^{*}(892)^0ρ(770)^+$ observed to have a branching fraction of $(4.15\pm0.07_{\rm stat.}\pm0.17_{\rm syst.})\%$. With the detection efficiency derived from the amplitude analysis, the absolute branching fraction of $D^+ \to K^-π^+π^+π^0$ is measured to be $(6.06\pm0.04_{\rm stat.}\pm0.07_{\rm syst.})\%$.
△ Less
Submitted 14 December, 2024;
originally announced December 2024.
-
Know Unreported Roadway Incidents in Real-time: Early Traffic Anomaly Detection
Authors:
Haocheng Duan,
Hao Wu,
Sean Qian
Abstract:
This research aims to know traffic anomalies as early as possible. A traffic anomaly refers to a generic incident on the road that influences traffic flow and calls for urgent traffic management measures. `Knowing'' the occurrence of a traffic anomaly is twofold: the ability to detect this anomaly before it is reported anywhere, or it may be such that an anomaly can be predicted before it actually…
▽ More
This research aims to know traffic anomalies as early as possible. A traffic anomaly refers to a generic incident on the road that influences traffic flow and calls for urgent traffic management measures. `Knowing'' the occurrence of a traffic anomaly is twofold: the ability to detect this anomaly before it is reported anywhere, or it may be such that an anomaly can be predicted before it actually occurs on the road (e.g., non-recurrent traffic breakdown). In either way, the objective is to inform traffic operators of unreported incidents in real time and as early as possible. The key is to stay ahead of the curve. Time is of the essence.
Conventional automatic incident detection (AID) methods often struggle with early detection due to their limited consideration of spatial effects and early-stage characteristics. Therefore, we propose a deep learning framework utilizing prior domain knowledge and model-designing strategies. This allows the model to detect a broader range of anomalies, not only incidents that significantly influence traffic flow but also early characteristics of incidents along with historically unreported anomalies. We specially design the model to target the early-stage detection/prediction of an incident. Additionally, unlike most conventional AID studies, our method is highly scalable and generalizable, as it is fully automated with no manual selection of historical reports required, relies solely on widely available low-cost data, and requires no additional detectors. The experimental results across numerous road segments on different maps demonstrate that our model leads to more effective and early anomaly detection.
△ Less
Submitted 23 April, 2025; v1 submitted 14 December, 2024;
originally announced December 2024.