-
Study of the electromagnetic Dalitz decay $J/ψ\to e^+e^- π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
We study the electromagnetic Dalitz decay $J/ψ\to e^+e^- π^0$ using $(10087 \pm 44) \times 10^6$ $J/ψ$ events collected by the BESIII detector. The di-electron-invariant-mass dependent transition form factor of this decay is explored for the first time. A significant resonant structure corresponding to the $ρ/ω$ resonance is observed, which cannot be described by existing theoretical models, due to contributions from the isospin-conserving $J/ψ\to ρπ^0$ and isospin-violating $J/ψ\to ωπ^0$ decays. The observed $ρ$--$ω$ interference is consistent with that of the pion form factor but features a relatively narrow $ρ$ peak. By taking into account the contribution of this resonant structure, the branching fraction of $J/ψ\to e^+e^- π^0$ in the full $e^+e^-$ invariant mass spectrum range is also measured for the first time to be $(8.06 \pm 0.31 (\rm{stat}) \pm 0.38 (\rm{syst}))\times 10^{-7}$, which is two times larger than the prediction of the Vector Meson Dominance model due to the observed resonant contribution of $ρ/ω$ resonances.
Submitted 8 January, 2025;
originally announced January 2025.
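The numbers quoted in the abstract above can be put in perspective with a short calculation. The sketch below assumes the statistical and systematic uncertainties are independent, so that they combine in quadrature, and it ignores detection efficiency, which the abstract does not give; the function names are hypothetical.

```python
import math


def combine_in_quadrature(stat: float, syst: float) -> float:
    """Total uncertainty, assuming stat and syst are independent."""
    return math.hypot(stat, syst)


def expected_decays(branching_fraction: float, n_parent: float) -> float:
    """Number of decays produced (before any efficiency correction)."""
    return branching_fraction * n_parent


# Values quoted in the abstract, in units of 1e-7 for the uncertainties.
total_unc = combine_in_quadrature(0.31, 0.38)
relative_unc = total_unc / 8.06
n_produced = expected_decays(8.06e-7, 10087e6)

print(f"total uncertainty: {total_unc:.2f} x 1e-7")
print(f"relative uncertainty: {relative_unc:.1%}")
print(f"decays produced in the sample: {n_produced:.0f}")
```

Under these assumptions the total uncertainty is about $0.49 \times 10^{-7}$ (roughly 6% of the central value), and the data sample contains on the order of $8 \times 10^3$ such decays before selection.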
-
Uncovering underappreciated physical effects hidden in the cosmic-ray electron spectra at very high-energy
Authors:
Wei Zhu,
Yu-Chen Tang,
Feng-zheng Zhu,
Bo Yang
Abstract:
We show that the behavior of the cosmic ray electron spectrum in the TeV energy band near the Earth is dominated by gluon condensation and anomalous electron/positron pair-production in Cygnus X.
Submitted 7 January, 2025;
originally announced January 2025.
-
Private, Auditable, and Distributed Ledger for Financial Institutes
Authors:
Shaltiel Eloul,
Yash Satsangi,
Yeoh Wei Zhu,
Omar Amer,
Georgios Papadopoulos,
Marco Pistoia
Abstract:
Distributed ledger technology offers several advantages for the banking and finance industry, including efficient transaction processing and cross-party transaction reconciliation. The key challenges for adoption of this technology in financial institutes are (a) the building of a privacy-preserving ledger, (b) supporting auditing and regulatory requirements, and (c) flexibility to adapt to complex use-cases with multiple digital assets and actors. This paper proposes a framework for a private, auditable, and distributed ledger (PADL) that adapts easily to fundamental use-cases within financial institutes. PADL employs widely-used cryptography schemes combined with zero-knowledge proofs to propose a transaction scheme for a `table'-like ledger. It enables fast confidential peer-to-peer multi-asset transactions, and transaction graph anonymity, in a no-trust setup, but with customized privacy. We prove that the integrity and anonymity of PADL are secured against a strong threat model. Furthermore, we showcase three fundamental real-life use-cases, namely, an assets exchange ledger, a settlement ledger, and a bond market ledger. Based on these use-cases we show that PADL supports smooth-lined inter-assets auditing while preserving privacy of the participants. For example, we show how a bank can be audited for its liquidity or credit risk without violation of privacy of itself or any other party, or how PADL can ensure honest coupon rate payment in a bond market without sharing investors' values. Finally, our evaluation shows PADL's advantage in performance against previous relevant schemes.
Submitted 7 January, 2025;
originally announced January 2025.
-
Sequence Complementor: Complementing Transformers For Time Series Forecasting with Learnable Sequences
Authors:
Xiwen Chen,
Peijie Qiu,
Wenhui Zhu,
Huayu Li,
Hao Wang,
Aristeidis Sotiras,
Yalin Wang,
Abolfazl Razi
Abstract:
Since its introduction, the transformer has shifted the development trajectory away from traditional models (e.g., RNN, MLP) in time series forecasting, which is attributed to its ability to capture global dependencies within temporal tokens. Follow-up studies have largely involved altering the tokenization and self-attention modules to better adapt Transformers for addressing special challenges like non-stationarity, channel-wise dependency, and variable correlation in time series. However, we found that the expressive capability of sequence representation is a key factor influencing Transformer performance in time series forecasting after investigating several representative methods, where there is an almost linear relationship between sequence representation entropy and mean square error, with more diverse representations performing better. In this paper, we propose a novel attention mechanism with Sequence Complementors and prove it feasible from an information theory perspective, where these learnable sequences are able to provide complementary information beyond current input to feed attention. We further enhance the Sequence Complementors via a diversification loss that is theoretically covered. The empirical evaluation of both long-term and short-term forecasting has confirmed its superiority over the recent state-of-the-art methods.
Submitted 5 January, 2025;
originally announced January 2025.
-
Observation of $ψ(3686) \to K^{-}Λ(1520)\barΞ^{+} + c.c.$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (642 additional authors not shown)
Abstract:
Based on $(2712.4 \pm 14.3)\times 10^6$ $ψ(3686)$ events collected at the BESIII detector operating at the BEPCII collider, we present the first observation of the decay $ψ(3686) \to K^{-}Λ(1520)\barΞ^{+} + c.c.$. The product branching fraction ${\cal B}[ψ(3686) \to K^{-}Λ(1520)\barΞ^{+} + c.c.] \times {\cal B}[Λ(1520) \to pK^{-}]$ is measured to be $(9.5 \pm 0.8 \pm 1.1) \times 10^{-7}$, where the first uncertainty is statistical and the second systematic.
Submitted 5 January, 2025;
originally announced January 2025.
-
Search for $η_c(2S)\to p\bar{p}K^+K^-$ and measurement of $χ_{cJ}\to p\bar{p}K^+K^-$ in $ψ(3686)$ radiative decays
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai
, et al. (639 additional authors not shown)
Abstract:
A search for $η_c(2S)\to p\bar{p}K^+K^-$, together with measurement of branching fractions of $χ_{cJ(J=0,1,2)}\to p\bar{p}K^+K^-$ in the $ψ(3686) \to γη_c(2S)$ and the $ψ(3686) \to γχ_{cJ}$ radiative decays, is performed with $(2712.4\pm14.3)\times 10^6$ $ψ(3686)$ events collected with the BESIII detector at the BEPCII collider. Evidence for $η_c(2S)\to p\bar{p}K^+K^-$ is found, with a significance of $3.3σ$. The product branching fraction of $\mathcal{B}[ψ(3686)\toγη_c(2S)]\cdot\mathcal{B}[η_c(2S)\to p\bar{p}K^+K^-]$ is determined to be $(1.98\mkern 2mu\pm\mkern 2mu0.41_{\text{stat.}}\mkern 2mu\pm\mkern 2mu0.99_{\text{syst.}})\times 10^{-7}$. The product branching fractions of $\mathcal{B}[ψ(3686)\toγχ_{cJ}]\cdot\mathcal{B}[χ_{cJ}\to p\bar{p}K^+K^-]$ are measured to be $(2.49\mkern 2mu\pm\mkern 2mu 0.03_{\text{stat.}}\mkern 2mu\pm\mkern 2mu 0.15_{\text{syst.}})\times 10^{-5}$, $(1.83\mkern 2mu \pm\mkern 2mu 0.02_{\text{stat.}}\mkern 2mu \pm\mkern 2mu 0.11_{\text{syst.}})\times 10^{-5}$, and $(2.43\mkern 2mu\pm\mkern 2mu 0.02_{\text{stat.}}\mkern 2mu\pm\mkern 2mu 0.15_{\text{syst.}})\times 10^{-5}$, for $J=0,\ 1$, and 2, respectively.
Submitted 3 January, 2025;
originally announced January 2025.
-
Efficient Implementation of Third-order Tensor Methods with Adaptive Regularization for Unconstrained Optimization
Authors:
Coralia Cartis,
Raphael Hauser,
Yang Liu,
Karl Welzel,
Wenqi Zhu
Abstract:
High-order tensor methods that employ local Taylor models of degree $p$ within adaptive regularization frameworks (AR$p$) have recently received significant attention, due to their optimal global and local rates of convergence for both convex and nonconvex optimization problems. However, their numerical performance for general unconstrained optimization problems remains insufficiently explored, which we address by showcasing the numerical performance of standard second- and third-order variants ($p=2,3$) and proposing novel techniques for key algorithmic aspects when $p\geq3$ to improve numerical efficiency. To improve the adaptive choice of the regularization parameter, we extend the interpolation-based updating strategy introduced in (Gould, Porcelli, and Toint, 2012) for $p=2$ to $p\geq3$. We identify fundamental differences between the local minima of regularized subproblems for $p=2$ and $p\geq3$ and their effect on performance. Then, for $p\geq3$, we introduce a novel pre-rejection technique that rejects poor subproblem minimizers (referred to as `transient') before any function evaluation, reducing cost and selecting useful (`persistent') ones. Numerical studies confirm efficiency improvements in our modified AR$3$ algorithm. We also assess the effect of different subproblem termination conditions and the choice of the initial regularization parameter on overall performance. Finally, we benchmark our best-performing AR$3$ variants, along with those in (Birgin et al., 2020), against second-order ones (AR$2$). Encouraging results on standard test problems confirm that AR$3$ variants can outperform AR$2$ in terms of objective evaluations, derivative evaluations, and subproblem solves. We provide an efficient, extensive, and modular MATLAB software package including various AR$2$ and AR$3$ variants, allowing ease of use and experimentation for interested users.
Submitted 28 February, 2025; v1 submitted 31 December, 2024;
originally announced January 2025.
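For orientation, the regularized local model that AR$p$ methods minimize at each iteration is commonly written as below. The notation ($x_k$ for the current iterate, $s$ for the step, $σ_k$ for the regularization parameter, $T_p$ for the Taylor expansion) follows the usual adaptive-regularization literature and is an assumption here, since the abstract above does not spell it out.

```latex
m_k(s) = T_p(x_k, s) + \frac{\sigma_k}{p+1}\,\|s\|^{p+1},
\qquad
T_p(x_k, s) = f(x_k) + \sum_{j=1}^{p} \frac{1}{j!}\,\nabla^j f(x_k)[s]^j .
```

For $p=3$ this is a cubic Taylor model with a quartic regularization term $\frac{σ_k}{4}\|s\|^4$; for $p=2$ it reduces to the familiar cubic-regularized quadratic model.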
-
Automatically Planning Optimal Parallel Strategy for Large Language Models
Authors:
Zongbiao Li,
Xiezhao Li,
Yinghao Cui,
Yijun Chen,
Zhixuan Gu,
Yuxuan Liu,
Wenbo Zhu,
Fei Jia,
Ke Liu,
Qifeng Li,
Junyao Zhan,
Jiangtao Zhou,
Chenxi Zhang,
Qike Liu
Abstract:
The number of parameters in large-scale language models based on transformers is gradually increasing, and the scale of computing clusters is also growing. The technology of quickly mobilizing large amounts of computing resources for parallel computing is becoming increasingly important. In this paper, we propose an automatic parallel algorithm that automatically plans the parallel strategy with maximum throughput based on model and hardware information. By decoupling the training time into computation, communication, and overlap, we established a training duration simulation model. Based on this simulation model, we prune the parallel solution space to shorten the search time required. The multi-node experiment results show that the algorithm can estimate the parallel training duration in real time with an average accuracy of 96%. In our test, the recommendation strategy provided by the algorithm is always globally optimal.
Submitted 30 December, 2024;
originally announced January 2025.
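The abstract above describes decoupling training time into computation, communication, and overlap, then searching over parallel strategies. A minimal sketch of that idea follows. It is not the authors' simulation model: the additive cost formula, the three-way (data, tensor, pipeline) factorization, and all function names are assumptions made for illustration.

```python
from typing import Iterator, Tuple


def step_time(compute_s: float, comm_s: float, overlap_s: float) -> float:
    """Estimated step duration: compute plus communication, minus the part
    of communication hidden behind compute (assumed additive model)."""
    # The overlap can never exceed either of the two phases it hides in.
    overlap_s = min(overlap_s, compute_s, comm_s)
    return compute_s + comm_s - overlap_s


def candidate_strategies(num_devices: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (data, tensor, pipeline) parallel degrees whose product
    equals the number of devices."""
    for dp in range(1, num_devices + 1):
        if num_devices % dp:
            continue
        rest = num_devices // dp
        for tp in range(1, rest + 1):
            if rest % tp == 0:
                yield dp, tp, rest // tp


if __name__ == "__main__":
    for strategy in candidate_strategies(4):
        print(strategy)
    print(step_time(compute_s=10.0, comm_s=4.0, overlap_s=3.0))
```

A real planner would attach a per-strategy estimate of the three terms (from model and hardware information) and discard candidates early, for example those that exceed device memory, before ranking the rest by estimated step time.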
-
Multimodal Variational Autoencoder: a Barycentric View
Authors:
Peijie Qiu,
Wenhui Zhu,
Sayantan Kumar,
Xiwen Chen,
Xiaotong Sun,
Jin Yang,
Abolfazl Razi,
Yalin Wang,
Aristeidis Sotiras
Abstract:
Multiple signal modalities, such as vision and sounds, are naturally present in real-world phenomena. Recently, there has been growing interest in learning generative models, in particular variational autoencoders (VAEs), for multimodal representation learning, especially in the case of missing modalities. The primary goal of these models is to learn a modality-invariant and modality-specific representation that characterizes information across multiple modalities. Previous attempts at multimodal VAEs approach this mainly through the lens of experts, aggregating unimodal inference distributions with a product of experts (PoE), a mixture of experts (MoE), or a combination of both. In this paper, we provide an alternative generic and theoretical formulation of multimodal VAE through the lens of barycenter. We first show that PoE and MoE are specific instances of barycenters, derived by minimizing the asymmetric weighted KL divergence to unimodal inference distributions. Our novel formulation extends these two barycenters to a more flexible choice by considering different types of divergences. In particular, we explore the Wasserstein barycenter defined by the 2-Wasserstein distance, which better preserves the geometry of unimodal distributions by capturing both modality-specific and modality-invariant representations compared to KL divergence. Empirical studies on three multimodal benchmarks demonstrated the effectiveness of the proposed method.
Submitted 29 December, 2024;
originally announced December 2024.
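To make the aggregation rules named in the abstract above concrete, the sketch below combines one-dimensional Gaussian "experts" in two ways: a product of experts and a 2-Wasserstein barycenter. It assumes an unweighted product with no prior expert and uses the closed-form Gaussian results; the function names are hypothetical and nothing here is taken from the paper's code.

```python
from typing import List, Sequence, Tuple

Gaussian = Tuple[float, float]  # (mean, standard deviation)


def product_of_experts(experts: List[Gaussian]) -> Gaussian:
    """Product of 1-D Gaussian densities: precisions add, and the mean
    is the precision-weighted average of the expert means."""
    precisions = [1.0 / (std * std) for _, std in experts]
    total = sum(precisions)
    mean = sum(p * m for p, (m, _) in zip(precisions, experts)) / total
    return mean, total ** -0.5


def wasserstein_barycenter(experts: List[Gaussian],
                           weights: Sequence[float]) -> Gaussian:
    """2-Wasserstein barycenter of 1-D Gaussians: both the mean and the
    standard deviation are weighted averages of the inputs."""
    total = sum(weights)
    w = [x / total for x in weights]
    mean = sum(wi * m for wi, (m, _) in zip(w, experts))
    std = sum(wi * s for wi, (_, s) in zip(w, experts))
    return mean, std


if __name__ == "__main__":
    experts = [(0.0, 1.0), (2.0, 3.0)]
    print(product_of_experts(experts))
    print(wasserstein_barycenter(experts, [1.0, 1.0]))
```

The product always yields a distribution narrower than its sharpest expert, whereas the Wasserstein barycenter interpolates the spreads, which gives one intuition for the geometry-preserving behavior the abstract mentions.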
-
Measurement of Born cross section of $e^+e^-\toΣ^0\barΣ^0$ at $\sqrt{s} = 3.50-4.95$ GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (649 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at thirty-two center-of-mass energies from 3.50 to 4.95 GeV, corresponding to an integrated luminosity of 25 $\rm{fb^{-1}}$, we measure the Born cross section of the $e^+e^-\toΣ^0\barΣ^0$ reaction and the effective form factor. No significant charmonium(-like) state, i.e., $ψ(3770)$, $ψ(4040)$, $ψ(4160)$, $ψ(4230)$, $ψ(4360)$, $ψ(4415)$, or $ψ(4660)$, decaying into the $Σ^0\barΣ^0$ final state is observed by fitting the $e^+e^- \to Σ^0\barΣ^0$ dressed cross section. The upper limits for the product of the branching fraction and the electronic partial width at the 90% confidence level are provided for each assumed charmonium(-like) state. In addition, the ratios of the Born cross section and the effective form factor between the $e^+e^-\toΣ^0\barΣ^0$ and the $e^+e^-\toΣ^+\barΣ^-$ reactions are provided, which can be used to validate the prediction of the vector meson dominance model.
Submitted 14 March, 2025; v1 submitted 28 December, 2024;
originally announced December 2024.
-
Search for the double Dalitz decays $η/η' \to e^+e^-μ^+μ^-$ and $η' \to μ^+μ^-μ^+μ^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (648 additional authors not shown)
Abstract:
Using a data sample of $(10087 \pm 44) \times {10^{6}}$ $J/ψ$ events collected with the BESIII detector, we search for the decays $η/η'\to e^+e^-μ^+μ^-$ and $η' \to μ^+μ^-μ^+μ^-$ via the radiative decays $J/ψ\toγη$/$γη'$. No excess of events over expected background is observed for any of the decays of interest. At 90% confidence level, we report the first upper limits on the branching fractions of $η' \to e^{+}e^{-}μ^{+}μ^{-}$ and $η' \to μ^{+}μ^{-}μ^{+}μ^{-}$ to be $ 1.75 \times {10^{-6}}$ and $5.28 \times {10^{-7}}$, respectively. In addition, we set an upper limit on the branching fraction of $η\to e^{+}e^{-}μ^{+}μ^{-}$ to be $6.88 \times {10^{-6}}$, which improves the previous result by about two orders of magnitude.
Submitted 27 December, 2024;
originally announced December 2024.
-
Movable Intelligent Surface (MIS) for Wireless Communications: Architecture, Modeling, Algorithm, and Prototyping
Authors:
Ziyuan Zheng,
Qingqing Wu,
Wen Chen,
Xiangming Wu,
Weiren Zhu
Abstract:
Reconfigurable intelligent surfaces enhance wireless systems by reshaping propagation environments. However, dynamic metasurfaces (MSs) with numerous phase-shift elements incur undesired control and hardware costs. In contrast, static MSs (SMSs), configured with static phase shifts pre-designed for specific communication demands, offer a cost-effective alternative by eliminating element-wise tuning. Nevertheless, SMSs typically support a single beam pattern with limited flexibility. In this paper, we propose a novel Movable Intelligent Surface (MIS) technology that enables dynamic beamforming while maintaining static phase shifts. Specifically, we design a MIS architecture comprising two closely stacked transmissive MSs: a larger fixed-position MS 1 and a smaller movable MS 2. By differentially shifting MS 2's position relative to MS 1, the MIS synthesizes distinct beam patterns. Then, we model the interaction between MS 2 and MS 1 using binary selection matrices and padding vectors and formulate a new optimization problem that jointly designs the MIS phase shifts and selects shifting positions for worst-case signal-to-noise ratio maximization. This position selection, equal to beam pattern scheduling, offers a new degree of freedom for RIS-aided systems. To solve the intractable problem, we develop an efficient algorithm that handles unit-modulus and binary constraints and employs manifold optimization methods. Finally, extensive validation results are provided. We implement a MIS prototype and perform proof-of-concept experiments, demonstrating the MIS's ability to synthesize desired beam patterns that achieve one-dimensional beam steering. Numerical results show that by introducing MS 2 with a few elements, MIS effectively offers beamforming flexibility for significantly improved performance. We also draw insights into the optimal MIS configuration and element allocation strategy.
Submitted 26 December, 2024;
originally announced December 2024.
-
Single Trajectory Distillation for Accelerating Image and Video Style Transfer
Authors:
Sijie Xu,
Runqi Wang,
Wei Zhu,
Dejia Song,
Nemo Chen,
Xu Tang,
Yao Hu
Abstract:
Diffusion-based stylization methods typically denoise from a specific partial noise state for image-to-image and video-to-video tasks. This multi-step diffusion process is computationally expensive and hinders real-world application. A promising solution to speed up the process is to obtain few-step consistency models through trajectory distillation. However, current consistency models only force the initial-step alignment between the probability flow ODE (PF-ODE) trajectories of the student and the imperfect teacher models. This training strategy cannot ensure the consistency of whole trajectories. To address this issue, we propose single trajectory distillation (STD) starting from a specific partial noise state. We introduce a trajectory bank to store the teacher model's trajectory states, mitigating the time cost during training. Besides, we use an asymmetric adversarial loss to enhance the style and quality of the generated images. Extensive experiments on image and video stylization demonstrate that our method surpasses existing acceleration models in terms of style similarity and aesthetic evaluations. Our code and results will be available on the project page: https://single-trajectory-distillation.github.io.
Submitted 25 December, 2024;
originally announced December 2024.
-
SMAC-Hard: Enabling Mixed Opponent Strategy Script and Self-play on SMAC
Authors:
Yue Deng,
Yan Yu,
Weiyu Ma,
Zirui Wang,
Wenhui Zhu,
Jian Zhao,
Yin Zhang
Abstract:
The availability of challenging simulation environments is pivotal for advancing the field of Multi-Agent Reinforcement Learning (MARL). In cooperative MARL settings, the StarCraft Multi-Agent Challenge (SMAC) has gained prominence as a benchmark for algorithms following the centralized training with decentralized execution paradigm. However, with continual advancements in SMAC, many algorithms now exhibit near-optimal performance, complicating the evaluation of their true effectiveness. To alleviate this problem, in this work, we highlight a critical issue: the default opponent policy in these environments lacks sufficient diversity, leading MARL algorithms to overfit and exploit unintended vulnerabilities rather than learning robust strategies. To overcome these limitations, we propose SMAC-HARD, a novel benchmark designed to enhance training robustness and evaluation comprehensiveness. SMAC-HARD supports customizable opponent strategies, randomization of adversarial policies, and interfaces for MARL self-play, enabling agents to generalize to varying opponent behaviors and improve model stability. Furthermore, we introduce a black-box testing framework wherein agents are trained without exposure to the edited opponent scripts but are tested against these scripts to evaluate the policy coverage and adaptability of MARL algorithms. We conduct extensive evaluations of widely used and state-of-the-art algorithms on SMAC-HARD, revealing the substantial challenges posed by edited and mixed strategy opponents. Additionally, the black-box strategy tests illustrate the difficulty of transferring learned policies to unseen adversaries. We envision SMAC-HARD as a critical step toward benchmarking the next generation of MARL algorithms, fostering progress in self-play methods for multi-agent systems. Our code is available at https://github.com/devindeng94/smac-hard.
Submitted 24 December, 2024; v1 submitted 23 December, 2024;
originally announced December 2024.
-
Electrical Manipulation of Spin Splitting Torque in Altermagnetic RuO2
Authors:
Yichi Zhang,
Hua Bai,
Lei Han,
Jiankun Dai,
Chong Chen,
Shixuan Liang,
Yanzhang Cao,
Yingying Zhang,
Qian Wang,
Wenxuan Zhu,
Feng Pan,
Cheng Song
Abstract:
Due to the nonrelativistic altermagnetic spin splitting effect (ASSE), altermagnets can generate time-reversal-odd spin current and spin splitting torque (SST) with spin polarization parallel to the Néel vector. Hence the effective manipulation of SST, which remains elusive, would provide plenty of opportunities for designable spintronic devices. Here, the electrical control of SST is achieved in altermagnetic RuO2, based on the controllable Néel vector of RuO2 and the Néel vector-dependent generation of SST. We demonstrate the current-induced switching of the Néel vector via spin-orbit torque in RuO2 films, according to the reversible polarity of electrical transport measurements and X-ray magnetic linear dichroism (XMLD). The XMLD also unprecedentedly demonstrates that the Néel vector really exists in altermagnets. The switching of the Néel vector to the current direction and the resultantly enhanced spin polarization parallel to the Néel vector bring about stronger ASSE-induced spin current. Our findings not only enrich the properties of altermagnets but also pave the way for high speed memories and nano-oscillators with excellent controllability and efficiency.
Submitted 22 December, 2024;
originally announced December 2024.
-
Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation
Authors:
Shengqi Liu,
Yuhao Cheng,
Zhuo Chen,
Xingyu Ren,
Wenhan Zhu,
Lincheng Li,
Mengxiao Bi,
Xiaokang Yang,
Yichao Yan
Abstract:
Generating sewing patterns in garment design is receiving increasing attention due to its CG-friendly and flexible-editing nature. Previous sewing pattern generation methods have been able to produce exquisite clothing, but struggle to design complex garments with detailed control. To address these issues, we propose SewingLDM, a multi-modal generative model that generates sewing patterns controlled by text prompts, body shapes, and garment sketches. Initially, we extend the original vector of sewing patterns into a more comprehensive representation to cover more intricate details and then compress them into a compact latent space. To learn the sewing pattern distribution in the latent space, we design a two-step training strategy to inject the multi-modal conditions, i.e., body shapes, text prompts, and garment sketches, into a diffusion model, ensuring the generated garments are body-suited and detail-controlled. Comprehensive qualitative and quantitative experiments show the effectiveness of our proposed method, significantly surpassing previous approaches in terms of complex garment design and various body adaptability. Our project page: https://shengqiliu1.github.io/SewingLDM.
Submitted 18 December, 2024;
originally announced December 2024.
-
Measurement of the Branching Fraction for the Decay $χ_{cJ}\to p\bar{p}ηπ^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (642 additional authors not shown)
Abstract:
Using $(2712.4\pm 14.3)\times10^6 ψ(3686)$ events collected by the BESIII detector operating at the BEPCII collider, we present the first observations of the decays $χ_{cJ}(J=0,1,2)\to p\bar{p}ηπ^{0}$. Their decay branching fractions are determined to be ${\cal B}(χ_{c0}\to p\bar{p}ηπ^{0})=({2.41 \pm 0.07 \pm 0.19}) \times 10^{-4}$, ${\cal B}(χ_{c1}\to p\bar{p}ηπ^{0})=({1.95 \pm 0.05 \pm 0.12}) \times 10^{-4}$, and ${\cal B}(χ_{c2}\to p\bar{p}ηπ^{0})=({1.31 \pm 0.05 \pm 0.08}) \times 10^{-4}$, where the first uncertainties are statistical and the second systematic.
Submitted 18 December, 2024; v1 submitted 18 December, 2024;
originally announced December 2024.
-
Deploying Foundation Model Powered Agent Services: A Survey
Authors:
Wenchao Xu,
Jinyu Chen,
Peirong Zheng,
Xiaoquan Yi,
Tianyi Tian,
Wenhui Zhu,
Quan Wan,
Haozhao Wang,
Yunfeng Fan,
Qinliang Su,
Xuemin Shen
Abstract:
Foundation model (FM) powered agent services are regarded as a promising solution to develop intelligent and personalized applications for advancing toward Artificial General Intelligence (AGI). To achieve high reliability and scalability in deploying these agent services, it is essential to collaboratively optimize computational and communication resources, thereby ensuring effective resource allocation and seamless service delivery. In pursuit of this vision, this paper proposes a unified framework aimed at providing a comprehensive survey on deploying FM-based agent services across heterogeneous devices, with the emphasis on the integration of model and resource optimization to establish a robust infrastructure for these services. Particularly, this paper begins with exploring various low-level optimization strategies during inference and studies approaches that enhance system scalability, such as parallelism techniques and resource scaling methods. The paper then discusses several prominent FMs and investigates research efforts focused on inference acceleration, including techniques such as model compression and token reduction. Moreover, the paper also investigates critical components for constructing agent services and highlights notable intelligent applications. Finally, the paper presents potential research directions for developing real-time agent services with high Quality of Service (QoS).
Submitted 17 December, 2024;
originally announced December 2024.
-
Observation of the charmonium decay $η_c\toγγ$ in $J/ψ\toγη_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann,
H. Cai
, et al. (658 additional authors not shown)
Abstract:
Using $(2712.4\pm14.3)\times10^{6}$ $ψ(3686)$ events collected with the BESIII detector at the BEPCII collider, the decay $η_c\toγγ$ in $J/ψ\toγη_c$ is observed. We determine the product branching fraction $\mathcal{B}(J/ψ\toγη_c)\times\mathcal{B}(η_c\toγγ)=(5.23\pm0.26_{\rm{stat.}}\pm0.30_{\rm{syst.}})\times10^{-6}$. This result is consistent with the LQCD calculation $(5.34\pm0.16)\times10^{-6}$ from HPQCD in 2023. By using the world-average values of $\mathcal{B}(J/ψ\toγη_c)$ and the total decay width of $η_c$, the partial decay width $Γ(η_c\toγγ)$ is determined to be $(11.30\pm0.56_{\rm{stat.}}\pm0.66_{\rm{syst.}}\pm1.14_{\rm{ref.}})~\rm{keV}$, which deviates from the corresponding world-average value by $3.4σ$.
Submitted 2 April, 2025; v1 submitted 17 December, 2024;
originally announced December 2024.
-
Learning Human-Aware Robot Policies for Adaptive Assistance
Authors:
Jason Qin,
Shikun Ban,
Wentao Zhu,
Yizhou Wang,
Dimitris Samaras
Abstract:
Developing robots that can assist humans efficiently, safely, and adaptively is crucial for real-world applications such as healthcare. While previous work often assumes a centralized system for co-optimizing human-robot interactions, we argue that real-world scenarios are much more complicated, as humans have individual preferences regarding how tasks are performed. Robots typically lack direct access to these implicit preferences. However, to provide effective assistance, robots must still be able to recognize and adapt to the individual needs and preferences of different users. To address these challenges, we propose a novel framework in which robots infer human intentions and reason about human utilities through interaction. Our approach features two critical modules: the anticipation module is a motion predictor that captures the spatial-temporal relationship between the robot agent and user agent, which contributes to predicting human behavior; the utility module infers the underlying human utility functions through progressive task demonstration sampling. Extensive experiments across various robot types and assistive tasks demonstrate that the proposed framework not only enhances task success and efficiency but also significantly improves user satisfaction, paving the way for more personalized and adaptive assistive robotic systems. Code and demos are available at https://asonin.github.io/Human-Aware-Assistance/.
Submitted 27 December, 2024; v1 submitted 16 December, 2024;
originally announced December 2024.
-
Amplitude analysis and branching fraction measurement of the Cabibbo-favored decay $D^+ \to K^-π^+π^+π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (651 additional authors not shown)
Abstract:
An amplitude analysis of the Cabibbo-favored decay $D^+ \to K^-π^+π^+π^0$ is performed, using 7.93 $\rm{fb}^{-1}$ of $e^+e^-$ collision data collected with the BESIII detector at the center-of-mass energy of 3.773 GeV. The branching fractions of the intermediate processes are measured, with the dominant contribution $D^+ \to \bar{K}^{*}(892)^0ρ(770)^+$ observed to have a branching fraction of $(4.15\pm0.07_{\rm stat.}\pm0.17_{\rm syst.})\%$. With the detection efficiency derived from the amplitude analysis, the absolute branching fraction of $D^+ \to K^-π^+π^+π^0$ is measured to be $(6.06\pm0.04_{\rm stat.}\pm0.07_{\rm syst.})\%$.
Submitted 14 December, 2024;
originally announced December 2024.
-
Study of the semileptonic decay $D^0\rightarrow \bar{K}^0π^-e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (650 additional authors not shown)
Abstract:
We report an improved study of the semileptonic decay $D^0 \rightarrow \bar{K}^0π^-e^+ν_{e}$ based on a sample of $7.9~\mathrm{fb}^{-1}$ of $e^+e^-$ annihilation data collected at a center-of-mass energy of 3.773~GeV with the BESIII detector at the BEPCII collider. The branching fraction of this decay is measured to be $\mathcal{B}(D^0\rightarrow \bar{K}^0π^-e^+ν_{e}) = (1.444 \pm 0.022_{\rm stat} \pm 0.024_{\rm syst})\%$, which is the most precise to date, where the first uncertainty is statistical and the second is systematic. Based on investigation of the decay dynamics, we find that the decay is dominated by the $K^{*}(892)^-$ component and present an improved measurement of its branching fraction to be $\mathcal{B}(D^0\rightarrow K^{*}(892)^-e^+ν_e) = (2.039 \pm 0.032_{\rm stat} \pm 0.034_{\rm syst})\%$. We also determine the ratios of the hadronic form factors for the $K^{*}(892)^-e^+ν_e$ decay to be $r_{V} = V(0)/A_1(0) = 1.48 \pm 0.05_{\rm stat} \pm 0.02_{\rm syst}$ and $r_{2} = A_2(0)/A_1(0) = 0.70 \pm 0.04_{\rm stat} \pm 0.02_{\rm syst}$, where $V(0)$ is the vector form factor and $A_{1,2}(0)$ are the axial form factors. In addition, the $\bar{K}^0π^-$ $\mathcal{S}$-wave component is found to account for $(5.87 \pm 0.32_{\rm stat} \pm 0.16_{\rm syst})\%$ of the total decay rate, corresponding to a branching fraction of $\mathcal{B}[D^0\rightarrow (\bar{K}^0π^-)_{S-{\rm wave}}e^+ν_e] = (0.085 \pm 0.005_{\rm stat} \pm 0.003_{\rm syst})\%$.
Submitted 14 December, 2024;
originally announced December 2024.
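The branching fractions quoted in the abstract above can be cross-checked with simple arithmetic. The sketch below uses only the central values from the abstract and ignores uncertainties. It makes two assumptions that the abstract does not state: that isospin symmetry gives $\mathcal{B}(K^{*}(892)^-\to\bar{K}^0π^-)=2/3$, and that the $\bar{K}^0π^-$ final state consists only of the $\mathcal{S}$-wave and $K^{*}(892)^-$ components. All variable names are illustrative.

```python
# Central values quoted in the abstract, all in percent.
bf_total = 1.444          # B(D0 -> K0bar pi- e+ nu_e)
s_wave_fraction = 0.0587  # S-wave share of the total decay rate (5.87%)
bf_kstar = 2.039          # B(D0 -> K*(892)- e+ nu_e)

# Assumption (not stated in the abstract): isospin symmetry gives
# B(K*(892)- -> K0bar pi-) = 2/3.
ISOSPIN_FACTOR = 2.0 / 3.0

# S-wave branching fraction = S-wave fraction times the total branching fraction.
bf_s_wave = s_wave_fraction * bf_total

# Part of the K*(892)- mode that ends up in the K0bar pi- final state.
bf_kstar_to_k0pi = bf_kstar * ISOSPIN_FACTOR

# Assumption: only these two components contribute to the final state.
bf_sum = bf_s_wave + bf_kstar_to_k0pi

print(f"S-wave branching fraction: {bf_s_wave:.3f} %")
print(f"K*(892)- contribution:     {bf_kstar_to_k0pi:.3f} %")
print(f"Sum of both components:    {bf_sum:.3f} %")
```

Under these assumptions the S-wave value rounds to the quoted 0.085%, and the two components add up to the quoted total of 1.444% at the level of the rounding.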
-
Revisiting the Galactic Winds in M82 I: the recent starburst and launch of outflow in simulations
Authors:
Tian-Rui Wang,
Weishan Zhu,
Xue-Fu Li,
Wen-Sheng Hong,
Long-Long Feng
Abstract:
We revisit the launch of the galactic outflow in M82 through hydrodynamic simulations. Employing a sink-particle module, we self-consistently resolve the star formation and feedback processes, avoiding the reliance on various assumed models. We probe the effects of different stellar feedback mechanisms, gas return from star-forming clouds and gas disc mass on the starburst and outflow. Our simulations can generate a starburst that lasts $\sim25$ Myr, peaking at 20-50 $\rm{M_{\odot} yr^{-1}}$. However, the total stellar mass formed in the starburst often exceeds M82's estimated value. The outflow's launch occurs in two stages. Initially, continuous SN explosions form small bubbles, merging into a super bubble foam composed of warm/hot gas and high-density cool filaments. After $\sim10$ Myr of SN injection, the super bubble breaks out of the disc, marking the second stage, which takes $\sim15$ Myr to develop a kpc-scale outflow. Our simulations reveal that cool filaments within the ISM can survive the stellar feedback and are then entrained into the outflow and stretched to hundreds of pc in length. While the mass loading factor of the well-developed outflow is comparable to that of M82, the cool gas mass outflow rate is often lower, and its velocity is slower than the estimated value in M82 by $\sim60\%$. Warm and hot gas are $\sim25\%$ slower. SN feedback acts as the primary driver of the outflow, while gas return significantly influences the starburst and outflow. Other factors have moderate effects. To address the shortcoming in our results, an enhanced SN feedback effect due to clustered SNe is likely necessary.
Submitted 15 December, 2024; v1 submitted 12 December, 2024;
originally announced December 2024.
-
Neural Networks for Threshold Dynamics Reconstruction
Authors:
Elisa Negrini,
Almanzo Jiahe Gao,
Abigail Bowering,
Wei Zhu,
Luca Capogna
Abstract:
We introduce two convolutional neural network (CNN) architectures, inspired by the Merriman-Bence-Osher (MBO) algorithm and by cellular automatons, to model and learn threshold dynamics for front evolution from video data. The first model, termed the (single-dynamics) MBO network, learns a specific kernel and threshold for each input video without adapting to new dynamics, while the second, a meta-learning MBO network, generalizes across diverse threshold dynamics by adapting its parameters per input. Both models are evaluated on synthetic and real-world videos (ice melting and fire front propagation), with performance metrics indicating effective reconstruction and extrapolation of evolving boundaries, even under noisy conditions. Empirical results highlight the robustness of both networks across varied synthetic and real-world dynamics.
Submitted 12 December, 2024;
originally announced December 2024.
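For readers unfamiliar with the classical MBO scheme that the abstract above builds on, one step of threshold dynamics can be sketched as "smooth a binary mask with a kernel, then threshold". The code below is a minimal pure-Python illustration of that idea, not the networks proposed in the paper. The function name `mbo_step`, the 3x3 box kernel, the zero padding, and the threshold of 0.5 are all illustrative assumptions. In the learned setting described in the abstract, the kernel and threshold would be trainable parameters instead of fixed values.

```python
def mbo_step(mask, kernel, threshold=0.5):
    """One threshold-dynamics step on a 2D 0/1 mask (list of lists).

    Each output pixel is 1 if the kernel-weighted sum of its neighbourhood
    reaches the threshold, else 0. Pixels outside the grid count as 0.
    """
    rows, cols = len(mask), len(mask[0])
    half = len(kernel) // 2  # kernel is assumed square with odd side length
    new_mask = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in range(-half, half + 1):
                for dj in range(-half, half + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < rows and 0 <= jj < cols:  # zero padding
                        acc += kernel[di + half][dj + half] * mask[ii][jj]
            new_mask[i][j] = 1 if acc >= threshold else 0
    return new_mask


# Illustrative fixed kernel: a 3x3 box average (an assumption, not learned).
BOX_KERNEL = [[1.0 / 9.0] * 3 for _ in range(3)]

# A 7x7 grid containing a filled 3x3 square in the centre.
initial = [[1 if 2 <= r <= 4 and 2 <= c <= 4 else 0 for c in range(7)]
           for r in range(7)]

after_one = mbo_step(initial, BOX_KERNEL)
for row in after_one:
    print("".join("#" if v else "." for v in row))
```

With these choices the corners of the square are removed first, which is the qualitative behaviour expected of curvature-driven front motion: regions of high curvature retreat fastest.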
-
Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark
Authors:
Yongliang Wu,
Wenbo Zhu,
Jiawang Cao,
Yi Lu,
Bozheng Li,
Weiheng Chi,
Zihan Qiu,
Lirian Su,
Haolin Zheng,
Jay Wu,
Xu Yang
Abstract:
The demand for producing short-form videos for sharing on social media platforms has experienced significant growth in recent times. Despite notable advancements in the fields of video summarization and highlight detection, which can create partially usable short films from raw videos, these approaches are often domain-specific and require an in-depth understanding of real-world video content. To tackle this predicament, we propose Repurpose-10K, an extensive dataset comprising over 10,000 videos with more than 120,000 annotated clips aimed at resolving the video long-to-short task. Recognizing the inherent constraints posed by untrained human annotators, which can result in inaccurate annotations for repurposed videos, we propose a two-stage solution to obtain annotations from real-world user-generated content. Furthermore, we offer a baseline model to address this challenging task by integrating audio, visual, and caption aspects through a cross-modal fusion and alignment framework. We aspire for our work to ignite groundbreaking research in the lesser-explored realms of video repurposing.
Submitted 15 December, 2024; v1 submitted 11 December, 2024;
originally announced December 2024.
-
PSR J1922+37: a 1.9-second pulsar discovered in the direction of the old open cluster NGC 6791
Authors:
Xiao-Jin Liu,
Rahul Sengar,
Matthew Bailes,
Ralph P. Eatough,
Jianping Yuan,
Na Wang,
Weiwei Zhu,
Lu Zhou,
He Gao,
Zong-Hong Zhu,
Xing-Jiang Zhu
Abstract:
More than 300 pulsars have been discovered in Galactic globular clusters; however, none have been found in open clusters. Here we present results from 20-hour pulsar searching observations in seven open clusters with the Five-hundred-meter Aperture Spherical radio Telescope (FAST). Our first discovery is a 1.9-second pulsar (J1922+37) found in the direction of the old open cluster NGC 6791. The measured dispersion measure (DM) implies a distance of 4.79 kpc and 8.92 kpc based on the NE2001 and YMW16 electron density models, respectively. Given the large uncertainty of DM distance estimates, it is likely that PSR J1922+37 is indeed a member of NGC 6791, for which the distance is $4.19\pm0.02$ kpc based on Gaia Data Release 3. If confirmed, PSR J1922+37 will be the first pulsar found in Galactic open clusters. We outline future observations that can confirm this pulsar-open cluster association and discuss the general prospects of finding pulsars in open clusters.
Submitted 28 February, 2025; v1 submitted 10 December, 2024;
originally announced December 2024.
-
DRUM: Learning Demonstration Retriever for Large MUlti-modal Models
Authors:
Ellen Yi-Ge,
Jiechao Gao,
Wei Han,
Wei Zhu
Abstract:
Recently, large language models (LLMs) have demonstrated impressive capabilities in dealing with new tasks with the help of in-context learning (ICL). In the study of Large Vision-Language Models (LVLMs), when implementing ICL, researchers usually adopt naive strategies such as using fixed demonstrations across different samples, or selecting demonstrations directly via a visual-language embedding model. These methods do not guarantee that the configured demonstrations fit the needs of the LVLMs. To address this issue, we propose a novel framework, \underline{d}emonstration \underline{r}etriever for large m\underline{u}lti-modal \underline{m}odel (DRUM), which fine-tunes the visual-language embedding model to better meet the LVLM's needs. First, we discuss the retrieval strategies for a visual-language task, assuming an embedding model is given, and we propose to concatenate the image and text embeddings to enhance the retrieval performance. Second, we propose to re-rank the demonstrations retrieved by the embedding model via the LVLM's feedback, and calculate a list-wise ranking loss for training the embedding model. Third, we propose an iterative demonstration mining strategy to improve the training of the embedding model. Through extensive experiments on 3 types of visual-language tasks and 7 benchmark datasets, our DRUM framework is shown to be effective in boosting the LVLM's in-context learning performance by retrieving more suitable demonstrations.
Submitted 10 December, 2024;
originally announced December 2024.
-
Beyond Idle Channels: Unlocking Idle Space with Signal Alignment in Massive MIMO Cognitive Radio Networks
Authors:
Weidong Zhu,
Xueqian Li,
Longwei Wang,
Zheng Zhang
Abstract:
Cognitive radio networks (CRNs) have traditionally focused on utilizing idle channels to enhance spectrum efficiency. However, as wireless networks grow denser, channel-centric strategies face increasing limitations. This paper introduces a paradigm shift by exploring the underutilized potential of idle spatial dimensions, termed idle space, in co-channel transmissions. By integrating massive multiple-input multiple-output (MIMO) systems with signal alignment techniques, we enable secondary users to transmit without causing interference to primary users by aligning their signals within the null spaces of primary receivers. We propose a comprehensive framework that synergizes spatial spectrum sensing, signal alignment, and resource allocation, specifically designed for secondary users in CRNs. Theoretical analyses and extensive simulations validate the framework, demonstrating substantial gains in spectrum efficiency, throughput, and interference mitigation. The results show that the proposed approach not only ensures interference-free coexistence with primary users but also unlocks untapped spatial resources for secondary transmissions.
Submitted 9 December, 2024;
originally announced December 2024.
-
Study of the decay $ψ(3686) \to Σ^{0}\barΣ^{0}φ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times 10^{8}$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, the decay $ψ(3686)\toΣ^{0}\barΣ^{0}φ$ is observed for the first time with a statistical significance of 7.6$σ$. Its branching fraction is measured to be $(2.64 \pm 0.32_{\textrm{stat}} \pm 0.12_{\textrm{sys}}) \times 10^{-6}$, where the first uncertainty is statistical and the second is systematic. In addition, we search for potential intermediate states in the $Σ^{0}φ$($\barΣ^{0}φ$) invariant mass distribution and a possible threshold enhancement in the $Σ^{0}\barΣ^{0}$ system, but no conclusive evidence of either is observed.
Submitted 9 December, 2024;
originally announced December 2024.
-
Partial wave analyses of $ψ(3686)\to p\bar{p}π^0$ and $ψ(3686)\to p\bar{p}η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Using a sample of $(2712\pm14)\times10^6$ $ψ(3686)$ events collected with the BESIII detector, we perform partial wave analyses of the decays $ψ(3686)\to p\bar{p}π^0$ and $ψ(3686)\to p\bar{p}η$. The branching fractions of $ψ(3686)\to p\bar{p}π^0$ and $ψ(3686)\to p\bar{p}η$ are determined to be $(133.9\pm11.2\pm2.3)\times10^{-6}$ or $(183.7\pm13.7\pm3.2)\times10^{-6}$ and $(61.5\pm6.5\pm1.1)\times10^{-6}$ or $(84.4\pm6.9\pm1.4)\times10^{-6}$, respectively, where the two solutions are caused by an ambiguous phase angle between resonant and continuum processes. Several well-established $N^*$ states are observed in the $pπ^0$ and $pη$ systems, and the corresponding branching fractions are measured. The ratio of decay widths $Γ_{N(1535)\to Nη}/Γ_{N(1535)\to Nπ}$ is determined to be $0.99\pm0.05\pm0.19$.
Submitted 19 February, 2025; v1 submitted 9 December, 2024;
originally announced December 2024.
-
GAQAT: gradient-adaptive quantization-aware training for domain generalization
Authors:
Jiacheng Jiang,
Yuan Meng,
Chen Tang,
Han Yu,
Qun Li,
Zhi Wang,
Wenwu Zhu
Abstract:
Research on loss surface geometry, such as Sharpness-Aware Minimization (SAM), shows that flatter minima improve generalization. Recent studies further reveal that flatter minima can also reduce the domain generalization (DG) gap. However, existing flatness-based DG techniques predominantly operate within a full-precision training process, which is impractical for deployment on resource-constrained edge devices that typically rely on lower bit-width representations (e.g., 4 bits, 3 bits). Consequently, low-precision quantization-aware training is critical for optimizing these techniques in real-world applications. In this paper, we observe a significant degradation in performance when applying state-of-the-art DG-SAM methods to quantized models, suggesting that current approaches fail to preserve generalizability during the low-precision training process. To address this limitation, we propose a novel Gradient-Adaptive Quantization-Aware Training (GAQAT) framework for DG. Our approach begins by identifying the scale-gradient conflict problem in low-precision quantization, where the task loss and smoothness loss induce conflicting gradients for the scaling factors of quantizers, with certain layers exhibiting opposing gradient directions. This conflict renders the optimization of quantized weights highly unstable. To mitigate this, we further introduce a mechanism to quantify gradient inconsistencies and selectively freeze the gradients of scaling factors, thereby stabilizing the training process and enhancing out-of-domain generalization. Extensive experiments validate the effectiveness of the proposed GAQAT framework. On PACS, our 3-bit and 4-bit models outperform direct DG-QAT integration by up to 4.5%. On DomainNet, the 4-bit model achieves near-lossless performance compared to full precision, with improvements of 1.39% (4-bit) and 1.06% (3-bit) over the SOTA QAT baseline.
Submitted 7 December, 2024;
originally announced December 2024.
-
DEYOLO: Dual-Feature-Enhancement YOLO for Cross-Modality Object Detection
Authors:
Yishuo Chen,
Boran Wang,
Xinyu Guo,
Wenbin Zhu,
Jiasheng He,
Xiaobin Liu,
Jing Yuan
Abstract:
Object detection in poor-illumination environments is a challenging task as objects are usually not clearly visible in RGB images. As infrared images provide additional clear edge information that complements RGB images, fusing RGB and infrared images has the potential to enhance the detection ability in poor-illumination environments. However, existing works involving both visible and infrared images only focus on image fusion, instead of object detection. Moreover, they directly fuse the two kinds of image modalities, which ignores the mutual interference between them. To fuse the two modalities to maximize the advantages of cross-modality, we design a dual-enhancement-based cross-modality object detection network DEYOLO, in which semantic-spatial cross modality and novel bi-directional decoupled focus modules are designed to achieve the detection-centered mutual enhancement of RGB-infrared (RGB-IR). Specifically, a dual semantic enhancing channel weight assignment module (DECA) and a dual spatial enhancing pixel weight assignment module (DEPA) are first proposed to aggregate cross-modality information in the feature space to improve the feature representation ability, such that feature fusion can aim at the object detection task. Meanwhile, a dual-enhancement mechanism, including enhancements for two-modality fusion and single modality, is designed in both DECA and DEPA to reduce interference between the two kinds of image modalities. Then, a novel bi-directional decoupled focus is developed to enlarge the receptive field of the backbone network in different directions, which improves the representation quality of DEYOLO. Extensive experiments on M3FD and LLVIP show that our approach outperforms SOTA object detection algorithms by a clear margin. Our code is available at https://github.com/chips96/DEYOLO.
Submitted 6 December, 2024;
originally announced December 2024.
-
Many-MobileNet: Multi-Model Augmentation for Robust Retinal Disease Classification
Authors:
Hao Wang,
Wenhui Zhu,
Xuanzhao Dong,
Yanxi Chen,
Xin Li,
Peijie Qiu,
Xiwen Chen,
Vamsi Krishna Vasa,
Yujian Xiong,
Oana M. Dumitrascu,
Abolfazl Razi,
Yalin Wang
Abstract:
In this work, we propose Many-MobileNet, an efficient model fusion strategy for retinal disease classification using lightweight CNN architecture. Our method addresses key challenges such as overfitting and limited dataset variability by training multiple models with distinct data augmentation strategies and different model complexities. Through this fusion technique, we achieved robust generalization in data-scarce domains while balancing computational efficiency with feature extraction capabilities.
Submitted 3 December, 2024;
originally announced December 2024.
-
QuakeFormer: A Uniform Approach to Earthquake Ground Motion Prediction Using Masked Transformers
Authors:
Yitian Feng,
Weiqiang Zhu,
Xinzheng Lu
Abstract:
Ground motion prediction (GMP) models are critical for hazard reduction before, during and after destructive earthquakes. In these three stages, intensity forecasting, early warning and interpolation models are correspondingly employed to assess the risk. Considering the high cost of numerical methods and the oversimplification in statistical methods, deep-learning-based approaches aim to provide accurate and near-real-time ground motion prediction. Current approaches are limited by specialized architectures, overlooking the interconnection among these three tasks. What's more, the inadequate modeling of absolute and relative spatial dependencies mischaracterizes epistemic uncertainty as aleatory variability. Here we introduce QuakeFormer, a unified deep learning architecture that combines these three tasks in one framework. We design a multi-station-based Transformer architecture and a flexible masking strategy for training QuakeFormer. This data-driven approach enables the model to learn spatial ground motion dependencies directly from real seismic recordings, incorporating location embeddings that include both absolute and relative spatial coordinates. The results indicate that our model outperforms state-of-the-art ground motion prediction models across all three tasks in our research areas. We also find that pretraining a uniform forecasting and interpolation model enhances the performance on the early warning task. QuakeFormer offers a flexible approach to directly learning and modeling ground motion, providing valuable insights and applications for both earthquake science and engineering.
Submitted 1 December, 2024;
originally announced December 2024.
-
EM-based Fast Uncertainty Quantification for Bayesian Multi-setup Operational Modal Analysis
Authors:
Wei Zhu,
Binbin Li,
Zuo Zhu
Abstract:
The current Bayesian FFT algorithm relies on direct differentiation to obtain the posterior covariance matrix (PCM), which is time-consuming, memory-intensive, and hard to code, especially for the multi-setup operational modal analysis (OMA). Aiming at accelerating the uncertainty quantification in multi-setup OMA, an expectation-maximization (EM)-based algorithm is proposed by reformulating the Hessian matrix of the negative log-likelihood function (NLLF) as a sum of simplified components corresponding to the complete-data NLLF. Matrix calculus is employed to derive these components in a compact manner, resulting in expressions similar to those in the single-setup case. This similarity allows for the reuse of existing Bayesian single-setup OMA codes, simplifying implementation. The singularity caused by mode shape norm constraints is addressed through null space projection, eliminating potential numerical errors from the conventional pseudoinverse operation. A sparse assembly strategy is further adopted, avoiding unnecessary calculations and storage of predominant zero elements in the Hessian matrix. The proposed method is then validated through a comprehensive parametric study and applied to a multi-setup OMA of a high-rise building. Results demonstrate that the proposed method efficiently calculates the PCM within seconds, even for cases with hundreds of parameters. This represents an efficiency improvement of at least one order of magnitude over the state-of-the-art method. Such performance paves the way for a real-time modal identification of large-scale structures, including those with closely-spaced modes.
Submitted 1 December, 2024;
originally announced December 2024.
-
Study of the tracking efficiency of charged pions at BESIII
Authors:
Fang Liu,
Xiao-Bin Ji,
Sheng-Sen Sun,
Huai-Min Liu,
Shuang-Shi Fang,
Xiao-Ling Li,
Tong Chen,
Xin-Nan Wang,
Ming-Run Li,
Liang-Liang Wang,
Ling-Hui Wu,
Ye Yuan,
Yao Zhang,
Wen-Jing Zhu
Abstract:
Using $(10087 \pm 44) \times 10^6$ $J/ψ$ events collected with the BESIII detector in 2009, 2012, 2018 and 2019, the tracking efficiency of charged pions is studied using the decay $J/ψ\rightarrow π^+ π^- π^0$. The systematic uncertainty of the tracking efficiency and the corresponding correction factors for charged pions are evaluated, in bins of transverse momentum and polar angle of the charged pions.
Submitted 30 November, 2024;
originally announced December 2024.
-
FreeCloth: Free-form Generation Enhances Challenging Clothed Human Modeling
Authors:
Hang Ye,
Xiaoxuan Ma,
Hai Ci,
Wentao Zhu,
Yizhou Wang
Abstract:
Achieving realistic animated human avatars requires accurate modeling of pose-dependent clothing deformations. Existing learning-based methods heavily rely on the Linear Blend Skinning (LBS) of minimally-clothed human models like SMPL to model deformation. However, they struggle to handle loose clothing, such as long dresses, where the canonicalization process becomes ill-defined when the clothing is far from the body, leading to disjointed and fragmented results. To overcome this limitation, we propose FreeCloth, a novel hybrid framework to model challenging clothed humans. Our core idea is to use dedicated strategies to model different regions, depending on whether they are close to or distant from the body. Specifically, we segment the human body into three categories: unclothed, deformed, and generated. We simply replicate unclothed regions that require no deformation. For deformed regions close to the body, we leverage LBS to handle the deformation. As for the generated regions, which correspond to loose clothing areas, we introduce a novel free-form, part-aware generator to model them, as they are less affected by movements. This free-form generation paradigm brings enhanced flexibility and expressiveness to our hybrid framework, enabling it to capture the intricate geometric details of challenging loose clothing, such as skirts and dresses. Experimental results on the benchmark dataset featuring loose clothing demonstrate that FreeCloth achieves state-of-the-art performance with superior visual fidelity and realism, particularly in the most challenging cases.
Submitted 9 April, 2025; v1 submitted 29 November, 2024;
originally announced November 2024.
-
Measurement of the Inclusive Cross Sections of Prompt $J/ψ$ and $ψ(3686)$ Production in $e^{+}e^{-}$ Annihilation from $\sqrt{s}=3.808$ to $4.951$ GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
X. C. Ai,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (599 additional authors not shown)
Abstract:
The inclusive cross sections of prompt $J/ψ$ and $ψ(3686)$ production are measured at center-of-mass energies from 3.808 to 4.951 GeV. The dataset used is 22 fb$^{-1}$ of $e^{+}e^{-}$ annihilation data collected with the BESIII detector operating at the BEPCII storage ring. The results obtained are in agreement with the previous BESIII measurements of exclusive $J/ψ$ and $ψ(3686)$ production. The average values obtained for the cross sections measured in the center-of-mass energy ranges from 4.527 to 4.951 GeV for $J/ψ$ and from 4.843 to 4.951 GeV for $ψ(3686)$, where the impact of known resonances is negligible, are $14.0\pm1.7\pm3.1$ pb and $15.3\pm3.0$ pb, respectively. For $J/ψ$, the first and the second uncertainties are statistical and systematic, respectively. For $ψ(3686)$, the uncertainty is total. These values are useful for testing charmonium production models.
Submitted 19 February, 2025; v1 submitted 29 November, 2024;
originally announced November 2024.
-
VisualLens: Personalization through Visual History
Authors:
Wang Bill Zhu,
Deqing Fu,
Kai Sun,
Yi Lu,
Zhaojiang Lin,
Seungwhan Moon,
Kanika Narang,
Mustafa Canim,
Yue Liu,
Anuj Kumar,
Xin Luna Dong
Abstract:
We hypothesize that a user's visual history, with images reflecting their daily life, offers valuable insights into their interests and preferences, and can be leveraged for personalization. Among the many challenges in achieving this goal, the foremost is the diversity and noise in the visual history, which contains images not necessarily related to a recommendation task, not necessarily reflecting the user's interest, or even not necessarily preference-relevant. Existing recommendation systems either rely on task-specific user interaction logs, such as online shopping history for shopping recommendations, or focus on text signals. We propose a novel approach, VisualLens, that extracts, filters, and refines image representations, and leverages these signals for personalization. We created two new benchmarks with task-agnostic visual histories, and show that our method improves over state-of-the-art recommendations by 5-10% on Hit@3, and improves over GPT-4o by 2-5%. Our approach paves the way for personalized recommendations in scenarios where traditional methods fail.
Submitted 24 November, 2024;
originally announced November 2024.
-
Measurement of cross sections of $e^+e^-\to K^0_S K^0_S ψ(3686)$ from $\sqrt{s}=$ 4.682 to 4.951 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (642 additional authors not shown)
Abstract:
The process $e^+e^-\to K^0_S K^0_S ψ(3686)$ is studied by analyzing $e^+e^-$ collision data samples collected at eight center-of-mass energies ranging from 4.682 to 4.951 GeV with the BESIII detector operating at the BEPCII collider, corresponding to an integrated luminosity of $4.1~{\rm fb}^{-1}$. The $e^+e^-\to K^0_S K^0_S ψ(3686)$ process is observed for the first time with a statistical significance of $6.3σ$, and the cross sections at each center-of-mass energy are measured. The ratio of cross sections of $e^+e^-\to K_S^0 K_S^0 ψ(3686)$ relative to $e^+e^-\to K^+ K^- ψ(3686)$ is determined to be $\frac{σ(e^+e^-\to K_S^0 K_S^0 ψ(3686))}{σ(e^+e^-\to K^+ K^- ψ(3686))}=0.45 \pm 0.25$, which is consistent with the prediction based on isospin symmetry. The uncertainty includes both statistical and systematic contributions. Additionally, the $K_S^0ψ(3686)$ invariant mass distribution is found to be consistent with three-body phase space. The significance of a contribution beyond three-body phase space is only $0.8σ$.
Submitted 3 March, 2025; v1 submitted 24 November, 2024;
originally announced November 2024.
-
A 44-minute periodic radio transient in a supernova remnant
Authors:
Di Li,
Mao Yuan,
Lin Wu,
Jingye Yan,
Xuning Lv,
Chao-Wei Tsai,
Pei Wang,
WeiWei Zhu,
Li Deng,
Ailan Lan,
Renxin Xu,
Xianglei Chen,
Lingqi Meng,
Jian Li,
Xiangdong Li,
Ping Zhou,
Haoran Yang,
Mengyao Xue,
Jiguang Lu,
Chenchen Miao,
Weiyang Wang,
Jiarui Niu,
Ziyao Fang,
Qiuyang Fu,
Yi Feng
, et al. (23 additional authors not shown)
Abstract:
Long-period radio transients (LPTs) are a newly discovered class of radio emitters with as yet incomprehensibly long rotation periods, ranging from minutes to hours. The astrophysical nature of their isolated counterparts remains undetermined. We report a new LPT, DART J1832-0911 (2656.23 $\pm$ 0.15 s period), the first evidence associating such objects with supernova remnants (SNRs). Its dispersion measure distance aligns well with the distance of the SNR, confirming its origin from a supernova explosion. The source displays either phase-locked circularly polarized emission or nearly 100% linear polarization in radio bands. No detectable optical counterpart was found, even with a 10 m class telescope. The SNR association of J1832-0911, its stable, highly polarized emission, and its abnormally long period strongly favor its origin from a young neutron star whose spin has been braked, possibly by interaction with the supernova's fallback material. This discovery provides critical insights into the nature of ultra-long period transients and their evolutionary link to stellar remnants.
Submitted 24 November, 2024;
originally announced November 2024.
-
Travel Time Based Task Mapping for NoC-Based DNN Accelerator
Authors:
Yizhi Chen,
Wenyao Zhu,
Zhonghai Lu
Abstract:
Network-on-Chip (NoC) based architectures have recently been proposed to accelerate deep neural networks in specialized hardware. Given that the hardware configuration is fixed post-manufacture, proper task mapping attracts researchers' interest. We propose a travel time-based task mapping method that allocates uneven counts of tasks across different Processing Elements (PEs). This approach utilizes the travel time recorded in the sampling window and implicitly makes use of static NoC architecture information and dynamic NoC congestion status. Furthermore, we examine the effectiveness of our method under various configurations, including different mapping iterations, flit sizes, and NoC architecture. Our method achieves up to 12.1% improvement compared with even mapping and static distance mapping for one layer. For a complete NN example, our method achieves 10.37% and 13.75% overall improvements over row-major mapping and distance-based mapping, respectively. While ideal travel time-based mapping (post-run) achieves a 10.37% overall improvement over row-major mapping, we adopt a sampling window to efficiently map tasks at run time, achieving an 8.17% improvement (sampling window 10).
Submitted 19 November, 2024;
originally announced November 2024.
-
Evidence for Two Excited $Ω^{-}$ Hyperons
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (650 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data corresponding to an integrated luminosity of 19 fb$^{-1}$ collected by the BESIII detector at center-of-mass energies ranging from 4.13 to 4.70 GeV, we report the first evidence for a new excited $Ω^{-}$ hyperon, the $Ω(2109)^{-}$, through the process $e^+ e^- \to Ω(2109)^{-} \barΩ^{+} +c.c.$ with a significance of 4.1 $σ$. The mass and width of $Ω(2109)^{-}$ are measured to be $2108.5 \pm 5.2_{\rm stat} \pm 0.9_{\rm syst}\,{\rm MeV}/c^{2}$ and $18.3 \pm 16.4_{\rm stat} \pm 5.7_{\rm syst}\,{\rm MeV}$, respectively. We also present evidence for a new production mechanism for the previously identified $Ω(2012)^-$ via the process $e^+ e^- \to Ω(2012)^{-} \barΩ^{+} +c.c.$ with a significance of 3.5 $σ$.
Submitted 25 April, 2025; v1 submitted 18 November, 2024;
originally announced November 2024.
-
Number it: Temporal Grounding Videos like Flipping Manga
Authors:
Yongliang Wu,
Xinting Hu,
Yuyang Sun,
Yizhou Zhou,
Wenbo Zhu,
Fengyun Rao,
Bernt Schiele,
Xu Yang
Abstract:
Video Large Language Models (Vid-LLMs) have made remarkable advancements in comprehending video content for QA dialogue. However, they struggle to extend this visual understanding to tasks requiring precise temporal localization, known as Video Temporal Grounding (VTG). To address this gap, we introduce Number-Prompt (NumPro), a novel method that empowers Vid-LLMs to bridge visual comprehension with temporal grounding by adding unique numerical identifiers to each video frame. Treating a video as a sequence of numbered frame images, NumPro transforms VTG into an intuitive process: flipping through manga panels in sequence. This allows Vid-LLMs to "read" event timelines, accurately linking visual content with corresponding temporal information. Our experiments demonstrate that NumPro significantly boosts VTG performance of top-tier Vid-LLMs without additional computational cost. Furthermore, fine-tuning on a NumPro-enhanced dataset defines a new state-of-the-art for VTG, surpassing previous top-performing methods by up to 6.9\% in mIoU for moment retrieval and 8.5\% in mAP for highlight detection. The code will be available at https://github.com/yongliang-wu/NumPro.
Submitted 21 March, 2025; v1 submitted 15 November, 2024;
originally announced November 2024.
-
Study of the light scalar $a_{0}(980)$ through the decay $D^{0} \to a_{0}(980)^-e^{+} ν_{e}$ with $a_{0}(980)^- \to ηπ^-$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (649 additional authors not shown)
Abstract:
Using 7.93 ${\rm fb^{-1}}$ of $e^+e^-$ collision data collected at a center-of-mass energy of 3.773 ${\rm GeV}$ with the BESIII detector, we present an analysis of the decay $D^{0} \to ηπ^- e^+ ν_{e}$. The branching fraction of the decay $D^{0} \to a_{0}(980)^{-} e^+ ν_{e}$ with $a_{0}(980)^{-} \to ηπ^{-}$ is measured to be $(0.86\pm0.17_{\text{stat}}\pm0.05_{\text{syst}})\times 10^{-4}$. The decay dynamics of this process is studied with a single-pole parameterization of the hadronic form factor and the Flatté formula describing the $a_0(980)$ line shape in the differential decay rate. The product of the form factor $f^{ a_0}_{+}(0)$ and the Cabibbo-Kobayashi-Maskawa matrix element $|V_{cd}|$ is determined for the first time with the result $f^{ a_0}_+(0)|V_{cd}|=0.126\pm0.013_{\rm stat}\pm0.003_{\rm syst}$.
Submitted 12 November, 2024;
originally announced November 2024.
-
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
Authors:
Shehan Munasinghe,
Hanan Gani,
Wenqi Zhu,
Jiale Cao,
Eric Xing,
Fahad Shahbaz Khan,
Salman Khan
Abstract:
Fine-grained alignment between videos and text is challenging due to complex spatial and temporal dynamics in videos. Existing video-based Large Multimodal Models (LMMs) handle basic conversations but struggle with precise pixel-level grounding in videos. To address this, we introduce VideoGLaMM, an LMM designed for fine-grained pixel-level grounding in videos based on user-provided textual inputs. Our design seamlessly connects three key components: a Large Language Model, a dual vision encoder that emphasizes both spatial and temporal details, and a spatio-temporal decoder for accurate mask generation. This connection is facilitated via tunable V-L and L-V adapters that enable close Vision-Language (VL) alignment. The architecture is trained to synchronize both spatial and temporal elements of video content with textual instructions. To enable fine-grained grounding, we curate a multimodal dataset featuring detailed visually-grounded conversations using a semiautomatic annotation pipeline, resulting in a diverse set of 38k video-QA triplets along with 83k objects and 671k masks. We evaluate VideoGLaMM on three challenging tasks: Grounded Conversation Generation, Visual Grounding, and Referring Video Segmentation. Experimental results show that our model consistently outperforms existing approaches across all three tasks.
Submitted 25 March, 2025; v1 submitted 7 November, 2024;
originally announced November 2024.
-
High-throughput Screening of Ferrimagnetic Semiconductors With Ultrahigh Néel Temperature
Authors:
Haidi Wang,
Qingqing Feng,
Shuo Li,
Wei Lin,
Weiduo Zhu,
Zhao Chen,
Zhongjun Li,
Xiaofeng Liu,
Xingxing Li
Abstract:
Ferrimagnetic semiconductors, integrated with net magnetization, antiferromagnetic coupling and semi-conductivity, have constructed an ideal platform for spintronics. For practical applications, achieving high Néel temperatures ($T_{\mathrm{N}}$) is very desirable, but remains a significant challenge. Here, via high-throughput density-functional-theory calculations, we identify 19 intrinsic ferrimagnetic semiconductor candidates from nearly 44,000 structures in the Materials Project database, including 10 ferrimagnetic bipolar magnetic semiconductors (BMS) and 9 ferrimagnetic half semiconductors (HSC). Notably, the BMS \ce{NaFe5O8} possesses a high $T_{\mathrm{N}}$ of 768 K. By element substitutions, we obtain an HSC \ce{NaFe5S8} with a $T_{\mathrm{N}}$ of 957 K and a BMS \ce{LiFe5O8} with a $T_{\mathrm{N}}$ reaching 1059 K. Our results pave a promising avenue toward the development of ferrimagnetic spintronics at ambient temperature.
Submitted 7 November, 2024;
originally announced November 2024.
-
Algorithm for motivic Hilbert zeta function of some curve singularities
Authors:
Wenhao Zhu,
Yizi Chen,
Hussein Mourtada
Abstract:
We develop an algorithm for computing the motivic Hilbert zeta function for curve singularities with a monomial valuation group or for singular curves defined by $y^{k}=x^{n}$, where $\gcd(k,n)=1$. It is well known that the Hilbert scheme of points on a smooth curve is isomorphic to the symmetric product of the curve. However, the geometrical structure of the Hilbert scheme of points on singular curves remains less understood. The algorithm we propose computes the motivic Hilbert zeta function, $Z_{(C,O)}^{Hilb}(q)\in K_{0}(Var_{\mathbb{C}})[[q]]$, for such curve singularities. This function is represented as a series with coefficients in the Grothendieck ring of varieties over $\mathbb{C}$.
The main computational challenge arises from the infinity of $Γ$. To address this, we approximate $Γ$ by truncating it to a finite subset to allow effective algorithm operation. We also analyze the time complexity and estimate the range of the effective finite length of $Γ$ necessary for reliable results. The Python implementation of our algorithm is available at https://github.com/whaozhu/motivic_hilbert.
Submitted 23 March, 2025; v1 submitted 5 November, 2024;
originally announced November 2024.
-
Optimizing Multi-Scale Representations to Detect Effect Heterogeneity Using Earth Observation and Computer Vision: Applications to Two Anti-Poverty RCTs
Authors:
Fucheng Warren Zhu,
Connor T. Jerzak,
Adel Daoud
Abstract:
Earth Observation (EO) data are increasingly used in policy analysis by enabling granular estimation of conditional average treatment effects (CATE). However, a challenge in EO-based causal inference is determining the scale of the input satellite imagery -- balancing the trade-off between capturing fine-grained individual heterogeneity in smaller images and broader contextual information in larger ones. This paper introduces Multi-Scale Representation Concatenation, a set of composable procedures that transform arbitrary single-scale EO-based CATE estimation algorithms into multi-scale ones. We benchmark the performance of Multi-Scale Representation Concatenation on a CATE estimation pipeline that combines Vision Transformer (ViT) models (which encode images) with Causal Forests (CFs) to obtain CATE estimates from those encodings. We first perform simulation studies where the causal mechanism is known, showing that our multi-scale approach captures information relevant to effect heterogeneity that single-scale ViT models fail to capture as measured by $R^2$. We then apply the multi-scale method to two randomized controlled trials (RCTs) conducted in Peru and Uganda using Landsat satellite imagery. As we do not have access to ground truth CATEs in the RCT analysis, the Rank Average Treatment Effect Ratio (RATE Ratio) measure is employed to assess performance. Results indicate that Multi-Scale Representation Concatenation improves the performance of deep learning models in EO-based CATE estimation without the complexity of designing new multi-scale architectures for a specific use case. The application of Multi-Scale Representation Concatenation could have meaningful policy benefits -- e.g., potentially increasing the impact of poverty alleviation programs without additional resource expenditure.
Submitted 15 March, 2025; v1 submitted 4 November, 2024;
originally announced November 2024.
-
TableGPT2: A Large Multimodal Model with Tabular Data Integration
Authors:
Aofeng Su,
Aowen Wang,
Chao Ye,
Chen Zhou,
Ga Zhang,
Gang Chen,
Guangcheng Zhu,
Haobo Wang,
Haokai Xu,
Hao Chen,
Haoze Li,
Haoxuan Lan,
Jiaming Tian,
Jing Yuan,
Junbo Zhao,
Junlin Zhou,
Kaizhe Shou,
Liangyu Zha,
Lin Long,
Liyao Li,
Pengzuo Wu,
Qi Zhang,
Qingyi Huang,
Saisai Yang,
Tao Zhang
, et al. (8 additional authors not shown)
Abstract:
The emergence of models like GPTs, Claude, LLaMA, and Qwen has reshaped AI applications, presenting vast new opportunities across industries. Yet, the integration of tabular data remains notably underdeveloped, despite its foundational role in numerous real-world domains.
This gap is critical for three main reasons. First, database or data warehouse data integration is essential for advanced applications; second, the vast and largely untapped resource of tabular data offers immense potential for analysis; and third, the business intelligence domain specifically demands adaptable, precise solutions that many current LLMs may struggle to provide.
In response, we introduce TableGPT2, a model rigorously pre-trained and fine-tuned with over 593.8K tables and 2.36M high-quality query-table-output tuples, a scale of table-related data unprecedented in prior research. This extensive training enables TableGPT2 to excel in table-centric tasks while maintaining strong general language and coding abilities.
One of TableGPT2's key innovations is its novel table encoder, specifically designed to capture schema-level and cell-level information. This encoder strengthens the model's ability to handle ambiguous queries, missing column names, and irregular tables commonly encountered in real-world applications. Similar to visual language models, this pioneering approach integrates with the decoder to form a robust large multimodal model.
We believe the results are compelling: over 23 benchmarking metrics, TableGPT2 achieves an average performance improvement of 35.20% in the 7B model and 49.32% in the 72B model over prior benchmark-neutral LLMs, with robust general-purpose capabilities intact.
Submitted 6 November, 2024; v1 submitted 4 November, 2024;
originally announced November 2024.