-
Embrittling bulk metals into hydride in acid solution
Authors:
Ankang Chen,
Zihao Huo,
Jiewen Liu,
Chuang Liu,
Yongming Sui,
Xuan Liu,
Qingkun Yuan,
Bao Yuan,
Yan Li,
Defang Duan,
Bo Zou
Abstract:
Hydride-induced embrittlement (HIE), in which hydrogen infiltrates metal lattices to form hydrides, typically causes catastrophic failure. Inspired by the HIE effect, we propose an "HIE-mediated synthesis" approach, where bulk metal foils serve as precursors and oleic/sulfuric acid act as hydrogen donors under solvo/hydrothermal conditions, enabling the synthesis of 18 high-purity metal hydrides (MgH$_2$, ScH$_2$, YH$_2$, LaH$_2$, LaH$_{2.3}$, SmH$_2$, LuH$_2$, TiH$_2$, $δ$-ZrH$_{1.6}$, $ε$-ZrH$_2$, HfH$_{1.7}$, HfH$_2$, VH$_{0.8}$, VH$_2$, NbH, NbH$_2$, Ta$_2$H, and TaH). By integrating high-pressure experiments and first-principles calculations, we introduce the concept of equivalent chemical pressure ($Δ$Pc) to elucidate the mechanism of synthesizing and stabilizing metal hydrides in an acidic environment. This mechanism predicts the synthesis of challenging hydrides such as LiH. Our approach successfully converts HIE from a primary culprit of material failure to an effective contributor in hydride synthesis.
Submitted 5 June, 2025;
originally announced June 2025.
-
Improving Multilingual Social Media Insights: Aspect-based Comment Analysis
Authors:
Longyin Zhang,
Bowei Zou,
Ai Ti Aw
Abstract:
The inherent nature of social media posts, characterized by the freedom of language use with a disjointed array of diverse opinions and topics, poses significant challenges to downstream NLP tasks such as comment clustering, comment summarization, and social media opinion analysis. To address this, we propose a granular level of identifying and generating aspect terms from individual comments to guide model attention. Specifically, we leverage multilingual large language models with supervised fine-tuning for comment aspect term generation (CAT-G), further aligning the model's predictions with human expectations through DPO. We demonstrate the effectiveness of our method in enhancing the comprehension of social media discourse on two NLP tasks. Moreover, this paper contributes the first multilingual CAT-G test set on English, Chinese, Malay, and Bahasa Indonesian. As LLM capabilities vary among languages, this test set allows for a comparative analysis of performance across languages with varying levels of LLM proficiency.
Submitted 28 May, 2025;
originally announced May 2025.
-
DeCoDe: Defer-and-Complement Decision-Making via Decoupled Concept Bottleneck Models
Authors:
Chengbo He,
Bochao Zou,
Junliang Xing,
Jiansheng Chen,
Yuanchun Shi,
Huimin Ma
Abstract:
In human-AI collaboration, a central challenge is deciding whether a task should be handled by the AI, deferred to a human expert, or addressed through collaborative effort. Existing Learning to Defer approaches typically make binary choices between AI and humans, neglecting their complementary strengths. They also lack interpretability, a critical property in high-stakes scenarios where users must understand and, if necessary, correct the model's reasoning. To overcome these limitations, we propose Defer-and-Complement Decision-Making via Decoupled Concept Bottleneck Models (DeCoDe), a concept-driven framework for human-AI collaboration. DeCoDe makes strategy decisions based on human-interpretable concept representations, enhancing transparency throughout the decision process. It supports three flexible modes: autonomous AI prediction, deferral to humans, and human-AI collaborative complementarity, selected via a gating network that takes concept-level inputs and is trained using a novel surrogate loss that balances accuracy and human effort. This approach enables instance-specific, interpretable, and adaptive human-AI collaboration. Experiments on real-world datasets demonstrate that DeCoDe significantly outperforms AI-only, human-only, and traditional deferral baselines, while maintaining strong robustness and interpretability even under noisy expert annotations.
Submitted 25 May, 2025;
originally announced May 2025.
-
Automatic and Structure-Aware Sparsification of Hybrid Neural ODEs
Authors:
Bob Junyi Zou,
Lu Tian
Abstract:
Hybrid neural ordinary differential equations (neural ODEs) integrate mechanistic models with neural ODEs, offering strong inductive bias and flexibility, and are particularly advantageous in data-scarce healthcare settings. However, excessive latent states and interactions from mechanistic models can lead to training inefficiency and over-fitting, limiting practical effectiveness of hybrid neural ODEs. In response, we propose a new hybrid pipeline for automatic state selection and structure optimization in mechanistic neural ODEs, combining domain-informed graph modifications with data-driven regularization to sparsify the model for improving predictive performance and stability while retaining mechanistic plausibility. Experiments on synthetic and real-world data show improved predictive performance and robustness with desired sparsity, establishing an effective solution for hybrid model reduction in healthcare applications.
Submitted 25 May, 2025;
originally announced May 2025.
-
HR-VILAGE-3K3M: A Human Respiratory Viral Immunization Longitudinal Gene Expression Dataset for Systems Immunity
Authors:
Xuejun Sun,
Yiran Song,
Xiaochen Zhou,
Ruilie Cai,
Yu Zhang,
Xinyi Li,
Rui Peng,
Jialiu Xie,
Yuanyuan Yan,
Muyao Tang,
Prem Lakshmanane,
Baiming Zou,
James S. Hagood,
Raymond J. Pickles,
Didong Li,
Fei Zou,
Xiaojing Zheng
Abstract:
Respiratory viral infections pose a global health burden, yet the cellular immune responses driving protection or pathology remain unclear. Natural infection cohorts often lack pre-exposure baseline data and structured temporal sampling. In contrast, inoculation and vaccination trials generate insightful longitudinal transcriptomic data. However, the scattering of these datasets across platforms, along with inconsistent metadata and preprocessing procedures, hinders AI-driven discovery. To address these challenges, we developed the Human Respiratory Viral Immunization LongitudinAl Gene Expression (HR-VILAGE-3K3M) repository: an AI-ready, rigorously curated dataset that integrates 14,136 RNA-seq profiles from 3,178 subjects across 66 studies encompassing over 2.56 million cells. Spanning vaccination, inoculation, and mixed exposures, the dataset includes microarray, bulk RNA-seq, and single-cell RNA-seq from whole blood, PBMCs, and nasal swabs, sourced from GEO, ImmPort, and ArrayExpress. We harmonized subject-level metadata, standardized outcome measures, applied unified preprocessing pipelines with rigorous quality control, and aligned all data to official gene symbols. To demonstrate the utility of HR-VILAGE-3K3M, we performed predictive modeling of vaccine responders and evaluated batch-effect correction methods. Beyond these initial demonstrations, it supports diverse systems immunology applications and benchmarking of feature selection and transfer learning algorithms. Its scale and heterogeneity also make it ideal for pretraining foundation models of the human immune response and for advancing multimodal learning frameworks. As the largest longitudinal transcriptomic resource for human respiratory viral immunization, it provides an accessible platform for reproducible AI-driven research, accelerating systems immunology and vaccine development against emerging viral threats.
Submitted 19 May, 2025;
originally announced May 2025.
-
UniCTokens: Boosting Personalized Understanding and Generation via Unified Concept Tokens
Authors:
Ruichuan An,
Sihan Yang,
Renrui Zhang,
Zijun Shen,
Ming Lu,
Gaole Dai,
Hao Liang,
Ziyu Guo,
Shilin Yan,
Yulin Luo,
Bocheng Zou,
Chaoqun Yang,
Wentao Zhang
Abstract:
Personalized models have demonstrated remarkable success in understanding and generating concepts provided by users. However, existing methods use separate concept tokens for understanding and generation, treating these tasks in isolation. This may result in limitations for generating images with complex prompts. For example, given the concept $\langle bo\rangle$, such methods may fail to generate "$\langle bo\rangle$ wearing its hat" without additional textual descriptions of its hat. We call this kind of generation personalized knowledge-driven generation. To address the limitation, we present UniCTokens, a novel framework that effectively integrates personalized information into a unified vision language model (VLM) for understanding and generation. UniCTokens trains a set of unified concept tokens to leverage complementary semantics, boosting two personalized tasks. Moreover, we propose a progressive training strategy with three stages: understanding warm-up, bootstrapping generation from understanding, and deepening understanding from generation to enhance mutual benefits between both tasks. To quantitatively evaluate the unified VLM personalization, we present UnifyBench, the first benchmark for assessing concept understanding, concept generation, and knowledge-driven generation. Experimental results on UnifyBench indicate that UniCTokens shows competitive performance compared to leading methods in concept understanding and concept generation, and achieves state-of-the-art results in personalized knowledge-driven generation. Our research demonstrates that enhanced understanding improves generation, and the generation process can yield valuable insights into understanding. Our code and dataset will be released at: https://github.com/arctanxarc/UniCTokens.
Submitted 22 May, 2025; v1 submitted 20 May, 2025;
originally announced May 2025.
-
Evidence for the existence of a flavor-sextet charmed meson?
Authors:
Feng-Kun Guo,
Bing-Song Zou
Abstract:
Recently, the LHCb Collaboration reported a signal for a new resonance in the $D_s^+ π^\pm$ invariant mass distribution in the decays $B\to \bar{D}^{(*)} D_s^+ π^+ π^-$ (arXiv:2411.03399). This could be a direct observation of an SU(3) flavor-sextet charmed meson with $J^P=0^+$. Such an SU(3) multiplet is beyond the conventional quark-antiquark picture, and thus a verification of this observation is important.
Submitted 30 April, 2025;
originally announced April 2025.
-
Analysis of $Σ^*$ via isospin selective reaction $K_Lp \to π^+Σ^0$
Authors:
Dan Guo,
Jun Shi,
Igor Strakovsky,
Bing-Song Zou
Abstract:
The isospin-selective reaction $K_Lp \to π^+Σ^0$ provides a clean probe for investigating $I=1$ $Σ^*$ resonances. In this work, we perform an analysis of this reaction using an effective Lagrangian approach for the first time, incorporating the well-established $Σ(1189) 1/2^+$, $Σ(1385) 3/2^+$, $Σ(1670) 3/2^-$, $Σ(1775) 5/2^-$ states, while also exploring contributions from other unestablished states.
By fitting the available differential cross section and recoil polarization data, adhering to the same partial-wave phase conventions as the PDG, we find that, besides the established resonances, contributions from $Σ(1660) 1/2^+$, $Σ(1580) 3/2^-$ and a $Σ^*(1/2^-)$ improve the description.
Notably, a $Σ^*(1/2^-)$ resonance with mass around 1.54 GeV, consistent with $Σ(1620)1/2^-$, is found to be essential for describing the data in this channel, a stronger indication than found in previous analyses focusing on $πΛ$ final states.
While providing complementary support for $Σ(1660) 1/2^+$ and $Σ(1580) 3/2^-$, our results highlight the importance of the $Σ(1620) 1/2^-$ region in $K_Lp \to π^+Σ^0$. Future high-precision measurements are needed to solidify these findings and further constrain the $Σ^*$ spectrum.
Submitted 30 April, 2025;
originally announced April 2025.
-
Mamba Based Feature Extraction And Adaptive Multilevel Feature Fusion For 3D Tumor Segmentation From Multi-modal Medical Image
Authors:
Zexin Ji,
Beiji Zou,
Xiaoyan Kui,
Hua Li,
Pierre Vera,
Su Ruan
Abstract:
Multi-modal 3D medical image segmentation aims to accurately identify tumor regions across different modalities, facing challenges from variations in image intensity and tumor morphology. Traditional convolutional neural network (CNN)-based methods struggle with capturing global features, while Transformer-based methods, despite effectively capturing global context, encounter high computational costs in 3D medical image segmentation. The Mamba model combines linear scalability with long-distance modeling, making it a promising approach for visual representation learning. However, Mamba-based 3D multi-modal segmentation still struggles to leverage modality-specific features and fuse complementary information effectively. In this paper, we propose a Mamba-based feature extraction and adaptive multilevel feature fusion method for 3D tumor segmentation using multi-modal medical images. We first develop a modality-specific Mamba encoder to efficiently extract long-range relevant features that represent anatomical and pathological structures present in each modality. Moreover, we design a bi-level synergistic integration block that dynamically merges multi-modal and multi-level complementary features through modality attention and channel attention learning. Lastly, the decoder combines deep semantic information with fine-grained details to generate the tumor segmentation map. Experimental results on medical image datasets (PET/CT and MRI multi-sequence) show that our approach achieves competitive performance compared to state-of-the-art CNN, Transformer, and Mamba-based approaches.
Submitted 29 April, 2025;
originally announced April 2025.
-
Iterative Collaboration Network Guided By Reconstruction Prior for Medical Image Super-Resolution
Authors:
Xiaoyan Kui,
Zexin Ji,
Beiji Zou,
Yang Li,
Yulan Dai,
Liming Chen,
Pierre Vera,
Su Ruan
Abstract:
High-resolution medical images can provide more detailed information for better diagnosis. Conventional medical image super-resolution relies on a single task that first extracts features and then performs upscaling based on those features. The extracted features may not be complete for super-resolution. Recent multi-task learning, including reconstruction and super-resolution, is a good solution for obtaining additional relevant information. However, the interaction between the two tasks is often insufficient, which still leads to incomplete and less relevant deep features. To address the above limitations, we propose an iterative collaboration network (ICONet) that improves communication between tasks by progressively incorporating the reconstruction prior into the super-resolution learning procedure in an iterative, collaborative way. It consists of a reconstruction branch, a super-resolution branch, and an SR-Rec fusion module. The reconstruction branch generates the artifact-free image as a prior, which is followed by a super-resolution branch for prior knowledge-guided super-resolution. Unlike the widely used convolutional neural networks for extracting local features and Transformers with quadratic computational complexity for modeling long-range dependencies, we develop a new two-branch residual spatial-channel feature learning (RSCFL) module to efficiently establish feature relationships in spatial and channel dimensions. Moreover, the designed SR-Rec fusion module fuses the reconstruction prior and super-resolution features with each other in an adaptive manner. Our ICONet is built with multi-stage models that iteratively upscale the low-resolution images in steps of 2x while the two branches interact under multi-stage supervision.
Submitted 22 April, 2025;
originally announced April 2025.
-
Global and Local Mamba Network for Multi-Modality Medical Image Super-Resolution
Authors:
Zexin Ji,
Beiji Zou,
Xiaoyan Kui,
Sebastien Thureau,
Su Ruan
Abstract:
Convolutional neural networks and Transformers have made significant progress in multi-modality medical image super-resolution. However, these methods either have a fixed receptive field for local learning or significant computational burdens for global learning, limiting the super-resolution performance. To solve this problem, State Space Models, notably Mamba, have been introduced to efficiently model long-range dependencies in images with linear computational complexity. Building on Mamba and the fact that low-resolution images rely on global information to compensate for missing details, while high-resolution reference images need to provide more local details for accurate super-resolution, we propose a global and local Mamba network (GLMamba) for multi-modality medical image super-resolution. To be specific, our GLMamba is a two-branch network equipped with a global Mamba branch and a local Mamba branch. The global Mamba branch captures long-range relationships in low-resolution inputs, and the local Mamba branch focuses more on short-range details in high-resolution reference images. We also use the deform block to adaptively extract features of both branches to enhance the representation ability. A modulator is designed to further enhance deformable features in both global and local Mamba blocks. To fully integrate the reference image for low-resolution image super-resolution, we further develop a multi-modality feature fusion block to adaptively fuse features by considering similarities, differences, and complementary aspects between modalities. In addition, a contrastive edge loss (CELoss) is developed for sufficient enhancement of edge textures and contrast in medical images.
Submitted 14 April, 2025;
originally announced April 2025.
-
Effects of strange molecular partners of $P_c$ states in $γp \to K Σ$ reactions
Authors:
Jian-Cheng Suo,
Di Ben,
Bing-Song Zou
Abstract:
Our previous studies revealed evidence of the strange molecular partners of $P_c$ states, $N(2080)3/2^-$ and $N(2270)3/2^-$, in the $γp \to K^{*+} Σ^0 / K^{*0} Σ^+$ and $γp \to φp$ reactions. Motivated by the differential cross-section data for $γp \to K^+ Σ^0$ from CLAS 2010, which exhibit some bump structures at $W \approx$ 1875, 2080 and 2270 MeV, we extend our previous analysis by investigating the effects of $N(1535)1/2^-$, $N(1875)3/2^-$, $N(2080)1/2^- \&\ 3/2^-$ and $N(2270)1/2^- , 3/2^- \&\ 5/2^-$, as strange partners of $P_c$ molecular states, in the reactions $γp \to K^+ Σ^0$ and $γp \to K^0 Σ^+$. The theoretical model employed in this study utilizes an effective Lagrangian approach in the tree-level Born approximation. It contains the contributions from $s$-channel with exchanges of $N$, $Δ$, $N^*$ (including the hadronic molecules with hidden strangeness), and $Δ^*$; $t$-channel; $u$-channel; and the generalized contact term. The results corresponding to the final fitted parameters are in good agreement with all available experimental data of both cross-sections and polarization observables for $γp \to K^+ Σ^0$ and $γp \to K^0 Σ^+$. Notably, the $s$-channel exchanges of molecules significantly contribute to the bump structures in cross-sections for $γp \to K Σ$ at $W \approx$ 1900, 2080 and 2270 MeV, and show considerable coherence with contributions from $s$-channel exchanges of general resonances to construct the overall structures of cross-sections. More experimental data, particularly for the reaction $γp \to K^0 Σ^+$, are necessary to further strengthen the constraints on the theoretical models.
Submitted 8 April, 2025;
originally announced April 2025.
-
Disentangling Instruction Influence in Diffusion Transformers for Parallel Multi-Instruction-Guided Image Editing
Authors:
Hui Liu,
Bin Zou,
Suiyun Zhang,
Kecheng Chen,
Rui Liu,
Haoliang Li
Abstract:
Instruction-guided image editing enables users to specify modifications using natural language, offering more flexibility and control. Among existing frameworks, Diffusion Transformers (DiTs) outperform U-Net-based diffusion models in scalability and performance. However, while real-world scenarios often require concurrent execution of multiple instructions, step-by-step editing suffers from accumulated errors and degraded quality, and integrating multiple instructions with a single prompt usually results in incomplete edits due to instruction conflicts. We propose Instruction Influence Disentanglement (IID), a novel framework enabling parallel execution of multiple instructions in a single denoising process, designed for DiT-based models. By analyzing self-attention mechanisms in DiTs, we identify distinctive attention patterns in multi-instruction settings and derive instruction-specific attention masks to disentangle each instruction's influence. These masks guide the editing process to ensure localized modifications while preserving consistency in non-edited regions. Extensive experiments on open-source and custom datasets demonstrate that IID reduces diffusion steps while improving fidelity and instruction completion compared to existing baselines. The code will be publicly released upon acceptance of the paper.
Submitted 7 April, 2025;
originally announced April 2025.
-
Study of Nucleon and $Δ$ resonances from a Systematic Analysis of $K^\astΣ$ Photoproduction
Authors:
Jun Shi,
Bing-Song Zou
Abstract:
A systematic analysis of $K^\astΣ$ photoproduction off the proton is performed with all the available differential cross section data. We adopt a strategy different from previous studies of these reactions: instead of fixing the parameters of the added resonances from the PDG or pentaquark models, we add a resonance with a specific $J^P$ and leave its parameters to be determined from the experimental data. When adding only one resonance, the best result is obtained with one $N(3/2^-)$ resonance around $2097$ MeV, which substantially reduces the $χ^2$ per degree of freedom to $1.35$. There are two best solutions when adding two resonances: one adds one $N^\ast(3/2^-)$ and one $N^\ast(7/2^-)$, the other adds one $N^\ast(3/2^-)$ and one $Δ^\ast(7/2^-)$, giving a $χ^2$ per degree of freedom of $1.10$ and $1.09$, respectively. The mass values of the $N^\ast(3/2^-)$ in the two-resonance solutions are both near $2070$ MeV. Our solutions indicate that the $N(3/2^-)$ resonance around $2080$ MeV is strongly coupled with the $K^\astΣ$ final state and support the molecular picture of $N(2080, 3/2^-)$.
Submitted 5 April, 2025;
originally announced April 2025.
-
Two-pole structures in QCD -- a universal phenomenon governed by chiral dynamics
Authors:
Jia-Ming Xie,
Jun-Xu Lu,
Li-Sheng Geng,
Bing-Song Zou
Abstract:
We illustrate how the two-pole structures of the $Λ(1405)$ emerge from the underlying universal chiral dynamics that describe the coupled-channel interactions between octet baryons and pseudo-Nambu-Goldstone bosons. Specifically, we attribute this phenomenon to the form of the leading-order chiral potential, which is of the Weinberg-Tomozawa type. We reveal how the underlying chiral dynamics can be exposed by examining the light-quark mass evolution of the two poles. The latest lattice QCD simulations have indeed found evidence for the existence of the two poles of $Λ(1405)$, in qualitative agreement with our predictions. We briefly mention a recent work in which lattice QCD simulations are studied more quantitatively, along with a proposal for how the SU(3) flavor content of the two poles of $Λ(1405)$ can be experimentally verified.
Submitted 4 April, 2025;
originally announced April 2025.
-
Large Language Models for Traffic and Transportation Research: Methodologies, State of the Art, and Future Opportunities
Authors:
Yimo Yan,
Yejia Liao,
Guanhao Xu,
Ruili Yao,
Huiying Fan,
Jingran Sun,
Xia Wang,
Jonathan Sprinkle,
Ziyan An,
Meiyi Ma,
Xi Cheng,
Tong Liu,
Zemian Ke,
Bo Zou,
Matthew Barth,
Yong-Hong Kuo
Abstract:
The rapid rise of Large Language Models (LLMs) is transforming traffic and transportation research, with significant advancements emerging between the years 2023 and 2025 -- a period marked by the inception and swift growth of adopting and adapting LLMs for various traffic and transportation applications. However, despite these significant advancements, a systematic review and synthesis of the existing studies remain lacking. To address this gap, this paper provides a comprehensive review of the methodologies and applications of LLMs in traffic and transportation, highlighting their ability to process unstructured textual data to advance transportation research. We explore key applications, including autonomous driving, travel behavior prediction, and general transportation-related queries, alongside methodologies such as zero- or few-shot learning, prompt engineering, and fine-tuning. Our analysis identifies critical research gaps. From the methodological perspective, many research gaps can be addressed by integrating LLMs with existing tools and refining LLM architectures. From the application perspective, we identify numerous opportunities for LLMs to tackle a variety of traffic and transportation challenges, building upon existing research. By synthesizing these findings, this review not only clarifies the current state of LLM adoption and adaptation in traffic and transportation but also proposes future research directions, paving the way for smarter and more sustainable transportation systems.
Submitted 27 March, 2025;
originally announced March 2025.
-
A Comprehensive Survey on Magnetic Resonance Image Reconstruction
Authors:
Xiaoyan Kui,
Zijie Fan,
Zexin Ji,
Qinsong Li,
Chengtao Liu,
Weixin Si,
Beiji Zou
Abstract:
Magnetic resonance imaging (MRI) reconstruction is a fundamental task aimed at recovering high-quality images from undersampled or low-quality MRI data. This process enhances diagnostic accuracy and optimizes clinical applications. In recent years, deep learning-based MRI reconstruction has made significant progress. Advancements include single-modality feature extraction using different network architectures, the integration of multimodal information, and the adoption of unsupervised or semi-supervised learning strategies. However, despite extensive research, MRI reconstruction remains a challenging problem that has yet to be fully resolved. This survey provides a systematic review of MRI reconstruction methods, covering key aspects such as data acquisition and preprocessing, publicly available datasets, single and multi-modal reconstruction models, training strategies, and evaluation metrics based on image reconstruction and downstream tasks. Additionally, we analyze the major challenges in this field and explore potential future directions.
Submitted 10 March, 2025;
originally announced March 2025.
-
CAFusion: Controllable Anatomical Synthesis of Perirectal Lymph Nodes via SDF-guided Diffusion
Authors:
Weidong Guo,
Hantao Zhang,
Shouhong Wan,
Bingbing Zou,
Wanqin Wang,
Chenyang Qiu,
Peiquan Jin
Abstract:
Lesion synthesis methods have made significant progress in generating large-scale synthetic datasets. However, existing approaches predominantly focus on texture synthesis and often fail to accurately model masks for anatomically complex lesions. Additionally, these methods typically lack precise control over the synthesis process. For example, perirectal lymph nodes, which range in diameter from 1 mm to 10 mm, exhibit irregular and intricate contours that are challenging for current techniques to replicate faithfully. To address these limitations, we introduce CAFusion, a novel approach for synthesizing perirectal lymph nodes. By leveraging Signed Distance Functions (SDF), CAFusion generates highly realistic 3D anatomical structures. Furthermore, it offers flexible control over both anatomical and textural features by decoupling the generation of morphological attributes (such as shape, size, and position) from textural characteristics, including signal intensity. Experimental results demonstrate that our synthetic data substantially improve segmentation performance, achieving a 6.45% increase in the Dice coefficient. In the visual Turing test, experienced radiologists found it challenging to distinguish between synthetic and real lesions, highlighting the high degree of realism and anatomical accuracy achieved by our approach. These findings validate the effectiveness of our method in generating high-quality synthetic lesions for advancing medical image processing applications.
Submitted 10 March, 2025;
originally announced March 2025.
-
The $J/ψ$-nucleon interaction mechanism: A theoretical study based on scattering length
Authors:
Bing Wu,
Xiang-Kun Dong,
Meng-Lin Du,
Feng-Kun Guo,
Bing-Song Zou
Abstract:
The low-energy $J/ψN$ scattering is of significant importance for various reasons. It is deeply interconnected with the hidden-charm $P_c$ pentaquark states, provides insights into the role of gluons in nucleon structures, and is pertinent to the properties of $J/ψ$ in nuclear medium. The scattering can occur through two distinct mechanisms: the coupled-channel mechanism involving open-charm meson-baryon intermediate states $Λ_c \bar D^{(*)}$ and $ Σ_c^{(*)}\bar D^{(*)}$, and the soft-gluon exchange mechanism. In this study, we investigate the $S$-wave $J/ψN$ scattering length arising from both mechanisms. Our findings indicate that both mechanisms lead to attractive interactions, yielding scattering lengths of $[-10, -0.1] \times 10^{-3}$ fm for the coupled-channel mechanism and $<-0.16$ fm for the soft-gluon exchange mechanism, respectively. Notably, the soft-gluon exchange mechanism produces a scattering length that is at least one order of magnitude larger than that from the coupled-channel mechanism, indicating its predominance. These findings can be corroborated through lattice calculations and will enhance our understanding of scattering processes that violate the Okubo-Zweig-Iizuka rule.
Submitted 1 March, 2025;
originally announced March 2025.
-
Humanoid-VLA: Towards Universal Humanoid Control with Visual Integration
Authors:
Pengxiang Ding,
Jianfei Ma,
Xinyang Tong,
Binghong Zou,
Xinxin Luo,
Yiguo Fan,
Ting Wang,
Hongchao Lu,
Panzhong Mo,
Jinxin Liu,
Yuefan Wang,
Huaicheng Zhou,
Wenshuo Feng,
Jiacheng Liu,
Siteng Huang,
Donglin Wang
Abstract:
This paper addresses the limitations of current humanoid robot control frameworks, which primarily rely on reactive mechanisms and lack autonomous interaction capabilities due to data scarcity. We propose Humanoid-VLA, a novel framework that integrates language understanding, egocentric scene perception, and motion control, enabling universal humanoid control. Humanoid-VLA begins with language-motion pre-alignment using non-egocentric human motion datasets paired with textual descriptions, allowing the model to learn universal motion patterns and action semantics. We then incorporate egocentric visual context through parameter-efficient video-conditioned fine-tuning, enabling context-aware motion generation. Furthermore, we introduce a self-supervised data augmentation strategy that automatically generates pseudo-annotations derived directly from motion data. This process converts raw motion sequences into informative question-answer pairs, facilitating the effective use of large-scale unlabeled video data. Extensive experiments show that Humanoid-VLA, built upon whole-body control architectures, accomplishes object interaction and environment exploration tasks with enhanced contextual awareness, demonstrating a more human-like capacity for adaptive and intelligent engagement.
Submitted 21 February, 2025; v1 submitted 20 February, 2025;
originally announced February 2025.
-
Establishing the vector-meson-exchange dominance for the short range interactions of light quarks
Authors:
Bing-Song Zou
Abstract:
I give a brief comment on a recent study revealing vector meson exchange (VME) dominance in the short-range interactions between u/d quarks in the $NN$, $D_{03}$, and $D_{30}$ systems. The finding agrees nicely with an earlier study of hadron spectroscopy using a quark model with hidden local symmetry, which also favors VME dominance in the short-range interactions of light quarks. VME dominance in the short-range interactions of light quarks gives a natural explanation of the empirical VME dominance in the interactions between hadrons containing light quarks.
Submitted 6 February, 2025;
originally announced February 2025.
-
Competition between excitonic insulators and quantum Hall states in correlated electron-hole bilayers
Authors:
Ruishi Qi,
Qize Li,
Zuocheng Zhang,
Zhiyuan Cui,
Bo Zou,
Haleem Kim,
Collin Sanborn,
Sudi Chen,
Jingxu Xie,
Takashi Taniguchi,
Kenji Watanabe,
Michael F. Crommie,
Allan H. MacDonald,
Feng Wang
Abstract:
Excitonic insulators represent a unique quantum phase of matter, providing a rich ground for studying exotic quantum bosonic states. Strongly coupled electron-hole bilayers, which host stable dipolar exciton fluids with an exciton density that can be adjusted electrostatically, offer an ideal platform to investigate correlated excitonic insulators. Based on electron-hole bilayers made of MoSe2/hBN/WSe2 heterostructures, here we study the behavior of excitonic insulators in a perpendicular magnetic field. We report the observation of excitonic quantum oscillations in both Coulomb drag signals and electrical resistance at low to medium magnetic fields. Under a strong magnetic field, we identify multiple quantum phase transitions between the excitonic insulator phase and the bilayer quantum Hall insulator phase. These findings underscore the interplay between the electron-hole interactions and Landau level quantization that opens new possibilities for exploring quantum phenomena in composite bosonic insulators.
Submitted 30 January, 2025;
originally announced January 2025.
-
Quantum oscillations in a dipolar excitonic insulator
Authors:
Phuong X. Nguyen,
Raghav Chaturvedi,
Bo Zou,
Kenji Watanabe,
Takashi Taniguchi,
Allan H. MacDonald,
Kin Fai Mak,
Jie Shan
Abstract:
Quantum oscillations in magnetization or resistivity are a defining feature of metals subject to an external magnetic field. The phenomenon is generally not expected in insulators without a Fermi surface. The observations of quantum oscillations in Kondo insulating materials have provided a rare counterexample and attracted much theoretical interest. However, the magnetic oscillations in correlated insulators remain poorly understood. Here we report the observations of resistivity quantum oscillations in an excitonic insulator realized in Coulomb-coupled electron-hole double layers with gate-tunability that allows the phenomenon to be explored in a more controllable fashion than in bulk materials. When the cyclotron energy of the electrons or holes is tuned to be comparable to or larger than the exciton binding energy, recurring transitions between excitonic insulators and electron-hole decoupled quantum Hall states are observed. Compressibility measurements show an oscillatory exciton binding energy as a function of magnetic field and electron-hole pair density. Coulomb drag measurements further reveal the formation of excitons with finite angular momentum. Our results are qualitatively captured by mean-field theory calculations. The study demonstrates a new platform for studying quantum oscillations in correlated insulators.
Submitted 29 January, 2025;
originally announced January 2025.
-
AdaCoT: Rethinking Cross-Lingual Factual Reasoning through Adaptive Chain-of-Thought
Authors:
Xin Huang,
Tarun Kumar Vangani,
Zhengyuan Liu,
Bowei Zou,
Ai Ti Aw
Abstract:
Large language models have shown impressive multilingual capabilities through pretraining on diverse corpora. While these models show strong reasoning abilities, their performance varies significantly across languages due to imbalanced training data distribution. Existing approaches using sample-level translation for extensive multilingual pretraining and cross-lingual tuning face scalability challenges and often fail to capture nuanced reasoning processes across languages. In this paper, we introduce AdaCoT (Adaptive Chain-of-Thought), a framework that enhances multilingual factual reasoning by dynamically routing thought processes in intermediary ``thinking languages'' before generating target-language responses. AdaCoT leverages a language-agnostic core and incorporates an adaptive, reward-based mechanism for selecting optimal reasoning pathways without requiring additional pretraining. Our comprehensive evaluation across multiple benchmarks demonstrates substantial improvements in both factual reasoning quality and cross-lingual consistency, with particularly strong performance gains in low-resource language settings. The results suggest that adaptive reasoning paths can effectively bridge the performance gap between high and low-resource languages while maintaining cultural and linguistic nuances.
Submitted 9 May, 2025; v1 submitted 27 January, 2025;
originally announced January 2025.
-
Optimal Insurance under Endogenous Default and Background Risk
Authors:
Zongxia Liang,
Zhaojie Ren,
Bin Zou
Abstract:
This paper studies an optimal insurance problem for a utility-maximizing buyer of insurance, subject to the seller's endogenous default and background risk. An endogenous default occurs when the buyer's contractual indemnity exceeds the seller's available reserve, which is random due to the background risk. We obtain an analytical solution to the optimal contract for two types of contracts, differentiated by whether their indemnity functions depend on the seller's background risk. The results shed light on the joint effect of the seller's default and background risk on the buyer's insurance demand.
Submitted 9 January, 2025;
originally announced January 2025.
-
Enhancing LLM Reasoning with Multi-Path Collaborative Reactive and Reflection agents
Authors:
Chengbo He,
Bochao Zou,
Xin Li,
Jiansheng Chen,
Junliang Xing,
Huimin Ma
Abstract:
Agents have demonstrated their potential in scientific reasoning tasks through large language models. However, they often face challenges such as insufficient accuracy and degeneration of thought when handling complex reasoning tasks, which impede their performance. To overcome these issues, we propose the Reactive and Reflection agents with Multi-Path Reasoning (RR-MP) Framework, aimed at enhancing the reasoning capabilities of LLMs. Our approach improves scientific reasoning accuracy by employing a multi-path reasoning mechanism where each path consists of a reactive agent and a reflection agent that collaborate to prevent degeneration of thought inherent in single-agent reliance. Additionally, the RR-MP framework does not require additional training; it utilizes multiple dialogue instances for each reasoning path and a separate summarizer to consolidate insights from all paths. This design integrates diverse perspectives and strengthens reasoning across each path. We conducted zero-shot and few-shot evaluations on tasks involving moral scenarios, college-level physics, and mathematics. Experimental results demonstrate that our method outperforms baseline approaches, highlighting the effectiveness and advantages of the RR-MP framework in managing complex scientific reasoning tasks.
Submitted 2 January, 2025; v1 submitted 31 December, 2024;
originally announced January 2025.
-
Decoding spin-parity quantum numbers and decay widths of double $J/ψ$ exotic states
Authors:
Kaiwen Chen,
Feng-Xiao Liu,
Qiang Zhao,
Xian-Hui Zhong,
Ruilin Zhu,
Bing-Song Zou
Abstract:
We derive helicity amplitudes for the decays of fully charmed tetraquark states into vector meson pairs under two types of models, one from the quark model and the other from heavy quark effective theory. The angular distributions are given for the cascade decays $T_{4c}\to J/ψ(D_{(s)}^*)+J/ψ(\bar{D}_{(s)}^*)$ along with $J/ψ\to μ^++μ^-$ or $D_{(s)}^*\to D_{(s)}+π$, showing that spin-0 and spin-2 states can be distinguished. If we assume quantum entanglement as a fundamental principle, there is a strict constraint formula for the helicity amplitudes. These findings will assist in experimentally differentiating various spin-parity states, determining decay widths, and hunting for unobserved structures, thereby shedding light on the internal properties of double $J/ψ$ exotic states.
Submitted 17 December, 2024;
originally announced December 2024.
-
$\textrm{A}^{\textrm{2}}$RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion
Authors:
Jiawei Li,
Hongwei Yu,
Jiansheng Chen,
Xinlong Ding,
Jinlong Wang,
Jinyuan Liu,
Bochao Zou,
Huimin Ma
Abstract:
Infrared and visible image fusion (IVIF) is a crucial technique for enhancing visual performance by integrating unique information from different modalities into one fused image. Existing methods pay more attention to conducting fusion with undisturbed data, while overlooking the impact of deliberate interference on the effectiveness of fusion results. To investigate the robustness of fusion models, in this paper, we propose a novel adversarial attack resilient network, called $\textrm{A}^{\textrm{2}}$RNet. Specifically, we develop an adversarial paradigm with an anti-attack loss function to implement adversarial attacks and training. It is constructed based on the intrinsic nature of IVIF and provides a robust foundation for future research advancements. We adopt a Unet as the pipeline with a transformer-based defensive refinement module (DRM) under this paradigm, which guarantees fused image quality in a robust coarse-to-fine manner. Compared to previous works, our method mitigates the adverse effects of adversarial perturbations, consistently maintaining high-fidelity fusion results. Furthermore, the performance of downstream tasks can also be well maintained under adversarial attacks. Code is available at https://github.com/lok-18/A2RNet.
Submitted 13 February, 2025; v1 submitted 13 December, 2024;
originally announced December 2024.
-
Investigation of $Λ_c$ States and $(\bar{D}N)$ Molecules Production at EicC and EIC
Authors:
Kai-Sa Qiao,
Bing-Song Zou
Abstract:
We explore various $Λ_c$ states, including $Λ_c$, $Λ_c(2595)$, $Λ_c(2940)$, and the predicted $(\bar{D}N)$ hadronic molecular states, in photoproduction and electroproduction to estimate their yields at EicC and EIC. Assuming $Λ_c(2940)$ as either a hadronic molecular state or a three-quark state, our analysis demonstrates that its production rates are of the same order of magnitude, posing challenges in identifying its underlying structure. After considering the integrated luminosity, the yields of $Λ_c$ excited states reach $10^6$ to $10^7$ at EicC and EIC. The $(\bar{D}N)$ molecular states with both isospin $I = 0$ and $I = 1$ are also studied, with yields reaching $10^5$, making them likely to be detectable at these facilities.
Submitted 16 December, 2024; v1 submitted 4 December, 2024;
originally announced December 2024.
-
Production of exotic hadrons in $pp$ and nuclear collisions
Authors:
Jinhui Chen,
Feng-Kun Guo,
Yu-Gang Ma,
Cheng-Ping Shen,
Qiye Shou,
Qian Wang,
Jia-Jun Wu,
Bing-Song Zou
Abstract:
Exotic hadrons beyond the conventional quark model have been discovered in the past two decades. Investigations of these states can lead to a deeper understanding of the nonperturbative dynamics of the strong interaction. In this concise review, we focus on the production of exotic hadrons in $pp$, $p\bar p$, and nuclear collisions. Experimental observations of light nuclei and hypernuclei, as prototypes of hadronic molecules, in heavy ion collisions will also be briefly discussed.
Submitted 27 November, 2024;
originally announced November 2024.
-
Magnetic polaronic exciton in A-type 2D van der Waals bulk material CrSBr
Authors:
Xiaodong Shen,
Jiajun Cao,
Weizheng Liang,
Borong Cong,
Bao Ke,
Jialong Zhao,
Bingsuo Zou
Abstract:
The 2D magnetic semiconductor CrSBr exhibits unique magneto-optical properties, yet its electronic structure and photophysical mechanisms remain unclear at high magnetic field and low temperature. Through comprehensive spectroscopic investigations, its charge-transfer band edge is identified at 500 nm. Below this band edge, local excitonic magnetic polaronic states from Cr3+ ions out of FM aggregates in layer and bilayer could be seen due to phonon-spin-exciton coupling, in which magnetic polaronic PL1 emission occurs at 720 nm from single Cr3+ d-d transition, a dark-state pair exciton occurs at 850 nm in 10 K magnetic field, and double-peak PL2 emission at 920 nm out of Cr3+ FM trimer in monolayer is seen; besides, the magnetic bi-polaronic PL3 at 990 nm can be assigned to Cr3+ tetramers between FM adjacent layers. In a magnetic field perpendicular to the layer, direct competition between PL1 and dark-state excitons and PL2 and PL3 excitonic states persists at different temperatures. This study sheds light on the complicated magneto-exciton interactions in the multi-body effect of CrSBr, beneficial for quantum modulation in layered magnetic semiconductors.
Submitted 26 November, 2024;
originally announced November 2024.
-
IC Mechanisms for Risk-Averse Advertisers in the Online Advertising System
Authors:
Bingzhe Wang,
Ruohan Qian,
Yuejia Dou,
Qi Qi,
Bo Shen,
Changyuan Li,
Yixuan Zhang,
Yixin Su,
Xin Yuan,
Wenqiang Liu,
Bin Zou,
Wen Yi,
Zhi Guo,
Shuanglong Li,
Liu Lin
Abstract:
The autobidding system generates huge revenue for advertising platforms, garnering substantial research attention. Existing studies in autobidding systems focus on designing Autobidding Incentive Compatible (AIC) mechanisms, where the mechanism is Incentive Compatible (IC) under ex ante expectations. However, upon deploying AIC mechanisms in advertising platforms, we observe a notable deviation between the actual auction outcomes and these expectations during runtime, particularly in scenarios with few clicks (sparse-click). This discrepancy undermines truthful bidding among advertisers in AIC mechanisms, especially for risk-averse advertisers who are averse to outcomes that do not align with the expectations. To address this issue, we propose a mechanism, Decoupled First-Price Auction (DFP), that retains its IC property even during runtime. DFP dynamically adjusts the payment based on real-time user conversion outcomes, ensuring that advertisers' realized utilities closely approximate their expected utilities during runtime. To realize the payment mechanism of DFP, we propose a PPO-based RL algorithm with a meticulously crafted reward function. This algorithm dynamically adjusts the payment to fit the DFP mechanism. We conduct extensive experiments leveraging real-world data to validate our findings.
Submitted 20 November, 2024;
originally announced November 2024.
-
Vortex lattice states of bilayer electron-hole fluids in quantizing magnetic fields
Authors:
Bo Zou,
Allan H. MacDonald
Abstract:
We show that the ground state of a weakly charged two-dimensional electron-hole fluid in a strong magnetic field is a broken translation symmetry state with interpenetrating lattices of localized vortices and antivortices in the electron-hole-pair field. The vortices and antivortices carry fractional charges of equal sign but unequal magnitude and have a honeycomb lattice structure that contrasts with the triangular lattices of superconducting electron-electron-pair vortex lattices. We predict that increasing charge density or weakening magnetic field drives a vortex delocalization transition that would be signaled experimentally by an abrupt increase in counterflow transport resistance.
Submitted 27 May, 2025; v1 submitted 13 November, 2024;
originally announced November 2024.
-
Atomic-scale study on core-shell Cu precipitation in steels: atom probe tomography and ab initio calculations
Authors:
Xiao Shen,
YiXu Wang,
Zigan Xu,
Bowen Zou,
Enzo Liotti,
Richard Dronskowski,
Wenwen Song
Abstract:
The present work investigates the atomic interactions among Cu, Al, and Ni elements in bcc-iron matrix, focusing on the formation mechanism of nano-sized core-shell Cu precipitates. Using a combination of atom probe tomography (APT), density functional theory (DFT) calculations, and molecular dynamics (MD) simulations, the study provides insights into the atomic-scale migration tendencies of these elements in the supersaturated solid solution surrounding Cu precipitate in the martensite phase of a medium-Mn steel. The results show that Ni and Al atoms were not expelled by Cu atoms but were instead attracted to the bcc iron matrix, forming a stable co-segregation in the outer shell. This phase effectively surrounded the nano-sized Cu precipitate and prevented its rapid growth, contributing to improved mechanical properties. The findings offer a theoretical method for developing Cu-contaminated circular steels by utilizing DFT calculations to unravel bonding preferences and assess the potential for forming a stable precipitation phase around nano-sized Cu precipitates.
Submitted 12 November, 2024;
originally announced November 2024.
-
Deciphering the mechanism of $J/ψ$-nucleon scattering
Authors:
Bing Wu,
Xiang-Kun Dong,
Meng-Lin Du,
Feng-Kun Guo,
Bing-Song Zou
Abstract:
The low-energy $J/ψN$ scattering is important for various reasons: it is related to the hidden-charm $P_c$ pentaquark states, provides insights into the role of gluons in nucleon structures, and is relevant to the $J/ψ$ properties in nuclear medium. The scattering can happen through two distinct mechanisms: the coupled-channel mechanism via open-charm meson-baryon intermediate states, and the soft-gluon exchange mechanism. We investigate the $J/ψN$ $S$-wave scattering length through both mechanisms, and find that the soft-gluon exchange mechanism leads to a scattering length at least one order of magnitude larger than that from the coupled-channel mechanism and thus is the predominant one. The findings can be verified by lattice calculations and will enhance our understanding of the scattering processes breaking the Okubo-Zweig-Iizuka rule.
Submitted 25 October, 2024;
originally announced October 2024.
-
Contrasting results of surface metrology techniques for three-dimensional human fingerprints
Authors:
Brian Lee Beatty,
Shani Kahan,
Burcak Bas,
Bettina Zou,
Nicole Werpachowski
Abstract:
Fingerprints, otherwise known as dermatoglyphs, are most commonly thought of in the context of identification, but have myriad other roles in human biology. They are formed by the restricted ability of ridges and furrows of the epidermis to flatten. The patterns these ridges and furrows make can be represented as 2D fingerprints, but also as 3D structures with cross-sectional shapes that may add new levels of detail to identification, forensic, and behavioral uses/studies. Surface metrology techniques better allow for the quantification of these features, though it is unclear which tool and which scale are most appropriate. A Sensofar S Neox white light reflectance confocal microscope and a Gelsight Mobile 2 were used to independently measure the surface roughness of the fingerprints of four individuals from preserved cadaveric remains. Scale-sensitive fractal analyses (SSFA) were performed on the data from the S Neox (a small area), Gelsight (a larger area), and the same Gelsight datasets cropped down to the size of the S Neox scan size. Though fewer SSFA parameters identified differences between individuals from the smaller, extracted Gelsight area, all three forms of measurement found significant differences between some individuals from the study. No significant differences were found between the fingers themselves. Though only an initial step, these data suggest that a variety of surface metrology techniques may be useful in differentiating individuals.
Submitted 16 October, 2024;
originally announced October 2024.
-
TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models
Authors:
Mu Cai,
Reuben Tan,
Jianrui Zhang,
Bocheng Zou,
Kai Zhang,
Feng Yao,
Fangrui Zhu,
Jing Gu,
Yiwu Zhong,
Yuzhang Shang,
Yao Dou,
Jaden Park,
Jianfeng Gao,
Yong Jae Lee,
Jianwei Yang
Abstract:
Understanding fine-grained temporal dynamics is crucial for multimodal video comprehension and generation. Due to the lack of fine-grained temporal annotations, existing video benchmarks mostly resemble static image benchmarks and are inadequate for evaluating models for temporal understanding. In this paper, we introduce TemporalBench, a new benchmark dedicated to evaluating fine-grained temporal understanding in videos. TemporalBench consists of ~10K video question-answer pairs, derived from ~2K high-quality human annotations detailing the temporal dynamics in video clips. As a result, our benchmark provides a unique testbed for evaluating various temporal understanding and reasoning abilities such as action frequency, motion magnitude, event order, etc. Moreover, it enables evaluation on various tasks, such as video question answering and captioning and short and long video understanding, as well as of different models, such as multimodal video embedding models and text generation models. Results show that state-of-the-art models like GPT-4o achieve only 38.5% question answering accuracy on TemporalBench, demonstrating a significant gap (~30%) between humans and AI in temporal understanding. Furthermore, we notice a critical pitfall for multi-choice QA where LLMs can detect the subtle changes in negative captions and find a centralized description as a cue for their predictions; we propose Multiple Binary Accuracy (MBA) to correct this bias. We hope that TemporalBench can foster research on improving models' temporal reasoning capabilities. Both dataset and evaluation code will be made available.
Submitted 15 October, 2024; v1 submitted 14 October, 2024;
originally announced October 2024.
-
Peetre conjecture on real interpolation spaces of Besov spaces and Grid K functional
Authors:
Qixiang Yang,
Haibo Yang,
Bin Zou,
Jianxun He
Abstract:
In this paper, Peetre's conjecture about the real interpolation space of Besov space is solved completely by using the classification of vertices of cuboids defined by wavelet coefficients and wavelet's grid structure. Littlewood-Paley analysis provides only a decomposition of the function on the ring. We extend Lorentz's rearrangement function and Hunt's Marcinkiewicz interpolation theorem to more general cases. We use the method of calculating the topological quantity of the grid to replace the traditional methods of data classification such as gradient descent method and distributed algorithm.
We developed a series of new techniques to solve this longstanding open problem. These techniques make up for the deficiency of Lions-Peetre iterative theorem in dealing with strong nonlinearity. Using the properties of wavelet basis, a series of functional nonlinearities are studied. Using the lattice property of wavelet, we study the lattice topology. By three kinds of topology nonlinearities, we give the specific wavelet expression of K functional.
Submitted 12 September, 2024;
originally announced October 2024.
-
RNA-Protein Interaction Prediction Based on Deep Learning: A Comprehensive Survey
Authors:
Danyu Li,
Rubing Huang,
Chenhui Cui,
Dave Towey,
Ling Zhou,
Jinyu Tian,
Bin Zou
Abstract:
The interaction between Ribonucleic Acids (RNAs) and proteins, also called RNA Protein Interaction (RPI), plays an important role in the life activities of organisms, including in various regulatory processes, such as gene splicing, gene localization, and disease pathogenesis. RPI Prediction (RPIP) predicts the interactions between RNAs and proteins, which includes looking for the existence of interactions and the binding sites of interactions, and adding RNA-protein functional annotations (such as immunity regulation, neuroprotection, etc.). Due to the huge amounts of complex biological data, Deep Learning-based RPIP (DL-based RPIP) has been widely investigated, as it can extract high-dimensional features from data and make accurate predictions. Over the last decade, there have been many achievements and contributions in DL-based RPIP. Although some previous studies review DL-based RPIP, to the best of our knowledge, there is still a lack of a comprehensive survey. In this paper, we extensively survey DL-based RPIP in terms of its entire process, including: feature encoding, deep learning modeling, results evaluation, RPIP application domains, and available websites and software. We also identify some open research challenges, and discuss the potential future work for DL-based RPIP.
Submitted 30 September, 2024;
originally announced October 2024.
-
Exploring Information-Theoretic Metrics Associated with Neural Collapse in Supervised Training
Authors:
Kun Song,
Zhiquan Tan,
Bochao Zou,
Jiansheng Chen,
Huimin Ma,
Weiran Huang
Abstract:
In this paper, we introduce matrix entropy as an analytical tool for studying supervised learning, investigating the information content of data representations and classification head vectors, as well as the dynamic interactions between them during the supervised learning process. Our experimental results reveal that matrix entropy effectively captures the variations in information content of data representations and classification head vectors as neural networks approach Neural Collapse during supervised training, while also serving as a robust metric for measuring similarity among data samples. Leveraging this property, we propose Cross-Model Alignment (CMA) loss to optimize the fine-tuning of pretrained models. To characterize the dynamics of neural networks nearing the Neural Collapse state, we introduce two novel metrics: the Matrix Mutual Information Ratio (MIR) and the Matrix Entropy Difference Ratio (HDR), which quantitatively assess the interactions between data representations and classification heads in supervised learning, with theoretical optimal values derived under the Neural Collapse state. Our experiments demonstrate that MIR and HDR effectively explain various phenomena in neural networks, including the dynamics of standard supervised training and linear mode connectivity. Moreover, we use MIR and HDR to analyze the dynamics of grokking, which is a fascinating phenomenon in supervised learning where a model unexpectedly exhibits generalization long after achieving training data fit.
Submitted 28 February, 2025; v1 submitted 25 September, 2024;
originally announced September 2024.
-
Synergistic Spotting and Recognition of Micro-Expression via Temporal State Transition
Authors:
Bochao Zou,
Zizheng Guo,
Wenfeng Qin,
Xin Li,
Kangsheng Wang,
Huimin Ma
Abstract:
Micro-expressions are involuntary facial movements that cannot be consciously controlled, conveying subtle cues with substantial real-world applications. The analysis of micro-expressions generally involves two main tasks: spotting micro-expression intervals in long videos and recognizing the emotions associated with these intervals. Previous deep learning methods have primarily relied on classification networks utilizing sliding windows. However, fixed window sizes and window-level hard classification introduce numerous constraints. Additionally, these methods have not fully exploited the potential of complementary pathways for spotting and recognition. In this paper, we present a novel temporal state transition architecture grounded in the state space model, which replaces conventional window-level classification with video-level regression. Furthermore, by leveraging the inherent connections between spotting and recognition tasks, we propose a synergistic strategy that enhances overall analysis performance. Extensive experiments demonstrate that our method achieves state-of-the-art performance. The codes and pre-trained models are available at https://github.com/zizheng-guo/ME-TST.
Submitted 15 September, 2024;
originally announced September 2024.
-
CCAT: A status update on the EoR-Spec instrument module for Prime-Cam
Authors:
Rodrigo Freundt,
Yaqiong Li,
Doug Henke,
Jason Austermann,
James R. Burgoyne,
Scott Chapman,
Steve K. Choi,
Cody J. Duell,
Zach Huber,
Michael Niemack,
Thomas Nikola,
Lawrence Lin,
Dominik A. Riechers,
Gordon Stacey,
Anna K. Vaskuri,
Eve M. Vavagiakis,
Jordan Wheeler,
Bugao Zou
Abstract:
The Epoch of Reionization Spectrometer (EoR-Spec) is an upcoming Line Intensity Mapping (LIM) instrument designed to study the evolution of the early universe (z = 3.5 to 8) by probing the redshifted [CII] 158 $μ$m fine-structure line from aggregates of galaxies. The [CII] emission is an excellent tracer of star formation since it is the dominant cooling line from neutral gas heated by OB star light and thus can be used to probe the reionization of the early Universe due to star formation. EoR-Spec will be deployed on Prime-Cam, a modular direct-detection receiver for the 6-meter Fred Young Submillimeter Telescope (FYST), currently under construction by CPI Vertex Antennentechnik GmbH and to be installed near the summit of Cerro Chajnantor in the Atacama Desert. This instrument features an image plane populated with more than 6500 Microwave Kinetic Inductance Detectors (MKIDs) that are illuminated by a 4-lens optical design with a cryogenic, scanning Fabry-Perot Interferometer (FPI) at the pupil of the optical system. The FPI is designed to provide a spectral resolving power of $R\sim100$ over the full spectral range of 210--420 GHz. EoR-Spec will tomographically survey the E-COSMOS and E-CDFS fields with a depth of about 4000 hours over a 5 year period. Here we give an update on EoR-Spec's final mechanical/optical design and the current status of fabrication, characterization and testing towards first light in 2026.
Submitted 9 September, 2024;
originally announced September 2024.
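As a rough consistency check on the EoR-Spec abstract above, the quoted redshift range (z = 3.5 to 8) can be mapped onto observed frequencies for the [CII] line and compared with the stated 210--420 GHz band. The sketch below is illustrative only: the rest frequency of about 1900.5 GHz (the 158 μm line) is supplied here rather than taken from the abstract, the function names are hypothetical, and the channel width assumes the usual definition R = ν/Δν with the quoted R of about 100.

```python
# Rest frequency of the [CII] 158 um fine-structure line in GHz.
# Assumption: this value is supplied for illustration, not quoted in the abstract.
CII_REST_GHZ = 1900.537


def observed_frequency_ghz(z, rest_ghz=CII_REST_GHZ):
    """Observed frequency of a line emitted at redshift z."""
    return rest_ghz / (1.0 + z)


def channel_width_ghz(freq_ghz, resolving_power=100.0):
    """Spectral channel width, assuming R = nu / delta_nu."""
    return freq_ghz / resolving_power


if __name__ == "__main__":
    for z in (3.5, 8.0):
        nu = observed_frequency_ghz(z)
        width = channel_width_ghz(nu)
        print(f"z = {z}: [CII] near {nu:.0f} GHz, channel width ~{width:.1f} GHz")
```

Under these assumptions the two redshift limits land close to the edges of the quoted band, and the channel width is a few GHz across it.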
-
Mini-Proceedings of the "Fourth International Workshop on the Extension Project for the J-PARC Hadron Experimental Facility (HEF-ex 2024)"
Authors:
P. Achenbach,
K. Aoki,
S. Aoki,
C. Curceanu,
S. Diehl,
T. Doi,
M. Endo,
M. Fujita,
T. Fukuda,
H. Garcia-Tecocoatzi,
L. S. Geng,
T. Gunji,
C. Hanhart,
M. Harada,
T. Harada,
S. Hayakawa,
B. R. He,
E. Hiyama,
R. Honda,
Y. Ichikawa,
M. Isaka,
D. Jido,
A. Jinno,
K. Kamada,
Y. Kamiya, et al. (36 additional authors not shown)
Abstract:
The mini proceedings of the "Fourth International Workshop on the Extension Project for the J-PARC Hadron Experimental Facility (HEF-ex 2024) [https://kds.kek.jp/event/46965]" held at J-PARC, February 19-21, 2024, are presented. The workshop was devoted to discussing the physics case that connects both the present and the future Hadron Experimental Facility at J-PARC, covering a wide range of topics in flavor, hadron, and nuclear physics related to both experimental and theoretical activities being conducted at the facility.
Submitted 31 August, 2024;
originally announced September 2024.
-
Toward Robust Early Detection of Alzheimer's Disease via an Integrated Multimodal Learning Approach
Authors:
Yifei Chen,
Shenghao Zhu,
Zhaojie Fang,
Chang Liu,
Binfeng Zou,
Yuhe Wang,
Shuo Chang,
Fan Jia,
Feiwei Qin,
Jin Fan,
Yong Peng,
Changmiao Wang
Abstract:
Alzheimer's Disease (AD) is a complex neurodegenerative disorder marked by memory loss, executive dysfunction, and personality changes. Early diagnosis is challenging due to subtle symptoms and varied presentations, often leading to misdiagnosis with traditional unimodal diagnostic methods due to their limited scope. This study introduces an advanced multimodal classification model that integrates clinical, cognitive, neuroimaging, and EEG data to enhance diagnostic accuracy. The model incorporates a feature tagger with a tabular data coding architecture and utilizes the TimesBlock module to capture intricate temporal patterns in Electroencephalogram (EEG) data. By employing a Cross-modal Attention Aggregation module, the model effectively fuses Magnetic Resonance Imaging (MRI) spatial information with EEG temporal data, significantly improving the distinction between AD, Mild Cognitive Impairment, and Normal Cognition. Simultaneously, we have constructed the first AD classification dataset that includes three modalities: EEG, MRI, and tabular data. Our innovative approach aims to facilitate early diagnosis and intervention, potentially slowing the progression of AD. The source code and our private ADMC dataset are available at https://github.com/JustlfC03/MSTNet.
Submitted 3 January, 2025; v1 submitted 29 August, 2024;
originally announced August 2024.
-
LN-Gen: Rectal Lymph Nodes Generation via Anatomical Features
Authors:
Weidong Guo,
Hantao Zhang,
Shouhong Wan,
Bingbing Zou,
Wanqin Wang,
Peiquan Jin
Abstract:
Accurate segmentation of rectal lymph nodes is crucial for the staging and treatment planning of rectal cancer. However, the complexity of the surrounding anatomical structures and the scarcity of annotated data pose significant challenges. This study introduces a novel lymph node synthesis technique aimed at generating diverse and realistic synthetic rectal lymph node samples to mitigate the reliance on manual annotation. Unlike direct diffusion methods, which often produce masks that are discontinuous and of suboptimal quality, our approach leverages an implicit SDF-based method for mask generation, ensuring the production of continuous, stable, and morphologically diverse masks. Experimental results demonstrate that our synthetic data significantly improves segmentation performance. Our work highlights the potential of diffusion models for accurately synthesizing structurally complex lesions, such as lymph nodes in rectal cancer, alleviating the challenge of limited annotated data in this field and aiding in advancements in rectal cancer diagnosis and treatment.
Submitted 27 August, 2024;
originally announced August 2024.
-
CCAT: Prime-Cam Optics Overview and Status Update
Authors:
Zachary B. Huber,
Lawrence T. Lin,
Eve M. Vavagiakis,
Rodrigo G. Freundt,
Victoria Butler,
Scott C. Chapman,
Steve K. Choi,
Abigail T. Crites,
Cody J. Duell,
Patricio A. Gallardo,
Anthony I. Huber,
Ben Keller,
Alicia Middleton,
Michael D. Niemack,
Thomas Nikola,
John Orlowski-Scherer,
Ema Smith,
Gordon Stacey,
Samantha Walker,
Bugao Zou
Abstract:
Prime-Cam is a first-generation science instrument for the CCAT Observatory's six-meter aperture Fred Young Submillimeter Telescope (FYST). FYST's crossed-Dragone design provides high optical throughput to take advantage of its unique site at 5600 m on Cerro Chajnantor in Chile's Atacama Desert to reach mapping speeds over ten times greater than current and near-term submillimeter experiments. Housing up to seven independent instrument modules in its 1.8-meter diameter cryostat, Prime-Cam will combine broadband polarization-sensitive modules and spectrometer modules designed for observations in several frequency windows between 210 GHz and 850 GHz to study a wide range of astrophysical questions from Big Bang cosmology to the formation of stars and galaxies in the Epoch of Reionization and beyond. In order to cover this range of frequencies and observation modes, each of the modules contains a set of cold reimaging optics that is optimized for the science goals of that module. These optical setups include several filters, three or four anti-reflection-coated silicon lenses, and a Lyot stop to control the field of view and illumination of the primary mirror, satisfy a series of mechanical constraints, and maximize optical performance within each passband. We summarize the design considerations and trade-offs for the optics in these modules and provide a status update on the fabrication of the Prime-Cam receiver and the design of its 1 K and 100 mK thermal BUSs.
Submitted 30 July, 2024;
originally announced July 2024.
-
GFE-Mamba: Mamba-based AD Multi-modal Progression Assessment via Generative Feature Extraction from MCI
Authors:
Zhaojie Fang,
Shenghao Zhu,
Yifei Chen,
Binfeng Zou,
Fan Jia,
Chang Liu,
Xiang Feng,
Linwei Qiu,
Feiwei Qin,
Jin Fan,
Changbiao Chu,
Changmiao Wang
Abstract:
Alzheimer's Disease (AD) is a progressive, irreversible neurodegenerative disorder that often originates from Mild Cognitive Impairment (MCI). This progression results in significant memory loss and severely affects patients' quality of life. Clinical trials have consistently shown that early and targeted interventions for individuals with MCI may slow or even prevent the advancement of AD. Research indicates that accurate medical classification requires diverse multimodal data, including detailed assessment scales and neuroimaging techniques like Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET). However, simultaneously collecting the aforementioned three modalities for training presents substantial challenges. To tackle these difficulties, we propose GFE-Mamba, a multimodal classifier founded on a Generative Feature Extractor. The intermediate features provided by this Extractor can compensate for the shortcomings of PET and achieve profound multimodal fusion in the classifier. The Mamba block, as the backbone of the classifier, enables it to efficiently extract information from long-sequence scale information. Pixel-level Bi-cross Attention supplements pixel-level information from MRI and PET. We provide our rationale for developing this cross-temporal progression prediction dataset and the pre-trained Extractor weights. Our experimental findings reveal that the GFE-Mamba model effectively predicts the progression from MCI to AD and surpasses several leading methods in the field. Our source code is available at https://github.com/Tinysqua/GFE-Mamba.
Submitted 29 January, 2025; v1 submitted 22 July, 2024;
originally announced July 2024.
-
VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation
Authors:
Bocheng Zou,
Mu Cai,
Jianrui Zhang,
Yong Jae Lee
Abstract:
In the realm of vision models, the primary mode of representation is using pixels to rasterize the visual world. Yet this is not always the best or unique way to represent visual content, especially for designers and artists who depict the world using geometry primitives such as polygons. Vector graphics (VG), on the other hand, offer a textual representation of visual content, which can be more concise and powerful for content like cartoons, sketches and scientific figures. Recent studies have shown promising results on processing vector graphics with capable Large Language Models (LLMs). However, such works focus solely on qualitative results, understanding, or a specific type of vector graphics. We propose VGBench, a comprehensive benchmark for LLMs on handling vector graphics through diverse aspects, including (a) both visual understanding and generation, (b) evaluation of various vector graphics formats, (c) diverse question types, (d) wide range of prompting techniques, (e) under multiple LLMs and (f) comparison with VLMs on rasterized representations. Evaluating on our collected 4279 understanding and 5845 generation samples, we find that LLMs show strong capability on both aspects while exhibiting less desirable performance on low-level formats (SVG). Both data and evaluation pipeline will be open-sourced at https://vgbench.github.io.
Submitted 29 August, 2024; v1 submitted 15 July, 2024;
originally announced July 2024.
-
Self-Prior Guided Mamba-UNet Networks for Medical Image Super-Resolution
Authors:
Zexin Ji,
Beiji Zou,
Xiaoyan Kui,
Pierre Vera,
Su Ruan
Abstract:
In this paper, we propose a self-prior guided Mamba-UNet network (SMamba-UNet) for medical image super-resolution. Existing methods are primarily based on convolutional neural networks (CNNs) or Transformers. CNN-based methods fail to capture long-range dependencies, while Transformer-based approaches face heavy calculation challenges due to their quadratic computational complexity. Recently, State Space Models (SSMs), especially Mamba, have emerged, capable of modeling long-range dependencies with linear computational complexity. Inspired by Mamba, our approach aims to learn the self-prior multi-scale contextual features under Mamba-UNet networks, which may help to super-resolve low-resolution medical images in an efficient way. Specifically, we obtain self-priors by perturbing the brightness inpainting of the input image during network training, which can learn detailed texture and brightness information that is beneficial for super-resolution. Furthermore, we combine Mamba with a UNet network to mine global features at different levels. We also design an improved 2D-Selective-Scan (ISS2D) module to divide image features into different directional sequences to learn long-range dependencies in multiple directions, and adaptively fuse sequence information to enhance super-resolved feature representation. Both qualitative and quantitative experimental results demonstrate that our approach outperforms current state-of-the-art methods on two public medical datasets: the IXI and fastMRI.
Submitted 8 July, 2024;
originally announced July 2024.
-
Deform-Mamba Network for MRI Super-Resolution
Authors:
Zexin Ji,
Beiji Zou,
Xiaoyan Kui,
Pierre Vera,
Su Ruan
Abstract:
In this paper, we propose a new architecture, called Deform-Mamba, for MR image super-resolution. Unlike conventional CNN or Transformer-based super-resolution approaches which encounter challenges related to the local receptive field or heavy computational cost, our approach aims to effectively explore the local and global information of images. Specifically, we develop a Deform-Mamba encoder which is composed of two branches, a modulated deform block and a vision Mamba block. We also design a multi-view context module in the bottleneck layer to explore the multi-view contextual content. Thanks to the extracted features of the encoder, which include content-adaptive local and efficient global information, the vision Mamba decoder finally generates high-quality MR images. Moreover, we introduce a contrastive edge loss to promote the reconstruction of edge and contrast related content. Quantitative and qualitative experimental results indicate that our approach on IXI and fastMRI datasets achieves competitive performance.
Submitted 8 July, 2024;
originally announced July 2024.