-
Stabilization Analysis and Mode Recognition of Kerosene Supersonic Combustion: A Deep Learning Approach Based on Res-CNN-beta-VAE
Authors:
Weiming Xu,
Tao Yang,
Chang Liu,
Kun Wu,
Peng Zhang
Abstract:
The scramjet engine is a key propulsion system for hypersonic vehicles, leveraging supersonic airflow to achieve high specific impulse, making it a promising technology for aerospace applications. Understanding and controlling the complex interactions between fuel injection, turbulent combustion, and aerodynamic effects of compressible flows are crucial for ensuring stable combustion in scramjet engines. However, identifying stable modes in scramjet combustors is often challenging due to limited experimental measurement means and extremely complex spatiotemporal evolution of supersonic turbulent combustion. This work introduces an innovative deep learning framework that combines dimensionality reduction via the Residual Convolutional Neural Network-beta-Variational Autoencoder (Res-CNN-beta-VAE) model with unsupervised clustering (K-means) to identify and analyze dynamical combustion modes in a supersonic combustor. By mapping high-dimensional data of combustion snapshots to a reduced three-dimensional latent space, the Res-CNN-beta-VAE model captures the essential temporal and spatial features of flame behaviors and enables the observation of transitions between combustion states. By analyzing the standard deviation of latent variable trajectories, we introduce a novel method for objectively distinguishing between dynamic transitions, which provides a scalable and expert-independent alternative to traditional classification methods. Besides, the unsupervised K-means clustering approach effectively identifies the complex interplay between the cavity and the jet-wake stabilization mechanisms, offering new insights into the system's behavior across different gas-to-liquid mass flow ratios (GLRs).
Submitted 16 March, 2025;
originally announced March 2025.
-
Cross-Modal Consistency Learning for Sign Language Recognition
Authors:
Kepeng Wu,
Zecheng Li,
Hezhen Hu,
Wengang Zhou,
Houqiang Li
Abstract:
Pre-training has been proven to be effective in boosting the performance of Isolated Sign Language Recognition (ISLR). Existing pre-training methods solely focus on the compact pose data, which eliminates background perturbation but inevitably suffers from insufficient semantic cues compared to raw RGB videos. Nevertheless, learning representation directly from RGB videos remains challenging due to the presence of sign-independent visual features. To address this dilemma, we propose a Cross-modal Consistency Learning framework (CCL-SLR), which leverages the cross-modal consistency from both RGB and pose modalities based on self-supervised pre-training. First, CCL-SLR employs contrastive learning for instance discrimination within and across modalities. Through the single-modal and cross-modal contrastive learning, CCL-SLR gradually aligns the feature spaces of RGB and pose modalities, thereby extracting consistent sign representations. Second, we further introduce Motion-Preserving Masking (MPM) and Semantic Positive Mining (SPM) techniques to improve cross-modal consistency from the perspective of data augmentation and sample similarity, respectively. Extensive experiments on four ISLR benchmarks show that CCL-SLR achieves impressive performance, demonstrating its effectiveness. The code will be released to the public.
Submitted 21 March, 2025; v1 submitted 16 March, 2025;
originally announced March 2025.
-
Oscillation-eliminating central DG schemes for hyperbolic conservation laws
Authors:
Manting Peng,
Kailiang Wu,
Caiyou Yuan
Abstract:
This paper proposes and analyzes a class of essentially non-oscillatory central discontinuous Galerkin (CDG) methods for general hyperbolic conservation laws. First, we introduce a novel compact, non-oscillatory stabilization mechanism that effectively suppresses spurious oscillations while preserving the high-order accuracy of CDG methods. Unlike existing limiter-based approaches that rely on large stencils or problem-specific parameters for oscillation control, our dual damping mechanism is inspired by CDG-based numerical dissipation and leverages overlapping solutions within the CDG framework, significantly enhancing stability while maintaining compactness. Our approach is free of problem-dependent parameters and complex characteristic decomposition, making it efficient and robust. Second, we provide a rigorous stability and optimal error analysis for fully discrete Runge-Kutta (RK) CDG schemes, addressing a gap in the theoretical understanding of these methods. Specifically, we establish the approximate skew-symmetry and weak boundedness of the CDG discretization. These results enable us to rigorously analyze the fully discrete error estimates for our oscillation-eliminating CDG (OECDG) method, a challenging task due to its nonlinear nature, even for linear advection equations. Building on this framework, we reformulate nonlinear oscillation-eliminating CDG schemes as linear RK CDG schemes with a nonlinear source term, extending error estimates beyond the linear case to schemes with nonlinear oscillation control. While existing error analyses for DG or CDG schemes have largely been restricted to linear cases without nonlinear oscillation-control techniques, our analysis represents an important theoretical advancement. Experiments validate the theoretical findings and demonstrate the effectiveness of the OECDG method.
Submitted 16 March, 2025;
originally announced March 2025.
-
On Local Minimum Entropy Principle of High-Order Schemes for Relativistic Euler Equations
Authors:
Shumo Cui,
Kailiang Wu,
Linfeng Xu
Abstract:
This paper establishes the minimum entropy principle (MEP) for the relativistic Euler equations with a broad class of equations of state (EOSs) and addresses the challenge of preserving the local version of the discovered MEP in high-order numerical schemes. At the continuous level, we find out a family of entropy pairs for the relativistic Euler equations and provide rigorous analysis to prove the strict convexity of entropy under a necessary and sufficient condition. At the numerical level, we develop a rigorous framework for designing provably entropy-preserving high-order schemes that ensure both physical admissibility and the discovered MEP. The relativistic effects, coupled with the abstract and general EOS formulation, introduce significant challenges not encountered in the nonrelativistic case or with the ideal EOS. In particular, entropy is a highly nonlinear and implicit function of the conservative variables, making it particularly difficult to enforce entropy preservation. To address these challenges, we establish a series of auxiliary theories via highly technical inequalities. Another key innovation is the use of geometric quasi-linearization (GQL), which reformulates the nonlinear constraints into equivalent linear ones by introducing additional free parameters. These advancements form the foundation of our entropy-preserving analysis. We propose novel, robust, locally entropy-preserving high-order frameworks. A central challenge is accurately estimating the local minimum of entropy, particularly in the presence of shock waves at unknown locations. To address this, we introduce two new approaches for estimating local lower bounds of specific entropy, which prove effective for both smooth and discontinuous problems. Numerical experiments demonstrate that our entropy-preserving methods maintain high-order accuracy while effectively suppressing spurious oscillations.
Submitted 16 March, 2025;
originally announced March 2025.
-
Neutrinos as a new tool to characterise the Milky Way Centre
Authors:
Paul C. W. Lai,
Beatrice Crudele,
Matteo Agostini,
Hayden P. H. Ng,
Ellis R. Owen,
Nishta Varma,
Kinwah Wu
Abstract:
The Central Molecular Zone (CMZ), a star-forming region rich in molecular clouds located within hundreds of parsecs of the centre of our Galaxy, converts gas into stars less efficiently than anticipated. A key challenge in refining star-formation models is the lack of precise mapping of these dense molecular hydrogen clouds, where traditional tracers often yield inconsistent results due to environmental limitations. We demonstrate how, in the not-so-distant future, neutrinos will emerge as a robust mass tracer thanks to advancements in neutrino telescopes. Since neutrinos are produced alongside gamma-rays when cosmic-rays interact with molecular clouds, they offer a complementary, systematics-independent measurement of the gas density. In an optimistic case where most gamma-ray emission from the Galactic Centre region originates from pion decays, we expect several tens of muon neutrinos to be detected in about two decades by KM3NeT, Baikal-GVD, and P-ONE combined, which will enable a better determination of the baryonic content in the Galactic Centre region. The CMZ will serve as a testbed to calibrate conventional tracers against neutrinos, ultimately improving gas measurements in distant galaxies, where neutrinos are undetectable but traditional tracers remain available.
Submitted 14 March, 2025;
originally announced March 2025.
-
When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning
Authors:
Junwei Luo,
Yingying Zhang,
Xue Yang,
Kang Wu,
Qi Zhu,
Lei Liang,
Jingdong Chen,
Yansheng Li
Abstract:
Efficient vision-language understanding of large Remote Sensing Images (RSIs) is meaningful but challenging. Current Large Vision-Language Models (LVLMs) typically employ limited pre-defined grids to process images, leading to information loss when handling gigapixel RSIs. Conversely, using unlimited grids significantly increases computational costs. To preserve image details while reducing computational complexity, we propose a text-guided token pruning method with Dynamic Image Pyramid (DIP) integration. Our method introduces: (i) a Region Focus Module (RFM) that leverages text-aware region localization capability to identify critical vision tokens, and (ii) a coarse-to-fine image tile selection and vision token pruning strategy based on DIP, which is guided by RFM outputs and avoids directly processing the entire large imagery. Additionally, existing benchmarks for evaluating LVLMs' perception ability on large RSI suffer from limited question diversity and constrained image sizes. We construct a new benchmark named LRS-VQA, which contains 7,333 QA pairs across 8 categories, with image length up to 27,328 pixels. Our method outperforms existing high-resolution strategies on four datasets using the same data. Moreover, compared to existing token reduction methods, our approach demonstrates higher efficiency under high-resolution settings. Dataset and code are in https://github.com/VisionXLab/LRS-VQA.
Submitted 25 March, 2025; v1 submitted 10 March, 2025;
originally announced March 2025.
-
Real-Time Load Estimation for Load-lifting Exoskeletons Using Insole Pressure Sensors and Machine Learning
Authors:
Kaida Wu,
Peihao Xiang,
Chaohao Lin,
Lixuan Chen,
Ou Bai
Abstract:
This paper presents a novel method for real-time lifting-load estimation to enhance the control strategies of upper-limb assistive exoskeletons. By leveraging cost-effective insole pressure sensors, the proposed system extracts differential pressure data that minimizes disturbances from variations in body weight and sensor placement. Two modeling approaches are explored: a channel-based method that employs traditional regression techniques, namely Elastic Net, Support Vector Regression (SVR), and Multi-Layer Perceptron (MLP), and a map-based method that utilizes transfer learning with a pre-trained MobileNetV2 model. The experiment is in the preliminary test stage, covering load ranges from 2 kg to 10 kg in increments of 0.5 kg, and collecting data from three subjects to test the approach. In the channel-based method, the average Weighted Mean Absolute Percentage Error (WMAPE) for three subjects showed that the SVR achieved 13.46%, with the MLP performing similarly. In the map-based method, using data from one subject, the fully fine-tuned MobileNetV2 model reached a WMAPE of 9.74%. The results indicate that the integration of insole sensor technology with advanced machine learning models provides an effective solution for dynamic load estimation, potentially reducing the risks of over- and under-compensation in exoskeleton control.
Submitted 10 March, 2025;
originally announced March 2025.
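The abstract above reports results as WMAPE without stating the formula. The sketch below assumes the common definition (sum of absolute errors divided by the sum of absolute true values, expressed in percent); the function name `wmape` and the sample loads are illustrative and do not come from the paper.

```python
def wmape(actual, predicted):
    """Weighted MAPE in percent, assuming the common definition:
    100 * sum(|actual - predicted|) / sum(|actual|)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("sum of absolute actual values is zero")
    abs_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return 100.0 * abs_error / total


# Hypothetical lifted loads (kg) and model estimates (kg), for illustration only.
loads_kg = [2.0, 5.0, 10.0]
estimates_kg = [2.5, 4.5, 9.0]
print(round(wmape(loads_kg, estimates_kg), 2))
```

Under this definition, errors on heavier loads carry more weight than under a plain MAPE, which may matter when the load range spans 2 kg to 10 kg.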
-
Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Flexible and Effective Paradigm
Authors:
Jiebin Yan,
Kangcheng Wu,
Junjie Chen,
Ziwen Tan,
Yuming Fang
Abstract:
Most existing blind omnidirectional image quality assessment (BOIQA) models rely on viewport generation by modeling user viewing behavior or transforming omnidirectional images (OIs) into varying formats; however, these methods are either computationally expensive or less scalable. To solve these issues, in this paper, we present a flexible and effective paradigm, which is viewport-unaware and can be easily adapted to 2D plane image quality assessment (2D-IQA). Specifically, the proposed BOIQA model includes an adaptive prior-equator sampling module for extracting a patch sequence from the equirectangular projection (ERP) image in a resolution-agnostic manner, a progressive deformation-unaware feature fusion module which is able to capture patch-wise quality degradation in a deformation-immune way, and a local-to-global quality aggregation module to adaptively map local perception to global quality. Extensive experiments across four OIQA databases (including uniformly distorted OIs and non-uniformly distorted OIs) demonstrate that the proposed model achieves competitive performance with low complexity against other state-of-the-art models, and we also verify its adaptive capacity to 2D-IQA.
Submitted 8 March, 2025;
originally announced March 2025.
-
A Digital Twin-Driven Recommendation System for Adaptive Campus Course Timetabling
Authors:
Keshu Wu,
Xinyue Ye,
Suphanut Jamonnak,
Xin Feng
Abstract:
Efficient and adaptive course timetabling for large, dynamic university campuses remains a significant challenge due to the complex interplay of hard and soft constraints. Traditional static optimization methods often fail to accommodate real-time disruptions, evolving user preferences, and the nuanced spatial-temporal relationships inherent in campus environments. This paper reconceptualizes the timetabling problem as a recommendation-based task and leverages the Texas A&M Campus Digital Twin as a dynamic data platform. Our proposed framework integrates collaborative and content-based filtering techniques with iterative feedback mechanisms, thereby generating a ranked set of adaptive timetable recommendations. A composite scoring function, incorporating metrics for classroom occupancy, travel distance, travel time, and vertical transitions, enables the framework to systematically balance resource utilization with user-centric factors. Extensive experiments using real-world data from Texas A&M University demonstrate that our approach effectively reduces travel inefficiencies, optimizes classroom utilization, and enhances overall user satisfaction. By coupling a recommendation-oriented paradigm with a digital twin environment, this study offers a robust and scalable blueprint for intelligent campus planning and resource allocation, with potential applications in broader urban contexts.
Submitted 8 March, 2025;
originally announced March 2025.
-
Adaptive-LIO: Enhancing Robustness and Precision through Environmental Adaptation in LiDAR Inertial Odometry
Authors:
Chengwei Zhao,
Kun Hu,
Jie Xu,
Lijun Zhao,
Baiwen Han,
Kaidi Wu,
Maoshan Tian,
Shenghai Yuan
Abstract:
The emerging Internet of Things (IoT) applications, such as driverless cars, have a growing demand for high-precision positioning and navigation. LiDAR inertial odometry is becoming increasingly prevalent in robotics and autonomous driving. However, many current SLAM systems lack sufficient adaptability to various scenarios. Challenges include decreased point cloud accuracy with longer frame intervals under the constant velocity assumption, coupling of erroneous IMU information when IMU saturation occurs, and decreased localization accuracy due to the use of fixed-resolution maps during indoor-outdoor scene transitions. To address these issues, we propose a loosely coupled adaptive LiDAR-Inertial-Odometry named Adaptive-LIO, which incorporates adaptive segmentation to enhance mapping accuracy, adapts motion modality through IMU saturation and fault detection, and adjusts map resolution adaptively using multi-resolution voxel maps based on the distance from the LiDAR center. Our proposed method has been tested in various challenging scenarios, demonstrating the effectiveness of the improvements we introduce. The code is open-source on GitHub: https://github.com/chengwei0427/adaptive_lio.
Submitted 6 March, 2025;
originally announced March 2025.
-
Mixed Likelihood Variational Gaussian Processes
Authors:
Kaiwen Wu,
Craig Sanders,
Benjamin Letham,
Phillip Guan
Abstract:
Gaussian processes (GPs) are powerful models for human-in-the-loop experiments due to their flexibility and well-calibrated uncertainty. However, GPs modeling human responses typically ignore auxiliary information, including a priori domain expertise and non-task performance information like user confidence ratings. We propose mixed likelihood variational GPs to leverage auxiliary information, which combine multiple likelihoods in a single evidence lower bound to model multiple types of data. We demonstrate the benefits of mixing likelihoods in three real-world experiments with human participants. First, we use mixed likelihood training to impose prior knowledge constraints in GP classifiers, which accelerates active learning in a visual perception task where users are asked to identify geometric errors resulting from camera position errors in virtual reality. Second, we show that leveraging Likert scale confidence ratings by mixed likelihood training improves model fitting for haptic perception of surface roughness. Lastly, we show that Likert scale confidence ratings improve human preference learning in robot gait optimization. The modeling performance improvements found using our framework across this diverse set of applications illustrate the benefits of incorporating auxiliary information into active learning and preference learning by using mixed likelihoods to jointly model multiple inputs.
Submitted 6 March, 2025;
originally announced March 2025.
-
Real-Time Burst-Mode Digital Signal Processing for Passive Optical Networks
Authors:
Ji Zhou,
Kainan Wu,
Haide Wang,
Jinyang Yang,
Weiping Liu,
Junwen Zhang,
Changyuan Yu,
Xiangjun Xin,
Liangchuan Li
Abstract:
Driven by the ever-increasing capacity demands, the 50G passive optical network (PON) is maturing gradually. One of the main challenges for the 50G PON is implementing burst-mode digital signal processing (BM-DSP) for the burst upstream signal. In this paper, we demonstrate a real-time BM-DSP for burst reception of 25Gbit/s on-off keying signal to meet the asymmetric-mode 50G PON demand. The real-time BM-DSP includes the BM frequency-domain timing recovery and BM frequency-domain equalizer, which can be fast converged based on the 42ns designed preamble. Meanwhile, the simplified implementations for fast-Fourier-transform, minimum-mean-square-error, and decision-directed least-mean-square-error algorithms decrease the DSP resources by 28.57%, enabling the loading of real-time BM-DSP in the field programmable gate array with the limited DSP resources. The real-time implementation of BM-DSP can guide the design of application-specific integrated circuits for 50G PON.
Submitted 3 March, 2025;
originally announced March 2025.
-
V2X-LLM: Enhancing V2X Integration and Understanding in Connected Vehicle Corridors
Authors:
Keshu Wu,
Pei Li,
Yang Zhou,
Rui Gan,
Junwei You,
Yang Cheng,
Jingwen Zhu,
Steven T. Parker,
Bin Ran,
David A. Noyce,
Zhengzhong Tu
Abstract:
The advancement of Connected and Automated Vehicles (CAVs) and Vehicle-to-Everything (V2X) offers significant potential for enhancing transportation safety, mobility, and sustainability. However, the integration and analysis of the diverse and voluminous V2X data, including Basic Safety Messages (BSMs) and Signal Phase and Timing (SPaT) data, present substantial challenges, especially on Connected Vehicle Corridors. These challenges include managing large data volumes, ensuring real-time data integration, and understanding complex traffic scenarios. Although these projects have developed an advanced CAV data pipeline that enables real-time communication between vehicles, infrastructure, and other road users for managing connected vehicle and roadside unit (RSU) data, significant hurdles in data comprehension and real-time scenario analysis and reasoning persist. To address these issues, we introduce the V2X-LLM framework, a novel enhancement to the existing CV data pipeline. V2X-LLM leverages Large Language Models (LLMs) to improve the understanding and real-time analysis of V2X data. The framework includes four key tasks: Scenario Explanation, offering detailed narratives of traffic conditions; V2X Data Description, detailing vehicle and infrastructure statuses; State Prediction, forecasting future traffic states; and Navigation Advisory, providing optimized routing instructions. By integrating LLM-driven reasoning with V2X data within the data pipeline, the V2X-LLM framework offers real-time feedback and decision support for traffic management. This integration enhances the accuracy of traffic analysis, safety, and traffic optimization. Demonstrations in a real-world urban corridor highlight the framework's potential to advance intelligent transportation systems.
Submitted 3 March, 2025;
originally announced March 2025.
-
Flat bands and temperature-driven phase transition in quasi-one-dimensional zigzag chains
Authors:
Jisong Gao,
Haijun Cao,
Xuegao Hu,
Hui Zhou,
Zhihao Cai,
Qiaoxiao Zhao,
Dong Li,
Zhicheng Gao,
Shin-ichiro Ideta,
Kenya Shimada,
Peng Cheng,
Lan Chen,
Kehui Wu,
Sheng Meng,
Baojie Feng
Abstract:
Flat-band materials have garnered extensive attention due to their captivating properties associated with strong correlation effects. While flat bands have been discovered in several types of 2D materials, their existence in 1D systems remains elusive. Here, we propose a 1D frustrated lattice, specifically the 1D zigzag lattice, as a platform for hosting flat bands. This lattice can be experimentally realized by growing CuTe chains on Cu(111). The presence of flat bands was confirmed by tight-binding model analysis, first-principles calculations, and angle-resolved photoemission spectroscopy measurements. In addition, we discovered a temperature-driven phase transition at approximately 250 K. Detailed analyses demonstrate that the system has a Tomonaga-Luttinger liquid behavior, accompanied by spin-charge separation effects. Our work unveils new prospects for investigating strongly correlated electron behaviors and topological properties in the 1D limit.
Submitted 3 March, 2025;
originally announced March 2025.
-
Quantifying Bias due to non-Gaussian Foregrounds in an Optimal Reconstruction of CMB Lensing and Temperature Power Spectra
Authors:
M. Doohan,
M. Millea,
S. Raghunathan,
F. Ge,
L. Knox,
K. Prabhu,
C. L. Reichardt,
W. L. K. Wu
Abstract:
We estimate the magnitude of the bias due to non-Gaussian extragalactic foregrounds on the optimal reconstruction of the cosmic microwave background (CMB) lensing potential and temperature power spectra. The reconstruction is performed using a Bayesian inference method known as the marginal unbiased score expansion (MUSE). We apply MUSE to a minimum variance combination of multifrequency maps drawn from the Agora publicly available simulations of the lensed CMB and correlated extragalactic foreground emission. Taking noise levels appropriate to two years of data with the SPT-3G instrument on the South Pole Telescope, we find no statistically significant bias in the MUSE reconstruction when limited to angular multipoles $\ell \leq 3000$. We find a 4.7$σ$ bias in the recovered lensing potential power spectrum when smaller scale modes ($\ell \leq 3500$) are included. This work is a first step toward understanding the impact of extragalactic foregrounds on optimal reconstructions of CMB temperature and lensing potential power spectra.
Submitted 2 March, 2025; v1 submitted 28 February, 2025;
originally announced February 2025.
-
Convex inequalities in Hilbert $C^*$-modules
Authors:
Kangjian Wu,
Jia Li,
Qingxiang Xu
Abstract:
The Hölder-McCarty inequalities are originally derived in the Hilbert space case and have been generalized via a convex inequality. The main purpose of this paper is to extend this convex inequality to the Hilbert $C^*$-module case, and meanwhile to make some investigations on the Hölder-McCarty inequalities in the Hilbert $C^*$-module case.
Submitted 28 February, 2025;
originally announced February 2025.
-
Notes on the numerical radius for adjointable operators on Hilbert $C^*$-modules
Authors:
J. Li,
K. Wu,
Q. Xu
Abstract:
Given a Hilbert module $H$ over a $C^*$-algebra, let $\mathcal{L}(H)$ be the set of all adjointable operators on $H$. For each $T\in\mathcal{L}(H)$, its numerical radius is defined by $w(T)=\sup\big\{\|\langle Tx, x \rangle\|: x\in H, \|x\|=1\big\}$. It is proved that $w(T)=\|T\|$ whenever $T$ is normal. Examples are constructed to show that there exist a Hilbert module $H$ over a certain $C^*$-algebra and $T_1,T_2\in \mathcal{L}(H)$ with $T_1^2=0$ such that $w(T_1)\ne \frac12 \|T_1\|$ and $\sup\limits_{θ\in [0,2π]}\|\mbox{Re}(e^{iθ}T_2)\|<w(T_2)$. In addition, a new characterization of the spatial numerical radius is given, and it is proved that $w\big(π(T)\big)\le w(T)$ for every faithful representation $(π, X)$ of $\mathcal{L}(H)$ and every $T\in\mathcal{L}(H)$. Some inequalities are derived based on the newly obtained results.
Submitted 27 February, 2025;
originally announced February 2025.
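For orientation, the Hilbert space identity that the counterexample $w(T_1)\ne \frac12\|T_1\|$ in the abstract above departs from can be checked by hand. The following is a standard textbook illustration on $\mathbb{C}^2$ with the usual inner product, not an example taken from the paper; the matrix $T$ and the vector $x$ are chosen here purely for illustration.

```latex
\[
T=\begin{pmatrix}0&1\\0&0\end{pmatrix},\qquad
x=\begin{pmatrix}a\\b\end{pmatrix},\quad |a|^2+|b|^2=1,\qquad
Tx=\begin{pmatrix}b\\0\end{pmatrix}.
\]
\[
|\langle Tx,x\rangle| = |a|\,|b| \le \tfrac12\big(|a|^2+|b|^2\big)=\tfrac12,
\]
with equality at $|a|=|b|=1/\sqrt{2}$, so
\[
w(T)=\tfrac12=\tfrac12\|T\|,\qquad\text{where } T^2=0 \text{ and } \|T\|=1.
\]
```

In the Hilbert space case a nilpotent $T$ with $T^2=0$ therefore has numerical radius exactly half its norm; the abstract states that this can fail for adjointable operators on a Hilbert $C^*$-module.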
-
GraphSparseNet: a Novel Method for Large Scale Traffic Flow Prediction
Authors:
Weiyang Kong,
Kaiqi Wu,
Sen Zhang,
Yubao Liu
Abstract:
Traffic flow forecasting is a critical spatio-temporal data mining task with wide-ranging applications in intelligent route planning and dynamic traffic management. Recent advancements in deep learning, particularly through Graph Neural Networks (GNNs), have significantly enhanced the accuracy of these forecasts by capturing complex spatio-temporal dynamics. However, the scalability of GNNs remains a challenge due to their exponential growth in model complexity with increasing nodes in the graph. Existing methods to address this issue, including sparsification, decomposition, and kernel-based approaches, either do not fully resolve the complexity issue or risk compromising predictive accuracy. This paper introduces GraphSparseNet (GSNet), a novel framework designed to improve both the scalability and accuracy of GNN-based traffic forecasting models. GraphSparseNet is comprised of two core modules: the Feature Extractor and the Relational Compressor. These modules operate with linear time and space complexity, thereby reducing the overall computational complexity of the model to a linear scale. Our extensive experiments on multiple real-world datasets demonstrate that GraphSparseNet not only significantly reduces training time by 3.51x compared to state-of-the-art linear models but also maintains high predictive performance.
Submitted 13 May, 2025; v1 submitted 27 February, 2025;
originally announced February 2025.
-
An Improved Privacy and Utility Analysis of Differentially Private SGD with Bounded Domain and Smooth Losses
Authors:
Hao Liang,
Wanrong Zhang,
Xinlei He,
Kaishun Wu,
Hong Xing
Abstract:
Differentially Private Stochastic Gradient Descent (DPSGD) is widely used to protect sensitive data during the training of machine learning models, but its privacy guarantees often come at the cost of model performance, largely due to the inherent challenge of accurately quantifying privacy loss. While recent efforts have strengthened privacy guarantees by focusing solely on the final output and bounded domain cases, they still impose restrictive assumptions, such as convexity and other parameter limitations, and often lack a thorough analysis of utility. In this paper, we provide a rigorous privacy and utility characterization of DPSGD for smooth loss functions in both bounded and unbounded domains. We track the privacy loss over multiple iterations by exploiting the noisy smooth-reduction property and establish the utility analysis by leveraging the projection's non-expansiveness and clipped SGD properties. In particular, we show that for DPSGD with a bounded domain, (i) the privacy loss can still converge without the convexity assumption, and (ii) a smaller bounded diameter can improve both privacy and utility simultaneously under certain conditions. Numerical experiments validate our theoretical results.
Submitted 28 February, 2025; v1 submitted 24 February, 2025;
originally announced February 2025.
-
Mitigating Data Scarcity in Time Series Analysis: A Foundation Model with Series-Symbol Data Generation
Authors:
Wenxuan Wang,
Kai Wu,
Yujian Betterest Li,
Dan Wang,
Xiaoyu Zhang,
Jing Liu
Abstract:
Foundation models for time series analysis (TSA) have attracted significant attention. However, challenges such as data scarcity and data imbalance continue to hinder their development. To address this, we consider modeling complex systems through symbolic expressions that serve as semantic descriptors of time series. Building on this concept, we introduce a series-symbol (S2) dual-modality data generation mechanism, enabling the unrestricted creation of high-quality time series data paired with corresponding symbolic representations. Leveraging the S2 dataset, we develop SymTime, a pre-trained foundation model for TSA. SymTime demonstrates competitive performance across five major TSA tasks when fine-tuned on downstream tasks, rivaling foundation models pre-trained on real-world datasets. This approach underscores the potential of dual-modality data generation and pretraining mechanisms in overcoming data scarcity and enhancing task performance.
Submitted 21 February, 2025;
originally announced February 2025.
-
GneissWeb: Preparing High Quality Data for LLMs at Scale
Authors:
Hajar Emami Gohari,
Swanand Ravindra Kadhe,
Syed Yousaf Shah,
Constantin Adam,
Abdulhamid Adebayo,
Praneet Adusumilli,
Farhan Ahmed,
Nathalie Baracaldo Angel,
Santosh Borse,
Yuan-Chi Chang,
Xuan-Hong Dang,
Nirmit Desai,
Ravital Eres,
Ran Iwamoto,
Alexei Karve,
Yan Koyfman,
Wei-Han Lee,
Changchang Liu,
Boris Lublinsky,
Takuyo Ohko,
Pablo Pesce,
Maroun Touma,
Shiqiang Wang,
Shalisha Witherspoon,
Herbert Woisetschlager,
David Wood
, et al. (6 additional authors not shown)
Abstract:
Data quantity and quality play a vital role in determining the performance of Large Language Models (LLMs). High-quality data, in particular, can significantly boost the LLM's ability to generalize on a wide range of downstream tasks. Large pre-training datasets for leading LLMs remain inaccessible to the public, whereas many open datasets are small in size (less than 5 trillion tokens), limiting their suitability for training large models.
In this paper, we introduce GneissWeb, a large dataset yielding around 10 trillion tokens that caters to the data quality and quantity requirements of training LLMs. Our GneissWeb recipe that produced the dataset consists of sharded exact sub-string deduplication and a judiciously constructed ensemble of quality filters. GneissWeb achieves a favorable trade-off between data quality and quantity, producing models that outperform models trained on state-of-the-art open large datasets (5+ trillion tokens).
We show that models trained using the GneissWeb dataset outperform those trained on FineWeb-V1.1.0 by 2.73 percentage points in terms of average score computed on a set of 11 commonly used benchmarks (both zero-shot and few-shot) for pre-training dataset evaluation. When the evaluation set is extended to 20 benchmarks (both zero-shot and few-shot), models trained using GneissWeb still achieve a 1.75 percentage point advantage over those trained on FineWeb-V1.1.0.
Submitted 18 February, 2025;
originally announced February 2025.
-
Untangling New Physics in Single Resonant Top Quarks
Authors:
Krish Wu,
Brandon Sun,
Nitish Polishetty,
Justin Kline,
Max Fieg,
Daniel Whiteson
Abstract:
Collisions of particles at the energy frontier can reveal new particles and forces via localized excesses. However, the initial observation may be consistent with a large variety of theoretical models, especially in sectors with new top quark partners, which feature a rich set of possible underlying interactions. We explore the power of the LHC dataset to distinguish between models of the singly produced heavy top-like quark which interacts with the Standard Model through an electromagnetic form factor. We study the heavy top decay to a top quark and a virtual photon which produces a pair of fermions, propose a technique to disentangle the models, and calculate the expected statistical significance to distinguish between various hypotheses.
Submitted 19 February, 2025;
originally announced February 2025.
-
Baichuan-M1: Pushing the Medical Capability of Large Language Models
Authors:
Bingning Wang,
Haizhou Zhao,
Huozhi Zhou,
Liang Song,
Mingyu Xu,
Wei Cheng,
Xiangrong Zeng,
Yupeng Zhang,
Yuqi Huo,
Zecheng Wang,
Zhengyun Zhao,
Da Pan,
Fei Kou,
Fei Li,
Fuzhong Chen,
Guosheng Dong,
Han Liu,
Hongda Zhang,
Jin He,
Jinjie Yang,
Kangxi Wu,
Kegeng Wu,
Lei Su,
Linlin Niu,
Linzhuang Sun
, et al. (17 additional authors not shown)
Abstract:
The current generation of large language models (LLMs) is typically designed for broad, general-purpose applications, while domain-specific LLMs, especially in vertical fields like medicine, remain relatively scarce. In particular, the development of highly efficient and practical LLMs for the medical domain is challenging due to the complexity of medical knowledge and the limited availability of high-quality data. To bridge this gap, we introduce Baichuan-M1, a series of large language models specifically optimized for medical applications. Unlike traditional approaches that simply continue pretraining on existing models or apply post-training to a general base model, Baichuan-M1 is trained from scratch with a dedicated focus on enhancing medical capabilities. Our model is trained on 20 trillion tokens and incorporates a range of effective training methods that strike a balance between general capabilities and medical expertise. As a result, Baichuan-M1 not only performs strongly across general domains such as mathematics and coding but also excels in specialized medical fields. We have open-sourced Baichuan-M1-14B, a mini version of our model, which can be accessed through the following links.
Submitted 5 March, 2025; v1 submitted 18 February, 2025;
originally announced February 2025.
-
REGNav: Room Expert Guided Image-Goal Navigation
Authors:
Pengna Li,
Kangyi Wu,
Jingwen Fu,
Sanping Zhou
Abstract:
Image-goal navigation aims to steer an agent towards the goal location specified by an image. Most prior methods tackle this task by learning a navigation policy, which extracts visual features of goal and observation images, compares their similarity and predicts actions. However, if the agent is in a different room from the goal image, it's extremely challenging to identify their similarity and infer the likely goal location, which may result in the agent wandering around. Intuitively, when humans carry out this task, they may roughly compare the current observation with the goal image, having an approximate concept of whether they are in the same room before executing the actions. Inspired by this intuition, we try to imitate human behaviour and propose a Room Expert Guided Image-Goal Navigation model (REGNav) to equip the agent with the ability to analyze whether goal and observation images are taken in the same room. Specifically, we first pre-train a room expert with an unsupervised learning technique on the self-collected unlabelled room images. The expert can extract the hidden room style information of goal and observation images and predict their relationship about whether they belong to the same room. In addition, two different fusion approaches are explored to efficiently guide the agent navigation with the room relation knowledge. Extensive experiments show that our REGNav surpasses prior state-of-the-art works on three popular benchmarks.
Submitted 15 February, 2025;
originally announced February 2025.
-
VisCon-100K: Leveraging Contextual Web Data for Fine-tuning Vision Language Models
Authors:
Gokul Karthik Kumar,
Iheb Chaabane,
Kebin Wu
Abstract:
Vision-language models (VLMs) excel in various visual benchmarks but are often constrained by the lack of high-quality visual fine-tuning data. To address this challenge, we introduce VisCon-100K, a novel dataset derived from interleaved image-text web documents. Our approach transforms 45K web documents from the OBELICS dataset into 100K image conversation samples. We utilize GPT-4V to generate image-contextual captions and the OpenChat 3.5 model to convert these captions into diverse free-form and multiple-choice question-answer pairs. Integrating this dataset for fine-tuning considerably enhances VLM performance across multiple benchmarks. Unlike methods that focus solely on fine-grained visual content, our approach leverages accompanying web context, yielding superior results. We also discover that a 'leaky modality mix', where conversation samples contain questions answerable from both the image and its contextual caption, outperforms non-leaky combinations of captions and Q&A pairs. The VisCon-100K dataset shows strong performance with two popular VLM approaches: a text-only large language model (LLM) aligned with a vision encoder using image caption data (ShareGPT4V-7b) and a multimodally pretrained LLM (IDEFICS2-8b) using interleaved image-text data. In addition to releasing the VisCon-100K dataset, we provide a contextual captioner trained on this dataset, facilitating scalable fine-tuning data generation for future research and open-source applications. Using the same pipeline, but substituting our trained contextual captioner for GPT-4V, we also release the larger VisCon-1M dataset.
Submitted 24 February, 2025; v1 submitted 14 February, 2025;
originally announced February 2025.
-
Diffusion Trajectory-guided Policy for Long-horizon Robot Manipulation
Authors:
Shichao Fan,
Quantao Yang,
Yajie Liu,
Kun Wu,
Zhengping Che,
Qingjie Liu,
Min Wan
Abstract:
Recently, Vision-Language-Action (VLA) models have advanced robot imitation learning, but high data collection costs and limited demonstrations hinder generalization, and current imitation learning methods struggle in out-of-distribution scenarios, especially for long-horizon tasks. A key challenge is how to mitigate compounding errors in imitation learning, which lead to cascading failures over extended trajectories. To address these challenges, we propose the Diffusion Trajectory-guided Policy (DTP) framework, which generates 2D trajectories through a diffusion model to guide policy learning for long-horizon tasks. By leveraging task-relevant trajectories, DTP provides trajectory-level guidance to reduce error accumulation. Our two-stage approach first trains a generative vision-language model to create diffusion-based trajectories, then refines the imitation policy using them. Experiments on the CALVIN benchmark show that DTP outperforms state-of-the-art baselines by 25% in success rate, starting from scratch without external pretraining. Moreover, DTP significantly improves real-world robot performance.
Submitted 14 February, 2025;
originally announced February 2025.
-
Integrating Spatiotemporal Vision Transformer into Digital Twins for High-Resolution Heat Stress Forecasting in Campus Environments
Authors:
Wenjing Gong,
Xinyue Ye,
Keshu Wu,
Suphanut Jamonnak,
Wenyu Zhang,
Yifan Yang,
Xiao Huang
Abstract:
Extreme heat events exacerbated by climate change pose significant challenges to urban resilience and planning. This study introduces a climate-responsive digital twin framework integrating the Spatiotemporal Vision Transformer (ST-ViT) model to enhance heat stress forecasting and decision-making. Using a Texas campus as a testbed, we synthesized high-resolution physical model simulations with spatial and meteorological data to develop fine-scale human thermal predictions. The ST-ViT-powered digital twin enables efficient, data-driven insights for planners, policymakers, and campus stakeholders, supporting targeted heat mitigation strategies and advancing climate-adaptive urban design.
Submitted 12 February, 2025;
originally announced February 2025.
-
LP-LM: No Hallucinations in Question Answering with Logic Programming
Authors:
Katherine Wu,
Yanhong A. Liu
Abstract:
Large language models (LLMs) are able to generate human-like responses to user queries. However, LLMs exhibit inherent limitations, especially because they hallucinate. This paper introduces LP-LM, a system that grounds answers to questions in known facts contained in a knowledge base (KB), facilitated through semantic parsing in Prolog, and always produces answers that are reliable.
LP-LM generates a most probable constituency parse tree along with a corresponding Prolog term for an input question via Prolog definite clause grammar (DCG) parsing. The term is then executed against a KB of natural language sentences also represented as Prolog terms for question answering. By leveraging DCG and tabling, LP-LM runs in linear time in the size of input sentences for sufficiently many grammar rules. Performing experiments comparing LP-LM with current well-known LLMs in accuracy, we show that LLMs hallucinate on even simple questions, unlike LP-LM.
Submitted 13 February, 2025;
originally announced February 2025.
-
Nature Language Model: Deciphering the Language of Nature for Scientific Discovery
Authors:
Yingce Xia,
Peiran Jin,
Shufang Xie,
Liang He,
Chuan Cao,
Renqian Luo,
Guoqing Liu,
Yue Wang,
Zequn Liu,
Yuan-Jyue Chen,
Zekun Guo,
Yeqi Bai,
Pan Deng,
Yaosen Min,
Ziheng Lu,
Hongxia Hao,
Han Yang,
Jielan Li,
Chang Liu,
Jia Zhang,
Jianwei Zhu,
Ran Bi,
Kehan Wu,
Wei Zhang,
Kaiyuan Gao
, et al. (21 additional authors not shown)
Abstract:
Foundation models have revolutionized natural language processing and artificial intelligence, significantly enhancing how machines comprehend and generate human languages. Inspired by the success of these foundation models, researchers have developed foundation models for individual scientific domains, including small molecules, materials, proteins, DNA, RNA and even cells. However, these models are typically trained in isolation, lacking the ability to integrate across different scientific domains. Recognizing that entities within these domains can all be represented as sequences, which together form the "language of nature", we introduce Nature Language Model (NatureLM), a sequence-based science foundation model designed for scientific discovery. Pre-trained with data from multiple scientific domains, NatureLM offers a unified, versatile model that enables various applications including: (i) generating and optimizing small molecules, proteins, RNA, and materials using text instructions; (ii) cross-domain generation/design, such as protein-to-molecule and protein-to-RNA generation; and (iii) top performance across different domains, matching or surpassing state-of-the-art specialist models. NatureLM offers a promising generalist approach for various scientific tasks, including drug discovery (hit generation/optimization, ADMET optimization, synthesis), novel material design, and the development of therapeutic proteins or nucleotides. We have developed NatureLM models in different sizes (1 billion, 8 billion, and 46.7 billion parameters) and observed a clear improvement in performance as the model size increases.
Submitted 20 June, 2025; v1 submitted 11 February, 2025;
originally announced February 2025.
-
CMB-S4: Foreground-Cleaning Pipeline Comparison for Measuring Primordial Gravitational Waves
Authors:
Federico Bianchini,
Dominic Beck,
W. L. Kimmy Wu,
Zeeshan Ahmed,
Sebastian Belkner,
Julien Carron,
Brandon S. Hensley,
Clement L. Pryke,
Caterina Umilta
Abstract:
We compare multiple foreground-cleaning pipelines for estimating the tensor-to-scalar ratio, $r$, using simulated maps of the planned CMB-S4 experiment within the context of the South Pole Deep Patch. To evaluate robustness, we analyze bias and uncertainty on $r$ across various foreground suites using map-based simulations. The foreground-cleaning methods include: a parametric maximum likelihood approach applied to auto- and cross-power spectra between frequency maps; a map-based parametric maximum-likelihood method; and a harmonic-space internal linear combination using frequency maps. We summarize the conceptual basis of each method to highlight their similarities and differences. To better probe the impact of foreground residuals, we implement an iterative internal delensing step, leveraging a map-based pipeline to generate a lensing $B$-mode template from the Large Aperture Telescope frequency maps. Our results show that the performance of the three approaches is comparable for simple and intermediate-complexity foregrounds, with $σ(r)$ ranging from 3 to 5 $\times 10^{-4}$. However, biases at the $1-2σ$ level appear when analyzing more complex forms of foreground emission. By extending the baseline pipelines to marginalize over foreground residuals, we demonstrate that contamination can be reduced to within statistical uncertainties, albeit with a pipeline-dependent impact on $σ(r)$, which translates to a detection significance between 2 and 4$σ$ for an input value of $r = 0.003$. These findings suggest varying levels of maturity among the tested pipelines, with the auto- and cross-spectra-based approach demonstrating the best stability and overall performance. Moreover, given the extremely low noise levels, mutual validation of independent foreground-cleaning pipelines is essential to ensure the robustness of any potential detection.
Submitted 6 February, 2025;
originally announced February 2025.
-
Systolic Sparse Tensor Slices: FPGA Building Blocks for Sparse and Dense AI Acceleration
Authors:
Endri Taka,
Ning-Chi Huang,
Chi-Chih Chang,
Kai-Chiang Wu,
Aman Arora,
Diana Marculescu
Abstract:
FPGA architectures have recently been enhanced to meet the substantial computational demands of modern deep neural networks (DNNs). To this end, both FPGA vendors and academic researchers have proposed in-fabric blocks that perform efficient tensor computations. However, these blocks are primarily optimized for dense computation, while most DNNs exhibit sparsity. To address this limitation, we propose incorporating structured sparsity support into FPGA architectures. We architect 2D systolic in-fabric blocks, named systolic sparse tensor (SST) slices, that support multiple degrees of sparsity to efficiently accelerate a wide variety of DNNs. SSTs support dense operation, 2:4 (50%) and 1:4 (75%) sparsity, as well as a new 1:3 (66.7%) sparsity level to further increase flexibility. When demonstrating on general matrix multiplication (GEMM) accelerators, which are the heart of most current DNN accelerators, our sparse SST-based designs attain up to 5x higher FPGA frequency and 10.9x lower area, compared to traditional FPGAs. Moreover, evaluation of the proposed SSTs on state-of-the-art sparse ViT and CNN models exhibits up to 3.52x speedup with minimal area increase of up to 13.3%, compared to dense in-fabric acceleration.
Submitted 5 February, 2025;
originally announced February 2025.
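The N:M sparsity levels quoted in the abstract above (2:4, 1:4, and 1:3) follow the usual structured-sparsity convention: at most N non-zero values are kept in every group of M consecutive weights. The sketch below illustrates that pattern in plain Python. Keeping the largest-magnitude entries in each group is a common pruning heuristic assumed here for illustration; the abstract does not say how the sparse weights are selected, and `prune_n_of_m` and `sparsity` are hypothetical helper names.

```python
def prune_n_of_m(weights, n, m):
    """Keep the n largest-magnitude values in each group of m; zero the rest."""
    if len(weights) % m != 0:
        raise ValueError("len(weights) must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n entries with the largest absolute value.
        order = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(order[:n])
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def sparsity(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


if __name__ == "__main__":
    row = [0.1, -0.9, 0.4, 0.05, 2.0, -0.3, 0.2, 1.5]
    print(prune_n_of_m(row, 2, 4))            # 2:4 pattern
    print(sparsity(prune_n_of_m(row, 2, 4)))  # 0.5
    print(sparsity(prune_n_of_m(row, 1, 4)))  # 0.75
```

For weights with no pre-existing zeros, the resulting sparsity is 1 - N/M, which reproduces the 50%, 75%, and 66.7% figures given for 2:4, 1:4, and 1:3.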
-
An Inorganic Liquid Crystalline Dispersion with 2D Ferroelectric Moieties
Authors:
Ziyang Huang,
Zehao Zhang,
Rongjie Zhang,
Baofu Ding,
Liu Yang,
Keyou Wu,
Youan Xu,
Gaokuo Zhong,
Chuanlai Ren,
Jiarong Liu,
Yugan Hao,
Menghao Wu,
Teng Ma,
Bilu Liu
Abstract:
Liquid crystal devices based on the electro-optical effect have been extensively used in optical modulation techniques, in which the Kerr coefficient reflects the sensitivity of the liquid crystals and determines the strength of the device's operational electric field. The Peterlin-Stuart theory and the O'Konski model jointly indicate that a giant Kerr coefficient could be obtained in a material with both a large geometrical anisotropy and an intrinsic polarization, but such a material has not yet been reported. Here we reveal a ferroelectric effect in monolayer two-dimensional vermiculite, a mineral. A large geometrical anisotropy factor and a large inherent electric dipole together raise the record value of the Kerr coefficient by an order of magnitude, to $3.0\times 10^{-4}$ m V$^{-2}$. This finding enables an ultra-low operational electric field of $10^2$-$10^4$ V m$^{-1}$ and the fabrication of electro-optical devices with an inch-level electrode separation, which was previously impractical. Because of the material's high ultraviolet stability (decay <1% under ultraviolet exposure of 1000 hours), scalability, and energy efficiency, prototype display billboards have been fabricated for outdoor interactive scenes. The work provides new insights for both liquid crystal optics and two-dimensional ferroelectrics.
Submitted 1 February, 2025;
originally announced February 2025.
-
RMDM: Radio Map Diffusion Model with Physics Informed
Authors:
Haozhe Jia,
Wenshuo Chen,
Zhihui Huang,
Hongru Xiao,
Nanqian Jia,
Keming Wu,
Songning Lai,
Yutao Yue
Abstract:
With the rapid development of wireless communication technology, the efficient utilization of spectrum resources, optimization of communication quality, and intelligent communication have become critical. Radio map reconstruction is essential for enabling advanced applications, yet challenges such as complex signal propagation and sparse data hinder accurate reconstruction. To address these issues, we propose the Radio Map Diffusion Model (RMDM), a physics-informed framework that integrates Physics-Informed Neural Networks (PINNs) to incorporate constraints like the Helmholtz equation. RMDM employs a dual U-Net architecture: the first ensures physical consistency by minimizing PDE residuals, boundary conditions, and source constraints, while the second refines predictions via diffusion-based denoising. By leveraging physical laws, RMDM significantly enhances accuracy, robustness, and generalization. Experiments demonstrate that RMDM outperforms state-of-the-art methods, achieving NMSE of 0.0031 and RMSE of 0.0125 under the Static RM (SRM) setting, and NMSE of 0.0047 and RMSE of 0.0146 under the Dynamic RM (DRM) setting. These results establish a novel paradigm for integrating physics-informed and data-driven approaches in radio map reconstruction, particularly under sparse data conditions.
Submitted 19 March, 2025; v1 submitted 31 January, 2025;
originally announced January 2025.
-
Free-T2M: Frequency Enhanced Text-to-Motion Diffusion Model With Consistency Loss
Authors:
Wenshuo Chen,
Haozhe Jia,
Songning Lai,
Keming Wu,
Hongru Xiao,
Lijie Hu,
Yutao Yue
Abstract:
Rapid progress in text-to-motion generation has been largely driven by diffusion models. However, existing methods focus solely on temporal modeling, thereby overlooking frequency-domain analysis. We identify two key phases in motion denoising: the **semantic planning stage** and the **fine-grained improving stage**. To address these phases effectively, we propose **Fre**quency **e**nhanced **t**ext-**to**-**m**otion diffusion model (**Free-T2M**), incorporating stage-specific consistency losses that enhance the robustness of static features and improve fine-grained accuracy. Extensive experiments demonstrate the effectiveness of our method. Specifically, on StableMoFusion, our method reduces the FID from **0.189** to **0.051**, establishing a new SOTA performance within the diffusion architecture. These findings highlight the importance of incorporating frequency-domain insights into text-to-motion generation for more precise and robust results.
Submitted 30 January, 2025;
originally announced January 2025.
-
Baichuan-Omni-1.5 Technical Report
Authors:
Yadong Li,
Jun Liu,
Tao Zhang,
Tao Zhang,
Song Chen,
Tianpeng Li,
Zehuan Li,
Lijun Liu,
Lingfeng Ming,
Guosheng Dong,
Da Pan,
Chong Li,
Yuanbo Fang,
Dongdong Kuang,
Mingrui Wang,
Chenglin Zhu,
Youwei Zhang,
Hongyu Guo,
Fengyu Zhang,
Yuran Wang,
Bowen Ding,
Wei Song,
Xu Li,
Yuqi Huo,
Zheng Liang
, et al. (68 additional authors not shown)
Abstract:
We introduce Baichuan-Omni-1.5, an omni-modal model that not only has omni-modal understanding capabilities but also provides end-to-end audio generation capabilities. To achieve fluent and high-quality interaction across modalities without compromising the capabilities of any modality, we prioritized optimizing three key aspects. First, we establish a comprehensive data cleaning and synthesis pipeline for multimodal data, obtaining about 500B high-quality data (text, audio, and vision). Second, an audio-tokenizer (Baichuan-Audio-Tokenizer) has been designed to capture both semantic and acoustic information from audio, enabling seamless integration and enhanced compatibility with MLLM. Lastly, we designed a multi-stage training strategy that progressively integrates multimodal alignment and multitask fine-tuning, ensuring effective synergy across all modalities. Baichuan-Omni-1.5 leads contemporary models (including GPT4o-mini and MiniCPM-o 2.6) in terms of comprehensive omni-modal capabilities. Notably, it achieves results comparable to leading models such as Qwen2-VL-72B across various multimodal medical benchmarks.
Submitted 25 January, 2025;
originally announced January 2025.
-
Uni-Sign: Toward Unified Sign Language Understanding at Scale
Authors:
Zecheng Li,
Wengang Zhou,
Weichao Zhao,
Kepeng Wu,
Hezhen Hu,
Houqiang Li
Abstract:
Sign language pre-training has gained increasing attention for its ability to enhance performance across various sign language understanding (SLU) tasks. However, existing methods often suffer from a gap between pre-training and fine-tuning, leading to suboptimal results. To address this, we propose Uni-Sign, a unified pre-training framework that eliminates the gap between pre-training and downstream SLU tasks through a large-scale generative pre-training strategy and a novel fine-tuning paradigm. First, we introduce CSL-News, a large-scale Chinese Sign Language (CSL) dataset containing 1,985 hours of video paired with textual annotations, which enables effective large-scale pre-training. Second, Uni-Sign unifies SLU tasks by treating downstream tasks as a single sign language translation (SLT) task during fine-tuning, ensuring seamless knowledge transfer between pre-training and fine-tuning. Furthermore, we incorporate a prior-guided fusion (PGF) module and a score-aware sampling strategy to efficiently fuse pose and RGB information, addressing keypoint inaccuracies and improving computational efficiency. Extensive experiments across multiple SLU benchmarks demonstrate that Uni-Sign achieves state-of-the-art performance across multiple downstream SLU tasks. Dataset and code are available at github.com/ZechengLi19/Uni-Sign.
Submitted 13 March, 2025; v1 submitted 25 January, 2025;
originally announced January 2025.
-
Classification of $\mathrm{GL}_{n}(\mathbb{C})$-Representations Distinguished by $\mathrm{GL}_n(\mathbb{R})$
Authors:
Basudev Pattanayak,
Kaidi Wu,
Hongfeng Zhang
Abstract:
This paper provides a complete classification of $\mathrm{GL}_n(\mathbb{R})$-distinguished irreducible representations of $\mathrm{GL}_n(\mathbb{C})$ when the representations are either generic or unitary. Additionally, for each such $\mathrm{GL}_n(\mathbb{R})$-distinguished representation, we explicitly construct the associated period and prove its non-vanishing on the distinguished minimal $K$-type. Furthermore, we offer some applications to the branching problem using theta correspondence.
Submitted 10 March, 2025; v1 submitted 24 January, 2025;
originally announced January 2025.
-
Group-Agent Reinforcement Learning with Heterogeneous Agents
Authors:
Kaiyue Wu,
Xiao-Jun Zeng,
Tingting Mu
Abstract:
Group-agent reinforcement learning (GARL) is a newly arising learning scenario, where multiple reinforcement learning agents study together in a group, sharing knowledge in an asynchronous fashion. The goal is to improve the learning performance of each individual agent. Under a more general heterogeneous setting where different agents learn using different algorithms, we advance GARL by designing novel and effective group-learning mechanisms. They guide the agents on whether and how to learn from action choices from the others, and allow the agents to adopt available policy and value function models sent by another agent if they perform better. We have conducted extensive experiments on a total of 43 different Atari 2600 games to demonstrate the superior performance of the proposed method. After the group learning, among the 129 agents examined, 96% are able to achieve a learning speed-up, and 72% are able to learn over 100 times faster. Also, around 41% of those agents have achieved a higher accumulated reward score by learning in less than 5% of the time steps required by a single agent when learning on its own.
Submitted 15 February, 2025; v1 submitted 20 January, 2025;
originally announced January 2025.
-
Enhancing Brain Tumor Segmentation Using Channel Attention and Transfer learning
Authors:
Majid Behzadpour,
Ebrahim Azizi,
Kai Wu,
Bengie L. Ortiz
Abstract:
Accurate and efficient segmentation of brain tumors is critical for diagnosis, treatment planning, and monitoring in clinical practice. In this study, we present an enhanced ResUNet architecture for automatic brain tumor segmentation, integrating an EfficientNetB0 encoder, a channel attention mechanism, and an Atrous Spatial Pyramid Pooling (ASPP) module. The EfficientNetB0 encoder leverages pre-trained features to improve feature extraction efficiency, while the channel attention mechanism enhances the model's focus on tumor-relevant features. ASPP enables multiscale contextual learning, crucial for handling tumors of varying sizes and shapes. The proposed model was evaluated on two benchmark datasets: TCGA LGG and BraTS 2020. Experimental results demonstrate that our method consistently outperforms the baseline ResUNet and its EfficientNet variant, achieving Dice coefficients of 0.903 and 0.851 and HD95 scores of 9.43 and 3.54 for whole tumor and tumor core regions on the BraTS 2020 dataset, respectively. Compared with state-of-the-art methods, our approach shows competitive performance, particularly in whole tumor and tumor core segmentation. These results indicate that combining a powerful encoder with attention mechanisms and ASPP can significantly enhance brain tumor segmentation performance. The proposed approach holds promise for further optimization and application in other medical image segmentation tasks.
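For readers unfamiliar with the Dice coefficient reported above, the following sketch computes it for a pair of binary masks as 2|A∩B| / (|A| + |B|). It is only an illustration of the standard definition, not the authors' evaluation code. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions made here.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice overlap 2|A∩B| / (|A| + |B|) between two binary masks.

    Assumption for this sketch: two empty masks count as a perfect match.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)


# Usage: the prediction marks two pixels, one of which is truly tumor.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [0, 0]]
score = dice_coefficient(pred_mask, true_mask)  # 2*1 / (2+1)
```

A score of 1 means the predicted and reference regions coincide exactly, and 0 means they do not overlap at all.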
Submitted 19 January, 2025;
originally announced January 2025.
-
Complete Hamiltonian Framework of Relativistic Hierarchical Triple Systems: Capabilities and Limitations of Secular Perturbation Theory
Authors:
Kaye Jiale Li,
Kinwah Wu,
Ziri Younsi,
Tjonnie G. F. Li
Abstract:
Relativistic secular perturbation theory has ignited significant interest in uncovering intricate cross-term effects, especially the interplay between 1PN and quadrupole terms. While most existing studies rely on the Lagrangian planetary perturbation method for computing cross terms, a comprehensive Hamiltonian framework for the field has been missing. In this work, we introduce a framework based on von Zeipel transformation, utilizing two sequential canonical transformations to systematically compute cross terms to arbitrary orders. Our results reveal secular cross terms up to quadrupole-squared order, showcasing remarkable consistency with both the Lagrangian method [1] and the effective-field-theory approach [2]. We present leading-order periodic cross terms arising from the interactions between 1PN and quadrupole, and present estimates of higher-order cross terms. It is demonstrated that this method not only accurately predicts the long-term evolution of hierarchical systems but also captures fast oscillations observed in N-body simulations. We identify and validate resonances caused by quadrupole-squared effects, highlighting both consistencies and discrepancies when compared to N-body simulations. These discrepancies underscore the importance of mean-motion resonances, a factor overlooked in current secular perturbation frameworks. Finally, we provide a comprehensive review of the subtleties and limitations inherent to secular perturbation theory, paving the way for future research and advancements in this field.
Submitted 14 January, 2025;
originally announced January 2025.
-
VINGS-Mono: Visual-Inertial Gaussian Splatting Monocular SLAM in Large Scenes
Authors:
Ke Wu,
Zicheng Zhang,
Muer Tie,
Ziqing Ai,
Zhongxue Gan,
Wenchao Ding
Abstract:
VINGS-Mono is a monocular (inertial) Gaussian Splatting (GS) SLAM framework designed for large scenes. The framework comprises four main components: VIO Front End, 2D Gaussian Map, NVS Loop Closure, and Dynamic Eraser. In the VIO Front End, RGB frames are processed through dense bundle adjustment and uncertainty estimation to extract scene geometry and poses. Based on this output, the mapping module incrementally constructs and maintains a 2D Gaussian map. Key components of the 2D Gaussian Map include a Sample-based Rasterizer, Score Manager, and Pose Refinement, which collectively improve mapping speed and localization accuracy. This enables the SLAM system to handle large-scale urban environments with up to 50 million Gaussian ellipsoids. To ensure global consistency in large-scale scenes, we design a Loop Closure module, which innovatively leverages the Novel View Synthesis (NVS) capabilities of Gaussian Splatting for loop closure detection and correction of the Gaussian map. Additionally, we propose a Dynamic Eraser to address the inevitable presence of dynamic objects in real-world outdoor scenes. Extensive evaluations in indoor and outdoor environments demonstrate that our approach achieves localization performance on par with Visual-Inertial Odometry while surpassing recent GS/NeRF SLAM methods. It also significantly outperforms all existing methods in terms of mapping and rendering quality. Furthermore, we developed a mobile app and verified that our framework can generate high-quality Gaussian maps in real time using only a smartphone camera and a low-frequency IMU sensor. To the best of our knowledge, VINGS-Mono is the first monocular Gaussian SLAM method capable of operating in outdoor environments and supporting kilometer-scale large scenes.
Submitted 14 January, 2025;
originally announced January 2025.
-
Asymptotic-Preserving Neural Networks based on Even-odd Decomposition for Multiscale Gray Radiative Transfer Equations
Authors:
Keke Wu,
Xizhe Xie,
Wengu Chen,
Han Wang,
Zheng Ma
Abstract:
We present a novel Asymptotic-Preserving Neural Network (APNN) approach utilizing even-odd decomposition to tackle the nonlinear gray radiative transfer equations (GRTEs). Our AP loss demonstrates consistent stability concerning the small Knudsen number, ensuring the neural network solution uniformly converges to the macro solution. This APNN method alleviates the rigorous conservation requirements while simultaneously incorporating an auxiliary deep neural network, distinguishing it from the APNN method based on micro-macro decomposition for GRTE. Several numerical problems are examined to demonstrate the effectiveness of our proposed APNN technique.
Submitted 14 January, 2025;
originally announced January 2025.
-
Rationalisation of multiple square roots in Feynman integrals
Authors:
Georgios Papathanasiou,
Stefan Weinzierl,
Konglong Wu,
Yang Zhang
Abstract:
Feynman integrals are very often computed from their differential equations. It is not uncommon that the $\varepsilon$-factorised differential equation contains only dlog-forms with algebraic arguments, where the algebraic part is given by (multiple) square roots. It is well known that if all square roots are simultaneously rationalisable, the Feynman integrals can be expressed in terms of multiple polylogarithms. This is a sufficient, but not a necessary criterion. In this paper we investigate weaker requirements. We discuss under which conditions we may use different rationalisations in different parts of the calculation. In particular we show that we may use different rationalisations if they correspond to different parameterisations of the same integration path. We present a non-trivial example -- the one-loop pentagon function with three adjacent massive external legs involving seven square roots -- where this technique can be used to express the result in terms of multiple polylogarithms.
Submitted 2 April, 2025; v1 submitted 13 January, 2025;
originally announced January 2025.
-
Measurements of the Temperature and E-mode Polarization of the Cosmic Microwave Background from the Full 500-square-degree SPTpol Dataset
Authors:
T. -L. Chou,
P. A. R. Ade,
A. J. Anderson,
J. E. Austermann,
L. Balkenhol,
J. A. Beall,
A. N. Bender,
B. A. Benson,
F. Bianchini,
L. E. Bleem,
J. E. Carlstrom,
C. L. Chang,
P. Chaubal,
H. C. Chiang,
R. Citron,
C. Corbett Moran,
T. M. Crawford,
A. T. Crites,
T. de Haan,
M. A. Dobbs,
D. Dutcher,
W. Everett,
J. Gallicchio,
E. M. George,
N. Gupta
, et al. (37 additional authors not shown)
Abstract:
Using the full four-year SPTpol 500 deg$^2$ dataset in both the 95 GHz and 150 GHz frequency bands, we present measurements of the temperature and $E$-mode polarization of the cosmic microwave background (CMB), as well as the $E$-mode polarization auto-power spectrum ($EE$) and temperature-$E$-mode cross-power spectrum ($TE$) in the angular multipole range $50<\ell<8000$. We find the SPTpol dataset to be self-consistent, passing several internal consistency tests based on maps, frequency bands, bandpowers, and cosmological parameters. The full SPTpol dataset is well-fit by the $\Lambda$CDM model, for which we find $H_0=70.48\pm2.16$ km s$^{-1}$ Mpc$^{-1}$ and $\Omega_m=0.271\pm0.026$, when using only the SPTpol data and a Planck-based prior on the optical depth to reionization. The $\Lambda$CDM parameter constraints are consistent across the 95 GHz-only, 150 GHz-only, $TE$-only, and $EE$-only data splits. Between the $\ell<1000$ and $\ell>1000$ data splits, the $\Lambda$CDM parameter constraints are borderline consistent at the $\sim2\sigma$ level. This consistency improves when including a parameter $A_L$, the degree of lensing of the CMB inferred from the smearing of acoustic peaks. When marginalized over $A_L$, the $\Lambda$CDM parameter constraints from SPTpol are consistent with those from Planck. The power spectra presented here are the most sensitive measurements of the lensed CMB damping tail to date for roughly $\ell > 1700$ in $TE$ and $\ell > 2000$ in $EE$.
Submitted 12 January, 2025;
originally announced January 2025.
-
Holographic Entanglement Entropy as a Probe of Dynamical Criticality in Scalarizing Black Holes
Authors:
Yi Li,
Ke-tai Wu,
Chong-Ye Chen,
Chao Niu,
Cheng-Yong Zhang,
Peng Liu
Abstract:
We demonstrate that holographic entanglement entropy (HEE) serves as a powerful diagnostic tool for both static and dynamical critical phenomena in the Einstein-Born-Infeld-Scalar (EBIS) model. While HEE is well known for capturing static phase transitions, we reveal its novel ability to probe dynamical criticality, particularly the "flip" phenomenon, a sign inversion in the scalar field at a critical point. Near the flip, HEE exhibits relaxation dynamics that closely mirror those of the scalar field, with both relaxation times scaling logarithmically with the distance from the critical point. This intimate connection between the relaxation of HEE and the scalar field highlights HEE as a sensitive probe of dynamical critical phenomena. Our findings provide new insights into the interplay between quantum information and gravitational dynamics, offering a deeper understanding of critical behavior in strongly coupled systems.
Submitted 10 January, 2025;
originally announced January 2025.
-
Blockchain-Based Secure Vehicle Auction System with Smart Contracts
Authors:
Ka Wai Wu
Abstract:
The problem of a single point of failure in centralized systems poses a great challenge to the stability of such systems. Meanwhile, the tamperability of data within centralized systems makes users reluctant to trust and use centralized applications in many scenarios, including the financial and business sectors.
Blockchain, as a new decentralized technology, addresses these issues effectively. As a typical decentralized system, blockchain can be utilized to build a data-sharing model. Users in a blockchain do not need to trust other users; instead, they trust that the majority of miner nodes are honest. Smart contracts enable developers to write distributed programs based on blockchain systems, ensuring that all code is immutable and secure.
In this paper, we analyze the security of blockchain technology to illustrate its advantages and justify its use. Furthermore, we design a new system for storing and trading vehicle information based on the Ethereum blockchain and smart contract technology. Specifically, our system allows users to upload vehicle information and auction vehicles to transfer ownership. Our application provides great convenience to buyers and owners, while the use of smart contracts enhances the security and privacy of the system.
Submitted 19 April, 2025; v1 submitted 8 January, 2025;
originally announced January 2025.
-
A Liouville theorem for supercritical Fujita equation and its applications
Authors:
Kelei Wang,
Juncheng Wei,
Ke Wu
Abstract:
We prove a Liouville theorem for ancient solutions to the supercritical Fujita equation \[\partial_t u-\Delta u=|u|^{p-1}u, \quad -\infty <t<0, \quad p>\frac{n+2}{n-2},\] which says that if $u$ is close to the ODE solution $u_0(t):=(p-1)^{-\frac{1}{p-1}}(-t)^{-\frac{1}{p-1}}$ at large scales, then it is an ODE solution (i.e. it depends only on $t$). This implies a stability property for ODE blow-ups in this problem.
As an application of these results, we show that for a suitable weak solution, its singular set at the end time can be decomposed into two parts: one part is relatively open and $(n-1)$-rectifiable, and it is characterized by the property that tangent functions at these points are the two constants $\pm(p-1)^{-\frac{1}{p-1}}$; the other part is relatively closed and its Hausdorff dimension is not larger than $n-\left[2\frac{p+1}{p-1}\right]-1$.
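As a quick check of why $u_0$ is called the ODE solution, the short computation below verifies that it solves $u'=u^{p}$ for $t<0$. Because $u_0$ does not depend on $x$, the Laplacian term vanishes, and because $u_0>0$, the nonlinearity $|u|^{p-1}u$ reduces to $u^{p}$. This is a routine verification added for the reader and is not taken from the paper.

```latex
\begin{align*}
u_0(t) &= (p-1)^{-\frac{1}{p-1}}\,(-t)^{-\frac{1}{p-1}}, \qquad t<0,\\
u_0'(t) &= (p-1)^{-\frac{1}{p-1}}\cdot\Bigl(-\tfrac{1}{p-1}\Bigr)(-t)^{-\frac{1}{p-1}-1}\cdot(-1)
         = (p-1)^{-\frac{p}{p-1}}\,(-t)^{-\frac{p}{p-1}},\\
u_0(t)^{p} &= (p-1)^{-\frac{p}{p-1}}\,(-t)^{-\frac{p}{p-1}} = u_0'(t).
\end{align*}
```

Since $u_0(t)\to\infty$ as $t\to 0^-$, this solution blows up at the end time $t=0$, uniformly in $x$.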
Submitted 7 January, 2025;
originally announced January 2025.
-
Foundations of Platform-Assisted Auctions
Authors:
Hao Chung,
Ke Wu,
Elaine Shi
Abstract:
Today, many auctions are carried out with the help of intermediary platforms like Google and eBay. We refer to such auctions as platform-assisted auctions. Traditionally, the auction theory literature mainly focuses on designing auctions that incentivize the buyers to bid truthfully, assuming that the platform always faithfully implements the auction. In practice, however, the platforms have been found to manipulate the auctions to earn more profit, resulting in high-profile anti-trust lawsuits. We propose a new model for studying platform-assisted auctions in the permissionless setting. We explore whether it is possible to design a dream auction in this new model, such that honest behavior is the utility-maximizing strategy for each individual buyer, the platform, the seller, as well as platform-seller or platform-buyer coalitions. Through a collection of feasibility and infeasibility results, we carefully characterize the mathematical landscape of platform-assisted auctions. We show how cryptography can lend to the design of an efficient platform-assisted auction with dream properties. Although a line of work has also used MPC or the blockchain to remove the reliance on a trusted auctioneer, our work is distinct in nature in several dimensions. First, we initiate a systematic exploration of the game theoretic implications when the service providers are strategic and can collude with sellers or buyers. Second, we observe that the full simulation paradigm is too stringent and leads to high asymptotic costs. Specifically, because every player has a different private outcome in an auction protocol, running any generic MPC protocol among the players would incur at least $n^2$ total cost. We propose a new notion of simulation called utility-dominated emulation. Under this new notion, we show how to design efficient auction protocols with quasilinear efficiency.
Submitted 6 January, 2025;
originally announced January 2025.
-
Nonrelativistic spin-splitting multiferroic antiferromagnet and compensated ferrimagnet with zero net magnetization
Authors:
Jianting Dong,
Kun Wu,
Meng Zhu,
Fanxing Zheng,
Xinlu Li,
Jia Zhang
Abstract:
Spin-splitting antiferromagnets with spin-polarized band structures in momentum space have garnered intensive research attention due to their zero net magnetic moments, ultrafast spin dynamics like those of conventional antiferromagnets, and spin-polarized transport properties akin to ferromagnets, making them promising candidates for antiferromagnetic spintronics. However, unlike spin-torque switching of ferromagnets by electric current, efficient electric control of spin-splitting antiferromagnetic order remains a challenge. In this work, we identify prototypes of multiferroic spin-splitting antiferromagnets, including BiFeO3, Fe2Mo3O8 and the compensated ferrimagnet GaFeO3, with ferroelectric polarization as well as spin-polarized electronic structures. We establish design principles for the spin-splitting multiferroic antiferromagnets and compensated ferrimagnets, elucidating the band symmetry features in the Brillouin zone. We demonstrate that the spin polarization in spin-splitting magnets, despite zero net magnetic moment, can be switched by ferroelectric polarization, providing an efficient means of controlling the antiferromagnetic order. Our work may inspire future development of novel multiferroic functional magnets with zero magnetic moments and pave the way for their applications in magnetoelectric spintronic devices.
Submitted 6 January, 2025;
originally announced January 2025.
-
A Survey of Test-Time Compute: From Intuitive Inference to Deliberate Reasoning
Authors:
Yixin Ji,
Juntao Li,
Yang Xiang,
Hai Ye,
Kaixin Wu,
Kai Yao,
Jia Xu,
Linjian Mo,
Min Zhang
Abstract:
The remarkable performance of the o1 model in complex reasoning demonstrates that test-time compute scaling can further unlock the model's potential, enabling powerful System-2 thinking. However, there is still a lack of comprehensive surveys for test-time compute scaling. We trace the concept of test-time compute back to System-1 models. In System-1 models, test-time compute addresses distribution shifts and improves robustness and generalization through parameter updating, input modification, representation editing, and output calibration. In System-2 models, it enhances the model's reasoning ability to solve complex problems through repeated sampling, self-correction, and tree search. We organize this survey according to the trend of System-1 to System-2 thinking, highlighting the key role of test-time compute in the transition from System-1 models to weak System-2 models, and then to strong System-2 models. We also point out advanced topics and future directions.
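Of the System-2 strategies listed above, repeated sampling is the simplest to illustrate. The sketch below draws several candidate answers and returns the most frequent one (majority voting). It is a toy example rather than a method from the survey: `sample_and_vote` and the `sample_answer` callable are hypothetical names, and the callable stands in for one stochastic model call.

```python
from collections import Counter
from typing import Callable, Hashable


def sample_and_vote(sample_answer: Callable[[], Hashable], n_samples: int = 5) -> Hashable:
    """Draw `n_samples` answers and return the most common one.

    `sample_answer` is a hypothetical stand-in for one stochastic model
    call that returns a final answer (for example, a string).
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    answers = [sample_answer() for _ in range(n_samples)]
    # most_common(1) returns the top (answer, count) pair; ties go to the
    # answer that was seen first.
    return Counter(answers).most_common(1)[0][0]


# Usage with a scripted "model" that disagrees with itself on some draws.
scripted = iter(["42", "41", "42", "7", "42"])
final_answer = sample_and_vote(lambda: next(scripted), n_samples=5)
```

Spending more samples raises test-time compute in exchange for a more stable final answer. Self-correction and tree search spend the extra compute differently and are not shown here.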
Submitted 29 June, 2025; v1 submitted 5 January, 2025;
originally announced January 2025.