-
The Amazon Nova Family of Models: Technical Report and Model Card
Authors:
Amazon AGI,
Aaron Langford,
Aayush Shah,
Abhanshu Gupta,
Abhimanyu Bhatter,
Abhinav Goyal,
Abhinav Mathur,
Abhinav Mohanty,
Abhishek Kumar,
Abhishek Sethi,
Abi Komma,
Abner Pena,
Achin Jain,
Adam Kunysz,
Adam Opyrchal,
Adarsh Singh,
Aditya Rawal,
Adok Achar Budihal Prasad,
Adrià de Gispert,
Agnika Kumar,
Aishwarya Aryamane,
Ajay Nair,
Akilan M,
Akshaya Iyengar,
Akshaya Vishnu Kudlu Shanbhogue
, et al. (761 additional authors not shown)
Abstract:
We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents and text. Amazon Nova Micro is a text-only model that delivers our lowest-latency responses at very low cost. Amazon Nova Canvas is an image generation model that creates professional grade images with rich customization controls. Amazon Nova Reel is a video generation model offering high-quality outputs, customization, and motion control. Our models were built responsibly and with a commitment to customer trust, security, and reliability. We report benchmarking results for core capabilities, agentic performance, long context, functional adaptation, runtime performance, and human evaluation.
Submitted 17 March, 2025;
originally announced June 2025.
-
Resolve Highway Conflict in Multi-Autonomous Vehicle Controls with Local State Attention
Authors:
Xuan Duy Ta,
Bang Giang Le,
Thanh Ha Le,
Viet Cuong Ta
Abstract:
In mixed-traffic environments, autonomous vehicles must adapt to human-controlled vehicles and other unusual driving situations. This setting can be framed as a multi-agent reinforcement learning (MARL) environment with a fully cooperative reward among the autonomous vehicles. While methods such as Multi-agent Proximal Policy Optimization can be effective in training MARL tasks, they often fail to resolve local conflicts between agents and are unable to generalize to stochastic events. In this paper, we propose a Local State Attention module to assist the input state representation. By relying on the self-attention operator, the module is expected to compress the essential information of nearby agents to resolve conflicts in traffic situations. Utilizing a simulated highway merging scenario with a priority vehicle as the unexpected event, our approach is able to prioritize other vehicles' information to manage the merging process. The results demonstrate significant improvements in merging efficiency compared to popular baselines, especially in high-density traffic settings.
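The abstract does not spell out the Local State Attention module, so the following is only a minimal sketch of the generic operator it builds on: scaled dot-product attention of one ego-vehicle state over its neighbours' states. The two-dimensional state layout, the absence of learned query/key/value projections, and the names `attend`, `ego`, and `neighbours` are all assumptions made for illustration, not the authors' design.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, neighbours):
    """Scaled dot-product attention of one ego state over neighbour states.

    Keys and values are the raw neighbour states (no learned projections),
    which is a simplification assumed for this sketch.
    """
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, n)) / math.sqrt(d) for n in neighbours
    ]
    weights = softmax(scores)
    # Weighted sum of neighbour states: a fixed-size summary of nearby agents.
    pooled = [sum(w * n[i] for w, n in zip(weights, neighbours)) for i in range(d)]
    return weights, pooled


# Hypothetical 2-D states: the first neighbour is most aligned with the ego state.
ego = [1.0, 0.0]
neighbours = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights, pooled = attend(ego, neighbours)
```

The pooled vector has the same size regardless of how many neighbours are present, which is the property that lets an attention layer compress a variable number of nearby agents into a fixed-length policy input.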
Submitted 12 June, 2025;
originally announced June 2025.
-
Euclid preparation: The NISP spectroscopy channel, on ground performance and calibration
Authors:
Euclid Collaboration,
W. Gillard,
T. Maciaszek,
E. Prieto,
F. Grupp,
A. Costille,
K. Jahnke,
J. Clemens,
S. Dusini,
M. Carle,
C. Sirignano,
E. Medinaceli,
S. Ligori,
E. Franceschi,
M. Trifoglio,
W. Bon,
R. Barbier,
S. Ferriol,
A. Secroun,
N. Auricchio,
P. Battaglia,
C. Bonoli,
L. Corcione,
F. Hormuth,
D. Le Mignant
, et al. (334 additional authors not shown)
Abstract:
ESA's Euclid cosmology mission relies on the very sensitive and accurately calibrated spectroscopy channel of the Near-Infrared Spectrometer and Photometer (NISP). With three operational grisms in two wavelength intervals, NISP provides diffraction-limited slitless spectroscopy over a field of $0.57$ deg$^2$. A blue grism $\text{BG}_\text{E}$ covers the wavelength range $926$--$1366$\,nm at a spectral resolution $R=440$--$900$ for a $0.5''$ diameter source with a dispersion of $1.24$ nm px$^{-1}$. Two red grisms $\text{RG}_\text{E}$ span $1206$ to $1892$\,nm at $R=550$--$740$ and a dispersion of $1.37$ nm px$^{-1}$. We describe the construction of the grisms as well as the ground testing of the flight model of the NISP instrument where these properties were established.
Submitted 9 June, 2025;
originally announced June 2025.
-
ETT-CKGE: Efficient Task-driven Tokens for Continual Knowledge Graph Embedding
Authors:
Lijing Zhu,
Qizhen Lan,
Qing Tian,
Wenbo Sun,
Li Yang,
Lu Xia,
Yixin Xie,
Xi Xiao,
Tiehang Duan,
Cui Tao,
Shuteng Niu
Abstract:
Continual Knowledge Graph Embedding (CKGE) seeks to integrate new knowledge while preserving past information. However, existing methods struggle with efficiency and scalability due to two key limitations: (1) suboptimal knowledge preservation between snapshots caused by manually designed node/relation importance scores that ignore graph dependencies relevant to the downstream task, and (2) computationally expensive graph traversal for node/relation importance calculation, leading to slow training and high memory overhead. To address these limitations, we introduce ETT-CKGE (Efficient, Task-driven, Tokens for Continual Knowledge Graph Embedding), a novel task-guided CKGE method that leverages efficient task-driven tokens for efficient and effective knowledge transfer between snapshots. Our method introduces a set of learnable tokens that directly capture task-relevant signals, eliminating the need for explicit node scoring or traversal. These tokens serve as consistent and reusable guidance across snapshots, enabling efficient token-masked embedding alignment between snapshots. Importantly, knowledge transfer is achieved through simple matrix operations, significantly reducing training time and memory usage. Extensive experiments across six benchmark datasets demonstrate that ETT-CKGE consistently achieves superior or competitive predictive performance, while substantially improving training efficiency and scalability compared to state-of-the-art CKGE methods. The code is available at: https://github.com/lijingzhu1/ETT-CKGE/tree/main
Submitted 9 June, 2025;
originally announced June 2025.
-
Euclid preparation. Constraining parameterised models of modifications of gravity with the spectroscopic and photometric primary probes
Authors:
Euclid Collaboration,
I. S. Albuquerque,
N. Frusciante,
Z. Sakr,
S. Srinivasan,
L. Atayde,
B. Bose,
V. F. Cardone,
S. Casas,
M. Martinelli,
J. Noller,
E. M. Teixeira,
D. B. Thomas,
I. Tutusaus,
M. Cataneo,
K. Koyama,
L. Lombriser,
F. Pace,
A. Silvestri,
N. Aghanim,
A. Amara,
S. Andreon,
N. Auricchio,
C. Baccigalupi,
M. Baldi
, et al. (263 additional authors not shown)
Abstract:
The Euclid mission has the potential to understand the fundamental physical nature of late-time cosmic acceleration and, as such, of deviations from the standard cosmological model, LCDM. In this paper, we focus on model-independent methods to modify the evolution of scalar perturbations at linear scales. We consider two approaches: the first is based on the two phenomenological modified gravity (PMG) parameters, $μ_{\rm mg}$ and $Σ_{\rm mg}$, which are phenomenologically connected to the clustering of matter and weak lensing, respectively; and the second is the effective field theory (EFT) of dark energy and modified gravity, which we use to parameterise the braiding function, $α_{\rm B}$, which defines the mixing between the metric and the dark energy field. We discuss the predictions from spectroscopic and photometric primary probes by Euclid on the cosmological parameters and a given set of additional parameters featuring the PMG and EFT models. We use the Fisher matrix method applied to spectroscopic galaxy clustering (GCsp), weak lensing (WL), photometric galaxy clustering (GCph), and cross-correlation (XC) between GCph and WL. For the modelling of photometric predictions on nonlinear scales, we use the halo model to cover two limits for the screening mechanism: the unscreened (US) case, for which the screening mechanism is not present; and the super-screened (SS) case, which assumes strong screening. We also assume scale cuts to account for our uncertainties in the modelling of nonlinear perturbation evolution. We choose a time-dependent form for $\{μ_{\rm mg},Σ_{\rm mg}\}$, with two fiducial sets of values for the corresponding model parameters at the present time, $\{\barμ_0,\barΣ_0\}$, and two forms for $α_{\rm B}$, with one fiducial set of values for each of the model parameters, $α_{\rm B,0}$ and $\{α_{\rm B,0},m\}$. (Abridged)
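As a rough illustration of how a Fisher matrix turns into forecast uncertainties, the sketch below inverts a toy two-parameter Fisher matrix and compares marginalised errors, $\sqrt{(F^{-1})_{ii}}$, with conditional errors, $1/\sqrt{F_{ii}}$. The matrix entries are invented for the example and are not Euclid forecasts; the function names are likewise hypothetical.

```python
import math


def invert_2x2(f):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = f
    det = a * d - b * c
    if det == 0:
        raise ValueError("Fisher matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def marginalised_errors(f):
    """1-sigma errors with the other parameter marginalised over."""
    cov = invert_2x2(f)
    return [math.sqrt(cov[i][i]) for i in range(2)]


def conditional_errors(f):
    """1-sigma errors with the other parameter held fixed."""
    return [1.0 / math.sqrt(f[i][i]) for i in range(2)]


# Toy Fisher matrix for two correlated parameters (illustrative values only).
fisher = [[400.0, 120.0], [120.0, 100.0]]
marg = marginalised_errors(fisher)   # [0.0625, 0.125]
cond = conditional_errors(fisher)    # [0.05, 0.1]
```

The marginalised errors are larger than the conditional ones because the off-diagonal term encodes a degeneracy between the two parameters.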
Submitted 3 June, 2025;
originally announced June 2025.
-
SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving
Authors:
Wendong Xu,
Jing Xiong,
Chenyang Zhao,
Qiujiang Chen,
Haoran Wang,
Hui Shen,
Zhongwei Wan,
Jianbo Dai,
Taiqiang Wu,
He Xiao,
Chaofan Tao,
Z. Morley Mao,
Ying Sheng,
Zhijiang Guo,
Hongxia Yang,
Bei Yu,
Lingpeng Kong,
Quanquan Gu,
Ngai Wong
Abstract:
We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs) that closely mirrors real-world software development workflows. Unlike traditional static benchmarks, SwingArena models the collaborative process of software iteration by pairing LLMs as submitters, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines. To support these interactive evaluations, we introduce a retrieval-augmented code generation (RACG) module that efficiently handles long-context challenges by providing syntactically and semantically relevant code snippets from large codebases, supporting multiple programming languages (C++, Python, Rust, and Go). This enables the framework to scale across diverse tasks and contexts while respecting token limitations. Our experiments, using over 400 high-quality real-world GitHub issues selected from a pool of 2,300 issues, show that models like GPT-4o excel at aggressive patch generation, whereas DeepSeek and Gemini prioritize correctness in CI validation. SwingArena presents a scalable and extensible methodology for evaluating LLMs in realistic, CI-driven software development settings. More details are available on our project page: swing-bench.github.io
Submitted 2 June, 2025; v1 submitted 29 May, 2025;
originally announced May 2025.
-
A Joint Learning Framework with Feature Reconstruction and Prediction for Incomplete Satellite Image Time Series in Agricultural Semantic Segmentation
Authors:
Yuze Wang,
Mariana Belgiu,
Haiyang Wu,
Dandan Zhong,
Yangyang Cao,
Chao Tao
Abstract:
Satellite Image Time Series (SITS) is crucial for agricultural semantic segmentation. However, cloud contamination introduces time gaps in SITS, disrupting temporal dependencies and causing feature shifts, leading to degraded performance of models trained on complete SITS. Existing methods typically address this by reconstructing the entire SITS before prediction or using data augmentation to simulate missing data. Yet, full reconstruction may introduce noise and redundancy, while the data-augmented model can only handle limited missing patterns, leading to poor generalization. We propose a joint learning framework with feature reconstruction and prediction to address incomplete SITS more effectively. During training, we simulate data-missing scenarios using temporal masks. The two tasks are guided by both ground-truth labels and the teacher model trained on complete SITS. The prediction task constrains the model to selectively reconstruct critical features from masked inputs that align with the teacher's temporal feature representations. This reduces unnecessary reconstruction and limits noise propagation. By integrating reconstructed features into the prediction task, the model avoids learning shortcuts and maintains its ability to handle varied missing patterns and complete SITS. Experiments on SITS from Hunan Province, Western France, and Catalonia show that our method improves mean F1-scores by 6.93% in cropland extraction and 7.09% in crop classification over baselines. It also generalizes well across satellite sensors, including Sentinel-2 and PlanetScope, under varying temporal missing rates and model backbones.
Submitted 25 May, 2025;
originally announced May 2025.
-
BEDI: A Comprehensive Benchmark for Evaluating Embodied Agents on UAVs
Authors:
Mingning Guo,
Mengwei Wu,
Jiarun He,
Shaoxian Li,
Haifeng Li,
Chao Tao
Abstract:
With the rapid advancement of low-altitude remote sensing and Vision-Language Models (VLMs), Embodied Agents based on Unmanned Aerial Vehicles (UAVs) have shown significant potential in autonomous tasks. However, current evaluation methods for UAV-Embodied Agents (UAV-EAs) remain constrained by the lack of standardized benchmarks, diverse testing scenarios and open system interfaces. To address these challenges, we propose BEDI (Benchmark for Embodied Drone Intelligence), a systematic and standardized benchmark designed for evaluating UAV-EAs. Specifically, we introduce a novel Dynamic Chain-of-Embodied-Task paradigm based on the perception-decision-action loop, which decomposes complex UAV tasks into standardized, measurable subtasks. Building on this paradigm, we design a unified evaluation framework encompassing five core sub-skills: semantic perception, spatial perception, motion control, tool utilization, and task planning. Furthermore, we construct a hybrid testing platform that integrates static real-world environments with dynamic virtual scenarios, enabling comprehensive performance assessment of UAV-EAs across varied contexts. The platform also offers open and standardized interfaces, allowing researchers to customize tasks and extend scenarios, thereby enhancing flexibility and scalability in the evaluation process. Finally, through empirical evaluations of several state-of-the-art (SOTA) VLMs, we reveal their limitations in embodied UAV tasks, underscoring the critical role of the BEDI benchmark in advancing embodied intelligence research and model optimization. By filling the gap in systematic and standardized evaluation within this field, BEDI facilitates objective model comparison and lays a robust foundation for future development in this field. Our benchmark will be released at https://github.com/lostwolves/BEDI .
Submitted 23 May, 2025;
originally announced May 2025.
-
PhyX: Does Your Model Have the "Wits" for Physical Reasoning?
Authors:
Hui Shen,
Taiqiang Wu,
Qi Han,
Yunta Hsieh,
Jizhou Wang,
Yuyue Zhang,
Yuxin Cheng,
Zijian Hao,
Yuansheng Ni,
Xin Wang,
Zhongwei Wan,
Kai Zhang,
Wendong Xu,
Jing Xiong,
Ping Luo,
Wenhu Chen,
Chaofan Tao,
Zhuoqing Mao,
Ngai Wong
Abstract:
Existing benchmarks fail to capture a crucial aspect of intelligence: physical reasoning, the integrated ability to combine domain knowledge, symbolic reasoning, and understanding of real-world constraints. To address this gap, we introduce PhyX: the first large-scale benchmark designed to assess models' capacity for physics-grounded reasoning in visual scenarios. PhyX includes 3K meticulously curated multimodal questions spanning 6 reasoning types across 25 sub-domains and 6 core physics domains: thermodynamics, electromagnetism, mechanics, modern physics, optics, and wave & acoustics. In our comprehensive evaluation, even state-of-the-art models struggle significantly with physical reasoning. GPT-4o, Claude3.7-Sonnet, and GPT-o4-mini achieve only 32.5%, 42.2%, and 45.8% accuracy, respectively, with performance gaps exceeding 29% compared to human experts. Our analysis exposes critical limitations in current models: over-reliance on memorized disciplinary knowledge, excessive dependence on mathematical formulations, and surface-level visual pattern matching rather than genuine physical understanding. We provide in-depth analysis through fine-grained statistics, detailed case studies, and multiple evaluation paradigms to thoroughly examine physical reasoning capabilities. To ensure reproducibility, we implement a compatible evaluation protocol based on widely-used toolkits such as VLMEvalKit, enabling one-click evaluation. More details are available on our project page: https://phyx-bench.github.io/.
Submitted 29 May, 2025; v1 submitted 21 May, 2025;
originally announced May 2025.
-
Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning
Authors:
Jingqi Tong,
Jixin Tang,
Hangcheng Li,
Yurong Mou,
Ming Zhang,
Jun Zhao,
Yanbo Wen,
Fan Song,
Jiahao Zhan,
Yuyang Lu,
Chaoran Tao,
Zhiyuan Guo,
Jizhou Yu,
Tianhao Cheng,
Changhao Jiang,
Zhen Wang,
Tao Liang,
Zhihui Fei,
Mingyang Wan,
Guojun Ma,
Weifeng Ge,
Guanhua Chen,
Tao Gui,
Xipeng Qiu,
Qi Zhang
, et al. (1 additional author not shown)
Abstract:
Visual-language Chain-of-Thought (CoT) data resources are relatively scarce compared to text-only counterparts, limiting the improvement of reasoning capabilities in Vision Language Models (VLMs). However, high-quality vision-language reasoning data is expensive and labor-intensive to annotate. To address this issue, we leverage a promising resource: game code, which naturally contains logical structures and state transition processes. Therefore, we propose Code2Logic, a novel game-code-driven approach for multimodal reasoning data synthesis. Our approach leverages Large Language Models (LLMs) to adapt game code, enabling automatic acquisition of reasoning processes and results through code execution. Using the Code2Logic approach, we developed the GameQA dataset to train and evaluate VLMs. GameQA is cost-effective and scalable to produce, challenging for state-of-the-art models, and diverse with 30 games and 158 tasks. Surprisingly, despite training solely on game data, VLMs demonstrated out-of-domain generalization, with Qwen2.5-VL-7B improving performance by 2.33% across 7 diverse vision-language benchmarks. Our code and dataset are available at https://github.com/tongjingqi/Code2Logic.
Submitted 19 May, 2025;
originally announced May 2025.
-
An Agnostic Approach to Building Empirical Type Ia Supernova Light Curves: Evidence for Intrinsic Chromatic Flux Variation Using Nearby Supernova Factory Data
Authors:
Jared Hand,
A. G. Kim,
G. Aldering,
P. Antilogus,
C. Aragon,
S. Bailey,
C. Baltay,
S. Bongard,
K. Boone,
C. Buton,
Y. Copin,
S. Dixon,
D. Fouchez,
E. Gangler,
R. Gupta,
B. Hayden,
W. Hillebrandt,
Mitchell Karmen,
M. Kowalski,
D. Küsters,
P. -F. Léget,
F. Mondon,
J. Nordin,
R. Pain,
E. Pecontal
, et al. (13 additional authors not shown)
Abstract:
We present a new empirical Type Ia supernova (SN Ia) model with three chromatic flux variation templates: one phase dependent and two phase independent. No underlying dust extinction model or patterns of intrinsic variability are assumed. Implemented with Stan and trained using spectrally binned Nearby Supernova Factory spectrophotometry, we examine this model's 2D, phase-independent flux variation space using two motivated basis representations. In both, the first phase-independent template captures variation that appears dust-like, while the second captures a combination of effectively intrinsic variability and second-order dust-like effects. We find that approximately 13% of the modeled phase-independent flux variance is not dust-like. Previous empirical SN Ia models either assume an effective dust extinction recipe in their architecture, or only allow for a single mode of phase-independent variation. The presented results demonstrate such an approach may be insufficient, because it could "leak" noticeable intrinsic variation into phase-independent templates.
Submitted 10 May, 2025;
originally announced May 2025.
-
Euclid preparation. The impact of redshift interlopers on the two-point correlation function analysis
Authors:
Euclid Collaboration,
I. Risso,
A. Veropalumbo,
E. Branchini,
E. Maragliano,
S. de la Torre,
E. Sarpa,
P. Monaco,
B. R. Granett,
S. Lee,
G. E. Addison,
S. Bruton,
C. Carbone,
G. Lavaux,
K. Markovic,
K. McCarthy,
G. Parimbelli,
F. Passalacqua,
W. J. Percival,
C. Scarlata,
E. Sefusatti,
Y. Wang,
M. Bonici,
F. Oppizzi,
N. Aghanim
, et al. (295 additional authors not shown)
Abstract:
The Euclid survey aims to measure the spectroscopic redshift of emission-line galaxies by identifying the H$\,α$ line in their slitless spectra. This method is sensitive to the signal-to-noise ratio of the line, as noise fluctuations or other strong emission lines can be misidentified as H$\,α$, depending on redshift. These effects lead to catastrophic redshift errors and the inclusion of interlopers in the sample. We forecast the impact of such redshift errors on galaxy clustering measurements. In particular, we study the effect of interloper contamination on the two-point correlation function (2PCF), the growth rate of structures, and the Alcock-Paczynski (AP) parameters. We analyze 1000 synthetic spectroscopic catalogues, the EuclidLargeMocks, designed to match the area and selection function of the Data Release 1 (DR1) sample. We estimate the 2PCF of the contaminated catalogues, isolating contributions from correctly identified galaxies and from interlopers. We explore different models with increasing complexity to describe the measured 2PCF at fixed cosmology. Finally, we perform a cosmological inference and evaluate the systematic error on the inferred $fσ_8$, $α_{\parallel}$ and $α_{\perp}$ values associated with different models. Our results demonstrate that a minimal modelling approach, which only accounts for an attenuation of the clustering signal regardless of the type of contaminants, is sufficient to recover the correct values of $fσ_8$, $α_{\parallel}$, and $α_{\perp}$ at DR1. The accuracy and precision of the estimated AP parameters are largely insensitive to the presence of interlopers. The adoption of a minimal model induces a 1%-3% systematic error on the growth rate of structure estimation, depending on the redshift. However, this error remains smaller than the statistical error expected for the Euclid DR1 analysis.
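The abstract does not give the functional form of its minimal attenuation model. One textbook case that produces a pure attenuation is a contaminant population that is unclustered and uncorrelated with the target galaxies: a sample with interloper fraction $f$ then has its correlation function diluted by $(1-f)^2$. The sketch below applies and inverts that factor; the assumption of unclustered interlopers, the sample values, and the function names are illustrative and not taken from the paper.

```python
def attenuated_2pcf(xi_true, interloper_fraction):
    """Dilute a 2PCF by (1 - f)^2, assuming unclustered, uncorrelated interlopers."""
    if not 0.0 <= interloper_fraction < 1.0:
        raise ValueError("interloper_fraction must lie in [0, 1)")
    purity = 1.0 - interloper_fraction
    return [purity ** 2 * xi for xi in xi_true]


def corrected_2pcf(xi_observed, interloper_fraction):
    """Undo the dilution under the same assumption."""
    if not 0.0 <= interloper_fraction < 1.0:
        raise ValueError("interloper_fraction must lie in [0, 1)")
    purity = 1.0 - interloper_fraction
    return [xi / purity ** 2 for xi in xi_observed]


# Hypothetical correlation-function values in three separation bins.
xi_true = [0.8, 0.2, 0.05]
xi_obs = attenuated_2pcf(xi_true, 0.1)       # each bin scaled by 0.81
xi_back = corrected_2pcf(xi_obs, 0.1)        # recovers xi_true
```

Because the factor is the same at every separation, this kind of contamination rescales the amplitude of the 2PCF without changing its shape, so it is degenerate with other amplitude parameters rather than with the position of features such as the acoustic peak.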
Submitted 7 May, 2025;
originally announced May 2025.
-
TransparentGS: Fast Inverse Rendering of Transparent Objects with Gaussians
Authors:
Letian Huang,
Dongwei Ye,
Jialin Dan,
Chengzhi Tao,
Huiwen Liu,
Kun Zhou,
Bo Ren,
Yuanqi Li,
Yanwen Guo,
Jie Guo
Abstract:
The emergence of neural and Gaussian-based radiance field methods has led to considerable advancements in novel view synthesis and 3D object reconstruction. Nonetheless, specular reflection and refraction continue to pose significant challenges due to the instability and incorrect overfitting of radiance fields to high-frequency light variations. Currently, even 3D Gaussian Splatting (3D-GS), as a powerful and efficient tool, falls short in recovering transparent objects with nearby contents due to the existence of apparent secondary ray effects. To address this issue, we propose TransparentGS, a fast inverse rendering pipeline for transparent objects based on 3D-GS. The main contributions are three-fold. Firstly, an efficient representation of transparent objects, transparent Gaussian primitives, is designed to enable specular refraction through a deferred refraction strategy. Secondly, we leverage Gaussian light field probes (GaussProbe) to encode both ambient light and nearby contents in a unified framework. Thirdly, a depth-based iterative probes query (IterQuery) algorithm is proposed to reduce the parallax errors in our probe-based framework. Experiments demonstrate the speed and accuracy of our approach in recovering transparent objects from complex environments, as well as several applications in computer graphics and vision.
Submitted 1 May, 2025; v1 submitted 25 April, 2025;
originally announced April 2025.
-
Euclid preparation: TBD. Cosmic Dawn Survey: evolution of the galaxy stellar mass function across 0.2<z<6.5 measured over 10 square degrees
Authors:
Euclid Collaboration,
L. Zalesky,
J. R. Weaver,
C. J. R. McPartland,
G. Murphree,
I. Valdes,
C. K. Jespersen,
S. Taamoli,
N. Chartab,
N. Allen,
S. W. J. Barrow,
D. B. Sanders,
S. Toft,
B. Mobasher,
I. Szapudi,
B. Altieri,
A. Amara,
S. Andreon,
N. Auricchio,
C. Baccigalupi,
M. Baldi,
S. Bardelli,
P. Battaglia,
A. Biviano,
D. Bonino
, et al. (282 additional authors not shown)
Abstract:
The Cosmic Dawn Survey Pre-launch (PL) catalogues cover an effective 10.13 deg$^{2}$ area with uniform deep Spitzer/IRAC data ($m\sim25$ mag, 5$σ$), the largest area covered to these depths in the infrared. These data are used to gain new insight into the growth of stellar mass across cosmic history by characterising the evolution of the galaxy stellar mass function (GSMF) through $0.2 < z \leq 6.5$. The total volume (0.62 Gpc$^{3}$) represents a tenfold increase compared to previous works that have explored $z > 3$ and significantly reduces cosmic variance, yielding strong constraints on the abundance of massive galaxies. Results are generally consistent with the literature but now provide firm estimates of number density where only upper limits were previously available. Contrasting the GSMF with the dark matter halo mass function suggests that massive galaxies ($M \gtrsim10^{11}$ M$_{\odot}$) at $z > 3.5$ required integrated star-formation efficiencies of $M/(M_{\rm h}f_{\rm b}) \gtrsim$ 0.25--0.5, in excess of the commonly-held view of "universal peak efficiency" from studies on the stellar-to-halo mass relation (SHMR). Such increased efficiencies imply an evolving peak in the SHMR at $z > 3.5$ which can be maintained if feedback mechanisms from active galactic nuclei and stellar processes are ineffective at early times. In addition, a significant fraction of the most massive quiescent galaxies are observed to be in place already by $z\sim 2.5$--3. The apparent lack of change in their number density by $z\sim 0.2$ is consistent with relatively little mass growth from mergers. Utilising the unique volume, evidence for an environmental dependence of the galaxy stellar mass function is found all the way through $z\sim 3.5$ for the first time, though a more careful characterisation of the density field is ultimately required for confirmation.
Submitted 24 April, 2025;
originally announced April 2025.
-
Launching Insights: A Pilot Study on Leveraging Real-World Observational Data from the Mayo Clinic Platform to Advance Clinical Research
Authors:
Yue Yu,
Xinyue Hu,
Sivaraman Rajaganapathy,
Jingna Feng,
Ahmed Abdelhameed,
Xiaodi Li,
Jianfu Li,
Ken Liu,
Liu Yang,
Nilufer Taner,
Phil Fiero,
Soulmaz Boroumand,
Richard Larsen,
Maneesh Goyal,
Clark Otley,
Nansu Zong,
John Halamka,
Cui Tao
Abstract:
Background: Artificial intelligence (AI) is transforming healthcare, yet translating AI models from theoretical frameworks to real-world clinical applications remains challenging. The Mayo Clinic Platform (MCP) was established to address these challenges by providing a scalable ecosystem that integrates real-world multimodal data from multiple institutions, advanced analytical tools, and secure computing environments to support clinical research and AI development. Methods: In this study, we conducted four research projects leveraging MCP's data infrastructure and analytical capabilities to demonstrate its potential in facilitating real-world evidence generation and AI-driven clinical insights. Utilizing MCP's tools and environment, we facilitated efficient cohort identification, data extraction, and subsequent statistical or AI-powered analyses. Results: The results underscore MCP's role in accelerating translational research by offering de-identified, standardized real-world data and facilitating AI model validation across diverse healthcare settings. Compared to Mayo's internal Electronic Health Record (EHR) data, MCP provides broader accessibility, enhanced data standardization, and multi-institutional integration, making it a valuable resource for both internal and external researchers. Conclusion: Looking ahead, MCP is well-positioned to transform clinical research through its scalable ecosystem, effectively bridging the divide between AI innovation and clinical deployment. Future investigations will build upon this foundation, further exploring MCP's capacity to advance precision medicine and enhance patient outcomes.
Submitted 21 March, 2025;
originally announced April 2025.
-
CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning
Authors:
Yang Yue,
Yulin Wang,
Chenxin Tao,
Pan Liu,
Shiji Song,
Gao Huang
Abstract:
Humans can develop internal world models that encode common sense knowledge, telling them how the world works and predicting the consequences of their actions. This concept has emerged as a promising direction for establishing general-purpose machine-learning models in recent preliminary works, e.g., for visual representation learning. In this paper, we present CheXWorld, the first effort towards a self-supervised world model for radiographic images. Specifically, our work develops a unified framework that simultaneously models three aspects of medical knowledge essential for qualified radiologists, including 1) local anatomical structures describing the fine-grained characteristics of local tissues (e.g., architectures, shapes, and textures); 2) global anatomical layouts describing the global organization of the human body (e.g., layouts of organs and skeletons); and 3) domain variations that encourage CheXWorld to model the transitions across different appearance domains of radiographs (e.g., varying clarity, contrast, and exposure caused by collecting radiographs from different hospitals, devices, or patients). Empirically, we design tailored qualitative and quantitative analyses, revealing that CheXWorld successfully captures these three dimensions of medical knowledge. Furthermore, transfer learning experiments across eight medical image classification and segmentation benchmarks showcase that CheXWorld significantly outperforms existing SSL methods and large-scale medical foundation models. Code & pre-trained models are available at https://github.com/LeapLabTHU/CheXWorld.
Submitted 18 April, 2025;
originally announced April 2025.
-
Euclid preparation. Estimating galaxy physical properties using CatBoost chained regressors with attention
Authors:
Euclid Collaboration,
A. Humphrey,
P. A. C. Cunha,
L. Bisigello,
C. Tortora,
M. Bolzonella,
L. Pozzetti,
M. Baes,
B. R. Granett,
A. Amara,
S. Andreon,
N. Auricchio,
C. Baccigalupi,
M. Baldi,
S. Bardelli,
A. Biviano,
C. Bodendorf,
D. Bonino,
E. Branchini,
M. Brescia,
J. Brinchmann,
S. Camera,
G. Cañas-Herrera,
V. Capobianco,
C. Carbone
, et al. (210 additional authors not shown)
Abstract:
Euclid will image ~14000 deg^2 of the extragalactic sky at visible and NIR wavelengths, providing a dataset of unprecedented size and richness that will facilitate a multitude of studies into the evolution of galaxies. In the vast majority of cases the main source of information will come from broad-band images and data products thereof. Therefore, there is a pressing need to identify or develop scalable yet reliable methodologies to estimate the redshift and physical properties of galaxies using broad-band photometry from Euclid, optionally including ground-based optical photometry. To address this need, we present a novel method to estimate the redshift, stellar mass, star-formation rate, specific star-formation rate, E(B-V), and age of galaxies, using mock Euclid and ground-based photometry. The main novelty of our property-estimation pipeline is its use of the CatBoost implementation of gradient-boosted regression-trees, together with chained regression and an intelligent, automatic optimization of the training data. The pipeline also includes a computationally-efficient method to estimate prediction uncertainties, and, in the absence of ground-truth labels, provides accurate predictions for metrics of model performance up to z~2. We apply our pipeline to several datasets consisting of mock Euclid broad-band photometry and mock ground-based ugriz photometry, to evaluate the performance of our methodology for estimating the redshift and physical properties of galaxies detected in the Euclid Wide Survey. The quality of our photometric redshift and physical property estimates is highly competitive overall, validating our modeling approach. We find that the inclusion of ground-based optical photometry significantly improves the quality of the property estimation, highlighting the importance of combining Euclid data with ancillary ground-based optical data. (Abridged)
Submitted 17 April, 2025;
originally announced April 2025.
-
CodeRAG: Supportive Code Retrieval on Bigraph for Real-World Code Generation
Authors:
Jia Li,
Xianjie Shi,
Kechi Zhang,
Lei Li,
Ge Li,
Zhengwei Tao,
Jia Li,
Fang Liu,
Chongyang Tao,
Zhi Jin
Abstract:
Large language models (LLMs) have shown promising performance in automated code generation, especially excelling in simple tasks such as generating standalone codes. Different from simple tasks, real-world code generation usually depends on a specific programming environment (e.g., code repositories). It contains complex dependencies and domain knowledge, which LLMs need when generating target code snippets. In this paper, we propose CodeRAG, a retrieval-augmented code generation (RAG) framework to comprehensively retrieve supportive codes for real-world code generation. Beginning with the requirement, CodeRAG first constructs a requirement graph for the current repository, and retrieves sub-requirement and similar-requirement nodes of the target requirement on the graph. Meanwhile, it models the repository into a DS-code graph. CodeRAG then maps these relevant requirement nodes into their corresponding code nodes, and treats these code nodes as anchors for LLM reasoning on the DS-code graph. Finally, CodeRAG introduces a code-oriented agentic reasoning process, seamlessly allowing LLMs to reason and comprehensively retrieve the supportive codes that LLMs need for generating correct programs. Experiments show that CodeRAG achieves significant improvements (i.e., increasing 40.90 and 37.79 Pass@1 on GPT-4o and Gemini-Pro on DevEval) compared to no-RAG scenarios. Further tests on reasoning LLMs (i.e., QwQ-32B) confirm CodeRAG's adaptability and efficacy across various types of LLMs. In addition, CodeRAG outperforms commercial programming products such as Copilot and Cursor. We further investigate the performance of our framework on different dependency types, and observe that CodeRAG is superior in generating examples where target codes invoke predefined cross-file code snippets. These results demonstrate CodeRAG's potential in solving real-world repo-level coding challenges.
Submitted 14 April, 2025;
originally announced April 2025.
-
DBaS-Log-MPPI: Efficient and Safe Trajectory Optimization via Barrier States
Authors:
Fanxin Wang,
Haolong Jiang,
Chuyuan Tao,
Wenbin Wan,
Yikun Cheng
Abstract:
Optimizing trajectory costs for nonlinear control systems remains a significant challenge. Model Predictive Control (MPC), particularly sampling-based approaches such as the Model Predictive Path Integral (MPPI) method, has recently demonstrated considerable success by leveraging parallel computing to efficiently evaluate numerous trajectories. However, MPPI often struggles to balance safe navigation in constrained environments with effective exploration in open spaces, leading to infeasibility in cluttered conditions. To address these limitations, we propose DBaS-Log-MPPI, a novel algorithm that integrates Discrete Barrier States (DBaS) to ensure safety while enabling adaptive exploration with enhanced feasibility. Our method is efficiently validated through three simulation missions and one real-world experiment, involving a 2D quadrotor and a ground vehicle navigating through cluttered obstacles. We demonstrate that our algorithm surpasses both Vanilla MPPI and Log-MPPI, achieving higher success rates, lower tracking errors, and a conservative average speed.
Submitted 26 March, 2025;
originally announced April 2025.
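For readers unfamiliar with the baseline named in the abstract above, the following is a minimal sketch of one vanilla MPPI update, not the DBaS-Log-MPPI algorithm itself. The 1D point-mass dynamics, the quadratic cost, and every constant (`DT`, `GOAL`, `sigma`, `lam`, the sample count) are assumptions chosen for illustration; barrier states and the log-normal sampling of Log-MPPI are not modelled.

```python
import numpy as np

DT = 0.1    # integration step (assumed for this toy problem)
GOAL = 1.0  # target position (assumed for this toy problem)

def rollout_cost(controls):
    """Simulate a 1D point mass from rest and sum the squared distance to GOAL."""
    pos, vel, cost = 0.0, 0.0, 0.0
    for u in controls:
        vel += u * DT
        pos += vel * DT
        cost += (pos - GOAL) ** 2
    return cost

def mppi_step(u_nominal, n_samples=256, sigma=0.5, lam=1.0, seed=0):
    """One vanilla MPPI update: perturb, score, exponentially weight, average."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples, len(u_nominal)))
    costs = np.array([rollout_cost(u_nominal + eps) for eps in noise])
    # Subtracting the minimum cost keeps the exponentials numerically stable.
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    # The new control sequence is the nominal one plus the weighted mean perturbation.
    u_new = u_nominal + weights @ noise
    return u_new, weights, costs

u0 = np.zeros(20)
u1, w, c = mppi_step(u0)
```

Because every sampled trajectory is scored independently, the loop over `noise` is the part that MPPI implementations typically evaluate in parallel.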
-
PROPHET: An Inferable Future Forecasting Benchmark with Causal Intervened Likelihood Estimation
Authors:
Zhengwei Tao,
Zhi Jin,
Bincheng Li,
Xiaoying Bai,
Haiyan Zhao,
Chengfeng Dou,
Xiancai Chen,
Jia Li,
Linyu Li,
Chongyang Tao
Abstract:
Predicting future events stands as one of the ultimate aspirations of artificial intelligence. Recent advances in large language model (LLM)-based systems have shown remarkable potential in forecasting future events, thereby garnering significant interest in the research community. Currently, several benchmarks have been established to evaluate the forecasting capabilities by formalizing the event prediction as a retrieval-augmented generation (RAG) and reasoning task. In these benchmarks, each prediction question is answered with relevant retrieved news articles. However, because there is no consideration of whether the questions can be supported by valid or sufficient supporting rationales, some of the questions in these benchmarks may be inherently noninferable. To address this issue, we introduce a new benchmark, PROPHET, which comprises inferable forecasting questions paired with relevant news for retrieval. To ensure the inferability of the benchmark, we propose Causal Intervened Likelihood (CIL), a statistical measure that assesses inferability through causal inference. In constructing this benchmark, we first collected recent trend forecasting questions and then filtered the data using CIL, resulting in an inferable benchmark for event prediction. Through extensive experiments, we first demonstrate the validity of CIL and conduct in-depth investigations into event prediction with the aid of CIL. Subsequently, we evaluate several representative prediction systems on PROPHET, drawing valuable insights for future directions.
Submitted 2 April, 2025;
originally announced April 2025.
-
A large-scale image-text dataset benchmark for farmland segmentation
Authors:
Chao Tao,
Dandan Zhong,
Weiliang Mu,
Zhuofei Du,
Haiyang Wu
Abstract:
The traditional deep learning paradigm that solely relies on labeled data has limitations in representing the spatial relationships between farmland elements and the surrounding environment. It struggles to effectively model the dynamic temporal evolution and spatial heterogeneity of farmland. Language, as a structured knowledge carrier, can explicitly express the spatiotemporal characteristics of farmland, such as its shape, distribution, and surrounding environmental information. Therefore, a language-driven learning paradigm can effectively alleviate the challenges posed by the spatiotemporal heterogeneity of farmland. However, in the field of remote sensing imagery of farmland, there is currently no comprehensive benchmark dataset to support this research direction. To fill this gap, we introduced language-based descriptions of farmland and developed the FarmSeg-VL dataset, the first fine-grained image-text dataset designed for spatiotemporal farmland segmentation. Firstly, this article proposed a semi-automatic annotation method that can accurately assign a caption to each image, ensuring high data quality and semantic richness while improving the efficiency of dataset construction. Secondly, FarmSeg-VL exhibits significant spatiotemporal characteristics. In terms of the temporal dimension, it covers all four seasons. In terms of the spatial dimension, it covers eight typical agricultural regions across China. In addition, in terms of captions, FarmSeg-VL covers rich spatiotemporal characteristics of farmland, including its inherent properties, phenological characteristics, spatial distribution, topographic and geomorphic features, and the distribution of surrounding environments. Finally, we present a performance analysis of VLMs and the deep learning models that rely solely on labels trained on FarmSeg-VL, demonstrating its potential as a standard benchmark for farmland segmentation.
Submitted 29 March, 2025;
originally announced March 2025.
-
Decomposing a factorial into large factors
Authors:
Boris Alexeev,
Evan Conway,
Matthieu Rosenfeld,
Andrew V. Sutherland,
Terence Tao,
Markus Uhr,
Kevin Ventullo
Abstract:
Let $t(N)$ denote the largest number such that $N!$ can be expressed as the product of $N$ integers greater than or equal to $t(N)$. The bound $t(N)/N = 1/e-o(1)$ was apparently established in unpublished work of Erdős, Selfridge, and Straus; but the proof is lost. Here we obtain the more precise asymptotic $$ \frac{t(N)}{N} = \frac{1}{e} - \frac{c_0}{\log N} + O\left( \frac{1}{\log^{1+c} N} \right)$$ for an explicit constant $c_0 = 0.30441901\dots$ and some absolute constant $c>0$, answering a question of Erdős and Graham. For the upper bound, a further lower order term in the asymptotic expansion is also obtained. With numerical assistance, we obtain highly precise computations of $t(N)$ for wide ranges of $N$, establishing several explicit conjectures of Guy and Selfridge on this sequence. For instance, we show that $t(N) \geq N/3$ for $N \geq 43632$, with the threshold shown to be best possible.
Submitted 2 June, 2025; v1 submitted 25 March, 2025;
originally announced March 2025.
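To make the definition of $t(N)$ in the abstract above concrete, here is a brute-force sketch that evaluates it for very small $N$. It is not the authors' method: the helper names `max_factors` and `t_of` are hypothetical, and the approach relies on the observation that a factorization of $N!$ into more than $N$ factors, each at least $t$, can always be merged down to exactly $N$ such factors.

```python
from functools import lru_cache
from math import factorial, isqrt

@lru_cache(maxsize=None)
def max_factors(n, lo):
    """Largest k with n = d_1 * ... * d_k and lo <= d_1 <= ... <= d_k.

    Returns 0 when no such factorization exists. Requires lo >= 2.
    """
    if n < lo:
        return 0
    best = 1  # n taken as a single factor
    for d in range(lo, isqrt(n) + 1):
        if n % d == 0:
            # d <= sqrt(n) implies n // d >= d, so the remainder is itself a valid factor.
            best = max(best, 1 + max_factors(n // d, d))
    return best

def t_of(N):
    """Brute-force t(N); only practical for small N."""
    target = factorial(N)
    bound = 1  # N! is trivially a product of N integers >= 1
    # Feasibility is monotone in the bound, so a linear scan upward suffices.
    while max_factors(target, bound + 1) >= N:
        bound += 1
    return bound

small_values = [t_of(N) for N in range(1, 10)]
```

For example, $9! = 3\cdot3\cdot3\cdot3\cdot4\cdot4\cdot5\cdot7\cdot8$ shows $t(9) \geq 3$, while the prime factor 7 rules out nine factors that are all at least 4. The search grows quickly with $N$ and is nowhere near the ranges treated in the paper.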
-
Euclid preparation LXX. Forecasting detection limits for intracluster light in the Euclid Wide Survey
Authors:
Euclid Collaboration,
C. Bellhouse,
J. B. Golden-Marx,
S. P. Bamford,
N. A. Hatch,
M. Kluge,
A. Ellien,
S. L. Ahad,
P. Dimauro,
F. Durret,
A. H. Gonzalez,
Y. Jimenez-Teja,
M. Montes,
M. Sereno,
E. Slezak,
M. Bolzonella,
G. Castignani,
O. Cucciati,
G. De Lucia,
Z. Ghaffari,
L. Moscardini,
R. Pello,
L. Pozzetti,
T. Saifollahi,
A. S. Borlaff
, et al. (270 additional authors not shown)
Abstract:
The intracluster light (ICL) permeating galaxy clusters is a tracer of the cluster's assembly history, and potentially a tracer of their dark matter structure. In this work we explore the capability of the Euclid Wide Survey to detect ICL using H-band mock images. We simulate clusters across a range of redshifts (0.3-1.8) and halo masses ($10^{13.9}$-$10^{15.0}$ M$_\odot$), using an observationally motivated model of the ICL. We identify a 50-200 kpc circular annulus around the brightest cluster galaxy (BCG) in which the signal-to-noise ratio (S/N) of the ICL is maximised and use the S/N within this aperture as our figure of merit for ICL detection. We compare three state-of-the-art methods for ICL detection, and find that a method that performs simple aperture photometry after high-surface brightness source masking is able to detect ICL with minimal bias for clusters more massive than $10^{14.2}$ M$_\odot$. The S/N of the ICL detection is primarily limited by the redshift of the cluster, driven by cosmological dimming, rather than the mass of the cluster. Assuming the ICL in each cluster contains 15% of the stellar light, we forecast that Euclid will be able to measure the presence of ICL in up to $\sim80000$ clusters of $>10^{14.2}$ M$_\odot$ between $z=0.3$ and 1.5 with a S/N$>3$. Half of these clusters will reside below $z=0.75$ and the majority of those below $z=0.6$ will be detected with a S/N $>20$. A few thousand clusters at $1.3<z<1.5$ will have ICL detectable with a S/N greater than 3. The surface brightness profile of the ICL model is strongly dependent on both the mass of the cluster and the redshift at which it is observed so the outer ICL is best observed in the most massive clusters of $>10^{14.7}$ M$_\odot$. Euclid will detect the ICL at more than 500 kpc distance from the BCG, up to $z=0.7$, in several hundred of these massive clusters over its large survey volume.
Submitted 21 March, 2025;
originally announced March 2025.
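The abstract above attributes the redshift limit on ICL detection to cosmological dimming. As a rough illustration only, the sketch below evaluates the standard bolometric $(1+z)^4$ surface-brightness dimming in magnitudes. The function name is hypothetical, and band-dependent effects such as K-corrections, which a real H-band forecast would have to include, are ignored.

```python
from math import log10

def sb_dimming_mag(z):
    """Bolometric surface-brightness dimming in mag/arcsec^2 at redshift z.

    Surface brightness scales as (1 + z)**-4, i.e. 2.5 * log10((1 + z)**4) magnitudes.
    """
    return 10.0 * log10(1.0 + z)

# Redshifts taken from the range quoted in the abstract.
for z in (0.3, 0.75, 1.5):
    print(f"z = {z}: {sb_dimming_mag(z):.2f} mag/arcsec^2 fainter")
```

Under this approximation, a cluster at $z=1.5$ is dimmed by roughly 4 mag arcsec$^{-2}$, against roughly 1 mag arcsec$^{-2}$ at $z=0.3$.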
-
Euclid preparation. Spatially resolved stellar populations of local galaxies with Euclid: a proof of concept using synthetic images with the TNG50 simulation
Authors:
Euclid Collaboration,
Abdurro'uf,
C. Tortora,
M. Baes,
A. Nersesian,
I. Kovačić,
M. Bolzonella,
A. Lançon,
L. Bisigello,
F. Annibali,
M. N. Bremer,
D. Carollo,
C. J. Conselice,
A. Enia,
A. M. N. Ferguson,
A. Ferré-Mateu,
L. K. Hunt,
E. Iodice,
J. H. Knapen,
A. Iovino,
F. R. Marleau,
R. F. Peletier,
R. Ragusa,
M. Rejkuba,
A. S. G. Robotham
, et al. (264 additional authors not shown)
Abstract:
The European Space Agency's Euclid mission will observe approximately 14,000 $\rm{deg}^{2}$ of the extragalactic sky and deliver high-quality imaging for many galaxies. The depth and high spatial resolution of the data will enable a detailed analysis of stellar population properties of local galaxies. In this study, we test our pipeline for spatially resolved SED fitting using synthetic images of Euclid, LSST, and GALEX generated from the TNG50 simulation. We apply our pipeline to 25 local simulated galaxies to recover their resolved stellar population properties. We produce 3 types of data cubes: GALEX + LSST + Euclid, LSST + Euclid, and Euclid-only. We perform the SED fitting tests with two SPS models in a Bayesian framework. Because the age, metallicity, and dust attenuation estimates are biased when applying only classical formulations of flat priors, we examine the effects of additional priors in the forms of mass-age-$Z$ relations, constructed using a combination of empirical and simulated data. Stellar-mass surface densities can be recovered well using any of the 3 data cubes, regardless of the SPS model and prior variations. The new priors then significantly improve the measurements of mass-weighted age and $Z$ compared to results obtained without priors, but they may play an excessive role compared to the data in determining the outcome when no UV data is available. The spatially resolved SED fitting method is powerful for mapping the stellar populations of galaxies with the current abundance of high-quality imaging data. Our study re-emphasizes the gain added by including multiwavelength data from ancillary surveys and the roles of priors in Bayesian SED fitting. With the Euclid data alone, we will be able to generate complete and deep stellar mass maps of galaxies in the local Universe, thus exploiting the telescope's wide field, NIR sensitivity, and high spatial resolution.
Submitted 19 March, 2025;
originally announced March 2025.
-
Euclid: Quick Data Release (Q1) -- Photometric studies of known transients
Authors:
C. Duffy,
E. Cappellaro,
M. T. Botticella,
I. M. Hook,
F. Poidevin,
T. J. Moriya,
A. A. Chrimes,
V. Petrecca,
K. Paterson,
A. Goobar,
L. Galbany,
R. Kotak,
C. Gall,
C. M. Gutierrez,
C. Tao,
L. Izzo,
N. Aghanim,
B. Altieri,
A. Amara,
S. Andreon,
N. Auricchio,
C. Baccigalupi,
M. Baldi,
A. Balestra,
S. Bardelli
, et al. (152 additional authors not shown)
Abstract:
We report on serendipitous Euclid observations of previously known transients, using the Euclid Q1 data release. By cross-matching with the Transient Name Server (TNS) we identify 164 transients that coincide with the data release. Although the Euclid Q1 release only includes single-epoch data, we are able to make Euclid photometric measurements at the location of 161 of these transients. Euclid obtained deep photometric measurements or upper limits of these transients in the $I_E$, $Y_E$, $J_E$, and $H_E$ bands at various phases of the transient light-curves, including before, during, and after the observations of ground-based transient surveys. Approximately 70% of known transients reported in the six months before the Euclid observation date and with discovery magnitude brighter than 24 were detected in Euclid $I_E$ images. Our observations include one of the earliest near-infrared detections of a Type Ia supernova (SN 2024pvw) 15 days prior to its peak brightness, and the late-phase (435.9 days post peak) observations of the enigmatic core-collapse SN 2023aew. Euclid deep photometry provides valuable information on the nature of these transients such as their progenitor systems and power sources, with late time observations being a uniquely powerful contribution. In addition, Euclid is able to detect the host galaxies of some transients that were previously classed as hostless. The Q1 data demonstrate the power of the Euclid data even with only single-epoch observations available, as will be the case for much larger areas of sky in the Euclid Wide Survey.
Submitted 19 March, 2025;
originally announced March 2025.
-
Euclid Quick Data Release (Q1). The first catalogue of strong-lensing galaxy clusters
Authors:
Euclid Collaboration,
P. Bergamini,
M. Meneghetti,
A. Acebron,
B. Clément,
M. Bolzonella,
C. Grillo,
P. Rosati,
D. Abriola,
J. A. Acevedo Barroso,
G. Angora,
L. Bazzanini,
R. Cabanac,
B. C. Nagam,
A. R. Cooray,
G. Despali,
G. Di Rosa,
J. M. Diego,
M. Fogliardi,
A. Galan,
R. Gavazzi,
G. Granata,
N. B. Hogg,
K. Jahnke,
L. Leuzzi
, et al. (353 additional authors not shown)
Abstract:
We present the first catalogue of strong lensing galaxy clusters identified in the Euclid Quick Release 1 observations (covering $63.1\,\mathrm{deg^2}$). This catalogue is the result of the visual inspection of 1260 cluster fields. Each galaxy cluster was ranked with a probability, $\mathcal{P}_{\mathrm{lens}}$, based on the number and plausibility of the identified strong lensing features. Specifically, we identified 83 gravitational lenses with $\mathcal{P}_{\mathrm{lens}}>0.5$, of which 14 have $\mathcal{P}_{\mathrm{lens}}=1$ and clearly exhibit secure strong lensing features, such as giant tangential and radial arcs, and multiple images. Considering the measured number density of lensing galaxy clusters, approximately $0.3\,\mathrm{deg}^{-2}$ for $\mathcal{P}_{\mathrm{lens}}>0.9$, we predict that Euclid will likely see more than 4500 strong lensing clusters over the course of the mission. Notably, only three of the identified cluster-scale lenses had been previously observed from space. Thus, Euclid has provided the first high-resolution imaging for the remaining $80$ galaxy cluster lenses, including those with the highest probability. The identified strong lensing features will be used for training deep-learning models for identifying gravitational arcs and multiple images automatically in Euclid observations. This study confirms the huge potential of Euclid for finding new strong lensing clusters, enabling exciting new discoveries on the nature of dark matter and dark energy and the study of the high-redshift Universe.
Submitted 19 March, 2025;
originally announced March 2025.
-
Euclid Quick Data Release (Q1). LEMON -- Lens Modelling with Neural networks. Automated and fast modelling of Euclid gravitational lenses with a singular isothermal ellipsoid mass profile
Authors:
Euclid Collaboration,
V. Busillo,
C. Tortora,
R. B. Metcalf,
J. W. Nightingale,
M. Meneghetti,
F. Gentile,
R. Gavazzi,
F. Zhong,
R. Li,
B. Clément,
G. Covone,
N. R. Napolitano,
F. Courbin,
M. Walmsley,
E. Jullo,
J. Pearson,
D. Scott,
A. M. C. Le Brun,
L. Leuzzi,
N. Aghanim,
B. Altieri,
A. Amara,
S. Andreon,
H. Aussel
, et al. (290 additional authors not shown)
Abstract:
The Euclid mission aims to survey around 14000 deg$^{2}$ of extragalactic sky, providing around $10^{5}$ gravitational lens images. Modelling of gravitational lenses is fundamental to estimate the total mass of the lens galaxy, along with its dark matter content. Traditional modelling of gravitational lenses is computationally intensive and requires manual input. In this paper, we use a Bayesian neural network, LEns MOdelling with Neural networks (LEMON), for modelling Euclid gravitational lenses with a singular isothermal ellipsoid mass profile. Our method estimates key lens mass profile parameters, such as the Einstein radius, while also predicting the light parameters of foreground galaxies and their uncertainties. We validate LEMON's performance on mock Euclid data sets, real Euclidised lenses observed with Hubble Space Telescope (hereafter HST), and real Euclid lenses found in the Perseus ERO field, demonstrating the ability of LEMON to predict parameters of both simulated and real lenses. Results show promising accuracy and reliability in predicting the Einstein radius, axis ratio, position angle, effective radius, Sérsic index, and lens magnitude for simulated lens galaxies. The application to real data, including the latest Quick Release 1 strong lens candidates, provides encouraging results, particularly for the Einstein radius. We also verified that LEMON has the potential to accelerate traditional modelling methods, by giving the classical optimiser the LEMON predictions as starting points, resulting in a speed-up of up to 26 times the original time needed to model a sample of gravitational lenses, a result that would be impossible with randomly initialised guesses. This work represents a significant step towards efficient, automated gravitational lens modelling, which is crucial for handling the large data volumes expected from Euclid.
Submitted 19 March, 2025;
originally announced March 2025.
-
Euclid Quick Data Release (Q1). Extending the quest for little red dots to z<4
Authors:
Euclid Collaboration,
L. Bisigello,
G. Rodighiero,
S. Fotopoulou,
F. Ricci,
K. Jahnke,
A. Feltre,
V. Allevato,
F. Shankar,
P. Cassata,
E. Dalla Bontà,
G. Gandolfi,
G. Girardi,
M. Giulietti,
A. Grazian,
C. C. Lovell,
R. Maiolino,
T. Matamoro Zatarain,
M. Mezcua,
I. Prandoni,
D. Roberts,
W. Roster,
M. Salvato,
M. Siudek,
F. Tarsitano
, et al. (326 additional authors not shown)
Abstract:
Recent James Webb Space Telescope (JWST) observations have revealed a population of sources with a compact morphology and a `v-shaped' continuum, namely blue at rest-frame $λ<4000$ Å and red at longer wavelengths. The nature of these sources, called `little red dots' (LRDs), is still debated, since it is unclear if they host active galactic nuclei (AGN) and their number seems to drastically drop at z<4. We utilise the 63 deg$^2$ covered by the Euclid Quick Data Release (Q1) to extend the search for LRDs to brighter magnitudes and to lower z than has been possible with JWST, to gain a broader view of the evolution of this peculiar galaxy population. The selection is done by fitting the available photometric data (Euclid, Spitzer/IRAC, and ground-based griz data) with two power laws, to retrieve the rest-frame optical and UV slopes consistently over a large redshift range (i.e., z<7.6). We exclude extended objects and possible line emitters, and perform a visual inspection to remove imaging artefacts. The final selection includes 3341 LRD candidates from z=0.33 to z=3.6, with 29 detected in IRAC. Their rest-frame UV luminosity function, in contrast with previous JWST studies, shows that the number density of LRD candidates increases from high-z down to z=1.5-2.5 and decreases at even lower z. Less evolution is apparent when focusing on the subsample of more robust LRD candidates having IRAC detections, which is affected by low statistics and limited by the IRAC resolution. The comparison with previous quasar UV luminosity functions shows that LRDs are not the dominant AGN population at z<4. Follow-up studies of these LRD candidates are key to confirm their nature, probe their physical properties and check for their compatibility with JWST sources, since the different spatial resolution and wavelength coverage of Euclid and JWST could select different samples of compact sources.
Submitted 19 March, 2025;
originally announced March 2025.
-
Euclid Quick Data Release (Q1). An investigation of optically faint, red objects in the Euclid Deep Fields
Authors:
Euclid Collaboration,
G. Girardi,
G. Rodighiero,
L. Bisigello,
A. Enia,
A. Grazian,
E. Dalla Bontà,
E. Daddi,
S. Serjeant,
G. Gandolfi,
C. C. Lovell,
K. I. Caputi,
A. Bianchetti,
A. Vietri,
N. Aghanim,
B. Altieri,
A. Amara,
S. Andreon,
N. Auricchio,
H. Aussel,
C. Baccigalupi,
M. Baldi,
A. Balestra,
S. Bardelli,
P. Battaglia
, et al. (304 additional authors not shown)
Abstract:
Our understanding of cosmic star-formation at $z>3$ used to largely rely on rest-frame UV observations. However, these observations overlook dusty and massive sources, resulting in an incomplete census of early star-forming galaxies. Recently, infrared data from Spitzer and the James Webb Space Telescope (JWST) have revealed a hidden population at $z\sim$3-6 with extreme red colours. Taking advantage of the overlap between imaging in the Euclid Deep Fields (EDFs), covering $\sim$ 60 deg$^2$, and ancillary Spitzer observations, we identified 27000 extremely red objects with $H_E-{\rm IRAC}2>2.25$ (dubbed HIEROs) down to a $10σ$ completeness magnitude limit of IRAC2 $=$ 22.5 AB. After a visual inspection to discard artefacts and objects with troubling photometry, we ended up with a final sample of 3900 candidates. We retrieved the physical parameter estimates for these objects from the SED-fitting tool CIGALE. Our results confirm that HIERO galaxies may populate the high-mass end of the stellar mass function at $z>3$, with some reaching extreme stellar masses ($M_*>10^{11}M_\odot$) and exhibiting high dust attenuation ($A_V>3$). However, we consider stellar mass estimates unreliable for $z>3.5$, favouring a lower-z solution. The challenges faced by SED-fitting tools in characterising these objects highlight the need for further studies, incorporating shorter-wavelength and spectroscopic data. Euclid spectra will help resolve degeneracies and better constrain the physical properties of the brightest galaxies. Given the extreme nature of this population, characterising these sources is crucial for understanding galaxy evolution. This work demonstrates Euclid's potential to provide statistical samples of rare, massive, dust-obscured galaxies at $z>3$, which will be prime targets for JWST, ALMA, and ELT.
Submitted 19 March, 2025;
originally announced March 2025.
-
Euclid Quick Data Release (Q1): From spectrograms to spectra: the SIR spectroscopic Processing Function
Authors:
Euclid Collaboration,
Y. Copin,
M. Fumana,
C. Mancini,
P. N. Appleton,
R. Chary,
S. Conseil,
A. L. Faisst,
S. Hemmati,
D. C. Masters,
C. Scarlata,
M. Scodeggio,
A. Alavi,
A. Carle,
P. Casenove,
T. Contini,
I. Das,
W. Gillard,
G. Herzog,
J. Jacobson,
V. Le Brun,
D. Maino,
G. Setnikar,
N. R. Stickley,
D. Tavagnacco
, et al. (326 additional authors not shown)
Abstract:
The Euclid space mission aims to investigate the nature of dark energy and dark matter by mapping the large-scale structure of the Universe. A key component of Euclid's observational strategy is slitless spectroscopy, conducted using the Near Infrared Spectrometer and Photometer (NISP). This technique enables the acquisition of large-scale spectroscopic data without the need for targeted apertures, allowing precise redshift measurements for millions of galaxies. These data are essential for Euclid's core science objectives, including the study of cosmic acceleration and the evolution of galaxy clustering, as well as enabling many non-cosmological investigations. This study presents the SIR processing function (PF), which is responsible for processing slitless spectroscopic data. The objective is to generate science-grade fully-calibrated one-dimensional spectra, ensuring high-quality spectroscopic data. The processing function relies on a source catalogue generated from photometric data, effectively corrects detector effects, subtracts cross-contaminations, minimizes self-contamination, calibrates wavelength and flux, and produces reliable spectra for later scientific use. The first Quick Data Release (Q1) of Euclid's spectroscopic data provides approximately three million validated spectra for sources observed in the red-grism mode from a selected portion of the Euclid Wide Survey. We find that wavelength accuracy and measured resolving power are within requirements, thanks to the excellent optical quality of the instrument. The SIR PF represents a significant step in processing slitless spectroscopic data for the Euclid mission. As the survey progresses, continued refinements and additional features will enhance its capabilities, supporting high-precision cosmological and astrophysical measurements.
Submitted 19 March, 2025;
originally announced March 2025.
-
Euclid Quick Data Release (Q1). NIR processing and data products
Authors:
Euclid Collaboration,
G. Polenta,
M. Frailis,
A. Alavi,
P. N. Appleton,
P. Awad,
A. Bonchi,
R. Bouwens,
L. Bramante,
D. Busonero,
G. Calderone,
F. Cogato,
S. Conseil,
M. Correnti,
R. da Silva,
I. Das,
F. Faustini,
Y. Fu,
T. Gasparetto,
W. Gillard,
A. Grazian,
S. Hemmati,
J. Jacobson,
K. Jahnke,
B. Kubik
, et al. (345 additional authors not shown)
Abstract:
This paper describes the near-infrared processing function (NIR PF) that processes near-infrared images from the Near-Infrared Spectrometer and Photometer (NISP) instrument onboard the Euclid satellite. NIR PF consists of three main components: (i) a common pre-processing stage for both photometric (NIR) and spectroscopic (SIR) data to remove instrumental effects; (ii) astrometric and photometric calibration of NIR data, along with catalogue extraction; and (iii) resampling and stacking. The necessary calibration products are generated using dedicated pipelines that process observations from both the early performance verification (PV) phase in 2023 and the nominal survey operations. After outlining the pipeline's structure and algorithms, we demonstrate its application to Euclid Q1 images. For Q1, we achieve an astrometric accuracy of 9-15 mas, a relative photometric accuracy of 5 mmag, and an absolute flux calibration limited by the 1% uncertainty of the Hubble Space Telescope (HST) CALSPEC database. We characterise the point-spread function (PSF), which we find to be very stable across the focal plane, and we discuss current limitations of NIR PF that will be addressed in future data releases.
Submitted 19 March, 2025;
originally announced March 2025.
-
Euclid Quick Data Release (Q1): VIS processing and data products
Authors:
Euclid Collaboration,
H. J. McCracken,
K. Benson,
C. Dolding,
T. Flanet,
C. Grenet,
O. Herent,
P. Hudelot,
C. Laigle,
G. Leroy,
P. Liebing,
R. Massey,
S. Mottet,
R. Nakajima,
H. N. Nguyen-Kim,
J. W. Nightingale,
J. Skottfelt,
L. C. Smith,
F. Soldano,
E. Vilenius,
M. Wander,
M. von Wietersheim-Kramsta,
M. Akhlaghi,
H. Aussel,
S. Awan
, et al. (355 additional authors not shown)
Abstract:
This paper describes the VIS Processing Function (VIS PF) of the Euclid ground segment pipeline, which processes and calibrates raw data from the VIS camera. We present the algorithms used in each processing element, along with a description of the on-orbit performance of VIS PF, based on Performance Verification (PV) and Q1 data. We demonstrate that the principal performance metrics (image quality, astrometric accuracy, photometric calibration) are within pre-launch specifications. The image-to-image photometric scatter is less than $0.8\%$, and absolute astrometric accuracy compared to Gaia is $5$ mas. Image quality is stable over all Q1 images with a full width at half maximum (FWHM) of $0.\!^{\prime\prime}16$. The stacked images (combining four nominal and two short exposures) reach $I_\mathrm{E} = 25.6$ ($10σ$, measured as the variance of $1.\!^{\prime\prime}3$ diameter apertures). We also describe quality control metrics provided with each image, and an appendix provides a detailed description of the provided data products. The excellent quality of these images demonstrates the immense potential of Euclid VIS data for weak lensing. VIS data, covering most of the extragalactic sky, will provide a lasting high-resolution atlas of the Universe.
Submitted 19 March, 2025;
originally announced March 2025.
-
Euclid Quick Data Release (Q1) -- Data release overview
Authors:
Euclid Collaboration,
H. Aussel,
I. Tereno,
M. Schirmer,
G. Alguero,
B. Altieri,
E. Balbinot,
T. de Boer,
P. Casenove,
P. Corcho-Caballero,
H. Furusawa,
J. Furusawa,
M. J. Hudson,
K. Jahnke,
G. Libet,
J. Macias-Perez,
N. Masoumzadeh,
J. J. Mohr,
J. Odier,
D. Scott,
T. Vassallo,
G. Verdoes Kleijn,
A. Zacchei,
N. Aghanim,
A. Amara
, et al. (385 additional authors not shown)
Abstract:
The first Euclid Quick Data Release, Q1, comprises 63.1 sq deg of the Euclid Deep Fields (EDFs) to nominal wide-survey depth. It encompasses visible and near-infrared space-based imaging and spectroscopic data, ground-based photometry in the u, g, r, i and z bands, as well as corresponding masks. Overall, Q1 contains about 30 million objects in three areas near the ecliptic poles around the EDF-North and EDF-South, as well as the EDF-Fornax field in the constellation of the same name. The purpose of this data release -- and its associated technical papers -- is twofold. First, it is meant to inform the community of the enormous potential of the Euclid survey data, to describe what is contained in these data, and to help prepare expectations for the forthcoming first major data release DR1. Second, it enables a wide range of initial scientific projects with wide-survey Euclid data, ranging from the early Universe to the Solar System. The Q1 data were processed with early versions of the processing pipelines, which already demonstrate good performance, with numerous improvements in implementation compared to pre-launch development. In this paper, we describe the sky areas released in Q1, the observations, a top-level view of the data processing of Euclid and associated external data, the Q1 photometric masks, and how to access the data. We also give an overview of initial scientific results obtained using the Q1 data set by Euclid Consortium scientists, and conclude with important caveats when using the data. As a complementary product, Q1 also contains observations of a star-forming area in Lynds' Dark Nebula 1641 in the Orion A Cloud, observed for technical purposes during Euclid's performance-verification phase. This is a unique target, of a type not commonly found in Euclid's nominal sky survey.
Submitted 19 March, 2025;
originally announced March 2025.
-
Conformal Prediction and MLLM aided Uncertainty Quantification in Scene Graph Generation
Authors:
Sayak Nag,
Udita Ghosh,
Calvin-Khang Ta,
Sarosij Bose,
Jiachen Li,
Amit K Roy Chowdhury
Abstract:
Scene Graph Generation (SGG) aims to represent visual scenes by identifying objects and their pairwise relationships, providing a structured understanding of image content. However, inherent challenges like long-tailed class distributions and prediction variability necessitate uncertainty quantification in SGG for its practical viability. In this paper, we introduce a novel Conformal Prediction (CP) based framework, adaptive to any existing SGG method, for quantifying their predictive uncertainty by constructing well-calibrated prediction sets over their generated scene graphs. These scene graph prediction sets are designed to achieve statistically rigorous coverage guarantees. Additionally, to ensure these prediction sets contain the most practically interpretable scene graphs, we design an effective MLLM-based post-processing strategy for selecting the most visually and semantically plausible scene graphs within these prediction sets. We show that our proposed approach can produce diverse possible scene graphs from an image, assess the reliability of SGG methods, and improve overall SGG performance.
Submitted 10 April, 2025; v1 submitted 18 March, 2025;
originally announced March 2025.
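The coverage guarantee mentioned in the abstract above is the defining property of conformal prediction. As a rough illustration only, the following sketch shows generic split conformal prediction for a single label, not the scene-graph procedure of the paper. The calibration probabilities, the relation labels, and the function names are all hypothetical.

```python
import math

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    # Use the ceil((n + 1) * (1 - alpha))-th smallest score.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points: only the trivial (full) set is valid.
        return float("inf")
    return sorted(scores)[rank - 1]

def prediction_set(probs, qhat):
    """Keep every label whose nonconformity score 1 - p is at most qhat."""
    return [label for label, p in probs.items() if 1.0 - p <= qhat]

# Hypothetical calibration data: probability the model gave to the true label.
cal_true_probs = [0.9, 0.8, 0.4, 0.6, 0.95, 0.7, 0.85, 0.5, 0.3]
cal_scores = [1.0 - p for p in cal_true_probs]
qhat = conformal_quantile(cal_scores, alpha=0.2)

# Hypothetical predicate probabilities for one object pair in a test image.
test_probs = {"on": 0.55, "near": 0.42, "holding": 0.03}
print(prediction_set(test_probs, qhat))
```

Under the usual exchangeability assumption, sets built this way contain the true label with probability at least 1 - alpha; a less confident model simply yields larger sets.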
-
LLM-Match: An Open-Sourced Patient Matching Model Based on Large Language Models and Retrieval-Augmented Generation
Authors:
Xiaodi Li,
Shaika Chowdhury,
Chung Il Wi,
Maria Vassilaki,
Xiaoke Liu,
Terence T Sio,
Owen Garrick,
Young J Juhn,
James R Cerhan,
Cui Tao,
Nansu Zong
Abstract:
Patient matching is the process of linking patients to appropriate clinical trials by accurately identifying and matching their medical records with trial eligibility criteria. We propose LLM-Match, a novel framework for patient matching leveraging fine-tuned open-source large language models. Our approach consists of four key components. First, a retrieval-augmented generation (RAG) module extracts relevant patient context from a vast pool of electronic health records (EHRs). Second, a prompt generation module constructs input prompts by integrating trial eligibility criteria (both inclusion and exclusion criteria), patient context, and system instructions. Third, a fine-tuning module with a classification head optimizes the model parameters using structured prompts and ground-truth labels. Fourth, an evaluation module assesses the fine-tuned model's performance on the testing datasets. We evaluated LLM-Match on four open datasets - n2c2, SIGIR, TREC 2021, and TREC 2022 - using open-source models, comparing it against TrialGPT, Zero-Shot, and GPT-4-based closed models. LLM-Match outperformed all baselines.
Submitted 24 March, 2025; v1 submitted 17 March, 2025;
originally announced March 2025.
-
Euclid preparation. BAO analysis of photometric galaxy clustering in configuration space
Authors:
Euclid Collaboration,
V. Duret,
S. Escoffier,
W. Gillard,
I. Tutusaus,
S. Camera,
N. Tessore,
F. J. Castander,
N. Aghanim,
A. Amara,
L. Amendola,
S. Andreon,
N. Auricchio,
C. Baccigalupi,
M. Baldi,
S. Bardelli,
P. Battaglia,
A. Biviano,
D. Bonino,
E. Branchini,
M. Brescia,
J. Brinchmann,
A. Caillat,
G. Cañas-Herrera,
V. Capobianco
, et al. (264 additional authors not shown)
Abstract:
With about 1.5 billion galaxies expected to be observed, the very large number of objects in the Euclid photometric survey will allow for precise studies of galaxy clustering from a single survey, over a large range of redshifts $0.2 < z < 2.5$. In this work, we use photometric redshifts to extract the baryon acoustic oscillation signal (BAO) from the Flagship galaxy mock catalogue with a tomographic approach to constrain the evolution of the Universe and infer its cosmological parameters. We measure the two-point angular correlation function in 13 redshift bins. A template-fitting approach is applied to the measurement to extract the shift of the BAO peak through the transverse Alcock--Paczynski parameter $α$. A joint analysis of all redshift bins is performed to constrain $α$ at the effective redshift $z_\mathrm{eff}=0.77$ with MCMC and profile likelihood techniques. We also extract one $α_i$ parameter per redshift bin to quantify its evolution as a function of time. From these 13 $α_i$, which are directly proportional to the ratio $D_\mathrm{A}/\,r_\mathrm{s,\,drag}$, we constrain $h$, $Ω_\mathrm{b}$, and $Ω_\mathrm{cdm}$. From the joint analysis, we constrain $α(z_\mathrm{eff}=0.77)=1.0011^{+0.0078}_{-0.0079}$, which represents a three-fold improvement over current constraints from the Dark Energy Survey. As expected, the constraining power in the analysis of each redshift bin is lower, with an uncertainty ranging from $\pm\,0.13$ to $\pm\,0.024$. From these results, we constrain $h$ at 0.45 %, $Ω_\mathrm{b}$ at 0.91 %, and $Ω_\mathrm{cdm}$ at 7.7 %. We quantify the influence of analysis choices like the template, scale cuts, redshift bins, and systematic effects like redshift-space distortions over our constraints both at the level of the extracted $α_i$ parameters and at the level of cosmological inference.
Submitted 17 March, 2025; v1 submitted 14 March, 2025;
originally announced March 2025.
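The abstract above states that each $α_i$ is directly proportional to $D_\mathrm{A}/\,r_\mathrm{s,\,drag}$. One common convention, assumed here rather than taken from the paper, normalises that ratio by its value in the fiducial cosmology used to build the template (the superscript "fid" is our notation):

```latex
\alpha(z) \equiv
  \frac{D_\mathrm{A}(z)\,/\,r_\mathrm{s,\,drag}}
       {\left[\,D_\mathrm{A}(z)\,/\,r_\mathrm{s,\,drag}\,\right]^{\mathrm{fid}}}
```

With this normalisation, $α=1$ means the BAO peak sits exactly where the template places it, so the quoted joint constraint $α(z_\mathrm{eff}=0.77)=1.0011^{+0.0078}_{-0.0079}$ would be consistent with the fiducial cosmology.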
-
MEET: A Million-Scale Dataset for Fine-Grained Geospatial Scene Classification with Zoom-Free Remote Sensing Imagery
Authors:
Yansheng Li,
Yuning Wu,
Gong Cheng,
Chao Tao,
Bo Dang,
Yu Wang,
Jiahao Zhang,
Chuge Zhang,
Yiting Liu,
Xu Tang,
Jiayi Ma,
Yongjun Zhang
Abstract:
Accurate fine-grained geospatial scene classification using remote sensing imagery is essential for a wide range of applications. However, existing approaches often rely on manually zooming remote sensing images at different scales to create typical scene samples. This approach fails to adequately support the fixed-resolution image interpretation requirements in real-world scenarios. To address this limitation, we introduce the Million-scale finE-grained geospatial scEne classification dataseT (MEET), which contains over 1.03 million zoom-free remote sensing scene samples, manually annotated into 80 fine-grained categories. In MEET, each scene sample follows a scene-in-scene layout, where the central scene serves as the reference, and auxiliary scenes provide crucial spatial context for fine-grained classification. Moreover, to tackle the emerging challenge of scene-in-scene classification, we present the Context-Aware Transformer (CAT), a model specifically designed for this task. CAT adaptively fuses spatial context to accurately classify the scene samples by learning attentional features that capture the relationships between the center and auxiliary scenes. Based on MEET, we establish a comprehensive benchmark for fine-grained geospatial scene classification, evaluating CAT against 11 competitive baselines. The results demonstrate that CAT significantly outperforms these baselines, achieving a 1.88% higher balanced accuracy (BA) with the Swin-Large backbone, and a notable 7.87% improvement with the Swin-Huge backbone. Further experiments validate the effectiveness of each module in CAT and show the practical applicability of CAT in urban functional zone mapping. The source code and dataset will be publicly available at https://jerrywyn.github.io/project/MEET.html.
Submitted 14 March, 2025;
originally announced March 2025.
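The MEET abstract above reports balanced accuracy (BA). Assuming the usual definition, the mean of per-class recall, a minimal sketch looks as follows. The class names and predictions are invented for illustration and are not taken from the dataset.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so rare classes weigh as much as common ones."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)

# Hypothetical imbalanced example: 8 "farm" scenes and 2 "port" scenes.
y_true = ["farm"] * 8 + ["port"] * 2
y_pred = ["farm"] * 8 + ["farm", "port"]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain, balanced_accuracy(y_true, y_pred))
```

In this toy case plain accuracy is dominated by the majority class, while the balanced score drops because half of the rare class is missed. That sensitivity is presumably why a long-tailed, 80-category benchmark would prefer it.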
-
Enhance Exploration in Safe Reinforcement Learning with Contrastive Representation Learning
Authors:
Duc Kien Doan,
Bang Giang Le,
Viet Cuong Ta
Abstract:
In safe reinforcement learning, an agent needs to balance exploration against safety constraints. Following this paradigm, domain transfer approaches learn a prior Q-function from related environments to prevent unsafe actions. However, because of the large number of false positives, some safe actions are never executed, leading to inadequate exploration in sparse-reward environments. In this work, we aim to learn an efficient state representation to balance exploration and safety-preferring actions in a sparse-reward environment. First, the image input is mapped to a latent representation by an auto-encoder. A further contrastive learning objective is employed to distinguish safe and unsafe states. In the learning phase, the latent distance is used to construct an additional safety check, which allows the agent to bias the exploration if it visits an unsafe state. To verify the effectiveness of our method, experiments are carried out in three navigation-based MiniGrid environments. The results show that our method can explore the environment better while maintaining a good balance between safety and efficiency.
Submitted 13 March, 2025;
originally announced March 2025.
-
G$^{2}$SF-MIAD: Geometry-Guided Score Fusion for Multimodal Industrial Anomaly Detection
Authors:
Chengyu Tao,
Xuanming Cao,
Juan Du
Abstract:
Industrial quality inspection plays a critical role in modern manufacturing by identifying defective products during production. While single-modality approaches using either 3D point clouds or 2D RGB images suffer from information incompleteness, multimodal anomaly detection offers promise through the complementary fusion of cross-modal data. However, existing methods face challenges in effectively integrating unimodal results and improving discriminative power. To address these limitations, we first reinterpret memory bank-based anomaly scores in single modalities as isotropic Euclidean distances in local feature spaces. Dynamically evolving from Euclidean metrics, we propose a novel Geometry-Guided Score Fusion (G$^{2}$SF) framework that progressively learns an anisotropic local distance metric as a unified score for the fusion task. Through a geometric encoding operator, a novel Local Scale Prediction Network (LSPN) is proposed to predict direction-aware scaling factors that characterize first-order local feature distributions, thereby enhancing discrimination between normal and anomalous patterns. Additionally, we develop specialized loss functions and a score aggregation strategy from geometric priors to ensure both metric generalization and efficacy. Comprehensive evaluations on the MVTec-3D AD dataset demonstrate the state-of-the-art detection performance of our method, with a low false positive rate and better recall, which is essential in industrial applications, and detailed ablation analysis validates each component's contribution.
Submitted 13 March, 2025;
originally announced March 2025.
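The G$^{2}$SF abstract above contrasts isotropic Euclidean memory-bank scores with an anisotropic local metric. The toy sketch below illustrates that contrast only. The per-axis scales are set by hand, whereas the paper predicts direction-aware scaling factors with a learned network, and the 2D features and function names are hypothetical.

```python
import math

def euclidean_score(feature, memory_bank):
    """Isotropic score: distance to the nearest normal feature in the bank."""
    return min(math.dist(feature, mem) for mem in memory_bank)

def scaled_score(feature, memory_bank, scales):
    """Anisotropic score: each axis is divided by a scale before measuring,
    so deviations along directions where normal data varies a lot count less."""
    return min(
        math.sqrt(sum(((f - m) / s) ** 2 for f, m, s in zip(feature, mem, scales)))
        for mem in memory_bank
    )

# Hypothetical normal features that spread along the first axis only.
bank = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
along = (4.5, 0.0)    # extends the normal trend
across = (1.5, 1.0)   # leaves the normal trend sideways
scales = (3.0, 0.3)   # assumed: large tolerance along axis 0, small along axis 1

print(euclidean_score(along, bank), euclidean_score(across, bank))
print(scaled_score(along, bank, scales), scaled_score(across, bank, scales))
```

The isotropic score ranks the point that continues the normal trend as the more anomalous one, while the scaled score reverses that ordering. This is the kind of discrimination gain an anisotropic metric is meant to provide.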
-
LONGCODEU: Benchmarking Long-Context Language Models on Long Code Understanding
Authors:
Jia Li,
Xuyuan Guo,
Lei Li,
Kechi Zhang,
Ge Li,
Jia Li,
Zhengwei Tao,
Fang Liu,
Chongyang Tao,
Yuqi Zhu,
Zhi Jin
Abstract:
Current advanced long-context language models offer great potential for real-world software engineering applications. However, progress in this critical domain remains hampered by a fundamental limitation: the absence of a rigorous evaluation framework for long code understanding. To bridge this gap, we propose a long code understanding benchmark LONGCODEU from four aspects (8 tasks) to evaluate LCLMs' long code understanding ability required for practical applications, including code unit perception, intra-code unit understanding, inter-code unit relation understanding, and long code documentation understanding. We evaluate 9 popular LCLMs on LONGCODEU (i.e., 6 general models and 3 code models). Our experimental results reveal key limitations in current LCLMs' capabilities for long code understanding. Particularly, the performance of LCLMs drops dramatically when the long code length is greater than 32K, falling far short of their claimed 128K-1M context windows. Among the four aspects, inter-code unit relation understanding is the most challenging for LCLMs. Our study provides valuable insights for optimizing LCLMs and driving advancements in software engineering.
Submitted 6 March, 2025;
originally announced March 2025.
-
MPPI-DBaS: Safe Trajectory Optimization with Adaptive Exploration
Authors:
Fanxin Wang,
Yikun Cheng,
Chuyuan Tao
Abstract:
In trajectory optimization, Model Predictive Path Integral (MPPI) control is a sampling-based Model Predictive Control (MPC) framework that generates optimal inputs by efficiently simulating numerous trajectories. In practice, however, MPPI often struggles to guarantee safety and to balance efficient sampling in open spaces with the need for more extensive exploration under tight constraints. To address this challenge, we incorporate discrete barrier states (DBaS) into MPPI and propose a novel MPPI-DBaS algorithm that ensures system safety and enables adaptive exploration across diverse scenarios. We evaluate our method in simulation experiments where the vehicle navigates through closely placed obstacles. The results demonstrate that the proposed algorithm significantly outperforms standard MPPI, achieving a higher success rate and lower tracking errors.
Submitted 20 February, 2025;
originally announced February 2025.
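For readers unfamiliar with the sampling-based update described in the MPPI-DBaS abstract above, here is a minimal sketch of one standard MPPI step. It does not include discrete barrier states or the paper's adaptive exploration. The 1D dynamics `x' = x + u * dt`, the quadratic cost, and every parameter value are assumptions chosen for illustration.

```python
import math
import random

def mppi_step(x0, u_nom, goal, rng, samples=1000, sigma=0.5, lam=2.0, dt=0.1):
    """One MPPI update of a nominal control sequence for x' = x + u * dt."""
    horizon = len(u_nom)
    noises, costs = [], []
    for _ in range(samples):
        eps = [rng.gauss(0.0, sigma) for _ in range(horizon)]
        x, cost = x0, 0.0
        for u, e in zip(u_nom, eps):
            x += (u + e) * dt          # roll out the perturbed controls
            cost += (x - goal) ** 2    # quadratic tracking cost
        noises.append(eps)
        costs.append(cost)
    best = min(costs)
    # Softmin weights; subtracting the best cost avoids underflow in exp().
    weights = [math.exp(-(c - best) / lam) for c in costs]
    total = sum(weights)
    # New nominal control = old control + cost-weighted average of the noise.
    return [
        u + sum(w * eps[t] for w, eps in zip(weights, noises)) / total
        for t, u in enumerate(u_nom)
    ]

rng = random.Random(0)
controls = mppi_step(x0=0.0, u_nom=[0.0] * 10, goal=1.0, rng=rng)
print(controls[0])
```

Low-cost rollouts dominate the weighted average, so the nominal sequence drifts toward controls that move the state to the goal. The temperature `lam` trades off greediness against the effective number of samples.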
-
Deep Subspace Learning for Surface Anomaly Classification Based on 3D Point Cloud Data
Authors:
Xuanming Cao,
Chengyu Tao,
Juan Du
Abstract:
Surface anomaly classification is critical for manufacturing system fault diagnosis and quality control. However, the following challenges always hinder accurate anomaly classification in practice: (i) Anomaly patterns exhibit intra-class variation and inter-class similarity, presenting challenges in the accurate classification of each sample. (ii) Despite the predefined classes, new types of anomalies can occur during production and must be detected accurately. (iii) Anomalous data is rare in manufacturing processes, leading to limited data for model learning. To tackle the above challenges simultaneously, this paper proposes a novel deep subspace learning-based 3D anomaly classification model. Specifically, starting from a lightweight encoder to extract the latent representations, we model each class as a subspace to account for the intra-class variation, while promoting distinct subspaces of different classes to tackle the inter-class similarity. Moreover, the explicit modeling of subspaces offers the capability to detect out-of-distribution samples, i.e., new types of anomalies, and provides a regularization effect, since our proposed subspace classifier has far fewer learnable parameters than the popular Multi-Layer Perceptrons (MLPs). Extensive numerical experiments demonstrate that our method achieves better anomaly classification results than benchmark methods and can effectively identify new types of anomalies.
Submitted 17 February, 2025;
originally announced February 2025.
-
Efficient Diffusion Models: A Survey
Authors:
Hui Shen,
Jingxuan Zhang,
Boning Xiong,
Rui Hu,
Shoufa Chen,
Zhongwei Wan,
Xin Wang,
Yu Zhang,
Zixuan Gong,
Guangyin Bao,
Chaofan Tao,
Yongfeng Huang,
Ye Yuan,
Mi Zhang
Abstract:
Diffusion models have emerged as powerful generative models capable of producing high-quality content such as images, videos, and audio, demonstrating their potential to revolutionize digital content creation. However, these capabilities come at the cost of significant computational resources and lengthy generation time, underscoring the critical need to develop efficient techniques for practical deployment. In this survey, we provide a systematic and comprehensive review of research on efficient diffusion models. We organize the literature in a taxonomy consisting of three main categories, covering distinct yet interconnected efficient diffusion model topics from the algorithm-level, system-level, and framework perspectives, respectively. We have also created a GitHub repository where we organize the papers featured in this survey at https://github.com/AIoT-MLSys-Lab/Efficient-Diffusion-Model-Survey. We hope our survey can serve as a valuable resource to help researchers and practitioners gain a systematic understanding of efficient diffusion model research and inspire them to contribute to this important and exciting field.
Submitted 6 June, 2025; v1 submitted 3 February, 2025;
originally announced February 2025.
-
Local perfect chirality at reflection-zeros away from exceptional points in optical whispering gallery microcavity
Authors:
Junda Zhu,
Haitao Liu,
Fang Bo,
Can Tao,
Guoquan Zhang,
Jingjun Xu
Abstract:
Recently, a local and imperfect chirality of the resonant eigenmode at the exceptional point (EP) has been reported in the optical whispering gallery microcavity system perturbed by two strong nanoscatterers [Phys. Rev. A 108, L041501 (2023)]. Here, we discover a local perfect chirality of the resonant eigenmode away from the EP in the parameter space of the strongly perturbed microcavity system. By considering the multiple scattering process of the azimuthally propagating modes (APMs) at the nanoscatterers with a first-principles-based model, the local perfect chirality is predicted to result from the unidirectional reflectionlessness, i.e., the reflection-zero (R-zero) of the APMs at the two nanoscatterers. Numerical results and model predictions consistently show that the structural parameters of the R-zero typically deviate from those of the EP, which means that the pair of split resonant eigenmodes at the R-zero have different complex resonance frequencies and electromagnetic fields. In general, only one of the pair of split eigenmodes exhibits a local perfect chirality within the local azimuthal range divided by the two nanoscatterers. With the decrease of the two nanoscatterers' sizes or their relative azimuthal angle, the R-zero tends to coincide with the EP.
Submitted 8 February, 2025;
originally announced February 2025.
-
Euclid preparation. 3-dimensional galaxy clustering in configuration space. Part I. 2-point correlation function estimation
Authors:
Euclid Collaboration,
S. de la Torre,
F. Marulli,
E. Keihänen,
A. Viitanen,
M. Viel,
A. Veropalumbo,
E. Branchini,
D. Tavagnacco,
F. Rizzo,
J. Valiviita,
V. Lindholm,
V. Allevato,
G. Parimbelli,
E. Sarpa,
Z. Ghaffari,
A. Amara,
S. Andreon,
N. Auricchio,
C. Baccigalupi,
M. Baldi,
S. Bardelli,
A. Basset,
D. Bonino,
M. Brescia
, et al. (275 additional authors not shown)
Abstract:
The 2-point correlation function of the galaxy spatial distribution is a major cosmological observable that enables constraints on the dynamics and geometry of the Universe. The Euclid mission aims at performing an extensive spectroscopic survey of approximately 20--30 million H$α$-emitting galaxies up to about redshift two. This ambitious project seeks to elucidate the nature of dark energy by mapping the 3-dimensional clustering of galaxies over a significant portion of the sky. This paper presents the methodology and software developed for estimating the 3-dimensional 2-point correlation function within the Euclid Science Ground Segment. The software is designed to overcome the significant challenges posed by the large and complex Euclid data set, which involves millions of galaxies. Key challenges include efficient pair counting, managing computational resources, and ensuring the accuracy of the correlation function estimation. The software leverages advanced algorithms, including kd-tree, octree, and linked-list data partitioning strategies, to optimise the pair-counting process. The implementation also includes parallel processing capabilities using shared-memory open multi-processing to further enhance performance and reduce computation times. Extensive validation and performance testing of the software are presented. The results indicate that the software is robust and can reliably estimate the 2-point correlation function, which is essential for deriving cosmological parameters with high precision. Furthermore, the paper discusses the expected performance of the software during different stages of the Euclid Wide Survey observations and forecasts how the precision of the correlation function measurements will improve over the mission's timeline, highlighting the software's capability to handle large data sets efficiently.
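As a rough illustration of what pair counting for a 2-point correlation function involves, here is a minimal brute-force sketch in Python. It is not the Euclid software: the Landy-Szalay estimator used here is an assumption (the abstract does not name an estimator), the function names `pair_counts` and `landy_szalay` are hypothetical, and the quadratic loop stands in for the kd-tree, octree, and linked-list partitioning that the abstract says is used to make counting tractable at survey scale.

```python
import math
import random
from itertools import combinations, product


def pair_counts(a, b, edges, same=False):
    """Histogram pair separations into the bins defined by `edges`.

    With same=True, each unordered pair within `a` is counted once and `b`
    is ignored; otherwise every cross pair between `a` and `b` is counted.
    """
    counts = [0] * (len(edges) - 1)
    pairs = combinations(a, 2) if same else product(a, b)
    for p, q in pairs:
        r = math.dist(p, q)  # Euclidean separation of the two points
        for i in range(len(counts)):
            if edges[i] <= r < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def landy_szalay(data, randoms, edges):
    """Landy-Szalay estimate (DD - 2 DR + RR) / RR per bin (assumed estimator)."""
    nd, nr = len(data), len(randoms)
    dd = pair_counts(data, data, edges, same=True)
    rr = pair_counts(randoms, randoms, edges, same=True)
    dr = pair_counts(data, randoms, edges)
    xi = []
    for d, x, r in zip(dd, dr, rr):
        # Normalise each count by the total number of available pairs.
        dd_n = d / (nd * (nd - 1) / 2)
        dr_n = x / (nd * nr)
        rr_n = r / (nr * (nr - 1) / 2)
        xi.append((dd_n - 2 * dr_n + rr_n) / rr_n if r else float("nan"))
    return xi


# Toy catalogue: unclustered points in a unit box, so xi should scatter near 0.
rng = random.Random(0)
data = [(rng.random(), rng.random(), rng.random()) for _ in range(150)]
randoms = [(rng.random(), rng.random(), rng.random()) for _ in range(300)]
edges = [0.0, 0.1, 0.2, 0.3, 0.4]
xi = landy_szalay(data, randoms, edges)
```

The cost of this loop grows with the product of the catalogue sizes, which is why spatial partitioning and parallel execution matter for catalogues of millions of galaxies.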
Submitted 27 January, 2025;
originally announced January 2025.
-
Euclid preparation. LXVIII. Extracting physical parameters from galaxies with machine learning
Authors:
Euclid Collaboration,
I. Kovačić,
M. Baes,
A. Nersesian,
N. Andreadis,
L. Nemani,
Abdurro'uf,
L. Bisigello,
M. Bolzonella,
C. Tortora,
A. van der Wel,
S. Cavuoti,
C. J. Conselice,
A. Enia,
L. K. Hunt,
P. Iglesias-Navarro,
E. Iodice,
J. H. Knapen,
F. R. Marleau,
O. Müller,
R. F. Peletier,
J. Román,
R. Ragusa,
P. Salucci,
T. Saifollahi
, et al. (265 additional authors not shown)
Abstract:
The Euclid mission is generating a vast amount of imaging data in four broadband filters at high angular resolution. This will allow the detailed study of mass, metallicity, and stellar populations across galaxies, which will constrain their formation and evolutionary pathways. Transforming the Euclid imaging for large samples of galaxies into maps of physical parameters in an efficient and reliable manner is an outstanding challenge. We investigate the power and reliability of machine learning techniques to extract the distribution of physical parameters within well-resolved galaxies. We focus on estimating stellar mass surface density, mass-averaged stellar metallicity and age. We generate noise-free, synthetic high-resolution imaging data in the Euclid photometric bands for a set of 1154 galaxies from the TNG50 cosmological simulation. The images are generated with the SKIRT radiative transfer code, taking into account the complex 3D distribution of stellar populations and interstellar dust attenuation. We use a machine learning framework to map the idealised mock observational data to the physical parameters on a pixel-by-pixel basis. We find that stellar mass surface density can be accurately recovered with a $\leq 0.130 {\rm \,dex}$ scatter. Conversely, stellar metallicity and age estimates are, as expected, less robust, but still contain significant information which originates from underlying correlations at a sub-kpc scale between stellar mass surface density and stellar population properties.
Submitted 31 March, 2025; v1 submitted 24 January, 2025;
originally announced January 2025.
-
Euclid preparation LX. The use of HST images as input for weak-lensing image simulations
Authors:
Euclid Collaboration,
D. Scognamiglio,
T. Schrabback,
M. Tewes,
B. Gillis,
H. Hoekstra,
E. M. Huff,
O. Marggraf,
T. Kitching,
R. Massey,
I. Tereno,
C. S. Carvalho,
A. Robertson,
G. Congedo,
N. Aghanim,
B. Altieri,
A. Amara,
S. Andreon,
N. Auricchio,
C. Baccigalupi,
M. Baldi,
S. Bardelli,
P. Battaglia,
C. Bodendorf,
D. Bonino
, et al. (223 additional authors not shown)
Abstract:
Data from the Euclid space telescope will enable cosmic shear measurements with very small statistical errors, requiring a corresponding level of systematic error control. A common approach to correct for shear biases involves calibrating shape measurement methods using image simulations with known input shear. Given their high resolution, Hubble Space Telescope (HST) galaxies can, in principle, be utilised to emulate Euclid observations. In this work, we employ a GalSim-based testing environment to investigate whether uncertainties in the HST point spread function (PSF) model or in data processing techniques introduce significant biases in weak-lensing (WL) shear calibration. We used single Sérsic galaxy models to simulate both HST and Euclid observations. We then `Euclidised' our HST simulations and compared the results with the directly simulated Euclid-like images. For this comparison, we utilised a moment-based shape measurement algorithm and galaxy model fits. Through the Euclidisation procedure, we effectively reduced the residual multiplicative biases in shear measurements to sub-percent levels. This achievement was made possible by employing either the native pixel scales of the instruments, utilising the Lanczos15 interpolation kernel, correcting for noise correlations, and ensuring consistent galaxy signal-to-noise ratios between simulation branches. However, the Euclidisation procedure requires further analysis on the impact of the correlated noise, to estimate calibration bias. Additionally, we conducted an in-depth analysis of the accuracy of TinyTim HST PSF models using star fields observed in the F606W and F814W filters. We observe that F606W images exhibit a broader scatter in the recovered best-fit focus, compared to those in the F814W filter.
Submitted 14 January, 2025;
originally announced January 2025.
-
Enhancing Scene Classification in Cloudy Image Scenarios: A Collaborative Transfer Method with Information Regulation Mechanism using Optical Cloud-Covered and SAR Remote Sensing Images
Authors:
Yuze Wang,
Rong Xiao,
Haifeng Li,
Mariana Belgiu,
Chao Tao
Abstract:
In remote sensing scene classification, leveraging transfer methods with well-trained optical models is an efficient way to overcome label scarcity. However, cloud contamination leads to optical information loss and significant impacts on feature distribution, challenging the reliability and stability of transferred target models. Common solutions include cloud removal for optical data or directly using synthetic aperture radar (SAR) data in the target domain. However, cloud removal requires substantial auxiliary data for support and pre-training, while directly using SAR disregards the unobstructed portions of optical data. This study presents a scene classification transfer method that synergistically combines multi-modality data, which aims to transfer the source domain model trained on cloud-free optical data to the target domain that includes both cloudy optical and SAR data at low cost. Specifically, the framework incorporates two parts: (1) the collaborative transfer strategy, based on knowledge distillation, enables the efficient prior knowledge transfer across heterogeneous data; (2) the information regulation mechanism (IRM) is proposed to address the modality imbalance issue during transfer. It employs auxiliary models to measure the contribution discrepancy of each modality, and automatically balances the information utilization of modalities during the target model learning process at the sample level. The transfer experiments were conducted on simulated and real cloud datasets, demonstrating the superior performance of the proposed method compared to other solutions in cloud-covered scenarios. We also verified the importance and limitations of IRM, and further discussed and visualized the modality imbalance problem during the model transfer. Codes are available at https://github.com/wangyuze-csu/ESCCS
Submitted 8 January, 2025;
originally announced January 2025.
-
The FACTS Grounding Leaderboard: Benchmarking LLMs' Ability to Ground Responses to Long-Form Input
Authors:
Alon Jacovi,
Andrew Wang,
Chris Alberti,
Connie Tao,
Jon Lipovetz,
Kate Olszewska,
Lukas Haas,
Michelle Liu,
Nate Keating,
Adam Bloniarz,
Carl Saroufim,
Corey Fry,
Dror Marcus,
Doron Kukliansky,
Gaurav Singh Tomar,
James Swirhun,
Jinwei Xing,
Lily Wang,
Madhu Gurumurthy,
Michael Aaron,
Moran Ambar,
Rachana Fellinger,
Rui Wang,
Zizhao Zhang,
Sasha Goldshtein
, et al. (1 additional author not shown)
Abstract:
We introduce FACTS Grounding, an online leaderboard and associated benchmark that evaluates language models' ability to generate text that is factually accurate with respect to given context in the user prompt. In our benchmark, each prompt includes a user request and a full document, with a maximum length of 32k tokens, requiring long-form responses. The long-form responses are required to be fully grounded in the provided context document while fulfilling the user request. Models are evaluated using automated judge models in two phases: (1) responses are disqualified if they do not fulfill the user request; (2) they are judged as accurate if the response is fully grounded in the provided document. The automated judge models were comprehensively evaluated against a held-out test-set to pick the best prompt template, and the final factuality score is an aggregate of multiple judge models to mitigate evaluation bias. The FACTS Grounding leaderboard will be actively maintained over time, and contains both public and private splits to allow for external participation while guarding the integrity of the leaderboard. It can be found at https://www.kaggle.com/facts-leaderboard.
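The two-phase judging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not specify how judge outputs are combined, so averaging per-judge acceptance rates is an assumption, and the names `judge_score`, `aggregate_score`, `judge_a`, and `judge_b` are hypothetical rather than part of the leaderboard's tooling.

```python
from statistics import mean


def judge_score(verdicts):
    """Fraction of responses that one judge accepts.

    `verdicts` holds one (fulfills_request, fully_grounded) pair per response.
    Phase 1 disqualifies responses that do not fulfill the request; phase 2
    accepts the remaining ones only if they are fully grounded.
    """
    accepted = [fulfills and grounded for fulfills, grounded in verdicts]
    return sum(accepted) / len(accepted)


def aggregate_score(verdicts_by_judge):
    """Average the per-judge scores (assumed aggregation rule)."""
    return mean(judge_score(v) for v in verdicts_by_judge.values())


# Toy verdicts from two hypothetical judge models over four responses.
verdicts_by_judge = {
    "judge_a": [(True, True), (True, False), (False, True), (True, True)],
    "judge_b": [(True, True), (True, True), (False, False), (True, True)],
}
score = aggregate_score(verdicts_by_judge)
print(f"aggregate factuality score: {score:.3f}")
```

Note that a grounded response still scores zero if it fails phase 1, which prevents a model from gaining credit by answering evasively.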
Submitted 6 January, 2025;
originally announced January 2025.
-
Unsupervised Domain Adaptation for Occlusion Resilient Human Pose Estimation
Authors:
Arindam Dutta,
Sarosij Bose,
Saketh Bachu,
Calvin-Khang Ta,
Konstantinos Karydis,
Amit K. Roy-Chowdhury
Abstract:
Occlusions are a significant challenge to human pose estimation algorithms, often resulting in inaccurate and anatomically implausible poses. Although current occlusion-robust human pose estimation algorithms exhibit impressive performance on existing datasets, their success is largely attributed to supervised training and the availability of additional information, such as multiple views or temporal continuity. Furthermore, these algorithms typically suffer from performance degradation under distribution shifts. While existing domain adaptive human pose estimation algorithms address this bottleneck, they tend to perform suboptimally when the target domain images are occluded, a common occurrence in real-life scenarios. To address these challenges, we propose OR-POSE: Unsupervised Domain Adaptation for Occlusion Resilient Human POSE Estimation. OR-POSE is an innovative unsupervised domain adaptation algorithm which effectively mitigates domain shifts and overcomes occlusion challenges by employing the mean teacher framework for iterative pseudo-label refinement. Additionally, OR-POSE reinforces realistic pose prediction by leveraging a learned human pose prior which incorporates the anatomical constraints of humans in the adaptation process. Lastly, OR-POSE avoids overfitting to inaccurate pseudo labels generated from heavily occluded images by employing a novel visibility-based curriculum learning approach. This enables the model to gradually transition from training samples with relatively less occlusion to more challenging, heavily occluded samples. Extensive experiments show that OR-POSE outperforms existing analogous state-of-the-art algorithms by $\sim$ 7% on challenging occluded human pose estimation datasets.
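For readers unfamiliar with the mean teacher framework mentioned above, its core step is that the teacher's weights track an exponential moving average of the student's weights, and the teacher then produces the pseudo-labels. The sketch below shows only that generic update in plain Python; the function name `ema_update`, the dictionary representation of weights, and the decay value are illustrative assumptions and are not taken from OR-POSE.

```python
def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights as an exponential moving average.

    `teacher` and `student` map parameter names to floats; a real model
    would apply the same rule to every tensor of weights.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


# Toy example with a single scalar parameter.
teacher = {"w": 1.0}
student = {"w": 0.0}
for _ in range(3):
    # In training, the student would take a gradient step here before
    # the teacher is refreshed.
    teacher = ema_update(teacher, student, decay=0.9)
```

Because the teacher changes slowly, its pseudo-labels are more stable than the student's own predictions, which is what makes iterative refinement workable.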
Submitted 6 January, 2025;
originally announced January 2025.