Search | arXiv e-print repository

The Amazon Nova Family of Models: Technical Report and Model Card

Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents and text. Amazon Nova Micro is a text-only model that delivers our lowest-latency responses at very low cost. Amazon Nova Canvas is an image generation model that creates professional-grade images with rich customization controls. Amazon Nova Reel is a video generation model offering high-quality outputs, customization, and motion control. Our models were built responsibly and with a commitment to customer trust, security, and reliability. We report benchmarking results for core capabilities, agentic performance, long context, functional adaptation, runtime performance, and human evaluation.

Submitted 17 March, 2025; originally announced June 2025.

Comments: 48 pages, 10 figures

Report number: 20250317

arXiv:2506.11445 [pdf, ps, other]

Resolve Highway Conflict in Multi-Autonomous Vehicle Controls with Local State Attention

Authors: Xuan Duy Ta, Bang Giang Le, Thanh Ha Le, Viet Cuong Ta

Abstract: In mixed-traffic environments, autonomous vehicles must adapt to human-controlled vehicles and other unusual driving situations. This setting can be framed as a multi-agent reinforcement learning (MARL) environment with a fully cooperative reward among the autonomous vehicles. While methods such as Multi-agent Proximal Policy Optimization can be effective in training MARL tasks, they often fail to resolve local conflicts between agents and are unable to generalize to stochastic events. In this paper, we propose a Local State Attention module to assist the input state representation. By relying on the self-attention operator, the module is expected to compress the essential information of nearby agents to resolve the conflict in traffic situations. Utilizing a simulated highway merging scenario with the priority vehicle as the unexpected event, our approach is able to prioritize other vehicles' information to manage the merging process. The results demonstrate significant improvements in merging efficiency compared to popular baselines, especially in high-density traffic settings.

Submitted 12 June, 2025; originally announced June 2025.

arXiv:2506.08378 [pdf, ps, other]

Euclid preparation: The NISP spectroscopy channel, on ground performance and calibration

Authors: Euclid Collaboration, W. Gillard, T. Maciaszek, E. Prieto, F. Grupp, A. Costille, K. Jahnke, J. Clemens, S. Dusini, M. Carle, C. Sirignano, E. Medinaceli, S. Ligori, E. Franceschi, M. Trifoglio, W. Bon, R. Barbier, S. Ferriol, A. Secroun, N. Auricchio, P. Battaglia, C. Bonoli, L. Corcione, F. Hormuth, D. Le Mignant, et al. (334 additional authors not shown)

Abstract: ESA's Euclid cosmology mission relies on the very sensitive and accurately calibrated spectroscopy channel of the Near-Infrared Spectrometer and Photometer (NISP). With three operational grisms in two wavelength intervals, NISP provides diffraction-limited slitless spectroscopy over a field of $0.57$ deg$^2$. A blue grism $\text{BG}_\text{E}$ covers the wavelength range $926$--$1366$\,nm at a spectral resolution $R=440$--$900$ for a $0.5''$ diameter source with a dispersion of $1.24$ nm px$^{-1}$. Two red grisms $\text{RG}_\text{E}$ span $1206$ to $1892$\,nm at $R=550$--$740$ and a dispersion of $1.37$ nm px$^{-1}$. We describe the construction of the grisms as well as the ground testing of the flight model of the NISP instrument where these properties were established.

Submitted 9 June, 2025; originally announced June 2025.

Comments: 18 pages 15 figures with additional 8 pages of annexes. Submitted to A&A

arXiv:2506.08158 [pdf, ps, other]

ETT-CKGE: Efficient Task-driven Tokens for Continual Knowledge Graph Embedding

Authors: Lijing Zhu, Qizhen Lan, Qing Tian, Wenbo Sun, Li Yang, Lu Xia, Yixin Xie, Xi Xiao, Tiehang Duan, Cui Tao, Shuteng Niu

Abstract: Continual Knowledge Graph Embedding (CKGE) seeks to integrate new knowledge while preserving past information. However, existing methods struggle with efficiency and scalability due to two key limitations: (1) suboptimal knowledge preservation between snapshots caused by manually designed node/relation importance scores that ignore graph dependencies relevant to the downstream task, and (2) computationally expensive graph traversal for node/relation importance calculation, leading to slow training and high memory overhead. To address these limitations, we introduce ETT-CKGE (Efficient, Task-driven, Tokens for Continual Knowledge Graph Embedding), a novel task-guided CKGE method that leverages efficient task-driven tokens for efficient and effective knowledge transfer between snapshots. Our method introduces a set of learnable tokens that directly capture task-relevant signals, eliminating the need for explicit node scoring or traversal. These tokens serve as consistent and reusable guidance across snapshots, enabling efficient token-masked embedding alignment between snapshots. Importantly, knowledge transfer is achieved through simple matrix operations, significantly reducing training time and memory usage. Extensive experiments across six benchmark datasets demonstrate that ETT-CKGE consistently achieves superior or competitive predictive performance, while substantially improving training efficiency and scalability compared to state-of-the-art CKGE methods. The code is available at: https://github.com/lijingzhu1/ETT-CKGE/tree/main

Submitted 9 June, 2025; originally announced June 2025.

arXiv:2506.03008 [pdf, ps, other]

Euclid preparation. Constraining parameterised models of modifications of gravity with the spectroscopic and photometric primary probes

Authors: Euclid Collaboration, I. S. Albuquerque, N. Frusciante, Z. Sakr, S. Srinivasan, L. Atayde, B. Bose, V. F. Cardone, S. Casas, M. Martinelli, J. Noller, E. M. Teixeira, D. B. Thomas, I. Tutusaus, M. Cataneo, K. Koyama, L. Lombriser, F. Pace, A. Silvestri, N. Aghanim, A. Amara, S. Andreon, N. Auricchio, C. Baccigalupi, M. Baldi, et al. (263 additional authors not shown)

Abstract: The Euclid mission has the potential to understand the fundamental physical nature of late-time cosmic acceleration and, as such, of deviations from the standard cosmological model, LCDM. In this paper, we focus on model-independent methods to modify the evolution of scalar perturbations at linear scales. We consider two approaches: the first is based on the two phenomenological modified gravity (PMG) parameters, $\mu_{\rm mg}$ and $\Sigma_{\rm mg}$, which are phenomenologically connected to the clustering of matter and weak lensing, respectively; and the second is the effective field theory (EFT) of dark energy and modified gravity, which we use to parameterise the braiding function, $\alpha_{\rm B}$, which defines the mixing between the metric and the dark energy field. We discuss the predictions from spectroscopic and photometric primary probes by Euclid on the cosmological parameters and a given set of additional parameters featuring the PMG and EFT models. We use the Fisher matrix method applied to spectroscopic galaxy clustering (GCsp), weak lensing (WL), photometric galaxy clustering (GCph), and cross-correlation (XC) between GCph and WL. For the modelling of photometric predictions on nonlinear scales, we use the halo model to cover two limits for the screening mechanism: the unscreened (US) case, for which the screening mechanism is not present; and the super-screened (SS) case, which assumes strong screening. We also assume scale cuts to account for our uncertainties in the modelling of nonlinear perturbation evolution. We choose a time-dependent form for $\{\mu_{\rm mg},\Sigma_{\rm mg}\}$, with two fiducial sets of values for the corresponding model parameters at the present time, $\{\bar{\mu}_0,\bar{\Sigma}_0\}$, and two forms for $\alpha_{\rm B}$, with one fiducial set of values for each of the model parameters, $\alpha_{\rm B,0}$ and $\{\alpha_{\rm B,0},m\}$. (Abridged)

Submitted 3 June, 2025; originally announced June 2025.

Comments: 21 pages, 9 figures

arXiv:2505.23932 [pdf, ps, other]

SwingArena: Competitive Programming Arena for Long-context GitHub Issue Solving

Authors: Wendong Xu, Jing Xiong, Chenyang Zhao, Qiujiang Chen, Haoran Wang, Hui Shen, Zhongwei Wan, Jianbo Dai, Taiqiang Wu, He Xiao, Chaofan Tao, Z. Morley Mao, Ying Sheng, Zhijiang Guo, Hongxia Yang, Bei Yu, Lingpeng Kong, Quanquan Gu, Ngai Wong

Abstract: We present SwingArena, a competitive evaluation framework for Large Language Models (LLMs) that closely mirrors real-world software development workflows. Unlike traditional static benchmarks, SwingArena models the collaborative process of software iteration by pairing LLMs as submitters, who generate patches, and reviewers, who create test cases and verify the patches through continuous integration (CI) pipelines. To support these interactive evaluations, we introduce a retrieval-augmented code generation (RACG) module that efficiently handles long-context challenges by providing syntactically and semantically relevant code snippets from large codebases, supporting multiple programming languages (C++, Python, Rust, and Go). This enables the framework to scale across diverse tasks and contexts while respecting token limitations. Our experiments, using over 400 high-quality real-world GitHub issues selected from a pool of 2,300 issues, show that models like GPT-4o excel at aggressive patch generation, whereas DeepSeek and Gemini prioritize correctness in CI validation. SwingArena presents a scalable and extensible methodology for evaluating LLMs in realistic, CI-driven software development settings. More details are available on our project page: swing-bench.github.io

Submitted 2 June, 2025; v1 submitted 29 May, 2025; originally announced May 2025.

arXiv:2505.19159 [pdf, ps, other]

A Joint Learning Framework with Feature Reconstruction and Prediction for Incomplete Satellite Image Time Series in Agricultural Semantic Segmentation

Authors: Yuze Wang, Mariana Belgiu, Haiyang Wu, Dandan Zhong, Yangyang Cao, Chao Tao

Abstract: Satellite Image Time Series (SITS) is crucial for agricultural semantic segmentation. However, cloud contamination introduces time gaps in SITS, disrupting temporal dependencies and causing feature shifts, leading to degraded performance of models trained on complete SITS. Existing methods typically address this by reconstructing the entire SITS before prediction or using data augmentation to simulate missing data. Yet, full reconstruction may introduce noise and redundancy, while the data-augmented model can only handle limited missing patterns, leading to poor generalization. We propose a joint learning framework with feature reconstruction and prediction to address incomplete SITS more effectively. During training, we simulate data-missing scenarios using temporal masks. The two tasks are guided by both ground-truth labels and the teacher model trained on complete SITS. The prediction task constrains the model to selectively reconstruct critical features from masked inputs that align with the teacher's temporal feature representations. It reduces unnecessary reconstruction and limits noise propagation. By integrating reconstructed features into the prediction task, the model avoids learning shortcuts and maintains its ability to handle varied missing patterns and complete SITS. Experiments on SITS from Hunan Province, Western France, and Catalonia show that our method improves mean F1-scores by 6.93% in cropland extraction and 7.09% in crop classification over baselines. It also generalizes well across satellite sensors, including Sentinel-2 and PlanetScope, under varying temporal missing rates and model backbones.

Submitted 25 May, 2025; originally announced May 2025.

arXiv:2505.18229 [pdf]

BEDI: A Comprehensive Benchmark for Evaluating Embodied Agents on UAVs

Authors: Mingning Guo, Mengwei Wu, Jiarun He, Shaoxian Li, Haifeng Li, Chao Tao

Abstract: With the rapid advancement of low-altitude remote sensing and Vision-Language Models (VLMs), Embodied Agents based on Unmanned Aerial Vehicles (UAVs) have shown significant potential in autonomous tasks. However, current evaluation methods for UAV-Embodied Agents (UAV-EAs) remain constrained by the lack of standardized benchmarks, diverse testing scenarios and open system interfaces. To address these challenges, we propose BEDI (Benchmark for Embodied Drone Intelligence), a systematic and standardized benchmark designed for evaluating UAV-EAs. Specifically, we introduce a novel Dynamic Chain-of-Embodied-Task paradigm based on the perception-decision-action loop, which decomposes complex UAV tasks into standardized, measurable subtasks. Building on this paradigm, we design a unified evaluation framework encompassing five core sub-skills: semantic perception, spatial perception, motion control, tool utilization, and task planning. Furthermore, we construct a hybrid testing platform that integrates static real-world environments with dynamic virtual scenarios, enabling comprehensive performance assessment of UAV-EAs across varied contexts. The platform also offers open and standardized interfaces, allowing researchers to customize tasks and extend scenarios, thereby enhancing flexibility and scalability in the evaluation process. Finally, through empirical evaluations of several state-of-the-art (SOTA) VLMs, we reveal their limitations in embodied UAV tasks, underscoring the critical role of the BEDI benchmark in advancing embodied intelligence research and model optimization. By filling the gap in systematic and standardized evaluation within this field, BEDI facilitates objective model comparison and lays a robust foundation for future development in this field. Our benchmark will be released at https://github.com/lostwolves/BEDI .

Submitted 23 May, 2025; originally announced May 2025.

arXiv:2505.15929 [pdf, ps, other]

PhyX: Does Your Model Have the "Wits" for Physical Reasoning?

Authors: Hui Shen, Taiqiang Wu, Qi Han, Yunta Hsieh, Jizhou Wang, Yuyue Zhang, Yuxin Cheng, Zijian Hao, Yuansheng Ni, Xin Wang, Zhongwei Wan, Kai Zhang, Wendong Xu, Jing Xiong, Ping Luo, Wenhu Chen, Chaofan Tao, Zhuoqing Mao, Ngai Wong

Abstract: Existing benchmarks fail to capture a crucial aspect of intelligence: physical reasoning, the integrated ability to combine domain knowledge, symbolic reasoning, and understanding of real-world constraints. To address this gap, we introduce PhyX: the first large-scale benchmark designed to assess models' capacity for physics-grounded reasoning in visual scenarios. PhyX includes 3K meticulously curated multimodal questions spanning 6 reasoning types across 25 sub-domains and 6 core physics domains: thermodynamics, electromagnetism, mechanics, modern physics, optics, and wave & acoustics. In our comprehensive evaluation, even state-of-the-art models struggle significantly with physical reasoning. GPT-4o, Claude3.7-Sonnet, and GPT-o4-mini achieve only 32.5%, 42.2%, and 45.8% accuracy respectively, with performance gaps exceeding 29% compared to human experts. Our analysis exposes critical limitations in current models: over-reliance on memorized disciplinary knowledge, excessive dependence on mathematical formulations, and surface-level visual pattern matching rather than genuine physical understanding. We provide in-depth analysis through fine-grained statistics, detailed case studies, and multiple evaluation paradigms to thoroughly examine physical reasoning capabilities. To ensure reproducibility, we implement a compatible evaluation protocol based on widely-used toolkits such as VLMEvalKit, enabling one-click evaluation. More details are available on our project page: https://phyx-bench.github.io/.

Submitted 29 May, 2025; v1 submitted 21 May, 2025; originally announced May 2025.

arXiv:2505.13886 [pdf, ps, other]

Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning

Authors: Jingqi Tong, Jixin Tang, Hangcheng Li, Yurong Mou, Ming Zhang, Jun Zhao, Yanbo Wen, Fan Song, Jiahao Zhan, Yuyang Lu, Chaoran Tao, Zhiyuan Guo, Jizhou Yu, Tianhao Cheng, Changhao Jiang, Zhen Wang, Tao Liang, Zhihui Fei, Mingyang Wan, Guojun Ma, Weifeng Ge, Guanhua Chen, Tao Gui, Xipeng Qiu, Qi Zhang, et al. (1 additional author not shown)

Abstract: Visual-language Chain-of-Thought (CoT) data resources are relatively scarce compared to text-only counterparts, limiting the improvement of reasoning capabilities in Vision Language Models (VLMs). However, high-quality vision-language reasoning data is expensive and labor-intensive to annotate. To address this issue, we leverage a promising resource: game code, which naturally contains logical structures and state transition processes. Therefore, we propose Code2Logic, a novel game-code-driven approach for multimodal reasoning data synthesis. Our approach leverages Large Language Models (LLMs) to adapt game code, enabling automatic acquisition of reasoning processes and results through code execution. Using the Code2Logic approach, we developed the GameQA dataset to train and evaluate VLMs. GameQA is cost-effective and scalable to produce, challenging for state-of-the-art models, and diverse with 30 games and 158 tasks. Surprisingly, despite training solely on game data, VLMs demonstrated out-of-domain generalization, specifically Qwen2.5-VL-7B improving performance by 2.33% across 7 diverse vision-language benchmarks. Our code and dataset are available at https://github.com/tongjingqi/Code2Logic.

Submitted 19 May, 2025; originally announced May 2025.

Comments: 49 pages, 19 figures, submitted to NeurIPS 2025

ACM Class: I.2.7; I.2.10

arXiv:2505.07880 [pdf, ps, other]

doi 10.3847/1538-4357/ad9f32

An Agnostic Approach to Building Empirical Type Ia Supernova Light Curves: Evidence for Intrinsic Chromatic Flux Variation Using Nearby Supernova Factory Data

Authors: Jared Hand, A. G. Kim, G. Aldering, P. Antilogus, C. Aragon, S. Bailey, C. Baltay, S. Bongard, K. Boone, C. Buton, Y. Copin, S. Dixon, D. Fouchez, E. Gangler, R. Gupta, B. Hayden, W. Hillebrandt, Mitchell Karmen, M. Kowalski, D. Küsters, P. -F. Léget, F. Mondon, J. Nordin, R. Pain, E. Pecontal, et al. (13 additional authors not shown)

Abstract: We present a new empirical Type Ia supernova (SN Ia) model with three chromatic flux variation templates: one phase dependent and two phase independent. No underlying dust extinction model or patterns of intrinsic variability are assumed. Implemented with Stan and trained using spectrally binned Nearby Supernova Factory spectrophotometry, we examine this model's 2D, phase-independent flux variation space using two motivated basis representations. In both, the first phase-independent template captures variation that appears dust-like, while the second captures a combination of effectively intrinsic variability and second-order dust-like effects. We find that approximately 13% of the modeled phase-independent flux variance is not dust-like. Previous empirical SN Ia models either assume an effective dust extinction recipe in their architecture, or only allow for a single mode of phase-independent variation. The presented results demonstrate such an approach may be insufficient, because it could "leak" noticeable intrinsic variation into phase-independent templates.

Submitted 10 May, 2025; originally announced May 2025.

Journal ref: ApJ 982 110 (2025)

arXiv:2505.04688 [pdf, other]

Euclid preparation. The impact of redshift interlopers on the two-point correlation function analysis

Authors: Euclid Collaboration, I. Risso, A. Veropalumbo, E. Branchini, E. Maragliano, S. de la Torre, E. Sarpa, P. Monaco, B. R. Granett, S. Lee, G. E. Addison, S. Bruton, C. Carbone, G. Lavaux, K. Markovic, K. McCarthy, G. Parimbelli, F. Passalacqua, W. J. Percival, C. Scarlata, E. Sefusatti, Y. Wang, M. Bonici, F. Oppizzi, N. Aghanim, et al. (295 additional authors not shown)

Abstract: The Euclid survey aims to measure the spectroscopic redshift of emission-line galaxies by identifying the H$\,\alpha$ line in their slitless spectra. This method is sensitive to the signal-to-noise ratio of the line, as noise fluctuations or other strong emission lines can be misidentified as H$\,\alpha$, depending on redshift. These effects lead to catastrophic redshift errors and the inclusion of interlopers in the sample. We forecast the impact of such redshift errors on galaxy clustering measurements. In particular, we study the effect of interloper contamination on the two-point correlation function (2PCF), the growth rate of structures, and the Alcock-Paczynski (AP) parameters. We analyze 1000 synthetic spectroscopic catalogues, the EuclidLargeMocks, designed to match the area and selection function of the Data Release 1 (DR1) sample. We estimate the 2PCF of the contaminated catalogues, isolating contributions from correctly identified galaxies and from interlopers. We explore different models with increasing complexity to describe the measured 2PCF at fixed cosmology. Finally, we perform a cosmological inference and evaluate the systematic error on the inferred $f\sigma_8$, $\alpha_{\parallel}$ and $\alpha_{\perp}$ values associated with different models. Our results demonstrate that a minimal modelling approach, which only accounts for an attenuation of the clustering signal regardless of the type of contaminants, is sufficient to recover the correct values of $f\sigma_8$, $\alpha_{\parallel}$, and $\alpha_{\perp}$ at DR1. The accuracy and precision of the estimated AP parameters are largely insensitive to the presence of interlopers. The adoption of a minimal model induces a 1%-3% systematic error on the growth rate of structure estimation, depending on the redshift. However, this error remains smaller than the statistical error expected for the Euclid DR1 analysis.

Submitted 7 May, 2025; originally announced May 2025.

Comments: 27 pages, 22 figures, submitted to A&A

arXiv:2504.18768 [pdf, other]

doi 10.1145/3730892

TransparentGS: Fast Inverse Rendering of Transparent Objects with Gaussians

Authors: Letian Huang, Dongwei Ye, Jialin Dan, Chengzhi Tao, Huiwen Liu, Kun Zhou, Bo Ren, Yuanqi Li, Yanwen Guo, Jie Guo

Abstract: The emergence of neural and Gaussian-based radiance field methods has led to considerable advancements in novel view synthesis and 3D object reconstruction. Nonetheless, specular reflection and refraction continue to pose significant challenges due to the instability and incorrect overfitting of radiance fields to high-frequency light variations. Currently, even 3D Gaussian Splatting (3D-GS), as a powerful and efficient tool, falls short in recovering transparent objects with nearby contents due to the existence of apparent secondary ray effects. To address this issue, we propose TransparentGS, a fast inverse rendering pipeline for transparent objects based on 3D-GS. The main contributions are three-fold. Firstly, an efficient representation of transparent objects, transparent Gaussian primitives, is designed to enable specular refraction through a deferred refraction strategy. Secondly, we leverage Gaussian light field probes (GaussProbe) to encode both ambient light and nearby contents in a unified framework. Thirdly, a depth-based iterative probes query (IterQuery) algorithm is proposed to reduce the parallax errors in our probe-based framework. Experiments demonstrate the speed and accuracy of our approach in recovering transparent objects from complex environments, as well as several applications in computer graphics and vision.

Submitted 1 May, 2025; v1 submitted 25 April, 2025; originally announced April 2025.

Comments: accepted by SIGGRAPH 2025; https://letianhuang.github.io/transparentgs/

arXiv:2504.17867 [pdf, other]

Euclid preparation: TBD. Cosmic Dawn Survey: evolution of the galaxy stellar mass function across 0.2<z<6.5 measured over 10 square degrees

Authors: Euclid Collaboration, L. Zalesky, J. R. Weaver, C. J. R. McPartland, G. Murphree, I. Valdes, C. K. Jespersen, S. Taamoli, N. Chartab, N. Allen, S. W. J. Barrow, D. B. Sanders, S. Toft, B. Mobasher, I. Szapudi, B. Altieri, A. Amara, S. Andreon, N. Auricchio, C. Baccigalupi, M. Baldi, S. Bardelli, P. Battaglia, A. Biviano, D. Bonino, et al. (282 additional authors not shown)

Abstract: The Cosmic Dawn Survey Pre-launch (PL) catalogues cover an effective 10.13 deg$^{2}$ area with uniform deep Spitzer/IRAC data ($m\sim25$ mag, 5$\sigma$), the largest area covered to these depths in the infrared. These data are used to gain new insight into the growth of stellar mass across cosmic history by characterising the evolution of the galaxy stellar mass function (GSMF) through $0.2 < z \leq 6.5$. The total volume (0.62 Gpc$^{3}$) represents a tenfold increase compared to previous works that have explored $z > 3$ and significantly reduces cosmic variance, yielding strong constraints on the abundance of massive galaxies. Results are generally consistent with the literature but now provide firm estimates of number density where only upper limits were previously available. Contrasting the GSMF with the dark matter halo mass function suggests that massive galaxies ($M \gtrsim10^{11}$ M$_{\odot}$) at $z > 3.5$ required integrated star-formation efficiencies of $M/(M_{\rm h}f_{\rm b}) \gtrsim$ 0.25--0.5, in excess of the commonly-held view of "universal peak efficiency" from studies on the stellar-to-halo mass relation (SHMR). Such increased efficiencies imply an evolving peak in the SHMR at $z > 3.5$ which can be maintained if feedback mechanisms from active galactic nuclei and stellar processes are ineffective at early times. In addition, a significant fraction of the most massive quiescent galaxies are observed to be in place already by $z\sim 2.5$--3. The apparent lack in change of their number density by $z\sim 0.2$ is consistent with relatively little mass growth from mergers. Utilising the unique volume, evidence for an environmental dependence of the galaxy stellar mass function is found all the way through $z\sim 3.5$ for the first time, though a more careful characterisation of the density field is ultimately required for confirmation.

Submitted 24 April, 2025; originally announced April 2025.

Comments: - Submitted to A&A - Catalogues available here: https://dawn.calet.org/pl/

arXiv:2504.16090 [pdf]

Launching Insights: A Pilot Study on Leveraging Real-World Observational Data from the Mayo Clinic Platform to Advance Clinical Research

Authors: Yue Yu, Xinyue Hu, Sivaraman Rajaganapathy, Jingna Feng, Ahmed Abdelhameed, Xiaodi Li, Jianfu Li, Ken Liu, Liu Yang, Nilufer Taner, Phil Fiero, Soulmaz Boroumand, Richard Larsen, Maneesh Goyal, Clark Otley, Nansu Zong, John Halamka, Cui Tao

Abstract: Background: Artificial intelligence (AI) is transforming healthcare, yet translating AI models from theoretical frameworks to real-world clinical applications remains challenging. The Mayo Clinic Platform (MCP) was established to address these challenges by providing a scalable ecosystem that integrates real-world multimodal data from multiple institutions, advanced analytical tools, and secure computing environments to support clinical research and AI development. Methods: In this study, we conducted four research projects leveraging MCP's data infrastructure and analytical capabilities to demonstrate its potential in facilitating real-world evidence generation and AI-driven clinical insights. Utilizing MCP's tools and environment, we facilitated efficient cohort identification, data extraction, and subsequent statistical or AI-powered analyses. Results: The results underscore MCP's role in accelerating translational research by offering de-identified, standardized real-world data and facilitating AI model validation across diverse healthcare settings. Compared to Mayo's internal Electronic Health Record (EHR) data, MCP provides broader accessibility, enhanced data standardization, and multi-institutional integration, making it a valuable resource for both internal and external researchers. Conclusion: Looking ahead, MCP is well-positioned to transform clinical research through its scalable ecosystem, effectively bridging the divide between AI innovation and clinical deployment. Future investigations will build upon this foundation, further exploring MCP's capacity to advance precision medicine and enhance patient outcomes.

Submitted 21 March, 2025; originally announced April 2025.

Comments: 12 pages, 3 figures, 2 tables

arXiv:2504.13820 [pdf, other]

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Authors: Yang Yue, Yulin Wang, Chenxin Tao, Pan Liu, Shiji Song, Gao Huang

Abstract: Humans can develop internal world models that encode common sense knowledge, telling them how the world works and predicting the consequences of their actions. This concept has emerged as a promising direction for establishing general-purpose machine-learning models in recent preliminary works, e.g., for visual representation learning. In this paper, we present CheXWorld, the first effort towards a self-supervised world model for radiographic images. Specifically, our work develops a unified framework that simultaneously models three aspects of medical knowledge essential for qualified radiologists, including 1) local anatomical structures describing the fine-grained characteristics of local tissues (e.g., architectures, shapes, and textures); 2) global anatomical layouts describing the global organization of the human body (e.g., layouts of organs and skeletons); and 3) domain variations that encourage CheXWorld to model the transitions across different appearance domains of radiographs (e.g., varying clarity, contrast, and exposure caused by collecting radiographs from different hospitals, devices, or patients). Empirically, we design tailored qualitative and quantitative analyses, revealing that CheXWorld successfully captures these three dimensions of medical knowledge. Furthermore, transfer learning experiments across eight medical image classification and segmentation benchmarks showcase that CheXWorld significantly outperforms existing SSL methods and large-scale medical foundation models. Code & pre-trained models are available at https://github.com/LeapLabTHU/CheXWorld.

Submitted 18 April, 2025; originally announced April 2025.

Comments: Accepted by CVPR 2025

arXiv:2504.13020 [pdf]

Euclid preparation. Estimating galaxy physical properties using CatBoost chained regressors with attention

Authors: Euclid Collaboration, A. Humphrey, P. A. C. Cunha, L. Bisigello, C. Tortora, M. Bolzonella, L. Pozzetti, M. Baes, B. R. Granett, A. Amara, S. Andreon, N. Auricchio, C. Baccigalupi, M. Baldi, S. Bardelli, A. Biviano, C. Bodendorf, D. Bonino, E. Branchini, M. Brescia, J. Brinchmann, S. Camera, G. Cañas-Herrera, V. Capobianco, C. Carbone, et al. (210 additional authors not shown)

Abstract: Euclid will image ~14000 deg^2 of the extragalactic sky at visible and NIR wavelengths, providing a dataset of unprecedented size and richness that will facilitate a multitude of studies into the evolution of galaxies. In the vast majority of cases the main source of information will come from broad-band images and data products thereof. Therefore, there is a pressing need to identify or develop scalable yet reliable methodologies to estimate the redshift and physical properties of galaxies using broad-band photometry from Euclid, optionally including ground-based optical photometry also. To address this need, we present a novel method to estimate the redshift, stellar mass, star-formation rate, specific star-formation rate, E(B-V), and age of galaxies, using mock Euclid and ground-based photometry. The main novelty of our property-estimation pipeline is its use of the CatBoost implementation of gradient-boosted regression-trees, together with chained regression and an intelligent, automatic optimization of the training data. The pipeline also includes a computationally-efficient method to estimate prediction uncertainties, and, in the absence of ground-truth labels, provides accurate predictions for metrics of model performance up to z~2. We apply our pipeline to several datasets consisting of mock Euclid broad-band photometry and mock ground-based ugriz photometry, to evaluate the performance of our methodology for estimating the redshift and physical properties of galaxies detected in the Euclid Wide Survey. The quality of our photometric redshift and physical property estimates is highly competitive overall, validating our modeling approach. We find that the inclusion of ground-based optical photometry significantly improves the quality of the property estimation, highlighting the importance of combining Euclid data with ancillary ground-based optical data. (Abridged)

Submitted 17 April, 2025; originally announced April 2025.

Comments: 22 pages, 13 figures, 4 tables. Accepted for publication by Astronomy & Astrophysics

arXiv:2504.10046 [pdf, other]

CodeRAG: Supportive Code Retrieval on Bigraph for Real-World Code Generation

Authors: Jia Li, Xianjie Shi, Kechi Zhang, Lei Li, Ge Li, Zhengwei Tao, Jia Li, Fang Liu, Chongyang Tao, Zhi Jin

Abstract: Large language models (LLMs) have shown promising performance in automated code generation, especially excelling in simple tasks such as generating standalone code. Unlike these simple tasks, real-world code generation usually depends on a specific programming environment (e.g., code repositories). Such an environment contains complex dependencies and domain knowledge, which LLMs need when generating target code snippets. In this paper, we propose CodeRAG, a retrieval-augmented code generation (RAG) framework to comprehensively retrieve supportive code for real-world code generation. Beginning with the requirement, CodeRAG first constructs a requirement graph for the current repository, and retrieves sub- and similar-requirement nodes of the target requirement on the graph. Meanwhile, it models the repository into a DS-code graph. CodeRAG then maps these relevant requirement nodes into their corresponding code nodes, and treats these code nodes as anchors for LLM reasoning on the DS-code graph. Finally, CodeRAG introduces a code-oriented agentic reasoning process, allowing LLMs to reason and comprehensively retrieve the supportive code that they need for generating correct programs. Experiments show that CodeRAG achieves significant improvements (i.e., increasing 40.90 and 37.79 Pass@1 on GPT-4o and Gemini-Pro on DevEval) compared to no-RAG scenarios. Further tests on reasoning LLMs (i.e., QwQ-32B) confirm CodeRAG's adaptability and efficacy across various types of LLMs. In addition, CodeRAG outperforms commercial programming products such as Copilot and Cursor. We further investigate the performance of our framework on different dependency types, and observe that CodeRAG is superior in generating examples where target code invokes predefined cross-file code snippets. These results demonstrate CodeRAG's potential in solving real-world repo-level coding challenges.

Submitted 14 April, 2025; originally announced April 2025.

arXiv:2504.06437 [pdf, other]

DBaS-Log-MPPI: Efficient and Safe Trajectory Optimization via Barrier States

Authors: Fanxin Wang, Haolong Jiang, Chuyuan Tao, Wenbin Wan, Yikun Cheng

Abstract: Optimizing trajectory costs for nonlinear control systems remains a significant challenge. Model Predictive Control (MPC), particularly sampling-based approaches such as the Model Predictive Path Integral (MPPI) method, has recently demonstrated considerable success by leveraging parallel computing to efficiently evaluate numerous trajectories. However, MPPI often struggles to balance safe navigation in constrained environments with effective exploration in open spaces, leading to infeasibility in cluttered conditions. To address these limitations, we propose DBaS-Log-MPPI, a novel algorithm that integrates Discrete Barrier States (DBaS) to ensure safety while enabling adaptive exploration with enhanced feasibility. Our method is efficiently validated through three simulation missions and one real-world experiment, involving a 2D quadrotor and a ground vehicle navigating through cluttered obstacles. We demonstrate that our algorithm surpasses both Vanilla MPPI and Log-MPPI, achieving higher success rates, lower tracking errors, and a conservative average speed.

Submitted 26 March, 2025; originally announced April 2025.

Comments: IROS 2025

arXiv:2504.01509 [pdf, other]

PROPHET: An Inferable Future Forecasting Benchmark with Causal Intervened Likelihood Estimation

Authors: Zhengwei Tao, Zhi Jin, Bincheng Li, Xiaoying Bai, Haiyan Zhao, Chengfeng Dou, Xiancai Chen, Jia Li, Linyu Li, Chongyang Tao

Abstract: Predicting future events stands as one of the ultimate aspirations of artificial intelligence. Recent advances in large language model (LLM)-based systems have shown remarkable potential in forecasting future events, thereby garnering significant interest in the research community. Currently, several benchmarks have been established to evaluate forecasting capabilities by formalizing event prediction as a retrieval-augmented generation (RAG) and reasoning task. In these benchmarks, each prediction question is answered with relevant retrieved news articles. However, because there is no consideration of whether the questions can be supported by valid or sufficient rationales, some of the questions in these benchmarks may be inherently noninferable. To address this issue, we introduce a new benchmark, PROPHET, which comprises inferable forecasting questions paired with relevant news for retrieval. To ensure the inferability of the benchmark, we propose Causal Intervened Likelihood (CIL), a statistical measure that assesses inferability through causal inference. In constructing this benchmark, we first collected recent trend forecasting questions and then filtered the data using CIL, resulting in an inferable benchmark for event prediction. Through extensive experiments, we first demonstrate the validity of CIL and conduct in-depth investigations into event prediction with the aid of CIL. Subsequently, we evaluate several representative prediction systems on PROPHET, drawing valuable insights for future directions.

Submitted 2 April, 2025; originally announced April 2025.

arXiv:2503.23106 [pdf]

A large-scale image-text dataset benchmark for farmland segmentation

Authors: Chao Tao, Dandan Zhong, Weiliang Mu, Zhuofei Du, Haiyang Wu

Abstract: The traditional deep learning paradigm that solely relies on labeled data has limitations in representing the spatial relationships between farmland elements and the surrounding environment. It struggles to effectively model the dynamic temporal evolution and spatial heterogeneity of farmland. Language, as a structured knowledge carrier, can explicitly express the spatiotemporal characteristics of farmland, such as its shape, distribution, and surrounding environmental information. Therefore, a language-driven learning paradigm can effectively alleviate the challenges posed by the spatiotemporal heterogeneity of farmland. However, in the field of remote sensing imagery of farmland, there is currently no comprehensive benchmark dataset to support this research direction. To fill this gap, we introduced language-based descriptions of farmland and developed the FarmSeg-VL dataset, the first fine-grained image-text dataset designed for spatiotemporal farmland segmentation. Firstly, this article proposed a semi-automatic annotation method that can accurately assign a caption to each image, ensuring high data quality and semantic richness while improving the efficiency of dataset construction. Secondly, FarmSeg-VL exhibits significant spatiotemporal characteristics. In terms of the temporal dimension, it covers all four seasons. In terms of the spatial dimension, it covers eight typical agricultural regions across China. In addition, in terms of captions, FarmSeg-VL covers rich spatiotemporal characteristics of farmland, including its inherent properties, phenological characteristics, spatial distribution, topographic and geomorphic features, and the distribution of surrounding environments. Finally, we present a performance analysis of VLMs and of deep learning models that rely solely on labels, all trained on FarmSeg-VL, demonstrating its potential as a standard benchmark for farmland segmentation.

Submitted 29 March, 2025; originally announced March 2025.

arXiv:2503.20170 [pdf, ps, other]

Decomposing a factorial into large factors

Authors: Boris Alexeev, Evan Conway, Matthieu Rosenfeld, Andrew V. Sutherland, Terence Tao, Markus Uhr, Kevin Ventullo

Abstract: Let $t(N)$ denote the largest number such that $N!$ can be expressed as the product of $N$ integers greater than or equal to $t(N)$. The bound $t(N)/N = 1/e-o(1)$ was apparently established in unpublished work of Erdős, Selfridge, and Straus; but the proof is lost. Here we obtain the more precise asymptotic $$ \frac{t(N)}{N} = \frac{1}{e} - \frac{c_0}{\log N} + O\left( \frac{1}{\log^{1+c} N} \right)$$ for an explicit constant $c_0 = 0.30441901\dots$ and some absolute constant $c>0$, answering a question of Erdős and Graham. For the upper bound, a further lower order term in the asymptotic expansion is also obtained. With numerical assistance, we obtain highly precise computations of $t(N)$ for wide ranges of $N$, establishing several explicit conjectures of Guy and Selfridge on this sequence. For instance, we show that $t(N) \geq N/3$ for $N \geq 43632$, with the threshold shown to be best possible.

Submitted 2 June, 2025; v1 submitted 25 March, 2025; originally announced March 2025.

Comments: 79 pages, 18 figures. This is a completely new version, with many stronger results and numerics than before

MSC Class: 11A51
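
The definition of $t(N)$ in the abstract above can be made concrete for very small $N$ with a brute-force search. The sketch below is only an illustration of the definition, not the method used in the paper; the function names `can_split` and `t_of` are hypothetical, and the exhaustive search is practical only for single-digit $N$.

```python
from math import factorial


def can_split(m, k, t):
    """Return True if m can be written as a product of exactly k integers, each >= t.

    Factors are chosen in nondecreasing order, so the smallest remaining
    factor d must satisfy d**k <= m.
    """
    if k == 1:
        return m >= t
    d = t
    while d ** k <= m:
        if m % d == 0 and can_split(m // d, k - 1, d):
            return True
        d += 1
    return False


def t_of(n):
    """Largest t such that n! is a product of n integers, each >= t (brute force)."""
    target = factorial(n)
    t = 1  # always feasible: n! = 1 * 1 * ... * 1 * n!
    while can_split(target, n, t + 1):
        t += 1
    return t


if __name__ == "__main__":
    for n in range(1, 10):
        print(n, t_of(n))
```

For example, $4! = 24 = 2 \cdot 2 \cdot 2 \cdot 3$ gives $t(4) = 2$, while $3^4 = 81 > 24$ rules out $t(4) \geq 3$.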

arXiv:2503.17455 [pdf, other]

doi 10.1051/0004-6361/202553887

Euclid preparation LXX. Forecasting detection limits for intracluster light in the Euclid Wide Survey

Authors: Euclid Collaboration, C. Bellhouse, J. B. Golden-Marx, S. P. Bamford, N. A. Hatch, M. Kluge, A. Ellien, S. L. Ahad, P. Dimauro, F. Durret, A. H. Gonzalez, Y. Jimenez-Teja, M. Montes, M. Sereno, E. Slezak, M. Bolzonella, G. Castignani, O. Cucciati, G. De Lucia, Z. Ghaffari, L. Moscardini, R. Pello, L. Pozzetti, T. Saifollahi, A. S. Borlaff, et al. (270 additional authors not shown)

Abstract: The intracluster light (ICL) permeating galaxy clusters is a tracer of the cluster's assembly history, and potentially a tracer of their dark matter structure. In this work we explore the capability of the Euclid Wide Survey to detect ICL using H-band mock images. We simulate clusters across a range of redshifts (0.3-1.8) and halo masses ($10^{13.9}$-$10^{15.0}$ M$_\odot$), using an observationally motivated model of the ICL. We identify a 50-200 kpc circular annulus around the brightest cluster galaxy (BCG) in which the signal-to-noise ratio (S/N) of the ICL is maximised and use the S/N within this aperture as our figure of merit for ICL detection. We compare three state-of-the-art methods for ICL detection, and find that a method that performs simple aperture photometry after high-surface brightness source masking is able to detect ICL with minimal bias for clusters more massive than $10^{14.2}$ M$_\odot$. The S/N of the ICL detection is primarily limited by the redshift of the cluster, driven by cosmological dimming, rather than the mass of the cluster. Assuming the ICL in each cluster contains 15% of the stellar light, we forecast that Euclid will be able to measure the presence of ICL in up to $\sim80000$ clusters of $>10^{14.2}$ M$_\odot$ between $z=0.3$ and 1.5 with a S/N$>3$. Half of these clusters will reside below $z=0.75$ and the majority of those below $z=0.6$ will be detected with a S/N $>20$. A few thousand clusters at $1.3<z<1.5$ will have ICL detectable with a S/N greater than 3. The surface brightness profile of the ICL model is strongly dependent on both the mass of the cluster and the redshift at which it is observed so the outer ICL is best observed in the most massive clusters of $>10^{14.7}$ M$_\odot$. Euclid will detect the ICL at more than 500 kpc distance from the BCG, up to $z=0.7$, in several hundred of these massive clusters over its large survey volume.

Submitted 21 March, 2025; originally announced March 2025.

Comments: 21 pages, 13 figures, 2 tables. Accepted for publication in A&A

Journal ref: A&A 698, A14 (2025)

arXiv:2503.15635 [pdf, other]

Euclid preparation. Spatially resolved stellar populations of local galaxies with Euclid: a proof of concept using synthetic images with the TNG50 simulation

Authors: Euclid Collaboration, Abdurro'uf, C. Tortora, M. Baes, A. Nersesian, I. Kovačić, M. Bolzonella, A. Lançon, L. Bisigello, F. Annibali, M. N. Bremer, D. Carollo, C. J. Conselice, A. Enia, A. M. N. Ferguson, A. Ferré-Mateu, L. K. Hunt, E. Iodice, J. H. Knapen, A. Iovino, F. R. Marleau, R. F. Peletier, R. Ragusa, M. Rejkuba, A. S. G. Robotham, et al. (264 additional authors not shown)

Abstract: The European Space Agency's Euclid mission will observe approximately 14,000 $\rm{deg}^{2}$ of the extragalactic sky and deliver high-quality imaging for many galaxies. The depth and high spatial resolution of the data will enable a detailed analysis of stellar population properties of local galaxies. In this study, we test our pipeline for spatially resolved SED fitting using synthetic images of Euclid, LSST, and GALEX generated from the TNG50 simulation. We apply our pipeline to 25 local simulated galaxies to recover their resolved stellar population properties. We produce 3 types of data cubes: GALEX + LSST + Euclid, LSST + Euclid, and Euclid-only. We perform the SED fitting tests with two SPS models in a Bayesian framework. Because the age, metallicity, and dust attenuation estimates are biased when applying only classical formulations of flat priors, we examine the effects of additional priors in the forms of mass-age-$Z$ relations, constructed using a combination of empirical and simulated data. Stellar-mass surface densities can be recovered well using any of the 3 data cubes, regardless of the SPS model and prior variations. The new priors then significantly improve the measurements of mass-weighted age and $Z$ compared to results obtained without priors, but they may play an excessive role compared to the data in determining the outcome when no UV data is available. The spatially resolved SED fitting method is powerful for mapping the stellar populations of galaxies with the current abundance of high-quality imaging data. Our study re-emphasizes the gain added by including multiwavelength data from ancillary surveys and the roles of priors in Bayesian SED fitting. With the Euclid data alone, we will be able to generate complete and deep stellar mass maps of galaxies in the local Universe, thus exploiting the telescope's wide field, NIR sensitivity, and high spatial resolution.