Search | arXiv e-print repository

Bootstrapping form factor squared in ${\cal N}=4$ super-Yang-Mills

Authors: Song He, Xiang Li, Jingwen Lin, Jiahao Liu, Kai Yan

Abstract: We propose a bootstrap program for the {\it form factor squared} with operator ${\rm tr}(φ^2)$ in maximally supersymmetric Yang-Mills theory in the planar limit, which plays a central role for perturbative calculations of important physical observables such as energy correlators. The tree-level $N$-point form factor (FF) squared can be obtained by cutting $N$ propagators of a collection of two-point ``master diagrams" at $(N{-}1)$ loops: for $N=3,4,5,6$ there are merely $1, 2, 4, 13$ topologies of such diagrams respectively, and their numerators are strongly constrained by power-counting (including ``no triangle" property) and other constraints such as the ``rung rule". Moreover, these two-point diagrams provide a ``unification" of FF squared at different numbers of loops and legs, which is similar to extracting (planar) amplitude squared from vacuum master diagrams (dual to $f$-graphs): by cutting $2\leq n<N$ propagators, one can also extract the planar integrand of $n$-point FF squared at $(N-n)$ loops, thus our results automatically include integrands of 2-point (Sudakov) FF up to four loops (where the squaring is trivial), 3-point FF squared up to three loops, and so on. Our ansatz is completely fixed using soft limits of (tree and loop) FF squared and the multi-collinear limit which reduces it to the splitting function, without any other inputs such as unitarity cuts.
This method opens up the exciting possibility of a {\it graphical bootstrap} for FF squared for higher $N$ (which contains {\it e.g.} planar Sudakov FF to $N{-}2$ loops) similar to that for the amplitude squared via $f$-graphs. We also comment on applications to the computation of leading order energy correlators where new structures are expected after performing phase-space integrations.

Submitted 9 June, 2025; originally announced June 2025.

Comments: 32 pages + appendix and refs, 2 tables, many figures and attached with ancillary files

arXiv:2506.06970 [pdf, other]

Guiding Cross-Modal Representations with MLLM Priors via Preference Alignment

Authors: Pengfei Zhao, Rongbo Luan, Wei Zhang, Peng Wu, Sifeng He

Abstract: Despite Contrastive Language-Image Pretraining (CLIP)'s remarkable capability to retrieve content across modalities, a substantial modality gap persists in its feature space. Intriguingly, we discover that off-the-shelf MLLMs (Multimodal Large Language Models) demonstrate powerful inherent modality alignment properties. While recent MLLM-based retrievers with unified architectures partially mitigate this gap, their reliance on coarse modality alignment mechanisms fundamentally limits their potential. In this work, we introduce MAPLE (Modality-Aligned Preference Learning for Embeddings), a novel framework that leverages the fine-grained alignment priors inherent in MLLMs to guide cross-modal representation learning. MAPLE formulates the learning process as reinforcement learning with two key components: (1) automatic preference data construction using an off-the-shelf MLLM, and (2) a new Relative Preference Alignment (RPA) loss, which adapts Direct Preference Optimization (DPO) to the embedding learning setting. Experimental results show that our preference-guided alignment achieves substantial gains in fine-grained cross-modal retrieval, underscoring its effectiveness in handling nuanced semantic distinctions.

Submitted 7 June, 2025; originally announced June 2025.

arXiv:2506.06825 [pdf, ps, other]

Identity Deepfake Threats to Biometric Authentication Systems: Public and Expert Perspectives

Authors: Shijing He, Yaxiong Lei, Zihan Zhang, Yuzhou Sun, Shujun Li, Chi Zhang, Juan Ye

Abstract: Generative AI (Gen-AI) deepfakes pose a rapidly evolving threat to biometric authentication, yet a significant gap exists between expert understanding of these risks and public perception. This disconnection creates critical vulnerabilities in systems trusted by millions. To bridge this gap, we conducted a comprehensive mixed-method study, surveying 408 professionals across key sectors and conducting in-depth interviews with 37 participants (25 experts, 12 general public [non-experts]). Our findings reveal a paradox: while the public increasingly relies on biometrics for convenience, experts express grave concerns about the spoofing of static modalities like face and voice recognition. We found significant demographic and sector-specific divides in awareness and trust, with finance professionals, for example, showing heightened skepticism. To systematically analyze these threats, we introduce a novel Deepfake Kill Chain model, adapted from Hutchins et al.'s cybersecurity frameworks to map the specific attack vectors used by malicious actors against biometric systems. Based on this model and our empirical findings, we propose a tri-layer mitigation framework that prioritizes dynamic biometric signals (e.g., eye movements), robust privacy-preserving data governance, and targeted educational initiatives. This work provides the first empirically grounded roadmap for defending against AI-generated identity threats by aligning technical safeguards with human-centered insights.

Submitted 7 June, 2025; originally announced June 2025.

MSC Class: 68T10; 68T45; 68M25 ACM Class: I.4.9; I.5.4; K.4.1; K.6.5

arXiv:2506.06591 [pdf, ps, other]

Privacy Perspectives and Practices of Chinese Smart Home Product Teams

Authors: Shijing He, Yaxiong Lei, Xiao Zhan, Chi Zhang, Juan Ye, Ruba Abu-Salma, Jose Such

Abstract: Previous research has explored the privacy needs and concerns of device owners, primary users, and different bystander groups with regard to smart home devices like security cameras, smart speakers, and hubs, but little is known about the privacy views and practices of smart home product teams, particularly those in non-Western contexts. This paper presents findings from 27 semi-structured interviews with Chinese smart home product team members, including product/project managers, software/hardware engineers, user experience (UX) designers, legal/privacy experts, and marketers/operation specialists. We examine their privacy perspectives, practices, and risk mitigation strategies. Our results show that participants emphasized compliance with Chinese data privacy laws, which typically prioritized national security over individual privacy rights. China-specific cultural, social, and legal factors also influenced participants' ethical considerations and attitudes toward balancing user privacy and security with convenience. Drawing on our findings, we propose a set of recommendations for smart home product teams, along with socio-technical and legal interventions to address smart home privacy issues, especially those affecting at-risk groups, in Chinese multi-user smart homes.

Submitted 6 June, 2025; originally announced June 2025.

arXiv:2506.04957 [pdf, ps, other]

The asymptotics of the $\mathrm{SL}_2(\mathbb{C})$-Hitchin metric on the singular locus: subintegrable systems

Authors: Siqi He, Johannes Horn, Nianzi Li

Abstract: We study the asymptotic hyperkähler geometry of the $\mathrm{SL}_2(\mathbb{C})$-Hitchin moduli space over the singular fibers of the Hitchin fibration. We extend the previously known exponential convergence results for solutions to the Hitchin equation to the class of locally fiducial Higgs bundles defined by a special local description at the singularities of the spectral curve. This condition is satisfied by the Higgs bundles contained in certain subintegrable systems introduced by Hitchin. We prove that the restriction of the hyperkähler metric to the subintegrable system converges exponentially fast to the corresponding semi-flat metric along a ray $(\mathcal{E},t\varphi)$. This answers a question posed by Hitchin in \cite{Hitchin2021subintegrable_special_Kaehler}. More generally, we prove that for each stratum of quadratic differentials there is a closed subset of the corresponding Hitchin fibers, such that the restricted hyperkähler metric converges to a generalized semi-flat metric.

Submitted 5 June, 2025; originally announced June 2025.

Comments: 41 pages

MSC Class: 53C26; 53C07

arXiv:2506.03922 [pdf, ps, other]

HSSBench: Benchmarking Humanities and Social Sciences Ability for Multimodal Large Language Models

Authors: Zhaolu Kang, Junhao Gong, Jiaxu Yan, Wanke Xia, Yian Wang, Ziwen Wang, Huaxuan Ding, Zhuo Cheng, Wenhao Cao, Zhiyuan Feng, Siqi He, Shannan Yan, Junzhe Chen, Xiaomin He, Chaoya Jiang, Wei Ye, Kaidong Yu, Xuelong Li

Abstract: Multimodal Large Language Models (MLLMs) have demonstrated significant potential to advance a broad range of domains. However, current benchmarks for evaluating MLLMs primarily emphasize general knowledge and vertical step-by-step reasoning typical of STEM disciplines, while overlooking the distinct needs and potential of the Humanities and Social Sciences (HSS). Tasks in the HSS domain require more horizontal, interdisciplinary thinking and a deep integration of knowledge across related fields, which presents unique challenges for MLLMs, particularly in linking abstract concepts with corresponding visual representations. Addressing this gap, we present HSSBench, a dedicated benchmark designed to assess the capabilities of MLLMs on HSS tasks in multiple languages, including the six official languages of the United Nations. We also introduce a novel data generation pipeline tailored for HSS scenarios, in which multiple domain experts and automated agents collaborate to generate and iteratively refine each sample. HSSBench contains over 13,000 meticulously designed samples, covering six key categories. We benchmark more than 20 mainstream MLLMs on HSSBench and demonstrate that it poses significant challenges even for state-of-the-art models. We hope that this benchmark will inspire further research into enhancing the cross-disciplinary reasoning abilities of MLLMs, especially their capacity to internalize and connect knowledge across fields.

Submitted 4 June, 2025; originally announced June 2025.

arXiv:2506.03543 [pdf, ps, other]

CogniPair: From LLM Chatbots to Conscious AI Agents -- GNWT-Based Multi-Agent Digital Twins for Social Pairing -- Dating & Hiring Applications

Authors: Wanghao Ye, Sihan Chen, Yiting Wang, Shwai He, Bowei Tian, Guoheng Sun, Ziyi Wang, Ziyao Wang, Yexiao He, Zheyu Shen, Meng Liu, Yuning Zhang, Meng Feng, Yang Wang, Siyuan Peng, Yilong Dai, Zhenle Duan, Hanzhang Qin, Ang Li

Abstract: Current large language model (LLM) agents lack authentic human psychological processes necessary for genuine digital twins and social AI applications. To address this limitation, we present a computational implementation of Global Neuronal Workspace Theory (GNWT) that integrates human cognitive architecture principles into LLM agents, creating specialized sub-agents for emotion, memory, social norms, planning, and goal-tracking coordinated through a global workspace mechanism. However, authentic digital twins require accurate personality initialization. We therefore develop a novel adventure-based personality test that evaluates true personality through behavioral choices within interactive scenarios, bypassing self-presentation bias found in traditional assessments. Building on these innovations, our CogniPair platform enables digital twins to engage in realistic simulated dating interactions and job interviews before real encounters, providing bidirectional cultural fit assessment for both romantic compatibility and workplace matching. Validation using 551 GNWT-Agents and the Columbia University Speed Dating dataset demonstrates 72% correlation with human attraction patterns, 77.8% match prediction accuracy, and 74% agreement in human validation studies. This work advances psychological authenticity in LLM agents and establishes a foundation for intelligent dating platforms and HR technology solutions.