
Showing 1–50 of 178 results for author: Dong, R

Searching in archive cs.
  1. arXiv:2509.26127

    cs.CV

    EchoGen: Generating Visual Echoes in Any Scene via Feed-Forward Subject-Driven Auto-Regressive Model

    Authors: Ruixiao Dong, Zhendong Wang, Keli Liu, Li Li, Ying Chen, Kai Li, Daowen Li, Houqiang Li

    Abstract: Subject-driven generation is a critical task in creative AI; yet current state-of-the-art methods present a stark trade-off. They either rely on computationally expensive, per-subject fine-tuning, sacrificing efficiency and zero-shot capability, or employ feed-forward architectures built on diffusion models, which are inherently plagued by slow inference speeds. Visual Auto-Regressive (VAR) models…

    Submitted 30 September, 2025; originally announced September 2025.

  2. arXiv:2509.23687

    eess.SP cs.AI

    Joint Hybrid Beamforming and Artificial Noise Design for Secure Multi-UAV ISAC Networks

    Authors: Runze Dong, Buhong Wang, Cunqian Feng, Jiang Weng, Chen Han, Jiwei Tian

    Abstract: Integrated sensing and communication (ISAC) emerges as a key enabler for next-generation applications such as smart cities and autonomous systems. Its integration with unmanned aerial vehicles (UAVs) unlocks new potentials for reliable communication and precise sensing in dynamic aerial environments. However, existing research predominantly treats UAVs as aerial base stations, overlooking their ro…

    Submitted 28 September, 2025; originally announced September 2025.

  3. arXiv:2509.20447

    astro-ph.EP astro-ph.IM cs.LG

    Neural Networks as Surrogate Solvers for Time-Dependent Accretion Disk Dynamics

    Authors: Shunyuan Mao, Weiqi Wang, Sifan Wang, Ruobing Dong, Lu Lu, Kwang Moo Yi, Paris Perdikaris, Andrea Isella, Sébastien Fabbro, Lile Wang

    Abstract: Accretion disks are ubiquitous in astrophysics, appearing in diverse environments from planet-forming systems to X-ray binaries and active galactic nuclei. Traditionally, modeling their dynamics requires computationally intensive (magneto)hydrodynamic simulations. Recently, Physics-Informed Neural Networks (PINNs) have emerged as a promising alternative. This approach trains neural networks direct…

    Submitted 24 September, 2025; originally announced September 2025.

    Comments: Accepted to Astrophysical Journal Letters; associated animations are available at https://doi.org/10.6084/m9.figshare.30192904

  4. arXiv:2509.17359

    cs.IR

    MLLM-Driven Semantic Identifier Generation for Generative Cross-Modal Retrieval

    Authors: Tianyuan Li, Lei Wang, Ahtamjan Ahmat, Yating Yang, Bo Ma, Rui Dong, Bangju Han

    Abstract: Generative cross-modal retrieval, which treats retrieval as a generation task, has emerged as a promising direction with the rise of Multimodal Large Language Models (MLLMs). In this setting, the model responds to a text query by generating an identifier corresponding to the target image. However, existing methods typically rely on manually crafted string IDs, clustering-based labels, or atomic id…

    Submitted 22 September, 2025; originally announced September 2025.

  5. arXiv:2509.15404

    cs.RO

    Trust-Aware Embodied Bayesian Persuasion for Mixed-Autonomy

    Authors: Shaoting Peng, Katherine Driggs-Campbell, Roy Dong

    Abstract: Safe and efficient interaction between autonomous vehicles (AVs) and human-driven vehicles (HVs) is a critical challenge for future transportation systems. While game-theoretic models capture how AVs influence HVs, they often suffer from a long-term decay of influence and can be perceived as manipulative, eroding the human's trust. This can paradoxically lead to riskier human driving behavior over…

    Submitted 18 September, 2025; originally announced September 2025.

  6. arXiv:2509.10873

    cs.MM

    Automated Radiology Report Generation Based on Topic-Keyword Semantic Guidance

    Authors: Jing Xiao, Hongfei Liu, Ruiqi Dong, Jimin Liu, Haoyong Yu

    Abstract: Automated radiology report generation is essential in clinical practice. However, diagnosing radiological images typically takes physicians 5-10 minutes, resulting in a waste of valuable healthcare resources. Existing studies have not fully leveraged knowledge from historical radiology reports, lacking sufficient and accurate prior information. To address this, we propose a Topic-Keyword Semant…

    Submitted 13 September, 2025; originally announced September 2025.

  7. arXiv:2509.08418

    cond-mat.mtrl-sci cs.LG

    Facet: highly efficient E(3)-equivariant networks for interatomic potentials

    Authors: Nicholas Miklaucic, Lai Wei, Rongzhi Dong, Nihang Fu, Sadman Sadeed Omee, Qingyang Li, Sourin Dey, Victor Fung, Jianjun Hu

    Abstract: Computational materials discovery is limited by the high cost of first-principles calculations. Machine learning (ML) potentials that predict energies from crystal structures are promising, but existing methods face computational bottlenecks. Steerable graph neural networks (GNNs) encode geometry with spherical harmonics, respecting atomic symmetries -- permutation, rotation, and translation -- fo…

    Submitted 10 September, 2025; originally announced September 2025.

  8. arXiv:2509.08199

    cs.CY stat.AP

    Algorithmic Tradeoffs, Applied NLP, and the State-of-the-Art Fallacy

    Authors: AJ Alvero, Ruohong Dong, Klint Kanopka, David Lang

    Abstract: Computational sociology is growing in popularity, yet the analytic tools employed differ widely in power, transparency, and interpretability. In computer science, methods gain popularity after surpassing benchmarks of predictive accuracy, becoming the "state of the art." Computer scientists favor novelty and innovation for different reasons, but prioritizing technical prestige over methodological…

    Submitted 9 September, 2025; originally announced September 2025.

  9. ELEC: Efficient Large Language Model-Empowered Click-Through Rate Prediction

    Authors: Rui Dong, Wentao Ouyang, Xiangzheng Liu

    Abstract: Click-through rate (CTR) prediction plays an important role in online advertising systems. On the one hand, traditional CTR prediction models capture the collaborative signals in tabular data via feature interaction modeling, but they lose semantics in text. On the other hand, Large Language Models (LLMs) excel in understanding the context and meaning behind text, but they face challenges in captu…

    Submitted 9 September, 2025; originally announced September 2025.

    Comments: SIGIR 2025

  10. arXiv:2509.06976

    cs.LG cs.AI

    A Knowledge-Guided Cross-Modal Feature Fusion Model for Local Traffic Demand Prediction

    Authors: Lingyu Zhang, Pengfei Xu, Guobin Wu, Jian Liang, Ruiyang Dong, Yunhai Wang, Xuan Song

    Abstract: Traffic demand prediction plays a critical role in intelligent transportation systems. Existing traffic prediction models primarily rely on temporal traffic data, with limited efforts incorporating human knowledge and experience for urban traffic demand forecasting. However, in real-world scenarios, traffic knowledge and experience derived from human daily life significantly influence precise traf…

    Submitted 29 August, 2025; originally announced September 2025.

  11. arXiv:2509.06499

    cs.CV

    TIDE: Achieving Balanced Subject-Driven Image Generation via Target-Instructed Diffusion Enhancement

    Authors: Jibai Lin, Bo Ma, Yating Yang, Xi Zhou, Rong Ma, Turghun Osman, Ahtamjan Ahmat, Rui Dong, Lei Wang

    Abstract: Subject-driven image generation (SDIG) aims to manipulate specific subjects within images while adhering to textual instructions, a task crucial for advancing text-to-image diffusion models. SDIG requires reconciling the tension between maintaining subject identity and complying with dynamic edit instructions, a challenge inadequately addressed by existing methods. In this paper, we introduce the…

    Submitted 18 September, 2025; v1 submitted 8 September, 2025; originally announced September 2025.

  12. arXiv:2508.12546

    cs.SE

    XAMT: Cross-Framework API Matching for Testing Deep Learning Libraries

    Authors: Bin Duan, Ruican Dong, Naipeng Dong, Dan Dongseong Kim, Guowei Yang

    Abstract: Deep learning powers critical applications such as autonomous driving, healthcare, and finance, where the correctness of underlying libraries is essential. Bugs in widely used deep learning APIs can propagate to downstream systems, causing serious consequences. While existing fuzzing techniques detect bugs through intra-framework testing across hardware backends (CPU vs. GPU), they may miss bugs t…

    Submitted 17 August, 2025; originally announced August 2025.

  13. arXiv:2507.21134

    cs.CL cs.CY cs.LG

    TRIDENT: Benchmarking LLM Safety in Finance, Medicine, and Law

    Authors: Zheng Hui, Yijiang River Dong, Ehsan Shareghi, Nigel Collier

    Abstract: As large language models (LLMs) are increasingly deployed in high-risk domains such as law, finance, and medicine, systematically evaluating their domain-specific safety and compliance becomes critical. While prior work has largely focused on improving LLM performance in these domains, it has often neglected the evaluation of domain-specific safety risks. To bridge this gap, we first define domain…

    Submitted 22 July, 2025; originally announced July 2025.

  14. arXiv:2507.04447

    cs.CV cs.RO

    DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge

    Authors: Wenyao Zhang, Hongsi Liu, Zekun Qi, Yunnan Wang, Xinqiang Yu, Jiazhao Zhang, Runpei Dong, Jiawei He, Fan Lu, He Wang, Zhizheng Zhang, Li Yi, Wenjun Zeng, Xin Jin

    Abstract: Recent advances in vision-language-action (VLA) models have shown promise in integrating image generation with action prediction to improve generalization and reasoning in robot manipulation. However, existing methods are limited to challenging image-based forecasting, which suffers from redundant information and lacks comprehensive and critical world knowledge, including dynamic, spatial and sema…

    Submitted 26 August, 2025; v1 submitted 6 July, 2025; originally announced July 2025.

  15. arXiv:2507.02018

    q-fin.ST cs.AI cs.LG

    NGAT: A Node-level Graph Attention Network for Long-term Stock Prediction

    Authors: Yingjie Niu, Mingchuan Zhao, Valerio Poti, Ruihai Dong

    Abstract: Graph representation learning methods have been widely adopted in financial applications to enhance company representations by leveraging inter-firm relationships. However, current approaches face three key challenges: (1) The advantages of relational information are obscured by limitations in downstream task designs; (2) Existing graph models specifically designed for stock prediction often suffe…

    Submitted 2 July, 2025; originally announced July 2025.

    ACM Class: I.2.1

  16. arXiv:2506.04650

    cs.LG

    Neural Network Reprogrammability: A Unified Theme on Model Reprogramming, Prompt Tuning, and Prompt Instruction

    Authors: Zesheng Ye, Chengyi Cai, Ruijiang Dong, Jianzhong Qi, Lei Feng, Pin-Yu Chen, Feng Liu

    Abstract: As large-scale pre-trained foundation models continue to expand in size and capability, efficiently adapting them to specific downstream tasks has become increasingly critical. Despite substantial progress, existing adaptation approaches have evolved largely in isolation, without a clear understanding of their interrelationships. This survey introduces neural network reprogrammability as a unifyin…

    Submitted 13 June, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

  17. arXiv:2505.24863

    cs.CL

    AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

    Authors: Junyu Zhang, Runpei Dong, Han Wang, Xuying Ning, Haoran Geng, Peihao Li, Xialin He, Yutong Bai, Jitendra Malik, Saurabh Gupta, Huan Zhang

    Abstract: This paper presents AlphaOne ($α$1), a universal framework for modulating reasoning progress in large reasoning models (LRMs) at test time. $α$1 first introduces $α$ moment, which represents the scaled thinking phase with a universal parameter $α$. Within this scaled pre-$α$ moment phase, it dynamically schedules slow thinking transitions by modeling the insertion of reasoning transition tokens as…

    Submitted 30 May, 2025; originally announced May 2025.

  18. arXiv:2505.19141

    math.NT cs.FL

    S-unit equations in modules and linear-exponential Diophantine equations

    Authors: Ruiwen Dong, Doron Shafrir

    Abstract: Let $T$ be a positive integer, and $\mathcal{M}$ be a finitely presented module over the Laurent polynomial ring $\mathbb{Z}_{/T}[X_1^{\pm}, \ldots, X_N^{\pm}]$. We consider S-unit equations over $\mathcal{M}$: these are equations of the form $x_1 m_1 + \cdots + x_K m_K = m_0$, where the variables $x_1, \ldots, x_K$ range over the set of monomials (with coefficient 1) of…

    Submitted 27 May, 2025; v1 submitted 25 May, 2025; originally announced May 2025.

    Comments: 80 pages, corrected spelling mistake for a name

  19. arXiv:2504.11588

    cs.CV cs.AI

    Deep Learning Approaches for Medical Imaging Under Varying Degrees of Label Availability: A Comprehensive Survey

    Authors: Siteng Ma, Honghui Du, Yu An, Jing Wang, Qinqin Wang, Haochang Wu, Aonghus Lawlor, Ruihai Dong

    Abstract: Deep learning has achieved significant breakthroughs in medical imaging, but these advancements are often dependent on large, well-annotated datasets. However, obtaining such datasets poses a significant challenge, as it requires time-consuming and labor-intensive annotations from medical experts. Consequently, there is growing interest in learning paradigms such as incomplete, inexact, and absent…

    Submitted 15 April, 2025; originally announced April 2025.

    Comments: 33 pages, 10 figures, 8 tables. To be submitted to Medical Image Analysis

    MSC Class: 68T07; 68T45; 92C50; 92C55 ACM Class: I.2.10; I.4.5; I.4.6; I.4.9; J.3

  20. arXiv:2504.07165

    cs.CV

    Perception in Reflection

    Authors: Yana Wei, Liang Zhao, Kangheng Lin, En Yu, Yuang Peng, Runpei Dong, Jianjian Sun, Haoran Wei, Zheng Ge, Xiangyu Zhang, Vishal M. Patel

    Abstract: We present a perception in reflection paradigm designed to transcend the limitations of current large vision-language models (LVLMs), which are expected yet often fail to achieve perfect perception initially. Specifically, we propose Reflective Perception (RePer), a dual-model reflection mechanism that systematically alternates between policy and critic models, enables iterative refinement of visu…

    Submitted 9 April, 2025; originally announced April 2025.

  21. arXiv:2503.18948

    cs.CV

    Equivariant Image Modeling

    Authors: Ruixiao Dong, Mengde Xu, Zigang Geng, Li Li, Han Hu, Shuyang Gu

    Abstract: Current generative models, such as autoregressive and diffusion approaches, decompose high-dimensional data distribution learning into a series of simpler subtasks. However, inherent conflicts arise during the joint optimization of these subtasks, and existing solutions fail to resolve such conflicts without sacrificing efficiency or scalability. We propose a novel equivariant image modeling frame…

    Submitted 24 March, 2025; originally announced March 2025.

  22. arXiv:2503.17893

    cs.PF

    Modeling Utilization to Identify Shared-Memory Atomic Bottlenecks

    Authors: Rongcui Dong, Sreepathi Pai

    Abstract: Performance analysis is critical for GPU programs with data-dependent behavior, but models like Roofline are not very useful for them and interpreting raw performance counters is tedious. In this work, we present an analytical model for shared memory atomics (\emph{fetch-and-op} and \emph{compare-and-swap} instructions on NVIDIA Volta and Ampere GPU) that allows users to immediately determine if s…

    Submitted 22 March, 2025; originally announced March 2025.

    Comments: GPGPU 2025

  23. arXiv:2503.10497

    cs.CL

    MMLU-ProX: A Multilingual Benchmark for Advanced Large Language Model Evaluation

    Authors: Weihao Xuan, Rui Yang, Heli Qi, Qingcheng Zeng, Yunze Xiao, Aosong Feng, Dairui Liu, Yun Xing, Junjue Wang, Fan Gao, Jinghui Lu, Yuang Jiang, Huitao Li, Xin Li, Kunyu Yu, Ruihai Dong, Shangding Gu, Yuekang Li, Xiaofei Xie, Felix Juefei-Xu, Foutse Khomh, Osamu Yoshie, Qingyu Chen, Douglas Teodoro, Nan Liu, et al. (7 additional authors not shown)

    Abstract: Existing large language model (LLM) evaluation benchmarks primarily focus on English, while current multilingual tasks lack parallel questions that specifically assess cross-linguistic reasoning abilities. This dual limitation makes it challenging to comprehensively assess LLMs' performance in the multilingual setting. To fill this gap, we introduce MMLU-ProX, a comprehensive benchmark covering 29…

    Submitted 26 May, 2025; v1 submitted 13 March, 2025; originally announced March 2025.

  24. arXiv:2502.19158

    cs.CL cs.AI

    When Personalization Meets Reality: A Multi-Faceted Analysis of Personalized Preference Learning

    Authors: Yijiang River Dong, Tiancheng Hu, Yinhong Liu, Ahmet Üstün, Nigel Collier

    Abstract: While Reinforcement Learning from Human Feedback (RLHF) is widely used to align Large Language Models (LLMs) with human preferences, it typically assumes homogeneous preferences across users, overlooking diverse human values and minority viewpoints. Although personalized preference learning addresses this by tailoring separate preferences for individual users, the field lacks standardized methods…

    Submitted 26 February, 2025; originally announced February 2025.

  25. arXiv:2502.13143

    cs.RO cs.AI cs.CV

    SoFar: Language-Grounded Orientation Bridges Spatial Reasoning and Object Manipulation

    Authors: Zekun Qi, Wenyao Zhang, Yufei Ding, Runpei Dong, Xinqiang Yu, Jingwen Li, Lingyun Xu, Baoyu Li, Xialin He, Guofan Fan, Jiazhao Zhang, Jiawei He, Jiayuan Gu, Xin Jin, Kaisheng Ma, Zhizheng Zhang, He Wang, Li Yi

    Abstract: While spatial reasoning has made progress in object localization relationships, it often overlooks object orientation, a key factor in 6-DoF fine-grained manipulation. Traditional pose representations rely on pre-defined frames or templates, limiting generalization and semantic grounding. In this paper, we introduce the concept of semantic orientation, which defines object orientations using natura…

    Submitted 23 September, 2025; v1 submitted 18 February, 2025; originally announced February 2025.

    Comments: Accepted at NeurIPS 2025 Spotlight

  26. arXiv:2502.12152

    cs.RO cs.LG

    Learning Getting-Up Policies for Real-World Humanoid Robots

    Authors: Xialin He, Runpei Dong, Zixuan Chen, Saurabh Gupta

    Abstract: Automatic fall recovery is a crucial prerequisite before humanoid robots can be reliably deployed. Hand-designing controllers for getting up is difficult because of the varied configurations a humanoid can end up in after a fall and the challenging terrains humanoid robots are expected to operate on. This paper develops a learning framework to produce controllers that enable humanoid robots to get…

    Submitted 27 April, 2025; v1 submitted 17 February, 2025; originally announced February 2025.

    Comments: Robotics: Science and Systems (RSS), 2025. Project page: https://humanoid-getup.github.io/

  27. arXiv:2502.06221

    cs.RO

    Interaction-aware Conformal Prediction for Crowd Navigation

    Authors: Zhe Huang, Tianchen Ji, Heling Zhang, Fatemeh Cheraghi Pouria, Katherine Driggs-Campbell, Roy Dong

    Abstract: During crowd navigation, the robot motion plan needs to consider human motion uncertainty, and the human motion uncertainty is dependent on the robot motion plan. We introduce Interaction-aware Conformal Prediction (ICP) to alternate uncertainty-aware robot motion planning and decision-dependent human motion uncertainty quantification. ICP is composed of a trajectory predictor to predict human traject…

    Submitted 10 February, 2025; originally announced February 2025.

    Comments: Accepted by WAFR 2024

  28. arXiv:2501.12389

    cs.CV

    Taming Teacher Forcing for Masked Autoregressive Video Generation

    Authors: Deyu Zhou, Quan Sun, Yuang Peng, Kun Yan, Runpei Dong, Duomin Wang, Zheng Ge, Nan Duan, Xiangyu Zhang, Lionel M. Ni, Heung-Yeung Shum

    Abstract: We introduce MAGI, a hybrid video generation framework that combines masked modeling for intra-frame generation with causal modeling for next-frame generation. Our key innovation, Complete Teacher Forcing (CTF), conditions masked frames on complete observation frames rather than masked ones (namely Masked Teacher Forcing, MTF), enabling a smooth transition from token-level (patch-level) to frame-l…

    Submitted 21 January, 2025; originally announced January 2025.

    Comments: 12 pages, 9 figures

  29. arXiv:2411.10130

    cs.CV

    Towards Multi-View Consistent Style Transfer with One-Step Diffusion via Vision Conditioning

    Authors: Yushen Zuo, Jun Xiao, Kin-Chung Chan, Rongkang Dong, Cuixin Yang, Zongqi He, Hao Xie, Kin-Man Lam

    Abstract: The stylization of 3D scenes is an increasingly attractive topic in 3D vision. Although image style transfer has been extensively researched with promising results, directly applying 2D style transfer methods to 3D scenes often fails to preserve the structural and multi-view properties of 3D environments, resulting in unpleasant distortions in images from different viewpoints. To address these iss…

    Submitted 15 November, 2024; originally announced November 2024.

    Comments: Accepted by ECCV 2024 AI for Visual Arts Workshop and Challenges, 18 pages, 7 figures

  30. arXiv:2410.15311

    cs.AI cs.CL cs.CY

    Who is Undercover? Guiding LLMs to Explore Multi-Perspective Team Tactic in the Game

    Authors: Ruiqi Dong, Zhixuan Liao, Guangwei Lai, Yuhan Ma, Danni Ma, Chenyou Fan

    Abstract: Large Language Models (LLMs) are pivotal AI agents in complex tasks but still face challenges in open decision-making problems within complex scenarios. To address this, we use the language logic game ``Who is Undercover?'' (WIU) as an experimental platform to propose the Multi-Perspective Team Tactic (MPTT) framework. MPTT aims to cultivate LLMs' human-like language expression logic, multi-dimens…

    Submitted 20 October, 2024; originally announced October 2024.

  31. arXiv:2410.13125

    cs.IR

    Transformers4NewsRec: A Transformer-based News Recommendation Framework

    Authors: Dairui Liu, Honghui Du, Boming Yang, Neil Hurley, Aonghus Lawlor, Irene Li, Derek Greene, Ruihai Dong

    Abstract: Pre-trained transformer models have shown great promise in various natural language processing tasks, including personalized news recommendations. To harness the power of these models, we introduce Transformers4NewsRec, a new Python framework built on the \textbf{Transformers} library. This framework is designed to unify and compare the performance of various news recommendation models, including…

    Submitted 16 October, 2024; originally announced October 2024.

  32. arXiv:2410.07952

    cs.GT eess.SY math.OC

    Eco-driving Incentive Mechanisms for Mitigating Emissions in Urban Transportation

    Authors: M. Umar B. Niazi, Jung-Hoon Cho, Munther A. Dahleh, Roy Dong, Cathy Wu

    Abstract: This paper proposes incentive mechanisms that promote eco-driving in transportation networks with the over-arching objective of minimizing emissions. The transportation system operator provides the drivers with energy-efficient driving guidance throughout their trips, and their eco-driving levels are measured by how closely they follow this guidance via vehicle telematics. Drivers choose their eco…

    Submitted 10 October, 2024; originally announced October 2024.

  33. arXiv:2410.07216

    q-fin.ST cs.AI cs.LG

    Evaluating Financial Relational Graphs: Interpretation Before Prediction

    Authors: Yingjie Niu, Lanxin Lu, Rian Dolphin, Valerio Poti, Ruihai Dong

    Abstract: Accurate and robust stock trend forecasting has been a crucial and challenging task, as stock price changes are influenced by multiple factors. Graph neural network-based methods have recently achieved remarkable success in this domain by constructing stock relationship graphs that reflect internal factors and relationships between stocks. However, most of these methods rely on predefined factors…

    Submitted 28 September, 2024; originally announced October 2024.

    Comments: Accepted by 2024 ACM International Conference on AI in Finance

    ACM Class: I.2.4

  34. arXiv:2410.04905

    math.GR cs.FL math.LO

    Equations in wreath products

    Authors: Laurent Bartholdi, Ruiwen Dong, Leon Pernak, Jan Philipp Wächter

    Abstract: We survey solvability of equations in wreath products of groups, and prove that the quadratic diophantine problem is solvable in wreath products of Abelian groups. We consider the related question of determining commutator width, and prove that the quadratic diophantine problem is also solvable in Baumslag's finitely presented metabelian group. This text is a short version of an extensive article…

    Submitted 7 October, 2024; originally announced October 2024.

  35. arXiv:2409.17228

    astro-ph.EP cs.AI cs.LG

    Disk2Planet: A Robust and Automated Machine Learning Tool for Parameter Inference in Disk-Planet Systems

    Authors: Shunyuan Mao, Ruobing Dong, Kwang Moo Yi, Lu Lu, Sifan Wang, Paris Perdikaris

    Abstract: We introduce Disk2Planet, a machine learning-based tool to infer key parameters in disk-planet systems from observed protoplanetary disk structures. Disk2Planet takes as input the disk structures in the form of two-dimensional density and velocity maps, and outputs disk and planet properties, that is, the Shakura--Sunyaev viscosity, the disk aspect ratio, the planet--star mass ratio, and the plane…

    Submitted 25 September, 2024; originally announced September 2024.

    Comments: Accepted to ApJ

  36. arXiv:2409.15045

    cs.CV

    AIM 2024 Sparse Neural Rendering Challenge: Methods and Results

    Authors: Michal Nazarczuk, Sibi Catley-Chandar, Thomas Tanay, Richard Shaw, Eduardo Pérez-Pellitero, Radu Timofte, Xing Yan, Pan Wang, Yali Guo, Yongxin Wu, Youcheng Cai, Yanan Yang, Junting Li, Yanghong Zhou, P. Y. Mok, Zongqi He, Zhe Xiao, Kin-Chung Chan, Hana Lebeta Goshu, Cuixin Yang, Rongkang Dong, Jun Xiao, Kin-Man Lam, Jiayao Hao, Qiong Gao, et al. (5 additional authors not shown)

    Abstract: This paper reviews the challenge on Sparse Neural Rendering that was part of the Advances in Image Manipulation (AIM) workshop, held in conjunction with ECCV 2024. This manuscript focuses on the competition set-up, the proposed methods and their respective results. The challenge aims at producing novel camera view synthesis of diverse scenes from sparse image observations. It is composed of two tr…

    Submitted 23 September, 2024; originally announced September 2024.

    Comments: Part of Advances in Image Manipulation workshop at ECCV 2024

  37. arXiv:2409.13259

    q-bio.MN cs.AI

    A generalizable framework for unlocking missing reactions in genome-scale metabolic networks using deep learning

    Authors: Xiaoyi Liu, Hongpeng Yang, Chengwei Ai, Ruihan Dong, Yijie Ding, Qianqian Yuan, Jijun Tang, Fei Guo

    Abstract: Incomplete knowledge of metabolic processes hinders the accuracy of GEnome-scale Metabolic models (GEMs), which in turn impedes advancements in systems biology and metabolic engineering. Existing gap-filling methods typically rely on phenotypic data to minimize the disparity between computational predictions and experimental results. However, there is still a lack of an automatic and precise gap-f…

    Submitted 20 September, 2024; originally announced September 2024.

  38. arXiv:2409.12396

    cs.CY cs.AI

    ARTAI: An Evaluation Platform to Assess Societal Risk of Recommender Algorithms

    Authors: Qin Ruan, Jin Xu, Ruihai Dong, Arjumand Younus, Tai Tan Mai, Barry O'Sullivan, Susan Leavy

    Abstract: Societal risk emanating from how recommender algorithms disseminate content online is now well documented. Emergent regulation aims to mitigate this risk through ethical audits and enabling new research on the social impact of algorithms. However, there is currently a need for tools and methods that enable such evaluation. This paper presents ARTAI, an evaluation environment that enables large-sca…

    Submitted 18 September, 2024; originally announced September 2024.

    Comments: 3 pages, 1 figure, accepted at FAccTRec 2024 Workshop, RecSys 2024

    ACM Class: H.3.3; I.2.7; I.5.1

  39. arXiv:2409.07077

    math.GR cs.FL math.NT

    Submonoid Membership in n-dimensional lamplighter groups and S-unit equations

    Authors: Ruiwen Dong

    Abstract: We show that Submonoid Membership is decidable in n-dimensional lamplighter groups $(\mathbb{Z}/p\mathbb{Z}) \wr \mathbb{Z}^n$ for any prime $p$ and integer $n$. More generally, we show decidability of Submonoid Membership in semidirect products of the form $\mathcal{Y} \rtimes \mathbb{Z}^n$, where $\mathcal{Y}$ is any finitely presented module over the Laurent polynomial ring…

    Submitted 27 May, 2025; v1 submitted 11 September, 2024; originally announced September 2024.

    Comments: Full version of conference paper at ICALP'25

  40. arXiv:2408.11567

    cs.CV

    Positional Prompt Tuning for Efficient 3D Representation Learning

    Authors: Shaochen Zhang, Zekun Qi, Runpei Dong, Xiuxiu Bai, Xing Wei

    Abstract: We rethink the role of positional encoding in 3D representation learning and fine-tuning. We argue that using positional encoding in point Transformer-based methods serves to aggregate multi-scale features of point clouds. Additionally, we explore parameter-efficient fine-tuning (PEFT) through the lens of prompts and adapters, introducing a straightforward yet effective method called PPT for point…

    Submitted 23 September, 2025; v1 submitted 21 August, 2024; originally announced August 2024.

    Comments: Accepted at ACMMM 2025 Oral

  41. arXiv:2408.09460

    cs.CV

    Fine-Grained Building Function Recognition from Street-View Images via Geometry-Aware Semi-Supervised Learning

    Authors: Weijia Li, Jinhua Yu, Dairong Chen, Yi Lin, Runmin Dong, Xiang Zhang, Conghui He, Haohuan Fu

    Abstract: In this work, we propose a geometry-aware semi-supervised framework for fine-grained building function recognition, utilizing geometric relationships among multi-source data to enhance pseudo-label accuracy in semi-supervised learning, broadening its applicability to various building function categorization systems. Firstly, we design an online semi-supervised pre-training stage, which facilitates…

    Submitted 8 September, 2024; v1 submitted 18 August, 2024; originally announced August 2024.

    Comments: This paper is currently under review

  42. arXiv:2408.07527

    cs.CV cs.AI

    Evidential Graph Contrastive Alignment for Source-Free Blending-Target Domain Adaptation

    Authors: Juepeng Zheng, Yibin Wen, Jinxiao Zhang, Runmin Dong, Haohuan Fu

    Abstract: In this paper, we first tackle a more realistic Domain Adaptation (DA) setting: Source-Free Blending-Target Domain Adaptation (SF-BTDA), where we cannot access source domain data while facing multiple mixed target domains without any prior domain labels. Compared to existing DA scenarios, SF-BTDA generally faces the co-existence of different label shifts in different targets, along with n…

    Submitted 25 August, 2024; v1 submitted 14 August, 2024; originally announced August 2024.

  43. arXiv:2407.18645  [pdf, other]

    cs.LG q-fin.ST

    Contrastive Learning of Asset Embeddings from Financial Time Series

    Authors: Rian Dolphin, Barry Smyth, Ruihai Dong

    Abstract: Representation learning has emerged as a powerful paradigm for extracting valuable latent features from complex, high-dimensional data. In financial domains, learning informative representations for assets can be used for tasks like sector classification and risk management. However, the complex and stochastic nature of financial markets poses unique challenges. We propose a novel contrastive lea…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: 9 pages, 4 figures, 4 tables

  44. FedUD: Exploiting Unaligned Data for Cross-Platform Federated Click-Through Rate Prediction

    Authors: Wentao Ouyang, Rui Dong, Ri Tao, Xiangzheng Liu

    Abstract: Click-through rate (CTR) prediction plays an important role in online advertising platforms. Most existing methods use data from the advertising platform itself for CTR prediction. As user behaviors also exist on many other platforms, e.g., media platforms, it is beneficial to further exploit such complementary information to better model user interest and improve CTR prediction performa…

    Submitted 25 July, 2024; originally announced July 2024.

  45. arXiv:2407.05352  [pdf, other]

    cs.CV cs.MM

    Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model

    Authors: Danni Yang, Ruohan Dong, Jiayi Ji, Yiwei Ma, Haowei Wang, Xiaoshuai Sun, Rongrong Ji

    Abstract: Recently, diffusion models have increasingly demonstrated their capabilities in vision understanding. By leveraging prompt-based learning to construct sentences, these models have shown proficiency in classification and visual grounding tasks. However, existing approaches primarily showcase their ability to perform sentence-level localization, leaving the potential for leveraging contextual inform…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  46. arXiv:2406.16855  [pdf, other]

    cs.CV

    DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

    Authors: Yuang Peng, Yuxin Cui, Haomiao Tang, Zekun Qi, Runpei Dong, Jing Bai, Chunrui Han, Zheng Ge, Xiangyu Zhang, Shu-Tao Xia

    Abstract: Personalized image generation holds great promise in assisting humans in everyday work and life due to its impressive ability to creatively generate personalized content across various contexts. However, current evaluations are either automated but misaligned with humans or require human evaluations that are time-consuming and expensive. In this work, we present DreamBench++, a human-aligned benchma…

    Submitted 8 March, 2025; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: ICLR 2025, Project page: https://dreambenchplus.github.io/

  47. arXiv:2406.16439  [pdf, ps, other]

    cs.CV

    Exploring Test-Time Adaptation for Object Detection in Continually Changing Environments

    Authors: Shilei Cao, Juepeng Zheng, Yan Liu, Baoquan Zhao, Ziqi Yuan, Weijia Li, Runmin Dong, Haohuan Fu

    Abstract: Real-world application models are commonly deployed in dynamic environments, where the target domain distribution undergoes temporal changes. Continual Test-Time Adaptation (CTTA) has recently emerged as a promising technique to gradually adapt a source-trained model to continually changing target domains. Despite recent advancements in addressing CTTA, two critical issues remain: 1) Fixed thresho…

    Submitted 11 June, 2025; v1 submitted 24 June, 2024; originally announced June 2024.

  48. arXiv:2406.11657  [pdf, other]

    cs.CL cs.CY

    Can LLM be a Personalized Judge?

    Authors: Yijiang River Dong, Tiancheng Hu, Nigel Collier

    Abstract: Ensuring that large language models (LLMs) reflect diverse user values and preferences is crucial as their user bases expand globally. It is therefore encouraging to see the growing interest in LLM personalization within the research community. However, current works often rely on the LLM-as-a-Judge approach for evaluation without thoroughly examining its validity. In this paper, we investigate th…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Our code is available at https://github.com/dong-river/Personalized-Judge

  49. arXiv:2406.10869  [pdf, other]

    eess.IV cs.CV

    Geometric Distortion Guided Transformer for Omnidirectional Image Super-Resolution

    Authors: Cuixin Yang, Rongkang Dong, Jun Xiao, Cong Zhang, Kin-Man Lam, Fei Zhou, Guoping Qiu

    Abstract: As virtual and augmented reality applications gain popularity, omnidirectional image (ODI) super-resolution has become increasingly important. Unlike 2D plain images that are formed on a plane, ODIs are projected onto spherical surfaces. Applying established image super-resolution methods to ODIs, therefore, requires performing equirectangular projection (ERP) to map the ODIs onto a plane. ODI sup…

    Submitted 16 January, 2025; v1 submitted 16 June, 2024; originally announced June 2024.

    Comments: 13 pages, 12 figures, journal

  50. arXiv:2406.08480  [pdf, ps, other]

    cs.SC cs.LO math.AC math.GR

    Linear equations with monomial constraints and decision problems in abelian-by-cyclic groups

    Authors: Ruiwen Dong

    Abstract: We show that it is undecidable whether a system of linear equations over the Laurent polynomial ring $\mathbb{Z}[X^{\pm}]$ admits solutions where a specified subset of variables take values in the set of monomials $\{X^z \mid z \in \mathbb{Z}\}$. In particular, we construct a finitely presented $\mathbb{Z}[X^{\pm}]$-module, where it is undecidable whether a linear equation…

    Submitted 6 September, 2024; v1 submitted 12 June, 2024; originally announced June 2024.

    Comments: Corrected an error in Lemma 6.8. Supersedes arXiv:2309.08811