Showing 1–50 of 110 results for author: Fan, F

Searching in archive cs.
  1. arXiv:2506.15715  [pdf, ps, other]

    cs.LG cs.AI

    NeuronSeek: On Stability and Expressivity of Task-driven Neurons

    Authors: Hanyu Pei, Jing-Xiao Liao, Qibin Zhao, Ting Gao, Shijun Zhang, Xiaoge Zhang, Feng-Lei Fan

    Abstract: Drawing inspiration from our human brain that designs different neurons for different tasks, recent advances in deep learning have explored modifying a network's neurons to develop so-called task-driven neurons. Prototyping task-driven neurons (referred to as NeuronSeek) employs symbolic regression (SR) to discover the optimal neuron formulation and construct a network from these optimized neurons…

    Submitted 31 May, 2025; originally announced June 2025.

    Comments: 14 pages, 10 figures

  2. arXiv:2506.12103  [pdf, other]

    cs.AI cs.CY cs.LG

    The Amazon Nova Family of Models: Technical Report and Model Card

    Authors: Amazon AGI, Aaron Langford, Aayush Shah, Abhanshu Gupta, Abhimanyu Bhatter, Abhinav Goyal, Abhinav Mathur, Abhinav Mohanty, Abhishek Kumar, Abhishek Sethi, Abi Komma, Abner Pena, Achin Jain, Adam Kunysz, Adam Opyrchal, Adarsh Singh, Aditya Rawal, Adok Achar Budihal Prasad, Adrià de Gispert, Agnika Kumar, Aishwarya Aryamane, Ajay Nair, Akilan M, Akshaya Iyengar, Akshaya Vishnu Kudlu Shanbhogue, et al. (761 additional authors not shown)

    Abstract: We present Amazon Nova, a new generation of state-of-the-art foundation models that deliver frontier intelligence and industry-leading price performance. Amazon Nova Pro is a highly-capable multimodal model with the best combination of accuracy, speed, and cost for a wide range of tasks. Amazon Nova Lite is a low-cost multimodal model that is lightning fast for processing images, video, documents…

    Submitted 17 March, 2025; originally announced June 2025.

    Comments: 48 pages, 10 figures

    Report number: 20250317

  3. arXiv:2506.05359  [pdf, ps, other]

    q-fin.ST cs.CR

    Enhancing Meme Token Market Transparency: A Multi-Dimensional Entity-Linked Address Analysis for Liquidity Risk Evaluation

    Authors: Qiangqiang Liu, Qian Huang, Frank Fan, Haishan Wu, Xueyan Tang

    Abstract: Meme tokens represent a distinctive asset class within the cryptocurrency ecosystem, characterized by high community engagement, significant market volatility, and heightened vulnerability to market manipulation. This paper introduces an innovative approach to assessing liquidity risk in meme token markets using entity-linked address identification techniques. We propose a multi-dimensional method…

    Submitted 22 May, 2025; originally announced June 2025.

    Comments: IEEE International Conference on Blockchain and Cryptocurrency (Proc. IEEE ICBC 2025)

  4. arXiv:2505.11642  [pdf, other]

    cs.MA cs.AI cs.LG

    PeerGuard: Defending Multi-Agent Systems Against Backdoor Attacks Through Mutual Reasoning

    Authors: Falong Fan, Xi Li

    Abstract: Multi-agent systems leverage advanced AI models as autonomous agents that interact, cooperate, or compete to complete complex tasks across applications such as robotics and traffic management. Despite their growing importance, safety in multi-agent systems remains largely underexplored, with most research focusing on single AI models rather than interacting agents. This work investigates backdoor…

    Submitted 27 May, 2025; v1 submitted 16 May, 2025; originally announced May 2025.

    Comments: This paper has been accepted to IEEE IRI 2025

  5. arXiv:2505.10367  [pdf, ps, other]

    eess.SY cs.LG

    A Hybrid Strategy for Aggregated Probabilistic Forecasting and Energy Trading in HEFTCom2024

    Authors: Chuanqing Pu, Feilong Fan, Nengling Tai, Songyuan Liu, Jinming Yu

    Abstract: Obtaining accurate probabilistic energy forecasts and making effective decisions amid diverse uncertainties are routine challenges in future energy systems. This paper presents the solution of team GEB, which ranked 3rd in trading, 4th in forecasting, and 1st among student teams in the IEEE Hybrid Energy Forecasting and Trading Competition 2024 (HEFTCom2024). The solution provides accurate probabi…

    Submitted 15 May, 2025; originally announced May 2025.

    Comments: Solution description of IEEE Hybrid Energy Forecasting and Trading Competition (HEFTCom)

  6. arXiv:2505.09313  [pdf, ps, other]

    cs.CR cs.LG

    Detecting Sybil Addresses in Blockchain Airdrops: A Subgraph-based Feature Propagation and Fusion Approach

    Authors: Qiangqiang Liu, Qian Huang, Frank Fan, Haishan Wu, Xueyan Tang

    Abstract: Sybil attacks pose a significant security threat to blockchain ecosystems, particularly in token airdrop events. This paper proposes a novel sybil address identification method based on subgraph feature extraction lightGBM. The method first constructs a two-layer deep transaction subgraph for each address, then extracts key event operation features according to the lifecycle of sybil addresses, in…

    Submitted 14 May, 2025; originally announced May 2025.

    Comments: IEEE International Conference on Blockchain and Cryptocurrency (Proc. IEEE ICBC 2025)

  7. arXiv:2504.11953  [pdf, other]

    eess.IV cs.CV

    Novel-view X-ray Projection Synthesis through Geometry-Integrated Deep Learning

    Authors: Daiqi Liu, Fuxin Fan, Andreas Maier

    Abstract: X-ray imaging plays a crucial role in the medical field, providing essential insights into the internal anatomy of patients for diagnostics, image-guided procedures, and clinical decision-making. Traditional techniques often require multiple X-ray projections from various angles to obtain a comprehensive view, leading to increased radiation exposure and more complex clinical processes. This paper…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: 6 pages, 3 figures, 1 table

  8. A Category-Fragment Segmentation Framework for Pelvic Fracture Segmentation in X-ray Images

    Authors: Daiqi Liu, Fuxin Fan, Andreas Maier

    Abstract: Pelvic fractures, often caused by high-impact trauma, frequently require surgical intervention. Imaging techniques such as CT and 2D X-ray imaging are used to transfer the surgical plan to the operating room through image registration, enabling quick intraoperative adjustments. Specifically, segmenting pelvic fractures from 2D X-ray imaging can assist in accurately positioning bone fragments and g…

    Submitted 16 April, 2025; originally announced April 2025.

    Comments: 5 pages, 2 figures, 1 table

  9. arXiv:2504.11511  [pdf, ps, other]

    cs.LG cs.AI

    Position Paper: Rethinking Privacy in RL for Sequential Decision-making in the Age of LLMs

    Authors: Flint Xiaofeng Fan, Cheston Tan, Roger Wattenhofer, Yew-Soon Ong

    Abstract: The rise of reinforcement learning (RL) in critical real-world applications demands a fundamental rethinking of privacy in AI systems. Traditional privacy frameworks, designed to protect isolated data points, fall short for sequential decision-making systems where sensitive information emerges from temporal patterns, behavioral strategies, and collaborative dynamics. Modern RL paradigms, such as f…

    Submitted 18 June, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

    Comments: IJCNN 2025 Position Paper Track

  10. arXiv:2504.02382  [pdf, other]

    eess.IV cs.AI cs.CV

    Benchmark of Segmentation Techniques for Pelvic Fracture in CT and X-ray: Summary of the PENGWIN 2024 Challenge

    Authors: Yudi Sang, Yanzhen Liu, Sutuke Yibulayimu, Yunning Wang, Benjamin D. Killeen, Mingxu Liu, Ping-Cheng Ku, Ole Johannsen, Karol Gotkowski, Maximilian Zenk, Klaus Maier-Hein, Fabian Isensee, Peiyan Yue, Yi Wang, Haidong Yu, Zhaohong Pan, Yutong He, Xiaokun Liang, Daiqi Liu, Fuxin Fan, Artur Jurgas, Andrzej Skalski, Yuxi Ma, Jing Yang, Szymon Płotka, et al. (11 additional authors not shown)

    Abstract: The segmentation of pelvic fracture fragments in CT and X-ray images is crucial for trauma diagnosis, surgical planning, and intraoperative guidance. However, accurately and efficiently delineating the bone fragments remains a significant challenge due to complex anatomy and imaging limitations. The PENGWIN challenge, organized as a MICCAI 2024 satellite event, aimed to advance automated fracture…

    Submitted 3 April, 2025; originally announced April 2025.

    Comments: PENGWIN 2024 Challenge Report

  11. arXiv:2502.04722  [pdf, other]

    cs.SD cs.LG eess.AS

    Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody Features

    Authors: Wei Chen, Binzhu Sha, Jing Yang, Zhuo Wang, Fan Fan, Zhiyong Wu

    Abstract: Melody preservation is crucial in singing voice conversion (SVC). However, in many scenarios, audio is often accompanied with background music (BGM), which can cause audio distortion and interfere with the extraction of melody and other key features, significantly degrading SVC performance. Previous methods have attempted to address this by using more robust neural network-based melody extractors,…

    Submitted 7 February, 2025; originally announced February 2025.

    Comments: Accepted by ICASSP2025

  12. arXiv:2502.03822  [pdf, other]

    cs.RO

    Dynamic Rank Adjustment in Diffusion Policies for Efficient and Flexible Training

    Authors: Xiatao Sun, Shuo Yang, Yinxing Chen, Francis Fan, Yiyan Liang, Daniel Rakita

    Abstract: Diffusion policies trained via offline behavioral cloning have recently gained traction in robotic motion generation. While effective, these policies typically require a large number of trainable parameters. This model size affords powerful representations but also incurs high computational cost during training. Ideally, it would be beneficial to dynamically adjust the trainable portion as needed,…

    Submitted 25 April, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

    Comments: Accepted to Robotics: Science and Systems (RSS) 2025

  13. arXiv:2502.00870  [pdf, other]

    cs.LG cs.AI cs.MA

    FedHPD: Heterogeneous Federated Reinforcement Learning via Policy Distillation

    Authors: Wenzheng Jiang, Ji Wang, Xiongtao Zhang, Weidong Bao, Cheston Tan, Flint Xiaofeng Fan

    Abstract: Federated Reinforcement Learning (FedRL) improves sample efficiency while preserving privacy; however, most existing studies assume homogeneous agents, limiting its applicability in real-world scenarios. This paper investigates FedRL in black-box settings with heterogeneous agents, where each agent employs distinct policy networks and training configurations without disclosing their internal detai…

    Submitted 2 February, 2025; originally announced February 2025.

    Comments: This preprint presents the full version of the Extended Abstract accepted by AAMAS 2025, including all the proofs and experiments

    ACM Class: I.2.11

  14. arXiv:2501.15417  [pdf, ps, other]

    cs.SD cs.AI cs.LG eess.AS

    AnyEnhance: A Unified Generative Model with Prompt-Guidance and Self-Critic for Voice Enhancement

    Authors: Junan Zhang, Jing Yang, Zihao Fang, Yuancheng Wang, Zehua Zhang, Zhuo Wang, Fan Fan, Zhizheng Wu

    Abstract: We introduce AnyEnhance, a unified generative model for voice enhancement that processes both speech and singing voices. Based on a masked generative model, AnyEnhance is capable of handling both speech and singing voices, supporting a wide range of enhancement tasks including denoising, dereverberation, declipping, super-resolution, and target speaker extraction, all simultaneously and without fi…

    Submitted 22 June, 2025; v1 submitted 26 January, 2025; originally announced January 2025.

    Comments: Accepted by IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP) 2025

  15. arXiv:2501.05176  [pdf]

    cs.SE

    Deep Assessment of Code Review Generation Approaches: Beyond Lexical Similarity

    Authors: Yanjie Jiang, Hui Liu, Tianyi Chen, Fu Fan, Chunhao Dong, Kui Liu, Lu Zhang

    Abstract: Code review is a standard practice for ensuring the quality of software projects, and recent research has focused extensively on automated code review. While significant advancements have been made in generating code reviews, the automated assessment of these reviews remains less explored, with existing approaches and metrics often proving inaccurate. Current metrics, such as BLEU, primarily rely…

    Submitted 9 January, 2025; originally announced January 2025.

  16. arXiv:2412.15538  [pdf, other]

    cs.LG cs.AI cs.CR

    FedRLHF: A Convergence-Guaranteed Federated Framework for Privacy-Preserving and Personalized RLHF

    Authors: Flint Xiaofeng Fan, Cheston Tan, Yew-Soon Ong, Roger Wattenhofer, Wei-Tsang Ooi

    Abstract: In the era of increasing privacy concerns and demand for personalized experiences, traditional Reinforcement Learning with Human Feedback (RLHF) frameworks face significant challenges due to their reliance on centralized data. We introduce Federated Reinforcement Learning with Human Feedback (FedRLHF), a novel framework that decentralizes the RLHF process. FedRLHF enables collaborative policy lear…

    Submitted 7 February, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

    Comments: Updated for AAMAS 2025 camera-ready. This preprint represents the full version of the paper, including all proofs, experimental details, and additional discussions

    ACM Class: I.2.11

  17. arXiv:2412.04858  [pdf, ps, other]

    cs.AI

    Rethink Deep Learning with Invariance in Data Representation

    Authors: Shuren Qi, Fei Wang, Tieyong Zeng, Fenglei Fan

    Abstract: Integrating invariance into data representations is a principled design in intelligent systems and web applications. Representations play a fundamental role, where systems and applications are both built on meaningful representations of digital inputs (rather than the raw data). In fact, the proper design/learning of such representations relies on priors w.r.t. the task of interest. Here, the conc…

    Submitted 6 December, 2024; originally announced December 2024.

    Comments: Accepted by WWW 2025 for a tutorial

  18. arXiv:2411.11110  [pdf, other]

    eess.IV cs.CV

    Retinal Vessel Segmentation via Neuron Programming

    Authors: Tingting Wu, Ruyi Min, Peixuan Song, Hengtao Guo, Tieyong Zeng, Feng-Lei Fan

    Abstract: The accurate segmentation of retinal blood vessels plays a crucial role in the early diagnosis and treatment of various ophthalmic diseases. Designing a network model for this task requires meticulous tuning and extensive experimentation to handle the tiny and intertwined morphology of retinal blood vessels. To tackle this challenge, Neural Architecture Search (NAS) methods are developed to fully…

    Submitted 17 November, 2024; originally announced November 2024.

  19. arXiv:2410.16720  [pdf, other]

    cs.DB cs.CR

    NodeOP: Optimizing Node Management for Decentralized Networks

    Authors: Angela Tsang, Jiankai Sun, Boo Xie, Azeem Khan, Ender Lu, Fletcher Fan, Maggie Wu, Jing Tang

    Abstract: We present NodeOP, a novel framework designed to optimize the management of General Node Operators in decentralized networks. By integrating Agent-Based Modeling (ABM) with a Tendermint Byzantine Fault Tolerance (BFT)-based consensus mechanism, NodeOP addresses key challenges in task allocation, consensus formation, and system stability. Through rigorous mathematical modeling and formal optimizati…

    Submitted 22 October, 2024; originally announced October 2024.

  20. arXiv:2410.06151  [pdf, ps, other]

    cs.LG cs.AI

    Diversifying Robot Locomotion Behaviors with Extrinsic Behavioral Curiosity

    Authors: Zhenglin Wan, Xingrui Yu, David Mark Bossens, Yueming Lyu, Qing Guo, Flint Xiaofeng Fan, Yew Soon Ong, Ivor Tsang

    Abstract: Imitation learning (IL) has shown promise in robot locomotion but is often limited to learning a single expert policy, constraining behavior diversity and robustness in unpredictable real-world scenarios. To address this, we introduce Quality Diversity Inverse Reinforcement Learning (QD-IRL), a novel framework that integrates quality-diversity optimization with IRL methods, enabling agents to lear…

    Submitted 9 July, 2025; v1 submitted 8 October, 2024; originally announced October 2024.

    Comments: 22 pages, conference paper

    Journal ref: International Conference on Machine Learning (ICML 2025)

  21. arXiv:2410.04524  [pdf, other]

    cs.CL

    Toward Secure Tuning: Mitigating Security Risks from Instruction Fine-Tuning

    Authors: Yanrui Du, Sendong Zhao, Jiawei Cao, Ming Ma, Danyang Zhao, Shuren Qi, Fenglei Fan, Ting Liu, Bing Qin

    Abstract: Instruction fine-tuning has emerged as a critical technique for customizing Large Language Models (LLMs) to specific applications. However, recent studies have highlighted significant security vulnerabilities in fine-tuned LLMs. Existing defense efforts focus more on pre-training and post-training methods, yet there remains underexplored in in-training methods. To fill this gap, we introduce a nov…

    Submitted 16 February, 2025; v1 submitted 6 October, 2024; originally announced October 2024.

  22. arXiv:2409.14615  [pdf, other]

    cs.RO

    A Comparative Study on State-Action Spaces for Learning Viewpoint Selection and Manipulation with Diffusion Policy

    Authors: Xiatao Sun, Francis Fan, Yinxing Chen, Daniel Rakita

    Abstract: Robotic manipulation tasks often rely on static cameras for perception, which can limit flexibility, particularly in scenarios like robotic surgery and cluttered environments where mounting static cameras is impractical. Ideally, robots could jointly learn a policy for dynamic viewpoint and manipulation. However, it remains unclear which state-action space is most suitable for this complex learnin…

    Submitted 12 November, 2024; v1 submitted 22 September, 2024; originally announced September 2024.

    Comments: Submitted to ICRA 2025. Website: https://apollo-lab-yale.github.io/spaces_comparative_study/

  23. arXiv:2409.08800  [pdf, other]

    cs.CV

    Task-Specific Data Preparation for Deep Learning to Reconstruct Structures of Interest from Severely Truncated CBCT Data

    Authors: Yixing Huang, Fuxin Fan, Ahmed Gomaa, Andreas Maier, Rainer Fietkau, Christoph Bert, Florian Putz

    Abstract: Cone-beam computed tomography (CBCT) is widely used in interventional surgeries and radiation oncology. Due to the limited size of flat-panel detectors, anatomical structures might be missing outside the limited field-of-view (FOV), which restricts the clinical applications of CBCT systems. Recently, deep learning methods have been proposed to extend the FOV for multi-slice CT systems. However, in…

    Submitted 13 September, 2024; originally announced September 2024.

    Comments: Published in the CT-Meeting 2024 proceeding. arXiv admin note: text overlap with arXiv:2108.13844

  24. arXiv:2409.05982  [pdf, other]

    eess.IV cs.CV

    Enhancing Cross-Modality Synthesis: Subvolume Merging for MRI-to-CT Conversion

    Authors: Fuxin Fan, Jingna Qiu, Yixing Huang, Andreas Maier

    Abstract: Providing more precise tissue attenuation information, synthetic computed tomography (sCT) generated from magnetic resonance imaging (MRI) contributes to improved radiation therapy treatment planning. In our study, we employ the advanced SwinUNETR framework for synthesizing CT from MRI images. Additionally, we introduce a three-dimensional subvolume merging technique in the prediction process. By…

    Submitted 9 September, 2024; originally announced September 2024.

  25. arXiv:2409.00592  [pdf, other]

    cs.LG cs.AI cs.ET

    Hyper-Compression: Model Compression via Hyperfunction

    Authors: Fenglei Fan, Juntong Fan, Dayang Wang, Jingbo Zhang, Zelin Dong, Shijun Zhang, Ge Wang, Tieyong Zeng

    Abstract: The rapid growth of large models' size has far outpaced that of computing resources. To bridge this gap, encouraged by the parsimonious relationship between genotype and phenotype in the brain's growth and development, we propose the so-called hyper-compression that turns the model compression into the issue of parameter representation via a hyperfunction. Specifically, it is known that the trajec…

    Submitted 2 April, 2025; v1 submitted 31 August, 2024; originally announced September 2024.

  26. arXiv:2408.04171  [pdf, other]

    cs.CV

    Rotation center identification based on geometric relationships for rotary motion deblurring

    Authors: Jinhui Qin, Yong Ma, Jun Huang, Fan Fan, You Du

    Abstract: Non-blind rotary motion deblurring (RMD) aims to recover the latent clear image from a rotary motion blurred (RMB) image. The rotation center is a crucial input parameter in non-blind RMD methods. Existing methods directly estimate the rotation center from the RMB image. However they always suffer significant errors, and the performance of RMD is limited. For the assembled imaging systems, the pos…

    Submitted 7 August, 2024; originally announced August 2024.

  27. arXiv:2407.09580  [pdf, other]

    cs.CV cs.AI

    Don't Fear Peculiar Activation Functions: EUAF and Beyond

    Authors: Qianchao Wang, Shijun Zhang, Dong Zeng, Zhaoheng Xie, Hengtao Guo, Feng-Lei Fan, Tieyong Zeng

    Abstract: In this paper, we propose a new super-expressive activation function called the Parametric Elementary Universal Activation Function (PEUAF). We demonstrate the effectiveness of PEUAF through systematic and comprehensive experiments on various industrial and image datasets, including CIFAR10, Tiny-ImageNet, and ImageNet. Moreover, we significantly generalize the family of super-expressive activatio…

    Submitted 11 July, 2024; originally announced July 2024.

  28. arXiv:2406.14847  [pdf, other]

    cs.CV

    Fair Text to Medical Image Diffusion Model with Subgroup Distribution Aligned Tuning

    Authors: Xu Han, Fangfang Fan, Jingzhao Rong, Zhen Li, Georges El Fakhri, Qingyu Chen, Xiaofeng Liu

    Abstract: The text to medical image (T2MedI) with latent diffusion model has great potential to alleviate the scarcity of medical imaging data and explore the underlying appearance distribution of lesions in a specific patient status description. However, as the text to nature image models, we show that the T2MedI model can also bias to some subgroups to overlook the minority ones in the training set. In th…

    Submitted 7 January, 2025; v1 submitted 20 June, 2024; originally announced June 2024.

  29. arXiv:2406.01631  [pdf, other]

    cs.IR cs.LG

    SUBER: An RL Environment with Simulated Human Behavior for Recommender Systems

    Authors: Nathan Corecco, Giorgio Piatti, Luca A. Lanzendörfer, Flint Xiaofeng Fan, Roger Wattenhofer

    Abstract: Reinforcement learning (RL) has gained popularity in the realm of recommender systems due to its ability to optimize long-term rewards and guide users in discovering relevant content. However, the successful implementation of RL in recommender systems is challenging because of several factors, including the limited availability of online data for training on-policy methods. This scarcity requires…

    Submitted 20 August, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

  30. arXiv:2405.02369  [pdf, other]

    cs.NE cs.AI cs.LG

    No One-Size-Fits-All Neurons: Task-based Neurons for Artificial Neural Networks

    Authors: Feng-Lei Fan, Meng Wang, Hang-Cheng Dong, Jianwei Ma, Tieyong Zeng

    Abstract: Biologically, the brain does not rely on a single type of neuron that universally functions in all aspects. Instead, it acts as a sophisticated designer of task-based neurons. In this study, we address the following question: since the human brain is a task-based neuron user, can the artificial network design go from the task-based architecture design to the task-based neuron design? Since methodo…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 12 pages, 4 figures

  31. arXiv:2404.16627  [pdf, other]

    cs.CL

    Incorporating Lexical and Syntactic Knowledge for Unsupervised Cross-Lingual Transfer

    Authors: Jianyu Zheng, Fengfei Fan, Jianquan Li

    Abstract: Unsupervised cross-lingual transfer involves transferring knowledge between languages without explicit supervision. Although numerous studies have been conducted to improve performance in such tasks by focusing on cross-lingual knowledge, particularly lexical and syntactic knowledge, current approaches are limited as they only incorporate syntactic or lexical information. Since each type of inform…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted at LREC-Coling 2024

  32. arXiv:2404.14807  [pdf, other]

    cs.CV

    BigReg: An Efficient Registration Pipeline for High-Resolution X-Ray and Light-Sheet Fluorescence Microscopy

    Authors: Siyuan Mei, Fuxin Fan, Mareike Thies, Mingxuan Gu, Fabian Wagner, Oliver Aust, Ina Erceg, Zeynab Mirzaei, Georgiana Neag, Yipeng Sun, Yixing Huang, Andreas Maier

    Abstract: Recently, X-ray microscopy (XRM) and light-sheet fluorescence microscopy (LSFM) have emerged as pivotal tools in preclinical research, particularly for studying bone remodeling diseases such as osteoporosis. These modalities offer micrometer-level resolution, and their integration allows for a complementary examination of bone microstructures which is essential for analyzing functional changes. Ho…

    Submitted 20 May, 2025; v1 submitted 23 April, 2024; originally announced April 2024.

  33. arXiv:2404.03541  [pdf, other]

    eess.IV cs.CV

    Segmentation-Guided Knee Radiograph Generation using Conditional Diffusion Models

    Authors: Siyuan Mei, Fuxin Fan, Fabian Wagner, Mareike Thies, Mingxuan Gu, Yipeng Sun, Andreas Maier

    Abstract: Deep learning-based medical image processing algorithms require representative data during development. In particular, surgical data might be difficult to obtain, and high-quality public datasets are limited. To overcome this limitation and augment datasets, a widely adopted solution is the generation of synthetic images. In this work, we employ conditional diffusion models to generate knee radiog…

    Submitted 4 April, 2024; originally announced April 2024.

  34. arXiv:2403.20188  [pdf, other]

    cs.NI cs.AI cs.LG

    Distributed Swarm Learning for Edge Internet of Things

    Authors: Yue Wang, Zhi Tian, FXin Fan, Zhipeng Cai, Cameron Nowzari, Kai Zeng

    Abstract: The rapid growth of Internet of Things (IoT) has led to the widespread deployment of smart IoT devices at wireless edge for collaborative machine learning tasks, ushering in a new era of edge learning. With a huge number of hardware-constrained IoT devices operating in resource-limited wireless networks, edge learning encounters substantial challenges, including communication and computation bottl…

    Submitted 29 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2210.16705

  35. arXiv:2403.20156  [pdf, other]

    cs.LG cs.AI

    CAESAR: Enhancing Federated RL in Heterogeneous MDPs through Convergence-Aware Sampling with Screening

    Authors: Hei Yi Mak, Flint Xiaofeng Fan, Luca A. Lanzendörfer, Cheston Tan, Wei Tsang Ooi, Roger Wattenhofer

    Abstract: In this study, we delve into Federated Reinforcement Learning (FedRL) in the context of value-based agents operating across diverse Markov Decision Processes (MDPs). Existing FedRL methods typically aggregate agents' learning by averaging the value functions across them to improve their performance. However, this aggregation strategy is suboptimal in heterogeneous environments where agents converg…

    Submitted 16 April, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

  36. arXiv:2403.20002  [pdf, other]

    cs.CV

    Grounding and Enhancing Grid-based Models for Neural Fields

    Authors: Zelin Zhao, Fenglei Fan, Wenlong Liao, Junchi Yan

    Abstract: Many contemporary studies utilize grid-based models for neural field representation, but a systematic analysis of grid-based models is still missing, hindering the improvement of those models. Therefore, this paper introduces a theoretical framework for grid-based models. This framework points out that these models' approximation and generalization behaviors are determined by grid tangent kernels…

    Submitted 6 June, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: CVPR24 Oral & Best Paper Award Candidate. Pre-rebuttal scores: 555. Post-rebuttal scores: 555

  37. arXiv:2403.03326  [pdf, other]

    eess.IV cs.CV

    AnatoMix: Anatomy-aware Data Augmentation for Multi-organ Segmentation

    Authors: Chang Liu, Fuxin Fan, Annette Schwarz, Andreas Maier

    Abstract: Multi-organ segmentation in medical images is a widely researched task and can save much manual efforts of clinicians in daily routines. Automating the organ segmentation process using deep learning (DL) is a promising solution and state-of-the-art segmentation models are achieving promising accuracy. In this work, We proposed a novel data augmentation strategy for increasing the generalizibility…

    Submitted 5 March, 2024; originally announced March 2024.

  38. arXiv:2403.02827  [pdf, other]

    cs.CV

    Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation

    Authors: Weijie Li, Litong Gong, Yiran Zhu, Fanda Fan, Biao Wang, Tiezheng Ge, Bo Zheng

    Abstract: Image-to-video (I2V) generation tasks always suffer from keeping high fidelity in the open domains. Traditional image animation techniques primarily focus on specific domains such as faces or human poses, making them difficult to generalize to open domains. Several recent I2V frameworks based on diffusion models can generate dynamic content for open domain images but fail to maintain fidelity. We…

    Submitted 5 March, 2024; originally announced March 2024.

  39. arXiv:2401.16104  [pdf, other]

    cs.CV eess.IV

    A 2D Sinogram-Based Approach to Defect Localization in Computed Tomography

    Authors: Yuzhong Zhou, Linda-Sophie Schneider, Fuxin Fan, Andreas Maier

    Abstract: The rise of deep learning has introduced a transformative era in the field of image processing, particularly in the context of computed tomography. Deep learning has made a significant contribution to the field of industrial Computed Tomography. However, many defect detection algorithms are applied directly to the reconstructed domain, often disregarding the raw sensor data. This paper shifts the…

    Submitted 29 January, 2024; originally announced January 2024.

  40. arXiv:2401.16039  [pdf, other]

    eess.IV cs.CV cs.LG

    Data-Driven Filter Design in FBP: Transforming CT Reconstruction with Trainable Fourier Series

    Authors: Yipeng Sun, Linda-Sophie Schneider, Fuxin Fan, Mareike Thies, Mingxuan Gu, Siyuan Mei, Yuzhong Zhou, Siming Bayer, Andreas Maier

    Abstract: In this study, we introduce a Fourier series-based trainable filter for computed tomography (CT) reconstruction within the filtered backprojection (FBP) framework. This method overcomes the limitation in noise reduction by optimizing Fourier series coefficients to construct the filter, maintaining computational efficiency with minimal increment for the trainable parameters compared to other deep l…

    Submitted 25 October, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: accepted by 8th International Conference on Image Formation in X-Ray Computed Tomography, Bamberg, Germany

  41. arXiv:2401.07571  [pdf, other]

    cs.CV

    A Bi-Pyramid Multimodal Fusion Method for the Diagnosis of Bipolar Disorders

    Authors: Guoxin Wang, Sheng Shi, Shan An, Fengmei Fan, Wenshu Ge, Qi Wang, Feng Yu, Zhiren Wang

    Abstract: Previous research on the diagnosis of Bipolar disorder has mainly focused on resting-state functional magnetic resonance imaging. However, their accuracy can not meet the requirements of clinical diagnosis. Efficient multimodal fusion strategies have great potential for applications in multimodal data and can further improve the performance of medical diagnosis models. In this work, we utilize bot…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: Accepted by IEEE ICASSP 2024

  42. arXiv:2401.07532  [pdf, other]

    cs.SD cs.AI eess.AS

    Multi-view MidiVAE: Fusing Track- and Bar-view Representations for Long Multi-track Symbolic Music Generation

    Authors: Zhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, Jing Yang, Yaolong Ju, Fan Fan, Shiyin Kang, Zhiyong Wu, Helen Meng

    Abstract: Variational Autoencoders (VAEs) constitute a crucial component of neural symbolic music generation, among which some works have yielded outstanding results and attracted considerable attention. Nevertheless, previous VAEs still encounter issues with overly long feature sequences and generated results lack contextual coherence, thus the challenge of modeling long multi-track symbolic music still re…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: Accepted by ICASSP 2024

  43. arXiv:2401.03489  [pdf, other]

    cs.LG cs.AI cs.DC cs.MA

    Decentralized Federated Policy Gradient with Byzantine Fault-Tolerance and Provably Fast Convergence

    Authors: Philip Jordan, Florian Grötschla, Flint Xiaofeng Fan, Roger Wattenhofer

    Abstract: In Federated Reinforcement Learning (FRL), agents aim to collaboratively learn a common task, while each agent is acting in its local environment without exchanging raw trajectories. Existing approaches for FRL either (a) do not provide any fault-tolerance guarantees (against misbehaving agents), or (b) rely on a trusted central agent (a single point of failure) for aggregating updates. We provide…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: Accepted at AAMAS'24

  44. arXiv:2401.01651  [pdf, other]

    cs.CV cs.AI

    AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI

    Authors: Fanda Fan, Chunjie Luo, Wanling Gao, Jianfeng Zhan

    Abstract: The burgeoning field of Artificial Intelligence Generated Content (AIGC) is witnessing rapid advancements, particularly in video generation. This paper introduces AIGCBench, a pioneering comprehensive and scalable benchmark designed to evaluate a variety of video generation tasks, with a primary focus on Image-to-Video (I2V) generation. AIGCBench tackles the limitations of existing benchmarks, whi…

    Submitted 23 January, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: Accepted to BenchCouncil Transactions on Benchmarks, Standards and Evaluations (TBench)

  45. arXiv:2311.17303  [pdf, other]

    cs.LG cs.AI stat.ME

    Enhancing the Performance of Neural Networks Through Causal Discovery and Integration of Domain Knowledge

    Authors: Xiaoge Zhang, Xiao-Lin Wang, Fenglei Fan, Yiu-Ming Cheung, Indranil Bose

    Abstract: In this paper, we develop a generic methodology to encode hierarchical causality structure among observed variables into a neural network in order to improve its predictive performance. The proposed methodology, called causality-informed neural network (CINN), leverages three coherent steps to systematically map the structural causal knowledge into the layer-to-layer design of neural network while…

    Submitted 24 December, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

  46. arXiv:2310.19477  [pdf, other]

    cs.CV cs.MM eess.IV

    VDIP-TGV: Blind Image Deconvolution via Variational Deep Image Prior Empowered by Total Generalized Variation

    Authors: Tingting Wu, Zhiyan Du, Zhi Li, Feng-Lei Fan, Tieyong Zeng

    Abstract: Recovering clear images from blurry ones with an unknown blur kernel is a challenging problem. Deep image prior (DIP) proposes to use the deep network as a regularizer for a single image rather than as a supervised model, which achieves encouraging results in the nonblind deblurring problem. However, since the relationship between images and the network architectures is unclear, it is hard to find…

    Submitted 10 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: 13 pages, 5 figures

  47. arXiv:2310.02690  [pdf, other]

    eess.IV cs.CV

    Multi-Dimension-Embedding-Aware Modality Fusion Transformer for Psychiatric Disorder Clasification

    Authors: Guoxin Wang, Xuyang Cao, Shan An, Fengmei Fan, Chao Zhang, Jinsong Wang, Feng Yu, Zhiren Wang

    Abstract: Deep learning approaches, together with neuroimaging techniques, play an important role in psychiatric disorders classification. Previous studies on psychiatric disorders diagnosis mainly focus on using functional connectivity matrices of resting-state functional magnetic resonance imaging (rs-fMRI) as input, which still needs to fully utilize the rich temporal information of the time series of rs…

    Submitted 4 October, 2023; originally announced October 2023.

  48. arXiv:2309.02119  [pdf, other]

    cs.CV

    Hierarchical Masked 3D Diffusion Model for Video Outpainting

    Authors: Fanda Fan, Chaoxu Guo, Litong Gong, Biao Wang, Tiezheng Ge, Yuning Jiang, Chunjie Luo, Jianfeng Zhan

    Abstract: Video outpainting aims to adequately complete missing areas at the edges of video frames. Compared to image outpainting, it presents an additional challenge as the model should maintain the temporal consistency of the filled area. In this paper, we introduce a masked 3D diffusion model for video outpainting. We use the technique of mask modeling to train the 3D diffusion model. This allows us to u…

    Submitted 19 January, 2024; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: Accepted to ACM MM 2023

  49. arXiv:2307.16363  [pdf, other]

    cs.LG cs.AI cs.AR

    BearingPGA-Net: A Lightweight and Deployable Bearing Fault Diagnosis Network via Decoupled Knowledge Distillation and FPGA Acceleration

    Authors: Jing-Xiao Liao, Sheng-Lai Wei, Chen-Long Xie, Tieyong Zeng, Jinwei Sun, Shiping Zhang, Xiaoge Zhang, Feng-Lei Fan

    Abstract: Deep learning has achieved remarkable success in the field of bearing fault diagnosis. However, this success comes with larger models and more complex computations, which cannot be transferred into industrial fields requiring models to be of high speed, strong portability, and low power consumption. In this paper, we propose a lightweight and deployable model for bearing fault diagnosis, referred…

    Submitted 30 July, 2023; originally announced July 2023.

  50. arXiv:2307.08673  [pdf]

    cs.LG cs.CV

    CohortFinder: an open-source tool for data-driven partitioning of biomedical image cohorts to yield robust machine learning models

    Authors: Fan Fan, Georgia Martinez, Thomas Desilvio, John Shin, Yijiang Chen, Bangchen Wang, Takaya Ozeki, Maxime W. Lafarge, Viktor H. Koelzer, Laura Barisoni, Anant Madabhushi, Satish E. Viswanath, Andrew Janowczyk

    Abstract: Batch effects (BEs) refer to systematic technical differences in data collection unrelated to biological variations whose noise is shown to negatively impact machine learning (ML) model generalizability. Here we release CohortFinder, an open-source tool aimed at mitigating BEs via data-driven cohort partitioning. We demonstrate CohortFinder improves ML model performance in downstream medical image…

    Submitted 17 July, 2023; originally announced July 2023.

    Comments: 26 pages, 9 figures, 4 tables. Abstract was accepted by European Society of Digital and Integrative Pathology (ESDIP), Germany, 2022