-
Sequence-Only Prediction of Binding Affinity Changes: A Robust and Interpretable Model for Antibody Engineering
Authors:
Chen Liu,
Mingchen Li,
Yang Tan,
Wenrui Gou,
Guisheng Fan,
Bingxin Zhou
Abstract:
A pivotal area of research in antibody engineering is to find effective modifications that enhance antibody-antigen binding affinity. Traditional wet-lab experiments assess mutants in a costly and time-consuming manner. Emerging deep learning solutions offer an alternative by modeling antibody structures to predict binding affinity changes. However, they heavily depend on high-quality complex stru…
▽ More
A pivotal area of research in antibody engineering is to find effective modifications that enhance antibody-antigen binding affinity. Traditional wet-lab experiments assess mutants in a costly and time-consuming manner. Emerging deep learning solutions offer an alternative by modeling antibody structures to predict binding affinity changes. However, they heavily depend on high-quality complex structures, which are frequently unavailable in practice. Therefore, we propose ProtAttBA, a deep learning model that predicts binding affinity changes based solely on the sequence information of antibody-antigen complexes. ProtAttBA employs a pre-training phase to learn protein sequence patterns, following a supervised training phase using labeled antibody-antigen complex data to train a cross-attention-based regressor for predicting binding affinity changes. We evaluated ProtAttBA on three open benchmarks under different conditions. Compared to both sequence- and structure-based prediction methods, our approach achieves competitive performance, demonstrating notable robustness, especially with uncertain complex structures. Notably, our method possesses interpretability from the attention mechanism. We show that the learned attention scores can identify critical residues with impacts on binding affinity. This work introduces a rapid and cost-effective computational tool for antibody engineering, with the potential to accelerate the development of novel therapeutic antibodies.
△ Less
Submitted 14 May, 2025;
originally announced May 2025.
-
VenusX: Unlocking Fine-Grained Functional Understanding of Proteins
Authors:
Yang Tan,
Wenrui Gou,
Bozitao Zhong,
Liang Hong,
Huiqun Yu,
Bingxin Zhou
Abstract:
Deep learning models have driven significant progress in predicting protein function and interactions at the protein level. While these advancements have been invaluable for many biological applications such as enzyme engineering and function annotation, a more detailed perspective is essential for understanding protein functional mechanisms and evaluating the biological knowledge captured by mode…
▽ More
Deep learning models have driven significant progress in predicting protein function and interactions at the protein level. While these advancements have been invaluable for many biological applications such as enzyme engineering and function annotation, a more detailed perspective is essential for understanding protein functional mechanisms and evaluating the biological knowledge captured by models. To address this demand, we introduce VenusX, the first large-scale benchmark for fine-grained functional annotation and function-based protein pairing at the residue, fragment, and domain levels. VenusX comprises three major task categories across six types of annotations, including residue-level binary classification, fragment-level multi-class classification, and pairwise functional similarity scoring for identifying critical active sites, binding sites, conserved sites, motifs, domains, and epitopes. The benchmark features over 878,000 samples curated from major open-source databases such as InterPro, BioLiP, and SAbDab. By providing mixed-family and cross-family splits at three sequence identity thresholds, our benchmark enables a comprehensive assessment of model performance on both in-distribution and out-of-distribution scenarios. For baseline evaluation, we assess a diverse set of popular and open-source models, including pre-trained protein language models, sequence-structure hybrids, structure-based methods, and alignment-based techniques. Their performance is reported across all benchmark datasets and evaluation settings using multiple metrics, offering a thorough comparison and a strong foundation for future research. Code and data are publicly available at https://github.com/ai4protein/VenusX.
△ Less
Submitted 16 May, 2025;
originally announced May 2025.
-
CPE-Pro: A Structure-Sensitive Deep Learning Method for Protein Representation and Origin Evaluation
Authors:
Wenrui Gou,
Wenhui Ge,
Yang Tan,
Mingchen Li,
Guisheng Fan,
Huiqun Yu
Abstract:
Protein structures are important for understanding their functions and interactions. Currently, many protein structure prediction methods are enriching the structure database. Discriminating the origin of structures is crucial for distinguishing between experimentally resolved and computationally predicted structures, evaluating the reliability of prediction methods, and guiding downstream biologi…
▽ More
Protein structures are important for understanding their functions and interactions. Currently, many protein structure prediction methods are enriching the structure database. Discriminating the origin of structures is crucial for distinguishing between experimentally resolved and computationally predicted structures, evaluating the reliability of prediction methods, and guiding downstream biological studies. Building on works in structure prediction, We developed a structure-sensitive supervised deep learning model, Crystal vs Predicted Evaluator for Protein Structure (CPE-Pro), to represent and discriminate the origin of protein structures. CPE-Pro learns the structural information of proteins and captures inter-structural differences to achieve accurate traceability on four data classes, and is expected to be extended to more. Simultaneously, we utilized Foldseek to encode protein structures into "structure-sequences" and trained a protein Structural Sequence Language Model, SSLM. Preliminary experiments demonstrated that, compared to large-scale protein language models pre-trained on vast amounts of amino acid sequences, the "structure-sequence" enables the language model to learn more informative protein features, enhancing and optimizing structural representations. We have provided the code, model weights, and all related materials on https://github.com/GouWenrui/CPE-Pro-main.git.
△ Less
Submitted 23 October, 2024; v1 submitted 20 October, 2024;
originally announced October 2024.
-
Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba
Authors:
Haoye Dong,
Aviral Chharia,
Wenbo Gou,
Francisco Vicente Carrasco,
Fernando De la Torre
Abstract:
3D Hand reconstruction from a single RGB image is challenging due to the articulated motion, self-occlusion, and interaction with objects. Existing SOTA methods employ attention-based transformers to learn the 3D hand pose and shape, yet they do not fully achieve robust and accurate performance, primarily due to inefficiently modeling spatial relations between joints. To address this problem, we p…
▽ More
3D Hand reconstruction from a single RGB image is challenging due to the articulated motion, self-occlusion, and interaction with objects. Existing SOTA methods employ attention-based transformers to learn the 3D hand pose and shape, yet they do not fully achieve robust and accurate performance, primarily due to inefficiently modeling spatial relations between joints. To address this problem, we propose a novel graph-guided Mamba framework, named Hamba, which bridges graph learning and state space modeling. Our core idea is to reformulate Mamba's scanning into graph-guided bidirectional scanning for 3D reconstruction using a few effective tokens. This enables us to efficiently learn the spatial relationships between joints for improving reconstruction performance. Specifically, we design a Graph-guided State Space (GSS) block that learns the graph-structured relations and spatial sequences of joints and uses 88.5% fewer tokens than attention-based methods. Additionally, we integrate the state space features and the global features using a fusion module. By utilizing the GSS block and the fusion module, Hamba effectively leverages the graph-guided state space features and jointly considers global and local features to improve performance. Experiments on several benchmarks and in-the-wild tests demonstrate that Hamba significantly outperforms existing SOTAs, achieving the PA-MPVPE of 5.3mm and F@15mm of 0.992 on FreiHAND. At the time of this paper's acceptance, Hamba holds the top position, Rank 1 in two Competition Leaderboards on 3D hand reconstruction. Project Website: https://humansensinglab.github.io/Hamba/
△ Less
Submitted 26 November, 2024; v1 submitted 12 July, 2024;
originally announced July 2024.
-
Resilient Random Time-hopping Reply against Distance Attacks in UWB Ranging
Authors:
Wenlong Gou,
Chuanhang Yu,
Gang Wu
Abstract:
In order to mitigate the distance reduction attack in Ultra-Wide Band (UWB) ranging, this paper proposes a secure ranging scheme based on a random time-hopping mechanism without redundant signaling overhead. Additionally, a secure ranging strategy is designed for backward compatibility with existing standards such as IEEE 802.15.4a/z, combined with an attack detection scheme. The effectiveness and…
▽ More
In order to mitigate the distance reduction attack in Ultra-Wide Band (UWB) ranging, this paper proposes a secure ranging scheme based on a random time-hopping mechanism without redundant signaling overhead. Additionally, a secure ranging strategy is designed for backward compatibility with existing standards such as IEEE 802.15.4a/z, combined with an attack detection scheme. The effectiveness and feasibility of the proposed strategy are demonstrated through both simulation and experimental results in the case of the Ghost Peak attack, as demonstrated by Patrick Leu et al. The random time-hopping mechanism is verified to be capable of reducing the success rate of distance reduction attacks to less than 0.01%, thereby significantly enhancing the security of UWB ranging.
△ Less
Submitted 14 May, 2025; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Channel Reciprocity Based Attack Detection for Securing UWB Ranging by Autoencoder
Authors:
Wenlong Gou,
Chuanhang Yu,
Juntao Ma,
Gang Wu,
Vladimir Mordachev
Abstract:
A variety of ranging threats represented by Ghost Peak attack have raised concerns regarding the security performance of Ultra-Wide Band (UWB) systems with the finalization of the IEEE 802.15.4z standard. Based on channel reciprocity, this paper proposes a low complexity attack detection scheme that compares Channel Impulse Response (CIR) features of both ranging sides utilizing an autoencoder wit…
▽ More
A variety of ranging threats represented by Ghost Peak attack have raised concerns regarding the security performance of Ultra-Wide Band (UWB) systems with the finalization of the IEEE 802.15.4z standard. Based on channel reciprocity, this paper proposes a low complexity attack detection scheme that compares Channel Impulse Response (CIR) features of both ranging sides utilizing an autoencoder with the capability of data compression and feature extraction. Taking Ghost Peak attack as an example, this paper demonstrates the effectiveness, feasibility and generalizability of the proposed attack detection scheme through simulation and experimental validation. The proposed scheme achieves an attack detection success rate of over 99% and can be implemented in current systems at low cost.
△ Less
Submitted 10 June, 2024; v1 submitted 28 May, 2024;
originally announced May 2024.
-
FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions
Authors:
Jiong Wang,
Fengyu Yang,
Wenbo Gou,
Bingliang Li,
Danqi Yan,
Ailing Zeng,
Yijun Gao,
Junle Wang,
Yanqing Jing,
Ruimao Zhang
Abstract:
Estimating the 3D structure of the human body from natural scenes is a fundamental aspect of visual perception. 3D human pose estimation is a vital step in advancing fields like AIGC and human-robot interaction, serving as a crucial technique for understanding and interacting with human actions in real-world settings. However, the current datasets, often collected under single laboratory condition…
▽ More
Estimating the 3D structure of the human body from natural scenes is a fundamental aspect of visual perception. 3D human pose estimation is a vital step in advancing fields like AIGC and human-robot interaction, serving as a crucial technique for understanding and interacting with human actions in real-world settings. However, the current datasets, often collected under single laboratory conditions using complex motion capture equipment and unvarying backgrounds, are insufficient. The absence of datasets on variable conditions is stalling the progress of this crucial task. To facilitate the development of 3D pose estimation, we present FreeMan, the first large-scale, multi-view dataset collected under the real-world conditions. FreeMan was captured by synchronizing 8 smartphones across diverse scenarios. It comprises 11M frames from 8000 sequences, viewed from different perspectives. These sequences cover 40 subjects across 10 different scenarios, each with varying lighting conditions. We have also established an semi-automated pipeline containing error detection to reduce the workload of manual check and ensure precise annotation. We provide comprehensive evaluation baselines for a range of tasks, underlining the significant challenges posed by FreeMan. Further evaluations of standard indoor/outdoor human sensing datasets reveal that FreeMan offers robust representation transferability in real and complex scenes. Code and data are available at https://wangjiongw.github.io/freeman.
△ Less
Submitted 3 April, 2024; v1 submitted 10 September, 2023;
originally announced September 2023.
-
SYENet: A Simple Yet Effective Network for Multiple Low-Level Vision Tasks with Real-time Performance on Mobile Device
Authors:
Weiran Gou,
Ziyao Yi,
Yan Xiang,
Shaoqing Li,
Zibin Liu,
Dehui Kong,
Ke Xu
Abstract:
With the rapid development of AI hardware accelerators, applying deep learning-based algorithms to solve various low-level vision tasks on mobile devices has gradually become possible. However, two main problems still need to be solved: task-specific algorithms make it difficult to integrate them into a single neural network architecture, and large amounts of parameters make it difficult to achiev…
▽ More
With the rapid development of AI hardware accelerators, applying deep learning-based algorithms to solve various low-level vision tasks on mobile devices has gradually become possible. However, two main problems still need to be solved: task-specific algorithms make it difficult to integrate them into a single neural network architecture, and large amounts of parameters make it difficult to achieve real-time inference. To tackle these problems, we propose a novel network, SYENet, with only $~$6K parameters, to handle multiple low-level vision tasks on mobile devices in a real-time manner. The SYENet consists of two asymmetrical branches with simple building blocks. To effectively connect the results by asymmetrical branches, a Quadratic Connection Unit(QCU) is proposed. Furthermore, to improve performance, a new Outlier-Aware Loss is proposed to process the image. The proposed method proves its superior performance with the best PSNR as compared with other networks in real-time applications such as Image Signal Processing(ISP), Low-Light Enhancement(LLE), and Super-Resolution(SR) with 2K60FPS throughput on Qualcomm 8 Gen 1 mobile SoC(System-on-Chip). Particularly, for ISP task, SYENet got the highest score in MAI 2022 Learned Smartphone ISP challenge.
△ Less
Submitted 16 August, 2023;
originally announced August 2023.
-
SNEAK: Synonymous Sentences-Aware Adversarial Attack on Natural Language Video Localization
Authors:
Wenbo Gou,
Wen Shi,
Jian Lou,
Lijie Huang,
Pan Zhou,
Ruixuan Li
Abstract:
Natural language video localization (NLVL) is an important task in the vision-language understanding area, which calls for an in-depth understanding of not only computer vision and natural language side alone, but more importantly the interplay between both sides. Adversarial vulnerability has been well-recognized as a critical security issue of deep neural network models, which requires prudent i…
▽ More
Natural language video localization (NLVL) is an important task in the vision-language understanding area, which calls for an in-depth understanding of not only computer vision and natural language side alone, but more importantly the interplay between both sides. Adversarial vulnerability has been well-recognized as a critical security issue of deep neural network models, which requires prudent investigation. Despite its extensive yet separated studies in video and language tasks, current understanding of the adversarial robustness in vision-language joint tasks like NLVL is less developed. This paper therefore aims to comprehensively investigate the adversarial robustness of NLVL models by examining three facets of vulnerabilities from both attack and defense aspects. To achieve the attack goal, we propose a new adversarial attack paradigm called synonymous sentences-aware adversarial attack on NLVL (SNEAK), which captures the cross-modality interplay between the vision and language sides.
△ Less
Submitted 8 December, 2021;
originally announced December 2021.