
Showing 1–50 of 50 results for author: Zhong, B

Searching in archive cs.
  1. arXiv:2507.05319  [pdf, ps, other]

    cs.CL cs.AI

    LCDS: A Logic-Controlled Discharge Summary Generation System Supporting Source Attribution and Expert Review

    Authors: Cheng Yuan, Xinkai Rui, Yongqi Fan, Yawei Fan, Boyang Zhong, Jiacheng Wang, Weiyan Zhang, Tong Ruan

    Abstract: Despite the remarkable performance of Large Language Models (LLMs) in automated discharge summary generation, they still suffer from hallucination issues, such as generating inaccurate content or fabricating information without valid sources. In addition, electronic medical records (EMRs) typically consist of long-form data, making it challenging for LLMs to attribute the generated content to the…

    Submitted 7 July, 2025; originally announced July 2025.

    Comments: ACL Demo 2025

  2. arXiv:2506.04953  [pdf, ps, other]

    cs.CV

    APVR: Hour-Level Long Video Understanding with Adaptive Pivot Visual Information Retrieval

    Authors: Hong Gao, Yiming Bao, Xuezhen Tu, Bin Zhong, Minling Zhang

    Abstract: Current multimodal large language models (MLLMs) struggle with hour-level video understanding, facing significant challenges not only in modeling the substantial information volume of long videos but also in overcoming the memory wall and resource constraints during both training and inference. Although recent training-free approaches have alleviated resource demands by compressing visual features…

    Submitted 28 June, 2025; v1 submitted 5 June, 2025; originally announced June 2025.

  3. arXiv:2505.11812  [pdf, ps, other]

    cs.LG cs.CL q-bio.QM

    VenusX: Unlocking Fine-Grained Functional Understanding of Proteins

    Authors: Yang Tan, Wenrui Gou, Bozitao Zhong, Liang Hong, Huiqun Yu, Bingxin Zhou

    Abstract: Deep learning models have driven significant progress in predicting protein function and interactions at the protein level. While these advancements have been invaluable for many biological applications such as enzyme engineering and function annotation, a more detailed perspective is essential for understanding protein functional mechanisms and evaluating the biological knowledge captured by mode…

    Submitted 16 May, 2025; originally announced May 2025.

    Comments: 29 pages, 3 figures, 17 tables

  4. arXiv:2504.13914  [pdf, other]

    cs.CL

    Seed1.5-Thinking: Advancing Superb Reasoning Models with Reinforcement Learning

    Authors: ByteDance Seed: Jiaze Chen, Tiantian Fan, Xin Liu, Lingjun Liu, Zhiqi Lin, Mingxuan Wang, Chengyi Wang, Xiangpeng Wei, Wenyuan Xu, Yufeng Yuan, Yu Yue, Lin Yan, Qiying Yu, Xiaochen Zuo, Chi Zhang, Ruofei Zhu, Zhecheng An, Zhihao Bai, Yu Bao, Xingyan Bin, Jiangjie Chen, Feng Chen, Hongmin Chen, et al. (249 additional authors not shown)

    Abstract: We introduce Seed1.5-Thinking, capable of reasoning through thinking before responding, resulting in improved performance on a wide range of benchmarks. Seed1.5-Thinking achieves 86.7 on AIME 2024, 55.0 on Codeforces and 77.3 on GPQA, demonstrating excellent reasoning abilities in STEM and coding. Beyond reasoning tasks, the method demonstrates notable generalization across diverse domains. For in…

    Submitted 29 April, 2025; v1 submitted 10 April, 2025; originally announced April 2025.

  5. arXiv:2504.11536  [pdf, other]

    cs.CL cs.AI

    ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

    Authors: Jiazhan Feng, Shijue Huang, Xingwei Qu, Ge Zhang, Yujia Qin, Baoquan Zhong, Chengquan Jiang, Jinxin Chi, Wanjun Zhong

    Abstract: While reasoning models (e.g., DeepSeek R1) trained with reinforcement learning (RL) excel in textual reasoning, they struggle in scenarios requiring structured problem-solving, such as geometric reasoning, concise computation, or complex equation solving, areas where computational tools like code interpreters (CI) demonstrate distinct advantages. To bridge this gap, we propose ReTool, which enhanc…

    Submitted 17 April, 2025; v1 submitted 15 April, 2025; originally announced April 2025.

    Comments: fix typos

  6. arXiv:2503.15840  [pdf, other]

    cs.LO cs.FL

    Automatic Generation of Safety-compliant Linear Temporal Logic via Large Language Model: A Self-supervised Framework

    Authors: Junle Li, Meiqi Tian, Bingzhuo Zhong

    Abstract: Converting high-level tasks described by natural language into formal specifications like Linear Temporal Logic (LTL) is a key step towards providing formal safety guarantees over cyber-physical systems (CPS). While the compliance of the formal specifications themselves against the safety restrictions imposed on CPS is crucial for ensuring safety, most existing works only focus on translation cons…

    Submitted 24 April, 2025; v1 submitted 20 March, 2025; originally announced March 2025.

  7. arXiv:2503.06625  [pdf, other]

    cs.CV

    Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking

    Authors: Chaocan Xue, Bineng Zhong, Qihua Liang, Yaozong Zheng, Ning Li, Yuanliang Xue, Shuxiang Song

    Abstract: Vision transformers (ViTs) have emerged as a popular backbone for visual tracking. However, complete ViT architectures are too cumbersome to deploy for unmanned aerial vehicle (UAV) tracking which extremely emphasizes efficiency. In this study, we discover that many layers within lightweight ViT-based trackers tend to learn relatively redundant and repetitive target representations. Based on this…

    Submitted 9 March, 2025; originally announced March 2025.

  8. arXiv:2503.06621  [pdf, other]

    cs.CV

    Dynamic Updates for Language Adaptation in Visual-Language Tracking

    Authors: Xiaohai Li, Bineng Zhong, Qihua Liang, Zhiyi Mo, Jian Nong, Shuxiang Song

    Abstract: The consistency between the semantic information provided by the multi-modal reference and the tracked object is crucial for visual-language (VL) tracking. However, existing VL tracking frameworks rely on static multi-modal references to locate dynamic objects, which can lead to semantic discrepancies and reduce the robustness of the tracker. To address this issue, we propose a novel vision-langua…

    Submitted 9 March, 2025; originally announced March 2025.

  9. arXiv:2502.06583  [pdf, other]

    cs.CV

    Adaptive Perception for Unified Visual Multi-modal Object Tracking

    Authors: Xiantao Hu, Bineng Zhong, Qihua Liang, Zhiyi Mo, Liangtao Shi, Ying Tai, Jian Yang

    Abstract: Recently, many multi-modal trackers prioritize RGB as the dominant modality, treating other modalities as auxiliary, and fine-tuning separately various multi-modal tasks. This imbalance in modality dependence limits the ability of methods to dynamically utilize complementary information from each modality in complex scenarios, making it challenging to fully perceive the advantages of multi-modal.…

    Submitted 10 February, 2025; originally announced February 2025.

  10. arXiv:2501.00758  [pdf, other]

    cs.CV

    Less is More: Token Context-aware Learning for Object Tracking

    Authors: Chenlong Xu, Bineng Zhong, Qihua Liang, Yaozong Zheng, Guorong Li, Shuxiang Song

    Abstract: Recently, several studies have shown that utilizing contextual information to perceive target states is crucial for object tracking. They typically capture context by incorporating multiple video frames. However, these naive frame-context methods fail to consider the importance of each patch within a reference frame, making them susceptible to noise and redundant tokens, which deteriorates trackin…

    Submitted 1 January, 2025; originally announced January 2025.

    Comments: Accepted by AAAI 2025

  11. arXiv:2412.15691  [pdf, other]

    cs.CV

    Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking

    Authors: Xiantao Hu, Ying Tai, Xu Zhao, Chen Zhao, Zhenyu Zhang, Jun Li, Bineng Zhong, Jian Yang

    Abstract: Multimodal tracking has garnered widespread attention as a result of its ability to effectively address the inherent limitations of traditional RGB tracking. However, existing multimodal trackers mainly focus on the fusion and enhancement of spatial features or merely leverage the sparse temporal relationships between video frames. These approaches do not fully exploit the temporal correlations in…

    Submitted 20 December, 2024; originally announced December 2024.

  12. arXiv:2412.13615  [pdf, other]

    cs.CV

    MambaLCT: Boosting Tracking via Long-term Context State Space Model

    Authors: Xiaohai Li, Bineng Zhong, Qihua Liang, Guorong Li, Zhiyi Mo, Shuxiang Song

    Abstract: Effectively constructing context information with long-term dependencies from video sequences is crucial for object tracking. However, the context length constructed by existing work is limited, only considering object information from adjacent frames or video clips, leading to insufficient utilization of contextual information. To address this issue, we propose MambaLCT, which constructs and util…

    Submitted 18 December, 2024; originally announced December 2024.

  13. arXiv:2412.13611  [pdf, other]

    cs.CV

    Robust Tracking via Mamba-based Context-aware Token Learning

    Authors: Jinxia Xie, Bineng Zhong, Qihua Liang, Ning Li, Zhiyi Mo, Shuxiang Song

    Abstract: How to make a good trade-off between performance and computational cost is crucial for a tracker. However, current famous methods typically focus on complicated and time-consuming learning that combines temporal and appearance information by inputting more and more images (or features). Consequently, these methods not only increase the model's computational source and learning burden but also introdu…

    Submitted 18 December, 2024; originally announced December 2024.

    Comments: AAAI2025

  14. arXiv:2412.01783  [pdf, other]

    eess.SY cs.LG

    Transfer Learning for Control Systems via Neural Simulation Relations

    Authors: Alireza Nadali, Bingzhuo Zhong, Ashutosh Trivedi, Majid Zamani

    Abstract: Transfer learning is an umbrella term for machine learning approaches that leverage knowledge gained from solving one problem (the source domain) to improve speed, efficiency, and data requirements in solving a different but related problem (the target domain). The performance of the transferred model in the target domain is typically measured via some notion of loss function in the target domain.…

    Submitted 2 December, 2024; originally announced December 2024.

  15. arXiv:2409.11147  [pdf, other]

    cs.CL

    Reasoning Graph Enhanced Exemplars Retrieval for In-Context Learning

    Authors: Yukang Lin, Bingchen Zhong, Shuoran Jiang, Joanna Siebert, Qingcai Chen

    Abstract: Large language models (LLMs) have exhibited remarkable few-shot learning capabilities and unified the paradigm of NLP tasks through the in-context learning (ICL) technique. Despite the success of ICL, the quality of the exemplar demonstrations can significantly influence the LLM's performance. Existing exemplar selection methods mainly focus on the semantic similarity between queries and candidate…

    Submitted 12 December, 2024; v1 submitted 17 September, 2024; originally announced September 2024.

  16. arXiv:2408.13659  [pdf, other]

    cs.LG cs.AI cs.CE q-bio.QM

    ReactZyme: A Benchmark for Enzyme-Reaction Prediction

    Authors: Chenqing Hua, Bozitao Zhong, Sitao Luan, Liang Hong, Guy Wolf, Doina Precup, Shuangjia Zheng

    Abstract: Enzymes, with their specific catalyzed reactions, are necessary for all aspects of life, enabling diverse biological processes and adaptations. Predicting enzyme functions is essential for understanding biological pathways, guiding drug development, enhancing bioproduct yields, and facilitating evolutionary studies. Addressing the inherent complexities, we introduce a new approach to annotating en…

    Submitted 30 September, 2024; v1 submitted 24 August, 2024; originally announced August 2024.

    Journal ref: 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Track on Datasets and Benchmarks

  17. arXiv:2408.06391  [pdf, other]

    q-bio.QM cs.AI cs.LG

    Autoregressive Enzyme Function Prediction with Multi-scale Multi-modality Fusion

    Authors: Dingyi Rong, Wenzhuo Zheng, Bozitao Zhong, Zhouhan Lin, Liang Hong, Ning Liu

    Abstract: Accurate prediction of enzyme function is crucial for elucidating biological mechanisms and driving innovation across various sectors. Existing deep learning methods tend to rely solely on either sequence data or structural data and predict the EC number as a whole, neglecting the intrinsic hierarchical structure of EC numbers. To address these limitations, we introduce MAPred, a novel multi-modal…

    Submitted 11 August, 2024; originally announced August 2024.

  18. arXiv:2406.19755  [pdf, other]

    q-bio.QM cs.AI

    Protein Representation Learning with Sequence Information Embedding: Does it Always Lead to a Better Performance?

    Authors: Yang Tan, Lirong Zheng, Bozitao Zhong, Liang Hong, Bingxin Zhou

    Abstract: Deep learning has become a crucial tool in studying proteins. While the significance of modeling protein structure has been discussed extensively in the literature, amino acid types are typically included in the input as a default operation for many inference tasks. This study demonstrates with structure alignment task that embedding amino acid types in some cases may not help a deep learning mode…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 8 pages, 4 figures

  19. arXiv:2404.14850  [pdf, other]

    cs.CL cs.LG q-bio.BM

    Simple, Efficient and Scalable Structure-aware Adapter Boosts Protein Language Models

    Authors: Yang Tan, Mingchen Li, Bingxin Zhou, Bozitao Zhong, Lirong Zheng, Pan Tan, Ziyi Zhou, Huiqun Yu, Guisheng Fan, Liang Hong

    Abstract: Fine-tuning Pre-trained protein language models (PLMs) has emerged as a prominent strategy for enhancing downstream prediction tasks, often outperforming traditional supervised learning approaches. As a widely applied powerful technique in natural language processing, employing Parameter-Efficient Fine-Tuning techniques could potentially enhance the performance of PLMs. However, the direct transfe…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 30 pages, 4 figures, 8 tables

  20. arXiv:2403.10574  [pdf, other]

    cs.CV

    Autoregressive Queries for Adaptive Tracking with Spatio-Temporal Transformers

    Authors: Jinxia Xie, Bineng Zhong, Zhiyi Mo, Shengping Zhang, Liangtao Shi, Shuxiang Song, Rongrong Ji

    Abstract: The rich spatio-temporal information is crucial to capture the complicated target appearance variations in visual tracking. However, most top-performing tracking algorithms rely on many hand-crafted components for spatio-temporal information aggregation. Consequently, the spatio-temporal information is far away from being fully explored. To alleviate this issue, we propose an adaptive tracker with…

    Submitted 14 March, 2024; originally announced March 2024.

  21. End-to-End Human Instance Matting

    Authors: Qinglin Liu, Shengping Zhang, Quanling Meng, Bineng Zhong, Peiqiang Liu, Hongxun Yao

    Abstract: Human instance matting aims to estimate an alpha matte for each human instance in an image, which is extremely challenging and has rarely been studied so far. Despite some efforts to use instance segmentation to generate a trimap for each instance and apply trimap-based matting methods, the resulting alpha mattes are often inaccurate due to inaccurate segmentation. In addition, this approach is co…

    Submitted 3 March, 2024; originally announced March 2024.

    Journal ref: IEEE T-CSVT 2023

  22. arXiv:2402.10433  [pdf, other]

    q-bio.BM cs.LG q-bio.QM

    Fusing Neural and Physical: Augment Protein Conformation Sampling with Tractable Simulations

    Authors: Jiarui Lu, Zuobai Zhang, Bozitao Zhong, Chence Shi, Jian Tang

    Abstract: The protein dynamics are common and important for their biological functions and properties, the study of which usually involves time-consuming molecular dynamics (MD) simulations in silico. Recently, generative models have been leveraged as a surrogate sampler to obtain conformation ensembles with orders of magnitude faster and without requiring any simulation data (a "zero-shot" inference). Howev…

    Submitted 11 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: Published at the GEM workshop, ICLR 2024

  23. arXiv:2401.03142  [pdf, other]

    cs.CV

    Explicit Visual Prompts for Visual Object Tracking

    Authors: Liangtao Shi, Bineng Zhong, Qihua Liang, Ning Li, Shengping Zhang, Xianxian Li

    Abstract: How to effectively exploit spatio-temporal information is crucial to capture target appearance changes in visual tracking. However, most deep learning-based trackers mainly focus on designing a complicated appearance model or template updating strategy, while lacking the exploitation of context between consecutive frames and thus entailing the when-and-how-to-update dilemma. To address th…

    Submitted 6 January, 2024; originally announced January 2024.

  24. arXiv:2401.01686  [pdf, other]

    cs.CV

    ODTrack: Online Dense Temporal Token Learning for Visual Tracking

    Authors: Yaozong Zheng, Bineng Zhong, Qihua Liang, Zhiyi Mo, Shengping Zhang, Xianxian Li

    Abstract: Online contextual reasoning and association across consecutive video frames are critical to perceive instances in visual tracking. However, most current top-performing trackers persistently lean on sparse temporal relationships between reference and search frames via an offline mode. Consequently, they can only interact independently within each image-pair and establish limited temporal correlatio…

    Submitted 3 January, 2024; originally announced January 2024.

  25. arXiv:2312.00080  [pdf, other]

    q-bio.QM cs.LG

    PDB-Struct: A Comprehensive Benchmark for Structure-based Protein Design

    Authors: Chuanrui Wang, Bozitao Zhong, Zuobai Zhang, Narendra Chaudhary, Sanchit Misra, Jian Tang

    Abstract: Structure-based protein design has attracted increasing interest, with numerous methods being introduced in recent years. However, a universally accepted method for evaluation has not been established, since the wet-lab validation can be overly time-consuming for the development of new algorithms, and the in silico validation with recovery and perplexity metrics is efficient but may not…

    Submitted 29 November, 2023; originally announced December 2023.

    Comments: 13 pages

  26. arXiv:2308.14103  [pdf, other]

    cs.CV

    Towards Unified Token Learning for Vision-Language Tracking

    Authors: Yaozong Zheng, Bineng Zhong, Qihua Liang, Guorong Li, Rongrong Ji, Xianxian Li

    Abstract: In this paper, we present a simple, flexible and effective vision-language (VL) tracking pipeline, termed MMTrack, which casts VL tracking as a token generation task. Traditional paradigms address VL tracking task indirectly with sophisticated prior designs, making them over-specialize on the features of specific architectures or mechanisms. In contrast, our proposed framework serializes…

    Submitted 27 August, 2023; originally announced August 2023.

  27. arXiv:2306.03117  [pdf, other]

    q-bio.QM cs.LG q-bio.BM

    Str2Str: A Score-based Framework for Zero-shot Protein Conformation Sampling

    Authors: Jiarui Lu, Bozitao Zhong, Zuobai Zhang, Jian Tang

    Abstract: The dynamic nature of proteins is crucial for determining their biological functions and properties, for which Monte Carlo (MC) and molecular dynamics (MD) simulations stand as predominant tools to study such phenomena. By utilizing empirically derived force fields, MC or MD simulations explore the conformational space through numerically evolving the system via Markov chain or Newtonian mechanics…

    Submitted 11 March, 2024; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: Published as a conference paper at ICLR 2024, see https://openreview.net/forum?id=C4BikKsgmK

  28. arXiv:2306.01794  [pdf, other]

    q-bio.QM cs.LG

    DiffPack: A Torsional Diffusion Model for Autoregressive Protein Side-Chain Packing

    Authors: Yangtian Zhang, Zuobai Zhang, Bozitao Zhong, Sanchit Misra, Jian Tang

    Abstract: Proteins play a critical role in carrying out biological functions, and their 3D structures are essential in determining their functions. Accurately predicting the conformation of protein side-chains given their backbones is important for applications in protein structure prediction, design and protein-protein interactions. Traditional methods are computationally intensive and have limited accurac…

    Submitted 15 February, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  29. arXiv:2210.08761  [pdf, other]

    q-bio.BM cs.LG

    Protein Sequence and Structure Co-Design with Equivariant Translation

    Authors: Chence Shi, Chuanrui Wang, Jiarui Lu, Bozitao Zhong, Jian Tang

    Abstract: Proteins are macromolecules that perform essential functions in all living organisms. Designing novel proteins with specific structures and desired functions has been a long-standing challenge in the field of bioengineering. Existing approaches generate both protein sequence and structure using either autoregressive models or diffusion models, both of which suffer from high inference costs. In thi…

    Submitted 2 March, 2023; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: Published as a conference paper at ICLR 2023, see https://openreview.net/forum?id=pRCMXcfdihq

  30. arXiv:2210.06069  [pdf, other]

    q-bio.BM cs.LG

    E3Bind: An End-to-End Equivariant Network for Protein-Ligand Docking

    Authors: Yangtian Zhang, Huiyu Cai, Chence Shi, Bozitao Zhong, Jian Tang

    Abstract: In silico prediction of the ligand binding pose to a given protein target is a crucial but challenging task in drug discovery. This work focuses on blind flexible self-docking, where we aim to predict the positions, orientations and conformations of docked molecules. Traditional physics-based methods usually suffer from inaccurate scoring functions and high inference cost. Recently, data-driven met…

    Submitted 1 June, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: International Conference on Learning Representations (ICLR 2023)

  31. arXiv:2203.05922  [pdf, other]

    cs.CV

    Visualizing and Understanding Patch Interactions in Vision Transformer

    Authors: Jie Ma, Yalong Bai, Bineng Zhong, Wei Zhang, Ting Yao, Tao Mei

    Abstract: Vision Transformer (ViT) has become a leading tool in various computer vision tasks, owing to its unique self-attention mechanism that learns visual representations explicitly through cross-patch information interactions. Despite having good success, the literature seldom explores the explainability of vision transformer, and there is no clear picture of how the attention mechanism with respect to…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: 15 pages, 14 figures

  32. arXiv:2202.08502  [pdf, other]

    cs.CV

    CLS: Cross Labeling Supervision for Semi-Supervised Learning

    Authors: Yao Yao, Junyi Shen, Jin Xu, Bin Zhong, Li Xiao

    Abstract: It is well known that the success of deep neural networks is greatly attributed to large-scale labeled datasets. However, it can be extremely time-consuming and laborious to collect sufficient high-quality labeled data in most practical applications. Semi-supervised learning (SSL) provides an effective solution to reduce the cost of labeling by simultaneously leveraging both labeled and unlabeled…

    Submitted 17 February, 2022; originally announced February 2022.

  33. arXiv:2109.12252  [pdf]

    cs.CV cs.LG cs.MM

    Long-Range Feature Propagating for Natural Image Matting

    Authors: Qinglin Liu, Haozhe Xie, Shengping Zhang, Bineng Zhong, Rongrong Ji

    Abstract: Natural image matting estimates the alpha values of unknown regions in the trimap. Recently, deep learning based methods propagate the alpha values from the known regions to unknown regions according to the similarity between them. However, we find that more than 50% pixels in the unknown regions cannot be correlated to pixels in known regions due to the limitation of small effective reception fi…

    Submitted 24 September, 2021; originally announced September 2021.

    Journal ref: ACM International Conference on Multimedia (ACM MM) 2021

  34. arXiv:2104.12041  [pdf, other]

    cs.CV

    Distractor-Aware Fast Tracking via Dynamic Convolutions and MOT Philosophy

    Authors: Zikai Zhang, Bineng Zhong, Shengping Zhang, Zhenjun Tang, Xin Liu, Zhaoxiang Zhang

    Abstract: A practical long-term tracker typically contains three key properties, i.e. an efficient model design, an effective global re-detection strategy and a robust distractor awareness mechanism. However, most state-of-the-art long-term trackers (e.g., Pseudo and re-detecting based ones) do not take all three key properties into account and therefore may either be time-consuming or drift to distractors.…

    Submitted 24 April, 2021; originally announced April 2021.

  35. arXiv:2104.00829  [pdf, other]

    cs.CV

    Learning to Filter: Siamese Relation Network for Robust Tracking

    Authors: Siyuan Cheng, Bineng Zhong, Guorong Li, Xin Liu, Zhenjun Tang, Xianxian Li, Jing Wang

    Abstract: Despite the great success of Siamese-based trackers, their performance under complicated scenarios is still not satisfying, especially when there are distractors. To this end, we propose a novel Siamese relation network, which introduces two efficient modules, i.e. Relation Detector (RD) and Refinement Module (RM). RD performs in a meta-learning way to obtain a learning ability to filter the distr…

    Submitted 1 April, 2021; originally announced April 2021.

  36. arXiv:2005.03837  [pdf, other]

    cs.CV

    Projection & Probability-Driven Black-Box Attack

    Authors: Jie Li, Rongrong Ji, Hong Liu, Jianzhuang Liu, Bineng Zhong, Cheng Deng, Qi Tian

    Abstract: Generating adversarial examples in a black-box setting retains a significant challenge with vast practical application prospects. In particular, existing black-box attacks suffer from the need for excessive queries, as it is non-trivial to find an appropriate direction to optimize in the high-dimensional space. In this paper, we propose Projection & Probability-driven Black-box Attack (PPBA) to ta…

    Submitted 7 May, 2020; originally announced May 2020.

    Comments: CVPR2020

  37. arXiv:2004.11500  [pdf, other]

    cs.CV cs.LG eess.IV

    What Can Be Transferred: Unsupervised Domain Adaptation for Endoscopic Lesions Segmentation

    Authors: Jiahua Dong, Yang Cong, Gan Sun, Bineng Zhong, Xiaowei Xu

    Abstract: Unsupervised domain adaptation has attracted growing research attention on semantic segmentation. However, 1) most existing models cannot be directly applied into lesions transfer of medical images, due to the diverse appearances of same lesion among different datasets; 2) equal attention has been paid into all semantic representations instead of neglecting irrelevant knowledge, which leads to neg…

    Submitted 23 April, 2020; originally announced April 2020.

    Comments: This paper is accepted by IEEE Conference on Computer Vision and Pattern Recognition 2020 (CVPR 2020)

  38. arXiv:2003.06761  [pdf, ps, other]

    cs.CV

    Siamese Box Adaptive Network for Visual Tracking

    Authors: Zedu Chen, Bineng Zhong, Guorong Li, Shengping Zhang, Rongrong Ji

    Abstract: Most of the existing trackers usually rely on either a multi-scale searching scheme or pre-defined anchor boxes to accurately estimate the scale and aspect ratio of a target. Unfortunately, they typically call for tedious and heuristic configurations. To address this issue, we propose a simple yet effective visual tracking framework (named Siamese Box Adaptive Network, SiamBAN) by exploiting the e…

    Submitted 22 April, 2020; v1 submitted 15 March, 2020; originally announced March 2020.

    Comments: Accepted to CVPR 2020

  39. arXiv:1912.12520  [pdf, other]

    cs.SI cs.CL cs.LG

    Weak Supervision for Fake News Detection via Reinforcement Learning

    Authors: Yaqing Wang, Weifeng Yang, Fenglong Ma, Jin Xu, Bin Zhong, Qiang Deng, Jing Gao

    Abstract: Today social media has become the primary source for news. Via social media platforms, fake news travel at unprecedented speeds, reach global audiences and put users and communities at great risk. Therefore, it is extremely important to detect fake news as early as possible. Recently, deep learning based approaches have shown improved performance in fake news detection. However, the training of su…

    Submitted 19 January, 2020; v1 submitted 28 December, 2019; originally announced December 2019.

    Comments: AAAI 2020

  40. arXiv:1911.05701  [pdf, other]

    cs.LG cs.AI

    Transfer Value Iteration Networks

    Authors: Junyi Shen, Hankz Hankui Zhuo, Jin Xu, Bin Zhong, Sinno Jialin Pan

    Abstract: Value iteration networks (VINs) have been demonstrated to have a good generalization ability for reinforcement learning tasks across similar domains. However, based on our experiments, a policy learned by VINs still fails to generalize well on the domain whose action space and feature space are not identical to those in the domain where it is trained. In this paper, we propose a transfer learning a…

    Submitted 26 November, 2019; v1 submitted 11 November, 2019; originally announced November 2019.

  41. PASS3D: Precise and Accelerated Semantic Segmentation for 3D Point Cloud

    Authors: Xin Kong, Guangyao Zhai, Baoquan Zhong, Yong Liu

    Abstract: In this paper, we propose PASS3D to achieve point-wise semantic segmentation for 3D point cloud. Our framework combines the efficiency of traditional geometric methods with robustness of deep learning methods, consisting of two stages: At stage-1, our accelerated cluster proposal algorithm will generate refined cluster proposals by segmenting point clouds without ground, capable of generating less…

    Submitted 26 August, 2020; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: This paper has been accepted by IROS-2019

  42. arXiv:1908.09774  [pdf, other]

    physics.soc-ph cs.CY

    Evaluating resilience in urban transportation systems for sustainability: A systems-based Bayesian network model

    Authors: Junqing Tang, Hans Heinimann, Ke Han, Hanbin Luo, Botao Zhong

    Abstract: This paper proposes a hierarchical Bayesian network model (BNM) to quantitatively evaluate the resilience of urban transportation infrastructure. Based on systemic thinkings and sustainability perspectives, we investigate the long-term resilience of the road transportation systems in four cities of China from 1998 to 2017, namely Beijing, Tianjin, Shanghai, and Chongqing, respectively. The model t…

    Submitted 26 August, 2019; originally announced August 2019.

    Comments: 28 pages, 11 figures, 2 tables. Preprint submitted to Transportation Research Part C: Emerging technologies

  43. arXiv:1903.10082  [pdf, other]

    cs.CV

    Residual Non-local Attention Networks for Image Restoration

    Authors: Yulun Zhang, Kunpeng Li, Kai Li, Bineng Zhong, Yun Fu

    Abstract: In this paper, we propose a residual non-local attention network for high-quality image restoration. Without considering the uneven distribution of information in the corrupted images, previous methods are restricted by local convolutional operation and equal treatment of spatial- and channel-wise features. To address this issue, we design local and non-local attention blocks to extract features t…

    Submitted 24 March, 2019; originally announced March 2019.

    Comments: To appear in ICLR 2019

  44. arXiv:1903.02173  [pdf, other]

    cs.LG cs.AI stat.ML

    Representative Task Self-selection for Flexible Clustered Lifelong Learning

    Authors: Gan Sun, Yang Cong, Qianqian Wang, Bineng Zhong, Yun Fu

    Abstract: Consider the lifelong machine learning paradigm whose objective is to learn a sequence of tasks depending on previous experiences, e.g., knowledge library or deep network weights. However, the knowledge libraries or deep networks for most recent lifelong learning models have a prescribed size, and can degrade the performance for both learned tasks and coming ones when faced with a new task e…

    Submitted 21 June, 2019; v1 submitted 5 March, 2019; originally announced March 2019.

    Comments: 15 pages, 33 figures

  45. arXiv:1812.10477  [pdf, other]

    cs.CV

    Residual Dense Network for Image Restoration

    Authors: Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, Yun Fu

    Abstract: Convolutional neural network has recently achieved great success for image restoration (IR) and also offered hierarchical features. However, most deep CNN based IR models do not make full use of the hierarchical features from the original low-quality images, thereby achieving relatively-low performance. In this paper, we propose a novel residual dense network (RDN) to address this problem in IR. W…

    Submitted 22 January, 2020; v1 submitted 24 December, 2018; originally announced December 2018.

    Comments: To appear in TPAMI. arXiv admin note: substantial text overlap with arXiv:1802.08797

  46. arXiv:1807.02758  [pdf, other]

    cs.CV

    Image Super-Resolution Using Very Deep Residual Channel Attention Networks

    Authors: Yulun Zhang, Kunpeng Li, Kai Li, Lichen Wang, Bineng Zhong, Yun Fu

    Abstract: Convolutional neural network (CNN) depth is of crucial importance for image super-resolution (SR). However, we observe that deeper networks for image SR are more difficult to train. The low-resolution inputs and features contain abundant low-frequency information, which is treated equally across channels, hence hindering the representational ability of CNNs. To solve these problems, we propose the…

    Submitted 12 July, 2018; v1 submitted 8 July, 2018; originally announced July 2018.

    Comments: To appear in ECCV 2018

  47. Robust and Efficient Boosting Method using the Conditional Risk

    Authors: Zhi Xiao, Zhe Luo, Bo Zhong, Xin Dang

    Abstract: Well-known for its simplicity and effectiveness in classification, AdaBoost, however, suffers from overfitting when class-conditional distributions have significant overlap. Moreover, it is very sensitive to noise that appears in the labels. This article tackles the above limitations simultaneously via optimizing a modified loss function (i.e., the conditional risk). The proposed approach has the…

    Submitted 21 June, 2018; originally announced June 2018.

    Comments: 14 Pages, 2 figures and 5 tables

  48. arXiv:1805.09386  [pdf, other]

    cs.LG math.OC stat.ML

    Predictive Local Smoothness for Stochastic Gradient Methods

    Authors: Jun Li, Hongfu Liu, Bineng Zhong, Yue Wu, Yun Fu

    Abstract: Stochastic gradient methods are dominant in nonconvex optimization especially for deep models but have low asymptotical convergence due to the fixed smoothness. To address this problem, we propose a simple yet effective method for improving stochastic gradient methods named predictive local smoothness (PLS). First, we create a convergence condition to build a learning rate which varies adaptively…

    Submitted 23 May, 2018; originally announced May 2018.

    Comments: 14 pages, 7 figures

  49. arXiv:1802.08797  [pdf, other]

    cs.CV

    Residual Dense Network for Image Super-Resolution

    Authors: Yulun Zhang, Yapeng Tian, Yu Kong, Bineng Zhong, Yun Fu

    Abstract: A very deep convolutional neural network (CNN) has recently achieved great success for image super-resolution (SR) and offered hierarchical features as well. However, most deep CNN based SR models do not make full use of the hierarchical features from the original low-resolution (LR) images, thereby achieving relatively-low performance. In this paper, we propose a novel residual dense network (RDN…

    Submitted 26 March, 2018; v1 submitted 23 February, 2018; originally announced February 2018.

    Comments: To appear in CVPR 2018 as spotlight

  50. arXiv:1603.07800  [pdf, other]

    cs.CV

    An Effective Unconstrained Correlation Filter and Its Kernelization for Face Recognition

    Authors: Yan Yan, Hanzi Wang, Cuihua Li, Chenhui Yang, Bineng Zhong

    Abstract: In this paper, an effective unconstrained correlation filter called Unconstrained Optimal Origin Tradeoff Filter (UOOTF) is presented and applied to robust face recognition. Compared with the conventional correlation filters in Class-dependence Feature Analysis (CFA), UOOTF improves the overall performance for unseen patterns by removing the hard constraints on the origin correlation outputs dur…

    Submitted 24 March, 2016; originally announced March 2016.

    Journal ref: Neurocomputing, 119, pp.201-211, 2013