Showing 1–41 of 41 results for author: Guan, D

Searching in archive cs.
  1. arXiv:2507.06121  [pdf, ps, other]

    cs.IR

    Unconditional Diffusion for Generative Sequential Recommendation

    Authors: Yimeng Bai, Yang Zhang, Sihao Ding, Shaohui Ruan, Han Yao, Danhui Guan, Fuli Feng, Tat-Seng Chua

    Abstract: Diffusion models, known for their generative ability to simulate data creation through noise-adding and denoising processes, have emerged as a promising approach for building generative recommenders. To incorporate user history for personalization, existing methods typically adopt a conditional diffusion framework, where the reverse denoising process of reconstructing items from noise is modified…

    Submitted 8 July, 2025; originally announced July 2025.

    ACM Class: H.3.3; H.3.5

  2. arXiv:2507.01766  [pdf, ps, other]

    cs.IT eess.SP

    Reconfigurable Intelligent Surface aided Integrated-Navigation-and-Communication in Urban Canyons: A Satellite Selection Approach

    Authors: Tianwei Hou, Da Guan, Xin Sun, Anna Li, Wenqiang Yi, Yuanwei Liu, Arumugam Nallanathan

    Abstract: This study investigates the application of a simultaneous transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)-aided medium-Earth-orbit (MEO) satellite network for providing both global positioning services and communication services in the urban canyons, where the direct satellite-user links are obstructed. Superposition coding (SC) and successive interference cancellation (S…

    Submitted 2 July, 2025; originally announced July 2025.

  3. arXiv:2506.08070  [pdf, ps, other]

    cs.LG cs.AI

    Info-Coevolution: An Efficient Framework for Data Model Coevolution

    Authors: Ziheng Qin, Hailun Xu, Wei Chee Yew, Qi Jia, Yang Luo, Kanchan Sarkar, Danhui Guan, Kai Wang, Yang You

    Abstract: Machine learning relies heavily on data, yet the continuous growth of real-world data poses challenges for efficient dataset construction and training. A fundamental yet unsolved question is: given our current model and data, does a new data (sample/batch) need annotation/learning? Conventional approaches retain all available data, leading to non-optimal data and training efficiency. Active learni…

    Submitted 19 June, 2025; v1 submitted 9 June, 2025; originally announced June 2025.

    Comments: V1

    Journal ref: ICML 2025

  4. arXiv:2506.01663  [pdf, ps, other]

    cs.CV

    Zoom-Refine: Boosting High-Resolution Multimodal Understanding via Localized Zoom and Self-Refinement

    Authors: Xuan Yu, Dayan Guan, Michael Ying Yang, Yanfeng Gu

    Abstract: Multimodal Large Language Models (MLLM) often struggle to interpret high-resolution images accurately, where fine-grained details are crucial for complex visual understanding. We introduce Zoom-Refine, a novel training-free method that enhances MLLM capabilities to address this issue. Zoom-Refine operates through a synergistic process of Localized Zoom and Self-Refinement. In the…

    Submitted 2 June, 2025; originally announced June 2025.

    Comments: Code is available at https://github.com/xavier-yu114/Zoom-Refine

  5. arXiv:2505.12408  [pdf, ps, other]

    cs.CV cs.AI cs.HC

    ViEEG: Hierarchical Neural Coding with Cross-Modal Progressive Enhancement for EEG-Based Visual Decoding

    Authors: Minxu Liu, Donghai Guan, Chuhang Zheng, Chunwei Tian, Jie Wen, Qi Zhu

    Abstract: Understanding and decoding brain activity into visual representations is a fundamental challenge at the intersection of neuroscience and artificial intelligence. While EEG-based visual decoding has shown promise due to its non-invasive, low-cost nature and millisecond-level temporal resolution, existing methods are limited by their reliance on flat neural representations that overlook the brain's…

    Submitted 25 May, 2025; v1 submitted 18 May, 2025; originally announced May 2025.

    Comments: 24 pages, 18 figures

  6. arXiv:2410.06811  [pdf, other]

    cs.CV

    Rethinking the Evaluation of Visible and Infrared Image Fusion

    Authors: Dayan Guan, Yixuan Wu, Tianzhu Liu, Alex C. Kot, Yanfeng Gu

    Abstract: Visible and Infrared Image Fusion (VIF) has garnered significant interest across a wide range of high-level vision tasks, such as object detection and semantic segmentation. However, the evaluation of VIF methods remains challenging due to the absence of ground truth. This paper proposes a Segmentation-oriented Evaluation Approach (SEA) to assess VIF methods by incorporating the semantic segmentat…

    Submitted 9 October, 2024; originally announced October 2024.

    Comments: The code has been released at https://github.com/Yixuan-2002/SEA/

  7. arXiv:2410.04762  [pdf]

    cs.CV

    WTCL-Dehaze: Rethinking Real-world Image Dehazing via Wavelet Transform and Contrastive Learning

    Authors: Divine Joseph Appiah, Donghai Guan, Abdul Nasser Kasule, Mingqiang Wei

    Abstract: Images captured in hazy outdoor conditions often suffer from colour distortion, low contrast, and loss of detail, which impair high-level vision tasks. Single image dehazing is essential for applications such as autonomous driving and surveillance, with the aim of restoring image clarity. In this work, we propose WTCL-Dehaze, an enhanced semi-supervised dehazing network that integrates Contrastive…

    Submitted 7 October, 2024; originally announced October 2024.

    Comments: 15 pages, 4 figures

  8. arXiv:2410.01239  [pdf, other]

    cs.CV

    Replacement Learning: Training Vision Tasks with Fewer Learnable Parameters

    Authors: Yuming Zhang, Peizhe Wang, Shouxin Zhang, Dongzhi Guan, Jiabin Liu, Junhao Su

    Abstract: Traditional end-to-end deep learning models often enhance feature representation and overall performance by increasing the depth and complexity of the network during training. However, this approach inevitably introduces issues of parameter redundancy and resource inefficiency, especially in deeper networks. While existing works attempt to skip certain redundant layers to alleviate these problems,…

    Submitted 2 October, 2024; originally announced October 2024.

  9. arXiv:2407.05638  [pdf, other]

    cs.CV

    HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion

    Authors: Junhao Su, Chenghao He, Feiyu Zhu, Xiaojie Xu, Dongzhi Guan, Chenyang Si

    Abstract: Traditional deep learning relies on end-to-end backpropagation for training, but it suffers from drawbacks such as high memory consumption and not aligning with biological neural networks. Recent advancements have introduced locally supervised learning, which divides networks into modules with isolated gradients and trains them locally. However, this approach can lead to performance lag due to lim…

    Submitted 8 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024

  10. arXiv:2407.05623  [pdf, other]

    cs.CV

    Momentum Auxiliary Network for Supervised Local Learning

    Authors: Junhao Su, Changpeng Cai, Feiyu Zhu, Chenghao He, Xiaojie Xu, Dongzhi Guan, Chenyang Si

    Abstract: Deep neural networks conventionally employ end-to-end backpropagation for their training process, which lacks biological credibility and triggers a locking dilemma during network parameter updates, leading to significant GPU memory use. Supervised local learning segments the network into multiple local blocks updated by independent auxiliary networks. However, these methods cannot replace e…

    Submitted 12 August, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV2024(Oral)

  11. arXiv:2407.00906  [pdf, other]

    cs.CV cs.LG

    GSO-YOLO: Global Stability Optimization YOLO for Construction Site Detection

    Authors: Yuming Zhang, Dongzhi Guan, Shouxin Zhang, Junhao Su, Yunzhi Han, Jiabin Liu

    Abstract: Safety issues at construction sites have long plagued the industry, posing risks to worker safety and causing economic damage due to potential hazards. With the advancement of artificial intelligence, particularly in the field of computer vision, the automation of safety monitoring on construction sites has emerged as a solution to this longstanding issue. Despite achieving impressive performance,…

    Submitted 30 June, 2024; originally announced July 2024.

  12. arXiv:2406.16633  [pdf, other]

    cs.CV

    MLAAN: Scaling Supervised Local Learning with Multilaminar Leap Augmented Auxiliary Network

    Authors: Yuming Zhang, Shouxin Zhang, Peizhe Wang, Feiyu Zhu, Dongzhi Guan, Junhao Su, Jiabin Liu, Changpeng Cai

    Abstract: Deep neural networks (DNNs) typically employ an end-to-end (E2E) training paradigm which presents several challenges, including high GPU memory consumption, inefficiency, and difficulties in model parallelization during training. Recent research has sought to address these issues, with one promising approach being local learning. This method involves partitioning the backbone network into gradient…

    Submitted 20 December, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

    Comments: Accepted by AAAI2025

  13. arXiv:2406.13227  [pdf, other]

    cs.CV

    Controllable and Gradual Facial Blemishes Retouching via Physics-Based Modelling

    Authors: Chenhao Shuai, Rizhao Cai, Bandara Dissanayake, Amanda Newman, Dayan Guan, Dennis Sng, Ling Li, Alex Kot

    Abstract: Face retouching aims to remove facial blemishes, such as pigmentation and acne, and still retain fine-grain texture details. Nevertheless, existing methods just remove the blemishes but focus little on realism of the intermediate process, limiting their use more to beautifying facial images on social media rather than being effective tools for simulating changes in facial pigmentation and acne. Mo…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 7 pages, 6 figures. The paper has been accepted by the IEEE Conference on Multimedia Expo 2024

  14. arXiv:2403.18293  [pdf, other]

    cs.CV

    Efficient Test-Time Adaptation of Vision-Language Models

    Authors: Adilbek Karmanov, Dayan Guan, Shijian Lu, Abdulmotaleb El Saddik, Eric Xing

    Abstract: Test-time adaptation with pre-trained vision-language models has attracted increasing attention for tackling distribution shifts during the test time. Though prior studies have achieved very promising performance, they involve intensive computation which is severely unaligned with test-time adaptation. We design TDA, a training-free dynamic adapter that enables effective and efficient test-time ad…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024. The code has been released at https://kdiaaa.github.io/tda/

  15. arXiv:2312.02896  [pdf, other]

    cs.CV

    BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models

    Authors: Rizhao Cai, Zirui Song, Dayan Guan, Zhenhao Chen, Xing Luo, Chenyu Yi, Alex Kot

    Abstract: Large Multimodal Models (LMMs) such as GPT-4V and LLaVA have shown remarkable capabilities in visual reasoning with common image styles. However, their robustness against diverse style shifts, crucial for practical applications, remains largely unexplored. In this paper, we propose a new benchmark, BenchLMM, to assess the robustness of LMMs against three different styles: artistic image style, ima…

    Submitted 5 December, 2023; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: Code is available at https://github.com/AIFEG/BenchLMM

  16. arXiv:2307.09729  [pdf, other]

    cs.CV cs.MM eess.IV

    NTIRE 2023 Quality Assessment of Video Enhancement Challenge

    Authors: Xiaohong Liu, Xiongkuo Min, Wei Sun, Yulun Zhang, Kai Zhang, Radu Timofte, Guangtao Zhai, Yixuan Gao, Yuqin Cao, Tengchuan Kou, Yunlong Dong, Ziheng Jia, Yilin Li, Wei Wu, Shuming Hu, Sibin Deng, Pengxiang Xiao, Ying Chen, Kai Li, Kai Zhao, Kun Yuan, Ming Sun, Heng Cong, Hao Wang, Lingzhi Fu, et al. (47 additional authors not shown)

    Abstract: This paper reports on the NTIRE 2023 Quality Assessment of Video Enhancement Challenge, which will be held in conjunction with the New Trends in Image Restoration and Enhancement Workshop (NTIRE) at CVPR 2023. This challenge is to address a major challenge in the field of video processing, namely, video quality assessment (VQA) for enhanced videos. The challenge uses the VQA Dataset for Perceptual…

    Submitted 18 July, 2023; originally announced July 2023.

  17. arXiv:2304.00690  [pdf, other]

    cs.CV

    3D Semantic Segmentation in the Wild: Learning Generalized Models for Adverse-Condition Point Clouds

    Authors: Aoran Xiao, Jiaxing Huang, Weihao Xuan, Ruijie Ren, Kangcheng Liu, Dayan Guan, Abdulmotaleb El Saddik, Shijian Lu, Eric Xing

    Abstract: Robust point cloud parsing under all-weather conditions is crucial to level-5 autonomy in autonomous driving. However, how to learn a universal 3D semantic segmentation (3DSS) model is largely neglected as most existing benchmarks are dominated by point clouds captured under normal weather. We introduce SemanticSTF, an adverse-weather point cloud dataset that provides dense point-level annotations…

    Submitted 2 April, 2023; originally announced April 2023.

    Comments: CVPR2023

  18. arXiv:2211.15030  [pdf, other]

    cs.CV eess.IV

    Imperceptible Adversarial Attack via Invertible Neural Networks

    Authors: Zihan Chen, Ziyue Wang, Junjie Huang, Wentao Zhao, Xiao Liu, Dejian Guan

    Abstract: Adding perturbations via utilizing auxiliary gradient information or discarding existing details of the benign images are two common approaches for generating adversarial examples. Though visual imperceptibility is the desired property of adversarial examples, conventional adversarial attacks still generate traceable adversarial perturbations. In this paper, we introduce a novel Adversarial Attack…

    Submitted 17 January, 2023; v1 submitted 27 November, 2022; originally announced November 2022.

  19. arXiv:2208.05728  [pdf, other]

    cs.IR

    Continual Transfer Learning for Cross-Domain Click-Through Rate Prediction at Taobao

    Authors: Lixin Liu, Yanling Wang, Tianming Wang, Dong Guan, Jiawei Wu, Jingxu Chen, Rong Xiao, Wenxiang Zhu, Fei Fang

    Abstract: As one of the largest e-commerce platforms in the world, Taobao's recommendation systems (RSs) serve the demands of shopping for hundreds of millions of customers. Click-Through Rate (CTR) prediction is a core component of the RS. One of the biggest characteristics in CTR prediction at Taobao is that there exist multiple recommendation domains where the scales of different domains vary significant…

    Submitted 20 February, 2023; v1 submitted 11 August, 2022; originally announced August 2022.

    Comments: Accepted by WWW 2023 industry track

  20. arXiv:2208.01905  [pdf]

    cs.CV

    Graph Signal Processing for Heterogeneous Change Detection Part II: Spectral Domain Analysis

    Authors: Yuli Sun, Lin Lei, Dongdong Guan, Gangyao Kuang, Li Liu

    Abstract: This is the second part of the paper that provides a new strategy for the heterogeneous change detection (HCD) problem, that is, solving HCD from the perspective of graph signal processing (GSP). We construct a graph to represent the structure of each image, and treat each image as a graph signal defined on the graph. In this way, we can convert the HCD problem into a comparison of responses of si…

    Submitted 7 August, 2022; v1 submitted 3 August, 2022; originally announced August 2022.

  21. arXiv:2208.01881  [pdf]

    cs.CV eess.IV

    Graph Signal Processing for Heterogeneous Change Detection Part I: Vertex Domain Filtering

    Authors: Yuli Sun, Lin Lei, Dongdong Guan, Gangyao Kuang, Li Liu

    Abstract: This paper provides a new strategy for the Heterogeneous Change Detection (HCD) problem: solving HCD from the perspective of Graph Signal Processing (GSP). We construct a graph for each image to capture the structure information, and treat each image as the graph signal. In this way, we convert the HCD into a GSP problem: a comparison of the responses of the two signals on different systems define…

    Submitted 7 August, 2022; v1 submitted 3 August, 2022; originally announced August 2022.

  22. arXiv:2208.00223  [pdf, other]

    cs.CV cs.AI cs.LG

    PolarMix: A General Data Augmentation Technique for LiDAR Point Clouds

    Authors: Aoran Xiao, Jiaxing Huang, Dayan Guan, Kaiwen Cui, Shijian Lu, Ling Shao

    Abstract: LiDAR point clouds, which are usually scanned by rotating LiDAR sensors continuously, capture precise geometry of the surrounding environment and are crucial to many autonomous detection and navigation tasks. Though many 3D deep architectures have been developed, efficient collection and annotation of large amounts of point clouds remain one major challenge in the analytic and understanding of poi…

    Submitted 30 July, 2022; originally announced August 2022.

  23. arXiv:2207.02372  [pdf, other]

    cs.CV

    Domain Adaptive Video Segmentation via Temporal Pseudo Supervision

    Authors: Yun Xing, Dayan Guan, Jiaxing Huang, Shijian Lu

    Abstract: Video semantic segmentation has achieved great progress under the supervision of large amounts of labelled training data. However, domain adaptive video segmentation, which can mitigate data labelling constraints by adapting from a labelled source domain toward an unlabelled target domain, is largely neglected. We design temporal pseudo supervision (TPS), a simple and effective method that explore…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022. Code is available at https://github.com/xing0047/TPS

  24. arXiv:2203.15480  [pdf, other]

    cs.CV

    SAR-ShipNet: SAR-Ship Detection Neural Network via Bidirectional Coordinate Attention and Multi-resolution Feature Fusion

    Authors: Yuwen Deng, Donghai Guan, Yanyu Chen, Weiwei Yuan, Jiemin Ji, Mingqiang Wei

    Abstract: This paper studies a practically meaningful ship detection problem from synthetic aperture radar (SAR) images by the neural network. We broadly extract different types of SAR image features and raise the intriguing question of whether these extracted features are beneficial to (1) suppress data variations (e.g., complex land-sea backgrounds, scattered noise) of real-world SAR images, and (2) enh…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: This paper was accepted by the International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2022

  25. arXiv:2203.10026  [pdf, other]

    cs.CV

    Unbiased Subclass Regularization for Semi-Supervised Semantic Segmentation

    Authors: Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu

    Abstract: Semi-supervised semantic segmentation learns from small amounts of labelled images and large amounts of unlabelled images, which has witnessed impressive progress with the recent advance of deep neural networks. However, it often suffers from severe class-bias problem while exploring the unlabelled images, largely due to the clear pixel-wise class imbalance in the labelled images. This paper prese…

    Submitted 26 March, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

    Comments: Accepted to CVPR 2022. Code is available at https://github.com/Dayan-Guan/USRN

  26. Unsupervised Point Cloud Representation Learning with Deep Neural Networks: A Survey

    Authors: Aoran Xiao, Jiaxing Huang, Dayan Guan, Xiaoqin Zhang, Shijian Lu, Ling Shao

    Abstract: Point cloud data have been widely explored due to their superior accuracy and robustness under various adverse situations. Meanwhile, deep neural networks (DNNs) have achieved very impressive success in various applications such as surveillance and autonomous driving. The convergence of point cloud and DNNs has led to many deep point cloud models, largely trained under the supervision of large-scale…

    Submitted 26 March, 2023; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: IEEE Transactions on Pattern Analysis and Machine Intelligence

  27. arXiv:2110.03374  [pdf, other]

    cs.CV

    Model Adaptation: Historical Contrastive Learning for Unsupervised Domain Adaptation without Source Data

    Authors: Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu

    Abstract: Unsupervised domain adaptation aims to align a labeled source domain and an unlabeled target domain, but it requires access to the source data, which often raises concerns in data privacy, data portability and data transmission efficiency. We study unsupervised model adaptation (UMA), also called Unsupervised Domain Adaptation without Source Data, an alternative setting that aims to adapt source-trai…

    Submitted 4 June, 2022; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: Accepted to Advances in Neural Information Processing Systems 34 (NeurIPS 2021)

  28. arXiv:2107.11004  [pdf, other]

    cs.CV

    Domain Adaptive Video Segmentation via Temporal Consistency Regularization

    Authors: Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu

    Abstract: Video semantic segmentation is an essential task for the analysis and understanding of videos. Recent efforts largely focus on supervised video segmentation by learning from fully annotated data, but the learnt models often experience clear performance drop while applied to videos of a different domain. This paper presents DA-VSN, a domain adaptive video segmentation network that addresses domain…

    Submitted 22 July, 2021; originally announced July 2021.

    Comments: Accepted to ICCV 2021. Code is available at https://github.com/Dayan-Guan/DA-VSN

  29. arXiv:2107.05399  [pdf, other]

    cs.CV

    Transfer Learning from Synthetic to Real LiDAR Point Cloud for Semantic Segmentation

    Authors: Aoran Xiao, Jiaxing Huang, Dayan Guan, Fangneng Zhan, Shijian Lu

    Abstract: Knowledge transfer from synthetic to real data has been widely studied to mitigate data annotation constraints in various computer vision tasks such as semantic segmentation. However, the study focused on 2D images and its counterpart in 3D point clouds segmentation lags far behind due to the lack of large-scale synthetic datasets and effective transfer methods. We address this issue by collecting…

    Submitted 1 December, 2021; v1 submitted 12 July, 2021; originally announced July 2021.

    Comments: Accepted by AAAI 2022

  30. arXiv:2106.02885  [pdf, other]

    cs.CV

    Category Contrast for Unsupervised Domain Adaptation in Visual Tasks

    Authors: Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu, Ling Shao

    Abstract: Instance contrast for unsupervised representation learning has achieved great success in recent years. In this work, we explore the idea of instance contrastive learning in unsupervised domain adaptation (UDA) and propose a novel Category Contrast technique (CaCo) that introduces semantic priors on top of instance discrimination for visual UDA tasks. By considering instance contrastive learning as…

    Submitted 17 March, 2022; v1 submitted 5 June, 2021; originally announced June 2021.

    Comments: CVPR2022 version

  31. arXiv:2106.02874  [pdf, other]

    cs.CV

    RDA: Robust Domain Adaptation via Fourier Adversarial Attacking

    Authors: Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu

    Abstract: Unsupervised domain adaptation (UDA) involves a supervised loss in a labeled source domain and an unsupervised loss in an unlabeled target domain, which often faces more severe overfitting (than classical supervised learning) as the supervised source loss has clear domain gap and the unsupervised target loss is often noisy due to the lack of annotations. This paper presents RDA, a robust domain ad…

    Submitted 15 August, 2021; v1 submitted 5 June, 2021; originally announced June 2021.

    Comments: Accepted to ICCV2021 (International Conference on Computer Vision)

  32. arXiv:2106.02845  [pdf, other]

    cs.CV

    Semi-Supervised Domain Adaptation via Adaptive and Progressive Feature Alignment

    Authors: Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu

    Abstract: Contemporary domain adaptive semantic segmentation aims to address data annotation challenges by assuming that target domains are completely unannotated. However, annotating a few target samples is usually very manageable and worthwhile especially if it improves the adaptation performance substantially. This paper presents SSDAS, a Semi-Supervised Domain Adaptive image Segmentation network that em…

    Submitted 5 June, 2021; originally announced June 2021.

  33. arXiv:2103.12991  [pdf, other]

    cs.CV

    MLAN: Multi-Level Adversarial Network for Domain Adaptive Semantic Segmentation

    Authors: Jiaxing Huang, Dayan Guan, Shijian Lu, Aoran Xiao

    Abstract: Recent progress in domain adaptive semantic segmentation demonstrates the effectiveness of adversarial learning (AL) in unsupervised domain adaptation. However, most adversarial learning based methods align source and target distributions at a global image level but neglect the inconsistency around local image regions. This paper presents a novel multi-level adversarial network (MLAN) that aims t…

    Submitted 4 June, 2022; v1 submitted 24 March, 2021; originally announced March 2021.

    Comments: Accepted to Pattern Recognition, 2022

  34. arXiv:2103.02584  [pdf, other]

    cs.CV

    Cross-View Regularization for Domain Adaptive Panoptic Segmentation

    Authors: Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu

    Abstract: Panoptic segmentation unifies semantic segmentation and instance segmentation which has been attracting increasing attention in recent years. However, most existing research was conducted under a supervised learning setup whereas unsupervised domain adaptive panoptic segmentation which is critical in different tasks and applications is largely neglected. We design a domain adaptive panoptic segmen…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: Accepted to CVPR 2021 as an Oral Presentation

  35. arXiv:2103.02370  [pdf, other]

    cs.CV

    FSDR: Frequency Space Domain Randomization for Domain Generalization

    Authors: Jiaxing Huang, Dayan Guan, Aoran Xiao, Shijian Lu

    Abstract: Domain generalization aims to learn a generalizable model from a known source domain for various unknown target domains. It has been studied widely by domain randomization that transfers source images to different styles in spatial space for learning domain-agnostic features. However, most existing randomization uses GANs that often lack controls and even alter semantic structures of images und…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: Accepted to CVPR2021

  36. FPS-Net: A Convolutional Fusion Network for Large-Scale LiDAR Point Cloud Segmentation

    Authors: Aoran Xiao, Xiaofei Yang, Shijian Lu, Dayan Guan, Jiaxing Huang

    Abstract: Scene understanding based on LiDAR point cloud is an essential task for autonomous cars to drive safely, which often employs spherical projection to map 3D point cloud into multi-channel 2D images for semantic segmentation. Most existing methods simply stack different point attributes/modalities (e.g. coordinates, intensity, depth, etc.) as image channels to increase information capacity, but igno…

    Submitted 28 February, 2021; originally announced March 2021.

    Comments: 20 pages, 7 figures

  37. arXiv:2103.00236  [pdf, other]

    cs.CV

    Uncertainty-Aware Unsupervised Domain Adaptation in Object Detection

    Authors: Dayan Guan, Jiaxing Huang, Aoran Xiao, Shijian Lu, Yanpeng Cao

    Abstract: Unsupervised domain adaptive object detection aims to adapt detectors from a labelled source domain to an unlabelled target domain. Most existing works take a two-stage strategy that first generates region proposals and then detects objects of interest, where adversarial learning is widely adopted to mitigate the inter-domain discrepancy in both stages. However, adversarial learning may impair the…

    Submitted 19 May, 2021; v1 submitted 27 February, 2021; originally announced March 2021.

    Comments: Accepted in the IEEE Transactions on Multimedia

  38. arXiv:2007.02424  [pdf, other]

    cs.CV

    Contextual-Relation Consistent Domain Adaptation for Semantic Segmentation

    Authors: Jiaxing Huang, Shijian Lu, Dayan Guan, Xiaobing Zhang

    Abstract: Recent advances in unsupervised domain adaptation for semantic segmentation have shown great potential to relieve the demand of expensive per-pixel annotations. However, most existing works address the domain discrepancy by aligning the data distributions of two domains at a global image level whereas the local consistencies are largely neglected. This paper presents an innovative local contextua…

    Submitted 15 July, 2020; v1 submitted 5 July, 2020; originally announced July 2020.

    Comments: Accepted to ECCV 2020

  39. arXiv:1904.03692  [pdf, other]

    cs.CV

    Unsupervised Domain Adaptation for Multispectral Pedestrian Detection

    Authors: Dayan Guan, Xing Luo, Yanpeng Cao, Jiangxin Yang, Yanlong Cao, George Vosselman, Michael Ying Yang

    Abstract: Multimodal information (e.g., visible and thermal) can generate robust pedestrian detections to facilitate around-the-clock computer vision applications, such as autonomous driving and video surveillance. However, it still remains a crucial challenge to train a reliable detector working well in different multispectral pedestrian datasets without manual annotations. In this paper, we propose a nove…

    Submitted 7 April, 2019; originally announced April 2019.

  40. arXiv:1902.05291  [pdf, other]

    cs.CV

    Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection

    Authors: Yanpeng Cao, Dayan Guan, Yulun Wu, Jiangxin Yang, Yanlong Cao, Michael Ying Yang

    Abstract: Effective fusion of complementary information captured by multi-modal sensors (visible and infrared cameras) enables robust pedestrian detection under various surveillance situations (e.g. daytime and nighttime). In this paper, we present a novel box-level segmentation supervised learning framework for accurate and real-time multispectral pedestrian detection by incorporating features extracted in…

    Submitted 14 February, 2019; originally announced February 2019.

  41. arXiv:1802.09972  [pdf, other]

    cs.CV

    Fusion of Multispectral Data Through Illumination-aware Deep Neural Networks for Pedestrian Detection

    Authors: Dayan Guan, Yanpeng Cao, Jun Liang, Yanlong Cao, Michael Ying Yang

    Abstract: Multispectral pedestrian detection has received extensive attention in recent years as a promising solution to facilitate robust human target detection for around-the-clock applications (e.g. security surveillance and autonomous driving). In this paper, we demonstrate illumination information encoded in multispectral images can be utilized to significantly boost performance of pedestrian detection…

    Submitted 27 February, 2018; originally announced February 2018.