Showing 1–38 of 38 results for author: Chan, K C

Searching in archive cs.
  1. arXiv:2503.24387  [pdf, other]

    cs.CV

    Consistent Subject Generation via Contrastive Instantiated Concepts

    Authors: Lee Hsin-Ying, Kelvin C. K. Chan, Ming-Hsuan Yang

    Abstract: While text-to-image generative models can synthesize diverse and faithful contents, subject variation across multiple creations limits the application in long content generation. Existing approaches require time-consuming tuning, references for all subjects, or access to other creations. We introduce Contrastive Concept Instantiation (CoCoIns) to effectively synthesize consistent subjects across m…

    Submitted 31 March, 2025; originally announced March 2025.

    Comments: Project page: https://contrastive-concept-instantiation.github.io

  2. arXiv:2411.18662  [pdf, other]

    cs.CV

    HoliSDiP: Image Super-Resolution via Holistic Semantics and Diffusion Prior

    Authors: Li-Yuan Tsao, Hao-Wei Chen, Hao-Wei Chung, Deqing Sun, Chun-Yi Lee, Kelvin C. K. Chan, Ming-Hsuan Yang

    Abstract: Text-to-image diffusion models have emerged as powerful priors for real-world image super-resolution (Real-ISR). However, existing methods may produce unintended results due to noisy text prompts and their lack of spatial information. In this paper, we present HoliSDiP, a framework that leverages semantic segmentation to provide both precise textual and spatial guidance for diffusion-based Real-IS…

    Submitted 27 November, 2024; originally announced November 2024.

    Comments: Project page: https://liyuantsao.github.io/HoliSDiP/

  3. arXiv:2410.11824  [pdf, other]

    cs.CV

    KITTEN: A Knowledge-Intensive Evaluation of Image Generation on Visual Entities

    Authors: Hsin-Ping Huang, Xinyi Wang, Yonatan Bitton, Hagai Taitelbaum, Gaurav Singh Tomar, Ming-Wei Chang, Xuhui Jia, Kelvin C. K. Chan, Hexiang Hu, Yu-Chuan Su, Ming-Hsuan Yang

    Abstract: Recent advancements in text-to-image generation have significantly enhanced the quality of synthesized images. Despite this progress, evaluations predominantly focus on aesthetic appeal or alignment with text prompts. Consequently, there is limited understanding of whether these models can accurately represent a wide variety of realistic visual entities - a task requiring real-world knowledge. To…

    Submitted 15 October, 2024; originally announced October 2024.

    Comments: Project page: https://kitten-project.github.io/

  4. arXiv:2410.11439  [pdf, other]

    cs.CV

    A Simple Approach to Unifying Diffusion-based Conditional Generation

    Authors: Xirui Li, Charles Herrmann, Kelvin C. K. Chan, Yinxiao Li, Deqing Sun, Chao Ma, Ming-Hsuan Yang

    Abstract: Recent progress in image generation has sparked research into controlling these models through condition signals, with various methods addressing specific challenges in conditional generation. Instead of proposing another specialized technique, we introduce a simple, unified framework to handle diverse conditional generation tasks involving a specific image-condition correlation. By learning a joi…

    Submitted 5 April, 2025; v1 submitted 15 October, 2024; originally announced October 2024.

    Comments: Project page: https://lixirui142.github.io/unicon-diffusion/

  5. arXiv:2408.09241  [pdf, other]

    cs.CV eess.IV

    Re-boosting Self-Collaboration Parallel Prompt GAN for Unsupervised Image Restoration

    Authors: Xin Lin, Yuyan Zhou, Jingtong Yue, Chao Ren, Kelvin C. K. Chan, Lu Qi, Ming-Hsuan Yang

    Abstract: Unsupervised restoration approaches based on generative adversarial networks (GANs) offer a promising solution without requiring paired datasets. Yet, these GAN-based approaches struggle to surpass the performance of conventional unsupervised GAN-based frameworks without significantly modifying model structures or increasing the computational complexity. To address these issues, we propose a self-…

    Submitted 17 August, 2024; originally announced August 2024.

    Comments: This paper is an extended and revised version of our previous work "Unsupervised Image Denoising in Real-World Scenarios via Self-Collaboration Parallel Generative Adversarial Branches" (https://openaccess.thecvf.com/content/ICCV2023/papers/Lin_Unsupervised_Image_Denoising_in_Real-World_Scenarios_via_Self-Collaboration_Parallel_Generative_ICCV_2023_paper.pdf)

  6. arXiv:2405.01356  [pdf, other]

    cs.CV

    Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance

    Authors: Kelvin C. K. Chan, Yang Zhao, Xuhui Jia, Ming-Hsuan Yang, Huisheng Wang

    Abstract: In subject-driven text-to-image synthesis, the synthesis process tends to be heavily influenced by the reference images provided by users, often overlooking crucial attributes detailed in the text prompt. In this work, we propose Subject-Agnostic Guidance (SAG), a simple yet effective solution to remedy the problem. We show that through constructing a subject-agnostic condition and applying our pr…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted to CVPR 2024

  7. arXiv:2404.11475  [pdf, other]

    cs.CV cs.AI

    AdaIR: Exploiting Underlying Similarities of Image Restoration Tasks with Adapters

    Authors: Hao-Wei Chen, Yu-Syuan Xu, Kelvin C. K. Chan, Hsien-Kai Kuo, Chun-Yi Lee, Ming-Hsuan Yang

    Abstract: Existing image restoration approaches typically employ extensive networks specifically trained for designated degradations. Despite being effective, such methods inevitably entail considerable storage costs and computational overheads due to the reliance on task-specific networks. In this work, we go beyond this well-established framework and exploit the inherent commonalities among image restorat…

    Submitted 17 April, 2024; originally announced April 2024.

  8. arXiv:2401.01952  [pdf, other]

    cs.CV cs.AI cs.CL

    Instruct-Imagen: Image Generation with Multi-modal Instruction

    Authors: Hexiang Hu, Kelvin C. K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, Kihyuk Sohn, Yang Zhao, Xue Ben, Boqing Gong, William Cohen, Ming-Wei Chang, Xuhui Jia

    Abstract: This paper presents instruct-imagen, a model that tackles heterogeneous image generation tasks and generalizes across unseen tasks. We introduce *multi-modal instruction* for image generation, a task representation articulating a range of generation intents with precision. It uses natural language to amalgamate disparate modalities (e.g., text, edge, style, subject, etc.), such that abundant gener…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: 20 pages, 18 figures

  9. arXiv:2312.03771  [pdf, other]

    cs.CV

    DreamInpainter: Text-Guided Subject-Driven Image Inpainting with Diffusion Models

    Authors: Shaoan Xie, Yang Zhao, Zhisheng Xiao, Kelvin C. K. Chan, Yandong Li, Yanwu Xu, Kun Zhang, Tingbo Hou

    Abstract: This study introduces Text-Guided Subject-Driven Image Inpainting, a novel task that combines text and exemplar images for image inpainting. While both text and exemplar images have been used independently in previous efforts, their combined utilization remains unexplored. Simultaneously accommodating both conditions poses a significant challenge due to the inherent balance required between editab…

    Submitted 5 December, 2023; originally announced December 2023.

  10. arXiv:2312.01734  [pdf, other]

    cs.CV

    Effective Adapter for Face Recognition in the Wild

    Authors: Yunhao Liu, Yu-Ju Tsai, Kelvin C. K. Chan, Xiangtai Li, Lu Qi, Ming-Hsuan Yang

    Abstract: In this paper, we tackle the challenge of face recognition in the wild, where images often suffer from low quality and real-world distortions. Traditional heuristic approaches-either training models directly on these degraded images or their enhanced counterparts using face restoration techniques-have proven ineffective, primarily due to the degradation of facial features and the discrepancy in im…

    Submitted 3 April, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

  11. arXiv:2312.01677  [pdf, other]

    cs.CV

    Multi-task Image Restoration Guided By Robust DINO Features

    Authors: Xin Lin, Jingtong Yue, Kelvin C. K. Chan, Lu Qi, Chao Ren, Jinshan Pan, Ming-Hsuan Yang

    Abstract: Multi-task image restoration has gained significant interest due to its inherent versatility and efficiency compared to its single-task counterpart. However, performance decline is observed with an increase in the number of tasks, primarily attributed to the restoration model's challenge in handling different tasks with distinct natures at the same time. Thus, a perspective emerged aiming to explo…

    Submitted 16 August, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

  12. arXiv:2309.03897  [pdf, other]

    cs.CV

    ProPainter: Improving Propagation and Transformer for Video Inpainting

    Authors: Shangchen Zhou, Chongyi Li, Kelvin C. K. Chan, Chen Change Loy

    Abstract: Flow-based propagation and spatiotemporal Transformer are two mainstream mechanisms in video inpainting (VI). Despite the effectiveness of these components, they still suffer from some limitations that affect their performance. Previous propagation-based approaches are performed separately either in the image or feature domain. Global image propagation isolated from learning may cause spatial misa…

    Submitted 7 September, 2023; originally announced September 2023.

    Comments: Accepted by ICCV 2023. Code: https://github.com/sczhou/ProPainter

  13. arXiv:2308.07314  [pdf, other]

    cs.CV

    Dual Associated Encoder for Face Restoration

    Authors: Yu-Ju Tsai, Yu-Lun Liu, Lu Qi, Kelvin C. K. Chan, Ming-Hsuan Yang

    Abstract: Restoring facial details from low-quality (LQ) images has remained a challenging problem due to its ill-posedness induced by various degradations in the wild. The existing codebook prior mitigates the ill-posedness by leveraging an autoencoder and learned codebook of high-quality (HQ) features, achieving remarkable quality. However, existing approaches in this paradigm frequently depend on a singl…

    Submitted 20 January, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

    Comments: ICLR 2024, Project page: https://liagm.github.io/DAEFR/

  14. arXiv:2305.07015  [pdf, other]

    cs.CV

    Exploiting Diffusion Prior for Real-World Image Super-Resolution

    Authors: Jianyi Wang, Zongsheng Yue, Shangchen Zhou, Kelvin C. K. Chan, Chen Change Loy

    Abstract: We present a novel approach to leverage prior knowledge encapsulated in pre-trained text-to-image diffusion models for blind super-resolution (SR). Specifically, by employing our time-aware encoder, we can achieve promising restoration results without altering the pre-trained synthesis model, thereby preserving the generative prior and minimizing training cost. To remedy the loss of fidelity cause…

    Submitted 28 June, 2024; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: Accepted by IJCV'2024. Some Figs are compressed due to size limits. Uncompressed ver.: https://github.com/IceClear/StableSR/releases/download/UncompressedPDF/StableSR_IJCV_Uncompressed.pdf. Project page: https://iceclear.github.io/projects/stablesr/

  15. arXiv:2304.10530  [pdf, other]

    cs.CV

    Collaborative Diffusion for Multi-Modal Face Generation and Editing

    Authors: Ziqi Huang, Kelvin C. K. Chan, Yuming Jiang, Ziwei Liu

    Abstract: Diffusion models arise as a powerful generative tool recently. Despite the great progress, existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is driven by only one modality of condition. To further unleash the users' creativity, it is desirable for the model to be controllable by multiple modalities simultaneously, e.g., generating and editing faces by describ…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: CVPR 2023. Project page: https://ziqihuangg.github.io/projects/collaborative-diffusion.html Code: https://github.com/ziqihuangg/Collaborative-Diffusion

  16. arXiv:2304.07429  [pdf, other]

    cs.CV

    Identity Encoder for Personalized Diffusion

    Authors: Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia

    Abstract: Many applications can benefit from personalized image generation models, including image enhancement, video conferences, just to name a few. Existing works achieved personalization by fine-tuning one model for each person. While being successful, this approach incurs additional computation and storage overhead for each new identity. Furthermore, it usually expects tens or hundreds of examples per…

    Submitted 14 April, 2023; originally announced April 2023.

  17. arXiv:2304.02642  [pdf, other]

    cs.CV

    Taming Encoder for Zero Fine-tuning Image Customization with Text-to-Image Diffusion Models

    Authors: Xuhui Jia, Yang Zhao, Kelvin C. K. Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, Yu-Chuan Su

    Abstract: This paper proposes a method for generating images of customized objects specified by users. The method is based on a general framework that bypasses the lengthy optimization required by previous approaches, which often employ a per-object optimization paradigm. Our framework adopts an encoder to capture high-level identifiable semantics of objects, producing an object-specific embedding with only…

    Submitted 5 April, 2023; originally announced April 2023.

  18. arXiv:2303.13495  [pdf, other]

    cs.CV

    ReVersion: Diffusion-Based Relation Inversion from Images

    Authors: Ziqi Huang, Tianxing Wu, Yuming Jiang, Kelvin C. K. Chan, Ziwei Liu

    Abstract: Diffusion models gain increasing popularity for their generative capabilities. Recently, there have been surging needs to generate customized images by inverting diffusion models from exemplar images, and existing inversion methods mainly focus on capturing object appearances (i.e., the "look"). However, how to invert object relations, another important pillar in the visual world, remains unexplor…

    Submitted 1 December, 2024; v1 submitted 23 March, 2023; originally announced March 2023.

    Comments: SIGGRAPH Asia (Conference Track) 2024, Project page: https://ziqihuangg.github.io/projects/reversion.html Code: https://github.com/ziqihuangg/ReVersion

  19. arXiv:2212.09581  [pdf, other]

    cs.CV

    Reference-based Image and Video Super-Resolution via C2-Matching

    Authors: Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu

    Abstract: Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm to enhance a low-resolution (LR) input image or video by introducing an additional high-resolution (HR) reference image. Existing Ref-SR methods mostly rely on implicit correspondence matching to borrow HR textures from reference images to compensate for the information loss in input images. However, performing…

    Submitted 19 March, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). arXiv admin note: substantial text overlap with arXiv:2106.01863

  20. arXiv:2207.14812  [pdf, other]

    cs.CV

    GLEAN: Generative Latent Bank for Image Super-Resolution and Beyond

    Authors: Kelvin C. K. Chan, Xiangyu Xu, Xintao Wang, Jinwei Gu, Chen Change Loy

    Abstract: We show that pre-trained Generative Adversarial Networks (GANs) such as StyleGAN and BigGAN can be used as a latent bank to improve the performance of image super-resolution. While most existing perceptual-oriented approaches attempt to generate realistic outputs through learning with adversarial loss, our method, Generative LatEnt bANk (GLEAN), goes beyond existing practices by directly leveragin…

    Submitted 29 July, 2022; originally announced July 2022.

    Comments: Accepted to TPAMI. Extension of our CVPR 2021 version: https://openaccess.thecvf.com/content/CVPR2021/html/Chan_GLEAN_Generative_Latent_Bank_for_Large-Factor_Image_Super-Resolution_CVPR_2021_paper.html?ref=https://githubhelp.com. arXiv admin note: text overlap with arXiv:2012.00739

  21. arXiv:2207.12396  [pdf, other]

    cs.CV

    Exploring CLIP for Assessing the Look and Feel of Images

    Authors: Jianyi Wang, Kelvin C. K. Chan, Chen Change Loy

    Abstract: Measuring the perception of visual content is a long-standing problem in computer vision. Many mathematical models have been developed to evaluate the look or quality of an image. Despite the effectiveness of such tools in quantifying degradations such as noise and blurriness levels, such quantification is loosely coupled with human language. When it comes to more abstract perception about the fee…

    Submitted 23 November, 2022; v1 submitted 25 July, 2022; originally announced July 2022.

    Comments: Accepted by AAAI2023. Code: https://github.com/IceClear/CLIP-IQA

  22. arXiv:2206.11253  [pdf, other]

    cs.CV

    Towards Robust Blind Face Restoration with Codebook Lookup Transformer

    Authors: Shangchen Zhou, Kelvin C. K. Chan, Chongyi Li, Chen Change Loy

    Abstract: Blind face restoration is a highly ill-posed problem that often requires auxiliary guidance to 1) improve the mapping from degraded inputs to desired outputs, or 2) complement high-quality details lost in the inputs. In this paper, we demonstrate that a learned discrete codebook prior in a small proxy space largely reduces the uncertainty and ambiguity of restoration mapping by casting blind face…

    Submitted 31 October, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

    Comments: Accepted by NeurIPS 2022. Code: https://github.com/sczhou/CodeFormer

  23. arXiv:2204.05308  [pdf, other]

    cs.CV

    On the Generalization of BasicVSR++ to Video Deblurring and Denoising

    Authors: Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy

    Abstract: The exploitation of long-term information has been a long-standing problem in video restoration. The recent BasicVSR and BasicVSR++ have shown remarkable performance in video super-resolution through long-term propagation and effective alignment. Their success has led to a question of whether they can be transferred to different video restoration tasks. In this work, we extend BasicVSR++ to a gene…

    Submitted 18 June, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: Technical report. Extension of arXiv:2104.13371

  24. arXiv:2111.12704  [pdf, other]

    cs.CV

    Investigating Tradeoffs in Real-World Video Super-Resolution

    Authors: Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy

    Abstract: The diversity and complexity of degradations in real-world video super-resolution (VSR) pose non-trivial challenges in inference and training. First, while long-term propagation leads to improved performance in cases of mild degradations, severe in-the-wild degradations could be exaggerated through propagation, impairing output quality. To balance the tradeoff between detail synthesis and artifact…

    Submitted 24 November, 2021; originally announced November 2021.

    Comments: Tech report, 14 pages, 14 figures. Code can be found at https://github.com/ckkelvinchan/RealBasicVSR

  25. arXiv:2110.04562  [pdf, other]

    cs.CV eess.IV

    Temporally Consistent Video Colorization with Deep Feature Propagation and Self-regularization Learning

    Authors: Yihao Liu, Hengyuan Zhao, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Yu Qiao, Chao Dong

    Abstract: Video colorization is a challenging and highly ill-posed problem. Although recent years have witnessed remarkable progress in single image colorization, there is relatively less research effort on video colorization and existing methods always suffer from severe flickering artifacts (temporal inconsistency) or unsatisfying colorization performance. We address this problem from a new perspective, b…

    Submitted 9 October, 2021; originally announced October 2021.

    Comments: 13 pages, 10 figures

  26. arXiv:2109.08239  [pdf, other]

    cs.CV cs.CG

    A computationally efficient framework for vector representation of persistence diagrams

    Authors: Kit C. Chan, Umar Islambekov, Alexey Luchinsky, Rebecca Sanders

    Abstract: In Topological Data Analysis, a common way of quantifying the shape of data is to use a persistence diagram (PD). PDs are multisets of points in $\mathbb{R}^2$ computed using tools of algebraic topology. However, this multi-set structure limits the utility of PDs in applications. Therefore, in recent years efforts have been directed towards extracting informative and efficient summaries from PDs t…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: 28 pages, 17 figures

  27. arXiv:2106.05850  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Matrix Completion with Model-free Weighting

    Authors: Jiayi Wang, Raymond K. W. Wong, Xiaojun Mao, Kwun Chuen Gary Chan

    Abstract: In this paper, we propose a novel method for matrix completion under general non-uniform missing structures. By controlling an upper bound of a novel balancing error, we construct weights that can actively adjust for the non-uniformity in the empirical risk without explicitly modeling the observation probabilities, and can be computed efficiently via convex optimization. The recovered matrix based…

    Submitted 9 June, 2021; originally announced June 2021.

    Comments: Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021

  28. arXiv:2106.01863  [pdf, other]

    cs.CV cs.LG eess.IV

    Robust Reference-based Super-Resolution via C2-Matching

    Authors: Yuming Jiang, Kelvin C. K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu

    Abstract: Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm to enhance a low-resolution (LR) input image by introducing an additional high-resolution (HR) reference image. Existing Ref-SR methods mostly rely on implicit correspondence matching to borrow HR textures from reference images to compensate for the information loss in input images. However, performing local tra…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: To appear in CVPR2021. The source code is available at https://github.com/yumingj/C2-Matching

  29. arXiv:2104.13371  [pdf, other]

    cs.CV

    BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment

    Authors: Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy

    Abstract: A recurrent structure is a popular framework choice for the task of video super-resolution. The state-of-the-art method BasicVSR adopts bidirectional propagation with feature alignment to effectively exploit information from the entire input video. In this study, we redesign BasicVSR by proposing second-order grid propagation and flow-guided deformable alignment. We show that by empowering the rec…

    Submitted 27 April, 2021; originally announced April 2021.

    Comments: 3 champions and 1 runner-up in NTIRE 2021

  30. arXiv:2104.10781  [pdf, other]

    eess.IV cs.CV

    NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results

    Authors: Ren Yang, Radu Timofte, Jing Liu, Yi Xu, Xinjian Zhang, Minyi Zhao, Shuigeng Zhou, Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy, Xin Li, Fanglong Liu, He Zheng, Lielin Jiang, Qi Zhang, Dongliang He, Fu Li, Qingqing Dang, Yibin Huang, Matteo Maggioni, Zhongqian Fu, Shuai Xiao, Cheng li, Thomas Tanay, et al. (47 additional authors not shown)

    Abstract: This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results. In this challenge, the new Large-scale Diverse Video (LDV) dataset is employed. The challenge has three tracks. Tracks 1 and 2 aim at enhancing the videos compressed by HEVC at a fixed QP, while Track 3 is designed for enhancing the videos compressed by x265 at…

    Submitted 31 August, 2022; v1 submitted 21 April, 2021; originally announced April 2021.

    Comments: Corrected the MOS values in Table 2, and corrected some minor typos

  31. arXiv:2012.02181  [pdf, other]

    cs.CV

    BasicVSR: The Search for Essential Components in Video Super-Resolution and Beyond

    Authors: Kelvin C. K. Chan, Xintao Wang, Ke Yu, Chao Dong, Chen Change Loy

    Abstract: Video super-resolution (VSR) approaches tend to have more components than the image counterparts as they need to exploit the additional temporal dimension. Complex designs are not uncommon. In this study, we wish to untangle the knots and reconsider some most essential components for VSR guided by four basic functionalities, i.e., Propagation, Alignment, Aggregation, and Upsampling. By reusing som…

    Submitted 7 April, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: CVPR 2021 camera-ready

  32. arXiv:2012.00739  [pdf, other]

    cs.CV

    GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution

    Authors: Kelvin C. K. Chan, Xintao Wang, Xiangyu Xu, Jinwei Gu, Chen Change Loy

    Abstract: We show that pre-trained Generative Adversarial Networks (GANs), e.g., StyleGAN, can be used as a latent bank to improve the restoration quality of large-factor image super-resolution (SR). While most existing SR approaches attempt to generate realistic textures through learning with adversarial loss, our method, Generative LatEnt bANk (GLEAN), goes beyond existing practices by directly leveraging…

    Submitted 1 December, 2020; originally announced December 2020.

    Comments: Tech report, 19 pages, 19 figures. A high-resolution version of this paper can be found at https://ckkelvinchan.github.io/

  33. arXiv:2009.07265  [pdf, other]

    cs.CV

    Understanding Deformable Alignment in Video Super-Resolution

    Authors: Kelvin C. K. Chan, Xintao Wang, Ke Yu, Chao Dong, Chen Change Loy

    Abstract: Deformable convolution, originally proposed for the adaptation to geometric variations of objects, has recently shown compelling performance in aligning multiple frames and is increasingly adopted for video super-resolution. Despite its remarkable performance, its underlying mechanism for alignment remains unclear. In this study, we carefully investigate the relation between deformable alignment a…

    Submitted 15 September, 2020; originally announced September 2020.

    Comments: Tech report, 15 pages, 19 figures

  34. arXiv:2006.11408  [pdf, other]

    cs.CV cs.LG stat.ML

    Quasi-conformal Geometry based Local Deformation Analysis of Lateral Cephalogram for Childhood OSA Classification

    Authors: Hei-Long Chan, Hoi-Man Yuen, Chun-Ting Au, Kate Ching-Ching Chan, Albert Martin Li, Lok-Ming Lui

    Abstract: Craniofacial profile is one of the anatomical causes of obstructive sleep apnea(OSA). By medical research, cephalometry provides information on patients' skeletal structures and soft tissues. In this work, a novel approach to cephalometric analysis using quasi-conformal geometry based local deformation information was proposed for OSA classification. Our study was a retrospective analysis based on…

    Submitted 31 May, 2020; originally announced June 2020.

  35. arXiv:2004.13979  [pdf]

    cs.CV cs.LG eess.IV

    Skeleton Focused Human Activity Recognition in RGB Video

    Authors: Bruce X. B. Yu, Yan Liu, Keith C. C. Chan

    Abstract: The data-driven approach that learns an optimal representation of vision features like skeleton frames or RGB videos is currently a dominant paradigm for activity recognition. While great improvements have been achieved from existing single modal approaches with increasingly larger datasets, the fusion of various data modalities at the feature level has seldom been attempted. In this paper, we pro…

    Submitted 29 April, 2020; originally announced April 2020.

    Comments: 8 pages

  36. arXiv:2004.13977  [pdf]

    cs.CV cs.LG eess.IV

    Effective Human Activity Recognition Based on Small Datasets

    Authors: Bruce X. B. Yu, Yan Liu, Keith C. C. Chan

    Abstract: Most recent work on vision-based human activity recognition (HAR) focuses on designing complex deep learning models for the task. In so doing, there is a requirement for large datasets to be collected. As acquiring and processing large training datasets are usually very expensive, the problem of how dataset size can be reduced without affecting recognition accuracy has to be tackled. To do so, we…

    Submitted 29 April, 2020; originally announced April 2020.

    Comments: 7 pages

  37. arXiv:1905.02716  [pdf, other]

    cs.CV

    EDVR: Video Restoration with Enhanced Deformable Convolutional Networks

    Authors: Xintao Wang, Kelvin C. K. Chan, Ke Yu, Chao Dong, Chen Change Loy

    Abstract: Video restoration tasks, including super-resolution, deblurring, etc, are drawing increasing attention in the computer vision community. A challenging benchmark named REDS is released in the NTIRE19 Challenge. This new benchmark challenges existing methods from two aspects: (1) how to align multiple frames given large motions, and (2) how to effectively fuse different frames with diverse motion an…

    Submitted 7 May, 2019; originally announced May 2019.

    Comments: To appear in CVPR 2019 Workshop. The winners in all four tracks in the NTIRE 2019 video restoration and enhancement challenges. Project page: https://xinntao.github.io/projects/EDVR , Code: https://github.com/xinntao/EDVR

  38. arXiv:1510.00482  [pdf]

    cs.NI cs.CY

    Integration of physical equipment and simulators for on-campus and online delivery of practical networking labs

    Authors: Ka Ching Chan

    Abstract: This paper presents the design and development of a networking laboratory that integrates physical networking equipment with the open source GNS3 network simulators for delivery of practical networking classes simultaneously to both on-campus and online students. This transformation work has resulted in significant increase in laboratory capacity, reducing repeating classes. The integrated platfor…

    Submitted 11 October, 2015; v1 submitted 1 October, 2015; originally announced October 2015.

    Comments: 10 pages, 6 figures, 2 tables

    Report number: Technical Report CSIT 20151002