Showing 1–43 of 43 results for author: Cheng, K

Searching in archive eess.
  1. arXiv:2506.20231  [pdf, ps, other]

    eess.SP

    Sensing-Aware Transmit Waveform/Receive Filter Design for OFDM-MBS Systems

    Authors: Xinghe Li, Kainan Cheng, Hongzhi Guo, Huiyong Li, Ziyang Cheng

    Abstract: In this letter, we study the problem of cooperative sensing design for an orthogonal frequency division multiplexing (OFDM) multiple base stations (MBS) system. We consider a practical scenario where the base stations (BSs) exploit certain subcarriers to realize a sensing function. Since the high sidelobe level (SLL) of OFDM waveforms degrades radar detection for weak targets, and the cross-correl…

    Submitted 30 June, 2025; v1 submitted 25 June, 2025; originally announced June 2025.

  2. arXiv:2506.19199  [pdf, ps, other]

    eess.SY cs.MA cs.RO

    Low-Cost Infrastructure-Free 3D Relative Localization with Sub-Meter Accuracy in Near Field

    Authors: Qiangsheng Gao, Ka Ho Cheng, Li Qiu, Zijun Gong

    Abstract: Relative localization in the near-field scenario is critically important for unmanned vehicle (UxV) applications. Although related works addressing 2D relative localization problem have been widely studied for unmanned ground vehicles (UGVs), the problem in 3D scenarios for unmanned aerial vehicles (UAVs) involves more uncertainties and remains to be investigated. Inspired by the phenomenon that a…

    Submitted 23 June, 2025; originally announced June 2025.

  3. arXiv:2505.04652  [pdf, other]

    eess.IV cs.CV

    Rethinking Boundary Detection in Deep Learning-Based Medical Image Segmentation

    Authors: Yi Lin, Dong Zhang, Xiao Fang, Yufan Chen, Kwang-Ting Cheng, Hao Chen

    Abstract: Medical image segmentation is a pivotal task within the realms of medical image analysis and computer vision. While current methods have shown promise in accurately segmenting major regions of interest, the precise segmentation of boundary areas remains challenging. In this study, we propose a novel network architecture named CTO, which combines Convolutional Neural Networks (CNNs), Vision Transfo…

    Submitted 6 May, 2025; originally announced May 2025.

    Comments: Accepted by Medical Image Analysis

  4. arXiv:2502.14422  [pdf, other]

    eess.SY

    Towards Routing and Edge Computing in Satellite-Terrestrial Networks: A Column Generation Approach

    Authors: Yuan Liao, Kan Cheng, Fan Lu, Hao Jin, Zhaohui Yang

    Abstract: Edge computing that enables satellites to process raw data locally is expected to bring further timeliness and flexibility to satellite-terrestrial networks (STNs). In this letter, we propose a three-layer edge computing protocol, where raw data collected by the satellites can be processed locally, or transmitted to other satellites or the ground station via multi-hop routing for further processin…

    Submitted 23 April, 2025; v1 submitted 20 February, 2025; originally announced February 2025.

  5. arXiv:2502.00687  [pdf, other]

    cs.AR eess.SY

    A Flexible Precision Scaling Deep Neural Network Accelerator with Efficient Weight Combination

    Authors: Liang Zhao, Kunming Shao, Fengshi Tian, Tim Kwang-Ting Cheng, Chi-Ying Tsui, Yi Zou

    Abstract: Deploying mixed-precision neural networks on edge devices is friendly to hardware resources and power consumption. To support fully mixed-precision neural network inference, it is necessary to design flexible hardware accelerators for continuous varying precision operations. However, the previous works have issues on hardware utilization and overhead of reconfigurable logic. In this paper, we prop…

    Submitted 2 February, 2025; originally announced February 2025.

    Comments: Accepted by 2025 IEEE International Symposium on Circuits and Systems (ISCAS)

  6. arXiv:2501.12385  [pdf, other]

    cs.SD cs.LG eess.AS

    Audio Texture Manipulation by Exemplar-Based Analogy

    Authors: Kan Jen Cheng, Tingle Li, Gopala Anumanchipalli

    Abstract: Audio texture manipulation involves modifying the perceptual characteristics of a sound to achieve specific transformations, such as adding, removing, or replacing auditory elements. In this paper, we propose an exemplar-based analogy model for audio texture manipulation. Instead of conditioning on text-based instructions, our method uses paired speech examples, where one clip represents the origi…

    Submitted 21 January, 2025; originally announced January 2025.

    Comments: ICASSP 2025

  7. arXiv:2412.15322  [pdf, other]

    cs.CV cs.LG cs.SD eess.AS

    MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

    Authors: Ho Kei Cheng, Masato Ishii, Akio Hayakawa, Takashi Shibuya, Alexander Schwing, Yuki Mitsufuji

    Abstract: We propose to synthesize high-quality and synchronized audio, given video and optional text conditions, using a novel multimodal joint training framework MMAudio. In contrast to single-modality training conditioned on (limited) video data only, MMAudio is jointly trained with larger-scale, readily available text-audio data to learn to generate semantically aligned high-quality audio samples. Addit…

    Submitted 7 April, 2025; v1 submitted 19 December, 2024; originally announced December 2024.

    Comments: Accepted to CVPR 2025. Project page: https://hkchengrex.github.io/MMAudio

  8. arXiv:2407.18449  [pdf, other]

    eess.IV cs.CV cs.LG

    Towards A Generalizable Pathology Foundation Model via Unified Knowledge Distillation

    Authors: Jiabo Ma, Zhengrui Guo, Fengtao Zhou, Yihui Wang, Yingxue Xu, Jinbang Li, Fang Yan, Yu Cai, Zhengjie Zhu, Cheng Jin, Yi Lin, Xinrui Jiang, Chenglong Zhao, Danyi Li, Anjia Han, Zhenhui Li, Ronald Cheong Kin Chan, Jiguang Wang, Peng Fei, Kwang-Ting Cheng, Shaoting Zhang, Li Liang, Hao Chen

    Abstract: Foundation models pretrained on large-scale datasets are revolutionizing the field of computational pathology (CPath). The generalization ability of foundation models is crucial for the success in various downstream clinical tasks. However, current foundation models have only been evaluated on a limited type and number of tasks, leaving their generalization ability and overall performance unclear.…

    Submitted 14 April, 2025; v1 submitted 25 July, 2024; originally announced July 2024.

    Comments: update

    Report number: I.2.10

  9. A Deep Learning-Augmented Stand-off Radar Scheme for Rapidly Detecting Tree Defects

    Authors: Jiwei Qian, Yee Hui Lee, Kaixuan Cheng, Qiqi Dai, Mohamed Lokman Mohd Yusof, Daryl Lee, Abdulkadir C. Yucel

    Abstract: Tree defect detection is crucial for the structural health screening of trees. Existing nondestructive testing (NDT) techniques for tree defect detection require time-consuming and labor-intensive measurement campaigns. This discourages their application for the routine structural health screening of whole populations of managed urban trees. To address this issue, this study proposes a deep-learni…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: Accepted and to be published in IEEE Transactions on Geoscience and Remote Sensing

  10. arXiv:2310.03443  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    The North System for Formosa Speech Recognition Challenge 2023

    Authors: Li-Wei Chen, Kai-Chen Cheng, Hung-Shin Lee

    Abstract: This report provides a concise overview of the proposed North system, which aims to achieve automatic word/syllable recognition for Taiwanese Hakka (Sixian). The report outlines three key components of the system: the acquisition, composition, and utilization of the training data; the architecture of the model; and the hardware specifications and operational statistics. The demonstration of the sy…

    Submitted 5 October, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

  11. Bi-Modality Medical Image Synthesis Using Semi-Supervised Sequential Generative Adversarial Networks

    Authors: Xin Yang, Yi Lin, Zhiwei Wang, Xin Li, Kwang-Ting Cheng

    Abstract: In this paper, we propose a bi-modality medical image synthesis approach based on sequential generative adversarial network (GAN) and semi-supervised learning. Our approach consists of two generative modules that synthesize images of the two modalities in a sequential order. A method for measuring the synthesis complexity is proposed to automatically determine the synthesis order in our sequential…

    Submitted 29 August, 2023; v1 submitted 27 August, 2023; originally announced August 2023.

  12. arXiv:2307.05944  [pdf]

    cs.AR cs.NE eess.SP

    A 137.5 TOPS/W SRAM Compute-in-Memory Macro with 9-b Memory Cell-Embedded ADCs and Signal Margin Enhancement Techniques for AI Edge Applications

    Authors: Xiaomeng Wang, Fengshi Tian, Xizi Chen, Jiakun Zheng, Xuejiao Liu, Fengbin Tu, Jie Yang, Mohamad Sawan, Kwang-Ting Cheng, Chi-Ying Tsui

    Abstract: In this paper, we propose a high-precision SRAM-based CIM macro that can perform 4x4-bit MAC operations and yield 9-bit signed output. The inherent discharge branches of SRAM cells are utilized to apply time-modulated MAC and 9-bit ADC readout operations on two bit-line capacitors. The same principle is used for both MAC and A-to-D conversion ensuring high linearity and thus supporting large numbe…

    Submitted 19 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: Submitted to IEEE ASSCC 2023

  13. arXiv:2304.06662  [pdf, other]

    eess.IV cs.CV

    Deep Learning in Breast Cancer Imaging: A Decade of Progress and Future Directions

    Authors: Luyang Luo, Xi Wang, Yi Lin, Xiaoqi Ma, Andong Tan, Ronald Chan, Varut Vardhanabhuti, Winnie CW Chu, Kwang-Ting Cheng, Hao Chen

    Abstract: Breast cancer has reached the highest incidence rate worldwide among all malignancies since 2020. Breast imaging plays a significant role in early diagnosis and intervention to improve the outcome of breast cancer patients. In the past decade, deep learning has shown remarkable progress in breast cancer imaging analysis, holding great promise in interpreting the rich information and complex contex…

    Submitted 20 January, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: IEEE RBME 2024

  14. arXiv:2304.01064  [pdf, other]

    cs.CV eess.IV

    Real-time 6K Image Rescaling with Rate-distortion Optimization

    Authors: Chenyang Qi, Xin Yang, Ka Leong Cheng, Ying-Cong Chen, Qifeng Chen

    Abstract: Contemporary image rescaling aims at embedding a high-resolution (HR) image into a low-resolution (LR) thumbnail image that contains embedded information for HR image reconstruction. Unlike traditional image super-resolution, this enables high-fidelity HR image restoration faithful to the original one, given the embedded information in the LR thumbnail. However, state-of-the-art image rescaling me…

    Submitted 19 May, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: Accepted by CVPR 2023; Github Repository: https://github.com/AbnerVictor/HyperThumbnail

  15. arXiv:2303.13111  [pdf, other]

    eess.IV cs.CV

    Boosting Convolution with Efficient MLP-Permutation for Volumetric Medical Image Segmentation

    Authors: Yi Lin, Xiao Fang, Dong Zhang, Kwang-Ting Cheng, Hao Chen

    Abstract: Recently, the advent of vision Transformer (ViT) has brought substantial advancements in 3D dataset benchmarks, particularly in 3D volumetric medical image segmentation (Vol-MedSeg). Concurrently, multi-layer perceptron (MLP) network has regained popularity among researchers due to their comparable results to ViT, albeit with the exclusion of the resource-intensive self-attention module. In this w…

    Submitted 26 May, 2025; v1 submitted 23 March, 2023; originally announced March 2023.

  16. arXiv:2303.06807  [pdf, other]

    eess.IV cs.CV

    Vessel-Promoted OCT to OCTA Image Translation by Heuristic Contextual Constraints

    Authors: Shuhan Li, Dong Zhang, Xiaomeng Li, Chubin Ou, Lin An, Yanwu Xu, Kwang-Ting Cheng

    Abstract: Optical Coherence Tomography Angiography (OCTA) is a crucial tool in the clinical screening of retinal diseases, allowing for accurate 3D imaging of blood vessels through non-invasive scanning. However, the hardware-based approach for acquiring OCTA images presents challenges due to the need for specialized sensors and expensive devices. In this paper, we introduce a novel method called TransPro,…

    Submitted 21 August, 2024; v1 submitted 12 March, 2023; originally announced March 2023.

    Comments: Accepted by Medical Image Analysis

  17. arXiv:2210.08004  [pdf, other]

    physics.optics cs.CV eess.IV

    Misaligned orientations of 4f optical neural network for image classification accuracy on various datasets

    Authors: Yanbing Liu, Wei Li, Kun Cheng, Xun Liu, Wei Yang

    Abstract: In recent years, the optical 4f system has drawn much attention in building high-speed and ultra-low-power optical neural networks (ONNs). Most optical systems suffer from the misalignment of the optical devices during installment. The performance of ONN based on the optical 4f system (4f-ONN) is considered sensitive to the misalignment in the optical path introduced. In order to comprehensively i…

    Submitted 5 October, 2022; originally announced October 2022.

  18. Nonparametric and Regularized Dynamical Wasserstein Barycenters for Sequential Observations

    Authors: Kevin C. Cheng, Shuchin Aeron, Michael C. Hughes, Eric L. Miller

    Abstract: We consider probabilistic models for sequential observations which exhibit gradual transitions among a finite number of states. We are particularly motivated by applications such as human activity analysis where observed accelerometer time series contains segments representing distinct activities, which we call pure states, as well as periods characterized by continuous transition among these pure…

    Submitted 21 September, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

    Journal ref: IEEE Transactions on Signal Processing (2023), volume 71, pages 3164 - 3178

  19. arXiv:2209.12373  [pdf, other]

    eess.SY

    Carbon-Aware EV Charging

    Authors: Kai-Wen Cheng, Yuexin Bian, Yuanyuan Shi, Yize Chen

    Abstract: This paper examines the problem of optimizing the charging pattern of electric vehicles (EV) by taking real-time electricity grid carbon intensity into consideration. The objective of the proposed charging scheme is to minimize the carbon emissions contributed by EV charging events, while simultaneously satisfying constraints posed by EV user's charging schedules, charging station transformer limi…

    Submitted 25 September, 2022; originally announced September 2022.

    Comments: Accepted at SmartGridComm 2022; First two authors contribute equally

  20. arXiv:2207.10869  [pdf, other]

    eess.IV cs.CV

    Optimizing Image Compression via Joint Learning with Denoising

    Authors: Ka Leong Cheng, Yueqi Xie, Qifeng Chen

    Abstract: High levels of noise usually exist in today's captured images due to the relatively small sensors equipped in the smartphone cameras, where the noise brings extra challenges to lossy image compression algorithms. Without the capacity to tell the difference between image details and noise, general image compression methods allocate additional bits to explicitly store the undesired image noise durin…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022

  21. arXiv:2207.02946  [pdf]

    eess.IV cs.CV cs.LG

    Virtual staining of defocused autofluorescence images of unlabeled tissue using deep neural networks

    Authors: Yijie Zhang, Luzhe Huang, Tairan Liu, Keyi Cheng, Kevin de Haan, Yuzhu Li, Bijie Bai, Aydogan Ozcan

    Abstract: Deep learning-based virtual staining was developed to introduce image contrast to label-free tissue sections, digitally matching the histological staining, which is time-consuming, labor-intensive, and destructive to tissue. Standard virtual staining requires high autofocusing precision during the whole slide imaging of label-free tissue, which consumes a significant portion of the total imaging t…

    Submitted 6 July, 2022; originally announced July 2022.

    Comments: 26 Pages, 5 Figures

    Journal ref: Intelligent Computing (2022)

  22. arXiv:2206.15134  [pdf, other]

    eess.IV cs.CV

    InsMix: Towards Realistic Generative Data Augmentation for Nuclei Instance Segmentation

    Authors: Yi Lin, Zeyu Wang, Kwang-Ting Cheng, Hao Chen

    Abstract: Nuclei Segmentation from histology images is a fundamental task in digital pathology analysis. However, deep-learning-based nuclei segmentation methods often suffer from limited annotations. This paper proposes a realistic data augmentation method for nuclei segmentation, named InsMix, that follows a Copy-Paste-Smooth principle and performs morphology-constrained generative instance augmentation.…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: Accepted by MICCAI 2022 (early accepted)

  23. arXiv:2206.03935  [pdf, other]

    eess.IV cs.CV

    Dual-Distribution Discrepancy for Anomaly Detection in Chest X-Rays

    Authors: Yu Cai, Hao Chen, Xin Yang, Yu Zhou, Kwang-Ting Cheng

    Abstract: Chest X-ray (CXR) is the most typical radiological exam for diagnosis of various diseases. Due to the expensive and time-consuming annotations, detecting anomalies in CXRs in an unsupervised fashion is very promising. However, almost all of the existing methods consider anomaly detection as a one-class classification (OCC) problem. They model the distribution of only known normal images during tra…

    Submitted 29 June, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Early Accepted to MICCAI 2022

  24. arXiv:2202.08195  [pdf, other]

    eess.IV cs.CV q-bio.QM

    Nuclei Segmentation with Point Annotations from Pathology Images via Self-Supervised Learning and Co-Training

    Authors: Yi Lin, Zhiyong Qu, Hao Chen, Zhongke Gao, Yuexiang Li, Lili Xia, Kai Ma, Yefeng Zheng, Kwang-Ting Cheng

    Abstract: Nuclei segmentation is a crucial task for whole slide image analysis in digital pathology. Generally, the segmentation performance of fully-supervised learning heavily depends on the amount and quality of the annotated data. However, it is time-consuming and expensive for professional pathologists to provide accurate pixel-level ground truth, while it is much easier to get coarse labels such as po…

    Submitted 17 August, 2023; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: Accepted by MedIA

  25. arXiv:2201.07881  [pdf]

    cs.RO eess.SY

    Analysis of lane-change conflict between cars and trucks at merging section using UAV video data

    Authors: Yichen Lu, Kai Cheng, Yue Zhang, Xinqiang Chen, Yajie Zou

    Abstract: The freeway on-ramp merging section is often identified as a crash-prone spot due to the high frequency of traffic conflicts. Very few traffic conflict analysis studies comprehensively consider different vehicle types at freeway merging section. Thus, the main objective of this study is to analyse conflicts between different vehicle types at freeway merging section. Field data are collected by Unm…

    Submitted 5 January, 2022; originally announced January 2022.

  26. Selective Synthetic Augmentation with HistoGAN for Improved Histopathology Image Classification

    Authors: Yuan Xue, Jiarong Ye, Qianying Zhou, Rodney Long, Sameer Antani, Zhiyun Xue, Carl Cornwell, Richard Zaino, Keith Cheng, Xiaolei Huang

    Abstract: Histopathological analysis is the present gold standard for precancerous lesion diagnosis. The goal of automated histopathological classification from digital images requires supervised training, which requires a large number of expert annotations that can be expensive and time-consuming to collect. Meanwhile, accurate classification of image patches cropped from whole-slide images is essential fo…

    Submitted 10 November, 2021; originally announced November 2021.

    Comments: Elsevier Medical Image Analysis Best Paper Award runner up. arXiv admin note: substantial text overlap with arXiv:1912.03837

    Journal ref: Medical Image Analysis 67 (2021): 101816

  27. A Multi-attribute Controllable Generative Model for Histopathology Image Synthesis

    Authors: Jiarong Ye, Yuan Xue, Peter Liu, Richard Zaino, Keith Cheng, Xiaolei Huang

    Abstract: Generative models have been applied in the medical imaging domain for various image recognition and synthesis tasks. However, a more controllable and interpretable image synthesis model is still lacking yet necessary for important applications such as assisting in medical training. In this work, we leverage the efficient self-attention and contrastive learning modules and build upon state-of-the-a…

    Submitted 10 November, 2021; originally announced November 2021.

    Comments: MICCAI 2021

  28. arXiv:2108.03690  [pdf, other]

    eess.IV cs.CV

    Enhanced Invertible Encoding for Learned Image Compression

    Authors: Yueqi Xie, Ka Leong Cheng, Qifeng Chen

    Abstract: Although deep learning based image compression methods have achieved promising progress these days, the performance of these methods still cannot match the latest compression standard Versatile Video Coding (VVC). Most of the recent developments focus on designing a more accurate and flexible entropy model that can better parameterize the distributions of the latent features. However, few efforts…

    Submitted 8 August, 2021; originally announced August 2021.

    Comments: Accepted to ACM Multimedia 2021 as Oral

  29. arXiv:2106.06733  [pdf, other]

    eess.IV cs.CV cs.LG

    LENAS: Learning-based Neural Architecture Search and Ensemble for 3D Radiotherapy Dose Prediction

    Authors: Yi Lin, Yanfei Liu, Hao Chen, Xin Yang, Kai Ma, Yefeng Zheng, Kwang-Ting Cheng

    Abstract: Radiation therapy treatment planning requires balancing the delivery of the target dose while sparing normal tissues, making it a complex process. To streamline the planning process and enhance its quality, there is a growing demand for knowledge-based planning (KBP). Ensemble learning has shown impressive power in various deep learning tasks, and it has great potential to improve the performance…

    Submitted 13 May, 2024; v1 submitted 12 June, 2021; originally announced June 2021.

  30. arXiv:2011.00070  [pdf, other]

    eess.IV cs.CV

    Adversarial Robust Training of Deep Learning MRI Reconstruction Models

    Authors: Francesco Calivá, Kaiyang Cheng, Rutwik Shah, Valentina Pedoia

    Abstract: Deep Learning (DL) has shown potential in accelerating Magnetic Resonance Image acquisition and reconstruction. Nevertheless, there is a dearth of tailored methods to guarantee that the reconstruction of small features is achieved with high fidelity. In this work, we employ adversarial attacks to generate small synthetic perturbations, which are difficult to reconstruct for a trained DL reconstruc…

    Submitted 27 April, 2021; v1 submitted 30 October, 2020; originally announced November 2020.

    Comments: 32 pages, 9 figures, 6 tables, accepted at MELBA (MIDL 2020 special issue)

  31. Cascaded deep monocular 3D human pose estimation with evolutionary training data

    Authors: Shichao Li, Lei Ke, Kevin Pratama, Yu-Wing Tai, Chi-Keung Tang, Kwang-Ting Cheng

    Abstract: End-to-end deep representation learning has achieved remarkable accuracy for monocular 3D human pose estimation, yet these models may fail for unseen poses with limited and fixed training data. This paper proposes a novel data augmentation method that: (1) is scalable for synthesizing massive amount of training data (over 8 million valid 3D human poses with corresponding 2D projections) for traini…

    Submitted 8 April, 2021; v1 submitted 13 June, 2020; originally announced June 2020.

    Comments: Accepted to CVPR 2020 as Oral Presentation

  32. arXiv:2006.05539  [pdf, other]

    eess.SP math.ST stat.ML

    On Matched Filtering for Statistical Change Point Detection

    Authors: Kevin C. Cheng, Eric L. Miller, Michael C. Hughes, Shuchin Aeron

    Abstract: Non-parametric and distribution-free two-sample tests have been the foundation of many change point detection algorithms. However, randomness in the test statistic as a function of time makes them susceptible to false positives and localization ambiguity. We address these issues by deriving and applying filters matched to the expected temporal signatures of a change for various sliding window, two…

    Submitted 27 October, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

  33. arXiv:2005.08931  [pdf, other]

    cs.CV cs.LG eess.IV

    Joint Multi-Dimension Pruning via Numerical Gradient Update

    Authors: Zechun Liu, Xiangyu Zhang, Zhiqiang Shen, Zhe Li, Yichen Wei, Kwang-Ting Cheng, Jian Sun

    Abstract: We present joint multi-dimension pruning (abbreviated as JointPruning), an effective method of pruning a network on three crucial aspects: spatial, depth and channel simultaneously. To tackle these three naturally different dimensions, we proposed a general framework by defining pruning as seeking the best pruning vector (i.e., the numerical value of layer-wise channel number, spatial size, depth)…

    Submitted 25 September, 2021; v1 submitted 18 May, 2020; originally announced May 2020.

    Comments: Accepted to IEEE Transactions on Image Processing (TIP) 2021

  34. arXiv:2005.01996  [pdf, other]

    eess.IV cs.CV

    NTIRE 2020 Challenge on Real-World Image Super-Resolution: Methods and Results

    Authors: Andreas Lugmayr, Martin Danelljan, Radu Timofte, Namhyuk Ahn, Dongwoon Bai, Jie Cai, Yun Cao, Junyang Chen, Kaihua Cheng, SeYoung Chun, Wei Deng, Mostafa El-Khamy, Chiu Man Ho, Xiaozhong Ji, Amin Kheradmand, Gwantae Kim, Hanseok Ko, Kanghyu Lee, Jungwon Lee, Hao Li, Ziluan Liu, Zhi-Song Liu, Shuai Liu, Yunhua Lu, Zibo Meng, et al. (21 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2020 challenge on real world super-resolution. It focuses on the participating methods and final results. The challenge addresses the real world setting, where paired true high and low-resolution images are unavailable. For training, only one set of source input images is therefore provided along with a set of unpaired high-quality target images. In Track 1: Image Proc…

    Submitted 5 May, 2020; originally announced May 2020.

  35. arXiv:2003.03488  [pdf, other]

    cs.CV cs.LG eess.IV

    ReActNet: Towards Precise Binary Neural Network with Generalized Activation Functions

    Authors: Zechun Liu, Zhiqiang Shen, Marios Savvides, Kwang-Ting Cheng

    Abstract: In this paper, we propose several ideas for enhancing a binary network to close its accuracy gap from real-valued networks without incurring any additional computational cost. We first construct a baseline network by modifying and binarizing a compact real-valued network with parameter-free shortcuts, bypassing all the intermediate convolutional layers including the downsampling layers. This basel…

    Submitted 12 July, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

    Comments: Accepted to ECCV 2020. Code is available at: https://github.com/liuzechun/ReActNet

  36. arXiv:2002.06817  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS

    Addressing the confounds of accompaniments in singer identification

    Authors: Tsung-Han Hsieh, Kai-Hsiang Cheng, Zhe-Cheng Fan, Yu-Ching Yang, Yi-Hsuan Yang

    Abstract: Identifying singers is an important task with many applications. However, the task remains challenging due to many issues. One major issue is related to the confounding factors from the background instrumental music that is mixed with the vocals in music production. A singer identification model may learn to extract non-vocal related features from the instrumental part of the songs, if a singer on…

    Submitted 17 February, 2020; originally announced February 2020.

  37. arXiv:1912.02589  [pdf, other]

    eess.IV

    Label Refinement with an Iterative Generative Adversarial Network for Boosting Retinal Vessel Segmentation

    Authors: Yunqiao Yang, Zhiwei Wang, Jingen Liu, Kwang-Ting Cheng, Xin Yang

    Abstract: State-of-the-art methods for retinal vessel segmentation mainly rely on manually labeled vessels as the ground truth for supervised training. The quality of manual labels plays an essential role in the segmentation accuracy, while in practice it could vary a lot and in turn could substantially mislead the training process and limit the segmentation accuracy. This paper aims to "purify" any compreh…

    Submitted 5 December, 2019; originally announced December 2019.

  38. arXiv:1911.01325  [pdf, other]

    eess.SP cs.LG

    Optimal Transport Based Change Point Detection and Time Series Segment Clustering

    Authors: Kevin C. Cheng, Shuchin Aeron, Michael C. Hughes, Erika Hussey, Eric L. Miller

    Abstract: Two common problems in time series analysis are the decomposition of the data stream into disjoint segments that are each in some sense "homogeneous" - a problem known as Change Point Detection (CPD) - and the grouping of similar nonadjacent segments, a problem that we call Time Series Segment Clustering (TSSC). Building upon recent theoretical advances characterizing the limiting distribution-fre…

    Submitted 20 February, 2020; v1 submitted 4 November, 2019; originally announced November 2019.

  39. arXiv:1909.06326  [pdf, other]

    q-bio.QM cs.CV cs.LG eess.IV physics.med-ph

    Automatic Hip Fracture Identification and Functional Subclassification with Deep Learning

    Authors: Justin D Krogue, Kaiyang V Cheng, Kevin M Hwang, Paul Toogood, Eric G Meinberg, Erik J Geiger, Musa Zaid, Kevin C McGill, Rina Patel, Jae Ho Sohn, Alexandra Wright, Bryan F Darger, Kevin A Padrez, Eugene Ozhinsky, Sharmila Majumdar, Valentina Pedoia

    Abstract: Purpose: Hip fractures are a common cause of morbidity and mortality. Automatic identification and classification of hip fractures using deep learning may improve outcomes by reducing diagnostic errors and decreasing time to operation. Methods: Hip and pelvic radiographs from 1118 studies were reviewed and 3034 hips were labeled via bounding boxes and classified as normal, displaced femoral neck f…

    Submitted 10 September, 2019; originally announced September 2019.

    Comments: Presented at Orthopaedic Research Society, Austin, TX, Feb 2, 2019, currently in submission for publication

  40. arXiv:1908.10737  [pdf, other]

    cs.CV cs.LG eess.IV

    Facial age estimation by deep residual decision making

    Authors: Shichao Li, Kwang-Ting Cheng

    Abstract: Residual representation learning simplifies the optimization problem of learning complex functions and has been widely used by traditional convolutional neural networks. However, it has not been applied to deep neural decision forest (NDF). In this paper we incorporate residual learning into NDF and the resulting model achieves state-of-the-art level accuracy on three public age estimation benchma…

    Submitted 28 August, 2019; originally announced August 2019.

    Comments: Following-up work for visualizing deep neural decision forest for facial age estimation

  41. arXiv:1904.10242  [pdf]

    eess.SP

    Memory System Designed for Multiply-Accumulate (MAC) Engine Based on Stochastic Computing

    Authors: Xinyue Zhang, Yuan Wang, Yawen Zhang, Jiahao Song, Zuodong Zhang, Kaili Cheng, Runsheng Wang, Ru Huang

    Abstract: Convolutional neural network (CNN) achieves excellent performance on fascinating tasks such as image recognition and natural language processing at the cost of high power consumption. Stochastic computing (SC) is an attractive paradigm implemented in low power applications which performs arithmetic operations with simple logic and low hardware cost. However, conventional memory structure designed…

    Submitted 23 April, 2019; originally announced April 2019.

    Comments: 6 pages, 6 figures

  42. arXiv:1812.01269  [pdf, other]

    cs.SD eess.AS

    Learning to match transient sound events using attentional similarity for few-shot sound recognition

    Authors: Szu-Yu Chou, Kai-Hsiang Cheng, Jyh-Shing Roger Jang, Yi-Hsuan Yang

    Abstract: In this paper, we introduce a novel attentional similarity module for the problem of few-shot sound recognition. Given a few examples of an unseen sound event, a classifier must be quickly adapted to recognize the new sound event without much fine-tuning. The proposed attentional similarity module can be plugged into any metric-based learning method for few-shot learning, allowing the resulting mo…

    Submitted 18 February, 2019; v1 submitted 4 December, 2018; originally announced December 2018.

    Comments: This is a pre-print version of an ICASSP 2019 paper

  43. arXiv:1807.10917  [pdf, other]

    eess.SP

    Multiple Access for Transmissions Over Independent Fading Channels

    Authors: Bei-Hao Chang, Chia-Fu Chang, Pin-Wen Su, I-Hsien Yeh, Kai-Chuan Cheng, Ying-Chen Lin, Mao-Chao Lin

    Abstract: We propose to employ a multilevel detection (MLDT) technique to allow multiple users which respectively transmit messages over independent fading channels to share the same resource, e.g., the same signature sequence in the CDMA (code division multiple access) system. The users are separated by the different channel gains including amplitudes and phases resultant from the independent fading channe…

    Submitted 28 July, 2018; originally announced July 2018.

    Comments: 26 pages, 14 figures. Not published by now. Some results appear in the Master theses of Bei-Hao Chang, I-Hsien Yeh and Ying-Chen Lin respectively at National Taiwan University