Showing 1–39 of 39 results for author: Wen, S

Searching in archive eess.

  1. arXiv:2506.22410  [pdf, ps, other]

    eess.SY

    Spherical Pendulum with Quad-Rotor Thrust Vectoring Actuation -- A Novel Mechatronics and Control Benchmark Platform

    Authors: Yuchen Li, Omar Curiel, Sheng-Fan Wen, Tsu-Chin Tsao

    Abstract: Motor-actuated pendulums have been established as arguably the most common laboratory prototypes used in control system education because of the relevance to robot manipulator control in industry. Meanwhile, multi-rotor drones like quadcopters have become popular in industrial applications but have not been broadly employed in control education laboratory. Platforms with pendulums and multi-rotor…

    Submitted 27 June, 2025; originally announced June 2025.

  2. arXiv:2506.08319  [pdf, ps, other]

    eess.SY cs.RO

    DEKC: Data-Enable Control for Tethered Space Robot Deployment in the Presence of Uncertainty via Koopman Operator Theory

    Authors: Ao Jin, Qinyi Wang, Sijie Wen, Ya Liu, Ganghui Shen, Panfeng Huang, Fan Zhang

    Abstract: This work focuses on the deployment of tethered space robot in the presence of unknown uncertainty. A data-enable framework called DEKC which contains offline training part and online execution part is proposed to deploy tethered space robot in the presence of uncertainty. The main idea of this work is modeling the unknown uncertainty as a dynamical system, which enables high accuracy and convergence…

    Submitted 9 June, 2025; originally announced June 2025.

    Comments: 12 pages

  3. arXiv:2506.06483  [pdf, ps, other]

    cs.GR cs.AI cs.CV cs.LG eess.IV

    Noise Consistency Regularization for Improved Subject-Driven Image Synthesis

    Authors: Yao Ni, Song Wen, Piotr Koniusz, Anoop Cherian

    Abstract: Fine-tuning Stable Diffusion enables subject-driven image synthesis by adapting the model to generate images containing specific subjects. However, existing fine-tuning methods suffer from two key issues: underfitting, where the model fails to reliably capture subject identity, and overfitting, where it memorizes the subject image and reduces background diversity. To address these challenges, we p…

    Submitted 6 June, 2025; originally announced June 2025.

  4. arXiv:2503.19349  [pdf, other]

    eess.SY cs.LG math.OC

    Optimal Parameter Adaptation for Safety-Critical Control via Safe Barrier Bayesian Optimization

    Authors: Shengbo Wang, Ke Li, Zheng Yan, Zhenyuan Guo, Song Zhu, Guanghui Wen, Shiping Wen

    Abstract: Safety is of paramount importance in control systems to avoid costly risks and catastrophic damages. The control barrier function (CBF) method, a promising solution for safety-critical control, poses a new challenge of enhancing control performance due to its direct modification of original control design and the introduction of uncalibrated parameters. In this work, we shed light on the crucial r…

    Submitted 25 March, 2025; originally announced March 2025.

    Comments: Preprint manuscript, review only

  5. arXiv:2501.00348  [pdf, other]

    cs.SD cs.AI eess.AS

    Temporal Information Reconstruction and Non-Aligned Residual in Spiking Neural Networks for Speech Classification

    Authors: Qi Zhang, Huamin Wang, Hangchi Shen, Shukai Duan, Shiping Wen, Tingwen Huang

    Abstract: Recently, it can be noticed that most models based on spiking neural networks (SNNs) only use a same level temporal resolution to deal with speech classification problems, which makes these models cannot learn the information of input data at different temporal scales. Additionally, owing to the different time lengths of the data before and after the sub-modules of many models, the effective resid…

    Submitted 31 December, 2024; originally announced January 2025.

    Comments: 9 pages, 5 figures

  6. arXiv:2409.18558  [pdf, other]

    cs.SD eess.AS

    XWSB: A Blend System Utilizing XLS-R and WavLM with SLS Classifier detection system for SVDD 2024 Challenge

    Authors: Qishan Zhang, Shuangbing Wen, Fangke Yan, Tao Hu, Jun Li

    Abstract: This paper introduces the model structure used in the SVDD 2024 Challenge. The SVDD 2024 challenge has been introduced this year for the first time. Singing voice deepfake detection (SVDD) which faces complexities due to informal speech intonations and varying speech rates. In this paper, we propose the XWSB system, which achieved SOTA performance in the SVDD challenge. XWSB stands for XLS-R, Wav…

    Submitted 27 September, 2024; originally announced September 2024.

    Journal ref: IEEE Spoken Language Technology Workshop 2024

  7. arXiv:2404.11313  [pdf, other]

    eess.IV cs.AI

    NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results

    Authors: Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte, Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai, Jianhui Sun, Tianyi Wang, Lei Li, Han Kong, Wenxuan Wang, Bing Li, Cheng Luo, et al. (43 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2024 Challenge on Shortform UGC Video Quality Assessment (S-UGC VQA), where various excellent solutions are submitted and evaluated on the collected dataset KVQ from popular short-form video platform, i.e., Kuaishou/Kwai Platform. The KVQ database is divided into three parts, including 2926 videos for training, 420 videos for validation, and 854 videos for testing. The…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR2024 Workshop. The challenge report for CVPR NTIRE2024 Short-form UGC Video Quality Assessment Challenge

  8. arXiv:2309.03905  [pdf, other]

    cs.MM cs.CL cs.CV cs.LG cs.SD eess.AS

    ImageBind-LLM: Multi-modality Instruction Tuning

    Authors: Jiaming Han, Renrui Zhang, Wenqi Shao, Peng Gao, Peng Xu, Han Xiao, Kaipeng Zhang, Chris Liu, Song Wen, Ziyu Guo, Xudong Lu, Shuai Ren, Yafei Wen, Xiaoxin Chen, Xiangyu Yue, Hongsheng Li, Yu Qiao

    Abstract: We present ImageBind-LLM, a multi-modality instruction tuning method of large language models (LLMs) via ImageBind. Existing works mainly focus on language and image instruction tuning, different from which, our ImageBind-LLM can respond to multi-modality conditions, including audio, 3D point clouds, video, and their embedding-space arithmetic by only image-text alignment training. During training…

    Submitted 11 September, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: Code is available at https://github.com/OpenGVLab/LLaMA-Adapter

  9. arXiv:2308.03684  [pdf, other]

    eess.AS cs.SD

    Active Noise Control based on the Momentum Multichannel Normalized Filtered-x Least Mean Square Algorithm

    Authors: Dongyuan Shi, Woon-Seng Gan, Bhan Lam, Shulin Wen, Xiaoyi Shen

    Abstract: Multichannel active noise control (MCANC) is widely utilized to achieve significant noise cancellation area in the complicated acoustic field. Meanwhile, the filter-x least mean square (FxLMS) algorithm gradually becomes the benchmark solution for the implementation of MCANC due to its low computational complexity. However, its slow convergence speed more or less undermines the performance of deal…

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: Conference: INTER-NOISE and NOISE-CON Congress and Conference Proceedings 2020 At Korea Volume: 261

  10. arXiv:2307.00828  [pdf, other]

    eess.SY cs.LG math.OC

    Model-Assisted Probabilistic Safe Adaptive Control With Meta-Bayesian Learning

    Authors: Shengbo Wang, Ke Li, Yin Yang, Yuting Cao, Tingwen Huang, Shiping Wen

    Abstract: Breaking safety constraints in control systems can lead to potential risks, resulting in unexpected costs or catastrophic damage. Nevertheless, uncertainty is ubiquitous, even among similar tasks. In this paper, we develop a novel adaptive safe control framework that integrates meta learning, Bayesian models, and control barrier function (CBF) method. Specifically, with the help of CBF method, we…

    Submitted 13 July, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

  11. arXiv:2301.03471  [pdf]

    eess.SP eess.SY

    Technology Report : Smartphone-Based Pedestrian Dead Reckoning Integrated with Data-Fusion-Adopted Visible Light Positioning

    Authors: Shangsheng Wen, Ziyang Ge, Danlan Yuan, Yingcong Chen, Xuecong Fang

    Abstract: Pedestrian dead-reckoning (PDR) is a potential indoor localization technology that obtains location estimation with the inertial measurement unit (IMU). However, one of its most significant drawbacks is the accumulation of its measurement error. This paper proposes a visible light positioning (VLP)-integrated PDR system, which could achieve real-time and accurate indoor positioning using IMU and t…

    Submitted 5 January, 2023; originally announced January 2023.

  12. arXiv:2210.13031  [pdf, other]

    eess.IV

    A geometry method for LED mapping

    Authors: Junlin Huang, Shangsheng Wen, Weipeng Guan

    Abstract: With inputs from RGB-D camera, industrial camera and wheel odometer, in this letter, we propose a geometry-based detecting method, by which the 3-D modulated LED map can be acquired with the aid of visual odometry algorithm from ORB-SLAM2 system when the decoding result of LED-ID is inaccurate. Subsequently, an enhanced cost function is proposed to optimize the mapping result of LEDs. The average…

    Submitted 28 October, 2022; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: 4 pages, 6 figures

  13. arXiv:2206.14597  [pdf, other]

    cs.LG cs.AI eess.SP

    Generative Anomaly Detection for Time Series Datasets

    Authors: Zhuangwei Kang, Ayan Mukhopadhyay, Aniruddha Gokhale, Shijie Wen, Abhishek Dubey

    Abstract: Traffic congestion anomaly detection is of paramount importance in intelligent traffic systems. The goals of transportation agencies are two-fold: to monitor the general traffic conditions in the area of interest and to locate road segments under abnormal congestion states. Modeling congestion patterns can achieve these goals for citywide roadways, which amounts to learning the distribution of mul…

    Submitted 28 June, 2022; originally announced June 2022.

    Comments: A shorter version of the paper was accepted at the ITSC 2022

  14. Suboptimal Safety-Critical Control for Continuous Systems Using Prediction-Correction Online Optimization

    Authors: Shengbo Wang, Shiping Wen, Yin Yang, Yuting Cao, Kaibo Shi, Tingwen Huang

    Abstract: This paper investigates the control barrier function (CBF) based safety-critical control for continuous nonlinear control affine systems using the more efficient online algorithms through time-varying optimization. The idea lies in that when quadratic programming (QP) or other convex optimization algorithms needed in the CBF-based method is not computation affordable, the alternative suboptimal fe…

    Submitted 20 March, 2023; v1 submitted 29 March, 2022; originally announced March 2022.

  15. arXiv:2203.10726  [pdf, other]

    eess.IV cs.CV

    TransFusion: Multi-view Divergent Fusion for Medical Image Segmentation with Transformers

    Authors: Di Liu, Yunhe Gao, Qilong Zhangli, Ligong Han, Xiaoxiao He, Zhaoyang Xia, Song Wen, Qi Chang, Zhennan Yan, Mu Zhou, Dimitris Metaxas

    Abstract: Combining information from multi-view images is crucial to improve the performance and robustness of automated methods for disease diagnosis. However, due to the non-alignment characteristics of multi-view images, building correlation and data fusion across views largely remain an open problem. In this study, we present TransFusion, a Transformer-based architecture to merge divergent multi-view im…

    Submitted 5 September, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

  16. arXiv:2202.04261  [pdf, other]

    cs.SD cs.AI eess.AS

    The Volcspeech system for the ICASSP 2022 multi-channel multi-party meeting transcription challenge

    Authors: Chen Shen, Yi Liu, Wenzhi Fan, Bin Wang, Shixue Wen, Yao Tian, Jun Zhang, Jingsheng Yang, Zejun Ma

    Abstract: This paper describes our submission to ICASSP 2022 Multi-channel Multi-party Meeting Transcription (M2MeT) Challenge. For Track 1, we propose several approaches to empower the clustering-based speaker diarization system to handle overlapped speech. Front-end dereverberation and the direction-of-arrival (DOA) estimation are used to improve the accuracy of speaker diarization. Multi-channel combinat…

    Submitted 9 February, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

  17. arXiv:2112.00447  [pdf]

    eess.SP

    An improved bearing fault detection strategy based on artificial bee colony algorithm

    Authors: Haiquan Wang, Wenxuan Yue, Shengjun Wen, Xiaobin Xu, Menghao Su, Shanshan Zhang, Panpan Du

    Abstract: The operating state of bearing directly affects the performance of rotating machinery and how to accurately and decisively extract features from the original vibration signal and recognize the faulty parts as early as possible is very critical. In this study, the one-dimensional ternary model which has been proved to be an effective statistical method in feature selection is introduced and shapele…

    Submitted 2 December, 2021; v1 submitted 1 December, 2021; originally announced December 2021.

  18. arXiv:2111.13849  [pdf, ps, other]

    eess.SY

    Robust Adaptive Safety-Critical Control for Unknown Systems with Finite-Time Element-Wise Parameter Estimation

    Authors: Shengbo Wang, Bo Lyu, Shiping Wen, Kaibo Shi, Song Zhu, Tingwen Huang

    Abstract: Safety is always one of the most critical principles for a system to be controlled. This paper investigates a safety-critical control scheme for unknown structured systems by using the control barrier function (CBF) method. Benefited from the dynamic regressor extension and mixing (DREM), an extended element-wise parameter identification law is utilized to dismiss the uncertainty. On the one hand,…

    Submitted 14 January, 2022; v1 submitted 27 November, 2021; originally announced November 2021.

  19. arXiv:2111.13848  [pdf, ps, other]

    eess.SY

    Optimal Tracking Control for Unknown Linear Systems with Finite-Time Parameter Estimation

    Authors: Shengbo Wang, Shiping Wen, Kaibo Shi, Song Zhu, Tingwen Huang

    Abstract: The optimal control input for linear systems can be solved from algebraic Riccati equation (ARE), from which it remains questionable to get the form of the exact solution. In engineering, the acceptable numerical solutions of ARE can be found by iteration or optimization. Recently, the gradient descent based numerical solutions have been proven effective to approximate the optimal ones. This paper…

    Submitted 6 January, 2022; v1 submitted 27 November, 2021; originally announced November 2021.

  20. arXiv:2111.07104  [pdf, other]

    eess.IV cs.CV cs.LG

    A strong baseline for image and video quality assessment

    Authors: Shaoguo Wen, Junle Wang

    Abstract: In this work, we present a simple yet effective unified model for perceptual quality assessment of image and video. In contrast to existing models which usually consist of complex network architecture, or rely on the concatenation of multiple branches of features, our model achieves a comparable performance by applying only one global feature derived from a backbone network (i.e. resnet18 in the p…

    Submitted 13 November, 2021; originally announced November 2021.

  21. arXiv:2111.01652  [pdf, other]

    eess.AS cs.SD eess.SY

    Design and Evaluation of Active Noise Control on Machinery Noise

    Authors: Shulin Wen, Duy Hai Nguyen, Miqing Wang, Woon-Seng Gan

    Abstract: Construction workers and residents who live near construction sites are exposed to noises that might cause hearing loss, high blood pressure, heart disease, sleep disturbance and stress. Regulations have been carried out by national governments to limit the maximum permissible noise levels for construction works. A four-channel active noise control system mounted on the opening of an enclosure i…

    Submitted 2 November, 2021; originally announced November 2021.

    Journal ref: APSIPA 2021

  22. arXiv:2110.12274  [pdf, other]

    eess.IV cs.CV

    "One-Shot" Reduction of Additive Artifacts in Medical Images

    Authors: Yu-Jen Chen, Yen-Jung Chang, Shao-Cheng Wen, Yiyu Shi, Xiaowei Xu, Tsung-Yi Ho, Meiping Huang, Haiyun Yuan, Jian Zhuang

    Abstract: Medical images may contain various types of artifacts with different patterns and mixtures, which depend on many factors such as scan setting, machine condition, patients' characteristics, surrounding environment, etc. However, existing deep-learning-based artifact reduction methods are restricted by their training set with specific predetermined artifact types and patterns. As such, they have lim…

    Submitted 23 October, 2021; originally announced October 2021.

  23. arXiv:2103.05099  [pdf, other]

    cs.CV cs.LG eess.IV

    Subjective and Objective Quality Assessment of Mobile Gaming Video

    Authors: Shaoguo Wen, Suiyi Ling, Junle Wang, Ximing Chen, Lizhi Fang, Yanqing Jing, Patrick Le Callet

    Abstract: Nowadays, with the vigorous expansion and development of gaming video streaming techniques and services, the expectation of users, especially the mobile phone users, for higher quality of experience is also growing swiftly. As most of the existing research focuses on traditional video streaming, there is a clear lack of both subjective study and objective quality models that are tailored for quali…

    Submitted 27 January, 2021; originally announced March 2021.

    Comments: 5 pages

    MSC Class: 68U10 ACM Class: J.0

  24. arXiv:2103.01053  [pdf]

    eess.SP

    High Accuracy Visible Light Positioning Based on Multi-target Tracking Algorithm

    Authors: Linyi Huang, Wentao Yang, Shangsheng Wen, Manxi Liu, Weipeng Guan

    Abstract: In this paper, we propose a multi-target image tracking algorithm based on continuously adaptive mean-shift (Cam-shift) and unscented Kalman filter. We improved the single-lamp tracking algorithm proposed in our previous work to multi-target tracking, and achieved better robustness in the case of occlusion, the real-time performance to complete one positioning and relatively high accuracy by dynami…

    Submitted 26 May, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

  25. arXiv:2101.04442  [pdf, other]

    cs.CV eess.IV

    Joint Demosaicking and Denoising in the Wild: The Case of Training Under Ground Truth Uncertainty

    Authors: Jierun Chen, Song Wen, S. -H. Gary Chan

    Abstract: Image demosaicking and denoising are the two key fundamental steps in digital camera pipelines, aiming to reconstruct clean color images from noisy luminance readings. In this paper, we propose and study Wild-JDD, a novel learning framework for joint demosaicking and denoising in the wild. In contrast to previous works which generally assume the ground truth of training data is a perfect reflectio…

    Submitted 12 January, 2021; originally announced January 2021.

    Comments: Accepted by AAAI2021

  26. arXiv:2011.02155  [pdf, other]

    eess.IV cs.CV cs.LG

    Do Noises Bother Human and Neural Networks In the Same Way? A Medical Image Analysis Perspective

    Authors: Shao-Cheng Wen, Yu-Jen Chen, Zihao Liu, Wujie Wen, Xiaowei Xu, Yiyu Shi, Tsung-Yi Ho, Qianjun Jia, Meiping Huang, Jian Zhuang

    Abstract: Deep learning had already demonstrated its power in medical images, including denoising, classification, segmentation, etc. All these applications are proposed to automatically analyze medical images beforehand, which brings more information to radiologists during clinical assessment for accuracy improvement. Recently, many medical denoising methods had shown their significant artifact reduction r…

    Submitted 4 November, 2020; originally announced November 2020.

  27. arXiv:2010.06995  [pdf]

    q-bio.QM eess.IV

    A Pathologist-Annotated Dataset for Validating Artificial Intelligence: A Project Description and Pilot Study

    Authors: Sarah N Dudgeon, Si Wen, Matthew G Hanna, Rajarsi Gupta, Mohamed Amgad, Manasi Sheth, Hetal Marble, Richard Huang, Markus D Herrmann, Clifford H. Szu, Darick Tong, Bruce Werness, Evan Szu, Denis Larsimont, Anant Madabhushi, Evangelos Hytopoulos, Weijie Chen, Rajendra Singh, Steven N. Hart, Joel Saltz, Roberto Salgado, Brandon D Gallas

    Abstract: Purpose: In this work, we present a collaboration to create a validation dataset of pathologist annotations for algorithms that process whole slide images (WSIs). We focus on data collection and evaluation of algorithm performance in the context of estimating the density of stromal tumor infiltrating lymphocytes (sTILs) in breast cancer. Methods: We digitized 64 glass slides of hematoxylin- and eo…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: 26 pages, 4 figures, 2 tables Submitted to the Journal of Pathology Informatics Project web page: https://ncihub.org/groups/eedapstudies

  28. arXiv:2010.05466  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching

    Authors: Di Hu, Rui Qian, Minyue Jiang, Xiao Tan, Shilei Wen, Errui Ding, Weiyao Lin, Dejing Dou

    Abstract: Discriminatively localizing sounding objects in cocktail-party, i.e., mixed sound scenes, is commonplace for humans, but still challenging for machines. In this paper, we propose a two-stage learning framework to perform self-supervised class-aware sounding object localization. First, we propose to learn robust object representations by aggregating the candidate sound localization results in the s…

    Submitted 12 October, 2020; originally announced October 2020.

    Comments: To appear in NeurIPS 2020. Previous Title: Learning to Discriminatively Localize Sounding Objects in a Cocktail-party Scenario

  29. arXiv:2008.00414  [pdf, ps, other]

    cs.CR eess.SY

    On the Security of Networked Control Systems in Smart Vehicle and its Adaptive Cruise Control

    Authors: Faezeh Farivar, Mohammad Sayad Haghighi, Alireza Jolfaei, Sheng Wen

    Abstract: With the benefits of Internet of Vehicles (IoV) paradigm, come along unprecedented security challenges. Among many applications of inter-connected systems, vehicular networks and smart cars are examples that are already rolled out. Smart vehicles not only have networks connecting their internal components e.g. via Controller Area Network (CAN) bus, but also are connected to the outside world throu…

    Submitted 4 August, 2020; v1 submitted 2 August, 2020; originally announced August 2020.

    Comments: This paper has been accepted and is to appear in IEEE Transactions on Intelligent Transportation Systems

    ACM Class: I.2.8; J.6; H.4

  30. arXiv:2007.05890  [pdf]

    eess.SP cs.IT

    Recognition and evaluation of constellation diagram using deep learning based on underwater wireless optical communication

    Authors: ZiHao Zhou, WeiPeng Guan, ShangSheng Wen

    Abstract: In this paper, we proposed a method of constellation diagram recognition and evaluation using deep learning based on underwater wireless optical communication (UWOC). More specifically, a constellation diagram analyzer for UWOC system based on convolutional neural network (CNN) is designed for modulation format recognition (MFR), optical signal noise ratio (OSNR) and phase error estimat…

    Submitted 11 July, 2020; originally announced July 2020.

  31. arXiv:2006.04356  [pdf, other]

    cs.CV cs.LG cs.RO eess.IV

    Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection

    Authors: Liang Du, Xiaoqing Ye, Xiao Tan, Jianfeng Feng, Zhenbo Xu, Errui Ding, Shilei Wen

    Abstract: Object detection from 3D point clouds remains a challenging task, though recent studies pushed the envelope with the deep learning techniques. Owing to the severe spatial occlusion and inherent variance of point density with the distance to sensors, appearance of a same object varies a lot in point cloud data. Designing robust feature representation against such appearance changes is hence the key…

    Submitted 8 June, 2020; originally announced June 2020.

    Comments: 8 pages, 5 figures, CVPR 2020

  32. Sub-Band Knowledge Distillation Framework for Speech Enhancement

    Authors: Xiang Hao, Shixue Wen, Xiangdong Su, Yun Liu, Guanglai Gao, Xiaofei Li

    Abstract: In single-channel speech enhancement, methods based on full-band spectral features have been widely studied. However, only a few methods pay attention to non-full-band spectral features. In this paper, we explore a knowledge distillation framework based on sub-band spectral mapping for single-channel speech enhancement. Specifically, we divide the full frequency band into multiple sub-bands and pr…

    Submitted 29 October, 2020; v1 submitted 29 May, 2020; originally announced May 2020.

    Comments: Published in Interspeech 2020

  33. arXiv:2005.02291  [pdf, other]

    eess.IV cs.CV cs.LG

    NTIRE 2020 Challenge on Video Quality Mapping: Methods and Results

    Authors: Dario Fuoli, Zhiwu Huang, Martin Danelljan, Radu Timofte, Hua Wang, Longcun Jin, Dewei Su, Jing Liu, Jaehoon Lee, Michal Kudelski, Lukasz Bala, Dmitry Hrybov, Marcin Mozejko, Muchen Li, Siyao Li, Bo Pang, Cewu Lu, Chao Li, Dongliang He, Fu Li, Shilei Wen

    Abstract: This paper reviews the NTIRE 2020 challenge on video quality mapping (VQM), which addresses the issues of quality mapping from source video domain to target video domain. The challenge includes both a supervised track (track 1) and a weakly-supervised track (track 2) for two benchmark datasets. In particular, track 1 offers a new Internet video benchmark, requiring algorithms to learn the map from…

    Submitted 15 June, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

    Comments: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops

  34. arXiv:2005.01056  [pdf, other]

    eess.IV cs.CV

    NTIRE 2020 Challenge on Perceptual Extreme Super-Resolution: Methods and Results

    Authors: Kai Zhang, Shuhang Gu, Radu Timofte, Taizhang Shang, Qiuju Dai, Shengchen Zhu, Tong Yang, Yandong Guo, Younghyun Jo, Sejong Yang, Seon Joo Kim, Lin Zha, Jiande Jiang, Xinbo Gao, Wen Lu, Jing Liu, Kwangjin Yoon, Taegyun Jeon, Kazutoshi Akita, Takeru Ooba, Norimichi Ukita, Zhipeng Luo, Yuehan Yao, Zhenyu Xu, Dongliang He, et al. (38 additional authors not shown)

    Abstract: This paper reviews the NTIRE 2020 challenge on perceptual extreme super-resolution with focus on proposed solutions and results. The challenge task was to super-resolve an input image with a magnification factor 16 based on a set of prior examples of low and corresponding high resolution images. The goal is to obtain a network design capable to produce high resolution results with the best percept…

    Submitted 3 May, 2020; originally announced May 2020.

    Comments: CVPRW 2020

  35. arXiv:2004.05855  [pdf, other]

    eess.IV

    Variable Rate Image Compression Method with Dead-zone Quantizer

    Authors: Jing Zhou, Akira Nakagawa, Keizo Kato, Sihan Wen, Kimihiko Kazui, Zhiming Tan

    Abstract: Deep learning based image compression methods have achieved superior performance compared with transform based conventional codec. With end-to-end Rate-Distortion Optimization (RDO) in the codec, compression model is optimized with Lagrange multiplier $λ$. For conventional codec, signal is decorrelated with orthonormal transformation, and uniform quantizer is introduced. We propose a variable rate i…

    Submitted 26 April, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

  36. arXiv:2001.01888  [pdf]

    eess.SP

    Indoor Localization System of ROS mobile robot based on Visible Light Communication

    Authors: Weipeng Guan, Shihuan Chen, Shangsheng Wen, Wenyuan Hou, Zequn Tan, Ruihong Cen

    Abstract: In this paper, an indoor robot localization system based on Robot Operating System (ROS) and visible light communication (VLC) is presented. On the basis of our previous work, we innovatively designed a VLC localization and navigation package based on Robot Operating System (ROS), which contains the LED-ID detection and recognition method, the video target tracking algorithm and the double-lamp po…

    Submitted 6 January, 2020; originally announced January 2020.

  37. A Sparse Representation Based Joint Demosaicing Method for Single-Chip Polarized Color Sensor

    Authors: Sijia Wen, Yinqiang Zheng, Feng Lu

    Abstract: The emergence of the single-chip polarized color sensor now allows for simultaneously capturing chromatic and polarimetric information of the scene on a monochromatic image plane. However, unlike the usual camera with an embedded demosaicing method, the latest polarized color camera is not delivered with an in-built demosaicing tool. For demosaicing, the users have to down-sample the captured imag…

    Submitted 7 April, 2021; v1 submitted 16 December, 2019; originally announced December 2019.

    Comments: 10 pages, 7 figures

    Journal ref: IEEE Transactions on Image Processing 2021

  38. arXiv:1911.11773  [pdf]

    eess.SP

    High accuracy and error analysis of indoor visible light positioning algorithm based on image sensor

    Authors: Shihuan Chen, Weipeng Guan, Zequn Tan, Shangsheng Wen, Manxi Liu, Jingmin Wang, Jingyi Li

    Abstract: In recent years, with the increasing demand for indoor positioning service, visible light indoor positioning based on image sensors has been widely studied. However, many studies only put forward the relevant localization algorithm and did not make a deep discussion on the principle of the visible light localization. In this paper, we make a deep discussion on the principle of the two-light pos…

    Submitted 29 April, 2020; v1 submitted 26 November, 2019; originally announced November 2019.

    Comments: This paper presents a centimeter-level precise positioning system based on image sensor and visible light LED. In this paper, the principle of dual-light positioning algorithm and three-lamp positioning algorithm based on image sensor is deeply and respectively analyzed. And the error generation in the algorithm is discussed

  39. arXiv:1910.07844  [pdf, other]

    eess.IV

    Multi-scale and Context-adaptive Entropy Model for Image Compression

    Authors: Jing Zhou, Sihan Wen, Akira Nakagawa, Kimihiko Kazui, Zhiming Tan

    Abstract: We propose an end-to-end trainable image compression framework with a multi-scale and context-adaptive entropy model, especially for low bitrate compression. Due to the success of autoregressive priors in probabilistic generative model, the complementary combination of autoregressive and hierarchical priors can estimate the distribution of each latent representation accurately. Based on this combi…

    Submitted 17 October, 2019; originally announced October 2019.

    Comments: accepted by CVPR workshop and Challenge on Learned Image Compression (CLIC) 2019