Showing 1–6 of 6 results for author: Lou, J

Searching in archive eess.
  1. arXiv:2504.13476  [pdf, other]

    cs.LG cs.CV eess.IV

    Variational Autoencoder Framework for Hyperspectral Retrievals (Hyper-VAE) of Phytoplankton Absorption and Chlorophyll a in Coastal Waters for NASA's EMIT and PACE Missions

    Authors: Jiadong Lou, Bingqing Liu, Yuanheng Xiong, Xiaodong Zhang, Xu Yuan

    Abstract: Phytoplankton absorb and scatter light in unique ways, subtly altering the color of water, changes that are often minor for human eyes to detect but can be captured by sensitive ocean color instruments onboard satellites from space. Hyperspectral sensors, paired with advanced algorithms, are expected to significantly enhance the characterization of phytoplankton community composition, especially i…

    Submitted 18 April, 2025; originally announced April 2025.

  2. arXiv:2305.08408  [pdf, other]

    cs.CV eess.IV

    SB-VQA: A Stack-Based Video Quality Assessment Framework for Video Enhancement

    Authors: Ding-Jiun Huang, Yu-Ting Kao, Tieh-Hung Chuang, Ya-Chun Tsai, Jing-Kai Lou, Shuen-Huei Guan

    Abstract: In recent years, several video quality assessment (VQA) methods have been developed, achieving high performance. However, these methods were not specifically trained for enhanced videos, which limits their ability to predict video quality accurately based on human subjective perception. To address this issue, we propose a stack-based framework for VQA that outperforms existing state-of-the-art met…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: CVPR NTIRE 2023

  3. arXiv:2103.03612  [pdf, other]

    eess.IV cs.MM

    An Optimized H.266/VVC Software Decoder On Mobile Platform

    Authors: Yiming Li, Shan Liu, Yu Chen, Yushan Zheng, Sijia Chen, Bin Zhu, Jian Lou

    Abstract: As the successor of H.265/HEVC, the new versatile video coding standard (H.266/VVC) can provide up to 50% bitrate saving with the same subjective quality, at the cost of increased decoding complexity. To accelerate the application of the new coding standard, a real-time H.266/VVC software decoder that can support various platforms is implemented, where SIMD technologies, parallelism optimization,…

    Submitted 5 March, 2021; originally announced March 2021.

  4. Just Noticeable Difference for Deep Machine Vision

    Authors: Jian Jin, Xingxing Zhang, Xin Fu, Huan Zhang, Weisi Lin, Jian Lou, Yao Zhao

    Abstract: As an important perceptual characteristic of the Human Visual System (HVS), the Just Noticeable Difference (JND) has been studied for decades with image and video processing (e.g., perceptual visual signal compression). However, there is little exploration on the existence of JND for the Deep Machine Vision (DMV), although the DMV has made great strides in many machine vision tasks. In this paper,…

    Submitted 7 January, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

    Journal ref: IEEE Transactions on Circuits and Systems for Video Technology, 2021

  5. arXiv:2009.10298  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    End-to-End Speech Recognition and Disfluency Removal

    Authors: Paria Jamshid Lou, Mark Johnson

    Abstract: Disfluency detection is usually an intermediate step between an automatic speech recognition (ASR) system and a downstream task. By contrast, this paper aims to investigate the task of end-to-end speech recognition and disfluency removal. We specifically explore whether it is possible to train an ASR model to directly map disfluent speech into fluent transcripts, without relying on a separate disf…

    Submitted 28 September, 2020; v1 submitted 21 September, 2020; originally announced September 2020.

  6. arXiv:1906.01155  [pdf, other]

    cs.CL cs.SD eess.AS

    ShEMO -- A Large-Scale Validated Database for Persian Speech Emotion Detection

    Authors: Omid Mohamad Nezami, Paria Jamshid Lou, Mansoureh Karami

    Abstract: This paper introduces a large-scale, validated database for Persian called Sharif Emotional Speech Database (ShEMO). The database includes 3000 semi-natural utterances, equivalent to 3 hours and 25 minutes of speech data extracted from online radio plays. The ShEMO covers speech samples of 87 native-Persian speakers for five basic emotions including anger, fear, happiness, sadness and surprise, as…

    Submitted 10 June, 2019; v1 submitted 3 June, 2019; originally announced June 2019.