Showing 1–9 of 9 results for author: Rao, Y

Searching in archive eess.
  1. arXiv:2506.05655  [pdf, ps, other]

    cs.CV eess.IV

    Aerial Multi-View Stereo via Adaptive Depth Range Inference and Normal Cues

    Authors: Yimei Liu, Yakun Ju, Yuan Rao, Hao Fan, Junyu Dong, Feng Gao, Qian Du

    Abstract: Three-dimensional digital urban reconstruction from multi-view aerial images is a critical application where deep multi-view stereo (MVS) methods outperform traditional techniques. However, existing methods commonly overlook the key differences between aerial and close-range settings, such as varying depth ranges along epipolar lines and insensitive feature-matching associated with low-detailed ae…

    Submitted 5 June, 2025; originally announced June 2025.

    Comments: IEEE TGRS 2025

  2. arXiv:2502.04328  [pdf, ps, other]

    cs.CV cs.CL cs.MM cs.SD eess.AS eess.IV

    Ola: Pushing the Frontiers of Omni-Modal Language Model

    Authors: Zuyan Liu, Yuhao Dong, Jiahui Wang, Ziwei Liu, Winston Hu, Jiwen Lu, Yongming Rao

    Abstract: Recent advances in large language models, particularly following GPT-4o, have sparked increasing interest in developing omni-modal models capable of understanding more modalities. While some open-source alternatives have emerged, there is still a notable lag behind specialized single-modality models in performance. In this paper, we present Ola, an Omni-modal Language model that achieves competiti…

    Submitted 2 June, 2025; v1 submitted 6 February, 2025; originally announced February 2025.

  3. UMFA: A photorealistic style transfer method based on U-Net and multi-layer feature aggregation

    Authors: D. Y. Rao, X. J. Wu, H. Li, J. Kittler, T. Y. Xu

    Abstract: In this paper, we propose a photorealistic style transfer network to emphasize the natural effect of photorealistic image stylization. In general, distortion of the image content and lacking of details are two typical issues in the style transfer field. To this end, we design a novel framework employing the U-Net structure to maintain the rich spatial clues, with a multi-layer feature aggregation…

    Submitted 13 August, 2021; originally announced August 2021.

  4. arXiv:2011.07017  [pdf, other]

    cs.CV cs.LG eess.IV

    NightVision: Generating Nighttime Satellite Imagery from Infra-Red Observations

    Authors: Paula Harder, William Jones, Redouane Lguensat, Shahine Bouabid, James Fulton, Dánell Quesada-Chacón, Aris Marcolongo, Sofija Stefanović, Yuhan Rao, Peter Manshausen, Duncan Watson-Parris

    Abstract: The recent explosion in applications of machine learning to satellite imagery often rely on visible images and therefore suffer from a lack of data during the night. The gap can be filled by employing available infra-red observations to generate visible images. This work presents how deep learning can be applied successfully to create those images by using U-Net based architectures. The proposed m…

    Submitted 8 December, 2020; v1 submitted 13 November, 2020; originally announced November 2020.

  5. arXiv:2004.11012  [pdf, other]

    eess.AS cs.SD

    ByteSing: A Chinese Singing Voice Synthesis System Using Duration Allocated Encoder-Decoder Acoustic Models and WaveRNN Vocoders

    Authors: Yu Gu, Xiang Yin, Yonghui Rao, Yuan Wan, Benlai Tang, Yang Zhang, Jitong Chen, Yuxuan Wang, Zejun Ma

    Abstract: This paper presents ByteSing, a Chinese singing voice synthesis (SVS) system based on duration allocated Tacotron-like acoustic models and WaveRNN neural vocoders. Different from the conventional SVS models, the proposed ByteSing employs Tacotron-like encoder-decoder structures as the acoustic models, in which the CBHG models and recurrent neural networks (RNNs) are explored as encoders and decode…

    Submitted 24 January, 2021; v1 submitted 23 April, 2020; originally announced April 2020.

    Comments: Accepted by ISCSLP2021

  6. arXiv:2003.13081  [pdf, other]

    eess.IV cs.CV

    Structure-Preserving Super Resolution with Gradient Guidance

    Authors: Cheng Ma, Yongming Rao, Yean Cheng, Ce Chen, Jiwen Lu, Jie Zhou

    Abstract: Structures matter in single image super resolution (SISR). Recent studies benefiting from generative adversarial network (GAN) have promoted the development of SISR by recovering photo-realistic images. However, there are always undesired structural distortions in the recovered images. In this paper, we propose a structure-preserving super resolution method to alleviate the above issue while maint…

    Submitted 29 March, 2020; originally announced March 2020.

    Comments: Accepted to CVPR 2020

  7. arXiv:1907.09411  [pdf, other]

    eess.SP

    Deep Fault Diagnosis for Rotating Machinery with Scarce Labeled Samples

    Authors: Jing Zhang, Jing Tian, Tao Wen, Xiaohui Yang, Yong Rao, Xiaobin Xu

    Abstract: Early and accurately detecting faults in rotating machinery is crucial for operation safety of the modern manufacturing system. In this paper, we proposed a novel Deep fault diagnosis (DFD) method for rotating machinery with scarce labeled samples. DFD tackles the challenging problem by transferring knowledge from shallow models, which is based on the idea that shallow models trained with differen…

    Submitted 13 July, 2019; originally announced July 2019.

    Comments: To appear in Chinese Journal of Electronics

  8. arXiv:1812.03985  [pdf, other]

    eess.SP physics.optics

    Rayleigh fading suppression in one-dimension optical scatters

    Authors: Shengtao Lin, Zinan Wang, Ji Xiong, Yun Fu, Jialin Jiang, Yue Wu, Yongxiang Chen, Chongyu Lu, Yunjiang Rao

    Abstract: Highly coherent wave is favorable for applications in which phase retrieval is necessary, yet a high coherent wave is prone to encounter Rayleigh fading phenomenon as it passes through a medium of random scatters. As an exemplary case, phase-sensitive optical time-domain reflectometry (Φ-OTDR) utilizes coherent interference of backscattering light along a fiber to achieve ultra-sensitive acoustic…

    Submitted 8 December, 2018; originally announced December 2018.

    Comments: 9 pages, 5 figures

    Journal ref: IEEE Access 2019

  9. arXiv:1806.08898  [pdf, other]

    eess.IV

    Pansharpening via Detail Injection Based Convolutional Neural Networks

    Authors: Lin He, Yizhou Rao, Jun Li, Antonio Plaza, Jiawei Zhu

    Abstract: Pansharpening aims to fuse a multispectral (MS) image with an associated panchromatic (PAN) image, producing a composite image with the spectral resolution of the former and the spatial resolution of the latter. Traditional pansharpening methods can be ascribed to a unified detail injection context, which views the injected MS details as the integration of PAN details and band-wise injection gains…

    Submitted 22 June, 2018; originally announced June 2018.