Multimedia

Authors and titles for February 2020

Total of 39 entries
[1] arXiv:2002.00251 [pdf, other]
Title: Multi-Modal Music Information Retrieval: Augmenting Audio-Analysis with Visual Computing for Improved Music Video Analysis
Alexander Schindler
Comments: Dissertation at TU Wien
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[2] arXiv:2002.01425 [pdf, other]
Title: Spatially Variant Laplacian Pyramids for Multi-Frame Exposure Fusion
Anmol Biswas, Green Rosh K S, Sachin Deepak Lomte
Subjects: Multimedia (cs.MM)
[3] arXiv:2002.02370 [pdf, other]
Title: Data hiding in speech signal using steganography and encryption
Hanisha Chowdary N, Karan K, Bharath K P, Rajesh Kumar M
Subjects: Multimedia (cs.MM)
[4] arXiv:2002.03156 [pdf, other]
Title: A Time-Frequency Perspective on Audio Watermarking
Haijian Zhang
Subjects: Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[5] arXiv:2002.09607 [pdf, other]
Title: Multi-Representation Knowledge Distillation For Audio Classification
Liang Gao, Kele Xu, Huaimin Wang, Yuxing Peng
Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[6] arXiv:2002.10651 [pdf, other]
Title: A Comparative Evaluation of Temporal Pooling Methods for Blind Video Quality Assessment
Zhengzhong Tu, Chia-Ju Chen, Li-Heng Chen, Neil Birkbeck, Balu Adsumilli, Alan C. Bovik
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[7] arXiv:2002.11088 [pdf, other]
Title: Model Watermarking for Image Processing Networks
Jie Zhang, Dongdong Chen, Jing Liao, Han Fang, Weiming Zhang, Wenbo Zhou, Hao Cui, Nenghai Yu
Comments: AAAI 2020
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[8] arXiv:2002.12275 [pdf, other]
Title: Subjective Quality Assessment for YouTube UGC Dataset
Joong Gon Yim, Yilin Wang, Neil Birkbeck, Balu Adsumilli
Subjects: Multimedia (cs.MM); Image and Video Processing (eess.IV)
[9] arXiv:2002.01553 (cross-list from cs.NI) [pdf, other]
Title: EdgeDASH: Exploiting Network-Assisted Adaptive Video Streaming for Edge Caching
Suzan Bayhan, Setareh Maghsudi, Anatolij Zubow
Subjects: Networking and Internet Architecture (cs.NI); Multimedia (cs.MM)
[10] arXiv:2002.02609 (cross-list from cs.CV) [pdf, other]
Title: Image Fine-grained Inpainting
Zheng Hui, Jie Li, Xiumei Wang, Xinbo Gao
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[11] arXiv:2002.02927 (cross-list from cs.CV) [pdf, other]
Title: SPN-CNN: Boosting Sensor-Based Source Camera Attribution With Deep Learning
Matthias Kirchner, Cameron Johnson
Comments: Presented at the IEEE International Workshop on Information Forensics and Security (WIFS) 2019
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[12] arXiv:2002.03322 (cross-list from cs.CV) [pdf, other]
Title: VIFB: A Visible and Infrared Image Fusion Benchmark
Xingchen Zhang, Ping Ye, Gang Xiao
Comments: 11 pages, 5 figures, 5 tables. Accepted to CVPRW2020. Compared to the CVPRW2020 version, this version corrects minor mistakes in Table 4 and the first paragraph of Section 4.2
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[13] arXiv:2002.03557 (cross-list from cs.CV) [pdf, other]
Title: Multitask Emotion Recognition with Incomplete Labels
Didan Deng, Zhaokang Chen, Bertram E. Shi
Comments: Accepted by FG2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[14] arXiv:2002.03773 (cross-list from cs.CV) [pdf, other]
Title: Deriving Emotions and Sentiments from Visual Content: A Disaster Analysis Use Case
Kashif Ahmad, Syed Zohaib, Nicola Conci, Ala Al-Fuqaha
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Machine Learning (cs.LG); Multimedia (cs.MM)
[15] arXiv:2002.03977 (cross-list from eess.AS) [pdf, other]
Title: Multimodal active speaker detection and virtual cinematography for video conferencing
Ross Cutler, Ramin Mehran, Sam Johnson, Cha Zhang, Adam Kirk, Oliver Whyte, Adarsh Kowdle
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Multimedia (cs.MM); Machine Learning (stat.ML)
[16] arXiv:2002.04537 (cross-list from eess.IV) [pdf, other]
Title: 3D Point Cloud Enhancement using Graph-Modelled Multiview Depth Measurements
Xue Zhang, Gene Cheung, Jiahao Pang, Dong Tian
Comments: 5 figures
Subjects: Image and Video Processing (eess.IV); Multimedia (cs.MM)
[17] arXiv:2002.04780 (cross-list from cs.CV) [pdf, other]
Title: MFFW: A new dataset for multi-focus image fusion
Shuang Xu, Xiaoli Wei, Chunxia Zhang, Junmin Liu, Jiangshe Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[18] arXiv:2002.05070 (cross-list from cs.CV) [pdf, other]
Title: AlignNet: A Unifying Approach to Audio-Visual Alignment
Jianren Wang, Zhaoyuan Fang, Hang Zhao
Comments: WACV2020. Project video and code are available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[19] arXiv:2002.05305 (cross-list from cs.HC) [pdf, other]
Title: Interactive Multi-User 3D Visual Analytics in Augmented Reality
Wanze Xie, Yining Liang, Janet Johnson, Andrea Mower, Samuel Burns, Colleen Chelini, Paul D Alessandro, Nadir Weibel, Jürgen P. Schulze
Comments: In Proceedings of IS&T The Engineering Reality of Virtual Reality 2020
Journal-ref: Electronic Imaging, The Engineering Reality of Virtual Reality 2020, pp. 363-1-363-6(6)
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[20] arXiv:2002.05314 (cross-list from eess.AS) [pdf, other]
Title: Self-supervised learning for audio-visual speaker diarization
Yifan Ding, Yong Xu, Shi-Xiong Zhang, Yahuan Cong, Liqiang Wang
Subjects: Audio and Speech Processing (eess.AS); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Machine Learning (stat.ML)
[21] arXiv:2002.05604 (cross-list from eess.AS) [pdf, other]
Title: Efficient And Scalable Neural Residual Waveform Coding With Collaborative Quantization
Kai Zhen, Mi Suk Lee, Jongmo Sung, Seungkwon Beack, Minje Kim
Comments: Accepted in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain, May 4-8, 2020
Subjects: Audio and Speech Processing (eess.AS); Multimedia (cs.MM); Sound (cs.SD); Signal Processing (eess.SP)
[22] arXiv:2002.05639 (cross-list from cs.CL) [pdf, other]
Title: Looking Enhances Listening: Recovering Missing Speech Using Images
Tejas Srinivasan, Ramon Sanabria, Florian Metze
Comments: Accepted to ICASSP 2020
Subjects: Computation and Language (cs.CL); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[23] arXiv:2002.06652 (cross-list from cs.CL) [pdf, other]
Title: SBERT-WK: A Sentence Embedding Method by Dissecting BERT-based Word Models
Bin Wang, C.-C. Jay Kuo
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM)
[24] arXiv:2002.06794 (cross-list from cs.CR) [pdf, other]
Title: Computing in Covert Domain Using Data Hiding
Zhenxing Qian, Zichi Wang, Xinpeng Zhang
Comments: 5 pages, 7 figures
Subjects: Cryptography and Security (cs.CR); Multimedia (cs.MM)
[25] arXiv:2002.06817 (cross-list from cs.SD) [pdf, other]
Title: Addressing the confounds of accompaniments in singer identification
Tsung-Han Hsieh, Kai-Hsiang Cheng, Zhe-Cheng Fan, Yu-Ching Yang, Yi-Hsuan Yang
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[26] arXiv:2002.06923 (cross-list from cs.IR) [pdf, other]
Title: Serial Speakers: a Dataset of TV Series
Xavier Bost (LIA), Vincent Labatut (LIA), Georges Linares (LIA)
Journal-ref: 12th International Conference on Language Resources and Evaluation (LREC 2020), p.4256-4264, May 2020, Marseille, France
Subjects: Information Retrieval (cs.IR); Multimedia (cs.MM)
[27] arXiv:2002.07048 (cross-list from cs.LG) [pdf, other]
Title: Bit Allocation for Multi-Task Collaborative Intelligence
Saeed Ranjbar Alvar, Ivan V. Bajić
Comments: Accepted for publication ICASSP'20
Subjects: Machine Learning (cs.LG); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[28] arXiv:2002.07082 (cross-list from eess.IV) [pdf, other]
Title: PCSGAN: Perceptual Cyclic-Synthesized Generative Adversarial Networks for Thermal and NIR to Visible Image Transformation
Kancharagunta Kishan Babu, Shiv Ram Dubey
Comments: Published in Neurocomputing Journal, Elsevier
Journal-ref: Neurocomputing, 413:41-50, Nov 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[29] arXiv:2002.07677 (cross-list from cs.SD) [pdf, other]
Title: Performance Analysis of Adaptive Noise Cancellation for Speech Signal
Pratibha Balaji, Shruthi Narayan, Durga Sraddha, Bharath K P, Karthik R, Rajesh Kumar Muthu
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[30] arXiv:2002.09140 (cross-list from eess.IV) [pdf, other]
Title: Blind Omnidirectional Image Quality Assessment with Viewport Oriented Graph Convolutional Networks
Jiahua Xu, Wei Zhou, Zhibo Chen
Subjects: Image and Video Processing (eess.IV); Multimedia (cs.MM)
[31] arXiv:2002.09461 (cross-list from cs.CV) [pdf, other]
Title: Fine-Grained Instance-Level Sketch-Based Video Retrieval
Peng Xu, Kun Liu, Tao Xiang, Timothy M. Hospedales, Zhanyu Ma, Jun Guo, Yi-Zhe Song
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[32] arXiv:2002.09748 (cross-list from cs.SD) [pdf, other]
Title: DECIBEL: Improving Audio Chord Estimation for Popular Music by Alignment and Integration of Crowd-Sourced Symbolic Representations
Daphne Odekerken, Hendrik Vincent Koops, Anja Volk
Comments: 81 pages, 47 figures
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[33] arXiv:2002.10696 (cross-list from cs.RO) [pdf, other]
Title: Human Perception-Optimized Planning for Comfortable VR-Based Telepresence
Israel Becerra, Markku Suomalainen, Eliezer Lozano, Katherine J. Mimnaugh, Rafael Murrieta-Cid, Steven M. LaValle
Comments: Accepted for publication in IEEE Robotics and Automation Letters (RA-L)
Subjects: Robotics (cs.RO); Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[34] arXiv:2002.10798 (cross-list from eess.IV) [pdf, other]
Title: Model-based Joint Bit Allocation between Geometry and Color for Video-based 3D Point Cloud Compression
Qi Liu, Hui Yuan, Junhui Hou, Raouf Hamzaoui, Honglei Su
Comments: 13 pages, 10 figures, submitted to IEEE Transactions on Multimedia
Journal-ref: IEEE Transactions on Multimedia, 2020
Subjects: Image and Video Processing (eess.IV); Multimedia (cs.MM); Systems and Control (eess.SY)
[35] arXiv:2002.11079 (cross-list from cs.CV) [pdf, other]
Title: DDet: Dual-path Dynamic Enhancement Network for Real-World Image Super-Resolution
Yukai Shi, Haoyu Zhong, Zhijing Yang, Xiaojun Yang, Liang Lin
Comments: Code address: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[36] arXiv:2002.11616 (cross-list from cs.CV) [pdf, other]
Title: Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution
Xiaoyu Xiang, Yapeng Tian, Yulun Zhang, Yun Fu, Jan P. Allebach, Chenliang Xu
Comments: This work is accepted in CVPR 2020. The source code and pre-trained model are available on this https URL. 12 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[37] arXiv:2002.11812 (cross-list from cs.CV) [pdf, other]
Title: Learning to Shadow Hand-drawn Sketches
Qingyuan Zheng, Zhuoru Li, Adam Bargteil
Comments: To appear in CVPR 2020 (Oral presentation)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Multimedia (cs.MM)
[38] arXiv:2002.11891 (cross-list from eess.IV) [pdf, other]
Title: BBAND Index: A No-Reference Banding Artifact Predictor
Zhengzhong Tu, Jessie Lin, Yilin Wang, Balu Adsumilli, Alan C. Bovik
Comments: Accepted by ICASSP 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[39] arXiv:2002.12521 (cross-list from eess.IV) [pdf, other]
Title: Improved Image Coding Autoencoder With Deep Learning
Licheng Xiao, Hairong Wang, Nam Ling
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG); Multimedia (cs.MM)