Multimedia

Authors and titles for October 2020

Total of 74 entries : 1-50 51-74
[51] arXiv:2010.11734 (cross-list from cs.CV) [pdf, other]
Title: Identification of deep breath while moving forward based on multiple body regions and graph signal analysis
Yunlu Wang, Cheng Yang, Menghan Hu, Jian Zhang, Qingli Li, Guangtao Zhai, Xiao-Ping Zhang
Comments: 5 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Systems and Control (eess.SY)
[52] arXiv:2010.11744 (cross-list from cs.HC) [pdf, other]
Title: A Qualitative Analysis of Haptic Feedback in Music Focused Exercises
Gareth W. Young, David Murphy, Jeffrey Weeter
Comments: 6 pages
Journal-ref: Proceedings of the International Conference on New Interfaces for Musical Expression, 2017
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[53] arXiv:2010.11886 (cross-list from cs.CV) [pdf, other]
Title: GAZED- Gaze-guided Cinematic Editing of Wide-Angle Monocular Video Recordings
K L Bhanu Moorthy, Moneish Kumar, Ramanathan Subramaniam, Vineet Gandhi
Comments: 10 pages
Journal-ref: In Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems (CHI '20). Association for Computing Machinery, New York, NY, USA, 1-11
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[54] arXiv:2010.11985 (cross-list from cs.CL) [pdf, other]
Title: MTAG: Modal-Temporal Attention Graph for Unaligned Human Multimodal Language Sequences
Jianing Yang, Yongxin Wang, Ruitao Yi, Yuying Zhu, Azaan Rehman, Amir Zadeh, Soujanya Poria, Louis-Philippe Morency
Comments: NAACL 2021
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[55] arXiv:2010.12139 (cross-list from cs.SD) [pdf, other]
Title: GSEP: A robust vocal and accompaniment separation system using gated CBHG module and loudness normalization
Soochul Park, Ben Sangbae Chon
Comments: 5 pages, 5 figures
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[56] arXiv:2010.12216 (cross-list from cs.CV) [pdf, other]
Title: Feature matching in Ultrasound images
Hang Zhu, Zihao Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[57] arXiv:2010.12325 (cross-list from cs.SD) [pdf, other]
Title: A Computational Evaluation of Musical Pattern Discovery Algorithms
Iris Ren, Anja Volk, Wouter Swierstra, Remco C. Veltkamp
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[58] arXiv:2010.12540 (cross-list from cs.IR) [pdf, other]
Title: Comprehensive Empirical Evaluation of Deep Learning Approaches for Session-based Recommendation in E-Commerce
Mohamed Maher (1), Perseverance Munga Ngoy (1), Aleksandrs Rebriks (1), Cagri Ozcinar (1), Josue Cuevas (3), Rajasekhar Sanagavarapu (3), Gholamreza Anbarjafari (1 and 2) ((1) iCV Lab, University of Tartu, Tartu, Estonia, (2) Faculty of Engineering, Hasan Kalyoncu University, Gaziantep, Turkey, (3) Rakuten Inc., Big Data Department, Machine Learning Group, Tokyo, Japan)
Comments: 48 pages, 17 figures, journal
Subjects: Information Retrieval (cs.IR); Computers and Society (cs.CY); Multimedia (cs.MM)
[59] arXiv:2010.12968 (cross-list from cs.CV) [pdf, other]
Title: Improved Actor Relation Graph based Group Activity Recognition
Zijian Kuang, Xinran Tie
Journal-ref: ICSM 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[60] arXiv:2010.13035 (cross-list from cs.HC) [pdf, other]
Title: Enactive Mandala: Audio-visualizing Brain Waves
Tomohiro Tokunaga, Michael J. Lyons
Comments: 2 pages, 2 figures
Journal-ref: Proceedings of the International Conference on New Interfaces for Musical Expression, 2013
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[61] arXiv:2010.13059 (cross-list from eess.IV) [pdf, other]
Title: A QP-adaptive Mechanism for CNN-based Filter in Video Coding
Chao Liu, Heming Sun, Jiro Katto, Xiaoyang Zeng, Yibo Fan
Subjects: Image and Video Processing (eess.IV); Machine Learning (cs.LG); Multimedia (cs.MM)
[62] arXiv:2010.13468 (cross-list from cs.SD) [pdf, other]
Title: Melody Harmonization Using Orderless NADE, Chord Balancing, and Blocked Gibbs Sampling
Chung-En Sun, Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang
Comments: Accepted by ICASSP 2021, and Demo is available at: this https URL
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[63] arXiv:2010.13540 (cross-list from cs.SD) [pdf, other]
Title: Contrastive Unsupervised Learning for Audio Fingerprinting
Zhesong Yu, Xingjian Du, Bilei Zhu, Zejun Ma
Comments: 5 pages
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[64] arXiv:2010.14129 (cross-list from cs.CV) [pdf, other]
Title: Mining Generalized Features for Detecting AI-Manipulated Fake Faces
Yang Yu, Rongrong Ni, Yao Zhao
Comments: 14 pages, 9 figures. This work has been submitted to the IEEE for possible publication
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[65] arXiv:2010.14168 (cross-list from cs.SD) [pdf, other]
Title: Rule-embedded network for audio-visual voice activity detection in live musical video streams
Yuanbo Hou, Yi Deng, Bilei Zhu, Zejun Ma, Dick Botteldooren
Comments: Submitted to ICASSP 2021
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[66] arXiv:2010.14565 (cross-list from cs.SD) [pdf, other]
Title: Remixing Music with Visual Conditioning
Li-Chia Yang, Alexander Lerch
Journal-ref: 2020 IEEE International Symposium on Multimedia
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[67] arXiv:2010.14709 (cross-list from cs.SD) [pdf, other]
Title: Melody-Conditioned Lyrics Generation with SeqGANs
Yihao Chen, Alexander Lerch
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[68] arXiv:2010.14805 (cross-list from cs.SD) [pdf, other]
Title: Large-Scale MIDI-based Composer Classification
Qiuqiang Kong, Keunwoo Choi, Yuxuan Wang
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[69] arXiv:2010.15288 (cross-list from cs.LG) [pdf, other]
Title: Speech-Image Semantic Alignment Does Not Depend on Any Prior Classification Tasks
Masood S. Mortazavi
Journal-ref: Proceedings of INTERSPEECH 2020
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Multimedia (cs.MM)
[70] arXiv:2010.15343 (cross-list from cs.CV) [pdf, other]
Title: Identifying safe intersection design through unsupervised feature extraction from satellite imagery
Jasper S. Wijnands, Haifeng Zhao, Kerry A. Nice, Jason Thompson, Katherine Scully, Jingqiu Guo, Mark Stevenson
Comments: 16 pages, 10 figures. Computer-Aided Civil and Infrastructure Engineering (2020)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[71] arXiv:2010.15869 (cross-list from cs.SD) [pdf, other]
Title: Acoustic Correlates of the Voice Qualifiers: A Survey
Shahan Ali Memon
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[72] arXiv:2010.16030 (cross-list from cs.IR) [pdf, other]
Title: Multimodal Metric Learning for Tag-based Music Retrieval
Minz Won, Sergio Oramas, Oriol Nieto, Fabien Gouyon, Xavier Serra
Comments: 5 pages, 2 figures, submitted to ICASSP 2021
Subjects: Information Retrieval (cs.IR); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[73] arXiv:2010.16073 (cross-list from cs.CV) [pdf, other]
Title: CNN based Multistage Gated Average Fusion (MGAF) for Human Action Recognition Using Depth and Inertial Sensors
Zeeshan Ahmad, Naimul Khan
Comments: arXiv admin note: text overlap with arXiv:1910.11482
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[74] arXiv:2010.16211 (cross-list from cs.CV) [pdf, other]
Title: Statistical Analysis of Signal-Dependent Noise: Application in Blind Localization of Image Splicing Forgery
Mian Zou, Heng Yao, Chuan Qin, Xinpeng Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Image and Video Processing (eess.IV)