Multimedia

Authors and titles for September 2022

Total of 76 entries : 1-50 51-76
[1] arXiv:2209.00526 [pdf, other]
Title: Reproducibility Companion Paper: Describing Subjective Experiment Consistency by $p$-Value P-P Plot
Jakub Nawała, Lucjan Janowski, Bogdan Ćmiel, Krzysztof Rusek, Marc A. Kastner, Jan Zahálka
Comments: Please refer to the original publication: this https URL. Related paper: this https URL or arXiv:2009.13372
Journal-ref: In Proceedings of the 29th ACM International Conference on Multimedia (MM '21). New York, NY, USA, 3627-3629 (2021)
Subjects: Multimedia (cs.MM)
[2] arXiv:2209.01421 [pdf, other]
Title: Deep Live Video Ad Placement on the 5G Edge
Mohammad Hosseini
Comments: ACM Multimedia Systems 2018, Demo track, June 2018, Amsterdam, Netherlands, 5 pages
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[3] arXiv:2209.01768 [pdf, other]
Title: Predict-and-Update Network: Audio-Visual Speech Recognition Inspired by Human Speech Perception
Jiadong Wang, Xinyuan Qian, Haizhou Li
Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[4] arXiv:2209.01982 [pdf, other]
Title: Volumetric video streaming: Current approaches and implementations
Irene Viola, Pablo Cesar
Journal-ref: Valenzise, G., Alain, M., Zerman, E., and Ozcinar, C., Immersive media technologies. 2022, Elsevier
Subjects: Multimedia (cs.MM); Image and Video Processing (eess.IV)
[5] arXiv:2209.02604 [pdf, other]
Title: Make Acoustic and Visual Cues Matter: CH-SIMS v2.0 Dataset and AV-Mixup Consistent Module
Yihe Liu, Ziqi Yuan, Huisheng Mao, Zhiyun Liang, Wanqiuyue Yang, Yuanzhe Qiu, Tie Cheng, Xiaoteng Li, Hua Xu, Kai Gao
Comments: 16 pages, 7 figures, accepted by ICMI 2022
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[6] arXiv:2209.02927 [pdf, other]
Title: Network-aware Prefetching Method for Short-Form Video Streaming
Duc Nguyen, Phong Nguyen, Vu Long, Truong Thu Huong, Pham Ngoc Nam
Subjects: Multimedia (cs.MM)
[7] arXiv:2209.03126 [pdf, other]
Title: DM$^2$S$^2$: Deep Multi-Modal Sequence Sets with Hierarchical Modality Attention
Shunsuke Kitada, Yuki Iwazaki, Riku Togashi, Hitoshi Iyatomi
Comments: 12 pages, 3 figures. Accepted by IEEE Access on Nov. 3, 2022
Journal-ref: in IEEE Access, vol. 10, pp. 120023-120034, 2022
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[8] arXiv:2209.03338 [pdf, other]
Title: ESSYS* Sharing #UC: An Emotion-driven Audiovisual Installation
Sérgio M. Rebelo, Mariana Seiça, Pedro Martins, João Bicker, Penousal Machado
Comments: Paper to be published in 2022 IEEE VIS Arts Program (VISAP 2022). For the associated supplementary materials, see this https URL
Journal-ref: 2022 IEEE VIS Arts Program (VISAP 2022)
Subjects: Multimedia (cs.MM); Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[9] arXiv:2209.03420 [pdf, other]
Title: Using Computational Approaches in Visual Identity Design: A Visual Identity for the Design and Multimedia Courses of Faculty of Sciences and Technology of University of Coimbra
Sérgio M. Rebelo, Tiago Martins, Artur Rebelo, João Bicker, Penousal Machado
Comments: Paper presented at the 10th Typography Meeting "Borders", 22--23 Oct. 2019, Matosinhos, Portugal
Journal-ref: 10th Typography Meeting "Borders", 22--23 Oct. 2019, Matosinhos, Portugal
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[10] arXiv:2209.05761 [pdf, other]
Title: A Survey on Mobile Edge Computing for Video Streaming: Opportunities and Challenges
Muhammad Asif Khan, Emna Baccour, Zina Chkirbene, Aiman Erbad, Ridha Hamila, Mounir Hamdi, Moncef Gabbouj
Comments: 36 pages
Subjects: Multimedia (cs.MM); Networking and Internet Architecture (cs.NI)
[11] arXiv:2209.06345 [pdf, other]
Title: Self-supervised Multi-Modal Video Forgery Attack Detection
Chenhui Zhao, Xiang Li, Rabih Younes
Subjects: Multimedia (cs.MM)
[12] arXiv:2209.06496 [pdf, other]
Title: CCOM-HuQin: an Annotated Multimodal Chinese Fiddle Performance Dataset
Yu Zhang, Ziya Zhou, Xiaobing Li, Feng Yu, Maosong Sun
Comments: 15 pages, 11 figures
Journal-ref: Transactions of the International Society for Music Information Retrieval, 2023, 6(1), 60-74
Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[13] arXiv:2209.08000 [pdf, other]
Title: TIMIT-TTS: a Text-to-Speech Dataset for Multimodal Synthetic Media Detection
Davide Salvi, Brian Hosler, Paolo Bestagini, Matthew C. Stamm, Stefano Tubaro
Subjects: Multimedia (cs.MM)
[14] arXiv:2209.08795 [pdf, other]
Title: AutoLV: Automatic Lecture Video Generator
Wenbin Wang, Yang Song, Sanjay Jha
Comments: 4 pages, 4 figures, ICIP 2022
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[15] arXiv:2209.08884 [pdf, other]
Title: Adaptive 3D Mesh Steganography Based on Feature-Preserving Distortion
Yushu Zhang, Jiahao Zhu, Mignfu Xue, Xinpeng Zhang, Xiaochun Cao
Comments: IEEE TVCG major revision
Subjects: Multimedia (cs.MM)
[16] arXiv:2209.10134 [pdf, other]
Title: Recipe Generation from Unsegmented Cooking Videos
Taichi Nishimura, Atsushi Hashimoto, Yoshitaka Ushiku, Hirotaka Kameko, Shinsuke Mori
Comments: Accepted at ACM TOMM; ACM Transactions on Multimedia Computing, Communications, and Applications
Subjects: Multimedia (cs.MM); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[17] arXiv:2209.11426 [pdf, other]
Title: The Beauty of Repetition in Machine Composition Scenarios
Zhejing Hu, Xiao Ma, Yan Liu, Gong Chen, Yongxu Liu
Comments: Published at ACM Multimedia 2022
Subjects: Multimedia (cs.MM)
[18] arXiv:2209.13206 [pdf, other]
Title: Blind Robust Video Watermarking Based on Adaptive Region Selection and Channel Reference
Qinwei Chang, Leichao Huang, Shaoteng Liu, Hualuo Liu, Tianshu Yang, Yexin Wang
Comments: Accepted to ACM Multimedia 2022
Subjects: Multimedia (cs.MM)
[19] arXiv:2209.13542 [pdf, other]
Title: EmpathicSchool: A multimodal dataset for real-time facial expressions and physiological data analysis under different stress conditions
Majid Hosseini, Fahad Sohrab, Raju Gottumukkala, Ravi Teja Bhupatiraju, Satya Katragadda, Jenni Raitoharju, Alexandros Iosifidis, Moncef Gabbouj
Subjects: Multimedia (cs.MM); Signal Processing (eess.SP)
[20] arXiv:2209.15557 [pdf, other]
Title: Explaining Hierarchical Features in Dynamic Point Cloud Processing
Pedro Gomes, Silvia Rossi, Laura Toni
Subjects: Multimedia (cs.MM)
[21] arXiv:2209.00291 (cross-list from cs.SD) [pdf, other]
Title: Generating Coherent Drum Accompaniment With Fills And Improvisations
Rishabh Dahale, Vaibhav Talwadker, Preeti Rao, Prateek Verma
Comments: 8 pages, 7 figures, 23rd International Society for Music Information Retrieval Conference (ISMIR 2022), Bengaluru, India
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[22] arXiv:2209.00302 (cross-list from cs.LG) [pdf, other]
Title: Progressive Fusion for Multimodal Integration
Shiv Shankar, Laure Thompson, Madalina Fiterau
Subjects: Machine Learning (cs.LG); Multimedia (cs.MM)
[23] arXiv:2209.00353 (cross-list from cs.SD) [pdf, other]
Title: AccoMontage2: A Complete Harmonization and Accompaniment Arrangement System
Li Yi, Haochen Hu, Jingwei Zhao, Gus Xia
Comments: Accepted by ISMIR 2022
Subjects: Sound (cs.SD); Information Retrieval (cs.IR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[24] arXiv:2209.00852 (cross-list from cs.CV) [pdf, other]
Title: Geometry Aligned Variational Transformer for Image-conditioned Layout Generation
Yunning Cao, Ye Ma, Min Zhou, Chuanbin Liu, Hongtao Xie, Tiezheng Ge, Yuning Jiang
Comments: To be published in ACM MM 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[25] arXiv:2209.00891 (cross-list from cs.CL) [pdf, other]
Title: Multi-modal Contrastive Representation Learning for Entity Alignment
Zhenxi Lin, Ziheng Zhang, Meng Wang, Yinghui Shi, Xian Wu, Yefeng Zheng
Comments: Accepted by COLING 2022
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[26] arXiv:2209.01308 (cross-list from cs.AI) [pdf, other]
Title: Multimodal and Crossmodal AI for Smart Data Analysis
Minh-Son Dao
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Multimedia (cs.MM)
[27] arXiv:2209.01744 (cross-list from cs.CR) [pdf, other]
Title: Investigation on Principles for Cost Assignment in Motion Vector-based Video Steganography
Jun Li, Minqing Zhang, Ke Niu, Xiaoyuan Yang
Comments: 16 pages, 8 figures
Subjects: Cryptography and Security (cs.CR); Multimedia (cs.MM)
[28] arXiv:2209.01935 (cross-list from cs.CV) [pdf, other]
Title: Forensicability Assessment of Questioned Images in Recapturing Detection
Changsheng Chen, Lin Zhao, Rizhao Cai, Zitong Yu, Jiwu Huang, Alex C. Kot
Comments: 12 pages, 10 figures, 2 tables (Submitted to TIFS July-2022)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[29] arXiv:2209.02161 (cross-list from cs.HC) [pdf, other]
Title: Comparative Study of AR Versus Image and Video for Exercise Learning
Jamie Burns, Wenge Xu, Ian Williams, Irfan Khawaja
Comments: 6 pages
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[30] arXiv:2209.02226 (cross-list from eess.IV) [pdf, other]
Title: Learning to Predict on Octree for Scalable Point Cloud Geometry Coding
Yixiang Mao, Yueyu Hu, Yao Wang
Comments: Accepted and presented at IEEE MIPR conference
Subjects: Image and Video Processing (eess.IV); Multimedia (cs.MM)
[31] arXiv:2209.02446 (cross-list from cs.CY) [pdf, other]
Title: Web3 Challenges and Opportunities for the Market
Dan Sheridan (College of Computing and Software Engineering, Kennesaw State University, GA, USA), James Harris (College of Computing and Software Engineering, Kennesaw State University, GA, USA), Frank Wear (College of Computing and Software Engineering, Kennesaw State University, GA, USA), Jerry Cowell Jr (College of Computing and Software Engineering, Kennesaw State University, GA, USA), Easton Wong (College of Computing and Software Engineering, Kennesaw State University, GA, USA), Abbas Yazdinejad (Cyber Science Lab, School of Computer Science, University of Guelph, Ontario, Canada)
Subjects: Computers and Society (cs.CY); Multimedia (cs.MM)
[32] arXiv:2209.02564 (cross-list from cs.CV) [pdf, other]
Title: Progressive Domain Adaptation with Contrastive Learning for Object Detection in the Satellite Imagery
Debojyoti Biswas, Jelena Tešić
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[33] arXiv:2209.02574 (cross-list from eess.IV) [pdf, other]
Title: Cross Modal Compression: Towards Human-comprehensible Semantic Compression
Jiguo Li, Chuanmin Jia, Xinfeng Zhang, Siwei Ma, Wen Gao
Comments: 10 pages, 4 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[34] arXiv:2209.02696 (cross-list from cs.SD) [pdf, other]
Title: Instrument Separation of Symbolic Music by Explicitly Guided Diffusion Model
Sangjun Han, Hyeongrae Ihm, DaeHan Ahn, Woohyung Lim
Comments: Submitted to NeurIPS 2022 Workshop on Machine Learning for Creativity and Design
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[35] arXiv:2209.02871 (cross-list from cs.SD) [pdf, other]
Title: Improving Choral Music Separation through Expressive Synthesized Data from Sampled Instruments
Ke Chen, Hao-Wen Dong, Yi Luo, Julian McAuley, Taylor Berg-Kirkpatrick, Miller Puckette, Shlomo Dubnov
Comments: Camera Ready for Proceedings of the 23rd International Society for Music Information Retrieval Conference, ISMIR 2022
Journal-ref: The 23rd International Society for Music Information Retrieval Conference, 2022
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[36] arXiv:2209.03275 (cross-list from cs.SD) [pdf, html, other]
Title: Multimodal Speech Enhancement Using Burst Propagation
Mohsin Raza, Leandro A. Passos, Ahmed Khubaib, Ahsan Adeel
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[37] arXiv:2209.03430 (cross-list from cs.LG) [pdf, other]
Title: Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions
Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[38] arXiv:2209.03609 (cross-list from cs.CV) [pdf, other]
Title: Frame-Subtitle Self-Supervision for Multi-Modal Video Question Answering
Jiong Wang, Zhou Zhao, Weike Jin
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[39] arXiv:2209.03656 (cross-list from cs.CV) [pdf, other]
Title: Saliency-based Multiple Region of Interest Detection from a Single 360° image
Yuuki Sawabe, Satoshi Ikehata, Kiyoharu Aizawa
Journal-ref: in IEEE Access, vol. 10, pp. 89124-89133, 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[40] arXiv:2209.04077 (cross-list from cs.SD) [pdf, other]
Title: Prediction method of Soundscape Impressions using Environmental Sounds and Aerial Photographs
Yusuke Ono, Sunao Hara, Masanobu Abe
Comments: Submitted to APSIPA ASC 2022
Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[41] arXiv:2209.04093 (cross-list from cs.CV) [pdf, other]
Title: Learning Audio-Visual embedding for Person Verification in the Wild
Peiwen Sun, Shanshan Zhang, Zishan Liu, Yougen Yuan, Taotao Zhang, Honggang Zhang, Pengfei Hu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[42] arXiv:2209.04109 (cross-list from cs.SD) [pdf, other]
Title: MATT: A Multiple-instance Attention Mechanism for Long-tail Music Genre Classification
Xiaokai Liu, Menghua Zhang
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[43] arXiv:2209.04530 (cross-list from cs.SD) [pdf, other]
Title: DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion
Ruibin Yuan, Yuxuan Wu, Jacob Li, Jaxter Kim
Comments: Accepted by Interspeech 2022
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[44] arXiv:2209.05653 (cross-list from cs.CV) [pdf, other]
Title: Semantic2Graph: Graph-based Multi-modal Feature Fusion for Action Segmentation in Videos
Junbin Zhang, Pei-Hsuan Tsai, Meng-Hsun Tsai
Comments: 13 pages, 3 figures, 9 tables. Published in Applied Intelligence
Journal-ref: Applied Intelligence (2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[45] arXiv:2209.05773 (cross-list from cs.CV) [pdf, other]
Title: CAIBC: Capturing All-round Information Beyond Color for Text-based Person Retrieval
Zijie Wang, Aichun Zhu, Jingyi Xue, Xili Wan, Chao Liu, Tian Wang, Yifeng Li
Comments: Accepted at ACM MM '22
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Multimedia (cs.MM)
[46] arXiv:2209.05800 (cross-list from cs.CV) [pdf, other]
Title: Time-of-Day Neural Style Transfer for Architectural Photographs
Yingshu Chen, Tuan-Anh Vu, Ka-Chun Shum, Binh-Son Hua, Sai-Kit Yeung
Comments: Updated version with corrected equations. Paper published at the International Conference on Computational Photography (ICCP) 2022. 12 pages of content with 6 pages of supplementary materials
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Multimedia (cs.MM)
[47] arXiv:2209.06054 (cross-list from cs.SD) [pdf, other]
Title: SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias
Zihao Wang, Qihao Liang, Kejun Zhang, Yuxing Wang, Chen Zhang, Pengfei Yu, Yongsheng Feng, Wenbo Liu, Yikai Wang, Yuntai Bao, Yiheng Yang
Comments: Zihao Wang and Qihao Liang contributed equally to this paper and share co-first authorship. This paper has been accepted by ACM Multimedia 2022, oral session, full paper (main track)
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[48] arXiv:2209.06209 (cross-list from cs.CV) [pdf, other]
Title: Look Before You Leap: Improving Text-based Person Retrieval by Learning A Consistent Cross-modal Common Manifold
Zijie Wang, Aichun Zhu, Jingyi Xue, Xili Wan, Chao Liu, Tian Wang, Yifeng Li
Comments: Accepted at ACM MM '22. arXiv admin note: text overlap with arXiv:2209.05773
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Multimedia (cs.MM)
[49] arXiv:2209.06416 (cross-list from cs.CL) [pdf, other]
Title: ImageArg: A Multi-modal Tweet Dataset for Image Persuasiveness Mining
Zhexiong Liu, Meiqi Guo, Yue Dai, Diane Litman
Comments: In Argument Mining Workshop, held in conjunction with the International Conference on Computational Linguistics (COLING), October 2022
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[50] arXiv:2209.06498 (cross-list from cs.HC) [pdf, other]
Title: Evaluation of Text Selection Techniques in Virtual Reality Head-Mounted Displays
Wenge Xu, Xuanru Meng, Kangyou Yu, Sayan Sacar, Hai-Ning Liang
Comments: IEEE ISMAR'22 conference track; 10 pages; a mistake in Section 4.4.1 regarding the symbol has been corrected in this version
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)