Multimedia

Authors and titles for September 2021

Total of 53 entries : 1-25 26-50 51-53
[26] arXiv:2109.04023 (cross-list from cs.HC) [pdf, other]
Title: Rethinking Immersive Virtual Reality and Empathy
Ken Jen Lee, Edith Law
Comments: 4 pages, ACM CSCW 2021 workshop, arttech: Performance and Embodiment in Technology for Resilience and Mental Health
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[27] arXiv:2109.04177 (cross-list from cs.HC) [pdf, other]
Title: Comfort and Sickness while Virtually Aboard an Autonomous Telepresence Robot
Markku Suomalainen, Katherine J. Mimnaugh, Israel Becerra, Eliezer Lozano, Rafael Murrieta-Cid, Steven M. LaValle
Comments: Accepted for publication in EuroXR 2021
Journal-ref: In: Bourdot P., Alcañiz Raya M., Figueroa P., Interrante V., Kuhlen T.W., Reiners D. (eds) Virtual Reality and Mixed Reality. EuroXR 2021. Lecture Notes in Computer Science, vol 13105. Springer, Cham
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM); Robotics (cs.RO)
[28] arXiv:2109.04275 (cross-list from cs.CV) [pdf, other]
Title: M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining
Xiao Dong, Xunlin Zhan, Yangxin Wu, Yunchao Wei, Michael C. Kampffmeyer, Xiaoyong Wei, Minlong Lu, Yaowei Wang, Xiaodan Liang
Comments: CVPR2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[29] arXiv:2109.04872 (cross-list from cs.CV) [pdf, other]
Title: Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding
Zhenzhi Wang, Limin Wang, Tao Wu, Tianhao Li, Gangshan Wu
Comments: AAAI 2022 Camera Ready Version
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[30] arXiv:2109.05199 (cross-list from cs.CL) [pdf, other]
Title: A Survey on Multi-modal Summarization
Anubhav Jangra, Sourajit Mukherjee, Adam Jatowt, Sriparna Saha, Mohammad Hasanuzzaman
Comments: Accepted in ACM CSUR 2023
Subjects: Computation and Language (cs.CL); Multimedia (cs.MM); Neural and Evolutionary Computing (cs.NE)
[31] arXiv:2109.05665 (cross-list from cs.CV) [pdf, other]
Title: CANS: Communication Limited Camera Network Self-Configuration for Intelligent Industrial Surveillance
Jingzheng Tu, Qimin Xu, Cailian Chen
Comments: 6 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[32] arXiv:2109.06072 (cross-list from cs.IR) [pdf, other]
Title: BeautifAI -- A Personalised Occasion-oriented Makeup Recommendation System
Kshitij Gulati, Gaurav Verma, Mukesh Mohania, Ashish Kundu
Comments: Withdrawing due to issues with training the Makeup Style Transfer (section about style transfer). This renders the current methodology invalid
Subjects: Information Retrieval (cs.IR); Multimedia (cs.MM)
[33] arXiv:2109.06637 (cross-list from cs.CV) [pdf, other]
Title: Multi-modal Representation Learning for Video Advertisement Content Structuring
Daya Guo, Zhaoyang Zeng
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[34] arXiv:2109.08013 (cross-list from cs.CV) [pdf, other]
Title: Detecting Propaganda Techniques in Memes
Dimitar Dimitrov, Bishr Bin Ali, Shaden Shaar, Firoj Alam, Fabrizio Silvestri, Hamed Firooz, Preslav Nakov, Giovanni Da San Martino
Comments: propaganda, disinformation, fake news, memes, multimodality. arXiv admin note: text overlap with arXiv:2105.09284
Journal-ref: ACL-2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM)
[35] arXiv:2109.08039 (cross-list from cs.CV) [pdf, other]
Title: A Survey on Temporal Sentence Grounding in Videos
Xiaohan Lan, Yitian Yuan, Xin Wang, Zhi Wang, Wenwu Zhu
Comments: 32 pages with 19 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[36] arXiv:2109.08371 (cross-list from cs.CV) [pdf, other]
Title: Audio-Visual Collaborative Representation Learning for Dynamic Saliency Prediction
Hailong Ning, Bin Zhao, Zhanxuan Hu, Lang He, Ercheng Pei
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[37] arXiv:2109.08411 (cross-list from cs.CV) [pdf, other]
Title: Cross Modification Attention Based Deliberation Model for Image Captioning
Zheng Lian, Yanan Zhang, Haichang Li, Rui Wang, Xiaohui Hu
Comments: This work has been submitted to the IEEE TMM for possible publication
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[38] arXiv:2109.08478 (cross-list from cs.CL) [pdf, other]
Title: Multimodal Incremental Transformer with Visual Grounding for Visual Dialogue Generation
Feilong Chen, Fandong Meng, Xiuyi Chen, Peng Li, Jie Zhou
Comments: ACL Findings 2021
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[39] arXiv:2109.08942 (cross-list from eess.IV) [pdf, other]
Title: iWave3D: End-to-end Brain Image Compression with Trainable 3-D Wavelet Transform
Dongmei Xue, Haichuan Ma, Li Li, Dong Liu, Zhiwei Xiong
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[40] arXiv:2109.09023 (cross-list from cs.CR) [pdf, other]
Title: Anti-Neuron Watermarking: Protecting Personal Data Against Unauthorized Neural Networks
Zihang Zou, Boqing Gong, Liqiang Wang
Comments: Accepted to ECCV 2022
Subjects: Cryptography and Security (cs.CR); Machine Learning (cs.LG); Multimedia (cs.MM)
[41] arXiv:2109.09617 (cross-list from cs.SD) [pdf, other]
Title: TeleMelody: Lyric-to-Melody Generation with a Template-Based Two-Stage Method
Zeqian Ju, Peiling Lu, Xu Tan, Rui Wang, Chen Zhang, Songruoyao Wu, Kejun Zhang, Xiangyang Li, Tao Qin, Tie-Yan Liu
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[42] arXiv:2109.10683 (cross-list from cs.LG) [pdf, other]
Title: Adaptive Neural Message Passing for Inductive Learning on Hypergraphs
Devanshu Arya, Deepak K. Gupta, Stevan Rudinac, Marcel Worring
Subjects: Machine Learning (cs.LG); Multimedia (cs.MM)
[43] arXiv:2109.10849 (cross-list from eess.IV) [pdf, other]
Title: DVC-P: Deep Video Compression with Perceptual Optimizations
Saiping Zhang, Marta Mrak, Luis Herranz, Marc Górriz, Shuai Wan, Fuzheng Yang
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[44] arXiv:2109.11526 (cross-list from cs.CV) [pdf, other]
Title: MARMOT: A Deep Learning Framework for Constructing Multimodal Representations for Vision-and-Language Tasks
Patrick Y. Wu, Walter R. Mebane Jr
Comments: 57 pages, 16 figures. Forthcoming in Computational Communication Research
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Computers and Society (cs.CY); Machine Learning (cs.LG); Multimedia (cs.MM)
[45] arXiv:2109.12252 (cross-list from cs.CV) [pdf, other]
Title: Long-Range Feature Propagating for Natural Image Matting
Qinglin Liu, Haozhe Xie, Shengping Zhang, Bineng Zhong, Rongrong Ji
Journal-ref: ACM International Conference on Multimedia (ACM MM) 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[46] arXiv:2109.12293 (cross-list from cs.NI) [pdf, other]
Title: Adaptive video transmission using QUBO method and Digital Annealer based on Ising machine
Bo Wei, Hang Song, Jiro Katto
Subjects: Networking and Internet Architecture (cs.NI); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[47] arXiv:2109.12307 (cross-list from cs.CV) [pdf, other]
Title: Multi-Modal Multi-Instance Learning for Retinal Disease Recognition
Xirong Li, Yang Zhou, Jie Wang, Hailan Lin, Jianchun Zhao, Dayong Ding, Weihong Yu, Youxin Chen
Comments: Accepted by ACM Multimedia 2021 (Main Track)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[48] arXiv:2109.12651 (cross-list from cs.IR) [pdf, other]
Title: Why Do We Click: Visual Impression-aware News Recommendation
Jiahao Xun, Shengyu Zhang, Zhou Zhao, Jieming Zhu, Qi Zhang, Jingjie Li, Xiuqiang He, Xiaofei He, Tat-Seng Chua, Fei Wu
Comments: Accepted by ACM Multimedia 2021
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[49] arXiv:2109.12776 (cross-list from cs.CV) [pdf, other]
Title: Joint Multimedia Event Extraction from Video and Article
Brian Chen, Xudong Lin, Christopher Thomas, Manling Li, Shoya Yoshida, Lovish Chum, Heng Ji, Shih-Fu Chang
Comments: To be presented at EMNLP 2021 findings
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[50] arXiv:2109.14306 (cross-list from cs.CV) [pdf, other]
Title: Three-Stream 3D/1D CNN for Fine-Grained Action Classification and Segmentation in Table Tennis
Pierre-Etienne Martin (MPI-EVA), Jenny Benois-Pineau (UB), Renaud Péteri (MIA), Julien Morlier (UB)
Comments: MMSports '21, October 20, 2021, Virtual Event, Oct 2021, Chengdu, China
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Multimedia (cs.MM)