Multimedia

Authors and titles for April 2023

Total of 80 entries : 1-25 26-50 51-75 76-80
[26] arXiv:2304.01347 (cross-list from q-bio.NC) [pdf, other]
Title: Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Diagnosis and Lateralization Analysis
Cheng Zhu, Ying Tan, Shuqi Yang, Jiaqing Miao, Jiayi Zhu, Huan Huang, Dezhong Yao, Cheng Luo
Subjects: Neurons and Cognition (q-bio.NC); Machine Learning (cs.LG); Multimedia (cs.MM)
[27] arXiv:2304.02051 (cross-list from cs.CV) [pdf, other]
Title: Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing
Alberto Baldrati, Davide Morelli, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara
Comments: ICCV 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[28] arXiv:2304.02173 (cross-list from cs.CV) [pdf, other]
Title: ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules
Zhi-Qi Cheng, Qi Dai, Siyao Li, Jingdong Sun, Teruko Mitamura, Alexander G. Hauptmann
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[29] arXiv:2304.02274 (cross-list from cs.HC) [pdf, other]
Title: Tangible Web: An Interactive Immersion Virtual Reality Creativity System that Travels Across Reality
Simin Yang, Ze Gao, Reza Hadi Mogavi, Pan Hui, Tristan Braud
Comments: Accepted in Proceedings of the ACM Web Conference 2023, April 30-May 4, 2023, Austin, TX, USA. ACM, New York, NY, USA
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[30] arXiv:2304.02970 (cross-list from cs.CV) [pdf, html, other]
Title: Unraveling Instance Associations: A Closer Look for Audio-Visual Segmentation
Yuanhong Chen, Yuyuan Liu, Hu Wang, Fengbei Liu, Chong Wang, Helen Frazer, Gustavo Carneiro
Comments: Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[31] arXiv:2304.03076 (cross-list from eess.IV) [pdf, other]
Title: Fast QTMT Partition for VVC Intra Coding Using U-Net Framework
Zhao Zan, Leilei Huang, ShuShi Chen, Xiantao Zhang, Zhenghui Zhao, Haibing Yin, Yibo Fan
Subjects: Image and Video Processing (eess.IV); Multimedia (cs.MM)
[32] arXiv:2304.03135 (cross-list from cs.CV) [pdf, other]
Title: VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision
Mengyin Liu, Jie Jiang, Chao Zhu, Xu-Cheng Yin
Comments: Accepted by CVPR 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[33] arXiv:2304.03323 (cross-list from cs.SD) [pdf, other]
Title: DSVAE: Interpretable Disentangled Representation for Synthetic Speech Detection
Amit Kumar Singh Yadav, Kratika Bhagtani, Ziyue Xiang, Paolo Bestagini, Stefano Tubaro, Edward J. Delp
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[34] arXiv:2304.03652 (cross-list from cs.HC) [pdf, other]
Title: An Accessible Toolkit for 360 VR Studies
Corrie Green, Chloë Farr, Yang Jiang
Comments: for associated github repo, this https URL
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[35] arXiv:2304.03732 (cross-list from cs.NI) [pdf, other]
Title: Enabling immersive experiences in challenging network conditions
Pooja Aggarwal, Michael Luby, Lorenz Minder
Comments: 6 pages, 8 figures
Subjects: Networking and Internet Architecture (cs.NI); Multimedia (cs.MM)
[36] arXiv:2304.04399 (cross-list from cs.CV) [pdf, other]
Title: CAVL: Learning Contrastive and Adaptive Representations of Vision and Language
Shentong Mo, Jingfei Xia, Ihor Markevych
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[37] arXiv:2304.04901 (cross-list from cs.CV) [pdf, other]
Title: Efficiently Collecting Training Dataset for 2D Object Detection by Online Visual Feedback
Takuya Kiyokawa, Naoki Shirakura, Hiroki Katayama, Keita Tomochika, Jun Takamatsu
Comments: 13 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[38] arXiv:2304.04915 (cross-list from cs.SD) [pdf, other]
Title: AffectMachine-Classical: A novel system for generating affective classical music
Kat R. Agres, Adyasha Dash, Phoebe Chua
Comments: K. Agres and A. Dash share first authorship
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[39] arXiv:2304.05402 (cross-list from cs.CV) [pdf, other]
Title: Boosting Cross-task Transferability of Adversarial Patches with Visual Relations
Tony Ma, Songze Li, Yisong Xiao, Shunchang Liu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Multimedia (cs.MM)
[40] arXiv:2304.05600 (cross-list from cs.SD) [pdf, html, other]
Title: Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning
Nikhil Singh, Chih-Wei Wu, Iroro Orife, Mahdi Kalayeh
Comments: Accepted to CVPR 2024
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[41] arXiv:2304.06116 (cross-list from cs.CV) [pdf, other]
Title: AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection
Wentao Zhu, Yufang Huang, Xiufeng Xie, Wenxian Liu, Jincan Deng, Debing Zhang, Zhangyang Wang, Ji Liu
Comments: 10 pages, 5 figures, 3 tables, in CVPR 2023; Top-1 solution for scene / shot boundary detection this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Neural and Evolutionary Computing (cs.NE)
[42] arXiv:2304.06275 (cross-list from cs.CV) [pdf, other]
Title: Noisy Correspondence Learning with Meta Similarity Correction
Haochen Han, Kaiyao Miao, Qinghua Zheng, Minnan Luo
Comments: Accepted at CVPR 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[43] arXiv:2304.06358 (cross-list from cs.CV) [pdf, other]
Title: Deep Metric Multi-View Hashing for Multimedia Retrieval
Jian Zhu, Zhangmin Huang, Xiaohu Ruan, Yu Cui, Yongli Cheng, Lingfang Zeng
Comments: Accepted by IEEE ICME 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[44] arXiv:2304.06896 (cross-list from eess.IV) [pdf, other]
Title: Machine Perception-Driven Image Compression: A Layered Generative Approach
Yuefeng Zhang, Chuanmin Jia, Jianhui Chang, Siwei Ma
Comments: 12 pages, 12 figures
Journal-ref: IEEE Transactions on Circuits and Systems for Video Technology 2024
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[45] arXiv:2304.07056 (cross-list from eess.IV) [pdf, other]
Title: Perceptual Quality Assessment of Face Video Compression: A Benchmark and An Effective Method
Yixuan Li, Bolin Chen, Baoliang Chen, Meng Wang, Shiqi Wang, Weisi Lin
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[46] arXiv:2304.07498 (cross-list from cs.HC) [pdf, other]
Title: Virtual Reality Training of Social Skills in Autism Spectrum Disorder: An Examination of Acceptability, Usability, User Experience, Social Skills, and Executive Functions
Panagiotis Kourtesis, Evangelia-Chrysanthi Kouklari, Petros Roussos, Vasileios Mantas, Katerina Papanikolaou, Christos Skaloumbakas, Artemios Pehlivanidis
Subjects: Human-Computer Interaction (cs.HC); Computers and Society (cs.CY); Multimedia (cs.MM)
[47] arXiv:2304.07567 (cross-list from cs.CV) [pdf, other]
Title: CoVLR: Coordinating Cross-Modal Consistency and Intra-Modal Structure for Vision-Language Retrieval
Yang Yang, Zhongtian Fu, Xiangyu Wu, Wenjie Li
Comments: I apologize for my operational mistake, which has resulted in the absence of a revised version of the manuscript. Furthermore, I am concerned that the submission process of this paper may potentially lead to conflicts. Therefore, I kindly request the withdrawal of the manuscript
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[48] arXiv:2304.07775 (cross-list from cs.CV) [pdf, other]
Title: Robust Cross-Modal Knowledge Distillation for Unconstrained Videos
Wenke Xia, Xingjian Li, Andong Deng, Haoyi Xiong, Dejing Dou, Di Hu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[49] arXiv:2304.08028 (cross-list from cs.CV) [pdf, other]
Title: MMANet: Margin-aware Distillation and Modality-aware Regularization for Incomplete Multimodal Learning
Shicai Wei, Yang Luo, Chunbo Luo
Comments: 10 pages, 3 figures, CVPR2023
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[50] arXiv:2304.08345 (cross-list from cs.LG) [pdf, html, other]
Title: VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
Jing Liu, Sihan Chen, Xingjian He, Longteng Guo, Xinxin Zhu, Weining Wang, Jinhui Tang
Comments: Preprint version without audio files embedded in the PDF. The audio-embedded version can be found on the project page or GitHub
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)