Multimedia

Authors and titles for April 2023

Total of 80 entries : 1-25 26-50 51-75 76-80

[26] arXiv:2304.01347 (cross-list from q-bio.NC) [pdf, other]: Title: Temporal Dynamic Synchronous Functional Brain Network for Schizophrenia Diagnosis and Lateralization Analysis

Cheng Zhu, Ying Tan, Shuqi Yang, Jiaqing Miao, Jiayi Zhu, Huan Huang, Dezhong Yao, Cheng Luo

Subjects: Neurons and Cognition (q-bio.NC); Machine Learning (cs.LG); Multimedia (cs.MM)
[27] arXiv:2304.02051 (cross-list from cs.CV) [pdf, other]: Title: Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

Alberto Baldrati, Davide Morelli, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara

Comments: ICCV 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[28] arXiv:2304.02173 (cross-list from cs.CV) [pdf, other]: Title: ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules

Zhi-Qi Cheng, Qi Dai, Siyao Li, Jingdong Sun, Teruko Mitamura, Alexander G. Hauptmann

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[29] arXiv:2304.02274 (cross-list from cs.HC) [pdf, other]: Title: Tangible Web: An Interactive Immersion Virtual Reality Creativity System that Travels Across Reality

Simin Yang, Ze Gao, Reza Hadi Mogavi, Pan Hui, Tristan Braud

Comments: Accepted In Proceedings of the ACM Web Conference 2023, April 30-May 4, 2023, Austin, TX, USA. ACM, New York, NY, USA

Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[30] arXiv:2304.02970 (cross-list from cs.CV) [pdf, html, other]: Title: Unraveling Instance Associations: A Closer Look for Audio-Visual Segmentation

Yuanhong Chen, Yuyuan Liu, Hu Wang, Fengbei Liu, Chong Wang, Helen Frazer, Gustavo Carneiro

Comments: Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[31] arXiv:2304.03076 (cross-list from eess.IV) [pdf, other]: Title: Fast QTMT Partition for VVC Intra Coding Using U-Net Framework

Zhao Zan, Leilei Huang, ShuShi Chen, Xiantao Zhang, Zhenghui Zhao, Haibing Yin, Yibo Fan

Subjects: Image and Video Processing (eess.IV); Multimedia (cs.MM)
[32] arXiv:2304.03135 (cross-list from cs.CV) [pdf, other]: Title: VLPD: Context-Aware Pedestrian Detection via Vision-Language Semantic Self-Supervision

Mengyin Liu, Jie Jiang, Chao Zhu, Xu-Cheng Yin

Comments: Accepted by CVPR 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[33] arXiv:2304.03323 (cross-list from cs.SD) [pdf, other]: Title: DSVAE: Interpretable Disentangled Representation for Synthetic Speech Detection

Amit Kumar Singh Yadav, Kratika Bhagtani, Ziyue Xiang, Paolo Bestagini, Stefano Tubaro, Edward J. Delp

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[34] arXiv:2304.03652 (cross-list from cs.HC) [pdf, other]: Title: An Accessible Toolkit for 360 VR Studies

Corrie Green, Chloë Farr, Yang Jiang

Comments: for associated github repo, this https URL

Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[35] arXiv:2304.03732 (cross-list from cs.NI) [pdf, other]: Title: Enabling immersive experiences in challenging network conditions

Pooja Aggarwal, Michael Luby, Lorenz Minder

Comments: 6 pages, 8 figures

Subjects: Networking and Internet Architecture (cs.NI); Multimedia (cs.MM)
[36] arXiv:2304.04399 (cross-list from cs.CV) [pdf, other]: Title: CAVL: Learning Contrastive and Adaptive Representations of Vision and Language

Shentong Mo, Jingfei Xia, Ihor Markevych

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[37] arXiv:2304.04901 (cross-list from cs.CV) [pdf, other]: Title: Efficiently Collecting Training Dataset for 2D Object Detection by Online Visual Feedback

Takuya Kiyokawa, Naoki Shirakura, Hiroki Katayama, Keita Tomochika, Jun Takamatsu

Comments: 13 pages, 14 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[38] arXiv:2304.04915 (cross-list from cs.SD) [pdf, other]: Title: AffectMachine-Classical: A novel system for generating affective classical music

Kat R. Agres, Adyasha Dash, Phoebe Chua

Comments: K. Agres and A. Dash share first authorship

Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[39] arXiv:2304.05402 (cross-list from cs.CV) [pdf, other]: Title: Boosting Cross-task Transferability of Adversarial Patches with Visual Relations

Tony Ma, Songze Li, Yisong Xiao, Shunchang Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG); Multimedia (cs.MM)
[40] arXiv:2304.05600 (cross-list from cs.SD) [pdf, html, other]: Title: Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning

Nikhil Singh, Chih-Wei Wu, Iroro Orife, Mahdi Kalayeh

Comments: Accepted to CVPR 2024

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[41] arXiv:2304.06116 (cross-list from cs.CV) [pdf, other]: Title: AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection

Wentao Zhu, Yufang Huang, Xiufeng Xie, Wenxian Liu, Jincan Deng, Debing Zhang, Zhangyang Wang, Ji Liu

Comments: 10 pages, 5 figures, 3 tables, in CVPR 2023; Top-1 solution for scene / shot boundary detection this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Neural and Evolutionary Computing (cs.NE)
[42] arXiv:2304.06275 (cross-list from cs.CV) [pdf, other]: Title: Noisy Correspondence Learning with Meta Similarity Correction

Haochen Han, Kaiyao Miao, Qinghua Zheng, Minnan Luo

Comments: Accepted at CVPR 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[43] arXiv:2304.06358 (cross-list from cs.CV) [pdf, other]: Title: Deep Metric Multi-View Hashing for Multimedia Retrieval

Jian Zhu, Zhangmin Huang, Xiaohu Ruan, Yu Cui, Yongli Cheng, Lingfang Zeng

Comments: Accepted by IEEE ICME 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[44] arXiv:2304.06896 (cross-list from eess.IV) [pdf, other]: Title: Machine Perception-Driven Image Compression: A Layered Generative Approach

Yuefeng Zhang, Chuanmin Jia, Jiannhui Chang, Siwei Ma

Comments: 12 pages, 12 figures

Journal-ref: IEEE Transactions on Circuits and Systems for Video Technology 2024

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[45] arXiv:2304.07056 (cross-list from eess.IV) [pdf, other]: Title: Perceptual Quality Assessment of Face Video Compression: A Benchmark and An Effective Method

Yixuan Li, Bolin Chen, Baoliang Chen, Meng Wang, Shiqi Wang, Weisi Lin

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[46] arXiv:2304.07498 (cross-list from cs.HC) [pdf, other]: Title: Virtual Reality Training of Social Skills in Autism Spectrum Disorder: An Examination of Acceptability, Usability, User Experience, Social Skills, and Executive Functions

Panagiotis Kourtesis, Evangelia-Chrysanthi Kouklari, Petros Roussos, Vasileios Mantas, Katerina Papanikolaou, Christos Skaloumbakas, Artemios Pehlivanidis

Subjects: Human-Computer Interaction (cs.HC); Computers and Society (cs.CY); Multimedia (cs.MM)
[47] arXiv:2304.07567 (cross-list from cs.CV) [pdf, other]: Title: CoVLR: Coordinating Cross-Modal Consistency and Intra-Modal Structure for Vision-Language Retrieval

Yang Yang, Zhongtian Fu, Xiangyu Wu, Wenjie Li

Comments: I apologize for my operational mistake, which has resulted in the absence of a revised version of the manuscript. Furthermore, I am concerned that the submission process of this paper may potentially lead to conflicts. Therefore, I kindly request the withdrawal of the manuscript

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[48] arXiv:2304.07775 (cross-list from cs.CV) [pdf, other]: Title: Robust Cross-Modal Knowledge Distillation for Unconstrained Videos

Wenke Xia, Xingjian Li, Andong Deng, Haoyi Xiong, Dejing Dou, Di Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[49] arXiv:2304.08028 (cross-list from cs.CV) [pdf, other]: Title: MMANet: Margin-aware Distillation and Modality-aware Regularization for Incomplete Multimodal Learning

Shicai Wei, Yang Luo, Chunbo Luo

Comments: 10 pages, 3 figures, CVPR2023

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[50] arXiv:2304.08345 (cross-list from cs.LG) [pdf, html, other]: Title: VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset

Jing Liu, Sihan Chen, Xingjian He, Longteng Guo, Xinxin Zhu, Weining Wang, Jinhui Tang

Comments: Preprint version w/o audio files embedded in PDF. Audio embedded version can be found on project page or github

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
