Multimodal Deep Learning

Akkus, Cem; Chu, Luyang; Djakovic, Vladana; Jauch-Walser, Steffen; Koch, Philipp; Loss, Giacomo; Marquardt, Christopher; Moldovan, Marco; Sauter, Nadja; Schneider, Maximilian; Schulte, Rickmer; Urbanczyk, Karol; Goschenhofer, Jann; Heumann, Christian; Hvingelby, Rasmus; Schalk, Daniel; Aßenmacher, Matthias

arXiv:2301.04856 (cs)

[Submitted on 12 Jan 2023]

Abstract: This book is the result of a seminar in which we reviewed multimodal approaches and attempted to create a solid overview of the field. It starts with the current state-of-the-art approaches in the two subfields of Deep Learning, natural language processing and computer vision, each treated individually. Next, modeling frameworks are discussed in which one modality is transformed into the other, as well as models in which one modality is used to enhance representation learning for the other. To conclude the second part, architectures that handle both modalities simultaneously are introduced. Finally, we cover other modalities as well as general-purpose multimodal models, which can handle different tasks on different modalities within one unified architecture. One interesting application (Generative Art) caps off the book.

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2301.04856 [cs.CL]
	(or arXiv:2301.04856v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2301.04856

Submission history

From: Daniel Schalk
[v1] Thu, 12 Jan 2023 07:42:36 UTC (42,629 KB)
