Showing 1–1 of 1 results for author: Kath, J

  1. Reducing the Transformer Architecture to a Minimum

    Authors: Bernhard Bermeitinger, Tomas Hrycej, Massimo Pavone, Julianus Kath, Siegfried Handschuh

    Abstract: Transformers are a widespread and successful model architecture, particularly in Natural Language Processing (NLP) and Computer Vision (CV). The essential innovation of this architecture is the Attention Mechanism, which solves the problem of extracting relevant context information from long sequences in NLP and realistic scenes in CV. A classical neural network component, a Multi-Layer Perceptron…

    Submitted 29 October, 2024; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: 8 pages, to appear in KDIR2024

    Journal ref: Proceedings of the 16th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - KDIR2024