Showing 1–4 of 4 results for author: Shaheen, Z

  1. arXiv:2311.13073

    cs.CV cs.LG cs.MM

    FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline

    Authors: Vladimir Arkhipkin, Zein Shaheen, Viacheslav Vasilev, Elizaveta Dakhova, Andrey Kuznetsov, Denis Dimitrov

    Abstract: Multimedia generation approaches occupy a prominent place in artificial intelligence research. Text-to-image models have achieved high-quality results over the last few years. However, video synthesis methods have only recently started to develop. This paper presents a new two-stage latent diffusion text-to-video generation architecture based on the text-to-image diffusion model. The first stage concerns keyfram…

    Submitted 20 December, 2023; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Project page: https://ai-forever.github.io/kandinsky-video/

  2. arXiv:2111.14192

    cs.CL cs.AI

    Zero-Shot Cross-Lingual Transfer in Legal Domain Using Transformer Models

    Authors: Zein Shaheen, Gerhard Wohlgenannt, Dmitry Mouromtsev

    Abstract: Zero-shot cross-lingual transfer is an important feature in modern NLP models and architectures to support low-resource languages. In this work, we study zero-shot cross-lingual transfer from English to French and German under Multi-Label Text Classification, where we train a classifier using an English training set and test using French and German test sets. We extend the EURLEX57K dataset, the Engl…

    Submitted 11 December, 2021; v1 submitted 28 November, 2021; originally announced November 2021.

    Comments: Accepted at the CSCI2021 conference

  3. arXiv:2010.12871

    cs.CL cs.AI

    Large Scale Legal Text Classification Using Transformer Models

    Authors: Zein Shaheen, Gerhard Wohlgenannt, Erwin Filtz

    Abstract: Large multi-label text classification is a challenging Natural Language Processing (NLP) problem that is concerned with text classification for datasets with thousands of labels. We tackle this problem in the legal domain, where datasets such as JRC-Acquis and EURLEX57K, labeled with the EuroVoc vocabulary, were created within the legal information systems of the European Union. The EuroVoc taxonom…

    Submitted 24 October, 2020; originally announced October 2020.

  4. arXiv:2005.02470

    cs.CL cs.LG

    Russian Natural Language Generation: Creation of a Language Modelling Dataset and Evaluation with Modern Neural Architectures

    Authors: Zein Shaheen, Gerhard Wohlgenannt, Bassel Zaity, Dmitry Mouromtsev, Vadim Pak

    Abstract: Generating coherent, grammatically correct, and meaningful text is very challenging; however, it is crucial to many modern NLP systems. So far, research has mostly focused on the English language; for other languages, both standardized datasets and experiments with state-of-the-art models are rare. In this work, we i) provide a novel reference dataset for Russian language modeling, ii) experim…

    Submitted 5 May, 2020; originally announced May 2020.