Showing 1–14 of 14 results for author: Novák, M

  1. arXiv:2509.17796  [pdf, ps, other]

    cs.CL

    Findings of the Fourth Shared Task on Multilingual Coreference Resolution: Can LLMs Dethrone Traditional Approaches?

    Authors: Michal Novák, Miloslav Konopík, Anna Nedoluzhko, Martin Popel, Ondřej Pražák, Jakub Sido, Milan Straka, Zdeněk Žabokrtský, Daniel Zeman

    Abstract: The paper presents an overview of the fourth edition of the Shared Task on Multilingual Coreference Resolution, organized as part of the CODI-CRAC 2025 workshop. As in the previous editions, participants were challenged to develop systems that identify mentions and cluster them according to identity coreference. A key innovation of this year's task was the introduction of a dedicated Large Langu…

    Submitted 22 September, 2025; originally announced September 2025.

    Comments: Accepted to CODI-CRAC 2025

  2. Mitigating Language Barriers in Education: Developing Multilingual Digital Learning Materials with Machine Translation

    Authors: Lucie Poláková, Martin Popel, Věra Kloudová, Michal Novák, Mariia Anisimova, Jiří Balhar

    Abstract: The EdUKate project combines digital education, linguistics, translation studies, and machine translation to develop multilingual learning materials for Czech primary and secondary schools. Launched through collaboration between a major Czech academic institution and the country's largest educational publisher, the project is aimed at translating up to 9,000 multimodal interactive exercises from C…

    Submitted 11 September, 2025; originally announced September 2025.

    Comments: 8 pages, 2 figures

    Journal ref: L. Poláková, M. Popel, V. Kloudová, M. Novák, M. Anisimova, J. Balhar (2025). Mitigating Language Barriers in Education: Developing Multilingual Digital Learning Materials with Machine Translation, EDULEARN25, pp. 8754-8760

  3. arXiv:2507.19854  [pdf, ps, other]

    cs.RO cs.HC

    Think, Act, Learn: A Framework for Autonomous Robotic Agents using Closed-Loop Large Language Models

    Authors: Anjali R. Menon, Rohit K. Sharma, Priya Singh, Chengyu Wang, Aurora M. Ferreira, Mateja Novak

    Abstract: The integration of Large Language Models (LLMs) into robotics has unlocked unprecedented capabilities in high-level task planning. However, most current systems operate in an open-loop fashion, where LLMs act as one-shot planners, rendering them brittle and unable to adapt to unforeseen circumstances in dynamic physical environments. To overcome this limitation, this paper introduces the "Think, A…

    Submitted 26 July, 2025; originally announced July 2025.

    Comments: 13 pages, 7 figures

    MSC Class: 68T05; 68T07; 68T40

    ACM Class: I.2.6; I.2.9; I.2.7; I.2.10; H.5.2

  4. Findings of the Third Shared Task on Multilingual Coreference Resolution

    Authors: Michal Novák, Barbora Dohnalová, Miloslav Konopík, Anna Nedoluzhko, Martin Popel, Ondřej Pražák, Jakub Sido, Milan Straka, Zdeněk Žabokrtský, Daniel Zeman

    Abstract: The paper presents an overview of the third edition of the shared task on multilingual coreference resolution, held as part of the CRAC 2024 workshop. Similarly to the previous two editions, the participants were challenged to develop systems capable of identifying mentions and clustering them based on identity coreference. This year's edition took another step towards real-world application by…

    Submitted 9 November, 2024; v1 submitted 21 October, 2024; originally announced October 2024.

    Comments: Accepted to CRAC 2024

  5. arXiv:2404.18385  [pdf, other]

    cs.HC cs.AI

    Equivalence: An analysis of artists' roles with Image Generative AI from Conceptual Art perspective through an interactive installation design practice

    Authors: Yixuan Li, Dan C. Baciu, Marcos Novak, George Legrady

    Abstract: Over the past year, the emergence of advanced text-to-image Generative AI models has significantly impacted the art world, challenging traditional notions of creativity and the role of artists. This study explores how artists interact with these technologies, using a 5P model (Purpose, People, Process, Product, and Press) based on Rhodes' creativity framework to compare the artistic processes behi…

    Submitted 29 April, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

    ACM Class: I.2.7; J.0; J.5

  6. arXiv:2404.06964  [pdf, other]

    cs.CL

    Charles Translator: A Machine Translation System between Ukrainian and Czech

    Authors: Martin Popel, Lucie Poláková, Michal Novák, Jindřich Helcl, Jindřich Libovický, Pavel Straňák, Tomáš Krabač, Jaroslava Hlaváčová, Mariia Anisimova, Tereza Chlaňová

    Abstract: We present Charles Translator, a machine translation system between Ukrainian and Czech, developed as part of a society-wide effort to mitigate the impact of the Russian-Ukrainian war on individuals and society. The system was developed in the spring of 2022 with the help of many language data providers in order to quickly meet the demand for such a service, which was not available at the time in…

    Submitted 10 April, 2024; originally announced April 2024.

  7. arXiv:2310.13381  [pdf, other]

    cs.LG

    Accelerated sparse Kernel Spectral Clustering for large scale data clustering problems

    Authors: Mihaly Novak, Rocco Langone, Carlos Alzate, Johan Suykens

    Abstract: An improved version of the sparse multiway kernel spectral clustering (KSC) is presented in this brief. The original algorithm is derived from weighted kernel principal component (KPCA) analysis formulated within the primal-dual least-squares support vector machine (LS-SVM) framework. Sparsity is achieved then by the combination of the incomplete Cholesky decomposition (ICD) based low rank approxi…

    Submitted 20 October, 2023; originally announced October 2023.

  8. arXiv:2308.03601  [pdf, other]

    cs.CL

    Negative Lexical Constraints in Neural Machine Translation

    Authors: Josef Jon, Dušan Variš, Michal Novák, João Paulo Aires, Ondřej Bojar

    Abstract: This paper explores negative lexical constraining in English to Czech neural machine translation. Negative lexical constraining is used to prohibit certain words or expressions in the translation produced by the neural translation model. We compared various methods based on modifying either the decoding process or the training data. The comparison was performed on two tasks: paraphrasing and feedb…

    Submitted 7 August, 2023; originally announced August 2023.

  9. arXiv:2209.07841  [pdf, other]

    cs.CL

    Findings of the Shared Task on Multilingual Coreference Resolution

    Authors: Zdeněk Žabokrtský, Miloslav Konopík, Anna Nedoluzhko, Michal Novák, Maciej Ogrodniczuk, Martin Popel, Ondřej Pražák, Jakub Sido, Daniel Zeman, Yilun Zhu

    Abstract: This paper presents an overview of the shared task on multilingual coreference resolution associated with the CRAC 2022 workshop. Shared task participants were supposed to develop trainable systems capable of identifying mentions and clustering them according to identity coreference. The public edition of CorefUD 1.0, which contains 13 datasets for 10 languages, was used as the source of training…

    Submitted 16 September, 2022; originally announced September 2022.

  10. arXiv:2109.09354  [pdf, other]

    cs.CL

    CUNI systems for WMT21: Multilingual Low-Resource Translation for Indo-European Languages Shared Task

    Authors: Josef Jon, Michal Novák, João Paulo Aires, Dušan Variš, Ondřej Bojar

    Abstract: This paper describes Charles University submission for Multilingual Low-Resource Translation for Indo-European Languages shared task at WMT21. We competed in translation from Catalan into Romanian, Italian and Occitan. Our systems are based on shared multilingual model. We show that using joint model for multiple similar language pairs improves upon translation quality in each pair. We also demons…

    Submitted 20 September, 2021; originally announced September 2021.

  11. arXiv:2109.09350  [pdf, other]

    cs.CL

    CUNI systems for WMT21: Terminology translation Shared Task

    Authors: Josef Jon, Michal Novák, João Paulo Aires, Dušan Variš, Ondřej Bojar

    Abstract: This paper describes Charles University submission for Terminology translation Shared Task at WMT21. The objective of this task is to design a system which translates certain terms based on a provided terminology database, while preserving high overall translation quality. We competed in English-French language pair. Our approach is based on providing the desired translations alongside the input s…

    Submitted 20 September, 2021; originally announced September 2021.

  12. arXiv:2104.05688  [pdf, other]

    cs.CL cs.HC

    Backtranslation Feedback Improves User Confidence in MT, Not Quality

    Authors: Vilém Zouhar, Michal Novák, Matúš Žilinec, Ondřej Bojar, Mateo Obregón, Robin L. Hill, Frédéric Blain, Marina Fomicheva, Lucia Specia, Lisa Yankovskaya

    Abstract: Translating text into a language unknown to the text's author, dubbed outbound translation, is a modern need for which the user experience has significant room for improvement, beyond the basic machine translation facility. We demonstrate this by showing three ways in which user confidence in the outbound translation, as well as its overall final quality, can be affected: backward translation, qua…

    Submitted 12 April, 2021; originally announced April 2021.

    Comments: 9 pages (excluding references); to appear at NAACL-HLT 2021

  13. arXiv:1909.01701  [pdf, other]

    cs.CL

    SAO WMT19 Test Suite: Machine Translation of Audit Reports

    Authors: Tereza Vojtěchová, Michal Novák, Miloš Klouček, Ondřej Bojar

    Abstract: This paper describes a machine translation test set of documents from the auditing domain and its use as one of the "test suites" in the WMT19 News Translation Task for translation directions involving Czech, English and German. Our evaluation suggests that current MT systems optimized for the general news domain can perform quite well even in the particular domain of audit reports. The detailed…

    Submitted 4 September, 2019; originally announced September 2019.

    Comments: WMT19 (http://www.statmt.org/wmt19/)

    Journal ref: Vojtěchová et al. (2019): SAO WMT19 Test Suite: Machine Translation of Audit Reports. In: Fourth Conference on Machine Translation - Proceedings of the Conference, pp. 680-692, ACL, ISBN 978-1-950737-27-7

  14. arXiv:1805.03834  [pdf, other]

    cs.DS

    Haplotype-aware graph indexes

    Authors: Jouni Sirén, Erik Garrison, Adam M. Novak, Benedict Paten, Richard Durbin

    Abstract: The variation graph toolkit (VG) represents genetic variation as a graph. Each path in the graph is a potential haplotype, though most paths are unlikely recombinations of true haplotypes. We augment the VG model with haplotype information to identify which paths are more likely to be correct. For this purpose, we develop a scalable implementation of the graph extension of the positional Burrows--…

    Submitted 15 June, 2018; v1 submitted 10 May, 2018; originally announced May 2018.

    Comments: Accepted to WABI 2018