Showing 1–3 of 3 results for author: Nadas, M

  1. arXiv:2504.20605  [pdf, other]

    cs.CL cs.AI cs.DL cs.LG

    TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models

    Authors: Mihai Nadas, Laura Diosan, Andrei Piscoran, Andreea Tomescu

    Abstract: Moral stories are a time-tested vehicle for transmitting values, yet modern NLP lacks a large, structured corpus that couples coherent narratives with explicit ethical lessons. We close this gap with TF1-EN-3M, the first open dataset of three million English-language fables generated exclusively by instruction-tuned models no larger than 8B parameters. Each story follows a six-slot scaffold (chara…

    Submitted 29 April, 2025; originally announced April 2025.

  2. arXiv:2504.18439  [pdf, other]

    cs.RO

    The Autonomous Software Stack of the FRED-003C: The Development That Led to Full-Scale Autonomous Racing

    Authors: Zalán Demeter, Levente Puskás, Balázs Kovács, Ádám Matkovics, Martin Nádas, Balázs Tuba, Zsolt Farkas, Ármin Bogár-Németh, Gergely Bári

    Abstract: Scientific development often takes place in the context of research projects carried out by dedicated students during their time at university. In the field of self-driving software research, the Formula Student Driverless competitions are an excellent platform to promote research and attract young engineers. This article presents the software stack developed by BME Formula Racing Team, that forme…

    Submitted 25 April, 2025; originally announced April 2025.

    Comments: Accepted to be published at 2025 IEEE Intelligent Vehicles Symposium (IV)

  3. arXiv:2503.14023  [pdf, other]

    cs.CL

    Synthetic Data Generation Using Large Language Models: Advances in Text and Code

    Authors: Mihai Nadas, Laura Diosan, Andreea Tomescu

    Abstract: Large language models (LLMs) have unlocked new possibilities for generating synthetic training data in both natural language and code. By producing artificial but task-relevant examples, these models can significantly augment or even replace real-world datasets, especially when labeled data is scarce or sensitive. This paper surveys recent advances in using LLMs to create synthetic text and code,…

    Submitted 18 March, 2025; originally announced March 2025.

    Comments: 21 pages, 3 tables, 64 references, preprint