Showing 1–13 of 13 results for author: Lukoševičius, M

  1. arXiv:2408.08073  [pdf, other]

    cs.CL cs.IR cs.LG stat.ML

    Extracting Sentence Embeddings from Pretrained Transformer Models

    Authors: Lukas Stankevičius, Mantas Lukoševičius

    Abstract: Pre-trained transformer models shine in many natural language processing tasks and therefore are expected to bear the representation of the input sentence or text meaning. These sentence-level embeddings are also important in retrieval-augmented generation. But do commonly used plain averaging or prompt templates sufficiently capture and represent the underlying meaning? After providing a comprehe…

    Submitted 20 February, 2025; v1 submitted 15 August, 2024; originally announced August 2024.

    Comments: Postprint update

    MSC Class: 68T07; 68T50; 68T05 ACM Class: I.2.6; I.2.7

    Journal ref: Appl. Sci. 2024, 14(19), 8887

  2. arXiv:2407.19947  [pdf, other]

    cs.CL cs.LG

    Inference acceleration for large language models using "stairs" assisted greedy generation

    Authors: Domas Grigaliūnas, Mantas Lukoševičius

    Abstract: Large Language Models (LLMs) with billions of parameters are known for their impressive predicting capabilities but require lots of resources to run. With their massive rise in popularity, even a small reduction in required resources could have an impact on environment. On the other hand, smaller models require fewer resources but may sacrifice accuracy. In this work, we are proposing an implement…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Accepted at the 29th International Conference on Information Society and University Studies (IVUS 2024)

    MSC Class: 68T07; 68T50; 68T05 ACM Class: I.2.6; I.2.7

  3. arXiv:2407.19914  [pdf]

    cs.CL cs.IR cs.LG

    Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models

    Authors: Brigita Vileikytė, Mantas Lukoševičius, Lukas Stankevičius

    Abstract: Sentiment analysis is a widely researched area within Natural Language Processing (NLP), attracting significant interest due to the advent of automated solutions. Despite this, the task remains challenging because of the inherent complexity of languages and the subjective nature of sentiments. It is even more challenging for less-studied and less-resourced languages such as Lithuanian. Our review…

    Submitted 29 July, 2024; originally announced July 2024.

    Comments: Accepted at the 29th International Conference on Information Society and University Studies (IVUS 2024)

    MSC Class: 68T07; 68T50; 68T05 ACM Class: I.2.6; I.2.7

  4. arXiv:2405.15729  [pdf, other]

    cs.SE cs.CL cs.LG

    Optimizing Large Language Models for OpenAPI Code Completion

    Authors: Bohdan Petryshyn, Mantas Lukoševičius

    Abstract: Recent advancements in Large Language Models (LLMs) and their utilization in code generation tasks have significantly reshaped the field of software development. Despite the remarkable efficacy of code completion solutions in mainstream programming languages, their performance lags when applied to less ubiquitous formats such as OpenAPI definitions. This study evaluates the OpenAPI completion perf…

    Submitted 10 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: Update: a better quality and readability of figures, better explanation of code infilling and document splitting in training, some text polishing, making it more compact

    MSC Class: 68T07; 68T50; 68T05 ACM Class: I.2.2; I.2.6; I.2.7; D.1.2; D.2.1; D.2.3; D.2.6

  5. arXiv:2204.05192  [pdf, other]

    cs.LG cs.NE

    Task-Synchronized Recurrent Neural Networks

    Authors: Mantas Lukoševičius, Arnas Uselis

    Abstract: Data are often sampled irregularly in time. Dealing with this using Recurrent Neural Networks (RNNs) traditionally involved ignoring the fact, feeding the time differences as additional inputs, or resampling the data. All these methods have their shortcomings. We propose an elegant straightforward alternative approach where instead the RNN is in effect resampled in time to match the time of the da…

    Submitted 2 July, 2024; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: The 1st version was written in May 2019 and double-blind reviewed for a prominent conference. A major update. We changed the name of the article and methods to an arguably more precise one, and because a very similar title has been published in the meantime. We've rewritten much of the text, connected to the current literature, redone some experiments, figures, discussion, published source code

    MSC Class: 68T07; 68T05; 37M10 ACM Class: I.2.6; G.1.2

  6. arXiv:2203.09963  [pdf, other]

    cs.CL cs.IR cs.LG stat.ML

    Towards Lithuanian grammatical error correction

    Authors: Lukas Stankevičius, Mantas Lukoševičius

    Abstract: Everyone wants to write beautiful and correct text, yet the lack of language skills, experience, or hasty typing can result in errors. By employing the recent advances in transformer architectures, we construct a grammatical error correction model for Lithuanian, the language rich in archaic features. We compare subword and byte-level approaches and share our best trained model, achieving F…

    Submitted 18 March, 2022; originally announced March 2022.

    MSC Class: 68T07; 68T50; 68T05 ACM Class: I.2.6; I.2.7

  7. arXiv:2201.13242  [pdf, other]

    cs.CL cs.IR cs.LG stat.ML

    Correcting diacritics and typos with a ByT5 transformer model

    Authors: Lukas Stankevičius, Mantas Lukoševičius, Jurgita Kapočiūtė-Dzikienė, Monika Briedienė, Tomas Krilavičius

    Abstract: Due to the fast pace of life and online communications and the prevalence of English and the QWERTY keyboard, people tend to forgo using diacritics, make typographical errors (typos) when typing in other languages. Restoring diacritics and correcting spelling is important for proper language use and the disambiguation of texts for both humans and downstream algorithms. However, both of these probl…

    Submitted 18 March, 2022; v1 submitted 31 January, 2022; originally announced January 2022.

    MSC Class: 68T07; 68T50; 68T05 ACM Class: I.2.6; I.2.7

    Journal ref: Appl. Sci. 2022, 12(5), 2636

  8. arXiv:2107.02211  [pdf, other]

    eess.IV cs.CV cs.LG

    Automated age-related macular degeneration area estimation -- first results

    Authors: Rokas Pečiulis, Mantas Lukoševičius, Algimantas Kriščiukaitis, Robertas Petrolis, Dovilė Buteikienė

    Abstract: This work aims to research an automatic method for detecting Age-related Macular Degeneration (AMD) lesions in RGB eye fundus images. For this, we align invasively obtained eye fundus contrast images (the "golden standard" diagnostic) to the RGB ones and use them to hand-annotate the lesions. This is done using our custom-made tool. Using the data, we train and test five different convolutional ne…

    Submitted 5 July, 2021; originally announced July 2021.

    MSC Class: 68T07; 68T05; 68T45; 92C55 ACM Class: I.2.6; J.3

    Journal ref: Proceedings of the 26th International Conference on Information Society and University Studies (IVUS 2021), pp. 141-149, CEUR, 2021

  9. Generating abstractive summaries of Lithuanian news articles using a transformer model

    Authors: Lukas Stankevičius, Mantas Lukoševičius

    Abstract: In this work, we train the first monolingual Lithuanian transformer model on a relatively large corpus of Lithuanian news articles and compare various output decoding algorithms for abstractive news summarization. We achieve an average ROUGE-2 score 0.163, generated summaries are coherent and look impressive at first glance. However, some of them contain misleading information that is not so easy…

    Submitted 22 June, 2021; v1 submitted 23 April, 2021; originally announced May 2021.

    Comments: Accepted in ICIST 2021

    MSC Class: 68T07; 68T50; 68T05 ACM Class: I.2.6; I.2.7

    Journal ref: International Conference on Information and Software Technologies - ICIST 2021, Communications in Computer and Information Science, vol 1486 (2021) 341-352

  10. arXiv:2006.11282  [pdf, other]

    cs.LG cs.NE stat.ML

    Efficient implementations of echo state network cross-validation

    Authors: Mantas Lukoševičius, Arnas Uselis

    Abstract: Background/introduction: Cross-Validation (CV) is still uncommon in time series modeling. Echo State Networks (ESNs), as a prime example of Reservoir Computing (RC) models, are known for their fast and precise one-shot learning, that often benefit from good hyper-parameter tuning. This makes them ideal to change the status quo. Methods: We discuss CV of time series for predicting a concrete time…

    Submitted 3 December, 2020; v1 submitted 19 June, 2020; originally announced June 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:1908.08450

    MSC Class: 68T05 (Primary) 37M10; 15A06 (Secondary) ACM Class: I.2.6

    Journal ref: Cognitive Computation, 2021

  11. arXiv:2005.05930  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Localized convolutional neural networks for geospatial wind forecasting

    Authors: Arnas Uselis, Mantas Lukoševičius, Lukas Stasytis

    Abstract: Convolutional Neural Networks (CNN) possess many positive qualities when it comes to spatial raster data. Translation invariance enables CNNs to detect features regardless of their position in the scene. However, in some domains, like geospatial, not all locations are exactly equal. In this work, we propose localized convolutional neural networks that enable convolutional architectures to learn lo…

    Submitted 10 July, 2020; v1 submitted 12 May, 2020; originally announced May 2020.

    MSC Class: 68T05 ACM Class: I.2.6

    Journal ref: Energies, 13 (13), pp. 3440, 2020

  12. arXiv:2004.03461  [pdf, other]

    cs.IR cs.CL cs.LG

    Testing pre-trained Transformer models for Lithuanian news clustering

    Authors: Lukas Stankevičius, Mantas Lukoševičius

    Abstract: A recent introduction of Transformer deep learning architecture made breakthroughs in various natural language processing tasks. However, non-English languages could not leverage such new opportunities with the English text pre-trained models. This changed with research focusing on multilingual models, where less-spoken languages are the main beneficiaries. We compare pre-trained multilingual BERT…

    Submitted 3 April, 2020; originally announced April 2020.

    Comments: Submission accepted at https://ivus.ktu.edu/

    MSC Class: 68T05 ACM Class: I.2.6

    Journal ref: Proceedings of the Information Society and University Studies 2020, pp. 46-53, vol. 2698, CEUR, Kaunas, 2020, ISSN: 1613-0073

  13. Efficient Cross-Validation of Echo State Networks

    Authors: Mantas Lukoševičius, Arnas Uselis

    Abstract: Echo State Networks (ESNs) are known for their fast and precise one-shot learning of time series. But they often need good hyper-parameter tuning for best performance. For this good validation is key, but usually, a single validation split is used. In this rather practical contribution we suggest several schemes for cross-validating ESNs and introduce an efficient algorithm for implementing them.…

    Submitted 22 August, 2019; originally announced August 2019.

    Comments: Accepted in ICANN'19 Workshop on Reservoir Computing

    MSC Class: 68T05 (Primary) 37M10; 15A06 (Secondary) ACM Class: I.2.6

    Journal ref: Artificial Neural Networks and Machine Learning - ICANN 2019: Workshop and Special Sessions. ICANN 2019., pp. 121-133