Showing 1–16 of 16 results for author: Almeida, T S

  1. arXiv:2501.07482  [pdf, other]

    cs.CL cs.AI

    TiEBe: Tracking Language Model Recall of Notable Worldwide Events Through Time

    Authors: Thales Sales Almeida, Giovana Kerche Bonás, João Guilherme Alves Santos, Hugo Abonizio, Rodrigo Nogueira

    Abstract: As the knowledge landscape evolves and large language models (LLMs) become increasingly widespread, there is a growing need to keep these models updated with current events. While existing benchmarks assess general factual recall, few studies explore how LLMs retain knowledge over time or across different regions. To address these gaps, we present the Timely Events Benchmark (TiEBe), a dataset of…

    Submitted 20 May, 2025; v1 submitted 13 January, 2025; originally announced January 2025.

  2. arXiv:2501.02068  [pdf, other]

    cs.CL cs.AI

    The interplay between domain specialization and model size

    Authors: Roseval Malaquias Junior, Ramon Pires, Thales Sales Almeida, Kenzo Sakiyama, Roseli A. F. Romero, Rodrigo Nogueira

    Abstract: Scaling laws for language models have often focused on finding the optimal model size and token count for training from scratch. However, achieving this optimal balance requires significant compute resources due to the extensive data demands when training models from randomly-initialized weights. Continued pretraining offers a cost-effective alternative, leveraging the compute investment from pret…

    Submitted 29 March, 2025; v1 submitted 3 January, 2025; originally announced January 2025.

  3. arXiv:2410.12049  [pdf, other]

    cs.CL cs.AI

    Sabiá-3 Technical Report

    Authors: Hugo Abonizio, Thales Sales Almeida, Thiago Laitz, Roseval Malaquias Junior, Giovana Kerche Bonás, Rodrigo Nogueira, Ramon Pires

    Abstract: This report presents Sabiá-3, our new flagship language model, and Sabiazinho-3, a more cost-effective sibling. The models were trained on a large Brazilian-centric corpus. Evaluations across diverse professional and academic benchmarks show strong performance on Portuguese and Brazil-related tasks. Sabiá-3 shows large improvements in comparison to our previous best model, Sabiá-2 Medium, esp…

    Submitted 1 April, 2025; v1 submitted 15 October, 2024; originally announced October 2024.

  4. SurveySum: A Dataset for Summarizing Multiple Scientific Articles into a Survey Section

    Authors: Leandro Carísio Fernandes, Gustavo Bartz Guedes, Thiago Soares Laitz, Thales Sales Almeida, Rodrigo Nogueira, Roberto Lotufo, Jayr Pereira

    Abstract: Document summarization is a task to shorten texts into concise and informative summaries. This paper introduces a novel dataset designed for summarizing multiple scientific articles into a section of a survey. Our contributions are: (1) SurveySum, a new dataset addressing the gap in domain-specific summarization tools; (2) two specific pipelines to summarize scientific articles into a section of a…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: 15 pages, 6 figures, 1 table. Submitted to BRACIS 2024

  5. arXiv:2404.08191  [pdf, other]

    cs.CL

    Measuring Cross-lingual Transfer in Bytes

    Authors: Leandro Rodrigues de Souza, Thales Sales Almeida, Roberto Lotufo, Rodrigo Nogueira

    Abstract: Multilingual pretraining has been a successful solution to the challenges posed by the lack of resources for languages. These models can transfer knowledge to target languages with minimal or no examples. Recent research suggests that monolingual models also have a similar capability, but the mechanisms behind this transfer remain unclear. Some studies have explored factors like language contamina…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: NAACL 2024

  6. arXiv:2403.09887  [pdf, other]

    cs.CL cs.AI

    Sabiá-2: A New Generation of Portuguese Large Language Models

    Authors: Thales Sales Almeida, Hugo Abonizio, Rodrigo Nogueira, Ramon Pires

    Abstract: We introduce Sabiá-2, a family of large language models trained on Portuguese texts. The models are evaluated on a diverse range of exams, including entry-level tests for Brazilian universities, professional certification exams, and graduate-level exams for various disciplines such as accounting, economics, engineering, law and medicine. Our results reveal that our best model so far, Sabiá-2 Mediu…

    Submitted 26 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  7. arXiv:2311.14169  [pdf, other]

    cs.CL cs.AI cs.LG

    Evaluating GPT-4's Vision Capabilities on Brazilian University Admission Exams

    Authors: Ramon Pires, Thales Sales Almeida, Hugo Abonizio, Rodrigo Nogueira

    Abstract: Recent advancements in language models have showcased human-comparable performance in academic entrance exams. However, existing studies often overlook questions that require the integration of visual comprehension, thus compromising the full spectrum and complexity inherent in real-world scenarios. To address this gap, we present a comprehensive framework to evaluate language models on entrance e…

    Submitted 23 November, 2023; originally announced November 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2303.17003

  8. arXiv:2307.05410  [pdf, other]

    cs.CL

    BLUEX: A benchmark based on Brazilian Leading Universities Entrance eXams

    Authors: Thales Sales Almeida, Thiago Laitz, Giovana K. Bonás, Rodrigo Nogueira

    Abstract: One common trend in recent studies of language models (LMs) is the use of standardized tests for evaluation. However, although Portuguese is the fifth most spoken language worldwide, few such evaluations have been conducted in it. This is mainly due to the lack of high-quality datasets available to the community for carrying out evaluations in Portuguese. To address this gap, we introduce the Brazi…

    Submitted 11 July, 2023; originally announced July 2023.

  9. Sabiá: Portuguese Large Language Models

    Authors: Ramon Pires, Hugo Abonizio, Thales Sales Almeida, Rodrigo Nogueira

    Abstract: As the capabilities of language models continue to advance, it is conceivable that a "one-size-fits-all" model will remain the main paradigm. For instance, given the vast number of languages worldwide, many of which are low-resource, the prevalent practice is to pretrain a single model on multiple languages. In this paper, we add to the growing body of evidence that challenges this practice, demo…

    Submitted 9 November, 2023; v1 submitted 16 April, 2023; originally announced April 2023.

  10. arXiv:2210.14837  [pdf, other]

    cs.IR cs.LG

    NeuralSearchX: Serving a Multi-billion-parameter Reranker for Multilingual Metasearch at a Low Cost

    Authors: Thales Sales Almeida, Thiago Laitz, João Seródio, Luiz Henrique Bonifacio, Roberto Lotufo, Rodrigo Nogueira

    Abstract: The widespread availability of search APIs (both free and commercial) brings the promise of increased coverage and quality of search results for metasearch engines, while decreasing the maintenance costs of the crawling and indexing infrastructures. However, merging strategies frequently comprise complex pipelines that require careful tuning, which is often overlooked in the literature. In this w…

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: published as a full paper at the DESIRES 2022 Conference. 13 pages

    Journal ref: DESIRES 2022 - 3rd International Conference on Design of Experimental Search and Information REtrieval Systems, 30-31 August 2022, San Jose, CA, USA

  11. arXiv:2103.07573  [pdf, other]

    eess.IV cond-mat.mtrl-sci cs.CV q-bio.QM

    Mining Artifacts in Mycelium SEM Micrographs

    Authors: Thaicia Stona de Almeida

    Abstract: Mycelium is a promising biomaterial based on fungal mycelium, a highly porous, nanofibrous structure. Scanning electron micrographs are used to characterize its network, but the currently available tools for nanofibrous microstructures do not contemplate the particularities of biomaterials. The adoption of software for artificial nanofibrous materials in mycelium characterization adds the uncertainty of i…

    Submitted 12 March, 2021; originally announced March 2021.

    Comments: 7 pages, 9 figures

    MSC Class: 74N15; 62H35; 68U10 ACM Class: I.4

  12. arXiv:2102.12683  [pdf, other]

    math.AP math.CA math.DG

    Convex geometric reasoning for crystalline energies

    Authors: Thaicia Stona de Almeida

    Abstract: The present work revisits the classical Wulff problem restricted to crystalline integrands, a class of surface energies that gives rise to finitely faceted crystals. The general proof of the Wulff theorem was given by J.E. Taylor (1978) by methods of Geometric Measure Theory. This work follows a simpler and direct way through Minkowski Theory by taking advantage of the convex properties of the con…

    Submitted 24 February, 2021; originally announced February 2021.

    Comments: 12 pages, 2 figures

    MSC Class: 49Q10 (Primary) 35A15; 49J40; 49K20; 52B60 (Secondary)

    Journal ref: Caspian Journal of Computational and Mathematical Engineering, 1, 2016, 51-62

  13. Extending the ADM formalism to Weyl geometry

    Authors: A. B. Barreto, T. S. Almeida, C. Romero

    Abstract: In order to treat quantum cosmology in the framework of Weyl spacetimes we take the first step of extending the Arnowitt-Deser-Misner formalism to Weyl geometry. We then obtain an expression of the curvature tensor in terms of spatial quantities by splitting spacetime in (3+1)-dimensional form. We next write the Lagrangian of the gravitational field based on a Weyl-type gravity theory. We extend the g…

    Submitted 29 March, 2015; originally announced March 2015.

    Comments: 10 pages

  14. (2+1)-Dimensional Gravity in Weyl Integrable Spacetime

    Authors: J. E. Madriz Aguilar, C. Romero, J. B. Fonseca-Neto, T. S. Almeida, J. B. Formiga

    Abstract: We investigate (2+1)-dimensional gravity in a Weyl integrable spacetime (WIST). We show that, unlike general relativity, this scalar-tensor theory has a Newtonian limit for any dimension and that in three dimensions the congruence of world lines of particles of a pressureless fluid has a non-vanishing geodesic deviation. We present and discuss a class of static vacuum solutions generated by a circ…

    Submitted 13 March, 2015; originally announced March 2015.

    Comments: 9 pages. arXiv admin note: text overlap with arXiv:1201.1469, arXiv:1101.5333

  15. Wormholes in Wyman's solution

    Authors: J. B. Formiga, T. S. Almeida

    Abstract: The most general solution of the Einstein field equations coupled with a massless scalar field is known as Wyman's solution. This solution is also present in the Brans-Dicke theory and, due to its importance, it has been studied in detail by many authors. However, this solution has not been studied from the perspective of a possible wormhole. In this paper, we perform a detailed analysis of this…

    Submitted 10 September, 2014; v1 submitted 1 April, 2014; originally announced April 2014.

    Comments: the content was improved and some new references were added

  16. From Brans-Dicke gravity to a geometrical scalar-tensor theory

    Authors: T. S. Almeida, M. L. Pucheu, C. Romero, J. B. Formiga

    Abstract: We consider an approach to Brans-Dicke theory of gravity in which the scalar field has a geometrical nature. By postulating the Palatini variation, we find out that the role played by the scalar field consists in turning the space-time geometry into a Weyl integrable manifold. This procedure leads to a scalar-tensor theory that differs from the original Brans-Dicke theory in many aspects and prese…

    Submitted 21 November, 2013; originally announced November 2013.

    Comments: 21 pages