Showing 1–28 of 28 results for author: Soares, F

  1. arXiv:2505.11475  [pdf, other]

    cs.CL cs.AI cs.LG

    HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages

    Authors: Zhilin Wang, Jiaqi Zeng, Olivier Delalleau, Hoo-Chang Shin, Felipe Soares, Alexander Bukharin, Ellie Evans, Yi Dong, Oleksii Kuchaiev

    Abstract: Preference datasets are essential for training general-domain, instruction-following language models with Reinforcement Learning from Human Feedback (RLHF). Each subsequent data release raises expectations for future data collection, meaning there is a constant need to advance the quality and diversity of openly available preference data. To address this need, we introduce HelpSteer3-Preference, a…

    Submitted 16 May, 2025; originally announced May 2025.

    Comments: 38 pages, 2 figures

  2. Exploring a Large Language Model for Transforming Taxonomic Data into OWL: Lessons Learned and Implications for Ontology Development

    Authors: Filipi Miranda Soares, Antonio Mauro Saraiva, Luís Ferreira Pires, Luiz Olavo Bonino da Silva Santos, Dilvan de Abreu Moreira, Fernando Elias Corrêa, Kelly Rosa Braghetto, Debora Pignatari Drucker, Alexandre Cláudio Botazzo Delbem

    Abstract: Managing scientific names in ontologies that represent species taxonomies is challenging due to the ever-evolving nature of these taxonomies. Manually maintaining these names becomes increasingly difficult when dealing with thousands of scientific names. To address this issue, this paper investigates the use of ChatGPT-4 to automate the development of the :Organism module in the Agricultural Produ…

    Submitted 25 April, 2025; originally announced April 2025.

    Comments: 31 pages, 6 Figures, accepted for publication in Data Intelligence

    Journal ref: 2025

  3. arXiv:2503.04378  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    HelpSteer3: Human-Annotated Feedback and Edit Data to Empower Inference-Time Scaling in Open-Ended General-Domain Tasks

    Authors: Zhilin Wang, Jiaqi Zeng, Olivier Delalleau, Daniel Egert, Ellie Evans, Hoo-Chang Shin, Felipe Soares, Yi Dong, Oleksii Kuchaiev

    Abstract: Inference-Time Scaling has been critical to the success of recent models such as OpenAI o1 and DeepSeek R1. However, many techniques used to train models for inference-time scaling require tasks to have answers that can be verified, limiting their application to domains such as math, coding and logical reasoning. We take inspiration from how humans make first attempts, ask for detailed feedback fr…

    Submitted 30 May, 2025; v1 submitted 6 March, 2025; originally announced March 2025.

    Comments: 23 pages, 2 figures, Accepted to ACL 2025 Main

  4. arXiv:2410.08213  [pdf]

    nlin.CD physics.class-ph

    The nonlinear dynamics of a cantilever beam subject to axial flow in a tapered passage

    Authors: Filipe Soares, José Antunes, Christophe Vergez, Vincent Debut, Bruno Cochelin, Fabrice Silva

    Abstract: A cantilever beam under axial flow, confined or not, is known to develop self-sustained oscillations at sufficiently large flow velocities. In recent decades, the analysis of this archetypal system has been mostly pursued under linearized conditions, to calculate the critical boundaries separating stable from unstable behavior. However, nonlinear analysis of the self-sustained oscillations ensuing…

    Submitted 25 September, 2024; originally announced October 2024.

    Comments: 22nd International Conference of Numerical Analysis and Applied Mathematics, Sep 2024, Heraklion, Greece

  5. arXiv:2406.11704  [pdf, other]

    cs.CL cs.AI cs.LG

    Nemotron-4 340B Technical Report

    Authors: Nvidia, Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, et al. (58 additional authors not shown)

    Abstract: We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and its outputs. These models perform competitively to open access models on a wide range of evaluation be…

    Submitted 6 August, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  6. arXiv:2403.13401  [pdf]

    physics.class-ph

    On the radiation from unbaffled pistons and their dipole equivalent

    Authors: Filipe Soares, Vincent Debut

    Abstract: The radiation efficiency from simple vibrating planar surfaces is often used as a basis to describe the sound radiation from more complex structures, having important applications in various fields of acoustics. The low-frequency radiation efficiency of a baffled piston can easily be represented by a simple monopole source. Notably, the equivalent source strength is dependent on the piston surface…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: Tecniacustica 2023, Oct 2023, Cuenca, Spain

  7. arXiv:2402.03108  [pdf]

    stat.AP

    Perceived Vulnerability to Disease Scale: Factorial structure, reliability, and validity in times of Portugal's COVID-19 pandemic lockdown

    Authors: Ana Paula Martins, María C. Vega-Hernández, Francisca Ribeiro Soares, Rosa Marina Afonso

    Abstract: The present study examines the factor structure of a Portuguese version of the Perceived Vulnerability to Disease Scale (PVD), designed to assess individual differences in chronic concerns about transmission of infectious diseases. Method: Data from a Portuguese convenience sample (n=1203), collected during the first Covid-19 pandemic lockdown. Results: the scale revealed, through an exploratory f…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 25 pages, 3 figures, 4 tables

  8. arXiv:2401.08515  [pdf, other]

    physics.optics physics.app-ph

    Apodized Slanted Grating Couplers for LiDAR Applications

    Authors: Vahram Voskerchyan, Francis, Tian, Francisco M. Soares, David Alvarez Outerelo, Francisco J. Diaz-Otero

    Abstract: Solid state LiDAR systems traditionally rely on costly active components for efficient beam scanning. In this study, we propose a cost-effective, purely passive steering approach using apodized slanted grating couplers. Through apodization, we achieve a uniform upward emission profile and enhanced upward transmission. Theoretical calculations indicate successful steering of 91.5$^\circ$x42.8…

    Submitted 16 January, 2024; originally announced January 2024.

  9. arXiv:2310.04837  [pdf, other]

    cs.CV cs.AI cs.DC

    Federated Self-Supervised Learning of Monocular Depth Estimators for Autonomous Vehicles

    Authors: Elton F. de S. Soares, Carlos Alberto V. Campos

    Abstract: Image-based depth estimation has gained significant attention in recent research on computer vision for autonomous vehicles in intelligent transportation systems. This focus stems from its cost-effectiveness and wide range of potential applications. Unlike binocular depth estimation methods that require two fixed cameras, monocular depth estimation methods only rely on a single camera, making them…

    Submitted 7 October, 2023; originally announced October 2023.

    Comments: 16 pages, 8 figures, journal preprint

  10. arXiv:2308.03584  [pdf, other]

    cs.DB

    A Polystore Architecture Using Knowledge Graphs to Support Queries on Heterogeneous Data Stores

    Authors: Leonardo Guerreiro Azevedo, Renan Francisco Santos Souza, Elton F. de S. Soares, Raphael M. Thiago, Julio Cesar Cardoso Tesolin, Ann C. Oliveira, Marcio Ferreira Moreno

    Abstract: Modern applications commonly need to manage dataset types composed of heterogeneous data and schemas, making it difficult to access them in an integrated way. A single data store to manage heterogeneous data using a common data model is not effective in such a scenario, which results in the domain data being fragmented in the data stores that best fit their storage and access requirements (e.g., N…

    Submitted 15 March, 2024; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: Reference the paper as L. G. Azevedo, R. Souza, E. F. de S. Soares, R. M. Thiago, J. C. D. Tesolin, A. C. Oliveira, M. F. Moreno, A Polystore Architecture Using Knowledge Graphs to Support Queries on Heterogeneous Data Stores. Proceedings of 20th Brazilian Symposium in Information Systems, 2024 (to be published)

  11. arXiv:2302.04825  [pdf, other]

    astro-ph.GA astro-ph.IM

    Boundary conditions in hydrodynamic simulations of isolated galaxies and their impact on the gas-loss processes

    Authors: Anderson Caproni, Gustavo A. Lanfranchi, Amâncio C. S. Friaça, Jennifer F. Soares

    Abstract: Three-dimensional hydrodynamic simulations are commonly used to study the evolution of the gaseous content in isolated galaxies, besides its connection with galactic star formation histories. Stellar winds, supernova blasts, and black hole feedback are mechanisms usually invoked to drive galactic outflows and decrease the initial galactic gas reservoir. However, any simulation imposes the need of…

    Submitted 9 February, 2023; originally announced February 2023.

    Comments: 12 pages, 6 figures. Accepted for publication in The Astrophysical Journal

  12. arXiv:2301.03657  [pdf, other]

    physics.optics physics.app-ph

    Monolithically Integrated Wavelength-meter in InP with measurement bandwidth of 100nm centered on the C band

    Authors: Andrea Volpini, Damiano Massella, David Alvarez-Outerelo, Francisco Soares, Francisco J. Diaz-Otero

    Abstract: In this paper we will explore the creation of a monolithically integrated wavelength meter in InP. This type of devices are a key requirement for many applications and it is especially important to have them integrated with active components like lasers and gain sections. We present a wavelength meter based on multiple ring resonators that has been realized in a commercial MPW run and tested using…

    Submitted 9 January, 2023; originally announced January 2023.

  13. arXiv:2106.11400  [pdf, ps, other]

    physics.plasm-ph

    Dynamics of antiproton plasma in a time-dependent harmonic trap

    Authors: Luiz Gustavo F. Soares, Fernando Haas

    Abstract: An antiproton plasma confined in a quasi-1D device is described in terms of a self-consistent fluid formulation using a variational approach. Unlike previous treatments, the use of the time-dependent variational method allows to retain the thermal and Coulomb effects. A certain Ansatz is proposed for the number density and fluid velocity fields, which reduces the problem essentially to ordinary no…

    Submitted 21 June, 2021; originally announced June 2021.

  14. arXiv:2007.11369  [pdf, other]

    cs.OH

    A Research Agenda on Pediatric Chest X-Ray: Is Deep Learning Still in Childhood?

    Authors: Afonso U. Fonseca, Gabriel S. Vieira, Fabrízzio A. A. M. N. Soares, Renato F. Bulcão-Neto

    Abstract: Several reasons explain the significant role that chest X-rays play on supporting clinical analysis and early disease detection in pediatric patients, such as low cost, high resolution, low radiation levels, and high availability. In the last decade, Deep Learning (DL) has been given special attention from the computer-aided diagnosis research community, outperforming the state of the art of many…

    Submitted 7 October, 2020; v1 submitted 20 July, 2020; originally announced July 2020.

    Comments: 16 pages, 11 figures, 11 tables

  15. arXiv:1908.09876  [pdf, other]

    cs.SE cs.IR cs.LG

    BULNER: BUg Localization with word embeddings and NEtwork Regularization

    Authors: Jacson Rodrigues Barbosa, Ricardo Marcondes Marcacini, Ricardo Britto, Frederico Soares, Solange Rezende, Auri M. R. Vincenzi, Marcio E. Delamaro

    Abstract: Bug localization (BL) from the bug report is the strategic activity of the software maintaining process. Because BL is a costly and tedious activity, BL techniques information retrieval-based and machine learning-based could aid software engineers. We propose a method for BUg Localization with word embeddings and Network Regularization (BULNER). The preliminary results suggest that BULNER has bett…

    Submitted 26 August, 2019; originally announced August 2019.

    Comments: VII Workshop on Software Visualization, Evolution and Maintenance (VEM '19)

  16. arXiv:1905.01855  [pdf, ps, other]

    cs.CL

    UFRGS Participation on the WMT Biomedical Translation Shared Task

    Authors: Felipe Soares, Karin Becker

    Abstract: This paper describes the machine translation systems developed by the Universidade Federal do Rio Grande do Sul (UFRGS) team for the biomedical translation shared task. Our systems are based on statistical machine translation and neural machine translation, using the Moses and OpenNMT toolkits, respectively. We participated in four translation directions for the English/Spanish and English/Portugu…

    Submitted 6 May, 2019; originally announced May 2019.

    Comments: Published on the Third Conference on Machine Translation (WMT18)

  17. arXiv:1905.01852  [pdf, other]

    cs.CL

    A Large Parallel Corpus of Full-Text Scientific Articles

    Authors: Felipe Soares, Viviane Pereira Moreira, Karin Becker

    Abstract: The Scielo database is an important source of scientific information in Latin America, containing articles from several research domains. A striking characteristic of Scielo is that many of its full-text contents are presented in more than one language, thus being a potential source of parallel corpora. In this article, we present the development of a parallel corpus from Scielo in three languages…

    Submitted 6 May, 2019; originally announced May 2019.

    Comments: Published in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

  18. A Parallel Corpus of Theses and Dissertations Abstracts

    Authors: Felipe Soares, Gabrielli Harumi Yamashita, Michel Jose Anzanello

    Abstract: In Brazil, the governmental body responsible for overseeing and coordinating post-graduate programs, CAPES, keeps records of all theses and dissertations presented in the country. Information regarding such documents can be accessed online in the Theses and Dissertations Catalog (TDC), which contains abstracts in Portuguese and English, and additional metadata. Thus, this database can be a potenti…

    Submitted 5 May, 2019; originally announced May 2019.

    Comments: Published in the PROPOR Conference. arXiv admin note: text overlap with arXiv:1905.01712

    Journal ref: Computational Processing of the Portuguese Language 2018

  19. arXiv:1905.01712  [pdf, other]

    cs.CL cs.IR

    BVS Corpus: A Multilingual Parallel Corpus of Biomedical Scientific Texts

    Authors: Felipe Soares, Martin Krallinger

    Abstract: The BVS database (Health Virtual Library) is a centralized source of biomedical information for Latin America and Carib, created in 1998 and coordinated by BIREME (Biblioteca Regional de Medicina) in agreement with the Pan American Health Organization (OPAS). Abstracts are available in English, Spanish, and Portuguese, with a subset in more than one language, thus being a possible source of parall…

    Submitted 5 May, 2019; originally announced May 2019.

    Comments: Accepted at the Corpora conference. arXiv admin note: text overlap with arXiv:1905.01715

  20. arXiv:1812.05259  [pdf]

    q-bio.QM q-bio.TO

    Heart rate variability monitoring identifies asymptomatic toddlers exposed to Zika virus during pregnancy

    Authors: Christophe L. Herry, Helena M. F. Soares, Lavinia Schuler-Faccini, Martin G. Frasch

    Abstract: Although Zika virus (ZIKV) seems to be prominently neurotropic, there are some reports of involvement of other organs, particularly the heart. Of special concern are those children exposed prenatally to ZIKV and born with no microcephaly or other congenital anomaly. Electrocardiogram (ECG) - derived heart rate variability (HRV) metrics represent an attractive, low cost, widely deployable tool for…

    Submitted 12 December, 2018; originally announced December 2018.

    Journal ref: Physiol. Meas. 2021

  21. arXiv:1805.09859  [pdf, other]

    stat.AP

    Measure of gap and inequalities in basic education students proficiencies

    Authors: José Francisco Soares, Erica Castilho Rodrigues, Victor Maia Senna Delgado

    Abstract: This study uses students performance on standardized tests as evidence of the quality of education and introduces a methodology based on the comparison of performance distributions to produce indicators for both the level achieved by the students and the learning gap between social groups, two inseparable dimensions of quality of education. In the first case, the study compares the distribution of…

    Submitted 31 May, 2018; v1 submitted 24 May, 2018; originally announced May 2018.

  22. arXiv:1711.07636  [pdf, ps, other]

    math.GR

    Local finiteness for Green's relations in semigroup varieties

    Authors: Mikhail V. Volkov, Pedro V. Silva, Filipa Soares

    Abstract: A semigroup variety V is said to be locally K-finite, where K stands for any of Green's relations H, R, L, D, or J, if every finitely generated semigroup from V has only finitely many K-classes. We characterize locally K-finite varieties of finite axiomatic rank in the language of "forbidden objects".

    Submitted 21 November, 2017; originally announced November 2017.

    Comments: 32 pages, 2 figures, 1 table

    MSC Class: 20M07

  23. arXiv:1606.03866  [pdf, ps, other]

    math.GR

    Local finiteness for Green relations in (I-)semigroup varieties

    Authors: Pedro V. Silva, Filipa Soares

    Abstract: In this work, the lattice of varieties of semigroups and the lattice of varieties of I-semigroups (a common setting for both the variety of completely regular semigroups and the variety of inverse semigroups) are studied with respect to the following concepts: a variety V of (I-)semigroups is said to be locally K-finite, where K stands for any of the five Green's relations, if every finitely gener…

    Submitted 13 June, 2016; originally announced June 2016.

    Comments: 24 pages

    MSC Class: 20M07; 20M10

  24. arXiv:1412.3048  [pdf, ps, other]

    math.GR

    Howson's property for semidirect products of semilattices by groups

    Authors: Pedro V. Silva, Filipa Soares

    Abstract: An inverse semigroup $S$ is a Howson inverse semigroup if the intersection of finitely generated inverse subsemigroups of $S$ is finitely generated. Given a locally finite action $θ$ of a group $G$ on a semilattice $E$, it is proved that $E \ast_θ G$ is a Howson inverse semigroup if and only if $G$ is a Howson group. It is also shown that this equivalence fails for arbitrary actions.

    Submitted 9 December, 2014; originally announced December 2014.

    MSC Class: 20M18

  25. arXiv:0811.1130  [pdf, ps, other]

    physics.data-an

    The infinite partition of a line segment and multifractal objects

    Authors: A. I. L. de Araújo, R. F. Soares, J. P. de Oliveira, G. Corso

    Abstract: We report an algorithm for the partition of a line segment according to a given ratio $ν$. At each step the length distribution among sets of the partition follows a binomial distribution. We call $k$-set to the set of elements with the same length at the step $n$. The total number of elements is $2^n$ and the number of elements in a same $k$-set is $C_n^k$. In the limit of an infinite partition t…

    Submitted 7 November, 2008; originally announced November 2008.

  26. Light-induced structural transformations in a single gallium nanoparticulate

    Authors: B. F. Soares, K. F. MacDonald, V. A. Fedotov, N. I. Zheludev

    Abstract: In a single gallium nanoparticulate, self-assembled (from an atomic beam) in a nano-aperture at the tip of a tapered optical fiber, we have observed evidence for a sequence of reversible light-induced transformations between five different structural phases (gamma - epsilon - delta - beta - liquid), stimulated by optical excitation at nanowatt power levels.

    Submitted 9 March, 2005; originally announced March 2005.

    Comments: 4 pages, 3 figures, 19 references

    Journal ref: Nano Lett. 5, 2104 (2005)

  27. Anisotropy and percolation threshold in a multifractal support

    Authors: L. S. Lucena, J. E. Freitas, G. Corso, R. F. Soares

    Abstract: Recently a multifractal object, $Q_{mf}$, was proposed to study percolation properties in a multifractal support. The area and the number of neighbors of the blocks of $Q_{mf}$ show a non-trivial behavior. The value of the probability of occupation at the percolation threshold, $p_{c}$, is a function of $ρ$, a parameter of $Q_{mf}$ which is related to its anisotropy. We investigate the relation…

    Submitted 14 August, 2003; originally announced August 2003.

  28. arXiv:cond-mat/0212530  [pdf, ps, other]

    cond-mat.stat-mech

    Percolation in a Multifractal

    Authors: G. Corso, J. E. Freitas, L. S. Lucena, R. F. Soares

    Abstract: We build a multifractal object and use it as a support to study percolation. We identify some differences between percolation in a multifractal and in a regular lattice. We use many samples of finite size lattices and draw the histogram of percolating lattices against site occupation probability. Depending on a parameter characterizing the multifractal and the lattice size, the histogram can ha…

    Submitted 11 August, 2003; v1 submitted 20 December, 2002; originally announced December 2002.