-
EuroLLM-9B: Technical Report
Authors:
Pedro Henrique Martins,
João Alves,
Patrick Fernandes,
Nuno M. Guerreiro,
Ricardo Rei,
Amin Farajian,
Mateusz Klimaszewski,
Duarte M. Alves,
José Pombal,
Manuel Faysse,
Pierre Colombo,
François Yvon,
Barry Haddow,
José G. C. de Souza,
Alexandra Birch,
André F. T. Martins
Abstract:
This report presents EuroLLM-9B, a large language model trained from scratch to support the needs of European citizens by covering all 24 official European Union languages and 11 additional languages. EuroLLM addresses the issue of European languages being underrepresented and underserved in existing open large language models. We provide a comprehensive overview of EuroLLM-9B's development, including tokenizer design, architectural specifications, data filtering, and training procedures. We describe the pre-training data collection and filtering pipeline, including the creation of EuroFilter, an AI-based multilingual filter, as well as the design of EuroBlocks-Synthetic, a novel synthetic dataset for post-training that enhances language coverage for European languages. Evaluation results demonstrate EuroLLM-9B's competitive performance on multilingual benchmarks and machine translation tasks, establishing it as the leading open European-made LLM of its size. To support open research and adoption, we release all major components of this work, including the base and instruction-tuned models, the EuroFilter classifier, and the synthetic post-training dataset.
Submitted 4 June, 2025;
originally announced June 2025.
-
On defect in finite extensions of valued fields
Authors:
Caio Henrique Silva de Souza,
Mark Spivakovsky
Abstract:
In recent decades, the defect of finite extensions of valued fields has emerged as the main obstacle in several fundamental problems in algebraic geometry such as the local uniformization problem. Hence, it is important to identify defectless fields and study properties related to defect. In this paper we study the relations between the following properties of valued fields: simply defectless, immediate-defectless and algebraically maximal. The main result of the paper is an example of an algebraically maximal field that admits a simple defect extension. For this, we introduce the notion of quasi-finite elements in the generalized power series field $k\left(\left(t^Γ\right)\right)$.
Submitted 20 March, 2025;
originally announced March 2025.
-
The S-PLUS Fornax Project (S+FP): Mapping globular clusters systems within 5 virial radii around NGC 1399
Authors:
Luis Lomelí-Núñez,
A. Cortesi,
A. V. Smith Castelli,
M. L. Buzzo,
Y. D. Mayya,
Vasiliki Fragkou,
J. A. Alzate-Trujillo,
R. F. Haack,
J. P. Calderón,
A. R. Lopes,
Michael Hilker,
M. Grossi,
Karín Menéndez-Delmestre,
Thiago S. Gonçalves,
Ana L. Chies-Santos,
L. A. Gutiérrez-Soto,
Ciria Lima-Dias,
S. V. Werner,
Pedro K. Humire,
R. C. Thom de Souza,
A. Alvarez-Candal,
Swayamtrupta Panda,
Avinash Chaturvedi,
E. Telles,
C. Mendes de Oliveira
, et al. (3 additional authors not shown)
Abstract:
We present the largest sample ($\sim$13,000 candidates, $\sim$3000 of which are bona-fide candidates) of globular cluster (GC) candidates reported in the Fornax Cluster so far. The survey is centered on the NGC 1399 galaxy, extending out to 5 virial radii of the cluster. We carried out a photometric study using images observed in the 12-band system of the Southern Photometric Local Universe Survey (S-PLUS), corresponding to 106 pointings, covering a sky area of $\sim$208 square degrees. Studying the properties of spectroscopically confirmed GCs, we have designed a method to select GC candidates using structural and photometric parameters. We found evidence of color bimodality in two broad-band colors, namely $(g-i)_{0}$ and $(g-z)_{0}$, while in the narrow bands we did not find strong statistical evidence to confirm bimodality in any color. We analyzed the GC luminosity functions (GCLFs) in the 12 bands of S-PLUS, and we highlight two points: a) due to the relatively shallow depth of S-PLUS, it is only possible to observe the bright end of the GCLF, and b) at that level, the log-normal distribution typical of GC systems is apparent in all bands. With the spatial coverage reached in this study, we are able, for the first time, to explore the large-scale distribution of GCs within and around a galaxy cluster. In particular, we noted that the GCs might be clustered along substructures, which trace the current cluster build-up.
Submitted 19 March, 2025;
originally announced March 2025.
-
KAN-Mixers: a new deep learning architecture for image classification
Authors:
Jorge Luiz dos Santos Canuto,
Linnyer Beatrys Ruiz Aylon,
Rodrigo Clemente Thom de Souza
Abstract:
Due to their effective performance, Convolutional Neural Network (CNN) and Vision Transformer (ViT) architectures have become the standard for solving computer vision tasks. Such architectures require large data sets and rely on convolution and self-attention operations. In 2021, MLP-Mixer emerged, an architecture that relies only on Multilayer Perceptrons (MLPs) and achieves extremely competitive results when compared to CNNs and ViTs. Despite its good performance in computer vision tasks, the MLP-Mixer architecture may not be suitable for refined feature extraction in images. Recently, the Kolmogorov-Arnold Network (KAN) was proposed as a promising alternative to MLP models. KANs promise to improve accuracy and interpretability when compared to MLPs. Therefore, the present work aims to design a new mixer-based architecture, called KAN-Mixers, using KANs as main layers, and to evaluate its performance on the image classification task in terms of several performance metrics. As the main result, the KAN-Mixers model outperformed the MLP, MLP-Mixer and KAN models on the Fashion-MNIST and CIFAR-10 datasets, with average accuracies of 0.9030 and 0.6980, respectively.
Submitted 11 March, 2025;
originally announced March 2025.
-
Data denoising with self consistency, variance maximization, and the Kantorovich dominance
Authors:
Joshua Zoen-Git Hiew,
Tongseok Lim,
Brendan Pass,
Marcelo Cruz de Souza
Abstract:
We introduce a new framework for data denoising, partially inspired by martingale optimal transport. For a given noisy distribution (the data), our approach involves finding the closest distribution to it among all distributions which 1) have a particular prescribed structure (expressed by requiring they lie in a particular domain), and 2) are self-consistent with the data. We show that this amounts to maximizing the variance among measures in the domain which are dominated in convex order by the data. For particular choices of the domain, this problem and a relaxed version of it, in which the self-consistency condition is removed, are intimately related to various classical approaches to denoising. We prove that our general problem has certain desirable features: solutions exist under mild assumptions, have certain robustness properties, and, for very simple domains, coincide with solutions to the relaxed problem.
We also introduce a novel relationship between distributions, termed Kantorovich dominance, which retains certain aspects of the convex order while being a weaker, more robust, and easier-to-verify condition. Building on this, we propose and analyze a new denoising problem by substituting the convex order in the previously described framework with Kantorovich dominance. We demonstrate that this revised problem shares some characteristics with the full convex order problem but offers enhanced stability, greater computational efficiency, and, in specific domains, more meaningful solutions. Finally, we present simple numerical examples illustrating solutions for both the full convex order problem and the Kantorovich dominance problem.
Submitted 5 February, 2025;
originally announced February 2025.
-
Trajectory Planning and Control for Differentially Flat Fixed-Wing Aerial Systems
Authors:
Luca Morando,
Sanket A. Salunkhe,
Nishanth Bobbili,
Jeffrey Mao,
Luca Masci,
Hung Nguyen,
Cristino de Souza,
Giuseppe Loianno
Abstract:
Efficient real-time trajectory planning and control for fixed-wing unmanned aerial vehicles is challenging due to their non-holonomic nature, complex dynamics, and the additional uncertainties introduced by unknown aerodynamic effects. In this paper, we present a fast and efficient real-time trajectory planning and control approach for fixed-wing unmanned aerial vehicles, leveraging the differential flatness property of fixed-wing aircraft in coordinated flight conditions to generate dynamically feasible trajectories. The approach provides the ability to continuously replan trajectories, which we show is useful to dynamically account for the curvature constraint as the aircraft advances along its path. Extensive simulations and real-world experiments validate our approach, showcasing its effectiveness in generating trajectories even in conditions that are challenging for small fixed-wing aircraft, such as wind disturbances.
Submitted 1 February, 2025;
originally announced February 2025.
-
Affirmative Hackathon for Software Developers with Disabilities: An Industry Initiative
Authors:
Thayssa Rocha,
Nicole Davila,
Rafaella Vaccari,
Nicoly Menezes,
Marcelle Mota,
Edward Monteiro,
Cleidson de Souza,
Gustavo Pinto
Abstract:
People with disabilities (PWD) often encounter several barriers to becoming employed. A growing body of evidence in software development highlights the benefits of diversity and inclusion in the field. However, recruiting, hiring, and fostering a supportive environment for PWD remains challenging. These challenges are exacerbated by the lack of skilled professionals with experience in inclusive hiring and management, which prevents companies from effectively increasing PWD representation on software development teams. Inspired by the strategy adopted in some technology companies that attract talent through hackathons and training courses, this paper reports the experience of Zup Innovation, a Brazilian software company, in hosting a fully remote affirmative hackathon with 50 participants to attract PWD developers. This event resulted in 10 new hires and 146 people added to the company's talent pool. Through surveys with participants, we gathered attendees' perceptions and experiences, aiming to improve future hackathons and similar initiatives by providing insights on accessibility and collaboration. Our findings offer lessons for other companies seeking to address similar challenges and promote greater inclusion in tech teams.
Submitted 20 January, 2025; v1 submitted 13 January, 2025;
originally announced January 2025.
-
Teaching materials aligned or unaligned with the principles of the Cognitive Theory of Multimedia Learning: the choices made by Physics teachers and students
Authors:
Aline N. Braga,
Antonio A. M. Neto,
Alessandra N. Braga,
Silvio C. F. Pereira Filho,
Nelson P. C. de Souza,
Danilo T. Alves
Abstract:
In a recent study [Rev. Bras. Ens. Fís. vol. 45, 2023], the absence of the Cognitive Theory of Multimedia Learning (CTML) in the curricula of Physics teacher education programs at Brazilian public universities was highlighted. Considering this gap, the present study investigates whether, even without any formal prior knowledge of CTML principles (Coherence, Signaling, Spatial Contiguity, Segmentation, Multimedia, and Personalization), Physics teacher trainees and educators, when offered two formats of multimedia materials (one aligned with a given CTML principle and the other not), tend to choose the materials aligned with these principles. The findings of this case study revealed that, although most participants generally selected materials aligned with the mentioned principles, a significant portion did not. These results underscore the importance of Brazilian universities considering the inclusion of CTML in Physics teacher education curricula.
Submitted 27 December, 2024;
originally announced December 2024.
-
Stellar atmospheric parameters and chemical abundances of about 5 million stars from S-PLUS multi-band photometry
Authors:
C. E. Ferreira Lopes,
L. A. Gutiérrez-Soto,
V. S. Ferreira Alberice,
N. Monsalves,
D. Hazarika,
M. Catelan,
V. M. Placco,
G. Limberg,
F. Almeida-Fernandes,
H. D. Perottoni,
A. V. Smith Castelli,
S. Akras,
J. Alonso-García,
V. Cordeiro,
M. Jaque Arancibia,
S. Daflon,
B. Dias,
D. R. Gonçalves,
E. Machado-Pereira,
A. R. Lopes,
C. R. Bom,
R. C. Thom de Souza,
N. G. de Isídio,
A. Alvarez-Candal,
M. E. De Rossi
, et al. (8 additional authors not shown)
Abstract:
Context. Spectroscopic surveys like APOGEE, GALAH, and LAMOST have significantly advanced our understanding of the Milky Way by providing extensive stellar parameters and chemical abundances. Complementing these, photometric surveys with narrow/medium-band filters, such as the Southern Photometric Local Universe Survey (S-PLUS), offer the potential to estimate stellar parameters and abundances for a much larger number of stars.
Aims. This work develops methodologies to extract stellar atmospheric parameters and selected chemical abundances from S-PLUS photometric data, which spans ~3000 square degrees using seven narrowband and five broadband filters.
Methods. Using 66 S-PLUS colors, we estimated parameters based on training samples from LAMOST, APOGEE, and GALAH, applying Cost-Sensitive Neural Networks (NN) and Random Forests (RF). We tested for spurious correlations by including abundances not covered by the S-PLUS filters and evaluated NN and RF performance, with NN consistently outperforming RF. Including Teff and log g as features improved accuracy by ~3%. We retained only parameters with a goodness-of-fit above 50%.
Results. Our approach provides reliable estimates of fundamental parameters (Teff, log g, [Fe/H]) and abundance ratios such as [α/Fe], [Al/Fe], [C/Fe], [Li/Fe], and [Mg/Fe] for ~5 million stars, with goodness-of-fit >60%. Additional ratios like [Cu/Fe], [O/Fe], and [Si/Fe] were derived but are less accurate. Validation using star clusters, TESS, and J-PLUS data confirmed the robustness of our methodology.
Conclusions. By leveraging S-PLUS photometry and machine learning, we present a cost-effective alternative to high-resolution spectroscopy for deriving stellar parameters and abundances, enabling insights into Milky Way stellar populations and supporting future classification efforts.
Submitted 27 November, 2024;
originally announced November 2024.
-
Lorentz-violating Yukawa theory at finite temperature
Authors:
D. S. Cabral,
L. A. S. Evangelista,
J. C. R. de Souza,
L. H. A. R. Ferreira,
A. F. Santos
Abstract:
This paper addresses Yukawa theory, focusing on the scattering between two identical fermions mediated by an intermediate scalar boson, considering the effects of thermal contributions and Lorentz symmetry breaking. Temperature is introduced into the theory through the TFD formalism, while Lorentz violation arises from a background tensor coupled to the kinetic part of the Klein-Gordon Lagrangian. Two important quantities are calculated: the cross-section for the scattering process and the modified Yukawa potential. The main results obtained in this work demonstrate that considering Lorentz symmetry breaking has several implications for changes in symmetries and physical states, while the presence of temperature is strongly related to the strength of the interaction. This interplay between symmetry breaking and temperature effects provides deeper insights into the behavior of the Yukawa theory under different conditions.
Submitted 22 October, 2024;
originally announced October 2024.
-
Findings of the WMT 2024 Shared Task on Chat Translation
Authors:
Wafaa Mohammed,
Sweta Agrawal,
M. Amin Farajian,
Vera Cabarrão,
Bryan Eikema,
Ana C. Farinha,
José G. C. de Souza
Abstract:
This paper presents the findings from the third edition of the Chat Translation Shared Task. As with previous editions, the task involved translating bilingual customer support conversations, specifically focusing on the impact of conversation context in translation quality and evaluation. We also include two new language pairs: English-Korean and English-Dutch, in addition to the set of language pairs from previous editions: English-German, English-French, and English-Brazilian Portuguese. We received 22 primary submissions and 32 contrastive submissions from eight teams, with each language pair having participation from at least three teams. We evaluated the systems comprehensively using both automatic metrics and human judgments via a direct assessment framework. The official rankings for each language pair were determined based on human evaluation scores, considering performance in both translation directions--agent and customer. Our analysis shows that while the systems excelled at translating individual turns, there is room for improvement in overall conversation-level translation quality.
Submitted 15 October, 2024;
originally announced October 2024.
-
Modeling User Preferences with Automatic Metrics: Creating a High-Quality Preference Dataset for Machine Translation
Authors:
Sweta Agrawal,
José G. C. de Souza,
Ricardo Rei,
António Farinhas,
Gonçalo Faria,
Patrick Fernandes,
Nuno M Guerreiro,
Andre Martins
Abstract:
Alignment with human preferences is an important step in developing accurate and safe large language models. This is no exception in machine translation (MT), where better handling of language nuances and context-specific variations leads to improved quality. However, preference data based on human feedback can be very expensive to obtain and curate at a large scale. Automatic metrics, on the other hand, can induce preferences, but they might not match human expectations perfectly. In this paper, we propose an approach that leverages the best of both worlds. We first collect sentence-level quality assessments from professional linguists on translations generated by multiple high-quality MT systems and evaluate the ability of current automatic metrics to recover these preferences. We then use this analysis to curate a new dataset, MT-Pref (metric induced translation preference) dataset, which comprises 18k instances covering 18 language directions, using texts sourced from multiple domains post-2022. We show that aligning TOWER models on MT-Pref significantly improves translation quality on WMT23 and FLORES benchmarks.
Submitted 10 October, 2024;
originally announced October 2024.
-
EuroLLM: Multilingual Language Models for Europe
Authors:
Pedro Henrique Martins,
Patrick Fernandes,
João Alves,
Nuno M. Guerreiro,
Ricardo Rei,
Duarte M. Alves,
José Pombal,
Amin Farajian,
Manuel Faysse,
Mateusz Klimaszewski,
Pierre Colombo,
Barry Haddow,
José G. C. de Souza,
Alexandra Birch,
André F. T. Martins
Abstract:
The quality of open-weight LLMs has seen significant improvement, yet they remain predominantly focused on English. In this paper, we introduce the EuroLLM project, aimed at developing a suite of open-weight multilingual LLMs capable of understanding and generating text in all official European Union languages, as well as several additional relevant languages. We outline the progress made to date, detailing our data collection and filtering process, the development of scaling laws, the creation of our multilingual tokenizer, and the data mix and modeling configurations. Additionally, we release our initial models: EuroLLM-1.7B and EuroLLM-1.7B-Instruct and report their performance on multilingual general benchmarks and machine translation.
Submitted 24 September, 2024;
originally announced September 2024.
-
Assisting Novice Developers Learning in Flutter Through Cognitive-Driven Development
Authors:
Ronivaldo Ferreira,
Victor H. S. Pinto,
Cleidson R. B. de Souza,
Gustavo Pinto
Abstract:
Cognitive-Driven Development (CDD) is a coding design technique that helps developers focus on designing code within cognitive limits. The imposed limit tends to enhance code readability and maintainability. While early works on CDD focused mostly on Java, its applicability extends beyond specific programming languages. In this study, we explored the use of CDD in two new dimensions: focusing on Flutter programming and targeting novice developers unfamiliar with both Flutter and CDD. Our goal was to understand to what extent CDD helps novice developers learn a new programming technology. We conducted an in-person Flutter training camp with 24 participants. After receiving CDD training, six remaining students were tasked with developing a software management application guided by CDD practices. Our findings indicate that CDD helped participants keep code complexity low, measured using Intrinsic Complexity Points (ICP), a CDD metric. Notably, stricter ICP limits led to a 20% reduction in code size, improving code quality and readability. This report could be valuable for professors and instructors seeking effective methodologies for teaching design practices that reduce code and cognitive complexity.
Submitted 20 August, 2024;
originally announced August 2024.
-
Fast variational Bayesian inference for correlated survival data: an application to invasive mechanical ventilation duration analysis
Authors:
Chengqian Xian,
Camila P. E. de Souza,
Wenqing He,
Felipe F. Rodrigues,
Renfang Tian
Abstract:
Correlated survival data are prevalent in various clinical settings and have been extensively discussed in the literature. One of the most common types of correlated survival data is clustered survival data, where the survival times of individuals in a cluster are associated. Our study is motivated by invasive mechanical ventilation data from different intensive care units (ICUs) in Ontario, Canada, forming multiple clusters. The survival times of patients within the same ICU cluster are correlated. To address this association, we introduce a shared frailty log-logistic accelerated failure time model that accounts for intra-cluster correlation through a cluster-specific random intercept. We present a novel, fast variational Bayes (VB) algorithm for parameter inference and evaluate its performance using simulation studies varying the number of clusters and their sizes. We further compare the performance of our proposed VB algorithm with the h-likelihood method and a Markov Chain Monte Carlo (MCMC) algorithm. The proposed algorithm delivers satisfactory results and demonstrates computational efficiency over the MCMC algorithm. We apply our method to the ICU ventilation data from Ontario to investigate the ICU site random effect on ventilation duration.
Submitted 31 July, 2024;
originally announced August 2024.
-
Time-Machines Construct in $f(\mathcal{R},\mathcal{A},A^{μν}\,A_{μν})$ and $f(\mathcal{R})$ Modified Gravity Theories
Authors:
F. Ahmed,
J. C. R. de Souza,
A. F. Santos
Abstract:
In this paper, our objective is to explore a time-machine space-time formulated in general relativity, as introduced by Li (Phys. Rev. D 59, 084016 (1999)), within the context of modified gravity theories. We consider Ricci-inverse gravity of all classes of models, i.e., (i) Class-I: $f(\mathcal{R}, \mathcal{A})=(\mathcal{R}+κ\,\mathcal{R}^2+β\,\mathcal{A})$, (ii) Class-II: $f(\mathcal{R}, A^{μν}\,A_{μν})=(\mathcal{R}+κ\,\mathcal{R}^2+γ\,A^{μν}\,A_{μν})$, and (iii) Class-III: $f(\mathcal{R}, \mathcal{A}, A^{μν}\,A_{μν})=(\mathcal{R}+κ\,\mathcal{R}^2+β\,\mathcal{A}+δ\,\mathcal{A}^2+γ\,A^{μν}\,A_{μν})$, where $A^{μν}$ is the anti-curvature tensor, the reciprocal of the Ricci tensor $R_{μν}$, $\mathcal{A}=g_{μν}\,A^{μν}$ is its scalar, and $β, κ, γ, δ$ are the coupling constants. Moreover, we consider $f(\mathcal{R})$ modified gravity theory and investigate the same time-machine space-time. In fact, we show that the Li time-machine space-time serves as a valid solution in both Ricci-inverse and $f(\mathcal{R})$ modified gravity theories. Thus, both theories allow the formation of closed time-like curves analogous to those of general relativity, thereby representing a theoretically possible time-machine model in these gravity theories.
Submitted 4 October, 2024; v1 submitted 16 July, 2024;
originally announced July 2024.
-
Teaching and Learning Ethnography for Software Engineering Contexts
Authors:
Yvonne Dittrich,
Helen Sharp,
Cleidson de Souza
Abstract:
Ethnography has become one of the established methods for empirical research on software engineering. Although there is a wide variety of introductory books available, there has been no material specifically targeting software engineering students until now. In this chapter we provide an introduction to teaching and learning ethnography, both for faculty teaching ethnography to software engineering graduate students and for the students of such courses themselves.
The chapter focuses on what we think is the core basic knowledge for newcomers to ethnography as a research method. We complement the text with proposals for exercises, tips for teaching, and pitfalls that we and our students have experienced.
The chapter is designed to support part of a course on empirical software engineering and provides pointers and literature for further reading.
Submitted 5 July, 2024;
originally announced July 2024.
-
Tame fields, Graded Rings and Finite Complete Sequences of Key Polynomials
Authors:
Caio Henrique Silva de Souza
Abstract:
In this paper, we present a criterion for $(K,v)$ to be henselian and defectless in terms of finite complete sequences of key polynomials. For this, we use the theory of Mac Lane-Vaquié chains and abstract key polynomials. We then prove that a valued field $(K,v)$ is tame if and only if $vK$ is $p$-divisible, $Kv$ is perfect and every simple algebraic extension of $K$ admits a finite complete sequence of key polynomials. The properties $vK$ $p$-divisible and $Kv$ perfect are described by the Frobenius endomorphism on the associated graded ring. We also make considerations on simply defectless and algebraically maximal valued fields and purely inertial and purely ramified extensions.
Submitted 10 January, 2025; v1 submitted 1 July, 2024;
originally announced July 2024.
-
Ratchet current and scaling properties in a nontwist mapping
Authors:
Matheus Rolim Sales,
Daniel Borin,
Leonardo Costa de Souza,
José Danilo Szezech Jr.,
Ricardo Luiz Viana,
Iberê Luiz Caldas,
Edson Denis Leonel
Abstract:
We investigate the transport of particles in the chaotic component of phase space for a two-dimensional, area-preserving nontwist map. The survival probability for particles within the chaotic sea is described by an exponential decay for predominantly chaotic regions of phase space, and it is scale invariant in this case. Alternatively, when considering mixed chaotic and regular regions, there is a deviation from the exponential decay, characterized by a power law tail for long times, a signature of the stickiness effect. Furthermore, due to the asymmetry of the chaotic component of phase space with respect to the line $I = 0$, there is an unbalanced stickiness which generates a ratchet current in phase space. Finally, we give a phenomenological description of the diffusion of chaotic particles by identifying three scaling hypotheses and obtaining the critical exponents via extensive numerical simulations.
Submitted 13 August, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Model-based Clustering of Multi-Dimensional Zero-Inflated Counts via the EM Algorithm
Authors:
Zahra AghahosseinaliShirazi,
Pedro A. Rangel,
Camila P. E. de Souza
Abstract:
Zero-inflated count data arise in various fields, including health, biology, economics, and the social sciences. These data are often modelled using probabilistic distributions such as zero-inflated Poisson (ZIP), zero-inflated negative binomial (ZINB), or zero-inflated binomial (ZIB). To account for heterogeneity in the data, it is often useful to cluster observations into groups that may explain underlying differences in the data-generating process. This paper focuses on model-based clustering for zero-inflated counts when observations are structured in a matrix form rather than a vector. We propose a clustering framework based on mixtures of ZIP or ZINB distributions, with both the count and zero components depending on cluster assignments. Our approach incorporates covariates through a log-linear structure for the mean parameter and includes a size factor to adjust for differences in total sampling or exposure. Model parameters and cluster assignments are estimated via the Expectation-Maximization (EM) algorithm. We assess the performance of our proposed methodology through simulation studies evaluating clustering accuracy and estimator properties, followed by applications to publicly available datasets.
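The EM estimation mentioned above can be illustrated on the simplest building block. The following is a minimal sketch, assuming a single zero-inflated Poisson component with no clusters, covariates, or size factors; it is not the authors' mixture model. The function name `fit_zip_em` and the toy data are hypothetical.

```python
import math


def fit_zip_em(counts, iterations=200):
    """Fit a single zero-inflated Poisson (ZIP) by EM.

    Illustrative simplification: one component, no covariates, no size factor.
    Returns (pi, lam): the structural-zero probability and the Poisson mean.
    """
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    # Starting values: half of the mass on structural zeros, Poisson mean = sample mean.
    pi, lam = 0.5, max(total / n, 1e-6)
    for _ in range(iterations):
        # E-step: probability that an observed zero is a structural zero.
        p_struct = pi / (pi + (1.0 - pi) * math.exp(-lam))
        expected_struct = n_zero * p_struct
        # M-step: closed-form updates given the expected number of structural zeros.
        pi = expected_struct / n
        lam = total / (n - expected_struct)
    return pi, lam


# Toy data (hypothetical): many zeros plus a Poisson-like bulk.
data = [0] * 60 + [1] * 5 + [2] * 10 + [3] * 12 + [4] * 8 + [5] * 5
pi_hat, lam_hat = fit_zip_em(data)
print(f"pi = {pi_hat:.3f}, lambda = {lam_hat:.3f}")
```

A mixture version would add a per-cluster responsibility to the E-step and weight the M-step updates by it; that extension is not shown here.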
Submitted 27 March, 2025; v1 submitted 31 May, 2024;
originally announced June 2024.
-
QUEST: Quality-Aware Metropolis-Hastings Sampling for Machine Translation
Authors:
Gonçalo R. A. Faria,
Sweta Agrawal,
António Farinhas,
Ricardo Rei,
José G. C. de Souza,
André F. T. Martins
Abstract:
An important challenge in machine translation (MT) is to generate high-quality and diverse translations. Prior work has shown that the estimated likelihood from the MT model correlates poorly with translation quality. In contrast, quality evaluation metrics (such as COMET or BLEURT) exhibit high correlations with human judgments, which has motivated their use as rerankers (such as quality-aware and minimum Bayes risk decoding). However, relying on a single translation with high estimated quality increases the chances of "gaming the metric". In this paper, we address the problem of sampling a set of high-quality and diverse translations. We provide a simple and effective way to avoid over-reliance on noisy quality estimates by using them as the energy function of a Gibbs distribution. Instead of looking for a mode in the distribution, we generate multiple samples from high-density areas through the Metropolis-Hastings algorithm, a simple Markov chain Monte Carlo approach. The results show that our proposed method leads to high-quality and diverse outputs across multiple language pairs (English$\leftrightarrow${German, Russian}) with two strong decoder-only LLMs (Alma-7b, Tower-7b).
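The idea of treating a quality score as the energy of a Gibbs distribution and sampling with Metropolis-Hastings can be sketched on a toy problem. This is a minimal illustration under strong assumptions, not the QUEST procedure itself: the candidate set is finite, proposals are drawn uniformly (so the proposal is symmetric), and the names `metropolis_hastings`, `toy_scores`, and the temperature value are hypothetical.

```python
import math
import random


def metropolis_hastings(candidates, quality, temperature=0.1, steps=1000, seed=0):
    """Sample from pi(y) proportional to exp(quality(y) / temperature).

    Assumption: a uniform independent proposal over a fixed candidate list,
    which is symmetric, so the acceptance ratio reduces to pi(y') / pi(y).
    """
    rng = random.Random(seed)
    current = rng.choice(candidates)
    samples = []
    for _ in range(steps):
        proposal = rng.choice(candidates)
        log_ratio = (quality(proposal) - quality(current)) / temperature
        # Accept with probability min(1, exp(log_ratio)).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        samples.append(current)
    return samples


# Hypothetical quality scores standing in for a learned metric.
toy_scores = {"translation A": 0.9, "translation B": 0.5, "translation C": 0.2}
chain = metropolis_hastings(list(toy_scores), toy_scores.get, steps=2000)
print({y: chain.count(y) for y in toy_scores})
```

Lower temperatures concentrate the chain on the highest-scoring candidates, while higher temperatures trade quality for diversity.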
Submitted 15 October, 2024; v1 submitted 28 May, 2024;
originally announced June 2024.
-
Fast Bayesian Basis Selection for Functional Data Representation with Correlated Errors
Authors:
Ana Carolina da Cruz,
Camila P. E. de Souza,
Pedro H. T. O. Sousa
Abstract:
Functional data analysis finds widespread application across various fields. While functional data are intrinsically infinite-dimensional, in practice, they are observed only at a finite set of points, typically over a dense grid. As a result, smoothing techniques are often used to approximate the observed data as functions. In this work, we propose a novel Bayesian approach for selecting basis functions for smoothing one or multiple curves simultaneously. Our method differs from other Bayesian approaches in two key ways: (i) by accounting for correlated errors and (ii) by developing a variational EM algorithm, which is faster than MCMC methods such as Gibbs sampling. Simulation studies demonstrate that our method effectively identifies the true underlying structure of the data across various scenarios, and it is applicable to different types of functional data. Our variational EM algorithm not only recovers the basis coefficients and the correct set of basis functions but also estimates the existing within-curve correlation. When applied to the motorcycle and Canadian weather datasets, our method demonstrates comparable, and in some cases superior, performance in terms of adjusted $R^2$ compared to regression splines, smoothing splines, Bayesian LASSO and LASSO. Our proposed method is implemented in R and the code is available at https://github.com/acarolcruz/VB-Bases-Selection.
Submitted 8 November, 2024; v1 submitted 31 May, 2024;
originally announced May 2024.
-
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks
Authors:
Duarte M. Alves,
José Pombal,
Nuno M. Guerreiro,
Pedro H. Martins,
João Alves,
Amin Farajian,
Ben Peters,
Ricardo Rei,
Patrick Fernandes,
Sweta Agrawal,
Pierre Colombo,
José G. C. de Souza,
André F. T. Martins
Abstract:
While general-purpose large language models (LLMs) demonstrate proficiency on multiple tasks within the domain of translation, approaches based on open LLMs are competitive only when specializing on a single task. In this paper, we propose a recipe for tailoring LLMs to multiple tasks present in translation workflows. We perform continued pretraining on a multilingual mixture of monolingual and parallel data, creating TowerBase, followed by finetuning on instructions relevant for translation processes, creating TowerInstruct. Our final model surpasses open alternatives on several tasks relevant to translation workflows and is competitive with general-purpose closed LLMs. To facilitate future research, we release the Tower models, our specialization dataset, an evaluation framework for LLMs focusing on the translation ecosystem, and a collection of model generations, including ours, on our benchmark.
Submitted 27 February, 2024;
originally announced February 2024.
-
PANDAS: Prototype-based Novel Class Discovery and Detection
Authors:
Tyler L. Hayes,
César R. de Souza,
Namil Kim,
Jiwon Kim,
Riccardo Volpi,
Diane Larlus
Abstract:
Object detectors are typically trained once and for all on a fixed set of classes. However, this closed-world assumption is unrealistic in practice, as new classes will inevitably emerge after the detector is deployed in the wild. In this work, we look at ways to extend a detector trained for a set of base classes so it can i) spot the presence of novel classes, and ii) automatically enrich its repertoire to be able to detect those newly discovered classes together with the base ones. We propose PANDAS, a method for novel class discovery and detection. It discovers clusters representing novel classes from unlabeled data, and represents old and new classes with prototypes. During inference, a distance-based classifier uses these prototypes to assign a label to each detected object instance. The simplicity of our method makes it widely applicable. We experimentally demonstrate the effectiveness of PANDAS on the VOC 2012 and COCO-to-LVIS benchmarks. It performs favorably against the state of the art for this task while being computationally more affordable.
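The distance-based classification over prototypes described above can be sketched as follows. This is a minimal illustration, assuming each class prototype is the mean of its feature vectors and that distance is Euclidean; the detection pipeline and the clustering step that discovers novel classes are not shown. The function names and toy features are hypothetical.

```python
import math
from collections import defaultdict


def build_prototypes(features, labels):
    """Return one prototype per label: the mean of that label's feature vectors."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def classify(vec, prototypes):
    """Assign the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(vec, prototypes[label]))


# Toy 2-D features (hypothetical): one base class and one discovered novel class.
features = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
labels = ["base", "base", "novel", "novel"]
prototypes = build_prototypes(features, labels)
print(classify([1.0, 1.0], prototypes), classify([9.0, 9.0], prototypes))
```

Because new classes only add entries to the prototype dictionary, extending the label set in this sketch does not require retraining a classifier head.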
Submitted 30 April, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
Cosmological constant Petrov type-N space-time in Ricci-inverse gravity
Authors:
F. Ahmed,
J. C. R. de Souza,
A. F. Santos
Abstract:
We study, within the framework of the Ricci-inverse gravity model, a specific type-N space-time that exhibits closed time-like curves in general relativity. The matter-energy content is solely composed of a pure radiation field, and it adheres to the energy conditions while featuring a negative cosmological constant. One of the key findings in this investigation is the non-zero determinant of the Ricci tensor ($R_{μν}$), which implies the existence of an anti-curvature tensor ($A^{μν}$) and, as a consequence, an anti-curvature scalar ($A \neq R^{-1}$). Furthermore, we establish that this type-N space-time serves as a solution within modified gravity theories via the Ricci-inverse model, which involves adjustments to the cosmological constant ($Λ$) and the energy density ($ρ$) of the radiation field expressed in terms of a coupling constant. As a result, our findings suggest that causality violations remain possible within the framework of this Ricci-inverse gravity model, alongside the predictions of general relativity.
Submitted 26 December, 2023;
originally announced December 2023.
-
Polar order, shear banding, and clustering in confined active matter
Authors:
Daniel Canavello,
Rubens H. Damascena,
Leonardo R. E. Cabral,
Clécio C. de Souza Silva
Abstract:
We investigate the collective behavior of sterically interacting self-propelled particles confined in a harmonic potential. Our theoretical and numerical study unveils the emergence of distinctive collective polar organizations, revealing how different levels of interparticle torques and noise influence the system. The observed phases include the shear-banded vortex, where the system self-organizes into two concentric bands rotating in opposite directions around the potential center; the uniform vortex, where the two bands merge into a close-packed configuration rotating uniformly as a quasi-rigid body; and the orbiting polar state, characterized by parallel orientation vectors and the cluster revolving around the potential center, without rotation, as a rigid body. Intriguingly, at lower filling fractions, the vortex and polar phases merge into a single phase where the trapped cluster breaks into smaller polarized clusters, each one orbiting the potential center as a rigid body.
Submitted 21 December, 2023;
originally announced December 2023.
-
AutoNumerics-Zero: Automated Discovery of State-of-the-Art Mathematical Functions
Authors:
Esteban Real,
Yao Chen,
Mirko Rossini,
Connal de Souza,
Manav Garg,
Akhil Verghese,
Moritz Firsching,
Quoc V. Le,
Ekin Dogus Cubuk,
David H. Park
Abstract:
Computers calculate transcendental functions by approximating them through the composition of a few limited-precision instructions. For example, an exponential can be calculated with a Taylor series. These approximation methods were developed over the centuries by mathematicians, who emphasized the attainability of arbitrary precision. Computers, however, operate on a few limited-precision types, such as the popular float32. In this study, we show that when aiming for limited precision, existing approximation methods can be outperformed by programs automatically discovered from scratch by a simple evolutionary algorithm. In particular, over real numbers, our method can approximate the exponential function reaching orders of magnitude more precision for a given number of operations when compared to previous approaches. More practically, over float32 numbers and constrained to less than 1 ULP of error, the same method attains a speedup over baselines by generating code that triggers better XLA/LLVM compilation paths. In other words, in both cases, evolution searched a vast space of possible programs, without knowledge of mathematics, to discover previously unknown optimized approximations to high precision, for the first time. We also give evidence that these results extend beyond the exponential. The ubiquity of transcendental functions suggests that our method has the potential to reduce the cost of scientific computing applications.
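The Taylor-series baseline mentioned above can be written in a few lines. This is a minimal sketch of a truncated Taylor expansion of the exponential evaluated in Horner form; it illustrates the classical approach only, not the evolved programs from the paper. The function name `exp_taylor` and the chosen degree are hypothetical.

```python
import math


def exp_taylor(x, degree=12):
    """Approximate exp(x) by its Taylor polynomial of the given degree.

    Uses the nested (Horner-like) form 1 + x/1 * (1 + x/2 * (1 + x/3 * ...)),
    which needs one multiply, one divide, and one add per term.
    """
    result = 1.0
    for k in range(degree, 0, -1):
        result = 1.0 + x * result / k
    return result


# Compare against the library implementation at x = 1.
approx = exp_taylor(1.0)
print(approx, abs(approx - math.e))
```

The error of this polynomial grows quickly away from zero, which is why practical implementations usually combine it with range reduction.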
Submitted 13 December, 2023;
originally announced December 2023.
-
Developer Experiences with a Contextualized AI Coding Assistant: Usability, Expectations, and Outcomes
Authors:
Gustavo Pinto,
Cleidson de Souza,
Thayssa Rocha,
Igor Steinmacher,
Alberto de Souza,
Edward Monteiro
Abstract:
In the rapidly advancing field of artificial intelligence, software development has emerged as a key area of innovation. Despite the plethora of general-purpose AI assistants available, their effectiveness diminishes in complex, domain-specific scenarios. Noting this limitation, both the academic community and industry players are relying on contextualized coding AI assistants. These assistants surpass general-purpose AI tools by integrating proprietary, domain-specific knowledge, offering precise and relevant solutions. Our study focuses on the initial experiences of 62 participants who used a contextualized coding AI assistant -- named StackSpot AI -- in a controlled setting. According to the participants, the assistant's use resulted in significant time savings, easier access to documentation, and the generation of accurate code for internal APIs. However, participants also observed challenges with the knowledge sources needed to give the coding assistant more contextual information, as well as variable responses and limitations in handling complex code. The study's findings, detailing both the benefits and challenges of contextualized AI assistants, underscore their potential to revolutionize software development practices, while also highlighting areas for further refinement.
Submitted 30 November, 2023;
originally announced November 2023.
-
Lessons from Building StackSpot AI: A Contextualized AI Coding Assistant
Authors:
Gustavo Pinto,
Cleidson de Souza,
João Batista Neto,
Alberto de Souza,
Tarcísio Gotto,
Edward Monteiro
Abstract:
With their exceptional natural language processing capabilities, tools based on Large Language Models (LLMs) like ChatGPT and Co-Pilot have swiftly become indispensable resources in the software developer's toolkit. While recent studies suggest the potential productivity gains these tools can unlock, users still encounter drawbacks, such as generic or incorrect answers. Additionally, the pursuit of improved responses often leads to extensive prompt engineering efforts, diverting valuable time from writing code that delivers actual value. To address these challenges, a new breed of tools, built atop LLMs, is emerging. These tools aim to mitigate drawbacks by employing techniques like fine-tuning or enriching user prompts with contextualized information.
In this paper, we delve into the lessons learned by a software development team venturing into the creation of such a contextualized LLM-based application, using retrieval-based techniques, called CodeBuddy. Over a four-month period, the team, despite lacking prior professional experience in LLM-based applications, built the product from scratch. Following the initial product release, we engaged with the development team responsible for the code generative components. Through interviews and analysis of the application's issue tracker, we uncover various intriguing challenges that teams working on LLM-based applications might encounter. For instance, we found three main groups of lessons: LLM-based lessons, User-based lessons, and Technical lessons. By understanding these lessons, software development teams could become better prepared to build LLM-based applications.
Submitted 4 January, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Steering Large Language Models for Machine Translation with Finetuning and In-Context Learning
Authors:
Duarte M. Alves,
Nuno M. Guerreiro,
João Alves,
José Pombal,
Ricardo Rei,
José G. C. de Souza,
Pierre Colombo,
André F. T. Martins
Abstract:
Large language models (LLMs) are a promising avenue for machine translation (MT). However, current LLM-based MT systems are brittle: their effectiveness highly depends on the choice of few-shot examples and they often require extra post-processing due to overgeneration. Alternatives such as finetuning on translation instructions are computationally expensive and may weaken in-context learning capabilities, due to overspecialization. In this paper, we provide a closer look at this problem. We start by showing that adapter-based finetuning with LoRA matches the performance of traditional finetuning while reducing the number of training parameters by a factor of 50. This method also outperforms few-shot prompting and eliminates the need for post-processing or in-context examples. However, we show that finetuning generally degrades few-shot performance, hindering adaptation capabilities. Finally, to obtain the best of both worlds, we propose a simple approach that incorporates few-shot examples during finetuning. Experiments on 10 language pairs show that our proposed approach recovers the original few-shot capabilities while keeping the added benefits of finetuning.
Submitted 20 October, 2023;
originally announced October 2023.
-
An Empirical Study of Translation Hypothesis Ensembling with Large Language Models
Authors:
António Farinhas,
José G. C. de Souza,
André F. T. Martins
Abstract:
Large language models (LLMs) are becoming a one-fits-many solution, but they sometimes hallucinate or produce unreliable output. In this paper, we investigate how hypothesis ensembling can improve the quality of the generated text for the specific problem of LLM-based machine translation. We experiment with several techniques for ensembling hypotheses produced by LLMs such as ChatGPT, LLaMA, and Alpaca. We provide a comprehensive study along multiple dimensions, including the method to generate hypotheses (multiple prompts, temperature-based sampling, and beam search) and the strategy to produce the final translation (instruction-based, quality-based reranking, and minimum Bayes risk (MBR) decoding). Our results show that MBR decoding is a very effective method, that translation quality can be improved using a small number of samples, and that instruction tuning has a strong impact on the relation between the diversity of the hypotheses and the sampling temperature.
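Minimum Bayes risk decoding over a set of sampled hypotheses can be sketched as follows. This is a minimal illustration, assuming a token-set Jaccard overlap as a stand-in utility; in practice a learned or lexical MT metric would play that role. The function names and example hypotheses are hypothetical.

```python
def jaccard(a, b):
    """Toy utility: Jaccard overlap between the token sets of two strings."""
    tokens_a, tokens_b = set(a.split()), set(b.split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def mbr_select(hypotheses, utility=jaccard):
    """Return the hypothesis with the highest average utility against the others.

    The other hypotheses act as pseudo-references, so the choice favors the
    candidate that agrees most with the rest of the sample.
    """
    def expected_utility(i):
        others = [h for j, h in enumerate(hypotheses) if j != i]
        if not others:
            return 0.0
        return sum(utility(hypotheses[i], h) for h in others) / len(others)

    best = max(range(len(hypotheses)), key=expected_utility)
    return hypotheses[best]


hypotheses = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "the cat is on the mat",
    "dogs bark loudly",
]
print(mbr_select(hypotheses))
```

The cost is quadratic in the number of hypotheses, since every pair is scored once per direction.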
Submitted 17 October, 2023;
originally announced October 2023.
-
Effects of Distributed Generation on the Bidirectional Operation of Cascaded Step Voltage Regulators: Case Study of a Real 34.5 kV Distribution Feeder
Authors:
Hugo Rodrigues de Brito,
Valéria Monteiro de Souza,
João Paulo Abreu Vieira,
Maria Emília de Lima Tostes,
Ubiratan Holanda Bezerra,
Vanderson Carvalho de Souza,
Daniel da Conceição Pinheiro,
Heitor Alves Barata,
Hugo Nazareno de Souza Cardoso,
Marcelo Sousa Costa
Abstract:
This work investigates the impact of feeder bidirectional active power flow on the operation of two cascaded step voltage regulators (SVRs) located on a 34.5 kV rural distribution feeder. It shows that, when active power flow reversal is possible both by network reconfiguration and by high penetration levels of distributed generation (DG), typical SVR control mode settings are unable to prevent the occurrence of the runaway condition, a phenomenon characterized by loss of SVR voltage control capabilities. Such developments are the basis for a DG pre-dispatch control strategy that aims to avoid the adverse effects of the described power flow reversal scenarios, as well as to ensure reliable operation of the utility distribution network.
Submitted 13 October, 2023;
originally announced October 2023.
-
Back-Propagation Optimization and Multi-Valued Artificial Neural Networks for Highly Vivid Structural Color Filter Metasurfaces
Authors:
Arthur Clini de Souza,
Stéphane Lanteri,
Hugo Enrique Hernandez-Figueroa,
Marco Abbarchi,
David Grosso,
Badre Kerzabi,
Mahmoud Elsawy
Abstract:
We introduce a novel technique for designing color filter metasurfaces using a data-driven approach based on deep learning. Our approach employs inverse design principles to identify highly efficient designs that outperform all the configurations in the dataset, which consists of only 585 distinct geometries. By combining Multi-Valued Artificial Neural Networks and back-propagation optimization, we overcome the limitations of previous approaches, such as poor performance due to extrapolation and undesired local minima. Consequently, we successfully create reliable and highly efficient configurations for metasurface color filters capable of producing exceptionally vivid colors that go beyond the sRGB gamut. Furthermore, our deep learning technique can be extended to design various pixellated metasurface configurations with different functionalities.
Submitted 18 October, 2023; v1 submitted 2 October, 2023;
originally announced October 2023.
-
Scaling up COMETKIWI: Unbabel-IST 2023 Submission for the Quality Estimation Shared Task
Authors:
Ricardo Rei,
Nuno M. Guerreiro,
José Pombal,
Daan van Stigt,
Marcos Treviso,
Luisa Coheur,
José G. C. de Souza,
André F. T. Martins
Abstract:
We present the joint contribution of Unbabel and Instituto Superior Técnico to the WMT 2023 Shared Task on Quality Estimation (QE). Our team participated in all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2). For all tasks, we build on the COMETKIWI-22 model (Rei et al., 2022b). Our multilingual approaches are ranked first for all tasks, reaching state-of-the-art performance for quality estimation at word-, span- and sentence-level granularity. Compared to the previous state-of-the-art COMETKIWI-22, we show large improvements in correlation with human judgements (up to 10 Spearman points). Moreover, we surpass the second-best multilingual submission to the shared task by up to 3.8 absolute points.
Submitted 21 September, 2023;
originally announced September 2023.
-
An axially symmetric spacetime with causality violation in Ricci-inverse gravity
Authors:
J. C. R. de Souza,
A. F. Santos
Abstract:
In this paper, Ricci-inverse gravity is investigated. It is an alternative theory of gravity that introduces into the Einstein-Hilbert action an anti-curvature scalar obtained from the anti-curvature tensor, which is the inverse of the Ricci tensor. An axially symmetric spacetime with causality violation is studied. Two classes of the model are discussed. Different sources of matter are considered. Then a direct relation between the content of matter and causality violation is shown. Our results confirm that Ricci-inverse gravity allows the existence of Closed Time-like Curves (CTCs) that lead to the violation of causality. Furthermore, a comparison is made between the results of general relativity and Ricci-inverse gravity. Other spacetimes, such as Gödel and Gödel-type universes, which are exact solutions of general relativity and allow for causality violations, are also explored in the Ricci-inverse gravity framework.
Submitted 11 September, 2023;
originally announced September 2023.
-
Geometry of vectorial martingale optimal transport and robust option pricing
Authors:
Joshua Zoen-Git Hiew,
Tongseok Lim,
Brendan Pass,
Marcelo Cruz de Souza
Abstract:
This paper addresses robust finance, which is concerned with the development of models and approaches that account for market uncertainties. Specifically, we investigate the Vectorial Martingale Optimal Transport (VMOT) problem, the geometry of its solutions, and its application to robust option pricing problems in finance. To this end, we consider two-period market models and show that when the spatial dimension $d$ (the number of underlying assets) is 2, the extremal model for the cap option with a sub- or super-modular payout reduces to a single factor model in the first period, but not in general when $d > 2$. The result demonstrates a subtle relationship between spatial dimension, cost function supermodularity, and their effect on the geometry of solutions to the VMOT problem. We investigate applications of the model to financial problems and demonstrate how the dimensional reduction caused by monotonicity can be used to improve existing computational methods.
Submitted 18 September, 2023; v1 submitted 10 September, 2023;
originally announced September 2023.
-
How life-table right-censoring affected the Brazilian Social Security Factor: an application of the gamma-Gompertz-Makeham model
Authors:
Filipe Costa de Souza,
Wilton Bernardino,
Silvio Cabral Patricio
Abstract:
Automatic Adjustment Mechanisms (AAM) are legal instruments that help social security systems respond to demographic and economic changes. In Brazil, the Social Security Factor (SSF) was introduced in the late 1990s as an AAM to link retirement benefits to life expectancy at the retirement age, with the hope of promoting contributory justice and discouraging early retirement. Recent research has highlighted the limitations of right-censored life tables, such as those used in Brazil. It has recommended using the gamma-Gompertz-Makeham (GGM) model to estimate adult and old-age mortality. This study investigated the impact of right-censoring on the SSF by comparing the official SSF and other social security metrics with a counterfactual scenario computed based on fitted GGM models. The results indicate that from 2004 to 2012, official life tables may have negatively impacted retirees' income, particularly for those who delayed their retirement. Furthermore, the GGM-fitted models' life expectancies had more stable paths over time, which could have helped with long-term planning. This study's findings are significant for policymakers as they highlight the importance of using appropriate mortality metrics in AAMs to ensure accurate retirement benefit payments. They also underscore the need to consider the potential impacts of seemingly innocuous hypotheses on public action outcomes. Overall, this study provides valuable insights for public planners and policymakers looking to enhance the effectiveness and fairness of social security systems.
Submitted 12 September, 2023; v1 submitted 7 September, 2023;
originally announced September 2023.
-
Parametrizations of subsets of the space of valuations
Authors:
Josnei Antonio Novacoski,
Caio Henrique Silva de Souza
Abstract:
In this paper we present different ways to parametrize subsets of the space of valuations on $K[x]$ extending a given valuation on $K$. We discuss the methods using pseudo-Cauchy sequences and approximation types. The method presented here is slightly different than the ones in the literature and we believe that our approach is more accurate.
Submitted 25 August, 2023;
originally announced August 2023.
-
ProWis: A Visual Approach for Building, Managing, and Analyzing Weather Simulation Ensembles at Runtime
Authors:
Carolina Veiga Ferreira de Souza,
Suzanna Maria Bonnet,
Daniel de Oliveira,
Marcio Cataldi,
Fabio Miranda,
Marcos Lage
Abstract:
Weather forecasting is essential for decision-making and is usually performed using numerical modeling. Numerical weather models, in turn, are complex tools that require specialized training and laborious setup and are challenging even for weather experts. Moreover, weather simulations are data-intensive computations and may take hours to days to complete. When the simulation is finished, the experts face challenges analyzing its outputs, a large mass of spatiotemporal and multivariate data. From the simulation setup to the analysis of results, working with weather simulations involves several manual and error-prone steps. The complexity of the problem increases exponentially when the experts must deal with ensembles of simulations, a frequent task in their daily duties. To tackle these challenges, we propose ProWis: an interactive and provenance-oriented system to help weather experts build, manage, and analyze simulation ensembles at runtime. Our system follows a human-in-the-loop approach to enable the exploration of multiple atmospheric variables and weather scenarios. ProWis was built in close collaboration with weather experts, and we demonstrate its effectiveness by presenting two case studies of rainfall events in Brazil.
Submitted 9 August, 2023;
originally announced August 2023.
-
BayesCPclust: A Bayesian Approach for Clustering Constant-Wise Change-Point Data
Authors:
Ana Carolina da Cruz,
Camila P. E. de Souza
Abstract:
Change-point models deal with ordered data sequences. Their primary goal is to infer the locations where an aspect of the data sequence changes. In this paper, we propose and implement a nonparametric Bayesian model for clustering observations based on their constant-wise change-point profiles via Gibbs sampler. Our model incorporates a Dirichlet Process on the constant-wise change-point structures to cluster observations while simultaneously performing multiple change-point estimation. Additionally, our approach controls the number of clusters in the model, not requiring the specification of the number of clusters a priori. Satisfactory clustering and estimation results were obtained when evaluating our method under various simulated scenarios and on a real dataset from single-cell genomic sequencing. Our proposed methodology is implemented as an R package called BayesCPclust and is available from the Comprehensive R Archive Network at https://CRAN.R-project.org/package=BayesCPclust.
Submitted 10 February, 2025; v1 submitted 28 May, 2023;
originally announced May 2023.
-
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Authors:
Patrick Fernandes,
Aman Madaan,
Emmy Liu,
António Farinhas,
Pedro Henrique Martins,
Amanda Bertsch,
José G. C. de Souza,
Shuyan Zhou,
Tongshuang Wu,
Graham Neubig,
André F. T. Martins
Abstract:
Many recent advances in natural language generation have been fueled by training large language models on internet-scale data. However, this paradigm can lead to models that generate toxic, inaccurate, and unhelpful content, and automatic evaluation metrics often fail to identify these behaviors. As models become more capable, human feedback is an invaluable signal for evaluating and improving models. This survey aims to provide an overview of the recent research that has leveraged human feedback to improve natural language generation. First, we introduce an encompassing formalization of feedback, and identify and organize existing research into a taxonomy following this formalization. Next, we discuss how feedback can be described by its format and objective, and cover the two approaches proposed to use feedback (either for training or decoding): directly using the feedback or training feedback models. We also discuss existing datasets for human-feedback data collection, and concerns surrounding feedback collection. Finally, we provide an overview of the nascent field of AI feedback, which exploits large language models to make judgments based on a set of principles and minimize the need for human intervention.
Submitted 31 May, 2023; v1 submitted 1 May, 2023;
originally announced May 2023.
-
Variational Bayesian analysis of survival data using a log-logistic accelerated failure time model
Authors:
Chengqian Xian,
Camila P. E. de Souza,
Wenqing He,
Felipe F. Rodrigues,
Renfang Tian
Abstract:
The log-logistic regression model is one of the most commonly used accelerated failure time (AFT) models in survival analysis, for which statistical inference methods are mainly established under the frequentist framework. Recently, Bayesian inference for log-logistic AFT models using Markov chain Monte Carlo (MCMC) techniques has also been widely developed. In this work, we develop an alternative approach to MCMC methods and infer the parameters of the log-logistic AFT model via a mean-field variational Bayes (VB) algorithm. A piecewise approximation technique is embedded in deriving the VB algorithm to achieve conjugacy. The proposed VB algorithm is evaluated and compared with typical frequentist inferences and MCMC inference using simulated data under various scenarios. A publicly available dataset is employed for illustration. We demonstrate that the proposed VB algorithm can achieve good estimation accuracy and has a lower computational cost compared with MCMC methods.
Submitted 10 October, 2023; v1 submitted 16 March, 2023;
originally announced March 2023.
-
Bayesian Variable Selection for Function-on-Scalar Regression Models: a comparative analysis
Authors:
Pedro Henrique T. O. Sousa,
Camila P. E. de Souza,
Ronaldo Dias
Abstract:
In this work, we developed a new Bayesian method for variable selection in function-on-scalar regression (FOSR). Our method uses a hierarchical Bayesian structure and latent variables to enable an adaptive covariate selection process for FOSR. Extensive simulation studies show the proposed method's main properties, such as its accuracy in estimating the coefficients and high capacity to select variables correctly. Furthermore, we conducted a substantial comparative analysis with the main competing methods, the BGLSS (Bayesian Group Lasso with Spike and Slab prior) method, the group LASSO (Least Absolute Shrinkage and Selection Operator), the group MCP (Minimax Concave Penalty), and the group SCAD (Smoothly Clipped Absolute Deviation). Our results demonstrate that the proposed methodology is superior in correctly selecting covariates compared with the existing competing methods while maintaining a satisfactory level of goodness of fit. In contrast, the competing methods could not balance selection accuracy with goodness of fit. We also considered a COVID-19 dataset and some socioeconomic data from Brazil as an application and obtained satisfactory results. In short, the proposed Bayesian variable selection model is highly competitive, showing significant predictive and selective quality.
Submitted 24 April, 2024; v1 submitted 6 March, 2023;
originally announced March 2023.
-
Element-Wise Attention Layers: an option for optimization
Authors:
Giovanni Araujo Bacochina,
Rodrigo Clemente Thom de Souza
Abstract:
Attention layers have become a trend since the popularization of Transformer-based models, and they are the key element of many state-of-the-art models developed in recent years. However, one of the biggest obstacles to implementing these architectures, as with many others in deep learning, is the enormous number of optimizable parameters they possess, which makes their use conditional on the availability of robust hardware. This paper proposes a new attention mechanism that adapts Dot-Product Attention, which uses matrix multiplications, to become element-wise through the use of array multiplications. To test the effectiveness of this approach, two models (one with a VGG-like architecture and one with the proposed method) were trained on a classification task using the Fashion MNIST and CIFAR10 datasets. Each model was trained for 10 epochs on a single Tesla T4 GPU from Google Colaboratory. The results show that this mechanism reaches 92% of the accuracy of the VGG-like counterpart on the Fashion MNIST dataset while reducing the number of parameters by 97%. For CIFAR10, the accuracy is still equivalent to 60% of that of the VGG-like counterpart while using 50% fewer parameters.
Submitted 10 February, 2023;
originally announced February 2023.
-
Unified Functional Hashing in Automatic Machine Learning
Authors:
Ryan Gillard,
Stephen Jonany,
Yingjie Miao,
Michael Munn,
Connal de Souza,
Jonathan Dungay,
Chen Liang,
David R. So,
Quoc V. Le,
Esteban Real
Abstract:
The field of Automatic Machine Learning (AutoML) has recently attained impressive results, including the discovery of state-of-the-art machine learning solutions, such as neural image classifiers. This is often done by applying an evolutionary search method, which samples multiple candidate solutions from a large space and evaluates the quality of each candidate through a long training process. As a result, the search tends to be slow. In this paper, we show that large efficiency gains can be obtained by employing a fast unified functional hash, especially through the functional equivalence caching technique, which we also present. The central idea is to detect by hashing when the search method produces equivalent candidates, which occurs very frequently, and this way avoid their costly re-evaluation. Our hash is "functional" in that it identifies equivalent candidates even if they were represented or coded differently, and it is "unified" in that the same algorithm can hash arbitrary representations; e.g. compute graphs, imperative code, or lambda functions. As evidence, we show dramatic improvements on multiple AutoML domains, including neural architecture search and algorithm discovery. Finally, we consider the effect of hash collisions, evaluation noise, and search distribution through empirical analysis. Altogether, we hope this paper may serve as a guide to hashing techniques in AutoML.
Submitted 10 February, 2023;
originally announced February 2023.
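The functional equivalence caching idea described in the Unified Functional Hashing abstract above can be illustrated with a minimal sketch. The abstract does not specify how the hash is computed, so this example assumes a simple stand-in: a candidate is fingerprinted by its outputs on a fixed set of probe inputs, and candidates with the same fingerprint share one cached evaluation. All names here (`functional_hash`, `EquivalenceCache`, `expensive_evaluate`) are hypothetical and are not taken from the paper.

```python
import random
from typing import Callable, Dict, List, Tuple

Candidate = Callable[[float], float]


def make_probes(n: int = 8, seed: int = 0) -> List[float]:
    """Fixed pseudo-random probe inputs shared by every candidate."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def functional_hash(candidate: Candidate, probes: List[float],
                    ndigits: int = 6) -> Tuple[float, ...]:
    """Fingerprint a candidate by its (rounded) behaviour, not by its code.

    Assumption: equal outputs on the probes are treated as equivalence,
    so collisions between truly different candidates are possible.
    """
    return tuple(round(candidate(x), ndigits) for x in probes)


class EquivalenceCache:
    """Skips re-evaluation of candidates that behave identically."""

    def __init__(self, evaluate: Callable[[Candidate], float],
                 probes: List[float]) -> None:
        self.evaluate = evaluate
        self.probes = probes
        self.scores: Dict[Tuple[float, ...], float] = {}
        self.hits = 0

    def score(self, candidate: Candidate) -> float:
        key = functional_hash(candidate, self.probes)
        if key in self.scores:
            self.hits += 1  # equivalent candidate seen before
            return self.scores[key]
        value = self.evaluate(candidate)
        self.scores[key] = value
        return value


def expensive_evaluate(candidate: Candidate) -> float:
    """Stand-in for a long training run: mean squared error against x**2."""
    grid = [i / 10 for i in range(-10, 11)]
    return sum((candidate(x) - x * x) ** 2 for x in grid) / len(grid)


cache = EquivalenceCache(expensive_evaluate, make_probes())
a = lambda x: 2.0 * x + x   # coded differently from b ...
b = lambda x: 3.0 * x       # ... but functionally the same
c = lambda x: x * x
scores = [cache.score(f) for f in (a, b, c)]
```

In this sketch `a` and `b` share one fingerprint, so `expensive_evaluate` runs only twice for three candidates. Rounding the probe outputs trades a small collision risk for robustness to floating-point noise, which is the kind of trade-off the abstract mentions when it discusses hash collisions and evaluation noise.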
-
Robust Switching Control of DC-DC Boost Converter for EV Charging Stations
Authors:
Saif Ahmad,
Ryan P. C. de Souza,
Pauline Kergus,
Zohra Kader,
Stephane Caux
Abstract:
In this work, the problem of switching control design for a DC-DC boost converter is considered, in the case of operation under an uncertain equilibrium condition arising due to perturbations in the input and load parameters. Assuming that these uncertain parameters are generated via a known linear exo-system, a parameter estimator is designed to update the equilibrium point for the switching controller in real time. In order to mitigate the noise amplification problem associated with the designed parameter estimator, the estimation error injection term is filtered via a set of first-order filters to obtain the desired level of noise suppression in the final set of estimates. To demonstrate the efficiency of the developed scheme, a realistic application scenario of a DC charging station for electric vehicles is considered, with a photovoltaic array as the source and a battery connected at the load side.
Submitted 6 December, 2022;
originally announced December 2022.
-
A Bibliometrics Analysis on 28 years of Authentication and Threat Model Area
Authors:
Wesley dos Reis Bezerra,
Cristiano Antônio de Souza,
Carla Merkle Westphall,
Carlos Becker Westphall
Abstract:
The large volume of publications in any research area can make it difficult for researchers to track their research areas' trends, challenges, and characteristics. Bibliometrics solves this problem by bringing statistical tools to help the analysis of selected publications from an online database. Although there are different works in security, our study aims to fill the bibliometric gap in the authentication and threat model area. As a result, we present a description of the dataset obtained, an overview of some selected variables, and an analysis of the ten most cited articles in this dataset, which brings together publications from the last 28 years in these areas combined.
Submitted 26 September, 2022;
originally announced September 2022.
-
Characteristics and Main Threats about Multi-Factor Authentication: A Survey
Authors:
Wesley dos Reis Bezerra,
Cristiano Antônio de Souza,
Carla Merkle Westphall,
Carlos Becker Westphall
Abstract:
This work reports a Systematic Literature Review process that provides theoretical support for research on threat models and Multi-Factor Authentication. Unlike the related works, this study aims to evaluate the main characteristics of authentication solutions and their threat models. It also intends to list characteristics, threats, and related content from the state of the art. As a result, we present a portfolio analysis through charts, figures, and tables in the discussion section.
Submitted 26 September, 2022;
originally announced September 2022.
-
CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task
Authors:
Ricardo Rei,
Marcos Treviso,
Nuno M. Guerreiro,
Chrysoula Zerva,
Ana C. Farinha,
Christine Maroti,
José G. C. de Souza,
Taisiya Glushkova,
Duarte M. Alves,
Alon Lavie,
Luisa Coheur,
André F. T. Martins
Abstract:
We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE). Our team participated in all three subtasks: (i) Sentence and Word-level Quality Prediction; (ii) Explainable QE; and (iii) Critical Error Detection. For all tasks we build on top of the COMET framework, connecting it with the predictor-estimator architecture of OpenKiwi, and equipping it with a word-level sequence tagger and an explanation extractor. Our results suggest that incorporating references during pretraining improves performance across several language pairs on downstream tasks, and that jointly training with sentence and word-level objectives yields a further boost. Furthermore, combining attention and gradient information proved to be the top strategy for extracting good explanations of sentence-level QE models. Overall, our submissions achieved the best results for all three tasks for almost all language pairs by a considerable margin.
Submitted 13 September, 2022;
originally announced September 2022.
-
Clustering Functional Data via Variational Inference
Authors:
Chengqian Xian,
Camila de Souza,
John Jewell,
Ronaldo Dias
Abstract:
Functional data analysis deals with data recorded densely over time (or any other continuum) with one or more observed curves per subject. Conceptually, functional data are continuously defined, but in practice, they are usually observed at discrete points. Among different kinds of functional data analyses, clustering analysis aims to determine underlying groups of curves in the dataset when there is no information on the group membership of each individual curve. In this work, we propose a new model-based approach for clustering and smoothing functional data simultaneously via variational inference. We derive coordinate ascent mean-field variational Bayes algorithms to approximate the posterior distribution of our model parameters by finding the variational distribution with the smallest Kullback-Leibler divergence to the posterior. The performance of our proposed method is evaluated using simulated data and publicly available datasets.
Submitted 18 January, 2023; v1 submitted 26 May, 2022;
originally announced May 2022.
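For readers unfamiliar with the objective mentioned in the Clustering Functional Data abstract above, the standard mean-field variational Bayes setup can be written as follows. The notation is an assumption for illustration ($y$ for the observed data, $\theta = (\theta_1, \dots, \theta_J)$ for the model parameters, $q$ for the variational distribution) and is not taken from the paper itself.

```latex
% Variational objective: closest member of the family Q to the posterior
q^{*} = \arg\min_{q \in \mathcal{Q}} \mathrm{KL}\bigl(q(\theta) \,\|\, p(\theta \mid y)\bigr),
\qquad
\mathrm{KL}\bigl(q \,\|\, p\bigr) = \mathbb{E}_{q}\bigl[\log q(\theta) - \log p(\theta \mid y)\bigr].

% Minimizing the KL divergence is equivalent to maximizing the ELBO
\log p(y) = \underbrace{\mathbb{E}_{q}\bigl[\log p(y, \theta) - \log q(\theta)\bigr]}_{\mathrm{ELBO}(q)}
          + \mathrm{KL}\bigl(q(\theta) \,\|\, p(\theta \mid y)\bigr).

% Mean-field factorization and the coordinate ascent update for factor j
q(\theta) = \prod_{j=1}^{J} q_j(\theta_j),
\qquad
\log q_j^{*}(\theta_j) = \mathbb{E}_{q_{-j}}\bigl[\log p(y, \theta)\bigr] + \text{const}.
```

Because $\log p(y)$ does not depend on $q$, each coordinate update that raises the ELBO lowers the KL divergence to the posterior by the same amount.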