An Experimental Study on Data Augmentation Techniques for Named Entity Recognition on Low-Resource Domains
Authors:
Arthur Elwing Torres,
Edleno Silva de Moura,
Altigran Soares da Silva,
Mario A. Nascimento,
Filipe Mesquita
Abstract:
Named Entity Recognition (NER) is a machine learning task that traditionally relies on supervised learning and annotated data. Acquiring such data is often a challenge, particularly in specialized fields such as the medical, legal, and financial sectors. These fields, which comprise long-tail entities, are commonly referred to as low-resource domains because of the scarcity of available data. To address this, data augmentation techniques are increasingly being employed to generate additional training instances from the original dataset. In this study, we evaluate the effectiveness of two prominent text augmentation techniques, Mention Replacement and Contextual Word Replacement, on two widely used NER models, Bi-LSTM+CRF and BERT. We conduct experiments on four datasets from low-resource domains, and we explore the impact of various combinations of training subset sizes and numbers of augmented examples. We not only confirm that data augmentation is particularly beneficial for smaller datasets, but also demonstrate that there is no universally optimal number of augmented examples: NER practitioners must experiment with different quantities in order to fine-tune their projects.
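The abstract names Mention Replacement without defining it. As a rough illustration, the sketch below follows the way the technique is commonly described: each entity mention is swapped, with some probability, for another mention of the same entity type drawn from the training data. The BIO tagging scheme, the function names, the `bank` structure, and the probability parameter `p` are all assumptions made for this example, not details taken from the paper.

```python
import random
from collections import defaultdict


def extract_mentions(tokens, tags):
    """Return (start, end, type) spans from BIO tags; end is exclusive."""
    spans, start, etype = [], None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # still inside the current mention
        if start is not None:
            spans.append((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def build_mention_bank(dataset):
    """Collect every mention in the training set, grouped by entity type."""
    bank = defaultdict(list)
    for tokens, tags in dataset:
        for start, end, etype in extract_mentions(tokens, tags):
            bank[etype].append(tokens[start:end])
    return bank


def mention_replacement(tokens, tags, bank, p=0.5, rng=random):
    """Replace each mention, with probability p, by a same-type mention."""
    new_tokens, new_tags, cursor = [], [], 0
    for start, end, etype in extract_mentions(tokens, tags):
        # Copy the untouched context that precedes this mention.
        new_tokens += tokens[cursor:start]
        new_tags += tags[cursor:start]
        mention = tokens[start:end]
        candidates = bank.get(etype)
        if candidates and rng.random() < p:
            mention = rng.choice(candidates)
        # Re-tag, since the replacement may differ in length.
        new_tokens += mention
        new_tags += ["B-" + etype] + ["I-" + etype] * (len(mention) - 1)
        cursor = end
    new_tokens += tokens[cursor:]
    new_tags += tags[cursor:]
    return new_tokens, new_tags
```

Calling `mention_replacement` several times per training sentence yields a chosen number of augmented examples, which is the quantity the abstract says practitioners need to tune.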
Submitted 21 November, 2024;
originally announced November 2024.
Computing a 3-role assignment is polynomial-time solvable on complementary prisms
Authors:
Diane Castonguay,
Elisângela S. Dias,
Fernanda N. Mesquita,
Julliano R. Nascimento
Abstract:
An $r$-role assignment of a simple graph $G$ is an assignment of $r$ distinct roles to the vertices of $G$, such that two vertices with the same role have the same set of roles assigned to their adjacent vertices. Furthermore, a specific $r$-role assignment defines a role graph, in which the vertices are the $r$ distinct roles, and there is an edge between two roles whenever there are two adjacent vertices in the graph $G$ that correspond to these roles. We consider complementary prisms, which are graphs formed from the disjoint union of a graph and its complement by adding the edges of a perfect matching between their corresponding vertices. In this work, we characterize the complementary prisms that do not admit a $3$-role assignment. We highlight that all of them are complementary prisms of disconnected bipartite graphs. Moreover, using our findings, we show that the problem of deciding whether a complementary prism has a $3$-role assignment can be solved in polynomial time.
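To make the definitions above concrete, the following sketch builds a complementary prism and checks whether a given labelling is an $r$-role assignment. It only illustrates the definitions: the exhaustive search in `find_role_assignment` is exponential and is not the polynomial-time method of the paper. The function names, the vertex labelling (vertex `v` of the graph and its copy `v + n` in the complement), and the use of roles `0..r-1` are assumptions made for this example.

```python
from itertools import combinations, product


def complementary_prism(n, edges):
    """Adjacency sets of the complementary prism of a graph on 0..n-1.

    Vertex v of the graph keeps label v; its copy in the complement
    is labelled v + n.
    """
    edge_set = {frozenset(e) for e in edges}
    adj = {v: set() for v in range(2 * n)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for u, v in combinations(range(n), 2):
        if frozenset((u, v)) in edge_set:
            link(u, v)          # edge of the original graph
        else:
            link(u + n, v + n)  # edge of the complement
    for v in range(n):
        link(v, v + n)          # perfect matching between the copies
    return adj


def is_role_assignment(adj, role, r):
    """Check that `role` uses all roles 0..r-1 and that vertices with the
    same role see the same set of roles among their neighbours."""
    if set(role.values()) != set(range(r)):
        return False
    neighbour_roles = {}
    for v, nbrs in adj.items():
        seen = frozenset(role[u] for u in nbrs)
        if neighbour_roles.setdefault(role[v], seen) != seen:
            return False
    return True


def find_role_assignment(adj, r):
    """Brute-force search over all labellings; for tiny graphs only."""
    vertices = sorted(adj)
    for combo in product(range(r), repeat=len(vertices)):
        role = dict(zip(vertices, combo))
        if is_role_assignment(adj, role, r):
            return role
    return None
```

For example, `complementary_prism(2, [(0, 1)])` gives the prism of a single edge, which is a path on four vertices, and `find_role_assignment` can then be used to test small values of `r` on it.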
Submitted 8 February, 2024;
originally announced February 2024.