-
Enhancing Women's Experiences in Software Engineering
Authors:
Júlia Rocha Fortunato,
Luana Ribeiro Soares,
Gabriela Silva Alves,
Edna Dias Canedo,
Fabiana Freitas Mendes
Abstract:
Context: Women face many challenges in their lives, which affect their daily experiences and influence major life decisions, starting before they enroll in bachelor's programs, setting a difficult path for those aspiring to enter the software development industry. Goal: To explore the challenges that women face across three different life stages, beginning as high school students, continuing as university undergraduates, and extending into their professional lives, as well as potential solutions to address these challenges. Research Method: We conducted a literature review followed by workshops to understand the perspectives of high school women, undergraduates, and practitioners regarding the same set of challenges and solutions identified in the literature. Results: Regardless of the life stage, women feel discouraged in a toxic environment often characterized by a lack of inclusion, harassment, and the exhausting need to prove themselves. We also discovered that some challenges are specific to certain life stages; for example, issues related to maternity were mentioned only by practitioners. Conclusions: Gender-related challenges arise before women enter the software development field when the proportion of men and women is still similar. While the need to prove themselves is mentioned at all three stages, high school women's challenges are more often directed toward convincing their parents that they are mature enough to handle their responsibilities. As they progress, the emphasis shifts to proving their competence in managing responsibilities for which they have received training. Increasing the inclusion of women in the field should, therefore, start earlier, and profound societal changes may be necessary to boost women's participation.
Submitted 6 May, 2025;
originally announced May 2025.
-
A Machine Learning Approach to Automatic Fall Detection of Soldiers
Authors:
Leandro Soares,
Gustavo Venturini,
José Gomes,
Jonathan Efigenio,
Pablo Rangel,
Pedro Gonzalez,
Joel dos Santos,
Diego Brandão,
Eduardo Bezerra
Abstract:
Military personnel and security agents often face significant physical risks during conflict and engagement situations, particularly in urban operations. Ensuring the rapid and accurate communication of incidents involving injuries is crucial for the timely execution of rescue operations. This article presents research conducted under the scope of the Brazilian Navy's "Soldier of the Future" project, focusing on the development of a Casualty Detection System to identify injuries that could incapacitate a soldier and lead to severe blood loss. The study specifically addresses the detection of soldier falls, which may indicate critical injuries such as hypovolemic hemorrhagic shock. To generate the publicly available dataset, we used smartwatches and smartphones as wearable devices to collect inertial data from soldiers during various activities, including simulated falls. The data were used to train 1D Convolutional Neural Networks (CNN1D) with the objective of accurately classifying falls that could result from life-threatening injuries. We explored different sensor placements (on the wrists and near the center of mass) and various approaches to using inertial variables, including linear and angular accelerations. The neural network models were optimized using Bayesian techniques to enhance their performance. The best-performing model and its results, discussed in this article, contribute to the advancement of automated systems for monitoring soldier safety and improving response times in engagement scenarios.
Submitted 31 January, 2025; v1 submitted 26 January, 2025;
originally announced January 2025.
-
Inspiring Women in Technology: Educational Pathways and Impact
Authors:
Larissa F. Rodrigues Moreira,
Liziane S. Soares,
Adriana Z. Martinhago
Abstract:
This paper presents initiatives aimed at fostering female involvement in the realm of computing and endeavoring to inspire more women to pursue careers in these fields. The Meninas++ Project coordinates activities at both the high school and higher education levels, facilitating dialogue between young women and computing professionals, and promoting female role models within the field. Our study demonstrated the significant impact of these activities on inspiring, empowering, and retaining female students in computing. Furthermore, higher education initiatives have fostered engagement among both women and men, promoting inclusivity, entrepreneurship, and collaboration to enhance women's representation in the computing field.
Submitted 23 December, 2024;
originally announced December 2024.
-
Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning
Authors:
Bryan L. M. de Oliveira,
Murilo L. da Luz,
Bruno Brandão,
Luana G. B. Martins,
Telma W. de L. Soares,
Luckeciano C. Melo
Abstract:
Learning effective visual representations enables agents to extract meaningful information from raw sensory inputs, which is essential for generalizing across different tasks. However, evaluating representation learning separately from policy learning remains a challenge with most reinforcement learning (RL) benchmarks. To address this gap, we introduce the Sliding Puzzles Gym (SPGym), a novel benchmark that reimagines the classic 8-tile puzzle with a visual observation space of images sourced from arbitrarily large datasets. SPGym provides precise control over representation complexity through visual diversity, allowing researchers to systematically scale the representation learning challenge while maintaining consistent environment dynamics. Despite the apparent simplicity of the task, our experiments with both model-free and model-based RL algorithms reveal fundamental limitations in current methods. As we increase visual diversity by expanding the pool of possible images, all tested algorithms show significant performance degradation, with even state-of-the-art methods struggling to generalize across different visual inputs while maintaining consistent puzzle-solving capabilities. These results highlight critical gaps in visual representation learning for RL and provide clear directions for improving robustness and generalization in decision-making systems.
Submitted 13 February, 2025; v1 submitted 17 October, 2024;
originally announced October 2024.
-
Extended Reality System for Robotic Learning from Human Demonstration
Authors:
Isaac Ngui,
Courtney McBeth,
Grace He,
André Corrêa Santos,
Luciano Soares,
Marco Morales,
Nancy M. Amato
Abstract:
Many real-world tasks are intuitive for a human to perform, but difficult to encode algorithmically when utilizing a robot to perform the tasks. In these scenarios, robotic systems can benefit from expert demonstrations to learn how to perform each task. In many settings, it may be difficult or unsafe to use a physical robot to provide these demonstrations, for example, considering cooking tasks such as slicing with a knife. Extended reality provides a natural setting for demonstrating robotic trajectories while bypassing safety concerns and providing a broader range of interaction modalities. We propose the Robot Action Demonstration in Extended Reality (RADER) system, a generic extended reality interface for learning from demonstration. We additionally present its application to an existing state-of-the-art learning from demonstration approach and show comparable results between demonstrations given on a physical robot and those given using our extended reality system.
Submitted 19 September, 2024;
originally announced September 2024.
-
Gemma 2: Improving Open Language Models at a Practical Size
Authors:
Gemma Team,
Morgane Riviere,
Shreya Pathak,
Pier Giuseppe Sessa,
Cassidy Hardin,
Surya Bhupatiraju,
Léonard Hussenot,
Thomas Mesnard,
Bobak Shahriari,
Alexandre Ramé,
Johan Ferret,
Peter Liu,
Pouya Tafti,
Abe Friesen,
Michelle Casbon,
Sabela Ramos,
Ravin Kumar,
Charline Le Lan,
Sammy Jerome,
Anton Tsitsulin,
Nino Vieillard,
Piotr Stanczyk,
Sertan Girgin,
Nikola Momchev,
Matt Hoffman,
et al. (173 additional authors not shown)
Abstract:
In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We also train the 2B and 9B models with knowledge distillation (Hinton et al., 2015) instead of next token prediction. The resulting models deliver the best performance for their size, and even offer competitive alternatives to models that are 2-3 times bigger. We release all our models to the community.
Submitted 2 October, 2024; v1 submitted 31 July, 2024;
originally announced August 2024.
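The Gemma 2 abstract above contrasts knowledge distillation with next token prediction. As a rough illustration of that contrast only, the sketch below computes both losses for a single position: the next-token loss is cross-entropy against a one-hot gold token, while the distillation loss is cross-entropy against a teacher's full distribution. The function names, the 4-token vocabulary, and the logit values are hypothetical, and no temperature or other training detail from the paper is modeled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, student_logits):
    """Cross-entropy of the student's distribution against a target distribution."""
    student_probs = softmax(student_logits)
    return -sum(
        p * math.log(q) for p, q in zip(target_probs, student_probs) if p > 0.0
    )


def next_token_loss(student_logits, gold_index):
    """Standard next-token objective: the target is one-hot on the gold token."""
    one_hot = [1.0 if i == gold_index else 0.0 for i in range(len(student_logits))]
    return cross_entropy(one_hot, student_logits)


def distillation_loss(student_logits, teacher_logits):
    """Distillation objective: the target is the teacher's soft distribution."""
    return cross_entropy(softmax(teacher_logits), student_logits)


# Hypothetical logits over a 4-token vocabulary (not taken from the paper).
teacher_logits = [2.0, 1.0, 0.1, -1.0]
student_logits = [1.5, 0.5, 0.3, -0.5]

hard = next_token_loss(student_logits, gold_index=0)
soft = distillation_loss(student_logits, teacher_logits)
print(f"next-token loss: {hard:.4f}, distillation loss: {soft:.4f}")
```

The distillation loss is bounded below by the teacher's own entropy, and it reaches that bound only when the student reproduces the teacher's distribution.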
-
Static and Dynamic Verification of OCaml Programs: The Gospel Ecosystem (Extended Version)
Authors:
Tiago Lopes Soares,
Ion Chirica,
Mário Pereira
Abstract:
We present our work on the collaborative use of dynamic and static analysis tools for the verification of software written in the OCaml language. We build upon Gospel, a specification language for OCaml that can be used both in dynamic and static analyses. We employ Ortac for runtime assertion checking, and Cameleer and CFML for the deductive verification of OCaml code. We report on the use of such tools to build a case study of collaborative analysis of a non-trivial OCaml program. This shows how these tools nicely complement each other, while at the same time highlighting the differences when writing specifications targeting dynamic or static analysis methods.
Submitted 26 July, 2024; v1 submitted 24 July, 2024;
originally announced July 2024.
-
From RAG to RICHES: Retrieval Interlaced with Sequence Generation
Authors:
Palak Jain,
Livio Baldini Soares,
Tom Kwiatkowski
Abstract:
We present RICHES, a novel approach that interleaves retrieval with sequence generation tasks. RICHES offers an alternative to conventional RAG systems by eliminating the need for a separate retriever and generator. It retrieves documents by directly decoding their contents, constrained on the corpus. Unifying retrieval with generation allows us to adapt to diverse new tasks via prompting alone. RICHES can work with any instruction-tuned model, without additional training. It provides attributed evidence, supports multi-hop retrievals and interleaves thoughts to plan what to retrieve next, all within a single decoding pass of the LLM. We demonstrate the strong performance of RICHES across ODQA tasks including attributed and multi-hop QA.
Submitted 29 June, 2024;
originally announced July 2024.
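The RICHES abstract above describes retrieving documents by decoding their contents, constrained on the corpus. A minimal sketch of that general idea follows, assuming a prefix trie as the constraint structure and a toy scoring function standing in for a language model. Both are illustrative substitutions, not the authors' implementation, and all names (`build_trie`, `constrained_greedy_decode`, `toy_score`) are hypothetical.

```python
def build_trie(passages):
    """Build a prefix trie over tokenized passages; "<eos>" marks a passage end."""
    trie = {}
    for tokens in passages:
        node = trie
        for tok in tokens:
            node = node.setdefault(tok, {})
        node["<eos>"] = {}
    return trie


def constrained_greedy_decode(trie, score):
    """Greedy decoding in which only tokens that continue some corpus passage
    are allowed at each step, so the output is always a passage from the corpus."""
    if not trie:
        raise ValueError("corpus trie is empty")
    node, output = trie, []
    while True:
        allowed = list(node.keys())  # tokens permitted by the corpus constraint
        best = max(allowed, key=lambda tok: score(output, tok))
        if best == "<eos>":
            return output
        output.append(best)
        node = node[best]


# Toy corpus and scorer (hypothetical; a real system would score with an LLM).
corpus = [
    ["paris", "is", "the", "capital", "of", "france"],
    ["berlin", "is", "the", "capital", "of", "germany"],
]
query_tokens = {"capital", "france"}


def toy_score(prefix, token):
    """Prefer tokens that also appear in the query."""
    return 1.0 if token in query_tokens else 0.0


trie = build_trie(corpus)
decoded = constrained_greedy_decode(trie, toy_score)
print(" ".join(decoded))
```

Because every step is restricted to continuations present in the trie, the decoded sequence can be attributed to an exact corpus passage. Note that this greedy toy commits to a passage at the first token, so a practical decoder would need beam search or lookahead.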
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love,
et al. (1112 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
Submitted 16 December, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee,
et al. (1326 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
Submitted 9 May, 2025; v1 submitted 18 December, 2023;
originally announced December 2023.
-
1-PAGER: One Pass Answer Generation and Evidence Retrieval
Authors:
Palak Jain,
Livio Baldini Soares,
Tom Kwiatkowski
Abstract:
We present 1-Pager, the first system that answers a question and retrieves evidence using a single Transformer-based model and decoding process. 1-Pager incrementally partitions the retrieval corpus using constrained decoding to select a document and answer string, and we show that this is competitive with comparable retrieve-and-read alternatives according to both retrieval and answer accuracy metrics. 1-Pager also outperforms the equivalent closed-book question answering model by grounding predictions in an evidence corpus. While 1-Pager is not yet on par with more expensive systems that read many more documents before generating an answer, we argue that it provides an important step toward attributed generation by folding retrieval into the sequence-to-sequence paradigm that is currently dominant in NLP. We also show that the search paths used to partition the corpus are easy to read and understand, paving a way forward for interpretable neural retrieval.
Submitted 25 October, 2023;
originally announced October 2023.
-
Calibrating Likelihoods towards Consistency in Summarization Models
Authors:
Polina Zablotskaia,
Misha Khalman,
Rishabh Joshi,
Livio Baldini Soares,
Shoshana Jakobovits,
Joshua Maynez,
Shashi Narayan
Abstract:
Despite the recent advances in abstractive text summarization, current summarization models still suffer from generating factually inconsistent summaries, reducing their utility for real-world application. We argue that the main reason for such behavior is that the summarization models trained with a maximum likelihood objective assign high probability to plausible sequences given the context, but they often do not accurately rank sequences by their consistency. In this work, we solve this problem by calibrating the likelihood of model-generated sequences to better align with a consistency metric measured by natural language inference (NLI) models. The human evaluation study and automatic metrics show that the calibrated models generate more consistent and higher-quality summaries. We also show that the models trained using our method return probabilities that are better aligned with the NLI scores, which significantly increases the reliability of summarization models.
Submitted 12 October, 2023;
originally announced October 2023.
-
NAIL: Lexical Retrieval Indices with Efficient Non-Autoregressive Decoders
Authors:
Livio Baldini Soares,
Daniel Gillick,
Jeremy R. Cole,
Tom Kwiatkowski
Abstract:
Neural document rerankers are extremely effective in terms of accuracy. However, the best models require dedicated hardware for serving, which is costly and often not feasible. To avoid this serving-time requirement, we present a method of capturing up to 86% of the gains of a Transformer cross-attention model with a lexicalized scoring function that only requires $10^{-6}$% of the Transformer's FLOPs per document and can be served using commodity CPUs. When combined with a BM25 retriever, this approach matches the quality of a state-of-the-art dual encoder retriever, which still requires an accelerator for query encoding. We introduce NAIL (Non-Autoregressive Indexing with Language models) as a model architecture that is compatible with recent encoder-decoder and decoder-only large language models, such as T5, GPT-3 and PaLM. This model architecture can leverage existing pre-trained checkpoints and can be fine-tuned for efficiently constructing document representations that do not require neural processing of queries.
Submitted 23 October, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Evaluating and Modeling Attribution for Cross-Lingual Question Answering
Authors:
Benjamin Muller,
John Wieting,
Jonathan H. Clark,
Tom Kwiatkowski,
Sebastian Ruder,
Livio Baldini Soares,
Roee Aharoni,
Jonathan Herzig,
Xinyi Wang
Abstract:
Trustworthy answer content is abundant in many high-resource languages and is instantly accessible through question answering systems, yet this content can be hard to access for those that do not speak these languages. The leap forward in cross-lingual modeling quality offered by generative language models offers much promise, yet their raw generations often fall short in factuality. To improve trustworthiness in these systems, a promising direction is to attribute the answer to a retrieved source, possibly in a content-rich language different from the query. Our work is the first to study attribution for cross-lingual question answering. First, we collect data in 5 languages to assess the attribution level of a state-of-the-art cross-lingual QA system. To our surprise, we find that a substantial portion of the answers is not attributable to any retrieved passages (up to 50% of answers exactly matching a gold reference) despite the system being able to attend directly to the retrieved text. Second, to address this poor attribution level, we experiment with a wide range of attribution detection techniques. We find that Natural Language Inference models and PaLM 2 fine-tuned on a very small amount of attribution data can accurately detect attribution. Based on these models, we improve the attribution level of a cross-lingual question-answering system. Overall, we show that current academic generative cross-lingual QA systems have substantial shortcomings in attribution and we build tooling to mitigate these issues.
Submitted 15 November, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Investigating the Perceived Impact of Maternity on Software Engineering: a Women's Perspective
Authors:
Larissa Soares,
Edna Canedo,
Claudia Pereira,
Carla Bezerra,
Fabiana Mendes
Abstract:
Background: Several researchers report the impact of gender on software development teams, especially in relation to women. In general, women are under-represented on these teams and face challenges and difficulties in their workplaces. When it comes to women who are mothers, these challenges can be amplified and directly impact these women's professional lives, both in industry and academia. However, little is known about women ICT practitioners' perceptions of the challenges of maternity in their professional careers. Objective: This paper investigates mothers' challenges and difficulties in global software development teams. Method: We conducted a survey with women ICT practitioners who work in academia and global technology companies. We surveyed 141 mothers from different countries and employed mixed methods to analyze the data. Results: Our findings reveal that women face sociocultural challenges, including work-life balance issues, bad jokes, and moral harassment. Furthermore, few women occupy leadership positions in software teams, and most reported that they did not have a support network during and after maternity leave, feeling overloaded. The surveyed women suggested a set of actions to reduce the challenges they face in their workplaces, such as: i) changing culture; ii) creating a code of conduct for men; iii) more empathy; iv) creating childcare within companies; and v) creating opportunities/programs for women in the software industry and academia. Conclusion: In addition to being underrepresented in ICT roles, women also face many challenges in one important phase of their lives: maternity. Our findings explore these challenges and can help organizations develop policies to minimize them. Furthermore, they can help raise awareness among co-workers and bosses, toward a more friendly and inclusive workplace.
Submitted 24 April, 2023;
originally announced April 2023.
-
Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models
Authors:
Bernd Bohnet,
Vinh Q. Tran,
Pat Verga,
Roee Aharoni,
Daniel Andor,
Livio Baldini Soares,
Massimiliano Ciaramita,
Jacob Eisenstein,
Kuzman Ganchev,
Jonathan Herzig,
Kai Hui,
Tom Kwiatkowski,
Ji Ma,
Jianmo Ni,
Lierni Sestorain Saralegui,
Tal Schuster,
William W. Cohen,
Michael Collins,
Dipanjan Das,
Donald Metzler,
Slav Petrov,
Kellie Webster
Abstract:
Large language models (LLMs) have shown impressive results while requiring little or no direct supervision. Further, there is mounting evidence that LLMs may have potential in information-seeking scenarios. We believe the ability of an LLM to attribute the text that it generates is likely to be crucial in this setting. We formulate and study Attributed QA as a key first step in the development of attributed LLMs. We propose a reproducible evaluation framework for the task and benchmark a broad set of architectures. We take human annotations as a gold standard and show that a correlated automatic metric is suitable for development. Our experimental work gives concrete answers to two key questions (How to measure attribution?, and How well do current state-of-the-art methods perform on attribution?), and gives some hints as to how to address a third (How to build LLMs with attribution?).
Submitted 10 February, 2023; v1 submitted 15 December, 2022;
originally announced December 2022.
-
Relict landslide detection using Deep-Learning architectures for image segmentation in rainforest areas: A new framework
Authors:
Guilherme P. B. Garcia,
Carlos H. Grohmann,
Lucas P. Soares,
Mateus Espadoto
Abstract:
Landslides are destructive and recurrent natural disasters on steep slopes and represent a risk to lives and properties. Knowledge of the location of relict landslides is vital to understand their mechanisms, update inventory maps and improve risk assessment. However, relict landslide mapping is complex in tropical regions covered with rainforest vegetation. A new CNN framework is proposed for semi-automatic detection of relict landslides, which uses a dataset generated by a k-means clustering algorithm and has a pre-training step. The weights computed in the pre-training are used to fine-tune the CNN training process. A comparison between the proposed and the standard framework is performed using CBERS-04A WPM images. Three CNNs for semantic segmentation are used (Unet, FPN, Linknet) with two augmented datasets. A total of 42 combinations of CNNs are tested. Values of precision and recall were very similar between the combinations tested. Recall was higher than 75% for every combination, but precision values were usually smaller than 20%. False positive (FP) samples were identified as the cause of these low precision values. Predictions of the proposed framework were more accurate and correctly detected more landslides. This work demonstrates that there are limitations for detecting relict landslides in areas covered with rainforest, mainly related to similarities between the spectral response of pastures and deforested areas with Gleichenella sp. ferns, commonly used as an indicator of landslide scars.
Submitted 29 May, 2023; v1 submitted 4 August, 2022;
originally announced August 2022.
-
Refactoring Assertion Roulette and Duplicate Assert test smells: a controlled experiment
Authors:
Railana Santana,
Luana Martins,
Tássio Virgínio,
Larissa Soares,
Heitor Costa,
Ivan Machado
Abstract:
Test smells can reduce the developers' ability to interact with the test code. Refactoring test code offers a safe strategy to handle test smells. However, the manual refactoring activity is not a trivial process, and it is often tedious and error-prone. This study aims to evaluate RAIDE, a tool for automatic identification and refactoring of test smells. We present an empirical assessment of RAIDE, in which we analyzed its capability at refactoring Assertion Roulette and Duplicate Assert test smells and compared the results against both manual refactoring and a state-of-the-art approach. The results show that RAIDE provides a faster and more intuitive approach for handling test smells than using an automated tool for smells detection combined with manual refactoring.
Submitted 12 July, 2022;
originally announced July 2022.
-
Scaling Up Models and Data with $\texttt{t5x}$ and $\texttt{seqio}$
Authors:
Adam Roberts,
Hyung Won Chung,
Anselm Levskaya,
Gaurav Mishra,
James Bradbury,
Daniel Andor,
Sharan Narang,
Brian Lester,
Colin Gaffney,
Afroz Mohiuddin,
Curtis Hawthorne,
Aitor Lewkowycz,
Alex Salcianu,
Marc van Zee,
Jacob Austin,
Sebastian Goodman,
Livio Baldini Soares,
Haitang Hu,
Sasha Tsvyashchenko,
Aakanksha Chowdhery,
Jasmijn Bastings,
Jannis Bulian,
Xavier Garcia,
Jianmo Ni,
Andrew Chen
, et al. (18 additional authors not shown)
Abstract:
Recent neural network-based language models have benefited greatly from scaling up the size of training datasets and the number of parameters in the models themselves. Scaling can be complicated due to various factors including the need to distribute computation on supercomputer clusters (e.g., TPUs), prevent bottlenecks when infeeding data, and ensure reproducible results. In this work, we present two software libraries that ease these issues: $\texttt{t5x}$ simplifies the process of building and training large language models at scale while maintaining ease of use, and $\texttt{seqio}$ provides a task-based API for simple creation of fast and reproducible training data and evaluation pipelines. These open-source libraries have been used to train models with hundreds of billions of parameters on datasets with multiple terabytes of training data.
Along with the libraries, we release configurations and instructions for T5-like encoder-decoder models as well as GPT-like decoder-only architectures.
$\texttt{t5x}$ and $\texttt{seqio}$ are open source and available at https://github.com/google-research/t5x and https://github.com/google/seqio, respectively.
Submitted 31 March, 2022;
originally announced March 2022.
-
Exploring the acceptability of digital contact tracing for UK students
Authors:
Dave Murray-Rust,
Luis Soares,
Katya Gorkovenko,
John Rooksby
Abstract:
Contact tracing systems control the spread of disease by discovering the set of people an infectious individual has come into contact with. Students are often mobile and sociable and therefore can contribute to the spread of disease. Controls on the movement of students studying in the UK were put in place during the Covid-19 pandemic, and some restrictions may be necessary over several years. App based digital contact tracing may help ease restrictions by enabling students to make informed decisions and take precautions. However, designing for the end user acceptability of these apps remains under-explored. This study with 22 students from UK Universities (inc. 11 international students) uses a fictional user interface to prompt in-depth interviews on the acceptability of contact tracing tools. We explore intended uptake, usage and compliance with contact tracing apps, finding students are positive, although concerned about privacy, security, and burden of participating.
Submitted 21 January, 2022;
originally announced January 2022.
-
On the Privacy of National Contact Tracing COVID-19 Applications: The Coronavírus-SUS Case
Authors:
Jéferson Campos Nobre,
Laura Rodrigues Soares,
Briggette Olenka Roman Huaytalla,
Elvandi da Silva Júnior,
Lisandro Zambenedetti Granville
Abstract:
The 2019 Coronavirus disease (COVID-19) pandemic, caused by the rapid spread of the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), has had a deep impact worldwide, both in terms of the loss of human life and the economic and social disruption. The use of digital technologies has been seen as an important effort to combat the pandemic, and one such technology is contact tracing applications. These applications were successfully employed against other infectious diseases, and thus they have been used during the current pandemic. However, the use of contact tracing poses several privacy concerns, since it is necessary to store and process data that can lead to user/device identification as well as location and behavior tracking. These concerns are even more relevant when considering nationwide implementations, since they can lead to mass surveillance by authoritarian governments. Despite the restrictions imposed by data protection laws in several countries, there are still doubts about the preservation of users' privacy. In this article, we analyze the privacy features of national contact tracing COVID-19 applications considering their intrinsic characteristics. As a case study, we discuss in more depth the Brazilian COVID-19 application Coronavírus-SUS, since Brazil is one of the countries most impacted by the current pandemic. Finally, as we believe contact tracing will continue to be employed as part of the strategy for the current and potential future pandemics, we present key research challenges.
Submitted 2 August, 2021;
originally announced August 2021.
-
Remote Pathological Gait Classification System
Authors:
Pedro Albuquerque,
Joao Machado,
Tanmay Tulsidas Verlekar,
Luis Ducla Soares,
Paulo Lobato Correia
Abstract:
Several pathologies can alter the way people walk, i.e. their gait. Gait analysis can therefore be used to detect impairments and help diagnose illnesses and assess patient recovery. Using vision-based systems, diagnoses could be done at home or in a clinic, with the needed computation being done remotely. State-of-the-art vision-based gait analysis systems use deep learning, requiring large datasets for training. However, to the best of our knowledge, the biggest publicly available pathological gait dataset contains only 10 subjects, simulating 4 gait pathologies. This paper presents a new dataset called GAIT-IT, captured from 21 subjects simulating 4 gait pathologies, with 2 severity levels, besides normal gait. It is considerably larger than publicly available gait pathology datasets, which makes it possible to train a deep learning model for gait pathology classification. Moreover, it was recorded in a professional studio, making it possible to obtain nearly perfect silhouettes, free of segmentation errors. Recognizing the importance of remote healthcare, this paper proposes a prototype of a web application that allows users to upload a video of a walking person, possibly acquired using a smartphone camera, and execute a web service that classifies the person's gait as normal or across different pathologies. The web application has a user-friendly interface and could be used by healthcare professionals or other end users. An automatic gait analysis system is also developed and integrated with the web application for pathology classification. Compared to state-of-the-art solutions, it achieves a drastic reduction in the number of model parameters, which means significantly lower memory requirements, as well as lower training and execution times. Classification accuracy is on par with the state-of-the-art.
Submitted 4 May, 2021;
originally announced May 2021.
-
Evaluating Explanations: How much do explanations from the teacher aid students?
Authors:
Danish Pruthi,
Rachit Bansal,
Bhuwan Dhingra,
Livio Baldini Soares,
Michael Collins,
Zachary C. Lipton,
Graham Neubig,
William W. Cohen
Abstract:
While many methods purport to explain predictions by highlighting salient features, what aims these explanations serve and how they ought to be evaluated often go unstated. In this work, we introduce a framework to quantify the value of explanations via the accuracy gains that they confer on a student model trained to simulate a teacher model. Crucially, the explanations are available to the student during training, but are not available at test time. Compared to prior proposals, our approach is less easily gamed, enabling principled, automatic, model-agnostic evaluation of attributions. Using our framework, we compare numerous attribution methods for text classification and question answering, and observe quantitative differences that are consistent (to a moderate to high degree) across different student model architectures and learning strategies.
Submitted 16 December, 2021; v1 submitted 1 December, 2020;
originally announced December 2020.
-
A Deductive Verification Framework For Higher Order Programs
Authors:
Tiago Lopes Soares
Abstract:
In this report, we present the preliminary work developed for our research project for the APDC (Área Prática de Desenvolvimento Curricular) course. The main goal of this project is to develop a framework, on top of the Why3 tool, for the verification of effectful higher-order programs. We use defunctionalization as an intermediate transformation from higher-order OCaml implementations into first-order ones. The target for our translation is WhyML, Why3's programming language. We believe defunctionalization can be an interesting route for the automated verification of higher-order programs, since one can employ off-the-shelf automated program verifiers to prove the correctness of the generated first-order program. This report also serves to introduce the reader to the subject of deductive program verification and some of the tools and concepts used to prove higher-order effectful programs.
Submitted 27 November, 2020;
originally announced November 2020.
-
QED: A Framework and Dataset for Explanations in Question Answering
Authors:
Matthew Lamm,
Jennimaria Palomaki,
Chris Alberti,
Daniel Andor,
Eunsol Choi,
Livio Baldini Soares,
Michael Collins
Abstract:
A question answering system that in addition to providing an answer provides an explanation of the reasoning that leads to that answer has potential advantages in terms of debuggability, extensibility and trust. To this end, we propose QED, a linguistically informed, extensible framework for explanations in question answering. A QED explanation specifies the relationship between a question and answer according to formal semantic notions such as referential equality, sentencehood, and entailment. We describe and publicly release an expert-annotated dataset of QED explanations built upon a subset of the Google Natural Questions dataset, and report baseline models on two tasks -- post-hoc explanation generation given an answer, and joint question answering and explanation generation. In the joint setting, a promising result suggests that training on a relatively small amount of QED data can improve question answering. In addition to describing the formal, language-theoretic motivations for the QED approach, we describe a large user study showing that the presence of QED explanations significantly improves the ability of untrained raters to spot errors made by a strong neural QA baseline.
Submitted 8 September, 2020;
originally announced September 2020.
-
Landslide Segmentation with U-Net: Evaluating Different Sampling Methods and Patch Sizes
Authors:
Lucas P. Soares,
Helen C. Dias,
Carlos H. Grohmann
Abstract:
Landslide inventory maps are crucial to validate predictive landslide models; however, since most mapping methods rely on visual interpretation or expert knowledge, detailed inventory maps are still lacking. This study used a fully convolutional deep learning model named U-net to automatically segment landslides in the city of Nova Friburgo, located in the mountainous range of Rio de Janeiro, southeastern Brazil. The objective was to evaluate the impact of patch sizes, sampling methods, and datasets on the overall accuracy of the models. The training data used the optical information from the RapidEye satellite and a digital elevation model (DEM) derived from the L-band sensor of the ALOS satellite. The data was sampled using random and regular grid methods and patched in three sizes (32x32, 64x64, and 128x128 pixels). The models were evaluated on two areas with precision, recall, f1-score, and mean intersection over union (mIoU) metrics. The results show that the models trained with 32x32 tiles tend to have higher recall values due to higher true positive rates; however, they misclassify more background areas as landslides (false positives). Models trained with 128x128 tiles usually achieve higher precision values because they make fewer false positive errors. In both test areas, DEM and augmentation increased the accuracy of the models. Random sampling helped in model generalization. Models trained with 128x128 random tiles from the data that used the RapidEye image, DEM information, and augmentation achieved the highest f1-score, 0.55 in test area one, and 0.58 in test area two. The results achieved in this study are comparable to other fully convolutional models found in the literature, increasing the knowledge in the area.
Submitted 13 July, 2020;
originally announced July 2020.
-
Automatic Detection of COVID-19 Cases on X-ray images Using Convolutional Neural Networks
Authors:
Lucas P. Soares,
Cesar P. Soares
Abstract:
In recent months the world has been surprised by the rapid advance of COVID-19. In order to face this disease and minimize its socio-economic impacts, in addition to surveillance and treatment, diagnosis is a crucial procedure. However, diagnosis is hampered by delays and limited access to laboratory tests, demanding new strategies to carry out case triage. In this scenario, deep learning models are being proposed as a possible option to assist the diagnostic process based on chest X-ray and computed tomography images. Therefore, this research aims to automate the process of detecting COVID-19 cases from chest images, using convolutional neural networks (CNN) through deep learning techniques. The results can contribute to expanding access to other forms of detection of COVID-19 and to speeding up the process of identifying this disease. All databases used, the code written, and the results obtained from the models' training are available for open access. This facilitates the involvement of other researchers in enhancing these models, since this can contribute to the improvement of results and, consequently, to progress in confronting COVID-19.
Submitted 1 July, 2020;
originally announced July 2020.
-
Facts as Experts: Adaptable and Interpretable Neural Memory over Symbolic Knowledge
Authors:
Pat Verga,
Haitian Sun,
Livio Baldini Soares,
William W. Cohen
Abstract:
Massive language models are the core of modern NLP modeling and have been shown to encode impressive amounts of commonsense and factual information. However, that knowledge exists only within the latent parameters of the model, inaccessible to inspection and interpretation, and even worse, factual information memorized from the training corpora is likely to become stale as the world changes. Knowledge stored as parameters will also inevitably exhibit all of the biases inherent in the source materials. To address these problems, we develop a neural language model that includes an explicit interface between symbolically interpretable factual information and subsymbolic neural knowledge. We show that this model dramatically improves performance on two knowledge-intensive question-answering tasks. More interestingly, the model can be updated without re-training by manipulating its symbolic representations. In particular this model allows us to add new facts and overwrite existing ones in ways that are not possible for earlier models.
Submitted 1 July, 2020;
originally announced July 2020.
-
Empirical Evaluation of Pretraining Strategies for Supervised Entity Linking
Authors:
Thibault Févry,
Nicholas FitzGerald,
Livio Baldini Soares,
Tom Kwiatkowski
Abstract:
In this work, we present an entity linking model which combines a Transformer architecture with large scale pretraining from Wikipedia links. Our model achieves the state-of-the-art on two commonly used entity linking datasets: 96.7% on CoNLL and 94.9% on TAC-KBP. We present detailed analyses to understand what design choices are important for entity linking, including choices of negative entity candidates, Transformer architecture, and input perturbations. Lastly, we present promising results on more challenging settings such as end-to-end entity linking and entity linking without in-domain training data.
Submitted 28 May, 2020;
originally announced May 2020.
-
New Protocols and Negative Results for Textual Entailment Data Collection
Authors:
Samuel R. Bowman,
Jennimaria Palomaki,
Livio Baldini Soares,
Emily Pitler
Abstract:
Natural language inference (NLI) data has proven useful in benchmarking and, especially, as pretraining data for tasks requiring language understanding. However, the crowdsourcing protocol that was used to collect this data has known issues and was not explicitly optimized for either of these purposes, so it is likely far from ideal. We propose four alternative protocols, each aimed at improving either the ease with which annotators can produce sound training examples or the quality and diversity of those examples. Using these alternatives and a fifth baseline protocol, we collect and compare five new 8.5k-example training sets. In evaluations focused on transfer learning applications, our results are solidly negative, with models trained on our baseline dataset yielding good transfer performance to downstream tasks, but none of our four new methods (nor the recent ANLI) showing any improvements over that baseline. In a small silver lining, we observe that all four new protocols, especially those where annotators edit pre-filled text boxes, reduce previously observed issues with annotation artifacts.
Submitted 29 September, 2020; v1 submitted 24 April, 2020;
originally announced April 2020.
-
Entities as Experts: Sparse Memory Access with Entity Supervision
Authors:
Thibault Févry,
Livio Baldini Soares,
Nicholas FitzGerald,
Eunsol Choi,
Tom Kwiatkowski
Abstract:
We focus on the problem of capturing declarative knowledge about entities in the learned parameters of a language model. We introduce a new model - Entities as Experts (EAE) - that can access distinct memories of the entities mentioned in a piece of text. Unlike previous efforts to integrate entity knowledge into sequence models, EAE's entity representations are learned directly from text. We show that EAE's learned representations capture sufficient knowledge to answer TriviaQA questions such as "Which Dr. Who villain has been played by Roger Delgado, Anthony Ainley, Eric Roberts?", outperforming an encoder-generator Transformer model with 10x the parameters. According to the LAMA knowledge probes, EAE contains more factual knowledge than a similarly sized BERT, as well as previous approaches that integrate external sources of entity knowledge. Because EAE associates parameters with specific entities, it only needs to access a fraction of its parameters at inference time, and we show that the correct identification and representation of entities is essential to EAE's performance.
Submitted 6 October, 2020; v1 submitted 15 April, 2020;
originally announced April 2020.
-
Learning Cross-Context Entity Representations from Text
Authors:
Jeffrey Ling,
Nicholas FitzGerald,
Zifei Shan,
Livio Baldini Soares,
Thibault Févry,
David Weiss,
Tom Kwiatkowski
Abstract:
Language modeling tasks, in which words, or word-pieces, are predicted on the basis of a local context, have been very effective for learning word embeddings and context dependent representations of phrases. Motivated by the observation that efforts to code world knowledge into machine readable knowledge bases or human readable encyclopedias tend to be entity-centric, we investigate the use of a fill-in-the-blank task to learn context independent representations of entities from the text contexts in which those entities were mentioned. We show that large scale training of neural models allows us to learn high quality entity representations, and we demonstrate successful results on four domains: (1) existing entity-level typing benchmarks, including a 64% error reduction over previous work on TypeNet (Murty et al., 2018); (2) a novel few-shot category reconstruction task; (3) existing entity linking benchmarks, where we match the state-of-the-art on CoNLL-Aida without linking-specific features and obtain a score of 89.8% on TAC-KBP 2010 without using any alias table, external knowledge base or in domain training data and (4) answering trivia questions, which uniquely identify entities. Our global entity representations encode fine-grained type categories, such as Scottish footballers, and can answer trivia questions such as: Who was the last inmate of Spandau jail in Berlin?
Submitted 11 January, 2020;
originally announced January 2020.
-
Matching the Blanks: Distributional Similarity for Relation Learning
Authors:
Livio Baldini Soares,
Nicholas FitzGerald,
Jeffrey Ling,
Tom Kwiatkowski
Abstract:
General purpose relation extractors, which can model arbitrary relations, are a core aspiration in information extraction. Efforts have been made to build general purpose extractors that represent relations with their surface forms, or which jointly embed surface forms with relations from an existing knowledge graph. However, both of these approaches are limited in their ability to generalize. In this paper, we build on extensions of Harris' distributional hypothesis to relations, as well as recent advances in learning text representations (specifically, BERT), to build task agnostic relation representations solely from entity-linked text. We show that these representations significantly outperform previous work on exemplar based relation extraction (FewRel) even without using any of that task's training data. We also show that models initialized with our task agnostic representations, and then tuned on supervised relation extraction datasets, significantly outperform the previous methods on SemEval 2010 Task 8, KBP37, and TACRED.
Submitted 7 June, 2019;
originally announced June 2019.
-
A simple centrality index for scientific social recognition
Authors:
Osame Kinouchi,
Leonardo D. H. Soares,
George C. Cardoso
Abstract:
We introduce a new centrality index for the bipartite network of papers and authors that we call the $K$-index. The $K$-index grows with the citation performance of the papers that cite a given researcher and can be seen as a measure of scientific social recognition. Indeed, the $K$-index measures the number of hubs, defined in a self-consistent way in the bipartite network, that cite a given author. We show that the $K$-index can be computed by simple inspection of the Web of Science platform and presents several advantages over other centrality indexes, in particular the Hirsch $h$-index. The $K$-index is robust to self-citations, is not limited by the total number of papers published by a researcher, as occurs for the $h$-index, and can distinguish in a consistent way researchers that have the same $h$-index but very different scientific social recognition. The $K$-index easily detects a known case of a researcher with an inflated number of papers, citations and $h$-index due to scientific misconduct. Finally, we show that, in a sample of twenty-eight physics Nobel laureates and twenty-eight highly cited non-Nobel-laureate physicists, the $K$-index correlates better with the achievement of the prize than the number of papers, citations, citations per paper, citing articles or the $h$-index. Clustering researchers in a $K$ versus $h$ plot reveals interesting outliers that suggest that these two indexes can present complementary independent information.
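The "self-consistent" hub count in this abstract can be sketched in a few lines, under one stated assumption: that a researcher has index $K$ when $K$ of the papers citing them have at least $K$ citations each. That reading is ours, and the paper's text should be consulted for the exact definition. The function names and the citation counts below are hypothetical.

```python
def h_type_index(counts):
    """Largest k such that at least k of the values are >= k."""
    ranked = sorted(counts, reverse=True)
    # In a descending list, value >= rank holds for a prefix only,
    # so counting the ranks that satisfy it gives the index.
    return sum(1 for rank, value in enumerate(ranked, start=1) if value >= rank)


def h_index(own_paper_citations):
    """Hirsch h-index: computed over the researcher's own papers."""
    return h_type_index(own_paper_citations)


def k_index(citing_paper_citations):
    """Assumed K-index: computed over the papers that cite the researcher."""
    return h_type_index(citing_paper_citations)


# Hypothetical data: two researchers with identical publication records
# but very different citing papers.
own_papers = [5, 4, 3, 3, 1]              # citations received by each own paper
citing_a = [50, 40, 30, 20, 10, 5]        # citations of the papers citing researcher A
citing_b = [2, 1, 1, 0, 0, 0]             # citations of the papers citing researcher B

print("h-index (both):", h_index(own_papers))
print("K-index of A:", k_index(citing_a))
print("K-index of B:", k_index(citing_b))
```

In this made-up case both researchers share an h-index of 3, while the assumed K-index separates them (5 versus 1), which is the kind of distinction the abstract attributes to the index.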
Submitted 28 September, 2017; v1 submitted 16 September, 2016;
originally announced September 2016.
-
Fault Analysis Using Gegenbauer Multiresolution Analysis
Authors:
L. R. Soares,
H. M. de Oliveira
Abstract:
This paper applies multiresolution analysis to fault analysis on transmission lines. Faults were simulated using the ATP (Alternative Transient Program), considering signals at 128/cycle. A nonorthogonal multiresolution analysis was provided by Gegenbauer scaling and wavelet filters. In cases where signal reconstruction is not required, orthogonality may be immaterial. Gegenbauer filter banks are therefore offered in this paper as a tool for analyzing fault signals on transmission lines. Results are compared to those derived from a 4-coefficient Daubechies filter. The main advantages of Gegenbauer filters are their lower computational effort and their constant group delay, as they are symmetric filters.
Submitted 12 February, 2015;
originally announced March 2015.
-
Lobby index as a network centrality measure
Authors:
Monica G. Campiteli,
Adriano J. Holanda,
Leonardo D. H. Soares,
Paulo R. C. Soles,
Osame Kinouchi
Abstract:
We study the lobby index (l-index for short) as a local node centrality measure for complex networks. The l-index is compared with degree (a local measure) and with betweenness and Eigenvector centralities (two global measures) in the case of a biological network (the Yeast protein-protein interaction network) and a linguistic network (Moby Thesaurus II). In both networks, the l-index correlates poorly with betweenness but correlates with degree and Eigenvector centrality. As a local measure, the l-index has two advantages: it carries more information about a node's neighbors than degree centrality does, and it requires less time to compute than Eigenvector centrality. Results suggest that the l-index produces better results than the degree and Eigenvector measures for ranking purposes, making it a suitable tool for this task.
Submitted 26 June, 2013; v1 submitted 29 April, 2013;
originally announced April 2013.
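To illustrate why the l-index counts as a local measure, here is a minimal sketch assuming the usual definition from the lobby-index literature: the l-index of a node is the largest integer k such that the node has at least k neighbors of degree at least k. The abstract itself does not spell out this definition, and the adjacency-dictionary format and the name `lobby_index` are hypothetical.

```python
def lobby_index(adjacency, node):
    """Return the largest k such that `node` has at least k neighbors
    whose degree is at least k.

    Assumption: `adjacency` maps every node of an undirected graph
    to the set of its neighbors.
    """
    # Only the degrees of the immediate neighbors are needed,
    # which is what makes the measure local.
    neighbor_degrees = sorted(
        (len(adjacency[n]) for n in adjacency[node]), reverse=True
    )
    k = 0
    for rank, degree in enumerate(neighbor_degrees, start=1):
        if degree >= rank:
            k = rank
        else:
            break
    return k


# Small undirected example graph (illustrative data only).
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
    "e": set(),
}
```

In this sketch the cost per node depends only on its number of neighbors, whereas Eigenvector centrality requires an iteration over the whole network.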