-
Beyond Text Compression: Evaluating Tokenizers Across Scales
Authors:
Jonas F. Lotz,
António V. Lopes,
Stephan Peitz,
Hendra Setiawan,
Leonardo Emili
Abstract:
The choice of tokenizer can profoundly impact language model performance, yet accessible and reliable evaluations of tokenizer quality remain an open challenge. Inspired by scaling consistency, we show that smaller models can accurately predict significant differences in tokenizer impact on larger models at a fraction of the compute cost. By systematically evaluating both English-centric and multilingual tokenizers, we find that tokenizer choice has negligible effects on tasks in English but results in consistent performance differences in multilingual settings. We propose new intrinsic tokenizer metrics inspired by Zipf's law that correlate more strongly with downstream performance than text compression when modeling unseen languages. By combining several metrics to capture multiple aspects of tokenizer behavior, we develop a reliable framework for intrinsic tokenizer evaluations. Our work offers a more efficient path to informed tokenizer selection in future language model development.
Submitted 3 June, 2025;
originally announced June 2025.
-
Memorization: A Close Look at Books
Authors:
Iris Ma,
Ian Domingo,
Alberto Krone-Martins,
Pierre Baldi,
Cristina V. Lopes
Abstract:
To what extent can entire books be extracted from LLMs? Using the Llama 3 70B family of models, and the "prefix-prompting" extraction technique, we were able to auto-regressively reconstruct, with a very high level of similarity, one entire book (Alice's Adventures in Wonderland) from just the first 500 tokens. We were also able to obtain high extraction rates on several other books, piece-wise. However, these successes do not extend uniformly to all books. We show that extraction rates of books correlate with book popularity and thus, likely duplication in the training data.
We also confirm the undoing of mitigations in the instruction-tuned Llama 3.1, following recent work (Nasr et al., 2025). We further find that this undoing comes from changes to only a tiny fraction of weights concentrated primarily in the lower transformer blocks. Our results provide evidence of the limits of current regurgitation mitigation strategies and introduce a framework for studying how fine-tuning affects the retrieval of verbatim memorization in aligned LLMs.
Submitted 16 April, 2025;
originally announced April 2025.
-
Self-ReS: Self-Reflection in Large Vision-Language Models for Long Video Understanding
Authors:
Joao Pereira,
Vasco Lopes,
David Semedo,
Joao Neves
Abstract:
Large Vision-Language Models (LVLMs) demonstrate remarkable performance in short-video tasks such as video question answering, but struggle in long-video understanding. The linear frame sampling strategy, conventionally used by LVLMs, fails to account for the non-linear distribution of key events in video data, often introducing redundant or irrelevant information in longer contexts while risking the omission of critical events in shorter ones. To address this, we propose SelfReS, a non-linear spatiotemporal self-reflective sampling method that dynamically selects key video fragments based on user prompts. Unlike prior approaches, SelfReS leverages the inherently sparse attention maps of LVLMs to define reflection tokens, enabling relevance-aware token selection without requiring additional training or external modules. Experiments demonstrate that SelfReS can be seamlessly integrated into strong base LVLMs, improving long-video task accuracy and achieving up to 46% faster inference speed within the same GPU memory budget.
Submitted 26 March, 2025;
originally announced March 2025.
-
Zero-Shot Action Recognition in Surveillance Videos
Authors:
Joao Pereira,
Vasco Lopes,
David Semedo,
Joao Neves
Abstract:
The growing demand for surveillance in public spaces presents significant challenges due to the shortage of human resources. Current AI-based video surveillance systems heavily rely on core computer vision models that require extensive finetuning, which is particularly difficult in surveillance settings due to limited datasets and difficult settings (viewpoint, low quality, etc.). In this work, we propose leveraging Large Vision-Language Models (LVLMs), known for their strong zero- and few-shot generalization, to tackle video understanding tasks in surveillance. Specifically, we explore VideoLLaMA2, a state-of-the-art LVLM, and an improved token-level sampling method, Self-Reflective Sampling (Self-ReS). Our experiments on the UCF-Crime dataset show that VideoLLaMA2 represents a significant leap in zero-shot performance, with a 20% boost over the baseline. Self-ReS additionally increases zero-shot action recognition performance to 44.6%. These results highlight the potential of LVLMs, paired with improved sampling techniques, for advancing surveillance video analysis in diverse scenarios.
Submitted 18 March, 2025; v1 submitted 28 October, 2024;
originally announced October 2024.
-
Integrating AI Tutors in a Programming Course
Authors:
Iris Ma,
Alberto Krone Martins,
Cristina Videira Lopes
Abstract:
RAGMan is an LLM-powered tutoring system that can support a variety of course-specific and homework-specific AI tutors. RAGMan leverages Retrieval Augmented Generation (RAG), as well as strict instructions, to ensure the alignment of the AI tutors' responses. By using RAGMan's AI tutors, students receive assistance with their specific homework assignments without directly obtaining solutions, while also having the ability to ask general programming-related questions.
RAGMan was deployed as an optional resource in an introductory programming course with an enrollment of 455 students. It was configured as a set of five homework-specific AI tutors. This paper describes the interactions the students had with the AI tutors, the students' feedback, and a comparative grade analysis. Overall, about half of the students engaged with the AI tutors, and the vast majority of the interactions were legitimate homework questions. When students posed questions within the intended scope, the AI tutors delivered accurate responses 98% of the time. Among the students who used the AI tutors, 78% reported that the tutors helped their learning. Beyond the AI tutors' ability to provide valuable suggestions, students reported appreciating them for fostering a safe learning environment free from judgment.
Submitted 13 July, 2024;
originally announced July 2024.
-
Towards AI-Assisted Synthesis of Verified Dafny Methods
Authors:
Md Rakib Hossain Misu,
Cristina V. Lopes,
Iris Ma,
James Noble
Abstract:
Large language models show great promise in many domains, including programming. A promise is easy to make but hard to keep, and language models often fail to keep their promises, generating erroneous code. A promising avenue to keep models honest is to incorporate formal verification: generating programs' specifications as well as code so that the code can be proved correct with respect to the specifications. Unfortunately, existing large language models show a severe lack of proficiency in verified programming.
In this paper, we demonstrate how to improve two pretrained models' proficiency in the Dafny verification-aware language. Using 178 problems from the MBPP dataset, we prompt two contemporary models (GPT-4 and PaLM-2) to synthesize Dafny methods. We use three different types of prompts: a direct Contextless prompt; a Signature prompt that includes a method signature and test cases; and a Chain of Thought (CoT) prompt that decomposes the problem into steps and includes retrieval augmentation generated example problems and solutions. Our results show that GPT-4 performs better than PaLM-2 on these tasks and that both models perform best with the retrieval augmentation generated CoT prompt. GPT-4 was able to generate verified, human-evaluated Dafny methods for 58% of the problems; however, it managed only 19% of the problems with the Contextless prompt, and even fewer (10%) with the Signature prompt. We are thus able to contribute 153 verified Dafny solutions to MBPP problems: 50 that we wrote manually and 103 synthesized by GPT-4.
Our results demonstrate that the benefits of formal program verification are now within reach of code generating large language models...
Submitted 10 June, 2024; v1 submitted 31 January, 2024;
originally announced February 2024.
-
Commit Messages in the Age of Large Language Models
Authors:
Cristina V. Lopes,
Vanessa I. Klotzman,
Iris Ma,
Iftekar Ahmed
Abstract:
Commit messages are explanations of changes made to a codebase that are stored in version control systems. They help developers understand the codebase as it evolves. However, writing commit messages can be tedious and inconsistent among developers. To address this issue, researchers have tried using different methods to automatically generate commit messages, including rule-based, retrieval-based, and learning-based approaches. Advances in large language models offer new possibilities for generating commit messages. In this study, we evaluate the performance of OpenAI's ChatGPT for generating commit messages based on code changes. We compare the results obtained with ChatGPT to previous automatic commit message generation methods that have been trained specifically on commit data. Our goal is to assess the extent to which large pre-trained language models can generate commit messages that are both quantitatively and qualitatively acceptable. We found that ChatGPT was able to outperform previous Automatic Commit Message Generation (ACMG) methods by orders of magnitude, and that, generally, the messages it generates are both accurate and of high-quality. We also provide insights, and a categorization, for the cases where it fails.
Submitted 1 February, 2024; v1 submitted 31 January, 2024;
originally announced January 2024.
-
One Wide Feedforward is All You Need
Authors:
Telmo Pessoa Pires,
António V. Lopes,
Yannick Assogba,
Hendra Setiawan
Abstract:
The Transformer architecture has two main non-embedding components: Attention and the Feed Forward Network (FFN). Attention captures interdependencies between words regardless of their position, while the FFN non-linearly transforms each input token independently. In this work we explore the role of the FFN, and find that despite taking up a significant fraction of the model's parameters, it is highly redundant. Concretely, we are able to substantially reduce the number of parameters with only a modest drop in accuracy by removing the FFN on the decoder layers and sharing a single FFN across the encoder. Finally we scale this architecture back to its original size by increasing the hidden dimension of the shared FFN, achieving substantial gains in both accuracy and latency with respect to the original Transformer Big.
Submitted 21 October, 2023; v1 submitted 4 September, 2023;
originally announced September 2023.
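The parameter saving described in the abstract above can be illustrated with a rough count. This is only a sketch under stated assumptions: the dimensions below (model width 1024, FFN width 4096, six encoder and six decoder layers) are the commonly cited Transformer Big settings and are not given in the abstract, and matching the FFN parameter budget is just one possible reading of "scaling back to the original size".

```python
def ffn_params(d_model: int, d_ff: int) -> int:
    """Parameters of one position-wise FFN: two linear layers with biases."""
    return (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)

# Assumed Transformer Big-style dimensions (not stated in the abstract).
D_MODEL, D_FF = 1024, 4096
N_ENC, N_DEC = 6, 6

# Baseline: every encoder and decoder layer owns its own FFN.
baseline = (N_ENC + N_DEC) * ffn_params(D_MODEL, D_FF)

# Reduced model: one FFN shared by all encoder layers, none in the decoder.
shared = ffn_params(D_MODEL, D_FF)

# Widest shared FFN whose parameter count stays within the baseline FFN budget.
wide_d_ff = (baseline - D_MODEL) // (2 * D_MODEL + 1)

print(f"baseline FFN parameters: {baseline:,}")
print(f"single shared FFN:       {shared:,}")
print(f"hidden width that refills the budget: {wide_d_ff}")
```

Under these assumptions the FFNs account for roughly 100M parameters in the baseline, and a single shared FFN would need a hidden width of about twelve times the original to use the same budget.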
-
Using Large Language Models to Generate JUnit Tests: An Empirical Study
Authors:
Mohammed Latif Siddiq,
Joanna C. S. Santos,
Ridwanul Hasan Tanvir,
Noshin Ulfat,
Fahmid Al Rifat,
Vinicius Carvalho Lopes
Abstract:
A code generation model generates code by taking a prompt from a code comment, existing code, or a combination of both. Although code generation models (e.g., GitHub Copilot) are increasingly being adopted in practice, it is unclear whether they can successfully be used for unit test generation without fine-tuning for a strongly typed language like Java. To fill this gap, we investigated how well three models (Codex, GPT-3.5-Turbo, and StarCoder) can generate unit tests. We used two benchmarks (HumanEval and Evosuite SF110) to investigate the effect of context generation on the unit test generation process. We evaluated the models based on compilation rates, test correctness, test coverage, and test smells. We found that the Codex model achieved above 80% coverage for the HumanEval dataset, but no model had more than 2% coverage for the EvoSuite SF110 benchmark. The generated tests also suffered from test smells, such as Duplicated Asserts and Empty Tests.
Submitted 8 March, 2024; v1 submitted 30 April, 2023;
originally announced May 2023.
-
Improving the Quality of Commit Messages in Students' Projects
Authors:
Iris Ma,
Cristina V. Lopes
Abstract:
Commit messages play a crucial role in collaborative software development. They provide a clear and concise description of the changes made to the source code. However, many commit messages among students' projects lack useful information. This is a concern, as low-quality commit messages can negatively impact communication of software development and future maintenance. To address this issue, this research aims to help students write high-quality commit messages by "nudging" them in the right direction. We modified the GitHub Desktop application by incorporating specific requirements for commit messages, specifically "what" and "why" parts. To test whether this affects the quality of commit messages, we divided students from an Information Retrieval class into two groups, with one group using the modified application and the other using other interfaces. The results show that the quality of commit messages is improved in terms of informativeness, clearness, and length.
Submitted 26 April, 2023;
originally announced April 2023.
-
Are Neural Architecture Search Benchmarks Well Designed? A Deeper Look Into Operation Importance
Authors:
Vasco Lopes,
Bruno Degardin,
Luís A. Alexandre
Abstract:
Neural Architecture Search (NAS) benchmarks significantly improved the capability of developing and comparing NAS methods while at the same time drastically reducing the computational overhead by providing meta-information about thousands of trained neural networks. However, tabular benchmarks have several drawbacks that can hinder fair comparisons and provide unreliable results. These usually focus on providing a small pool of operations in heavily constrained search spaces -- usually cell-based neural networks with pre-defined outer-skeletons. In this work, we conducted an empirical analysis of the widely used NAS-Bench-101, NAS-Bench-201 and TransNAS-Bench-101 benchmarks in terms of their generability and how different operations influence the performance of the generated architectures. We found that only a subset of the operation pool is required to generate architectures close to the upper-bound of the performance range. Also, the performance distribution is negatively skewed, having a higher density of architectures in the upper-bound range. We consistently found convolution layers to have the highest impact on the architecture's performance, and that specific combinations of operations favor top-scoring architectures. These findings offer insights into the correct evaluation and comparison of NAS methods using NAS benchmarks, showing that directly searching on NAS-Bench-201, ImageNet16-120 and TransNAS-Bench-101 produces more reliable results than searching only on CIFAR-10. Furthermore, with this work we provide suggestions for future benchmark evaluations and design. The code used to conduct the evaluations is available at https://github.com/VascoLopes/NAS-Benchmark-Evaluation.
Submitted 29 March, 2023;
originally announced March 2023.
-
Black Boxes, White Noise: Similarity Detection for Neural Functions
Authors:
Farima Farmahinifarahani,
Cristina V. Lopes
Abstract:
Similarity, or clone, detection has important applications in copyright violation, software theft, code search, and the detection of malicious components. There is now a good number of open source and proprietary clone detectors for programs written in traditional programming languages. However, the increasing adoption of deep learning models in software poses a challenge to these tools: these models implement functions that are inscrutable black boxes. As more software includes these DNN functions, new techniques are needed in order to assess the similarity between deep learning components of software. Previous work has unveiled techniques for comparing the representations learned at various layers of deep neural network models by feeding canonical inputs to the models. Our goal is to be able to compare DNN functions when canonical inputs are not available -- because they may not be in many application scenarios. The challenge, then, is to generate appropriate inputs and to identify a metric that, for those inputs, is capable of representing the degree of functional similarity between two comparable DNN functions.
Our approach uses random input with values between -1 and 1, in a shape that is compatible with what the DNN models expect. We then compare the outputs by performing correlation analysis. Our study shows how it is possible to perform similarity analysis even in the absence of meaningful canonical inputs. The response to random inputs of two comparable DNN functions exposes those functions' similarity, or lack thereof. Of all the metrics tried, we find that Spearman's rank correlation coefficient is the most powerful and versatile, although in special cases other methods and metrics are more expressive. We present a systematic empirical study comparing the effectiveness of several similarity metrics using a dataset of 56,355 classifiers collected from GitHub. This is accompanied by a sensitivity analysis that reveals how certain models' training related properties affect the effectiveness of the similarity metrics.
To the best of our knowledge, this is the first work that shows how similarity of DNN functions can be detected by using random inputs. Our study of correlation metrics, and the identification of Spearman correlation coefficient as the most powerful among them for this purpose, establishes a complete and practical method for DNN clone detection that can be used in the design of new tools. It may also serve as inspiration for other program analysis tasks whose approaches break in the presence of DNN components.
Submitted 20 February, 2023;
originally announced February 2023.
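The idea in the abstract above (feed random inputs in [-1, 1] to two functions and correlate their outputs by rank) can be sketched in a few lines. This is a minimal illustration and not the authors' tool: `model_a`, `model_b` and `model_c` are hypothetical scalar-valued stand-ins for DNN functions, and the Spearman coefficient is implemented by hand so the example needs only the standard library.

```python
import random

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def random_inputs(n, dim, seed=0):
    """n random vectors of length dim with entries drawn uniformly from [-1, 1]."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]

# Hypothetical stand-ins for the functions under comparison.
def model_a(x):
    return 0.5 * x[0] - 1.0 * x[1] + 2.0 * x[2]

def model_b(x):
    return model_a(x) ** 3  # monotone transform of model_a: same output ordering

def model_c(x):
    return x[0] * x[1] - x[2] ** 2  # unrelated function

inputs = random_inputs(200, 3)
out_a = [model_a(x) for x in inputs]
out_b = [model_b(x) for x in inputs]
out_c = [model_c(x) for x in inputs]

rho_ab = spearman(out_a, out_b)
rho_ac = spearman(out_a, out_c)
print(f"a vs b: {rho_ab:.3f}   a vs c: {rho_ac:.3f}")
```

Because rank correlation ignores monotone rescaling, the first pair scores as near-identical while the unrelated pair scores close to zero. Real classifiers produce vector outputs, so a practical version would need a rule for comparing output vectors, which this sketch leaves out.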
-
Nonlinear dynamics of asymmetric bistable energy harvesters
Authors:
João Pedro Norenberg,
Roberto Luo,
Vinicius Gonçalves Lopes,
João Victor L. L. Peterson,
Americo Cunha Jr
Abstract:
The paper investigates the effects of asymmetries on the dynamics of a nonlinear vibration energy harvester. The performance of the asymmetric system is compared with that of symmetric ones. Different levels of asymmetry in the restoring force and in the action of gravity are investigated by varying the system's sloping angle. Bifurcation diagrams and basins of attraction are used to examine the local and global characteristics of the underlying dynamical systems under different excitation energies. The results show the adverse effects of asymmetries on system dynamics. They also reveal ways to overcome them: optimal sloping angle values cancel the asymmetric influence and improve the asymmetric system's performance over symmetric ones. This comprehensive numerical study provides novel and valuable insights into the dynamics of asymmetric energy harvesters, a broad and still little-explored topic.
Submitted 9 June, 2023; v1 submitted 20 August, 2022;
originally announced September 2022.
-
Guided Evolutionary Neural Architecture Search With Efficient Performance Estimation
Authors:
Vasco Lopes,
Miguel Santos,
Bruno Degardin,
Luís A. Alexandre
Abstract:
Neural Architecture Search (NAS) methods have been successfully applied to image tasks with excellent results. However, NAS methods are often complex and tend to converge to local minima as soon as generated architectures seem to yield good results. This paper proposes GEA, a novel approach for guided NAS. GEA guides the evolution by exploring the search space: it generates and evaluates several architectures in each generation at the initialisation stage using a zero-proxy estimator, where only the highest-scoring architecture is trained and kept for the next generation. Subsequently, GEA continuously extracts knowledge about the search space without increased complexity by generating several offspring from an existing architecture at each generation. Moreover, GEA forces exploitation of the most performant architectures by descendant generation while simultaneously driving exploration through parent mutation and favouring younger architectures to the detriment of older ones. Experimental results demonstrate the effectiveness of the proposed method, and extensive ablation studies evaluate the importance of different parameters. Results show that GEA achieves state-of-the-art results on all data sets of NAS-Bench-101, NAS-Bench-201 and TransNAS-Bench-101 benchmarks.
Submitted 22 July, 2022;
originally announced August 2022.
-
Towards Less Constrained Macro-Neural Architecture Search
Authors:
Vasco Lopes,
Luís A. Alexandre
Abstract:
Networks found with Neural Architecture Search (NAS) achieve state-of-the-art performance in a variety of tasks, out-performing human-designed networks. However, most NAS methods heavily rely on human-defined assumptions that constrain the search: architectures' outer-skeletons, number of layers, parameter heuristics and search spaces. Additionally, common search spaces consist of repeatable modules (cells) instead of fully exploring the architecture's search space by designing entire architectures (macro-search). Imposing such constraints requires deep human expertise and restricts the search to pre-defined settings. In this paper, we propose LCMNAS, a method that pushes NAS to less constrained search spaces by performing macro-search without relying on pre-defined heuristics or bounded search spaces. LCMNAS introduces three components for the NAS pipeline: i) a method that leverages information about well-known architectures to autonomously generate complex search spaces based on Weighted Directed Graphs with hidden properties, ii) an evolutionary search strategy that generates complete architectures from scratch, and iii) a mixed-performance estimation approach that combines information about architectures at the initialization stage and lower fidelity estimates to infer their trainability and capacity to model complex functions. We present experiments on 13 different data sets showing that LCMNAS is capable of generating both cell and macro-based architectures with minimal GPU computation and state-of-the-art results. Moreover, we conduct extensive studies on the importance of different NAS components in both cell and macro-based settings. Code for reproducibility is public at https://github.com/VascoLopes/LCMNAS.
Submitted 6 January, 2023; v1 submitted 10 March, 2022;
originally announced March 2022.
-
Guided Evolution for Neural Architecture Search
Authors:
Vasco Lopes,
Miguel Santos,
Bruno Degardin,
Luís A. Alexandre
Abstract:
Neural Architecture Search (NAS) methods have been successfully applied to image tasks with excellent results. However, NAS methods are often complex and tend to converge to local minima as soon as generated architectures seem to yield good results. In this paper, we propose G-EA, a novel approach for guided evolutionary NAS. The rationale behind G-EA is to explore the search space by generating and evaluating several architectures in each generation at the initialization stage using a zero-proxy estimator, where only the highest-scoring network is trained and kept for the next generation. This evaluation at the initialization stage allows continuous extraction of knowledge from the search space without increasing computation, thus allowing the search to be efficiently guided. Moreover, G-EA forces exploitation of the most performant networks by descendant generation while at the same time forcing exploration by parent mutation and by favouring younger architectures to the detriment of older ones. Experimental results demonstrate the effectiveness of the proposed method, showing that G-EA achieves state-of-the-art results in the NAS-Bench-201 search space on CIFAR-10, CIFAR-100 and ImageNet16-120, with mean accuracies of 93.98%, 72.12% and 45.94% respectively.
Submitted 28 October, 2021;
originally announced October 2021.
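The search loop outlined in the abstract above can be sketched as a toy program. Everything concrete here is an assumption made for illustration: the operation pool, the architecture encoding, `proxy_score` (standing in for a zero-proxy estimator) and `trained_accuracy` (standing in for actual training) are hypothetical, and the loop is a generic aging-style evolution rather than the authors' implementation.

```python
import random
from collections import deque

OPS = ["none", "skip", "conv1x1", "conv3x3", "avgpool"]  # toy operation pool (assumption)
ARCH_LEN = 6  # toy number of operation slots per architecture (assumption)

def random_arch(rng):
    return tuple(rng.choice(OPS) for _ in range(ARCH_LEN))

def mutate(arch, rng):
    """Return a copy of arch with one randomly chosen slot resampled."""
    child = list(arch)
    child[rng.randrange(ARCH_LEN)] = rng.choice(OPS)
    return tuple(child)

def proxy_score(arch):
    """Hypothetical stand-in for a cheap score computed at initialization."""
    return 2 * arch.count("conv3x3") + arch.count("conv1x1")

def trained_accuracy(arch):
    """Hypothetical stand-in for the expensive step of training a network."""
    return (0.5 + 0.06 * arch.count("conv3x3")
            + 0.04 * arch.count("conv1x1") + 0.01 * arch.count("skip"))

def guided_evolution(generations=30, pop_size=8, sample_size=3, n_candidates=5, seed=0):
    rng = random.Random(seed)
    population = deque()  # oldest member sits on the left
    history = []          # every architecture that was actually "trained"

    def admit(candidates):
        # Score all candidates cheaply; train only the highest-scoring one.
        best = max(candidates, key=proxy_score)
        entry = (best, trained_accuracy(best))
        population.append(entry)
        history.append(entry)

    for _ in range(pop_size):
        admit([random_arch(rng) for _ in range(n_candidates)])

    for _ in range(generations):
        # Exploitation: the parent is the fittest member of a random sample.
        parent, _fitness = max(rng.sample(list(population), sample_size),
                               key=lambda e: e[1])
        # Exploration: several mutated descendants compete on the cheap score.
        admit([mutate(parent, rng) for _ in range(n_candidates)])
        # Favour younger architectures by discarding the oldest member.
        population.popleft()

    return max(history, key=lambda e: e[1]), history

(best_arch, best_acc), history = guided_evolution()
print(best_arch, round(best_acc, 3))
```

The point of the sketch is the cost structure: each generation scores `n_candidates` architectures with the cheap proxy but pays for only one training run.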
-
Generative Adversarial Graph Convolutional Networks for Human Action Synthesis
Authors:
Bruno Degardin,
João Neves,
Vasco Lopes,
João Brito,
Ehsan Yaghoubi,
Hugo Proença
Abstract:
Synthesising the spatial and temporal dynamics of the human body skeleton remains a challenging task, not only in terms of the quality of the generated shapes, but also of their diversity, particularly to synthesise realistic body movements of a specific action (action conditioning). In this paper, we propose Kinetic-GAN, a novel architecture that leverages the benefits of Generative Adversarial Networks and Graph Convolutional Networks to synthesise the kinetics of the human body. The proposed adversarial architecture can condition up to 120 different actions over local and global body movements while improving sample quality and diversity through latent space disentanglement and stochastic variations. Our experiments were carried out on three well-known datasets, where Kinetic-GAN notably surpasses the state-of-the-art methods in terms of distribution quality metrics while being able to synthesise more than an order of magnitude more distinct actions. Our code and models are publicly available at https://github.com/DegardinBruno/Kinetic-GAN.
Submitted 25 October, 2021; v1 submitted 21 October, 2021;
originally announced October 2021.
-
The nonlinear dynamics of a bistable energy harvesting system with colored noise disturbances
Authors:
Vinicius Gonçalves Lopes,
João Victor L. L. Peterson,
Americo Cunha Jr
Abstract:
This paper deals with the nonlinear stochastic dynamics of a piezoelectric energy harvesting system subjected to a harmonic external excitation disturbed by Gaussian colored noise. A parametric analysis is conducted, where the effects of the standard deviation and the correlation time of colored noise on the system response are investigated. The numerical results suggest a strong influence of noise on the system response for higher values of correlation time and standard deviation, and a low (noise level independent) influence for low values of correlation time.
Submitted 14 July, 2021;
originally announced July 2021.
-
REGINA - Reasoning Graph Convolutional Networks in Human Action Recognition
Authors:
Bruno Degardin,
Vasco Lopes,
Hugo Proença
Abstract:
It is known that the kinematics of the human body skeleton reveals valuable information in action recognition. Recently, modeling skeletons as spatio-temporal graphs with Graph Convolutional Networks (GCNs) has been reported to solidly advance the state-of-the-art performance. However, GCN-based approaches exclusively learn from raw skeleton data, and are expected to extract the inherent structural information on their own. This paper describes REGINA, introducing a novel way to REasoning Graph convolutional networks IN Human Action recognition. The rationale is to provide the GCNs with additional knowledge about the skeleton data, obtained by handcrafted features, in order to facilitate the learning process, while guaranteeing that it remains fully trainable in an end-to-end manner. The challenge is to capture complementary information over the dynamics between consecutive frames, which is the key information extracted by state-of-the-art GCN techniques. Moreover, the proposed strategy can be easily integrated into existing GCN-based methods, which we also regard positively. Our experiments were carried out on well-known action recognition datasets and allowed us to conclude that REGINA contributes to solid improvements in performance when incorporated into other GCN-based approaches, without any other adjustment to the original method. For reproducibility, the REGINA code and all the experiments carried out will be publicly available at https://github.com/DegardinBruno.
Submitted 14 May, 2021;
originally announced May 2021.
-
Expanding Frontiers: Settling an Understanding of Systems-of-Information Systems
Authors:
Valdemar Vicente Graciano Neto,
Bruno Gabriel Araújo Lebtag,
Paulo Gabriel Teixeira,
Priscilla Batista,
Vinícius Carvalho Lopes,
Jamal El-Hachem,
Jérémy Buisson,
Flavio Oquendo,
Juliana Fernandes,
Francisco Ferreira,
Rodrigo Peireira dos Santos,
Davi Viana,
Everton Cavalcante,
Mohamad Kassab,
Ahmad Mohsin,
Roberto Oliveira,
Vânia Neves,
Maria Istela Cagnin,
Elisa Yumi Nakagawa
Abstract:
System-of-Systems (SoS) has consolidated itself as a special type of software-intensive systems. As such, subtypes of SoS have also emerged, such as Cyber-Physical SoS (CPSoS) that are formed essentially of cyber-physical constituent systems and Systems-of-Information Systems (SoIS) that contain information systems as their constituents. In contrast to CPSoS that have been investigated and covered in the specialized literature, SoIS still lack critical discussion about their fundamentals. The main contribution of this paper is to present those fundamentals to set an understanding of SoIS. By offering a discussion and examining literature cases, we draw an essential settlement on SoIS definition, basics, and practical implications. The discussion herein presented results from research conducted on SoIS over the past years in interinstitutional and multinational research collaborations. The knowledge gathered in this paper arises from several scientific discussion meetings among the authors. As a result, we aim to contribute to the state of the art of SoIS besides paving the research avenues for the forthcoming years.
Submitted 25 March, 2021;
originally announced March 2021.
-
EPE-NAS: Efficient Performance Estimation Without Training for Neural Architecture Search
Authors:
Vasco Lopes,
Saeid Alirezazadeh,
Luís A. Alexandre
Abstract:
Neural Architecture Search (NAS) has shown excellent results in designing architectures for computer vision problems. NAS alleviates the need for human-defined settings by automating architecture design and engineering. However, NAS methods tend to be slow, as they require large amounts of GPU computation. This bottleneck is mainly due to the performance estimation strategy, which requires the evaluation of the generated architectures, mainly by training them, to update the sampler method. In this paper, we propose EPE-NAS, an efficient performance estimation strategy, that mitigates the problem of evaluating networks, by scoring untrained networks and creating a correlation with their trained performance. We perform this process by looking at intra and inter-class correlations of an untrained network. We show that EPE-NAS can produce a robust correlation and that by incorporating it into a simple random sampling strategy, we are able to search for competitive networks, without requiring any training, in a matter of seconds using a single GPU. Moreover, EPE-NAS is agnostic to the search method, since it focuses on the evaluation of untrained networks, making it easy to integrate into almost any NAS method.
Submitted 16 February, 2021;
originally announced February 2021.
-
An AutoML-based Approach to Multimodal Image Sentiment Analysis
Authors:
Vasco Lopes,
António Gaspar,
Luís A. Alexandre,
João Cordeiro
Abstract:
Sentiment analysis is a research topic focused on analysing data to extract information related to the sentiment that it causes. Applications of sentiment analysis are wide, ranging from recommendation systems and marketing to customer satisfaction. Recent approaches evaluate textual content using Machine Learning techniques that are trained over large corpora. However, as social media has grown, other data types have emerged in large quantities, such as images. Sentiment analysis in images has been shown to be a valuable complement to textual data since it enables the inference of the underlying message polarity by creating context and connections. Multimodal sentiment analysis approaches intend to leverage information of both textual and image content to perform an evaluation. Despite recent advances, current solutions still flounder in combining both image and textual information to classify social media data, mainly due to subjectivity, inter-class homogeneity and fusion data differences. In this paper, we propose a method that combines both textual and image individual sentiment analysis into a final fused classification based on AutoML, which performs a random search to find the best model. Our method achieved state-of-the-art performance on the B-T4SA dataset, with 95.19% accuracy.
Submitted 16 February, 2021;
originally announced February 2021.
-
Analyzing Dominance Move (MIP-DoM) Indicator for Multi- and Many-objective Optimization
Authors:
Claudio Lucio do Val Lopes,
Flávio Vinícius Cruzeiro Martins,
Elizabeth Fialho Wanner,
Kalyanmoy Deb
Abstract:
Dominance move (DoM) is a binary quality indicator that can be used in multi-objective and many-objective optimization to compare two solution sets obtained from different algorithms. The DoM indicator can differentiate the sets for certain important features, such as convergence, spread, uniformity, and cardinality. DoM does not use any reference, and it has an intuitive and physical meaning, similar to the $ε$-indicator, and calculates the minimum total move of members of one set so that all elements in another set are to be dominated or identical to at least one member of the first set. Despite the aforementioned properties, DoM is hard to calculate, particularly in higher dimensions. There is an efficient and exact method to calculate it in a bi-objective case only. This work proposes a novel approach to calculate DoM using a mixed integer programming (MIP) approach, which can handle sets with three or more objectives and is shown to overcome the $ε$-indicator's information loss. Experiments, in the bi-objective space, are done to verify the model's correctness. Furthermore, other experiments, using 3, 5, 10, 15, 20, 25 and 30-objective problems are performed to show how the model behaves in higher-dimensional cases. Algorithms, such as IBEA, MOEA/D, NSGA-III, NSGA-II, and SPEA2 are used to generate the solution sets (however any other algorithms can also be used with the proposed MIP-DoM indicator). Further extensions are discussed to handle certain idiosyncrasies with some solution sets and also to improve the quality indicator and its use for other situations.
Submitted 5 February, 2021; v1 submitted 21 December, 2020;
originally announced December 2020.
-
Auto-Classifier: A Robust Defect Detector Based on an AutoML Head
Authors:
Vasco Lopes,
Luís A. Alexandre
Abstract:
The dominant approach for surface defect detection is the use of hand-crafted feature-based methods. However, this falls short when the conditions affecting the extracted images vary. So, in this paper, we sought to determine how well several state-of-the-art Convolutional Neural Networks perform in the task of surface defect detection. Moreover, we propose two methods: CNN-Fusion, which fuses the predictions of all the networks into a final one, and Auto-Classifier, which is a novel proposal that improves a Convolutional Neural Network by modifying its classification component using AutoML. We carried out experiments to evaluate the proposed methods in the task of surface defect detection using different datasets from DAGM2007. We show that the use of Convolutional Neural Networks achieves better results than traditional methods, and also that Auto-Classifier outperforms all other methods, achieving 100% accuracy and 100% AUC across all the datasets.
Submitted 3 September, 2020;
originally announced September 2020.
-
HMCNAS: Neural Architecture Search using Hidden Markov Chains and Bayesian Optimization
Authors:
Vasco Lopes,
Luís A. Alexandre
Abstract:
Neural Architecture Search has achieved state-of-the-art performance in a variety of tasks, outperforming human-designed networks. However, many assumptions that require human definition, related to the problems being solved or the models generated, are still needed: final model architectures, number of layers to be sampled, forced operations, and small search spaces, which ultimately contributes to having models with higher performance at the cost of inducing bias into the system. In this paper, we propose HMCNAS, which is composed of two novel components: i) a method that leverages information about human-designed models to autonomously generate a complex search space, and ii) an Evolutionary Algorithm with Bayesian Optimization that is capable of generating competitive CNNs from scratch, without relying on human-defined parameters or small search spaces. The experimental results show that the proposed approach results in competitive architectures obtained in a very short time. HMCNAS provides a step towards generalizing NAS, by providing a way to create competitive models, without requiring any human knowledge about the specific task.
Submitted 31 July, 2020;
originally announced July 2020.
-
A softwarized perspective of the 5G networks
Authors:
Kleber Vieira Cardoso,
Cristiano Bonato Both,
Lúcio Rene Prade,
Ciro J. A. Macedo,
Victor Hugo L. Lopes
Abstract:
The main goal of this article is to present the fundamental theoretical concepts for the tutorial presented in IEEE NetSoft 2020. The article explores the use of software in the 5G system composed of the Radio Access Network (RAN) and the core components, following the standards defined by 3GPP, particularly Release 15. The article provides a brief overview of mobile cellular networks, including basic concepts, operations, and evolution through the so-called 'generations' of mobile networks. From a software perspective, RAN is presented in the context of 4G and 5G networks, which includes the virtualization and disaggregation concepts. A significant part of the article is dedicated to 5G networks and beyond, focusing on the core, i.e., considering the Service-Based Architecture (SBA), due to its relevance and totally softwarized approach. Finally, the article briefly describes the demonstrations presented in IEEE NetSoft 2020, providing the link for the repository that has all the material employed in the tutorial.
Submitted 24 August, 2020; v1 submitted 18 June, 2020;
originally announced June 2020.
-
A Hybrid Method for Training Convolutional Neural Networks
Authors:
Vasco Lopes,
Paulo Fazendeiro
Abstract:
Artificial Intelligence algorithms have been steadily increasing in popularity and usage. Deep Learning allows neural networks to be trained using huge datasets and also removes the need for human-extracted features, as it automates the feature learning process. At the heart of training deep neural networks, such as Convolutional Neural Networks, we find backpropagation, which, by computing the gradient of the loss function with respect to the weights of the network for a given input, allows the weights of the network to be adjusted to better perform in the given task. In this paper, we propose a hybrid method that uses both backpropagation and evolutionary strategies to train Convolutional Neural Networks, where the evolutionary strategies are used to help avoid local minima and fine-tune the weights, so that the network achieves higher accuracy. We show that the proposed hybrid method is capable of improving upon regular training in the task of image classification in CIFAR-10, where a VGG16 model was used and the final test results increased by 0.61%, on average, when compared to using only backpropagation.
Submitted 15 April, 2020;
originally announced May 2020.
-
Brazilian Lyrics-Based Music Genre Classification Using a BLSTM Network
Authors:
Raul de Araújo Lima,
Rômulo César Costa de Sousa,
Simone Diniz Junqueira Barbosa,
Hélio Cortês Vieira Lopes
Abstract:
Organizing songs, albums, and artists into groups with shared similarity can be done with the help of genre labels. In this paper, we present a novel approach for automatically classifying musical genre in Brazilian music using only the song lyrics. This kind of classification remains a challenge in the field of Natural Language Processing. We construct a dataset of 138,368 Brazilian song lyrics distributed in 14 genres. We apply SVM, Random Forest and a Bidirectional Long Short-Term Memory (BLSTM) network combined with different word embedding techniques to address this classification task. Our experiments show that the BLSTM method outperforms the other models with an F1-score average of $0.48$. Some genres like "gospel", "funk-carioca" and "sertanejo", which obtained 0.89, 0.70 and 0.69 of F1-score, respectively, can be defined as the most distinct and easy to classify in the Brazilian musical genres context.
Submitted 6 March, 2020;
originally announced March 2020.
-
An Assignment Problem Formulation for Dominance Move Indicator
Authors:
Claudio Lucio do Val Lopes,
Flávio Vinícius Cruzeiro Martins,
Elizabeth F. Wanner
Abstract:
Dominance move (DoM) is a binary quality indicator to compare solution sets in multiobjective optimization. The indicator allows a more natural and intuitive relation when comparing solution sets. It is Pareto compliant and does not demand any parameters or reference sets. In spite of its advantages, the combinatorial calculation nature is a limitation. The original formulation presents an efficient method to calculate it in a biobjective case only. This work presents an assignment formulation to calculate DoM in problems with three objectives or more. Some initial experiments, in the biobjective space, were done to present the model correctness. Next, other experiments, using three dimensions, were also done to show how DoM could be compared with other indicators: inverted generational distance (IGD) and hypervolume (HV). Results show the assignment formulation for DoM is valid for more than three objectives. However, there are some strengths and weaknesses, which are discussed and detailed. Some notes, considerations, and future research paths conclude this work.
Submitted 14 May, 2020; v1 submitted 25 February, 2020;
originally announced February 2020.
-
Dominance Move calculation using a MIP approach for comparison of multi and many-objective optimization solution sets
Authors:
Claudio Lucio do Val Lopes,
Flávio Vinícius Cruzeiro Martins,
Elizabeth Fialho Wanner
Abstract:
Dominance move (DoM) is a binary quality indicator that can be used in multiobjective optimization. It can compare solution sets while representing some important features such as convergence, spread, uniformity, and cardinality. DoM has an intuitive concept and considers the minimum move of one set needed to weakly Pareto dominate the other set. Despite the aforementioned properties, DoM is hard to calculate. The original formulation presents an efficient and exact method to calculate it in a biobjective case only. This work presents a new approach to calculate and extend DoM to deal with three or more objectives. The idea is to use a mixed integer programming (MIP) approach to calculate DoM. Some initial experiments, in the biobjective space, were done to verify the model's correctness. Furthermore, other experiments, using three, five, and ten objective functions, were done to show how the model behaves in higher-dimensional cases. Algorithms such as IBEA, MOEAD, NSGAIII, NSGAII, and SPEA2 were used to generate the solution sets; however, any other algorithm could be used with the DoM indicator. The results have confirmed the effectiveness of the MIP DoM in problems with more than three objective functions. Final notes, considerations, and future research are discussed to exploit the particularities of some solution sets and improve the model and its use for other situations.
Submitted 10 January, 2020;
originally announced January 2020.
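The abstract above describes DoM in terms of one solution set weakly Pareto dominating another. As a minimal sketch of that underlying relation only (not the authors' MIP formulation), the check can be written as below. The function names are hypothetical, and minimization of every objective is an assumption made here for illustration.

```python
from typing import Sequence

Point = Sequence[float]


def weakly_dominates(p: Point, q: Point) -> bool:
    """True if p is no worse than q in every objective (minimization assumed)."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of objectives")
    return all(a <= b for a, b in zip(p, q))


def set_weakly_dominates(P: Sequence[Point], Q: Sequence[Point]) -> bool:
    """True if every point in Q is weakly dominated by at least one point in P."""
    return all(any(weakly_dominates(p, q) for p in P) for q in Q)


# Illustrative bi-objective sets (made-up values).
P = [(1.0, 4.0), (3.0, 2.0)]
Q = [(2.0, 5.0), (3.0, 3.0)]
print(set_weakly_dominates(P, Q))  # True
print(set_weakly_dominates(Q, P))  # False
```

Following the definition quoted in the abstract, DoM would measure how far one set has to be moved for this relation to hold, so a set that already weakly dominates the other needs no move.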
-
GANprintR: Improved Fakes and Evaluation of the State of the Art in Face Manipulation Detection
Authors:
João C. Neves,
Ruben Tolosana,
Ruben Vera-Rodriguez,
Vasco Lopes,
Hugo Proença,
Julian Fierrez
Abstract:
The availability of large-scale facial databases, together with the remarkable progress of deep learning technologies, in particular Generative Adversarial Networks (GANs), has led to the generation of extremely realistic fake facial content, raising obvious concerns about the potential for misuse. Such concerns have fostered the research on manipulation detection methods that, contrary to humans, have already achieved astonishing results in various scenarios. In this study, we focus on the synthesis of entire facial images, which is a specific type of facial manipulation. The main contributions of this study are four-fold: i) a novel strategy to remove GAN "fingerprints" from synthetic fake images based on autoencoders is described, in order to spoof facial manipulation detection systems while keeping the visual quality of the resulting images; ii) an in-depth analysis of the recent literature in facial manipulation detection; iii) a complete experimental assessment of this type of facial manipulation, considering the state-of-the-art fake detection systems (based on holistic deep networks, steganalysis, and local artifacts), highlighting how challenging this task is in unconstrained scenarios; and finally iv) we announce a novel public database, named iFakeFaceDB, resulting from the application of our proposed GAN-fingerprint Removal approach (GANprintR) to already very realistic synthetic fake images.
The results obtained in our empirical evaluation show that additional efforts are required to develop robust facial manipulation detection systems against unseen conditions and spoof techniques, such as the one proposed in this study.
Submitted 1 July, 2020; v1 submitted 13 November, 2019;
originally announced November 2019.
-
GoTcha: An Interactive Debugger for GoT-Based Distributed Systems
Authors:
Rohan Achar,
Pritha Dawn,
Cristina V. Lopes
Abstract:
Debugging distributed systems is hard. Most of the techniques that have been developed for debugging such systems use either extensive model checking, or postmortem analysis of logs and traces. Interactive debugging is typically a tool that is only effective in single-threaded and single-process applications, and is rarely applied to distributed systems. While the live observation of state changes using interactive debuggers is effective, it comes with a host of problems in distributed scenarios. In this paper, we discuss the requirements an interactive debugger for distributed systems should meet, the role the underlying distributed model plays in facilitating the debugger, and the implementation of our interactive debugger: GoTcha. GoTcha is a browser-based interactive debugger for distributed systems built on the Global Object Tracker (GoT) programming model. We show how the GoT model facilitates the debugger, and the features that the debugger can offer. We also demonstrate a typical debugging workflow.
Submitted 6 September, 2019;
originally announced September 2019.
-
MANAS: Multi-Agent Neural Architecture Search
Authors:
Vasco Lopes,
Fabio Maria Carlucci,
Pedro M Esperança,
Marco Singh,
Victor Gabillon,
Antoine Yang,
Hang Xu,
Zewei Chen,
Jun Wang
Abstract:
The Neural Architecture Search (NAS) problem is typically formulated as a graph search problem where the goal is to learn the optimal operations over edges in order to maximise a graph-level global objective. Due to the large architecture parameter space, efficiency is a key bottleneck preventing the practical use of NAS. In this paper, we address the issue by framing NAS as a multi-agent problem where agents control a subset of the network and coordinate to reach optimal architectures. We provide two distinct lightweight implementations, with reduced memory requirements (1/8th of state-of-the-art), and performances above those of much more computationally expensive methods. Theoretically, we demonstrate vanishing regrets of the form $O(\sqrt{T})$, with $T$ being the total number of rounds. Finally, aware that random search is an effective, often ignored baseline, we perform additional experiments on 3 alternative datasets and 2 network configurations, and achieve favourable results in comparison.
Submitted 12 January, 2023; v1 submitted 3 September, 2019;
originally announced September 2019.
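The abstract above reports regret that grows on the order of the square root of the number of rounds and calls it vanishing. As a brief reminder of why a sublinear bound of that form is described this way, write $R_T$ for the cumulative regret after $T$ rounds (notation assumed here for illustration, not taken from the source); the average regret per round then tends to zero:

```latex
R_T = O\!\left(\sqrt{T}\right)
\quad\Longrightarrow\quad
\frac{R_T}{T} = O\!\left(\frac{1}{\sqrt{T}}\right)
\xrightarrow[\,T \to \infty\,]{} 0
```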
-
Unbabel's Participation in the WMT19 Translation Quality Estimation Shared Task
Authors:
Fabio Kepler,
Jonay Trénous,
Marcos Treviso,
Miguel Vera,
António Góis,
M. Amin Farajian,
António V. Lopes,
André F. T. Martins
Abstract:
We present the contribution of the Unbabel team to the WMT 2019 Shared Task on Quality Estimation. We participated in the word, sentence, and document-level tracks, encompassing 3 language pairs: English-German, English-Russian, and English-French. Our submissions build upon the recent OpenKiwi framework: we combine linear, neural, and predictor-estimator systems with new transfer learning approaches using BERT and XLM pre-trained models. We compare systems individually and propose new ensemble techniques for word and sentence-level predictions. We also propose a simple technique for converting word labels into document-level predictions. Overall, our submitted systems achieve the best results on all tracks and language pairs by a considerable margin.
Submitted 11 September, 2019; v1 submitted 24 July, 2019;
originally announced July 2019.
-
Unbabel's Submission to the WMT2019 APE Shared Task: BERT-based Encoder-Decoder for Automatic Post-Editing
Authors:
António V. Lopes,
M. Amin Farajian,
Gonçalo M. Correia,
Jonay Trenous,
André F. T. Martins
Abstract:
This paper describes Unbabel's submission to the WMT2019 APE Shared Task for the English-German language pair. Following the recent rise of large, powerful, pre-trained models, we adapt the BERT pretrained model to perform Automatic Post-Editing in an encoder-decoder framework. Analogously to dual-encoder architectures, we develop a BERT-based encoder-decoder (BED) model in which a single pretrained BERT encoder receives both the source src and machine translation tgt strings. Furthermore, we explore a conservativeness factor to constrain the APE system to perform fewer edits. As the official results show, when trained on a weighted combination of in-domain and artificial training data, our BED system with the conservativeness penalty significantly improves the translations of a strong Neural Machine Translation system by $-0.78$ and $+1.23$ in terms of TER and BLEU, respectively. Finally, our submission achieves a new state-of-the-art, ex-aequo, in English-German APE of NMT.
Submitted 29 June, 2019; v1 submitted 30 May, 2019;
originally announced May 2019.
-
Analyzing and Supporting Adaptation of Online Code Examples
Authors:
Tianyi Zhang,
Di Yang,
Cristina Videira Lopes,
Miryung Kim
Abstract:
Developers often resort to online Q&A forums such as Stack Overflow (SO) for filling their programming needs. Although code examples on those forums are good starting points, they are often incomplete and inadequate for developers' local program contexts; adaptation of those examples is necessary to integrate them into production code. As a consequence, the process of adapting online code examples is done over and over again, by multiple developers independently. Our work extensively studies these adaptations and variations, serving as the basis for a tool that helps integrate these online code examples in a target context in an interactive manner.
We perform a large-scale empirical study about the nature and extent of adaptations and variations of SO snippets. We construct a comprehensive dataset linking SO posts to GitHub counterparts based on clone detection, time stamp analysis, and explicit URL references. We then qualitatively inspect 400 SO examples and their GitHub counterparts and develop a taxonomy of 24 adaptation types. Using this taxonomy, we build an automated adaptation analysis technique on top of GumTree to classify the entire dataset into these types. We build a Chrome extension called ExampleStack that automatically lifts an adaptation-aware template from each SO example and its GitHub counterparts to identify hot spots where most changes happen. A user study with sixteen programmers shows that seeing the commonalities and variations in similar GitHub counterparts increases their confidence about the given SO example, and helps them grasp a more comprehensive view about how to reuse the example differently and avoid common pitfalls.
Submitted 28 May, 2019;
originally announced May 2019.
-
Got: Git, but for Objects
Authors:
Rohan Achar,
Cristina V. Lopes
Abstract:
We look at one important category of distributed applications characterized by the existence of multiple collaborating, and competing, components sharing mutable, long-lived, replicated objects. The problem addressed by our work is that of object state synchronization among the components. As an organizing principle for replicated objects, we formally specify the Global Object Tracker (GoT) model, an object-oriented programming model based on causal consistency with application-level conflict resolution strategies, whose elements and interfaces mirror those found in decentralized version control systems: a version graph, working data, diffs, commit, checkout, fetch, push, and merge. We have implemented GoT in a framework called Spacetime, written in Python.
In its purest form, GoT is impractical for real systems, because of the unbounded growth of the version graph and because passing diff'ed histories over the network makes remote communication too slow. We present our solution to these problems that adds some constraints to GoT applications, but that makes the model feasible in practice. We present a performance analysis of Spacetime for representative workloads, which shows that the additional constraints added to GoT make it not just feasible, but viable for real applications.
Submitted 13 April, 2019;
originally announced April 2019.
-
Controlling Robots using Artificial Intelligence and a Consortium Blockchain
Authors:
Vasco Lopes,
Luís A. Alexandre,
Nuno Pereira
Abstract:
Blockchain is a disruptive technology that is normally used within financial applications; however, it can also be very beneficial in certain robotic contexts, such as when an immutable register of events is required. Among the several properties of Blockchain that can be useful within robotic environments, we find not just immutability but also decentralization of the data, irreversibility, accessibility and non-repudiation. In this paper, we propose an architecture that uses blockchain as a ledger and smart-contract technology for robotic control by using external parties, Oracles, to process data. We show how to register events in a secure way, how it is possible to use smart-contracts to control robots and how to interface with external Artificial Intelligence algorithms for image analysis. The proposed architecture is modular and can be used in multiple contexts such as in manufacturing, network control, robot control, and others, since it is easy to integrate, adapt, maintain and extend to new domains.
Submitted 2 March, 2019;
originally announced March 2019.
-
Multi-Winner Contests for Strategic Diffusion in Social Networks
Authors:
Wen Shen,
Yang Feng,
Cristina V. Lopes
Abstract:
Strategic diffusion encourages participants to take active roles in promoting stakeholders' agendas by rewarding successful referrals. As social media continues to transform the way people communicate, strategic diffusion has become a powerful tool for stakeholders to influence people's decisions or behaviors for desired objectives. Existing reward mechanisms for strategic diffusion are usually either vulnerable to false-name attacks or not individually rational for participants that have made successful referrals. Here, we introduce a novel multi-winner contests (MWC) mechanism for strategic diffusion in social networks. The MWC mechanism satisfies several desirable properties, including false-name-proofness, individual rationality, budget constraint, monotonicity, and subgraph constraint. Numerical experiments on four real-world social network datasets demonstrate that stakeholders can significantly boost participants' aggregated efforts with proper design of competitions. Our work sheds light on how to design manipulation-resistant mechanisms with appropriate contests.
Submitted 10 March, 2019; v1 submitted 13 November, 2018;
originally announced November 2018.
-
An Overview of Blockchain Integration with Robotics and Artificial Intelligence
Authors:
Vasco Lopes,
Luís A. Alexandre
Abstract:
Blockchain technology is growing every day at a fast-paced rhythm, and it is possible to integrate it with many systems, namely Robotics with AI services. However, this is still a recent field and there isn't yet a clear understanding of what it could potentially become. In this paper, we conduct an overview of many different methods and platforms that try to leverage the power of blockchain in robotic systems, to improve AI services, or to solve problems that are present in the major blockchains, which can lead to the ability to create robotic systems with increased capabilities and security. We present an overview, discuss the methods and conclude the paper with our view on the future of the integration of these technologies.
Submitted 30 September, 2018;
originally announced October 2018.
-
The Java Build Framework: Large Scale Compilation
Authors:
Pedro Martins,
Rohan Achar,
Cristina V. Lopes
Abstract:
Large repositories of source code for research tend to limit their utility to static analysis of the code, as they give no guarantees on whether the projects are compilable, much less runnable in any way. The immediate consequence of the lack of large compilable and runnable datasets is that research that requires such properties does not generalize beyond small benchmarks. We present the Java Build Framework, a method and tool capable of automatically compiling a large percentage of Java projects available in open source repositories like GitHub. Two elements are at the core: a very large repository of JAR files, and techniques of resolution of compilation faults and dependencies.
Submitted 12 April, 2018;
originally announced April 2018.
-
Toward Understanding the Impact of User Participation in Autonomous Ridesharing Systems
Authors:
Wen Shen,
Rohan Achar,
Cristina V. Lopes
Abstract:
Autonomous ridesharing systems (ARS) promise many societal and environmental benefits, including decreased accident rates, reduced energy consumption and pollutant emissions, and diminished land use for parking. To unleash ARS' potential, stakeholders must understand how the degree of passenger participation influences the ridesharing systems' efficiency. To date, however, a careful study that quantifies the impact of user participation on ARS' performance is missing. Here, we present the first simulation analysis to investigate how and to what extent user participation affects the efficiency of ARS. We demonstrate how specific configurations (e.g., fleet size, vehicle capacity, and the maximum waiting time) of a system can be identified to counter the performance loss due to users' uncoordinated behavior on ridesharing participation. Our results indicate that stakeholders of ARS should base decisions regarding system configurations on insights from data-driven simulations and make tradeoffs between system efficiency and price of anarchy for desired outcomes.
Submitted 28 March, 2018; v1 submitted 17 March, 2018;
originally announced March 2018.
-
Information Design in Crowdfunding under Thresholding Policies
Authors:
Wen Shen,
Jacob W. Crandall,
Ke Yan,
Cristina V. Lopes
Abstract:
Crowdfunding has emerged as a prominent way for entrepreneurs to secure funding without sophisticated intermediation. In crowdfunding, an entrepreneur often has to decide how to disclose the campaign status in order to collect as many contributions as possible. Such decisions are difficult to make primarily due to incomplete information. We propose information design as a tool to help the entrepreneur to improve revenue by influencing backers' beliefs. We introduce a heuristic algorithm to dynamically compute information-disclosure policies for the entrepreneur, followed by an empirical evaluation to demonstrate its competitiveness over the widely-adopted immediate-disclosure policy. Our results demonstrate that the immediate-disclosure policy is not optimal when backers follow thresholding policies despite its ease of implementation. With appropriate heuristics, an entrepreneur can benefit from dynamic information disclosure. Our work sheds light on information design in a dynamic setting where agents make decisions using thresholding policies.
Submitted 28 March, 2018; v1 submitted 12 September, 2017;
originally announced September 2017.
-
cf4ocl: a C framework for OpenCL
Authors:
Nuno Fachada,
Vitor V. Lopes,
Rui C. Martins,
Agostinho C. Rosa
Abstract:
OpenCL is an open standard for parallel programming of heterogeneous compute devices, such as GPUs, CPUs, DSPs or FPGAs. However, the verbosity of its C host API can hinder application development. In this paper we present cf4ocl, a software library for rapid development of OpenCL programs in pure C. It aims to reduce the verbosity of the OpenCL API, offering straightforward memory management, integrated profiling of events (e.g., kernel execution and data transfers), a simple but extensible device selection mechanism, and user-friendly error management. We compare two versions of a conceptual application example, one based on cf4ocl, the other developed directly with the OpenCL host API. Results show that the former is simpler to implement and offers more features, at the cost of an effectively negligible computational overhead. Additionally, the tools provided with cf4ocl allowed for a quick analysis on how to optimize the application.
Submitted 12 May, 2017; v1 submitted 5 September, 2016;
originally announced September 2016.
-
Collective Intelligence for Smarter API Recommendations in Python
Authors:
Andrea Renika D'Souza,
Di Yang,
Cristina V. Lopes
Abstract:
Software developers use Application Programming Interfaces (APIs) of libraries and frameworks extensively while writing programs. In this context, the recommendations provided in code completion pop-ups help developers choose the desired methods. The candidate lists recommended by these tools, however, tend to be large, ordered alphabetically and sometimes even incomplete. A fair amount of work has been done recently to improve the relevance of these code completion results, especially for statically typed languages like Java. However, these proposed techniques rely on the static type of the object and are therefore inapplicable for a dynamically typed language like Python. In this paper, we present PyReco, an intelligent code completion system for Python which uses the mined API usages from open source repositories to order the results based on relevance rather than the conventional alphabetic order. To recommend suggestions that are relevant for a working context, a nearest neighbor classifier is used to identify the best matching usage among all the extracted usage patterns. To evaluate the effectiveness of our system, the code completion queries are automatically extracted from projects and tested quantitatively using a ten-fold cross validation technique. The evaluation shows that our approach outperforms the alphabetically ordered API recommendation systems in recommending APIs for standard as well as third-party libraries.
Submitted 31 August, 2016;
originally announced August 2016.
-
SimOutUtils - Utilities for analyzing time series simulation output
Authors:
Nuno Fachada,
Vitor V. Lopes,
Rui C. Martins,
Agostinho C. Rosa
Abstract:
SimOutUtils is a suite of MATLAB/Octave functions for studying and analyzing time series-like output from stochastic simulation models. More specifically, SimOutUtils allows modelers to study and visualize simulation output dynamics, perform distributional analysis of output statistical summaries, as well as compare these summaries in order to assert the statistical equivalence of two or more model implementations. Additionally, the provided functions are able to produce publication quality figures and tables showcasing results from the specified simulation output studies.
Submitted 6 January, 2017; v1 submitted 22 March, 2016;
originally announced March 2016.
-
micompr: An R Package for Multivariate Independent Comparison of Observations
Authors:
Nuno Fachada,
João Rodrigues,
Vitor V. Lopes,
Rui C. Martins,
Agostinho C. Rosa
Abstract:
The R package micompr implements a procedure for assessing if two or more multivariate samples are drawn from the same distribution. The procedure uses principal component analysis to convert multivariate observations into a set of linearly uncorrelated statistical measures, which are then compared using a number of statistical methods. This technique is independent of the distributional properties of samples and automatically selects features that best explain their differences. The procedure is appropriate for comparing samples of time series, images, spectrometric measures or similar high-dimension multivariate observations.
Submitted 21 February, 2017; v1 submitted 22 March, 2016;
originally announced March 2016.
-
An Online Mechanism for Ridesharing in Autonomous Mobility-on-Demand Systems
Authors:
Wen Shen,
Cristina V. Lopes,
Jacob W. Crandall
Abstract:
With proper management, Autonomous Mobility-on-Demand (AMoD) systems have great potential to satisfy the transport demands of urban populations by providing safe, convenient, and affordable ridesharing services. Meanwhile, such systems can substantially decrease private car ownership and use, and thus significantly reduce traffic congestion, energy consumption, and carbon emissions. To achieve this objective, an AMoD system requires private information about the demand from passengers. However, due to self-interestedness, passengers are unlikely to cooperate with the service providers in this regard. Therefore, an online mechanism is desirable if it incentivizes passengers to truthfully report their actual demand. For the purpose of promoting ridesharing, we hereby introduce a posted-price, integrated online ridesharing mechanism (IORS) that satisfies desirable properties such as ex-post incentive compatibility, individual rationality, and budget-balance. Numerical results indicate the competitiveness of IORS compared with two benchmarks, namely the optimal assignment and an offline, auction-based mechanism.
Submitted 1 March, 2017; v1 submitted 7 March, 2016;
originally announced March 2016.
-
SourcererCC: Scaling Code Clone Detection to Big Code
Authors:
Hitesh Sajnani,
Vaibhav Saini,
Jeffrey Svajlenko,
Chanchal K. Roy,
Cristina V. Lopes
Abstract:
Despite a decade of active research, there is a marked lack of clone detectors that scale to very large repositories of source code, in particular for detecting near-miss clones where significant editing activities may take place in the cloned code. We present SourcererCC, a token-based clone detector that targets three clone types, and exploits an index to achieve scalability to large inter-project repositories using a standard workstation. SourcererCC uses an optimized inverted-index to quickly query the potential clones of a given code block. Filtering heuristics based on token ordering are used to significantly reduce the size of the index, the number of code-block comparisons needed to detect the clones, as well as the number of token comparisons needed to judge a potential clone.
We evaluate the scalability, execution time, recall and precision of SourcererCC, and compare it to four publicly available and state-of-the-art tools. To measure recall, we use two recent benchmarks, (1) a large benchmark of real clones, BigCloneBench, and (2) a Mutation/Injection-based framework of thousands of fine-grained artificial clones. We find SourcererCC has both high recall and precision, and is able to scale to a large inter-project repository (250MLOC) using a standard workstation.
Submitted 20 December, 2015;
originally announced December 2015.
-
Model-independent comparison of simulation output
Authors:
Nuno Fachada,
Vitor V. Lopes,
Rui C. Martins,
Agostinho C. Rosa
Abstract:
Computational models of complex systems are usually elaborate and sensitive to implementation details, characteristics which often affect their verification and validation. Model replication is a possible solution to this issue. It avoids biases associated with the language or toolkit used to develop the original model, not only promoting its verification and validation, but also fostering the credibility of the underlying conceptual model. However, different model implementations must be compared to assess their equivalence. The problem is, given two or more implementations of a stochastic model, how to prove that they display similar behavior? In this paper, we present a model comparison technique, which uses principal component analysis to convert simulation output into a set of linearly uncorrelated statistical measures, analyzable in a consistent, model-independent fashion. It is appropriate for ascertaining distributional equivalence of a model replication with its original implementation. Besides model-independence, this technique has three other desirable properties: a) it automatically selects output features that best explain implementation differences; b) it does not depend on the distributional properties of simulation output; and, c) it simplifies the modelers' work, as it can be used directly on simulation outputs. The proposed technique is shown to produce similar results to the manual or empirical selection of output features when applied to a well-studied reference model.
Submitted 6 January, 2017; v1 submitted 30 September, 2015;
originally announced September 2015.