Showing 1–50 of 56 results for author: Lopes, V

  1. arXiv:2506.03101

    cs.CL

    Beyond Text Compression: Evaluating Tokenizers Across Scales

    Authors: Jonas F. Lotz, António V. Lopes, Stephan Peitz, Hendra Setiawan, Leonardo Emili

    Abstract: The choice of tokenizer can profoundly impact language model performance, yet accessible and reliable evaluations of tokenizer quality remain an open challenge. Inspired by scaling consistency, we show that smaller models can accurately predict significant differences in tokenizer impact on larger models at a fraction of the compute cost. By systematically evaluating both English-centric and multi…

    Submitted 3 June, 2025; originally announced June 2025.

    Comments: ACL 2025

  2. arXiv:2504.12549

    cs.CL cs.AI cs.LG

    Memorization: A Close Look at Books

    Authors: Iris Ma, Ian Domingo, Alberto Krone-Martins, Pierre Baldi, Cristina V. Lopes

    Abstract: To what extent can entire books be extracted from LLMs? Using the Llama 3 70B family of models, and the "prefix-prompting" extraction technique, we were able to auto-regressively reconstruct, with a very high level of similarity, one entire book (Alice's Adventures in Wonderland) from just the first 500 tokens. We were also able to obtain high extraction rates on several other books, piece-wise. H…

    Submitted 16 April, 2025; originally announced April 2025.

  3. arXiv:2503.20362

    cs.CV

    Self-ReS: Self-Reflection in Large Vision-Language Models for Long Video Understanding

    Authors: Joao Pereira, Vasco Lopes, David Semedo, Joao Neves

    Abstract: Large Vision-Language Models (LVLMs) demonstrate remarkable performance in short-video tasks such as video question answering, but struggle in long-video understanding. The linear frame sampling strategy, conventionally used by LVLMs, fails to account for the non-linear distribution of key events in video data, often introducing redundant or irrelevant information in longer contexts while risking…

    Submitted 26 March, 2025; originally announced March 2025.

  4. arXiv:2410.21113

    cs.CV cs.CL

    Zero-Shot Action Recognition in Surveillance Videos

    Authors: Joao Pereira, Vasco Lopes, David Semedo, Joao Neves

    Abstract: The growing demand for surveillance in public spaces presents significant challenges due to the shortage of human resources. Current AI-based video surveillance systems heavily rely on core computer vision models that require extensive finetuning, which is particularly difficult in surveillance settings due to limited datasets and difficult settings (viewpoint, low quality, etc.). In this work, we…

    Submitted 18 March, 2025; v1 submitted 28 October, 2024; originally announced October 2024.

  5. arXiv:2407.15718

    cs.CY cs.AI cs.HC cs.IR cs.SE

    Integrating AI Tutors in a Programming Course

    Authors: Iris Ma, Alberto Krone Martins, Cristina Videira Lopes

    Abstract: RAGMan is an LLM-powered tutoring system that can support a variety of course-specific and homework-specific AI tutors. RAGMan leverages Retrieval Augmented Generation (RAG), as well as strict instructions, to ensure the alignment of the AI tutors' responses. By using RAGMan's AI tutors, students receive assistance with their specific homework assignments without directly obtaining solutions, whil…

    Submitted 13 July, 2024; originally announced July 2024.

    Comments: Accepted at SIGCSE Virtual 2024

  6. arXiv:2402.00247

    cs.SE cs.PL

    Towards AI-Assisted Synthesis of Verified Dafny Methods

    Authors: Md Rakib Hossain Misu, Cristina V. Lopes, Iris Ma, James Noble

    Abstract: Large language models show great promise in many domains, including programming. A promise is easy to make but hard to keep, and language models often fail to keep their promises, generating erroneous code. A promising avenue to keep models honest is to incorporate formal verification: generating programs' specifications as well as code so that the code can be proved correct with respect to the sp…

    Submitted 10 June, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

    Comments: This is an author provided preprint. The final version will be published at Proc. ACM Softw. Eng; FSE 2024, in July 2024

  7. arXiv:2401.17622

    cs.SE

    Commit Messages in the Age of Large Language Models

    Authors: Cristina V. Lopes, Vanessa I. Klotzman, Iris Ma, Iftekar Ahmed

    Abstract: Commit messages are explanations of changes made to a codebase that are stored in version control systems. They help developers understand the codebase as it evolves. However, writing commit messages can be tedious and inconsistent among developers. To address this issue, researchers have tried using different methods to automatically generate commit messages, including rule-based, retrieval-based…

    Submitted 1 February, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

    Comments: Submitted to FSE 23 on Feb 6 2023

  8. arXiv:2309.01826

    cs.CL cs.AI

    One Wide Feedforward is All You Need

    Authors: Telmo Pessoa Pires, António V. Lopes, Yannick Assogba, Hendra Setiawan

    Abstract: The Transformer architecture has two main non-embedding components: Attention and the Feed Forward Network (FFN). Attention captures interdependencies between words regardless of their position, while the FFN non-linearly transforms each input token independently. In this work we explore the role of the FFN, and find that despite taking up a significant fraction of the model's parameters, it is hi…

    Submitted 21 October, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: Accepted at WMT23 (EMNLP 2023)

  9. Using Large Language Models to Generate JUnit Tests: An Empirical Study

    Authors: Mohammed Latif Siddiq, Joanna C. S. Santos, Ridwanul Hasan Tanvir, Noshin Ulfat, Fahmid Al Rifat, Vinicius Carvalho Lopes

    Abstract: A code generation model generates code by taking a prompt from a code comment, existing code, or a combination of both. Although code generation models (e.g., GitHub Copilot) are increasingly being adopted in practice, it is unclear whether they can successfully be used for unit test generation without fine-tuning for a strongly typed language like Java. To fill this gap, we investigated how well…

    Submitted 8 March, 2024; v1 submitted 30 April, 2023; originally announced May 2023.

    Comments: Accepted in Research Track of The 28th International Conference on Evaluation and Assessment in Software Engineering (EASE 2024)

    Journal ref: The 28th International Conference on Evaluation and Assessment in Software Engineering (EASE), 2024, 313-322

  10. Improving the Quality of Commit Messages in Students' Projects

    Authors: Iris Ma, Cristina V. Lopes

    Abstract: Commit messages play a crucial role in collaborative software development. They provide a clear and concise description of the changes made to the source code. However, many commit messages among students' projects lack useful information. This is a concern, as low-quality commit messages can negatively impact communication of software development and future maintenance. To address this issue, thi…

    Submitted 26 April, 2023; originally announced April 2023.

    Comments: Accepted at ICSE SEENG Workshop 2023

  11. arXiv:2303.16938

    cs.LG cs.AI cs.CV stat.ML

    Are Neural Architecture Search Benchmarks Well Designed? A Deeper Look Into Operation Importance

    Authors: Vasco Lopes, Bruno Degardin, Luís A. Alexandre

    Abstract: Neural Architecture Search (NAS) benchmarks significantly improved the capability of developing and comparing NAS methods while at the same time drastically reduced the computational overhead by providing meta-information about thousands of trained neural networks. However, tabular benchmarks have several drawbacks that can hinder fair comparisons and provide unreliable results. These usually focu…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: 15 pages; 11 figures; 10 tables

  12. Black Boxes, White Noise: Similarity Detection for Neural Functions

    Authors: Farima Farmahinifarahani, Cristina V. Lopes

    Abstract: Similarity, or clone, detection has important applications in copyright violation, software theft, code search, and the detection of malicious components. There is now a good number of open source and proprietary clone detectors for programs written in traditional programming languages. However, the increasing adoption of deep learning models in software poses a challenge to these tools: these mod…

    Submitted 20 February, 2023; originally announced February 2023.

    Journal ref: The Art, Science, and Engineering of Programming, 2023, Vol. 7, Issue 3, Article 12

  13. arXiv:2209.06906

    math.DS cs.CE nlin.CD physics.class-ph

    Nonlinear dynamics of asymmetric bistable energy harvesters

    Authors: João Pedro Norenberg, Roberto Luo, Vinicius Goncaalves Lopes, João Victor L. L. Peterson, Americo Cunha Jr

    Abstract: The paper investigates the effects of asymmetries on the dynamics of a nonlinear vibration energy harvester. The asymmetric system performance is compared with symmetric ones. Different asymmetry levels on restoring force and gravity action are investigated from a system-sloping angle variation. Bifurcation diagrams and basins of attraction are used to examine the local and global characteristics underlying d…

    Submitted 9 June, 2023; v1 submitted 20 August, 2022; originally announced September 2022.

    Report number: Volume 257, 1 November 2023, 108542 MSC Class: 37N15 ACM Class: I.6.3

    Journal ref: International Journal of Mechanical Sciences 2023

  14. arXiv:2208.06475

    cs.NE cs.AI cs.CV cs.LG

    Guided Evolutionary Neural Architecture Search With Efficient Performance Estimation

    Authors: Vasco Lopes, Miguel Santos, Bruno Degardin, Luís A. Alexandre

    Abstract: Neural Architecture Search (NAS) methods have been successfully applied to image tasks with excellent results. However, NAS methods are often complex and tend to converge to local minima as soon as generated architectures seem to yield good results. This paper proposes GEA, a novel approach for guided NAS. GEA guides the evolution by exploring the search space by generating and evaluating several…

    Submitted 22 July, 2022; originally announced August 2022.

    Comments: 10 pages, 7 figures, 4 tables. arXiv admin note: substantial text overlap with arXiv:2110.15232

  15. arXiv:2203.05508

    cs.CV cs.AI cs.LG cs.NE

    Towards Less Constrained Macro-Neural Architecture Search

    Authors: Vasco Lopes, Luís A. Alexandre

    Abstract: Networks found with Neural Architecture Search (NAS) achieve state-of-the-art performance in a variety of tasks, out-performing human-designed networks. However, most NAS methods heavily rely on human-defined assumptions that constrain the search: architecture's outer-skeletons, number of layers, parameter heuristics and search spaces. Additionally, common search spaces consist of repeatable modul…

    Submitted 6 January, 2023; v1 submitted 10 March, 2022; originally announced March 2022.

    Comments: 13 pages double-column, 9 tables, 6 figures

  16. arXiv:2110.15232

    cs.LG cs.AI cs.CV stat.ML

    Guided Evolution for Neural Architecture Search

    Authors: Vasco Lopes, Miguel Santos, Bruno Degardin, Luís A. Alexandre

    Abstract: Neural Architecture Search (NAS) methods have been successfully applied to image tasks with excellent results. However, NAS methods are often complex and tend to converge to local minima as soon as generated architectures seem to yield good results. In this paper, we propose G-EA, a novel approach for guided evolutionary NAS. The rationale behind G-EA is to explore the search space by generating…

    Submitted 28 October, 2021; originally announced October 2021.

    Comments: Paper accepted at 35th Conference on Neural Information Processing Systems (NeurIPS) - New In ML. 9 pages, 2 figures, 1 table

  17. arXiv:2110.11191

    cs.CV cs.AI cs.LG

    Generative Adversarial Graph Convolutional Networks for Human Action Synthesis

    Authors: Bruno Degardin, João Neves, Vasco Lopes, João Brito, Ehsan Yaghoubi, Hugo Proença

    Abstract: Synthesising the spatial and temporal dynamics of the human body skeleton remains a challenging task, not only in terms of the quality of the generated shapes, but also of their diversity, particularly to synthesise realistic body movements of a specific action (action conditioning). In this paper, we propose Kinetic-GAN, a novel architecture that leverages the benefits of Generative Adversarial N…

    Submitted 25 October, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

    Comments: Published as a conference paper at WACV 2022. Code and pretrained models available at https://github.com/DegardinBruno/Kinetic-GAN

  18. arXiv:2107.14045

    cond-mat.stat-mech cs.CE physics.class-ph stat.AP

    The nonlinear dynamics of a bistable energy harvesting system with colored noise disturbances

    Authors: Vinicius Gonçalves Lopes, João Victor L. L. Peterson, Americo Cunha Jr

    Abstract: This paper deals with the nonlinear stochastic dynamics of a piezoelectric energy harvesting system subjected to a harmonic external excitation disturbed by Gaussian colored noise. A parametric analysis is conducted, where the effects of the standard deviation and the correlation time of colored noise on the system response are investigated. The numerical results suggest a strong influence of nois…

    Submitted 14 July, 2021; originally announced July 2021.

    MSC Class: 37N15 ACM Class: I.6.6

    Journal ref: Journal of Computational Interdisciplinary Sciences, vol. 10, pp. 125, 2019

  19. arXiv:2105.06711

    cs.CV

    REGINA - Reasoning Graph Convolutional Networks in Human Action Recognition

    Authors: Bruno Degardin, Vasco Lopes, Hugo Proença

    Abstract: It is known that the kinematics of the human body skeleton reveals valuable information in action recognition. Recently, modeling skeletons as spatio-temporal graphs with Graph Convolutional Networks (GCNs) has been reported to solidly advance the state-of-the-art performance. However, GCN-based approaches exclusively learn from raw skeleton data, and are expected to extract the inherent structura…

    Submitted 14 May, 2021; originally announced May 2021.

  20. arXiv:2103.14100

    cs.SE

    Expanding Frontiers: Settling an Understanding of Systems-of-Information Systems

    Authors: Valdemar Vicente Graciano Neto, Bruno Gabriel Araújo Lebtag, Paulo Gabriel Teixeira, Priscilla Batista, Vinícius Carvalho Lopes, Jamal El-Hachem, Jérémy Buisson, Flavio Oquendo, Juliana Fernandes, Francisco Ferreira, Rodrigo Peireira dos Santos, Davi Viana, Everton Cavalcante, Mohamad Kassab, Ahmad Mohsin, Roberto Oliveira, Vânia Neves, Maria Istela Cagnin, Elisa Yumi Nakagawa

    Abstract: System-of-Systems (SoS) has consolidated itself as a special type of software-intensive systems. As such, subtypes of SoS have also emerged, such as Cyber-Physical SoS (CPSoS) that are formed essentially of cyber-physical constituent systems and Systems-of-Information Systems (SoIS) that contain information systems as their constituents. In contrast to CPSoS that have been investigated and covered…

    Submitted 25 March, 2021; originally announced March 2021.

    Comments: 6 pages, 2 figures, 28 references

  21. EPE-NAS: Efficient Performance Estimation Without Training for Neural Architecture Search

    Authors: Vasco Lopes, Saeid Alirezazadeh, Luís A. Alexandre

    Abstract: Neural Architecture Search (NAS) has shown excellent results in designing architectures for computer vision problems. NAS alleviates the need for human-defined settings by automating architecture design and engineering. However, NAS methods tend to be slow, as they require large amounts of GPU computation. This bottleneck is mainly due to the performance estimation strategy, which requires the eva…

    Submitted 16 February, 2021; originally announced February 2021.

  22. An AutoML-based Approach to Multimodal Image Sentiment Analysis

    Authors: Vasco Lopes, António Gaspar, Luís A. Alexandre, João Cordeiro

    Abstract: Sentiment analysis is a research topic focused on analysing data to extract information related to the sentiment that it causes. Applications of sentiment analysis are wide, ranging from recommendation systems and marketing to customer satisfaction. Recent approaches evaluate textual content using Machine Learning techniques that are trained over large corpora. However, as social media has grown, oth…

    Submitted 16 February, 2021; originally announced February 2021.

  23. Analyzing Dominance Move (MIP-DoM) Indicator for Multi- and Many-objective Optimization

    Authors: Claudio Lucio do Val Lopes, Flávio Vinícius Cruzeiro Martins, Elizabeth Fialho Wanner, Kalyanmoy Deb

    Abstract: Dominance move (DoM) is a binary quality indicator that can be used in multi-objective and many-objective optimization to compare two solution sets obtained from different algorithms. The DoM indicator can differentiate the sets for certain important features, such as convergence, spread, uniformity, and cardinality. DoM does not use any reference, and it has an intuitive and physical meaning, sim…

    Submitted 5 February, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: 15 Pages. Submitted for consideration for publication in the IEEE Transactions on Evolutionary Computation

    Journal ref: IEEE Transactions on Evolutionary Computation 2021

  24. Auto-Classifier: A Robust Defect Detector Based on an AutoML Head

    Authors: Vasco Lopes, Luís A. Alexandre

    Abstract: The dominant approach for surface defect detection is the use of hand-crafted feature-based methods. However, this falls short when the conditions that affect the extracted images vary. So, in this paper, we sought to determine how well several state-of-the-art Convolutional Neural Networks perform in the task of surface defect detection. Moreover, we propose two methods: CNN-Fusion, which fuses the predic…

    Submitted 3 September, 2020; originally announced September 2020.

    Comments: 12 pages, 2 figures. Published in ICONIP2020, proceedings published in the Springer's series of Lecture Notes in Computer Science

  25. arXiv:2007.16149

    cs.LG cs.CV cs.NE stat.ML

    HMCNAS: Neural Architecture Search using Hidden Markov Chains and Bayesian Optimization

    Authors: Vasco Lopes, Luís A. Alexandre

    Abstract: Neural Architecture Search has achieved state-of-the-art performance in a variety of tasks, out-performing human-designed networks. However, many assumptions that require human definition, related to the problems being solved or the models generated, are still needed: final model architectures, number of layers to be sampled, forced operations, small search spaces, which ultimately contributes t…

    Submitted 31 July, 2020; originally announced July 2020.

    Comments: 9 pages, 1 figure, 2 tables, neural architecture search, macro-search

  26. arXiv:2006.10409

    cs.NI

    A softwarized perspective of the 5G networks

    Authors: Kleber Vieira Cardoso, Cristiano Bonato Both, Lúcio Rene Prade, Ciro J. A. Macedo, Victor Hugo L. Lopes

    Abstract: The main goal of this article is to present the fundamental theoretical concepts for the tutorial presented in IEEE NetSoft 2020. The article explores the use of software in the 5G system composed of the Radio Access Network (RAN) and the core components, following the standards defined by 3GPP, particularly Release 15. The article provides a brief overview of mobile cellular networks, includi…

    Submitted 24 August, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: 23 pages (two columns), 23 figures, 3 tables, typos corrected, and for additional material associated to the tutorial, see https://github.com/LABORA-INF-UFG/NetSoft2020-Tutorial4

  27. arXiv:2005.04153

    cs.NE cs.CV cs.LG stat.ML

    A Hybrid Method for Training Convolutional Neural Networks

    Authors: Vasco Lopes, Paulo Fazendeiro

    Abstract: Artificial Intelligence algorithms have been steadily increasing in popularity and usage. Deep Learning allows neural networks to be trained using huge datasets and also removes the need for human-extracted features, as it automates the feature learning process. At the heart of training deep neural networks, such as Convolutional Neural Networks, we find backpropagation, which, by computing the gr…

    Submitted 15 April, 2020; originally announced May 2020.

    Comments: 1 figure, 6 pages

  28. arXiv:2003.05377

    cs.CL cs.IR cs.LG stat.ML

    Brazilian Lyrics-Based Music Genre Classification Using a BLSTM Network

    Authors: Raul de Araújo Lima, Rômulo César Costa de Sousa, Simone Diniz Junqueira Barbosa, Hélio Cortês Vieira Lopes

    Abstract: Organizing songs, albums, and artists into groups with shared similarity can be done with the help of genre labels. In this paper, we present a novel approach for automatically classifying musical genre in Brazilian music using only the song lyrics. This kind of classification remains a challenge in the field of Natural Language Processing. We construct a dataset of 138,368 Brazilian song lyrics distrib…

    Submitted 6 March, 2020; originally announced March 2020.

    Comments: 7 pages, 4 figures, 3 tables

    MSC Class: 68T50(Primary); 68T05 (Secondary) ACM Class: I.2.7; I.2.6

  29. arXiv:2002.10842

    cs.NE

    An Assignment Problem Formulation for Dominance Move Indicator

    Authors: Claudio Lucio do Val Lopes, Flávio Vinícius Cruzeiro Martins, Elizabeth F. Wanner

    Abstract: Dominance move (DoM) is a binary quality indicator to compare solution sets in multiobjective optimization. The indicator allows a more natural and intuitive relation when comparing solution sets. It is Pareto compliant and does not demand any parameters or reference sets. In spite of its advantages, the combinatorial calculation nature is a limitation. The original formulation presents an efficie…

    Submitted 14 May, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: arXiv admin note: text overlap with arXiv:2001.03657

  30. arXiv:2001.03657

    math.OC cs.NE

    Dominance Move calculation using a MIP approach for comparison of multi and many-objective optimization solution sets

    Authors: Claudio Lucio do Val Lopes, Flávio Vinícius Cruzeiro Martins, Elizabeth Fialho Wanner

    Abstract: Dominance move (DoM) is a binary quality indicator that can be used in multiobjective optimization. It can compare solution sets while representing some important features such as convergence, spread, uniformity, and cardinality. DoM has an intuitive concept and considers the minimum move of one set needed to weakly Pareto dominate the other set. Despite the aforementioned properties, DoM is hard…

    Submitted 10 January, 2020; originally announced January 2020.

    Comments: 23 pages, 5 figures

  31. GANprintR: Improved Fakes and Evaluation of the State of the Art in Face Manipulation Detection

    Authors: João C. Neves, Ruben Tolosana, Ruben Vera-Rodriguez, Vasco Lopes, Hugo Proença, Julian Fierrez

    Abstract: The availability of large-scale facial databases, together with the remarkable progress of deep learning technologies, in particular Generative Adversarial Networks (GANs), has led to the generation of extremely realistic fake facial content, raising obvious concerns about the potential for misuse. Such concerns have fostered the research on manipulation detection methods that, contrary to huma…

    Submitted 1 July, 2020; v1 submitted 13 November, 2019; originally announced November 2019.

    Journal ref: IEEE Journal of Selected Topics in Signal Processing, 2020

  32. arXiv:1909.03167

    cs.DC cs.SE

    GoTcha: An Interactive Debugger for GoT-Based Distributed Systems

    Authors: Rohan Achar, Pritha Dawn, Cristina V. Lopes

    Abstract: Debugging distributed systems is hard. Most of the techniques that have been developed for debugging such systems use either extensive model checking, or postmortem analysis of logs and traces. Interactive debugging is typically a tool that is only effective in single threaded and single process applications, and is rarely applied to distributed systems. While the live observation of state changes…

    Submitted 6 September, 2019; originally announced September 2019.

  33. arXiv:1909.01051

    cs.CV cs.LG cs.MA

    MANAS: Multi-Agent Neural Architecture Search

    Authors: Vasco Lopes, Fabio Maria Carlucci, Pedro M Esperança, Marco Singh, Victor Gabillon, Antoine Yang, Hang Xu, Zewei Chen, Jun Wang

    Abstract: The Neural Architecture Search (NAS) problem is typically formulated as a graph search problem where the goal is to learn the optimal operations over edges in order to maximise a graph-level global objective. Due to the large architecture parameter space, efficiency is a key bottleneck preventing NAS from its practical use. In this paper, we address the issue by framing NAS as a multi-agent proble…

    Submitted 12 January, 2023; v1 submitted 3 September, 2019; originally announced September 2019.

  34. arXiv:1907.10352

    cs.CL

    Unbabel's Participation in the WMT19 Translation Quality Estimation Shared Task

    Authors: Fabio Kepler, Jonay Trénous, Marcos Treviso, Miguel Vera, António Góis, M. Amin Farajian, António V. Lopes, André F. T. Martins

    Abstract: We present the contribution of the Unbabel team to the WMT 2019 Shared Task on Quality Estimation. We participated in the word-, sentence-, and document-level tracks, encompassing 3 language pairs: English-German, English-Russian, and English-French. Our submissions build upon the recent OpenKiwi framework: we combine linear, neural, and predictor-estimator systems with new transfer learning approac…

    Submitted 11 September, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

    Comments: In Proceedings of the Fourth Conference on Machine Translation (WMT) 2019: https://www.aclweb.org/anthology/W19-5406/

  35. arXiv:1905.13068

    cs.CL

    Unbabel's Submission to the WMT2019 APE Shared Task: BERT-based Encoder-Decoder for Automatic Post-Editing

    Authors: António V. Lopes, M. Amin Farajian, Gonçalo M. Correia, Jonay Trenous, André F. T. Martins

    Abstract: This paper describes Unbabel's submission to the WMT2019 APE Shared Task for the English-German language pair. Following the recent rise of large, powerful, pre-trained models, we adapt the BERT pretrained model to perform Automatic Post-Editing in an encoder-decoder framework. Analogously to dual-encoder architectures we develop a BERT-based encoder-decoder (BED) model in which a single pretraine…

    Submitted 29 June, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

    Comments: Updated sections 2.2 and 4

  36. arXiv:1905.12111

    cs.SE

    Analyzing and Supporting Adaptation of Online Code Examples

    Authors: Tianyi Zhang, Di Yang, Cristina Videira Lopes, Miryung Kim

    Abstract: Developers often resort to online Q&A forums such as Stack Overflow (SO) for filling their programming needs. Although code examples on those forums are good starting points, they are often incomplete and inadequate for developers' local program contexts; adaptation of those examples is necessary to integrate them into production code. As a consequence, the process of adapting online code examples i…

    Submitted 28 May, 2019; originally announced May 2019.

    Comments: This paper will appear at ICSE 2019

  37. arXiv:1904.06584

    cs.PL cs.DC

    Got: Git, but for Objects

    Authors: Rohan Achar, Cristina V. Lopes

    Abstract: We look at one important category of distributed applications characterized by the existence of multiple collaborating, and competing, components sharing mutable, long-lived, replicated objects. The problem addressed by our work is that of object state synchronization among the components. As an organizing principle for replicated objects, we formally specify the Global Object Tracker (GoT) model,…

    Submitted 13 April, 2019; originally announced April 2019.

  38. arXiv:1903.00660

    cs.RO

    Controlling Robots using Artificial Intelligence and a Consortium Blockchain

    Authors: Vasco Lopes, Luís A. Alexandre, Nuno Pereira

    Abstract: Blockchain is a disruptive technology that is normally used within financial applications; however, it can also be very beneficial in certain robotic contexts, such as when an immutable register of events is required. Among the several properties of Blockchain that can be useful within robotic environments, we find not just immutability but also decentralization of the data, irreversibility, access…

    Submitted 2 March, 2019; originally announced March 2019.

  39. arXiv:1811.05624

    cs.SI cs.MA

    Multi-Winner Contests for Strategic Diffusion in Social Networks

    Authors: Wen Shen, Yang Feng, Cristina V. Lopes

    Abstract: Strategic diffusion encourages participants to take active roles in promoting stakeholders' agendas by rewarding successful referrals. As social media continues to transform the way people communicate, strategic diffusion has become a powerful tool for stakeholders to influence people's decisions or behaviors for desired objectives. Existing reward mechanisms for strategic diffusion are usually ei…

    Submitted 10 March, 2019; v1 submitted 13 November, 2018; originally announced November 2018.

    Comments: 9 pages, 6 figures, In Proceedings of The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19)

  40. arXiv:1810.00329

    cs.AI cs.CR

    An Overview of Blockchain Integration with Robotics and Artificial Intelligence

    Authors: Vasco Lopes, Luís A. Alexandre

    Abstract: Blockchain technology is growing every day at a fast-paced rhythm and it's possible to integrate it with many systems, namely Robotics with AI services. However, this is still a recent field and there isn't yet a clear understanding of what it could potentially become. In this paper, we conduct an overview of many different methods and platforms that try to leverage the power of blockchain into ro…

    Submitted 30 September, 2018; originally announced October 2018.

  41. arXiv:1804.04621

    cs.SE

    The Java Build Framework: Large Scale Compilation

    Authors: Pedro Martins, Rohan Achar, Cristina V. Lopes

    Abstract: Large repositories of source code for research tend to limit their utility to static analysis of the code, as they give no guarantees on whether the projects are compilable, much less runnable in any way. The immediate consequence of the lack of large compilable and runnable datasets is that research that requires such properties does not generalize beyond small benchmarks. We present the Java Bui…

    Submitted 12 April, 2018; originally announced April 2018.

  42. arXiv:1803.06464  [pdf, other]

    cs.CY cs.MA eess.SY

    Toward Understanding the Impact of User Participation in Autonomous Ridesharing Systems

    Authors: Wen Shen, Rohan Achar, Cristina V. Lopes

    Abstract: Autonomous ridesharing systems (ARS) promise many societal and environmental benefits, including decreased accident rates, reduced energy consumption and pollutant emissions, and diminished land use for parking. To unleash ARS' potential, stakeholders must understand how the degree of passenger participation influences the ridesharing systems' efficiency. To date, however, a careful study that qua…

    Submitted 28 March, 2018; v1 submitted 17 March, 2018; originally announced March 2018.

    Comments: 17 pages, 11 figures

    Journal ref: Proceedings of the 2018 Winter Simulation Conference

  43. arXiv:1709.04049  [pdf, other]

    cs.AI cs.CY cs.MA

    Information Design in Crowdfunding under Thresholding Policies

    Authors: Wen Shen, Jacob W. Crandall, Ke Yan, Cristina V. Lopes

    Abstract: Crowdfunding has emerged as a prominent way for entrepreneurs to secure funding without sophisticated intermediation. In crowdfunding, an entrepreneur often has to decide how to disclose the campaign status in order to collect as many contributions as possible. Such decisions are difficult to make primarily due to incomplete information. We propose information design as a tool to help the entrepre…

    Submitted 28 March, 2018; v1 submitted 12 September, 2017; originally announced September 2017.

    Comments: 9 pages, 2 figures, In Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2018)

  44. cf4ocl: a C framework for OpenCL

    Authors: Nuno Fachada, Vitor V. Lopes, Rui C. Martins, Agostinho C. Rosa

    Abstract: OpenCL is an open standard for parallel programming of heterogeneous compute devices, such as GPUs, CPUs, DSPs or FPGAs. However, the verbosity of its C host API can hinder application development. In this paper we present cf4ocl, a software library for rapid development of OpenCL programs in pure C. It aims to reduce the verbosity of the OpenCL API, offering straightforward memory management, int…

    Submitted 12 May, 2017; v1 submitted 5 September, 2016; originally announced September 2016.

    Comments: The peer-reviewed version of this paper is published in Science of Computer Programming at http://www.sciencedirect.com/science/article/pii/S0167642317300540 . This version is typeset by the authors and differs only in pagination and typographical detail

    ACM Class: D.1.3; D.2.2

    Journal ref: Science of Computer Programming, 143, pp. 9-19, 2017

  45. arXiv:1608.08736  [pdf, other]

    cs.SE

    Collective Intelligence for Smarter API Recommendations in Python

    Authors: Andrea Renika D'Souza, Di Yang, Cristina V. Lopes

    Abstract: Software developers use Application Programming Interfaces (APIs) of libraries and frameworks extensively while writing programs. In this context, the recommendations provided in code completion pop-ups help developers choose the desired methods. The candidate lists recommended by these tools, however, tend to be large, ordered alphabetically and sometimes even incomplete. A fair amount of work ha…

    Submitted 31 August, 2016; originally announced August 2016.

    Comments: 10 pages, SCAM 2016

  46. SimOutUtils - Utilities for analyzing time series simulation output

    Authors: Nuno Fachada, Vitor V. Lopes, Rui C. Martins, Agostinho C. Rosa

    Abstract: SimOutUtils is a suite of MATLAB/Octave functions for studying and analyzing time series-like output from stochastic simulation models. More specifically, SimOutUtils allows modelers to study and visualize simulation output dynamics, perform distributional analysis of output statistical summaries, as well as compare these summaries in order to assert the statistical equivalence of two or more mode…

    Submitted 6 January, 2017; v1 submitted 22 March, 2016; originally announced March 2016.

    Comments: The peer-reviewed version of this paper is published in the Journal of Open Research Software at http://doi.org/10.5334/jors.110 . This version is typeset by the authors and differs only in pagination and typographical detail

    MSC Class: 62-07 ACM Class: D.2.4; G.3; I.6.4; I.6.6

    Journal ref: Journal of Open Research Software. 4(1), p.e38, 2016

  47. micompr: An R Package for Multivariate Independent Comparison of Observations

    Authors: Nuno Fachada, João Rodrigues, Vitor V. Lopes, Rui C. Martins, Agostinho C. Rosa

    Abstract: The R package micompr implements a procedure for assessing if two or more multivariate samples are drawn from the same distribution. The procedure uses principal component analysis to convert multivariate observations into a set of linearly uncorrelated statistical measures, which are then compared using a number of statistical methods. This technique is independent of the distributional propertie…

    Submitted 21 February, 2017; v1 submitted 22 March, 2016; originally announced March 2016.

    Comments: The peer-reviewed version of this paper is published in The R Journal at https://journal.r-project.org/archive/2016-2/fachada-rodrigues-lopes-etal.pdf . This version is typeset by the authors and differs only in pagination and typographical detail

    MSC Class: 62-07; 62H15; 62H25 ACM Class: G.3

    Journal ref: The R Journal, 8(2): 405-420 (2016)

  48. arXiv:1603.02208  [pdf, other]

    cs.AI cs.GT

    An Online Mechanism for Ridesharing in Autonomous Mobility-on-Demand Systems

    Authors: Wen Shen, Cristina V. Lopes, Jacob W. Crandall

    Abstract: With proper management, Autonomous Mobility-on-Demand (AMoD) systems have great potential to satisfy the transport demands of urban populations by providing safe, convenient, and affordable ridesharing services. Meanwhile, such systems can substantially decrease private car ownership and use, and thus significantly reduce traffic congestion, energy consumption, and carbon emissions. To achieve thi…

    Submitted 1 March, 2017; v1 submitted 7 March, 2016; originally announced March 2016.

    Journal ref: Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI 2016) pp. 475-481

  49. SourcererCC: Scaling Code Clone Detection to Big Code

    Authors: Hitesh Sajnani, Vaibhav Saini, Jeffrey Svajlenko, Chanchal K. Roy, Cristina V. Lopes

    Abstract: Despite a decade of active research, there is a marked lack in clone detectors that scale to very large repositories of source code, in particular for detecting near-miss clones where significant editing activities may take place in the cloned code. We present SourcererCC, a token-based clone detector that targets three clone types, and exploits an index to achieve scalability to large inter-proje…

    Submitted 20 December, 2015; originally announced December 2015.

    Comments: Accepted for publication at ICSE'16 (preprint, unrevised)

  50. Model-independent comparison of simulation output

    Authors: Nuno Fachada, Vitor V. Lopes, Rui C. Martins, Agostinho C. Rosa

    Abstract: Computational models of complex systems are usually elaborate and sensitive to implementation details, characteristics which often affect their verification and validation. Model replication is a possible solution to this issue. It avoids biases associated with the language or toolkit used to develop the original model, not only promoting its verification and validation, but also fostering the cre…

    Submitted 6 January, 2017; v1 submitted 30 September, 2015; originally announced September 2015.

    Comments: The peer-reviewed version of this paper is published in Simulation Modelling Practice and Theory at http://dx.doi.org/10.1016/j.simpat.2016.12.013 . This version is typeset by the authors and differs only in pagination and typographical detail

    MSC Class: 68U20 ACM Class: D.2.4; I.2.2; I.5.2; I.6.4; I.6.6

    Journal ref: Simulation Modelling Practice and Theory, 72C, pp. 131-149, 2017