Showing 1–50 of 178 results for author: Santos, C

  1. arXiv:2506.06850  [pdf, ps, other]

    cs.CV eess.SP

    Deep Inertial Pose: A deep learning approach for human pose estimation

    Authors: Sara M. Cerqueira, Manuel Palermo, Cristina P. Santos

    Abstract: Inertial-based motion capture systems have been attracting growing attention due to their wearability and unconstrained use. However, accurate human joint estimation demands several complex and expertise-demanding steps, which leads to expensive software such as the state-of-the-art MVN Awinda from Xsens Technologies. This work aims to study the use of Neural Networks to abstract the complex biomecha…

    Submitted 7 June, 2025; originally announced June 2025.

  2. arXiv:2506.04260  [pdf, ps, other]

    cs.CY cs.HC

    Turning to Online Forums for Legal Information: A Case Study of GDPR's Legitimate Interests

    Authors: Lin Kyi, Cristiana Santos, Sushil Ammanaghatta Shivakumar, Franziska Roesner, Asia Biega

    Abstract: Practitioners building online services and tools often turn to online forums such as Reddit, Law Stack Exchange, and Stack Overflow for legal guidance to ensure compliance with the GDPR. The legal information presented in these forums directly impacts present-day industry practitioners' decisions. Online forums can serve as gateways that, depending on the accuracy and quality of the answers provide…

    Submitted 2 June, 2025; originally announced June 2025.

  3. arXiv:2505.14700  [pdf, ps, other]

    cs.LG math.NA stat.ML

    Stochastic Fractional Neural Operators: A Symmetrized Approach to Modeling Turbulence in Complex Fluid Dynamics

    Authors: Rômulo Damasclin Chaves dos Santos, Jorge Henrique de Oliveira Sales

    Abstract: In this work, we introduce a new class of neural network operators designed to handle problems where memory effects and randomness play a central role. These operators merge symmetrized activation functions, Caputo-type fractional derivatives, and…

    Submitted 12 May, 2025; originally announced May 2025.

    Comments: 17 pages

  4. arXiv:2505.12892  [pdf, other]

    cs.CY

    "I will never pay for this" Perception of fairness and factors affecting behaviour on 'pay-or-ok' models

    Authors: Victor Morel, Farzaneh Karegar, Cristiana Santos

    Abstract: The rise of cookie paywalls ('pay-or-ok' models) has prompted growing debates around the right to privacy and data protection, monetisation, and the legitimacy of user consent. Despite their increasing use across sectors, limited research has explored how users perceive these models or what shapes their decisions to either consent to tracking or pay. To address this gap, we conducted four focus gr…

    Submitted 23 May, 2025; v1 submitted 19 May, 2025; originally announced May 2025.

    Comments: Submitted to APF2025, comments welcome

  5. arXiv:2504.11406  [pdf, other]

    cs.CV cs.AI

    Multi-level Cellular Automata for FLIM networks

    Authors: Felipe Crispim Salvagnini, Jancarlo F. Gomes, Cid A. N. Santos, Silvio Jamil F. Guimarães, Alexandre X. Falcão

    Abstract: The necessity of abundant annotated data and complex network architectures presents a significant challenge in deep-learning Salient Object Detection (deep SOD) and across the broader deep-learning landscape. This challenge is particularly acute in medical applications in developing countries with limited computational resources. Combining modern and classical techniques offers a path to maintaini…

    Submitted 15 April, 2025; originally announced April 2025.

  6. arXiv:2504.10294  [pdf, other]

    cs.RO

    Ankle Exoskeletons in Walking and Load-Carrying Tasks: Insights into Biomechanics and Human-Robot Interaction

    Authors: J. F. Almeida, J. André, C. P. Santos

    Abstract: Background: Lower limb exoskeletons can enhance quality of life, but widespread adoption is limited by the lack of frameworks to assess their biomechanical and human-robot interaction effects, which are essential for developing adaptive and personalized control strategies. Understanding impacts on kinematics, muscle activity, and HRI dynamics is key to achieving improved usability of wearable robots…

    Submitted 14 April, 2025; originally announced April 2025.

  7. arXiv:2504.10102  [pdf, other]

    cs.RO eess.SY

    A Human-Sensitive Controller: Adapting to Human Ergonomics and Physical Constraints via Reinforcement Learning

    Authors: Vitor Martins, Sara M. Cerqueira, Mercedes Balcells, Elazer R Edelman, Cristina P. Santos

    Abstract: Work-Related Musculoskeletal Disorders continue to be a major challenge in industrial environments, leading to reduced workforce participation, increased healthcare costs, and long-term disability. This study introduces a human-sensitive robotic system aimed at reintegrating individuals with a history of musculoskeletal disorders into standard job roles, while simultaneously optimizing ergonomic c…

    Submitted 14 April, 2025; originally announced April 2025.

  8. arXiv:2504.03751  [pdf, ps, other]

    cs.LG

    Revolutionizing Fractional Calculus with Neural Networks: Voronovskaya-Damasclin Theory for Next-Generation AI Systems

    Authors: Rômulo Damasclin Chaves dos Santos, Jorge Henrique de Oliveira Sales

    Abstract: This work introduces rigorous convergence rates for neural network operators activated by symmetrized and perturbed hyperbolic tangent functions, utilizing novel Voronovskaya-Damasclin asymptotic expansions. We analyze basic, Kantorovich, and quadrature-type operators over infinite domains, extending classical approximation theory to fractional calculus via Caputo derivatives. Key innovations incl…

    Submitted 1 April, 2025; originally announced April 2025.

    Comments: 16 pages

  9. Impacto de Treinamento em Programação Competitiva no Ensino Médio: Resultados e Desafios

    Authors: Camila da Cruz Santos, Sarah Souto dos Santos, Crishna Irion, Giullia Rodrigues de Menezes, Rafael Dias Araújo, João Henrique de Souza Pereira

    Abstract: This article presents ongoing research aiming to develop an effective methodology for teaching programming, focusing on participation in the Brazilian Informatics Olympiad (OBI), for elementary and high school students. The training conducted with students from the Federal Institute and state schools demonstrates the importance of programming training programs as a way to promote interest in c…

    Submitted 31 January, 2025; originally announced March 2025.

    Comments: 10 pages, in Portuguese, 8 figures and 2 tables

    Journal ref: Simpósio Brasileiro de Informática na Educação (SBIE) 2024

  10. Promoting Gender Equality in Competitive Programming: Strategies and Impacts of Affirmative Actions in Programming Marathons in Brazil

    Authors: Crishna Irion, Camila da Cruz Santos, Luiz Claudio Theodoro, Rafael Dias Araujo, Joao Henrique de Souza Pereira

    Abstract: In the context of Computing, competitive programming is a relevant area that aims to have students, usually in teams, solve programming challenges, developing skills and competencies in the field. However, female participation remains significantly low and notably distant compared to male participation, even with proven intellectual equity between genders. This research aims to present strategies…

    Submitted 21 February, 2025; originally announced February 2025.

    Comments: 12 pages, SBIE (2024), in Portuguese language

  11. arXiv:2501.14249  [pdf, other]

    cs.LG cs.AI cs.CL

    Humanity's Last Exam

    Authors: Long Phan, Alice Gatti, Ziwen Han, Nathaniel Li, Josephina Hu, Hugh Zhang, Chen Bo Calvin Zhang, Mohamed Shaaban, John Ling, Sean Shi, Michael Choi, Anish Agrawal, Arnav Chopra, Adam Khoja, Ryan Kim, Richard Ren, Jason Hausenloy, Oliver Zhang, Mantas Mazeika, Dmitry Dodonov, Tung Nguyen, Jaeho Lee, Daron Anderson, Mikhail Doroshenko, Alun Cennyth Stokes, et al. (1084 additional authors not shown)

    Abstract: Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of…

    Submitted 19 April, 2025; v1 submitted 24 January, 2025; originally announced January 2025.

    Comments: 29 pages, 6 figures

  12. arXiv:2501.10496  [pdf, ps, other]

    stat.ML cs.LG

    Extension of Symmetrized Neural Network Operators with Fractional and Mixed Activation Functions

    Authors: Rômulo Damasclin Chaves dos Santos, Jorge Henrique de Oliveira Sales

    Abstract: We propose a novel extension to symmetrized neural network operators by incorporating fractional and mixed activation functions. This study addresses the limitations of existing models in approximating higher-order smooth functions, particularly in complex and high-dimensional spaces. Our framework introduces a fractional exponent in the activation functions, allowing adaptive non-linear approxima…

    Submitted 17 January, 2025; originally announced January 2025.

    Comments: 13 pages

  13. arXiv:2501.02170  [pdf]

    cs.SE

    An Empirical Study of Safetensors' Usage Trends and Developers' Perceptions

    Authors: Beatrice Casey, Kaia Damian, Andrew Cotaj, Joanna C. S. Santos

    Abstract: Developers are sharing pre-trained Machine Learning (ML) models through a variety of model sharing platforms, such as Hugging Face, in an effort to make ML development more collaborative. To share the models, they must first be serialized. While there are many methods of serialization in Python, most of them are unsafe. To tame this insecurity, Hugging Face released safetensors as a way to mitigat…

    Submitted 3 January, 2025; originally announced January 2025.

  14. arXiv:2411.15414  [pdf, other]

    cs.CR

    Measuring Compliance of Consent Revocation on the Web

    Authors: Gayatri Priyadarsini Kancherla, Nataliia Bielova, Cristiana Santos, Abhishek Bichhawat

    Abstract: The GDPR requires websites to facilitate the right to revoke consent from Web users. While numerous studies measured compliance of consent with the various consent requirements, no prior work has studied consent revocation on the Web. Therefore, it remains unclear how difficult it is to revoke consent on the websites' interfaces, nor whether revoked consent is properly stored and communicated behi…

    Submitted 22 May, 2025; v1 submitted 22 November, 2024; originally announced November 2024.

  15. arXiv:2411.14533  [pdf, other]

    math.OC cs.DM

    The connected Grundy coloring problem: Formulations and a local-search enhanced biased random-key genetic algorithm

    Authors: Mateus C. Silva, Rafael A. Melo, Mauricio G. C. Resende, Marcio C. Santos, Rodrigo F. Toso

    Abstract: Given a graph G=(V,E), a connected Grundy coloring is a proper vertex coloring that can be obtained by a first-fit heuristic on a connected vertex sequence. A first-fit coloring heuristic is one that attributes to each vertex in a sequence the lowest-index color not used for its preceding neighbors. A connected vertex sequence is one in which each element, except for the first one, is connected to…

    Submitted 21 November, 2024; originally announced November 2024.

    ACM Class: F.2; G.2

  16. arXiv:2410.17736  [pdf, other]

    cs.CL

    MojoBench: Language Modeling and Benchmarks for Mojo

    Authors: Nishat Raihan, Joanna C. S. Santos, Marcos Zampieri

    Abstract: The Mojo programming language (PL), recently introduced by Modular, has received significant attention in the scientific community due to its claimed significant speed boost over Python. Despite advancements in code Large Language Models (LLMs) across various PLs, Mojo remains unexplored in this context. To address this gap, we introduce MojoBench, the first framework for Mojo code generation. Mojo…

    Submitted 23 October, 2024; originally announced October 2024.

  17. arXiv:2410.16349  [pdf, other]

    cs.LG cs.HC

    Large Language Models in Computer Science Education: A Systematic Literature Review

    Authors: Nishat Raihan, Mohammed Latif Siddiq, Joanna C. S. Santos, Marcos Zampieri

    Abstract: Large language models (LLMs) are becoming increasingly better at a wide range of Natural Language Processing (NLP) tasks, such as text generation and understanding. Recently, these models have extended their capabilities to coding tasks, bridging the gap between natural languages (NL) and programming languages (PL). Foundational models such as the Generative Pre-trained Transformer (GPT) and LLaMA…

    Submitted 21 October, 2024; originally announced October 2024.

    Comments: Accepted at 56th ACM Technical Symposium on Computer Science Education (SIGCSE TS 2025)

  18. arXiv:2410.16294  [pdf]

    cs.CY

    Emílias Podcast -- Mulheres na Computação: Ampliando Horizontes e Inspirando Carreiras em STEM

    Authors: Nathálya Chaves Dos Santos, Adolfo Gustavo Serra Seca Neto

    Abstract: On October 3, 2024, the "Emílias Podcast -- Women in Computing" celebrates its 5th anniversary, standing out as a platform that promotes the participation of women in STEM (an acronym for "science, technology, engineering, and mathematics"). The podcast aims to provide a space for women in computing and related fields to share their experiences and highlight the various opportunities in Informatio…

    Submitted 6 October, 2024; originally announced October 2024.

    Comments: In Portuguese. Accepted for presentation at XIV SEMINÁRIO DE EXTENSÃO E INOVAÇÃO. UTFPR 2024

  19. arXiv:2410.04490  [pdf]

    cs.CR cs.LG cs.SE

    A Large-Scale Exploit Instrumentation Study of AI/ML Supply Chain Attacks in Hugging Face Models

    Authors: Beatrice Casey, Joanna C. S. Santos, Mehdi Mirakhorli

    Abstract: The development of machine learning (ML) techniques has led to ample opportunities for developers to develop and deploy their own models. Hugging Face serves as an open source platform where developers can share and download other models in an effort to make ML development more collaborative. In order for models to be shared, they first need to be serialized. Certain Python serialization methods a…

    Submitted 6 October, 2024; originally announced October 2024.

  20. arXiv:2409.12862  [pdf, other]

    cs.RO cs.HC

    Extended Reality System for Robotic Learning from Human Demonstration

    Authors: Isaac Ngui, Courtney McBeth, Grace He, André Corrêa Santos, Luciano Soares, Marco Morales, Nancy M. Amato

    Abstract: Many real-world tasks are intuitive for a human to perform, but difficult to encode algorithmically when utilizing a robot to perform the tasks. In these scenarios, robotic systems can benefit from expert demonstrations to learn how to perform each task. In many settings, it may be difficult or unsafe to use a physical robot to provide these demonstrations, for example, considering cooking tasks s…

    Submitted 19 September, 2024; originally announced September 2024.

    Comments: In submission

  21. arXiv:2409.03151  [pdf, other]

    cs.LG stat.ML

    Standing on the shoulders of giants

    Authors: Lucas Felipe Ferraro Cardoso, José de Sousa Ribeiro Filho, Vitor Cirilo Araujo Santos, Regiane Silva Kawasaki Frances, Ronnie Cley de Oliveira Alves

    Abstract: Although fundamental to the advancement of Machine Learning, the classic evaluation metrics extracted from the confusion matrix, such as precision and F1, are limited. Such metrics only offer a quantitative view of the models' performance, without considering the complexity of the data or the quality of the hit. To overcome these limitations, recent research has introduced the use of psychometric…

    Submitted 6 September, 2024; v1 submitted 4 September, 2024; originally announced September 2024.

    Comments: 15 pages, 8 figures, 3 tables, submitted for the BRACIS'24 conference

    ACM Class: I.2.6

  22. arXiv:2404.13002  [pdf, other]

    cs.CV cs.LG

    Towards Robust Ferrous Scrap Material Classification with Deep Learning and Conformal Prediction

    Authors: Paulo Henrique dos Santos, Valéria de Carvalho Santos, Eduardo José da Silva Luz

    Abstract: In the steel production domain, recycling ferrous scrap is essential for environmental and economic sustainability, as it reduces both energy consumption and greenhouse gas emissions. However, the classification of scrap materials poses a significant challenge, requiring advancements in automation technology. Additionally, building trust among human operators is a major obstacle. Traditional appro…

    Submitted 19 April, 2024; originally announced April 2024.

  23. arXiv:2404.10155  [pdf, other]

    cs.SE cs.LG

    The Fault in our Stars: Quality Assessment of Code Generation Benchmarks

    Authors: Mohammed Latif Siddiq, Simantika Dristi, Joy Saha, Joanna C. S. Santos

    Abstract: Large Language Models (LLMs) are gaining popularity among software engineers. A crucial aspect of developing effective code generation LLMs is to evaluate these models using a robust benchmark. Evaluation benchmarks with quality issues can provide a false sense of performance. In this work, we conduct the first-of-its-kind study of the quality of prompts within benchmarks used to compare the perfo…

    Submitted 4 September, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted at the 24th IEEE International Conference on Source Code Analysis and Manipulation (SCAM 2024) Research Track

  24. arXiv:2404.06370  [pdf]

    cs.AI

    Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python

    Authors: Valdecy Pereira, Marcio Pereira Basilio, Carlos Henrique Tarjano Santos

    Abstract: Purpose: Multicriteria decision analysis (MCDA) has become increasingly essential for decision-making in complex environments. In response to this need, the pyDecision library, implemented in Python and available at https://bit.ly/3tLFGtH, has been developed to provide a comprehensive and accessible collection of MCDA methods. Methods: The pyDecision offers 70 MCDA methods, including AHP, TOPSIS,…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 23 pages, 2 figures

  25. arXiv:2403.10646  [pdf]

    cs.LG cs.CR

    A Survey of Source Code Representations for Machine Learning-Based Cybersecurity Tasks

    Authors: Beatrice Casey, Joanna C. S. Santos, George Perry

    Abstract: Machine learning techniques for cybersecurity-related software engineering tasks are becoming increasingly popular. The representation of source code is a key portion of the technique that can impact the way the model is able to learn the features of the source code. With an increasing number of these techniques being developed, it is valuable to see the current state of the field to better unders…

    Submitted 9 April, 2025; v1 submitted 15 March, 2024; originally announced March 2024.

  26. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1112 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 16 December, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  27. Self-calibrated convolution towards glioma segmentation

    Authors: Felipe C. R. Salvagnini, Gerson O. Barbosa, Alexandre X. Falcao, Cid A. N. Santos

    Abstract: Accurate brain tumor segmentation in the early stages of the disease is crucial for the treatment's effectiveness, avoiding exhaustive visual inspection of a qualified specialist on 3D MR brain images of multiple protocols (e.g., T1, T2, T2-FLAIR, T1-Gd). Several networks exist for Glioma segmentation, nnU-Net being one of the best. In this work, we evaluate self-calibrated convolutions in differe…

    Submitted 7 February, 2024; originally announced February 2024.

  28. arXiv:2402.02877  [pdf]

    cs.CR cs.CY cs.HC

    Feedback to the European Data Protection Board's Guidelines 2/2023 on Technical Scope of Art. 5(3) of ePrivacy Directive

    Authors: Cristiana Santos, Nataliia Bielova, Vincent Roca, Mathieu Cunche, Gilles Mertens, Karel Kubicek, Hamed Haddadi

    Abstract: We very much welcome the EDPB's Guidelines. Please find hereunder our feedback to the Guidelines 2/2023 on Technical Scope of Art. 5(3) of ePrivacy Directive. Our comments are presented after a quotation from the proposed text by the EDPB in a box.

    Submitted 5 February, 2024; originally announced February 2024.

  29. arXiv:2402.01223  [pdf, other]

    cs.CR math.NT

    Efficient $(3,3)$-isogenies on fast Kummer surfaces

    Authors: Maria Corte-Real Santos, Craig Costello, Benjamin Smith

    Abstract: We give an alternative derivation of $(N,N)$-isogenies between fast Kummer surfaces which complements existing works based on the theory of theta functions. We use this framework to produce explicit formulae for the case of $N = 3$, and show that the resulting algorithms are more efficient than all prior $(3, 3)$-isogeny algorithms.

    Submitted 4 September, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  30. arXiv:2401.06790  [pdf, other]

    cs.CL cs.AI

    Using Zero-shot Prompting in the Automatic Creation and Expansion of Topic Taxonomies for Tagging Retail Banking Transactions

    Authors: Daniel de S. Moraes, Pedro T. C. Santos, Polyana B. da Costa, Matheus A. S. Pinto, Ivan de J. P. Pinto, Álvaro M. G. da Veiga, Sergio Colcher, Antonio J. G. Busson, Rafael H. Rocha, Rennan Gaio, Rafael Miceli, Gabriela Tourinho, Marcos Rabaioli, Leandro Santos, Fellipe Marques, David Favaro

    Abstract: This work presents an unsupervised method for automatically constructing and expanding topic taxonomies using instruction-based fine-tuned LLMs (Large Language Models). We apply topic modeling and keyword extraction techniques to create initial topic taxonomies and LLMs to post-process the resulting terms and create a hierarchy. To expand an existing taxonomy with new terms, we use zero-shot promp…

    Submitted 11 February, 2024; v1 submitted 7 January, 2024; originally announced January 2024.

  31. arXiv:2401.01200  [pdf, other]

    cs.CV cs.AI

    Skin cancer diagnosis using NIR spectroscopy data of skin lesions in vivo using machine learning algorithms

    Authors: Flavio P. Loss, Pedro H. da Cunha, Matheus B. Rocha, Madson Poltronieri Zanoni, Leandro M. de Lima, Isadora Tavares Nascimento, Isabella Rezende, Tania R. P. Canuto, Luciana de Paula Vieira, Renan Rossoni, Maria C. S. Santos, Patricia Lyra Frasson, Wanderson Romão, Paulo R. Filgueiras, Renato A. Krohling

    Abstract: Skin lesions are classified as benign or malignant. Among the malignant, melanoma is a very aggressive cancer and the major cause of deaths. So, early diagnosis of skin cancer is highly desirable. In the last few years, there has been growing interest in computer aided diagnostic (CAD) using mostly image and clinical data of the lesion. These sources of information present limitations due to their inability…

    Submitted 2 January, 2024; originally announced January 2024.

  32. arXiv:2312.12598  [pdf, other]

    cs.SE cs.AI

    A Case Study on Test Case Construction with Large Language Models: Unveiling Practical Insights and Challenges

    Authors: Roberto Francisco de Lima Junior, Luiz Fernando Paes de Barros Presta, Lucca Santos Borborema, Vanderson Nogueira da Silva, Marcio Leal de Melo Dahia, Anderson Carlos Sousa e Santos

    Abstract: This paper presents a detailed case study examining the application of Large Language Models (LLMs) in the construction of test cases within the context of software engineering. LLMs, characterized by their advanced natural language processing capabilities, are increasingly garnering attention as tools to automate and enhance various aspects of the software development life cycle. Leveraging a cas…

    Submitted 21 December, 2023; v1 submitted 19 December, 2023; originally announced December 2023.

  33. arXiv:2312.08806  [pdf, other]

    cs.CR

    You Can't Trust Your Tag Neither: Privacy Leaks and Potential Legal Violations within the Google Tag Manager

    Authors: Gilles Mertens, Nataliia Bielova, Vincent Roca, Cristiana Santos

    Abstract: Tag Management Systems were developed in order to support website publishers in installing multiple third-party JavaScript scripts (Tags) on their websites. Google developed its own TMS called "Google Tag Manager" (GTM) that is currently present on 42% of the top 1 million most popular websites. However, GTM has not yet been thoroughly evaluated by the academic research community. In this work,…

    Submitted 11 April, 2025; v1 submitted 14 December, 2023; originally announced December 2023.

  34. "It doesn't tell me anything about how my data is used": User Perceptions of Data Collection Purposes

    Authors: Lin Kyi, Abraham Mhaidli, Cristiana Santos, Franziska Roesner, Asia Biega

    Abstract: Data collection purposes and their descriptions are presented on almost all privacy notices under the GDPR, yet there is a lack of research focusing on how effective they are at informing users about data practices. We fill this gap by investigating users' perceptions of data collection purposes and their descriptions, a crucial aspect of informed consent. We conducted 23 semi-structured interview…

    Submitted 6 February, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted for publication at the 2024 ACM Conference on Human Factors in Computing Systems (CHI'24)

  35. arXiv:2311.10768  [pdf, other]

    cs.CL

    Memory Augmented Language Models through Mixture of Word Experts

    Authors: Cicero Nogueira dos Santos, James Lee-Thorp, Isaac Noble, Chung-Ching Chang, David Uthus

    Abstract: Scaling up the number of parameters of language models has proven to be an effective approach to improve performance. For dense models, increasing model size proportionally increases the model's computation footprint. In this work, we seek to aggressively decouple learning capacity and FLOPs through Mixture-of-Experts (MoE) style models with large knowledge-rich vocabulary based routing functions…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: 14 pages

  36. Seneca: Taint-Based Call Graph Construction for Java Object Deserialization

    Authors: Joanna C. S. Santos, Mehdi Mirakhorli, Ali Shokri

    Abstract: Object serialization and deserialization are widely used for storing and preserving objects in files, memory, or database as well as for transporting them across machines, enabling remote interaction among processes and many more. This mechanism relies on reflection, a dynamic language feature that introduces serious challenges for static analyses. Current state-of-the-art call graph construction algorith…

    Submitted 2 September, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted at OOPSLA 2024

  37. SALLM: Security Assessment of Generated Code

    Authors: Mohammed Latif Siddiq, Joanna C. S. Santos, Sajith Devareddy, Anna Muller

    Abstract: With the growing popularity of Large Language Models (LLMs) in software engineers' daily practices, it is important to ensure that the code generated by these tools is not only functionally correct but also free of vulnerabilities. Although LLMs can help developers to be more productive, prior empirical studies have shown that LLMs can generate insecure code. There are two contributing factors to…

    Submitted 4 September, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted at the 6th International Workshop on Automated and verifiable Software sYstem DEvelopment (ASYDE) with ASE Conference 2024

    Journal ref: 39th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW '24), October 27-November 1, 2024, Sacramento, CA, USA, ACM, New York, NY, USA, 12 pages

  38. Predictive Maintenance Model Based on Anomaly Detection in Induction Motors: A Machine Learning Approach Using Real-Time IoT Data

    Authors: Sergio F. Chevtchenko, Monalisa C. M. dos Santos, Diego M. Vieira, Ricardo L. Mota, Elisson Rocha, Bruna V. Cruz, Danilo Araújo, Ermeson Andrade

    Abstract: With the support of Internet of Things (IoT) devices, it is possible to acquire data from degradation phenomena and design data-driven models to perform anomaly detection in industrial equipment. This approach not only identifies potential anomalies but can also serve as a first step toward building predictive maintenance policies. In this work, we demonstrate a novel anomaly detection system on i…

    Submitted 15 October, 2023; originally announced October 2023.

  39. arXiv:2310.07671  [pdf, other]

    cs.CE cond-mat.mtrl-sci

    Discovery of Novel Reticular Materials for Carbon Dioxide Capture using GFlowNets

    Authors: Flaviu Cipcigan, Jonathan Booth, Rodrigo Neumann Barros Ferreira, Carine Ribeiro dos Santos, Mathias Steiner

    Abstract: Artificial intelligence holds promise to improve materials discovery. GFlowNets are an emerging deep learning algorithm with many applications in AI-assisted discovery. By using GFlowNets, we generate porous reticular materials, such as metal organic frameworks and covalent organic frameworks, for applications in carbon dioxide capture. We introduce a new Python package (matgfn) to train and sampl…

    Submitted 16 October, 2023; v1 submitted 11 October, 2023; originally announced October 2023.

    Journal ref: Digital Discovery, 2024, 3, 449-455

  40. arXiv:2309.15207  [pdf, other]

    cs.LG

    Balancing Computational Efficiency and Forecast Error in Machine Learning-based Time-Series Forecasting: Insights from Live Experiments on Meteorological Nowcasting

    Authors: Elin Törnquist, Wagner Costa Santos, Timothy Pogue, Nicholas Wingle, Robert A. Caulk

    Abstract: Machine learning for time-series forecasting remains a key area of research. Despite successful application of many machine learning techniques, relating computational efficiency to forecast error remains an under-explored domain. This paper addresses this topic through a series of real-time experiments to quantify the relationship between computational cost and forecast error using meteorological…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: 26 pages

    ACM Class: I.2; J.2

  41. Enhancing E-Learning System Through Learning Management System (LMS) Technologies: Reshape The Learner Experience

    Authors: Cecilia P. Abaricia, Manuel Luis C. Delos Santos

    Abstract: This paper aims to determine how the LMS Web portal application reshapes the learner experience through the developed E-Learning Management System using Data Mining Algorithm. The methodology that the researchers used is descriptive research involving the interpretation of the meaning or significance of what is described. Gather data from questionnaires, surveys, observations concerned with the…

    Submitted 31 August, 2023; originally announced September 2023.

    Comments: 14 pages, 6 figures, 2 Tables, Special Issue on International Research Conference on Computer Engineering and Technology Education 2023 (IRCCETE 2023)

    Report number: ISSN print: 2546-0552; ISSN online: 2546-115X

    Journal ref: International Journal of Computing Sciences Research (IJCSR), Volume 7, pp. 2066-2079, Published on April 29, 2023

  42. Legitimate Interest is the New Consent -- Large-Scale Measurement and Legal Compliance of IAB Europe TCF Paywalls

    Authors: Victor Morel, Cristiana Santos, Viktor Fredholm, Adam Thunberg

    Abstract: Cookie paywalls allow visitors of a website to access its content only after they make a choice between paying a fee or accept tracking. European Data Protection Authorities (DPAs) recently issued guidelines and decisions on paywalls lawfulness, but it is yet unknown whether websites comply with them. We study in this paper the prevalence of cookie paywalls on the top one million websites using an…

    Submitted 13 October, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: Accepted for publication at WPES2023, minor modifications following feedback from the community

  43. An Ontology of Dark Patterns Knowledge: Foundations, Definitions, and a Pathway for Shared Knowledge-Building

    Authors: Colin M. Gray, Cristiana Santos, Nataliia Bielova, Thomas Mildner

    Abstract: Deceptive and coercive design practices are increasingly used by companies to extract profit, harvest data, and limit consumer choice. Dark patterns represent the most common contemporary amalgamation of these problematic practices, connecting designers, technologists, scholars, regulators, and legal professionals in transdisciplinary dialogue. However, a lack of universally accepted definitions a…

    Submitted 18 September, 2023; originally announced September 2023.

    Journal ref: CHI '24: Proceedings of the CHI Conference on Human Factors in Computing Systems May 2024

  44. Improving Image Classification of Knee Radiographs: An Automated Image Labeling Approach

    Authors: Jikai Zhang, Carlos Santos, Christine Park, Maciej Mazurowski, Roy Colglazier

    Abstract: Large numbers of radiographic images are available in knee radiology practices which could be used for training of deep learning models for diagnosis of knee abnormalities. However, those images do not typically contain readily available labels due to limitations of human annotations. The purpose of our study was to develop an automated labeling approach that improves the image classification mode…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: This is the preprint version

  45. ICARUS: An Android-Based Unmanned Aerial Vehicle (UAV) Search and Rescue Eye in the Sky

    Authors: Manuel Luis C. Delos Santos, Jerum B. Dasalla, Jomar C. Feliciano, Dustin Red B. Cabatay

    Abstract: The purpose of this paper is to develop an unmanned aerial vehicle (UAV) using a quadcopter with the capability of video surveillance, map coordinates, a deployable parachute with a medicine kit or a food pack as a payload, a collision warning system, remotely controlled, integrated with an android application to assist in search and rescue operations. Applied research for the development of the…

    Submitted 28 August, 2023; originally announced August 2023.

    Comments: 15 pages, 14 figures, Special Issue: IRCCETE 2023

    Report number: ISSN print: 2546-0552; ISSN online: 2546-115X

    Journal ref: International Journal of Computing Sciences Research (IJCSR), Volume 7, pp. 2272-2286, July 14, 2023

  46. Anomaly Detection in Industrial Machinery using IoT Devices and Machine Learning: a Systematic Mapping

    Authors: Sérgio F. Chevtchenko, Elisson da Silva Rocha, Monalisa Cristina Moura Dos Santos, Ricardo Lins Mota, Diego Moura Vieira, Ermeson Carneiro de Andrade, Danilo Ricardo Barbosa de Araújo

    Abstract: Anomaly detection is critical in the smart industry for preventing equipment failure, reducing downtime, and improving safety. Internet of Things (IoT) has enabled the collection of large volumes of data from industrial machinery, providing a rich source of information for Anomaly Detection. However, the volume and complexity of data generated by the Internet of Things ecosystems make it difficult…

    Submitted 14 November, 2023; v1 submitted 28 July, 2023; originally announced July 2023.

  47. arXiv:2307.10018  [pdf, other]

    cs.RO cs.AI

    RobôCIn Small Size League Extended Team Description Paper for RoboCup 2023

    Authors: Aline Lima de Oliveira, Cauê Addae da Silva Gomes, Cecília Virginia Santos da Silva, Charles Matheus de Sousa Alves, Danilo Andrade Martins de Souza, Driele Pires Ferreira Araújo Xavier, Edgleyson Pereira da Silva, Felipe Bezerra Martins, Lucas Henrique Cavalcanti Santos, Lucas Dias Maciel, Matheus Paixão Gumercindo dos Santos, Matheus Lafayette Vasconcelos, Matheus Vinícius Teotonio do Nascimento Andrade, João Guilherme Oliveira Carvalho de Melo, João Pedro Souza Pereira de Moura, José Ronald da Silva, José Victor Silva Cruz, Pedro Henrique Santana de Morais, Pedro Paulo Salman de Oliveira, Riei Joaquim Matos Rodrigues, Roberto Costa Fernandes, Ryan Vinicius Santos Morais, Tamara Mayara Ramos Teobaldo, Washington Igor dos Santos Silva, Edna Natividade Silva Barros

    Abstract: RobôCIn has participated in RoboCup Small Size League since 2019, won its first world title in 2022 (Division B), and is currently a three-times Latin-American champion. This paper presents our improvements to defend the Small Size League (SSL) division B title in RoboCup 2023 in Bordeaux, France. This paper aims to share some of the academic research that our team developed over the past year. Ou…

    Submitted 19 July, 2023; originally announced July 2023.

  48. arXiv:2307.08220  [pdf, other]

    cs.SE cs.LG

    FRANC: A Lightweight Framework for High-Quality Code Generation

    Authors: Mohammed Latif Siddiq, Beatrice Casey, Joanna C. S. Santos

    Abstract: In recent years, the use of automated source code generation utilizing transformer-based generative models has expanded, and these models can generate functional code according to the requirements of the developers. However, recent research revealed that these automatically generated source codes can contain vulnerabilities and other quality issues. Despite researchers' and practitioners' attempts…

    Submitted 28 August, 2024; v1 submitted 16 July, 2023; originally announced July 2023.

    Comments: Accepted at the 24th IEEE International Conference on Source Code Analysis and Manipulation (SCAM 2024)

  49. arXiv:2307.06860  [pdf]

    cs.SD cs.LG eess.AS

    AnuraSet: A dataset for benchmarking Neotropical anuran calls identification in passive acoustic monitoring

    Authors: Juan Sebastián Cañas, Maria Paula Toro-Gómez, Larissa Sayuri Moreira Sugai, Hernán Darío Benítez Restrepo, Jorge Rudas, Breyner Posso Bautista, Luís Felipe Toledo, Simone Dena, Adão Henrique Rosa Domingos, Franco Leandro de Souza, Selvino Neckel-Oliveira, Anderson da Rosa, Vítor Carvalho-Rocha, José Vinícius Bernardy, José Luiz Massao Moreira Sugai, Carolina Emília dos Santos, Rogério Pereira Bastos, Diego Llusia, Juan Sebastián Ulloa

    Abstract: Global change is predicted to induce shifts in anuran acoustic behavior, which can be studied through passive acoustic monitoring (PAM). Understanding changes in calling behavior requires the identification of anuran species, which is challenging due to the particular characteristics of neotropical soundscapes. In this paper, we introduce a large-scale multi-species dataset of anuran amphibians ca…

    Submitted 11 July, 2023; originally announced July 2023.

  50. arXiv:2306.04009  [pdf, other]

    cs.CL cs.AI

    Triggering Multi-Hop Reasoning for Question Answering in Language Models using Soft Prompts and Random Walks

    Authors: Kanishka Misra, Cicero Nogueira dos Santos, Siamak Shakeri

    Abstract: Despite readily memorizing world knowledge about entities, pre-trained language models (LMs) struggle to compose together two or more facts to perform multi-hop reasoning in question-answering tasks. In this work, we propose techniques that improve upon this limitation by relying on random walks over structured knowledge graphs. Specifically, we use soft prompts to guide LMs to chain together thei…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: Findings of ACL 2023