-
Understanding Underrepresented Groups in Open Source Software
Authors:
Reydne Santos,
Rafa Prado,
Ana Paula de Holanda Silva,
Kiev Gama,
Fernando Castor,
Ronnie de Souza Santos
Abstract:
Context: Diversity can impact team communication, productivity, cohesiveness, and creativity. Analyzing the existing knowledge about diversity in open source software (OSS) projects can provide directions for future research and raise awareness about barriers and biases against underrepresented groups in OSS. Objective: This study aims to analyze the knowledge about minority groups in OSS projects. We investigated which groups were studied in the OSS literature, the study methods used, their implications, and their recommendations to promote the inclusion of minority groups in OSS projects. Method: To achieve this goal, we performed a systematic literature review that analyzed 42 papers that directly study underrepresented groups in OSS projects. Results: Most papers focus on gender (62.3%), while other dimensions, such as age or ethnicity, are rarely studied. The neurodiversity dimension has not been studied in the context of OSS. Our results also reveal that diversity in OSS projects faces several barriers but brings significant benefits, such as promoting safe and welcoming environments. Conclusion: Most analyzed papers adopt a myopic perspective that sees gender as strictly binary. Dimensions of diversity that affect how individuals interact and function in an OSS project, such as age, tenure, and ethnicity, have received very little attention.
Submitted 30 May, 2025;
originally announced June 2025.
-
Stochastic Fractional Neural Operators: A Symmetrized Approach to Modeling Turbulence in Complex Fluid Dynamics
Authors:
Rômulo Damasclin Chaves dos Santos,
Jorge Henrique de Oliveira Sales
Abstract:
In this work, we introduce a new class of neural network operators designed to handle problems where memory effects and randomness play a central role. These operators merge symmetrized activation functions, Caputo-type fractional derivatives, and stochastic perturbations introduced via Itô-type noise. The result is a powerful framework capable of approximating functions that evolve over time with both long-term memory and uncertain dynamics. We develop the mathematical foundations of these operators, proving three key theorems of Voronovskaya type. These results describe the asymptotic behavior of the operators, their convergence in the mean-square sense, and their consistency under fractional regularity assumptions. All estimates explicitly account for the influence of the memory parameter $α$ and the noise level $σ$. As a practical application, we apply the proposed theory to the fractional Navier-Stokes equations with stochastic forcing, a model often used to describe turbulence in fluid flows with memory. Our approach provides theoretical guarantees for the approximation quality and suggests that these neural operators can serve as effective tools in the analysis and simulation of complex systems. By blending ideas from neural networks, fractional calculus, and stochastic analysis, this research opens new perspectives for modeling turbulent phenomena and other multiscale processes where memory and randomness are fundamental. The results lay the groundwork for hybrid learning-based methods with strong analytical backing.
Submitted 12 May, 2025;
originally announced May 2025.
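For readers unfamiliar with the memory parameter $α$ mentioned in the abstract above, the textbook Caputo fractional derivative can be written as follows. This is the standard definition, assuming lower terminal $0$ and an order $α$ with $n-1 < α < n$ for a positive integer $n$; the abstract does not say which variant or notation the paper itself adopts.

```latex
% Standard Caputo derivative of order \alpha, with n - 1 < \alpha < n.
% Lower terminal 0 and the symbol {}^{C}\!D_t^{\alpha} are assumptions.
\[
  {}^{C}\!D_t^{\alpha} f(t)
  = \frac{1}{\Gamma(n-\alpha)} \int_0^t (t-s)^{\,n-\alpha-1}\, f^{(n)}(s)\,\mathrm{d}s ,
  \qquad n-1 < \alpha < n .
\]
```

The integral weights the whole history of $f^{(n)}$ on $[0, t]$, which is why operators built on it are described as having long-term memory.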
-
Justiça Algorítmica: Instrumentalização, Limites Conceituais e Desafios na Engenharia de Software
Authors:
Lucas Rodrigues Valença,
Ronnie de Souza Santos
Abstract:
This article describes ongoing research that aims to understand the concept of justice in the field of software engineering, the factors that underlie the creation and instrumentalization of these concepts, and the limitations faced by software engineering when applying them. The expanding field of study called "algorithmic justice" fundamentally consists of creating mechanisms and procedures, based on mathematical and formal methods, to conceptualize, evaluate, and reduce biases and discrimination caused by algorithms. We conducted a systematic mapping in the context of justice in software engineering, comprising the metrics and definitions of algorithmic justice, as well as the procedures and techniques for fairer decision-making systems. We propose a discussion about the limitations that arise from understanding justice as an attribute of software and the result of decision-making, as well as how the field is influenced by the construction of computational thinking, which is constantly developed around abstractions. Finally, we reflect on potential paths that could help us move beyond the limits of algorithmic justice.
Submitted 11 May, 2025;
originally announced May 2025.
-
On the emergence of numerical instabilities in Next Generation Reservoir Computing
Authors:
Edmilson Roque dos Santos,
Erik Bollt
Abstract:
Next Generation Reservoir Computing (NGRC) is a low-cost machine learning method for forecasting chaotic time series from data. However, ensuring the dynamical stability of NGRC models during autonomous prediction remains a challenge. In this work, we uncover a key connection between the numerical conditioning of the NGRC feature matrix -- formed by polynomial evaluations on time-delay coordinates -- and the long-term NGRC dynamics. Merging tools from numerical linear algebra and ergodic theory of dynamical systems, we systematically study how the feature matrix conditioning varies across hyperparameters. We demonstrate that the NGRC feature matrix tends to be ill-conditioned for short time lags and high-degree polynomials. Ill-conditioning amplifies sensitivity to training data perturbations, which can produce unstable NGRC dynamics. We evaluate the impact of different numerical algorithms (Cholesky, SVD, and LU) for solving the regularized least-squares problem.
Submitted 1 May, 2025;
originally announced May 2025.
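The link between feature-matrix conditioning and the hyperparameters named in the abstract above (time lag and polynomial degree) can be explored with a minimal sketch. Everything here is an assumption for illustration and not the authors' code: the Lorenz system as the test signal, the step size, the function names `lorenz_x` and `ngrc_features`, and the chosen lags and degrees. The sketch builds monomials of time-delay coordinates and prints the 2-norm condition number of the resulting matrix.

```python
import itertools
import numpy as np


def lorenz_x(n_steps, dt=0.01, state=(1.0, 1.0, 1.0)):
    """Integrate the Lorenz system with RK4 and return the x-component."""
    def rhs(u):
        x, y, z = u
        return np.array([10.0 * (y - x), x * (28.0 - z) - y, x * y - (8.0 / 3.0) * z])

    u = np.array(state, dtype=float)
    xs = np.empty(n_steps)
    for i in range(n_steps):
        k1 = rhs(u)
        k2 = rhs(u + 0.5 * dt * k1)
        k3 = rhs(u + 0.5 * dt * k2)
        k4 = rhs(u + dt * k3)
        u = u + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        xs[i] = u[0]
    return xs


def ngrc_features(series, n_delays, lag, degree):
    """Feature matrix of all monomials up to `degree` in the delay coordinates."""
    start = (n_delays - 1) * lag
    # Column j holds the series delayed by j * lag samples.
    delays = np.column_stack(
        [series[start - j * lag : len(series) - j * lag] for j in range(n_delays)]
    )
    columns = [np.ones(len(delays))]  # constant term
    for d in range(1, degree + 1):
        for idx in itertools.combinations_with_replacement(range(n_delays), d):
            columns.append(np.prod(delays[:, list(idx)], axis=1))
    return np.column_stack(columns)


series = lorenz_x(5000)[1000:]  # drop the initial transient (assumed length)
for lag in (1, 20):
    for degree in (1, 2, 3):
        phi = ngrc_features(series, n_delays=2, lag=lag, degree=degree)
        print(f"lag={lag:2d} degree={degree} cond={np.linalg.cond(phi):.3e}")
```

Because the lower-degree feature matrix is a column subset of the higher-degree one, its condition number cannot exceed that of the larger matrix, so the printed values grow (or stay equal) with degree for a fixed lag. Comparing the two lags shows how the delay choice changes conditioning for this particular signal.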
-
From Diverse Origins to a DEI Crisis: The Pushback Against Equity, Diversity, and Inclusion in Software Engineering
Authors:
Ronnie de Souza Santos,
Ann Barcomb,
Mairieli Wessel,
Cleyton Magalhaes
Abstract:
Background: Diversity, equity, and inclusion are rooted in the very origins of software engineering, shaped by the contributions of many individuals from underrepresented groups to the field. Yet today, DEI efforts in the industry face growing resistance. Companies are retreating from visible commitments, and pushback initiatives started only a few years ago. Aims: This study explores how the DEI backlash is unfolding in the software industry by investigating institutional changes, lived experiences, and the strategies used to sustain DEI practices. Method: We conducted an exploratory case study using 59 publicly available Reddit posts authored by self-identified software professionals. Data were analyzed using reflexive thematic analysis. Results: Our findings show that software companies are responding to the DEI backlash in varied ways, including restructuring programs, scaling back investments, or quietly continuing efforts under new labels. Professionals reported a wide range of emotional responses, from anxiety and frustration to relief and happiness, shaped by identity, role, and organizational culture. Yet, despite the backlash, multiple forms of resistance and adaptation have emerged to protect inclusive practices in software engineering. Conclusions: The DEI backlash is reshaping DEI in software engineering. While public messaging may soften or disappear, core DEI values persist in adapted forms. This study offers a new perspective on how inclusion is evolving under pressure and highlights the resilience of DEI in software environments.
Submitted 28 April, 2025; v1 submitted 23 April, 2025;
originally announced April 2025.
-
Revolutionizing Fractional Calculus with Neural Networks: Voronovskaya-Damasclin Theory for Next-Generation AI Systems
Authors:
Rômulo Damasclin Chaves dos Santos,
Jorge Henrique de Oliveira Sales
Abstract:
This work introduces rigorous convergence rates for neural network operators activated by symmetrized and perturbed hyperbolic tangent functions, utilizing novel Voronovskaya-Damasclin asymptotic expansions. We analyze basic, Kantorovich, and quadrature-type operators over infinite domains, extending classical approximation theory to fractional calculus via Caputo derivatives. Key innovations include parameterized activation functions with asymmetry control, symmetrized density operators, and fractional Taylor expansions for error analysis. The main theorem demonstrates that Kantorovich operators achieve \(o(n^{-β(N-\varepsilon)})\) convergence rates, while basic operators exhibit \(\mathcal{O}(n^{-βN})\) error decay. For deep networks, we prove \(\mathcal{O}(L^{-β(N-\varepsilon)})\) approximation bounds. Stability results under parameter perturbations highlight operator robustness. By integrating neural approximation theory with fractional calculus, this work provides foundational mathematical insights and deployable engineering solutions, with potential applications in complex system modeling and signal processing.
Submitted 1 April, 2025;
originally announced April 2025.
-
A Delphi Study on the Adaptation of SCRUM Practices to Remote Work
Authors:
Cleyton Magalhaes,
Fernando Padoan,
Robson Santos,
Ronnie de Souza Santos
Abstract:
This study explores how Scrum practices were adjusted for remote and hybrid work during and after the COVID-19 pandemic, using a Delphi study with Scrum Masters to gather expert insights. Preliminary key findings highlight communication as the primary challenge, leading to adjustments in meeting structures, information-sharing practices, and collaboration tools. Teams restructured ceremonies, introduced new meetings, and implemented persistent information-sharing mechanisms to improve their work.
Submitted 27 March, 2025;
originally announced March 2025.
-
Extension of Symmetrized Neural Network Operators with Fractional and Mixed Activation Functions
Authors:
Rômulo Damasclin Chaves dos Santos,
Jorge Henrique de Oliveira Sales
Abstract:
We propose a novel extension to symmetrized neural network operators by incorporating fractional and mixed activation functions. This study addresses the limitations of existing models in approximating higher-order smooth functions, particularly in complex and high-dimensional spaces. Our framework introduces a fractional exponent in the activation functions, allowing adaptive non-linear approximations with improved accuracy. We define new density functions based on $q$-deformed and $θ$-parametrized logistic models and derive advanced Jackson-type inequalities that establish uniform convergence rates. Additionally, we provide a rigorous mathematical foundation for the proposed operators, supported by numerical validations demonstrating their efficiency in handling oscillatory and fractional components. The results extend the applicability of neural network approximation theory to broader functional spaces, paving the way for applications in solving partial differential equations and modeling complex systems.
Submitted 17 January, 2025;
originally announced January 2025.
-
Not real or too soft? On the challenges of publishing interdisciplinary software engineering research
Authors:
Sonja M. Hyrynsalmi,
Grischa Liebel,
Ronnie de Souza Santos,
Sebastian Baltes
Abstract:
The discipline of software engineering (SE) combines social and technological dimensions. It is an interdisciplinary research field. However, interdisciplinary research submitted to software engineering venues may not receive the same level of recognition as more traditional or technical topics such as software testing. For this paper, we conducted an online survey of 73 SE researchers and used a mixed-method data analysis approach to investigate their challenges and recommendations when publishing interdisciplinary research in SE. We found that the challenges of publishing interdisciplinary research in SE can be divided into topic-related and reviewing-related challenges. Furthermore, while our initial focus was on publishing interdisciplinary research, the impact of current reviewing practices on marginalized groups emerged from our data, as we found that marginalized groups are more likely to receive negative feedback. In addition, we found that experienced researchers are less likely to change their research direction due to feedback they receive. To address the identified challenges, our participants emphasize the importance of highlighting the impact and value of interdisciplinary work for SE, collaborating with experienced researchers, and establishing clearer submission guidelines and new interdisciplinary SE publication venues. Our findings contribute to the understanding of the current state of the SE research community and how we could better support interdisciplinary research in our field.
Submitted 11 January, 2025;
originally announced January 2025.
-
Towards User-Focused Cross-Domain Testing: Disentangling Accessibility, Usability, and Fairness
Authors:
Matheus de Morais Leça,
Ronnie de Souza Santos
Abstract:
Fairness testing is increasingly recognized as fundamental in software engineering, especially in the domain of data-driven systems powered by artificial intelligence. However, its practical integration into software development may pose challenges, given its overlapping boundaries with usability and accessibility testing. In this tertiary study, we explore these complexities using insights from 12 systematic reviews published in the past decade, shedding light on the nuanced interactions among fairness, usability, and accessibility testing and how they intersect within contemporary software development practices.
Submitted 17 January, 2025; v1 submitted 10 January, 2025;
originally announced January 2025.
-
Curious, Critical Thinker, Empathetic, and Ethically Responsible: Essential Soft Skills for Data Scientists in Software Engineering
Authors:
Matheus de Morais Leça,
Ronnie de Souza Santos
Abstract:
Background. As artificial intelligence and AI-powered systems continue to grow, the role of data scientists has become essential in software development environments. Data scientists face challenges related to managing large volumes of data and addressing the societal impacts of AI algorithms, which require a broad range of soft skills.
Goal. This study aims to identify the key soft skills that data scientists need when working on AI-powered projects, with a particular focus on addressing biases that affect society.
Method. We conducted a thematic analysis of 87 job postings on LinkedIn and 11 interviews with industry practitioners. The job postings came from companies in 12 countries and covered various experience levels. The interviews featured professionals from diverse backgrounds, including different genders, ethnicities, and sexual orientations, who worked with clients from South America, North America, and Europe.
Results. While data scientists share many skills with other software practitioners -- such as those related to coordination, engineering, and management -- there is a growing emphasis on innovation and social responsibility. These include soft skills like curiosity, critical thinking, empathy, and ethical awareness, which are essential for addressing the ethical and societal implications of AI.
Conclusion. Our findings indicate that data scientists working on AI-powered projects require not only technical expertise but also a solid foundation in soft skills that enable them to build AI systems responsibly, with fairness and inclusivity. These insights have important implications for recruitment and training within software companies and for ensuring the long-term success of AI-powered systems and their broader societal impact.
Submitted 28 January, 2025; v1 submitted 3 January, 2025;
originally announced January 2025.
-
Hidden Figures in Software Engineering: A Replication Study Exploring Undergraduate Software Students' Awareness of Distinguished Scientists from Underrepresented Groups
Authors:
Ronnie de Souza Santos,
Italo Santos,
Robson Santos,
Cleyton Magalhaes
Abstract:
Technology is a cornerstone of modern life, yet the software engineering field struggles to reflect the diversity of contemporary society. This lack of diversity and inclusivity within the software industry can be traced back to limited representation in software engineering academic settings, where students from underrepresented groups are often stigmatized despite the field's rich history of contributions from scientists from diverse backgrounds. Over the years, studies have revealed that women, LGBTQIA+ individuals, and Black students frequently encounter unwelcoming environments in software engineering programs. However, similar to other fields, increasing awareness of notable individuals from marginalized backgrounds could inspire students and foster a more inclusive environment. This study reports the findings from a replicated global survey with undergraduate software engineering students, exploring their knowledge of distinguished scientists from underrepresented groups. These findings show that students have limited awareness of these figures and their contributions, highlighting the need to improve diversity awareness and develop educational practices that celebrate the achievements of historically marginalized groups in software engineering. Index Terms: EDI in software engineering, software engineering education, diversity.
Submitted 19 December, 2024;
originally announced December 2024.
-
Diversity in Software Engineering Education: Exploring Motivations, Influences, and Role Models Among Undergraduate Students
Authors:
Ronnie de Souza Santos,
Italo Santos,
Robson Santos,
Cleyton Magalhaes
Abstract:
Software engineering (SE) faces significant diversity challenges in both academia and industry, with underrepresented students encountering hostile environments, limited representation, and systemic biases that hinder their academic and professional success. Despite significant research on the exclusion experienced by students from underrepresented groups in SE education, there is limited understanding of the specific motivations, influences, and role models that drive underrepresented students to pursue and persist in the field. This study explores the motivations and influences shaping the career aspirations of students from underrepresented groups in SE, and it investigates how role models and mentorship impact their decisions to stay in the field. We conducted a cross-sectional survey with undergraduate students in SE and related fields, focusing on their motivations, influences, and the impact of mentorship and role models on their career paths. We identified eight motivations for pursuing SE, with career advancement, technological enthusiasm, and personal growth being the most common. Family members, tech influencers, teachers, and friends were key influences, though 64% of students reported no specific individual influence. Role models, particularly tech influencers and family members, play a critical role in sustaining interest in the field, especially for underrepresented groups. This study provides insights into the varied motivations and influences that guide underrepresented students' decisions to pursue SE. It emphasizes the importance of role models and highlights the need for intersectional approaches to better support diversity in the field.
Submitted 16 December, 2024;
originally announced December 2024.
-
Responsible AI in the Software Industry: A Practitioner-Centered Perspective
Authors:
Matheus de Morais Leça,
Mariana Bento,
Ronnie de Souza Santos
Abstract:
Responsible AI principles provide ethical guidelines for developing AI systems, yet their practical implementation in software engineering lacks thorough investigation. Therefore, this study explores the practices and challenges faced by software practitioners in aligning with these principles. Through semi-structured interviews with 25 practitioners, we investigated their methods, concerns, and strategies for addressing Responsible AI in software development. Our findings reveal that while practitioners frequently address fairness, inclusiveness, and reliability, principles such as transparency and accountability receive comparatively less attention in their practices. This scenario highlights gaps in current strategies and the need for more comprehensive frameworks to fully operationalize Responsible AI principles in software engineering.
Submitted 10 December, 2024;
originally announced December 2024.
-
Integrating Positionality Statements in Empirical Software Engineering Research
Authors:
Breno Felix de Sousa,
Ronnie de Souza Santos,
Kiev Gama
Abstract:
Positionality statements are a reflective practice established in fields such as the social sciences, where they enhance transparency, reflexivity, and ethical integrity by acknowledging how researchers' identities, experiences, and perspectives may shape their work. This study aimed to investigate the understanding, usage, and potential value of positionality statements in software engineering research, particularly in studies focused on diversity and inclusion.
Submitted 9 December, 2024;
originally announced December 2024.
-
Applications and Implications of Large Language Models in Qualitative Analysis: A New Frontier for Empirical Software Engineering
Authors:
Matheus de Morais Leça,
Lucas Valença,
Reydne Santos,
Ronnie de Souza Santos
Abstract:
The use of large language models (LLMs) for qualitative analysis is gaining attention in various fields, including software engineering, where qualitative methods are essential for understanding human and social factors. This study aimed to investigate how LLMs are currently used in qualitative analysis and their potential applications in software engineering research, focusing on the benefits, limitations, and practices associated with their use. A systematic mapping study was conducted, analyzing 21 relevant studies to explore reported uses of LLMs for qualitative analysis. The findings indicate that LLMs are primarily used for tasks such as coding, thematic analysis, and data categorization, offering benefits like increased efficiency and support for new researchers. However, limitations such as output variability, challenges in capturing nuanced perspectives, and ethical concerns related to privacy and transparency were also identified. The study emphasizes the need for structured strategies and guidelines to optimize LLM use in qualitative research within software engineering, enhancing their effectiveness while addressing ethical considerations. While LLMs show promise in supporting qualitative analysis, human expertise remains crucial for interpreting data, and ongoing exploration of best practices will be vital for their successful integration into empirical software engineering research.
Submitted 7 March, 2025; v1 submitted 9 December, 2024;
originally announced December 2024.
-
From Literature to Practice: Exploring Fairness Testing Tools for the Software Industry Adoption
Authors:
Thanh Nguyen,
Luiz Fernando de Lima,
Maria Teresa Baldassarre,
Ronnie de Souza Santos
Abstract:
In today's world, we need to ensure that AI systems are fair and unbiased. Our study looked at tools designed to test the fairness of software to see if they are practical and easy for software developers to use. We found that while some tools are cost-effective and compatible with various programming environments, many are hard to use and lack detailed instructions. They also tend to focus on specific types of data, which limits their usefulness in real-world situations. Overall, current fairness testing tools need significant improvements to better support software developers in creating fair and equitable technology. We suggest that new tools should be user-friendly, well-documented, and flexible enough to handle different kinds of data, helping developers identify and fix biases early in the development process. This will lead to more trustworthy and fair software for everyone.
Submitted 4 September, 2024;
originally announced September 2024.
-
Preliminary Insights on Industry Practices for Addressing Fairness Debt
Authors:
Ronnie de Souza Santos,
Luiz Fernando de Lima,
Maria Teresa Baldassarre,
Rodrigo Spinola
Abstract:
Context: This study explores how software professionals identify and address biases in AI systems within the software industry, focusing on practical knowledge and real-world applications. Goal: We aimed to understand the strategies employed by practitioners to manage bias and their implications for fairness debt. Method: We used a qualitative research method, gathering insights from industry professionals through interviews and employing thematic analysis to explore the collected data. Findings: Professionals identify biases through discrepancies in model outputs, demographic inconsistencies, and issues with training data. They address these biases using strategies such as enhanced data management, model adjustments, crisis management, improving team diversity, and ethical analysis. Conclusion: Our paper presents initial evidence on addressing fairness debt and provides a foundation for developing structured guidelines to manage fairness-related issues in AI systems.
Submitted 4 September, 2024;
originally announced September 2024.
-
Trustworthy and Responsible AI for Human-Centric Autonomous Decision-Making Systems
Authors:
Farzaneh Dehghani,
Mahsa Dibaji,
Fahim Anzum,
Lily Dey,
Alican Basdemir,
Sayeh Bayat,
Jean-Christophe Boucher,
Steve Drew,
Sarah Elaine Eaton,
Richard Frayne,
Gouri Ginde,
Ashley Harris,
Yani Ioannou,
Catherine Lebel,
John Lysack,
Leslie Salgado Arzuaga,
Emma Stanley,
Roberto Souza,
Ronnie de Souza Santos,
Lana Wells,
Tyler Williamson,
Matthias Wilms,
Zaman Wahid,
Mark Ungrin,
Marina Gavrilova
, et al. (1 additional author not shown)
Abstract:
Artificial Intelligence (AI) has paved the way for revolutionary decision-making processes, which if harnessed appropriately, can contribute to advancements in various sectors, from healthcare to economics. However, its black box nature presents significant ethical challenges related to bias and transparency. AI applications are hugely impacted by biases, presenting inconsistent and unreliable findings, leading to significant costs and consequences, highlighting and perpetuating inequalities and unequal access to resources. Hence, developing safe, reliable, ethical, and Trustworthy AI systems is essential.
Our team of researchers working with Trustworthy and Responsible AI, part of the Transdisciplinary Scholarship Initiative within the University of Calgary, conducts research on Trustworthy and Responsible AI, including fairness, bias mitigation, reproducibility, generalization, interpretability, and authenticity. In this paper, we review and discuss the intricacies of AI biases, definitions, methods of detection and mitigation, and metrics for evaluating bias. We also discuss open challenges with regard to the trustworthiness and widespread application of AI across diverse domains of human-centric decision making, as well as guidelines to foster Responsible and Trustworthy AI models.
Submitted 2 September, 2024; v1 submitted 28 August, 2024;
originally announced August 2024.
-
Remote Communication Trends Among Developers and Testers in Post-Pandemic Work Environments
Authors:
Felipe Jansen,
Ronnie de Souza Santos
Abstract:
The rapid adoption of remote and hybrid work models in response to the COVID-19 pandemic has brought significant changes to communication and coordination within software development teams, affecting how various activities are executed. Nowadays, these changes are shaping the new post-pandemic environments and continue to impact software teams. In this context, our study explores the characteristics and challenges of remote communication between software developers and software testers. We investigated how these professionals have adapted to the unique circumstances imposed by COVID-19, especially because many of them have now become permanent in the software industry. In this process, we explored their communication practices and interaction dynamics and how they potentially affect software evolution and quality. Our findings reveal that the transition to remote and hybrid work has resulted in notable changes in communication patterns and task coordination, which could potentially affect the overall quality of project deliverables. Additionally, we highlight the importance of adapting existing workflows, introducing new management practices, and investing in technology to facilitate remote interaction among developers and testers.
Submitted 22 August, 2024;
originally announced August 2024.
-
How more data can hurt: Instability and regularization in next-generation reservoir computing
Authors:
Yuanzhao Zhang,
Edmilson Roque dos Santos,
Sean P. Cornelius
Abstract:
It has been found recently that more data can, counter-intuitively, hurt the performance of deep neural networks. Here, we show that a more extreme version of the phenomenon occurs in data-driven models of dynamical systems. To elucidate the underlying mechanism, we focus on next-generation reservoir computing (NGRC) -- a popular framework for learning dynamics from data. We find that, despite learning a better representation of the flow map with more training data, NGRC can adopt an ill-conditioned "integrator" and lose stability. We link this data-induced instability to the auxiliary dimensions created by the delayed states in NGRC. Based on these findings, we propose simple strategies to mitigate the instability, either by increasing regularization strength in tandem with data size, or by carefully introducing noise during training. Our results highlight the importance of proper regularization in data-driven modeling of dynamical systems.
Submitted 25 January, 2025; v1 submitted 11 July, 2024;
originally announced July 2024.
-
The Role of Generative AI in Software Development Productivity: A Pilot Case Study
Authors:
Mariana Coutinho,
Lorena Marques,
Anderson Santos,
Marcio Dahia,
Cesar Franca,
Ronnie de Souza Santos
Abstract:
With software development increasingly reliant on innovative technologies, there is a growing interest in exploring the potential of generative AI tools to streamline processes and enhance productivity. In this scenario, this paper investigates the integration of generative AI tools within software development, focusing on understanding their uses, benefits, and challenges to software professionals, in particular, looking at aspects of productivity. Through a pilot case study involving software practitioners working in different roles, we gathered valuable experiences on the integration of generative AI tools into their daily work routines. Our findings reveal a generally positive perception of these tools in individual productivity while also highlighting the need to address identified limitations. Overall, our research sets the stage for further exploration into the evolving landscape of software development practices with the integration of generative AI tools.
Submitted 1 June, 2024;
originally announced June 2024.
-
Software Fairness Debt
Authors:
Ronnie de Souza Santos,
Felipe Fronchetti,
Savio Freire,
Rodrigo Spinola
Abstract:
As software systems continue to play a significant role in modern society, ensuring their fairness has become a critical concern in software engineering. Motivated by this scenario, this paper focused on exploring the multifaceted nature of bias in software systems, aiming to provide a comprehensive understanding of its origins, manifestations, and impacts. Through a scoping study, we identified the primary causes of fairness deficiency in software development and highlighted their adverse effects on individuals and communities, including instances of discrimination and the perpetuation of inequalities. Our investigation culminated in the introduction of the concept of software fairness debt, which complements the notions of technical and social debt, encapsulating the accumulation of biases in software engineering practices while emphasizing the societal ramifications of bias embedded within software systems. Our study contributes to a deeper understanding of fairness in software engineering and paves the way for the development of more equitable and socially responsible software systems.
Submitted 9 January, 2025; v1 submitted 3 May, 2024;
originally announced May 2024.
-
Paths to Testing: Why Women Enter and Remain in Software Testing?
Authors:
Kleice Silva,
Ann Barcomb,
Ronnie de Souza Santos
Abstract:
Background. Women bring unique problem-solving skills to software development, often favoring a holistic approach and attention to detail. In software testing, precision and attention to detail are essential as professionals explore system functionalities to identify defects. Recognizing the alignment between these skills and women's strengths can derive strategies for enhancing diversity in software engineering. Goal. This study investigates the motivations behind women choosing careers in software testing, aiming to provide insights into their reasons for entering and remaining in the field. Method. This study used a cross-sectional survey methodology following established software engineering guidelines, collecting data from women in software testing to explore their motivations, experiences, and perspectives. Findings. The findings reveal that women enter software testing due to increased entry-level job opportunities, work-life balance, and even fewer gender stereotypes. Their motivations to stay include the impact of delivering high-quality software, continuous learning opportunities, and the challenges the activities bring to them. However, inclusiveness and career development in the field need improvement for sustained diversity. Conclusion. Preliminary yet significant, these findings offer interesting insights for researchers and practitioners towards the understanding of women's diverse motivations in software testing and how this understanding is important for fostering professional growth and creating a more inclusive and equitable industry landscape.
Submitted 20 April, 2024;
originally announced April 2024.
-
Exploring Hybrid Work Realities: A Case Study with Software Professionals From Underrepresented Groups
Authors:
Ronnie de Souza Santos,
Cleyton Magalhaes,
Robson Santos,
Jorge Correia-Neto
Abstract:
Context. In the post-pandemic era, software professionals resist returning to office routines, favoring the flexibility gained from remote work. Hybrid work structures, then, become popular within software companies, allowing them to choose not to work in the office every day, preserving flexibility, and creating several benefits, including an increase in the support for underrepresented groups in software development. Goal. We investigated how software professionals from underrepresented groups are experiencing post-pandemic hybrid work. In particular, we analyzed the experiences of neurodivergents, LGBTQIA+ individuals, and people with disabilities working in the software industry. Method. We conducted a case study focusing on the underrepresented groups within a well-established South American software company. Results. Hybrid work is preferred by software professionals from underrepresented groups in the post-pandemic era. Advantages include improved focus at home, personalized work setups, and accommodation for health treatments. Concerns arise about isolation and inadequate infrastructure support, highlighting the need for proactive organizational strategies. Conclusions. Hybrid work emerges as a promising strategy for fostering diversity and inclusion in software engineering, addressing past limitations of the traditional office environment.
Submitted 20 April, 2024;
originally announced April 2024.
-
Goal Recognition via Linear Programming
Authors:
Felipe Meneguzzi,
Luísa R. de A. Santos,
Ramon Fraga Pereira,
André G. Pereira
Abstract:
Goal Recognition is the task by which an observer aims to discern the goals that correspond to plans that comply with the perceived behavior of subject agents given as a sequence of observations. Research on Goal Recognition as Planning encompasses reasoning about the model of a planning task, the observations, and the goals using planning techniques, resulting in very efficient recognition approaches. In this article, we design novel recognition approaches that rely on the Operator-Counting framework, proposing new constraints, and analyze their constraints' properties both theoretically and empirically. The Operator-Counting framework is a technique that efficiently computes heuristic estimates of cost-to-goal using Integer/Linear Programming (IP/LP). In the realm of theory, we prove that the new constraints provide lower bounds on the cost of plans that comply with observations. We also provide an extensive empirical evaluation to assess how the new constraints improve the quality of the solution, and we found that they are especially informed in deciding which goals are unlikely to be part of the solution. Our novel recognition approaches have two pivotal advantages: first, they employ new IP/LP constraints for efficiently recognizing goals; second, we show how the new IP/LP constraints can improve the recognition of goals under both partial and noisy observability.
Submitted 11 April, 2024;
originally announced April 2024.
-
Elevating Software Quality in Agile Environments: The Role of Testing Professionals in Unit Testing
Authors:
Lucas Neves,
Oscar Campos,
Robson Santos,
Italo Santos,
Cleyton Magalhaes,
Ronnie de Souza Santos
Abstract:
Testing is an essential quality activity in the software development process. Usually, a software system is tested on several levels, from unit testing, which checks the smallest parts of the code, to acceptance testing, which focuses on validation with the end user. Historically, unit testing has been the domain of developers, who are responsible for ensuring the accuracy of their code. However, in agile environments, testing professionals play an integral role in various quality improvement initiatives throughout each development cycle. This paper explores the participation of test engineers in unit testing within an industrial context, employing a survey-based research methodology. Our findings demonstrate that testing professionals have the potential to strengthen unit testing by collaborating with developers to craft thorough test cases and fostering a culture of mutual learning and cooperation, ultimately contributing to increasing the overall quality of software projects.
Submitted 19 March, 2024;
originally announced March 2024.
-
Hidden Populations in Software Engineering: Challenges, Lessons Learned, and Opportunities
Authors:
Ronnie de Souza Santos,
Kiev Gama
Abstract:
The growing emphasis on studying equity, diversity, and inclusion within software engineering has amplified the need to explore hidden populations within this field. Exploring hidden populations becomes important to obtain invaluable insights into the experiences, challenges, and perspectives of underrepresented groups in software engineering and, therefore, devise strategies to make the software industry more diverse. However, studying these hidden populations presents multifaceted challenges, including the complexities associated with identifying and engaging participants due to their marginalized status. In this paper, we discuss our experiences and lessons learned while conducting multiple studies involving hidden populations in software engineering. We emphasize the importance of recognizing and addressing these challenges within the software engineering research community to foster a more inclusive and comprehensive understanding of diverse populations of software professionals.
Submitted 17 January, 2024;
originally announced January 2024.
-
Charting a Path to Efficient Onboarding: The Role of Software Visualization
Authors:
Fernando Padoan,
Ronnie de Souza Santos,
Rodrigo Pessoa Medeiros
Abstract:
Background. Within the software industry, it is commonly estimated that software professionals invest a substantial portion of their work hours in the process of understanding existing systems. In this context, an ineffective technical onboarding process, which introduces newcomers to software under development, can result in a prolonged period for them to absorb the necessary knowledge required to become productive in their roles. Goal. The present study aims to explore the familiarity of managers, leaders, and developers with software visualization tools and how these tools are employed to facilitate the technical onboarding of new team members. Method. To address the research problem, we built upon the insights gained through the literature and embraced a sequential exploratory approach. This approach incorporated quantitative and qualitative analyses of data collected from practitioners using questionnaires and semi-structured interviews. Findings. Our findings demonstrate a gap between the concept of software visualization and the practical use of onboarding tools and techniques. Overall, practitioners do not systematically incorporate software visualization tools into their technical onboarding processes due to a lack of conceptual understanding and awareness of their potential benefits. Conclusion. The software industry could benefit from standardized and evolving onboarding models, improved by incorporating software visualization techniques and tools to support program comprehension of newcomers in the software projects.
Submitted 17 January, 2024;
originally announced January 2024.
-
Post-Pandemic Hybrid Work in Software Companies: Findings from an Industrial Case Study
Authors:
Ronnie de Souza Santos,
Willian Grillo,
Djafran Cabral,
Catarina de Castro,
Nicole Albuquerque,
Cesar França
Abstract:
Context. Software professionals learned from their experience during the pandemic that most of their work can be done remotely, and now software companies are expected to adopt hybrid work models to avoid the resignation of talented professionals who require more flexibility and work-life balance. However, hybrid work is a spectrum of flexible work arrangements, and currently, there are no well-established hybrid work configurations to be followed in the post-pandemic period. Goal. We investigated how software engineers are experiencing the post-pandemic hybrid work landscape, aiming to understand the factors that influence their choices between remote and in-office work. Method. We explored a large South American company by collecting quantitative and qualitative data from 545 software professionals who are currently navigating diverse hybrid work arrangements tailored to their individual and team requirements. Findings. Our study revealed an array of factors that significantly impact hybrid work within the software industry, including individual preferences, work-life balance, commute time, social interactions, productivity, and more. Team dynamics, project demands, client expectations, and organizational strategies also play an important role in shaping the complex landscape of hybrid work configurations in software engineering. Conclusions. In summary, the success of hybrid work models depends on balancing individual preferences, team dynamics, and organizational strategies. Our study demonstrated that, at present, there is no one-size-fits-all individual approach to hybrid work in the software industry.
Submitted 16 January, 2024;
originally announced January 2024.
-
Prompt-based mental health screening from social media text
Authors:
Wesley Ramos dos Santos,
Ivandre Paraboni
Abstract:
This article presents a method for prompt-based mental health screening from a large and noisy dataset of social media text. Our method uses GPT-3.5 prompting to distinguish publications that may be more relevant to the task, and then uses a straightforward bag-of-words text classifier to predict actual user labels. Results are found to be on par with a BERT mixture-of-experts classifier while incurring only a fraction of its training costs.
Submitted 11 May, 2024; v1 submitted 11 January, 2024;
originally announced January 2024.
-
Are We Testing or Being Tested? Exploring the Practical Applications of Large Language Models in Software Testing
Authors:
Robson Santos,
Italo Santos,
Cleyton Magalhaes,
Ronnie de Souza Santos
Abstract:
A Large Language Model (LLM) represents a cutting-edge artificial intelligence model that generates coherent content, including grammatically precise sentences, human-like paragraphs, and syntactically accurate code snippets. LLMs can play a pivotal role in software development, including software testing. LLMs go beyond traditional roles such as requirement analysis and documentation and can support test case generation, making them valuable tools that significantly enhance testing practices within the field. Hence, we explore the practical application of LLMs in software testing within an industrial setting, focusing on their current use by professional testers. In this context, rather than relying on existing data, we conducted a cross-sectional survey and collected data within real working contexts, specifically, engaging with practitioners in industrial settings. We applied quantitative and qualitative techniques to analyze and synthesize our collected data. Our findings demonstrate that LLMs effectively enhance testing documents and significantly assist testing professionals in programming tasks like debugging and test case automation. LLMs can support individuals engaged in manual testing who need to code. However, it is crucial to emphasize that, at this early stage, software testing professionals should use LLMs with caution while well-defined methods and guidelines are being built for the secure adoption of these tools.
Submitted 8 December, 2023;
originally announced December 2023.
-
Exposing Algorithmic Discrimination and Its Consequences in Modern Society: Insights from a Scoping Study
Authors:
Ramandeep Singh Dehal,
Mehak Sharma,
Ronnie de Souza Santos
Abstract:
Algorithmic discrimination is a condition that arises when data-driven software unfairly treats users based on attributes like ethnicity, race, gender, sexual orientation, religion, age, disability, or other personal characteristics. Nowadays, as machine learning gains popularity, cases of algorithmic discrimination are increasingly being reported in several contexts. This study delves into various studies published over the years reporting algorithmic discrimination. We aim to support software engineering researchers and practitioners in addressing this issue by discussing key characteristics of the problem.
Submitted 16 January, 2024; v1 submitted 8 December, 2023;
originally announced December 2023.
-
Navigating the Path of Women in Software Engineering: From Academia to Industry
Authors:
Tatalina Oliveira,
Ann Barcomb,
Ronnie de Souza Santos,
Helda Barros,
Maria Teresa Baldassarre,
César França
Abstract:
Context. Women remain significantly underrepresented in software engineering, leading to a lasting gender gap in the software industry. This disparity starts in education and extends into the industry, causing challenges such as hostile work environments and unequal opportunities. Addressing these issues is crucial for fostering an inclusive and diverse software engineering workforce. Aim. This study aims to enhance the literature on women in software engineering, exploring their journey from academia to industry and discussing perspectives, challenges, and support. We focus on Brazilian women to extend existing research, which has largely focused on North American and European contexts. Method. In this study, we conducted a cross-sectional survey, collecting both quantitative and qualitative data, focusing on women's experiences in software engineering to explore their journey from university to the software industry. Findings. Our findings highlight persistent challenges faced by women in software engineering, including gender bias, harassment, work-life imbalance, undervaluation, low sense of belonging, and impostor syndrome. These difficulties commonly emerge from university experiences and continue to affect women throughout their entire careers. Conclusion. In summary, our study identifies systemic challenges in women's software engineering journey, emphasizing the need for organizational commitment to address these issues. We provide actionable recommendations for practitioners.
Submitted 7 December, 2023;
originally announced December 2023.
-
Myths and Facts about a Career in Software Testing: A Comparison between Students' Beliefs and Professionals' Experience
Authors:
Ronnie de Souza Santos,
Luiz Fernando Capretz,
Cleyton Magalhaes,
Rodrigo Souza
Abstract:
Testing is an indispensable part of software development. However, a career in software testing is reported to be unpopular among students in computer science and related areas. This can potentially create a shortage of testers in the software industry in the future. The question is, whether the perception that undergraduate students have about software testing is accurate and whether it differs from the experience reported by those who work in testing activities in the software development industry. This investigation demonstrates that a career in software testing is more exciting and rewarding, as reported by professionals working in the field, than students may believe. Therefore, in order to guarantee a workforce focused on software quality, the academy and the software industry need to work together to better inform students about software testing and its essential role in software development.
Submitted 10 November, 2023;
originally announced November 2023.
-
Discovery of Novel Reticular Materials for Carbon Dioxide Capture using GFlowNets
Authors:
Flaviu Cipcigan,
Jonathan Booth,
Rodrigo Neumann Barros Ferreira,
Carine Ribeiro dos Santos,
Mathias Steiner
Abstract:
Artificial intelligence holds promise to improve materials discovery. GFlowNets are an emerging deep learning algorithm with many applications in AI-assisted discovery. By using GFlowNets, we generate porous reticular materials, such as metal-organic frameworks and covalent organic frameworks, for applications in carbon dioxide capture. We introduce a new Python package (matgfn) to train and sample GFlowNets. We use matgfn to generate the matgfn-rm dataset of novel and diverse reticular materials with gravimetric surface area above 5000 m$^2$/g. We calculate single- and two-component gas adsorption isotherms for the top-100 candidates in matgfn-rm. These candidates are novel compared to the state-of-the-art ARC-MOF dataset and rank in the 90th percentile in terms of working capacity compared to the CoRE2019 dataset. We discover 15 materials outperforming all materials in CoRE2019.
Submitted 16 October, 2023; v1 submitted 11 October, 2023;
originally announced October 2023.
-
Software Testing and Code Refactoring: A Survey with Practitioners
Authors:
Danilo Leandro Lima,
Ronnie de Souza Santos,
Guilherme Pires Garcia,
Sildemir S. da Silva,
Cesar Franca,
Luiz Fernando Capretz
Abstract:
Nowadays, software testing professionals are commonly required to develop coding skills to work on test automation. One essential skill required from those who code is the ability to implement code refactoring, a valued quality aspect of software development; however, software developers usually encounter obstacles in successfully applying this practice. In this scenario, the present study aims to explore how software testing professionals (e.g., software testers, test engineers, test analysts, and software QAs) deal with code refactoring to understand the benefits and limitations of this practice in the context of software testing. We followed the guidelines to conduct surveys in software engineering and applied three sampling techniques, namely convenience sampling, purposive sampling, and snowballing sampling, to collect data from testing professionals. We received answers from 80 individuals reporting their experience refactoring the code of automated tests. We concluded that in the context of software testing, refactoring offers several benefits, such as supporting the maintenance of automated tests and improving the performance of the testing team. However, practitioners might encounter barriers in effectively implementing this practice, in particular, the lack of interest from managers and leaders. Our study raises discussions on the importance of having testing professionals implement refactoring in the code of automated tests, allowing them to improve their coding abilities.
Submitted 2 October, 2023;
originally announced October 2023.
-
AsQM: Audio streaming Quality Metric based on Network Impairments and User Preferences
Authors:
Marcelo Rodrigo dos Santos,
Andreza Patrícia Batista,
Renata Lopes Rosa,
Muhammad Saadi,
Dick Carrillo Melgarejo,
Demóstenes Zegarra Rodríguez
Abstract:
Audio streaming services have many users because of the proliferation of cloud-based services for different kinds of content. The complex networks that support these services do not always guarantee acceptable quality on the end-user side. In this paper, the impact of temporal interruptions on the reproduction of audio streaming and the users' preference in relation to audio content are studied. In order to determine the key parameters in the audio streaming service, subjective tests were conducted, and their results show that users' Quality-of-Experience (QoE) is highly correlated with the following application parameters: the number of temporal interruptions or stalls, their frequency and length, and the temporal location in which they occur. Most importantly, experimental results demonstrated that users' preference for audio content plays an important role in users' QoE. Thus, a Preference Factor (PF) function is defined and considered in the formulation of the proposed metric, named Audio streaming Quality Metric (AsQM). Considering that multimedia service providers are based on web servers, a framework to obtain user information is proposed. Furthermore, results show that the AsQM implemented in the audio player of an end user's device has a low impact on energy, processing, and memory consumption.
Submitted 26 September, 2023;
originally announced September 2023.
-
Beyond the ML Model: Applying Safety Engineering Frameworks to Text-to-Image Development
Authors:
Shalaleh Rismani,
Renee Shelby,
Andrew Smart,
Renelito Delos Santos,
AJung Moon,
Negar Rostamzadeh
Abstract:
Identifying potential social and ethical risks in emerging machine learning (ML) models and their applications remains challenging. In this work, we applied two well-established safety engineering frameworks (FMEA, STPA) to a case study involving text-to-image (T2I) models at three stages of the ML product development pipeline: data processing, integration of a T2I model with other models, and use. Results of our analysis demonstrate that the safety frameworks, neither of which is explicitly designed to examine social and ethical risks, can uncover failures and hazards that pose social and ethical risks. We discovered a broad range of failures and hazards (i.e., functional, social, and ethical) by analyzing interactions (i.e., between different ML models in the product, between the ML product and user, and between development teams) and processes (i.e., preparation of training data or workflows for using an ML service/product). Our findings underscore the value and importance of looking beyond an ML model when examining social and ethical risks, especially when we have minimal information about an ML model.
Submitted 18 July, 2023;
originally announced July 2023.
-
Comparing Mobile Testing Tools Using Documentary Analysis
Authors:
Gustavo da Silva,
Ronnie de Souza Santos
Abstract:
Due to the high demand for mobile applications, given the exponential growth of users of this type of technology, testing professionals are frequently required to invest time in studying testing tools, particularly because several different tools are now available. This variety makes it difficult for testing professionals to choose the tool that best fits their goals and supports them in their work. In this sense, we conducted a comparative analysis of five open-source tools for mobile testing: Appium, Robotium, Espresso, Frank, and EarlGrey. We used the documentary analysis method to explore the official documentation of each of these tools and developed various comparisons based on technical criteria reported in the literature about characteristics that mobile testing tools should have. Our findings are expected to help practitioners understand several aspects of mobile testing tools.
Submitted 1 July, 2023;
originally announced July 2023.
-
The Perspective of Software Professionals on Algorithmic Racism
Authors:
Ronnie de Souza Santos,
Luiz Fernando de Lima,
Cleyton Magalhaes
Abstract:
Context. Algorithmic racism is the term used to describe the behavior of technological solutions that constrains users based on their ethnicity. Lately, various data-driven software systems have been reported to discriminate against Black people, either through the use of biased data sets or due to the prejudice propagated by software professionals in their code. As a result, Black people are experiencing disadvantages in accessing technology-based services, such as housing, banking, and law enforcement. Goal. This study aims to explore algorithmic racism from the perspective of software professionals. Method. A survey questionnaire was applied to explore the understanding of software practitioners on algorithmic racism, and data analysis was conducted using descriptive statistics and coding techniques. Results. We obtained answers from a sample of 73 software professionals discussing their understanding and perspectives on algorithmic racism in software development. Our results demonstrate that the effects of algorithmic racism are well-known among practitioners. However, there is no consensus on how the problem can be effectively addressed in software engineering. In this paper, some solutions to the problem are proposed based on the professionals' narratives. Conclusion. Combining technical and social strategies, including training on structural racism for software professionals, is the most promising way to address the algorithmic racism problem and its effects on the software solutions delivered to our society.
Submitted 26 June, 2023;
originally announced June 2023.
-
Expertise-based Weighting for Regression Models with Noisy Labels
Authors:
Milene Regina dos Santos,
Rafael Izbicki
Abstract:
Regression methods assume that accurate labels are available for training. However, in certain scenarios, obtaining accurate labels may not be feasible, and relying on multiple specialists with differing opinions becomes necessary. Existing approaches addressing noisy labels often impose restrictive assumptions on the regression function. In contrast, this paper presents a novel, more flexible approach. Our method consists of two steps: estimating each labeler's expertise and combining their opinions using learned weights. We then regress the weighted average against the input features to build the prediction model. The proposed method is formally justified and empirically demonstrated to outperform existing techniques on simulated and real data. Furthermore, its flexibility enables the utilization of any machine learning technique in both steps. In summary, this method offers a simple, fast, and effective solution for training regression models with noisy labels derived from diverse expert opinions.
Submitted 12 May, 2023;
originally announced May 2023.
-
What do Transgender Software Professionals say about a Career in the Software Industry?
Authors:
Ronnie de Souza Santos,
Brody Stuart-Verner,
Cleyton Magalhaes
Abstract:
Diversity is an essential aspect of software development because technology influences almost every aspect of modern society, and if the software industry lacks diversity, software products might unintentionally constrain groups of individuals instead of promoting an egalitarian experience for all. In this study, we investigate the perspectives of transgender software professionals about a career in software engineering as one of the aspects of diversity in the software industry. Our findings demonstrate that trans people choose careers in software engineering for two primary reasons: a) even though software development environments are not exempt from discrimination, the software industry is safer than other industries for transgender people; b) trans people occasionally have to deal with gender dysphoria, anxiety, and fear of judgment, and the work flexibility offered by software companies allows them to cope with these issues more efficiently.
Submitted 22 March, 2023;
originally announced March 2023.
-
Post-pandemic Resilience of Hybrid Software Teams
Authors:
Ronnie de Souza Santos,
Gianisa Adisaputri,
Paul Ralph
Abstract:
Background. The COVID-19 pandemic triggered a widespread transition to hybrid work models (combinations of co-located and remote work) as software professionals demanded more flexibility and improved work-life balance. However, hybrid work models reduce the spontaneous, informal face-to-face interactions that promote group maturation, cohesion, and resilience. Little is known about how software companies can successfully transition to a hybrid workforce or the factors that influence the resilience of hybrid software development teams. Goal. The purpose of this study is to explore the relationship between hybrid work and team resilience in the context of software development. Method. Constructivist Grounded Theory was used, based on interviews with 26 software professionals. This sample included professionals of different genders, ethnicities, sexual orientations, and levels of experience. Interviewees came from eight different companies, 22 different projects, and four different countries. Consistent with grounded theory methodology, data collection and analysis were conducted iteratively, in waves, using theoretical sampling, constant comparison, and initial, focused, and theoretical coding. Results. Software Team Resilience is the ability of a group of software professionals to continue working together effectively under adverse conditions. Resilience depends on the group's maturity. The configuration of a hybrid team (who works where and when) can promote or hinder group maturity depending on the level of intra-group interaction it supports. Conclusion. This paper presents the first study on the resilience of hybrid software teams. Software teams need resilience to maintain their performance in the face of disruptions and crises. Software professionals strongly value hybrid work; therefore, team resilience is a key factor to be considered in the software industry.
Submitted 10 March, 2023;
originally announced March 2023.
-
LGBTQIA+ (In)Visibility in Computer Science and Software Engineering Education
Authors:
Ronnie de Souza Santos,
Brody Stuart-Verner,
Cleyton de Magalhaes
Abstract:
Modern society is diverse, multicultural, and multifaceted. Because of these characteristics, we are currently observing an increase in the debates about equity, diversity, and inclusion in different areas, especially because several groups of individuals are underrepresented in many environments. It seems counter-intuitive that computer science and software engineering, which are responsible for creating technological solutions and systems for billions of users around the world, do not reflect the diversity of the society they serve. In trying to solve this diversity crisis in the software industry, researchers started to investigate strategies that can be applied to increase diversity and improve inclusion in academia and the software industry. However, the lack of diversity in computer science and related courses, including software engineering, is still a problem, in particular when some specific groups are considered. LGBTQIA+ students, for instance, face several challenges to fit into technology courses, even though most students in universities right now belong to Generation Z, which is described as open-minded to aspects of gender and sexuality. In this study, we aimed to discuss the state of the art of publications about the inclusion of LGBTQIA+ students in computer science education. Using a mapping study, we identified eight studies published in the past six years that focused on this population. We present strategies developed to adapt curricula and lectures to be more inclusive of LGBTQIA+ students and discuss challenges and opportunities for future research.
Submitted 10 March, 2023;
originally announced March 2023.
-
Diversity in Software Engineering: A Survey about Scientists from Underrepresented Groups
Authors:
Ronnie de Souza Santos,
Brody Stuart-Verner,
Cleyton de Magalhaes
Abstract:
Technology plays a crucial role in people's lives. However, software engineering discriminates against individuals from underrepresented groups in several ways, either through algorithms that produce biased outcomes or through the lack of diversity and inclusion in software development environments and academic courses focused on technology. This reality contradicts the history of software engineering, which is filled with outstanding scientists from underrepresented groups who changed the world with their contributions to the field. Ada Lovelace, Alan Turing, and Clarence Ellis are only some of the individuals who made significant breakthroughs in the area and belonged to the populations that are so underrepresented in undergraduate courses and the software industry. Previous research discusses that women, LGBTQIA+ people, and non-white individuals are examples of students who often feel unwelcome and ostracized in software engineering. However, do they know about the remarkable scientists who came before them and who share background similarities with them? Can we use these scientists as role models to motivate these students to continue pursuing a career in software engineering? In this study, we present the preliminary results of a survey with 128 undergraduate students about this topic. Our findings demonstrate that students' knowledge of computer scientists from underrepresented groups is limited. This creates opportunities for investigations on fostering diversity in software engineering courses using strategies exploring computer science's history.
Submitted 6 May, 2023; v1 submitted 10 March, 2023;
originally announced March 2023.
-
Benefits and Limitations of Remote Work to LGBTQIA+ Software Professionals
Authors:
Ronnie de Souza Santos,
Cleyton Magalhaes,
Paul Ralph
Abstract:
Background. The mass transition to remote work amid the COVID-19 pandemic profoundly affected software professionals, who abruptly shifted into ostensibly temporary home offices. The effects of this transition on these professionals are complex, depending on the particularities of the context and individuals. Recent studies advocate for remote structures to create opportunities for many equity-deserving groups; however, remote work can also be challenging for some individuals, such as women and individuals with disabilities. Objective. This study aims to investigate the effects of remote work on LGBTQIA+ software professionals. Method. Grounded theory methodology was applied based on information collected from two main sources: a survey questionnaire with a sample of 57 LGBTQIA+ software professionals and nine follow-up interviews with individuals from this sample. This sample included professionals of different genders, ethnicities, sexual orientations, and levels of experience. Findings. Our findings demonstrate that (1) remote work benefits LGBTQIA+ people by increasing security and visibility; (2) remote work harms LGBTQIA+ software professionals through isolation and invisibility; (3) the benefits outweigh the drawbacks; (4) the drawbacks can be mitigated by supportive measures developed by software companies. Conclusion. This paper investigated how remote work can affect LGBTQIA+ software professionals and presented a set of recommendations on how software companies can address the benefits and limitations associated with this work model. In summary, we concluded that remote work is crucial in increasing diversity and inclusion in the software industry.
Submitted 4 June, 2023; v1 submitted 12 January, 2023;
originally announced January 2023.
-
LaMDA: Language Models for Dialog Applications
Authors:
Romal Thoppilan,
Daniel De Freitas,
Jamie Hall,
Noam Shazeer,
Apoorv Kulshreshtha,
Heng-Tze Cheng,
Alicia Jin,
Taylor Bos,
Leslie Baker,
Yu Du,
YaGuang Li,
Hongrae Lee,
Huaixiu Steven Zheng,
Amin Ghafouri,
Marcelo Menegali,
Yanping Huang,
Maxim Krikun,
Dmitry Lepikhin,
James Qin,
Dehao Chen,
Yuanzhong Xu,
Zhifeng Chen,
Adam Roberts,
Maarten Bosma,
Vincent Zhao
, et al. (35 additional authors not shown)
Abstract:
We present LaMDA: Language Models for Dialog Applications. LaMDA is a family of Transformer-based neural language models specialized for dialog, which have up to 137B parameters and are pre-trained on 1.56T words of public dialog data and web text. While model scaling alone can improve quality, it shows less improvement on safety and factual grounding. We demonstrate that fine-tuning with annotated data and enabling the model to consult external knowledge sources can lead to significant improvements towards the two key challenges of safety and factual grounding. The first challenge, safety, involves ensuring that the model's responses are consistent with a set of human values, such as preventing harmful suggestions and unfair bias. We quantify safety using a metric based on an illustrative set of human values, and we find that filtering candidate responses using a LaMDA classifier fine-tuned with a small amount of crowdworker-annotated data offers a promising approach to improving model safety. The second challenge, factual grounding, involves enabling the model to consult external knowledge sources, such as an information retrieval system, a language translator, and a calculator. We quantify factuality using a groundedness metric, and we find that our approach enables the model to generate responses grounded in known sources, rather than responses that merely sound plausible. Finally, we explore the use of LaMDA in the domains of education and content recommendations, and analyze their helpfulness and role consistency.
Submitted 10 February, 2022; v1 submitted 20 January, 2022;
originally announced January 2022.
-
Unsupervised machine learning approaches to the $q$-state Potts model
Authors:
Andrea Tirelli,
Danyella O. Carvalho,
Lucas A. Oliveira,
J. P. Lima,
Natanael C. Costa,
Raimundo R. dos Santos
Abstract:
In this paper we study phase transitions of the $q$-state Potts model through a number of unsupervised machine learning techniques, namely Principal Component Analysis (PCA), $k$-means clustering, Uniform Manifold Approximation and Projection (UMAP), and Topological Data Analysis (TDA). Even though in all cases we are able to retrieve the correct critical temperatures $T_c(q)$ for $q = 3, 4$ and $5$, results show that non-linear methods such as UMAP and TDA are less dependent on finite-size effects, while still being able to distinguish between first- and second-order phase transitions. This study may be considered a benchmark for the use of different unsupervised machine learning algorithms in the investigation of phase transitions.
Submitted 18 March, 2022; v1 submitted 13 December, 2021;
originally announced December 2021.
-
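Probabilistic Approach for Reliability Analysis of Data Generated in Multiplex Sequencing on the ABI SOLiD Platform (original title: Abordagem probabilística para análise de confiabilidade de dados gerados em sequenciamentos multiplex na plataforma ABI SOLiD)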
Authors:
Fabio M. F. Lobato,
Carlos D. N. Damasceno,
Péricles L. Machado,
Nandamudi L. Vijaykumar,
André R. dos Santos,
Sylvain H. Darnet,
André N. A. Gonçalves,
Dayse O. de Alencar,
Ádamo L. de Santana
Abstract:
Next-generation sequencers, such as the Illumina and SOLiD platforms, generate a large amount of data, commonly above 10 gigabytes of text files. In particular, the SOLiD platform allows the sequencing of multiple samples in a single run, called a multiplex run, through a tagging system called Barcode. This feature requires a computational process to separate the data by sample, because the sequencer provides a mixture of all samples in a single output. This process must be reliable to avoid errors that could compromise further analysis. In this context, we realized the need to develop a probabilistic model capable of assigning a degree of confidence to the marking system used in multiplex sequencing. The results confirmed the adequacy of the model obtained, which can, among other things, guide the filtering of the data and the evaluation of the sequencing protocol used.
Submitted 11 August, 2021; v1 submitted 27 July, 2021;
originally announced July 2021.