-
Visibility of Domain Elements in the Elicitation Process: A Family of Empirical Studies
Authors:
Alejandrina Aranda,
Oscar Dieste,
Natalia Juristo
Abstract:
Background: Various factors determine analyst effectiveness during elicitation. While the literature suggests that elicitation technique and time are influential factors, other attributes could also play a role. Aim: Determine aspects that may have an influence on analysts' ability to identify certain elements of the problem domain. Methodology: We conducted 14 quasi-experiments, inquiring 134 subjects about two problem domains. For each problem domain, we calculated whether the experimental subjects identified the problem domain elements (concepts, processes, and requirements), i.e., the degree to which these domain elements were visible. Results: Domain element visibility does not appear to be related to either analyst experience or analyst-client interaction. Domain element visibility depends on how analysts provide the elicited information: when asked about the knowledge acquired during elicitation, domain element visibility dramatically increases compared to the information they provide using a written report. Conclusions: Further research is required to replicate our results. However, the finding that analysts have difficulty reporting the information they have acquired is useful for identifying alternatives for improving the documentation of elicitation results. We found evidence that other issues, like domain complexity, the relative importance of different elements within the domain, and the interview script, also seem influential.
Submitted 18 December, 2024;
originally announced December 2024.
-
Framework to coordinate ubiquitous devices with SOA standards
Authors:
Oscar A. Testa,
Efrain R. Fonseca C.,
Germán Montejano,
Oscar Dieste
Abstract:
Context: Ubiquitous devices and pervasive environments are in permanent interaction in people's daily lives. In today's hyper-connected environments, it is necessary for these devices to interact with each other, transparently to the users. The problem is analyzed from the different perspectives that compose it: SOA, service composition, interaction, and the capabilities of ubiquitous devices. Problem: Currently, ubiquitous devices can interact in a limited way due to the proprietary mechanisms and protocols available on the market. The few proposals from academia have hardly achieved an impact in practice. This is not in harmony with the situation of the Internet environment and web services, which have standardized mechanisms for service composition. Aim: Apply the principles of SOA, currently standardized and tested in the information systems industry, for the connectivity of ubiquitous devices in pervasive environments. For this, a coordination framework based on these technologies is proposed. Methodology: We apply an adaptation of Design Science in our environment to allow the iterative construction and evaluation of prototypes. For this, a proof of concept is developed on which this methodology and its cycles are based. Results: We built and put into operation a coordination framework for ubiquitous devices based on WS-CDL, along with a proof of concept. In addition, we contribute to the WS-CDL language in order to support the characteristics of specific ubiquitous devices.
Submitted 9 December, 2024;
originally announced December 2024.
-
Perceived Usability of Collaborative Modeling Tools
Authors:
Ranci Ren,
John W. Castro,
Santiago R. Acuña,
Oscar Dieste,
Silvia T. Acuña
Abstract:
Context: Online collaborative creation of models is becoming commonplace. Collaborative modeling using chatbots and natural language may lower the barriers to modeling for users from different domains. Objective: We compare the perceived usability of two similar online collaborative modeling tools, the SOCIO chatbot and the Creately web-based tool. Method: We conducted a crossover experiment with 66 participants. The evaluation instrument was based on the System Usability Scale (SUS). We performed a quantitative and qualitative exploration, employing inferential statistics and thematic analysis. Results: The results indicate that chatbots enabling natural language communication enhance communication and collaboration efficiency and improve the user experience. Conclusion: Chatbots need to improve guidance and help for novices, but they appear beneficial for enhancing user experience.
Submitted 26 August, 2024;
originally announced August 2024.
-
Using the SOCIO Chatbot for UML Modelling: A Family of Experiments
Authors:
Ranci Ren,
John W. Castro,
Adrián Santos,
Oscar Dieste,
Silvia T. Acuña
Abstract:
Context: Recent developments in natural language processing have facilitated the adoption of chatbots in typically collaborative software engineering tasks (such as diagram modelling). Families of experiments can assess the performance of tools and processes and, at the same time, alleviate some of the typical shortcomings of individual experiments (e.g., inaccurate and potentially biased results due to a small number of participants). Objective: Compare the usability of a chatbot for collaborative modelling (i.e., SOCIO) and an online web tool (i.e., Creately). Method: We conducted a family of three experiments to evaluate the usability of SOCIO against the Creately online collaborative tool in academic settings. Results: The student participants were faster at building class diagrams using the chatbot than with the online collaborative tool and more satisfied with SOCIO. In addition, the class diagrams built using the chatbot tended to be more concise, albeit slightly less complete. Conclusion: Chatbots appear to be helpful for building class diagrams. Our study sheds light on the future direction for experimentation in this field and lays the groundwork for researching the applicability of chatbots in diagramming.
Submitted 26 August, 2024;
originally announced August 2024.
-
Effect of Requirements Analyst Experience on Elicitation Effectiveness: A Family of Empirical Studies
Authors:
Alejandrina M. Aranda,
Oscar Dieste,
Jose I. Panach,
Natalia Juristo
Abstract:
Context. Nowadays there is a great deal of uncertainty surrounding the effects of experience on Requirements Engineering (RE). There is a widespread idea that experience improves analyst performance. However, there are empirical studies that demonstrate the exact opposite. Aim. Determine whether experience influences requirements analyst performance. Method. Quasi-experiments run with students and professionals. The experimental task was to elicit requirements using the open interview technique immediately followed by the consolidation of the elicited information in domains with which the analysts were and were not familiar. Results. In unfamiliar domains, interview, requirements, development, and professional experience do not influence analyst effectiveness. In familiar domains, effectiveness varies depending on the type of experience. Interview experience has a strong positive effect, whereas professional experience has a moderate negative effect. Requirements experience appears to have a moderately positive effect; however, the statistical power of the analysis is insufficient to confirm this point. Development experience has no effect either way. Conclusion. Experience affects analyst effectiveness differently depending on the problem domain type (familiar, unfamiliar). Generally, experience does not account for all the observed variability, which means there are other influential factors.
Submitted 22 August, 2024;
originally announced August 2024.
-
The role of slicing in test-driven development
Authors:
Oscar Dieste,
Ayse Tosun,
Sira Vegas,
Adrian Santos,
Fernando Uyaguari,
Jarno Kyykka,
Natalia Juristo
Abstract:
Test-driven development (TDD) is a widely used agile practice. However, very little is known with certainty about TDD's underlying foundations, i.e., the way TDD works. In this paper, we propose a theoretical framework for TDD, with the following characteristics: 1) each TDD cycle represents a vertical slice of a (probably also small) user story, 2) vertical slices are captured using contracts, implicit in the developers' minds, and 3) the code created during a TDD cycle is a slice-based specification of a code oracle, using the contracts as slicing pre/post-conditions. We have checked the connections among TDD, contracts, and slices using a controlled experiment conducted in industry.
Submitted 18 July, 2024;
originally announced July 2024.
-
Relevant information in TDD experiment reporting
Authors:
Fernando Uyaguari,
Silvia T. Acuña,
John W. Castro,
Davide Fucci,
Oscar Dieste,
Sira Vegas
Abstract:
Experiments are a commonly used method of research in software engineering (SE). Researchers report their experiments following detailed guidelines. However, researchers do not, in the field of test-driven development (TDD) at least, specify how they operationalized the response variables and the measurement process. This article has three aims: (i) identify the response variable operationalization components in TDD experiments that study external quality; (ii) study their influence on the experimental results; (iii) determine if the experiment reports describe the measurement process components that have an impact on the results. Sequential mixed method. The first part of the research adopts a quantitative approach applying a statistical analysis (SA) of the impact of the operationalization components on the experimental results. The second part follows on with a qualitative approach applying a systematic mapping study (SMS). The test suites, intervention types and measurers have an influence on the measurements and results of the SA of TDD experiments in SE. The test suites have a major impact on both the measurements and the results of the experiments. The intervention type has less impact on the results than on the measurements. While the measurers have an impact on the measurements, this is not transferred to the experimental results. On the other hand, the results of our SMS confirm that TDD experiments do not usually report the test suites, the test case generation method, or the details of how external quality was measured. A measurement protocol should be used to ensure that the measurements made by different measurers are similar. It is necessary to report the test cases, the experimental task and the intervention type in order to be able to reproduce the measurements and SA, as well as to replicate experiments and build dependable families of experiments.
Submitted 10 June, 2024;
originally announced June 2024.
-
Test cases as a measurement instrument in experimentation
Authors:
Oscar Dieste,
Fernando Uyaguari,
Sira Vegas,
Natalia Juristo
Abstract:
Background: Test suites are frequently used to quantify relevant software attributes, such as quality or productivity. Problem: We have detected that the same response variable, measured using different test suites, yields different experiment results. Aims: Assess to what extent differences in test case construction influence measurement accuracy and experimental outcomes. Method: Two industry experiments have been measured using two different test suites, one generated using an ad-hoc method and another using equivalence partitioning. The accuracy of the measures has been studied using standard procedures, such as ISO 5725, Bland-Altman and Intraclass Correlation Coefficients. Results: There are differences in the values of the response variables of up to ±60%, depending on the test suite (ad-hoc vs. equivalence partitioning) used. Conclusions: The disclosure of datasets and analysis code is insufficient to ensure the reproducibility of SE experiments. Experimenters should disclose all experimental materials needed to perform independent measurement and re-analysis.
Submitted 25 April, 2022; v1 submitted 9 November, 2021;
originally announced November 2021.
-
A City upon a Hill: Casting Light on a Real Experimental Process
Authors:
Efraín R. Fonseca C.,
Oscar Dieste,
Natalia Juristo
Abstract:
Context: The overall scientific community is proposing measures to improve the reproducibility and replicability of experiments. Reproducibility is relatively easy to achieve. However, replicability is considerably more complex in both the sciences and Empirical Software Engineering (ESE). Several strategies, e.g., replication packages and families of experiments, have been proposed to improve replication in ESE, with limited success. We wonder whether the failures are due to some mismatch, i.e., the researchers' needs are not satisfied by the proposed replication procedures.
Objectives: Find out how experimental researchers conduct experiments in practice.
Methods: We carried out an ethnographic study within an SE Research Group. Our main activity was to observe and approach the experimental researchers in their day-to-day settings for two years. Their preferred literature and experimental materials were studied. We used individual and group interviews to gain understanding and examine unclear topics in depth.
Results: We have created conceptual and process models that represent how experimentation is really conducted in the Research Group. Models fit the community's procedures and terminology at a high level, but they become particular in their minute details.
Conclusion: The actual experimental process differs from textbooks in several points, namely: (1) Number and diversity of activities, (2) existence of different roles, (3) the granularity of the concepts used by the roles, and (4) the viewpoints that different sub-areas or families of experiments have about the overall process.
Submitted 29 August, 2021;
originally announced August 2021.
-
Towards a Methodology for Participant Selection in Software Engineering Experiments. A Vision of the Future
Authors:
Valentina Lenarduzzi,
Oscar Dieste,
Davide Fucci,
Sira Vegas
Abstract:
Background. Software Engineering (SE) researchers extensively perform experiments with human subjects. Well-defined samples are required to ensure external validity. Samples are selected purposely or by convenience, limiting the generalizability of results. Objective. We aim to depict the current status of participant selection in empirical SE, identifying the main threats and how they are mitigated. We draft a robust approach to participant selection. Method. We reviewed existing participant selection guidelines in SE, and performed a preliminary literature review to find out how participant selection is conducted in SE in practice. Results. We outline a new selection methodology, by 1) defining the characteristics of the desired population, 2) locating possible sources of sampling available for researchers, and 3) identifying and reducing the "distance" between the selected sample and its corresponding population. Conclusion. We propose a roadmap to develop and empirically validate the selection methodology.
Submitted 27 August, 2021;
originally announced August 2021.
-
Publication Bias: A Detailed Analysis of Experiments Published in ESEM
Authors:
Rolando P. Reyes,
Óscar Dieste,
Efraín R. Fonseca C.,
Natalia Juristo
Abstract:
Background: Publication bias is the failure to publish the results of a study based on the direction or strength of the study findings. The existence of publication bias is firmly established in areas like medical research. Recent research suggests the existence of publication bias in Software Engineering. Aims: Finding out whether experiments published in the International Workshop on Empirical Software Engineering and Measurement (ESEM) are affected by publication bias. Method: We review experiments published in ESEM. We also survey experimental researchers to triangulate our findings. Results: ESEM experiments do not define hypotheses and frequently perform multiple testing. One-tailed tests have a slightly higher rate of achieving statistically significant results. We could not find other practices associated with publication bias. Conclusions: Our results provide a more encouraging perspective of SE research than previous research: (1) ESEM publications do not seem to be strongly affected by biases and (2) we identify some practices that could be associated with p-hacking, but it is more likely that they are related to the conduct of exploratory research.
Submitted 23 June, 2021;
originally announced June 2021.
-
A Family of Experiments on Test-Driven Development
Authors:
Adrian Santos,
Sira Vegas,
Oscar Dieste,
Fernando Uyaguari,
Ayse Tosun,
Davide Fucci,
Burak Turhan,
Giuseppe Scanniello,
Simone Romano,
Itir Karac,
Marco Kuhrmann,
Vladimir Mandic,
Robert Ramac,
Dietmar Pfahl,
Christian Engblom,
Jarno Kyykka,
Kerli Rungi,
Carolina Palomeque,
Jaroslav Spisak,
Markku Oivo,
Natalia Juristo
Abstract:
Context: Test-driven development (TDD) is an agile software development approach that has been widely claimed to improve software quality. However, the extent to which TDD improves quality appears to be largely dependent upon the characteristics of the study in which it is evaluated (e.g., the research method, participant type, programming environment, etc.). The particularities of each study make the aggregation of results untenable. Objectives: The goal of this paper is to: increase the accuracy and generalizability of the results achieved in isolated experiments on TDD, provide joint conclusions on the performance of TDD across different industrial and academic settings, and assess the extent to which the characteristics of the experiments affect the quality-related performance of TDD. Method: We conduct a family of 12 experiments on TDD in academia and industry. We aggregate their results by means of meta-analysis. We perform exploratory analyses to identify variables impacting the quality-related performance of TDD. Results: TDD novices achieve a slightly higher code quality with iterative test-last development (i.e., ITL, the reverse approach of TDD) than with TDD. The task being developed largely determines quality. The programming environment, the order in which TDD and ITL are applied, or the learning effects from one development approach to another do not appear to affect quality. The quality-related performance of professionals using TDD drops more than for students. We hypothesize that this may be due to their being more resistant to change and potentially less motivated than students. Conclusion: Previous studies seem to provide conflicting results on TDD performance (i.e., positive vs. negative, respectively). We hypothesize that these conflicting results may be due to different study durations, experiment participants being unfamiliar with the TDD process...
Submitted 24 November, 2020;
originally announced November 2020.
-
Increasing Validity Through Replication: An Illustrative TDD Case
Authors:
Adrian Santos,
Sira Vegas,
Fernando Uyaguari,
Oscar Dieste,
Burak Turhan,
Natalia Juristo
Abstract:
Context: Software Engineering (SE) experiments suffer from threats to validity that may impact their results. Replication allows researchers to address the weaknesses of previous experiments and increase the reliability of the findings. Objective: Illustrating the benefits of replication to increase the reliability of the findings and uncover moderator variables. Method: We replicate an experiment on Test-Driven Development (TDD) and address some of its threats to validity and those of a previous replication. We compare the replications' results and hypothesize on plausible moderators impacting results. Results: Differences across TDD replications' results might be due to the operationalization of the response variables, the allocation of subjects to treatments, the allowance to work outside the laboratory, the provision of stubs, or the task. Conclusion: Replications allow examining the robustness of the findings, hypothesizing on plausible moderators influencing results, and strengthening the evidence obtained.
Submitted 11 April, 2020;
originally announced April 2020.
-
How do Practitioners Perceive the Relevance of Requirements Engineering Research? An Ongoing Study
Authors:
X. Franch,
D. Méndez Fernández,
M. Oriol,
A. Vogelsang,
R. Heldal,
E. Knauss,
G. Horta Travassos,
J. C. Carver,
O. Dieste,
T. Zimmermann
Abstract:
The relevance of Requirements Engineering (RE) research to practitioners is a prerequisite for problem-driven research in the area and key for a long-term dissemination of research results to everyday practice. To better understand how industry practitioners perceive the practical relevance of RE research, we have initiated the RE-Pract project, an international collaboration conducting an empirical study. This project opts for a replication of previous work done in two different domains and relies on survey research. To this end, we have designed a survey to be sent to several hundred industry practitioners at various companies around the world and ask them to rate their perceived practical relevance of the research described in a sample of 418 RE papers published between 2010 and 2015 at the RE, ICSE, FSE, ESEC/FSE, ESEM and REFSQ conferences. In this paper, we summarise our research protocol and present the current status of our study and the planned future steps.
Submitted 14 June, 2017; v1 submitted 17 May, 2017;
originally announced May 2017.