-
BugsRepo: A Comprehensive Curated Dataset of Bug Reports, Comments and Contributors Information from Bugzilla
Authors:
Jagrit Acharya,
Gouri Ginde
Abstract:
Bug reports help software development teams enhance software quality, yet their utility is often compromised by unclear or incomplete information. This issue not only hinders developers' ability to quickly understand and resolve bugs but also poses significant challenges for various software maintenance prediction systems, such as bug triaging, severity prediction, and bug report summarization. To address this issue, we introduce BugsRepo, a multifaceted dataset derived from Mozilla projects that offers three key components to support a wide range of software maintenance tasks. First, it includes a Bug report meta-data & Comments dataset with detailed records for 119,585 fixed or closed and resolved bug reports, capturing fields like severity, creation time, status, and resolution to provide rich contextual insights. Second, BugsRepo features a contributor information dataset comprising 19,351 Mozilla community members, enriched with metadata on user roles, activity history, and contribution metrics such as the number of bugs filed, comments made, and patches reviewed, thus offering valuable information for tasks like developer recommendation. Lastly, the dataset provides a structured bug report subset of 10,351 well-structured bug reports, complete with steps to reproduce, actual behavior, and expected behavior. After this initial filter, a secondary filtering layer is applied using the CTQRS scale. By integrating static metadata, contributor statistics, and detailed comment threads, BugsRepo presents a holistic view of each bug's history, supporting advancements in automated bug report analysis, which can enhance the efficiency and effectiveness of software maintenance processes.
Submitted 26 April, 2025;
originally announced April 2025.
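The structured-subset filter described in the BugsRepo abstract above (keeping reports that contain steps to reproduce, actual behavior, and expected behavior) can be illustrated with a minimal sketch. The heading patterns and function names below are assumptions made for illustration; the abstract does not state how the authors detect these sections, and the secondary CTQRS filter is not modeled here.

```python
import re

# Hypothetical heading patterns: the abstract does not give the exact
# section labels used in the dataset, so these are illustrative guesses.
SECTION_PATTERNS = {
    "steps_to_reproduce": re.compile(
        r"^\s*steps to reproduce\b", re.IGNORECASE | re.MULTILINE
    ),
    "actual_behavior": re.compile(
        r"^\s*actual (behaviou?r|results?)\b", re.IGNORECASE | re.MULTILINE
    ),
    "expected_behavior": re.compile(
        r"^\s*expected (behaviou?r|results?)\b", re.IGNORECASE | re.MULTILINE
    ),
}


def missing_sections(description):
    """Return the names of the sections that no line of the report starts with."""
    return [
        name
        for name, pattern in SECTION_PATTERNS.items()
        if not pattern.search(description)
    ]


def is_well_structured(description):
    """A report passes this first filter only if all three sections are present."""
    return not missing_sections(description)
```

Matching on line-initial headings rather than bare substrings avoids counting a sentence such as "it actually crashes" as an Actual Behavior section.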
-
Can We Enhance Bug Report Quality Using LLMs?: An Empirical Study of LLM-Based Bug Report Generation
Authors:
Jagrit Acharya,
Gouri Ginde
Abstract:
Bug reports contain the information developers need to triage and fix software bugs. However, unclear, incomplete, or ambiguous information may lead to delays and excessive manual effort spent on bug triage and resolution. In this paper, we explore whether instruction fine-tuned Large Language Models (LLMs) can automatically transform casual, unstructured bug reports into high-quality, structured bug reports adhering to a standard template. We evaluate three open-source instruction-tuned LLMs (Qwen 2.5, Mistral, and Llama 3.2) against ChatGPT-4o, measuring performance on established metrics such as CTQRS, ROUGE, METEOR, and SBERT. Our experiments show that fine-tuned Qwen 2.5 achieves a CTQRS score of 77%, outperforming fine-tuned Mistral (71%), Llama 3.2 (63%), and ChatGPT in 3-shot learning (75%). Further analysis reveals that Llama 3.2 shows higher accuracy in detecting missing fields, particularly Expected Behavior and Actual Behavior, while Qwen 2.5 demonstrates superior performance in capturing Steps-to-Reproduce, with an F1 score of 76%. Additional testing of the models on other popular projects (e.g., Eclipse, GCC) demonstrates that our approach generalizes well, achieving up to 70% CTQRS on bug reports from unseen projects. These findings highlight the potential of instruction fine-tuning in automating structured bug report generation, reducing manual effort for developers and streamlining the software maintenance process.
Submitted 26 April, 2025;
originally announced April 2025.
-
"So what if I used GenAI?" -- Implications of Using Cloud-based GenAI in Software Engineering Research
Authors:
Gouri Ginde
Abstract:
Generative Artificial Intelligence (GenAI) advances have led to new technologies capable of generating high-quality code, natural language, and images. The next step is to integrate GenAI technology into various aspects of conducting research and related areas, a task typically carried out by researchers. Such research outcomes always come with a certain risk of liability. This paper sheds light on the various research aspects in which GenAI is used, thus raising awareness of its legal implications among novice and budding researchers. In particular, there are two risks: data protection and copyright. Both aspects are crucial for GenAI. We summarize the key aspects of our current knowledge that every software researcher using GenAI should be aware of to avoid critical mistakes that may expose them to liability claims, and we propose a checklist to guide such awareness.
Submitted 10 December, 2024;
originally announced December 2024.
-
Beyond Keywords: A Context-based Hybrid Approach to Mining Ethical Concern-related App Reviews
Authors:
Aakash Sorathiya,
Gouri Ginde
Abstract:
With the increasing proliferation of mobile applications in our everyday experiences, the concerns surrounding ethics have surged significantly. Users generally communicate their feedback, report issues, and suggest new functionalities in application (app) reviews, frequently emphasizing safety, privacy, and accountability concerns. Incorporating these reviews is essential to developing successful products. However, app reviews related to ethical concerns generally use domain-specific language and are expressed using a more varied vocabulary, which makes automated extraction of ethical concern-related app reviews a challenging and time-consuming effort.
This study proposes a novel Natural Language Processing (NLP) based approach that combines Natural Language Inference (NLI), which provides a deep comprehension of language nuances, and a decoder-only (LLaMA-like) Large Language Model (LLM) to extract ethical concern-related app reviews at scale. Utilizing 43,647 app reviews from the mental health domain, the proposed methodology 1) evaluates four NLI models to extract potential privacy reviews and compares the results of domain-specific privacy hypotheses with generic privacy hypotheses; 2) evaluates four LLMs for classifying app reviews as privacy concerns; and 3) uses the best NLI and LLM models to extract new privacy reviews from the dataset. Results show that the DeBERTa-v3-base-mnli-fever-anli NLI model with domain-specific hypotheses yields the best performance, and the Llama3.1-8B-Instruct LLM performs best in the classification of app reviews. Then, using NLI+LLM, an additional 1,008 new privacy-related reviews were extracted that were not identified through the keyword-based approach in previous research, thus demonstrating the effectiveness of the proposed approach.
Submitted 11 November, 2024;
originally announced November 2024.
-
Ethical software requirements from user reviews: A systematic literature review
Authors:
Aakash Sorathiya,
Gouri Ginde
Abstract:
Context: The growing focus on ethics within SE, primarily due to the significant reliance of individuals' lives on software and the consequential social and ethical considerations that impact both people and society, has brought attention to ethical software requirements identification and elicitation. User safety, privacy, and security concerns are of prime importance while developing software due to the widespread use of software across healthcare, education, and business domains. Thus, identifying and eliciting ethical software requirements from app user reviews, focusing on various aspects such as privacy, security, accountability, accessibility, transparency, fairness, safety, and social solidarity, is essential for developing trustworthy software solutions. Objective: This SLR aims to identify and analyze existing ethical requirements identification and elicitation techniques in the context of the formulated research questions. Method: We conducted an SLR based on Kitchenham et al.'s methodology. We identified and selected 47 primary articles for this study based on a predefined search protocol. Result: Ethical requirements gathering has recently drawn considerable interest in the research community due to the rise of ML and AI-based approaches in decision-making within software applications. This SLR provides an overview of ethical requirements identification techniques and the implications of extracting and addressing them. This study also reports the data sources used for analyzing user reviews. Conclusion: This SLR provides an understanding of the ethical software requirements and underscores the importance of user reviews in developing trustworthy software. The findings can also help inform future research and guide software engineers or researchers in addressing software ethical requirements.
Submitted 18 September, 2024;
originally announced October 2024.
-
Trustworthy and Responsible AI for Human-Centric Autonomous Decision-Making Systems
Authors:
Farzaneh Dehghani,
Mahsa Dibaji,
Fahim Anzum,
Lily Dey,
Alican Basdemir,
Sayeh Bayat,
Jean-Christophe Boucher,
Steve Drew,
Sarah Elaine Eaton,
Richard Frayne,
Gouri Ginde,
Ashley Harris,
Yani Ioannou,
Catherine Lebel,
John Lysack,
Leslie Salgado Arzuaga,
Emma Stanley,
Roberto Souza,
Ronnie de Souza Santos,
Lana Wells,
Tyler Williamson,
Matthias Wilms,
Zaman Wahid,
Mark Ungrin,
Marina Gavrilova
, et al. (1 additional author not shown)
Abstract:
Artificial Intelligence (AI) has paved the way for revolutionary decision-making processes, which, if harnessed appropriately, can contribute to advancements in various sectors, from healthcare to economics. However, its black-box nature presents significant ethical challenges related to bias and transparency. AI applications are heavily affected by biases, which produce inconsistent and unreliable findings, lead to significant costs and consequences, and highlight and perpetuate inequalities and unequal access to resources. Hence, developing safe, reliable, ethical, and Trustworthy AI systems is essential.
Our team of researchers working with Trustworthy and Responsible AI, part of the Transdisciplinary Scholarship Initiative within the University of Calgary, conducts research on Trustworthy and Responsible AI, including fairness, bias mitigation, reproducibility, generalization, interpretability, and authenticity. In this paper, we review and discuss the intricacies of AI biases, definitions, methods of detection and mitigation, and metrics for evaluating bias. We also discuss open challenges with regard to the trustworthiness and widespread application of AI across diverse domains of human-centric decision making, as well as guidelines to foster Responsible and Trustworthy AI models.
Submitted 2 September, 2024; v1 submitted 28 August, 2024;
originally announced August 2024.
-
Towards Extracting Ethical Concerns-related Software Requirements from App Reviews
Authors:
Aakash Sorathiya,
Gouri Ginde
Abstract:
As mobile applications become increasingly integral to our daily lives, concerns about ethics have grown drastically. Users share their experiences, report bugs, and request new features in application reviews, often highlighting safety, privacy, and accountability concerns. Approaches using machine learning techniques have been used in the past to identify these ethical concerns. However, understanding the underlying reasons behind them and extracting requirements that could address these concerns is crucial for safer software solution development. Thus, we propose a novel approach that leverages a knowledge graph (KG) model to extract software requirements from app reviews, capturing contextual data related to ethical concerns. Our framework consists of three main components: developing an ontology with relevant entities and relations, extracting key entities from app reviews, and creating connections between them. This study analyzes app reviews of the Uber mobile application (a popular taxi/ride app) and presents the preliminary results from the proposed solution. Initial results show that KG can effectively capture contextual data related to software ethical concerns, the underlying reasons behind these concerns, and the corresponding potential requirements.
Submitted 19 July, 2024;
originally announced July 2024.
-
PRAGyan -- Connecting the Dots in Tweets
Authors:
Rahul Ravi,
Gouri Ginde,
Jon Rokne
Abstract:
As social media platforms grow, understanding the underlying reasons behind events and statements becomes crucial for businesses, policymakers, and researchers. This research explores the integration of Knowledge Graphs (KGs) with Large Language Models (LLMs) to perform causal analysis of a tweet dataset. LLM-aided analysis techniques often lack depth in uncovering the causes driving observed effects. By leveraging KGs and LLMs, which encode rich semantic relationships and temporal information, this study aims to uncover the complex interplay of factors influencing causal dynamics and compare the results with those obtained using GPT-3.5 Turbo. We employ a Retrieval-Augmented Generation (RAG) model, utilizing a KG stored in a Neo4j (a.k.a. PRAGyan) data format, to retrieve relevant context for causal reasoning. Our approach demonstrates that the KG-enhanced LLM RAG can provide improved results when compared to the baseline LLM (GPT-3.5 Turbo) model as the source corpus increases in size. Our qualitative analysis highlights the advantages of combining KGs with LLMs for improved interpretability and actionable insights, facilitating informed decision-making across various domains. Quantitative analysis using metrics such as BLEU and cosine similarity, in turn, shows that our approach outperforms the baseline by 10%.
Submitted 18 July, 2024;
originally announced July 2024.
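The PRAGyan abstract above names cosine similarity as one of its quantitative metrics. As a rough illustration of that metric only, the sketch below compares two texts using word-count vectors. This representation is an assumption chosen to keep the example self-contained; the abstract does not say which text representation the authors compare.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts represented as word-count vectors.

    Assumption: whitespace tokenization and raw counts stand in for
    whatever representation the paper actually uses.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())

    # Dot product over the words the two texts share.
    dot = sum(counts_a[w] * counts_b[w] for w in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty text has a zero vector; define its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

The score ranges from 0 (no shared words) to 1 (identical word distributions), so a generated answer can be scored against a reference answer on a common scale.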
-
AROhI: An Interactive Tool for Estimating ROI of Data Analytics
Authors:
Noopur Zambare,
Jacob Idoko,
Jagrit Acharya,
Gouri Ginde
Abstract:
The cost of adopting new technology is rarely analyzed and discussed, even though it is vital for many software companies worldwide. Thus, it is crucial to consider Return On Investment (ROI) when performing data analytics. Questions such as "How much analytics is needed?" are hard to answer. ROI could guide decision support on the What?, How?, and How Much? of analytics for a given problem. This work details a comprehensive tool that provides conventional and advanced ML approaches, demonstrated using requirements dependency extraction and its ROI analysis as a use case. Utilizing advanced ML techniques such as Active Learning and Transfer Learning, along with an early large language model, BERT (Bidirectional Encoder Representations from Transformers), as components for automating dependency extraction, the tool demonstrates a mechanism to compute the ROI of ML algorithms and presents a clear picture of the trade-offs between the costs and benefits of a technology investment.
Submitted 30 July, 2024; v1 submitted 18 July, 2024;
originally announced July 2024.
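The AROhI abstract above centers on computing ROI for ML approaches. The sketch below uses the standard definition, ROI = (benefit - cost) / cost. How the tool itself estimates cost and benefit is not described in the abstract, so the function name and inputs here are illustrative assumptions.

```python
def roi(benefit, cost):
    """Standard return on investment: net gain divided by cost.

    Assumption: benefit and cost are already expressed in the same unit
    (for example, person-hours or dollars); estimating them is out of scope.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost
```

A positive value means the analytics effort returned more than it cost, and a negative value means it cost more than it returned, which is the trade-off the abstract refers to.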
-
Use of NoSQL database and visualization techniques to analyze massive scholarly article data from journals
Authors:
Gouri Ginde,
Snehanshu Saha,
Archana Mathur,
Harsha Vamsi,
Sudeepa Roy Dey,
Swati Sampatrao Gambhire
Abstract:
Visualizing massive data is a challenging endeavor. Extracting data and providing graphical representations can aid its effective utilization in terms of interpretation and knowledge discovery. Publishing research articles has become a way of life for academicians. Scholarly publications can shape the professional growth of authors and also expand the research and technological growth of a country, continent, and other demographic regions. Scholarly articles are published in gigantic numbers across different domains by various journals. Information related to articles, authors, their affiliations, number of citations, country, publisher, references, and other details is like a gold mine for statisticians and data analysts. This data, when used skillfully via a visual analysis tool, can provide valuable understanding and can aid in deeper exposition for researchers working in domains like scientometrics and bibliometrics. Since the data is not readily available, we used Google Scholar, a comprehensive and free repository of scholarly articles, as the data source for our study. Data was scraped from Google Scholar, stored as a graph, and later visualized in the form of nodes and their relationships, which revealed discerning and concealed information about the growing impact of articles, journals, and authors in their domains. In addition, graphical analysis depicted an author's evident domain shift, the spread of an author's research domains, the prediction of emerging domains and subdomains, and the detection of cartel behavior at the journal and author level. The Neo4j graph database was used in the background to store the data in a structured manner.
Submitted 21 April, 2018;
originally announced May 2018.
-
Visualisation of massive data from scholarly Article and Journal Database A Novel Scheme
Authors:
Gouri Ginde
Abstract:
Publishing scholarly articles and getting cited has become a way of life for academicians. These scholarly publications shape the career growth of the authors as well as the growth of the country, continent, and technological domains. Author affiliations, country, and other information about an author, coupled with data analytics, can provide useful and insightful results. However, massive and complete data is required to perform this research. Google Scholar, a comprehensive and free repository of scholarly articles, has been used as a data source for this purpose. Data scraped from Google Scholar, when stored as a graph and visualized in the form of nodes and relationships, can offer discerning and concealed information, such as an author's evident domain shift, the spread of an author's research domains, the prediction of emerging domains and subdomains, and the detection of journal- and author-level citation cartel behavior. The data from the graph database is also used to compute scholastic indicators for the journals. Finally, an econometric model, the Cobb-Douglas model, is used to compute the journals' Modeling "Internationality" Index based on these scholastic indicators.
Submitted 3 November, 2016;
originally announced November 2016.
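The abstract above applies a Cobb-Douglas model to scholastic indicators. The general Cobb-Douglas form multiplies a scale factor by each input raised to its own exponent, y = A * x_1^a_1 * ... * x_n^a_n. The sketch below evaluates that general form only; which indicators serve as inputs and what exponent values are used are not given in the abstract, so the arguments here are placeholders.

```python
import math


def cobb_douglas(inputs, exponents, scale=1.0):
    """Evaluate the general Cobb-Douglas form: scale * prod(x_i ** a_i).

    Assumption: `inputs` would hold non-negative indicator values and
    `exponents` their elasticities; both are hypothetical placeholders.
    """
    if len(inputs) != len(exponents):
        raise ValueError("inputs and exponents must have the same length")
    if any(x < 0 for x in inputs):
        raise ValueError("inputs must be non-negative")
    return scale * math.prod(x ** a for x, a in zip(inputs, exponents))
```

Because the form is multiplicative, any single input at zero drives the whole index to zero when its exponent is positive, which is one reason the choice of inputs matters.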
-
ScientoBASE: A Framework and Model for Computing Scholastic Indicators of non-local influence of Journals via Native Data Acquisition algorithms
Authors:
Gouri Ginde,
Snehanshu Saha,
Archana Mathur,
Sukrit Venkatagiri,
Sujith Vadakkepat,
Anand Narasimhamurthy,
B. S. Daya Sagar
Abstract:
Defining and measuring internationality as a function of influence diffusion of scientific journals is an open problem. There exists no metric to rank journals based on the extent or scale of internationality. Measuring internationality is qualitative, vague, open to interpretation, and limited by vested interests. With the tremendous increase in the number of journals in various fields and the unflinching desire of academics across the globe to publish in "international" journals, it has become an absolute necessity to evaluate, rank, and categorize journals based on internationality. In the current work, the authors define internationality as a measure of influence that transcends geographic boundaries. The authors raise concerns about unethical practices in the process of journal publication whereby the scholarly influence of a select few is artificially boosted, primarily by resorting to editorial maneuvers. To counter the impact of such tactics, the authors propose a new method that defines and measures internationality by eliminating such local effects when computing the influence of journals. A new metric, Non-Local Influence Quotient (NLIQ), is proposed as one such parameter for internationality computation, along with another novel metric, Other-Citation Quotient, defined as the complement of the ratio of self-citations to total citations. In addition, SNIP and International Collaboration Ratio are used as two other parameters.
Submitted 6 May, 2016;
originally announced May 2016.
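The ScientoBASE abstract above describes the Other-Citation Quotient as the complement of the ratio of self-citations to total citations, which reads as 1 - (self-citations / total citations). The sketch below follows that reading directly. The function name and input validation are illustrative additions, and the abstract does not specify how a journal with no citations is handled.

```python
def other_citation_quotient(self_citations, total_citations):
    """Complement of the self-citation ratio: 1 - self / total.

    Assumption: a journal with zero total citations is rejected here,
    since the abstract does not define the quotient for that case.
    """
    if total_citations <= 0:
        raise ValueError("total_citations must be positive")
    if not 0 <= self_citations <= total_citations:
        raise ValueError("self_citations must be between 0 and total_citations")
    return 1.0 - self_citations / total_citations
```

For example, a journal with 25 self-citations out of 100 total citations gets a quotient of 0.75, and a journal cited only by itself gets 0.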