-
Gender Disparities in Contributions, Leadership, and Collaboration: An Exploratory Study on Software Systems Research
Authors:
Shamse Tasnim Cynthia,
Saikat Mondal,
Joy Krishan Das,
Banani Roy
Abstract:
Gender diversity enhances research by bringing diverse perspectives and innovative approaches. It ensures equitable solutions that address the needs of diverse populations. However, gender disparity persists in research, where women remain underrepresented, which might limit diversity and innovation. Many women even leave scientific careers because their contributions often go unnoticed and undervalued. Therefore, understanding gender-based contributions and collaboration dynamics is crucial to addressing this gap and creating a more inclusive research environment. In this study, we analyzed 2,000 articles published over the past decade in the Journal of Systems and Software (JSS). From these, we selected 384 articles that detailed authors' contributions and contained both female and male authors to investigate gender-based contributions. Our contributions are fourfold. First, we analyzed women's engagement in software systems research. Our analysis showed that only 32.74% of all authors are women, and that studies led or supervised by women were fewer than those led or supervised by men. Second, we investigated female authors' contributions across 14 major roles. Interestingly, we found that women contributed comparably to men in most roles, with more contributions in conceptualization, writing, and reviewing articles. Third, we explored the areas of software systems research and found that female authors are more actively involved in human-centric research domains. Finally, we analyzed gender-based collaboration dynamics. Our findings revealed that female supervisors tended to collaborate locally more often than at the national level. Our study highlights that women's contributions to software systems research are comparable to those of men. Therefore, the barriers need to be addressed to enhance female participation and ensure equity and inclusivity in research.
Submitted 6 May, 2025; v1 submitted 20 December, 2024;
originally announced December 2024.
-
Why Do Developers Engage with ChatGPT in Issue-Tracker? Investigating Usage and Reliance on ChatGPT-Generated Code
Authors:
Joy Krishan Das,
Saikat Mondal,
Chanchal K. Roy
Abstract:
Large language models (LLMs) like ChatGPT have shown the potential to assist developers with coding and debugging tasks. However, their role in collaborative issue resolution is underexplored. In this study, we analyzed 1,152 Developer-ChatGPT conversations across 1,012 issues in GitHub to examine the diverse usage of ChatGPT and reliance on its generated code. Our contributions are fourfold. First, we manually analyzed 289 conversations to understand ChatGPT's usage in GitHub Issues. Our analysis revealed that ChatGPT is primarily utilized for ideation, whereas its usage for validation (e.g., code documentation accuracy) is minimal. Second, we applied BERTopic modeling to the entire dataset to identify key areas of engagement. We found that backend issues (e.g., API management) dominate conversations, while testing is surprisingly less covered. Third, we utilized the CPD clone detection tool to check if the code generated by ChatGPT was used to address issues. Our findings revealed that ChatGPT-generated code was used as-is to resolve only 5.83% of the issues. Fourth, we estimated sentiment using a RoBERTa-based sentiment analysis model to determine developers' satisfaction with different usages and engagement areas. We found positive sentiment (i.e., high satisfaction) about using ChatGPT for refactoring and addressing data analytics (e.g., categorizing table data) issues. In contrast, we observed negative sentiment when using ChatGPT to debug issues and address automation tasks (e.g., GUI interactions). Our findings show the unmet needs and growing dissatisfaction among developers. Researchers and ChatGPT developers should focus on developing task-specific solutions that help resolve diverse issues, improving user satisfaction and problem-solving efficiency in software development.
Submitted 10 December, 2024; v1 submitted 9 December, 2024;
originally announced December 2024.
-
Beyond Text-to-SQL for IoT Defense: A Comprehensive Framework for Querying and Classifying IoT Threats
Authors:
Ryan Pavlich,
Nima Ebadi,
Richard Tarbell,
Billy Linares,
Adrian Tan,
Rachael Humphreys,
Jayanta Kumar Das,
Rambod Ghandiparsi,
Hannah Haley,
Jerris George,
Rocky Slavin,
Kim-Kwang Raymond Choo,
Glenn Dietrich,
Anthony Rios
Abstract:
Recognizing the promise of natural language interfaces to databases, prior studies have emphasized the development of text-to-SQL systems. While substantial progress has been made in this field, existing research has concentrated on generating SQL statements from text queries. The broader challenge, however, lies in inferring new information about the returned data. Our research makes two major contributions to address this gap. First, we introduce a novel Internet-of-Things (IoT) text-to-SQL dataset comprising 10,985 text-SQL pairs and 239,398 rows of network traffic activity. The dataset includes query types that are limited in prior text-to-SQL datasets, notably temporal queries. Our dataset is sourced from a smart building's IoT ecosystem exploring sensor read and network traffic data. Second, our dataset allows two-stage processing, where the returned data (network traffic) from a generated SQL query can be categorized as malicious or not. Our results show that joint training to query and infer information about the data can improve overall text-to-SQL performance, nearly matching substantially larger models. We also show that current large language models (e.g., GPT-3.5) struggle to infer new information about returned data; thus, our dataset provides a novel test bed for integrating complex domain-specific reasoning into LLMs.
Submitted 25 June, 2024;
originally announced June 2024.
-
Investigating the Utility of ChatGPT in the Issue Tracking System: An Exploratory Study
Authors:
Joy Krishan Das,
Saikat Mondal,
Chanchal K. Roy
Abstract:
Issue tracking systems serve as the primary tool for incorporating external users and customizing a software project to meet the users' requirements. However, the limited number of contributors and the challenge of identifying the best approach for each issue often impede effective resolution. Recently, an increasing number of developers have been turning to AI tools like ChatGPT to enhance problem-solving efficiency. While previous studies have demonstrated the potential of ChatGPT in areas such as automatic program repair, debugging, and code generation, few studies examine how developers explicitly use ChatGPT to resolve issues in their tracking systems. Hence, this study aims to examine the interaction between ChatGPT and developers to analyze their prevalent activities and provide a resolution. In addition, we assess code reliability by checking, with the clone detection tool NiCad, whether the code produced by ChatGPT was integrated into the project's codebase. Our investigation reveals that developers mainly use ChatGPT for brainstorming solutions but often opt to write their own code instead of using ChatGPT-generated code, possibly due to concerns over the generation of "hallucinated code", as highlighted in the literature.
Submitted 6 February, 2024;
originally announced February 2024.
-
Analyzing Host-Viral Interactome of SARS-CoV-2 for Identifying Vulnerable Host Proteins during COVID-19 Pathogenesis
Authors:
Jayanta Kumar Das,
Swarup Roy,
Pietro Hiram Guzzi
Abstract:
The development of therapeutic targets for COVID-19 treatment is based on understanding the molecular mechanism of pathogenesis. Identifying the genes and proteins involved in the infection mechanism is the key to shedding light on these complex molecular mechanisms. The combined effort of many laboratories throughout the world has produced a large body of data on both protein and genetic interactions. In this work, we integrate these available results and obtain a host protein-protein interaction network composed of 1432 human proteins. We calculate network centrality measures to identify key proteins. Then we perform functional enrichment of central proteins. We observed that the identified proteins are mostly associated with several crucial pathways, including cellular processes, signal transduction, and neurodegenerative disease. Finally, we focused on proteins involved in causing disease in the human respiratory tract. We conclude that COVID-19 is a complex disease, and we highlight many potential therapeutic targets, including RBX1, HSPA5, ITCH, RAB7A, RAB5A, RAB8A, PSMC5, CAPZB, CANX, IGF2R, and HSPA1A, which are central and also associated with multiple diseases.
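As a rough illustration of the centrality step, the sketch below computes degree centrality, one common centrality measure, for an undirected interaction network. The abstract does not say which measures the authors used, so the choice of degree centrality, the function name `degree_centrality`, and the node labels `A` to `D` are all assumptions made for illustration; they are not real proteins or real interactions.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list.

    Self-loops are skipped and duplicate edges are counted once.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbours[u].add(v)
            neighbours[v].add(u)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


# Hypothetical toy network (labels are placeholders, not proteins).
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
scores = degree_centrality(toy_edges)
# Rank nodes from most to least central.
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy network, node `A` touches every other node and therefore ranks first; in a real interactome, the highest-ranked nodes would be the candidates for "key proteins".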
Submitted 5 February, 2021;
originally announced February 2021.
-
Relationship of Two Discrete Dynamical Models: One-dimensional Cellular Automata and Integral Value Transformations
Authors:
Sreeya Ghosh,
Sudhakar Sahoo,
Sk. Sarif Hassan,
Jayanta Kumar Das,
Pabitra Pal Choudhury
Abstract:
Cellular Automaton (CA) and Integral Value Transformation (IVT) are two well-established mathematical models which evolve in discrete time steps. Theoretically, studies on CA suggest that CA is capable of producing a great variety of evolution patterns. However, the computation of non-linear CA or higher-dimensional CA may be complex, whereas IVTs can be manipulated easily. The main purpose of this paper is to study the link between a transition function of a one-dimensional CA and IVTs. Mathematically, we have also established the algebraic structures of a set of transition functions of a one-dimensional CA as well as that of a set of IVTs using binary operations. DNA sequence evolution has also been modelled using IVTs.
Submitted 30 June, 2020; v1 submitted 24 June, 2020;
originally announced June 2020.
-
Implementation of the open source virtualization technologies in cloud computing
Authors:
Mohammad Mamun Or Rashid,
M. Masud Rana,
Jugal Krishna Das
Abstract:
Virtualization and cloud computing are recent buzzwords in the digital world. Cloud computing provides IT as a service to users on demand. This service offers greater flexibility, availability, reliability, and scalability through a utility computing model. This new computing concept has immense potential for use in e-governance and in overall IT development in developing countries such as Bangladesh.
Submitted 11 May, 2016;
originally announced May 2016.
-
Multi-Number CVT-XOR Arithmetic Operations in any Base System and its Significant Properties
Authors:
Jayanta Kumar Das,
Pabitra Pal Choudhury,
Sudhakar Sahoo
Abstract:
Carry Value Transformation (CVT) is a model of a discrete dynamical system and a special case of Integral Value Transformations (IVTs). It was proved earlier in [5] that the sum of two non-negative integers is equal to the sum of their CVT and XOR values in any base system. In the present study, this result is extended to perform CVT and XOR operations on many non-negative integers in any base system. To achieve this, the definitions of CVT and XOR are modified to operate over a set of multiple integers instead of two. Some important properties of these operations have also been studied. With the help of cellular automata, the adder circuit designed in [14] using the CVT-XOR recurrence formula is used to design a parallel adder circuit for multiple numbers in the binary number system.
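The identity stated above (the sum equals CVT plus XOR) can be checked with a minimal sketch. It assumes one natural multi-number reading that may differ in detail from the paper's definitions: in each digit column, XOR keeps the column sum modulo the base, and CVT keeps the carry, moved one place to the left. The function name `multi_cvt_xor` is hypothetical.

```python
def multi_cvt_xor(numbers, base):
    """Return (cvt, xor) for non-negative integers in the given base.

    Assumed definitions (illustrative only):
      xor: per-column digit sum modulo the base
      cvt: per-column carry, shifted one place to the left
    With many operands a column's carry can exceed base - 1, so cvt is
    returned as an integer value rather than as a digit string.
    """
    if base < 2:
        raise ValueError("base must be at least 2")
    cvt_val, xor_val, place = 0, 0, 1
    rest = list(numbers)
    while any(rest):
        column = sum(n % base for n in rest)           # digit sum of this column
        xor_val += (column % base) * place             # digit that stays in place
        cvt_val += (column // base) * place * base     # carry moves one place left
        rest = [n // base for n in rest]
        place *= base
    return cvt_val, xor_val


values = [17, 250, 9]
cvt_val, xor_val = multi_cvt_xor(values, 10)
# The sum of the operands equals CVT + XOR.
identity_holds = (cvt_val + xor_val == sum(values))
```

For two operands in base 2, this reduces to the familiar bitwise form: XOR is `a ^ b` and CVT is `(a & b) << 1`.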
Submitted 30 November, 2015;
originally announced January 2016.
-
Carry Value Transformation (CVT) - Exclusive OR (XOR) Tree and Its Significant Properties
Authors:
Jayanta Kumar Das,
Pabitra Pal Choudhury,
Sudhakar Sahoo
Abstract:
CVT and XOR are two binary operations that are used together to calculate the sum of two non-negative integers through a recursive mechanism. In the present study, the convergence behavior of this recursive mechanism is captured through a tree-like structure named the CVT-XOR Tree. We analyze how to identify the parent nodes, leaf nodes, and internal nodes in the CVT-XOR Tree. We also provide the parent information, depth information, and number of children of a node in different CVT-XOR Trees by defining three different matrices. Lastly, an observation is made regarding the very old mathematical problem of the Goldbach Conjecture.
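The recursive mechanism can be sketched in base 2 as follows. This assumes the usual binary definition, CVT(a, b) = (a AND b) shifted left by one bit, and iterates the pair (a, b) to (CVT(a, b), XOR(a, b)) until the carry part becomes zero. The function names `cvt` and `cvt_xor_sum` are hypothetical, and the sketch does not build the CVT-XOR Tree itself.

```python
def cvt(a, b):
    """Binary Carry Value Transformation: the carry bits, shifted left by one."""
    return (a & b) << 1


def cvt_xor_sum(a, b):
    """Iterate (a, b) -> (CVT(a, b), XOR(a, b)) until the carry part is zero.

    Returns (total, steps), where total equals a + b and steps is the
    number of iterations performed before the carry vanished.
    """
    steps = 0
    while a != 0:
        a, b = cvt(a, b), a ^ b
        steps += 1
    return b, steps


# Trace for (5, 3): (5, 3) -> (2, 6) -> (4, 4) -> (8, 0) -> (0, 8)
total, steps = cvt_xor_sum(5, 3)
```

Each iteration preserves the value a + b, so when the carry part reaches zero the XOR part holds the sum. The number of iterations corresponds to how quickly a given pair converges.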
Submitted 4 June, 2015;
originally announced June 2015.
-
On Analysis and Generation of some Biologically Important Boolean Functions
Authors:
Camellia Ray,
Jayanta Kumar Das,
Pabitra Pal Choudhury
Abstract:
Boolean networks are used to model biological networks such as gene regulatory networks. Boolean networks often show highly chaotic behaviour that is sensitive to small perturbations. To reduce the chaotic behaviour and attain stability in the gene regulatory network, Nested Canalizing Functions (NCFs) are best suited. NCFs and their variants have a wide range of applications in systems biology. Many previous works addressed the application of canalizing functions, but few methods existed to check whether an arbitrary Boolean function is canalizing. In this paper, this problem is solved using the Karnaugh Map, and it is also shown that when the canalizing functions of variable are given, all the canalizing functions of variable can be generated by the method of concatenation. We also uniquely identify the number of NCFs having a particular Hamming Distance (H.D) generated by each variable as the starting canalizing input. Partially NCFs of 4 variables have also been studied in this paper.
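For readers unfamiliar with the term, a Boolean function is canalizing if fixing some single input to some value forces the output regardless of the other inputs. The sketch below checks this by brute force over the truth table; it is not the Karnaugh Map method of the paper, and the function name `canalizing_inputs` is hypothetical.

```python
from itertools import product


def canalizing_inputs(f, n):
    """List (variable index, canalizing value, forced output) triples.

    f takes n arguments in {0, 1}. An empty list means f is not
    canalizing. Constant functions are trivially reported for every
    variable and value.
    """
    found = []
    for i in range(n):
        for a in (0, 1):
            # Outputs over all inputs where variable i is fixed to a.
            outputs = {f(*x) for x in product((0, 1), repeat=n) if x[i] == a}
            if len(outputs) == 1:
                found.append((i, a, outputs.pop()))
    return found


and_result = canalizing_inputs(lambda x, y: x & y, 2)
xor_result = canalizing_inputs(lambda x, y: x ^ y, 2)
```

Two-input AND is canalizing in both variables (setting either input to 0 forces the output to 0), whereas two-input XOR has no canalizing input. The brute-force check examines the full truth table, so it is practical only for small numbers of variables.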
Submitted 12 September, 2014; v1 submitted 9 May, 2014;
originally announced May 2014.