-
Centrality Change Proneness: an Early Indicator of Microservice Architectural Degradation
Authors:
Alexander Bakhtin,
Matteo Esposito,
Valentina Lenarduzzi,
Davide Taibi
Abstract:
Over the past decade, the wide adoption of Microservice Architecture has required the identification of various patterns and anti-patterns to prevent Microservice Architectural Degradation. Frequently, the systems are modelled as a network of connected services. Recently, the study of temporal networks has emerged as a way to describe and analyze evolving networks. Previous research has explored how software metrics such as size, complexity, and quality are related to microservice centrality in the architectural network. This study investigates whether temporal centrality metrics can provide insight into the early detection of architectural degradation by correlating or affecting software metrics. We reconstructed the architecture of 7 releases of an OSS microservice project with 42 services. For every service in every release, we computed the software and centrality metrics. From one of the latter, we derived a new metric, Centrality Change Proneness. We then explored the correlation between the metrics. We identified 7 size and 5 complexity metrics that have a consistent correlation with centrality, while Centrality Change Proneness did not affect the software metrics, thus providing yet another perspective and an early indicator of microservice architectural degradation.
Submitted 9 June, 2025;
originally announced June 2025.
-
Leveraging Network Methods for Hub-like Microservice Detection
Authors:
Alexander Bakhtin,
Matteo Esposito,
Valentina Lenarduzzi,
Davide Taibi
Abstract:
Context: Microservice Architecture is a popular architectural paradigm that facilitates flexibility by decomposing applications into small, independently deployable services. Catalogs of architectural anti-patterns have been proposed to highlight the negative aspects of flawed microservice design. In particular, the Hub-like anti-pattern lacks an unambiguous definition and detection method. Aim: In this work, we aim to find a robust detection approach for the Hub-like microservice anti-pattern that outputs a reasonable number of Hub-like candidates with high precision. Method: We leveraged a dataset of 25 microservice networks and several network hub detection techniques to identify the Hub-like anti-pattern, namely scale-free property, centrality metrics and clustering coefficient, minimum description length principle, and the approach behind the Arcan tool. Results and Conclusion: Our findings revealed that the studied architectural networks are not scale-free, that most considered hub detection approaches do not agree on the detected hubs, and that the method by Kirkley leveraging the Erdos-Renyi encoding is the most accurate one in terms of the number of detected hubs and the detection precision. Investigating further the applicability of these methods to detecting Hub-like components in microservice-based and other systems opens up new research directions. Moreover, our results provide an evaluation of the approach utilized by the widely used Arcan tool and highlight the potential to update the tool to use the normalized degree centrality of a component in the network, or for the approach based on ER encoding to be adopted instead.
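As a rough illustration of the normalized degree centrality mentioned in the abstract above, the sketch below divides each node's degree by the maximum possible degree, n - 1. It is not the authors' or Arcan's implementation: the function name, the service names, and the choice to treat call edges as undirected are all assumptions made for this example.

```python
from collections import defaultdict


def normalized_degree_centrality(edges):
    """Return degree / (n - 1) for every node of an undirected simple graph.

    `edges` is an iterable of (node_a, node_b) pairs. Treating service
    calls as undirected is an assumption made for this sketch.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical service dependency edges, invented for illustration only.
calls = [
    ("gateway", "orders"),
    ("gateway", "users"),
    ("gateway", "billing"),
    ("orders", "billing"),
]
centrality = normalized_degree_centrality(calls)
```

In this toy graph the hypothetical `gateway` service is connected to every other service, so its normalized degree is 1.0, which is the kind of value a hub detection threshold would flag.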
Submitted 9 June, 2025;
originally announced June 2025.
-
A Private Smart Wallet with Probabilistic Compliance
Authors:
Andrea Rizzini,
Marco Esposito,
Francesco Bruschi,
Donatella Sciuto
Abstract:
We propose a privacy-preserving smart wallet with a novel invitation-based private onboarding mechanism. The solution integrates two levels of compliance in concert with an authority party: a proof of innocence mechanism and an ancestral commitment tracking system using Bloom filters for probabilistic UTXO chain states. Performance analysis demonstrates practical efficiency: private transfers with compliance checks complete within seconds on a consumer-grade laptop, with proof generation overhead remaining low overall. On-chain costs stay minimal, ensuring affordability for all operations on the Base layer 2 network. The wallet facilitates private contact list management through encrypted data blobs while maintaining transaction unlinkability. Our evaluation validates the approach's viability for privacy-preserving, compliance-aware digital payments with minimized computational and financial overhead.
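For readers unfamiliar with the data structure named in the abstract above, the following is a minimal, generic Bloom filter sketch. It only illustrates probabilistic membership (no false negatives, occasional false positives) and does not reflect the wallet's actual design; the class name, the sizes, and the use of SHA-256 with an index prefix are assumptions chosen for this example.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter; the parameters here are illustrative assumptions."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive num_hashes bit positions by prefixing the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical commitment identifier, invented for illustration only.
seen = BloomFilter()
seen.add(b"commitment-1")
```

A lookup for an item that was added always returns True, while a lookup for an item that was never added usually returns False but may return True, which is why the abstract describes the tracked state as probabilistic.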
Submitted 5 June, 2025;
originally announced June 2025.
-
AudSemThinker: Enhancing Audio-Language Models through Reasoning over Semantics of Sound
Authors:
Gijs Wijngaard,
Elia Formisano,
Michele Esposito,
Michel Dumontier
Abstract:
Audio-language models have shown promising results in various sound understanding tasks, yet they remain limited in their ability to reason over the fine-grained semantics of sound. In this paper, we present AudSemThinker, a model whose reasoning is structured around a framework of auditory semantics inspired by human cognition. To support this, we introduce AudSem, a novel dataset specifically curated for semantic descriptor reasoning in audio-language models. AudSem addresses the persistent challenge of data contamination in zero-shot evaluations by providing a carefully filtered collection of audio samples paired with captions generated through a robust multi-stage pipeline. Our experiments demonstrate that AudSemThinker outperforms state-of-the-art models across multiple training settings, highlighting its strength in semantic audio reasoning. Both AudSemThinker and the AudSem dataset are released publicly.
Submitted 20 May, 2025;
originally announced May 2025.
-
Descriptor: C++ Self-Admitted Technical Debt Dataset (CppSATD)
Authors:
Phuoc Pham,
Murali Sridharan,
Matteo Esposito,
Valentina Lenarduzzi
Abstract:
In software development, technical debt (TD) refers to suboptimal implementation choices made by the developers to meet urgent deadlines and limited resources, posing challenges for future maintenance. Self-Admitted Technical Debt (SATD) is a sub-type of TD, representing specific TD instances "openly admitted" by the developers and often expressed through source code comments. Previous research on SATD has focused predominantly on the Java programming language, revealing a significant gap in cross-language SATD. Such a narrow focus limits the generalizability of existing findings as well as SATD detection techniques across multiple programming languages. Our work addresses this limitation by introducing CppSATD, a dedicated C++ SATD dataset, comprising over 531,000 annotated comments and their source code contexts. Our dataset can serve as a foundation for future studies that aim to develop SATD detection methods in C++, generalize the existing findings to other languages, or contribute novel insights to cross-language SATD research.
Submitted 1 June, 2025; v1 submitted 2 May, 2025;
originally announced May 2025.
-
LO2: Microservice API Anomaly Dataset of Logs and Metrics
Authors:
Alexander Bakhtin,
Jesse Nyyssölä,
Yuqing Wang,
Noman Ahmad,
Ke Ping,
Matteo Esposito,
Mika Mäntylä,
Davide Taibi
Abstract:
Context. Microservice-based systems have gained significant attention over the past years. A critical factor for understanding and analyzing the behavior of these systems is the collection of monitoring data such as logs, metrics, and traces. These data modalities can be used for anomaly detection and root cause analysis of failures. In particular, multi-modal methods utilizing several types of this data at once have gained traction in the research community since these three modalities capture different dimensions of system behavior. Aim. We provide a dataset that supports research on anomaly detection and architectural degradation in microservice systems. We generate a comprehensive dataset of logs, metrics, and traces from a production microservice system to enable the exploration of multi-modal fusion methods that integrate multiple data modalities. Method. We dynamically tested the various APIs of the MS-based system, implementing the OAuth2.0 protocol using the Locust tool. For each execution of the prepared test suite, we collect logs and performance metrics for correct and erroneous calls with data labeled according to the error triggered during the call. Contributions. We collected approximately 657,000 individual log files, totaling over two billion log lines. In addition, we collected more than 45 million individual metric files that contain 485 unique metrics. We provide an initial analysis of logs, identify key metrics through PCA, and discuss challenges in collecting traces for this system. Moreover, we highlight the possibilities for making a more fine-grained version of the data set. This work advances anomaly detection in microservice systems using multiple data sources.
Submitted 16 April, 2025;
originally announced April 2025.
-
Force-Free Molecular Dynamics Through Autoregressive Equivariant Networks
Authors:
Fabian L. Thiemann,
Thiago Reschützegger,
Massimiliano Esposito,
Tseden Taddese,
Juan D. Olarte-Plata,
Fausto Martelli
Abstract:
Molecular dynamics (MD) simulations play a crucial role in scientific research. Yet their computational cost often limits the timescales and system sizes that can be explored. Most data-driven efforts have been focused on reducing the computational cost of accurate interatomic forces required for solving the equations of motion. Despite their success, however, these machine learning interatomic potentials (MLIPs) are still bound to small time-steps. In this work, we introduce TrajCast, a transferable and data-efficient framework based on autoregressive equivariant message passing networks that directly updates atomic positions and velocities lifting the constraints imposed by traditional numerical integration. We benchmark our framework across various systems, including a small molecule, crystalline material, and bulk liquid, demonstrating excellent agreement with reference MD simulations for structural, dynamical, and energetic properties. Depending on the system, TrajCast allows for forecast intervals up to $30\times$ larger than traditional MD time-steps, generating over 15 ns of trajectory data per day for a solid with more than 4,000 atoms. By enabling efficient large-scale simulations over extended timescales, TrajCast can accelerate materials discovery and explore physical phenomena beyond the reach of traditional simulations and experiments. An open-source implementation of TrajCast is accessible under https://github.com/IBM/trajcast.
Submitted 31 March, 2025;
originally announced March 2025.
-
Generative AI for Software Architecture. Applications, Trends, Challenges, and Future Directions
Authors:
Matteo Esposito,
Xiaozhou Li,
Sergio Moreschini,
Noman Ahmad,
Tomas Cerny,
Karthik Vaidhyanathan,
Valentina Lenarduzzi,
Davide Taibi
Abstract:
Context: Generative Artificial Intelligence (GenAI) is transforming much of software development, yet its application in software architecture is still in its infancy, and no prior study has systematically addressed the topic. Aim: We aim to systematically synthesize the use, rationale, contexts, usability, and future challenges of GenAI in software architecture. Method: We performed a multivocal literature review (MLR), analyzing peer-reviewed and gray literature, identifying current practices, models, adoption contexts, and reported challenges, extracting themes via open coding. Results: Our review identified significant adoption of GenAI for architectural decision support and architectural reconstruction. OpenAI GPT models are predominantly applied, and there is consistent use of techniques such as few-shot prompting and retrieval-augmented generation (RAG). GenAI has been applied mostly to initial stages of the Software Development Life Cycle (SDLC), such as Requirements-to-Architecture and Architecture-to-Code. Monolithic and microservice architectures were the dominant targets. However, rigorous testing of GenAI outputs was typically missing from the studies. Among the most frequent challenges are model precision, hallucinations, ethical aspects, privacy issues, lack of architecture-specific datasets, and the absence of sound evaluation frameworks. Conclusions: GenAI shows significant potential in software design, but several challenges remain on its path to greater adoption. Research efforts should target designing general evaluation methodologies, handling ethics and precision, increasing transparency and explainability, and promoting architecture-specific datasets and benchmarks to bridge the gap between theoretical possibilities and practical use.
Submitted 17 March, 2025;
originally announced March 2025.
-
Visual-Haptic Model Mediated Teleoperation for Remote Ultrasound
Authors:
David Black,
Maria Tirindelli,
Septimiu Salcudean,
Wolfgang Wein,
Marco Esposito
Abstract:
Tele-ultrasound has the potential to greatly improve health equity for countless remote communities. However, practical scenarios involve potentially large time delays, which cause current implementations of telerobotic ultrasound (US) to fail. Using a local model of the remote environment to provide haptics to the expert operator can decrease teleoperation instability, but the delayed visual feedback remains problematic. This paper introduces a robotic tele-US system in which the local model is not only haptic, but also visual, by re-slicing and rendering a pre-acquired US sweep in real time to provide the operator with a preview of what the delayed image will resemble. A prototype system is presented and tested with 15 volunteer operators. It is found that visual-haptic model-mediated teleoperation (MMT) compensates completely for time delays up to 1000 ms round trip in terms of operator effort and completion time, while conventional MMT does not. Visual-haptic MMT also significantly outperforms MMT for longer time delays in terms of motion accuracy and force control. This proof-of-concept study suggests that visual-haptic MMT may facilitate remote robotic tele-US.
Submitted 11 February, 2025;
originally announced February 2025.
-
Network Centrality as a New Perspective on Microservice Architecture
Authors:
Alexander Bakhtin,
Matteo Esposito,
Valentina Lenarduzzi,
Davide Taibi
Abstract:
Context: Over the past decade, the adoption of Microservice Architecture (MSA) has led to the identification of various patterns and anti-patterns, such as Nano/Mega/Hub services. Detecting these anti-patterns often involves modeling the system as a Service Dependency Graph (SDG) and applying graph-theoretic approaches. Aim: While previous research has explored software metrics (SMs) such as size, complexity, and quality for assessing MSAs, the potential of graph-specific metrics like network centrality remains largely unexplored. This study investigates whether centrality metrics (CMs) can provide new insights into MSA quality and facilitate the detection of architectural anti-patterns, complementing or extending traditional SMs. Method: We analyzed 24 open-source MSA projects, reconstructing their architectures to study 53 microservices. We measured SMs and CMs for each microservice and tested their correlation to determine the relationship between these metric types. Results and Conclusion: Among 902 computed metric correlations, we found weak to moderate correlation in 282 cases. These findings suggest that centrality metrics offer a novel perspective for understanding MSA properties. Specifically, ratio-based centrality metrics show promise for detecting specific anti-patterns, while subgraph centrality needs further investigation for its applicability in architectural assessments.
Submitted 23 January, 2025;
originally announced January 2025.
-
A Call for Critically Rethinking and Reforming Data Analysis in Empirical Software Engineering
Authors:
Matteo Esposito,
Mikel Robredo,
Murali Sridharan,
Guilherme Horta Travassos,
Rafael Peñaloza,
Valentina Lenarduzzi
Abstract:
Context: Empirical Software Engineering (ESE) drives innovation in SE through qualitative and quantitative studies. However, concerns about the correct application of empirical methodologies have existed since the 2006 Dagstuhl seminar on SE. Objective: To analyze three decades of SE research, identify mistakes in statistical methods, and evaluate experts' ability to detect and address these issues. Methods: We conducted a literature survey of ~27,000 empirical studies, using LLMs to classify statistical methodologies as adequate or inadequate. Additionally, we selected 30 primary studies and held a workshop with 33 ESE experts to assess their ability to identify and resolve statistical issues. Results: Significant statistical issues were found in the primary studies, and experts showed limited ability to detect and correct these methodological problems, raising concerns about the broader ESE community's proficiency in this area. Conclusions: Despite our study's potential limitations, its results shed light on recurring issues that stem from copying and pasting information from past authors' works and from the continuous publication of inadequate approaches, which promote dubious results and jeopardize the spread of correct statistical strategies among researchers. Moreover, it justifies further investigation into empirical rigor in software engineering to expose these recurring issues and establish a framework for reassessing our field's foundation of statistical methodology application. Therefore, this work calls for critically rethinking and reforming data analysis in empirical software engineering, paving the way for our upcoming work.
Submitted 22 January, 2025;
originally announced January 2025.
-
Bringing Order Amidst Chaos: On the Role of Artificial Intelligence in Secure Software Engineering
Authors:
Matteo Esposito
Abstract:
Context. Developing secure and reliable software remains a key challenge in software engineering (SE). The ever-evolving technological landscape offers both opportunities and threats, creating a dynamic space where chaos and order compete. Secure software engineering (SSE) must continuously address vulnerabilities that endanger software systems and carry broader socio-economic risks, such as compromising critical national infrastructure and causing significant financial losses. Researchers and practitioners have explored methodologies like Static Application Security Testing Tools (SASTTs) and artificial intelligence (AI) approaches, including machine learning (ML) and large language models (LLMs), to detect and mitigate these vulnerabilities. Each method has unique strengths and limitations.
Aim. This thesis seeks to bring order to the chaos in SSE by addressing domain-specific differences that impact AI accuracy.
Methodology. The research employs a mix of empirical strategies, such as evaluating effort-aware metrics, analyzing SASTTs, conducting method-level analysis, and leveraging evidence-based techniques like systematic dataset reviews. These approaches help characterize vulnerability prediction datasets.
Results. Key findings include limitations in static analysis tools for identifying vulnerabilities, gaps in SASTT coverage of vulnerability types, weak relationships among vulnerability severity scores, improved defect prediction accuracy using just-in-time modeling, and threats posed by untouched methods.
Conclusions. This thesis highlights the complexity of SSE and the importance of contextual knowledge in improving AI-driven vulnerability and defect prediction. The comprehensive analysis advances effective prediction models, benefiting both researchers and practitioners.
Submitted 9 January, 2025;
originally announced January 2025.
-
On Large Language Models in Mission-Critical IT Governance: Are We Ready Yet?
Authors:
Matteo Esposito,
Francesco Palagiano,
Valentina Lenarduzzi,
Davide Taibi
Abstract:
Context. The security of critical infrastructure has been a pressing concern since the advent of computers and has become even more critical in today's era of cyber warfare. Protecting mission-critical systems (MCSs), essential for national security, requires swift and robust governance, yet recent events reveal the increasing difficulty of meeting these challenges. Aim. Building on prior research showcasing the potential of Generative AI (GAI), such as Large Language Models (LLMs), in enhancing risk analysis, we aim to explore practitioners' views on integrating GAI into the governance of IT MCSs. Our goal is to provide actionable insights and recommendations for stakeholders, including researchers, practitioners, and policymakers. Method. We designed a survey to collect practical experiences, concerns, and expectations of practitioners who develop and implement security solutions in the context of MCSs. Conclusions and Future Works. Our findings highlight that the safe use of LLMs in MCS governance requires interdisciplinary collaboration. Researchers should focus on designing regulation-oriented models and on accountability; practitioners emphasize data protection and transparency, while policymakers must establish a unified AI framework with global benchmarks to ensure ethical and secure LLM-based MCS governance.
Submitted 10 January, 2025; v1 submitted 16 December, 2024;
originally announced December 2024.
-
The Dual-Edged Sword of Technical Debt: Benefits and Issues Analyzed Through Developer Discussions
Authors:
Xiaozhou Li,
Matteo Esposito,
Andrea Janes,
Valentina Lenarduzzi
Abstract:
Background. Technical debt (TD) has long been one of the key factors influencing the maintainability of software products. It represents technical compromises that sacrifice long-term software quality for potential short-term benefits. Objective. This work collectively investigates practitioners' opinions on the various perspectives of TD from a large collection of articles. We identify the topics and the latent details of each, also considering the sentiments of the detected opinions. Method. For this purpose, we conducted a grey literature review on articles systematically collected from three mainstream technology forums. Furthermore, we adopted natural language processing techniques like topic modeling and sentiment analysis to achieve a systematic and comprehensive understanding. In addition, we adopted ChatGPT to support the topic interpretation. Results. In this study, 2,213 forum posts and articles were collected, with eight main topics and 43 sub-topics identified. For each topic, we obtained the practitioners' collective positive and negative opinions. Conclusion. We identified eight major topics in TD related to software development. Challenges identified by practitioners include unclear roles and a lack of engagement. On the other hand, active management supports collaboration and mitigates the impact of TD on the source code.
Submitted 30 July, 2024;
originally announced July 2024.
-
An evidence-based methodology for human rights impact assessment (HRIA) in the development of AI data-intensive systems
Authors:
Alessandro Mantelero,
Maria Samantha Esposito
Abstract:
Different approaches have been adopted in addressing the challenges of Artificial Intelligence (AI), some centred on personal data and others on ethics, respectively narrowing and broadening the scope of AI regulation. This contribution aims to demonstrate that a third way is possible, starting from the acknowledgement of the role that human rights can play in regulating the impact of data-intensive systems. The focus on human rights is neither a paradigm shift nor a mere theoretical exercise. Through the analysis of more than 700 decisions and documents of the data protection authorities of six countries, we show that human rights already underpin the decisions in the field of data use. Based on empirical analysis of this evidence, this work presents a methodology and a model for a Human Rights Impact Assessment (HRIA). The methodology and related assessment model are focused on AI applications, whose nature and scale require a proper contextualisation of HRIA methodology. Moreover, the proposed models provide a more measurable approach to risk assessment which is consistent with the regulatory proposals centred on risk thresholds. The proposed methodology is tested in concrete case-studies to prove its feasibility and effectiveness. The overall goal is to respond to the growing interest in HRIA, moving from a mere theoretical debate to a concrete and context-specific implementation in the field of data-intensive applications based on AI.
Submitted 30 July, 2024;
originally announced July 2024.
-
In Search of Metrics to Guide Developer-Based Refactoring Recommendations
Authors:
Mikel Robredo,
Matteo Esposito,
Fabio Palomba,
Rafael Peñaloza,
Valentina Lenarduzzi
Abstract:
Context. Source code refactoring is a well-established approach to improving source code quality without compromising its external behavior. Motivation. The literature described the benefits of refactoring, yet its application in practice is threatened by the high cost of time, resource allocation, and effort required to perform it continuously. Providing refactoring recommendations closer to what developers perceive as relevant may support the broader application of refactoring in practice and drive prioritization efforts. Aim. In this paper, we aim to foster the design of a developer-based refactoring recommender, proposing an empirical study into the metrics that study the developer's willingness to apply refactoring operations. We build upon previous work describing the developer's motivations for refactoring and investigate how product and process metrics may grasp those motivations. Expected Results. We will quantify the value of product and process metrics in grasping developers' motivations to perform refactoring, thus providing a catalog of metrics for developer-based refactoring recommenders to use.
Submitted 25 July, 2024;
originally announced July 2024.
-
Generative AI in Evidence-Based Software Engineering: A White Paper
Authors:
Matteo Esposito,
Andrea Janes,
Davide Taibi,
Valentina Lenarduzzi
Abstract:
Context. In less than a year, practitioners and researchers witnessed a rapid and wide implementation of Generative Artificial Intelligence (GAI). The daily availability of new models proposed by practitioners and researchers has enabled quick adoption. Textual GAIs' capabilities enable researchers worldwide to explore new generative scenarios, simplifying and hastening all time-consuming text generation and analysis tasks.
Motivation. The exponentially growing number of publications in our field, together with the increased accessibility of information due to digital libraries, makes conducting systematic literature reviews and mapping studies an effort- and time-intensive task. Stemming from this challenge, we investigated and envisioned the role of GAIs in evidence-based software engineering (EBSE).
Future Directions. Based on our current investigation, we will follow up the vision with the creation and empirical validation of a comprehensive suite of models to effectively support EBSE researchers.
Submitted 22 August, 2024; v1 submitted 24 July, 2024;
originally announced July 2024.
-
Audio-Language Datasets of Scenes and Events: A Survey
Authors:
Gijs Wijngaard,
Elia Formisano,
Michele Esposito,
Michel Dumontier
Abstract:
Audio-language models (ALMs) generate linguistic descriptions of sound-producing events and scenes. Advances in dataset creation and computational power have led to significant progress in this domain. This paper surveys 69 datasets used to train ALMs, covering research up to September 2024 (https://github.com/GLJS/audio-datasets). It provides a comprehensive analysis of dataset origins, audio and linguistic characteristics, and use cases. Key sources include YouTube-based datasets like AudioSet with over two million samples, and community platforms like Freesound with over one million samples. Through principal component analysis of audio and text embeddings, the survey evaluates the acoustic and linguistic variability across datasets. It also analyzes data leakage through CLAP embeddings, and examines sound category distributions to identify imbalances. Finally, the survey identifies key challenges in developing large, diverse datasets to enhance ALM performance, including dataset overlap, biases, accessibility barriers, and the predominance of English-language content, while highlighting opportunities for improvement.
Submitted 8 January, 2025; v1 submitted 9 July, 2024;
originally announced July 2024.
-
6GSoft: Software for Edge-to-Cloud Continuum
Authors:
Muhammad Azeem Akbar,
Matteo Esposito,
Sami Hyrynsalmi,
Karthikeyan Dinesh Kumar,
Valentina Lenarduzzi,
Xiaozhou Li,
Ali Mehraj,
Tommi Mikkonen,
Sergio Moreschini,
Niko Mäkitalo,
Markku Oivo,
Anna-Sofia Paavonen,
Risha Parveen,
Kari Smolander,
Ruoyu Su,
Kari Systä,
Davide Taibi,
Nan Yang,
Zheying Zhang,
Muhammad Zohaib
Abstract:
In the era of 6G, developing and managing software requires cutting-edge software engineering (SE) theories and practices tailored for such complexity across a vast number of connected edge devices. Our project aims to lead the development of sustainable methods and energy-efficient orchestration models specifically for edge environments, enhancing architectural support driven by AI for contemporary edge-to-cloud continuum computing. This initiative seeks to position Finland at the forefront of the 6G landscape, focusing on sophisticated edge orchestration and robust software architectures to optimize the performance and scalability of edge networks. Collaborating with leading Finnish universities and companies, the project emphasizes deep industry-academia collaboration and international expertise to address critical challenges in edge orchestration and software architecture, aiming to drive significant advancements in software productivity and market impact.
Submitted 9 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
On the correlation between Architectural Smells and Static Analysis Warnings
Authors:
Matteo Esposito,
Mikel Robredo,
Francesca Arcelli Fontana,
Valentina Lenarduzzi
Abstract:
Background. Software quality assurance is essential during software development and maintenance. Static Analysis Tools (SAT) are widely used for assessing code quality. Architectural smells are becoming more daunting to address and evaluate among quality issues.
Objective. We aim to understand the relationships between static analysis warnings (SAW) and architectural smells (AS) to guide developers/maintainers in focusing their efforts on SAWs more prone to co-occurring with AS.
Method. We performed an empirical study on 103 Java projects totaling 72 million LOC belonging to projects from a vast set of domains, and 785 SAW detected by four SAT, Checkstyle, Findbugs, PMD, SonarQube, and 4 architectural smells detected by ARCAN tool. We analyzed how SAWs influence AS presence. Finally, we proposed an AS remediation effort prioritization based on SAW severity and SAW proneness to specific ASs.
Results. Our study reveals a moderate correlation between SAWs and ASs. Different combinations of SATs and SAWs significantly affect AS occurrence, with certain SAWs more likely to co-occur with specific ASs. Conversely, 33.79% of SAWs act as "healthy carriers", not associated with any ASs.
Conclusion. Practitioners can ignore about a third of SAWs and focus on those most likely to be associated with ASs. Prioritizing AS remediation based on SAW severity or SAW proneness to specific ASs results in effective rankings like those based on AS severity.
Submitted 25 June, 2024;
originally announced June 2024.
-
Beyond Words: On Large Language Models Actionability in Mission-Critical Risk Analysis
Authors:
Matteo Esposito,
Francesco Palagiano,
Valentina Lenarduzzi,
Davide Taibi
Abstract:
Context. Risk analysis assesses potential risks in specific scenarios. Risk analysis principles are context-less; the same methodology can be applied to a risk connected to health and information technology security. Risk analysis requires a vast knowledge of national and international regulations and standards and is time and effort-intensive. A large language model can quickly summarize information in less time than a human and can be fine-tuned to specific tasks.
Aim. Our empirical study aims to investigate the effectiveness of Retrieval-Augmented Generation and fine-tuned LLM in risk analysis. To our knowledge, no prior study has explored its capabilities in risk analysis.
Method. We manually curated 193 unique scenarios leading to 1283 representative samples from over 50 mission-critical analyses archived by the industrial context team in the last five years. We compared the base GPT-3.5 and GPT-4 models versus their Retrieval-Augmented Generation and fine-tuned counterparts. We employ two human experts as competitors of the models and three other human experts to review the models and the former human experts' analysis. The reviewers analyzed 5,000 scenario analyses.
Results and Conclusions. Human experts demonstrated higher accuracy, but LLMs are quicker and more actionable. Moreover, our findings show that RAG-assisted LLMs have the lowest hallucination rates, effectively uncovering hidden risks and complementing human expertise. Thus, the choice of model depends on specific needs, with FTMs for accuracy, RAG for hidden risks discovery, and base models for comprehensiveness and actionability. Therefore, experts can leverage LLMs as an effective complementing companion in risk analysis within a condensed timeframe. They can also save costs by averting unnecessary expenses associated with implementing unwarranted countermeasures.
Submitted 6 September, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
$Classi|Q\rangle$ Towards a Translation Framework To Bridge The Classical-Quantum Programming Gap
Authors:
Matteo Esposito,
Maryam Tavassoli Sabzevari,
Boshuai Ye,
Davide Falessi,
Arif Ali Khan,
Davide Taibi
Abstract:
Quantum computing, albeit readily available as hardware or emulated on the cloud, is still far from being available in general regarding complex programming paradigms and learning curves. This vision paper introduces $Classi|Q\rangle$, a translation framework idea to bridge Classical and Quantum Computing by translating high-level programming languages, e.g., Python or C++, into a low-level language, e.g., Quantum Assembly. Our idea paper serves as a blueprint for ongoing efforts in quantum software engineering, offering a roadmap for further $Classi|Q\rangle$ development to meet the diverse needs of researchers and practitioners. $Classi|Q\rangle$ is designed to empower researchers and practitioners with no prior quantum experience to harness the potential of hybrid quantum computation. We also discuss future enhancements to $Classi|Q\rangle$, including support for additional quantum languages, improved optimization strategies, and integration with emerging quantum computing platforms.
Submitted 1 July, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
AI techniques for near real-time monitoring of contaminants in coastal waters on board future Phisat-2 mission
Authors:
Francesca Razzano,
Pietro Di Stasio,
Francesco Mauro,
Gabriele Meoni,
Marco Esposito,
Gilda Schirinzi,
Silvia L. Ullo
Abstract:
Unlike conventional procedures, the proposed solution advocates for a groundbreaking paradigm in water quality monitoring through the integration of satellite Remote Sensing (RS) data, Artificial Intelligence (AI) techniques, and onboard processing. The objective is to offer nearly real-time detection of contaminants in coastal waters, addressing a significant gap in the existing literature. Moreover, the expected outcomes include substantial advancements in environmental monitoring, public health protection, and resource conservation. The specific focus of our study is on the estimation of Turbidity and pH parameters, for their implications on human and aquatic health. Nevertheless, the designed framework can be extended to include other parameters of interest in the water environment and beyond. Originating from our participation in the European Space Agency (ESA) OrbitalAI Challenge, this article describes the distinctive opportunities and issues for contaminant monitoring on the Phisat-2 mission. The specific characteristics of this mission, with the tools made available, will be presented, along with the methodology proposed by the authors for the onboard monitoring of water contaminants in near real-time. Promising preliminary results are discussed, and in-progress and future work is introduced.
Submitted 30 April, 2024;
originally announced April 2024.
-
Leveraging Large Language Models for Preliminary Security Risk Analysis: A Mission-Critical Case Study
Authors:
Matteo Esposito,
Francesco Palagiano
Abstract:
Preliminary security risk analysis (PSRA) provides a quick approach to identify, evaluate, and propose remediation for potential risks in specific scenarios. The extensive expertise required for an effective PSRA and the substantial amount of text-related tasks hinder quick assessments in mission-critical contexts, where timely and prompt actions are essential. The speed and accuracy of human experts in PSRA significantly impact response time. A large language model can quickly summarise information in less time than a human. To our knowledge, no prior study has explored the capabilities of fine-tuned models (FTM) in PSRA. Our case study investigates the proficiency of FTM to assist practitioners in PSRA. We manually curated 141 representative samples from over 50 mission-critical analyses archived by the industrial context team in the last five years. We compared the proficiency of the FTM versus seven human experts. Within the industrial context, our approach has proven successful in reducing errors in PSRA, hastening security risk detection, and minimizing false positives and negatives. This translates to cost savings for the company by averting unnecessary expenses associated with implementing unwarranted countermeasures. Therefore, experts can focus on more comprehensive risk analysis, leveraging LLMs for an effective preliminary assessment within a condensed timeframe.
Submitted 23 March, 2024;
originally announced March 2024.
-
An Extensive Comparison of Static Application Security Testing Tools
Authors:
Matteo Esposito,
Valentina Falaschi,
Davide Falessi
Abstract:
Context: Static Application Security Testing Tools (SASTTs) identify software vulnerabilities to support the security and reliability of software applications. Interestingly, several studies have suggested that alternative solutions may be more effective than SASTTs due to their tendency to generate false alarms, commonly referred to as low Precision. Aim: We aim to comprehensively evaluate SASTTs, setting a reliable benchmark for assessing and finding gaps in vulnerability identification mechanisms based on SASTTs or alternatives. Method: Our SASTTs evaluation is based on a controlled, though synthetic, Java codebase. It involves an assessment of 1.5 million test executions, and it features innovative methodological features such as effort-aware accuracy metrics and method-level analysis. Results: Our findings reveal that SASTTs detect a tiny range of vulnerabilities. In contrast to prevailing wisdom, SASTTs exhibit high Precision while falling short in Recall. Conclusions: The paper suggests that enhancing Recall, alongside expanding the spectrum of detected vulnerability types, should be the primary focus for improving SASTTs or alternative approaches, such as machine learning-based vulnerability identification solutions.
Submitted 14 March, 2024;
originally announced March 2024.
-
QCSHQD: Quantum computing as a service for Hybrid classical-quantum software development: A Vision
Authors:
Maryam Tavassoli Sabzevari,
Matteo Esposito,
Arif Ali Khan,
Davide Taibi
Abstract:
Quantum Computing (QC) is transitioning from theoretical frameworks to an indispensable powerhouse of computational capability, resulting in extensive adoption across both industrial and academic domains. QC presents exceptional advantages, including unparalleled processing speed and the potential to solve complex problems beyond the capabilities of classical computers. Nevertheless, academic researchers and industry practitioners encounter various challenges in harnessing the benefits of this technology. The limited accessibility of QC resources for classical developers, and a general lack of domain knowledge and expertise, represent an insurmountable barrier. To address these challenges, we introduce a framework, Quantum Computing as a Service for Hybrid Classical-Quantum Software Development (QCSHQD), which leverages service-oriented strategies. Our framework comprises three principal components: an Integrated Development Environment (IDE) for user interaction, an abstraction layer dedicated to orchestrating quantum services, and a service provider responsible for executing services on a quantum computer. This study presents a blueprint for QCSHQD, designed to democratize access to QC resources for classical developers who want to seamlessly harness QC power. The vision of QCSHQD paves the way for groundbreaking innovations by addressing key challenges of hybridization between classical and quantum computers.
Submitted 12 April, 2024; v1 submitted 13 March, 2024;
originally announced March 2024.
-
Multi-graph Graph Matching for Coronary Artery Semantic Labeling
Authors:
Chen Zhao,
Zhihui Xu,
Pukar Baral,
Michel Esposito,
Weihua Zhou
Abstract:
Coronary artery disease (CAD) stands as the leading cause of death worldwide, and invasive coronary angiography (ICA) remains the gold standard for assessing vascular anatomical information. However, deep learning-based methods encounter challenges in generating semantic labels for arterial segments, primarily due to the morphological similarity between arterial branches and varying anatomy of arterial system between different projection view angles and patients. To address this challenge, we model the vascular tree as a graph and propose a multi-graph graph matching (MGM) algorithm for coronary artery semantic labeling. The MGM algorithm assesses the similarity between arterials in multiple vascular tree graphs, considering the cycle consistency between each pair of graphs. As a result, the unannotated arterial segments are appropriately labeled by matching them with annotated segments. Through the incorporation of anatomical graph structure, radiomics features, and semantic mapping, the proposed MGM model achieves an impressive accuracy of 0.9471 for coronary artery semantic labeling using our multi-site dataset with 718 ICAs. With the semantic labeled arteries, an overall accuracy of 0.9155 was achieved for stenosis detection. The proposed MGM presents a novel tool for coronary artery analysis using multiple ICA-derived graphs, offering valuable insights into vascular health and pathology.
Submitted 14 August, 2024; v1 submitted 24 February, 2024;
originally announced February 2024.
-
Quantum Transfer Learning for Acceptability Judgements
Authors:
Giuseppe Buonaiuto,
Raffaele Guarasci,
Aniello Minutolo,
Giuseppe De Pietro,
Massimo Esposito
Abstract:
Hybrid quantum-classical classifiers promise to positively impact critical aspects of natural language processing tasks, particularly classification-related ones. Among the possibilities currently investigated, quantum transfer learning, i.e., using a quantum circuit for fine-tuning pre-trained classical models for a specific task, is attracting significant attention as a potential platform for proving quantum advantage.
This work shows potential advantages, both in terms of performance and expressiveness, of quantum transfer learning algorithms trained on embedding vectors extracted from a large language model to perform classification on a classical Linguistics task: acceptability judgments. Acceptability judgment is the ability to determine whether a sentence is considered natural and well-formed by a native speaker. The approach has been tested on sentences extracted from ItaCoLa, a corpus that collects Italian sentences labeled with their acceptability judgment. The evaluation phase shows results for the quantum transfer learning pipeline comparable to state-of-the-art classical transfer learning algorithms, proving current quantum computers' capabilities to tackle NLP tasks for ready-to-use applications. Furthermore, a qualitative linguistic analysis, aided by explainable AI methods, reveals the capabilities of quantum transfer learning algorithms to correctly classify complex and more structured sentences, compared to their classical counterpart. This finding sets the ground for a quantifiable quantum advantage in NLP in the near future.
Submitted 15 January, 2024;
originally announced January 2024.
-
Monitoring water contaminants in coastal areas through ML algorithms leveraging atmospherically corrected Sentinel-2 data
Authors:
Francesca Razzano,
Francesco Mauro,
Pietro Di Stasio,
Gabriele Meoni,
Marco Esposito,
Gilda Schirinzi,
Silvia Liberata Ullo
Abstract:
Monitoring water contaminants is of paramount importance, ensuring public health and environmental well-being. Turbidity, a key parameter, poses a significant problem, affecting water quality. Its accurate assessment is crucial for safeguarding ecosystems and human consumption, demanding meticulous attention and action. For this, our study pioneers a novel approach to monitor the Turbidity contaminant, integrating CatBoost Machine Learning (ML) with high-resolution data from Sentinel-2 Level-2A. Traditional methods are labor-intensive while CatBoost offers an efficient solution, excelling in predictive accuracy. Leveraging atmospherically corrected Sentinel-2 data through the Google Earth Engine (GEE), our study contributes to scalable and precise Turbidity monitoring. A specific tabular dataset derived from Hong Kong contaminants monitoring stations enriches our study, providing region-specific insights. Results showcase the viability of this integrated approach, laying the foundation for adopting advanced techniques in global water quality management.
Submitted 8 January, 2024;
originally announced January 2024.
-
DXP: Billing Data Preparation for Big Data Analytics
Authors:
Luca Gagliardelli,
Domenico Beneventano,
Marco Esposito,
Luca Zecchini,
Giovanni Simonini,
Sonia Bergamaschi,
Fabio Miselli,
Giuseppe Miano
Abstract:
In this paper, we present the data preparation activities that we performed for the Digital Experience Platform (DXP) project, commissioned and supervised by Doxee S.p.A. DXP manages the billing data of the users of different companies operating in various sectors (electricity and gas, telephony, pay TV, etc.). This data has to be processed to provide services to the users (e.g., interactive billing), but mainly to provide analytics to the companies (e.g., churn prediction or user segmentation). We focus on the design of the data preparation pipeline, describing the challenges that we had to overcome in order to get the billing data ready for analysis. We illustrate the lessons learned by highlighting the key points that could be transferred to similar projects. Moreover, we report some interesting results and considerations derived from the preliminary analysis of the prepared data, also pointing out some possible future directions for the ongoing project, ranging from big data integration to privacy-preserving temporal record linkage.
Submitted 20 December, 2023;
originally announced December 2023.
-
Optimizing Fault-Tolerant Quality-Guaranteed Sensor Deployments for UAV Localization in Critical Areas via Computational Geometry
Authors:
Marco Esposito,
Toni Mancini,
Enrico Tronci
Abstract:
The increasing spreading of small commercial Unmanned Aerial Vehicles (UAVs, aka drones) presents serious threats for critical areas such as airports, power plants, governmental and military facilities. In fact, such UAVs can easily disturb or jam radio communications, collide with other flying objects, perform espionage activity, and carry offensive payloads, e.g., weapons or explosives. A central problem when designing surveillance solutions for the localization of unauthorized UAVs in critical areas is to decide how many triangulating sensors to use, and where to deploy them to optimise both coverage and cost effectiveness.
In this article, we compute deployments of triangulating sensors for UAV localization, optimizing a given blend of metrics, namely: coverage under multiple sensing quality levels, cost-effectiveness, fault-tolerance. We focus on large, complex 3D regions, which exhibit obstacles (e.g., buildings), varying terrain elevation, different coverage priorities, constraints on possible sensors placement. Our novel approach relies on computational geometry and statistical model checking, and enables the effective use of off-the-shelf AI-based black-box optimizers. Moreover, our method allows us to compute a closed-form, analytical representation of the region uncovered by a sensor deployment, which provides the means for rigorous, formal certification of the quality of the latter.
We show the practical feasibility of our approach by computing optimal sensor deployments for UAV localization in two large, complex 3D critical regions, the Rome Leonardo Da Vinci International Airport (FCO) and the Vienna International Center (VIC), using NOMAD as our state-of-the-art underlying optimization engine. Results show that we can compute optimal sensor deployments within a few hours on a standard workstation and within minutes on a small parallel infrastructure.
Submitted 5 December, 2023;
originally announced December 2023.
-
Implicit Neural Representations for Breathing-compensated Volume Reconstruction in Robotic Ultrasound
Authors:
Yordanka Velikova,
Mohammad Farid Azampour,
Walter Simson,
Marco Esposito,
Nassir Navab
Abstract:
Ultrasound (US) imaging is widely used in diagnosing and staging abdominal diseases due to its lack of ionizing radiation and prevalent availability. However, significant inter-operator variability and inconsistent image acquisition hinder the widespread adoption of extensive screening programs. Robotic ultrasound systems have emerged as a promising solution, offering standardized acquisition protocols and the possibility of automated acquisition. Additionally, these systems enable access to 3D data via robotic tracking, enhancing volumetric reconstruction for improved ultrasound interpretation and precise disease diagnosis. However, the interpretability of 3D US reconstruction of abdominal images can be affected by the patient's breathing motion. This study introduces a method to compensate for breathing motion in 3D US compounding by leveraging implicit neural representations. Our approach employs a robotic ultrasound system for automated screenings. To demonstrate the method's effectiveness, we evaluate it for the diagnosis and monitoring of abdominal aorta aneurysms as a representative use case. Our experiments demonstrate that our proposed pipeline facilitates robust automated robotic acquisition, mitigating artifacts from breathing motion, and yields smoother 3D reconstructions for enhanced screening and medical diagnosis.
Submitted 3 April, 2024; v1 submitted 8 November, 2023;
originally announced November 2023.
-
Hyper Association Graph Matching with Uncertainty Quantification for Coronary Artery Semantic Labeling
Authors:
Chen Zhao,
Michele Esposito,
Zhihui Xu,
Weihua Zhou
Abstract:
Coronary artery disease (CAD) is one of the leading causes of death worldwide. Accurate extraction of individual arterial branches on invasive coronary angiograms (ICA) is important for stenosis detection and CAD diagnosis. However, deep learning-based models face challenges in generating semantic segmentation for coronary arteries due to the morphological similarity among different types of coronary arteries. To address this challenge, we propose an innovative approach using the hyper association graph-matching neural network with uncertainty quantification (HAGMN-UQ) for coronary artery semantic labeling on ICAs. The graph-matching procedure maps the arterial branches between two individual graphs, so that the unlabeled arterial segments are classified according to the labeled segments, thereby achieving coronary artery semantic labeling. By incorporating the anatomical structural loss and uncertainty, our model achieved an accuracy of 0.9345 for coronary artery semantic labeling with a fast inference speed, leading to an effective and efficient prediction in real-time clinical decision-making scenarios.
Submitted 20 August, 2023;
originally announced August 2023.
-
Early Career Developers' Perceptions of Code Understandability. A Study of Complexity Metrics
Authors:
Matteo Esposito,
Andrea Janes,
Terhi Kilamo,
Valentina Lenarduzzi
Abstract:
Context. Code understandability is fundamental. Developers need to clearly understand the code they are modifying. Low understandability can increase the amount of coding effort, and misinterpreting code impacts the entire development process. Ideally, developers should write clear and understandable code with the least effort. Aim. Our work investigates whether the McCabe Cyclomatic Complexity (CyC) or the Cognitive Complexity (CoC) can be a good predictor of developers' perceived code understandability, in order to determine which of the two can be used as a criterion for evaluating whether a piece of code is understandable. Method. We designed and conducted an empirical study among 216 early-career developers with professional experience ranging from one to four years. We asked them to manually inspect and rate the understandability of 12 Java classes that exhibit different levels of Cyclomatic and Cognitive Complexity. Results. Our findings showed that while the old-fashioned McCabe Cyclomatic Complexity and the most recent Cognitive Complexity are modest predictors for code understandability when considering the complexity perceived by early-career developers, they are not for problem severity. Conclusions. Based on our results, early-career developers should not be left alone when performing code-reviewing tasks due to their scarce experience. Moreover, low complexity measures indicate good understandability, but having either CoC or CyC high makes understandability unpredictable. Nevertheless, there is no evidence that CyC or CoC are indicators of early-career perceived severity. Future research efforts will focus on expanding the population to experienced developers to examine whether seniority influences the predictive power of the chosen metrics.
Submitted 15 July, 2024; v1 submitted 14 March, 2023;
originally announced March 2023.
-
AGMN: Association Graph-based Graph Matching Network for Coronary Artery Semantic Labeling on Invasive Coronary Angiograms
Authors:
Chen Zhao,
Zhihui Xu,
Jingfeng Jiang,
Michele Esposito,
Drew Pienta,
Guang-Uei Hung,
Weihua Zhou
Abstract:
Semantic labeling of coronary arterial segments in invasive coronary angiography (ICA) is important for automated assessment and report generation of coronary artery stenosis in the computer-aided diagnosis of coronary artery disease (CAD). Inspired by the training procedure of interventional cardiologists for interpreting the structure of coronary arteries, we propose an association graph-based graph matching network (AGMN) for coronary arterial semantic labeling. We first extract the vascular tree from invasive coronary angiography (ICA) and convert it into multiple individual graphs. Then, an association graph is constructed from two individual graphs where each vertex represents the relationship between two arterial segments. Using the association graph, the AGMN extracts the vertex features by the embedding module, aggregates the features from adjacent vertices and edges by graph convolution network, and decodes the features to generate the semantic mappings between arteries. By learning the mapping of arterial branches between two individual graphs, the unlabeled arterial segments are classified by the labeled segments to achieve semantic labeling. A dataset containing 263 ICAs was employed to train and validate the proposed model, and a five-fold cross-validation scheme was performed. Our AGMN model achieved an average accuracy of 0.8264, an average precision of 0.8276, an average recall of 0.8264, and an average F1-score of 0.8262, which significantly outperformed existing coronary artery semantic labeling methods. In conclusion, we have developed and validated a new algorithm with high accuracy, interpretability, and robustness for coronary artery semantic labeling on ICAs.
Submitted 11 January, 2023;
originally announced January 2023.
-
Seniors' acceptance of virtual humanoid agents
Authors:
Anna Esposito,
Terry Amorese,
Marialucia Cuciniello,
Antonietta M. Esposito,
Alda Troncone,
Maria Ines Torres,
Stephan Schlögl,
Gennaro Cordasco
Abstract:
This paper reports on a study conducted as part of the EU EMPATHIC project, whose goal is to develop an empathic virtual coach capable of enhancing seniors' well-being, focusing on user requirements and expectations with respect to participants' age and technology experiences (i.e. participants' familiarity with technological devices such as smartphones, laptops, and tablets). The data shows that seniors' favorite technological device is the smartphone, and this device was also the one that scored the highest in terms of ease of use. We found statistically significant differences in the preferences expressed by seniors toward the gender of the agents. Seniors (independently of their gender) prefer to interact with female humanoid agents on both the pragmatic and hedonic dimensions of an interactive system and are more willing to commit themselves to a long-lasting interaction with them. In addition, we found statistically significant effects of the seniors' technology savviness on the hedonic qualities of the proposed interactive systems. Seniors with technological experience felt less motivated and judged the proposed agents less captivating, exciting, and appealing.
Submitted 2 May, 2021;
originally announced May 2021.
-
Convolutional Normalization
Authors:
Massimiliano Esposito,
Nader Ganaba
Abstract:
As deep neural networks are applied to complex tasks, the size of the networks and their architectures increases and their topology becomes more complicated. At the same time, training becomes slow and in some instances inefficient. This motivated the introduction of various normalization techniques such as Batch Normalization and Layer Normalization. These normalization methods use arithmetic operations to compute approximate statistics (mainly the first and second moments) of the layer's data and use them to normalize it. They use a plain Monte Carlo method to approximate the statistics, and such a method fails when approximating statistics whose distribution is complex. Here, we propose an approach that uses a weighted sum, implemented using depth-wise convolutional neural networks, not only to approximate the statistics but also to learn the coefficients of the sum.
Submitted 18 February, 2021;
originally announced February 2021.
-
Autonomous Robotic Screening of Tubular Structures based only on Real-Time Ultrasound Imaging Feedback
Authors:
Zhongliang Jiang,
Zhenyu Li,
Matthias Grimm,
Mingchuan Zhou,
Marco Esposito,
Wolfgang Wein,
Walter Stechele,
Thomas Wendler,
Nassir Navab
Abstract:
Ultrasound (US) imaging is widely employed for diagnosis and staging of peripheral vascular diseases (PVD), mainly due to its high availability and the fact that it does not emit radiation. However, high inter-operator variability and a lack of repeatability of US image acquisition hinder the implementation of extensive screening programs. To address this challenge, we propose an end-to-end workflow for automatic robotic US screening of tubular structures using only the real-time US imaging feedback. We first train a U-Net for real-time segmentation of the vascular structure from cross-sectional US images. Then, we represent the detected vascular structure as a 3D point cloud and use it to estimate the longitudinal axis of the target tubular structure and its mean radius by solving a constrained non-linear optimization problem. Iterating the previous processes, the US probe is automatically aligned to the orientation normal to the target tubular tissue and adjusted online to center the tracked tissue based on the spatial calibration. The real-time segmentation result is evaluated both on a phantom and in-vivo on brachial arteries of volunteers. In addition, the whole process is validated both in simulation and on physical phantoms. The mean absolute radius error and orientation error ($\pm$ SD) in the simulation are $1.16\pm0.1~mm$ and $2.7\pm3.3^{\circ}$, respectively. On a gel phantom, these errors are $1.95\pm2.02~mm$ and $3.3\pm2.4^{\circ}$. This shows that the method is able to automatically screen tubular tissues with an optimal probe orientation (i.e. normal to the vessel) and at the same time to accurately estimate the mean radius, both in real-time.
Submitted 30 June, 2021; v1 submitted 30 October, 2020;
originally announced November 2020.
-
Thermodynamic Computing
Authors:
Tom Conte,
Erik DeBenedictis,
Natesh Ganesh,
Todd Hylton,
John Paul Strachan,
R. Stanley Williams,
Alexander Alemi,
Lee Altenberg,
Gavin Crooks,
James Crutchfield,
Lidia del Rio,
Josh Deutsch,
Michael DeWeese,
Khari Douglas,
Massimiliano Esposito,
Michael Frank,
Robert Fry,
Peter Harsha,
Mark Hill,
Christopher Kello,
Jeff Krichmar,
Suhas Kumar,
Shih-Chii Liu,
Seth Lloyd,
Matteo Marsili,
et al. (14 additional authors not shown)
Abstract:
The hardware and software foundations laid in the first half of the 20th Century enabled the computing technologies that have transformed the world, but these foundations are now under siege. The current computing paradigm, which is the foundation of much of the current standards of living that we now enjoy, faces fundamental limitations that are evident from several perspectives. In terms of hardware, devices have become so small that we are struggling to eliminate the effects of thermodynamic fluctuations, which are unavoidable at the nanometer scale. In terms of software, our ability to imagine and program effective computational abstractions and implementations is clearly challenged in complex domains. In terms of systems, currently five percent of the power generated in the US is used to run computing systems - this astonishing figure is neither ecologically sustainable nor economically scalable. Economically, the cost of building next-generation semiconductor fabrication plants has soared past $10 billion. All of these difficulties - device scaling, software complexity, adaptability, energy consumption, and fabrication economics - indicate that the current computing paradigm has matured and that continued improvements along this path will be limited. If technological progress is to continue and corresponding social and economic benefits are to continue to accrue, computing must become much more capable, energy efficient, and affordable. We propose that progress in computing can continue under a united, physically grounded, computational paradigm centered on thermodynamics. Herein we propose a research agenda to extend these thermodynamic foundations into complex, non-equilibrium, self-organizing systems and apply them holistically to future computing systems that will harness nature's innate computational capacity. We call this type of computing "Thermodynamic Computing" or TC.
Submitted 14 November, 2019; v1 submitted 5 November, 2019;
originally announced November 2019.
-
CFCM: Segmentation via Coarse to Fine Context Memory
Authors:
Fausto Milletari,
Nicola Rieke,
Maximilian Baust,
Marco Esposito,
Nassir Navab
Abstract:
Recent neural-network-based architectures for image segmentation make extensive use of feature forwarding mechanisms to integrate information from multiple scales. Although these yield good results, even deeper architectures and alternative methods for feature fusion at different resolutions have been scarcely investigated for medical applications. In this work we propose to implement segmentation via an encoder-decoder architecture which differs from any other previously published method since (i) it employs a very deep architecture based on residual learning and (ii) it combines features via a convolutional Long Short Term Memory (LSTM), instead of concatenation or summation. The intuition is that the memory mechanism implemented by LSTMs can better integrate features from different scales through a coarse-to-fine strategy; hence the name Coarse-to-Fine Context Memory (CFCM). We demonstrate the remarkable advantages of this approach on two datasets: the Montgomery county lung segmentation dataset, and the EndoVis 2015 challenge dataset for surgical instrument segmentation.
Submitted 4 June, 2018;
originally announced June 2018.
-
Analysis of Motion Planning by Sampling in Subspaces of Progressively Increasing Dimension
Authors:
Marios P. Xanthidis,
Joel M. Esposito,
Ioannis Rekleitis,
Jason M. O'Kane
Abstract:
Despite the performance advantages of modern sampling-based motion planners, solving high dimensional planning problems in near real-time remains a challenge. Applications include hyper-redundant manipulators, snake-like and humanoid robots. Based on the intuition that many of these problem instances do not require the robots to exercise every degree of freedom independently, we introduce an enhancement to popular sampling-based planning algorithms aimed at circumventing the exponential dependence on dimensionality. We propose beginning the search in a lower dimensional subspace of the configuration space in the hopes that a simple solution will be found quickly. After a certain number of samples are generated, if no solution is found, we increase the dimension of the search subspace by one and continue sampling in the higher dimensional subspace. In the worst case, the search subspace expands to include the full configuration space - making the completeness properties identical to the underlying sampling-based planner. Our experiments comparing the enhanced and traditional versions of RRT, RRT-Connect, and Bidirectional T-RRT on both a planar hyper-redundant manipulator and the Baxter humanoid robot indicate that a solution is typically found much faster using this approach and the run time appears to be less sensitive to the dimension of the full configuration space. We explore important implementation issues in the sampling process and discuss its limitations.
Submitted 30 January, 2018;
originally announced February 2018.