-
Chaos Engineering in the Wild: Findings from GitHub
Authors:
Joshua Owotogbe,
Indika Kumara,
Dario Di Nucci,
Damian Andrew Tamburri,
Willem-Jan van den Heuvel
Abstract:
Chaos engineering aims to improve the resilience of software systems by intentionally injecting faults to identify and address system weaknesses that cause outages in production environments. Although many tools for chaos engineering exist, their practical adoption has not yet been explored. This study examines 971 GitHub repositories that incorporate 10 popular chaos engineering tools to identify patterns and trends in their use. The analysis reveals that Toxiproxy and Chaos Mesh are the most frequently used, showing consistent growth since 2016 and reflecting increasing adoption in cloud-native development. The release of new chaos engineering tools peaked in 2018, followed by a shift toward refinement and integration, with Chaos Mesh and LitmusChaos leading in ongoing development activity. Software development is the most frequent application (58.0%), followed by unclassified purposes (16.2%), teaching (10.3%), learning (9.9%), and research (5.7%). Development-focused repositories tend to have higher activity, particularly for Toxiproxy and Chaos Mesh, highlighting their industrial relevance. Fault injection scenarios mainly address network disruptions (40.9%) and instance termination (32.7%), while application-level faults remain underrepresented (3.0%), highlighting an area for future exploration.
Submitted 19 May, 2025;
originally announced May 2025.
-
Recursive Self-Similarity in Deep Weight Spaces of Neural Architectures: A Fractal and Coarse Geometry Perspective
Authors:
Ambarish Moharil,
Indika Kumara,
Damian Andrew Tamburri,
Majid Mohammadi,
Willem-Jan van den Heuvel
Abstract:
This paper conceptualizes the Deep Weight Spaces (DWS) of neural architectures as hierarchical, fractal-like, coarse geometric structures observable at discrete integer scales through recursive dilation. We introduce a coarse group action termed the fractal transformation, $T_{r_k}$, acting under the symmetry group $G = (\mathbb{Z}, +)$, to analyze neural parameter matrices or tensors by segmenting the underlying discrete grid $\Omega$ into $N(r_k)$ fractals across varying observation scales $r_k$. This perspective adopts a box-counting technique, commonly used to assess the hierarchical and scale-related geometry of physical structures, which has been extensively formalized under the topic of fractal geometry. We assess the structural complexity of neural layers by estimating their Hausdorff-Besicovitch dimension and evaluating a degree of self-similarity. The fractal transformation features key algebraic properties such as linearity, identity, and asymptotic invertibility, which is a signature of coarse structures. We show that the coarse group action exhibits a set of symmetries such as Discrete Scale Invariance (DSI) under recursive dilation, strong invariance followed by weak equivariance to permutations, alongside respecting the scaling equivariance of activation functions, defined by the intertwiner group relations. Our framework targets large-scale structural properties of DWS, deliberately overlooking minor inconsistencies to focus on significant geometric characteristics of neural networks. Experiments on CIFAR-10 using ResNet-18, VGG-16, and a custom CNN validate our approach, demonstrating effective fractal segmentation and structural analysis.
Submitted 18 March, 2025;
originally announced March 2025.
-
Chaos Engineering: A Multi-Vocal Literature Review
Authors:
Joshua Owotogbe,
Indika Kumara,
Willem-Jan Van Den Heuvel,
Damian Andrew Tamburri
Abstract:
Organizations, particularly medium and large enterprises, today typically rely heavily on complex, distributed systems to deliver critical services and products. However, the growing complexity of these systems poses challenges in ensuring service availability, performance, and reliability. Traditional resilience testing methods often fail to capture modern systems' intricate interactions and failure modes. Chaos Engineering addresses these challenges by proactively testing how systems in production behave under turbulent conditions, allowing developers to uncover and resolve potential issues before they escalate into outages. Though chaos engineering has received growing attention from researchers and practitioners alike, we observed a lack of a comprehensive literature review. Hence, we performed a Multivocal Literature Review (MLR) on chaos engineering to fill this research gap by systematically analyzing 88 academic and grey literature sources published from January 2019 to April 2024. We first used the selected sources to derive a unified definition of chaos engineering and to identify key capabilities, components, and adoption drivers. We also developed a taxonomy for chaos engineering and compared the relevant tools using it. Finally, we analyzed the state of the current chaos engineering research and identified several open research issues.
Submitted 2 December, 2024;
originally announced December 2024.
-
A Scale-Invariant Diagnostic Approach Towards Understanding Dynamics of Deep Neural Networks
Authors:
Ambarish Moharil,
Damian Tamburri,
Indika Kumara,
Willem-Jan Van Den Heuvel,
Alireza Azarfar
Abstract:
This paper introduces a scale-invariant methodology employing Fractal Geometry to analyze and explain the nonlinear dynamics of complex connectionist systems. By leveraging architectural self-similarity in Deep Neural Networks (DNNs), we quantify fractal dimensions and roughness to deeply understand their dynamics and enhance the quality of intrinsic explanations. Our approach integrates principles from Chaos Theory to improve visualizations of fractal evolution and utilizes a Graph-Based Neural Network for reconstructing network topology. This strategy aims at advancing the intrinsic explainability of connectionist Artificial Intelligence (AI) systems.
Submitted 12 July, 2024;
originally announced July 2024.
-
Architectural Design Decisions for Self-Serve Data Platforms in Data Meshes
Authors:
Tom van Eijk,
Indika Kumara,
Dario Di Nucci,
Damian Andrew Tamburri,
Willem-Jan van den Heuvel
Abstract:
Data mesh is an emerging decentralized approach to managing and generating value from analytical enterprise data at scale. It shifts the ownership of the data to the business domains closest to the data, promotes sharing and managing data as autonomous products, and uses a federated and automated data governance model. The data mesh relies on a managed data platform that offers services to domain and governance teams to build, share, and manage data products efficiently. However, designing and implementing a self-serve data platform is challenging, and the platform engineers and architects must understand and choose the appropriate design options to ensure the platform will enhance the experience of domain and governance teams. For these reasons, this paper proposes a catalog of architectural design decisions and their corresponding decision options by systematically reviewing 43 industrial gray literature articles on self-serve data platforms in data mesh. Moreover, we used semi-structured interviews with six data engineering experts with data mesh experience to validate, refine, and extend the findings from the literature. Such a catalog of design decisions and options drawn from the state of practice shall aid practitioners in building data meshes while providing a baseline for further research on data mesh architectures.
Submitted 7 February, 2024;
originally announced February 2024.
-
Data Mesh: a Systematic Gray Literature Review
Authors:
Abel Goedegebuure,
Indika Kumara,
Stefan Driessen,
Dario Di Nucci,
Geert Monsieur,
Willem-Jan van den Heuvel,
Damian Andrew Tamburri
Abstract:
Data mesh is an emerging domain-driven decentralized data architecture that aims to minimize or avoid operational bottlenecks associated with centralized, monolithic data architectures in enterprises. The topic has piqued practitioners' interest, and there is considerable gray literature on it. At the same time, we observe a lack of academic attempts at defining and building upon the concept. Hence, in this article, we aim to start from the foundations and characterize the data mesh architecture regarding its design principles, architectural components, capabilities, and organizational roles. We systematically collected, analyzed, and synthesized 114 industrial gray literature articles. The review provides insights into practitioners' perspectives on the four key principles of data mesh: data as a product, domain ownership of data, self-serve data platform, and federated computational governance. Moreover, due to the comparability of data mesh and SOA (service-oriented architecture), we mapped the findings from the gray literature into the reference architectures from the SOA academic literature to create the reference architectures for describing three key dimensions of data mesh: organization of capabilities and roles, development, and runtime. Finally, we discuss open research issues in data mesh, partially based on the findings from the gray literature.
Submitted 7 August, 2024; v1 submitted 3 April, 2023;
originally announced April 2023.
-
QSOC: Quantum Service-Oriented Computing
Authors:
Indika Kumara,
Willem-Jan Van Den Heuvel,
Damian A. Tamburri
Abstract:
Quantum computing is quickly turning from a promise to a reality, witnessing the launch of several cloud-based, general-purpose offerings and IDEs. Unfortunately, however, existing solutions typically implicitly assume intimate knowledge of quantum computing concepts and operators. This paper introduces Quantum Service-Oriented Computing (QSOC), including a model-driven methodology to allow enterprise DevOps teams to compose, configure, and operate enterprise applications without intimate knowledge of the underlying quantum infrastructure, advocating knowledge reuse, separation of concerns, resource optimization, and mixed quantum- & conventional QSOC applications.
Submitted 4 May, 2021;
originally announced May 2021.
-
DeepIaC: Deep Learning-Based Linguistic Anti-pattern Detection in IaC
Authors:
Nemania Borovits,
Indika Kumara,
Parvathy Krishnan,
Stefano Dalla Palma,
Dario Di Nucci,
Fabio Palomba,
Damian A. Tamburri,
Willem-Jan van den Heuvel
Abstract:
Linguistic anti-patterns are recurring poor practices concerning inconsistencies among the naming, documentation, and implementation of an entity. They impede readability, understandability, and maintainability of source code. This paper attempts to detect linguistic anti-patterns in infrastructure as code (IaC) scripts used to provision and manage computing environments. In particular, we consider inconsistencies between the logic/body of IaC code units and their names. To this end, we propose a novel automated approach that employs word embeddings and deep learning techniques. We build and use the abstract syntax tree of IaC code units to create their code embedments. Our experiments with a dataset systematically extracted from open source repositories show that our approach yields an accuracy between 0.785 and 0.915 in detecting inconsistencies.
Submitted 22 September, 2020;
originally announced September 2020.
-
Towards Semantic Detection of Smells in Cloud Infrastructure Code
Authors:
Indika Kumara,
Zoe Vasileiou,
Georgios Meditskos,
Damian A. Tamburri,
Willem-Jan Van Den Heuvel,
Anastasios Karakostas,
Stefanos Vrochidis,
Ioannis Kompatsiaris
Abstract:
Automated deployment and management of Cloud applications rely on descriptions of their deployment topologies, often referred to as Infrastructure Code. As the complexity of applications and their deployment models increases, developers inadvertently introduce software smells to such code specifications, for instance, violations of good coding practices, modular structure, and more. This paper presents a knowledge-driven approach enabling developers to identify the aforementioned smells in deployment descriptions. We detect smells with SPARQL-based rules over pattern-based OWL 2 knowledge graphs capturing deployment models. We show the feasibility of our approach with a prototype and three case studies.
Submitted 4 July, 2020;
originally announced July 2020.
-
Quality Assurance of Heterogeneous Applications: The SODALITE Approach
Authors:
Indika Kumara,
Giovanni Quattrocchi,
Damian Tamburri,
Willem-Jan Van Den Heuvel
Abstract:
A key focus of the SODALITE project is to assure the quality and performance of the deployments of applications over heterogeneous Cloud and HPC environments. It offers a set of tools to detect and correct errors, smells, and bugs in the deployment models and their provisioning workflows, and a framework to monitor and refactor deployment model instances at runtime. This paper presents the objectives, designs, and early results of the quality assurance framework and the refactoring framework.
Submitted 25 March, 2020;
originally announced March 2020.
-
SDSN@RT: a middleware environment for single-instance multi-tenant cloud applications
Authors:
Indika Kumara,
Jun Han,
Alan Colman,
Willem-Jan van den Heuvel,
Damian A. Tamburri,
Malinda Kapuruge
Abstract:
With the Single-Instance Multi-Tenancy (SIMT) model for composite Software-as-a-Service (SaaS) applications, a single composite application instance can host multiple tenants, yielding the benefits of better service and resource utilization, and reduced operational cost for the SaaS provider. An SIMT application needs to share services and their aggregation (the application) among its tenants while supporting variations in the functional and performance requirements of the tenants. The SaaS provider requires a middleware environment that can deploy, enact and manage a designed SIMT application, to achieve the varied requirements of the different tenants in a controlled manner. This paper presents the SDSN@RT (Software-Defined Service Networks @ RunTime) middleware environment that can meet the aforementioned requirements. SDSN@RT represents an SIMT composite cloud application as a multi-tenant service network, where the same service network simultaneously hosts a set of virtual service networks (VSNs), one for each tenant. A service network connects a set of services, and coordinates the interactions between them. A VSN realizes the requirements for a specific tenant and can be deployed, configured, and logically isolated in the service network at runtime. SDSN@RT also supports the monitoring and runtime changes of the deployed multi-tenant service networks. We show the feasibility of SDSN@RT with a prototype implementation, and demonstrate its capabilities to host SIMT applications and support their changes with a case study. The performance study of the prototype implementation shows that the runtime capabilities of our middleware incur little overhead.
Submitted 10 February, 2020;
originally announced February 2020.
-
FM4SN: A Feature-Oriented Approach to Tenant-Driven Customization of Multi-Tenant Service Networks
Authors:
Indika Kumara,
Jun Han,
Alan Colman,
Willem-Jan van den Heuvel,
Damian Tamburri
Abstract:
In a multi-tenant service network, multiple virtual service networks (VSNs), one for each tenant, coexist on the same service network. The tenants themselves need to be able to dynamically create and customize their own VSNs to support their initial and changing functional and performance requirements. These tasks are problematic for them due to: 1) the platform-specific knowledge required, 2) the existence of a large number of customization options and their dependencies, and 3) the complexity in deriving the right subset of options. In this paper, we present an approach to enable and simplify the tenant-driven customization of multi-tenant service networks. We propose to use features as a high-level customization abstraction. A regulated collaboration among a set of services in the service network realizes a feature. A software engineer can design a customization policy for a service network using the mappings between features and collaborations, and enact the policy with the controller of the service network. A tenant can then specify the requirements for its VSN as a set of functional and performance features. A customization request from a tenant triggers the customization policy of the service network, which (re)configures the corresponding VSN at runtime to realize the selected features. We show the feasibility of our approach with two case studies and a performance evaluation.
Submitted 10 February, 2020;
originally announced February 2020.