-
Metric Privacy in Federated Learning for Medical Imaging: Improving Convergence and Preventing Client Inference Attacks
Authors:
Judith Sáinz-Pardo Díaz,
Andreas Athanasiou,
Kangsoo Jung,
Catuscia Palamidessi,
Álvaro López García
Abstract:
Federated learning is a distributed learning technique that allows training a global model with the participation of different data owners without the need to share raw data. This architecture is orchestrated by a central server that aggregates the local models from the clients. This server may be trusted, but not all nodes in the network. Then, differential privacy (DP) can be used to privatize the global model by adding noise. However, this may affect convergence across the rounds of the federated architecture, depending also on the aggregation strategy employed. In this work, we aim to introduce the notion of metric-privacy to mitigate the impact of classical server side global-DP on the convergence of the aggregated model. Metric-privacy is a relaxation of DP, suitable for domains provided with a notion of distance. We apply it from the server side by computing a distance for the difference between the local models. We compare our approach with standard DP by analyzing the impact on six classical aggregation strategies. The proposed methodology is applied to an example of medical imaging and different scenarios are simulated across homogeneous and non-i.i.d clients. Finally, we introduce a novel client inference attack, where a semi-honest client tries to find whether another client participated in the training and study how it can be mitigated using DP and metric-privacy. Our evaluation shows that metric-privacy can increase the performance of the model compared to standard DP, while offering similar protection against client inference attacks.
Submitted 3 February, 2025;
originally announced February 2025.
-
Enhancing the Convergence of Federated Learning Aggregation Strategies with Limited Data
Authors:
Judith Sáinz-Pardo Díaz,
Álvaro López García
Abstract:
The development of deep learning techniques is a leading field applied to cases in which medical data is used, particularly in cases of image diagnosis. This type of data has privacy and legal restrictions that in many cases prevent it from being processed from central servers. However, in this area collaboration between different research centers, in order to create models as robust as possible, trained with the largest quantity and diversity of data available, is a critical point to be taken into account. In this sense, the application of privacy aware distributed architectures, such as federated learning arises. When applying this type of architecture, the server aggregates the different local models trained with the data of each data owner to build a global model. This point is critical and therefore it is fundamental to analyze different ways of aggregation according to the use case, taking into account the distribution of the clients, the characteristics of the model, etc. In this paper we propose a novel aggregation strategy and we apply it to a use case of cerebral magnetic resonance image classification. In this use case the aggregation function proposed manages to improve the convergence obtained over the rounds of the federated learning process in relation to different aggregation strategies classically implemented and applied.
Submitted 27 January, 2025;
originally announced January 2025.
-
An Open Source Python Library for Anonymizing Sensitive Data
Authors:
Judith Sáinz-Pardo Díaz,
Álvaro López García
Abstract:
Open science is a fundamental pillar to promote scientific progress and collaboration, based on the principles of open data, open source and open access. However, the requirements for publishing and sharing open data are in many cases difficult to meet in compliance with strict data protection regulations. Consequently, researchers need to rely on proven methods that allow them to anonymize their data without sharing it with third parties. To this end, this paper presents the implementation of a Python library for the anonymization of sensitive tabular data. This framework provides users with a wide range of anonymization methods that can be applied on the given dataset, including the set of identifiers, quasi-identifiers, generalization hierarchies and allowed level of suppression, along with the sensitive attribute and the level of anonymity required. The library has been implemented following best practices for integration and continuous development, as well as the use of workflows to test code coverage based on unit and functional tests.
Submitted 20 August, 2024;
originally announced August 2024.
-
Personalized Federated Learning for improving radar based precipitation nowcasting on heterogeneous areas
Authors:
Judith Sáinz-Pardo Díaz,
María Castrillo,
Juraj Bartok,
Ignacio Heredia Cachá,
Irina Malkin Ondík,
Ivan Martynovskyi,
Khadijeh Alibabaei,
Lisana Berberi,
Valentin Kozlov,
Álvaro López García
Abstract:
The increasing generation of data in different areas of life, such as the environment, highlights the need to explore new techniques for processing and exploiting data for useful purposes. In this context, artificial intelligence techniques, especially through deep learning models, are key tools to be used on the large amount of data that can be obtained, for example, from weather radars. In many cases, the information collected by these radars is not open, or belongs to different institutions, thus needing to deal with the distributed nature of this data. In this work, the applicability of a personalized federated learning architecture, which has been called adapFL, on distributed weather radar images is addressed. To this end, given a single available radar covering 400 km in diameter, the captured images are divided in such a way that they are disjointly distributed into four different federated clients. The results obtained with adapFL are analyzed in each zone, as well as in a central area covering part of the surface of each of the previously distributed areas. The ultimate goal of this work is to study the generalization capability of this type of learning technique for its extrapolation to use cases in which a representative number of radars is available, whose data can not be centralized due to technical, legal or administrative concerns. The results of this preliminary study indicate that the performance obtained in each zone with the adapFL approach allows improving the results of the federated learning approach, the individual deep learning models and the classical Continuity Tracking Radar Echoes by Correlation approach.
Submitted 11 August, 2024;
originally announced August 2024.
-
The irruption of cryptocurrencies into Twitter cashtags: a classifying solution
Authors:
Ana Fernández Vilas,
Rebeca Díaz Redondo,
Antón Lorenzo García
Abstract:
There is a consensus about the good sensing characteristics of Twitter to mine and uncover knowledge in financial markets, being considered a relevant feeder for taking decisions about buying or holding stock shares and even for detecting stock manipulation. Although Twitter hashtags allow aggregating topic-related content, a specific mechanism for financial information also exists: Cashtag. However, the irruption of cryptocurrencies has resulted in a significant degradation of the cashtag-based aggregation of posts. Unfortunately, Twitter users may use homonym tickers to refer to cryptocurrencies and to companies in stock markets, which means that filtering by cashtag may return both posts referring to stock companies and posts referring to cryptocurrencies. This research proposes automated classifiers to distinguish conflicting cashtags and, so, their container tweets by analyzing the distinctive features of tweets referring to stock companies and cryptocurrencies. As an experiment, this paper analyses the interference between cryptocurrencies and company tickers in the London Stock Exchange (LSE), specifically, companies in the main and alternative market indices FTSE-100 and AIM-100. Heuristic-based as well as supervised classifiers are proposed and their advantages and drawbacks, including their ability to self-adapt to Twitter usage changes, are discussed. The experiment confirms a significant distortion in collected data when colliding or homonym cashtags exist, i.e., the same \$ acronym to refer to company tickers and cryptocurrencies. According to our results, the distinctive features of posts including cryptocurrencies or company tickers support accurate classification of colliding tweets (homonym cashtags) and Independent Models, as the most detached classifiers from training data, have the potential for trans-applicability (in different stock markets) while retaining performance.
Submitted 14 December, 2023;
originally announced December 2023.
-
Comparison of machine learning models applied on anonymized data with different techniques
Authors:
Judith Sáinz-Pardo Díaz,
Álvaro López García
Abstract:
Anonymization techniques based on obfuscating the quasi-identifiers by means of value generalization hierarchies are widely used to achieve preset levels of privacy. To prevent different types of attacks against database privacy it is necessary to apply several anonymization techniques beyond the classical k-anonymity or $\ell$-diversity. However, the application of these methods is directly connected to a reduction of their utility in prediction and decision making tasks. In this work we study four classical machine learning methods currently used for classification purposes in order to analyze the results as a function of the anonymization techniques applied and the parameters selected for each of them. The performance of these models is studied when varying the value of k for k-anonymity and additional tools such as $\ell$-diversity, t-closeness and $δ$-disclosure privacy are also deployed on the well-known adult dataset.
Submitted 12 May, 2023;
originally announced May 2023.
-
pyCANON: A Python library to check the level of anonymity of a dataset
Authors:
Judith Sáinz-Pardo Díaz,
Álvaro López García
Abstract:
Openly sharing data with sensitive attributes and privacy restrictions is a challenging task. In this document we present the implementation of pyCANON, a Python library and command line interface (CLI) to check and assess the level of anonymity of a dataset through some of the most common anonymization techniques: k-anonymity, ($α$,k)-anonymity, $\ell$-diversity, entropy $\ell$-diversity, recursive (c,$\ell$)-diversity, basic $β$-likeness, enhanced $β$-likeness, t-closeness and $δ$-disclosure privacy. For the case of more than one sensitive attribute, two approaches are proposed for evaluating these techniques. The main strength of this library is to obtain a full report of the parameters that are fulfilled for each of the techniques mentioned above, with the only requirement being the set of quasi-identifiers and that of sensitive attributes. We present the methods implemented together with the attacks they prevent, the description of the library, use examples of the different functions, as well as the impact and the possible applications that can be developed. Finally, some possible aspects to be incorporated in future updates are proposed.
Submitted 16 August, 2022;
originally announced August 2022.
-
A Container-Based Workflow for Distributed Training of Deep Learning Algorithms in HPC Clusters
Authors:
Jose González-Abad,
Álvaro López García,
Valentin Y. Kozlov
Abstract:
Deep learning has been postulated as a solution for numerous problems in different branches of science. Given the resource-intensive nature of these models, they often need to be executed on specialized hardware such as graphical processing units (GPUs) in a distributed manner. In the academic field, researchers get access to this kind of resource through High Performance Computing (HPC) clusters. This kind of infrastructure makes the training of these models difficult due to its multi-user nature and limited user permissions. In addition, different HPC clusters may possess different peculiarities that can entangle the research cycle (e.g., library dependencies). In this paper we develop a workflow and methodology for the distributed training of deep learning models in HPC clusters which provides researchers with a series of novel advantages. It relies on udocker as containerization tool and on Horovod as library for the distribution of the models across multiple GPUs. udocker does not need any special permission, allowing researchers to run the entire workflow without relying on any administrator. Horovod ensures the efficient distribution of the training independently of the deep learning framework used. Additionally, due to containerization and specific features of the workflow, it provides researchers with a cluster-agnostic way of running their models. The experiments carried out show that the workflow offers good scalability in the distributed training of the models and that it easily adapts to different clusters.
Submitted 14 November, 2022; v1 submitted 4 August, 2022;
originally announced August 2022.
-
Study of the performance and scalability of federated learning for medical imaging with intermittent clients
Authors:
Judith Sáinz-Pardo Díaz,
Álvaro López García
Abstract:
Federated learning is a data decentralization privacy-preserving technique used to perform machine or deep learning in a secure way. In this paper we present theoretical aspects about federated learning, such as the presentation of an aggregation operator, different types of federated learning, and issues to be taken into account in relation to the distribution of data from the clients, together with the exhaustive analysis of a use case where the number of clients varies. Specifically, a use case of medical image analysis is proposed, using chest X-Ray images obtained from an open data repository. In addition to the advantages related to privacy, improvements in predictions (in terms of accuracy, loss and area under the curve) and reduction of execution times will be studied with respect to the classical case (the centralized approach). Different clients will be simulated from the training data, selected in an unbalanced manner. The results of considering three or ten clients are presented and compared with each other and against the centralized case. Two different problems related to intermittent clients are discussed, together with two approaches to be followed for each of them. Specifically, these problems may occur because in a real scenario some clients may leave the training while others enter it, or because of client technical or connectivity problems. Finally, improvements and future work in the field are proposed.
Submitted 3 November, 2022; v1 submitted 18 July, 2022;
originally announced July 2022.
-
Forecasting COVID-19 spreading trough an ensemble of classical and machine learning models: Spain's case study
Authors:
Ignacio Heredia Cacha,
Judith Sainz-Pardo Díaz,
María Castrillo Melguizo,
Álvaro López García
Abstract:
In this work we evaluate the applicability of an ensemble of population models and machine learning models to predict the near future evolution of the COVID-19 pandemic, with a particular use case in Spain. We rely solely on open and public datasets, fusing incidence, vaccination, human mobility and weather data to feed our machine learning models (Random Forest, Gradient Boosting, k-Nearest Neighbours and Kernel Ridge Regression). We use the incidence data to adjust classic population models (Gompertz, Logistic, Richards, Bertalanffy) in order to be able to better capture the trend of the data. We then ensemble these two families of models in order to obtain a more robust and accurate prediction. Furthermore, we have observed an improvement in the predictions obtained with machine learning models as we add new features (vaccines, mobility, climatic conditions), analyzing the importance of each of them using Shapley Additive Explanation values. As in any other modelling work, data and prediction quality have several limitations and therefore they must be seen from a critical standpoint, as we discuss in the text. Our work concludes that the ensemble use of these models improves the individual predictions (using only machine learning models or only population models) and can be applied, with caution, in cases when compartmental models cannot be utilized due to the lack of relevant data.
Submitted 12 August, 2022; v1 submitted 12 July, 2022;
originally announced July 2022.
-
Interaction and Conflict Management in AI-assisted Operational Control Loops in 6G
Authors:
Saeedeh Parsaeefard,
Pooyan Habibi,
Alberto Leon Garcia
Abstract:
This paper studies autonomous and AI-assisted control loops (ACLs) in the next generation of wireless networks through the lens of multi-agent environments. We study the diverse interactions and conflict management among these loops. We propose "interaction and conflict management" (ICM) modules to achieve coherent and consistent interactions among these ACLs. We introduce three categories of ACLs based on their sizes, their cooperative and competitive behaviors, and their sharing of datasets and models. These categories help to introduce conflict resolution and interaction management mechanisms for ICM. Using Kubernetes, we present an implementation of ICM to remove the conflicts in the scheduling and rescheduling of Pods for different ACLs in networks.
Submitted 22 October, 2021;
originally announced October 2021.
-
Generalized ADMM in Distributed Learning via Variational Inequality
Authors:
Saeedeh Parsaeefard,
Alberto Leon Garcia
Abstract:
Due to the explosion in size and complexity of modern data sets and privacy concerns of data holders, it is increasingly important to be able to solve machine learning problems in a distributed manner. The Alternating Direction Method of Multipliers (ADMM), through the concept of consensus variables, is a practical algorithm in this context, and its diverse variations and its performance have been studied in different application areas. In this paper, we study the effect of the local data sets of users in the distributed learning of ADMM. Our aim is to deploy variational inequality (VI) to attain a unified view of ADMM variations. Through the simulation results, we demonstrate how more general definitions of consensus parameters and the introduction of uncertain parameters in the distributed approach can help to get better results in learning processes.
Submitted 26 April, 2021;
originally announced April 2021.
-
Robust Federated Learning by Mixture of Experts
Authors:
Saeedeh Parsaeefard,
Sayed Ehsan Etesami,
Alberto Leon Garcia
Abstract:
We present a novel weighted average model based on the mixture of experts (MoE) concept to provide robustness in Federated learning (FL) against the poisoned/corrupted/outdated local models. These threats along with the non-IID nature of data sets can considerably diminish the accuracy of the FL model. Our proposed MoE-FL setup relies on the trust between users and the server where the users share a portion of their public data sets with the server. The server applies a robust aggregation method by solving the optimization problem or the Softmax method to highlight the outlier cases and to reduce their adverse effect on the FL process. Our experiments illustrate that MoE-FL outperforms the performance of the traditional aggregation approach for high rate of poisoned data from attackers.
Submitted 23 April, 2021;
originally announced April 2021.
-
Stochastic Geometry-Based Modeling and Analysis of Beam Management in 5G
Authors:
Sanket S. Kalamkar,
Fuad M. Abinader Jr.,
François Baccelli,
Andrea S. Marcano Fani,
Luis G. Uzeda Garcia
Abstract:
Beam management is central in the operation of dense 5G cellular networks. Focusing the energy radiated to mobile terminals (MTs) by increasing the number of beams per cell increases signal power and decreases interference, and has hence the potential to bring major improvements on area spectral efficiency (ASE). This benefit, however, comes with unavoidable overheads that increase with the number of beams and the MT speed. This paper proposes a first system-level stochastic geometry model encompassing major aspects of the beam management problem: frequencies, antennas, and propagation; physical layer, wireless links, and coding; network geometry, interference, and resource sharing; sensing, signaling, and mobility management. This model leads to a simple analytical expression for the effective ASE that the typical user gets in this context. This in turn allows one to find, for a wide variety of 5G network scenarios including millimeter wave (mmWave) and sub-6 GHz, the number of beams per cell that offers the best global trade-off between these benefits and costs. We finally provide numerical results that discuss the effects of different systemic trade-offs and performances of mmWave and sub-6 GHz 5G deployments.
Submitted 14 September, 2020; v1 submitted 8 June, 2020;
originally announced June 2020.
-
Estimation of high frequency nutrient concentrations from water quality surrogates using machine learning methods
Authors:
María Castrillo,
Álvaro López García
Abstract:
Continuous high frequency water quality monitoring is becoming a critical task to support water management. Despite the advancements in sensor technologies, certain variables cannot be easily and/or economically monitored in-situ and in real time. In these cases, surrogate measures can be used to make estimations by means of data-driven models. In this work, variables that are commonly measured in-situ are used as surrogates to estimate the concentrations of nutrients in a rural catchment and in an urban one, making use of machine learning models, specifically Random Forests. The results are compared with those of linear modelling using the same number of surrogates, obtaining a reduction in the Root Mean Squared Error (RMSE) of up to 60.1%. The benefit of including up to seven surrogate sensors was computed, concluding that adding more than 4 and 5 sensors in each of the catchments respectively was not worthwhile in terms of error improvement.
Submitted 27 January, 2020;
originally announced January 2020.
-
Representation of Federated Learning via Worst-Case Robust Optimization Theory
Authors:
Saeedeh Parsaeefard,
Iman Tabrizian,
Alberto Leon Garcia
Abstract:
Federated learning (FL) is a distributed learning approach where a set of end-user devices participate in the learning process by acting on their isolated local data sets. Here, we process local data sets of users where worst-case optimization theory is used to reformulate the FL problem where the impact of local data sets in training phase is considered as an uncertain function bounded in a closed uncertainty region. This representation allows us to compare the performance of FL with its centralized counterpart, and to replace the uncertain function with a concept of protection functions leading to more tractable formulation. The latter supports applying a regularization factor in each user cost function in FL to reach a better performance. We evaluated our model using the MNIST data set versus the protection function parameters, e.g., regularization factors.
Submitted 11 December, 2019;
originally announced December 2019.
-
An efficient cloud scheduler design supporting preemptible instances
Authors:
Álvaro López García,
Enol Fernández-del-Castillo,
Isabel Campos Plasencia
Abstract:
Maximizing resource utilization by performing efficient resource provisioning is a key factor for any cloud provider: commercial actors can maximize their revenues, whereas scientific and non-commercial providers can maximize their infrastructure utilization. Traditionally, batch systems have allowed data centers to fill their resources as much as possible by using backfilling and similar techniques. However, in an IaaS cloud, where virtual machines are supposed to live indefinitely, or at least as long as the user is able to pay for them, these policies are not easily implementable. In this work we present a new scheduling algorithm for IaaS providers that is able to support preemptible instances, which can be stopped by higher priority requests, without introducing large modifications in the current cloud schedulers. This scheduler enables the implementation of new cloud usage and payment models that allow more efficient usage of the resources and potential new revenue sources for commercial providers. We also study the correctness and the performance overhead of the proposed scheduler against existing solutions.
Submitted 28 January, 2020; v1 submitted 27 December, 2018;
originally announced December 2018.
-
umd-verification: Automation of Software Validation for the EGI federated e-Infrastructure
Authors:
Pablo Orviz Fernandez,
Joao Pina,
Alvaro Lopez Garcia,
Isabel Campos Plasencia,
Mario David,
Jorge Gomes
Abstract:
Supporting e-Science in the EGI e-Infrastructure requires extensive and reliable software, for advanced computing use, deployed across approximately 300 European and worldwide data centers. The Unified Middleware Distribution (UMD) and Cloud Middleware Distribution (CMD) are the channels to deliver the software for the EGI e-Infrastructure consumption. The software is compiled, validated and distributed following the Software Provisioning Process (SWPP), where the Quality Criteria (QC) definition sets the minimum quality requirements for EGI acceptance. The growing number of software components currently existing within the UMD and CMD distributions hinders the application of the traditional, manual validation mechanisms, thus driving the adoption of automated solutions. This paper presents umd-verification, an open-source tool that enforces the fulfillment of the QC requirements in an automated way for the continuous validation of the software products for scientific disposal. The umd-verification tool has been successfully integrated within the SWPP pipeline and is progressively supporting the full validation of the products in the UMD and CMD repositories. While the cost of supporting new products is dependent on the availability of Infrastructure as Code solutions to take over the deployment and high test coverage, the results obtained for the already integrated products are promising, as the time invested in the validation of products has been drastically reduced. Furthermore, automation adoption has brought along benefits for the reliability of the process, such as the removal of human-associated errors or the risk of regression of previously tested functionalities.
Submitted 30 July, 2018;
originally announced July 2018.
-
Efficient image deployment in cloud environments
Authors:
Álvaro López García,
Enol Fernández del Castillo
Abstract:
The biggest overhead for the instantiation of a virtual machine in a cloud infrastructure is the time spent in transferring the image of the virtual machine into the physical node that executes it. This overhead becomes larger for requests composed of several virtual machines to be started concurrently, and the illusion of flexibility and elasticity usually associated with the cloud computing model may vanish. This poses a problem for both the resource providers and the software developers, since tackling those overheads is not a trivial issue.
In this work we implement and evaluate several improvements for the virtual machine image distribution problem in a cloud infrastructure and propose a method based on BitTorrent and local caching of the virtual machine images that reduces the transfer time when large requests are made.
Submitted 21 November, 2017;
originally announced November 2017.
-
Standards for enabling heterogeneous IaaS cloud federations
Authors:
Álvaro López García,
Enol Fernández del Castillo,
Pablo Orviz Fernández
Abstract:
The technology market is in a phase of rapid growth in which different resource providers and Cloud Management Frameworks are positioning themselves to provide ad-hoc solutions (in terms of management interfaces, information discovery or billing), trying to differentiate themselves from competitors; as a result, they remain incompatible with one another when addressing more complex scenarios like federated clouds. Grasping the interoperability problems present in current infrastructures is then a must, tackled by studying how existing and emerging standards could enhance user experience in the cloud ecosystem. In this paper we review the current open challenges in Infrastructure as a Service cloud interoperability and federation, and point to the potential standards that should alleviate these problems.
Submitted 21 November, 2017;
originally announced November 2017.
-
Orchestrating Complex Application Architectures in Heterogeneous Clouds
Authors:
Miguel Caballer,
Sahdev Zala,
Álvaro López García,
Germán Moltó,
Pablo Orviz Fernández,
Mathieu Velten
Abstract:
Private cloud infrastructures are now widely deployed and adopted across technology industries and research institutions. Although cloud computing has emerged as a reality, it is now known that a single cloud provider cannot fully satisfy complex user requirements. This has resulted in a growing interest in developing hybrid cloud solutions that bind together distinct and heterogeneous cloud infrastructures. In this paper we describe the orchestration approach for heterogeneous clouds that has been implemented and used within the INDIGO-DataCloud project. This orchestration model uses existing open-source software like OpenStack and leverages the OASIS Topology and Orchestration Specification for Cloud Applications (TOSCA) open standard as the modeling language. Our approach uses virtual machines and Docker containers in a homogeneous and transparent way, providing consistent application deployment for the users. This approach is illustrated by means of two different use cases in different scientific communities, implemented using the INDIGO-DataCloud solutions.
Submitted 9 November, 2017;
originally announced November 2017.
-
Resource provisioning in Science Clouds: Requirements and challenges
Authors:
Álvaro López García,
Enol Fernández-del-Castillo,
Pablo Orviz Fernández,
Isabel Campos Plasencia,
Jesús Marco de Lucas
Abstract:
Cloud computing has permeated the information technology industry in the last few years, and it is now emerging in scientific environments. Science user communities are demanding a broad range of computing power to satisfy the needs of high-performance applications, such as local clusters, high-performance computing systems, and computing grids. Different workloads are needed from different computational models, and the cloud is already considered a promising paradigm. The scheduling and allocation of resources is always a challenging matter in any form of computation, and clouds are no exception. Science applications have unique features that differentiate their workloads; hence, their requirements have to be taken into consideration when building a Science Cloud. This paper discusses the main scheduling and resource allocation challenges for any Infrastructure as a Service provider supporting scientific applications.
Submitted 25 September, 2017;
originally announced September 2017.
-
Improved Cloud resource allocation: how INDIGO-DataCloud is overcoming the current limitations in Cloud schedulers
Authors:
Alvaro Lopez Garcia,
Lisa Zangrando,
Massimo Sgaravatto,
Vincent Llorens,
Sara Vallero,
Valentina Zaccolo,
Stefano Bagnasco,
Sonia Taneja,
Stefano Dal Pra,
Davide Salomoni,
Giacinto Donvito
Abstract:
Performing efficient resource provisioning is a fundamental aspect for any resource provider. Local Resource Management Systems (LRMS) have been used in data centers for decades in order to obtain the best usage of the resources, providing their fair usage and partitioning for the users. In contrast, current cloud schedulers are normally based on the immediate allocation of resources on a first-come, first-served basis, meaning that a request will fail if there are no resources (e.g. OpenStack) or it will be trivially queued, ordered by entry time (e.g. OpenNebula). Moreover, these scheduling strategies are based on a static partitioning of the resources, meaning that existing quotas cannot be exceeded, even if there are idle resources allocated to other projects. This is a consequence of the fact that cloud instances are not associated with a maximum execution time, and it leads to a situation where the resources are under-utilized. The INDIGO-DataCloud project has identified these strategies as too simplistic for accommodating scientific workloads efficiently, since under-utilization of resources is undesirable in scientific data centers. In this work, we present the work done in the scheduling area during the first year of the INDIGO project and the foreseen evolutions.
Submitted 20 July, 2017;
originally announced July 2017.
-
Analysis of Scientific Cloud Computing requirements
Authors:
Álvaro López García,
Enol Fernández del Castillo
Abstract:
While the requirements of enterprise and web applications have driven the development of Cloud computing, some of its key features, such as customized environments and rapid elasticity, could also benefit scientific applications. However, neither virtualization techniques nor Cloud-like access to resources is common in scientific computing centers due to the negative perception of the impact that virtualization techniques introduce.
In this paper we discuss the feasibility of the IaaS cloud model to satisfy some of the computational science requirements and the main drawbacks that need to be addressed by cloud resource providers so that the maximum benefit can be obtained from a given cloud infrastructure.
Submitted 22 June, 2015; v1 submitted 24 September, 2013;
originally announced September 2013.