-
Overcoming Noisy and Irrelevant Data in Federated Learning
Authors:
Tiffany Tuor,
Shiqiang Wang,
Bong Jun Ko,
Changchang Liu,
Kin K. Leung
Abstract:
Many image and vision applications require a large amount of data for model training. Collecting all such data at a central location can be challenging due to data privacy and communication bandwidth restrictions. Federated learning is an effective way of training a machine learning model in a distributed manner from local data collected by client devices, which does not require exchanging the raw…
▽ More
Many image and vision applications require a large amount of data for model training. Collecting all such data at a central location can be challenging due to data privacy and communication bandwidth restrictions. Federated learning is an effective way of training a machine learning model in a distributed manner from local data collected by client devices, which does not require exchanging the raw data among clients. A challenge is that among the large variety of data collected at each client, it is likely that only a subset is relevant for a learning task while the rest of data has a negative impact on model training. Therefore, before starting the learning process, it is important to select the subset of data that is relevant to the given federated learning task. In this paper, we propose a method for distributedly selecting relevant data, where we use a benchmark model trained on a small benchmark dataset that is task-specific, to evaluate the relevance of individual data samples at each client and select the data with sufficiently high relevance. Then, each client only uses the selected subset of its data in the federated learning process. The effectiveness of our proposed approach is evaluated on multiple real-world image datasets in a simulated system with a large number of clients, showing up to $25\%$ improvement in model accuracy compared to training with all data.
△ Less
Submitted 22 June, 2020; v1 submitted 22 January, 2020;
originally announced January 2020.
-
Model Pruning Enables Efficient Federated Learning on Edge Devices
Authors:
Yuang Jiang,
Shiqiang Wang,
Victor Valls,
Bong Jun Ko,
Wei-Han Lee,
Kin K. Leung,
Leandros Tassiulas
Abstract:
Federated learning (FL) allows model training from local data collected by edge/mobile devices while preserving data privacy, which has wide applicability to image and vision applications. A challenge is that client devices in FL usually have much more limited computation and communication resources compared to servers in a datacenter. To overcome this challenge, we propose PruneFL -- a novel FL a…
▽ More
Federated learning (FL) allows model training from local data collected by edge/mobile devices while preserving data privacy, which has wide applicability to image and vision applications. A challenge is that client devices in FL usually have much more limited computation and communication resources compared to servers in a datacenter. To overcome this challenge, we propose PruneFL -- a novel FL approach with adaptive and distributed parameter pruning, which adapts the model size during FL to reduce both communication and computation overhead and minimize the overall training time, while maintaining a similar accuracy as the original model. PruneFL includes initial pruning at a selected client and further pruning as part of the FL process. The model size is adapted during this process, which includes maximizing the approximate empirical risk reduction divided by the time of one FL round. Our experiments with various datasets on edge devices (e.g., Raspberry Pi) show that: (i) we significantly reduce the training time compared to conventional FL and various other pruning-based methods; (ii) the pruned model with automatically determined size converges to an accuracy that is very similar to the original model, and it is also a lottery ticket of the original model.
△ Less
Submitted 6 April, 2022; v1 submitted 26 September, 2019;
originally announced September 2019.
-
Online Collection and Forecasting of Resource Utilization in Large-Scale Distributed Systems
Authors:
Tiffany Tuor,
Shiqiang Wang,
Kin K. Leung,
Bong Jun Ko
Abstract:
Large-scale distributed computing systems often contain thousands of distributed nodes (machines). Monitoring the conditions of these nodes is important for system management purposes, which, however, can be extremely resource demanding as this requires collecting local measurements of each individual node and constantly sending those measurements to a central controller. Meanwhile, it is often us…
▽ More
Large-scale distributed computing systems often contain thousands of distributed nodes (machines). Monitoring the conditions of these nodes is important for system management purposes, which, however, can be extremely resource demanding as this requires collecting local measurements of each individual node and constantly sending those measurements to a central controller. Meanwhile, it is often useful to forecast the future system conditions for various purposes such as resource planning/allocation and anomaly detection, but it is usually too resource-consuming to have one forecasting model running for each node, which may also neglect correlations in observed metrics across different nodes. In this paper, we propose a mechanism for collecting and forecasting the resource utilization of machines in a distributed computing system in a scalable manner. We present an algorithm that allows each local node to decide when to transmit its most recent measurement to the central node, so that the transmission frequency is kept below a given constraint value. Based on the measurements received from local nodes, the central node summarizes the received data into a small number of clusters. Since the cluster partitioning can change over time, we also present a method to capture the evolution of clusters and their centroids. As an effective way to reduce the amount of computation, time-series forecasting models are trained on the time-varying centroids of each cluster, to forecast the future resource utilizations of a group of local nodes. The effectiveness of our proposed approach is confirmed by extensive experiments using multiple real-world datasets.
△ Less
Submitted 22 May, 2019;
originally announced May 2019.
-
Harvesting Time-Series Data from Service-Based Systems Hosted in MANETs
Authors:
Petr Novotny,
Bong Jun Ko,
Alexander L. Wolf
Abstract:
We are concerned with reliably harvesting data collected from service-based systems hosted on a mobile ad hoc network (MANET). More specifically, we are concerned with time-bounded and time-sensitive time-series monitoring data describing the state of the network and system. The data are harvested in order to perform an analysis, usually one that requires a global view of the data taken from distr…
▽ More
We are concerned with reliably harvesting data collected from service-based systems hosted on a mobile ad hoc network (MANET). More specifically, we are concerned with time-bounded and time-sensitive time-series monitoring data describing the state of the network and system. The data are harvested in order to perform an analysis, usually one that requires a global view of the data taken from distributed sites. For example, network- and application-state data are typically analysed in order to make operational and maintenance decisions. MANETs are a challenging environment in which to harvest monitoring data, due to the inherently unstable and unpredictable connectivity between nodes, and the overhead of transferring data in a wireless medium. These limitations must be overcome to support time-series analysis of perishable and time-critical data. We present an epidemic, delay tolerant, and intelligent method to efficiently and effectively transfer time-series data between the mobile nodes of MANETs. The method establishes a network-wide synchronization overlay to transfer increments of the data over intermediate nodes in periodic cycles. The data are then accessible from local stores at the nodes. We implemented the method in Java~EE and present evaluation on a run-time dependence discovery method for Web Service applications hosted on MANETs, and comparison to other four methods demonstrating that our method performs significantly better in both data availability and network overhead.
△ Less
Submitted 22 September, 2018;
originally announced September 2018.
-
Live Service Migration in Mobile Edge Clouds
Authors:
Andrew Machen,
Shiqiang Wang,
Kin K. Leung,
Bong Jun Ko,
Theodoros Salonidis
Abstract:
Mobile edge clouds (MECs) bring the benefits of the cloud closer to the user, by installing small cloud infrastructures at the network edge. This enables a new breed of real-time applications, such as instantaneous object recognition and safety assistance in intelligent transportation systems, that require very low latency. One key issue that comes with proximity is how to ensure that users always…
▽ More
Mobile edge clouds (MECs) bring the benefits of the cloud closer to the user, by installing small cloud infrastructures at the network edge. This enables a new breed of real-time applications, such as instantaneous object recognition and safety assistance in intelligent transportation systems, that require very low latency. One key issue that comes with proximity is how to ensure that users always receive good performance as they move across different locations. Migrating services between MECs is seen as the means to achieve this. This article presents a layered framework for migrating active service applications that are encapsulated either in virtual machines (VMs) or containers. This layering approach allows a substantial reduction in service downtime. The framework is easy to implement using readily available technologies, and one of its key advantages is that it supports containers, which is a promising emerging technology that offers tangible benefits over VMs. The migration performance of various real applications is evaluated by experiments under the presented framework. Insights drawn from the experimentation results are discussed.
△ Less
Submitted 2 August, 2017; v1 submitted 13 June, 2017;
originally announced June 2017.