-
$r$Age-$k$: Communication-Efficient Federated Learning Using Age Factor
Authors:
Matin Mortaheb,
Priyanka Kaswan,
Sennur Ulukus
Abstract:
Federated learning (FL) is a collaborative approach where multiple clients, coordinated by a parameter server (PS), train a unified machine-learning model. The approach, however, suffers from two key challenges: data heterogeneity and communication overhead. Data heterogeneity refers to inconsistencies in model training arising from heterogeneous data at different clients. Communication overhead arises from the large volumes of parameter updates exchanged between the PS and clients. Existing solutions typically address these challenges separately. This paper introduces a new communication-efficient algorithm that uses the age of information metric to simultaneously tackle both limitations of FL. We introduce age vectors at the PS, which keep track of how often the different model parameters are updated from the clients. The PS uses this to selectively request updates for specific gradient indices from each client. Further, the PS employs age vectors to identify clients with statistically similar data and group them into clusters. The PS combines the age vectors of the clustered clients to efficiently coordinate gradient index updates among clients within a cluster. We evaluate our approach using the MNIST and CIFAR10 datasets in highly non-i.i.d. settings. The experimental results show that our proposed method can expedite training, surpassing other communication-efficient strategies in efficiency.
Submitted 29 October, 2024;
originally announced October 2024.
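The age-vector idea in the abstract above can be illustrated with a minimal sketch. It assumes the PS keeps one age counter per gradient index, that each client first nominates its `r` largest-magnitude indices, and that the PS then requests the `k` oldest of those; the abstract does not spell out these steps, so the function names `select_indices` and `update_ages`, the parameters `r` and `k`, and the reset-to-zero rule are all hypothetical, not the authors' implementation. Clustering of clients is not shown.

```python
def select_indices(gradient, ages, r, k):
    """Pick k of the client's r largest-magnitude gradient indices, oldest first."""
    # Candidate set: the r indices with the largest |gradient| (assumed client-side step).
    candidates = sorted(
        range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True
    )[:r]
    # Among the candidates, request the k indices with the highest age.
    # sorted() is stable, so equal ages keep the magnitude ordering.
    return sorted(candidates, key=lambda i: ages[i], reverse=True)[:k]


def update_ages(ages, requested):
    """Age every index by one round, then reset the indices that were just updated."""
    requested = set(requested)
    return [0 if i in requested else age + 1 for i, age in enumerate(ages)]


# Toy example: six model parameters, one client, one round.
gradient = [0.9, -0.1, 0.5, -0.7, 0.05, 0.3]
ages = [0, 4, 3, 1, 7, 2]
chosen = select_indices(gradient, ages, r=4, k=2)
ages = update_ages(ages, chosen)
```

In this toy round the client uploads only two of six coordinates, and indices that have gone longest without an update are preferred among those that currently carry large gradients.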
-
Hierarchical Over-the-Air FedGradNorm
Authors:
Cemil Vahapoglu,
Matin Mortaheb,
Sennur Ulukus
Abstract:
Multi-task learning (MTL) is a learning paradigm to learn multiple related tasks simultaneously with a single shared network where each task has a distinct personalized header network for fine-tuning. MTL can be integrated into a federated learning (FL) setting if tasks are distributed across clients and clients have a single shared network, leading to personalized federated learning (PFL). To cope with statistical heterogeneity in the federated setting across clients which can significantly degrade the learning performance, we use a distributed dynamic weighting approach. To perform the communication between the remote parameter server (PS) and the clients efficiently over the noisy channel in a power and bandwidth-limited regime, we utilize over-the-air (OTA) aggregation and hierarchical federated learning (HFL). Thus, we propose hierarchical over-the-air (HOTA) PFL with a dynamic weighting strategy which we call HOTA-FedGradNorm. Our algorithm considers the channel conditions during the dynamic weight selection process. We conduct experiments on a wireless communication system dataset (RadComDynamic). The experimental results demonstrate that the training speed with HOTA-FedGradNorm is faster compared to the algorithms with a naive static equal weighting strategy. In addition, HOTA-FedGradNorm provides robustness against the negative channel effects by compensating for the channel conditions during the dynamic weight selection process.
Submitted 14 December, 2022;
originally announced December 2022.
-
FedGradNorm: Personalized Federated Gradient-Normalized Multi-Task Learning
Authors:
Matin Mortaheb,
Cemil Vahapoglu,
Sennur Ulukus
Abstract:
Multi-task learning (MTL) is a novel framework to learn several tasks simultaneously with a single shared network where each task has its distinct personalized header network for fine-tuning. MTL can be implemented in federated learning settings as well, in which tasks are distributed across clients. In federated settings, the statistical heterogeneity due to different task complexities and data heterogeneity due to non-iid nature of local datasets can both degrade the learning performance of the system. In addition, tasks can negatively affect each other's learning performance due to negative transference effects. To cope with these challenges, we propose FedGradNorm which uses a dynamic-weighting method to normalize gradient norms in order to balance learning speeds among different tasks. FedGradNorm improves the overall learning performance in a personalized federated learning setting. We provide convergence analysis for FedGradNorm by showing that it has an exponential convergence rate. We also conduct experiments on multi-task facial landmark (MTFL) and wireless communication system dataset (RadComDynamic). The experimental results show that our framework can achieve faster training performance compared to equal-weighting strategy. In addition to improving training speed, FedGradNorm also compensates for the imbalanced datasets among clients.
Submitted 24 March, 2022;
originally announced March 2022.
-
Dynamic Infection Spread Model Based Group Testing
Authors:
Batuhan Arasli,
Sennur Ulukus
Abstract:
We study a dynamic infection spread model, inspired by the discrete time SIR model, where infections are spread via non-isolated infected individuals. While infection keeps spreading over time, a limited capacity testing is performed at each time instance as well. In contrast to the classical, static, group testing problem, the objective in our setup is not to find the minimum number of required tests to identify the infection status of every individual in the population, but to control the infection spread by detecting and isolating the infections over time by using the given, limited number of tests. In order to analyze the performance of the proposed algorithms, we focus on the mean-sense analysis of the number of individuals that remain non-infected throughout the process of controlling the infection. We propose two dynamic algorithms that both use given limited number of tests to identify and isolate the infections over time, while the infection spreads. While the first algorithm is a dynamic randomized individual testing algorithm, in the second algorithm we employ the group testing approach similar to the original work of Dorfman. By considering weak versions of our algorithms, we obtain lower bounds for the performance of our algorithms. Finally, we implement our algorithms and run simulations to gather numerical results and compare our algorithms and theoretical approximation results under different sets of system parameters.
Submitted 21 January, 2022;
originally announced January 2022.
-
Dynamical Dorfman Testing with Quarantine
Authors:
Mustafa Doger,
Sennur Ulukus
Abstract:
We consider a dynamical group testing problem with a community structure. With a discrete-time SIR (susceptible, infectious, recovered) model, we use Dorfman's two-step group testing approach to identify infections, and step in whenever necessary to inhibit infection spread via quarantines. We analyze the trade-off between quarantine and test costs as well as disease spread. For the special dynamical i.i.d. model, we show that the optimal first-stage Dorfman group size differs in dynamic and static cases. We compare the performance of the proposed dynamic two-stage Dorfman testing with a state-of-the-art non-adaptive group testing method in dynamic settings.
Submitted 18 January, 2022;
originally announced January 2022.
-
Covert Communications via Adversarial Machine Learning and Reconfigurable Intelligent Surfaces
Authors:
Brian Kim,
Tugba Erpek,
Yalin E. Sagduyu,
Sennur Ulukus
Abstract:
By moving from massive antennas to antenna surfaces for software-defined wireless systems, the reconfigurable intelligent surfaces (RISs) rely on arrays of unit cells to control the scattering and reflection profiles of signals, mitigating the propagation loss and multipath attenuation, and thereby improving the coverage and spectral efficiency. In this paper, covert communication is considered in the presence of the RIS. While there is an ongoing transmission boosted by the RIS, both the intended receiver and an eavesdropper individually try to detect this transmission using their own deep neural network (DNN) classifiers. The RIS interaction vector is designed by balancing two (potentially conflicting) objectives of focusing the transmitted signal to the receiver and keeping the transmitted signal away from the eavesdropper. To boost covert communications, adversarial perturbations are added to signals at the transmitter to fool the eavesdropper's classifier while keeping the effect on the receiver low. Results from different network topologies show that adversarial perturbation and RIS interaction vector can be jointly designed to effectively increase the signal detection accuracy at the receiver while reducing the detection accuracy at the eavesdropper to enable covert communications.
Submitted 21 December, 2021;
originally announced December 2021.
-
Adversarial Attacks against Deep Learning Based Power Control in Wireless Communications
Authors:
Brian Kim,
Yi Shi,
Yalin E. Sagduyu,
Tugba Erpek,
Sennur Ulukus
Abstract:
We consider adversarial machine learning based attacks on power allocation where the base station (BS) allocates its transmit power to multiple orthogonal subcarriers by using a deep neural network (DNN) to serve multiple user equipments (UEs). The DNN that corresponds to a regression model is trained with channel gains as the input and returns transmit powers as the output. While the BS allocates the transmit powers to the UEs to maximize rates for all UEs, there is an adversary that aims to minimize these rates. The adversary may be an external transmitter that aims to manipulate the inputs to the DNN by interfering with the pilot signals that are transmitted to measure the channel gain. Alternatively, the adversary may be a rogue UE that transmits fabricated channel estimates to the BS. In both cases, the adversary carefully crafts adversarial perturbations to manipulate the inputs to the DNN of the BS subject to an upper bound on the strengths of these perturbations. We consider the attacks targeted on a single UE or all UEs. We compare these attacks with a benchmark, where the adversary scales down the input to the DNN. We show that the adversarial attacks are much more effective than the benchmark attack in terms of reducing the rate of communications. We also show that adversarial attacks are robust to the uncertainty at the adversary including the erroneous knowledge of channel gains and the potential errors in exercising the attacks exactly as specified.
Submitted 12 October, 2021; v1 submitted 16 September, 2021;
originally announced September 2021.
-
Adversarial Attacks on Deep Learning Based mmWave Beam Prediction in 5G and Beyond
Authors:
Brian Kim,
Yalin E. Sagduyu,
Tugba Erpek,
Sennur Ulukus
Abstract:
Deep learning provides powerful means to learn from spectrum data and solve complex tasks in 5G and beyond such as beam selection for initial access (IA) in mmWave communications. To establish the IA between the base station (e.g., gNodeB) and user equipment (UE) for directional transmissions, a deep neural network (DNN) can predict the beam that is best slanted to each UE by using the received signal strengths (RSSs) from a subset of possible narrow beams. While improving the latency and reliability of beam selection compared to the conventional IA that sweeps all beams, the DNN itself is susceptible to adversarial attacks. We present an adversarial attack by generating adversarial perturbations to manipulate the over-the-air captured RSSs as the input to the DNN. This attack reduces the IA performance significantly and fools the DNN into choosing the beams with small RSSs compared to jamming attacks with Gaussian or uniform noise.
Submitted 25 March, 2021;
originally announced March 2021.
-
Channel Effects on Surrogate Models of Adversarial Attacks against Wireless Signal Classifiers
Authors:
Brian Kim,
Yalin E. Sagduyu,
Tugba Erpek,
Kemal Davaslioglu,
Sennur Ulukus
Abstract:
We consider a wireless communication system that consists of a background emitter, a transmitter, and an adversary. The transmitter is equipped with a deep neural network (DNN) classifier for detecting the ongoing transmissions from the background emitter and transmits a signal if the spectrum is idle. Concurrently, the adversary trains its own DNN classifier as the surrogate model by observing the spectrum to detect the ongoing transmissions of the background emitter and generate adversarial attacks to fool the transmitter into misclassifying the channel as idle. This surrogate model may differ from the transmitter's classifier significantly because the adversary and the transmitter experience different channels from the background emitter and therefore their classifiers are trained with different distributions of inputs. This system model may represent a setting where the background emitter is a primary user, the transmitter is a secondary user, and the adversary is trying to fool the secondary user to transmit even though the channel is occupied by the primary user. We consider different topologies to investigate how different surrogate models that are trained by the adversary (depending on the differences in channel effects experienced by the adversary) affect the performance of the adversarial attack. The simulation results show that the surrogate models that are trained with different distributions of channel-induced inputs severely limit the attack performance and indicate that the transferability of adversarial attacks is neither readily available nor straightforward to achieve since surrogate models for wireless applications may significantly differ from the target model depending on channel effects.
Submitted 8 March, 2021; v1 submitted 3 December, 2020;
originally announced December 2020.
-
Adversarial Attacks with Multiple Antennas Against Deep Learning-Based Modulation Classifiers
Authors:
Brian Kim,
Yalin E. Sagduyu,
Tugba Erpek,
Kemal Davaslioglu,
Sennur Ulukus
Abstract:
We consider a wireless communication system, where a transmitter sends signals to a receiver with different modulation types while the receiver classifies the modulation types of the received signals using its deep learning-based classifier. Concurrently, an adversary transmits adversarial perturbations using its multiple antennas to fool the classifier into misclassifying the received signals. From the adversarial machine learning perspective, we show how to utilize multiple antennas at the adversary to improve the adversarial (evasion) attack performance. Two main points are considered while exploiting the multiple antennas at the adversary, namely the power allocation among antennas and the utilization of channel diversity. First, we show that multiple independent adversaries, each with a single antenna cannot improve the attack performance compared to a single adversary with multiple antennas using the same total power. Then, we consider various ways to allocate power among multiple antennas at a single adversary such as allocating power to only one antenna, and proportional or inversely proportional to the channel gain. By utilizing channel diversity, we introduce an attack to transmit the adversarial perturbation through the channel with the largest channel gain at the symbol level. We show that this attack reduces the classifier accuracy significantly compared to other attacks under different channel conditions in terms of channel variance and channel correlation across antennas. Also, we show that the attack success improves significantly as the number of antennas increases at the adversary that can better utilize channel diversity to craft adversarial attacks.
Submitted 31 July, 2020;
originally announced July 2020.
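The power-allocation options named in the abstract above (all power on one antenna, proportional to channel gain, inversely proportional to channel gain) can be written down as a small sketch. It assumes real, positive channel gain magnitudes and a fixed total power budget; choosing the largest-gain antenna for the single-antenna rule is an assumption made here for illustration, and `allocate_power` and its rule names are hypothetical.

```python
def allocate_power(gains, total_power, rule):
    """Split total_power across antennas according to a simple allocation rule."""
    if rule == "max_gain":
        # All power on the antenna with the largest channel gain (assumed choice).
        best = max(range(len(gains)), key=lambda i: gains[i])
        return [total_power if i == best else 0.0 for i in range(len(gains))]
    if rule == "proportional":
        weights = list(gains)
    elif rule == "inverse":
        weights = [1.0 / g for g in gains]
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Normalize the weights so the allocations sum to the total power budget.
    scale = total_power / sum(weights)
    return [w * scale for w in weights]


gains = [1.0, 2.0, 4.0]
proportional = allocate_power(gains, 7.0, "proportional")
inverse = allocate_power(gains, 7.0, "inverse")
single = allocate_power(gains, 7.0, "max_gain")
```

Every rule spends the same total power, so any difference in attack performance comes from how the budget is distributed over the channels.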
-
How to Make 5G Communications "Invisible": Adversarial Machine Learning for Wireless Privacy
Authors:
Brian Kim,
Yalin E. Sagduyu,
Kemal Davaslioglu,
Tugba Erpek,
Sennur Ulukus
Abstract:
We consider the problem of hiding wireless communications from an eavesdropper that employs a deep learning (DL) classifier to detect whether any transmission of interest is present or not. There exists one transmitter that transmits to its receiver in the presence of an eavesdropper, while a cooperative jammer (CJ) transmits carefully crafted adversarial perturbations over the air to fool the eavesdropper into classifying the received superposition of signals as noise. The CJ puts an upper bound on the strength of perturbation signal to limit its impact on the bit error rate (BER) at the receiver. We show that this adversarial perturbation causes the eavesdropper to misclassify the received signals as noise with high probability while increasing the BER only slightly. On the other hand, the CJ cannot fool the eavesdropper by simply transmitting Gaussian noise as in conventional jamming and instead needs to craft perturbation signals built by adversarial machine learning to enable covert communications. Our results show that signals with different modulation types and eventually 5G communications can be effectively hidden from an eavesdropper even if it is equipped with a DL classifier to detect transmissions.
Submitted 15 May, 2020;
originally announced May 2020.
-
Channel-Aware Adversarial Attacks Against Deep Learning-Based Wireless Signal Classifiers
Authors:
Brian Kim,
Yalin E. Sagduyu,
Kemal Davaslioglu,
Tugba Erpek,
Sennur Ulukus
Abstract:
This paper presents channel-aware adversarial attacks against deep learning-based wireless signal classifiers. There is a transmitter that transmits signals with different modulation types. A deep neural network is used at each receiver to classify its over-the-air received signals to modulation types. In the meantime, an adversary transmits an adversarial perturbation (subject to a power budget) to fool receivers into making errors in classifying signals that are received as superpositions of transmitted signals and adversarial perturbations. First, these evasion attacks are shown to fail when channels are not considered in designing adversarial perturbations. Then, realistic attacks are presented by considering channel effects from the adversary to each receiver. After showing that a channel-aware attack is selective (i.e., it affects only the receiver whose channel is considered in the perturbation design), a broadcast adversarial attack is presented by crafting a common adversarial perturbation to simultaneously fool classifiers at different receivers. The major vulnerability of modulation classifiers to over-the-air adversarial attacks is shown by accounting for different levels of information available about the channel, the transmitter input, and the classifier model. Finally, a certified defense based on randomized smoothing that augments training data with noise is introduced to make the modulation classifier robust to adversarial perturbations.
Submitted 20 December, 2021; v1 submitted 11 May, 2020;
originally announced May 2020.
-
Over-the-Air Adversarial Attacks on Deep Learning Based Modulation Classifier over Wireless Channels
Authors:
Brian Kim,
Yalin E. Sagduyu,
Kemal Davaslioglu,
Tugba Erpek,
Sennur Ulukus
Abstract:
We consider a wireless communication system that consists of a transmitter, a receiver, and an adversary. The transmitter transmits signals with different modulation types, while the receiver classifies its received signals to modulation types using a deep learning-based classifier. In the meantime, the adversary makes over-the-air transmissions that are received as superimposed with the transmitter's signals to fool the classifier at the receiver into making errors. While this evasion attack has received growing interest recently, the channel effects from the adversary to the receiver have been ignored so far such that the previous attack mechanisms cannot be applied under realistic channel effects. In this paper, we present how to launch a realistic evasion attack by considering channels from the adversary to the receiver. Our results show that modulation classification is vulnerable to an adversarial attack over a wireless channel that is modeled as Rayleigh fading with path loss and shadowing. We present various adversarial attacks with respect to availability of information about channel, transmitter input, and classifier architecture. First, we present two types of adversarial attacks, namely a targeted attack (with minimum power) and non-targeted attack that aims to change the classification to a target label or to any other label other than the true label, respectively. Both are white-box attacks that are transmitter input-specific and use channel information. Then we introduce an algorithm to generate adversarial attacks using limited channel information where the adversary only knows the channel distribution. Finally, we present a black-box universal adversarial perturbation (UAP) attack where the adversary has limited knowledge about both channel and transmitter input.
Submitted 13 February, 2020; v1 submitted 5 February, 2020;
originally announced February 2020.
-
Distributed Gradient Descent with Coded Partial Gradient Computations
Authors:
Emre Ozfatura,
Sennur Ulukus,
Deniz Gunduz
Abstract:
Coded computation techniques provide robustness against straggling servers in distributed computing, with the following limitations: First, they increase decoding complexity. Second, they ignore computations carried out by straggling servers. Third, they are typically designed to recover the full gradient, and thus cannot provide a balance between the accuracy of the gradient and per-iteration completion time. Here we introduce a hybrid approach, called coded partial gradient computation (CPGC), that benefits from the advantages of both coded and uncoded computation schemes, and reduces both the computation time and decoding complexity.
Submitted 22 November, 2018;
originally announced November 2018.
-
Speeding Up Distributed Gradient Descent by Utilizing Non-persistent Stragglers
Authors:
Emre Ozfatura,
Deniz Gunduz,
Sennur Ulukus
Abstract:
Distributed gradient descent (DGD) is an efficient way of implementing gradient descent (GD), especially for large data sets, by dividing the computation tasks into smaller subtasks and assigning to different computing servers (CSs) to be executed in parallel. In standard parallel execution, per-iteration waiting time is limited by the execution time of the straggling servers. Coded DGD techniques have been introduced recently, which can tolerate straggling servers via assigning redundant computation tasks to the CSs. In most of the existing DGD schemes, either with coded computation or coded communication, the non-straggling CSs transmit one message per iteration once they complete all their assigned computation tasks. However, although the straggling servers cannot complete all their assigned tasks, they are often able to complete a certain portion of them. In this paper, we allow multiple transmissions from each CS at each iteration in order to make sure a maximum number of completed computations can be reported to the aggregating server (AS), including the straggling servers. We numerically show that the average completion time per iteration can be reduced significantly by slightly increasing the communication load per server.
Submitted 2 October, 2018; v1 submitted 7 August, 2018;
originally announced August 2018.
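The benefit of letting stragglers report partial work, as described in the abstract above, can be sketched with a toy bookkeeping example. It assumes uncoded redundancy with a cyclic task assignment and servers that process their tasks in order; the assignment, the completion counts, and the helper `covered_tasks` are hypothetical and only illustrate which partial gradients reach the aggregating server.

```python
def covered_tasks(assignments, completed_counts):
    """Union of finished tasks, given how many tasks each server completed in order."""
    done = set()
    for tasks, count in zip(assignments, completed_counts):
        done.update(tasks[:count])
    return done


# Four tasks, four servers, three tasks per server (cyclic, uncoded redundancy).
assignments = [[0, 1, 2], [1, 2, 3], [2, 3, 0], [3, 0, 1]]
# Server 0 finished everything; servers 1 and 3 finished one task; server 2 none.
completed = [3, 1, 0, 1]

# Multiple transmissions per iteration: every completed task is reported.
with_partial = covered_tasks(assignments, completed)

# One message per iteration: only servers that finished all tasks report.
full_only = [c if c == len(t) else 0 for t, c in zip(assignments, completed)]
without_partial = covered_tasks(assignments, full_only)
```

Here the aggregating server recovers all four partial gradients when stragglers report what they finished, but would still be waiting for task 3 under the one-message-per-iteration rule, at the cost of more messages per server.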