-
Energy-efficient Federated Learning with Dynamic Model Size Allocation
Authors:
M S Chaitanya Kumar,
Sai Satya Narayana J,
Yunkai Bao,
Xin Wang,
Steve Drew
Abstract:
Federated Learning (FL) presents a paradigm shift towards distributed model training across isolated data repositories or edge devices without explicit data sharing. Despite its advantages, FL is inherently less efficient than centralized training models, leading to increased energy consumption and, consequently, higher carbon emissions. In this paper, we propose CAMA, a carbon-aware FL framework that promotes operation on excess renewable energy and spare computing capacity, aiming to minimize operational carbon emissions. CAMA introduces a dynamic model adaptation strategy which adapts model sizes based on the availability of energy and computing resources. Ordered dropout is integrated to enable aggregation across varying model sizes. Empirical evaluations on real-world energy and load traces demonstrate that our method achieves faster convergence and ensures equitable client participation, while scaling efficiently to handle large numbers of clients. The source code of CAMA is available at https://github.com/denoslab/CAMA.
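The abstract above says ordered dropout lets the server aggregate models of different sizes. The following is a minimal sketch of that general idea, not the CAMA implementation: it assumes each layer is flattened to a one-dimensional list, that a smaller client model is always a leading slice (prefix) of the full model, and that each weight is averaged over the clients that actually hold it. The function names `ordered_prefix` and `aggregate_prefixes` are hypothetical.

```python
from typing import List, Sequence


def ordered_prefix(weights: Sequence[float], keep_ratio: float) -> List[float]:
    """Keep the leading fraction of the weights (at least one), as in ordered dropout."""
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    keep = max(1, int(len(weights) * keep_ratio))
    return list(weights[:keep])


def aggregate_prefixes(
    global_weights: Sequence[float],
    client_updates: Sequence[Sequence[float]],
) -> List[float]:
    """Average each weight over the clients whose sub-model contains it.

    Assumption: every client update is a prefix of the global weight vector.
    Positions that no client trained keep their previous global value.
    """
    merged = list(global_weights)
    for i in range(len(merged)):
        holders = [update[i] for update in client_updates if i < len(update)]
        if holders:
            merged[i] = sum(holders) / len(holders)
    return merged


# Usage: one client trained a half-size model, another the full model.
global_model = [0.0, 0.0, 0.0, 0.0]
small_update = [1.0, 1.0]            # holds only the first two weights
full_update = [3.0, 3.0, 3.0, 3.0]   # holds all four weights
new_global = aggregate_prefixes(global_model, [small_update, full_update])
```

In this sketch the first two weights are averaged over both clients, while the last two come only from the client that trained the full model.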
Submitted 23 November, 2024;
originally announced November 2024.
-
Textless NLP -- Zero Resource Challenge with Low Resource Compute
Authors:
Krithiga Ramadass,
Abrit Pal Singh,
Srihari J,
Sheetal Kalyani
Abstract:
This work addresses the persistent challenges of substantial training time and GPU resource requirements even when training lightweight encoder-vocoder models for Textless NLP. We reduce training steps significantly while improving performance by a) leveraging learning rate schedulers for efficient and faster convergence, b) optimizing hop length, and c) tuning the interpolation scale factors for better audio quality. Additionally, we explore the latent space representation for Indian languages such as Tamil and Bengali for the acoustic unit discovery and voice conversion tasks. Our approach leverages a quantized encoder architecture in conjunction with a vocoder that utilizes the proposed mixture of optimized hop length, tuned interpolation scale factors, and a cyclic learning rate scheduler. We obtain consistently good results across English, Tamil and Bengali datasets. The proposed method excels in capturing complex linguistic patterns, resulting in clear reconstructed audio during voice conversion with significantly reduced training time.
Submitted 24 September, 2024;
originally announced September 2024.
-
PrivFED -- A Framework for Privacy-Preserving Federated Learning in Enhanced Breast Cancer Diagnosis
Authors:
Maithili Jha,
S. Maitri,
M. Lohithdakshan,
Shiny Duela J,
K. Raja
Abstract:
In the day-to-day operations of healthcare institutions, a multitude of Personally Identifiable Information (PII) data exchanges occur, exposing the data to a spectrum of cybersecurity threats. This study introduces a federated learning framework, trained on the Wisconsin dataset, to mitigate challenges such as data scarcity and imbalance. Techniques like the Synthetic Minority Over-sampling Technique (SMOTE) are incorporated to bolster robustness, while isolation forests are employed to fortify the model against outliers. CatBoost serves as the classification tool across all devices. The identification of optimal features for heightened accuracy is pursued through Principal Component Analysis (PCA), accentuating the significance of hyperparameter tuning, as underscored in a comparative analysis. The model exhibits an average accuracy of 99.95% on edge devices and 98% on the central server.
Submitted 13 May, 2024;
originally announced May 2024.
-
Compressing Vision Transformers for Low-Resource Visual Learning
Authors:
Eric Youn,
Sai Mitheran J,
Sanjana Prabhu,
Siyuan Chen
Abstract:
Vision transformer (ViT) and its variants have swept through visual learning leaderboards and offer state-of-the-art accuracy in tasks such as image classification, object detection, and semantic segmentation by attending to different parts of the visual input and capturing long-range spatial dependencies. However, these models are large and computation-heavy. For instance, the recently proposed ViT-B model has 86M parameters, making it impractical for deployment on resource-constrained devices. As a result, their deployment in mobile and edge scenarios is limited. In our work, we aim to take a step toward bringing vision transformers to the edge by utilizing popular model compression techniques such as distillation, pruning, and quantization.
Our chosen application environment is an unmanned aerial vehicle (UAV) that is battery-powered and memory-constrained, carrying a single-board computer on the scale of an NVIDIA Jetson Nano with 4GB of RAM. On the other hand, the UAV requires high accuracy close to that of state-of-the-art ViTs to ensure safe object avoidance in autonomous navigation, or correct localization of humans in search-and-rescue. Inference latency should also be minimized given the application requirements. Hence, our target is to enable rapid inference of a vision transformer on an NVIDIA Jetson Nano (4GB) with minimal accuracy loss. This allows us to deploy ViTs on resource-constrained devices, opening up new possibilities in surveillance, environmental monitoring, etc. Our implementation is made available at https://github.com/chensy7/efficient-vit.
Submitted 5 September, 2023;
originally announced September 2023.
-
Investigation of Speaker-adaptation methods in Transformer based ASR
Authors:
Vishwas M. Shetty,
Metilda Sagaya Mary N J,
S. Umesh
Abstract:
End-to-end models are fast replacing the conventional hybrid models in automatic speech recognition. The Transformer, a sequence-to-sequence model based on self-attention that is popularly used in machine translation tasks, has given promising results when used for automatic speech recognition. This paper explores different ways of incorporating speaker information at the encoder input while training a transformer-based model to improve its speech recognition performance. We present speaker information in the form of speaker embeddings for each of the speakers. We experiment using two types of speaker embeddings: x-vectors and novel s-vectors proposed in our previous work. We report results on two datasets: a) the NPTEL lecture database and b) the Librispeech 500-hour split. NPTEL is an open-source e-learning portal providing lectures from top Indian universities. We obtain improvements in the word error rate over the baseline through our approach of integrating speaker embeddings into the model.
Submitted 17 November, 2021; v1 submitted 7 August, 2020;
originally announced August 2020.
-
Boundary-type Sets of Strong Product of Directed Graphs
Authors:
Prasanth G. Narasimha-Shenoi,
Bijo S Anand,
Mary Shalet T J
Abstract:
Let $D=(V,E)$ be a strongly connected digraph and let $u,v\in V(D)$. The maximum distance $md(u,v)$ is defined as $md(u,v)=\max\{\overrightarrow{d}(u,v), \overrightarrow{d}(v,u)\}$, where $\overrightarrow{d}(u,v)$ denotes the length of a shortest directed $u$-$v$ path in $D$. This is a metric. The boundary, contour, eccentric and peripheral sets of a strong digraph $D$ with respect to this metric have been defined, and these metrically defined sets of a large strong digraph $D$ have been investigated in terms of the factors in its prime factor decomposition with respect to the Cartesian product. In this paper we investigate these boundary-type sets of a strong digraph $D$ in terms of the factors in its prime factor decomposition with respect to the strong product.
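The maximum distance defined in the abstract above can be computed directly on a small example. The sketch below assumes an unweighted digraph stored as an adjacency dictionary and uses breadth-first search for the directed shortest-path length; the names `directed_distance` and `maximum_distance` are illustrative and do not come from the paper.

```python
from collections import deque
from typing import Dict, Hashable, List

Digraph = Dict[Hashable, List[Hashable]]


def directed_distance(graph: Digraph, source: Hashable, target: Hashable) -> int:
    """Length of a shortest directed source-target path, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return dist[node]
        for nxt in graph.get(node, []):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    # In a strongly connected digraph every target is reachable.
    raise ValueError("target not reachable; digraph is not strongly connected")


def maximum_distance(graph: Digraph, u: Hashable, v: Hashable) -> int:
    """md(u, v) = max(d(u, v), d(v, u)), which is symmetric by construction."""
    return max(directed_distance(graph, u, v), directed_distance(graph, v, u))


# Usage: the directed 3-cycle a -> b -> c -> a is strongly connected.
cycle = {"a": ["b"], "b": ["c"], "c": ["a"]}
md_ab = maximum_distance(cycle, "a", "b")
```

On the directed 3-cycle, the path from `a` to `b` has length 1 but the return path has length 2, so taking the maximum of the two directions is what makes the measure symmetric.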
Submitted 9 November, 2019;
originally announced November 2019.
-
Directed graphs and its Boundary Vertices
Authors:
Manoj Changat,
Prasanth G. Narasimha-Shenoi,
Mary Shallet T. J,
Ram Kumar
Abstract:
Suppose that $D=(V,E)$ is a strongly connected digraph. Let $u,v\in V(D)$. The maximum distance $md(u,v)$ is defined as $md(u,v)=\max\{\overrightarrow{d}(u,v), \overrightarrow{d}(v,u)\}$, where $\overrightarrow{d}(u,v)$ denotes the length of a shortest directed $u$-$v$ path in $D$. This is a metric. The boundary, contour, eccentric and peripheral sets of a strong digraph $D$ are defined with respect to this metric. The main aim of this paper is to identify these metrically defined sets of a large strong digraph $D$ in terms of its prime factor decomposition with respect to the Cartesian product.
Submitted 10 September, 2016;
originally announced September 2016.