-
A Cascaded Dilated Convolution Approach for Mpox Lesion Classification
Authors:
Ayush Deshmukh
Abstract:
The global outbreak of the Mpox virus, classified as a Public Health Emergency of International Concern (PHEIC) by the World Health Organization, presents significant diagnostic challenges due to its visual similarity to other skin lesion diseases. Traditional diagnostic methods for Mpox, which rely on clinical symptoms and laboratory tests, are slow and labor intensive. Deep learning-based approaches for skin lesion classification offer a promising alternative. However, developing a model that balances efficiency with accuracy is crucial to ensure reliable and timely diagnosis without compromising performance. This study introduces the Cascaded Atrous Group Attention (CAGA) framework to address these challenges, combining the Cascaded Atrous Attention module and the Cascaded Group Attention mechanism. The Cascaded Atrous Attention module utilizes dilated convolutions and cascades the outputs to enhance multi-scale representation. This is integrated into the Cascaded Group Attention mechanism, which reduces redundancy in Multi-Head Self-Attention. By integrating the Cascaded Atrous Group Attention module with EfficientViT-L1 as the backbone architecture, this approach achieves state-of-the-art performance, reaching an accuracy of 98% on the Mpox Close Skin Image (MCSI) dataset while reducing model parameters by 37.5% compared to the original EfficientViT-L1. The model's robustness is demonstrated through extensive validation on two additional benchmark datasets, where it consistently outperforms existing approaches.
Submitted 13 January, 2025; v1 submitted 13 December, 2024;
originally announced December 2024.
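As a rough illustration of why cascading dilated (atrous) convolutions broadens multi-scale context, the sketch below computes the receptive field of a stack of stride-1 convolutions. The kernel size and dilation rates are assumptions chosen for illustration; the abstract does not state the values used in CAGA, and the function names are hypothetical.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def cascaded_receptive_field(kernel_size: int, dilations: list[int]) -> int:
    """Receptive field of stride-1 dilated convolutions applied in sequence."""
    field = 1
    for dilation in dilations:
        # Each stride-1 layer widens the field by (effective kernel size - 1).
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


# Hypothetical dilation rates, not taken from the paper.
print(cascaded_receptive_field(3, [1, 2, 3]))  # 13
```

Three 3-wide kernels without dilation would cover only 7 positions, so the same parameter count reaches a wider context when the dilation grows along the cascade.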
-
A comparative analysis of SRGAN models
Authors:
Fatemeh Rezapoor Nikroo,
Ajinkya Deshmukh,
Anantha Sharma,
Adrian Tam,
Kaarthik Kumar,
Cleo Norris,
Aditya Dangi
Abstract:
In this study, we evaluate the performance of multiple state-of-the-art SRGAN (Super Resolution Generative Adversarial Network) models, ESRGAN, Real-ESRGAN and EDSR, on a benchmark dataset of real-world images that are degraded using a pipeline. Our results show that some models appear to significantly increase the resolution of the input images while preserving their visual quality, as assessed using the Tesseract OCR engine. We observe that the EDSR-BASE model from Hugging Face outperforms the remaining candidate models in terms of both quantitative metrics and subjective visual quality assessments, with the least compute overhead. Specifically, EDSR generates images with higher peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) values, and these images return high-quality OCR results with the Tesseract OCR engine. These findings suggest that EDSR is a robust and effective approach for single-image super-resolution and may be particularly well suited to applications where high visual fidelity and optimized compute are critical.
Submitted 19 July, 2023; v1 submitted 18 July, 2023;
originally announced July 2023.
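For readers unfamiliar with the PSNR metric mentioned in the abstract above, a minimal sketch of the standard definition follows. It works on flat sequences of pixel values and assumes an 8-bit peak of 255; it is not the authors' evaluation code, and the function name is hypothetical.

```python
import math


def psnr(reference: list[float], restored: list[float], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel positions.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no noise to measure
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored image is closer to the reference; SSIM, the other metric named in the abstract, is a separate structural measure and is not covered by this sketch.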
-
Active learning with binary models for real time data labelling
Authors:
Ankush Deshmukh,
Bhargava B C,
A V Narasimhadhan
Abstract:
Machine learning (ML) and deep learning (DL) tasks depend primarily on data. Most ML and DL applications involve supervised learning, which requires labelled data. In the early days of ML, lack of data was a problem; now we are in a new era of big data. Supervised ML algorithms require data that is labelled and of good quality. Labelling requires a large investment of money and time: it may need a skilled person who charges a high fee, as in the medical field, or, when the data is in bulk, many people assigned to label it. The amount of data sufficient for training needs to be known, so that money and time are not wasted labelling the whole data set. This paper proposes a strategy in which a model helps label the data alongside an oracle in real time. With balancing on, the model's contribution to labelling is 89 and 81.1 for the furniture type and Intel scene image data sets, respectively. With balancing off, the model's contribution is 83.47 and 78.71 for the furniture type and flower data sets, respectively.
Submitted 3 May, 2022; v1 submitted 28 February, 2022;
originally announced March 2022.
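One way to picture a model labelling data alongside an oracle is a confidence split: the model labels samples it is confident about, and the rest go to the oracle. The sketch below is an assumption-laden illustration, not the paper's method. The abstract does not define "model contribution", which is read here as the percentage of samples labelled by the model, and the 0.9 threshold and function names are hypothetical.

```python
def split_by_confidence(confidences: list[float], threshold: float = 0.9):
    """Return (indices the model labels, indices sent to the oracle)."""
    model_idx = [i for i, c in enumerate(confidences) if c >= threshold]
    oracle_idx = [i for i, c in enumerate(confidences) if c < threshold]
    return model_idx, oracle_idx


def model_contribution(confidences: list[float], threshold: float = 0.9) -> float:
    """Percentage of samples labelled by the model rather than the oracle."""
    model_idx, _ = split_by_confidence(confidences, threshold)
    return 100.0 * len(model_idx) / len(confidences)


# Hypothetical per-sample confidences from a binary model.
scores = [0.97, 0.55, 0.92, 0.81]
print(split_by_confidence(scores))  # ([0, 2], [1, 3])
print(model_contribution(scores))   # 50.0
```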
-
Modulation and signal class labelling using active learning and classification using machine learning
Authors:
Bhargava B C,
Ankush Deshmukh,
A V Narasimhadhan
Abstract:
Supervised learning in machine learning (ML) requires a labelled data set. Further, real-time data classification requires a readily available methodology for labelling. Wireless modulation and signal classification find application in many areas, such as military, commercial and electronic reconnaissance, and cognitive radio. This paper aims to solve the problem of real-time wireless modulation and signal class labelling with an active learning framework. Modulation and signal classification is then performed with machine learning algorithms such as KNN, SVM and Naive Bayes. Active learning helps in labelling the data points belonging to different classes with the least number of training samples. The active learning algorithm obtains an accuracy of 86 percent for the signal with an SNR of 18 dB. Further, the KNN-based model for modulation and signal classification performs well over a range of SNRs, obtaining an accuracy of 99.8 percent for the 18 dB signal. The novelty of this work lies in applying active learning to wireless modulation and signal class labelling. Both modulation and signal classes are labelled at the same time with the help of couplet formation from the data samples.
Submitted 23 February, 2022;
originally announced February 2022.
-
Offline RL With Resource Constrained Online Deployment
Authors:
Jayanth Reddy Regatti,
Aniket Anand Deshmukh,
Frank Cheng,
Young Hun Jung,
Abhishek Gupta,
Urun Dogan
Abstract:
Offline reinforcement learning is used to train policies in scenarios where real-time access to the environment is expensive or impossible. As a natural consequence of these harsh conditions, an agent may lack the resources to fully observe the online environment before taking an action. We dub this situation the resource-constrained setting. This leads to situations where the offline dataset (available for training) can contain fully processed features (using powerful language models, image models, complex sensors, etc.) which are not available when actions are actually taken online. This disconnect leads to an interesting and unexplored problem in offline RL: Is it possible to use a richly processed offline dataset to train a policy which has access to fewer features in the online environment? In this work, we introduce and formalize this novel resource-constrained problem setting. We highlight the performance gap between policies trained using the full offline dataset and policies trained using limited features. We address this performance gap with a policy transfer algorithm which first trains a teacher agent using the offline dataset where features are fully available, and then transfers this knowledge to a student agent that only uses the resource-constrained features. To better capture the challenge of this setting, we propose a data collection procedure: Resource Constrained-Datasets for RL (RC-D4RL). We evaluate our transfer algorithm on RC-D4RL and the popular D4RL benchmarks and observe consistent improvement over the baseline (TD3+BC without transfer). The code for the experiments is available at https://github.com/JayanthRR/RC-OfflineRL.
Submitted 7 December, 2021; v1 submitted 6 October, 2021;
originally announced October 2021.
-
Robust Mean Estimation in High Dimensions via $\ell_0$ Minimization
Authors:
Jing Liu,
Aditya Deshmukh,
Venugopal V. Veeravalli
Abstract:
We study the robust mean estimation problem in high dimensions, where an $\alpha<0.5$ fraction of the data points can be arbitrarily corrupted. Motivated by compressive sensing, we formulate the robust mean estimation problem as the minimization of the $\ell_0$-`norm' of the outlier indicator vector, under second moment constraints on the inlier data points. We prove that the global minimum of this objective is order optimal for the robust mean estimation problem, and we propose a general framework for minimizing the objective. We further leverage the $\ell_1$ and $\ell_p$ $(0<p<1)$ minimization techniques in compressive sensing to provide computationally tractable solutions to the $\ell_0$ minimization problem. Both synthetic and real data experiments demonstrate that the proposed algorithms significantly outperform state-of-the-art robust mean estimation methods.
Submitted 20 August, 2020;
originally announced August 2020.
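To make the shape of such a problem concrete, one plausible way to write an $\ell_0$ objective over an outlier indicator vector with a second moment constraint on the inliers is sketched below. All notation is an assumption for illustration and is not taken from the paper: $y_1,\dots,y_n$ are the data points, $h_i = 1$ marks point $i$ as an outlier, $\mu$ is the mean estimate, $\sigma^2$ bounds the inlier variance, and $c$ is a constant.

```latex
\min_{\mu,\; h \in \{0,1\}^n} \; \|h\|_0
\quad \text{subject to} \quad
\lambda_{\max}\!\left( \sum_{i=1}^{n} (1 - h_i)\,(y_i - \mu)(y_i - \mu)^{\top} \right) \le c\, n\, \sigma^2
```

Under this reading, replacing $\|h\|_0$ with $\|h\|_1$ or $\|h\|_p^p$ for $0<p<1$ and letting $h_i$ range over $[0,1]$ would give the kind of relaxation the abstract describes as computationally tractable.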
-
Emo-CNN for Perceiving Stress from Audio Signals: A Brain Chemistry Approach
Authors:
Anup Anand Deshmukh,
Catherine Soladie,
Renaud Seguier
Abstract:
Emotion plays a key role in many applications, such as healthcare, where it is used to assess patients' emotional behavior. Certain emotions are given more importance because they are more effective in understanding human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is defining the very meaning of stress and being able to categorize it precisely. Supervised machine learning models, including state-of-the-art deep learning classification methods, rely on the availability of clean and labelled data. One of the problems in affective computing and emotion detection is the limited amount of annotated stress data. The existing labelled stress emotion datasets are highly subject to the perception of the annotator.
We address the first issue of feature selection by using traditional MFCC features in a Convolutional Neural Network. Our experiments show that Emo-CNN consistently and significantly outperforms popular existing methods over multiple datasets. It achieves 90.2% categorical accuracy on the Emo-DB dataset. To tackle the second and more significant problem of subjectivity in stress labels, we use Lovheim's cube, which is a 3-dimensional projection of emotions. The cube aims to explain the relationship between neurotransmitters and the positions of emotions in 3D space. The learnt emotion representations from the Emo-CNN are mapped to the cube using three-component PCA (Principal Component Analysis), and this mapping is then used to model human stress. The proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim's cube. We believe that this work is the first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions.
Submitted 7 January, 2020;
originally announced January 2020.