-
Diversity Conscious Refined Random Forest
Authors:
Sijan Bhattarai,
Saurav Bhandari,
Girija Bhusal,
Saroj Shakya,
Tapendra Pandey
Abstract:
Random Forest (RF) is a widely used ensemble learning technique known for its robust classification performance across diverse domains. However, it often relies on hundreds of trees and all input features, leading to high inference cost and model redundancy. In this work, our goal is to grow trees dynamically only on informative features and then enforce maximal diversity by clustering and retaining uncorrelated trees. Therefore, we propose a Refined Random Forest Classifier that iteratively refines itself by first removing the least informative features, then analytically determining how many new trees should be grown, and finally applying correlation-based clustering to remove redundant trees. The classification accuracy of our model was compared against the standard RF with the same number of trees. Experiments on eight benchmark datasets, including binary and multiclass datasets, demonstrate that the proposed model achieves improved accuracy compared to standard RF.
Submitted 5 July, 2025; v1 submitted 1 July, 2025;
originally announced July 2025.
-
A Novel Zero-Touch, Zero-Trust, AI/ML Enablement Framework for IoT Network Security
Authors:
Sushil Shakya,
Robert Abbas,
Sasa Maric
Abstract:
The IoT facilitates a connected, intelligent, and sustainable society; therefore, it is imperative to protect the IoT ecosystem. IoT-based 5G and 6G networks will make greater use of machine learning and artificial intelligence (ML/AI) to pave the way for autonomous and collaborative secure IoT networks. Zero-touch, zero-trust IoT security frameworks enabled by AI and ML offer a powerful approach to securing the expanding landscape of Internet of Things (IoT) devices. This paper presents a novel framework that integrates Zero Trust, Zero Touch, and AI/ML for the detection, mitigation, and prevention of DDoS attacks in modern IoT ecosystems. The focus is on the new integrated framework, which establishes zero trust for all IoT traffic, fixed and mobile 5G/6G IoT network traffic, and data security (quarantine-zero touch and dynamic policy enforcement). We perform a comparative analysis of five machine learning models, namely XGBoost, Random Forest, K-Nearest Neighbors, Stochastic Gradient Descent, and Naive Bayes, comparing them on accuracy, precision, recall, F1-score, and ROC-AUC. Results show that the best performance in detecting and mitigating different DDoS vectors comes from the ensemble-based approaches.
Submitted 5 February, 2025;
originally announced February 2025.
-
A Comparative Analysis of Machine Learning Models for DDoS Detection in IoT Networks
Authors:
Sushil Shakya,
Robert Abbas
Abstract:
This paper presents the detection of DDoS attacks in IoT networks using machine learning models. The rapid growth of IoT networks has made them highly susceptible to various forms of cyberattacks, and in many of them security procedures are implemented inconsistently. The paper evaluates the efficacy of different machine learning models, such as XGBoost, K-Nearest Neighbours, Stochastic Gradient Descent, and Naïve Bayes, in distinguishing DDoS attacks from normal network traffic. Each model is evaluated on several performance metrics, such as accuracy, precision, recall, and F1-score, to understand its suitability for real-time detection and response against DDoS threats. This comparative analysis therefore enumerates the unique strengths and weaknesses of each model with respect to dynamic, constantly changing IoT environments. The effectiveness of these models is analyzed, showing how machine learning can greatly enhance IoT security frameworks, offering adaptive, efficient, and reliable DDoS detection capabilities. These findings show the potential of machine learning in addressing the pressing need for robust IoT security solutions that can mitigate modern cyber threats and assure network integrity.
Submitted 8 November, 2024;
originally announced November 2024.
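For readers unfamiliar with the metrics named in the abstract above, the following is a minimal sketch of how accuracy, precision, recall, and F1-score are computed for a binary attack/normal classifier. The function name `binary_metrics` and the label convention (1 = DDoS, 0 = normal) are assumptions made for illustration; they are not taken from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumed convention (not from the paper): 1 = attack, 0 = normal traffic.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

In an intrusion-detection setting, recall is often the metric to watch, since a missed attack (false negative) is usually costlier than a false alarm.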
-
Deep Priors for Video Quality Prediction
Authors:
Siddharath Narayan Shakya,
Parimala Kancharla
Abstract:
In this work, we designed a completely blind video quality assessment algorithm using the deep video prior. This work mainly explores the utility of deep video prior in estimating the visual quality of the video. In our work, we have used a single distorted video and a reference video pair to learn the deep video prior. At inference time, the learned deep prior is used to restore the original videos from the distorted videos. The ability of learned deep video prior to restore the original video from the distorted video is measured to quantify distortion in the video. Our hypothesis is that the learned deep video prior fails in restoring the highly distorted videos. The restoring ability of deep video prior is proportional to the distortion present in the video. Therefore, we propose to use the distance between the distorted video and the restored video as the perceptual quality of the video. Our algorithm is trained using a single video pair and it does not need any labelled data. We show that our proposed algorithm outperforms the existing unsupervised video quality assessment algorithms in terms of LCC and SROCC on a synthetically distorted video quality assessment dataset.
Submitted 5 November, 2024; v1 submitted 29 October, 2024;
originally announced October 2024.
-
From an attention economy to an ecology of attending. A manifesto
Authors:
Gunter Bombaerts,
Tom Hannes,
Martin Adam,
Alessandra Aloisi,
Joel Anderson,
Lawrence Berger,
Stefano Davide Bettera,
Enrico Campo,
Laura Candiotto,
Silvia Caprioglio Panizza,
Yves Citton,
Diego D'Angelo,
Matthew Dennis,
Nathalie Depraz,
Peter Doran,
Wolfgang Drechsler,
Bill Duane,
William Edelglass,
Iris Eisenberger,
Beverley Foulks McGuire,
Antony Fredriksson,
Karamjit S. Gill,
Peter D. Hershock,
Soraj Hongladarom,
Beth Jacobs
, et al. (30 additional authors not shown)
Abstract:
As the signatories of this manifesto, we denounce the attention economy as inhumane and a threat to our sociopolitical and ecological well-being. We endorse policymakers' efforts to address the negative consequences of the attention economy's technology, but add that these approaches are often limited in their criticism of the systemic context of human attention. Starting from Buddhist philosophy, we advocate a broader approach: an ecology of attending, that centers on conceptualizing, designing, and using attention (1) in an embedded way and (2) focused on the alleviating of suffering. With 'embedded' we mean that attention is not a neutral, isolated mechanism but a meaning-engendering part of an 'ecology' of bodily, sociotechnical and moral frameworks. With 'focused on the alleviation of suffering' we explicitly move away from the (often implicit) conception of attention as a tool for gratifying desires.
Submitted 22 October, 2024;
originally announced October 2024.
-
Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet
Authors:
Manish Dhakal,
Arman Chhetri,
Aman Kumar Gupta,
Prabin Lamichhane,
Suraj Pandey,
Subarna Shakya
Abstract:
This paper presents an end-to-end deep learning model for Automatic Speech Recognition (ASR) that transcribes Nepali speech to text. The model was trained and tested on the OpenSLR (audio, text) dataset. Most audio clips in the dataset have silent gaps at both ends, which are clipped during dataset preprocessing for a more uniform mapping of audio frames to their corresponding texts. Mel Frequency Cepstral Coefficients (MFCCs) are used as audio features to feed into the model. The model that pairs a bidirectional LSTM with ResNet and a one-dimensional CNN produces the best results for this dataset out of all the models (neural networks with variations of LSTM, GRU, CNN, and ResNet) that have been trained so far. This novel model uses the Connectionist Temporal Classification (CTC) loss function during training and CTC beam search decoding to predict characters as the most likely sequence of Nepali text. On the test dataset, a character error rate (CER) of 17.06 percent was achieved. The source code is available at: https://github.com/manishdhakal/ASR-Nepali-using-CNN-BiLSTM-ResNet.
Submitted 25 June, 2024;
originally announced June 2024.
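The character error rate (CER) reported in the abstract above is conventionally the character-level edit distance between the reference transcript and the model output, divided by the reference length. The sketch below illustrates that convention only; the function names are hypothetical, and it counts Unicode code points, which is an assumption (the abstract does not say whether characters, code points, or grapheme clusters are the unit for Devanagari text).

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def character_error_rate(reference, hypothesis):
    """CER = edit distance over characters / number of reference characters.

    Assumption: each Unicode code point counts as one character.
    """
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


cer = character_error_rate("kitten", "sitting")  # 3 edits over 6 reference characters
```

Because insertions are counted, CER can exceed 1.0 when the hypothesis is much longer than the reference.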
-
Contextual Spelling Correction with Language Model for Low-resource Setting
Authors:
Nishant Luitel,
Nirajan Bekoju,
Anand Kumar Sah,
Subarna Shakya
Abstract:
The task of Spell Correction (SC) in low-resource languages presents a significant challenge because only a limited corpus of data is available and there are no annotated spelling correction datasets. To tackle these challenges, a small-scale word-based transformer LM is trained to provide the SC model with contextual understanding. Further, probabilistic error rules are extracted from the corpus in an unsupervised way to model the tendency of errors to occur (the error model). The LM and the error model are then combined to build the SC model through the well-known noisy channel framework. The effectiveness of this approach is demonstrated through experiments on the Nepali language, where there is access to just an unprocessed corpus of textual data.
Submitted 28 April, 2024;
originally announced April 2024.
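The noisy channel framework mentioned in the abstract above picks the candidate c that maximizes P(c) · P(w | c), where P(c) comes from a language model and P(w | c) from an error model. The following is a toy sketch of that decision rule only. It is not the authors' system: a fixed table of word probabilities stands in for the contextual transformer LM, and an edit-distance penalty stands in for the learned probabilistic error rules. All names and numbers are hypothetical.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def correct(word, lm_probs, error_rate=0.1):
    """Return argmax over candidates c of log P(c) + log P(word | c).

    lm_probs: candidate -> prior probability (stand-in for a contextual LM).
    Error model (an assumption): P(word | c) is proportional to
    error_rate ** edit_distance(word, c).
    """
    def score(candidate):
        log_prior = math.log(lm_probs[candidate])
        log_channel = edit_distance(word, candidate) * math.log(error_rate)
        return log_prior + log_channel

    return max(lm_probs, key=score)


# Hypothetical priors for three candidate words.
priors = {"there": 0.6, "three": 0.3, "tree": 0.1}
suggestion = correct("thre", priors)
```

In a contextual setting, the prior for each candidate would be recomputed from the surrounding words rather than read from a fixed table, which is what lets the same misspelling resolve differently in different sentences.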
-
Can Perplexity Predict Fine-tuning Performance? An Investigation of Tokenization Effects on Sequential Language Models for Nepali
Authors:
Nishant Luitel,
Nirajan Bekoju,
Anand Kumar Sah,
Subarna Shakya
Abstract:
The impact of subword tokenization on language model performance is well-documented for perplexity, with finer granularity consistently reducing this intrinsic metric. However, research on how different tokenization schemes affect a model's understanding capabilities remains limited, particularly for non-Latin script languages. Addressing this gap, we conducted a comprehensive evaluation of six distinct tokenization strategies by pretraining transformer-based language models for Nepali and evaluating their performance across multiple downstream tasks. While recent prominent models like GPT, RoBERTa, Claude, LLaMA, Mistral, Falcon, and MPT have adopted byte-level BPE tokenization, our findings demonstrate that for Nepali, SentencePiece tokenization consistently yields superior results on understanding-based tasks. Unlike previous studies that primarily focused on BERT-based architectures, our research specifically examines sequential transformer models, providing valuable insights for language model development in low-resource languages and highlighting the importance of tokenization strategy beyond perplexity reduction.
Submitted 9 June, 2025; v1 submitted 28 April, 2024;
originally announced April 2024.
-
Interpreting Indirect Answers to Yes-No Questions in Multiple Languages
Authors:
Zijie Wang,
Md Mosharaf Hossain,
Shivam Mathur,
Terry Cruz Melo,
Kadir Bulut Ozler,
Keun Hee Park,
Jacob Quintero,
MohammadHossein Rezaei,
Shreya Nupur Shakya,
Md Nayem Uddin,
Eduardo Blanco
Abstract:
Yes-no questions expect a yes or no for an answer, but people often skip polar keywords. Instead, they answer with long explanations that must be interpreted. In this paper, we focus on this challenging problem and release new benchmarks in eight languages. We present a distant supervision approach to collect training data. We also demonstrate that direct answers (i.e., with polar keywords) are useful to train models to interpret indirect answers (i.e., without polar keywords). Experimental results demonstrate that monolingual fine-tuning is beneficial if training data can be obtained via distant supervision for the language of interest (5 languages). Additionally, we show that cross-lingual fine-tuning is always beneficial (8 languages).
Submitted 20 October, 2023;
originally announced October 2023.
-
Analysis, Detection, and Classification of Android Malware using System Calls
Authors:
Shubham Shakya,
Mayank Dave
Abstract:
With the increasing popularity of Android over the last decade, the platform has become popular among users as well as attackers. The vast number of Android users attracts attackers to the platform. Because the variety and attack techniques of Android malware evolve continuously, detection methods need to be updated too. Most research is based on static features, and very little focuses on dynamic features. In this paper, we fill this gap in the literature by detecting Android malware using system calls. We run the malicious app in a monitored and controlled environment using an emulator. Simulated events are triggered at runtime to activate the app's hostile behavior. Logs collected during the app's runtime are analyzed and fed to different machine learning models for detection and family classification of malware. The results indicate that K-Nearest Neighbor and Decision Tree gave the highest accuracy in malware detection and family classification, respectively.
Submitted 12 August, 2022;
originally announced August 2022.
-
Age Range Estimation using MTCNN and VGG-Face Model
Authors:
Dipesh Gyawali,
Prashanga Pokharel,
Ashutosh Chauhan,
Subodh Chandra Shakya
Abstract:
The Convolutional Neural Network has amazed us with its usage in several applications. Age range estimation using CNNs is emerging because of its applications in a myriad of areas, which makes it an active area of research aimed at improving estimation accuracy. A deep CNN model is used to identify people's age range in our proposed work. First, we extracted only face images from the image dataset using MTCNN to remove unnecessary features other than the face from each image. Second, we used a random crop technique for data augmentation to improve model performance. We used transfer learning in our research: a pretrained face recognition model, VGG-Face, is used to build our age range model, whose performance is evaluated on the Adience Benchmark to confirm the efficacy of our work. Performance on the test set exceeded the existing state of the art by substantial margins.
Submitted 17 April, 2021;
originally announced April 2021.
-
A Comparison of Semantic Similarity Methods for Maximum Human Interpretability
Authors:
Pinky Sitikhu,
Kritish Pahi,
Pujan Thapa,
Subarna Shakya
Abstract:
The inclusion of semantic information in any similarity measure improves the efficiency of the measure and provides human-interpretable results for further analysis. A similarity calculation method that focuses only on features related to the text's words will give less accurate results. This paper presents three different methods that not only focus on the text's words but also incorporate semantic information of the texts in their feature vectors and compute semantic similarities. These methods, which draw on corpus-based and knowledge-based approaches, are: cosine similarity using tf-idf vectors, cosine similarity using word embeddings, and soft cosine similarity using word embeddings. Among these three, cosine similarity using tf-idf vectors performed best in finding similarities between short news texts. The similar texts given by the method are easy to interpret and can be used directly in other information retrieval applications.
Submitted 30 October, 2019; v1 submitted 20 October, 2019;
originally announced October 2019.
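As an illustration of the first method named in the abstract above, cosine similarity over tf-idf vectors, here is a minimal pure-Python sketch. The whitespace tokenizer, the smoothed idf formula, and the function names are assumptions chosen for brevity; the paper's preprocessing and weighting scheme may differ.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse tf-idf vector (dict: term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed idf (assumption) keeps weights positive for common terms.
        vectors.append({
            term: (count / len(tokens))
                  * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = ["the cat sat", "the cat sat", "dogs bark loudly"]
vecs = tfidf_vectors(docs)
same = cosine_similarity(vecs[0], vecs[1])       # identical texts
different = cosine_similarity(vecs[0], vecs[2])  # no shared terms
```

Identical texts score close to 1 and texts with no shared terms score 0, which is part of what makes this measure easy to interpret: each shared term contributes a visible, positive amount to the score.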
-
Fine-grained Sentiment Classification using BERT
Authors:
Manish Munikar,
Sushil Shakya,
Aakash Shrestha
Abstract:
Sentiment classification is an important process in understanding people's perception towards a product, service, or topic. Many natural language processing models have been proposed to solve the sentiment classification problem. However, most of them have focused on binary sentiment classification. In this paper, we use a promising deep learning model called BERT to solve the fine-grained sentiment classification task. Experiments show that our model outperforms other popular models for this task without sophisticated architecture. We also demonstrate the effectiveness of transfer learning in natural language processing in the process.
Submitted 4 October, 2019;
originally announced October 2019.
-
Quanvolutional Neural Networks: Powering Image Recognition with Quantum Circuits
Authors:
Maxwell Henderson,
Samriddhi Shakya,
Shashindra Pradhan,
Tristan Cook
Abstract:
Convolutional neural networks (CNNs) have rapidly risen in popularity for many machine learning applications, particularly in the field of image recognition. Much of the benefit generated from these networks comes from their ability to extract features from the data in a hierarchical manner. These features are extracted using various transformational layers, notably the convolutional layer which gives the model its name. In this work, we introduce a new type of transformational layer called a quantum convolution, or quanvolutional layer. Quanvolutional layers operate on input data by locally transforming the data using a number of random quantum circuits, in a way that is similar to the transformations performed by random convolutional filter layers. Provided these quantum transformations produce meaningful features for classification purposes, then the overall algorithm could be quite useful for near term quantum computing, because it requires small quantum circuits with little to no error correction. In this work, we empirically evaluated the potential benefit of these quantum transformations by comparing three types of models built on the MNIST dataset: CNNs, quantum convolutional neural networks (QNNs), and CNNs with additional non-linearities introduced. Our results showed that the QNN models had both higher test set accuracy as well as faster training compared to the purely classical CNNs.
Submitted 9 April, 2019;
originally announced April 2019.
-
Word Sense Disambiguation using WSD specific Wordnet of Polysemy Words
Authors:
Udaya Raj Dhungana,
Subarna Shakya,
Kabita Baral,
Bharat Sharma
Abstract:
This paper presents a new model of WordNet that is used to disambiguate the correct sense of polysemy word based on the clue words. The related words for each sense of a polysemy word as well as single sense word are referred to as the clue words. The conventional WordNet organizes nouns, verbs, adjectives and adverbs together into sets of synonyms called synsets each expressing a different concept. In contrast to the structure of WordNet, we developed a new model of WordNet that organizes the different senses of polysemy words as well as the single sense words based on the clue words. These clue words for each sense of a polysemy word as well as for single sense word are used to disambiguate the correct meaning of the polysemy word in the given context using knowledge based Word Sense Disambiguation (WSD) algorithms. The clue word can be a noun, verb, adjective or adverb.
Submitted 10 September, 2014;
originally announced September 2014.