-
Can maiBERT Speak for Maithili?
Authors:
Sumit Yadav,
Raju Kumar Yadav,
Utsav Maskey,
Gautam Siddharth Kashyap,
Md Azizul Hoque,
Ganesh Gautam
Abstract:
Natural Language Understanding (NLU) for low-resource languages remains a major challenge in NLP due to the scarcity of high-quality data and language-specific models. Maithili, despite being spoken by millions, lacks adequate computational resources, limiting its inclusion in digital and AI-driven applications. To address this gap, we introduce maiBERT, a BERT-based language model pre-trained specifically for Maithili using the Masked Language Modeling (MLM) technique. Our model is trained on a newly constructed Maithili corpus and evaluated through a news classification task. In our experiments, maiBERT achieved an accuracy of 87.02%, outperforming existing regional models like NepBERTa and HindiBERT, with a 0.13% overall accuracy gain and a 5-7% improvement across various classes. We have open-sourced maiBERT on Hugging Face, enabling further fine-tuning for downstream tasks such as sentiment analysis and Named Entity Recognition (NER).
Submitted 22 September, 2025; v1 submitted 18 September, 2025;
originally announced September 2025.
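For readers unfamiliar with Masked Language Modeling, the following is a minimal sketch of the masking step as described in the original BERT recipe (select about 15% of tokens; of those, replace 80% with a mask token, 10% with a random token, and leave 10% unchanged). Whether maiBERT uses exactly these ratios is an assumption; the abstract only states that MLM was used. The function name `mask_tokens`, the placeholder tokens, and the `select_prob` parameter are hypothetical.

```python
import random
from typing import List, Optional, Tuple

MASK = "[MASK]"


def mask_tokens(
    tokens: List[str],
    vocab: List[str],
    rng: random.Random,
    select_prob: float = 0.15,  # assumed: the ratio from the original BERT recipe
) -> Tuple[List[str], List[Optional[str]]]:
    """Return (corrupted inputs, labels); labels are None where no prediction is required."""
    inputs = list(tokens)
    labels: List[Optional[str]] = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() >= select_prob:
            continue  # token not selected for prediction
        labels[i] = tok  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK  # 80%: replace with the mask token
        elif roll < 0.9:
            inputs[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return inputs, labels


# Placeholder tokens stand in for a tokenized Maithili sentence.
TOKENS = ["w1", "w2", "w3", "w4", "w5", "w6"]
VOCAB = ["w1", "w2", "w3", "w4", "w5", "w6", "w7"]
corrupted, targets = mask_tokens(TOKENS, VOCAB, random.Random(0))
```

The pre-training loss would then be computed only at positions where the label is not `None`.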
-
Pruning Literals for Highly Efficient Explainability at Word Level
Authors:
Rohan Kumar Yadav,
Bimal Bhattarai,
Abhik Jana,
Lei Jiao,
Seid Muhie Yimam
Abstract:
Designing explainable models has become crucial for Natural Language Processing (NLP), since most state-of-the-art machine learning models provide only a limited explanation for their predictions. Among explainable models, the Tsetlin Machine (TM) is promising because it can provide word-level explanations using propositional logic. However, concerns arise over the elaborate combinations of literals (propositional logic) in a clause, which make the model difficult for humans to comprehend despite its transparent learning process. In this paper, we design a post-hoc pruning of clauses that eliminates the randomly placed literals in a clause, thereby making the model more efficiently interpretable than the vanilla TM. Experiments on the publicly available YELP-HAT dataset demonstrate that the attention map of the proposed pruned TM aligns more closely with the human attention map than that of the vanilla TM. In addition, its pairwise similarity measure surpasses that of attention map-based neural network models. In terms of accuracy, the proposed pruning method does not degrade accuracy significantly; rather, it improves performance by 4% to 9% on some test data.
Submitted 7 November, 2024;
originally announced November 2024.
-
Enhancing Interpretable Clauses Semantically using Pretrained Word Representation
Authors:
Rohan Kumar Yadav,
Lei Jiao,
Ole-Christoffer Granmo,
Morten Goodwin
Abstract:
Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on propositional logic, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, including sentiment analysis, text classification, and Word Sense Disambiguation. To obtain human-level interpretability, legacy TM employs Boolean input features such as bag-of-words (BOW). However, the BOW representation makes it difficult to use any pre-trained information, for instance, word2vec and GloVe word representations. This restriction has constrained the performance of TM compared to deep neural networks (DNNs) in NLP. To reduce the performance gap, in this paper, we propose a novel way of using pre-trained word representations for TM. The approach significantly enhances the performance and interpretability of TM. We achieve this by extracting semantically related words from pre-trained word representations as input features to the TM. Our experiments show that the accuracy of the proposed approach is significantly higher than the previous BOW-based TM, reaching the level of DNN-based models.
Submitted 10 September, 2021; v1 submitted 14 April, 2021;
originally announced April 2021.
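The idea of feeding semantically related words to a Boolean-input model can be illustrated with a small sketch. This is not the authors' procedure: the toy two-dimensional vectors, the cosine-similarity criterion, and the `k` and `min_sim` parameters are all assumptions chosen for illustration. Each document token switches on its own feature plus the features of its nearest neighbours in the embedding space.

```python
import math
from typing import List

# Toy stand-in for pre-trained vectors such as word2vec or GloVe (values are made up).
EMBEDDINGS = {
    "good": [0.9, 0.1],
    "great": [0.85, 0.2],
    "bad": [-0.9, 0.1],
    "awful": [-0.8, 0.25],
    "movie": [0.0, 1.0],
}


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def related_words(word: str, k: int = 1, min_sim: float = 0.9) -> List[str]:
    """Return up to k neighbours of `word` whose similarity is at least min_sim."""
    if word not in EMBEDDINGS:
        return []
    scored = [
        (cosine(EMBEDDINGS[word], vec), other)
        for other, vec in EMBEDDINGS.items()
        if other != word
    ]
    scored.sort(reverse=True)
    return [other for sim, other in scored[:k] if sim >= min_sim]


def boolean_features(tokens: List[str], vocab: List[str]) -> List[int]:
    """Bag-of-words vector where related words are switched on as well."""
    active = set(tokens)
    for tok in tokens:
        active.update(related_words(tok))
    return [1 if word in active else 0 for word in vocab]


VOCAB = sorted(EMBEDDINGS)  # ['awful', 'bad', 'good', 'great', 'movie']
features = boolean_features(["good", "movie"], VOCAB)
```

Here the token "good" also activates "great", so the input stays Boolean while carrying some of the information in the pre-trained vectors.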
-
Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling
Authors:
K. Darshana Abeyrathna,
Bimal Bhattarai,
Morten Goodwin,
Saeed Gorji,
Ole-Christoffer Granmo,
Lei Jiao,
Rupsa Saha,
Rohan K. Yadav
Abstract:
Using logical clauses to represent patterns, Tsetlin Machines (TMs) have recently obtained competitive performance in terms of accuracy, memory footprint, energy, and learning speed on several benchmarks. Each TM clause votes for or against a particular class, with classification resolved using a majority vote. While the evaluation of clauses is fast, being based on binary operators, the voting makes it necessary to synchronize the clause evaluation, impeding parallelization. In this paper, we propose a novel scheme for desynchronizing the evaluation of clauses, eliminating the voting bottleneck. In brief, every clause runs in its own thread for massive native parallelism. For each training example, we keep track of the class votes obtained from the clauses in local voting tallies. The local voting tallies allow us to detach the processing of each clause from the rest of the clauses, supporting decentralized learning. This means that the TM most of the time will operate on outdated voting tallies. We evaluated the proposed parallelization across diverse learning tasks and it turns out that our decentralized TM learning algorithm copes well with working on outdated data, resulting in no significant loss in learning accuracy. Furthermore, we show that the proposed approach provides up to 50 times faster learning. Finally, learning time is almost constant for reasonable clause amounts (employing from 20 to 7,000 clauses on a Tesla V100 GPU). For sufficiently large clause numbers, computation time increases approximately proportionally. Our parallel and asynchronous architecture thus allows processing of massive datasets and operating with more clauses for higher accuracy.
Submitted 9 June, 2021; v1 submitted 10 September, 2020;
originally announced September 2020.
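The clause-voting step mentioned in the abstract can be sketched as follows. This is a minimal, sequential illustration of inference only, assuming each class owns a set of clauses voting for it and a set voting against it; it does not reproduce the paper's asynchronous GPU scheme or the learning rule. The toy model, feature order, and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# A literal is (feature_index, negated); a clause is a conjunction (AND) of literals.
Literal = Tuple[int, bool]
Clause = List[Literal]


def clause_output(clause: Clause, x: List[int]) -> int:
    """Return 1 if every literal in the clause is satisfied by the Boolean input x."""
    for index, negated in clause:
        value = 1 - x[index] if negated else x[index]
        if value == 0:
            return 0
    return 1  # note: empty clauses are not treated specially in this sketch


def class_score(positive: List[Clause], negative: List[Clause], x: List[int]) -> int:
    """Votes for the class minus votes against it."""
    votes_for = sum(clause_output(c, x) for c in positive)
    votes_against = sum(clause_output(c, x) for c in negative)
    return votes_for - votes_against


def classify(model: Dict[str, Tuple[List[Clause], List[Clause]]], x: List[int]) -> str:
    """Pick the class with the highest vote tally."""
    return max(model, key=lambda label: class_score(*model[label], x))


# Hypothetical toy model over the Boolean features [good, bad, not].
MODEL = {
    "pos": ([[(0, False)], [(0, False), (2, True)]], [[(1, False)]]),
    "neg": ([[(1, False)], [(0, False), (2, False)]], [[(0, False), (2, True)]]),
}
prediction = classify(MODEL, [1, 0, 0])  # "good" present, "bad" and "not" absent
```

Because each clause is evaluated independently and only the tally couples them, the summation in `class_score` is the synchronization point that the abstract describes as the bottleneck.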
-
EDUQA: Educational Domain Question Answering System using Conceptual Network Mapping
Authors:
Abhishek Agarwal,
Nikhil Sachdeva,
Raj Kamal Yadav,
Vishaal Udandarao,
Vrinda Mittal,
Anubha Gupta,
Abhinav Mathur
Abstract:
Most existing question answering models can be broadly grouped into two categories: i) open-domain question answering models, which answer generic questions and use a large-scale knowledge base along with targeted web-corpus retrieval, and ii) closed-domain question answering models, which address a focused question area and use complex deep learning models. Both types of model derive answers through textual comprehension methods. Because they cannot capture the pedagogical meaning of textual content, these models are not well suited to the educational field. In this paper, we propose an on-the-fly conceptual network model that incorporates educational semantics. The proposed model preserves correlations between conceptual entities by applying intelligent indexing algorithms on the concept network so as to improve answer generation. This model can be used to build interactive conversational agents that aid classroom learning.
Submitted 12 November, 2019;
originally announced November 2019.
-
Performance evaluation of FD-SOI Mosfets for different metal gate work function
Authors:
Deepesh Ranka,
Ashwani K. Rana,
Rakesh Kumar Yadav,
Kamalesh Yadav,
Devendra Giri
Abstract:
The fully depleted (FD) silicon-on-insulator (SOI) metal-oxide field-effect transistor (MOSFET) is the leading contender for the sub-65 nm regime. This paper presents a study of the effects of the metal gate work function on the performance of the FD-SOI MOSFET. The Sentaurus TCAD simulation tool is used to investigate the effect of the gate work function on the performance of the FD-SOI MOSFET. The specific channel length of the device studied is 25 nm. From simulation, we observed that changing the work function of the metal gates of the FD-SOI MOSFET changes the threshold voltage. Hence, by using this technique, we can set the appropriate threshold voltage of the FD-SOI MOSFET at the same voltage, decrease the leakage current, gate tunneling current, and short-channel effects, and increase the drive current.
Submitted 4 April, 2011;
originally announced April 2011.