Search | arXiv e-print repository

Can maiBERT Speak for Maithili?

Authors: Sumit Yadav, Raju Kumar Yadav, Utsav Maskey, Gautam Siddharth Kashyap, Md Azizul Hoque, Ganesh Gautam

Abstract: Natural Language Understanding (NLU) for low-resource languages remains a major challenge in NLP due to the scarcity of high-quality data and language-specific models. Maithili, despite being spoken by millions, lacks adequate computational resources, limiting its inclusion in digital and AI-driven applications. To address this gap, we introduce maiBERT, a BERT-based language model pre-trained specifically for Maithili using the Masked Language Modeling (MLM) technique. Our model is trained on a newly constructed Maithili corpus and evaluated through a news classification task. In our experiments, maiBERT achieved an accuracy of 87.02%, outperforming existing regional models like NepBERTa and HindiBERT, with a 0.13% overall accuracy gain and 5-7% improvement across various classes. We have open-sourced maiBERT on Hugging Face, enabling further fine-tuning for downstream tasks such as sentiment analysis and Named Entity Recognition (NER).

Submitted 22 September, 2025; v1 submitted 18 September, 2025; originally announced September 2025.

Comments: Preprint

arXiv:2411.04557 [pdf, other]

Pruning Literals for Highly Efficient Explainability at Word Level

Authors: Rohan Kumar Yadav, Bimal Bhattarai, Abhik Jana, Lei Jiao, Seid Muhie Yimam

Abstract: Designing an explainable model has become crucial for Natural Language Processing (NLP), since most state-of-the-art machine learning models provide only a limited explanation for their predictions. Among explainable models, the Tsetlin Machine (TM) is promising because of its capability of providing word-level explanations using propositional logic. However, concern arises over the elaborate combination of literals (propositional logic) in the clause, which makes the model difficult for humans to comprehend despite its transparent learning process. In this paper, we design a post-hoc pruning of clauses that eliminates the randomly placed literals in the clause, thereby making the model more efficiently interpretable than the vanilla TM. Experiments on the publicly available YELP-HAT dataset demonstrate that the proposed pruned TM's attention map aligns more closely with the human attention map than the vanilla TM's attention map does. In addition, the pairwise similarity measure also surpasses that of attention map-based neural network models. In terms of accuracy, the proposed pruning method does not degrade accuracy significantly; rather, it enhances performance by 4% to 9% on some test data.

Submitted 7 November, 2024; originally announced November 2024.

Comments: 8 pages, 3 figures

Journal ref: 2024 International Symposium on the Tsetlin Machine (ISTM)

arXiv:2104.06901 [pdf, other]

Enhancing Interpretable Clauses Semantically using Pretrained Word Representation

Authors: Rohan Kumar Yadav, Lei Jiao, Ole-Christoffer Granmo, Morten Goodwin

Abstract: Tsetlin Machine (TM) is an interpretable pattern recognition algorithm based on propositional logic, which has demonstrated competitive performance in many Natural Language Processing (NLP) tasks, including sentiment analysis, text classification, and Word Sense Disambiguation. To obtain human-level interpretability, legacy TM employs Boolean input features such as bag-of-words (BOW). However, the BOW representation makes it difficult to use any pre-trained information, for instance, word2vec and GloVe word representations. This restriction has constrained the performance of TM compared to deep neural networks (DNNs) in NLP. To reduce the performance gap, in this paper, we propose a novel way of using pre-trained word representations for TM. The approach significantly enhances the performance and interpretability of TM. We achieve this by extracting semantically related words from pre-trained word representations as input features to the TM. Our experiments show that the accuracy of the proposed approach is significantly higher than the previous BOW-based TM, reaching the level of DNN-based models.

Submitted 10 September, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

Comments: BlackboxNLP 2021

arXiv:2009.04861 [pdf, other]

Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling

Authors: K. Darshana Abeyrathna, Bimal Bhattarai, Morten Goodwin, Saeed Gorji, Ole-Christoffer Granmo, Lei Jiao, Rupsa Saha, Rohan K. Yadav

Abstract: Using logical clauses to represent patterns, Tsetlin Machines (TMs) have recently obtained competitive performance in terms of accuracy, memory footprint, energy, and learning speed on several benchmarks. Each TM clause votes for or against a particular class, with classification resolved using a majority vote. While the evaluation of clauses is fast, being based on binary operators, the voting makes it necessary to synchronize the clause evaluation, impeding parallelization. In this paper, we propose a novel scheme for desynchronizing the evaluation of clauses, eliminating the voting bottleneck. In brief, every clause runs in its own thread for massive native parallelism. For each training example, we keep track of the class votes obtained from the clauses in local voting tallies. The local voting tallies allow us to detach the processing of each clause from the rest of the clauses, supporting decentralized learning. This means that the TM most of the time will operate on outdated voting tallies. We evaluated the proposed parallelization across diverse learning tasks and it turns out that our decentralized TM learning algorithm copes well with working on outdated data, resulting in no significant loss in learning accuracy. Furthermore, we show that the proposed approach provides up to 50 times faster learning. Finally, learning time is almost constant for reasonable clause amounts (employing from 20 to 7,000 clauses on a Tesla V100 GPU). For sufficiently large clause numbers, computation time increases approximately proportionally. Our parallel and asynchronous architecture thus allows processing of massive datasets and operating with more clauses for higher accuracy.

Submitted 9 June, 2021; v1 submitted 10 September, 2020; originally announced September 2020.

Comments: Accepted to ICML 2021

arXiv:1911.05013 [pdf, other]

doi 10.1109/ICASSP.2019.8683538

EDUQA: Educational Domain Question Answering System using Conceptual Network Mapping

Authors: Abhishek Agarwal, Nikhil Sachdeva, Raj Kamal Yadav, Vishaal Udandarao, Vrinda Mittal, Anubha Gupta, Abhinav Mathur

Abstract: Most existing question answering models can be broadly grouped into two categories: i) open domain question answering models that answer generic questions and use a large-scale knowledge base along with targeted web-corpus retrieval, and ii) closed domain question answering models that address a focused questioning area and use complex deep learning models. Both kinds of model derive answers through textual comprehension methods. Due to their inability to capture the pedagogical meaning of textual content, these models are not well suited to the educational field. In this paper, we propose an on-the-fly conceptual network model that incorporates educational semantics. The proposed model preserves correlations between conceptual entities by applying intelligent indexing algorithms on the concept network so as to improve answer generation. This model can be utilized for building interactive conversational agents for aiding classroom learning.

Submitted 12 November, 2019; originally announced November 2019.

Comments: Published in the 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2019

Journal ref: IEEE ICASSP (2019) 8137-8141

arXiv:1104.0824 [pdf]

Performance evaluation of FD-SOI Mosfets for different metal gate work function

Authors: Deepesh Ranka, Ashwani K. Rana, Rakesh Kumar Yadav, Kamalesh Yadav, Devendra Giri

Abstract: The fully depleted (FD) Silicon on Insulator (SOI) metal oxide Field Effect Transistor (MOSFET) is the leading contender for the sub-65nm regime. This paper presents a study of the effects of metal gate work functions on the performance of the FD-SOI MOSFET. The Sentaurus TCAD simulation tool is used to investigate the effect of the gate work function on the performance of the FD-SOI MOSFET. The specific channel length considered for the device is 25nm. From simulation we observed that by changing the work function of the metal gates of the FD-SOI MOSFET we can change the threshold voltage. Hence, by using this technique we can set the appropriate threshold voltage of the FD-SOI MOSFET at the same voltage, and we can decrease the leakage current, gate tunneling current and short channel effects and increase the drive current.

Submitted 4 April, 2011; originally announced April 2011.

Comments: 14 pages, 12 figures, International Journal of VLSI design & Communication Systems (VLSICS) Vol.2, No.1, March 2011

Journal ref: International Journal of VLSI design & Communication Systems (VLSICS) Vol.2, No.1, March 2011

Showing 1–6 of 6 results for author: Yadav, R K