Showing 1–2 of 2 results for author: Masumi, M

  1. arXiv:2504.14690

    cs.CL cs.AI

    FarsEval-PKBETS: A new diverse benchmark for evaluating Persian large language models

    Authors: Mehrnoush Shamsfard, Zahra Saaberi, Mostafa Karimi manesh, Seyed Mohammad Hossein Hashemi, Zahra Vatankhah, Motahareh Ramezani, Niki Pourazin, Tara Zare, Maryam Azimi, Sarina Chitsaz, Sama Khoraminejad, Morteza Mahdavi Mortazavi, Mohammad Mahdi Chizari, Sahar Maleki, Seyed Soroush Majd, Mostafa Masumi, Sayed Ali Musavi Khoeini, Amir Mohseni, Sogol Alipour

    Abstract: Research on evaluating and analyzing large language models (LLMs) has been extensive for resource-rich languages such as English, yet their performance in languages such as Persian has received considerably less attention. This paper introduces FarsEval-PKBETS benchmark, a subset of FarsEval project for evaluating large language models in Persian. This benchmark consists of 4000 questions and answ…

    Submitted 20 April, 2025; originally announced April 2025.

    Comments: 24 pages, 3 figures, 3 tables

    MSC Class: 68T50

    ACM Class: I.2.7; E.0

  2. arXiv:2402.06617

    cs.CL

    FaBERT: Pre-training BERT on Persian Blogs

    Authors: Mostafa Masumi, Seyed Soroush Majd, Mehrnoush Shamsfard, Hamid Beigy

    Abstract: We introduce FaBERT, a Persian BERT-base model pre-trained on the HmBlogs corpus, encompassing both informal and formal Persian texts. FaBERT is designed to excel in traditional Natural Language Understanding (NLU) tasks, addressing the intricacies of diverse sentence structures and linguistic styles prevalent in the Persian language. In our comprehensive evaluation of FaBERT on 12 datasets in var…

    Submitted 9 February, 2024; originally announced February 2024.