Skip to main content

Showing 1–2 of 2 results for author: Mehtani, S

.
  1. arXiv:2106.03958  [pdf, other

    cs.CL

    Exploiting Language Relatedness for Low Web-Resource Language Model Adaptation: An Indic Languages Study

    Authors: Yash Khemchandani, Sarvesh Mehtani, Vaidehi Patil, Abhijeet Awasthi, Partha Talukdar, Sunita Sarawagi

    Abstract: Recent research in multilingual language models (LM) has demonstrated their ability to effectively handle multiple languages in a single model. This holds promise for low web-resource languages (LRL) as multilingual models can enable transfer of supervision from high resource languages to LRLs. However, incorporating a new language in an LM still remains a challenge, particularly for languages wit… ▽ More

    Submitted 9 June, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: Accepted to ACL-IJCNLP 2021

  2. arXiv:2103.10730  [pdf, other

    cs.CL

    MuRIL: Multilingual Representations for Indian Languages

    Authors: Simran Khanuja, Diksha Bansal, Sarvesh Mehtani, Savya Khosla, Atreyee Dey, Balaji Gopalan, Dilip Kumar Margam, Pooja Aggarwal, Rajiv Teja Nagipogu, Shachi Dave, Shruti Gupta, Subhash Chandra Bose Gali, Vish Subramanian, Partha Talukdar

    Abstract: India is a multilingual society with 1369 rationalized languages and dialects being spoken across the country (INDIA, 2011). Of these, the 22 scheduled languages have a staggering total of 1.17 billion speakers and 121 languages have more than 10,000 speakers (INDIA, 2011). India also has the second largest (and an ever growing) digital footprint (Statista, 2020). Despite this, today's state-of-th… ▽ More

    Submitted 2 April, 2021; v1 submitted 19 March, 2021; originally announced March 2021.