Showing 1–3 of 3 results for author: Pawakapan, P

  1. arXiv:2404.05829  [pdf, other]

    cs.CL cs.AI cs.LG

    SambaLingo: Teaching Large Language Models New Languages

    Authors: Zoltan Csaki, Bo Li, Jonathan Li, Qiantong Xu, Pian Pawakapan, Leon Zhang, Yun Du, Hengyu Zhao, Changran Hu, Urmish Thakker

    Abstract: Despite the widespread availability of LLMs, there remains a substantial gap in their capabilities and availability across diverse languages. One approach to address these issues has been to take an existing pre-trained LLM and continue to train it on new languages. While prior works have experimented with language adaptation, many questions around best practices and methodology have not been cove…

    Submitted 17 July, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: 23 pages

  2. arXiv:2311.05741  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Efficiently Adapting Pretrained Language Models To New Languages

    Authors: Zoltan Csaki, Pian Pawakapan, Urmish Thakker, Qiantong Xu

    Abstract: Recent large language models (LLM) exhibit sub-optimal performance on low-resource languages, as the training data of these models is usually dominated by English and other high-resource languages. Furthermore, it is challenging to train models for low-resource languages, especially from scratch, due to a lack of high quality training data. Adapting pretrained LLMs reduces the need for data in the…

    Submitted 14 December, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: Accepted to "The third NeurIPS Workshop on Efficient Natural Language and Speech Processing 2023" (ENLSP-III)

  3. arXiv:1811.08458  [pdf, other]

    cs.LG cs.CV stat.ML

    Intermediate Level Adversarial Attack for Enhanced Transferability

    Authors: Qian Huang, Zeqi Gu, Isay Katsman, Horace He, Pian Pawakapan, Zhiqiu Lin, Serge Belongie, Ser-Nam Lim

    Abstract: Neural networks are vulnerable to adversarial examples, malicious inputs crafted to fool trained models. Adversarial examples often exhibit black-box transfer, meaning that adversarial examples for one model can fool another model. However, adversarial examples may be overfit to exploit the particular architecture and feature representation of a source model, resulting in sub-optimal black-box tra…

    Submitted 20 November, 2018; originally announced November 2018.

    Comments: Preprint