Showing 1–1 of 1 results for author: Paparas, D

  1. arXiv:2402.04177  [pdf, other]

    cs.CL cs.LG stat.ML

    Scaling Laws for Downstream Task Performance in Machine Translation

    Authors: Berivan Isik, Natalia Ponomareva, Hussein Hazimeh, Dimitris Paparas, Sergei Vassilvitskii, Sanmi Koyejo

    Abstract: Scaling laws provide important insights that can guide the design of large language models (LLMs). Existing work has primarily focused on studying scaling laws for pretraining (upstream) loss. However, in transfer learning settings, in which LLMs are pretrained on an unsupervised dataset and then finetuned on a downstream task, we often also care about the downstream performance. In this work, we…

    Submitted 20 February, 2025; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: Published at the International Conference on Learning Representations (ICLR) 2025. Previous title: "Scaling Laws for Downstream Task Performance of Large Language Models"