Showing 1–2 of 2 results for author: Crew, E

  1. arXiv:2103.15845  [pdf, other]

    cs.CL

    Text Normalization for Low-Resource Languages of Africa

    Authors: Andrew Zupon, Evan Crew, Sandy Ritchie

    Abstract: Training data for machine learning models can come from many different sources, which can be of dubious quality. For resource-rich languages like English, there is a lot of data available, so we can afford to throw out the dubious data. For low-resource languages where there is much less data available, we can't necessarily afford to throw out the dubious data, in case we end up with a training se…

    Submitted 29 March, 2021; originally announced March 2021.

    Comments: to be presented at AfricaNLP 2021

  2. arXiv:1912.01218  [pdf]

    cs.HC cs.CL

    Writing Across the World's Languages: Deep Internationalization for Gboard, the Google Keyboard

    Authors: Daan van Esch, Elnaz Sarbar, Tamar Lucassen, Jeremy O'Brien, Theresa Breiner, Manasa Prasad, Evan Crew, Chieu Nguyen, Françoise Beaufays

    Abstract: This technical report describes our deep internationalization program for Gboard, the Google Keyboard. Today, Gboard supports 900+ language varieties across 70+ writing systems, and this report describes how and why we have been adding support for hundreds of language varieties from around the globe. Many languages of the world are increasingly used in writing on an everyday basis, and we describe…

    Submitted 3 December, 2019; originally announced December 2019.