Showing 1–5 of 5 results for author: Athiwaratkun, B

Searching in archive stat.
  1. arXiv:2009.13272

    cs.CL cs.LG stat.ML

    Augmented Natural Language for Generative Sequence Labeling

    Authors: Ben Athiwaratkun, Cicero Nogueira dos Santos, Jason Krone, Bing Xiang

    Abstract: We propose a generative framework for joint sequence labeling and sentence-level classification. Our model performs multiple sequence labeling tasks at once using a single, shared natural language output space. Unlike prior discriminative methods, our model naturally incorporates label semantics and shares knowledge across tasks. Our framework is general purpose, performing well on few-shot, low-r…

    Submitted 15 September, 2020; originally announced September 2020.

    Comments: To appear at EMNLP 2020

  2. arXiv:1806.05594

    cs.LG cs.AI cs.CV stat.ML

    There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average

    Authors: Ben Athiwaratkun, Marc Finzi, Pavel Izmailov, Andrew Gordon Wilson

    Abstract: Presently the most successful approaches to semi-supervised learning are based on consistency regularization, whereby a model is trained to be robust to small perturbations of its inputs and parameters. To understand consistency regularization, we conceptually explore how loss geometry interacts with training procedures. The consistency loss dramatically improves generalization performance over su…

    Submitted 21 February, 2019; v1 submitted 14 June, 2018; originally announced June 2018.

    Comments: Appears at ICLR 2019

  3. arXiv:1806.02901

    cs.CL cs.AI cs.LG stat.ML

    Probabilistic FastText for Multi-Sense Word Embeddings

    Authors: Ben Athiwaratkun, Andrew Gordon Wilson, Anima Anandkumar

    Abstract: We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information. In particular, we represent each word with a Gaussian mixture density, where the mean of a mixture component is given by the sum of n-grams. This representation allows the model to share statistical strength across sub-word structures (e.g. La…

    Submitted 7 June, 2018; originally announced June 2018.

    Comments: Published at ACL 2018

  4. arXiv:1804.09843

    cs.CL cs.AI cs.LG stat.ML

    Hierarchical Density Order Embeddings

    Authors: Ben Athiwaratkun, Andrew Gordon Wilson

    Abstract: By representing words with probability densities rather than point vectors, probabilistic word embeddings can capture rich and interpretable semantic information and uncertainty. The uncertainty information can be particularly meaningful in capturing entailment relationships -- whereby general words such as "entity" correspond to broad distributions that encompass more specific words such as "anim…

    Submitted 25 April, 2018; originally announced April 2018.

    Comments: Published at ICLR 2018

  5. arXiv:1704.08424

    stat.ML cs.AI cs.CL cs.LG

    Multimodal Word Distributions

    Authors: Ben Athiwaratkun, Andrew Gordon Wilson

    Abstract: Word embeddings provide point representations of words containing useful semantic information. We introduce multimodal word distributions formed from Gaussian mixtures, for multiple word meanings, entailment, and rich uncertainty information. To learn these distributions, we propose an energy-based max-margin objective. We show that the resulting approach captures uniquely expressive semantic info…

    Submitted 9 September, 2019; v1 submitted 26 April, 2017; originally announced April 2017.

    Comments: This paper also appears at ACL 2017