Showing 1–4 of 4 results for author: Mao, H H

  1. arXiv:2007.00800  [pdf, other]

    cs.LG stat.ML

    A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks

    Authors: Huanru Henry Mao

    Abstract: Deep neural networks are typically trained under a supervised learning framework where a model learns a single task using labeled data. Instead of relying solely on labeled data, practitioners can harness unlabeled or related data to improve model performance, which is often more accessible and ubiquitous. Self-supervised pre-training for transfer learning is becoming an increasingly popular techn…

    Submitted 1 July, 2020; originally announced July 2020.

  2. arXiv:2003.04887  [pdf, other]

    cs.LG cs.CL stat.ML

    ReZero is All You Need: Fast Convergence at Large Depth

    Authors: Thomas Bachlechner, Bodhisattwa Prasad Majumder, Huanru Henry Mao, Garrison W. Cottrell, Julian McAuley

    Abstract: Deep networks often suffer from vanishing or exploding gradients due to inefficient signal propagation, leading to long training times or convergence difficulties. Various architecture designs, sophisticated residual-style networks, and initialization schemes have been shown to improve deep signal propagation. Recently, Pennington et al. used free probability theory to show that dynamical isometry…

    Submitted 24 June, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

  3. arXiv:1908.09451  [pdf, ps, other]

    cs.LG cs.CL stat.ML

    Improving Neural Story Generation by Targeted Common Sense Grounding

    Authors: Huanru Henry Mao, Bodhisattwa Prasad Majumder, Julian McAuley, Garrison W. Cottrell

    Abstract: Stories generated with neural language models have shown promise in grammatical and stylistic consistency. However, the generated stories are still lacking in common sense reasoning, e.g., they often contain sentences deprived of world knowledge. We propose a simple multi-task learning scheme to achieve quantitatively better common sense reasoning in language models by leveraging auxiliary trainin…

    Submitted 27 February, 2020; v1 submitted 25 August, 2019; originally announced August 2019.

  4. arXiv:1907.04868  [pdf, other]

    cs.SD cs.LG cs.MM eess.AS stat.ML

    LakhNES: Improving multi-instrumental music generation with cross-domain pre-training

    Authors: Chris Donahue, Huanru Henry Mao, Yiting Ethan Li, Garrison W. Cottrell, Julian McAuley

    Abstract: We are interested in the task of generating multi-instrumental music scores. The Transformer architecture has recently shown great promise for the task of piano score generation; here we adapt it to the multi-instrumental setting. Transformers are complex, high-dimensional language models which are capable of capturing long-term structure in sequence data, but require large amounts of data to fit.…

    Submitted 10 July, 2019; originally announced July 2019.

    Comments: Published as a conference paper at ISMIR 2019