Showing 1–4 of 4 results for author: Quesnelle, J

Searching in archive cs.
  1. arXiv:2411.19870  [pdf, ps, other]

    cs.LG cs.AI

    DeMo: Decoupled Momentum Optimization

    Authors: Bowen Peng, Jeffrey Quesnelle, Diederik P. Kingma

    Abstract: Training large neural networks typically requires sharing gradients between accelerators through specialized high-speed interconnects. Drawing from the signal processing principles of frequency decomposition and energy compaction, we demonstrate that synchronizing full optimizer states and model parameters during training is unnecessary. By decoupling momentum updates and allowing controlled diver…

    Submitted 29 November, 2024; originally announced November 2024.

  2. arXiv:2408.11857  [pdf, other]

    cs.CL

    Hermes 3 Technical Report

    Authors: Ryan Teknium, Jeffrey Quesnelle, Chen Guang

    Abstract: Instruct (or "chat") tuned models have become the primary way in which most people interact with large language models. As opposed to "base" or "foundation" models, instruct-tuned models are optimized to respond to imperative statements. We present Hermes 3, a neutrally-aligned generalist instruct and tool use model with strong reasoning and creative abilities. Its largest version, Hermes 3 405B,…

    Submitted 15 August, 2024; originally announced August 2024.

  3. arXiv:2309.00071  [pdf, other]

    cs.CL cs.AI cs.LG

    YaRN: Efficient Context Window Extension of Large Language Models

    Authors: Bowen Peng, Jeffrey Quesnelle, Honglu Fan, Enrico Shippole

    Abstract: Rotary Position Embeddings (RoPE) have been shown to effectively encode positional information in transformer-based language models. However, these models fail to generalize past the sequence length they were trained on. We present YaRN (Yet another RoPE extensioN method), a compute-efficient method to extend the context window of such models, requiring 10x less tokens and 2.5x less training steps…

    Submitted 1 November, 2023; v1 submitted 31 August, 2023; originally announced September 2023.

  4. arXiv:1712.01210  [pdf, ps, other]

    cs.CR

    On the linkability of Zcash transactions

    Authors: Jeffrey Quesnelle

    Abstract: Zcash is a fork of Bitcoin with optional anonymity features. While transparent transactions are fully linkable, shielded transactions use zero-knowledge proofs to obscure the parties and amounts of the transactions. First, we observe various metrics regarding the usage of shielded addresses. Moreover, we show that most coins sent to shielded addresses are later sent back to transparent addresses.…

    Submitted 4 December, 2017; originally announced December 2017.

    Comments: 5 pages, 9 figures, 1 appendix