Showing 1–11 of 11 results for author: Terriberry, T B

  1. arXiv:2205.05785  [pdf, other]

    eess.AS cs.SD

    Real-Time Packet Loss Concealment With Mixed Generative and Predictive Model

    Authors: Jean-Marc Valin, Ahmed Mustafa, Christopher Montgomery, Timothy B. Terriberry, Michael Klingbeil, Paris Smaragdis, Arvindh Krishnaswamy

    Abstract: As deep speech enhancement algorithms have recently demonstrated capabilities greatly surpassing their traditional counterparts for suppressing noise, reverberation and echo, attention is turning to the problem of packet loss concealment (PLC). PLC is a challenging task because it not only involves real-time speech synthesis, but also frequent transitions between the received audio and the synthes…

    Submitted 11 May, 2022; originally announced May 2022.

    Comments: Submitted to INTERSPEECH 2022

  2. Perceptually-Driven Video Coding with the Daala Video Codec

    Authors: Yushin Cho, Thomas J. Daede, Nathan E. Egge, Guillaume Martres, Tristan Matthews, Christopher Montgomery, Timothy B. Terriberry, Jean-Marc Valin

    Abstract: The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which…

    Submitted 8 October, 2016; originally announced October 2016.

    Comments: 19 pages, Proceedings of SPIE Workshop on Applications of Digital Image Processing (ADIP), 2016

  3. arXiv:1608.01947  [pdf, other]

    cs.MM

    Daala: Building A Next-Generation Video Codec From Unconventional Technology

    Authors: Jean-Marc Valin, Timothy B. Terriberry, Nathan E. Egge, Thomas Daede, Yushin Cho, Christopher Montgomery, Michael Bebenita

    Abstract: Daala is a new royalty-free video codec that attempts to compete with state-of-the-art royalty-bearing codecs. To do so, it must achieve good compression while avoiding all of their patented techniques. We use technology that is as different as possible from traditional approaches to achieve this. This paper describes the technology behind Daala and discusses where it fits in the newly created AV1…

    Submitted 5 August, 2016; originally announced August 2016.

    Comments: 6 pages, accepted for multimedia signal processing (MMSP) workshop, 2016

  4. arXiv:1605.04930  [pdf, other]

    cs.MM

    Daala: A Perceptually-Driven Still Picture Codec

    Authors: Jean-Marc Valin, Nathan E. Egge, Thomas Daede, Timothy B. Terriberry, Christopher Montgomery

    Abstract: Daala is a new royalty-free video codec based on perceptually-driven coding techniques. We explore using its keyframe format for still picture coding and show how it has improved over the past year. We believe the technology used in Daala could be the basis of an excellent, royalty-free image format.

    Submitted 16 May, 2016; originally announced May 2016.

    Comments: Accepted for ICIP 2016, 5 pages

  5. arXiv:1603.03129  [pdf, other]

    cs.MM

    Daala: A Perceptually-Driven Next Generation Video Codec

    Authors: Thomas J. Daede, Nathan E. Egge, Jean-Marc Valin, Guillaume Martres, Timothy B. Terriberry

    Abstract: The Daala project is a royalty-free video codec that attempts to compete with the best patent-encumbered codecs. Part of our strategy is to replace core tools of traditional video codecs with alternative approaches, many of them designed to take perceptual aspects into account, rather than optimizing for simple metrics like PSNR. This paper documents some of our experiences with these tools, which…

    Submitted 9 March, 2016; originally announced March 2016.

    Comments: 10 pages

  6. arXiv:1603.01824  [pdf, ps, other]

    cs.SD

    Low-Complexity Iterative Sinusoidal Parameter Estimation

    Authors: Jean-Marc Valin, Daniel V. Smith, Christopher Montgomery, Timothy B. Terriberry

    Abstract: Sinusoidal parameter estimation is a computationally-intensive task, which can pose problems for real-time implementations. In this paper, we propose a low-complexity iterative method for estimating sinusoidal parameters that is based on the linearisation of the model around an initial frequency estimate. We show that for N sinusoids in a frame of length L, the proposed method has a complexity of…

    Submitted 6 March, 2016; originally announced March 2016.

    Comments: 8 pages. arXiv admin note: substantial text overlap with arXiv:1602.05900

    Journal ref: Proceedings of International Conference on Signal Processing and Communication Systems (ICSPCS), pp. 276-283, 2007

  7. An Iterative Linearised Solution to the Sinusoidal Parameter Estimation Problem

    Authors: Jean-Marc Valin, Daniel V. Smith, Christopher Montgomery, Timothy B. Terriberry

    Abstract: Signal processing applications use sinusoidal modelling for speech synthesis, speech coding, and audio coding. Estimation of the model parameters involves non-linear optimisation methods, which can be very costly for real-time applications. We propose a low-complexity iterative method that starts from initial frequency estimates and converges rapidly. We show that for N sinusoids in a frame of len…

    Submitted 17 February, 2016; originally announced February 2016.

    Comments: 23 pages

    Journal ref: Computers and Electrical Engineering (Elsevier), Vol. 36, No. 4, pp. 603-616, 2010

  8. A High-Quality Speech and Audio Codec With Less Than 10 ms Delay

    Authors: Jean-Marc Valin, Timothy B. Terriberry, Christopher Montgomery, Gregory Maxwell

    Abstract: With increasing quality requirements for multimedia communications, audio codecs must maintain both high quality and low delay. Typically, audio codecs offer either low delay or high quality, but rarely both. We propose a codec that simultaneously addresses both these requirements, with a delay of only 8.7 ms at 44.1 kHz. It uses gain-shape algebraic vector quantisation in the frequency domain wit…

    Submitted 17 February, 2016; originally announced February 2016.

    Comments: 10 pages

    Journal ref: IEEE Transactions on Audio, Speech and Language Processing, Vol. 18, No. 1, pp. 58-67, 2010

  9. arXiv:1602.05311  [pdf, ps, other]

    cs.MM cs.SD

    A Full-Bandwidth Audio Codec With Low Complexity And Very Low Delay

    Authors: Jean-Marc Valin, Timothy B. Terriberry, Gregory Maxwell

    Abstract: We propose an audio codec that addresses the low-delay requirements of some applications such as network music performance. The codec is based on the modified discrete cosine transform (MDCT) with very short frames and uses gain-shape quantization to preserve the spectral envelope. The short frame sizes required for low delay typically hinder the performance of transform codecs. However, at 96 kbi…

    Submitted 17 February, 2016; originally announced February 2016.

    Comments: 5 pages, Proceedings of EUSIPCO 2009

  10. Perceptual Vector Quantization For Video Coding

    Authors: Jean-Marc Valin, Timothy B. Terriberry

    Abstract: This paper applies energy conservation principles to the Daala video codec using gain-shape vector quantization to encode a vector of AC coefficients as a length (gain) and direction (shape). The technique originates from the CELT mode of the Opus audio codec, where it is used to conserve the spectral envelope of an audio signal. Conserving energy in video has the potential to preserve textures ra…

    Submitted 16 February, 2016; originally announced February 2016.

    Comments: 11 pages, Proceedings of SPIE Visual Information Processing and Communication, 2015

    Journal ref: Proc. SPIE 9410, Visual Information Processing and Communication VI, 941009 (March 4, 2015)

  11. arXiv:1602.04845  [pdf, ps, other]

    cs.MM cs.SD

    High-Quality, Low-Delay Music Coding in the Opus Codec

    Authors: Jean-Marc Valin, Gregory Maxwell, Timothy B. Terriberry, Koen Vos

    Abstract: The IETF recently standardized the Opus codec as RFC6716. Opus targets a wide range of real-time Internet applications by combining a linear prediction coder with a transform coder. We describe the transform coder, with particular attention to the psychoacoustic knowledge built into the format. The result out-performs existing audio codecs that do not operate under real-time constraints.

    Submitted 15 February, 2016; originally announced February 2016.

    Comments: 10 pages, Proceedings of the 135th AES Convention, October 2013