Showing 1–50 of 71 results for author: Kleijn, B

  1. arXiv:2503.20744  [pdf, other]

    cs.CV cs.AI

    High Quality Diffusion Distillation on a Single GPU with Relative and Absolute Position Matching

    Authors: Guoqiang Zhang, Kenta Niwa, J. P. Lewis, Cedric Mesnage, W. Bastiaan Kleijn

    Abstract: We introduce relative and absolute position matching (RAPM), a diffusion distillation method resulting in high quality generation that can be trained efficiently on a single GPU. Recent diffusion distillation research has achieved excellent results for high-resolution text-to-image generation with methods such as phased consistency models (PCM) and improved distribution matching distillation (DMD2…

    Submitted 26 March, 2025; originally announced March 2025.

  2. arXiv:2412.08356  [pdf, other]

    cs.SD cs.LG eess.AS

    Zero-Shot Mono-to-Binaural Speech Synthesis

    Authors: Alon Levkovitch, Julian Salazar, Soroosh Mariooryad, RJ Skerry-Ryan, Nadav Bar, Bastiaan Kleijn, Eliya Nachmani

    Abstract: We present ZeroBAS, a neural method to synthesize binaural audio from monaural audio recordings and positional information without training on any binaural data. To our knowledge, this is the first published zero-shot neural approach to mono-to-binaural audio synthesis. Specifically, we show that a parameter-free geometric time warping and amplitude scaling based on source location suffices to get…

    Submitted 28 May, 2025; v1 submitted 11 December, 2024; originally announced December 2024.

  3. arXiv:2408.09884  [pdf, other]

    math.PR

    Existence and phase structure of random inverse limit measures

    Authors: B. J. K. Kleijn

    Abstract: Analogous to Kolmogorov's theorem for the existence of stochastic processes describing random functions, we consider theorems for the existence of stochastic processes describing random measures, as limits of inverse measure systems. Specifically, given a coherent inverse system of random (bounded/signed/positive/probability) histograms on refining partitions, we study conditions for the existence…

    Submitted 15 May, 2025; v1 submitted 19 August, 2024; originally announced August 2024.

    Comments: 55 pp., 2 figures

    MSC Class: 28C20; 60B11; 60G07; 60G15; 60G57

  4. arXiv:2407.09093  [pdf, other]

    cs.LG cs.AI

    On Exact Bit-level Reversible Transformers Without Changing Architectures

    Authors: Guoqiang Zhang, J. P. Lewis, W. B. Kleijn

    Abstract: Various reversible deep neural networks (DNN) models have been proposed to reduce memory consumption in the training process. However, almost all existing reversible DNNs either require special non-standard architectures or are constructed by modifying existing DNN architectures considerably to enable reversibility. In this work we present the BDIA-transformer, which is an exact bit-level reversib…

    Submitted 5 October, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

  5. arXiv:2402.11334  [pdf, ps, other]

    math.PR

    Contiguity and remote contiguity of some random graphs

    Authors: B. J. K. Kleijn, S. Rizzelli

    Abstract: Asymptotic properties of random graph sequences, like occurrence of a giant component or full connectivity in Erdős-Rényi graphs, are usually derived with very specific choices for defining parameters. The question arises to what extent those parameter choices may be perturbed, without losing the asymptotic property. Writing $(P_n)$ and $(Q_n)$ for two sequences of graph distributions, asymptoti…

    Submitted 17 February, 2024; originally announced February 2024.

    MSC Class: 05C80; 05C40; 60B10; 60G30

  6. arXiv:2401.00896  [pdf, other]

    cs.CV

    TrailBlazer: Trajectory Control for Diffusion-Based Video Generation

    Authors: Wan-Duo Kurt Ma, J. P. Lewis, W. Bastiaan Kleijn

    Abstract: Within recent approaches to text-to-video (T2V) generation, achieving controllability in the synthesized video is often a challenge. Typically, this issue is addressed by providing low-level per-frame guidance in the form of edge maps, depth maps, or an existing video to be altered. However, the process of obtaining such guidance can be labor-intensive. This paper focuses on enhancing controllabil…

    Submitted 8 April, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

    Comments: 14 pages, 18 figures, Project Page: https://hohonu-vicml.github.io/Trailblazer.Page/

  7. arXiv:2307.10829  [pdf, other]

    cs.CV

    Exact Diffusion Inversion via Bi-directional Integration Approximation

    Authors: Guoqiang Zhang, J. P. Lewis, W. Bastiaan Kleijn

    Abstract: Recently, various methods have been proposed to address the inconsistency issue of DDIM inversion to enable image editing, such as EDICT [36] and Null-text inversion [22]. However, the above methods introduce considerable computational overhead. In this paper, we propose a new technique, named \emph{bi-directional integration approximation} (BDIA), to perform exact diffusion inversion with neglibl…

    Submitted 26 November, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: arXiv admin note: text overlap with arXiv:2304.11328. Our code is available at https://github.com/guoqiang-zhang-x/BDIA

  8. arXiv:2304.11328  [pdf, ps, other]

    cs.LG math.NA

    On Accelerating Diffusion-Based Sampling Process via Improved Integration Approximation

    Authors: Guoqiang Zhang, Niwa Kenta, W. Bastiaan Kleijn

    Abstract: A popular approach to sample a diffusion-based generative model is to solve an ordinary differential equation (ODE). In existing samplers, the coefficients of the ODE solvers are pre-determined by the ODE formulation, the reverse discrete timesteps, and the employed ODE methods. In this paper, we consider accelerating several popular ODE-based sampling processes (including EDM, DDIM, and DPM-Solve…

    Submitted 3 October, 2023; v1 submitted 22 April, 2023; originally announced April 2023.

  9. arXiv:2304.11312  [pdf, ps, other]

    cs.AI cs.LG

    Lookahead Diffusion Probabilistic Models for Refining Mean Estimation

    Authors: Guoqiang Zhang, Niwa Kenta, W. Bastiaan Kleijn

    Abstract: We propose lookahead diffusion probabilistic models (LA-DPMs) to exploit the correlation in the outputs of the deep neural networks (DNNs) over subsequent timesteps in diffusion probabilistic models (DPMs) to refine the mean estimation of the conditional Gaussian distributions in the backward process. A typical DPM first obtains an estimate of the original data sample $\boldsymbol{x}$ by feeding t…

    Submitted 21 April, 2023; originally announced April 2023.

    Comments: accepted by CVPR, 2023

  10. arXiv:2303.12984  [pdf, other]

    cs.SD eess.AS

    LMCodec: A Low Bitrate Speech Codec With Causal Transformer Models

    Authors: Teerapat Jenrungrot, Michael Chinen, W. Bastiaan Kleijn, Jan Skoglund, Zalán Borsos, Neil Zeghidour, Marco Tagliasacchi

    Abstract: We introduce LMCodec, a causal neural speech codec that provides high quality audio at very low bitrates. The backbone of the system is a causal convolutional codec that encodes audio into a hierarchy of coarse-to-fine tokens using residual vector quantization. LMCodec trains a Transformer language model to predict the fine tokens from the coarse ones in a generative fashion, allowing for the tran…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: 5 pages, accepted to ICASSP 2023, project page: https://mjenrungrot.github.io/chrome-media-audio-papers/publications/lmcodec

  11. arXiv:2302.13153  [pdf, other]

    cs.CV cs.GR cs.LG

    Directed Diffusion: Direct Control of Object Placement through Attention Guidance

    Authors: Wan-Duo Kurt Ma, J. P. Lewis, Avisek Lahiri, Thomas Leung, W. Bastiaan Kleijn

    Abstract: Text-guided diffusion models such as DALLE-2, Imagen, eDiff-I, and Stable Diffusion are able to generate an effectively endless variety of images given only a short text prompt describing the desired image content. In many cases the images are of very high quality. However, these models often struggle to compose scenes containing several key objects such as characters in specified positional relat…

    Submitted 26 September, 2023; v1 submitted 25 February, 2023; originally announced February 2023.

    Comments: Our project page: https://hohonu-vicml.github.io/DirectedDiffusion.Page

  12. arXiv:2301.09198  [pdf, other]

    eess.AS cs.SD

    Estimation of Source and Receiver Positions, Room Geometry and Reflection Coefficients From a Single Room Impulse Response

    Authors: Wangyang Yu, W. Bastiaan Kleijn

    Abstract: We propose an algorithm to estimate source and receiver positions, room geometry and reflection coefficients from a single room impulse response simultaneously. It is based on a symmetry analysis of the room impulse response. The proposed method utilizes the times of arrivals of the direct path, first order reflections and second order reflections. The proposed method is robust to erroneous pulses…

    Submitted 22 January, 2023; originally announced January 2023.

  13. arXiv:2207.02262  [pdf, other]

    cs.SD cs.LG eess.AS

    Ultra-Low-Bitrate Speech Coding with Pretrained Transformers

    Authors: Ali Siahkoohi, Michael Chinen, Tom Denton, W. Bastiaan Kleijn, Jan Skoglund

    Abstract: Speech coding facilitates the transmission of speech over low-bandwidth networks with minimal distortion. Neural-network based speech codecs have recently demonstrated significant improvements in quality over traditional approaches. While this new generation of codecs is capable of synthesizing high-fidelity speech, their use of recurrent or convolutional layers often restricts their effective rec…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: Proceedings of INTERSPEECH 2022

  14. arXiv:2204.02040  [pdf]

    cs.SD cs.CR eess.AS

    On the Relevance of Bandwidth Extension for Speaker Verification

    Authors: Marcos Faundez-Zanuy, Mattias Nilsson, W. Bastiaan Kleijn

    Abstract: In this paper, we consider the effect of a bandwidth extension of narrow-band speech signals (0.3-3.4 kHz) to 0.3-8 kHz on speaker verification. Using covariance matrix based verification systems together with detection error trade-off curves, we compare the performance between systems operating on narrow-band, wide-band (0-8 kHz), and bandwidth-extended speech. The experiments were conducted usin…

    Submitted 5 April, 2022; originally announced April 2022.

    Comments: 4 pages published in 7th International Conference on Spoken Language Processing, September 16-20, 2002, Denver, Colorado, USA. arXiv admin note: text overlap with arXiv:2202.13865

    Journal ref: 7th International Conference on Spoken Language Processing (ICSLP2002), September 16-20, 2002

  15. arXiv:2203.13273  [pdf, ps, other]

    cs.LG

    A DNN Optimizer that Improves over AdaBelief by Suppression of the Adaptive Stepsize Range

    Authors: Guoqiang Zhang, Kenta Niwa, W. Bastiaan Kleijn

    Abstract: We make contributions towards improving adaptive-optimizer performance. Our improvements are based on suppression of the range of adaptive stepsizes in the AdaBelief optimizer. Firstly, we show that the particular placement of the parameter epsilon within the update expressions of AdaBelief reduces the range of the adaptive stepsizes, making AdaBelief closer to SGD with momentum. Secondly, we exte…

    Submitted 24 January, 2023; v1 submitted 24 March, 2022; originally announced March 2022.

    Comments: 10 pages

  16. arXiv:2202.13865  [pdf]

    cs.SD cs.LG eess.AS

    On the relevance of bandwidth extension for speaker identification

    Authors: Marcos Faundez-Zanuy, Mattias Nilsson, W. Bastiaan Kleijn

    Abstract: In this paper we discuss the relevance of bandwidth extension for speaker identification tasks. Mainly we want to study if it is possible to recognize voices that have been bandwidth extended. For this purpose, we created two different databases (microphonic and ISDN) of speech signals that were bandwidth extended from telephone bandwidth ([300, 3400] Hz) to full bandwidth ([100, 8000] Hz). We have…

    Submitted 24 February, 2022; originally announced February 2022.

    Comments: 4 pages

    Journal ref: 2002 11th European Signal Processing Conference, 2002, pp. 1-4

  17. arXiv:2112.06125  [pdf, ps, other]

    cs.LG cs.AI math.OC

    Extending AdamW by Leveraging Its Second Moment and Magnitude

    Authors: Guoqiang Zhang, Niwa Kenta, W. Bastiaan Kleijn

    Abstract: Recent work [4] analyses the local convergence of Adam in a neighbourhood of an optimal solution for a twice-differentiable function. It is found that the learning rate has to be sufficiently small to ensure local stability of the optimal solution. The above convergence results also hold for AdamW. In this work, we propose a new adaptive optimisation method by extending AdamW in two aspects with t…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: 9 pages

  18. arXiv:2108.07078  [pdf, other]

    math.ST

    Confidence sets in a sparse stochastic block model with two communities of unknown sizes

    Authors: B. J. K. Kleijn, J. van Waaij

    Abstract: In a sparse stochastic block model with two communities of unequal sizes we derive two posterior concentration inequalities, that imply (1) posterior (almost-)exact recovery of the community structure under sparsity bounds comparable to well-known sharp bounds in the planted bi-section model; (2) a construction of confidence sets for the community assignment from credible sets, with finite graph s…

    Submitted 16 August, 2021; originally announced August 2021.

    Comments: 27 pages, 4 figures

    MSC Class: 05C80; 60B10; 62G05; 62G15; 82B26; 94C15

  19. arXiv:2107.08809  [pdf, ps, other]

    cs.DC math.OC

    Revisiting the Primal-Dual Method of Multipliers for Optimisation over Centralised Networks

    Authors: Guoqiang Zhang, Kenta Niwa, W. Bastiaan Kleijn

    Abstract: The primal-dual method of multipliers (PDMM) was originally designed for solving a decomposable optimisation problem over a general network. In this paper, we revisit PDMM for optimisation over a centralized network. We first note that the recently proposed method FedSplit [1] implements PDMM for a centralized network. In [1], Inexact FedSplit (i.e., gradient based FedSplit) was also studied both…

    Submitted 19 July, 2021; originally announced July 2021.

    Comments: 13 pages

  20. arXiv:2105.08478  [pdf, ps, other]

    math.ST

    Uncertainty quantification and testing in a stochastic block model with two unequal communities

    Authors: J. van Waaij, B. J. K. Kleijn

    Abstract: We show posterior convergence for the community structure in the planted bi-section model, for several interesting priors. Examples include where the label on each vertex is iid Bernoulli distributed, with some parameter $r\in(0,1)$. The parameter $r$ may be fixed, or equipped with a beta distribution. We do not have constraints on the class sizes, which might be as small as zero, or include all v…

    Submitted 13 August, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

    Comments: arXiv admin note: text overlap with arXiv:1810.09533, arXiv:2005.01362

    MSC Class: 62G15; 62G05; 82B26; 05C80

  21. arXiv:2102.11906  [pdf, other]

    eess.AS cs.SD

    Handling Background Noise in Neural Speech Generation

    Authors: Tom Denton, Alejandro Luebs, Felicia S. C. Lim, Andrew Storus, Hengchin Yeh, W. Bastiaan Kleijn, Jan Skoglund

    Abstract: Recent advances in neural-network based generative modeling of speech have shown great potential for speech coding. However, the performance of such models drops when the input is not clean speech, e.g., in the presence of background noise, preventing its use in practical applications. In this paper we examine the reason and discuss methods to overcome this issue. Placing a denoising preprocessing…

    Submitted 23 February, 2021; originally announced February 2021.

    Comments: 5 pages, 3 figures, presented at the Asilomar Conference on Signals, Systems, and Computers 2020

  22. arXiv:2102.09660  [pdf, other]

    eess.AS cs.SD

    Generative Speech Coding with Predictive Variance Regularization

    Authors: W. Bastiaan Kleijn, Andrew Storus, Michael Chinen, Tom Denton, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Hengchin Yeh

    Abstract: The recent emergence of machine-learning based generative models for speech suggests a significant reduction in bit rate for speech codecs is possible. However, the performance of generative models deteriorates significantly with the distortions present in real-world input signals. We argue that this deterioration is due to the sensitivity of the maximum likelihood criterion to outliers and the in…

    Submitted 18 February, 2021; originally announced February 2021.

    MSC Class: 94; ACM Class: I.m

  23. arXiv:2012.10145  [pdf, other]

    q-fin.TR q-fin.ST

    Heavy tailed distributions in closing auctions

    Authors: M. Derksen, B. Kleijn, R. de Vilder

    Abstract: We study the tails of closing auction return distributions for a sample of liquid European stocks. We use the stochastic call auction model of Derksen et al. (2020a), to derive a relation between tail exponents of limit order placement distributions and tail exponents of the resulting closing auction return distribution and we verify this relation empirically. Counter-intuitively, large closing pr…

    Submitted 18 December, 2020; originally announced December 2020.

  24. arXiv:2005.03807  [pdf, other]

    cs.LG stat.ML

    Variance Constrained Autoencoding

    Authors: D. T. Braithwaite, M. O'Connor, W. B. Kleijn

    Abstract: Recent state-of-the-art autoencoder based generative models have an encoder-decoder structure and learn a latent representation with a pre-defined distribution that can be sampled from. Implementing the encoder networks of these models in a stochastic manner provides a natural and common approach to avoid overfitting and enforce a smooth decoder function. However, we show that for stochastic encod…

    Submitted 7 May, 2020; originally announced May 2020.

    Comments: 20 pages

  25. arXiv:2005.01362  [pdf, other]

    math.ST

    Uncertainty quantification in the stochastic block model with an unknown number of classes

    Authors: J. van Waaij, B. J. K. Kleijn

    Abstract: We study the frequentist properties of Bayesian statistical inference for the stochastic block model, with an unknown number of classes of varying sizes. We equip the space of vertex labellings with a prior on the number of classes and, conditionally, a prior on the labels. The number of classes may grow to infinity as a function of the number of vertices, depending on the sparsity of the graph. W…

    Submitted 4 May, 2020; originally announced May 2020.

    MSC Class: 62G20; 62G05; 62G15

  26. arXiv:2003.10353  [pdf, other]

    q-fin.TR

    Effects of MiFID II on stock price formation

    Authors: Mike Derksen, Bas Kleijn, Robin de Vilder

    Abstract: This paper examines effects of MiFID II on European stock markets. We study the effects of the new tick size regime, both intraday and in the closing auction. An increase (decrease) in tick size is associated with a decrease (increase) in intraday liquidity, but a more (less) stable market. In the closing auction an increase in tick size has a positive effect on liquidity. Moreover, we report a po…

    Submitted 25 August, 2020; v1 submitted 23 March, 2020; originally announced March 2020.

  27. arXiv:1912.08308  [pdf, other]

    eess.SP

    Distributed Network Privacy using Error Correcting Codes

    Authors: Matt O'Connor, W. Bastiaan Kleijn

    Abstract: Most current distributed processing research deals with improving the flexibility and convergence speed of algorithms for networks of finite size with no constraints on information sharing and no concept for expected levels of signal privacy. In this work we investigate the concept of data privacy in unbounded public networks, where linear codes are used to create hard limits on the number of node…

    Submitted 17 December, 2019; originally announced December 2019.

  28. arXiv:1911.09445  [pdf, ps, other]

    cs.LG stat.ML

    Approximated Orthonormal Normalisation in Training Neural Networks

    Authors: Guoqiang Zhang, Kenta Niwa, W. B. Kleijn

    Abstract: Generalisation of a deep neural network (DNN) is one major concern when employing the deep learning approach for solving practical problems. In this paper we propose a new technique, named approximated orthonormal normalisation (AON), to improve the generalisation capacity of a DNN model. Considering a weight matrix W from a particular neural layer in the model, our objective is to design a functi…

    Submitted 14 January, 2020; v1 submitted 21 November, 2019; originally announced November 2019.

  29. arXiv:1909.04776  [pdf, other]

    eess.AS cs.SD

    Generative Speech Enhancement Based on Cloned Networks

    Authors: Michael Chinen, W. Bastiaan Kleijn, Felicia S. C. Lim, Jan Skoglund

    Abstract: We propose to implement speech enhancement by the regeneration of clean speech from a salient representation extracted from the noisy signal. The network that extracts salient features is trained using a set of weight-sharing clones of the extractor network. The clones receive mel-frequency spectra of different noisy versions of the same speech signal as input. By encouraging the outputs of the cl…

    Submitted 10 September, 2019; originally announced September 2019.

    Comments: Accepted WASPAA 2019

  30. arXiv:1908.07045  [pdf, other]

    eess.AS cs.SD

    Salient Speech Representations Based on Cloned Networks

    Authors: W. Bastiaan Kleijn, Felicia S. C. Lim, Michael Chinen, Jan Skoglund

    Abstract: We define salient features as features that are shared by signals that are defined as being equivalent by a system designer. The definition allows the designer to contribute qualitative information. We aim to find salient features that are useful as conditioning for generative networks. We extract salient features by jointly training a set of clones of an encoder network. Each network clone receiv…

    Submitted 19 August, 2019; originally announced August 2019.

    Comments: Interspeech 2019

  31. arXiv:1908.01580  [pdf, other]

    cs.LG stat.ML

    The HSIC Bottleneck: Deep Learning without Back-Propagation

    Authors: Wan-Duo Kurt Ma, J. P. Lewis, W. Bastiaan Kleijn

    Abstract: We introduce the HSIC (Hilbert-Schmidt independence criterion) bottleneck for training deep neural networks. The HSIC bottleneck is an alternative to the conventional cross-entropy loss and backpropagation that has a number of distinct advantages. It mitigates exploding and vanishing gradients, resulting in the ability to learn very deep networks without skip connections. There is no requirement f…

    Submitted 5 December, 2019; v1 submitted 5 August, 2019; originally announced August 2019.

  32. arXiv:1904.07583  [pdf, other]

    q-fin.TR

    Clearing price distributions in call auctions

    Authors: M. Derksen, B. Kleijn, R. de Vilder

    Abstract: We propose a model for price formation in financial markets based on clearing of a standard call auction with random orders, and verify its validity for prediction of the daily closing price distribution statistically. The model considers random buy and sell orders, placed following demand- and supply-side valuation distributions; an equilibrium equation then leads to a distribution for clearing p…

    Submitted 28 November, 2019; v1 submitted 16 April, 2019; originally announced April 2019.

  33. arXiv:1904.00869  [pdf, ps, other]

    eess.AS

    Room Geometry Estimation from Room Impulse Responses using Convolutional Neural Networks

    Authors: Wangyang Yu, W. Bastiaan Kleijn

    Abstract: We describe a new method to estimate the geometry of a room given room impulse responses. The method utilises convolutional neural networks to estimate the room geometry and uses the mean square error as the loss function. In contrast to existing methods, we do not require the position or distance of sources or receivers in the room. The method can be used with only a single room impulse response…

    Submitted 15 May, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

  34. arXiv:1902.09030  [pdf, ps, other]

    cs.LG stat.ML

    Rapidly Adapting Moment Estimation

    Authors: Guoqiang Zhang, Kenta Niwa, W. Bastiaan Kleijn

    Abstract: Adaptive gradient methods such as Adam have been shown to be very effective for training deep neural networks (DNNs) by tracking the second moment of gradients to compute the individual learning rates. Differently from existing methods, we make use of the most recent first moment of gradients to compute the individual learning rates per iteration. The motivation behind it is that the dynamic varia…

    Submitted 24 February, 2019; originally announced February 2019.

    Comments: 11 pages

  35. arXiv:1810.09533  [pdf, other]

    math.ST

    Asymptotic uncertainty quantification for communities in sparse planted bi-section models

    Authors: B. J. K. Kleijn, J. van Waaij

    Abstract: Posterior distributions for community structure in sparse planted bi-section models are shown to achieve exact (resp. almost-exact) recovery, with sharp bounds for the sparsity regimes where edge probabilities decrease as $O(\log(n)/n)$ (resp. $O(1/n)$). Assuming posterior recovery, one may interpret credible sets (resp. enlarged credible sets) as asymptotically consistent confidence sets; the dia…

    Submitted 2 March, 2023; v1 submitted 22 October, 2018; originally announced October 2018.

    Comments: 29 pp., 3 fig

    MSC Class: 62G20; 62G05; 62G15

  36. arXiv:1807.11320  [pdf, other]

    cs.LG eess.SP stat.ML

    Kernel Density Estimation-Based Markov Models with Hidden State

    Authors: Gustav Eje Henter, Arne Leijon, W. Bastiaan Kleijn

    Abstract: We consider Markov models of stochastic processes where the next-step conditional distribution is defined by a kernel density estimator (KDE), similar to Markov forecast densities and certain time-series bootstrap schemes. The KDE Markov models (KDE-MMs) we discuss are nonlinear, nonparametric, fully probabilistic representations of stationary processes, based on techniques with strong asymptotic…

    Submitted 30 July, 2018; originally announced July 2018.

    Comments: 14 pages, 6 figures

    MSC Class: 62M10; 62G07; ACM Class: G.3

  37. arXiv:1807.07306  [pdf, other]

    cs.LG cs.IT stat.ML

    Bounded Information Rate Variational Autoencoders

    Authors: D. T. Braithwaite, W. B. Kleijn

    Abstract: This paper introduces a new member of the family of Variational Autoencoders (VAE) that constrains the rate of information transferred by the latent layer. The latent layer is interpreted as a communication channel, the information rate of which is bound by imposing a pre-set signal-to-noise ratio. The new constraint subsumes the mutual information between the input and latent variables, combining…

    Submitted 25 July, 2018; v1 submitted 19 July, 2018; originally announced July 2018.

    Comments: Presented at KDD 2018 Deep Learning Day. Minor changes and correction of rate calculations, overall results remain the same

  38. arXiv:1807.04871  [pdf, other]

    math.OC

    Bregman Monotone Operator Splitting

    Authors: Kenta Niwa, W. Bastiaan Kleijn

    Abstract: Monotone operator splitting is a powerful paradigm that facilitates parallel processing for optimization problems where the cost function can be split into two convex functions. We propose a generalized form of monotone operator splitting based on Bregman divergence. We show that an appropriate design of the Bregman divergence leads to faster convergence than conventional splitting algorithms. The…

    Submitted 10 November, 2018; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: 19 pages, 1 figure

    MSC Class: 46N10

  39. arXiv:1803.06718  [pdf, other]

    eess.AS cs.SD

    Directional emphasis in ambisonics

    Authors: W. Bastiaan Kleijn

    Abstract: We describe an ambisonics enhancement method that increases the signal strength in specified directions at low computational cost. The method can be used in a static setup to emphasize the signal arriving from a particular direction or set of directions. It can also be used in an adaptive arrangement where it sharpens directionality and reduces the distortion in timbre associated with low-degree a…

    Submitted 24 May, 2018; v1 submitted 18 March, 2018; originally announced March 2018.

  40. arXiv:1712.01120  [pdf, other]

    eess.AS cs.SD eess.SP

    Wavenet based low rate speech coding

    Authors: W. Bastiaan Kleijn, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Florian Stimberg, Quan Wang, Thomas C. Walters

    Abstract: Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used. We describe how a WaveNet generative speech model can be used to generate high quality speech from the bit stream of a standard parametric coder operating at 2.4 kb/s. We compare this parametric coder with a waveform coder based on the same generative m…

    Submitted 1 December, 2017; originally announced December 2017.

    Comments: 5 pages, 2 figures

  41. arXiv:1709.08053  [pdf, other]

    math.NA

    Finite Synchrosqueezing Transform Based On The STFT

    Authors: Mozhgan Mohammadpour, Bastiaan Kleijn, Rajab Ali Kamyabi Gol

    Abstract: The finite STFT Synchrosqueezing transform is a time-frequency analysis method that can decompose finite complex signals into time-varying oscillatory components. This representation is sparse and invertible, allowing recovery of the original signal. The STFT Synchrosqueezing transform on finite dimensional signals has the advantage of an efficient matrix representation. This article defines the f…

    Submitted 23 September, 2017; originally announced September 2017.

    Comments: 10 pages, 6 figures

  42. arXiv:1708.06881  [pdf, ps, other]

    math.OC cs.DC cs.IT

    On Relationship between Primal-Dual Method of Multipliers and Kalman Filter

    Authors: Guoqiang Zhang, W. Bastiaan Kleijn, Richard Heusdens

    Abstract: Recently the primal-dual method of multipliers (PDMM), a novel distributed optimization method, was proposed for solving a general class of decomposable convex optimizations over graphic models. In this work, we first study the convergence properties of PDMM for decomposable quadratic optimizations over tree-structured graphs. We show that with proper parameter selection, PDMM converges to its opt…

    Submitted 22 August, 2017; originally announced August 2017.

    Comments: 11 pages

  43. An evaluation of intrusive instrumental intelligibility metrics

    Authors: Steven Van Kuyk, W. Bastiaan Kleijn, Richard C. Hendriks

    Abstract: Instrumental intelligibility metrics are commonly used as an alternative to listening tests. This paper evaluates 12 monaural intrusive intelligibility metrics: SII, HEGP, CSII, HASPI, NCM, QSTI, STOI, ESTOI, MIKNN, SIMI, SIIB, and $\text{sEPSM}^\text{corr}$. In addition, this paper investigates the ability of intelligibility metrics to generalize to new types of distortions and analyzes why the t…

    Submitted 28 July, 2018; v1 submitted 20 August, 2017; originally announced August 2017.

    Comments: Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2018

  44. An instrumental intelligibility metric based on information theory

    Authors: Steven Van Kuyk, W. Bastiaan Kleijn, Richard C. Hendriks

    Abstract: We propose a monaural intrusive instrumental intelligibility metric called speech intelligibility in bits (SIIB). SIIB is an estimate of the amount of information shared between a talker and a listener in bits per second. Unlike existing information theoretic intelligibility metrics, SIIB accounts for talker variability and statistical dependencies between time-frequency units. Our evaluation show…

    Submitted 14 January, 2018; v1 submitted 17 August, 2017; originally announced August 2017.

    Comments: Published in IEEE Signal Processing Letters

    Journal ref: IEEE Signal Processing Letters, vol. 25, no. 1, pp. 115-119, Jan. 2018

  45. arXiv:1706.02654  [pdf, other]

    math.OC

    Derivation and Analysis of the Primal-Dual Method of Multipliers Based on Monotone Operator Theory

    Authors: Thomas Sherson, Richard Heusdens, W. Bastiaan Kleijn

    Abstract: In this paper we present a novel derivation for an existing node-based algorithm for distributed optimisation termed the primal-dual method of multipliers (PDMM). In contrast to its initial derivation, in this work monotone operator theory is used to connect PDMM with other first-order methods such as Douglas-Rachford splitting and the alternating direction method of multipliers thus providing ins…

    Submitted 6 November, 2017; v1 submitted 8 June, 2017; originally announced June 2017.

    Comments: 13 pages, 6 figures

  46. arXiv:1705.09888  [pdf, other]

    cs.CV

    Cross-modal Subspace Learning for Fine-grained Sketch-based Image Retrieval

    Authors: Peng Xu, Qiyue Yin, Yongye Huang, Yi-Zhe Song, Zhanyu Ma, Liang Wang, Tao Xiang, W. Bastiaan Kleijn, Jun Guo

    Abstract: Sketch-based image retrieval (SBIR) is challenging due to the inherent domain-gap between sketch and photo. Compared with pixel-perfect depictions of photos, sketches are highly abstract, iconic renderings of the real world. Therefore, matching sketch and photo directly using low-level visual clues is insufficient, since a common low-level subspace that traverses semantically across the two m…

    Submitted 27 May, 2017; originally announced May 2017.

    Comments: Accepted by Neurocomputing

  47. arXiv:1702.03380  [pdf, ps, other]

    cs.LG cs.DC

    Training Deep Neural Networks via Optimization Over Graphs

    Authors: Guoqiang Zhang, W. Bastiaan Kleijn

    Abstract: In this work, we propose to train a deep neural network by distributed optimization over a graph. Two nonlinear functions are considered: the rectified linear unit (ReLU) and a linear unit with both lower and upper cutoffs (DCutLU). The problem reformulation over a graph is realized by explicitly representing ReLU or DCutLU using a set of slack variables. We then apply the alternating direction me…

    Submitted 17 June, 2017; v1 submitted 10 February, 2017; originally announced February 2017.

    Comments: 5 pages

  48. arXiv:1611.08444  [pdf, other]

    math.ST

    On the frequentist validity of Bayesian limits

    Authors: B. J. K. Kleijn

    Abstract: To the frequentist who computes posteriors, not all priors are useful asymptotically: in this paper Schwartz's 1965 Kullback-Leibler condition is generalised to enable frequentist interpretation of convergence of posterior distributions with the complex models and often dependent datasets in present-day statistical applications. We prove four simple and fully general frequentist theorems, for post…

    Submitted 27 November, 2017; v1 submitted 25 November, 2016; originally announced November 2016.

    Comments: journal article: main text 20pp., appendices 35 pp., 1 figure

    MSC Class: 62G05; 62G20 (Primary); 62G10; 62C10; 62B15 (Secondary)

    ACM Class: G.3

  49. The semi-parametric Bernstein-von Mises theorem for regression models with symmetric errors

    Authors: Minwoo Chae, Yongdai Kim, Bas Kleijn

    Abstract: In a smooth semi-parametric model, the marginal posterior distribution for a finite dimensional parameter of interest is expected to be asymptotically equivalent to the sampling distribution of any efficient point-estimator. The assertion leads to asymptotic equivalence of credible and confidence sets for the parameter of interest and is known as the semi-parametric Bernstein-von Mises theorem. In…

    Submitted 12 January, 2017; v1 submitted 14 July, 2016; originally announced July 2016.

    Comments: 46 pages, 1 figure

  50. arXiv:1607.03516  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Deep Reconstruction-Classification Networks for Unsupervised Domain Adaptation

    Authors: Muhammad Ghifary, W. Bastiaan Kleijn, Mengjie Zhang, David Balduzzi, Wen Li

    Abstract: In this paper, we propose a novel unsupervised domain adaptation algorithm based on deep learning for visual object recognition. Specifically, we design a new model called Deep Reconstruction-Classification Network (DRCN), which jointly learns a shared encoding representation for two tasks: i) supervised classification of labeled source data, and ii) unsupervised reconstruction of unlabeled target…

    Submitted 1 August, 2016; v1 submitted 12 July, 2016; originally announced July 2016.

    Comments: to appear in European Conference on Computer Vision (ECCV) 2016