Showing 1–50 of 68 results for author: Baraniuk, R G

  1. arXiv:2406.13781  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    A Primal-Dual Framework for Transformers and Neural Networks

    Authors: Tan M. Nguyen, Tam Nguyen, Nhat Ho, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher

    Abstract: Self-attention is key to the remarkable success of transformers in sequence modeling tasks including many applications in natural language processing and computer vision. Like neural network layers, these attention mechanisms are often developed by heuristics and experience. To provide a principled framework for constructing attention layers in transformers, we show that the self-attention corresp…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted to ICLR 2023, 26 pages, 4 figures, 14 tables

  2. arXiv:2405.13977  [pdf, ps, other]

    cs.LG stat.ML

    Improving Fairness and Mitigating MADness in Generative Models

    Authors: Paul Mayer, Lorenzo Luzi, Ali Siahkoohi, Don H. Johnson, Richard G. Baraniuk

    Abstract: Generative models unfairly penalize data belonging to minority classes, suffer from model autophagy disorder (MADness), and learn biased estimates of the underlying distribution parameters. Our theoretical and empirical results show that training generative models with intentionally designed hypernetworks leads to models that 1) are more fair when generating datapoints belonging to minority classe…

    Submitted 3 October, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

    MSC Class: 68T07

  3. arXiv:2401.14429  [pdf, ps, other]

    cs.LG cs.RO eess.SP stat.ML

    [Re] The Discriminative Kalman Filter for Bayesian Filtering with Nonlinear and Non-Gaussian Observation Models

    Authors: Josue Casco-Rodriguez, Caleb Kemere, Richard G. Baraniuk

    Abstract: Kalman filters provide a straightforward and interpretable means to estimate hidden or latent variables, and have found numerous applications in control, robotics, signal processing, and machine learning. One such application is neural decoding for neuroprostheses. In 2020, Burkhart et al. thoroughly evaluated their new version of the Kalman filter that leverages Bayes' theorem to improve filter p…

    Submitted 24 January, 2024; originally announced January 2024.

  4. arXiv:2210.12100  [pdf, other]

    cs.CV cs.LG stat.ML

    Boomerang: Local sampling on image manifolds using diffusion models

    Authors: Lorenzo Luzi, Paul M Mayer, Josue Casco-Rodriguez, Ali Siahkoohi, Richard G. Baraniuk

    Abstract: The inference stage of diffusion models can be seen as running a reverse-time diffusion stochastic differential equation, where samples from a Gaussian latent distribution are transformed into samples from a target distribution that usually reside on a low-dimensional manifold, e.g., an image manifold. The intermediate values between the initial latent space and the image manifold can be interpret…

    Submitted 17 April, 2024; v1 submitted 21 October, 2022; originally announced October 2022.

    Comments: Published in Transactions on Machine Learning Research

  5. arXiv:2209.14778  [pdf, other]

    cs.LG cs.AI cs.CG cs.CV stat.ML

    Batch Normalization Explained

    Authors: Randall Balestriero, Richard G. Baraniuk

    Abstract: A critically important, ubiquitous, and yet poorly understood ingredient in modern deep networks (DNs) is batch normalization (BN), which centers and normalizes the feature maps. To date, only limited progress has been made understanding why BN boosts DN learning and inference performance; work has focused exclusively on showing that BN smooths a DN's loss landscape. In this paper, we study BN the…

    Submitted 29 September, 2022; originally announced September 2022.

  6. arXiv:2205.14055  [pdf, other]

    cs.LG stat.ML

    A Blessing of Dimensionality in Membership Inference through Regularization

    Authors: Jasper Tan, Daniel LeJeune, Blake Mason, Hamid Javadi, Richard G. Baraniuk

    Abstract: Is overparameterization a privacy liability? In this work, we study the effect that the number of parameters has on a classifier's vulnerability to membership inference attacks. We first demonstrate how the number of parameters of a model can induce a privacy-utility trade-off: increasing the number of parameters generally improves generalization performance at the expense of lower privacy. Howev…

    Submitted 13 April, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: 26 pages, 14 figures

  7. arXiv:2204.03145  [pdf, other]

    stat.AP cs.LG stat.ML

    DeepTensor: Low-Rank Tensor Decomposition with Deep Network Priors

    Authors: Vishwanath Saragadam, Randall Balestriero, Ashok Veeraraghavan, Richard G. Baraniuk

    Abstract: DeepTensor is a computationally efficient framework for low-rank decomposition of matrices and tensors using deep generative networks. We decompose a tensor as the product of low-rank tensor factors (e.g., a matrix as the outer product of two vectors), where each low-rank tensor is generated by a deep network (DN) that is trained in a self-supervised manner to minimize the mean-squared approximati…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: 14 pages

  8. arXiv:2202.01243  [pdf, other]

    stat.ML cs.LG

    Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference

    Authors: Jasper Tan, Blake Mason, Hamid Javadi, Richard G. Baraniuk

    Abstract: A surprising phenomenon in modern machine learning is the ability of a highly overparameterized model to generalize well (small error on the test data) even when it is trained to memorize the training data (zero error on the training data). This has led to an arms race towards increasingly overparameterized models (c.f., deep learning). In this paper, we study an underexplored hidden cost of overp…

    Submitted 30 November, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: 25 pages, 8 figures

  9. arXiv:2110.08678  [pdf, other]

    cs.LG cs.CL stat.ML

    Improving Transformers with Probabilistic Attention Keys

    Authors: Tam Nguyen, Tan M. Nguyen, Dung D. Le, Duy Khuong Nguyen, Viet-Anh Tran, Richard G. Baraniuk, Nhat Ho, Stanley J. Osher

    Abstract: Multi-head attention is a driving force behind state-of-the-art transformers, which achieve remarkable performance across a variety of natural language processing (NLP) and computer vision tasks. It has been observed that for many applications, those attention heads learn redundant embedding, and most of them can be removed without degrading the performance of the model. Inspired by this observati…

    Submitted 12 June, 2022; v1 submitted 16 October, 2021; originally announced October 2021.

    Comments: 27 pages, 16 figures, 10 tables

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022

  10. arXiv:2110.02915  [pdf, other]

    cs.LG eess.SP stat.CO

    Unrolling Particles: Unsupervised Learning of Sampling Distributions

    Authors: Fernando Gama, Nicolas Zilberstein, Richard G. Baraniuk, Santiago Segarra

    Abstract: Particle filtering is used to compute good nonlinear estimates of complex systems. It samples trajectories from a chosen distribution and computes the estimate as a weighted average. Easy-to-sample distributions often lead to degenerate samples where only one trajectory carries all the weight, negatively affecting the resulting performance of the estimate. While much research has been done on the…

    Submitted 6 October, 2021; originally announced October 2021.

  11. arXiv:2109.02355  [pdf, other]

    stat.ML cs.LG

    A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning

    Authors: Yehuda Dar, Vidya Muthukumar, Richard G. Baraniuk

    Abstract: The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex with respect to the size of the training dataset, which results in them perfectly fitting (i.e., interpo…

    Submitted 6 September, 2021; originally announced September 2021.

  12. arXiv:2106.07769  [pdf, other]

    cs.LG stat.ML

    The Flip Side of the Reweighted Coin: Duality of Adaptive Dropout and Regularization

    Authors: Daniel LeJeune, Hamid Javadi, Richard G. Baraniuk

    Abstract: Among the most successful methods for sparsifying deep (neural) networks are those that adaptively mask the network weights throughout training. By examining this masking, or dropout, in the linear case, we uncover a duality between such adaptive methods and regularization through the so-called "$η$-trick" that casts both as iteratively reweighted optimizations. We show that any dropout strategy t…

    Submitted 3 January, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: 19 pages, 2 figures. Appeared in NeurIPS 2021. Small typographical correction

  13. arXiv:2104.07824  [pdf, ps, other]

    cs.LG stat.ML

    NePTuNe: Neural Powered Tucker Network for Knowledge Graph Completion

    Authors: Shashank Sonkar, Arzoo Katiyar, Richard G. Baraniuk

    Abstract: Knowledge graphs link entities through relations to provide a structured representation of real world facts. However, they are often incomplete, because they are based on only a small fraction of all plausible facts. The task of knowledge graph completion via link prediction aims to overcome this challenge by inferring missing facts represented as links between entities. Current approaches to link…

    Submitted 15 April, 2021; originally announced April 2021.

  14. arXiv:2006.14600  [pdf, ps, other]

    cs.LG stat.ML

    Ensembles of Generative Adversarial Networks for Disconnected Data

    Authors: Lorenzo Luzi, Randall Balestriero, Richard G. Baraniuk

    Abstract: Most current computer vision datasets are composed of disconnected sets, such as images from different classes. We prove that distributions of this type of data cannot be represented with a continuous generative network without error. They can be represented in two ways: with an ensemble of networks or with a single network with truncated latent space. We show that ensembles are more desirable tha…

    Submitted 25 June, 2020; originally announced June 2020.

  15. arXiv:2006.10023  [pdf, other]

    cs.LG stat.ML

    Analytical Probability Distributions and EM-Learning for Deep Generative Networks

    Authors: Randall Balestriero, Sebastien Paris, Richard G. Baraniuk

    Abstract: Deep Generative Networks (DGNs) with probabilistic modeling of their output and latent space are currently trained via Variational Autoencoders (VAEs). In the absence of a known analytical form for the posterior and likelihood expectation, VAEs resort to approximations, including (Amortized) Variational Inference (AVI) and Monte-Carlo (MC) sampling. We exploit the Continuous Piecewise Affine (CPA)…

    Submitted 17 June, 2020; originally announced June 2020.

  16. arXiv:2006.07460  [pdf, other]

    cs.LG stat.ML

    An Improved Semi-Supervised VAE for Learning Disentangled Representations

    Authors: Weili Nie, Zichao Wang, Ankit B. Patel, Richard G. Baraniuk

    Abstract: Learning interpretable and disentangled representations is a crucial yet challenging task in representation learning. In this work, we focus on semi-supervised disentanglement learning and extend work by Locatello et al. (2019) by introducing another source of supervision that we denote as label replacement. Specifically, during training, we replace the inferred representation associated with a da…

    Submitted 22 June, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

  17. arXiv:2006.07002  [pdf, other]

    cs.LG stat.ML

    Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks

    Authors: Yehuda Dar, Richard G. Baraniuk

    Abstract: We study the transfer learning process between two linear regression problems. An important and timely special case is when the regressors are overparameterized and perfectly interpolate their training data. We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution are constrained to the values learned for a related source task. We analytically charac…

    Submitted 28 September, 2022; v1 submitted 12 June, 2020; originally announced June 2020.

  18. arXiv:2006.06919  [pdf, other]

    cs.LG math.DS stat.ML

    MomentumRNN: Integrating Momentum into Recurrent Neural Networks

    Authors: Tan M. Nguyen, Richard G. Baraniuk, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang

    Abstract: Designing deep neural networks is an art that often involves an expensive search over candidate architectures. To overcome this for recurrent neural nets (RNNs), we establish a connection between the hidden state dynamics in an RNN and gradient descent (GD). We then integrate momentum into this framework and propose a new family of RNNs, called MomentumRNNs. We theoretically prove and numeri…

    Submitted 11 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: 21 pages, 11 figures, Accepted for publication at Advances in Neural Information Processing Systems (NeurIPS) 2020

    MSC Class: 68T07; ACM Class: I.2

    Journal ref: Advances in Neural Information Processing Systems (NeurIPS) 2020

  19. arXiv:2005.12442  [pdf, other]

    cs.LG cs.AI stat.ML

    qDKT: Question-centric Deep Knowledge Tracing

    Authors: Shashank Sonkar, Andrew E. Waters, Andrew S. Lan, Phillip J. Grimaldi, Richard G. Baraniuk

    Abstract: Knowledge tracing (KT) models, e.g., the deep knowledge tracing (DKT) model, track an individual learner's acquisition of skills over time by examining the learner's performance on questions related to those skills. A practical limitation in most existing KT models is that all questions nested under a particular skill are treated as equivalent observations of a learner's ability, which is an inacc…

    Submitted 25 May, 2020; originally announced May 2020.

  20. arXiv:2005.06001  [pdf, other]

    eess.IV cs.LG stat.ML

    Deep Learning Techniques for Inverse Problems in Imaging

    Authors: Gregory Ongie, Ajil Jalal, Christopher A. Metzler, Richard G. Baraniuk, Alexandros G. Dimakis, Rebecca Willett

    Abstract: Recent work in machine learning shows that deep neural networks can be used to solve a wide variety of inverse problems arising in computational imaging. We explore the central prevailing themes of this emerging area and present a taxonomy that can be used to categorize different problems and reconstruction methods. Our taxonomy is organized along two central axes: (1) whether or not a forward mod…

    Submitted 12 May, 2020; originally announced May 2020.

  21. arXiv:2003.05980  [pdf, other]

    cs.CY cs.LG stat.AP

    Educational Question Mining At Scale: Prediction, Analysis and Personalization

    Authors: Zichao Wang, Sebastian Tschiatschek, Simon Woodhead, Jose Miguel Hernandez-Lobato, Simon Peyton Jones, Richard G. Baraniuk, Cheng Zhang

    Abstract: Online education platforms enable teachers to share a large number of educational resources such as questions to form exercises and quizzes for students. With large volumes of available questions, it is important to have an automated way to quantify their properties and intelligently select them for students, enabling effective and personalized learning experiences. In this work, we propose a fram…

    Submitted 28 February, 2021; v1 submitted 12 March, 2020; originally announced March 2020.

    Comments: Accepted at AAAI-EAAI 2021

  22. arXiv:2002.10614  [pdf, other]

    cs.LG stat.ML

    Subspace Fitting Meets Regression: The Effects of Supervision and Orthonormality Constraints on Double Descent of Generalization Errors

    Authors: Yehuda Dar, Paul Mayer, Lorenzo Luzi, Richard G. Baraniuk

    Abstract: We study the linear subspace fitting problem in the overparameterized setting, where the estimated subspace can perfectly interpolate the training examples. Our scope includes the least-squares solutions to subspace fitting tasks with varying levels of supervision in the training data (i.e., the proportion of input-output examples of the desired low-dimensional mapping) and orthonormality of the v…

    Submitted 20 August, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

  23. arXiv:2002.10583  [pdf, other]

    cs.LG cs.NE stat.ML

    Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent

    Authors: Bao Wang, Tan M. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher

    Abstract: Stochastic gradient descent (SGD) with constant momentum and its variants such as Adam are the optimization algorithms of choice for training deep neural networks (DNNs). Since DNN training is incredibly computationally expensive, there is great interest in speeding up the convergence. Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimizatio…

    Submitted 26 April, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: 35 pages, 16 figures, 18 tables

  24. arXiv:1912.03978  [pdf, other]

    cs.LG cs.CV stat.ML

    InfoCNF: An Efficient Conditional Continuous Normalizing Flow with Adaptive Solvers

    Authors: Tan M. Nguyen, Animesh Garg, Richard G. Baraniuk, Anima Anandkumar

    Abstract: Continuous Normalizing Flows (CNFs) have emerged as promising deep generative models for a wide range of tasks thanks to their invertibility and exact likelihood estimation. However, conditioning CNFs on signals of interest for conditional image generation and downstream predictive tasks is inefficient due to the high-dimensional latent code generated by the model, which needs to be of the same si…

    Submitted 9 December, 2019; originally announced December 2019.

    Comments: 17 pages, 14 figures, 2 tables

  25. arXiv:1910.04743  [pdf, other]

    stat.ML cs.LG

    The Implicit Regularization of Ordinary Least Squares Ensembles

    Authors: Daniel LeJeune, Hamid Javadi, Richard G. Baraniuk

    Abstract: Ensemble methods that average over a collection of independent predictors that are each limited to a subsampling of both the examples and features of the training data command a significant presence in machine learning, such as the ever-popular random forest, yet the nature of the subsampling effect, particularly of the features, is not well understood. We study the case of an ensemble of linear p…

    Submitted 24 March, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: 18 pages, 4 figures. To appear in AISTATS 2020

  26. arXiv:1909.11957  [pdf, other]

    cs.LG stat.ML

    Drawing Early-Bird Tickets: Towards More Efficient Training of Deep Networks

    Authors: Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, Yingyan Celine Lin

    Abstract: Frankle & Carbin (2019) show that there exist winning tickets (small but critical subnetworks) for dense, randomly initialized networks, that can be trained alone to achieve comparable accuracies to the latter in a similar number of iterations. However, the identification of these winning tickets still requires the costly train-prune-retrain process, limiting their practical benefits. In this pa…

    Submitted 3 March, 2025; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: Accepted as ICLR2020 Spotlight

  27. arXiv:1907.04572  [pdf, other]

    cs.LG cs.CV stat.ML

    Out-of-Distribution Detection Using Neural Rendering Generative Models

    Authors: Yujia Huang, Sihui Dai, Tan Nguyen, Richard G. Baraniuk, Anima Anandkumar

    Abstract: Out-of-distribution (OoD) detection is a natural downstream task for deep generative models, due to their ability to learn the input probability distribution. There are mainly two classes of approaches for OoD detection using deep generative models, viz., based on likelihood measure and the reconstruction loss. However, both approaches are unable to carry out OoD detection effectively, especially…

    Submitted 10 July, 2019; originally announced July 2019.

  28. arXiv:1905.11639  [pdf, other]

    cs.LG stat.ML

    Implicit Rugosity Regularization via Data Augmentation

    Authors: Daniel LeJeune, Randall Balestriero, Hamid Javadi, Richard G. Baraniuk

    Abstract: Deep (neural) networks have been applied productively in a wide range of supervised and unsupervised learning tasks. Unlike classical machine learning algorithms, deep networks typically operate in the overparameterized regime, where the number of parameters is larger than the number of training data points. Consequently, understanding the generalization properties and the role of (explicit…

    Submitted 10 October, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: 15 pages, 12 figures

  29. arXiv:1905.09190  [pdf, other]

    cs.LG stat.ML

    Thresholding Graph Bandits with GrAPL

    Authors: Daniel LeJeune, Gautam Dasarathy, Richard G. Baraniuk

    Abstract: In this paper, we introduce a new online decision making paradigm that we call Thresholding Graph Bandits. The main goal is to efficiently identify a subset of arms in a multi-armed bandit problem whose means are above a specified threshold. While traditionally in such problems, the arms are assumed to be independent, in our paradigm we further suppose that we have access to the similarity between…

    Submitted 24 March, 2020; v1 submitted 22 May, 2019; originally announced May 2019.

    Comments: 14 pages, 3 figures. To appear in AISTATS 2020

  30. arXiv:1905.08831  [pdf, other]

    cs.SI cs.LG eess.SP stat.ML

    IdeoTrace: A Framework for Ideology Tracing with a Case Study on the 2016 U.S. Presidential Election

    Authors: Indu Manickam, Andrew S. Lan, Gautam Dasarathy, Richard G. Baraniuk

    Abstract: The 2016 United States presidential election has been characterized as a period of extreme divisiveness that was exacerbated on social media by the influence of fake news, trolls, and social bots. However, the extent to which the public became more polarized in response to these influences over the course of the election is not well understood. In this paper we propose IdeoTrace, a framework for (…

    Submitted 30 May, 2019; v1 submitted 21 May, 2019; originally announced May 2019.

    Comments: 9 pages, 4 figures, submitted to ASONAM 2019

  31. arXiv:1902.09465  [pdf, other]

    cs.DS cs.LG stat.ML

    Adaptive Estimation for Approximate k-Nearest-Neighbor Computations

    Authors: Daniel LeJeune, Richard G. Baraniuk, Reinhard Heckel

    Abstract: Algorithms often carry out equally many computations for "easy" and "hard" problem instances. In particular, algorithms for finding nearest neighbors typically have the same running time regardless of the particular problem instance. In this paper, we consider the approximate k-nearest-neighbor problem, which is the problem of finding a subset of O(k) points in a given set of points that contains…

    Submitted 25 February, 2019; originally announced February 2019.

    Comments: 11 pages, 2 figures. To appear in AISTATS 2019

    Journal ref: Proceedings of Machine Learning Research 89 (2019):3099-3107

  32. arXiv:1902.06687  [pdf, other]

    cs.DS cs.CG cs.LG eess.SP stat.ML

    Sub-linear Memory Sketches for Near Neighbor Search on Streaming Data

    Authors: Benjamin Coleman, Richard G. Baraniuk, Anshumali Shrivastava

    Abstract: We present the first sublinear memory sketch that can be queried to find the nearest neighbors in a dataset. Our online sketching algorithm compresses an N element dataset to a sketch of size $O(N^b \log^3 N)$ in $O(N^{(b+1)} \log^3 N)$ time, where $b < 1$. This sketch can correctly report the nearest neighbors of any query that satisfies a stability condition parameterized by $b$. We achieve subl…

    Submitted 14 September, 2020; v1 submitted 18 February, 2019; originally announced February 2019.

    Comments: Published in ICML2020

  33. arXiv:1811.02657  [pdf, other]

    cs.CV cs.AI cs.LG cs.NE stat.ML

    A Bayesian Perspective of Convolutional Neural Networks through a Deconvolutional Generative Model

    Authors: Tan Nguyen, Nhat Ho, Ankit Patel, Anima Anandkumar, Michael I. Jordan, Richard G. Baraniuk

    Abstract: Inspired by the success of Convolutional Neural Networks (CNNs) for supervised prediction in images, we design the Deconvolutional Generative Model (DGM), a new probabilistic generative model whose inference calculations correspond to those in a given CNN architecture. The DGM uses a CNN to design the prior distribution in the probabilistic model. Furthermore, the DGM generates images from coarse…

    Submitted 9 December, 2019; v1 submitted 31 October, 2018; originally announced November 2018.

    Comments: Keywords: neural nets, generative models, semi-supervised learning, cross-entropy, statistical guarantees. 80 pages, 7 figures, 8 tables

  34. arXiv:1810.09274  [pdf, other]

    cs.LG stat.ML

    From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference

    Authors: Randall Balestriero, Richard G. Baraniuk

    Abstract: Nonlinearity is crucial to the performance of a deep (neural) network (DN). To date there has been little progress understanding the menagerie of available nonlinearities, but recently progress has been made on understanding the rôle played by piecewise affine and convex nonlinearities like the ReLU and absolute value activation functions and max-pooling. In particular, DN layers constructed from…

    Submitted 22 October, 2018; originally announced October 2018.

  35. arXiv:1806.04310  [pdf, other]

    cs.DS cs.LG stat.ML

    MISSION: Ultra Large-Scale Feature Selection using Count-Sketches

    Authors: Amirali Aghazadeh, Ryan Spring, Daniel LeJeune, Gautam Dasarathy, Anshumali Shrivastava, Richard G. Baraniuk

    Abstract: Feature selection is an important challenge in machine learning. It plays a crucial role in the explainability of machine-driven decisions that are rapidly permeating throughout modern society. Unfortunately, the explosion in the size and dimensionality of real-world datasets poses a severe challenge to standard feature selection algorithms. Today, it is not uncommon for datasets to have billions…

    Submitted 11 June, 2018; originally announced June 2018.

  36. arXiv:1805.10531  [pdf, other]

    stat.ML cs.CV cs.LG

    Unsupervised Learning with Stein's Unbiased Risk Estimator

    Authors: Christopher A. Metzler, Ali Mousavi, Reinhard Heckel, Richard G. Baraniuk

    Abstract: Learning from unlabeled and noisy data is one of the grand challenges of machine learning. As such, it has seen a flurry of research with new ideas proposed continuously. In this work, we revisit a classical idea: Stein's Unbiased Risk Estimator (SURE). We show that, in the context of image recovery, SURE and its generalizations can be used to train convolutional neural networks (CNNs) for a range…

    Submitted 22 July, 2020; v1 submitted 26 May, 2018; originally announced May 2018.

  37. arXiv:1803.00212  [pdf, other]

    stat.ML cs.LG

    prDeep: Robust Phase Retrieval with a Flexible Deep Network

    Authors: Christopher A. Metzler, Philip Schniter, Ashok Veeraraghavan, Richard G. Baraniuk

    Abstract: Phase retrieval algorithms have become an important component in many modern computational imaging systems. For instance, in the context of ptychography and speckle correlation imaging, they enable imaging past the diffraction limit and through scattering media, respectively. Unfortunately, traditional phase retrieval algorithms struggle in the presence of noise. Progress has been made recently on…

    Submitted 29 June, 2018; v1 submitted 28 February, 2018; originally announced March 2018.

  38. arXiv:1711.04313  [pdf, other]

    stat.ML cs.LG

    Semi-Supervised Learning via New Deep Network Inversion

    Authors: Randall Balestriero, Vincent Roger, Herve G. Glotin, Richard G. Baraniuk

    Abstract: We exploit a recently derived inversion scheme for arbitrary deep neural networks to develop a new semi-supervised learning framework that applies to a wide range of systems and problems. The approach outperforms current state-of-the-art methods on MNIST reaching $99.14\%$ of test set accuracy while using $5$ labeled examples per class. Experiments with one-dimensional signals highlight the genera…

    Submitted 12 November, 2017; originally announced November 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1710.09302

  39. arXiv:1707.03386  [pdf, ps, other]

    stat.ML cs.LG

    DeepCodec: Adaptive Sensing and Recovery via Deep Convolutional Neural Networks

    Authors: Ali Mousavi, Gautam Dasarathy, Richard G. Baraniuk

    Abstract: In this paper we develop a novel computational sensing framework for sensing and recovering structured signals. When trained on a set of representative signals, our framework learns to take undersampled measurements and recover signals from them using a deep convolutional neural network. In other words, it learns a transformation from the original signals to a near-optimal number of undersampled m…

    Submitted 11 July, 2017; originally announced July 2017.

  40. arXiv:1704.06625  [pdf, other]

    stat.ML cs.LG

    Learned D-AMP: Principled Neural Network based Compressive Image Recovery

    Authors: Christopher A. Metzler, Ali Mousavi, Richard G. Baraniuk

    Abstract: Compressive image recovery is a challenging problem that requires fast and accurate algorithms. Recently, neural networks have been applied to this problem with promising results. By exploiting massively parallel GPU processing architectures and oodles of training data, they can run orders of magnitude faster than existing techniques. However, these methods are largely unprincipled black boxes tha…

    Submitted 6 November, 2017; v1 submitted 21 April, 2017; originally announced April 2017.

  41. arXiv:1703.08544  [pdf, other]

    stat.ML cs.CL

    Data-Mining Textual Responses to Uncover Misconception Patterns

    Authors: Joshua J. Michalenko, Andrew S. Lan, Richard G. Baraniuk

    Abstract: An important, yet largely unstudied, problem in student data analysis is to detect misconceptions from students' responses to open-response questions. Misconception detection enables instructors to deliver more targeted feedback on the misconceptions exhibited by many students in their class, thus improving the quality of instruction. In this paper, we propose a new natural language processing-bas…

    Submitted 29 March, 2017; v1 submitted 24 March, 2017; originally announced March 2017.

    Comments: 7 Pages, Submitted to EDM 2017, Workshop version accepted to L@S 2017. Article title and acronym changed to more clearly indicate the scientific goal of the paper of improving the quality of educational instruction

  42. arXiv:1701.03891  [pdf, ps, other]

    stat.ML cs.AI cs.IT cs.LG

    Learning to Invert: Signal Recovery via Deep Convolutional Networks

    Authors: Ali Mousavi, Richard G. Baraniuk

    Abstract: The promise of compressive sensing (CS) has been offset by two significant challenges. First, real-world data is not exactly sparse in a fixed basis. Second, current high-performance recovery algorithms are slow to converge, which limits CS to either non-real-time applications or scenarios where massive back-end computing is available. In this paper, we attack both of these challenges head-on by d…

    Submitted 14 January, 2017; originally announced January 2017.

    Comments: Accepted at The 42nd IEEE International Conference on Acoustics, Speech and Signal Processing

  43. arXiv:1612.01942  [pdf, other]

    stat.ML cs.LG cs.NE

    Semi-Supervised Learning with the Deep Rendering Mixture Model

    Authors: Tan Nguyen, Wanjia Liu, Ethan Perez, Richard G. Baraniuk, Ankit B. Patel

    Abstract: Semi-supervised learning algorithms reduce the high cost of acquiring labeled training data by using both labeled and unlabeled data during learning. Deep Convolutional Networks (DCNs) have achieved great success in supervised tasks and as such have been widely employed in semi-supervised learning. In this paper we leverage the recently developed Deep Rendering Mixture Model (DRMM), a probabil…

    Submitted 6 December, 2016; originally announced December 2016.

  44. arXiv:1612.01936  [pdf, other]

    stat.ML cs.LG cs.NE

    A Probabilistic Framework for Deep Learning

    Authors: Ankit B. Patel, Tan Nguyen, Richard G. Baraniuk

    Abstract: We develop a probabilistic framework for deep learning based on the Deep Rendering Mixture Model (DRMM), a new generative probabilistic model that explicitly captures variations in data due to latent task nuisance variables. We demonstrate that max-sum inference in the DRMM yields an algorithm that exactly reproduces the operations in deep convolutional neural networks (DCNs), providing a first pri…

    Submitted 6 December, 2016; originally announced December 2016.

    Comments: arXiv admin note: substantial text overlap with arXiv:1504.00641

  45. arXiv:1511.01017  [pdf, ps, other]

    math.ST cs.IT math.OC stat.ML

    Consistent Parameter Estimation for LASSO and Approximate Message Passing

    Authors: Ali Mousavi, Arian Maleki, Richard G. Baraniuk

    Abstract: We consider the problem of recovering a vector $β_o \in \mathbb{R}^p$ from $n$ random and noisy linear observations $y= Xβ_o + w$, where $X$ is the measurement matrix and $w$ is noise. The LASSO estimate is given by the solution to the optimization problem $\hatβ_λ = \arg \min_β \frac{1}{2} \|y-Xβ\|_2^2 + λ\| β\|_1$. Among the iterative algorithms that have been proposed for solving this optimizat…

    Submitted 4 November, 2015; v1 submitted 3 November, 2015; originally announced November 2015.

    Comments: arXiv admin note: text overlap with arXiv:1309.5979

  46. arXiv:1508.04073  [pdf, ps, other]

    cs.IT stat.ME

    An Information-Theoretic Measure of Dependency Among Variables in Large Datasets

    Authors: Ali Mousavi, Richard G. Baraniuk

    Abstract: The maximal information coefficient (MIC), which measures the amount of dependence between two variables, is able to detect both linear and non-linear associations. However, computational cost grows rapidly as a function of the dataset size. In this paper, we develop a computationally efficient approximation to the MIC that replaces its dynamic programming step with a much simpler technique based…

    Submitted 17 August, 2015; originally announced August 2015.

  47. A Deep Learning Approach to Structured Signal Recovery

    Authors: Ali Mousavi, Ankit B. Patel, Richard G. Baraniuk

    Abstract: In this paper, we develop a new framework for sensing and recovering structured signals. In contrast to compressive sensing (CS) systems that employ linear measurements, sparse representations, and computationally complex convex/greedy algorithms, we introduce a deep learning framework that supports both linear and mildly nonlinear measurements, that learns a structured representation from trainin…

    Submitted 17 August, 2015; originally announced August 2015.

    Journal ref: In Proceedings of the 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton)

  48. arXiv:1505.05208  [pdf, other]

    stat.ML cs.LG

    oASIS: Adaptive Column Sampling for Kernel Matrix Approximation

    Authors: Raajen Patel, Thomas A. Goldstein, Eva L. Dyer, Azalia Mirhoseini, Richard G. Baraniuk

    Abstract: Kernel matrices (e.g. Gram or similarity matrices) are essential for many state-of-the-art approaches to classification, clustering, and dimensionality reduction. For large datasets, the cost of forming and factoring such kernel matrices becomes intractable. To address this challenge, we introduce a new adaptive sampling algorithm called Accelerated Sequential Incoherence Selection (oASIS) that sa…

    Submitted 19 May, 2015; originally announced May 2015.

    ACM Class: G.1.0; G.4

  49. arXiv:1505.00824  [pdf, other]

    cs.IT cs.CV cs.LG stat.ML

    Self-Expressive Decompositions for Matrix Approximation and Clustering

    Authors: Eva L. Dyer, Tom A. Goldstein, Raajen Patel, Konrad P. Kording, Richard G. Baraniuk

    Abstract: Data-aware methods for dimensionality reduction and matrix decomposition aim to find low-dimensional structure in a collection of data. Classical approaches discover such structure by learning a basis that can efficiently express the collection. Recently, "self expression", the idea of using a small subset of data vectors to represent the full collection, has been developed as an alternative to le…

    Submitted 4 May, 2015; originally announced May 2015.

    Comments: 11 pages, 7 figures

  50. arXiv:1504.00641  [pdf, other]

    stat.ML cs.CV cs.LG cs.NE

    A Probabilistic Theory of Deep Learning

    Authors: Ankit B. Patel, Tan Nguyen, Richard G. Baraniuk

    Abstract: A grand challenge in machine learning is the development of computational algorithms that match or outperform humans in perceptual inference tasks that are complicated by nuisance variation. For instance, visual object recognition involves the unknown object position, orientation, and scale in object recognition while speech recognition involves the unknown voice pronunciation, pitch, and speed. R…

    Submitted 2 April, 2015; originally announced April 2015.

    Comments: 56 pages, 6 figures, 2 tables

    Report number: Rice University Electrical and Computer Engineering Dept. Technical Report No 2015-1