-
Usable Privacy in Virtual Worlds: Design Implications for Data Collection Awareness and Control Interfaces in Virtual Reality
Authors:
Viktorija Paneva,
Verena Winterhalter,
Naga Sai Surya Vamsy Malladi,
Marvin Strauss,
Stefan Schneegass,
Florian Alt
Abstract:
Extended reality (XR) devices have become ubiquitous. They are equipped with arrays of sensors, collecting extensive user and environmental data, allowing inferences about sensitive user information users may not realize they are sharing. Current VR privacy notices largely replicate mechanisms from 2D interfaces, failing to leverage the unique affordances of virtual 3D environments. To address this, we conducted brainstorming and sketching sessions with novice game developers and designers, followed by privacy expert evaluations, to explore and refine privacy interfaces tailored for VR. Key challenges include balancing user engagement with privacy awareness, managing complex privacy information with user comprehension, and maintaining compliance and trust. We identify design implications such as thoughtful gamification, explicit and purpose-tied consent mechanisms, and granular, modifiable privacy control options. Our findings provide actionable guidance to researchers and practitioners for developing privacy-aware and user-friendly VR experiences.
Submitted 13 March, 2025;
originally announced March 2025.
-
FlowMAC: Conditional Flow Matching for Audio Coding at Low Bit Rates
Authors:
Nicola Pia,
Martin Strauss,
Markus Multrus,
Bernd Edler
Abstract:
This paper introduces FlowMAC, a novel neural audio codec for high-quality general audio compression at low bit rates based on conditional flow matching (CFM). FlowMAC jointly learns a mel spectrogram encoder, quantizer and decoder. At inference time the decoder integrates a continuous normalizing flow via an ODE solver to generate a high-quality mel spectrogram. This is the first time that a CFM-based approach is applied to general audio coding, enabling scalable, simple and memory-efficient training. Our subjective evaluations show that FlowMAC at 3 kbps achieves quality similar to that of state-of-the-art GAN-based and DDPM-based neural audio codecs at double the bit rate. Moreover, FlowMAC offers a tunable inference pipeline, which makes it possible to trade off complexity and quality. This enables real-time coding on CPU, while maintaining high perceptual quality.
Submitted 6 April, 2025; v1 submitted 26 September, 2024;
originally announced September 2024.
-
Designing and Evaluating Scalable Privacy Awareness and Control User Interfaces for Mixed Reality
Authors:
Marvin Strauss,
Viktorija Paneva,
Florian Alt,
Stefan Schneegass
Abstract:
As Mixed Reality (MR) devices become increasingly popular across industries, they raise significant privacy and ethical concerns due to their capacity to collect extensive data on users and their environments. This paper highlights the urgent need for privacy-aware user interfaces that educate and empower both users and bystanders, enabling them to understand, control, and manage data collection and sharing. Key research questions include improving user awareness of privacy implications, developing usable privacy controls, and evaluating the effectiveness of these measures in real-world settings. The proposed research roadmap aims to embed privacy considerations into the design and development of MR technologies, promoting responsible innovation that safeguards user privacy while preserving the functionality and appeal of these emerging technologies.
Submitted 1 September, 2024;
originally announced September 2024.
-
Efficient Area-based and Speaker-Agnostic Source Separation
Authors:
Martin Strauss,
Okan Köpüklü
Abstract:
This paper introduces an area-based source separation method designed for virtual meeting scenarios. The aim is to preserve speech signals from an unspecified number of sources within a defined spatial area in front of a linear microphone array, while suppressing all other sounds. Therefore, we employ an efficient neural network architecture adapted for multi-channel input to encompass the predefined target area. To evaluate the approach, training data and specific test scenarios including multiple target and interfering speakers, as well as background noise are simulated. All models are rated according to DNSMOS and scale-invariant signal-to-distortion ratio. Our experiments show that the proposed method separates speech from multiple speakers within the target area well, besides being of very low complexity, intended for real-time processing. In addition, a power reduction heatmap is used to demonstrate the networks' ability to identify sources located within the target area. We put our approach in context with a well-established baseline for speaker-speaker separation and discuss its strengths and challenges.
Submitted 19 August, 2024;
originally announced August 2024.
-
Predicting Preferred Dialogue-to-Background Loudness Difference in Dialogue-Separated Audio
Authors:
Luca Resti,
Martin Strauss,
Matteo Torcoli,
Emanuël Habets,
Bernd Edler
Abstract:
Dialogue Enhancement (DE) enables the rebalancing of dialogue and background sounds to fit personal preferences and needs in the context of broadcast audio. When individual audio stems are unavailable from production, Dialogue Separation (DS) can be applied to the final audio mixture to obtain estimates of these stems. This work focuses on Preferred Loudness Differences (PLDs) between dialogue and background sounds. While previous studies determined the PLD through a listening test employing original stems from production, stems estimated by DS are used in the present study. In addition, a larger variety of signal classes is considered. PLDs vary substantially across individuals (average interquartile range: 5.7 LU). Despite this variability, PLDs are found to be highly dependent on the signal type under consideration, and it is shown that median PLDs can be predicted using objective intelligibility metrics. Two existing baseline prediction methods - intended for use with original stems - displayed a Mean Absolute Error (MAE) of 7.5 LU and 5 LU, respectively. A modified baseline (MAE: 3.2 LU) and an alternative approach (MAE: 2.5 LU) are proposed. Results support the viability of processing final broadcast mixtures with DS and offering an alternative remixing that accounts for median PLDs.
Submitted 31 May, 2023; v1 submitted 30 May, 2023;
originally announced May 2023.
-
Slow Down, Move Over: A Case Study in Formal Verification, Refinement, and Testing of the Responsibility-Sensitive Safety Model for Self-Driving Cars
Authors:
Megan Strauss,
Stefan Mitsch
Abstract:
Technology advances give us the hope of driving without human error, reducing vehicle emissions and simplifying an everyday task with the future of self-driving cars. Making sure these vehicles are safe is very important to the continuation of this field. In this paper, we formalize the Responsibility-Sensitive Safety model (RSS) for self-driving cars and prove the safety and optimality of this model in the longitudinal direction. We utilize the hybrid systems theorem prover KeYmaera X to formalize RSS as a hybrid system with its nondeterministic control choices and continuous motion model, and prove absence of collisions. We then illustrate the practicality of RSS through refinement proofs that turn the verified nondeterministic control envelopes into deterministic ones and further verified compilation to Python. The refinement and compilation are safety-preserving; as a result, safety proofs of the formal model transfer to the compiled code, while counterexamples discovered in testing the code of an unverified model transfer back. The resulting Python code makes it possible to test the behavior of cars following the motion model of RSS in simulation, to measure agreement between the model and simulation with monitors that are derived from the formal model, and to report counterexamples from simulation back to the formal model.
Submitted 15 May, 2023;
originally announced May 2023.
-
Improved Normalizing Flow-Based Speech Enhancement using an All-pole Gammatone Filterbank for Conditional Input Representation
Authors:
Martin Strauss,
Matteo Torcoli,
Bernd Edler
Abstract:
Deep generative models for Speech Enhancement (SE) have received increasing attention in recent years. The most prominent examples are Generative Adversarial Networks (GANs), while normalizing flows (NF) have received less attention despite their potential. Building on previous work, architectural modifications are proposed, along with an investigation of different conditional input representations. Despite being a common choice in related works, Mel-spectrograms prove to be inadequate for the given scenario. Alternatively, a novel All-Pole Gammatone filterbank (APG) with high temporal resolution is proposed. Although computational evaluation metric results would suggest that state-of-the-art GAN-based methods perform best, a perceptual evaluation via a listening test indicates that the presented NF approach (based on time domain and APG) performs best, especially at lower SNRs. On average, APG outputs are rated as having good quality, which is unmatched by the other methods, including GAN.
Submitted 20 October, 2022;
originally announced October 2022.
-
Automated Learning of Interpretable Models with Quantified Uncertainty
Authors:
G. F. Bomarito,
P. E. Leser,
N. C. M Strauss,
K. M. Garbrecht,
J. D. Hochhalter
Abstract:
Interpretability and uncertainty quantification in machine learning can provide justification for decisions, promote scientific discovery and lead to a better understanding of model behavior. Symbolic regression provides inherently interpretable machine learning, but relatively little work has focused on the use of symbolic regression on noisy data and the accompanying necessity to quantify uncertainty. A new Bayesian framework for genetic-programming-based symbolic regression (GPSR) is introduced that uses model evidence (i.e., marginal likelihood) to formulate replacement probability during the selection phase of evolution. Model parameter uncertainty is automatically quantified, enabling probabilistic predictions with each equation produced by the GPSR algorithm. Model evidence is also quantified in this process, and its use is shown to increase interpretability, improve robustness to noise, and reduce overfitting when compared to a conventional GPSR implementation on both numerical and physical experiments.
Submitted 12 April, 2022;
originally announced May 2022.
-
A Hands-on Comparison of DNNs for Dialog Separation Using Transfer Learning from Music Source Separation
Authors:
Martin Strauss,
Jouni Paulus,
Matteo Torcoli,
Bernd Edler
Abstract:
This paper describes a hands-on comparison on using state-of-the-art music source separation deep neural networks (DNNs) before and after task-specific fine-tuning for separating speech content from non-speech content in broadcast audio (i.e., dialog separation). The music separation models are selected as they share the number of channels (2) and sampling rate (44.1 kHz or higher) with the considered broadcast content, and vocals separation in music is considered as a parallel for dialog separation in the target application domain. These similarities are assumed to enable transfer learning between the tasks. Three models pre-trained on music (Open-Unmix, Spleeter, and Conv-TasNet) are considered in the experiments, and fine-tuned with real broadcast data. The performance of the models is evaluated before and after fine-tuning with computational evaluation metrics (SI-SIRi, SI-SDRi, 2f-model), as well as with a listening test simulating an application where the non-speech signal is partially attenuated, e.g., for better speech intelligibility. The evaluations include two reference systems specifically developed for dialog separation. The results indicate that pre-trained music source separation models can be used for dialog separation to some degree, and that they benefit from the fine-tuning, reaching a performance close to task-specific solutions.
Submitted 22 June, 2021; v1 submitted 16 June, 2021;
originally announced June 2021.
-
A Flow-Based Neural Network for Time Domain Speech Enhancement
Authors:
Martin Strauss,
Bernd Edler
Abstract:
Speech enhancement involves the distinction of a target speech signal from an intrusive background. Although generative approaches using Variational Autoencoders or Generative Adversarial Networks (GANs) have increasingly been used in recent years, normalizing flow (NF) based systems are still scarce, despite their success in related fields. Thus, in this paper we propose an NF framework to directly model the enhancement process by density estimation of clean speech utterances conditioned on their noisy counterpart. The WaveGlow model from speech synthesis is adapted to enable direct enhancement of noisy utterances in time domain. In addition, we demonstrate that nonlinear input companding benefits the model performance by equalizing the distribution of input samples. Experimental evaluation on a publicly available dataset shows comparable results to current state-of-the-art GAN-based approaches, while surpassing the chosen baselines using objective evaluation metrics.
Submitted 16 June, 2021;
originally announced June 2021.
-
Audio-Based Search and Rescue with a Drone: Highlights from the IEEE Signal Processing Cup 2019 Student Competition
Authors:
Antoine Deleforge,
Diego Di Carlo,
Martin Strauss,
Romain Serizel,
Lucio Marcenaro
Abstract:
Unmanned aerial vehicles (UAV), commonly referred to as drones, have raised increasing interest in recent years. Search and rescue scenarios where humans in emergency situations need to be quickly found in areas difficult to access constitute an important field of application for this technology. While research efforts have mostly focused on developing video-based solutions for this task \cite{lopez2017cvemergency}, UAV-embedded audio-based localization has received relatively less attention. However, UAVs equipped with a microphone array could be of critical help to localize people in emergency situations, in particular when video sensors are limited by a lack of visual feedback due to bad lighting conditions or obstacles limiting the field of view. This motivated the topic of the 6th edition of the IEEE Signal Processing Cup (SP Cup): a UAV-embedded sound source localization challenge for search and rescue. In this article, we share an overview of the IEEE SP Cup experience including the competition tasks, participating teams, technical approaches and statistics.
Submitted 3 July, 2019;
originally announced July 2019.
-
Fair Pipelines
Authors:
Amanda Bower,
Sarah N. Kitchen,
Laura Niss,
Martin J. Strauss,
Alexander Vargas,
Suresh Venkatasubramanian
Abstract:
This work facilitates ensuring fairness of machine learning in the real world by decoupling fairness considerations in compound decisions. In particular, this work studies how fairness propagates through a compound decision-making process, which we call a pipeline. Prior work in algorithmic fairness only focuses on fairness with respect to one decision. However, many decision-making processes require more than one decision. For instance, hiring is at least a two-stage model: deciding who to interview from the applicant pool and then deciding who to hire from the interview pool. Perhaps surprisingly, we show that the composition of fair components may not guarantee a fair pipeline under a $(1+\varepsilon)$-equal opportunity definition of fair. However, we identify circumstances that do provide that guarantee. We also propose numerous directions for future work on more general compound machine learning decisions.
Submitted 2 July, 2017;
originally announced July 2017.
-
For-all Sparse Recovery in Near-Optimal Time
Authors:
Anna C. Gilbert,
Yi Li,
Ely Porat,
Martin J. Strauss
Abstract:
An approximate sparse recovery system in $\ell_1$ norm consists of parameters $k$, $ε$, $N$, an $m$-by-$N$ measurement matrix $Φ$, and a recovery algorithm, $\mathcal{R}$. Given a vector, $\mathbf{x}$, the system approximates $\mathbf{x}$ by $\widehat{\mathbf{x}} = \mathcal{R}(Φ\mathbf{x})$, which must satisfy $\|\widehat{\mathbf{x}}-\mathbf{x}\|_1 \leq (1+ε)\|\mathbf{x}-\mathbf{x}_k\|_1$. We consider the 'for all' model, in which a single matrix $Φ$, possibly 'constructed' non-explicitly using the probabilistic method, is used for all signals $\mathbf{x}$. The best existing sublinear algorithm by Porat and Strauss (SODA'12) uses $O(ε^{-3} k\log(N/k))$ measurements and runs in time $O(k^{1-α}N^α)$ for any constant $α> 0$.
In this paper, we improve the number of measurements to $O(ε^{-2} k \log(N/k))$, matching the best existing upper bound (attained by super-linear algorithms), and the runtime to $O(k^{1+β}\textrm{poly}(\log N,1/ε))$, with a modest restriction that $ε\leq (\log k/\log N)^γ$, for any constants $β,γ> 0$. When $k\leq \log^c N$ for some $c>0$, the runtime is reduced to $O(k\textrm{poly}(N,1/ε))$. With no restrictions on $ε$, we have an approximation recovery system with $m = O(k/ε\log(N/k)((\log N/\log k)^γ+ 1/ε))$ measurements.
Submitted 7 March, 2017; v1 submitted 7 February, 2014;
originally announced February 2014.
-
L2/L2-foreach sparse recovery with low risk
Authors:
Anna C. Gilbert,
Hung Q. Ngo,
Ely Porat,
Atri Rudra,
Martin J. Strauss
Abstract:
In this paper, we consider the "foreach" sparse recovery problem with failure probability $p$. The goal is to design a distribution over $m \times N$ matrices $Φ$ and a decoding algorithm $\algo$ such that for every $\vx\in\R^N$, we have the following error guarantee with probability at least $1-p$ \[\|\vx-\algo(Φ\vx)\|_2\le C\|\vx-\vx_k\|_2,\] where $C$ is a constant (ideally arbitrarily close to 1) and $\vx_k$ is the best $k$-sparse approximation of $\vx$.
Much of the sparse recovery or compressive sensing literature has focused on the case of either $p = 0$ or $p = Ω(1)$. We initiate the study of this problem for the entire range of failure probability. Our two main results are as follows: \begin{enumerate} \item We prove a lower bound on $m$, the number of measurements, of $Ω(k\log(n/k)+\log(1/p))$ for $2^{-Θ(N)}\le p <1$. Cohen, Dahmen, and DeVore \cite{CDD2007:NearOptimall2l2} prove that this bound is tight. \item We prove nearly matching upper bounds for \textit{sub-linear} time decoding. Previous such results addressed only $p = Ω(1)$. \end{enumerate}
Our results and techniques lead to the following corollaries: (i) the first ever sub-linear time decoding $\lolo$ "forall" sparse recovery system that requires a $\log^γ{N}$ extra factor (for some $γ<1$) over the optimal $O(k\log(N/k))$ number of measurements, and (ii) extensions of Gilbert et al. \cite{GHRSW12:SimpleSignals} results for information-theoretically bounded adversaries.
Submitted 23 April, 2013;
originally announced April 2013.
-
Sublinear Time, Measurement-Optimal, Sparse Recovery For All
Authors:
Ely Porat,
Martin J. Strauss
Abstract:
An approximate sparse recovery system in ell_1 norm formally consists of parameters N, k, epsilon, an m-by-N measurement matrix, Phi, and a decoding algorithm, D. Given a vector, x, where x_k denotes the optimal k-term approximation to x, the system approximates x by hat_x = D(Phi.x), which must satisfy
||hat_x - x||_1 <= (1+epsilon)||x - x_k||_1.
Among the goals in designing such systems are minimizing m and the runtime of D. We consider the "forall" model, in which a single matrix Phi is used for all signals x.
All previous algorithms that use the optimal number m=O(k log(N/k)) of measurements require superlinear time Omega(N log(N/k)). In this paper, we give the first algorithm for this problem that uses the optimum number of measurements (up to a constant factor) and runs in sublinear time o(N) when k=o(N), assuming access to a data structure requiring space and preprocessing O(N).
Submitted 14 July, 2011; v1 submitted 8 December, 2010;
originally announced December 2010.
-
Approximate Sparse Recovery: Optimizing Time and Measurements
Authors:
Anna C. Gilbert,
Yi Li,
Ely Porat,
Martin J. Strauss
Abstract:
An approximate sparse recovery system consists of parameters $k,N$, an $m$-by-$N$ measurement matrix, $Φ$, and a decoding algorithm, $\mathcal{D}$. Given a vector, $x$, the system approximates $x$ by $\widehat x =\mathcal{D}(Φx)$, which must satisfy $\| \widehat x - x\|_2\le C \|x - x_k\|_2$, where $x_k$ denotes the optimal $k$-term approximation to $x$. For each vector $x$, the system must succeed with probability at least 3/4. Among the goals in designing such systems are minimizing the number $m$ of measurements and the runtime of the decoding algorithm, $\mathcal{D}$.
In this paper, we give a system with $m=O(k \log(N/k))$ measurements--matching a lower bound, up to a constant factor--and decoding time $O(k\log^c N)$, matching a lower bound up to $\log(N)$ factors.
We also consider the encode time (i.e., the time to multiply $Φ$ by $x$), the time to update measurements (i.e., the time to multiply $Φ$ by a 1-sparse $x$), and the robustness and stability of the algorithm (adding noise before and after the measurements). Our encode and update times are optimal up to $\log(N)$ factors.
Submitted 1 December, 2009;
originally announced December 2009.
-
Combining geometry and combinatorics: A unified approach to sparse signal recovery
Authors:
R. Berinde,
A. C. Gilbert,
P. Indyk,
H. Karloff,
M. J. Strauss
Abstract:
There are two main algorithmic approaches to sparse signal recovery: geometric and combinatorial. The geometric approach starts with a geometric constraint on the measurement matrix and then uses linear programming to decode information about the signal from its measurements. The combinatorial approach constructs the measurement matrix and a combinatorial decoding algorithm to match. We present a unified approach to these two classes of sparse signal recovery algorithms.
The unifying elements are the adjacency matrices of high-quality unbalanced expanders. We generalize the notion of Restricted Isometry Property (RIP), crucial to compressed sensing results for signal recovery, from the Euclidean norm to the l_p norm for p about 1, and then show that unbalanced expanders are essentially equivalent to RIP-p matrices.
From known deterministic constructions for such matrices, we obtain new deterministic measurement matrix constructions and algorithms for signal recovery which, compared to previous deterministic algorithms, are superior in either the number of measurements or in noise tolerance.
Submitted 29 April, 2008;
originally announced April 2008.
-
Private Approximate Heavy Hitters
Authors:
Martin J. Strauss,
Xuan Zheng
Abstract:
We consider the problem of private computation of approximate Heavy Hitters. Alice and Bob each hold a vector and, in the vector sum, they want to find the B largest values along with their indices. While the exact problem requires linear communication, protocols in the literature solve this problem approximately using polynomial computation time, polylogarithmic communication, and constantly many rounds. We show how to solve the problem privately with comparable cost, in the sense that nothing is learned by Alice and Bob beyond what is implied by their input, the ideal top-B output, and goodness of approximation (equivalently, the Euclidean norm of the vector sum). We give lower bounds showing that the Euclidean norm must leak by any efficient algorithm.
Submitted 29 September, 2006;
originally announced September 2006.
-
Algorithmic linear dimension reduction in the l_1 norm for sparse vectors
Authors:
A. C. Gilbert,
M. J. Strauss,
J. A. Tropp,
R. Vershynin
Abstract:
This paper develops a new method for recovering m-sparse signals that is simultaneously uniform and quick. We present a reconstruction algorithm whose run time, O(m log^2(m) log^2(d)), is sublinear in the length d of the signal. The reconstruction error is within a logarithmic factor (in m) of the optimal m-term approximation error in l_1. In particular, the algorithm recovers m-sparse signals perfectly and noisy signals are recovered with polylogarithmic distortion. Our algorithm makes O(m log^2 (d)) measurements, which is within a logarithmic factor of optimal. We also present a small-space implementation of the algorithm. These sketching techniques and the corresponding reconstruction algorithms provide an algorithmic dimension reduction in the l_1 norm. In particular, vectors of support m in dimension d can be linearly embedded into O(m log^2 d) dimensions with polylogarithmic distortion. We can reconstruct a vector from its low-dimensional sketch in time O(m log^2(m) log^2(d)). Furthermore, this reconstruction is stable and robust under small perturbations.
Submitted 18 August, 2006;
originally announced August 2006.
-
List decoding of noisy Reed-Muller-like codes
Authors:
A. R. Calderbank,
Anna C. Gilbert,
Martin J. Strauss
Abstract:
First- and second-order Reed-Muller (RM(1) and RM(2), respectively) codes are two fundamental error-correcting codes which arise in communication as well as in probabilistically-checkable proofs and learning. In this paper, we take the first steps toward extending the quick randomized decoding tools of RM(1) into the realm of quadratic binary and, equivalently, Z_4 codes. Our main algorithmic result is an extension of the RM(1) techniques from Goldreich-Levin and Kushilevitz-Mansour algorithms to the Hankel code, a code between RM(1) and RM(2). That is, given signal s of length N, we find a list that is a superset of all Hankel codewords phi with dot product to s at least (1/sqrt(k)) times the norm of s, in time polynomial in k and log(N). We also give a new and simple formulation of a known Kerdock code as a subcode of the Hankel code. As a corollary, we can list-decode Kerdock, too. Also, we get a quick algorithm for finding a sparse Kerdock approximation. That is, for k small compared with 1/sqrt{N} and for epsilon > 0, we find, in time polynomial in (k log(N)/epsilon), a k-Kerdock-term approximation s~ to s with Euclidean error at most the factor (1+epsilon+O(k^2/sqrt{N})) times that of the best such approximation.
Submitted 2 August, 2006; v1 submitted 20 July, 2006;
originally announced July 2006.