-
A Survey of Structure from Motion
Authors:
Onur Ozyesil,
Vladislav Voroninski,
Ronen Basri,
Amit Singer
Abstract:
The structure from motion (SfM) problem in computer vision is the problem of recovering the three-dimensional ($3$D) structure of a stationary scene from a set of projective measurements, represented as a collection of two-dimensional ($2$D) images, via estimation of motion of the cameras corresponding to these images. In essence, SfM involves the three main stages of (1) extraction of features in images (e.g., points of interest, lines, etc.) and matching these features between images, (2) camera motion estimation (e.g., using relative pairwise camera positions estimated from the extracted features), and (3) recovery of the $3$D structure using the estimated motion and features (e.g., by minimizing the so-called reprojection error). This survey mainly focuses on relatively recent developments in the literature pertaining to stages (2) and (3). More specifically, after touching upon the early factorization-based techniques for motion and structure estimation, we provide a detailed account of some of the recent camera location estimation methods in the literature, followed by discussion of notable techniques for $3$D structure recovery. We also cover the basics of the simultaneous localization and mapping (SLAM) problem, which can be viewed as a specific case of the SfM problem. Further, our survey includes a review of the fundamentals of feature extraction and matching (i.e., stage (1) above), various recent methods for handling ambiguities in $3$D scenes, SfM techniques involving relatively uncommon camera models and image features, and popular sources of data and SfM software.
Submitted 8 May, 2017; v1 submitted 30 January, 2017;
originally announced January 2017.
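The reprojection error mentioned in stage (3) of the abstract above can be illustrated with a minimal sketch. It assumes a simple pinhole camera with unit focal length and no lens distortion; the function names `project` and `reprojection_error` and the data layout are hypothetical choices for illustration, not taken from the survey.

```python
def project(point, rotation, translation, focal=1.0):
    """Project a 3D point into a pinhole camera (assumed model, no distortion)."""
    # Camera-frame coordinates: R * X + t
    cam = [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    if cam[2] <= 0:
        raise ValueError("point is not in front of the camera")
    return (focal * cam[0] / cam[2], focal * cam[1] / cam[2])


def reprojection_error(points, cameras, observations):
    """Sum of squared distances between observed and predicted image points.

    points:       list of 3D points (x, y, z)
    cameras:      list of (rotation, translation) pairs
    observations: dict mapping (camera_index, point_index) -> observed (u, v)
    """
    total = 0.0
    for (cam_idx, pt_idx), (u, v) in observations.items():
        rotation, translation = cameras[cam_idx]
        pu, pv = project(points[pt_idx], rotation, translation)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cameras = [(identity, [0.0, 0.0, 0.0])]
points = [(1.0, 2.0, 4.0)]
exact = reprojection_error(points, cameras, {(0, 0): (0.25, 0.5)})
noisy = reprojection_error(points, cameras, {(0, 0): (0.25, 0.6)})
```

Structure recovery methods of this kind typically minimize such a cost jointly over the points and the camera parameters; the sketch only evaluates the cost for fixed values.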
-
Synchronization over Cartan motion groups via contraction
Authors:
Onur Ozyesil,
Nir Sharon,
Amit Singer
Abstract:
Group contraction is an algebraic map that relates two classes of Lie groups by a limiting process. We utilize this notion for the compactification of the class of Cartan motion groups. The compactification process is then applied to reduce a non-compact synchronization problem to a problem where the solution can be obtained by means of a unitary, faithful representation. We describe this method of synchronization via contraction in detail and analyze several important aspects of this application. One important special case of Cartan motion groups is the group of rigid motions, also called the special Euclidean group. We thoroughly discuss the synchronization over this group and show numerically the advantages of our approach compared to some current state-of-the-art synchronization methods on both synthetic and real data.
Submitted 8 December, 2017; v1 submitted 30 November, 2016;
originally announced December 2016.
-
Robust Camera Location Estimation by Convex Programming
Authors:
Onur Ozyesil,
Amit Singer
Abstract:
$3$D structure recovery from a collection of $2$D images requires the estimation of the camera locations and orientations, i.e., the camera motion. For large, irregular collections of images, existing methods for the location estimation part, which can be formulated as the inverse problem of estimating $n$ locations $\mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_n$ in $\mathbb{R}^3$ from noisy measurements of a subset of the pairwise directions $\frac{\mathbf{t}_i - \mathbf{t}_j}{\|\mathbf{t}_i - \mathbf{t}_j\|}$, are sensitive to outliers in direction measurements. In this paper, we first provide a complete characterization of well-posed instances of the location estimation problem, by presenting its relation to the existing theory of parallel rigidity. For robust estimation of camera locations, we introduce a two-step approach, consisting of a pairwise direction estimation method robust to outliers in point correspondences between image pairs, and a convex program to maintain robustness to outlier directions. In the presence of partially corrupted measurements, we empirically demonstrate that our convex formulation can even recover the locations exactly. Lastly, we demonstrate the utility of our formulations through experiments on Internet photo collections.
Submitted 3 June, 2015; v1 submitted 29 November, 2014;
originally announced December 2014.
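As a minimal sketch of the measurement model in the abstract above, the following computes the noiseless pairwise directions $\frac{\mathbf{t}_i - \mathbf{t}_j}{\|\mathbf{t}_i - \mathbf{t}_j\|}$ for a chosen subset of pairs. The function names and the example data are hypothetical. The sketch does not implement the paper's convex program; it only shows why locations can be determined from such measurements at best up to a global scale and translation.

```python
import math


def pairwise_direction(t_i, t_j):
    """Unit vector pointing from location t_j to location t_i."""
    diff = [a - b for a, b in zip(t_i, t_j)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("coincident locations have no direction")
    return [d / norm for d in diff]


def direction_measurements(locations, pairs):
    """Noiseless direction measurements for the given subset of index pairs."""
    return {
        (i, j): pairwise_direction(locations[i], locations[j])
        for i, j in pairs
    }


locations = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
pairs = [(1, 0), (2, 0), (2, 1)]
directions = direction_measurements(locations, pairs)

# Scaling by a positive factor and translating every location leaves the
# directions unchanged, so these measurements cannot fix scale or origin.
moved = [tuple(2.0 * x + 5.0 for x in t) for t in locations]
directions_moved = direction_measurements(moved, pairs)
```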
-
Stable Camera Motion Estimation Using Convex Programming
Authors:
Onur Ozyesil,
Amit Singer,
Ronen Basri
Abstract:
We study the inverse problem of estimating $n$ locations $t_1, \ldots, t_n$ (up to global scale, translation and negation) in $\mathbb{R}^d$ from noisy measurements of a subset of the (unsigned) pairwise lines that connect them, that is, from noisy measurements of $\pm (t_i - t_j)/\|t_i - t_j\|$ for some pairs $(i,j)$ (where the signs are unknown). This problem is at the core of the structure from motion (SfM) problem in computer vision, where the $t_i$'s represent camera locations in $\mathbb{R}^3$. The noiseless version of the problem, with exact line measurements, has been considered previously under the general title of parallel rigidity theory, mainly in order to characterize the conditions for unique realization of locations. For noisy pairwise line measurements, current methods tend to produce spurious solutions that are clustered around a few locations. This sensitivity of the location estimates is a well-known problem in SfM, especially for large, irregular collections of images.
In this paper we introduce a semidefinite programming (SDP) formulation, specially tailored to overcome the clustering phenomenon. We further identify the implications of parallel rigidity theory for the location estimation problem to be well-posed, and prove exact (in the noiseless case) and stable location recovery results. We also formulate an alternating direction method to solve the resulting semidefinite program, and provide a distributed version of our formulation for large numbers of locations. Specifically for the camera location estimation problem, we formulate a pairwise line estimation method based on robust camera orientation and subspace estimation. Lastly, we demonstrate the utility of our algorithm through experiments on real images.
Submitted 15 January, 2015; v1 submitted 18 December, 2013;
originally announced December 2013.
-
On Detection With Partial Information In The Gaussian Setup
Authors:
Onur Ozyesil,
M. Kivanc Mihcak,
Yucel Altug
Abstract:
We introduce the problem of communication with partial information, where there is an asymmetry between the transmitter and the receiver codebooks. Practical applications of the proposed setup include the robust signal hashing problem within the context of multimedia security and asymmetric communications with resource-lacking receivers. We study this setup in a binary detection theoretic context for the additive colored Gaussian noise channel. In our proposed setup, the partial information available at the detector consists of dimensionality-reduced versions of the transmitter codewords, where the dimensionality reduction is achieved via a linear transform. We first derive the corresponding MAP-optimal detection rule and the corresponding conditional probability of error (conditioned on the partial information the detector possesses). Then, we constructively quantify an optimal class of linear transforms, where the cost function is the expected Chernoff bound on the conditional probability of error of the MAP-optimal detector.
Submitted 27 October, 2009;
originally announced October 2009.
-
Reliable Communications with Asymmetric Codebooks: An Information Theoretic Analysis of Robust Signal Hashing
Authors:
Yucel Altug,
M. Kivanc Mihcak,
Onur Ozyesil,
Vishal Monga
Abstract:
In this paper, a generalization of the traditional point-to-point communication setup, named "reliable communications with asymmetric codebooks", is proposed. Under the assumption of independent identically distributed (i.i.d.) encoder codewords, it is proven that the operational capacity of the system is equal to the information capacity of the system, which is given by $\max_{p(x)} I(U;Y)$, where $X, U$ and $Y$ denote the individual random elements of encoder codewords, decoder codewords and decoder inputs. The capacity result is derived in the "binary symmetric" case (which is an analogous formulation of the traditional "binary symmetric channel" for our case), as a function of the system parameters. A conceptually insightful inference is made by attributing the difference from the classical Shannon-type capacity of the binary symmetric channel to the gap due to the codebook asymmetry.
Submitted 10 September, 2008;
originally announced September 2008.
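For reference, the classical capacity of the binary symmetric channel that the abstract above compares against is $C = 1 - H_b(p)$ bits per channel use, where $p$ is the crossover probability and $H_b$ is the binary entropy function. The sketch below evaluates only this classical baseline. It does not compute the asymmetric-codebook capacity $\max_{p(x)} I(U;Y)$ from the paper, and the function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Binary entropy H_b(p) in bits, with H_b(0) = H_b(1) = 0."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bsc_capacity(crossover):
    """Classical Shannon capacity of a binary symmetric channel, in bits per use."""
    return 1.0 - binary_entropy(crossover)


noiseless = bsc_capacity(0.0)   # a noiseless channel carries one bit per use
useless = bsc_capacity(0.5)     # a fully random channel carries nothing
moderate = bsc_capacity(0.11)   # close to half a bit per use
```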