-
Hypergraph Spectral Clustering in the Weighted Stochastic Block Model
Authors:
Kwangjun Ahn,
Kangwook Lee,
Changho Suh
Abstract:
Spectral clustering is a celebrated algorithm that partitions objects based on pairwise similarity information. While this approach has been successfully applied to a variety of domains, it has a limitation: in many applications only \emph{multi}-way similarity measures are available. This motivates us to explore the multi-way measurement setting. In this work, we develop two algorithms for this setting: Hypergraph Spectral Clustering (HSC) and Hypergraph Spectral Clustering with Local Refinement (HSCLR). Our main contribution is a performance analysis of these polynomial-time algorithms under a random hypergraph model, which we name the weighted stochastic block model, in which objects and multi-way measures are modeled as nodes and hyperedge weights, respectively. Denoting by $n$ the number of nodes, our analysis reveals the following: (1) HSC outputs a partition that is better than a random guess if the sum of edge weights (to be explained later) is $\Omega(n)$; (2) HSC outputs a partition that coincides with the hidden partition except for a vanishing fraction of nodes if the sum of edge weights is $\omega(n)$; and (3) HSCLR exactly recovers the hidden partition if the sum of edge weights is on the order of $n \log n$. Our results improve upon the state of the art recently established under the model and are the first to settle the order-wise optimal results for the binary edge weight case. Moreover, we show that our results lead to efficient sketching algorithms for subspace clustering, a computer vision application. Lastly, we show that HSCLR achieves the information-theoretic limits for a special yet practically relevant model, thereby showing that there is no computational barrier in this case.
Submitted 23 May, 2018;
originally announced May 2018.
-
Community Recovery in Graphs with Locality
Authors:
Yuxin Chen,
Govinda Kamath,
Changho Suh,
David Tse
Abstract:
Motivated by applications in domains such as social networks and computational biology, we study the problem of community recovery in graphs with locality. In this problem, pairwise noisy measurements of whether two nodes are in the same community or different communities come mainly or exclusively from nearby nodes rather than being sampled uniformly from all node pairs, as in most existing models. We present an algorithm that runs in time nearly linear in the number of measurements and achieves the information-theoretic limit for exact recovery.
Submitted 1 June, 2016; v1 submitted 11 February, 2016;
originally announced February 2016.
-
Spectral MLE: Top-$K$ Rank Aggregation from Pairwise Comparisons
Authors:
Yuxin Chen,
Changho Suh
Abstract:
This paper explores the preference-based top-$K$ rank aggregation problem. Suppose that a collection of items is repeatedly compared in pairs, and one wishes to recover a consistent ordering that emphasizes the top-$K$ ranked items, based on partially revealed preferences. We focus on the Bradley-Terry-Luce (BTL) model that postulates a set of latent preference scores underlying all items, where the odds of paired comparisons depend only on the relative scores of the items involved.
We characterize the minimax limits on identifiability of top-$K$ ranked items, in the presence of random and non-adaptive sampling. Our results highlight a separation measure that quantifies the gap of preference scores between the $K^{\text{th}}$ and $(K+1)^{\text{th}}$ ranked items. The minimum sample complexity required for reliable top-$K$ ranking scales inversely with the separation measure irrespective of other preference distribution metrics. To approach this minimax limit, we propose a nearly linear-time ranking scheme, called \emph{Spectral MLE}, that returns the indices of the top-$K$ items in accordance with a careful score estimate. In a nutshell, Spectral MLE starts with an initial score estimate with minimal squared loss (obtained via a spectral method), and then successively refines each component with the assistance of coordinate-wise MLEs. Encouragingly, Spectral MLE allows perfect top-$K$ item identification under minimal sample complexity. The practical applicability of Spectral MLE is further corroborated by numerical experiments.
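As a rough illustration of the BTL model named in this abstract, the sketch below uses the standard BTL form, in which item $i$ beats item $j$ with probability $w_i/(w_i+w_j)$ for latent scores $w$. The function names, the choice to compare every pair a fixed number of times, and the win-count ranking are all assumptions made for illustration; the win-count baseline is a naive stand-in and is not the Spectral MLE procedure.

```python
import random


def btl_win_probability(score_i, score_j):
    """Probability that item i beats item j under the standard BTL form."""
    return score_i / (score_i + score_j)


def simulate_comparisons(scores, num_rounds, seed=0):
    """Compare every pair num_rounds times (a simplification of random sampling).

    Returns wins, where wins[i][j] counts how often item i beat item j.
    """
    rng = random.Random(seed)
    n = len(scores)
    wins = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = btl_win_probability(scores[i], scores[j])
            for _ in range(num_rounds):
                if rng.random() < p:
                    wins[i][j] += 1
                else:
                    wins[j][i] += 1
    return wins


def top_k_by_win_count(wins, k):
    """Naive baseline: rank items by total wins (NOT Spectral MLE)."""
    totals = [sum(row) for row in wins]
    order = sorted(range(len(totals)), key=lambda i: -totals[i])
    return order[:k]
```

Calling `simulate_comparisons` with hypothetical scores and passing the result to `top_k_by_win_count` gives a simple reference point against which a more careful score estimate could be compared.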
Submitted 28 May, 2015; v1 submitted 27 April, 2015;
originally announced April 2015.
-
Information Recovery from Pairwise Measurements
Authors:
Yuxin Chen,
Changho Suh,
Andrea J. Goldsmith
Abstract:
This paper is concerned with jointly recovering $n$ node-variables $\left\{ x_{i}\right\}_{1\leq i\leq n}$ from a collection of pairwise difference measurements. Imagine we acquire a few observations taking the form of $x_{i}-x_{j}$; the observation pattern is represented by a measurement graph $\mathcal{G}$ with an edge set $\mathcal{E}$ such that $x_{i}-x_{j}$ is observed if and only if $(i,j)\in\mathcal{E}$. To account for noisy measurements in a general manner, we model the data acquisition process by a set of channels with given input/output transition measures. Employing information-theoretic tools applied to channel decoding problems, we develop a \emph{unified} framework to characterize the fundamental recovery criterion, which accommodates general graph structures, alphabet sizes, and channel transition measures. In particular, our results isolate a family of \emph{minimum} \emph{channel divergence measures} to characterize the degree of measurement corruption, which together with the size of the minimum cut of $\mathcal{G}$ dictates the feasibility of exact information recovery. For various homogeneous graphs, the recovery condition depends almost only on the edge sparsity of the measurement graph irrespective of other graphical metrics; alternatively, the minimum sample complexity required for these graphs scales like \[ \text{minimum sample complexity }\asymp\frac{n\log n}{\mathsf{Hel}_{1/2}^{\min}} \] for a certain information metric $\mathsf{Hel}_{1/2}^{\min}$ defined in the main text, as long as the alphabet size is not super-polynomial in $n$. We apply our general theory to three concrete applications: the stochastic block model, the outlier model, and the haplotype assembly problem. Our theory leads to order-wise tight recovery conditions for all these scenarios.
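To make the measurement model concrete, here is a minimal sketch of the noiseless case only, assuming integer-valued variables. If every observed value equals $x_{i}-x_{j}$ exactly and the measurement graph is connected, the variables can be recovered up to a global offset by propagating differences outward from one node. The function name and the convention of fixing $x_0 = 0$ are hypothetical; this is not the paper's method, which concerns noisy channels.

```python
from collections import deque


def recover_from_differences(n, measurements):
    """Recover x up to a global offset from exact differences.

    measurements maps an edge (i, j) to the observed value x_i - x_j.
    Returns a list x with x[0] == 0, or None if the graph is disconnected.
    """
    adj = {i: [] for i in range(n)}
    for (i, j), d in measurements.items():
        adj[i].append((j, -d))  # x_j = x_i - d
        adj[j].append((i, d))   # x_i = x_j + d

    x = [None] * n
    x[0] = 0  # fix the unobservable global offset
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v, delta in adj[u]:
            if x[v] is None:
                x[v] = x[u] + delta
                queue.append(v)

    # Any node left unset is unreachable, so recovery is impossible.
    return None if any(value is None for value in x) else x
```

A disconnected graph leaves at least one relative offset undetermined, which is the noiseless counterpart of the abstract's point that graph connectivity (via the minimum cut) governs whether exact recovery is feasible.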
Submitted 5 May, 2016; v1 submitted 6 April, 2015;
originally announced April 2015.
-
Boundary-twisted normal form and the number of elementary moves to unknot
Authors:
Chan-Ho Suh
Abstract:
Suppose $K$ is an unknot lying in the 1-skeleton of a triangulated 3-manifold with $t$ tetrahedra. Hass and Lagarias showed there is an upper bound, depending only on $t$, for the minimal number of elementary moves to untangle $K$. We give a simpler proof, utilizing a normal form for surfaces whose boundary is contained in the 1-skeleton of a triangulated 3-manifold. We also obtain a significantly better upper bound of $2^{120t+14}$ and improve the Hass--Lagarias upper bound on the number of Reidemeister moves needed to unknot to $2^{10^5 n}$, where $n$ is the crossing number.
Submitted 20 October, 2010;
originally announced October 2010.
-
The Unknotting Problem and Normal Surface Q-Theory
Authors:
Chan-Ho Suh
Abstract:
Tollefson described a variant of normal surface theory for 3-manifolds, called Q-theory, where only the quadrilateral coordinates are used. Suppose $M$ is a triangulated, compact, irreducible, boundary-irreducible 3-manifold. In Q-theory, if $M$ contains an essential surface, then the projective solution space has an essential surface at a vertex. One interesting situation not covered by this theorem is when $M$ is boundary reducible, e.g. $M$ is an unknot complement. We prove that in this case $M$ has an essential disc at a vertex of the Q-projective solution space.
Submitted 8 September, 2010;
originally announced September 2010.
-
Normal Surface Theory in Link Diagrams
Authors:
Chan-Ho Suh
Abstract:
This paper has been withdrawn by the author, due to a significant error in section 4.3.1.
Submitted 26 January, 2009; v1 submitted 16 August, 2007;
originally announced August 2007.