-
Coupled Compound Poisson Factorization
Authors:
Mehmet E. Basbug,
Barbara E. Engelhardt
Abstract:
We present a general framework, the coupled compound Poisson factorization (CCPF), to capture the missing-data mechanism in extremely sparse data sets by coupling a hierarchical Poisson factorization with an arbitrary data-generating model. We derive a stochastic variational inference algorithm for the resulting model and, as examples of our framework, implement three different data-generating models (a mixture model, linear regression, and factor analysis) to robustly model non-random missing data in the context of clustering, prediction, and matrix factorization. In all three cases, we test our framework against models that ignore the missing-data mechanism on large-scale studies with non-random missing data, and we show that explicitly modeling the missing-data mechanism substantially improves the quality of the results, as measured by data log likelihood on a held-out test set.
Submitted 8 January, 2017;
originally announced January 2017.
-
Dynamic Collaborative Filtering with Compound Poisson Factorization
Authors:
Ghassen Jerfel,
Mehmet E. Basbug,
Barbara E. Engelhardt
Abstract:
Model-based collaborative filtering analyzes user-item interactions to infer latent factors that represent user preferences and item characteristics in order to predict future interactions. Most collaborative filtering algorithms assume that these latent factors are static, although it has been shown that user preferences and item perceptions drift over time. In this paper, we propose a conjugate and numerically stable dynamic matrix factorization (DCPF) based on compound Poisson matrix factorization that models the smoothly drifting latent factors using Gamma-Markov chains. We propose a numerically stable Gamma chain construction, and then present a stochastic variational inference approach to estimate the parameters of our model. We apply our model to time-stamped ratings data sets: Netflix, Yelp, and Last.fm, where DCPF achieves a higher predictive accuracy than state-of-the-art static and dynamic factorization models.
Submitted 1 November, 2016; v1 submitted 16 August, 2016;
originally announced August 2016.
-
Hierarchical Compound Poisson Factorization
Authors:
Mehmet E. Basbug,
Barbara E. Engelhardt
Abstract:
Non-negative matrix factorization models based on a hierarchical Gamma-Poisson structure capture user and item behavior effectively in extremely sparse data sets, making them the ideal choice for collaborative filtering applications. Hierarchical Poisson factorization (HPF) in particular has proved successful for scalable recommendation systems with extreme sparsity. HPF, however, suffers from a tight coupling of sparsity model (absence of a rating) and response model (the value of the rating), which limits the expressiveness of the latter. Here, we introduce hierarchical compound Poisson factorization (HCPF) that has the favorable Gamma-Poisson structure and scalability of HPF to high-dimensional extremely sparse matrices. More importantly, HCPF decouples the sparsity model from the response model, allowing us to choose the most suitable distribution for the response. HCPF can capture binary, non-negative discrete, non-negative continuous, and zero-inflated continuous responses. We compare HCPF with HPF on nine discrete and three continuous data sets and conclude that HCPF captures the relationship between sparsity and response better than HPF.
Submitted 26 May, 2016; v1 submitted 13 April, 2016;
originally announced April 2016.
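
The decoupling described in the HCPF abstract above can be illustrated with a small generative sketch. This is a simplified reading of the abstract, not the authors' specification: a latent Poisson count decides whether an entry is observed (the sparsity model), and the observed value is the sum of that many draws from an element distribution (the response model). The function name, the flat Gamma priors, the normal element distribution, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def sample_compound_poisson_matrix(n_users=50, n_items=40, n_factors=5,
                                   shape=0.3, rate=1.0,
                                   element_mean=3.0, element_std=0.5, seed=0):
    """Draw a sparse matrix from a toy compound Poisson factorization.

    Assumed simplification: flat Gamma priors on the factors (no
    hierarchy) and a normal element distribution for the response.
    """
    rng = np.random.default_rng(seed)
    # Non-negative latent factors with Gamma priors (NumPy uses scale = 1/rate).
    theta = rng.gamma(shape, 1.0 / rate, size=(n_users, n_factors))
    beta = rng.gamma(shape, 1.0 / rate, size=(n_items, n_factors))
    # Sparsity model: a latent Poisson count per entry; zero means "missing".
    counts = rng.poisson(theta @ beta.T)
    # Response model: the sum of `counts` i.i.d. normal draws, which is
    # N(counts * mean, counts * std^2); it is exactly 0 where counts == 0.
    noise = rng.standard_normal(counts.shape)
    response = element_mean * counts + element_std * np.sqrt(counts) * noise
    return counts, response

counts, response = sample_compound_poisson_matrix()
print("fraction of missing entries:", np.mean(counts == 0))
```

Swapping the normal element distribution for another additive family (for example a gamma or a Poisson) changes the response type without touching the sparsity model, which is the flexibility the abstract attributes to HCPF.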
-
AdaCluster: Adaptive Clustering for Heterogeneous Data
Authors:
Mehmet Emin Basbug,
Barbara Engelhardt
Abstract:
Clustering algorithms start with a fixed divergence, which captures the possibly asymmetric distance between a sample and a centroid. In the mixture model setting, the sample distribution plays the same role. When all attributes have the same topology and dispersion, the data are said to be homogeneous. If the prior knowledge of the distribution is inaccurate or the set of plausible distributions is large, an adaptive approach is essential. The motivation is more compelling for heterogeneous data, where the dispersion or the topology differs among attributes. We propose an adaptive approach to clustering using classes of parametrized Bregman divergences. We first show that the density of a steep exponential dispersion model (EDM) can be represented with a Bregman divergence. We then propose AdaCluster, an expectation-maximization (EM) algorithm to cluster heterogeneous data using classes of steep EDMs. We compare AdaCluster with EM for a Gaussian mixture model on synthetic data and nine UCI data sets. We also propose an adaptive hard clustering algorithm based on Generalized Method of Moments. We compare the hard clustering algorithm with k-means on the UCI data sets. We empirically verified that adaptively learning the underlying topology yields better clustering of heterogeneous data.
Submitted 6 January, 2017; v1 submitted 19 October, 2015;
originally announced October 2015.
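
For context on the AdaCluster abstract above, the following is a minimal sketch of hard clustering with a fixed Bregman divergence, the non-adaptive starting point the abstract describes. It is not AdaCluster itself, which learns the divergence per attribute. The function names and the two example divergences are illustrative assumptions. The centroid update uses the standard result that the arithmetic mean minimizes the total Bregman divergence from the samples to a centroid.

```python
import numpy as np

def squared_euclidean(x, mu):
    # Bregman divergence generated by phi(x) = ||x||^2.
    return np.sum((x - mu) ** 2, axis=-1)

def generalized_i_divergence(x, mu):
    # Bregman divergence generated by phi(x) = sum(x log x - x);
    # requires strictly positive inputs.
    return np.sum(x * np.log(x / mu) - x + mu, axis=-1)

def bregman_hard_clustering(data, n_clusters, divergence, n_iters=50, seed=0):
    """k-means-style clustering with a fixed, user-chosen divergence."""
    rng = np.random.default_rng(seed)
    # Initialize centroids at distinct randomly chosen samples.
    centroids = data[rng.choice(len(data), size=n_clusters, replace=False)]
    for _ in range(n_iters):
        # Assignment step: divergence from each sample to each centroid.
        dists = np.stack([divergence(data, c) for c in centroids], axis=1)
        labels = dists.argmin(axis=1)
        # Update step: the mean is the optimal centroid for any Bregman
        # divergence; an empty cluster keeps its previous centroid.
        new_centroids = np.array([
            data[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Two well-separated groups of positive one-dimensional samples.
data = np.concatenate([np.linspace(1, 5, 20), np.linspace(50, 55, 20)])[:, None]
labels, centroids = bregman_hard_clustering(data, 2, squared_euclidean)
```

Passing `generalized_i_divergence` instead of `squared_euclidean` changes only the assignment step. Because the right divergence is fixed in advance here, a poor choice cannot be corrected during fitting, which is the gap an adaptive method is meant to address.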
-
Accelerometer based Activity Classification with Variational Inference on Sticky HDP-SLDS
Authors:
Mehmet Emin Basbug,
Koray Ozcan,
Senem Velipasalar
Abstract:
As part of daily monitoring of human activities, wearable sensors and devices are becoming increasingly popular sources of data. With the advent of smartphones equipped with an accelerometer, gyroscope, and camera, it is now possible to develop activity classification platforms that everyone can use conveniently. In this paper, we propose a fast inference method for an unsupervised non-parametric time series model, namely variational inference for the sticky HDP-SLDS (Hierarchical Dirichlet Process Switching Linear Dynamical System). We show that the proposed algorithm can differentiate various indoor activities such as sitting, walking, turning, going up/down the stairs, and taking the elevator using only the accelerometer of an Android smartphone (Samsung Galaxy S4). We used the front camera of the smartphone to annotate activity types precisely. We compared the proposed method with Hidden Markov Models with Gaussian emission probabilities on a dataset of 10 subjects and showed the efficacy of the stickiness property. We further compared variational inference to the Gibbs sampler on the same model and show that variational inference is faster by an order of magnitude.
Submitted 19 October, 2015;
originally announced October 2015.