-
DAL: A Practical Prior-Free Black-Box Framework for Non-Stationary Bandit Environments
Authors:
Argyrios Gerogiannis,
Yu-Han Huang,
Subhonmesh Bose,
Venugopal V. Veeravalli
Abstract:
We introduce a practical, black-box framework termed Detection Augmenting Learning (DAL) for the problem of non-stationary bandits without prior knowledge of the underlying non-stationarity. DAL is modular, accepting any stationary bandit algorithm as input and augmenting it with a change detector. Our approach is applicable to all common parametric and non-parametric bandit variants. Extensive ex…
▽ More
We introduce a practical, black-box framework termed Detection Augmenting Learning (DAL) for the problem of non-stationary bandits without prior knowledge of the underlying non-stationarity. DAL is modular, accepting any stationary bandit algorithm as input and augmenting it with a change detector. Our approach is applicable to all common parametric and non-parametric bandit variants. Extensive experimentation demonstrates that DAL consistently surpasses current state-of-the-art methods across diverse non-stationary scenarios, including synthetic benchmarks and real-world datasets, underscoring its versatility and scalability. We provide theoretical insights into DAL's strong empirical performance on piecewise stationary and drift settings, complemented by thorough experimental validation.
△ Less
Submitted 24 May, 2025; v1 submitted 31 January, 2025;
originally announced January 2025.
-
Nonparametric Sparse Online Learning of the Koopman Operator
Authors:
Boya Hou,
Sina Sanjari,
Nathan Dahlin,
Alec Koppel,
Subhonmesh Bose
Abstract:
The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. Data-driven techniques to learn the Koopman operator typically assume that the chosen function space is closed under system dynamics. In this paper, we study the Koopman operator via its action on the reproducing kernel Hilbert space (RKHS), and explore the mis-specified scenari…
▽ More
The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. Data-driven techniques to learn the Koopman operator typically assume that the chosen function space is closed under system dynamics. In this paper, we study the Koopman operator via its action on the reproducing kernel Hilbert space (RKHS), and explore the mis-specified scenario where the dynamics may escape the chosen function space. We relate the Koopman operator to the conditional mean embeddings (CME) operator and then present an operator stochastic approximation algorithm to learn the Koopman operator iteratively with control over the complexity of the representation. We provide both asymptotic and finite-time last-iterate guarantees of the online sparse learning algorithm with trajectory-based sampling with an analysis that is substantially more involved than that for finite-dimensional stochastic approximation. Numerical examples confirm the effectiveness of the proposed algorithm.
△ Less
Submitted 4 February, 2025; v1 submitted 27 January, 2025;
originally announced January 2025.
-
Detection Augmented Bandit Procedures for Piecewise Stationary MABs: A Modular Approach
Authors:
Yu-Han Huang,
Argyrios Gerogiannis,
Subhonmesh Bose,
Venugopal V. Veeravalli
Abstract:
Conventional Multi-Armed Bandit (MAB) algorithms are designed for stationary environments, where the reward distributions associated with the arms do not change with time. In many applications, however, the environment is more accurately modeled as being nonstationary. In this work, piecewise stationary MAB (PS-MAB) environments are investigated, in which the reward distributions associated with a…
▽ More
Conventional Multi-Armed Bandit (MAB) algorithms are designed for stationary environments, where the reward distributions associated with the arms do not change with time. In many applications, however, the environment is more accurately modeled as being nonstationary. In this work, piecewise stationary MAB (PS-MAB) environments are investigated, in which the reward distributions associated with a subset of the arms change at some change-points and remain stationary between change-points. Our focus is on the asymptotic analysis of PS-MABs, for which practical algorithms based on change detection (CD) have been previously proposed. Our goal is to modularize the design and analysis of such CD-based Bandit (CDB) procedures. To this end, we identify the requirements for stationary bandit algorithms and change detectors in a CDB procedure that are needed for the modularization. We assume that the rewards are sub-Gaussian. Under this assumption and a condition on the separation of the change-points, we show that the analysis of CDB procedures can indeed be modularized, so that regret bounds can be obtained in a unified manner for various combinations of change detectors and bandit algorithms. Through this analysis, we develop new modular CDB procedures that are order-optimal. We compare the performance of our modular CDB procedures with various other methods in simulations.
△ Less
Submitted 26 February, 2025; v1 submitted 2 January, 2025;
originally announced January 2025.
-
Tangential Randomization in Linear Bandits (TRAiL): Guaranteed Inference and Regret Bounds
Authors:
Arda Güçlü,
Subhonmesh Bose
Abstract:
We propose and analyze TRAiL (Tangential Randomization in Linear Bandits), a computationally efficient regret-optimal forced exploration algorithm for linear bandits on action sets that are sublevel sets of strongly convex functions. TRAiL estimates the governing parameter of the linear bandit problem through a standard regularized least squares and perturbs the reward-maximizing action correspond…
▽ More
We propose and analyze TRAiL (Tangential Randomization in Linear Bandits), a computationally efficient regret-optimal forced exploration algorithm for linear bandits on action sets that are sublevel sets of strongly convex functions. TRAiL estimates the governing parameter of the linear bandit problem through a standard regularized least squares and perturbs the reward-maximizing action corresponding to said point estimate along the tangent plane of the convex compact action set before projecting back to it. Exploiting concentration results for matrix martingales, we prove that TRAiL ensures a $Ω(\sqrt{T})$ growth in the inference quality, measured via the minimum eigenvalue of the design (regressor) matrix with high-probability over a $T$-length period. We build on this result to obtain an $\mathcal{O}(\sqrt{T} \log(T))$ upper bound on cumulative regret with probability at least $ 1 - 1/T$ over $T$ periods, and compare TRAiL to other popular algorithms for linear bandits. Then, we characterize an $Ω(\sqrt{T})$ minimax lower bound for any algorithm on the expected regret that covers a wide variety of action/parameter sets and noise processes. Our analysis not only expands the realm of lower-bounds in linear bandits significantly, but as a byproduct, yields a trade-off between regret and inference quality. Specifically, we prove that any algorithm with an $\mathcal{O}(T^α)$ expected regret growth must have an $Ω(T^{1-α})$ asymptotic growth in expected inference quality. Our experiments on the $L^p$ unit ball as action sets reveal how this relation can be violated, but only in the short-run, before returning to respect the bound asymptotically. In effect, regret-minimizing algorithms must have just the right rate of inference -- too fast or too slow inference will incur sub-optimal regret growth.
△ Less
Submitted 18 November, 2024;
originally announced November 2024.
-
Nonparametric Sparse Online Learning of the Koopman Operator
Authors:
Boya Hou,
Sina Sanjari,
Nathan Dahlin,
Alec Koppel,
Subhonmesh Bose
Abstract:
The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. Data-driven techniques to learn the Koopman operator typically assume that the chosen function space is closed under system dynamics. In this paper, we study the Koopman operator via its action on the reproducing kernel Hilbert space (RKHS), and explore the mis-specified scenari…
▽ More
The Koopman operator provides a powerful framework for representing the dynamics of general nonlinear dynamical systems. Data-driven techniques to learn the Koopman operator typically assume that the chosen function space is closed under system dynamics. In this paper, we study the Koopman operator via its action on the reproducing kernel Hilbert space (RKHS), and explore the mis-specified scenario where the dynamics may escape the chosen function space. We relate the Koopman operator to the conditional mean embeddings (CME) operator and then present an operator stochastic approximation algorithm to learn the Koopman operator iteratively with control over the complexity of the representation. We provide both asymptotic and finite-time last-iterate guarantees of the online sparse learning algorithm with trajectory-based sampling with an analysis that is substantially more involved than that for finite-dimensional stochastic approximation. Numerical examples confirm the effectiveness of the proposed algorithm.
△ Less
Submitted 4 February, 2025; v1 submitted 12 May, 2024;
originally announced May 2024.
-
Estimating Changepoints in Extremal Dependence, Applied to Aviation Stock Prices During COVID-19 Pandemic
Authors:
Arnab Hazra,
Shiladitya Bose
Abstract:
The dependence in the tails of the joint distribution of two random variables is generally assessed using $χ$-measure, the limiting conditional probability of one variable being extremely high given the other variable is also extremely high. This work is motivated by the structural changes in $χ$-measure between the daily rate of return (RoR) of the two Indian airlines, IndiGo and SpiceJet, during…
▽ More
The dependence in the tails of the joint distribution of two random variables is generally assessed using $χ$-measure, the limiting conditional probability of one variable being extremely high given the other variable is also extremely high. This work is motivated by the structural changes in $χ$-measure between the daily rate of return (RoR) of the two Indian airlines, IndiGo and SpiceJet, during the COVID-19 pandemic. We model the daily maximum and minimum RoR vectors (potentially transformed) using the bivariate Hüsler-Reiss (BHR) distribution. To estimate the changepoint in the $χ$-measure of the BHR distribution, we explore two changepoint detection procedures based on the Likelihood Ratio Test (LRT) and Modified Information Criterion (MIC). We obtain critical values and power curves of the LRT and MIC test statistics for low through high values of $χ$-measure. We also explore the consistency of the estimators of the changepoint based on LRT and MIC numerically. In our data application, for RoR maxima and minima, the most prominent changepoints detected by LRT and MIC are close to the announcement of the first phases of lockdown and unlock, respectively, which are realistic; thus, our study would be beneficial for portfolio optimization in the case of future pandemic situations.
△ Less
Submitted 14 June, 2024; v1 submitted 26 August, 2023;
originally announced August 2023.
-
Machine learning based surrogate modeling with SVD enabled training for nonlinear civil structures subject to dynamic loading
Authors:
Siddharth S. Parida,
Supratik Bose,
Megan Butcher,
Georgios Apostolakis,
Prashant Shekhar
Abstract:
The computationally expensive estimation of engineering demand parameters (EDPs) via finite element (FE) models, while considering earthquake and parameter uncertainty limits the use of the Performance Based Earthquake Engineering framework. Attempts have been made to substitute FE models with surrogate models, however, most of these models are a function of building parameters only. This necessit…
▽ More
The computationally expensive estimation of engineering demand parameters (EDPs) via finite element (FE) models, while considering earthquake and parameter uncertainty limits the use of the Performance Based Earthquake Engineering framework. Attempts have been made to substitute FE models with surrogate models, however, most of these models are a function of building parameters only. This necessitates re-training for earthquakes not previously seen by the surrogate. In this paper, the authors propose a machine learning based surrogate model framework, which considers both these uncertainties in order to predict for unseen earthquakes. Accordingly,earthquakes are characterized by their projections on an orthonormal basis, computed using SVD of a representative ground motion suite. This enables one to generate large varieties of earthquakes by randomly sampling these weights and multiplying them with the basis. The weights along with the constitutive parameters serve as inputs to a machine learning model with EDPs as the desired output. Four competing machine learning models were tested and it was observed that a deep neural network (DNN) gave the most accurate prediction. The framework is validated by using it to successfully predict the peak response of one-story and three-story buildings represented using stick models, subjected to unseen far-field ground motions.
△ Less
Submitted 12 June, 2022;
originally announced June 2022.
-
An Improved Relevance Feedback in CBIR
Authors:
Subhadip Maji,
Smarajit Bose
Abstract:
Relevance Feedback in Content-Based Image Retrieval is a method where the feedback of the performance is being used to improve itself. Prior works use feature re-weighting and classification techniques as the Relevance Feedback methods. This paper shows a novel addition to the prior methods to further improve the retrieval accuracy. In addition to all of these, the paper also shows a novel idea to…
▽ More
Relevance Feedback in Content-Based Image Retrieval is a method where the feedback of the performance is being used to improve itself. Prior works use feature re-weighting and classification techniques as the Relevance Feedback methods. This paper shows a novel addition to the prior methods to further improve the retrieval accuracy. In addition to all of these, the paper also shows a novel idea to even improve the 0-th iteration retrieval accuracy from the information of Relevance Feedback.
△ Less
Submitted 29 August, 2020; v1 submitted 21 June, 2020;
originally announced June 2020.
-
Is my Neural Network Neuromorphic? Taxonomy, Recent Trends and Future Directions in Neuromorphic Engineering
Authors:
Sumon Kumar Bose,
Jyotibdha Acharya,
Arindam Basu
Abstract:
In this paper, we review recent work published over the last 3 years under the umbrella of Neuromorphic engineering to analyze what are the common features among such systems. We see that there is no clear consensus but each system has one or more of the following features:(1) Analog computing (2) Non vonNeumann Architecture and low-precision digital processing (3) Spiking Neural Networks (SNN) wi…
▽ More
In this paper, we review recent work published over the last 3 years under the umbrella of Neuromorphic engineering to analyze what are the common features among such systems. We see that there is no clear consensus but each system has one or more of the following features:(1) Analog computing (2) Non vonNeumann Architecture and low-precision digital processing (3) Spiking Neural Networks (SNN) with components closely related to biology. We compare recent machine learning accelerator chips to show that indeed analog processing and reduced bit precision architectures have best throughput, energy and area efficiencies. However, pure digital architectures can also achieve quite high efficiencies by just adopting a non von-Neumann architecture. Given the design automation tools for digital hardware design, it raises a question on the likelihood of adoption of analog processing in the near future for industrial designs. Next, we argue about the importance of defining standards and choosing proper benchmarks for the progress of neuromorphic system designs and propose some desired characteristics of such benchmarks. Finally, we show brain-machine interfaces as a potential task that fulfils all the criteria of such benchmarks.
△ Less
Submitted 27 February, 2020;
originally announced February 2020.
-
CBIR using features derived by Deep Learning
Authors:
Subhadip Maji,
Smarajit Bose
Abstract:
In a Content Based Image Retrieval (CBIR) System, the task is to retrieve similar images from a large database given a query image. The usual procedure is to extract some useful features from the query image, and retrieve images which have similar set of features. For this purpose, a suitable similarity measure is chosen, and images with high similarity scores are retrieved. Naturally the choice o…
▽ More
In a Content Based Image Retrieval (CBIR) System, the task is to retrieve similar images from a large database given a query image. The usual procedure is to extract some useful features from the query image, and retrieve images which have similar set of features. For this purpose, a suitable similarity measure is chosen, and images with high similarity scores are retrieved. Naturally the choice of these features play a very important role in the success of this system, and high level features are required to reduce the semantic gap.
In this paper, we propose to use features derived from pre-trained network models from a deep-learning convolution network trained for a large image classification problem. This approach appears to produce vastly superior results for a variety of databases, and it outperforms many contemporary CBIR systems. We analyse the retrieval time of the method, and also propose a pre-clustering of the database based on the above-mentioned features which yields comparable results in a much shorter time in most of the cases.
△ Less
Submitted 13 February, 2020;
originally announced February 2020.
-
ADEPOS: A Novel Approximate Computing Framework for Anomaly Detection Systems and its Implementation in 65nm CMOS
Authors:
Sumon Kumar Bose,
Bapi Kar,
Mohendra Roy,
Pradeep Kumar Gopalakrishnan,
Zhang Lei,
Aakash Patil,
Arindam Basu
Abstract:
To overcome the energy and bandwidth limitations of traditional IoT systems, edge computing or information extraction at the sensor node has become popular. However, now it is important to create very low energy information extraction or pattern recognition systems. In this paper, we present an approximate computing method to reduce the computation energy of a specific type of IoT system used for…
▽ More
To overcome the energy and bandwidth limitations of traditional IoT systems, edge computing or information extraction at the sensor node has become popular. However, now it is important to create very low energy information extraction or pattern recognition systems. In this paper, we present an approximate computing method to reduce the computation energy of a specific type of IoT system used for anomaly detection (e.g. in predictive maintenance, epileptic seizure detection, etc). Termed as Anomaly Detection Based Power Savings (ADEPOS), our proposed method uses low precision computing and low complexity neural networks at the beginning when it is easy to distinguish healthy data. However, on the detection of anomalies, the complexity of the network and computing precision are adaptively increased for accurate predictions. We show that ensemble approaches are well suited for adaptively changing network size. To validate our proposed scheme, a chip has been fabricated in UMC65nm process that includes an MSP430 microprocessor along with an on-chip switching mode DC-DC converter for dynamic voltage and frequency scaling. Using NASA bearing dataset for machine health monitoring, we show that using ADEPOS we can achieve 8.95X saving of energy along the lifetime without losing any detection accuracy. The energy savings are obtained by reducing the execution time of the neural network on the microprocessor.
△ Less
Submitted 4 December, 2019;
originally announced December 2019.
-
A Stacked Autoencoder Neural Network based Automated Feature Extraction Method for Anomaly detection in On-line Condition Monitoring
Authors:
Mohendra Roy,
Sumon Kumar Bose,
Bapi Kar,
Pradeep Kumar Gopalakrishnan,
Arindam Basu
Abstract:
Condition monitoring is one of the routine tasks in all major process industries. The mechanical parts such as a motor, gear, bearings are the major components of a process industry and any fault in them may cause a total shutdown of the whole process, which may result in serious losses. Therefore, it is very crucial to predict any approaching defects before its occurrence. Several methods exist f…
▽ More
Condition monitoring is one of the routine tasks in all major process industries. The mechanical parts such as a motor, gear, bearings are the major components of a process industry and any fault in them may cause a total shutdown of the whole process, which may result in serious losses. Therefore, it is very crucial to predict any approaching defects before its occurrence. Several methods exist for this purpose and many research are being carried out for better and efficient models. However, most of them are based on the processing of raw sensor signals, which is tedious and expensive. Recently, there has been an increase in the feature based condition monitoring, where only the useful features are extracted from the raw signals and interpreted for the prediction of the fault. Most of these are handcrafted features, where these are manually obtained based on the nature of the raw data. This of course requires the prior knowledge of the nature of data and related processes. This limits the feature extraction process. However, recent development in the autoencoder based feature extraction method provides an alternative to the traditional handcrafted approaches; however, they have mostly been confined in the area of image and audio processing. In this work, we have developed an automated feature extraction method for on-line condition monitoring based on the stack of the traditional autoencoder and an on-line sequential extreme learning machine(OSELM) network. The performance of this method is comparable to that of the traditional feature extraction approaches. The method can achieve 100% detection accuracy for determining the bearing health states of NASA bearing dataset. The simple design of this method is promising for the easy hardware implementation of Internet of Things(IoT) based prognostics solutions.
△ Less
Submitted 18 October, 2018;
originally announced October 2018.
-
Supervised quantum gate "teaching" for quantum hardware design
Authors:
Leonardo Banchi,
Nicola Pancotti,
Sougato Bose
Abstract:
We show how to train a quantum network of pairwise interacting qubits such that its evolution implements a target quantum algorithm into a given network subset. Our strategy is inspired by supervised learning and is designed to help the physical construction of a quantum computer which operates with minimal external classical control.
We show how to train a quantum network of pairwise interacting qubits such that its evolution implements a target quantum algorithm into a given network subset. Our strategy is inspired by supervised learning and is designed to help the physical construction of a quantum computer which operates with minimal external classical control.
△ Less
Submitted 20 July, 2016;
originally announced July 2016.
-
A Novel Minimum Divergence Approach to Robust Speaker Identification
Authors:
Ayanendranath Basu,
Smarajit Bose,
Amita Pal,
Anish Mukherjee,
Debasmita Das
Abstract:
In this work, a novel solution to the speaker identification problem is proposed through minimization of statistical divergences between the probability distribution (g). of feature vectors from the test utterance and the probability distributions of the feature vector corresponding to the speaker classes. This approach is made more robust to the presence of outliers, through the use of suitably m…
▽ More
In this work, a novel solution to the speaker identification problem is proposed through minimization of statistical divergences between the probability distribution (g). of feature vectors from the test utterance and the probability distributions of the feature vector corresponding to the speaker classes. This approach is made more robust to the presence of outliers, through the use of suitably modified versions of the standard divergence measures. The relevant solutions to the minimum distance methods are referred to as the minimum rescaled modified distance estimators (MRMDEs). Three measures were considered - the likelihood disparity, the Hellinger distance and Pearson's chi-square distance. The proposed approach is motivated by the observation that, in the case of the likelihood disparity, when the empirical distribution function is used to estimate g, it becomes equivalent to maximum likelihood classification with Gaussian Mixture Models (GMMs) for speaker classes, a highly effective approach used, for example, by Reynolds [22] based on Mel Frequency Cepstral Coefficients (MFCCs) as features. Significant improvement in classification accuracy is observed under this approach on the benchmark speech corpus NTIMIT and a new bilingual speech corpus NISIS, with MFCC features, both in isolation and in combination with delta MFCC features. Moreover, the ubiquitous principal component transformation, by itself and in conjunction with the principle of classifier combination, is found to further enhance the performance.
△ Less
Submitted 16 December, 2015;
originally announced December 2015.
-
A Hybrid Approach for Improved Content-based Image Retrieval using Segmentation
Authors:
Smarajit Bose,
Amita Pal,
Jhimli Mallick,
Sunil Kumar,
Pratyaydipta Rudra
Abstract:
The objective of Content-Based Image Retrieval (CBIR) methods is essentially to extract, from large (image) databases, a specified number of images similar in visual and semantic content to a so-called query image. To bridge the semantic gap that exists between the representation of an image by low-level features (namely, colour, shape, texture) and its high-level semantic content as perceived by…
▽ More
The objective of Content-Based Image Retrieval (CBIR) methods is essentially to extract, from large (image) databases, a specified number of images similar in visual and semantic content to a so-called query image. To bridge the semantic gap that exists between the representation of an image by low-level features (namely, colour, shape, texture) and its high-level semantic content as perceived by humans, CBIR systems typically make use of the relevance feedback (RF) mechanism. RF iteratively incorporates user-given inputs regarding the relevance of retrieved images, to improve retrieval efficiency. One approach is to vary the weights of the features dynamically via feature reweighting. In this work, an attempt has been made to improve retrieval accuracy by enhancing a CBIR system based on color features alone, through implicit incorporation of shape information obtained through prior segmentation of the images. Novel schemes for feature reweighting as well as for initialization of the relevant set for improved relevance feedback, have also been proposed for boosting performance of RF- based CBIR. At the same time, new measures for evaluation of retrieval accuracy have been suggested, to overcome the limitations of existing measures in the RF context. Results of extensive experiments have been presented to illustrate the effectiveness of the proposed approaches.
△ Less
Submitted 11 February, 2015;
originally announced February 2015.
-
Minimum Distance Estimation of Milky Way Model Parameters and Related Inference
Authors:
Sourabh Banerjee,
Ayanendranath Basu,
Sourabh Bhattacharya,
Smarajit Bose,
Dalia Chakrabarty,
Soumendu Sundar Mukherjee
Abstract:
We propose a method to estimate the location of the Sun in the disk of the Milky Way using a method based on the Hellinger distance and construct confidence sets on our estimate of the unknown location using a bootstrap based method. Assuming the Galactic disk to be two-dimensional, the sought solar location then reduces to the radial distance separating the Sun from the Galactic center and the an…
▽ More
We propose a method to estimate the location of the Sun in the disk of the Milky Way using a method based on the Hellinger distance and construct confidence sets on our estimate of the unknown location using a bootstrap based method. Assuming the Galactic disk to be two-dimensional, the sought solar location then reduces to the radial distance separating the Sun from the Galactic center and the angular separation of the Galactic center to Sun line, from a pre-fixed line on the disk. On astronomical scales, the unknown solar location is equivalent to the location of us earthlings who observe the velocities of a sample of stars in the neighborhood of the Sun. This unknown location is estimated by undertaking pairwise comparisons of the estimated density of the observed set of velocities of the sampled stars, with densities estimated using synthetic stellar velocity data sets generated at chosen locations in the Milky Way disk according to four base astrophysical models. The "match" between the pair of estimated densities is parameterized by the affinity measure based on the familiar Hellinger distance. We perform a novel cross-validation procedure to establish a desirable "consistency" property of the proposed method.
△ Less
Submitted 15 August, 2014; v1 submitted 3 September, 2013;
originally announced September 2013.
-
Modeling Bimodal Discrete Data Using Conway-Maxwell-Poisson Mixture Models
Authors:
Pragya Sur,
Galit Shmueli,
Smarajit Bose,
Paromita Dubey
Abstract:
Bimodal truncated count distributions are frequently observed in aggregate survey data and in user ratings when respondents are mixed in their opinion. They also arise in censored count data, where the highest category might create an additional mode. Modeling bimodal behavior in discrete data is useful for various purposes, from comparing shapes of different samples (or survey questions) to predi…
▽ More
Bimodal truncated count distributions are frequently observed in aggregate survey data and in user ratings when respondents are mixed in their opinion. They also arise in censored count data, where the highest category might create an additional mode. Modeling bimodal behavior in discrete data is useful for various purposes, from comparing shapes of different samples (or survey questions) to predicting future ratings by new raters. The Poisson distribution is the most common distribution for fitting count data and can be modified to achieve mixtures of truncated Poisson distributions. However, it is suitable only for modeling equi-dispersed distributions and is limited in its ability to capture bimodality. The Conway-Maxwell-Poisson (CMP) distribution is a two-parameter generalization of the Poisson distribution that allows for over- and under-dispersion. In this work, we propose a mixture of CMPs for capturing a wide range of truncated discrete data, which can exhibit unimodal and bimodal behavior. We present methods for estimating the parameters of a mixture of two CMP distributions using an EM approach. Our approach introduces a special two-step optimization within the M step to estimate multiple parameters. We examine computational and theoretical issues. The methods are illustrated for modeling ordered rating data as well as truncated count data, using simulated and real examples.
△ Less
Submitted 23 January, 2014; v1 submitted 2 September, 2013;
originally announced September 2013.