Search | arXiv e-print repository

Near-Optimal Non-Convex Stochastic Optimization under Generalized Smoothness

Authors: Zijian Liu, Srikanth Jagabathula, Zhengyuan Zhou

Abstract: The generalized smooth condition, $(L_{0},L_{1})$-smoothness, has triggered people's interest since it is more realistic in many optimization problems shown by both empirical and theoretical evidence. Two recent works established the $O(ε^{-3})$ sample complexity to obtain an $O(ε)$-stationary point. However, both require a large batch size on the order of $\mathrm{poly}(ε^{-1})$, which is not only computationally burdensome but also unsuitable for streaming applications. Additionally, these existing convergence bounds are established only for the expected rate, which is inadequate as they do not supply a useful performance guarantee on a single run. In this work, we solve the prior two problems simultaneously by revisiting a simple variant of the STORM algorithm. Specifically, under the $(L_{0},L_{1})$-smoothness and affine-type noises, we establish the first near-optimal $O(\log(1/(δε))ε^{-3})$ high-probability sample complexity where $δ\in(0,1)$ is the failure probability. Besides, for the same algorithm, we also recover the optimal $O(ε^{-3})$ sample complexity for the expected convergence with improved dependence on the problem-dependent parameter. More importantly, our convergence results only require a constant batch size in contrast to the previous works.

Submitted 27 October, 2023; v1 submitted 12 February, 2023; originally announced February 2023.

Comments: The whole paper is rewritten with new results in V2

arXiv:1701.07483 [pdf, other]

A Model-based Projection Technique for Segmenting Customers

Authors: Srikanth Jagabathula, Lakshminarayanan Subramanian, Ashwin Venkataraman

Abstract: We consider the problem of segmenting a large population of customers into non-overlapping groups with similar preferences, using diverse preference observations such as purchases, ratings, clicks, etc. over subsets of items. We focus on the setting where the universe of items is large (ranging from thousands to millions) and unstructured (lacking well-defined attributes) and each customer provides observations for only a few items. These data characteristics limit the applicability of existing techniques in marketing and machine learning. To overcome these limitations, we propose a model-based projection technique, which transforms the diverse set of observations into a more comparable scale and deals with missing data by projecting the transformed data onto a low-dimensional space. We then cluster the projected data to obtain the customer segments. Theoretically, we derive precise necessary and sufficient conditions that guarantee asymptotic recovery of the true customer segments. Empirically, we demonstrate the speed and performance of our method in two real-world case studies: (a) 84% improvement in the accuracy of new movie recommendations on the MovieLens data set and (b) 6% improvement in the performance of a similar-item recommendation algorithm on an offline dataset at eBay. We show that our method outperforms standard latent-class and demographic-based techniques.

Submitted 25 January, 2017; originally announced January 2017.

Comments: 51 pages, 3 figures, 4 tables

arXiv:1108.3596 [pdf, ps, other]

Assortment Optimization Under General Choice

Authors: Vivek Farias, Srikanth Jagabathula, Devavrat Shah

Abstract: We consider the problem of static assortment optimization, where the goal is to find the assortment of size at most $C$ that maximizes revenues. This is a fundamental decision problem in the area of Operations Management. It has been shown that this problem is provably hard for most of the important families of parametric choice models, except the multinomial logit (MNL) model. In addition, most of the approximation schemes proposed in the literature are tailored to a specific parametric structure. We deviate from this and propose a general algorithm to find the optimal assortment assuming access to only a subroutine that gives revenue predictions; this means that the algorithm can be applied with any choice model. We prove that when the underlying choice model is the MNL model, our algorithm can find the optimal assortment efficiently.

Submitted 17 August, 2011; originally announced August 2011.
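
Illustrative sketch (not from the paper): the problem setup in the abstract above can be made concrete with a brute-force baseline. The code below is NOT the authors' algorithm; it only shows what "access to a revenue-prediction subroutine" and "assortment of size at most $C$" mean in practice. The function names `mnl_revenue` and `best_assortment`, the parameter names, and the example numbers are all hypothetical. The MNL revenue formula used is the standard one, $R(S) = \sum_{i \in S} r_i v_i / (v_0 + \sum_{i \in S} v_i)$.

```python
from itertools import combinations
from typing import Callable, Sequence, Tuple


def mnl_revenue(
    assortment: Sequence[int],
    prices: Sequence[float],
    weights: Sequence[float],
    no_purchase_weight: float = 1.0,
) -> float:
    """Expected revenue of an assortment under the standard MNL model."""
    denom = no_purchase_weight + sum(weights[i] for i in assortment)
    return sum(prices[i] * weights[i] for i in assortment) / denom


def best_assortment(
    n_items: int,
    capacity: int,
    revenue: Callable[[Tuple[int, ...]], float],
) -> Tuple[Tuple[int, ...], float]:
    """Exhaustively search all assortments of size at most `capacity`.

    `revenue` is treated as a black-box oracle, so any choice model can be
    plugged in. The search is exponential in `capacity`; it is a baseline
    for small instances only.
    """
    best_set: Tuple[int, ...] = ()
    best_rev = 0.0  # the empty assortment earns nothing
    for size in range(1, capacity + 1):
        for candidate in combinations(range(n_items), size):
            rev = revenue(candidate)
            if rev > best_rev:
                best_set, best_rev = candidate, rev
    return best_set, best_rev


if __name__ == "__main__":
    # Hypothetical data: three products, capacity C = 2.
    prices = [10.0, 8.0, 6.0]
    weights = [1.0, 2.0, 3.0]
    chosen, rev = best_assortment(3, 2, lambda s: mnl_revenue(s, prices, weights))
    print(chosen, rev)
```

Because the oracle is passed in as a plain callable, swapping MNL for another choice model only requires a different `revenue` function; the search itself does not change.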

arXiv:1011.4339 [pdf, other]

Sparse Choice Models

Authors: Vivek F. Farias, Srikanth Jagabathula, Devavrat Shah

Abstract: Choice models, which capture popular preferences over objects of interest, play a key role in making decisions whose eventual outcome is impacted by human choice behavior. In most scenarios, the choice model, which can effectively be viewed as a distribution over permutations, must be learned from observed data. The observed data, in turn, may frequently be viewed as (partial, noisy) information about marginals of this distribution over permutations. As such, the search for an appropriate choice model boils down to learning a distribution over permutations that is (near-)consistent with observed information about this distribution. In this work, we pursue a non-parametric approach which seeks to learn a choice model (i.e. a distribution over permutations) with {\em sparsest} possible support, and consistent with observed data. We assume that the data observed consists of noisy information pertaining to the marginals of the choice model we seek to learn. We establish that {\em any} choice model admits a `very' sparse approximation in the sense that there exists a choice model whose support is small relative to the dimension of the observed data and whose marginals approximately agree with the observed marginal information. We further show that under what we dub `signature' conditions, such a sparse approximation can be found in a computationally efficient fashion relative to a brute force approach.
An empirical study using the American Psychological Association election data-set suggests that our approach manages to unearth useful structural properties of the underlying choice model using the sparse approximation found. Our results further suggest that the signature condition is a potential alternative to the recently popularized Restricted Null Space condition for efficient recovery of sparse models.

Submitted 21 September, 2011; v1 submitted 18 November, 2010; originally announced November 2010.

Comments: 28 pages

arXiv:0910.0895 [pdf, ps, other]

Inferring Rankings Using Constrained Sensing

Authors: Srikanth Jagabathula, Devavrat Shah

Abstract: We consider the problem of recovering a function over the space of permutations (or, the symmetric group) over $n$ elements from given partial information; the partial information we consider is related to the group theoretic Fourier Transform of the function. This problem naturally arises in several settings such as ranked elections, multi-object tracking, ranking systems, and recommendation systems. Inspired by the work of Donoho and Stark in the context of discrete-time functions, we focus on non-negative functions with a sparse support (support size $\ll$ domain size). Our recovery method is based on finding the sparsest solution (through $\ell_0$ optimization) that is consistent with the available information. As the main result, we derive sufficient conditions for functions that can be recovered exactly from partial information through $\ell_0$ optimization. Under a natural random model for the generation of functions, we quantify the recoverability conditions by deriving bounds on the sparsity (support size) for which the function satisfies the sufficient conditions with a high probability as $n \to \infty$. $\ell_0$ optimization is computationally hard. Therefore, the popular compressive sensing literature considers solving the convex relaxation, $\ell_1$ optimization, to find the sparsest solution. However, we show that $\ell_1$ optimization fails to recover a function (even with constant sparsity) generated using the random model with a high probability as $n \to \infty$.
In order to overcome this problem, we propose a novel iterative algorithm for the recovery of functions that satisfy the sufficient conditions. Finally, using an Information Theoretic framework, we study necessary conditions for exact recovery to be possible.

Submitted 19 June, 2011; v1 submitted 6 October, 2009; originally announced October 2009.

Comments: 19 pages

arXiv:0910.0063 [pdf, other]

A Nonparametric Approach to Modeling Choice with Limited Data

Authors: Vivek F. Farias, Srikanth Jagabathula, Devavrat Shah

Abstract: A central push in operations models over the last decade has been the incorporation of models of customer choice. Real world implementations of many of these models face the formidable stumbling block of simply identifying the `right' model of choice to use. Thus motivated, we visit the following problem: For a `generic' model of consumer choice (namely, distributions over preference lists) and a limited amount of data on how consumers actually make decisions (such as marginal information about these distributions), how may one predict revenues from offering a particular assortment of choices? We present a framework to answer such questions and design a number of tractable algorithms from a data and computational standpoint for the same. This paper thus takes a significant step towards `automating' the crucial task of choice model selection in the context of operational decision problems.

Submitted 21 June, 2011; v1 submitted 30 September, 2009; originally announced October 2009.

Comments: 44 pages, 4 figures

arXiv:0808.2530 [pdf, ps, other]

doi 10.1109/TIT.2010.2103851

Fair Scheduling in Networks Through Packet Election

Authors: Srikanth Jagabathula, Devavrat Shah

Abstract: We consider the problem of designing a fair scheduling algorithm for discrete-time constrained queuing networks. Each queue has dedicated exogenous packet arrivals. There are constraints on which queues can be served simultaneously. This model effectively describes important special instances like network switches, interference in wireless networks, bandwidth sharing for congestion control and traffic scheduling in road roundabouts. Fair scheduling is required because it provides isolation to different traffic flows; isolation makes the system more robust and enables providing quality of service. Existing work on fairness for constrained networks concentrates on flow based fairness. As a main result, we describe a notion of packet based fairness by establishing an analogy with the ranked election problem: packets are voters, schedules are candidates and each packet ranks the schedules based on its priorities. We then obtain a scheduling algorithm that achieves the described notion of fairness by drawing upon the seminal work of Goodman and Markowitz (1952). This yields the familiar Maximum Weight (MW) style algorithm. As another important result, we prove that the algorithm obtained is throughput optimal. There is no reason a priori why this should be true, and the proof requires non-traditional methods.

Submitted 21 September, 2010; v1 submitted 19 August, 2008; originally announced August 2008.

Comments: 14 pages (double column), submitted to IEEE Transactions on Information Theory

arXiv:cs/0701001 [pdf, ps, other]

On High Spatial Reuse Link Scheduling in STDMA Wireless Ad Hoc Networks

Authors: Ashutosh Deepak Gore, Srikanth Jagabathula, Abhay Karandikar

Abstract: Graph-based algorithms for point-to-point link scheduling in Spatial reuse Time Division Multiple Access (STDMA) wireless ad hoc networks often result in a significant number of transmissions having low Signal to Interference and Noise density Ratio (SINR) at intended receivers, leading to low throughput. To overcome this problem, we propose a new algorithm for STDMA link scheduling based on a graph model of the network as well as SINR computations. The performance of our algorithm is evaluated in terms of spatial reuse and computational complexity. Simulation results demonstrate that our algorithm achieves better performance than existing algorithms.

Submitted 2 January, 2007; originally announced January 2007.

Comments: 10 pages (double column), 10 figures

ACM Class: C.2.1; C.2.5; F.2

Showing 1–8 of 8 results for author: Jagabathula, S