-
Optimal Sequential Recommendations: Exploiting User and Item Structure
Authors:
Mina Karzand,
Guy Bresler
Abstract:
We consider an online model for recommendation systems, with each user being recommended an item at each time-step and providing 'like' or 'dislike' feedback. A latent variable model specifies the user preferences: both users and items are clustered into types. The model captures structure in both the item and user spaces, as used by item-item and user-user collaborative filtering algorithms. We study the situation in which the type preference matrix has i.i.d. entries. Our main contribution is an algorithm that simultaneously uses both item and user structures, proved to be near-optimal via corresponding information-theoretic lower bounds. In particular, our analysis highlights the sub-optimality of using only one of item or user structure (as is done in most collaborative filtering algorithms).
Submitted 28 April, 2025;
originally announced April 2025.
-
Score Design for Multi-Criteria Incentivization
Authors:
Anmol Kabra,
Mina Karzand,
Tosca Lechner,
Nathan Srebro,
Serena Wang
Abstract:
We present a framework for designing scores to summarize performance metrics. Our design has two multi-criteria objectives: (1) improving on scores should improve all performance metrics, and (2) achieving Pareto-optimal scores should achieve Pareto-optimal metrics. We formulate our design to minimize the dimensionality of scores while satisfying the objectives. We give algorithms to design scores, which are provably minimal under mild assumptions on the structure of performance metrics. This framework draws motivation from real-world practices in hospital rating systems, where misaligned scores and performance metrics lead to unintended consequences.
Submitted 8 October, 2024;
originally announced October 2024.
-
MaxiMin Active Learning in Overparameterized Model Classes
Authors:
Mina Karzand,
Robert D. Nowak
Abstract:
Generating labeled training datasets has become a major bottleneck in Machine Learning (ML) pipelines. Active ML aims to address this issue by designing learning algorithms that automatically and adaptively select the most informative examples for labeling so that human time is not wasted labeling irrelevant, redundant, or trivial examples. This paper proposes a new approach to active ML with nonparametric or overparameterized models such as kernel methods and neural networks. In the context of binary classification, the new approach is shown to possess a variety of desirable properties that allow active learning algorithms to automatically and efficiently identify decision boundaries and data clusters.
Submitted 28 April, 2020; v1 submitted 29 May, 2019;
originally announced May 2019.
-
Regret Bounds and Regimes of Optimality for User-User and Item-Item Collaborative Filtering
Authors:
Guy Bresler,
Mina Karzand
Abstract:
We consider an online model for recommendation systems, with each user being recommended an item at each time-step and providing 'like' or 'dislike' feedback. Each user may be recommended a given item at most once. A latent variable model specifies the user preferences: both users and items are clustered into types. All users of a given type have identical preferences for the items, and similarly, items of a given type are either all liked or all disliked by a given user. We assume that the matrix encoding the preferences of each user type for each item type is randomly generated; in this way, the model captures structure in both the item and user spaces, the amount of structure depending on the number of each of the types. The measure of performance of the recommendation system is the expected number of disliked recommendations per user, defined as expected regret. We propose two algorithms inspired by user-user and item-item collaborative filtering (CF), modified to explicitly make exploratory recommendations, and prove performance guarantees in terms of their expected regret. For two regimes of model parameters, with structure only in item space or only in user space, we prove information-theoretic lower bounds on regret that match our upper bounds up to logarithmic factors. Our analysis elucidates system operating regimes in which existing CF algorithms are nearly optimal.
Submitted 7 May, 2019; v1 submitted 6 November, 2017;
originally announced November 2017.
-
Learning a Tree-Structured Ising Model in Order to Make Predictions
Authors:
Guy Bresler,
Mina Karzand
Abstract:
We study the problem of learning a tree Ising model from samples such that subsequent predictions made using the model are accurate. The prediction task considered in this paper is that of predicting the values of a subset of variables given values of some other subset of variables. Virtually all previous work on graphical model learning has focused on recovering the true underlying graph. We define a distance ("small set TV" or ssTV) between distributions $P$ and $Q$ by taking the maximum, over all subsets $\mathcal{S}$ of a given size, of the total variation between the marginals of $P$ and $Q$ on $\mathcal{S}$; this distance captures the accuracy of the prediction task of interest. We derive non-asymptotic bounds on the number of samples needed to get a distribution (from the same class) with small ssTV relative to the one generating the samples. One of the main messages of this paper is that far fewer samples are needed than for recovering the underlying tree, which means that accurate predictions are possible using the wrong tree.
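The ssTV distance described in words above can be written out as a formula. This is a sketch of the stated definition rather than the paper's exact notation: the symbol $k$ for the subset size, the subscript on $\mathrm{ssTV}$, and the marginal notation $P_{\mathcal{S}}$ are assumptions introduced here for illustration.

```latex
% k (subset size) and the marginal notation P_S are assumed symbols.
\[
  \mathrm{ssTV}_k(P, Q)
    \;=\; \max_{\mathcal{S} \,:\, |\mathcal{S}| = k}
          d_{\mathrm{TV}}\bigl(P_{\mathcal{S}},\, Q_{\mathcal{S}}\bigr),
  \qquad
  d_{\mathrm{TV}}\bigl(P_{\mathcal{S}},\, Q_{\mathcal{S}}\bigr)
    \;=\; \frac{1}{2} \sum_{x_{\mathcal{S}} \in \{-1,+1\}^{k}}
          \bigl|\, P_{\mathcal{S}}(x_{\mathcal{S}}) - Q_{\mathcal{S}}(x_{\mathcal{S}}) \,\bigr| .
\]
```

Here $P_{\mathcal{S}}$ and $Q_{\mathcal{S}}$ denote the marginals of $P$ and $Q$ on the variables in $\mathcal{S}$, and the sum runs over the $2^k$ spin configurations of those variables.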
Submitted 14 June, 2018; v1 submitted 22 April, 2016;
originally announced April 2016.
-
Proportional Fair Rate Allocation for Private Shared Networks
Authors:
Saman Feghhi,
Douglas J. Leith,
Mohammad Karzand
Abstract:
In this paper, we consider fair privacy in a shared network subject to traffic analysis attacks by an eavesdropper. We initiate the study of the joint trade-off between privacy, throughput and delay in such a shared network as a utility fairness problem and derive the proportional fair rate allocation for networks of flows subject to privacy constraints and delay deadlines.
Submitted 4 March, 2016;
originally announced March 2016.
-
FEC for Lower In-Order Delivery Delay in Packet Networks
Authors:
Mohammad Karzand,
Douglas J. Leith,
Jason Cloud,
Muriel Medard
Abstract:
We consider use of FEC to reduce in-order delivery delay over packet erasure channels. We propose a class of streaming codes that is capacity achieving and provides a superior throughput-delay trade-off compared to block codes by introducing flexibility in where and when redundancy is placed. This flexibility results in significantly lower in-order delay for a given throughput for a wide range of network scenarios. Furthermore, a major contribution of this paper is the combination of queuing and coding theory to analyze the code's performance. Finally, we present simulation and experimental results illustrating the code's benefits.
Submitted 2 September, 2016; v1 submitted 1 September, 2015;
originally announced September 2015.
-
Low Delay Random Linear Coding and Scheduling Over Multiple Interfaces
Authors:
Andres Garcia-Saavedra,
Mohammad Karzand,
Douglas J. Leith
Abstract:
Multipath transport protocols like MPTCP transfer data across multiple routes in parallel and deliver it in order at the receiver. When the delay on one or more of the paths is variable, as is commonly the case, out-of-order arrivals are frequent and head-of-line blocking leads to high latency. This is exacerbated when packet loss, which is also common with wireless links, is tackled using ARQ. This paper introduces Stochastic Earliest Delivery Path First (S-EDPF), a resilient low-delay packet scheduler for multipath transport protocols. S-EDPF takes explicit account of the stochastic nature of paths and uses this to minimise in-order delivery delay. S-EDPF also takes account of FEC, jointly scheduling transmission of information and coded packets, and in this way allows lossy links to reduce delay and improve resiliency, rather than degrading performance as usually occurs with existing multipath systems. We implement S-EDPF as a multi-platform application that requires neither administrator privileges nor modifications to the operating system and has negligible impact on energy consumption. We present a thorough experimental evaluation both in controlled environments and in the wild, revealing dramatic gains in delay performance compared to existing approaches.
Submitted 30 July, 2015;
originally announced July 2015.
-
Communication Strategies for Low-Latency Trading
Authors:
Mina Karzand,
Lav R. Varshney
Abstract:
The possibility of latency arbitrage in financial markets has led to the deployment of high-speed communication links between distant financial centers. These links are noisy and so there is a need for coding. In this paper, we develop a game-theoretic model of trading behavior where two traders compete to capture latency arbitrage opportunities using binary signalling. Different coding schemes are strategies that trade off between reliability and latency. When one trader has a better channel, the second trader should not compete. With statistically identical channels, we find there are two different regimes of channel noise for which: there is a unique Nash equilibrium yielding ties; and there are two Nash equilibria with different winners.
Submitted 27 April, 2015;
originally announced April 2015.
-
Achievable Degrees of Freedom in MIMO Correlatively Changing Fading Channels
Authors:
Mina Karzand,
Lizhong Zheng
Abstract:
The relationship between the transmitted signal and the noiseless received signals in correlatively changing fading channels is modeled as a nonlinear mapping over manifolds of different dimensions. A dimension-counting argument claims that the dimensionality of the neighborhood in which this mapping is bijective with probability one is achievable as the degrees of freedom of the system. We call the degrees of freedom achieved by the nonlinear decoding methods the nonlinear degrees of freedom.
Submitted 25 January, 2014;
originally announced January 2014.
-
Achievability of Nonlinear Degrees of Freedom in Correlatively Changing Fading Channels
Authors:
Mina Karzand,
Lizhong Zheng
Abstract:
A new approach to noncoherent communication over time-varying fading channels is presented. In this approach, the relationship between the input signal space and the output signal space of a correlatively changing fading channel is shown to be a nonlinear mapping between manifolds of different dimensions. By studying this mapping, it is shown that nonlinear decoding algorithms for single-input multiple-output (SIMO) and multiple-input multiple-output (MIMO) systems make additional degrees of freedom (DOF) available. We call them the nonlinear degrees of freedom.
Submitted 9 January, 2014;
originally announced January 2014.