Parallel Token Swapping for Qubit Routing

Authors: Ishan Bansal, Oktay Günlük, Richard Shapley

Abstract: In this paper we study a combinatorial reconfiguration problem that involves finding an optimal sequence of swaps to move an initial configuration of tokens that are placed on the vertices of a graph to a final desired one. This problem arises as a crucial step in reducing the depth of a quantum circuit when compiling a quantum algorithm. We provide the first known constant factor approximation algorithms for the parallel token swapping problem on graph topologies that are commonly found in modern quantum computers, including cycle graphs, subdivided star graphs, and grid graphs. We also study the so-called stretch factor of a natural lower bound to the problem, which has been shown to be useful when designing heuristics for the qubit routing problem. Finally, we study the colored version of this reconfiguration problem where some tokens share the same color and are considered indistinguishable.

Submitted 27 November, 2024; originally announced November 2024.

arXiv:2409.02963 [pdf, other]

Fair Minimum Representation Clustering via Integer Programming

Authors: Connor Lawless, Oktay Gunluk

Abstract: Clustering is an unsupervised learning task that aims to partition data into a set of clusters. In many applications, these clusters correspond to real-world constructs (e.g., electoral districts, playlists, TV channels) whose benefit can only be attained by groups when they reach a minimum level of representation (e.g., 50\% to elect their desired candidate). In this paper, we study the k-means and k-medians clustering problems with the additional constraint that each group (e.g., demographic group) must have a minimum level of representation in at least a given number of clusters. We formulate the problem through a mixed-integer optimization framework and present an alternating minimization algorithm, called MiniReL, that directly incorporates the fairness constraints. While incorporating the fairness criteria leads to an NP-Hard assignment problem within the algorithm, we provide computational approaches that make the algorithm practical even for large datasets. Numerical results show that the approach is able to create fairer clusters with practically no increase in the clustering cost across standard benchmark datasets.

Submitted 3 September, 2024; originally announced September 2024.

Comments: arXiv admin note: text overlap with arXiv:2302.03151

arXiv:2401.10738 [pdf, ps, other]

Warehouse Problem with Multiple Vendors and Generalized Complementarity Constraints

Authors: Ishan Bansal, Oktay Günlük

Abstract: We study the warehouse problem, arising in the area of inventory management and production planning. Here, a merchant wants to decide an optimal trading policy that computes quantities of a single commodity to purchase, store and sell during each time period of a finite discrete time horizon. Motivated by recent applications in energy markets, we extend the models by Wolsey and Yaman (2018) and Bansal and Günlük (2023) and consider markets with multiple vendors and a more general form of the complementarity constraints. We show that these extensions can capture various practical conditions such as surge pricing and discounted sales, ramp-up and ramp-down constraints and batch pricing. We analyze the extreme points of the underlying non-linear integer program and provide an algorithm that exactly solves the problem. Our algorithm runs in polynomial time under reasonable practical conditions. We also show that the absence of such conditions renders the problem NP-Hard.

Submitted 19 January, 2024; originally announced January 2024.

arXiv:2302.12136 [pdf, ps, other]

Warehouse Problem with Bounds, Fixed Costs and Complementarity Constraints

Authors: Ishan Bansal, Oktay Günlük

Abstract: This paper studies an open question in the warehouse problem where a merchant trading a commodity tries to find an optimal inventory-trading policy to decide on purchase and sale quantities during a fixed time horizon in order to maximize their total pay-off, making use of fluctuations in sale and cost prices. We provide the first known polynomial-time algorithms for the case when there are fixed costs for purchases and sales, optional complementarity constraints that prohibit purchasing and selling during the same time period, and bounds on purchase and sales quantities. We do so by providing an exact characterization of the extreme points of the feasible region and using this to construct a suitable network where a min-cost flow computation provides an optimal solution. We are also able to provide polynomial extended linear formulations for the original feasible regions. Our methods build on the work by Wolsey and Yaman (Discrete Optimization 2018). We also consider the problem without fixed costs and provide a fully polynomial time approximation scheme in a setting with time-dependent bounds.

Submitted 23 February, 2023; originally announced February 2023.

Comments: Version 1 of full paper

arXiv:2302.03151 [pdf, other]

Fair Minimum Representation Clustering

Authors: Connor Lawless, Oktay Gunluk

Abstract: Clustering is an unsupervised learning task that aims to partition data into a set of clusters. In many applications, these clusters correspond to real-world constructs (e.g. electoral districts) whose benefit can only be attained by groups when they reach a minimum level of representation (e.g. 50\% to elect their desired candidate). This paper considers the problem of performing k-means clustering while ensuring groups (e.g. demographic groups) have that minimum level of representation in a specified number of clusters. We show that the popular $k$-means algorithm, Lloyd's algorithm, can result in unfair outcomes where certain groups lack sufficient representation past the minimum threshold in a proportional number of clusters. We formulate the problem through a mixed-integer optimization framework and present a variant of Lloyd's algorithm, called MiniReL, that directly incorporates the fairness constraints. We show that incorporating the fairness criteria leads to an NP-Hard sub-problem within Lloyd's algorithm, but we provide computational approaches that make the problem tractable for even large datasets. Numerical results show that the approach is able to create fairer clusters with practically no increase in the k-means clustering cost across standard benchmark datasets.

Submitted 8 February, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

arXiv:2211.03997 [pdf, other]

Online Decision Making with Nonconvex Local and Convex Global Constraints

Authors: Rui Chen, Oktay Gunluk, Andrea Lodi, Guanyi Wang

Abstract: We study the online decision making problem (ODMP) as a natural generalization of online linear programming. In ODMP, a single decision maker undertakes a sequence of decisions over $T$ time steps. At each time step, the decision maker makes a locally feasible decision based on information available up to that point. The objective is to maximize the accumulated reward while satisfying some convex global constraints called goal constraints. The decision made at each step results in an $m$-dimensional vector that represents the contribution of this local decision to the goal constraints. In the online setting, these goal constraints are soft constraints that can be violated moderately. To handle potential nonconvexity and nonlinearity in ODMP, we propose a Fenchel dual-based online algorithm. At each time step, the algorithm requires solving a potentially nonconvex optimization problem over the local feasible set and a convex optimization problem over the goal set. Under certain stochastic input models, we show that the algorithm achieves $O(\sqrt{mT})$ goal constraint violation deterministically, and $\tilde{O}(\sqrt{mT})$ regret in expected reward. Numerical experiments on an online knapsack problem and an assortment optimization problem are conducted to demonstrate the potential of our proposed online algorithm.

Submitted 28 June, 2024; v1 submitted 7 November, 2022; originally announced November 2022.

arXiv:2210.08798 [pdf, other]

Cluster Explanation via Polyhedral Descriptions

Authors: Connor Lawless, Oktay Gunluk

Abstract: Clustering is an unsupervised learning problem that aims to partition unlabelled data points into groups with similar features. Traditional clustering algorithms provide limited insight into the groups they find as their main focus is accuracy and not the interpretability of the group assignments. This has spurred a recent line of work on explainable machine learning for clustering. In this paper we focus on the cluster description problem where, given a dataset and its partition into clusters, the task is to explain the clusters. We introduce a new approach to explain clusters by constructing polyhedra around each cluster while minimizing either the complexity of the resulting polyhedra or the number of features used in the description. We formulate the cluster description problem as an integer program and present a column generation approach to search over an exponential number of candidate half-spaces that can be used to build the polyhedra. To deal with large datasets, we introduce a novel grouping scheme that first forms smaller groups of data points and then builds the polyhedra around the grouped data, a strategy which out-performs simply sub-sampling data. Compared to state of the art cluster description algorithms, our approach is able to achieve competitive interpretability with improved description accuracy.

Submitted 17 October, 2022; originally announced October 2022.

arXiv:2111.08466 [pdf, other]

Interpretable and Fair Boolean Rule Sets via Column Generation

Authors: Connor Lawless, Sanjeeb Dash, Oktay Gunluk, Dennis Wei

Abstract: This paper considers the learning of Boolean rules in disjunctive normal form (DNF, OR-of-ANDs, equivalent to decision rule sets) as an interpretable model for classification. An integer program is formulated to optimally trade classification accuracy for rule simplicity. We also consider the fairness setting and extend the formulation to include explicit constraints on two different measures of classification parity: equality of opportunity and equalized odds. Column generation (CG) is used to efficiently search over an exponential number of candidate rules without the need for heuristic rule mining. To handle large data sets, we propose an approximate CG algorithm using randomization. Compared to three recently proposed alternatives, the CG algorithm dominates the accuracy-simplicity trade-off in 8 out of 16 data sets. When maximized for accuracy, CG is competitive with rule learners designed for this purpose, sometimes finding significantly simpler solutions that are no less accurate. Compared to other fair and interpretable classifiers, our method is able to find rule sets that meet stricter notions of fairness with a modest trade-off in accuracy.

Submitted 18 September, 2023; v1 submitted 16 November, 2021; originally announced November 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:2107.01325, arXiv:1805.09901

Journal ref: Journal of Machine Learning Research 2023 Volume 24, Number 229, Pages 1-50

arXiv:2107.01325 [pdf, other]

Fair Decision Rules for Binary Classification

Authors: Connor Lawless, Oktay Gunluk

Abstract: In recent years, machine learning has begun automating decision making in fields as varied as college admissions, credit lending, and criminal sentencing. The socially sensitive nature of some of these applications together with increasing regulatory constraints has created the need for algorithms that are both fair and interpretable. In this paper we consider the problem of building Boolean rule sets in disjunctive normal form (DNF), an interpretable model for binary classification, subject to fairness constraints. We formulate the problem as an integer program that maximizes classification accuracy with explicit constraints on two different measures of classification parity: equality of opportunity and equalized odds. A column generation framework, with a novel formulation, is used to efficiently search over exponentially many possible rules. When combined with faster heuristics, our method can deal with large data-sets. Compared to other fair and interpretable classifiers, our method is able to find rule sets that meet stricter notions of fairness with a modest trade-off in accuracy.

Submitted 2 July, 2021; originally announced July 2021.

arXiv:2106.13434 [pdf, other]

Binary Matrix Factorisation and Completion via Integer Programming

Authors: Reka A. Kovacs, Oktay Gunluk, Raphael A. Hauser

Abstract: Binary matrix factorisation is an essential tool for identifying discrete patterns in binary data. In this paper we consider the rank-k binary matrix factorisation problem (k-BMF) under Boolean arithmetic: we are given an n x m binary matrix X with possibly missing entries and need to find two binary matrices A and B of dimension n x k and k x m respectively, which minimise the distance between X and the Boolean product of A and B in the squared Frobenius distance. We present a compact and two exponential size integer programs (IPs) for k-BMF and show that the compact IP has a weak LP relaxation, while the exponential size IPs have a stronger equivalent LP relaxation. We introduce a new objective function, which differs from the traditional squared Frobenius objective in attributing a weight to zero entries of the input matrix that is proportional to the number of times the zero is erroneously covered in a rank-k factorisation. For one of the exponential size IPs we describe a computational approach based on column generation. Experimental results on synthetic and real-world datasets suggest that our integer programming approach is competitive against available methods for k-BMF and provides accurate low-error factorisations.

Submitted 3 August, 2021; v1 submitted 25 June, 2021; originally announced June 2021.

arXiv:2011.04457 [pdf, other]

Binary Matrix Factorisation via Column Generation

Authors: Reka A. Kovacs, Oktay Gunluk, Raphael A. Hauser

Abstract: Identifying discrete patterns in binary data is an important dimensionality reduction tool in machine learning and data mining. In this paper, we consider the problem of low-rank binary matrix factorisation (BMF) under Boolean arithmetic. Due to the hardness of this problem, most previous attempts rely on heuristic techniques. We formulate the problem as a mixed integer linear program and use a large scale optimisation technique of column generation to solve it without the need of heuristic pattern mining. Our approach focuses on accuracy and on the provision of optimality guarantees. Experimental results on real world datasets demonstrate that our proposed method is effective at producing highly accurate factorisations and improves on the previously available best known results for 15 out of 24 problem instances.

Submitted 3 August, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

Comments: final version as published by AAAI2021, plus including Appendix

arXiv:2006.14084 [pdf, other]

Multilabel Classification by Hierarchical Partitioning and Data-dependent Grouping

Authors: Shashanka Ubaru, Sanjeeb Dash, Arya Mazumdar, Oktay Gunluk

Abstract: In modern multilabel classification problems, each data instance belongs to a small number of classes from a large set of classes. In other words, these problems involve learning very sparse binary label vectors. Moreover, in large-scale problems, the labels typically have certain (unknown) hierarchy. In this paper we exploit the sparsity of label vectors and the hierarchical structure to embed them in low-dimensional space using label groupings. Consequently, we solve the classification problem in a much lower dimensional space and then obtain labels in the original space using an appropriately defined lifting. Our method builds on the work of (Ubaru & Mazumdar, 2017), where the idea of group testing was also explored for multilabel classification. We first present a novel data-dependent grouping approach, where we use a group construction based on a low-rank Nonnegative Matrix Factorization (NMF) of the label matrix of training instances. The construction also allows us, using recent results, to develop a fast prediction algorithm that has a logarithmic runtime in the number of labels. We then present a hierarchical partitioning approach that exploits the label hierarchy in large scale problems to divide up the large label space and create smaller sub-problems, which can then be solved independently via the grouping approach. Numerical results on many benchmark datasets illustrate that, compared to other popular methods, our proposed methods achieve competitive accuracy with significantly lower computational costs.

Submitted 31 October, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

Journal ref: Neural Information Processing Systems (NeurIPS), 2020

arXiv:1906.01761 [pdf, other]

Generalized Linear Rule Models

Authors: Dennis Wei, Sanjeeb Dash, Tian Gao, Oktay Günlük

Abstract: This paper considers generalized linear models using rule-based features, also referred to as rule ensembles, for regression and probabilistic classification. Rules facilitate model interpretation while also capturing nonlinear dependences and interactions. Our problem formulation accordingly trades off rule set complexity and prediction accuracy. Column generation is used to optimize over an exponentially large space of rules without pre-generating a large subset of candidates or greedily boosting rules one by one. The column generation subproblem is solved using either integer programming or a heuristic optimizing the same objective. In experiments involving logistic and linear regression, the proposed methods obtain better accuracy-complexity trade-offs than existing rule ensemble algorithms. At one end of the trade-off, the methods are competitive with less interpretable benchmark models.

Submitted 4 June, 2019; originally announced June 2019.

Comments: Published in the Proceedings of the 36th International Conference on Machine Learning (ICML), PMLR 97:6687-6696, 2019. 17 pages, 7 figures

arXiv:1805.09901 [pdf, other]

Boolean Decision Rules via Column Generation

Authors: Sanjeeb Dash, Oktay Günlük, Dennis Wei

Abstract: This paper considers the learning of Boolean rules in either disjunctive normal form (DNF, OR-of-ANDs, equivalent to decision rule sets) or conjunctive normal form (CNF, AND-of-ORs) as an interpretable model for classification. An integer program is formulated to optimally trade classification accuracy for rule simplicity. Column generation (CG) is used to efficiently search over an exponential number of candidate clauses (conjunctions or disjunctions) without the need for heuristic rule mining. This approach also bounds the gap between the selected rule set and the best possible rule set on the training data. To handle large datasets, we propose an approximate CG algorithm using randomization. Compared to three recently proposed alternatives, the CG algorithm dominates the accuracy-simplicity trade-off in 7 out of 15 datasets. When maximized for accuracy, CG is competitive with rule learners designed for this purpose, sometimes finding significantly simpler solutions that are no less accurate.

Submitted 5 August, 2020; v1 submitted 24 May, 2018; originally announced May 2018.

arXiv:1805.03682 [pdf, other]

Robust-to-Dynamics Optimization

Authors: Amir Ali Ahmadi, Oktay Gunluk

Abstract: A robust-to-dynamics optimization (RDO) problem is an optimization problem specified by two pieces of input: (i) a mathematical program (an objective function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ and a feasible set $\Omega\subseteq\mathbb{R}^n$), and (ii) a dynamical system (a map $g:\mathbb{R}^n\rightarrow\mathbb{R}^n$). Its goal is to minimize $f$ over the set $\mathcal{S}\subseteq\Omega$ of initial conditions that forever remain in $\Omega$ under $g$. The focus of this paper is on the case where the mathematical program is a linear program and the dynamical system is either a known linear map, or an uncertain linear map that can change over time. In both cases, we study a converging sequence of polyhedral outer approximations and (lifted) spectrahedral inner approximations to $\mathcal{S}$. Our inner approximations are optimized with respect to the objective function $f$ and their semidefinite characterization -- which has a semidefinite constraint of fixed size -- is obtained by applying polar duality to convex sets that are invariant under (multiple) linear maps. We characterize three barriers that can stop convergence of the outer approximations from being finite. We prove that once these barriers are removed, our inner and outer approximating procedures find an optimal solution and a certificate of optimality for the RDO problem in a finite number of steps. Moreover, in the case where the dynamics are linear, we show that this phenomenon occurs in a number of steps that can be computed in time polynomial in the bit size of the input data. Our analysis also leads to a polynomial-time algorithm for RDO instances where the spectral radius of the linear map is bounded above by any constant less than one. Finally, in our concluding section, we propose a broader research agenda for studying optimization problems with dynamical systems constraints, of which RDO is a special case.

Submitted 22 November, 2023; v1 submitted 9 May, 2018; originally announced May 2018.

Comments: Major revision

arXiv:1803.04825 [pdf, other]

Low-Rank Boolean Matrix Approximation by Integer Programming

Authors: Reka Kovacs, Oktay Gunluk, Raphael Hauser

Abstract: Low-rank approximations of data matrices are an important dimensionality reduction tool in machine learning and regression analysis. We consider the case of categorical variables, where it can be formulated as the problem of finding low-rank approximations to Boolean matrices. In this paper we give what is to the best of our knowledge the first integer programming formulation that relies on only polynomially many variables and constraints, we discuss how to solve it computationally and report numerical tests on synthetic and real-world data.

Submitted 13 March, 2018; originally announced March 2018.

arXiv:1612.03225 [pdf, ps, other]

Optimal Generalized Decision Trees via Integer Programming

Authors: Oktay Gunluk, Jayant Kalagnanam, Minhan Li, Matt Menickelly, Katya Scheinberg

Abstract: Decision trees have been a very popular class of predictive models for decades due to their interpretability and good performance on categorical features. However, they are not always robust and tend to overfit the data. Additionally, if allowed to grow large, they lose interpretability. In this paper, we present a mixed integer programming formulation to construct optimal decision trees of a prespecified size. We take the special structure of categorical features into account and allow combinatorial decisions (based on subsets of values of features) at each node. Our approach can also handle numerical features via thresholding. We show that very good accuracy can be achieved with small trees using moderately-sized training sets. The optimization problems we solve are tractable with modern solvers.

Submitted 13 August, 2019; v1 submitted 9 December, 2016; originally announced December 2016.

MSC Class: 90C10

Showing 1–17 of 17 results for author: Gunluk, O