-
Approximating Fair $k$-Min-Sum-Radii in Euclidean Space
Authors:
Lukas Drexler,
Annika Hennes,
Abhiruk Lahiri,
Melanie Schmidt,
Julian Wargalla
Abstract:
The $k$-center problem is a classical clustering problem in which one is asked to find a partitioning of a point set $P$ into $k$ clusters such that the maximum radius of any cluster is minimized. It is well-studied. But what if we add up the radii of the clusters instead of only considering the cluster with maximum radius? This natural variant is called the $k$-min-sum-radii problem. It has become the subject of more and more interest in recent years, inspiring the development of approximation algorithms for the $k$-min-sum-radii problem in its plain version as well as in constrained settings.
We study the problem for Euclidean spaces $\mathbb{R}^d$ of arbitrary dimension but assume the number $k$ of clusters to be constant. In this case, a PTAS for the problem is known (see Bandyapadhyay, Lochet and Saurabh, SoCG, 2023). Our aim is to extend the knowledge base for $k$-min-sum-radii to the domain of fair clustering. We study several group fairness constraints, such as the one introduced by Chierichetti et al. (NeurIPS, 2017). In this model, input points have an additional attribute (e.g., colors such as red and blue), and clusters have to preserve the ratio between different attribute values (e.g., have the same fraction of red and blue points as the ground set). Different variants of this general idea have been studied in the literature. To the best of our knowledge, no approximative results for the fair $k$-min-sum-radii problem are known, despite the immense amount of work on the related fair $k$-center problem.
We propose a PTAS for the fair $k$-min-sum-radii problem in Euclidean spaces of arbitrary dimension for the case of constant $k$. To the best of our knowledge, this is the first PTAS for the problem. It works for different notions of group fairness.
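As an illustration of the objectives named in the abstract, the following minimal Python sketch evaluates the $k$-center cost (maximum radius) and the $k$-min-sum-radii cost (sum of radii) of a given clustering, and checks one simple fairness notion (each cluster has exactly the same color proportions as the ground set). The function names, the assumption that centers are supplied explicitly, and the choice of exact balance are illustrative assumptions, not the paper's algorithm or its exact definitions.

```python
import math
from collections import Counter


def radius(center, cluster):
    """Largest Euclidean distance from the center to any point of the cluster."""
    return max(math.dist(center, p) for p in cluster)


def k_center_cost(centers, clusters):
    """k-center objective: the maximum cluster radius."""
    return max(radius(c, cl) for c, cl in zip(centers, clusters))


def k_min_sum_radii_cost(centers, clusters):
    """k-min-sum-radii objective: the sum of all cluster radii."""
    return sum(radius(c, cl) for c, cl in zip(centers, clusters))


def is_exactly_balanced(cluster_colors, all_colors):
    """True if the cluster has the same color proportions as the ground set.

    Assumption: 'fair' means exact ratio preservation; the paper also
    considers other group fairness notions.
    """
    total = Counter(all_colors)
    local = Counter(cluster_colors)
    n, m = len(all_colors), len(cluster_colors)
    # Compare fractions by cross-multiplication to avoid floating-point error.
    return all(local[col] * n == total[col] * m for col in total)


# Hypothetical toy instance in the plane with k = 2.
clusters = [[(0, 0), (2, 0)], [(10, 0), (10, 6)]]
centers = [(1, 0), (10, 3)]
max_radius = k_center_cost(centers, clusters)
sum_radii = k_min_sum_radii_cost(centers, clusters)
```

On this toy instance the two clusters have radii 1 and 3, so the two objectives differ: the first only sees the larger cluster, while the second charges for both.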
Submitted 30 September, 2024; v1 submitted 2 September, 2023;
originally announced September 2023.
-
Connected k-Center and k-Diameter Clustering
Authors:
Lukas Drexler,
Jan Eube,
Kelin Luo,
Dorian Reineccius,
Heiko Röglin,
Melanie Schmidt,
Julian Wargalla
Abstract:
Motivated by an application from geodesy, we introduce a novel clustering problem which is a $k$-center (or $k$-diameter) problem with a side constraint. For the side constraint, we are given an undirected connectivity graph $G$ on the input points, and a clustering is now only feasible if every cluster induces a connected subgraph in $G$. We call the resulting problems the connected $k$-center problem and the connected $k$-diameter problem.
We prove several results on the complexity and approximability of these problems. Our main result is an $O(\log^2{k})$-approximation algorithm for the connected $k$-center and the connected $k$-diameter problem. For Euclidean metrics and metrics with constant doubling dimension, the approximation factor of this algorithm improves to $O(1)$. We also consider the special cases that the connectivity graph is a line or a tree. For the line we give optimal polynomial-time algorithms and for the case that the connectivity graph is a tree, we either give an optimal polynomial-time algorithm or a $2$-approximation algorithm for all variants of our model. We complement our upper bounds by several lower bounds.
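The feasibility condition from the abstract (every cluster must induce a connected subgraph of $G$) can be checked with a breadth-first search restricted to the cluster. The sketch below is only a feasibility test under assumed conventions (the graph is an adjacency dictionary, empty clusters are rejected); it is not the approximation algorithm from the paper.

```python
from collections import deque


def induces_connected_subgraph(cluster, adjacency):
    """True if the vertices in `cluster` induce a connected subgraph."""
    members = set(cluster)
    if not members:
        return False  # assumption: empty clusters are not allowed
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            # Only walk along edges whose endpoints both lie in the cluster.
            if v in members and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == members


def is_feasible_connected_clustering(clusters, adjacency):
    """A clustering is feasible if every cluster is connected in the graph."""
    return all(induces_connected_subgraph(c, adjacency) for c in clusters)


# Hypothetical connectivity graph: the path a - b - c - d.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
contiguous = is_feasible_connected_clustering([["a", "b"], ["c", "d"]], path_graph)
interleaved = is_feasible_connected_clustering([["a", "c"], ["b", "d"]], path_graph)
```

On a line graph, only clusterings into contiguous segments pass this test, which matches the special case of a line mentioned in the abstract.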
Submitted 18 October, 2023; v1 submitted 3 November, 2022;
originally announced November 2022.
-
Coresets for constrained k-median and k-means clustering in low dimensional Euclidean space
Authors:
Melanie Schmidt,
Julian Wargalla
Abstract:
We study (Euclidean) $k$-median and $k$-means with constraints in the streaming model.
There have been recent efforts to design unified algorithms to solve constrained $k$-means problems without using knowledge of the specific constraint at hand aside from mild assumptions like the polynomial computability of feasibility under the constraint (compute if a clustering satisfies the constraint) or the presence of an efficient assignment oracle (given a set of centers, produce an optimal assignment of points to the centers which satisfies the constraint). These algorithms have a running time exponential in $k$, but can be applied to a wide range of constraints.
We demonstrate that a technique proposed in 2019 for solving a specific constrained streaming $k$-means problem, namely fair $k$-means clustering, actually implies streaming algorithms for all these constraints. These work for low dimensional Euclidean space. [Note that there are more algorithms for streaming fair $k$-means today, in particular they exist for high dimensional spaces now as well.]
Submitted 14 June, 2021;
originally announced June 2021.
-
Disjoint Shortest Paths with Congestion on DAGs
Authors:
Saeed Akhoondian Amiri,
Julian Wargalla
Abstract:
In the $k$-Disjoint Shortest Paths problem, a set of terminal pairs of vertices $\{(s_i,t_i)\mid 1\le i\le k\}$ is given and we are asked to find paths $P_1,\ldots,P_k$ such that each path $P_i$ is a shortest path from $s_i$ to $t_i$ and every vertex of the graph routes at most one of them. We introduce a generalization of the problem, namely, $k$-Disjoint Shortest Paths with Congestion-$c$ where every vertex is allowed to route up to $c$ paths.
We provide a simple algorithm to solve the problem in time $f(k) n^{O(k-c)}$ on DAGs. Using the techniques for DAGs, we show the problem is solvable in time $f(k) n^{O(k)}$ on general undirected graphs. Our algorithm for DAGs is based on the earlier algorithm for $k$-Disjoint Paths with Congestion-$c$ [IPL2019], but we significantly simplify their argument.
Then we prove that it is not possible to improve the algorithm significantly by showing that for every constant $c$ the problem is W[1]-hard w.r.t. parameter $k-c$. We also consider the problem on acyclic planar graphs, but this time we restrict ourselves to the edge-disjoint shortest paths problem. We show that even on acyclic planar graphs there is no $f(k)n^{o(k)}$ algorithm for the problem unless ETH fails.
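To make the problem definition concrete, the following Python sketch verifies a candidate solution to $k$-Disjoint Shortest Paths with Congestion-$c$: each path must be a shortest $s_i$-$t_i$ path, and no vertex may lie on more than $c$ of the paths. It assumes an unweighted directed graph given as an adjacency dictionary, and the function names are hypothetical. It only checks solutions and does not implement the algorithm from the paper.

```python
from collections import Counter, deque


def bfs_distances(adjacency, source):
    """Hop distances from `source` in an unweighted directed graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_valid_solution(adjacency, terminal_pairs, paths, c):
    """Check shortest-path and congestion-c conditions for the given paths."""
    if len(paths) != len(terminal_pairs):
        return False
    load = Counter()
    for (s, t), path in zip(terminal_pairs, paths):
        if not path or path[0] != s or path[-1] != t:
            return False
        # Every consecutive pair of vertices must be an edge of the graph.
        if any(v not in adjacency.get(u, ()) for u, v in zip(path, path[1:])):
            return False
        # The path must have exactly as many edges as the s-t distance.
        if bfs_distances(adjacency, s).get(t) != len(path) - 1:
            return False
        load.update(set(path))
    # Congestion: each vertex may route at most c of the paths.
    return all(count <= c for count in load.values())


# Hypothetical DAG in which both terminal pairs must pass through vertex m.
dag = {"s1": ["m"], "s2": ["m"], "m": ["t1", "t2"]}
pairs = [("s1", "t1"), ("s2", "t2")]
routes = [["s1", "m", "t1"], ["s2", "m", "t2"]]
ok_with_c1 = is_valid_solution(dag, pairs, routes, c=1)
ok_with_c2 = is_valid_solution(dag, pairs, routes, c=2)
```

In this toy instance the shared vertex `m` makes the routes infeasible for $c = 1$ (the classical disjoint setting) but feasible for $c = 2$.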
Submitted 7 July, 2021; v1 submitted 19 August, 2020;
originally announced August 2020.