-
Capacitated Fair-Range Clustering: Hardness and Approximation Algorithms
Authors:
Ameet Gadekar,
Suhas Thejaswi
Abstract:
Capacitated fair-range $k$-clustering generalizes classical $k$-clustering by incorporating both capacity constraints and demographic fairness. In this setting, each facility has a capacity limit and may belong to one or more demographic groups. The task is to select $k$ facilities as centers and assign each client to a center such that: ($a$) no center exceeds its capacity, ($b$) the number of centers selected from each group lies within specified lower and upper bounds (fair-range constraints), and ($c$) the clustering cost (e.g., $k$-median or $k$-means) is minimized.
Prior work by Thejaswi et al. (KDD 2022) showed that satisfying fair-range constraints is NP-hard, making the problem inapproximable to any polynomial factor. We strengthen this result by showing that inapproximability persists even when the fair-range constraints are trivially satisfiable, highlighting the intrinsic computational complexity of the clustering task itself. Assuming standard complexity conjectures, we show that no non-trivial approximation is possible without exhaustively enumerating all $k$-subsets of the facility set. Notably, our inapproximability results hold even on tree metrics and when the number of groups is logarithmic in the size of the facility set.
In light of these strong inapproximability results, we focus on a more practical setting where the number of groups is constant. In this regime, we design two approximation algorithms: ($i$) a polynomial-time algorithm achieving $O(\log k)$- and $O(\log^2 k)$-approximation for the $k$-median and $k$-means objectives, respectively, and ($ii$) a fixed-parameter tractable algorithm parameterized by $k$, achieving $(3+ε)$- and $(9 + ε)$-approximation, respectively. These results match the best-known approximation guarantees for capacitated clustering without fair-range constraints and resolve an open question posed by Zang et al. (NeurIPS 2024).
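As an illustration of constraints ($a$) and ($b$) in the abstract above, the following minimal sketch checks whether a candidate solution is feasible. It is not taken from the paper: the function name `is_feasible`, the dictionary-based data layout, and the toy instance are all assumptions made for this example, and the clustering cost in ($c$) is not computed.

```python
from collections import Counter

def is_feasible(centers, assignment, capacity, groups_of, lower, upper, k):
    """Check capacity and fair-range constraints for a candidate solution.

    centers    : list of selected facilities (hypothetical layout)
    assignment : dict mapping each client to a facility
    capacity   : dict facility -> maximum number of clients
    groups_of  : dict facility -> set of groups it belongs to
    lower/upper: dict group -> required range on the number of centers
    """
    chosen = set(centers)
    if len(chosen) != k:
        return False
    # Every client must be served by a selected center.
    if any(f not in chosen for f in assignment.values()):
        return False
    # (a) no center exceeds its capacity.
    load = Counter(assignment.values())
    if any(load[f] > capacity[f] for f in chosen):
        return False
    # (b) the number of centers per group lies in [lower, upper].
    count = Counter(g for f in chosen for g in groups_of[f])
    return all(lower[g] <= count[g] <= upper[g] for g in lower)

capacity = {"a": 2, "b": 1, "c": 2}
groups_of = {"a": {"red"}, "b": {"red", "blue"}, "c": {"blue"}}
lower = {"red": 1, "blue": 1}
upper = {"red": 1, "blue": 2}

print(is_feasible(["a", "c"], {1: "a", 2: "a", 3: "c"},
                  capacity, groups_of, lower, upper, k=2))
print(is_feasible(["a", "b"], {1: "a", 2: "a", 3: "b"},
                  capacity, groups_of, lower, upper, k=2))
```

In the second call both selected facilities belong to the group "red", which violates its upper bound of one center.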
Submitted 21 May, 2025;
originally announced May 2025.
-
Coreset Strikes Back: Improved Parameterized Approximation Schemes for (Constrained) k-Median/Means
Authors:
Sujoy Bhore,
Ameet Gadekar,
Tanmay Inamdar
Abstract:
Algorithmic scatter dimension is a notion of metric spaces introduced recently by Abbasi et al. (FOCS 2023), which unifies many well-known metric spaces, including continuous Euclidean space, bounded doubling space, planar and bounded treewidth metrics. Recently, Bourneuf and Pilipczuk (SODA 2025) showed that metrics induced by graphs from any fixed proper minor closed graph class have bounded scatter dimension. Abbasi et al. presented a unified approach to obtain EPASes (i.e., $(1+ε)$-approximations running in time FPT in $k$ and $ε$) for $k$-Clustering in metrics of bounded scatter dimension. However, a seemingly inherent limitation of their approach was that it could only handle clustering objectives where each point was assigned to the closest chosen center. They explicitly asked whether there exist EPASes for constrained $k$-Clustering in metrics of bounded scatter dimension.
We present a unified framework which yields EPASes for capacitated and fair $k$-Median/Means in metrics of bounded algorithmic scatter dimension. Our framework exploits coresets for such constrained clustering problems in a novel manner, and notably requires only coresets of size $(k\log n/ε)^{O(1)}$, which are usually constructible even in general metrics. Note that due to existing lower bounds it is impossible to obtain such an EPAS for Capacitated $k$-Center, thus essentially answering the complete spectrum of the question.
Our results on capacitated and fair $k$-Median/Means provide the first EPASes for these problems in broad families of metric spaces. Earlier such results were only known in continuous Euclidean spaces due to Cohen-Addad & Li (ICALP 2019), and Bandyapadhyay, Fomin & Simonov (ICALP 2021; JCSS 2024), respectively. Along the way, we obtain faster EPASes for uncapacitated $k$-Median/Means, improving upon the running time of the algorithm by Abbasi et al.
Submitted 25 April, 2025; v1 submitted 9 April, 2025;
originally announced April 2025.
-
Dimension-Free Parameterized Approximation Schemes for Hybrid Clustering
Authors:
Ameet Gadekar,
Tanmay Inamdar
Abstract:
Hybrid $k$-Clustering is a model of clustering that generalizes two of the most widely studied clustering objectives: $k$-Center and $k$-Median. In this model, given a set of $n$ points $P$, the goal is to find $k$ centers such that the sum of the $r$-distances of each point to its nearest center is minimized. The $r$-distance between two points $p$ and $q$ is defined as $\max\{d(p, q)-r, 0\}$ -- this represents the distance of $p$ to the boundary of the $r$-radius ball around $q$ if $p$ is outside the ball, and $0$ otherwise. This problem was recently introduced by Fomin et al. [APPROX 2024], who designed a $(1+\varepsilon, 1+\varepsilon)$-bicriteria approximation that runs in time $2^{(kd/\varepsilon)^{O(1)}} \cdot n^{O(1)}$ for inputs in $\mathbb{R}^d$; such a bicriteria solution uses balls of radius $(1+\varepsilon)r$ instead of $r$, and has a cost at most $1+\varepsilon$ times the cost of an optimal solution using balls of radius $r$.
In this paper we significantly improve upon this result by designing an approximation algorithm with the same bicriteria guarantee, but with running time that is FPT only in $k$ and $\varepsilon$ -- crucially, removing the exponential dependence on the dimension $d$. This resolves an open question posed in their paper. Our results extend further in several directions. First, our approximation scheme works in a broader class of metric spaces, including doubling spaces, minor-free, and bounded treewidth metrics. Secondly, our techniques yield similar bicriteria FPT-approximation schemes for other variants of Hybrid $k$-Clustering, e.g., when the objective features the sum of $z$-th powers of the $r$-distances. Finally, we also design a coreset for Hybrid $k$-Clustering in doubling spaces, answering another open question from the work of Fomin et al.
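The $r$-distance objective defined in the abstract above can be evaluated directly. The sketch below only computes the cost of a given set of centers in the Euclidean plane; it is not the approximation algorithm from the paper, and the names `r_distance` and `hybrid_cost` as well as the sample points are assumptions made for illustration.

```python
import math

def r_distance(p, q, r):
    """max{d(p, q) - r, 0}, with d taken to be the Euclidean distance."""
    return max(math.dist(p, q) - r, 0.0)

def hybrid_cost(points, centers, r):
    """Sum over all points of the r-distance to the nearest center."""
    return sum(min(r_distance(p, c, r) for c in centers) for p in points)

points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
centers = [(0.0, 0.0)]

print(hybrid_cost(points, centers, r=0.0))   # r = 0 is the plain k-Median cost
print(hybrid_cost(points, centers, r=5.0))   # points within distance 5 cost nothing
print(hybrid_cost(points, centers, r=10.0))  # every point is covered by the ball
```

With $r = 0$ the objective coincides with $k$-Median, while the smallest $r$ for which the cost drops to zero is the $k$-Center radius of the chosen centers.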
Submitted 7 January, 2025;
originally announced January 2025.
-
Fair Clustering for Data Summarization: Improved Approximation Algorithms and Complexity Insights
Authors:
Ameet Gadekar,
Aristides Gionis,
Suhas Thejaswi
Abstract:
Data summarization tasks are often modeled as $k$-clustering problems, where the goal is to choose $k$ data points, called cluster centers, that best represent the dataset by minimizing a clustering objective. A popular objective is to minimize the maximum distance between any data point and its nearest center, which is formalized as the $k$-center problem. While in some applications all data points can be chosen as centers, in the general setting, centers must be chosen from a predefined subset of points, referred to as facilities or suppliers; this is known as the $k$-supplier problem. In this work, we focus on fair data summarization modeled as the fair $k$-supplier problem, where data consists of several groups, and a minimum number of centers must be selected from each group while minimizing the $k$-supplier objective. The groups can be disjoint or overlapping, leading to two distinct problem variants, each with different computational complexity.
We present $3$-approximation algorithms for both variants, improving the previously known factor of $5$. For disjoint groups, our algorithm runs in polynomial time, while for overlapping groups, we present a fixed-parameter tractable algorithm, where the exponential runtime depends only on the number of groups and centers. We show that these approximation factors match the theoretical lower bounds, assuming standard complexity theory conjectures. Finally, using an open-source implementation, we demonstrate the scalability of our algorithms on large synthetic datasets and assess the price of fairness on real-world data, comparing solution quality with and without fairness constraints.
Submitted 16 October, 2024;
originally announced October 2024.
-
FPT approximations for Capacitated Sum of Radii and Diameters
Authors:
Arnold Filtser,
Ameet Gadekar
Abstract:
The Capacitated Sum of Radii problem involves partitioning a set of points $P$, where each point $p\in P$ has capacity $U_p$, into $k$ clusters that minimize the sum of cluster radii, such that the number of points in the cluster centered at point $p$ is at most $U_p$. We begin by showing that the problem is APX-hard, and that under gap-ETH there is no parameterized approximation scheme (FPT-AS). We then construct a $\approx5.83$-approximation algorithm in FPT time (improving a previous $\approx7.61$ approximation in FPT time). Our results also hold when the objective is a general monotone symmetric norm of radii. We also improve the approximation factors for the uniform capacity case, and for the closely related problem of Capacitated Sum of Diameters.
Submitted 8 September, 2024;
originally announced September 2024.
-
Diversity-aware clustering: Computational Complexity and Approximation Algorithms
Authors:
Suhas Thejaswi,
Ameet Gadekar,
Bruno Ordozgoiti,
Aristides Gionis
Abstract:
In this work, we study diversity-aware clustering problems where the data points are associated with multiple attributes resulting in intersecting groups. A clustering solution must ensure that the number of chosen cluster centers from each group lies within the range defined by a lower and an upper bound threshold for that group, while simultaneously minimizing the clustering objective, which can be either $k$-median, $k$-means or $k$-supplier. We study the computational complexity of the proposed problems, offering insights into their NP-hardness, polynomial-time inapproximability, and fixed-parameter intractability. We present parameterized approximation algorithms with approximation ratios $1+ \frac{2}{e} + ε\approx 1.736$, $1+\frac{8}{e} + ε\approx 3.943$, and $5$ for diversity-aware $k$-median, diversity-aware $k$-means and diversity-aware $k$-supplier, respectively. Assuming Gap-ETH, the approximation ratios are tight for the diversity-aware $k$-median and diversity-aware $k$-means problems. Our results imply the same approximation factors for their respective fair variants with disjoint groups -- fair $k$-median, fair $k$-means, and fair $k$-supplier -- with lower bound requirements.
Submitted 20 May, 2025; v1 submitted 10 January, 2024;
originally announced January 2024.
-
Independent set in $k$-Claw-Free Graphs: Conditional $χ$-boundedness and the Power of LP/SDP Relaxations
Authors:
Parinya Chalermsook,
Ameet Gadekar,
Kamyar Khodamoradi,
Joachim Spoerhase
Abstract:
This paper studies $k$-claw-free graphs, exploring the connection between an extremal combinatorics question and the power of a convex program in approximating the maximum-weight independent set in this graph class. For the extremal question, we consider a notion that we call \textit{conditional $χ$-boundedness} of a graph: Given a graph $G$ that is assumed to contain an independent set of a certain (constant) size, we are interested in upper bounding the chromatic number in terms of the clique number of $G$. This question, besides being interesting on its own, has algorithmic implications (which have been relatively neglected in the literature) on the performance of SDP relaxations in estimating the value of maximum-weight independent set.
For $k=3$, Chudnovsky and Seymour (JCTB 2010) prove that any $3$-claw-free graph $G$ with an independent set of size three must satisfy $χ(G) \leq 2 ω(G)$. Their result implies a factor $2$-estimation algorithm for the maximum weight independent set via an SDP relaxation (providing the first non-trivial result for maximum-weight independent set in such graphs via a convex relaxation). An obvious open question is whether a similar conditional $χ$-boundedness phenomenon holds for any $k$-claw-free graph. Our main result answers this question negatively. We further present some evidence that our construction could be useful in studying more broadly the power of convex relaxations in the context of approximating maximum weight independent set in $k$-claw free graphs. In particular, we prove a lower bound on families of convex programs that are stronger than known convex relaxations used algorithmically in this context.
Submitted 30 August, 2023;
originally announced August 2023.
-
Design and Implementation of an Efficient Onboard Computer System for CanSat Atmosphere Monitoring
Authors:
Abhijit Gadekar
Abstract:
With advancements in technology, the smaller versions of satellites have gained momentum in the space industry for earth monitoring and communication-based applications. The rise of CanSat technology has significantly impacted the space industry by providing a cost-effective solution for space exploration. CanSat is a simulation model of a real satellite and plays a crucial role in collecting and transmitting atmospheric data. This paper discusses the design of an Onboard Computer System for CanSat, used to study various environmental parameters by monitoring the concentrations of gases in the atmosphere. The Onboard Computer System uses GPS, accelerometer, altitude, temperature, pressure, gyroscope, magnetometer, UV radiation, and air quality sensors for atmospheric sensing. A highly efficient and low-power ESP32 microcontroller and a transceiver module are used to acquire data, facilitate seamless communication and transmit the collected data to the ground station.
Submitted 23 January, 2025; v1 submitted 7 August, 2023;
originally announced August 2023.
-
Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces
Authors:
Fateme Abbasi,
Sandip Banerjee,
Jarosław Byrka,
Parinya Chalermsook,
Ameet Gadekar,
Kamyar Khodamoradi,
Dániel Marx,
Roohani Sharma,
Joachim Spoerhase
Abstract:
We consider the well-studied Robust $(k, z)$-Clustering problem, which generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$ weighted points in a metric space $(M,δ)$ and a positive integer $k$. Further, each point belongs to one (or more) of the $m$ many different groups $S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that $\max_{i \in [m]} \sum_{p \in S_i} w(p) δ(p,X)^z$ is minimized.
This problem arises in the domains of robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and algorithmic fairness. For polynomial time computation, an approximation factor of $O(\log m/\log\log m)$ is known [Makarychev, Vakilian, COLT $2021$], which is tight under a plausible complexity assumption even in line metrics. For FPT time, there is a $(3^z+ε)$-approximation algorithm, which is tight under GAP-ETH [Goyal, Jaiswal, Inf. Proc. Letters, 2023].
Motivated by the tight lower bounds for general discrete metrics, we focus on \emph{geometric} spaces such as the (discrete) high-dimensional Euclidean setting and metrics of low doubling dimension, which play an important role in data analysis applications. First, for a universal constant $η_0 >0.0006$, we devise a $3^z(1-η_{0})$-factor FPT approximation algorithm for discrete high-dimensional Euclidean spaces thereby bypassing the lower bound for general metrics. We complement this result by showing that even the special case of $k$-Center in dimension $Θ(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to approximate for FPT algorithms. Finally, we complete the FPT approximation landscape by designing an FPT $(1+ε)$-approximation scheme (EPAS) for the metric of sub-logarithmic doubling dimension.
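The min-max objective $\max_{i \in [m]} \sum_{p \in S_i} w(p) δ(p,X)^z$ from the abstract above can be evaluated for a fixed set of centers as follows. This is only a sketch of the objective, not of any algorithm in the paper; the function name `robust_cost`, the index-based group representation, and the one-dimensional toy instance are assumptions.

```python
def robust_cost(points, weights, groups, centers, z, dist):
    """Largest group cost, where a group's cost is sum of w(p) * dist(p, X)^z.

    points  : list of points
    weights : list of weights, weights[i] belongs to points[i]
    groups  : list of groups, each a list of point indices (may overlap)
    centers : the candidate center set X
    dist    : metric on the points (assumed to be supplied by the caller)
    """
    def to_nearest(p):
        return min(dist(p, c) for c in centers)

    return max(
        sum(weights[i] * to_nearest(points[i]) ** z for i in group)
        for group in groups
    )

points = [0.0, 1.0, 4.0, 6.0]
weights = [1, 1, 2, 1]
groups = [[0, 1], [2, 3], [1, 2]]
centers = [1.0, 5.0]
line_metric = lambda a, b: abs(a - b)

print(robust_cost(points, weights, groups, centers, z=2, dist=line_metric))
```

Here the three groups have costs 1, 3 and 2, so the objective value is determined by the second group.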
Submitted 16 September, 2024; v1 submitted 12 May, 2023;
originally announced May 2023.
-
Parameterized Approximation Schemes for Clustering with General Norm Objectives
Authors:
Fateme Abbasi,
Sandip Banerjee,
Jarosław Byrka,
Parinya Chalermsook,
Ameet Gadekar,
Kamyar Khodamoradi,
Dániel Marx,
Roohani Sharma,
Joachim Spoerhase
Abstract:
This paper considers the well-studied algorithmic regime of designing a $(1+ε)$-approximation algorithm for a $k$-clustering problem that runs in time $f(k,ε)poly(n)$ (sometimes called an efficient parameterized approximation scheme or EPAS for short). Notable results of this kind include EPASes in the high-dimensional Euclidean setting for $k$-center [Badŏiu, Har-Peled, Indyk; STOC'02] as well as $k$-median, and $k$-means [Kumar, Sabharwal, Sen; J. ACM 2010]. However, existing EPASes handle only basic objectives (such as $k$-center, $k$-median, and $k$-means) and are tailored to the specific objective and metric space.
Our main contribution is a clean and simple EPAS that settles more than ten clustering problems (across multiple well-studied objectives as well as metric spaces) and unifies well-known EPASes. Our algorithm gives EPASes for a large variety of clustering objectives (for example, $k$-means, $k$-center, $k$-median, priority $k$-center, $\ell$-centrum, ordered $k$-median, socially fair $k$-median aka robust $k$-median, or more generally monotone norm $k$-clustering) and metric spaces (for example, continuous high-dimensional Euclidean spaces, metrics of bounded doubling dimension, bounded treewidth metrics, and planar metrics).
Key to our approach is a new concept that we call bounded $ε$-scatter dimension--an intrinsic complexity measure of a metric space that is a relaxation of the standard notion of bounded doubling dimension. Our main technical result shows that two conditions are essentially sufficient for our algorithm to yield an EPAS on the input metric $M$ for any clustering objective: (i) The objective is described by a monotone (not necessarily symmetric!) norm, and (ii) the $ε$-scatter dimension of $M$ is upper bounded by a function of $ε$.
Submitted 6 April, 2023;
originally announced April 2023.
-
Clustering with fair-center representation: parameterized approximation algorithms and heuristics
Authors:
Suhas Thejaswi,
Ameet Gadekar,
Bruno Ordozgoiti,
Michal Osadnik
Abstract:
We study a variant of classical clustering formulations in the context of algorithmic fairness, known as diversity-aware clustering. In this variant we are given a collection of facility subsets, and a solution must contain at least a specified number of facilities from each subset while simultaneously minimizing the clustering objective ($k$-median or $k$-means). We investigate the fixed-parameter tractability of these problems and show several negative hardness and inapproximability results, even when we afford exponential running time with respect to some parameters.
Motivated by these results we identify natural parameters of the problem, and present fixed-parameter approximation algorithms with approximation ratios $\big(1 + \frac{2}{e} +ε\big)$ and $\big(1 + \frac{8}{e}+ ε\big)$ for diversity-aware $k$-median and diversity-aware $k$-means respectively, and argue that these ratios are essentially tight assuming the gap-exponential time hypothesis. We also present a simple and more practical bicriteria approximation algorithm with better running time bounds. We finally propose efficient and practical heuristics. We evaluate the scalability and effectiveness of our methods in a wide variety of rigorously conducted experiments, on both real and synthetic data.
Submitted 24 October, 2022; v1 submitted 13 December, 2021;
originally announced December 2021.
-
On the parameterized complexity of Compact Set Packing
Authors:
Ameet Gadekar
Abstract:
The Set Packing problem is, given a collection of sets $\mathcal{S}$ over a ground set $\mathcal{U}$, to find a maximum collection of sets that are pairwise disjoint. The problem is among the most fundamental NP-hard optimization problems that have been studied extensively in various computational regimes. The focus of this work is on parameterized complexity, Parameterized Set Packing (PSP): Given $r \in {\mathbb N}$, is there a collection $ \mathcal{S}' \subseteq \mathcal{S}: |\mathcal{S}'| = r$ such that the sets in $\mathcal{S}'$ are pairwise disjoint? Unfortunately, the problem is not fixed parameter tractable unless $\mathsf{W[1] = FPT}$, and, in fact, an "enumeration" running time of $|\mathcal{S}|^{Ω(r)}$ is required unless the exponential time hypothesis (ETH) fails. This paper is a quest for tractable instances of Set Packing from parameterized complexity perspectives. We say that the input $(\mathcal{U},\mathcal{S})$ is "compact" if $|\mathcal{U}| = f(r)\cdotΘ(\textsf{poly}( \log |\mathcal{S}|))$, for some $f(r) \ge r$. In the Compact Set Packing problem, we are given a compact instance of PSP. In this direction, we present a "dichotomy" result of PSP: When $|\mathcal{U}| = f(r)\cdot o(\log |\mathcal{S}|)$, PSP is in $\textsf{FPT}$, while for $|\mathcal{U}| = r\cdotΘ(\log (|\mathcal{S}|))$, the problem is $W[1]$-hard; moreover, assuming ETH, Compact PSP does not even admit an $|\mathcal{S}|^{o(r/\log r)}$-time algorithm. Although certain results in the literature imply hardness of compact versions of related problems such as Set $r$-Covering and Exact $r$-Covering, these constructions fail to extend to Compact PSP. A novel contribution of our work is the identification and construction of a gadget, which we call Compatible Intersecting Set System pair, that is crucial in obtaining the hardness result for Compact PSP.
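The "enumeration" baseline referred to in the abstract above, trying all $r$-subsets of $\mathcal{S}$, can be sketched as below. It is a naive illustration of the PSP decision problem, not an algorithm from the paper; the function name `set_packing` and the example family are assumptions.

```python
from itertools import combinations

def set_packing(sets, r):
    """Return indices of r pairwise disjoint sets, or None if none exist.

    Tries every r-subset of the family, so the running time is
    roughly |S|^r times the cost of the disjointness check.
    """
    for combo in combinations(range(len(sets)), r):
        used = set()
        disjoint = True
        for i in combo:
            if used & sets[i]:      # overlaps an earlier set in this combo
                disjoint = False
                break
            used |= sets[i]
        if disjoint:
            return combo
    return None

family = [{1, 2}, {2, 3}, {3, 4}, {5}]
print(set_packing(family, 3))   # (0, 2, 3)
print(set_packing(family, 4))   # None
```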
Submitted 25 February, 2023; v1 submitted 11 November, 2021;
originally announced November 2021.
-
Large polaron evolution in anatase TiO2 due to carrier and temperature dependence of electron-phonon coupling
Authors:
B. X. Yan,
D. Y. Wan,
X. Chi,
C. J. Li,
M. R. Motapothula,
S. Hooda,
P. Yang,
Z. Huang,
S. W. Zeng,
A. Gadekar,
S. J. Pennycook,
A. Rusydi,
Ariando,
J. Martin,
T. Venkatesan
Abstract:
The electronic and magneto transport properties of reduced anatase TiO2 epitaxial thin films are analyzed considering various polaronic effects. Unexpectedly, with increasing carrier concentration, the mobility increases, which rarely happens in common metallic systems. We find that the screening of the electron-phonon (e-ph) coupling by excess carriers is necessary to explain this unusual dependence. We also find that the magnetoresistance (MR) could be decomposed into a linear and a quadratic component, separately characterizing the transport and trap behavior of carriers as a function of temperature. The various transport behaviors could be organized into a single phase diagram which clarifies the nature of the large polaron in this material.
Submitted 5 March, 2023; v1 submitted 11 November, 2017;
originally announced November 2017.
-
On the hardness of learning sparse parities
Authors:
Arnab Bhattacharyya,
Ameet Gadekar,
Suprovat Ghoshal,
Rishi Saket
Abstract:
This work investigates the hardness of computing sparse solutions to systems of linear equations over F_2. Consider the k-EvenSet problem: given a homogeneous system of linear equations over F_2 on n variables, decide if there exists a nonzero solution of Hamming weight at most k (i.e. a k-sparse solution). While there is a simple O(n^{k/2})-time algorithm for it, establishing fixed parameter intractability for k-EvenSet has been a notorious open problem. Towards this goal, we show that unless k-Clique can be solved in n^{o(k)} time, k-EvenSet has no poly(n)2^{o(sqrt{k})} time algorithm and no polynomial time algorithm when k = (log n)^{2+eta} for any eta > 0.
Our work also shows that the non-homogeneous generalization of the problem -- which we call k-VectorSum -- is W[1]-hard on instances where the number of equations is O(k log n), improving on previous reductions which produced Omega(n) equations. We also show that for any constant eps > 0, given a system of O(exp(O(k))log n) linear equations, it is W[1]-hard to decide if there is a k-sparse linear form satisfying all the equations or if every function on at most k variables (k-junta) satisfies at most (1/2 + eps)-fraction of the equations. In the setting of computational learning, this shows hardness of approximate non-proper learning of k-parities. In a similar vein, we use the hardness of k-EvenSet to show that for any constant d, unless k-Clique can be solved in n^{o(k)} time there is no poly(m, n)2^{o(sqrt{k})} time algorithm to decide whether a given set of m points in F_2^n satisfies: (i) there exists a non-trivial k-sparse homogeneous linear form evaluating to 0 on all the points, or (ii) any non-trivial degree d polynomial P supported on at most k variables evaluates to zero on approx. Pr_{F_2^n}[P(z) = 0] fraction of the points i.e., P is fooled by the set of points.
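To make the k-EvenSet problem from the abstract above concrete, here is a naive decision procedure that enumerates all supports of size at most k. It is deliberately the simple n^{O(k)} enumeration, not the O(n^{k/2})-time algorithm mentioned in the abstract; the function name `sparse_solution` and the encoding of each equation as a list of variable indices are assumptions.

```python
from itertools import combinations

def sparse_solution(equations, n, k):
    """Find a nonzero solution over F_2 of Hamming weight at most k.

    equations : list of equations, each a list of variable indices whose
                XOR must equal 0 (a homogeneous system over F_2)
    n         : number of variables
    Returns the support of a solution as a tuple, or None.
    """
    # Column j as a bitmask: bit i is set iff variable j occurs in equation i.
    cols = [0] * n
    for i, eq in enumerate(equations):
        for j in eq:
            cols[j] ^= 1 << i

    # A 0/1 vector with the given support solves the system iff the
    # XOR of the corresponding columns is the zero vector.
    for weight in range(1, k + 1):
        for support in combinations(range(n), weight):
            acc = 0
            for j in support:
                acc ^= cols[j]
            if acc == 0:
                return support
    return None

# x0 + x1 = 0, x1 + x2 = 0, x3 = 0 forces x0 = x1 = x2 and x3 = 0.
system = [[0, 1], [1, 2], [3]]
print(sparse_solution(system, n=4, k=2))   # None
print(sparse_solution(system, n=4, k=3))   # (0, 1, 2)
```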
Submitted 25 November, 2015;
originally announced November 2015.
-
On learning k-parities with and without noise
Authors:
Arnab Bhattacharyya,
Ameet Gadekar,
Ninad Rajgopal
Abstract:
We first consider the problem of learning $k$-parities in the on-line mistake-bound model: given a hidden vector $x \in \{0,1\}^n$ with $|x|=k$ and a sequence of "questions" $a_1, a_2, ...\in \{0,1\}^n$, where the algorithm must reply to each question with $< a_i, x> \pmod 2$, what is the best tradeoff between the number of mistakes made by the algorithm and its time complexity? We improve the previous best result of Buhrman et al. by an $\exp(k)$ factor in the time complexity.
Second, we consider the problem of learning $k$-parities in the presence of classification noise of rate $η\in (0,1/2)$. A polynomial time algorithm for this problem (when $η> 0$ and $k = ω(1)$) is a longstanding challenge in learning theory. Grigorescu et al. showed an algorithm running in time ${n \choose k/2}^{1 + 4η^2 +o(1)}$. Note that this algorithm inherently requires time ${n \choose k/2}$ even when the noise rate $η$ is polynomially small. We observe that for sufficiently small noise rate, it is possible to break the $n \choose k/2$ barrier. In particular, if for some function $f(n) = ω(1)$ and $α\in [1/2, 1)$, $k = n/f(n)$ and $η= o(f(n)^{- α}/\log n)$, then there is an algorithm for the problem with running time $poly(n)\cdot {n \choose k}^{1-α} \cdot e^{-k/4.01}$.
Submitted 18 February, 2015;
originally announced February 2015.