-
Coordinate Descent for SLOPE
Authors:
Johan Larsson,
Quentin Klopfenstein,
Mathurin Massias,
Jonas Wallin
Abstract:
The lasso is the most famous sparse regression and feature selection method. One reason for its popularity is the speed at which the underlying optimization problem can be solved. Sorted L-One Penalized Estimation (SLOPE) is a generalization of the lasso with appealing statistical properties. In spite of this, the method has not yet reached widespread interest. A major reason for this is that current software packages that fit SLOPE rely on algorithms that perform poorly in high dimensions. To tackle this issue, we propose a new fast algorithm to solve the SLOPE optimization problem, which combines proximal gradient descent and proximal coordinate descent steps. We provide new results on the directional derivative of the SLOPE penalty and its related SLOPE thresholding operator, as well as provide convergence guarantees for our proposed solver. In extensive benchmarks on simulated and real data, we show that our method outperforms a long list of competing algorithms.
Submitted 26 October, 2022;
originally announced October 2022.
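For readers unfamiliar with the penalty behind SLOPE, the following is a minimal sketch of the sorted L1 norm, assuming the usual definition $J(\beta) = \sum_i \lambda_i |\beta|_{(i)}$ with a non-increasing sequence $\lambda_1 \geq \lambda_2 \geq \dots \geq 0$. The function name `sorted_l1_norm` is illustrative and does not come from the paper or from any SLOPE package; the solver described in the abstract is not reproduced here.

```python
def sorted_l1_norm(beta, lambdas):
    """Sorted L1 norm: sum_i lambdas[i] * |beta|_(i), where
    |beta|_(1) >= |beta|_(2) >= ... are the magnitudes in decreasing order.

    Illustrative helper (hypothetical name), not the paper's solver.
    """
    if len(beta) != len(lambdas):
        raise ValueError("beta and lambdas must have the same length")
    if any(l < 0 for l in lambdas):
        raise ValueError("lambdas must be non-negative")
    if any(a < b for a, b in zip(lambdas, lambdas[1:])):
        raise ValueError("lambdas must be non-increasing")
    # Pair the largest magnitude with the largest penalty weight.
    magnitudes = sorted((abs(b) for b in beta), reverse=True)
    return sum(l * m for l, m in zip(lambdas, magnitudes))
```

With all entries of `lambdas` equal to a common value, the function reduces to that value times the ordinary L1 norm, which is the sense in which SLOPE generalizes the lasso.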
-
Benchopt: Reproducible, efficient and collaborative optimization benchmarks
Authors:
Thomas Moreau,
Mathurin Massias,
Alexandre Gramfort,
Pierre Ablin,
Pierre-Antoine Bannier,
Benjamin Charlier,
Mathieu Dagréou,
Tom Dupré la Tour,
Ghislain Durif,
Cassio F. Dantas,
Quentin Klopfenstein,
Johan Larsson,
En Lai,
Tanguy Lefort,
Benoit Malézieux,
Badr Moufad,
Binh T. Nguyen,
Alain Rakotomamonjy,
Zaccharie Ramzi,
Joseph Salmon,
Samuel Vaiter
Abstract:
Numerical validation is at the core of machine learning research as it allows one to assess the actual impact of new methods, and to confirm the agreement between theory and practice. Yet, the rapid development of the field poses several challenges: researchers are confronted with a profusion of methods to compare, limited transparency and consensus on best practices, as well as tedious re-implementation work. As a result, validation is often very partial, which can lead to wrong conclusions that slow down the progress of research. We propose Benchopt, a collaborative framework to automate, reproduce and publish optimization benchmarks in machine learning across programming languages and hardware architectures. Benchopt simplifies benchmarking for the community by providing an off-the-shelf tool for running, sharing and extending experiments. To demonstrate its broad usability, we showcase benchmarks on three standard learning tasks: $\ell_2$-regularized logistic regression, Lasso, and ResNet18 training for image classification. These benchmarks highlight key practical findings that give a more nuanced view of the state-of-the-art for these problems, showing that for practical evaluation, the devil is in the details. We hope that Benchopt will foster collaborative work in the community, hence improving the reproducibility of research findings.
Submitted 28 October, 2022; v1 submitted 27 June, 2022;
originally announced June 2022.
-
Biased random k-SAT
Authors:
Joel Larsson,
Klas Markström
Abstract:
The basic random $k$-SAT problem is: Given a set of $n$ Boolean variables, and $m$ clauses of size $k$ picked uniformly at random from the set of all such clauses on our variables, is the conjunction of these clauses satisfiable?
Here we consider a variation of this problem where there is a bias towards variables occurring positive -- i.e. variables occur negated w.p. $0<p< \frac{1}{2}$ and positive otherwise -- and study how the satisfiability threshold depends on $p$. For $p<\frac{1}{2}$ this model breaks many of the symmetries of the original random $k$-SAT problem, e.g. the distribution of satisfying assignments in the Boolean cube is no longer uniform.
For any fixed $k$, we find the asymptotics of the threshold as $p$ approaches $0$ or $\frac{1}{2}$. The former confirms earlier predictions based on numerical studies and heuristic methods from statistical physics.
Submitted 12 June, 2019;
originally announced June 2019.
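The biased model in this abstract can be illustrated with a small simulation. The sketch below assumes that each clause uses $k$ distinct variables and that clauses are drawn independently (with replacement); both are modelling assumptions made for illustration, and the function names are hypothetical. The brute-force satisfiability check is only practical for very small $n$.

```python
import itertools
import random


def random_biased_formula(n, m, k, p, rng):
    """Draw m clauses of size k over variables 0..n-1.

    Each literal is negated with probability p and positive otherwise.
    A literal is stored as (variable_index, is_negated).
    """
    formula = []
    for _ in range(m):
        variables = rng.sample(range(n), k)  # k distinct variables (assumption)
        formula.append([(v, rng.random() < p) for v in variables])
    return formula


def is_satisfiable(formula, n):
    """Exhaustive search over all 2^n assignments; small n only."""
    for assignment in itertools.product([False, True], repeat=n):
        # A literal (v, negated) holds when assignment[v] differs from negated.
        if all(any(assignment[v] != neg for v, neg in clause) for clause in formula):
            return True
    return False
```

Repeating the experiment over many random formulas at a fixed ratio $m/n$ and different values of $p$ gives an empirical view of how the fraction of satisfiable instances changes with the bias.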
-
Speed and concentration of the covering time for structured coupon collectors
Authors:
Victor Falgas-Ravry,
Joel Larsson,
Klas Markström
Abstract:
Let $V$ be an $n$-set, and let $X$ be a random variable taking values in the powerset of $V$. Suppose we are given a sequence of random coupons $X_1, X_2, \ldots $, where the $X_i$ are independent random variables with distribution given by $X$. The covering time $T$ is the smallest integer $t\geq 0$ such that $\bigcup_{i=1}^tX_i=V$. The distribution of $T$ is important in many applications in combinatorial probability, and has been extensively studied. However, the literature has focussed almost exclusively on the case where $X$ is assumed to be symmetric and/or uniform in some way.
In this paper we study the covering time for much more general random variables $X$; we give general criteria for $T$ being sharply concentrated around its mean, precise tools to estimate that mean, as well as examples where $T$ fails to be concentrated and when structural properties in the distribution of $X$ allow for a very different behaviour of $T$ relative to the symmetric/uniform case.
Submitted 18 January, 2016;
originally announced January 2016.
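The definition of the covering time $T$ translates directly into a simulation. The sketch below takes $V = \{0, \dots, n-1\}$ and an arbitrary user-supplied sampler for $X$; the names `covering_time` and `draw_coupon` are illustrative assumptions, not notation from the paper. The loop does not terminate if the sampler can never cover some element of $V$.

```python
import random


def covering_time(n, draw_coupon, rng):
    """Smallest t >= 0 such that the union of the first t coupons is {0, ..., n-1}.

    draw_coupon(rng) must return a set: one independent sample of X.
    """
    remaining = set(range(n))
    t = 0
    while remaining:
        remaining -= draw_coupon(rng)  # elements outside V are simply ignored
        t += 1
    return t


def uniform_single_coupon(n):
    """Classical symmetric case: each coupon is one uniformly chosen element."""
    return lambda rng: {rng.randrange(n)}
```

Replacing `uniform_single_coupon` with a sampler that returns structured or unevenly distributed subsets allows an empirical comparison with the symmetric case discussed in the abstract.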
-
Necessary and Sufficient Conditions for Extended Noncontextuality in a Broad Class of Quantum Mechanical Systems
Authors:
Janne V. Kujala,
Ehtibar N. Dzhafarov,
Jan-Åke Larsson
Abstract:
The notion of (non)contextuality pertains to sets of properties measured one subset (context) at a time. We extend this notion to include so-called inconsistently connected systems, in which the measurements of a given property in different contexts may have different distributions, due to contextual biases in experimental design or physical interactions (signaling): a system of measurements has a maximally noncontextual description if a joint distribution can be imposed on them in which the measurements of any one property in different contexts are equal to each other with the maximal probability allowed by their different distributions. We derive necessary and sufficient conditions for the existence of such a description in a broad class of systems including Klyachko-Can-Binicioğlu-Shumovsky-type (KCBS), EPR-Bell-type, and Leggett-Garg-type systems. Because these conditions allow for inconsistent connectedness, they are applicable to real experiments. We illustrate this by analyzing an experiment by Lapkiewicz and colleagues aimed at testing contextuality in a KCBS-type system.
Submitted 14 November, 2015; v1 submitted 15 December, 2014;
originally announced December 2014.
-
Contextuality in Three Types of Quantum-Mechanical Systems
Authors:
Ehtibar N. Dzhafarov,
Janne V. Kujala,
Jan-Åke Larsson
Abstract:
We present a formal theory of contextuality for a set of random variables grouped into different subsets (contexts) corresponding to different, mutually incompatible conditions. Within each context the random variables are jointly distributed, but across different contexts they are stochastically unrelated. The theory of contextuality is based on the analysis of the extent to which some of these random variables can be viewed as preserving their identity across different contexts when one considers all possible joint distributions imposed on the entire set of the random variables. We illustrate the theory on three systems of traditional interest in quantum physics (and also in non-physical, e.g., behavioral studies). These are systems of the Klyachko-Can-Binicioğlu-Shumovsky-type, Einstein-Podolsky-Rosen-Bell-type, and Suppes-Zanotti-Leggett-Garg-type. Listed in this order, each of them is formally a special case of the previous one. For each of them we derive necessary and sufficient conditions for contextuality while allowing for experimental errors and contextual biases or signaling. Based on the same principles that underlie these derivations we also propose a measure for the degree of contextuality and compute it for the three systems in question.
Submitted 29 August, 2015; v1 submitted 9 November, 2014;
originally announced November 2014.
-
Exploiting Active Subspaces to Quantify Uncertainty in the Numerical Simulation of the HyShot II Scramjet
Authors:
Paul Constantine,
Michael Emory,
Johan Larsson,
Gianluca Iaccarino
Abstract:
We present a computational analysis of the reactive flow in a hypersonic scramjet engine with focus on effects of uncertainties in the operating conditions. We employ a novel methodology based on active subspaces to characterize the effects of the input uncertainty on the scramjet performance. The active subspace identifies one-dimensional structure in the map from simulation inputs to quantity of interest that allows us to reparameterize the operating conditions; instead of seven physical parameters, we can use a single derived active variable. This dimension reduction enables otherwise infeasible uncertainty quantification, considering the simulation cost of roughly 9500 CPU-hours per run. For two values of the fuel injection rate, we use a total of 68 simulations to (i) identify the parameters that contribute the most to the variation in the output quantity of interest, (ii) estimate upper and lower bounds on the quantity of interest, (iii) classify sets of operating conditions as safe or unsafe corresponding to a threshold on the output quantity of interest, and (iv) estimate a cumulative distribution function for the quantity of interest.
Submitted 15 July, 2015; v1 submitted 26 August, 2014;
originally announced August 2014.
-
The Minimum Perfect Matching in Pseudo-dimension $0<q<1$
Authors:
Joel Larsson
Abstract:
It is known that for $K_{n,n}$ equipped with i.i.d. $\mathrm{Exp}(1)$ edge costs, the minimum total cost of a perfect matching converges to $\pi^2/6$ in probability. Similar convergence has been established for all edge cost distributions of pseudo-dimension $q \geq 1$, such as Wei(1,q) costs. In this paper we extend those results to all $q>0$, confirming the Mézard-Parisi conjecture in the last remaining applicable case.
Submitted 12 June, 2019; v1 submitted 14 March, 2014;
originally announced March 2014.
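The quantity studied in this abstract, the minimum total cost of a perfect matching in $K_{n,n}$ with i.i.d. $\mathrm{Exp}(1)$ edge costs, can be computed exactly for small $n$ by exhaustive search. The sketch below is a brute-force illustration with hypothetical function names; it enumerates all $n!$ matchings, so it is usable only for small $n$, whereas the convergence statement concerns $n \to \infty$.

```python
import itertools
import random


def min_perfect_matching_cost(costs):
    """Minimum total cost over all perfect matchings of K_{n,n}.

    costs[i][j] is the cost of the edge between left vertex i and right
    vertex j. Brute force over all n! permutations.
    """
    n = len(costs)
    return min(
        sum(costs[i][perm[i]] for i in range(n))
        for perm in itertools.permutations(range(n))
    )


def random_exp_costs(n, rng):
    """n x n matrix of independent Exp(1) edge costs."""
    return [[rng.expovariate(1.0) for _ in range(n)] for _ in range(n)]
```

Averaging `min_perfect_matching_cost(random_exp_costs(n, rng))` over many draws for increasing small $n$ gives a rough numerical feel for the limit; a polynomial-time assignment solver would be needed for larger instances.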