-
On the Convergence of Gradient Descent for Large Learning Rates
Authors:
Alexandru Crăciun,
Debarghya Ghoshdastidar
Abstract:
A vast literature on convergence guarantees for gradient descent and derived methods exists at the moment. However, a simple practical situation remains unexplored: when a fixed step size is used, can we expect gradient descent to converge starting from any initialization? We provide fundamental impossibility results showing that convergence becomes impossible no matter the initialization if the step size gets too big. Looking at the asymptotic value of the gradient norm along the optimization trajectory, we see that there is a sharp transition as the step size crosses a critical value. This has been observed by practitioners, yet the true mechanisms through which this happens remain unclear beyond heuristics. Using results from dynamical systems theory, we provide a proof of this in the case of linear neural networks with a squared loss. We also prove the impossibility of convergence for more general losses without requiring strong assumptions such as Lipschitz continuity for the gradient. We validate our findings through experiments with non-linear networks.
Submitted 9 December, 2024; v1 submitted 20 February, 2024;
originally announced February 2024.
-
Kernelization, Proof Complexity and Social Choice
Authors:
Gabriel Istrate,
Cosmin Bonchis,
Adrian Craciun
Abstract:
We display an application of the notions of kernelization and data reduction from parameterized complexity to proof complexity. Specifically, we show that the existence of data reduction rules for a parameterized problem having (a) a small-length reduction chain and (b) small-size (extended) Frege proofs certifying the soundness of reduction steps implies the existence of subexponential size (extended) Frege proofs for propositional formalizations of the given problem.
We apply our result to infer the existence of subexponential Frege and extended Frege proofs for a variety of problems. Improving earlier results of Aisenberg et al. (ICALP 2015), we show that propositional formulas expressing (a stronger form of) the Kneser-Lovász Theorem have polynomial size Frege proofs for each constant value of the parameter k. Previously only quasipolynomial bounds were known (and only for the ordinary Kneser-Lovász Theorem).
Another notable application of our framework is to impossibility results in computational social choice: we show that, for any fixed number of agents, propositional translations of the Arrow and Gibbard-Satterthwaite impossibility theorems have subexponential size Frege proofs.
Submitted 28 April, 2021;
originally announced April 2021.
-
Gröbner Bases with Reduction Machines
Authors:
Georgiana Şurlea,
Adrian Crăciun
Abstract:
In this paper, we make a contribution to the computation of Gröbner bases. For polynomial reduction, instead of choosing the leading monomial of a polynomial as the monomial with respect to which the reduction process is carried out, we investigate what happens if we make that choice arbitrarily. It turns out that not only is this possible (that this produces a normal form is already known in the literature), but, for a fixed choice of reductors, the obtained normal form is the same no matter the order in which we reduce the monomials. To prove this, we introduce reduction machines, which work by reducing each monomial independently and then collecting the result. We show that such a machine can simulate any such reduction. We then discuss different implementations of these machines. Some of these implementations address inherent inefficiencies in reduction machines (repeating the same computations). We describe a first implementation and look at some experimental results.
Submitted 4 September, 2019;
originally announced September 2019.
-
Proceedings Third Symposium on Working Formal Methods
Authors:
Mircea Marin,
Adrian Crăciun
Abstract:
This volume contains the proceedings of FROM 2019: the Third Symposium on Working Formal Methods, held on September 3-5, 2019 in Timişoara (Romania). FROM aims to bring together researchers and practitioners who work on formal methods by contributing new theoretical results, methods, techniques, and frameworks, and/or make formal methods work by creating or using software tools that apply theoretical contributions.
Submitted 2 September, 2019;
originally announced September 2019.
-
Short Proofs of the Kneser-Lovász Coloring Principle
Authors:
James Aisenberg,
Maria Luisa Bonet,
Sam Buss,
Adrian Crãciun,
Gabriel Istrate
Abstract:
We prove that the propositional translations of the Kneser-Lovász theorem have polynomial size extended Frege proofs and quasi-polynomial size Frege proofs. We present a new counting-based combinatorial proof of the Kneser-Lovász theorem that avoids the topological arguments of prior proofs for all but finitely many cases for each k. We introduce a miniaturization of the octahedral Tucker lemma, called the truncated Tucker lemma: it is open whether its propositional translations have (quasi-)polynomial size Frege or extended Frege proofs.
Submitted 20 May, 2015;
originally announced May 2015.
-
Proof Complexity and the Kneser-Lovász Theorem
Authors:
Gabriel Istrate,
Adrian Crăciun
Abstract:
We investigate the proof complexity of a class of propositional formulas expressing a combinatorial principle known as the Kneser-Lovász Theorem. This is a family of propositional tautologies, indexed by a nonnegative integer parameter $k$ that generalizes the Pigeonhole Principle (obtained for $k=1$).
We show, for all fixed $k$, $2^{\Omega(n)}$ lower bounds on resolution complexity and exponential lower bounds for bounded depth Frege proofs. These results hold even for the more restricted class of formulas encoding Schrijver's strengthening of the Kneser-Lovász Theorem. On the other hand, for the cases $k=2,3$ (for which combinatorial proofs of the Kneser-Lovász Theorem are known) we give polynomial size Frege ($k=2$), respectively extended Frege ($k=3$) proofs. The paper concludes with a brief announcement of the results (presented in subsequent work) on the proof complexity of the general case of the Kneser-Lovász theorem.
Submitted 15 May, 2018; v1 submitted 18 February, 2014;
originally announced February 2014.