Showing 1–2 of 2 results for author: Shea, B

  1. arXiv:2411.11224  [pdf, other]

    cs.LG math.OC stat.ML

    Don't Be So Positive: Negative Step Sizes in Second-Order Methods

    Authors: Betty Shea, Mark Schmidt

    Abstract: The value of second-order methods lies in the use of curvature information. Yet, this information is costly to extract and once obtained, valuable negative curvature information is often discarded so that the method is globally convergent. This limits the effectiveness of second-order methods in modern machine learning. In this paper, we show that second-order and second-order-like methods are pro…

    Submitted 5 December, 2024; v1 submitted 17 November, 2024; originally announced November 2024.

    Comments: added affiliation and more references

  2. arXiv:2406.17954  [pdf, other]

    cs.LG math.OC

    Why Line Search when you can Plane Search? SO-Friendly Neural Networks allow Per-Iteration Optimization of Learning and Momentum Rates for Every Layer

    Authors: Betty Shea, Mark Schmidt

    Abstract: We introduce the class of SO-friendly neural networks, which include several models used in practice including networks with 2 layers of hidden weights where the number of inputs is larger than the number of outputs. SO-friendly networks have the property that performing a precise line search to set the step size on each iteration has the same asymptotic cost during full-batch training as using a…

    Submitted 25 June, 2024; originally announced June 2024.