Showing 1–3 of 3 results for author: Ju, H

  1. arXiv:2306.08553

    cs.LG cs.DS math.OC stat.ML

    Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach

    Authors: Hongyang R. Zhang, Dongyue Li, Haotian Ju

    Abstract: The training of over-parameterized neural networks has received much study in recent literature. An important consideration is the regularization of over-parameterized networks due to their highly nonconvex and nonlinear geometry. In this paper, we study noise injection algorithms, which can regularize the Hessian of the loss, leading to regions with flat loss surfaces. Specifically, by injecting…

    Submitted 23 September, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 39 pages

    Journal ref: Trans. Mach. Learn. Res. 2024

  2. arXiv:2302.04451

    cs.LG cs.SI math.ST stat.ML

    Generalization in Graph Neural Networks: Improved PAC-Bayesian Bounds on Graph Diffusion

    Authors: Haotian Ju, Dongyue Li, Aneesh Sharma, Hongyang R. Zhang

    Abstract: Graph neural networks are widely used tools for graph prediction tasks. Motivated by their empirical performance, prior works have developed generalization bounds for graph neural networks, which scale with graph structures in terms of the maximum degree. In this paper, we present generalization bounds that instead scale with the largest singular value of the graph neural network's feature diffusi…

    Submitted 23 October, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: 36 pages. Appeared in AISTATS 2023

  3. arXiv:2206.02659

    cs.LG cs.CV math.ST stat.ML

    Robust Fine-Tuning of Deep Neural Networks with Hessian-based Generalization Guarantees

    Authors: Haotian Ju, Dongyue Li, Hongyang R. Zhang

    Abstract: We consider fine-tuning a pretrained deep neural network on a target task. We study the generalization properties of fine-tuning to understand the problem of overfitting, which has often been observed (e.g., when the target dataset is small or when the training labels are noisy). Existing generalization measures for deep networks depend on notions such as distance from the initialization (i.e., th…

    Submitted 22 December, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: 38 pages. Appeared in ICML 2022