Search | arXiv e-print repository

End-to-End Mesh Optimization of a Hybrid Deep Learning Black-Box PDE Solver

Authors: Shaocong Ma, James Diffenderfer, Bhavya Kailkhura, Yi Zhou

Abstract: Deep learning has been widely applied to solve partial differential equations (PDEs) in computational fluid dynamics. Recent research proposed a PDE correction framework that leverages deep learning to correct the solution obtained by a PDE solver on a coarse mesh. However, end-to-end training of such a PDE correction model over both solver-dependent parameters such as mesh parameters and neural n… ▽ More Deep learning has been widely applied to solve partial differential equations (PDEs) in computational fluid dynamics. Recent research proposed a PDE correction framework that leverages deep learning to correct the solution obtained by a PDE solver on a coarse mesh. However, end-to-end training of such a PDE correction model over both solver-dependent parameters such as mesh parameters and neural network parameters requires the PDE solver to support automatic differentiation through the iterative numerical process. Such a feature is not readily available in many existing solvers. In this study, we explore the feasibility of end-to-end training of a hybrid model with a black-box PDE solver and a deep learning model for fluid flow prediction. Specifically, we investigate a hybrid model that integrates a black-box PDE solver into a differentiable deep graph neural network. To train this model, we use a zeroth-order gradient estimator to differentiate the PDE solver via forward propagation. Although experiments show that the proposed approach based on zeroth-order gradient estimation underperforms the baseline that computes exact derivatives using automatic differentiation, our proposed method outperforms the baseline trained with a frozen input mesh to the solver. Moreover, with a simple warm-start on the neural network parameters, we show that models trained by these zeroth-order algorithms achieve an accelerated convergence and improved generalization performance. △ Less

Submitted 28 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

arXiv:2203.16615 [pdf, other]

A Fast and Convergent Proximal Algorithm for Regularized Nonconvex and Nonsmooth Bi-level Optimization

Authors: Ziyi Chen, Bhavya Kailkhura, Yi Zhou

Abstract: Many important machine learning applications involve regularized nonconvex bi-level optimization. However, the existing gradient-based bi-level optimization algorithms cannot handle nonconvex or nonsmooth regularizers, and they suffer from a high computation complexity in nonconvex bi-level optimization. In this work, we study a proximal gradient-type algorithm that adopts the approximate implicit… ▽ More Many important machine learning applications involve regularized nonconvex bi-level optimization. However, the existing gradient-based bi-level optimization algorithms cannot handle nonconvex or nonsmooth regularizers, and they suffer from a high computation complexity in nonconvex bi-level optimization. In this work, we study a proximal gradient-type algorithm that adopts the approximate implicit differentiation (AID) scheme for nonconvex bi-level optimization with possibly nonconvex and nonsmooth regularizers. In particular, the algorithm applies the Nesterov's momentum to accelerate the computation of the implicit gradient involved in AID. We provide a comprehensive analysis of the global convergence properties of this algorithm through identifying its intrinsic potential function. In particular, we formally establish the convergence of the model parameters to a critical point of the bi-level problem, and obtain an improved computation complexity $\mathcal{O}(κ^{3.5}ε^{-2})$ over the state-of-the-art result. Moreover, we analyze the asymptotic convergence rates of this algorithm under a class of local nonconvex geometries characterized by a Łojasiewicz-type gradient inequality. Experiment on hyper-parameter optimization demonstrates the effectiveness of our algorithm. △ Less

Submitted 3 June, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

Comments: 20 pages, 1 figure, 1 table

arXiv:2009.10748 [pdf, other]

doi 10.1109/BigData50022.2020.9377960

FedCluster: Boosting the Convergence of Federated Learning via Cluster-Cycling

Authors: Cheng Chen, Ziyi Chen, Yi Zhou, Bhavya Kailkhura

Abstract: We develop FedCluster--a novel federated learning framework with improved optimization efficiency, and investigate its theoretical convergence properties. The FedCluster groups the devices into multiple clusters that perform federated learning cyclically in each learning round. Therefore, each learning round of FedCluster consists of multiple cycles of meta-update that boost the overall convergenc… ▽ More We develop FedCluster--a novel federated learning framework with improved optimization efficiency, and investigate its theoretical convergence properties. The FedCluster groups the devices into multiple clusters that perform federated learning cyclically in each learning round. Therefore, each learning round of FedCluster consists of multiple cycles of meta-update that boost the overall convergence. In nonconvex optimization, we show that FedCluster with the devices implementing the local {stochastic gradient descent (SGD)} algorithm achieves a faster convergence rate than the conventional {federated averaging (FedAvg)} algorithm in the presence of device-level data heterogeneity. We conduct experiments on deep learning applications and demonstrate that FedCluster converges significantly faster than the conventional federated learning under diverse levels of device-level data heterogeneity for a variety of local optimizers. △ Less

Submitted 22 September, 2020; originally announced September 2020.

Comments: 10 pages, 6 figures

Journal ref: 2020 IEEE International Conference on Big Data (Big Data), Atlanta, GA, USA, 2020, pp. 5017-5026

arXiv:1410.5904 [pdf, ps, other]

Distributed Detection in Tree Networks: Byzantines and Mitigation Techniques

Authors: Bhavya Kailkhura, Swastik Brahma, Berkan Dulek, Yunghsiang S Han, Pramod K. Varshney

Abstract: In this paper, the problem of distributed detection in tree networks in the presence of Byzantines is considered. Closed form expressions for optimal attacking strategies that minimize the miss detection error exponent at the fusion center (FC) are obtained. We also look at the problem from the network designer's (FC's) perspective. We study the problem of designing optimal distributed detection p… ▽ More In this paper, the problem of distributed detection in tree networks in the presence of Byzantines is considered. Closed form expressions for optimal attacking strategies that minimize the miss detection error exponent at the fusion center (FC) are obtained. We also look at the problem from the network designer's (FC's) perspective. We study the problem of designing optimal distributed detection parameters in a tree network in the presence of Byzantines. Next, we model the strategic interaction between the FC and the attacker as a Leader-Follower (Stackelberg) game. This formulation provides a methodology for predicting attacker and defender (FC) equilibrium strategies, which can be used to implement the optimal detector. Finally, a reputation based scheme to identify Byzantines is proposed and its performance is analytically evaluated. We also provide some numerical examples to gain insights into the solution. △ Less

Submitted 21 October, 2014; originally announced October 2014.

arXiv:1309.4513 [pdf, ps, other]

doi 10.1109/TSP.2014.2321735

Distributed Detection in Tree Topologies with Byzantines

Authors: Bhavya Kailkhura, Swastik Brahma, Yunghsiang S. Han, Pramod K. Varshney

Abstract: In this paper, we consider the problem of distributed detection in tree topologies in the presence of Byzantines. The expression for minimum attacking power required by the Byzantines to blind the fusion center (FC) is obtained. More specifically, we show that when more than a certain fraction of individual node decisions are falsified, the decision fusion scheme becomes completely incapable. We o… ▽ More In this paper, we consider the problem of distributed detection in tree topologies in the presence of Byzantines. The expression for minimum attacking power required by the Byzantines to blind the fusion center (FC) is obtained. More specifically, we show that when more than a certain fraction of individual node decisions are falsified, the decision fusion scheme becomes completely incapable. We obtain closed form expressions for the optimal attacking strategies that minimize the detection error exponent at the FC. We also look at the possible counter-measures from the FC's perspective to protect the network from these Byzantines. We formulate the robust topology design problem as a bi-level program and provide an efficient algorithm to solve it. We also provide some numerical results to gain insights into the solution. △ Less

Submitted 17 September, 2013; originally announced September 2013.

Showing 1–5 of 5 results for author: Kailkhura, B