-
Physics-Informed Graph-Mesh Networks for PDEs: A hybrid approach for complex problems
Authors:
Marien Chenaud,
Frédéric Magoulès,
José Alves
Abstract:
The recent rise of deep learning has led to numerous applications, including solving partial differential equations using Physics-Informed Neural Networks. This approach has proven highly effective in several academic cases. However, the lack of physical invariances in these networks, coupled with other significant weaknesses, such as an inability to handle complex geometries or a lack of generalization capabilities, makes them unable to compete with classical numerical solvers in industrial settings. In this work, a limitation regarding the use of automatic differentiation in the context of physics-informed learning is highlighted. A hybrid approach combining physics-informed graph neural networks with numerical kernels from finite elements is introduced. After studying the theoretical properties of our model, we apply it to complex geometries, in two and three dimensions. Our choices are supported by an ablation study, and we evaluate the generalization capacity of the proposed approach.
Submitted 25 September, 2024;
originally announced October 2024.
-
Frequency range non-Lipschitz parametric optimization of a noise absorption
Authors:
Frederic Magoules,
Mathieu Menoux,
Anna Rozanova-Pierrat
Abstract:
In the framework of optimal wave energy absorption, we solve theoretically and numerically a parametric shape optimization problem to find the optimal distribution of absorbing material in the reflective one, defined by a characteristic function in the Robin-type boundary condition associated with the Helmholtz equation. The Robin boundary condition can be given on part or all of the boundary of a bounded $(\varepsilon, \infty)$-domain of $\mathbb{R}^n$. The geometry of the partially absorbing boundary is fixed, but allowed to be non-Lipschitz, for example, fractal. It is defined as the support of a $d$-upper regular measure with $d \in ]n-2, n[$. Using the well-posedness properties of the model, for any fixed volume fraction of the absorbing material, we establish the existence of at least one optimal distribution minimizing the acoustical energy on a fixed frequency range of the relaxation problem. Thanks to the shape derivative of the energy functional, also existing for non-Lipschitz boundaries, we implement (in the two-dimensional case) the gradient descent method and find the optimal distribution with 50% of the absorbent material on a frequency range with better performance than the 100% absorbent boundary. The same type of performance is also obtained by the genetic method.
Submitted 10 September, 2024;
originally announced September 2024.
-
Distributed convergence detection based on global residual error under asynchronous iterations
Authors:
Frédéric Magoulès,
Guillaume Gbikpi-Benissan
Abstract:
Convergence of classical parallel iterations is detected by performing a reduction operation at each iteration in order to compute a residual error relative to a potential solution vector. To efficiently run asynchronous iterations, blocking communication requests are avoided, which makes it hard to isolate and handle any global vector. While some termination protocols were proposed for asynchronous iterations, only very few of them are based on global residual computation and guarantee effective convergence. But the most effective and efficient existing solutions feature two reduction operations, which constitutes an important factor of termination delay. In this paper, we present new, non-intrusive, protocols to compute a residual error under asynchronous iterations, requiring only one reduction operation. Various communication models show that some heuristics can even be introduced and formally evaluated. Extensive experiments with up to 5600 processor cores confirm the practical effectiveness and efficiency of our approach.
Submitted 29 December, 2023;
originally announced December 2023.
-
Asynchronous iterations of HSS method for non-Hermitian linear systems
Authors:
Guillaume Gbikpi-Benissan,
Qinmeng Zou,
Frédéric Magoulès
Abstract:
A general asynchronous alternating iterative model is designed, for which convergence is theoretically ensured both under a classical spectral radius bound and, then, for a classical class of matrix splittings for $\mathsf H$-matrices. The computational model can be thought of as a two-stage alternating iterative method, which is well suited to the well-known Hermitian and skew-Hermitian splitting (HSS) approach, with the particularity here of considering only one inner iteration. Experimental parallel performance comparison is conducted between the generalized minimal residual (GMRES) algorithm, the standard HSS and our asynchronous variant, on both real and complex non-Hermitian linear systems respectively arising from convection-diffusion and structural dynamics problems. A significant gain in execution time is observed in both cases.
Submitted 27 December, 2023;
originally announced December 2023.
-
Resilient asynchronous primal Schur method
Authors:
Guillaume Gbikpi-Benissan,
Frédéric Magoulès
Abstract:
This paper introduces the application of the asynchronous iterations theory within the framework of the primal Schur domain decomposition method. A suitable relaxation scheme is designed, whose asynchronous convergence is established under classical spectral radius conditions. For the usual case where the local Schur complement matrices are not constructed, suitable splittings based only on explicitly generated matrices are provided. Numerical experiments are conducted on a supercomputer for both Poisson's and linear elasticity problems. The asynchronous Schur solver outperformed the classical conjugate-gradient-based one in case of compute node failures.
Submitted 22 December, 2023;
originally announced December 2023.
-
Asynchronous multiplicative coarse-space correction
Authors:
Guillaume Gbikpi-Benissan,
Frédéric Magoulès
Abstract:
This paper introduces the multiplicative variant of the recently proposed asynchronous additive coarse-space correction method. Defining an asynchronous extension of multiplicative correction is not straightforward; however, our analysis allows for usual asynchronous programming approaches. General asynchronous iterative models are explicitly devised both for shared or replicated coarse problems and for centralized or distributed ones. Convergence conditions are derived and shown to be satisfied for M-matrices, as also done for the additive case. Implementation aspects are discussed, which reveal the need for non-blocking synchronization for building the successive right-hand-side vectors of the coarse problem. Optionally, a parameter allows for applying each coarse solution a maximum number of times, which has an impact on the algorithm efficiency. Numerical results on a high-speed homogeneous cluster confirm the practical efficiency of the asynchronous two-level method over its synchronous counterpart, even when it is not the case for the underlying one-level methods.
Submitted 19 December, 2023;
originally announced December 2023.
-
Physics-Informed Graph Convolutional Networks: Towards a generalized framework for complex geometries
Authors:
Marien Chenaud,
José Alves,
Frédéric Magoulès
Abstract:
Since the seminal work of [9] and their Physics-Informed neural networks (PINNs), many efforts have been made towards solving partial differential equations (PDEs) with Deep Learning models. However, some challenges remain, for instance the extension of such models to complex three-dimensional geometries, and a study on how such approaches could be combined with classical numerical solvers. In this work, we justify the use of graph neural networks for these problems, based on the similarity between these architectures and the meshes used in traditional numerical techniques for solving partial differential equations. After proving an issue with the Physics-Informed framework for complex geometries, during the computation of PDE residuals, an alternative procedure is proposed, by combining classical numerical solvers and the Physics-Informed framework. Finally, we propose an implementation of this approach, which we test on a three-dimensional problem on an irregular geometry.
Submitted 24 November, 2023; v1 submitted 20 October, 2023;
originally announced October 2023.
-
A Hybrid GNN approach for predicting node data for 3D meshes
Authors:
Shwetha Salimath,
Francesca Bugiotti,
Frederic Magoules
Abstract:
Metal forging is used to manufacture dies. We require the best set of input parameters for the process to be efficient. Currently, we predict the best parameters using the finite element method by generating simulations for the different initial conditions, which is a time-consuming process. In this paper, we introduce a hybrid approach that helps in processing and generating new data simulations using a surrogate graph neural network model based on graph convolutions, at a lower time cost. We also introduce a hybrid approach that helps in processing and generating new data simulations using the model. Given a dataset representing meshes, our focus is on the conversion of the available information into a graph or point cloud structure. This new representation enables deep learning. The predicted result is similar, with a low error when compared to that produced using the finite element method. The new models have outperformed existing PointNet and simple graph neural network models when applied to produce the simulations.
Submitted 23 October, 2023;
originally announced October 2023.
-
Enhancing the Global/Local Coupling Method: An Asynchronous Parallel Framework
Authors:
Ahmed El Kerim,
Pierre Gosselet,
Frédéric Magoulès
Abstract:
A novel approach is being developed to introduce a parallel asynchronous implementation of non-intrusive global-local coupling. This study examines scenarios involving numerous patches, including those covering the entire structure. By leveraging asynchronism, the method aims to minimize reliance on communication, handle failures effectively, and address load imbalances. Detailed insights into the methodology are presented, accompanied by a demonstration of its performance through an academic case study.
Submitted 20 October, 2023;
originally announced October 2023.
-
Accurate Coarse Residual for Two-Level Asynchronous Domain Decomposition Methods
Authors:
Guillaume Gbikpi-Benissan,
Frédéric Magoulès
Abstract:
Recently, asynchronous coarse-space correction has been achieved within both the overlapping Schwarz and the primal Schur frameworks. Both additive and multiplicative corrections have been discussed. In this paper, we address some implementation drawbacks of the proposed additive correction scheme. In the existing approach, each coarse solution is applied only once, leaving most of the iterations of the solver without coarse-space information while building the right-hand side of the coarse problem. Moreover, one-sided routines of the Message Passing Interface (MPI) standard were considered, which introduced the need for a sleep statement in the iterations loop of the coarse solver. This implies a tuning of the sleep period, which is a non-discrete quantity. In this paper, we improve the accuracy of the coarse right-hand side, which allowed for more frequent corrections. In addition, we highlight a two-sided implementation which better suits the asynchronous coarse-space correction scheme. Numerical experiments show a significant performance gain with such increased incorporation of the coarse space.
Submitted 19 October, 2023;
originally announced October 2023.
-
Asynchronous global-local non-invasive coupling for linear elliptic problems
Authors:
Ahmed El Kerim,
Pierre Gosselet,
Frédéric Magoulès
Abstract:
This paper presents the first asynchronous version of the Global/Local non-invasive coupling, capable of dealing efficiently with multiple, possibly adjacent, patches. We give a new interpretation of the coupling in terms of a primal domain decomposition method, and we prove the convergence of the relaxed asynchronous iteration. The asynchronous paradigm lifts many bottlenecks of the Global/Local coupling performance. We illustrate the method on several linear elliptic problems as encountered in thermal and elasticity studies.
Submitted 23 November, 2022;
originally announced November 2022.
-
Point-Cloud-based Deep Learning Models for Finite Element Analysis
Authors:
Meduri Venkata Shivaditya,
Francesca Bugiotti,
Frederic Magoules
Abstract:
In this paper, we explore point-cloud-based deep learning models to analyze numerical simulations arising from finite element analysis. The objective is to classify the results of the simulations automatically, without tedious human intervention. Two models are presented here: the Point-Net classification model and the Dynamic Graph Convolutional Neural Net model. Both trained point-cloud deep learning models performed well on experiments with finite element analysis arising from the automotive industry. The proposed models show promise in automating the analysis process of finite element simulations. Accuracies of 79.17% and 94.5% are obtained for the Point-Net and the Dynamic Graph Convolutional Neural Net models, respectively.
Submitted 18 November, 2022;
originally announced November 2022.
-
Multilayer Perceptron-based Surrogate Models for Finite Element Analysis
Authors:
Lawson Oliveira Lima,
Julien Rosenberger,
Esteban Antier,
Frederic Magoules
Abstract:
Many Partial Differential Equations (PDEs) do not have an analytical solution, and can only be solved by numerical methods. In this context, Physics-Informed Neural Networks (PINNs) have become important in recent decades, since they use a neural network and physical conditions to approximate any function. This paper focuses on hypertuning of a PINN used to solve a PDE. The behavior of the approximated solution when we change the learning rate or the activation function (sigmoid, hyperbolic tangent, GELU, ReLU and ELU) is analyzed here. A comparative study is done to determine the best characteristics for the problem, as well as to find a learning rate that allows fast and satisfactory learning. GELU and hyperbolic tangent activation functions exhibit better performance than other activation functions. A suitable choice of the learning rate results in higher accuracy and faster convergence.
Submitted 17 November, 2022;
originally announced November 2022.
-
Graph Neural Network-based Surrogate Models for Finite Element Analysis
Authors:
Meduri Venkata Shivaditya,
José Alves,
Francesca Bugiotti,
Frederic Magoules
Abstract:
Current simulations of metal forging processes use advanced finite element methods. Such methods consist of solving mathematical equations, which takes a significant amount of time for the simulation to complete. Computational time can be prohibitive for parametric response surface exploration tasks. In this paper, we propose, as an alternative, a Graph Neural Network-based graph prediction model that acts as a surrogate model for parameter search space exploration and exhibits a time cost reduced by an order of magnitude. Numerical experiments show that this new model outperforms the Point-Net model and the Dynamic Graph Convolutional Neural Net model.
Submitted 17 November, 2022;
originally announced November 2022.
-
Asynchronous scalable version of the Global-Local non-invasive coupling
Authors:
Ahmed El Kerim,
Pierre Gosselet,
Frederic Magoules
Abstract:
The Global-Local non-invasive coupling is an improvement of the submodeling technique, which makes it possible to locally enhance structure computations by introducing patches with refined models and to take into account all the interactions. In order to circumvent its inherently limited computational performance, we propose and implement an asynchronous version of the method. The asynchronous coupling reduces the dependency on communications, failures, and load imbalance. We present the theory and the implementation of the method in the linear case and illustrate its performance on academic cases inspired by actual industrial problems.
Submitted 7 August, 2022;
originally announced August 2022.
-
Couplage Global-Local en asynchrone pour des problèmes linéaires
Authors:
Ahmed El Kerim,
Pierre Gosselet,
Frederic Magoules
Abstract:
An asynchronous parallel version of the non-intrusive global-local coupling is implemented. The case of many patches, including those covering the entire structure, is studied. The asynchronism limits the dependency on communications, failures, and load imbalance. We detail the method and illustrate its performance in an academic case.
Submitted 19 July, 2022;
originally announced July 2022.
-
JACK2: a new high-level communication library for parallel iterative methods
Authors:
Guillaume Gbikpi-Benissan,
Frederic Magoules
Abstract:
In this paper, we address the problem of designing a distributed application meant to run both classical and asynchronous iterations. MPI libraries are very popular and widely used in the scientific community; however, asynchronous iterative methods raise non-negligible difficulties about the efficient management of communication requests and buffers. Moreover, a convergence detection issue is introduced, which requires the implementation of one of the various state-of-the-art termination methods, which are not necessarily highly reliable for most computational environments. We propose here an MPI-based communication library which handles all these issues in a non-intrusive manner, providing a unique interface for implementing both classical and asynchronous iterations. A few details are highlighted about our approach to achieve the best communication rates and ensure accurate convergence detection. Experimental results on two supercomputers confirmed the low communication overhead introduced, and the effectiveness of our library.
Submitted 30 June, 2022;
originally announced June 2022.
-
Distributed asynchronous convergence detection without detection protocol
Authors:
Guillaume Gbikpi-Benissan,
Frederic Magoules
Abstract:
In this paper, we address the problem of detecting the moment when an ongoing asynchronous parallel iterative process can be terminated to provide a sufficiently precise solution to a fixed-point problem being solved. Formulating the detection problem as a global solution identification problem, we analyze the snapshot-based approach, which is the only one that allows for exact global residual error computation. From a recently developed approximate snapshot protocol providing a reliable global residual error, we experimentally investigate here, as well, the reliability of a global residual error computed without any prior particular detection mechanism. Results on a single-site supercomputer successfully show that such high-performance computing platforms possibly provide computational environments stable enough to allow for simply resorting to non-blocking reduction operations for computing reliable global residual errors, which provides noticeable time saving, at both implementation and execution levels.
Submitted 30 June, 2022;
originally announced June 2022.
-
Iterative Krylov Methods for Acoustic Problems on Graphics Processing Unit
Authors:
Abal-Kassim Cheik Ahamed,
Frederic Magoules
Abstract:
This paper deals with linear algebra operations on Graphics Processing Unit (GPU) with complex number arithmetic using double precision. An analysis of their use within iterative Krylov methods is presented to solve acoustic problems. Numerical experiments performed on a set of acoustic matrices arising from the modelling of acoustic phenomena inside a car compartment are collected, and outline the performance, robustness and effectiveness of our algorithms, with speed-ups of up to 28x for dot product, 9.8x for sparse matrix-vector product and solvers.
Submitted 22 December, 2021;
originally announced December 2021.
-
Fast and Green Computing with Graphics Processing Units for solving Sparse Linear Systems
Authors:
Abal-Kassim Cheik Ahamed,
Alban Desmaison,
Frederic Magoules
Abstract:
In this paper, we aim to introduce a new perspective when comparing highly parallelized algorithms on GPU: the energy consumption of the GPU. We give an analysis of the performance of linear algebra operations, including addition of vectors, element-wise product, dot product and sparse matrix-vector product, in order to validate our experimental protocol. We also analyze their use within the conjugate gradient method for solving the gravity equations on Graphics Processing Unit (GPU). The Cusp library is considered and compared to our own implementation with a set of real matrices arising from the Chicxulub crater and obtained by the finite element discretization of the gravity equations. The experiments demonstrate the performance and robustness of our implementation in terms of energy efficiency.
Submitted 20 December, 2021;
originally announced December 2021.
-
Accelerated solution of Helmholtz equation with Iterative Krylov Methods on GPU
Authors:
Abal-Kassim Cheik Ahamed,
Frederic Magoules
Abstract:
This paper gives an analysis and an evaluation of linear algebra operations on Graphics Processing Unit (GPU) with complex number arithmetic in double precision. Knowing the performance of these operations, iterative Krylov methods are considered to solve the acoustic problem efficiently. Numerical experiments carried out on a set of acoustic matrices arising from the modelling of acoustic phenomena within a cylinder and a car compartment are presented, exhibiting the performance, robustness and efficiency of our algorithms, with a ratio of up to 27x for dot product, 10x for sparse matrix-vector product and solvers in complex double precision arithmetic.
Submitted 13 December, 2021;
originally announced December 2021.
-
Stochastic Optimized Schwarz Methods for the Gravity Equations on Graphics Processing Unit
Authors:
Abal-Kassim Cheik Ahamed,
Frederic Magoules
Abstract:
Low order, sequential or non-massively parallel finite elements are generally used for three-dimensional gravity modelling. In this paper, in order to obtain better gravity anomaly solutions in heterogeneous media, we solve the gravimetry problem using massively parallel high order finite elements on hybrid multi-CPU/GPU clusters. Parallel algorithms well suited for such hybrid architectures have to be designed. A new stochastic-based optimization procedure for the optimized Schwarz method is presented here, implemented and tuned for graphics processing units. Numerical experiments performed on a realistic test case demonstrate the robustness and efficiency of the proposed method and of its implementation on massive multi-CPU/GPU architectures.
Submitted 7 December, 2021;
originally announced December 2021.
-
On the stability and performance of the solution of sparse linear systems by partitioned procedures
Authors:
Abal-Kassim Cheik Ahamed,
Frederic Magoules
Abstract:
In this paper, we present, evaluate and analyse the performance of parallel synchronous Jacobi algorithms with different partitioning procedures, including band-row splitting, band-row sparsity pattern splitting and substructuring splitting, when solving large sparse linear systems. Numerical experiments performed on a set of academic 3D Laplace equations and on real gravity matrices arising from the Chicxulub crater are presented, and show the impact of splitting on parallel synchronous iterations when solving large sparse linear systems. The numerical results clearly show the benefit of substructuring methods compared to band-row splitting strategies.
Submitted 4 December, 2021;
originally announced December 2021.
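As a rough illustration of the band-row splitting mentioned in this abstract, the sketch below runs a synchronous block Jacobi iteration in which each contiguous band of rows solves its own diagonal block and takes the coupling to the other bands from the previous iterate. The function name `band_row_jacobi`, the dense NumPy storage, the stopping rule and the small tridiagonal test matrix are all assumptions made for this example; it is not the authors' implementation, and the bands are processed one after another here rather than on separate processors.

```python
import numpy as np


def band_row_jacobi(A, b, n_bands, tol=1e-10, max_iter=500):
    """Synchronous block Jacobi with contiguous row bands (illustrative only)."""
    n = A.shape[0]
    # Band boundaries: band i owns rows bounds[i] .. bounds[i + 1] - 1.
    bounds = np.linspace(0, n, n_bands + 1, dtype=int)
    x = np.zeros(n)
    for it in range(1, max_iter + 1):
        x_new = np.empty(n)
        for lo, hi in zip(bounds[:-1], bounds[1:]):
            # Right-hand side of the band: b_i minus the off-band coupling,
            # evaluated with the previous iterate x (hence "synchronous").
            rhs = b[lo:hi] - A[lo:hi, :] @ x + A[lo:hi, lo:hi] @ x[lo:hi]
            # Local solve with the diagonal block of the band.
            x_new[lo:hi] = np.linalg.solve(A[lo:hi, lo:hi], rhs)
        x = x_new
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            return x, it
    return x, max_iter


# Hypothetical test problem: a diagonally dominant tridiagonal matrix.
A = 4.0 * np.eye(12) - np.eye(12, k=1) - np.eye(12, k=-1)
b = np.ones(12)
x, iterations = band_row_jacobi(A, b, n_bands=3)
print(iterations, np.linalg.norm(b - A @ x))
```

In a distributed setting each band would live on its own process and only the entries of `x` coupled across band boundaries would need to be exchanged after every sweep.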
-
Coupling and Simulation of Fluid-Structure Interaction Problems for Automotive Sun-roof on Graphics Processing Unit
Authors:
Liang S. Lai,
Choi-Hong Lai,
Abal-Kassim Cheik Ahamed,
Frederic Magoules
Abstract:
In this paper, the authors propose an analysis of the frequency response function in a car compartment, subject to some fluctuating pressure distribution along the open cavity of the sun-roof at the top of a car. Coupling of a computational fluid dynamics code and a computational acoustics code is considered to simulate the acoustic fluid-structure interaction problem. Iterative Krylov methods and domain decomposition methods, tuned on Graphics Processing Unit (GPU), are considered to solve the acoustic problem with complex number arithmetic in double precision. Numerical simulations illustrate the efficiency, robustness and accuracy of the proposed approaches.
Submitted 30 November, 2021;
originally announced December 2021.
-
Asynchronous parareal time discretization for partial differential equations
Authors:
Frederic Magoules,
Guillaume Gbikpi-Benissan
Abstract:
Asynchronous iterations are increasingly investigated for both scaling and fault-resilience purposes on high performance computing platforms. While so far they have been exclusively applied within space domain decomposition frameworks, this paper advocates a novel application direction targeting time-decomposed time-parallel approaches. Specifically, an asynchronous iterative model is derived from the Parareal scheme, for which convergence and speedup analyses are then conducted. It turns out that Parareal and async-Parareal feature very close, asymptotically equivalent convergence conditions, including the finite-time termination property. Based on a computational cost model aware of unsteady communication delays, our speedup analysis shows the potential performance gain from asynchronous iterations, which is confirmed by an experimental case of heat evolution on a homogeneous supercomputer. This preliminary work clearly suggests possible further benefits from asynchronous iterations.
Submitted 20 October, 2021;
originally announced October 2021.
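For readers unfamiliar with the Parareal scheme that this abstract builds on, here is a minimal sketch of the classical, synchronous iteration $U_{n+1}^{k+1} = G(U_n^{k+1}) + F(U_n^k) - G(U_n^k)$. It does not implement the asynchronous variant studied in the paper. The scalar decay problem $y' = \lambda y$, the backward Euler propagators and the names `parareal`, `coarse` and `fine` are assumptions chosen for this example.

```python
LAM = -1.0  # hypothetical decay rate of the model problem y' = LAM * y


def coarse(u, dt):
    """Cheap propagator G: a single backward Euler step over dt."""
    return u / (1.0 - LAM * dt)


def fine(u, dt, substeps=50):
    """Accurate propagator F: many small backward Euler steps over dt."""
    h = dt / substeps
    for _ in range(substeps):
        u = u / (1.0 - LAM * h)
    return u


def parareal(u0, t0, t1, n_slices, n_iter):
    """Classical Parareal; returns the values at the time-slice boundaries."""
    dt = (t1 - t0) / n_slices
    # Initial guess: one sequential sweep with the coarse propagator.
    U = [u0]
    for n in range(n_slices):
        U.append(coarse(U[n], dt))
    for _ in range(n_iter):
        # These fine solves are independent and could run in parallel.
        F = [fine(U[n], dt) for n in range(n_slices)]
        G_old = [coarse(U[n], dt) for n in range(n_slices)]
        # Sequential correction sweep with the coarse propagator.
        U_new = [u0]
        for n in range(n_slices):
            U_new.append(coarse(U_new[n], dt) + F[n] - G_old[n])
        U = U_new
    return U


print(parareal(1.0, 0.0, 1.0, n_slices=5, n_iter=2))
```

With as many iterations as time slices, the result coincides (up to rounding) with a sequential run of the fine propagator, which is the finite-time termination property the abstract refers to.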
-
Interactive simulation for easy decision-making in fluid dynamics
Authors:
Mengchen Wang,
Nicolas Férey,
Frédéric Magoulès,
Patrick Bourdot
Abstract:
A conventional study of fluid simulation involves different stages including conception, simulation, visualization, and analysis tasks. It is, therefore, necessary to switch between different software and interactive contexts which implies costly data manipulation and increases the time needed for decision making. Our interactive simulation approach was designed to shorten this loop, allowing users to visualize and steer a simulation in progress without waiting for the end of the simulation. The methodology allows the users to control, start, pause, or stop a simulation in progress, to change global physical parameters, to interact with its 3D environment by editing boundary conditions such as walls or obstacles. This approach is made possible by using a methodology such as the Lattice Boltzmann Method (LBM) to achieve interactive time while remaining physically relevant. In this work, we present our platform dedicated to interactive fluid simulation based on LBM. The contribution of our interactive simulation approach to decision making will be evaluated in a study based on a simple but realistic use case.
Submitted 20 October, 2021;
originally announced October 2021.
-
Power Consumption Analysis of Parallel Algorithms on GPUs
Authors:
Frédéric Magoulès,
Abal-Kassim Cheik Ahamed,
Alban Desmaison,
Jean-Christophe Léchenet,
François Mayer,
Haifa Ben Salem,
Thomas Zhu
Abstract:
Due to their highly parallel multi-core architecture, GPUs are being increasingly used in a wide range of computationally intensive applications. Compared to CPUs, GPUs can achieve higher performance when accelerating program execution in an energy-efficient way. Therefore GPGPU computing is useful for high performance computing applications and in many scientific research fields. In order to bring further performance improvements, GPU clusters are increasingly adopted. The energy consumed by GPUs cannot be neglected. Therefore, an energy-efficient time scheduling of the programs to be executed on the parallel GPUs, based on their deadlines as well as their assigned priorities, could be deployed to curb their energy demand. For this reason, we present in this paper a model enabling the measurement of the power consumption and the execution time of some elementary operations running on a single GPU, using a newly developed energy measurement protocol. Consequently, using our methodology, the energy needs of a program could be predicted, allowing better task scheduling.
Submitted 28 September, 2021;
originally announced October 2021.
-
Parallel Sub-Structuring Methods for solving Sparse Linear Systems on a cluster of GPU
Authors:
Abal-Kassim Cheik Ahamed,
Frédéric Magoulès
Abstract:
The main objective of this work is to analyze the sub-structuring method for the parallel solution of sparse linear systems with matrices arising from the discretization of partial differential equations by methods such as finite elements, finite volumes and finite differences. With the success encountered by general-purpose processing on graphics processing units (GPGPU), we develop a hybrid multi-GPU and CPU sub-structuring algorithm. GPU computing, with CUDA, is used to accelerate the operations performed on each processor. Numerical experiments have been performed on a set of matrices arising from engineering problems. We compare a C+MPI implementation on a classical CPU cluster with C+MPI+CUDA on a cluster of GPUs. The performance comparison shows a speed-up for the sub-structuring method of up to 19 times in double precision by using CUDA.
Submitted 8 August, 2021;
originally announced August 2021.
-
On the existence of optimal shapes in architecture
Authors:
Michael Hinz,
Frédéric Magoulès,
Anna Rozanova-Pierrat,
Marina Rynkovskaya,
Alexander Teplyaev
Abstract:
We consider shape optimization problems for elasticity systems in architecture. A typical question in this context is to identify a structure of maximal stability close to an initially proposed one. We show the existence of such an optimally shaped structure within classes of bounded Lipschitz domains and within wider classes of bounded uniform domains with boundaries that may be fractal. In the first case the optimal shape realizes the infimum of a given energy functional over the class, in the second case it realizes the minimum. As a concrete application we discuss the existence of maximally stable roof structures under snow loads.
Submitted 5 October, 2020;
originally announced October 2020.
-
Optimal absorption of acoustical waves by a boundary
Authors:
Frédéric Magoulès,
Thi Phuong Kieu Nguyen,
Pascal Omnes,
Anna Rozanova-Pierrat
Abstract:
With the aim of finding the simplest and most efficient shape of a noise absorbing wall to dissipate the acoustical energy of a sound wave, we consider a frequency model described by the Helmholtz equation with a damping on the boundary. The well-posedness of the model is shown in a class of domains with d-set boundaries ($N - 1 \le d < N$). We introduce a class of admissible Lipschitz boundaries, in which an optimal shape of the wall exists in the following sense: We prove the existence of a Radon measure on this shape, greater than or equal to the usual Lebesgue measure, for which the corresponding solution of the Helmholtz problem realizes the infimum of the acoustic energy defined with the Lebesgue measure on the boundary. If this Radon measure coincides with the Lebesgue measure, the corresponding solution realizes the minimum of the energy. For a fixed porous material, considered as an acoustic absorbent, we derive the damping parameters of its boundary from the corresponding time-dependent problem described by the damped wave equation (damping in volume).
Submitted 22 July, 2020; v1 submitted 30 March, 2020;
originally announced March 2020.
-
Spectral domain decomposition method for physically-based rendering of photochromic/electrochromic glass windows
Authors:
Guillaume Gbikpi-Benissan,
Patrick Callet,
Frederic Magoules
Abstract:
This paper covers the time-consuming issues intrinsic to physically-based image rendering algorithms. First, the optical properties of glass materials were measured on samples of real glasses, and the materials of other objects inside a hotel room were characterized by deducing spectral data from multiple trichromatic images. We then present the rendering model and ray-tracing algorithm implemented in Virtuelium, an open source software. In order to accelerate the computation of the interactions between light rays and objects, the ray-tracing algorithm is parallelized by means of domain decomposition techniques. Numerical experiments show that the speedups obtained with classical parallelization techniques are significantly smaller than those achieved with parallel domain decomposition methods.
Submitted 9 December, 2019;
originally announced December 2019.
-
Spectral Domain Decomposition Method for Natural Lighting and Medieval Glass Rendering
Authors:
Guillaume Gbikpi-Benissan,
Remi Cerise,
Patrick Callet,
Frederic Magoules
Abstract:
In this paper, we use an original ray-tracing domain decomposition method to address image rendering of naturally lighted scenes. This new method makes it possible, in particular, to analyze rendering problems on parallel architectures in the case of interactions between light rays and glass material. Numerical experiments, for medieval glass rendering within the church of the Royaumont abbey, illustrate the performance of the proposed ray-tracing domain decomposition method (DDM) on multi-core and multi-processor architectures. On the one hand, applying domain decomposition techniques increases the speedups obtained by parallelizing the computation. On the other hand, for a fixed number of parallel processes, we notice that speedups increase as the number of sub-domains does.
Submitted 9 December, 2019;
originally announced December 2019.
-
Interactive 3D fluid simulation: steering the simulation in progress using Lattice Boltzmann Method
Authors:
Mengchen Wang,
Nicolas Ferey,
Patrick Bourdot,
Frederic Magoules
Abstract:
This paper describes work in progress on a software and hardware architecture to steer and control an ongoing fluid simulation in the context of a serious game application. We propose to use the Lattice Boltzmann Method as the simulation approach because it can provide fully parallel algorithms to reach interactive time and because, compared with more classical simulation approaches, it is easier to change parameters while the simulation is in progress and still remain physically relevant. We describe which parameters we can modify and how we solve technical issues of interactive steering, and we finally show an application of our interactive fluid simulation approach to water dam phenomena.
Submitted 9 December, 2019;
originally announced December 2019.
-
Using asynchronous simulation approach for interactive simulation
Authors:
Mengchen Wang,
Nicolas Ferey,
Patrick Bourdot,
Frederic Magoules
Abstract:
This paper discusses the advantage of using asynchronous simulation in the case of interactive simulation, in which the user can steer and control parameters during a simulation in progress. Asynchronous models allow each iteration to be computed faster to address the performance issues of a highly interactive context, and our hypothesis is that getting partial results faster is better than getting synchronized and final results for taking a decision in an interactive simulation context.
Submitted 9 December, 2019;
originally announced December 2019.
-
Spectral domain decomposition method for physically-based rendering of Royaumont abbey
Authors:
Guillaume Gbikpi-Benissan,
Patrick Callet,
Frederic Magoules
Abstract:
In the context of a virtual reconstitution of the destroyed Royaumont abbey church, this paper investigates computer science issues intrinsic to physically-based image rendering. First, a virtual model was designed from historical sources and archaeological descriptions. Then the physical properties of some materials were measured on remains of the church and on pieces from similar ancient churches. We specify the properties of our lighting source, which is a representation of the sun, and present the rendering algorithm implemented in our software Virtuelium. In order to accelerate the computation of the interactions between light rays and objects, this ray-tracing algorithm is parallelized by means of domain decomposition techniques. Numerical experiments show that the computational time saved by a classic parallelization is much less significant than that gained with our approach.
Submitted 9 December, 2019;
originally announced December 2019.
-
Beam-tracing domain decomposition method for urban acoustic pollution
Authors:
Guillaume Gbikpi-Benissan,
Frederic Magoules
Abstract:
This paper covers the fast solution of large acoustic problems on low-resource parallel platforms. A domain decomposition method is coupled with a dynamic load balancing scheme to efficiently accelerate a geometrical acoustic method. The geometrical method studied implements a beam-tracing method where intersections are handled as in a ray-tracing method. Beyond the distribution of the global processing upon multiple sub-domains, a second parallelization level is operated by means of multi-threading and shared memory mechanisms.
Numerical experiments show that this method makes it possible to handle large scale open domains for parallel computing purposes on few machines. Urban acoustic pollution arising from car traffic was simulated on a large model of the Shinjuku district of Tokyo, Japan. The good speed-up results illustrate the performance of this new domain decomposition method.
Submitted 9 December, 2019;
originally announced December 2019.
-
Coarse Space Correction for Graphic Analysis
Authors:
Guillaume Gbikpi-Benissan,
Frederic Magoules
Abstract:
In this paper we present an effective coarse space correction designed to accelerate the solution of an algebraic linear system. The system arises from the formulation of the problem of interpolating scattered data by means of Radial Basis Functions. Radial Basis Functions are commonly used for interpolating scattered data during the image reconstruction process in graphic analysis. This requires solving a linear system of equations for each color component, and this process represents the most time-consuming operation. Several basis functions, such as trigonometric, exponential, Gaussian and polynomial functions, are investigated here to construct a suitable coarse space correction to speed up the solution of the linear system. Numerical experiments outline the superiority of some functions for the fast iterative solution of the image reconstruction problem.
Submitted 9 December, 2019;
originally announced December 2019.
-
On Extensions of Limited Memory Steepest Descent Method
Authors:
Qinmeng Zou,
Frederic Magoules
Abstract:
We present some extensions to the limited memory steepest descent method based on spectral properties and cyclic iterations. Our aim is to show that it is possible to combine sweep and delayed strategies for improving the performance of gradient methods. Numerical results are reported which indicate that our new methods are better than the original version. Some remarks on the stability and parallel implementation are shown in the end.
Submitted 3 December, 2019;
originally announced December 2019.
-
Recent Developments in Iterative Methods for Reducing Synchronization
Authors:
Qinmeng Zou,
Frederic Magoules
Abstract:
On modern parallel architectures, the cost of synchronization among processors can often dominate the cost of floating-point computation. Several modifications of the existing methods have been proposed in order to keep the communication cost as low as possible. This paper aims at providing a brief overview of recent advances in parallel iterative methods for solving large-scale problems. We refer the reader to the related references for more details on the derivation, implementation, performance, and analysis of these techniques.
Submitted 2 December, 2019;
originally announced December 2019.
-
Parameter Estimation in the Hermitian and Skew-Hermitian Splitting Method Using Gradient Iterations
Authors:
Qinmeng Zou,
Frederic Magoules
Abstract:
This paper presents enhancement strategies for the Hermitian and skew-Hermitian splitting method based on gradient iterations. The spectral properties are exploited for the parameter estimation, often resulting in a better convergence. In particular, steepest descent with early stopping can generate a rough estimate of the optimal parameter. This is better than an arbitrary choice since the latter often causes stability problems or slow convergence. Additionally, lagged gradient methods are considered as inner solvers for the splitting method. Experiments show that they are competitive with conjugate gradient in low precision.
Submitted 3 September, 2019;
originally announced September 2019.
-
Fast Gradient Methods with Alignment for Symmetric Linear Systems without Using Cauchy Step
Authors:
Qinmeng Zou,
Frederic Magoules
Abstract:
The performance of gradient methods has been considerably improved by the introduction of delayed parameters. After two and a half decades, the revealing of second-order information has recently given rise to the Cauchy-based methods with alignment, which reduce asymptotically the search spaces in smaller and smaller dimensions. They are generally considered as the state of the art of gradient methods. This paper reveals the spectral properties of minimal gradient and asymptotically optimal steps, and then suggests three fast methods with alignment without using the Cauchy step. The convergence results are provided, and numerical experiments show that the new methods provide competitive and more stable alternatives to the classical Cauchy-based methods. In particular, alignment gradient methods present advantages over the Krylov subspace methods in some situations, which makes them attractive in practice.
Submitted 14 September, 2019; v1 submitted 3 September, 2019;
originally announced September 2019.
-
Asynchronous Time-Parallel Method based on Laplace Transform
Authors:
Frederic Magoules,
Qinmeng Zou
Abstract:
The Laplace transform method has proved to be very efficient and easy to parallelize for the solution of time-dependent problems. However, the synchronization delay among processors implies an upper bound on the expectable acceleration factor, which leads to a lot of wasted time. In this paper, we propose an original asynchronous Laplace transform method formalized for quasilinear problems based on the well-known Gaver-Stehfest algorithm. Parallel experiments show the convergence of our new method, as well as several interesting properties compared with the classical algorithms.
Submitted 3 September, 2019;
originally announced September 2019.
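The Gaver-Stehfest algorithm named in this abstract approximates an inverse Laplace transform as $f(t) \approx \frac{\ln 2}{t} \sum_{k=1}^{N} V_k \, F\!\left(\frac{k \ln 2}{t}\right)$ with $N$ even. The sketch below shows only this classical, serial inversion formula, not the asynchronous method proposed in the paper. The function names, the default $N = 12$ and the test transform $F(s) = 1/(s+1)$ are assumptions made for this example.

```python
import math
from fractions import Fraction


def stehfest_weights(n_terms):
    """Stehfest coefficients V_1 .. V_N (N even), built with exact fractions."""
    if n_terms % 2 != 0:
        raise ValueError("n_terms must be even")
    half = n_terms // 2
    weights = []
    for k in range(1, n_terms + 1):
        total = Fraction(0)
        for j in range((k + 1) // 2, min(k, half) + 1):
            total += Fraction(
                j**half * math.factorial(2 * j),
                math.factorial(half - j)
                * math.factorial(j)
                * math.factorial(j - 1)
                * math.factorial(k - j)
                * math.factorial(2 * j - k),
            )
        weights.append(float((-1) ** (k + half) * total))
    return weights


def gaver_stehfest(F, t, n_terms=12):
    """Approximate f(t) from samples of its Laplace transform F on the real axis."""
    a = math.log(2.0) / t
    V = stehfest_weights(n_terms)
    # Each evaluation F(k * a) is independent of the others.
    return a * sum(V[k - 1] * F(k * a) for k in range(1, n_terms + 1))


# Hypothetical check: F(s) = 1 / (s + 1) is the transform of exp(-t).
approx = gaver_stehfest(lambda s: 1.0 / (s + 1.0), t=1.0)
print(approx, math.exp(-1.0))
```

Because the weights alternate in sign and grow quickly with $N$, larger values of `n_terms` tend to lose accuracy in double precision, so a moderate even value is the usual choice.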
-
GPU Accelerated Contactless Human Machine Interface for Driving Car
Authors:
Frederic Magoules,
Qinmeng Zou
Abstract:
In this paper we present an original contactless human machine interface for driving a car. The proposed framework is based on the image sent by a simple camera device, which is then processed by various computer vision algorithms. These algorithms allow the isolation of the user's hand on the camera frame and translate its movements into orders sent to the computer in a real time process. The optimization of the implemented algorithms on graphics processing unit leads to real time interaction between the user, the computer and the machine. The user can easily modify or create the interfaces displayed by the proposed framework to fit his personal needs. A contactless car driving interface is produced here to illustrate the principle of our framework.
Submitted 9 July, 2019;
originally announced July 2019.
-
A Novel Contactless Human Machine Interface based on Machine Learning
Authors:
Frederic Magoules,
Qinmeng Zou
Abstract:
This paper describes a global framework that enables contactless human machine interaction using computer vision and machine learning techniques. The main originality of our framework is that only a very simple image acquisition device, such as a computer camera, is sufficient to establish a human machine interaction as rich as that of traditional devices such as a mouse or keyboard. This framework is based on well known computer vision techniques, and efficient machine learning techniques are used to detect and track user hand gestures so that the end user can control his computer using virtual interfaces with very simple gestures.
Submitted 9 July, 2019;
originally announced July 2019.
-
Convergence Detection of Asynchronous Iterations based on Modified Recursive Doubling
Authors:
Qinmeng Zou,
Frederic Magoules
Abstract:
This paper addresses the distributed convergence detection problem in asynchronous iterations. A modified recursive doubling algorithm is investigated in order to adapt to the non-power-of-two case. Some convergence detection algorithms are illustrated based on the reduction operation. Finally, a concluding discussion about the implementation and the applicability is presented.
Submitted 2 July, 2019;
originally announced July 2019.
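To make the non-power-of-two issue in this abstract concrete, the sketch below simulates, in a single process, one common way of extending recursive doubling to an arbitrary number of ranks: the surplus ranks first fold their values into low ranks, the remaining power-of-two group runs the usual pairwise exchanges, and the result is then copied back. This is a textbook-style variant chosen for illustration and is not necessarily the modification investigated in the paper; the function name and the list-based "ranks" are assumptions, and the operator is assumed to be commutative and associative.

```python
def allreduce_recursive_doubling(values, op):
    """Simulate an all-reduce over len(values) ranks (at least one rank)."""
    p = len(values)
    p2 = 1 << (p.bit_length() - 1)  # largest power of two not exceeding p
    extra = p - p2
    buf = list(values)
    # Step 1: each surplus rank p2 + r sends its value to rank r.
    for r in range(extra):
        buf[r] = op(buf[r], buf[p2 + r])
    # Step 2: recursive doubling among ranks 0 .. p2 - 1; at each stage
    # rank r exchanges with partner r XOR dist and both combine the values.
    dist = 1
    while dist < p2:
        buf[:p2] = [op(buf[r], buf[r ^ dist]) for r in range(p2)]
        dist *= 2
    # Step 3: the surplus ranks receive the final result back.
    for r in range(extra):
        buf[p2 + r] = buf[r]
    return buf


# Convergence detection as a logical AND of local flags on 5 ranks.
local_flags = [True, True, False, True, True]
print(allreduce_recursive_doubling(local_flags, lambda a, b: a and b))
```

Used with a logical AND, every rank ends up knowing whether all ranks have reported local convergence, which is the kind of reduction a convergence detection procedure relies on.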
-
A New Cyclic Gradient Method Adapted to Large-Scale Linear Systems
Authors:
Qinmeng Zou,
Frederic Magoules
Abstract:
This paper proposes a new gradient method for solving large-scale problems. Theoretical analysis shows that the new method has the finite termination property in two dimensions and converges R-linearly in any dimension. Experimental results first illustrate the issue of parallel implementation. Then, the solution of a large-scale problem shows that the new method is better than the others and is even competitive with the conjugate gradient method.
Submitted 2 July, 2019;
originally announced July 2019.
-
Asynchronous Communications Library for the Parallel-in-Time Solution of Black-Scholes Equation
Authors:
Qinmeng Zou,
Guillaume Gbikpi-Benissan,
Frederic Magoules
Abstract:
The advent of asynchronous iterative schemes brings high efficiency to numerical computations. However, the problems of resource management and convergence detection are generally difficult to handle. This paper uses JACK2, an asynchronous communication kernel library for iterative algorithms, to implement both classical and asynchronous parareal algorithms, especially the latter. We illustrate measures by which these problems can be tackled elegantly in the time-dependent case. Finally, experiments are presented to demonstrate the availability and efficiency of this application.
Submitted 2 July, 2019;
originally announced July 2019.
-
Asynchronous Parareal Algorithm Applied to European Option Pricing
Authors:
Qinmeng Zou,
Guillaume Gbikpi-Benissan,
Frederic Magoules
Abstract:
Asynchronous iterations arise naturally in parallel computing when one wants to solve large problems while minimizing idle times. This paper presents an original model of asynchronous iterations for a time-domain decomposition method, namely the parareal method. The asynchronous parareal algorithm is applied here to European option pricing, and numerical experiments performed on a parallel supercomputer illustrate the performance and efficiency of this new method.
Submitted 2 July, 2019;
originally announced July 2019.