Search | arXiv e-print repository

doi 10.1016/j.advengsoft.2024.103758

Physics-Informed Graph-Mesh Networks for PDEs: A hybrid approach for complex problems

Authors: Marien Chenaud, Frédéric Magoulès, José Alves

Abstract: The recent rise of deep learning has led to numerous applications, including solving partial differential equations using Physics-Informed Neural Networks. This approach has proven highly effective in several academic cases. However, their lack of physical invariances, coupled with other significant weaknesses, such as an inability to handle complex geometries or their lack of generalization capabilities, makes them unable to compete with classical numerical solvers in industrial settings. In this work, a limitation regarding the use of automatic differentiation in the context of physics-informed learning is highlighted. A hybrid approach combining physics-informed graph neural networks with numerical kernels from finite elements is introduced. After studying the theoretical properties of our model, we apply it to complex geometries, in two and three dimensions. Our choices are supported by an ablation study, and we evaluate the generalisation capacity of the proposed approach.

Submitted 25 September, 2024; originally announced October 2024.

arXiv:2409.06292 [pdf, other]

Frequency range non-Lipschitz parametric optimization of a noise absorption

Authors: Frederic Magoules, Mathieu Menoux, Anna Rozanova-Pierrat

Abstract: In the framework of the optimal wave energy absorption, we solve theoretically and numerically a parametric shape optimization problem to find the optimal distribution of absorbing material in the reflexive one defined by a characteristic function in the Robin-type boundary condition associated with the Helmholtz equation. The Robin boundary condition can be given on a part or the whole boundary of a bounded $(\varepsilon, \infty)$-domain of $\mathbb{R}^n$. The geometry of the partially absorbing boundary is fixed, but allowed to be non-Lipschitz, for example, fractal. It is defined as the support of a $d$-upper regular measure with $d \in ]n-2, n[$. Using the well-posedness properties of the model, for any fixed volume fraction of the absorbing material, we establish the existence of at least one optimal distribution minimizing the acoustical energy on a fixed frequency range of the relaxation problem. Thanks to the shape derivative of the energy functional, also existing for non-Lipschitz boundaries, we implement (in the two-dimensional case) the gradient descent method and find the optimal distribution with 50% of the absorbent material on a frequency range with better performances than the 100% absorbent boundary. The same type of performance is also obtained by the genetic method.

Submitted 10 September, 2024; originally announced September 2024.

arXiv:2312.17558 [pdf, other]

doi 10.1109/TPDS.2017.2780856

Distributed convergence detection based on global residual error under asynchronous iterations

Authors: Frédéric Magoulès, Guillaume Gbikpi-Benissan

Abstract: Convergence of classical parallel iterations is detected by performing a reduction operation at each iteration in order to compute a residual error relative to a potential solution vector. To efficiently run asynchronous iterations, blocking communication requests are avoided, which makes it hard to isolate and handle any global vector. While some termination protocols were proposed for asynchronous iterations, only very few of them are based on global residual computation and guarantee effective convergence. But the most effective and efficient existing solutions feature two reduction operations, which constitutes an important factor of termination delay. In this paper, we present new, non-intrusive, protocols to compute a residual error under asynchronous iterations, requiring only one reduction operation. Various communication models show that some heuristics can even be introduced and formally evaluated. Extensive experiments with up to 5600 processor cores confirm the practical effectiveness and efficiency of our approach.

Submitted 29 December, 2023; originally announced December 2023.

arXiv:2312.16505 [pdf, ps, other]

doi 10.1080/00207160.2021.1952572

Asynchronous iterations of HSS method for non-Hermitian linear systems

Authors: Guillaume Gbikpi-Benissan, Qinmeng Zou, Frédéric Magoulès

Abstract: A general asynchronous alternating iterative model is designed, for which convergence is theoretically ensured both under classical spectral radius bound and, then, for a classical class of matrix splittings for $\mathsf H$-matrices. The computational model can be thought of as a two-stage alternating iterative method, which is well suited to the well-known Hermitian and skew-Hermitian splitting (HSS) approach, with the particularity here of considering only one inner iteration. Experimental parallel performance comparison is conducted between the generalized minimal residual (GMRES) algorithm, the standard HSS and our asynchronous variant, on both real and complex non-Hermitian linear systems respectively arising from convection-diffusion and structural dynamics problems. A significant gain on execution time is observed in both cases.

Submitted 27 December, 2023; originally announced December 2023.
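
For readers unfamiliar with the splitting named in the abstract above, the following is a minimal sketch of the classical, synchronous HSS iteration only. It is not the asynchronous model of the paper. The function name `hss_solve`, the shift `alpha`, the dense direct inner solves, and the 3x3 test matrix are all illustrative assumptions.

```python
import numpy as np

def hss_solve(A, b, alpha=1.0, tol=1e-10, max_iter=500):
    """Classical (synchronous) HSS iteration for A x = b.

    Splits A = H + S with H Hermitian and S skew-Hermitian, then
    alternates two shifted solves per iteration. alpha > 0 is a
    user-chosen shift (an assumption here, not a tuned value).
    """
    n = A.shape[0]
    H = (A + A.conj().T) / 2   # Hermitian part
    S = (A - A.conj().T) / 2   # skew-Hermitian part
    I = np.eye(n)
    x = np.zeros(n, dtype=np.result_type(A.dtype, b.dtype, np.float64))
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        # First half-step: solve with the shifted Hermitian part.
        x_half = np.linalg.solve(alpha * I + H, (alpha * I - S) @ x + b)
        # Second half-step: solve with the shifted skew-Hermitian part.
        x = np.linalg.solve(alpha * I + S, (alpha * I - H) @ x_half + b)
        # Stop on the relative residual of the original system.
        if np.linalg.norm(b - A @ x) <= tol * b_norm:
            return x, k + 1
    return x, max_iter

# Small non-symmetric test system whose Hermitian part is positive definite.
A = np.array([[4.0, 2.0, 0.0],
              [0.0, 3.0, 1.0],
              [-1.0, 0.0, 5.0]])
b = np.array([1.0, 2.0, 3.0])
x, iters = hss_solve(A, b, alpha=3.0)
```

The inner systems are solved exactly here for simplicity; the abstract describes a variant with a single inner iteration, which this sketch does not reproduce.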

arXiv:2312.14715 [pdf, other]

doi 10.21136/AM.2022.0146-21

Resilient asynchronous primal Schur method

Authors: Guillaume Gbikpi-Benissan, Frédéric Magoulès

Abstract: This paper introduces the application of the asynchronous iterations theory within the framework of the primal Schur domain decomposition method. A suitable relaxation scheme is designed, whose asynchronous convergence is established under classical spectral radius conditions. For the usual case where the local Schur complement matrices are not constructed, suitable splittings only based on explicitly generated matrices are provided. Numerical experiments are conducted on a supercomputer for both Poisson's and linear elasticity problems. The asynchronous Schur solver outperformed the classical conjugate-gradient-based one in case of compute node failures.

Submitted 22 December, 2023; originally announced December 2023.

arXiv:2312.12053 [pdf, ps, other]

doi 10.1137/21M1432107

Asynchronous multiplicative coarse-space correction

Authors: Guillaume Gbikpi-Benissan, Frédéric Magoulès

Abstract: This paper introduces the multiplicative variant of the recently proposed asynchronous additive coarse-space correction method. Definition of an asynchronous extension of multiplicative correction is not straightforward; however, our analysis allows for usual asynchronous programming approaches. General asynchronous iterative models are explicitly devised both for shared or replicated coarse problems and for centralized or distributed ones. Convergence conditions are derived and shown to be satisfied for M-matrices, as also done for the additive case. Implementation aspects are discussed, which reveal the need for non-blocking synchronization for building the successive right-hand-side vectors of the coarse problem. Optionally, a parameter allows for applying each coarse solution a maximum number of times, which has an impact on the algorithm efficiency. Numerical results on a high-speed homogeneous cluster confirm the practical efficiency of the asynchronous two-level method over its synchronous counterpart, even when it is not the case for the underlying one-level methods.

Submitted 19 December, 2023; originally announced December 2023.

arXiv:2310.14948 [pdf, other]

doi 10.4203/ccc.5.4.2

Physics-Informed Graph Convolutional Networks: Towards a generalized framework for complex geometries

Authors: Marien Chenaud, José Alves, Frédéric Magoulès

Abstract: Since the seminal work of [9] and their Physics-Informed neural networks (PINNs), many efforts have been conducted towards solving partial differential equations (PDEs) with Deep Learning models. However, some challenges remain, for instance the extension of such models to complex three-dimensional geometries, and a study on how such approaches could be combined with classical numerical solvers. In this work, we justify the use of graph neural networks for these problems, based on the similarity between these architectures and the meshes used in traditional numerical techniques for solving partial differential equations. After proving an issue with the Physics-Informed framework for complex geometries, during the computation of PDE residuals, an alternative procedure is proposed, by combining classical numerical solvers and the Physics-Informed framework. Finally, we propose an implementation of this approach, which we test on a three-dimensional problem on an irregular geometry.

Submitted 24 November, 2023; v1 submitted 20 October, 2023; originally announced October 2023.

Journal ref: Civil-Comp Conferences, Volume 5, Paper 4.2, Civil-Comp Press, Edinburgh, United Kingdom, 2023

arXiv:2310.14707 [pdf, other]

A Hybrid GNN approach for predicting node data for 3D meshes

Authors: Shwetha Salimath, Francesca Bugiotti, Frederic Magoules

Abstract: Metal forging is used to manufacture dies. We require the best set of input parameters for the process to be efficient. Currently, we predict the best parameters using the finite element method by generating simulations for the different initial conditions, which is a time-consuming process. In this paper, we introduce a hybrid approach that helps in processing and generating new data simulations using a surrogate graph neural network model based on graph convolutions, having a cheaper time cost. Given a dataset representing meshes, our focus is on the conversion of the available information into a graph or point cloud structure. This new representation enables deep learning. The predicted result is similar, with a low error when compared to that produced using the finite element method. The new models have outperformed existing PointNet and simple graph neural network models when applied to produce the simulations.

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2310.13494 [pdf, other]

Enhancing the Global/Local Coupling Method: An Asynchronous Parallel Framework

Authors: Ahmed El Kerim, Pierre Gosselet, Frédéric Magoulès

Abstract: A novel approach is being developed to introduce a parallel asynchronous implementation of non-intrusive global-local coupling. This study examines scenarios involving numerous patches, including those covering the entire structure. By leveraging asynchrony, the method aims to minimize reliance on communication, handle failures effectively, and address load imbalances. Detailed insights into the methodology are presented, accompanied by a demonstration of its performance through an academic case study.

Submitted 20 October, 2023; originally announced October 2023.

arXiv:2310.12605 [pdf, ps, other]

doi 10.4203/ccc.4.1.1

Accurate Coarse Residual for Two-Level Asynchronous Domain Decomposition Methods

Authors: Guillaume Gbikpi-Benissan, Frédéric Magoulès

Abstract: Recently, asynchronous coarse-space correction has been achieved within both the overlapping Schwarz and the primal Schur frameworks. Both additive and multiplicative corrections have been discussed. In this paper, we address some implementation drawbacks of the proposed additive correction scheme. In the existing approach, each coarse solution is applied only once, leaving most of the iterations of the solver without coarse-space information while building the right-hand side of the coarse problem. Moreover, one-sided routines of the Message Passing Interface (MPI) standard were considered, which introduced the need for a sleep statement in the iterations loop of the coarse solver. This implies a tuning of the sleep period, which is a non-discrete quantity. In this paper, we improve the accuracy of the coarse right-hand side, which allowed for more frequent corrections. In addition, we highlight a two-sided implementation which better suits the asynchronous coarse-space correction scheme. Numerical experiments show a significant performance gain with such increased incorporation of the coarse space.

Submitted 19 October, 2023; originally announced October 2023.

arXiv:2211.13073 [pdf, other]

doi 10.1016/j.cma.2023.115910

Asynchronous global-local non-invasive coupling for linear elliptic problems

Authors: Ahmed El Kerim, Pierre Gosselet, Frédéric Magoulès

Abstract: This paper presents the first asynchronous version of the Global/Local non-invasive coupling, capable of dealing efficiently with multiple, possibly adjacent, patches. We give a new interpretation of the coupling in terms of primal domain decomposition method, and we prove the convergence of the relaxed asynchronous iteration. The asynchronous paradigm lifts many bottlenecks of the Global/Local coupling performance. We illustrate the method on several linear elliptic problems as encountered in thermal and elasticity studies.

Submitted 23 November, 2022; originally announced November 2022.

arXiv:2211.10073 [pdf, other]

Point-Cloud-based Deep Learning Models for Finite Element Analysis

Authors: Meduri Venkata Shivaditya, Francesca Bugiotti, Frederic Magoules

Abstract: In this paper, we explore point-cloud based deep learning models to analyze numerical simulations arising from finite element analysis. The objective is to classify automatically the results of the simulations without tedious human intervention. Two models are here presented: the Point-Net classification model and the Dynamic Graph Convolutional Neural Net model. Both trained point-cloud deep learning models performed well on experiments with finite element analysis arising from the automotive industry. The proposed models show promise in automating the analysis process of finite element simulations. An accuracy of 79.17% and 94.5% is obtained for the Point-Net and the Dynamic Graph Convolutional Neural Net model respectively.

Submitted 18 November, 2022; originally announced November 2022.

arXiv:2211.09380 [pdf, other]

Multilayer Perceptron-based Surrogate Models for Finite Element Analysis

Authors: Lawson Oliveira Lima, Julien Rosenberger, Esteban Antier, Frederic Magoules

Abstract: Many Partial Differential Equations (PDEs) do not have an analytical solution, and can only be solved by numerical methods. In this context, Physics-Informed Neural Networks (PINNs) have become important in the last decades, since they use a neural network and physical conditions to approximate any function. This paper focuses on hypertuning of a PINN, used to solve a PDE. The behavior of the approximated solution when we change the learning rate or the activation function (sigmoid, hyperbolic tangent, GELU, ReLU and ELU) is here analyzed. A comparative study is done to determine the best characteristics in the problem, as well as to find a learning rate that allows fast and satisfactory learning. GELU and hyperbolic tangent activation functions exhibit better performance than other activation functions. A suitable choice of the learning rate results in higher accuracy and faster convergence.

Submitted 17 November, 2022; originally announced November 2022.

arXiv:2211.09373 [pdf, other]

Graph Neural Network-based Surrogate Models for Finite Element Analysis

Authors: Meduri Venkata Shivaditya, José Alves, Francesca Bugiotti, Frederic Magoules

Abstract: Current simulations of metal forging processes use advanced finite element methods. Such methods consist of solving mathematical equations, which takes a significant amount of time for the simulation to complete. Computational time can be prohibitive for parametric response surface exploration tasks. In this paper, we propose, as an alternative, a Graph Neural Network-based graph prediction model to act as a surrogate model for parameter search space exploration, which exhibits a time cost reduced by an order of magnitude. Numerical experiments show that this new model outperforms the Point-Net model and the Dynamic Graph Convolutional Neural Net model.

Submitted 17 November, 2022; originally announced November 2022.

arXiv:2208.03707 [pdf, other]

Asynchronous scalable version of the Global-Local non-invasive coupling

Authors: Ahmed El Kerim, Pierre Gosselet, Frederic Magoules

Abstract: The Global-Local non-invasive coupling is an improvement of the submodeling technique, which makes it possible to locally enhance structure computations by introducing patches with refined models and to take into account all the interactions. In order to circumvent its inherently limited computational performance, we propose and implement an asynchronous version of the method. The asynchronous coupling reduces the dependency on communications, failures, and load imbalance. We present the theory and the implementation of the method in the linear case and illustrate its performance on academic cases inspired by actual industrial problems.

Submitted 7 August, 2022; originally announced August 2022.

arXiv:2207.09159 [pdf, other]

Couplage Global-Local en asynchrone pour des problèmes linéaires

Authors: Ahmed El Kerim, Pierre Gosselet, Frederic Magoules

Abstract: An asynchronous parallel version of the non-intrusive global-local coupling is implemented. The case of many patches, including those covering the entire structure, is studied. The asynchronism limits the dependency on communications, failures, and load imbalance. We detail the method and illustrate its performance in an academic case.

Submitted 19 July, 2022; originally announced July 2022.

Comments: in French language

arXiv:2206.15420 [pdf, other]

doi 10.4203/ccp.111.39

JACK2: a new high-level communication library for parallel iterative methods

Authors: Guillaume Gbikpi-Benissan, Frederic Magoules

Abstract: In this paper, we address the problem of designing a distributed application meant to run both classical and asynchronous iterations. MPI libraries are very popular and widely used in the scientific community; however, asynchronous iterative methods raise non-negligible difficulties about the efficient management of communication requests and buffers. Moreover, a convergence detection issue is introduced, which requires the implementation of one of the various state-of-the-art termination methods, which are not necessarily highly reliable for most computational environments. We propose here an MPI-based communication library which handles all these issues in a non-intrusive manner, providing a unique interface for implementing both classical and asynchronous iterations. A few details are highlighted about our approach to achieve best communication rates and ensure accurate convergence detection. Experimental results on two supercomputers confirmed the low overhead communication costs introduced, and the effectiveness of our library.

Submitted 30 June, 2022; originally announced June 2022.

arXiv:2206.15418 [pdf, other]

doi 10.4203/ccp.112.17

Distributed asynchronous convergence detection without detection protocol

Authors: Guillaume Gbikpi-Benissan, Frederic Magoules

Abstract: In this paper, we address the problem of detecting the moment when an ongoing asynchronous parallel iterative process can be terminated to provide a sufficiently precise solution to a fixed-point problem being solved. Formulating the detection problem as a global solution identification problem, we analyze the snapshot-based approach, which is the only one that allows for exact global residual error computation. From a recently developed approximate snapshot protocol providing a reliable global residual error, we experimentally investigate here, as well, the reliability of a global residual error computed without any prior particular detection mechanism. Results on a single-site supercomputer successfully show that such high-performance computing platforms possibly provide computational environments stable enough to allow for simply resorting to non-blocking reduction operations for computing reliable global residual errors, which provides noticeable time saving, at both implementation and execution levels.

Submitted 30 June, 2022; originally announced June 2022.

arXiv:2112.11880 [pdf, other]

Iterative Krylov Methods for Acoustic Problems on Graphics Processing Unit

Authors: Abal-Kassim Cheik Ahamed, Frederic Magoules

Abstract: This paper deals with linear algebra operations on Graphics Processing Unit (GPU) with complex number arithmetic using double precision. An analysis of their uses within iterative Krylov methods is presented to solve acoustic problems. Numerical experiments performed on a set of acoustic matrices arising from the modeling of acoustic phenomena inside a car compartment are collected, and outline the performance, robustness and effectiveness of our algorithms, with a speed-up of up to 28x for dot product, 9.8x for sparse matrix-vector product and solvers.

Submitted 22 December, 2021; originally announced December 2021.

arXiv:2112.10823 [pdf, other]

Fast and Green Computing with Graphics Processing Units for solving Sparse Linear Systems

Authors: Abal-Kassim Cheik Ahamed, Alban Desmaison, Frederic Magoules

Abstract: In this paper, we aim to introduce a new perspective when comparing highly parallelized algorithms on GPU: the energy consumption of the GPU. We give an analysis of the performance of linear algebra operations, including addition of vectors, element-wise product, dot product and sparse matrix-vector product, in order to validate our experimental protocol. We also analyze their uses within the conjugate gradient method for solving the gravity equations on Graphics Processing Unit (GPU). The Cusp library is considered and compared to our own implementation with a set of real matrices arising from the Chicxulub crater and obtained by the finite element discretization of the gravity equations. The experiments demonstrate the performance and robustness of our implementation in terms of energy efficiency.

Submitted 20 December, 2021; originally announced December 2021.

arXiv:2112.06465 [pdf, other]

Accelerated solution of Helmholtz equation with Iterative Krylov Methods on GPU

Authors: Abal-Kassim Cheik Ahamed, Frederic Magoules

Abstract: This paper gives an analysis and an evaluation of linear algebra operations on Graphics Processing Unit (GPU) with complex number arithmetic in double precision. Knowing the performance of these operations, iterative Krylov methods are considered to solve the acoustic problem efficiently. Numerical experiments carried out on a set of acoustic matrices arising from the modeling of acoustic phenomena within a cylinder and a car compartment are presented, exhibiting the performance, robustness and efficiency of our algorithms, with a ratio of up to 27x for dot product, 10x for sparse matrix-vector product and solvers in complex double precision arithmetic.

Submitted 13 December, 2021; originally announced December 2021.

arXiv:2112.03851 [pdf, ps, other]

Stochastic Optimized Schwarz Methods for the Gravity Equations on Graphics Processing Unit

Authors: Abal-Kassim Cheik Ahamed, Frederic Magoules

Abstract: Low order, sequential or non-massively parallel finite elements are generally used for three-dimensional gravity modelling. In this paper, in order to obtain better gravity anomaly solutions in heterogeneous media, we solve the gravimetry problem using massively parallel high order finite elements on hybrid multi-CPU/GPU clusters. Parallel algorithms well suited for such hybrid architectures have to be designed. A new stochastic-based optimization procedure for the optimized Schwarz method is here presented, implemented and tuned to graphics processing units. Numerical experiments performed on a realistic test case demonstrate the robustness and efficiency of the proposed method and of its implementation on massive multi-CPU/GPU architectures.

Submitted 7 December, 2021; originally announced December 2021.

arXiv:2112.02377 [pdf, other]

On the stability and performance of the solution of sparse linear systems by partitioned procedures

Authors: Abal-Kassim Cheik Ahamed, Frederic Magoules

Abstract: In this paper, we present, evaluate and analyse the performance of parallel synchronous Jacobi algorithms by different partitioned procedures including band-row splitting, band-row sparsity pattern splitting and substructuring splitting, when solving large sparse linear systems. Numerical experiments performed on a set of academic 3D Laplace equations and on real gravity matrices arising from the Chicxulub crater are exhibited, and show the impact of splitting on parallel synchronous iterations when solving large sparse linear systems. The numerical results clearly show the interest of substructuring methods compared to band-row splitting strategies.

Submitted 4 December, 2021; originally announced December 2021.

Comments: arXiv admin note: text overlap with arXiv:2108.13162

arXiv:2112.00087 [pdf, other]

Coupling and Simulation of Fluid-Structure Interaction Problems for Automotive Sun-roof on Graphics Processing Unit

Authors: Liang S. Lai, Choi-Hong Lai, Abal-Kassim Cheik Ahamed, Frederic Magoules

Abstract: In this paper, the authors propose an analysis of the frequency response function in a car compartment, subject to some fluctuating pressure distribution along the open cavity of the sun-roof at the top of a car. The coupling of a computational fluid dynamics code and a computational acoustics code is considered to simulate the acoustic fluid-structure interaction problem. Iterative Krylov methods and domain decomposition methods, tuned on Graphics Processing Unit (GPU), are considered to solve the acoustic problem with complex number arithmetic in double precision. Numerical simulations illustrate the efficiency, robustness and accuracy of the proposed approaches.

Submitted 30 November, 2021; originally announced December 2021.

arXiv:2110.10762 [pdf, other]

Asynchronous parareal time discretization for partial differential equations

Authors: Frederic Magoules, Guillaume Gbikpi-Benissan

Abstract: Asynchronous iterations are increasingly investigated for both scaling and fault-resilience purposes on high performance computing platforms. While so far they have been exclusively applied within space domain decomposition frameworks, this paper advocates a novel application direction targeting time-decomposed time-parallel approaches. Specifically, an asynchronous iterative model is derived from the Parareal scheme, for which convergence and speedup analyses are then conducted. It turns out that Parareal and async-Parareal feature very close convergence conditions, asymptotically equivalent, including the finite-time termination property. Based on a computational cost model aware of unsteady communication delays, our speedup analysis shows the potential performance gain from asynchronous iterations, which is confirmed by an experimental case of heat evolution on a homogeneous supercomputer. This primary work clearly suggests possible further benefits from asynchronous iterations.

Submitted 20 October, 2021; originally announced October 2021.

arXiv:2110.10446 [pdf, other]

doi 10.2312/egs.20211022

Interactive simulation for easy decision-making in fluid dynamics

Authors: Mengchen Wang, Nicolas Férey, Frédéric Magoulès, Patrick Bourdot

Abstract: A conventional study of fluid simulation involves different stages including conception, simulation, visualization, and analysis tasks. It is, therefore, necessary to switch between different software and interactive contexts, which implies costly data manipulation and increases the time needed for decision making. Our interactive simulation approach was designed to shorten this loop, allowing users to visualize and steer a simulation in progress without waiting for the end of the simulation. The methodology allows the users to control, start, pause, or stop a simulation in progress, to change global physical parameters, and to interact with its 3D environment by editing boundary conditions such as walls or obstacles. This approach is made possible by using a methodology such as the Lattice Boltzmann Method (LBM) to achieve interactive time while remaining physically relevant. In this work, we present our platform dedicated to interactive fluid simulation based on LBM. The contribution of our interactive simulation approach to decision making will be evaluated in a study based on a simple but realistic use case.

Submitted 20 October, 2021; originally announced October 2021.

Journal ref: H. Theisel and M. Wimmer, editors, Eurographics 2021 - Short Papers. The Eurographics Association, 2021

arXiv:2110.01414 [pdf, other]

doi 10.1109/HPCC.2014.54

Power Consumption Analysis of Parallel Algorithms on GPUs

Authors: Frédéric Magoulès, Abal-Kassim Cheik Ahamed, Alban Desmaison, Jean-Christophe Léchenet, François Mayer, Haifa Ben Salem, Thomas Zhu

Abstract: Due to their highly parallel multi-core architecture, GPUs are being increasingly used in a wide range of computationally intensive applications. Compared to CPUs, GPUs can achieve higher performance in accelerating program execution in an energy-efficient way. Therefore GPGPU computing is useful for high performance computing applications and in many scientific research fields. In order to bring further performance improvements, GPU clusters are increasingly adopted. The energy consumed by GPUs cannot be neglected. Therefore, an energy-efficient time scheduling of the programs that are going to be executed by the parallel GPUs, based on their deadlines as well as their assigned priorities, could be deployed to address their energy demands. For this reason, we present in this paper a model enabling the measurement of the power consumption and the execution time of some elementary operations running on a single GPU, using a newly developed energy measurement protocol. Consequently, using our methodology, the energy needs of a program could be predicted, allowing better task scheduling.

Submitted 28 September, 2021; originally announced October 2021.

MSC Class: 14Q65; 15A60; 65E10; 65F10; 68W10; 65Y05; 68M20 ACM Class: G.1.3; G.1.6; I.3.1; D.3.4

arXiv:2108.13162 [pdf, other]

doi 10.1109/HPCC.2014.24

Parallel Sub-Structuring Methods for solving Sparse Linear Systems on a cluster of GPU

Authors: Abal-Kassim Cheik Ahamed, Frédéric Magoulès

Abstract: The main objective of this work consists in analyzing the sub-structuring method for the parallel solution of sparse linear systems with matrices arising from the discretization of partial differential equations such as finite element, finite volume and finite difference. With the success encountered by general-purpose processing on graphics processing units (GPGPU), we develop a hybrid multi-GPU and CPU sub-structuring algorithm. GPU computing, with CUDA, is used to accelerate the operations performed on each processor. Numerical experiments have been performed on a set of matrices arising from engineering problems. We compare a C+MPI implementation on a classical CPU cluster with C+MPI+CUDA on a cluster of GPUs. The performance comparison shows a speed-up for the sub-structuring method of up to 19 times in double precision by using CUDA.

Submitted 8 August, 2021; originally announced August 2021.

MSC Class: 14Q65; 15A60; 65E10; 65F10; 68W10; 65Y05 ACM Class: G.1.3; G.1.6; I.3.1; D.3.4

Journal ref: 2014 IEEE Intl Conf on High Performance Computing and Communications, 2014, pp. 121-128

arXiv:2010.01832 [pdf, ps, other]

On the existence of optimal shapes in architecture

Authors: Michael Hinz, Frédéric Magoulès, Anna Rozanova-Pierrat, Marina Rynkovskaya, Alexander Teplyaev

Abstract: We consider shape optimization problems for elasticity systems in architecture. A typical question in this context is to identify a structure of maximal stability close to an initially proposed one. We show the existence of such an optimally shaped structure within classes of bounded Lipschitz domains and within wider classes of bounded uniform domains with boundaries that may be fractal. In the first case the optimal shape realizes the infimum of a given energy functional over the class, in the second case it realizes the minimum. As a concrete application we discuss the existence of maximally stable roof structures under snow loads.

Submitted 5 October, 2020; originally announced October 2020.

arXiv:2003.13250 [pdf, other]

Optimal absorption of acoustical waves by a boundary

Authors: Frédéric Magoulès, Thi Phuong Kieu Nguyen, Pascal Omnes, Anna Rozanova-Pierrat

Abstract: With the aim of finding the simplest and most efficient shape of a noise absorbing wall to dissipate the acoustical energy of a sound wave, we consider a frequency model described by the Helmholtz equation with a damping on the boundary. The well-posedness of the model is shown in a class of domains with d-set boundaries ($N - 1 \le d < N$). We introduce a class of admissible Lipschitz boundaries, in which an optimal shape of the wall exists in the following sense: We prove the existence of a Radon measure on this shape, greater than or equal to the usual Lebesgue measure, for which the corresponding solution of the Helmholtz problem realizes the infimum of the acoustic energy defined with the Lebesgue measure on the boundary. If this Radon measure coincides with the Lebesgue measure, the corresponding solution realizes the minimum of the energy. For a fixed porous material, considered as an acoustic absorbent, we derive the damping parameters of its boundary from the corresponding time-dependent problem described by the damped wave equation (damping in volume).

Submitted 22 July, 2020; v1 submitted 30 March, 2020; originally announced March 2020.

Comments: SIAM Journal on Control and Optimization, Society for Industrial and Applied Mathematics, In press

arXiv:1912.06474 [pdf, other]

doi 10.1109/DCABES.2014.27

Spectral domain decomposition method for physically-based rendering of photochromic/electrochromic glass windows

Authors: Guillaume Gbikpi-Benissan, Patrick Callet, Frederic Magoules

Abstract: This paper covers the time-consuming issues intrinsic to physically-based image rendering algorithms. First, the optical properties of glass materials were measured on samples of real glasses, and the materials of other objects inside a hotel room were characterized by deducing spectral data from multiple trichromatic images. We then present the rendering model and ray-tracing algorithm implemented in Virtuelium, an open source software. In order to accelerate the computation of the interactions between light rays and objects, the ray-tracing algorithm is parallelized by means of domain decomposition techniques. Numerical experiments show that the speedups obtained with classical parallelization techniques are significantly smaller than those achieved with parallel domain decomposition methods.

Submitted 9 December, 2019; originally announced December 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1912.05494

arXiv:1912.05494 [pdf, other]

doi 10.1109/HPCC.2014.17

Spectral Domain Decomposition Method for Natural Lighting and Medieval Glass Rendering

Authors: Guillaume Gbikpi-Benissan, Remi Cerise, Patrick Callet, Frederic Magoules

Abstract: In this paper, we use an original ray-tracing domain decomposition method to address image rendering of naturally lighted scenes. This new method makes it possible, in particular, to analyze rendering problems on parallel architectures in the case of interactions between light-rays and glass material. Numerical experiments, for medieval glass rendering within the church of the Royaumont abbey, illustrate the performance of the proposed ray-tracing domain decomposition method (DDM) on multi-core and multi-processor architectures. On the one hand, applying domain decomposition techniques increases speedups obtained by parallelizing the computation. On the other hand, for a fixed number of parallel processes, we notice that speedups increase as the number of sub-domains does.

Submitted 9 December, 2019; originally announced December 2019.

arXiv:1912.04356 [pdf, other]

Interactive 3D fluid simulation: steering the simulation in progress using Lattice Boltzmann Method

Authors: Mengchen Wang, Nicolas Ferey, Patrick Bourdot, Frederic Magoules

Abstract: This paper describes work in progress on a software and hardware architecture to steer and control an ongoing fluid simulation in the context of a serious game application. We propose to use the Lattice Boltzmann Method as the simulation approach, considering that it can provide fully parallel algorithms to reach interactive time and because, compared with more classical simulation approaches, it is easier to change parameters while the simulation is in progress and still remain physically relevant. We describe which parameters we can modify and how we solve technical issues of interactive steering, and we finally show an application of our interactive fluid simulation approach to water dam phenomena.

Submitted 9 December, 2019; originally announced December 2019.

arXiv:1912.04352 [pdf, other]

Using asynchronous simulation approach for interactive simulation

Authors: Mengchen Wang, Nicolas Ferey, Patrick Bourdot, Frederic Magoules

Abstract: This paper discusses the advantage of using asynchronous simulation in the case of interactive simulation, in which the user can steer and control parameters during a simulation in progress. Asynchronous models allow each iteration to be computed faster, addressing the performance issues of a highly interactive context, and our hypothesis is that getting partial results faster is better than getting synchronized and final results for taking a decision in an interactive simulation context.

Submitted 9 December, 2019; originally announced December 2019.

arXiv:1912.04000 [pdf, ps, other]

doi 10.1109/CSE-EUC-DCABES.2016.212

Spectral domain decomposition method for physically-based rendering of Royaumont abbey

Authors: Guillaume Gbikpi-Benissan, Patrick Callet, Frederic Magoules

Abstract: In the context of a virtual reconstitution of the destroyed Royaumont abbey church, this paper investigates computer science issues intrinsic to physically-based image rendering. First, a virtual model was designed from historical sources and archaeological descriptions. Then some physical properties of the materials were measured on remains of the church and on pieces from similar ancient churches. We specify the properties of our lighting source, which is a representation of the sun, and present the rendering algorithm implemented in our software Virtuelium. In order to accelerate the computation of the interactions between light-rays and objects, this ray-tracing algorithm is parallelized by means of domain decomposition techniques. Numerical experiments show that the computational time saved by a classic parallelization is much less significant than that gained with our approach.

Submitted 9 December, 2019; originally announced December 2019.

arXiv:1912.03998 [pdf, other]

doi 10.1109/DCABES.2014.34

Beam-tracing domain decomposition method for urban acoustic pollution

Authors: Guillaume Gbikpi-Benissan, Frederic Magoules

Abstract: This paper covers the fast solution of large acoustic problems on low-resource parallel platforms. A domain decomposition method is coupled with a dynamic load balancing scheme to efficiently accelerate a geometrical acoustic method. The geometrical method studied implements a beam-tracing method where intersections are handled as in a ray-tracing method. Beyond the distribution of the global processing upon multiple sub-domains, a second parallelization level is operated by means of multi-threading and shared memory mechanisms. Numerical experiments show that this method makes it possible to handle large scale open domains for parallel computing purposes on few machines. Urban acoustic pollution arising from car traffic was simulated on a large model of the Shinjuku district of Tokyo, Japan. The good speed-up results illustrate the performance of this new domain decomposition method.

Submitted 9 December, 2019; originally announced December 2019.

arXiv:1912.03993 [pdf, ps, other]

doi 10.1109/DCABES.2013.49

Coarse Space Correction for Graphic Analysis

Authors: Guillaume Gbikpi-Benissan, Frederic Magoules

Abstract: In this paper we present an effective coarse space correction designed to accelerate the solution of an algebraic linear system. The system arises from the formulation of the problem of interpolating scattered data by means of Radial Basis Functions. Radial Basis Functions are commonly used for interpolating scattered data during the image reconstruction process in graphic analysis. This requires solving a linear system of equations for each color component, and this process represents the most time-consuming operation. Several basis functions, such as trigonometric, exponential, Gaussian and polynomial functions, are investigated here to construct a suitable coarse space correction to speed up the solution of the linear system. Numerical experiments outline the superiority of some functions for the fast iterative solution of the image reconstruction problem.

Submitted 9 December, 2019; originally announced December 2019.

arXiv:1912.01222 [pdf, ps, other]

doi 10.1109/DCABES48411.2019.00050

On Extensions of Limited Memory Steepest Descent Method

Authors: Qinmeng Zou, Frederic Magoules

Abstract: We present some extensions to the limited memory steepest descent method based on spectral properties and cyclic iterations. Our aim is to show that it is possible to combine sweep and delayed strategies for improving the performance of gradient methods. Numerical results are reported which indicate that our new methods are better than the original version. Some remarks on the stability and parallel implementation are given at the end.

Submitted 3 December, 2019; originally announced December 2019.

Journal ref: 18th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), 2019, IEEE

arXiv:1912.00816 [pdf, ps, other]

doi 10.1109/DCABES48411.2019.00048

Recent Developments in Iterative Methods for Reducing Synchronization

Authors: Qinmeng Zou, Frederic Magoules

Abstract: On modern parallel architectures, the cost of synchronization among processors can often dominate the cost of floating-point computation. Several modifications of the existing methods have been proposed in order to keep the communication cost as low as possible. This paper aims at providing a brief overview of recent advances in parallel iterative methods for solving large-scale problems. We refer the reader to the related references for more details on the derivation, implementation, performance, and analysis of these techniques.

Submitted 2 December, 2019; originally announced December 2019.

Journal ref: 18th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), 2019, IEEE

arXiv:1909.01481 [pdf, ps, other]

doi 10.1002/nla.2304

Parameter Estimation in the Hermitian and Skew-Hermitian Splitting Method Using Gradient Iterations

Authors: Qinmeng Zou, Frederic Magoules

Abstract: This paper presents enhancement strategies for the Hermitian and skew-Hermitian splitting method based on gradient iterations. The spectral properties are exploited for the parameter estimation, often resulting in better convergence. In particular, steepest descent with early stopping can generate a rough estimate of the optimal parameter. This is better than an arbitrary choice since the latter often causes stability problems or slow convergence. Additionally, lagged gradient methods are considered as inner solvers for the splitting method. Experiments show that they are competitive with conjugate gradient in low precision.

Submitted 3 September, 2019; originally announced September 2019.

Journal ref: Numerical Linear Algebra with Applications, 27(4), e2304, 2020

arXiv:1909.01479 [pdf, ps, other]

doi 10.1016/j.cam.2020.113033

Fast Gradient Methods with Alignment for Symmetric Linear Systems without Using Cauchy Step

Authors: Qinmeng Zou, Frederic Magoules

Abstract: The performance of gradient methods has been considerably improved by the introduction of delayed parameters. After two and a half decades, the revealing of second-order information has recently given rise to the Cauchy-based methods with alignment, which asymptotically reduce the search spaces to smaller and smaller dimensions. They are generally considered as the state of the art of gradient methods. This paper reveals the spectral properties of minimal gradient and asymptotically optimal steps, and then suggests three fast methods with alignment without using the Cauchy step. The convergence results are provided, and numerical experiments show that the new methods provide competitive and more stable alternatives to the classical Cauchy-based methods. In particular, alignment gradient methods present advantages over the Krylov subspace methods in some situations, which makes them attractive in practice.

Submitted 14 September, 2019; v1 submitted 3 September, 2019; originally announced September 2019.

Journal ref: Journal of Computational and Applied Mathematics, 381, 113033, 2021

arXiv:1909.01473 [pdf, ps, other]

Asynchronous Time-Parallel Method based on Laplace Transform

Authors: Frederic Magoules, Qinmeng Zou

Abstract: The Laplace transform method has proved to be very efficient and easy to parallelize for the solution of time-dependent problems. However, the synchronization delay among processors implies an upper bound on the expected acceleration factor, which leads to a lot of wasted time. In this paper, we propose an original asynchronous Laplace transform method formalized for quasilinear problems based on the well-known Gaver-Stehfest algorithm. Parallel experiments show the convergence of our new method, as well as several interesting properties compared with the classical algorithms.

Submitted 3 September, 2019; originally announced September 2019.

arXiv:1907.04393 [pdf, other]

doi 10.1109/dcabes.2017.9

GPU Accelerated Contactless Human Machine Interface for Driving Car

Authors: Frederic Magoules, Qinmeng Zou

Abstract: In this paper we present an original contactless human machine interface for driving a car. The proposed framework is based on the image sent by a simple camera device, which is then processed by various computer vision algorithms. These algorithms allow the isolation of the user's hand on the camera frame and translate its movements into orders sent to the computer in a real-time process. The optimization of the implemented algorithms on graphics processing unit leads to real-time interaction between the user, the computer and the machine. The user can easily modify or create the interfaces displayed by the proposed framework to fit their personal needs. A contactless car-driving interface is produced here to illustrate the principle of our framework.

Submitted 9 July, 2019; originally announced July 2019.

Journal ref: 16th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), 2017, IEEE

arXiv:1907.04390 [pdf, other]

doi 10.1109/dcabes.2017.37

A Novel Contactless Human Machine Interface based on Machine Learning

Authors: Frederic Magoules, Qinmeng Zou

Abstract: This paper describes a global framework that enables contactless human machine interaction using computer vision and machine learning techniques. The main originality of our framework is that only a very simple image acquisition device, such as a computer camera, is sufficient to establish a human machine interaction as rich as that of traditional devices such as a mouse or keyboard. This framework is based on well-known computer vision techniques, and efficient machine learning techniques are used to detect and track user hand gestures so that the end user can control his computer using virtual interfaces with very simple gestures.

Submitted 9 July, 2019; originally announced July 2019.

Journal ref: 16th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), 2017, IEEE

arXiv:1907.01201 [pdf, other]

doi 10.1109/dcabes.2018.00081

Convergence Detection of Asynchronous Iterations based on Modified Recursive Doubling

Authors: Qinmeng Zou, Frederic Magoules

Abstract: This paper addresses the distributed convergence detection problem in asynchronous iterations. A modified recursive doubling algorithm is investigated in order to adapt to the non-power-of-two case. Some convergence detection algorithms are illustrated based on the reduction operation. Finally, a concluding discussion about the implementation and the applicability is presented.

Submitted 2 July, 2019; originally announced July 2019.

Journal ref: 17th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), 2018, IEEE

arXiv:1907.01200 [pdf, other]

doi 10.1109/dcabes.2018.00058

A New Cyclic Gradient Method Adapted to Large-Scale Linear Systems

Authors: Qinmeng Zou, Frederic Magoules

Abstract: This paper proposes a new gradient method to solve large-scale problems. Theoretical analysis shows that the new method has the finite termination property in two dimensions and converges R-linearly in any dimension. Experimental results illustrate first the issue of parallel implementation. Then, the solution of a large-scale problem shows that the new method is better than the others, even competitive with the conjugate gradient method.

Submitted 2 July, 2019; originally announced July 2019.

Journal ref: 17th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), 2018, IEEE

arXiv:1907.01199 [pdf, other]

doi 10.1109/dcabes.2017.17

Asynchronous Communications Library for the Parallel-in-Time Solution of Black-Scholes Equation

Authors: Qinmeng Zou, Guillaume Gbikpi-Benissan, Frederic Magoules

Abstract: The advent of asynchronous iterative schemes gives high efficiency to numerical computations. However, it is generally difficult to handle the problems of resource management and convergence detection. This paper uses JACK2, an asynchronous communication kernel library for iterative algorithms, to implement both classical and asynchronous parareal algorithms, especially the latter. We illustrate the measures whereby one can tackle the problems above elegantly for the time-dependent case. Finally, experiments are presented to prove the availability and efficiency of such an application.

Submitted 2 July, 2019; originally announced July 2019.

Journal ref: 16th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), 2017, IEEE

arXiv:1907.01198 [pdf, ps, other]

doi 10.1109/dcabes.2017.15

Asynchronous Parareal Algorithm Applied to European Option Pricing

Authors: Qinmeng Zou, Guillaume Gbikpi-Benissan, Frederic Magoules

Abstract: Asynchronous iterations arise naturally in parallel computing if one wants to solve large problems with a minimization of the idle times. This paper presents an original model of asynchronous iterations for a time-domain decomposition method, namely the parareal method. The asynchronous parareal algorithm is here applied to European option pricing, and numerical experiments performed on a parallel supercomputer illustrate the performance and efficiency of this new method.

Submitted 2 July, 2019; originally announced July 2019.

Journal ref: 16th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES), 2017, IEEE

Showing 1–48 of 48 results for author: Magoules, F