Showing 1–2 of 2 results for author: Lange, D J

Search v0.5.6 released 2020-02-24

arXiv:2304.06441

math.NA cs.AR cs.PF

Fast And Automatic Floating Point Error Analysis With CHEF-FP

Authors: Garima Singh, Baidyanath Kundu, Harshitha Menon, Alexander Penev, David J. Lange, Vassil Vassilev

Abstract: As we reach the limit of Moore's Law, researchers are exploring different paradigms to achieve unprecedented performance. Approximate Computing (AC), which relies on the ability of applications to tolerate some error in the results to trade-off accuracy for performance, has shown significant promise. Despite the success of AC in domains such as Machine Learning, its acceptance in High-Performance Computing (HPC) is limited due to stringent requirements for accuracy. We need tools and techniques to identify regions of code that are amenable to approximations and their impact on the application output quality to guide developers to employ selective approximation. To this end, we propose CHEF-FP, a flexible, scalable, and easy-to-use source-code transformation tool based on Automatic Differentiation (AD) for analyzing approximation errors in HPC applications. CHEF-FP uses Clad, an efficient AD tool built as a plugin to the Clang compiler and based on the LLVM compiler infrastructure, as a backend and utilizes its AD abilities to evaluate approximation errors in C++ code. CHEF-FP works at the source by injecting error estimation code into the generated adjoints. This enables the error-estimation code to undergo compiler optimizations resulting in improved analysis time and reduced memory usage. We also provide theoretical and architectural augmentations to source code transformation-based AD tools to perform FP error analysis. This paper primarily focuses on analyzing errors introduced by mixed-precision AC techniques. We also show the applicability of our tool in estimating other kinds of errors by evaluating our tool on codes that use approximate functions. Moreover, we demonstrate the speedups CHEF-FP achieved during analysis time compared to the existing state-of-the-art tool due to its ability to generate and insert approximation error estimate code directly into the derivative source.

Submitted 13 April, 2023; originally announced April 2023.

Comments: 11 pages, to appear in the 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS'23)

ACM Class: G.1.2

arXiv:2203.06139

cs.MS

doi: 10.1088/1742-6596/2438/1/012043

GPU Accelerated Automatic Differentiation With Clad

Authors: Ioana Ifrim, Vassil Vassilev, David J Lange

Abstract: Automatic Differentiation (AD) is instrumental for science and industry. It is a tool to evaluate the derivative of a function specified through a computer program. The range of AD application domain spans from Machine Learning to Robotics to High Energy Physics. Computing gradients with the help of AD is guaranteed to be more precise than the numerical alternative and have a low, constant factor more arithmetical operations compared to the original function. Moreover, AD applications to domain problems typically are computationally bound. They are often limited by the computational requirements of high-dimensional parameters and thus can benefit from parallel implementations on graphics processing units (GPUs). Clad aims to enable differential analysis for C/C++ and CUDA and is a compiler-assisted AD tool available both as a compiler extension and in ROOT. Moreover, Clad works as a plugin extending the Clang compiler; as a plugin extending the interactive interpreter Cling; and as a Jupyter kernel extension based on xeus-cling. We demonstrate the advantages of parallel gradient computations on GPUs with Clad. We explain how to bring forth a new layer of optimization and a proportional speed up by extending Clad to support CUDA. The gradients of well-behaved C++ functions can be automatically executed on a GPU. The library can be easily integrated into existing frameworks or used interactively. Furthermore, we demonstrate the achieved application performance improvements, including (~10x) in ROOT histogram fitting and corresponding performance gains from offloading to GPUs.

Submitted 16 May, 2022; v1 submitted 11 March, 2022; originally announced March 2022.

Comments: 7 pages, 2 figures, 20th International Workshop on Advanced Computing and Analysis Techniques in Physics Research
