-
Reduced and mixed precision turbulent flow simulations using explicit finite difference schemes
Authors:
Bálint Siklósi,
Pushpender K. Sharma,
David J. Lusher,
István Z. Reguly,
Neil D. Sandham
Abstract:
The use of reduced and mixed precision computing has gained increasing attention in high-performance computing (HPC) as a means to improve computational efficiency, particularly on modern hardware architectures like GPUs. In this work, we explore the application of mixed precision arithmetic in compressible turbulent flow simulations using explicit finite difference schemes. We extend the OPS and OpenSBLI frameworks to support customizable precision levels, enabling fine-grained control over precision allocation for different computational tasks. Through a series of numerical experiments on the Taylor-Green vortex benchmark, we demonstrate that mixed precision strategies, such as half-single and single-double combinations, can offer significant performance gains without compromising numerical accuracy. However, pure half-precision computations result in unacceptable accuracy loss, underscoring the need for careful precision selection. Our results show that mixed precision configurations can reduce memory usage and communication overhead, leading to notable speedups, particularly on multi-CPU and multi-GPU systems.
Submitted 27 May, 2025;
originally announced May 2025.
-
Mapping Web Pages by Internet Protocol (IP) addresses: Analyzing Spatial and Temporal Characteristics of Web Search Engine Results
Authors:
Ming-Hsiang Tsou,
Daniel Lusher
Abstract:
Internet Protocol (IP) addresses are frequently used by researchers in several different fields as a method of locating web users. However, there are competing reports concerning the accuracy of those locations, and little research has been done on manually comparing IP geolocation databases with the geographic information on web pages. This paper categorized web pages from the Yahoo search engine into twelve categories, ranging from 'Blog' and 'News' to 'Education' and 'Governmental'. We then manually compared the mailing or street address of each web page's content creator with the geolocation results for the given IP address. We also introduced a cartographic design method that creates kernel density maps to visualize the information landscape of web pages associated with specific keywords.
Submitted 15 October, 2018;
originally announced October 2018.
-
Energy efficiency of finite difference algorithms on multicore CPUs, GPUs, and Intel Xeon Phi processors
Authors:
Satya P. Jammy,
Christian T. Jacobs,
David J. Lusher,
Neil D. Sandham
Abstract:
In addition to hardware wall-time restrictions commonly seen in high-performance computing systems, it is likely that future systems will also be constrained by energy budgets. In the present work, finite difference algorithms of varying computational and memory intensity are evaluated with respect to both energy efficiency and runtime on an Intel Ivy Bridge CPU node, an Intel Xeon Phi Knights Landing processor, and an NVIDIA Tesla K40c GPU. The conventional way of storing the discretised derivatives to global arrays for solution advancement is found to be inefficient in terms of energy consumption and runtime. In contrast, a class of algorithms in which the discretised derivatives are evaluated on-the-fly or stored as thread-/process-local variables (yielding high compute intensity) is optimal with respect to both energy consumption and runtime. On all three hardware architectures considered, a speed-up of ~2 and an energy saving of ~2 are observed for the compute-intensive algorithms compared to the memory-intensive algorithm. The energy consumption is found to be proportional to runtime, irrespective of the power consumed, and the GPU has an energy saving of ~5 compared to the same algorithm on a CPU node.
Submitted 27 September, 2017;
originally announced September 2017.