CFD Solvers with Minimal Memory Access

Rainald Löhner


Many state of the art CFD codes that exhibit low computational intensity (flops per RAM access) `saturate' the memory bandwidth of modern chips after only a few cores, thus minimizing any benefits from going to a higher number of available cores. This bottleneck is expected to become even more pronounced for future manycore systems. This has led to the quest for CFD solvers with minimal memory access. We report on recent developments and results for Finite Difference and Edge-Based Finite Element solvers. The best of these implementations yield one residual for only 6 fetches and 4 stores, regardless of the size of the stencil (and therefore the discretization order). This means that in terms of memory access they are competitive even with finite difference stencils as low as 2 (typical of CFD codes with 2nd order spatial discretization of fluxes and 4th order damping). Timings for a low Mach number finite difference code using a 6th order spatial discretization show competitive timings as compared to conventional loops. This bodes well for future HPC architectures.

Full Text:


Asociación Argentina de Mecánica Computacional
Güemes 3450
S3000GLN Santa Fe, Argentina
Phone: 54-342-4511594 / 4511595 Int. 1006
Fax: 54-342-4511169
E-mail: amca(at)
ISSN 2591-3522