Scalability

One of the goals of the project is to provide a massively parallel code. A script makes scalability studies easy by submitting jobs to the scheduler. The CPU time of several parts of the code is measured. Another script concatenates all the timings into a single text file that can then be analysed. Below are some scalability results for Notus and third-party libraries.

Weak scalability of the Notus part of the code and HYPRE solvers

A weak scalability study consists in fixing the number of computational cells per core while increasing the number of cores, so the total size of the domain grows with the number of processors. Perfect weak scalability of the code would give a constant CPU time. Figure 1 shows CPU times (for 10 time iterations) of a weak scalability study for the Notus part of the code and the HYPRE solvers on the Curie and Occigen supercomputers.
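As a minimal sketch of how such a study is sized, the snippet below keeps the number of cells per core fixed and lets the global problem grow with the core count. The function name, the 50^3 cells per core, and the core counts are hypothetical values chosen for illustration; they are not the settings used for the results reported here.

```python
def weak_scaling_sizes(cells_per_core, core_counts):
    """Return (cores, total_cells) pairs for a weak scalability study.

    The local problem size (cells per core) stays constant, so the
    total number of cells is proportional to the number of cores.
    """
    return [(cores, cells_per_core * cores) for cores in core_counts]


# Hypothetical setup: 50^3 cells per core on 1, 8 and 64 cores.
sizes = weak_scaling_sizes(cells_per_core=50**3, core_counts=[1, 8, 64])

for cores, total_cells in sizes:
    print(f"{cores:4d} cores -> {total_cells:9d} cells in total")
```

With perfect weak scalability, each of these runs would take the same CPU time, since every core always handles the same number of cells.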

We observe perfect scalability for the Notus part of the code, which in this case corresponds to the discretization of the Navier-Stokes equations (prediction and correction steps), the filling of the associated linear systems, and the VOF-PLIC resolution of the advection equation.

The scalability of the HYPRE solvers (sstruct and struct interfaces) is as expected for such problems. We observe no real difference between the Haswell and Broadwell partitions of the Occigen supercomputer. Finally, better scalability is observed on the Curie supercomputer, as shown in Table 1, which gives the ratio of the CPU time measured on several nodes to the CPU time measured on 1 node.

 

Figure 1: Notus and HYPRE weak scalability study on Curie and Occigen supercomputers
 

               HYPRE BiCGStab + Jacobi (sstruct)        HYPRE BiCGStab + PFMG (struct)
Nb of nodes    Curie    Occigen HSW    Occigen BDW      Curie    Occigen HSW    Occigen BDW
1              1        1              1                1        1              1
2              1.1      1.1            1.4              1.0      1.0            1.1
4              1.2      1.3            1.3              1.1      1.3            1.3
8              1.4      1.7            1.8              1.4      1.5            1.5
16             1.6      2.1            1.8              1.9      2.0            2.1
32             1.7      2.5            2.5              2.7      3.6            3.6
64             2.2      3.1            3.0              3.3      5.7            6.6
128            2.5      3.8            3.3              3.9      8.7            10.2

Table 1: Comparison of HYPRE weak scalability on Curie and Occigen supercomputers
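The ratios in Table 1 can also be read as a weak scaling efficiency. The sketch below assumes the common definition, efficiency = T(1 node) / T(N nodes), which is simply the inverse of the tabulated ratio; this definition is an assumption and is not stated on this page. Only two columns of Table 1 are copied, and the variable names are illustrative.

```python
# Node counts and CPU-time ratios T(N nodes) / T(1 node), copied from Table 1.
nodes = [1, 2, 4, 8, 16, 32, 64, 128]
ratios = {
    "BiCGStab + Jacobi (sstruct), Curie": [1, 1.1, 1.2, 1.4, 1.6, 1.7, 2.2, 2.5],
    "BiCGStab + PFMG (struct), Occigen BDW": [1, 1.1, 1.3, 1.5, 2.1, 3.6, 6.6, 10.2],
}

# Assumed definition: weak scaling efficiency is the inverse of the ratio,
# so 1.0 means perfect weak scalability (constant CPU time).
efficiency = {
    label: [1.0 / r for r in column] for label, column in ratios.items()
}

for label, column in efficiency.items():
    print(label)
    for n, eff in zip(nodes, column):
        print(f"  {n:4d} nodes: efficiency {eff:.2f}")
```

Under this definition, a ratio of 2.5 on 128 nodes corresponds to an efficiency of 0.40, while a ratio of 10.2 corresponds to roughly 0.10.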

 

Comparison of BiCGStab HYPRE and LIS weak scalability

HYPRE solvers are designed for thousands of processors, but they may perform less well on a low number of processors. Figure 2 compares CPU times (for 10 time iterations) of the HYPRE and LIS BiCGStab solvers. It shows that above 2048 processors the HYPRE implementation of the solver is the better choice. Below that, LIS BiCGStab can be more efficient by up to a factor of 2.


Figure 2: HYPRE & LIS solvers weak scalability

 

Comparison of some LIS preconditioners

For multiphase flows, the time step is usually so small, due to the CFL condition (necessary to track the interface), that a point Jacobi preconditioner can be enough for the Navier-Stokes prediction step. Nevertheless, in some cases (single-phase flow, for instance) a more robust preconditioner is necessary. Figure 3 compares CPU times (for 10 time iterations) of a weak scalability study of several LIS preconditioners and shows that the ILUK(0) or ILUC preconditioner may be a better choice.


Figure 3: LIS preconditioners weak scalability