8.2
general documentation
Loading...
Searching...
No Matches
How to set linear solvers with CDO/HHO schemes

Introduction

In code_saturne, a SLES means a S parse L inear E quation S olver. When using CDO/HHO schemes, the settings related to a SLES is stored in the structure cs_param_sles_t

To access more advanced settings, it is possible to retrieve this structure when one has a pointer to a cs_equation_param_t Simply write:

cs_equation_param_t * cs_equation_param_by_name(const char *eqname)
Return the cs_equation_param_t structure associated to a cs_equation_t structure based on the equatio...
Definition cs_equation.c:615
cs_param_sles_t * cs_equation_param_get_sles_param(cs_equation_param_t *eqp)
Get the pointer to the set of parameters to handle the SLES solver associated to this set of equation...
Definition cs_equation_param.cxx:1446
Set of parameters to handle an unsteady convection-diffusion-reaction equation with term sources.
Definition cs_equation_param.h:193
Structure storing all metadata related to the resolution of a linear system with an iterative solver.
Definition cs_param_sles.h:64

To get a full access to the low-level structures handling the linear algebra in code_saturne, please consider to edit the function cs_user_linear_solvers For a Finite Volume scheme, this is the main way to proceed.

Solve a linear system

Several examples are detailed hereafter to modify the default settings related to a linear solver.

A first example

{
/*
Example: Use a BiCGstab iterative solver with a 1st order Neumann
polynomial preconditioner is used to solve a user-defined
equation
*/
}

A second example

For a symmetric positive definite (SPD) system, the solver of choice is the conjugate gradient (CS_PARAM_SOLVER_CG) or its flexible variant (CS_PARAM_SOLVER_FCG). In this case, a good preconditioner is a multilevel algorithm such as an algebraic multigrid (AMG). Different multigrid techniques are available when using code_saturne depending on the installation configuration (in-house implementations or algorithms available from PETSc and/or HYPRE libraries). Please refer to this section, for more details about AMG techniques.

Here is a second example:

If the external library PETSc is available, one uses its algebraic multigrid, otherwise one uses the in-house solver. Here a flexible conjugate gradient (fcg) with a diagonal preconditionning (jacobi).

Set the family of linear solvers

By default, in-house solvers are considered. With some choices of solver or preconditioner, one has to use an external library. The external libraries which can be linked to code_saturne are:

CS_EQKEY_SOLVER_FAMILY allows one to specify the family of linear solvers to consider. This can be useful if an automatic switch is not done or when there are several possibilities. For instance,

  • using HYPRE solver either directly from the HYPRE library or through the PETSc library;
  • using the MUMPS solver either directly from the MUMPS library or through the PETSc library.
{
/* Specify that this is an in-house solvers (default choice) */
/* Use solver/preconditioner available in the MUMPS library */
#if defined(HAVE_MUMPS)
#else
bft_error(__FILE__, __LINE__, 0, "%s: MUMPS is not available\n", __func__);
#endif
/* Use solver/preconditioner available in the PETSc library */
#if defined(HAVE_PETSC)
#else
bft_error(__FILE__, __LINE__, 0, "%s: PETSc is not available\n", __func__);
#endif
/* Use solver/preconditioner available in the HYPRE library */
#if defined(HAVE_HYPRE)
#else
bft_error(__FILE__, __LINE__, 0, "%s: HYPRE is not available\n", __func__);
#endif
}

Set the linear solver

Available linear solvers are listed in cs_param_solver_type_t and can be set using the key CS_EQKEY_SOLVER and the key values gathered in the following table:

key value description associated type available family
"amg" Algebraic MultiGrid (AMG) iterative solver. This is a good choice to achieve a scalable solver related to symmetric positive definite system when used with the default settings. Usually, a better choice is to use it as a preconditioner of a Krylov solver (for instance a conjugate gradient). For convection-dominated systems, one needs to adapt the smoothers, the coarsening algorithm or the coarse solver. CS_PARAM_SOLVER_AMG saturne, PETSc, HYPRE
"bicgs", "bicgstab" stabilized BiCG algorithm (for non-symmetric linear systems). It may lead to breakdown. CS_PARAM_SOLVER_BICGS saturne, PETSc
"bicgstab2" variant of the stabilized BiCG algorithm (for non-symmetric linear systems). It may lead to breakdown. It should be more robust than the BiCGstab algorithm CS_PARAM_SOLVER_BICGS2 saturne, PETSc
"cg" Conjuguate gradient algorithm for symmetric positive definite (SPD) linear systems CS_PARAM_SOLVER_CG saturne, PETSc, HYPRE
"cr3" 3-layer conjuguate residual algorithm for non-symmetric linear systems CS_PARAM_SOLVER_CR3 saturne
"fcg" Flexible version of the conjuguate gradient algorithm. This is useful when the preconditioner changes between two successive iterations CS_PARAM_SOLVER_FCG saturne, PETSc
"fgmres" Flexible Generalized Minimal Residual: robust and flexible iterative solver. More efficient solver can be found on easy-to-solve systems. Additional settings related to the number of directions stored before restarting the algorithm (cf. the key CS_EQKEY_SOLVER_RESTART) CS_PARAM_SOLVER_FGMRES PETSc
"gauss_seidel", "gs" Gauss-Seidel algorithm (Jacobi on interfaces between MPI ranks in case of parallel computing). No preconditioner. CS_PARAM_SOLVER_GAUSS_SEIDEL saturne, PETSc, HYPRE
"gcr" Generalized Conjugate Residual: robust and flexible iterative solver. More efficient solver can be found on easy-to-solve systems. This is close to a FGMRES algorithm. Additional settings related to the number of directions stored before restarting the algorithm (cf. the key CS_EQKEY_SOLVER_RESTART). This is the default solver for a user-defined equation. CS_PARAM_SOLVER_GCR saturne, PETSc
"gmres" Generalized Conjugate Residual: robust iterative solver. More efficient solver can be found on easy-to-solve systems. Additional settings related to the number of directions stored before restarting the algorithm (cf. the key CS_EQKEY_SOLVER_RESTART) CS_PARAM_SOLVER_GMRES saturne, PETSc, HYPRE
"jacobi", "diag", "diagonal" Jacobi algorithm = reciprocal of the diagonal values (simplest iterative solver). This is an efficient solver if the linear system is dominated by the diagonal. No preconditioner. CS_PARAM_SOLVER_JACOBI saturne, PETSc, HYPRE
"mumps" sparse direct solver from the MUMPS library (cf. [1]). This is the most robust solver but it may require a huge memory (especially for 3D problems). By default, it performs a LU factorization on general matrices using double precision. More options are available using the function cs_param_sles_mumps and the function cs_param_sles_mumps_advanced. More options are available through the user-defined function cs_user_sles_mumps_hook CS_PARAM_SOLVER_MUMPS mumps, PETSc (according to the configuration)
"sym_gauss_seidel", "sgs" Symmetric Gauss-Seidel algorithm. Jacobi on interfaces between MPI ranks in case of parallel computing). No preconditioner. CS_PARAM_SOLVER_SYM_GAUSS_SEIDEL saturne, PETSc
"user" User-defined iterative solver (rely on the function cs_user_sles_it_solver) CS_PARAM_SOLVER_USER_DEFINED saturne

Set the preconditioner

Available preconditioners are listed in cs_param_precond_type_t and can be set using the key CS_EQKEY_PRECOND and the key values gathered in the following table:

key value description type family
"amg" Algebraic MultiGrid (AMG) iterative solver. See this section for more details. CS_PARAM_PRECOND_AMG saturne, PETSc, HYPRE
"block_jacobi", "bjacobi" Block Jacobi with an ILU(0) factorization in each block. By default, there is one block per MPI rank. CS_PARAM_PRECOND_BJACOB_ILU0 PETSc
"bjacobi_sgs", "bjacobi_ssor" Block Jacobi with a SSOR algorithm in each block. By default, there is one block per MPI rank. For a sequential run, this is the SSOr algorithm. CS_PARAM_PRECOND_BJACOB_SGS PETSc
"diag", "jacobi" Diagonal preconditioning CS_PARAM_PRECOND_DIAG saturne, PETSc, HYPRE
"ilu0" Incomplete LU factorization (zero fill-in meaning that the same sparsity level is used). CS_PARAM_PRECOND_ILU0 PETSc, HYPRE
"icc0" Incomplete Choleski factorization (zero fill-in meaning that the same sparsity level is used). For SPD matrices. CS_PARAM_PRECOND_ICC0 PETSc, HYPRE
"mumps" the sparse direct solver MUMPS used as preconditioner. Please refer to the MUMPS section for more details. CS_PARAM_PRECOND_MUMPS mumps
"none" No preconditioner. CS_PARAM_PRECOND_NONE saturne, PETSc, HYPRE
"poly1", "poly_1" 1st order Neumann polynomial preconditioning CS_PARAM_PRECOND_POLY1 saturne
"poly2", "poly_2" 2nd order Neumann polynomial preconditioning CS_PARAM_PRECOND_POLY2 saturne
"ssor" Symmetric Successive OverRelaxation (SSOR) algorithm. In case of parallel computations, each MPI rank performs a SSOR on the local matrix. CS_PARAM_PRECOND_SSOR PETSc

Algebraic multigrids

Algebraic multigrids (AMG) can be used either as a solver or as a preconditioner. According to this choice, the settings of the main components should be different. The main options related to an AMG algorithm are:

  • The choice of the cycle (V, W, K for instance)
  • The choice of the down and/or up smoothers (type and number of sweeps)
  • The choice of the coarse solver
  • The choice of the coarsening algorithm

AMG algorithm is a good choice to achieve a scalable solver. The default settings are related to a symmetric positive definite (SPD) system. Usually, a better efficiency is reached when the AMG is used as a preconditioner of a Krylov solver rather than as a solver. For convection-dominated systems, one needs to adapt the main ingredients.

Several multigrid algorithms are available in code_saturne. The default one is the in-house V-cycle. Use the key CS_EQKEY_AMG_TYPE to change the type of multigrid. Possible values for this key are gathered in the next table.

key value description
"v_cycle" This is the default choice. This corresponds to an in-house V-cycle detailed in [13] The main ingredients can set using the functions cs_param_sles_amg_inhouse and cs_param_sles_amg_inhouse_advanced Please refer to this section for more details.
"k_cycle" or "kamg" Switch to a K-cycle strategy. This type of cycle has been detailed in [17] Be aware that a lot of work can be done in the coarsest levels yielding more communications between MPI ranks during a parallel computation. The main ingredients can set using the functions cs_param_sles_amg_inhouse and cs_param_sles_amg_inhouse_advanced Please refer to this section for more details.
"boomer", "bamg" or "boomer_v" Switch to use the V-cycle of the BoomerAMG algorithm from the HYPRE library Please refer to this section for more details on the way to set the main ingredients.
"boomer_w" or "bamg_w" Switch to a W-cycle and the BoomerAMG algorithm from the HYPRE library. Please refer to this section for more details on the way to set the main ingredients.
"gamg" or "gamg_v" Switch to a V-cycle and the GAMG algorithm from the PETSc library. GAMG means "Geometric Algebraic multigrid. Please refer to [this section](GAMG) for more details on the way to set the main ingredients.
`"gamg_w"` Switch to a W-cycle and the GAMG algorithm from the [PETSc](https://petsc.org) library. GAMG means "Geometric Algebraic multigrid. Please refer to @ref cs_ug_cdo_sles_amg_gamg "this section" for more details on the way to set the main ingredients.

In-house multigrids

The in-house multigrid algorithms can be easily tuned thanks to the function cs_param_sles_amg_inhouse More advanced settings can also be added using the function cs_param_sles_amg_inhouse_advanced

{
cs_param_sles_t *slesp = eqp->sles_param;
assert(slesp->field_id > -1);
/* In case of an in-house K-cylcle multigrid as a preconditioner of a
linear iterative solver */
if ((slesp->precond == CS_PARAM_PRECOND_AMG) &&
/* Down: n_iter, smoother, poly. deg. */
/* Up: n_iter, smoother, poly. deg. */
/* Coarse: solver, poly. deg. */
/* coarsen algo, aggregation limit */
(slesp,
CS_CDO_KEEP_DEFAULT, /* max_levels */
500, /* coarse min_n_g_rows */
CS_CDO_KEEP_DEFAULT, /* p0p1_relax */
CS_CDO_KEEP_DEFAULT, /* coarse_max_iter */
CS_CDO_KEEP_DEFAULT); /* coarse_rtol_mult */
} /* K-cycle multigrid as preconditioner */
}

In the case of an advanced settings, it is possible to keep the default settings associated to a parameter using the value CS_CDO_KEEP_DEFAULT

It is also possible to access directly the structures and set the parameters in the function cs_user_linear_solvers . Here is an example of a such usage:

{
NULL,
3, /* aggregation_limit (default 3) */
0, /* coarsening_type (default 0) */
10, /* n_max_levels (default 25) */
30, /* min_g_cells (default 30) */
0.95, /* P0P1 relaxation (default 0.95) */
20); /* postprocessing (default 0) */
(mg,
CS_SLES_JACOBI, /* descent smoother type (default: CS_SLES_PCG) */
CS_SLES_JACOBI, /* ascent smoother type (default: CS_SLES_PCG) */
CS_SLES_PCG, /* coarse solver type (default: CS_SLES_PCG) */
50, /* n max cycles (default 100) */
5, /* n max iter for descent (default 2) */
5, /* n max iter for asscent (default 10) */
1000, /* n max iter coarse solver (default 10000) */
0, /* polynomial precond. degree descent (default 0) */
0, /* polynomial precond. degree ascent (default 0) */
1, /* polynomial precond. degree coarse (default 0) */
-1.0, /* precision multiplier descent (< 0 forces max iters) */
-1.0, /* precision multiplier ascent (< 0 forces max iters) */
0.1); /* requested precision multiplier coarse (default 1) */
}

or

{
NULL,
-1,
10000);
assert(strcmp(cs_sles_pc_get_type(cs_sles_it_get_pc(c)), "multigrid") == 0);
(mg,
CS_SLES_P_GAUSS_SEIDEL, /* descent smoother (CS_SLES_P_SYM_GAUSS_SEIDEL) */
CS_SLES_P_GAUSS_SEIDEL, /* ascent smoother (CS_SLES_P_SYM_GAUSS_SEIDEL) */
CS_SLES_PCG, /* coarse solver (CS_SLES_P_GAUSS_SEIDEL) */
1, /* n max cycles (default 1) */
1, /* n max iter for descent (default 1) */
1, /* n max iter for asscent (default 1) */
500, /* n max iter coarse solver (default 1) */
0, /* polynomial precond. degree descent (default) */
0, /* polynomial precond. degree ascent (default) */
0, /* polynomial precond. degree coarse (default 0) */
-1.0, /* precision multiplier descent (< 0 forces max iters) */
-1.0, /* precision multiplier ascent (< 0 forces max iters) */
1.0); /* requested precision multiplier coarse (default 1) */
}

In order to get the best efficiency in case of HPC computations, parallel grid merging can be optimized as follows:

{
NULL,
4, /* # of ranks merged at a time */
300, /* mean # of cells under which we merge */
500); /* global # of cells under which we merge */
}

BoomerAMG

The main ingredients of the BoomerAMG algorithm are defined thanks to the functions cs_param_sles_boomeramg and cs_param_sles_boomeramg_advanced for additional parameters.

{
/* Set the main parameters of the BoomerAMG algorithm */
/* n_down_iter, down smoother */
/* n_up_iter, up smoother */
/* coarse solver */
/* coarsening algorithm */
/* Set advanced parameters for BoomerAMG */
0.5, /* strong threshold */
8, /* Pmax */
2, /* n_agg_levels */
2); /* n_agg_paths */
}

We strongly invite users to read the HYPRE documentation and especially the part dedicated to BoomerAMG to understand the different choices BoomerAMG documentation

Alternative way (low-level API)

HYPRE setting functions can be called thanks to a lower-level API (relying on a setup hook function). Here is an example of a such usage

In the function cs_user_linear_solvers,

{
NULL,
CS_SLES_HYPRE_PCG, /* solver type */
CS_SLES_HYPRE_BOOMERAMG, /* preconditioner type */
_hypre_p_setup_hook,
NULL);
cs_sles_hypre_set_host_device(sc, 1); /* run on GPU */
}

where the function _hypre_p_setup_hook is defined as follows:

static void
_hypre_p_setup_hook(int verbosity,
void *context,
void *solver_p)
{
HYPRE_Solver solver = solver_p;
HYPRE_Solver precond = NULL;
/* Get pointer to preconditioner, based on solver type (here for PCG) */
HYPRE_PCGGetPrecond(solver, &precond);
/* Assuming the preconditioner is BoomerAMG, set options */
HYPRE_BoomerAMGSetCoarsenType(precond, 8) ; /* HMIS */
HYPRE_BoomerAMGSetAggNumLevels(precond, 2);
HYPRE_BoomerAMGSetPMaxElmts(precond, 4);
HYPRE_BoomerAMGSetInterpType(precond, 7); /* extended+i */
HYPRE_BoomerAMGSetStrongThreshold(precond, 0.5); /* 2d=>0.25 3d=>0.5 */
HYPRE_BoomerAMGSetRelaxType(precond, 6); /* Sym G.S./Jacobi hybrid */
HYPRE_BoomerAMGSetRelaxOrder(precond, 0);
}

GAMG

Up to now, the advanced settings of GAMG is possible inside cs_user_linear_solvers Here is an example of a such usage

{
/* Initialization must be called before setting options;
it does not need to be called before calling
cs_sles_petsc_define(), as this is handled automatically. */
PETSC_COMM_WORLD = cs_glob_mpi_comm;
PetscInitializeNoArguments();
/* See the PETSc documentation for the options database */
#if PETSC_VERSION_GE(3,7,0)
PetscOptionsSetValue(NULL, "-ksp_type", "cg");
PetscOptionsSetValue(NULL, "-pc_type", "gamg");
PetscOptionsSetValue(NULL, "-pc_gamg_agg_nsmooths", "1");
PetscOptionsSetValue(NULL, "-mg_levels_ksp_type", "richardson");
PetscOptionsSetValue(NULL, "-mg_levels_pc_type", "sor");
PetscOptionsSetValue(NULL, "-mg_levels_ksp_max_it", "1");
PetscOptionsSetValue(NULL, "-pc_gamg_threshold", "0.02");
PetscOptionsSetValue(NULL, "-pc_gamg_reuse_interpolation", "TRUE");
PetscOptionsSetValue(NULL, "-pc_gamg_square_graph", "4");
#else
PetscOptionsSetValue("-ksp_type", "cg");
PetscOptionsSetValue("-pc_type", "gamg");
PetscOptionsSetValue("-pc_gamg_agg_nsmooths", "1");
PetscOptionsSetValue("-mg_levels_ksp_type", "richardson");
PetscOptionsSetValue("-mg_levels_pc_type", "sor");
PetscOptionsSetValue("-mg_levels_ksp_max_it", "1");
PetscOptionsSetValue("-pc_gamg_threshold", "0.02");
PetscOptionsSetValue("-pc_gamg_reuse_interpolation", "TRUE");
PetscOptionsSetValue("-pc_gamg_square_graph", "4");
#endif
}

Sparse direct solver: MUMPS

For the most difficult to solve linear systems or in order to check a numerical scheme, it may be useful to solve the linear systems with a direct solver. In code_saturne, this is possible using the MUMPS library.

{
/*
Example: Use MUMPS to solve a user-defined equation
*/
#if defined(HAVE_MUMPS)
#else
bft_error(__FILE__, __LINE__, 0, "%s: MUMPS is not available\n", __func__);
#endif
}

Additional options

Additional options may be set to improve the performance of the sparse direct solver. Please refer to the MUMPS documentation to know more about the different options.

{
/* Parameters related to a user-defined equation */
false, /* single-precision ? */
CS_PARAM_MUMPS_FACTO_LU); /* type of factorization */
1, /* size of the block for analysis */
-1, /* pct memory increase < 0 --> not used */
0, /* BLR compression: 0 --> not used */
1, /* iterative refinement steps */
true); /* advanced optimizations */
}

The function cs_param_sles_mumps allows one to set the two following main parameters:

  1. The factorization is performed using a single (parameter = "true") ou a double precision (parameter = "false"). If the usage of MUMPS is not as a preconditioner, then one advises to consider a double-precision factorization.
  2. Which type of factorization to perform ? There are the following choices:

Further options are available through the function cs_param_sles_mumps_advanced in order to reach a better performance but this is case dependent.

There are three main stages in the resolution:

  1. The analysis stage : renumbering to improve the fill-in of the factorization
  2. the factorization stage
  3. The solve stage (apply the factorization)

Renumbering

During the analysis stage, an important step is the renumbering. The quality of the analysis and in particular of the renumbering has a direct impact on the efficiency of the factorization and its fill-in (memory consumption).

Several algorithms are available according to your MUMPS installation. Some of these algorithms rely on external libraries such as (PT-)Scotch ou (Par)Metis

The type of algorithm which can be set from code_saturne is stored in cs_param_mumps_analysis_algo_t

parameter description dependency
CS_PARAM_MUMPS_ANALYSIS_AMD AMD sequential algorithm. Well-suited for 2D cases. no
CS_PARAM_MUMPS_ANALYSIS_QAMD QAMD sequential algorithmn. Well-suited for 2D cases. no
CS_PARAM_MUMPS_ANALYSIS_PORD Sequential algorithmn. For 3D cases if there is no dependency in your MUMPS installation no
CS_PARAM_MUMPS_ANALYSIS_SCOTCH Sequential algorithm for 3D cases. Scotch or PT-Scotch
CS_PARAM_MUMPS_ANALYSIS_PTSCOTCH Parallel algorithm for 3D cases. PT-Scotch
CS_PARAM_MUMPS_ANALYSIS_METIS Sequential algorithm for 3D cases. Often the best choice METIS or PARMETIS
CS_PARAM_MUMPS_ANALYSIS_PARMETIS Parallel algorithm for 3D cases. PARMETIS
CS_PARAM_MUMPS_ANALYSIS_AUTO The choice is performed by MUMPS according to the available algorithms and the type of matrices

Iterative refinement

When the system is really ill-conditioned, it can be useful to add one or several steps of iterative refinement to improvement the solution. By default, no iteration is performed. A fixed number of iterations is always performed.

Block Low Rank (BLR) algorithms

There is no BLR activated if the value of the parameter related to the BLR algorithm is set to 0. When the value is positive, BLR is activated for the factorization and solve stages using the UCFS variant and the value is used to indicate the level of compression. When the value is negative, the UFSC algorithm is used instead.

If the option CS_PARAM_MUMPS_MEMORY_CONSTRAINED is set, other compressions are activated (cleaning of the workspace, ICNTL(37)=1 and ICNTL(40)=1) in addition to the BLR compression.

Memory increase

In some cases (for instance with 2D cases), the estimation of the memory consumption is not sufficient and an error message is returned. To circumvent this issue, one needs to set a correct mem_coef parameter. This is directly given to ICNTL(14) in MUMPS.

Interfaces with MUMPS structures

To get the finest control on the tuning of the MUMPS solver, one has to use the function cs_user_sles_mumps_hook

This function is called before the main stages:

  1. Before the analysis step
  2. Before the factorization step
void
void *context,
void *pmumps)
{
CS_UNUSED(slesp);
CS_UNUSED(context);
DMUMPS_STRUC_C *mumps = pmumps;
assert(mumps != NULL);
/* If MUMPS is used in single-precision, one has to modify the declaration as
follows:
SMUMPS_STRUC_C *mumps = pmumps;
All the remaining settings are identical in the single or double-precision
case.
*/
mumps->CNTL(4) = 0.0; /* Static pivoting */
mumps->ICNTL(58) = 2; /* Symbolic factorization {0, 1, or 2}*/
}

According to the configuration single or double precision, the structure storing the MUMPS solver differs. In case of a single precision factorization, the cast is done to a pointer to a SMUMPS_STRUC_C structure

Solve a saddle-point linear system

Saddle-point systems arise from the monolithic velocity-pressure coupling of the Navier-Stokes equations or from the discretization of CDO cell-based schemes (steady diffusion equation equation for instance). Some examples of settings involving differents strategies of resolution are presented hereafter.

One considers the following type of saddle-point systems $\mathsf{Mx = b}$ with

\[ \mathsf{M} = \begin{bmatrix}
    A & B^T \\
    B & 0
 \end{bmatrix}
 \qquad
 \text{and}
 \qquad
 \mathsf{b} =
 \begin{bmatrix}
    f \\
    g
 \end{bmatrix}
\]

where $A$ is a square non-singular matrix (symmetric or not according to the numerical and/or modelling choices).

The key CS_EQKEY_SADDLE_SOLVER enables to set the solver (or the strategy) to solve the saddle-point problem associated to an equation. More details are gathered in the following table.

key value description type family
"none" No solver. No saddle-point system to solve. CS_PARAM_SADDLE_SOLVER_NONE saturne
"alu" Uzawa algorithm with an Augmented Lagrangian acceleration. Segregated technique. CS_PARAM_SADDLE_SOLVER_ALU saturne
"fgmres" Flexible GMRES Krylov solver on the full system. Possibility to consider block preconditioning. The choice to approximate the Schur complement has an impact on the effeciency of the algorithm. Monolithic technique. CS_PARAM_SADDLE_SOLVER_FGMRES PETSc
"gcr" GCR (Generalized Conjugate Residual) Krylov solver on the full system along with block preconditioning. The choice to approximate the Schur complement has an impact on the effeciency of the algorithm. Monolithic technique. CS_PARAM_SADDLE_SOLVER_GCR saturne
"gkb" Golub-Kahan bidiagonalization. An augmentation of the (1,1) block is possible. Use this technique on symmetric saddle-point systems. Segregated technique. CS_PARAM_SADDLE_SOLVER_GKB saturne
"minres" MINRES Krylov solver on the full system along with block preconditioning. Close to the choice "gcr" but only for symmetric saddle-point systems. Monolithic technique. CS_PARAM_SADDLE_SOLVER_MINRES saturne
"mumps" MUMPS sparse direct solver on the full system. Monolithic technique. CS_PARAM_SADDLE_SOLVER_MUMPS MUMPS
"notay" Notay's algebraic transformation. Monolithic technique relying on a FMGRES on the modified system. Block preconditioning can be specified on the transformed blocks. (Experimental) CS_PARAM_SADDLE_SOLVER_NOTAY_TRANSFORM PETSc
"uzawa_cg" Uzawa algorithm accelerated by a CG technique. Only for symmetric saddle-point systems. The choice to approximate the Schur complement has an impact on the effeciency of the algorithm. Segregated technique. CS_PARAM_SADDLE_SOLVER_UZAWA_CG saturne

In segregated technique, the (1,1)-block solver refers to the main solver. It can be set following the rationale described in Solve a linear system In monolothic technique, the (1,1)-block solver refers to the approximation of this block for preconditioning. Moreover, for some strategies, additional SLES such as the Schur complement system can also be specified using the same rationale described in Solve a linear system.

Augmented Lagrangian Uzawa algorithm (ALU)

Here is a first example to set a saddle-point solver. One assumes that the external library MUMPS has been installed and has been configured with code_saturne (see the installation guide for more details here).

{
/* Parameters related to the momentum equation */
/* Linear algebra settings for the saddle-point system */
#if defined(HAVE_MUMPS)
#else
bft_error(__FILE__, __LINE__, 0, "%s: MUMPS is not available\n", __func__);
#endif
}

Since the velocity block is augmented by the term $\underline{\mathsf{grad}}\,\mathsf{div}$ (with a scaling coefficient given by CS_EQKEY_SADDLE_AUGMENT_SCALING), the resulting linear system is hard to solve for an iterative method (or one needs a very specific preconditionner for $H(\mathsf{div})$). Our strategy is to consider the sparse direct solver MUMPS [1]. Here is an example of an advanced usage of the MUMPS settings.

Additional options may be set to improve the performance of the sparse direct solver.

{
/* Parameters related to the momentum equation */
false, /* single-precision ? */
CS_PARAM_MUMPS_FACTO_LDLT_SPD); /* type of facto. */
3, /* size of the block for analysis */
-1, /* pct memory increase < 0 --> not used */
0, /* BLR compression: 0 --> not used */
0, /* iterative refinement steps */
true); /* advanced optimizations */
}

Since the velocity block is vector-valued, one can benefit from a block analysis (the third parameter is set to 3).

Here is a second example.

{
/* Linear algebra settings */
#if defined(HAVE_MUMPS)
/* More advanced usage */
false, /* single-precision ? */
3, /* size of the block for analysis */
-1, /* pct memory increase < 0 = not used */
0, /* BLR compression: 0 = not used */
0, /* iterative refinement steps */
CS_PARAM_MUMPS_MEMORY_AUTO, /* memory usage */
true); /* advanced optimizations */
#else
bft_error(__FILE__, __LINE__, 0, "%s: MUMPS is not available\n", __func__);
#endif
}

Block preconditioners with a Krylov solver

In-house solvers

The two in-house Krylov solvers available with block preconditioning are

  • Minres (Minimal residual) algorithm for symmetric indefinite systems such as systems encountered in the case of Stokes systems
  • GCR (Generalized conjugate residual) algorithm for general indefinite systems. This is a flexible solver (preconditioner may vary between two iterations).

These two algorithms are optimized to handle saddle-point problems in code_saturne since the (1,2) and (2,1) which are transposed is stored only once. Moreover, this block is stored in an unassembled way.

{
/* Parameters related to the Stokes settings.
MINRES is not possible with a non-symmetric saddle-point system */
/* Linear algebra settings for the saddle-point system */
/* Linear algebra settings for the (1,1)-block */
#if defined(HAVE_PETSC) /* One assumes that PETSc is installed with Hypre */
/* Must be set after the previous line to switch to PETSC in order to be
able to use a block preconditioning for the velocity block */
/* Set the main parameters for BoomerAMG (Please refer to the Hypre
documentation for more details) */
/* n_down_iter, down smoother */
/* n_up_iter, up smoother */
/* coarse solver */
/* coarsening algorithm */
/* Set advanced parameters for BoomerAMG */
0.5, /* strong threshold */
8, /* Pmax */
2, /* n_agg_levels */
2); /* n_agg_paths */
#else /* PETSc not installed */
/* n_down_iter, down smoother, down poly. deg. */
/* n_up_iter, up smoother, up poly. deg. */
/* coarse solver, coarse poly. deg. */
/* coarsen algo, aggregation limit */
/* Set advanced parameters for an in-house AMG */
CS_CDO_KEEP_DEFAULT, // max. levels
100, // min. n_g_rows
CS_CDO_KEEP_DEFAULT, // p0p1 relax.
CS_CDO_KEEP_DEFAULT, // coarse max. iter
1e-2); // coarse rtol mult.
#endif
/* Linear algebra settings for the Schur complement approximation */
}

PETSc solvers

Other block preconditioners with a Krylov solver can be set using the external librairy PETSc. The two main differences (in addition to the use of the PETSc library) are:

  1. The Krylov solver is a flexible GMRES algorithm
  2. The Schur complement approximation can differ since one hinges on the choices available in PETSc. Please refer to the PETSc documentation for more details.

One uses the FIELDSPLIT functionnality to set the block preconditioning strategy.

Golub-Kahan Bidiagonalization algorithm (GKB)

{
/* Linear algebra settings for the saddle-point system */
/* Linear algebra settings for the (1,1)-block */
}

The linear system may be augmented to improve the convergence rate of the algorithm (but the system is harder to solve). Here is another example:

{
/* Linear algebra settings for the saddle-point system */
/* Linear algebra settings for the (1,1) block */
#if defined(HAVE_MUMPS)
#else
bft_error(__FILE__, __LINE__, 0, "%s: MUMPS is not available\n", __func__);
#endif
}

Notay's algebraic transformation (Experimental)

Here is an example in the case of a saddle-point system stemming from the Navier-Stokes equations.

{
/* Parameters related to the momentum equation */
/* Linear algebra settings for the saddle-point system.
* Notay's algebraic transformation is an experimental feature
* The main solver is a FGMRES on the full saddle-point problem.
*/
/* Set the solver for the (1,1)-block preconditioner --> the velocity block
in the case of the Navier-Stokes system */
}

Uzawa algorithm with a CG acceleration

{
/* Linear algebra settings for the saddle-point system */
/* Linear algebra settings for the (1,1)-block */
#if defined(HAVE_HYPRE)
#else
#endif
/* Linear algebra settings for the Schur complement approximation */
}

Schur complement approximation

The Schur complement related to the saddle-point system $\mathsf{Mx = b}$ detailed above is $\mathsf{S} = \mathsf{-B\cdot A^{-1} \cdot B}$ This is a square matrix. The cost to build exactly this matrix is prohibitive. Thus, one needs an approximation denoted by $\mathsf{\widetilde{S}}$

An approximation of the Schur complement is needed by the following strategies:

  • Block preconditionner with a Krylov method as in "fgmres" (only with PETSc), "gcr" or "minres"
  • Uzawa algorithm with a CG acceleration: "uzawa_cg"

The efficiency of the strategy to solve the saddle-point problem is related to the quality of the approximation. Several types of approximation of the Schur complement are available and a synthesis of the different options is gathered in the following table.

key value description related type family
"none" No approximation. The Schur complement approximation is not used. CS_PARAM_SADDLE_SCHUR_NONE all
"diag_inv" $\mathsf{\widetilde{S}} = \mathsf{-B\cdot diag(A)^{-1} \cdot B}$ CS_PARAM_SADDLE_SCHUR_DIAG_INVERSE saturne, petsc
"identity" The identity matrix is used. CS_PARAM_SADDLE_SCHUR_IDENTITY saturne, petsc
"lumped_inv" $\mathsf{\widetilde{S}} = \mathsf{-B\cdot L \cdot B}$ where $\mathsf{A\,L = 1}$, $\mathsf{1}$ is the vector fill with the value 1. CS_PARAM_SADDLE_SCHUR_LUMPED_INVERSE saturne
"mass_scaled" The pressure mass matrix is used with a scaling coefficient. This scaling coefficient is related to the viscosity for instance when dealing with the Navier-Stokes system. CS_PARAM_SADDLE_SCHUR_MASS_SCALED saturne
"mass_scaled_diag_inv" This is a combination of the "mass_scaled" and "diag_inv" approximation. CS_PARAM_SADDLE_SCHUR_MASS_SCALED_DIAG_INVERSE saturne
"mass_scaled_lumped_inv" This is a combination of the "mass_scaled" and "lumped_inv" approximation. CS_PARAM_SADDLE_SCHUR_MASS_SCALED_LUMPED_INVERSE saturne

To set the way to approximate the Schur complement use

void cs_equation_param_set(cs_equation_param_t *eqp, cs_equation_key_t key, const char *keyval)
Set a parameter attached to a keyname in a cs_equation_param_t structure.
Definition cs_equation_param.cxx:1408
@ CS_EQKEY_SADDLE_SCHUR_APPROX
Definition cs_equation_param.h:1228

For the Schur approximation implying the "diag_inv" or the "lumped_inv" approximation, one needs to build and then solve a linear system associated to the Schur approximation. In the case of the "lumped_inv", an additionnal linear system is solved to define the vector $\mathsf{L}$ from the resolution of $\mathsf{A\,L =
1}$. Please consider the examples below for more details.

Example 1: "mass_scaled"

In the first example, one assumes that the saddle-point solver is associated to the structure \cs_equation_param_t nammed mom_eqp. For instance, one uses a gcr strategy for solving the saddle-point problem. This is one of the simpliest settings for the Schur complement approximation. The "mass_scaled" approximation is well-suited for solving the Stokes problem since the pressure mass matrix is a rather good approximation (in terms of spectral behavior) of the Schur complement.

{
/* Parameters related to the momentum equation */
/* Nothing else to add for the approximation of the Schur complement */
}

Example 2: "mass_scaled_diag_inv"

In the second example, one assumes that the saddle-point solver is associated to the structure \cs_equation_param_t nammed mom_eqp. One now considers a more elaborated approximation which needs to solve a system implying the Schur complement approximation.

{
/* Parameters related to the momentum equation */
CS_EQKEY_SADDLE_SCHUR_APPROX, "mass_scaled_diag_inv");
/* Retrieve the set of parameters to handle the system associated to the
Schur complement approximation */
cs_param_sles_t *schur_slesp =
/* Set the solver, its preconditionner and the stopping criteria */
int ierr = cs_param_sles_set_solver("fgmres", schur_slesp);
assert(ierr == 0);
ierr = cs_param_sles_set_precond("amg", schur_slesp);
assert(ierr == 0);
ierr = cs_param_sles_set_amg_type("boomer", schur_slesp);
assert(ierr == 0);
/* One uses CS_CDO_KEEP_DEFAULT as parameter to keep unchanged the default
settings */
1e-2, /* rtol */
CS_CDO_KEEP_DEFAULT, /* atol */
CS_CDO_KEEP_DEFAULT, /* dtol */
CS_CDO_KEEP_DEFAULT); /* max. iter. */
}

Example 3: "lumped_inv"

In the third and last example, one assumes that the saddle-point solver is associated to the structure \cs_equation_param_t nammed mom_eqp. One now considers a more elaborated approximation which needs to solve a system implying the Schur complement approximation and an additional system to build the Schur complement approximation (the system related to $\mathsf{A\,L = 1}$).

{
/* Parameters related to the momentum equation */
/* Retrieve the set of parameters to handle the system associated to the
Schur complement approximation */
cs_param_sles_t *schur_slesp =
/* Set the solver, its preconditionner and the stopping criteria */
int ierr = cs_param_sles_set_solver("fcg", schur_slesp);
assert(ierr == 0);
ierr = cs_param_sles_set_precond("amg", schur_slesp);
assert(ierr == 0);
/* One uses CS_CDO_KEEP_DEFAULT as parameter to keep unchanged the default
settings */
1e-2, /* rtol */
CS_CDO_KEEP_DEFAULT, /* atol */
CS_CDO_KEEP_DEFAULT, /* dtol */
15); /* max. iter. */
ierr = cs_param_sles_set_amg_type("gamg", schur_slesp);
assert(ierr == 0);
/* Retrieve the additional system (the one used for the lumped inverse") */
ierr = cs_param_sles_set_solver("fgmres", xtra_slesp);
assert(ierr == 0);
ierr = cs_param_sles_set_precond("amg", xtra_slesp);
assert(ierr == 0);
ierr = cs_param_sles_set_amg_type("boomer", xtra_slesp);
assert(ierr == 0);
ierr = cs_param_sles_set_precond_block_type("diag", xtra_slesp);
assert(ierr == 0);
1e-2, /* rtol */
CS_CDO_KEEP_DEFAULT, /* atol */
CS_CDO_KEEP_DEFAULT, /* dtol */
20); /* max. iter. */
}

Additional settings

Immediate exit

In some cases, the initial residual is very small leading to an immediate exit. If this happens and it should not be the case, one can specify a stronger criterion to trigger the immediate exit. In the function cs_user_linear_solvers, add the following lines for instance:

{
/* Redefine the threshold under which an immediate exit occurs */
}

Allow no operation

An immediate exit is possible (i.e. there is no solve) when the norm of the right-hand side is equal to zero or very close to zero (cs_sles_set_epzero). To allow this kind of behavior for the SLES associated to an equation, one has to set the key CS_EQKEY_SOLVER_NO_OP to true.

{
/* In case of a in-house solver, allow one to skip the solve in some
specific situation rhs and solution nearly equal to zero */
}