parallel computing using sgi-mpt


Re: parallel computing using sgi-mpt

Post by lzhang » Sat Feb 23, 2019 4:32 pm

Hello,

As I mentioned in a previous post, I finally succeeded in launching a single simulation on a cluster using the method presented in https://www.hpc.ntnu.no/display/hpc/Tut ... de+Saturne

But now I have a problem when trying to apply the same method to a coupled simulation (using cs_user_coupling-saturne.c) on a cluster. First, I use the command "code_saturne run --initialize --coupling coupling_parameters.py" to initialize the calculation. In the newly created directory under RESU/COUPLING, two run scripts are created: run_solver and mpmd_exec.sh. As for a single simulation, run_solver is modified by adding some job submission information:

Code:

#PBS -S /bin/bash
#PBS -N job-VIV
#PBS -o output.txt
#PBS -e error.txt
#PBS -l walltime=72:00:00
#PBS -l select=1:ncpus=30:mpiprocs=30:mem=30gb
and the parameter "-np 30" is added after the "mpiexec_mpt" command. The job is then submitted with "qsub run_solver".
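For reference, the modified launch line in run_solver might look roughly like this (a hypothetical sketch; the actual executable paths and arguments are generated by "code_saturne run --initialize" and will differ):

```shell
# Hypothetical sketch: run the generated MPMD wrapper on 30 ranks via SGI MPT.
mpiexec_mpt -np 30 ./mpmd_exec.sh
```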

In the attached file, I provide a small test that fails with two coupled cases, fluid1 and fluid2. When I submit the simulation, I get an error in fluid1 saying that

Code:

/gpfs/workdir/lzhang/code_saturne-5.0.8/src/base/cs_sat_coupling.c:2010: Fatal error.

At least 1 Code_Saturne coupling was defined for which
no communication with a Code_Saturne instance is possible.
Then I checked fluid2 and found that the fluid2 case was not even launched, which may explain the error in fluid1.

Do you have any idea how to fix this problem, please? Maybe I should also modify the mpmd_exec.sh script.

Thanks a lot!

Best regards,
Lei
Attachments
Test_2instances.tar
(780 KiB)


Re: parallel computing using sgi-mpt

Post by Yvan Fournier » Sat Feb 23, 2019 5:22 pm

Hello,

Looking at the listing, it seems that all 30 ranks were associated with "fluid1" and none with "fluid2". This probably means the $MPI_RANK variable used in mpmd_exec.sh was not recognized, and may need to be adapted to your MPI library.

First, a few suggestions for fixing and automating this:

The tutorial you refer to does not seem to cover the specific case of coupled runs.

Normally, if you have completed the post-install setup of Code_Saturne (i.e. adapted the <install_prefix>/etc/code_saturne.cfg file, as detailed in the install documentation), you should have both a correct interaction with the batch system and a definition of the MPMD launch mode:
  • the batch entry ensures that running "code_saturne create" will add a batch template to the runcase file, and that "code_saturne gui" will allow you to set up the main batch parameters graphically
  • the "mpmd" section tells the script whether it should use the mpiexec -n 1 <exe1> : -n 2 <exe2> syntax, a configfile, or a special script (depending on the possibilities of your MPI library and batch system); in your example, the script seems to be used; I prefer the mpiexec MPMD syntax, if your actual mpiexec_mpt command handles it
When everything is configured correctly, "code_saturne run --initialize" prepares a "run_solver" script which handles the coupled start using the appropriate method. This script does not contain the batch parameters itself. When using "code_saturne submit", the first stage of "initialize" ("stage": data copy + source compilation) is run interactively, and the rest of the "runcase" (mesh import + compute) is submitted to batch.
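To illustrate, the relevant entries of a post-install code_saturne.cfg might look something like this (a sketch only; the exact section and key names should be checked against the code_saturne.cfg.template shipped with your install, and the "PBS" / "mpiexec_mpt" values are assumptions matching the cluster described in this thread):

```ini
[install]
# Batch system type, so "code_saturne create" inserts a batch template
batch = PBS

[mpi]
# MPI launcher command to use
mpiexec = mpiexec_mpt
# MPMD launch mode: mpiexec-style "-n 1 exe1 : -n 2 exe2" syntax
mpmd = mpiexec
```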

All of this is not more automated because MPI and batch system vendors seem to conspire to build as many variants with subtle differences as possible, often recommending syntaxes slightly different from the one recommended in the MPI standard, so you often need to check the docs of your own system if the default settings do not work. In your case, as mpmd_exec.sh does not seem to use the correct environment variable, it may need to be adapted, but switching to one of the 2 other options might work better. I may be able to help you if you have elements of the relevant documentation, as I do not have access to a similar system (we have mainly used SLURM with Intel MPI, OpenMPI, or MPICH in recent years, and have some collaborators using other systems).

Best regards,

Yvan


Re: parallel computing using sgi-mpt

Post by lzhang » Mon Feb 25, 2019 12:06 pm

Hello,

Indeed, it seems to be a problem with the $MPI_RANK variable. I am trying to reinstall Code_Saturne with the OpenMPI option for parallel computing to see if it works.

I still have a few questions regarding your response. You wrote:

"In your case, as mpmd_exec.sh does not seem to use the correct environment variable, it may need to be adapted, but switching to one of the 2 other options might work better. I may be able to help you if you have elements of the relevant documentation, as I do not have access to a similar system (we have mainly used SLURM with Intel MPI, OpenMPI, or MPICH in recent years, and have some collaborators using other systems)."

Here, what do you mean by "the 2 other options"? And what kind of documentation would you need to better understand the problem?

Thanks a lot for your help!

Best regards,
Lei


Re: parallel computing using sgi-mpt

Post by Yvan Fournier » Mon Feb 25, 2019 9:49 pm

Hello,

I found some MPT documentation on the web (the SGI MPI and SGI SHMEM User Guide), which mentions that to run an MPMD (multiple program, multiple data) program, you can use syntax such as:
mpirun -np 1 prog1 : -np 5 prog2
which is similar to the recommended MPI standard:
mpiexec -n 1 prog1 : -n 5 prog2
(for prog1 on 1 rank, prog2 on 5 ranks), though it does not specify whether the -wd (working directory) argument is supported.

Since other mpiexec_mpt documentation shows use of the -n option instead of -np (as for mpiexec), I assume mpiexec_mpt has the recommended mpiexec behavior and options.
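Applied to the fluid1/fluid2 test above, an MPMD launch with this syntax might look roughly as follows (a hypothetical sketch: the solver path, the 15/15 rank split, and -wd support all need checking on your system):

```shell
# Hypothetical MPMD launch splitting 30 ranks between the two coupled
# instances; -wd sets each instance's working directory (if supported).
mpiexec_mpt -n 15 -wd ./fluid1 ./cs_solver : -n 15 -wd ./fluid2 ./cs_solver
```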

In this case, the 3 possible MPMD options in the Code_Saturne config file are the ones listed in the last entry of the code_saturne.cfg template. Since the mpiexec syntax seems to be available, I recommend that one (so uncomment "mpmd = mpiexec").

Also set "mpiexec = mpiexec_mpt" in the file.

Note that by default, code_saturne.cfg is not present in the "etc" subdirectory of the code install. To avoid overwriting it on reinstall, only code_saturne.cfg.template is copied. You must therefore copy code_saturne.cfg.template to code_saturne.cfg and adapt it. For testing, copying it to $HOME/.code_saturne.cfg is also possible (but in that case it has priority over all installs, which might not be what you want).
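In commands, that copy step might look like this (a sketch; replace <install_prefix> with your actual install path):

```shell
# Assumption: <install_prefix> is where Code_Saturne was installed.
cd <install_prefix>/etc
cp code_saturne.cfg.template code_saturne.cfg    # then edit code_saturne.cfg
# Or, for quick testing (takes priority over all installs):
cp code_saturne.cfg.template $HOME/.code_saturne.cfg
```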

If this fails, you can run "mpiexec_mpt -n 2 /usr/bin/env" to check which environment variable mpiexec_mpt uses to indicate the MPI rank of each process, so I can adapt the detection code for the "script" MPMD mode. But using the standard mpiexec method is better.
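For example (a sketch; the grep just narrows the output to likely rank variables):

```shell
# Print each rank's environment and look for a rank-numbering variable
# (names such as MPI_RANK or PMI_RANK vary between MPI libraries).
mpiexec_mpt -n 2 /usr/bin/env | grep -i rank
```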

Regards,

Yvan


Re: parallel computing using sgi-mpt

Post by lzhang » Wed Feb 27, 2019 2:35 pm

Hello,

Thank you very much for your response! I reinstalled Code_Saturne using Intel MPI; it works, and I can now run a coupled simulation on the cluster.

By the way, do you have any recommendations for choosing the number of processors for each instance in a coupled simulation, and for the number of CPUs to request in the job submission? For example, I have a coupled simulation composed of 10 instances, and I ask for 20 CPUs using "#PBS -l select=1:ncpus=20:mpiprocs=20:mem=10gb". I found that using different numbers of processors for each instance (the parameter set in coupling_parameters.py) can produce very different system resolution times.

Best regards,
Lei


Re: parallel computing using sgi-mpt

Post by Yvan Fournier » Wed Feb 27, 2019 11:48 pm

Hello,

For Code_Saturne, on a Xeon/InfiniBand-type cluster, we usually get the best efficiency at around 50000 (30000 to 80000) cells per MPI rank. In most cases, OpenMP threads lead to similar or lower performance, so a "pure" MPI configuration is recommended. Starting from 50000 cells per rank and testing both half and double that (if you have enough available processors) is recommended.
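As plain arithmetic (not a Code_Saturne command), the rule of thumb above gives the following for a hypothetical 3-million-cell mesh:

```shell
# ~50000 cells per MPI rank, with half/double as the test bracket.
cells=3000000
ranks=$((cells / 50000))
echo "start from $ranks ranks"                         # start from 60 ranks
echo "test between $((ranks / 2)) and $((ranks * 2))"  # test between 30 and 120
```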

In the Code_Saturne performance.log, you can see for the coupling (PLE exchanges) how much time is spent working and how much waiting. The time scheme does not allow bringing the wait time to zero, but in general, when coupling 2 code instances/domains A and B, if A waits for B more than B waits for A, then B is the bottleneck, and assigning more processors to B and fewer to A will help you balance the computation. Finding the optimal value will require some experimentation, but these are the elements you can start from.
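A crude way to turn those performance.log numbers into a new split (simple proportional arithmetic with made-up work times; this is not anything Code_Saturne computes for you) is to give each domain ranks in proportion to its work time, i.e. elapsed time minus wait time:

```shell
# Hypothetical: 30 ranks total; per-domain work time (elapsed - wait)
# read from performance.log, in seconds.
total_ranks=30
work_a=800   # domain A computes most of the time
work_b=400   # domain B spends more time waiting for A
ranks_a=$(( total_ranks * work_a / (work_a + work_b) ))
ranks_b=$(( total_ranks - ranks_a ))
echo "try A: $ranks_a ranks, B: $ranks_b ranks"   # try A: 20 ranks, B: 10 ranks
```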

Best regards,

Yvan
