Installation of code-saturne V2RC1 in parallel

Yvan Fournier

Re: Installation of code-saturne V2RC1 in parallel

Post by Yvan Fournier »

Hello,

Which version of OpenMPI are you using? I found this thread, http://www.open-mpi.org/community/lists ... 5/3175.php, which seems to indicate that you should modify line 135 of the runcase_mpi_env file (./share/apps/code_saturne/2.0/ncs-2.0.0-rc1/bin/share/ncs/runcase_mpi_env) to remove the 'machinefile $MPIHOSTS' part.

This is probably not an issue with OpenMPI 1.3: one of our clusters uses OpenMPI 1.3 (with Torque, if I remember correctly) and we have not encountered this issue. Still, it is easy to try.

Also, when you tested running "env" under MPI, was your SALOME environment sourced?

I see 2 possible environment conflicts in your case:
  • SALOME environment does not play well with OpenMPI
  • OpenMPI clashes with the one from OpenFoam, also in your path.
The second case seems improbable, as /opt/openmpi appears first in your PATH. Still, now that we have seen its output, it would be useful to replace 'set -x' in your runcase with 'echo LD_LIBRARY_PATH=$LD_LIBRARY_PATH' to make sure of this: clashes between multiple MPI versions can cause crashes that are difficult to debug.
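
A minimal sketch of that check, assuming a POSIX shell. The helper name list_path_entries is hypothetical and not part of the runcase; it only prints each entry of a colon-separated variable on its own line, so the order in which MPI installations are searched is easier to read:

```shell
#!/bin/sh
# Hypothetical helper: print each entry of a colon-separated list on its own line.
list_path_entries() {
    printf '%s\n' "$1" | tr ':' '\n'
}

echo "PATH entries, in search order:"
list_path_entries "$PATH"

echo "LD_LIBRARY_PATH entries, in search order:"
list_path_entries "${LD_LIBRARY_PATH:-}"

# Show which mpiexec is picked up first, if any.
command -v mpiexec || echo "mpiexec not found in PATH"
```

If an OpenFoam or SALOME directory shows up before /opt/openmpi in either list, that would point to the second kind of conflict.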

Finally, in case the environment transferred by OpenMPI differs from the standard one, it would be interesting to add:
echo 'env' >> $localexec
just under:
echo '#!/bin/sh' > $localexec
near line 875 of your 'runcase', and to add 'env' near the beginning of your runcase as well.

In any case, the error message seems to indicate that the script was not even able to start the cs_solver executable. Forcing the generated "localexec" script to dump its environment before calling the code would therefore also show whether the mpiexec command at least gets that far, which would help determine whether this is, for example, a localhost or an LD_LIBRARY_PATH issue.
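
As an illustration only, here is a small standalone sketch of such a wrapper, assuming the goal is to dump the environment with the 'env' command. The function name write_debug_wrapper and the temporary file are hypothetical; the real runcase generates $localexec itself:

```shell
#!/bin/sh
# Hypothetical helper: write a wrapper script that dumps its environment.
# In the real runcase, the launch of the solver would follow these lines.
write_debug_wrapper() {
    echo '#!/bin/sh' > "$1"
    echo 'env | sort' >> "$1"
    echo 'echo "wrapper reached on $(uname -n)"' >> "$1"
}

tmp_wrapper=$(mktemp)      # stand-in for the runcase's $localexec
write_debug_wrapper "$tmp_wrapper"
sh "$tmp_wrapper"          # under MPI, mpiexec would start this on each node
rm -f "$tmp_wrapper"
```

If the "wrapper reached" line never appears in the job output, the failure happens before the wrapper starts, which points at the mpiexec launch rather than at the solver's libraries.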

Best regards,

  Yvan
David Monfort

Re: Installation of code-saturne V2RC1 in parallel

Post by David Monfort »

Hi,

To complete Yvan's answer on MPI, here is a possible solution for the following issue:
/share/apps/code_saturne/2.0/librairies/ecs-2.0.0-rc1/bin/cs_preprocess: error while loading shared libraries: libhdf5.so.0: cannot open shared object file: No such file or directory
It seems that the hdf5 library against which the preprocessor is linked is not found at runtime. I assume you compiled the code with the SALOME prerequisites. If so, you may add something like
LD_LIBRARY_PATH=/path/to/hdf5/lib:$LD_LIBRARY_PATH
to your environment file (.profile).
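
To see which libraries are actually missing, the output of ldd can be filtered. This is a rough sketch assuming a Linux system where ldd is available; the helper names filter_missing and missing_libs are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: keep only the library names that ldd reports as "not found".
filter_missing() {
    awk '/not found/ { print $1 }'
}

# Hypothetical helper: list the unresolved shared libraries of a binary.
missing_libs() {
    ldd "$1" 2>/dev/null | filter_missing
}

# Path taken from the error message quoted above.
binary=/share/apps/code_saturne/2.0/librairies/ecs-2.0.0-rc1/bin/cs_preprocess
if [ -x "$binary" ]; then
    missing_libs "$binary"
fi
```

Running it again after changing LD_LIBRARY_PATH shows whether libhdf5.so.0 is now resolved. Note that the variable only reaches child processes if it is exported in the shell that launches the job.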

David
Serra Sylvain

Re: Installation of code-saturne V2RC1 in parallel

Post by Serra Sylvain »

Hi,

In the end, the installation is OK.

We had several different problems:

- There were some permission problems. The installation had been done as root, and some users did not have the rights to use all the libraries.

- After that, we could run simulations on one node with many cores, but not across several nodes.
The solution was to update OpenMPI.
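
For anyone hitting the same permission problem, a possible check is sketched below. It assumes a POSIX find; the helper name list_unreadable is hypothetical and the prefix is only taken from the paths quoted earlier in this thread:

```shell
#!/bin/sh
# Hypothetical helper: list files that other users cannot read, and
# directories that other users cannot read or traverse, below a prefix.
list_unreadable() {
    find "$1" \( -type f ! -perm -o=r -o -type d ! -perm -o=rx \) -print
}

# Assumed install prefix, taken from the paths quoted in this thread.
prefix=/share/apps/code_saturne/2.0
if [ -d "$prefix" ]; then
    list_unreadable "$prefix"
fi
```

An empty output means every file and directory under the prefix is at least readable by all users.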

Now, all is OK.

Thank you for all your answers.

Sylvain