MPI Error with CS/Syrthes coupling

AdamLarat
Posts: 6
Joined: Fri May 16, 2025 3:30 pm

Post by AdamLarat »

Hi there!
I am trying to run the CS/Syrthes coupled test case 3_2D_DISKS_2.
Each simulation runs as expected on its own (code_saturne 8.0.4, Syrthes 5.0).
However, when running the coupled simulation, I get the following MPI error:

mpiexec has exited due to process rank 1 with PID 0 on
node po21210 exiting improperly. There are three reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
orte_create_session_dirs is set to false. In this case, the run-time cannot
detect that the abort call was an abnormal termination. Hence, the only
error message you will receive is this one.

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).

You can avoid this message by specifying -quiet on the mpiexec command line.
--------------------------------------------------------------------------
 solver script exited with status 1.

Error running the coupled calculation.

Either of the coupled codes may have failed.

Check the following log files for details.

Domain FLUID (code_saturne):
  run_solver.log, error*.

Domain SOLID (SYRTHES):
  syrthes.log / listing

Looking at the specific logs, Syrthes seems to have run properly (the file SOLID/syrthes2_listing exists, is 332K in size and ends with "FIN NORMALE"), but Saturne seems to have done nothing:
- the run_solver.log file does not exist
- 'listing' is a broken link to that non-existent file
- the preprocessor.log file is present and ends with "preprocessor finish"

I am running the simulation on Scibian11 and saw in the following topic viewtopic.php?t=3052 there might be a problem with the MPI compilation of CS on Debian, so I rebuilt my code_saturne 8.0.4 entirely. This did not help.

Next, I noticed that the MPI splitting is done by the mpmd_exec.sh script, which is called by run_solver. So I tried running the ./cs_solver and ./syrthes commands from this script, exactly as written, in their respective directories. Surprisingly, the cs_solver command does something: it starts writing the expected run_solver.log until it crashes due to the absence of a running Syrthes instance (which is expected).
When I replace the cs_solver and syrthes commands in the mpmd_exec.sh script with a simple 'hostname' command, the processes are correctly created and each one prints the host name.
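
For readers who want to reproduce this kind of sanity check, here is a minimal sketch of a rank-based dispatch test. It is not the mpmd_exec.sh that code_saturne generates: the helper `domain_for_rank` and the variable `N_FLUID` are made up for illustration. It only assumes an Open MPI launcher, which exports `OMPI_COMM_WORLD_RANK` to each process it starts.

```bash
#!/bin/sh
# Hypothetical stand-in for an MPMD dispatch script (NOT the generated mpmd_exec.sh).
# Ranks below the fluid rank count are mapped to FLUID, all others to SOLID.
domain_for_rank() {
    rank="$1"
    n_fluid="$2"
    if [ "$rank" -lt "$n_fluid" ]; then
        echo FLUID
    else
        echo SOLID
    fi
}

# Open MPI sets OMPI_COMM_WORLD_RANK; default to 0 when run outside mpiexec.
rank="${OMPI_COMM_WORLD_RANK:-0}"
# N_FLUID is an assumed variable: the number of ranks given to code_saturne.
echo "rank $rank on $(uname -n) -> $(domain_for_rank "$rank" "${N_FLUID:-1}")"
```

Saved under an arbitrary name such as dispatch_check.sh, it could be launched with `mpiexec -n 2 sh ./dispatch_check.sh`. If each rank reports the expected domain, the launcher and the rank split are probably fine and the problem lies in the executables themselves.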

Does anyone have an idea of what is wrong with my setup?
Attachments: run.log, run.cfg.txt
Yvan Fournier
Posts: 4231
Joined: Mon Feb 20, 2012 3:25 pm

Post by Yvan Fournier »

Hello,

Are you sure your code_saturne and Syrthes builds use the same MPI installation ? Mismatched MPI libraries could explain the observed behavior. Since the Syrthes install script builds its own MPI library by default, this could happen if you are not extra careful.
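
One way to check this, as a minimal sketch: compare the MPI shared libraries that each executable resolves at run time, using `ldd`. The function names `extract_mpi_libs`, `mpi_libs` and `same_mpi` are invented for this example, and it assumes both executables link MPI dynamically.

```bash
#!/bin/sh
# Read `ldd` output on stdin and print the resolved paths of libmpi* libraries.
# Unresolved entries ("=> not found") are dropped because field 3 is not a path.
extract_mpi_libs() {
    awk '$1 ~ /^libmpi/ && $2 == "=>" && $3 ~ /^\// { print $3 }' | sort -u
}

# Print the MPI libraries that a given executable resolves to (may be empty).
mpi_libs() {
    ldd "$1" 2>/dev/null | extract_mpi_libs
}

# Succeed only if both executables resolve to the same, non-empty set of MPI libraries.
same_mpi() {
    a=$(mpi_libs "$1")
    b=$(mpi_libs "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For instance, `same_mpi FLUID/cs_solver SOLID/syrthes && echo "same MPI"` could be run from the coupled run directory. Those two paths are assumptions and depend on where your executables actually are.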

Best regards,

Yvan

Post by AdamLarat »

Hi Yvan,

Thank you very much for your quick answer!

I have been looking at the compilation logs of Saturne and Syrthes and both seem to use /usr/bin/mpicc for the mpi part.

Then I noticed that Syrthes compiles two separate libraries: the 'seq' one and the 'mpi' one. I suddenly remembered that during the on-the-fly compilation of Syrthes at the beginning of the coupled simulation, I got an error complaining about a missing libsyrthes_cfd.a library. To work around it, I made a symbolic link in ${HOME_SYRTHES}/lib from libsyrthes_seq.a to libsyrthes_cfd.a (50% chance ;-) ).
The libsyrthes_cfd.a link is now pointing toward the libsyrthes_mpi.a library but this does not help either :-(

What do you mean by "Syrthes builds its own MPI library"?
Attachments: cs_config.log, cs_Makefile.txt, syrthesmpi.log

Post by Yvan Fournier »

Hello,

I mean that Syrthes can compile its own OpenMPI library as an external dependency, but this does not seem to be the case here.

But you may be missing a setting in the syrthes-install/setup.ini file : you should have:

ple USE=yes
syrthescfd INSTALL=yes
Otherwise, you will be missing the libsyrthes_cfd.a file, which is different from (an addition to) the libsyrthes_mpi.a file, so a symbolic link is not the correct solution.
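
As a quick sanity check, the sketch below reports whether libsyrthes_cfd.a in a given directory is a real file, a symbolic link, or missing. The function name `check_cfd_lib` is made up for this example, and the directory to pass (${HOME_SYRTHES}/lib is the one mentioned earlier in this thread) depends on your installation.

```bash
#!/bin/sh
# Report the state of libsyrthes_cfd.a inside the directory given as $1.
# Prints "symlink", "ok" (regular file) or "missing".
check_cfd_lib() {
    lib="$1/libsyrthes_cfd.a"
    if [ -L "$lib" ]; then
        # A link to the seq or mpi library is a workaround, not a CFD build.
        echo "symlink"
    elif [ -f "$lib" ]; then
        echo "ok"
    else
        echo "missing"
    fi
}
```

Calling `check_cfd_lib "${HOME_SYRTHES}/lib"` should print "ok" once Syrthes has been reinstalled with the CFD settings. "symlink" means a leftover workaround link is still in place.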

Best regards,

Yvan

Post by AdamLarat »

Thanks Yvan, that was exactly the problem ! 

Maybe all this should be added to the 3_2D_DISKS step-by-step document in a prerequisite sub-section of the introduction :

### Prerequisite

 * Saturne 8.0.x compiled with MPI support:
 ```bash
 ./configure --prefix=path/to/install/dir --with-mpi=/usr/lib/x86_64-linux-gnu/openmpi CC=mpicc CXX=mpicxx FC=gfortran
 make
 make install
 ```

 * Syrthes 5.0 compiled in CFD mode:
 ```bash
 cd src/syrthes-install/
 # Modify the last lines of the 'setup.ini' file to activate CFD mode. They should look like:
 # ple USE=yes   PATH=/home/alarat/Codes/Carambarre/libs/code_saturne-8.0.4/
 # syrthescfd INSTALL=yes
 #
 ./syrthes_install.py
 source arch/Linux_x86_64/bin/syrthes.profile
 ```

 * Check in Saturne's "configure.log" and in Syrthes' "resume" compilation logs that the same MPI installation has been used (usually /usr/bin/mpicc and /usr/bin/mpicxx).

 * The line change in Saturne's etc/code_saturne.cfg could be moved to this paragraph. I see in this forum that many people inadvertently skip it.
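
The setup.ini step above could also be verified before launching the install script. The sketch below is only an illustration: the function name `check_setup_ini` is made up, and it assumes the two settings appear one per line in the form quoted in this thread (`ple USE=yes ...` and `syrthescfd INSTALL=yes`).

```bash
#!/bin/sh
# Check that the setup.ini file given as $1 enables the Syrthes CFD mode.
# Commented-out lines do not match because each keyword must start its line.
check_setup_ini() {
    ini="$1"
    if ! grep -Eq '^[[:space:]]*ple[[:space:]]+.*USE=yes' "$ini"; then
        echo "ple USE=yes not set"
        return 1
    fi
    if ! grep -Eq '^[[:space:]]*syrthescfd[[:space:]]+.*INSTALL=yes' "$ini"; then
        echo "syrthescfd INSTALL=yes not set"
        return 1
    fi
    echo "CFD mode enabled"
}
```

For example, `check_setup_ini src/syrthes-install/setup.ini` could be run from the Syrthes source tree before `./syrthes_install.py`. That path follows the `cd` shown above and may differ on your machine.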