Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Kanssoune
Posts: 12
Joined: Tue Aug 02, 2022 8:05 am

Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Post by Kanssoune »

Hello
I am trying to run a coupled calculation between code_saturne and Syrthes, but when running it, I get the following error message at the "Starting calculation" stage:

Starting calculation
--------------------

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI COMMUNICATOR 3 SPLIT FROM 0
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[hotcell:124583] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[hotcell:124583] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
solver script exited with status 1.

Error running the coupled calculation.

Either code_saturne or SYRTHES may have failed.

Check code_saturne log (listing) and SYRTHES log (syrthes.log)
for details, as well as error* files.

Post-calculation operations
---------------------------

Error in calculation stage.

In the fluid error file, I have this message:

SIGTERM signal (termination) received.
--> computation interrupted by environment.

Call stack:
1: 0x7f2fbe956027 <epoll_wait+0x57> (libc.so.6)
2: 0x7f2fa917ea29 <ucs_event_set_wait+0x99> (libucs.so.0)
3: 0x7f2fa95e23eb <uct_tcp_iface_progress+0x7b> (libuct.so.0)
4: 0x7f2fa983cada <ucp_worker_progress+0x2a> (libucp.so.0)
5: 0x7f2fbdb3bf94 <opal_progress+0x34> (libopen-pal.so.40)
6: 0x7f2fbdb429d5 <ompi_sync_wait_mt+0xb5> (libopen-pal.so.40)
7: 0x7f2fbf095659 <ompi_request_default_wait+0x1e9> (libmpi.so.40)
8: 0x7f2fbf0c6a58 <PMPI_Intercomm_create+0x3a8> (libmpi.so.40)
9: 0x7f2fc126cf0a <ple_coupling_mpi_intracomm_create+0xda> (libple.so.2)
10: 0x7f2fc16266b1 <cs_syr4_coupling_init_comm+0x141> (libsaturne-7.0.so)
11: 0x7f2fc1628d49 <cs_syr_coupling_all_init+0x739> (libsaturne-7.0.so)
12: 0x7f2fc25e853c <main+0x27c> (libcs_solver-7.0.so)
13: 0x7f2fbe860d85 <__libc_start_main+0xe5> (libc.so.6)
14: 0x40094e <_start+0x2e> (cs_solver)
End of stack
Attached are the listing files (for fluid and solid).

Does anyone have an idea how to get through this? Any help would be greatly appreciated.
Attachments
solid_listing.txt
(3.11 KiB) Downloaded 909 times
fluid_listing.txt
(10.23 KiB) Downloaded 806 times
Yvan Fournier
Posts: 4157
Joined: Mon Feb 20, 2012 3:25 pm

Re: Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Post by Yvan Fournier »

Hello,

I have seen a similar issue several times in the past, though I have not used Syrthes recently. If Syrthes fails, logs can be limited... Do you have any other error message in the log? Could you run the "run_solver" script in /data/test_CS/RESU_COUPLING/20221103-1522 to see if you get another error log? In some cases, the Syrthes data file can contain a hidden command that gets in the way (set by the Syrthes GUI when checking the mesh or something of the sort).

Also make sure you ask for multiple iterations on the Syrthes side.
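
A minimal sketch of how the output of that script could be captured for posting, assuming Python 3 is available on the machine. The helper name `run_and_capture` and the log file name `run_solver_output.log` are illustrative choices, not part of code_saturne or Syrthes; only the script name and run directory come from this thread.

```python
import subprocess
from pathlib import Path


def run_and_capture(command, workdir, log_name="run_solver_output.log"):
    """Run `command` in `workdir` and save stdout and stderr to one log file.

    Returns (exit_status, combined_output).
    """
    result = subprocess.run(
        command,
        cwd=workdir,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout to keep message order
        text=True,
    )
    # Hypothetical log file name, written next to the script being run.
    (Path(workdir) / log_name).write_text(result.stdout)
    return result.returncode, result.stdout


# Example call (directory taken from this thread, adjust to your own run):
# status, output = run_and_capture(
#     ["./run_solver"], "/data/test_CS/RESU_COUPLING/20221103-1522")
# print("exit status:", status)
```

Merging stderr into stdout keeps shell messages and MPI messages in the order they were emitted, which makes the resulting file easier to read when attached to a post.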

Best regards,

Yvan
Kanssoune
Posts: 12
Joined: Tue Aug 02, 2022 8:05 am

Re: Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Post by Kanssoune »

Hello Yvan!

Thank you for your quick reply!

I have no other error messages in the logs.

When running the "run_solver" script in /data/test_CS/RESU_COUPLING/20221103-1522, it looks like "module" is not recognized:

./run_solver: line 8: module: command not found
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI COMMUNICATOR 3 SPLIT FROM 0
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[hotcell:135298] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[hotcell:135298] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Any suggestions?
Thanks!

Kanssoune
Yvan Fournier
Posts: 4157
Joined: Mon Feb 20, 2012 3:25 pm

Re: Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Post by Yvan Fournier »

Hello,

Can you post your "run_solver" script?

Best regards,

Yvan
Kanssoune
Posts: 12
Joined: Tue Aug 02, 2022 8:05 am

Re: Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Post by Kanssoune »

Hello,

Sorry, I was sick and only read your message today.

The script is attached.

Best regards,

Kanssoune
Attachments
run_solver.txt
(1001 Bytes) Downloaded 807 times
Yvan Fournier
Posts: 4157
Joined: Mon Feb 20, 2012 3:25 pm

Re: Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Post by Yvan Fournier »

Hello,

Checking how the "run_solver" script is generated, and looking at your script, I believe the warning message "module not found" can be ignored.

So I still have the impression the solid domain is causing the issue. Could you post the solid setup, except for the large files (mesh, ...)?
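
For background on the "module" message: `module` is usually a shell function defined by the initialization scripts of Environment Modules or Lmod, so a shell that has not sourced them reports it as an unknown command. A minimal sketch, assuming Python 3, for listing where a script invokes it. The function name `find_module_calls` and the script text below are hypothetical; the actual run_solver attached to this thread is not reproduced here.

```python
import re

# Matches a line whose first command word is `module` (e.g. "module load x").
MODULE_CALL = re.compile(r"^\s*module(\s|$)")


def find_module_calls(script_text):
    """Return (line_number, line) pairs for lines that invoke `module`."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if MODULE_CALL.match(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical script content, for illustration only.
example_script = """#!/bin/bash
# environment setup
module purge
export OMP_NUM_THREADS=1
mpiexec -n 2 ./cs_solver
"""

for number, line in find_module_calls(example_script):
    print(f"line {number}: {line}")  # prints: line 3: module purge
```

If the reported lines only load or purge modules that are already part of the working environment, the failure of those lines is unlikely to change which libraries the solver picks up.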

Best regards,

Yvan Fournier
Kanssoune
Posts: 12
Joined: Tue Aug 02, 2022 8:05 am

Re: Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Post by Kanssoune »

Hello,

Thanks for your reply.

Some files are attached. If you need me to post any other specific file, do not hesitate to ask.

Best regards,
Kanssoune
Attachments
mesh_solide.syr_desc.txt
(77 Bytes) Downloaded 786 times
mesh_solide.syr.txt
(3.1 MiB) Downloaded 814 times
solid.syd.txt
(3.13 KiB) Downloaded 807 times
Yvan Fournier
Posts: 4157
Joined: Mon Feb 20, 2012 3:25 pm

Re: Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Post by Yvan Fournier »

Hello,

Could you also post your Syrthes installation setup file (setup.ini) and the syrthes.profile (in the "bin" directory of the Syrthes install), as well as the "config.log" file from the code_saturne installation?

Seeing how early the issue happens, I wonder whether you have an MPI library mismatch, and those files will help me check.
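
One possible way to check this locally, as a rough sketch assuming Python 3 and a Linux system where `ldd` is available: run `ldd` on the code_saturne solver executable and on the Syrthes executable, then compare which MPI library each resolves to. The function names and the sample paths in the comments are hypothetical; only the library name `libmpi.so.40` appears in the call stack posted in this thread.

```python
import re

# One dependency line of `ldd` output looks like:
#   libmpi.so.40 => /path/to/libmpi.so.40 (0x00007f...)
# or, when unresolved:
#   libmpi.so.40 => not found
LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(.+?)(?:\s+\(0x[0-9a-fA-F]+\))?\s*$")


def mpi_libraries(ldd_output):
    """Map MPI library names (libmpi*) to what `ldd` resolved them to."""
    libs = {}
    for line in ldd_output.splitlines():
        match = LDD_LINE.match(line)
        if match and match.group(1).startswith("libmpi"):
            libs[match.group(1)] = match.group(2)
    return libs


def same_mpi(ldd_a, ldd_b):
    """True if both outputs list MPI libraries and resolve them identically."""
    a, b = mpi_libraries(ldd_a), mpi_libraries(ldd_b)
    return bool(a) and a == b


# Example with text saved from `ldd <executable>` (hypothetical file names):
# fluid = open("ldd_cs_solver.txt").read()
# solid = open("ldd_syrthes.txt").read()
# print(mpi_libraries(fluid), mpi_libraries(solid), same_mpi(fluid, solid))
```

Identical paths on both sides do not prove the installations are consistent, but two different resolved paths for `libmpi` would be a strong hint that the two codes were built against, or are running with, different MPI libraries.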

Best regards,

Yvan
Kanssoune
Posts: 12
Joined: Tue Aug 02, 2022 8:05 am

Re: Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Post by Kanssoune »

Hello,

The requested files are attached.

Thanks for your help.

Best regards,

Kanssoune
Attachments
syrthes.profile.txt
(3.2 KiB) Downloaded 816 times
setup.ini
(4.31 KiB) Downloaded 806 times
config.log
(199.52 KiB) Downloaded 795 times
Yvan Fournier
Posts: 4157
Joined: Mon Feb 20, 2012 3:25 pm

Re: Coupling Code_Saturne/Syrthes: SIGTERM signal (termination) received

Post by Yvan Fournier »

Hello,

The MPI version seems fine, so I do not see any version mismatch which could explain the issues here.

So back to the beginning....

If your test case is not too large, you can post it or send it to me so that I can see whether I can reproduce the issue.

Regards,

Yvan