Error during calculation
Posted: Wed Mar 04, 2015 4:12 pm
Hello,
I ran a quick calculation with Code_Saturne 3.0.6 on Ubuntu 14.04, and it ended with an error very quickly (after 13 iterations).
Error report:
Call stack:
1: 0x7f7c7c351810 <ompi_op_base_sum_double+0x20> (libmpi.so.1)
2: 0x7f7c75b0ff18 <ompi_coll_tuned_allreduce_intra_recursivedoubling+0x4a8> (mca_coll_tuned.so)
3: 0x7f7c7c32aeee <PMPI_Allreduce+0x16e> (libmpi.so.1)
4: 0x7f7c7dd7699f <parsom_+0x2f> (libsaturne.so.0)
5: 0x7f7c7df59cf2 <vitens_+0x932> (libsaturne.so.0)
6: 0x7f7c7de33737 <resopv_+0x2247> (libsaturne.so.0)
7: 0x7f7c7de0c983 <navstv_+0x2e03> (libsaturne.so.0)
8: 0x7f7c7de45d80 <tridim_+0x43e0> (libsaturne.so.0)
9: 0x7f7c7dd275d7 <caltri_+0x2e87> (libsaturne.so.0)
10: 0x7f7c7dcfecf5 <cs_run+0xa35> (libsaturne.so.0)
11: 0x7f7c7dcfe1aa <main+0x14a> (libsaturne.so.0)
12: 0x7f7c7d2ecec5 <__libc_start_main+0xf5> (libc.so.6)
13: 0x400d29 <> (cs_solver)
End of call stack
and the message from the GUI:
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec has exited due to process rank 0 with PID 2854 on
node 1419-rdlabo exiting improperly. There are two reasons this could occur:
1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.
2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"
This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
****************************
Saving calculation results
****************************
solver script exited with status 1.
Error running the calculation.
Check code_saturne log (listing) and error* files for details.
Error in calculation stage.
From what I can understand, it is linked to parsom and parallel calculations. I am using a simple user boundary conditions file (enclosed) to compute the velocity on the outlet from a prescribed volume flow rate, taking the orientation of the faces into account; the overall logic is sketched below.
I have used this same file successfully before, but at that time the case was running on 2 cores and everything was fine. I recently upgraded my computer and now run on 16 processes, and it crashes. In an earlier calculation (with a different mesh), it failed after 1700 iterations.
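For context, the kind of logic involved looks roughly like the sketch below (placeholder names only; 'outlet', flow_rate, surf_out and vel_out are not the actual identifiers of the enclosed file): the outlet area owned by each rank is accumulated locally, then parsom is called once, outside the face loop, so that every process takes part in the same collective sum.

! Sketch with placeholder names, not the enclosed file itself.
surf_out = 0.d0
call getfbr('outlet', nlelt, lstelt)
do ilelt = 1, nlelt
  ifac = lstelt(ilelt)
  surf_out = surf_out + surfbn(ifac)   ! local outlet area on this rank
enddo
if (irangp.ge.0) then
  call parsom(surf_out)                ! parallel sum, called once per rank
endif
vel_out = flow_rate / max(surf_out, epzero)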
Can someone help me understand?
Thanks.