Re: CS_4.0.2 and syrthes4.3.0 coupling MPI ABORT

Posted: Mon Jan 18, 2016 2:19 am
by Yvan Fournier
Hello,

Do you have a backtrace for this crash?

As before, even if the cause of the crash is different, both "listing" and "syrthes.log" might contain useful info, so please post them here.

Regards,

Yvan

Re: CS_4.0.2 and syrthes4.3.0 coupling MPI ABORT

Posted: Mon Jan 18, 2016 12:29 pm
by ROLLAND
Hi,

Here is the error message:

  ---------------------------
  Start SYRTHES preprocessing
  ---------------------------

Updating the mesh file name.. 
   -> OK


 **********************
  Starting calculation
 **********************

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI COMMUNICATOR 3 SPLIT FROM 0 
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[61930,1],0]
  Exit code:    1
--------------------------------------------------------------------------
[rolland-Precision-WorkStation-T7400:12807] 1 more process has sent help message help-mpi-api.txt / mpi-abort
[rolland-Precision-WorkStation-T7400:12807] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
 solver script exited with status 1.

Error running the coupled calculation.

Either Code_Saturne or SYRTHES may have failed.

Check Code_Saturne log (listing) and SYRTHES log (syrthes.log)
for details, as well as error* files.


 ****************************
  Saving calculation results
 ****************************

 Error in calculation stage.

rolland@rolland-Precision-WorkStation-T7400:~/Documents/EXEMPLE3/TEST3/CAS1$ 
The "listing" and "syrthes.log" files are attached.

Regards,


Q.ROLLAND

Re: CS_4.0.2 and syrthes4.3.0 coupling MPI ABORT

Posted: Tue Jan 19, 2016 4:18 pm
by Yvan Fournier
Hello,

This is strange, as there does not seem to be any logging or backtrace of the cause of the crash.

If your meshes are not too large and not confidential, you can post them here (along with the setup) so that I can check if I reproduce the issue. Otherwise, I could try to walk you through using a debugger (just to go far enough to see where the crash occurs).

Also, could you try the same case with "allow_nonmatching" set to "false", but using a greater tolerance (0.3, or even up to 2 or 3)? This should allow location even with coarse and curved meshes, but generate an explicit error if the coordinates of the fluid and solid boundaries are really too far apart (in which case visualizing the extracted submeshes again, both together and separately, may help).

In any case, I may have limited web access for the rest of this week, so I'll check when I can (next week will be "back to a normal schedule").

Regards,

Yvan

Re: CS_4.0.2 and syrthes4.3.0 coupling MPI ABORT

Posted: Wed Jan 20, 2016 9:56 am
by ROLLAND
Hi,

I set "allow_nonmatching" to "false" and increased the tolerance to 2, 3 and even 10, but nothing changed. In Salome, the two meshes coincide perfectly at the common interface, at least from what I can see in the Salome GUI (I used the same maximum mesh size and the same mesh type on the interface for both). The error message does not change no matter what I use for "tolerance" and "allow_nonmatching".

Maybe I should split the common interface into two different faces? Or create some kind of common intermediate volume?

You will find the geometry, the meshes and the configuration files for the fluid and solid domains in the attached documents. Thank you.

Regards,

Q.ROLLAND

Re: CS_4.0.2 and syrthes4.3.0 coupling MPI ABORT

Posted: Tue Jan 26, 2016 2:27 am
by Yvan Fournier
Hello,

I checked your meshes, and for the solid, it seems the coupled surface has group "4", but as Syrthes handles families and not groups, you need to check the associated matchings in Maillage_2.syr_desc.

In my case (opening your file with Salome 7.1.1), the family used for group "4" in the solid mesh is family 9, so you should have:

CLIM= COUPLAGE_SURF_FLUIDE SYRTHES1 9

instead of

CLIM= COUPLAGE_SURF_FLUIDE SYRTHES1 4

Yes, use of Syrthes with MED files is tricky...

I'll let you test this (if it fails, let me know and I'll test it, but it's late for now).
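
To double-check this kind of group/family mismatch, the coupling line can be compared with the family number read from the .syr_desc file. The following is only a minimal sketch: the helper names are hypothetical (they are not part of Syrthes or Code_Saturne), and it assumes the coupling lines have exactly the form quoted above (CLIM= COUPLAGE_SURF_FLUIDE, a coupling name, then one or more reference numbers).

```python
def coupled_surface_refs(lines):
    """Map each coupling name to the reference numbers found on
    'CLIM= COUPLAGE_SURF_FLUIDE <name> <refs...>' lines.

    Hypothetical helper: it only understands the line form quoted
    in this thread and ignores every other line.
    """
    refs = {}
    for line in lines:
        tokens = line.split()
        if len(tokens) < 4:
            continue
        if tokens[0] != "CLIM=" or tokens[1] != "COUPLAGE_SURF_FLUIDE":
            continue
        numbers = [int(t) for t in tokens[3:] if t.isdigit()]
        refs.setdefault(tokens[2], []).extend(numbers)
    return refs


def uses_family(lines, coupling_name, family):
    """True if the given family number is listed for this coupling."""
    return family in coupled_surface_refs(lines).get(coupling_name, [])


# The line from the failing setup uses the Salome group number (4),
# while the .syr_desc file reports family 9 for that group.
failing_setup = ["CLIM= COUPLAGE_SURF_FLUIDE SYRTHES1 4"]
print(uses_family(failing_setup, "SYRTHES1", 9))  # prints False
```

The family number itself (9 here) still has to be read from the .syr_desc file by hand, since its layout is not described in this thread.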

Regards,

Yvan

Re: CS_4.0.2 and syrthes4.3.0 coupling MPI ABORT

Posted: Thu Jan 28, 2016 2:58 pm
by ROLLAND
Hi,

I checked "Maillage_2.syr_desc" and indeed "group_of_faces" shows 9 for the face originally named 4 in Salome. I changed the surface coupling in the Syrthes GUI to 9 instead of 4, and it's working! Thanks for your help!
However, the Reference time step and the Number of iterations have to match perfectly in both GUIs, otherwise the calculation systematically crashes!

Regards,

Q. ROLLAND

Re: CS_4.0.2 and syrthes4.3.0 coupling MPI ABORT

Posted: Thu Jan 28, 2016 4:18 pm
by Yvan Fournier
Hello,

The reference time steps and number of iterations issue is strange:

- The code stopping first should stop the computation. I usually control this from one code, using a large number of time steps on the other, with no issue. I'll check on your case, as I have not tested this recently.

- The reference time step should be important from a physical point of view (assuming an unsteady computation), but can normally be different (for possibly faster convergence of steady physics). There is normally no check between the two codes' time steps with default options (or at least, I'll check that the matching option in cs_coupling is not activated by default).

Could you post logs (listing, error*, syrthes.log) of a case which crashes due to time step differences?
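
For gathering those files before posting, a small script along these lines may help. It is only a sketch: the function names and the directory argument are assumptions, and only the file names (listing, syrthes.log, error*) come from this thread.

```python
from pathlib import Path


def collect_logs(run_dir, patterns=("listing", "syrthes.log", "error*")):
    """Return the log files found directly inside run_dir.

    run_dir is whatever directory holds the results of the coupled
    run; its actual location depends on the case setup.
    """
    run_dir = Path(run_dir)
    found = []
    for pattern in patterns:
        found.extend(sorted(p for p in run_dir.glob(pattern) if p.is_file()))
    return found


def tail(path, n=20):
    """Return the last n lines of a text file."""
    text = Path(path).read_text(errors="replace")
    return text.splitlines()[-n:]


if __name__ == "__main__":
    # "." is a placeholder for the results directory.
    for log in collect_logs("."):
        print(f"==== {log} ====")
        print("\n".join(tail(log)))
```

The last lines of each file are usually where an error message would appear, but the complete files remain more useful for diagnosis.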

Regards,

Yvan

Re: CS_4.0.2 and syrthes4.3.0 coupling MPI ABORT

Posted: Fri Jan 29, 2016 11:48 am
by ROLLAND
Hi,

I checked the last compilation and I spoke too quickly. Indeed, the error comes from the time configuration:
- I put 1 and 10 for the solid domain in the Syrthes GUI for the Global Number of time steps and Time step (in seconds) respectively.
- And I put 100 and 0.1 for the fluid domain in the CS GUI for the Number of iterations and Reference time step respectively.

I guess it was not a good idea :roll: as the solid domain calculation had only one time step to compute, which was not sufficient to complete the fluid/solid coupling, so it necessarily ended in a crash.
After changing the Global Number of time steps from 1 to 2, the coupled calculation works! I was too quick in my answer.
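
The arithmetic behind this setup can be written out as a quick sanity check. This is a sketch under assumptions: the function names are hypothetical, and the minimum of 2 steps is only what was observed in this thread (1 step crashed, 2 steps ran), not a documented limit of either code.

```python
def simulated_time(n_steps, dt):
    """Total physical time covered by n_steps steps of size dt (seconds)."""
    return n_steps * dt


def codes_with_too_few_steps(step_counts, min_steps=2):
    """Names of the codes asked to run fewer than min_steps steps.

    min_steps=2 is an assumption taken from this thread only.
    """
    return [name for name, n in step_counts.items() if n < min_steps]


# Values from the failing setup: (number of steps, time step in seconds).
setup = {
    "solid (SYRTHES)": (1, 10.0),
    "fluid (Code_Saturne)": (100, 0.1),
}

for name, (n_steps, dt) in setup.items():
    total = simulated_time(n_steps, dt)
    print(f"{name}: {n_steps} step(s) of {dt} s, about {total:g} s in total")

flagged = codes_with_too_few_steps({k: v[0] for k, v in setup.items()})
print("too few steps:", flagged)
```

Both codes cover about 10 s of physical time here, so the totals agree; the only thing flagged is the single solid time step.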

Regards,

Q.ROLLAND

Re: CS_4.0.2 and syrthes4.3.0 coupling MPI ABORT

Posted: Fri Jan 29, 2016 12:44 pm
by Yvan Fournier
Hello,

Actually, I recently encountered the crash when asking for only 1 iteration. A cleaner exit should be possible in this case, so I'll need to check (I might already have partially fixed it).

Thanks for the feedback.

Regards,

Yvan

Re: CS_4.0.2 and syrthes4.3.0 coupling MPI ABORT

Posted: Tue Feb 09, 2016 6:05 pm
by jingless
Hello,

Is there any updated Three-2D-disks.pdf for the newer code_saturne versions?


thanks