computation crash with polyhedra mesh

sirlb
Posts: 34
Joined: Mon Mar 17, 2014 11:54 am

Post by sirlb »

Good Morning to all,

I am running into a problem while trying to run a computation with a polyhedral mesh.
It seems to segfault in a parcmx call within typecl.

I tried with and without mesh reorientation, but it always gives the same error.

Some parts of the preprocessor log and the listing error message are shown below.

The setup is the same as for a successful computation of the same model with a trimmed-cell mesh.

Can anybody point me toward where the problem is?

Thank you.


Code: Select all

Defining families
-----------------


  Element orientation check.


Warning
=======
350 elements of type quad4 had to be re-oriented

Warning
=======
114 elements of type quad4 were impossible to re-orient

End of conversion to descending connectivity
--------------------------------------------
  Theoretical mesh size:                 2.407 Gb
  Theoretical current memory:            3.901 Gb
  Theoretical peak memory:              10.662 Gb
  Total memory used:                    10.749 Gb

Warning
=======
There is/are 464 isolated face(s)


Main mesh properties
--------------------

  Number of cells:                              11832944
  Number of internal faces:                     68944276
  Number of boundary faces:                       472823
  Number of vertices:                           53287717


Definition of face and cell families
------------------------------------

  Family 1
         Group "1"
  Number of cells          : 11832944
  Family 2
         Group "0"
  Number of internal faces : 68943812
  Number of isolated faces :      464
  Family 3
         Group "1"
  Number of boundary faces :     2113
  Family 4
         Group "10"
  Number of boundary faces :     4231
  Family 5
         Group "11"
  Number of boundary faces :    65774
  Family 6
         Group "12"
  Number of boundary faces :     2371
  Family 7
         Group "13"
  Number of boundary faces :     2375
  Family 8
         Group "14"
  Number of boundary faces :    56529
  Family 9
         Group "15"
  Number of boundary faces :    56704
  Family 10
         Group "16"
  Number of boundary faces :     5669
  Family 11
         Group "17"
  Number of boundary faces :     3052
  Family 12
         Group "18"
  Number of boundary faces :     3168
  Family 13
         Group "19"
  Number of boundary faces :    22445
  Family 14
         Group "2"
  Number of boundary faces :     2113
  Family 15
         Group "3"
  Number of boundary faces :     6125
  Family 16
         Group "4"
  Number of boundary faces :    32059
  Family 17
         Group "5"
  Number of boundary faces :    11596
  Family 18
         Group "6"
  Number of boundary faces :     7618
  Family 19
         Group "7"
  Number of boundary faces :    38872
  Family 20
         Group "8"
  Number of boundary faces :    38911
  Family 21
         Group "9"
  Number of boundary faces :   111098
  Family 21
         Default family
         (no group)
  Number of internal faces :      464





  Criteria 1: Orthogonality:
    Number of bad cells detected: 1800 -->   0 %

  Criteria 2: Offset:
    Number of bad cells detected: 0 -->   0 %

  Criteria 3: Least-Squares Gradient Quality:
    Number of bad cells detected: 43609 -->   0 %

  Criteria 4: Cell Volume Ratio:
    Number of bad cells detected: 252 -->   0 %

  Criteria 5: Guilt by Association:
    Number of bad cells detected: 725 -->   0 %
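
Note on reading the quality summary: the percentages are rounded to whole numbers, so "0 %" can hide thousands of flagged cells on a mesh of this size. A minimal sketch that recomputes the shares from the counts in the log above (the criterion numbers are the ones printed in the log; the function name is only illustrative):

```python
# Counts copied from the mesh quality log above.
N_CELLS = 11832944
BAD_CELLS = {1: 1800, 2: 0, 3: 43609, 4: 252, 5: 725}  # criterion -> flagged cells


def bad_cell_percentages(bad_cells, n_cells):
    """Return the share of flagged cells per criterion, in percent."""
    return {crit: 100.0 * count / n_cells for crit, count in bad_cells.items()}


if __name__ == "__main__":
    for crit, pct in bad_cell_percentages(BAD_CELLS, N_CELLS).items():
        print(f"criterion {crit}: {BAD_CELLS[crit]:6d} cells  {pct:8.4f} %")
```

Every criterion stays well below 1 %, which is why the log prints 0 %, but the flagged cells are still there and can be located with the quality criteria visualization.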

Code: Select all

SIGTERM signal (termination) received.
--> computation interrupted by environment.

Call stack:
   1: 0x3b42acb2e6 <__poll+0x66>                    (libc.so.6)
   2: 0x2af0af024191 ?                                (?)
   3: 0x2af0af022ff3 ?                                (?)
   4: 0x2af0af016fb1 <opal_progress+0xa1>             (libopen-pal.so.0)
   5: 0x2af0b16e3b95 ?                                (?)
   6: 0x2af0b3600954 ?                                (?)
   7: 0x2af0ae86ffac <MPI_Allreduce+0x17c>            (libmpi.so.0)
   8: 0x2af0acbd30ff <parcmx_+0x2f>                   (libsaturne.so.0)
   9: 0x2af0acc97207 <typecl_+0x4947>                 (libsaturne.so.0)
  10: 0x2af0acb8c50a <condli_+0x4ca>                  (libsaturne.so.0)
  11: 0x2af0acc8f3ed <tridim_+0x36d5>                 (libsaturne.so.0)
  12: 0x2af0acb7ac39 <caltri_+0x33e9>                 (libsaturne.so.0)
  13: 0x2af0acb521a5 <cs_run+0x405>                   (libsaturne.so.0)
  14: 0x2af0acb52325 <main+0x155>                     (libsaturne.so.0)
  15: 0x3b42a1d994 <__libc_start_main+0xf4>         (libc.so.6)
  16: 0x402a49     <main+0x39>                      (cs_solver)
End of stack
Yvan Fournier
Posts: 4208
Joined: Mon Feb 20, 2012 3:25 pm

Re: computation crash with polyhedra mesh

Post by Yvan Fournier »

Hello,

If you have any error_* files, they are more interesting than the global error file.

Do you have any message about incorrect boundary faces in the "listing" ?

Your mesh has isolated faces, so I assume some face boundary condition issues are due to this.

What tool was the mesh built with? You probably have badly warped cells, rather than orientation errors, so you should deactivate reorientation. If this leads to a different error, post that one.

Regards,

Yvan
sirlb
Posts: 34
Joined: Mon Mar 17, 2014 11:54 am

Re: computation crash with polyhedra mesh

Post by sirlb »

Thank you Yvan for your help,

The error_* files don't seem to provide more information; they all contain:

Code: Select all

SIGFPE signal (floating point exception) intercepted!
Call stack:
   1: 0x33c22302d0 ?                                (?)
   2: 0x2adf98942d0e <typecl_+0xa342>                 (libsaturne.so.0)
   3: 0x2adf986d3622 <condli_+0x9ee>                  (libsaturne.so.0)
   4: 0x2adf98930b4d <tridim_+0x9015>                 (libsaturne.so.0)
   5: 0x2adf98680607 <caltri_+0x4a97>                 (libsaturne.so.0)
   6: 0x2adf9864b669 <cs_run+0x38d>                   (libsaturne.so.0)
   7: 0x2adf9864b8d0 <main+0x15d>                     (libsaturne.so.0)
   8: 0x33c221d994 <__libc_start_main+0xf4>         (libc.so.6)
   9: 0x402b49     <main+0x39>                      (cs_solver)
End of stack

The mesh was made in STAR-CCM+ and imported in CCM format.
The code_saturne version is 3.2.1.

My first run was without reorientation, but it gave the same error output.

I guess I have to dig into the mesh generation again to try to get rid of the isolated faces.
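
Note: with many ranks, comparing the per-rank error files by hand is tedious. A minimal sketch of a script that extracts the signal name and the first named stack frame from each file, assuming the files sit in the run directory, have names starting with `error`, and follow the trace layout pasted above (the function names are illustrative, not part of code_saturne):

```python
import re
from pathlib import Path

# Matches lines such as "SIGFPE signal (floating point exception) intercepted!"
SIGNAL_RE = re.compile(r"^(SIG[A-Z]+) signal")
# Matches frames such as "   2: 0x2adf98942d0e <typecl_+0xa342>   (libsaturne.so.0)"
FRAME_RE = re.compile(r"^\s*\d+:\s+0x[0-9a-f]+\s+<([^>+]+)\+0x[0-9a-f]+>")


def summarize_error_text(text):
    """Return (signal name, first named stack frame) found in one error file."""
    signal, frame = None, None
    for line in text.splitlines():
        if signal is None:
            m = SIGNAL_RE.match(line)
            if m:
                signal = m.group(1)
        if frame is None:
            m = FRAME_RE.match(line)
            if m:
                frame = m.group(1)
    return signal, frame


def summarize_run_dir(run_dir):
    """Map each error file name in run_dir to its (signal, first frame) pair."""
    return {
        p.name: summarize_error_text(p.read_text(errors="replace"))
        for p in sorted(Path(run_dir).glob("error*"))
    }
```

A rank that reports SIGTERM inside MPI_Allreduce was most likely just waiting for the others; the ranks reporting SIGFPE are the ones worth reading closely.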
Yvan Fournier
Posts: 4208
Joined: Mon Feb 20, 2012 3:25 pm

Re: computation crash with polyhedra mesh

Post by Yvan Fournier »

Hello,

Did you visualize the mesh quality criteria (without any reorientation)?

You should not need to reorient Star-CCM+ meshes (at least, I do not remember having seen a case where such a mesh was badly oriented).

Also, do you have a "debug" install of the code? (See the installation manual if you do not know what this means.) If you do, or can install one, the error message will also provide the line where the SIGFPE occurs, which can help me determine the cause of the error.

Regards,

Yvan
sirlb
Posts: 34
Joined: Mon Mar 17, 2014 11:54 am

Re: computation crash with polyhedra mesh

Post by sirlb »

Good evening Yvan,

Here are the results of my investigations:

1 - I improved my mesh: I have no more isolated faces and no warning about non-orthogonality, but I still have an orientation / warping warning for 4 faces (I do not use reorientation).

2 - The debug version does not add extra information (either it was not compiled correctly, or I don't know how to use it).

3 - I inserted some traces in the source file typecl.f90 and finally observed that:
the simulation blows up at line 1136 (file for version 3.2.1), at the parcmx call;
if I comment out the "if" block in this part of the code, the simulation goes further but crashes, probably at line 1299, at the parcpt call.

4 - I thought it was a partitioning issue and changed the number of processes: it still crashes.

5 - I changed all boundary conditions to wall, and then it runs!

I will try to find some time to go further in the analysis, certainly looking at my inlets, but if you have any clues please let me know.

Regards.
Yvan Fournier
Posts: 4208
Joined: Mon Feb 20, 2012 3:25 pm

Re: computation crash with polyhedra mesh

Post by Yvan Fournier »

Hello,

1) OK, you still have a few highly warped faces, but not as many as before (you can use the quality criteria visualization to see where).

2) Either your build is not really a debug build, or the backtrace does not provide any additional information; this may depend on your OS and compiler.

3) The error seems to involve a counter, which is strange, as it is not expected to reach values that could provoke a floating-point exception. With a small mesh, running a debug build under Valgrind would certainly help; otherwise, printing "iok" on all ranks and logging output for all ranks (advanced section of "Calculation Management/prepare batch calculation") may help, though we'll need several iterations.

4) I guess then that the crash might be caused by a single error, possibly on a single process. This probably means "iok" takes a bad value for a single type of boundary condition error (probably one that does not occur much in our computations, since it has not shown up earlier).

5) This is consistent with 3) and 4). We just need to find which boundary condition error provokes this, so adding your initial boundary conditions back one at a time may definitely help here.
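
Note: the "one at a time" procedure from point 5 can be organized as a simple loop. This is only a sketch of the bookkeeping: `run_ok` is a hypothetical callable that would set up and launch a short run with the given boundary conditions and report whether it survived; nothing here calls code_saturne itself, and the zone names and condition types are placeholders.

```python
def find_first_failing_bc(zones, original_bcs, run_ok):
    """Restore original boundary conditions one zone at a time.

    zones        : list of boundary zone names
    original_bcs : dict mapping zone name -> original condition type
    run_ok       : hypothetical callable(dict) -> bool, True if the run survives
    Returns the first zone whose restored condition makes the run fail, or None.
    """
    # Start from the configuration known to run: every zone set to wall.
    current = {zone: "wall" for zone in zones}
    for zone in zones:
        if original_bcs[zone] == "wall":
            continue  # nothing to restore for this zone
        current[zone] = original_bcs[zone]
        if not run_ok(dict(current)):  # pass a copy so the caller cannot alter it
            return zone
    return None
```

If the crash needs a combination of conditions, the zone returned is the one that completes the failing combination, so it is worth rerunning with only that zone restored to confirm.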

Regards,

Yvan
sirlb
Posts: 34
Joined: Mon Mar 17, 2014 11:54 am

Re: computation crash with polyhedra mesh

Post by sirlb »

Hello Yvan,

My apologies, I was completely mistaken.
I guess I was looking at the stack trace of the master rank and so missed the line of code that was actually in trouble.

I logged the output of every process and the error is now much more relevant.

My run crashes because of the reference pressure setting (I think it is at line 811 of typecl.f90).
For information, it is an incompressible flow without any temperature / energy / gravity.

I then traced the total pressure values at each cell:

Code: Select all

trace bp2ter
     propce iiptot:    0.10132E+06
trace bp2ter
     propce iiptot:    0.10132E+06
trace bp2ter
     propce iiptot:    0.10132E+06
trace bp2ter
     propce iiptot:    0.10132E+06
trace bp2ter
     propce iiptot:    0.60471-309  !!!!
trace bp2ter
     propce iiptot:    0.31467-308  !!!!
trace bp2ter
     propce iiptot:            NaN      !!!!
trace bp2ter
     propce iiptot:            NaN      !!!!

Why would the pressure have these strange values in many cells? Is it linked to the mesh quality? But my polyhedral mesh has better quality than the trimmed-cell mesh, which runs without problems...

Now I am trying to figure out where these strange values come from...
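
Note on reading the trace: a value printed as `0.60471-309` is Fortran's way of writing a three-digit exponent when the field has no room for the `E`, so it means 0.60471E-309, a subnormal number. A minimal sketch for parsing such tokens and flagging suspicious values (the helper names are illustrative):

```python
import math
import re
import sys

# Fortran drops the exponent letter when the exponent needs three digits,
# e.g. "0.60471-309" stands for 0.60471E-309.
_MISSING_E = re.compile(r"^([+-]?\d*\.?\d+)([+-]\d{3})$")


def parse_fortran_real(token):
    """Convert a Fortran-formatted real (possibly without 'E') to a float."""
    token = token.strip()
    m = _MISSING_E.match(token)
    if m:
        token = f"{m.group(1)}E{m.group(2)}"
    # Fortran may also use 'D' as the exponent letter for double precision.
    return float(token.replace("D", "E").replace("d", "e"))


def classify(value):
    """Label a float as 'nan', 'inf', 'subnormal' or 'normal'."""
    if math.isnan(value):
        return "nan"
    if math.isinf(value):
        return "inf"
    if value != 0.0 and abs(value) < sys.float_info.min:
        return "subnormal"
    return "normal"
```

Subnormal values and NaN appearing next to a sensible 0.10132E+06 are often a sign of memory that was never assigned rather than of a computed pressure, but that is only a hypothesis to check against the code path involved.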
Yvan Fournier
Posts: 4208
Joined: Mon Feb 20, 2012 3:25 pm

Re: computation crash with polyhedra mesh

Post by Yvan Fournier »

Hello,

Could you also post all the setup files listed in the forum usage recommendations (except the mesh)?

I suspect the issue is not related to the polyhedral mesh, but to a bug in handling of some boundary conditions.

Did you use similar conditions successfully on a non-polyhedral mesh ?

Are you using the compressible model ?

Regards,

Yvan
sirlb
Posts: 34
Joined: Mon Mar 17, 2014 11:54 am

Re: computation crash with polyhedra mesh

Post by sirlb »

Sorry for the late reply; I am not able to provide files for now since I am not at my desk.
What I can say is:

1 - similar settings are used successfully with another mesh (trimmed-cell mesh with hex-dominant cells, a few polyhedra and a prism layer)
2 - I do not use the compressible model

I will try to replace the confidential geometry with a simple shape, and if the bug is reproducible, I will send you all the case data. Hopefully I can do it within the next week.

Regards
Yvan Fournier
Posts: 4208
Joined: Mon Feb 20, 2012 3:25 pm

Re: computation crash with polyhedra mesh

Post by Yvan Fournier »

Hello,

Building a new mesh might not reproduce the issue...

The warped faces might be the main cause of the issue. Could you try splitting highly warped faces (for example those with an angle of more than 10°) ?

This may make things better or worse, but is worth trying...
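
Note: to get a feeling for what a 10° threshold means, here is a minimal sketch that measures the warp of a quadrilateral face as the angle between the normals of its two triangles. This definition is an assumption made for illustration; the criterion code_saturne actually applies when splitting warped faces may be defined differently, and polygonal faces with more vertices would need a more general measure.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)


def quad_warp_angle_deg(p0, p1, p2, p3):
    """Angle in degrees between the normals of triangles (p0,p1,p2) and (p0,p2,p3)."""
    n1 = _cross(_sub(p1, p0), _sub(p2, p0))
    n2 = _cross(_sub(p2, p0), _sub(p3, p0))
    denom = _norm(n1) * _norm(n2)
    if denom == 0.0:
        raise ValueError("degenerate face: a triangle has zero area")
    cos_angle = sum(a * b for a, b in zip(n1, n2)) / denom
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


def is_highly_warped(p0, p1, p2, p3, max_angle_deg=10.0):
    """True if the face exceeds the warp threshold (10 degrees by default)."""
    return quad_warp_angle_deg(p0, p1, p2, p3) > max_angle_deg
```

A planar face gives 0°, while lifting one corner of a unit square by its edge length gives roughly 54.7°, far above the threshold.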

Regards,

Yvan