Error in wallfunction with turbomachinery

sirlb
Posts: 34
Joined: Mon Mar 17, 2014 11:54 am

Error in wallfunction with turbomachinery

Post by sirlb »

Hello,

I have a strange failure at the start of the computation for some of my jobs.
The process is the following:

- I use the turbomachinery module
- I have many configurations with different axes and rotation speeds for the same mesh
- for each computation I use the cs_user_mesh source file to translate the mesh origin to the given rotation axis
- then I launch the computation with the translated mesh and the new rotation speed.

What happens is that for some configurations the computation crashes at the first time step, within the clptur wall-function calls. I even tested some speeds very close to values for which the computation succeeds, but it crashes too.

Is there anything I can do to get more information on the failure and try to solve this issue? Is there a problem with the rotation speed in the computation of the wall-function variables?

Thanks for any help.

Version of CS is 3.2.1 compiled with gcc (4.5.0)

The listing ends as follows:

   ** BOUNDARY MASS FLOW INFORMATION
      ------------------------------

---------------------------------------------------------------
Boundary type          Code    Nb faces           Mass flow
---------------------------------------------------------------
Inlet                     2        1652         0.000000000E+00
Smooth wall               5      162860         0.000000000E+00
Rough wall                6           0         0.000000000E+00
Symmetry                  4           0         0.000000000E+00
Free outlet               3         243         0.000000000E+00
Free inlet               13           0         0.000000000E+00
Undefined                 1           0         0.000000000E+00
---------------------------------------------------------------

SIGTERM signal (termination) received.
--> computation interrupted by environment.

Call stack:
   1: 0x36ab8302d0 ?                                (?)
   2: 0x36ab8cb2e6 <__poll+0x66>                    (libc.so.6)
   3: 0x2ba612aa5191 ?                                (?)
   4: 0x2ba612aa3ff3 ?                                (?)
   5: 0x2ba612a97fb1 <opal_progress+0xa1>             (libopen-pal.so.0)
   6: 0x2ba615164b95 ?                                (?)
   7: 0x2ba617081954 ?                                (?)
   8: 0x2ba6122f0fac <MPI_Allreduce+0x17c>            (libmpi.so.0)
   9: 0x2ba60fa84d3c <parmin_+0x4c>                   (libsaturne.so.0)
  10: 0x2ba60f9c8c53 <clptur_+0x1f03b>                (libsaturne.so.0)
  11: 0x2ba60f9e1882 <condli_+0x5c4e>                 (libsaturne.so.0)
  12: 0x2ba60fc39b4d <tridim_+0x9015>                 (libsaturne.so.0)
  13: 0x2ba60f989607 <caltri_+0x4a97>                 (libsaturne.so.0)
  14: 0x2ba60f954669 <cs_run+0x38d>                   (libsaturne.so.0)
  15: 0x2ba60f9548d0 <main+0x15d>                     (libsaturne.so.0)
  16: 0x36ab81d994 <__libc_start_main+0xf4>         (libc.so.6)
  17: 0x402cd9     <main+0x41>                      (cs_solver)
End of stack

I have the following output in the error file:

SIGTERM signal (termination) received.
--> computation interrupted by environment.

Call stack:
   1: 0x36ab8302d0 ?                                (?)
   2: 0x36ab8cb2e6 <__poll+0x66>                    (libc.so.6)
   3: 0x2ba612aa5191 ?                                (?)
   4: 0x2ba612aa3ff3 ?                                (?)
   5: 0x2ba612a97fb1 <opal_progress+0xa1>             (libopen-pal.so.0)
   6: 0x2ba615164b95 ?                                (?)
   7: 0x2ba617081954 ?                                (?)
   8: 0x2ba6122f0fac <MPI_Allreduce+0x17c>            (libmpi.so.0)
   9: 0x2ba60fa84d3c <parmin_+0x4c>                   (libsaturne.so.0)
  10: 0x2ba60f9c8c53 <clptur_+0x1f03b>                (libsaturne.so.0)
  11: 0x2ba60f9e1882 <condli_+0x5c4e>                 (libsaturne.so.0)
  12: 0x2ba60fc39b4d <tridim_+0x9015>                 (libsaturne.so.0)
  13: 0x2ba60f989607 <caltri_+0x4a97>                 (libsaturne.so.0)
  14: 0x2ba60f954669 <cs_run+0x38d>                   (libsaturne.so.0)
  15: 0x2ba60f9548d0 <main+0x15d>                     (libsaturne.so.0)
  16: 0x36ab81d994 <__libc_start_main+0xf4>         (libc.so.6)
  17: 0x402cd9     <main+0x41>                      (cs_solver)
End of stack

and the following output in a file named error_nxx:

Call stack:
   1: 0x36ab8302d0 ?                                (?)
   2: 0x36abc13d50 ?                                (?)
   3: 0x36abc2400c <sqrt+0x1c>                      (libm.so.6)
   4: 0x2b4120ca06ab ?                                (?)
   5: 0x2b4120ca0f93 <wallfunctions_+0x1fa>           (libsaturne.so.0)
   6: 0x2b4120b95bb4 <clptur_+0x4f9c>                 (libsaturne.so.0)
   7: 0x2b4120bc8882 <condli_+0x5c4e>                 (libsaturne.so.0)
   8: 0x2b4120e20b4d <tridim_+0x9015>                 (libsaturne.so.0)
   9: 0x2b4120b70607 <caltri_+0x4a97>                 (libsaturne.so.0)
  10: 0x2b4120b3b669 <cs_run+0x38d>                   (libsaturne.so.0)
  11: 0x2b4120b3b8d0 <main+0x15d>                     (libsaturne.so.0)
  12: 0x36ab81d994 <__libc_start_main+0xf4>         (libc.so.6)
  13: 0x402cd9     <main+0x41>                      (cs_solver)
End of stack
Yvan Fournier
Posts: 4208
Joined: Mon Feb 20, 2012 3:25 pm

Re: Error in wallfunction with turbomachinery

Post by Yvan Fournier »

Hello,

As the last error is in a "sqrt" function, I assume the crash is due to a slightly negative value for a variable which should always be positive, such as k (but I can't be sure which).

Normally, the code ensures those values are positive, but where the model alone is supposed to guarantee positivity, a value might become slightly negative due to truncation errors. It should be "clipped" in this case, so this is probably a minor bug (or at least a robustness issue).

Which turbulence models and options cause the crash ?

Do you have a small case reproducing the crash that you could post here (or, if it is somewhat confidential, send to the support e-mail)? Otherwise, if you install a second build of the code configured in debug mode, we'll have some more coherence checks, and line numbers in addition to the function names in the backtraces, which may help a bit (debug builds are usually 3 times slower than standard builds, so you don't want to use them for production, but they are very helpful when developing or debugging).

Regards,

Yvan
sirlb
Posts: 34
Joined: Mon Mar 17, 2014 11:54 am

Re: Error in wallfunction with turbomachinery

Post by sirlb »

Thank you Yvan for your reply and concern.

First of all, I apologize for reporting a problem without giving much information or the model files. The problem is that I cannot send the original model, and I am not sure I can reproduce the problem on a dummy case.

"Which turbulence models and options cause the crash ?"

The turbulence model is k-omega SST with two-scale wall functions. This is a steady case; all schemes are 2nd order (even for k and omega).

"Otherwise, if you install a second build of the code configured in debug mode, we'll have some more coherence checks, and line numbers in addition to the function names in the backtraces"

This seems odd to me because I did install a debug version, and the output in my first post came from a run with the debug version (running a normal version gives fewer traces in the output; in particular, the sqrt line is not present outside debug mode). I must have missed something, since I don't have any line number information in my output.

I did some tests by copying the clptur and cs_wall_functions files into my SRC directory and adding many write or bft_printf statements to print the arguments of each sqrt operation before the function is called.

So far I only see positive arguments to the sqrt calls. But (correct me if I am wrong) I have the impression that the printed output only covers the boundary faces handled by the process of rank 0. If this is the case, is there a way to get the output for all the boundary faces from all the processes?
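
One generic way to get output from every rank, sketched below under assumptions, is to have each process append to its own file. The function names and the file name pattern are hypothetical; the rank is assumed to be obtained elsewhere (for example from MPI_Comm_rank), and this is not code_saturne's own logging mechanism.

```c
#include <stdio.h>

/* Hypothetical helper: build a per-rank log file name.
   Returns 0 on success, -1 if the buffer is too small. */
int
rank_log_name(int rank, char *buf, size_t len)
{
  int n = snprintf(buf, len, "sqrt_debug_r%04d.log", rank);
  return (n >= 0 && (size_t)n < len) ? 0 : -1;
}

/* Hypothetical helper: append one diagnostic line to the calling rank's
   own file. The file is reopened and closed on every call so that the
   line is already on disk if the process crashes right afterwards. */
int
rank_log_value(int rank, long face_id, const char *name, double value)
{
  char path[64];
  if (rank_log_name(rank, path, sizeof path) != 0)
    return -1;

  FILE *f = fopen(path, "a");
  if (f == NULL)
    return -1;

  fprintf(f, "rank %d face %ld %s = %.17g\n", rank, face_id, name, value);
  return (fclose(f) == 0) ? 0 : -1;
}
```

Reopening the file on each call is slow, so this is only meant for a short debugging run around the failing time step.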
sirlb
Posts: 34
Joined: Mon Mar 17, 2014 11:54 am

Re: Error in wallfunction with turbomachinery

Post by sirlb »

I found how to print the listings of all ranks, and then I found some negative arguments to the sqrt at line 395 of cs_wall_functions:

*uk = sqrt( (1.-g) * cmu025 * cmu025 * kinetic_en
            + g * l_visc * vel / y);

It appears that for the face causing the problem, the value of y is negative.
If I understand correctly, y is the wall distance and should always be positive?

Is it normal to have negative values of y?
As I have already used the same mesh successfully for other rotation conditions (other reference axes), why would the wall distance be a problem now?
Yvan Fournier
Posts: 4208
Joined: Mon Feb 20, 2012 3:25 pm

Re: Error in wallfunction with turbomachinery

Post by Yvan Fournier »

Hello,

I have limited Internet access this week, so my answers will be short until next week...

I'll also encourage others to help you...

I am not sure how y+ is computed, as it is a dimensionless wall distance and combines distance and velocity...

With the k-omega model I believe the wall distance is computed for all cells (which may make the rotor-stator computation slower than with k-epsilon). If the mesh quality is bad, the computed distance might be negative. If this is the case, deactivating reconstruction for the computation of wall distance may help.

I'll let others explain how to do this, though there might be recent references to this in this forum or the best practice guidelines.

Regards,

Yvan
sirlb
Posts: 34
Joined: Mon Mar 17, 2014 11:54 am

Re: Error in wallfunction with turbomachinery

Post by sirlb »

Hello Yvan,

I have the feeling that this problem was caused by a problem of cell / face reorientation during the preprocessing phase (I have a warning about cells that could not be reoriented).
I still have to check, but I think I have negative wall distance values for the non-reoriented faces.
I will try to add some logging in the cs_wall_functions file and give you my conclusions.

To switch the wall distance computation method, I tried changing the icdpar variable in the cs_user_parameters file, but without success. Do you have any hint on how to test this method?

Thank you.
Yvan Fournier
Posts: 4208
Joined: Mon Feb 20, 2012 3:25 pm

Re: Error in wallfunction with turbomachinery

Post by Yvan Fournier »

Hello,

Did you really need cell reorientation during preprocessing ? Which mesher did you use ? Did you use symmetries, or have issues for specific element types ?

In most cases, if you have orientation warnings about only a few cells, it is due to badly warped (non-convex) elements, and it is better not to try reorienting them. Though if they are badly warped, they can cause quality or robustness issues later...

In any case, the orientation test can be "fooled" by badly warped cells, which is the reason we do not make the reorientation automatic.

Do you have any view of your mesh in the "suspect" regions ? Do the cells with issues have high aspect ratios ?

Regards,

Yvan