error using mesh translation

Posted: Thu Sep 03, 2015 2:57 pm
by sirlb
Hello,

I am seeing strange behavior with CS v3.2.1 when using a translation in cs_user_mesh.c.

I use:

Code: Select all

  /* Add same mesh with transformations */
  if (true) {
    const char *renames[] = {NULL, NULL};

    /* Last column is the translation: -5 in x, 1250 in z */
    const double transf_matrix[3][4] = {{1., 0., 0., -5.00},
                                        {0., 1., 0., 0.},
                                        {0., 0., 1., 1250.0}};

    /* No group renames are passed here (count 0, NULL array) */
    cs_preprocessor_data_add_file("mesh_input",
                                  0, NULL,
                                  transf_matrix);
  }
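
A minimal sketch of what the 3x4 matrix above does to a vertex, assuming the usual affine convention in which the first three columns hold the linear part and the last column holds the translation. The helper `apply_transf` is a hypothetical name used for illustration only; it is not a Code_Saturne function.

```c
#include <assert.h>

/* Hypothetical helper (not part of Code_Saturne): apply a 3x4 affine
   transformation to one point, assuming the last column is the translation:
   out[i] = m[i][0]*in[0] + m[i][1]*in[1] + m[i][2]*in[2] + m[i][3]. */
static void
apply_transf(const double m[3][4], const double in[3], double out[3])
{
  assert(in != out);  /* the input is read after out[0] is written */

  for (int i = 0; i < 3; i++)
    out[i] = m[i][0]*in[0] + m[i][1]*in[1] + m[i][2]*in[2] + m[i][3];
}
```

With the matrix from the post, the point (1, 2, 3) maps to (-4, 2, 1253).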


It then produces the following error when computing geometric quantities:

Code: Select all

   1: 0x36898302d0 ?                                (?)
   2: 0x36898cb2e6 <__poll+0x66>                    (libc.so.6)
   3: 0x2ac950ec8191 ?                                (?)
   4: 0x2ac950ec6ff3 ?                                (?)
   5: 0x2ac950ebafb1 <opal_progress+0xa1>             (libopen-pal.so.0)
   6: 0x2ac953587b95 ?                                (?)
   7: 0x2ac9554a4954 ?                                (?)
   8: 0x2ac950713fac <MPI_Allreduce+0x17c>            (libmpi.so.0)
   9: 0x2ac94e2d0651 <cs_mesh_quantities_compute+0x718> (libsaturne.so.0)
  10: 0x2ac94deb4433 <cs_preprocess_mesh+0x3de>       (libsaturne.so.0)
  11: 0x2ac94dd774b1 <cs_run+0x1d5>                   (libsaturne.so.0)
  12: 0x2ac94dd778d0 <main+0x15d>                     (libsaturne.so.0)
  13: 0x368981d994 <__libc_start_main+0xf4>         (libc.so.6)
  14: 0x402cd9     <main+0x41>                      (cs_solver)

I have run the following tests:

1 - run without translation: the computation starts and runs OK :D
2 - run with cs_user_mesh.c, with 0.0 values in the translation: OK :D
3 - increasing translation values in z: runs OK :D for low values, but fails :evil: for larger translation values (around 1200?).

Unfortunately, I cannot supply input files, but does anybody have a clue that would explain this strange error?

Re: error using mesh translation

Posted: Thu Sep 03, 2015 4:05 pm
by Yvan Fournier
Hello,

This is surprising. When you run with 0 or a small translation, are all mesh quantities (volume, ...) identical ?

Could you run in a debug build, to determine the lines where things fail when computing mesh quantities?

What are the x,y,z bounds of your initial mesh ?

I also recommend upgrading to version 4.0.

Regards,

Yvan

Re: error using mesh translation

Posted: Fri Sep 04, 2015 5:53 pm
by sirlb
Hello,

I tried translating the mesh in the mesher, then running without translation in CS.
I got the same failure.

So I guess it is a problem in the mesh, something like a loss of tolerance linked to the translation?

Re: error using mesh translation

Posted: Sun Sep 06, 2015 8:33 pm
by Yvan Fournier
Hello,

This is strange. Even if you cannot provide your mesh, running under a debug build to obtain the exact line where the crash occurs, and indicating the minimum/maximum x,y,z values might help.

Regards,

Yvan

Re: error using mesh translation

Posted: Mon Sep 07, 2015 8:48 am
by sirlb
Hello Yvan,

I have been trying to get a working debug version, without success so far.
I configured the build with the --enable-debug option, forced the compiler flags to include -g through environment variables, and checked that the -g option was present on the command lines during compilation.

But the only stack trace I obtained is:

Code: Select all

   1: 0x36898302d0 ?                                (?)
   2: 0x36898cb2e6 <__poll+0x66>                    (libc.so.6)
   3: 0x2ac950ec8191 ?                                (?)
   4: 0x2ac950ec6ff3 ?                                (?)
   5: 0x2ac950ebafb1 <opal_progress+0xa1>             (libopen-pal.so.0)
   6: 0x2ac953587b95 ?                                (?)
   7: 0x2ac9554a4954 ?                                (?)
   8: 0x2ac950713fac <MPI_Allreduce+0x17c>            (libmpi.so.0)
   9: 0x2ac94e2d0651 <cs_mesh_quantities_compute+0x718> (libsaturne.so.0)
  10: 0x2ac94deb4433 <cs_preprocess_mesh+0x3de>       (libsaturne.so.0)
  11: 0x2ac94dd774b1 <cs_run+0x1d5>                   (libsaturne.so.0)
  12: 0x2ac94dd778d0 <main+0x15d>                     (libsaturne.so.0)
  13: 0x368981d994 <__libc_start_main+0xf4>         (libc.so.6)
  14: 0x402cd9     <main+0x41>                      (cs_solver)

I guess I forgot some options when building or when launching the run.
I am not much of a compilation expert, so if you have some advice, I will try to correct this and provide more information.

Re: error using mesh translation

Posted: Mon Sep 07, 2015 6:15 pm
by Yvan Fournier
Hello,

Normally, adding "--enable-debug" should be enough. You do not need to add "-g" or modify the flags using "known" compilers (--enable-debug adds what is necessary).

What compiler are you using ? Did you install in a separate path ? In this case, are you sure you used the debug build ?

Regards,

Yvan

Re: error using mesh translation

Posted: Tue Sep 08, 2015 2:37 pm
by sirlb
Hello,
What compiler are you using ?
> GCC 4.5.0
Did you install in a separate path ?
> yes
In this case, are you sure you used the debug build ?
> from the summary file :

Code: Select all

========================================================
Start time       : Tuesday September 08 14:36:15 CEST 2015
========================================================
  Command        : /home/applis/code_saturne/3.2.1_db/gcc-4.5.0/bin/code_saturne run ...

I finally added some traces using bft_printf in cs_mesh_quantities.c.
The error seems to appear in the function _compute_face_quantities, when computing "centres of gravity, normals, and surfaces of interior faces".

It appears in the second loop ("Second loop on triangles of the face (for the barycentre)"), between lines 1094 and 1141.

The domain size is on the order of 100 m and the translation on the order of 1000 m, but I have small faces in the center of the domain.
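
A rough sketch of why a large translation combined with small faces can matter in double precision: the gap between neighbouring doubles grows with the magnitude of the coordinate, so a tiny edge recovered as a difference of two translated coordinates carries a larger absolute error. The function names and the 1e-9 edge length are assumptions chosen for illustration; nothing in the thread establishes that this is the actual cause of the failure.

```c
#include <assert.h>

/* Returns a power of two e such that x + e > x while x + e/2 rounds back
   to x. For x = 1.0 and x = 1250.0 this is exactly the gap to the next
   larger double. Illustrative helper, valid for 1 <= x < 1e15. */
static double
spacing_above(double x)
{
  assert(x >= 1.0 && x < 1.0e15);

  double e = 1.0;
  volatile double probe = x + 0.5 * e;  /* volatile: force double rounding */
  while (probe > x) {
    e *= 0.5;
    probe = x + 0.5 * e;
  }
  return e;
}

/* Edge length as recovered from two coordinates that have both been
   shifted by `shift`; the sum shift + edge is rounded to the double grid
   near `shift`, so the result differs slightly from `edge`. */
static double
edge_after_shift(double edge, double shift)
{
  volatile double a = shift;
  volatile double b = shift + edge;
  return b - a;
}
```

Here `spacing_above(1250.0)` is 1024 times `spacing_above(1.0)`, and `edge_after_shift(1.0e-9, 1250.0)` can differ from 1e-9 by up to half of that spacing. For faces of ordinary size the effect is negligible, so this would only matter for very small or nearly degenerate faces.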

I hope this helps; I will try changing my mesh to see if that fixes the problem.

Regards

Re: error using mesh translation

Posted: Tue Sep 08, 2015 5:58 pm
by Yvan Fournier
Hello,

If your mesh is not too big, can you also try running in serial (non parallel) mode ?

If you reproduce the error in this case, I can guide you through using the gdb debugger to get the exact values causing the crash (I assume we are multiplying huge values, or dividing by the surface of a quasi-zero-area face).

In parallel, it is also possible, but more complex to debug...
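
Independently of the debugger, a quasi-zero-surface face can be screened for with a small check like the following sketch. It is not the code of _compute_face_quantities; the function names and the relative tolerance are assumptions for illustration. It works on squared quantities so that no square root is needed.

```c
#include <assert.h>

/* Squared norm of the cross product (b - a) x (c - a), which equals four
   times the squared area of triangle (a, b, c). */
static double
tri_cross_norm2(const double a[3], const double b[3], const double c[3])
{
  const double u[3] = {b[0]-a[0], b[1]-a[1], b[2]-a[2]};
  const double v[3] = {c[0]-a[0], c[1]-a[1], c[2]-a[2]};
  const double n[3] = {u[1]*v[2] - u[2]*v[1],
                       u[2]*v[0] - u[0]*v[2],
                       u[0]*v[1] - u[1]*v[0]};
  return n[0]*n[0] + n[1]*n[1] + n[2]*n[2];
}

/* Returns 1 if the triangle is degenerate relative to its edge lengths,
   i.e. |u x v|^2 <= rel_tol^2 * |u|^2 * |v|^2, which also catches
   zero-length edges; returns 0 otherwise. */
static int
tri_is_degenerate(const double a[3], const double b[3], const double c[3],
                  double rel_tol)
{
  assert(rel_tol >= 0.0);

  double u2 = 0.0, v2 = 0.0;
  for (int i = 0; i < 3; i++) {
    u2 += (b[i]-a[i]) * (b[i]-a[i]);
    v2 += (c[i]-a[i]) * (c[i]-a[i]);
  }
  return tri_cross_norm2(a, b, c) <= rel_tol*rel_tol * u2 * v2;
}
```

Calling such a check on each triangle of a face before dividing by its surface would indicate whether the degenerate-face hypothesis applies to a given mesh.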

Regards,

Yvan