SIGSEGV signal writing results in extra operations

antoineb
Posts: 26
Joined: Mon Sep 16, 2019 4:06 pm

SIGSEGV signal writing results in extra operations

Post by antoineb »

Hi there,

After a successful installation of Code_saturne 6.0 a few weeks ago, I went at it again and downloaded the latest version (September 26th), but on the first run of a previously working case I got the following error:

SIGSEGV signal (forbidden memory area access) intercepted!

Call stack:
   1: 0x7fe6ae13b924 <fclose+0x4>                     (libc.so.6)
   2: 0x7fe6afd0b0b1 <cs_run+0x421>                   (libcs_solver-6.0.so)
   3: 0x7fe6afd0ab8d <main+0x19d>                     (libcs_solver-6.0.so)
   4: 0x7fe6ae0e0b6b <__libc_start_main+0xeb>         (libc.so.6)
   5: 0x555ff7e1f0ea <_start+0x2a>                    (cs_solver)
End of stack

I looked at the listing, and the calculation itself appears to finish without issue: all the time steps are there, followed by the final banner:

===============================================================



                 FINAL STAGE OF THE CALCULATION              
                 ==============================              


 =========================================================== 

It seems to fail when saving the calculation.

Any idea where this might come from? I tried running it again on my other computer, with a previous version of CS 6.0, and it worked fine. Any chance you have a link to the previous version (which I did not save)?

Best regards,

Antoine
Last edited by antoineb on Wed Oct 09, 2019 12:58 pm, edited 1 time in total.
antoineb
Posts: 26
Joined: Mon Sep 16, 2019 4:06 pm

Re: SIGSEGV signal writing results in extra operations

Post by antoineb »

OK, I found the problem, but I can't seem to solve it.
It's related to the extra operations performed after the calculation finishes, in cs_user_extra_operations.c.

The segmentation fault comes from fclose(), which seems to be called with a NULL file pointer.
What am I doing wrong?

Here is my cs_user_extra_operations.c:

/*----------------------------------------------------------------------------*/

BEGIN_C_DECLS

/*=============================================================================
 * Local Macro definitions and structure definitions
 *============================================================================*/

static FILE *f1 = NULL;

/*============================================================================
 * User function definitions
 *============================================================================*/
/*----------------------------------------------------------------------------*/

void
cs_user_extra_operations_initialize(cs_domain_t     *domain)
{
  const cs_mesh_t *m = domain->mesh;
  const cs_mesh_quantities_t *mq = domain->mesh_quantities;
  const int n_cells = m->n_cells;

  if (cs_glob_rank_id < 1) { /* Only the first rank writes to the file */
    f1 = fopen("velocity_roof.dat","a");
    fprintf(f1,"# N, Time, Mean_vel\n");
  }

}

/*----------------------------------------------------------------------------*/

void
cs_user_extra_operations(cs_domain_t     *domain)
{

  /* Local variables */
  int nt_cur = domain->time_step->nt_cur;

  const cs_mesh_t *m = domain->mesh;
  const cs_mesh_quantities_t *mq = domain->mesh_quantities;

  const cs_real_t *cell_vol = mq->cell_vol;

  const cs_zone_t *z = cs_volume_zone_by_name("roof");

  /* Get physical fields */
  const cs_real_3_t *vel = (cs_real_3_t *)CS_F_(vel)->val;

  cs_real_t norm_vel = 0.;

  /* Loop over elts of the zone */
  for (cs_lnum_t elt_id = 0; elt_id < z->n_elts; elt_id++) {
    const cs_lnum_t cell_id = z->elt_ids[elt_id];
    norm_vel += cs_math_3_norm(vel[cell_id]) * cell_vol[cell_id];
  }


  /* Sum of values on all ranks (parallel calculations) */

  cs_parall_sum(1, CS_DOUBLE, &norm_vel);

  norm_vel /= z->measure;

  /*  Write
    ========*/

  if (cs_glob_rank_id < 1) { /* Only the first rank writes to the file */
    fprintf(f1,"%10d %12.8e %12.8e\n",
        domain->time_step->nt_cur,
        domain->time_step->t_cur,
        norm_vel);
  }

}

/*----------------------------------------------------------------------------*/

void
cs_user_extra_operations_finalize(cs_domain_t     *domain)
{
  if (cs_glob_rank_id < 1) { /* Only the first rank opened the file */
    fclose(f1);
  }
}


END_C_DECLS

When I comment out all the writing-to-file parts, there is no error.

Does this sound familiar?

Best regards,

Antoine
antoineb
Posts: 26
Joined: Mon Sep 16, 2019 4:06 pm

Re: SIGSEGV signal writing results in extra operations

Post by antoineb »

To be more specific: with the help of someone much more experienced with code_saturne, I got to the point where cs_user_extra_operations() and cs_user_extra_operations_initialize() don't seem to be called at all during the calculation.

That would explain the segmentation fault: the file was never opened, so the pointer was still NULL when fclose() was called.
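
Independently of why the initialization is skipped, the crash itself can be avoided by never passing a NULL pointer to fprintf() or fclose(). Below is a minimal standalone sketch of that guard. The helper names (log_open, log_write, log_close) are hypothetical and not part of code_saturne; in a real case their bodies would go inside the three user functions shown above.

```c
#include <stdio.h>

/* Hypothetical helpers (not code_saturne API): a log handle that
   tolerates the "initialize" step never having been called. */

static FILE *log_file = NULL;

/* Open the log in append mode and write the header.
   Returns 0 on success (or if already open), -1 on failure. */
int
log_open(const char *path)
{
  if (log_file != NULL)
    return 0;

  log_file = fopen(path, "a");
  if (log_file == NULL) {
    perror(path);
    return -1;
  }

  fprintf(log_file, "# N, Time, Mean_vel\n");
  return 0;
}

/* Write one record. Returns -1 without touching the file
   if the log was never opened. */
int
log_write(int nt_cur, double t_cur, double mean_vel)
{
  if (log_file == NULL)
    return -1;

  fprintf(log_file, "%10d %12.8e %12.8e\n", nt_cur, t_cur, mean_vel);
  return 0;
}

/* Close the log if it is open. Safe to call when it is not,
   and safe to call twice. */
int
log_close(void)
{
  if (log_file == NULL)
    return -1;

  fclose(log_file);
  log_file = NULL;
  return 0;
}
```

With this guard the run would end cleanly instead of crashing, but the output file would still be missing, so the return values are worth checking or logging: they are the only sign that the initialization step did not run.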

Best regards,

Antoine
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: SIGSEGV signal writing results in extra operations

Post by Yvan Fournier »

Hello,

This is strange. cs_user_extra_operations_finalize is called directly by C code, while the other two cs_user_extra_operations_* functions may be called through Fortran, so I suspect a subtle installation (or reinstallation) issue.

Did you simply overwrite your previous installation, or did you start from a new build directory? Did anything else change on your system (e.g. an upgrade) during that time? What system are you running on?

Best regards,

Yvan
antoineb
Posts: 26
Joined: Mon Sep 16, 2019 4:06 pm

Re: SIGSEGV signal writing results in extra operations

Post by antoineb »

Hello Yvan,

I'm running on an Ubuntu 19.04 VM. The Fortran compiler is f95 and the C compiler is cc.

Martin Ferrand told me that user_extra_operations_initialize() is indeed called from inivar.f90. He advised me to make a copy of that file, insert a print statement, and recompile with --enable-debug to see whether the subroutine is called during the computation.

Anyway, I think I'm going to do a clean install to see if the problem still happens.
I'll let you know!
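
The same kind of check can be made on the C side. The sketch below is a standalone illustration, not code_saturne code: trace_hook() and trace_count() are hypothetical helpers that count calls per hook and print a line to stderr. In a real case, one would call trace_hook() on the first line of each of the three user functions.

```c
#include <stdio.h>

/* Hypothetical tracing helper (not code_saturne API): counts how many
   times each user hook is entered and reports every call on stderr. */

enum {
  HOOK_INITIALIZE,
  HOOK_EXTRA_OPERATIONS,
  HOOK_FINALIZE,
  N_HOOKS
};

static int hook_calls[N_HOOKS] = {0, 0, 0};

static const char *hook_names[N_HOOKS] = {
  "cs_user_extra_operations_initialize",
  "cs_user_extra_operations",
  "cs_user_extra_operations_finalize"
};

/* Record one call to the given hook; invalid ids are ignored. */
void
trace_hook(int hook_id)
{
  if (hook_id < 0 || hook_id >= N_HOOKS)
    return;

  hook_calls[hook_id] += 1;

  /* stderr is flushed explicitly so the message survives a later crash. */
  fprintf(stderr, "[trace] %s: call %d\n",
          hook_names[hook_id], hook_calls[hook_id]);
  fflush(stderr);
}

/* Number of recorded calls for a hook, or -1 for an invalid id. */
int
trace_count(int hook_id)
{
  if (hook_id < 0 || hook_id >= N_HOOKS)
    return -1;

  return hook_calls[hook_id];
}
```

If only the finalize line shows up in the output, the user versions of the other two functions are not being reached, which is different from the file simply failing to open.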

Thanks,

Antoine
antoineb
Posts: 26
Joined: Mon Sep 16, 2019 4:06 pm

Re: SIGSEGV signal writing results in extra operations

Post by antoineb »

Hello again,

After two clean installs (with f95 and then gfortran as the Fortran compiler) I still have issues.

I inserted a simple: write(nfecra,*) "here we are"
just before the call to user_extra_operations_initialize() in inivar.f90, and I do get the "here we are" line in the listing.

With this basic cs_user_extra_operations.c, only the finalize() function gets called...
Is my syntax OK?

Is it possible that the C bindings aren't working?

One more detail: I'm building without the GUI.

Best regards,

Antoine

Edit: it works well on a locally installed Ubuntu 19.04. I'm guessing the issue is with the cloud VM, but I see no difference between the two configurations (same gcc, same Python, same version of Ubuntu, same code_saturne, same setup file).
Attachments
cs_user_extra_operations.c
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: SIGSEGV signal writing results in extra operations

Post by Yvan Fournier »

Hello,

Since you can install the code, a good test in my opinion would be to modify the cs_user_extra_operations.c in the main sources, then reinstall and see how it behaves. This would show whether the call site is never reached for some reason, or whether the default user function is not being replaced by the one from the case you are running (we have already seen strange things on Ubuntu, but so far only with the packaged install).

Best regards,

Yvan
antoineb
Posts: 26
Joined: Mon Sep 16, 2019 4:06 pm

Re: SIGSEGV signal writing results in extra operations

Post by antoineb »

I finally got it working on a Google Cloud VM, but on Debian 9.11 instead of Ubuntu 19.04...
I did everything exactly the same way, but all the subroutines and post-calculation operations work now!

Best regards,

Antoine