Parallel computation of loops

Questions and remarks about code_saturne usage
Mohammad
Posts: 114
Joined: Thu Oct 25, 2018 12:18 pm

Parallel computation of loops

Post by Mohammad »

Hello,

I have the following loop in cs_user_extra_operations.c file:

Code: Select all

  for (cs_lnum_t i = 0; i < n_faces; i++) {
    face_id = face_list[i];
    iel = b_face_cells[face_id];
    Tau_Wall_Mean += Tau_wall[iel];
  }

  FILE *f1 = fopen("MEAN_SHEAR.dat", "a");
  fprintf(f1, "%i\t%f\n", ntcabs, Tau_Wall_Mean/n_faces);
  fclose(f1);
This code accumulates a variable (Tau_Wall_Mean) in a loop and then writes it to a file (MEAN_SHEAR.dat). I run the case on 8 cores.

When I open the exported file, it shows 8 different values at each time step number, which means that each core computes (maybe only a part of) the loop separately and writes its own value; the code does not collect the results from the cores.

How can I force the loop to collect the results from all cores and give me just one number?

Does this problem occur only for loops in which a variable is accumulated (summed with its previous value)?

CS version: 5.0.9

Regards,
Mohammad
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Parallel computation of loops

Post by Yvan Fournier »

Hello,

Are you using MPI or OpenMP parallelism?

There are user examples handling parallelism, at least for MPI, so check the cs_user_extra_operations variants.

Regards,

Yvan
Mohammad
Posts: 114
Joined: Thu Oct 25, 2018 12:18 pm

Re: Parallel computation of loops

Post by Mohammad »

Hello and thank you!

I use MPI.
I checked all those files; it's a bit confusing. Some of them use the following condition to write only one output at each time step, so I used it and it worked:

Code: Select all

if (cs_glob_rank_id <= 0)
I don't know what cs_glob_rank_id is, and it doesn't have a definition in the Doxygen documentation either.
There's just a comment above one of the examples which says:
/* Only process of rank 0 (parallel) or -1 (scalar) writes to this file. */
Also, some examples use the following call at the end of a summing loop, which seems to sum the values over all processors, so I added it to my code:

Code: Select all

cs_parall_sum(1, CS_FLOAT, &Tau_Wall_Mean);
When I use this code, I get different output averages for different numbers of cores! For example, with 8 cores the average of my outputs becomes 0.8, with 4 cores it becomes 0.2, and on a single core it is -0.0002!

It's really confusing!

My modified code is now:

Code: Select all

  for (cs_lnum_t i = 0; i < n_faces; i++) {
    face_id = face_list[i];
    iel = b_face_cells[face_id];
    Tau_Wall_Mean += Tau_wall[iel];
  }

  cs_parall_sum(1, CS_FLOAT, &Tau_Wall_Mean);

  if (cs_glob_rank_id <= 0) {
    FILE *f1 = fopen("MEAN_SHEAR.dat", "a");
    fprintf(f1, "%i\t%f\n", ntcabs, Tau_Wall_Mean/n_faces);
    fclose(f1);
  }
Regards,

Mohammad
Luciano Garelli
Posts: 280
Joined: Fri Dec 04, 2015 1:42 pm

Re: Parallel computation of loops

Post by Luciano Garelli »

Hello,

cs_glob_rank_id gives you the rank of the MPI process in a parallel run, taking values 0 <= cs_glob_rank_id < number of processes. In a serial run cs_glob_rank_id = -1, so the test ensures that only the process of rank 0 (parallel) or -1 (serial) writes to the file.
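
In other words (a minimal sketch, using cs_glob_n_ranks, the global number of MPI ranks, alongside cs_glob_rank_id, and bft_printf for the output):

Code: Select all

  /* serial run:    cs_glob_rank_id == -1 (cs_glob_n_ranks == 1)
     parallel run:  cs_glob_rank_id is in [0, cs_glob_n_ranks - 1],
     so the test below is true exactly once, whatever the number of ranks */

  if (cs_glob_rank_id <= 0)
    bft_printf("writing rank: id %d of %d rank(s)\n",
               cs_glob_rank_id, cs_glob_n_ranks);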


In your code, n_faces gives you the local number of faces on each rank, so if you need an average over all the faces you also have to do a parallel sum of n_faces to get the total number of faces, and divide Tau_Wall_Mean by that total after the loop, before writing. If you only need the sum Tau_Wall_Mean, just write it without dividing by n_faces.
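
Applied to your snippet, the corrected logic would look roughly like the sketch below (a sketch only, assuming Tau_Wall_Mean is a cs_real_t, so CS_DOUBLE is the matching datatype for cs_parall_sum; cs_parall_counter is used to sum the local face counts into a global cs_gnum_t, and cs_glob_time_step->nt_cur stands in for ntcabs as the current time step number):

Code: Select all

  cs_real_t tau_wall_sum = 0.;

  /* local accumulation over this rank's selected boundary faces */
  for (cs_lnum_t i = 0; i < n_faces; i++) {
    cs_lnum_t face_id = face_list[i];
    cs_lnum_t iel = b_face_cells[face_id];
    tau_wall_sum += Tau_wall[iel];
  }

  /* global sum of the accumulator over all MPI ranks */
  cs_parall_sum(1, CS_DOUBLE, &tau_wall_sum);

  /* global number of selected faces (n_faces is only the local count) */
  cs_gnum_t n_faces_tot = (cs_gnum_t)n_faces;
  cs_parall_counter(&n_faces_tot, 1);

  /* single write of the global average, by rank 0 (parallel) or -1 (serial) */
  if (cs_glob_rank_id <= 0) {
    FILE *f1 = fopen("MEAN_SHEAR.dat", "a");
    if (f1 != NULL) {
      fprintf(f1, "%d\t%f\n", cs_glob_time_step->nt_cur,
              tau_wall_sum / (cs_real_t)n_faces_tot);
      fclose(f1);
    }
  }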

Regards,

Luciano
Mohammad
Posts: 114
Joined: Thu Oct 25, 2018 12:18 pm

Re: Parallel computation of loops

Post by Mohammad »

Hello,

Thank you very much Luciano, your help solved the problem.

Regards,
Mohammad