Uslain in parallel computation

Ayde

Uslain in parallel computation

Post by Ayde »

Hi !

I am Alexis. I use Code_Saturne v3.0.0 and I have some questions about the Lagrangian module and parallel computation.

I am trying to simulate the dispersion of particles in a turbulent channel flow on several processors. The fluid phase is done and works well.

To get a handle on the Lagrangian module, I began with a simple case of particles in a laminar flow. First, I inject the particles into the channel with an inlet condition. Then I use the subroutine "uslain" to redistribute the particles homogeneously inside the channel, and the computation starts. This case works perfectly on a single processor.

The next step is to run this case on several processors, and this is where my problem appears: the computation does not work. After some attempts, I found that without the random redistribution of the particles in "uslain", the computation runs without any problem.

So, is it that this subroutine cannot work in a parallel computation? Or is the way the subroutine is coded simply not compatible with parallel runs?
Here is the code:
! reinitialization of the counter of the new particles

npt = nbpart

! for each boundary zone:
do ii = 1, nfrlag
  izone = ilflag(ii)

  ! for each class:
  do iclas = 1, iusncl(izone)

    ! if new particles must enter the domain:
    if (mod(ntcabs,iusloc(iclas,izone,ijfre)).eq.0) then

      do ip = npt+1, npt+iusloc(iclas,izone,ijnbp)

        ! Re-initialization of the particle location (B. Arcen):

        ! Generation of three random numbers (uniform distribution)
        call zufall(3,vunif)

        ! The particle location (xp,yp,zp) is then changed
        xp = vunif(1)*12.d0
        yp = vunif(2)*2.d0
        zp = vunif(3)*6.d0

        ! The number of the cell in which the particle [itepa(ip,jisor)] is located
        ! is found using the following base function
        call findpt(ncelet, ncel, xyzcen, xp, yp, zp, ipnode, ndrang)
        itepa(ip,jisor) = ipnode

        ! Finally the particle coordinate variables stored in ettp are updated
        ettp(ip,jxp) = xp
        ettp(ip,jyp) = yp
        ettp(ip,jzp) = zp

        ! Velocity is set to zero
        ettp(ip,jup) = 0.d0
        ettp(ip,jvp) = 0.d0
        ettp(ip,jwp) = 0.d0

      enddo

      npt = npt + iusloc(iclas,izone,ijnbp)

    endif

  enddo
enddo
Thanks for reading.
Regards.

Alexis
Yvan Fournier

Re: Uslain in parallel computation

Post by Yvan Fournier »

Hello,

What do you mean by "not working"? Does the code hang (deadlock), crash, or give bad results?

The problem is probably related to the use of findpt.

findpt itself requires some parallel exchanges (behaving as a collective parallel call), so if iusloc(iclas,izone,ijnbp) is different from process to process, the code will probably hang.

Otherwise, I assume the problem is simply related to the return value of findpt, which returns the number of the cell closest to the desired coordinates, and that cell may be on another process.

A simple solution would be to test for (ndrang .eq. irangp) just after the call to findpt, and ignore anything not on the current rank.

In code:

Code: Select all

! reinitialization of the counter of the new particles

npt = nbpart

! for each boundary zone:
do ii = 1, nfrlag
  izone = ilflag(ii)

  ! for each class:
  do iclas = 1, iusncl(izone)

    ! if new particles must enter the domain:
    if (mod(ntcabs,iusloc(iclas,izone,ijfre)).eq.0) then

      ! Local particle counter: starts from the current local number of
      ! particles and is only incremented when a particle actually lands
      ! on the current rank
      ipl = npt

      do ip = npt+1, npt+iusloc(iclas,izone,ijnbp)

        ! Re-initialization of the particle location (B. Arcen):

        ! Generation of three random numbers (uniform distribution)
        call zufall(3,vunif)

        ! The particle location (xp,yp,zp) is then changed
        xp = vunif(1)*12.d0
        yp = vunif(2)*2.d0
        zp = vunif(3)*6.d0

        ! The number of the cell in which the particle [itepa(ip,jisor)] is located
        ! is found using the following base function
        call findpt(ncelet, ncel, xyzcen, xp, yp, zp, ipnode, ndrang)

        ! Keep the particle only if the closest cell is on the current rank
        if (ndrang.eq.irangp) then

          ipl = ipl + 1  ! increment the local counter for this rank

          itepa(ipl,jisor) = ipnode

          ! Finally the particle coordinate variables stored in ettp are updated
          ettp(ipl,jxp) = xp
          ettp(ipl,jyp) = yp
          ettp(ipl,jzp) = zp

          ! Velocity is set to zero
          ettp(ipl,jup) = 0.d0
          ettp(ipl,jvp) = 0.d0
          ettp(ipl,jwp) = 0.d0

        endif

      enddo

      ! Only the particles actually placed on this rank are counted locally
      npt = ipl

    endif

  enddo
enddo
Notice I added an intermediate ipl counter to have a local equivalent of ip once it is known a particle is injected on the current rank, so you need to declare it also (as an integer).
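For example, it can go next to the other local integer declarations of uslain:

Code: Select all

integer          ipl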

Regards,

Yvan
Ayde

Re: Uslain in parallel computation

Post by Ayde »

Hi,

I tried your code but nothing changes. The computation stops even before completing the first iteration and gives me the same report.
Here is the message code_saturne shows me in the main window.
[ESS-LEM-C117-01:31218] *** An error occurred in MPI_Allreduce
[ESS-LEM-C117-01:31218] *** on communicator MPI_COMM_WORLD
[ESS-LEM-C117-01:31218] *** MPI_ERR_TRUNCATE: message truncated
[ESS-LEM-C117-01:31218] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec has exited due to process rank 3 with PID 31220 on
node ESS-LEM-C117-01 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
[ESS-LEM-C117-01:31216] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[ESS-LEM-C117-01:31216] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
solver script exited with status 15.

Error running the calculation.

Check code_saturne log (listing) and error* files for details.


****************************
Saving calculation results
****************************

Error in calculation stage.
And here is the error file.

Code: Select all

Signal SIGTERM (termination) received.
--> computation interrupted by the environment.

Call stack:
   1: 0x7fe55a2c6586 <mca_btl_sm_component_progress+0x46> (mca_btl_sm.so)
   2: 0x7fe55f4976ab <opal_progress+0x5b>             (libmpi.so.1)
   3: 0x7fe55f3e409d <ompi_request_default_wait_all+0xad> (libmpi.so.1)
   4: 0x7fe5591ff8e1 <ompi_coll_tuned_allreduce_intra_recursivedoubling+0x261> (mca_coll_tuned.so)
   5: 0x7fe55f3f0333 <PMPI_Allreduce+0x1a3>           (libmpi.so.1)
   6: 0x7fe5609ab74d <parfpt_+0x5d>                   (libsaturne.so.0)
   7: 0x7fe560a0421f <findpt_+0x103>                  (libsaturne.so.0)
   8: 0x408149     <uslain_+0x21b>                  (cs_solver)
   9: 0x7fe560db3e11 <lagent_+0x6521>                 (libsaturne.so.0)
  10: 0x416d3a     <lagune_+0x19ca>                 (cs_solver)
  11: 0x7fe560959d29 <caltri_+0x32e5>                 (libsaturne.so.0)
  12: 0x7fe560930415 <cs_run+0xa35>                   (libsaturne.so.0)
  13: 0x7fe56092f8fa <main+0x14a>                     (libsaturne.so.0)
  14: 0x7fe55ff2f76d <__libc_start_main+0xed>         (libc.so.6)
  15: 0x407459     <>                               (cs_solver)
End of call stack
I hope this can help you.

Regards.

Alexis
Yvan Fournier

Re: Uslain in parallel computation

Post by Yvan Fournier »

Hello,

Could you check whether the value of iusloc(iclas,izone,ijnbp) is different on each rank? (Print it from the user source file before the loop on injected particles, and activate logging on all ranks in the advanced options of the run options.)
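For example, something like this placed just before the loop on injected particles should do it (nfecra being the log file unit and irangp the current rank):

Code: Select all

write(nfecra,*) 'rank', irangp, ': ijnbp =', iusloc(iclas,izone,ijnbp)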

If it is the same, I'll re-check my code snippet to see if there is a small mistake. If not, as I said before, it cannot work as is, so we need an additional adaptation.

Regards,

Yvan
Ayde

Re: Uslain in parallel computation

Post by Ayde »

Hi,

I checked the values of iusloc(iclas,izone,ijnbp) on each rank, and they are indeed different. What a shame.

However, I found a trick to bypass the problem. I write it down here, as it may be useful for someone with a similar case.

The workaround is to run the case on a single processor for a few iterations, which handles the initialization and the distribution of the particles. Then restart the computation on several processes using the checkpoint of the previous single-processor run. Initialization and distribution are therefore never handled in parallel.
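For reference, the restarted (parallel) run needs the continuation flags switched on, either through the GUI or in the user routines. A minimal sketch, assuming the usual v3.0 keyword names (please double-check them for your version):

Code: Select all

! Restart the flow computation from the checkpoint of the serial run
isuite = 1

! Also restart the Lagrangian module, so the particles are read back
isuila = 1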

This is not very elegant, and the single-processor run might be very expensive on complex geometries, but it works.

Thanks for your help anyway.

Alexis
Yvan Fournier

Re: Uslain in parallel computation

Post by Yvan Fournier »

Hello,

Thanks for your feedback. I'm not sure what we'll have time to do by version 4.0 (which should be branched around October for validation, with a release in late winter 2015), but a more elegant solution which would work in 4.0 (and could already work today) would be to:
  • Assign an expected mean number of particles and a "surface injection probability density" to each inlet face for each time step
  • Loop on "local" random faces, generate a random number, divide it by the face surface, and compare it to the surface probability density
  • Repeat so that the injection probability density integrated over all faces matches the expected mean number of particles (be careful here: variants of the algorithm may change the distribution type: many substeps with small probability densities will probably converge to a linear distribution)
  • If a particle is injected at a face, randomize its position relative to the face...
This is better (and more efficient) than using findpt (which I hope to replace by 4.0), but needs some adjustment. Many variants and improvements of this pseudo-algorithm are possible, but I hope this may provide some ideas (i.e. avoiding communication beyond a first surface scan); a rough sketch is given below.
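Something along these lines could serve as a starting point. This is only a rough, untested sketch of the idea: xnbp and surftot are local double precision variables to declare, and ifrlag, surfbn, cdgfbo, ifabor, parsom and zufall are assumed to be the usual mesh, parallel and random-number helpers available from the user routine (to be checked against your version):

Code: Select all

! Rough sketch of the face-based injection idea (untested)

! Expected mean number of particles for this class and time step
xnbp = dble(iusloc(iclas,izone,ijnbp))

! Total surface of the injection zone (one parallel sum, done once)
surftot = 0.d0
do ifac = 1, nfabor
  if (ifrlag(ifac).eq.izone) surftot = surftot + surfbn(ifac)
enddo
if (irangp.ge.0) call parsom(surftot)

! Loop on the local faces of the zone: inject with a probability
! proportional to the face surface, so that the expected total number
! of injected particles matches xnbp; no collective call is needed here
do ifac = 1, nfabor
  if (ifrlag(ifac).eq.izone) then
    call zufall(1, vunif)
    if (vunif(1) .lt. xnbp*surfbn(ifac)/surftot) then
      npt = npt + 1
      itepa(npt,jisor) = ifabor(ifac)   ! cell adjacent to the boundary face
      ettp(npt,jxp) = cdgfbo(1,ifac)    ! start at the face centre; a random
      ettp(npt,jyp) = cdgfbo(2,ifac)    ! offset within the face could be
      ettp(npt,jzp) = cdgfbo(3,ifac)    ! added here
      ettp(npt,jup) = 0.d0
      ettp(npt,jvp) = 0.d0
      ettp(npt,jwp) = 0.d0
    endif
  endif
enddo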

Regards,

Yvan