Uslain in parallel computation

Ayde

Uslain in parallel computation

Post by Ayde »

Hi !

I am Alexis. I use Code_Saturne v3.0.0 and I have some questions about the Lagrangian module and parallel computation.

I am trying to simulate the dispersion of particles in a turbulent channel flow on several processors. The fluid phase is done and works well.

To get a handle on the Lagrangian module, I began with a simple case of particles in a laminar flow. First, I inject the particles into the channel with an inlet condition. Then I use the subroutine "uslain" to redistribute the particles homogeneously inside the channel, and the computation starts. This case works perfectly on a single processor.

The next step is to run this case on several processors, and this is where my problem appears: the computation does not work. After some attempts, I found that without the random redistribution of the particles in "uslain", the computation runs without any problem.

So, is it that this subroutine cannot work in a parallel computation? Or is the way the subroutine is coded simply not compatible with parallel runs?
Here is the code:
! reinitialization of the counter of the new particles

npt = nbpart

! for each boundary zone:
do ii = 1, nfrlag
  izone = ilflag(ii)

  ! for each class:
  do iclas = 1, iusncl(izone)

    ! if new particles must enter the domain:
    if (mod(ntcabs,iusloc(iclas,izone,ijfre)).eq.0) then

      do ip = npt+1, npt+iusloc(iclas,izone,ijnbp)

        ! Re-initialization of the particle location (B. Arcen):

        ! Generation of three random numbers (uniform distribution)
        call zufall(3,vunif)

        ! The particle location (xp,yp,zp) is then changed
        xp = vunif(1)*12.d0
        yp = vunif(2)*2.d0
        zp = vunif(3)*6.d0

        ! The number of the cell in which the particle [itepa(ip,jisor)] is located
        ! is found using the following base function
        call findpt(ncelet, ncel, xyzcen, xp, yp, zp, ipnode, ndrang)
        itepa(ip,jisor) = ipnode

        ! Finally the particle coordinate variables stored in ettp are updated
        ettp(ip,jxp) = xp
        ettp(ip,jyp) = yp
        ettp(ip,jzp) = zp

        ! Velocity is set to zero
        ettp(ip,jup) = 0.d0
        ettp(ip,jvp) = 0.d0
        ettp(ip,jwp) = 0.d0

      enddo

      npt = npt + iusloc(iclas,izone,ijnbp)

    endif

  enddo
enddo
Thanks for reading.
Regards.

Alexis
Yvan Fournier

Re: Uslain in parallel computation

Post by Yvan Fournier »

Hello,

What do you mean by "not working"? Does the code hang (deadlock), crash, or give bad results?

The problem is probably related to the use of findpt.

findpt itself requires some parallel exchanges (behaving as a collective parallel call), so if iusloc(iclas,izone,ijnbp) is different from process to process, the code will probably hang.

Otherwise, I assume the problem is simply related to the return value of findpt, which returns the number of the cell closest to the desired coordinates, and that cell may be on another process.

A simple solution would be to test for (ndrang .eq. irangp) just after the call to findpt, and ignore anything not on the current rank.

In code:

Code: Select all

! reinitialization of the counter of the new particles

npt = nbpart

! for each boundary zone:
do ii = 1, nfrlag
  izone = ilflag(ii)

  ! for each class:
  do iclas = 1, iusncl(izone)

    ! if new particles must enter the domain:
    if (mod(ntcabs,iusloc(iclas,izone,ijfre)).eq.0) then

      ! Local particle counter: starts from the current local number of
      ! particles and is only incremented when a particle actually lands
      ! on the current rank
      ipl = npt

      do ip = npt+1, npt+iusloc(iclas,izone,ijnbp)

        ! Re-initialization of the particle location (B. Arcen):

        ! Generation of three random numbers (uniform distribution)
        call zufall(3,vunif)

        ! The particle location (xp,yp,zp) is then changed
        xp = vunif(1)*12.d0
        yp = vunif(2)*2.d0
        zp = vunif(3)*6.d0

        ! The number of the cell in which the particle [itepa(ip,jisor)] is located
        ! is found using the following base function
        call findpt(ncelet, ncel, xyzcen, xp, yp, zp, ipnode, ndrang)

        ! Keep the particle only if the closest cell is on the current rank
        if (ndrang.eq.irangp) then

          ipl = ipl + 1  ! increment the local counter for this rank

          itepa(ipl,jisor) = ipnode

          ! Finally the particle coordinate variables stored in ettp are updated
          ettp(ipl,jxp) = xp
          ettp(ipl,jyp) = yp
          ettp(ipl,jzp) = zp

          ! Velocity is set to zero
          ettp(ipl,jup) = 0.d0
          ettp(ipl,jvp) = 0.d0
          ettp(ipl,jwp) = 0.d0

        endif

      enddo

      ! Only the particles actually placed on this rank are counted locally
      npt = ipl

    endif

  enddo
enddo
Notice I added an intermediate ipl counter to have a local equivalent of ip once it is known a particle is injected on the current rank, so you need to declare it also (as an integer).
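For example, it can go next to the other local integer declarations of uslain:

Code: Select all

integer          ipl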

Regards,

Yvan
Ayde

Re: Uslain in parallel computation

Post by Ayde »

Hi,

I tried your code but nothing changes. The computation stops even before completing the first iteration and gives me the same report.
Here is the message code_saturne shows me in the main window.
[ESS-LEM-C117-01:31218] *** An error occurred in MPI_Allreduce
[ESS-LEM-C117-01:31218] *** on communicator MPI_COMM_WORLD
[ESS-LEM-C117-01:31218] *** MPI_ERR_TRUNCATE: message truncated
[ESS-LEM-C117-01:31218] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec has exited due to process rank 3 with PID 31220 on
node ESS-LEM-C117-01 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
--------------------------------------------------------------------------
[ESS-LEM-C117-01:31216] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[ESS-LEM-C117-01:31216] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
solver script exited with status 15.

Error running the calculation.

Check code_saturne log (listing) and error* files for details.


****************************
Saving calculation results
****************************

Error in calculation stage.
And here is the error file.

Code: Select all

Signal SIGTERM (termination) received.
--> computation interrupted by the environment.

Call stack:
   1: 0x7fe55a2c6586 <mca_btl_sm_component_progress+0x46> (mca_btl_sm.so)
   2: 0x7fe55f4976ab <opal_progress+0x5b>             (libmpi.so.1)
   3: 0x7fe55f3e409d <ompi_request_default_wait_all+0xad> (libmpi.so.1)
   4: 0x7fe5591ff8e1 <ompi_coll_tuned_allreduce_intra_recursivedoubling+0x261> (mca_coll_tuned.so)
   5: 0x7fe55f3f0333 <PMPI_Allreduce+0x1a3>           (libmpi.so.1)
   6: 0x7fe5609ab74d <parfpt_+0x5d>                   (libsaturne.so.0)
   7: 0x7fe560a0421f <findpt_+0x103>                  (libsaturne.so.0)
   8: 0x408149     <uslain_+0x21b>                  (cs_solver)
   9: 0x7fe560db3e11 <lagent_+0x6521>                 (libsaturne.so.0)
  10: 0x416d3a     <lagune_+0x19ca>                 (cs_solver)
  11: 0x7fe560959d29 <caltri_+0x32e5>                 (libsaturne.so.0)
  12: 0x7fe560930415 <cs_run+0xa35>                   (libsaturne.so.0)
  13: 0x7fe56092f8fa <main+0x14a>                     (libsaturne.so.0)
  14: 0x7fe55ff2f76d <__libc_start_main+0xed>         (libc.so.6)
  15: 0x407459     <>                               (cs_solver)
End of call stack
I hope this can help you.

Regards.

Alexis
Yvan Fournier

Re: Uslain in parallel computation

Post by Yvan Fournier »

Hello,

Could you check whether the value of iusloc(iclas,izone,ijnbp) is different on each rank? (Print it from the user source file before the loop on injected particles, and activate logging on all ranks in the advanced options of the run options.)
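For example, something like this placed just before the loop on injected particles should do it (nfecra being the log file unit and irangp the current rank):

Code: Select all

write(nfecra,*) 'rank', irangp, ': ijnbp =', iusloc(iclas,izone,ijnbp)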

If it is the same, I'll re-check my code snippet to see if there is a small mistake. If not, as I said before, it cannot work as is, so we need an additional adaptation.

Regards,

Yvan
Ayde

Re: Uslain in parallel computation

Post by Ayde »

Hi,

I checked the values of iusloc(iclas,izone,ijnbp) on each rank, and they are indeed different. What a shame.

However, I found a trick to bypass the problem. I write it down here, as it may be useful for someone with a similar case.

The workaround is to run the case on a single processor for a few iterations, which handles the initialization and the distribution of the particles. Then restart the computation on several processes using the checkpoint of the previous single-processor run. Initialization and distribution are therefore never handled in parallel.
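For reference, the restarted (parallel) run needs the continuation flags switched on, either through the GUI or in the user routines. A minimal sketch, assuming the usual v3.0 keyword names (please double-check them for your version):

Code: Select all

! Restart the flow computation from the checkpoint of the serial run
isuite = 1

! Also restart the Lagrangian module, so the particles are read back
isuila = 1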

This is not very elegant, and the single-processor run might be very expensive on complex geometries, but it works.

Thanks for your help anyway.

Alexis
Yvan Fournier

Re: Uslain in parallel computation

Post by Yvan Fournier »

Hello,

Thanks for your feedback. I'm not sure what we'll have time to do by version 4.0 (which should be branched around October for validation, with a release in late winter 2015), but a more elegant solution which would work in 4.0 (and could already work today) would be to:
  • Assign an expected mean number of particles and a "surface injection probability density" to each inlet face for each time step
  • Loop on "local" random faces, generate a random number, divide it by the face surface, and compare it to the surface probability density
  • Repeat so that the injection probability density integrated over all faces matches the expected mean number of particles (be careful here: variants of the algorithm may change the distribution type: many substeps with small probability densities will probably converge to a linear distribution)
  • If a particle is injected at a face, randomize its position relative to the face...
This is better (and more efficient) than using findpt (which I hope to replace by 4.0), but needs some adjustment. Many variants and improvements of this pseudo-algorithm are possible, but I hope this may provide some ideas (i.e. avoiding communication beyond a first surface scan); a rough sketch is given below.
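Something along these lines could serve as a starting point. This is only a rough, untested sketch of the idea: xnbp and surftot are local double precision variables to declare, and ifrlag, surfbn, cdgfbo, ifabor, parsom and zufall are assumed to be the usual mesh, parallel and random-number helpers available from the user routine (to be checked against your version):

Code: Select all

! Rough sketch of the face-based injection idea (untested)

! Expected mean number of particles for this class and time step
xnbp = dble(iusloc(iclas,izone,ijnbp))

! Total surface of the injection zone (one parallel sum, done once)
surftot = 0.d0
do ifac = 1, nfabor
  if (ifrlag(ifac).eq.izone) surftot = surftot + surfbn(ifac)
enddo
if (irangp.ge.0) call parsom(surftot)

! Loop on the local faces of the zone: inject with a probability
! proportional to the face surface, so that the expected total number
! of injected particles matches xnbp; no collective call is needed here
do ifac = 1, nfabor
  if (ifrlag(ifac).eq.izone) then
    call zufall(1, vunif)
    if (vunif(1) .lt. xnbp*surfbn(ifac)/surftot) then
      npt = npt + 1
      itepa(npt,jisor) = ifabor(ifac)   ! cell adjacent to the boundary face
      ettp(npt,jxp) = cdgfbo(1,ifac)    ! start at the face centre; a random
      ettp(npt,jyp) = cdgfbo(2,ifac)    ! offset within the face could be
      ettp(npt,jzp) = cdgfbo(3,ifac)    ! added here
      ettp(npt,jup) = 0.d0
      ettp(npt,jvp) = 0.d0
      ettp(npt,jwp) = 0.d0
    endif
  endif
enddo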

Regards,

Yvan