user routine parallelization issues - v7
Posted: Fri Sep 22, 2023 9:20 am
Hello,
I am facing issues getting a user routine to work correctly when parallelized.
The specific code is written inside the routine "cs_boundary_conditions_ale.f90" (subroutine usalcl), in v7.
The routine loops over the faces of one boundary surface and stores the positions of the nodes:
Up to 20 CPUs on a single node, the routine works correctly. It fails at higher CPU counts (I have tested it on a 40-CPU node), and also when I parallelize the simulation over more than one node (for example, two nodes of 20 CPUs each). The error I get is shown below.
Moreover, the behavior seems case-dependent: for certain cases, the same routine works without problems even when parallelized over several nodes...
Code:
allocate(lstelt(nfabor))
call getfbr('BC1', nlelt1, lstelt)

if (.not.allocated(y_v)) then
  allocate(y_v(nnod))
endif
if (.not.allocated(y_v_paral)) then
  allocate(y_v_paral(nnod))
endif

! Store in y_v the y-coordinate of every vertex of the selected faces
! --> after gathering, y_v_paral will hold k_par values
k = 1
do ilelt = 1, nlelt1
  ifac = lstelt(ilelt)
  do ii = ipnfbr(ifac), ipnfbr(ifac+1)-1
    inod = nodfbr(ii)
    y_v(k) = xyzno0(2,inod)  ! component 2 = y
    k = k + 1
  enddo
enddo

! Gather y_v from all ranks into y_v_paral
k_par = k - 1
if (irangp.ge.0) then
  call parcpt(k_par)  ! sum the local counts over all ranks
  call cs_parall_allgather_r(k-1, k_par, y_v, y_v_paral)
endif
Code:
MPI_ABORT was invoked on rank 15 in communicator MPI_COMM_WORLD
with errorcode 1.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
[node141:106113] 2 more processes have sent help message help-mpi-api.txt / mpi-abort
[node141:106113] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
solver script exited with status 1.
Error running the calculation.
Any suggestion or idea on what could create the problem?
Thank you very much in advance for your help.
Kind regards,
Daniele