Different pressure values on different machines!

Questions and remarks about code_saturne usage
Forum rules
Please read the forum usage recommendations before posting.
Mohammad
Posts: 114
Joined: Thu Oct 25, 2018 12:18 pm

Different pressure values on different machines!

Post by Mohammad »

Hello,

I ran the same calculation on two different machines with the same number of cores, the same mesh, exactly the same files, and the same version of the code (6.0.4), started from the same checkpoint file.
The calculated velocities are close to each other on both machines, while the pressure and total pressure are quite different!
The pressure on the first machine is:
field minimum maximum set mean spatial mean
v Pressure -5.6154e+08 -5.6152e+08 -5.6153e+08 -5.6153e+08
And on the second machine is:
field minimum maximum set mean spatial mean
v Pressure -70045 74732 5001.5 -2.1064e-07
As you can see, the pressure field on the first machine is about 10,000 times larger in magnitude than on the second machine!
The correct values are those from the second machine.

The weird thing is that I also have version 5.0 on both machines; I tested the case with that version and this problem did not occur!

What is wrong there?

Regards,
Mohammad
Yvan Fournier
Posts: 4077
Joined: Mon Feb 20, 2012 3:25 pm

Re: Different pressure values on different machines!

Post by Yvan Fournier »

Hello,

Do you have a small test case (same setup, but with a small mesh if possible) with which we could try to reproduce this?

Or, in any case, at least the XML and user files.

Regards,

Yvan
Mohammad
Posts: 114
Joined: Thu Oct 25, 2018 12:18 pm

Re: Different pressure values on different machines!

Post by Mohammad »

Hello Yvan and thank you.

I don't know what happened, but the problem went away on its own after one day!
I will send the files if this problem happens again!

Regards,
Mohammad
Mohammad
Posts: 114
Joined: Thu Oct 25, 2018 12:18 pm

Re: Different pressure values on different machines!

Post by Mohammad »

Hello,

The problem happened again. I noticed it only happens if I use a Neumann B.C. for the pressure at the outlet boundary.

Code:

    icodcl(ifac,ipr) = 3 !Neumann Boundary Condition for the Pressure
    rcodcl(ifac,ipr,1) = 0.d0
    rcodcl(ifac,ipr,2) = rinfin
    rcodcl(ifac,ipr,3) = 0.d0
When I add this code to the outlet B.C. (isolib) in cs_user_boundary_conditions.f90, I get different results for the pressure on different machines (my PC and a cluster node). My PC gives correct results for the pressure, while the cluster node gives very large values for the pressure, and its number of pressure-solver iterations grows with time (from 30 to about 320 after a few hours). (With version 5.0.9, the PC and the cluster node give the same results and the pressure is correct on both of them!)

I uploaded the files of the case with a small mesh. You can download them here:
https://gofile.io/d/9a7R86

I also attached the setup.log files from the PC and the cluster node.
setup_PC.log
(22.79 KiB) Downloaded 144 times
setup_Cluster.log
(22.51 KiB) Downloaded 142 times
Regards,

Mohammad
Yvan Fournier
Posts: 4077
Joined: Mon Feb 20, 2012 3:25 pm

Re: Different pressure values on different machines!

Post by Yvan Fournier »

Hello,

OK, thanks for the info. This is probably related to partitioning. With the default outlet boundary conditions, there is a specific treatment to ensure that the choice of the reference pressure point is independent of the mesh partitioning (without it, we have observed a similar partitioning-dependent convergence). This treatment is triggered by setting the default outlet type, not simply a Neumann condition on the pressure.
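
For illustration, here is a minimal sketch (using the same 'OUTLET' face group as in your code) of a plain free outlet in cs_user_boundary_conditions.f90, where the pressure condition is left to the code so that this reference-pressure treatment applies:

Code:

call getfbr('OUTLET', nlelt, lstelt)
do ilelt = 1, nlelt
  ifac = lstelt(ilelt)
  ! Standard free outlet: do not override icodcl/rcodcl for the pressure,
  ! so the code applies its own outlet pressure treatment (including the
  ! partition-independent reference pressure point).
  itypfb(ifac) = isolib
enddo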

I suspect there is a similar issue here. What boundary condition type code do you use for the outlet (not just for the pressure)?

Regards,

Yvan
Mohammad
Posts: 114
Joined: Thu Oct 25, 2018 12:18 pm

Re: Different pressure values on different machines!

Post by Mohammad »

Hello,

Thanks for your reply.

I tested the case on another PC (with ver. 6.0.4 installed) using my checkpoint files, and there is no problem on that PC either! The problem only occurs on the cluster, and only with version 6.

[For your information, I start the calculations from a previous simulation's checkpoint file.]

I use the isolib B.C. type for the outlet, but I changed the velocity to a convective outlet and the pressure to Neumann, as you can see below:

Code:

! Local declarations and pointers to the previous velocity and CFL fields
double precision courant
double precision, dimension(:,:), pointer :: previous_vel
double precision, dimension(:), pointer :: cflnum

call field_get_val_prev_v(ivarfl(iu), previous_vel)
call field_get_val_s(icour, cflnum)

! SOME CODES ...

call getfbr('OUTLET', nlelt, lstelt)
do ilelt = 1, nlelt
  ifac = lstelt(ilelt)
  itypfb(ifac) = isolib

  iel = ifabor(ifac)
  courant = cflnum(iel)

  ! Convective outlet for the velocity components:
  ! rcodcl(.,.,1) = value at the previous time step, rcodcl(.,.,2) = local CFL
  icodcl(ifac,iu) = 2
  rcodcl(ifac,iu,1) = previous_vel(1,iel)
  rcodcl(ifac,iu,2) = courant

  icodcl(ifac,iv) = 2
  rcodcl(ifac,iv,1) = previous_vel(2,iel)
  rcodcl(ifac,iv,2) = courant

  icodcl(ifac,iw) = 2
  rcodcl(ifac,iw,1) = previous_vel(3,iel)
  rcodcl(ifac,iw,2) = courant

  ! Homogeneous Neumann boundary condition for the pressure
  icodcl(ifac,ipr) = 3
  rcodcl(ifac,ipr,1) = 0.d0
  rcodcl(ifac,ipr,2) = rinfin
  rcodcl(ifac,ipr,3) = 0.d0
enddo
Regards,
Mohammad
Mohammad
Posts: 114
Joined: Thu Oct 25, 2018 12:18 pm

Re: Different pressure values on different machines!

Post by Mohammad »

Hello,

I noticed that MPI is not installed on the cluster, and it seems that the mesh partitioning is not applied!

I don't see any messages about partitioning in the listing file on the cluster, while on my own PC I see the following messages on all runs:
No "partition_input/domain_number_4" file available;

----------------------------------------------------------

Partitioning by space-filling curve: Morton (in bounding box).
Number of cells per domain (histogramm):
[ 788417 ; 788417 ] = 4

Partitioning finished (27.9 s)
There is no such message in the listing file on the cluster.

You can see the header of the listing file from both systems for the same case:
listing_PC.txt
(30.32 KiB) Downloaded 150 times
listing_CLUSTER_6.txt
(21.78 KiB) Downloaded 135 times

I tried to force it to use METIS through the cs_user_performance_tuning-partition.c file, but it did not make any difference.

I also tried to perform the mesh partitioning on my own PC and then import it on the cluster using domain.partition_input in cs_user_scripts.py, but again, the run on the cluster does not even read the partition files!

I don't have the privileges to install MPI on the cluster, and you said this problem is possibly due to the partitioning.

The weird thing is that there is no problem on the same cluster with code_saturne version 5.0, with the same settings and the same files! The problem only occurs with version 6.0.

I also attached the header of the listing file on the cluster for version 5. Hope it will be useful!
listing_CLUSTER_5.txt
(21.1 KiB) Downloaded 145 times

Regards,
Mohammad
Yvan Fournier
Posts: 4077
Joined: Mon Feb 20, 2012 3:25 pm

Re: Different pressure values on different machines!

Post by Yvan Fournier »

Hello,

Good that you noticed. It means you are running the same computation multiple times on the cluster.

Perhaps MPI is already installed on the cluster, and you only need to detect it correctly and reinstall code_saturne to use it.
Check the cluster's documentation to see which MPI libraries are installed, and load the corresponding environment modules if needed. Once the "mpicc" and "mpicxx" scripts are found in your path (test using "mpicc -show"), you can use CC=mpicc CXX=mpicxx in the code_saturne "configure" command (you do not need MPI for Fortran, as MPI is not called directly from Fortran in code_saturne).

Partitioning is only useful if you are running with MPI (or to prepare a partitioning for another computation with MPI). You can run in parallel (MPI) without METIS (with the default partitioner), but without MPI you are limited to OpenMP, which is used only in some parts of the code and is much more limited (and less tested, possibly buggy).

Regards,

Yvan
Mohammad
Posts: 114
Joined: Thu Oct 25, 2018 12:18 pm

Re: Different pressure values on different machines!

Post by Mohammad »

Hello,

Thank you very much, dear Yvan. I found MPI on the cluster.
But now I get this error in the compile.log file when I run a job:
/state/partition1/intel/impi/5.0.1.035/intel64/bin/mpiicc: line 590: icc: command not found

/state/partition1/intel/impi/5.0.1.035/intel64/bin/mpiicc -DHAVE_CONFIG_H -I/share/users/mohammad/usr/Code_Saturne/6.0.0/arch/prod_mpi/include -I/share/users/mohammad/usr/Code_Saturne/6.0.0/arch/prod_mpi/include/code_saturne -I/share/users/mohammad/usr/share/hdf5-1.10.5//include -I/usr/include -DMPICH_SKIP_MPICXX -D_POSIX_C_SOURCE=200809L -DNDEBUG -std=c99 -restrict -funsigned-char -Wall -Wcheck -Wshadow -Wpointer-arith -Wmissing-prototypes -Wuninitialized -Wunused -wd981 -openmp -O2 -c /share/users/mohammad/JOB/RESU/20201112-2259/src/cs_user_parameters-fields.c
Regards,
Mohammad
Yvan Fournier
Posts: 4077
Joined: Mon Feb 20, 2012 3:25 pm

Re: Different pressure values on different machines!

Post by Yvan Fournier »

Hello,

Maybe we should move this to the "install" section of the forum...

I guess you do not have the same environment when running as when installing (perhaps the environment is not the same on the compute nodes).

Did you do the post-install step (code_saturne.cfg) to indicate the batch system? If yes, did you try "code_saturne submit" instead of "code_saturne run"? It compiles the code on the front-end node, so it may have a more complete environment.

Otherwise, did you load any modules or need any particular settings to find mpicc?

Best regards,

Yvan