Hi Code_Saturne users and developers,
My system is a small cluster that uses the Portable Batch System (PBS). I ran a small model of 66704 cells (turbulent junction flow, no heat transfer, picture attached below) with v4.0.5 and then with v4.3.0. Both runs were 300 steps on 12 cores (on the same node). Strangely, the run times are drastically different: v4.0.5 takes 87 seconds elapsed time and 1047 seconds total CPU time, but v4.3.0 takes 157706 seconds elapsed time and 1893645 seconds total CPU time! Same mesh, identical parameter files. What is going on with v4.3.0? I attach the listing and performance files. Any ideas? Thanks.
- ffan
Performance of v4.0.5 and 4.3.0
- Attachments
- v4.3.0.tar.gz (12 KiB)
- v4.0.5.tar.gz (10.44 KiB)
Re: Performance of v4.0.5 and 4.3.0
Hello,
By default, V4.3 uses OpenMP. You might be oversubscribing threads.
To use the same number of cores with the same number of MPI ranks, you need to use only 1 OpenMP thread per rank. Here, you are using 12 ranks * 12 threads.
Could you post your "runcase" file?
Regards,
Yvan
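The "12 ranks * 12 threads" figure above can be checked with a few lines of shell arithmetic. This is only a sketch: the variable names are illustrative, and the value of 12 threads per rank assumes that, with OMP_NUM_THREADS unset, each rank started one thread per core of the 12-core node.

```shell
#!/bin/sh
# Numbers taken from this thread: one 12-core node, 12 MPI ranks.
cores=12
ranks=12
# Assumption: with OMP_NUM_THREADS unset, each rank starts one
# OpenMP thread per core it can see on the node.
threads_per_rank=12

# Total software threads competing for the physical cores.
total=$((ranks * threads_per_rank))
echo "software threads: $total on $cores cores"

if [ "$total" -gt "$cores" ]; then
    echo "oversubscribed by a factor of $((total / cores))"
fi
```

With threads_per_rank set to 1, the total equals the core count and the second message is not printed.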
Re: Performance of v4.0.5 and 4.3.0
Thanks Yvan. I attach "runcase" here, but I think it is the one from when I ran the case interactively with the Code_Saturne GUI. The script that actually controls the batch run is the PBS script "run_parallel" below. Thanks.
- ffan
- Attachments
- run_parallel.txt (1.36 KiB)
- runcase.txt (204 Bytes)
Re: Performance of v4.0.5 and 4.3.0
Hello,
I recommend running the Code_Saturne post-install step (see the install documentation) so that newly created runcase scripts contain the relevant batch information (which is also handled by the GUI). This would have avoided your issue, which you can solve by adding OMP_NUM_THREADS=1 to your parallel script.
Regards,
Yvan
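As a rough illustration of the fix suggested above, the variable can be exported near the top of the batch script, before the solver is launched. The contents of the attached run_parallel script are not reproduced in this thread, so the layout below is a hypothetical sketch and not the actual file.

```shell
#!/bin/sh
# Hypothetical excerpt of a PBS batch script; the #PBS directives and
# the launch command of the real run_parallel script are not shown here.

# Use one OpenMP thread per MPI rank, so that 12 ranks occupy
# 12 cores instead of oversubscribing the node.
export OMP_NUM_THREADS=1

# Print the setting so it appears in the job output for later checking.
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# The existing launch line from run_parallel would follow unchanged.
```

Using export matters here: a plain assignment would not be passed on to the processes started later in the script.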
Re: Performance of v4.0.5 and 4.3.0
Yvan,
Thank you very much.
- ffan