Performance of v4.0.5 and 4.3.0

Posted: Tue Oct 04, 2016 6:03 am
by ffan
Hi Code_Saturne users and developers,

My system is a small cluster that uses the Portable Batch System (PBS). I tried a small model of 66704 cells (turbulent junction flow, no heat transfer, picture attached below) with v4.0.5 and then with v4.3.0. Both ran 300 steps on 12 cores (in the same node). What is really strange is that the run times are drastically different: v4.0.5 takes 87 seconds elapsed time and 1047 seconds total CPU time, but v4.3.0 takes 157706 seconds elapsed time and 1893645 seconds total CPU time! Same mesh, identical parameter files. What is going on with v4.3.0? I attach the listing and performance files. Any ideas? Thanks.

- ffan

Re: Performance of v4.0.5 and 4.3.0

Posted: Tue Oct 04, 2016 9:42 am
by Yvan Fournier
Hello,

By default, V4.3 uses OpenMP. You might be oversubscribing threads.

To use the same number of cores, if you use the same number of MPI ranks, you need to use 1 OpenMP thread per rank only. Here, you are using 12 ranks * 12 threads.
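The thread count implied here can be checked with a few lines of shell. This is only a minimal sketch: the helper name `total_threads` is made up for illustration, and the figures (12 ranks, 12 threads per rank, 12 cores) are the ones quoted in this thread.

```shell
#!/bin/sh
# total_threads RANKS THREADS_PER_RANK
# Hypothetical helper: prints how many threads are started in total.
total_threads() {
    echo $(( $1 * $2 ))
}

cores=12   # cores on the node, as stated in the first post
ranks=12   # MPI ranks used for the run

# 12 OpenMP threads per rank (the oversubscribed case) versus 1 per rank.
echo "12 threads per rank: $(total_threads "$ranks" 12) threads on $cores cores"
echo " 1 thread per rank:  $(total_threads "$ranks" 1) threads on $cores cores"
```

With 12 threads per rank the node has to schedule far more threads than it has cores, which is consistent with the slowdown reported above.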

Could you post your "runcase" file?

Regards,

Yvan

Re: Performance of v4.0.5 and 4.3.0

Posted: Tue Oct 04, 2016 6:44 pm
by ffan
Thanks Yvan. I attach "runcase" here, but I think it is the one from when I ran the case interactively with the Code_Saturne GUI. The one that actually controls the batch run is the PBS script "run_parallel" below. Thanks.

- ffan

Re: Performance of v4.0.5 and 4.3.0

Posted: Tue Oct 04, 2016 8:38 pm
by Yvan Fournier
Hello,

I recommend using the Code_Saturne post-install (see install documentation) so that newly created runcase scripts contain the relevant batch information (which is also handled by the GUI). This would have avoided your issue, which you can solve by adding OMP_NUM_THREADS=1 to your parallel script.
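As a rough illustration of where that setting could go, here is a minimal sketch of a PBS script. It is not the actual "run_parallel" script, whose contents are not shown in this thread: the resource request line uses Torque-style syntax as an assumption, and the solver launch is left as a placeholder comment.

```shell
#!/bin/sh
#PBS -l nodes=1:ppn=12
# Hypothetical sketch only; the resource request above is an assumed
# Torque-style line and may differ on other PBS variants.

# One OpenMP thread per MPI rank, so 12 ranks occupy 12 cores.
# The variable must be exported before the solver is launched.
export OMP_NUM_THREADS=1

# PBS sets PBS_O_WORKDIR to the submission directory; fall back to the
# current directory so the sketch also runs outside a batch job.
cd "${PBS_O_WORKDIR:-.}" || exit 1

echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"

# Launch the solver here exactly as the existing script already does.
```

The only change relative to an existing script would be the `export` line, placed before the launch command so that every rank inherits it.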

Regards,

Yvan

Re: Performance of v4.0.5 and 4.3.0

Posted: Wed Oct 05, 2016 5:29 pm
by ffan
Yvan,

Thank you very much.

- ffan