How many process can I run on this machine?

Questions and remarks about code_saturne usage
leguichet
Posts: 34
Joined: Wed Sep 03, 2014 1:00 pm

How many process can I run on this machine?

Post by leguichet »

Hello everyone,

I am not sure if this question is appropriate for the "Code_Saturne usage" section.

I am trying to run a turbomachinery case with 11 million tetrahedral cells. I have a workstation with one Intel(R) Xeon(R) CPU E5-2687W v3 @ 3.10GHz.

In the listing, it shows:
Local case configuration:

Date: Mon 30 Oct 2017 16:43:29 CET
System: Linux 3.16.0-4-amd64 (Debian GNU/Linux 8)
Machine: belledone
Processor: model name : Intel(R) Xeon(R) CPU E5-2687W v3 @ 3.10GHz
Memory: 129096 MB
User: gao (gao)
Directory: /home/gao/Documents/TEST/Turbo/RESU/20171030-1642
MPI ranks: 20 (appnum attribute: 0)
OpenMP threads: 1
Processors/node: 40
read method: collective MPI-IO (explicit offsets)
write method: collective MPI-IO (explicit offsets)
I/O rank step: 1

Why does it show "Processors/node: 40"? According to the Intel site, the Intel(R) Xeon(R) CPU E5-2687W v3 @ 3.10GHz has 10 cores and 20 threads.

What is the largest MPI/OpenMP configuration I can use to reduce the elapsed time?
Number of processes 20
Threads per process 1

or
Number of processes 10
Threads per process 2

or
Number of processes 40
Threads per process 1

or
Number of processes 20
Threads per process 2

Thank you very much!
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: How many process can I run on this machine?

Post by Yvan Fournier »

Hello,

This is strange. What is the output of:
cat /proc/cpuinfo
on your machine?

I recommend testing different combinations, but it is likely that

Number of processes 20
Threads per process 1

or

Number of processes 10
Threads per process 2

will be the fastest. Depending on the processor's memory performance, you might also not gain any performance beyond a few processes: on some of our higher-end machines we get good scaling up to all cores, while on some lower-end machines no improvement is obtained beyond half of the machine. So you may want to compare with:

Number of processes 10
Threads per process 1

Regards,

Yvan
leguichet

Re: How many process can I run on this machine?

Post by leguichet »

Thank you for your reply.

The output of
cat /proc/cpuinfo
is attached here.

I am wondering if I can go to 40 processes.
Attachments
cpu.txt
Luciano Garelli
Posts: 280
Joined: Fri Dec 04, 2015 1:42 pm

Re: How many process can I run on this machine?

Post by Luciano Garelli »

Hello,

Are you sure that you have only one socket? In the cpuinfo, your "physical id" values are 0 and 1, which means that you have two sockets, each with 10 cores and hyperthreading.

You can also run "lscpu" to get additional info.

Regards,

Luciano
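
For readers who want to do the same check on their own cpuinfo file, here is a minimal Python sketch. It is not part of code_saturne, and the function name `count_cpu_topology` is made up for illustration. It assumes the usual x86 Linux layout of /proc/cpuinfo, where each logical CPU block lists "processor", then "physical id", then "core id".

```python
def count_cpu_topology(cpuinfo_text):
    """Return (sockets, physical cores, logical CPUs) from /proc/cpuinfo text.

    Assumes each logical CPU block contains "processor", "physical id"
    and "core id" entries, with "physical id" appearing before "core id".
    """
    sockets = set()
    cores = set()
    logical = 0
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            # A new logical CPU block starts here.
            logical += 1
            physical_id = None
        elif key == "physical id":
            physical_id = value
            sockets.add(value)
        elif key == "core id":
            # Core ids repeat across sockets, so pair them with the socket id.
            cores.add((physical_id, value))
    return len(sockets), len(cores), logical
```

On Linux it could be called as `count_cpu_topology(open("/proc/cpuinfo").read())`. Two distinct "physical id" values give a socket count of 2, and hyperthreading shows up as more logical CPUs than physical cores.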
leguichet

Re: How many process can I run on this machine?

Post by leguichet »

Hello

lscpu gives
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2687W v3 @ 3.10GHz
Stepping: 2
CPU MHz: 1207.789
CPU max MHz: 3500,0000
CPU min MHz: 1200,0000
BogoMIPS: 6189.13
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-9,20-29
NUMA node1 CPU(s): 10-19,30-39
This is very strange, because we bought a single-socket machine.
Luciano Garelli

Re: How many process can I run on this machine?

Post by Luciano Garelli »

Hello,

You are lucky: you have a machine with 2 sockets, each with 10 physical cores (20 physical cores in total).

Regards,

Luciano
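
The arithmetic behind these numbers can be sketched in a few lines of Python. This is only an illustration: the helper names `parse_lscpu` and `core_counts` are hypothetical, and the sketch assumes English lscpu labels (for example from running `LC_ALL=C lscpu`). The three values are the ones lscpu reported in this thread.

```python
def parse_lscpu(text):
    """Turn the 'key: value' lines of lscpu output into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def core_counts(info):
    """Return (physical cores, logical CPUs) from parsed lscpu fields."""
    sockets = int(info["Socket(s)"])
    cores_per_socket = int(info["Core(s) per socket"])
    threads_per_core = int(info["Thread(s) per core"])
    physical = sockets * cores_per_socket
    logical = physical * threads_per_core
    return physical, logical


# Values reported by lscpu in this thread (English labels assumed).
LSCPU_EXCERPT = """\
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
"""

print(core_counts(parse_lscpu(LSCPU_EXCERPT)))  # (20, 40)
```

So 2 sockets x 10 cores give 20 physical cores, and hyperthreading doubles that to the 40 "processors" shown in the listing.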
leguichet

Re: How many process can I run on this machine?

Post by leguichet »

Hello,

Thank you for your reply. Therefore, I can run with

Number of processes 20
Threads per process 2

or

Number of processes 40
Threads per process 1
?

My colleague told me that the CPU has 10 cores, each of which has two virtual threads (not two physical threads). Is that true?

When running Autodesk CFD simulations on Windows, he observed only one process per core (one thread running, the other idle). So he says that each core has only one physical thread, not two.

Thank you!

Kind regards
Luciano Garelli

Re: How many process can I run on this machine?

Post by Luciano Garelli »

Hello,

Each of your CPUs has 10 physical cores and hyperthreading. As Yvan mentioned, you will get the best scalability using

Number of processes 20
Threads per process 1

or

Number of processes 10
Threads per process 2

You can also run a smaller case and measure the speedup yourself to draw your own conclusions. You should also check your available memory, because the first option (flat MPI) needs more memory than the hybrid case.


Regards,

Luciano
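
As a rough way to compare the configurations discussed in this thread, the sketch below classifies each one by how many threads it runs in total. It is only an illustration: the `classify` helper is hypothetical, and it assumes the operating system spreads threads over distinct physical cores first, which is not guaranteed without explicit pinning.

```python
PHYSICAL_CORES = 20  # 2 sockets x 10 cores, as established in this thread
LOGICAL_CPUS = 40    # 2 hardware threads per core with hyperthreading


def classify(processes, threads_per_process):
    """Describe how a processes x threads layout maps onto the machine."""
    total = processes * threads_per_process
    if total <= PHYSICAL_CORES:
        return "at most one thread per physical core"
    if total <= LOGICAL_CPUS:
        return "relies on hyperthreading"
    return "oversubscribed"


# (processes, threads per process) combinations mentioned in the thread.
CANDIDATES = [(20, 1), (10, 2), (40, 1), (20, 2), (10, 1)]

for n_proc, n_thr in CANDIDATES:
    total = n_proc * n_thr
    print(f"{n_proc:2d} x {n_thr} = {total:2d} threads: {classify(n_proc, n_thr)}")
```

The 20 x 1 and 10 x 2 layouts fit on the 20 physical cores, while 40 x 1 and 20 x 2 only fit by using both hardware threads of each core. Which one is actually fastest still has to be measured.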
leguichet

Re: How many process can I run on this machine?

Post by leguichet »

Thank you Luciano. It is very helpful.
JonasA
Posts: 18
Joined: Tue Feb 06, 2018 11:49 am

Re: How many process can I run on this machine?

Post by JonasA »

Hi everyone! I am a new user of code_saturne v5.0.5 on Debian 8.
I have run a speed-up test with different configurations.
Luciano Garelli wrote: Also you have to check your available memory because with the first option (flat MPI) you will need more memory than with the hybrid case.
Memory is not a problem for this machine so far.

I ran the test by computing 500 steps with the case from the tutorial 08_PUMP_JOINING_5.0, provided during the training course held last year at the Jülich research center in Germany.
Running 20 processes with 2 threads each was ~15% slower than running 40 processes with one thread each: 359 seconds vs 316 seconds. Watching thread usage with htop, I found that the 40-process configuration constantly uses all 40 threads, while the other configuration uses between 20 and 40 threads (the number of threads in use changes every few seconds).

Therefore, it seems more efficient to use all the virtual threads as MPI processes than to ask code_saturne to split each process across virtual threads. Does this conclusion hold in general?

Best regards

Jonas
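
The comparison between the two timings above can be reproduced with a short Python sketch. The helper names are hypothetical, and only the two elapsed times quoted in this post are used.

```python
def relative_slowdown(t_slow, t_fast):
    """Fraction by which the slower run exceeds the faster one."""
    return (t_slow - t_fast) / t_fast


def speedup(t_reference, t_config):
    """Speedup of a configuration relative to a reference run."""
    return t_reference / t_config


t_hybrid = 359.0  # seconds, 20 processes x 2 threads
t_flat = 316.0    # seconds, 40 processes x 1 thread

print(f"hybrid vs flat MPI: {relative_slowdown(t_hybrid, t_flat):.1%} slower")
# hybrid vs flat MPI: 13.6% slower
```

The computed 13.6% is roughly in line with the ~15% quoted. With a timing for a reference run (for example 10 processes x 1 thread), `speedup` could be used the same way to compare every tested configuration.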
Attachments
Belledonne_CS5_Speed_Test.ods
The speed test on various configuration