Dear all,
Please could you give me some ideas about this question:
I am running Code_Saturne on my desktop PC: Intel(R) Xeon(R) Bronze 3104 CPU @ 1.70GHz. I know my PC has 5 processes, with 6 threads per process.
When I start a calculation in the GUI and choose, for example, "Number of processes: 1, Threads per process: 8", the calculation starts with no errors. Then, when I check the "summary" file, I find "threads-per-task 8" and "N Procs : 1", which I think should be impossible, because my PC has only 6 threads per process.
So how can I find out how many processes and threads are actually being used in a calculation?
Best regards,
Ruonan
How to know how many processes and threads are actually used in a calculation?
Re: How to know how many processes and threads are actually used in a calculation?
Hello,
Actually, the value in the listing/run_solver.log file should be correct.
On most systems, you can assign more threads than the hardware can run simultaneously, which is called oversubscribing. In this case, execution alternates between those threads, so in most (not all) cases, you can expect better performance with 6 threads than with 8 (and possibly even with 5, if the system also needs to run tasks in the background).
This is also possible with MPI, though some recent MPI libraries (such as OpenMPI) only allow running up to the number of physical cores, unless you add the "--oversubscribe" option, which basically tells the library to go ahead anyway because you know what you are doing (usually bad for performance, but useful for debugging).
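To make the oversubscription check concrete, here is a minimal Python sketch that compares a requested "processes x threads per process" layout with the number of logical CPUs the operating system reports. The helper name `oversubscription_factor` is hypothetical and is not part of Code_Saturne; the sketch only illustrates the arithmetic and does not inspect a running calculation.

```python
import os
from typing import Optional


def oversubscription_factor(n_procs: int, threads_per_proc: int,
                            logical_cpus: Optional[int] = None) -> float:
    """Return requested software threads divided by available logical CPUs.

    A value above 1.0 means the layout is oversubscribed.
    (Hypothetical helper, for illustration only.)
    """
    if logical_cpus is None:
        # os.cpu_count() reports logical CPUs and may return None.
        logical_cpus = os.cpu_count() or 1
    return (n_procs * threads_per_proc) / logical_cpus


# Layout from this thread: 1 process x 8 threads on a 6-core, 6-thread CPU.
factor = oversubscription_factor(1, 8, logical_cpus=6)
print(f"requested/available = {factor:.2f}")  # 1.33, i.e. oversubscribed
```

Leaving out `logical_cpus` makes the helper use the count reported on the current machine.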
Best regards,
Yvan
Re: How to know how many processes and threads are actually used in a calculation?
Hello Yvan,
Thank you very much for the reply. I found the information in the run_solver.log file.
Yes, in my current case I get better performance with 4 threads than with 6. I think it also depends on the number of mesh elements.
Best regards,
Ruonan
Re: How to know how many processes and threads are actually used in a calculation?
Hello. According to Intel specs, your CPU has 6 cores, so you should get maximum performance with 6 threads in any CFD software. I can think of the following reasons for the slowdown you observe:
1. Your CPU is overheating (for example, the thermal grease has dried out).
2. Another process (or several) is running and "consuming" one or more cores.
3. The mesh is too "light" for parallel calculation, although this is rare. If you have at least 0.5 to 1 million cells, you will definitely not have this problem on your machine. I have run lots of simulations on workstations with 4 to 16 cores and there were no problems with parallel performance (for example, a mesh of about 3 million cells was fine on both 4-core and 16-core machines).
Re: How to know how many processes and threads are actually used in a calculation?
Hello Antech,
Thank you very much for explaining this. There may have been some misunderstanding in the observation I mentioned before. In my latest case, what I see is consistent with what you said:
In my latest calculation (mesh with 0.12 million elements, 0.11 million nodes, LES in Saturne):
When I use all 6 cores, I get maximum performance, and all cores show 100% utilization. When I change to 4 cores, the calculation speed drops to 86%. When I change to 2 cores, the speed drops further to 56%. And when I oversubscribe (8 threads assigned), the utilization of all 6 cores falls from 100% to 50%, and the speed drops to 38% of the 6-core speed, which is the worst performance of all.
And yes, my desktop has 1 processor with 6 cores and 6 threads, not 5 processors.
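The relative speeds reported above can be turned into speedup and parallel efficiency figures. The sketch below assumes that all percentages are relative to the 6-core run, and it uses the 2-core run as the baseline because no single-core timing was given. The function name `scaling_vs_baseline` is hypothetical.

```python
# Relative speeds from the post, normalised to the 6-core run (assumption).
relative_speed = {2: 0.56, 4: 0.86, 6: 1.00}


def scaling_vs_baseline(speeds, baseline):
    """Return {n: (speedup, efficiency)} relative to the baseline core count.

    speedup    = speed(n) / speed(baseline)
    efficiency = speedup / (n / baseline)
    """
    base_speed = speeds[baseline]
    result = {}
    for n, speed in sorted(speeds.items()):
        speedup = speed / base_speed
        result[n] = (speedup, speedup / (n / baseline))
    return result


for n, (speedup, eff) in scaling_vs_baseline(relative_speed, baseline=2).items():
    print(f"{n} cores: speedup x{speedup:.2f}, efficiency {eff:.0%}")
```

Under these assumptions, going from 2 to 4 cores gives a speedup of about 1.54 (roughly 77% efficiency), and going from 2 to 6 cores gives about 1.79 (roughly 60% efficiency). The oversubscribed 8-thread run is left out because it does not add hardware cores.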
Best regards,
Ruonan