Dear all,
I am facing an issue regarding a parallel batch job.
The batch system is SGE.
Although my script contains the option to run the job in parallel (#$ -l nodes=2), once the job starts running it is still serial!
I also tried adding --mpi to the call that runs the job; in my case:
/scratch/hpc/25/attene/CS_wave_loads/73_steady_conformal/3.62/RESU/20180808-0022/run_solver --mpi
With or without --mpi, the calculation behaves the same.
Any ideas?
I was thinking this issue may be due to the configuration of the batch system during the post-installation step...
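For reference, a stripped-down sketch of my submission script, keeping only the lines relevant here:

    #!/bin/sh
    #$ -l nodes=2
    # launcher generated by code_saturne; behaves the same with or without --mpi
    /scratch/hpc/25/attene/CS_wave_loads/73_steady_conformal/3.62/RESU/20180808-0022/run_solver --mpi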
regards,
FA
Re: parallel batch job
Hello,
How are you submitting the job? What is in your "run_solver" script (generated in RESU/<run_id> before running)?
Do you have any messages when submitting? Or in the batch log files?
Regards,
Yvan
Re: parallel batch job
Hi Yvan,
I have attached the run_solver script as well as performance.log and setup.log.
I didn't have any particular messages when submitting, nor in the batch log files.
At the moment I am waiting for a simulation to start: I added mpirun to the line that invokes cs_solver in run_solver.
Regards,
FA
Attachments:
- setup.log (25.7 KiB)
- performance.log (882 Bytes)
Re: parallel batch job
Hello,
These are not the type of log files to which I am referring. Batch systems usually add files (either in RESU/<run_id> or SCRIPTS, depending on how you submitted the run) with names containing the job number and extensions .out and .err.
Also, how do you submit the job? What does your runcase (including the batch header) look like?
Regards,
Yvan
Re: parallel batch job
Hello,
The files coming directly from the batch system have, in my case, extensions .e1385884 and .o1385884; they are empty, so I have not attached them.
Anyway, I solved the problem by prefixing the cs_solver command in run_solver with mpirun, and I now call run_solver from the batch script (see it attached).
It looks like cs_solver runs in serial by default when invoked on its own.
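In practice the change amounts to something like this (a sketch only; the cs_solver arguments are the ones already generated in run_solver, and whether mpirun needs an explicit -np depends on the MPI library and its SGE integration):

    #!/bin/sh
    # script_parallel submitted to SGE: it simply calls the generated launcher
    #$ -l nodes=2
    cd /scratch/hpc/25/attene/CS_wave_loads/73_steady_conformal/3.62/RESU/20180808-0022
    ./run_solver

and inside run_solver the solver line was changed from

    ./cs_solver <arguments generated by code_saturne>

to

    mpirun ./cs_solver <arguments generated by code_saturne>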
Regards,
FA
Attachments:
- script_parallel.txt (206 Bytes)
Re: parallel batch job
Hello,
Maybe you can automate this better by defining the mpiexec command choice in the post-install file (code_saturne.cfg), since it seems the automatic/default values are not adapted to your system.
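For example, something along these lines in code_saturne.cfg would force the launcher to use (the exact key names may differ between versions, so check the commented template provided with the installation):

    [mpi]
    # MPI launcher to use for parallel runs
    mpiexec = mpirun
    # option used to pass the number of ranks to the launcher
    mpiexec_n = ' -n '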
Details on your system (OS version, MPI library version, current code_saturne.cfg file, and summary file) could help us improve the default detection (but SGE is always a mess, with a syntax that is difficult to automate, so I am not surprised).
Best regards,
Yvan