parallel batch job

attene
Posts: 68
Joined: Fri Jun 29, 2018 10:54 am

Post by attene »

Dear all,

I am facing an issue regarding a parallel batch job.
The batch system is SGE.
Although my script includes the option to run the job in parallel (#$ -l nodes=2), the job still runs in serial once it starts!



I also tried adding --mpi to the command that runs the job; in my case:

/scratch/hpc/25/attene/CS_wave_loads/73_steady_conformal/3.62/RESU/20180808-0022/run_solver --mpi

With or without --mpi, the calculation behaves the same.

Any idea?

I suspect this issue may be due to how the batch system was configured during the post-install step...

Regards,

FA
Yvan Fournier
Posts: 4251
Joined: Mon Feb 20, 2012 3:25 pm

Re: parallel batch job

Post by Yvan Fournier »

Hello,

How are you submitting the job? What is in your "run_solver" script (generated in RESU/<run_id> before running)?

Do you get any messages when submitting? Or in the batch log files?

Regards,

Yvan
attene

Post by attene »

Hi Yvan,

I have attached the run_solver as well as the performance.log and setup.log.
I did not get any particular messages when submitting, nor in the batch log files.

At the moment I am waiting for a simulation to start: I added mpirun to the line that calls cs_solver in run_solver.

Regards,

FA
Attachments
setup.log (25.7 KiB)
performance.log (882 Bytes)
Yvan Fournier

Post by Yvan Fournier »

Hello,

These are not the type of log files I am referring to. Batch systems usually add files (in either RESU/<run_id> or SCRIPTS, depending on how you submitted the job) with names containing the job number and an extension such as .out or .err.

Also, how do you submit the job? What does your runcase (including the batch header) look like?

Regards,

Yvan
attene

Post by attene »

Hello,

In my case, the files written directly by the batch system have the extensions .e1385884 and .o1385884; they are empty, so I have not attached them.

Anyway, I solved the problem by adding mpirun to the command that launches the cs_solver binary in run_solver. I then called run_solver from the batch script (see attached).
It looks like cs_solver runs in serial by default when invoked on its own.

Regards,

FA
Attachments
script_parallel.txt (206 Bytes)
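For readers without access to the attachment, the workaround described above might look roughly like the sketch below. It is not the attached script. The job name, the parallel environment name "mpi", the slot count, and the solver path are all hypothetical. It also assumes that the site requests MPI slots through an SGE parallel environment (-pe) rather than -l nodes=..., which is common on SGE but cannot be confirmed from this thread.

```shell
#!/bin/bash
#$ -N cs_parallel     # job name (hypothetical)
#$ -cwd               # run from the directory the job was submitted from
#$ -pe mpi 16         # parallel environment "mpi" and slot count are site-specific assumptions

# SGE sets NSLOTS to the number of granted slots; fall back to 1 outside a job.
NPROCS="${NSLOTS:-1}"

# Hypothetical path to the solver binary inside RESU/<run_id>.
SOLVER="./cs_solver"

# Build the launch line and print it, so it appears in the job's .o<jobid> file.
LAUNCH="mpirun -np ${NPROCS} ${SOLVER}"
echo "launch command: ${LAUNCH}"

# Uncomment on the cluster to actually start the solver:
# ${LAUNCH}
```

Printing the command before running it makes an otherwise empty .o<jobid> file useful for checking how many ranks were actually requested.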
Yvan Fournier

Post by Yvan Fournier »

Hello,

Maybe you can automate this better by defining the mpiexec command in the post-install configuration (code_saturne.cfg), since the automatic/default values do not seem to be suited to your system.

Details on your system (OS version, MPI library version, current code_saturne.cfg file, and summary file) could help us improve the default detection (but SGE is always a mess, with a syntax that is difficult to automate, so I am not surprised).

Best regards,

Yvan
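As a rough illustration of the suggestion above, an [mpi] section in code_saturne.cfg might look like the fragment below. This is a sketch under assumptions: the key names are recalled from the template file distributed with code_saturne and should be checked against the code_saturne.cfg.template of your own installation, and the values shown (mpirun, -np) simply mirror the manual workaround described earlier in the thread.

```ini
# code_saturne.cfg (post-install configuration), [mpi] section only.
# Key names are assumptions to verify against your installed template.
[mpi]
# Launcher to use instead of the auto-detected default.
mpiexec = mpirun
# Option used to pass the number of ranks; the trailing space matters.
mpiexec_n = ' -np '
```

With a setting like this in place, the generated run_solver should no longer need to be edited by hand for each run.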