[SOLVED] Code_Saturne + SLURM: Errors

FredH
Posts: 6
Joined: Thu Nov 15, 2018 11:04 am

[SOLVED] Code_Saturne + SLURM: Errors

Post by FredH »

Hello Specialists,

I'm trying to install Code_Saturne for one user on my HPC cluster (with SLURM).
CentOS Linux release 7.3.1611 (Core)

I "succeeded" once in installing v5.1.5 (on an NFS shared folder) with the automatic install script.
But the user reported a SIGTERM signal (error.png).

So I tried to compile other flavors: first the stable version v5.0.9, then v5.0.9 debug, v5.3.0, and v5.1.5 debug with the automatic install script, and now manually.

Now I'm facing this error for each new initialization/version, even with a new v5.1.5 :shock: :

Code:

$ code_saturne run --initialize -p setup.xml --id=test1

                      Code_Saturne
                      ************

 Version:   5.0
 Path:      /work/projects/Code_Saturne/test

 Result directory:
   /scratch/hmzf/Test1/Case1-xml/RESU/test1


 Single processor Code_Saturne simulation.


 ***************************
  Preprocessing calculation
 ***************************

Traceback (most recent call last):
  File "/work/projects/Code_Saturne/test/bin/code_saturne", line 76, in <module>
    retcode = cs.execute()
  File "/work/projects/Code_Saturne/test/lib/python2.7/site-packages/code_saturne/cs_script.py", line 93, in execute
    return self.commands[command](options)
  File "/work/projects/Code_Saturne/test/lib/python2.7/site-packages/code_saturne/cs_script.py", line 168, in run
    return cs_run.main(options, self.package)
  File "/work/projects/Code_Saturne/test/lib/python2.7/site-packages/code_saturne/cs_run.py", line 387, in main
    return run(argv, pkg)[0]
  File "/work/projects/Code_Saturne/test/lib/python2.7/site-packages/code_saturne/cs_run.py", line 375, in run
    stages=stages)
  File "/work/projects/Code_Saturne/test/lib/python2.7/site-packages/code_saturne/cs_case.py", line 1945, in run
    mpiexec_options)
  File "/work/projects/Code_Saturne/test/lib/python2.7/site-packages/code_saturne/cs_case.py", line 1646, in preprocess
    d.preprocess()
  File "/work/projects/Code_Saturne/test/lib/python2.7/site-packages/code_saturne/cs_case_domain.py", line 807, in preprocess
    retcode = run_command(cmd, pkg=self.package)
  File "/work/projects/Code_Saturne/test/lib/python2.7/site-packages/code_saturne/cs_exec_environment.py", line 524, in run_command
    p = subprocess.Popen(cmd, universal_newlines=True, env = env, **kwargs)
  File "/usr/lib64/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib64/python2.7/subprocess.py", line 1327, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

Every initialization is done in a fresh session (logout/login), after exporting the version path and setting the alias.


Attached:
- install.txt: what I did for the install (manually this time) + post-install
- mpic_versions.txt: -v output of mpicc & mpic++
- configure.log.txt: the output of configure
- config.log
- run.txt: running Code_Saturne as the user
- test1: the RESU directory from the initialization



I'm really not a Code_Saturne specialist, and neither is my user, so any help would be appreciated. I've probably missed something...
Thanks a lot.
Regards
Attachments
CodeSaturne.zip
error.png
Last edited by FredH on Wed Mar 06, 2019 3:43 pm, edited 1 time in total.
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Code_Saturne + SLURM: Errors

Post by Yvan Fournier »

Hello,

It seems that in your configuration options, you added --disable-frontend, which means the preprocessor is not available (we need to add a check for this to have a better error message).

So with this option, you can only use pre-imported "mesh_input"/"mesh_output" files or directories.
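
A minimal Python sketch of why the traceback above ends in "OSError: [Errno 2]": subprocess.Popen raises that error when the executable it is asked to launch does not exist, which is what would happen if the preprocessor was never built. The helper try_launch and the path below are hypothetical and are not part of code_saturne.

```python
import errno
import subprocess


def try_launch(cmd):
    """Launch cmd and report a missing executable instead of raising.

    Returns the process return code, or None if the executable was not found.
    """
    try:
        proc = subprocess.Popen(cmd, universal_newlines=True)
    except OSError as exc:
        # ENOENT is errno 2: "No such file or directory".
        if exc.errno == errno.ENOENT:
            print("Executable not found: %s" % cmd[0])
            return None
        raise
    return proc.wait()


# Hypothetical path standing in for a preprocessor that was never installed.
missing = "/nonexistent/prefix/libexec/preprocessor_placeholder"
result = try_launch([missing, "--version"])
```

A check of this kind before launching would turn the raw traceback into a readable message.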

Otherwise, since you are using SLURM, did you do the post-installation step involving code_saturne.cfg?

Regards,

Yvan
FredH
Posts: 6
Joined: Thu Nov 15, 2018 11:04 am

Re: Code_Saturne + SLURM: Errors

Post by FredH »

Hello Yvan,

Oh yes, thank you for the clarification.
It's my first install, and this confirms that I'm really not familiar with using Code_Saturne.
And neither is my user... (no preprocessing at the beginning, preprocessing now... v5.1, v5.0.9 now...).

For the post-install, I followed point 8 of the install manual, but missed the compute_versions setting in the cfg file. :oops: (I need a break to clear my mind.)

So, two builds are needed:
  • For the front-end: without MPI
  • For the compute nodes: --disable-frontend + MPI
I'll try to test this as soon as possible and keep you posted.

Thanks for your help.
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Code_Saturne + SLURM: Errors

Post by Yvan Fournier »

Hello,

No, you do not need to disable the front-end on most installations. In our case, we usually install a "main" production version and a "debug" build (using --enable-debug). I often add --disable-frontend to the debug build to avoid duplicate installs of the preprocessor and documentation, and add the debug build to the "compute_versions" of the main build. This makes it possible to do everything from the main build, including choosing the compute version from the GUI, but this is an "advanced" (though recommended) install.

Completely different builds on the front-end and compute nodes are only necessary when the two system types are quite different (as on IBM Blue Gene machines, or some Crays, but not most clusters).

Simply installing the code on a cluster and adding SLURM or a path to a SLURM template file to the "batch" entry in code_saturne.cfg should work fine.
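
As a rough illustration of the two forms the "batch" entry can take (a batch system name such as SLURM, or a path to a template file), the Python sketch below reads a sample configuration text. The [install] section name, the file path, and the helper describe_batch_entry are assumptions made for the example; this is not code_saturne's own logic.

```python
import configparser
import os.path

# Sample text only: the [install] section name and the path are assumptions.
SAMPLE_CFG = """
[install]
batch = /opt/site/code_saturne/batch.SLURM
"""


def describe_batch_entry(value):
    """Classify a 'batch' value as a template path or a batch system name."""
    value = value.strip()
    if not value:
        return ("none", "")
    if os.path.isabs(value):
        return ("template", value)
    return ("name", value)


parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CFG)
kind, value = describe_batch_entry(parser.get("install", "batch", fallback=""))
print(kind, value)
```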

Best regards,

Yvan
FredH
Posts: 6
Joined: Thu Nov 15, 2018 11:04 am

Re: Code_Saturne + SLURM: Errors

Post by FredH »

Thank you very much for the answer.

Sorry for the delay; I hope to be able to do a quick test this week (between maintenance and filesystem issues).

Best regards.
FredH
Posts: 6
Joined: Thu Nov 15, 2018 11:04 am

Re: Code_Saturne + SLURM: Errors

Post by FredH »

Hello,
Sorry for the delay (admin life...).

Here is the latest error reported by the user (error2.png).

It seems simple; I'll try to find the cause on my side too.

Regards.
Attachments
error2.png
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Code_Saturne + SLURM: Errors

Post by Yvan Fournier »

Hello,

For OpenMPI, using SLURM, on at least one of our systems, we set
mpiexec_n_per_node =
(empty string)

in etc/code_saturne.cfg so as to avoid the automatic -ppn setting. This should also work here, though I don't understand why the option appears if you did not set it (in bin/cs_exec_environment.py, it is set when the detected mpiexec is Hydra, from MPICH, so it should not appear here).
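
A small Python sketch of the effect of an empty mpiexec_n_per_node value, assuming the key lives in an [mpi] section and that a launcher would otherwise default to " -ppn ". The function per_node_args and that default are hypothetical; they only model the behaviour described here and are not taken from cs_exec_environment.py.

```python
import configparser

# Sample text only: the [mpi] section name is an assumption.
SAMPLE_CFG = """
[mpi]
mpiexec_n_per_node =
"""


def per_node_args(cfg_text, ppn):
    """Return the per-node launcher arguments, or [] if the option is empty."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Hypothetical default standing in for an automatically detected option.
    opt = parser.get("mpi", "mpiexec_n_per_node", fallback=" -ppn ")
    opt = opt.strip()
    if not opt:
        return []
    return [opt, str(ppn)]


print(per_node_args(SAMPLE_CFG, 4))   # key present but empty
print(per_node_args("[mpi]\n", 4))    # key absent, default applies
```

The point is that a key set to an empty string differs from a missing key: the former overrides the default, the latter leaves it in place.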

Best regards,

Yvan
FredH
Posts: 6
Joined: Thu Nov 15, 2018 11:04 am

Re: Code_Saturne + SLURM: Errors

Post by FredH »

Hello,
Thanks for the quick answer; the -ppn option is gone now.

Another quick question:
I've added some #SBATCH lines to code_saturne-5.0.9/share/code_saturne/batch/batch.SLURM, and they do not appear in run_solver after a "code_saturne run --initialize...". Have I missed something again?

Thanks
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Code_Saturne + SLURM: Errors

Post by Yvan Fournier »

Hello,

Did you reinstall the code after the modification?
As a note, you can use a batch template (named <basename>.SLURM if using SLURM) which is separate from the code sources, and use an absolute path in the code_saturne.cfg "batch" entry to use it (which avoids requiring reinstalling and also allows separating sources from "site" files).

In any case, you also need to build a new case (or import one using "code_saturne create --import-only" from within the base directory of a case) for the batch template to be rebuilt.
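
To check whether a modified template has actually been picked up, one option is to compare the #SBATCH lines of the template with those of the generated script. The Python sketch below does this on two in-memory samples; the function names and the sample directives are made up for the example, and in practice the two texts would be read from the template file and from the generated run script.

```python
def sbatch_directives(text):
    """Return the #SBATCH lines found in a script, stripped of whitespace."""
    return [line.strip() for line in text.splitlines()
            if line.strip().startswith("#SBATCH")]


def missing_directives(template_text, script_text):
    """List template directives that do not appear in the generated script."""
    present = set(sbatch_directives(script_text))
    return [d for d in sbatch_directives(template_text) if d not in present]


# Made-up sample contents; the partition name is hypothetical.
template = """#!/bin/bash
#SBATCH --nodes=2
#SBATCH --partition=compute
"""
script = """#!/bin/bash
#SBATCH --nodes=2
"""

missing = missing_directives(template, script)
print(missing)
```

A non-empty result suggests the script was generated from an older template, for example because the case was created before the template was changed.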

Best regards,

Yvan
FredH
Posts: 6
Joined: Thu Nov 15, 2018 11:04 am

Re: Code_Saturne + SLURM: Errors

Post by FredH »

Dear Yvan,

Great news:
my user was finally able to run a SLURM job on many nodes/cores.

He is using the run_solver in ../RESU/RUN_TEST of his case...
Adding #SBATCH options and setting up OpenMPI correctly made it run.

Thanks a lot for your time and your support.

Best regards.