This page explains how to submit cases on a cluster using the SLURM resource manager with the studymanager tool.
In order to activate the submission of cases on cluster with SLURM, it is necessary to add the --submit command line.
It is also possible to specify the two following options:
Remark: In order to prevent a large number of submissions on cluster with a single case per batch, the total number of cases with an expected time lower than 5 minutes is counted. If this number is greater than 50, --slurm-batch-size is set to 50.
For instance, the following commands
will submit one case per batch. The total number of submissions will be the number of cases defined in smgr.xml plus the final analysis.
will submit batch of cases defined in sample.xml with a maximum of 20 cases per batch and a maximum total computation time of 5 hours. All cases are automatically sorted by number of required processes so that the number of tasks per batch is the same. The number of cases per batch could then be inferior to 20 if the total computation time exceeds 5h or if the number of cases with the same configuration (i.e. number of tasks) is limiting.
In order to compute the total computation time per batch, it is necessary to specify an expected computation time per case (HH:MM) otherwise the default value (3 hours) will be applied.
The expected computation time can be specified in the SMGR xml file.
In this case, the run Grid1 has an expected computation time of 2 hours and 15 minutes and the one of Grid2 is 5 hours.
It can also be specified in the run.cfg file a given case.
In this case, all runs have an expected computation time of 30 minutes, except the one of the run mesh2 that is 2 hours.
Remark: As for the number of process (n_procs), the expected time specified in SMGR xml file dominates over those specified in run.cfg files.
SLURM batch files are automatically generated in the folder slurm_files in destination. The following file is an example of a SLURM file used with the SLURM batch mode.
Batch cases which require 6 or more processes will be executed in exclusive mode (i.e. no other submission will run on the node).
Additional SLURM batch parameters can be also specified at run time using the --slurm-batch-arg option. This option only takes into account one argument at a time. For example, to add the "exclusive" and send an e-mail notification use the following command-line option: --slurm-batch-arg=--exclusive --slurm-batch-arg=--mail-user=name.last@email.com
All ouput and error files are also in the folder slurm_files in destination.
Job-dependencies are defined automatically such that blocks of dependency level M will wait until all required blocks of level M-1 are successfully finished.
Three methods are available to define a dependency between cases:
Dependencies defined using a <depends> node have priority over those deduced from parametric arguments. They both have priority over restarts in code_saturne data settings.
In the rare cases where dependencies are not related to restarts, the <depends> approach allow fine-grained control. In other cases, dependencies are deduced from restart definitions, so no additional user settings are needed.
The state analysis is automatically added with the slurm batch mode. This final batch will depend on all previous submissions. It can also include postprocessing and comparison steps if these options are activated.
In the following example, a list of 8 cases is launched in SLURM batch mode:
Here are some explanations on cases allocation per batch :