Installation on cluster

Puneeth
Posts: 8
Joined: Thu Oct 18, 2018 8:37 am

Post by Puneeth » Tue Sep 15, 2020 5:29 pm

Hello,

While testing a fresh installation of Code-Saturne v5.1.6 on a cluster, I get a "Compile or link error", reported in slurm-275114.out.
The error message doesn't say much about what is going wrong or where.
Could you help me debug it?
The compile.log is attached as well.

Thanks and Regards,
Puneeth
Attachments
compile.log
slurm-275114.txt

Yvan Fournier
Posts: 3049
Joined: Mon Feb 20, 2012 3:25 pm

Re: Installation on cluster

Post by Yvan Fournier » Tue Sep 15, 2020 6:11 pm

Hello,

The compile log indicates that the link with libxml2 fails. This is possibly due to the libxml2 dev package not being installed on the compute nodes (i.e. libxml2.so.x is present but the libxml2.so link is not).
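
One way to test this hypothesis is to check, on a compute node, whether the unversioned libxml2.so link sits next to the versioned runtime library. The sketch below is only illustrative: the helper names has_dev_link and report_lib are made up here, and the directory to inspect depends on how libxml2 is installed on the cluster.

```shell
#!/bin/sh
# has_dev_link DIR NAME
# Succeeds if DIR contains the unversioned lib<NAME>.so that dynamic linking
# with "-l<NAME>" looks for; a runtime-only install typically ships only
# lib<NAME>.so.<N>. (Hypothetical helper, not part of code_saturne.)
has_dev_link() {
    dir=$1
    name=$2
    [ -e "$dir/lib$name.so" ]
}

# report_lib DIR NAME: print which variants of the library are present in DIR.
report_lib() {
    dir=$1
    name=$2
    if has_dev_link "$dir" "$name"; then
        echo "lib$name.so found in $dir (link step can use -l$name)"
    elif ls "$dir/lib$name.so."* >/dev/null 2>&1; then
        echo "only versioned lib$name.so.* in $dir (dev package or link missing)"
    else
        echo "no lib$name shared library in $dir"
    fi
}
```

For example, calling `report_lib /usr/lib64 xml2` from inside a batch job would run the check on a compute node rather than on the login node (/usr/lib64 is an assumed location).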

Do you use "code_saturne submit" or the GUI, or do you submit a runcase directly (not recommended, for the above reason)?

You also have compile warnings you should check.

Best regards,

Yvan

Post by Puneeth » Wed Sep 16, 2020 9:06 am

Hi,

Thanks for the reply.

I will check with the cluster admins whether Code_Saturne was installed with the libxml2 option enabled, and ask them to install the libxml2 dev package.

The simulation is run using a batch file.

Best regards,

Puneeth

Post by Puneeth » Wed Sep 16, 2020 2:17 pm

Hello,

There has been an update in the situation.

The admin has recompiled Code-saturne v5.1.6 considering your recommendations about libxml2.
However, the simulations still fail due to another error:
"/gpfslocalsup/pub/code-saturne/5.1.6/libexec/code_saturne/cs_preprocess: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory"
Please find the slurm-xxxx.out file attached, which specifies this error.

Surprisingly, libimf.so is resolved as a dependency of cs_preprocess when we check the output of ldd:

$ ldd /gpfslocalsup/pub/code-saturne/5.1.6/libexec/code_saturne/cs_preprocess
linux-vdso.so.1 (0x00007fff3c184000)
libhdf5.so.10 => /gpfslocalsup/spack_soft/hdf5/1.8.21/intel-19.0.4-ze52g22lxxwb7ezsvxepmcixo6lmotwe/lib/libhdf5.so.10 (0x00007f0468a9a000)
libm.so.6 => /lib64/libm.so.6 (0x00007f0468718000)
libz.so.1 => /lib64/libz.so.1 (0x00007f0468501000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f04682fd000)
libmpifort.so.12 => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib/libmpifort.so.12 (0x00007f0467f3e000)
libmpi.so.12 => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib/release/libmpi.so.12 (0x00007f046704c000)
librt.so.1 => /lib64/librt.so.1 (0x00007f0466e43000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0466c23000)
libgcc_s.so.1 => /gpfslocalsup/spack_soft/gcc/7.3.0/gcc-8.3.1-vqzoua4fyg6e5jiz3vhkpjb4qtofjfrf/lib64/libgcc_s.so.1 (0x00007f0466a0b000)
libc.so.6 => /lib64/libc.so.6 (0x00007f0466648000)
libimf.so => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libimf.so (0x00007f04660a8000)
libsvml.so => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libsvml.so (0x00007f0464704000)
libirng.so => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libirng.so (0x00007f0464392000)
libintlc.so.5 => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007f0464120000)
/lib64/ld-linux-x86-64.so.2 (0x00007f046906b000)
libfabric.so.1 => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/libfabric/lib/libfabric.so.1 (0x00007f0463ee8000)
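
Since the ldd output above resolves everything while the batch job does not, it may help to run the same check from inside the job itself. The following is a minimal sketch under that assumption; missing_libs and check_binary are hypothetical helper names, and the only thing relied on is that ldd marks unresolved dependencies with "=> not found".

```shell
#!/bin/sh
# missing_libs: read `ldd` output on stdin and print the name of every
# dependency reported as "not found". (Hypothetical helper name.)
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# check_binary PATH: run ldd on PATH in the current environment and
# return non-zero if any dependency is unresolved.
check_binary() {
    missing=$(ldd "$1" | missing_libs)
    if [ -n "$missing" ]; then
        echo "unresolved on $(hostname):" $missing
        return 1
    fi
    echo "all dependencies resolved on $(hostname)"
}
```

Adding a call such as `check_binary /gpfslocalsup/pub/code-saturne/5.1.6/libexec/code_saturne/cs_preprocess` to the batch script, just before the run, would show whether libimf.so is still resolved in the job environment.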

Would you happen to have an idea about why this error appears?

Thank you,

Best regards,

Puneeth
Attachments
slurm-278118.txt

Post by Puneeth » Wed Sep 16, 2020 3:15 pm

Hello,

There is also something going wrong with LD_LIBRARY_PATH.
The path printed in the summary doesn't correspond to the path printed on the terminal.

The terminal output for ">echo $LD_LIBRARY_PATH" is:
/gpfslocalsup/spack_soft/scotch/6.0.6/intel-19.0.4-v5fgt76h3qeay6moyrh3w5jmof5kd5mq/lib:
/gpfslocalsup/spack_soft/hdf5/1.8.21/intel-19.0.4-ze52g22lxxwb7ezsvxepmcixo6lmotwe/lib:
/gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64_lin:
/gpfslocalsup/pub/code-saturne/5.1.6/lib:
/gpfslocalsup/spack_soft/libxml2/2.9.9/gcc-8.3.1-oeywxcenymqugus6ctqdzstgjibgnwvj/lib:
/gpfslocalsup/spack_soft/petsc/3.11.3/intel-19.0.4-npalh4bbtqx2n646lssj4yqxzkejhwls/lib:
/gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/libfabric/lib:
/gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib/release:
/gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib:
/gpfslocalsup/spack_soft/metis/5.1.0/intel-19.0.4-2rnvhtykdeapptm3tr5a4qle5y3miact/lib:
/gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin:
/gpfslocalsup/spack_soft/gcc/7.3.0/gcc-8.3.1-vqzoua4fyg6e5jiz3vhkpjb4qtofjfrf/lib64:
/gpfslocalsup/spack_soft/gcc/7.3.0/gcc-8.3.1-vqzoua4fyg6e5jiz3vhkpjb4qtofjfrf/lib:
/gpfslocalsys/slurm/current/lib/slurm:
/gpfslocalsys/slurm/current/lib


But the LD_LIBRARY_PATH in the summary file is different:
LD_LIBRARY_PATH=/gpfslocalsup/spack_soft/libxml2/2.9.9/gcc-8.3.1-oeywxcenymqugus6ctqdzstgjibgnwvj/lib

Could this be the reason why Code-Saturne doesn't find libimf.so?
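
To see exactly which directories are lost, the two values can be compared entry by entry. This is a rough sketch, with path_entries_missing being a hypothetical helper name; it assumes the paths contain no shell glob characters.

```shell
#!/bin/sh
# path_entries_missing FULL REDUCED
# Print each directory that appears in the colon-separated list FULL but
# not in REDUCED, one per line. (Hypothetical helper name; assumes the
# entries contain no glob characters such as * or ?.)
path_entries_missing() {
    full=$1
    reduced=$2
    old_ifs=$IFS
    IFS=:
    for d in $full; do
        [ -n "$d" ] || continue           # skip empty entries
        case ":$reduced:" in
            *":$d:"*) ;;                  # present in both lists
            *) printf '%s\n' "$d" ;;      # only in the full list
        esac
    done
    IFS=$old_ifs
}
```

Called with the terminal value as the first argument and the value from the summary file as the second, it would list every directory except the libxml2 one, including the Intel compiler/lib/intel64_lin directory where ldd found libimf.so.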

Thank you,

Best regards,

Puneeth
Attachments
summary.txt

Post by Yvan Fournier » Thu Sep 17, 2020 6:49 am

Hello,

Yes, this could explain the issue. Were environment modules loaded at install time?

Regards,

Yvan

Post by Puneeth » Thu Sep 17, 2020 8:23 am

Hello,

The installation is based on the following modules:
- intel-compilers/19.0.4
- intel-mpi/2019.4
- intel-mkl/2019.4
- hdf5/1.8.21-mpi
- scotch/6.0.6-mpi
- petsc/3.11.3-mpi
- libxml2/2.9.9

Best,

Puneeth

Post by Yvan Fournier » Thu Sep 17, 2020 10:20 pm

Hello,

Yes, but how is the environment sourced or loaded?

Can you post the config.log?

Regards,

Yvan

Post by Puneeth » Fri Sep 18, 2020 7:59 am

Hello,

Please find the config.log attached herewith.

Best regards,

Puneeth
Attachments
config.log

Post by Yvan Fournier » Sat Sep 19, 2020 5:55 pm

Hello,

Environment modules were detected (and probably loaded) when running the "configure" step.

It is possible that those modules are not loaded correctly at run time (depending on the module system variant), as they are loaded by a Python script.

Starting with code_saturne 5.0.10 (the latest 5.0 release is 5.0.12), there is a "--with-shell-env" configure option that sources a shell environment script first, which might help in your case. If you use it, you can also add --with-modules=no, since you need one mechanism or the other, not both.

To use the --with-shell-env option, first install using --with-shell-env with no path, then copy and adapt the <install_prefix>/bin/code_saturne script so that it loads the modules or environment variables you need, and re-install using --with-shell-env=<path-to-modified-script> (unless everything works on the first pass).
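
As a rough illustration, the lines added to the copied script could look like the fragment below. This is only a sketch: the variable name CS_ENV_MODULES is made up, the module names are the ones listed earlier in this thread, and where exactly the lines belong in the generated script is an assumption, since its contents are not shown here.

```shell
#!/bin/sh
# Sketch of lines that might be added to the copied code_saturne script.
# CS_ENV_MODULES is a hypothetical variable name; the module names are
# those reported for this installation earlier in the thread.
CS_ENV_MODULES="intel-compilers/19.0.4 intel-mpi/2019.4 intel-mkl/2019.4 hdf5/1.8.21-mpi scotch/6.0.6-mpi petsc/3.11.3-mpi libxml2/2.9.9"

if type module >/dev/null 2>&1; then
    # Load each module so that LD_LIBRARY_PATH is set before the solver runs.
    for m in $CS_ENV_MODULES; do
        module load "$m"
    done
else
    # In some batch shells "module" is not defined until the site's module
    # init script has been sourced; its location is site-specific.
    echo "warning: 'module' command not available; environment unchanged" >&2
fi
```

The second configure pass would then use --with-shell-env=<path-to-modified-script> together with --with-modules=no, so that only this mechanism sets up the environment.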

Best regards,

Yvan
