Installation on cluster

All questions about installation
Puneeth
Posts: 11
Joined: Thu Oct 18, 2018 8:37 am

Installation on cluster

Post by Puneeth »

Hello,

While testing a fresh installation of Code-Saturne v5.1.6 on a cluster, I get a compile or link error, shown in slurm-275114.out.
The error message does not say much about what is going wrong or where.
Could you help me debug it?
The compile.log is attached as well.

Thanks and Regards,
Puneeth
Attachments
compile.log
(29.25 KiB) Downloaded 203 times
slurm-275114.txt
(1.96 KiB) Downloaded 202 times
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Installation on cluster

Post by Yvan Fournier »

Hello,

The compile log indicates that the link with libxml2 fails. This is possibly because the libxml2 dev package is missing on the compute nodes (i.e. libxml2.so.x is present but not the libxml2.so link).
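
A quick way to test this hypothesis is to look for the unversioned link directly on a compute node. The following is only a minimal sketch: has_dev_link is a hypothetical helper name, and the directory to inspect depends on where libxml2 is installed on your system.

```shell
# Hypothetical helper: succeeds if the unversioned "<name>.so" link
# (normally provided by the dev package) exists in the given directory.
has_dev_link() {
    dir=$1
    name=$2
    [ -e "$dir/$name.so" ]
}

# Example use (the directory is an assumption, adapt it to your install):
# has_dev_link /usr/lib64 libxml2 && echo "dev link present" || echo "dev link missing"
```

If only libxml2.so.x files are present in that directory, the check fails, which would match the link error.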

Do you use "code_saturne submit" or the GUI, or do you submit a runcase directly (not recommended, for the above reason)?

You also have compile warnings you should check.

Best regards,

Yvan
Puneeth
Posts: 11
Joined: Thu Oct 18, 2018 8:37 am

Re: Installation on cluster

Post by Puneeth »

Hi,

Thanks for the reply.

I will check with the cluster admins whether Code_Saturne was installed with the libxml2 option enabled, and ask them to install the libxml2 dev package.

The simulation is run using a batch file.

Best regards,

Puneeth
Last edited by Puneeth on Wed Sep 16, 2020 3:28 pm, edited 1 time in total.
Puneeth
Posts: 11
Joined: Thu Oct 18, 2018 8:37 am

Re: Installation on cluster

Post by Puneeth »

Hello,

There has been an update in the situation.

The admin has recompiled Code-Saturne v5.1.6, taking into account your recommendations about libxml2.
However, the simulations still fail, now with a different error:
"/gpfslocalsup/pub/code-saturne/5.1.6/libexec/code_saturne/cs_preprocess: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory"
Please find the slurm-xxxx.out file attached, which specifies this error.

Surprisingly, ldd does resolve libimf.so as a dependency of cs_preprocess:

$ ldd /gpfslocalsup/pub/code-saturne/5.1.6/libexec/code_saturne/cs_preprocess
linux-vdso.so.1 (0x00007fff3c184000)
libhdf5.so.10 => /gpfslocalsup/spack_soft/hdf5/1.8.21/intel-19.0.4-ze52g22lxxwb7ezsvxepmcixo6lmotwe/lib/libhdf5.so.10 (0x00007f0468a9a000)
libm.so.6 => /lib64/libm.so.6 (0x00007f0468718000)
libz.so.1 => /lib64/libz.so.1 (0x00007f0468501000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007f04682fd000)
libmpifort.so.12 => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib/libmpifort.so.12 (0x00007f0467f3e000)
libmpi.so.12 => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib/release/libmpi.so.12 (0x00007f046704c000)
librt.so.1 => /lib64/librt.so.1 (0x00007f0466e43000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0466c23000)
libgcc_s.so.1 => /gpfslocalsup/spack_soft/gcc/7.3.0/gcc-8.3.1-vqzoua4fyg6e5jiz3vhkpjb4qtofjfrf/lib64/libgcc_s.so.1 (0x00007f0466a0b000)
libc.so.6 => /lib64/libc.so.6 (0x00007f0466648000)
libimf.so => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libimf.so (0x00007f04660a8000)
libsvml.so => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libsvml.so (0x00007f0464704000)
libirng.so => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libirng.so (0x00007f0464392000)
libintlc.so.5 => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007f0464120000)
/lib64/ld-linux-x86-64.so.2 (0x00007f046906b000)
libfabric.so.1 => /gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/libfabric/lib/libfabric.so.1 (0x00007f0463ee8000)

Would you happen to have an idea about why this error appears?
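
Note that ldd resolves libraries using the environment of the shell it is run in, so the result above only describes the interactive session. One way to narrow this down might be to run the same check from inside the batch job and list only the unresolved entries. This is a minimal sketch; missing_libs is a hypothetical helper name, not part of Code-Saturne.

```shell
# Reads "ldd" output on stdin and prints the names of the libraries
# that the dynamic loader could not resolve ("=> not found" entries).
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example use inside the batch script, before the solver is launched:
# ldd /gpfslocalsup/pub/code-saturne/5.1.6/libexec/code_saturne/cs_preprocess | missing_libs
```

If libimf.so shows up in that list within the job but not on the terminal, the two environments differ.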

Thank you,

Best regards,

Puneeth
Attachments
slurm-278118.txt
(1.31 KiB) Downloaded 200 times
Puneeth
Posts: 11
Joined: Thu Oct 18, 2018 8:37 am

Re: Installation on cluster

Post by Puneeth »

Hello,

There is also something going wrong with LD_LIBRARY_PATH.
The path printed in the summary doesn't correspond to the path printed on the terminal.

The terminal output for ">echo $LD_LIBRARY_PATH" is:
/gpfslocalsup/spack_soft/scotch/6.0.6/intel-19.0.4-v5fgt76h3qeay6moyrh3w5jmof5kd5mq/lib:
/gpfslocalsup/spack_soft/hdf5/1.8.21/intel-19.0.4-ze52g22lxxwb7ezsvxepmcixo6lmotwe/lib:
/gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64_lin:
/gpfslocalsup/pub/code-saturne/5.1.6/lib:
/gpfslocalsup/spack_soft/libxml2/2.9.9/gcc-8.3.1-oeywxcenymqugus6ctqdzstgjibgnwvj/lib:
/gpfslocalsup/spack_soft/petsc/3.11.3/intel-19.0.4-npalh4bbtqx2n646lssj4yqxzkejhwls/lib:
/gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/libfabric/lib:
/gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib/release:
/gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib:
/gpfslocalsup/spack_soft/metis/5.1.0/intel-19.0.4-2rnvhtykdeapptm3tr5a4qle5y3miact/lib:
/gpfslocalsys/intel/parallel_studio_xe_2019_update4_cluster_edition/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin:
/gpfslocalsup/spack_soft/gcc/7.3.0/gcc-8.3.1-vqzoua4fyg6e5jiz3vhkpjb4qtofjfrf/lib64:
/gpfslocalsup/spack_soft/gcc/7.3.0/gcc-8.3.1-vqzoua4fyg6e5jiz3vhkpjb4qtofjfrf/lib:
/gpfslocalsys/slurm/current/lib/slurm:
/gpfslocalsys/slurm/current/lib


But the LD_LIBRARY_PATH in the summary file is different:
LD_LIBRARY_PATH=/gpfslocalsup/spack_soft/libxml2/2.9.9/gcc-8.3.1-oeywxcenymqugus6ctqdzstgjibgnwvj/lib

Could this be the reason why Code-Saturne doesn't find libimf.so?
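
To see exactly which directories are missing, the two values can be compared entry by entry. The following is a minimal sketch in POSIX shell; path_diff is a hypothetical helper, and the second argument would have to be copied by hand from the summary file.

```shell
# Print each directory of the first colon-separated path list
# that does not appear in the second one.
path_diff() {
    printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r dir; do
        # Skip empty entries (e.g. from a trailing colon).
        [ -n "$dir" ] || continue
        case ":$2:" in
            *":$dir:"*) ;;                 # present in both lists
            *) printf '%s\n' "$dir" ;;     # only in the first list
        esac
    done
}

# Example use (the second value is the one shown in the summary file):
# path_diff "$LD_LIBRARY_PATH" "/gpfslocalsup/spack_soft/libxml2/2.9.9/gcc-8.3.1-oeywxcenymqugus6ctqdzstgjibgnwvj/lib"
```

With the values above, the Intel compiler directory containing libimf.so would be among the entries printed.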

Thank you,

Best regards,

Puneeth
Attachments
summary.txt
(24.38 KiB) Downloaded 199 times
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Installation on cluster

Post by Yvan Fournier »

Hello,

Yes, this could explain the issue. Do you have environment modules loaded at install time?

Regards,

Yvan
Puneeth
Posts: 11
Joined: Thu Oct 18, 2018 8:37 am

Re: Installation on cluster

Post by Puneeth »

Hello,

The installation is based on the following modules:
-intel-compilers/19.0.4
-intel-mpi/2019.4
-intel-mkl/2019.4
-hdf5/1.8.21-mpi
-scotch/6.0.6-mpi
-petsc/3.11.3-mpi
-libxml2/2.9.9

Best,

Puneeth
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Installation on cluster

Post by Yvan Fournier »

Hello,

Yes, but how is the environment sourced/loaded?

Can you post the config.log?

Regards,

Yvan
Puneeth
Posts: 11
Joined: Thu Oct 18, 2018 8:37 am

Re: Installation on cluster

Post by Puneeth »

Hello,

Please find the config.log attached herewith.

Best regards,

Puneeth
Attachments
config.log
(384.56 KiB) Downloaded 219 times
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Installation on cluster

Post by Yvan Fournier »

Hello,

Environment modules were detected (and probably loaded) when running the "configure" step.

It is possible that those modules are not loaded correctly when running (depending on the module system variant), as they are loaded by a Python script.

Starting with code_saturne 5.0.10 (the latest 5.0 release is 5.0.12), there is a "--with-shell-env" configure option that lets you source a shell environment script first, which might help in your case. If you do this, you can also add --with-modules=no, since you need one mechanism or the other, not both.

To use the --with-shell-env option, first install using --with-shell-env with no path, then copy and adapt the <install_prefix>/bin/code_saturne script so that it loads the modules or environment variables you need, and re-install using --with-shell-env=<path-to-modified-script> (unless everything already works on the first pass).
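
As a rough illustration of the two passes, here is a dry-run sketch that only prints the configure command for each pass instead of running it. The configure_cmd helper and the example paths are hypothetical, --prefix is the usual autoconf option, and any other options your build needs (compilers, libraries) are left out.

```shell
# Dry-run helper: prints the configure command for a given pass.
#   $1: install prefix
#   $2: path to the adapted environment script (empty on the first pass)
configure_cmd() {
    prefix=$1
    env_script=$2
    if [ -n "$env_script" ]; then
        # Second pass: point to the modified script.
        printf '%s\n' "./configure --prefix=$prefix --with-modules=no --with-shell-env=$env_script"
    else
        # First pass: enable the mechanism without giving a path.
        printf '%s\n' "./configure --prefix=$prefix --with-modules=no --with-shell-env"
    fi
}

# Example use (paths are placeholders):
# configure_cmd /path/to/install ""
# configure_cmd /path/to/install /path/to/modified/code_saturne
```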

Best regards,

Yvan