SIGSEGV signal

Questions and remarks about code_saturne usage
fomeh

SIGSEGV signal

Post by fomeh »

Hello everyone
I am trying to run Code_Saturne on a cluster using MPI (MVAPICH2) on 4 processors for a preliminary test. I get an error concerning a SIGSEGV signal. Could someone help me with this? My log file is attached.
Attachments
nameandcase.txt
(16.44 KiB) Downloaded 214 times
fomeh

Re: SIGSEGV signal

Post by fomeh »

An update: I get this error even on 1 processor.
Yvan Fournier
Posts: 4208
Joined: Mon Feb 20, 2012 3:25 pm

Re: SIGSEGV signal

Post by Yvan Fournier »

Hello,

This may be an installation issue, as the code crashes very early on, but parts of the XML file and user subroutines are applied first, so we can't help you much if you don't post those...

Also, it is strange that you have 4 log files: this is not the default, so if you changed a setting, it may be explained, otherwise it is definitely an installation issue...

Regards,

Yvan
fomeh

Re: SIGSEGV signal

Post by fomeh »

Thanks, Yvan.
There is no user subroutine. I made this case study just to verify my .pbs submission file, but it crashes. The case runs very well on my workstation, even in parallel, but I get this error on the cluster.
Yvan Fournier

Re: SIGSEGV signal

Post by Yvan Fournier »

Hello,

Could you still post the XML file?

Also, another test would be to edit run_solver.sh in the execution directory
so as to replace:

--param fm

With:

--quality

and re-run run_solver.sh (possibly adding the BATCH templates from runcase if necessary).

This way, you will only compute quality criteria, and not load the XML file. This may help determine whether the crash is due to the xml file (or libxml2 library installation) or something else.
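
As a rough sketch of the edit described above (not taken from the Code_Saturne scripts themselves), the substitution could be done with a small shell helper. The function name `to_quality_run` is hypothetical, and it assumes the solver line contains the literal text `--param fm`, as in this case:

```shell
# Hypothetical helper: switch a run_solver.sh script from loading the XML
# setup (--param fm) to a quality-criteria-only run (--quality).
to_quality_run() {
    script="$1"

    # Keep a backup so the original launch line can be restored later.
    cp "$script" "$script.orig" || return 1

    # Rewrite the script from the backup, replacing the XML option.
    # Writing into the existing file keeps its execute permission.
    sed 's/--param fm/--quality/' "$script.orig" > "$script"
}
```

From the execution directory, this would be called as `to_quality_run run_solver.sh` before re-running `run_solver.sh`; copying `run_solver.sh.orig` back restores the original command.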

Your trace seems to refer to environment modules. Code_Saturne tries to detect which are loaded at install time, and reload the same modules at run time. Depending on your module command version, this may fail, so you may want to try adding:

--with-modules=no

to the configure line and load modules separately (in your own environment).

We do not have experience with MVAPICH, so if none of the previous tests help, you may want to install a serial only version of the code on the cluster, using all the same tools, but adding --without-mpi to the configure line, just to see if the problem is due to MVAPICH (even on a single rank, it may try to run MPI_Init()). If that is the cause, you then have options in the DATA/cs_user_scripts.py (to be copied from DATA/REFERENCE) to modify parts of the MPI launch command.

Regards,

Yvan
fomeh

Re: SIGSEGV signal

Post by fomeh »

Thanks, Yvan.
I will try these tests.
I have also attached the XML file. The version installed on the cluster has no GUI.
Attachments
fm.txt
(7.49 KiB) Downloaded 291 times
Last edited by fomeh on Fri May 03, 2013 10:41 pm, edited 1 time in total.
Yvan Fournier

Re: SIGSEGV signal

Post by Yvan Fournier »

Hello,

The XML file reads fine on a workstation I tested it on (it fails much later, as I tested it on another mesh, but I just wanted to test the initialization anyway).

Regards,

Yvan
fomeh

Re: SIGSEGV signal

Post by fomeh »

I recompiled the code with GCC and Open MPI on the cluster. It finally works.
Thanks, Yvan, for your valuable help.