Re: Some problems on BlueGene/Q

Posted: Fri Nov 22, 2013 10:31 am
by zeph67
Hello Yvan,

No, I don't know about the core files, but I'm going to ask the admins. It seems to be a listing of problematic memory accesses, but it is tough to understand.

But I'll first wait for you to run my simple case on the BGQ of EDF.



Thanks a lot & best regards.

Re: Some problems on BlueGene/Q

Posted: Mon Nov 25, 2013 4:31 pm
by Yvan Fournier
Hello,

In the archive of your test case (the one with BGQ/workstation and simple/full combinations), the mesh_input file is missing (it is only a symbolic link, not the target).

I'll try to test once you post it.

Regards,

Yvan

Re: Some problems on BlueGene/Q

Posted: Mon Nov 25, 2013 6:00 pm
by zeph67
Hello Yvan,

Sorry for the mistake. Here are both mesh_input's. I gzip'ed the one for the full case, because the mesh is larger.

Thanks a lot, regards.

Re: Some problems on BlueGene/Q

Posted: Tue Nov 26, 2013 6:17 pm
by Yvan Fournier
Hello,

I still do not understand why you have issues compiling in debug mode, as I was able to reproduce your crash on the simple case in debug mode.

The crash seems to be due to the way the XL Fortran compiler handles pointers assigned to null() when they are passed as arguments to Fortran routines with no explicit interfaces. I am not going to check whether this is an issue with the compiler or an unspecified aspect of the Fortran standard, as we are (very progressively) moving to C anyway.

A workaround, which is ugly, but seems safe, is to map unused pointers to a small non-null target array (actually, it breaks Fortran's non-aliasing rule, but only for unused arrays).
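
To illustrate the idea, here is a minimal sketch of what such a workaround might look like. The names dummy_target, legacy_kernel, work, and use_work are hypothetical and are not taken from the attached file; the sketch only assumes a routine called without an explicit interface that receives an array it may not use.

```fortran
! Sketch only: all names here are hypothetical illustrations.
module unused_ptr_workaround
  implicit none
  ! Small non-null target that unused pointers are mapped to.
  double precision, target, save :: dummy_target(1) = 0.d0
end module unused_ptr_workaround

program demo
  use unused_ptr_workaround
  implicit none
  external :: legacy_kernel          ! called with no explicit interface
  double precision, pointer :: work(:)
  logical :: use_work

  use_work = .false.

  if (use_work) then
    allocate(work(10))
    work = 1.d0
  else
    ! Instead of "work => null()", point at the dummy target so the
    ! actual argument is always associated when the call is made.
    work => dummy_target
  end if

  call legacy_kernel(use_work, work)

  if (use_work) deallocate(work)
end program demo

! External routine with an assumed-size dummy argument.
subroutine legacy_kernel(use_work, work)
  implicit none
  logical :: use_work
  double precision :: work(*)
  if (use_work) then
    print *, 'work(1) =', work(1)
  else
    ! The array is never read or written in this branch.
    print *, 'work not used'
  end if
end subroutine legacy_kernel
```

If several unused pointers are mapped to the same dummy target and passed to one routine, they alias each other, which is why this is only acceptable as long as the routine never touches them.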

I'll push this type of "precautionary" syntax change in trunk, 3.1, and 3.0, so upcoming versions 3.2.0, 3.1.1, and 3.0.2 (very soon) will include the fixes.

In the meantime, you can try using the attached file in your user subroutines.

Regards,

Yvan

Re: Some problems on BlueGene/Q

Posted: Wed Nov 27, 2013 2:15 pm
by zeph67
Hello Yvan,

It finally seems to work, for both cases, with just your modified turrij.f90. Does the change you made in turrij.f90 need to be reproduced in other routines?
Should I also try to rebuild CS for the compute nodes with different combinations of mpicc's and bgxlf's, for as long as it doesn't work? I remember trying several combinations, but maybe not all of them.

The failure of the debug build on the BG/Q also remains a mystery to me.

Many thanks!

Best regards,
Christophe