Errors of CS 4.0.1 when compiled on ARCHER

iorishx
Posts: 20
Joined: Fri Jun 19, 2015 11:33 am

Post by iorishx »

Hello, everyone.

I have been recompiling CS in my own directory because of the recent library update on ARCHER.

For reference, I followed exactly the summary of the configuration options and the full compile instructions at http://www.archer.ac.uk/documentation/s ... phase2.php.

The initialization stage ran fine: I obtained a results folder with a mesh_input in it.

However, when I tried to run CS, it failed with errors such as:

"Rank 11 [Mon Apr 4 13:39:09 2016] [c0-0c0s1n1] Fatal error in PMPI_Alltoallv: Other MPI error, error stack:
PMPI_Alltoallv(557)............: MPI_Alltoallv(sbuf=0x14091c0, scnts=0x13d8838, sdispls=0x1409340, MPI_BYTE, rbuf=0x8c50f0, rcnts=0x13d8538, rdispls=0x1409940, MPI_BYTE, MPI_COMM_WORLD) failed
MPIR_Alltoallv_impl(380).......:
MPIDI_CRAY_ugni_alltoallv(1373):
MPIU_ugni_wait_rdma_events(412): GNI_CqGetEvent (GNI_RC_SUCCESS)
Rank 9 [Mon Apr 4 13:39:09 2016] [c0-0c0s1n1] Fatal error in PMPI_Alltoallv: Other MPI error, error stack:
PMPI_Alltoallv(557)............: MPI_Alltoallv(sbuf=0x85edd0, scnts=0x1a5a848, sdispls=0x85ef50, MPI_BYTE, rbuf=0x11dbd70, rcnts=0x1a5a548, rdispls=0x85f550, MPI_BYTE, MPI_COMM_WORLD) failed
MPIR_Alltoallv_impl(380).......:
MPIDI_CRAY_ugni_alltoallv(1373):
MPIU_ugni_wait_rdma_events(412): GNI_CqGetEvent (GNI_RC_SUCCESS)
......" (See attached for details)

I am puzzled by these errors. Perhaps an incorrect version of one of the dependencies was loaded.

I also found that the error occurred when the computation was at the step "Partitioning 13476280 cells to 192 domains on 192 ranks (SCOTCH_dgraphPart)." Could it be a SCOTCH problem?

Does anyone know about this?

Sean
Attachments
restart_ARCHER.pbs.txt: computation PBS file (335 Bytes)
init.pbs.txt: initialization PBS file (317 Bytes)
E_A13.e3604253.txt: the error file (81.67 KiB)
Yvan Fournier
Posts: 4070
Joined: Mon Feb 20, 2012 3:25 pm

Re: Errors of CS 4.0.1 when compiled on ARCHER

Post by Yvan Fournier »

Hello,

This is probably an installation issue, but to check whether the problem is simply due to PT-Scotch, you can force a different partitioning scheme, using either the performance tuning tab in the GUI or the user functions in cs_user_performance_tuning.c.
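
As a rough sketch of the second option, the fragment below switches the main partitioning stage from PT-Scotch to a Morton space-filling curve. It assumes the cs_user_partition hook and the cs_partition_set_algorithm call as they appear in the reference cs_user_performance_tuning.c shipped with the 4.0 series. The include list is shortened here, and the rank step and periodicity arguments are illustrative choices, so compare everything against the template in your own installation before using it.

```c
/* Sketch only: to be placed in the case's SRC directory as
   cs_user_performance_tuning.c. Function and enum names are assumed
   to match the reference user file of the installed version. */

#include "cs_defs.h"

#include "cs_partition.h"
#include "cs_prototypes.h"

void
cs_user_partition(void)
{
  /* Replace the default graph-based partitioner (PT-Scotch here) with a
     Morton space-filling curve for the main partitioning stage.
     Arguments: stage, algorithm, rank step, ignore periodicity. */
  cs_partition_set_algorithm(CS_PARTITION_MAIN,
                             CS_PARTITION_SFC_MORTON_BOX,
                             1,      /* rank step: use every rank (assumed choice) */
                             false); /* do not ignore periodicity (assumed choice) */
}
```

If the run gets past the partitioning step with this setting, PT-Scotch (or the way it was built or linked) becomes the likely culprit. If the same MPI_Alltoallv failure still appears, the installation or the MPI environment is the more probable cause.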

Regards,

Yvan