Opened 18 months ago

Last modified 17 months ago

#354 accepted

Test coupled model with Intel MPI

Reported by: Martin Dix
Owned by: Martin Dix
Priority: major
Component: ACCESS-CM2
Keywords:
Cc:

Description

Test whether this resolves the "too many retries" problems at start up.

Change History (3)

comment:1 Changed 18 months ago by Martin Dix

Owner: set to Martin Dix
Status: changed from new to accepted

Built oasis3-mct using

module load intel-fc/15.0.1.133
module load intel-cc/15.0.1.133
module load intel-mkl/15.0.1.133
module load intel-mpi/5.1.3.210
module load netcdf/4.3.2

Created module oasis3-mct-local/intelmpi.5.1.3.210

oasis3_tutorial test works.

Intel mpirun doesn't support the -wd argument; it uses -wdir instead. OpenMPI also accepts -wdir, so the same launch line works with both.
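
A dry-run sketch of what the launch line might look like with a per-executable working directory set through -wdir. The directory names, executable names and process counts here are hypothetical placeholders rather than values from the suite, and the function only prints the command instead of running it.

```shell
#!/bin/sh
# Build an MPMD mpirun command line from "dir:nproc:exe" specs.
# All names and counts passed in below are hypothetical examples.
launch_cmd() {
    cmd="mpirun"
    sep=""
    for spec in "$@"; do
        dir=${spec%%:*}      # text before the first colon
        rest=${spec#*:}      # text after the first colon
        nproc=${rest%%:*}    # process count
        exe=${rest#*:}       # executable path
        # -wdir is accepted by both Intel MPI and OpenMPI; -wd is not
        cmd="$cmd$sep -wdir $dir -n $nproc $exe"
        sep=" :"
    done
    printf '%s\n' "$cmd"
}

# Print (do not execute) a three-executable launch line
launch_cmd atmosphere:560:./um.exe ocean:112:./mom.exe ice:28:./cice.exe
```

Printing the command first makes it easy to compare the Intel MPI and OpenMPI launch lines before submitting a job.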

The CICE build script uses mpifort, which isn't set up in the intel-mpi environment. Use mpif90 instead.
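
One way a build script could cope with either environment is to pick the first MPI Fortran wrapper found on the PATH. This is only a sketch: the function name pick_wrapper and the variable FC are assumptions for illustration, not names taken from the CICE build script.

```shell
#!/bin/sh
# Print the first command from the argument list that exists on PATH.
# Returns non-zero if none of them is available.
pick_wrapper() {
    for w in "$@"; do
        if command -v "$w" >/dev/null 2>&1; then
            printf '%s\n' "$w"
            return 0
        fi
    done
    return 1
}

# Prefer mpifort, fall back to mpif90 (FC is a hypothetical variable name)
FC=$(pick_wrapper mpifort mpif90) || echo "no MPI Fortran wrapper found" >&2
```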

Intel mpirun doesn't support use of rankfiles or hostfiles for each executable. Use a single hostfile for the whole job (which is probably redundant).
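
A sketch of how a single job-wide machine file could be derived from the PBS node file, assuming the node file has one line per allocated slot and that the host:count machine file format is wanted. The function name and the output file name machinefile.txt are hypothetical.

```shell
#!/bin/sh
# Collapse a node file (one hostname per slot) into "host:count" lines.
make_machinefile() {
    sort "$1" | uniq -c | awk '{ print $2 ":" $1 }'
}

# Only write the file when running inside a PBS job
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    make_machinefile "$PBS_NODEFILE" > machinefile.txt
    # machinefile.txt could then be passed to mpirun with -machinefile
fi
```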

Add access-coupled-intelmpi and access-atmos-intelmpi scripts.

Test suite u-au795 is a copy of u-aq959.

At runtime the suite needs to load modules that the normal suite requires only at build time:

module load intel-fc/17.0.1.132
module load libpng
module load openjpeg
module load zlib

I don't yet understand why this is needed.

Last edited 18 months ago by Martin Dix

comment:2 Changed 18 months ago by Martin Dix

Three-month run on Broadwell: UM 28x20, MOM 8x14, CICE 28 cores. UM timings are reported so that any PBS delays are excluded.

Intel MPI (u-au795)
Maximum Elapsed Wallclock Time: 5763.74
Rerun
Maximum Elapsed Wallclock Time: 6650.37

OpenMPI 1.10.2 (u-aq959)
Maximum Elapsed Wallclock Time: 5074.78

Results were identical.
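
For reference, the relative slowdown of each Intel MPI run against the OpenMPI run can be worked out from the wallclock figures above. A small sketch, with slowdown_pct as a hypothetical helper name:

```shell
#!/bin/sh
# Percentage by which time $1 exceeds reference time $2.
slowdown_pct() {
    awk -v t="$1" -v ref="$2" 'BEGIN { printf "%.1f\n", (t / ref - 1) * 100 }'
}

slowdown_pct 5763.74 5074.78   # first Intel MPI run: prints 13.6
slowdown_pct 6650.37 5074.78   # Intel MPI rerun: prints 31.0
```

The spread between the two Intel MPI runs is larger than the gap between the first run and OpenMPI, so a single pair of runs may not be enough to rank the two libraries.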

Last edited 18 months ago by Martin Dix

comment:3 Changed 17 months ago by Martin Dix

Component: changed from ACCESS model to ACCESS-CM2