Opened 2 years ago

Last modified 2 years ago

#312 new

MOM decomposition

Reported by: rb4844
Owned by:
Priority: minor
Component: ACCESS-CM2
Keywords: MOM, reproducibility
Cc:


The current ACCESS-CM2 configuration assigns 8 x 12 CPUs to MOM. Changing this to 12 x 8 runs successfully, but the surface temperature results change.

Change History (5)

comment:1 Changed 2 years ago by rb4844

Component: ACCESS model → ACCESS-CM2

comment:2 Changed 2 years ago by Peter Dobrohotoff

8x12 gives a different result from 12x8.

Compiling MOM with -fp-model precise does not fix the issue.


comment:3 Changed 2 years ago by Peter Dobrohotoff

Priority: major → minor

comment:4 Changed 2 years ago by Scott Wales

I believe this is expected behaviour: collective MPI operations are not bit-reproducible across different processor decompositions, because floating-point addition is not associative and different decompositions combine the contributions from the individual ranks in a different order.
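The order dependence described above can be illustrated with a minimal Python sketch. It is not MOM or MPI code: the helper `tiled_sum` is hypothetical, each tile simply stands in for one rank, and the mapping of "8 x 12" to tiles in x and y is an assumption made for illustration.

```python
import random


def tiled_sum(grid, nx_tiles, ny_tiles):
    """Sum a 2-D grid tile by tile, then add up the per-tile partial sums.

    Each tile stands in for one rank; adding the partial sums stands in
    for the collective reduction. Assumes the grid divides evenly.
    """
    ny, nx = len(grid), len(grid[0])
    tile_h, tile_w = ny // ny_tiles, nx // nx_tiles
    total = 0.0
    for j in range(ny_tiles):
        for i in range(nx_tiles):
            partial = 0.0
            for row in grid[j * tile_h:(j + 1) * tile_h]:
                for value in row[i * tile_w:(i + 1) * tile_w]:
                    partial += value
            total += partial
    return total


# Hand-picked values: the exact sum is 2.0, but rounding loses the 1.0
# terms at different points depending on how the row is split.
small_grid = [[1e16, 1.0, -1e16, 1.0]]
print(tiled_sum(small_grid, 2, 1))  # 0.0
print(tiled_sum(small_grid, 4, 1))  # 1.0

# The same comparison on a synthetic 96 x 96 field with the two layouts
# from this ticket (the field itself is made up for illustration).
random.seed(0)
field = [[random.uniform(-1.0, 1.0) * 10.0 ** random.randint(-6, 6)
          for _ in range(96)] for _ in range(96)]
sum_8x12 = tiled_sum(field, 8, 12)
sum_12x8 = tiled_sum(field, 12, 8)
print(sum_8x12 == sum_12x8, sum_8x12 - sum_12x8)
```

Any difference of this kind in a global sum starts at the last bits, but a model that feeds such sums back into its state can amplify it over many time steps.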

From Nic's comment it sounded like there is a compile-time option for using reproducible MPI operations.
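As a rough illustration of how a global sum can be made independent of the decomposition, the sketch below converts each value to a scaled integer before summing, since integer addition is exact and associative. This is one well-known technique and is only an assumption about the general idea; it is not taken from the MOM source and may differ from what the compile-time option actually does. The function name `fixed_point_tiled_sum` and the `frac_bits` parameter are hypothetical.

```python
def fixed_point_tiled_sum(grid, nx_tiles, ny_tiles, frac_bits=40):
    """Tile-by-tile sum using scaled integers instead of floats.

    Each value is rounded once to a multiple of 2**-frac_bits. After that,
    all additions are exact integer additions, so the result does not
    depend on how the grid is split. Assumes the grid divides evenly.
    """
    scale = 2 ** frac_bits
    ny, nx = len(grid), len(grid[0])
    tile_h, tile_w = ny // ny_tiles, nx // nx_tiles
    total = 0  # Python int: arbitrary precision, so no overflow
    for j in range(ny_tiles):
        for i in range(nx_tiles):
            partial = 0
            for row in grid[j * tile_h:(j + 1) * tile_h]:
                for value in row[i * tile_w:(i + 1) * tile_w]:
                    partial += round(value * scale)
            total += partial
    return total / scale


# Values whose plain floating-point sum depends on the split.
grid = [[1e16, 1.0, -1e16, 1.0]]
print(fixed_point_tiled_sum(grid, 2, 1))  # 2.0
print(fixed_point_tiled_sum(grid, 4, 1))  # 2.0
```

The trade-off in this sketch is that contributions smaller than 2**-frac_bits are quantised, and a fixed-width integer implementation would also need to guard against overflow.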

comment:5 Changed 2 years ago by Martin Dix

Suite u-an301 can compare runs across different processor decompositions. The MOM build was modified to set the REPRO flag, so the build options were:

-Duse_netCDF -Duse_netCDF3 -Duse_libMPI -DACCESS -DACCESS_CM  -fpp -Wp,-w  -fno-alias 
-safe-cray-ptr -fpe0 -ftz -assume byterecl -i4 -r8 -traceback -nowarn -check noarg_temp_created 
-assume buffered_io -convert big_endian -O2 -debug minimal -no-vec -fp-model precise 

I followed Marshall's suggestion and added




UM and CICE used the same processor decomposition in each run, while MOM used 8x12 in one run and 12x8 in the other.

The results were different after one day.
