Opened 12 months ago

Last modified 11 months ago

#360 new

MOM should use netCDF-4 with compression

Reported by: Martin Dix
Owned by:
Priority: major
Component: ACCESS-CM2
Keywords:
Cc:

Description

At the moment, both restart and history files are in netCDF-3 format:

% ncdump -k /short/p66/dhb599/archive/av630/restart/ocn/ocean_temp_salt.res.nc-02010630
classic

% ncdump -k /short/p66/dhb599/archive/av630/history/ocn/ocean_month.nc-02010630
64-bit offset
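
To audit a whole archive directory rather than single files, the kind string printed by ncdump -k can be mapped to a format family. This is only a sketch: the helper name nc_family is hypothetical, and it recognises only the four kind strings listed in the case statement.

```shell
#!/bin/sh
# nc_family KIND: map an `ncdump -k` kind string to its format family.
# Hypothetical helper; anything not listed below is reported as "unknown".
nc_family() {
    case "$1" in
        "classic"|"64-bit offset") echo "netCDF-3" ;;
        "netCDF-4"|"netCDF-4 classic model") echo "netCDF-4" ;;
        *) echo "unknown" ;;
    esac
}

# Usage (requires ncdump on PATH):
#   for f in restart/ocn/*; do
#       printf '%s: %s\n' "$f" "$(nc_family "$(ncdump -k "$f")")"
#   done
```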

Using netCDF4 with even the lowest compression level (-d 1) reduces the size of the history file by about 2/3 and of the restart file by about half.

% nccopy -k 3 -d 1 ocean_month.nc-02010630 ocean_month.nc4

reduces size from 3605 MB to 1054 MB

% nccopy -k 3 -d 1 ocean_temp_salt.res.nc-02010630 ocean_temp_salt.res.nc4

reduces size from 82 MB to 37 MB.
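
For reference, the relative savings implied by the sizes quoted above can be worked out with integer shell arithmetic. The helper name pct_saved is hypothetical.

```shell
#!/bin/sh
# pct_saved BEFORE AFTER: percentage size reduction, rounded to the nearest integer.
pct_saved() {
    before=$1
    after=$2
    echo $(( (100 * (before - after) + before / 2) / before ))
}

pct_saved 3605 1054   # ocean_month history file, sizes in MB
pct_saved 82 37       # ocean_temp_salt restart file, sizes in MB
```

This prints 71 for the history file and 55 for the restart file.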

Change History (4)

comment:1 Changed 12 months ago by Martin Dix

It should be simple to change the model and mppnccombine to write compressed netCDF4. Alternatively, the files could just be rewritten in a post-processing step. Using netCDF4-classic with compression should be transparent to MOM when it reads restart files.

Note that this compression is lossless.

comment:2 Changed 12 months ago by Aidan P Heerdegen

This has already been done and is used routinely in production ocean/ice simulations.

FMS can already write netCDF4; it just needs to be enabled at compile time. In the MOM5 compile script

https://github.com/mom-ocean/MOM5/blob/43986e20236cc74b83cd5ee09ebd798f4da40824/exp/MOM_compile.csh

it is exposed as the command-line option --use_netcdf4. Once you have an executable that writes netCDF4 files, the following options added to input.nml set the compression parameters:

 &mpp_io_nml
    deflate_level = 5
    shuffle = 1
/

I recommend using shuffle; it can give up to a 10% improvement "for free". Similarly, I found a deflate level of 4 or 5 to be the sweet spot for compression, but your mileage may vary.
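
If input.nml is edited by hand rather than generated by the suite, the namelist group can be appended idempotently. This is a minimal sketch: the helper name add_mpp_io_nml is hypothetical, and it assumes no mpp_io_nml group with other settings already exists in the file.

```shell
#!/bin/sh
# add_mpp_io_nml FILE [LEVEL]: append an mpp_io_nml group to FILE unless one
# is already present. LEVEL is the deflate level (default 5, as in the ticket).
add_mpp_io_nml() {
    nml=$1
    level=${2:-5}
    if grep -q '^ *&mpp_io_nml' "$nml" 2>/dev/null; then
        echo "mpp_io_nml already present in $nml" >&2
        return 0
    fi
    cat >> "$nml" <<EOF
 &mpp_io_nml
    deflate_level = $level
    shuffle = 1
/
EOF
}

# Usage: add_mpp_io_nml input.nml 5
```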

This compresses both the restarts and the outputs. If your outputs are tiled, they are written out again by mppnccombine, so it needs to compress as well. We have a version with compression enabled here:

https://github.com/mom-ocean/MOM5/blob/43986e20236cc74b83cd5ee09ebd798f4da40824/src/postprocessing/mppnccombine/mppnccombine.c

Compression has to be enabled with command-line options. The full option list is -n4 -z -d 5, but this is equivalent to -z alone: the default deflate level is 5, shuffle is on by default, and netCDF4 output should be enabled whenever compression is specified.
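
As a dry-run sketch, the full command line can be assembled and printed before it is run. The helper name combine_cmd, the MPPNCCOMBINE variable and the tile file names are hypothetical, and the output-file-then-tiles argument order is an assumption about this build of mppnccombine. File names containing spaces are not handled.

```shell
#!/bin/sh
# combine_cmd OUT TILE...: print the mppnccombine command for compressed output.
# MPPNCCOMBINE may point at a specific build; the default name is an assumption.
combine_cmd() {
    out=$1
    shift
    echo "${MPPNCCOMBINE:-mppnccombine} -n4 -z -d 5 $out $*"
}

combine_cmd ocean_month.nc ocean_month.nc.0000 ocean_month.nc.0001
```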

Last edited 12 months ago by Aidan P Heerdegen

comment:3 Changed 12 months ago by Martin Dix

Coupled model suites load the default netcdf module, which is 4.2.1.1.

MOM_compile.csh in access_cm2_drivers has

set cppDefs  = ( "-Duse_netCDF -Duse_netCDF3 -Duse_libMPI -DACCESS -DACCESS_CM" )

Modified this to have -Duse_netCDF4 in place of -Duse_netCDF3.
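
The edit can also be scripted. The sketch below assumes the intended change is from -Duse_netCDF3 to -Duse_netCDF4; the macro name -Duse_netCDF4 is an assumption inferred from the --use_netcdf4 option, and the helper name swap_netcdf_macro is hypothetical.

```shell
#!/bin/sh
# swap_netcdf_macro: rewrite -Duse_netCDF3 to -Duse_netCDF4 in text read from stdin.
# The plain -Duse_netCDF flag is left untouched because the pattern requires the 3.
swap_netcdf_macro() {
    sed -e 's/-Duse_netCDF3/-Duse_netCDF4/g'
}

echo '-Duse_netCDF -Duse_netCDF3 -Duse_libMPI -DACCESS -DACCESS_CM' | swap_netcdf_macro
```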

Built ~access/access_cm2/utils/mppnccombine_nc4 from the version described above and modified mppcombine.sh to use
~access/access-cm2/utils/mppnccombine_nc4 -n4 -z -v -r ...

Use rev 544 of access-cm2-drivers to get these updates.

Last edited 11 months ago by Martin Dix

comment:4 Changed 11 months ago by Martin Dix

Tested in branch nc4 of suite u-aq959, with one-month runs:

% ls -l aq959/history/ocn/
total 954436
-rw-r-----  1 mrd599 p66  26809712 Mar  9 15:40 ocean_daily.nc-00010131
-rw-r-----  1 mrd599 p66 950492160 Mar  9 15:40 ocean_month.nc-00010131
-rw-r-----+ 1 mrd599 p66     28024 Mar  9 15:36 ocean_scalar.nc-00010131
% ls -l aq959-nc4/history/ocn/
total 258252
-rw-r-----  1 mrd599 p66  11830684 Mar  9 15:10 ocean_daily.nc-00010131
-rw-r-----  1 mrd599 p66 252472829 Mar  9 15:11 ocean_month.nc-00010131
-rw-r-----+ 1 mrd599 p66    132629 Mar  9 15:03 ocean_scalar.nc-00010131

% du -h aq959/restart/ocn/
1.1G	aq959/restart/ocn/
% du -h aq959-nc4/restart/ocn/
320M	aq959-nc4/restart/ocn/

nccmp shows that the files are identical apart from format:

% nccmp -s -w format -d aq959/history/ocn/ocean_daily.nc-00010131 aq959-nc4/history/ocn/ocean_daily.nc-00010131
DIFFER : FILE FORMATS : NC_FORMAT_64BIT <> NC_FORMAT_NETCDF4_CLASSIC
Files "aq959/history/ocn/ocean_daily.nc-00010131" and "aq959-nc4/history/ocn/ocean_daily.nc-00010131" are identical.
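
To repeat the check for every file in a directory, the same nccmp options can be wrapped in a loop. This is a sketch under the assumption that nccmp exits with zero status when the files compare as identical; the helper name compare_dirs is hypothetical.

```shell
#!/bin/sh
# compare_dirs REF NEW: run nccmp on each file in REF against its namesake in NEW.
# Prints the names of files that differ or are missing; returns non-zero if any do.
compare_dirs() {
    ref=$1
    new=$2
    status=0
    for f in "$ref"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        if [ ! -f "$new/$name" ]; then
            echo "MISSING $name"
            status=1
        elif ! nccmp -s -w format -d "$f" "$new/$name" >/dev/null 2>&1; then
            echo "DIFFER $name"
            status=1
        fi
    done
    return $status
}

# Usage: compare_dirs aq959/history/ocn aq959-nc4/history/ocn
```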

Suite changes are https://code.metoffice.gov.uk/trac/roses-u/changeset?reponame=&new=71115%40a%2Fq%2F9%2F5%2F9%2Fnc4%2Fapp&old=71108%40a%2Fq%2F9%2F5%2F9%2Fnc4%2Fapp

Essentially, update revision in app/fcm_make_drivers/rose-app.conf and add a new namelist mpp_io_nml to the MOM runtime configuration.

Last edited 11 months ago by Martin Dix