Changes between Version 16 and Version 17 of access/SMOCS


Timestamp: Jul 18, 2018 9:16:12 AM
Author: Lawrence Rikus
Comment: Revision for July 2018

  • access/SMOCS

    v16 v17  
    1010  * Structure software so specific directory/file structure is at upper levels so that the underlying processes are as general as possible.
    1111  * Develop tools to interrogate the collected data for various purposes
    12       * Quantify model health (e.g. do the statistics stray from expected ranges)
     12      * Quantify model health (e.g. do the statistics stray from expected ranges?)
    1313      * Evaluate the differences between models
    1414          - for model upgrades
     
    2424== Collection stage ==
    2525
    26    The collection scripts cycle over the output files from each model run. They can either be run at the end of each model output step or over a directory containing a set of output files. This means they can be used to collect statistics for an operational model by being run over the operational output directory or on the model output archive at the NCI (rr4). Currently the input format has to be netcdf or UM pp-fieldsfiles so grib archives need translation before they can be 'collected'. The scripts read each file and try to read all the fields on each file. The mean, standard deviation and maximum and minimum values are calculated over the entire domain as well as over the sea and land separately and saved for each model output time step. The positions in the field of the maximum and minimum are also saved. The results are saved as a json file for each model run and these in turn are saved in a directory structure which separates each month. As well as the general collection scripts there are separate 'utility collection' scripts. These calculate fields of interest which are not output by the model (such as the hourly rainfall accumulations or overall budget terms) and add them to the forecast summary json file. For 3D fields the results are saved for the individual levels as well as the full 3D domain.
     26   The collection scripts cycle over the output files from each model run. They can either be run at the end of each model output step or over a directory containing a set of output files. This means they can be used to collect statistics for an operational model by being run over the operational output directory or on the model output archive at the NCI (lb4). Currently the input format has to be netcdf or UM fieldsfiles, so grib archives need translation before they can be 'collected'. The scripts read each file and try to read all the fields in each file. The mean, standard deviation, maximum and minimum values are calculated over the entire domain, as well as over the sea and land separately, and saved for each model output time step. The positions in the field of the maximum and minimum are also saved. For 3D fields the results are saved for the individual levels as well as the full 3D domain.
     27Certain fields have the option of collecting either a percentile or histogram (bin) table, which effectively bins values together with their coordinates.
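As an illustration of the idea only (this is not the actual binning code; the helper names and bin edges are invented, and a 2-D lat/lon field is assumed), a rough sketch:

{{{#!python
# Illustration only: one possible form of the optional histogram (bin) table.
# Assumes field has shape (nlat, nlon); edges is a sequence of bin edges.
import numpy as np

def bin_table(field, lats, lons, edges):
    """Count values per bin and keep one sample (lat, lon) for each bin."""
    flat = np.asarray(field).ravel()
    counts, _ = np.histogram(flat, bins=edges)
    which = np.digitize(flat, edges) - 1          # bin index of every point
    samples = {}
    for b in range(len(edges) - 1):
        idx = np.flatnonzero(which == b)
        if idx.size:
            j, i = np.unravel_index(idx[0], np.shape(field))
            samples[b] = (float(lats[j]), float(lons[i]))
    return {"edges": list(edges), "counts": counts.tolist(), "samples": samples}

def percentile_table(field, levels=(1, 5, 25, 50, 75, 95, 99)):
    """Percentile values over the whole field."""
    return {int(p): float(np.percentile(field, p)) for p in levels}
}}}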
     28
     29
     30The results are saved as a json file for each model run and these in turn are saved in a directory structure which separates each month. As well as the general collection scripts there are separate 'utility collection' scripts. These calculate fields of interest which are not output by the model (such as the hourly rainfall accumulations or overall budget terms) and add them to the forecast summary json file.
     31
     32Currently the I/O engine is based on cdms, a Python library that reads in netcdf or UM fieldsfiles and uses a netcdf variable API.
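To illustrate what the collection step produces, here is a minimal sketch of gathering whole-domain statistics for every field in a run and dumping them to a json summary. It is not the real collector (which sits on cdms); it uses the netCDF4 library instead, and the directory, file and field names are hypothetical.

{{{#!python
# Illustrative only: the real collector uses cdms; this sketch uses the
# netCDF4 library and hypothetical file/directory names.
import json
import os
import numpy as np
from netCDF4 import Dataset

def field_stats(data):
    """Whole-domain mean, std, max/min and flattened positions of the extremes."""
    flat = np.ma.masked_invalid(data).ravel()
    return {"mean": float(flat.mean()),
            "std": float(flat.std()),
            "max": float(flat.max()),
            "min": float(flat.min()),
            "max_pos": int(flat.argmax()),
            "min_pos": int(flat.argmin())}

def collect_file(path, summary):
    """Append statistics for every multi-dimensional field in one output file."""
    with Dataset(path) as ds:
        for name, var in ds.variables.items():
            if var.ndim < 2:              # skip coordinate variables
                continue
            summary.setdefault(name, []).append(field_stats(var[:]))

summary = {}
for fname in sorted(os.listdir("run_output")):        # hypothetical directory
    if fname.endswith(".nc"):
        collect_file(os.path.join("run_output", fname), summary)

os.makedirs("summaries/201807", exist_ok=True)        # one directory per month
with open("summaries/201807/run_summary.json", "w") as fh:  # one json per run
    json.dump(summary, fh, indent=2)
}}}

The land/sea split and the per-level statistics for 3D fields would be handled the same way, by masking or slicing the field before computing the statistics.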
    2733
    2834== Harvesting Programs ==
     
    3440'''bodoView.py''' - a simple json tree viewer for looking at individual SMOCS json files. Available in the Tools svn subdirectory or from my python directory on all machines.
    3541
     42'''smocsView.py''' - an updated version of bodoView.
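For reference, the core of such a json tree viewer is only a few lines; the sketch below is illustrative only and is not the bodoView/smocsView code.

{{{#!python
# Illustration only; not the actual bodoView/smocsView code.
import json
import sys

def show(node, indent=0):
    """Recursively print the key structure of a SMOCS summary json file."""
    pad = "  " * indent
    if isinstance(node, dict):
        for key, value in node.items():
            print(pad + str(key))
            show(value, indent + 1)
    elif isinstance(node, list):
        print(pad + "[list, %d items]" % len(node))
    else:
        print(pad + repr(node))

if __name__ == "__main__":
    with open(sys.argv[1]) as fh:   # path to a SMOCS json file
        show(json.load(fh))
}}}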
     43
    3644=== Known Bugs ===
    3745  * Name translation errors
    38      * accumulated precip is identified as accum_prcp_rate_pa, which propagates through the rate calculations
    39      * maximum screen temperature is identified as temp_scrn
     46     * New fields not in the translation table are identified by their STASH id.
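The behaviour described in the last item amounts to a dictionary lookup with a fallback to the STASH id. A hypothetical sketch (the table entries are invented, not taken from the real translation table):

{{{#!python
# Hypothetical illustration of the name-translation step; the entries below
# are made-up examples, not the real translation table.
STASH_TO_NAME = {
    "m01s03i236": "temp_scrn",    # example entry only
    "m01s05i226": "accum_prcp",   # example entry only
}

def translate(stash_id):
    """Return the collection name for a field, falling back to the STASH id."""
    return STASH_TO_NAME.get(stash_id, stash_id)
}}}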
    4047
    4148
     
    4350
    4451  * Energy integral calculation for model monitoring purposes.
    45   * Add in percentile binning to augment/replace the binning process.
    4652  * Submodel domain generation (e.g. r,c in g; c in r; tc in g?)
    4753  * Sub-region (e.g. NH, SH, tropics etc) for global domains.
    4854  * Efficiency improvements:
    49        * Speed up binning routines
    50        * parallelize in python
     55       * Speed up binning/percentile routines
     56  * Investigate a move to iris and/or mule
    5157 
    5258= File Locations =