Pack/compress netCDF data ("add offset" and "scale factor") with CDO, NCO or similar


I have large netCDF files stored in 64-bit floating-point precision. I would like to pack them using specific values for the add_offset and scale_factor attributes, so that the data can be stored as 16-bit short integers (I16). I have found information on unpacking with CDO operators, but nothing on packing.

Any help? Thank you in advance!

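Edit: this is what I get when I try ncap2 with pack_short(). The resulting scale_factor and add_offset are not the values I passed in: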

diego@LAcompu:~/new$ ncks -m in.nc
netcdf in {
  dimensions:
    bnds = 2 ;
    lat = 202 ;
    lon = 62 ;
    time = UNLIMITED ; // (15777 currently)

  variables:
    float lat(lat) ;
      lat:standard_name = "latitude" ;
      lat:long_name = "latitude" ;
      lat:units = "degrees_north" ;
      lat:axis = "Y" ;

    float lon(lon) ;
      lon:standard_name = "longitude" ;
      lon:long_name = "longitude" ;
      lon:units = "degrees_east" ;
      lon:axis = "X" ;

    double t2m(time,lat,lon) ;
      t2m:long_name = "2 metre temperature" ;
      t2m:units = "Celsius" ;
      t2m:_FillValue = -32767. ;
      t2m:missing_value = -32767. ;

    double time(time) ;
      time:standard_name = "time" ;
      time:long_name = "time" ;
      time:bounds = "time_bnds" ;
      time:units = "hours since 1900-01-01 00:00:00.0" ;
      time:calendar = "gregorian" ;
      time:axis = "T" ;

    double time_bnds(time,bnds) ;
} // group /
diego@LAcompu:~/new$ ncap2 -v -O -s 't2m=pack_short(t2m,0.00166667,0.0);' in.nc out.nc
ncap2: WARNING pack_short(): Function has been called with more than one argument
diego@LAcompu:~/new$ ncks -m out.nc
netcdf out {
  dimensions:
    lat = 202 ;
    lon = 62 ;
    time = UNLIMITED ; // (15777 currently)

  variables:
    float lat(lat) ;
      lat:standard_name = "latitude" ;
      lat:long_name = "latitude" ;
      lat:units = "degrees_north" ;
      lat:axis = "Y" ;

    float lon(lon) ;
      lon:standard_name = "longitude" ;
      lon:long_name = "longitude" ;
      lon:units = "degrees_east" ;
      lon:axis = "X" ;

    short t2m(time,lat,lon) ;
      t2m:scale_factor = -0.000784701646794361 ;
      t2m:add_offset = -1.01787074416207 ;
      t2m:_FillValue = -32767s ;
      t2m:long_name = "2 metre temperature" ;
      t2m:missing_value = -32767. ;
      t2m:units = "Celsius" ;

    double time(time) ;
      time:standard_name = "time" ;
      time:long_name = "time" ;
      time:bounds = "time_bnds" ;
      time:units = "hours since 1900-01-01 00:00:00.0" ;
      time:calendar = "gregorian" ;
      time:axis = "T" ;
} // group /
There are 2 answers

Charlie Zender (BEST ANSWER)

NCO can pack automatically, choosing optimal values for scale_factor and add_offset, with, e.g.,

ncpdq -P in.nc out.nc

and you can add lossless compression as well with

ncpdq -P -L 1 -7 in.nc out.nc

Documentation at http://nco.sf.net/nco.html#ncpdq

ncap2 also accepts specific values of scale_factor and add_offset for per-variable packing, using the pack() function documented at http://nco.sf.net/nco.html#ncap_mth

Demonstration:

zender@spectral:~$ ncap2 -v -O -s 'rec_pck=pack(three_dmn_rec_var,-0.001,40.0);' ~/nco/data/in.nc ~/foo.nc
zender@spectral:~$ ncks -m ~/foo.nc
netcdf foo {
  dimensions:
    lat = 2 ;
    lon = 4 ;
    time = UNLIMITED ; // (10 currently)

  variables:
    float lat(lat) ;
      lat:long_name = "Latitude (typically midpoints)" ;
      lat:units = "degrees_north" ;
      lat:bounds = "lat_bnd" ;

    float lon(lon) ;
      lon:long_name = "Longitude (typically midpoints)" ;
      lon:units = "degrees_east" ;

    short rec_pck(time,lat,lon) ;
      rec_pck:scale_factor = -0.001f ;
      rec_pck:add_offset = 40.f ;
      rec_pck:_FillValue = -99s ;
      rec_pck:long_name = "three dimensional record variable" ;
      rec_pck:units = "watt meter-2" ;

    double time(time) ;
      time:long_name = "time" ;
      time:units = "days since 1964-03-12 12:09:00 -9:00" ;
      time:calendar = "gregorian" ;
      time:bounds = "time_bnds" ;
      time:climatology = "climatology_bounds" ;
} // group /
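
For readers wondering what those two attributes do to the stored numbers, here is a minimal Python sketch of the usual netCDF packing convention, unpacked = stored * scale_factor + add_offset. It is an illustration only and not NCO's code: the helper names pack and unpack and the sample value 12.3456 are made up for this example, and Python's round() may break ties differently from NCO.

```python
def pack(value, scale_factor, add_offset):
    """Return the integer that would be stored for a physical value."""
    return round((value - add_offset) / scale_factor)


def unpack(stored, scale_factor, add_offset):
    """Recover the (approximate) physical value from a stored integer."""
    return stored * scale_factor + add_offset


# Attribute values taken from the demonstration above.
SCALE, OFFSET = -0.001, 40.0

# Hypothetical sample value, chosen only for illustration.
stored = pack(12.3456, SCALE, OFFSET)
restored = unpack(stored, SCALE, OFFSET)
print(stored, restored)

# A 16-bit short can hold -32768..32767, so fixed attributes also fix
# the range of physical values that can be represented at all.
SHORT_MIN, SHORT_MAX = -32768, 32767
lo, hi = sorted((unpack(SHORT_MIN, SCALE, OFFSET),
                 unpack(SHORT_MAX, SCALE, OFFSET)))
print(lo, hi)
```

With scale_factor = -0.001 and add_offset = 40, only values between about 7.23 and 72.77 fit in a short, with a resolution of 0.001. The same calculation for the question's values (scale_factor 0.00166667, add_offset 0.0) gives roughly -54.6 to 54.6, so it is worth checking the data range before choosing the attributes by hand.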
ClimateUnboxed

This turned out to be simpler than I thought in CDO:

cdo pack in.nc out.nc 

This packs with optimal add_offset and scale_factor, converting the field to I16.
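
As a rough idea of what "optimal" means here, the sketch below derives add_offset and scale_factor from the minimum and maximum of the data so that the full range maps onto the 16-bit integers. This is my understanding of the formula described in the NCO and CDO documentation (midpoint as offset, range divided by 2^16 - 2 as scale); treat it as an assumption rather than the tools' actual code. The function name optimal_packing and the sample temperatures are hypothetical, and missing values are ignored.

```python
def optimal_packing(values, nbits=16):
    """Return (scale_factor, add_offset) that spread min..max over the integer range."""
    vmin, vmax = min(values), max(values)
    ndrv = 2 ** nbits - 2               # number of discrete representable values
    add_offset = (vmin + vmax) / 2      # midpoint of the data range
    scale_factor = (vmax - vmin) / ndrv
    if scale_factor == 0:               # constant field: avoid dividing by zero later
        scale_factor = 1.0
    return scale_factor, add_offset


# Hypothetical 2 m temperatures in Celsius, for illustration only.
temps = [-12.5, 0.0, 18.25, 31.5]

scale, offset = optimal_packing(temps)
packed = [round((t - offset) / scale) for t in temps]
restored = [p * scale + offset for p in packed]

print(scale, offset)
print(packed)
print(max(abs(r - t) for r, t in zip(restored, temps)))
```

The largest round-trip error is at most half of scale_factor, which is the precision given up in exchange for storing two bytes per value instead of eight.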