Java NetCDF : Aggregating Existing Files : time dimension not found issue


For this brain teaser, I've searched through the docs and mailing list archives for a while and am having a hard time putting together the steps I need to handle this aggregation.

I'm working with CFSR 1-hour data files from here: http://rda.ucar.edu/datasets/ds094.0/

cdas_20161215_0000_f00000_G4.grib2
cdas_20161215_0000_f00100_G4.grib2
cdas_20161215_0000_f00200_G4.grib2
cdas_20161215_0000_f00300_G4.grib2
etc...

The hourly files declare 2 time dimensions, one with bounds set and another without.

cdas_20161215_0000_f00300_G4.grib2
double time(time=1);
  :units = "Hour since 2016-12-15T00:00:00Z";
  :standard_name = "time";
  :long_name = "GRIB forecast or observation time";
  :calendar = "proleptic_gregorian";
  :bounds = "time_bounds";
double time_bounds(time=1, 2);
  :units = "Hour since 2016-12-15T00:00:00Z";
  :long_name = "bounds for time";
double time1(time1=1);
  :units = "Hour since 2016-12-15T00:00:00Z";
  :standard_name = "time";
  :long_name = "GRIB forecast or observation time";
  :calendar = "proleptic_gregorian";

The problem is that, as I step through each dataset's creation, different hourly files swap which of the two names they use. As a result, AggregationExisting is unable to find the dimension named 'time' in certain files, e.g. on the u-component_of_wind_isobaric variable in the f00300 file, where the dimension was declared as time1 instead.
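To make the mismatch concrete, the normalization I have in mind can be sketched in plain Java (a hypothetical helper, not part of netcdf-java): given the dimension names a decoded variable reports, find whichever one is the time dimension so every file could be renamed to a single canonical 'time' before aggregating.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class TimeDimFinder {
    // GRIB decoding may name the time dimension "time", "time1", "time2", ...
    // Pick the first dimension whose full name matches that pattern so each
    // file can be normalized to one canonical name before aggregation.
    static Optional<String> findTimeDimension(List<String> dimensionNames) {
        return dimensionNames.stream()
                .filter(name -> name.matches("time\\d*"))
                .findFirst();
    }

    public static void main(String[] args) {
        // Dimension lists as reported by the two example files below
        List<String> f00300 = Arrays.asList("time1", "isobaric3", "lat", "lon");
        List<String> f00200 = Arrays.asList("time", "isobaric3", "lat", "lon");
        System.out.println(findTimeDimension(f00300).orElse("none")); // time1
        System.out.println(findTimeDimension(f00200).orElse("none")); // time
    }
}
```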

Code I'm calling:

List<String> variableNames = Arrays.asList("u-component_of_wind_isobaric","u-component_of_wind_height_above_ground","v-component_of_wind_isobaric","v-component_of_wind_height_above_ground","Pressure_reduced_to_MSL_msl","Geopotential_height_isobaric");
NetcdfDataset netcdfDataset = new NetcdfDataset();
//here I'm trying to aggregate on a dimension called 'time'
AggregationExisting aggregationExisting = new AggregationExisting(netcdfDataset, "time", null);
aggregationExisting.addDatasetScan(null,
                   "/cfsr-gribs/201612/",
                    "G4.grib2",
                    null,
                    null,
                    NetcdfDataset.getDefaultEnhanceMode(),
                    "false",
                    null);
aggregationExisting.persistWrite();
aggregationExisting.finish(new CancelTaskImpl());
GridDataset gridDataset = new GridDataset(netcdfDataset);
//'writer' is a NetcdfFileWriter created elsewhere (declaration omitted here)
writer.setRedefineMode(true);
CFGridWriter2.writeFile(gridDataset, variableNames, gridDataset.getBoundingBox(), null, 1, null, null, 1, true, writer);

Time dimension name issue illustrated in 2 files:

//cdas_20161215_0000_f00300_G4.grib2

float u-component_of_wind_isobaric(time1=1, isobaric3=37, lat=361, lon=720);
  :long_name = "u-component of wind @ Isobaric surface";
  :units = "m/s";
  :abbreviation = "UGRD";
  :missing_value = NaNf; // float
  :grid_mapping = "LatLon_Projection";
  :coordinates = "reftime time1 isobaric3 lat lon ";
  :Grib_Variable_Id = "VAR_0-2-2_L100";
  :Grib2_Parameter = 0, 2, 2; // int
  :Grib2_Parameter_Discipline = "Meteorological products";
  :Grib2_Parameter_Category = "Momentum";
  :Grib2_Parameter_Name = "u-component of wind";
  :Grib2_Level_Type = "Isobaric surface";
  :Grib2_Generating_Process_Type = "Forecast";


//cdas_20161215_0000_f00200_G4.grib2

float u-component_of_wind_isobaric(time=1, isobaric3=37, lat=361, lon=720);
  :long_name = "u-component of wind @ Isobaric surface";
  :units = "m/s";
  :abbreviation = "UGRD";
  :missing_value = NaNf; // float
  :grid_mapping = "LatLon_Projection";
  :coordinates = "reftime time isobaric3 lat lon ";
  :Grib_Variable_Id = "VAR_0-2-2_L100";
  :Grib2_Parameter = 0, 2, 2; // int
  :Grib2_Parameter_Discipline = "Meteorological products";
  :Grib2_Parameter_Category = "Momentum";
  :Grib2_Parameter_Name = "u-component of wind";
  :Grib2_Level_Type = "Isobaric surface";
  :Grib2_Generating_Process_Type = "Forecast";

This is my first use of the NetCDF library, so I'm shopping for preprocessing tools that can merge datasets with this quirk. Could I move all the variables onto the same time dimension and rename it, for instance? Even a link to an example I missed would be helpful. Otherwise I'm guessing I'll look into manually stamping out dimensions and using readDataSlice() to copy the data into a new merged file.
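For instance, I was picturing an NcML wrapper that renames time1 back to time per file before a joinExisting aggregation (an untested sketch, with hypothetical file paths; whether this even works for GRIB2 is part of what I'm asking):

```xml
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <aggregation dimName="time" type="joinExisting">
    <netcdf location="/cfsr-gribs/201612/cdas_20161215_0000_f00200_G4.grib2"/>
    <netcdf location="/cfsr-gribs/201612/cdas_20161215_0000_f00300_G4.grib2">
      <!-- rename the mis-numbered dimension and coordinate back to "time" -->
      <dimension name="time" orgName="time1"/>
      <variable name="time" orgName="time1"/>
    </netcdf>
  </aggregation>
</netcdf>
```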


There are 2 answers

N1B4 On

If you're interested in using non-Java tools, I recommend checking out NCO.

First, you'll need to convert from grib to netcdf, perhaps using the wgrib2 utility (example of the conversion is here) or ncl_convert2nc.

Second, you can develop a simple script that loops through the netcdf files in question, checks whether time1 exists as a dimension name, and if so, renames it to time. NCO's ncrename tool can do this:

ncrename -d time1,time file.nc file.nc 

Third, check to make sure that time (which should exist in all files now) is the record dimension. If not, let's make it so using NCO's ncks tool:

ncks -O --mk_rec_dmn time file.nc file.nc 

Finally, use NCO's ncrcat to concatenate files along the record (time) dimension:

ncrcat cdas*.nc all_files.nc 

Note: you don't have to use the wildcard in the line above, you could just include a list of files you want to be concatenated, e.g.

ncrcat cdas_20161215_0000_f00000_G4.nc cdas_20161215_0000_f00100_G4.nc all_files.nc 
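Putting the three steps together, here is a dry-run sketch of the pipeline (hypothetical file names). It only prints the commands so you can inspect them; in a real script you'd execute them instead, and only run the ncrename when time1 actually exists in the file, as described above:

```shell
# Print the NCO commands for a rename-then-concatenate pipeline.
# Pipe the output to `sh` (after reviewing it) to actually execute.
nco_pipeline() {
  for f in "$@"; do
    # rename the time dimension, then make it the record dimension
    echo "ncrename -d time1,time $f"
    echo "ncks -O --mk_rec_dmn time $f $f"
  done
  # concatenate all inputs along the record (time) dimension
  echo "ncrcat $* all_files.nc"
}

nco_pipeline cdas_20161215_0000_f00000_G4.nc cdas_20161215_0000_f00100_G4.nc
```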
Tom H On

So I received a response from UCAR that GRIB2 is a different beast and is not currently going to work with AggregationExisting. Their THREDDS server product has this functionality for GRIB2 files, so it's handled by some different classes, e.g. GribCollectionImmutable.

Here's what they recommended for this approach, which worked great for me:

List<String> variableNames = Arrays.asList("u-component_of_wind_isobaric","u-component_of_wind_height_above_ground","v-component_of_wind_isobaric","v-component_of_wind_height_above_ground","Pressure_reduced_to_MSL_msl","Geopotential_height_isobaric");
FeatureCollectionType fcType = FeatureCollectionType.GRIB2;
Path outputPath = Paths.get("/cfsr/Netcdf4/201612/Cfsr_201612_Monthly.nc");
String dataDir = "/cfsr-gribs/201612/";
String spec = dataDir + ".*grib2$";
String timePartition = "file";
String dateFormatMark = null;
String olderThan = null;
Element innerNcml = null;
String path = dataDir;
String name = "cfsr";
String collectionName = "cfsrCollection";

//find and configure the folder as a grib collection
FeatureCollectionConfig fcc = new FeatureCollectionConfig(name, path, fcType, spec,
        collectionName, dateFormatMark, olderThan, timePartition, innerNcml);

//'log' is a logger (e.g. an org.slf4j.Logger) declared elsewhere in the class
try (GribCollectionImmutable gc = GribCdmIndex.openGribCollection(fcc, null, log)) {
    //had to breakpoint and see the dataset type names to choose 'TP'; could be different for each dataset
    GribCollectionImmutable.Dataset ds = gc.getDatasetByTypeName("TP");
    String fullCollectionIndexFilePath = dataDir + name + ".ncx3";
    //now we open the collection index file, which catalogs all of the grib
    //records in the collection
    NetcdfDataset ncd = gc.getNetcdfDataset(ds, ds.getGroup(0), fullCollectionIndexFilePath,
            fcc, null, log);
    try (NetcdfFileWriter writer = NetcdfFileWriter.createNew(NetcdfFileWriter.Version.netcdf4,
            outputPath.toString(), new Nc4ChunkingDefault())) {
        GridDataset gridDataset = new GridDataset(ncd);
        for (String variableName : variableNames) {
            GeoGrid grid = gridDataset.findGridByShortName(variableName);
            //check that the time dimension is the length you'd expect
            log.info(String.format("Found grid for : %s = %s, with dimension length %s", variableName, grid != null, grid != null ? grid.getDimension(0).getLength() : 0));
        }
        writer.setRedefineMode(true);
        //write the aggregated variables to my output file
        CFGridWriter2.writeFile(gridDataset, variableNames, gridDataset.getBoundingBox(), null, 1, null, null, 1, true, writer);
    } catch (Exception exc) {
        exc.printStackTrace();
    }
} catch (IOException e) {
    e.printStackTrace();
}