Hyrax Data Server: HYRAX-612

Renaming the result of an aggregation (only join new?) fails.

    Details

    • Type: Bug
    • Status: Done
    • Priority: Medium
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ncml_handler
    • Labels:
      None

      Description

      Aafaque,

      My apologies for the top post, but the thread is getting quite long.

      Here’s what I found: the issue does not appear to be a problem with aggregating four or more files, since I can trigger the error below with just one file. The problem is the renaming of the band_1 variable after it has been built up during an aggregation. (I’m very glad you sent all the various tiff and ncml files; I’d never have figured this out w/o them.)

      Here’s what will work: if you use NCML to rename the Grid variable ‘band_1’ to ‘SRB2’ in each file you intend to aggregate, you can then aggregate on the Grid ‘SRB2’. So, given:

      edamame:tests jimg$ ls ../data/gdal/
      LC08_CU_002010_20130416_20170727_C01_V01_SRB2.ncml
      LC08_CU_002010_20130416_20170727_C01_V01_SRB2.tif
      LC08_CU_002010_20130418_20170727_C01_V01_SRB2.ncml
      LC08_CU_002010_20130418_20170727_C01_V01_SRB2.tif
      LC08_CU_002010_20130425_20170727_C01_V01_SRB2.ncml
      LC08_CU_002010_20130425_20170727_C01_V01_SRB2.tif
      LC08_CU_002010_20130504_20170727_C01_V01_SRB2.ncml
      LC08_CU_002010_20130504_20170727_C01_V01_SRB2.tif
      

      Where the .ncml files rename ‘band_1’ to ‘SRB2’ in each of the matching .tif files, the following NCML will work:

      <?xml version='1.0' encoding='utf-8'?>
      <netcdf title="Time based join for california_LC08_SRB2" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinNew">
          <variableAgg name="SRB2" />
          <netcdf coordValue="4854" location="/data/gdal/LC08_CU_002010_20130416_20170727_C01_V01_SRB2.ncml" />
          <netcdf coordValue="4856" location="/data/gdal/LC08_CU_002010_20130418_20170727_C01_V01_SRB2.ncml" />
          <netcdf coordValue="4863" location="/data/gdal/LC08_CU_002010_20130425_20170727_C01_V01_SRB2.ncml" />
          <netcdf coordValue="4872" location="/data/gdal/LC08_CU_002010_20130504_20170727_C01_V01_SRB2.ncml" />
        </aggregation>
        <variable name="time" type="double">
          <attribute name="units" type="string">days since 2000-01-01 00:00</attribute>
        </variable>
      </netcdf>
      

      The resulting three-dimensional Grid will be named SRB2.
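
      For reference, each per-file .ncml wrapper listed above only needs to point at its .tif and rename the variable. The wrappers themselves are not shown in this ticket, so the following is a minimal sketch that assumes they use the same rename element that appears elsewhere in this message and a location path matching the aggregation examples:

      ```xml
      <?xml version='1.0' encoding='UTF-8'?>
      <!-- Assumed contents of LC08_CU_002010_20130416_20170727_C01_V01_SRB2.ncml; not copied from the actual file. -->
      <netcdf location="/data/gdal/LC08_CU_002010_20130416_20170727_C01_V01_SRB2.tif" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <!-- Rename the Grid read from the .tif as band_1 to SRB2, before any aggregation happens. -->
        <variable name="SRB2" orgName="band_1" />
      </netcdf>
      ```

      One such wrapper per .tif, each with its own location, gives the joinNew aggregation a consistently named SRB2 Grid to work with.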

      And, if you can live with the Grid being named ‘band_1,’ you can aggregate the .tif files like this:

      <?xml version='1.0' encoding='UTF-8'?>
      <netcdf title="Time based join for california_LC08_SRB2" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinNew">
          <variableAgg name="band_1" />
          <netcdf coordValue="4854" location="/data/gdal/LC08_CU_002010_20130416_20170727_C01_V01_SRB2.tif" />
          <netcdf coordValue="4856" location="/data/gdal/LC08_CU_002010_20130418_20170727_C01_V01_SRB2.tif" />
          <netcdf coordValue="4863" location="/data/gdal/LC08_CU_002010_20130425_20170727_C01_V01_SRB2.tif" />
          <netcdf coordValue="4872" location="/data/gdal/LC08_CU_002010_20130504_20170727_C01_V01_SRB2.tif" />
        </aggregation>
        <variable name="time" type="double">
          <attribute name="units" type="string">days since 2000-01-01 00:00</attribute>
        </variable>
        <!-- variable name="SRB2" orgName="band_1" / -->
      </netcdf>
      

      (Note that I commented out the <variable name="SRB2" orgName="band_1" /> part.)

      But if you add <variable name="SRB2" orgName="band_1" /> back in, the aggregation will fail. I think this is a bug, but I cannot promise a fix soon. As a workaround, I tried using a second NCML file to change the name of the resulting 3D Grid, like this:

      <?xml version='1.0' encoding='UTF-8'?>
      <netcdf location="/data/ncml/gdal/h02v10_tiff_files.ncml" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <variable name="SRB2" orgName="band_1" />
      </netcdf>
      

      But that does not work either; again, that may be a bug.

      So for now: you can aggregate as many TIFF files as you want, but if you want to rename the variable that results from the aggregation, you must rename it in each of the component files first and then aggregate the renamed components.

      I’m sorry this took so long to sort out and that the outcome is less than satisfactory. What you’re trying seems logically correct, but fails because the NCML implementation is applying the variable rename operation (<variable name="SRB2" orgName="band_1" />) to the DDS of the elements of the aggregation, and not the DDS that results from the previous operations (i.e., the DDS of the aggregation).
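
      To make the failing case concrete, this is the second example above with the rename uncommented. It is a reproduction sketch assembled from the pieces already in this message, not a separately tested file:

      ```xml
      <?xml version='1.0' encoding='UTF-8'?>
      <netcdf title="Time based join for california_LC08_SRB2" xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
        <aggregation dimName="time" type="joinNew">
          <variableAgg name="band_1" />
          <netcdf coordValue="4854" location="/data/gdal/LC08_CU_002010_20130416_20170727_C01_V01_SRB2.tif" />
          <netcdf coordValue="4856" location="/data/gdal/LC08_CU_002010_20130418_20170727_C01_V01_SRB2.tif" />
          <netcdf coordValue="4863" location="/data/gdal/LC08_CU_002010_20130425_20170727_C01_V01_SRB2.tif" />
          <netcdf coordValue="4872" location="/data/gdal/LC08_CU_002010_20130504_20170727_C01_V01_SRB2.tif" />
        </aggregation>
        <variable name="time" type="double">
          <attribute name="units" type="string">days since 2000-01-01 00:00</attribute>
        </variable>
        <!-- Fails: the rename is applied to the DDS of each aggregation element, not to the DDS of the aggregation. -->
        <variable name="SRB2" orgName="band_1" />
      </netcdf>
      ```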

      Here’s the actual error message from the ASSERT in the code:

      [MST Thu Feb 15 10:56:17 2018 id: 34209][ncml] NCMLModule InternalError: [static void agg_util::AggregationUtil::transferArrayConstraints(libdap::Array *, const libdap::Array &, bool, bool, bool, const std::string &)]: ASSERTION FAILED: condition=( fromArrIt->name == toArrIt->name ) GAggregationUtil::transferArrayConstraints: Expected the dimensions to have the same name but they did not: 'easting', 'northing'.

      You will not see this message from the code you’re running, because assertions are compiled out of production releases; that might be an issue as well. I’ll change that so we trap these conditions and produce a more informative error message.

      Thanks,
      James

            People

            • Assignee:
              jimg James Gallagher
              Reporter:
              jimg James Gallagher

                Time Tracking

                • Original Estimate: Not Specified
                • Remaining Estimate: 0m
                • Time Spent: 2h