Sometimes, for an NcML-modified dataset, the DDX response contains two copies of every variable (one modified, one not).

Description

Hi. One of our Hyrax users reported a problem that I have never noticed myself and that is difficult to reproduce. They say that the ddx response occasionally lists the variables in a data granule twice.

It appears to happen with NcML granules, and the reported cases involve a few of our AIRS datasets. In the NcML files we enhance the metadata of these datasets by adding variable attributes such as ‘units’ and ‘long_name’. In the ddx the user attached, I can see that each variable is listed twice: the first listing of, e.g., ‘TotalCounts_A’ is correct and includes ‘long_name’, while the second listing of ‘TotalCounts_A’ lacks ‘long_name’. In fact, the second listing of every variable looks just like what the ddx would report had the server accessed the original HDF data. It may sound as if the server is attempting to access the original HDF granule and somehow combining the metadata. However, when the request was repeated (e.g. by a second wget or simply by reloading the browser), the ddx correctly listed each variable once, with the NcML enhancement. This behavior is what makes the problem difficult to reproduce.
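
To check a saved ddx for this symptom without reading it by eye, something like the following sketch may help. It assumes the ddx is a DAP2-style XML document in which variables are declared by elements such as `Array`, `Grid` or `Structure` carrying a `name` attribute; the tag list, the function name `find_duplicate_variables` and the sample document (including its `long_name` value and dataset name) are illustrative assumptions, not taken from the user's attachment.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Element names assumed to declare variables in a DAP2-style DDX.
VARIABLE_TAGS = {
    "Byte", "Int16", "UInt16", "Int32", "UInt32", "Float32", "Float64",
    "String", "Url", "Array", "Structure", "Sequence", "Grid",
}
# Elements whose children are themselves variable declarations.
CONTAINER_TAGS = {"Dataset", "Structure", "Sequence", "Grid"}


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def find_duplicate_variables(ddx_text):
    """Return {path: count} for variable names declared more than once
    among the direct children of the same container."""
    root = ET.fromstring(ddx_text)
    duplicates = {}

    def visit(container, path):
        counts = Counter()
        for child in container:
            name = child.get("name")
            if name and local_name(child.tag) in VARIABLE_TAGS:
                counts[name] += 1
        for name, count in counts.items():
            if count > 1:
                duplicates[path + name] = count
        # Recurse into nested containers, extending the path prefix.
        for child in container:
            if local_name(child.tag) in CONTAINER_TAGS:
                visit(child, path + (child.get("name") or "") + "/")

    visit(root, "")
    return duplicates


# Hypothetical ddx fragment imitating the reported symptom: the same
# variable appears once with 'long_name' and once without it.
SAMPLE_DDX = """<Dataset name="sample.hdf.ncml" xmlns="http://xml.opendap.org/ns/DAP2">
  <Array name="TotalCounts_A">
    <Attribute name="long_name" type="String"><value>illustrative</value></Attribute>
    <Float32/>
    <dimension name="Latitude" size="180"/>
  </Array>
  <Array name="TotalCounts_A">
    <Float32/>
    <dimension name="Latitude" size="180"/>
  </Array>
  <Array name="Latitude">
    <Float32/>
    <dimension name="Latitude" size="180"/>
  </Array>
</Dataset>"""

if __name__ == "__main__":
    print(find_duplicate_variables(SAMPLE_DDX))
```

An empty result means no variable name is repeated within any one container; any entry in the result names a variable that was listed more than once, as in the attached ddx.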

You may attempt to reproduce it with the granules here, but you are not guaranteed to hit it on the first try (and I am not sure when it started happening):

https://acdisc.gesdisc.eosdis.nasa.gov/opendap/hyrax/ncml/Aqua_AIRS_Level3/AIRS3STD.006/2005/contents.html
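
Because the problem is intermittent, one rough way to look for it is to request the same ddx several times and count how many distinct responses come back. The sketch below does that by hashing each response body. The helper names `fetch_url` and `tally_responses` are hypothetical, the ddx URL is left for the reader to supply, and the code assumes the URL can be fetched with a plain HTTP GET without extra authentication, which may not hold for this server.

```python
import hashlib
from collections import Counter
from urllib.request import urlopen


def fetch_url(url):
    """Return the raw response body for a plain HTTP GET of url."""
    with urlopen(url) as response:
        return response.read()


def tally_responses(fetch, attempts):
    """Call fetch() `attempts` times and count the distinct response
    bodies, keyed by their SHA-256 digest."""
    tally = Counter()
    for _ in range(attempts):
        body = fetch()
        tally[hashlib.sha256(body).hexdigest()] += 1
    return tally
```

For example, `tally_responses(lambda: fetch_url(ddx_url), 20)` with `ddx_url` set to the ddx URL of one granule. More than one key in the result would mean the server returned different ddx documents for the same granule. Since the report says that repeated requests returned the correct listing, a tight loop may well see only one variant, so the requests might need to be spread out over time.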

Our user didn’t report whether this also happened with the das or dds. I suppose a machine interface is reading the ddx.

If it cannot be reliably reproduced, I guess this becomes a theoretical question: is it possible that, under certain circumstances, the server follows a link in an NcML file (in this case to the original HDF granule) when building the ddx response? Is it possible that it reads a cache entry pointing to a data resource with a similar filename (we append .ncml to the HDF granule name)? Other possibilities?

Source Data File: https://acdisc.gesdisc.eosdis.nasa.gov/data/s4pa/Aqua_AIRS_Level3/AIRS3STD.006/2005/AIRS.2005.01.01.L3.RetStd_IR001.v6.0.9.0.G13252161408.hdf

NcML File:

Hyrax Configuration Files:

Environment

None

Assignee

Slav Korolev

Reporter

Nathan Potter

Fix versions

None

Story Points

None

Priority

Medium