Hyrax Data Server: HYRAX-561

The fileout_netcdf and/or ncml_handler code does not clean up the temporary netCDF result file if the requesting client is killed during the transaction.

    Details

      Description

      At the request of NSIDC I reopened this issue: https://github.com/OPENDAP/bes/issues/39

      Matt provided a recipe to reproduce the issue:

      ... For now, I can give you a repository that should allow you to reproduce the issue on your end. Just follow the README instructions to reproduce the issue.

      https://bitbucket.org/MattF-NSIDC/opendap-tmp-overflow-sscce

      (I have attached the download from bitbucket to this ticket.)

      But after looking more closely I think the problem deserves a new issue. Here's my last message to the parent thread (which I am going to close):

      Hi Matt,

      I fired up an AWS system running CentOS-7 and pulled your examples and Hyrax onto it. With everything in place, I can reliably replicate the problem by doing just this much:

      #!/bin/bash
      test_url="http://127.0.0.1:8080/opendap/aggregates/h00v00.ncml.nc?Geophysical_Data_baseflow_flux[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_heat_flux_ground[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_heat_flux_latent[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_heat_flux_sensible[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_height_lowatmmodlay[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_land_evapotranspiration_flux[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_land_fraction_saturated[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_land_fraction_snow_covered[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_land_fraction_unsaturated[0:1:0][0:1:7][0:1:1623][0:1:3855]"
      curl --globoff "${test_url}" > bad_h00v00.ncml.nc 2>&1 &
      PID=$!
      echo "Requesting ridiculously large amount of data (PID $PID)..."
      And then issuing kill $PID myself from the command line.

      I think this is not a re-manifestation of Hyrax-39, so I am going to make a new ticket in our JIRA (I'll send you a link) and start trying to see where this goes awry.

      Thanks for your patience,

      N

      What's made clear in https://github.com/OPENDAP/bes/issues/39 is that killing curl like this causes the BES (and fileout_netcdf in particular) to leave behind very large orphaned files in the tmp area:

      -rw-------. 1 centos centos 1803554208 Dec  6 23:50 ncaSKSaE
      -rw-------. 1 centos centos 1803554208 Dec  6 23:50 ncVx3ikn
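
A quick way to check for leftovers like these after a killed request is sketched below. This is only an illustration, not part of the BES: the function name `list_orphaned_nc_tmp` is hypothetical, the `nc` + six-character name pattern is inferred from the two files listed above, and the size threshold is an arbitrary assumption. It also relies on `-maxdepth`, which GNU and BSD `find` provide.

```shell
#!/bin/bash
# Hypothetical helper (not part of Hyrax/BES): list files in a directory
# that look like leftover netCDF temp files.
# Assumptions: names match "nc" plus six characters (inferred from the
# listing above) and "large" means strictly more than MIN_BYTES bytes.
list_orphaned_nc_tmp() {
    local dir="$1"
    local min_bytes="${2:-1000000000}"   # default threshold: about 1 GB
    # -size +Nc selects files larger than N bytes; only look at DIR itself.
    find "$dir" -maxdepth 1 -type f -name 'nc??????' -size +"${min_bytes}c" -print
}
```

For example, `list_orphaned_nc_tmp /tmp` would report candidates if the BES temp area happens to be `/tmp`; the actual location depends on the server configuration.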
      

      Also of note are the requested dataset, the NcML aggregation h00v00.ncml, and the response type, netcdf. Thus we are producing the fileout_netcdf response for a subset of an NcML aggregation. This is why I added the NcML handler to the list of components. It may well be that the problem is in there.

              People

              • Assignee: Nathan Potter (ndp)
              • Reporter: Nathan Potter (ndp)

                  Time Tracking

                  Estimated: 3d
                  Remaining: 0m
                  Logged: 1w 1d 4h 30m