The fileout_netcdf and/or ncml_handler code does not clean up the temporary netCDF result file if the requesting client is killed during the transaction.

Description

At the request of NSIDC I reopened this issue: https://github.com/OPENDAP/bes/issues/39

Matt provided a recipe to reproduce the issue:

... For now, I can give you a repository that should allow you to reproduce the issue on your end. Just follow the README instructions to reproduce the issue.

https://bitbucket.org/MattF-NSIDC/opendap-tmp-overflow-sscce

(I have attached the download from bitbucket to this ticket.)

But after looking more closely I think the problem deserves a new issue. Here's my last message to the parent thread (which I am going to close):

Hi Matt,

I fired up an AWS system running CentOS-7 and pulled your examples and Hyrax onto it. With everything in place I can reliably replicate the problem by doing just this much:

#!/bin/bash
test_url="http://127.0.0.1:8080/opendap/aggregates/h00v00.ncml.nc?Geophysical_Data_baseflow_flux[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_heat_flux_ground[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_heat_flux_latent[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_heat_flux_sensible[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_height_lowatmmodlay[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_land_evapotranspiration_flux[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_land_fraction_saturated[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_land_fraction_snow_covered[0:1:0][0:1:7][0:1:1623][0:1:3855],Geophysical_Data_land_fraction_unsaturated[0:1:0][0:1:7][0:1:1623][0:1:3855]"
curl --globoff "${test_url}" > bad_h00v00.ncml.nc 2>&1 &
PID=$!
echo "Requesting ridiculously large amount of data (PID $PID)..."
And then issuing kill $PID myself from the command line.

I think this is not a re-manifestation of Hyrax-39, so I am going to make a new ticket in our JIRA (I'll send you a link) and start trying to see where this goes awry.

Thanks for your patience,

N

What's made clear in https://github.com/OPENDAP/bes/issues/39 is that killing curl like this causes the BES (and fileout_netcdf in particular) to leave behind very large orphaned files in the tmp area.

Also of note are the requested dataset, the NcML aggregation h00v00.ncml, and the response type, netcdf. We are therefore producing the fileout_netcdf response for a subset of an NcML aggregation, which is why I added the NcML handler to the list of components. It may well be that the problem is in there.

Environment

None

Activity

Nathan Potter
December 8, 2017, 3:52 PM

From a low-vote thread on Stack Overflow:

I'd add SIGPIPE to the list of signals that you might legitimately handle, but it does depend on your program. It is often a good idea to ignore SIGPIPE so that you get a write error instead of an interrupt when you write to a pipe which has no process reading from it. – Jonathan Leffler Mar 10 '15 at 22:23
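To make the point in that comment concrete, here is a minimal sketch in Python (not the BES code, which is C++) of how ignoring SIGPIPE turns a vanished reader into an ordinary write error that a cleanup path can handle. The function name stream_with_cleanup, the ".nc" suffix, and the use of a pipe to stand in for the client connection are all assumptions made for illustration; nothing here reflects how fileout_netcdf actually builds or sends its response.

```python
import os
import signal
import tempfile


def stream_with_cleanup(payload: bytes, out_fd: int):
    """Stage payload in a temp file, copy it to out_fd, always remove the file.

    Returns (delivered, tmp_path): delivered is False if the reader went
    away mid-transfer. Hypothetical helper, for illustration only.
    """
    # Ignore SIGPIPE so a dead reader shows up as BrokenPipeError (EPIPE)
    # from os.write() instead of a signal that terminates the process.
    signal.signal(signal.SIGPIPE, signal.SIG_IGN)

    fd, tmp_path = tempfile.mkstemp(suffix=".nc")  # stand-in for the result file
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        with open(tmp_path, "rb") as tmp:
            while chunk := tmp.read(65536):
                view = memoryview(chunk)
                while view:  # handle partial writes
                    written = os.write(out_fd, view)
                    view = view[written:]
        return True, tmp_path
    except BrokenPipeError:
        # The reader is gone; report failure and fall through to cleanup.
        return False, tmp_path
    finally:
        # Runs on success and on failure, so no orphaned file is left behind.
        os.remove(tmp_path)


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # simulate the client being killed
    delivered, path = stream_with_cleanup(b"x" * 1024, write_fd)
    os.close(write_fd)
    print("delivered:", delivered, "temp file still exists:", os.path.exists(path))
```

With the read end closed before the write, the write should fail with a broken-pipe error rather than kill the process, and the temporary file should be gone afterwards. Whether the BES leaves its files behind for this reason is exactly what this ticket still has to establish.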

Done

Assignee

Nathan Potter

Reporter

Nathan Potter

Labels

None

Time tracking

0m

Time remaining

0m

Epic Link

Components

Priority

High