There are a number of segmentation faults in the DMR++ software for large runs

Description

Here are the places where the handler running on C7 fails:

Likely destructor failure:

Two dmrpp_easy_handle::read_data() calls:

The rest are calls to Chunk::set_rbuf()

Environment

None

Activity

James Gallagher
October 2, 2018, 6:34 PM
Edited

52.203.246.103
nohup ./p18-2-test-harness > p18-2-test-harness.log 2>&1 &
../mkCsv

Also, the mds is in /home/centos/build/share/mds. To replace its state with pre-populated goodness:
cd /home/centos/build/share; rm -r mds mds_ledger.txt; tar -xvf mds-primed.tgz
and you should be good to go.

I’m going to see if I can find these using ASAN.

Working on the C7 vm, I had to add the package ‘libasan’ to get the BES to build (but not libdap4…)

James Gallagher
October 10, 2018, 4:31 PM

These were most likely due to threads continuing to run after one thread threw an exception and the master deleted the object(s) that held the data buffers those threads were writing to (or were about to write to). The code now joins with all outstanding threads before throwing an exception. It also retries HTTP GET requests that result in 500 errors.
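
For anyone reading this later, here is a minimal sketch of the two ideas in the comment above: join every worker before rethrowing, and retry a GET that returns 500. It is not the actual DMR++ code. The names `read_all_chunks`, `ChunkReader`, and `get_with_retry` are hypothetical, and the use of `std::thread` and the attempt limit are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <functional>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical stand-in for a chunk read: writes its part of the shared
// buffer, or throws on failure.
using ChunkReader = std::function<void(std::vector<char> &buf, std::size_t chunk)>;

// Start one worker per chunk. If any worker throws, keep the first exception
// but join ALL workers before rethrowing, so no thread can still be writing
// to `buf` when the caller unwinds and frees it.
void read_all_chunks(std::vector<char> &buf, std::size_t n_chunks, const ChunkReader &reader)
{
    assert(reader && "reader must be callable");

    std::vector<std::thread> workers;
    workers.reserve(n_chunks);
    std::exception_ptr first_error;
    std::mutex error_mutex;

    try {
        for (std::size_t i = 0; i < n_chunks; ++i) {
            workers.emplace_back([&, i] {
                try {
                    reader(buf, i);
                }
                catch (...) {
                    // Record only the first failure; never let it escape the thread.
                    std::lock_guard<std::mutex> lock(error_mutex);
                    if (!first_error) first_error = std::current_exception();
                }
            });
        }
    }
    catch (...) {
        // Thread creation itself failed; still fall through to the joins.
        std::lock_guard<std::mutex> lock(error_mutex);
        if (!first_error) first_error = std::current_exception();
    }

    // Join everything that was started, whether or not an error occurred.
    for (auto &t : workers) t.join();

    if (first_error) std::rethrow_exception(first_error);
}

// Hypothetical retry wrapper: `http_get` returns an HTTP status code.
// Repeats the request while the status is 500, up to `max_attempts` calls.
// Returns the last status seen (0 if max_attempts is not positive).
int get_with_retry(const std::function<int()> &http_get, int max_attempts)
{
    int status = 0;
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        status = http_get();
        if (status != 500) break;
    }
    return status;
}
```

The point of the first function is ordering: the exception is only rethrown after the join loop, so the buffer's owner cannot start destroying it while a worker is still live.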

Assignee

James Gallagher

Reporter

James Gallagher

Labels

None

Fix versions

Time remaining

0m

Story Points

None

Affects versions

Epic Link

Priority

High