As a user of data, I want access to those data to be as fast as possible so I can do my work.
There are a number of potential optimizations for access to data in S3. The simplest is to use HTTP with connection reuse (keep-alive), so each request does not pay for a new TCP/TLS handshake. Do that.
If there's time, try adding parallel access to chunks; it helps, but only up to about 5-8 streams. Doing this will require N HTTP 'connections,' so some modification to the connection reuse code will have to be made.
I dropped this story down to one point, because the connection reuse optimization is really simple and offers significant return for the effort.
Read the code in H4ByteStream, DmrppCommon, DmrppUtil and DMRppArray. Then make a plan for restructuring this software so that it can be optimized. As it stands, it is too much of a proof-of-concept implementation to support things like connection pools and parallel transfers based not on the total number of chunks but on some finite (optimal) number of parallel HTTP/S connections to S3.