My stream from an HTTP request to an AWS S3 multipart upload is not storing the complete file

I am piping a 1.7 GB file from an HTTP request into an S3 upload function that performs a multipart upload. The end handler is called, but when I look in S3 the file is only 1 GB. It seems that not all the chunks are making it through.

The number of bytes lost is relative to the file size: smaller files lose a proportional amount of bytes.

I have tried piping directly from the HTTP request.

I have also tried managing the chunks myself, uploading a part to S3 for every 5 MB received via on('data').

Both of the above give the same result.

I'm at a loss.

1st attempt

Here is the approach that pipes the HTTP stream into the S3 upload.


// Returns a PassThrough stream; whatever is written to it is sent to S3
// as a multipart upload.
uploadFromStream(bucket, key) {
  const pass = new Stream.PassThrough();
  const params = { Bucket: bucket, Key: key, Body: pass };
  const streamOptions = {
    partSize: 5 * 1024 * 1024, // 5 MB per part
    queueSize: 1               // upload one part at a time
  };
  this.s3Client.upload(params, streamOptions, (err, data) => {
    if (err) {
      console.log('Stream upload Error', err);
      return;
    }
    console.log(data);
  });

  return pass;
}

const getContents = await this.getFileById(queryResult.fileId)
getContents.pipe(this.uploadFromStream(s3bucket, s3key))
getContents.on('end', async () => {
   return resolve(true)
})
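
One thing that may be worth ruling out in this first attempt is that the promise resolves when the source stream ends, not when S3 reports the upload as finished. Below is a minimal sketch of waiting on the upload itself. It assumes AWS SDK v2, where `upload()` returns a managed upload with a `promise()` method. `uploadStream` is a hypothetical helper name, and `s3Client` is passed in as an argument rather than read from `this`. Whether this has anything to do with the missing bytes is not established.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper: pipes `source` into an S3 managed upload and returns
// a promise that settles only when the upload itself has finished.
// Assumes `s3Client` is an AWS SDK v2 S3 client.
function uploadStream(s3Client, source, bucket, key) {
  const pass = new PassThrough();
  const managedUpload = s3Client.upload(
    { Bucket: bucket, Key: key, Body: pass },
    { partSize: 5 * 1024 * 1024, queueSize: 1 }
  );

  // pipe() does not forward errors, so a failed download would otherwise
  // leave the upload waiting on a stream that never ends.
  source.on('error', (err) => pass.destroy(err));
  source.pipe(pass);

  return managedUpload.promise();
}
```

Called as `await uploadStream(this.s3Client, getContents, s3bucket, s3key)`, this rejects on a failed upload instead of resolving as soon as the download ends.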

2nd attempt

Below is the approach that groups the chunks into 5 MB parts for use in the S3 part upload. Before this function is hit, I have initiated a multipart upload and obtained the upload ID.

const maxChunkLimit = 1024 * 1024 * 5;
let chunkLimit = ''
let i = 0
getContents.on('data', async (chunk) => {
  chunkLimit += chunk
  if(chunkLimit.length > maxChunkLimit) {
    i = i + 1
    this.s3Client.uploadPart({
      Body: chunkLimit, 
      Bucket: s3bucket, 
      Key: s3key, 
      PartNumber: i, 
      UploadId: multiPartUpload.UploadId
    })
    chunkLimit = ''
  }
})

On 'end', I upload the final remaining chunk (less than 5 MB).
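
Two details of this second attempt may be worth checking. Appending each chunk to a string decodes the Buffer as UTF-8 text, so `.length` counts characters rather than bytes. The `uploadPart` calls are also not waited on before more data is read. The sketch below keeps the data as Buffers and pauses the source while each part uploads. `uploadInParts` and its `uploadPart(partNumber, body)` callback are hypothetical names; with SDK v2 the callback could wrap `s3Client.uploadPart({...}).promise()`. It is not established that either detail explains the missing bytes.

```javascript
// Hypothetical helper: groups Buffer chunks from `source` into parts of at
// least `minPartSize` bytes and hands each part to `uploadPart`.
// Assumes `source` emits Buffers (no setEncoding) and that
// `uploadPart(partNumber, body)` returns a promise.
function uploadInParts(source, minPartSize, uploadPart) {
  return new Promise((resolve, reject) => {
    let pending = [];      // Buffers not yet uploaded
    let pendingBytes = 0;  // byte count of `pending`
    let partNumber = 0;
    let totalBytes = 0;
    let chain = Promise.resolve(); // runs uploads one at a time, in order

    const flush = () => {
      const body = Buffer.concat(pending, pendingBytes);
      const number = ++partNumber;
      pending = [];
      pendingBytes = 0;
      totalBytes += body.length;
      chain = chain.then(() => uploadPart(number, body));
      return chain;
    };

    source.on('data', (chunk) => {
      pending.push(chunk);
      pendingBytes += chunk.length;
      if (pendingBytes >= minPartSize) {
        source.pause(); // stop reading until this part has been uploaded
        flush().then(() => source.resume(), reject);
      }
    });
    source.on('end', () => {
      if (pendingBytes > 0) flush(); // final remainder, may be smaller
      chain.then(() => resolve({ parts: partNumber, totalBytes }), reject);
    });
    source.on('error', reject);
  });
}
```

The resolved `totalBytes` can then be compared with the expected content length before completing the multipart upload.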

When I sum the lengths of all the chunks, the total does not equal the content length of the file being read. The sum does equal the size of the uploaded file (1.08 GB). This shows that the upload finished and all part uploads succeeded before the multipart upload was completed.
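
To narrow down where the bytes go missing, the comparison described above can be made independently of S3 by counting bytes as they leave the source stream. The following is a sketch only; `ByteCounter` is a hypothetical name, not part of Node or the AWS SDK.

```javascript
const { Transform } = require('stream');

// Hypothetical pass-through stream that forwards every chunk unchanged and
// keeps a running total of the bytes it has seen.
// Assumes chunks are Buffers, so `chunk.length` is a byte count.
class ByteCounter extends Transform {
  constructor(options) {
    super(options);
    this.bytes = 0;
  }

  _transform(chunk, encoding, callback) {
    this.bytes += chunk.length;
    callback(null, chunk);
  }
}
```

It can be placed in the pipeline as `getContents.pipe(counter).pipe(this.uploadFromStream(s3bucket, s3key))`. Assuming `getContents` is an HTTP response, `counter.bytes` can be compared with its `Content-Length` header once the stream ends. If the count already falls short there, the truncation happens before S3 is involved.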

Both approaches give me the same outcome: the uploaded file is not the same size as the original file.

The loss in bytes is relative to the size of the original file.

The strange thing is that the most I can upload is just over 1 GB. I tried 1.7 GB, 3 GB, and 6 GB files. All completed with a 1 GB file.

I'm using Node 8.10 and AWS Node SDK 2.360.0.
