Chrome HTML5 FileReader crashes when trying to read large file


I have a form that allows users to select a zip file for upload. I'm trying to do client-side validation of this zip file before it is uploaded to my server since the upload might take a while and I'd also like to conserve bandwidth.

All I need to do is read a .csv file that should be included in the zip, and verify the presence of other files in the zip that are referenced in the .csv. To accomplish this, I am trying to use JSZip.

If the archive is small, this works great. If the archive is large (testing with ~500MB file) then Chrome crashes.

var reader = new FileReader();
reader.onload = function (e) {
  console.log("Got here!");
  // Read csv using JSZip, validate zip contents
};
reader.readAsArrayBuffer(file);
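
The validation itself would be something along these lines (a minimal sketch, assuming JSZip 3.x's loadAsync API; "manifest.csv" and the one-filename-per-line format are placeholders):

reader.onload = function (e) {
  JSZip.loadAsync(e.target.result).then(function (zip) {
    var csvEntry = zip.file("manifest.csv"); // placeholder name
    if (!csvEntry) {
      throw new Error("manifest.csv is missing from the archive");
    }
    return csvEntry.async("string").then(function (csvText) {
      // Assume each csv line names a file that must exist in the zip.
      csvText.split("\n").forEach(function (line) {
        var name = line.trim();
        if (name && !zip.file(name)) {
          console.error("Missing referenced file: " + name);
        }
      });
    });
  }).catch(function (err) {
    console.error("Validation failed:", err);
  });
};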

In my code I've commented out all of the logic in my onload callback and verified that none of that is causing the crash. I've discovered that Chrome crashes before the onload callback is ever called.

I've tested this in Firefox with much larger zip files, and it works fine.

1 Answer

Mihail Malostanidis answered:

It is the browser tab running out of memory.

To work with a file this large, you should load it one slice at a time:

Use File.slice(start, end + 1) (slice is inherited from Blob; its second argument is exclusive, hence the + 1 to include the byte at end), read the resulting Blob as an ArrayBuffer, work on that chunk, then make sure no references to it remain so that it can be garbage collected, as sketched below.
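
A minimal sketch of that loop; CHUNK_SIZE and processChunk are placeholders to tune and fill in:

var CHUNK_SIZE = 10 * 1024 * 1024; // placeholder: 10 MB per slice

function processChunk(buffer) {
  // Placeholder: feed this ArrayBuffer to your parser/validator.
}

function readChunk(file, start) {
  if (start >= file.size) {
    console.log("Finished reading all chunks");
    return;
  }
  var reader = new FileReader();
  reader.onload = function (e) {
    processChunk(e.target.result);
    reader = null; // drop the reference so the chunk can be collected
    readChunk(file, start + CHUNK_SIZE);
  };
  // slice's second argument is exclusive, so this reads
  // bytes start .. start + CHUNK_SIZE - 1.
  reader.readAsArrayBuffer(file.slice(start, start + CHUNK_SIZE));
}

readChunk(file, 0);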

Depending on what you are doing with each chunk, you may even have to schedule the next read with setTimeout to give the garbage collector extra time to get to it. Make sure to test all of the browsers you support, as some may force you to use longer timeouts or smaller chunks. Also keep in mind that garbage collection can take even longer on overloaded or less powerful machines.
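
If you need that pause, a small variation of the onload handler above defers the next read; the 100 ms delay is only an assumption to test against your browsers:

reader.onload = function (e) {
  processChunk(e.target.result);
  reader = null;
  // Yield before the next slice so the garbage collector gets a window.
  setTimeout(function () {
    readChunk(file, start + CHUNK_SIZE);
  }, 100);
};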

This is a good example of slicing. Of course, you would slice in much larger pieces. You would probably also want to combine it with the next example on that page, so that your progress feedback accounts for fetching each chunk from slow or remote storage, not just the current chunk number.