I'm reading a large XML file (~1.5 GB) in Node.js. I'm trying to stream it and process the data chunk by chunk, but I'm finding the documentation difficult to understand.
My current simple code is:
var fs = require('fs');
var stream = fs.createReadStream('xml/bigxmlfile.xml');
stream.on('data', function (chunk) {
  console.log(chunk);
});
The console prints a bunch of Buffer hex codes (I think), like this:
<Buffer 65 61 6e 2d 63 75 74 20 67 72 69 64 20 6c 69 6e 65 73 20 74 68 65 20 73 70 72 65 61 64 20 63 6f 6c 6c 61 72 20 61 6e 64 20 6d 69 74 65 72 65 64 2c 20 74 ...>
<Buffer 65 79 77 6f 72 64 73 3e 3c 2f 6b 65 79 77 6f 72 64 73 3e 3c 75 70 63 3e 34 32 39 35 36 30 31 33 38 33 38 39 3c 2f 75 70 63 3e 3c 6d 31 3e 36 38 38 39 31 ...>
<Buffer 6f 75 6e 74 3e 3c 63 75 72 72 65 6e 63 79 3e 55 53 44 3c 2f 63 75 72 72 65 6e 63 79 3e 3c 2f 63 6f 73 74 3e 3c 69 6e 66 6f 72 6d 61 74 69 6f 6e 3e 3c 2f ...>
<Buffer 65 20 62 72 69 65 66 73 20 74 68 61 74 20 73 69 74 20 63 6f 6d 66 6f 72 74 61 62 6c 79 20 61 74 20 74 68 65 20 68 69 70 73 2e 20 43 6f 6c 6f 72 28 73 29 ...>
<Buffer 3c 64 65 73 63 72 69 70 74 69 6f 6e 3e 3c 73 68 6f 72 74 3e 43 7a 65 63 68 20 63 72 79 73 74 61 6c 73 20 73 70 72 69 6e 6b 6c 65 20 61 20 73 6c 69 6e 67 ...>
I've also tried:
var fs = require('fs');
var parseString = require('xml2js').parseString;
var stream = fs.createReadStream('xml/lsnordstrom.xml');
stream.on('data', function (chunk) {
  // do something with the file data
  parseString(chunk, function (err, result) {
    console.log(result);
  });
});
(so I can parse the XML stream as JSON) but I get undefined results in the console.
How do I actually convert this data into something useful?
You can set the stream encoding with stream.setEncoding('utf8'), so that each chunk arrives as a string instead of a Buffer.
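As a minimal sketch of the encoding approach, assuming the file is UTF-8: the helper name `readTextChunks` is made up for illustration, and the path is the one from the question.

```javascript
var fs = require('fs');

// Hypothetical helper: stream a file and hand text chunks to onChunk.
function readTextChunks(path, onChunk, onEnd) {
  var stream = fs.createReadStream(path);
  // With an encoding set, 'data' events deliver strings rather than Buffers,
  // and a multi-byte character split across two chunks is still decoded whole.
  stream.setEncoding('utf8');
  stream.on('data', onChunk);
  if (onEnd) {
    stream.on('end', onEnd);
  }
  stream.on('error', function (err) {
    console.error('read failed: ' + err.message);
  });
  return stream;
}

readTextChunks('xml/bigxmlfile.xml', function (chunk) {
  console.log(chunk); // now readable XML text
});
```

Passing `{ encoding: 'utf8' }` as the second argument to `fs.createReadStream` should have the same effect.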
Or convert the buffers to strings yourself with chunk.toString().
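A sketch of the manual conversion, again assuming UTF-8 content (`chunkToString` is a hypothetical helper name, not part of any library):

```javascript
var fs = require('fs');

// Hypothetical helper: decode one Buffer chunk as UTF-8 text.
function chunkToString(chunk) {
  return chunk.toString('utf8');
}

var stream = fs.createReadStream('xml/bigxmlfile.xml');
stream.on('data', function (chunk) {
  console.log(chunkToString(chunk));
});
stream.on('error', function (err) {
  console.error('read failed: ' + err.message);
});
```

One caveat: if a multi-byte character happens to straddle a chunk boundary, decoding each chunk on its own can garble it. Setting the stream encoding avoids that, so it is probably the safer of the two options.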
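Also, to parse XML like you're trying to do, you'll need a stream parser. parseString expects a complete XML document, but each chunk is an arbitrary slice of the file, so parsing fails and result comes back undefined.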
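For real work a SAX-style streaming parser (the `sax` package on npm is one) is the robust choice. To illustrate the underlying idea without extra dependencies, here is a rough sketch that buffers text across chunk boundaries and emits each complete record element as a string. It is not a real XML parser: the record element name `product` is a guess (the question does not show it), `createRecordSplitter` is a hypothetical helper, and it assumes records are not nested inside each other, are not self-closing, and that the closing tag never appears inside CDATA or comments.

```javascript
var fs = require('fs');

// Hypothetical helper: returns a feed(text) function that calls onRecord
// with every complete <tagName ...>...</tagName> element seen so far.
// Assumes tagName is a plain name with no regex metacharacters.
function createRecordSplitter(tagName, onRecord) {
  var openRe = new RegExp('<' + tagName + '(\\s[^>]*)?>');
  var closeTag = '</' + tagName + '>';
  var pending = '';

  return function feed(text) {
    pending += text;
    for (;;) {
      var start = pending.search(openRe);
      if (start === -1) {
        // No record start yet. Keep only a possibly incomplete tag at the end.
        var lastLt = pending.lastIndexOf('<');
        pending = lastLt === -1 ? '' : pending.slice(lastLt);
        return;
      }
      var end = pending.indexOf(closeTag, start);
      if (end === -1) {
        // Record is still incomplete: drop what precedes it, wait for more.
        pending = pending.slice(start);
        return;
      }
      end += closeTag.length;
      onRecord(pending.slice(start, end));
      pending = pending.slice(end);
    }
  };
}

var feed = createRecordSplitter('product', function (recordXml) {
  // Each recordXml is a small, complete element, so a whole-document parser
  // such as xml2js's parseString could be applied to it here.
  console.log(recordXml.length);
});

var stream = fs.createReadStream('xml/bigxmlfile.xml');
stream.setEncoding('utf8');
stream.on('data', feed);
stream.on('error', function (err) {
  console.error('read failed: ' + err.message);
});
```

Because only one record is held in memory at a time, this keeps memory use roughly constant regardless of file size, which is the same property a proper stream parser gives you.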