How do you decode GTFS protobufs in Node?


I am trying to use https://github.com/dcodeIO/ProtoBuf.js to parse TriMet's GTFS-realtime data.

Here is the code I have so far. It parses the .proto file correctly and creates a builder with all the expected properties and methods, but it throws an error when I try to decode any data with it:

Error: Data must be corrupt: Buffer overrun

The proto file is from https://developers.google.com/transit/gtfs-realtime/gtfs-realtime-proto

var ProtoBuf = require('protobufjs')
  , request = require('request')

var transit = ProtoBuf.protoFromFile('gtfs-realtime.proto').build('transit_realtime')

request('http://developer.trimet.org/ws/V1/FeedSpecAlerts/?appID=618F30BB3062F39AF24AED9EC', parse)

function parse(err, res, body) {
  try {
    console.log(transit.FeedMessage.decode(res.body))
  } catch(e) {
    console.log(e)
  }
}

Thanks to Brian Ferris, I was able to parse the first part of the header (gtfs_realtime_version: "1"), but the parser fails on the next component (the timestamp, a uint64).



There are 5 answers

1
Brian Ferris On

I'm not a node expert, but the root message type of a GTFS-realtime feed is "FeedMessage":

https://developers.google.com/transit/gtfs-realtime/reference

You seem to be trying to parse the feed as an "Alert" message:

console.log(transit.Alert.decode(res.body))

Maybe try changing Alert to FeedMessage and see what happens?

1
vinayr On

This does not answer your question, but you can get the realtime feed as text using a URL like http://developer.trimet.org/ws/V1/FeedSpecAlerts/appid/618F30BB3062F39AF24AED9EC/text/true

Also have a look at the node-gtfs module.

0
Kirk On

I kept finding your question when searching for the same issue you were having and hopefully I can help someone else. After scouring the internet for a lot longer than I should have, I've come up with something that works. I didn't quite understand the data until I had a working feed decoded.

This appears to come down to how the data is read. I'm not a Node.js person, so I don't know why, but decoding works when the data is read with http and fails when it is read with request. I couldn't get the same approach to work with request.

I found part of this at https://github.com/dcodeIO/ProtoBuf.js/wiki/How-to-read-binary-data-in-the-browser-or-under-node.js%3F but I didn't yet understand how to use protobufjs, so I'm putting a working example here for others. Hope it helps.

var ProtoBuf = require('protobufjs');
var http = require("http");

// create a protobuf decoder
var transit = ProtoBuf.protoFromFile('gtfs-realtime.proto').build('transit_realtime');
// your protobuf binary feed URL
var feedUrl = "...";    

// HTTP GET the binary feed
http.get(feedUrl, parse);

// process the feed
function parse(res) {
    // gather the data chunks into a list
    var data = [];
    res.on("data", function(chunk) {
        data.push(chunk);
    });
    res.on("end", function() {
        // merge the data to one buffer, since it's in a list
        data = Buffer.concat(data);
        // create a FeedMessage object by decoding the data with the protobuf object
        var msg = transit.FeedMessage.decode(data);
        // do whatever with the object
        console.log(msg);
    }); 
}
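
The chunk-gathering step above can be tried in isolation, without a network connection or a .proto file. The following is only a minimal sketch: `collectBody` and `fakeResponse` are hypothetical names introduced for illustration, and a plain EventEmitter stands in for the real http response. The sample bytes are made up (a tag byte for field 3 followed by a varint), not taken from any feed.

```javascript
const { EventEmitter } = require('events');

// Collect every 'data' chunk from a response-like object and hand the
// callback one merged Buffer once 'end' fires. (Hypothetical helper.)
function collectBody(res, callback) {
  const chunks = [];
  res.on('data', function (chunk) { chunks.push(chunk); });
  res.on('end', function () { callback(Buffer.concat(chunks)); });
}

// Stand-in for the http response so the sketch runs offline.
const fakeResponse = new EventEmitter();
let collected = null;
collectBody(fakeResponse, function (body) { collected = body; });

// Two chunks whose boundary falls in the middle of a multi-byte varint.
fakeResponse.emit('data', Buffer.from([0x18, 0x80, 0x9c]));
fakeResponse.emit('data', Buffer.from([0xc9, 0x9b, 0x05]));
fakeResponse.emit('end');

// The merged Buffer holds the original bytes in order, untouched.
console.log(collected);
```

Because the chunks stay Buffers from start to finish, no text decoding ever touches the bytes, which is presumably why this approach decodes cleanly.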
1
JJones On

According to the Google Developers page at developers.google.com/transit/gtfs-realtime/examples/nodejs-sample, Google has now made a Node.js npm module available that makes this very easy:

npm install gtfs-realtime-bindings

Here's Google's code snippet (Apache 2.0 License):

var GtfsRealtimeBindings = require('gtfs-realtime-bindings');
var request = require('request');

var requestSettings = {
  method: 'GET',
  url: 'URL OF YOUR GTFS-REALTIME SOURCE GOES HERE',
  encoding: null
};
request(requestSettings, function (error, response, body) {
  if (!error && response.statusCode == 200) {
    var feed = GtfsRealtimeBindings.transit_realtime.FeedMessage.decode(body);
    feed.entity.forEach(function(entity) {
      if (entity.trip_update) {
        console.log(entity.trip_update);
      }
    });
  }
});
0
Alastair On

I was able to get this to work (with New York MTA feeds, anyway) by forcing the request module to have a null encoding, thus ensuring it returns a buffer instead of a string. Like so:

request({
    url: 'http://developer.trimet.org/ws/V1/FeedSpecAlerts/?appID=618F30BB3062F39AF24AED9EC',
    encoding: null
}, parse)

Then the parsing appears to work fine.
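
To see why a string body breaks decoding while a Buffer does not, here is a small offline sketch. It assumes the client decodes the body as UTF-8 text when no encoding is given. `encodeVarint` and `throughUtf8String` are hypothetical helpers written for this illustration, and the timestamp is an example value, not one read from the feed.

```javascript
// Encode a non-negative integer as a protobuf varint: 7 bits per byte,
// with the high bit set on every byte except the last.
function encodeVarint(value) {
  const bytes = [];
  while (value > 0x7f) {
    bytes.push((value % 128) | 0x80);
    value = Math.floor(value / 128);
  }
  bytes.push(value);
  return Buffer.from(bytes);
}

// Simulate a body that was decoded as UTF-8 text and then turned
// back into bytes for the protobuf decoder.
function throughUtf8String(buf) {
  return Buffer.from(buf.toString('utf8'), 'utf8');
}

// Plain ASCII, like the version string "1", survives the round trip.
const version = Buffer.from('1', 'ascii');
const versionSurvives = throughUtf8String(version).equals(version);

// A varint-encoded Unix timestamp contains bytes >= 0x80, which are
// not valid UTF-8 on their own and get replaced during decoding.
const timestamp = encodeVarint(1400000000);
const mangled = throughUtf8String(timestamp);
const timestampSurvives = mangled.equals(timestamp);

console.log('version intact:  ', versionSurvives);
console.log('timestamp bytes: ', timestamp);
console.log('after UTF-8 trip:', mangled);
console.log('timestamp intact:', timestampSurvives);
```

This would be consistent with the symptom in the question: the ASCII version string decodes, and the parser fails as soon as it reaches the varint timestamp.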