Jsdom throwing error for some URLs

I am new to Node.js. What I'm trying to do is scan all the URLs of my site (with JavaScript and jQuery enabled) and check whether each URL contains a given string.

To do this I'm using jsdom, but when I launch the script it extracts only some of the URLs and then crashes with this error:

timers.js:110
    first._onTimeout();
          ^
TypeError: Property '_onTimeout' of object [object Object] is not a function
at Timer.listOnTimeout [as ontimeout] (timers.js:110:15)

Surely something is wrong, but I don't understand where.

This is my script:

var request = require('request');
var jsdom = require('jsdom');

request({ uri: 'http://www.example.com' }, function (error, response, html) {
  if (!error && response.statusCode === 200) {

    // Build the document and let jsdom fetch and run the page's scripts.
    var doc = jsdom.jsdom(html, null, {
      features: {
        FetchExternalResources: ['script'],
        ProcessExternalResources: ['script'],
        MutationEvents: '2.0'
      }
    });

    var window = doc.createWindow();
    jsdom.jQueryify(window, 'http://code.jquery.com/jquery-1.5.min.js', function () {
      var $ = window.jQuery;
      $('a').each(function (i, element) {
        var a = $(this).attr('href');
        console.log(a);
        // Anchors without an href yield undefined, so guard before indexOf.
        if (a && a.indexOf('string') !== -1) {
          console.log('The winner: ' + a);
          //return a;
        }
      });
      window.close();
    });
  }
});

There is 1 answer

Farid Nouri Neshat

This happens because somewhere in your page a script calls setTimeout/setInterval with a string instead of a function. The string form is not supported in node, and it results in that error.
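For illustration, the offending pattern usually looks like the commented-out call below, and the line after it is the equivalent function form. This is only a sketch: showBanner is a hypothetical page function, not something taken from the question.

```javascript
// Hypothetical function defined by the page's own scripts.
function showBanner() {
  return 'banner shown';
}

// String form, common in older page scripts. Browsers evaluate the string,
// but in the setup from the question it ends in the _onTimeout error:
//   setTimeout("showBanner()", 1000);

// Function form: the callback is a real function, so the timer can call it.
var timer = setTimeout(function () {
  showBanner();
}, 1000);
```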

To find out where it is coming from, I suggest requiring the longjohn module (require('longjohn')). It gives you long stack traces, which will help you find the offending call. For example, I got something like this when I tried it in the REPL:

    at listOnTimeout (timers.js:110:15)
---------------------------------------------
    at startTimer (/home/alfred/repos/node_modules/jsdom/lib/jsdom/browser/index.js:75:15)
    at DOMWindow.setTimeout (/home/alfred/repos/node_modules/jsdom/lib/jsdom/browser/index.js:124:50)
    at file:///home/alfred/repos/repl:undefined:undefined<script>:1:1
    at Contextify.sandbox.run (/home/alfred/repos/node_modules/jsdom/node_modules/contextify/lib/contextify.js:12:24)
    at exports.javascript (/home/alfred/repos/node_modules/jsdom/lib/jsdom/level2/languages/javascript.js:5:14)
    at define.proto._eval (/home/alfred/repos/node_modules/jsdom/lib/jsdom/level2/html.js:1523:47)
    at /home/alfred/repos/node_modules/jsdom/lib/jsdom/level2/html.js:76:20
    at item.check (/home/alfred/repos/node_modules/jsdom/lib/jsdom/level2/html.js:345:11)

If that doesn't work for you, or you don't like it, I suggest modifying this jsdom file: node_modules/jsdom/lib/jsdom/browser/index.js, function startTimer. Throw an error there if the callback is not a function. The error is then thrown at the moment the offending code runs, so its stack trace points at the culprit.
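The same check can be sketched without editing files under node_modules, by wrapping the timer functions on the window instead. This is an alternative to patching startTimer, not jsdom's own code. The name guardTimer is made up for this example, and the commented usage assumes the window from the question can be patched before the page's scripts run.

```javascript
// Wrap a timer function (setTimeout or setInterval) so that a non-function
// callback fails immediately at the call site, not later inside timers.js.
function guardTimer(timerFn, name) {
  return function (callback, delay) {
    if (typeof callback !== 'function') {
      // Thrown synchronously, so the stack trace includes the offending script.
      throw new TypeError(name + ' called with a ' + typeof callback +
        ' instead of a function: ' + String(callback));
    }
    return timerFn.apply(this, arguments);
  };
}

// Hypothetical usage on the jsdom window from the question:
// window.setTimeout = guardTimer(window.setTimeout, 'setTimeout');
// window.setInterval = guardTimer(window.setInterval, 'setInterval');
```

Taking the timer function as a parameter lets one wrapper cover both setTimeout and setInterval.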

If you are running code that you can't change (for example from websites you don't own, which I don't recommend, because foreign JavaScript like that could be used to attack your app), you could override DOMWindow.setTimeout/.setInterval to support string arguments. You could also open an issue asking jsdom to offer this as an opt-in.
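A minimal sketch of such an override follows. The names allowStringCallbacks and evaluate are made up for this example, and runInWindow in the commented usage is a hypothetical helper standing in for whatever way you have of running code inside the jsdom window. Evaluating strings from pages you don't control carries the security risk mentioned above.

```javascript
// Wrap a timer function so a string callback is turned into a function first,
// mimicking what browsers do. `evaluate` receives the string and runs it; it is
// a parameter so the caller decides in which context the code executes.
function allowStringCallbacks(timerFn, evaluate) {
  return function (callback, delay) {
    var fn = callback;
    if (typeof callback === 'string') {
      fn = function () {
        evaluate(callback);
      };
    }
    return timerFn.call(this, fn, delay);
  };
}

// Hypothetical usage with the jsdom window from the question:
// window.setTimeout = allowStringCallbacks(window.setTimeout, function (code) {
//   runInWindow(window, code); // hypothetical helper, not a jsdom API
// });
```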