Limit speed of urlfetch per domain

Is there a way to limit the number of requests that urlfetch makes to any single server, per time unit?

I accidentally DoS'd a site I was crawling, because the async urlfetch API made the crawl branch out until it died (each request spawns more than one new request on average). The logs contain about 200 DeadlineExceeded errors, roughly a millisecond apart.

There is 1 answer

Avara On

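You could use time.sleep(), which suspends execution of the current thread for the given number of seconds. Calling it after each request spaces the requests out. Note that this slows down all requests equally, not each domain separately.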

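import time
import urllib2
[...]
for u in urls:
    # Fetch one URL, giving up after 4 seconds.
    urllib2.urlopen(u, timeout=4)
    # Pause for 1 second before the next request.
    time.sleep(1)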

https://docs.python.org/2/library/time.html#time.sleep
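
Since the question asks for a limit per domain, the same sleep idea can be keyed on the host name, so that only requests to a recently contacted server are delayed. The following is a minimal sketch, not part of the original answer: the class name DomainThrottle and its parameters (min_interval, clock, sleep) are hypothetical, it blocks the calling thread, and it is not thread-safe.

```python
import time

try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse      # Python 2


class DomainThrottle(object):
    """Enforce a minimum delay between requests to the same host.

    Hypothetical helper for illustration; clock and sleep are injectable
    so the behaviour can be checked without real waiting.
    """

    def __init__(self, min_interval=1.0, clock=time.time, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time the next request may start

    def wait(self, url):
        """Block until a request to url's host is allowed; return seconds slept."""
        host = urlparse(url).netloc
        now = self._clock()
        ready_at = self._next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the next slot for this host, starting from when this request runs.
        self._next_allowed[host] = max(now, ready_at) + self.min_interval
        return delay
```

Calling throttle.wait(u) immediately before each fetch in the loop above would delay only those requests whose host was contacted less than min_interval seconds earlier. With asynchronous fetches the call would have to happen before each request is issued, and because it blocks, a queue per host may be a better fit for a crawler that fans out quickly.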