wget not working in python


I am using the os module to issue a wget request from Python. It looks something like this:

os.system("wget 'http://superawesomeurl.com'")

If I issue the wget request straight from terminal, it works, but I have two problems:

  1. When I build this in Sublime, it gives me the error: sh: wget: command not found
  2. When I enter this into a Python shell, it sends the request but the response comes back as an error: 400 Bad Request

I noticed that other people don't use quotes around the URL, but in the terminal this is the only way it works. I am using Python 2.7.8 and running OS X Yosemite.


There are 3 answers

Hubert Grzeskowiak

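Your code should work if wget is on your PATH (the shell's search path for executables, not PYTHONPATH, which only affects where Python looks for modules).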

But seriously, do not use wget in Python!

Better to use a Python-native function like urlopen: https://docs.python.org/2/library/urllib2.html#urllib2.urlopen. It's as simple as this:

from urllib2 import urlopen
response = urlopen("http://stackoverflow.com").read()

Now response contains the entire HTML of the page as a string. If you want to iterate over it line by line, call readlines() on the object returned by urlopen instead of read(). To save the HTML to disk use:

download = open("index.html", "w")
# response is already a string here, so write it directly
download.write(response)
download.close()
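A slightly more defensive variant of the download step, as a rough sketch rather than the answer's own code: it streams the response straight to a file in binary mode and tries the Python 3 import first, falling back to urllib2 on Python 2. The helper name download is an assumption made up for this illustration.

```python
import shutil

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2

def download(url, filename):
    """Copy the body at `url` into `filename` (hypothetical helper)."""
    response = urlopen(url)
    try:
        # Binary mode avoids any newline or encoding translation.
        with open(filename, "wb") as out:
            shutil.copyfileobj(response, out)
    finally:
        response.close()
```

Calling download("http://stackoverflow.com", "index.html") would then save the page without holding the whole body in memory.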
krukita

Make sure wget is on your PATH (not PYTHONPATH, which is only used for Python imports):

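from subprocess import Popen, PIPE
# Popen objects have no read(); communicate() returns (stdout, stderr).
# -q silences the progress output and -O - makes wget write the page to stdout.
output = Popen(["wget", "-q", "-O", "-", "http://superawesomeurl.com"], stdout=PIPE).communicate()[0]
print(output)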
shrewmouse

I would guess one of two problems:

  1. wget is not in your path
  2. wget is not installed (I'm not trying to insult you)

From a bash terminal type which wget. That will tell you where wget is installed on your system.

[sri@localhost ~]$ which wget
/usr/bin/wget
[sri@localhost ~]$ 
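The same check can be made from inside Python, which may be more telling here: an editor such as Sublime can launch Python with a different PATH than your terminal has, which would explain "command not found" in one place but not the other. A minimal sketch, assuming Python 3.3+ for shutil.which (on Python 2.7, distutils.spawn.find_executable plays a similar role); the helper name find_tool is made up for this example.

```python
import os
import shutil

def find_tool(name):
    """Return the path of `name` as this Python process sees it, or None."""
    return shutil.which(name)

# Compare this output between the Sublime build and a terminal session.
print("PATH seen by Python:", os.environ.get("PATH", ""))
print("wget found at:", find_tool("wget"))
```

If the second line prints None under Sublime but a real path in the terminal, the PATH difference is the likely culprit.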

If which didn't locate wget then use find:

sudo find / -name wget

Once you know the path to wget, try adding the complete path to wget in your call to os.system:

[sri@localhost ~]$ which wget
/usr/bin/wget
[sri@localhost ~]$ python
Python 2.7.5 (default, Feb 19 2014, 13:47:40) 
[GCC 4.8.2 20131212 (Red Hat 4.8.2-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.system('/usr/bin/wget "www.asciitable.com"')
--2014-11-11 09:29:52--  http://www.asciitable.com/
Resolving www.asciitable.com (www.asciitable.com)... 192.185.246.35
Connecting to www.asciitable.com (www.asciitable.com)|192.185.246.35|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html.1’

    [ <=>                                                                ] 6,853       --.-K/s   in 0.006s  

2014-11-11 09:29:53 (1.06 MB/s) - ‘index.html.1’ saved [6853]

0
>>>
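The two steps above (locate wget, then call it by its full path) could be combined roughly as follows. This is only a sketch under assumptions: it targets Python 3.3+, the helper name run_tool is invented for illustration, and it uses subprocess.call with an argument list instead of os.system, so the URL needs no shell quoting at all.

```python
import shutil
import subprocess

def run_tool(tool, args):
    """Resolve `tool` to a full path, then run it without a shell."""
    path = shutil.which(tool)
    if path is None:
        raise FileNotFoundError("%s not found on PATH" % tool)
    # Each list element is passed to the program as one argument, unquoted.
    return subprocess.call([path] + list(args))

# Hypothetical usage with the URL from the question:
# exit_code = run_tool("wget", ["http://superawesomeurl.com"])
```

An exit code of 0 corresponds to the 0 printed at the end of the session above.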