Web scraping with Python to snatch a file


Hi, I want to grab the CSV file behind the [download] link in the page output below. I'm new to Python and have gotten this far. Can someone build on what I have? Many thanks.

from requests import session
import bs4

payload = {
    'action': 'login',
    'username': 'xxxxxxx',
    'password': 'zzzzzz'
}

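with session() as c:
    # Log in first so the session keeps the authentication cookies
    c.post('https://www.zuora.com/apps/newlogin.do', data=payload)

    # Fetch the page that contains the [download] link (my target URL)
    response = c.get('https://www.zuora.com/apps/JournalRuns.dox?method=view&number=JR-00000119')
    # Name the parser explicitly so the result does not depend on what is installed
    soup = bs4.BeautifulSoup(response.text, 'html.parser')
    print(soup)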

</td>
<!--z:field end-->
<!--z:field begin-->
<td id="" rowspan="2" style="padding-left:10px;padding-top:10px;vertical-align:top">
<!--z:label.link begin-->
<span> <a href="JournalEntries.dox?method=view&amp;number=JE-00000721" id="">JE-00000721</a></span>
<!--z:label.link end--><br/>
<font color="gray"><!--z:label.text begin-->
<span class="text" id="">126 Transaction(s)</span>
<!--z:label.text end--></font><br/>
<!--z:label.link begin-->
<span> <a href='javascript:downloadTansactions("JournalEntries.dox?method=downloadTransactions&amp;number=JE-00000721");' id="">[download]</a></span>
<!--z:label.link end-->
</td>

There is 1 answer

twtr On

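Are you asking how to do it, or are you asking for someone to do it for you? First you will need to traverse the HTML to retrieve the download link. Note that the [download] anchor's href is a javascript: call, so the relative URL you want is the string argument inside it. Once you have that URL, request it with the same logged-in session.

The BS4 documentation has instructions on how to traverse the HTML: http://www.crummy.com/software/BeautifulSoup/bs4/doc/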
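
To make the traversal step concrete, here is a minimal sketch of pulling the download URL out of the markup shown in the question. It is not BeautifulSoup: it uses the standard library's `html.parser` so it runs without extra installs, and the same hrefs could be collected with `soup.find_all('a', href=True)`. The names `DownloadLinkParser`, `find_download_urls`, and `JS_LINK` are made up for illustration, and it is only an assumption that the download endpoint returns the CSV in response to a plain GET from the logged-in session.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Matches href values like: javascript:downloadTansactions("...");
# The function name is spelled exactly as it appears in the page markup.
JS_LINK = re.compile(r'javascript:downloadTansactions\("([^"]+)"\)')


class DownloadLinkParser(HTMLParser):
    """Collect the relative URLs passed to downloadTansactions() in <a> hrefs."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # Attribute values arrive already unescaped (&amp; becomes &)
        href = dict(attrs).get("href") or ""
        match = JS_LINK.search(href)
        if match:
            self.links.append(match.group(1))


def find_download_urls(html, page_url):
    """Return absolute download URLs found in the page HTML."""
    parser = DownloadLinkParser()
    parser.feed(html)
    # The hrefs are relative, so resolve them against the page they came from
    return [urljoin(page_url, link) for link in parser.links]


# Hypothetical usage inside the logged-in session from the question
# (assumes the endpoint serves the CSV directly):
#
#     page_url = 'https://www.zuora.com/apps/JournalRuns.dox?method=view&number=JR-00000119'
#     page_html = c.get(page_url).text
#     for i, url in enumerate(find_download_urls(page_html, page_url)):
#         with open('transactions_%d.csv' % i, 'wb') as f:
#             f.write(c.get(url).content)
```

If the site builds the real download request inside the `downloadTansactions` JavaScript function (extra parameters, a POST, a token), the URL extracted here will not be enough on its own, and you would need to check that function's source to see what it actually sends.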