Downloading files from protected website with VBA-Excel

393 views Asked by At

I'm trying to download a file from a protected webpage (from my job, so I cannot post the url).

When I save the bytes (I'm using a WinHTTP request), in an .xls, the file appears to be corrupted.

My code (without sensitive information) and a little code taken by "HTTP Header Live".

option explicit

Sub SaveFileFromURL()

Dim FileNum As Long
Dim FileData() As Byte
Dim WHTTP As Object
Dim SetCookieString As String
Dim mainUrl As String
Dim fileUrl As String
Dim SavePath As String
Dim strAuthenticate As String

'this is the url to login, extracted with HTTP Header Live.
'however, the url shown in the browser is: "https://www.website.com/retro/default.asp?idioma=ES"
mainUrl = "https://www.website.com/retro/logincheck.asp"

'this is the url for to download the file, but need to send a cookie as credentials
'(if you login via browser and paste the link, it will popup the saveas dialog, but if you don't login, it will return an internal error)
fileUrl = "https://www.website.com/retro/VerBordero.asp?id=27348&p=3º Trimestre 2019&n=0&m=UNKNOWN&con=CIRCULAR&fmt=xls"

'the path where to save the file with the extension I know it will have.
SavePath = "C:\Users\Victor\Desktop\bordero.xls"

'authetication should be:    
strAuthenticate = "txtUser=MYUSER&txtpwd=MYPASS&lg=es"

'I login to catch the cookie that it is suppose to allow me to download the file.
Set WHTTP = CreateObject("WinHTTP.WinHTTPrequest.5.1")
WHTTP.Open "POST", mainUrl, False
WHTTP.Send strAuthenticate

'if it is correct:
If WHTTP.Status = 200 Then

    'I get the cookie
    SetCookieString = WHTTP.getResponseHeader("Set-Cookie")

    'Then you have to GET direct file url
    WHTTP.Open "GET", fileUrl, False
    WHTTP.setRequestHeader "Content-Type", "application/x-msexcel"
    WHTTP.setRequestHeader "Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
    WHTTP.setRequestHeader "Accept-Language", "es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3"
    WHTTP.setRequestHeader "Cookie", SetCookieString
    WHTTP.Send

    'if i get the file
    If WHTTP.Status = 200 Then

        'download bytes
        FileData = WHTTP.responseBody
        Set WHTTP = Nothing

        'Save the bytes into file
        FileNum = FreeFile
        Open SavePath For Binary Access Write As #FileNum
            Put #FileNum, 1, FileData
        Close #FileNum

    End If

End If

End Sub

The code runs. I got "OK" in all checked status but when I try to open the file (that have 622kb, the same as if I download it manually), Excel prompts me with a message saying that there is too much data for a cell. If I click on "open anyway", I can read some parts.

It looks like the file is being saved in different encoding.

Some code from HTTP Header Live.

LOGIN:
https://www.website.com/retro/logincheck.asp
Host: www.website.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded
Content-Length: 38
Origin: https://www.website.com
Connection: keep-alive
Referer: https://www.website.com/retro/default.asp?idioma=ES
Cookie: cc_social=yes; cc_analytics=yes; cc_advertising=yes; cc_necessary=yes; _ga=GA1.2.859443177.1570897852; __utma=136336428.859443177.1570897852.1570897852.1570897852.1; __utmz=136336428.1570897852.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic| utmctr=(not%20provided); ASPSESSIONIDQWSBSQAD=LDIIIENCGKDHMEJEGKJKBDMM; cookieconsent=dismiss; ASPSESSIONIDQWQCRRBD=PCOIENHDNLHCKNODMJLKAFGM
Upgrade-Insecure-Requests: 1
txtUser=MYUSER&txtpwd=MYPASS&lg=es
POST: HTTP/2.0 302 Found
date: Wed, 04 Dec 2019 22:20:17 GMT
server: Microsoft-IIS/8.5
cache-control: private
content-type: text/html
expires: Wed, 04 Dec 2019 22:20:18 GMT
location: retro.asp
content-length: 130
via: 2.0 www.website.com
X-Firefox-Spdy: h2

FILE:
https://www.website.com/retro/VerBorderoGRxls.asp?id=27348&p=3º Trimestre 2019&n=0&m=UNKNOWN&con=CIRCULAR&fmt=xls
Host: www.website.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8
Accept-Language: es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Referer: https://www.website.com/retro/borderos_resumen.asp
Cookie: cc_social=yes; cc_analytics=yes; cc_advertising=yes; cc_necessary=yes; _ga=GA1.2.859443177.1570897852; __utma=136336428.859443177.1570897852.1570897852.1570897852.1; __utmz=136336428.1570897852.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic| utmctr=(not%20provided); ASPSESSIONIDQWSBSQAD=LDIIIENCGKDHMEJEGKJKBDMM; cookieconsent=dismiss; ASPSESSIONIDQWQCRRBD=PCOIENHDNLHCKNODMJLKAFGM
Upgrade-Insecure-Requests: 1

GET: HTTP/2.0 200 OK
date: Wed, 04 Dec 2019 22:21:40 GMT
server: Microsoft-IIS/8.5
cache-control: private
content-type: application/x-msexcel
expires: Wed, 04 Dec 2019 22:21:40 GMT
content-length: 637440
via: 2.0 www.website.com
X-Firefox-Spdy: h2

0

There are 0 answers