Read straight web content with Excel VBA

242 views Asked by At

there are many article on this site on how to read tags and tables in web sites with Excel VBA, but I am stuck here.

This website gives me business locations after entering a Zip code. ("Where is the closest location relative to my Zip Code")

I managed to navigate to the site, enter the Zip code and click Submit:

Dim Browser As SHDocVw.InternetExplorer
Dim HTMLDoc As MSHTML.HTMLDocument


Set Browser = New SHDocVw.InternetExplorer                     ' create a browser
Browser.Visible = True                                   ' make it visible
Application.StatusBar = ".... opening page"
Browser.navigate "https://www.thewebsite.com"            ' navigate to page
WaitForBrowser Browser, 1                                ' wait for completion or timeout

Application.StatusBar = "gaining control over DOM object"
Set HTMLDoc = Browser.document                         ' load the DOM object
WaitForBrowser Browser, 1

HTMLDoc.getElementById("ZipCode").Value = "28278"
HTMLDoc.getElementById("localTeamZipSubmit").Click

The site opens and the relevant content looks like this:

 <div>
        <div class="columns">
            <div class="column boldText paddingFive" style="padding-left: 20px; width: 70px;">
                Location:
            </div>
            <div class="column paddingTopFive">CHARLOTTE</div>
        </div>
        <div class="columns">
            <div class="column boldText paddingFive" style="padding-left: 20px; width: 120px;">
                Location Number:
            </div>
            <div class="column paddingTopFive">102340</div>
        </div>
        <div class="columns">
            <div class="column boldText paddingTopFive paddingLeftTwenty" style="vertical-align: top;">
                Address:
            </div>
            <div class="column paddingTopFive paddingLeftTwenty">
                <div>8848 Main St.</div>
                <div>Suite F</div>
                <div></div>
                <div>Charlotte, NC 27218</div>
            </div>
        </div>
            <div class="columns">
                <div class="column boldText paddingFive" style="padding-left: 20px; width: 70px;">
                    Phone:
                </div>
                <div class="column paddingTopFive">(704) 911-4440</div>
            </div>
                        <div class="columns">
                <div class="column boldText paddingFive" style="padding-left: 20px; width: 70px;">
                    Fax:
                </div>
                <div class="column paddingTopFive">(704) 911-4441</div>
            </div>
    </div>

As you can see, this section has no table, no named tags and classes that are use over and over. I was not able to read this information yet. I would be happy to get the whole blob into a String and parse it"

"Text = HTMLDoc.getEverything()"

Thanks a lot for your help!!!


In the meantime I found another code snippet that I modified but I am getting stuck at the same point:

Post and submit works but how to get the answer....

 {   Private Sub PostalCodes()
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
On Error GoTo errHandler
ie.Visible = 1

With ie
    .navigate "https://www.pattersondental.com/ContactUs/MyLocalTeam"
    Do While .busy: DoEvents: Loop
    Do While .ReadyState <> 4: DoEvents: Loop
    With .document.Forms("GetBranchFromZipForm")
        .ZipCode.Value = "28273"
        .submit
    End With
'    Do While Not CBool(InStrB(1, .document.URL, _
 '       "cp_search_response-e.asp"))
 '       DoEvents
  '  Loop
    Do While .busy: DoEvents: Loop
    Do While .ReadyState <> 4: DoEvents: Loop
'    MsgBox .document.all.tags("Colums").Item(1).Rows(1).Cells(1).innerText
     MsgBox .document.all.tags("Colums").innerText
  '  MsgBox .document}

I guess I have to search no for "how to dissect a HTML document"...

Add on:

It seems that while ie is a valid item (in the watch window) IE.Document is empty... why can this be, The website is still there with new data. I even tried another code snippet that looks for open websites in IE, it finds the site (with the correct data) but the document is still empty and getelementBY... does not find anything of course. I am about to start drinking...

1

There are 1 answers

0
Johns On

I can't believe it.

After 3 days of poking I found this:

  With ActiveSheet.QueryTables.Add(Connection:="URL;     
  https://www.pattersondental.com/ContactUs/MyLocalTeam",   
  Destination:=Range("A1"))
 .PostText = "ZipCode=70032"
 .RefreshStyle = xlOverwriteCells
 .SaveData = True
 .Refresh

I don't pretend to understand why it works, but is does.

John, I will still check out, what you suggested. Thanks