How to follow multiple hyperlinks and extract webpage data

How to follow multiple hyperlinks and extract webpage data can be useful in a variety of situations. For example, a person may like to automate the process of extracting the sales price and then maximum retail price (mrp) of a product from an ecommerce website. Or, somebody may wish to login into a specific website and extract inventory data automatically. In our Excel training video today, we learn to follow many links on a website and navigate to a specific webpage from which we extract specific data for our database.

 

Watch this training video on YouTube.

Here’s the complete VBA macro code:

Sub testweb()

‘Below is a label
mystart:

‘ First we create an Internet Explorer object, specify its location and size and ensure that the ‘webpage is completely loaded before we can perform automatic actions

Set objIE = CreateObject(“InternetExplorer.Application”)
objIE.Top = 0
objIE.Left = 0
objIE.Width = 1600
objIE.Height = 900
objIE.Visible = True ‘We can see IE

On Error Resume Next
objIE.navigate (“http://www.amazon.in/”)

Do
DoEvents
If Err.Number <> 0 Then
objIE.Quit
Set objIE = Nothing
GoTo mystart:
End If
Loop Until objIE.readystate = 4

‘we define an object variable Alllinks and loop through all the links to search for

Set Alllinks = objIE.document.getElementsByTagName(“A”)
For Each Hyperlink In Alllinks
If InStr(Hyperlink.innerText, “Micromax Mobiles”) > 0 Then
Hyperlink.Click
Exit For
End If
Next

‘we use a timer to wait for 2 seconds

Application.Wait (Now + TimeValue(“0:00:02”))

‘this code works as above

Set aAlllinks = objIE.document.getElementsByTagName(“A”)
For Each Hyperlink In aAlllinks
If InStr(Hyperlink.innerText, “With Dual SIM Facility”) > 0 Then
Hyperlink.Click
Exit For
End If
Next

Application.Wait (Now + TimeValue(“0:00:02”))

Set bAlllinks = objIE.document.getElementsByTagName(“A”)
For Each Hyperlink In bAlllinks
If InStr(Hyperlink.innerText, “Micromax Unite 3 Q372 (Blue, 8GB)”) > 0 Then
Hyperlink.Click
Exit For
End If
Next

Application.Wait (Now + TimeValue(“0:00:02”))

‘ now we locate a string and extract the number of characters we need

‘ we study the webpage thoroughly

Dim strCountBody As String
Dim startPos As Long
Dim startPos2 As Long
Dim endPos As Long
Dim endPos2 As Long
Dim textWanted As String
Dim textWanted2 As String
strCountBody = objIE.document.body.innerText
startPos = InStr(1, strCountBody, “Sale:”)
‘MsgBox startPos
endPos = startPos + 16
‘MsgBox endPos
textWanted = Mid(strCountBody, startPos, endPos – startPos)
textWanted = Right(textWanted, 8)

Application.Wait (Now + TimeValue(“0:00:02”))

startPos2 = InStr(1, strCountBody, “M.R.P.:”)
‘MsgBox startPos
endPos2 = startPos2 + 16
‘MsgBox endPos
textWanted2 = Mid(strCountBody, startPos2, endPos2 – startPos2)
textWanted2 = Right(textWanted2, 8)

‘we transfer the data to the worksheet

Range(“A2”) = objIE.document.Title
Range(“B2”) = textWanted
Range(“C2”) = textWanted2
Range(“A:B”).Columns.AutoFit

Application.Wait (Now + TimeValue(“0:00:02”))

Set cAlllinks = objIE.document.getElementsByTagName(“A”)
For Each Hyperlink In cAlllinks
If InStr(Hyperlink.innerText, “See more product details”) > 0 Then
Hyperlink.Click
Exit For
End If
Next

‘we quit the application Internet Explorer or close it
objIE.Quit

End Sub