Extract Specific Link From Web Pages Using Excel VBA

A website visitor wants help on how to automate the extraction of a specific hyper-link from web pages using Excel VBA
‘Hello Dr. Takyar,

It was pleasure talking to you the other day. Hope you and your friends/relatives were safe from the recent storm.

Here is the task that I would like to automate with Excel VBA and seeking help on. I liked your code for extracting table data from multiple web pages and creating and pasting it in respective worksheets – http://www.youtube.com/watch?v=qbOdUaf4yfI However my task is a bit different which is the following:

1) There are several (100s or 1000s) web-links in column A.
2) A connection is established with a web-link’s webpage.
3) On the connected webpage, the first HREF link that contains the text “About” is copied and it’s web-link is pasted in the adjacent cell in column B. If “About” HREF
link is not found on connected webpage, then the cell is populated with “NOT FOUND”.
4) The connection is closed and steps 1 – 3 are repeated until there are no more web-links in column A.

Some examples we can try with are the following web-links in
column A:

Column A Column B

1) https://www.google.com/ https://www.google.com/intl/en/about/
2) https://www.facebook.com/ https://www.facebook.com/facebook
3) http://www.youtube.com/ http://www.youtube.com/yt/about/
4) http://www.yahoo.com/ http://info.yahoo.com/
5) http://www.wikipedia.org/ NOT FOUND
6) http://www.linkedin.com/ http://www.linkedin.com/about-us

Note:
1) When a respective “About” HREF link is copied from the webpage of a web-link in Column A, and the same is pasted in
Column B, the web-link of the “About” HREF link is displayed in Column B.
2) “About” HREF links on webpages can be one of several variations like – About, About Us, About “Company_Name”, etc. Whatever it is, we want to capture the first HREF link that contains the text – “About”.
Please feel free to email me with any questions or if you need me to call you.

best wishes,

Romi Ghose’

Here is the complete VBA with proper remarks:
Sub GetAboutUsLinks()
‘First define all the variables
Dim ie As Object ‘Internet Explorer
Dim html As Object ‘ HTML document
Dim myLinks As Object ‘ Links collection
Dim myLink As Object ‘Single Link
Dim result As String
Dim myURL As String ‘Web Links on worksheet
Dim LastRow As Integer ‘ VBA execution should stop here

Set ie = CreateObject(“InternetExplorer.Application”)

LastRow = Sheet1.Cells(Rows.Count, “A”).End(xlUp).Row
‘Loop through all the web links on the worksheet one by one and then do some things
For i = 2 To LastRow
‘Get the link from the worksheet and assign it to the variable
myURL = Sheet1.Cells(i, 1).Value
‘Now go to the website
ie.navigate myURL
‘Keep the internet explorer visible
ie.Visible = True
‘Ensure that the web page has downloaded completely
While ie.readyState <> 4
DoEvents
Wend
‘Get the data from the web page that is in the links and assign it to the variable
result = ie.document.body.innerHTML
‘create a new html file
Set html = CreateObject(“htmlfile”)
‘now place all the data extracted from the web page into the new html document
html.body.innerHTML = result

Set myLinks = html.getElementsByTagName(“a”)
‘loop through the collected links and get a specific link defined by the conditions
For Each myLink In myLinks
If Right$(myLink, 13) = “about-us.html” Or Right$(myLink, 10) = “about.html” Or Right$(myLink, 8) = “about-us” Or Right$(myLink, 5) = “about” Then
Sheet1.Cells(i, “B”).Value = myLink
End If
‘go to the next link
Next myLink
‘once the last web link on the sheet has been visited close the internet explorer
If i = LastRow Then
ie.Quit
End If
‘ go to the next web link on the worksheet
Next i

End Sub

Watch the video:


Further reading:
Using Web Queries and a Loop to Download 4000 Database Entries from 4000 Web Pages

2 thoughts on “Extract Specific Link From Web Pages Using Excel VBA

  1. Prabu

    Hello Sir,
    I am beginner in Excel Vba, lot of information am learning from this website . Thank you very much.

    Am facing an issue while opening web link using hyperlink.
    There are several web links (more than 1000) saved in Column A.

    A2 – http://www.google.com
    A3 – http://www.yahoo.com
    A3 – http://www.gmail.com

    I will check each link one by one and update the status in Column B. I have created a macro for hyperlink, if my cursor is “B2” column(short cut key created) then “A2” web link will open automatically in web page. But second web link open in new tab instead of same tab. Please help to open all links one by one in same tab instead of different tabs.

    Regards,
    Prabu

    Reply
    1. Starr

      I am a complete NOOB. I really like your videos, even if you do move kind of quickly, lol.I have questions regarind this video.1) It appears as though 2/3 of the work you outline is to get SPreocvery on the phone. In my case I followed your latest vids and got 0.99.3b installed and working fine. I also got my phone rooted with Pet’s ESE81 rooted rom. So far so good! CAN WE SKIP THE DROID ROOT HELPER part of these vids?2) what is busy box and why do i need it to install kesnrl+eomr?

      Reply

Leave a Reply

Your email address will not be published. Required fields are marked *