0% found this document useful (0 votes)
6 views

Automate the Web - create web automation bots with Excel & VBA

The document provides a guide on how to create web automation bots using Excel and VBA, specifically for automating Internet Explorer. It outlines the necessary prerequisites, including basic knowledge of Excel, VBA, and HTML, and walks through the steps to build a simple keyword search bot that retrieves and displays search results in an Excel sheet. The guide includes detailed code examples and explanations of how to manipulate web elements through VBA to achieve automation tasks.

Uploaded by

phuockcn
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
6 views

Automate the Web - create web automation bots with Excel & VBA

The document provides a guide on how to create web automation bots using Excel and VBA, specifically for automating Internet Explorer. It outlines the necessary prerequisites, including basic knowledge of Excel, VBA, and HTML, and walks through the steps to build a simple keyword search bot that retrieves and displays search results in an Excel sheet. The guide includes detailed code examples and explanations of how to manipulate web elements through VBA to achieve automation tasks.

Uploaded by

phuockcn
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 16

8/19/2020 Automate the Web - create web automation bots with Excel & VBA

Excel + VBA + IE = web automation

Learn to write Excel VBA code to automate your


web browser.

Excel = Microsoft Excel


VBA = Visual Basic for Applications, the macro programming language within
Excel
IE = Internet Explorer

Most browser automation tools are


complex, programmed by software
engineers. But because XL and IE –both
Microsoft products– play so well together,
normal folks like you and I can build some
pretty cool things too.

Web browsing & interaction, sending &


replying to emails, social media interaction,
blogging/posting/content editing, product
searching/comparing/price-
checking/buying, all kinds of stuff … you can build a robot to do it for you, and do it faster,
and around the clock.

TL;DR »

Let’s start by building a basic web bot to open a search page, submit a search request,
evaluate and scrape the results page, and return the results to Excel.

1. Excel Basics – but first, if you’re new to Excel you should start here.

automatetheweb.net 1/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

2. VBA Basics – it would also help to know some VBA fundamentals (eg. what’s a
variable, what’s a subroutine, for-next loops, the VBA Editor, etc) — these courses are
good.

3. HTML and the DOM – and for creating web-bots, you should understand basic HTML
structure – how a webpage is laid out – the Document Object Model.

Ok, you’re grounded in the basics. Now let’s jump into building a simple bot to search for a
keyword and return the results…

A beginner web automation project: keyword search bot


Let’s create a web bot that goes to my new favorite search engine, enters a keyword,
submits the query, then scans the results and prints them to an Excel sheet, highlighting
any links from yellowpages.com.

Begin by opening Excel and start a new, blank workbook.

Enter a keyword – let’s say auto parts – in cell A2 , and a location – let’s say Houston TX
– in cell C1 .

Excel workbook with Developer tab selected

Click the Developer tab (not showing?), and click Visual Basic .

automatetheweb.net 2/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

Then click Tools › References , and add the reference libraries Microsoft HTML Object
Library and Microsoft Internet Controls …

VBA – default reference libraries

VBA – add these web-related reference libraries

Next, in the Visual Basic Editor (VBE), click View › Project Explorer .

Then, right-click VBAProject , and Insert › a new Module …

automatetheweb.net 3/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

VBA – inserting a new module

Now copy & paste the code below into Module1 …

'start a new subroutine called SearchBot


Sub SearchBot()

'dimension (declare or set aside memory for) our variables


Dim objIE As InternetExplorer 'special object variable representing the I
Dim aEle As HTMLLinkElement 'special object variable for an <a> (link) el
Dim y As Integer 'integer variable we'll use as a counter
Dim result as String 'string variable that will hold our result link

'initiating a new instance of Internet Explorer and asigning it to objIE


Set objIE = New InternetExplorer

'make IE browser visible (False would allow IE to run in the background)


objIE.Visible = True

'navigate IE to this web page (a pretty neat search engine really)


objIE.navigate "https://duckduckgo.com"

'wait here a few seconds while the browser is busy


Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

'in the search box put cell "A2" value, the word "in" and cell "C1" value
objIE.document.getElementById("search_form_input_homepage").Value = _
Sheets("Sheet1").Range("A2").Value & " in " & Sheets("Sheet1").Range("C

'click the 'go' button


objIE.document.getElementById("search_button_homepage").Click

'wait again for the browser


Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

automatetheweb.net 4/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

'the first search result will go in row 2


y = 2

'for each <a> element in the collection of objects with class of 'result_
For Each aEle In objIE.document.getElementsByClassName("result__a")

'...get the href link and print it to the sheet in col C, row y
result = aEle
Sheets("Sheet1").Range("C" & y).Value = result

'...get the text within the element and print it to the sheet in col
Sheets("Sheet1").Range("D" & y).Value = aEle.innerText
Debug.Print aEle.innerText

'is it a yellowpages link?


If InStr(result, "yellowpages.com") > 0 Or InStr(result, "yp.com") >
'make the result red
Sheets("Sheet1").Range("C" & y).Interior.ColorIndex = 3
'place a 1 to the left
Sheets("Sheet1").Range("B" & y).Value = 1
End If

'increment our row counter, so the next result goes below


y = y + 1

'repeat times the # of ele's we have in the collection


Next

'add up the yellowpages listings


Sheets("Sheet1").Range("B1").Value = _
Application.WorksheetFunction.Sum(Sheets("Sheet1").Range("B2:B100"))

'close the browser


objIE.Quit

'exit our SearchBot subroutine


End Sub

Your module should look like the image below. To run this subroutine you’d press the green
play button or just press F5 . But first, let’s have a closer look. Click the first line, Sub
SearchBot() . Then press F8 to begin stepping through the code line-by-line (also called
debugging).

automatetheweb.net 5/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

VBA code module in VB Editor

Each time you press F8 the line of code highlighted in yellow will execute, and then the
next line down of code will turn yellow and wait for your key press. Say you change the
code in a line and want to re-run it… you’d move the yellow line back up by dragging the
yellow arrow with your mouse, or by clicking the line you want and then pressing
CTRL + F9

Notice that after executing the line objIE.Visible = True , Internet Explorer opens? Now
position all 3 applications (Excel, VBE, and IE) on the screen so that each is visible, and
continue stepping through the code using F8 .
automatetheweb.net 6/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

you want. CamelCase naming convention, with first letters capitalized, is typical for naming
subroutines and functions.

Dim objIE as InternetExplorer

Think of variables (like objIE) as floating references or containers that refer to objects or
numbers we want to manipulate. But before we get to use them, we have to declare, or
Dim , them, which sets aside some memory for whatever kind of use we have in mind for
them. To manipulate Internet Explorer as an object in VBA, we need to Dim the browser —
either generically as an Object variable (an example of late binding objects), or more
specifically as an InternetExplorer special object variable like we did here (an example
of early binding objects).

Right now don’t worry about early vs. late


binding… there are pros and cons for each
method… but one advantage of early binding is
VBA knows exactly what you intend for the
variable, and therefore can show you its
Intellisense (helper hints) popup menu when
typing code after a specified object variable,
VBA Intellisense code helper
which is great for helping beginners snoop
around and see different things the object could
possibly do. Why call it objIE? No special reason, aside from making it easy to remember
what it’s for. Names of variables can be almost anything you want, and camelCase with
first letter lowercase is the usual naming convention.

Dim aEle As HTMLLinkElement

Here we reserve aEle as a different type of object variable. Specifically, we early bind it for
use as a webpage link (the <a> tags on a webpage). The other way would be to declare it
more generically as Dim aEle as Object then later put it to use in one or more specific
roles as an object — either way works. Once again, I could choose to call it anything, even
mySuperHrefThingy, but I like the simplicity of aEle because it’s short and makes me think
<a> element.

Dim y as Integer

Let’s use the variable y to refer to the row number on the Excel sheet where we’ll have the
bot put our search results. Integer type variables in VBA are round numbers between
automatetheweb.net 8/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

What’s happening here…

Let’s look at the code line by line. Watch the video too if you want.

'start a new subroutine called SearchBot

The first thing you’ll notice are lots of comments I added to the code. VBA ignores lines
beginning with a single quote ( ' ), highlighting them green, as comments. Adding your own
personal comments to your code is an especially good idea for beginners, or for more
complex code that requires a memory-jogger when you go back to it.

Sub SearchBot()

This tells VBA we are starting a new subroutine, or Procedure . A module can contain
subroutines ( Sub … End Sub ) and functions ( Function … End Function ). In the video,
we start the subroutine by putting the cursor within the code and pressing F5 (run) or F8
(execute line-by-line). But usually a Sub or Function is called by another Sub or
Function , or by a launch button. The name of a Sub or Function can be almost anything

automatetheweb.net 7/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

-32,768 and 32,768. If we needed a variable to hold a value greater than 32,768 we’d Dim
y as Long instead. Or if we needed it to have decimals we’d use Dim y as Double .

Dim result as String

For each aEle we find, we’ll copy its href value over to a string (alphanumeric text) variable
called result, then print the string to the Excel sheet. Ok, I admit we don’t really need this –
we could just print aEle’s value to the sheet directly– but I added it to introduce you to
string variables.

Set objIE = New InternetExplorer

Setting an object instantiates or activates a new instance of its object type. So we’re saying
let’s start a new instance of the IE browser.

objIE.Visible = True

Awesome – your bot just opened your IE browser! Properties (characteristics, such as
making IE Visible) and Methods (actions) are invoked after typing a dot ( . ) after an object,
like you see here. In your VBE, delete this line and start retyping it. You’ll notice after typing
objIE. that the Intellisense helper pops up, listing all the properties and methods
available for the InternetExplorer object. Go ahead and scroll through the list.

objIE.navigate "https://duckduckgo.com"

The .navigate event tells IE to navigate to whatever webpage is within the quotes. Yay
automation time!

Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

But the code wants to run all at once, in a millisecond. By the time IE actually loads the
page of search results, the entire subroutine is long over. We have to slow things down and
wait for the browser while its navigating. We could use something like
Application.Wait(Now + TimeValue("00:00:05")) to wait 5 seconds, but if the page
loads faster we’re wasting time waiting for nothing, and if the page takes longer we’re in
the same pickle as before. Do While objIE.Busy = True Or objIE.readyState <> 4:
DoEvents: Loop is an elegant Do...Loop solution consisting of 3 VBA statements we’ve
tidied up on one line, using colons ( : ) to connect them. It’s saying: while IE is busy
navigating, just sit here and run in circles. objIE.Busy = True and objIE.readyState <>
4 ( <> means not equal) are essentially the same thing –one could probably be ommitted–

automatetheweb.net 9/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

but just to make sure the page is really done loading we make both of these conditions
requirements for getting out of the loop, by using the logical operator Or . The DoEvents
part is technically not needed but we add it in the loop to free up processing resources and
allow other things to happen within Excel during this time, asynchronous to the running
loop. For the over-achievers out there, here are some other ways we could have written
this line of code to do the same thing.

objIE.document.getElementById("search_form_input_homepage").Value = _
Sheets("Sheet1").Range("A2").Value & " in " &
Sheets("Sheet1").Range("C1").Value

This VBA statement says: Find the search box element on the webpage by its ID , which
is search_form_input_homepage , and make its text value what we have in cell A2 , plus
the string " in ", plus what we have in cell C1 . The underscore ( _ ) just lets us put longer
pieces of code on multiple lines for better readability. How did I know what the search
input box’s ID was, or that it even had an ID ? Firebug of course! If you don’t have
Firefox get it, and then install the Firebug add-on. Then right-click the search box, click
Inspect Element with Firebug , and behold…

automatetheweb.net 10/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

Firefox Firebug Inspector

automatetheweb.net 11/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

Firebug Console – highlighted code

…the line of code is highlighted in the Firebug console (I added the red box around the
ID ). The image above reminds us a webpage is organized in a nested, tree-like structure,
starting with .document as the root object and branching out from there into smaller
objects. This is called the Document Object Model. .getElementById("__") is the go-to
tool for grabbing one of those objects from the webpage. With it you can scrape data from
a page, as well as writing to a webpage as we’re doing here. Here’s a sneak peak at some
other ways we could have found the search box and entered data into it (shown in the
order I would have tried them, if an ID was not available).

objIE.document.getElementById("search_button_homepage").Click

I used Firebug to find the ID of the search button (magnifying glass) on the page, then
used .getElementById again to grab it, and the .Click method to click it. Alternatively,
we could have submitted the search box form like this instead
objIE.document.getElementById("search_form_homepage").Submit . Actually, there’s a
simpler method still that would allow us to skip the previous 3 steps altogether. Can you
see it? Hint: take a look at the url we ended up with… https://duckduckgo.com/?
q=auto+parts+in+Houston+TX. Got it? If you still don’t see it, and you don’t think you’ll be
able to sleep tonight, email me 🙂

Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

We need to wait for the browser for a second (an eternity in CPU land) while the search is
running.

y = 2

Now we have a page full of search results. We’ll put them in column C , but we need to
start in row 2 since C1 contains our heading Houston TX.

For Each aEle In objIE.document.getElementsByClassName("result__a")

This is the opening statement in our action-packed For...Next loop, where the procedure
seeks out each aEle (link) object that’s found in the collection of all objects on the page
that have a class of result__a (again I used Firebug to locate this particular class name to

automatetheweb.net 12/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

target). For each match found, we temporarily assign it to the aEle variable and do the
following with it…

result = aEle
Sheets("Sheet1").Range("C" & y).Value = result

We take the found <a> element ( aEle ) and assign its value to a string variable, result .
Ok, we really didn’t need to do this… we could just work with aEle directly without passing
it around… I just wanted to demonstrate what a string variable is! Then we print the string
to the sheet in column C , row y .

Sheets("Sheet1").Range("D" & y).Value = aEle.innerText

Let’s say we also grab the anchor text for each link and plop that into the cell to the right.
The .innerText property (or .textContent ) gets us that text. Side-note: I’ve built many
bots with the code If InStr(objIE.document.body.innerText, "some text") > 0 then
... where I needed to know if the web page had some text in it somewhere.

Debug.Print aEle.innerText

We’ll also print this text in the debugging console, the Immediate Window , as well (note:
you may have to press CTRL + G to show the Immediate Window , or open it from the
View menu).

If InStr(result, "yellowpages.com") > 0 Or InStr(result, "yp.com") > 0 Then


Sheets("Sheet1").Range("C" & y).Interior.ColorIndex = 3
Sheets("Sheet1").Range("B" & y).Value = 1
End If

Let’s say we want to single out Yellow Pages links. InStr looks for one string inside
another, using the format InStr(haystack, needle) and returns the numerical starting
point of the first occurence of needle, or 0 if not found. If ‘yp’ or ‘yellowpages’ is found in a
link we make the interior of the cell red with the .ColorIndex VBA property, and place a ‘1’
in the cell to the left. FYI, short and simple If...Then statements, like this one: If x = y
Then Debug.print "Yo, x is the same as y, bro!" , can be written on one line with the
closing End If ommitted. If...Then statements are the backbone of any great A.I. …
Watson, Wall-E, and now SearchBot()!

y = y + 1

automatetheweb.net 13/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

Now we increment y by 1, so each new result found will be printed on the next row down.

Next

Continue looping until we run out of aEle ‘s to look at.

Sheets("Sheet1").Range("B1").Value = _
Application.WorksheetFunction.Sum(Sheets("Sheet1").Range("B2:B100"))

Since we put a 1 in row B each time we found a Yellowpages link, we can now just add
them up, making cell B1 display the total. Alternatively, we could have used another
For...Next loop to look at each cell in column B one at a time, adding up all the 1’s. But
using Excel’s powerful built-in WorksheetFunction ‘s whenever possible is better.

objIE.Quit
End Sub

So we opened IE, navigated to DuckDuckGo, performed a search, parsed the results on


the web page, printed them to Excel, and even did some further analysis on the retrieved
data. All that’s left is to close ( .Quit ) the browser object and end ( End Sub ) the
procedure.

Finally, let’s save the file. We need to save the file with the extension .xlsm , since this is
most definitely a macro-enabled workbook.

automatetheweb.net 14/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

Excel – Save-as .xlsm

Congratulations! You built your first bot, and in the process learned a good deal about the
most common VBA commands used to interact with and automate the web.

To download this Excel VBA project: xl-searchbot.xlsm (keyword search & scrape
bot)

Recommended next: Web scraping with getElementsByTagName()

follow:
Automate the Web / Proudly powered by WordPress
Bots should never be created that harm or impede the functionality of a website. By using this site you agree to these
terms.

automatetheweb.net 15/16
8/19/2020 Automate the Web - create web automation bots with Excel & VBA

automatetheweb.net 16/16

You might also like