html - How can I scrape multiple pages/links at once using VBA? - Stack Overflow
I'm currently trying to scrape info from this Reddit page. My goal is to make Excel open all the
posts in new tabs, and then I want to scrape information from each of those pages, since the
starting page doesn't have as much information.
I've been trying for the last few hours to figure this out, but I'm admittedly pretty confused about
how to do it, just overall unsure what to do next, so any pointers would be greatly appreciated!
Here is my current code, it works decently enough but as I said, I'm not sure what I should do
next to open the links it finds one by one and scrape each page for data. The links are scraped off
that first page and then added to my spreadsheet right now, but if possible I'd like to just skip that
step and scrape them all at once.
Thanks! :)
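Sub GetData()
    ' Assumed setup: create a late-bound Internet Explorer instance
    Dim objIE As Object, y As Long
    Set objIE = CreateObject("InternetExplorer.Application")
    ' Load the URL held in the active cell and wait until the page has finished loading
    objIE.navigate ActiveCell.Value
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    y = 1
End Sub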
https://stackoverflow.com/questions/61598820/how-can-i-scrape-multiple-pages-links-at-once-using-vba 1/3
5/9/2021 html - How can I scrape multiple pages/links at once using VBA? - Stack Overflow
asked May 4 '20 at 18:10 by Bloggy, edited May 4 '20 at 21:32 by QHarr
@QHarr I'm basically trying to open each of the links (the hrefs) and then scrape a few html elements for
each of them and output those to my spreadsheet. So the data to scrape would be say, for example the # of
upvotes and the output would be a number. – Bloggy May 4 '20 at 19:16
The % Upvoted is the only additional info those pages have, yes, but it's pretty important for my project and
I'm just trying to automate as much as possible. – Bloggy May 4 '20 at 20:24
Yep! Because the percentage is what's got me stuck, really. – Bloggy May 4 '20 at 20:36
You should be able to gather the URLs, then visit each one in a loop, write the results from each
page visited to an array, and finally write the array to the sheet. Add this after your existing line
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
Add:
Note: You are only potentially gaining on the page loads, as VBA is single threaded. To do that
you would need to store a reference to each tab, or open all first, then loop through relevant open
windows to do the scrape. My preference would be to keep in same tab to be honest.
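A minimal sketch of the gather-then-loop approach described above, kept in a single tab. This is not the answerer's original code: the CSS selectors "a.title" and ".score", the procedure name, and the output range starting at B1 are all hypothetical placeholders that would need to be adapted to the real page and sheet.

```vba
Option Explicit

' Sketch only: the selectors "a.title" and ".score" are hypothetical placeholders.
Public Sub ScrapeLinkedPages()
    Dim ie As Object, links As Object, node As Object
    Dim urls() As String, results() As Variant
    Dim i As Long, n As Long

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False

    ' Load the listing page whose URL is in the active cell.
    ie.navigate ActiveCell.Value
    Do While ie.Busy Or ie.readyState <> 4: DoEvents: Loop

    ' Copy the hrefs into a string array first: the element collection
    ' belongs to this page and should not be relied on after navigating away.
    Set links = ie.document.querySelectorAll("a.title")
    n = links.Length
    If n = 0 Then
        ie.Quit
        Exit Sub
    End If
    ReDim urls(1 To n)
    For i = 1 To n
        urls(i) = links.Item(i - 1).href
    Next i

    ' Visit each URL in the same tab and collect one row per page.
    ReDim results(1 To n, 1 To 2)
    For i = 1 To n
        ie.navigate urls(i)
        Do While ie.Busy Or ie.readyState <> 4: DoEvents: Loop
        results(i, 1) = urls(i)
        Set node = ie.document.querySelector(".score")
        If Not node Is Nothing Then results(i, 2) = node.innerText
    Next i

    ' Write the whole array to the sheet in one operation.
    ActiveSheet.Range("B1").Resize(n, 2).Value = results
    ie.Quit
End Sub
```

Collecting the URLs as plain strings before navigating keeps the loop independent of the first page's DOM, and writing the array once at the end avoids a cell write per page.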
Does .NodeValue work similarly to how .next_sibling works in BeautifulSoup, @QHarr? – SIM May 4 '20
at 21:53
Sorry if I took time to reply, I'm trying to understand and not just copy ^^ For some reason it's scraping the
title of the first post in the list just fine, along with the upvotes, but not the %. And then after the macro
finishes I end up with the first post (and its upvotes) repeating over 25 rows instead of all the different
posts. I can't figure out what's causing that. – Bloggy May 4 '20 at 22:25
I checked the HTML and there's another CSS class called "word" that's technically below the one I want,
that might be what's causing issues with the % though that's probably not why it's not scraping the other
posts. – Bloggy May 4 '20 at 22:36
that fixed the first problem, thanks! and yeah, it's writing out [object Text]. – Bloggy May 4 '20 at 22:55
Weirdly enough, It's telling me that the "object doesn't support this property or method". – Bloggy May 4
'20 at 23:15
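For readers following the comments about .NodeValue and the [object Text] output, here is a small sketch of the difference between a DOM text node and the string it holds. It is an illustration under assumptions, not a diagnosis of the code discussed in the thread: the HTML string, the procedure name, and the tag lookup are made up, and the real page's markup will differ.

```vba
Option Explicit

' Sketch only: the HTML string and the tag lookup are made up for illustration.
Public Sub TextNodeVersusValue()
    Dim html As Object, span As Object, textNode As Object

    ' Parse a small HTML fragment in memory, with no browser involved.
    Set html = CreateObject("htmlfile")
    html.body.innerHTML = "<span>97% Upvoted</span>"

    Set span = html.getElementsByTagName("span")(0)
    Set textNode = span.FirstChild      ' a DOM text node, which is an object

    Debug.Print TypeName(textNode)      ' shows that the node is an object, not a String
    Debug.Print textNode.NodeValue      ' the text held by the node
    Debug.Print span.innerText          ' the element's text, read as a String
End Sub
```

When a text node itself is written to a cell, the object rather than its text is being passed along, which would be consistent with the [object Text] output reported above. Reading .NodeValue from the node, or .innerText from its parent element, yields a string instead.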