Web Scraping using Beautifulsoup and scrapingdog API
Last Updated :
15 Nov, 2021
In this post we are going to scrape dynamic websites that use JavaScript libraries like React.js, Vue.js, Angular.js, etc you have to put extra efforts. It is an easy but lengthy process if you are going to install all the libraries like Selenium, Puppeteer, and headerless browsers like Phantom.js. But, we have a tool that can handle all this load itself. That is Web Scraping Tool which offers APIs and Tools for web scraping.
Why this tool? This tool will help us to scrape dynamic websites using millions of rotating proxies so that we don’t get blocked. It also provides a captcha clearing facility. It uses headerless chrome to scrape dynamic websites.
What will we need?
Web scraping is divided into two simple parts —
- Fetching data by making an HTTP request
- Extracting important data by parsing the HTML DOM
We willScrapingDog be using python and Scrapingdog API :
- Beautiful Soup is a Python library for pulling data out of HTML and XML files.
- Requests allow you to send HTTP requests very easily.
Setup
Our setup is pretty simple. Just create a folder and install Beautiful Soup & requests. To create a folder and install libraries type below given commands. I am assuming that you have already installed Python 3.x.
mkdir scraper
pip install beautifulsoup4
pip install requests
Now, create a file inside that folder by any name you like. I am using scraping.py.
Firstly, you have to sign up for this web scraping tool. It will provide you with 1000 FREE credits. Then just import Beautiful Soup & requests in your file. like this.
from bs4 import BeautifulSoup
import requests
|
Scrape the dynamic content
Now, we are familiar with Scrapingdog and how it works. But for reference, you should read the documentation of this API. This will give you a clear idea of how this API works. Now, we will scrape amazon for python books title.

Now we have 16 books on this page. We will extract HTML from Scrapingdog API and then we will use Beautifulsoup to generate JSON response. Now in a single line, we will be able to scrape Amazon. For requesting an API I will use requests.
this will provide you with an HTML code of that target URL.
Now, you have to use BeautifulSoup to parse HTML.
soup = BeautifulSoup(r, ’html.parser’)
|
Every title has an attribute of “class” with the name “a-size-mini a-spacing-none a-color-base s-line-clamp-2” and tag “h2”. You can look at that in the below image.

First, we will find out all those tags using variable soup.
allbooks = soup.find_all(“h2”, {“ class ”:”a - size - mini a - spacing - none a - color - base s - line - clamp - 2 "})
|
Then we will start a loop to reach all the titles of each book on that page using the length of the variable “allbooks”.
l = {}
u = list ()
for i in range ( 0 , len (allbooks)):
l[“title”] = allbooks[i].text.replace(“\n”, ””)
u.append(l)
l = {}
print ({ "Titles" :u})
|
The list “u” has all the titles and we just need to print it. Now, after printing the list “u”out of the for loop we get a JSON response. Which looks like…
Output-
{
“Titles”: [
{
“title”: “Python for Beginners: 2 Books in 1: Python Programming for Beginners, Python Workbook”
},
{
“title”: “Python Tricks: A Buffet of Awesome Python Features”
},
{
“title”: “Python Crash Course, 2nd Edition: A Hands-On, Project-Based Introduction to Programming”
},
{
“title”: “Learning Python: Powerful Object-Oriented Programming”
},
{
“title”: “Python: 4 Books in 1: Ultimate Beginner’s Guide, 7 Days Crash Course, Advanced Guide, and Data Science, Learn Computer Programming and Machine Learning with Step-by-Step Exercises”
},
{
“title”: “Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and The Cloud”
},
{
“title”: “Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython”
},
{
“title”: “Automate the Boring Stuff with Python: Practical Programming for Total Beginners”
},
{
“title”: “Python: 2 Books in 1: The Crash Course for Beginners to Learn Python Programming, Data Science and Machine Learning + Practical Exercises Included. (Artificial Intelligence, Numpy, Pandas)”
},
{
“title”: “Python for Beginners: 2 Books in 1: The Perfect Beginner’s Guide to Learning How to Program with Python with a Crash Course + Workbook”
},
{
“title”: “Python: 2 Books in 1: The Crash Course for Beginners to Learn Python Programming, Data Science and Machine Learning + Practical Exercises Included. (Artificial Intelligence, Numpy, Pandas)”
},
{
“title”: “The Warrior-Poet’s Guide to Python and Blender 2.80”
},
{
“title”: “Python: 3 Manuscripts in 1 book: — Python Programming For Beginners — Python Programming For Intermediates — Python Programming for Advanced”
},
{
“title”: “Python: 2 Books in 1: Basic Programming & Machine Learning — The Comprehensive Guide to Learn and Apply Python Programming Language Using Best Practices and Advanced Features.”
},
{
“title”: “Learn Python 3 the Hard Way: A Very Simple Introduction to the Terrifyingly Beautiful World of Computers and Code (Zed Shaw’s Hard Way Series)”
},
{
“title”: “Python Tricks: A Buffet of Awesome Python Features”
},
{
“title”: “Python Pocket Reference: Python In Your Pocket (Pocket Reference (O’Reilly))”
},
{
“title”: “Python Cookbook: Recipes for Mastering Python 3”
},
{
“title”: “Python (2nd Edition): Learn Python in One Day and Learn It Well. Python for Beginners with Hands-on Project. (Learn Coding Fast with Hands-On Project Book 1)”
},
{
“title”: “Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems”
},
{
“title”: “Hands-On Deep Learning Architectures with Python: Create deep neural networks to solve computational problems using TensorFlow and Keras”
},
{
“title”: “Machine Learning: 4 Books in 1: Basic Concepts + Artificial Intelligence + Python Programming + Python Machine Learning. A Comprehensive Guide to Build Intelligent Systems Using Python Libraries”
}
]
}
Similar Reads
Create Cricket Score API using Web Scraping in Flask
Web scraping is the process of extracting data from websites automatically. It allows us to collect and use real-time information from the web for various applications. In this project, we'll understand web scraping by building a Flask app that fetches and displays live cricket scores from an online
7 min read
Scraping Reddit with Python and BeautifulSoup
In this article, we are going to see how to scrape Reddit with Python and BeautifulSoup. Here we will use Beautiful Soup and the request module to scrape the data. Module neededbs4: Beautiful Soup(bs4) is a Python library for pulling data out of HTML and XML files. This module does not come built-in
3 min read
Pagination using Scrapy - Web Scraping with Python
Pagination using Scrapy. Web scraping is a technique to fetch information from websites. Scrapy is used as a Python framework for web scraping. Getting data from a normal website is easier, and can be just achieved by just pulling the HTML of the website and fetching data by filtering tags. But what
3 min read
Automated Website Scraping using Scrapy
Scrapy is a Python framework for web scraping on a large scale. It provides with the tools we need to extract data from websites efficiently, processes it as we see fit, and store it in the structure and format we prefer. Zyte (formerly Scrapinghub), a web scraping development and services company,
5 min read
Spoofing IP address when web scraping using Python
In this article, we are going to scrap a website using Requests by rotating proxies in Python. Modules RequiredRequests module allows you to send HTTP requests and returns a response with all the data such as status, page content, etc. Syntax: requests.get(url, parameter) JSON JavaScript Object Nota
3 min read
How to Scrape Websites with Beautifulsoup and Python ?
Have you ever wondered how much data is created on the internet every day, and what if you want to work with those data? Unfortunately, this data is not properly organized like some CSV or JSON file but fortunately, we can use web scraping to scrape the data from the internet and can use it accordin
10 min read
Web Scraping Financial News Using Python
In this article, we will cover how to extract financial news seamlessly using Python. This financial news helps many traders in placing the trade in cryptocurrency, bitcoins, the stock markets, and many other global stock markets setting up of trading bot will help us to analyze the data. Thus all t
3 min read
Scraping Amazon Product Information using Beautiful Soup
Web scraping is a data extraction method used to exclusively gather data from websites. It is widely used for Data mining or collecting valuable insights from large websites. Web scraping comes in handy for personal use as well. Python contains an amazing library called BeautifulSoup to allow web sc
6 min read
How to Scrape Nested Tags using BeautifulSoup?
We can scrap the Nested tag in beautiful soup with help of. (dot) operator. After creating a soup of the page if we want to navigate nested tag then with the help of. we can do it. For scraping Nested Tag using Beautifulsoup follow the below-mentioned steps. Step-by-step Approach Step 1: The first s
3 min read
Downloading PDFs with Python using Requests and BeautifulSoup
BeautifulSoup object is provided by Beautiful Soup which is a web scraping framework for Python. Web scraping is the process of extracting data from the website using automated tools to make the process faster. The BeautifulSoup object represents the parsed document as a whole. For most purposes, yo
2 min read