In this article, we will learn about the web scraping technique using the lxml module available in Python.
What is web scraping?
Web scraping is used to obtain data from a website with the help of a crawler or scanner. It comes in handy for extracting data from a web page that does not offer an API. In Python, web scraping can be done with several modules, such as Beautiful Soup, Scrapy, and lxml.
Here we will discuss web scraping using the lxml module.
For that, we first need to install lxml.
Type in the terminal or command prompt −
pip install lxml
Here, XPath expressions are used to locate and access the data in the page.
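As a minimal sketch of how XPath works with lxml, the snippet below parses a small HTML string and reads text out of it. The markup, the class names, and the variable names are made up for illustration and are not taken from the Steam page.

```python
from lxml import html

# Hypothetical markup, used only to illustrate XPath queries.
SAMPLE = """
<html><body>
  <ul id="games">
    <li class="game"><span class="title">Alpha</span><span class="price">Free</span></li>
    <li class="game"><span class="title">Beta</span><span class="price">$4.99</span></li>
  </ul>
</body></html>
"""

# Parse the string into an element tree.
tree = html.fromstring(SAMPLE)

# text() selects the text nodes of the matched elements,
# so each query returns a list of strings.
titles = tree.xpath('//li[@class="game"]/span[@class="title"]/text()')
prices = tree.xpath('//li[@class="game"]/span[@class="price"]/text()')

print(titles)
print(prices)
```

Each xpath() call returns a list, so an expression that matches nothing gives an empty list rather than raising an error.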
In this article, we will extract data from the Steam store website, which contains information about different games.
https://store.steampowered.com/genre/Free%20to%20Play/
On this page, we will try to extract information from the popular new releases section.
Here we will extract the names, prices, associated tags, and target platforms.
On the page, view the HTML code of the new releases tab using the Inspect Element feature in Chrome. This shows which tags store the required information.
On this website, every list element is encapsulated in a div tag with id=tab_content, which is in turn encapsulated in a div tag with id=tab_select_newreleases.
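The nesting described above can be tried out offline on a small hand-written fragment. In this sketch, only the two div ids come from the page description; the inner class names (item, name, price, tag, platform) and the sample games are assumptions made for illustration, so the real page will need different expressions.

```python
from lxml import html

# Hypothetical fragment mirroring the described nesting. Only the two
# div ids come from the article; everything inside them is made up.
SAMPLE = """
<html><body>
<div id="tab_select_newreleases">
  <div id="tab_content">
    <div class="item">
      <div class="name">Game One</div>
      <div class="price">Free</div>
      <span class="tag">Action</span><span class="tag">Indie</span>
      <span class="platform win"></span><span class="platform mac"></span>
    </div>
    <div class="item">
      <div class="name">Game Two</div>
      <div class="price">$9.99</div>
      <span class="tag">Strategy</span>
      <span class="platform win"></span>
    </div>
  </div>
</div>
</body></html>
"""

doc = html.fromstring(SAMPLE)

# Narrow the search to the inner container before reading the items.
container = doc.xpath(
    '//div[@id="tab_select_newreleases"]/div[@id="tab_content"]'
)[0]

games = []
for item in container.xpath('./div[@class="item"]'):
    # The platform is assumed to be the last word of the class attribute.
    platform_classes = item.xpath('./span[contains(@class, "platform")]/@class')
    games.append({
        "name": item.xpath('./div[@class="name"]/text()')[0],
        "price": item.xpath('./div[@class="price"]/text()')[0],
        "tags": item.xpath('./span[@class="tag"]/text()'),
        "platforms": [c.split()[-1] for c in platform_classes],
    })

for game in games:
    print(game)
```

Selecting the container first and then using relative paths (starting with ./) keeps each query scoped to a single game entry, so values from different entries do not get mixed together.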
Now let's see the implementation