Data wrangling.
When the data you scrape is not in exactly the form you wanted, you use data wrangling to clean it up.
1. Main things to learn.
● HTML: learning the markup language that websites are written in, so that we can scrape them more easily.
● Regular expressions are a set of tools commonly used to work with text data.
They allow targeted, pattern-based searches for different types of text.
● Writing a script in Python to access a website.
● Exception handling: scraping scripts sometimes encounter errors, so exception handling
helps our script react to these errors properly and recover from them.
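The last two points can be sketched together. This is a minimal sketch, not a definitive implementation: it assumes the standard-library urllib is used to access the site, and the function name fetch_html and its opener parameter are hypothetical names chosen for illustration.

```python
import urllib.error
import urllib.request


def fetch_html(url, opener=urllib.request.urlopen, timeout=10):
    """Return the page's HTML as text, or None if the request fails.

    `opener` is a hypothetical hook so the function can be tried
    without a network connection; by default it is urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        print(f"HTTP error {err.code} for {url}")
    except urllib.error.URLError as err:
        # The server could not be reached (bad host name, no network, ...).
        print(f"Could not reach {url}: {err.reason}")
    return None
```

Because the function returns None instead of crashing, the calling script can skip the failed page and continue with the next one.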
Understanding HTML code.
We can use the browser's inspect tool to see the HTML code of a website.
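To get a feel for what the inspect tool shows, the sketch below lists the tags and attributes in a small piece of HTML. The sample markup is made up for illustration, and the class name TagLister is hypothetical; only the standard-library html.parser module is used.

```python
from html.parser import HTMLParser

# Made-up HTML, similar in shape to what the inspect tool displays.
SAMPLE_HTML = """
<html>
  <body>
    <h1 class="title">Laptops</h1>
    <p class="price">$499</p>
    <a href="/laptops/1">Details</a>
  </body>
</html>
"""


class TagLister(HTMLParser):
    """Collect every opening tag together with its attributes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.tags.append((tag, dict(attrs)))


parser = TagLister()
parser.feed(SAMPLE_HTML)
for tag, attrs in parser.tags:
    print(tag, attrs)
```

Each printed line corresponds to one element: the tag name says what kind of element it is, and attributes such as class or href are what a scraper usually searches for.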
Regular expressions.
The following is done using the re module in Python.
● Finding characters in a text.
● Quantifiers.
They let us specify exactly how many times a character should be matched.
● Matching character groups.
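The three topics above can be illustrated with a short sketch using the re module. The sample sentence is invented, and "character groups" is interpreted here as character classes written in square brackets, which is an assumption about what the notes intend.

```python
import re

text = "Laptop A costs $499, laptop B costs $1299, and the cable costs $9."

# Finding characters: search for a literal piece of text.
first = re.search(r"costs", text)
print(first.start())  # position where the first "costs" begins

# Quantifiers: "+" means one or more, "{3,}" means three or more.
prices = re.findall(r"\$\d+", text)
big_prices = re.findall(r"\$\d{3,}", text)
print(prices)
print(big_prices)

# Character groups: [Ll] matches either case, [A-Z] any capital letter.
names = re.findall(r"[Ll]aptop [A-Z]", text)
print(names)
```

Note that the dollar sign is escaped as \$ because an unescaped $ has a special meaning in a pattern (end of the text).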
Web scraping.
Example.
Scraping a laptop store website.
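A minimal sketch of what such a scrape might look like is given here. The notes do not say which website or page layout is used, so the HTML fragment, the tag structure and the names PAGE, PRODUCT and laptops are all hypothetical. The page is stored in a string so the sketch runs without a network connection.

```python
import re

# Hypothetical fragment of a laptop store page (not a real website).
PAGE = """
<div class="product"><h2>Laptop A</h2><span class="price">$499.00</span></div>
<div class="product"><h2>Laptop B</h2><span class="price">$1,299.50</span></div>
"""

# One capture group for the name and one for the price digits.
PRODUCT = re.compile(
    r'<h2>(.*?)</h2><span class="price">\$([\d,]+\.\d{2})</span>'
)

laptops = []
for name, price in PRODUCT.findall(PAGE):
    # Wrangling step: drop the thousands separator and convert to a number.
    laptops.append({"name": name, "price": float(price.replace(",", ""))})

print(laptops)
```

The conversion of "$1,299.50" into the number 1299.5 is a small instance of data wrangling: the scraped text is not yet in the form we want. On real pages, whose markup is less regular than this fragment, an HTML parser is usually more robust than a regular expression alone.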