Python Web Scraping

Web scraping is the use of programs or algorithms to extract large amounts of data from websites for analysis. It lets data scientists and engineers gather information from websites that offer no API, for tasks such as interfacing with third parties or accessing data anonymously. When scraping, check the site's terms and conditions, avoid commercial use of the data, do not overload the website with requests (behave reasonably, like a human visitor), and rewrite your code when the site's layout changes.

Uploaded by

Shubham
Copyright
© All Rights Reserved

Web scraping
Web scraping is the use of a program or algorithm to extract and process large amounts of
data from the web. Whether you are a data scientist, an engineer, or anybody else who
analyzes large datasets, the ability to scrape data from the web is a useful skill to
have.
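
To make "extract and process" concrete, here is a minimal sketch using only Python's standard library. It pulls every link out of a small HTML snippet. The names `LinkExtractor`, `extract_links`, and `SAMPLE_HTML` are invented for this illustration, and the snippet stands in for a page that would normally be downloaded first.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content, used here instead of a real download.
SAMPLE_HTML = """
<html><body>
  <h1>Example catalogue</h1>
  <a href="/items/1">First item</a>
  <a href="/items/2">Second <b>item</b></a>
</body></html>
"""

print(extract_links(SAMPLE_HTML))
# [('/items/1', 'First item'), ('/items/2', 'Second item')]
```

Real pages are messier than this snippet, so many scrapers use a third-party parsing library instead. The idea is the same: turn markup written for human readers into structured data.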
Why we scrape
Web pages contain a wealth of information (in text form), designed mostly for human
consumption. Scraping lets you interface with a third party that offers no API access. Site
owners tend to treat the website as more important than any API. Scraping also allows
anonymous access.

You should check a website’s Terms and Conditions before you scrape it. Be careful to read the
statements about legal use of data. Usually, the data you scrape should not be used for commercial
purposes.

How it works

Do not request data from the website too aggressively (also known as spamming), as this may
overload or break the website. Make sure your program behaves in a reasonable manner (i.e.
acts like a human visitor). Limiting your program to one page request per second is good practice.
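
One way to respect the one-request-per-second guideline is to pause between requests. The sketch below is an illustration under assumptions, not a definitive implementation. The names `Throttle`, `fetch_url`, and `fetch_all` are made up for this example, and the 10-second timeout is an arbitrary choice.

```python
import time
import urllib.request


class Throttle:
    """Enforce a minimum delay between consecutive calls to wait()."""

    def __init__(self, min_interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests
        self._clock = clock               # injectable, so the delay can be tested
        self._sleep = sleep
        self._last = None                 # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def fetch_url(url):
    """Download one page and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def fetch_all(urls, throttle, fetch=fetch_url):
    """Fetch each URL in turn, never faster than the throttle allows."""
    pages = []
    for url in urls:
        throttle.wait()
        pages.append(fetch(url))
    return pages
```

A call such as `fetch_all(list_of_urls, Throttle(min_interval=1.0))` would then download the pages at most one per second. `list_of_urls` is a placeholder for whatever pages you are allowed to scrape.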

The layout of a website may change from time to time, so make sure to revisit the site and rewrite
your code as needed.
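
Because layouts change, it helps if the scraper fails loudly instead of silently returning nothing. The sketch below assumes the data of interest lives in an element with a known `id`, and raises an error when that element is missing. `LayoutChangedError`, `extract_by_id`, and the `price` id are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class LayoutChangedError(Exception):
    """Raised when the page no longer contains the element we rely on."""


class _IdTextParser(HTMLParser):
    """Collect the text inside the first element whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.found = False
        self.text = []
        self._depth = 0  # greater than 0 while inside the target element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
        elif not self.found and dict(attrs).get("id") == self.target_id:
            self.found = True
            self._depth = 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.text.append(data)


def extract_by_id(html, target_id):
    """Return the text of the element with the given id, or raise if it is gone."""
    parser = _IdTextParser(target_id)
    parser.feed(html)
    parser.close()
    if not parser.found:
        raise LayoutChangedError(
            f"no element with id={target_id!r}; the page layout may have changed"
        )
    return "".join(parser.text).strip()


print(extract_by_id('<div id="price"><span>42</span> USD</div>', "price"))
# 42 USD
```

When the error is raised, that is the signal to revisit the site and update the scraper.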
