
bs4 Examples

This document provides a tutorial on how to perform web scraping using the Requests and Beautiful Soup libraries in Python. It demonstrates how to install the libraries, make a GET request to a URL to retrieve content, parse the content using Beautiful Soup to find specific elements, and extract data from the elements into a dictionary. The goal is to scrape book sampler links and titles from an O'Reilly website as a sample scraping project. Additional resources on web scraping with Requests and Beautiful Soup in Python are also provided.

Uploaded by

Ankita Padhi

1 Web Scraping Workshop

Using Requests and Beautiful Soup, with the most recent Beautiful Soup 4 docs.

2 Getting Started

Install our tools (preferably in a new virtualenv):

pip install beautifulsoup4

pip install requests

3 Start Scraping!

Let's grab the Free Book Samplers from O'Reilly: http://oreilly.com/store/samplers.html.

>>> import requests
>>> result = requests.get("http://oreilly.com/store/samplers.html")

Make sure we got a result.

>>> result.status_code

200

>>> result.headers

...
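The status check above can also be done in code rather than by eye. The following is a minimal sketch, not part of the original workshop: the helper name `fetch_html` and the 10-second timeout are assumptions chosen for illustration.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the raw body of `url`, raising on HTTP errors.

    `fetch_html` and the 10-second default are illustrative choices,
    not something the workshop defines.
    """
    response = requests.get(url, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx/5xx responses,
    # so the return below only runs for non-error responses.
    response.raise_for_status()
    return response.content
```

Calling `fetch_html("http://oreilly.com/store/samplers.html")` would then give the same bytes as `result.content`, or raise instead of silently handing back an error page.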

Store your content in an easy-to-type variable!

>>> c = result.content

Start parsing with Beautiful Soup. NOTE: Beautiful Soup 4 (the beautifulsoup4
package installed above) is imported from bs4. Importing from BeautifulSoup
applies only to the older Beautiful Soup 3, which older docs and examples use.

>>> from bs4 import BeautifulSoup

>>> # Name the parser explicitly; otherwise bs4 picks one for you and warns.
>>> soup = BeautifulSoup(c, "html.parser")

>>> samples = soup.find_all("a", "item-title")

>>> samples[0]
<a class="item-title" href="http://cdn.oreilly.com/oreilly/booksamplers/9780596004927_sampler.pdf">
Programming Perl
</a>
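In `find_all("a", "item-title")`, the second positional argument is treated as a CSS class filter. The sketch below shows that on a small inline document. The markup and the example.com URLs are made up for illustration and are not the real O'Reilly page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the sample output above; not the real page.
html = """
<a class="item-title" href="http://example.com/a.pdf"> Book A </a>
<a class="other" href="http://example.com/skip.pdf"> Skip me </a>
<a class="item-title featured" href="http://example.com/b.pdf"> Book B </a>
"""

soup = BeautifulSoup(html, "html.parser")

# A string passed in the second position is matched against the CSS class...
by_position = soup.find_all("a", "item-title")
# ...which is the same filter as the class_ keyword argument.
by_keyword = soup.find_all("a", class_="item-title")

# A tag with several classes still matches if one of them is "item-title".
hrefs = [a["href"] for a in by_position]
```

Both calls return the first and third links. The `class_` spelling is a little more explicit when reading the code later.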

Now, pick apart individual links.

>>> data = {}

>>> for a in samples:
...     title = a.string.strip()
...     data[title] = a.attrs['href']
...
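One thing to watch in the loop above: `a.string` is `None` when a link contains more than one child (for example, a nested tag plus text), and `a.attrs['href']` raises `KeyError` when the attribute is missing. A more defensive sketch, assuming hypothetical markup that triggers both cases:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second link has a nested <em> plus trailing text,
# so it has two children and .string is None; the third link has no href.
html = """
<a class="item-title" href="/one.pdf"> Plain Title </a>
<a class="item-title" href="/two.pdf"><em>Nested</em> Title</a>
<a class="item-title"> No Link </a>
"""

soup = BeautifulSoup(html, "html.parser")

data = {}
for a in soup.find_all("a", "item-title"):
    # get_text() joins all text inside the tag, stripping each piece.
    title = a.get_text(" ", strip=True)
    # .get() returns None instead of raising when href is absent.
    href = a.get("href")
    if href is None:
        continue  # skip anchors that do not link anywhere
    data[title] = href
```

On a page where every link is a single text node with an `href`, this behaves like the original loop.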

Check out the keys/values in the data dict. Rejoice!
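One way to look over the result is to print it sorted by title. This is only a sketch: the dictionary below is a hypothetical stand-in for the scraped `data`, and the example.com URLs are placeholders.

```python
# Hypothetical sample of what `data` might hold; the real titles and URLs
# depend on the page at the time it is scraped.
data = {
    "Programming Perl": "http://example.com/perl_sampler.pdf",
    "Learning Python": "http://example.com/python_sampler.pdf",
}

# Sort by title so the listing is stable from run to run.
lines = [f"{title}: {url}" for title, url in sorted(data.items())]

print(f"{len(data)} samplers found")
print("\n".join(lines))
```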

Now go scrape some stuff!

………………………………………………………………………………………………..

https://www.digitalocean.com/community/tutorials/how-to-work-with-web-data-using-requests-and-beautiful-soup-with-python-3

https://www.pythonforbeginners.com/python-on-the-web/web-scraping-with-beautifulsoup

//csv

https://www.geeksforgeeks.org/implementing-web-scraping-python-beautiful-soup/

https://www.codementor.io/dankhan/web-scrapping-using-python-and-beautifulsoup-o3hxadit4

https://www.learndatasci.com/tutorials/ultimate-guide-web-scraping-w-python-requests-and-beautifulsoup/
