
Report

On

PYTHON PROJECT

VI Semester
Academic Year: 2018-2019

Title: WEB SCRAPING USING BEAUTIFUL SOUP


USN Name Signature
1GA14CS010 Akash Kumar S
1GA15CS053 G Janany
1GA16CS191 Vishal Kumar

Guide
[Mr. Shyam Sundar]



Objective of the Project
To build a system capable of extracting large amounts of data from websites, where the
extracted data is saved to a local file or displayed. Such a scraper may be custom built
for a specific website or configured to work with any website. With the click of a button,
the data available on a website can easily be saved to a file on our computer.
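
As a minimal sketch of this idea (assuming the requests library listed in the requirements
below; the output filename scraped_page.html is only illustrative), the HTML of a page can
be fetched and saved to a local file:

import requests                      # HTTP library used to download the page

url = 'https://en.wikipedia.org/wiki/Python_(programming_language)'
res = requests.get(url)              # fetch the page over HTTP
res.raise_for_status()               # stop if the download failed

with open('scraped_page.html', 'w', encoding='utf-8') as f:
    f.write(res.text)                # save the raw HTML to a local file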



System Requirements Specification

Software Requirements Specification

➢ Language used : Python Programming Language
➢ IDE/Compiler used : PyCharm
➢ OS used : Windows 10

Hardware Requirements Specification

o Processor : i7 8th generation
o Hard Disk : 1 TB
o Monitor : HD LED Antiglare
o Keyboard : Island Style
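
As a quick sanity check of the software requirements above (a sketch, not part of the
project code), the Python version and the required libraries can be verified before running
the scraper:

import sys

assert sys.version_info >= (3, 5), "Python 3.5 or higher is required"

import requests                      # HTTP library
import bs4                           # Beautiful Soup 4
from lxml import etree               # lxml parser used by Beautiful Soup

print("requests      ", requests.__version__)
print("beautifulsoup4", bs4.__version__)
print("lxml          ", etree.__version__)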



Source Code

# make sure to have Python version 3.5 or higher

# 1> install requests using      - pip install requests
# 2> install beautifulsoup using - pip install beautifulsoup4
# 3> install lxml using          - pip install lxml
# (enter the commands in the command prompt, not in the Python shell)

import requests                          # imports the requests module
import bs4                               # imports the Beautiful Soup module

res = requests.get('https://en.wikipedia.org/wiki/Python_(programming_language)')
res.text                                 # the entire HTML source code of the page

soup = bs4.BeautifulSoup(res.text, 'lxml')     # lxml is the parser used to build the parse tree

result = soup.select('.mw-body-content h2')    # any CSS selector for the HTML tags to scrape can be given here

for i in result:                         # iterate over every matched heading
    print(i.text)                        # display the heading text

result                                   # the matched elements as HTML (shown in an interactive shell)
result[0]                                # the first element of the result list
result[0].getText()                      # the first element's text content as a string
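
To match the objective of saving the scraped data to a local file (a possible extension,
not shown in the snapshots; the filename headings.txt is only illustrative), the result
list built above can be written out line by line:

with open('headings.txt', 'w', encoding='utf-8') as f:    # create/overwrite a local text file
    for heading in result:                                # result is the list of h2 tags selected above
        f.write(heading.getText().strip() + '\n')         # write each heading as plain text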



Snapshots

1. Snapshot of Source Code



2. Snapshot of Result



3. Snapshot of Webpage

