Savitendra Miniproject
MOVIE
SYNOPSIS
OF MINI PROJECT
BACHELOR OF TECHNOLOGY
COMPUTER SCIENCE AND TECHNOLOGY
SUBMITTED BY SUBMITTED TO
CERTIFICATE
This is to certify that the original and genuine investigation of the subject matter, together
with the related data collection, has been carried out solely, sincerely and satisfactorily by
Savitendra Mani Pandey, a student of COMPUTER SCIENCE AND ENGINEERING for the
academic session 2023-2024 at Kamla Nehru Institute of Technology, Sultanpur,
for the Project Department, under the direct supervision of the undersigned, as per the
requirement for the Board Examination. The project report embodies the results of original
work and studies carried out by the student, and its contents do not form the basis for the
award of any other certification course.
ACKNOWLEDGEMENT
I have put considerable effort into this project. However, it would not have been possible
without the kind support and help of many individuals, and I would like to extend my sincere
thanks to all of them. It has been a great honour and privilege to complete my project work at
Kamla Nehru Institute of Technology, Sultanpur. I am highly indebted to Prof. SHOBHIT SHUKLA
and Prof. ASHA SINGH for their guidance and constant supervision, for providing the necessary
information regarding the project, and for their support in completing it. Their constant
guidance and willingness to share their vast knowledge helped me understand this project in
depth and complete the assigned tasks on time. I would like to express my gratitude to my
parents and to the members of Kamla Nehru Institute of Technology, Sultanpur, for their kind
cooperation, which helped me complete this project. My thanks and appreciation also go to my
colleagues who helped develop the project and to everyone who willingly helped me with
their abilities.
TABLE OF CONTENTS
Abstract
Introduction
Proposed Work
Prerequisite
Application of Architecture
References
ABSTRACT:
Web scraping, or web harvesting, is a software technique aimed at extracting information
from websites. Web scrapers typically access the World Wide Web either directly over the
Hypertext Transfer Protocol (HTTP) or through a suitable web browser. Web scraping is closely
related to web indexing, an information extraction technique used by search engines to index
data on the Web using programmed bots.
In comparison, web scraping focuses on transforming unstructured information on the web
(usually in HTML format) into structured information that can be saved and processed in a
centralised database.
Web scraping is mostly used for online price comparison, webpage change detection,
collecting weather forecast information, web information integration, web mashups,
and web surveys. Currently, multiple software tools are available that apply scraping
techniques and can be customized for a particular website.
INTRODUCTION
Web scraping is a technique in which webpages are fetched from the internet and parsed to
extract specific information for a human reader. Web scraping consists of two parts:
1. Web Crawling – Accessing the webpages over the internet and pulling data from them.
2. HTML Parsing – Parsing the HTML content of the webpages obtained through web
crawling and then extracting specific information from it.
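The two parts can be illustrated with a minimal sketch. It assumes only the Python standard library (urllib and html.parser) as a stand-in for the Requests and BeautifulSoup modules, so that it runs without installing anything. The function names fetch_html and extract_title, the User-Agent string and the sample HTML are illustrative assumptions, not part of the project code.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Part 1 (web crawling): download the raw HTML of one page."""
    # The User-Agent value is an arbitrary example string.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0 (educational scraper)"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class TitleParser(HTMLParser):
    """Part 2 (HTML parsing): collect the text inside the <title> tag."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title found in an HTML string, or "" if there is none."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


if __name__ == "__main__":
    # Parsing is shown on an inline sample so that no network access is needed.
    sample = "<html><head><title> Sample Movie (1999) </title></head><body></body></html>"
    print(extract_title(sample))
    # To crawl a live page instead: extract_title(fetch_html("https://example.com/"))
```

Keeping the download step and the parsing step in separate functions means the parsing logic can be tried on saved or hand-written HTML before any request is sent to a real website.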
Hence, web scrapers are applications or bots that automatically send requests to websites and
then extract the desired information from the responses. Let’s take an example: how do
we buy a phone online?
1. We first look for a phone with good reviews.
2. We see on which website it is available at the lowest price.
3. We check whether it can be delivered to our area.
4. If everything looks good, we buy the phone.
What if there were a computer program that could do all of this for us? That is essentially
what web scrapers do. They try to understand the webpage content as a human being would.
Other examples of the application of web scraping are:
➢ Competitive pricing.
➢ Manufacturers monitoring the market to check whether retailers are maintaining a
minimum price.
➢ Sentiment analysis of consumers, to see whether they are happy with the services and
products.
➢ To aggregate marketing data.
➢ To gain financial insights from the market.
➢ To gather data for research.
➢ To generate marketing leads.
➢ To collect trending topics for media houses.
And the list goes on.
PROPOSED WORK
In this document, we take the example of searching for movie information online and try to
scrape that information from a website for the movie we are searching for. For example,
if we open imdb.com and search for the top 250 movies, the search result will be as follows:
Then, if we click on a movie link, it will take us to the following page:
Here we can see information about the movie such as its running time, release year and
IMDb rating, and if we scroll down, we will find the director's name:
Then, if we click on the director link, it will take us to the following page:
Now, if we scroll down, we will see the top four movies directed by that director:
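The navigation described above (movie list, then movie page, then director page) can be sketched as follows. This is only an illustration under stated assumptions: the PAGES dictionary holds hypothetical, heavily simplified HTML in place of real downloads, and the class names, paths, movie titles and the director name are invented for the example. The real IMDb markup is different and changes over time. The standard-library html.parser is again used in place of BeautifulSoup.

```python
from html.parser import HTMLParser

# Hypothetical pages standing in for a real site (all content is invented).
PAGES = {
    "/chart": '<a class="movie" href="/movie/1">Sample Movie</a>'
              '<a class="movie" href="/movie/2">Another Movie</a>',
    "/movie/1": '<span class="runtime">2h 10m</span>'
                '<span class="year">1999</span>'
                '<span class="rating">8.5</span>'
                '<a class="director" href="/person/1">Jane Doe</a>',
    "/person/1": '<a class="movie" href="/movie/1">Sample Movie</a>'
                 '<a class="movie" href="/movie/3">Second Film</a>'
                 '<a class="movie" href="/movie/4">Third Film</a>'
                 '<a class="movie" href="/movie/5">Fourth Film</a>'
                 '<a class="movie" href="/movie/6">Fifth Film</a>',
}


class ClassTextParser(HTMLParser):
    """Collect class, href and text of every element that has a class attribute.

    This only handles flat, non-nested markup like the samples above.
    """

    def __init__(self):
        super().__init__()
        self.items = []    # finished elements
        self._open = None  # element currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "class" in attrs:
            self._open = {"class": attrs["class"], "href": attrs.get("href"), "text": ""}

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if self._open is not None:
            self.items.append(self._open)
            self._open = None


def parse_page(path):
    """Parse one page; a real scraper would download the page here."""
    parser = ClassTextParser()
    parser.feed(PAGES[path])
    return parser.items


def scrape_movie(title):
    """Follow list -> movie -> director and return the collected fields."""
    # Step 1: find the movie link on the list page.
    movie = next((item for item in parse_page("/chart")
                  if item["class"] == "movie" and item["text"] == title), None)
    if movie is None:
        return None  # movie not found
    # Step 2: read running time, year, rating and director from the movie page.
    fields = {item["class"]: item for item in parse_page(movie["href"])}
    director = fields["director"]
    # Step 3: follow the director link and keep at most four movies.
    top_movies = [item["text"] for item in parse_page(director["href"])
                  if item["class"] == "movie"][:4]
    return {
        "title": title,
        "runtime": fields["runtime"]["text"],
        "year": fields["year"]["text"],
        "rating": fields["rating"]["text"],
        "director": director["text"],
        "director_top_movies": top_movies,
    }


if __name__ == "__main__":
    print(scrape_movie("Sample Movie"))
```

Each step takes the link found in the previous step as its input, which mirrors clicking through the pages by hand.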
PREREQUISITE
The things needed before we start building a Python-based web scraper are:
• Python installed.
• A Python IDE (Integrated Development Environment) such as PyCharm, Spyder, or any
other IDE of choice.
• Basic understanding of Python and HTML.
• Basic understanding of the Requests and BeautifulSoup modules.
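Whether the two modules are available can be checked with a short script such as the one below. It is a sketch that assumes the usual import names, requests for Requests and bs4 for BeautifulSoup. The helper name missing_modules is hypothetical.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the names in `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    # "requests" and "bs4" are the import names of Requests and BeautifulSoup.
    missing = missing_modules(["requests", "bs4"])
    if missing:
        print("Missing modules:", ", ".join(missing))
        print("They can usually be installed with: pip install requests beautifulsoup4")
    else:
        print("All required modules are available.")
```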
APPLICATION OF ARCHITECTURE
The architecture of the application is:
[Flowchart: START ... MOVIE FOUND]
CONCLUSION AND FUTURE SCOPE
In this project, we built a web scraper from scratch that collects information about a movie
from the internet and also collects information about the director for a given movie name.
Web scrapers are extensively used in the industry today for competitive pricing, market
studies, customer sentiment analysis, etc.
In the future, web scraping will be one of the important tools in the lead generation process.
Web scraping tools can carry out market research on particular products or services and
have enormous benefits to offer in the marketing field.
REFERENCES
➢ W3Schools:
Website-https://www.w3schools.com/
➢ Udemy:
Website-https://www.udemy.com/
➢ DataHen, “3 Advantage of web scraping”
➢ GeeksforGeeks:
Website-https://www.geeksforgeeks.org
➢ IMDb:
Website-https://imdb.com