Movie Prediction
Team members:
1032212858 - Nishant Kasliwal - PG 60
1032220963 - Rohit Wani - PG 66
1032221046 - Sahil Mehta - PG 67
1032222133 - Aryan Sarode - PG 69
Problem Statement
The film industry struggles to predict and optimize the success of movie releases, which makes it hard to
make informed decisions about production, marketing, and resource allocation. The lack of a reliable tool
for anticipating audience response and understanding key success factors poses a significant hurdle for
filmmakers and stakeholders. This project aims to develop a Movie Success Predictor using machine
learning and data analytics, leveraging IMDb data to provide insights into genre preferences, audience
trends, and factors influencing movie ratings.
Project overview
The movie success prediction project aims to address key challenges in the film industry by providing a
predictive model for estimating IMDb ratings. This model leverages big data technologies to offer
valuable insights into factors influencing movie success, enabling data-driven decision-making for
filmmakers and stakeholders. The project focuses on genre mapping,
offering visualizations and analysis to help filmmakers understand genre trends and align their strategies
with audience preferences. By facilitating efficient resource allocation and providing a competitive edge
through audience insights, the project contributes to a more informed and strategic approach to movie
production and marketing.
In summary, it seeks to reduce uncertainty, enhance decision-making, and optimize the chances of
success in the dynamic and competitive film industry.
Project workflow
1. Data Collection
1.1 Objective: Efficiently gather a comprehensive dataset from the IMDb website using web
scraping techniques.
● Develop web scraping scripts in Python.
● Implement web scraping using Octoparse for final data extraction.
● Extract 5 years of movie data with about 10,000 entries per year.
● Include columns such as 'Year', 'certificate', 'Time', 'Score', 'director', 'cast', 'number of votes',
'gross revenue', 'genre', 'imdb_rating', 'Movie_Title', and others.
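The extraction and cleaning step can be sketched as follows. This is a minimal illustration that uses only the Python standard library, not the project's actual scraper. The HTML snippet and its class names (movie, title, year, rating, votes, genre) are hypothetical and do not reflect IMDb's real page structure. Only the output keys 'Movie_Title', 'Year', 'imdb_rating', and 'genre' come from the column list above.

```python
from html.parser import HTMLParser

# Hypothetical markup: class names are illustrative only and are
# NOT taken from IMDb's real pages.
SAMPLE_HTML = """
<div class="movie">
  <span class="title">Example Movie A</span>
  <span class="year">2021</span>
  <span class="rating">7.4</span>
  <span class="votes">12,345</span>
  <span class="genre">Drama, Thriller</span>
</div>
<div class="movie">
  <span class="title">Example Movie B</span>
  <span class="year">2021</span>
  <span class="rating">6.1</span>
  <span class="votes">987</span>
  <span class="genre">Comedy</span>
</div>
"""

FIELDS = {"title", "year", "rating", "votes", "genre"}


class MovieParser(HTMLParser):
    """Collects one dict of raw strings per <div class="movie"> block."""

    def __init__(self):
        super().__init__()
        self.movies = []
        self._current = None  # dict for the movie being read, if any
        self._field = None    # name of the field whose text comes next

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "movie":
            self._current = {}
        elif tag == "span" and css_class in FIELDS and self._current is not None:
            self._field = css_class

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "div" and self._current is not None:
            self.movies.append(self._current)
            self._current = None


def clean(raw):
    """Convert raw scraped strings into typed values."""
    return {
        "Movie_Title": raw["title"],
        "Year": int(raw["year"]),
        "imdb_rating": float(raw["rating"]),
        "votes": int(raw["votes"].replace(",", "")),  # "12,345" -> 12345
        "genre": [g.strip() for g in raw["genre"].split(",")],
    }


def parse_movies(html):
    parser = MovieParser()
    parser.feed(html)
    parser.close()
    return [clean(m) for m in parser.movies]


movies = parse_movies(SAMPLE_HTML)
for movie in movies:
    print(movie)
```

A real scraper would fetch each listing page first and would need selectors matched to the live markup, which changes over time. Splitting the 'genre' string into a list here is a design assumption that makes per-genre analysis simpler later.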
3. Frontend
3.1 Objective: Create an interactive and visually appealing web interface for users to explore IMDb
rating predictions.
● Structure the webpage using HTML to define the layout.
● Style the webpage with CSS for a visually appealing design.
● Implement JavaScript for interactivity, including dropdowns for genre selection.
● Use DOM manipulation to dynamically update content based on user interactions.
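● Include a scrolling effect using the <marquee> tag (note that this tag is deprecated in HTML5; a CSS animation is the standards-compliant alternative).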
4. Visualizations
4.1 Objective: Generate informative visualizations and implement a Flask backend for server-side
logic.
4.2 Steps:
● Utilize Jupyter Notebook for creating visualizations.
● Explore genre mapping and trends using bar charts, pie charts, or other relevant plots.
● Develop a Flask backend to handle interactions between the frontend and data sources.
● Include visualizations in the Flask application for dynamic exploration.
● Deploy the Flask backend for user accessibility.
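The genre-share figures behind a pie or bar chart can be computed as in the sketch below. It is an illustration under stated assumptions, not the project's notebook code: the sample records are invented, the function name genre_percentages is hypothetical, and each movie is assumed to carry its genres as a list. A movie with several genres is counted once per genre, which is one possible convention among several.

```python
from collections import Counter


def genre_percentages(movies):
    """Return {genre: percentage of all genre tags}, most common first."""
    counts = Counter(g for m in movies for g in m["genre"])
    total = sum(counts.values())
    if total == 0:
        return {}
    return {g: round(100 * n / total, 1) for g, n in counts.most_common()}


# Invented sample records, used only to show the shape of the data.
sample = [
    {"Movie_Title": "Example A", "genre": ["Drama", "Thriller"]},
    {"Movie_Title": "Example B", "genre": ["Drama"]},
    {"Movie_Title": "Example C", "genre": ["Comedy"]},
]

shares = genre_percentages(sample)
for genre, pct in shares.items():
    print(f"{genre}: {pct}%")
```

The resulting dictionary can be handed to a plotting library in a notebook (its keys as labels, its values as slice sizes) or serialized to JSON by a backend route so the frontend can draw the chart.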
5. Backend
Future scope:
Conclusion:
In summary, the movie success predictor project offers a data-driven solution to the uncertainties in the
film industry by predicting IMDb ratings. Through big data technologies, it equips filmmakers with
insights into genre trends and audience preferences. The future scope includes refining the predictive
model, incorporating real-time predictions, and collaborating with industry professionals. This project has
the potential to transform decision-making in the film industry, providing a competitive edge and
optimizing resource allocation for successful movie production.
Relevant screenshots:
Visualization of 2021 data: percentage of movies produced, broken down by genre
References:
Other links:
https://www.geeksforgeeks.org/scrape-imdb-movie-rating-and-details-using-python/
https://www.youtube.com/watch?v=BuxBLXmH2H4