Python Project Description

Uploaded by

shashi

Projects Details:

• Python:-
Project 1:
Domain :-
IMDB Analytics

Skills :-
Python, NumPy, Pandas

Description :-
IMDB is one of the most sought-after rating platforms for movies. The dataset contains movie
titles, ratings, revenue, release year, etc. It is useful for studying the impact of ratings
on a movie's revenue and for finding correlations between genres, ratings, and
revenue.

Below is the data dictionary for the dataset.

Result required :-
IMDB is the biggest movie rating platform in the world. A significant proportion of users choose to watch
movies only after checking the ratings, the number of votes, and the reviews of a movie on the IMDB
platform. The data IMDB captures is critical for generating insights into the public's preferences.
Which genres make more money, and which genres have good ratings but still earn lower revenue?
• How many rows are there in the IMDB dataset?
• What is the 75th percentile of rating in the IMDB dataset?
• How many NA values are there in the field ‘Revenue’?
• How many movies have revenue higher than 75 million?
• How many movies have revenue greater than 50 million but rating less than 7?
• What is the total revenue generated by movies in the year 2015?
• What is the average rating for the genre ‘Adventure’ in the year 2015?
• What is the average duration of movies in rows 75 to 150?
• Which year generated the highest revenue?
• What is the maximum revenue among rows 10, 20, 30, 40, and 50?
• How many movies with the genres ‘Adventure’, ‘Action’, ‘Horror’, and ‘Crime’ exist in the
IMDB dataset?
• Create a genre-level report with the metrics average rating, average number of votes, and
average revenue. What is the average rating of the ‘Horror’ genre? (Round to 2 decimal
places)
• What is the % revenue of the movie ‘Split’ in its respective genre and year?
• What is the average ‘Votes_norm’?
• What is the highest ‘Total_rating’?
• Create a new column ‘Revenue_bins’ so that ‘Revenue_millions’ is categorized in buckets 0-
50, 51-100, 101-150, and 150+. Which bucket has the highest number of movies?
• How many directors have created movies in the highest number of genres?
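
Several of the questions above map onto one-line Pandas expressions. The sketch below is only an illustration on a tiny made-up table: the data dictionary is not reproduced here, so every column name other than ‘Revenue_millions’ and ‘Revenue_bins’ (Title, Genre, Year, Rating, Votes) is an assumption, and the numbers are invented.

```python
import numpy as np
import pandas as pd

# Made-up sample rows; column names other than Revenue_millions are assumptions.
movies = pd.DataFrame({
    "Title": ["A", "B", "C", "D", "E", "F"],
    "Genre": ["Action", "Horror", "Adventure", "Horror", "Action", "Adventure"],
    "Year": [2015, 2015, 2015, 2016, 2016, 2016],
    "Rating": [6.5, 7.2, 8.0, 5.9, 7.8, 6.8],
    "Votes": [1000, 2500, 4000, 800, 3200, 1500],
    "Revenue_millions": [60.0, np.nan, 120.0, 30.0, 200.0, 80.0],
})

# Row count, 75th percentile of rating, and NA count in revenue.
n_rows = len(movies)
rating_p75 = movies["Rating"].quantile(0.75)
revenue_na = movies["Revenue_millions"].isna().sum()

# Boolean masks answer the "how many movies ..." questions.
over_75 = (movies["Revenue_millions"] > 75).sum()
high_rev_low_rating = (
    (movies["Revenue_millions"] > 50) & (movies["Rating"] < 7)
).sum()

# Filters combined with aggregations (NaN revenue is skipped by sum/mean).
revenue_2015 = movies.loc[movies["Year"] == 2015, "Revenue_millions"].sum()
adventure_2015_rating = movies.loc[
    (movies["Genre"] == "Adventure") & (movies["Year"] == 2015), "Rating"
].mean()
top_year = movies.groupby("Year")["Revenue_millions"].sum().idxmax()

# Genre-level report, rounded to 2 decimal places.
genre_report = movies.groupby("Genre").agg(
    avg_rating=("Rating", "mean"),
    avg_votes=("Votes", "mean"),
    avg_revenue=("Revenue_millions", "mean"),
).round(2)

# Revenue buckets; rows with missing revenue stay unbucketed (NaN).
movies["Revenue_bins"] = pd.cut(
    movies["Revenue_millions"],
    bins=[0, 50, 100, 150, np.inf],
    labels=["0-50", "51-100", "101-150", "150+"],
    include_lowest=True,
)
top_bucket = movies["Revenue_bins"].value_counts().idxmax()

print(genre_report)
print("Largest bucket:", top_bucket)
```

On the real dataset the same expressions would apply once the assumed column names are replaced with the ones from the data dictionary.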

Technical skills :-
1. Pandas library functions
2. How to run functions on columns or rows
3. Overwrite data or create a new instance
4. How to change the structure of data
5. Merge and apply functions
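
The five skills listed above can be sketched in a few lines each. This is a minimal illustration on invented data, not the project's solution: the `directors` table and all column names except ‘Revenue_millions’ are hypothetical.

```python
import pandas as pd

# Invented sample data; the directors table and column names are hypothetical.
movies = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Genre": ["Action", "Action", "Horror", "Horror"],
    "Year": [2015, 2015, 2015, 2016],
    "Revenue_millions": [60.0, 140.0, 50.0, 30.0],
})
directors = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Director": ["X", "Y", "X", "Z"],
})

# Run a function on a column (element by element) ...
movies["Revenue_label"] = movies["Revenue_millions"].apply(
    lambda r: "high" if r > 75 else "low"
)
# ... and on rows (axis=1 passes each row to the function).
movies["Genre_year"] = movies.apply(
    lambda row: f"{row['Genre']}-{int(row['Year'])}", axis=1
)

# Most methods return a new object; the original changes only when
# the result is assigned back to the same name.
sorted_movies = movies.sort_values("Revenue_millions", ascending=False)

# Change the structure of the data: long format to a Genre x Year table.
revenue_wide = movies.pivot_table(
    index="Genre", columns="Year", values="Revenue_millions", aggfunc="sum"
)

# Merge two tables, then apply group-level functions to the result.
merged = movies.merge(directors, on="Title", how="left")
genres_per_director = merged.groupby("Director")["Genre"].nunique()
n_top_directors = (genres_per_director == genres_per_director.max()).sum()

# Each movie's % share of the revenue of its genre and year.
group_total = merged.groupby(["Genre", "Year"])["Revenue_millions"].transform("sum")
merged["Revenue_share_pct"] = 100 * merged["Revenue_millions"] / group_total

print(revenue_wide)
print(merged[["Title", "Director", "Revenue_share_pct"]])
```

The `transform("sum")` call returns one value per original row, which is why it can be divided into the revenue column directly without a second merge.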

Analytical skills :-
1. Sports analytics
2. Data cleaning and missing value treatment
3. Identify business KPIs and create reports
4. Entertainment analytics

Project 2:
Domain :-
Titanic

Skills :-
Python, NumPy, Pandas

Description :-
The sinking of the Titanic is one of the most infamous shipwrecks in history.
On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after
colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone on board,
resulting in the deaths of 1502 of the 2224 passengers and crew.
While some elements of luck were involved in surviving, some groups of people seemed more likely to
survive than others.
Let’s explore this dataset to find some insights. Below is the sample of data.

Data Dictionary :-

Technical skills :-
• Pandas library functions
• How to run functions on columns or rows
• Overwrite data or create a new instance
• How to change the structure of data
• Merge and apply functions

Analytical skills :-
• Sports analytics
• Data cleaning and missing value treatment
• Identify business KPIs and create reports
• Entertainment analytics
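
For the Titanic project, a first look at which groups were more likely to survive could be a grouped survival rate, combined with a simple missing value treatment. The sketch below runs on six invented passengers. The column names (Survived, Pclass, Sex, Age) are assumptions, since the data dictionary is not reproduced here, and filling missing ages with the median is just one possible treatment.

```python
import numpy as np
import pandas as pd

# Invented passengers; column names are assumptions about the dataset.
passengers = pd.DataFrame({
    "Survived": [1, 0, 1, 0, 1, 0],
    "Pclass": [1, 3, 2, 3, 1, 2],
    "Sex": ["female", "male", "female", "male", "male", "female"],
    "Age": [29.0, np.nan, 35.0, 40.0, np.nan, 22.0],
})

# The mean of a 0/1 column is the survival rate of each group.
survival_by_sex = passengers.groupby("Sex")["Survived"].mean()
survival_by_class = passengers.groupby("Pclass")["Survived"].mean()

# Missing value treatment: count the gaps, then fill them with the median age
# in a new column so the original values stay available.
missing_ages = passengers["Age"].isna().sum()
median_age = passengers["Age"].median()
passengers["Age_filled"] = passengers["Age"].fillna(median_age)

print(survival_by_sex)
print(survival_by_class)
print("Missing ages:", missing_ages, "filled with", median_age)
```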

Project 3:
Domain :-
NBA

Skills :-
Python, NumPy, Pandas

Description :-
The National Basketball Association is a professional basketball league in North America. The league
comprises 30 teams and is one of the four major professional sports leagues in the United States and
Canada. It is the premier men's professional basketball league in the world.
Below is the dataset with players’ information such as team, college, height, weight, salary, etc.

Data Dictionary :-

Technical skills :-
• Pandas library functions
• How to run functions on columns or rows
• Overwrite data or create a new instance
• How to change the structure of data
• Merge and apply functions

Analytical skills :-
• Sports analytics
• Data cleaning and missing value treatment
• Identify business KPIs and create reports
• Entertainment analytics
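
For the NBA project, a team-level KPI report is one natural use of the skills listed above. The sketch below is illustrative only: the players, teams, and salaries are invented, and the column names (Name, Team, College, Salary) are assumptions based on the fields mentioned in the description.

```python
import numpy as np
import pandas as pd

# Invented players; column names are assumptions based on the description.
players = pd.DataFrame({
    "Name": ["Player 1", "Player 2", "Player 3", "Player 4", "Player 5"],
    "Team": ["Team A", "Team A", "Team B", "Team B", "Team B"],
    "College": ["College X", None, "College Y", "College X", None],
    "Salary": [5_000_000.0, 1_000_000.0, np.nan, 3_000_000.0, 2_000_000.0],
})

# Missing value treatment: label unknown colleges explicitly,
# but leave missing salaries as NaN so they do not distort averages.
players["College"] = players["College"].fillna("Unknown")

# Team-level report: head count, average salary, and total salary.
team_report = players.groupby("Team").agg(
    n_players=("Name", "count"),
    avg_salary=("Salary", "mean"),
    total_salary=("Salary", "sum"),
)
top_paying_team = team_report["avg_salary"].idxmax()

print(team_report)
print("Highest average salary:", top_paying_team)
```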
