Project 5
DESCRIPTION:
Before analyzing the data we need to clean it, because it contains
blank cells, duplicate values, and data that is not useful for the
analysis.
To clean the data, we first delete the rows that are not needed for
the later analysis. We then select the remaining data and identify the
blank cells: on the Home tab, use the Find & Select function to find
the blank cells, then delete each entire row, since a row with a
missing value is not meaningful alongside the rest. Next we check for
duplicates: on the Data tab, the Remove Duplicates function deletes
the duplicates from the data. The remaining data is then ready for
analysis.
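The same two cleaning steps can be sketched outside Excel. This is only an illustration in Python, not the method used in the report: the function name `clean_rows` and the sample rows are hypothetical, and it assumes each record is a tuple of cell values in which `None` or an empty string marks a blank cell.

```python
def clean_rows(rows):
    """Drop rows that contain a blank cell, then drop exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Assumption: None or a whitespace-only string counts as a blank cell.
        if any(cell is None or str(cell).strip() == "" for cell in row):
            continue
        key = tuple(row)
        # Keep only the first occurrence of each identical row.
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical sample rows: (title, genres, duration in minutes, IMDB score)
sample = [
    ("Movie A", "Drama|Romance", 120, 7.1),
    ("Movie B", "", 95, 6.0),                # blank genre cell, dropped
    ("Movie A", "Drama|Romance", 120, 7.1),  # duplicate row, dropped
    ("Movie C", "Action", None, 5.5),        # blank duration, dropped
    ("Movie D", "Comedy", 101, 6.8),
]
cleaned = clean_rows(sample)
print(cleaned)
```

Blank rows are removed before duplicates so that the order of operations matches the description above.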
You are required to provide a detailed report on the data record below,
answering the questions that follow:
Soln: Before counting the number of movies in each genre, the genre
column had to be split, so I used the Text to Columns function on the
Data tab to separate the genres at their delimiters. I then used the
COUNTIF function to count the number of movies in each genre.
Here I obtained the distinct genres from the given data and found that
Drama is the most common genre among the movies. I then computed
descriptive statistics: AVERAGE, MEDIAN, MODE, MAX, MIN, VAR, and
STDEV.
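As a rough sketch of the genre count and the descriptive statistics, the steps can be written in Python. The names `count_genres` and `describe`, the `|` delimiter, and the sample values are assumptions made for illustration; the report does not state the delimiter or which column the statistics were computed on. `statistics.variance` and `statistics.stdev` are the sample versions (dividing by n - 1), the same convention as Excel's VAR and STDEV.

```python
from collections import Counter
import statistics


def count_genres(genre_cells, delimiter="|"):
    """Split each genre cell on the delimiter and count movies per genre."""
    counts = Counter()
    for cell in genre_cells:
        for genre in cell.split(delimiter):
            genre = genre.strip()
            if genre:
                counts[genre] += 1
    return counts


def describe(values):
    """Return the same seven summary statistics named in the report."""
    return {
        "average": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "max": max(values),
        "min": min(values),
        "var": statistics.variance(values),  # sample variance
        "stdev": statistics.stdev(values),   # sample standard deviation
    }


# Hypothetical genre cells and scores, for illustration only.
genre_cells = ["Drama|Romance", "Action|Drama", "Comedy", "Drama"]
counts = count_genres(genre_cells)
most_common_genre, n_movies = counts.most_common(1)[0]
summary = describe([7.0, 6.0, 8.0, 7.0])
print(most_common_genre, n_movies, summary)
```

A movie tagged with several genres is counted once under each of them, which is also what COUNTIF does after the column has been split.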
Soln: Here I copied the data into a new sheet and calculated the
average, median, and standard deviation of movie duration. I then made
a scatter plot of movie duration against the corresponding IMDB score
and added a trendline.
The plot above shows that for movie durations between 80 and 150
minutes the average score is similar, and the trendline shows slightly
exponential growth.
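For comparison, the duration statistics and a trendline can be sketched in Python. Note that this fits a straight line by least squares, which is not necessarily the trendline type used in the report. The function name `linear_trendline` and the sample durations and scores are made up for illustration.

```python
import statistics


def linear_trendline(xs, ys):
    """Least-squares line y = slope * x + intercept through the points."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical durations (minutes) and IMDB scores, for illustration only.
durations = [90, 100, 110, 120, 130]
scores = [6.0, 6.2, 6.4, 6.6, 6.8]

avg_duration = statistics.mean(durations)
median_duration = statistics.median(durations)
stdev_duration = statistics.stdev(durations)  # sample standard deviation

slope, intercept = linear_trendline(durations, scores)
print(avg_duration, median_duration, stdev_duration, slope, intercept)
```

A positive slope means longer movies tend to score higher in the sample; a slope near zero over a range of durations corresponds to the flat region described above.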
TASK D. Director Analysis: Influence of directors on movie ratings.
Identify the top directors based on their average IMDB score and
analyze their contribution to the success of movies using percentile
calculations.
Soln: Here I used a pivot table of director names and IMDB scores, and
applied filters to obtain the directors with the highest average
rating. I then used the LARGE function to find the highest average
rating among the IMDB scores, and PERCENTRANK to find how that rating
ranks relative to the rest. The PERCENTILE function was used to find
the value below which a given percentage of the data falls.
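The pivot-table average and the two percentile ideas can be sketched in Python as follows. The helper names, the director names, and the scores are hypothetical. `percent_rank` is a simplified version (the share of the other values that lie below x) and `percentile` interpolates linearly between ranks. Both follow the idea behind PERCENTRANK and PERCENTILE but are not guaranteed to reproduce Excel's output in every case, for example with tied values.

```python
from collections import defaultdict
import statistics


def average_score_by_director(rows):
    """Group scores by director and average them, like a pivot table."""
    scores = defaultdict(list)
    for director, score in rows:
        scores[director].append(score)
    return {d: statistics.mean(s) for d, s in scores.items()}


def percent_rank(values, x):
    """Simplified percent rank: fraction of the other values below x."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    below = sum(1 for v in values if v < x)
    return below / (len(values) - 1)


def percentile(values, p):
    """Value below which a fraction p (0 to 1) of the data falls."""
    ordered = sorted(values)
    pos = p * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


# Hypothetical (director, IMDB score) rows, for illustration only.
rows = [
    ("Director A", 8.0), ("Director A", 9.0),
    ("Director B", 6.0),
    ("Director C", 7.0), ("Director C", 7.5),
]
averages = average_score_by_director(rows)
top_director = max(averages, key=averages.get)
all_scores = [score for _, score in rows]
top_rank = percent_rank(all_scores, max(all_scores))
p90 = percentile(all_scores, 0.9)
print(top_director, averages[top_director], top_rank, p90)
```

Directors with only one movie can dominate a ranking by average score, so a minimum movie count per director may be worth applying before comparing averages.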
TASK E. Budget Analysis: Explore the relationship between movie
budgets and their financial success. Analyze the correlation between
movie budgets and gross earnings and identify the movies with the
highest profit margin.
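A minimal sketch of the calculations this task asks for, written in Python. It rests on two assumptions: the correlation is taken to be the Pearson correlation (the quantity Excel's CORREL returns), and profit margin is defined as (gross - budget) / gross. The function names, movie titles, and figures are hypothetical.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def profit_margin(budget, gross):
    """Assumed definition: profit as a fraction of gross earnings."""
    return (gross - budget) / gross


# Hypothetical (title, budget, gross) rows in millions, for illustration only.
movies = [
    ("Movie A", 10.0, 30.0),
    ("Movie B", 20.0, 50.0),
    ("Movie C", 30.0, 45.0),
    ("Movie D", 40.0, 100.0),
]
budgets = [budget for _, budget, _ in movies]
grosses = [gross for _, _, gross in movies]

r = pearson_correlation(budgets, grosses)
best_title, best_budget, best_gross = max(
    movies, key=lambda m: profit_margin(m[1], m[2])
)
print(round(r, 3), best_title, round(profit_margin(best_budget, best_gross), 3))
```

If profit is instead measured in absolute terms (gross minus budget), the ranking can differ: in this sample Movie D has the largest absolute profit, while Movie A has the highest margin.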
Loom Video:
https://drive.google.com/file/d/1gWec9iuXerOzKgWa4JDFUBMOZy-c67WQ/view?usp=drive_link