Netflix Data Analysis Project Report

Uploaded by

amitesh7668

Netflix Data Cleaning and Visualization with SQL and Python

Overview
This project aims to analyze Netflix’s content library through data cleaning, exploratory data analysis (EDA),
and visualization using SQL and Python. The goal is to derive insights into the trends and patterns in Netflix's
catalog, including the types of content, release trends, and genre distributions.

Table of Contents
- Project Overview
- Dataset
- Tools and Technologies
- Data Cleaning
- Exploratory Data Analysis (EDA)
- Key Insights
- Visualizations
- Conclusion
- Installation
- Usage

Dataset

Source: The dataset used in this project is the Netflix Movies and TV Shows dataset.
Description: The dataset contains information about movies and TV shows available on Netflix, including columns like title, type, release_year, date_added, country, duration, and genre.
Tools and Technologies

- SQL: Used for initial data cleaning and transformations.
- Python: Used for in-depth analysis and visualization.
- Pandas: Data manipulation and analysis.
- Matplotlib & Seaborn: Data visualization.
- Jupyter Notebook: To document the analysis.
- Power BI / Tableau: For creating an interactive dashboard to visualize the analysis.

Data Cleaning
The data cleaning process involved:
- Removing rows with missing date_added values.
- Converting date_added to a proper datetime format.
- Extracting month and year from date_added for trend analysis.
- Handling null values in columns like country and duration.
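The cleaning steps listed above could be sketched in pandas roughly as follows. This is a minimal illustration, not the project's actual code: the function name clean_netflix, the derived column names year_added and month_added, the raw date format "%B %d, %Y", and the "Not Given" placeholder for nulls are all assumptions, and the three-row sample is made up.

```python
import pandas as pd


def clean_netflix(df: pd.DataFrame, date_format: str = "%B %d, %Y") -> pd.DataFrame:
    """Apply the listed cleaning steps to a raw Netflix DataFrame (sketch)."""
    out = df.copy()
    # Convert date_added to datetime; missing or unparseable values become NaT.
    # The date format is an assumption about how the raw file stores dates.
    out["date_added"] = pd.to_datetime(
        out["date_added"].str.strip(), format=date_format, errors="coerce"
    )
    # Remove rows with missing date_added values.
    out = out.dropna(subset=["date_added"]).reset_index(drop=True)
    # Extract year and month for trend analysis (hypothetical column names).
    out["year_added"] = out["date_added"].dt.year
    out["month_added"] = out["date_added"].dt.month
    # Handle nulls with a placeholder label (assumed strategy).
    out["country"] = out["country"].fillna("Not Given")
    out["duration"] = out["duration"].fillna("Not Given")
    return out


# Made-up sample rows, only to exercise the function.
raw = pd.DataFrame({
    "title": ["A", "B", "C"],
    "type": ["Movie", "TV Show", "Movie"],
    "date_added": ["September 25, 2021", None, " March 1, 2019"],
    "country": ["United States", "India", None],
    "duration": ["90 min", "2 Seasons", None],
})
cleaned = clean_netflix(raw)
print(cleaned[["title", "year_added", "month_added", "country", "duration"]])
```

Row "B" is dropped because it has no date_added value, and the nulls in row "C" are replaced by the placeholder label.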

Exploratory Data Analysis (EDA)
Key analyses performed in this project include:

Monthly Releases of Movies and TV Shows:
Analyzed the count of content added each month. December had the highest number of new releases.
Yearly Releases of TV Shows:
Investigated the trend of TV show releases over the years. Found a significant increase in TV shows starting from 2015.
Content Type Distribution:
Compared the proportion of Movies vs. TV Shows in the Netflix library. Found that Movies account for roughly 70% of the total content (69.9% in Chart 1).
Top Genres Analysis:
Identified the most common genres, with Drama and Comedy leading the list.
Content Duration Analysis:
Analyzed the distribution of movie durations, with most movies falling between 90 and 120 minutes.
Country-Wise Analysis:
Investigated which countries contribute the most to Netflix’s content, with the USA leading.
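Several of the analyses above reduce to short pandas expressions. The sketch below is illustrative only: the four-row DataFrame is made up, date_added is assumed to be already parsed, and the genre column is assumed to hold comma-separated labels (as the Top Genres chart suggests).

```python
import pandas as pd

# Made-up sample; column names follow the dataset description.
df = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie"],
    "date_added": pd.to_datetime(
        ["2019-12-01", "2020-12-15", "2020-03-10", "2021-12-31"]
    ),
    "duration": ["95 min", "110 min", "2 Seasons", "88 min"],
    "genre": [
        "Dramas, International Movies",
        "Comedies, Dramas",
        "Kids' TV",
        "Documentaries",
    ],
})

# Monthly additions: count titles per calendar month of date_added.
monthly_counts = df["date_added"].dt.month.value_counts().sort_index()
busiest_month = monthly_counts.idxmax()

# Content type distribution as percentages.
type_share = df["type"].value_counts(normalize=True).mul(100).round(1)

# Movie durations: pull the leading number of minutes out of strings like "95 min".
movies = df[df["type"] == "Movie"]
movie_minutes = movies["duration"].str.extract(r"(\d+)", expand=False).astype(int)
share_90_to_120 = movie_minutes.between(90, 120).mean()

# Individual genres: split the comma-separated labels and count each one.
genre_counts = df["genre"].str.split(", ").explode().value_counts()

print(busiest_month, type_share.to_dict(), round(share_90_to_120, 2))
print(genre_counts.head())
```

Counting individual genres after explode() gives a different ranking from counting whole genre combinations, so the two approaches should not be compared directly.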
Key Insights
- December has the highest number of new content additions, indicating a focus on holiday releases.
- TV Show releases have increased significantly since 2015, reflecting Netflix’s strategy to produce more original series.
- The USA, India, and the UK are the top contributors of content on Netflix.
- The platform's content is dominated by Drama and Comedy, appealing to a wide range of audiences.

Visualizations
Chart 1: Content type distribution

- Content type distribution: The majority of the content on Netflix is movies (69.9%).

[Chart: Content type in Percentage]

Chart 2: Country distribution

- Country distribution: The United States has the largest number of contents on Netflix, followed by India and the United Kingdom.

Number of Contents By Country

Country           Movies   TV Shows
United States       2394        845
India                976         81
United Kingdom       387        251
Pakistan              71        350
Not Given            257         30
Canada               187         84
Japan                 87        172
South Korea           49        165
France               148         65
Spain                129         53
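A country-by-type table like the one above can be produced with a cross-tabulation. The following is a small sketch on made-up rows, not the project's code; the "Total" column and the top-10 cut-off are assumptions about how the chart was ordered.

```python
import pandas as pd

# Made-up sample rows; the real data has one row per title.
df = pd.DataFrame({
    "country": ["United States", "United States", "India",
                "Japan", "Japan", "United States"],
    "type": ["Movie", "TV Show", "Movie", "TV Show", "TV Show", "Movie"],
})

# Rows are countries, columns are content types, cells are title counts.
by_country = pd.crosstab(df["country"], df["type"])
# Add a total per country so the table can be ranked (assumed ordering).
by_country["Total"] = by_country.sum(axis=1)
top = by_country.sort_values("Total", ascending=False).head(10)
print(top)
```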
Chart 3: Content growth over time

- Content growth over time: The number of new contents added to Netflix has increased steadily over time.

[Chart: Number of Contents added through the Years; x-axis: 2008, 2009, 2011 to 2021]

Chart 4: Popular directors

- Popular directors: The director with the most content on Netflix is Rajiv Chilaka.
- Most of these directors' contents are movies.
- The duo of Raúl Campos and Jan Suter frequently work together and have directed 18 movies on Netflix.

Directors with most Contents on Netflix

Director                  Movies   TV Shows
Rajiv Chilaka                 19          1
Alastair Fothergill            4         14
Raúl Campos, Jan Suter        18          -
Suhas Kadav                   16          -
Marcus Raboy                  15          1
Jay Karas                     14          -
Cathy Garcia-Molina           13          -
Martin Scorsese               12          -
Youssef Chahine               12          -
Jay Chapman                   12          -
Chart 5: Popular genres

- Popular genres: The genre combination "Dramas, International Movies" has the highest number of contents on Netflix within the period (362), followed closely by "Documentaries" (359).

Top Genres

Genre                                                Contents
Dramas, International Movies                              362
Documentaries                                             359
Stand-Up Comedy                                           334
Comedies, Dramas, International Movies                    274
Dramas, Independent Movies, International Movies          252
Kids' TV                                                  219
Children & Family Movies                                  215
Children & Family Movies, Comedies                        201
Documentaries, International Movies                       186
Dramas, International Movies, Romantic Movies             180

Chart 6: Top rated contents

The most common rating on Netflix is TV-MA. Under the TV Parental Guidelines in the United States, TV-MA signifies content for mature audiences.

Top Ratings

[Chart: Top Ratings; counts per rating in descending order, starting with TV-MA: 3205, 2157, 860, 799, 490, 333, 306, 287, 220, 79, 41, 6, 3, 3]

Chart 7: Oldest contents (Movie & TV Show)

- Oldest contents: The oldest content on Netflix is from 1925.

Oldest Movie and TV Show on Netflix

Title                                                  Year
Pioneers: First Women Filmmakers*                      1925
Prelude to War                                         1942
The Battle of Midway                                   1942
Undercover: How to Operate Behind Enemy Lines          1943
Why We Fight: The Battle of Russia                     1943
WWII: Report from the Aleutians                        1943
The Negro Soldier                                      1944
Tunisian Victory                                       1944
Five Came Back: The Reference Films                    1945
Know Your Enemy – Japan                                1945
Nazi Concentration Camps                               1945
San Pietro                                             1945
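One way to pull the oldest titles is pandas' nsmallest on release_year. The sketch below uses three titles from the table above plus one made-up row; the cut-off of 2 and the use of keep="all" (which retains ties at the cut-off) are illustrative choices, not something the report states.

```python
import pandas as pd

df = pd.DataFrame({
    "title": [
        "Pioneers: First Women Filmmakers*",
        "Prelude to War",
        "The Battle of Midway",
        "Hypothetical Recent Title",  # made-up row for contrast
    ],
    "release_year": [1925, 1942, 1942, 2020],
})

# keep="all" returns every row tied with the last selected year,
# so the result can be longer than the requested n.
oldest = df.nsmallest(2, "release_year", keep="all")[["title", "release_year"]]
print(oldest)
```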
Chart 8: Movie vs TV show content over time

- More movies have been added to Netflix than TV shows over time.
- In 2013, the numbers of contents added to Netflix for both types were almost the same, with Movies having 6 contents that year and TV shows having 5.
- The chart shows that in the first 5 years, only movies were added to Netflix.

[Chart: Types of contents added over the Years]
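The per-year comparison of Movies and TV Shows can be derived by grouping on the year of date_added and the content type. This is a sketch on made-up rows; the name year_added is an assumption, and date_added is assumed to be parsed already.

```python
import pandas as pd

# Made-up sample rows.
df = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie", "TV Show", "TV Show"],
    "date_added": pd.to_datetime([
        "2013-01-05", "2013-06-10", "2013-09-01",
        "2014-02-02", "2015-03-03", "2015-04-04",
    ]),
})

# Count titles per (year added, type), then spread the types into columns.
year_added = df["date_added"].dt.year.rename("year_added")
per_year = df.groupby([year_added, "type"]).size().unstack(fill_value=0)
print(per_year)
```

With Matplotlib installed, per_year.plot(kind="bar") would draw the two types side by side for each year.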
Chart 9: Release Year With Highest Content On Netflix

- Release years with most content: The release years with the most content on Netflix are 2012 to 2018.
- From 2012 to 2018, the number of contents per release year keeps rising, so more recent release years have more contents than older ones. In 2019 the count starts dropping. This may be due to Covid-19, but further analysis is needed to determine this.

[Chart: Release Year With Highest Content On Netflix; x-axis: release year, 2012 to 2021; y-axis: number of contents, 0 to 1400]
