Final

PROJECT TITLE

INTRODUCTION

Exploratory Data Analysis (EDA) serves as a critical lens through which
we unravel the complexities of the IMDb Top 250 list for Indian movies.
IMDb, a leading film rating platform, provides a rich dataset reflecting
audience preferences and critical acclaim.
ABOUT THE PROJECT

Objectives of EDA:

• Uncover trends in the IMDb Top 250 Indian movies.
• Analyze the IMDb rating distribution and the factors that influence it.
• Investigate the evolution of Indian cinema by decade.
• Provide insights into the dynamics of filmmaking in India.

Dataset: 250 records from the IMDb Top 250 list for Indian movies, featuring
essential attributes such as ranking, title, rating, release year, duration, genre,
and a brief description.
STEPS OF EDA
Data Cleaning Steps

⮚ Column Removal: Dropped the 'Unnamed: 0' and 'Links' columns, which were not needed for the analysis.
⮚ Consistency Check: Ensured uniform data types and addressed any inconsistencies.
⮚ Null Values: Handled null values to enhance overall data quality.
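
The cleaning steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's actual code: the sample rows are hypothetical, and every column name other than 'Unnamed: 0' and 'Links' is an assumption. Dropping rows with a missing rating is one possible way to handle nulls; the original does not say which strategy was used.

```python
import csv
import io

# Hypothetical sample standing in for the real file. Only 'Unnamed: 0' and
# 'Links' are column names taken from the slides; the rest are assumptions.
RAW_CSV = """Unnamed: 0,Rank,Title,Rating,Year,Duration,Genre,Links
0,1,Movie A,8.6,1987,145,Drama,https://example.com/a
1,2,Movie B,8.4,2013,,Crime,https://example.com/b
2,3,Movie C,,2016,161,Action,https://example.com/c
"""

DROP_COLUMNS = ("Unnamed: 0", "Links")
NUMERIC_COLUMNS = {"Rank": int, "Rating": float, "Year": int, "Duration": int}


def clean_rows(text):
    """Drop unneeded columns and cast numeric fields; empty cells become None."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        for col in DROP_COLUMNS:
            row.pop(col, None)
        for col, cast in NUMERIC_COLUMNS.items():
            value = row[col].strip()
            row[col] = cast(value) if value else None
        cleaned.append(row)
    return cleaned


def drop_incomplete(rows, required=("Rating",)):
    """Keep only rows where every required field is present."""
    return [r for r in rows if all(r[c] is not None for c in required)]


rows = clean_rows(RAW_CSV)
complete = drop_incomplete(rows)
print(f"{len(rows)} rows read, {len(complete)} with a rating")
```

Casting at load time surfaces inconsistent values early: a non-numeric entry in a numeric column raises a `ValueError` instead of silently passing through.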

Data Analysis Steps

• Univariate Analysis: Explored IMDb ratings and movie release years through visualizations.
• Bivariate Analysis: Examined relationships using scatterplots and correlation matrices.
• Multivariate Analysis: Utilized dimensionality reduction and advanced visualizations.
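
As a rough sketch of the univariate step, the snippet below builds a text histogram of ratings and computes a skewness coefficient. The ratings are illustrative values, not figures from the dataset, and `skewness` is a helper written for this example (population skewness as the mean cubed z-score).

```python
from collections import Counter
from statistics import mean, pstdev

# Illustrative ratings only; these are not values from the actual dataset.
ratings = [8.0, 8.0, 8.1, 8.1, 8.1, 8.2, 8.2, 8.3, 8.4, 8.6]


def skewness(values):
    """Population skewness: mean cubed z-score (positive means right-skewed).

    Assumes the values are not all identical (non-zero standard deviation).
    """
    mu = mean(values)
    sigma = pstdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)


# Text histogram: one '#' per movie at each rating value.
counts = Counter(ratings)
for rating in sorted(counts):
    print(f"{rating:.1f} {'#' * counts[rating]}")

print(f"skewness = {skewness(ratings):.2f}")
```

A positive coefficient indicates a longer tail toward higher ratings, which is what "right-skewed" means.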
DATA COLLECTION

Curated a dataset of 250 records from IMDb's Top 250 list for Indian
movies. Extracted crucial details including ranking, title, rating, release
year, duration, genre, and brief descriptions. This comprehensive
dataset sets the stage for a detailed exploration of highly acclaimed
Indian films.
DATA ANALYSIS
The univariate analysis brought to light the nuances of IMDb ratings and the
distribution of movie release years, providing a snapshot of individual
variables. Transitioning to bivariate analysis, we delved into relationships
between pairs of variables, using scatterplots and correlation matrices to
uncover meaningful associations. The multivariate analysis phase introduced
advanced techniques such as dimensionality reduction (PCA) and diverse
visualizations, offering insights into the intricate web of interactions among
multiple variables. Although not the primary focus, exploratory modeling
techniques were applied, contributing to a nuanced understanding of
underlying patterns within the dataset.
DATA ANALYSIS INSIGHTS

• The IMDb Top 250 list features iconic classics with near-perfect ratings.
• Movie ratings have evolved over time, reflecting changes in film quality, genres,
and audience preferences.
• Movies from the 1980s tend to have high average ratings.
• Genres like drama, crime, adventure, and action are associated with highly rated
films.
• Dramas and documentaries generally receive higher ratings; comedy and horror
have more mixed ratings.
• The 2010s saw the highest number of Top 250 releases.
• There's a modest correlation between movie duration and IMDb rating, with longer
films slightly higher-rated on average.
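
The decade and duration insights above rest on two simple computations: grouping ratings by decade and measuring the Pearson correlation between duration and rating. The following is a minimal sketch under stated assumptions: the `movies` tuples are illustrative, not records from the dataset, and `by_decade` and `pearson` are helper names introduced here.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Illustrative (year, duration_minutes, rating) tuples, not real records.
movies = [
    (1975, 160, 8.2),
    (1983, 140, 8.3),
    (1987, 150, 8.5),
    (2009, 170, 8.4),
    (2013, 130, 8.1),
    (2016, 165, 8.3),
    (2018, 125, 8.0),
]


def by_decade(records):
    """Map each decade to (number of movies, average rating)."""
    groups = defaultdict(list)
    for year, _duration, rating in records:
        groups[year // 10 * 10].append(rating)
    return {decade: (len(r), mean(r)) for decade, r in sorted(groups.items())}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


for decade, (count, avg) in by_decade(movies).items():
    print(f"{decade}s: {count} movies, average rating {avg:.2f}")

durations = [m[1] for m in movies]
scores = [m[2] for m in movies]
print(f"duration vs rating: r = {pearson(durations, scores):.2f}")
```

The coefficient ranges from -1 to 1. A small positive value would match the "modest correlation" described above, though it says nothing about why longer films rate higher.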
LIMITATIONS AND RECOMMENDATIONS

This analysis offers valuable insights, but it's important to recognize its limitations:

∙ Data Scope: The analysis relies on the provided dataset, which may not represent the entire universe of
Indian movies.

∙ Causation vs. Correlation: Correlations observed don't imply causation, as unaccounted factors may
influence trends.
CONCLUSION

In this Exploratory Data Analysis (EDA) of IMDb ratings for Indian movies, I have uncovered several key
takeaways:

Rating Distribution: The initial distribution of IMDb ratings for Indian movies in the dataset was right-skewed,
with a concentration of movies receiving ratings around 8.0.

Year-Based Analysis: I conducted a hypothesis test to compare IMDb ratings before and after the year 2000.
The results indicated that there is no significant difference in ratings between these two time periods.
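
The slides do not say which hypothesis test was used. As one possible approach, the sketch below runs a two-sided permutation test on the difference in mean rating between the two periods. The rating lists are illustrative, and `permutation_test` is a helper written for this example, not the project's code.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_resamples + 1)


# Illustrative ratings only; these are not values from the actual dataset.
before_2000 = [8.1, 8.3, 8.2, 8.4, 8.0, 8.5]
after_2000 = [8.2, 8.1, 8.3, 8.0, 8.4, 8.2]

p_value = permutation_test(before_2000, after_2000)
verdict = "significant" if p_value < 0.05 else "not significant"
print(f"p = {p_value:.3f} ({verdict} at the 0.05 level)")
```

A permutation test makes no normality assumption, which suits a rating distribution described as skewed. A two-sample t-test would be a common alternative.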
THANK YOU
