Python Problem Statement
Abstract:
YouTube, the world-famous video-sharing website, maintains a list of the top trending
videos on the platform. According to Variety magazine, to determine the year's
top-trending videos, YouTube uses a combination of factors, including measures of user
interactions (number of views, shares, comments and likes). Note that they are not the
most-viewed videos overall for the calendar year.
Problem Statement:
Read the YouTube data and perform exploratory data analysis (EDA).
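A first pass at this task could look like the sketch below. It assumes the data arrives as a CSV file with numeric columns such as views, likes and comment_count; those column names and the three sample rows are hypothetical placeholders, not taken from the dataset. Only the standard library is used so the sketch runs anywhere; with pandas installed, pandas.read_csv followed by DataFrame.describe() gives a similar overview.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for the real file; column names are assumptions.
SAMPLE_CSV = """video_id,views,likes,comment_count
a1,1000,100,10
b2,3000,150,30
c3,2000,50,20
"""


def load_rows(handle):
    """Read CSV rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def summarise(rows, column):
    """Return basic descriptive statistics for one numeric column."""
    values = [int(row[column]) for row in rows]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


rows = load_rows(io.StringIO(SAMPLE_CSV))
views_summary = summarise(rows, "views")
print(views_summary)
```

For the real data, the in-memory sample would be replaced by an open file handle, for example open(path, newline="", encoding="utf-8"), where path is whatever location the dataset is stored at.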
Dataset Information:
This dataset is a daily record of the top trending YouTube videos: the top 200 trending
videos of each day. The original data was collected between 14th November 2017 and 5th
March 2018 (though data for January 10th and 11th of 2018 is missing). The original
dataset was collected using the YouTube API.
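Because the record is daily and some days are known to be missing, one useful early check is to list the calendar days that have no data. The sketch below is one way to do that under stated assumptions: the observed dates are simulated here rather than read from the dataset, and the two missing January days are assumed to fall inside the collection window.

```python
from datetime import date, timedelta


def missing_days(observed, start, end):
    """Return the calendar days in [start, end] that are absent from observed."""
    observed = set(observed)
    span = (end - start).days + 1
    candidates = (start + timedelta(days=i) for i in range(span))
    return [day for day in candidates if day not in observed]


# Collection window stated in the dataset description.
START, END = date(2017, 11, 14), date(2018, 3, 5)

# Hypothetical stand-in for the dates found in the data: every day in the
# window except two January days (assumed to fall inside the window).
all_days = [START + timedelta(days=i) for i in range((END - START).days + 1)]
gap = {date(2018, 1, 10), date(2018, 1, 11)}
observed_days = [day for day in all_days if day not in gap]

missing = missing_days(observed_days, START, END)
print(len(all_days), missing)
```

With the real data, observed_days would come from the dataset's date column (whatever it is named), parsed into datetime.date values.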
Variable Description:
Column                   Description
Tags                     Tags are descriptive keywords you can add to your video to
                         help viewers find your content.
Like dislike disabled    It represents a boolean value. True represents comments are
                         enabled and False represents comments are disabled.
Tag appeared in title    It represents a boolean value. True represents that the
                         respective tag has appeared in a particular video. False
                         represents that the respective tag has not appeared in a
                         particular video.
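To make the "Tag appeared in title" flag concrete, the sketch below shows one way such a boolean could be derived. It is an illustration only: the pipe-separated tag string, the case-insensitive substring match, and the helper names split_tags and tag_appeared_in_title are all assumptions, not a description of how the dataset was actually built.

```python
def split_tags(raw_tags, sep="|"):
    """Split a raw tag string into cleaned, lower-cased tags.

    The separator and the optional surrounding quotes are assumptions
    about how tags might be stored.
    """
    return [t.strip().strip('"').lower() for t in raw_tags.split(sep) if t.strip()]


def tag_appeared_in_title(tag, title):
    """Return True if the tag occurs in the title, ignoring case."""
    return tag.lower() in title.lower()


# Hypothetical example values, not taken from the dataset.
title = "Official Trailer: Space Cats"
raw_tags = 'trailer|"space cats"|comedy'

flags = {tag: tag_appeared_in_title(tag, title) for tag in split_tags(raw_tags)}
print(flags)
```

A plain substring match also fires on partial words (the tag "cat" would match "Cats"), so a stricter word-boundary comparison may be preferable depending on the analysis.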
Scope:
● Sentiment analysis in a variety of forms
● Categorising YouTube videos based on their comments and statistics.
● Training ML algorithms like RNNs to generate their own YouTube comments.
● Analysing what factors affect how popular a YouTube video will be.
● Statistical analysis over time
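As a small illustration of the "what factors affect popularity" item above, the sketch below computes a Pearson correlation between two interaction measures. The numbers are made up for the example, and correlation is only a first look at how variables move together; it does not establish that one factor causes another.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values, not taken from the dataset.
views = [1000, 2000, 3000, 4000]
likes = [110, 190, 310, 390]

r = pearson(views, likes)
print(round(r, 3))
```

Values of r near 1 or -1 indicate a strong linear relationship, while values near 0 indicate little linear relationship between the two variables.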
Learning Outcome:
Students will gain a better understanding of how the variables are linked to each
other and how the EDA approach helps them draw more insight and knowledge
from the data.