ML Internship Project Report 2024

Uploaded by

kondrujagadishchoudari


REPORT OF A PROJECT

ABOUT

PREDICTING THE NUMBERS OF YOUTUBE ADVERTISEMENT VIEWERS

Submitted in fulfilment of the requirements for the award of Internship in

Machine Learning

Submitted By

KONDRU JAGADISH CHOUDARI

CT UNIVERSITY, PUNJAB, INDIA

ACKNOWLEDGEMENT

My sincerest gratitude goes to my project mentor and Machine
Learning coach. It is because of his backing and support that I
was able to complete this report. Whenever I made mistakes, he
helped me correct them and offered whatever assistance was
needed. There are not enough words to express how grateful I am.
Last but not least, I thank my beloved parents, whose constant
encouragement and moral support had a huge impact on me.
I also declare that, to the best of my knowledge and belief,
this project work has not been submitted anywhere else.

INTRODUCTION
YouTube advertisers pay content creators based on the adviews and clicks received by the goods or
services they are promoting. Adviews can also be estimated from other metrics such as likes and comments.
The problem at hand is therefore to train a number of regression models and select the one that best
predicts adviews. The data has to be cleaned before it is fed into the algorithms in order to obtain
better results.

Objective

To build a machine learning regression model that predicts the YouTube adview count from other
YouTube metrics.

Technology and Concepts

Machine Learning

In classic terms, machine learning is a type of artificial intelligence that enables a system to learn
from data and then apply that learning without the need for human intervention.

Linear Regression

Linear Regression is a supervised machine learning algorithm where the predicted output is
continuous and has a constant slope. It is used to predict values within a continuous range (e.g. sales,
price) rather than to classify them into categories (e.g. cat, dog).

There are two main types:

1. Simple regression, which uses a single input feature.
2. Multiple regression, which uses several input features.
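
As an illustration of simple regression, the sketch below fits a straight line by ordinary least squares using only the Python standard library. The function name `fit_simple_regression` and the toy numbers are made up for this example and are not taken from the project's dataset.

```python
from statistics import mean

def fit_simple_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Made-up illustration data that lies exactly on y = 2x + 1.
views = [1.0, 2.0, 3.0, 4.0]
adviews = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_regression(views, adviews)
predicted = slope * 5.0 + intercept  # prediction for an unseen x = 5
```

Multiple regression follows the same idea with one slope per feature, which is usually solved with a linear algebra library rather than by hand.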

Support Vector Machine

“Support Vector Machine” (SVM) is a supervised machine learning algorithm that can be used for
either classification or regression problems, although it is mostly used for classification. In the
SVM algorithm, we plot each data item as a point in n-dimensional space (where n is the number of
features), with the value of each feature being the value of a particular coordinate.

Decision Tree
Decision tree analysis involves making a tree-shaped diagram to chart out a course of action or a
statistical probability analysis. It is used to break a complex problem down into branches. Each branch
of the decision tree represents a possible outcome.
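
For a regression task such as this one, a tree commonly chooses each branch by picking the split threshold that minimises the squared error of the two resulting groups. The sketch below shows that idea for a single split on a single feature. The helper names (`sse`, `best_split`) and the toy data are assumptions for illustration, not the project's code.

```python
def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, total_sse) for the best single split on one feature."""
    best = None
    candidates = sorted(set(xs))
    # Try the midpoint between each pair of neighbouring feature values.
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best

# Made-up data: low-like videos have few adviews, high-like videos have many.
likes = [1, 2, 3, 10, 11, 12]
adviews = [5, 6, 5, 40, 42, 41]

threshold, error = best_split(likes, adviews)
```

A full decision tree repeats this search recursively inside each branch and across all features.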

Artificial Neural Network (ANN)

An artificial neural network (ANN) is a component of a computing system designed to simulate the
way the human brain analyzes and processes information. It is the foundation of artificial
intelligence (AI) and solves problems that would prove impossible or difficult by human or
statistical standards. ANNs have self-learning capabilities that enable them to produce better results
as more data becomes available.

Data Description

The file train.csv contains metrics and other details for about 15,000 YouTube videos. The metrics
include the number of views, likes, dislikes and comments. The published date, duration and
category of each video are also included. The train.csv file also contains the number of adviews,
which is our target variable for prediction.

Steps For adview prediction

1. Import the datasets and libraries, check shape and datatype.

2. Visualise the dataset using heatmaps and other plots. You can also study the data distribution
of each attribute.

3. Clean the dataset by removing missing values and other invalid entries.

4. Transform attributes into numerical values and apply any other necessary transformations.

5. Normalise your data and split the data into training, validation and test sets in the appropriate
ratio.

6. Train Linear Regression and Support Vector Regressor models and compute their errors.

7. Use Decision Tree Regressor and Random Forest Regressors.

8. Build an artificial neural network and train it with different layers and hyperparameters.
Experiment a little. Use Keras.

9. Pick the best model based on error as well as generalisation.

10. Load the test dataset, test.csv.

11. Clean the test dataset by removing missing values.

12. Remove unnecessary columns that have no impact on the target variable.

13. Transform the categorical attributes into numerical attributes.

14. Generate predictions using the best algorithm.

15. Save the predictions to a new CSV file named Predictions_Submission.csv.
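
The cleaning, transformation, normalisation and splitting steps above can be sketched as follows, using only the Python standard library. Everything specific here is an assumption for illustration: the column subset, the category labels, the use of a non-numeric string to mark a bad value, and the 50/25/25 split ratio are hypothetical and are not taken from train.csv.

```python
import random

# Hypothetical rows; column names and values are assumptions for illustration.
rows = [
    {"views": "1200", "likes": "30", "category": "A", "adview": "40"},
    {"views": "560", "likes": "F", "category": "B", "adview": "2"},  # bad likes value
    {"views": "9800", "likes": "410", "category": "A", "adview": "15"},
    {"views": "75", "likes": "1", "category": "C", "adview": "1"},
    {"views": "4300", "likes": "95", "category": "B", "adview": "8"},
]
NUMERIC = ["views", "likes", "adview"]

def is_number(text):
    """True if the string can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False

# Cleaning: drop rows whose numeric fields are missing or malformed.
clean = [r for r in rows if all(is_number(r[c]) for c in NUMERIC)]

# Transformation: map each category label to an integer code.
codes = {label: i for i, label in enumerate(sorted({r["category"] for r in clean}))}
table = [
    {"views": float(r["views"]), "likes": float(r["likes"]),
     "category": codes[r["category"]], "adview": float(r["adview"])}
    for r in clean
]

def min_max(column):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in column]

# Normalisation: rescale each feature column (the target is left unchanged).
for feature in ("views", "likes"):
    scaled = min_max([r[feature] for r in table])
    for r, v in zip(table, scaled):
        r[feature] = v

def split(data, train=0.5, val=0.25, seed=0):
    """Shuffle with a fixed seed, then cut into train, validation and test."""
    data = data[:]
    random.Random(seed).shuffle(data)
    n_train = int(len(data) * train)
    n_val = int(len(data) * val)
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

train_set, val_set, test_set = split(table)
```

This sketch normalises before splitting, in the order the steps list them. Computing the scaling minimum and maximum on the training split only, and reusing them for the other splits, avoids leaking information from the validation and test data.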

Visualization

This is the histogram of the “Category” column.

This is the histogram of the “adview” column.

This is the heatmap showing the correlation of each column with every other column.
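
The values behind such a heatmap are pairwise correlation coefficients. As a minimal sketch, assuming the Pearson coefficient is the one being plotted, the code below computes a correlation matrix for a few made-up columns. The column values are hypothetical and only chosen to give easily checked results.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up columns: likes rise with views, dislikes fall with views.
columns = {
    "views": [100, 200, 300, 400],
    "likes": [10, 20, 30, 40],
    "dislikes": [4, 3, 2, 1],
}

names = list(columns)
# matrix[i][j] is the correlation between column i and column j.
matrix = [[pearson(columns[a], columns[b]) for b in names] for a in names]
```

Each cell of the heatmap corresponds to one entry of `matrix`, with values near 1 or -1 indicating a strong linear relationship.
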

Table: error metrics for each algorithm

Algorithm                 Mean Absolute Error   Mean Squared Error    Root Mean Squared Error
Linear Regression         3707.378005824529     835663131.1210335     28907.83857573986
Random Forest             3274.6902966905504    644433788.0361483     25385.70046376795
Decision Tree             3059.310792349727     1226286165.4118853    35018.3689713254
Support Vector Machine    3707.378005824529     835663131.1210335     28907.83857573986
ANN                       3304.264894606637     829552666.7955565     28801.95595433679
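
For reference, the three metrics in the table can be computed as in the sketch below. The function names and the small `y_true` and `y_pred` lists are made up for illustration. The last line checks one relationship that can be verified from the table itself: the root mean squared error is the square root of the mean squared error.

```python
from math import sqrt

def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error: average squared prediction error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the MSE."""
    return sqrt(mse(y_true, y_pred))

# Made-up targets and predictions.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.0, 5.0, 4.0, 4.0]

example_mae = mae(y_true, y_pred)
example_mse = mse(y_true, y_pred)
example_rmse = rmse(y_true, y_pred)

# Consistency check against the Random Forest values reported in the table.
random_forest_rmse = sqrt(644433788.0361483)
```

Because the errors are squared before averaging, MSE and RMSE penalise a few large errors much more heavily than MAE does.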

Best Model
After training every algorithm on the training dataset, we found that the Random Forest
Regressor has the lowest root mean squared error of all the algorithms. Since a lower root
mean squared error indicates a better fit, we use the Random Forest algorithm to predict on
the test dataset.
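
The selection can be expressed as a lookup over the reported errors. The sketch below copies the root mean squared error and mean absolute error values from the results table and picks the model with the smallest value of each. The dictionary and variable names are chosen for this example.

```python
# Root mean squared errors as reported in the results table.
rmse_by_model = {
    "Linear Regression": 28907.83857573986,
    "Random Forest": 25385.70046376795,
    "Decision Tree": 35018.3689713254,
    "Support Vector Machine": 28907.83857573986,
    "ANN": 28801.95595433679,
}

# Mean absolute errors as reported in the results table.
mae_by_model = {
    "Linear Regression": 3707.378005824529,
    "Random Forest": 3274.6902966905504,
    "Decision Tree": 3059.310792349727,
    "Support Vector Machine": 3707.378005824529,
    "ANN": 3304.264894606637,
}

# Model with the smallest error under each metric.
best_by_rmse = min(rmse_by_model, key=rmse_by_model.get)
best_by_mae = min(mae_by_model, key=mae_by_model.get)

# Full ranking from lowest to highest RMSE.
ranking = sorted(rmse_by_model, key=rmse_by_model.get)
```

The two metrics disagree on these numbers: Random Forest has the lowest RMSE, while Decision Tree has the lowest MAE but the highest RMSE. The choice of Random Forest therefore depends on using RMSE as the selection criterion.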

Conclusions

We had a lot of different ideas for the project, but our original goals may have been too
ambitious. Our aim was to predict the view count of an advertisement, and with the trained model we can
predict the adviews of an advertisement. We were hoping that. Some more things that we could have
tried if we had more time would include.
