DOST AI - Coding Exercises v0.1

This document outlines an introductory Python programming exercise for analyzing movie data. It includes instructions on plotting multiple variables, creating new variables, filtering the data, removing outliers, plotting different variables, researching sampling methods, and simulating linear regression. The goal is to analyze factors like director/actor popularity and genre profitability, and predict movie scores using linear regression with standardized variables like average actor likes and IMDB scores.

Uploaded by

Lovely Jill

Python Programming

Day 1
Jupyter Notebook Familiarization
Python Programming Exercises
1. Multiple variables in a 2D Plot
2. New Variables
3. Querying: Advanced filters
4. Clearing outliers and regression on plots
5. Plotting different series
6. Researching sampling methods and how to do KDE Plots
7. Creating a function simulating linear regression
Multiple variables in a 2D Plot
Plot the following variables in one graph:
• num_critic_for_reviews
• IMDB score
• gross
• Director: Steven Spielberg versus all others
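One possible way to fit four variables into a single 2D plot is sketched below: two go on the axes, gross becomes the marker size, and the Spielberg-versus-others split becomes the colour. The data frame is a tiny made-up sample, and the column names imdb_score and director_name are assumptions about how the data set labels those fields.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up rows for illustration only; imdb_score and director_name are assumed column names.
movies = pd.DataFrame({
    "director_name": ["Steven Spielberg", "Steven Spielberg", "Jane Doe", "John Roe"],
    "num_critic_for_reviews": [300, 250, 120, 80],
    "imdb_score": [8.1, 7.6, 6.4, 5.9],
    "gross": [400e6, 250e6, 60e6, 20e6],
})

def plot_four_variables(df):
    """Scatter: x = critic reviews, y = IMDB score, size = gross, colour = Spielberg or not."""
    is_spielberg = df["director_name"] == "Steven Spielberg"
    # Scale gross into a readable marker area.
    sizes = 300 * df["gross"] / df["gross"].max()
    fig, ax = plt.subplots()
    groups = [(is_spielberg, "Steven Spielberg", "tab:red"),
              (~is_spielberg, "Others", "tab:blue")]
    for mask, label, color in groups:
        ax.scatter(df.loc[mask, "num_critic_for_reviews"],
                   df.loc[mask, "imdb_score"],
                   s=sizes[mask], color=color, alpha=0.6, label=label)
    ax.set_xlabel("num_critic_for_reviews")
    ax.set_ylabel("IMDB score")
    ax.legend()
    return ax

ax = plot_four_variables(movies)
```

Marker size and colour are only one choice of encoding; marker shape or separate panels would work as well.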
New Variables
• Compute sales (gross – budget) and store it in the same data frame
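A minimal sketch of this step, using made-up numbers. Column arithmetic in pandas is element-wise, so a missing budget simply produces a missing sales value.

```python
import pandas as pd

# Made-up values; the last movie has no recorded budget.
movies = pd.DataFrame({
    "gross": [400.0, 60.0, 20.0],
    "budget": [150.0, 80.0, None],
})

# Store the new variable in the same data frame.
movies["sales"] = movies["gross"] - movies["budget"]
```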
Querying: Advanced filters
Which directors garnered the most total sales?
1. Get the top 10 directors
2. Filter the data frame for only these directors
3. Proceed to plot
Which actors garnered the most total sales?
1. We have three actor fields. For now, get only the first one.
2. Filter the data frame for only these actors
3. Proceed to plot
4. Bonus: Create a series / dictionary for the three actor fields to their sales
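The steps above could be approached roughly as follows. This is a sketch on made-up data: the helper top_by_total_sales is hypothetical, and the column names director_name and actor_1_name to actor_3_name are assumptions about the data set.

```python
import pandas as pd

# Made-up rows; column names are assumptions.
movies = pd.DataFrame({
    "director_name": ["A", "A", "B", "C", "C", "D"],
    "actor_1_name": ["x", "y", "x", "z", "y", "z"],
    "actor_2_name": ["y", "z", "z", "x", "x", "y"],
    "actor_3_name": ["z", "w", "w", "y", "w", "w"],
    "sales": [100.0, 50.0, 30.0, -20.0, 10.0, 5.0],
})

def top_by_total_sales(df, column, n=10):
    """Return the n values of `column` with the largest summed sales."""
    return df.groupby(column)["sales"].sum().nlargest(n)

# Steps 1 and 2: find the top directors, then keep only their movies.
top_directors = top_by_total_sales(movies, "director_name", n=2)
top_director_movies = movies[movies["director_name"].isin(top_directors.index)]

# Bonus: total sales per actor across all three actor fields.
# melt stacks the three name columns into one "actor" column.
actor_columns = ["actor_1_name", "actor_2_name", "actor_3_name"]
long_form = movies.melt(id_vars=["sales"], value_vars=actor_columns,
                        value_name="actor")
actor_sales = long_form.groupby("actor")["sales"].sum().to_dict()
```

With the real data, n=10 gives the top 10, and calling the same helper with "actor_1_name" covers the first actor field.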
Clearing outliers and regression on plots
Plot sales as a function of movie_facebook_likes. Plot it as a scatterplot. Fit it with a line.
1. Create a function for Tukey's method.
2. Remove sales and movie_facebook_likes outliers for a better understanding.
3. Add jitter.
4. Plot again to check whether there is a good linear correlation.
5. Bonus: Try a nonlinear fit, e.g. a polynomial of order 2, 3, or 4.
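A rough sketch of the numerical part of these steps follows. It assumes Tukey's method means the usual fences at Q1 - 1.5*IQR and Q3 + 1.5*IQR; the helper names tukey_mask and add_jitter are hypothetical, and the data are made up.

```python
import numpy as np
import pandas as pd

def tukey_mask(series, k=1.5):
    """True where a value lies inside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.between(q1 - k * iqr, q3 + k * iqr)

def add_jitter(values, scale=0.01, seed=0):
    """Add small Gaussian noise, scaled to the data's spread, so overlapping points separate."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    return values + rng.normal(0.0, scale * values.std(), size=values.shape)

# Made-up data with one obvious outlier in movie_facebook_likes.
movies = pd.DataFrame({
    "movie_facebook_likes": [0, 10, 20, 30, 40, 50, 5000],
    "sales": [1.0, 2.1, 2.9, 4.2, 5.0, 6.1, 3.0],
})

# Keep rows that are not outliers in either variable.
keep = tukey_mask(movies["movie_facebook_likes"]) & tukey_mask(movies["sales"])
clean = movies[keep]

# Jittered x values are for display only; the fit uses the clean data.
jittered_likes = add_jitter(clean["movie_facebook_likes"])

# Degree-1 least-squares fit; raise deg to 2, 3 or 4 for the bonus fits.
slope, intercept = np.polyfit(clean["movie_facebook_likes"], clean["sales"], deg=1)
```

For the plot itself, a scatter of jittered_likes against clean["sales"] plus a line drawn from slope and intercept would complete the exercise.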
Plotting different series
Which of these genres are the most profitable? Plot their sales using different histograms, superimposed on the same axes.
• Romance
• Comedy
• Action
• Fantasy
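One way to superimpose the histograms is sketched below. It assumes a genres column holding pipe-separated tags (for example "Action|Fantasy"), which may not match the actual data set; the data are made up. Shared bin edges and partial transparency keep the overlapping bars comparable.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up rows; the pipe-separated genres column is an assumption.
movies = pd.DataFrame({
    "genres": ["Action|Fantasy", "Comedy|Romance", "Comedy",
               "Action", "Romance|Fantasy", "Drama"],
    "sales": [120.0, 30.0, 15.0, 80.0, -5.0, 10.0],
})

def plot_genre_histograms(df, genres, bins=10):
    """Overlay one sales histogram per genre on a single set of axes."""
    # Shared bin edges so the histograms line up.
    edges = np.histogram_bin_edges(df["sales"], bins=bins)
    fig, ax = plt.subplots()
    for genre in genres:
        # A movie counts toward every genre it is tagged with (no missing tags assumed).
        in_genre = df["genres"].str.split("|").apply(lambda tags: genre in tags)
        ax.hist(df.loc[in_genre, "sales"], bins=edges, alpha=0.4, label=genre)
    ax.set_xlabel("sales")
    ax.set_ylabel("number of movies")
    ax.legend()
    return ax

ax = plot_genre_histograms(movies, ["Romance", "Comedy", "Action", "Fantasy"])
```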
Researching sampling methods and how to do KDE Plots
Plot a Kernel Density Estimation plot of the following variable combinations:
• Duration and Gross
• Duration and IMDB Scores
To review, for clearer plotting, you have the following options:
• Sampling - research on this
• Jittering
• Outlier removal
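As a starting point for the research, the sketch below combines simple random sampling (pandas DataFrame.sample) with a two-dimensional Gaussian KDE from SciPy, drawn as filled contours. The data are synthetic, the column name imdb_score is an assumption, and kde_contour is a hypothetical helper; other sampling schemes and plotting libraries would serve equally well.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde

# Synthetic stand-in for the movie data.
rng = np.random.default_rng(0)
duration = rng.normal(110, 20, size=2000)
score = 6.5 + 0.01 * (duration - 110) + rng.normal(0, 1, size=2000)
movies = pd.DataFrame({"duration": duration,
                       "imdb_score": np.clip(score, 1, 10)})

def kde_contour(df, x, y, n_sample=500, grid=60, seed=0):
    """Draw a 2D Gaussian KDE of columns x and y from a simple random sample."""
    sample = df[[x, y]].dropna()
    sample = sample.sample(n=min(n_sample, len(sample)), random_state=seed)
    # gaussian_kde expects an array of shape (n_dimensions, n_points).
    kde = gaussian_kde(sample.to_numpy().T)
    xs = np.linspace(sample[x].min(), sample[x].max(), grid)
    ys = np.linspace(sample[y].min(), sample[y].max(), grid)
    xx, yy = np.meshgrid(xs, ys)
    density = kde(np.vstack([xx.ravel(), yy.ravel()])).reshape(xx.shape)
    fig, ax = plt.subplots()
    ax.contourf(xx, yy, density, levels=10)
    ax.set_xlabel(x)
    ax.set_ylabel(y)
    return ax, density

ax, density = kde_contour(movies, "duration", "imdb_score")
```

Sampling keeps the density estimate fast on large data sets; fixing random_state makes the plot reproducible.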
Preparing for Machine Learning
For this exercise, we will simulate linear regression, a common algorithm used in statistics, machine learning, and beyond.
1. Create a function for z-normalization.
2. Standardize your sales variable and save it to a new variable in the same data frame.
3. Compute average actor likes, the mean of the three actors' Facebook likes. Standardize this variable as well.
4. Create a function that takes (1) a scalar, (2) theta, and (3) a bias variable and outputs a value as close as possible to gross. Call this function predict_score.
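The four steps might look roughly like this. The sketch assumes z-normalization means subtracting the mean and dividing by the population standard deviation, and that predict_score is the linear hypothesis theta * x + bias; the actor_N_facebook_likes column names and the data are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def z_normalize(values):
    """Standardize to zero mean and unit standard deviation (population std, NaNs ignored)."""
    values = np.asarray(values, dtype=float)
    return (values - np.nanmean(values)) / np.nanstd(values)

def predict_score(x, theta, bias):
    """Linear hypothesis theta * x + bias; works on a scalar or an array."""
    return theta * x + bias

# Made-up rows; column names are assumptions.
movies = pd.DataFrame({
    "actor_1_facebook_likes": [1000, 200, 30],
    "actor_2_facebook_likes": [500, 100, 30],
    "actor_3_facebook_likes": [300, 0, 30],
    "sales": [250.0, -20.0, 10.0],
})

movies["sales_z"] = z_normalize(movies["sales"])

like_columns = ["actor_1_facebook_likes", "actor_2_facebook_likes",
                "actor_3_facebook_likes"]
movies["avg_actor_likes"] = movies[like_columns].mean(axis=1)
movies["avg_actor_likes_z"] = z_normalize(movies["avg_actor_likes"])

# Predictions for every movie with an arbitrary theta and bias.
movies["predicted"] = predict_score(movies["avg_actor_likes_z"], theta=0.5, bias=1.0)
```

The theta and bias here are placeholders; choosing good values is the subject of the next set of exercises.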
Preparing for Machine Learning
1. Create the RMSE function: a function that compares two vectors and outputs the root mean squared error (also called root mean squared deviation).
2. Find the best possible thetas by brute-forcing against the RMSE function. Create predictions for your entire dataset. Compare your predictions against the score. Achieve the smallest RMSE you can.
3. Plot the predictions from your best theta and bias variable against the IMDB score for each movie. For a cleaner plot, you should:
   1. Compile your average_actor_likes, imdb_scores and predicted to a new dataframe
   2. Limit the bounds of your predicted ratings
   3. Use a combination of scatter plot for the independent variables / features and a line plot for the predicted score
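A sketch of the RMSE function and a brute-force search is given below, under the assumption that brute-forcing means trying every (theta, bias) pair on a fixed grid. The data are synthetic, and brute_force_fit is a hypothetical helper name.

```python
import itertools
import numpy as np

def rmse(predicted, actual):
    """Root mean squared error between two equal-length vectors."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

def brute_force_fit(x, y, thetas, biases):
    """Try every (theta, bias) pair and return the one with the smallest RMSE."""
    best = None
    for theta, bias in itertools.product(thetas, biases):
        error = rmse(theta * x + bias, y)
        if best is None or error < best[2]:
            best = (theta, bias, error)
    return best

# Synthetic stand-in: x plays the standardized feature, y the score.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 0.8 * x + 6.5 + rng.normal(scale=0.1, size=200)

theta, bias, error = brute_force_fit(
    x, y,
    thetas=np.linspace(-2, 2, 41),    # step 0.1
    biases=np.linspace(0, 10, 101),   # step 0.1
)

# Step 3.2: clip predictions to a plausible rating range (1 to 10 is an assumption).
predicted = np.clip(theta * x + bias, 1, 10)
```

The grid resolution limits how close the search can get; a finer grid around the current best is a simple refinement.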
Preparing for Machine Learning
1. Convert your hypothesis function to use more variables. Standardize your new variables.

2. Compile your theta values to a new pandas dataframe which consists of the following columns:

3. Plot how each theta parameter influences the RMSE. Which one seems to be most influential?
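The sketch below shows one possible shape for these steps with two features. The column names theta_1, theta_2 and rmse are hypothetical, since the intended columns are not listed here; the data are synthetic and the bias is held fixed at 6.0 to keep the grid small.

```python
import itertools
import numpy as np
import pandas as pd

def predict_score(features, thetas, bias):
    """Multi-variable hypothesis: features @ thetas + bias."""
    features = np.asarray(features, dtype=float)
    return features @ np.asarray(thetas, dtype=float) + bias

def rmse(predicted, actual):
    """Root mean squared error between two equal-length vectors."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Two synthetic, already-standardized features and a synthetic score.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = 1.0 * X[:, 0] + 0.2 * X[:, 1] + 6.0 + rng.normal(scale=0.1, size=300)

# Evaluate every theta pair on a grid and compile the results.
grid = np.linspace(-2, 2, 21)  # step 0.2
rows = [
    {"theta_1": t1, "theta_2": t2,
     "rmse": rmse(predict_score(X, [t1, t2], 6.0), y)}
    for t1, t2 in itertools.product(grid, grid)
]
results = pd.DataFrame(rows)
best = results.loc[results["rmse"].idxmin()]

# RMSE as a function of each theta alone, at the best value of the other.
profile_1 = results.groupby("theta_1")["rmse"].min()
profile_2 = results.groupby("theta_2")["rmse"].min()
```

Plotting profile_1 and profile_2 (for example with their .plot() method) shows how the RMSE responds to each parameter.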
