Data Science Seminar

Uploaded by

Alvin Sibayan

Data Science Seminar

Polytechnic University of the Philippines

Dr. Rodolfo C. Raga Jr.

September 07, 2019

https://lookaside.fbsbx.com/file/PUP%20presentation.pptx?token=AWyafMrkUbuiEBNQ…d5Xh8AQQJWmqiML9gXfMKoswFiRwJiM9X9tiQUZ94p1stSrR5pzAdyM5Bf7NkCPLVbH4 08/09/2019, 6_36 PM


Page 1 of 58
Outline of Talk

• Data Science and Industry 4.0 (The 4th Industrial Revolution)
• Correlation and Data Analysis using Excel
• Building Information Dashboards
• Introduction to Machine Learning
• Brief introduction to Python and Jupyter Notebook
• Introduction to Deep Learning

The Industrial Revolutions

1765-1870 1870-1969 1950-1970 2011-Present

Image lifted from https://en.wikipedia.org/wiki/Industrial_Revolution

Technology Trends in Industry 4.0

Source: Capgemini

Byproduct of Industry 4.0 Technology

Business Organizations are Collecting Huge Amounts of Data Footprints

DATA HARNESSING
Companies store each piece of information generated during the business
operations and customer interactions.

DATA VOLUMES
Year    Volume (trillion GB)
2010    1.2
2012    2.4
2015    7.9

Data Science in Industry 4.0 Business Environment
Data on its own is useless unless you can make sense of it!

THIS IS WHERE DATA SCIENCE COMES IN…


DATA SCIENCE involves transforming data into insight for
making better decisions, offering new opportunities for a
competitive advantage

Why Learn Data Science?

Data Science is Reshaping Competition

Source: World Economic Forum

Data Science is Reshaping Competition

Source: World Economic Forum

Analytics Leaders in the Business World
The application of analytics has enabled these companies to establish themselves as
some of the most valuable companies in the world

Source: McKinsey Global Institute – The Age of Analytics: Competing in a Data Driven World (2016)

The same pattern on the next wave of disruptors
The business models of these companies are all predicated on data and data science

Source: McKinsey Global Institute – The Age of Analytics: Competing in a Data Driven World (2016)

Why Data Science is a Critical Skill to learn

$16.9 Billion
Industry

Demand for Data Scientists

Why Data Science is a Critical Skill to learn

CS 561, Lecture 1

Data Science =
Business Intelligence +
Business Analytics

The Role of BI and BA in Data Science

Source: Udemy Data Science Bootcamp 2019

Pillars of Data Science

Pillars of Data Science

CS 561, Lecture 1

Cycle of BI and BA Processing
Analytics have enabled these companies to establish themselves as some of the
most valuable companies in the world

[Diagram: a cycle with the stages Data Collection, Data Processing, Business Intelligence, Business Analytics, and Data Archiving]

BI and BA Demo Using Excel

INTRODUCTION TO CORRELATION

Basic Terminology
• Dataset: A collection of data. It commonly corresponds to the
contents of a single database table or a single statistical data
matrix.

• Data: The facts and figures collected, analyzed, and summarized
for presentation and interpretation

• Variable: A characteristic or a quantity of interest that can take on
different values

• Observation: The set of values corresponding to a set of variables

• Variation: The difference in a variable measured over observations

A dataset

Columns are the variables (attributes), rows are the observations, and the cell values are the data.

Student#   Height   Weight   Gender   Age   GPA
001        5’6”     120lbs   M        23    2.35
002        4’0”     200lbs   F        19    3.25
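The terminology above can be made concrete with a small sketch in Python. The field names (`student_no`, `weight_lbs`, and so on) are hypothetical labels chosen for illustration; only the values come from the slide.

```python
# Each dictionary is one observation (a row); each key is a variable (a column).
# Field names are illustrative assumptions; the values are those on the slide.
dataset = [
    {"student_no": "001", "height": "5'6\"", "weight_lbs": 120,
     "gender": "M", "age": 23, "gpa": 2.35},
    {"student_no": "002", "height": "4'0\"", "weight_lbs": 200,
     "gender": "F", "age": 19, "gpa": 3.25},
]

# The variables are the keys shared by every observation.
variables = list(dataset[0].keys())

# All recorded values of a single variable across the observations.
ages = [row["age"] for row in dataset]

# A crude measure of variation in that variable: its range.
variation_in_age = max(ages) - min(ages)

print(variables)
print(ages, variation_in_age)
```

A list of dictionaries is only one possible layout; a spreadsheet or a database table expresses the same rows-and-columns idea.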

Correlation
• The term "correlation" refers to a mutual
relationship or association between quantities or
variables in a dataset.
• It is concerned with the strength of the relationship
and does not indicate a causal effect.
• Correlation is considered a useful metric:
– It can (but often does not, as we will see in some examples below)
indicate the presence of a causal relationship
– It is used as a basic quantity and foundation for many other
modeling techniques
– It can help in predicting one quantity from another

Correlation by Intuition

Student# Height Weight Gender Age GPA


001 5’6” 120lbs M 23 2.35
002 4’0” 200lbs F 19 3.25

Scatter Plots and Correlation
• It is always a good idea to use visualization techniques
to get a better picture of how variables relate to each
other.
• A scatter plot (or scatter diagram) is the most commonly
used diagram for graphically depicting the relationship
between two variables
– It uses Cartesian coordinates
– It represents two quantitative variables
– One variable is called independent (X) and the second is
called dependent (Y)

Example
Scatter diagram of weight and systolic blood pressure

Wt. (kg)   SBP (mmHg)
67         120
69         125
85         140
83         160
74         130
81         180
97         150
92         140
114        200

Each row is one point on the plot, for example {x=67, y=120}.
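As a rough illustration of how this table becomes a scatter plot, the sketch below pairs each weight (x) with its systolic blood pressure (y) and draws the points on a character grid. The helper `text_scatter` and its grid size are hypothetical and use only the standard library; a plotting library or the Excel chart from the demo would normally be used instead.

```python
# Data from the slide: weight in kg (x) and systolic blood pressure in mmHg (y).
weights_kg = [67, 69, 85, 83, 74, 81, 97, 92, 114]
sbp_mmhg = [120, 125, 140, 160, 130, 180, 150, 140, 200]

def text_scatter(xs, ys, width=40, height=12):
    """Return a crude text scatter plot: one '*' per (x, y) point."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each coordinate onto the grid; y grows upward, so flip the row.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line) for line in grid)

# Each observation becomes one (x, y) point.
points = list(zip(weights_kg, sbp_mmhg))
plot = text_scatter(weights_kg, sbp_mmhg)
print(plot)
```

The points drift from the lower left to the upper right, which is the visual signature of a positive relationship.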

Scatter Plot Examples
[Figure: four scatter plots of y against x. Left column: linear relationships. Right column: curvilinear relationships.]

Scatter Plot Examples
[Figure: four scatter plots of y against x. Left column: strong relationships. Right column: weak relationships.]

Scatter Plot Examples

[Figure: two scatter plots of y against x. Left: positive relationship. Right: negative relationship.]

Scatter Plot Examples
No relationship

Correlation Coefficient
(continued)

• The population correlation coefficient ρ (rho) measures the
strength of the association between the variables

• The sample correlation coefficient r is an estimate of ρ and is
used to measure the strength of the linear relationship in the
sample observations

Characteristics of ρ and r
• Unit free
• Ranges between -1 and 1
• The closer to -1, the stronger the negative
linear relationship
• The closer to 1, the stronger the positive
linear relationship
• The closer to 0, the weaker the linear
relationship

➢ The value of r denotes the strength of the association, as
illustrated by the following scale.

Value of r        Strength                     Direction
-1                perfect linear correlation   inverse
-1 to -0.75       strong                       inverse
-0.75 to -0.25    intermediate                 inverse
-0.25 to 0        weak                         inverse
0                 no relation
0 to 0.25         weak                         direct
0.25 to 0.75      intermediate                 direct
0.75 to 1         strong                       direct
1                 perfect linear correlation   direct
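The scale above can be turned into a small lookup, sketched here in Python. The function name `describe_r` is hypothetical, and the treatment of values that fall exactly on a boundary (0.25 or 0.75) is an assumption, since the diagram does not say which side they belong to.

```python
def describe_r(r: float) -> str:
    """Label a correlation coefficient using the 0.25 / 0.75 cut-offs on the slide."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no relation"
    direction = "direct" if r > 0 else "inverse"
    size = abs(r)
    if size == 1:
        return f"perfect {direction} linear correlation"
    # Assumption: boundary values are assigned to the stronger category.
    if size >= 0.75:
        strength = "strong"
    elif size >= 0.25:
        strength = "intermediate"
    else:
        strength = "weak"
    return f"{strength} {direction} relationship"

for value in (-1, -0.6, 0, 0.3, 0.886):
    print(value, "->", describe_r(value))
```

These labels are a rule of thumb; what counts as a strong correlation varies by field.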

Scatterplot Visualizations of r Values

[Figure: five scatter plots of y against x: (1) r = -1, (2) r = -0.6, (3) r = 0, (4) r = +0.3, (5) r = +1]

Real examples of correlation

Calculating the Correlation Coefficient
Sample correlation coefficient:

where:
r = Sample correlation coefficient
n = Sample size
x = Value of the independent variable
y = Value of the dependent variable
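The sample correlation coefficient is commonly written in the computational form below, which uses only the symbols defined above (r, n, x, y). It is the standard textbook expression and is assumed here to be the one this slide showed; each sum runs over the n sample observations.

```latex
r = \frac{n\sum xy - \sum x \sum y}
         {\sqrt{\left[\, n\sum x^{2} - \left(\sum x\right)^{2} \right]
                \left[\, n\sum y^{2} - \left(\sum y\right)^{2} \right]}}
```

The numerator measures how x and y vary together, and the denominator rescales it by the separate variation of each variable, which is why r is unit free and stays between -1 and 1.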

Calculation Example
Q1: Is there a correlation between the height of the tree and its trunk diameter?

Tree   Height (y)   Trunk Diameter (x)   xy      y²       x²
1      35           8                    280     1225     64
2      49           9                    441     2401     81
3      27           7                    189     729      49
4      33           6                    198     1089     36
5      60           13                   780     3600     169
6      21           7                    147     441      49
7      45           11                   495     2025     121
8      51           12                   612     2601     144
Sum    Σy = 321     Σx = 73              Σxy = 3142   Σy² = 14111   Σx² = 713
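The column sums in this table are the inputs to the sums-based form of the sample correlation coefficient. A minimal Python sketch of that calculation follows; the function name `pearson_r` is an illustrative choice, and the data are the eight trees from the table.

```python
from math import sqrt

# Data from the table: y is tree height, x is trunk diameter.
heights = [35, 49, 27, 33, 60, 21, 45, 51]
diameters = [8, 9, 7, 6, 13, 7, 11, 12]

def pearson_r(xs, ys):
    """Sample correlation coefficient computed from the column sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

r = pearson_r(diameters, heights)
print(round(r, 3))  # 0.886
```

With the sums from the table, the numerator is 8(3142) - (73)(321) = 1703 and the denominator is the square root of (375)(9847), about 1921.6, giving r of about 0.886.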

Calculation Example (continued)

[Figure: scatter plot of Tree Height, y, against Trunk Diameter, x]

r = 0.886 → relatively strong positive linear association between x and y

Excel Output
Excel Correlation Output
Menu path: Data / Data Analysis / Correlation…

[Figure: Excel output showing the correlation between Tree Height and Trunk Diameter]

INTRODUCTION TO LINEAR REGRESSION

Regression Analysis
In statistical modeling, regression
analysis is a set of statistical processes for
estimating the relationships among
variables.

Regression analysis is also used to understand
which among the independent variables are
related to the dependent variable.

Purpose of Regression Analysis

• The purpose of regression analysis is to analyze
relationships among variables.
• The analysis is carried out through the
estimation of a relationship, and the results serve
the following two purposes:

1. Answer the question of how much y changes
with changes in each of the x's (x1, x2, ..., xk)
– y is the dependent variable
– Each x is an independent variable

Does Y Really Change with X?
• The computed p-value for each pair of variables tests the null
hypothesis that X has no effect on Y.

• A low (significant) p-value (< 0.05) indicates that changes in
the X variable’s value are related to changes in the Y variable’s
value.

• Conversely, a larger (insignificant) p-value means the data do not
provide enough evidence that changes in the predictor are
associated with changes in the response.
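The slide does not say how the p-value is obtained, so the following is only a sketch under stated assumptions. For a single predictor, testing whether the slope is zero is equivalent to testing whether the correlation is zero, using t = r·sqrt((n - 2) / (1 - r²)) with n - 2 degrees of freedom. Python's standard library has no t-distribution, so instead of computing a p-value the sketch compares t with the tabulated two-tailed 5% critical value for 6 degrees of freedom (2.447). The inputs r = 0.886 and n = 8 are taken from the tree example in this deck.

```python
from math import sqrt

def t_statistic_from_r(r: float, n: int) -> float:
    """t statistic for H0: no linear relationship, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r ** 2))

# Values from the tree height / trunk diameter example.
r, n = 0.886, 8
t = t_statistic_from_r(r, n)

# Two-tailed critical value for alpha = 0.05 and 6 degrees of freedom,
# read from a standard t-table (an external constant, not from the slides).
T_CRITICAL_DF6 = 2.447

# |t| above the critical value corresponds to a p-value below 0.05.
significant = abs(t) > T_CRITICAL_DF6
print(round(t, 2), significant)
```

Excel's regression output reports the p-value directly, so this comparison is only needed when working by hand.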

Simple Linear Regression Model

• A simple linear regression model is a statistical
method that allows us to summarize and study
relationships between two continuous
(quantitative) variables:
– One variable, denoted x, is regarded as the predictor,
explanatory, or independent variable.
– The other variable, denoted y, is regarded as the
response, outcome, or dependent variable.

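The Simple Linear Regression Model

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

where:
$Y_i$ = Dependent variable
$\beta_0$ = Population Y intercept
$\beta_1$ = Population slope coefficient
$X_i$ = Independent variable
$\varepsilon_i$ = Random error term

$\beta_0 + \beta_1 X_i$ is the linear component and $\varepsilon_i$ is the random error component.

Source: Statistics for Managers Using Microsoft Excel, 5e © 2008 Prentice-Hall, Inc.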

Estimated Linear Regression
(continued)

[Figure: a fitted line plotted on axes x and y, with Slope = β1 and Intercept = β0. At a given xi the plot marks the observed value of y and the predicted value of y on the line.]

Linear Regression Equation

The simple linear regression equation provides an
estimate of the population regression line

$\hat{Y}_i = b_0 + b_1 X_i$

where:
$\hat{Y}_i$ = Estimated (or predicted) Y value for observation i
$b_0$ = Estimate of the regression intercept
$b_1$ = Estimate of the regression slope
$X_i$ = Value of X for observation i

Interpretation of the Intercept and the Slope

• b0 is the estimated mean value of Y
when the value of X is zero

• b1 is the estimated change in the mean
value of Y for every one-unit change in X
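To make the two estimates concrete, here is a minimal sketch that fits a line to the tree height and trunk diameter data used earlier in this deck. The slides do not name the estimation method; ordinary least squares, the usual choice, is assumed. The function name `fit_line` and the example diameter of 10 are illustrative.

```python
# Data from the tree example: y is tree height, x is trunk diameter.
heights = [35, 49, 27, 33, 60, 21, 45, 51]
diameters = [8, 9, 7, 6, 13, 7, 11, 12]

def fit_line(xs, ys):
    """Ordinary least squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # estimated change in mean y per one-unit change in x
    b0 = mean_y - b1 * mean_x    # estimated mean y when x is zero
    return b0, b1

b0, b1 = fit_line(diameters, heights)
predicted = b0 + b1 * 10  # predicted height for a trunk diameter of 10

print(round(b0, 3), round(b1, 3), round(predicted, 2))
```

Here b1 is about 4.54, so each additional unit of trunk diameter goes with roughly 4.5 more units of height. The intercept b0 is about -1.31, which has no physical meaning because a diameter of zero lies outside the observed data.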

Multiple Linear Regression Model
• Typically, we want to use more than a single
predictor (independent variable) to make
predictions

• Regression with more than one predictor is
called “multiple regression”

Model and Required Conditions
• We allow for n independent variables to
potentially be related to the dependent
variable

$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n + e$

where:
$Y$ = Dependent variable
$X_1, \dots, X_n$ = Independent variables
$b_0, \dots, b_n$ = Coefficients
$e$ = Random error variable
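Once the coefficients are known, using the model is just a weighted sum. The sketch below evaluates b0 + b1X1 + ... + bnXn for one observation; the coefficient values and the observation are hypothetical numbers chosen for illustration, not results from the seminar, and the error term e is left out because it cannot be observed.

```python
def predict(coefficients, x_values):
    """Return b0 + b1*x1 + ... + bn*xn for one observation."""
    intercept, *slopes = coefficients
    if len(slopes) != len(x_values):
        raise ValueError("need exactly one x value per slope coefficient")
    return intercept + sum(b * x for b, x in zip(slopes, x_values))

# Hypothetical coefficients (b0, b1, b2) and one observation (x1, x2).
coefficients = [2.0, 0.5, -1.0]
observation = [4.0, 3.0]

y_hat = predict(coefficients, observation)
print(y_hat)  # 2.0 + 0.5*4.0 - 1.0*3.0 = 1.0
```

Estimating the coefficients from data is done by the regression tool itself, for example Excel's Data Analysis add-in shown in the demo.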

Advanced Analysis Using Python and Jupyter Notebook

What are notebooks?
A notebook combines the functionality of:
• a word processor that handles formatted text
• a "shell" that executes statements in a programming
language and includes the output inline
• a rendering engine that renders graphical output

The Jupyter Notebook
The Jupyter Notebook App is a server-client application that allows editing and
running notebook documents via a web browser.

The Jupyter Notebook App can be executed on a local desktop requiring no internet
access or it can be installed on a remote server and accessed through the internet.

Best of all, as part of the open source Project Jupyter, it is completely free.

Jupyter Notebook Demo

Python Programming Language
• Python is one of the most popular programming
languages used by data analysts and data
scientists.
• It is free and open source, and serves as a
general-purpose programming language.
• It is well suited to anyone interested in machine
learning, working with large datasets, or
creating complex data visualizations.

Python Capabilities Demo

DISCUSSIONS…

Thank you for listening

QUESTIONS???
