Lecture 3 - Machine learning and data driven analysis
❑ Machine learning is a method of achieving artificial intelligence by learning from data

Artificial intelligence: the definition is broad, and the concept appeared first. It includes traditional expert systems and traditional automatic voice customer service, which respond according to hard-coded programs and have no ability to learn independently. Every detail needs to be designed in advance.

Machine learning: compared with traditional AI, it has a data-based learning function. Commonly used models include kNN, SVM, etc., but their ability to process complex data such as images and voices is limited.

Deep learning: a further development of machine learning, generally referring to bionic neural network models. These are more powerful and can process unstructured data such as images and voices more effectively, but require more computing power.

Development line: from "every detail designed in advance" (artificial intelligence) towards "less human design and intervention" (deep learning).
How do machines learn?
[Diagram: Training (data) → Predicting (output)]
What is Machine Learning
Machine learning (ML) is the study of computer algorithms that
improve automatically through experience. Machine learning
algorithms build a model based on sample data, known as "training
data", in order to make predictions or decisions without being
explicitly programmed to do so.
What is Machine Learning
In short, a machine learning model learns the mapping f from the input features X₁, X₂, …, Xₙ to the output Y:

Y = f(X₁, X₂, …, Xₙ)
In summary, machine learning is an essential tool in the toolbox of data-driven analysis, allowing for the
extraction of valuable information and insights from data. By leveraging the power of machine learning,
data-driven analysis can lead to more informed decisions, better predictions, and increased efficiency in
various domains.
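As a concrete (toy) illustration of "learning" the mapping f from training data, the sketch below fits a linear f to a few samples and then predicts an unseen input; the data and variable names are our own, not from the lecture:

```python
import numpy as np

# Training data: input feature X and observed output Y (toy values)
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([2.1, 3.9, 6.2, 7.8])   # roughly Y = 2*X

# "Learn" f from the training data, here assuming a linear form Y = a*X + b
a, b = np.polyfit(X, Y, deg=1)

# Use the learned f to predict the output for a new, unseen input
x_new = 5.0
print(f"f({x_new:.1f}) = {a * x_new + b:.2f}")
```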
Examples of ML methods

Machine Learning
● Supervised Learning: Linear Regression, SVR, GPR, Ensemble Methods, Decision Trees, Neural Networks, Support Vector Machine, Discriminant Analysis, Naive Bayes, Nearest Neighbor
● Unsupervised Learning: K-Means, K-Medoids, Fuzzy C-Means, Gaussian Mixture, Neural Networks, Hidden Markov Model
Build the simplest machine learning/data-driven model: linear regression
Regression, one of the most established
supervised learning approaches
A brief history of the term “regression”
The method of least squares, which was first introduced by Legendre in
1805 and later by Gauss in 1809, was the earliest form of regression. They
both applied this method to determine the orbits of celestial bodies around
the sun, particularly comets and newly discovered minor planets. Gauss
further developed the theory of least squares in 1821, which also included
his version of the Gauss-Markov theorem.
Francis Galton coined the term "regression" in the 19th century to describe
the biological phenomenon of the tendency for the heights of descendants
of tall ancestors to regress down towards a normal average, also known as
regression toward the mean. Initially, Galton's work only focused on this
biological meaning of regression, but Udny Yule and Karl Pearson later
extended it to a more general statistical context. In the work of Yule and
Pearson, the joint distribution of the response and explanatory variables is
assumed to be a Gaussian distribution.

[Photo: Francis Galton, British biologist and statistician]
Regression examples: find the relation between
input X and target/output Y
Regression problems are usually used to predict a value, such as prices or temperature. For example, if the actual price of a product is 500 yuan and the value predicted by regression analysis is 499 yuan, we would consider this a fairly good regression.
[Example plots: height (cm) vs. age; life expectancy vs. age]
Regression can be categorised by:
● No. of variables: Univariate y = f(x); Multivariate y = f(x₁, x₂, ⋯, xₙ)
● Function: Linear y = ax + b; Non-linear y = ax² + bx + c
Linear regression: the simplest algorithm
• The simplest linear regression algorithm is univariate linear regression, where the sample data has only one feature/input:
y = ax + b
x    y    y'₁    y'₂
1    1    0.5    4
2    2    1      3
3    3    1.5    2

J₁ = (1/m) Σᵢ₌₁ᵐ (y'₁ − y)² = (1/3) × [(0.5 − 1)² + (1 − 2)² + (1.5 − 3)²] ≈ 1.167
J₂ = (1/m) Σᵢ₌₁ᵐ (y'₂ − y)² = (1/3) × [(4 − 1)² + (3 − 2)² + (2 − 3)²] ≈ 3.667
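These costs are ordinary mean squared errors, so they are easy to verify numerically; a minimal numpy sketch (variable names are ours):

```python
import numpy as np

y  = np.array([1.0, 2.0, 3.0])   # observed targets
y1 = np.array([0.5, 1.0, 1.5])   # predictions of the first candidate line
y2 = np.array([4.0, 3.0, 2.0])   # predictions of the second candidate line

def mse(pred, obs):
    """Mean squared error J = (1/m) * sum((pred - obs)^2)."""
    return np.mean((pred - obs) ** 2)

print(mse(y1, y))   # J1 ≈ 1.167
print(mse(y2, y))   # J2 ≈ 3.667
```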
How to find a and b in a regression problem
1. Conventional analytic approach (least squares)
Time Production
20 195
30 305
50 480
60 580
J = f(p) = 3.5p² − 14p + 14
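For the univariate case, least squares has the closed form a = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² and b = ȳ − a·x̄. A minimal sketch applying it to the Time/Production table above (the resulting coefficients are computed here, not quoted from the slides):

```python
import numpy as np

x = np.array([20.0, 30.0, 50.0, 60.0])      # Time
y = np.array([195.0, 305.0, 480.0, 580.0])  # Production

# Closed-form least squares for y = a*x + b
a = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b = y.mean() - a * x.mean()
print(a, b)   # -> 9.45 and 12.0 for this data
```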
The gradient descent approach

y = ax + b

Minimize  J(a, b) = (1/n) Σ (ŷ − y)²

Step-by-step approach, with learning rate α:

pᵢ₊₁ = pᵢ − α ∂J(pᵢ)/∂pᵢ

a_new = a_old − α ∂J(a, b)/∂a = a_old − α (2/n) Σ (a_old xᵢ + b_old − yᵢ) xᵢ
b_new = b_old − α ∂J(a, b)/∂b = b_old − α (2/n) Σ (a_old xᵢ + b_old − yᵢ)
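A minimal sketch of iterating these two update rules, using the small x/y table from the cost-function example above (the learning rate and iteration count are our own choices):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])

a, b, alpha, n = 0.0, 0.0, 0.05, len(x)
for _ in range(2000):
    residual = a * x + b - y                        # (a_old*x_i + b_old - y_i)
    a -= alpha * (2.0 / n) * np.sum(residual * x)   # update rule for a
    b -= alpha * (2.0 / n) * np.sum(residual)       # update rule for b

print(a, b)   # converges towards a = 1, b = 0 for this data
```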
Gradient descent example

J(p) = 3.5p² − 14p + 14
pᵢ = 0.5, α = 0.01, pᵢ₊₁ = ?

∂J(pᵢ)/∂pᵢ = 7p − 14 = 7 × 0.5 − 14 = −10.5

pᵢ₊₁ = pᵢ − α ∂J(pᵢ)/∂pᵢ = 0.5 − 0.01 × (−10.5) = 0.5 + 0.105 = 0.605

Repeating the update slowly approaches the minimum at p = 2.
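Iterating the same update in code makes the slow approach to p = 2 visible; a minimal sketch:

```python
# Gradient descent on J(p) = 3.5p^2 - 14p + 14, whose minimum is at p = 2
p, alpha = 0.5, 0.01
for i in range(300):
    grad = 7 * p - 14      # dJ/dp
    p -= alpha * grad      # p_{i+1} = p_i - alpha * grad
    if i + 1 in (1, 10, 100, 300):
        print(i + 1, round(p, 4))   # step 1 gives 0.605, then slowly towards 2.0
```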
Illustration of solving linear regression

Cost = J(a, b),  y = ax + b

Time    Production
20      195
30      305
50      480
60      580

With the same example, please select a learning rate and use gradient descent to update a and b of a linear regression model (a sketch with a workable learning rate follows below):

pᵢ₊₁ = pᵢ − α ∂J(pᵢ)/∂pᵢ
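One practical point for this exercise: because Time ranges from 20 to 60, the gradients are large, so the learning rate must be small for the updates to stay stable. A sketch under that assumption (α = 1e-4 is our choice; without feature scaling, b converges slowly):

```python
import numpy as np

x = np.array([20.0, 30.0, 50.0, 60.0])      # Time
y = np.array([195.0, 305.0, 480.0, 580.0])  # Production

a, b, alpha, n = 0.0, 0.0, 1e-4, len(x)
for _ in range(200_000):   # many iterations: the intercept b moves very slowly
    residual = a * x + b - y
    a -= alpha * (2.0 / n) * np.sum(residual * x)
    b -= alpha * (2.0 / n) * np.sum(residual)

print(a, b)   # approaches the least-squares solution a = 9.45, b = 12
```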
Multivariate linear regression
y = a₁x₁ + a₂x₂ + ⋯ + aₙxₙ + b
https://www.youtube.com/watch?v=2spTnAiQg4M&t=159s
Multivariate linear regression
y = a₁x₁ + a₂x₂ + ⋯ + aₙxₙ + b
We can solve it using matrix calculation approach by transforming the equation into Y = XA
X = [x₁₁ ⋯ x₁ₙ 1; ⋮ ⋱ ⋮ ⋮; xₘ₁ ⋯ xₘₙ 1],  A = [a₁, ⋯, aₙ, b]ᵀ,  Y = [y₁, ⋯, yₘ]ᵀ
• X is an m × (n+1) matrix where the last column is all ones (for the intercept b term), the
remaining columns represent the n independent variables, and m is the number of instances
in the dataset.
• A is an (n+1) × 1 matrix (column vector) of the coefficients, with the intercept b as the
last element of this vector.
• Y is an m × 1 matrix (column vector) representing the dependent variable.
A is then solved by the following procedure.
The residuals are given by the difference between the observed values and the predicted values, which
is:
R = Y - XA
The sum of squared residuals is given by the dot product of R with itself, which is:
S = RᵀR = (Y − XA)ᵀ(Y − XA)
Here ᵀ denotes the transpose of a matrix.
To find the matrix A that minimizes S, we take the derivative of S with respect to A, set it to zero, and
solve for A. The derivative of S with respect to A is:
dS/dA = -2Xᵀ (Y - XA)
Setting this to zero and solving for A gives:
0 = -2Xᵀ (Y - XA)
XᵀY = XᵀXA
(XᵀX)A = XᵀY
Assuming that (XᵀX) is invertible, we can multiply both sides by the inverse of (XᵀX) to get:
A = (XᵀX)⁻¹XᵀY
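A minimal numpy sketch of the normal-equation solution (np.linalg.solve is used instead of explicitly inverting XᵀX, which is numerically safer; the data is our own toy example):

```python
import numpy as np

def fit_normal_equation(X_raw, y):
    """Solve (X^T X) A = X^T Y, with a trailing column of ones for the intercept b."""
    X = np.column_stack([X_raw, np.ones(len(X_raw))])
    return np.linalg.solve(X.T @ X, X.T @ y)

# Toy data generated from y = 2*x1 + 3*x2 (so b should come out as 0)
X_raw = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([8.0, 7.0, 18.0, 17.0])

print(fit_normal_equation(X_raw, y))   # ≈ [2.0, 3.0, 0.0] = (a1, a2, b)
```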
A similar strategy can be applied to a second-order
nonlinear regression model

y = ax² + bx + c

Here we can treat x² as an additional input variable. The equation
can then be written in matrix form as:

Y = XA

where:
• A is a column vector of the coefficients given by [a, b, c]ᵀ.
• Each row of X is [x², x, 1] for one sample, so X is an m × 3 matrix.
In class exercise: model the relationship between fuel flow rate
and output power in a gas turbine-driven generator

Fuel flow rate    Power
1.0               20
2.0               45
3.0               55
4.0               75

Please build both the linear regression model (y = ax + b) and the
second-order nonlinear model (y = ax² + bx + c) to fit the provided
data, and compare their performance using the matrix approach:

A = (XᵀX)⁻¹XᵀY
Fuel flow rate    Power    Linear model ŷ    Nonlinear model ŷ
1.0               20       22.5              21.25
2.0               45       40                41.25
4.0               75       75                73.75
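For reference, a minimal numpy sketch that reproduces the fitted values in this table via A = (XᵀX)⁻¹XᵀY (variable names are ours):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])        # fuel flow rate
y = np.array([20.0, 45.0, 55.0, 75.0])    # power

def solve(X):
    """Normal equation: A = (X^T X)^-1 X^T Y, via a linear solve."""
    return np.linalg.solve(X.T @ X, X.T @ y)

X_lin  = np.column_stack([x, np.ones_like(x)])         # rows [x, 1]      for y = a*x + b
X_quad = np.column_stack([x**2, x, np.ones_like(x)])   # rows [x^2, x, 1] for y = a*x^2 + b*x + c

A_lin, A_quad = solve(X_lin), solve(X_quad)
print(A_lin)             # -> [17.5, 5.0]              (a, b)
print(A_quad)            # -> [-1.25, 23.75, -1.25]    (a, b, c)
print(X_lin @ A_lin)     # -> [22.5, 40.0, 57.5, 75.0]
print(X_quad @ A_quad)   # -> [21.25, 41.25, 58.75, 73.75]
```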