AI14 - MachineLearning

The document discusses machine learning concepts including linear regression. It introduces linear regression as a supervised learning technique to find the best fitting linear relationship between two variables. It describes using scikit-learn to implement linear regression on a dataset relating student heights and weights to predict unknown weights. Code examples are provided to generate a linear regression model and use it to predict weights for new heights.

Uploaded by

NGUYỄN LÊ BẢO DUY

Chapter 14. Basic Concepts of Machine Learning
▪ Table of contents
• 14.1 Introduction
• 14.2 Linear regression
• 14.3 k-NN algorithm
• 14.4 Overfitting and underfitting
14.1 Introduction

Big Data Analytics & Artificial Intelligence Starting with Python
Introduction
• Rule-based method: a person writes a program that tells the computer exactly what to do.
• Machine learning method: the computer learns from data on its own in order to solve a problem.
For example, given only the rules of Go and the records of previous games, a system such as AlphaGo can learn the principles of Go on its own and play the game.

[Figure: Go has many rules; machine learning was applied to build AlphaGo, the computer program that beat Lee Sedol.]
Introduction
• Machine learning is a field of artificial intelligence that studies how to give computers the ability to learn.
• Machine learning evolved from pattern recognition and computational learning theory. It is concerned with making computers learn to make decisions by looking at given data.
• The performance of a learning algorithm generally improves as more training data becomes available.
• Unlike algorithms composed of commands that always perform predetermined operations, learning algorithms use data to make predictions and decisions.
• Machine learning is mainly used for problems where it is difficult to specify an explicit solution method. For example, it is actively used in spam filtering, automatic detection of network intruders, computer vision, and autonomous driving.
Introduction
• Different types of machine learning techniques: machine learning is generally divided into supervised learning and unsupervised learning, depending on whether there is a "teacher" who provides the correct answers. The diagram also lists reinforcement learning as a third type.

Machine Learning
– Supervised learning: regression analysis, random forest, decision tree, classification
– Unsupervised learning: clustering, dimensionality reduction
– Reinforcement learning
Introduction
• Supervised learning: the computer is given examples together with the correct answers (labels) supplied by a "teacher". The goal of supervised learning is to learn a general rule that maps inputs to outputs.

[Figure: an image labeled "cat"; learning uses both the data and its labels.]

Introduction
• Unsupervised learning: clustering is a representative example. The data in the figure can be divided into two large groups, and the computer learns the rule for this grouping by itself, just by looking at the data.
Introduction
• Reinforcement learning: the learning data is given in the form of rewards and punishments. Only feedback on the program's behavior is provided, in a dynamic environment such as driving a vehicle or playing against an opponent.
14.2 Linear regression

Linear regression
Supervised learning provides both the problems and the correct answers, and learns from them
• After learning from given input-output pairs, supervised learning predicts a reasonable output value when a new input value comes in.
• In other words, given inputs (x) and outputs (y), supervised learning learns a mapping function f(x) from input to output.
Linear regression
• Suppose we are given the points (1, 10), (2, 20), (3, 30), and (4, 40) as input data in the form (x, y). The computer does not yet know that these data follow the equation 𝑦 = 10𝑥. We want the computer to learn from the four data points and then answer 50 when x = 5 is given as input.
• In supervised learning, the computer itself finds the function that best explains the given input-output data. Within supervised learning, this kind of problem is called regression analysis.
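The four-point example can be tried directly. The following is a minimal sketch, assuming scikit-learn's LinearRegression is used as the learner; the variable names are illustrative and not taken from the slides.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# The four (x, y) points from the text.
# scikit-learn expects X as a 2-D array of shape (n_samples, n_features).
X = np.array([[1], [2], [3], [4]])
y = np.array([10, 20, 30, 40])

model = LinearRegression()
model.fit(X, y)

# Ask the fitted model for the output at the unseen input x = 5.
prediction = model.predict(np.array([[5]]))
print(round(float(prediction[0]), 2))  # 50.0
```

Because the four points lie exactly on 𝑦 = 10𝑥, the fitted slope is 10 and the intercept is 0 up to floating-point error.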
Linear regression
Find a function that describes the data well: a regression problem

[Figure: three panels titled "Linear regression", "Nonlinear regression", and "Classification".]

• Regression is the problem of finding the straight line or curve that best describes the data, usually after plotting the data in a multidimensional space.
• In other words, a regression technique estimates the function 𝑓(𝑥) in 𝑦 = 𝑓(𝑥) from observed inputs 𝑥 and outputs 𝑦.
Linear regression

• Scikit-learn
– A library for machine learning.
– It includes classification, regression, and clustering algorithms such as linear regression, k-NN, support vector machines, random forests, gradient boosting, and k-means, and it is popular as a good tool for people who are new to machine learning.

• How to install Scikit-learn in Anaconda?

pip install numpy==1.19.2 scipy==1.9.3 scikit-learn
Linear regression
The Simplest Regression: Linear Regression
• Linear regression is a technique for modeling the relationship between a variable 𝑥 and another variable 𝑦 that depends on it, as the straight line 𝑦 = 𝑚𝑥 + 𝑏 (𝑥 is the feature of the data, 𝑚 is the slope, and 𝑏 is the intercept).
• Here the data has two variables (𝑥 and 𝑦), so the model is a line; in three or more dimensions it is a plane or hyperplane.
• When there are more than two variables (that is, more than one input feature), it is called multiple linear regression.
Linear regression
Let's implement linear regression with the Scikit-Learn library
• Suppose we are examining the heights and weights of students in a classroom.
• Assume that, in general, taller students weigh more.
• Height and weight were measured for several students.
• Student A is absent from school. This student's height is known exactly, but the weight is unknown.
• Can we estimate student A's weight accurately?
• If we could write a formula that quantifies the relationship between height and weight, we could estimate student A's weight.
Linear regression
Let's implement linear regression with the Scikit-Learn library
• Four students were selected at random and measured. Their heights were 164, 179, 162, and 170 cm, and their weights were 53, 63, 55, and 59 kg, respectively.
• Goal: find the linear equation that best describes this distribution.
• The input values must be arranged as a 2D array.

[Figure: scatter plot of weight against height with the fitted linear regression line.]
Linear regression
Let's implement linear regression with the Scikit-Learn library
Caution: the input values are the students' heights: 164, 179, 162, and 170.
-> The input to linear regression must be a two-dimensional array (one row per sample, one column per feature), even when there is only one feature.
• The target values are the weights.
• LinearRegression() is the function that creates a linear regression model. In other words, it is a model generator.
• fit() takes the input X and adjusts the model so that it best predicts the target values y.
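The slide's code was an image and did not survive extraction. The following is a minimal sketch of the step it describes, assuming the model object is called regr (the name used later in the slides); the other variable names are illustrative.

```python
import numpy as np
from sklearn import linear_model

heights = np.array([164, 179, 162, 170])  # cm, input values
weights = np.array([53, 63, 55, 59])      # kg, target values

# reshape(-1, 1) turns the 1-D array of shape (4,) into a 2-D array of
# shape (4, 1): four samples with one feature each.
X = heights.reshape(-1, 1)
y = weights

regr = linear_model.LinearRegression()  # create the model
regr.fit(X, y)                          # learn slope and intercept from the data

print(X.shape)  # (4, 1)
```

Passing the 1-D heights array directly to fit() raises an error in scikit-learn, which is why the reshape is needed.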
Linear regression
Check and predict the results of linear regression learning
• To check the slope and intercept of the fitted line, look at the attributes coef_ and intercept_.
• To check how well the fitted model predicts y from the input X, call the score() function, which returns the coefficient of determination R².
• How well the model predicts the data: about 0.90 (roughly 90 points out of 100).
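A short sketch of inspecting the fitted line for the four-student data follows. It assumes the same regr model as in the slides, and the rounding is only for display.

```python
import numpy as np
from sklearn import linear_model

X = np.array([[164], [179], [162], [170]])  # heights in cm
y = np.array([53, 63, 55, 59])              # weights in kg

regr = linear_model.LinearRegression()
regr.fit(X, y)

slope = float(regr.coef_[0])        # one coefficient per input feature
intercept = float(regr.intercept_)  # value of the line at height 0
r2 = float(regr.score(X, y))        # coefficient of determination R^2

print(round(slope, 3))      # 0.552
print(round(intercept, 2))  # -35.69
print(round(r2, 3))         # 0.903
```

The R² of about 0.903 is the "about 90 points" quoted on the slide. It is computed on the same four points used for fitting, so it says nothing yet about new data.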
Linear regression
Check and predict the results of linear regression learning
Now let's find out what weights the linear regression model regr predicts for students whose heights are 180 and 185 cm.
To do this, prepare the input data.

180 -> regr.predict() -> 63.71
185 -> regr.predict() -> 66.47

Use the predict() function of the linear regression model regr.
Pass the students' heights to this function.
The function returns the students' weights as estimated by the model regr.
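The prediction step can be reproduced as follows. This is a sketch assuming the model was fitted on the four students given earlier; new_heights is an illustrative name.

```python
import numpy as np
from sklearn import linear_model

X = np.array([[164], [179], [162], [170]])  # heights in cm
y = np.array([53, 63, 55, 59])              # weights in kg
regr = linear_model.LinearRegression().fit(X, y)

# New inputs must also be 2-D: one row per student, one column per feature.
new_heights = np.array([[180], [185]])
predicted_weights = regr.predict(new_heights)

print(np.round(predicted_weights, 2))  # [63.71 66.47]
```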
Linear regression

Use the Matplotlib library to graph this linear regression
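The plotting code on the slide was an image. One possible way to draw the data and the fitted line is sketched below. The colors, axis range, and output file name are arbitrary choices, and the non-interactive "Agg" backend is selected only so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render to a file instead of a window
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[164], [179], [162], [170]])  # heights in cm
y = np.array([53, 63, 55, 59])              # weights in kg
regr = LinearRegression().fit(X, y)

fig, ax = plt.subplots()
ax.scatter(X[:, 0], y, color="black", label="measured students")

# Two x values are enough to draw a straight line.
line_x = np.array([[160], [185]])
ax.plot(line_x[:, 0], regr.predict(line_x), color="blue", label="fitted line")

ax.set_xlabel("Height (cm)")
ax.set_ylabel("Weight (kg)")
ax.legend()
fig.savefig("height_weight_regression.png")  # hypothetical file name
```

In an interactive session, the backend line can be dropped and plt.show() used instead of savefig().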

Linear regression
Question: Use a linear regression model to predict the weights of students with the features [166, 0] and [170, 1].

Student   1    2    3    4    5    6    7    8
Height    164  167  165  170  179  163  159  166
Sex       1    1    0    0    0    1    0    1
Weight    43   48   47   66   67   50   52   44

• Sex is encoded as man = 0 and woman = 1.
• Even when the heights are similar, the weights of men and women will differ, so sex is added as a second input feature of the linear regression.
• For example, the feature vector of student 4 is [170, 0].
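A sketch of how the question could be approached with two input features (height and sex) follows. The data is copied from the table; the variable names are illustrative, and the slide's own solution is not available.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Each row is [height in cm, sex], with man = 0 and woman = 1.
X = np.array([
    [164, 1], [167, 1], [165, 0], [170, 0],
    [179, 0], [163, 1], [159, 0], [166, 1],
])
y = np.array([43, 48, 47, 66, 67, 50, 52, 44])  # weights in kg

regr = LinearRegression().fit(X, y)

# The two students from the question.
new_students = np.array([[166, 0], [170, 1]])
predicted = regr.predict(new_students)

print(regr.coef_)  # one coefficient for height, one for sex
print(predicted)   # estimated weights for the two new students
```

With two features the model is a plane rather than a line, which is the multiple linear regression case mentioned earlier.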
Linear regression
Diabetes example:
• The sklearn library includes a dataset of diabetes patients.
• This dataset has more samples and more features than the examples above.
Linear regression
Diabetes example: diabetes dataset
The dataset includes data, which is used as the input; target, which is used as the learning target; and feature_names, which stores the names of the input features.
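The dataset's structure can be checked with a few lines. This sketch uses scikit-learn's load_diabetes() loader, which the slides name later.

```python
from sklearn import datasets

diabetes = datasets.load_diabetes()

print(diabetes.data.shape)     # (442, 10): 442 patients, 10 features
print(diabetes.target.shape)   # (442,): one target value per patient
print(diabetes.feature_names)  # names of the 10 input features
```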
Linear regression
Diabetes example: diabetes dataset
• Of the 10 features, extract only the third one, which corresponds to the body mass index (bmi).
• We will use this data as the input of the regr model introduced earlier.
• The data used as the input of the model must be a two-dimensional array, so increase the dimension of the array using np.newaxis.
• The resulting data X can now be used as the input of a linear regression model.
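The extraction step might look like the following sketch. Index 2 is the bmi column because it is the third entry of feature_names; bmi_1d is an illustrative name.

```python
import numpy as np
from sklearn import datasets

diabetes = datasets.load_diabetes()

# Plain indexing gives a 1-D array, which fit() would reject.
bmi_1d = diabetes.data[:, 2]
print(bmi_1d.shape)  # (442,)

# np.newaxis inserts a new axis, giving one column per feature.
X = diabetes.data[:, np.newaxis, 2]
print(X.shape)  # (442, 1)
```

An equivalent alternative is bmi_1d.reshape(-1, 1).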
Linear regression
Diabetes example: What is the relationship between the body mass index and the diabetes level?
• Fit a linear regression model to the bmi data.
Linear regression
Diabetes example: Separate the diabetes data into training and test data
Let's see how accurate the model is on the 20% of the data held out as test data (new data).

[Figure: workflow]
1) load_diabetes() loads the diabetes dataset.
2) train_test_split() divides it into training data (X_train, y_train) and test data (X_test, y_test).
3) regr.fit() trains the linear regression model on the training data.
4) regr.predict() applies the trained model to the test data for the final performance evaluation (model accuracy).
Linear regression
Diabetes example: Separate the diabetes data into training and test data
• Only 80% of the 442 samples are used for learning (training).
• The remaining 20% are used for testing.
• The linear regression model is trained on the training data X_train and y_train (using only the bmi feature).
• The scores on the training data and the test data are about 0.35 and 0.31 (35 and 31 points), respectively.
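The split-and-evaluate workflow can be sketched as follows. The random_state value is an assumption added here for reproducibility. The exact scores depend on how the data happens to be split, so they will not necessarily match the 35 and 31 points quoted on the slide.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

diabetes = datasets.load_diabetes()
X = diabetes.data[:, np.newaxis, 2]  # bmi only, as a 2-D array
y = diabetes.target

# Hold out 20% of the samples as test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0  # random_state is an arbitrary choice
)

regr = LinearRegression()
regr.fit(X_train, y_train)

train_score = regr.score(X_train, y_train)  # R^2 on data the model has seen
test_score = regr.score(X_test, y_test)     # R^2 on data the model has not seen
print(len(X_train), len(X_test))
print(train_score, test_score)
```

A test score close to the training score suggests the model generalizes about as well as it fits. Both scores being low suggests that bmi alone explains only part of the variation.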
Linear regression
Diabetes example: Use all the features in the dataset for linear regression

[Figure: scatter plot of predicted values against actual values.]
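Using all ten features only changes the input matrix. The sketch below assumes the same 80/20 split as before, with an arbitrary random_state, and prints a few predicted values next to the actual ones instead of plotting them.

```python
from sklearn import datasets
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

diabetes = datasets.load_diabetes()

# All 10 features are already a 2-D array, so no reshaping is needed.
X_train, X_test, y_train, y_test = train_test_split(
    diabetes.data, diabetes.target, test_size=0.2, random_state=0
)

regr = LinearRegression()
regr.fit(X_train, y_train)
y_pred = regr.predict(X_test)

# Compare the first few predictions with the actual values.
for actual, predicted in list(zip(y_test, y_pred))[:5]:
    print(f"actual={actual:.0f}  predicted={predicted:.1f}")
```

If the predictions were perfect, a scatter plot of y_pred against y_test would lie on the diagonal.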
Linear regression
Diabetes example: Mean squared error (MSE)
There are various ways to calculate the error between y_pred and y_test. One of them is the mean squared error (MSE):

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2$$

where $N$ is the number of elements, $y_i$ is the i-th y_test value, and $\hat{y}_i$ is the y_pred value estimated by the linear regression model for $y_i$.
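The formula can be checked on a tiny example. The numbers below are made up purely for illustration and are not from the diabetes dataset. The sketch computes the MSE by hand and with scikit-learn's mean_squared_error.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Hypothetical values, chosen only to make the arithmetic easy to follow.
y_test = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])

# Errors are 0.5, 0.0, -1.0; squared errors are 0.25, 0.0, 1.0.
mse_manual = float(np.mean((y_test - y_pred) ** 2))
mse_sklearn = float(mean_squared_error(y_test, y_pred))

print(round(mse_manual, 4))   # 0.4167
print(round(mse_sklearn, 4))  # 0.4167
```

Because the errors are squared, large errors are penalized more heavily than small ones, and the result is in squared units of the target.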
Linear regression

Diabetes examples: Python program

Linear regression

Diabetes examples: Python program: Results

14.3 k-NN algorithm

k-NN algorithm
The problem of classifying Dachshund and Samoyed dogs
Assume a simple case where the feature space of the data consists of two features, and each item is plotted in this feature space.

[Figure: Samoyed and Dachshund dogs plotted by length (horizontal axis) and height (vertical axis).]

• The Samoyed has a large height relative to its length, and the Dachshund has a small height relative to its length.
k-NN algorithm
• If the new item is classified by looking at its 3 nearest neighbors, it belongs to class B; if it is classified by looking at its 5 nearest neighbors, it belongs to class A.
• When k = 3: 2 class B, 1 class A
• When k = 5: 2 class B, 3 class A
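The effect of k can be reproduced with a small constructed example. The coordinates below are hypothetical, chosen so that the three nearest neighbors of the query point are two B and one A, and the five nearest are two B and three A, as in the slide.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical 2-D points; distances from the origin are noted on the right.
X = np.array([
    [1.0, 0.0],   # B, distance 1.0
    [0.0, 1.2],   # B, distance 1.2
    [1.5, 0.0],   # A, distance 1.5
    [0.0, 2.0],   # A, distance 2.0
    [-2.2, 0.0],  # A, distance 2.2
    [5.0, 5.0],   # B, far away
    [6.0, 5.0],   # B, far away
])
y = np.array(["B", "B", "A", "A", "A", "B", "B"])

query = np.array([[0.0, 0.0]])  # the new item to classify

predictions = {}
for k in (3, 5):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    predictions[k] = knn.predict(query)[0]
    print(k, predictions[k])
# 3 B
# 5 A
```

The same point gets a different label depending on k, which is why k is treated as a parameter to be tuned.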
k-NN algorithm
Get ready to classify the beautiful irises
• Features: sepal length, sepal width, petal length, petal width
• You can also see that the labels are encoded as 0, 1, 2:
Setosa: 0
Versicolor: 1
Virginica: 2
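The features and the label encoding can be inspected directly. This sketch assumes the iris data is loaded with scikit-learn's load_iris(); the slides do not show which loader was used.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()

print(iris.data.shape)         # (150, 4): 150 flowers, 4 measurements each
print(iris.feature_names)      # sepal/petal length and width, in cm
print(iris.target_names)       # ['setosa' 'versicolor' 'virginica']
print(np.unique(iris.target))  # [0 1 2]
```

The integer label is the index into target_names, so 0 is setosa, 1 is versicolor, and 2 is virginica.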
k-NN algorithm
Apply the k-NN algorithm
• Use 80% of the total data as training data to train the model.
• Use the remaining 20% as test data to check that the model predicts well.
• Training and testing use KNeighborsClassifier.
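A sketch of the training and testing step follows. The values n_neighbors=5 and random_state=0 are assumptions made here; the slide does not state which values were used.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# 80% of the flowers for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0
)

knn = KNeighborsClassifier(n_neighbors=5)  # k = 5 is an arbitrary choice
knn.fit(X_train, y_train)

# For classifiers, score() returns the fraction of correct predictions.
accuracy = knn.score(X_test, y_test)
print(len(X_train), len(X_test))  # 120 30
print(accuracy)
```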
k-NN algorithm

Let's classify new flowers by applying the model
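Classifying a new flower only requires its four measurements. In this sketch the measurements are hypothetical, and the model is trained on the whole dataset with k = 5 for brevity; both are assumptions.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
knn = KNeighborsClassifier(n_neighbors=5).fit(iris.data, iris.target)

# Hypothetical new flower: sepal length, sepal width, petal length, petal width (cm).
new_flower = np.array([[5.0, 3.4, 1.5, 0.2]])

label = int(knn.predict(new_flower)[0])  # integer class label
name = iris.target_names[label]          # map the label back to a species name
print(label, name)  # 0 setosa
```

The short petals place this flower well inside the setosa cluster, which is clearly separated from the other two species.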

k-NN algorithm
Let's find out the accuracy of the classifier
14.4 Overfitting and underfitting

Overfitting and underfitting
• Underfitting: performance is poor on both the training data and new data.
• Overfitting: performance is excellent on the training data, but performance on new data is poor.

Underfitting and overfitting
1) What are the reasons for underfitting and overfitting?
2) What techniques reduce underfitting and overfitting?

[Figure: three fits labeled "Underfit", "Good fit", and "Overfit".]
(https://www.geeksforgeeks.org/underfitting-and-overfitting-in-machine-learning)
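One common way to see both effects is to fit polynomials of increasing degree to noisy data and compare the error on training data with the error on new data. The sketch below is an illustration, not part of the original slides. The sine-shaped data, the noise level, the degrees 1, 3, and 9, and the random seed are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def make_data(n):
    """Noisy samples of a sine curve on the interval [0, 1]."""
    x = np.sort(rng.uniform(0.0, 1.0, n))
    y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

x_train, y_train = make_data(15)   # small training set
x_test, y_test = make_data(200)    # "new" data from the same process

results = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)  # least-squares polynomial fit
    train_error = mse(y_train, np.polyval(coeffs, x_train))
    test_error = mse(y_test, np.polyval(coeffs, x_test))
    results[degree] = (train_error, test_error)
    print(f"degree={degree}  train MSE={train_error:.4f}  test MSE={test_error:.4f}")
```

The training error can only go down as the degree increases. A straight line (degree 1) is typically too simple for this curve and has high error on both sets, which is underfitting. A very flexible polynomial can follow the noise in the training points, and when its test error is much larger than its training error, that gap is the sign of overfitting.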
Thank you
