AI14 - MachineLearning

The document discusses machine learning concepts including linear regression. It introduces linear regression as a supervised learning technique to find the best fitting linear relationship between two variables. It describes using scikit-learn to implement linear regression on a dataset relating student heights and weights to predict unknown weights. Code examples are provided to generate a linear regression model and use it to predict weights for new heights.

Chapter 14. Basic Concepts of Machine Learning
▪ Table of contents
• 14.1 Introduction
• 14.2 Linear regression
• 14.3 k-NN algorithm
• 14.4 Overfitting and underfitting
14.1 Introduction

Big Data Analytics &


Artificial Intelligence
Starting with Python
Introduction
• Rule-based method: A person writes a program that tells the computer exactly what
to do.
• Machine learning method: The computer learns on its own from data to solve
problems.
For example, as with “AlphaGo,” if you give the computer only the rules of Go and the
records of previous games, it can learn the principles of Go on its own and
play Go.

Figure: Go has many game rules -> apply machine learning -> the computer program that beat Lee Sedol


Introduction
• Machine learning is a field of artificial intelligence, a research field
for giving learning capabilities to computers.
• Machine learning, which evolved from pattern recognition and
computational learning theories, involves making computers learn how
to make decisions by looking at given data.
• The performance of decision-making algorithms improves as more
data becomes available for training.
• Unlike algorithms composed of commands that always perform
predetermined operations, learning algorithms use data to make
predictions and decisions.
• Machine learning is mainly used for problems where it is difficult to
specify the solution method explicitly for the computer. For example,
it is actively used in fields such as spam mail filtering, automatic
detection of network intruders, computer vision, and autonomous driving.
Introduction

• Different types of machine learning techniques: Machine learning is
generally divided into supervised learning and unsupervised learning,
depending on whether there is a “teacher” who provides the correct answers.
Reinforcement learning is usually listed as a third type.
Machine Learning
• Supervised learning
– Regression analysis
– Random forest
– Decision tree
– Classification
• Unsupervised learning
– Clustering
– Dimensionality reduction
• Reinforcement learning
Introduction

• Supervised learning: The computer is given examples together with the correct
answers (or labels) provided by the "teacher". The goal of supervised learning
is to learn general rules for mapping inputs to outputs.

Figure: data labeled "cat" (labeling), followed by learning using data and labels


Introduction

• Unsupervised learning: A representative example is clustering. The data is
given without labels, and the computer finds by itself, just by looking at the
data, the rule that divides it into groups (here, two large groups).
Introduction

• Reinforcement learning: Learning data is given in the form of rewards and
punishments. Only feedback on the behavior of the program is provided, in a
dynamic environment such as driving a vehicle or playing against an opponent.
14.2 Linear regression

Linear regression

Supervised learning provides problems together with their correct answers so that
the computer can learn
• After learning the given input-output pairs, supervised learning predicts a
reasonable output value when a new input value comes in.
• In other words, supervised learning can be said to learn a mapping function f(x)
from input to output when input (x) and output (y) are given.
Linear regression
Supervised learning provides problems together with their correct answers so that
the computer can learn
• Suppose we are given points (1, 10), (2, 20), (3, 30), and (4, 40) as input data in
the form of (x, y). The computer does not yet know that the y value for each x
value can be expressed by the equation 𝑦 = 10𝑥. We want the computer to learn
from the 4 given data points and, after learning is finished, answer 50 when
x = 5 is input.
• Supervised learning is when a computer finds by itself the function that best
explains the data, based on the given input and output values. Among supervised
learning problems, this one is called regression analysis.
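A minimal sketch of this idea, assuming an ordinary least-squares fit of a straight line (the helper name fit_line is made up for illustration and is not from the slides). It learns from the four points and then answers for x = 5:

```python
def fit_line(points):
    """Least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


points = [(1, 10), (2, 20), (3, 30), (4, 40)]
slope, intercept = fit_line(points)

# Ask the learned function about an input it has never seen.
prediction = slope * 5 + intercept
print(slope, intercept, prediction)  # 10.0 0.0 50.0
```

The computer was never told 𝑦 = 10𝑥; the slope and intercept come only from the data.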
Linear regression

Find a function that describes the data well: a regression problem


Figure panels: Linear regression, Nonlinear regression, Classification

• Regression is the problem of finding the straight line or curve that best describes
the data, usually after plotting the data in a multidimensional space.

• In other words, estimating the function 𝑓(𝑥) in 𝑦 = 𝑓(𝑥) from the observed
inputs 𝑥 and outputs 𝑦 is called a regression technique.
Linear regression

• Scikit-learn
– A library for machine learning
– It includes classification, regression, and clustering algorithms such as linear
regression, the k-NN algorithm, support vector machines, random forests, gradient
boosting, and k-means, so it is gaining popularity as a good tool for those
new to machine learning.

• How to install Scikit-learn in Anaconda?


pip install numpy==1.19.2 scipy==1.9.3 scikit-learn
Linear regression

The Simplest Regression: Linear Regression

• Linear regression is a technique for modeling the correlation between a
variable 𝑥 and another variable 𝑦 that depends on it, as 𝑦 = 𝑚𝑥 + 𝑏
(𝑥 is the feature of the data, 𝑚 is the slope, 𝑏 is the intercept).
• Here the model has two variables, so it is a straight line; in three
dimensions or more, it is a hyperplane.
• When there are more than two variables, it is called multiple linear
regression.
Linear regression
Let's implement linear regression with the Scikit-Learn library
• Suppose we are examining the height and weight of students in a classroom.

• Suppose, in general, that taller students weigh more.


• Height and weight were measured for several students.
• Student A, whose height is accurately known but whose weight is unknown, is
absent from school.
• Can you accurately estimate this student A's weight?
• If we could create a formula to quantify the correlation between height and weight,
we would be able to estimate the weight of student A, whose weight is unknown.

Linear regression
Let's implement linear regression with the Scikit-Learn library
• Four students were randomly selected and their height and weight were measured:
the heights were 164, 179, 162, and 170 cm, and the weights were 53, 63, 55, and
59 kg, respectively.
• The input values must be arranged as a 2D array.
• Goal: find the linear equation that best describes this distribution of height
and weight (linear regression).
Linear regression
Let's implement linear regression with the Scikit-Learn library
Caution: The input values are the students' heights: 164, 179, 162, and 170.
-> The input of linear regression must be a multi-dimensional (2D) array.
• Target values: the students' weights.
• A function creates the linear regression model. In other words, it is a model
generator.
• The model is then fitted so that the input X best predicts the target value y.
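The slide's code is not preserved in this text, so the following is only a sketch of what these notes describe, assuming scikit-learn's LinearRegression class as the model generator. The variable names X, y, and regr are chosen to match the names used on the later slides:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Heights (cm) reshaped into a 2D array: one row per student, one column per feature.
X = np.array([164, 179, 162, 170]).reshape(-1, 1)

# Target values: the weights (kg) of the same four students.
y = np.array([53, 63, 55, 59])

regr = LinearRegression()  # create the (still untrained) model
regr.fit(X, y)             # learn the line that best maps X to y

print(X.shape)  # (4, 1)
```

Passing a plain 1D list of heights to fit() raises an error, which is why the reshape step is needed.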
Linear regression
Check and predict the results of linear regression learning

If you want to check the slope and intercept of the fitted straight line,
check the attributes coef_ and intercept_. To confirm how well these values
predict y for input X, use the score() function.

How well the model predicts the data: a score of about 0.90 (about 90 points)
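A short sketch of this check, assuming the same four students and scikit-learn's LinearRegression (the original slide code is not preserved here). For a regression model, score() returns the coefficient of determination R², where 1.0 is a perfect fit:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[164], [179], [162], [170]])  # heights (cm)
y = np.array([53, 63, 55, 59])              # weights (kg)
regr = LinearRegression().fit(X, y)

slope = regr.coef_[0]        # one coefficient per input feature
intercept = regr.intercept_
r2 = regr.score(X, y)        # R^2 on the data the model was trained on

print(round(slope, 3), round(intercept, 2))  # about 0.552 and -35.69
print(round(r2, 2))                          # about 0.9
```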
Linear regression
Check and predict the results of linear regression learning

Now, for students with a height of 180 or 185, I would like to find out
how the linear regression model regr we created predicts weight.
To do this, prepare the input data.

regr.predict() with the inputs 180 and 185 returns about 63.71 and 66.47.

Use the predict() function of the linear regression model regr.
Pass the students' heights to this function.
It returns the students' weights as estimated by the regr model.
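A sketch of the prediction step, again assuming scikit-learn's LinearRegression fitted on the four measured students. The new heights must also be passed as a 2D array:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[164], [179], [162], [170]])  # heights (cm)
y = np.array([53, 63, 55, 59])              # weights (kg)
regr = LinearRegression().fit(X, y)

# Input data for the two new students: one row per student.
new_heights = np.array([[180], [185]])
predicted_weights = regr.predict(new_heights)

print(np.round(predicted_weights, 2))  # about 63.71 and 66.47
```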
Linear regression

Use the Matplotlib library to graph this linear regression
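One possible way to draw this graph, as a sketch only: the colors, the plotted height range of 155 to 185 cm, and the output file name linear_regression.png are assumptions, not taken from the slides.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

X = np.array([[164], [179], [162], [170]])  # heights (cm)
y = np.array([53, 63, 55, 59])              # weights (kg)
regr = LinearRegression().fit(X, y)

fig, ax = plt.subplots()
# Measured students as points.
ax.scatter(X.ravel(), y, color="black", label="measured students")
# The fitted straight line, drawn between two heights.
line_x = np.array([[155], [185]])
ax.plot(line_x.ravel(), regr.predict(line_x), color="blue", label="fitted line")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("Weight (kg)")
ax.legend()
fig.savefig("linear_regression.png")
```

In an interactive session, plt.show() can be used instead of saving the figure to a file.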


Linear regression
Question: Use a linear regression model to predict the weights of the students with feature values [166, 0] and [170, 1].

          Student 1  Student 2  Student 3  Student 4  Student 5  Student 6  Student 7  Student 8
Height    164        167        165        170        179        163        159        166
Sex       1          1          0          0          0          1          0          1
Weight    43         48         47         66         67         50         52         44

• Sex is encoded as man (0), woman (1).
• Even if the height is similar, the weights of men and women will be different.
• Sex is therefore added, next to height, as a feature for the input of linear
regression.
• The feature value of student 4 is [170, 0].
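A sketch of one way to answer the question, assuming scikit-learn's LinearRegression with two input features per student, [height, sex]. The predicted numbers are not given on the slide, so none are quoted here:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# One row per student: [height (cm), sex] with man = 0, woman = 1.
X = np.array([
    [164, 1], [167, 1], [165, 0], [170, 0],
    [179, 0], [163, 1], [159, 0], [166, 1],
])
y = np.array([43, 48, 47, 66, 67, 50, 52, 44])  # weights (kg)

regr = LinearRegression().fit(X, y)

# The two students from the question.
questions = np.array([[166, 0], [170, 1]])
predicted = regr.predict(questions)

print("coefficients [height, sex]:", regr.coef_)
print("predicted weights:", predicted)
```

The second coefficient shows how much the predicted weight changes between man (0) and woman (1) at the same height.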
Linear regression
Diabetes examples:

• The sklearn library includes a diabetes dataset.
• This dataset has more samples and more features than the examples above.
Linear regression
Diabetes examples: diabetes dataset

This dataset includes data, used as the input; target, used as the correct answer
for learning; and feature_names, which stores the names of the input features.
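A quick look at these three attributes, as a sketch using scikit-learn's load_diabetes() loader:

```python
from sklearn.datasets import load_diabetes

diabetes = load_diabetes()

# Input data: 442 patients, 10 features each.
print(diabetes.data.shape)     # (442, 10)
# Target: one diabetes progression value per patient.
print(diabetes.target.shape)   # (442,)
# Names of the 10 input features; bmi is the third one.
print(diabetes.feature_names)
```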
Linear regression

Diabetes examples: diabetes dataset

• Extract only the third of the 10 features, which corresponds to the body mass index (bmi),
and use it as the input of the regr model we used earlier.

• The data used as the input of the function must be a two-dimensional array,
so increase the dimension of the array using np.newaxis.

• Now this data X can be used as the input of a linear regression model.
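A sketch of the extraction step. The intermediate name bmi is an assumption used for illustration; index 2 is the third feature column:

```python
import numpy as np
from sklearn.datasets import load_diabetes

diabetes = load_diabetes()

# Third feature column (index 2) is bmi; this is a 1D array.
bmi = diabetes.data[:, 2]
print(bmi.shape)  # (442,)

# np.newaxis adds a second dimension so each sample becomes a row.
X = bmi[:, np.newaxis]
print(X.shape)    # (442, 1)
```

The same result can be written in one step as diabetes.data[:, np.newaxis, 2].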
Linear regression

Diabetes examples: What is the correlation between the body mass index and the
diabetes level?

• Linear regression learning


Linear regression
Diabetes examples: Separate the diabetes example into training and test data

Let's see how accurate this model is, using 20% of the data as test data (new data).

• diabetes dataset: load_diabetes()
• train_test_split() divides it into learning data (X_train, y_train) and test data
(X_test, y_test)
• Linear regression learning: regr.fit() on the learning data
• Final performance evaluation: regr.predict() on the test data gives the model
accuracy
Linear regression
Diabetes examples:
Separate the diabetes example into training and test data
• Only 80% of the total 442 samples are used for learning (or training).
• The remaining 20% are used for testing.
• The linear regression model is trained on the learning data X_train and y_train
(using only the bmi data).
• The training data and the test data score about 35 and 31 points, respectively.
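A sketch of this split and evaluation, assuming scikit-learn's train_test_split. The random_state=0 value is an assumption made for reproducibility; the slide's split is unknown, so the scores printed here can differ from the 35 and 31 points quoted above:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

diabetes = load_diabetes()
X = diabetes.data[:, np.newaxis, 2]  # bmi only, as a 2D array
y = diabetes.target

# 80% for training, 20% held back as new data for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0  # random_state is an assumed value
)

regr = LinearRegression().fit(X_train, y_train)

train_score = regr.score(X_train, y_train)
test_score = regr.score(X_test, y_test)
print(len(X_train), len(X_test))
print(train_score, test_score)
```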
Linear regression

Diabetes examples: Use all the features in the dataset for linear regression
Figure: scatter plot comparing the predicted value with the actual value
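A sketch of the same procedure with all 10 features as input. The split parameters, including random_state=0, are assumptions; the slide's own code is not preserved:

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

diabetes = load_diabetes()

# All 10 features are already a 2D array, so no np.newaxis is needed.
X_train, X_test, y_train, y_test = train_test_split(
    diabetes.data, diabetes.target, test_size=0.2, random_state=0
)

regr = LinearRegression().fit(X_train, y_train)
y_pred = regr.predict(X_test)

# Compare a few actual values with the predicted values.
for actual, predicted in zip(y_test[:5], y_pred[:5]):
    print(f"actual {actual:6.1f}   predicted {predicted:6.1f}")
```

Plotting y_test against y_pred gives the kind of scatter plot described on this slide; points close to the diagonal are good predictions.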
Linear regression

Diabetes examples: Mean squared error (MSE):

There are various methods of calculating the error between y_pred and y_test.
One of them is the mean squared error (MSE):

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2$$

where $N$ is the number of elements, $y_i$ is the i-th y_test value, and $\hat{y}_i$ is the
y_pred value estimated by the linear regression model, corresponding to $y_i$.
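The MSE definition can be written directly in plain Python. This is a sketch: the function below is a hand-written illustration, and the three sample values are made up. scikit-learn also provides sklearn.metrics.mean_squared_error for the same calculation.

```python
def mean_squared_error(y_test, y_pred):
    """Average of the squared differences between actual and predicted values."""
    if len(y_test) != len(y_pred):
        raise ValueError("y_test and y_pred must have the same length")
    n = len(y_test)
    return sum((actual - predicted) ** 2
               for actual, predicted in zip(y_test, y_pred)) / n


# Made-up values for illustration: the errors are 1, 0, and -2.
y_test = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

mse = mean_squared_error(y_test, y_pred)
print(mse)  # (1 + 0 + 4) / 3
```

Because the errors are squared, large errors are penalized more heavily than small ones.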
Linear regression

Diabetes examples: Python program


Linear regression

Diabetes examples: Python program: Results


14.3 k-NN algorithm

k-NN algorithm
The problem of classifying Dachshund and Samoyed dogs
Assume a simple case where the feature space of the data consists of two fea-
tures and items are displayed in this feature space.
k-NN algorithm

Figure: Samoyed dogs and Dachshund dogs plotted in a feature space of length and height
• The Samoyed has a high height value compared to its length, and the
Dachshund has a low height value compared to its length.
k-NN algorithm
If you classify a new item by looking at its 3 nearest neighbors, it belongs to
class B, but if you classify it by looking at its 5 nearest neighbors, it belongs
to class A.

When k = 3: 2 class B, 1 class A
When k = 5: 2 class B, 3 class A
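The effect of k can be sketched in plain Python. The helper knn_classify and the toy points are made up for illustration; they are arranged so that the 3 nearest neighbors are 2 B and 1 A, and the 5 nearest are 2 B and 3 A:

```python
import math
from collections import Counter


def knn_classify(train, query, k):
    """Majority vote among the k training points closest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# (point, class) pairs, listed by increasing distance from the origin.
train = [
    ((1.0, 0.0), "B"),
    ((0.0, 1.5), "A"),
    ((-2.0, 0.0), "B"),
    ((0.0, -2.5), "A"),
    ((3.0, 0.0), "A"),
    ((5.0, 5.0), "B"),
]
query = (0.0, 0.0)

print(knn_classify(train, query, k=3))  # B
print(knn_classify(train, query, k=5))  # A
```

An odd k is a common choice for two classes because it avoids tied votes.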
k-NN algorithm

Get ready to classify the beautiful irises


k-NN algorithm

Features: sepal length, sepal width, petal length, petal width
Labels: Setosa: 0, Versicolor: 1, Virginica: 2
k-NN algorithm

You can also see that the labels are encoded as 0, 1, 2


Setosa : 0
Versicolor : 1
Virginica: 2
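A sketch for inspecting the iris data and its label encoding, using scikit-learn's load_iris() loader:

```python
from sklearn.datasets import load_iris

iris = load_iris()

print(iris.data.shape)      # (150, 4): 150 flowers, 4 features each
print(iris.feature_names)   # sepal and petal length and width, in cm

# The position of each name in target_names is its numeric label.
for label, name in enumerate(iris.target_names):
    print(label, name)

print(sorted(set(iris.target)))  # [0, 1, 2]
```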
k-NN algorithm
Apply the k-NN algorithm
Train this model using 80% of the total data as training data, and check that it
predicts well on the remaining 20%, the test data.

Training and testing using KNeighborsClassifier
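A sketch of this training and testing step with scikit-learn's KNeighborsClassifier. The values n_neighbors=5 and random_state=0 are assumptions, since the slide's code is not preserved:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# 80% training data, 20% test data.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0  # assumed seed
)

knn = KNeighborsClassifier(n_neighbors=5)  # k = 5 is an assumed choice
knn.fit(X_train, y_train)

# For a classifier, score() is the fraction of correct predictions.
test_accuracy = knn.score(X_test, y_test)
print(len(X_train), len(X_test))
print(test_accuracy)
```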
k-NN algorithm

Let's classify new flowers by applying the model.
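A sketch of classifying new flowers. The two measurement rows are illustrative values chosen for this example, not taken from the slides, and the model here is trained on the whole iris dataset with an assumed k of 5:

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
knn = KNeighborsClassifier(n_neighbors=5).fit(iris.data, iris.target)

# Each row: sepal length, sepal width, petal length, petal width (cm).
new_flowers = [
    [5.1, 3.5, 1.4, 0.2],  # short petals
    [6.7, 3.0, 5.2, 2.3],  # long, wide petals
]

labels = knn.predict(new_flowers)   # numeric labels 0, 1, or 2
names = iris.target_names[labels]   # translate labels back to species names

for flower, name in zip(new_flowers, names):
    print(flower, "->", name)
```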


k-NN algorithm
Let's find out the accuracy of the classifier
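One way to measure the accuracy, sketched with scikit-learn's accuracy_score and confusion_matrix. The split seed and k = 5 are assumptions; the slide's own code is not preserved:

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0  # assumed seed
)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

y_pred = knn.predict(X_test)

# Fraction of test flowers whose predicted label equals the true label.
accuracy = accuracy_score(y_test, y_pred)

# Rows are true classes, columns are predicted classes (0, 1, 2).
matrix = confusion_matrix(y_test, y_pred, labels=[0, 1, 2])

print(accuracy)
print(matrix)
```

The diagonal of the confusion matrix counts the correct predictions for each class, so the accuracy equals the diagonal sum divided by the number of test samples.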
14.4 Overfitting and underfitting

Overfitting and underfitting

• Underfitting: performance is poor on both the training data and new data.
• Overfitting: performance is excellent on the training data, but performance on
new data is poor.

Underfitting and overfitting


1) Reasons for underfitting and overfitting?
2) Techniques to reduce underfitting and overfitting?

Figure panels: Underfit, Good fit, Overfit


(https://www.geeksforgeeks.org/underfitting-and-overfitting-in-machine-learning)
Thank you
