L09 - Learning - Part 2
Learning – Part 2
Practical Machine Learning
Outline
• Machine Learning Tools
• Practical Machine Learning
– Decision Tree
– Instance Based Learning
– Neural Network
Machine Learning Tools
• ML tools are software applications of learning in artificial intelligence
– Learn from input (data) through training and testing
– Provide output such as a learning model (e.g. rules or a mathematical model) that is capable of classifying or predicting future data
ML Tools
• Scikit-learn: Linux, Mac OS, Windows; Free; written in Python, Cython, C, C++; Classification, Regression, Clustering, Preprocessing, Model Selection, Dimensionality reduction
• PyTorch: Linux, Mac OS, Windows; Free; written in Python, C++, CUDA; Autograd module, Optim module, nn module
• TensorFlow: Linux, Mac OS, Windows; Free; written in Python, C++, CUDA; provides a library for dataflow programming
• Weka: Linux, Mac OS, Windows; Free; written in Java; Data preparation, Classification, Regression, Clustering, Visualization, Association rules mining
Reference: https://www.softwaretestinghelp.com/machine-learning-tools/
ML Tools
• KNIME: Linux, Mac OS, Windows; Free; written in Java; can work with large data volumes; supports text mining and image mining through plugins
Illustrating Classification Task

Training Set:
Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes

A learning algorithm performs induction on the training set to learn a model.

Test Set:
Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?

The learned model is then applied (deduction) to the test set to predict the unknown class labels.
What is DT learning?
• A DT (decision tree) is a ‘flow-chart-like’ structure:
– An internal node represents a test on an attribute
– A branch represents an outcome of the test
– A leaf node represents a class label
• Example: decision tree for the concept PlayTennis (Yes/No = class label):
– Outlook? – Sunny: test Humidity? (leaves No, Yes); Overcast: Yes; Rain: test Wind? (leaves No, Yes)
Splitting Attributes
[Figure: a decision tree built on the attributes Refund, Marital Status and Taxable Income to predict Cheat]
Decision Tree Classification Task
• Same process as the general classification task: a tree induction algorithm learns a decision tree model from the training set (Tid 1–10), and the model is then applied (deduction) to the test set (Tid 11–15) to predict the unknown class labels.
Test Data
• Test record: Refund = No, Marital Status = Married, Taxable Income = 80K, Cheat = ?
• Start from the root of the tree:
– Refund? – Yes: NO; No: test MarSt
– MarSt? – Single, Divorced: test TaxInc; Married: NO
– TaxInc? – < 80K: NO; > 80K: YES
• For this record, Refund = No leads to MarSt, and Marital Status = Married leads to the leaf NO, so assign Cheat to “No”
Online DT builder:
https://planetcalc.com/8443/
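As a complement to the online builder, a decision tree can also be trained programmatically. The sketch below uses scikit-learn (one of the ML tools listed earlier) on the Tid 1–10 training data; the integer encodings for Attrib1 and Attrib2 are illustrative choices, not part of the original slides.

```python
# Sketch: a decision tree on the slide's training data (assumes scikit-learn is installed)
from sklearn.tree import DecisionTreeClassifier

# Illustrative encoding: Attrib1 Yes=1/No=0, Attrib2 Large=0/Medium=1/Small=2,
# Attrib3 as thousands (125K -> 125)
X_train = [
    [1, 0, 125], [0, 1, 100], [0, 2, 70], [1, 1, 120], [0, 0, 95],
    [0, 1, 60],  [1, 0, 220], [0, 2, 85], [0, 1, 75],  [0, 2, 90],
]
y_train = ["No", "No", "No", "No", "Yes", "No", "No", "Yes", "No", "Yes"]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)                 # induction: learn the model

# Test records Tid 11-15 (class unknown in the slides)
X_test = [[0, 2, 55], [1, 1, 80], [1, 0, 110], [0, 2, 95], [0, 0, 67]]
print(clf.predict(X_test))                # deduction: apply the model
```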
Introduction to Instance Based Learning (IBL)
• Background:
– IBL focuses on storing data and using the stored data for classification.
– A commonly used algorithm in instance-based learning is k-Nearest Neighbors (k-NN).
– In k-NN, the classification or prediction of a new instance (data) is based on the similarity or proximity of its features to the instances in the training data. It selects the k nearest neighbors and makes a prediction/classification based on the majority class or average value of those neighbors.
K-NN
– Locate the k nearest training examples xn to the query xq
• “Nearest” is normally calculated using the Euclidean distance:
d(xi, xj) ≡ √( Σ r=1..n (ar(xi) − ar(xj))² )
K-NN Example
• Assume there are two classes, + and −, and each instance is represented as a coordinate, e.g.:
– b is (3,1)
– d is (4,4)
• X is a new instance to be classified (as + or −)
– X is (4,2)
[Figure: scatter plot with + instances a, b, c in the lower region and − instances d, e, f in the upper region]
K-NN example
• What is the classification of X if k=1?
– X is (4,2) and b is (3,1):
• D(x,b) = √((4 − 3)² + (2 − 1)²)
• D(x,b) = 1.4142
– X is (4,2) and d is (4,4):
• D(x,d) = √((4 − 4)² + (2 − 4)²)
• D(x,d) = 2
– Therefore, X is classified as + because it is nearer to b (a + instance) than to d (a − instance)
– How about k=3?
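The k=1 computation above can be checked with a few lines of Python; the helper name `euclidean` is an illustrative choice.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

x = (4, 2)   # new instance X
b = (3, 1)   # a '+' instance
d = (4, 4)   # a '-' instance

print(round(euclidean(x, b), 4))  # 1.4142 -> X is nearer to b, so X is '+'
print(euclidean(x, d))            # 2.0
```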
23
5 - - f -
4 e - d -
2 +
a
+ x
1 c + b
+
1 2 3 4
24
K-NN on this data?
• The data is not numeric but nominal (categorical), so it must be converted before the distance calculation can be made
• Methods:
– Encoding – represent each attribute value by a number (e.g. Yes = 1, No = 2 for Attrib1)
– Similarity – if the attribute value in the test instance is the same as in the training instance, assign 0, otherwise 1
Label (integer) encoding
• Attrib1: {Yes = 0, No = 1}
• Attrib2: {Large = 0, Medium = 1, Small = 2}
• Attrib3: use the numeric value, e.g. 125K = 125, 100K = 100, etc.
• Representation of Training Tid 1:
– [0, 0, 125, No]
• Representation of Test Tid 11:
– [1, 2, 55]
K-NN Example
• Attrib1: {Yes = 0, No = 1}
• Attrib2: {Large = 0, Medium = 1, Small = 2}
• D(T11, T1) = √((1 − 0)² + (2 − 0)² + (55 − 125)²)
= √(1 + 4 + 4900) = √4905 ≈ 70.04
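Under the encoding above, D(T11, T1) can be computed directly; this is a minimal sketch with the encoded vectors taken from the slides.

```python
import math

t11 = [1, 2, 55]    # Test Tid 11: Attrib1=No, Attrib2=Small, Attrib3=55K
t1 = [0, 0, 125]    # Training Tid 1: Attrib1=Yes, Attrib2=Large, Attrib3=125K

# Euclidean distance over the encoded attributes
dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(t11, t1)))
print(round(dist, 2))  # 70.04
```

Note that Attrib3 dominates the distance here (4900 out of 4905), which is why feature scaling is often applied before running k-NN on mixed-magnitude attributes.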
Biological neuron
• Ref: Quasar Jarosz, courtesy of Wikipedia
• Dendrite (input zone), Axon (output zone)
Assumptions in NN
• Information processing happens in elements called neurons.
• Signals are transmitted between neurons through connections.
• Each neuron applies an activation function to produce its output.
Use of NN
• Storing and retrieving data,
• Pattern classification,
• Performing general mappings from input patterns to output patterns,
• Grouping similar patterns, or finding solutions to optimization problems.
Structure
• Neural networks are made up of many simple processing elements called neurons, units, cells or nodes.
• Each neuron is connected to other neurons via connections, each carrying a weight.
• The weights represent the information that the network uses to solve the problem.
• Each neuron has an internal state called its activation – a function of the inputs it receives.
Network structure
• Neuron Y receives input from neurons X1, X2 and X3.
• The output signals of neurons X1, X2 and X3 are x1, x2 and x3.
• The weights on the connections from X1, X2 and X3 to Y are w1, w2 and w3.
[Figure: X1, X2, X3 (input) connected to Y (output) through weights w1, w2, w3]
Network Structure
• The net input y_in to neuron Y is the sum of the weighted signals from X1, X2 and X3. That is:
y_in = x1·w1 + x2·w2 + x3·w3
• Unsupervised training
– Networks learn by grouping all similar input patterns into one group.
– No target is given.
– Examples: The Cluster Network, Adaptive Resonance Theory
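The net input and activation described above can be sketched as follows; the signal values and weights are illustrative assumptions, and the sigmoid is just one common choice of activation function.

```python
import math

def net_input(xs, ws):
    # y_in = x1*w1 + x2*w2 + x3*w3
    return sum(x * w for x, w in zip(xs, ws))

def sigmoid(y_in):
    # one common activation function among many
    return 1.0 / (1.0 + math.exp(-y_in))

xs = [1.0, 0.5, -1.0]   # output signals x1, x2, x3 (assumed values)
ws = [0.2, 0.4, 0.1]    # weights w1, w2, w3 (assumed values)

y_in = net_input(xs, ws)
print(round(y_in, 2))            # 0.3
print(round(sigmoid(y_in), 4))   # 0.5744
```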
Application of NN
• Used in application fields such as:
– Signal processing
– Control
– Pattern recognition
– Medicine
– Speech recognition
– Business, etc.
Example: Predict student performance
• Student performance can be predicted based on:
– Attendance
– Test scores
– Quiz scores
– Assignment scores
– Frequency of meeting lecturers
• Given that student performance is “Good” or “Bad”, use the information above to draw the NN architecture with 3 hidden units.
Predicting Student Performance
• Inputs: x1 = Attendance, x2 = Test scores, x3 = Quiz scores, x4 = Assignment scores, x5 = Frequency of meeting lecturers
• Architecture: input layer (x1–x5), hidden layer (z1, z2, z3), output layer (y1)
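A forward pass through the 5–3–1 architecture above can be sketched as follows; the weights are random placeholders (no training is shown), and the 0.5 threshold for Good/Bad and the input scaling are illustrative assumptions.

```python
import math
import random

random.seed(0)
N_IN, N_HIDDEN, N_OUT = 5, 3, 1   # x1..x5, z1..z3, y1

# Placeholder weights; a real network would learn these from data.
W1 = [[random.uniform(-1, 1) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
W2 = [[random.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(N_OUT)]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x):
    # hidden activations z1..z3, then output y1
    z = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [sigmoid(sum(w * zi for w, zi in zip(row, z))) for row in W2]

# x = [attendance, test, quiz, assignment, meeting frequency], scaled to [0, 1]
x = [0.9, 0.75, 0.8, 0.85, 0.3]
y1 = forward(x)[0]
print("Good" if y1 >= 0.5 else "Bad")
```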
Summary
• Machine learning is the study of learning algorithms applied to data in order to build models and extract hidden patterns and knowledge.
• ML algorithms include:
– Decision Tree, Neural Network, SVM, Naïve Bayes, etc.
Acknowledgement
• Various sources including Artificial Intelligence: A Modern Approach (Russell, Norvig) and some original slides from PM Dr. Mohd Hanafi Hijazi
• Prepared by Dr. Mohd Shamrie Sainin (2023)