Ch01 ICS422 02
PREDICTIVE ANALYTICS
[3-0-0-3]
CLASS 02
Presented by
Dr. Selvi C
Assistant Professor
TYPES OF ML ALGORITHMS
🞆 Supervised Learning/Predictive Learning
🞆 Unsupervised Learning/ Descriptive Learning
🞆 Semi-Supervised Learning
🞆 Reinforcement Learning
SUPERVISED LEARNING
• The machine has a "supervisor" or a "teacher" who gives the machine all the answers, such as whether the picture shows an apple or an orange.
• The teacher has already divided (labeled) the data into oranges and apples, and the machine uses these examples to learn, one by one.
SUPERVISED LEARNING
SUPERVISED LEARNING
🞆 When an algorithm learns from example data and
associated target responses in order to later predict the
correct response when posed with new examples
SUPERVISED LEARNING
SUPERVISED LEARNING
House Rent Prediction
Sq. feet    Rent
100         1500
200         3000
[Chart: Square feet vs. House Rent]
Regression
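Below is a minimal regression sketch for this example, assuming Python with scikit-learn; the two (sq. feet, rent) pairs come from the table above, and the 300 sq. ft. query is only an illustrative new input.

```python
# Minimal regression sketch: learn rent from square footage.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[100], [200]])   # square feet (feature)
y = np.array([1500, 3000])     # rent (continuous target)

model = LinearRegression().fit(X, y)
print(model.predict([[300]]))  # predicted rent for an unseen house size
```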
SUPERVISED LEARNING
A. Regression
B. Classification
C. Classification
SUPERVISED LEARNING
A. Regression
B. Classification
C. Regression
D. Classification
E. Classification
SOLVE THIS…
UNSUPERVISED LEARNING
UNSUPERVISED LEARNING- CLUSTERING
UNSUPERVISED LEARNING
UNSUPERVISED LEARNING
Clustering
🞆 Detecting potentially useful clusters of input examples.
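A minimal clustering sketch, assuming Python with scikit-learn and a small made-up set of unlabeled 2-D points; k-means groups them with no target labels at all.

```python
# Minimal clustering sketch: k-means on unlabeled points.
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])   # unlabeled observations

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)            # cluster index assigned to each point
print(kmeans.cluster_centers_)   # the two discovered cluster centers
```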
UNSUPERVISED LEARNING -
DIMENSIONALITY REDUCTION
• Assembles specific low-level features into a smaller number of high-level ones
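A minimal dimensionality-reduction sketch, assuming Python with scikit-learn and randomly generated data; PCA compresses five original features into two higher-level components.

```python
# Minimal dimensionality-reduction sketch: PCA from 5 features down to 2.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))           # 100 samples, 5 original features

pca = PCA(n_components=2)               # keep 2 high-level components
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                  # (100, 2)
print(pca.explained_variance_ratio_)    # share of variance each component keeps
```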
UNSUPERVISED LEARNING -
ASSOCIATION RULE LEARNING
SOLVE THIS…
Supervised                               Unsupervised
Labelled data                            No labels
Direct feedback                          No feedback
Predict outcome                          Find hidden structure in data
Helps solve various types of             Unlabeled data is easier to get from a
real-world computation problems          computer than labeled data, which needs
                                         manual intervention
SEMI-SUPERVISED LEARNING
🞆 If some learning samples are labeled but some others are not, then it is semi-supervised learning.
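A minimal semi-supervised sketch, assuming Python with scikit-learn and made-up data; unlabeled samples are marked with -1 and LabelPropagation spreads the two known labels to them.

```python
# Minimal semi-supervised sketch: only two samples carry labels (-1 = unlabeled).
import numpy as np
from sklearn.semi_supervised import LabelPropagation

X = np.array([[1.0], [1.2], [0.9], [8.0], [8.3], [7.9]])
y = np.array([0, -1, -1, 1, -1, -1])     # most samples are unlabeled

model = LabelPropagation().fit(X, y)
print(model.transduction_)               # labels inferred for every sample
```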
REINFORCEMENT LEARNING
🞆 Accompany an example with positive or negative feedback
according to the
solution the algorithm proposes
🞆 Learning by trial and error: The system evaluates its
performance based on the feedback responses and reacts
accordingly
🞆 “how to act or behave when given occasional reward or
punishment signals”
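A minimal trial-and-error sketch, assuming a made-up two-armed bandit problem (not from the slides); an epsilon-greedy agent learns which arm pays off more using only occasional reward signals.

```python
# Minimal reinforcement-learning sketch: epsilon-greedy learning by trial and error.
import random

true_reward_prob = [0.3, 0.7]   # hidden reward probabilities of the two arms
value_estimate = [0.0, 0.0]     # the agent's learned value of each arm
counts = [0, 0]
epsilon = 0.1                    # exploration rate

for step in range(1000):
    if random.random() < epsilon:
        arm = random.randrange(2)                        # explore
    else:
        arm = value_estimate.index(max(value_estimate))  # exploit the best arm so far
    reward = 1 if random.random() < true_reward_prob[arm] else 0
    counts[arm] += 1
    # incremental average update of the chosen arm's estimated value
    value_estimate[arm] += (reward - value_estimate[arm]) / counts[arm]

print(value_estimate)   # should approach [0.3, 0.7]
```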
7 STEPS OF MACHINE LEARNING
Color      Shape             Apple or Orange?
Red        Round, conical    Apple
Orange     Round             Orange
STEP – 1 DATA COLLECTION
⮚ For collecting data on color, we may use a spectrometer and,
for the shape data, we may use pictures of the fruits so that
they can be treated as 2D figures.
⮚ For the purpose of collecting data, we would try to get as many different types of apples and oranges as possible in order to create diverse data sets for our features. For this purpose, we may try to search the markets for oranges and apples that may be from different parts of the world.
STEP – 2 DATA PREPARATION
• Load our data into a suitable place and prepare it for use in
our machine learning training
• Randomize the ordering of the data – this will improve the model
• Visualize your data – check for relevant relationships and for data imbalances
• Example: if there are more data points about apples than oranges, the model we train will be biased
• Split the data into two parts (a minimal split sketch follows this list):
• The first part, used for training our model, will be the majority of the dataset – the Train data
• The second part will be used for evaluating our trained model’s performance – the Test data
• Do not use the same data for training and testing
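A minimal split sketch, assuming Python with scikit-learn and placeholder X/y arrays; it randomizes the ordering and holds out 20% of the data as the Test set.

```python
# Minimal data-preparation sketch: shuffle and split into Train/Test.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # placeholder features
y = np.array([0, 1] * 5)           # placeholder labels

# randomize the ordering and keep 80% for training, 20% for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, stratify=y, random_state=42)

print(X_train.shape, X_test.shape)   # (8, 2) (2, 2)
```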
STEP – 3 CHOOSING A MODEL
• Understand your constraints
STEP – 4 TRAINING
STEP – 7 PREDICTION / INFERENCE
Color: Red
Shape: Round and Conical
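A minimal prediction sketch, assuming Python with scikit-learn and a hypothetical 0/1 encoding of the color and shape attributes from step 1; the trained tree is then asked about a new fruit that is red, round and conical.

```python
# Minimal inference sketch: classify a new fruit from encoded color/shape features.
from sklearn.tree import DecisionTreeClassifier

# features: [is_red, is_round_and_conical]  (hypothetical encoding)
X_train = [[1, 1],   # red, round & conical  -> apple
           [0, 0]]   # orange-colored, round -> orange
y_train = ["Apple", "Orange"]

clf = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print(clf.predict([[1, 1]]))   # -> ['Apple']
```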
ML TERMINOLOGIES
ML TERMINOLOGIES
🞆 Algorithm
A method, function, or series of instructions used to generate
a machine learning model. Examples include linear
regression, decision trees, support vector machines, and
neural networks.
🞆 Attribute
A quality describing an observation (e.g. color, size, weight).
In Excel terms, these are column headers.
🞆 Dimension
The number of features you have in your data.
ML TERMINOLOGIES
🞆 Training set
A set of observations used to generate machine learning models.
🞆 Test set
A set of observations used at the end of model training and
validation to assess the predictive power of your model. How
generalizable is your model to unseen data?
🞆 Validation set
A set of observations used during model training to provide
feedback on how well the current parameters generalize beyond
the training set. If training error decreases but validation error
increases, your model is likely overfitting and you should pause
training.
ML TERMINOLOGIES
🞆 Dataset Split
✔ First split the dataset into 2 — Train and Test
✔ Keep aside the Test set
✔ Randomly choose X% of the Train dataset to be the actual Train set and the remaining (100-X)% to be the Validation set
✔ The model is then iteratively trained and validated on these different sets (a minimal split sketch follows this list)
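A minimal sketch of this two-stage split, assuming Python with scikit-learn, placeholder data, and illustrative percentages (80/20 for Train/Test, then 75/25 for Train/Validation).

```python
# Minimal Train/Validation/Test split sketch.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(200).reshape(100, 2)   # placeholder features
y = np.arange(100) % 2               # placeholder labels

# 1) set aside the Test set (20% here)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 2) carve a Validation set out of the remaining Train data (25% of it here)
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.25, random_state=0)

print(len(X_train), len(X_val), len(X_test))   # 60 20 20
```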
ML TERMINOLOGIES
🞆 Dataset Split
Cross Validation: K Fold
The training set is split into k smaller sets
The following procedure is followed for each of the k “folds”:
🞆 A model is trained using k-1 of the folds as training data;
🞆 The resulting model is validated on the remaining fold
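A minimal k-fold sketch, assuming Python with scikit-learn and a placeholder training set; each fold takes a turn as the validation data while the other k-1 folds are used for training.

```python
# Minimal k-fold cross-validation sketch.
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(20).reshape(10, 2)   # placeholder training set
kf = KFold(n_splits=5, shuffle=True, random_state=0)

for fold, (train_idx, val_idx) in enumerate(kf.split(X)):
    print(f"fold {fold}: train on {train_idx}, validate on {val_idx}")
```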
ML TERMINOLOGIES
🞆 Parameters
Parameters are properties of training data learned by training
a machine learning model or classifier. They are adjusted
using optimization algorithms and unique to each experiment.
Examples of parameters include:
o weights in an artificial neural network
o support vectors in a support vector machine
o coefficients in a linear or logistic regression
ML TERMINOLOGIES
🞆 Overfitting
Overfitting occurs when your model learns the training data
too well and incorporates details and noise specific to your
dataset. You can tell a model is overfitting when it performs
great on your training/validation set, but poorly on your test
set (or new real-world data).
🞆 Underfitting
The counterpart of overfitting; it happens when a machine learning model is not complex enough to accurately capture the relationships between a dataset’s features and its target variables.
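A minimal illustration of both effects, assuming Python with NumPy and made-up noisy samples of a sine curve: a degree-1 fit underfits (high error everywhere), while a degree-9 fit typically overfits (near-zero training error but a much larger test error).

```python
# Minimal under/overfitting sketch: polynomial fits of increasing degree.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 10)
x_test = np.linspace(0, 1, 50)
y_test = np.sin(2 * np.pi * x_test) + rng.normal(0, 0.2, 50)

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_err:.3f}, test MSE {test_err:.3f}")
```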
ML TERMINOLOGIES
🞆 Bias
Bias is the difference between the average prediction of our model and the correct value which we are trying to predict. A model with high bias pays very little attention to the training data and oversimplifies the model. It always leads to high error on training and test data.
🞆 Variance
Variance is the variability of model prediction for a given data point, or a value which tells us the spread of our data. A model with high variance pays a lot of attention to the training data and does not generalize to data it hasn’t seen before. As a result, such models perform very well on training data but have high error rates on test data.
THANK YOU