Unit 1 Part 3

Uploaded by

simbaaaa0702
Copyright
© © All Rights Reserved

UNIT 1 PART 3

Different types of ML/ models of ML

In general, machine learning algorithms are classified into three types. Semi-supervised learning is sometimes counted as a fourth.

o Supervised Learning
o Unsupervised Learning
o Reinforcement Learning
o Semi Supervised Learning

Supervised Learning is further divided into two categories:

o Classification
o Regression

Unsupervised Learning is divided into the following categories:

o Clustering
o Association Rule
o Dimensionality Reduction

I. Supervised Machine Learning

Supervised learning is the simplest machine learning model to understand. The input
data, called training data, comes with a known label or result for each example.

It therefore works on the principle of input-output pairs: a function is fitted to a
training data set and then applied to unseen data to make predictions. Supervised
learning is task-based, and its models are tested on labeled data sets.

We can apply a supervised learning model to simple real-life problems. For
example, given a dataset of ages and heights, we can build a supervised learning
model that predicts a person's height from their age.

Supervised Learning models are further classified into two categories:

1. Classification

➢ Definition
In machine learning, classification is the problem of identifying to which of a set of
categories a new observation belongs, on the basis of a training set of data
containing observations (or instances) whose category membership is known.

➢ EXAMPLE
• Data in Table 1.1 is the training set of data. There are two attributes “Score1”
and “Score2”.
• The class label is called “Result”. The class label has two possible values
“Pass” and “Fail”.
• The data can be divided into two categories or classes: The set of data for
which the class label is “Pass” and the set of data for which the class label
is “Fail”.
• Let us assume that we have no knowledge about the data other than what is
given in the table.
• Now, the problem can be posed as follows: If we have some new data, say
“Score1 = 25” and “Score2 = 36”, what value should be assigned to “Result”
corresponding to the new data; in other words, to which of the two categories
or classes the new observation should be assigned?
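
One possible way to answer this question is a nearest-neighbour vote, sketched below. Table 1.1 is not reproduced in these notes, so the training rows are made-up placeholder values; the function name `knn_predict` and the choice k = 3 are also assumptions made only for illustration.

```python
import math
from collections import Counter

# Hypothetical training rows: ((Score1, Score2), Result).
# These values are placeholders, NOT the contents of Table 1.1.
TRAINING = [
    ((80, 75), "Pass"),
    ((70, 65), "Pass"),
    ((60, 70), "Pass"),
    ((30, 40), "Fail"),
    ((20, 30), "Fail"),
    ((35, 25), "Fail"),
]


def knn_predict(training, query, k=3):
    """Return the majority class label among the k training points nearest to query."""
    # Sort the training rows by Euclidean distance to the query point.
    by_distance = sorted(training, key=lambda row: math.dist(row[0], query))
    # Count the labels of the k closest rows and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Classify the new observation from the example: Score1 = 25, Score2 = 36.
print(knn_predict(TRAINING, (25, 36)))
```

With these placeholder rows, the three nearest neighbours of the new observation all carry the label "Fail", so that is the class assigned. A different training table could of course give a different answer.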

➢ Real life examples

i) Optical character recognition: The optical character recognition problem, which
is the problem of recognizing character codes from their images, is an example of a
classification problem. This is an example where there are multiple classes, as many
as there are characters we would like to recognize. Especially interesting is the
case when the characters are handwritten. People have different handwriting styles;
characters may be written small or large, slanted, with a pen or pencil, and there
are many possible images corresponding to the same character.

ii) Face recognition: In the case of face recognition, the input is an image, the
classes are people to be recognized, and the learning program should learn to
associate the face images to identities. This problem is more difficult than optical
character recognition because there are more classes, the input image is larger, and
a face is three-dimensional, so differences in pose and lighting cause significant
changes in the image.

iii) Speech recognition: In speech recognition, the input is acoustic and the classes
are words that can be uttered.

iv) Medical diagnosis: In medical diagnosis, the inputs are the relevant information
we have about the patient and the classes are the illnesses. The inputs contain the
patient’s age, gender, past medical history, and current symptoms. Some tests may
not have been applied to the patient, and thus these inputs would be missing.

v) Knowledge extraction: Classification rules can also be used for knowledge
extraction. The rule is a simple model that explains the data, and by looking at
this model we obtain an explanation of the process underlying the data.

vi) Compression: Classification rules can be used for compression. By fitting a rule
to the data, we get an explanation that is simpler than the data, requiring less
memory to store and less computation to process.

vii) More examples

Here are some further examples of classification problems.

(a) An emergency room in a hospital measures 17 variables like blood pressure,
age, etc. of newly admitted patients. A decision has to be made whether to put
the patient in an ICU. Due to the high cost of ICU, patients who may
survive a month or more are given higher priority. Such patients are labeled
as “low-risk patients” and others are labeled “high-risk patients”. The problem
is to devise a rule to classify a patient as a “low-risk patient” or a “high-risk
patient”.
(b) A credit card company receives hundreds of thousands of applications for
new cards. The applications contain information regarding several attributes
like annual salary, age, etc. The problem is to devise a rule to classify the
applicants into those who are credit-worthy, those who are not credit-worthy, and
those who require further analysis.
(c) Astronomers have been cataloguing distant objects in the sky using digital
images created using special devices. The objects are to be labeled as star, galaxy,
nebula, etc. The data are highly noisy and the images are very faint. The problem is
to devise a rule by which a distant object can be correctly labeled.

➢ Algorithms

There are several machine learning algorithms for classification. The following are
some of the well-known algorithms.
a) Logistic regression
b) Naive Bayes algorithm
c) k-NN algorithm
d) Decision tree algorithm
e) Support vector machine algorithm
f) Random forest algorithm
➢ Remarks

• A classification problem requires that examples be classified into one of two or
more classes.
• A classification problem can have real-valued or discrete input variables.
• A problem with two classes is often called a two-class or binary classification
problem.
• A problem with more than two classes is often called a multi-class classification
problem.
• A problem where an example is assigned multiple classes is called a multi-label
classification problem.

2. Regression

• In machine learning, a regression problem is the problem of predicting the
value of a numeric variable based on observed values of the input variables.
• The value of the output variable may be a number, such as an integer or a
floating point value.
• These are often quantities, such as amounts and sizes.
• The input variables may be discrete or real-valued.

➢ Example
Consider the data on car prices given in Table 1.2.
Suppose we are required to estimate the price of a car aged 25 years with distance
53,240 km and weight 1,200 pounds.

This is an example of a regression problem because we have to predict the value of
the numeric variable “Price”.

➢ Different regression models

• Simple linear regression: There is only one continuous independent variable x, and
the assumed relation between the independent variable and the dependent variable y
is $y = a + bx$.

• Multivariate linear regression: There is more than one independent variable, say
$x_1, \ldots, x_n$, and the assumed relation between the independent variables and
the dependent variable is $y = a_0 + a_1 x_1 + \cdots + a_n x_n$.

• Polynomial regression: There is only one continuous independent variable x, and
the assumed model is $y = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$.

• Logistic regression: The dependent variable is binary, that is, a variable which takes
only the values 0 and 1. The assumed model involves certain probability
distributions.
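
To make the simple linear regression model concrete, the sketch below fits y = a + bx by ordinary least squares using the standard closed-form formulas. The function name `fit_simple_linear_regression` and the data points are illustrative assumptions; the data are generated from y = 1 + 2x and are not taken from Table 1.2.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (a, b) minimising the squared error of the model y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope b = (sum of co-deviations of x and y) / (sum of squared deviations of x).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept a makes the fitted line pass through the point of means.
    a = mean_y - b * mean_x
    return a, b


# Illustrative data generated from y = 1 + 2x (not taken from Table 1.2).
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
a, b = fit_simple_linear_regression(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")  # prints "y = 1.00 + 2.00x"
```

Because the sample points lie exactly on a line, the fit recovers the intercept and slope exactly. With noisy real data the same formulas give the best-fitting line in the least-squares sense.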

II. Unsupervised Machine learning models

Unsupervised machine learning models work in the opposite way to supervised
learning: the model learns from an unlabeled training dataset.

No target output is provided. Instead, the model discovers hidden patterns and
structure in the dataset by itself, without any supervision.

Unsupervised learning models are mainly used to perform three tasks, which are as
follows:

o Clustering
Clustering is an unsupervised learning technique that groups data points into
clusters based on their similarities and differences. Objects with the most
similarities remain in the same group and have few or no similarities with
objects in other groups.
Clustering algorithms are widely used in tasks such as image segmentation,
statistical data analysis, market segmentation, etc.
Some commonly used clustering algorithms are K-means clustering,
hierarchical clustering, DBSCAN, etc.

o Association Rule Learning

Association rule learning is an unsupervised learning technique that finds
interesting relations among variables within a large dataset. The main aim of
this learning algorithm is to find the dependency of one data item on another
data item and map those variables accordingly, for example so that a business
can use the relations to increase profit. It is mainly applied in market basket
analysis, web usage mining, continuous production, etc.
Some popular association rule learning algorithms are Apriori, Eclat, and
FP-growth.

o Dimensionality Reduction

The number of features/variables present in a dataset is known as the
dimensionality of the dataset, and a technique used to reduce it is known as a
dimensionality reduction technique.
Although more features can carry more information, too many of them can hurt the
performance of the model/algorithm, for example by causing overfitting. In such
cases, dimensionality reduction techniques are used.
"It is a process of converting the higher dimensions dataset into lesser
dimensions dataset ensuring that it provides similar information."
Common dimensionality reduction methods include PCA (Principal Component
Analysis), Singular Value Decomposition, etc.
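
As an illustration of the clustering task described earlier in this section, here is a minimal K-means sketch. It is a simplified version written for these notes, not a production implementation: the function name `kmeans`, the made-up 2-D points, and the choice of the first k points as starting centroids are all assumptions.

```python
import math


def kmeans(points, k, iterations=10):
    """Group points into k clusters; return (centroids, cluster index per point)."""
    # Assumption: use the first k points as starting centroids (real code
    # usually picks them at random or with a smarter seeding scheme).
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignment


# Two visibly separate groups of made-up 2-D points.
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, assignment = kmeans(points, k=2)
print(assignment)  # prints [0, 1, 0, 0, 1, 1]
```

No labels are supplied anywhere: the two groups emerge only from the distances between points, which is what distinguishes clustering from classification.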

III. Reinforcement Learning

Reinforcement learning is a type of machine learning technique where a computer
agent learns to perform a task through repeated trial-and-error interactions with a
dynamic environment. This learning approach enables the agent to make a series of
decisions that maximize a reward metric for the task without human intervention and
without being explicitly programmed to achieve the task.

AI programs trained with reinforcement learning have beaten human players in board
games like Go and chess, as well as in video games.
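
The trial-and-error idea can be sketched with tabular Q-learning on a toy problem. Everything specific here is an assumption for illustration: a five-state corridor where the agent starts at state 0 and earns a reward of 1 on reaching state 4, along with the names `step` and `train` and the values of `ALPHA` and `GAMMA`.

```python
import random

N_STATES = 5          # states 0..4 laid out in a row; state 4 is the goal
ACTIONS = [0, 1]      # 0 = move left, 1 = move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (assumed values)


def step(state, action):
    """Toy environment: return (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, max_steps=100, seed=0):
    """Learn a Q-table by interacting with the toy environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Explore with purely random actions. Q-learning is off-policy,
            # so it can still learn the greedy policy from random behaviour.
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Move Q(s, a) toward reward + discounted best value of the next state.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy for the non-goal states: the action with the highest Q-value.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Nobody tells the agent that moving right is correct. It only sees rewards, and the learned Q-values should end up favouring action 1 (move right) in every non-goal state.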

…………………………………………………………………………………

IV. Semi-Supervised Learning

Semi-supervised learning bridges supervised and unsupervised learning techniques to
address their key challenges. With semi-supervised learning, you train an initial
model on a few labeled samples and then iteratively apply the model to a larger
unlabeled dataset.
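
One common way to realise this loop is self-training (pseudo-labeling). The notes do not name a specific algorithm, so the sketch below is only one possible reading: a nearest-centroid model on made-up 1-D data, with the hypothetical helper names `centroid_classifier` and `self_train`.

```python
def centroid_classifier(labeled):
    """Fit a nearest-centroid model: return {label: mean of its 1-D samples}."""
    groups = {}
    for x, label in labeled:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}


def self_train(labeled, unlabeled):
    """Pseudo-label the unlabeled points one at a time, most confident first."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    while unlabeled:
        # Refit the model on everything labeled so far.
        centroids = centroid_classifier(labeled)

        def confidence(x):
            # Gap between the two closest centroids: bigger gap = surer prediction.
            d = sorted(abs(x - c) for c in centroids.values())
            return d[1] - d[0]

        best = max(unlabeled, key=confidence)
        label = min(centroids, key=lambda lab: abs(best - centroids[lab]))
        labeled.append((best, label))  # treat the prediction as a new label
        unlabeled.remove(best)
    return labeled


# A few labeled samples plus a larger unlabeled pool (all values made up).
seed_labels = [(1.0, "A"), (9.0, "B")]
pool = [1.5, 2.0, 2.5, 3.0, 7.0, 7.5, 8.0, 8.5]
result = dict(self_train(seed_labels, pool))
print(result)
```

Only two points were labeled by hand, yet all ten end up with a label. The risk of this approach is that an early wrong pseudo-label gets reinforced in later rounds, which is why the most confident predictions are added first.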

Unlike unsupervised learning, which is useful only in a limited set of situations,
semi-supervised learning (SSL) works for a variety of problems, from classification
and regression to clustering and association.

A semi-supervised learning approach uses small amounts of labeled data and also
large amounts of unlabeled data. This reduces expenses on manual annotation and
cuts data preparation time.

Since unlabeled data is abundant, easy to get, and inexpensive, semi-supervised
learning finds many applications, often without a significant loss in the accuracy
of results.
……………………………………………………………………………………

Difference - Supervised vs Unsupervised vs Reinforcement Learning

| Criteria | Supervised Learning | Unsupervised Learning | Reinforcement Learning |
|---|---|---|---|
| Input Data | Input data is labelled. | Input data is not labelled. | Input data is not predefined. |
| Problem | Learn the pattern of inputs and their labels. | Divide data into groups (clusters). | Find the best reward between a start and an end state. |
| Solution | Finds a mapping equation on input data and its labels. | Finds similar features in input data to group it into clusters. | Maximizes reward by assessing the results of state-action pairs. |
| Model Building | Model is built and trained prior to testing. | Model is built and trained prior to testing. | The model is trained and tested simultaneously. |
| Applications | Deals with regression and classification problems. | Deals with clustering and association rule mining problems. | Deals with exploration and exploitation problems. |
| Algorithms Used | Decision trees, linear regression, K-nearest neighbors | K-means clustering, k-medoids clustering, agglomerative clustering | Q-learning, SARSA, Deep Q Network |
| Examples | Image detection, population growth prediction | Customer segmentation, feature elicitation, targeted marketing, etc. | Driverless cars, self-navigating vacuum cleaners, etc. |
