Pma 5
DIVYA M
SANTHANAM L
Corporate Trainer
UNIT V
INTRODUCTION TO MODEL
MODELLING ALGORITHMS
Modelling algorithms give machines the ability to learn
automatically from large amounts of data.
TYPES OF MACHINE LEARNING
• BINARY CLASSIFIER: This classification problem has only two possible
outcomes.
Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or
DOG, etc.
• MULTI CLASS CLASSIFIER: This classification problem has more than two
outcomes.
3. Temperature Forecasting
5. Medical Diagnosis
Cluster customers based on age, call usage, data usage, etc. in order to
divide them into gold, silver and bronze segments.
Cluster insurance claims and look for unusual cases within the groups.
This is also known as anomaly detection and is a commonly used method
to detect fraud.
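The customer segmentation example above can be sketched in a few lines. This is only an illustration: it assumes a simple one-dimensional k-means procedure (the slides do not name a specific clustering algorithm), and the usage figures, starting centers, and the function name kmeans_1d are all made up for the example.

```python
# Hypothetical monthly data usage (GB) for nine customers; values are invented.
usage = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0, 28.0, 30.0, 32.0]

def kmeans_1d(values, centers, iterations=10):
    """Simple 1-D k-means: assign each value to its nearest center, then re-average."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Starting centers are an assumption chosen for the illustration.
centers, groups = kmeans_1d(usage, centers=[0.0, 15.0, 40.0])

# Map the clusters, ordered from low to high usage, onto the three segments.
for label, group in zip(["bronze", "silver", "gold"], groups):
    print(label, group)
```

Note that no target field is supplied: the segments come only from the input values.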
CREATING A MODEL IN IBM SPSS MODELER
When you execute a model, a model nugget (a yellow diamond node)
is added to the stream canvas.
The model nugget stores the results of the analysis and is linked
to the modeling node.
The link ensures that when you rerun the model, for example with
other inputs, the model nugget is updated with the new
results.
To view the model's results, open the model nugget. The output
depends on which model was executed. For example, you will have
a tree diagram when you run a CHAID node, a cluster profile
when you run a segmentation node, and a set of rules when you
execute an association model.
MODELLING PALETTE
The Modeling palette is organized into categories based on the type of
model; each type is a sub palette.
Selecting one of the sub palettes will show all modeling nodes
suitable for that category.
Each type of model requires specific field roles:
❑ Supervised models require one or more input fields
(predictors) and a target field.
❑ Segmentation models only require input fields. The cluster
solution will be based on these fields. No target field is
specified.
❑ Association models involve rules where a field can appear both
as input and as target.
NEURAL NETWORKS
A Neural Network has an Input Layer, a Hidden Layer, and an
Output Layer.
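[Figure: Have Insurance (vertical axis, 0.5 marked) against Age (horizontal axis, 20 to 80)]
2. Linear regression:
[Figure: Have Insurance (vertical axis, 0.5 marked) against Age (horizontal axis, 20 to 80)]
3. Logistic regression:
[Figure: Have Insurance (vertical axis, 0.5 marked) against Age (horizontal axis, 20 to 80)]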
For example, when the probability that a customer pays back a loan is 4/5, the
probability that the customer does not pay back the loan is 1 - 4/5 = 1/5.
Therefore, the odds are (4/5) / (1/5) = 4 for this customer. When the odds are 1,
the probability that the event occurs equals the probability that it does not
occur, and both probabilities are 0.5.
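The arithmetic above can be checked with a short sketch. The function name odds is chosen for this illustration only; exact fractions are used so that the results match the hand calculation.

```python
from fractions import Fraction

def odds(p):
    """Odds of an event: probability it occurs divided by probability it does not."""
    return p / (1 - p)

print(odds(Fraction(4, 5)))  # 4
print(odds(Fraction(1, 2)))  # 1
```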
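The odds are linked to the predictors by the equation given as:
odds = exp (B0 + B1*X1 + B2*X2 + … + Bn*Xn)
where X1 … Xn are the predictors and B0 … Bn are the coefficients estimated by
the model. Here, exp (…) is another way to write e ^ (…), where e is approximately
2.72.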
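A minimal sketch of how odds and probability follow from the predictors, assuming the standard logistic regression form in which the odds are e raised to a linear combination of the predictors. The intercept, the coefficient for age, and the function names are hypothetical values chosen for illustration, not estimates from any real data.

```python
import math

def odds_from_predictors(intercept, coefficients, values):
    """Odds = exp(intercept + sum of coefficient * predictor value)."""
    z = intercept + sum(b * x for b, x in zip(coefficients, values))
    return math.exp(z)

def probability_from_odds(o):
    """Convert odds back to a probability: p = odds / (1 + odds)."""
    return o / (1 + o)

# Hypothetical model with one predictor (age): intercept -4.0, coefficient 0.1.
for age in (20, 40, 60):
    o = odds_from_predictors(-4.0, [0.1], [age])
    print(age, round(o, 3), round(probability_from_odds(o), 3))
```

With these made-up coefficients the odds are 1 at age 40, so the probability there is 0.5; it falls below 0.5 for younger customers and rises above it for older ones, always staying between 0 and 1.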
Logistic Regression can be used for classification problems such as:
Predict whether a customer churns or not
Predict whether a customer accepts an offer or not
Predict whether an email is spam or not
INTRODUCTION TO NEURAL NETWORKS
Neural networks attempt to solve problems using methods modeled on
how the human brain operates.
The connections between the neurons provide the network with the ability
to learn patterns and relationships in data.
MULTILAYER PERCEPTRON (MLP)
The Multilayer Perceptron consists of several processing units, the neurons, arranged in layers
to create a network.
Each neuron in the hidden layer receives an input based on a weighted combination of the
values of the neurons in the previous layer.
The neurons within the hidden layer are, in turn, combined to produce an output value, the
prediction.
This predicted value is compared to the actual value of the target, and the difference between
the two values (the error) is fed back into the network (known as "back propagation"), whose
weights are updated in turn.
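The forward pass described above can be sketched as follows. This is an illustration under stated assumptions: one hidden layer with two neurons, a sigmoid activation (the slides do not name an activation function), and hand-picked weights, biases, and inputs that are purely hypothetical.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """Return the hidden-layer values and the network's prediction."""
    # Each hidden neuron receives a weighted combination of the previous layer.
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # The hidden neurons are combined in turn to produce the output value.
    output = sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)
    return hidden, output

# Hypothetical inputs and weights, chosen only for illustration.
inputs = [0.5, 0.2]
hidden, prediction = forward(
    inputs,
    hidden_weights=[[0.4, -0.6], [0.3, 0.8]],
    hidden_biases=[0.0, 0.1],
    output_weights=[0.7, -0.2],
    output_bias=0.05,
)

target = 1.0                 # actual value of the target
error = target - prediction  # the quantity that is fed back to adjust the weights
print(prediction, error)
```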
HOW DOES A MULTILAYER PERCEPTRON
NEURAL NETWORK LEARN?
Consider the example of a child learning the difference between an apple and a pear. The child
currently does not know the difference between an apple and a pear.
When shown the first example of a fruit, the child may look at the fruit and decide that it is round,
red in color and of a particular weight.
Not knowing what an apple or a pear actually looks like, the child may decide to place equal
importance on each of these factors.
The importance is what a network refers to as weights. At this stage the child is most likely to
randomly choose either an apple or a pear for the prediction. On being told the correct response, the
child will increase or decrease the relative importance of each of the factors to improve the decision
(reduce the error).
In a similar fashion, a Multilayer Perceptron network begins with
random weights placed on each of the inputs and generates a
predicted value of the target.
On being told the actual value of the target, the network adjusts
these internal weights. In time, the child and the network will
hopefully make correct predictions.
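The learning process above can be illustrated with a deliberately simplified sketch: a single neuron with no hidden layer, random starting weights, and a small weight adjustment after every example. The fruit features, labels, learning rate, and number of passes are hypothetical, and the update rule shown is a basic gradient step, not necessarily the procedure any particular tool uses.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias):
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def mean_squared_error(data, weights, bias):
    return sum((label - predict(f, weights, bias)) ** 2 for f, label in data) / len(data)

# Hypothetical fruit features (roundness, redness); label 1 = apple, 0 = pear.
data = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.3, 0.2), 0), ((0.2, 0.1), 0)]

random.seed(0)  # fixed seed so the run is repeatable
weights = [random.uniform(-0.5, 0.5) for _ in range(2)]  # random starting weights
bias = 0.0
rate = 0.5  # how strongly each error changes the weights (assumed value)

error_before = mean_squared_error(data, weights, bias)

for _ in range(200):
    for features, label in data:
        error = label - predict(features, weights, bias)
        # Nudge each weight in the direction that reduces the error.
        weights = [w + rate * error * x for w, x in zip(weights, features)]
        bias += rate * error

error_after = mean_squared_error(data, weights, bias)
print(error_before, error_after)
```

Like the child, the neuron starts with arbitrary importance on each factor and shifts it after being told the correct answer, so the error after training is lower than before.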
RADIAL BASIS FUNCTION (RBF)
The Radial Basis Function (RBF) is a more recent type of network and
is quicker to train than the Multilayer Perceptron.
The RBF can be thought of as performing a type of clustering within the
input space, encircling individual clusters of data with a number of
so-called basis functions.
If a data point falls within the region of activation of a particular basis
function, then the neuron corresponding to that basis function
responds most strongly.
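The idea of a region of activation can be sketched as follows, assuming a Gaussian basis function (a common choice, although the slides do not specify one). The centers, width, data point, and the function name gaussian_rbf are hypothetical.

```python
import math

def gaussian_rbf(point, center, width):
    """Activation is 1 at the center and decays with distance from it."""
    squared_distance = sum((p - c) ** 2 for p, c in zip(point, center))
    return math.exp(-squared_distance / (2 * width ** 2))

# Two hypothetical basis functions, each centered on one cluster of data.
centers = [(0.0, 0.0), (5.0, 5.0)]
point = (0.5, -0.2)

activations = [gaussian_rbf(point, c, width=1.0) for c in centers]
strongest = max(range(len(centers)), key=lambda i: activations[i])
print(activations, strongest)
```

The point lies close to the first center, so the first neuron responds most strongly while the second barely responds at all.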
The selection of the centers of each basis function is where difficulties
arise.
RBF networks are typically quicker to train than an MLP, and they can
model data that are clustered within the input space.