Machine - Learning - Unit - 1

Uploaded by

Santhosh
Copyright
© © All Rights Reserved

An introduction to

Machine Learning

Ms. Deepa A

Assistant Professor

Kongu Engineering College, Perundurai


AI, ML, DL
• Artificial Intelligence: programs with the ability to learn and reason like humans
• Machine Learning: algorithms with the ability to learn without being explicitly programmed
• Deep Learning: subset of Machine Learning in which artificial neural networks adapt and learn from vast amounts of data
Machine Learning
• Machine Learning is the science of getting computers to act without being explicitly programmed (Andrew Ng)
• Machine Learning is the study of algorithms that (Tom Mitchell):
✔ improve their performance (P)
✔ at some task (T)
✔ with experience (E)


Machine Learning Vs Programming
Machine Learning Types
Supervised Learning
• Given:
- a set of input features X1, …, Xn
- a target feature Y
- a set of training examples where the values of the input features and the target feature are given for each example
- a new example, where only the values of the input features are given
• Predict the value of the target feature for the new example.
- Classification when Y is discrete
- Regression when Y is continuous
Supervised Learning
Unsupervised Learning
• Data with no target attribute. Describe hidden structure from
unlabelled data.
• Explore the data to find some intrinsic structures in them.
• Clustering: the task of grouping a set of objects in such a way that
objects in the same group(called a cluster) are more similar to each
other than to those in other clusters.
• Useful for:
- automatically organizing data
- understanding hidden structure in data
- preprocessing for further analysis
Reinforcement Learning
• Reinforcement: anything that strengthens or increases a behaviour, such as giving praise.
• Reinforcement learning is a sub-branch of ML that trains a model to reach an optimal solution to a problem by taking a sequence of decisions by itself.
• Decision process
• Reward system
• Learn series of actions
Reinforcement Learning Analogy
Machine Learning Applications
• Traffic Alerts
• Image Recognition
• Online supporting Chatbots
• Google Translate
• Online Video streaming applications
• Stock Market Analysis
• Social Media Analysis
Most popular programming languages for ML
• Python
• R
• C++
• Java
• JavaScript
Python Libraries for ML
Open Source tools for ML Implementation
• Anaconda (Jupyter, Spyder, Orange3, etc.): offline resources
• Google Colab: online resource
• Kaggle
UNIT 1
INTRODUCTION
• What is Machine Learning?
• Types of Machine Learning
• Supervised Learning: Regression and Classification
• Machine Learning Process
• Some Terminology
• Testing ML algorithm
• Turning data into probabilities
• Naïve Bayes Classifier
• The brain and Neuron
• Neural Networks
• Perceptron
Types of Machine Learning

Machine Learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
• Evolutionary Learning
Supervised Learning
• Set of training data + target data
• Also known as learning from exemplars
• Usually written as a set of pairs (xi, ti), where xi are the inputs and ti are the targets
• Supervised learning can be implemented in two ways:
✔ Regression: the output is a value (continuous)
✔ Classification: the output is a target class (discrete)


Regression

Find y when x = 0.44
Contd..
- A statistical technique that relates a dependent variable to one or more independent variables.
- The ultimate goal of a regression algorithm is to fit a best-fit line or curve to the data.

Regression: fit a mathematical function describing a curve, so that the curve passes as close as possible to all of the datapoints. It is generally a problem of function approximation or interpolation: working out the value between values that we know.
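As a minimal sketch of the idea, the following fits a straight line by least squares and then interpolates at x = 0.44. The data points are made up for illustration (they lie exactly on y = 2x + 1) and are not the data from the slide's figure; `fit_line` is a hypothetical helper name.

```python
# Illustrative data only (not from the slides): points lying exactly on y = 2x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [1.0, 1.5, 2.0, 2.5, 3.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)
# Interpolate: work out the value between values that we know.
y_at_044 = slope * 0.44 + intercept
print(round(y_at_044, 2))
```

A straight line is only one possible choice of function; a curved dataset would need a different function family.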
Classification
• Classification is a supervised machine learning method where the model tries to predict the correct label for given input data.
• In classification, the model is fully trained using the training
data, and then it is evaluated on test data before being used to
perform prediction on new unseen data.
• Novelty detection is the process of identifying new or unknown data
or patterns in a dataset that a machine learning system has not been
exposed to during training.
Contd..
• Curse of dimensionality:
• As the dimensionality of the feature space increases, the number of possible configurations can grow exponentially, and thus the fraction of configurations covered by a fixed set of observations decreases.
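A rough sketch of this growth, assuming each feature is discretised into the same number of values (an assumption made here for illustration; `grid_cells` and `max_coverage` are hypothetical names):

```python
def grid_cells(values_per_feature, num_features):
    """Number of distinct configurations when every feature takes the same number of values."""
    return values_per_feature ** num_features


def max_coverage(num_examples, values_per_feature, num_features):
    """Largest fraction of configurations that num_examples observations could cover."""
    return min(1.0, num_examples / grid_cells(values_per_feature, num_features))


# With 1000 examples and 10 values per feature, coverage collapses as features are added.
for d in (1, 2, 3, 6):
    print(d, grid_cells(10, d), max_coverage(1000, 10, d))
```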

Example: Predicting house price


Evolutionary learning
• Biological evolution can be seen as a learning process.
• Biological organisms adapt to improve their survival rates and chance of having offspring in their environment.
• Evolutionary learning solves problems by employing processes that mimic the behaviours of living things.
Machine Learning Process

• Data Collection and Preparation: data can be collected from various sources and prepared for the next step.
• Feature Selection: the method of reducing the input variables to your model by using only the relevant data and getting rid of noise in the data.
• Algorithm Choice: given the dataset, the choice of an appropriate algorithm.
• Parameter and Model Selection: selecting the best algorithm and model architecture suited for a particular task.
• Training: given the dataset, algorithm, and parameters, training should be simply the use of computational resources.
• Evaluation: the model is evaluated to test if the model is any good.
Some Terminology
(terminologies, weight space, curse of dimensionality)
• Inputs: an input vector is the data given as one input to the algorithm (x)
• Weights: wij is the weighted connection between nodes i and j
• Outputs: the output vector is y
• Targets: the target vector is t
• Activation function: g(·)
• Error: E
Weight Space
• Since we are using a neural network to implement the solution, we need to find the distance between the input and the neuron. This is computed as the Euclidean distance between the input vector and the neuron's weight vector.
• If the neuron is close to the input then it should fire; otherwise it shouldn't.
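A minimal sketch of this distance computation, treating the neuron's position in weight space as its weight vector. The firing `radius` is a hypothetical parameter introduced here only to make "close" concrete; the slides do not give a value.

```python
import math


def euclidean_distance(x, w):
    """Euclidean distance between an input vector x and a weight vector w."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))


def fires(x, w, radius):
    """Assumed rule: the neuron fires when the input lies within `radius` of its weights."""
    return euclidean_distance(x, w) <= radius


print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # a 3-4-5 triangle
```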
Curse of dimensionality
• As the number of dimensions increases, the volume of the unit hypersphere does not increase with it; beyond five dimensions it shrinks toward zero.
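This behaviour can be checked numerically. The sketch below uses the standard closed form for the volume of a unit ball in d dimensions, V(d) = π^(d/2) / Γ(d/2 + 1); the function name is a hypothetical choice.

```python
import math


def unit_hypersphere_volume(d):
    """Volume of the unit ball in d dimensions: pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


# The volume rises at first, peaks around d = 5, then falls toward zero.
for d in (1, 2, 3, 5, 10, 20, 100):
    print(d, unit_hypersphere_volume(d))
```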
Testing Machine Learning Algorithm
• Overfitting
• Training, Testing and Validation Sets
• The Confusion Matrix
• Accuracy metrics
• Receiver Operator Characteristic(ROC) Curve
• Unbalanced Datasets
• Measurement Precision
Overfitting
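• If we train the model for too long, or make it too flexible for the amount of training data, it may fit the noise in the training data as well as the underlying pattern. This is overfitting, and it reduces the model's ability to generalise to new data.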
Training, Testing and Validation sets
The Confusion Matrix
• Confusion Matrix is a table used in ML and statistics to assess the
performance of a classification model.
• It compares the actual outputs with the predicted outputs.
• Accuracy can be calculated by dividing the sum of the elements on the leading diagonal by the sum of all of the elements in the matrix.
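A small sketch of this calculation, with rows as actual classes and columns as predicted classes (a layout assumed here). The labels are made-up illustrative data.

```python
def confusion_matrix(actual, predicted, num_classes):
    """Build a matrix where entry [a][p] counts examples of class a predicted as class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix


def accuracy(matrix):
    """Sum of the leading diagonal divided by the sum of all elements."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Illustrative labels only (not from the slides).
actual = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
matrix = confusion_matrix(actual, predicted, num_classes=3)
print(matrix, accuracy(matrix))
```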
Accuracy metrics
• Accuracy is one metric for evaluating classification models.
• It is defined as the sum of the number of true positives and true negatives divided by the total number of examples.

1. True positive: An instance for which both predicted and actual values are
positive.
2. True negative: An instance for which both predicted and actual values are
negative.
3. False Positive: An instance for which predicted value is positive but actual
value is negative.
4. False Negative: An instance for which predicted value is negative but actual
value is positive.
• Confusion Matrix

• There are two complementary pairs of measurements that can help us to interpret the performance of a classifier:
• Sensitivity and Specificity
• Precision and Recall
• Sensitivity (the true positive rate, TPR) is the ratio of the number of correct positive examples to the number of actual positive examples: TP / (TP + FN).
• Specificity (the true negative rate, equal to 1 - FPR) is the ratio of the number of correct negative examples to the number of actual negative examples: TN / (TN + FP).
• Precision: ratio of the number of correct positive examples to the number that are classified as positive: TP / (TP + FP).
• Recall: ratio of the number of correct positive examples to the number of actual positive examples: TP / (TP + FN), the same quantity as sensitivity.
Precision and recall can be combined to give a single measure called the F1 measure:
F1 = 2 × (precision × recall) / (precision + recall)
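A minimal sketch of these measures computed from the four confusion-matrix counts, using the standard definitions (sensitivity = recall = TP / (TP + FN), specificity = TN / (TN + FP), precision = TP / (TP + FP)). The counts in the usage line are made up for illustration.

```python
def classifier_metrics(tp, fp, tn, fn):
    """Return the standard pairwise measures for a binary classifier."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate, i.e. 1 - FPR
    precision = tp / (tp + fp)     # how many predicted positives are correct
    recall = sensitivity           # same quantity as sensitivity
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only (not from the slides).
metrics = classifier_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```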
The Receiver Operator Characteristic Curve
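• The ROC curve is a graph showing the performance of a classification model as its decision threshold is varied. The curve plots two parameters: the true positive rate (TPR) against the false positive rate (FPR). It is commonly summarised by the Area Under the Curve (AUC).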
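One way to sketch this, assuming the classifier outputs a score per example and that a higher score means "more likely positive": sweep the threshold over the observed scores, record (FPR, TPR) at each step, and integrate with the trapezoid rule. The scores and labels are made-up illustrative data, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points obtained by lowering the decision threshold step by step."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Illustrative scores and labels only (not from the slides).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
print(auc(roc_points(scores, labels)))
```

An AUC of 1.0 corresponds to a classifier that ranks every positive above every negative, while 0.5 corresponds to random ranking.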
Unbalanced Datasets
• In some cases the dataset is not balanced (i.e. it does not contain the same number of positive and negative examples). In that case a more suitable measure is the Matthews Correlation Coefficient, which is computed as:
MCC = (TP × TN - FP × FN) / sqrt((TP + FP)(TP + FN)(TN + FP)(TN + FN))
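A short sketch of the standard Matthews Correlation Coefficient, MCC = (TP·TN - FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). Returning 0 when the denominator is zero is a common convention adopted here as an assumption. The example counts are made up: a classifier that always predicts "negative" on a 95:5 dataset.

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews Correlation Coefficient from the four confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


# Always predicting "negative": accuracy looks high, MCC shows there is no skill.
tp, fp, tn, fn = 0, 0, 95, 5
print((tp + tn) / (tp + fp + tn + fn), matthews_corrcoef(tp, fp, tn, fn))
```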
Turning Data into Probabilities
• In ML, turning data into probabilities often involves using probabilistic models or algorithms to make predictions or classifications.
• We have now worked out two things from our training data:
• the joint probability P(Ci, Xj) and
• the conditional probability P(Xj | Ci)

There is a link between the joint probability and the conditional probability:
P(Ci, Xj) = P(Xj | Ci) P(Ci)
P(Ci, Xj) = P(Ci | Xj) P(Xj)

By equating the above two equations we get Bayes' rule:
P(Ci | Xj) = P(Xj | Ci) P(Ci) / P(Xj)

• However, if we notice that any observation Xk has to belong to some class Ci, then we can marginalise over the classes to compute the denominator:
P(Xk) = Σi P(Xk | Ci) P(Ci)

• Choosing the class Ci with the largest posterior P(Ci | x), where x is a vector of feature values instead of just one feature, is known as the maximum a posteriori or MAP hypothesis.
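A minimal sketch of Bayes' rule with the evidence obtained by marginalising over the classes, followed by the MAP choice. The priors and likelihoods are made-up numbers, and the function names are hypothetical.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' rule for one observation.

    priors[c] is P(C = c); likelihoods[c] is P(X = x | C = c) for the observed x.
    """
    # Marginalise over the classes to get P(X = x).
    evidence = sum(likelihoods[c] * priors[c] for c in priors)
    return {c: likelihoods[c] * priors[c] / evidence for c in priors}


def map_class(priors, likelihoods):
    """Return the class with the largest posterior probability (MAP hypothesis)."""
    post = posterior(priors, likelihoods)
    return max(post, key=post.get)


# Illustrative numbers only (not from the slides).
priors = {"C1": 0.3, "C2": 0.7}
likelihoods = {"C1": 0.8, "C2": 0.2}
post = posterior(priors, likelihoods)
print(post, map_class(priors, likelihoods))
```

Here C1 wins despite its smaller prior, because the observation is much more likely under C1.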
Naïve Bayes’ Classifier
• It is a supervised learning algorithm, based on Bayes' theorem, used for solving classification problems.
• The crux of the classifier is Bayes' theorem.
• It describes the probability of an event, based on prior knowledge of
conditions that might be related to the event.
• The classifier calculates the conditional probability of each feature
given a class and then combines these probabilities to determine the
overall probability of the observation belonging to a certain class.
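The steps above can be sketched for categorical features as follows. The "naive" part is the assumption that features are independent given the class, so the per-feature conditional probabilities are simply multiplied. Probabilities are estimated by raw relative frequency with no smoothing (a simplification chosen here), and the toy dataset and function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_naive_bayes(examples, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # key: (class, feature index)
    for features, label in zip(examples, labels):
        for index, value in enumerate(features):
            feature_counts[(label, index)][value] += 1
    return class_counts, feature_counts


def predict(model, features):
    """Return the class maximising P(C) * product of P(feature | C)."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total  # prior P(C)
        for index, value in enumerate(features):
            # P(X_index = value | C), estimated by relative frequency.
            score *= feature_counts[(label, index)][value] / count
        if score > best_score:
            best_class, best_score = label, score
    return best_class


# Toy data (illustrative only): (weather, has_deadline) -> activity.
examples = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
            ("rainy", "no"), ("sunny", "no")]
labels = ["walk", "study", "study", "tv", "walk"]
model = train_naive_bayes(examples, labels)
print(predict(model, ("sunny", "no")), predict(model, ("rainy", "yes")))
```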
Example
Brain and the Neuron
• In ML, the inspiration drawn from the human brain, especially the
neural networks in the brain, has led to the development of Artificial
neural network(ANN)
• The fundamental building block of both ANN and the human brain is
the neuron
• Neurons receive signals from other neurons through dendrites,
process these signals in the cell body, and then transmit signals to
other neurons through an axon.
• The connections between neurons are known as synapses where
information is transmitted through the release of neurotransmitters.
• Synaptic plasticity:
• Modifying the strength of synaptic connections between the neurons and creating new connections.
• The strength of connections between neurons can be adjusted over time in a process called synaptic plasticity.
Hebb’s Rule
• Hebb’s rule says that the changes in the strength of synaptic
connections are proportional to the correlation in the firing of the
two connecting neurons.
• So, if two neurons consistently fire simultaneously, then any
connection between them will change in strength, becoming
stronger.
• However, if the two neurons never fire simultaneously the
connection between them will die away.
• It is also known as long-term potentiation and neural plasticity, and it
does appear to have correlates in real brains.
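A rough sketch of Hebb's rule as a weight update. The change is proportional to the product of the two neurons' firing (1 = fires, 0 = silent); the `decay` term that makes unused connections die away, and the values of `learning_rate` and `decay`, are modelling assumptions introduced here, not something given in the slides.

```python
def hebbian_update(weight, pre_fires, post_fires, learning_rate=0.1, decay=0.05):
    """Return the connection strength after one time step.

    learning_rate and decay are illustrative assumptions.
    """
    correlation = pre_fires * post_fires  # 1 only when both neurons fire together
    return (1 - decay) * weight + learning_rate * correlation


w_together = 0.5  # neurons that always fire together
w_apart = 0.5     # neurons that never fire together
for _ in range(20):
    w_together = hebbian_update(w_together, 1, 1)
    w_apart = hebbian_update(w_apart, 1, 0)

print(w_together, w_apart)
```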
McCulloch and Pitts Neurons
• Studying neurons isn’t actually easy. You need to be able to extract
the neuron from the brain and then keep it alive so that you can see
how it reacts in controlled circumstances.
• McCulloch and Pitts produced a perfect example of this when they modelled a neuron as:
1. A set of weighted inputs wi that correspond to the synapses
2. An adder that sums the input signals (equivalent to the membrane
of the cell that collects electrical charge)
3. An activation function that decides whether the neuron fires for the
current inputs.
Contd..

• A picture of McCulloch and Pitts' mathematical model of a neuron. The inputs are multiplied by the weights and the neuron sums their values. If the sum is greater than the threshold θ then the neuron fires; otherwise it does not.
Contd..
• The neuron computes the sum h = Σi wi xi and passes it through the activation function: o = g(h) = 1 if h > θ, and 0 if h ≤ θ. If the obtained value is greater than the threshold value then the neuron fires; otherwise it does not.
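A minimal sketch of this neuron: a weighted sum followed by a threshold test. The weights and thresholds in the usage lines are illustrative choices that make the neuron behave as a logical AND or OR; they are not taken from the slides.

```python
def mcculloch_pitts_neuron(inputs, weights, threshold):
    """Return 1 (fire) if the weighted sum of the inputs exceeds the threshold, else 0."""
    h = sum(w * x for w, x in zip(weights, inputs))  # the adder
    return 1 if h > threshold else 0                 # the activation function


# Illustrative settings: equal weights of 1, with the threshold deciding the logic function.
for a in (0, 1):
    for b in (0, 1):
        and_out = mcculloch_pitts_neuron([a, b], [1, 1], threshold=1.5)
        or_out = mcculloch_pitts_neuron([a, b], [1, 1], threshold=0.5)
        print(a, b, and_out, or_out)
```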
Limitations of McCulloch and Pitts Neuron
Model
• This model uses a binary activation function (output is either 0 or 1). This binary nature oversimplifies the behaviour of real neurons, which exhibit graded responses and can transmit continuous signals.
• The original model does not include weights on connections between neurons. In real neural networks, the strength of a connection plays a crucial role in information processing.
• Real neurons are not updated sequentially according to a computer clock, but update themselves randomly (asynchronously), whereas in many of our models the neurons are updated according to the clock.
Neural Networks
• A neural network is a computational model inspired by the structure and function of the human brain's neural networks.
• It consists of interconnected nodes called neurons.
• Each neuron receives input signals, processes them and then
produces an output signal that may be passed to other neurons.
• These networks are trained using algorithms to recognize patterns
and relationships in data, making them particularly useful for tasks
like classification, regression, clustering and pattern recognition.
Perceptron
• The perceptron is a machine learning algorithm for supervised learning of binary classification tasks.
• This model consists of four main parameters:
• Input values
• Weights
• Net Sum
• Activation function
Perceptron Network
Contd..
• Suppose the output of the above model is (0,1,0,0,1): neurons 2 and 5 should fire and the others shouldn't.
• In some cases, the output of a neuron may be incorrect. This can be corrected in two ways:
1. By changing the weight value between the input and the neuron.
2. By scaling that change with a parameter (η, the learning rate) and the input value.
The weights of the network can be changed using the following equation:
wij ← wij - η (yj - tj) xi
That is, new weight value = old weight value plus the computed change, where yj is the output of neuron j, tj is its target and xi is the input.


Contd..
• In some other cases changing a weight has no effect: if the input on that connection is zero, the update (which is multiplied by the input) is zero too, so changing the relevant weight doesn't do anything; we need to change the threshold value instead.
• The size of each change is controlled by multiplying the update by a parameter called the learning rate η:
wij ← wij - η (yj - tj) xi
• A learning rate that is too large tends to make the network unstable, so that it never settles down. We therefore use a moderate learning rate, typically 0.1 < η < 0.4, depending on how much error we expect in the inputs.
Bias Input
• When we discussed the McCulloch and Pitts neuron, we gave each neuron a firing threshold θ that determined what value it needed before it should fire.
• This threshold should be adjustable, so that we can change the value that the neuron fires at.
• In a network, if all the input values are zero, it does not matter what the weights are set to (zero times anything is zero); the only way to control whether the neuron fires is through the threshold.
• So we fix the threshold of every neuron at zero and add an extra input weight to the neuron, with the value of the input to that weight always being fixed (usually ±1).
Contd..
• Usually we will take -1. Because this input is never zero, its weight can still change the neuron's activation even when all other inputs are zero. This input is called a bias node.
The Perceptron Learning Algorithm
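• The perceptron algorithm is divided into two parts: a training phase and a recall phase.
• Training phase: repeats the recall computation and the weight update for T iterations. Complexity O(Tmn).
• Recall phase: loops over the n neurons and, for each neuron, over the m inputs. Complexity O(mn).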
Example of Perceptron Learning: Logic Functions
Contd..
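• Initially set the weights to small random numbers:
w0 = -0.05, w1 = -0.02, w2 = 0.02
• For the input (0,0), with the bias input fixed at -1, the value reaching the neuron is
(-0.05 × -1) + (-0.02 × 0) + (0.02 × 0) = 0.05
This value is above the threshold of zero, so the neuron fires and the output is 1. This is wrong, since the target for (0,0) is 0.
• Hence we apply the learning rule. Only the bias weight changes, because the other two inputs are zero; it becomes w0 = 0.2, which is consistent with a learning rate of 0.25, since -0.05 - 0.25 × (1 - 0) × -1 = 0.2. For (0,0) the neuron now receives
(0.2 × -1) + (-0.02 × 0) + (0.02 × 0) = -0.2
This value is below zero, so the neuron does not fire and the output is 0.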
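The worked example can be reproduced with a short sketch of perceptron training. It assumes the standard update rule w ← w - η (y - t) x with a bias input fixed at -1, a learning rate of 0.25 (the value that turns w0 = -0.05 into 0.2), and the logical OR function as the target; the slides do not state the learning rate or the target function explicitly, so both are assumptions.

```python
def recall(weights, inputs):
    """Recall phase: weights[0] is the bias weight, paired with the fixed bias input -1."""
    full_inputs = [-1] + list(inputs)
    activation = sum(w * x for w, x in zip(weights, full_inputs))
    return 1 if activation > 0 else 0


def train(data, weights, learning_rate=0.25, iterations=10):
    """Training phase: repeat recall and weight update for a fixed number of iterations."""
    weights = list(weights)
    for _ in range(iterations):
        for inputs, target in data:
            output = recall(weights, inputs)
            full_inputs = [-1] + list(inputs)
            for i, x in enumerate(full_inputs):
                # A weight only moves when its input is non-zero and the output is wrong.
                weights[i] -= learning_rate * (output - target) * x
    return weights


# Assumed target function: logical OR.
or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
initial = [-0.05, -0.02, 0.02]

first_output = recall(initial, (0, 0))                 # activation 0.05, so it fires
after_one_step = train([((0, 0), 0)], initial, iterations=1)
trained = train(or_data, initial)
print(first_output, after_one_step, trained)
```

After the single update only the bias weight has changed, and the fully trained weights classify all four OR inputs correctly.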
