Unit1 ML NGP

The document discusses machine learning techniques and provides an overview of topics including recommended books, course outcomes, types of learning, well-posed learning problems in designing a learning system, and machine learning approaches.

Uploaded by

animehv5500
Machine Learning

Techniques
(KAI 601)
Recommended Books:
1. Tom M. Mitchell, "Machine Learning", McGraw-Hill Education (India) Private Limited, 2013.

2. M. Gopal, "Applied Machine Learning".

3. Ethem Alpaydin, "Introduction to Machine Learning" (Adaptive Computation and Machine Learning), The MIT Press, 2004.

4. Stephen Marsland, "Machine Learning: An Algorithmic Perspective", CRC Press, 2009.

5. C. Bishop, "Pattern Recognition and Machine Learning", Berlin: Springer-Verlag.
Course Outcomes
• CO1. To understand the need for machine learning for various problem solving.

• CO2. To understand a wide variety of learning algorithms and how to evaluate models generated
from data.

• CO3. To understand the latest trends in machine learning.

• CO4. To design appropriate machine learning algorithms and apply the algorithms to real-world
problems.

• CO5. To optimize the models learned and report on the expected accuracy that can be achieved by
applying the models.
Unit-1

• INTRODUCTION – Learning, Types of Learning, Well defined learning problems,
Designing a Learning System, History of ML,
Introduction of Machine Learning Approaches –
(Artificial Neural Network, Clustering,
Reinforcement Learning, Decision Tree
Learning, Bayesian networks, Support Vector
Machine, Genetic Algorithm), Issues in Machine
Learning and Data Science Vs Machine
Learning;
• Beyond: Performance measures, Cross
validation, Confusion Matrix, Data sampling,
Overfitting, Underfitting, Bias, Variance
What is Learning?
• Learning is the process of acquiring new understanding, knowledge,
behaviors, skills, values, attitudes and preferences.

• The ability to learn is possessed by humans, animals, and some machines.

• “Learning denotes changes in a system that ... enable a system to do the same task
more efficiently the next time.” - Herbert Simon

• “Learning is making useful changes in our minds.” - Marvin Minsky

• Some learning is immediate, induced by a single event (e.g. being burned by a hot
stove), but much skill and knowledge accumulates from repeated experiences.
Types of Learning
1. Visual (Spatial): By representing information spatially and with
images, students are able to focus on meaning.
Examples: architecture, engineering, project management, and design.
2. Aural (Auditory-Musical): If you need someone to
tell you something out loud to understand it, you are an
auditory learner. Examples: musician, recording engineer,
speech pathologist, and language teacher.
3. Verbal (Linguistic): People who find it easier to express
themselves by writing or speaking can be regarded as verbal learners.
4. Physical (Kinesthetic): In this style, learning happens
when the learner carries out a physical activity, rather
than listening to a lecture or watching a demonstration.
Types of Learning (Cont…)
5. Logical (Mathematical): If you like using your
brain for logical and mathematical reasoning,
you are a logical learner. You easily recognize patterns
and can connect seemingly meaningless concepts.
Examples: scientific research, accountancy, bookkeeping,
and computer programming.
6. Social (Interpersonal): If you are at your best when socializing
and communicating with people, both verbally and
non-verbally, you are a social learner.
People often come to you to be heard and to ask for
advice. Examples: counseling, teaching, training and coaching,
sales, politics, and human resources, among others.
Why ML is the future?

• ML will enable automation and will improve healthcare through
personalized treatments and diagnoses.
• Machine learning will have a transformative impact on the future
of various fields including automation, healthcare, natural
language processing, transportation, personalized experiences,
cybersecurity, and science.
Why is ML used?

Machine learning allows the user to feed a computer algorithm an
immense amount of data and have the computer analyze and
make data-driven recommendations and decisions based on only
the input data.
Why is ML important?

• Machine learning is important because it gives enterprises a
view of trends in customer behavior and business operational
patterns, and supports the development of new products.
AI vs ML vs DL
Well – Posed Learning Problems
• Learning can be defined through a computer program that
improves its performance at some task through experience.
• Definition of Learning: A computer program is said to learn from
experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as
measured by P, improves with experience E.
• Three features:
the class of tasks,
the measure of performance to be improved, and
the source of experience.
• A checkers learning problem:
• Task T: playing checkers
• Performance measure P: percent of games won against opponents
• Training experience E: playing practice games against itself
• A handwriting recognition learning problem:
• Task T: recognizing and classifying handwritten words within images
• Performance measure P: percent of words correctly classified
• Training experience E: a database of handwritten words with given classifications
• A robot driving learning problem:
• Task T: driving on public four-lane highways using vision sensors
• Performance measure P: average distance traveled before an error (as judged by human
overseer)
• Training experience E: a sequence of images and steering commands recorded while observing a
human driver
DESIGNING A LEARNING SYSTEM

While designing a Learning system various design issues and approaches must be
considered.

1. Choosing the Training Experience

2. Choosing the Target Function

3. Choosing a Representation for the Target Function

4. Choosing a Function Approximation Algorithm

5. The Final Design


• First, choose the type of training experience from which the system will
learn. The type of training experience available can have a significant
impact on success or failure of the learner.
• One key attribute is whether the training experience provides direct
or indirect feedback regarding the choices made by the performance
system.
• Another key attribute of the training experience is how well it
represents the distribution of examples over which the final system
performance P must be measured.
Important Points
• In general, machine learning is all about making predictions and
classifications.
• Machine learning algorithms use training data to generate models.
• Models are generated functions used to predict and classify new,
previously unseen data. Sometimes models are called classifiers.
• Conceptually, a model is a mathematical function. The complexity of the
function depends on the ML algorithm, and the amount/variety of training
data used.
• Before we train a model, we can hold back a subset of the data for testing
the predictive power of the model. This is called test data.
• The predictive power of the model depends on many factors, including the
quality of data, number of samples, algorithm used, and whether a pattern
can be learned.
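
The idea of holding back test data can be sketched in a few lines of Python. This is only a minimal illustration: the function name hold_out_split, the 25% test fraction, the fixed seed, and the toy data are all assumptions made for this example.

```python
import random


def hold_out_split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and hold back a fraction as test data."""
    rng = random.Random(seed)          # fixed seed so the split is repeatable
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    # The first n_test shuffled samples become test data, the rest training data
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical toy data: (input, output) pairs
data = [(x, 2 * x + 1) for x in range(20)]
train, test = hold_out_split(data, test_fraction=0.25)
print(len(train), len(test))
```

The model is trained only on train; test is used afterwards to check predictive power on data the model has not seen.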
Another Definition of Machine Learning
• Machine Learning is the process of building a model, or a function,
with data.
model = ML algorithm(data)
f(x) is a model
The ML algorithm creates the model from the data.
A model maps input x to output f(x).
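
A small sketch of "the algorithm creates the model": here the algorithm is ordinary least-squares line fitting, chosen only for illustration, and the model is the function it returns. The name fit_line and the four data points are assumptions for this example.

```python
def fit_line(xs, ys):
    """ML algorithm: takes data, returns a model (a function from input to output)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def model(x):
        # The model maps an input x to a predicted output
        return slope * x + intercept

    return model


# Hypothetical training data lying on the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
f = fit_line(xs, ys)       # the algorithm creates the model
prediction = f(4.0)        # the model predicts for a new, unseen input
print(prediction)
```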
STEPS TO SOLVE A MACHINE LEARNING PROBLEM

1. Data Gathering: Collect data from various sources
2. Data Preprocessing: Clean data to have homogeneity
3. Feature Engineering: Make your data more useful
4. Algorithm Selection & Training: Select the right machine learning model
5. Making Predictions: Evaluate the model

Machine Learning Approaches
TYPES OF ML
Using data for answering questions
Training Predicting
In semi-supervised learning, the basic procedure is that the programmer first clusters similar data using an unsupervised
learning algorithm and then uses the existing labeled data to label the rest of the unlabeled data.
• Tuning Parameter: A validation set is used to tune the system.
• A tuning parameter is not estimated from the training data; candidate values are tried and checked on the validation set.
Each sample is tested individually.
Confusion Matrix

• A confusion matrix is a matrix that summarizes the
performance of a machine learning model on a set of test data.
• It is often used to measure the performance of classification
models, which aim to predict a categorical label for each input
instance.
• The matrix displays the number of true positives (TP), true
negatives (TN), false positives (FP), and false negatives (FN)
produced by the model on the test data.
• For binary classification, the matrix is a 2x2 table. For
multi-class classification, the matrix shape equals the
number of classes, i.e. for n classes it is n x n.
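
The counting behind a confusion matrix can be sketched by hand. This is an illustrative sketch, not a library API: the function name count_confusion, the row/column layout (rows are actual classes, columns are predicted classes), and the labels are assumptions for this example.

```python
def count_confusion(y_true, y_pred, labels):
    """Return an n x n matrix: rows are actual classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


# Hypothetical binary labels: 1 = positive, 0 = negative
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
cm = count_confusion(y_true, y_pred, labels=[0, 1])

# With this layout: cm[0][0] = TN, cm[0][1] = FP, cm[1][0] = FN, cm[1][1] = TP
tn, fp = cm[0]
fn, tp = cm[1]
print(cm)
```

Passing three labels instead of two gives a 3 x 3 matrix with the same code.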
One way to summarize how each method performed on the testing data is to create a
CONFUSION MATRIX for each method.
False negative: the patient actually has heart disease, but the algorithm predicted no heart disease.
False positive: the patient actually does not have heart disease, but the algorithm predicted heart disease.
Entries on the diagonal of the matrix are correctly classified; the remaining entries are misclassified.

Then, the confusion matrices of all the methods (K-nearest neighbor, Random Forest, Logistic Regression) are compared to
select one.
But sometimes two or more confusion matrices are very similar, which makes it hard to choose which machine learning
method is a better fit for the data.

So, we have more sophisticated metrics, like Sensitivity, Specificity, ROC and AUC, that can help us make a decision.
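
Sensitivity and specificity can be computed directly from the four confusion-matrix counts. The sketch below uses the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)); the counts are hypothetical numbers invented for illustration, not results from the heart disease example.

```python
def sensitivity(tp, fn):
    """Fraction of actual positives that were predicted positive (true positive rate)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of actual negatives that were predicted negative (true negative rate)."""
    return tn / (tn + fp)


# Hypothetical counts for one method
tp, fn, tn, fp = 40, 10, 45, 5
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(sens, spec)
```

Two methods with similar confusion matrices can then be compared on whichever of the two numbers matters more for the problem.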
• The size of the confusion matrix is determined by the number of
things we want to predict.
• In the first example, we were only trying to predict two things: if
someone had heart disease or not. So, that gave us a confusion
matrix with 2 rows and 2 columns.
• If, in the next example, we have 3 things to choose from, then we
have a confusion matrix with 3 rows and 3 columns.
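# Imports needed by the snippet (scikit-learn)
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split

# Assumption: the slides do not define X, y or model. A built-in binary
# dataset and logistic regression are used here only as placeholders.
X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(max_iter=10000)

# Hold back 25% of the data as test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

cm = confusion_matrix(y_test, y_pred)  # rows: actual class, columns: predicted class
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
F1_score = f1_score(y_test, y_pred)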


AUC - ROC Curve:

AUC stands for Area Under the Curve; ROC stands for Receiver Operating Characteristics.

When it comes to a classification problem, we can count on an AUC - ROC curve.

It is one of the most important evaluation metrics for checking any classification model’s performance.

It is also written as AUROC (Area Under the Receiver Operating Characteristics).
What is the AUC - ROC Curve?
The AUC - ROC curve is a performance measurement for classification problems at various threshold settings. ROC is a
probability curve and AUC represents the degree or measure of separability: it tells how well the model can
distinguish between classes. The higher the AUC, the better the model is at predicting class 0 as 0 and class 1 as 1.

The ROC curve is plotted with TPR against the FPR where TPR is on the y-axis and FPR is on the x-axis.
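
How the ROC points and the AUC arise from thresholds can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function names roc_points and auc, the trapezoidal rule for the area, and the four scored samples are all choices made for this example.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, one per threshold, from the highest threshold down."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]              # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule; points must be sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical true labels and predicted probabilities of class 1
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
area = auc(points)
print(points, area)
```

An area of 1.0 would mean the scores separate the two classes perfectly; 0.5 corresponds to a model with no ability to separate them.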
PR Curve
• A PR curve plots the precision against the recall for each threshold.
• A no-skill classifier is one that cannot discriminate between the
classes and would predict a random class or a constant class in all
cases.
• It is desired that the algorithm should have both high precision, and
high recall. However, most machine learning algorithms often involve
a trade-off between the two. A good PR curve has greater AUC (area
under curve).
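
The points of a PR curve can be computed the same way, one (precision, recall) pair per threshold. A minimal sketch; the function name precision_recall_points and the scored samples are assumptions for this example.

```python
def precision_recall_points(y_true, scores):
    """Return (threshold, precision, recall) for each distinct score used as a threshold."""
    positives = sum(y_true)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        predicted = [1 if s >= threshold else 0 for s in scores]
        tp = sum(1 for y, p in zip(y_true, predicted) if y == 1 and p == 1)
        fp = sum(1 for y, p in zip(y_true, predicted) if y == 0 and p == 1)
        # tp + fp >= 1 here, because the sample whose score equals the threshold is predicted positive
        points.append((threshold, tp / (tp + fp), tp / positives))
    return points


# Hypothetical true labels and predicted probabilities of class 1
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
pr = precision_recall_points(y_true, scores)
for threshold, precision, recall in pr:
    print(threshold, precision, recall)
```

Lowering the threshold raises recall but, in this data, lowers precision, which is the trade-off described above.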
How to read a PR Curve
F1 Score
• It is described as the harmonic mean of the precision and recall of a
classification model. The two metrics contribute equally to the score,
ensuring that the F1 metric correctly indicates the reliability of a
model.
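
The harmonic mean can be written out directly as F1 = 2 x precision x recall / (precision + recall). A short sketch; the function name f1_from_pr and the example values are assumptions for illustration.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


balanced = f1_from_pr(0.5, 0.5)   # equal precision and recall
lopsided = f1_from_pr(0.9, 0.1)   # same arithmetic mean, very different F1
print(balanced, lopsided)
```

Unlike the arithmetic mean, the harmonic mean is pulled towards the smaller of the two values, so a model cannot get a high F1 by doing well on only one of them.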
F1 Score
• The F1 score integrates precision and recall into a single metric to gain
a better understanding of model performance.
• Accuracy is used when the True Positives and True negatives are more
important while F1-score is used when the False Negatives and False
Positives are crucial.
• Accuracy can be used when the class distribution is similar while F1-
score is a better metric when there are imbalanced classes.
• It ranges from 0 to 1, where 1 indicates perfect precision and recall,
and 0 means that precision or recall (or both) is zero.
• As a general rule of thumb, an F1 score of 0.7 or higher is often
considered good. In some applications, a higher F1 score may be
required, mainly if precision and recall are both essential and a high
cost is associated with false positives and false negatives.
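
The difference between accuracy and F1 on imbalanced classes shows up in a small calculation. The counts below are hypothetical, invented for illustration: 100 samples, only 5 of them positive, and a model that finds just one of the positives.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_from_counts(tp, fp, fn):
    """F1 from counts; algebraically equal to the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical imbalanced test set: 5 positives, 95 negatives
tp, fn, fp, tn = 1, 4, 0, 95
acc = accuracy(tp, tn, fp, fn)
f1 = f1_from_counts(tp, fp, fn)
print(acc, f1)
```

Accuracy looks high because the many true negatives dominate it, while F1 stays low because four of the five positives were missed.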
There are mainly two types of errors in machine learning:
•Reducible errors: These errors can be reduced to improve the model accuracy. Such errors
can further be classified into bias and variance.
•Irreducible errors: These errors will always be present in the model regardless of which
algorithm has been used. They are caused by unknown variables whose effect can't
be reduced.
Bias and variance are the reducible components of generalization error.
What is Bias?
• Bias can be defined as the inability of a machine learning algorithm to capture the true relationship
between the data points.
• While making predictions, a difference occurs between the values predicted by the model
and the actual/expected values; this difference is known as bias error or error due to
bias.
• Each algorithm begins with some amount of bias, because bias arises from assumptions in the
model that make the target function simpler to learn.
• Generally, a linear algorithm has high bias, which also makes it fast to learn, whereas a nonlinear
algorithm often has low bias.

Ways to reduce High Bias:

High bias mainly occurs when the model is too simple. Below are some ways to reduce high bias:
o Increase the input features, as the model is underfitted.
o Decrease the regularization term.
o Use more complex models, for example by including some polynomial features.
What is Variance?
• Variance specifies the amount by which the prediction would change if different training data
were used.
• In statistics, variance tells how much a random variable differs from its expected value.
• Ideally, a model should not vary too much from one training dataset to another, which means the
algorithm should be good at understanding the hidden mapping between input and output
variables.
• A model with high variance learns a lot from, and performs well on, the training dataset, but does
not generalize well to unseen data. As a result, such a model gives good results on the
training dataset but shows high error rates on the test dataset.
A model with high variance has the following problems:
o It leads to overfitting.
o It increases model complexity.
Ways to Reduce High Variance:
o Reduce the input features or number of parameters, as the model is overfitted.
o Do not use an overly complex model.
o Increase the training data.
o Increase the regularization term.
• Machine learning algorithms with low variance include Linear Regression, Logistic
Regression, and Linear Discriminant Analysis.
• Algorithms with high variance include decision trees, Support
Vector Machines, and K-nearest neighbours.
• Usually, nonlinear algorithms, which have a lot of flexibility to fit the data, have high
variance.
Different Combinations of Bias-Variance

There are four possible combinations of bias and variances, which are represented by the below diagram:

Low-Bias, Low-Variance:
The combination of low bias and low variance describes an ideal machine
learning model. However, it is not possible in practice.
Low-Bias, High-Variance: With low bias and high variance, model
predictions are inconsistent but accurate on average. This case occurs
when the model has a large number of parameters and hence
leads to overfitting.
High-Bias, Low-Variance: With high bias and low variance, predictions
are consistent but inaccurate on average. This case occurs when a model
does not learn well from the training dataset or uses too few
parameters. It leads to underfitting.
High-Bias, High-Variance:
With high bias and high variance, predictions are inconsistent and also
inaccurate on average.
How to identify High variance or High Bias?
High variance can be identified if the model has Low training error and high test error.

High bias can be identified if the model has high training error and the test error is close to the
training error.
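
These two signatures can be reproduced with two deliberately extreme models. This is a sketch under stated assumptions: the sine-plus-noise data, the fixed random seed, the constant model (standing in for high bias), and the 1-nearest-neighbour model (standing in for high variance) are all choices made for this example.

```python
import math
import random

rng = random.Random(0)                       # fixed seed so the run is repeatable
xs = [i * 2 * math.pi / 60 for i in range(60)]
ys = [math.sin(x) + rng.gauss(0, 0.3) for x in xs]   # hypothetical noisy data
train = list(zip(xs[0::2], ys[0::2]))        # even-indexed samples for training
test = list(zip(xs[1::2], ys[1::2]))         # odd-indexed samples held back for testing


def mse(model, data):
    """Mean squared error of a model on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


mean_y = sum(y for _, y in train) / len(train)


def constant_model(x):
    # Too simple: ignores x and always predicts the training mean (high bias)
    return mean_y


def nearest_neighbour_model(x):
    # Memorises the training data: returns the output of the closest training input (high variance)
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


const_train, const_test = mse(constant_model, train), mse(constant_model, test)
nn_train, nn_test = mse(nearest_neighbour_model, train), mse(nearest_neighbour_model, test)
print("constant model:   train", const_train, "test", const_test)
print("nearest neighbour: train", nn_train, "test", nn_test)
```

The constant model has a large error on both sets, while the nearest-neighbour model has zero training error (each training point is its own nearest neighbour) and a clearly larger test error.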

Bias-Variance Trade-Off
While building a machine learning model, it is important to take care of bias and variance in order
to avoid overfitting and underfitting. If the model is very simple with few parameters, it
may have low variance and high bias, whereas if the model has a large number of parameters, it will tend to have
high variance and low bias. So a balance must be struck between bias and variance errors, and this
balance between the bias error and variance error is known as the Bias-Variance trade-off.
Bias-Variance Trade-Off
For accurate predictions, a model needs both low variance and low bias. But this is usually not
possible, because bias and variance are related to each other:

o If we decrease the variance, the bias tends to increase.
o If we decrease the bias, the variance tends to increase.
Bias-Variance trade-off is a central issue in supervised learning.

Ideally, we need a model that accurately captures the regularities in training data and simultaneously
generalizes well with the unseen dataset.

Unfortunately, doing both simultaneously is difficult: a high variance algorithm may perform
well on training data, but it may overfit noisy data,
whereas a high bias algorithm generates an overly simple model that may not even capture important
regularities in the data.

So, we need to find a sweet spot between bias and variance to make an optimal model.
Three commonly used methods for finding the sweet spot between simple and complicated models are:
1. Regularization
2. Bagging
3. Boosting
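
Of the three methods, regularization is the easiest to sketch. The example below assumes L2 (ridge) regularization on a one-parameter line through the origin; the slides do not say which form of regularization is meant, and the name ridge_slope and the data are hypothetical. Minimising sum((y - w*x)^2) + lam * w^2 gives w = sum(x*y) / (sum(x*x) + lam).

```python
def ridge_slope(xs, ys, lam):
    """Slope w that minimises sum((y - w*x)**2) + lam * w**2 (line through the origin)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Hypothetical data lying exactly on y = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
unregularized = ridge_slope(xs, ys, lam=0.0)    # plain least squares
regularized = ridge_slope(xs, ys, lam=14.0)     # larger lam shrinks the slope towards 0
print(unregularized, regularized)
```

Increasing the regularization term makes the model simpler (less variance, more bias); decreasing it does the opposite.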
What is Entropy in Machine Learning
Entropy is a machine learning metric that measures the unpredictability or impurity in a system.

Every piece of information processed in the system has a certain value and can be
used to draw conclusions. If it is easy to draw a valuable conclusion from a piece of information, its
entropy is low; if its entropy is high, it is difficult to draw any conclusion from
that piece of information.
Entropy is the measurement of disorder or impurity in the information processed in machine learning. It determines
how a decision tree chooses to split data.

We can understand entropy with a simple example: flipping a coin. When we flip a fair coin, there are
two possible outcomes, and it is difficult to predict the exact outcome because nothing
makes one outcome more likely than the other. Each outcome has a 50% probability; in
such scenarios, entropy is high. This is the essence of entropy in machine learning.
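
The coin example can be checked numerically with the usual Shannon entropy formula, H = sum of p * log2(1 / p) over the outcomes. The slides do not state the formula, so its use here (and the function name entropy) is an assumption for illustration.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; outcomes with probability 0 contribute nothing."""
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)


fair_coin = entropy([0.5, 0.5])       # both outcomes equally likely: maximum uncertainty
biased_coin = entropy([0.9, 0.1])     # one outcome much more likely: lower entropy
certain = entropy([1.0, 0.0])         # outcome known in advance: no uncertainty
print(fair_coin, biased_coin, certain)
```

The fair coin gives the highest value of the three, matching the statement that a 50% probability for both outcomes means high entropy.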
