AI&ML Unit 4


19MECC1701 - ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
Sreejith S. Nair
Assistant Professor (SS)
Department of Mechanical Engineering
Dr. Mahalingam College of Engineering and Technology, Pollachi.
Course Code: 19MECC1701
Course Title: Artificial Intelligence & Machine Learning
Course Category: Professional Core
Course Level: Mastery
L:T:P (Hours/Week): 3:0:0
Credits: 3
Total Contact Hours: 45
Max. Marks: 100

Dept of Mechanical Engineering 19MECC1701- Artificial Intelligence & Machine Learning 2


Course Outcomes
At the end of this course, students will be able to (cognitive level in parentheses):
CO1: Explain the basic concepts and applications of Artificial Intelligence. (Understand)
CO2: Explain uninformed and informed search methods for problem solving. (Understand)
CO3: Explain knowledge representation using first order logic. (Understand)
CO4: Explain the basic concepts and applications of Machine Learning. (Understand)
CO5: Explain the classification and clustering techniques for decision making. (Understand)



Text & Reference books
Text books:
• Stuart J. Russell and Peter Norvig, “Artificial Intelligence: A Modern Approach”, Fourth Edition, Pearson Series, 2021.
• Tom M. Mitchell, “Machine Learning”, McGraw Hill, 2013.
Reference books:
• George F. Luger, “AI: Structures and Strategies for Complex Problem Solving”, Sixth Edition, Pearson Education, 2009.
• E. Rich and K. Knight, “Artificial Intelligence”, McGraw Hill, 3rd ed., 2017.
• Robin R. Murphy, “Introduction to AI Robotics”, PHI Publication, 2nd edition, 2019.
• Nils J. Nilsson, “Principles of AI”, Narosa Publ. House, 2000.



UNIT IV - INTRODUCTION TO MACHINE LEARNING

Introduction: Basic definitions, types of learning, hypothesis space and inductive bias,
evaluation, cross-validation - Linear regression - R programming, Decision trees,
overfitting - Instance based learning, Feature reduction, Collaborative filtering based
recommendation - Probability and Bayes learning.

Basic Concepts in Machine Learning
• Machine Learning is growing continuously in the IT world and gaining strength in
different business sectors.
• Although Machine Learning is still developing, it is among the most popular of
current technologies. It is a field of study that makes computers capable of
automatically learning and improving from experience.
• Hence, Machine Learning focuses on improving computer programs by
using data collected from various observations.



Machine Learning
• Machine learning is a growing technology that enables computers to learn
automatically from past data.
• Machine learning uses various algorithms to build mathematical models
and make predictions from historical data or information. Currently, it is used
for tasks such as image recognition, speech recognition, email filtering,
Facebook auto-tagging, recommender systems, and many more.



How does Machine Learning work
• A Machine Learning system learns from historical data, builds prediction models,
and, whenever it receives new data, predicts the output for it. The accuracy of the predicted
output depends in part on the amount of data: a large amount of representative data generally
helps to build a model that predicts the output more accurately.
• Suppose we have a complex problem in which we need to make predictions. Instead of
writing code for it by hand, we feed the data to generic algorithms; these algorithms
build the logic from the data and predict the output.
Machine learning has changed the way we think about such problems.



Need for Machine Learning

• The need for machine learning is increasing day by day because it can perform tasks
that are too complex for a person to implement directly. As humans, we cannot process
huge amounts of data manually, so we need computer systems, and this is where
machine learning makes things easier for us.
• We can train machine learning algorithms by providing them with large amounts of data
and letting them explore the data, construct models, and predict the required output
automatically. The performance of a machine learning algorithm depends on the
amount of data, and it can be measured with a cost function. With the help of
machine learning, we can save both time and money.
• The importance of machine learning is easily understood from its use cases.
Currently, machine learning is used in self-driving cars, cyber fraud detection, face
recognition, friend suggestions on Facebook, etc. Top companies such as
Netflix and Amazon have built machine learning models that use vast amounts of
data to analyze user interest and recommend products accordingly.
What is Machine Learning?
• Machine Learning is defined as a technology that is used to train machines to perform
various actions such as predictions, recommendations, estimations, etc., based on historical data
or past experience.
• Machine Learning enables computers to behave more like human beings by training them
with past experience (historical data).
• There are three key aspects of Machine Learning, which are as follows:
• Task: the main problem in which we are interested. It can be a prediction,
recommendation, estimation, etc.
• Experience: learning from historical or past data, which is then used to estimate and
resolve future tasks.
• Performance: the capacity of a machine to resolve a machine learning task
and provide the best outcome for it. How performance is measured depends on
the type of machine learning problem.



Example

A robot driving learning problem:
• Task T: driving on public four-lane highways using vision sensors.
• Performance measure P: average distance traveled before an error (as judged by a human overseer).
• Training experience E: a sequence of images and steering commands recorded while observing a human driver.



Types of Machine Learning
• Machine learning is a subset of AI, which enables the machine to automatically learn from data,
improve performance from past experiences, and make predictions.
• Machine learning contains a set of algorithms that work on a huge amount of data.
• Data is fed to these algorithms to train them, and on the basis of training, they build the model &
perform a specific task.



Machine Learning Types

• Based on the methods and way of learning, machine learning is divided
into mainly four types, which are:

• Supervised Machine Learning

• Unsupervised Machine Learning

• Semi-Supervised Machine Learning

• Reinforcement Learning



Supervised machine learning
• As its name suggests, supervised machine learning is based on supervision. In
the supervised learning technique, we train the machine using
a "labelled" dataset, and based on that training, the machine
predicts the output.
• Here, labelled data means that the training inputs are already mapped
to their outputs.
• More precisely, we first train the machine with inputs
and their corresponding outputs, and then we ask the machine to predict the
output for a test dataset.



Example
• Let's understand supervised learning with an example. Suppose we have an input
dataset of cat and dog images. First, we train the machine to
understand the images using features such as the shape and size of the tail, the shape of
the eyes, colour, height (dogs are taller, cats are smaller), etc.

• After training is complete, we input the picture of a cat and ask the machine to
identify the object and predict the output. The machine is now trained, so it
checks the features of the object, such as height, shape, colour, eyes, ears, tail, etc., and
finds that it is a cat. So, it puts it in the Cat category. This is how the
machine identifies objects in Supervised Learning.

• The main goal of the supervised learning technique is to map the input
variable (x) to the output variable (y).
• Some real-world applications of supervised learning are Risk Assessment, Fraud
Detection, Spam filtering, etc.
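The train-then-predict idea in the cat and dog example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two numeric features (height and weight), the data values, and the nearest-centroid rule are made up for the sketch and are not taken from the slides.

```python
from math import dist

# Hypothetical labelled training data: (height_cm, weight_kg) -> label.
training_data = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((28.0, 5.0), "cat"),
    ((60.0, 25.0), "dog"),
    ((55.0, 20.0), "dog"),
    ((70.0, 32.0), "dog"),
]


def train(examples):
    """Learn one centroid (mean feature vector) per label."""
    sums, counts = {}, {}
    for features, label in examples:
        running = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            running[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(s / counts[label] for s in sums[label]) for label in sums}


def predict(centroids, features):
    """Return the label whose centroid is closest to the new input."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


model = train(training_data)
print(predict(model, (26.0, 4.2)))   # a small animal
print(predict(model, (65.0, 28.0)))  # a large animal
```

The labels play the role of the "supervision": the model only knows what a cat or a dog looks like because every training input came with its output.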
Categories of Supervised Machine Learning
• Supervised machine learning can be classified into two types of problems, which are given below:
• Classification
• Regression
• a) Classification
• Classification algorithms are used to solve classification problems, in which the output variable
is categorical, such as "Yes" or "No", Male or Female, Red or Blue, etc. The classification
algorithms predict the categories present in the dataset. Some real-world examples of classification
are spam detection, email filtering, etc.
• Some popular classification algorithms are given below:
• Random Forest Algorithm
• Decision Tree Algorithm
• Logistic Regression Algorithm
• Support Vector Machine Algorithm



b) Regression
• Regression algorithms are used to solve regression problems, in which the output variable is continuous and is modelled as a
function of the input variables (a linear relationship in the simplest case). They are used to predict continuous output variables, such as market trends, weather prediction, etc.
• Some popular regression algorithms are given below:
• Simple Linear Regression Algorithm
• Multivariate Regression Algorithm
• Decision Tree Algorithm
• Lasso Regression
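As an illustration of the simplest entry in this list, the sketch below fits a simple linear regression with the standard least-squares formulas. The data (years of experience against salary) is made up and exactly linear so that the result is easy to check by hand; the function name `fit_line` is an assumption of this sketch.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data that follows salary = 5 * years + 25 exactly.
years_experience = [1, 2, 3, 4, 5]
salary = [30, 35, 40, 45, 50]

slope, intercept = fit_line(years_experience, salary)
predicted_salary = slope * 6 + intercept  # prediction for an unseen input
print(slope, intercept, predicted_salary)
```

Because the output is a number on a continuous scale rather than a category, this is a regression problem and not a classification problem.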

• Advantages and Disadvantages of Supervised Learning

• Advantages:
• Since supervised learning works with a labelled dataset, we can have an exact idea about the classes of objects.
• These algorithms are helpful in predicting the output on the basis of prior experience.
• Disadvantages:
• These algorithms may not be able to solve very complex tasks.
• They may predict the wrong output if the test data differs from the training data.
• Training the algorithm can require a lot of computational time.



Applications of Supervised Learning
• Some common applications of Supervised Learning are given below:
• Image Segmentation: Supervised Learning algorithms are used in image segmentation. In this process, image classification
is performed on different image data with pre-defined labels.
• Medical Diagnosis: Supervised algorithms are also used in the medical field for diagnosis. This is done by using medical
images and past data labelled with disease conditions. With such a process, the machine can identify a
disease for new patients.
• Fraud Detection: Supervised Learning classification algorithms are used for identifying fraudulent transactions, fraudulent
customers, etc. This is done by using historic data to identify the patterns that can lead to possible fraud.
• Spam Detection: In spam detection and filtering, classification algorithms are used. These algorithms classify an
email as spam or not spam. The spam emails are sent to the spam folder.
• Speech Recognition: Supervised learning algorithms are also used in speech recognition. The algorithm is trained
with voice data, and various identifications can be done with it, such as voice-activated passwords, voice
commands, etc.



2. Unsupervised Machine Learning
• Unsupervised learning differs from the supervised learning technique:
as its name suggests, there is no need for supervision. In
unsupervised machine learning, the machine is trained
using an unlabelled dataset, and the machine produces its
output without any supervision.
• In unsupervised learning, the models are trained with data that is
neither classified nor labelled, and the model acts on that data without
any supervision.
• The main aim of an unsupervised learning algorithm is to group or
categorize the unsorted dataset according to similarities, patterns,
and differences. Machines are instructed to find the hidden patterns
in the input dataset.



Example
• Let's take an example to understand it more precisely. Suppose there is
a basket of fruit images, and we input it into the machine
learning model.
• The images are totally unknown to the model, and the task of the
machine is to find the patterns and categories of the objects.
• The machine discovers patterns and differences on its own, such as
differences in colour and shape, and groups new images accordingly when it
is tested with the test dataset.



Categories of Unsupervised Machine Learning

• Unsupervised Learning can be further classified into two types, which are given below:
• Clustering
• Association

• 1) Clustering
• The clustering technique is used when we want to find the inherent groups in the
data. It groups the objects into clusters such that the objects with the
most similarities remain in one group and have few or no similarities with the
objects of other groups. An example of clustering is grouping
customers by their purchasing behaviour.
• Some of the popular clustering algorithms are given below:
• K-Means Clustering algorithm
• Mean-shift algorithm
• DBSCAN Algorithm
• Two related unsupervised techniques are often used alongside clustering; they reduce or separate features rather than forming clusters:
• Principal Component Analysis
• Independent Component Analysis
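To make the grouping idea concrete, here is a minimal K-Means sketch in plain Python. It is a teaching sketch under stated assumptions: the customer data is invented, the first k points are used as initial centroids (real implementations usually initialise randomly), and the number of iterations is fixed.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Group points into k clusters by repeatedly reassigning and re-averaging."""
    centroids = list(points[:k])  # assumption: simple deterministic initialisation
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters


# Hypothetical customers: (monthly spend, visits per month), no labels given.
customers = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(customers, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

No labels are supplied anywhere; the two groups (low-spend and high-spend customers) emerge only from the distances between the points.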



2) Association
• Association rule learning is an unsupervised learning technique that
finds interesting relations among variables within a large dataset.
• The main aim of this learning algorithm is to find the dependency of one
data item on another data item and map those variables accordingly,
for example so that a retailer can increase profit.
• This algorithm is mainly applied in Market Basket analysis, Web usage
mining, continuous production, etc.
• Some popular algorithms of Association rule learning are the Apriori
algorithm, Eclat, and the FP-growth algorithm.
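The "dependency of one data item on another" is usually quantified with support and confidence, the two measures that algorithms such as Apriori build on. The sketch below computes them directly for one rule; the shopping baskets are invented for illustration and the helper names are assumptions of this sketch, not part of any library.

```python
# Hypothetical market-basket data: each set is one customer's purchase.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)


# Rule under test: customers who buy bread also buy milk.
print(support({"bread", "milk"}))
print(confidence({"bread"}, {"milk"}))
```

Here bread and milk appear together in 3 of 5 baskets, and 3 of the 4 baskets with bread also contain milk, so the rule has a support of about 0.6 and a confidence of about 0.75.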



Advantages and Disadvantages of Unsupervised Learning Algorithm

• Advantages:
• These algorithms can be used for more complicated tasks than the
supervised ones because they work on unlabelled datasets.
• Unsupervised algorithms are preferable for various tasks because getting
an unlabelled dataset is easier than getting a labelled one.
• Disadvantages:
• The output of an unsupervised algorithm can be less accurate because the dataset is
not labelled and the algorithms are not trained with the exact output in advance.
• Working with unsupervised learning is more difficult because it works with
an unlabelled dataset that is not mapped to any output.



Applications of Unsupervised Learning
• Network Analysis: Unsupervised learning is used in document network
analysis of text data, for example to identify plagiarism and copyright
issues in scholarly articles.
• Recommendation Systems: Recommendation systems widely use
unsupervised learning techniques for building recommendation
applications for different web applications and e-commerce websites.
• Anomaly Detection: Anomaly detection is a popular application of
unsupervised learning, which can identify unusual data points within
the dataset. It is used to discover fraudulent transactions.
• Singular Value Decomposition: Singular Value Decomposition (SVD) is used
to extract particular information from a database, for example,
information about each user located at a particular location.
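The anomaly detection application above can be illustrated with a very simple statistical rule: flag any value that lies far from the mean, measured in standard deviations (a z-score). This is one basic technique chosen for the sketch, not the method the slides prescribe, and the transaction amounts and the threshold of 2.0 are assumptions.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score (distance from the mean in standard deviations) exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts; no transaction is labelled as fraud.
amounts = [20, 22, 19, 21, 23, 20, 18, 22, 21, 500]
print(find_anomalies(amounts))
```

The unusually large transaction is singled out purely because it differs from the rest of the data, without any labelled examples of fraud.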



3. Semi-Supervised Learning

• Semi-supervised learning is a type of Machine Learning that
lies between supervised and unsupervised machine learning.
• It represents the intermediate ground between supervised
(with labelled training data) and unsupervised (with no
labelled training data) algorithms and uses a combination of
labelled and unlabelled datasets during the training period.
• Although semi-supervised learning operates on data
that has a few labels, the data is mostly unlabelled.
Labels are costly to obtain, so in practice an organization may have
only a few of them.
• It differs from both supervised and unsupervised learning: those are
defined by the presence and absence of labels respectively, whereas
semi-supervised learning uses both kinds of data.

• To overcome the drawbacks of supervised learning and unsupervised learning
algorithms, the concept of semi-supervised learning is introduced.
• The main aim of semi-supervised learning is to use all the available
data effectively, rather than only the labelled data as in supervised learning.
• Initially, similar data is clustered with an unsupervised learning algorithm, and
the clusters are then used to assign labels to the unlabelled data. This is done because labelled data
is comparatively more expensive to acquire than unlabelled data.
• We can picture these algorithms with an example. Supervised learning is where a
student is under the supervision of an instructor at home and college.
• If the student analyses the same concept without any help from the
instructor, it comes under unsupervised learning. Under semi-supervised learning, the
student revises on their own after analysing the concept under the guidance of an
instructor at college.
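The step of turning unlabelled data into labelled data can be sketched with a simple pseudo-labelling rule: each unlabelled point takes the label of the nearest labelled group. This is only one possible approach, written under stated assumptions; the points, the labels "A" and "B", and the helper names are all invented for the sketch.

```python
from math import dist

# A few expensive labelled examples and many cheap unlabelled ones (made-up data).
labelled = [((1.0, 1.0), "A"), ((9.0, 9.0), "B")]
unlabelled = [(1.5, 1.2), (0.8, 1.9), (8.5, 9.2), (9.4, 8.1), (2.0, 2.5)]


def centroid(points):
    """Mean position of a list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def pseudo_label(labelled_examples, unlabelled_points):
    """Give each unlabelled point the label of the nearest labelled centroid."""
    by_label = {}
    for features, label in labelled_examples:
        by_label.setdefault(label, []).append(features)
    centroids = {label: centroid(pts) for label, pts in by_label.items()}
    return [
        (p, min(centroids, key=lambda label: dist(p, centroids[label])))
        for p in unlabelled_points
    ]


# The enlarged training set combines true labels with pseudo-labels.
expanded_training_set = labelled + pseudo_label(labelled, unlabelled)
for features, label in expanded_training_set:
    print(features, label)
```

A supervised model could then be trained on `expanded_training_set`, which is much larger than the two hand-labelled examples alone. Pseudo-labels can be wrong, which is one reason results of such methods may be unstable.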



Advantages and disadvantages of Semi-supervised Learning

• Advantages:
• The algorithm is simple and easy to understand.
• It is highly efficient.
• It addresses drawbacks of supervised and unsupervised learning
algorithms.
• Disadvantages:
• Iteration results may not be stable.
• These algorithms cannot be applied to network-level data.
• Accuracy can be low.



4. Reinforcement Learning

• Reinforcement learning works on a feedback-based process, in which an AI agent (a
software component) automatically explores its surroundings by trial and error: taking
actions, learning from experience, and improving its performance.
• The agent is rewarded for each good action and punished for each bad
action; hence the goal of a reinforcement learning agent is to maximize the rewards.
• In reinforcement learning, there is no labelled data as in supervised learning, and
agents learn from their experiences only.


• The reinforcement learning process is similar to how a human being learns; for
example, a child learns various things through experience in day-to-day life.
• An example of reinforcement learning is playing a game, where the
game is the environment, the moves of the agent at each step define
the states, and the goal of the agent is to get a high score.
• The agent receives feedback in terms of punishments and rewards.
• Due to its way of working, reinforcement learning is employed in
different fields such as Game theory, Operations Research,
Information theory, and multi-agent systems.
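The reward-driven loop described above can be sketched with tabular Q-learning, one common reinforcement learning algorithm (the slides do not name a specific algorithm, so this choice is an assumption). The environment is a made-up corridor of five positions in which the agent starts at the left end and is rewarded for reaching the right end; all constants are illustrative.

```python
import random

N_STATES = 5          # positions 0..4; reaching position 4 ends the episode
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True     # reward for reaching the goal
    return next_state, -0.01, False      # small punishment for every other move


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values Q(state, action) by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise take the best known action.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate towards reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # learned action for each non-terminal position
```

The agent is never told that moving right is correct. It discovers this because moves to the right eventually lead to the reward, while wasted moves accumulate small punishments.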



Categories of Reinforcement Learning

• Reinforcement learning is categorized mainly into two types of
methods/algorithms:
• Positive Reinforcement Learning: Positive reinforcement learning
increases the tendency that the required behaviour will occur again
by adding something. It strengthens the behaviour of the
agent and impacts it positively.
• Negative Reinforcement Learning: Negative reinforcement learning works
in the opposite way to positive RL. It increases the tendency that the
specific behaviour will occur again by removing or avoiding a negative condition.



Real-world Use cases of Reinforcement Learning

• Video Games:
• RL algorithms are very popular in gaming applications, where they are used to achieve super-human performance.
Well-known systems that use RL are AlphaGo and AlphaGo Zero, which play the game of Go.
• Resource Management:
• The "Resource Management with Deep Reinforcement Learning" paper showed how to use RL
to automatically learn to allocate and schedule computer resources among waiting jobs in order to
minimize average job slowdown.
• Robotics:
• RL is widely used in robotics applications. Robots are used in industrial and manufacturing areas,
and these robots are made more capable with reinforcement learning. Different industries
have a vision of building intelligent robots using AI and Machine Learning technology.
• Text Mining:
• Text mining, one of the major applications of NLP, is now being implemented with the help of
Reinforcement Learning by the company Salesforce.



Advantages and Disadvantages of Reinforcement Learning

• Advantages:
• It helps in solving complex real-world problems that are difficult to
solve with general techniques.
• The learning model of RL is similar to the way human beings learn,
and it can produce very accurate results.
• It helps in achieving long-term results.
• Disadvantages:
• RL algorithms are not preferred for simple problems.
• RL algorithms require huge amounts of data and computation.
• Too much reinforcement learning can lead to an overload of states,
which can weaken the results.
Hypothesis
• In most supervised machine learning algorithms, our main goal is to find a
hypothesis from the hypothesis space that maps the
inputs to the proper outputs.



Hypothesis (h) & Hypothesis Space (H):
• Hypothesis (h):
• A hypothesis is a function that best describes the target in supervised
machine learning. The hypothesis that an algorithm comes
up with depends on the data and also on the restrictions
and bias that we have imposed on the data.
• Hypothesis Space (H):
• The hypothesis space is the set of all possible legal hypotheses. This is
the set from which the machine learning algorithm
selects the single hypothesis that best
describes the target function or the outputs.

• The hypothesis (h) can be formulated in machine learning as
follows: y = mx + b
Where,
y: range (the predicted output)
m: slope of the line, i.e. the change in y divided by the change
in x
b: intercept (the value of y when x = 0)
x: domain (the input)
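To connect h and H with the formula y = mx + b, the sketch below builds a small, finite hypothesis space of candidate lines and picks the one that fits some training data best. The data, the grid of m and b values, and the use of squared error as the selection criterion are assumptions made for illustration.

```python
# Made-up training data that roughly follows y = 2x + 1.
data = [(0, 1.1), (1, 2.9), (2, 5.2), (3, 6.8)]

# Hypothesis space H: every line y = m*x + b with m and b taken from a coarse grid.
hypothesis_space = [(m, b) for m in (0, 1, 2, 3) for b in (0, 1, 2)]


def squared_error(hypothesis, examples):
    """Sum of squared differences between the line's predictions and the targets."""
    m, b = hypothesis
    return sum((y - (m * x + b)) ** 2 for x, y in examples)


# The learning algorithm selects one hypothesis h from H.
best_h = min(hypothesis_space, key=lambda h: squared_error(h, data))
print(best_h)
```

Here H contains twelve candidate lines, and the selected h is the pair (m, b) with the lowest error on the data. Restricting H to straight lines on this grid is itself a form of bias imposed by the designer.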



Hypothesis (h) & Hypothesis Space (H): Example
• Let's understand the hypothesis (h) and hypothesis space (H) with a two-dimensional
coordinate plane showing the distribution of the data.
[Figure not reproduced: distribution of data points on a coordinate plane]
• Now, assume we have some test data for which the ML algorithm must predict the
outputs.
[Figure not reproduced: test data points on the coordinate plane]
• We can divide this coordinate plane in such a way that it helps us to
predict the output.
[Figure not reproduced: one division of the coordinate plane]
• Based on the given test data, the output for each test point follows from the region in which it falls.
[Figure not reproduced: predicted outputs for the test data]
• However, depending on the data, the algorithm, and the constraints, this coordinate
plane can also be divided in other ways.
[Figure not reproduced: alternative divisions of the coordinate plane]
• With the above example, we can conclude that:
• The hypothesis space (H) is the set of all legal ways to
divide the coordinate plane so that inputs are mapped to outputs.
• Each individual way of dividing the plane is called a hypothesis (h).
[Figure not reproduced: hypothesis (h) and hypothesis space (H)]



Inductive bias
• In machine learning, the term inductive bias refers to a set of (explicit or
implicit) assumptions made by a learning algorithm in order to
perform induction, that is, to generalize a finite set of observations (training
data) into a general model of the domain.
• Without a bias of that kind, induction would not be possible, since
the observations can normally be generalized in many ways.
• If all these possibilities were treated equally, i.e., without any bias in the sense of a
preference for specific types of generalization (reflecting
background knowledge about the target function to be learned),
predictions for new situations could not be made.
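A tiny example shows why some bias is unavoidable. The two hypotheses below are both perfectly consistent with the same three training points, yet they disagree on a new input; the data and both functions are made up for this sketch.

```python
# Three training observations (x, y).
training = [(0, 0), (1, 1), (2, 2)]


def h_linear(x):
    """Hypothesis 1: a straight line through the data."""
    return x


def h_cubic(x):
    """Hypothesis 2: a cubic that also passes through every training point."""
    return x + x * (x - 1) * (x - 2)


# Both hypotheses reproduce the training data exactly.
both_fit = all(h_linear(x) == y and h_cubic(x) == y for x, y in training)
print(both_fit)

# They generalize differently to an unseen input.
print(h_linear(3), h_cubic(3))
```

The training data alone cannot decide between the two. A learner that prefers the simpler (linear) hypothesis is applying an inductive bias, and only with such a preference can it commit to a prediction for x = 3.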



Cross-Validation
• In machine learning, we cannot simply fit the model on the training data and
assume that it will work accurately on real data.
• We must make sure that our model has captured the correct patterns from
the data and has not picked up too much noise. For this purpose, we use
the cross-validation technique.
• Cross-Validation
• Cross-validation is a technique in which we train our model using a subset
of the data-set and then evaluate it using the complementary subset of
the data-set.

• Hence the basic steps of cross-validation are:

• Reserve a subset of the dataset as a validation set.

• Train the model using the remaining (training) dataset.

• Evaluate model performance using the validation set. If the model
performs well on the validation set, proceed to the next step;
otherwise, check for issues.
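The three steps above can be sketched as follows. The data is synthetic, the 80/20 split ratio is an assumption, and the "model" is deliberately trivial (it always predicts the mean of the training targets) so that the splitting and evaluation logic stays in focus.

```python
import random


def train_validation_split(examples, validation_fraction=0.2, seed=0):
    """Step 1: reserve a random subset of the data as a validation set."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


def fit_mean(train):
    """Step 2: 'train' a placeholder model that predicts the mean target."""
    return sum(y for _, y in train) / len(train)


def mean_squared_error(prediction, examples):
    """Step 3: evaluate the model on data it has not seen."""
    return sum((y - prediction) ** 2 for _, y in examples) / len(examples)


data = [(x, 2 * x + 1) for x in range(10)]          # synthetic (x, y) pairs
train_set, validation_set = train_validation_split(data)
model = fit_mean(train_set)
print(len(train_set), len(validation_set))
print(mean_squared_error(model, validation_set))
```

The key property is that no example appears in both sets, so the validation error is measured on data the model was not trained on.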



Methods of Cross-Validation
• There are some common methods that are used for cross-validation.
These methods are given below:

• Validation Set Approach

• Leave-P-out cross-validation

• Leave one out cross-validation

• K-fold cross-validation

• Stratified k-fold cross-validation

• Validation Set Approach
• In this method, we train on 50% of the given data-set and use the remaining 50% for
testing. The major drawback of this method is that, because we train on only 50% of
the dataset, the remaining 50% of the data may contain important
information that the model never sees during training, which leads to higher bias.
• LOOCV (Leave One Out Cross Validation)
• In this method, we train on the whole data-set except for a single data-point,
which is held out for testing, and we repeat this for each data-point. It has advantages as well
as disadvantages.
• An advantage of this method is that we make use of all data points, and hence the bias is low.
• The major drawback of this method is that it leads to higher variation in the test results, as
we are testing against one data point. If that data point is an outlier, it can lead to higher
variation. Another drawback is that it takes a lot of execution time, as it iterates as many times
as there are data points.



K-Fold Cross-Validation
• The K-fold cross-validation approach divides the input dataset into K groups of samples of
equal size. These groups are called folds. In each round, the model is trained
on K-1 folds, and the remaining fold is used as the test set. This approach is a very
popular CV approach because it is easy to understand, and the output is less biased than
that of other methods.
• The steps for k-fold cross-validation are:
• Split the input dataset into K groups.
• For each group:
• Take that group as the reserve or test data set.
• Use the remaining groups as the training dataset.
• Fit the model on the training set and evaluate the performance of the model using the
test set.
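The steps above can be sketched in plain Python. The function names, the trivial mean-predicting model, and the data are assumptions of this sketch; when the dataset size is not divisible by K, the fold sizes here differ by at most one.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) once for each of the k folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_validate(xs, ys, k, fit, score):
    """Fit on k-1 folds, score on the held-out fold, and average the k scores."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(scores) / len(scores)


def fit_mean(train_xs, train_ys):
    """Placeholder model: always predict the mean of the training targets."""
    return sum(train_ys) / len(train_ys)


def mse(model, test_xs, test_ys):
    """Mean squared error of the constant prediction on the test fold."""
    return sum((y - model) ** 2 for y in test_ys) / len(test_ys)


xs = list(range(6))
ys = [1, 3, 5, 7, 9, 11]
print(cross_validate(xs, ys, k=3, fit=fit_mean, score=mse))
```

Every sample is used for testing exactly once and for training K-1 times. Setting k equal to the number of samples turns this into leave-one-out cross-validation.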
Methods of Cross-Validation
• Stratified k-fold cross-validation
• This technique is similar to k-fold cross-validation with some little changes. This approach works
on stratification concept, it is a process of rearranging the data to ensure that each fold or group is
a good representative of the complete dataset. To deal with the bias and variance, it is one of
the best approaches.
• It can be understood with an example of housing prices, such that the price of some houses can
be much high than other houses. To tackle such situations, a stratified k-fold cross-validation
technique is useful.
• Holdout Method
• This is the simplest cross-validation technique. We set aside a subset of the available
data as a test set, train the model on the rest of the dataset, and then use the held-out
subset to get prediction results.
• The error measured on the held-out subset indicates how well the model will perform on
unseen data. Although this approach is simple to perform, its estimate has high
variance because it depends on which samples fall in the held-out subset, so it can
sometimes give misleading results.
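The two methods above can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function names `stratified_folds` and `holdout_split`, the price-band labels, and the 30% test fraction are assumptions made for the example. Because house prices are continuous, the sketch assumes they have first been grouped into bands ("low", "high") so that there are classes to stratify on.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds so each fold keeps roughly the overall class mix."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

def holdout_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and set aside a fraction of them as the test set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

labels = ["low"] * 8 + ["high"] * 4   # hypothetical price bands, imbalanced 2:1
folds = stratified_folds(labels, k=4)
for fold in folds:
    print(sorted(labels[i] for i in fold))   # every fold keeps the 2:1 mix

train, test = holdout_split(list(range(10)), test_fraction=0.3)
print(len(train), len(test))
```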



Limitations of Cross-Validation
• The cross-validation technique has some limitations:
• Under ideal conditions it gives a good estimate of model performance, but with
inconsistent data the estimate can be badly off. This is a significant
disadvantage, because the kind of data a model will encounter is rarely
known with certainty in machine learning.
• In predictive modeling the data often evolves over time, which creates
differences between the training and validation sets. For example, if a model
for predicting stock market values is trained on the previous 5 years of stock
values, the actual values over the next 5 years may be drastically different,
so it is difficult to expect correct output in such situations.



Applications of Cross-Validation
• This technique can be used to compare the performance of different
predictive modeling methods.

• It has great scope in the medical research field.

• It can also be used for the meta-analysis, as it is already being used by the
data scientists in the field of medical statistics.



Linear Regression in Machine Learning
• Linear regression is one of the simplest and most popular machine
learning algorithms. It is a statistical method used for
predictive analysis. Linear regression makes predictions for
continuous (real-valued, numeric) variables such as sales, salary, age,
and product price.
• The linear regression algorithm models a linear relationship between a
dependent variable (y) and one or more independent variables (x), hence
the name linear regression. Because the relationship is linear, the model
describes how the value of the dependent variable changes with the value of
the independent variable.



Linear Regression in Machine Learning
• The linear regression model provides a sloped straight line representing
the relationship between the variables.
Mathematically, we can represent a linear regression as:
y = a0 + a1x + ε

Here,
y = dependent variable (target variable)
x = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (scale factor applied to each input value)
ε = random error

The values of the x and y variables form the training dataset for the linear regression
model.



Types of Linear Regression

• Linear regression can be further divided into two types of the algorithm:
• Simple Linear Regression:
• If a single independent variable is used to predict the value of a numerical
dependent variable, then such a Linear Regression algorithm is called
Simple Linear Regression.
• Multiple Linear regression:
• If more than one independent variable is used to predict the value of a
numerical dependent variable, then such a Linear Regression algorithm
is called Multiple Linear Regression.



Linear Regression
• Linear Regression Line
• A straight line showing the relationship between the dependent and
independent variables is called a regression line. A regression line can show two types of
relationship:
• Positive Linear Relationship:
• If the dependent variable (Y-axis) increases as the independent variable (X-axis) increases,
the relationship is called a positive linear relationship.



Linear Regression
• Negative Linear Relationship:
• If the dependent variable (Y-axis) decreases as the independent
variable (X-axis) increases, the relationship is called
a negative linear relationship.



• Finding the best fit line:
• When working with linear regression, our main goal is to find the best fit
line, that is, the line for which the error between predicted values and actual
values is minimized. The best fit line has the least error.
• Different values of the weights or line coefficients (a0, a1) give
different regression lines, so we need to calculate the best values
of a0 and a1 to find the best fit line. To do this we
use a cost function.



Cost function-
• Different values of the weights or line coefficients (a0, a1) give
different regression lines, and the cost function is used to estimate
the coefficient values for the best fit line.
• The cost function is minimized to optimize the regression coefficients or weights.
It measures how well a linear regression model is performing.
• We can use the cost function to find the accuracy of the mapping
function, which maps the input variable to the output variable.
This mapping function is also known as the hypothesis function.



Cost function-
• For Linear Regression, we use the Mean Squared Error (MSE) cost
function, which is the average of the squared errors between
the predicted values and the actual values. It can be written as:
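A standard way to write the MSE in the notation used earlier (a0 for the intercept, a1 for the slope) is given below. The symbol N for the number of data points and the index i are assumptions introduced here for the formula.

```latex
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \bigl( y_i - (a_1 x_i + a_0) \bigr)^2
```

Each term inside the sum is the squared difference between an actual value y_i and the value a_1 x_i + a_0 predicted by the line.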



Cost function-
• Residuals: The distance between an actual value and the corresponding predicted value
is called a residual. If the observed points are far from the regression
line, the residuals will be large, and so the cost function will be
high. If the scatter points are close to the regression line,
the residuals will be small, and hence so will the cost function.
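The link between residuals and the cost function can be checked numerically. The sketch below assumes the MSE cost and the line y = a0 + a1x from the earlier slides; the function names and the four data points are hypothetical.

```python
def predict(a0, a1, x):
    """Value of the regression line y = a0 + a1*x at x."""
    return a0 + a1 * x

def residuals(a0, a1, xs, ys):
    """Actual value minus predicted value for every data point."""
    return [y - predict(a0, a1, x) for x, y in zip(xs, ys)]

def mse(a0, a1, xs, ys):
    """Mean of the squared residuals."""
    res = residuals(a0, a1, xs, ys)
    return sum(r * r for r in res) / len(res)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]        # hypothetical points lying exactly on y = 1 + 2x

print(mse(1.0, 2.0, xs, ys))     # line passes through every point: zero cost
print(mse(0.0, 2.0, xs, ys))     # same slope, wrong intercept: every residual is 1
```

The second line sits one unit below every point, so each residual is 1 and the cost rises from 0 to 1.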



Model Performance:
• Goodness of fit describes how well the regression line fits the set of observations. The process of finding the
best model out of various candidate models is called optimization. It can be assessed with the method below:
• 1. R-squared method:
• R-squared is a statistical measure of goodness of fit.
• It measures the strength of the relationship between the dependent and independent variables on a scale of
0-100%.
• A high value of R-squared indicates a small difference between the predicted values and actual values and
hence represents a good model.
• It is also called the coefficient of determination, or the coefficient of multiple determination for multiple regression.
• It can be calculated from the below formula:
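A standard way to write R-squared is shown below. The notation is an assumption introduced here: ŷ_i is the model's prediction for the observed value y_i, ȳ is the mean of the observed values, and N is the number of observations.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}
```

For a least-squares line with an intercept, this is the same as the ratio of explained variation to total variation, which is why a value near 1 (100%) indicates a good fit.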



Simple Linear Regression in Machine Learning

• Simple Linear Regression is a type of regression algorithm that models the relationship between
a dependent variable and a single independent variable. The relationship shown by a Simple
Linear Regression model is linear, i.e. a sloped straight line, hence the name Simple Linear Regression.
• The key point in Simple Linear Regression is that the dependent variable must be a
continuous/real value. However, the independent variable can be measured on continuous or
categorical values.

• The Simple Linear Regression algorithm has two main objectives:

• Model the relationship between the two variables, such as the relationship between income and
expenditure, or experience and salary.

• Forecast new observations, such as forecasting weather according to temperature, or the revenue
of a company according to its investments in a year.



Simple Linear Regression Model:
• The Simple Linear Regression model can be represented using the below
equation:
• y = a0 + a1x + ε
• Where,
• a0 = the intercept of the regression line (the value of y at x = 0)
• a1 = the slope of the regression line, which tells whether the line is
increasing or decreasing and how steeply
• ε = the error term (for a good model it will be small)
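The coefficients a0 and a1 of this model can be estimated with the ordinary least-squares formulas. The sketch below is one way to do it in plain Python; the function name `fit_simple_linear` and the data values are hypothetical, and least squares is assumed as the fitting criterion (it is the one that minimizes the MSE cost).

```python
def fit_simple_linear(xs, ys):
    """Least-squares estimates (a0, a1) for the line y = a0 + a1*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    a1 = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    a0 = y_mean - a1 * x_mean
    return a0, a1

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]          # hypothetical data lying on y = 1 + 2x
a0, a1 = fit_simple_linear(xs, ys)
print(a0, a1)
print(a0 + a1 * 4.0)               # forecast for a new observation x = 4
```

With noisy data the same function returns the line that keeps the squared residuals as small as possible, and the error term ε accounts for what remains.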



Simple Linear Regression Model:
• Example
• You are a social researcher interested in the relationship between
income and happiness. You survey 500 people whose incomes
range from 15k to 75k and ask them to rank their happiness on a
scale from 1 to 10.
• Your independent variable (income) and dependent variable (happiness)
are both quantitative, so you can do a regression analysis to see if
there is a linear relationship between them.



Multiple Linear Regression
• In the previous topic, we have learned about Simple Linear Regression, where a
single Independent/Predictor(X) variable is used to model the response variable (Y). But there
may be various cases in which the response variable is affected by more than one predictor
variable; for such cases, the Multiple Linear Regression algorithm is used.
• Moreover, Multiple Linear Regression is an extension of Simple Linear regression as it takes
more than one predictor variable to predict the response variable. We can define it as:
• Multiple Linear Regression is one of the important regression algorithms which models the
linear relationship between a single dependent continuous variable and more than one
independent variable.
• Example:
• Prediction of CO2 emission based on engine size and number of cylinders in a car.



MLR equation:
• The MLR formula looks like: y = a + bx1 + cx2 + dx3 + ...
• Each coefficient tells you how much the dependent variable changes for a
one-unit change in the corresponding independent variable while the
other independent variables are held constant, i.e. how much each
independent variable contributes in isolation.
• For example, if you had two independent variables (x1 and x2), then the
coefficient for x1 would tell you how strongly each unit change in
x1 affects y, and likewise for x2.
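To make the role of the coefficients concrete, the sketch below fits y = a + bx1 + cx2 by least squares using the normal equations, solved with a small Gaussian-elimination routine. Everything here is an illustrative assumption: the function names, the choice of the normal equations as the solution method, and the five data rows, which were generated from y = 1 + 2x1 + 3x2.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                      # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_multiple_linear(rows, ys):
    """Least-squares coefficients [a, b, c, ...] for y = a + b*x1 + c*x2 + ..."""
    design = [[1.0] + list(row) for row in rows]        # leading 1 gives the intercept
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated from y = 1 + 2*x1 + 3*x2
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
ys = [1, 3, 4, 6, 8]
coeffs = fit_multiple_linear(rows, ys)
print(coeffs)   # approximately [1, 2, 3]: intercept, effect of x1, effect of x2
```

The recovered coefficient for x1 is the change in y per unit change in x1 with x2 held fixed, which matches the interpretation given above.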

Overfitting and Underfitting in Machine Learning

• In machine learning, overfitting and underfitting are key concepts that
describe how well a model is able to learn from data and generalize to unseen
data.
• Both issues affect the performance of machine learning models and need to
be managed carefully to build robust, reliable models.



Overfitting and Underfitting in Machine Learning

• Before looking at overfitting and underfitting, let's understand
some basic terms that will help with this topic:
• Signal: The true underlying pattern in the data that the
machine learning model should learn.
• Noise: Unnecessary and irrelevant data that reduces the
performance of the model.
• Bias: A prediction error introduced in the model by
oversimplifying the machine learning algorithm. It shows up as a systematic
difference between the predicted values and the actual values.
• Variance: The sensitivity of the model to the particular training data it saw.
A high-variance model performs well on the training
dataset but does not perform well on the test dataset.
Overfitting in Machine Learning

• Overfitting occurs when a machine learning model is trained too well on the
training data, including capturing noise, outliers, and irrelevant patterns that
are specific to the training data.
• While this might result in a low error on the training set, the model performs
poorly on new, unseen data (test data), as it has effectively "memorized" the
data rather than learned the general patterns.



Overfitting
• Overfitting occurs when our machine learning model tries to cover all the data
points, or more than the required data points, present in the given dataset.
• Because of this, the model starts capturing noise and inaccurate values present in
the dataset, and all these factors reduce the efficiency and accuracy of
the model.
• An overfitted model has low bias and high variance.
• Overfitting is one of the main problems that occur in supervised learning.



Example
• Example: The concept of overfitting can be understood from a
graph of the linear regression output:

As we can see from the graph, the
model tries to cover all the data points present
in the scatter plot.
It may look efficient, but in reality, it is not so.
The goal of the regression model is to
find the best fit line, but here we have not got
a best fit, so the model will generate prediction
errors.
Characteristics of Overfitting:

• Low training error: The model fits the training data almost perfectly.
• High test/validation error: When tested on unseen data, the model performs
poorly.
• Complexity: The model may have too many parameters or be too flexible
(e.g., a very deep neural network or a high-degree polynomial regression).



Causes of Overfitting

• Insufficient training data: With a small amount of data, the model may try to
learn every specific detail, including noise.
• Too many features: Including too many input variables or features may lead
the model to memorize specific relationships.
• Model complexity: A model that is too complex (e.g., too many layers in a
neural network or high-degree polynomial) can capture unnecessary patterns.



How to Avoid Overfitting:

• Regularization: Techniques like L1 and L2 regularization add penalties to the
complexity of the model, which discourages it from fitting irrelevant details.
• Cross-validation: Splitting data into training, validation, and test sets helps in
monitoring the model’s ability to generalize. K-fold cross-validation is a common
technique.
• Pruning: In decision trees, pruning removes unnecessary branches that don’t add
much value to prediction.
• Reduce model complexity: Use simpler models or fewer features when possible.
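As an illustration of the regularization idea, the sketch below applies an L2 penalty to the slope of a simple linear regression. It assumes the cost "sum of squared residuals + lam * a1^2" with the intercept left unpenalized; the function name `fit_ridge_simple`, the parameter name `lam`, and the data are hypothetical.

```python
def fit_ridge_simple(xs, ys, lam):
    """Fit y = a0 + a1*x while penalizing large slopes with lam * a1**2."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    # The penalty adds lam to the denominator, shrinking the slope toward zero.
    a1 = s_xy / (s_xx + lam)
    a0 = y_mean - a1 * x_mean
    return a0, a1

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]                   # hypothetical data on y = 1 + 2x

print(fit_ridge_simple(xs, ys, lam=0.0))    # no penalty: ordinary least squares
print(fit_ridge_simple(xs, ys, lam=5.0))    # strong penalty: flatter line
```

With lam = 0 the ordinary least-squares line is recovered; increasing lam makes the model less flexible, which reduces overfitting but, if pushed too far, leads to underfitting.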



Underfitting in Machine Learning

Underfitting occurs when a model is too simple and unable to capture the
underlying patterns in the training data.
This leads to poor performance on both the training set and the test set because
the model fails to understand the complexity of the data and learn the
relationships between input and output variables.
Underfitting
• Example: We can understand underfitting using the output of the
linear regression model:

As we can see from the
diagram, the model is unable to
capture the data points present in
the plot.

How to avoid underfitting:
By increasing the training time of the
model.
By increasing
Characteristics of Underfitting:

High training error: The model does not even fit the training data well.
High test/validation error: Since the model fails to capture the patterns, its
performance on new data is also poor.
Simplicity: The model might be too simple to learn the actual relationships in the
data (e.g., using linear regression on data with nonlinear relationships).
Causes of Underfitting:

Oversimplified model: Using a model that is too basic, like a linear regression model
for data that requires a more complex model (e.g., polynomial regression or a deep
neural network).
Lack of features: If the input data doesn’t contain enough information, the model
may not be able to learn.
Too much regularization: Applying too much regularization may make the model too
rigid, preventing it from learning important relationships.
How to Avoid Underfitting:

Increase model complexity: Use a more complex model that can capture more
nuances in the data.
Feature engineering: Add more relevant features to the input dataset to help the
model learn better.
Reduce regularization: Decrease the strength of regularization techniques (like L2
regularization) if the model is overly constrained.


Example : Predicting Housing Prices
Let’s take an example of predicting house prices based on a single feature: the
size of the house (square footage). We have data that shows house sizes
(input) and their prices (output). The goal is to train a model that can predict
house prices for new houses based on their size.

Scenario 1: Underfitting
If we apply a linear regression model (i.e., fitting a straight line to the data), it may not capture
the complex relationship between house size and price. In reality, house prices may increase non-
linearly with size (e.g., houses above a certain size may have premium pricing), but a linear
model oversimplifies the problem and cannot capture this.
• Result: The model performs poorly on both the training and test data. It is underfitting because
it is too simple to capture the pattern in the data.
Example : Predicting Housing Prices

Scenario 2: Overfitting
Now, let’s apply a polynomial regression model with a very high degree (e.g.,
a 10th-degree polynomial). This model may fit the training data very well,
creating a curve that perfectly passes through every single training point,
even capturing the noise in the data (such as anomalies or errors in pricing).
• Result: The model performs extremely well on the training data but very
poorly on the test data. This is because it has overfitted the data and captured
noise or random fluctuations that do not generalize to unseen houses.
Example : Predicting Housing Prices

Scenario 3: Good Fit
Finally, we apply a moderate-degree polynomial regression (e.g., 2nd or 3rd-degree).
This model is complex enough to capture the nonlinear relationship between house
size and price, but not so complex that it starts fitting noise and outliers.
• Result: The model performs well on both the training data and the test data. It has a
good fit, capturing the general pattern without overfitting or underfitting.


R programming
• The R language was developed by statisticians to help
other statisticians and developers work with data faster and more efficiently.
• Machine learning involves working with
large amounts of data and statistics as a part of data science, so
R is often recommended for it.
• R is therefore handy for those working
with machine learning, making tasks easier and faster.



R programming
• Here are some advantages of using the R language to implement machine learning
algorithms.
• Advantages to Implement Machine Learning Using R Language
• It provides good explanatory code. For example, if you are at an early stage of
a machine learning project and need to explain the work you do, R can be easier to work with
than Python for statistical tasks, because many statistical methods are available
in few lines of code.
• R is well suited to data visualization and to prototyping
machine learning models.
• R has a rich set of tools and library packages for machine learning
projects. Developers can use these packages across the pre-modeling, modeling, and
post-modeling stages of a project. R's packages for statistics are extensive, which makes it a
common choice for statistically oriented machine learning work.



Popular R Language Packages Used to Implement Machine Learning

• lattice: The lattice package supports the creation of graphs displaying a
variable or the relation between multiple variables, with conditioning.

• DataExplorer: This R package focuses on automating data visualization and data
handling so that the user can pay attention to the data insights of the project.

• DALEX (Descriptive Machine Learning Explanations): This package helps to provide
various explanations for the relation between the input variables and the output. It
helps to understand complex machine learning models.



Popular R Language Packages Used to Implement Machine Learning

• dplyr: This R package is used to manipulate and summarize tabular data (rows and
columns). It applies the “split-apply-combine” approach.

• esquisse: This R package is used to explore data quickly to get the information it holds. It also
lets you plot bar graphs, histograms, curves, and scatter plots.

• caret: This R package attempts to streamline the process of creating predictive models.

• janitor: This R package has functions for examining and cleaning dirty data. It is built to be
user-friendly for beginning and intermediate users.

• rpart: This R package helps to create classification and regression models using a two-stage
procedure. The resulting models are represented as binary trees.



Application Of R in Machine Learning

• Social Network Analytics


• To analyze trends and patterns
• Getting insights for behaviour of users
• To find the relationships between the users
• Developing analytical solutions
• Accessing charting components
• Embedding interactive visual graphics



Decision tree
• A decision tree is a model composed of a collection of "questions" organized
hierarchically in the shape of a tree. The questions are usually called a condition, a
split, or a test. We will use the term "condition" in this class. Each non-leaf node
contains a condition, and each leaf node contains a prediction.
• Botanical trees generally grow with the root at the bottom; however, decision trees are
usually represented with the root (the first node) at the top.

Inference of a decision tree model is computed by
routing an example from the root (at the top) to one of
the leaf nodes (at the bottom) according to the
conditions.
The value of the reached leaf is the decision tree's
prediction. The set of visited nodes is called the
inference path.



Decision tree
• A decision tree is a supervised learning technique that can be used for
both classification and regression problems, but it is mostly
preferred for solving classification problems.
• It is a tree-structured classifier, where internal nodes represent the
features of a dataset, branches represent the decision rules, and
each leaf node represents an outcome.
• In a decision tree, there are two types of node: the decision node
and the leaf node.
• Decision nodes are used to make a decision and have multiple
branches, whereas leaf nodes are the output of those decisions and
do not contain any further branches.



Decision tree
• The decisions or tests are performed on the basis of the features of the
given dataset.
• It is a graphical representation for getting all the possible solutions to a
problem/decision based on given conditions.
• It is called a decision tree because, similar to a tree, it starts with the
root node, which expands on further branches and constructs a
tree-like structure.
• One common way to build a tree is the CART algorithm, which stands for
Classification and Regression Tree algorithm.
• A decision tree simply asks a question, and based on the answer
(Yes/No), it further splits the tree into subtrees.



Figure: general structure of a decision tree.

Note: A decision tree can contain categorical data (YES/NO) as well as
numeric data.



Why use Decision Trees?
• There are various algorithms in machine learning, so choosing the best
algorithm for the given dataset and problem is the main point
to remember while creating a machine learning model. Below are
two reasons for using a decision tree:

• Decision trees usually mimic human thinking while making a
decision, so they are easy to understand.
• The logic behind the decision tree can be easily understood because it
shows a tree-like structure.



Decision Tree Terminologies

• Root Node: Root node is from where the decision tree starts. It represents the
entire dataset, which further gets divided into two or more homogeneous sets.
• Leaf Node: Leaf nodes are the final output node, and the tree cannot be
segregated further after getting a leaf node.
• Splitting: Splitting is the process of dividing the decision node/root node into
sub-nodes according to the given conditions.
• Branch/Sub Tree: A tree formed by splitting the tree.
• Pruning: Pruning is the process of removing the unwanted branches from the
tree.
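• Parent/Child node: A node that is split into sub-nodes is called the parent node of
those sub-nodes, and the sub-nodes are called its child nodes. The root node is the parent
of the nodes created by the first split.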



How does the Decision Tree algorithm Work?

• In a decision tree, to predict the class of a given record, the
algorithm starts from the root node of the tree.
• The algorithm compares the value of the root attribute with the
corresponding attribute of the record (real dataset) and, based on the comparison,
follows the branch and jumps to the next node.
• At the next node, the algorithm again compares the attribute value
with the sub-nodes and moves further.
• It continues the process until it reaches a leaf node of the tree. The
complete process can be better understood using the below algorithm:



Steps –Decision Tree
• Step-1: Begin the tree with the root node, say S, which contains the
complete dataset.
• Step-2: Find the best attribute in the dataset using an Attribute Selection
Measure (ASM).
• Step-3: Divide S into subsets that contain the possible values of the
best attribute.
• Step-4: Generate the decision tree node that contains the best
attribute.
• Step-5: Recursively make new decision trees using the subsets of the
dataset created in Step-3. Continue this process until a stage is
reached where you cannot further classify the nodes; each such final
node is called a leaf node.
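Steps 1 to 5 can be sketched as a short recursive procedure. The sketch makes several assumptions that are not in the steps themselves: Gini impurity is used as the attribute selection measure (the measure CART uses for classification), every split is a yes/no equality test on one attribute value, and the tiny job-offer dataset and all function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure node, larger for a more mixed one."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Step 2: return the (feature, value) test with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for feature in rows[0]:
        for value in {row[feature] for row in rows}:
            yes = [l for r, l in zip(rows, labels) if r[feature] == value]
            no = [l for r, l in zip(rows, labels) if r[feature] != value]
            if not yes or not no:
                continue                      # test does not actually divide the data
            score = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
            if score < best_score:
                best, best_score = (feature, value), score
    return best

def build_tree(rows, labels):
    """Steps 1, 3, 4, 5: split recursively until no test improves purity."""
    split = best_split(rows, labels)
    if split is None:                         # stopping case: make a leaf node
        return Counter(labels).most_common(1)[0][0]
    feature, value = split
    yes_idx = [i for i, r in enumerate(rows) if r[feature] == value]
    no_idx = [i for i, r in enumerate(rows) if r[feature] != value]
    return {
        "test": (feature, value),
        "yes": build_tree([rows[i] for i in yes_idx], [labels[i] for i in yes_idx]),
        "no": build_tree([rows[i] for i in no_idx], [labels[i] for i in no_idx]),
    }

def predict(tree, row):
    """Follow the tests from the root until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        feature, value = tree["test"]
        tree = tree["yes"] if row[feature] == value else tree["no"]
    return tree

# Hypothetical training data
rows = [
    {"salary": "high", "distance": "near"},
    {"salary": "high", "distance": "far"},
    {"salary": "low", "distance": "near"},
    {"salary": "low", "distance": "far"},
]
labels = ["accept", "decline", "decline", "decline"]
tree = build_tree(rows, labels)
print(tree)
print(predict(tree, {"salary": "high", "distance": "near"}))
```

Other attribute selection measures, such as information gain, can be substituted for `gini` without changing the recursive structure.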
Decision Tree
• Example: Suppose there is a candidate who has
a job offer and wants to decide whether to
accept the offer or not.
• To solve this problem, the decision tree
starts with the root node (the Salary attribute, chosen by
ASM).
• The root node splits further into the next
decision node (distance from the office) and
one leaf node based on the corresponding
labels.
• The next decision node further splits into
one decision node (Cab facility) and one leaf
node. Finally, that decision node splits into two
leaf nodes (Accepted offer and Declined offer).
Consider the below diagram:
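The job-offer tree described above can also be written down as a nested data structure and used for inference. The node order (Salary, then distance from the office, then Cab facility) follows the description; which answer leads to which branch, and the field names `salary_ok`, `office_near` and `cab_facility`, are assumptions made for this sketch.

```python
# Hypothetical encoding of the job-offer tree: each decision node asks one
# yes/no question, and each leaf is a plain string.
job_offer_tree = {
    "question": "salary_ok",
    "no": "Declined offer",
    "yes": {
        "question": "office_near",
        "no": "Declined offer",
        "yes": {
            "question": "cab_facility",
            "yes": "Accepted offer",
            "no": "Declined offer",
        },
    },
}

def decide(tree, candidate):
    """Route one example from the root to a leaf; return the leaf and the visited questions."""
    path = []
    node = tree
    while isinstance(node, dict):
        path.append(node["question"])
        node = node["yes"] if candidate[node["question"]] else node["no"]
    return node, path

candidate = {"salary_ok": True, "office_near": True, "cab_facility": False}
decision, path = decide(job_offer_tree, candidate)
print(decision)   # Declined offer
print(path)       # ['salary_ok', 'office_near', 'cab_facility']
```

The list of visited questions is the inference path for this candidate.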
INSTANCE-BASED LEARNING
• Instance-based learning is a family of learning algorithms that, instead
of performing explicit generalization, compare new problem instances with
instances seen in training, which have been stored in memory.
• They are sometimes referred to as lazy learning methods because they delay
processing until a new instance must be classified.
• The nearest neighbours of an instance are commonly defined in terms of
Euclidean distance.
• No explicit model is learned.
• The stored training instances themselves represent the knowledge.
• The training instances are searched for the instance that most closely resembles the new
instance.
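The points above describe the k-nearest-neighbour idea, which can be sketched as follows. The function name `knn_predict`, the choice k = 3, majority voting among the neighbours, and the six labelled points are assumptions made for the example.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest stored instances."""
    # No model is built: every stored instance is compared with the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest = [label for _, label in distances[:k]]
    return Counter(nearest).most_common(1)[0][0]

# Hypothetical stored training instances
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (2, 2)))   # A
print(knn_predict(train_points, train_labels, (6, 5)))   # B
```

All the work happens at query time: one Euclidean distance is computed for each stored instance, which is why these methods are called lazy and why classification cost grows with the size of the training data.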

Instance-based learning
• Instance-based learning: It generates classification predictions using
only specific instances.
• Instance-based learning algorithms do not maintain a set of abstractions derived
from specific instances.
• This approach extends the nearest neighbour algorithm, which has large storage
requirements.

Performance dimensions used for instance-based learning algorithm

• The time complexity of instance-based learning algorithms depends on
the size of the training data.
• The worst-case time complexity of classifying a single new instance is O(n), where n is
the number of training items.
• Some of the instance-based learning algorithms are :
• K Nearest Neighbor (KNN)
• Self-Organizing Map (SOM)
• Learning Vector Quantization (LVQ)
• Locally Weighted Learning (LWL)

Functions of instance-based learning
• Functions are as follows:
• Similarity: A nearest-neighbour approach identifies how similar two or more objects are
to each other based on distance functions.
• Classification: The process of categorizing a given set of data into classes. It
can be performed on both structured and unstructured data. The
process starts with predicting the class of given data points. The
classes are often referred to as targets, labels or categories.
• Concept Description: Much of human learning involves acquiring
general concepts from past experiences. The resulting concept description can then
be used to predict the class labels of unlabelled cases.

Advantages & Disadvantages of Instance-based Learning

Advantages of instance-based learning:

• It has the ability to adapt to previously unseen data, which means that
one can store a new instance or drop the old instance.

Disadvantages of instance-based learning:

• Classification costs are high.

• Large amount of memory required to store the data, and each query
involves starting the identification of a local model from scratch.

Bayes theorem
• Bayes theorem is named after Thomas Bayes, an English statistician, philosopher, and
Presbyterian minister of the 18th century.
• Bayes' ideas are used extensively in decision theory and are central to
probability theory.
• Bayes theorem is also widely used in Machine Learning where we need to predict classes
precisely and accurately. The Bayesian method, built on Bayes theorem,
is used to calculate conditional probabilities in Machine Learning applications that
include classification tasks.

Bayes theorem
• Bayes theorem is also known as Bayes rule or Bayes law.
• Bayes theorem helps to determine the probability of an event under
uncertain knowledge.
• It is used to calculate the probability of one event occurring given
that another event has already occurred. It relates conditional
probability to marginal probability.

What is Bayes Theorem?
• Bayes theorem is one of the most widely used concepts in machine
learning. It gives the probability of one event, under uncertain
knowledge, given that another event has already occurred.
• Bayes' theorem can be derived from the product rule and the conditional
probability of event X given a known event Y.
• According to the product rule, the joint probability of X and Y can be
written in terms of the probability of X given Y:
• P(X ∩ Y) = P(X|Y) P(Y) {equation 1}
• It can equally be written in terms of the probability of Y given X:
• P(X ∩ Y) = P(Y|X) P(X) {equation 2}

Bayes theorem
• Mathematically, Bayes theorem is obtained by equating the right-hand
sides of equations 1 and 2 and dividing by P(Y). We get:
• P(X|Y) = P(Y|X) P(X) / P(Y) {equation 3}
• The relation holds for any two events X and Y with P(Y) > 0. The events
do not have to be independent. If X and Y are independent, then
P(X|Y) = P(X) and the evidence Y tells us nothing new about X.
• The above equation is called Bayes Rule or Bayes Theorem.
Bayes theorem
• P(X|Y) is called the posterior, which we need to calculate. It is the
updated probability of the hypothesis after considering the evidence.
• P(Y|X) is called the likelihood. It is the probability of the evidence
when the hypothesis is true.
• P(X) is called the prior probability. It is the probability of the
hypothesis before considering the evidence.
• P(Y) is called the marginal probability. It is the probability of the
evidence regardless of whether the hypothesis is true.
• Hence, Bayes Theorem can be written as:
• posterior = likelihood * prior / evidence
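
As a worked illustration of posterior = likelihood * prior / evidence, the sketch below uses made-up numbers for a machine-fault alarm. The probabilities and the function name `posterior` are assumptions, not values from the course. The evidence P(Y) is expanded with the law of total probability, P(Y) = P(Y|X) P(X) + P(Y|not X) P(not X).

```python
def posterior(prior, likelihood, false_alarm_rate):
    """Return P(X|Y) from P(X), P(Y|X) and P(Y|not X)."""
    # Evidence P(Y): the alarm can fire when the fault is present or absent.
    evidence = likelihood * prior + false_alarm_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: X = "machine is faulty", Y = "alarm fires"
p_fault = 0.01              # prior P(X)
p_alarm_given_fault = 0.90  # likelihood P(Y|X)
p_alarm_given_ok = 0.05     # P(Y|not X)

p_fault_given_alarm = posterior(p_fault, p_alarm_given_fault, p_alarm_given_ok)
print(round(p_fault_given_alarm, 3))  # 0.154
```

Even with a sensitive alarm, the posterior stays low here because the prior is small. This is the kind of update the theorem makes explicit.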

Collaborative filtering
• Collaborative filtering is used by most recommendation systems to find similar patterns
in user behaviour. The technique picks out items a user is likely to enjoy on the basis
of the ratings or reactions of similar users.
• For example, it can predict the rating a particular user would give a movie, based on that
user's ratings for other movies and other users' ratings for all movies.
• This concept is widely used in recommending movies, news, applications, and many
other items.

Collaborative filtering is a technique widely used in recommendation
systems to suggest items (e.g., movies, products, or books) to users based
on their preferences and the preferences of similar users. It relies on the
idea that people who have agreed in the past will agree again in the future.
Collaborative filtering can be either user-based or item-based.

Collaborative Filtering
• Let’s take an example to understand collaborative filtering better.

Example
• Let’s assume user U1 likes movies m1, m2, and m4; user U2 likes
movies m1, m3, and m4; and user U3 likes movie m1.

• Our job is to recommend a new movie for user U3 to watch next.

• Users U1, U2, and U3 all like movie m1, so the three have similar
taste. Users U1 and U2 also both like movie m4, so user U3 is likely
to enjoy m4 as well, and we recommend movie m4. This is the flow of
logic.
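
The same flow of logic can be sketched in a few lines of Python. The dictionary `likes` mirrors the U1, U2, U3 example; the function name `recommend` and the vote-counting rule are assumptions chosen for illustration.

```python
from collections import Counter

# Liked movies per user, taken from the example above
likes = {
    "U1": {"m1", "m2", "m4"},
    "U2": {"m1", "m3", "m4"},
    "U3": {"m1"},
}


def recommend(target, likes):
    """Rank unseen movies by how many like-minded users enjoyed them."""
    seen = likes[target]
    votes = Counter()
    for user, movies in likes.items():
        if user == target or not (movies & seen):
            continue  # skip the target and users with no shared taste
        votes.update(movies - seen)  # one vote per movie the target has not seen
    return votes.most_common()


ranking = recommend("U3", likes)
print(ranking[0])  # top recommendation for U3 with its vote count
```

Movie m4 collects two votes (from U1 and U2), while m2 and m3 collect one each, so m4 comes out on top for U3.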

Types of Filtering

User-Based Collaborative Filtering:


In this approach, users are grouped based on their similarity in preferences.
Recommendations for a user are made by finding similar users and suggesting items that
those similar users have liked or interacted with.

Example: If User A and User B both rated several movies similarly, and User A liked a movie
that User B hasn’t seen, that movie will be recommended to User B.

Item-Based Collaborative Filtering:


Instead of comparing users, this method compares items. It looks at how similar items are
rated by users. Items that are similar in terms of user ratings are recommended to users who
have shown interest in similar items.

Example: If a user liked Movie A and Movie B, and other users who liked Movie A also liked
Movie C, then Movie C will be recommended to that user.

Steps in Collaborative Filtering

Data Collection: Gather user-item interaction data, which is often in the form of a sparse matrix where
rows represent users and columns represent items (e.g., movies or products). The entries in the matrix
may represent ratings, likes, or interactions between users and items.

Similarity Calculation: Compute the similarity between users or items. This can be done using various
distance or similarity metrics, such as cosine similarity or Pearson correlation.

Recommendation Generation: Once the similarities between users or items are calculated,
recommendations are generated by identifying items that are highly rated by similar users (user-based)
or items that are similar to the ones the user has previously liked (item-based).
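
The three steps can be sketched as follows for the user-based case. The ratings, user names, and the `predict` helper are hypothetical. Treating unrated items as 0 inside the cosine computation is a simplification; real systems usually handle missing ratings more carefully.

```python
import math

# Step 1, data collection: rows are users, columns are items, 0 = not rated.
ratings = {
    "alice": [5, 4, 0, 1],
    "bob":   [4, 5, 5, 1],
    "carol": [1, 1, 2, 5],
}


def cosine(u, v):
    """Step 2, similarity calculation: cosine similarity of two rating rows."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict(user, item, ratings):
    """Step 3: similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, row in ratings.items():
        if other == user or row[item] == 0:
            continue  # skip the user and anyone who has not rated the item
        sim = cosine(ratings[user], row)
        num += sim * row[item]
        den += abs(sim)
    return num / den if den else 0.0


predicted = predict("alice", 2, ratings)  # alice has not rated item 2
print(round(predicted, 2))
```

Because alice's ratings resemble bob's more than carol's, the prediction for item 2 lands closer to bob's rating of 5 than to carol's rating of 2.
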
Example of Collaborative Filtering:

Consider a movie recommendation system like Netflix. Suppose we have a user, John, who has
watched and rated several action movies highly but hasn’t rated any comedy movies.
Collaborative filtering can recommend movies to John in two ways:
User-based: The system identifies other users who have watched and rated similar action movies. It
then looks for movies these similar users have rated highly but that John hasn’t watched yet, and
recommends those to John.
Item-based: The system looks at the action movies John has rated highly and finds other movies that
received similar ratings from the same users. Similarity here comes from rating patterns rather than from
genre, actors, or themes. These similar movies are recommended to John.
Advantages of Collaborative Filtering:

No need for domain knowledge: Collaborative filtering relies purely on user interaction data, so it
doesn’t require knowledge about the specific features of items.

Personalized recommendations: By leveraging user preferences, collaborative filtering can provide highly
personalized suggestions tailored to individual tastes.

Limitations of Collaborative Filtering:


• Cold-start problem: Collaborative filtering struggles with new users or new items for which there is little
to no interaction data.
• Sparsity: In large datasets, the user-item interaction matrix may be very sparse (i.e., most users have
only interacted with a small subset of items), making it harder to find meaningful similarities.
• Scalability: For very large datasets, calculating similarities between millions of users or items can be
computationally expensive.
Example of Collaborative Filtering in Action:

Consider an e-commerce platform like Amazon that wants to recommend products to its users
based on their previous purchases and browsing history. Here’s how collaborative filtering could be
applied:
• User-Based Collaborative Filtering: Amazon identifies users whose purchases are similar to
User A's (e.g., they also bought laptops). It finds that these users have also bought certain
laptop accessories that User A hasn’t bought yet, such as laptop stands or external hard
drives. These accessories are then recommended to User A.
• Item-Based Collaborative Filtering: If User A has bought a laptop, collaborative filtering looks
at other users who bought the same or similar laptops. It identifies products that these users
often purchase together with the laptop (e.g., a particular type of mouse or laptop bag) and
recommends these items to User A.
Feature reduction
• Feature reduction, also known as dimensionality reduction, is the
process of reducing the number of features in a resource-heavy
computation without losing important information.
• Reducing the number of features means the number of variables is
reduced, making the computer’s work easier and faster.
• Feature reduction can be divided into two processes: feature selection
and feature extraction.
• There are many techniques by which feature reduction is accomplished.
Some of the most popular are generalized discriminant analysis,
autoencoders, non-negative matrix factorization, and principal
component analysis.

Feature reduction
• In machine learning classification problems, there are often too many
factors on the basis of which the final classification is done.
• These factors are basically variables called features. The higher the
number of features, the harder it gets to visualize the training set
and then work on it.
• Sometimes, most of these features are correlated, and hence
redundant. This is where dimensionality reduction algorithms come
into play.
• Dimensionality reduction is the process of reducing the number
of random variables under consideration, by obtaining a set of
principal variables. It can be divided into feature selection and
feature extraction.

Why is this Useful?
• The purpose of feature reduction is to reduce the number of features (or
variables) that the computer must process to perform its function. Fewer features
mean fewer resources are needed to complete computations or tasks. With less
computation time and less storage capacity needed, the computer can do more
work. In machine learning, feature reduction can also reduce multicollinearity, which
tends to improve the model in use.
• Another benefit of feature reduction is that it makes data easier to visualize for
humans, particularly when the data is reduced to two or three dimensions, which can
be easily displayed graphically. Feature reduction can also help with the curse of
dimensionality. This refers to a group of phenomena in which a problem has so many
dimensions that the data becomes sparse. Feature reduction decreases the number of
dimensions, making the data less sparse and easier to learn from reliably in machine
learning applications.

Example
• An intuitive example of dimensionality reduction can be discussed through a simple e-mail
classification problem, where we need to classify whether the e-mail is spam or not.
• This can involve a large number of features, such as whether or not the e-mail has a generic
title, the content of the e-mail, whether the e-mail uses a template, etc.

Dimensionality reduction/Feature reduction

• There are two components of dimensionality reduction:
• Feature selection: Here we try to find a smaller subset of the original
variables, or features, that can still be used to model the problem.
There are usually three approaches:
• Filter
• Wrapper
• Embedded
• Feature extraction: This maps data from a high-dimensional space to
a space with fewer dimensions.
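
A filter-style feature selection can be sketched as follows: a feature is dropped when it is almost perfectly correlated with one that has already been kept. The feature names, the measurements, and the 0.95 threshold are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.95):
    """Keep a feature only if it is not highly correlated with one already kept."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical measurements: the second column repeats the first in inches.
features = {
    "length_mm": [10.0, 20.0, 30.0, 40.0, 50.0],
    "length_in": [0.39, 0.79, 1.18, 1.57, 1.97],
    "hardness":  [52.0, 47.0, 55.0, 49.0, 51.0],
}

selected = drop_correlated(features)
print(selected)  # ['length_mm', 'hardness']
```

The redundant `length_in` column is removed without training any model, which is what distinguishes a filter method from wrapper and embedded methods.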

Dimensionality reduction/Feature reduction
• The various methods used for dimensionality reduction include:
• Principal Component Analysis (PCA)
• Linear Discriminant Analysis (LDA)
• Generalized Discriminant Analysis (GDA)
Advantages of Dimensionality Reduction
• It helps in data compression, and hence reduced storage space.
• It reduces computation time.
• It also helps remove redundant features, if any.
Disadvantages of Dimensionality Reduction
• It may lead to some amount of data loss.
• PCA captures only linear correlations between variables, so non-linear structure may be missed.
• PCA fails in cases where mean and covariance are not enough to define datasets.
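
To show what PCA does in the simplest case, the sketch below reduces two correlated features to one. It uses the closed-form eigen-decomposition of a 2x2 covariance matrix, so it applies to two features only. The data points and the function name `pca_first_component` are assumptions.

```python
import math


def pca_first_component(points):
    """Return the first principal direction, 1-D scores, and variance explained."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]

    # Sample covariance matrix [[a, b], [b, c]]
    a = sum(x * x for x, _ in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)

    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if b != 0:
        vx, vy = b, lam - a  # eigenvector belonging to lam
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    direction = (vx / norm, vy / norm)

    # Project each centred point onto the principal direction
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    explained = lam / (a + c)  # share of total variance kept
    return direction, scores, explained


# Hypothetical data lying close to the line y = x
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
direction, scores, explained = pca_first_component(points)
print(direction, round(explained, 3))
```

Each point is now described by one score instead of two coordinates, and `explained` reports how much of the original variance that single dimension keeps. The small part that is lost is the data loss mentioned above.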

THANK
YOU
