AI&ML Unit 5
This document is confidential and intended solely for the educational purpose of
RMK Group of Educational Institutions. If you have received this document
through email in error, please notify the system manager. This document
contains proprietary information and is intended only for the respective group /
learning community. If you are not the addressee you should not
disseminate, distribute or copy through e-mail. Please notify the sender
immediately by e-mail if you have received this document by mistake and delete
this document from your system. If you are not the intended recipient you are
notified that disclosing, copying, distributing or taking any action in reliance on
the contents of this information is strictly prohibited.
22IT401
ARTIFICIAL
INTELLIGENCE AND
MACHINE LEARNING
Department: Information Technology
Batch/Year: 2022-2026 / II
Created by: Dr. T. Mahalingam
Dr. S. Selvakanmani
Date: 12-02-2024
Table of Contents
S.NO. | CONTENTS | SLIDE NO.
1 | CONTENTS | 5
2 | COURSE OBJECTIVES | 7
24 | ASSIGNMENT 1 - UNIT 4 | 64
30 | ASSESSMENT SCHEDULE | 72
PRE-REQUISITE CHART
22IT401-ARTIFICIAL INTELLIGENCE
AND MACHINE LEARNING
22MA401- Probability
and Statistics
Introduction: What Is AI, The Foundations Of Artificial Intelligence, The History Of Artificial
Intelligence, The State Of The Art. Intelligent Agents: Agents And Environments, Good Behaviour:
The Concept Of Rationality, The Nature Of Environments, And The Structure Of Agents. Solving
Problems By Searching: Problem-Solving Agents, Uninformed Search Strategies, Informed
(Heuristic) Search Strategies, Heuristic Functions. Beyond Classical Search: Local Search
Algorithms and Optimization Problems, Searching With Nondeterministic Actions And Partial
Observations, Online Search Agents And Unknown Environments. Constraint Satisfaction
Problems: Definition, Constraint Propagation, Backtracking Search, Local Search, The Structure Of
Problems.
List of Exercise/Experiments
1. Implementation of forward and backward chaining.
2. Implementation of unification algorithms.
22IT401 ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
UNIT 3 LEARNING (L T P C: 3 0 0 3)
Learning from Examples: Forms of Learning, Supervised Learning, Learning Decision
Trees, Evaluating and Choosing the Best Hypothesis, The Theory of Learning, Regression
and Classification with Linear Models, Artificial Neural Networks. Applications: Human
computer interaction (HCI), Knowledge management technologies, AI for customer
relationship management, Expert systems, Data mining, text mining, and Web mining,
Other current topics.
List of Exercise/Experiments
1. Numpy Operations
2. NumPy arrays
4. NumPy Exercise:
(i) Write code to create a 4x3 matrix with values ranging from 2 to 13.
(ii) Write code to replace the odd numbers by -1 in the following array.
(iii) Perform the following operations on an array of mobile phone prices: 6999,
7500, 11999, 27899, 14999, 9999.
e) Apply GST of 18% on the mobile phone prices and update this array.
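A minimal sketch of how these NumPy exercises might be approached (the array names and the illustrative array used in step (ii) are assumptions, not part of the syllabus):
import numpy as np

# (i) a 4x3 matrix with values ranging from 2 to 13
matrix = np.arange(2, 14).reshape(4, 3)

# (ii) replace the odd numbers by -1 (shown on an illustrative array)
arr = np.arange(10)
arr[arr % 2 == 1] = -1

# (iii) e) apply GST of 18% to the mobile phone prices and update the array
prices = np.array([6999, 7500, 11999, 27899, 14999, 9999], dtype=float)
prices = prices * 1.18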
TOTAL : 45 PERIODS
22IT401 ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
UNIT 4 FUNDAMENTALS OF MACHINE LEARNING (L T P C: 3 0 0 3)
Motivation for Machine Learning, Applications, Machine Learning, Learning associations,
Classification, Regression, The Origin of machine learning, Uses and abuses of machine
learning, Success cases, How do machines learn, Abstraction and knowledge
representation, Generalization, Factors to be considered, Assessing the success of
learning, Metrics for evaluation of classification method, Steps to apply machine learning
to data, Machine learning process, Input data and ML algorithm, Classification of machine
learning algorithms, General ML architecture, Group of algorithms, Reinforcement
learning, Supervised learning, Unsupervised learning, Semi-Supervised learning,
Algorithms, Ensemble learning, Matching data to an appropriate algorithm.
List of Exercise/Experiments
1. Build linear regression models to predict housing prices using Python, using a dataset
available in Google Colab.
2. Stock Ensemble-based Neural Network for Stock Market Prediction using Historical Stock
Data and Sentiment Analysis.
22IT401 ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING (L T P C: 3 0 0 3)
List of Exercise/Experiments
Use Cases
Case Study 1: Churn Analysis and Prediction (Survival Modelling)
Cox-proportional models
Churn Prediction
Imbalanced Data
Neural Network
Case study 3: Sentiment Analysis or Topic Mining from New York Times
Part-of-Speech Tagging
A/B testing
User based
Item Based
Segmentation Strategies
Lifetime Value
Risk Profiling
Portfolio Optimization
Graph Construction
Route Optimization
5. COURSE OUTCOME
Course Code | Course Outcome Statement | Cognitive/Affective Level of the Course Outcome | Expected Level of Attainment
Course Outcome Statements in Cognitive Domain
6. CO-PO/PSO MAPPING
C211.5 (K3): 3, 2, 1, 1, 3, 2, 3
C211: 2.8, 2, 1.2, 1.2, 3, 0.8, 3
UNIT V
LECTURE PLAN – UNIT V
UNIT 5 INTRODUCTION
Sl. No | Topic | Proposed Lecture Period | Actual Lecture Period | No. of Periods | Pertaining CO(s) | Taxonomy Level | Mode of Delivery
1 | Supervised Learning, Regression, Linear regression, Multiple linear regression, A multiple regression analysis | 21.03.2024 (2) | 21.03.2024 (2) | 1 | CO2 | K2 | MD1
3 | Overfitting, Detecting overfit models: Cross validation | 26.03.2024 (1) | 26.03.2024 (1) | 1 | CO2 | K2 | MD1
5 | Decision trees: Background, Decision trees, Decision trees for credit card promotion | 11.04.2024 (2) | 11.04.2024 (2) | 1 | CO2 | K3 | MD1
LECTURE NOTES
UNIT 5
In supervised learning, the training data provided to the machine works as the supervisor that
teaches the machine to predict the output correctly. It applies the same concept as a student
learning under the supervision of a teacher.
Supervised learning is a process of providing input data as well as correct output data to the
machine learning model. The aim of a supervised learning algorithm is to find a mapping
function to map the input variable(x) with the output variable(y).
In the real-world, supervised learning can be used for Risk Assessment, Image classification,
Fraud Detection, spam filtering, etc.
In supervised learning, models are trained using a labelled dataset, where the model learns about
each type of data. Once the training process is completed, the model is tested on the basis of
test data (a held-out subset of the dataset), and then it predicts the output.
The working of Supervised learning can be easily understood by the below example and
diagram:
Suppose we have a dataset of different types of shapes which includes square, rectangle,
triangle, and Polygon. Now the first step is that we need to train the model for each shape.
o If the given shape has four sides, and all the sides are equal, then it will be labelled as
a Square.
o If the given shape has three sides, then it will be labelled as a triangle.
o If the given shape has six equal sides then it will be labelled as hexagon.
Now, after training, we test our model using the test set, and the task of the model is to identify
the shape.
The machine is already trained on all types of shapes, and when it finds a new shape, it classifies
the shape on the basis of the number of sides, and predicts the output.
o Determine the suitable algorithm for the model, such as support vector machine,
decision tree, etc.
o Execute the algorithm on the training dataset. Sometimes we need validation sets as
the control parameters, which are the subset of training datasets.
o Evaluate the accuracy of the model by providing the test set. If the model predicts the
correct output, then our model is accurate.
1. Regression
Regression algorithms are used if there is a relationship between the input variable and the
output variable. It is used for the prediction of continuous variables, such as Weather
forecasting, Market Trends, etc. Below are some popular Regression algorithms which come
under supervised learning:
o Linear Regression
o Regression Trees
o Non-Linear Regression
2. Classification
Classification algorithms are used when the output variable is categorical, for example when there
are two classes such as Yes-No, Male-Female, True-False, etc., as in spam filtering.
Popular classification algorithms which come under supervised learning include:
o Random Forest
o Decision Trees
o Logistic Regression
o Support vector Machines
Advantages of Supervised learning:
o With the help of supervised learning, the model can predict the output on the basis of
prior experiences.
o In supervised learning, we can have an exact idea about the classes of objects.
o Supervised learning model helps us to solve various real-world problems such as fraud
detection, spam filtering, etc.
Disadvantages of Supervised learning:
o Supervised learning models are not suitable for handling complex tasks.
o Supervised learning cannot predict the correct output if the test data is different from
the training dataset.
We can understand the concept of regression analysis using the below example:
Example: Suppose there is a marketing company A, which runs various advertisements every
year and gets sales from them. The below list shows the advertisements made by the company in
the last 5 years and the corresponding sales:
Now, the company wants to do the advertisement of $200 in the year 2019 and wants to know
the prediction about the sales for this year. So to solve such type of prediction problems in
machine learning, we need regression analysis.
Regression is a supervised learning technique which helps in finding the correlation between
variables and enables us to predict the continuous output variable based on the one or more
predictor variables. It is mainly used for prediction, forecasting, time series modeling, and
determining the causal-effect relationship between variables.
In regression, we plot a graph between the variables that best fits the given datapoints; using
this plot, the machine learning model can make predictions about the data. In simple
words, "Regression shows a line or curve that passes through all the datapoints on target-
predictor graph in such a way that the vertical distance between the datapoints and the
regression line is minimum." The distance between datapoints and line tells whether a model
has captured a strong relationship or not.
o Regression estimates the relationship between the target and the independent variable.
Types of Regression
There are various types of regressions which are used in data science and machine learning.
Each type has its own importance on different scenarios, but at the core, all the regression
methods analyze the effect of the independent variable on dependent variables. Here we are
discussing some important types of regression which are given below:
o Linear Regression
o Logistic Regression
o Polynomial Regression
o Support Vector Regression
o Decision Tree Regression
Linear Regression:
o Linear regression is a statistical regression method which is used for predictive analysis.
o It is one of the simplest and most widely used regression algorithms; it models the
relationship between continuous variables.
o It is represented by the equation Y = aX + b, where a is the slope of the line and b is the intercept.
o Example application: salary forecasting.
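A minimal sketch of fitting Y = aX + b with scikit-learn on made-up experience/salary data (the numbers below are illustrative only):
import numpy as np
from sklearn.linear_model import LinearRegression

# illustrative data: years of experience (X) vs. salary (Y)
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([30000, 35000, 41000, 45000, 52000])

model = LinearRegression()
model.fit(X, y)
a, b = model.coef_[0], model.intercept_   # slope a and intercept b in Y = aX + b
print(model.predict([[6]]))               # salary forecast for 6 years of experience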
Regression models are used to describe relationships between variables by fitting a line to the
observed data. Regression allows you to estimate how a dependent variable changes as the
independent variable(s) change.
Multiple linear regression is used to estimate the relationship between two or more
independent variables and one dependent variable. You can use multiple linear regression
when you want to know:
1. How strong the relationship is between two or more independent variables and one
dependent variable (e.g. how rainfall, temperature, and amount of fertilizer added affect
crop growth).
2. The value of the dependent variable at a certain value of the independent variables (e.g.
the expected yield of a crop at certain levels of rainfall, temperature, and fertilizer
addition).
Multiple linear regression makes all of the same assumptions as simple linear regression:
Homogeneity of variance (homoscedasticity): the size of the error in our prediction doesn’t
change significantly across the values of the independent variable.
Independence of observations: the observations in the dataset were collected using
statistically valid sampling methods, and there are no hidden relationships among variables.
In multiple linear regression, it is possible that some of the independent variables are actually
correlated with one another, so it is important to check these before developing the regression
model. If two independent variables are too highly correlated (r2 > ~0.6), then only one of them
should be used in the regression model.
Linearity: the line of best fit through the data points is a straight line, rather than a curve or
some sort of grouping factor.
The multiple regression equation takes the form y = B0 + B1X1 + B2X2 + ... + BnXn + e, where
B0 is the intercept, each Bi is the regression coefficient of the corresponding independent
variable (one such term for however many independent variables you are testing), and e is the
model error. To find the best-fit line for each independent variable, multiple linear regression
calculates three things:
• The regression coefficients that lead to the smallest overall model error.
• The t statistic of the overall model.
• The associated p value (how likely it is that the t statistic would have occurred by
chance if the null hypothesis of no relationship between the independent and
dependent variables was true).
It then calculates the t statistic and p value for each regression coefficient in the model.
import pandas as pd
from sklearn.datasets import load_boston   # removed in recent scikit-learn; any regression dataset works
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

def sklearn_to_df(data_loader):
    # convert a scikit-learn dataset object into a features DataFrame and a target Series
    X_data = data_loader.data
    X_columns = data_loader.feature_names
    X = pd.DataFrame(X_data, columns=X_columns)
    y_data = data_loader.target
    y = pd.Series(y_data, name='target')
    return X, y

X, y = sklearn_to_df(load_boston())

# hold out 20% of the data for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# scikit-learn's LinearRegression stands in here for the custom
# MultipleLinearRegression class used in the original snippet
mulreg = LinearRegression()
mulreg.fit(X_train, y_train)
pred = mulreg.predict(X_test)

# calculate r2_score on the held-out test data
print(r2_score(y_test, pred))
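The scikit-learn snippet above yields the fitted coefficients and an R-squared score; to also obtain the t statistic and p value for each regression coefficient described earlier, a library such as statsmodels can be used. A minimal sketch, assuming X and y are the DataFrame and Series produced above:
import statsmodels.api as sm

X_const = sm.add_constant(X)          # add the intercept term
ols_model = sm.OLS(y, X_const).fit()
print(ols_model.summary())            # per-coefficient t statistics and p values
print(ols_model.rsquared)             # overall model fit (R-squared)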
When predicting a complex process's outcome, it is best to use multiple linear regression
instead of simple linear regression.
A simple linear regression can accurately capture the relationship between two variables in
simple relationships. On the other hand, multiple linear regression can capture more complex
interactions that require more thought.
A multiple regression model uses more than one independent variable. It does not suffer from
the same limitations as the simple regression equation, and it is thus able to fit curved and non-
linear relationships. The following are the uses of multiple linear regression.
1. Planning and Control.
2. Prediction or Forecasting.
Estimating relationships between variables can be exciting and useful. As with all other
regression models, the multiple regression model assesses relationships among variables in
terms of their ability to predict the value of the dependent variable.
Why and When to Use Multiple Regression Over a Simple OLS Regression?
When you're trying to predict something, it's usually helpful to start with a linear model. But
sometimes things aren't so simple.
Multiple regression is used when you want to predict a dependent variable using more than one
independent variable. It is the same family of models as ordinary least squares (OLS)
regression. A simple OLS regression, on the other hand, isolates the effect of a single explanatory
variable on a continuous dependent variable by relating changes in the dependent variable to
changes in the value of that explanatory variable.
MLR can use more than one explanatory variable at once. This allows you to make better
predictions about what might happen in your data if certain changes were made.
Examples : Example #1
Let us try and understand the concept of multiple regression analysis with the help of an
example. But, first, let us try to find out the relation between the distance covered by an
UBER driver and the age of the driver, and the number of years of experience of the
driver.
To calculate multiple regression, go to the “Data” tab in Excel and select the “Data Analysis”
option. For further procedure and calculation, refer to the: Analysis ToolPak in Excel article.
1. y = m1x1 + m2x2 + b
2. y = 604.17 × (−3.18) + 604.17 × (−4.06) + 0
3. y ≈ −4377
In this particular example, we will see which variable is the dependent variable and which
variable is the independent variable. The dependent variable in this regression equation is the
distance covered by the UBER driver, and the independent variables are the age of the driver
and the number of experiences he has in driving.
Example #2
Let us try and understand the concept of multiple regression analysis with the help of
another example. Let us try to find the relation between the GPA of a class of students,
the number of hours of study, and the student’s height.
Go to the “Data” tab in Excel and select the “Data Analysis” option for the calculation.
y = 1.08 × 0.03 + 1.08 × (−0.002) + 0
y ≈ 0.0325
In this particular example, we will see which variable is the dependent variable and which
variable is the independent variable. The dependent variable in this regression is the GPA,
and the independent variables are study hours and the height of the students
Overfitting, Detecting overfit models: Cross validation, Cross validation: The ideal
procedure, Parameter estimation,
Overfitting
A modeling error that occurs when a function corresponds too closely to a particular set
of data
What is Overfitting?
Overfitting is a term used in statistics that refers to a modeling error that occurs when a function
corresponds too closely to a particular set of data. As a result, overfitting may fail to fit
additional data, and this may affect the accuracy of predicting future observations.
Overfitting can be identified by checking validation metrics such as accuracy and loss.
Validation accuracy usually improves until a point where it stagnates or starts declining once
the model begins to overfit (a validation loss curve shows the mirror image: it falls and then
starts rising). During the upward trend, the model is still seeking a good fit; once that fit is
achieved and overfitting sets in, the trend stagnates or reverses.
Summary
• Overfitting is a modeling error that introduces bias to the model because it is too closely
related to the data set.
• Overfitting makes the model relevant to its data set only, and irrelevant to any other
data sets.
• Some of the methods used to prevent overfitting include ensembling, data
augmentation, data simplification, and cross-validation.
How to Detect Overfitting?
Detecting overfitting is almost impossible before you test the model on unseen data. Testing
addresses the inherent characteristic of overfitting, which is the inability to generalize to new
data sets. The data can, therefore, be separated into different subsets to make it easy for
training and testing. The data is split into two main parts, i.e., a test set and a training set.
The training set represents a majority of the available data (about 80%), and it trains the model.
The test set represents a small portion of the data set (about 20%), and it is used to test the
accuracy of the data it never interacted with before. By segmenting the dataset, we can examine
the performance of the model on each set of data to spot overfitting when it occurs, as well as
see how the training process works.
The performance can be measured using the percentage of accuracy observed in both data
sets to conclude on the presence of overfitting. If the model performs better on the training set
than on the test set, it means that the model is likely overfitting.
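A hedged sketch of this train/test comparison (the dataset and classifier below are arbitrary choices used only to show the pattern):
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
# 80% training set, 20% test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = DecisionTreeClassifier(random_state=0)   # an unpruned tree tends to overfit
clf.fit(X_train, y_train)

train_acc = clf.score(X_train, y_train)
test_acc = clf.score(X_test, y_test)
# a large gap (e.g. near-perfect training accuracy but much lower test accuracy) suggests overfitting
print(train_acc, test_acc)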
1. Training with more data
One of the ways to prevent overfitting is by training with more data. Such an option makes it
easy for algorithms to detect the signal better to minimize errors. As the user feeds more
training data into the model, it will be unable to overfit all the samples and will be forced to
generalize to obtain results.
Users should continually collect more data as a way of increasing the accuracy of the model.
However, this method is considered expensive, and, therefore, users should ensure that the data
being used is relevant and clean.
2. Data augmentation
An alternative to training with more data is data augmentation, which is less expensive
compared to the former. If you are unable to continually collect more data, you can make the
available data sets appear diverse.
Data augmentation makes a sample data look slightly different every time it is processed by
the model. The process makes each data set appear unique to the model and prevents the model
from learning the characteristics of the data sets.
Another option that works in the same way as data augmentation is adding noise to the input
and output data. Adding noise to the input makes the model become stable, without affecting
data quality and privacy, while adding noise to the output makes the data more diverse.
However, noise addition should be done with moderation so that the extent of the noise is not
so much as to make the data incorrect or too different.
3. Data simplification
Overfitting can occur due to the complexity of a model, such that, even with large volumes of
data, the model still manages to overfit the training dataset. The data simplification method is
used to reduce overfitting by decreasing the complexity of the model to make it simple enough
that it does not overfit.
Some of the actions that can be implemented include pruning a decision tree, reducing the
number of parameters in a neural network, and using dropout on a neural network. Simplifying
the model can also make the model lighter and run faster.
4. Ensembling
Ensembling is a machine learning technique that works by combining predictions from two or
more separate models. The most popular ensembling methods include boosting and bagging.
Boosting works by using simple base models to increase their aggregate complexity. It trains
a large number of weak learners arranged in a sequence, such that each learner in the
sequence learns from the mistakes of the learner before it.
Boosting combines all the weak learners in the sequence to bring out one strong learner. The
other ensembling method is bagging, which is the opposite of boosting. Bagging works by
training a large number of strong learners arranged in a parallel pattern and then combining
them to optimize their predictions.
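A minimal scikit-learn sketch of both ensembling styles described above (the estimators and parameter values are illustrative choices, not recommendations):
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier

# bagging: many learners trained in parallel on bootstrap samples of the data
bagging = BaggingClassifier(n_estimators=50)        # default base learner is a decision tree

# boosting: a sequence of weak learners, each one correcting its predecessor's mistakes
boosting = GradientBoostingClassifier(n_estimators=100, max_depth=2)

# both follow the usual fit/predict interface, e.g.:
# bagging.fit(X_train, y_train); print(bagging.score(X_test, y_test))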
Another indicator of an overfit regression model is predicted R-squared, which has several
convenient features. First, it is included in the output as you fit the model, without any extra
steps on your part. Second, it is easy to interpret: you simply compare predicted R-squared to
the regular R-squared and see if there is a big difference.
What is cross-validation?
Cross-validation is a resampling technique that repeatedly trains candidate models on some
folds of the data and evaluates them on the remaining fold, so that several approaches can be
compared fairly. The idea is to select the approach that maximizes performance; this is the
model that will be deployed into production. Besides, you also want to get a reliable estimate
of that model's performance.
Suppose you do cross-validation to select a model. You test many alternatives using 5-fold
cross-validation. Then, a linear regression comes out on top.
Should you re-train the linear regression using all available data? or should you use the models
trained during cross-validation?
This part creates some confusion among data scientists — not only among beginners but also
among more seasoned professionals.
After cross-validation, you should re-train the best approach using all available data. Here’s a
quote taken from the legendary book Elements of Statistical Learning [1](parenthesis mine):
Our final chosen model [after cross-validation] is f(x), which we then fit to all the data.
Some practitioners keep the best models trained during cross-validation. Following the example
above, you’d keep 5 linear regression models. Then, during the deployment stage, you’d
average their predictions for each prediction.
Fewer data
By not re-training, you’re not using all available instances for creating a model.
This can lead to a sub-optimal model unless you have tons of data. Training with all available
instances is likely to generalize better.
Re-training is especially important in time series because the most recent observations are used
for testing. By not re-training in these, the model might miss newly emerged patterns.
Increased costs
One can argue that combining the 5 models trained during cross-validation leads to better
performance.
Yet, it’s important to understand the implications. You’re no longer using a simple,
interpretable, linear regression.
Your model is an ensemble whose individual models are trained by random subsampling.
Random subsampling is a way of introducing diversity in ensembles. Ensembles often perform
better than single models. But, they also lead to extra costs and lower transparency.
Keeping only one of the models trained during cross-validation would solve the problem of
increased costs. Yet, it's not clear which version of the model you should choose.
There are two reasons re-training can be skipped. If the data set is large or if re-training is too
costly. These two issues are often linked.
Here’s an example of how you can re-train the best model after cross-validation:
from sklearn.model_selection import KFold, GridSearchCV
from sklearn.ensemble import RandomForestRegressor

# X, y: the full available training data (assumed already defined)
# 5-fold cross-validation
cv = KFold(n_splits=5)
# optimizing the number of trees of a RF
model = RandomForestRegressor()
param_search = {'n_estimators': [10, 50, 100]}
# with refit=True (the default), the best model is re-trained after cross-validation
gs = GridSearchCV(estimator=model, param_grid=param_search, cv=cv, refit=True)
gs.fit(X, y)
The goal is to optimize the number of trees in a Random Forest. This is done with
the GridSearchCV class from scikit-learn. You can set the parameter refit=True, and the best
model is re-trained after cross-validation automatically.
You can do this explicitly by getting the best parameters from GridSearchCV to initialize a
new model:
best_model = RandomForestRegressor(**gs.best_params_)
best_model.fit(X, y)
Cross-validation and re-training cover the first two points, but not the third.
Why is that?
Cross-validation is often repeated several times before selecting a final model. You test
different transformations and hyperparameters. So, you end up adjusting your method until
you’re happy with the result.
This can lead to overfitting because the details of the validation sets can leak into the model.
Thus, the performance estimate you get from cross-validation can be too optimistic. You can
read more about this in the article in reference [2].
This is one of the reasons why Kaggle competitions have two leaderboards, one public and
another private. This prevents competitors from overfitting the test set.
You should make an extra evaluation step. After cross-validation, you evaluate the selected
model in a held-out test set. The full workflow is like this:
1. Split the available data into a training set and a held-out test set;
2. Run cross-validation on the training set to compare approaches and select the best one;
3. Re-train the chosen model using the training data and evaluate it on the test set.
This provides you with an unbiased performance estimate;
4. Re-train the chosen model using all available data and deploy it.
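A compact sketch of this four-step workflow (X and y stand for the full feature matrix and target, and the model grid is illustrative):
from sklearn.model_selection import train_test_split, GridSearchCV, KFold
from sklearn.ensemble import RandomForestRegressor

# 1. hold out a test set that cross-validation never sees
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

# 2. model selection with 5-fold cross-validation on the training data only
gs = GridSearchCV(RandomForestRegressor(),
                  {'n_estimators': [10, 50, 100]},
                  cv=KFold(n_splits=5))
gs.fit(X_train, y_train)

# 3. unbiased performance estimate on the held-out test set
print(gs.score(X_test, y_test))

# 4. re-train the chosen configuration on all available data before deployment
final_model = RandomForestRegressor(**gs.best_params_).fit(X, y)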
Logistic Regression:
o Logistic regression is another supervised learning algorithm which is used to solve the
classification problems. In classification problems, we have dependent variables in a
binary or discrete format such as 0 or 1.
o Logistic regression algorithm works with the categorical variable such as 0 or 1, Yes
or No, True or False, Spam or not spam, etc.
o It uses the concept of threshold levels: values above the threshold level are rounded up to 1,
and values below the threshold level are rounded down to 0.
o There are three types of logistic regression:
o Binary (0/1, pass/fail)
o Multi(cats, dogs, lions)
o Ordinal(low, medium, high)
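A minimal sketch of binary logistic regression and the threshold idea described above (the 0.5 threshold and the toy data are illustrative assumptions):
import numpy as np
from sklearn.linear_model import LogisticRegression

# toy data: hours studied vs. pass (1) / fail (0)
X = np.array([[1], [2], [3], [4], [5], [6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
probs = clf.predict_proba(X)[:, 1]        # sigmoid outputs between 0 and 1
labels = (probs >= 0.5).astype(int)       # values above the threshold map to class 1
print(probs, labels)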
Imagine, you got a new job offer and need to decide whether you are going to take it or leave
it. You consider several factors; starting from Salary, distance and commute time to the
office, other perks and benefits, career growth, and so on. But you don’t choose all the
factors at one time. Your brain processes the information through a series of if-else branches
like the picture below:
You start thinking about the salary first; it becomes the main factor or starting point for
your analysis.
If salary criteria are met and it’s above $50K then do you think the commute time to the office
is more than an hour or less? If the office is nearby and you can reach it easily, then you start
thinking about whether the office offers you coffee and other perks. Gradually if all those
conditions are met, you finally go ahead and accept the offer.
The decision tree algorithm works exactly in the same fashion. In some sense, it is a real-
world replica of how a human brain makes decisions: a series of clarifying questions,
processed one at a time in a sequential manner.
To create a decision tree, the algorithm begins by considering all the available features (also
called “attributes”) of the input data. It then selects the feature that best splits the data into
different classes or categories. This process is repeated for each split, with the algorithm
choosing the feature that best divides the data at each step. The process continues until the tree
is fully grown, or until a stopping criterion is reached (such as a maximum tree depth or a
minimum number of samples in a leaf node).
Once the decision tree is created, it can be used to make predictions on new input data by
following the path down the tree based on the feature values of the input data.
Decision trees are easy to interpret and understand, and they can handle both continuous and
categorical data. However, they can be prone to overfitting, particularly if the tree is allowed
to grow too deep. To mitigate this, techniques such as pruning (removing branches from the
tree) or limiting the maximum depth of the tree can be used.
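A short sketch of these ideas using scikit-learn's CART implementation (the dataset and the depth limit are chosen only for illustration):
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# limiting the maximum depth is one simple way to keep the tree from overfitting
tree = DecisionTreeClassifier(max_depth=3, criterion='entropy')
tree.fit(X, y)

# the learned splits can be inspected as readable if-then rules
print(export_text(tree))

print(tree.predict(X[:2]))   # a prediction follows one path from the root to a leaf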
• Root node: The top node of a decision tree, representing the entire population
or sample.
• Splitting: The process of dividing a node into two or more sub-nodes based on a
feature or attribute value.
• Decision node: A node that represents a decision to be made based on the value
of a feature or attribute.
• Leaf node: A terminal node that does not have any sub-nodes, representing a
classification or prediction.
• Pruning: The process of removing branches from a decision tree to reduce
overfitting and improve generalization to new data.
• Decision boundary: The line or plane that separates different classes or categories
in the data.
• Gini index: A measure of the purity of the nodes in a decision tree, based on
the proportion of samples belonging to a particular class.
• Information gain: A measure of the reduction in entropy (randomness or
uncertainty) caused by splitting the data based on a particular feature.
• Overfitting: The phenomenon where a model fits the training data too well and
does not generalize well to new data.
• Underfitting: The phenomenon where a model does not fit the training data well
and therefore performs poorly on both the training and test data.
A decision tree in general is termed a Classification and Regression Tree (CART). It can be
used for both classification problems as well as for continuous variable predictions too.
However, in this article, we will restrict ourselves to a real-world example in classification
only.
There are multiple real-life applications of Decision trees. Some examples include:
• Medical diagnosis: Make medical diagnoses based on a set of symptoms or test results.
• Credit approval: Banks and financial institutions can use decision trees to predict
the likelihood of an individual defaulting on a loan or credit card based on their
credit history and other factors.
• Marketing: Predict customer behavior and make targeted marketing campaigns
based on factors such as age, income, and purchasing history.
• Fraud detection: Identify fraudulent transactions in areas such as credit card use or
insurance claims.
• Oil reservoir characterization: Predict the characteristics of an oil reservoir based
on data such as rock type and porosity.
• Customer churn prediction: Predict the likelihood of a customer churning
(leaving a company) based on factors such as their usage patterns and customer
service interactions.
Growing a tree
The decision tree algorithm starts at the root node and progresses downward in search of the
purest set of data points. Speaking simply, the objective of a decision tree algorithm is to create
splits at different nodes such that the resulting nodes (set of observations or points) are as
homogeneous as possible.
As can be seen in the below figure, node A is an equal mix of blue and yellow dots and the
most impure node in that sense, node C is all blue and the purest set of data points, and node B
falls in-between node A and C.
Concept of Entropy and Splits:
In decision tree analysis, entropy is a measure of the impurity or randomness of a set of data.
It is commonly used to evaluate the quality of a split in a decision tree. The idea is that a split
that results in pure, homogeneous subsets (low entropy) is more useful for making accurate
predictions than a split that results in mixed or heterogeneous subsets (high entropy).
The entropy of a set is given by Entropy = - Σ p(i) log2 p(i), summed over all classes i,
where p(i) is the proportion of data points in the set that belong to class i.
For example, consider a set of data with two classes, A and B. If the data is perfectly balanced,
with 50% of the data points belonging to class A and 50% belonging to class B, the entropy
would be 1. If the data is completely imbalanced, with all data points belonging to class A or
all data points belonging to class B, the entropy would be 0.
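A small sketch that computes this entropy measure and reproduces the two cases just described:
import numpy as np

def entropy(class_proportions):
    # Entropy = -sum over classes of p(i) * log2(p(i)); zero-proportion classes contribute nothing
    p = np.array([p_i for p_i in class_proportions if p_i > 0])
    return max(0.0, float(-(p * np.log2(p)).sum()))

print(entropy([0.5, 0.5]))   # perfectly balanced two-class set -> 1.0
print(entropy([1.0, 0.0]))   # completely pure set -> 0.0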
In a decision tree, the entropy of a set of data is used to evaluate the quality of a split. A split
that results in pure, homogeneous subsets (low entropy) is more useful for making accurate
predictions than a split that results in mixed or heterogeneous subsets (high entropy). The goal
of the decision tree is to find the split that results in the lowest possible entropy, so that the
resulting subsets are as pure as possible.
While building a decision tree it becomes very important in choosing the right feature or
predictor for splitting and growing the treetop to down. To obtain the right set of features, the
concept of Information gain is used which is developed on the principle of maximum entropy
reduction while traversing from the top node to the bottom node by choosing the right set of
features.
The concept of information gain is presented below:
Say in a real-life problem, you need to decide which factor is more important among Energy
level and motivation for going to the gym. While exploring this the following set of responses
in the form of a decision tree were observed.
Therefore, it’s evident that information gain or reduction in entropy would be higher if we
chose Energy as the next feature. Therefore, the tree would select “Energy” as the next
splitting criteria.
The split with the highest information gain will be taken as the first split. The process will
continue until all children nodes are pure, or until the information gain is 0. That’s the reason,
decision tree algorithms are termed greedy algorithms. They build the tree until each and
every node becomes completely pure.
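Continuing the small entropy sketch from earlier, information gain for a candidate split is the parent node's entropy minus the weighted average entropy of its children. A toy illustration with made-up class counts:
def entropy_from_counts(counts):
    total = sum(counts)
    return entropy([c / total for c in counts])     # reuses entropy() sketched above

def information_gain(parent_counts, children_counts):
    total = sum(parent_counts)
    weighted_children = sum((sum(child) / total) * entropy_from_counts(child)
                            for child in children_counts)
    return entropy_from_counts(parent_counts) - weighted_children

# parent node with 9 positive / 5 negative samples, split into two candidate children
print(information_gain([9, 5], [[6, 1], [3, 4]]))   # roughly 0.15 bits of gain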
However, growing a tree to reach the purest set of nodes may not be always feasible owing to
computational challenges and overfitting problems on the training data. That is why the
concept of pruning comes into the picture. The growth of the decision tree can be restricted
by cutting the branches using hyperparameter tuning or by cost complexity pruning. The
details of this are outside the scope of this article. However, we should have a clear
understanding of these aspects as well while building a model using the CART algorithm.
For this family of models, the researcher needs to have at hand a dataset with some observations,
without also needing the labels/classes of those observations.
Unsupervised learning studies how systems can infer a function to describe a hidden structure
from unlabeled data. The system doesn’t predict the right output, but instead, it explores the data
and can draw inferences from datasets to describe hidden structures from unlabeled data.
Unsupervised models can be further grouped into clustering and association cases.
• Clustering: A clustering problem is where you want to unveil
the inherent groupings in the data, such as grouping animals based on some
characteristics/features e.g. number of legs.
Some examples of models that belong to this family are the following: PCA, K-means,
DBSCAN, mixture models etc.
K-Means Clustering is an unsupervised learning algorithm that is used to solve the clustering
problems in machine learning or data science. In this topic, we will learn what is K-means
clustering algorithm, how the algorithm works, along with the Python implementation of k-
means clustering.
It is an iterative algorithm that divides the unlabeled dataset into k different clusters in such a
way that each data point belongs to only one group of points with similar properties.
It allows us to cluster the data into different groups and is a convenient way to discover the
categories of groups in an unlabeled dataset on its own, without the need for any training.
It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim
of this algorithm is to minimize the sum of distances between the data point and their
corresponding clusters.
The algorithm takes the unlabeled dataset as input, divides the dataset into k-number of
clusters, and repeats the process until it does not find the best clusters. The value of k should
be predetermined in this algorithm.
o Determines the best value for K center points or centroids by an iterative process.
o Assigns each data point to its closest k-center. Those data points which are near to the
particular k-center, create a cluster.
Hence each cluster has datapoints with some commonalities, and it is away from other
clusters.
The below diagram explains the working of the K-means Clustering Algorithm:
Step-1: Select the number K to decide the number of clusters.
Step-2: Select K random points or centroids. (They can be points other than those in the input dataset.)
Step-3: Assign each data point to their closest centroid, which will form the predefined K
clusters.
Step-4: Calculate the variance and place a new centroid of each cluster.
Step-5: Repeat the third step, which means reassign each datapoint to the new closest centroid
of each cluster.
Suppose we have two variables M1 and M2. The x-y axis scatter plot of these two variables is
given below:
o Let's take number k of clusters, i.e., K=2, to identify the dataset and to put them into
different clusters. It means here we will try to group these datasets into two different
clusters.
o We need to choose some random k points or centroid to form the cluster. These points
can be either the points from the dataset or any other point. So, here we are selecting
the below two points as k points, which are not the part of our dataset. Consider the
below image:
o Now we will assign each data point of the scatter plot to its closest K-point or
centroid. We will compute it by applying some mathematics that we have studied to
calculate the distance between two points. So, we will draw a median line between both the centroids.
From the above image, it is clear that points left side of the line is near to the K1 or blue
centroid, and points to the right of the line are close to the yellow centroid. Let's color them as
blue and yellow for clear visualization.
o As we need to find the closest cluster, so we will repeat the process by choosing a new
centroid. To choose the new centroids, we will compute the center of gravity of these
centroids, and will find new centroids as below:
o Next, we will reassign each datapoint to the new centroid. For this, we will repeat the
same process of finding a median line. The median will be like below image:
From the above image, we can see, one yellow point is on the left side of the line, and two blue
points are right to the line. So, these three points will be assigned to new centroids.
As reassignment has taken place, so we will again go to the step-4, which is finding new
centroids or K-points.
o We will repeat the process by finding the center of gravity of centroids, so the new
centroids will be as shown in the below image:
o As we got the new centroids so again will draw the median line and reassign the data
points. So, the image will be:
o We can see in the above image; there are no dissimilar data points on either side of the
line, which means our model is formed. Consider the below image:
o
As our model is ready, so we can now remove the assumed centroids, and the two final clusters
will be as shown in the below image:
The performance of the K-means clustering algorithm depends upon the highly efficient clusters
that it forms, but choosing the optimal number of clusters is a big task. There are several
ways to find the optimal number of clusters; the most widely used is the elbow method, which
plots the within-cluster sum of squares (WCSS) against the number of clusters and picks the
value of K at the 'elbow' of the curve, beyond which adding more clusters gives little improvement.
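A minimal scikit-learn sketch of K-means together with this elbow-style search for K (the blob data is synthetic, generated only for illustration):
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

# within-cluster sum of squares (inertia) for several candidate values of K
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    print(k, km.inertia_)

# the "elbow" in the printed values suggests a good K; fit the final model with it
final = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print(final.cluster_centers_)
print(final.labels_[:10])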
Trial-and-error search and delayed reward are the most relevant characteristics of reinforcement
learning. This family of models allows the automatic determination of the ideal behavior
within a specific context in order to maximize the desired performance.
Reward feedback is required for the model to learn which action is best and this is known as
“the reinforcement signal”.
A multi-armed bandit is a complicated slot machine wherein instead of 1, there are several
levers which a gambler can pull, with each lever giving a different return. The probability
distribution for the reward corresponding to each lever is different and is unknown to the
gambler.
The task is to identify which lever to pull in order to get maximum reward after a given set of
trials. This problem statement is like a single step Markov decision process, which I discussed
in this article. Each arm chosen is equivalent to an action, which then leads to an immediate
reward.
The below table shows the sample results for a 5-armed Bernoulli bandit with arms labelled as
1, 2, 3, 4 and 5:
This is called Bernoulli, as the reward returned is either 1 or 0. In this example, it looks like
the arm number 3 gives the maximum return and hence one idea is to keep playing this arm in
order to obtain the maximum reward (pure exploitation).
Just based on the knowledge from the given sample, 5 might look like a bad arm to play, but
we need to keep in mind that we have played this arm only once and maybe we should play it
a few more times (exploration) to be more confident. Only then should we decide which arm
to play (exploitation).
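A small simulation of this explore/exploit trade-off for a 5-armed Bernoulli bandit, using a simple epsilon-greedy strategy (the true reward probabilities below are invented for the demo):
import numpy as np

rng = np.random.default_rng(0)
true_probs = [0.20, 0.45, 0.70, 0.30, 0.10]   # unknown to the gambler
counts = np.zeros(5)        # how often each arm was pulled
values = np.zeros(5)        # running estimate of each arm's reward
epsilon = 0.1               # fraction of purely exploratory pulls

for t in range(2000):
    if rng.random() < epsilon:
        arm = rng.integers(5)              # explore: pick a random arm
    else:
        arm = int(np.argmax(values))       # exploit: pick the best arm so far
    reward = rng.random() < true_probs[arm]   # Bernoulli reward (0 or 1)
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean update

print(values)   # estimates should approach true_probs for well-explored arms
print(counts)   # the best arm (index 2 here) should receive most of the pulls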
Use Cases
Bandit algorithms are being used in a lot of research projects in the industry. I have listed some
of their use cases in this section.
Clinical Trials
The well being of patients during clinical trials is as important as the actual results of the study.
Here, exploration is equivalent to identifying the best treatment, and exploitation is treating
patients as effectively as possible during the trial.
Network Routing
Routing is the process of selecting a path for traffic in a network, such as telephone networks
or computer networks (internet). Allocation of channels to the right users, such that the overall
throughput is maximised, can be formulated as a MABP.
Online Advertising
The goal of an advertising campaign is to maximise revenue from displaying ads. The
advertiser makes revenue every time an offer is clicked by a web user. Similar to MABP, there
is a trade-off between exploration, where the goal is to collect information on an ad’s
performance using click-through rates, and exploitation, where we stick with the ad that has
performed the best so far.
Game Designing
Building a hit game is challenging. MABP can be used to test experimental changes in game
play/interface and exploit the changes which show positive experiences for players.
A decision is an attribute that you (or your organization), as the decision-maker, explicitly
have the authority to alter. When making decisions in business, substantial things have to be
considered, such as how much to invest in a new project, how much to spend and sell for,
where to place a website, or what budget to devote to marketing.
When a decision tree is very complex and hard to explain, an influence diagram
is more useful, as it provides a higher-level description of what was found using the
decision tree.
Related mapping techniques include:
• Empathy mapping;
• Experience mapping;
• Customer journey mapping;
• Service design (blueprint) mapping.
In influence diagrams, the semantics are of two kinds - Arrows and Nodes.
o Arrow
An arrow will denote an influence. An arrow from A to B implies that understanding A will
directly affect our assumption or opinion for B. An effect communicates the pertinence
information, which may mean a causal interaction, or a flow of data, information, or money,
but need not.
o Node
A node is a predecessor at the beginning of an arc; a node at the end is a successor.
(1) A decision node is drawn as a rectangle (corresponding to a choice under the decision-maker's control);
(2) An uncertainty node is drawn as an oval (corresponding to the ambiguity it is based on);
(3) A deterministic node is drawn as a double oval (corresponding to a special kind of
uncertainty where the end decision is already known).
The influence diagram displays system dependency. There is an essential contrast between
the influence diagrams and the decision trees. Decision trees provide much more information
on a potential decision.
Influence diagrams are directly connected to and mostly used in combination with decision
trees. An influence diagram provides a summary of the knowledge in a decision tree.
A decision tree is a diagram of a set of connected choices with different outcomes. It allows a
person or organization, based on their costs, probabilities, and benefits, to evaluate available
options against each other.
Decision trees can become immensely complicated. A more compact influence diagram may
be a suitable substitute in these situations. Influence diagrams simplify the emphasis on
essential choices, inputs, and goals.
A decision tree can be used to develop automatic predictive models with machine learning,
data mining, and analytics applications.
Step 2: Customize anything you like according to your needs, from the text to the shapes.
Step 3: Once you are satisfied, just export your influence diagram in various formats, like
Microsoft Office, Graphs, PDF, PS, Visio and more.
Examples of The Influence Diagram
Source: EdrawMax
The first example is a basic influence diagram, which illustrates how the business operates
from investment to the final making profits. Since it is the simple influence diagram, the
information has been visualized and easy to understand.
The above example denotes the influence diagram of a store including aspects like
salesperson, cashier, product, money, etc. This diagram represents the key areas of decision
and uncertainty and is connected with arrows.
Risk Modeling
Credit risk modeling, the process of estimating the probability that someone will pay back a
loan, is one of the most important mathematical problems of the modern world. In this article,
we’ll explore from the ground up how machine learning is applied to credit risk modeling.
You don’t need to know anything about machine learning to understand this article!
To explain credit risk modeling with machine learning, we’ll first develop domain knowledge
about credit risk modeling. Then, we’ll introduce four fundamental machine learning systems
that can be used for credit risk modeling:
• K-Nearest Neighbors
• Logistic Regression
• Decision Trees
• Neural Networks
By the end of this article, you’ll understand how each of these algorithms can be applied to the
real-world problem of credit risk modeling, and you’ll be well on your way to understanding
the field of machine learning in general!
Let’s begin learning about what credit risk modeling is by looking at a simple situation.
The Situation
Say your buddy Ted needs ten bucks. You’ll want those bucks back, so he promises
he’ll repay you tomorrow when you see him again.
Sensitivity Analysis
Sensitivity analysis is a powerful tool used in many different disciplines to analyze the impact
of certain changes on a given system or model. It can be used for risk management, cost-
benefit analysis, statistical modeling, and other applications. By understanding the sensitivity
of a system to changes in its parameters, companies can make more informed decisions and
develop better strategies for success. In this article, we will discuss what sensitivity analysis
is, its benefits, steps for conducting a sensitivity analysis, applications, limitations, data
requirements, key considerations, and tools used.
What is Sensitivity Analysis?
Sensitivity analysis is a technique used to determine how much a system or model's output
changes when one or more of its inputs change. It is typically used to measure the effect of a
single change in one or more parameters on the output of a system. It is used to assess the
impact of input changes on a system's performance measure, such as cost or profit.
Sensitivity analysis helps companies understand the sensitivity of their systems or models to
changes in their inputs. This allows them to identify which parameters are most important to
their output and which can be safely changed without compromising their performance.
Sensitivity analysis can also be used to identify relationships between input parameters and
output measures.
Sensitivity analysis is a powerful tool for understanding the behavior of complex systems. It
can be used to identify areas of potential improvement, as well as to identify areas of risk. By
understanding the sensitivity of a system to changes in its inputs, companies can make
informed decisions about how to optimize their systems and models. Additionally, sensitivity
analysis can be used to identify areas of potential risk, allowing companies to take proactive
steps to mitigate those risks.
Finally, the sensitivity analysis should be documented and the results should be shared with
stakeholders. This will ensure that everyone is aware of the potential impacts of changes to
the system's parameters and can make informed decisions about how to adjust them in the
future.
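A minimal one-at-a-time sensitivity sketch on a toy profit model (the model and the parameter ranges are invented purely to show the pattern):
def profit(price, units, unit_cost, fixed_cost):
    # toy model: revenue minus variable and fixed costs
    return price * units - unit_cost * units - fixed_cost

base = {'price': 20.0, 'units': 1000, 'unit_cost': 12.0, 'fixed_cost': 3000.0}

# vary each input by +/-10% while holding the others at their base values
for name in base:
    for factor in (0.9, 1.1):
        scenario = dict(base, **{name: base[name] * factor})
        change = profit(**scenario) - profit(**base)
        print(f"{name} x{factor}: profit changes by {change:+.0f}")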
Applications of Sensitivity Analysis
Sensitivity analysis can be used in many different applications, including financial
management, risk management, decision-making, cost-benefit analysis, statistical modeling,
and machine learning. In financial management, sensitivity analysis can be used to identify
potential risks and opportunities in investments. In risk management, sensitivity analysis can
help identify potential risks associated with certain processes or activities. In decision-making,
sensitivity analysis can help managers decide which parameters are most important and which
can be safely adjusted without compromising their decision. In cost-benefit analysis, sensitivity
analysis helps identify opportunities for cost savings and increased efficiency. In statistical
modeling and machine learning, sensitivity analysis can help identify significant correlations
between input parameters and output measures.
Sensitivity analysis can also be used to identify areas of potential improvement in existing
processes or systems. By analyzing the sensitivity of different parameters, organizations can
identify areas where changes can be made to improve efficiency or reduce costs.
Additionally, sensitivity analysis can be used to identify potential areas of risk in a system or
process, allowing organizations to take proactive steps to mitigate those risks.
2. The analysis of variance for multiple regression, Examples for multiple regression
https://www.youtube.com/watch?v=aTZnuhTCFtI
https://www.youtube.com/watch?v=fYStutigCkE
3. Overfitting, Detecting overfit models: Cross validation
https://www.youtube.com/watch?v=vv0fWd09-js
https://www.youtube.com/watch?v=PF2wLKv2lsI
4. Cross validation: The ideal procedure, Parameter estimation, Logistic regression
https://www.youtube.com/watch?v=PK37PqkIOg4&list=PLfFghEzKVmjunyr8OPegxrX7y83IDuZNV
5. Decision trees: Background, Decision trees, Decision trees for credit card promotion
https://www.youtube.com/watch?v=MiJ9LjJBGaY&list=PLdKd-j64gDcC5TCZEqODMZtAotCfm5Zkh
https://www.youtube.com/watch?v=w1bFfpW_-LA
6. An algorithm for building decision trees, Attribute selection measure: Information gain, Entropy
https://www.youtube.com/watch?v=y6VwIcZAUkI
https://www.youtube.com/watch?v=coOTEc-0OGw
7. Decision Tree: Weekend example, Occam’s Razor, Converting a tree to rules
https://www.youtube.com/watch?v=SVwFJZeWdtg
https://www.youtube.com/watch?v=VOIIvr8tWf4
8. Unsupervised learning, Semi Supervised learning, Clustering, K-means clustering, Automated discovery
https://www.youtube.com/watch?v=KzJORp8bgqs
Assignments
Unit - V
Assignment Questions
Assignment Questions – Very Easy
Q. No. | ASSIGNMENT QUESTIONS | Marks | Knowledge level | CO
1 | Differentiate between supervised and unsupervised learning. | 5 | K2 | CO5
2 | Explain the concept of regression in supervised learning. | 5 | K2 | CO5
Assignment Questions
Course Outcomes:
CO5: Improve problem solving skills using the acquired knowledge in the areas of,
reasoning, natural language understanding, computer vision, automatic programming
and machine learning.
*Allotment of Marks
15 - 5 20
Part A – Questions & Answers
Unit – V
Part A - Questions & Answers
1. What is Supervised Learning?
Definition: Supervised learning is a type of machine learning where the algorithm
learns from labeled data, making predictions or decisions based on input-output
pairs.
Key Points:
It requires a dataset with labeled examples.
It includes regression and classification tasks.
2. Define Regression in Machine Learning.
Definition: Regression is a supervised learning technique used to predict continuous
values based on input features.
Key Points:
Linear regression is a common type of regression where the relationship between
the independent and dependent variables is approximated using a linear function.
3. Explain Multiple Linear Regression.
Definition: Multiple linear regression is a regression model that examines the linear
relationship between multiple independent variables and a single dependent
variable.
Key Points:
It extends linear regression to accommodate multiple predictors.
Each independent variable contributes to the prediction of the dependent variable
with its own coefficient.
4. What is Overfitting in Machine Learning?
Definition: Overfitting occurs when a model learns the training data too well,
capturing noise and irrelevant patterns that do not generalize to new data.
Key Points:
Overfit models perform well on training data but poorly on unseen data.
Regularization techniques can help mitigate overfitting.
5. How is Overfitting Detected?
Cross-validation is a common technique used to detect overfitting.
It involves splitting the dataset into training and validation sets multiple times and
evaluating the model's performance on each split.
6. Explain Logistic Regression.
Definition: Logistic regression is a type of regression used for classification tasks,
where the output is a probability value representing the likelihood of a particular
class.
Key Points:
It uses the logistic function (sigmoid function) to map predictions to probabilities.
It's suitable for binary classification tasks.
7. What is Decision Tree in Machine Learning?
Definition: A decision tree is a predictive model that maps observations about an
item to conclusions about its target value.
Key Points:
It consists of nodes that represent decision rules based on input features.
It's used for both classification and regression tasks.
8. Describe K-means Clustering.
Definition: K-means clustering is an unsupervised learning algorithm used to partition
a dataset into K distinct, non-overlapping clusters.
Key Points:
It assigns data points to the nearest cluster centroid iteratively.
It aims to minimize the within-cluster variance.
16. Explain Occam's Razor in the Context of Decision Trees.
Occam's Razor is a principle that suggests selecting the simplest explanation or
model that adequately explains the data. In decision trees, it translates to favoring
simpler tree structures over more complex ones to avoid overfitting and promote
generalization.
17. Describe an Algorithm for Building Decision Trees.
Decision tree algorithms, such as ID3 or CART, recursively partition the dataset based
on the attributes that best separate the target variable. They select attributes that
maximize information gain or minimize impurity at each node to construct the tree.
18. Define Attribute Selection Measure: Entropy.
Entropy is a measure of impurity or disorder in a dataset. In decision trees, it's used
to quantify the uncertainty in the target variable's distribution. A lower entropy
indicates less uncertainty and better attribute choice for splitting.
19. Explain Converting a Tree to Rules.
Converting a decision tree into rules involves translating the decision logic
represented by the tree's nodes and branches into a set of if-then rules. Each path
from the root to a leaf node corresponds to a rule, making the decision-making
process transparent and interpretable.
20. What is Automated Discovery?
Automated discovery refers to the process of automatically identifying patterns,
structures, or relationships in data without explicit human intervention. Machine
learning algorithms play a significant role in automated discovery by extracting
insights from large datasets.
21. Define Multi-Armed Bandit Algorithms.
Multi-Armed Bandit algorithms are a class of algorithms used in reinforcement
learning and online decision-making settings where an agent must repeatedly choose
between multiple actions, each with uncertain rewards. The goal is to maximize the
cumulative reward while balancing exploration and exploitation.
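A minimal epsilon-greedy sketch of the idea, simulating three Bernoulli arms with assumed reward probabilities, is shown below.

import numpy as np

rng = np.random.default_rng(0)
true_probs = [0.2, 0.5, 0.8]   # hypothetical reward probability of each arm
counts = np.zeros(3)
values = np.zeros(3)           # running estimate of each arm's mean reward
epsilon = 0.1                  # exploration rate

for _ in range(1000):
    # Explore with probability epsilon, otherwise exploit the best estimate so far.
    arm = rng.integers(3) if rng.random() < epsilon else int(np.argmax(values))
    reward = float(rng.random() < true_probs[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]   # incremental mean update

print(values)   # the estimate for the most-pulled arm should approach its true probability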
22. Explain Influence Diagrams.
Influence diagrams are graphical models used to represent decision-making problems
under uncertainty. They depict the relationships between decisions, uncertainties,
and objectives, facilitating probabilistic reasoning and optimal decision-making
strategies.
23. What is Risk Modeling?
Risk modeling involves quantifying and assessing the potential risks associated with
specific actions, events, or decisions. It aims to provide insights into the likelihood
and impact of various risk factors, enabling informed decision-making and risk
management strategies.
24. Describe Decision Tree: Weekend Example.
Example: A decision tree could be used to predict whether a person goes for outdoor
activities on the weekend based on factors like weather conditions, availability of
transportation, and personal preferences. Each decision node represents a factor,
and each leaf node represents a decision (e.g., stay indoors or go outdoors).
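The decision logic of such a tree could be written out by hand as nested if-then rules, as in this hypothetical sketch; the factor names and outcomes simply mirror the example above.

def weekend_plan(weather, has_transport, prefers_outdoors):
    # Each condition corresponds to a decision node; each return is a leaf.
    if weather == "rainy":
        return "stay indoors"
    if not has_transport:
        return "stay indoors"
    return "go outdoors" if prefers_outdoors else "stay indoors"

print(weekend_plan("sunny", True, True))   # go outdoors
print(weekend_plan("rainy", True, True))   # stay indoors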
25. Explain Parameter Estimation.
Parameter estimation involves determining the values of parameters in a model to
best fit the observed data. In machine learning, it often involves optimizing a loss
function to find the parameters that minimize prediction error on the training data.
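A minimal sketch of this idea, estimating the slope and intercept of y = w * x + b by gradient descent on a squared-error loss over synthetic data, is given below; the learning rate and iteration count are arbitrary choices.

import numpy as np

# Synthetic data generated from known parameters w = 3, b = 2, plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 2.0 + rng.normal(scale=0.5, size=100)

w, b, lr = 0.0, 0.0, 0.01
for _ in range(2000):
    error = (w * x + b) - y
    w -= lr * 2 * (error * x).mean()   # gradient of the mean squared error w.r.t. w
    b -= lr * 2 * error.mean()         # gradient of the mean squared error w.r.t. b

print(w, b)   # estimates should land close to 3 and 2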
Part B – Questions
Unit – V
Part B Questions
Q. No. | Questions | K Level | CO Mapping
1. | Define supervised learning and provide an example of its application in real-world scenarios. | K4 | CO5
2. | Explain the concept of linear regression. Provide an example illustrating its use. | K2 | CO5
https://onlinecourses.nptel.ac.in/noc21_cs42/preview
An Introduction to Artificial Intelligence
By Prof. Mausam | IIT Delhi
https://www.coursera.org/learn/computational-thinking-problem-solving
https://www.coursera.org/learn/artificial-intelligence-education-for-teachers
https://www.coursera.org/specializations/ai-healthcare
https://www.coursera.org/learn/predictive-modeling-machine-learning
https://www.drdobbs.com/parallel/the-practical-application-of-prolog/184405220
REAL TIME APPLICATION – UNIT V
Neural networks find extensive applications in areas where traditional, explicitly
programmed computing does not fare well: problems where, instead of fixed programmed
outputs, you want the system to learn, adapt, and change its results in step with the
data you feed it. Neural networks are also used heavily wherever we deal with noisy or
incomplete data, and most real-world data is indeed noisy.
With their brain-like ability to learn and adapt, neural networks underpin many
Artificial Intelligence and, consequently, Machine Learning algorithms. Before we get to
how neural networks power Artificial Intelligence, let's first talk a bit about what
exactly Artificial Intelligence is.
For a long time, the word "intelligence" was associated only with the human brain.
Then scientists found ways of training computers that borrow from how our brain works.
Thus came Artificial Intelligence, which can essentially be defined as intelligence
originating from machines; Machine Learning, in turn, is about giving machines the
ability to "think", "learn", and "adapt".
With so much said and done, it's worth understanding what exactly the use cases of AI
are, and how neural networks help the cause. Let's dive into the applications of neural
networks across various domains: from social media and online shopping, to personal
finance, and finally, to the smart assistant on your phone.
You should remember that this list is in no way exhaustive, as the applications of
neural networks are widespread. Many of the systems that make machines learn deploy
one type of neural network or another.
Social Media
The ever-increasing data deluge surrounding social media gives the creators of these
platforms the unique opportunity to dabble with the unlimited data they have. No
wonder you get to see a new feature every fortnight. It’s only fair to say that all of this
would’ve been like a distant dream without Neural Networks to save the day.
Neural Networks and their learning algorithms find extensive applications in the world of
social media. Let’s see how:
Facebook
As soon as you upload a photo to Facebook, the service automatically highlights faces
and prompts you to tag your friends. How does it instantly identify which of your
friends is in the photo?
The answer is simple – Artificial Intelligence. In a video highlighting Facebook’s Artificial
Intelligence research, they discuss the applications of Neural Networks to power their
facial recognition software. Facebook is investing heavily in this area, not only within the
organization, but also through the acquisitions of facial-recognition startups
like Face.com (acquired in 2012 for a rumored $60M), Masquerade (acquired in 2016 for
an undisclosed sum), and Faciometrics (acquired in 2016 for an undisclosed sum).
In June 2016, Facebook announced a new Artificial Intelligence initiative that uses
various deep neural networks such as DeepText – an artificial intelligence engine
that can understand the textual content of thousands of posts per second, with
near-human accuracy.
Instagram
Instagram, acquired by Facebook back in 2012, uses deep learning, making use of
recurrent neural networks to identify the contextual meaning of an emoji, which has
been steadily replacing slang (for instance, a laughing emoji could replace "rofl").
By algorithmically identifying the sentiments behind emojis, Instagram creates and
auto-suggests emojis and emoji-related hashtags. This may seem like a minor
application of AI, but being able to interpret and analyse this emoji-to-text
translation at a larger scale sets the basis for further analysis of how people use
Instagram.
Online Shopping
Do you find yourself in situations where you’re set to buy something, but you end
up buying a lot more than planned, thanks to some super-awesome
recommendations?
Yeah, blame neural networks for that. By making use of neural networks and what they
learn, the e-commerce giants are creating Artificial Intelligence systems that know you
better than you know yourself. Let's see how:
Search
Your Amazon searches ("earphones", "pizza stone", "laptop charger", etc.) return a
list of the most relevant products related to your search, without wasting much
time. In a description of its product search technology, Amazon states that
its algorithms learn automatically to combine multiple relevant features. They use
past patterns and adapt to what is important for the customer in question.
And what makes the algorithms “learn”? You guessed it right – Neural Networks!
Recommendations
Amazon shows you recommendations via "customers who viewed this item also viewed"
and "customers who bought this item also bought", as well as through curated
recommendations on your homepage, at the bottom of item pages, and through emails.
Amazon makes use of artificial neural networks to train its algorithms to learn the
patterns and behaviour of its users. This, in turn, helps Amazon provide better, more
personalised recommendations.
CONTENT BEYOND SYLLABUS – UNIT V
1. Fuzzy Logic
2. Uncertainty
3. Explain how MYCIN works.
ASSESSMENT SCHEDULE
S.NO | Name of the Assessment | Start Date | End Date | Portion
PRESCRIBED TEXT BOOKS AND REFERENCE BOOKS
REFERENCES:
Mini Projects
Problem statement: A Kaggle dataset with taxi ride details, including drop_location,
drop_time, and pickup_time, is given. The task is to use any ML prediction technique
to predict the time taken for a ride from the given inputs.
Dataset: https://www.kaggle.com/code/angadchau/taxi-time-prediction
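A possible starting point is sketched below. It is only an outline: the file name, the column names (pickup_time, drop_time, drop_location), and the choice of a random-forest regressor are assumptions based on the problem statement and will need adjusting to the actual Kaggle dataset.

import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

df = pd.read_csv("taxi_rides.csv")   # hypothetical local copy of the Kaggle data

# Target: ride duration in minutes, derived from the pickup and drop timestamps.
df["pickup_time"] = pd.to_datetime(df["pickup_time"])
df["drop_time"] = pd.to_datetime(df["drop_time"])
df["duration_min"] = (df["drop_time"] - df["pickup_time"]).dt.total_seconds() / 60

# Simple example features; real work would add distance, traffic, time of day, etc.
df["pickup_hour"] = df["pickup_time"].dt.hour
X = pd.get_dummies(df[["pickup_hour", "drop_location"]], columns=["drop_location"])
y = df["duration_min"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_train, y_train)
print(model.score(X_test, y_test))   # R^2 score on held-out rides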
Thank you