Data Analytics Unit 3
Unit 5
Data Analysis
R programming language

# Create a list.
list1 <- list(c(2,5,3), 21.3, sin)
# Print the list.
print(list1)

# Create a matrix.
M <- matrix(c('a','a','b','c','b','a'), nrow = 2, ncol = 3, byrow = TRUE)
print(M)

# Create an array.
a <- array(c('green','yellow'), dim = c(3,3,2))
print(a)

# Create the data frame.
BMI <- data.frame(
  gender = c("Male", "Male", "Female"),
  height = c(152, 171.5, 165),
  weight = c(81, 93, 78)
)
print(BMI)
Regression Versus Classification

Regression Modelling
Regression analysis is a set of statistical
processes for estimating the relationships
between a dependent variable (often called
the 'outcome' or 'response' variable) and
one or more independent variables (often
called 'predictors', 'covariates', 'explanatory
variables' or 'features').
Regression Modeling Steps
1. Define problem or question
2. Specify model
3. Collect data
4. Do descriptive data analysis
5. Estimate unknown parameters
6. Evaluate model
7. Use model for prediction
Simple vs. Multiple Regression
• Simple: β1 represents the unit change in Y per unit change in X. It does not take into account any other variable besides the single independent variable.
• Multiple: βi represents the unit change in Y per unit change in Xi, taking into account the effect of the other independent variables. It is the "net regression coefficient."
Linearity - the Y variable is linearly related to the
value of the X variable.
y = β0 + β1x + ε

where:
β0 and β1 are called parameters of the model,
ε is a random variable called the error term.
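A minimal R sketch of this model: the parameter values and error standard deviation below are illustrative assumptions, used only to show that lm() recovers β0 and β1 from data generated by y = β0 + β1x + ε.

# Simulate data from y = b0 + b1*x + e and fit it (assumed demo values).
set.seed(1)
b0 <- 10; b1 <- 5
x <- 1:30
e <- rnorm(30, mean = 0, sd = 2)   # random error term
y <- b0 + b1 * x + e
fit <- lm(y ~ x)                   # estimates b0 and b1 from the data
coef(fit)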
Simple Linear Regression Equation

E(y) = β0 + β1x

• Positive Linear Relationship: the regression line for E(y) has intercept β0 and a positive slope β1.
• Negative Linear Relationship: the regression line for E(y) has intercept β0 and a negative slope β1.
• No Relationship: the regression line for E(y) is horizontal, with intercept β0 and slope β1 = 0.
Least Squares Method
• Least Squares Criterion

min Σ(yi − ŷi)²

where:
yi = observed value of the dependent variable for the ith observation
ŷi = estimated value of the dependent variable for the ith observation
Least Squares Method
• Slope for the Estimated Regression Equation

b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²

where:
xi = value of the independent variable for the ith observation
yi = value of the dependent variable for the ith observation
x̄ = mean value of the independent variable
ȳ = mean value of the dependent variable

• y-Intercept for the Estimated Regression Equation

b0 = ȳ − b1x̄
Example data:
x: 2, 5, 3, 5, 1, 6
y: 4, 7, 6, 8, 4, 9
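As a quick illustration of these formulas, the slope and intercept for the small x/y table above can be computed directly in R (a sketch; lm() is included only as a cross-check):

# Apply the least-squares formulas to the data table above.
x <- c(2, 5, 3, 5, 1, 6)
y <- c(4, 7, 6, 8, 4, 9)
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b0 <- mean(y) - b1 * mean(x)                                     # intercept
c(b0 = b0, b1 = b1)
coef(lm(y ~ x))   # should agree with b0 and b1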
Simple Linear Regression

Number of TV Ads (x)   Number of Cars Sold (y)
1                      14
3                      24
2                      18
1                      17
3                      27

Σx = 10, Σy = 100
x̄ = 2, ȳ = 20
Estimated Regression Equation
• Slope for the Estimated Regression Equation:

b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² = 20/4 = 5

• y-Intercept for the Estimated Regression Equation:

b0 = ȳ − b1x̄ = 20 − 5(2) = 10

• Estimated Regression Equation:

ŷ = 10 + 5x
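The same fitted line can be reproduced in R with lm(); a minimal sketch using the five observations above:

# TV ads example: verify that the fitted line is y-hat = 10 + 5x.
ads  <- c(1, 3, 2, 1, 3)        # number of TV ads (x)
cars <- c(14, 24, 18, 17, 27)   # number of cars sold (y)
fit <- lm(cars ~ ads)
coef(fit)                           # intercept 10, slope 5
predict(fit, data.frame(ads = 4))   # predicted cars sold for 4 ads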
Coefficient of
Determination
• Relationship Among SST, SSR, SSE
SST = SSR + SSE
Σ(yi − ȳ)² = Σ(ŷi − ȳ)² + Σ(yi − ŷi)²
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
Coefficient of Determination
r2 = SSR/SST
where:
SSR = sum of squares due to regression
SST = total sum of squares
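Continuing the TV ads example, the sums of squares and r² can be computed directly in R (a short sketch; here SSR = 100 and SST = 114, so r² ≈ 0.88):

# Decompose SST = SSR + SSE and compute the coefficient of determination.
ads  <- c(1, 3, 2, 1, 3)
cars <- c(14, 24, 18, 17, 27)
fit  <- lm(cars ~ ads)
yhat <- fitted(fit)
SST <- sum((cars - mean(cars))^2)   # total sum of squares
SSR <- sum((yhat - mean(cars))^2)   # sum of squares due to regression
SSE <- sum((cars - yhat)^2)         # sum of squares due to error
c(SST = SST, SSR = SSR, SSE = SSE, r2 = SSR / SST)
summary(fit)$r.squared              # same r2 reported by lm()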
Supervised vs. Unsupervised Learning
• Supervised: e.g. a fruit classifier trained with known labels such as orange, apples, bananas.
• Unsupervised: e.g. a fruit classifier that has seen lots of examples but no proper labels; it groups similar items, like clustering "red fruits" or "fruit with soft skin".
Supervised learning
• Classification: draw conclusions such as spam or not, red or blue. Algorithms: Naïve Bayes, Decision tree, SVM.
• Regression: the output variable is a real or continuous value, such as marks or weight. Algorithms: Linear regression, Polynomial regression, SVM Regression.
What Is Naive Bayes?
Medical Diagnosis
• Given a list of symptoms, predict whether a patient has disease X or not
Weather
• Based on temperature, humidity, etc… predict if it will rain tomorrow
Inputs: features x1 and x2; output: label Y.
NAÏVE BAYES EXAMPLE:
To predict days suitable for a football match based on weather conditions.
Smaller circle: low probability to play (P < 0.5)
Big circle: high probability to play (P > 0.5)
Combining both the conditions gives the prior probability of the class (play): 0.60.
Predict the likelihood to play football on (Season = Winter, Sunny = No, Windy = Yes).
What is the probability of the match not being played?
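A hedged sketch of such a prediction in R with the naiveBayes() function from the e1071 package; the tiny weather table below is made-up illustrative data, not the dataset behind the slide figures.

# Naive Bayes on a toy "play football" table (illustrative data only).
library(e1071)
weather <- data.frame(
  Season = c("Winter", "Summer", "Winter", "Rainy", "Summer"),
  Sunny  = c("No",     "Yes",    "Yes",    "No",    "Yes"),
  Windy  = c("Yes",    "No",     "No",     "Yes",   "No"),
  Play   = c("No",     "Yes",    "Yes",    "No",    "Yes"),
  stringsAsFactors = TRUE
)
model <- naiveBayes(Play ~ Season + Sunny + Windy, data = weather)
# Posterior probabilities for (Season = Winter, Sunny = No, Windy = Yes)
predict(model,
        newdata = data.frame(Season = "Winter", Sunny = "No", Windy = "Yes"),
        type = "raw")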
Face recognition
Mail classification
Handwriting analysis
Salary prediction
Statistical Learning: Bayesian Network
A simple graphical representation for a joint probability distribution.
• Nodes are random variables
• Directed edges between nodes reflect dependence
Syntax:
– a set of nodes, one per variable
– a directed, acyclic graph (link ≈ "directly influences")
– if there is a link from x to y, x is said to be a parent of y
– a conditional distribution for each node given its parents:
P (Xi | Parents (Xi ))
Find the probability that John calls and Mary calls and the alarm
went off and there is no burglary and no earthquake.
P(J, M, A, ¬B, ¬E)
= P(J | A) * P(M | A) * P(A | ¬B, ¬E) * P(¬B) * P(¬E)
= 0.90 × 0.70 × 0.001 × 0.999 × 0.998
≈ 0.00063
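This arithmetic is easy to check in R using the conditional-probability values quoted above:

# P(J, M, A, not B, not E) for the burglar-alarm network
p_J_given_A    <- 0.90
p_M_given_A    <- 0.70
p_A_given_nBnE <- 0.001
p_notB         <- 0.999
p_notE         <- 0.998
p_J_given_A * p_M_given_A * p_A_given_nBnE * p_notB * p_notE  # about 0.00063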
Inference and Bayesian
Networks
• Bayesian networks are a type of probabilistic
graphical model that uses Bayesian inference for
probability computations.
• Bayesian networks aim to model conditional
dependence, and therefore causation, by
representing conditional dependence by edges in a
directed graph.
• Through these relationships, one can efficiently
conduct inference on the random variables in the
graph through the use of factors.
Inference and Bayesian
Networks
• A Bayesian network is a directed acyclic graph in
which each edge corresponds to a conditional
dependency, and each node corresponds to a unique
random variable.
• Formally, if an edge (A, B) exists in the graph
connecting random variables A and B, it means that
P(B|A) is a factor in the joint probability distribution,
so we must know P(B|A) for all values of B and A in
order to conduct inference.
• For example, in a network where Rain has an edge going into
WetGrass, P(WetGrass | Rain) will be a factor, whose probability values
are specified next to the WetGrass node in a conditional probability
table.
• Support Vector Machine, abbreviated as SVM, can be used for both regression and
classification tasks.
• In logistic regression, we take the output of the linear function and squash the value
into the range [0, 1] using the sigmoid function.
• If the squashed value is greater than a threshold value (0.5), we assign it the label 1;
otherwise we assign it the label 0.
• In SVM, we take the output of the linear function; if that output is greater than 1,
we identify it with one class, and if the output is less than −1, we identify it with the other class.
• Since the threshold values are changed to 1 and −1 in SVM, we obtain a reinforcement
range of values ([−1, 1]) which acts as the margin.
Maximum Margin:
Formalization
w: decision hyperplane normal vector
Margin
• Distance from an example x to the separator is r = y(wᵀx + b) / ||w||.
• Examples closest to the hyperplane are support vectors.
• Margin ρ of the separator is the width of separation between support vectors of the classes.
Derivation of finding r:
The dotted line x′ − x is perpendicular to the decision boundary, so it is parallel to w.
The unit vector is w/||w||, so the line is x′ = x − yrw/||w||.
x′ satisfies wᵀx′ + b = 0. So
wᵀ(x − yrw/||w||) + b = 0
Recall that ||w|| = sqrt(wᵀw).
So wᵀx − yr||w|| + b = 0
Solving for r gives:
r = y(wᵀx + b)/||w||
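A quick numerical check of this distance formula; the hyperplane (w, b), the point x, and the label y below are arbitrary illustrative choices:

# Signed distance from a point to the hyperplane w'x + b = 0, scaled by the label y.
w <- c(3, 4)          # assumed normal vector, ||w|| = 5
b <- -5               # assumed offset
x <- c(3, 4)          # assumed data point
y <- 1                # assumed class label (+1)
r <- y * (sum(w * x) + b) / sqrt(sum(w * w))
r                     # (9 + 16 - 5) / 5 = 4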
Linear SVM Mathematically: the linearly separable case
• Assume that all data is at least distance 1 from the hyperplane; then the following
two constraints follow for a training set {(xi, yi)}:
wᵀxi + b ≥ 1 if yi = 1
wᵀxi + b ≤ −1 if yi = −1
• For support vectors the inequalities hold with equality, and the margin is ρ = 2/||w||.
• The decision hyperplane is wᵀx + b = 0; the support vectors on either side satisfy
wᵀxa + b = 1 and wᵀxb + b = −1.
• Extra scale constraint:
min i=1,…,n |wᵀxi + b| = 1
• This implies:
wᵀ(xa − xb) = 2
ρ = ||xa − xb||2 = 2/||w||2
Solving the Optimization Problem
Find w and b such that
Φ(w) = ½ wᵀw is minimized,
and for all {(xi, yi)}: yi(wᵀxi + b) ≥ 1

The solution has the form f(x) = Σαiyixiᵀx + b.
• Notice that it relies on an inner product between the test point x and the support vectors xi.
• We will return to this later.
• Also keep in mind that solving the optimization problem involved computing the inner products xiᵀxj
between all pairs of training points.
Classification with SVMs
• The most “important” training points are the support vectors; they define the hyperplane.
• Quadratic optimization algorithms can identify which training points xi are support vectors with
non-zero Lagrangian multipliers αi.
• Both in the dual formulation of the problem and in the solution, training points appear only
inside inner products:

Find α1…αN such that
Q(α) = Σαi − ½ ΣΣ αiαjyiyjxiᵀxj is maximized and
(1) Σαiyi = 0
(2) 0 ≤ αi ≤ C for all αi

The classifier is then f(x) = Σαiyixiᵀx + b.
Non-linear SVMs
• Datasets that are linearly separable (with some noise) work out great:
Non-linear SVMs:
Feature spaces
• General idea: the original feature space can always be mapped
to some higher-dimensional feature space where
the training set is separable:
Φ: x → φ(x)
The “Kernel
Trick”
• The linear classifier relies on an inner product between vectors K(xi, xj) = xiᵀxj.
• If every datapoint is mapped into high-dimensional space via some transformation Φ: x → φ(x), the
inner product becomes:
K(xi, xj) = φ(xi)ᵀφ(xj)
• A kernel function is some function that corresponds to an inner product in some expanded feature
space.
• Example:
2-dimensional vectors x = [x1 x2]; let K(xi, xj) = (1 + xiᵀxj)².
We need to show that K(xi, xj) = φ(xi)ᵀφ(xj):
K(xi, xj) = (1 + xiᵀxj)²
= 1 + xi1²xj1² + 2xi1xj1xi2xj2 + xi2²xj2² + 2xi1xj1 + 2xi2xj2
= [1  xi1²  √2 xi1xi2  xi2²  √2 xi1  √2 xi2]ᵀ [1  xj1²  √2 xj1xj2  xj2²  √2 xj1  √2 xj2]
= φ(xi)ᵀφ(xj), where φ(x) = [1  x1²  √2 x1x2  x2²  √2 x1  √2 x2]
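The identity can be verified numerically in R for any pair of 2-dimensional vectors; the two sample vectors below are arbitrary:

# Check that (1 + xi'xj)^2 equals phi(xi)'phi(xj) for the mapping above.
phi <- function(x) c(1, x[1]^2, sqrt(2) * x[1] * x[2], x[2]^2,
                     sqrt(2) * x[1], sqrt(2) * x[2])
xi <- c(1, 2)                 # sample vector (assumed)
xj <- c(3, 4)                 # sample vector (assumed)
(1 + sum(xi * xj))^2          # kernel value
sum(phi(xi) * phi(xj))        # inner product in the expanded feature space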
Kernels
Common kernels
• Linear: K(x, z) = xᵀz
• Polynomial: K(x, z) = (1 + xᵀz)^d
• Gives feature conjunctions
• Radial basis function (infinite dimensional space)
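As a sketch of how these kernels are chosen in practice, the svm() function in the e1071 package exposes them directly; the built-in iris data and the tuning values below are illustrative assumptions, not part of the slides.

# Fitting SVMs with different kernels on the built-in iris data.
library(e1071)
svm_linear <- svm(Species ~ ., data = iris, kernel = "linear")
svm_poly   <- svm(Species ~ ., data = iris, kernel = "polynomial", degree = 2)
svm_rbf    <- svm(Species ~ ., data = iris, kernel = "radial")
# Training accuracy of the RBF model (for illustration only)
mean(predict(svm_rbf, iris) == iris$Species)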
Objectives:
• To understand how a time series works and what factors affect a certain variable (or variables) at different points in time.
• Time series analysis provides insight into the features of a given dataset that change over time.
• It supports predicting future values of the time-series variable.
• Assumption: there is one and only one assumption, "stationarity", which means that shifting the origin of time does not affect the statistical properties of the process.
How to analyze Time Series?
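One common starting point in R, shown here on the built-in AirPassengers series (an illustrative sketch, not the course dataset):

# Decompose a monthly series into trend, seasonal and random components.
data(AirPassengers)
class(AirPassengers)              # already a "ts" object with frequency 12
parts <- decompose(AirPassengers) # classical moving-average decomposition
plot(parts)
acf(AirPassengers)                # autocorrelation: a quick stationarity check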
Classical set
1. A classical set is a collection of distinct objects, for example, the set of students with passing grades.
2. Each individual entity in a set is called a member or an element of the set.
3. A classical set is defined in such a way that the universe of discourse is split into two groups, members and non-members. Hence, in the case of classical sets, no partial membership exists.
4. Let A be a given set. The membership function used to define A is given by:
μA(x) = 1 if x ∈ A, and μA(x) = 0 if x ∉ A.
Fuzzy Logic: Extracting Fuzzy Models from Data
Fuzzy set:
1. A fuzzy set is a set whose members have degrees of membership between 0 and 1. Fuzzy sets are represented with the tilde character (~). For example, the number of cars following traffic signals at a particular time, out of all cars present, will have a membership value in [0, 1].
2. Partial membership exists when a member of one fuzzy set can also be a part of other fuzzy sets in the same universe.
3. The degree of membership or truth is not the same as probability; fuzzy truth represents membership in vaguely defined sets.
4. A fuzzy set Ã in the universe of discourse U can be defined as a set of ordered pairs, given by:
Ã = {(x, μÃ(x)) | x ∈ U}, where μÃ(x) is the degree of membership of x in Ã.
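A small R sketch contrasting crisp and fuzzy membership for a set "tall"; the triangular membership function and its cut-off values are illustrative assumptions.

# Crisp vs. fuzzy membership for the set "tall" (heights in cm; cut-offs assumed).
crisp_tall <- function(h) as.numeric(h >= 180)               # 0 or 1 only
fuzzy_tall <- function(h) pmin(pmax((h - 160) / 30, 0), 1)   # degrees in [0, 1]
heights <- c(150, 165, 175, 185, 195)
crisp_tall(heights)   # classical set: full member or non-member
fuzzy_tall(heights)   # fuzzy set: partial membership allowed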
Fuzzy Logic: Extracting Fuzzy Models from Data