Business Intelligence and Data Analytics
Classification and Clustering
observations in two disjoint sets T and V. T is for training and V is for testing purposes.

Repeated random sampling

The repeated random sampling method involves replicating the holdout method r times. For each repetition k, a random sample T_k is extracted and the corresponding accuracy is calculated on V_k, where V_k = D - T_k.

Formula for repeated random sampling:

$acc = \frac{1}{r} \sum_{k=1}^{r} acc(V_k)$

3.2.3 Cross-Validation

Cross validation evades overlapping test sets. It assures that each observation of dataset D appears the same number of times in the training sets and exactly once in the test sets.

The cross validation is based on a partition of dataset D into r disjoint subsets L_1, L_2, ..., L_r, and requires r iterations. At iteration k, subset L_k is selected as the test set and the union of all other subsets in the partition as the training set.

The standard method for evaluation is ten fold cross validation. Extensive experiments have shown that 10 is the best choice to get an accurate estimate.

Repeated stratified cross validation is even better: ten fold cross validation is repeated 10 times and the results are averaged (this reduces the variance).

Leave-one-out is a particular form of cross validation in which each test set includes only one observation, and each example in turn is used to measure accuracy.

Accuracy

Accuracy is calculated as the number of all correct predictions divided by the total number of observations in the dataset. The best accuracy is 1.0 whereas the worst is 0.0. It can also be calculated as 1 - ERR, where ERR is the error rate.

$ACC = \frac{TRP + TAN}{TRP + TAN + FAN + FAP} = \frac{TRP + TAN}{P + N}$

Here TRP and TAN are the numbers of true positives and true negatives, FAP and FAN are the numbers of false positives and false negatives, and P and N are the total numbers of positives and negatives.

True positive rate

True positive rate or sensitivity is calculated as the number of correct positive predictions divided by the total number of positives. The best true positive rate is 1.0 and the worst is 0.0.

$\text{True positive rate} = \frac{TRP}{TRP + FAN}$

True negative rate or specificity

It is the number of correct negative predictions divided by the total number of negatives.

$SP = \frac{TAN}{TAN + FAP}$

Precision

It is calculated as the number of correct positive predictions divided by the total number of positive predictions. The best precision is 1.0 whereas the worst is 0.0.

$\text{Precision} = \frac{TRP}{TRP + FAP}$
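As a minimal sketch of the four measures above, the following Python function computes them from the confusion-matrix counts. The function name and the example counts (40, 45, 5, 10) are assumptions chosen for illustration; they do not come from the text.

```python
def classification_metrics(trp, tan, fap, fan):
    """Compute accuracy, error rate, sensitivity, specificity and precision
    from the counts of true/false positives and negatives."""
    total = trp + tan + fap + fan
    accuracy = (trp + tan) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1 - accuracy,        # ERR = 1 - ACC
        "sensitivity": trp / (trp + fan),  # true positive rate
        "specificity": tan / (tan + fap),  # true negative rate
        "precision": trp / (trp + fap),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(trp=40, tan=45, fap=5, fan=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The sketch does not guard against empty denominators (for example, a classifier that never predicts the positive class), which a production version would need to handle.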
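The partition used in cross validation (Section 3.2.3) can be sketched as follows. This is an illustrative sketch only: the function name `k_fold_splits`, the shuffling step and the seed are assumptions, and only observation indices are produced, not a trained model.

```python
import random


def k_fold_splits(n_observations, r, seed=0):
    """Yield (training indices, test indices) for each of the r iterations."""
    indices = list(range(n_observations))
    random.Random(seed).shuffle(indices)
    # r disjoint subsets L_1, ..., L_r whose union is the whole dataset
    folds = [indices[k::r] for k in range(r)]
    for k in range(r):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


for train, test in k_fold_splits(10, r=5):
    print("train:", sorted(train), "test:", sorted(test))
```

Each observation lands in exactly one test set and in r - 1 training sets, which is the property the text attributes to cross validation.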
Four outcomes of a classifier: predictions (positive, negative) set against the observed labels (positive, negative) give the counts TRP, FAP, FAN and TAN.

Fig. 3.2.3 : ROC curve (x-axis: 1 - Specificity)

In the ROC plot, the diagonal represents a classifier with random performance level. It separates the space into two areas for good and poor performance levels.

3.2.6 Cumulative Gain and Lift Charts

Gain or lift is a measure of the effectiveness of a classification model. It is calculated as the ratio between the results obtained with and without the model. It is a visual aid for evaluating the performance of a classification model. Both charts consist of a lift curve and a base line.

For example, an educational institute wants to run a mail marketing drive for a new course. It costs the institute Rs 1 for each item mailed. They have information on 1,00,000 students. Out of these 1 lakh students, 20,000 showed a positive response. Suppose we use a response model to assign scores.

Prediction of response model:

Cost (Rs)   Total Number of Students Contacted   Positive Response
10000       10000                                6000
20000       20000                                10000
30000       30000                                13000

Fig. 3.2.4 : Lift curve and base line (x-axis: % Customers Contacted, y-axis: % Responses)

Lift chart

It shows the actual lift. For contacting 10% of students using no model we should get 10% of the responders, and using the model we get 30% of the responders, so the y value of the lift curve is 30/10 = 3. Similarly, for 20% of students we get 50% of the responders, so the lift is 50/20 = 2.5. The cumulative gain and lift charts give an idea of which customers to contact.
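The lift values quoted above can be reproduced from the response-model table. The following is a small sketch; the function name `gain_and_lift` is an assumption, while the totals (1,00,000 students, 20,000 responders) and the three rows are taken from the example.

```python
def gain_and_lift(contacted, responders,
                  total_students=100_000, total_responders=20_000):
    """Return (% students contacted, % responders reached, lift)."""
    pct_contacted = 100 * contacted / total_students
    pct_responders = 100 * responders / total_responders   # cumulative gain
    lift = pct_responders / pct_contacted                   # model vs. no model
    return pct_contacted, pct_responders, lift


# (students contacted, positive responses) from the table
table = [(10_000, 6_000), (20_000, 10_000), (30_000, 13_000)]
for contacted, responders in table:
    pct_c, pct_r, lift = gain_and_lift(contacted, responders)
    print(f"{pct_c:.0f}% contacted -> {pct_r:.0f}% of responders, lift = {lift:.2f}")
```

The base line corresponds to a lift of 1, that is, contacting x% of students at random reaches x% of the responders.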
Fig. 3.2.5 : Lift chart (lift curve and baseline; x-axis: % Customers Contacted)

3.3 Bayesian Methods

Bayes' theorem is one of the earliest probabilistic inference algorithms, developed by Reverend Bayes. Naive Bayes is a classification technique based on Bayes' Theorem. It assumes that there is independence among predictors. In simple terms, a Naive Bayes' classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.

For example, a fruit may be considered to be an apple if it is red, round, and about 3 inches in diameter. Even if these features depend on each other or upon the existence of the other features, all of these properties independently contribute to the probability that this fruit is an apple, and that is why it is known as 'Naive'.

$p(class \mid data) = \frac{p(data \mid class) \cdot p(class)}{p(data)}$

3.3.1 Bayes' Theorem Implementation

Let us implement Bayes' Theorem using a simple example. Suppose we want to find the probability of an individual having high blood pressure, given that he or she was tested for it and got a positive result. In the medical field probabilities play a very important role, as it usually deals with life and death situations.

We assume the following:

P(Bp) is the probability of a person having Blood pressure. Assume 1% of the general population has Blood pressure, so P(Bp) = 0.01.

P(Pos) is the probability of getting a positive test result.

P(Neg) is the probability of getting a negative test result.

P(Pos|Bp) is the probability of getting a positive result on a test done for detecting Blood pressure, given that you have Blood pressure. This has a value of 0.9. In other words, the test is correct 90% of the time. This is also called the Sensitivity or True Positive Rate.

P(Neg|~Bp) is the probability of getting a negative result on a test done for detecting Blood pressure, given that you do not have Blood pressure. This also has a value of 0.9 and is therefore correct 90% of the time. This is also called the Specificity or True Negative Rate.

The Bayes formula is as follows:

$P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}$

P(A) is the prior probability of A occurring independently. In our example this is P(Bp). This value is given to us.

P(B) is the prior probability of B occurring independently. In our example this is P(Pos).

P(A|B) is the posterior probability that A occurs given B. In our example this is P(Bp|Pos), that is, the probability of an individual having Blood pressure, given that the individual got a positive test result. This is the value that we are looking to calculate.

P(B|A) is the likelihood of B occurring, given A. In our example this is P(Pos|Bp). This value is given to us.

Putting our values into the formula for Bayes' theorem we get:

$P(Bp \mid Pos) = \frac{P(Bp) \cdot P(Pos \mid Bp)}{P(Pos)}$

The probability of getting a positive test result, P(Pos), can be calculated using the Sensitivity and Specificity as follows:

$P(Pos) = [P(Bp) \cdot Sensitivity] + [P(\sim Bp) \cdot (1 - Specificity)]$

P(Bp) = Probability of having blood pressure = 0.01
P(~Bp) = Probability of not having blood pressure = 0.99
Sensitivity = P(Pos|Bp) = Probability of getting a positive result given blood pressure = 0.9
Specificity = P(Neg|~Bp) = Probability of getting a negative result given no blood pressure = 0.9
P(Pos) = Probability of getting a positive test result = [P(Bp) * Sensitivity] + [P(~Bp) * (1 - Specificity)]

3.3.2 Naive Bayes Classifier (Simplification)

The naive Bayes algorithm reduces the complexity of Bayes' theorem by assuming conditional independence over the training dataset. This assumption makes the Bayes algorithm naive.

Given n different attribute values, the likelihood can now be written as,

$P(X_1, \ldots, X_n \mid Y) = \prod_{i=1}^{n} P(X_i \mid Y)$

The Naive Bayes algorithm considers that a particular feature in a class is independent of, or not related to, the presence of any other feature.

For example, a fruit may be considered to be an apple if it is red, round, and about 3 inches in diameter. In this case all properties or features independently contribute to the probability that this fruit is an apple, and that is why it is known as 'Naive'. So in the above example, we are considering only one feature, that is the test result. Suppose we add another feature, 'exercise'.

Let's say this feature has a binary value of 0 and 1, where the former signifies that the individual exercises less than or equal to 2 days a week and the latter signifies that the individual exercises greater than or equal to 3 days a week.

If we had to use both of these features, namely the test result and the value of the 'exercise' feature, to compute our final probabilities, Bayes' theorem would fail. Naive Bayes' is an extension of Bayes' theorem that assumes that all the features are independent of each other.
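The blood-pressure example of Section 3.3.1 stops at the formula for P(Pos). The following sketch plugs in the stated values (prior 0.01, sensitivity 0.9, specificity 0.9) to obtain the posterior. The variable names are illustrative assumptions.

```python
# Values given in Section 3.3.1
p_bp = 0.01          # P(Bp): prior probability of blood pressure
sensitivity = 0.9    # P(Pos | Bp)
specificity = 0.9    # P(Neg | ~Bp)

p_not_bp = 1 - p_bp  # P(~Bp) = 0.99

# P(Pos) = P(Bp) * Sensitivity + P(~Bp) * (1 - Specificity)
p_pos = p_bp * sensitivity + p_not_bp * (1 - specificity)

# Bayes' theorem: P(Bp | Pos) = P(Bp) * P(Pos | Bp) / P(Pos)
p_bp_given_pos = p_bp * sensitivity / p_pos

print(f"P(Pos)      = {p_pos:.3f}")
print(f"P(Bp | Pos) = {p_bp_given_pos:.4f}")
```

Under these assumptions P(Pos) is about 0.108 and P(Bp|Pos) is about 0.083: even with a test that is correct 90% of the time, a positive result implies only roughly an 8% chance of having the condition, because the prior is so low.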
Advantages

It is easy and fast to predict the class of a test data set. It performs well in multi class prediction.

When the assumption of independence holds, a Naive Bayes classifier performs better compared to other models like logistic regression, and you need less training data.

It performs well in case of categorical input variables compared to numerical variable(s). For numerical variables, a normal distribution is assumed (bell curve, which is a strong assumption).

Disadvantages

If a categorical variable has a category in the test data set which was not observed in the training data set, then the model will assign a 0 (zero) probability. It will be unable to make a prediction. This is often known as "Zero Frequency". To solve this, one of the simplest techniques is called Laplace estimation.

The limitation of Naive Bayes is the assumption of independent predictors. In real life situations, it is not possible to get a set of predictors which are completely independent.

Applications of Naive Bayes Algorithms

Naive Bayes is used for making predictions in real time. It is very fast.

It is used for multi class prediction. It predicts the probability of multiple classes of the target variable.

Naive Bayes classifiers are mostly used in text classification (due to better results in multi class problems and the independence rule) and have a higher success rate compared to other algorithms. As a result, it is widely used in Spam filtering (identifying spam e-mail) and Sentiment Analysis (in social media analysis, to identify positive and negative customer sentiments).

Naive Bayes Classifier and Collaborative Filtering together build a Recommendation System. It uses machine learning and data mining techniques to predict whether a user would like a given resource or not.

Example of Naive Bayes Classifier

Sr. No   Age        Income   Student   Credit card performance   Class: Buys computer
1        <30        High     No        Fair                      No
2        <30        High     No        Excellent                 No
3        30 To 59   High     No        Fair                      Yes
4        >60        Medium   No        Fair                      Yes
5        >60        Low      Yes       Fair                      Yes
6        >60        Low      Yes       Excellent                 No
7        30 To 59   Low      Yes       Excellent                 Yes
8        <30        Medium   No        Fair                      No
9        <30        Low      Yes       Fair                      Yes
10       >60        Medium   Yes       Fair                      Yes
11       <30        Medium   Yes       Excellent                 Yes
12       30 To 59   Medium   No        Excellent                 Yes
13       30 To 59   High     Yes       Fair                      Yes
14       >60        Medium   No        Excellent                 No

Prediction

Let X = (age < 30, income = medium, student = yes, credit_rating = fair).

P(c1) = p(buys_computer = yes) = 9/14 = 0.643
P(c2) = p(buys_computer = no) = 5/14 = 0.357

P(age < 30 | buys_computer = yes) = (number of rows where age < 30 and buys_computer = yes) / (number of rows where buys_computer = yes)

P(age < 30 | buys_computer = yes) = 2/9 = 0.222
P(age < 30 | buys_computer = no) = 3/5 = 0.600
P(income = medium | buys_computer = yes) = 4/9 = 0.444
P(income = medium | buys_computer = no) = 2/5 = 0.400
P(student = yes | buys_computer = yes) = 6/9 = 0.667
P(student = yes | buys_computer = no) = 1/5 = 0.200
P(credit = fair | buys_computer = yes) = 6/9 = 0.667
P(credit = fair | buys_computer = no) = 2/5 = 0.400

To find p(X | buys_computer = yes):

p(X | buys_computer = yes) = p(age < 30 | buys_computer = yes) * p(income = medium | buys_computer = yes) * p(student = yes | buys_computer = yes) * p(credit rating = fair | buys_computer = yes)
= 0.222 * 0.444 * 0.667 * 0.667 = 0.044

Similarly, p(X | buys_computer = no) = 0.600 * 0.400 * 0.200 * 0.400 = 0.019. Since p(X | yes) * P(c1) = 0.044 * 0.643 = 0.028 is larger than p(X | no) * P(c2) = 0.019 * 0.357 = 0.007, the classifier predicts buys_computer = yes for X.

3.3.3 Bayesian Networks

Bayesian networks are a type of Probabilistic Graphical Model. They are used to build models from data and/or expert opinion.

They can be used for a wide range of tasks including time series prediction, decision under uncertainty, diagnostics, automated insight, anomaly detection and reasoning.

A Bayesian network consists of two main components. The first is an acyclic oriented graph where the nodes correspond to the predictive variables and the arcs indicate relationships of stochastic dependence. The variable X_j associated with node a_j in the network is dependent on the predecessor nodes of a_j.

The second component consists of the table associated with each variable X_j, which indicates the conditional distribution P(X_j | C_j), where C_j represents the set of explanatory variables associated with the predecessor nodes of node a_j in the network. It is estimated based on the relative frequencies in the dataset.

3.4 Logistic Regression

Logistic regression is used to:

Estimate the probability that an event occurs for a randomly selected observation versus the probability that the event does not occur.

Predict the effect of variables on a binary response variable.

Classify observations by estimating the probability that an observation is in a particular category.

Model the probability of an event occurring depending on the values of the independent variables, which can be categorical or numerical.
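Returning to the buys-computer example of Section 3.3.2, the hand calculation can be checked with a short sketch that counts relative frequencies directly from the 14-row table. The function name `naive_bayes_scores` and the tuple layout are assumptions; no Laplace estimation is applied, so an unseen category would produce a zero score, exactly as described under Disadvantages.

```python
from collections import Counter

# (age, income, student, credit, buys_computer) for the 14 rows of the table
rows = [
    ("<30", "High", "No", "Fair", "No"),
    ("<30", "High", "No", "Excellent", "No"),
    ("30 To 59", "High", "No", "Fair", "Yes"),
    (">60", "Medium", "No", "Fair", "Yes"),
    (">60", "Low", "Yes", "Fair", "Yes"),
    (">60", "Low", "Yes", "Excellent", "No"),
    ("30 To 59", "Low", "Yes", "Excellent", "Yes"),
    ("<30", "Medium", "No", "Fair", "No"),
    ("<30", "Low", "Yes", "Fair", "Yes"),
    (">60", "Medium", "Yes", "Fair", "Yes"),
    ("<30", "Medium", "Yes", "Excellent", "Yes"),
    ("30 To 59", "Medium", "No", "Excellent", "Yes"),
    ("30 To 59", "High", "Yes", "Fair", "Yes"),
    (">60", "Medium", "No", "Excellent", "No"),
]


def naive_bayes_scores(rows, x):
    """Return P(class) * product of P(x_k | class) for every class."""
    class_counts = Counter(row[-1] for row in rows)
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / len(rows)                      # prior P(class)
        for k, value in enumerate(x):
            matches = sum(1 for row in rows
                          if row[-1] == label and row[k] == value)
            score *= matches / n_label                   # P(x_k | class)
        scores[label] = score
    return scores


x = ("<30", "Medium", "Yes", "Fair")
scores = naive_bayes_scores(rows, x)
prediction = max(scores, key=scores.get)
print(scores, "->", prediction)
```

The scores are unnormalised (they are not divided by p(X)), which is sufficient for choosing the most probable class.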
Logistic regression is generally used where the dependent variable is binary. That means the dependent variable can take only two possible values such as "Yes or No", "Default or No Default", "Living or Dead", "Responder or Non Responder" etc. Independent factors or variables can be categorical or numerical variables.

Example of logistic regression

Example 1

If a credit card company is going to build a model to decide whether to issue a credit card to a customer, it will model whether the customer is going to "Default" or "Not Default" on this credit card. This is called "Default Propensity Modeling".

The probability of any event lies between 0 and 1 (or 0% to 100%). When we plot the probability of the dependent variable against the independent factors, it will demonstrate an 'S' shape curve.

Example 2

Suppose we have to predict the probability of a given candidate getting admission in a college of his or her choice from the score the candidate receives in the admission test. The dependent variable is binary: "Admission" or "No Admission".

Since the relationship between the Score and the Probability of Selection is not linear (it shows an 'S' shape), we cannot use a linear model to predict the probability of selection from a score. We need to do a Logit transformation of the dependent variable to make the correlation between the predictor and the dependent variable linear. Use a logistic regression model to predict the probability of getting the "Admission".

Fig. 3.4.1 : Graph for selection of college (x-axis: Score in entrance test, 200 to 800; y-axis: Probability of selection, 0% to 100%)

The above graph is called the Sigmoid function and it gives an S-shaped curve. Its value p lies in the range 0 < p < 1.

The logistic function is defined as:

$\text{Transformed} = \frac{1}{1 + e^{-x}}$

where e is the numerical constant Euler's number and x is the input we plug into the function. The logit expression may be expressed as,

$\log \frac{p(x)}{1 - p(x)}$

This expression is called the logit or log odds function. The odds signify the ratio of the probability of success to the probability of failure.

3.5 Neural Networks

A neural network comprises units (neurons) which are arranged in layers. It converts an input vector into some output. Each unit takes an input, applies a (often nonlinear) function to it and then passes the output on to the next layer. Generally the networks are defined to be feed-forward: a unit feeds its output to all the units on the next layer, but there is no feedback to the previous layer.

Weightings are applied to the signals which pass from one unit to another, and it is these weightings which are tuned in the training phase to adapt a neural network to the particular problem at hand.

3.5.1 The Rosenblatt Perceptron

Perceptrons were popularised by Frank Rosenblatt in the early 1960s. They appeared to have a very powerful learning algorithm. A perceptron is a neural network unit (an artificial neuron) which does certain computations to detect features or business intelligence in the input data.

It consists of a single neuron with adjustable synaptic weights and bias. It can be used to classify linearly separable patterns. A simple perceptron can be used to classify into two classes. A Perceptron is a supervised learning algorithm for binary classifiers. This algorithm enables neurons to learn, and it processes elements in the training set one at a time.

Fig. 3.5.1 : Perceptron (Inputs, weights, Net input function, Activation function, Output)

There are two types of Perceptrons: Single layer and Multilayer.

Single layer Perceptrons can learn only linearly separable patterns.

Multilayer Perceptrons, or feed forward neural networks with two or more layers, have greater processing power.

Perceptron Function

A Perceptron is a function that maps its input "x", which is multiplied with the learned weight coefficients, to an output value "f(x)".

$f(x) = \begin{cases} 1 & \text{if } w \cdot x + b > 0 \\ 0 & \text{otherwise} \end{cases}$

Where,

"w" = vector of real-valued weights.

"b" = bias (an element that adjusts the boundary away from the origin without any dependence on the input value).

"x" = vector of input x values.

$w \cdot x = \sum_{i=1}^{m} w_i x_i$

Where, "m" = number of inputs to the Perceptron.

The output can be represented as "1" or "0". It can also be represented as "1" or "-1" depending on which activation function is used.
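The perceptron function above can be written in a few lines. This is only a sketch of the forward computation (no learning rule); the weights and bias below are hypothetical values chosen so that the unit behaves like a logical AND, and are not taken from the text.

```python
def perceptron_output(weights, inputs, bias):
    """Return 1 if w . x + b > 0, otherwise 0."""
    net_input = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if net_input > 0 else 0


# Hypothetical weights and bias: the unit fires only when both inputs are 1.
and_weights, and_bias = [1.0, 1.0], -1.5

outputs = [perceptron_output(and_weights, [a, b], and_bias)
           for a in (0, 1) for b in (0, 1)]
print(outputs)  # inputs (0,0), (0,1), (1,0), (1,1)
```

AND is linearly separable, which is why a single-layer perceptron can represent it; a function such as XOR would require a multilayer network.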
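Returning to the logistic function of Section 3.4, the following sketch shows the sigmoid and the logit and checks numerically that one undoes the other. The function names are assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x)); the result lies strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Log odds log(p / (1 - p)) for a probability 0 < p < 1."""
    return math.log(p / (1.0 - p))


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    p = sigmoid(x)
    print(f"x = {x:+.1f}  p = {p:.4f}  logit(p) = {logit(p):+.4f}")
```

Because the logit maps the S-shaped probability back onto a straight line in x, a linear model can be fitted on the log-odds scale.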
Fig. 3.6.9 (n = 2, m = 4, where m > n + 1 = 3)

Fig. 3.6.5

For the maximum margin hyperplane, only examples on the margin matter (only these affect the distances). These are called support vectors.
Definition

Define the hyperplanes H such that:

$w \cdot x_i + b \geq +1$, when $y_i = +1$.
$w \cdot x_i + b \leq -1$, when $y_i = -1$.

H_1 and H_2 are the planes:

$H_1: w \cdot x_i + b = +1$
$H_2: w \cdot x_i + b = -1$

The points on the planes H_1 and H_2 are the tips of the support vectors.

The plane H_0 is the median in between, where $w \cdot x_i + b = 0$.

d+ = the shortest distance to the closest positive point.
d- = the shortest distance to the closest negative point.

The margin (gutter) of a separating hyperplane is d+ + d-.

3.6.3 Nonlinear Separation

Nonlinear Classification: Classes may not be separable by a linear boundary.

Kernels: Make linear models work in nonlinear settings by mapping data to higher dimensions where it exhibits linear patterns.

The simplest way to separate two groups of data is with a straight line, a flat plane or an N-dimensional hyperplane. However, there are situations where a nonlinear region can separate the groups more efficiently.

SVM handles this by using a (nonlinear) kernel function to map the data into a different space where a (linear) hyperplane can be used to do the separation.

It means a nonlinear function is learned by a linear learning machine in a high dimensional feature space, while the capacity of the system is controlled by a parameter that does not depend on the dimensionality of the space.

This is called the kernel trick, which means the kernel function transforms the data into a higher dimensional feature space to make it possible to perform the linear separation.

The kernel function maps the data into a new space and takes the inner product of the new vectors. The image of the inner product of the data is the inner product of the images of the data. Two kernel functions are shown below:

Polynomial kernel

Gaussian kernels

3.7 Clustering

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters).

3.7.1 Clustering Methods

Clustering methods must satisfy a few general necessities, as indicated below.

Clustering Methods Necessities

1. Flexibility
2. Robustness
3. Efficiency

2. Robustness

The robustness of an algorithm is the stability of the clusters generated with respect to small changes in the values of the attributes of each observation.

3. Efficiency

In some applications there is a large number of observations. In such cases clustering algorithms must generate clusters efficiently in order to guarantee reasonable computing times for large problems.

3.7.2 Taxonomy of Clustering Methods

The different types of clustering based on the logic are partition methods, hierarchical methods, density based methods and grid methods.

Types of Clustering

1. Partition methods
2. Hierarchical methods
3. Density based methods
4. Grid methods
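The formulas for the polynomial and Gaussian kernels named in Section 3.6.3 are not legible in the text, so the sketch below uses their commonly quoted forms, (u . v + c)^d and exp(-||u - v||^2 / (2 sigma^2)). These forms, the parameter names `degree`, `c` and `sigma`, and the explicit feature map `phi` are assumptions for illustration. The last part checks the kernel-trick statement that the kernel value equals an inner product of mapped vectors, for the degree-2 case with c = 0.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, degree=2, c=1.0):
    """Assumed form: (u . v + c) ** degree."""
    return (dot(u, v) + c) ** degree


def gaussian_kernel(u, v, sigma=1.0):
    """Assumed form: exp(-||u - v||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def phi(x):
    """Explicit feature map for the 2-D polynomial kernel with degree=2, c=0."""
    x1, x2 = x
    return (x1 ** 2, math.sqrt(2) * x1 * x2, x2 ** 2)


u, v = (1.0, 2.0), (3.0, 4.0)
kernel_value = polynomial_kernel(u, v, degree=2, c=0.0)
mapped_value = dot(phi(u), phi(v))
print(kernel_value, mapped_value, gaussian_kernel(u, v))
```

The kernel is evaluated in the original 2-D space, yet it equals the inner product in the 3-D mapped space, so the mapping never has to be computed explicitly.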
1. Partition methods

Partition methods divide the given dataset into a predetermined number K of non-empty subsets.

4. Grid methods

Grid methods obtain a grid structure consisting of cells. The grid structure is used to reduce computing times, despite a lower accuracy in the clusters generated.

Given two p-dimensional data objects $i = (x_{i1}, x_{i2}, \ldots, x_{ip})$ and $j = (x_{j1}, x_{j2}, \ldots, x_{jp})$, the following common distance functions can be defined:

1. Euclidean Distance Function
2. Manhattan Distance Function

1. Euclidean Distance Function

$d(i, j) = \sqrt{\sum_{k=1}^{p} (x_{ik} - x_{jk})^2}$

2. Manhattan Distance Function

$d(i, j) = \sum_{k=1}^{p} |x_{ik} - x_{jk}|$

Distances are always non-negative numbers. In the Euclidean distance function, attributes with larger scales of measurement may overwhelm attributes measured on a smaller scale. To prevent this problem, the attribute values are often normalized to lie between 0 and 1.

A third option generalizes both the Euclidean and Manhattan metrics. The Minkowski distance is defined as,

$d(i, j) = \left( \sum_{k=1}^{p} |x_{ik} - x_{jk}|^q \right)^{1/q}$

Manhattan Distance

The distance between two points is the sum of the (absolute) differences of their coordinates. E.g. it counts 1 unit for a straight move, and it counts a cost of 2 if one takes a diagonal move. In chess, the distance between squares on the chessboard for rooks is measured in Manhattan distance.

3.7.4 Attribute

An attribute is a data field which represents a characteristic or feature of a data object. The nouns attribute, dimension, feature, and variable are used interchangeably in the literature.

In data warehousing, attributes are referred to as dimensions. In machine learning literature an attribute is referred to as a feature, while statisticians use the term variable.

Data mining and database professionals commonly use the term attribute. Attributes describing a customer object can include, for example, customer ID, name, and address.

A univariate distribution involves only one attribute. The distribution of data having two attributes is known as bivariate.

The type of an attribute is determined by the set of possible values the attribute can have. Attributes can be nominal, binary, ordinal, or numeric. In the following subsections, we introduce each type.
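The Euclidean, Manhattan and Minkowski distance functions discussed earlier in this section can be sketched as below. The function names are illustrative; the parameter `q` is the Minkowski order, with q = 1 giving the Manhattan distance and q = 2 the Euclidean distance.

```python
import math


def euclidean(i, j):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(i, j)))


def manhattan(i, j):
    return sum(abs(a - b) for a, b in zip(i, j))


def minkowski(i, j, q):
    """Generalizes both: q = 1 is Manhattan, q = 2 is Euclidean."""
    return sum(abs(a - b) ** q for a, b in zip(i, j)) ** (1.0 / q)


i, j = (0.0, 0.0), (3.0, 4.0)
print(euclidean(i, j))      # straight-line distance
print(manhattan(i, j))      # sum of coordinate differences
print(minkowski(i, j, 3))   # an intermediate example with q = 3
```

If the attributes are on very different scales, normalizing each attribute to the range 0 to 1 before calling these functions avoids the dominance effect mentioned in the text.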
Fig. 3.7.5 : Attribute types (1. Binary, 2. Nominal, 3. Ordinal, 4. Mixed Composition)

1. Binary Attributes

A nominal attribute with only two categories is treated as a binary attribute. It has two categories or states, 0 or 1: 0 means the attribute is absent and 1 means it is present. Binary attributes are referred to as Boolean if the two states correspond to true and false. E.g. for Smoker describing a patient object, 1 indicates that the patient smokes, while 0 indicates that the patient does not.

A similarity measure for two objects, i and j, will typically return the value 0 if the objects are unalike. The higher the similarity value, the greater the similarity between objects. (Typically, a value of 1 indicates complete similarity, that is, that the objects are identical.)

A dissimilarity measure works the opposite way. It returns a value of 0 if the objects are the same (and therefore, far from being dissimilar). The higher the dissimilarity value, the more dissimilar the two objects are.

A nominal attribute can take on two or more states. For example, flower color is a nominal attribute that may have, say, five states: red, yellow, green, pink, and blue.

Let the number of states of a nominal attribute be M. The states can be denoted by letters, symbols, or a set of integers, such as 1, 2, ..., M. The dissimilarity between two objects i and j can be computed based on the ratio of mismatches:

$d(i, j) = \frac{p - m}{p}$

Where m is the number of matches (i.e., the number of attributes for which i and j are in the same state), and p is the total number of attributes describing the objects. Weights can be assigned to increase the effect of m or to assign greater weight to the matches in attributes having a larger number of states.

There is another approach which involves computing a dissimilarity matrix from the given binary data.

Table 3.7.1 : A contingency table for binary attributes

                 Object j
                 1       0       sum
Object i   1     q       r       q + r
           0     s       t       s + t
           sum   q + s   r + t   p

Where q is the number of attributes that equal 1 for both objects i and j, r is the number of attributes that equal 1 for object i but that are 0 for object j, s is the number of attributes that equal 0 for object i but equal 1 for object j, and t is the number of attributes that equal 0 for both objects i and j. The total number of attributes is p, where p = q + r + s + t. Recall that for symmetric binary attributes, each state is equally valuable.

$d(i, j) = \frac{r + s}{q + r + s + t}$

The above equation states the degree of dissimilarity between pairs (i, j) of observations through the counts of the contingency table. Now assume that all n attributes are binary and asymmetric. In such a case, for a pair of asymmetric attributes it is more interesting to match positives, that is, records possessing the property relative to each attribute. For such binary variables, the Jaccard coefficient is therefore used:

$d(i, j) = \frac{r + s}{q + r + s}$

2. Nominal Attribute

Nominal means "relating to names". The values of a nominal attribute are symbols or names of things. Each value denotes some kind of category, code, or state. Nominal attributes are also referred to as categorical. In computer science, the values are also known as enumerations.

Suppose that Hair color and Marital status are two attributes describing person objects. In our application, possible values for Hair color are black, brown, blond, red, auburn, grey, and white.

A nominal attribute is a symmetric attribute where the number of values is greater than 2. We use the similarity coefficient in extended form,

$dist(i, j) = \frac{n - f}{n}$

Where f is the number of attributes in which observations i and j take the same value.

3. Ordinal Attribute

An ordinal attribute has possible values with a meaningful order or ranking among them, but the magnitude between consecutive values is not known.

Suppose that Drink size corresponds to the size of drinks available at a restaurant. This ordinal attribute has three possible values: small, medium, and large. However, we cannot tell from the values how much bigger, say, a large is than a medium. An ordinal variable can be discrete or continuous. Order is important, and the variable can be treated like an interval scaled one:

Replace each ordinal value by its rank $r \in \{1, \ldots, M\}$.

Map the range of the variable onto [0, 1].

4. Mixed Composition attribute

A dataset may contain all attribute types: nominal, ordinal, symmetric binary, asymmetric binary etc. To define an overall affinity measure between observations i and j, one can use a weighted formula as follows,

$d(i, j) = \frac{\sum_{f=1}^{p} \delta_{ij}^{(f)} \, d_{ij}^{(f)}}{\sum_{f=1}^{p} \delta_{ij}^{(f)}}$

where $d_{ij}^{(f)}$ is the dissimilarity between i and j on attribute f and $\delta_{ij}^{(f)}$ is the weight given to attribute f.
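The two binary dissimilarity measures above (symmetric, and the Jaccard form for asymmetric attributes) can be computed from the contingency counts q, r, s and t. The following is a small sketch; the two example vectors are hypothetical and the function names are assumptions.

```python
def contingency_counts(i, j):
    """Return (q, r, s, t) for two equal-length 0/1 vectors."""
    q = sum(1 for a, b in zip(i, j) if a == 1 and b == 1)
    r = sum(1 for a, b in zip(i, j) if a == 1 and b == 0)
    s = sum(1 for a, b in zip(i, j) if a == 0 and b == 1)
    t = sum(1 for a, b in zip(i, j) if a == 0 and b == 0)
    return q, r, s, t


def symmetric_dissimilarity(i, j):
    q, r, s, t = contingency_counts(i, j)
    return (r + s) / (q + r + s + t)


def jaccard_dissimilarity(i, j):
    """Asymmetric case: joint absences (t) are ignored.
    Assumes at least one attribute is 1 in i or j."""
    q, r, s, _ = contingency_counts(i, j)
    return (r + s) / (q + r + s)


# Hypothetical binary descriptions of two objects
obj_i = [1, 0, 1, 0, 0, 0]
obj_j = [1, 0, 1, 0, 1, 0]
print(contingency_counts(obj_i, obj_j))
print(symmetric_dissimilarity(obj_i, obj_j))
print(jaccard_dissimilarity(obj_i, obj_j))
```

The Jaccard value is larger here because the three attributes that are 0 in both objects no longer count as agreement.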
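3.8 Partition Methods

Partition methods are heuristic in nature. They are based on greedy methods, in which a locally best choice is made at each step.

Steps of k-means Algorithm

1. Choose k clusters arbitrarily.
2. Initialize cluster centres with those k clusters.
3. Loop:
(a) Partition by assigning or reassigning all data objects to their closest cluster center.
(b) Compute new cluster centers as the mean value of the objects in each cluster.
(c) Repeat until there is no change in the cluster center calculation.

Example

Individual   x     y
1            1.0   1.0
2            1.5   2.0
3            3.0   4.0
4            5.0   7.0
5            3.5   5.0
6            4.5   5.0
7            3.5   4.5

Individuals 1 and 4 are taken as the initial centers, m_1 = (1.0, 1.0) and m_2 = (5.0, 7.0). The distance from a center m to a point (x, y) is

$d(m, (x, y)) = \sqrt{(m_x - x)^2 + (m_y - y)^2}$

For individual 2:

$d(m_1, 2) = \sqrt{|1.0 - 1.5|^2 + |1.0 - 2.0|^2} = 1.12$
$d(m_2, 2) = \sqrt{|5.0 - 1.5|^2 + |7.0 - 2.0|^2} = 6.10$

Individual   Distance to m_1   Distance to m_2
1            0.00              7.21
2            1.12              6.10
3            3.61              3.61
4            7.21              0.00
5            4.72              2.50
6            5.31              2.06
7            4.30              2.92

Individuals 1, 2 and 3 form cluster 1 and individuals 4, 5, 6 and 7 form cluster 2. The new centers are:

L_1 = (1/3 (1.0 + 1.5 + 3.0), 1/3 (1.0 + 2.0 + 4.0)) = (1.83, 2.33) = cluster 1
L_2 = (1/4 (5.0 + 3.5 + 4.5 + 3.5), 1/4 (7.0 + 5.0 + 5.0 + 4.5)) = (4.12, 5.38) = cluster 2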
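The k-means steps listed above can be sketched in plain Python and run on the seven individuals of the example, starting from individuals 1 and 4 as initial centers. This is an illustrative sketch, not the text's own implementation: the function name `k_means`, the `max_iter` guard, and the tie-breaking rule (a point equidistant from two centers goes to the first one, which is how individual 3 initially lands in cluster 1) are assumptions. It requires Python 3.8 or later for `math.dist`.

```python
import math


def k_means(points, centers, max_iter=100):
    """Batch k-means: returns (cluster label per point, final centers)."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # (a) assign every object to its closest cluster center
        new_labels = [
            min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:      # (c) stop when nothing changes
            break
        labels = new_labels
        # (b) recompute each center as the mean of its members
        for c in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centers


points = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0),
          (3.5, 5.0), (4.5, 5.0), (3.5, 4.5)]
labels, centers = k_means(points, centers=[points[0], points[3]])
print(labels)
print(centers)
```

With these starting centers the sketch first produces the clusters {1, 2, 3} and {4, 5, 6, 7}, then moves individual 3 to the second cluster on the next pass and stops.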
We are still not sure that each individual has been assigned to the right cluster. So, we compare each individual's distance to its own cluster mean and to that of the opposite cluster. And we find:

Individual   Distance to mean (centroid) of Cluster 1   Distance to mean (centroid) of Cluster 2
1            1.5                                         5.4
2            0.4                                         4.3
3            2.1                                         1.8
4            5.7                                         1.8
5            3.2                                         0.7
6            3.8                                         0.6
7            2.8                                         1.1

Only individual 3 is closer to the mean of the opposite cluster (Cluster 2) than to its own (Cluster 1). In other words, each individual's distance to its own cluster mean should be smaller than the distance to the other cluster's mean (which is not the case with individual 3). Thus, individual 3 is relocated to Cluster 2, resulting in the new partition:

            Individual      Mean Vector (centroid)
Cluster 1   1, 2            (1.3, 1.5)
Cluster 2   3, 4, 5, 6, 7   (3.9, 5.1)

3.8.2 K-medoids Algorithm

K-means tries to minimize the total squared error, while k-medoids minimizes the sum of dissimilarities between points labelled to be in a cluster and a point designated as the center of that cluster. In contrast to the k-means algorithm, k-medoids chooses datapoints as centers.

Instead of taking the mean value of the objects in a cluster as a reference point, medoids can be used. A medoid is the most centrally located object in a cluster.

K-medoids is also called the Partitioning Around Medoids (PAM) algorithm:

1. Initialize: arbitrarily select k out of the n data points as the medoids.
2. All the items from the input data set are examined one by one to see whether they are medoids or not.
3. Compute $E = \sum_{i=1}^{k} \sum_{p \in C_i} d(p, m_i)^2$
4. Choose the minimum swapping cost.

Four Swapping Cases

When a medoid m is to be swapped with a non-medoid object h, check each of the other non-medoid objects j.

j is in the cluster of m: reassign j.

Case 1: j is closer to some k than to h; after swapping m and h, j relocates to the cluster represented by k.
$C_{jmh} = d(j, k) - d(j, m) \geq 0$

Case 2: j is closer to h than to k; after swapping m and h, j is in the cluster represented by h.
$C_{jmh} = d(j, h) - d(j, m)$

j is in the cluster of some k, not m: compare k with h.

Case 3: j is closer to some k than to h; after swapping m and h, j remains in the cluster represented by k.
$C_{jmh} = d(j, k) - d(j, k) = 0$

Case 4: j is closer to h than to k; after swapping m and h, j is in the cluster represented by h.
$C_{jmh} = d(j, h) - d(j, k) < 0$

The K-medoids algorithm requires a large number of iterations and is not suited to deriving clusters for large datasets.

3.9 Hierarchical Methods

Hierarchical clustering generates a hierarchy of clusters. There is no need to specify k, and it is more deterministic. The graphical representation of the resulting hierarchy is a tree-structured graph called a dendrogram.

In order to calculate the distance between two clusters, the hierarchical algorithms resort to one of five alternative measures: minimum distance, maximum distance, mean distance, distance between centroids, and Ward distance.

Types of Hierarchical Clustering

1. Single Linkage
2. Complete Linkage
3. Average Linkage

1. Single Linkage

$L(r, s) = \min(D(x_{ri}, x_{sj}))$

Fig. 3.9.2
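The three linkage types listed above can be sketched as functions of two clusters of points. Single linkage follows the min formula given here; complete and average linkage are written with the usual max and mean of the pairwise distances. The function names and the two small example clusters are assumptions for illustration (Python 3.8 or later is assumed for `math.dist`).

```python
import math
from itertools import product


def pairwise_distances(r, s):
    """All Euclidean distances D(x_ri, x_sj) between points of clusters r and s."""
    return [math.dist(a, b) for a, b in product(r, s)]


def single_linkage(r, s):
    return min(pairwise_distances(r, s))      # two closest points


def complete_linkage(r, s):
    return max(pairwise_distances(r, s))      # two furthest points


def average_linkage(r, s):
    distances = pairwise_distances(r, s)
    return sum(distances) / len(distances)    # mean over all n_r * n_s pairs


# Hypothetical clusters
cluster_r = [(0.0, 0.0), (0.0, 1.0)]
cluster_s = [(3.0, 0.0), (4.0, 0.0)]
print(single_linkage(cluster_r, cluster_s))
print(complete_linkage(cluster_r, cluster_s))
print(average_linkage(cluster_r, cluster_s))
```

For any pair of clusters the single-linkage value is never larger than the average-linkage value, which in turn is never larger than the complete-linkage value.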
2. Complete Linkage

In complete linkage hierarchical clustering, the distance between two clusters is defined as the longest distance between two points in each cluster. For example, the distance between clusters "r" and "s" in the figure is equal to the length of the arrow between their two furthest points.

$L(r, s) = \max(D(x_{ri}, x_{sj}))$

Fig. 3.9.3

3. Average Linkage

In average linkage hierarchical clustering, the distance between two clusters is defined as the average distance between each point in one cluster and every point in the other cluster. For example, the distance between clusters "r" and "s" in the figure is equal to the average length of the arrows connecting the points of one cluster to the other.

$L(r, s) = \frac{1}{n_r n_s} \sum_{i=1}^{n_r} \sum_{j=1}^{n_s} D(x_{ri}, x_{sj})$

In the agglomerative or bottom-up clustering method we assign each observation to its own cluster. Then:

Step 1: Calculate the similarity (e.g., distance) between each of the clusters and join the two most similar clusters.
Step 2: Find the nearest (most similar) pair of clusters and merge them into a single cluster, so that now you have one less cluster.
Step 3: Compute distances (similarities) between the new cluster and each of the old clusters.
Step 4: Repeat steps 2 and 3 until all items are clustered into a single cluster of size N.

In the divisive or top-down clustering method we allocate all of the observations to a single cluster. We partition the cluster into the two least similar clusters. Finally, we proceed repetitively on each cluster until there is one cluster for each observation. There is evidence that divisive algorithms produce more accurate hierarchies than agglomerative algorithms in some circumstances, but they are conceptually more complex.

In divisive hierarchical clustering, a top down approach is used. It starts with all objects in one cluster. Clusters are subdivided into smaller and smaller clusters until each object forms a cluster on its own or a certain termination condition is satisfied.

A cluster is split according to some principle, e.g., the maximum Euclidean distance between the closest neighbouring objects in the cluster. Start with a single cluster at the top of the tree and continue splitting it into smaller and smaller clusters till the bottom is reached, where there are n clusters with one member each. A dendrogram is a tree data structure which illustrates hierarchical clustering techniques.
nns i=1 i=1 Each level shows clusters for that level. Leaf- indívidual cluster, Root- one cluster. A cluster at level ii is the Union
Fig. 3.9.4 of its children clusters at level i + 1.
Ward distance
The Ward distance, basedon the analysis of the variance of the Euclidean distances between the observations
Methods based on the Ward distance tend to generate a large number of clusters, each containing a few observatios
Centroid Method
In centroid method, distance between the two mean vectors of the clusters is consider as the distance betweer
two clusters. At each stage of the process we combine the two clusters that have the smallest centroid distanca
A B
Hierarchical methods can be subdivided into two main groups: agglomerative and divisive methods. E
Fig. 3.9.5
3.9.1 Agglomerative and Divisive Hierarchical Methods
Agglomerative method is bottom up clustering. Suppose there is set of N To measure of performance of a clustering method, one need to verify the clusters generated correspond to an
(similarities) between the clusters equal the distances (similarities) between observations.
the
Calculate the distances actual regular pattern in the data. It is appropriate to apply other clustering algorithms and to compare the
most similar clusters. items they contain. Join the t results obtained by different methods.
In this way it is also possible to evaluate if the number of identifñed clusters is robust with respect to the different
techniques applied.
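The single, complete and average linkage measures of Section 3.9 and the agglomerative merge steps can be sketched in a few lines. This is a hedged illustration only: the function names and the sample points are assumptions, and the pairwise search is deliberately naive rather than efficient.

```python
import math


def single_linkage(r, s):
    """Shortest distance between a point of r and a point of s."""
    return min(math.dist(a, b) for a in r for b in s)


def complete_linkage(r, s):
    """Longest distance between a point of r and a point of s."""
    return max(math.dist(a, b) for a in r for b in s)


def average_linkage(r, s):
    """Average distance over all pairs (one point from r, one from s)."""
    return sum(math.dist(a, b) for a in r for b in s) / (len(r) * len(s))


def agglomerate(points, linkage, target_clusters=1):
    """Start with one cluster per point and repeatedly merge the closest pair."""
    clusters = [[p] for p in points]
    while len(clusters) > target_clusters:
        pairs = ((a, b) for a in range(len(clusters)) for b in range(a + 1, len(clusters)))
        i, j = min(pairs, key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)] + [merged]
    return clusters


# Hypothetical sample data.
print(agglomerate([(0, 0), (0, 1), (5, 5), (5, 6)], single_linkage, target_clusters=2))
```

Swapping `single_linkage` for `complete_linkage` or `average_linkage` changes only how the distance between two clusters is measured; the merge loop stays the same.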
Cluster cohesion: Measures how closely related the objects in a cluster are.
Cluster separation: Measures how distinct or well separated a cluster is from the other clusters.
Let $X = \{X_1, X_2, \dots, X_K\}$ be the set of K clusters generated.
Cohesion is defined as
$$\mathrm{coh}(X_h) = \sum_{x_i, x_j \in X_h} \mathrm{dist}(x_i, x_j)$$
Separation between a pair of clusters is defined as,
$$\mathrm{sep}(X_h, X_f) = \sum_{x_i \in X_h,\; x_j \in X_f} \mathrm{dist}(x_i, x_j)$$
Silhouette refers to a method of interpretation and validation of the consistency of clusters of data. The silhouette value is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation).
The coefficient value ranges from -1 to +1. A high value indicates that the object is well matched with its own cluster and poorly matched with neighbouring clusters. Silhouette can be calculated with any distance metric, such as Euclidean or Manhattan distance.

Review Questions
Q. 9 Explain the ROC curve chart. (Refer Section 3.2.5) (5 Marks)
Q. 10 Explain the Cumulative gain and lift chart. (Refer Section 3.2.6) (5 Marks)
Q. 11 Write short note on Bayesian methods. (Refer Section 3.3) (4 Marks)
Q. 12 Explain naive Bayes classifier with example. (Refer Section 3.3.2) (5 Marks)
Q. 13 What is Bayesian networks? (Refer Section 3.3.3) (4 Marks)
Q. 14 Write short note on logistic regression. (Refer Section 3.4) (5 Marks)
Q. 15 Write short note on neural network. (Refer Section 3.5) (5 Marks)
Q. 16 Write short note on support vector machine. (Refer Section 3.6) (5 Marks)
Q. 17 What are the characteristics of clustering method? (Refer Section 3.7.1) (4 Marks)
Q. 18 What is taxonomy of clustering method? (Refer Section 3.7.3) (4 Marks)
Q. 19 Write short note on Binary attribute. (Refer Section 3.7.5(1)) (8 Marks)
Q. 20 Write short note on Nominal attribute. (Refer Section 3.7.5(2)) (4 Marks)
Q. 21 Write short note on Ordinal attribute. (Refer Section 3.7.5(3)) (4 Marks)
Q. 22 Explain K-means method. (Refer Section 3.8.1) (4 Marks)
Q. 23 Explain K-medoids algorithm. (Refer Section 3.8.2) (5 Marks)
Q. 24 Explain single linkage, complete linkage, average linkage and Ward distance. (Refer Section 3.9) (5 Marks)
Q. 25 How one evaluates clustering model? (Refer Section 3.10) (5 Marks)
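The silhouette coefficient mentioned in the cluster-evaluation discussion of this chapter can be computed directly from pairwise distances. The sketch below uses the commonly used definition s = (b - a) / max(a, b), where a is the mean distance to the other members of the object's own cluster and b is the smallest mean distance to another cluster. The function name, the sample data and the choice of Euclidean distance are assumptions, and at least two clusters of distinct points are expected.

```python
import math


def silhouette_values(clusters):
    """Return {point: silhouette} for a list of clusters (each a list of tuples)."""
    values = {}
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            if len(cluster) == 1:
                values[p] = 0.0  # common convention for singleton clusters
                continue
            # a: cohesion, mean distance to the other members of the own cluster
            # (the distance of p to itself is 0, so it does not affect the sum).
            a = sum(math.dist(p, q) for q in cluster) / (len(cluster) - 1)
            # b: separation, smallest mean distance to any other cluster.
            b = min(
                sum(math.dist(p, q) for q in other) / len(other)
                for cj, other in enumerate(clusters)
                if cj != ci
            )
            values[p] = (b - a) / max(a, b)
    return values


# Hypothetical sample data: a sensible clustering and a deliberately bad one.
good = silhouette_values([[(0, 0), (0, 1)], [(10, 0), (10, 1)]])
bad = silhouette_values([[(0, 0), (10, 1)], [(10, 0), (0, 1)]])
print(good[(0, 0)], bad[(0, 0)])
```

Values close to +1 appear for the sensible clustering, while the mixed-up assignment gives negative values, matching the interpretation given in the text.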
4 Management Information System

Syllabus
Management Information System (MIS): Classification and Quality of Information, Marketing models: Relational marketing, Sales force management, Logistic and production models: Supply chain optimization, Optimization models for logistics planning, Revenue management systems. Data envelopment analysis, The CCR model, Identification of good operating practices

4.1 Management Information System (MIS)
Management Information System (MIS) serves as a foundation for Business Intelligence (BI) by providing structured data and information to support decision-making, analysis, and reporting. BI enhances MIS by leveraging advanced tools and technologies for data analysis, visualization, and predictive insights.

4.1.1 Classification of Information
In the context of Business Intelligence, the classification of information can be tailored to suit specific business needs. Here's how information in BI can be classified:
1. Based on Decision-Making Level
Strategic Information : Supports high-level decision-making by analyzing market trends, competitor performance, and long-term forecasts.
Tactical Information : Aids middle-level managers in optimizing business processes, managing resources, and improving performance.
Operational Information : Provides granular, real-time data to support daily tasks and immediate decisions, such as inventory levels or customer support metrics.
2. Based on Data Source
Internal Information : Derived from within the organization, such as sales reports, financial statements, and employee performance data.
External Information : Collected from external sources like market research, social media, government reports, and industry benchmarks.
3. Based on Data Characteristics
Structured Data : Organized into rows, columns, and predefined formats, such as data in relational databases.
Unstructured Data : Includes emails, social media posts, images, videos, and other formats that require advanced analytics.
Semi-Structured Data : Data with some level of organization, like JSON or XML files.

Descriptive Information : Explains what has happened using dashboards, reports, and basic analysis.
Diagnostic Information : Analyzes why something happened by identifying patterns and correlations.
Predictive Information : Provides forecasts based on historical data using predictive modeling and machine learning.
Prescriptive Information : Suggests actionable steps to optimize outcomes through advanced simulations and algorithms.
Based on Frequency
Static Information : Data that does not change frequently, like archived financial reports.
Dynamic Information : Continuously updated data, such as real-time stock prices or website traffic analytics.

4.1.2 Quality of Information
The quality of information in BI is critical for making informed decisions. BI tools focus on delivering high quality data with the following attributes:
1. Accuracy : Ensures that the data reflects real-world facts without errors or inconsistencies.
2. Timeliness : Provides up-to-date information for real-time or near real-time decision-making.
3. Relevance : Delivers information tailored to the specific business questions and user roles.
4. Completeness : Ensures no critical data is missing, providing a full picture of the situation.
5. Consistency : Harmonizes data across multiple sources to avoid contradictions in reports and analysis.
6. Clarity : Presents information in an intuitive and understandable format, such as through visualizations, dashboards, or summary reports.
7. Accessibility : Makes data readily available to authorized users through BI platforms or self-service tools.
8. Actionability : Provides insights that can directly influence decision-making and strategy formulation.
9. Scalability : Ensures that the data infrastructure can handle increasing amounts of information as the business grows.

4.1.3 Role of MIS in Business Intelligence
MIS lays the groundwork for BI by ensuring that:
Data is systematically collected, stored, and processed.
Information flows efficiently across organizational levels.
A single version of truth is maintained, reducing data silos.
Decision-makers can rely on consistent, accurate, and actionable insights.
By integrating MIS with BI tools, organizations can leverage their data to gain a competitive edge, improve performance, and drive innovation. The synergy between MIS and BI transforms raw data into strategic assets.
4.2 Marketing Models : Relational Marketing
Let's understand relational marketing with an example. Most of us have noticed that whenever a mobile company is about to launch a new device into the market, a survey is done by the company so that they get the feedback and the different opinions of their customers, which helps them to enhance the functionality provided by that device.
And it is not only about a mobile phone. When you visit a restaurant, the waiters give feedback forms along with the bills, wherein the customers have to rate the restaurant in different aspects so that they improve themselves. Almost all the companies study the behaviour of the customers and the feedback given by them.

[Figure: Relational marketing at the centre, connected to Products, Services, Distribution channels, Segments and Sales processes.]
Fig. 4.2.3 represents the different people involved in the relational marketing strategy, where all the nodes are interconnected to each other.
[Figure: customer life cycle over time, with the phases Selection of prospects, Acquisition, Cross-selling, Up-selling, Retention and Losses, supported by Business Intelligence and data mining.]

It also shows the different actions that can be started by any company throughout the time taken for a customer. In the starting phase, the individual is a prospect, also known as a potential customer, who has not yet started purchasing the product or using the services provided by the company.
For these customers, acquisition actions are carried out in both direct and indirect fashion. In direct acquisition, the customer is given information about the product or service via calls, emails, oral talks with the agents of the company and so on.
In indirect acquisition, information about the product is displayed on the company's website or dashboard, highlighting new information and advertising of the products or services. These actions include a cost, which will be assigned [...]
[...] the customer can be, or may be partially, using the products or services of the company; the other case would be that the customer did not require these services, or has switched from your company to the competitors, who are also hunting for better customers.
Once the company has identified the prospects, it is important to assign marketing strategies and campaigns of various levels, along with the profitability to both the company and the prospects. The campaign is based on the marketing resources available with the company.
Traditional marketing strategies were the advertising of the products and services that are provided to the public, in order to enhance the quality of acquisition. Data taken from the earlier [...] is fed into a data mart to derive classification rules, which provide characteristics for the profiles [...]

4.2.6 Retention
As most products and services have reached the maturity stage and the market is saturated, competition amongst companies has increased. A negative side effect is that a company expands its customer base mostly at the cost of customers taken from other companies, an acquisition mechanism which is common in industries such as telecommunication and so on.
Due to this, many companies invest more resources to analyze and characterize the attributes for which customers switch from their company to another.
The other reason could be the attractive offers given by a competing company to grab the attention of prospects, which brings the market strategies of the company down.
Also, there can be various reasons why the customer would not find the charge worth paying for the service provided by the company, and thus hunts for an alternative one and switches to it.
There are various other aspects that affect the retention of customers for the products and services provided by the company, and thus the company has to be keen about the same.

For example, assume a mobile shop where there is an offer that if the customer buys a smart phone, he or she can pay an extra Rs. 100 to get an annual subscription of Netflix along with the smart phone. There is no compulsion that every customer purchasing a smart phone would be interested in the subscription, and due to this the mobile provider gets a classification of the customers who are interested and the people who are not interested in the offer. If the number of interested customers is higher, the shop owner will have to get more services from Netflix.
This demographic information about the customers can be fed into a data mart and used as explanatory attributes to develop a classification model, which will help to develop various offers for the forthcoming period and to predict how customers would react to them.

[...] the customers that were seethed in December month. For testing the model, the data from t-2 should not be used because that is the training period of the model.
[Figure: time axis showing the most recent data (period t).]

[...] are only done to the customers holding a debit card and not to those holding a credit card. So this defines a margin for acquisition: to call only those customers holding a debit card.
This can also be stated as up-selling, where the customer is informed and asked to own the products or services which are one level higher than the existing ones and will have more features and availability.

4.2.8 Market Basket Analysis
The main objective of market basket analysis is to get an exact view of what products the customers are purchasing, so that the company gets the required knowledge to organize and plan its marketing strategies.
It is usually used to analyze what kind of product is sold more on e-commerce sites or in retail industries.
It can also be applied to check the purchases done with the help of credit cards or landline services, or complementary ones, to check whether the policies taken are taken by the same households.
The data used here can also be referred to as purchase transactions, which can be associated with a time dimension to track the purchase.

4.2.9 Web Mining
It is a well known fact that the web is the most common and easiest way of communicating with the maximum crowd, and most companies are using social media platforms to promote their products to the people.
E-commerce sites are considered to be important sales channels.
Since web mining is used to analyze data from the activities that are carried out on those sites by the visitors, web mining methods are mostly used for three purposes: content mining, structure mining and usage mining.

Content mining : Text mining, HTML mining, XML mining, Image mining
Structure mining : Dynamic links
Usage mining : User profile, Clickstream analysis, Purchasing behavior
Fig. 4.2.10 : Taxonomy of web mining analyses
1. Content mining
It involves analyses of the content that is present on the web page, in the format of texts, images and multimedia content. Search engines like Google also perform content mining to provide links to the information that is required by the customer. It can also be tracked back to data mining problems for analysis of the data that is required.

contact management.
sales opportunity management.
customer management.
activity management.
4.3.1(A) Design
This phase deals with the start phase of any commercial activity, or with a subsequent restructuring phase, for example during the acquisition of a group of companies. It is carried out together with the creation of market segments. Salesforce design includes three types of decisions.

Types of decisions
1. Organizational structure
2. Sizing
3. Sales territories
Fig. 4.3.2 : Types of decisions

1. Organizational structure
This can take different forms, which correspond to a hierarchical clustering of agents into a group structure by products, geographical areas or brands; in some cases markets are also considered to form a cluster.
To understand the organizational structure, it is mandatory to analyze the complexity of the customers, products and sales activity, to decide how the agents can be specialized and to what extent.
2. Sizing
It is the work done on the number of agents that should work within a selected sales structure, which depends on different factors like the count of customers and prospects, how much sales area coverage should be done, the time limit for every call and the travelling time of every agent.
3. Sales territories
Designing sales territories means creating a cluster of geographical areas in a region and assigning that region to a particular agent or group of agents.
Factors that should be considered while designing and assigning these territories to the agents are the sales potential of every area, the time required to travel from one area to another, and the time limit a particular agent has.

[Figure: Segmentation: products-services, sales activity, sales and communication channels; followed by sales force organization, sales force sizing, and sales territory allocation to agents.]
Fig. 4.3.3 : Salesforce design process

4.3.1(B) Planning
Decision making tasks that are associated with planning are the assignment of sales resources, structured and sized during the design phase, to market entities. Resources can be calculated as the work time of the agent and the budget, whereas market entities comprise products, market segments, distribution channels and customers.
Allocation can be calculated as the time spent on every customer to promote the product or service, the time and cost required to travel, and how effective the action was in convincing the customer about the product. Further possibilities can also be considered, like explaining the technical and functional features of the product or service and the suggestions coming from the customers.

4.3.1(C) Assessment
Assessment is important to control the activities and to check the effectiveness and efficiency of the agents in the sales network, so that proper remuneration and incentives can be designed for every individual. In order to measure the effectiveness and efficiency of an agent, it is very important to announce the criteria on which they will be judged, so that the agents give their full contribution towards the sales of the products and services, thus increasing the profit of the company as well as their individual profit, and also enhancing their performance.

4.3.2 Models for Sales Force Management
Following are some classes of optimization models for designing and planning the salesforce. Before starting, here are some of the notions that will be used in the following sections.
Let's assume that a particular region is divided into M geographical areas of sales, also known as sales coverage units, so let M = {1, 2, ..., M}. Areas should be divided into disjoint clusters known as territories, such that each area belongs to only one territory and is also connected to all areas of the same territory.
The connection property means that from each area it is possible to reach any other area of the same territory. The time span can be divided into T intervals of the same length, which are usually weeks or months, indicated as t ∈ T = {1, 2, ..., T}.
Each territory has a sales agent associated with it, who belongs to one area of the territory, which is considered to be the agent's residence. The time and cost of travelling from one area to another depend on the area of residence of the agent. Let N be the number of territories, so N = {1, 2, ..., N}.
In the territories there are customers and prospects, who will be visited by the agent to promote the products, and whose number is given as H; in some models various segments are considered and they are counted in the same way. So H = {1, 2, ..., H}. And finally, assume every agent sells K products and services during the call, so let K = {1, 2, ..., K}.

4.3.3 Response Functions
Response functions play an important role in formulating the models to design and plan a sales network. In general, a response function defines the flexibility of sales with respect to a sales action and is a formal way to describe the complex relationships between sales actions and market reactions. The sales to which response functions refer are expressed in product units or in monetary units, known as revenue or margins.
They are formally presented as sales revenues. The intensity of a sales action can be related to different variables: the number of calls made to the customer in a given period of time, how many times the product was mentioned in a given period of time, and how much time was given to the customer in person during a given period of time.
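A concave response function, in which each additional unit of sales effort adds less response than the previous one, can be illustrated numerically. The exponential saturation form below and the parameter names `r_min`, `r_max` and `rate` are assumptions chosen only for illustration; the text does not prescribe a functional form.

```python
import math


def response(effort, r_min=0.0, r_max=100.0, rate=0.5):
    """Assumed concave response: sales rise with effort, with diminishing returns."""
    return r_min + (r_max - r_min) * (1.0 - math.exp(-rate * effort))


# Marginal gain of one extra call, for the first five calls (hypothetical units).
gains = [response(x + 1) - response(x) for x in range(5)]
print([round(g, 2) for g in gains])
```

The shrinking marginal gains are what make the curve concave: the response approaches `r_max` but each additional call contributes less.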
[Figure: response plotted against sales action effort.]
Fig. 4.3.4(a) : A concave response function

Define I additional continuous variables $S_i$ that express the deviations from the average sales opportunity value for each territory, and binary variables
$$Y_{ij} = \begin{cases} 1 & \text{if area } j \text{ is assigned to territory } i \\ 0 & \text{otherwise} \end{cases}$$
$$\begin{aligned}
\min \quad & \sum_{i \in I} \sum_{j \in J} \dots \\
\text{s.to} \quad & \dots, \quad i \in I, \\
& \dots, \quad i \in I,\; j \in J, \\
& \sum_{i \in I} \dots, \\
& S_i \ge 0, \quad Y_{ij} \in \{0, 1\}, \quad i \in I,\; j \in J.
\end{aligned}$$

These multi-centric logistic supply chains need to be widely spread, with most of the work automated, which makes the work simpler. These chains also involve a large amount of financial investment, made to automate the chains and make them more effective. The effectiveness and features associated with a logistic supply chain are directly proportional to the profile that the company maintains to communicate with the customers.
[Figure: example of a global supply chain with US suppliers, offshore suppliers, US plants, Asia plants, the US market and the European market, and the associated purchase, production, transport and inventory costs.]

$I_{it}$ is the number of units of product i in inventory at the end of period t.
$d_{it}$ is the demand for product i over period t.
$c_{it}$ is the unit manufacturing cost for product i in period t.
$h_{it}$ is the unit inventory cost for product i in period t.
$e_i$ is the capacity absorption to manufacture one unit of product i.
$b_t$ is the capacity available in period t.
So the problem is formulated as follows:
$$\begin{aligned}
\min \quad & \sum_{t \in T} \sum_{i \in I} \left( c_{it} P_{it} + h_{it} I_{it} \right) \\
\text{s.to} \quad & P_{it} + I_{i,t-1} - I_{it} = d_{it}, \quad i \in I,\; t \in T, \\
& \sum_{i \in I} e_i P_{it} \le b_t, \quad t \in T, \\
& P_{it},\, I_{it} \ge 0, \quad i \in I,\; t \in T.
\end{aligned}$$
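To see how the balance and capacity constraints of this production planning model interact, the sketch below evaluates a given production plan rather than optimizing it. Everything in it is an assumption made for illustration: the function name `evaluate_plan`, the zero initial inventory, the reading of $P_{it}$ as units of product i manufactured in period t, and the sample numbers.

```python
def evaluate_plan(production, demand, unit_cost, holding_cost, absorption, capacity):
    """Return the total cost of a production plan, or raise ValueError if infeasible.

    production, demand, unit_cost and holding_cost are indexed [i][t];
    absorption is indexed [i] and capacity is indexed [t].
    Initial inventory is assumed to be zero.
    """
    n_products, n_periods = len(production), len(capacity)
    total = 0.0
    inventory = [0.0] * n_products  # I_{i,0} = 0 (assumption)
    for t in range(n_periods):
        # Capacity constraint: sum_i e_i * P_it <= b_t
        used = sum(absorption[i] * production[i][t] for i in range(n_products))
        if used > capacity[t]:
            raise ValueError(f"capacity exceeded in period {t}")
        for i in range(n_products):
            # Balance constraint: P_it + I_{i,t-1} - I_it = d_it
            inventory[i] = inventory[i] + production[i][t] - demand[i][t]
            if inventory[i] < 0:
                raise ValueError(f"demand for product {i} not met in period {t}")
            total += unit_cost[i][t] * production[i][t] + holding_cost[i][t] * inventory[i]
    return total


# One hypothetical product over three periods.
cost = evaluate_plan(
    production=[[15, 15, 15]],
    demand=[[10, 20, 15]],
    unit_cost=[[2, 2, 2]],
    holding_cost=[[1, 1, 1]],
    absorption=[1],
    capacity=[15, 15, 15],
)
print(cost)
```

An optimization solver would search over all feasible plans; this helper only checks one candidate plan and prices it with the objective function.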
4.5.4 Backlogging
This is an additional feature that is to be considered in logistic systems. The term backlog refers to the possibility that a portion of the demand which is to be delivered in a certain period of time could not be completed; this portion of the demand is said to be backlogged, and it is delivered in a subsequent period with a penalty.
Backlog is a feature that usually happens in B2B industries. In industries which produce mass consumer goods, a variant of the backlog is more likely to develop, which can be referred to as lost sales, in which the demand that cannot be fulfilled is not moved to a subsequent period and is lost.
For this model it is important to add new decision variables: $B_{it}$ is the number of units of demand for product i that are delayed in period t. The parameter $g_{it}$ is the unit cost of delaying the delivery of product i in period t.
So the formula becomes:
$$\begin{aligned}
\min \quad & \sum_{t \in T} \sum_{i \in I} \left( c_{it} P_{it} + h_{it} I_{it} + g_{it} B_{it} \right) \\
\text{s.to} \quad & P_{it} + I_{i,t-1} - I_{it} + B_{it} - B_{i,t-1} = d_{it}, \quad i \in I,\; t \in T, \\
& \sum_{i \in I} e_i P_{it} \le b_t, \quad t \in T, \\
& P_{it},\, I_{it},\, B_{it} \ge 0, \quad i \in I,\; t \in T.
\end{aligned}$$

4.5.6 Bill of Materials
One more feature that can be added to the planning model is the bill of materials, which is associated with products of complex structure, in which the end product that is made has various components that are used to build it up.
The parameters that define the format of the bill of materials are $a_{ij}$, the units of product i directly required by one unit of product j. Here the term product refers both to the end product and to the associated components required, which define the different levels of the bill of materials.
So the formula becomes:
$$\begin{aligned}
\min \quad & \sum_{t \in T} \sum_{i \in I} \left( c_{it} P_{it} + h_{it} I_{it} \right) \\
\text{s.to} \quad & P_{it} + I_{i,t-1} - I_{it} = d_{it} + \sum_{j \in I} a_{ij} P_{jt}, \quad i \in I,\; t \in T, \\
& \sum_{i \in I} e_i P_{it} \le b_t, \quad t \in T, \\
& P_{it},\, I_{it} \ge 0, \quad i \in I,\; t \in T.
\end{aligned}$$
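The backlogging balance of Section 4.5.4 and the bill-of-materials requirement of Section 4.5.6 can each be checked with a few lines of arithmetic. The helpers below are a sketch under stated assumptions: the function names, the zero starting inventory and backlog, and the three-item product structure (item 0 needs two units of item 1, item 1 needs one unit of item 2) are all hypothetical.

```python
def backlog_trajectory(production, demand):
    """Inventory and backlog per period for one product.

    Uses P_t + I_{t-1} - I_t + B_t - B_{t-1} = d_t, with the net position
    I_t - B_t carried from period to period (zero at the start, by assumption).
    """
    inventory, backlog = [], []
    net = 0.0
    for p, d in zip(production, demand):
        net = net + p - d
        inventory.append(max(net, 0.0))   # surplus is held in stock
        backlog.append(max(-net, 0.0))    # shortage is delayed to later periods
    return inventory, backlog


def total_requirement(external_demand, usage, production):
    """Right-hand side d_i + sum_j a_ij * P_j for every item i in one period."""
    n = len(production)
    return [
        external_demand[i] + sum(usage[i][j] * production[j] for j in range(n))
        for i in range(n)
    ]


inv, back = backlog_trajectory(production=[10, 10, 25], demand=[10, 20, 15])
print(inv, back)

# usage[i][j] = units of item i directly required by one unit of item j.
usage = [[0, 0, 0], [2, 0, 0], [0, 1, 0]]
print(total_requirement([10, 5, 0], usage, production=[10, 25, 30]))
```

In the first example the shortfall of 10 units in the second period is carried as backlog and cleared in the third; in the second, the requirement for a component is its external demand plus what the production of its parent items consumes.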
ie l,te T,
ne N,
me M
ie L,te T,
me M, ne N.
Xm20,
Pie 20,Ye (0,1), ie I,te T.
4.6 Revenue Management Systems
Revenue management is a managerial policy whose main objective is to maximize the profits for the company by maintaining the balance between demand and supply.
It is usually created for marketing and logistic criteria and has also gained interest in the service industry. It was first accepted by transport, tourist and hotel companies. Eventually it was also accepted by companies responsible for manufacturing and distribution.
The basic idea is related to the revenue, and every company thinks about maximizing its profit. But revenue management needs to be planned according to the strategies, decision making patterns and models of the company, and so it becomes complex when data is fed to it.

4.6.1 Decision Processes in Revenue Management
When it comes to revenue management, the mathematical models involved use the availability of the product and its price to determine the actions of the customers at every level, so that they can be optimized to obtain the maximum profit out of the sales.
The aim of revenue management is not only maximizing profit but also managing various offers and having different marketing strategies to promote the products and services, in order to increase the demand, which is then fulfilled with minimum expenditure on the cost of logistics.
Since it is a managerial policy, most companies have taken up this policy and are working on it successfully, and the fields that are implementing this policy are growing: automotive rental companies, entertainment companies, hotel chains, airlines and so on. The common features among these fields are that they have a low marginal sales cost and the possibility of imposing dynamic policies for the public across various sales channels.

4.7 Data Envelopment Analysis : Efficiency Measures
When it comes to data envelopment analysis, the units which are being compared are known as decision making units, also known as DMUs, as they take decisions that are self governed.
To calculate the efficiency of n units, let N = {1, 2, ..., n} be the set of units being compared. If these units produce one single output from one single input only, the efficiency of the j-th decision making unit DMU$_j$, $j \in N$, is given as:
$$\theta_j = \frac{y_j}{x_j}$$
In that, $y_j$ is the output value generated by DMU$_j$ and $x_j$ is the input that is used. If the output is generated using different input factors, the efficiency of DMU$_j$ is defined as the ratio between the weighted sum of the outputs and the weighted sum of the inputs.
Let H = {1, 2, ..., s} be the set of production factors and K = {1, 2, ..., m} the set of outputs. If $x_{ij}$, $i \in H$, gives the quantity of input i used by DMU$_j$ and $y_{rj}$, $r \in K$, is the quantity of output r that is obtained, the efficiency of DMU$_j$ is given as:
$$\theta_j = \frac{u_1 y_{1j} + u_2 y_{2j} + \dots + u_m y_{mj}}{v_1 x_{1j} + v_2 x_{2j} + \dots + v_s x_{sj}} = \frac{\sum_{r \in K} u_r y_{rj}}{\sum_{i \in H} v_i x_{ij}}$$
where the weights $u_1, u_2, \dots, u_m$ are associated with the outputs and $v_1, v_2, \dots, v_s$ are assigned to the inputs.
In the second case, the efficiency value may vary, since it is difficult to fix a single structure of weights which can be shared by the different units.
This avoids the problem of the units objecting to a single system of weights that would give an advantage to a few DMUs instead of benefiting all.
Data envelopment analysis calculates the efficiency of every unit on the basis of the weight system which is most favourable for that DMU, the one for which its efficiency is maximized. Also, by doing additional analysis, the aim of data envelopment analysis is to identify whether the units are efficient or not.

4.7.1 Efficient Frontier
It is also known as the production function, which shows the relation between the inputs that are used and the outputs that are produced using those inputs. It also shows the maximum amount of outputs that can be generated by a given combination of inputs, and the minimum quantity of inputs that would be required to obtain a required output level.
Hence the efficient frontier is directly related to the technical efficiency of the operating methods. The efficient frontier can be estimated from a set of observations which show the output level obtained from a given combination of input production factors.
When it comes to data envelopment analysis, the observations correspond to the units that are being evaluated. Statistical methods which use the observations to calculate a regression curve assume predefined hypotheses on the shape of the production function.
Data envelopment analysis makes no assumptions on the functional form of the efficient frontier and is non-parametric in nature. The only condition is that the units which are being compared should not be placed above the production function, depending on their efficiency value.

4.8 The CCR Model
When the data envelopment analysis model is used, choosing the optimal weights for a generic DMU$_j$ involves solving a mathematical optimization model whose decision variables are given by the weights $u_r$, $r \in K$, and $v_i$, $i \in H$, that are associated with every output and input.
There are various formulations to get the efficiency score; the best known is the Charnes-Cooper-Rhodes (CCR) model, which is given by the formula:
$$\begin{aligned}
\max \quad & \theta = \frac{\sum_{r \in K} u_r y_{rj}}{\sum_{i \in H} v_i x_{ij}} \\
\text{s.to} \quad & \frac{\sum_{r \in K} u_r y_{rj'}}{\sum_{i \in H} v_i x_{ij'}} \le 1, \quad j' \in N, \\
& u_r,\, v_i \ge 0, \quad r \in K,\; i \in H.
\end{aligned}$$
The aim is to maximize the efficiency measure for DMU$_j$:
$$\begin{aligned}
\max \quad & \theta = \sum_{r \in K} u_r y_{rj} \\
\text{s.to} \quad & \sum_{i \in H} v_i x_{ij} = 1, \\
& \dots
\end{aligned}$$
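The efficiency ratio of Section 4.7 (weighted outputs over weighted inputs) can be evaluated directly once a weight system is fixed. The sketch below does only that: it does not solve the CCR optimization, in which the weights are chosen separately for each unit. The function names, the sample units and the weights are assumptions made for illustration.

```python
def efficiency(u, v, outputs, inputs):
    """Ratio of the weighted sum of outputs to the weighted sum of inputs for one DMU."""
    weighted_outputs = sum(u_r * y_r for u_r, y_r in zip(u, outputs))
    weighted_inputs = sum(v_i * x_i for v_i, x_i in zip(v, inputs))
    return weighted_outputs / weighted_inputs


def relative_efficiencies(u, v, all_outputs, all_inputs):
    """Scale the ratios so that the best unit scores exactly 1 under these weights."""
    raw = [efficiency(u, v, y, x) for y, x in zip(all_outputs, all_inputs)]
    top = max(raw)
    return [value / top for value in raw]


# Three hypothetical DMUs with one input and two outputs each.
all_inputs = [[2], [4], [5]]
all_outputs = [[4, 2], [6, 6], [5, 5]]
scores = relative_efficiencies(u=[1, 1], v=[1], all_outputs=all_outputs, all_inputs=all_inputs)
print([round(s, 3) for s in scores])
```

With these fixed weights the first two units lie on the frontier (score 1) and the third does not. Under the CCR model each unit would instead be scored with the weights most favourable to it, subject to no unit exceeding 1.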