Business Data Mining Week 7 A
1. Prior Probability
Prior Probability is the probability of an event occurring before new data is collected. It
is the best rational assessment of the probability of an outcome, based on the present
knowledge of the event before any observation is made.
2. Posterior Probability
Pragalath EA2252001010013 1
When new data or information is collected, the Prior Probability of an event is
revised to produce a more accurate measure of a possible outcome. This revised probability
is the Posterior Probability and is calculated using Bayes’ theorem. So, the Posterior
Probability is the probability of an event H occurring given that event X has occurred.
For example
Suppose three bags have the labels A, B, and C. One bag has a red ball in it, while the other
two do not. The prior probability that the red ball is in bag B is one-third, or 0.333. But when
bag C is opened and found to contain no red ball, the posterior probability that the red ball
is in bag A becomes 0.5, and likewise for bag B, as each bag now has one chance out of two.
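The three-bag update can be checked with a few lines of Python. This is only a sketch: the dictionary names are illustrative, and it assumes the only observation is "bag C contains no red ball", which has likelihood 1 if the ball is in A or B and 0 if it is in C.

```python
from fractions import Fraction

# Prior: the red ball is equally likely to be in any of the three bags.
prior = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 3)}

# Likelihood of observing "bag C is empty" under each hypothesis
# (assumed: certain if the ball is in A or B, impossible if it is in C).
likelihood = {"A": Fraction(1), "B": Fraction(1), "C": Fraction(0)}

# Multiply prior by likelihood, then normalise so the posteriors sum to 1.
unnormalized = {bag: prior[bag] * likelihood[bag] for bag in prior}
evidence = sum(unnormalized.values())
posterior = {bag: unnormalized[bag] / evidence for bag in prior}

for bag, p in posterior.items():
    print(bag, p)
```

Bags A and B each end up with probability 1/2 and bag C with 0, matching the reasoning in the example.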
Formula
Bayes’ Theorem can be represented mathematically by the equation given below:
P(H/X) = [P(X/H) × P(H)] / P(X)
Where,
H and X are events and P(X) ≠ 0.
P(H/X) – Conditional probability of H, given that X occurs. This is the posterior probability.
P(X/H) – Conditional probability of X, given that H occurs.
P(H) and P(X) – Probabilities of H and X occurring on their own, without reference to each
other. P(H) is the prior probability and P(X) is called the marginal probability.
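As a minimal sketch, the standard form of Bayes’ theorem, P(H/X) = P(X/H) × P(H) / P(X), can be written as a small Python function. The function name and the sample numbers are hypothetical and chosen only for illustration.

```python
def bayes_posterior(p_x_given_h: float, p_h: float, p_x: float) -> float:
    """Return P(H/X) = P(X/H) * P(H) / P(X)."""
    if p_x == 0:
        # The theorem is only defined when P(X) is non-zero.
        raise ValueError("P(X) must be non-zero")
    return p_x_given_h * p_h / p_x


# Hypothetical inputs, used only to exercise the function.
example = bayes_posterior(p_x_given_h=0.5, p_h=0.2, p_x=0.25)
print(example)
```

When P(X) is not known directly, it is usually obtained from the law of total probability: P(X) = P(X/H) × P(H) + P(X/not H) × P(not H).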
Applications of Bayes’ Theorem
Bayes’ Theorem has many real-world applications. Some of them are given below:
It serves as a building block and starting point for more complex methodologies,
for example, the popular Bayesian networks.
It is used in classification problems and other probability-related questions.
It underlies Bayesian inference, a particular approach to statistical inference.
In genetics, it can be used to calculate the probability of an individual
having a specific genotype.
Examples
1. SpamAssassin works as a mail filter that identifies spam, and users
train the system. It looks for patterns in the words of emails that
users have marked as spam. For example, it may have learned that the
word “release” appears in 30% of the emails marked as spam. Assume that
0.8% of non-spam emails include the word “release” and that 40% of all
emails received by the user are spam. Find the probability that an email
is spam if the word “release” appears in it.
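A possible worked solution is sketched below. It reads the figures as P(release/spam) = 0.30, P(release/non-spam) = 0.008 and P(spam) = 0.40, and obtains P(release) from the law of total probability. The variable names are illustrative.

```python
# Figures from the example, under the interpretation stated above.
p_spam = 0.40                # P(spam)
p_word_given_spam = 0.30     # P("release" / spam)
p_word_given_ham = 0.008     # P("release" / non-spam)

p_ham = 1 - p_spam           # P(non-spam)

# Law of total probability: P("release") over spam and non-spam emails.
p_word = p_word_given_spam * p_spam + p_word_given_ham * p_ham

# Bayes' theorem: P(spam / "release").
p_spam_given_word = p_word_given_spam * p_spam / p_word

print(f"P(spam / release) = {p_spam_given_word:.4f}")
```

Under these assumptions the result is 0.12 / 0.1248, which is approximately 0.9615, so an email containing “release” is spam with a probability of about 96%.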
2. Bag1 contains 4 white and 8 black balls, and Bag2 contains 5 white and
3 black balls. One ball is drawn at random from one of the bags, and the ball
drawn is black. Find the probability that the ball was
drawn from Bag1.
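One way to work this out is sketched below using exact fractions. It assumes that each bag is equally likely to be chosen (probability 1/2 each), which the problem statement implies but does not state explicitly. The variable names are illustrative.

```python
from fractions import Fraction

# Assumed prior: each bag is chosen with equal probability.
p_bag1 = Fraction(1, 2)
p_bag2 = Fraction(1, 2)

# Likelihood of drawing a black ball from each bag.
p_black_given_bag1 = Fraction(8, 4 + 8)   # 8 black out of 12 balls
p_black_given_bag2 = Fraction(3, 5 + 3)   # 3 black out of 8 balls

# Law of total probability: overall chance of drawing a black ball.
p_black = p_black_given_bag1 * p_bag1 + p_black_given_bag2 * p_bag2

# Bayes' theorem: probability the black ball came from Bag1.
p_bag1_given_black = p_black_given_bag1 * p_bag1 / p_black

print(p_bag1_given_black, float(p_bag1_given_black))
```

With equal priors this gives P(Bag1/black) = 16/25 = 0.64.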
Statistical methods, including Bayesian inference, play a crucial role in business data mining
by providing powerful tools for analyzing data, extracting insights, and making informed
decisions. Bayesian inference, in particular, offers a flexible framework for integrating prior
knowledge with observed data to obtain probabilistic estimates of parameters and
predictions. Here's how Bayesian inference can be used for business data mining along with
examples of real-world applications, as well as the advantages and challenges associated with
this approach:
4. Market Research and Forecasting: Bayesian time series analysis, such as Bayesian
Structural Time Series or Bayesian Vector Autoregression, can be used to analyze historical
market data and forecast future trends. This information is valuable for strategic planning,
inventory management, and pricing optimization.
5. A/B Testing and Decision Making: Bayesian methods provide a principled framework
for analyzing experimental data and making data-driven decisions. Bayesian hypothesis
testing, Bayesian Optimization, or Bayesian Decision Theory can be used to optimize
marketing campaigns, website designs, or product features based on observed outcomes and
prior beliefs.
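As an illustration of Bayesian A/B testing, the sketch below uses a Beta-Binomial model, which is one common choice and not necessarily the method intended above. It assumes a uniform Beta(1, 1) prior on each conversion rate and estimates the probability that variant B beats variant A by sampling from the two posteriors. The function name and the campaign figures are hypothetical.

```python
import random


def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for each variant: Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


# Hypothetical campaign data: conversions out of visitors for each variant.
p_b_better = prob_b_beats_a(conv_a=40, n_a=1000, conv_b=60, n_b=1000)
print(f"Estimated P(B > A) = {p_b_better:.3f}")
```

The output is a direct probability statement about which variant is better, which a business can compare against its own decision threshold.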
Advantages of Bayesian Inference in Business Data Mining:
3. Flexibility and Adaptability: Bayesian models can be easily adapted and updated with
new data, allowing businesses to continuously refine their models and predictions over time.
4. Handling of Small Sample Sizes: Bayesian methods are particularly useful when dealing
with small sample sizes or sparse data, as they can provide stable estimates even with limited
data.
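The two advantages above can be illustrated with a conjugate Beta-Binomial update, sketched here under stated assumptions: the prior Beta(2, 8) and the small batches of outcomes are hypothetical, and the helper names are illustrative. Each new batch simply adds its counts to the prior parameters, and the prior keeps the estimate stable while data is scarce.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update: add observed counts to the Beta parameters."""
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Hypothetical prior belief: conversion rate around 20%.
alpha, beta = 2.0, 8.0

# Hypothetical small batches of (successes, failures) arriving over time.
batches = [(1, 2), (0, 3), (2, 3)]

for successes, failures in batches:
    alpha, beta = update_beta(alpha, beta, successes, failures)
    print(f"Beta({alpha:.0f}, {beta:.0f}) -> mean {beta_mean(alpha, beta):.3f}")
```

After all three batches the posterior is Beta(5, 16), so the estimate has moved from the prior mean of 0.2 toward the observed data without swinging wildly on any single small batch.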
knowledge, computational resources, and software tools, which may pose challenges for
businesses with limited resources or expertise in this area.
In summary, Bayesian inference offers a powerful framework for business data mining,
allowing businesses to leverage statistical methods to extract insights, make predictions, and
drive informed decisions. While Bayesian methods offer several advantages, including the
incorporation of prior knowledge and uncertainty quantification, businesses should be
mindful of the challenges associated with computational complexity, subjectivity in prior
specification, and model interpretability. By addressing these challenges and leveraging the
strengths of Bayesian inference, businesses can unlock the full potential of their data and
gain a competitive edge in today's data-driven world.