How to Calculate Conditional Probability in R?

Last Updated : 28 Jul, 2025

Conditional probability is the probability of one event occurring given that another event has already occurred. It helps in understanding how the likelihood of one outcome is affected by the knowledge of another outcome.

Conditional Probability Formula

The formula for conditional probability is:

P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{if } P(B) \ne 0

P(B \mid A) = \frac{P(A \cap B)}{P(A)} \quad \text{if } P(A) \ne 0
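As a minimal sketch, the formula can be wrapped in a small R helper. The function name cond_prob and its argument names are illustrative assumptions, not part of base R, and the input values are made up for the demonstration.

```r
# Hypothetical helper: P(A | B) = P(A and B) / P(B)
cond_prob <- function(p_joint, p_given) {
  # The formula is undefined when the conditioning event has probability 0
  if (p_given == 0) {
    stop("Conditional probability is undefined when P(B) = 0")
  }
  p_joint / p_given
}

# Illustrative values: P(A and B) = 0.1, P(B) = 0.4
cond_prob(0.1, 0.4)  # 0.25
```

The explicit check mirrors the condition P(B) ≠ 0 in the formula, so a zero denominator raises an error instead of silently returning NaN or Inf.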

The figure below shows the Venn diagram representation of events A and B.

Example 1: Computation of Conditional Probability

A card is drawn at random from a pack of 50 Pokémon cards. The pack consists of 5 equal sets of 10 cards each, coloured red, blue, green, yellow and black. Each set contains 2 water-type Pokémon: one of high strength and one of medium strength.

Let A be the event of drawing a high-strength water-type Pokémon card and B be the event of drawing a red card. What is the probability of drawing a high-strength water-type Pokémon card given that a red card has been drawn?

Solution:

P(B) = \frac{10}{50} (since there are 10 red cards in a pack of 50 Pokémon cards)

P(A) = \frac{5}{50} (as there are 5 high-strength water-type Pokémon cards in a pack of 50 cards)

P(A \cap B) = \frac{1}{50} (as there is one red high-strength water-type Pokémon card in a pack of 50 cards)

Since event B has already occurred, there are only 10 possible cases rather than the original 50. Among these 10 red Pokémon cards, there is 1 high-strength water-type Pokémon card.

Hence,

P(A \mid B) = \frac{1/50}{10/50} = \frac{1}{10}

This is the conditional probability of A given that B has already occurred.

Similarly,

P(B \mid A) =\frac{1/50}{5/50} = \frac{1}{5}

This is because event A restricts the possible cases to the 5 high-strength water-type Pokémon cards in the pack, and exactly 1 of those 5 cards is red.
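The same numbers can be checked by enumerating the deck in R. This is a sketch under stated assumptions: the column names colour and kind, and the labels "water_high", "water_medium" and "other", are invented for illustration; only the counts come from the example.

```r
# Build the 50-card deck: 5 colours x 10 cards each
colours <- c("red", "blue", "green", "yellow", "black")
deck <- data.frame(
  colour = rep(colours, each = 10),
  # Per colour set: 1 high-strength water card, 1 medium-strength water card,
  # and 8 remaining cards (the "other" label is an assumption)
  kind = rep(c("water_high", "water_medium", rep("other", 8)), times = 5)
)

A <- deck$kind == "water_high"  # event A: high-strength water-type card
B <- deck$colour == "red"       # event B: red card

# Dividing counts is equivalent to dividing probabilities,
# because the common factor 1/50 cancels
p_A_given_B <- sum(A & B) / sum(B)  # 1 of the 10 red cards
p_B_given_A <- sum(A & B) / sum(A)  # 1 of the 5 high-strength water cards

p_A_given_B  # 0.1
p_B_given_A  # 0.2
```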

Example 2: Computation of Conditional Probability

A store owner has a list of 15 customers. He observes certain patterns in their purchases which are depicted in the table below.

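| Customer | Money spent | Frequency |
|---|---|---|
| 1 | High | Less |
| 2 | Low | More |
| 3 | High | More |
| 4 | High | Less |
| 5 | Low | Less |
| 6 | Low | More |
| 7 | High | More |
| 8 | Low | Less |
| 9 | Low | Less |
| 10 | High | More |
| 11 | Low | More |
| 12 | Low | Less |
| 13 | High | Less |
| 14 | High | More |
| 15 | High | Less |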

Based on the above table, he is interested in finding out:

  1. What is the probability of a customer being a high spender given that they purchase less often?
  2. What is the probability of a customer being a low spender given that they purchase more often?
  3. What is the probability of a customer being a low spender given that they purchase less often?
  4. What is the probability of a customer being a high spender given that they purchase more often?

Solution:

P(High Spend | Less Frequency)

P(\text{Less Frequency}) = \frac{8}{15} (as from the table, 8 times out of 15, frequency is less)

P(\text{High Spend} \cap \text{Less Frequency}) = \frac{4}{15} (as from the table, there are 4 combinations out of 15 with high spend and less frequency)

P(\text{High Spend} \mid \text{Less Frequency}) = \frac{P(\text{High Spend} \cap \text{Less Frequency})}{P(\text{Less Frequency})} = \frac{4/15}{8/15} = 0.5

P(Low Spend | More Frequency)

P(\text{More Frequency}) = \frac{7}{15} (as from the table, 7 times out of 15, frequency is more)

P(\text{Low Spend} \cap \text{More Frequency}) = \frac{3}{15} (as from the table, there are 3 combinations out of 15 with low spend and more frequency)

P(\text{Low Spend} \mid \text{More Frequency}) = \frac{P(\text{Low Spend} \cap \text{More Frequency})}{P(\text{More Frequency})} = \frac{3/15}{7/15} = 0.4285714

Similarly,

P(\text{Low Spend} \mid \text{Less Frequency}) = \frac{4/15}{8/15} = 0.5

P(\text{High Spend} \mid \text{More Frequency}) = \frac{4/15}{7/15} = 0.5714286

The above example demonstrates how to apply the conditional probability formula using frequency data.

Implementation of Conditional Probability Calculation in R

We now calculate the same conditional probabilities in R, using the customer data from Example 2. This helps us understand the relationship between money spent and frequency of visits.

1. Creating the Dataset

We are creating a dataset with two categorical variables: money spent and visit frequency.

  • data.frame: Used to create a structured table-like dataset.
  • c: Used to create vectors for the two variables.
R
Money_Spent <- c("High", "Low", "High", "High", "Low", "Low", "High", "Low", 
                 "Low", "High", "Low", "Low", "High", "High", "High")
Frequency <- c("Less", "More", "More", "Less", "Less", "More", "More", "Less",
               "Less", "More", "More", "Less", "Less", "More", "Less")
df <- data.frame(Money_Spent, Frequency)

2. Creating a Frequency Table

We are generating a contingency table to see how often each combination of the two variables occurs.

  • table: Creates a cross-tabulation of the two variables.
  • df$Column: Used to access specific columns in a data frame.
R
tab <- table(df$Money_Spent, df$Frequency)
print(tab)

Output:

       Less More
  High    4    4
  Low     4    3
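To connect the contingency table with the hand calculation, row and column totals can be appended with addmargins from base R. The sketch below rebuilds the data so that it runs on its own; the name tab_totals is an illustrative choice.

```r
# Customer data from Example 2
Money_Spent <- c("High", "Low", "High", "High", "Low", "Low", "High", "Low",
                 "Low", "High", "Low", "Low", "High", "High", "High")
Frequency <- c("Less", "More", "More", "Less", "Less", "More", "More", "Less",
               "Less", "More", "More", "Less", "Less", "More", "Less")
tab <- table(Money_Spent, Frequency)

# Append a "Sum" row and a "Sum" column holding the marginal totals
tab_totals <- addmargins(tab)
print(tab_totals)

# Column totals: 8 "Less" and 7 "More" customers out of 15
tab_totals["Sum", c("Less", "More")]
```

The column totals are the denominators of the conditional probabilities: 8 customers purchase less often and 7 purchase more often.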

3. Calculating Conditional Probabilities

We are calculating the probability of one variable given the value of another using the frequency table.

  • sum: Adds up the selected values from the table, here all counts in one column.
  • tab[row, column]: Accesses a specific cell value in the frequency table.
R
p_high_given_less <- tab["High", "Less"] / sum(tab[, "Less"])
p_high_given_more <- tab["High", "More"] / sum(tab[, "More"])
p_low_given_less <- tab["Low", "Less"] / sum(tab[, "Less"])
p_low_given_more <- tab["Low", "More"] / sum(tab[, "More"])
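As an alternative sketch, base R's prop.table can compute all four conditional probabilities in one call by dividing each cell by its column total. The data are rebuilt here so the block runs on its own; the names p_spend_given_freq and p_freq_given_spend are illustrative.

```r
# Customer data from Example 2
Money_Spent <- c("High", "Low", "High", "High", "Low", "Low", "High", "Low",
                 "Low", "High", "Low", "Low", "High", "High", "High")
Frequency <- c("Less", "More", "More", "Less", "Less", "More", "More", "Less",
               "Less", "More", "More", "Less", "Less", "More", "Less")
tab <- table(Money_Spent, Frequency)

# margin = 2 divides each column by its total: P(Money_Spent | Frequency)
p_spend_given_freq <- prop.table(tab, margin = 2)
print(round(p_spend_given_freq, 2))

# margin = 1 divides each row by its total: P(Frequency | Money_Spent)
p_freq_given_spend <- prop.table(tab, margin = 1)
print(round(p_freq_given_spend, 2))
```

The margin argument decides which variable is conditioned on, so the two tables generally differ: P(High | More) is 4/7, whereas P(More | High) is 4/8.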

4. Printing the Results

We are displaying the calculated conditional probabilities.

  • cat: Used to concatenate and print values.
  • round: Rounds values for cleaner output.
R
cat("P(High | Less) =", round(p_high_given_less, 2), "\n")
cat("P(High | More) =", round(p_high_given_more, 2), "\n")
cat("P(Low | Less) =", round(p_low_given_less, 2), "\n")
cat("P(Low | More) =", round(p_low_given_more, 2), "\n")

Output:

P(High | Less) = 0.5 
P(High | More) = 0.57 
P(Low | Less) = 0.5 
P(Low | More) = 0.43 

These results show that customers who visit less often are equally likely to be high or low spenders (0.5 each), while customers who visit more often are somewhat more likely to be high spenders (0.57) than low spenders (0.43).

