Statistics Top Wise Important Formulas

The document covers fundamental concepts in set theory, permutations, combinations, descriptive statistics, probability, and discrete random variables. It explains set operations, arrangements of objects, selection methods, and statistical measures like mean, median, and mode. Additionally, it discusses probability theories, including conditional probability and the properties of discrete random variables, along with their expected values and variances.

1. Set theory, permutations, combinations

1. Set Theory

Set theory is a branch of mathematical logic that deals with collections of objects called sets. A set is a well-defined
collection of distinct objects.

Key Concepts:

a) Set Notation:
- Sets are usually denoted by capital letters: A, B, C, etc.
- Elements are listed within curly braces: {1, 2, 3}
- The symbol ∈ means "is an element of": 2 ∈ {1, 2, 3}
- The symbol ∉ means "is not an element of": 4 ∉ {1, 2, 3}

b) Types of Sets:
- Finite sets: Have a finite number of elements
- Infinite sets: Have infinitely many elements (countably infinite, like the natural numbers, or uncountably infinite, like the real numbers)
- Empty set: Contains no elements, denoted as {} or ∅

c) Set Operations:
- Union (A ∪ B): All elements that are in A or B (or both)
- Intersection (A ∩ B): All elements that are in both A and B
- Difference (A - B): All elements in A that are not in B
- Complement (A'): All elements in the universal set that are not in A

Examples:

1. Let A = {1, 2, 3, 4} and B = {3, 4, 5, 6}


- A ∪ B = {1, 2, 3, 4, 5, 6}
- A ∩ B = {3, 4}
- A - B = {1, 2}
- If the universal set is U = {1, 2, 3, 4, 5, 6, 7, 8}, then A' = {5, 6, 7, 8}

2. In a class of 30 students:
Let M be the set of students who play music
Let S be the set of students who play sports
If 15 students play music, 20 play sports, and 10 play both:
- |M ∪ S| = |M| + |S| - |M ∩ S| = 15 + 20 - 10 = 25
So, 25 students play either music or sports (or both)
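
Python's built-in set type supports these operations directly; a minimal sketch reproducing the examples above (the M and S sets are arbitrary stand-ins with the stated sizes):

```python
# Set operations using Python's built-in set type
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
U = {1, 2, 3, 4, 5, 6, 7, 8}  # universal set

print(A | B)   # union: {1, 2, 3, 4, 5, 6}
print(A & B)   # intersection: {3, 4}
print(A - B)   # difference: {1, 2}
print(U - A)   # complement of A relative to U: {5, 6, 7, 8}

# Inclusion-exclusion check: |M ∪ S| = |M| + |S| - |M ∩ S|
M, S = set(range(15)), set(range(5, 25))   # 15 music, 20 sports, 10 overlap
assert len(M | S) == len(M) + len(S) - len(M & S) == 25
```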

2. Permutations

Permutations are arrangements of objects where order matters. They are used when we want to know how many
ways we can arrange a set of objects.

Key Formulas:

a) Permutations without repetition:
P(n,r) = n! / (n-r)!
Where n is the total number of objects and r is the number being arranged

b) Permutations with repetition:
n^r
Where n is the number of options for each position and r is the number of positions

Examples:

1. How many ways can 5 books be arranged on a shelf?


This is a permutation of 5 objects taken 5 at a time.
P(5,5) = 5! = 5 × 4 × 3 × 2 × 1 = 120 ways

2. In how many ways can the first three positions in a race with 8 runners be filled?
This is a permutation of 8 objects taken 3 at a time.
P(8,3) = 8! / (8-3)! = 8 × 7 × 6 = 336 ways

3. How many 4-digit PINs are possible if repetition is allowed?


This is a permutation with repetition. We have 10 digits (0-9) and 4 positions.
10^4 = 10,000 possible PINs
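
These counts can be checked in Python: math.perm (Python 3.8+) computes P(n, r), and the repetition case is plain exponentiation. A minimal sketch:

```python
import math

print(math.perm(5, 5))   # 120 ways to arrange 5 books
print(math.perm(8, 3))   # 336 ways to fill the first three race positions
print(10 ** 4)           # 10,000 four-digit PINs with repetition allowed

# Equivalent to the formula P(n, r) = n! / (n - r)!
assert math.perm(8, 3) == math.factorial(8) // math.factorial(8 - 3)
```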

3. Combinations

Combinations are selections of objects where order doesn't matter. They are used when we want to know how many
ways we can select objects from a larger set.

Key Formula:

Combinations without repetition:


C(n,r) = n! / (r! × (n-r)!)
Also written as ₙCᵣ or (n choose r)

Examples:

1. How many ways can a committee of 3 be chosen from 10 people?


C(10,3) = 10! / (3! × 7!) = (10 × 9 × 8) / (3 × 2 × 1) = 120 ways

2. In a standard 52-card deck, how many 5-card poker hands are possible?
This is a combination of 52 cards taken 5 at a time.
C(52,5) = 52! / (5! × 47!) = 2,598,960 possible hands

3. A bag contains 5 red marbles and 3 blue marbles. How many ways can you select 4 marbles?
This is a combination problem with two types of objects. We can break it down:
- 4 red, 0 blue: C(5,4) × C(3,0) = 5 × 1 = 5 ways
- 3 red, 1 blue: C(5,3) × C(3,1) = 10 × 3 = 30 ways
- 2 red, 2 blue: C(5,2) × C(3,2) = 10 × 3 = 30 ways
- 1 red, 3 blue: C(5,1) × C(3,3) = 5 × 1 = 5 ways
Total: 5 + 30 + 30 + 5 = 70 ways
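
Python's math.comb computes C(n, r) directly; the sketch below also confirms that the case-by-case marble count agrees with simply choosing any 4 of the 8 marbles:

```python
import math

print(math.comb(10, 3))   # 120 committees
print(math.comb(52, 5))   # 2,598,960 poker hands

# Marble example: sum over the possible red/blue splits
total = sum(math.comb(5, r) * math.comb(3, 4 - r) for r in range(1, 5))
print(total)                      # 70
assert total == math.comb(8, 4)   # same as choosing any 4 of all 8 marbles
```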
2. Descriptive statistics: Mean, median, mode, frequency tables, bar graphs, pie charts

Consider the following frequency table (reconstructed from the calculations below):

Value (x):     1  2  3  4  5
Frequency (f): 3  5  8  4  2

1. Mean:

Formula: Mean = Σ(x * f) / Σf

Calculation: (1 * 3) + (2 * 5) + (3 * 8) + (4 * 4) + (5 * 2) = 3 + 10 + 24 + 16 + 10 = 63
Total frequency: 3 + 5 + 8 + 4 + 2 = 22

Mean = 63 / 22 ≈ 2.86

2. Median:

Total frequency = 22, so the middle position is (22 + 1) / 2 = 11.5; since 22 is even, the median is the average of the 11th and 12th ordered values.

Cumulative frequencies:
Value 1: 3
Value 2: 3 + 5 = 8
Value 3: 8 + 8 = 16 (covers positions 9 through 16, including the 11th and 12th)

Both the 11th and 12th values are 3, so the median is 3.

3. Mode:

The mode is the value with the highest frequency. In this case, it's 3 with a frequency of 8.

So, from this frequency table:

● Mean ≈ 2.86
● Median = 3
● Mode = 3
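
All three measures can be computed directly from the frequency table; a minimal sketch:

```python
# Mean, median, and mode from a frequency table
freq = {1: 3, 2: 5, 3: 8, 4: 4, 5: 2}

n = sum(freq.values())                              # 22
mean = sum(x * f for x, f in freq.items()) / n      # 63 / 22 ≈ 2.86

# Median: expand the table and average the two middle values
data = sorted(x for x, f in freq.items() for _ in range(f))
median = (data[n // 2 - 1] + data[n // 2]) / 2      # 11th and 12th values -> 3.0

mode = max(freq, key=freq.get)                      # 3 (highest frequency, 8)

print(round(mean, 2), median, mode)                 # 2.86 3.0 3
```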

3. Experiment, outcome, event, sample space

Experiment: A repeatable process with potentially different results each time.


Outcome: A possible result of an experiment.
Sample Space: The set of all possible outcomes of an experiment.
Event: A subset of the sample space, representing outcomes we're interested in.

4. Interpretations of probability, disjoint events, axioms of probability

Axioms of probability (for a sample space S and any event A):
1. P(A) ≥ 0
2. P(S) = 1
3. For disjoint events A and B, P(A ∪ B) = P(A) + P(B)

Two events A and B are disjoint if their intersection is the empty set: A ∩ B = ∅
P(A and B) = 0 for disjoint events, so P(A or B) = P(A) + P(B)
For events that need not be disjoint: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

5. Conditional probability, independent events, total probability theorem, Bayes' theorem

Conditional Probability: P(A|B) = P(A ∩ B) / P(B), where P(B) > 0
Independent Events: A and B are independent if P(A ∩ B) = P(A)P(B); equivalently, P(A|B) = P(A) when P(B) > 0

Total Probability Theorem


For a partition of the sample space {B₁, B₂, ..., Bₙ} and an event A:
P(A) = P(A|B₁)P(B₁) + P(A|B₂)P(B₂) + ... + P(A|Bₙ)P(Bₙ)

Bayes' Theorem
For events A and B, where P(B) > 0:
P(A|B) = [P(B|A) * P(A)] / P(B)
Using the total probability theorem:
P(A|B) = [P(B|A) * P(A)] / [P(B|A)P(A) + P(B|A')P(A')]
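
The formula translates directly into code; a minimal sketch of Bayes' theorem with the total-probability denominator (the numbers in the example call are hypothetical placeholders, not from these notes):

```python
# Bayes' theorem: P(A|B) = P(B|A) P(A) / [P(B|A) P(A) + P(B|A') P(A')]
def bayes(p_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)  # total probability
    return p_b_given_a * p_a / p_b

# Hypothetical example: P(A) = 0.01, P(B|A) = 0.95, P(B|A') = 0.05
print(bayes(0.01, 0.95, 0.05))  # ≈ 0.161
```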

6. Discrete random variables



A discrete random variable is a variable that can take on a countable number of distinct values. Each possible value of
the discrete random variable has a probability associated with it.

## Properties

1. Countable set of possible values


2. Each value has a non-negative probability
3. The sum of probabilities for all possible values equals 1

## Probability Mass Function (PMF)

The probability mass function, denoted as P(X = x) or f(x), gives the probability that a discrete random variable X takes
on the value x.

Properties of PMF:
1. 0 ≤ P(X = x) ≤ 1 for all x
2. ∑P(X = x) = 1, summed over all possible values of x

## Cumulative Distribution Function (CDF)

The cumulative distribution function, denoted as F(x), gives the probability that X takes on a value less than or equal to
x.

F(x) = P(X ≤ x) = ∑P(X = t), summed over all t ≤ x

Properties of CDF:
1. 0 ≤ F(x) ≤ 1 for all x
2. F(x) is non-decreasing
3. lim(x→-∞) F(x) = 0 and lim(x→∞) F(x) = 1

## Expected Value (Mean)


The expected value of a discrete random variable X, denoted E(X) or μ, is the sum of each possible value multiplied
by its probability.

E(X) = ∑(x * P(X = x)), summed over all possible values of x

## Variance

The variance of a discrete random variable X, denoted Var(X) or σ², measures the spread of the distribution around its
mean.

Var(X) = E((X - μ)²) = ∑((x - μ)² * P(X = x))

An alternative formula: Var(X) = E(X²) - (E(X))²

The standard deviation is the square root of the variance: σ = √(Var(X))
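
These definitions map directly onto code; a minimal sketch computing E(X), Var(X), and σ from a PMF stored as a dict (here, a fair six-sided die):

```python
import math

# PMF of a fair six-sided die
pmf = {x: 1/6 for x in range(1, 7)}
assert abs(sum(pmf.values()) - 1) < 1e-12   # probabilities sum to 1

mean = sum(x * p for x, p in pmf.items())               # E(X) = 3.5
var = sum((x - mean)**2 * p for x, p in pmf.items())    # Var(X) = 35/12 ≈ 2.9167

# Alternative formula: E(X²) - (E(X))²
var_alt = sum(x**2 * p for x, p in pmf.items()) - mean**2
assert abs(var - var_alt) < 1e-12

print(mean, var, math.sqrt(var))   # 3.5 2.9166... 1.7078...
```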

## Common Discrete Probability Distributions

1. Bernoulli Distribution
- Models a single trial with two possible outcomes (success/failure)
- P(X = 1) = p, P(X = 0) = 1 - p
- E(X) = p, Var(X) = p(1-p)

2. Binomial Distribution
- Models the number of successes in n independent Bernoulli trials
- P(X = k) = C(n,k) * p^k * (1-p)^(n-k)
- E(X) = np, Var(X) = np(1-p)

3. Poisson Distribution
- Models the number of events occurring in a fixed interval of time or space
- P(X = k) = (λ^k * e^(-λ)) / k!
- E(X) = λ, Var(X) = λ

4. Geometric Distribution
- Models the number of trials until the first success in repeated Bernoulli trials
- P(X = k) = p * (1-p)^(k-1)
- E(X) = 1/p, Var(X) = (1-p) / p²

5. Uniform Distribution (Discrete)


- All outcomes are equally likely
- P(X = x) = 1/n, where n is the number of possible outcomes
- E(X) = (a + b) / 2, Var(X) = ((b - a + 1)² - 1) / 12, where a and b are the minimum and maximum values

## Examples

1. Rolling a fair six-sided die:


- X = number on the die
- Possible values: {1, 2, 3, 4, 5, 6}
- PMF: P(X = x) = 1/6 for x ∈ {1, 2, 3, 4, 5, 6}
- E(X) = (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5

2. Number of heads in 3 coin flips:


- X = number of heads
- Possible values: {0, 1, 2, 3}
- Binomial distribution with n = 3, p = 0.5
- PMF: P(X = 0) = 1/8, P(X = 1) = 3/8, P(X = 2) = 3/8, P(X = 3) = 1/8
- E(X) = np = 3 * 0.5 = 1.5

3. Number of customers arriving at a store in an hour:


- X = number of customers
- Possible values: {0, 1, 2, ...}
- Can be modeled by a Poisson distribution with an appropriate λ
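
The coin-flip PMF in example 2 can be checked against the binomial formula using only the standard library; a minimal sketch:

```python
from math import comb

n, p = 3, 0.5
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
print(pmf)   # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125} -> 1/8, 3/8, 3/8, 1/8

mean = sum(k * prob for k, prob in pmf.items())
print(mean)  # 1.5, matching E(X) = np
```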

7. Expectation, variance, conditional expectation, law of total expectation

## 1. Expectation (Expected Value)

The expectation or expected value of a random variable X, denoted as E(X) or μ, is a measure of the central tendency
of the distribution.

### For Discrete Random Variables:


E(X) = ∑(x * P(X = x)), summed over all possible values of x

### For Continuous Random Variables:


E(X) = ∫(x * f(x) dx), integrated over the entire range of X

Where f(x) is the probability density function.

### Properties of Expectation:


1. Linearity: E(aX + b) = aE(X) + b
2. Additivity: E(X + Y) = E(X) + E(Y)
3. For independent X and Y: E(XY) = E(X)E(Y)

### Example:
Rolling a fair six-sided die:
E(X) = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6) = 3.5

## 2. Variance

Variance measures the spread or dispersion of a random variable around its mean.

Var(X) = E((X - μ)²) = E(X²) - (E(X))²

### For Discrete Random Variables:


Var(X) = ∑((x - μ)² * P(X = x)), summed over all possible values of x

### For Continuous Random Variables:


Var(X) = ∫((x - μ)² * f(x) dx), integrated over the entire range of X

### Properties of Variance:


1. Var(aX + b) = a²Var(X)
2. For independent X and Y: Var(X + Y) = Var(X) + Var(Y)

### Standard Deviation:


The standard deviation σ is the square root of the variance:
σ = √(Var(X))
### Example:
For the fair six-sided die:
E(X²) = 1²(1/6) + 2²(1/6) + 3²(1/6) + 4²(1/6) + 5²(1/6) + 6²(1/6) = 91/6 ≈ 15.17
Var(X) = E(X²) - (E(X))² = 91/6 - 3.5² = 35/12 ≈ 2.92

## 3. Conditional Expectation

The conditional expectation of X given Y, denoted as E(X|Y), is the expected value of X when Y is known or fixed.

### For Discrete Random Variables:


E(X|Y = y) = ∑(x * P(X = x|Y = y)), summed over all possible values of x

### For Continuous Random Variables:


E(X|Y = y) = ∫(x * f_{X|Y}(x|y) dx), integrated over the entire range of X

Where f_{X|Y}(x|y) is the conditional probability density function.

### Properties of Conditional Expectation:


1. Linearity: E(aX + b|Y) = aE(X|Y) + b
2. Tower Property: E(E(X|Y)) = E(X)

### Example:
Consider a two-stage experiment:
1. Roll a fair die (Y)
2. Flip a coin Y times and count the number of heads (X)

E(X|Y = y) = y/2 (because for y coin flips, we expect half to be heads)

## 4. Law of Total Expectation

The law of total expectation, also known as the law of iterated expectations or the tower rule, states that the expected
value of a random variable X can be computed by first taking the conditional expectation with respect to another
random variable Y and then taking the expectation of that result.

E(X) = E(E(X|Y))

For discrete Y: E(X) = ∑(E(X|Y = y) * P(Y = y)), summed over all possible values of y
For continuous Y: E(X) = ∫(E(X|Y = y) * f(y) dy), integrated over the entire range of Y

### Properties:
1. Useful for computing expectations in multi-stage experiments
2. Helps in breaking down complex expectations into simpler parts

### Example (continuing from the previous one):


E(X) = E(E(X|Y)) = E(Y/2) = E(Y)/2 = 3.5/2 = 1.75

This means that in our two-stage experiment, we expect an average of 1.75 heads.
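
The result E(X) = 1.75 can be sanity-checked by simulating the two-stage experiment; a minimal sketch (the seed and trial count are arbitrary choices for reproducibility):

```python
import random

# Two-stage experiment: roll a die (Y), then flip a fair coin Y times (X = heads)
random.seed(42)
trials = 200_000
total_heads = 0
for _ in range(trials):
    y = random.randint(1, 6)                                      # roll the die
    total_heads += sum(random.random() < 0.5 for _ in range(y))   # flip y coins

print(total_heads / trials)   # ≈ 1.75, matching E(E(X|Y)) = E(Y)/2
```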

9. Discrete probability distributions: Bernoulli, binomial, Poisson, geometric, uniform


## 1. Bernoulli Distribution

### Formula:
P(X = x) = p^x * (1-p)^(1-x), where x ∈ {0, 1}

### Properties:
- Mean (μ) = p
- Variance (σ²) = p(1-p)

### Best Example:


Flipping a coin once:
- X = 1 if heads (success)
- X = 0 if tails (failure)
- For a fair coin, p = 0.5

### Real-world Application:


Modeling the outcome of a single quality control check in manufacturing.

## 2. Binomial Distribution

### Formula:
P(X = k) = C(n,k) * p^k * (1-p)^(n-k)
where C(n,k) is the binomial coefficient ("n choose k")

### Properties:
- Mean (μ) = np
- Variance (σ²) = np(1-p)

### Best Example:


Number of heads in 10 coin flips:
- n = 10 (number of trials)
- p = 0.5 (probability of success on each trial)
- X = number of heads (successes)

### Real-world Application:


Modeling the number of defective items in a batch of manufactured goods.

## 3. Poisson Distribution

### Formula:
P(X = k) = (λ^k * e^(-λ)) / k!
where λ is the average rate of occurrence

### Properties:
- Mean (μ) = λ
- Variance (σ²) = λ

### Best Example:


Number of customers arriving at a store in an hour:
- λ = 5 (average number of customers per hour)
- X = number of customers arriving in a given hour

### Real-world Application:


Modeling rare events like radioactive decay or traffic accidents.
## 4. Geometric Distribution

### Formula:
P(X = k) = p * (1-p)^(k-1), where k = 1, 2, 3, ...

### Properties:
- Mean (μ) = 1/p
- Variance (σ²) = (1-p) / p²

### Best Example:


Number of coin flips until the first head appears:
- p = 0.5 (probability of success on each trial)
- X = number of flips needed to get the first head

### Real-world Application:


Modeling the number of attempts until a rare event occurs, like the number of job applications until getting an offer.

## 5. Discrete Uniform Distribution

### Formula:
P(X = x) = 1/n, where x ∈ {a, a+1, ..., b} and n = b - a + 1

### Properties:
- Mean (μ) = (a + b) / 2
- Variance (σ²) = ((b - a + 1)² - 1) / 12

### Best Example:


Rolling a fair six-sided die:
- a = 1, b = 6
- P(X = x) = 1/6 for x ∈ {1, 2, 3, 4, 5, 6}

### Real-world Application:


Modeling the selection of a random item from a finite set, like choosing a card from a well-shuffled deck.

## Comparison and Selection

1. Use Bernoulli for single trials with two possible outcomes.


2. Use Binomial for fixed number of independent Bernoulli trials.
3. Use Poisson for counting rare events in a fixed interval.
4. Use Geometric for counting trials until the first success.
5. Use Discrete Uniform when all outcomes are equally likely.

## Practical Examples

1. Bernoulli: A website A/B test where each user either clicks (1) or doesn't click (0) on a button.

2. Binomial: In a political survey of 1000 people, modeling the number who support a particular candidate.

3. Poisson: Modeling the number of earthquakes in a specific region over a year.

4. Geometric: The number of times you need to spin a roulette wheel before hitting your lucky number.

5. Discrete Uniform: Modeling the day of the week a person is born, assuming equal probability for each day.

Understanding these distributions and when to apply them is crucial for modeling real-world phenomena and making informed decisions based on probability theory.
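
Assuming SciPy is available, each of these distributions has a ready-made counterpart in scipy.stats whose moments match the formulas above; a minimal sketch:

```python
from scipy import stats

# Mean and variance of each distribution, matching the formulas in this section
print(stats.bernoulli(p=0.5).stats())        # mean 0.5, variance 0.25
print(stats.binom(n=10, p=0.5).stats())      # mean 5.0, variance 2.5
print(stats.poisson(mu=5).stats())           # mean 5.0, variance 5.0
print(stats.geom(p=0.5).stats())             # mean 2.0, variance 2.0
print(stats.randint(low=1, high=7).stats())  # uniform on 1..6: mean 3.5, var ≈ 2.92

# PMFs are available the same way, e.g. P(X = 3) for Binomial(10, 0.5):
print(stats.binom(n=10, p=0.5).pmf(3))       # ≈ 0.117
```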

10. Decision models: Non-deterministic uncertainty (maximax, maximin, Laplace, Hurwicz, minimax regret), probabilistic uncertainty / risk (EMV, EVPI)

1. Maximax (Optimistic Approach)

Formula: Choose the alternative with the highest possible payoff.


Max[Max(payoffs for each alternative)]

Example:
A company is considering three investment options with the following potential returns based on market conditions:
Option A: $50k, $80k, $30k
Option B: $40k, $70k, $60k
Option C: $55k, $65k, $45k

Maximax solution: Choose Option A, as it has the highest possible payoff of $80k.

2. Maximin (Pessimistic Approach)

Formula: Choose the alternative with the best worst-case scenario.


Max[Min(payoffs for each alternative)]

Example:
Using the same investment options:
Option A: $50k, $80k, $30k
Option B: $40k, $70k, $60k
Option C: $55k, $65k, $45k

Maximin solution: Choose Option C, as its worst-case scenario ($45k) is better than the others.

3. Laplace (Equal Likelihood)

Formula: Assume all outcomes are equally likely and choose the alternative with the highest average payoff.
Max[Average(payoffs for each alternative)]

Example:
For the same investment options:
Option A: ($50k + $80k + $30k) / 3 = $53.33k
Option B: ($40k + $70k + $60k) / 3 = $56.67k
Option C: ($55k + $65k + $45k) / 3 = $55k

Laplace solution: Choose Option B, as it has the highest average payoff.

4. Hurwicz (Weighted Average)

Formula: Choose based on a weighted average of the best and worst outcomes.
Max[α * Max(payoffs) + (1-α) * Min(payoffs)]
Where α is the optimism coefficient (0 ≤ α ≤ 1)

Example:
Let's use α = 0.6 for the same investment options:
Option A: 0.6 * $80k + 0.4 * $30k = $60k
Option B: 0.6 * $70k + 0.4 * $40k = $58k
Option C: 0.6 * $65k + 0.4 * $45k = $57k

Hurwicz solution: Choose Option A.

5. Minimax Regret

Formula:
1. Calculate the regret matrix (opportunity loss)
2. Find the maximum regret for each alternative
3. Choose the alternative with the minimum of these maximum regrets

Example:
The best payoff under each of the three market conditions (taking the listed payoffs in order) is $55k, $80k, and $60k. Subtracting each option's payoff from the condition's best payoff gives the regret matrix (in $k):
Option A: 5, 0, 30
Option B: 15, 10, 0
Option C: 0, 15, 15

Maximum regrets:
Option A: 30
Option B: 15
Option C: 15

Minimax Regret solution: Choose either Option B or C, as they share the lowest maximum regret ($15k).

6. Expected Monetary Value (EMV)

Formula: EMV = Σ (Probability of outcome * Value of outcome)

Example:
Suppose we have probabilities for our market conditions:
Good (30%), Normal (50%), Poor (20%)

Option A: 0.3 * $80k + 0.5 * $50k + 0.2 * $30k = $55k
Option B: 0.3 * $70k + 0.5 * $60k + 0.2 * $40k = $59k
Option C: 0.3 * $65k + 0.5 * $55k + 0.2 * $45k = $56k

EMV solution: Choose Option B, as it has the highest expected monetary value.

7. Expected Value of Perfect Information (EVPI)

Formula: EVPI = EMV with perfect information - EMV without perfect information

Example:
EMV with perfect information:
0.3 * $80k + 0.5 * $60k + 0.2 * $45k = $63k

EMV without perfect information (from previous calculation): $59k

EVPI = $63k - $59k = $4k

This means that perfect information about market conditions would be worth $4k to the decision-maker.

These models provide different approaches to decision-making under uncertainty, each with its own assumptions and
implications. The choice of model often depends on the decision-maker's attitude towards risk and the specific context
of the decision.
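
All seven criteria can be computed from the same payoff matrix; a minimal sketch reproducing the numbers in this section (note that the EMV example above pairs Good/Normal/Poor with each option's best, middle, and worst payoff, and the sketch follows that convention):

```python
# Payoffs in $k, listed per option in the order given above (three market states)
payoffs = {"A": [50, 80, 30], "B": [40, 70, 60], "C": [55, 65, 45]}
alpha = 0.6  # Hurwicz optimism coefficient

maximax = max(payoffs, key=lambda o: max(payoffs[o]))       # 'A' (best case $80k)
maximin = max(payoffs, key=lambda o: min(payoffs[o]))       # 'C' (worst case $45k)
laplace = max(payoffs, key=lambda o: sum(payoffs[o]) / 3)   # 'B' ($56.67k average)
hurwicz = max(payoffs, key=lambda o: alpha * max(payoffs[o])
              + (1 - alpha) * min(payoffs[o]))              # 'A' ($60k)

# Minimax regret: regret = best payoff in that state minus the option's payoff
best = [max(p[j] for p in payoffs.values()) for j in range(3)]   # [55, 80, 60]
max_regret = {o: max(b - v for b, v in zip(best, p)) for o, p in payoffs.items()}
print(max_regret)   # {'A': 30, 'B': 15, 'C': 15} -> B or C

# EMV / EVPI: pair Good (0.3) with each option's best payoff, Normal (0.5)
# with the middle one, and Poor (0.2) with the worst
probs = [0.3, 0.5, 0.2]
ranked = {o: sorted(p, reverse=True) for o, p in payoffs.items()}
emv = {o: sum(pr * v for pr, v in zip(probs, r)) for o, r in ranked.items()}
print(emv)          # {'A': 55.0, 'B': 59.0, 'C': 56.0} -> B

best_per_state = [max(r[j] for r in ranked.values()) for j in range(3)]  # [80, 60, 45]
evpi = sum(pr * v for pr, v in zip(probs, best_per_state)) - max(emv.values())
print(evpi)         # 4.0 ($4k)
```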

11. Simple decision trees (with not more than 2 decision nodes)

Decision trees are graphical representations of decision-making processes. They consist of:
1. Decision nodes (squares)
2. Chance nodes (circles)
3. End nodes (triangles)
4. Branches (lines connecting nodes)

For a simple decision tree with up to two decision nodes, we typically follow these steps:
1. Draw the initial decision node
2. Draw branches for each alternative
3. Add chance nodes for uncertain outcomes
4. Add end nodes with payoffs
5. Calculate expected values working backwards from right to left

Let's work through an example problem:

Example: Ice Cream Shop Decision

An entrepreneur is considering opening an ice cream shop. They have two decisions to make:
1. Shop size: Small or Large
2. Menu variety: Basic or Gourmet

The success of the shop depends on customer demand, which can be High or Low.

Here's the decision tree:

```mermaid
graph TD
A[Shop Size] -->|Small| B[Menu]
A -->|Large| C[Menu]
B -->|Basic| D((Demand))
B -->|Gourmet| E((Demand))
C -->|Basic| F((Demand))
C -->|Gourmet| G((Demand))
D -->|High 60%| H[$50k]
D -->|Low 40%| I[$20k]
E -->|High 70%| J[$70k]
E -->|Low 30%| K[$10k]
F -->|High 70%| L[$100k]
F -->|Low 30%| M[$-20k]
G -->|High 80%| N[$150k]
G -->|Low 20%| O[$-40k]

```

Now, let's solve this decision tree:

1. Calculate expected values at each chance node:


D: (0.6 * $50k) + (0.4 * $20k) = $38k
E: (0.7 * $70k) + (0.3 * $10k) = $52k
F: (0.7 * $100k) + (0.3 * $-20k) = $64k
G: (0.8 * $150k) + (0.2 * $-40k) = $112k

2. Choose the best option at each decision node:


B (Small shop): Max($38k, $52k) = $52k (Gourmet menu)
C (Large shop): Max($64k, $112k) = $112k (Gourmet menu)

3. Make the final decision:


A: Max($52k, $112k) = $112k (Large shop)

Therefore, the optimal decision is to open a large shop with a gourmet menu, which has an expected value of $112k.

This example demonstrates how a simple decision tree with two decision nodes can be used to analyze a complex
decision-making process involving multiple choices and uncertain outcomes. The tree helps visualize the options,
calculate expected values, and determine the optimal course of action based on the given probabilities and payoffs.
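
The backward-induction ("rollback") computation is compact in code; a minimal sketch of the ice cream example:

```python
# Rollback: expected value at each chance node, then the max at each decision node
chance = {
    "small_basic":   [(0.6, 50), (0.4, 20)],
    "small_gourmet": [(0.7, 70), (0.3, 10)],
    "large_basic":   [(0.7, 100), (0.3, -20)],
    "large_gourmet": [(0.8, 150), (0.2, -40)],
}
ev = {node: sum(p * v for p, v in outcomes) for node, outcomes in chance.items()}
print(ev)  # {'small_basic': 38.0, 'small_gourmet': 52.0,
           #  'large_basic': 64.0, 'large_gourmet': 112.0}

small = max(ev["small_basic"], ev["small_gourmet"])   # 52 (gourmet)
large = max(ev["large_basic"], ev["large_gourmet"])   # 112 (gourmet)
print(max(small, large))                              # 112 -> large shop, gourmet menu
```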

12. Discrete joint probability, properties of expectation, properties of variance

1. Discrete Joint Probability

Discrete joint probability deals with the probability of two or more discrete random variables occurring together.

Key concepts:
- Joint Probability Mass Function (PMF): P(X=x, Y=y)
- Marginal Probability: P(X=x) or P(Y=y)
- Conditional Probability: P(X=x | Y=y) or P(Y=y | X=x)

Example:
Suppose we have two dice, a red (X) and a blue (Y) die. The joint probability of getting a 3 on the red die and a 4 on
the blue die is:

P(X=3, Y=4) = 1/36

Properties:
1. 0 ≤ P(X=x, Y=y) ≤ 1 for all x and y
2. ∑∑ P(X=x, Y=y) = 1 (sum over all possible values of X and Y)

2. Properties of Expectation

Expectation (E) is the average value of a random variable over many trials.

Key properties:
a) Linearity: E[aX + b] = aE[X] + b
b) Additivity: E[X + Y] = E[X] + E[Y]
c) Multiplicativity for independent variables: E[XY] = E[X]E[Y]

Example:
Let X be the outcome of rolling a fair six-sided die.
E[X] = (1 + 2 + 3 + 4 + 5 + 6) / 6 = 3.5

If we define Y = 2X + 1:
E[Y] = E[2X + 1] = 2E[X] + 1 = 2(3.5) + 1 = 8

3. Properties of Variance

Variance (Var) measures the spread of a random variable around its mean.

Key properties:
a) Var(X) = E[(X - μ)²] = E[X²] - (E[X])²
b) Var(aX + b) = a²Var(X)
c) For independent variables: Var(X + Y) = Var(X) + Var(Y)

Example:
For the fair six-sided die:
E[X²] = (1² + 2² + 3² + 4² + 5² + 6²) / 6 = 91/6
Var(X) = E[X²] - (E[X])² = 91/6 - (3.5)² = 35/12 ≈ 2.92

Let's solve a problem that incorporates all these concepts:

Problem:
Two fair six-sided dice are rolled, a red (X) and a blue (Y) die.

a) What is P(X=4, Y=5)?


b) What is P(X=4)?
c) What is P(Y=5 | X=4)?
d) Calculate E[X + Y]
e) Calculate Var(X + Y)

Solutions:

a) P(X=4, Y=5) = 1/36

b) P(X=4) = 1/6

c) P(Y=5 | X=4) = P(X=4, Y=5) / P(X=4) = (1/36) / (1/6) = 1/6

d) E[X + Y] = E[X] + E[Y] = 3.5 + 3.5 = 7

e) Var(X + Y) = Var(X) + Var(Y) = 35/12 + 35/12 = 35/6 ≈ 5.83

This problem demonstrates the application of joint probability, conditional probability, expectation, and variance in a
simple scenario involving dice rolls.
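
Enumerating all 36 equally likely (red, blue) outcomes verifies these answers; a minimal sketch:

```python
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # all 36 (X, Y) pairs, prob 1/36 each

p_joint = sum(1 for x, y in outcomes if x == 4 and y == 5) / 36   # 1/36
p_x4 = sum(1 for x, y in outcomes if x == 4) / 36                 # 1/6
p_cond = p_joint / p_x4                                           # 1/6

totals = [x + y for x, y in outcomes]
mean = sum(totals) / 36                                   # E[X + Y] = 7
var = sum((t - mean)**2 for t in totals) / 36             # 35/6 ≈ 5.83

print(p_joint, p_x4, p_cond, mean, round(var, 2))
```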

13. Markov's and Chebyshev's inequalities


Markov's and Chebyshev's inequalities are important probability inequalities that provide upper bounds on the probability that a random variable takes large values or deviates far from its expected value. Each inequality is presented below with its formula and an example.

1. Markov's Inequality

Markov's inequality provides an upper bound for the probability that a non-negative random variable X is greater than
or equal to some positive value a.

Formula:
P(X ≥ a) ≤ E[X] / a

Where:
X is a non-negative random variable
E[X] is the expected value of X
a is any positive real number

Key points:
- Applies to non-negative random variables only
- Provides a loose bound, but is applicable to any distribution
- Most useful when we only know the expected value of the distribution

Example problem:

The average time students spend on social media daily is 2 hours. Using Markov's inequality, find an upper bound for
the probability that a randomly selected student spends at least 5 hours on social media daily.

Solution:
Let X be the time spent on social media (in hours).
E[X] = 2 hours
a = 5 hours

P(X ≥ 5) ≤ E[X] / a = 2 / 5 = 0.4

Therefore, the probability that a randomly selected student spends at least 5 hours on social media is at most 40%.

2. Chebyshev's Inequality

Chebyshev's inequality provides an upper bound for the probability that a random variable X deviates from its mean by
more than k standard deviations.

Formula:
P(|X - μ| ≥ kσ) ≤ 1 / k²

Where:
X is a random variable with mean μ and standard deviation σ
k is any positive real number

Alternatively, it can be written as:


P(|X - μ| ≥ a) ≤ σ² / a²

Key points:
- Applies to any distribution with a finite mean and variance
- Provides a tighter bound than Markov's inequality
- Most useful when we know both the mean and variance of the distribution

Example problem:

The scores on a standardized test are normally distributed with a mean of 500 and a standard deviation of 100. Use
Chebyshev's inequality to find an upper bound for the probability that a randomly selected student's score deviates
from the mean by more than 200 points.

Solution:
μ = 500
σ = 100
a = 200

We want to find P(|X - 500| ≥ 200)

Using Chebyshev's inequality:


P(|X - μ| ≥ a) ≤ σ² / a²
P(|X - 500| ≥ 200) ≤ 100² / 200² = 10000 / 40000 = 1/4 = 0.25

Therefore, the probability that a randomly selected student's score deviates from the mean by more than 200 points is
at most 25%.

Note: In this case, since we know the distribution is normal, we could calculate the exact probability, which would be
about 4.6%. Chebyshev's inequality provides an upper bound that is valid for any distribution with the same mean and
standard deviation, not just the normal distribution.
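
A minimal sketch comparing both bounds with the exact normal probability from the test-score example (assuming SciPy is available for the exact value):

```python
from scipy.stats import norm

mu, sigma, a = 500, 100, 200

# Chebyshev bound: P(|X - mu| >= a) <= sigma^2 / a^2
chebyshev = sigma**2 / a**2                  # 0.25
# Exact probability if X really is normal: both tails beyond 2 standard deviations
exact = 2 * norm.sf(a / sigma)               # ≈ 0.0455
print(chebyshev, round(exact, 4))

# Markov bound for the social-media example: P(X >= 5) <= E[X] / 5
print(2 / 5)                                 # 0.4
```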

14. Basics of continuous random variables & probability distributions



1. Continuous Random Variables

A continuous random variable is a variable that can take on any value within a given range. Unlike discrete random
variables, which can only take on specific values, continuous random variables can assume an infinite number of
values.

Key concepts:

a) Probability Density Function (PDF):


- Denoted as f(x), with f(x) ≥ 0 for all x
- The area under the PDF curve between two points represents the probability of the random variable falling within that range.
- ∫[all x] f(x) dx = 1

b) Cumulative Distribution Function (CDF):


- Denoted as F(x)
- F(x) = P(X ≤ x) = ∫[from -∞ to x] f(t) dt
- 0 ≤ F(x) ≤ 1

c) Expected Value:
E[X] = ∫[all x] x * f(x) dx

d) Variance:
Var(X) = E[(X - μ)²] = ∫[all x] (x - μ)² * f(x) dx

2. Common Continuous Probability Distributions

a) Uniform Distribution:
- Constant probability density over an interval [a, b]
- PDF: f(x) = 1 / (b - a) for a ≤ x ≤ b
- Mean: μ = (a + b) / 2
- Variance: σ² = (b - a)² / 12

b) Normal (Gaussian) Distribution:


- Bell-shaped curve
- PDF: f(x) = (1 / (σ√(2π))) * e^(-(x-μ)² / (2σ²))
- Mean: μ
- Variance: σ²
- Standard Normal Distribution: Z ~ N(0, 1)

c) Exponential Distribution:
- Models time between events in a Poisson process
- PDF: f(x) = λe^(-λx) for x ≥ 0
- Mean: 1 / λ
- Variance: 1 / λ²

Example Problem:

Let's solve a problem involving the uniform distribution:

Problem: The time it takes to complete a certain task is uniformly distributed between 10 and 20 minutes.

a) What is the probability density function (PDF) for this distribution?


b) What is the probability that the task takes more than 15 minutes?
c) What is the expected time to complete the task?
d) What is the variance of the completion time?

Solution:

a) PDF:
f(x) = 1 / (b - a) = 1 / (20 - 10) = 1/10 for 10 ≤ x ≤ 20

b) P(X > 15) = (20 - 15) / (20 - 10) = 5/10 = 0.5 or 50%

c) Expected value:
E[X] = (a + b) / 2 = (10 + 20) / 2 = 15 minutes

d) Variance:
Var(X) = (b - a)² / 12 = (20 - 10)² / 12 = 100 / 12 ≈ 8.33 (minutes²)
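
Assuming SciPy is available, the same task-time answers follow from scipy.stats.uniform (parameterized by loc = a and scale = b - a); a minimal sketch:

```python
from scipy.stats import uniform

# Uniform on [10, 20]: loc = a, scale = b - a
X = uniform(loc=10, scale=10)

print(X.pdf(12))    # 0.1  (1 / (b - a) inside the interval)
print(X.sf(15))     # 0.5  P(X > 15)
print(X.mean())     # 15.0
print(X.var())      # 8.333...
```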

15. Continuous uniform distribution, normal approximation to the binomial distribution

1. Continuous Uniform Distribution


The continuous uniform distribution, also known as the rectangular distribution, is a distribution where all intervals of
equal length on the distribution's support have equal probability.

Key properties:

a) Probability Density Function (PDF):


f(x) = 1 / (b - a) for a ≤ x ≤ b
f(x) = 0 otherwise

b) Cumulative Distribution Function (CDF):


F(x) = 0 for x < a
F(x) = (x - a) / (b - a) for a ≤ x ≤ b
F(x) = 1 for x > b

c) Mean: μ = (a + b) / 2

d) Variance: σ² = (b - a)² / 12

e) Standard Deviation: σ = (b - a) / √12

Example Problem:

The time to complete a certain task is uniformly distributed between 5 and 15 minutes.

1) What is the probability that the task takes less than 8 minutes?
2) What is the expected time to complete the task?
3) What is the standard deviation of the completion time?

Solution:

a = 5, b = 15

1) P(X < 8) = F(8) = (8 - 5) / (15 - 5) = 3/10 = 0.3 or 30%

2) E[X] = (a + b) / 2 = (5 + 15) / 2 = 10 minutes

3) σ = (b - a) / √12 = (15 - 5) / √12 ≈ 2.89 minutes

2. Normal Approximation to the Binomial Distribution

For large n and p not too close to 0 or 1, the binomial distribution can be approximated by a normal distribution. This is
based on the Central Limit Theorem.

Conditions for a good approximation:


- n * p ≥ 5
- n * (1-p) ≥ 5

The approximation uses these parameters:

- μ = n * p
- σ = √(n * p * (1-p))

When using this approximation, we often apply a continuity correction, which means we adjust the boundaries of our
interval by 0.5.

Example Problem:

A fair coin is tossed 100 times. Use the normal approximation to estimate the probability of getting 45 or fewer heads.

Solution:

1) Check if we can use the normal approximation:


n = 100, p = 0.5
n * p = 100 * 0.5 = 50 ≥ 5
n * (1-p) = 100 * 0.5 = 50 ≥ 5
So, we can use the approximation.

2) Calculate μ and σ:
μ = n * p = 100 * 0.5 = 50
σ = √(n * p * (1-p)) = √(100 * 0.5 * 0.5) = 5

3) Apply the continuity correction:


We want P(X ≤ 45), which becomes P(X < 45.5) in the continuous approximation.

4) Standardize to Z-score:
Z = (X - μ) / σ = (45.5 - 50) / 5 = -0.9

5) Look up the probability in a standard normal table or use a calculator:


P(Z < -0.9) ≈ 0.1841

Therefore, the probability of getting 45 or fewer heads in 100 tosses of a fair coin is approximately 0.1841 or 18.41%.
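
A minimal sketch comparing the continuity-corrected approximation with the exact binomial probability (assuming SciPy is available):

```python
import math
from scipy.stats import binom, norm

n, p = 100, 0.5
mu, sigma = n * p, math.sqrt(n * p * (1 - p))   # 50, 5

approx = norm.cdf((45.5 - mu) / sigma)   # continuity-corrected: P(Z < -0.9)
exact = binom.cdf(45, n, p)              # exact binomial P(X <= 45)
print(round(approx, 4), round(exact, 4)) # ≈ 0.184 for both
```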
