
Probability and Statistics

The document provides an overview of probability and statistics, explaining key concepts such as probability measures, types of data, and methods for data collection and analysis. It covers primary and secondary data, descriptive and inferential statistics, and various data presentation techniques including frequency distribution and graphical representations. Additionally, it discusses measures of central tendency, including mean, median, and mode, along with their properties and examples.

Uploaded by

danialzaheer80
Copyright
© All Rights Reserved

Probability:

Probability is the study of how likely an event is to happen. It's a measure of the
chance that a specific outcome will occur out of all possible outcomes.

Examples:

● Flipping a coin (the probability of getting a head is 1 out of 2, or 50%)
● Weather forecast (if it says there is a 70% chance of rain tomorrow, then the
probability of rain is 70%)

Statistics:
Statistics is the study of collecting, organizing, presenting, analyzing, summarizing,
and interpreting data, and of making decisions based on that analysis.

Examples:

● Average Score (collect data in the form of marks, find the average, and then
draw a conclusion about performance)
● Survey Reports (collect surveys, organize the responses in tabular form with
frequencies, and then make a decision)

Sources of Data:

● Primary Data: Primary data refers to raw data that you collect directly from
the original source. This data is in its raw form, meaning it hasn’t been
processed, analyzed, or manipulated through any statistical process.
Examples:
▪ Conducting Surveys (When you distribute a survey to gather responses,
the data you collect such as customer satisfaction feedback is
considered primary data because it is directly obtained from
participants)
▪ Performing Experiments (In scientific research, data collected directly
from experiments, like measuring the effect of a new drug on blood
pressure, is primary data as it’s unprocessed and directly obtained from
test subjects)
● Secondary Data: Data that has already been collected by someone else and is
available for use after being processed through some statistical process is
called secondary data.
Examples:
▪ Survey Reports (These are processed summaries of survey data
collected by others. For example, a government report on national
health statistics is secondary data that you can use without conducting
your own surveys)
▪ Experiment Results (Published research results become secondary data.
For example, a study on diet effects on heart health can be used as a
reference in your research, saving time and resources)

Types of Statistics:

● Descriptive: Descriptive statistics involves collecting, organizing,
presenting, analyzing and summarizing data using statistical formulas and
methods.
Examples:
▪ Company Performance Reports (A business may use descriptive
statistics to summarize sales data over the past year, highlighting
trends and average monthly sales to inform future strategies)
▪ Educational Performance (Schools might use descriptive statistics to
analyze student test scores, identifying the average performance and
distribution of grades to assess overall academic achievement)
● Inferential: Inferential statistics involves making decisions and drawing
conclusions based on sample information. Unlike descriptive statistics, which
only describe the data, inferential statistics allow you to make inferences and
decisions beyond the immediate data.
Examples:
▪ School Testing (A teacher gives a sample of students a practice test
and uses their scores to predict how the entire class will perform on
the actual exam)
▪ Product Testing (A company tests a new product on a small group of
customers and uses the feedback to infer how well the product will be
received by the broader market)

Basic Terms in Statistics

● Population: A complete set of all the objects or observations possessing
some common characteristic of interest.
● Sample: A sample is a part (subset) of the population, on the basis of which
we draw conclusions about the population.
● Parameter: Any numerical value calculated from the population is called a
parameter.
● Statistic: Any numerical value calculated from a sample is called a statistic.

Types of Data:

● Qualitative: Data that can’t be measured but can be observed is called
qualitative data.
Examples:
▪ Customer Feedback (descriptions of customer experiences and
opinions, such as satisfaction levels or comments about a service)
▪ Product Reviews (written reviews and ratings where users describe
their feelings about a product)
● Quantitative: Data that can be counted or measured on the basis of some
scale is called quantitative data.

Discrete: Data that can be counted is called discrete data.
Examples:
▪ Number of Students in the Class
▪ Number of Pages in a Book

Continuous: Data that can be measured on some scale is called continuous
data.
Examples:
▪ Temperature (measured on different scales: Celsius, Kelvin,
Fahrenheit, etc.)
▪ Distance between two cities (can be measured on some scale)
● Qualitative Variable: Anything that varies from object to object and stores
qualitative data; also known as an attribute.
Example: Color, Grades
● Quantitative Variable: Anything that varies from object to object and stores
quantitative data.
Example: Marks, CGPA (continuous variable)
Data Presentation:
1. Frequency Distribution
2. Graphical Representation

Frequency Distribution:
● Frequency Distribution (Qualitative Data):
▪ F = Frequency
▪ RF = Relative Frequency (proportion)
▪ % = Percentage = RF * 100

Grade F RF %
A 6 6/20 30
B 2 2/20 10
C 2 2/20 10
D 4 4/20 20
F 6 6/20 30

Total Number of Observations = Sum of Frequency

● Frequency Distribution (Quantitative Data):
▪ F = Frequency
▪ RF = Relative Frequency (proportion)
▪ % = Percentage
▪ C-I = Class interval
▪ CB = Class boundaries
▪ X = Midpoint
▪ CF = Cumulative Frequency

Optional Steps to select the No. of Classes and the Class Width:

Step 1: Find the Range = Max value – Min value

Step 2: Select the No. of Classes: Min (4), Ideal (5-7), Max (10-11)
Step 3: Class width = Range / No. of Classes

✔ The smallest value must fall into the first interval and the largest
must fall into the last interval. It is preferable that they lie inside
these intervals rather than on their edges.
✔ Quantitative data can be summarized into ranges (class intervals).

C-I       F    CB            X      CF    RF     %
2 - 5     8    1.5 - 5.5     3.5    8     8/30   26.7
6 - 9     7    5.5 - 9.5     7.5    15    7/30   23.3
10 - 13   8    9.5 - 13.5    11.5   23    8/30   26.7
14 - 17   3    13.5 - 17.5   15.5   26    3/30   10.0
18 - 21   4    17.5 - 21.5   19.5   30    4/30   13.3

▪ To find the CB, take the difference between the end of one interval and the
start of the next interval and divide it by 2. E.g. the 1st interval ends at 5 and
the 2nd starts at 6, so it will be (6 − 5)/2 = 0.5.
▪ Now subtract this number from the start of each interval and add it to
the end of each interval. E.g. the 1st interval converts from 2 – 5 to
1.5 – 5.5.
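
The class-boundary, midpoint and cumulative-frequency columns can be sketched in a few lines of Python. This is only an illustration: the class limits come from the table above, while the frequencies used here are assumed values chosen to total 30, and the names `limits`, `freqs` and `rows` are made up for the example.

```python
# Class limits as in the table above; the frequencies are illustrative
# assumptions chosen so that they total 30.
limits = [(2, 5), (6, 9), (10, 13), (14, 17), (18, 21)]
freqs = [8, 7, 8, 3, 4]

# Half the gap between the end of one class and the start of the next.
gap = (limits[1][0] - limits[0][1]) / 2   # (6 - 5) / 2 = 0.5

total = sum(freqs)
rows = []
cumulative = 0
for (low, high), f in zip(limits, freqs):
    cumulative += f
    rows.append({
        "CI": f"{low}-{high}",
        "F": f,
        "CB": (low - gap, high + gap),   # class boundaries
        "X": (low + high) / 2,           # midpoint of the class
        "CF": cumulative,                # cumulative frequency
        "RF": f / total,                 # relative frequency
    })

for row in rows:
    print(row)
```

The last cumulative frequency always equals the total number of observations, which is a quick check that no class was skipped.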

Graphical Representation:
● Histogram
● Pie chart
● Bar graph
● Box plot (using EDA)
1. Histogram:

A histogram is a graph of a quantitative frequency distribution. It is also used to
estimate/examine the shape of a data set or distribution. Class boundaries are
plotted on the x-axis and frequencies on the y-axis.

[Diagram: types of histogram graphs]

Get more info at:
https://www.atlassian.com/data/charts/histogram-complete-guide

2. Pie Chart:

A pie chart is a graphical representation of qualitative data. It is a circular diagram,
and the full angle of the circle is 360 degrees. The circle is divided into sectors, and
these sectors represent different categories, attributes, or quantities.

Diagram:

Suppose the following table represents the expenses of a person:

Items Expenses
Food 70
Clothing 50
Fuel 30
Rent 40
Others 10
In this table, we see that Total = 200. So, we convert each expense to a percentage:

● Food = 70/200 x 100% = 35%
● Clothing = 50/200 x 100% = 25%
● Fuel = 30/200 x 100% = 15%
● Rent = 40/200 x 100% = 20%
● Others = 10/200 x 100% = 5%

Based on these percentages we draw the pie chart by sizing each sector according to
its calculated percentage.

✔ If we have different individuals, then we have different pie charts. Each pie
chart represents one individual.

3. Bar Chart:
Bar charts are used for comparison and can show both quantitative and
qualitative data. They are often used to represent data that changes over time.
Diagram:

✔ Simple bar chart: Having only 1 qualitative/quantitative variable

✔ Multiple bar chart: Having more than 1 qualitative/quantitative variable


Summarize:
1. Measure of Central Tendency or Location:
Central tendency means the center of the data, where most of the items
lie, i.e. where the data is clustered. We have different
methods to find the center of different types of data:
● Mean: The standard average: sum all the data items and
divide the sum by the total number of items.
● Median: The value that divides the ordered data into two equal parts.
● Mode: The most frequent value in the data set.
❖ Mean:
The mean is the value obtained by summing all the data items and then
dividing this sum by the number of items in the data set.
The representation of the mean for sample data:
$\bar{x} = \frac{\sum X_i}{n} = \frac{\text{sum of all values}}{\text{no. of observations}}$
The representation of the mean for population data:
$\mu = \frac{\sum X_i}{N} = \frac{\text{sum of all values}}{\text{no. of observations}}$
✔ The data set we have is always treated as a sample unless it is stated to be
the population.

Example:

We have the marks of 18 students in a test:

5.5, 10, 11, 5.5, 6, 7, 11, 5, 5, 6, 7, 7, 11, 10, 7, 7, 7, 7

$\bar{x} = \frac{\sum X_i}{n} = \frac{135}{18} = 7.5$

Suppose the marks of 1 student, which are 39, were missed. If we add this item to
the data set, then:
$\bar{x} = \frac{\sum X_i}{n} = \frac{174}{19} = 9.16$

This shows how strongly the mean is affected by an extreme value.
So, if the data set has
extreme values, we should use a different average, such as the median or mode, which
is not affected by extreme values.
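
As a small sketch of this effect, the snippet below recomputes the mean and the median of the marks with and without the extreme value 39, using Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import mean, median

marks = [5.5, 10, 11, 5.5, 6, 7, 11, 5, 5, 6, 7, 7, 11, 10, 7, 7, 7, 7]
with_extreme = marks + [39]          # add the missed (extreme) value

mean_before = mean(marks)            # sum 135 over 18 values
mean_after = mean(with_extreme)      # sum 174 over 19 values
median_before = median(marks)
median_after = median(with_extreme)

print(mean_before, round(mean_after, 2))
print(median_before, median_after)
```

The mean moves noticeably when 39 is added, while the median of this data set stays the same.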

Question: The mean of 10 values is 13. After finding the mean we observe
that one value was missed; after adding that value the mean becomes 19. Find that
value.

Solution:

First mean:

$\bar{x} = 13,\ n = 10 \Rightarrow \sum X = 130$

Second mean:

$\bar{x} = 19,\ n = 11,\ \sum X = 130 + A$

Now,
$19 = \frac{130 + A}{11} \Rightarrow 209 = 130 + A \Rightarrow A = 209 - 130 = 79$

Properties of Mean:

● Weighted mean
● Combined mean
Weighted Mean:
When values are not equally important and each carries a different weight,
the mean computed in this situation is called the weighted mean.
Example: Aggregate calculation (Matric: 10% weight, FSc: 40% weight, Test:
50% weight).
Weights may be in the form of:
▪ Percentage
▪ Ratio
▪ Proportion
▪ Rank
✔ If weights are not assigned, we give our own weights.
$\bar{X}_w = \frac{\sum W_i X_i}{\sum W_i} = \frac{W_1 X_1 + W_2 X_2 + \dots}{W_1 + W_2 + \dots}$

Item       X    W
Food       70   1
Fuel       30   3
Clothing   50   4
Rent       40   2
Other      10   5

$\bar{X}_w = \frac{1 \times 70 + 3 \times 30 + 4 \times 50 + 2 \times 40 + 5 \times 10}{1 + 3 + 4 + 2 + 5} = \frac{490}{15} = 32.67$
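
The same weighted mean can be checked with a short helper. This is a minimal sketch; the function name `weighted_mean` is made up for the example.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Food, Fuel, Clothing, Rent, Other from the table above
expenses = [70, 30, 50, 40, 10]
weights = [1, 3, 4, 2, 5]

xw = weighted_mean(expenses, weights)
print(round(xw, 2))
```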

Combined Mean:
Here we compute one mean from different groups whose means are already
known, where all these means are computed on the same type of item, e.g.
heights.
$\bar{X}_c = \frac{\sum n_i \bar{X}_i}{\sum n_i}$

Example:

Groups               A     B     C
No of observations   15    25    10
Mean                 5     7     3
nX                   75    175   30

$\bar{X}_c = \frac{75 + 175 + 30}{50} = \frac{280}{50} = 5.6$

Question: The average weight of 10 males is 70 and the average weight of
15 females is 45. Find the combined mean.
Solution:
$\bar{X}_c = \frac{10 \times 70 + 15 \times 45}{10 + 15} = \frac{1375}{25} = 55$
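
A minimal sketch of the combined-mean formula, applied to both examples above. The helper name `combined_mean` is an assumption made for illustration.

```python
def combined_mean(sizes, means):
    """Combine group means, weighting each by its number of observations."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

groups_abc = combined_mean([15, 25, 10], [5, 7, 3])    # groups A, B, C
males_females = combined_mean([10, 15], [70, 45])      # 10 males, 15 females

print(groups_abc, males_females)
```

Note that simply averaging the group means (for example (70 + 45) / 2) would ignore the different group sizes and give a different answer.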

❖ Median:
The median is the value which divides a data set into two equal parts after
arranging the data in ascending or descending order. 50% of the observations are
below the median and 50% of the observations are above the median.
$M = \left(\frac{n+1}{2}\right)\text{th value in the data}$

Example:
Raw data: 7, 9, 11, 15, 7, 5, 2, 75, 7
Arranging in ascending order: 2, 5, 7, 7, 7, 9, 11, 15, 75
$M = \left(\frac{9+1}{2}\right)\text{th value} = 5\text{th value} = 7$

Data: 2, 7, 9, 11, 15, 17, 21, 25, 26,

$M = \left(\frac{10+1}{2}\right)\text{th value} = 5.5\text{th value} = \frac{5\text{th value} + 6\text{th value}}{2} = \frac{15 + 17}{2} = 16$

Extensions:

● Quartile: Divides the data into 4 equal parts using 3 points, known as Q1, Q2
and Q3, where Q2 is the median of the data set.
● Decile: Divides the data set into 10 equal parts using 9 points, known as D1,
D2, D3, D4, D5, D6, D7, D8, D9, where D5 is the median of the
data set.
● Percentile: Divides the data set into 100 equal parts using 99 points.
Percentiles are used when we want to study the position of individual values.

Note: The quartile is the most important and most frequently used type.

Quartile:

Formula:
$Q_v = \left(\frac{v(n+1)}{4}\right)\text{th value}$, where v = 1, 2, 3

Example:

Raw data: 7, 5, 3, 7, 9, 11, 15, 12, 20, 17, 13, 15, 10, 9

Arranged data: 3, 5, 7, 7, 9, 9, 10, 11, 12, 13, 15, 15, 17, 20

Now,

$Q_1 = \frac{1(14+1)}{4} = 3.75\text{th} = 3\text{rd} + 0.75(4\text{th} - 3\text{rd}) = 7 + 0.75(7 - 7) = 7$

$Q_2 = \frac{2(14+1)}{4} = 7.5\text{th} = \frac{7\text{th} + 8\text{th}}{2} = \frac{10 + 11}{2} = 10.5$

$Q_3 = \frac{3(14+1)}{4} = 11.25\text{th} = 11\text{th} + 0.25(12\text{th} - 11\text{th}) = 15 + 0.25(15 - 15) = 15$
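
The position rule v(n+1)/4 with interpolation between neighbouring values can be sketched as below. The helper `quantile_by_position` is a hypothetical name; with `parts=10` or `parts=100` the same helper would follow the decile and percentile position formulas.

```python
def quantile_by_position(data, v, parts):
    """Value at position v*(n+1)/parts (1-based), interpolating linearly."""
    ordered = sorted(data)
    n = len(ordered)
    position = v * (n + 1) / parts
    whole = int(position)            # e.g. 3 for the 3.75th position
    fraction = position - whole      # e.g. 0.75
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    lower = ordered[whole - 1]
    upper = ordered[whole]
    return lower + fraction * (upper - lower)

raw = [7, 5, 3, 7, 9, 11, 15, 12, 20, 17, 13, 15, 10, 9]
q1 = quantile_by_position(raw, 1, 4)
q2 = quantile_by_position(raw, 2, 4)
q3 = quantile_by_position(raw, 3, 4)
print(q1, q2, q3)
```

Other textbooks and libraries use slightly different position rules, so quartiles computed elsewhere may differ a little from the (n+1) rule used in these notes.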

Deciles:

Formula:
Dv = v(n+1)/10, Where v = 1,2,3,4,5,6,7,8,9

Percentile:

Formula:

Pv = v(n+1)/100, Where v = 1,2,3,4,5...99

❖ Mode:

The most repeated value in the data set.

Case 1: 3,5,15,11 // no mode

Case 2: 2,2,3,3,9,11,9,11 // no mode (every value occurs equally often)

Case 3: 3,7,9,15,15,8,17 // mode = 15

What Is the Difference Between Mode and Mean?

The mode is the number in a set of numbers that appears the most often. The
mean of a set of numbers is the sum of all the numbers divided by the
number of values in the set. The mean is also known as the average.

Identify the Shape of a Distribution with Averages:

✔ Symmetrical: mean = median = mode

✔ Positively Skewed: mean > median > mode

✔ Negatively Skewed: mean < median < mode

2. Measure of Dispersion:
● Range
● Variance
● Standard deviation (SD)
● Interquartile range (IQR)

❖ Range:
R = Max - Min => only for small data sets
If we have large data that also contains outliers, then the range is
not useful.
❖ Variance:
$S^2 = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2$

❖ Standard Deviation:

$S = \sqrt{\frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2}$

Example:

Data of group A: 3, 5, 7, 5, 2, 6, 4

Now,

X    X²
3    9
5    25
7    49
5    25
2    4
6    36
4    16
∑X = 32, ∑X² = 164, n = 7

$S^2 = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2 = \frac{164}{7} - \left(\frac{32}{7}\right)^2 = 23.43 - 20.90 = 2.53 \Rightarrow S = 1.59$

Mean = 4.57, SD = 1.59

Data of group B: 5, 9, 11, 10, 11, 10, 7

Now,

X    X²
5    25
9    81
11   121
10   100
11   121
10   100
7    49
∑X = 63, ∑X² = 597, n = 7

$S^2 = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2 = \frac{597}{7} - \left(\frac{63}{7}\right)^2 = 85.29 - 81 = 4.29 \Rightarrow S = 2.07$

Mean = 9, SD = 2.07
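
The variance formula used in these examples (mean of the squares minus the square of the mean, dividing by n) can be sketched as follows. The function name `variance_and_sd` is made up for illustration; it is the divide-by-n form, not the divide-by-(n − 1) form that some textbooks use for samples.

```python
from math import sqrt

def variance_and_sd(data):
    """Variance as mean of squares minus squared mean, and its square root."""
    n = len(data)
    mean_of_squares = sum(x * x for x in data) / n
    square_of_mean = (sum(data) / n) ** 2
    variance = mean_of_squares - square_of_mean
    return variance, sqrt(variance)

group_a = [3, 5, 7, 5, 2, 6, 4]
group_b = [5, 9, 11, 10, 11, 10, 7]

var_a, sd_a = variance_and_sd(group_a)
var_b, sd_b = variance_and_sd(group_b)
print(round(var_a, 2), round(sd_a, 2))
print(round(var_b, 2), round(sd_b, 2))
```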

❖ IQR:
✔ IQR = Q3 – Q1 = Inter Quartile Range
✔ Better suited than the SD for measuring variation when the data contains extreme values.

Coefficient of Variation:

CV is used when we must compare two or more data sets having different
units.
$CV = \frac{S}{\bar{x}} \times 100$

Let,
$CV = \frac{5\,\text{kg}}{5\,\text{kg}} \times 100 = 100\%$ (unitless)

$CV = \frac{10\,\text{cm}}{3\,\text{cm}} \times 100 = 333.3\%$ (unitless)

If the CV is high, there is more variation in the data. So, let:

1. CV1 = 10%
2. CV2 = 30%
3. CV3 = 20%

Data 2 is more variable than data 3, and data 3 is more variable than data 1.

❖ EDA (Exploratory Data Analysis) is an approach to analyzing data sets
to summarize their main characteristics, often using visual methods. It helps
in understanding the data before applying formal modeling or hypothesis
testing. The primary objectives of EDA are to:
● Understand data patterns: Identify trends, patterns, and relationships
between variables.
● Spot anomalies or outliers: Detect unusual data points that may skew
results.
● Check assumptions: Validate assumptions about distributions,
relationships, or other statistical properties.
● Prepare data for modeling: Clean and pre-process data by identifying
missing values, redundant features, or irrelevant variables.
❖ Techniques used in EDA:
● Descriptive statistics: Measures like mean, median, mode,
variance, etc.
● Visualization: Graphs like histograms, scatter plots, box plots, and
bar charts.
● Correlation analysis: Understanding relationships between
variables, often through scatter plots or correlation coefficients.
❖ Steps (five-number summary used for the box plot):
● Min
● Q1
● Median
● Q3
● Max
❖ Detecting Outliers:
$\text{Min} = Q_1 - 1.5(\text{IQR})$

$\text{Max} = Q_3 + 1.5(\text{IQR})$

If a value is greater than the Max limit or less than the Min
limit, then the value is an outlier.
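
As a sketch of this rule, the snippet below computes Q1, Q3, the IQR and the two limits for the raw data used in the median example, then lists the outliers. It assumes the same (n+1) position rule for quartiles as the quartile section; the helper names are illustrative.

```python
def value_at(ordered, position):
    """Value at a 1-based, possibly fractional position (linear interpolation)."""
    n = len(ordered)
    position = min(max(position, 1), n)   # clamp to the valid range
    whole = int(position)
    fraction = position - whole
    if whole == n:
        return ordered[-1]
    return ordered[whole - 1] + fraction * (ordered[whole] - ordered[whole - 1])

def find_outliers(data):
    ordered = sorted(data)
    n = len(ordered)
    q1 = value_at(ordered, 1 * (n + 1) / 4)
    q3 = value_at(ordered, 3 * (n + 1) / 4)
    iqr = q3 - q1
    low_limit = q1 - 1.5 * iqr            # "Min" in the notes
    high_limit = q3 + 1.5 * iqr           # "Max" in the notes
    outliers = [x for x in ordered if x < low_limit or x > high_limit]
    return q1, q3, iqr, low_limit, high_limit, outliers

result = find_outliers([7, 9, 11, 15, 7, 5, 2, 75, 7])
print(result)
```

For this data the value 75 lies above the upper limit, so it is flagged as an outlier.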

❖ Box Plot:
Probability
o Probability is the chance of an event occurring.
o It is calculated and expressed as quantitative data.
o It can’t be greater than 1 or 100%.
$P(A) = \frac{\text{Favourable Outcomes}}{\text{Total Outcomes}}$

E.g.

Male = 15

Total = 45

$P(M) = \frac{15}{45} = \frac{1}{3}$ or 0.33 or 33.3%

$P(F) = P(\text{Total}) - P(\text{Males}) = 1 - 0.33 = 0.67$


❖ Experiment:
o Any planned activity which generates data.
❖ Random Experiment:
o Any experiment that can generate different outcomes when repeated under
similar conditions is called a random experiment.
❖ Sample Space:
o The complete set of all possible outcomes of an experiment.

E.g.

Dice Sample Space = {1, 2, 3, 4, 5, 6}

❖ Sample Point:
o Each individual outcome in the sample space, also called an
element or member of the sample space. All sample points together make up
the sample space.
❖ Event:
o The set of all favorable outcomes from the sample space
o n = number of elements in that set

E.g.
$S = \{1, 2, 3, 4, 5, 6\}$

$A = \text{odd number} = \{1, 3, 5\}$

$B = \text{multiple of 4} = \{4\}$

$C = \text{multiple of 7} = \phi$

$n(A) = 3,\ n(S) = 6$
$P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = \frac{1}{2}$ or 50%

Types of Events
❖ Simple event:
o An event consisting of only 1 sample point (element / unit / outcome)
e.g. $B = \{4\}$

❖ Compound Event:
o An event consisting of more than 1 sample point e.g. $A = \{1, 3, 5\}$

❖ Sure event:
o An event that contains all possible outcomes from the sample space
o $C = \{x \mid 1 \leq x \leq 6\} = S$

❖ Impossible Event:
o An event whose outcomes are not present in the sample space
o $D = \{7\},\ S \cap D = \phi$

❖ Equally Likely Events:
o If the probability of one event is equal to the probability of a second
event, then they are called equally likely events.
o $P(A) = P(B)$ or $n(A) = n(B)$

E.g.

$S = \{1, 2, 3, 4, 5, 6\}$

$A = \{3, 6\},\ B = \{2, 4, 6\},\ C = \{1, 3, 5\}$
B and C are equally likely events.
❖ Mutually exclusive events:
o Two events that have no common elements
o Two events that can’t occur together
o $B \cap C = \phi$
❖ Exhaustive events:
o Events that combine to make the complete sample space
o $B \cup C = S$
❖ Independent Events:
o Two events are independent if the occurrence of one does not affect the
probability of the other. (If an event B is affected by an earlier event A, then B depends on A.)
o $P(A \cap B) = P(A) \cdot P(B)$
Example:

Sample space of dice and coin:

𝑆 = {(𝐻, 1), (𝐻, 2), (𝐻, 3), (𝐻, 4), (𝐻, 5), (𝐻, 6), (𝑇, 1), (𝑇, 2), (𝑇, 3), (𝑇, 4), (𝑇, 5), (𝑇, 6)}

A is an event that Head occurs:


𝐴 = {(𝐻, 1), (𝐻, 2), (𝐻, 3), (𝐻, 4), (𝐻, 5), (𝐻, 6)}

B is an event that multiple of 2 occurs:


𝐵 = {(𝐻, 2), (𝐻, 4), (𝐻, 6), (𝑇, 2), (𝑇, 4), (𝑇, 6)}

C is an event that multiple of 5 occurs:


𝐶 = {(𝐻, 5), (𝑇, 5)}

Questions:

● Prove that A and B are independent events.
● Check the independence of A and C, and of B and C.

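Solution:

$A \cap B = \{(H, 2), (H, 4), (H, 6)\}$

$P(A \cap B) = \frac{n(A \cap B)}{n(S)} = \frac{3}{12} = \frac{1}{4}$

$P(A) = \frac{n(A)}{n(S)} = \frac{6}{12} = \frac{1}{2}$

$P(B) = \frac{n(B)}{n(S)} = \frac{6}{12} = \frac{1}{2}$

Now,
$P(A \cap B) = \frac{1}{4}$ and $P(A) \cdot P(B) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$

So, A and B are independent events.

$A \cap C = \{(H, 5)\}$

$P(A \cap C) = \frac{n(A \cap C)}{n(S)} = \frac{1}{12}$

$P(A) = \frac{n(A)}{n(S)} = \frac{6}{12} = \frac{1}{2}$

$P(C) = \frac{n(C)}{n(S)} = \frac{2}{12} = \frac{1}{6}$

Now,
$P(A \cap C) = \frac{1}{12}$ and $P(A) \cdot P(C) = \frac{1}{2} \cdot \frac{1}{6} = \frac{1}{12}$

So, A and C are independent events.

$B \cap C = \phi$ (no outcome is both a multiple of 2 and a multiple of 5)

$P(B \cap C) = \frac{n(B \cap C)}{n(S)} = \frac{0}{12} = 0$

$P(B) = \frac{n(B)}{n(S)} = \frac{6}{12} = \frac{1}{2}$

$P(C) = \frac{n(C)}{n(S)} = \frac{2}{12} = \frac{1}{6}$

Now,
$P(B \cap C) = 0$ but $P(B) \cdot P(C) = \frac{1}{2} \cdot \frac{1}{6} = \frac{1}{12}$

So, B and C are not independent events; they are mutually exclusive.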
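
The product rule can also be checked by enumerating the 12 outcomes of the coin-and-die sample space directly. This is a minimal sketch using exact fractions; the helper names `prob` and `independent` are made up for the example.

```python
from fractions import Fraction
from itertools import product

# All 12 outcomes (coin face, die face)
sample_space = set(product("HT", range(1, 7)))

def prob(event):
    return Fraction(len(event), len(sample_space))

def independent(e1, e2):
    """True when P(e1 and e2) equals P(e1) * P(e2)."""
    return prob(e1 & e2) == prob(e1) * prob(e2)

A = {o for o in sample_space if o[0] == "H"}       # head occurs
B = {o for o in sample_space if o[1] % 2 == 0}     # multiple of 2 on the die
C = {o for o in sample_space if o[1] % 5 == 0}     # multiple of 5 on the die

print(independent(A, B), independent(A, C), independent(B, C))
```

B and C share no outcome, so their intersection has probability 0 while the product of their probabilities is 1/12.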

Addition law:

If events are mutually exclusive:

𝑃(𝐴∪𝐵) = 𝑃(𝐴) + 𝑃(𝐵)

If events are not mutually exclusive:

𝑃(𝐴∪𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴∩𝐵)

✔ A probability can’t be greater than one, and the probabilities of all sample points sum to one.

Question: A & B can solve 70% and 80% of the problems using software,
respectively. Find the chance that a problem chosen at random is solved by at least
one of them.
Solution:

P(A) = 0.7

P(B) = 0.8

The events are not mutually exclusive.
The events are independent.
At least one means either A, B, or both can solve the problem.

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

$P(A \cap B) = P(A) \cdot P(B) = 0.7 \times 0.8 = 0.56$

Now,

$P(A \cup B) = (0.7 + 0.8) - 0.56 = 0.94$ or 94%

Question: A card is selected from a deck. Find the probability that the chosen card
is a King or a Queen.

Solution:
$P(K) = \frac{n(K)}{n(\text{Cards})} = \frac{4}{52} = 0.077$

and,
$P(Q) = \frac{n(Q)}{n(\text{Cards})} = \frac{4}{52} = 0.077$

The events are mutually exclusive:

$P(K \cup Q) = P(K) + P(Q)$

$P(K \cup Q) = \frac{4}{52} + \frac{4}{52} = \frac{8}{52} = 0.154$ or 15.4%

Conditional probability:

If an event has already occurred, then the probability of a subsequent event
that is affected by the previous event is called conditional probability.
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$

A | B means B has already occurred.

Example:

                    Speed Violation   No Violation   Total
Car Phone User      25                280            305
No Car Phone User   45                405            450
Total               70                685            755

Find the probability that a person is a car phone user:

$P(\text{CarPhoneUser}) = \frac{305}{755} = 0.40$ or 40%

What is the probability of a speed violation:

$P(\text{SpeedViolation}) = \frac{70}{755} = 0.09$ or 9%

Probability of no violation:

$P(\text{NoViolation}) = \frac{685}{755} = 0.91$ or 91%

Find the probability that a person is a car phone user or has no violation:

✔ The events are not mutually exclusive:

$P(\text{CarPhUsr} \cup \text{NoVio}) = P(\text{CarPhUsr}) + P(\text{NoVio}) - P(\text{CarPhUsr} \cap \text{NoVio})$

$P(\text{CarPhUsr} \cup \text{NoVio}) = \left(\frac{305}{755} + \frac{685}{755}\right) - \frac{280}{755} = 0.94$ or 94%

Find the probability that a person is a car phone user, given that a speed violation
has already occurred:

$P(\text{CarPhUsr} \mid \text{SpdVio}) = \frac{P(\text{CarPhUsr} \cap \text{SpdVio})}{P(\text{SpdVio})} = \frac{25/755}{70/755} = \frac{25}{70} = 0.357$ or 35.7%
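
The table-based probabilities above can be reproduced from the four cell counts. This is a small sketch using exact fractions; the dictionary keys and the helper `p` are illustrative names, not part of the original notes.

```python
from fractions import Fraction

# Cell counts from the table: (phone status, violation status) -> count
counts = {
    ("user", "violation"): 25,
    ("user", "no_violation"): 280,
    ("non_user", "violation"): 45,
    ("non_user", "no_violation"): 405,
}
total = sum(counts.values())   # 755

def p(condition):
    """Probability of the cells whose key satisfies the condition."""
    return Fraction(sum(c for key, c in counts.items() if condition(key)), total)

p_user = p(lambda k: k[0] == "user")
p_violation = p(lambda k: k[1] == "violation")
p_no_violation = p(lambda k: k[1] == "no_violation")
p_user_and_no_violation = p(lambda k: k == ("user", "no_violation"))
p_user_and_violation = p(lambda k: k == ("user", "violation"))

# Addition law for events that are not mutually exclusive
p_user_or_no_violation = p_user + p_no_violation - p_user_and_no_violation
# Conditional probability P(user | violation)
p_user_given_violation = p_user_and_violation / p_violation

print(float(p_user_or_no_violation), float(p_user_given_violation))
```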

Example: A pair of dice is rolled. Find the given probabilities:

Sample Space:

{(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), (2,3), (2,4), (2,5), (2,6), (3,1), (3,2),
(3,3), (3,4), (3,5), (3,6), (4,1), (4,2), (4,3), (4,4), (4,5), (4,6), (5,1), (5,2), (5,3), (5,4),
(5,5), (5,6), (6,1), (6,2), (6,3), (6,4), (6,5), (6,6)}

● The sum is greater than 5 given that the first die is 4

Total Outcomes = (4,1), (4,2), (4,3), (4,4), (4,5), (4,6)
Favorable = (4,2), (4,3), (4,4), (4,5), (4,6)

$P(G) = \frac{\text{Favourable}}{\text{Total}} = \frac{5}{6}$

● The sum is greater than 9 given that the second die is 6

Total Outcomes = (1,6), (2,6), (3,6), (4,6), (5,6), (6,6)
Favorable = (4,6), (5,6), (6,6)
$P(A) = \frac{\text{Favourable}}{\text{Total}} = \frac{3}{6} = \frac{1}{2}$

● The sum is even given that the second die is 4
Total Outcomes = (1,4), (2,4), (3,4), (4,4), (5,4), (6,4)
Favorable = (2,4), (4,4), (6,4)
$P(A) = \frac{\text{Favourable}}{\text{Total}} = \frac{3}{6} = \frac{1}{2}$

● The sum is less than 3 given that the sum is odd

Total = (1,2), (1,4), (1,6), (2,1), (2,3), (2,5), (3,2), (3,4), (3,6), (4,1), (4,3), (4,5),
(5,2), (5,4), (5,6), (6,1), (6,3), (6,5)

Favorable = 0
$P(A) = \frac{\text{Favourable}}{\text{Total}} = 0$

● The sum is even or a multiple of 2 occurs on the 2nd die

Intersection of even sum and second die is a multiple of 2 = (2,2), (2,4), (2,6), (4,2),
(4,4), (4,6), (6,2), (6,4), (6,6)
P(intersection) = 9/36
$P(A) = \frac{18}{36} + \frac{18}{36} - \frac{9}{36} = \frac{27}{36} = \frac{3}{4}$
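
These dice results can be verified by brute-force enumeration of the 36 outcomes: restrict the sample space to the outcomes satisfying the condition, then count the favourable ones. The helper `conditional` is a hypothetical name for this sketch.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))   # 36 ordered pairs

def conditional(event, given):
    """P(event | given): count within the reduced sample space."""
    reduced = [r for r in rolls if given(r)]
    favourable = [r for r in reduced if event(r)]
    return Fraction(len(favourable), len(reduced))

p1 = conditional(lambda r: sum(r) > 5, lambda r: r[0] == 4)
p2 = conditional(lambda r: sum(r) > 9, lambda r: r[1] == 6)
p3 = conditional(lambda r: sum(r) % 2 == 0, lambda r: r[1] == 4)
p4 = conditional(lambda r: sum(r) < 3, lambda r: sum(r) % 2 == 1)

# "Sum is even OR second die is a multiple of 2" (a union, not a conditional)
union = [r for r in rolls if sum(r) % 2 == 0 or r[1] % 2 == 0]
p5 = Fraction(len(union), len(rolls))

print(p1, p2, p3, p4, p5)
```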

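Question: One bag contains 6 red and 4 black balls, another bag contains 4 red and 6 black balls,
and a 3rd bag contains 5 red and 5 black balls. One bag is selected randomly, and a
ball is drawn. If the ball is red, find the probability that it was drawn from bag 1.
Solution:
$P(B_1) = \frac{1}{3}$

$P(B_2) = \frac{1}{3}$

$P(B_3) = \frac{1}{3}$

Let A = Red Ball

$P(B_1 \mid A) = \frac{P(A \mid B_1)P(B_1)}{P(A)}$
$P(B_1 \mid A) = \frac{P(A \mid B_1)P(B_1)}{P(A \mid B_1)P(B_1) + P(A \mid B_2)P(B_2) + P(A \mid B_3)P(B_3)}$
$P(A \mid B_1) = \frac{6}{10}$

$P(A \mid B_2) = \frac{4}{10}$

$P(A \mid B_3) = \frac{5}{10}$

$P(B_1 \mid A) = \frac{\left(\frac{6}{10}\right)\left(\frac{1}{3}\right)}{\left(\frac{6}{10}\right)\left(\frac{1}{3}\right) + \left(\frac{4}{10}\right)\left(\frac{1}{3}\right) + \left(\frac{5}{10}\right)\left(\frac{1}{3}\right)}$

$P(B_1 \mid A) = \frac{6}{15} = \frac{2}{5}$

Or,

        Red   Black   Total
Bag1    6     4       10
Bag2    4     6       10
Bag3    5     5       10
Total   15    15      30

$P(B_1 \mid A) = \frac{6}{15} = \frac{2}{5}$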

Question: There are 2000 scooter drivers, 4000 car drivers, and 6000 truck drivers. The
probabilities of an accident involving a scooter driver, a car driver, and a truck driver are
0.01, 0.03, and 0.15 respectively. If an accident happens, find the probability that
the driver is a scooter driver.
Solution:
Total = 12000

Scooter = 2000

Car = 4000

Truck = 6000

Now,
$P(S) = \frac{2000}{12000} = \frac{1}{6}$

$P(C) = \frac{4000}{12000} = \frac{1}{3}$

$P(T) = \frac{6000}{12000} = \frac{1}{2}$

And,
$P(A \mid S) = 0.01$

$P(A \mid C) = 0.03$

$P(A \mid T) = 0.15$

$P(S \mid A) = \frac{P(A \mid S)P(S)}{P(A \mid S)P(S) + P(A \mid C)P(C) + P(A \mid T)P(T)} = \frac{0.01 \times \frac{1}{6}}{0.01 \times \frac{1}{6} + 0.03 \times \frac{1}{3} + 0.15 \times \frac{1}{2}} = 0.0192 \approx 2\%$
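
Both questions above follow the same Bayes pattern: multiply each prior by its likelihood, then divide the term of interest by the sum of all terms. A minimal sketch with exact fractions is shown below; the function `posterior` and the dictionary keys are names invented for the example.

```python
from fractions import Fraction

def posterior(priors, likelihoods, target):
    """Bayes' theorem: P(target | evidence) from priors and P(evidence | hypothesis)."""
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return priors[target] * likelihoods[target] / evidence

# Three bags, each equally likely; likelihood = P(red | bag)
bag1_given_red = posterior(
    {"bag1": Fraction(1, 3), "bag2": Fraction(1, 3), "bag3": Fraction(1, 3)},
    {"bag1": Fraction(6, 10), "bag2": Fraction(4, 10), "bag3": Fraction(5, 10)},
    "bag1",
)

# Drivers; likelihood = P(accident | driver type)
scooter_given_accident = posterior(
    {"scooter": Fraction(1, 6), "car": Fraction(1, 3), "truck": Fraction(1, 2)},
    {"scooter": Fraction(1, 100), "car": Fraction(3, 100), "truck": Fraction(15, 100)},
    "scooter",
)

print(bag1_given_red, float(scooter_given_accident))
```

The same helper can be reused for the practice questions by swapping in their priors and likelihoods.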

Bayes Theorem Practice Questions

Weather Forecasting:
A weather service predicts rain with an accuracy of 80%. However, the actual
probability of rain on any given day is 30%. What is the probability that it will rain
given that the service has predicted rain?
Solution:
$P(WP \mid R) = 0.8$

$P(WP \mid NR) = 0.2$

$P(R) = 0.3$

$P(NR) = 0.7$

Now,
$P(R \mid WP) = \frac{P(WP \mid R)P(R)}{P(WP \mid R)P(R) + P(WP \mid NR)P(NR)} = \frac{0.8 \times 0.3}{0.8 \times 0.3 + 0.2 \times 0.7} = 0.6316$

Spam Filtering:
A certain email is classified as spam based on certain keywords. If 70% of spam
emails contain a specific keyword, and only 10% of legitimate emails contain it,
what is the probability that an email with the keyword is spam, given that 20% of
all emails are spam?
Solution:
$P(KW \mid SE) = 0.7$

$P(KW \mid LE) = 0.1$

$P(SE) = 0.2$
$P(LE) = 0.8$

Now,
$P(SE \mid KW) = \frac{P(KW \mid SE)P(SE)}{P(KW \mid SE)P(SE) + P(KW \mid LE)P(LE)} = \frac{0.7 \times 0.2}{0.7 \times 0.2 + 0.1 \times 0.8} = 0.6364$

Quality Control:
A factory produces two types of products, A and B. Product A is defective 1% of the
time, and Product B is defective 2% of the time. If 60% of the products are A and
40% are B, what is the probability that a randomly chosen defective product is from
type A?
Solution:
$P(D \mid A) = 0.01$

$P(D \mid B) = 0.02$

$P(A) = 0.6$

$P(B) = 0.4$

Now,
$P(A \mid D) = \frac{P(D \mid A)P(A)}{P(D \mid A)P(A) + P(D \mid B)P(B)} = \frac{0.01 \times 0.6}{0.01 \times 0.6 + 0.02 \times 0.4} = 0.4286$

Disease and Symptoms:


A certain disease occurs in 1% of the population. A test for the disease has a 90%
true positive rate and a 5% false positive rate. If a person tests positive, what is the
probability that they have the disease?
Solution:
$P(DO) = 0.01$

$P(DNO) = 0.99$

$P(T \mid DO) = 0.9$

$P(T \mid DNO) = 0.05$

Now,
$P(DO \mid T) = \frac{P(T \mid DO)P(DO)}{P(T \mid DO)P(DO) + P(T \mid DNO)P(DNO)} = \frac{0.9 \times 0.01}{0.9 \times 0.01 + 0.05 \times 0.99} = 0.1538$

Legal Cases:
In a certain region, 5% of people are criminals. If a person is a criminal, there is a
75% chance that they will leave evidence. If they are not criminals, there is a 10%
chance they will leave evidence. If evidence is found, what is the probability that the
person is a criminal?
Solution:
$P(C) = 0.05$

$P(NC) = 0.95$

$P(E \mid C) = 0.75$

$P(E \mid NC) = 0.10$

Now,
$P(C \mid E) = \frac{P(E \mid C)P(C)}{P(E \mid C)P(C) + P(E \mid NC)P(NC)} = \frac{0.75 \times 0.05}{0.75 \times 0.05 + 0.1 \times 0.95} = 0.283$

False Alarm:
A security system has a 98% chance of correctly detecting a break-in but also has a
3% false alarm rate. If an alarm goes off, what is the probability that there was a
break-in, given that break-ins occur in 1% of cases?
Solution:
$P(D \mid B) = 0.98$

$P(D \mid NB) = 0.03$

$P(B) = 0.01$

$P(NB) = 0.99$

Now,
$P(B \mid D) = \frac{P(D \mid B)P(B)}{P(D \mid B)P(B) + P(D \mid NB)P(NB)} = \frac{0.98 \times 0.01}{0.98 \times 0.01 + 0.03 \times 0.99} = 0.2481$

Counterfeit Coins:
You have two coins: one is a fair coin (50% heads, 50% tails), and the other is a
biased coin (75% heads, 25% tails). You pick one at random and flip it, getting
heads. What is the probability that you picked the biased coin?
Solution:
$P(F) = 0.5$

$P(B) = 0.5$

$P(H \mid F) = 0.5$

$P(H \mid B) = 0.75$
Now,
$P(B \mid H) = \frac{P(H \mid B)P(B)}{P(H \mid B)P(B) + P(H \mid F)P(F)} = \frac{0.75 \times 0.5}{0.75 \times 0.5 + 0.5 \times 0.5} = 0.6$

Urn Problem:
An urn contains 3 red balls and 2 blue balls. You draw a ball at random, note its
color, and put it back. Then you draw a second ball. If the second ball is red, what is
the probability that the first ball was also red?
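Solution:
$P(R_1 \mid R_2) = ?$

Given,
$P(R_1) = \frac{3}{5} = 0.6$

$P(B_1) = \frac{2}{5} = 0.4$

$P(R_2 \mid R_1) = \frac{3}{5} = 0.6$

$P(R_2 \mid B_1) = \frac{3}{5} = 0.6$

Now,
$P(R_1 \mid R_2) = \frac{P(R_2 \mid R_1)P(R_1)}{P(R_2 \mid R_1)P(R_1) + P(R_2 \mid B_1)P(B_1)} = \frac{0.6 \times 0.6}{0.6 \times 0.6 + 0.6 \times 0.4} = 0.6$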

Question: A person goes to work daily. He uses his car 70% of the time, walks
30% of the time, and uses the bus 40% of the time. He is late 10% of the time
when he walks to work, 3% of the time when he uses his car, and 7% of the time
when he uses the bus. Given that the person is late, find the probability that he used his car.

$P(B) = 0.4,\ P(C) = 0.7,\ P(W) = 0.3$

$P(L \mid B) = 0.07,\ P(L \mid C) = 0.03,\ P(L \mid W) = 0.1$

$P(C \mid L) = \frac{P(L \mid C) \cdot P(C)}{P(L \mid C) \cdot P(C) + P(L \mid W) \cdot P(W) + P(L \mid B) \cdot P(B)}$

Random Variables and Probability Distribution

- Discrete random variables
- Continuous random variables
Discrete
1. F(x) >= 0
F(x) is a function that computes probability

2. ∑ F(x) = 1

If this is true, then it is a probability distribution

3. P (X = x) = F(x)
● X is a random variable and x is a value it can take.
● A function F(x) generates the probability of each value, and if ∑ F(x) = 1 then
it is a probability distribution.

Example Question

Question: A shipment of 20 similar laptops contains 3 defective laptops. A school
randomly purchases 2 laptops. Find the probability distribution for the number of
defective laptops. Also find the probability that exactly 2 laptops are defective,
that at least 1 laptop is defective, and that none is defective.

$P(X = x) = \frac{\binom{n}{x}\binom{N-n}{k-x}}{\binom{N}{k}}$

● n = defective items in the population = 3
● x = defective items among the selected = 0, 1, 2
● k = selected items = 2
● N = total population = 20
$P(X = 0) = \frac{\binom{3}{0}\binom{17}{2}}{\binom{20}{2}} = 0.716$

$P(X = 1) = \frac{\binom{3}{1}\binom{17}{1}}{\binom{20}{2}} = 0.268$

$P(X = 2) = \frac{\binom{3}{2}\binom{17}{0}}{\binom{20}{2}} = 0.016$
Probability Distribution

X       P(X)
0       0.716
1       0.268
2       0.016
Total   1.000
a) $P(X = 2) = 0.016$
b) $P(X \geq 1) = P(1) + P(2) = 0.268 + 0.016 = 0.284$
c) $P(X = 0) = 0.716$
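
The laptop distribution can be checked with `math.comb` from the Python standard library. This is a sketch under the notation of the question (N total, 3 defective, 2 selected); the function name `hypergeom_pmf` and its argument names are chosen for this example.

```python
from math import comb

def hypergeom_pmf(x, total, defective, selected):
    """P(X = x): x defective items among `selected` drawn without replacement."""
    return (comb(defective, x) * comb(total - defective, selected - x)
            / comb(total, selected))

# 20 laptops, 3 defective, 2 purchased
dist = {x: hypergeom_pmf(x, 20, 3, 2) for x in range(3)}
at_least_one = dist[1] + dist[2]

for x, px in dist.items():
    print(x, round(px, 3))
print(round(at_least_one, 3))
```

Because the three probabilities cover every possible value of X, they sum to 1.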

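Question:
a) A bag contains 6 red, 4 white and 5 blue balls. 3 balls are randomly selected.
Find the probability that all balls are of different colors.

$P(\text{all are different}) = \frac{\binom{6}{1}\binom{5}{1}\binom{4}{1}}{\binom{15}{3}} = \frac{120}{455} = 0.264$

b) What is the probability that 2 white and 1 blue are selected?

$P(\text{2 white, 1 blue}) = \frac{\binom{6}{0}\binom{5}{1}\binom{4}{2}}{\binom{15}{3}} = \frac{30}{455} = 0.066$

c) Find the probability that at least 2 red balls are selected.

$P(\text{at least 2 red}) = \frac{\binom{6}{2}\binom{4}{1} + \binom{6}{2}\binom{5}{1} + \binom{6}{3}}{\binom{15}{3}} = \frac{60 + 75 + 20}{455} = \frac{155}{455} = 0.341$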

Discrete-Probability Distribution

1. Binomial probability distribution
2. Poisson probability distribution
3. Hypergeometric probability distribution
Binomial Probability Distribution

Conditions:

1. The outcome of each trial is divided into 2 parts
● Event occurring (success)
● Event not occurring (failure)
2. The number of trials n is fixed
3. Trials are independent
4. For every trial the probability of success is fixed
5. Parameters are n & p

Formula:

$P(X = x) = \binom{n}{x} p^x q^{n-x}$

X = random variable (number of successes)

p = probability of success

q = probability of failure

n = number of fixed trials

Parameters:

1) n = number of fixed trials
2) p = probability of success

Question: The probability of generating a true code using one specific algorithm is
65%. For quality checking we randomly selected 3 algorithms. What’s the
probability that exactly 2 generate true code, and what’s the probability that at
least one algorithm generates true code?

Solution:

● p = 0.65
● n = 3
● x = 0, 1, 2, 3

$P(x = 0) = \binom{3}{0}(0.65)^0(0.35)^3$

$P(x = 1) = \binom{3}{1}(0.65)^1(0.35)^2$

$P(x = 2) = \binom{3}{2}(0.65)^2(0.35)^1$

$P(x = 3) = \binom{3}{3}(0.65)^3(0.35)^0$

1) $P(x = 2) = 0.4436$
2) $P(x \geq 1) = 1 - P(x = 0) = 0.957$
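
The binomial formula can be evaluated directly with the standard library. The helpers below are a minimal sketch with made-up names (`binom_pmf`, `binom_cdf`); they are plain functions written for this example, not a library API.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x): sum of the pmf from 0 up to x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

n, p = 3, 0.65
exactly_two = binom_pmf(2, n, p)
at_least_one = 1 - binom_pmf(0, n, p)

print(round(exactly_two, 4), round(at_least_one, 4))
```

"At least one" is easiest through the complement: subtract the probability of zero successes from 1.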

Binomial Probability
o Formula: $P(X = x) = \binom{n}{x} p^x q^{n-x}$

o Parameters: [ n & p ]
o Mean: $n \cdot p$
o Variance: $n \cdot p \cdot q$
o Standard deviation: $\sqrt{n \cdot p \cdot q}$
o $p + q = 1$

Binomial Probability Practice Questions:

Question: A company runs an online ads campaign and estimates that 20% of
individuals will click on the ad if 200 individuals see the ad

a) What is the probability exactly 30 will click on the ad?

𝑃𝑦𝑡ℎ𝑜𝑛 𝑐𝑜𝑑𝑒: 𝑏𝑖𝑛𝑜𝑚. 𝑝𝑚𝑓(30, 200, 0. 2)

𝑃(30) = ( ) (0. 2)
200
30
30 170
(0. 8) = 0. 0147

b) What is the probability that at least 2 will click on the ad?

Python code: 1 − binom.cdf(1, 200, 0.2)

$P(X \geq 2) = 1 - \left[\binom{200}{1}(0.2)^{1}(0.8)^{199} + \binom{200}{0}(0.2)^{0}(0.8)^{200}\right] \approx 1$

c) What is the probability that at most 198 will click on the ad?

Python code: binom.cdf(198, 200, 0.2)

$P(X \leq 198) = 1 - \left[\binom{200}{199}(0.2)^{199}(0.8)^{1} + \binom{200}{200}(0.2)^{200}(0.8)^{0}\right] \approx 1$

d) What is the average number of people who click on the ad?

$\text{Mean} = n \cdot p = 200 \times 0.2 = 40$

So, on average, 40 people click on the ad.
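The `binom.pmf` and `binom.cdf` calls quoted in this question presumably refer to `scipy.stats.binom`; that is an assumption, since the notes do not show an import. The sketch below reproduces the same quantities with the standard library only. The helper names `binom_pmf` and `binom_cdf` are hypothetical.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_cdf(x: int, n: int, p: float) -> float:
    """P(X <= x), summing the pmf from 0 up to x."""
    return sum(binom_pmf(i, n, p) for i in range(x + 1))


n, p = 200, 0.2

p_exactly_30 = binom_pmf(30, n, p)        # part a
p_at_least_2 = 1 - binom_cdf(1, n, p)     # part b: complement of "at most 1"
p_at_most_198 = binom_cdf(198, n, p)      # part c
mean_clicks = n * p                       # part d

print(p_exactly_30, p_at_least_2, p_at_most_198, mean_clicks)
```

Note that "at least 2" is the complement of "at most 1", so the cumulative probability is evaluated at 1, not at 2.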

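Question: The mean and variance of a binomial distribution are 10 and 3
respectively. Find $P(X = 2)$.

$\text{Mean} = n \cdot p$

$10 = n \cdot p \rightarrow$ Equation 1

$\text{Variance} = n \cdot p \cdot q$

$3 = n \cdot p \cdot q \rightarrow$ Equation 2

Put equation 1 into equation 2:

$3 = (10) \cdot q$
$q = \dfrac{3}{10}$

Now:
$p = 1 - q = 1 - \dfrac{3}{10} = \dfrac{7}{10}$

Also:
$\text{mean} = n \cdot p$
$n = \dfrac{10}{0.7} \approx 14.29$, which is rounded to $n = 14$ because n must be a whole number.

Using $n = 14$, $p = 0.7$ & $q = 0.3$

$P(X = 2) = \binom{14}{2}(0.7)^{2}(0.3)^{12} \approx 2.37 \times 10^{-5}$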
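The same steps (solve for q, then p, then n, then evaluate the formula) can be sketched in Python. Rounding n to the nearest whole number follows the notes; the variable names are illustrative.

```python
from math import comb

mean, variance = 10, 3

q = variance / mean      # from variance = (n * p) * q = mean * q
p = 1 - q
n_exact = mean / p       # about 14.29, not a whole number
n = round(n_exact)       # the notes round this to 14

# P(X = 2) with the rounded n
p_two = comb(n, 2) * p**2 * q ** (n - 2)

print(n_exact, n, p_two)
```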

Question: 20 patients are ill. A medicine is effective 70% of the time. What is the
probability that at least 5 people will be cured?

$P(X \geq 5) = 1 - [P(0) + P(1) + P(2) + P(3) + P(4)]$

$P(X \geq 5) = 1 - \left[\binom{20}{0}(0.7)^{0}(0.3)^{20} + \binom{20}{1}(0.7)^{1}(0.3)^{19} + \binom{20}{2}(0.7)^{2}(0.3)^{18} + \binom{20}{3}(0.7)^{3}(0.3)^{17} + \binom{20}{4}(0.7)^{4}(0.3)^{16}\right]$

Question: There is a 10% chance of an item being defective. 50 items are selected
at random. What is the probability that exactly 5 are defective?

$P(X = 5) = \binom{50}{5}(0.1)^{5}(0.9)^{45}$

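Poisson Probability
o Formula: $P(X = x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$
o Parameter: λ = mean (average) = variance
o Range: x = 0, 1, 2, ... ($0 \leq x < \infty$)
o Used in place of binomial if $n \geq 100$ and $P(\text{success}) < 0.1$
o In that case, $\lambda = n \cdot p$ = mean of the binomial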
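As a rough illustration of the approximation rule in this summary, the sketch below compares the binomial and Poisson formulas for hypothetical values n = 200 and p = 0.02. These values are chosen only because they satisfy n ≥ 100 and p < 0.1; they do not come from the notes.

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Exact binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int, lam: float) -> float:
    """Poisson probability P(X = x) = e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam**x / factorial(x)


n, p = 200, 0.02      # hypothetical values, not from the notes
lam = n * p           # lambda = n * p = mean of the binomial

exact = binomial_pmf(3, n, p)
approx = poisson_pmf(3, lam)

print(exact, approx, abs(exact - approx))
```

For these values the two results should agree to about two decimal places, which is why the Poisson formula is often used when n is large and p is small.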

Poisson Probability Practice Question

Question: A firm produces 30 defective products on average. Find the probability
that at least 5 are defective.

$\lambda = 30$, $x \geq 5$

$P(X \geq 5) = 1 - [P(0) + P(1) + P(2) + P(3) + P(4)]$

$P(X \geq 5) = 1 - \left[\dfrac{e^{-30} \cdot 30^{0}}{0!} + \dfrac{e^{-30} \cdot 30^{1}}{1!} + \dfrac{e^{-30} \cdot 30^{2}}{2!} + \dfrac{e^{-30} \cdot 30^{3}}{3!} + \dfrac{e^{-30} \cdot 30^{4}}{4!}\right]$

Question: A person builds a website and observes that customers visit it four
times per minute on average. What is the probability that there will be exactly 2
visits in the next 2 minutes? How many visits are expected in the next 5 minutes?
What is the probability that there will be at most 2 visits in one minute?
Solution:
$\lambda = 4$ (visits per minute)

1) For two minutes,

$\lambda = 2 \times 4 = 8$

$P(X = 2) = \dfrac{e^{-8} \cdot 8^{2}}{2!} = 0.0107$

2) Expected visits in 5 minutes $= 4 \times 5 = 20$

3) For one minute ($\lambda = 4$),

$P(X \leq 2) = P(0) + P(1) + P(2) = \dfrac{e^{-4} \cdot 4^{0}}{0!} + \dfrac{e^{-4} \cdot 4^{1}}{1!} + \dfrac{e^{-4} \cdot 4^{2}}{2!} = 0.2381$
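The Poisson practice questions can be evaluated with a small Python sketch. It assumes that "at most 2 visits" refers to a one-minute window (λ = 4), which is how the solution sets it up. The helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial


def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)


# Defective products: lambda = 30, at least 5 defective
p_defects_at_least_5 = 1 - sum(poisson_pmf(x, 30) for x in range(5))

# Website visits: 4 per minute, so lambda scales with the time window
rate_per_minute = 4
p_two_in_two_min = poisson_pmf(2, rate_per_minute * 2)
expected_in_five_min = rate_per_minute * 5
p_at_most_two_one_min = sum(poisson_pmf(x, rate_per_minute) for x in range(3))

print(p_defects_at_least_5, p_two_in_two_min,
      expected_in_five_min, p_at_most_two_one_min)
```

The key design point is that λ is always the mean for the window being asked about, so it is rescaled before the formula is applied.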

Hypergeometric Probability Distribution

o Formula: $P(X = x) = \dfrac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$

o Parameter:
▪ N = total number of objects in all categories
▪ n = number of selected objects
▪ k = number of successful objects in the total population (N)
o Range:
▪ Lower bound: $\max(0, n + k - N)$
▪ Upper bound: $\min(n, k)$
o Mean = $n\left(\dfrac{k}{N}\right)$
o Variance = $n\left(\dfrac{k}{N}\right)\left(\dfrac{N-k}{N}\right)\left(\dfrac{N-n}{N-1}\right)$

Example Questions:

Question: A committee of size 3 is selected from 4 teachers and 2 doctors. Find the
probability that at least 2 teachers are selected.

Solution:
N = total number of objects = 4 + 2 = 6
n = size of committee = 3
k = successful objects (teachers) = 4

Lower: $\max(0, 3 + 4 - 6) = \max(0, 1) = 1$
Upper: $\min(3, 4) = 3$
Range = (1, 3) inclusive

$X \sim H(N = 6, n = 3, k = 4)$

Since X cannot be less than 1, $P(X \geq 2) = 1 - P(X = 1)$:

$P(X \geq 2) = 1 - \left[\dfrac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}\right]$ with $x = 1$

$= 1 - \left[\dfrac{\binom{4}{1}\binom{6-4}{3-1}}{\binom{6}{3}}\right]$

$= 1 - \left[\dfrac{4}{20}\right]$

$= 0.8$

b) Find the probability distribution of the number of doctors selected.

$N = 6$, $n = 3$, $k = 2$

Lower: $\max(0, 3 + 2 - 6) = \max(0, -1) = 0$
Upper: $\min(3, 2) = 2$
Range = (0, 2) inclusive
So, x = 0, 1, 2

$P(X = 0) = \dfrac{\binom{2}{0}\binom{6-2}{3-0}}{\binom{6}{3}} = \dfrac{4}{20} = 0.2$

$P(X = 1) = \dfrac{\binom{2}{1}\binom{6-2}{3-1}}{\binom{6}{3}} = \dfrac{12}{20} = 0.6$

$P(X = 2) = \dfrac{\binom{2}{2}\binom{6-2}{3-2}}{\binom{6}{3}} = \dfrac{4}{20} = 0.2$

X        Probability
0        0.2
1        0.6
2        0.2
Total    1
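Both parts of the committee question can be reproduced with a short Python sketch. The function name `hypergeom_pmf` and the argument order (x, N, n, k) are illustrative choices that mirror the symbols in the notes; they are not taken from any library.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, n: int, k: int) -> float:
    """P(X = x): x successes in a sample of n, drawn from N objects with k successes."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


def support(N: int, n: int, k: int) -> range:
    """Possible values of X, from max(0, n + k - N) to min(n, k) inclusive."""
    return range(max(0, n + k - N), min(n, k) + 1)


N, n = 6, 3

# Part a: at least 2 teachers (k = 4)
p_at_least_two_teachers = sum(
    hypergeom_pmf(x, N, n, 4) for x in support(N, n, 4) if x >= 2
)

# Part b: distribution of the number of doctors (k = 2)
doctor_dist = {x: hypergeom_pmf(x, N, n, 2) for x in support(N, n, 2)}

print(p_at_least_two_teachers, doctor_dist)
```

Summing over the support directly gives the same answer as the complement used in the worked solution.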

Continuous Probability Distributions:

4. Uniform probability distribution
5. Normal probability distribution

● $f(x) \geq 0$, where f(x) is the function that generates probability
● Interval-based distribution:
○ $\int_{-\infty}^{\infty} f(x)\,dx = 1$
○ $P(a \leq x \leq b) = \int_{a}^{b} f(x)\,dx$
● Mean:
○ $\int_{-\infty}^{\infty} x\,f(x)\,dx$
● Variance:
○ $\int_{-\infty}^{\infty} x^{2} f(x)\,dx - \left[\int_{-\infty}^{\infty} x\,f(x)\,dx\right]^{2}$

Uniform Probability Distribution

o $f(x) = \dfrac{1}{b-a}$
o Formula: $P(a \leq x \leq b) = \int_{a}^{b} f(x)\,dx$
o Parameter:
▪ a = lower limit
▪ b = upper limit
o Mean = $\dfrac{a+b}{2}$
o Variance = $\dfrac{(b-a)^{2}}{12}$

Question: The total duration of basketball games in 2011 is uniformly distributed
between 447 and 521 hrs.

$X \sim U(447, 521)$

$f(x) = \dfrac{1}{521 - 447} = \dfrac{1}{74}$

a) What is the probability that the duration is between 480 and 500 hrs?

$a = 480$, $b = 500$

$P(480 \leq x \leq 500) = \int_{480}^{500} \dfrac{1}{74}\,dx = \dfrac{1}{74}[500 - 480] = 0.27$

b) What is the probability that the duration is greater than or equal to 500 hrs?

$a = 500$, $b = 521$

$P(500 \leq x \leq 521) = \int_{500}^{521} \dfrac{1}{74}\,dx = \dfrac{1}{74}[521 - 500] = 0.28$

Easier method:

$P(x \geq 500) = \dfrac{b - a}{B - A} = \dfrac{521 - 500}{521 - 447} = 0.28$

Here [a, b] is the interval of interest and [A, B] is the full range of the
distribution.
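The "easier method" is just a ratio of interval lengths, which can be sketched in Python as follows. The function name `uniform_prob` is hypothetical, and the clipping step is an added safeguard for intervals that extend beyond the range of the distribution.

```python
def uniform_prob(lo: float, hi: float, a: float, b: float) -> float:
    """P(lo <= X <= hi) for X ~ U(a, b): overlap length divided by (b - a)."""
    lo, hi = max(lo, a), min(hi, b)      # clip the interval to [a, b]
    return max(hi - lo, 0) / (b - a)


a, b = 447, 521

p_480_500 = uniform_prob(480, 500, a, b)   # part a
p_ge_500 = uniform_prob(500, b, a, b)      # part b

mean = (a + b) / 2
variance = (b - a) ** 2 / 12

print(round(p_480_500, 2), round(p_ge_500, 2), mean, variance)
```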

Normal Probability Distribution

The probability density function (P.D.F) is given by:

$f(x) = \dfrac{1}{\sqrt{2\pi\sigma^{2}}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$

Parameters:
● Mean: 𝜇
● Standard deviation: σ

Standard Normal Distribution:

The Standard Normal Distribution is a special type of normal distribution where:

● The mean (μ) = 0
● The standard deviation (σ) = 1

To convert a normal variable X into the standard normal form (denoted as Z), use
this formula:

$Z = \dfrac{X - \mu}{\sigma}$

● Z: The Z-score (how many standard deviations X is away from the mean).
● X: The actual value from the distribution.
● μ: The mean of the distribution.
● σ: The standard deviation of the distribution.

Calculating Probabilities:
● Use the standard normal distribution table (Z-table)
● Example: To find P(X ≤ x), standardize x to Z, then look up the cumulative
probability for Z in the Z-table.

Question:
The test scores of a physics class with 800 students are distributed normally with a
mean of 75 and a standard deviation of 7.
(a) What percentage of the class has a test score between 68 and 82?
(b) Approximately how many students have a test score between 61 and 89?
(c) What is the probability that a student chosen at random has a test score
between 54 and 75?
(d) Approximately how many students have a test score greater than or equal 96?
Solution:
● Mean = μ = 75
● Standard Deviation = σ = 7
● Total Students = 800

a) Percentage of Students with Scores Between 68 and 82

● For X = 68,
$Z = \dfrac{68 - 75}{7} = -1$
$P(Z \leq -1) = 0.15866$
● For X = 82,
$Z = \dfrac{82 - 75}{7} = 1$
$P(Z \leq 1) = 0.84134$

The percentage of students with scores between 68 and 82:

$P(68 \leq X \leq 82) = 0.84134 - 0.15866 = 0.68268$

So about 68.27% of the class has a score between 68 and 82.

b) Number of Students with Scores Between 61 and 89

● For X = 61,
$Z = \dfrac{61 - 75}{7} = -2$
$P(Z \leq -2) = 0.02275$
● For X = 89,
$Z = \dfrac{89 - 75}{7} = 2$
$P(Z \leq 2) = 0.97725$

$P(61 \leq X \leq 89) = 0.97725 - 0.02275 = 0.9545$

The number of students:

Students = 0.9545 × 800 = 763.6 ≈ 764

Approximately 764 students have scores between 61 and 89.

c) Probability of Scores Between 54 and 75

● For X = 54,
$Z = \dfrac{54 - 75}{7} = -3$
$P(Z \leq -3) = 0.00135$
● For X = 75,
$Z = \dfrac{75 - 75}{7} = 0$
$P(Z \leq 0) = 0.50000$

The probability that a student has a score between 54 and 75:

$P(54 \leq X \leq 75) = 0.50000 - 0.00135 = 0.49865$

d) Students with Scores Greater Than or Equal to 96

● For X = 96,
$Z = \dfrac{96 - 75}{7} = 3$

Using the z-table we get:

$P(Z \leq 3) = 0.9987$
The probability $P(Z \geq 3) = 1 - 0.9987 = 0.0013$

The number of students:

Students = 0.0013 × 800 = 1.04 ≈ 1

Approximately 1 student has a score greater than or equal to 96.
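The Z-table lookups in this question can be cross-checked with Python's standard library, where `statistics.NormalDist` provides the cumulative probability directly. This is a sketch for checking the table values, not a replacement for the standardization steps; small differences in the last digit come from table rounding.

```python
from statistics import NormalDist

scores = NormalDist(mu=75, sigma=7)
total_students = 800

# a) share of the class between 68 and 82
p_68_82 = scores.cdf(82) - scores.cdf(68)

# b) expected number of students between 61 and 89
n_61_89 = (scores.cdf(89) - scores.cdf(61)) * total_students

# c) probability of a score between 54 and 75
p_54_75 = scores.cdf(75) - scores.cdf(54)

# d) expected number of students with a score of 96 or more
n_ge_96 = (1 - scores.cdf(96)) * total_students

print(p_68_82, round(n_61_89), p_54_75, round(n_ge_96))
```

`NormalDist.cdf(x)` plays the role of standardizing x to Z and reading P(Z ≤ z) from the table.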

Correlation & Regression

Correlation:

Correlation is the measure of the strength of association between two
quantitative variables.

● If both variables move in the same direction (both increasing or both decreasing)
then there is positive correlation.
● If the variables move in different directions (if one increases, the other
decreases) then there is negative correlation.
● If neither variable is affected by the other then there is no correlation.

Notation: r

Where,

$-1 \leq r \leq 1$

If,

● r = +1 (perfect positive correlation)
● r = -1 (perfect negative correlation)
● +0.8 ≤ r < +1 (strong positive correlation)
● -1 < r ≤ -0.8 (strong negative correlation)
● +0.5 < r < +0.8 (moderate positive correlation)
● -0.8 < r < -0.5 (moderate negative correlation)
● 0 < r ≤ +0.5 (weak positive correlation)
● -0.5 ≤ r < 0 (weak negative correlation)
● r = 0 (no correlation)

Formula:

$r = \dfrac{n\Sigma xy - (\Sigma x)(\Sigma y)}{\sqrt{\left(n\Sigma x^{2} - (\Sigma x)^{2}\right)\left(n\Sigma y^{2} - (\Sigma y)^{2}\right)}}$

Question: Find the correlation between the marks of students and the number of
lectures they attended.

n = 5

X (Marks)    Y (Lectures)    XY          X²           Y²
3            1               3           9            1
5            2               10          25           4
7            2               14          49           4
2            1               2           4            1
5            3               15          25           9
ΣX = 22      ΣY = 9          ΣXY = 44    ΣX² = 112    ΣY² = 19

Thus,

$r = \dfrac{5(44) - (22)(9)}{\sqrt{\left(5(112) - (22)^{2}\right)\left(5(19) - (9)^{2}\right)}} = \dfrac{22}{\sqrt{1064}} = 0.67$

So, there is a moderate positive correlation between marks and lectures.
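The column sums and the correlation formula can be reproduced with a few lines of Python. The function name `pearson_r` is an illustrative choice; it implements the sums-based formula given in the notes.

```python
from math import sqrt


def pearson_r(x: list, y: list) -> float:
    """Correlation coefficient r from the sums-based formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
    return numerator / denominator


marks = [3, 5, 7, 2, 5]
lectures = [1, 2, 2, 1, 3]

r = pearson_r(marks, lectures)
print(round(r, 2))
```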

Regression:

Regression is a process by which we estimate the value of a dependent
variable (y) on the basis of one or more independent variables (x1, x2, x3, ..., xn).

For simple linear regression, there is only one independent variable x.

Formula:

$y = a + b(x)$

Where a & b are the coefficients of the regression line,

● a = value of y when x is 0 (intercept)
● b = slope of the line relating x and y

Thus,

$a = \bar{y} - b\bar{x}$

$b = \dfrac{n\Sigma xy - (\Sigma x)(\Sigma y)}{n\Sigma x^{2} - (\Sigma x)^{2}}$

Interpretation:
If x (independent) increases by one unit, then y (dependent) changes by b units:
it increases when b is positive and decreases when b is negative.

Question: Estimate the marks of a student who attends 10 lectures.

n = 5

X (Lectures)    Y (Marks)    XY           X²
5               11           55           25
7               10           70           49
3               15           45           9
6               4            24           36
4               12           48           16
ΣX = 25         ΣY = 52      ΣXY = 242    ΣX² = 135

$\bar{x} = \dfrac{25}{5} = 5$

$\bar{y} = \dfrac{52}{5} = 10.4$

As,

$y = a + b(x)$

Then,

$b = \dfrac{n\Sigma xy - (\Sigma x)(\Sigma y)}{n\Sigma x^{2} - (\Sigma x)^{2}} = \dfrac{5(242) - (25)(52)}{5(135) - (25)^{2}} = -1.8$

$a = \bar{y} - b\bar{x} = 10.4 + 1.8 \times 5 = 19.4$

Thus,

$y = 19.4 - 1.8(x)$

So, if the number of lectures increases by one, then marks will decrease by 1.8.

If a student attends 10 lectures, then x = 10,

$y = 19.4 - 1.8(10) = 1.4$

Thus, if a student attends 10 lectures, the model estimates that he gains 1.4 marks.
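The slope, intercept, and prediction can be reproduced with a short Python sketch. The function name `fit_line` is hypothetical; it applies the formulas for b and a given earlier in this section.

```python
def fit_line(x: list, y: list) -> tuple:
    """Return (a, b) for the least-squares line y = a + b * x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
    a = sum_y / n - b * (sum_x / n)      # a = mean(y) - b * mean(x)
    return a, b


lectures = [5, 7, 3, 6, 4]
marks = [11, 10, 15, 4, 12]

a, b = fit_line(lectures, marks)
predicted_marks = a + b * 10             # estimate for x = 10 lectures

print(round(a, 2), round(b, 2), round(predicted_marks, 2))
```

Note that x = 10 lies outside the observed range of 3 to 7 lectures, so the estimate is an extrapolation of the fitted line.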

Hypothesis Testing

Null-Hypothesis:

● Represents the default status or assumption.
● It states that there is no effect, difference, or relationship.
● Example: H0: μ = 50 (the population mean is 50).
● It is the default or currently accepted value.
● The equality sign must lie in the null hypothesis.

Alternative-Hypothesis:
● Represents the claim we want to test.
● It states that there is an effect, difference, or relationship.
● Example: Ha: μ ≠ 50 (the population mean is not 50).
● The strict inequality must lie in the alternative hypothesis.

Level of Significance:
● The probability threshold (α) used to decide whether to reject the null hypothesis.
● Common values are 0.1 (10%), 0.05 (5%) or 0.01 (1%).

Test-Statistic:
● A standardized value used to decide whether to reject H0.
● We commonly use the t-statistic (t-test)

Formula: $t = \dfrac{\bar{x} - \mu}{S / \sqrt{n}}$

Where,
$\bar{x}$ = mean of sample
$\mu$ = mean of population
$S$ = standard deviation of sample
$n$ = size of sample

Rejection/Critical Region:
Case A (two-tailed):
If H0 is $\mu = \mu_{0}$
then, reject if
● t-value > +table-value
● t-value < -table-value
Case B (right-tailed):
If H0 is $\mu \leq \mu_{0}$
then, reject if
● t-value > +table-value
Case C (left-tailed):
If H0 is $\mu \geq \mu_{0}$
then, reject if
● t-value < -table-value
Conclusion:
● Draw the conclusion in the context of the problem
Question: From the data available, it is observed that 400 out of 850 customers
purchased groceries online. Can we say that most of the customers are moving
towards online shopping even for groceries?
Solution:
$\bar{x} = \dfrac{400}{850} = 0.47$
$\mu = 0.5$ (if any customer category crosses 50%, it is in the majority)

Step1: (Defining null and alternative hypothesis)

H0 = null hypothesis: $\mu \geq 0.5$ (the claim that most customers are moving toward
online shopping means online shopping has at least a 50% customer rate).
H1 = alternative hypothesis: $\mu < 0.5$ (left-tailed)

Step2: (Setting or defining significance level)

As the significance level is not given, we choose a common one, 5%:
$\alpha = 0.05$

Step3: (Test statistic or Z-score)

As the sample size is greater than 30, we calculate a z-score instead of a t-score.
Here n = 850 and $\bar{x} = 400/850 \approx 0.4706$:

$z = \dfrac{\bar{x} - \mu}{\sqrt{\dfrac{\mu(1 - \mu)}{n}}} = \dfrac{0.4706 - 0.5}{\sqrt{\dfrac{(0.5)(1 - 0.5)}{850}}} = \dfrac{-0.0294}{0.01715} \approx -1.71$

Step4: (Critical region)

As this is left-tailed because the alternative hypothesis has a less-than sign, the
table value for $\alpha = 0.05$ is 1.645:
$z < -\text{table-value} \Rightarrow -1.71 < -1.645$ (True)

Step5: (Conclusion)
Hence, we reject the null hypothesis. With the given data, there is significant
evidence at the 5% level that most of the customers are not moving towards online
shopping for groceries.

Confidence Interval:

$\bar{x} - z_{\alpha/2}\sqrt{\dfrac{\mu(1 - \mu)}{n}} < \mu < \bar{x} + z_{\alpha/2}\sqrt{\dfrac{\mu(1 - \mu)}{n}}$

With $z_{\alpha/2} = 1.96$ for a 95% interval:

$0.4706 - (1.96)(0.01715) < \mu < 0.4706 + (1.96)(0.01715)$

$0.4370 < \mu < 0.5042$

So, we are 95% confident that the population proportion lies between 0.4370 and
0.5042.
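The steps of this proportion test can be sketched in Python with the standard library. The sketch takes n to be the full sample of 850 customers, uses the hypothesized proportion 0.5 in the standard error (as the formula in the notes does), and obtains critical values from `statistics.NormalDist` instead of a printed table. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

successes, n = 400, 850
p0 = 0.5          # hypothesized proportion under H0
alpha = 0.05

p_hat = successes / n
std_error = sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / std_error

# Left-tailed test: reject H0 when z falls below the lower critical value
z_critical = NormalDist().inv_cdf(alpha)
reject_h0 = z < z_critical

# Two-sided 95% interval around the sample proportion
z_half = NormalDist().inv_cdf(1 - alpha / 2)
ci = (p_hat - z_half * std_error, p_hat + z_half * std_error)

print(round(z, 3), round(z_critical, 3), reject_h0, ci)
```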

Question: A teacher claims that the average score of students in the class is greater
than 6. To test this claim, we took a sample of 5 students with a mean of 5 and a
sample standard deviation of 2. The level of significance is 5%, or $\alpha = 0.05$.
Solution:
$\alpha = 0.05$, $\bar{x} = 5$, $S = 2$, $n = 5$, $\mu = 6$

Step1: (Defining null and alternative hypothesis)

H0 = null hypothesis: $\mu \leq 6$
H1 = alternative hypothesis: $\mu > 6$ (right-tailed)

Step2: (Setting or defining significance level)

$\alpha = 0.05$

Step3: (Test statistic or T-score)

$t = \dfrac{\bar{x} - \mu}{S / \sqrt{n}} = \dfrac{5 - 6}{2 / \sqrt{5}} = -1.12$

Step4: (Critical region)

As this is right-tailed because the alternative hypothesis has a greater-than sign,
Degrees of freedom = (sample size - 1) = 5 - 1 = 4
$t > +\text{table-value} \Rightarrow -1.12 > +2.132$ (False)

Step5: (Conclusion)
From step 4, we are unable to reject the null hypothesis, so this test does not
provide enough evidence to support the claim that the average score of the class
is greater than 6.

Confidence Interval:

$\bar{x} - t_{\alpha/2}\left(\dfrac{S}{\sqrt{n}}\right) < \mu < \bar{x} + t_{\alpha/2}\left(\dfrac{S}{\sqrt{n}}\right)$

$5 - (2.776)(0.89) < \mu < 5 + (2.776)(0.89)$

$2.52936 < \mu < 7.47064$

So, we are 95% confident that the population mean lies between 2.52936 and 7.47064.
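A minimal Python sketch of this t-test follows. The standard library has no t-distribution, so the critical values 2.132 (one tail, α = 0.05) and 2.776 (two tails, α = 0.05) for 4 degrees of freedom are hard-coded from a standard t-table; treat them as assumptions of the sketch. Because the sketch does not round S/√n to 0.89, its interval differs slightly from the hand calculation.

```python
from math import sqrt

x_bar, mu0, s, n = 5, 6, 2, 5

std_error = s / sqrt(n)
t = (x_bar - mu0) / std_error

# Table values for df = n - 1 = 4 (hard-coded assumptions, see lead-in)
T_CRIT_ONE_TAIL = 2.132
T_CRIT_TWO_TAIL = 2.776

# Right-tailed test: reject H0 only when t exceeds the critical value
reject_h0 = t > T_CRIT_ONE_TAIL

# 95% confidence interval for the population mean
margin = T_CRIT_TWO_TAIL * std_error
ci = (x_bar - margin, x_bar + margin)

print(round(t, 2), reject_h0, ci)
```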
