Statistics Midterm Review

Uploaded by

Mai Nguyen
Copyright: © All Rights Reserved

Chapter 3: Describing data visually

44 sorted price/earnings (P/E) ratios


7 9 10 10 10 10 11 11 11 12 13
13 13 14 14 15 16 16 16 16 17 17
18 18 19 19 20 20 21 21 22 23 24
26 26 27 28 31 37 37 38 42 50 59

1. Stem-and-leaf plot:
Frequency | Stem | Leaf
        2 |  0   | 7 9
       24 |  1   | 0 0 0 0 1 1 1 2 3 3 3 4 4 5 6 6 6 6 7 7 8 8 9 9
       11 |  2   | 0 0 1 1 2 3 4 6 6 7 8
        4 |  3   | 1 7 7 8
        1 |  4   | 2
        2 |  5   | 0 9
Total: 44

Use:
• To reveal essential data features (central tendency, dispersion)
For the example above:
o Central tendency: 24 of the 44 P/E ratios were in the 10-19 stem
o Dispersion: the range is from 7 to 59
• To retrieve the raw data by concatenating a stem digit with each of its leaf digits
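The stem-and-leaf construction above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes; the function name `stem_and_leaf` is hypothetical, and it assumes integer data with the tens digit as the stem.

```python
from collections import defaultdict

# The 44 sorted P/E ratios from the example above.
pe_ratios = [
    7, 9, 10, 10, 10, 10, 11, 11, 11, 12, 13,
    13, 13, 14, 14, 15, 16, 16, 16, 16, 17, 17,
    18, 18, 19, 19, 20, 20, 21, 21, 22, 23, 24,
    26, 26, 27, 28, 31, 37, 37, 38, 42, 50, 59,
]

def stem_and_leaf(values):
    """Group integer values by tens digit (stem); the units digits are the leaves."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(pe_ratios)
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{len(plot[stem]):>9} | {stem} | {leaves}")

# Retrieve the raw data by concatenating each stem with each of its leaves.
recovered = [10 * stem + leaf for stem in sorted(plot) for leaf in plot[stem]]
```

The `recovered` list shows the second use listed above: the raw data can be rebuilt from the plot.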
2. Dot plots: Reveal dispersion, central tendency, and the shape of the distribution

Interpretation:
• The range is from 7 to 59.
• All but a few data values lie between 10 and 25.
• A typical “middle” data value would be around 17 or 18.
• The data are not symmetric due to a few large P/E ratios.
3. Frequency distribution:
Step 1: Choose the number of bins based on sample size (n)
Step 2: Construct the frequency distribution table
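As a rough sketch of the two steps, the Python below suggests a bin count and then tallies the P/E data. Sturges' guideline, k = 1 + 3.3 log10(n), is an assumption here (the notes do not say which rule the course uses), and the helper names and the choice of bins of width 10 starting at 0 are hypothetical.

```python
import math

pe_ratios = [
    7, 9, 10, 10, 10, 10, 11, 11, 11, 12, 13,
    13, 13, 14, 14, 15, 16, 16, 16, 16, 17, 17,
    18, 18, 19, 19, 20, 20, 21, 21, 22, 23, 24,
    26, 26, 27, 28, 31, 37, 37, 38, 42, 50, 59,
]

def sturges_bins(n):
    """Suggested number of bins by Sturges' guideline (assumed, not from the notes)."""
    return 1 + 3.3 * math.log10(n)

def frequency_table(values, start, width, bins):
    """Count values in `bins` equal-width bins [lower, upper) beginning at `start`."""
    counts = [0] * bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < bins:
            counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

print(f"Suggested number of bins: about {sturges_bins(len(pe_ratios)):.1f}")
table = frequency_table(pe_ratios, start=0, width=10, bins=6)
for lower, upper, count in table:
    share = 100 * count / len(pe_ratios)
    print(f"{lower:>3} to <{upper:<3} {count:>3} {share:5.1f}%")
```

With these bin limits the counts match the stem-and-leaf frequencies, which is a useful cross-check.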

4. Histogram: a bar chart of a frequency distribution


Skewness:

5. Scatter plot:
The relationship between X and Y:

Example: Use Stata to make a scatter plot from the data file GPA1.dta, with hsGPA on the X-axis
and colGPA on the Y-axis. Describe the relationship (if any) between X and Y. Weak? Strong?
Negative? Positive? Linear? Nonlinear?
Answer: Weak positive relationship between high school GPA and MSU GPA.

Chapter 4: Descriptive statistics


3 key characteristics of numerical data: center, variability, and shape
1. Measures of center:
a. Mean:
Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$

b. Median (M): the 50th percentile, or midpoint, of the n sorted sample observations
• If n is odd, the median is the middle observation.
Example:
1 2 4 6 7 9 11 15 20
Median = 7 (the 5th of the 9 sorted values)
• If n is even, the median is the average of the middle two observations.
Example:
1 2 3 5 7 8 11 15 17 19
Median = $\frac{7 + 8}{2} = 7.5$
c. Mode: The most frequently occurring data value
• May have multiple modes or no mode
• Most useful for discrete data
Example:
Anna’s scores: 50 50 50 65 65 70 => Mode = 50 (occurs 3 times)
Ben’s scores: 40 50 60 70 75 90 => Mode = none
Charlie’s scores: 60 60 80 90 90 => Mode = 60, 90
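The median and mode rules above can be checked with a short Python sketch. The function names are hypothetical, and the "no mode when every value occurs once" convention is taken from Ben's example.

```python
from collections import Counter

def median(values):
    """Middle observation (odd n) or average of the two middle observations (even n)."""
    data = sorted(values)
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        return data[mid]
    return (data[mid - 1] + data[mid]) / 2

def modes(values):
    """All most-frequent values; an empty list when every value occurs only once."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(median([1, 2, 4, 6, 7, 9, 11, 15, 20]))
print(median([1, 2, 3, 5, 7, 8, 11, 15, 17, 19]))
print(modes([50, 50, 50, 65, 65, 70]))   # Anna
print(modes([40, 50, 60, 70, 75, 90]))   # Ben
print(modes([60, 60, 80, 90, 90]))       # Charlie
```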
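d. Shape: The shape of the distribution is judged by comparing the mean and median: mean ≈ median suggests a roughly symmetric distribution, mean > median suggests right (positive) skew, and mean < median suggests left (negative) skew.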

e. Geometric mean (G): used when all the data values are positive (> 0).
$G = \sqrt[n]{x_1 x_2 \cdots x_n}$
Example: The geometric mean for X = 2, 3, 7, 9, 10, 12 is:
$G = \sqrt[6]{(2)(3)(7)(9)(10)(12)} = \sqrt[6]{45{,}360} = 5.972$
f. Growth rates: The average growth rate for a time series of n values
$GR = \sqrt[n-1]{\frac{x_n}{x_1}} - 1$
Example:

The average growth rate from 2006 to 2010 is:
$GR = \sqrt[4]{\frac{3779}{2361}} - 1 = 0.125$, or 12.5% per year
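Both formulas are easy to verify numerically. The sketch below is illustrative only; the function names are hypothetical, and n = 5 is an assumption that the 2006 to 2010 series has one value per year.

```python
import math

def geometric_mean(values):
    """n-th root of the product of n positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.prod(values) ** (1 / len(values))

def average_growth_rate(first, last, n):
    """Average growth per period for a series of n values (n - 1 growth steps)."""
    return (last / first) ** (1 / (n - 1)) - 1

g = geometric_mean([2, 3, 7, 9, 10, 12])
gr = average_growth_rate(first=2361, last=3779, n=5)  # 2006..2010, assumed yearly
print(f"G  = {g:.3f}")
print(f"GR = {gr:.3f}")
```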
g. Midrange: A dubious measure of center because it is sensitive to extreme data values
$Midrange = \frac{x_{min} + x_{max}}{2}$
h. Trimmed mean: The highest and lowest k percent of the observations in the sorted
data array are removed to mitigate the effects of extreme values on either end.
Example: 20, 30, 50, 60, 78, 80, 90. Find the 15% trimmed mean.
The number of observations: n = 7
k = 0.15 => we need to trim 0.15 × 7 = 1.05 observations from each end => round down to 1 observation
So, we remove the 1 smallest and the 1 largest observation.
$15\%\ \text{trimmed mean} = \frac{30 + 50 + 60 + 78 + 80}{5} = 59.6$
Note: The number to trim is rounded to the nearest integer. If k × n = 1.65 => round up to 2
observations => remove the 2 smallest and 2 largest observations before averaging the remaining values.
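A minimal Python sketch of the trimming procedure follows. It assumes the rounding rule implied by the two cases above (round k × n to the nearest integer); the function name is hypothetical.

```python
import math

def trimmed_mean(values, k):
    """Mean after dropping round(k * n) observations from each end of the sorted data."""
    data = sorted(values)
    n = len(data)
    trim = math.floor(k * n + 0.5)  # round to nearest: 1.05 -> 1, 1.65 -> 2
    if 2 * trim >= n:
        raise ValueError("trimming would remove every observation")
    kept = data[trim:n - trim]
    return sum(kept) / len(kept)

print(trimmed_mean([20, 30, 50, 60, 78, 80, 90], 0.15))
```

`math.floor(x + 0.5)` is used instead of the built-in `round` because `round` sends exact halves to the nearest even integer.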
2. Measures of variability:
Variation: The “spread” of data points about the center of the distribution in a sample.

a. Variance: Variance measures how far a data set is spread out.


Population variance:
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Sample variance:
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
Note: A sample contains n observations, each independent of the others. But once
you have calculated the sample mean, only n − 1 of the deviations from it are free to vary (since
the sample values must add to a fixed total that gives the mean). We divide the sum
of squared deviations by n − 1 instead of n because one degree of freedom has been “lost.”
b. Standard deviation: indicates how individual values in a data set vary from the mean
For a population:
$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
For a sample:
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
c. Coefficient of variation (CV): the standard deviation expressed as a percent of the
mean; it is useful for comparing variables measured in different units
$CV = 100 \times \frac{s}{\bar{x}}$
d. Mean absolute deviation (MAD): indicates the average distance from the center
$MAD = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
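The four measures of variability can be computed directly from their definitions. The sketch below is illustrative; the function names are hypothetical, and the sample data are the seven values from the trimmed-mean example, reused only for demonstration.

```python
import math

def mean(values):
    return sum(values) / len(values)

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    xbar = mean(values)
    return sum((x - xbar) ** 2 for x in values) / (len(values) - 1)

def coefficient_of_variation(values):
    """Sample standard deviation as a percent of the mean."""
    return 100 * math.sqrt(sample_variance(values)) / mean(values)

def mean_absolute_deviation(values):
    """Average absolute distance from the mean."""
    xbar = mean(values)
    return sum(abs(x - xbar) for x in values) / len(values)

scores = [20, 30, 50, 60, 78, 80, 90]
print(f"s^2 = {sample_variance(scores):.2f}")
print(f"s   = {math.sqrt(sample_variance(scores)):.2f}")
print(f"CV  = {coefficient_of_variation(scores):.1f}%")
print(f"MAD = {mean_absolute_deviation(scores):.2f}")
```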

3. Standardized data:
a. Calculate the standardized value (z-score):
For a population:
$z = \frac{x - \mu}{\sigma}$
For a sample:
$z = \frac{x - \bar{x}}{s}$ (n ≥ 30)
b. Chebyshev’s Theorem and the Empirical Rule:

Chebyshev’s Theorem
For any data set, no matter how it is distributed, the percentage of observations that lie within k standard deviations of the mean (i.e., within $\mu \pm k\sigma$) must be at least $100\left[1 - \frac{1}{k^2}\right]$.
• k = 2: at least 75.0% will lie within $\mu \pm 2\sigma$
• k = 3: at least 88.9% will lie within $\mu \pm 3\sigma$
• k = 4: at least 93.8% will lie within $\mu \pm 4\sigma$
Example 1. For an exam with $\mu = 72$ and $\sigma = 8$, what is the interval that at least 75% of the scores will be within?
Answer: 72 ± 2(8) or [56, 88]
Example 2. Suppose 400 students take an exam. At least how many students will have their scores lie within [120, 200] if $\mu = 160$ and $\sigma = 10$?
Answer:
200 = 160 + k(10) → k = 4
→ At least 93.8% of 400 students, specifically 375 students, will have their scores lie within [120, 200].

Empirical Rule
For data from a normal distribution, we expect the interval $\mu \pm k\sigma$ to contain a known percentage of the data:
• k = 1: 68.26% will lie within $\mu \pm \sigma$
• k = 2: 95.44% will lie within $\mu \pm 2\sigma$
• k = 3: 99.73% will lie within $\mu \pm 3\sigma$
Note: Data values outside $\mu \pm 3\sigma$ are called outliers.
• Unusual: $2 < |z| \le 3$
• Outlier: $|z| > 3$
Example. For a price survey with $\mu = 100$ and $\sigma = 25$, the price of product A is 150. Find its standardized value. Is this an outlier?
Answer:
$z = \frac{x - \mu}{\sigma} = \frac{150 - 100}{25} = 2$
⇒ The price of product A is not an outlier.
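The examples for Chebyshev’s Theorem and the z-score can be reproduced with the sketch below. The function names and the label "typical" for |z| ≤ 2 are hypothetical; only the "unusual" and "outlier" cut-offs come from the notes.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def chebyshev_min_percent(k):
    """Minimum percent of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 100 * (1 - 1 / k ** 2)

def classify_z(z):
    """Apply the cut-offs from the notes; 'typical' is a label assumed here."""
    if abs(z) > 3:
        return "outlier"
    if abs(z) > 2:
        return "unusual"
    return "typical"

# Exam example: [120, 200] with mean 160 and standard deviation 10.
k = z_score(200, 160, 10)
students = 400 * chebyshev_min_percent(k) / 100
print(f"k = {k}, at least {students:.0f} of 400 students")

# Price survey example: price 150 with mean 100 and standard deviation 25.
z = z_score(150, 100, 25)
print(f"z = {z}: {classify_z(z)}")
```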
4. Percentiles, quartiles, and box plots:
a. Percentiles: The value below which a percentage of data falls

Exercises:
Given the data set:
30 50 45 50 40 80 70
Find the percentage of data that falls below the value of 45 (the percentile of
value 45). Find the value of the 66th percentile.
1. Given value => calculate percentile
Step 1: Sort the data given
Position Value
1 30
2 40
3 45
4 50
5 50
6 70
7 80
Step 2: Find the percentage of data that falls below the value of 45
Two of the seven values (30 and 40) fall below 45:
$\frac{2}{7} \times 100 = 28.57\%$
Thus, the value of 45 is approximately the 29th percentile in the data set.
Source: https://www.m7athsisfun.com/data/percentiles.html
2. Given percentile => calculate value: Excel’s quartile interpolation method (*)
Step 1: Find the position
$\text{position} = \frac{(n + 1)p}{100}$
with n: the number of values in the sample
p: the percentile given
The 66th percentile position: $\frac{(7 + 1)66}{100} = 5.28$ => lies between positions 5 and 6
Step 2: Find the value
The value of the 66th percentile: 50 + 0.28(70 − 50) = 55.6
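Both directions of the exercise can be sketched in Python. This follows the (n + 1)p/100 position rule used in the notes; spreadsheet and library functions may default to other interpolation conventions, so results can differ slightly. The function names are hypothetical.

```python
def percent_below(values, x):
    """Percentage of observations strictly below x."""
    return 100 * sum(1 for v in values if v < x) / len(values)

def percentile_value(values, p):
    """Value at the p-th percentile using position (n + 1) * p / 100 with interpolation."""
    data = sorted(values)
    n = len(data)
    position = (n + 1) * p / 100
    lower = int(position)          # whole part: 1-based position of the lower neighbor
    fraction = position - lower    # fractional part: how far to move toward the next value
    if lower < 1:
        return data[0]
    if lower >= n:
        return data[-1]
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])

data = [30, 50, 45, 50, 40, 80, 70]
print(f"{percent_below(data, 45):.2f}% of the data fall below 45")
print(f"66th percentile = {percentile_value(data, 66):.1f}")
```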
b. Quartiles: Scale points that divide the sorted data into 4 groups of approximately
equal size

Note:
• 25th percentile = the first quartile = lower quartile = Q1
• 50th percentile = the second quartile = median = Q2
• 75th percentile = the third quartile = upper quartile = Q3
Interquartile range: measures the degree of spread in the data (the middle 50%)
𝑰𝑸𝑹 = 𝑸𝟑 − 𝑸𝟏
How to find Q1, Q2, Q3? Use the method of medians or Excel’s quartile
interpolation method.
1. Method of medians:

2. Excel’s quartile interpolation method: (preferred)


Q1 = 25th percentile => position: $p = \frac{(12 + 1)25}{100} = 3.25$ => Q1 = 25 + 0.25(29 − 25) = 26
Q2 = 50th percentile => position: $p = \frac{(12 + 1)50}{100} = 6.5$ => Q2 = 35 + 0.5(36 − 35) = 35.5
Q3 = 75th percentile => position: $p = \frac{(12 + 1)75}{100} = 9.75$ => Q3 = 39 + 0.75(42 − 39) = 41.25
Note: Small differences between the two techniques do not lead to different conclusions in
business applications.
c. Box plots:
Five-number summary: $x_{min}, Q_1, Q_2, Q_3, x_{max}$
For the example in figure 4.25:
$x_{min} = 7$, $Q_1 = 26$, $Q_2 = 35.5$, $Q_3 = 41.25$, $x_{max} = 49$
=> $IQR = Q_3 - Q_1 = 15.25$

            | Inner fences                                        | Outer fences
Lower fence | $Q_1 - 1.5\,IQR$ = 26 − 1.5(15.25) = 3.125          | $Q_1 - 3\,IQR$ = 26 − 3(15.25) = −19.75
Upper fence | $Q_3 + 1.5\,IQR$ = 41.25 + 1.5(15.25) = 64.125      | $Q_3 + 3\,IQR$ = 41.25 + 3(15.25) = 87

So, for the example in figure 4.25, every value (from 7 to 49) lies inside the inner fences [3.125, 64.125], so there are no outliers or extreme outliers.
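The fence calculations can be reproduced with a short sketch. The function names and the returned labels are hypothetical; the 1.5 × IQR and 3 × IQR multipliers are those in the table above.

```python
def fences(q1, q3):
    """Inner (1.5 * IQR) and outer (3 * IQR) fences as (lower, upper) pairs."""
    iqr = q3 - q1
    return {
        "inner": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),
        "outer": (q1 - 3 * iqr, q3 + 3 * iqr),
    }

def classify_by_fences(x, q1, q3):
    """Label a value as an extreme outlier, an outlier, or neither."""
    f = fences(q1, q3)
    if x < f["outer"][0] or x > f["outer"][1]:
        return "extreme outlier"
    if x < f["inner"][0] or x > f["inner"][1]:
        return "outlier"
    return "not an outlier"

result = fences(26, 41.25)
print(result)
print(classify_by_fences(7, 26, 41.25), "/", classify_by_fences(49, 26, 41.25))
```

Checking only the minimum and maximum is enough here: if both are inside the inner fences, every other value is too.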

d. Midhinge: An additional measure of center that is not influenced by outliers.
$Midhinge = \frac{Q_1 + Q_3}{2}$
Another way to determine skewness:

5. Correlation and covariance:


a. Sample correlation coefficient (r): describes the degree of linearity between paired
observations on two quantitative variables X and Y.

−1 ≤ 𝑟 ≤ +1
• 𝑟 ≈ 0: There is little or no linear relationship between X and Y.
• 𝑟 near +1: Strong positive relationship between X and Y.
• r near -1: Strong negative relationship between X and Y.
b. Covariance: measures the degree to which the values of X and Y change together
For example, prices of two stocks X and Y:
• Move in the same direction: $\sigma_{xy} > 0$
• Move in opposite directions: $\sigma_{xy} < 0$
• The prices of X and Y are unrelated: $\sigma_{xy} = 0$
Example: Your laptop gets warm (even hot) when you place it on your lap because it
is dissipating heat from its microprocessor and related components. Calculate the
correlation coefficient.

Answer:
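X: Microprocessor Speed
Y: Power Dissipation
$\bar{x} = \frac{\sum_{i=1}^{14} x_i}{14} = 1703.79$
$\bar{y} = \frac{\sum_{i=1}^{14} y_i}{14} = 70$
$r = \frac{\sum_{i=1}^{14} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{14} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{14} (y_i - \bar{y})^2}} = \frac{802505}{\sqrt{25194288}\,\sqrt{27624}} = 0.962$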
To find r with a calculator (for Casio fx-580VNX):
Menu → 6: Statistics → 2: y=a+bx → Enter all the data given for X and Y → AC →
OPTN → 3: Regression calculation → r = 0.962
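The correlation formula can also be sketched in Python. The raw 14 data pairs are not reproduced in these notes, so the example below only recomputes r from the three sums reported in the answer; `pearson_r` shows the full calculation for any paired data. Both function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient from paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))

def r_from_sums(sxy, sxx, syy):
    """Correlation from the cross-product sum and the two sums of squares."""
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))

r = r_from_sums(802505, 25194288, 27624)
print(f"r = {r:.3f}")
```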
6. Skewness and Kurtosis:
a. Skewness:
b. Kurtosis: The relative length of the tails and the degree of concentration in the
center

Chapter 15: Probability


1. Basic definition: Event, sample space, probabilities
$P(A) = \frac{n(A)}{n(S)}$ with $0 \le P(A) \le 1$
where n(A) = the number of elements in the set of the event A
n(S) = the number of elements in the sample space S
2. Rules of probabilities:
a. The rule of complement:
Let 𝐴̅ (A bar) be the complement of A
𝑃(𝐴) = 1 − 𝑃(𝐴̅)
b. The rule of intersection:
Intersection (both A and B): and, all, both, etc.
A and B or A∩B
P(A∩B) is called joint probability.
c. The rule of union:
Union (either A or B or both): either…or…, at least, etc.
A or B or A∪B

d. General law of addition:
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
For several independent events $A_1, A_2, A_3, \ldots$:
$P(A_1 \cup A_2 \cup \ldots \cup A_n) = 1 - P(\bar{A}_1)P(\bar{A}_2) \cdots P(\bar{A}_n)$
e. Mutually exclusive events:
Events A and B are mutually exclusive if their intersection is the empty set => A and
B cannot occur at the same time.
If 𝐴 ∩ 𝐵 = ∅, then 𝑃(𝐴 ∩ 𝐵) = 0
𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵)
Example: If we look at a person’s age, then P(under 21) = 0.28 and P(over 65) = 0.12,
so P(under 21 or over 65) = 0.28 + 0.12 = 0.4 since these events do not overlap.
To check whether two events are mutually exclusive:
• If 𝑃(𝐴 ∩ 𝐵) = 0, then A and B are mutually exclusive.
• If 𝑃(𝐴 ∩ 𝐵) ≠ 0, then A and B are not mutually exclusive.
f. Collectively exhaustive events:
Events are collectively exhaustive if their union is the entire sample space S.
g. Conditional probability:
The probability of A given B: $P(A|B) = \frac{P(A \cap B)}{P(B)}$ for $P(B) > 0$
⇒ General law of multiplication: $P(A \cap B) = P(A|B)P(B)$

h. Independent events:
Event A is independent of event B ⇔ $P(A|B) = P(A)$
If events A and B are independent, then $P(A \cap B) = P(A)P(B)$
If events are independent, $P(A_1 \cap A_2 \cap \ldots \cap A_n) = P(A_1)P(A_2) \cdots P(A_n)$

3. Contingency tables:
a. Marginal probabilities:

The marginal probability of a medium salary gain:
$P(S_2) = \frac{33}{67} = 0.4925$
⇒ Salary gains at about 49 percent of the top-tier schools were between $50,000 and
$100,000.
The marginal probability of low tuition:
$P(T_1) = \frac{16}{67} = 0.2388$
⇒ There is a 24 percent chance that a top-tier school’s MBA tuition is under $40,000.
b. Joint probabilities:

The joint probability that the school has low tuition (T1) and has large salary gains (S3):
$P(T_1 \cap S_3) = \frac{1}{67} = 0.0149$
⇒ There is less than a 2 percent chance that a top-tier school has both low tuition and
high salary gains.
c. Conditional probabilities:
The conditional probability that salary gains are small (S1) given that the MBA
tuition is large (T3):
$P(S_1 | T_3) = \frac{5}{32} = 0.1563$
⇒ There is about a 16 percent chance that a top-tier school’s salary gains will be small
despite its high tuition.

4. Bayes’ theorem:
$P(B|A) = \frac{P(A|B)P(B)}{P(A)} = \frac{P(A|B)P(B)}{P(A|B)P(B) + P(A|B')P(B')}$
where B' is the complement of B.

Example: Suppose that 10 percent of the women who purchase over-the-counter
pregnancy testing kits are actually pregnant. For a particular brand of kit, if a woman is
pregnant, the test will yield a positive result 96% of the time and a negative result 4% of
the time (called a “false negative”). If she is not pregnant, the test will yield a positive
result 5% of the time (called a “false positive”) and a negative result 95% of the time.
Suppose the test comes up positive. What is the probability that she is really pregnant?
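One way to work this example is to apply Bayes’ theorem with B = "pregnant" and A = "test is positive". The sketch below is an illustration rather than the notes’ own solution; the function and parameter names are hypothetical.

```python
def posterior(prior, p_pos_given_true, p_pos_given_false):
    """P(B | A) by Bayes' theorem, where A is a positive test and B the condition."""
    numerator = p_pos_given_true * prior
    evidence = numerator + p_pos_given_false * (1 - prior)  # P(A), by total probability
    return numerator / evidence

# P(B) = 0.10, P(A | B) = 0.96, P(A | B') = 0.05
p = posterior(prior=0.10, p_pos_given_true=0.96, p_pos_given_false=0.05)
print(f"P(pregnant | positive) = {p:.4f}")
```

Under these inputs the calculation is 0.096 / (0.096 + 0.045), or roughly 0.68, which is far below the 96% sensitivity because only 10 percent of purchasers are pregnant to begin with.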
5. Counting rules:
a. Factorials:
𝒏! = 𝒏(𝒏 − 𝟏)(𝒏 − 𝟐) … 𝟏
Exercise 1: A home appliance service truck must make 3 stops (A, B, C). In how many
ways could the three stops be arranged?
Answer: 3! = 3 × 2 × 1 = 6
Exercise 2:
(a) In a certain state, license plates consist of three letters (A–Z) followed by three
digits (0–9). How many different plates can be issued?
(b) If the state allows any six-character mix (in any order) of 26 letters and 10 digits,
how many unique plates are possible?
(c) Why might some combinations of digits and letters be disallowed?
*(d) Would the system described in (b) permit a unique license number for every car
in the United States? For every car in the world? Explain your assumptions.
*(e) If the letters O and I are not used because they look too much like the numerals
0 and 1, how many different plates can be issued?
Answer:
a. The number of different plates that can be issued is: $26^3 \times 10^3 = 17{,}576{,}000$
b. The number of possible unique plates is: $36^6 = 2{,}176{,}782{,}336$
c. Letters like O and I might be disallowed because they are similar in appearance to
the digits 0 and 1, and vice versa.
d. Yes, the nearly 2.2 billion unique plates in system (b) should
be enough for every car in the US or for every car in the world.
e. If letters O and I are not used, the remaining characters are 24 letters and 10
digits. Thus, the number of different plates that can be issued in system (b) is:
$34^6 = 1{,}544{,}804{,}416$
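The plate counts follow from the multiplication rule and can be checked in a few lines of Python. The variable names are hypothetical; part (e) is applied to system (b), as in the answer above.

```python
letters, digits = 26, 10

# (a) three letters followed by three digits
plates_a = letters ** 3 * digits ** 3

# (b) any six-character mix of 26 letters and 10 digits
plates_b = (letters + digits) ** 6

# (e) system (b) without the letters O and I
plates_e = (letters - 2 + digits) ** 6

print(f"(a) {plates_a:,}")
print(f"(b) {plates_b:,}")
print(f"(e) {plates_e:,}")
```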
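b. Permutations: the number of ways to arrange r items chosen from n

The order is important.
$_nP_r = \frac{n!}{(n - r)!}$

c. Combinations: the number of ways to choose r items from n

The order is not important.
$_nC_r = \frac{n!}{r!\,(n - r)!}$
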

Exercise 3: There are 10 students in a class. How many ways are there to choose
a group of 3 students from the class?
Answer: 10C3 = 120
Exercise 4: A deck consists of 52 cards.
a. How many ways are there to choose a group of 4 cards?
b. How many ways are there to choose 4 cards one after another with replacement (the same
item can be chosen more than once)? And without replacement (the same item
cannot be selected more than once)?
Answer:
a. 52C4 = 270,725
b. With replacement: 52×52×52×52 = 7,311,616
Without replacement: 52×51×50×49 = 6,497,400
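The answers to Exercises 3 and 4 can be checked with the standard-library functions `math.comb` and `math.perm` (available in Python 3.8 and later). The variable names are hypothetical.

```python
import math

# Exercise 3: choose 3 of 10 students, order not important.
groups_of_3 = math.comb(10, 3)

# Exercise 4a: choose 4 of 52 cards, order not important.
hands_of_4 = math.comb(52, 4)

# Exercise 4b: ordered draws of 4 cards.
with_replacement = 52 ** 4              # each draw has 52 choices
without_replacement = math.perm(52, 4)  # 52 x 51 x 50 x 49

print(groups_of_3, hands_of_4, with_replacement, without_replacement)
```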
Exercise 5: According to a survey, each employee has a 60% chance of leaving their job.
For a group of 24 employees, what is the probability that:
a. None of them leave their jobs?
b. All of them leave their jobs?
c. At least 2 of them leave their jobs?
d. At most 2 of them leave their jobs?
e. More than 1 employee leaves their job?
Answer:
The probability that an employee leaves: P(A) = 0.6 => The probability
that an employee stays: P(A’) = 1 – 0.6 = 0.4
Let X be the number of the 24 employees that leave their jobs. Assuming the employees decide
independently, $P(X = x) = {}_{24}C_x\,(0.6)^x (0.4)^{24 - x}$, so the number of ways to pick which
x employees leave must be included: 24C1 = 24 and 24C2 = 276.
a. P (X = 0) = $0.4^{24}$
b. P (X = 24) = $0.6^{24}$
c. P (X ≥ 2) = 1 – P (X < 2) = 1 – P (X = 1) – P (X = 0) = $1 - 24(0.6)(0.4)^{23} - 0.4^{24}$
d. P (X ≤ 2) = P (X = 0) + P (X = 1) + P (X = 2) = $0.4^{24} + 24(0.6)(0.4)^{23} + 276(0.6)^2(0.4)^{22}$
e. P (X > 1) = 1 – P (X ≤ 1) = 1 – P (X = 0) – P (X = 1) = $1 - 0.4^{24} - 24(0.6)(0.4)^{23}$
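Exercise 5 can be evaluated numerically with a binomial model. The sketch below assumes the 24 employees leave independently with the same probability 0.6, which the exercise implies but does not state; note that P(X = 1) and P(X = 2) include the coefficients 24C1 and 24C2. The function name is hypothetical.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for a binomial count of successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 24, 0.6
p_none = binom_pmf(0, n, p)                                # a. P(X = 0)
p_all = binom_pmf(n, n, p)                                 # b. P(X = 24)
p_at_least_2 = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p) # c. P(X >= 2)
p_at_most_2 = sum(binom_pmf(x, n, p) for x in range(3))    # d. P(X <= 2)
p_more_than_1 = p_at_least_2                               # e. same event as (c)

for label, value in [("a", p_none), ("b", p_all), ("c", p_at_least_2),
                     ("d", p_at_most_2), ("e", p_more_than_1)]:
    print(f"{label}. {value:.6g}")
```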
