Lecture 3-Descriptive Statistics
Elementary Statistics (STATS 1)
Lecture 3: Descriptive Statistics
For example, the average height of a sample of 1503 men is 165.2 cm.
This is a statistical description.
[Figure: number line from -8 to 8]
The Mean
The mean, also known as the arithmetic average, is found by adding the
values of the data and dividing by the total number of values.
$$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n} = \frac{\sum x}{n}$$
a. 1 -4 7 -5 8
[Figure: number line from -8 to 8]
110 76 29 38 105 31
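As a quick check on the definition of the mean, the two data rows above can be averaged in a few lines of Python. This is only a minimal sketch; the function name `arithmetic_mean` and the variable names are illustrative choices, not part of the lecture.

```python
def arithmetic_mean(values):
    """Add the values and divide by how many there are."""
    values = list(values)
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

# The two data rows listed under "The Mean".
row_1 = [1, -4, 7, -5, 8]
row_2 = [110, 76, 29, 38, 105, 31]

print(arithmetic_mean(row_1))  # 1.4
print(round(arithmetic_mean(row_2), 2))
```

Both rows use the same rule: the sum of the values divided by n.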
a. 1 -4 7 -5 8 3
[Figure: number line from -8 to 8]
110 76 29 38 105 31
• A data set that has only one value that occurs with the greatest frequency
is said to be unimodal.
• If a data set has two values that occur with the same greatest frequency,
both values are the mode, and the data set is said to be bimodal.
• If a data set has more than two values that occur with the same greatest
frequency, each value is used as the mode, and the data set is said to be
multimodal.
• When no data value occurs more than once, the data set is said to have no
mode.
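The four rules above can be sketched as a small Python function. The name `describe_modes` and its return format are assumptions made for illustration only.

```python
from collections import Counter

def describe_modes(data):
    """Return (modes, label) using the unimodal/bimodal/multimodal/no-mode rules."""
    if not data:
        raise ValueError("data set is empty")
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        # No value occurs more than once.
        return [], "no mode"
    modes = sorted(value for value, count in counts.items() if count == top)
    label = {1: "unimodal", 2: "bimodal"}.get(len(modes), "multimodal")
    return modes, label

print(describe_modes([1, 1, 2, 2, 3]))  # ([1, 2], 'bimodal')
print(describe_modes([4, 6, 9]))        # ([], 'no mode')
```

The input lists here are made-up illustrations, not data from the lecture.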
Example 3: Calculate the mode of the following:
a. 1 -4 7 -5 8 3 5 7
[Figure: number line from -8 to 8]
a. 1 -4 7 -5 8 3
[Figure: number line from -8 to 8]
110 76 29 38 105 31
d. If you ask only 5 people their height, what is the midrange of their heights?
170.5 cm, 162.3 cm, 150 cm, 157 cm, 160.1 cm
e. Find the midrange of the following baby birth weights (kg):
3.1 5.3 4.5 3.6 5.6
4.7 3.8 6.1 5.9 4.2
5.6 7.1 5.3 4.8 6.3
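The midrange is the value halfway between the smallest and largest observations, (minimum + maximum) / 2. A minimal Python sketch for items d and e follows; the function and variable names are illustrative assumptions.

```python
def midrange(values):
    """Midrange = (smallest value + largest value) / 2."""
    if not values:
        raise ValueError("data set is empty")
    return (min(values) + max(values)) / 2

heights_cm = [170.5, 162.3, 150, 157, 160.1]
birth_weights_kg = [3.1, 5.3, 4.5, 3.6, 5.6,
                    4.7, 3.8, 6.1, 5.9, 4.2,
                    5.6, 7.1, 5.3, 4.8, 6.3]

print(midrange(heights_cm))                  # 160.25
print(round(midrange(birth_weights_kg), 2))  # 5.1
```

Because the midrange uses only the two extreme values, a single unusually large or small observation changes it a lot.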
5. Calculate the mean, median, mode, and midrange of the following data
values.
25 25 28 25 21
28 28 25 26 21
21 27 25 29 29
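One way to check a hand calculation for this exercise is with Python's standard `statistics` module. This is a sketch rather than the lecture's own method; `multimode` requires Python 3.8 or later, and the variable names are illustrative.

```python
import statistics

data = [25, 25, 28, 25, 21,
        28, 28, 25, 26, 21,
        21, 27, 25, 29, 29]

mean = statistics.mean(data)             # sum of values / number of values
median = statistics.median(data)         # middle value of the sorted data
modes = statistics.multimode(data)       # every value tied for highest frequency
midrange = (min(data) + max(data)) / 2   # halfway between the extremes

print(round(mean, 2), median, modes, midrange)
```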
The Weighted Mean
Sometimes, you must find the mean of a data set in which not all values are
equally represented.
When data values are assigned different weights, we can compute a
weighted mean.
$$\bar{X} = \frac{w_1 X_1 + w_2 X_2 + w_3 X_3 + \dots + w_n X_n}{w_1 + w_2 + w_3 + \dots + w_n} = \frac{\sum wX}{\sum w}$$
6. Taxi 1
Taxi 2
Taxi 3
7. Computing Grade Point Average. In her first semester of college, a student
took five courses. Her final grades, along with the number of credits for each
course, were: A (4 credits); A (4 credits); B (4 credits); C (4 credits); and B (4
credits). The grading system assigns quality points to letter grades as follows:
A=4; B=3; C=2; D=1. Compute her grade point average (GPA).
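The GPA is a weighted mean in which the quality points are the values and the credits are the weights. A minimal Python sketch under that reading follows; the names `weighted_mean`, `QUALITY_POINTS`, and `credit_hours` are illustrative assumptions.

```python
def weighted_mean(values, weights):
    """Sum of (weight * value) divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

QUALITY_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1}

grades = ["A", "A", "B", "C", "B"]
credit_hours = [4, 4, 4, 4, 4]

points = [QUALITY_POINTS[g] for g in grades]
gpa = weighted_mean(points, credit_hours)
print(gpa)  # 3.2
```

Since every course here carries the same number of credits, the weighted mean equals the ordinary mean of the quality points; the two differ only when the weights differ.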
Distribution Shapes
Review of the Basics
Measures of Variation
Basics of variation
Consider the following groups of numbers and determine which group varies
more.
a. 6 8
b. 5 3 9
c. 1 2 1 3
[Figure: number line from -8 to 8]
Intuition of variation
[Figure: two plots; the horizontal axes run from 0 to 1200]
[Figure: number line from -8 to 8]
The Variance
SAMPLE VARIANCE
Measures the spread of data in the sample. Sample Variance is
denoted by
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
9. Consider the following numbers and determine the variance
a. 6 8 4 7
b. 5 3 9 8
[Figure: number line from -8 to 8]
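The sample variance formula can be followed step by step in code: find the mean, square each deviation from it, add the squares, and divide by n - 1. This is a minimal sketch applied to the two data sets in exercise 9; the function name is an illustrative assumption.

```python
def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    x_bar = sum(values) / n
    squared_deviations = [(x - x_bar) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)

set_a = [6, 8, 4, 7]
set_b = [5, 3, 9, 8]

print(round(sample_variance(set_a), 4))
print(round(sample_variance(set_b), 4))
```

Both sets have the same mean (6.25), but the values in set b lie farther from it, so its variance is larger.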
POPULATION VARIANCE
Measures the spread of data in the population. Population variance is
denoted by
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
10. Calculate the variance of the ages of the 65 members of parliament.
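The ages for this exercise are not reproduced here, so the sketch below only illustrates the population formula. It reuses the values 6, 8, 4, 7 from exercise 9 and treats them, purely as an assumption for illustration, as a complete population. The function name is also an illustrative choice.

```python
def population_variance(values):
    """Sum of squared deviations from the population mean, divided by N."""
    N = len(values)
    if N == 0:
        raise ValueError("population is empty")
    mu = sum(values) / N
    return sum((x - mu) ** 2 for x in values) / N

population = [6, 8, 4, 7]  # treated as the whole population (assumption)
print(population_variance(population))  # 2.1875
```

The only difference from the sample variance is the divisor: N for a population, n - 1 for a sample.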
The Standard Deviation
SAMPLE STANDARD DEVIATION
Represents the spread of the sample data in the same units as the data; it is
the square root of the sample variance. Sample standard deviation is denoted as
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
POPULATION STANDARD DEVIATION
Represents the spread of the population data in the same units as the data; it
is the square root of the population variance. Population standard deviation is
denoted as
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
12. Calculate the standard deviation of the ages of the 65 members of parliament.
13. Calculate the range, standard deviation and variance of the following
data set.
25 25 28 25 21
28 28 25 26 21
21 27 25 29 29
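A hand calculation for this exercise can be checked with a short Python sketch. It assumes the 15 values are a sample, so the divisor is n - 1; if they were a whole population, the divisor would be N instead. Variable names are illustrative.

```python
import math

data = [25, 25, 28, 25, 21,
        28, 28, 25, 26, 21,
        21, 27, 25, 29, 29]

n = len(data)
x_bar = sum(data) / n

data_range = max(data) - min(data)                       # largest minus smallest
variance = sum((x - x_bar) ** 2 for x in data) / (n - 1)  # sample variance
std_dev = math.sqrt(variance)                             # sample standard deviation

print(data_range)          # 8
print(round(variance, 3))
print(round(std_dev, 3))
```

The standard deviation is in the same units as the data, while the variance is in squared units.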
Measures of Position
Coefficient of Variation
Example 16:
A researcher administered an IQ test to 1000 adults in Timor-Leste. The mean IQ
was 90 and the standard deviation was 10.1. The next day, he tested 4
random people for IQ, with the following results:
a. Antonio IQ = 94
b. Maria IQ = 86
c. Jorge IQ = 130
d. Angelina IQ = 60
For example, the 50th percentile, denoted $P_{50}$, has about 50% of the data
values below it and about 50% of the data values above it. So the 50th
percentile is the same as the median.
$Q_1$ (First quartile): Separates the bottom 25% of the sorted values from the top
75%. (To be more precise, at least 25% of the sorted values are less than or
equal to $Q_1$ and at least 75% of the values are greater than or equal to $Q_1$.)
$Q_2$ (Second quartile): Same as the median; separates the bottom 50% of the
sorted values from the top 50%.
$Q_3$ (Third quartile): Separates the bottom 75% of the sorted values from the top
25%. (To be more precise, at least 75% of the sorted values are less than or
equal to $Q_3$ and at least 25% of the values are greater than or equal to $Q_3$.)
EXAMPLE 19:
Consider the following data set:
1 2 2 3 4
5 7 9 9 10
12 14 16 17 18
a. Find the
b. Find the IQR
EXAMPLE 19:
Consider the following data set:
1 2 2 3 4
5 7 9 9 10
12 14 16 17 18 20
a. Find the
b. Find the IQR
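Textbooks and software use slightly different conventions for quartiles, so results can differ a little between methods. The sketch below assumes the "median of each half" convention, which may not be the one used in the lecture: the median splits the sorted data, and the first and third quartiles are the medians of the lower and upper halves (the middle value is left out of both halves when n is odd). The IQR is the third quartile minus the first. All function names are illustrative.

```python
def median_of_sorted(sorted_values):
    """Median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    s = sorted(data)
    n = len(s)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = s[:half]
    upper = s[half + 1:] if n % 2 else s[half:]
    return median_of_sorted(lower), median_of_sorted(s), median_of_sorted(upper)

data_15 = [1, 2, 2, 3, 4, 5, 7, 9, 9, 10, 12, 14, 16, 17, 18]
data_16 = data_15 + [20]

for data in (data_15, data_16):
    q1, q2, q3 = quartiles(data)
    print(q1, q2, q3, "IQR =", q3 - q1)
```

With 15 values the quartiles land on actual data values; with 16 values the first and third quartiles fall between two data values and are averaged.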
5 Number Summary and Box Plot
The values of the three quartiles are used for the 5-number summary
and the construction of boxplot graphs.
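The 5-number summary consists of the minimum, the first quartile, the median, the third quartile, and the maximum. The sketch below is one possible way to compute it with Python's standard `statistics.quantiles` (Python 3.8 or later). Its default "exclusive" method is only one of several quartile conventions and is an assumption here, not necessarily the lecture's method. The data are the 15 values from Example 19.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, q2, q3 = statistics.quantiles(data, n=4)  # default method: "exclusive"
    return min(data), q1, q2, q3, max(data)

data = [1, 2, 2, 3, 4, 5, 7, 9, 9, 10, 12, 14, 16, 17, 18]
print(five_number_summary(data))  # (1, 3.0, 9.0, 14.0, 18)
```

In a boxplot, the box runs from the first to the third quartile with a line at the median, and the whiskers extend toward the minimum and maximum.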
Use descriptive statistics techniques to decide what minimum salary a
teacher should get.