
Class 31 - Statistics 2

The document covers various statistical concepts including histograms, frequency polygons, line graphs, cumulative frequency, interquartile range, median, quartiles, and standard deviation. It provides worked examples and explanations for drawing histograms and frequency polygons, analyzing data over time with line graphs, and calculating cumulative frequencies and quartiles. Additionally, it discusses the significance of standard deviation in understanding data dispersion.

Uploaded by

marcojaylewis56

Date: 06/02/2023

Class: #31

Syllabus Topic: Statistics

Title: Histogram, Line Graph, Frequency Polygon, Cumulative Frequency, Interquartile Range, Median, Lower Quartile, Upper Quartile, Standard Deviation

Histogram

A histogram displays the shape and spread of continuous sample data. The columns of a histogram “touch” each other.

Worked Example 1

The table below shows the grouped data of the age of 29 persons.

Age Intervals    Frequency
1-10             3
11-20            5
21-30            12
31-40            7
41-50            2

Draw a histogram to represent the information in the table above.

Note: Students commonly draw a histogram incorrectly in two ways. The following two histograms are incorrect.

[Figures: two incorrectly drawn histograms]


Solution:

Age Intervals    Frequency    Lower Class Boundary    Upper Class Boundary
1-10             3            0.5                     10.5
11-20            5            10.5                    20.5
21-30            12           20.5                    30.5
31-40            7            30.5                    40.5
41-50            2            40.5                    50.5

Title: Histogram showing the age of 29 persons

Scale
𝑥-axis: 1 cm = 5 years
𝑦-axis: 1 cm = 1 person
Note: When drawing a histogram, here are some useful tips:

- First, determine the lower class boundary (LCB) and upper class boundary (UCB) for each class interval of the grouped data given.

- Draw the 𝑥-axis and 𝑦-axis using an appropriate scale.

- On the graph, insert a small dot at the LCB and UCB values for each interval, at the height of its frequency, so that you can easily draw the columns.

- Draw the vertical lines first and then the horizontal lines.
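
The first tip can be checked with a short calculation. The following is a minimal Python sketch, not part of the original notes; the function name `class_boundaries` and the `gap` parameter are illustrative assumptions. It assumes the ages are whole numbers, so neighbouring class limits are 1 apart and each boundary lies half a unit beyond the limit.

```python
# Class limits and frequencies from Worked Example 1.
classes = [(1, 10, 3), (11, 20, 5), (21, 30, 12), (31, 40, 7), (41, 50, 2)]


def class_boundaries(lower_limit, upper_limit, gap=1):
    """Return (lower class boundary, upper class boundary).

    `gap` is the distance between one class's upper limit and the next
    class's lower limit (assumed to be 1 because ages are whole numbers).
    """
    half_gap = gap / 2
    return lower_limit - half_gap, upper_limit + half_gap


# Each histogram column runs from its LCB to its UCB with height = frequency.
histogram_columns = []
for lower, upper, frequency in classes:
    lcb, ucb = class_boundaries(lower, upper)
    histogram_columns.append((lcb, ucb, frequency))
    print(f"{lower}-{upper}: column from {lcb} to {ucb}, height {frequency}")
```

Because each upper class boundary equals the next lower class boundary, the columns share edges, which is why the columns of a histogram touch.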

Frequency Polygon

A frequency polygon is a graph constructed by plotting the frequency of each class interval at its midpoint and joining the points with straight lines. It can be created from the histogram.

Worked Example 1

The table below shows the grouped data of the age of 29 persons.

Age Intervals    Frequency    Midpoint
1-10             3            5.5
11-20            5            15.5
21-30            12           25.5
31-40            7            35.5
41-50            2            45.5

Note: In your frequency polygon graph, remember to “close” the polygon by joining its first and last points to the horizontal axis.
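
The midpoints in the table, and one common way of closing the polygon, can be sketched in Python as follows. This is an illustration under an assumption that the notes do not state: the polygon is closed by adding a zero-frequency point one class width before the first midpoint and one class width after the last midpoint.

```python
# Class limits and frequencies from the worked example.
classes = [(1, 10, 3), (11, 20, 5), (21, 30, 12), (31, 40, 7), (41, 50, 2)]

# Midpoint of a class = (lower class limit + upper class limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [frequency for _, _, frequency in classes]

# Assumption: close the polygon with zero-frequency points one class width
# beyond the first and last midpoints.
class_width = midpoints[1] - midpoints[0]
polygon_points = (
    [(midpoints[0] - class_width, 0)]
    + list(zip(midpoints, frequencies))
    + [(midpoints[-1] + class_width, 0)]
)

for age, frequency in polygon_points:
    print(f"plot ({age}, {frequency})")
```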


Title: Frequency Polygon showing the age of 29 persons

Scale
𝑥-axis: 1 cm = 10 years
𝑦-axis: 1 cm = 2 persons

Axes: Age (years) on the 𝑥-axis, Frequency on the 𝑦-axis

Line Graph

A line graph is a type of chart used to visualize the value of something over time.

Worked Example 1

The table below shows the number of cars manufactured from 1998-2001.

Year    Number of cars manufactured
1998    10 000
1999    20 000
2000    15 000
2001    17 000

Title: Line Graph showing the number of cars manufactured from 1998-2001.

Scale
𝑥-axis: 1 cm = 1 year
𝑦-axis: 1 cm = 5 000 cars

Axes: Years on the 𝑥-axis, Frequency (1000’s of cars) on the 𝑦-axis

(a) In which year were the most cars manufactured?

(b) Between which two consecutive years did the largest increase in the number of manufactured cars occur? State the amount of the increase.

(c) Determine the average number of cars manufactured per year.

Solution:

(a) The most cars were manufactured in the year 1999.

(b) The largest increase in the number of manufactured cars occurred from 1998 to 1999.

Increased amount = 20 000 − 10 000

= 10 000 cars

(c) Average number of cars manufactured per year = (10 000 + 20 000 + 15 000 + 17 000) / 4

= 62 000 / 4

= 15 500 cars
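
The three answers can be verified with a short Python sketch. The variable names are illustrative only; the data come from the table in the worked example.

```python
# Number of cars manufactured per year, from the worked example.
cars_by_year = {1998: 10_000, 1999: 20_000, 2000: 15_000, 2001: 17_000}

# (a) Year with the most cars manufactured.
best_year = max(cars_by_year, key=cars_by_year.get)

# (b) Change between each pair of consecutive years, then the largest rise.
years = sorted(cars_by_year)
changes = {
    (first, second): cars_by_year[second] - cars_by_year[first]
    for first, second in zip(years, years[1:])
}
largest_rise_years = max(changes, key=changes.get)
largest_rise = changes[largest_rise_years]

# (c) Mean number of cars manufactured per year.
mean_per_year = sum(cars_by_year.values()) / len(cars_by_year)

print(best_year, largest_rise_years, largest_rise, mean_per_year)
```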

Cumulative Frequency

➢ uses continuous data

➢ the ‘running total’ of frequencies

➢ the sum of the frequency of a class and the frequencies of all classes below it in a frequency distribution.

Worked Example

The table below shows the age of 80 persons.

Age Intervals    Frequency
1-10             2
11-20            9
21-30            25
31-40            30
41-50            10
51-60            4

Question: What is the lower class boundary of the second class interval?

Answer: 10.5

Question: What is the upper class boundary of the last class interval?

Answer: 60.5
Question: What is the class width of the intervals?

Answer: The first class interval is 1-10. Using the first class interval,

Class width = Upper class boundary – Lower class boundary

= 10.5 − 0.5

= 10

Question: What is the lower class limit of the 4th class interval?

Answer: 31

(a) Fill out the cumulative frequency column of the data given above.

Age Intervals    Frequency    Cumulative Frequency
1-10             2            2
11-20            9            11
21-30            25           36
31-40            30           66
41-50            10           76
51-60            4            80

Note: The value obtained in the last row should be equal to the total number stated in the question.

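
The running total in the cumulative frequency column can be reproduced with a minimal Python sketch. It uses `itertools.accumulate` from the standard library; the variable names are illustrative.

```python
from itertools import accumulate

# Frequencies for the classes 1-10, 11-20, ..., 51-60.
frequencies = [2, 9, 25, 30, 10, 4]
total_persons = 80  # total stated in the question

# Each cumulative frequency is the running total up to and including that class.
cumulative = list(accumulate(frequencies))

# The last cumulative frequency should equal the stated total.
matches_total = cumulative[-1] == total_persons

print(cumulative, matches_total)
```
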
(b) Draw a cumulative frequency curve using the information provided above.

Note: When drawing a cumulative frequency curve, here are some useful tips:

- First, determine the upper class boundary for each class interval of the grouped data given, as these values will be used to plot the cumulative frequency curve.

- In this case, the 𝑥-axis will represent age. The 𝑦-axis is almost always used to represent the cumulative frequency.

- The shape of the curve looks like a ‘stretched 𝑆’ and is called an ogive.

- Use a small ‘×’ to plot the points and a fine pencil to connect them.

- Try to use as much of the graph paper as possible.

Age Intervals    Frequency    Cumulative Frequency    Upper Class Boundary
1-10             2            2                       10.5
11-20            9            11                      20.5
21-30            25           36                      30.5
31-40            30           66                      40.5
41-50            10           76                      50.5
51-60            4            80                      60.5

Title: Cumulative Frequency Graph showing the age of 80 persons

Scale
𝑥-axis: 1 cm = 5 years
𝑦-axis: 1 cm = 5 persons

Axes: Age (years) on the 𝑥-axis, Cumulative Frequency on the 𝑦-axis

(c) What is the median age?

The median value occurs at the (𝑛 + 1)/2 = (80 + 1)/2 = 81/2 = 40.5th value

OR

The median value occurs at the 𝑛/2 = 80/2 = 40th value

Note: Either equation can be used.

Using the cumulative frequency curve, the median age is 32 years.
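
Reading a value off the curve can be approximated in Python by joining the plotted points with straight lines. This is only a sketch under assumptions that are not in the notes: the helper `read_ogive` is hypothetical, the curve is taken to start at (0.5, 0), and straight-line joins stand in for the smooth hand-drawn curve, so the estimate can differ slightly from a reading taken from graph paper.

```python
# Plotted points: (upper class boundary, cumulative frequency).
# Assumption: the curve starts at the first lower class boundary, (0.5, 0).
ogive_points = [
    (0.5, 0), (10.5, 2), (20.5, 11), (30.5, 36),
    (40.5, 66), (50.5, 76), (60.5, 80),
]


def read_ogive(points, target_cf):
    """Estimate the age at which the cumulative frequency reaches target_cf.

    Uses linear interpolation between neighbouring plotted points.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target_cf <= cf1:
            return x0 + (target_cf - cf0) * (x1 - x0) / (cf1 - cf0)
    raise ValueError("target cumulative frequency is outside the curve")


n = 80
median_position = n / 2  # the 40th value
median_estimate = read_ogive(ogive_points, median_position)

print(round(median_estimate, 2))
```

Rounded to the nearest year, this estimate agrees with the median of 32 years read from the curve.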

(d) Determine the interquartile range.

Equation:

IQR = 𝑄3 − 𝑄1

where IQR = interquartile range

𝑄3 = 3rd quartile

𝑄1 = 1st quartile

The IQR (interquartile range) describes the spread of the middle 50% of values when they are ordered from lowest to highest. It is a measure of the dispersion of data.

The 3rd quartile occurs at the 3(𝑛 + 1)/4 = 3(80 + 1)/4 = 60.75th value

OR

The 3rd quartile occurs at the 3𝑛/4 = 3(80)/4 = 60th value

Using the cumulative frequency curve, the value of 𝑄3 = 38 years.


The 1st quartile occurs at the (𝑛 + 1)/4 = (80 + 1)/4 = 20.25th value

OR

The 1st quartile occurs at the 𝑛/4 = 80/4 = 20th value

Using the cumulative frequency curve, the value of 𝑄1 = 25 years.

∴ IQR = 𝑄3 − 𝑄1

= 38 − 25

= 13 years

(e) Determine the semi-interquartile range.

SIQR = IQR/2

= 13/2

= 6.5 years
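
The positions and ranges in parts (c) to (e) can be checked with a short Python sketch. The quartile values are the ones read from the curve in the worked example; the dictionary and variable names are illustrative.

```python
n = 80  # total number of persons

# Positions using the (n + 1) form of each equation.
positions_n_plus_1 = {
    "Q1": (n + 1) / 4,
    "median": (n + 1) / 2,
    "Q3": 3 * (n + 1) / 4,
}

# Positions using the n form of each equation.
positions_n = {"Q1": n / 4, "median": n / 2, "Q3": 3 * n / 4}

# Quartile values read from the cumulative frequency curve in the example.
q1, q3 = 25, 38

iqr = q3 - q1   # interquartile range
siqr = iqr / 2  # semi-interquartile range

print(positions_n_plus_1, positions_n, iqr, siqr)
```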

(f) How many persons are over 50 years old?

Using the cumulative frequency curve, it can be deduced that 76 persons are aged 50 years or younger.

Hence, 80 − 76 = 4 persons are over the age of 50.


(g) What is the probability of selecting a person over 50 years old?

Probability = Number of desired outcomes / Total number of outcomes

= 4/80

= 1/20 or 0.05 or 5%
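
Parts (f) and (g) can be checked together in Python. This minimal sketch uses `fractions.Fraction` from the standard library so that 4/80 is reduced to its simplest form automatically; the variable names are illustrative.

```python
from fractions import Fraction

total_persons = 80
persons_up_to_50 = 76  # cumulative frequency read from the curve

# (f) Persons over 50 = total minus the cumulative frequency up to age 50.
persons_over_50 = total_persons - persons_up_to_50

# (g) Probability = number of desired outcomes / total number of outcomes.
probability = Fraction(persons_over_50, total_persons)

print(probability, float(probability), f"{float(probability):.0%}")
```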

Example – Finding the median of a cumulative frequency curve

Question: Determine the median of the cumulative frequency curve below.

Scale
𝑥-axis: 1 cm = 5 years
𝑦-axis: 1 cm = 10 persons

Axes: Age (years) on the 𝑥-axis, Cumulative Frequency on the 𝑦-axis

Solution:

The median value occurs at the (𝑛 + 1)/2 = (60 + 1)/2 = 61/2 = 30.5th value

OR

The median value occurs at the 𝑛/2 = 60/2 = 30th value

Using the cumulative frequency curve, the median age is 20 years.

Question: Determine the value of 𝑄3 .

Solution:

The 3rd quartile occurs at the 3𝑛/4 = 3(60)/4 = 45th value

Using the cumulative frequency curve, the value of 𝑄3 = 23 years.

Question: Determine the value of 𝑄1.

Solution:

The 1st quartile occurs at the 𝑛/4 = 60/4 = 15th value

Using the cumulative frequency curve, the value of 𝑄1 = 17 years.


Standard Deviation

In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range.

➢ The standard deviation tells us how much the individual numbers deviate from the mean.

Example:

The tables below show the weekly spending money of two students, Jeremiah and Cromwell, and of each student's two brothers.

Jeremiah    Brother 1    Brother 2
$333        $330         $336
Mean = $333

Cromwell    Brother 1    Brother 2
$5          $400         $595
Mean = $333 (to the nearest dollar)

Notice that the mean amounts for Jeremiah and Cromwell are equal. However, in Cromwell’s situation, the mean does not represent how much money he receives for the week. We can use the standard deviation to show how much the individual numbers deviate from the mean.
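
The difference between the two sets of amounts can be put into numbers with a short Python sketch. The notes do not give a formula for the standard deviation, so this sketch assumes the population standard deviation (the square root of the mean of the squared deviations from the mean), which the standard library provides as `statistics.pstdev`.

```python
from statistics import mean, pstdev

# Weekly spending money from the two tables in the example.
jeremiah_amounts = [333, 330, 336]
cromwell_amounts = [5, 400, 595]

for name, amounts in (("Jeremiah", jeremiah_amounts), ("Cromwell", cromwell_amounts)):
    # pstdev is the population standard deviation (assumed here).
    print(f"{name}: mean = {mean(amounts):.2f}, standard deviation = {pstdev(amounts):.2f}")
```

The two means are almost identical, but the standard deviation of Cromwell's amounts is far larger, which shows that his amounts are much more spread out around the mean.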

Note: When comparing the standard deviations of sets of values, vital words to include are:

- “spread”
- “deviates”
- “mean”

Example:

Diagram 1:

Standard deviation = 5

Diagram 2:

Standard deviation = 1

For diagram 1, suppose that the standard deviation is 5 and for diagram 2, suppose that the standard deviation is 1. The standard deviation of diagram 1 is larger than the standard deviation of diagram 2. Since standard deviation is a measure of the spread of data, it means that the values in diagram 1 were more ‘spread out’ or distributed over their range than the values in diagram 2. In diagram 1, the values deviated further from the mean than in diagram 2.
