
Topic 1 Descriptive Statistics SV

Uploaded by

Keith 777
Copyright
© © All Rights Reserved

Centre For Foundation Studies

Department of Sciences and Engineering

FHMM1214 Mathematics for Social Science

Topic 1
Descriptive Statistics
Content
1.1 What is Statistics?
1.2 Population Versus Sample
1.3 Basic Terms
1.4 Types of Variables
1.5 Raw Data
1.6 Organizing and Graphing Qualitative Data
1.7 Organizing and Graphing Quantitative Data
1.8 Shapes of Histograms
1.9 Cumulative Frequency Distributions
1.10 Stem-and-Leaf Displays
1.1 What is Statistics?
1st Meaning of Statistics
The word ‘statistics’ has 2 meanings.
1. Statistics refers to numerical facts.

• The age of a student.
• The number of students enrolled in UTAR.
• The income of a family.
• The percentage of passes in a statistics class.
2nd Meaning of Statistics
2. Statistics refers to the field or
discipline of study.
Statistics is a group of methods used to
collect, analyze, present, and interpret
data and to make decisions.

1.2 Population Versus Sample
Population or Target Population
Consists of all elements (individuals, items,
or objects) whose characteristics are being
studied.

Sample
A portion of the population selected for
study.

Illustration

1.3 Basic Terms
Definition
Element or Member
An element or member of a sample or
population is a specific subject or
object (e.g. a person, firm, item, state, or
country) about which the information is
collected.

Variable
A variable is a characteristic under study that
assumes different values for different elements.
Definition

Observation or Measurement
The value of a variable for an element.

Data Set
A data set is a collection of observations on
one or more variables.

SUMMARY
Population or Target Population
Consists of all elements (individuals, items, or objects) whose characteristics are
being studied.

Sample
A portion of the population selected for study.

Element or Member
An element or member of a sample or population is a specific subject or object
(e.g. a person, firm, item, state, or country) about which the information is
collected.

Variable
A variable is a characteristic under study that assumes different values for
different elements.

Observation or Measurement
The value of a variable for an element.

Data Set
A data set is a collection of observations on one or more variables.
Example
Problem
The following table gives the scores of five students on a statistics test.

Student   Score
Kevin     83
Susan     91
David     78
Jeff      69
Johan     87

i) What is the variable for this data set?
ii) How many observations does this data set contain?
iii) How many elements does this data set contain?
Solution

1.4 Types of Variables
Quantitative Variables
Definition

• A variable that can be measured numerically is called a quantitative variable.

• The data collected on a quantitative variable are called quantitative data.
Quantitative Variables
a) Discrete Variable
A variable whose values are countable is
called a discrete variable. In other words,
a discrete variable can assume only
certain values with no intermediate values.

Example: The number of students in a class.
Quantitative Variables

b) Continuous Variable
A variable that can assume any numerical
value over a certain interval is called a
continuous variable.

Examples:
• The height of a person.
• The time taken to complete an examination.
• The yield of potatoes (in pounds) per acre.
Qualitative / Categorical Variables

Definition
• A variable that cannot assume a numerical value but can be classified into two or more non-numeric categories is called a qualitative or categorical variable.
• The data collected on such a variable are called qualitative data.

Example: Gender of a person, hair color
Exercise
For each of the following, determine whether it describes a population or a sample, and identify the variable as qualitative, quantitative discrete, or quantitative continuous.

a) Annual income of all employees of a restaurant.
b) Number of subjects taken by students selected in a class.
c) Names of all students in a school.
d) Weights of 50 kids selected in the kindergarten.
e) Time taken to complete a test by all students in a class.
Solution

Illustration

1.5 Raw Data
Definition
RAW DATA
Data recorded in the sequence in which
they are collected and before they are
processed or ranked are called raw data.

Raw Data (quantitative data)

Raw Data (qualitative data)

1.6 Organizing and Graphing Qualitative Data
Example 1
A sample of 30 employees were asked how stressful their
jobs were. Their responses are recorded below.

somewhat  none      somewhat  very      very      none
very      somewhat  somewhat  very      somewhat  somewhat
very      somewhat  none      very      none      somewhat
somewhat  very      somewhat  somewhat  very      none
somewhat  very      very      somewhat  none      somewhat

Construct a frequency distribution table for these data.
Example 1 (Solution)

Relative Frequency &
Percentage Distributions
Tabular arrangement that lists the
relative frequencies and percentages
for all categories.
$$\text{Relative frequency of a category} = \frac{\text{frequency of that category}}{\text{sum of all frequencies}} = \frac{f}{\sum f}$$

$$\text{Percentage} = (\text{relative frequency}) \times 100$$
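The two formulas above can be sketched in a few lines of Python. This is only an illustration, not part of the course material: it uses the 30 job-stress responses from Example 1, and the function name `frequency_table` is a hypothetical helper chosen for this sketch.

```python
from collections import Counter

# Responses of the 30 employees in Example 1, written in lower case.
responses = """
somewhat none somewhat very very none
very somewhat somewhat very somewhat somewhat
very somewhat none very none somewhat
somewhat very somewhat somewhat very none
somewhat very very somewhat none somewhat
""".split()


def frequency_table(values):
    """Return {category: (f, relative frequency, percentage)}."""
    counts = Counter(values)
    total = sum(counts.values())  # sum of all frequencies
    table = {}
    for category, f in counts.items():
        relative = f / total  # f divided by the sum of f
        table[category] = (f, relative, relative * 100)
    return table


table = frequency_table(responses)
for category, (f, relative, percentage) in table.items():
    print(f"{category:<10}{f:>4}{relative:>8.3f}{percentage:>8.1f}")
```

The relative frequencies should add up to 1 and the percentages to 100, apart from small rounding differences.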
Example 1 (Solution)

Response    f
Very        10
Somewhat    14
None        6
            Sum = 30
Exercise
The following data give the grades of 20 students in a Mathematics test.
A C A B F
B A B C B
A B C F A
B B C C B
a) Construct a frequency distribution table.
b) Calculate the relative frequencies and percentages for the results.
Solution
Grade   Frequency   Relative Frequency   Percentage (%)
A
Exercise

Frequency, Relative Frequency, Percentage
Distributions Table of Students’ Status

Status   Frequency   Relative Frequency   Percentage
F
SO
J
SE
Sum
Revision exercise
In a survey, 120 Malaysian adults were asked to rate their health.
The table below summarizes their responses.
State of Health Percentage of Response
Excellent 17.5
Very good 37.5
Good 32.5
Fair 10.0
Poor 2.5

Find the number of adults who were in excellent health.
1.7 Organizing and Graphing Quantitative Data
Ungrouped Frequency Distribution

Frequency Distributions for Quantitative Data

Single-Valued Classes
Single-valued classes are used if the observations in a data set assume
only a few distinct (integer) values
(i.e. classes are made of single values and not of intervals).
Example 2
The Number of Vehicles Owned by 40 Households
from a City

5 1 1 2 0 1 1 2 1 1
1 3 3 0 2 5 1 2 3 4
2 1 2 2 1 2 2 1 1 1
4 2 1 1 2 1 1 4 1 3
Construct a frequency distribution table for these data.

Example 2 (Solution)
Vehicles Owned   Number of Households (f)
0                2
1                18
2                11
3                4
4                3
5                2
Sum              40
Bar Graph

Grouped Frequency Distribution

• A grouped frequency distribution lists all the classes and the number of values that belong to each class.
• Data presented in the form of a frequency distribution are called grouped data.
Example 3
Weekly Earnings of 100 Employees of a Company
401 410 448 450 490 505 521 555 600 601
605 610 620 625 630 650 678 680 685 690
700 725 750 760 770 780 785 790 795 798
800 801 805 809 810 810 814 815 820 825
828 830 835 840 845 850 855 860 865 870
880 888 890 895 900 910 920 930 935 940
950 956 959 960 965 967 970 980 995 1000
1010 1020 1030 1055 1068 1070 1079 1090 1100 1110
1120 1130 1155 1167 1180 1230 1250 1259 1270 1290
1300 1320 1350 1400 1410 1460 1500 1541 1560 1600

Construct a frequency distribution table for these data.


Example 3 (Solution)
Weekly Earnings   Number of Employees (f)
401 – 600         9
601 – 800         22
801 – 1000        39
1001 – 1200       15
1201 – 1400       9
1401 – 1600       6
Sum               100
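As a rough check of the table above, the tallying can be sketched in Python. The function name `grouped_frequencies` and its parameters are assumptions made for this illustration; the sketch also assumes whole-number data and classes of equal width, as in Example 3.

```python
# Weekly earnings of the 100 employees in Example 3.
earnings = [
    401, 410, 448, 450, 490, 505, 521, 555, 600, 601,
    605, 610, 620, 625, 630, 650, 678, 680, 685, 690,
    700, 725, 750, 760, 770, 780, 785, 790, 795, 798,
    800, 801, 805, 809, 810, 810, 814, 815, 820, 825,
    828, 830, 835, 840, 845, 850, 855, 860, 865, 870,
    880, 888, 890, 895, 900, 910, 920, 930, 935, 940,
    950, 956, 959, 960, 965, 967, 970, 980, 995, 1000,
    1010, 1020, 1030, 1055, 1068, 1070, 1079, 1090, 1100, 1110,
    1120, 1130, 1155, 1167, 1180, 1230, 1250, 1259, 1270, 1290,
    1300, 1320, 1350, 1400, 1410, 1460, 1500, 1541, 1560, 1600,
]


def grouped_frequencies(values, first_lower, width, n_classes):
    """Count the values falling in consecutive classes of equal width.

    Class k runs from first_lower + k*width (lower limit)
    to first_lower + (k+1)*width - 1 (upper limit), both inclusive.
    """
    rows = []
    for k in range(n_classes):
        lower = first_lower + k * width
        upper = lower + width - 1
        f = sum(1 for v in values if lower <= v <= upper)
        rows.append((lower, upper, f))
    return rows


rows = grouped_frequencies(earnings, first_lower=401, width=200, n_classes=6)
for lower, upper, f in rows:
    print(f"{lower} - {upper}: {f}")
```

Each value is counted in exactly one class because the class limits do not overlap.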
Relative Frequency &
Percentage Distributions

The relative frequencies and percentages for a quantitative data set are obtained as follows:

$$\text{Relative frequency of a class} = \frac{\text{frequency of that class}}{\text{sum of all frequencies}} = \frac{f}{\sum f}$$

$$\text{Percentage} = (\text{relative frequency}) \times 100$$
Illustration 1
Illustration 2

Class Boundaries f
134.5 – 156.5 10
156.5 – 178.5 3
178.5 – 200.5 7
200.5 – 222.5 6
222.5 – 244.5 4

Sum = 30

Example 4

Example 4 (Solution)
Find the frequency, relative frequency and
percentage for all classes.
Age       Frequency   Relative Frequency   Percentage
18 – 21
22 – 25
26 – 29
30 – 33
34 – 37
Sum
Definition
Class
An interval that includes all the values that fall between
two numbers, the lower and upper limits.
Class limits
The endpoints of each interval.
Class Boundary
The dividing line between two classes, given
by the midpoint of the upper limit of one class and
the lower limit of the next higher class.
Definition
Class width / class size
The difference between the upper and lower
class boundaries.

Class mark / class midpoint
The midpoint of the class interval.

$$\text{Class midpoint or class mark} = \frac{\text{Lower limit} + \text{Upper limit}}{2}$$
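The definitions of class boundary, class midpoint and class width can be sketched as a small Python helper. The name `class_summary` is hypothetical, and the sketch assumes that every pair of adjacent classes is separated by the same gap (1 for whole-number limits), so that half of that gap is applied at both ends of each class. The classes 11 – 15, 16 – 20, 21 – 25 and 26 – 30 serve as sample input.

```python
def class_summary(limits):
    """Return (lower boundary, upper boundary, midpoint, width) per class.

    limits: list of (lower limit, upper limit) for consecutive classes.
    Assumes the gap between adjacent classes is the same throughout.
    """
    # Gap between the upper limit of one class and the lower limit of the next.
    half_gap = (limits[1][0] - limits[0][1]) / 2
    rows = []
    for lower, upper in limits:
        lower_boundary = lower - half_gap
        upper_boundary = upper + half_gap
        midpoint = (lower + upper) / 2
        width = upper_boundary - lower_boundary
        rows.append((lower_boundary, upper_boundary, midpoint, width))
    return rows


rows = class_summary([(11, 15), (16, 20), (21, 25), (26, 30)])
for row in rows:
    print(row)
```

For the first and last classes there is no neighbouring class on one side, which is why the sketch reuses the same half gap there.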
Example 5

Example 5 (Solution)

Class Boundaries

400.5 – 600.5
600.5 – 800.5
800.5 – 1000.5
1000.5 – 1200.5
1200.5 – 1400.5
1400.5 – 1600.5

Example 6
Class interval   Lower boundary        Upper boundary        Midpoint            Class width
11 – 15          (10 + 11)/2 = 10.5    (15 + 16)/2 = 15.5    (11 + 15)/2 = 13    15.5 – 10.5 = 5
16 – 20          (15 + 16)/2 = 15.5    (20 + 21)/2 = 20.5    (16 + 20)/2 = 18    20.5 – 15.5 = 5
21 – 25          (20 + 21)/2 = 20.5    (25 + 26)/2 = 25.5    (21 + 25)/2 = 23    25.5 – 20.5 = 5
26 – 30          (25 + 26)/2 = 25.5    (30 + 31)/2 = 30.5    (26 + 30)/2 = 28    30.5 – 25.5 = 5
Exercise
Class interval   Lower boundary   Upper boundary   Midpoint   Class size
0 – 9
10 – 19
20 – 29
30 – 39
Solution
Class interval   Lower boundary   Upper boundary   Midpoint   Class size
0 – 9            -0.5             9.5              4.5        10
10 – 19          9.5              19.5             14.5       10
20 – 29          19.5             29.5             24.5       10
30 – 39          29.5             39.5             34.5       10
Exercise
Find the class boundaries and class limits.
a) Number of books   2 – 3      4 – 5      6 – 7      8 – 9      10 – 11
   Frequency         10         12         8          4          2

b) Weight (kg)       40 – <50   50 – <60   60 – <70   70 – <80   80 – <90
   Frequency         10         12         8          4          2
Solution
(a)
Number of books   Frequency   Class boundaries   Class limit
2 – 3             10
4 – 5             12
6 – 7             8
8 – 9             4
10 – 11           2
Solution
(b)
Weight (kg) frequency Class boundaries Class limit

40 – <50 10
50 – <60 12
60 – <70 8
70 – <80 4
80 – <90 2

Revision exercise
The following table gives the frequency
distribution of ages for all 50 employees of a
company.
Age No. of Employees
18 to 30 12
31 to 43 19
44 to 56 14
57 to 69 5
a) Find the class boundaries and class midpoints.
b) Construct a relative frequency and percentage table.
c) Do all classes have the same width? If yes, what is that width?
d) What percentage of the employees of this company are aged 43 or younger?
Solution
Age       Class boundaries   Midpoint   Class width   Frequency   Relative frequency   Percentage (%)
18 – 30                                               12
31 – 43                                               19
44 – 56                                               14
57 – 69                                               5
                                                      Sum = 50
Graphing Grouped Data
Grouped (quantitative) data can be
displayed in a histogram or a polygon.
Histogram
Three types of histogram

1. Frequency histogram
2. Relative frequency histogram
3. Percentage histogram
Histogram
• A histogram is a graph in which class boundaries are marked on the horizontal (x) axis and the frequencies, relative frequencies, or percentages are marked on the vertical (y) axis.

• The frequencies, relative frequencies, or percentages are represented by the heights of the bars.

• The bars are drawn adjacent to each other.
Illustration 3
Total Home Runs   Class boundaries   Frequency   Relative Frequency   Percentage (%)
135 – 156         134.5 – 156.5      10          0.3333               33.33
157 – 178         156.5 – 178.5      3           0.1000               10.00
179 – 200         178.5 – 200.5      7           0.2333               23.33
201 – 222         200.5 – 222.5      6           0.2000               20.00
223 – 244         222.5 – 244.5      4           0.1333               13.33
                                     Sum = 30    Sum = 0.9999         Sum = 99.99%


Frequency Histogram
• Frequency may be used as the height of each rectangle.

[Figure: frequency histogram; x-axis marked at the class boundaries 134.5, 156.5, 178.5, 200.5, 222.5, 244.5]
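Relative Frequency Histogram

[Figure: relative frequency histogram; x-axis marked at the class boundaries 134.5, 156.5, 178.5, 200.5, 222.5, 244.5]

Percentage Histogram

[Figure: percentage histogram; x-axis marked at the class boundaries 134.5, 156.5, 178.5, 200.5, 222.5, 244.5; y-axis: Percentage]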
Polygon
• A polygon is a graph formed by joining the midpoints of the tops of successive bars in a histogram.

• Next, we mark two more classes (with zero frequencies), one at each end, and mark their midpoints.
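The vertices of a frequency polygon can be computed from the class boundaries and frequencies. The following is a minimal sketch, assuming all classes have the same width so that the two extra zero-frequency classes reuse it. The function name `polygon_points` is hypothetical; the input is the total home runs distribution (class boundaries 134.5 to 244.5, frequencies 10, 3, 7, 6, 4).

```python
def polygon_points(boundaries, frequencies):
    """Return the (midpoint, frequency) vertices of a frequency polygon.

    boundaries: class boundaries in ascending order (one more than classes).
    Assumes equal class widths; one zero-frequency class is added at each end.
    """
    width = boundaries[1] - boundaries[0]
    midpoints = [(low + high) / 2 for low, high in zip(boundaries, boundaries[1:])]
    first = midpoints[0] - width   # midpoint of the extra class on the left
    last = midpoints[-1] + width   # midpoint of the extra class on the right
    return [(first, 0)] + list(zip(midpoints, frequencies)) + [(last, 0)]


points = polygon_points(
    [134.5, 156.5, 178.5, 200.5, 222.5, 244.5],
    [10, 3, 7, 6, 4],
)
for midpoint, frequency in points:
    print(midpoint, frequency)
```

Joining these points in order with straight lines gives the polygon, which starts and ends on the horizontal axis.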
Polygon

[Figure: frequency polygon; x-axis: total home runs, marked at the class boundaries 134.5, 156.5, 178.5, 200.5, 222.5, 244.5]
Example 7
The marks obtained by 134 students in an examination
are recorded in the following table.

Marks       20 – 29   30 – 39   40 – 49   50 – 59   60 – 69   70 – 79   80 – 89
Frequency   22        18        22        24        14        14        20

Construct a histogram for the frequency distribution.
Example 7 (Solution)
Marks     Class boundaries   Frequency
20 – 29   19.5 – 29.5        22
30 – 39   29.5 – 39.5        18
40 – 49   39.5 – 49.5        22
50 – 59   49.5 – 59.5        24
60 – 69   59.5 – 69.5        14
70 – 79   69.5 – 79.5        14
80 – 89   79.5 – 89.5        20
Example 7 (Solution)

Exercise
The table below shows the age distribution of 30
participants in a game. Draw a histogram for the frequency
distribution.

Age (Years)   Frequency
6 – 10        2
11 – 15       7
16 – 20       8
21 – 25       6
26 – 30       3
31 – 35       4
Solution

Age       Class boundaries   Frequency
6 – 10    5.5 – 10.5         2
11 – 15   10.5 – 15.5        7
16 – 20   15.5 – 20.5        8
21 – 25   20.5 – 25.5        6
26 – 30   25.5 – 30.5        3
31 – 35   30.5 – 35.5        4
Histogram for the frequency distribution of the age (years) of 30
participants in a game

[Figure: histogram; x-axis: Age, marked at the class boundaries 5.5, 10.5, 15.5, 20.5, 25.5, 30.5, 35.5; y-axis: Frequency]
Revision exercise
Weekly Earnings of 100 Employees of a Company
Weekly Earnings Number of Employees
(dollars) (f)
401 – 600 9
601 – 800 22
801 – 1000 39
1001 – 1200 15
1201 – 1400 9
1401 – 1600 6
Sum 100
Construct a histogram and a polygon for the frequency distribution.
Revision exercise
The marks obtained by 120 students in an
examination are recorded in the following
table.

Marks       20-29   30-39   40-49   50-59   60-69   70-79   80-89
Frequency   12      18      24      20      18      16      12

Construct a percentage histogram to represent the above information.
Solution

1.8 Shapes of Histograms
Symmetric Histogram

A symmetric histogram is identical on both sides of its central point.
Skewed Histogram
A skewed histogram is asymmetric: the tail on one side is
longer than the tail on the other side.
1.9 Cumulative Frequency Distributions
Definition
A cumulative frequency distribution
gives the total number of values that fall
below the upper boundary of each class.

Example 8
Prepare a cumulative frequency distribution for the
following frequency distribution.
Example 8 (Solution)

f
10
3
7
6
4
Sum = 30

Cumulative Relative Frequency &
Cumulative Percentage
$$\text{Cumulative relative frequency} = \frac{\text{Cumulative frequency of a class}}{\text{Total observations in the data set}}$$

$$\text{Cumulative percentage} = (\text{Cumulative relative frequency}) \times 100$$
Example 8 (Solution)

c.f.
10
13
20
26
30
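The cumulative frequencies listed above are running totals of the class frequencies 10, 3, 7, 6, 4. A minimal Python sketch of that calculation follows, together with the cumulative relative frequency and cumulative percentage; the variable names are chosen for this illustration only.

```python
from itertools import accumulate

# Class frequencies from Example 8, in order of increasing class.
frequencies = [10, 3, 7, 6, 4]

# Running totals: each entry adds the current class to all earlier ones.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]  # total observations in the data set

cumulative_relative = [cf / total for cf in cumulative]
cumulative_percentage = [rel * 100 for rel in cumulative_relative]

for cf, rel, pct in zip(cumulative, cumulative_relative, cumulative_percentage):
    print(f"{cf:>3}{rel:>8.4f}{pct:>8.2f}")
```

The last cumulative frequency always equals the total number of observations, so the last cumulative percentage is 100.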
Ogive
An ogive is a curve drawn for the cumulative
frequency distribution by joining with straight
lines the dots marked above the upper
boundaries of classes at heights equal to the
cumulative frequencies of respective classes.

Cumulative Frequency Curve (Ogive)
There are two types of cumulative frequency
curves:
1) ‘less than’ cumulative frequency curve
2) ‘more than’ cumulative frequency curve

Example 9
Construct a ‘less than’ ogive for the data below.

Example 9 (Solution)

Upper boundary   ‘Less than’ cumulative frequency
< 134.5          0
< 156.5          10
< 178.5          13
< 200.5          20
< 222.5          26
< 244.5          30
‘Less than’ Ogive

Example 10
Using the data given below, construct a ‘less than’ cumulative
frequency distribution and draw the ogive.

Marks                      1 – 10   11 – 20   21 – 30   31 – 40   41 – 50   51 – 60   61 – 70   71 – 80
Number of Students ( f )   3        8         12        14        10        6         5         2

Estimate from the ogive,
(i) the number of students who score less than 60 marks.
(ii) the number of students who score more than or equal to 60 marks.
(iii) the value of x, if 20% of the students score less than x marks.
(iv) the value of x, if 80% of the students score more than or equal to x marks.
‘Less than’ Cumulative Frequency Distribution

Marks     Frequency   Upper boundary   ‘Less than’ cumulative frequency
                      < 0.5            0
1 – 10    3           < 10.5           3
11 – 20   8           < 20.5           11
21 – 30   12          < 30.5           23
31 – 40   14          < 40.5           37
41 – 50   10          < 50.5           47
51 – 60   6           < 60.5           53
61 – 70   5           < 70.5           58
71 – 80   2           < 80.5           60
Sum       60
“Less than” ogive for the cumulative frequency distribution for the
marks scored by 60 students
Example 10 (Solution)
(i) Approximately 52 students score less than 60 marks.

(ii) Approximately 8 students score at least 60 marks.

(iii) 20% of the students (12 students) obtain less than x marks.
From the graph, x = 21.

(iv) If 80% of the students (48 students) obtain at least x marks,
then 20% of the students (12 students) obtain less than x marks.
From the graph, x = 21.
Therefore, 80% of the students (48 students) score at least 21 marks.
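Because the ogive joins its points with straight lines, reading a value from the graph amounts to linear interpolation between two neighbouring points. The sketch below is an illustration under that assumption, using the upper boundaries and ‘less than’ cumulative frequencies of Example 10. The helper name `interpolate` is hypothetical, and values read from a hand-drawn graph may differ from the computed ones by about one student or one mark.

```python
# Upper class boundaries and 'less than' cumulative frequencies (Example 10).
boundaries = [0.5, 10.5, 20.5, 30.5, 40.5, 50.5, 60.5, 70.5, 80.5]
cumulative = [0, 3, 11, 23, 37, 47, 53, 58, 60]


def interpolate(x, xs, ys):
    """Linear interpolation of y at x along the points (xs[i], ys[i])."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the plotted range")


total = cumulative[-1]

# (i) and (ii): read the curve at 60 marks.
below_60 = interpolate(60, boundaries, cumulative)
at_least_60 = total - below_60

# (iii) and (iv): read the curve backwards at 20% of the students.
x_20_percent = interpolate(0.20 * total, cumulative, boundaries)

print(below_60, at_least_60, x_20_percent)
```

Swapping the roles of the two lists in the last call is what "reading the graph backwards" means here: the cumulative frequency is the known quantity and the mark is the unknown.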
Revision exercise
Draw a histogram and a ‘less than’ cumulative
frequency curve based on the following frequency
distribution.
Marks           Number of Students (f)
0 ≤ x < 20      30
20 ≤ x < 40     40
40 ≤ x < 60     50
60 ≤ x < 80     60
80 ≤ x < 100    20
                Sum = 200
Solution

1.10 Stem-and-Leaf Displays
Definition
In a stem-and-leaf display of quantitative
data, each value is divided into two portions
– a stem and a leaf.

The leaves for each stem are shown separately in a display.
Example 11
The following are the scores of 30 college students
on a statistics test.
75 52 80 96 65 79 71 87 93 95
69 72 81 61 76 86 79 68 50 92
83 84 77 64 71 87 72 92 57 98

Construct a stem-and-leaf display.

Stem-and-leaf display for two-digit
numbers

5 | 2 0 7
6 | 9 1 4 5 8
7 | 5 2 7 6 1 9 1 9 2
8 | 3 4 0 1 6 7 7
9 | 6 2 3 5 2 8
Example 11 (Solution)

Key: 5|0 means 50
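A stem-and-leaf display for two-digit numbers can be built by splitting each value into its tens digit (stem) and units digit (leaf). The following is a minimal Python sketch using the 30 test scores of Example 11; the function name `stem_and_leaf` is hypothetical, and the leaves are sorted in increasing order within each stem.

```python
from collections import defaultdict

# Scores of the 30 college students in Example 11.
scores = [
    75, 52, 80, 96, 65, 79, 71, 87, 93, 95,
    69, 72, 81, 61, 76, 86, 79, 68, 50, 92,
    83, 84, 77, 64, 71, 87, 72, 92, 57, 98,
]


def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    display = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 75 -> stem 7, leaf 5
        display[stem].append(leaf)
    return dict(display)


display = stem_and_leaf(scores)
for stem in sorted(display):
    print(stem, "|", " ".join(str(leaf) for leaf in display[stem]))
```

For three- or four-digit data, such as monthly rents, the same idea applies with a different split, for example dividing by 100 so that the leaf holds the last two digits.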


Example 12
The following are the ages of 27 patients who had
their first heart attack.

65 40 63 67 75 79 85 45 90
60 55 67 86 55 49 78 76 54
67 98 56 45 50 85 67 72 83

Construct a stem-and-leaf plot.

Example 12 (Solution)

Example 13
The following data give the monthly rents paid by a
sample of 27 households selected from a small city.

880 1081 721 1075 1023 775 1235 750 965
960 1210 985 1231 932 850 825 1000 915
1191 1035 1151 1180 1175 952 1100 1140 860

Construct a stem-and-leaf display for these data.
Example 13 (Solution)

Example 14
The following stem-and-leaf display is prepared for the number of hours
that 25 students spent working on computers during the past month.

Stem | Leaf
0 | 6
1 | 1 7 9
2 | 2 6
3 | 2 4 7 8
4 | 1 5 6 9 9
5 | 3 6 8
6 | 2 4 4 5 7
7 |
8 | 5 6

Key: 0|6 means 6
Prepare a new stem-and-leaf display by grouping the stems with class
interval 0 – 2, 3 – 5, 6 – 8.
Example 14 (Solution)

The End
of
Topic 1
