
Chapter 3

The document discusses different techniques for organizing and summarizing qualitative and quantitative data, including frequency tables, bar charts, pie charts, and time series graphs. Graphical techniques and measures of central tendency and dispersion are used to present information about data. Descriptive statistics and graphical techniques can be the final product of statistical analysis or used to better interpret data.

Uploaded by

Nurul Izzati
Copyright
© © All Rights Reserved

CHAPTER 3: ANALYSING DESCRIPTIVE STATISTICS

• Descriptive statistics are methods for organizing and summarizing data.

• For example, tables or graphs are used to organize data, and descriptive values such as the
average score are used to summarize data.

• A descriptive value for a population is called a parameter and a descriptive value for a
sample is called a statistic.

• Descriptive statistics can be used to describe data on a single variable. There are three
major techniques for describing sample data: graphical techniques, measures of central
tendency and measures of dispersion.

GRAPHICAL TECHNIQUES
• Qualitative data: frequency table, bar chart, pie chart.
• Quantitative data: time series plot, scatter plot, histogram.

MEASURES OF CENTRAL TENDENCY
• Mean, mode and median.

MEASURES OF DISPERSION
• Standard deviation, variance and interquartile range.

• Numerical descriptive measures and graphical techniques are used to present information
about the data being studied.
• Graphical techniques alone cannot convey the whole picture of the data. Therefore,
measures of central tendency and measures of dispersion are important for better data
interpretation.
• Descriptive statistics and graphical techniques can also be the final product of a statistical
analysis.

©NOOR MAIZATUL NAZUHA MOHAMAD
3.1 GRAPHICAL TECHNIQUES
3.1.2 Organizing and Graphing Qualitative Data
Once the raw data are obtained, we can tabulate them in an orderly manner in a frequency
table or contingency table, or present them graphically using a pie chart or bar chart.

Frequency Table
• A frequency table for qualitative data lists all categories and the number of elements
that belong to each of the categories.
• Using the data below, let us construct a frequency table:
A, B, D, B, C, C, C, A, A, B, D, D, B, D, C, B, D, A, D

Table 1: Class Frequency Table

Class   Frequency   Tally
A       4           ||||
B       5           |||||
C       4           ||||
D       6           ||||| |

From Table 1, we can conclude that class D has the highest frequency (6), followed by B (5),
while A and C share the same frequency (4).
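
The tallying step can also be done programmatically. The following is a minimal sketch in Python (not part of the original notes), assuming the raw observations are typed in as a list of category labels; the variable names are illustrative only.

```python
from collections import Counter

# Raw qualitative data from the example above.
data = ["A", "B", "D", "B", "C", "C", "C", "A", "A", "B",
        "D", "D", "B", "D", "C", "B", "D", "A", "D"]

# Counter tallies how many times each category occurs.
frequency = Counter(data)

# Print the frequency table in alphabetical order of the classes.
for category in sorted(frequency):
    print(f"{category}: {frequency[category]}")
print("Total:", sum(frequency.values()))
```

Sorting by count instead, with `frequency.most_common()`, lists the most frequent class first.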

Contingency Table

• A contingency table is also known as a cross-classification table. It is often used to
examine the relationship between two categorical variables. For example, Table 2 below
is a contingency table of staff gender against department.

Table 2: Number of Staff for each Department in Company XYZ

Gender    Department
          Marketing   Account   Management
Female    4           6         1
Male      18          2         3
Total     22          8         4

From Table 2, there are 22, 8 and 4 staff in the marketing, account and management
departments respectively. The majority of male staff work in the marketing
department, whereas the majority of female staff work in the account department.
Only 1 female staff member works in the management department.
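
As a rough illustration, the row and column totals of a contingency table can be computed from the cell counts. The sketch below stores Table 2 as a nested dictionary; this layout is an assumption made for the example, not something prescribed by the notes.

```python
# Cell counts from Table 2, stored as {gender: {department: count}}.
table = {
    "Female": {"Marketing": 4, "Account": 6, "Management": 1},
    "Male": {"Marketing": 18, "Account": 2, "Management": 3},
}
departments = ["Marketing", "Account", "Management"]

# Column totals: number of staff in each department.
column_totals = {d: sum(row[d] for row in table.values()) for d in departments}

# Row totals: number of staff of each gender.
row_totals = {gender: sum(row.values()) for gender, row in table.items()}

print("Department totals:", column_totals)
print("Gender totals:", row_totals)
```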

Bar chart
• A graph made of bars that represent the frequencies of respective categories.
• State the title and label both axes appropriately.
• There are several types of bar chart, including:
✓ Vertical bar chart
✓ Horizontal bar chart
✓ Component bar chart
✓ Multiple bar chart
• Vertical bar chart

Figure 1: Vertical Bar Chart (categories A, B, C and D on the horizontal axis)

• Horizontal bar chart
To construct horizontal bar chart, mark the various categories on the vertical axis
and mark the frequencies on the horizontal axis.

Figure 2: Horizontal Bar Chart (frequency axis from 0 to 7)
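
To make the construction concrete, here is a small text-only sketch of a horizontal bar chart for the class frequencies in Table 1. The function name `horizontal_bar_chart` and the use of `#` characters are choices made for this illustration; in practice the chart would be drawn with SPSS or a plotting tool.

```python
# Frequencies from Table 1.
frequencies = {"A": 4, "B": 5, "C": 4, "D": 6}

def horizontal_bar_chart(freqs):
    """Return one text line per category; the bar length equals the frequency."""
    return [f"{category} | {'#' * count} {count}" for category, count in freqs.items()]

# Categories run down the vertical axis, frequencies along the horizontal axis.
for line in horizontal_bar_chart(frequencies):
    print(line)
```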


• Component bar chart
✓ To construct a component bar chart, draw one bar for each category and divide
every bar into components.
✓ The length of each component should tally with the frequency it
represents.

Figure 3: Component Bar Chart (one bar each for Marketing, Account and Management, divided into Female and Male; frequency axis from 0 to 25)

• Multiple bar chart
✓ To construct a multiple bar chart, the bars representing the components of each
category are placed side by side in groups.
✓ The height of each bar represents the frequency of the component.
✓ Useful for making comparisons.

Figure 4: Multiple Bar Chart (Female and Male bars grouped by department: Marketing, Account and Management; frequency axis from 0 to 20)

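Pie Chart

• A pie chart consists of a circle divided into sectors.
• The frequency of each category of the data determines the size of the angle of each
sector:
$$\text{Angle} = \frac{f}{N} \times 360^\circ$$
where f is the frequency of the category and N is the total frequency.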

Table 3: Angle for each sector

Class   Frequency   Angle of Sector
A       4           4/19 × 360° ≈ 76°
B       5           5/19 × 360° ≈ 95°
C       4           4/19 × 360° ≈ 76°
D       6           6/19 × 360° ≈ 114°
Total   19

Figure 5: Pie Chart of the Alphabet (sectors A, B, C and D)
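
The sector angles can be checked with a short calculation. This is a sketch only, using the class frequencies from the example; the variable names are illustrative.

```python
# Class frequencies from the frequency table.
frequencies = {"A": 4, "B": 5, "C": 4, "D": 6}
total = sum(frequencies.values())  # N, the total frequency

# Angle of each sector = f / N x 360 degrees.
angles = {category: f / total * 360 for category, f in frequencies.items()}

for category, angle in angles.items():
    print(f"{category}: {angle:.2f} degrees (rounded: {round(angle)})")
```

The unrounded angles add up to exactly 360°, whereas the rounded values (76, 95, 76 and 114) add up to 361°, so it is safer to draw the sectors from the unrounded angles.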

3.1.3 Organizing and Graphing Quantitative Data

For quantitative data, we will learn about the time series graph, frequency distribution table,
histogram, frequency polygon, stem-and-leaf plot and ogive.

Time Series Graph

• A time series graph represents data that occur over a specific period of time.
• This type of graph is popular because its visual characteristics reveal data trends
clearly and it is easy to create.
• Two data sets can be compared on the same graph (called a compound time series
graph) if two lines are used.
• A line graph is a visual comparison of how two variables shown on the x-axis and
y-axis are related or vary with each other. It shows related information by drawing
a continuous line between all the points on a grid.

Example 1
A doctor wishes to use the following data for a presentation to show the trend of dengue
deaths from 2007 until 2015. Draw a time series graph for the data and summarize
the findings.

Table 4: Number of Dengue Death (2007 – 2015)

Year    Dengue death
2007    98
2008    112
2009    88
2010    134
2011    36
2012    35
2013    92
2014    215
2015    336

Solution:

Figure 6: A graph for Dengue Death (vertical axis: number of dengue deaths, 0 to 400; horizontal axis: year, 2006 to 2016)

The graph shows that the number of dengue deaths fluctuates between 88 and 134 from 2007
until 2010, then drops sharply to 36 in 2011 and stays low (35) in 2012. However, starting
from 2013, it increases dramatically, reaching 336 in 2015.
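
To read the trend from the numbers as well as from the plot, the year-to-year change can be computed. The sketch below uses the figures in Table 4; the dictionary layout and variable names are assumptions made for the example.

```python
# Number of dengue deaths per year, from Table 4.
deaths = {2007: 98, 2008: 112, 2009: 88, 2010: 134, 2011: 36,
          2012: 35, 2013: 92, 2014: 215, 2015: 336}

years = sorted(deaths)

# Change from the previous year: positive means a rise, negative a fall.
changes = {year: deaths[year] - deaths[prev] for prev, year in zip(years, years[1:])}

for year, change in changes.items():
    print(f"{year}: {change:+d}")

lowest_year = min(deaths, key=deaths.get)
highest_year = max(deaths, key=deaths.get)
print("Lowest:", lowest_year, "Highest:", highest_year)
```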

Frequency Distribution Table

Table 5 shows an example of quantitative data, where the variable involved is
the examination score.

Table 5: Scores in Statistics Test for 45 Students

20 16 15 26 24 20 15 19 35
16 30 43 14 7 21 24 10 44
23 13 40 11 20 6 37 9 38
24 44 14 17 23 27 32 20 45
18 8 30 23 37 19 10 24 17

When working with large quantitative data sets, it is often helpful to organize and
summarize the data by constructing a table called a frequency distribution.

Table 6: Frequency Distribution of Statistics Scores

Scores     No of Students, f
6 - 12     7
13 - 19    12
20 - 26    13
27 - 33    4
34 - 40    5
41 - 47    4

• Steps to construct the frequency distribution are as follows:
1. Decide the number of classes by using Sturges' formula, $c = 1 + 3.3 \log_{10} n$.
2. Identify the class width or size of the class, $i = \frac{\text{Range}}{c}$.
3. Decide the starting point of the first class limit. Usually choose the smallest number.
4. Identify the class limits (lower limit and upper limit) of each class.
5. Determine the frequency of each class by using the counting or tally method.
• From Table 5, we decide the number of classes by using Sturges' formula, where c
is the number of classes and n is the number of observations in the data set.
$$c = 1 + 3.3 \log_{10} 45 = 6.456 \approx 6$$
So, the number of classes is 6.
• The smallest and largest scores are 6 and 45, so the range is 45 − 6 = 39 and the class
width is $i = 39 / 6 = 6.5$, rounded up to 7. Starting from 6, the class limits are
6 - 12, 13 - 19 and so on.
• Then, from the class limits, we can identify the lower and upper class boundaries of each class.
$$\text{Lower class boundary of the first class} = \text{lower limit of the first class} - \frac{1}{2}$$
$$\text{Upper class boundary} = \frac{\text{upper limit of the class} + \text{lower limit of the next class}}{2}$$
The upper boundary of one class is the lower boundary of the next class. For the last
class, the upper boundary is its upper limit plus 1/2.

Scores (Class limit)   Class Boundary   No of Students, f
6 - 12                 5.5 – 12.5       7
13 - 19                12.5 – 19.5      12
20 - 26                19.5 – 26.5      13
27 - 33                26.5 – 33.5      4
34 - 40                33.5 – 40.5      5
41 - 47                40.5 – 47.5      4

• Hence, we can find the class width or size of the class by subtracting the lower class
boundary from the upper class boundary, for example 12.5 − 5.5 = 7.
$$\text{Class width} = \text{upper class boundary} - \text{lower class boundary}$$

• The class midpoint is the value in the middle of the class.
$$\text{Class midpoint} = \frac{\text{lower class limit} + \text{upper class limit}}{2}$$

Scores (Class limit)   Class Boundary   Class Midpoint   No of Students, f
6 - 12                 5.5 – 12.5       9                7
13 - 19                12.5 – 19.5      16               12
20 - 26                19.5 – 26.5      23               13
27 - 33                26.5 – 33.5      30               4
34 - 40                33.5 – 40.5      37               5
41 - 47                40.5 – 47.5      44               4
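
The construction steps can be sketched in Python as a cross-check. This is an illustration only: the scores are copied from Table 5, the logarithm is taken to base 10, the class width is rounded up to a whole number (an assumption, chosen because it gives the width of 7 used in the tables), and each class frequency is counted directly from the raw scores.

```python
import math

# The 45 test scores from Table 5.
scores = [20, 16, 15, 26, 24, 20, 15, 19, 35,
          16, 30, 43, 14, 7, 21, 24, 10, 44,
          23, 13, 40, 11, 20, 6, 37, 9, 38,
          24, 44, 14, 17, 23, 27, 32, 20, 45,
          18, 8, 30, 23, 37, 19, 10, 24, 17]

n = len(scores)

# Step 1: number of classes from Sturges' formula (about 6.46, rounded to 6).
c = round(1 + 3.3 * math.log10(n))

# Step 2: class width = range / c, rounded up (assumption) so every score is covered.
width = math.ceil((max(scores) - min(scores)) / c)

# Steps 3 to 5: start at the smallest score, build the class limits and count.
start = min(scores)
classes = []
for k in range(c):
    lower = start + k * width
    upper = lower + width - 1
    frequency = sum(lower <= s <= upper for s in scores)
    # (lower limit, upper limit, lower boundary, upper boundary, midpoint, frequency)
    classes.append((lower, upper, lower - 0.5, upper + 0.5, (lower + upper) / 2, frequency))

for lower, upper, lb, ub, mid, f in classes:
    print(f"{lower:2d} - {upper:2d}   {lb} - {ub}   midpoint {mid}   f = {f}")
```

Counting in code is a useful check on a hand tally, since the class frequencies must add up to the number of observations.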

Example 2

Researchers collect data from employees by asking the approximate travelling
distance (in miles) from their home to the office. The raw data are as follows:

1 2 6 7 12 13 2 6 9 5
18 7 3 15 15 4 17 1 14 5
4 16 4 5 8 6 5 20 5 2
9 11 12 1 9 2 10 11 4 10
9 19 8 8 4 14 7 3 2 6

Based on this data, construct a frequency distribution table.

Stem and Leaf Plot

• A stem and leaf plot is a method for visualizing the frequency with which certain
classes of values occur.
• In a stem and leaf plot, each value is separated into two parts: the stem
(on the left-hand side) and the leaf (on the right-hand side).
• The plot can help us to understand the shape of the distribution of the data: symmetric,
skewed to the left or skewed to the right.
✓ Symmetric – if a line is drawn down the middle of the graph, the two sides
will mirror each other.
✓ Skewed to the left – asymmetric (unbalanced). The tail extends towards the
smaller values; most of the data lie at the larger values.
✓ Skewed to the right – asymmetric (unbalanced). The tail extends towards the
larger values; most of the data lie at the smaller values.
• In constructing a stem and leaf plot, a key must be included to explain the meaning
of the entries.

Example 3

The results of 22 students for a quiz of 30 multiple choice questions are recorded as follows.
Display the data with a stem and leaf plot.

12 24 15 19 14 10 31 28 16 13 7
14 16 39 8 26 27 16 9 16 8 23

Solution:

Identify the stem and leaf (split each value into two parts). From the data, the minimum
value is 7 and the maximum value is 39, so each value has at most two digits. The first
digit (0, 1, 2 or 3) is used as the stem, while the leaf is the second digit; a single-digit
value such as 7 has stem 0. Draw a vertical line to separate stem and leaf, with the stem
on the left-hand side and the leaf on the right-hand side.

Stem | Leaf
0    | 7 8 8 9
1    | 0 2 3 4 4 5 6 6 6 6 9
2    | 3 4 6 7 8
3    | 1 9
Key: 1 | 2 means 12
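
A stem and leaf plot can also be generated directly from the raw data, which helps to check that every value has been placed. The sketch below uses the 22 quiz results from Example 3 and assumes the tens digit is the stem and the units digit is the leaf; the variable names are illustrative.

```python
from collections import defaultdict

# Quiz results of the 22 students in Example 3.
scores = [12, 24, 15, 19, 14, 10, 31, 28, 16, 13, 7,
          14, 16, 39, 8, 26, 27, 16, 9, 16, 8, 23]

# Split each value into stem (tens digit) and leaf (units digit).
stems = defaultdict(list)
for score in sorted(scores):
    stem, leaf = divmod(score, 10)
    stems[stem].append(leaf)

# Print the plot with a key, one row per stem.
print("Key: 1 | 2 means 12")
for stem in range(min(stems), max(stems) + 1):
    leaves = " ".join(str(leaf) for leaf in stems[stem])
    print(f"{stem} | {leaves}")
```

The number of leaves across all rows should equal the number of observations, here 22.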

Histogram

• A histogram is a graph that displays the data using contiguous vertical bars of various
heights to represent the frequencies of the classes.
• The bars in a histogram are drawn adjacent to each other without leaving any gap
between them.
• The steps for constructing a histogram are as follows:
1. Draw and label the x-axis and y-axis. The x-axis is always the horizontal axis
and the y-axis is always the vertical axis.
2. Represent the class boundaries on the x-axis and the frequency on the y-axis.
3. Using the frequencies as the heights, draw vertical bars for each class.

Example 4

Table 7 shows a summary of data collected in a research study.

Table 7: The frequency distribution of the data

Class Limit   Class boundary   Midpoint, x   Frequency   Relative Frequency, %
1-5           0.5-5.5          3             3           15
6-10          5.5-10.5         8             7           35
11-15         10.5-15.5        13            4           20
16-20         15.5-20.5        18            3           15
21-25         20.5-25.5        23            1           5
26-30         25.5-30.5        28            2           10
Total                                        20          100
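
The midpoint and relative frequency columns of Table 7 follow directly from the class limits and frequencies. A minimal sketch, assuming the relative frequency is the class frequency divided by the total frequency and expressed as a percentage:

```python
# Class limits and frequencies from Table 7.
class_limits = [(1, 5), (6, 10), (11, 15), (16, 20), (21, 25), (26, 30)]
frequencies = [3, 7, 4, 3, 1, 2]

total = sum(frequencies)

# Midpoint = (lower class limit + upper class limit) / 2.
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Relative frequency in percent = 100 x f / total.
relative = [100 * f / total for f in frequencies]

for limits, x, f, r in zip(class_limits, midpoints, frequencies, relative):
    print(f"{limits[0]}-{limits[1]}  midpoint {x}  f = {f}  relative = {r}%")
```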

• By using the data from Table 7, the histogram can be drawn.

Figure 7: Frequency histogram using boundary as an x-axis (x-axis: Class Boundary, 0.5-5.5 to 25.5-30.5; y-axis: Frequency, 0 to 8)

Frequency Polygon
• An alternative way to present a frequency distribution is the frequency polygon.
• This is a graph that displays the data by plotting frequencies against the class
midpoints, or equivalently by joining the midpoints at the top of each histogram bar.
• Frequency polygons are normally used to compare the distributions of two different sets
of data.

Figure 8: Frequency polygon using midpoints as an x-axis (x-axis: Midpoint, 0 to 30; y-axis: Frequency)

Cumulative Frequency Curve (Ogive)
• Another suitable way of presenting data is the cumulative frequency curve.
• Cumulative frequency is the sum of the frequencies accumulated up to the upper
boundary of the class.
• Before drawing a cumulative frequency curve, a cumulative frequency table which
includes a column of upper boundaries must be constructed first.
• The steps for constructing a cumulative frequency curve (ogive) are as follows:
1. Find the cumulative frequency for each class.
2. Draw the x and y axes. Label the x-axis with the class boundaries. Use an
appropriate scale for the y-axis to represent the cumulative frequencies.
3. Plot the cumulative frequency at each upper class boundary.

Table 7: The frequency distribution of the data

Class Limit   Class boundary   Midpoint, x   Frequency   Cumulative Frequency
1-5           0.5-5.5          3             3           3
6-10          5.5-10.5         8             7           10
11-15         10.5-15.5        13            4           14
16-20         15.5-20.5        18            3           17
21-25         20.5-25.5        23            1           18
26-30         25.5-30.5        28            2           20
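
The cumulative frequency column is a running total of the frequencies. The sketch below computes it and lists the points that would be plotted for the ogive; starting the curve at zero at the first lower boundary (0.5) is a common convention assumed here, not something stated in the notes.

```python
from itertools import accumulate

# Upper class boundaries and frequencies from Table 7.
upper_boundaries = [5.5, 10.5, 15.5, 20.5, 25.5, 30.5]
frequencies = [3, 7, 4, 3, 1, 2]

# Running total of the frequencies.
cumulative = list(accumulate(frequencies))

# Points (upper boundary, cumulative frequency) for the ogive.
# Assumption: the curve starts at 0 at the lower boundary of the first class.
points = [(0.5, 0)] + list(zip(upper_boundaries, cumulative))

for boundary, cf in points:
    print(f"{boundary}: {cf}")
```

The last cumulative frequency always equals the total number of observations, which gives a quick check of the table.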

Figure 9: Ogive using upper boundary as an x-axis (x-axis: Upper Boundary, 0 to 35; y-axis: Cumulative Frequency, 0 to 25)

3.2 MEAN, MODE, MEDIAN AND STANDARD DEVIATION
• A measure of central tendency is a single value that attempts to describe a set of data
by identifying the central position within that set of data. As such, measures of
central tendency are sometimes called measures of central location.
• The mean is most likely the measure of central tendency that you are most familiar
with, but there are others such as the median and the mode.
• The mean, median and mode are all valid measures of central tendency, but under
different conditions some measures of central tendency become more appropriate
to use than others.
• What are the mean, median and mode in statistics?
✓ Mean is the average of a data set.
✓ Mode is the most common value in a data set.
✓ Median is the middle value of the ordered data set.
• To compute the mean, add all the values and divide by the sample size.
• The mode is the value that occurs most often in the data set.
• To find the median, arrange the data in ascending or descending order; the middle value
is the median. If the number of values is even, take the average of the two middle values.
• Use the mean to describe the sample with a single value that represents the center
of the data. Many statistical analyses use the mean as a standard measure of the
center of the distribution of the data.
• When you have unusual values, you can compare the mean and the median to
decide which is the better measure to use. If your data are symmetric, the mean and
median are approximately the same.
• The standard deviation is the most common measure of dispersion.
• The standard deviation measures how far, on average, the data points lie from the
mean.
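
These measures can be computed with Python's standard `statistics` module. The sketch below reuses the 22 quiz results from Example 3 as sample data (a choice made for this illustration); `variance` and `stdev` are the sample versions, which divide by n − 1.

```python
import statistics

# Quiz results of the 22 students in Example 3.
scores = [12, 24, 15, 19, 14, 10, 31, 28, 16, 13, 7,
          14, 16, 39, 8, 26, 27, 16, 9, 16, 8, 23]

mean = statistics.mean(scores)          # sum of the values divided by n
median = statistics.median(scores)      # middle value of the ordered data
mode = statistics.mode(scores)          # most frequent value
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # square root of the sample variance

print(f"Mean: {mean:.2f}")
print(f"Median: {median}")
print(f"Mode: {mode}")
print(f"Variance: {variance:.2f}")
print(f"Standard deviation: {std_dev:.2f}")
```

Here the mean (about 17.77) is larger than the median (16), which suggests that a few high scores are pulling the mean upwards.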

Table 8: Best Central Tendency Measure

MEASUREMENT SCALE                   BEST CENTRAL TENDENCY MEASURE
NOMINAL                             MODE
ORDINAL                             MODE, MEDIAN
INTERVAL & RATIO (SYMMETRIC)        MEAN
INTERVAL & RATIO (NOT SYMMETRIC)    MEDIAN

3.3 Data Presentation Techniques using SPSS


SPSS can display any of the charts described above. There are two options:
using the Descriptive Statistics menu or the Graphs menu.
Option 1
Step 1: Select Analyze menu, Select Descriptive Statistics, Click on Frequencies, Select
the appropriate variable.

Step 2: Click on the Arrow button to move the variable into the variable box.

Step 3: Click on Charts, then Select the appropriate Chart Type.

Step 4: Click on Continue then Click OK.

Option 2
Step 1: Select Graphs menu, Select Legacy Dialogs, then Choose the appropriate chart.

3.4 Numerical Techniques using SPSS


By using SPSS, all the numerical values (mean, mode, median, standard deviation,
variance) that we want to know can be displayed directly.
Frequencies
Step 1: Select Analyze menu, Select Descriptive Statistics, Click on Frequencies, Select
the appropriate variable.
Step 2: Click on the Arrow button to move the variable into the variable box.
Step 3: Click on Statistics, then Select appropriate statistics, Click on Continue

Descriptives
Step 1: Select Analyze menu, Select Descriptive Statistics, Click on Descriptives, Select
the appropriate variable.
Step 2: Click on the Arrow button to move the variable into the variable box.
Step 3: Click on Options, Select the appropriate statistics
Step 4: Click on Continue, then Click on OK.

3.5 Interpreting the Output

Figure 10
From Figure 10, the mean scores for the English and Mathematics subjects are 41.28 and 40.90
respectively. The median values obtained are 51 and 40 respectively. The standard
deviations for the two subjects are 24.074 and 17.522 respectively. The minimum score for the
English subject is 2 and the maximum is 75. The minimum and maximum scores for the
Mathematics subject are 16 and 86. The skewness coefficient for the English subject is -0.547,
which indicates that the distribution of students' marks for English is skewed to
the left. The skewness coefficient for the Mathematics subject is 0.912, which indicates that
the distribution of students' marks for Mathematics is skewed to the right.
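
The sign rule used in this interpretation can be sketched in code. The function below computes the adjusted Fisher-Pearson sample skewness coefficient; treating this as the formula behind the reported skewness values is an assumption, and the function names and the small data sets are hypothetical, chosen only to illustrate the sign.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness coefficient (needs n >= 3)."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (divides by n - 1).
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)

def describe_skew(coefficient):
    """Translate the sign of a skewness coefficient into words."""
    if coefficient < 0:
        return "skewed to the left"
    if coefficient > 0:
        return "skewed to the right"
    return "symmetric"

# Coefficients quoted in the interpretation above.
print("English:", describe_skew(-0.547))
print("Mathematics:", describe_skew(0.912))

# Hypothetical data with one large value, giving a long right tail.
print(describe_skew(sample_skewness([1, 1, 1, 2, 10])))
```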

EXERCISE

The table below shows the marks obtained by three different classes in a statistics test.

Class A        Class B        Class C
62  69  56     92  32  50     65  58  55
71  82  74     72  37  74     63  71  71
59  77  69     45  59  82     64  65  59
62  75  68     87  98  46     60  50  67
77  70  82     57  64  42     65  69  64
61  72  78     58  88  42     65  69  62
70  80  86     60  61  72     52  62  52
80  67  84     31  43  71     69  56  67
64  77  67     60  61  72     64  62  63
88  64  76     45  59  82     67  54  65
67  87  62     77  52  30     61  70  72

Enter the data into the SPSS Data Editor and find the following descriptive values for each
of the classes.
a) Mean
b) Median
c) Mode
d) Standard deviation
e) Variance
f) Range
g) Skewness
