Descriptive Statistics, Tables and Graphs 20

The document discusses various descriptive statistics concepts including descriptive versus inferential statistics, measures of central tendency such as mean, median and mode, and measures of variability such as range and standard deviation. It also covers different types of graphs, tables, and diagrams that can be used to present numeric data, including bar charts, histograms, pie charts, and scatter plots. Guidelines are provided on how to properly construct and label these various data visualizations.

Uploaded by

Liaqat Bhatti

Descriptive Statistics, Tables and Graphs


Dr Rubeena Zakar

Outline
• Concept of Statistics

• Descriptive vs. inferential statistics

• Graphs, Charts, Diagrams, Tables

Statistics
• Statistics is a discipline which is
concerned with:
– designing experiments and data collection,
– processing and summarizing information
to aid understanding,
– drawing conclusions from data, and
– estimating the present or predicting the future

Descriptive versus
inferential
statistics
• Descriptive statistics are used to
summarize and describe patterns
through the analysis of numeric data

• Inferential statistics are used to draw
conclusions about some unknown aspect
of a population and make predictions
based on the analysis of sample data

Applications
• Statisticians may apply their knowledge
of statistical methods to a variety of
subject areas, such as biology,
economics, engineering, medicine, public
health, psychology, marketing, education,
and sports.

Variable

• Data:
– Refers to observations made on individuals.
• Primary data:
– Collected and recorded systematically by the
investigator himself/herself for some defined
purposes.
• Secondary data:
– Collected by somebody else or for other
purposes. E.g. Information derived from
hospital records
• Raw data:
– Collected data before any cleaning, editing, and
statistical manipulations

Common examples
• Mean, median, mode, range, and
standard deviation are some of the
main descriptive statistics.

• A common method used in inferential
statistics is estimation. In estimation, the
sample is used to estimate a parameter, and
a confidence interval about the estimate is
constructed. Other examples of inferential
statistics methods include hypothesis testing
and regression.

Small test about measures of central
tendency
• What are the mean, median and mode of
the following series?
4, 6, 5, 3, 3
• What is the mode of each of the following
distributions? What do you call them on the
basis of the number of modes?
2,4,3,5,4
and
2,4,3,5,4,3

Mean

Median
Order data from smallest to largest!
If there is an odd number of data points, the median
is the middle value
Data: 4 5 6 3 3
Ordered Data: 3 3 4 5 6
Median: 4

If there is an even number of data points, the median
is the mean of the two middle values
Data: 4 5 6 5 3 3
Ordered Data: 3 3 4 5 5 6
Median: (4 + 5) / 2 = 4.5
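
The median rule above can be sketched in a few lines of Python. This is only an illustration: the function name `median_by_rule` is hypothetical, and the two data sets are the ones shown on the slide.

```python
from statistics import mean, median

def median_by_rule(values):
    """Median via the slide's rule: sort, then take the middle value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of data points: the single middle value.
        return ordered[mid]
    # Even number of data points: mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_data = [4, 5, 6, 3, 3]
even_data = [4, 5, 6, 5, 3, 3]

odd_median = median_by_rule(odd_data)
even_median = median_by_rule(even_data)
odd_mean = mean(odd_data)  # sum of values divided by their count

print(odd_median, even_median, odd_mean)
```

The standard library's `statistics.median` applies the same rule, so it can be used to cross-check the hand-written version.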

Mode
Data: 4 7 6 5 3 3
Mode: 3
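
As a rough check of the small test on central tendency, the modes of the three series can be found with Python's `statistics` module. The helper `describe` and its labels are assumptions added for illustration; the data come from the slides.

```python
from statistics import mode, multimode

series = [4, 6, 5, 3, 3]
first = [2, 4, 3, 5, 4]
second = [2, 4, 3, 5, 4, 3]

series_mode = mode(series)        # most frequent value in the series
first_modes = multimode(first)    # list of all values tied for most frequent
second_modes = multimode(second)

def describe(modes):
    """Name a distribution by how many modes it has (hypothetical helper)."""
    names = {1: "unimodal", 2: "bimodal"}
    return names.get(len(modes), "multimodal")

print(series_mode)
print(first_modes, describe(first_modes))
print(second_modes, describe(second_modes))
```

A distribution with one mode is called unimodal and one with two modes is called bimodal, which is the distinction the test question asks for.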

Standard deviation (SD)

• Represents the typical distance
(deviation) of observations from the mean

• A small SD represents a data set where
values are very close to the mean; i.e. a
smaller spread

• A large SD represents a data set where
values vary more around the mean; i.e. a
larger spread

Small test about measures
of central tendency
• The overall mean score based on 3
tests for students A and B is 70 out of
100.

• Which statement is true?
(i) A is better than B (A > B)
(ii) B > A
(iii) A = B
(iv) Not possible to judge

Detailed scores of three tests:
Student “A”: 50, 70, 90; Mean = 70, SD = 20, σ² = 400
Student “B”: 60, 70, 80; Mean = 70, SD = 10, σ² = 100

Why is “B” better?
Because the variation from the mean is smaller for “B” than
for “A”, i.e., “B” is more consistent than “A”.
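
The figures for the two students can be reproduced with a short sketch. It assumes the sample formula (dividing by n − 1), which is the one that yields SD = 20 and SD = 10 for these scores; the function name `sample_sd` is hypothetical.

```python
from math import sqrt
from statistics import stdev

def sample_sd(scores):
    """Sample standard deviation: sqrt(sum of squared deviations / (n - 1))."""
    n = len(scores)
    m = sum(scores) / n
    squared_deviations = sum((x - m) ** 2 for x in scores)
    return sqrt(squared_deviations / (n - 1))

student_a = [50, 70, 90]
student_b = [60, 70, 80]

sd_a = sample_sd(student_a)
sd_b = sample_sd(student_b)
var_a = sd_a ** 2  # variance is the square of the SD
var_b = sd_b ** 2

print(sd_a, var_a)
print(sd_b, var_b)
```

Both students have the same mean, so only the SD separates them: the smaller value for B reflects more consistent scores. `statistics.stdev` uses the same n − 1 formula and can serve as a cross-check.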

Who is better: A or B?

Standard Deviation

Application of Median and
Mode
• Median is preferred for data with extreme
values (outliers)

• Mode can be used for all types of data,
but is most useful for categorical data or
discrete data with only a small number of
possible values
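
To see why the median suits data with outliers, the following sketch compares the mean and median before and after one extreme value is added. The numbers are made up for illustration and do not come from the slides.

```python
from statistics import mean, median

# Hypothetical values, e.g. ages or waiting times.
typical = [30, 32, 35, 38, 40]
with_outlier = typical + [400]  # one extreme value added

mean_before = mean(typical)
median_before = median(typical)
mean_after = mean(with_outlier)
median_after = median(with_outlier)

print(mean_before, median_before)
print(round(mean_after, 2), median_after)
```

In this example the single outlier pulls the mean from 35 to almost 96, while the median moves only from 35 to 36.5.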

Range

Interquartile range (IQR)
• IQR = 3rd quartile – 1st quartile = Q3 – Q1
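
A minimal sketch of the range and the IQR, using made-up data. Note that several conventions exist for locating quartiles; this example relies on the default ("exclusive") method of Python's `statistics.quantiles`, so other software may report slightly different Q1 and Q3 for the same data.

```python
from statistics import quantiles

# Hypothetical data set, already ordered for readability.
data = [2, 4, 4, 5, 6, 7, 8, 9, 10, 12, 15]

data_range = max(data) - min(data)  # range = largest value - smallest value

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

print(data_range, q1, q2, q3, iqr)
```

The middle cut point `q2` is the median, and `q1`, `q2`, `q3` together with the minimum and maximum are the values a boxplot draws.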

Boxplot
(Box-and-Whisker
diagram)

Standard deviation and normal
distribution

Shape of distribution:
Skewness

Normal Distribution
• All values are symmetrically distributed
around the mean (mean=median=mode)

• Characteristic “bell-shaped” curve

• Assumed by many quality control statistics

• Many test statistics are based on
normality assumptions
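
The link between the standard deviation and the normal distribution can be illustrated with `statistics.NormalDist`. The sketch below computes the share of a normal distribution lying within 1, 2 and 3 SDs of the mean; the helper name `share_within` is an assumption for this example.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)  # standard normal: mean 0, SD 1

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
```

The results are roughly 68%, 95% and 99.7%, which is why the SD is a natural yardstick for normally distributed data.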

Choosing appropriate
measures
• For symmetric distributions (with no
outliers), better to use the mean and
standard deviation;
• For skewed distributions, better to use the
median and interquartile range
• For nominal variables use
frequencies and percentage
• For continuous variables use the mean
(or the median if the distribution is skewed)
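
For a nominal variable, frequencies and percentages can be tabulated as in the sketch below. The blood-group observations are hypothetical and only serve as an example.

```python
from collections import Counter

# Hypothetical observations of a nominal variable.
blood_groups = ["A", "B", "O", "O", "AB", "A", "O", "B", "O", "A"]

counts = Counter(blood_groups)  # frequency of each category
total = sum(counts.values())
percentages = {group: 100 * count / total for group, count in counts.items()}

for group, count in counts.most_common():
    print(f"{group:<3}{count:>3}{percentages[group]:>7.1f}%")
```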

How to display/present data?

General rules of Graphs and Tables
• Self-explanatory
• Simple
• Title should include
– what, who, where, when
• Define abbreviations and symbols
• Note data exclusions
• Reference the source

Power of graphs
• Why use graphs?
– Give the reader a compact and structured synthesis
– Show many details in a small area
– Give an immediate depiction of the
differences and patterns in a set of data
– Let the reader see major similarities and
differences immediately, without having to
compare and interpret figures

Graphs & Diagrams
• Line
• Bar
• Histogram
• Pie
• Scatter

Line graph
• Line graphs show the progression of
values over time
• Easier for the eye to follow curves for
different series
• Easier to get a clearer picture of the
development over time
• Good for answering the following questions:
– In what periods were the changes large?
– When were the turning points?

Scale Line Graph: Rules
• Represents frequency distributions over time
• Use for time series data
• Y-axis represents frequency
• Start the Y-axis with zero
• X-axis represents time
• Y-axis should be shorter than X-axis
• Determine the range of values needed
• Select an interval size
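
The last rules (start the Y-axis at zero, determine the range of values needed, select an interval size) can be sketched as a small tick calculation. The function `y_axis_ticks` and the monthly case counts are hypothetical, and the interval is assumed to be a positive whole number.

```python
from math import ceil

def y_axis_ticks(values, interval):
    """Ticks from zero up to the first multiple of `interval` at or above the maximum."""
    top = ceil(max(values) / interval) * interval
    return list(range(0, top + interval, interval))

# Hypothetical time series: number of cases per month.
cases_per_month = [12, 18, 25, 31, 47, 38]

ticks = y_axis_ticks(cases_per_month, interval=10)
print(ticks)
```

With a maximum of 47 and an interval of 10, the axis runs from 0 to 50, which covers every value while keeping round tick labels.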

Example: Scale Line
Graph

Bar Chart
• Bar graphs compare the values of different
items in specific categories or at discrete
points in time
• Vertical or horizontal
• Simple to create and easy to interpret
• Used to illustrate variable values which
are distinct (i.e. qualitative variable)
• Y-axis represents frequency
• X-axis may represent time or different classes

Bar chart: Rules
• Order
– Natural
– Decreasing or increasing
• Same width of bars
• Length of the bar = frequency
• Space between bars and groups, but not within groups
• Use different colours to show different sub-groups
• Include a legend that interprets different colours
• Show no more than 3 sub-bars within a group of bars
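
Two of these rules, ordering bars by decreasing frequency and making bar length equal to frequency, can be illustrated with a plain-text sketch. The disease counts and the helper `text_bar` are made up for this example.

```python
# Hypothetical frequencies for three categories.
counts = {"Malaria": 12, "Diarrhoea": 20, "Measles": 7}

# Order the categories by decreasing frequency.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

def text_bar(label, frequency):
    """One '#' per unit, so the bar length equals the frequency."""
    return f"{label:<12}{'#' * frequency} {frequency}"

bars = [text_bar(label, frequency) for label, frequency in ordered]
print("\n".join(bars))
```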

Bar chart

Clustered Bar chart
• Bars can be presented as clusters of
subgroups in clustered bar charts.

• These are useful to compare values
across categories.

• They are sometimes called grouped bar charts.

Clustered Bar Chart

Horizontal Bar chart

Stacked bar chart
(Total values of categories are easily visible)

100% Stacked Bar Graph

• 100% stacked bar charts are used when
you have three or more data series and
want to compare distributions within
categories, and at the same time display
the differences between categories.

• Each bar represents 100% of the
amounts for that category.
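
Before drawing a 100% stacked bar chart, each category's amounts are converted to shares of that category's total. The sketch below shows that step; the regions, age groups and counts are hypothetical.

```python
# Hypothetical counts: three data series (age groups) within two categories.
series_by_category = {
    "Region A": {"0-4 years": 30, "5-14 years": 50, "15+ years": 20},
    "Region B": {"0-4 years": 10, "5-14 years": 15, "15+ years": 25},
}

def to_percent_shares(series):
    """Rescale one category's amounts so that they sum to 100%."""
    total = sum(series.values())
    return {name: 100 * value / total for name, value in series.items()}

shares = {
    category: to_percent_shares(series)
    for category, series in series_by_category.items()
}

for category, parts in shares.items():
    print(category, parts)
```

Because every bar is rescaled to 100%, the chart compares distributions within categories but no longer shows the categories' absolute totals.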

100% Stacked Bar Chart

Histogram
• A representation of a frequency
distribution by means of rectangles
• Width of bars represents class intervals
and height represents corresponding
frequency
– Area proportional to number
– No space between columns
– One population
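
The frequencies behind a histogram come from counting observations in adjacent class intervals. A minimal sketch, assuming equal-width classes of the form [start, start + width) and made-up ages; the function name `histogram_counts` is hypothetical.

```python
def histogram_counts(values, start, width, n_classes):
    """Count values per class interval [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_classes
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_classes:  # values outside all classes are ignored
            counts[index] += 1
    return counts

# Hypothetical ages in years.
ages = [3, 7, 12, 15, 18, 21, 22, 25, 29, 34, 35, 41]

counts = histogram_counts(ages, start=0, width=10, n_classes=5)
for i, count in enumerate(counts):
    print(f"{i * 10:>2}-{i * 10 + 9:<2} {'#' * count} {count}")
```

With equal class widths, bar height and bar area are both proportional to the frequency; the intervals touch, so there is no space between columns.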

Histogram

Pie Chart
• A circular (360 degree) graphic representation
• Compares subclasses or categories
to the whole class or category using
differently coloured or patterned
segments

Pie Chart
• Suitable for illustrating
percentage distributions of
qualitative variables
• Displays the contribution of each value
to a total
• Best suited for overviews
• Should not have too many sectors –
maximum 5 or 6
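
Each sector of a pie chart takes the same share of the 360 degrees as its category takes of the total. A short sketch with hypothetical transport data; the function name `sector_angles` is an assumption for this example.

```python
def sector_angles(frequencies):
    """Angle in degrees for each category: 360 * frequency / total."""
    total = sum(frequencies.values())
    return {label: 360 * count / total for label, count in frequencies.items()}

# Hypothetical frequencies of a qualitative variable (mode of transport).
transport = {"Walk": 10, "Bus": 20, "Car": 5, "Bicycle": 5}

angles = sector_angles(transport)
print(angles)
```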

Pie Chart

Tables
• A rectangular arrangement of data in which
the data are positioned in rows and
columns.
• Each row and column should be labelled
• Totals should be shown in the last row or in
the right-hand column
• State the units of measurement
• Use a maximum of five variables
• Horizontal lines are acceptable; vertical lines are not

Commonly used tables
• Single variable tables
– Frequency distribution

• Multivariable tables
– Contingency tables
• 2x2 tables
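
A 2x2 contingency table can be built by cross-tabulating two yes/no variables. The sketch below uses hypothetical exposure and illness records, and places the totals in the last row and the right-hand column.

```python
from collections import Counter

# Hypothetical paired observations: (exposure status, outcome).
records = (
    [("exposed", "ill")] * 3
    + [("exposed", "not ill")] * 1
    + [("unexposed", "ill")] * 1
    + [("unexposed", "not ill")] * 3
)

cells = Counter(records)  # frequency of each (row, column) combination
rows = ("exposed", "unexposed")
cols = ("ill", "not ill")

table = [[cells[(r, c)] for c in cols] for r in rows]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

print(f"{'':<10}{cols[0]:>8}{cols[1]:>9}{'Total':>7}")
for name, row, row_total in zip(rows, table, row_totals):
    print(f"{name:<10}{row[0]:>8}{row[1]:>9}{row_total:>7}")
print(f"{'Total':<10}{col_totals[0]:>8}{col_totals[1]:>9}{grand_total:>7}")
```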

Table 1. Percentage distribution of diarrhoea
cases (n=1500) by age group in Bangladesh,
September 2008

Table 2. Gonorrhoea by age-group and sex,
Norway, 2005

Graphs versus Tables
• The main purpose of graphs is to visually
impart information that cannot be easily
read from a data table.

• It would be very difficult to readily 'see' trends
and contrasts in a table having many
data points

In Summary
• Depending on your data, you can choose
from a variety of chart and graph formats,
including pie charts, histograms, tables
etc.

• Using several simpler graphics is more
effective than attempting to combine all
of the information into one figure.
