Stats Lecture-2
Statistics Lecture 2
Dr Sumeyye BAKIM
2024
Outline
• Histograms
• Shapes of Frequency Distributions
• Misleading Graphs
• Frequency Tables and Histograms in Research Articles
Histograms
A graph is another good way to make a large group of scores
easier to understand. A picture may be worth a thousand words,
but it is sometimes also worth a thousand numbers. A simple
approach is to create a graph of the frequency table. A graph of
the information in a frequency table is called a histogram, which
is a type of bar chart. In a histogram, the height of each bar
represents the frequency of each value in the frequency table.
Normally, in a histogram, all the bars are placed side by side
without any gaps in between, giving the graph the appearance of
a city skyline.
Stress Level Example
Histogram
Social Interaction Frequency
Histogram
Histogram for number of social interactions during a week for 94 college students,
based on grouped frequencies. (Data from McLaughlin-Volpe et al., 2001.)
How to Make a Histogram
❶ Make a frequency table (or a grouped frequency table).
❷ Put the values along the bottom of the page, from left to right, from lowest
to highest.
Note: When creating a histogram from a grouped frequency table, the values you place at the
bottom of the page are the midpoints of the intervals. The midpoint of an interval is halfway between
the start of that interval and the start of the next highest interval (for the interval 0-4, where the
next interval starts at 5, the midpoint is 2.5).
❸ Make a scale of frequencies along the left edge of the page that goes from 0 at the bottom to the highest
frequency for any value.
❹ Make a bar above each value with a height matching the frequency of that value.
For each bar, make sure that the middle of the bar is above its value.
When the variable is nominal, the graph is called a bar graph. Since the values of a nominal variable
are not ordered, a gap is left between the bars.
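The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are made up, the interval width of 5 is chosen to match the 0-4 example, and the helper names `grouped_frequencies` and `midpoint` are hypothetical. The bars are printed sideways as text rather than drawn upright.

```python
from collections import Counter

def grouped_frequencies(scores, width):
    """Count scores per interval; intervals start at multiples of `width`."""
    counts = Counter((s // width) * width for s in scores)
    lowest, highest = min(counts), max(counts)
    # Keep empty intervals so that neighbouring bars stay side by side.
    return [(start, counts.get(start, 0))
            for start in range(lowest, highest + width, width)]

def midpoint(start, width):
    """Halfway between this interval's start and the next interval's start."""
    return start + width / 2

# Made-up scores, for illustration only.
scores = [1, 3, 4, 6, 7, 7, 8, 12, 13, 18]
table = grouped_frequencies(scores, width=5)

# One text bar per interval, labelled with the interval midpoint.
for start, freq in table:
    print(f"{midpoint(start, 5):>5.1f} | {'#' * freq}")
```

Each row is labelled with the midpoint of its interval (2.5 for the interval 0-4), and the length of the bar is the frequency of that interval.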
Closest Person Example
Bar Graph
Shapes of Frequency Distributions
A frequency distribution shows the pattern of frequencies across the various values. A frequency table
or histogram describes a frequency distribution because each shows how the frequencies are
spread out or 'distributed.' Psychologists also describe the shape of a distribution in words, so it is
important to be able to describe that shape.
How many peaks does the distribution have? Single? Double? Multiple? None?
FREQUENCY POLYGONS
Unimodal Distribution: A frequency distribution in which one value has a
frequency that is clearly larger than the others.
Rectangular Distribution: A frequency distribution in which all values have about the same frequency.
Bimodal
(a) A bimodal distribution showing
the possible frequencies for people
of different ages in a toddler’s play
area.
Rectangular
(b) A rectangular distribution showing
the possible frequencies of
students at different grade
levels in an elementary school.
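As a rough sketch, the number of peaks can be counted directly from a list of frequencies. The function name `describe_modality` and its `tolerance` parameter are hypothetical, and the frequencies are invented. Note one simplification: the definition above asks for a peak that is clearly larger than the other values, whereas this sketch counts every local high point, however small.

```python
def describe_modality(freqs, tolerance=0):
    """Label frequencies (ordered by value) by their number of peaks."""
    # All bars about equally tall: no peak stands out.
    if max(freqs) - min(freqs) <= tolerance:
        return "rectangular"
    peaks = 0
    rising = True  # a tall first bar counts as a peak
    for prev, cur in zip(freqs, freqs[1:]):
        if cur > prev:
            rising = True
        elif cur < prev:
            if rising:
                peaks += 1  # the bars just turned downward: one peak passed
            rising = False
    if rising:
        peaks += 1  # the list ends on a rise, so the last bar is a peak
    return {1: "unimodal", 2: "bimodal"}.get(peaks, "multimodal")

# Invented frequencies, for illustration only.
print(describe_modality([1, 3, 7, 3, 1]))           # one peak in the middle
print(describe_modality([6, 2, 1, 1, 2, 5, 7, 4]))  # two separate peaks
print(describe_modality([4, 4, 4, 4]))              # all bars the same height
```

With real data, a nonzero `tolerance` would be needed before calling a distribution rectangular, since frequencies are rarely exactly equal.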
Symmetric and Skewed Distributions
Take another look at the histogram of the
stress rating example. The distribution is
lopsided, with more scores near the high
end, which is somewhat unusual.
Most things we measure in science have
about equal numbers of scores on both
sides of the middle. This means that in
science, scores often follow an
approximately symmetric distribution (if
you fold the graph of a symmetric
distribution in half, the two halves look
the same).
A distribution that is not symmetric is called a skewed
distribution. The stress rating distribution is an example of
this. A skewed distribution has a long and stretched side,
resembling a tail. The side with fewer scores (the tail-like
side) is considered the direction of skewness. Thus, the stress
study example, which has very few scores at the lower end, is
skewed to the left.
The social interactions example, which has very few scores
at the upper end, is skewed to the right (see the figure on
the right). The figure below shows examples of approximately
symmetric and skewed distributions.
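The direction of skew can also be checked numerically. The sketch below uses the moment-based skewness coefficient, a standard statistic that is not part of this lecture and is brought in here only as an illustration; the function name `skewness` and the three score lists are made up. A negative value indicates a tail on the left, a positive value a tail on the right, and a value near 0 an approximately symmetric distribution.

```python
from statistics import fmean, pstdev

def skewness(scores):
    """Average cubed z-score: negative = left tail, positive = right tail."""
    mean = fmean(scores)
    sd = pstdev(scores)  # population standard deviation
    return sum(((x - mean) / sd) ** 3 for x in scores) / len(scores)

# Made-up scores, for illustration only.
few_low_scores = [1, 7, 8, 8, 9, 9, 9, 10, 10, 10]   # tail on the left
few_high_scores = [0, 0, 0, 1, 1, 1, 2, 2, 3, 9]     # tail on the right
balanced = [1, 2, 3, 4, 5]                           # mirror image around 3

for name, data in [("few low scores", few_low_scores),
                   ("few high scores", few_high_scores),
                   ("balanced", balanced)]:
    print(f"{name}: skewness = {skewness(data):.2f}")
```

The sign of the result matches the naming rule above: the side with the few, stretched-out scores gives the direction of the skew.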
A distribution skewed to the right due to a floor effect: fictional
distribution of the number of children in families.
Ceiling Effect
A skewed distribution caused by an upper
limit can be seen in the figure on the right. This
distribution represents adults' scores on a
multiplication table test and is strongly skewed
to the left. This illustrates a ceiling effect. A
ceiling effect is also evident in the stress level
example, where ratings cannot exceed the
maximum of 10.
A distribution skewed to the left due to a ceiling
effect: fictional distribution of adults’ scores on
a multiplication table test.
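A small sketch can show how an upper limit produces this pile-up. Everything here is hypothetical: the "true skill" numbers are invented, the maximum of 10 is chosen for illustration, and `apply_ceiling` is just a helper name.

```python
from collections import Counter

def apply_ceiling(scores, maximum):
    """Cap every score at the highest value the test can record."""
    return [min(s, maximum) for s in scores]

# Invented "true" skill levels; the test cannot record more than 10.
true_skill = [6, 8, 9, 10, 11, 11, 12, 12, 13, 14]
observed = apply_ceiling(true_skill, maximum=10)

# Text histogram of the observed scores.
counts = Counter(observed)
for value in sorted(counts):
    print(f"{value:>2} | {'#' * counts[value]}")
```

Most of the observed scores collect at the maximum, and only a few trail off below it, which is the left-skewed shape described above.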
Kurtotic Distribution
Kurtosis measures how different the shape of a distribution is from a normal curve. Is it taller or
flatter than the normal curve? The term "kurtosis" comes from the Greek word "kyrtos," meaning
"curve."
The figure below (b) shows a kurtotic distribution with a more pronounced peak than the normal
curve. Figure (c) illustrates an extreme example of a kurtotic distribution that is very flat. (A
rectangular distribution would be an even more extreme example.)
Distributions that are taller or flatter than a normal curve also tend to have different shapes in
their tails. Distributions with a very tall curve typically have more data points in their tails
compared to the normal curve (see figure b).
In contrast, flatter distributions tend to have fewer data points in their tails than the normal curve
(see figure c).
By comparing a distribution with the normal curve, we can describe how much taller or
flatter it is. The key point here is the number of data points in the tails.
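One common way to put a number on this comparison is excess kurtosis: the average fourth power of the z-scores, minus 3, so that a normal curve scores 0. This formula is a standard statistic and is not taken from the lecture; the function name `excess_kurtosis` and both data sets are invented for illustration.

```python
from statistics import fmean, pstdev

def excess_kurtosis(scores):
    """Average fourth-power z-score minus 3 (a normal curve gives 0)."""
    mean = fmean(scores)
    sd = pstdev(scores)  # population standard deviation
    fourth = sum(((x - mean) / sd) ** 4 for x in scores) / len(scores)
    return fourth - 3

# Invented data: a sharp peak at 0 with two scores far out in the tails.
peaked_heavy_tails = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]
# Invented data: every value from 1 to 10 occurs once (rectangular).
rectangular = list(range(1, 11))

print(excess_kurtosis(peaked_heavy_tails))  # positive: heavier tails
print(excess_kurtosis(rectangular))         # negative: flat, light tails
```

A positive result goes with a more pronounced peak and more scores in the tails than the normal curve, and a negative result goes with a flatter shape and fewer scores in the tails.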
Failure to Use Equal Interval Sizes
A fundamental requirement for a grouped frequency
table or graph is that the size of the intervals must
be equal. If they are not, the table or graph can be
very misleading. The table next to this text gives the
impression that commissions paid to travel agencies
dropped dramatically in 1978.
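A grouped frequency table can be checked for this problem before it is graphed. The sketch below assumes whole-number scores and intervals written as inclusive (low, high) pairs; the function names and the example intervals are hypothetical.

```python
def interval_sizes(intervals):
    """Size of each inclusive whole-number interval, e.g. 0-4 covers 5 values."""
    return [high - low + 1 for low, high in intervals]

def has_equal_intervals(intervals):
    """True when every interval covers the same number of values."""
    return len(set(interval_sizes(intervals))) <= 1

# Hypothetical interval sets, for illustration only.
consistent = [(0, 4), (5, 9), (10, 14)]
inconsistent = [(0, 4), (5, 9), (10, 19)]  # the last interval is twice as wide

print(has_equal_intervals(consistent))    # True
print(has_equal_intervals(inconsistent))  # False
print(interval_sizes(inconsistent))       # [5, 5, 10]
```

If the check fails, the frequencies in the wider or narrower intervals are not comparable with the rest, and the bars will mislead.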
Histograms in Research Articles
Maggi and colleagues (2007) conducted a study on age-related
changes in smoking behavior among Canadian adolescents. As
shown in the figure, they created a histogram from a grouped
frequency table to display their results. Their histogram
represents the results from two samples (illustrated with dark
and light bars).
As can be seen from the figure, less than 10% of those aged 10-11
have tried smoking, while more than half of those aged 16-17
have attempted it. In this example, the researchers drew the
histogram with gaps between the bars, whereas gaps should be
avoided (unless you are drawing a bar graph for a nominal
variable).
Additionally, the differing sample sizes in each age group can lead
to misleading percentages.
Exaggeration of Proportions
The vertical scale of a histogram or bar graph
(or frequency polygon) should start at 0 or the
lowest value on the scale and extend to the
highest value on the scale.
The overall proportion of a histogram or bar graph should be about 1 to 1.5 times as wide as it
is tall, as seen in Figure a for the stress rating example. However, consider what happens if we
make the graph much shorter or taller, as shown in Figures b and c. This change is akin to using
software to stretch a person's photograph: the actual image is distorted. Any shape of a
histogram can be considered correct in some sense. However, a ratio of 1 to 1.5 has been
adopted as a standard for comparison purposes. Altering this ratio misleads the viewer.