Lecture Note 3
Experimental Design
Experiment
An experiment is a controlled study in which the researcher attempts to
understand cause-and-effect relationships.
Does aspirin reduce the risk of heart attacks? Is one brand of fertilizer more
effective at growing roses than another? Is fatigue as dangerous to a driver
as the influence of alcohol? Questions like these are answered using
randomized experiments.
Experiment
The purpose of an experiment is to investigate the relationship between
variables.
When one variable causes change in another, we call the first variable the
explanatory variable. The affected variable is called the response variable.
• The control group consists of participants who do not receive the experimental
treatment being studied. Instead, they get a placebo (a fake treatment; for
example, a sugar pill); a standard, nonexperimental treatment (such as vitamin C,
in the zinc study); or no treatment at all, depending on the situation.
In the end, the responses of those in the treatment group are compared with the
responses from the control group to look for differences that are statistically
significant (unlikely to have occurred just by chance).
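Random assignment is what makes an experiment "randomized". As a minimal sketch (the participant labels, the 50/50 split, and the function name `assign_groups` are illustrative assumptions, not part of the lecture), participants can be shuffled and then split into a treatment group and a control group:

```python
import random

def assign_groups(participants, seed=None):
    """Randomly split participants into a treatment and a control group."""
    rng = random.Random(seed)      # a seed makes the shuffle reproducible
    shuffled = list(participants)  # copy so the input list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant labels P001 ... P020
participants = [f"P{i:03d}" for i in range(1, 21)]
treatment, control = assign_groups(participants, seed=42)
print(len(treatment), len(control))
```

Because every participant has the same chance of landing in either group, differences between the groups at the end of the study are less likely to be explained by how the groups were formed.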
Blind Experiment and Double-Blind Experiment
Blind experiment: A blind experiment is one in which the subjects who are
participating in the study are not aware of whether they’re in the treatment
group or the control group. In the zinc example, the vitamin C tablets and
the zinc tablets would be made to look exactly alike, and patients would not
be told which type of pill they were taking.
Example
Researchers want to investigate whether taking aspirin regularly reduces the
risk of heart attack. Four hundred men between the ages of 50 and 84 are
recruited as participants. The men are divided randomly into two groups: one
group will take aspirin, and the other group will take a placebo. Each man
takes one pill each day for three years, but he does not know whether he is
taking aspirin or the placebo. At the end of the study, researchers count the
number of men in each group who have had heart attacks.
Identify the following values for this study: population, sample, experimental
units, explanatory variable, response variable, treatments, control group and
the treatment group.
Is this experiment blind or double blind? Why?
Answer:
INTRODUCTORY STATISTICS
Chapter 2 Descriptive Statistics
Introduction
Statistical Graph
A statistical graph is a tool that helps you learn about the shape or distribution of a sample
or a population. A graph can be a more effective way of presenting data than a mass of
numbers because we can see where data clusters and where there are only a few data
values. Newspapers and the Internet use graphs to show trends and to enable readers to
compare facts and figures quickly. Statisticians often graph data first to get a picture of the
data. Then, more formal tools may be applied.
Some of the types of graphs that are used to summarize and organize data:
• Stem-and-leaf plot
• Line graph
• Bar graph
• Histogram
• Frequency polygon
• Pie chart
• Box plot
In this chapter, we will briefly look at stem-and-leaf plots, line graphs, and bar graphs, as
well as frequency polygons and time series graphs. Our emphasis will be on histograms and
box plots.
2.1
Stem-and-leaf Graphs, Line
Graphs, and Bar Graphs
Stem-and-leaf Graph
The stem-and-leaf graph is used to display quantitative data. It is a good
choice when the data sets are small. To create the plot, divide each
observation of data into a stem and a leaf. The leaf consists of a final
significant digit. For example, 23 has stem two and leaf three. The number
432 has stem 43 and leaf two. Likewise, the number 5,432 has stem 543 and
leaf two. The decimal 9.3 has stem nine and leaf three. Write the stems in a
vertical line from smallest to largest. Draw a vertical line to the right of the
stems. Then write the leaves in increasing order next to their corresponding
stem.
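The procedure above can be sketched in Python. This is a minimal illustration that assumes non-negative whole numbers, so that the leaf is the last digit and the stem is everything before it; the function names and the sample data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} with leaves in increasing order.

    Assumes non-negative integers: leaf = last digit, stem = the rest.
    """
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)

def format_plot(plot):
    """Write stems from smallest to largest, leaves to the right of a bar."""
    lines = []
    for stem in range(min(plot), max(plot) + 1):
        leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}".rstrip())
    return "\n".join(lines)

data = [23, 25, 31, 38, 38, 47, 52]  # hypothetical small data set
plot = stem_and_leaf(data)
print(format_plot(plot))
```

With this rule, 432 gets stem 43 and leaf 2, and 5,432 gets stem 543 and leaf 2, matching the examples in the text.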
Example
Draw a stem-and-leaf plot. For Susan Dean's spring pre-calculus class, scores for the
first exam were as follows (smallest to largest):
33; 42; 49; 49; 53; 55; 55; 61; 63; 67; 68; 68; 69; 69; 72; 73; 74; 78; 80; 83;
88; 88; 88; 90; 92; 94; 94; 94; 94; 96; 100
Stem | Leaf
   3 | 3
   4 | 2 9 9
   5 | 3 5 5
   6 | 1 3 7 8 8 9 9
   7 | 2 3 4 8
   8 | 0 3 8 8 8
   9 | 0 2 4 4 4 4 6
  10 | 0
Key: 7 | 4 = 74
Side by Side Stem-and-leaf Graph
Example
Mrs. Cameron teaches AP Statistics at GHI High School. She recently wrote
down the class marks for her current grade 12 class and compared them with
those of the previous grade 12 class. The data can be found below. Construct a
two-sided stem-and-leaf plot for the data.
Class 2021:
70, 70, 71, 72, 74, 74, 74, 75, 76, 76, 77, 78, 79, 80, 82, 82, 82, 83, 85, 85, 86, 87, 93, 100
Class 2020:
66, 76, 76, 76, 77, 78, 78, 78, 79, 80, 80, 81, 81, 82, 82, 83, 83, 83, 85, 85, 88, 91, 92, 95
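             Class 2020 | Stem | Class 2021
                      6 |   6  |
        9 8 8 8 7 6 6 6 |   7  | 0 0 1 2 4 4 4 5 6 6 7 8 9
8 5 5 3 3 3 2 2 1 1 0 0 |   8  | 0 2 2 2 3 5 5 6 7
                  5 2 1 |   9  | 3
                        |  10  | 0
Key: 7 | 1 = 71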
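A two-sided plot for the two classes in this example can be sketched as follows. The layout choice (Class 2020 on the left with leaves increasing away from the stem, Class 2021 on the right) and the function names are assumptions made for illustration.

```python
def leaves_by_stem(values):
    """Group the last digits (leaves) of whole numbers by their stems."""
    plot = {}
    for v in sorted(values):
        plot.setdefault(v // 10, []).append(v % 10)
    return plot

def back_to_back(left_values, right_values):
    """Return rows (left leaves, stem, right leaves) for a two-sided plot."""
    left, right = leaves_by_stem(left_values), leaves_by_stem(right_values)
    lowest = min(min(left), min(right))
    highest = max(max(left), max(right))
    rows = []
    for stem in range(lowest, highest + 1):
        # Left-hand leaves are reversed so they grow outward from the stem.
        left_leaves = " ".join(str(d) for d in reversed(left.get(stem, [])))
        right_leaves = " ".join(str(d) for d in right.get(stem, []))
        rows.append((left_leaves, stem, right_leaves))
    return rows

class_2021 = [70, 70, 71, 72, 74, 74, 74, 75, 76, 76, 77, 78, 79, 80,
              82, 82, 82, 83, 85, 85, 86, 87, 93, 100]
class_2020 = [66, 76, 76, 76, 77, 78, 78, 78, 79, 80, 80, 81, 81, 82,
              82, 83, 83, 83, 85, 85, 88, 91, 92, 95]

rows = back_to_back(class_2020, class_2021)
width = max(len(row[0]) for row in rows)
for left_leaves, stem, right_leaves in rows:
    print(f"{left_leaves:>{width}} | {stem:>2} | {right_leaves}")
```

Sharing one column of stems makes it easy to compare where each class's marks cluster.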
Line graph
A line graph is a type of chart with points connected by lines to show how something changes
in value:
• as time goes by,
• or as something else changes.
We also call it a line chart. A line graph has two axes, known as the “x” axis and the “y” axis.
• The horizontal axis is known as the x-axis.
• The vertical axis is known as the y-axis.
Parts of a line graph
Bar graph
(Review from chapter 1) Bar graphs represent each category as a bar. The
bar heights show the category counts or percents.
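As a rough sketch of how counts become bar heights in percent, the snippet below converts category counts to percents and prints a text bar for each category. The categories, the counts, and the scale of one `#` per two percentage points are hypothetical choices for illustration.

```python
def percents(counts):
    """Convert category counts to percents of the total."""
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}

def text_bar_graph(counts, scale=2):
    """One line per category; bar length is the percent divided by scale."""
    lines = []
    for category, pct in percents(counts).items():
        bar = "#" * round(pct / scale)
        lines.append(f"{category:<10} {bar} {pct:.0f}%")
    return lines

# Hypothetical category counts
counts = {"Freshman": 20, "Sophomore": 30, "Junior": 50}
lines = text_bar_graph(counts)
print("\n".join(lines))
```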
Example
The population in Park City is made up of children, working-age adults, and retirees.
The table below shows the three age groups, the number of people in the town from
each age group, and the proportion (%) of people in each age group. Construct a bar
graph showing the proportions.
Table: The population in Park City
Age Groups            Number of People    Proportion of Population
Working-age adults    152,198             43%
Answer:
2.2
Histograms, Frequency
Polygons, and Time Series
Graphs
Histogram
A histogram is a graphical display of data using bars of different heights. In a
histogram, each bar groups numbers into ranges. Taller bars show that more
data falls in that range. A histogram displays the shape and spread of
continuous sample data.
A histogram consists of contiguous (adjoining) boxes. It has both a
horizontal axis and a vertical axis. The horizontal axis is labeled with what
the data represents (for instance, distance from your home to school). The
vertical axis is labeled either frequency or relative frequency (or percent
frequency or probability). The graph will have the same shape with either
label. The histogram can give you the shape of the data, the center, and the
spread of the data.
Recall:
• The relative frequency is equal to the frequency for an observed value of
the data divided by the total number of data values in the sample.
• Frequency is defined as the number of times an answer occurs.
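The two definitions can be illustrated with a short sketch. The data values and the function name `frequency_table` are hypothetical.

```python
from collections import Counter

def frequency_table(values):
    """Map each observed value to (frequency, relative frequency)."""
    counts = Counter(values)  # frequency: number of times each value occurs
    n = len(values)           # total number of data values in the sample
    return {value: (counts[value], counts[value] / n)
            for value in sorted(counts)}

answers = [2, 3, 3, 5, 3, 2, 5, 3]  # hypothetical survey answers
table = frequency_table(answers)
for value, (freq, rel_freq) in table.items():
    print(value, freq, rel_freq)
```

The relative frequencies always add up to 1, which is why a histogram has the same shape whether its vertical axis shows frequency or relative frequency.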
Bar graph versus Histogram
Example
The following data represent the number of employees at 24 restaurants in New
York City. Using this data, create a histogram.
22; 35; 15; 26; 40; 28; 18; 20; 25; 34; 39; 42; 24; 22; 19; 27; 22; 34; 40; 20;
38; 45; 50; and 28
Use 10.5–19.5 as the first interval. Placing the limits of the intervals midway
between two numbers (e.g., 10.5) ensures that every score falls in an interval
rather than on the boundary between intervals.
Interval       Frequency
10.5–19.5      3
19.5–28.5      11
28.5–37.5      3
37.5–46.5      6
46.5–55.5      1
Total          24
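The counting step for this example can be checked with a short sketch. It assumes five intervals of width 9 starting at 10.5, which follows from the first interval given above; the function name `bin_counts` is illustrative.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each of n_bins equal-width intervals."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if not 0 <= index < n_bins:
            raise ValueError(f"{v} falls outside the intervals")
        counts[index] += 1
    return counts

employees = [22, 35, 15, 26, 40, 28, 18, 20, 25, 34, 39, 42,
             24, 22, 19, 27, 22, 34, 40, 20, 38, 45, 50, 28]

counts = bin_counts(employees, start=10.5, width=9, n_bins=5)
for i, count in enumerate(counts):
    lower = 10.5 + 9 * i
    # Each row is one bar of the histogram, drawn sideways.
    print(f"{lower}-{lower + 9}: {'#' * count} ({count})")
```

Because the interval limits end in .5 and the data are whole numbers, no value can land on a boundary.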
Frequency Polygon
Another type of graph that can be drawn to represent the same set of data as a
histogram is a frequency polygon. A frequency polygon is a graph constructed by
using lines to join the midpoints of each interval, or bin. The heights of the
points represent the frequencies. A frequency polygon can be created from the
histogram or by calculating the midpoints of the bins from the frequency
distribution table.
The midpoint of a bin is calculated by adding the upper and lower boundary
values of the bin and dividing the sum by 2.
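The midpoint rule can be sketched as below. The boundaries are those of the restaurant histogram example (10.5 to 55.5 in steps of 9), and the frequencies are the counts obtained from that example's data; the function name `midpoints` is illustrative.

```python
def midpoints(boundaries):
    """Midpoint of each bin: (lower boundary + upper boundary) / 2."""
    return [(lower + upper) / 2
            for lower, upper in zip(boundaries, boundaries[1:])]

boundaries = [10.5, 19.5, 28.5, 37.5, 46.5, 55.5]
frequencies = [3, 11, 3, 6, 1]  # counts per interval for the restaurant data

# Points of the frequency polygon: (midpoint, frequency)
points = list(zip(midpoints(boundaries), frequencies))
print(points)
```

Joining these points with straight lines, in order, gives the frequency polygon.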
Example
Construct a frequency polygon.
Age      Number of hours of watching TV in a week    Midpoint
11-20    13                                          15.5
21-30    7                                           25.5