Quantitative Mathematics Module 2 PDF
QUANTITATIVE MATHEMATICS
MODULE 2
CHARTS AND GRAPHS
I. PATTERNS IN DATA
- Graphic displays are useful for seeing patterns in data. Patterns in data are
commonly described in terms of center, spread, shape, and unusual features.
- Some common distributions have special descriptive labels, such as symmetric,
bell-shaped, skewed, etc.
1. Center
- Graphically, the center of a distribution is located at the median of the distribution. This
is the point in a graphic display where about half of the observations are on either side. In
the chart below, the height of each column indicates the frequency of observations. Here,
the observations are centered over 4.
2. Spread
- The spread of a distribution refers to the variability of the data. If the observations
cover a wide range, the spread is larger. If the observations are clustered around a
single value, the spread is smaller.
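As a rough numerical companion to the two ideas above, the sketch below computes the median as a measure of center and the range as one simple measure of spread. The observations are made up for illustration and are not the data behind the chart in this module.

```python
from statistics import median

# Hypothetical observations (an assumption, not the module's data),
# chosen so that they are centered over 4.
observations = [1, 2, 3, 3, 4, 4, 4, 4, 5, 5, 6, 7]

def center_and_spread(values):
    """Return (median, range) as simple measures of center and spread."""
    return median(values), max(values) - min(values)

center, spread = center_and_spread(observations)
print(f"center (median) = {center}, spread (range) = {spread}")
```

A data set whose values are all identical would report a range of 0, which is the extreme case of observations clustered around a single value.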
3. Shape
- The shape of a distribution is described by the following characteristics.
a. Symmetry. When it is graphed, a symmetric distribution can be divided at the center
so that each half is a mirror image of the other.
b. Number of peaks. Distributions can have few or many peaks. Distributions with one
clear peak are called unimodal, and distributions with two clear peaks are called
bimodal. When a symmetric distribution has a single peak at the center, it is
referred to as bell-shaped.
c. Skewness. When they are displayed graphically, some distributions have many
more observations on one side of the graph than the other. Distributions with fewer
observations on the right (toward higher values) are said to be skewed right; and
distributions with fewer observations on the left (toward lower values) are said to be
skewed left.
AE103-MS - Management Science (A.Y. 2020-2021)
d. Uniform. When the observations in a set of data are equally spread across the range
of the distribution, the distribution is called a uniform distribution. A uniform
distribution has no clear peaks.
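The direction of skew can also be checked numerically. The sketch below uses the moment coefficient of skewness, which is one common choice and not something this module prescribes; the function names, the tolerance of 0.1, and the sample data are all assumptions made for illustration.

```python
from statistics import mean

def skewness(values):
    """Moment coefficient of skewness.

    Positive: the long tail is on the right (skewed right).
    Negative: the long tail is on the left (skewed left).
    Assumes the values are not all identical.
    """
    m = mean(values)
    n = len(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

def describe_skew(values, tolerance=0.1):
    """Label a data set; the tolerance is an arbitrary illustrative cutoff."""
    g = skewness(values)
    if g > tolerance:
        return "skewed right"
    if g < -tolerance:
        return "skewed left"
    return "roughly symmetric"

# Hypothetical data sets
few_on_right = [1, 1, 1, 2, 2, 3, 4, 7]
few_on_left = [1, 4, 5, 6, 6, 7, 7, 7]
mirror_image = [1, 2, 2, 3, 3, 3, 4, 4, 5]

for data in (few_on_right, few_on_left, mirror_image):
    print(data, "->", describe_skew(data))
```

Note that a uniform distribution is also symmetric, so a check like this cannot tell a bell shape from a flat one; the number of peaks has to be judged separately.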
4. Unusual Features
Sometimes, statisticians refer to unusual features in a set of data. The two most
common unusual features are gaps and outliers.
a. Gaps. Gaps refer to areas of a distribution where there are no observations. The
figure below has a gap; there are no observations in the middle of the distribution.
b. Outliers. Outliers are extreme values that differ greatly from the other observations.
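For whole-number data, a gap can be located mechanically by listing the values inside the observed range that never occur. This is only a sketch: the helper name find_gaps and the sample data are hypothetical.

```python
def find_gaps(values):
    """Return the whole numbers between min and max that have no observations."""
    present = set(values)
    return [v for v in range(min(values), max(values) + 1) if v not in present]

# Hypothetical observations with nothing in the middle of the distribution
observations = [1, 1, 2, 2, 2, 3, 7, 8, 8, 9]
print(find_gaps(observations))
```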
II. DOTPLOTS
- A dotplot is a type of graphic display used to compare frequency counts within categories
or groups.
Overview
As you might guess, a dotplot is made up of dots plotted on a graph. Here is how to interpret
a dotplot.
• Each dot represents a specific number of observations from a set of data. (Unless
otherwise indicated, assume that each dot represents one observation. If a dot
represents more than one observation, that should be explicitly noted on the plot.)
• The dots are stacked in a column over a category, so that the height of the column
represents the relative or absolute frequency of observations in the category.
• The pattern of data in a dotplot can be described in terms of symmetry and skewness
only if the categories are quantitative. If the categories are qualitative (as they often
are), a dotplot cannot be described in those terms.
Compared to other types of graphic display, dotplots are used most often to plot frequency
counts within a small number of categories, usually with small sets of data.
Example
Here is an example to show what a dotplot looks like and how to interpret it. Suppose 30 first
graders are asked to pick their favorite color. Their choices can be summarized in a dotplot,
as shown below.
Each dot represents one student, and the number of dots in a column represents the
number of first graders who selected the color associated with that column. For example,
Red was the most popular color (selected by 9 students), followed by Blue (selected by 7
students). Selected by only 1 student, Indigo was the least popular color.
In this example, note that the category (color) is a qualitative variable, so it is not
appropriate to talk about the symmetry or skewness of this dotplot. The pet-ownership
dotplots near the end of this module use a quantitative variable, so the skewness and
symmetry of dotplots are illustrated there.
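The favorite-color example can be sketched as a text dotplot. The counts for Red, Blue, and Indigo come from the example above; the counts for the other four colors are assumptions chosen only so that the total is 30 students. For simplicity the dots run across each row instead of being stacked in columns.

```python
# Red, Blue, and Indigo counts are from the example; the remaining
# counts are assumptions that bring the total to 30 students.
favorite_colors = {
    "Red": 9, "Orange": 3, "Yellow": 4, "Green": 4,
    "Blue": 7, "Indigo": 1, "Violet": 2,
}

def text_dotplot(counts, dot="*"):
    """Return one line per category, with one dot per observation."""
    width = max(len(name) for name in counts)
    return [f"{name:<{width}} | {dot * n}" for name, n in counts.items()]

for line in text_dotplot(favorite_colors):
    print(line)
```

Because the categories are qualitative, the order of the rows carries no meaning, which is why skewness and symmetry do not apply here.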
III. BAR CHARTS AND HISTOGRAMS
Bar Charts
A bar chart is made up of columns plotted on a graph. Here is how to read a bar chart.
• The columns are positioned over a label that represents a categorical variable.
• The height of the column indicates the size of the group defined by the column label.
The bar chart below shows average household income for the four "New" states - New
Jersey, New York, New Hampshire, and New Mexico.
Histograms
Like a bar chart, a histogram is made up of columns plotted on a graph. Usually, there is
no space between adjacent columns. Here is how to read a histogram.
• The columns are positioned over a label that represents a continuous, quantitative
variable.
• The column label can be a single value or a range of values.
• The height of the column indicates the size of the group defined by the column label.
The histogram below shows per capita income for five age groups.
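Behind every histogram is a binning step: each observation is assigned to the range of values that contains it, and the column height is the count for that range. The sketch below shows that step with equal-width bins. The function name, the bin layout, and the ages are all hypothetical and do not reproduce the income histogram referred to above.

```python
def histogram_counts(values, start, width, bins):
    """Count values falling in `bins` equal-width intervals.

    Interval i covers start + i*width up to (but not including)
    start + (i+1)*width. Values outside all intervals are ignored.
    """
    counts = [0] * bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < bins:
            counts[i] += 1
    return counts

# Hypothetical ages (an assumption, not the module's data)
ages = [21, 23, 25, 31, 34, 38, 39, 42, 47, 55, 58, 61]
counts = histogram_counts(ages, start=20, width=10, bins=5)

for i, n in enumerate(counts):
    low = 20 + 10 * i
    print(f"{low}-{low + 9}: {'#' * n}")
```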
IV. STEMPLOTS
- A stemplot (aka, stem and leaf plot) is a type of chart that shows how individual values are
distributed within a set of data.
A stemplot is used to display quantitative data, generally from small data sets (50 or
fewer observations). The stemplot below shows IQ scores for 30 sixth graders.
In a stemplot, the entries on the left are called stems; and the entries on the right are called
leaves. In the example above, the stems are tens (8 represents 80, 9 represents 90, 10
represents 100, and so on); and the leaves are ones. However, the stems and leaves could
be other units - millions, thousands, ones, tenths, etc.
Some stemplots include a key to help the user interpret the display correctly. The key in
the stemplot above indicates that a stem of 11 with a leaf of 7 represents an IQ score of
117.
Looking at the example above, you should be able to quickly describe the distribution of IQ
scores. Most of the scores are clustered between 90 and 109, with the center falling in the
neighborhood of 100. The scores range from a low of 81 (two students have an IQ of 81) to
a high of 151. The high score of 151 might be classified as an outlier.
Note: In the example above, the stems and leaves are explicitly labeled for educational
purposes. In the real world, however, stemplots usually do not include explicit labels for
the stems and leaves.
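A stemplot with tens as stems and ones as leaves can be built with a few lines of code. The sketch below assumes non-negative whole-number scores; the function name and the short list of scores are hypothetical (the list borrows the low of 81, the high of 151, and the key value 117 from the description above, but it is not the full set of 30 scores).

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines: tens (and above) as stems, ones as leaves.

    Stems with no leaves are still printed, so gaps stay visible.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        leaves[stem].append(str(leaf))
    return [
        f"{stem:>2} | {' '.join(leaves.get(stem, []))}".rstrip()
        for stem in range(min(leaves), max(leaves) + 1)
    ]

# Hypothetical subset of scores, for illustration only
scores = [81, 81, 92, 95, 98, 100, 103, 104, 107, 117, 151]
lines = stemplot(scores)
for line in lines:
    print(line)
```

The empty stems between 11 and 15 make the isolated high score stand out, which is how a stemplot hints at a possible outlier.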
V. BOXPLOTS
- A boxplot, sometimes called a box and whisker plot, is a type of graph used to display
patterns of quantitative data.
Basics
A boxplot splits the data set into quartiles. The body of the boxplot consists of a "box" (hence,
the name), which goes from the first quartile (Q1) to the third quartile (Q3).
Within the box, a vertical line is drawn at Q2, the median of the data set. Two horizontal
lines, called whiskers, extend from the front and back of the box. The front whisker goes
from Q1 to the smallest non-outlier in the data set, and the back whisker goes from Q3 to the
largest non-outlier.
If the data set includes one or more outliers, they are plotted separately as points on
the chart. In the boxplot above, two outliers are shown to the right of the second
whisker.
Additionally, boxplots display two common measures of the variability or spread in a data set.
• Range. If you are interested in the spread of all the data, it is represented on a boxplot
by the horizontal distance between the smallest value and the largest value,
including any outliers. In the boxplot above, data values range from about 0 (the
smallest non-outlier) to about 16 (the largest outlier), so the range is 16. If you ignore
outliers, the range is illustrated by the distance between the opposite ends of the
whiskers - about 10 in the boxplot above.
• Interquartile range (IQR). The middle half of a data set falls within the interquartile
range. In a boxplot, the interquartile range is represented by the width of the box (Q3
minus Q1). In the chart above, the interquartile range is equal to about 7 minus 3 or
about 4.
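The quantities a boxplot displays can be computed directly. The sketch below makes two assumptions that this module does not state: quartiles are taken as the medians of the lower and upper halves of the sorted data (other conventions exist and give slightly different values), and a point is treated as an outlier when it lies more than 1.5 interquartile ranges beyond the box, which is a common rule of thumb. The data are hypothetical, chosen so that the results resemble the numbers quoted above.

```python
from statistics import median

def quartiles(values):
    """Q1, Q2, Q3 using medians of the lower and upper halves.

    For an odd number of values the middle value is left out of both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data), median(data[-half:])

def boxplot_summary(values, k=1.5):
    """Summary behind a boxplot; k=1.5 is an assumed outlier rule of thumb."""
    q1, q2, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "Q1": q1, "median": q2, "Q3": q3, "IQR": iqr,
        "whiskers": (min(inside), max(inside)),
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
        "range": max(values) - min(values),
    }

# Hypothetical data (an assumption, not the data behind the module's boxplot)
data = [0, 1, 2, 3, 4, 4, 5, 5, 6, 6, 7, 7, 10, 14, 16]
summary = boxplot_summary(data)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these values the box runs from 3 to 7, the whiskers reach 0 and 10, and the two largest values are plotted separately as outliers.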
And finally, boxplots often provide information about the shape of a data set. The examples
below show some common patterns.
Each of the above boxplots illustrates a different skewness pattern. If most of the
observations are concentrated on the low end of the scale, the distribution is skewed
right; and vice versa. If a distribution is symmetric, the observations will be evenly split at
the median, as shown above in the middle figure.
VI. CUMULATIVE FREQUENCY PLOTS
- A cumulative frequency plot is a way to display cumulative information graphically. It
shows the number, percentage, or proportion of observations that are less than or equal to
particular values.
In the first chart (shown below), column height indicates frequency - the number of students
in each test score grouping. For example, about 30 students received a test score between
51 and 60.
In the next chart, a cumulative frequency chart, column height shows cumulative frequency -
the number of students who scored at or below each test score. It shows that 30 students
received a test score of at most 50; 60 students received a score of at most 60; 120
students received a score of at most 70; and so on.
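The step from frequencies to cumulative frequencies is a running total. The sketch below uses only the first three score groups, with frequencies inferred from the figures quoted above (30 students at or below 50, about 30 more between 51 and 60, and 60 more between 61 and 70); the variable names are illustrative.

```python
from itertools import accumulate

# Upper bound of each score group and the number of students in that group.
# The frequencies are inferred from the counts quoted in the text.
upper_bounds = [50, 60, 70]
frequencies = [30, 30, 60]

# Running total: each entry counts every student at or below that bound.
cumulative = list(accumulate(frequencies))

for bound, total in zip(upper_bounds, cumulative):
    print(f"score <= {bound}: {total} students")
```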
Let's work through an example to understand how to read this cumulative frequency plot.
Specifically, let's find the median. Follow the grid line to the right from the Y axis at 50%.
This line intersects the curve over the X axis at a test score of about 73. This means that
half of the students received a test score of at most 73, and half received a test score of at
least 73. Thus, the median is about 73.
You can use the same process to find the cumulative percentage associated with any other
test score. For example, what percentage of students received a test score of 64 or less?
From the graph, you can see that about 25% of students received a score of 64 or less.
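Reading a value off a cumulative percentage curve amounts to interpolating between plotted points. The sketch below uses straight-line interpolation, which is an assumption about the shape of the curve between points. The points themselves are made up, chosen so that the readings land near the values quoted above; they are not taken from the module's plot.

```python
def interpolate(points, x):
    """Linearly interpolate y at x from (x, y) points sorted by increasing x."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x is outside the plotted range")

def score_at_percent(curve, percent):
    """Read the curve the other way: from a cumulative % to a test score.

    Assumes the cumulative percentages are strictly increasing.
    """
    flipped = [(pct, score) for score, pct in curve]
    return interpolate(flipped, percent)

# Hypothetical (test score, cumulative %) points
curve = [(40, 0), (50, 5), (60, 15), (70, 40), (80, 75), (90, 95), (100, 100)]

median_score = score_at_percent(curve, 50)   # follow the 50% grid line
percent_at_64 = interpolate(curve, 64)       # start from a score of 64

print(f"median is about {median_score:.0f}")
print(f"about {percent_at_64:.0f}% scored 64 or less")
```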
VII. SCATTERPLOTS
- A scatterplot is a graphic tool used to display the relationship between two quantitative
variables.
Let's work through an example. Here is a table showing the height and weight of five
starters on a high school basketball team.
And here is the same data displayed in a scatterplot.
Each player in the table is represented by a dot on the scatterplot. The first dot, for example,
represents the shortest, lightest player. From the scale on the X axis, you see that the
shortest player is 67 inches tall; and from the scale on the Y axis, you see that he/she
weighs 155 pounds. In a similar way, you can read the height and weight of every other
player represented on the scatterplot.
Additionally, scatterplots can reveal unusual features in data sets, such as clusters, gaps,
and outliers. The scatterplots below illustrate some common patterns.
The pattern in the last example (nonlinear, zero slope, weak) is the pattern that is found
when two variables are not related.
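The direction and strength of a linear pattern in a scatterplot are often summarized with the Pearson correlation coefficient, which this module does not introduce; it is used here only as an illustrative sketch. Of the five (height, weight) pairs below, only the first (67 inches, 155 pounds) comes from the text; the other four are assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: near +1 or -1 is strong, near 0 is weak."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Only the first pair is from the text; the rest are hypothetical.
heights = [67, 72, 77, 74, 69]        # inches
weights = [155, 220, 240, 195, 175]   # pounds

r = pearson_r(heights, weights)
print(f"r = {r:.2f}")
```

A value of r close to zero corresponds to the pattern found when two variables are not linearly related, although a strong nonlinear relationship can also produce a small r.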
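VIII. COMPARING DATA SETS
- When comparing two or more data sets, focus on four features: center, spread, shape,
and unusual features.
• Center. Graphically, the center of a distribution is the point where about half of the
observations are on either side.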
• Spread. The spread of a distribution refers to the variability of the data. If the
observations cover a wide range, the spread is larger. If the observations are
clustered around a single value, the spread is smaller.
• Shape. The shape of a distribution is described by symmetry, skewness, number of
peaks, etc.
• Unusual features. Unusual features refer to gaps (areas of the distribution where
there are no observations) and outliers.
The remainder of this lesson shows how to use various graphs to compare data sets in
terms of center, spread, shape, and unusual features.
Dotplots
When dotplots are used to compare data sets, they are positioned one above the other,
using the same scale of measurement, as shown below.
The dotplots show pet ownership in homes on two city blocks. Pet ownership is a little lower
in block A. In block A, most households have zero or one pet; in block B, most households
have two or more pets. In block A, pet ownership is skewed right; in block B, it is roughly
bell-shaped. In block B, pet ownership ranges from 0 to 6 pets per household versus 0 to 4
pets in block A; so there is more variability in the block B distribution. There are no outliers
or gaps in either data set.
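The same comparison can be made numerically by summarizing each data set with the same measures. The pets-per-household counts below are hypothetical, chosen only to match the description above (block A from 0 to 4 pets and mostly zero or one, block B from 0 to 6 and mostly two or more).

```python
from statistics import median

# Hypothetical pets-per-household counts (assumptions, not the module's data)
block_a = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 4]
block_b = [0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6]

def summarize(values):
    """Center and spread on a common footing for side-by-side comparison."""
    return {
        "median": median(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }

for name, data in (("A", block_a), ("B", block_b)):
    s = summarize(data)
    print(f"block {name}: median={s['median']}, "
          f"min={s['min']}, max={s['max']}, range={s['range']}")
```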
Back-to-Back Stemplots
Back-to-back stemplots are another graphic option for comparing data from two groups.
The center of a back-to-back stemplot consists of a column of stems, with a vertical line on
each side. Leaves representing one data set extend from the right, and leaves representing
the other data set extend from the left.
The back-to-back stemplot above shows the amount of cash (in dollars) carried by a
random sample of teenage boys and girls. The boys carried more cash than the girls - a
median of $42 for the boys versus $36 for the girls. Both distributions were roughly
bell-shaped, although there was more variation among the boys. And finally, there were
neither gaps nor outliers in either group.
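A back-to-back stemplot can be sketched in text form as follows. The cash amounts are hypothetical, chosen only so that the medians match the $42 and $36 quoted above, and placing the boys on the left is an arbitrary choice. The sketch assumes whole-dollar amounts with tens as stems.

```python
from collections import defaultdict
from statistics import median

def back_to_back(left, right):
    """Return lines with shared stems in the middle and leaves on both sides."""
    l, r = defaultdict(list), defaultdict(list)
    for v in sorted(left):
        l[v // 10].append(str(v % 10))
    for v in sorted(right):
        r[v // 10].append(str(v % 10))
    stems = range(min(min(l), min(r)), max(max(l), max(r)) + 1)
    width = max(len(" ".join(l[s])) for s in stems)
    lines = []
    for s in stems:
        # Left-hand leaves are reversed so they grow outward from the stem.
        left_leaves = " ".join(reversed(l[s]))
        right_leaves = " ".join(r[s])
        lines.append(f"{left_leaves:>{width}} | {s} | {right_leaves}".rstrip())
    return lines

# Hypothetical cash amounts in dollars (assumptions, not the module's data)
boys = [24, 31, 38, 42, 45, 53, 60]
girls = [22, 28, 33, 36, 39, 41, 47]

lines = back_to_back(boys, girls)
for line in lines:
    print(line)
print(f"median: boys ${median(boys)}, girls ${median(girls)}")
```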
Parallel Boxplots
With parallel boxplots (aka, side-by-side boxplots), data from two groups are displayed on
the same chart, using the same measurement scale.
The boxplot above summarizes results from a medical study. The treatment group
received an experimental drug to relieve cold symptoms, and the control group received a
placebo. The boxplot shows the number of days each group continued to report
symptoms.
Neither boxplot reveals unusual features, such as gaps or outliers. Both plots are skewed to
the right, although the skew is more prominent in the treatment group. The range of patient
response was about the same in both groups. In the treatment group, cold symptoms lasted
1 to 15 days (range = 14) versus 3 to 17 days (range = 14) for the control group. The
median recovery time is more telling - about 6 days for the treatment group versus about 9
days for the control group. It appears that the drug may have had a positive effect on
patient recovery.
Both groups prefer the Japanese cars to the American cars, with Honda receiving the
highest ratings and Ford receiving the lowest ratings. Moreover, both genders agree on the
rank order in which the cars are rated. As a group, the men seem to be tougher raters; they
gave lower ratings to each car than the women gave.
Prepared by: