Quartiles
QUARTILES ARE THE VALUES THAT DIVIDE A LIST OF NUMBERS INTO QUARTERS. THE THREE
NUMBERS Q1, Q2, AND Q3 THAT PARTITION A RANKED DATA SET INTO FOUR (APPROXIMATELY)
EQUAL GROUPS ARE CALLED THE QUARTILES OF THE DATA.
EXAMPLE
• For instance, for the data set below, the values Q1 = 11, Q2 = 29, and Q3 = 104 are the
quartiles of the data.
• The quartile Q1 is called the first quartile. The quartile Q2 is called the second quartile. It
is the median of the data. The quartile Q3 is called the third quartile.
THE FOLLOWING METHOD OF FINDING QUARTILES MAKES USE OF MEDIANS.
EXAMPLE
• 1, 3, 3, 4, 5, 6, 6, 7, 8, 8
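The median-based method can be sketched in Python for the data set above. This is a minimal illustration, not necessarily the exact procedure from the original slide: it assumes the common convention that Q1 is the median of the values below the median position and Q3 is the median of the values above it, with the middle value left out of both halves when the number of values is odd. The function name `quartiles` is made up for this example.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ranked = sorted(data)              # quartiles are defined on ranked data
    n = len(ranked)
    half = n // 2
    lower = ranked[:half]              # values below the median position
    upper = ranked[half + n % 2:]      # values above it (middle value skipped if n is odd)
    return median(lower), median(ranked), median(upper)

print(quartiles([1, 3, 3, 4, 5, 6, 6, 7, 8, 8]))  # prints (3, 5.5, 7)
```

Under this convention the lower half is 1, 3, 3, 4, 5 and the upper half is 6, 6, 7, 8, 8, so Q1 = 3, Q2 = 5.5, and Q3 = 7. Other textbooks and software packages use slightly different conventions and may report slightly different quartiles.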
• A box-and-whisker plot (sometimes called a box plot) is often used to provide a visual
summary of a set of data.
• A box-and-whisker plot shows the median, the first and third quartiles, and the minimum
and maximum values of a data set. See the figure below.
EXAMPLE
• Construct a box-and-whisker plot for a data set with Q1 = 39, Q2 = 43, and Q3 = 51.5. The
minimum data value for the data set is 26, and the maximum data value is 73. The resulting
box-and-whisker plot is shown below.
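Since the figure itself is not reproduced here, the layout of a box-and-whisker plot can be approximated with a one-line text drawing. The sketch below is only an illustration: the function `text_box_plot`, the 48-column width, and the characters used for the whiskers, the box, and the median are all choices made for this example.

```python
def text_box_plot(minimum, q1, q2, q3, maximum, width=48):
    """Draw a one-line text box plot scaled to `width` columns."""
    span = maximum - minimum

    def col(value):
        # Map a data value to a column index between 0 and width - 1.
        return round((value - minimum) / span * (width - 1))

    row = [" "] * width
    for i in range(col(minimum), col(maximum) + 1):
        row[i] = "-"                   # whiskers run from minimum to maximum
    for i in range(col(q1), col(q3) + 1):
        row[i] = "="                   # box runs from Q1 to Q3
    row[col(q2)] = "|"                 # median marker inside the box
    return "".join(row)

plot = text_box_plot(26, 39, 43, 51.5, 73)
print(plot)
```

The median marker sits left of the centre of the box because Q2 = 43 is closer to Q1 = 39 than to Q3 = 51.5, and the right whisker is longer than the left because the maximum, 73, lies farther from the box than the minimum, 26.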
FREQUENCY DISTRIBUTIONS AND HISTOGRAMS
• A frequency distribution shows how often each different value in a set of data occurs. A
histogram is the most commonly used graph for showing frequency distributions. It looks very
much like a bar chart, but there are important differences between them.
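As a small illustration of a frequency distribution, the sketch below counts how often each value occurs in the ten-value data set 1, 3, 3, 4, 5, 6, 6, 7, 8, 8 from the quartile example and prints one row of asterisks per value. The use of `collections.Counter` and the text layout are choices made for this example.

```python
from collections import Counter

data = [1, 3, 3, 4, 5, 6, 6, 7, 8, 8]
frequency = Counter(data)              # maps each value to its number of occurrences

for value in sorted(frequency):
    # One asterisk per occurrence gives a rough sideways chart.
    print(f"{value}: {'*' * frequency[value]}")
```

The values 3, 6, and 8 each occur twice and every other value occurs once, so the frequencies add up to the 10 data values.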
WHAT IS A HISTOGRAM DIAGRAM?
• The histogram is a graph that is often used in mathematics and statistics. Histograms
show how frequently values or ranges of values appear in a set of data.
BAR CHARTS AND HISTOGRAMS: CATEGORICAL AND QUANTITATIVE
• Histogram diagrams have characteristics in common with traditional bar charts: they
both show frequency and use a similar layout. However, there is a key difference:
• Bar charts show categorical data: data that can be split into different
categories or types
• Histograms show continuous, quantitative data: data that are measured on a
numerical scale
Bar charts are a useful tool for visualising the size of each category, but
histograms are a better way to display a frequency distribution over a range.
Histograms also make it easier to analyse the data set and estimate its mean, median
and mode.
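The difference from a bar chart shows up in how the data are grouped: a histogram counts the values that fall into consecutive numerical ranges (classes). The sketch below is one minimal way to do that grouping. The function `bin_counts`, the class width of 2, and the starting point of 1 are assumptions made for illustration, and the data are the ten values from the quartile example.

```python
def bin_counts(values, start, width, bins):
    """Count values in consecutive classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * bins
    for v in values:
        k = int((v - start) // width)  # index of the class containing v
        if 0 <= k < bins:              # ignore values outside the covered range
            counts[k] += 1
    return counts

measurements = [1, 3, 3, 4, 5, 6, 6, 7, 8, 8]
counts = bin_counts(measurements, start=1, width=2, bins=4)

for k, count in enumerate(counts):
    low = 1 + 2 * k
    print(f"[{low}, {low + 2}): {'#' * count}")
```

Each class includes its lower boundary and excludes its upper boundary, so a value of 3 is counted in the class from 3 to 5 and not in the class from 1 to 3. With these classes the counts are 1, 3, 3, and 3.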
NORMAL DISTRIBUTIONS AND THE EMPIRICAL RULE
• Because the area under the curve is 1, the unshaded region under the curve has area 1 -
0.159, or 0.841, representing the fact that 84.1% of the data are less than 10.
• The following rule, called the Empirical Rule, describes the percentages of data that lie
within 1, 2, and 3 standard deviations of the mean in a normal distribution.
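EMPIRICAL RULE FOR A NORMAL DISTRIBUTION
In a normal distribution, approximately
• 68% of the data lie within 1 standard deviation of the mean.
• 95% of the data lie within 2 standard deviations of the mean.
• 99.7% of the data lie within 3 standard deviations of the mean.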
• A survey of 1000 U.S. gas stations found that the price charged for a gallon of regular gas
could be closely approximated by a normal distribution with a mean of $3.10 and a
standard deviation of $0.18. How many of the stations charge
a) between $2.74 and $3.46 for a gallon of regular gas?
SOLUTION:
The $2.74 per gallon price is 2 standard deviations below the mean. The $3.46 price is 2
standard deviations above the mean. In a normal distribution, 95% of all data lie within 2
standard deviations of the mean. See Figure 5.1.
• Therefore, approximately (95%)(1000) = (0.95)(1000) = 950 of the stations charge
between $2.74 and $3.46 for a gallon of regular gas.
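The arithmetic in this solution can be reproduced with a short script. It is a sketch under stated assumptions: the dictionary `EMPIRICAL` holds the usual rounded Empirical Rule percentages (68%, 95%, 99.7%), and the function name `expected_within` is made up for this example.

```python
mean, sd, stations = 3.10, 0.18, 1000

# Rounded Empirical Rule shares for 1, 2, and 3 standard deviations.
EMPIRICAL = {1: 0.68, 2: 0.95, 3: 0.997}

def expected_within(k):
    """Return the price interval k standard deviations wide and the expected station count."""
    low, high = mean - k * sd, mean + k * sd
    return low, high, EMPIRICAL[k] * stations

low, high, count = expected_within(2)
print(f"${low:.2f} to ${high:.2f}: about {count:.0f} stations")
```

For k = 2 the interval runs from 3.10 - 2(0.18) = 2.74 to 3.10 + 2(0.18) = 3.46, which confirms that the two prices in the question are exactly 2 standard deviations from the mean.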
THE STANDARD NORMAL DISTRIBUTION
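• It is often helpful to convert data values x to z-scores, by using the z-score formulas:
z = (x - μ) / σ for a population with mean μ and standard deviation σ
z = (x - x̄) / s for a sample with mean x̄ and standard deviation s
• To find the area of a tail region, we subtract the entry in the normal curve table from 0.500.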
• Because the area of a portion of the standard normal distribution can be interpreted as a
percentage of the data or as a probability that the variable lies in a particular interval, we
can use the standard normal distribution to solve many application problems.
EXAMPLE
• A soda machine dispenses soda into 12-ounce cups. Tests show that the actual amount of
soda dispensed is normally distributed, with a mean of 11.5 oz and a standard deviation
of 0.2 oz.
What percent of cups will receive less than 11.25 oz of soda?
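SOLUTION: The z-score for a data value x is its distance from the mean divided by the
standard deviation. For x = 11.25,
z = (11.25 - 11.5) / 0.2 = -1.25
The normal curve table entry for z = 1.25 is 0.394, so the area of the tail region to the left
of z = -1.25 is 0.500 - 0.394 = 0.106.
Thus 10.6% of the cups filled by the soda machine will receive less than 11.25 oz of soda.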
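The soda-machine result can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of a printed normal curve table. The variable names are chosen for this example.

```python
from statistics import NormalDist

mean, sd = 11.5, 0.2                   # ounces, from the problem statement
x = 11.25                              # fill level of interest

z = (x - mean) / sd                    # z-score of 11.25 oz
share_below = NormalDist().cdf(z)      # area under the standard normal curve left of z

print(f"z = {z:.2f}, share below = {share_below:.1%}")
```

The computed area is about 0.106, in agreement with the 10.6% obtained by subtracting the table entry from 0.500.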