10) Identify Any Outliers in the Following Sets of Data

Outliers of big data processing

Uploaded by processaa2

28 DESCRIBING DATA WITH TABLES AND GRAPHS

Might Exclude from Summaries


You might choose to segregate (but not to suppress!) an outlier from any summary
of the data. For example, you might relegate it to a footnote instead of using excessively
wide class intervals in order to include it in a frequency distribution. Or you
might use various numerical summaries, such as the median and interquartile range, to
be discussed in Chapters 3 and 4, that ignore extreme scores, including outliers.
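
As a rough illustration of why a summary such as the median is said to ignore extreme scores, the sketch below compares the mean and the median of the nine summer incomes from Progress Check 2.4, with and without the largest income. The variable names are illustrative only, and setting aside the $25,700 income is done here purely to show the effect on each summary.

```python
from statistics import mean, median

# Summer incomes (dollars) of the nine students in Progress Check 2.4.
incomes = [6450, 4820, 5650, 1720, 600, 0, 3482, 25700, 8548]

# The same data with the single extreme income ($25,700) set aside.
trimmed = [x for x in incomes if x != 25700]

mean_all, mean_trimmed = mean(incomes), mean(trimmed)
median_all, median_trimmed = median(incomes), median(trimmed)

# The mean moves far more than the median when the extreme score is set aside.
print(f"mean:   {mean_all:.2f} with, {mean_trimmed:.2f} without")
print(f"median: {median_all:.2f} with, {median_trimmed:.2f} without")
```

The mean drops from 6330 to 3908.75 when the extreme income is set aside, while the median moves only from 4820 to 4151.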

Might Enhance Understanding


Insofar as a valid outlier can be viewed as the product of special circumstances, it
might help you to understand the data. For example, you might understand better why
crime rates differ among communities by studying the special circumstances that produce
a community with an extremely low (or high) crime rate, or why learning rates differ
among third graders by studying a third grader who learns very rapidly (or very slowly).

Progress Check *2.4 Identify any outliers in each of the following sets of data collected
from nine college students.

SUMMER INCOME   AGE   FAMILY SIZE   GPA
$6,450          20    2             2.30
$4,820          19    4             4.00
$5,650          61    3             3.56
$1,720          32    6             2.89
$600            19    18            2.15
$0              22    2             3.01
$3,482          23    6             3.09
$25,700         27    3             3.50
$8,548          21    4             3.20

Answers on page 421.
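
The text identifies outliers by inspection. As one possible mechanical check, the sketch below applies the common 1.5 × IQR rule of thumb, which is an assumption swapped in here and not a method stated in this section. The function name `flag_outliers`, the multiplier `k`, and the choice of the "inclusive" quartile method are likewise illustrative choices; other quartile conventions can give different fences.

```python
from statistics import quantiles

# Data from Progress Check 2.4 (nine college students).
data = {
    "summer income": [6450, 4820, 5650, 1720, 600, 0, 3482, 25700, 8548],
    "age": [20, 19, 61, 32, 19, 22, 23, 27, 21],
    "family size": [2, 4, 3, 6, 18, 2, 6, 3, 4],
    "GPA": [2.30, 4.00, 3.56, 2.89, 2.15, 3.01, 3.09, 3.50, 3.20],
}

def flag_outliers(values, k=1.5):
    """Return scores outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule of thumb)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1  # interquartile range
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

flagged = {name: flag_outliers(values) for name, values in data.items()}
for name, outliers in flagged.items():
    print(f"{name}: {outliers if outliers else 'none flagged'}")
```

For these four variables the rule happens to flag the same scores as the answer key on page 421, but a rule of thumb is no substitute for judging whether an extreme score is a valid observation.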

2.4 RELATIVE FREQUENCY DISTRIBUTIONS
An important variation of the frequency distribution is the relative frequency distribution.

Relative Frequency Distribution: A frequency distribution showing the frequency of each class as a fraction of the total frequency for the entire distribution.

Relative frequency distributions show the frequency of each class as a part or
fraction of the total frequency for the entire distribution.
This type of distribution allows us to focus on the relative concentration of observations
among different classes within the same distribution. In the case of the weight
data in Table 2.2, it permits us to see that the 160s account for about one-fourth
(12/53 = .23, or 23%) of all observations. This type of distribution is especially helpful
when you must compare two or more distributions based on different total numbers of
observations. For instance, as in Review Question 2.17, you might want to compare the
distribution of ages for 500 residents of a small town with that for the approximately
300 million residents of the United States. The conversion to relative frequencies
allows a direct comparison of the shapes of these two distributions without having to
adjust for the radically different total numbers of observations.

Constructing Relative Frequency Distributions


To convert a frequency distribution into a relative frequency distribution, divide
the frequency for each class by the total frequency for the entire distribution. Table 2.5
illustrates a relative frequency distribution based on the weight distribution of Table 2.2.
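
The conversion described above can be sketched in a few lines. Table 2.2 is not reproduced on this page, so the sample input is borrowed from the IQ frequency table in the answers section (35 observations); the function name `relative_frequencies` and the use of exact fractions are illustrative choices.

```python
from fractions import Fraction

def relative_frequencies(freqs):
    """Divide each class frequency by the total frequency."""
    total = sum(freqs)
    return [Fraction(f, total) for f in freqs]

# Class frequencies of the IQ distribution in the answers section,
# listed from the highest class (120-124) down to the lowest (65-69).
iq_freqs = [1, 0, 2, 3, 4, 6, 7, 4, 3, 3, 1, 1]
rel = relative_frequencies(iq_freqs)

for f, r in zip(iq_freqs, rel):
    print(f"{f:>2}  {float(r):.2f}")

# Exact fractions always sum to 1; rounded proportions may not.
print(sum(rel))
```

Keeping exact fractions until the final display makes it easy to see that any total other than 1.00 in a finished table comes from rounding alone.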



ANSWERS TO SELECTED QUESTIONS 421

IQ        TALLY*     f
120–124   /          1
115–119              0
110–114   //         2
105–109   ///        3
100–104   ////       4
95–99     //// /     6
90–94     //// //    7
85–89     ////       4
80–84     ///        3
75–79     ///        3
70–74     /          1
65–69     /          1
Total                35
*Tally column usually is omitted from the finished table.
(b) 64.5–69.5

2.3 Not all observations can be assigned to one and only one class (because of the gap between
20–22 and 25–30 and the overlap between 25–30 and 30–34). Not all classes are equal in
width (25–30 versus 30–34). Not all classes have both boundaries (35–above).

2.4 Outliers are a summer income of $25,700; an age of 61; and a family size of 18. No
outliers for GPA.
2.5
GRE       RELATIVE f
725–749   .01
700–724   .02
675–699   .07
650–674   .15
625–649   .17
600–624   .21
575–599   .15
550–574   .14
525–549   .07*
500–524   .02
475–499   .01
Totals    1.02
*From 13/200 = .065, which rounds to .07.

2.6
          (a)           (b)
          CUMULATIVE    CUMULATIVE
GRE       f             PERCENT(%)
725–749   200           100
700–724   199           100
675–699   196           98
650–674   182           91
625–649   152           76
600–624   118           59
575–599   76            38
550–574   46            23
525–549   19            10
500–524   6             3
475–499   2             1
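
As a cross-check on answers 2.5 and 2.6, the sketch below starts from the cumulative frequencies in 2.6(a), recovers each class frequency by differencing, and then recomputes the relative frequencies and cumulative percents. The helper `round_half_up` is a hypothetical name, and rounding halves upward is an assumption suggested by the footnote to 2.5 (.065 rounds to .07), not a rule stated on this page.

```python
# Cumulative frequencies from answer 2.6(a), lowest class (475-499) first.
cum_f = [2, 6, 19, 46, 76, 118, 152, 182, 196, 199, 200]
total = cum_f[-1]  # the highest class accumulates every observation

def round_half_up(numerator, denominator):
    """Round numerator/denominator to the nearest integer, halves upward."""
    return (2 * numerator + denominator) // (2 * denominator)

# Each class frequency is the difference between successive cumulative values.
freqs = [cum_f[0]] + [b - a for a, b in zip(cum_f, cum_f[1:])]

# Relative frequencies (in hundredths) and cumulative percents.
rel_hundredths = [round_half_up(100 * f, total) for f in freqs]
cum_percent = [round_half_up(100 * c, total) for c in cum_f]

for f, r, c, p in zip(freqs, rel_hundredths, cum_f, cum_percent):
    print(f"f={f:>2}  relative={r / 100:.2f}  cumulative f={c:>3}  cumulative %={p}")
```

Integer arithmetic is used for the rounding because binary floating-point values such as 3/200 sit slightly below .015 and would otherwise round down.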
