Statistics Unit1ppt

This document provides an introduction to business statistics, covering key concepts such as populations, samples, parameters, and statistics, as well as applications in various fields like accounting, production, marketing, economics, and finance. It distinguishes between descriptive and inferential statistics, explains data collection methods, and discusses different types of data and measurement scales. Additionally, it outlines sampling techniques and the importance of frequency distributions and graphical methods for data representation.

Uploaded by

alhajeri136


Statistics for Business
STAT130
Unit 1: Introduction and Descriptive Statistics
Chapter 1
An Introduction to Business Statistics
Applications in Business and Economics
□ Accounting
■ Public accounting firms use statistical
sampling procedures when conducting audits for their
clients.
□ Production
■ A variety of statistical quality control charts are
used to monitor the output of a production process.
□ Marketing
■ Electronic point-of-sale scanners at retail checkout
counters are being used to collect data for a variety
of marketing research applications.
3
Applications in Business and
Economics
□ Economics
■ Economists use statistical information in making
forecasts about the future of the economy or some
aspect of it.
□ Finance
■ Financial advisors use a variety of statistical
information, including price-earnings ratios and
dividend yields, to guide their investment
recommendations.

4
Key Definitions
□ A population is the collection of all items or
things under consideration (people or objects)
□ A sample is a portion of the population
selected for analysis
■ Note: a measurement of the whole population is more accurate than one based on a sample
□ A parameter is a summary measure that
describes a characteristic of the population
■ Note: anything we calculate when we measure the whole population
□ A statistic is a summary measure
computed from a sample
■ Note: anything we calculate from the sample
□ A survey is the gathering of data about a
particular group of people or items
□ A census is a survey of the entire
population
5
Exercise
□ A manufacturer of children's toys claims that less
than 5% of his products are defective. When 500
toys were drawn from a large production run, 8%
were found to be defective.
a) What is the population of interest?
b) What is the sample?
c) What is the parameter?
d) What is the statistic?
e) Does the value 5% refer to the parameter or
the statistic?
f) Is the value 8% a parameter or a statistic?
g) Explain briefly how the statistic can be used to
make inferences about the parameter to test the claim.
Notes:
■ The population is the children's toys
■ The sample is the 500 toys
■ The parameter is 5% (a claim is always about a parameter)
■ The statistic is 8%
■ Important: inferences. How can we test the claim? Go to slide 10
6
What is Statistics?
□ Statistics is a science that deals with
collecting and analyzing data, drawing
conclusions, and making decisions.
□ There are two main areas of Statistics:
■ Descriptive statistics: describes and summarises the data;
provides tabular and graphical techniques and
numerical measures for describing data.
■ Inferential statistics: provides procedures for analyzing data and
making decisions (slide 9 makes this clearer).

7
Descriptive Statistics

□ Collect data
■ e.g. Survey
□ Present data
■ e.g. Tables and graphs
□ Characterize data
■ e.g. Sample mean = $\dfrac{\sum X_i}{n}$
8
Inferential Statistics
□ Estimation
■ e.g.: Estimate the
population mean weight
using the sample mean
weight
□ Hypothesis testing
■ e.g.: Test the claim that the
population mean weight is
over 120 pounds

Drawing conclusions and/or making decisions

concerning a population based on sample results.
9
Inferential Statistics
□ Making statements about a population
by examining sample results
■ Sample statistics (known, because we calculated them) → Inference → Population parameters (unknown, but can be estimated from sample evidence)
■ This is the procedure used to test the claim
[Figure: a sample drawn from a population]

10

Sources of
data
□ The most popular sources of data are:
■ Published material, observational studies,
experimental studies and surveys.
□ Published material
found in books, in scientific journals, on
tapes, on CDs, on the Internet, etc…
■ Data published by the organization that
collected the data are called PRIMARY DATA
■ Data published by an organization other than
the organization that collected the data are
called SECONDARY DATA.
11
Sources of data
□ Observational studies:
■ are studies in which the sample elements are observed
and the information is recorded without
controlling any of the factors that might affect
the information or measurements.
□ Experimental studies:
■ are studies in which the measurements are recorded while
controlling some factors that might influence the
results of the study.
□ Surveys:
■ are questionnaires designed to solicit information from
people, by means of face-to-face interview,
telephone interview, postal mail, e-mail, or fax.
12
Types of data
□ Data are the facts, figures, or records that
are collected from the sample elements.
□ Data can be classified:
■ Qualitative data are labels or names used to
identify attributes of the sample elements.
The labels can be numbers with no real
numerical meaning.
□ Examples: gender, marital status, race, ..
■ Quantitative data are numbers (with real
meaning), representing measurements,
obtained from the sample elements.
□ Examples: salary, age, number of branches,..
13
Measurement Scales
□ Nominal data if the order is not important.
■ Examples: data representing marital status,
gender, work sector (public, private), get
promoted (yes, no), etc …
□ Ordinal data if the order is important.
■ Examples: data representing job performance
(excellent, good, fair, poor), income level (low,
medium, high), educational level (less than
high school, high school, college), etc…

14
Measurement Scales
□ Interval data: All of the characteristics of
ordinal plus…
■ Measurements are on a numerical scale with an arbitrary
zero point
□ The “zero” is assigned: it is nonphysical and not
meaningful
□ Zero does not mean the absence of the quantity that we
are trying to measure
■ Can only meaningfully compare values by the interval
between them
□ Cannot compare values by taking their ratios
□ “Interval” is the arithmetic difference between the
values
■ Example: temperature
□ 0 °F means “cold,” not “no heat”
□ 80 °F is not twice as warm as 40 °F
15
Measurement Scales
□ Ratio data: All of the characteristics of
interval plus…
■ Measurements are on a numerical scale with a
meaningful zero point
□ Zero means “none” or “nothing”
■ Values can be compared in terms of their interval and
ratio
□ $30 is $20 more than $10
□ $0 means no money
■ In business and finance, most quantitative variables are
ratio variables, such as anything to do with money
□ Examples: Earnings, profit, loss, age, distance, height,
weight
16
Exercise
□ After the graduation ceremonies at a university, six
Business graduates were asked whether they will join
an MBA program next year. Some information
about these graduates is shown below.
Graduate Sex Age MBA Rank
Huda F 52 1 1
Mohamed M 24 1 2
Sara F 33 0 4
Ali M 38 0 20
Fatima F 25 1 3
Samer M 19 0 8

a) How many elements are in the data set?

b) How many variables are in the data set?
c) How many observations are in the data set?
d) Classify the above variables (qualitative/ quantitative).
17
Sampling
□ Reasons for Drawing a Sample
■ It may cost too much to collect information from each
element of the population.
■ The population may be too large and it would take a
long time to collect information.
■ It may not be possible to obtain information from
some elements of the population.

Probability Samples: Simple Random, Systematic, Stratified, Cluster

18
Simple Random Samples
□ Every individual or item from the frame
has an equal chance of being selected.
□ Selection may be with replacement or
without replacement.
□ Samples obtained from computer random
number generators.
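
As a rough illustration of the points above, the sketch below draws a simple random sample with a computer random number generator. The helper name `simple_random_sample`, the frame of 64 numbered individuals, and the seed are illustrative assumptions, not part of the slides.

```python
import random

def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n items from the frame, each with an equal chance of selection."""
    rng = random.Random(seed)  # seeded only so the sketch is reproducible
    if replace:
        # With replacement: the same item may be selected more than once.
        return rng.choices(frame, k=n)
    # Without replacement: every selected item is distinct.
    return rng.sample(frame, n)

frame = list(range(1, 65))  # hypothetical frame of 64 numbered individuals
without_replacement = simple_random_sample(frame, 8, seed=1)
with_replacement = simple_random_sample(frame, 8, replace=True, seed=1)
print(without_replacement)
print(with_replacement)
```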

19
Systematic Samples
□ Decide on sample size: n
□ Divide frame of N individuals into groups of k
individuals: k=N/n
□ Randomly select one individual from the 1st
group.
□ Select every kth individual thereafter

Example: N = 64, n = 8, k = 8
[Figure: frame of 64 individuals with the first group of 8 highlighted]
20
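
The steps above can be sketched in a few lines of Python, using the slide's N = 64 and n = 8. The function name `systematic_sample` and the numbered frame are assumptions for illustration, and the sketch assumes N is a multiple of n.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick one random individual from the first group, then every kth after it."""
    N = len(frame)
    k = N // n                    # group size, k = N / n
    rng = random.Random(seed)
    start = rng.randrange(k)      # random index inside the first group
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 65))        # hypothetical frame, N = 64
sample = systematic_sample(frame, 8, seed=3)
print(sample)
```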
Stratified Samples
□ Population divided into two or more subgroups
(called strata) according to some common
characteristic.
□ Simple random sample selected from
each subgroup.
□ Samples from subgroups are combined into
one.
[Figure: population divided into 4 strata, with a sample drawn from the strata]
21
Cluster Samples
□ Population is divided into “clusters,” each
representative of the population
□ A simple random sample of clusters is
selected
■ All items in the selected clusters can be used, or items
can be chosen from a cluster using another
probability sampling technique

[Figure: population divided into 16 clusters; randomly selected clusters form the sample]
22
Advantages and
Disadvantages
□ Simple random sample and systematic sample
■ Simple to use
■ May not be a good representation of the
population’s underlying characteristics that
have small probabilities
□ Stratified sample
■ Ensures representation of individuals across the
entire population
□ Cluster sample
■ More cost effective
■ Less efficient (need larger sample to acquire the
same level of precision)
23
Chapter 2
Descriptive Statistics: Tabular
and Graphical Methods
Organizing and Presenting Data
□ Data in raw form are usually not easy to use
for decision making
■ Some type of organization is needed
□ Table
□ Graph
□ Techniques reviewed here:
■ Stem-and-Leaf Display
■ Frequency Distributions and Histograms
■ Bar charts and pie charts
■ Contingency tables and Scatter Diagrams

25
Representing Qualitative Data
□ Tabulating data
■ Frequency table
□ Graphing data
■ Bar charts
■ Pie charts

26
Frequency Tables
□ A frequency table consists of two columns,
one of which shows the categories or classes
and the other specifies the frequency for
each category.
□ In a frequency table, all frequencies must add
up to the sample size (n).
□ A relative frequency table consists of two
columns, one of which shows the categories
or classes and the other specifies the
relative frequency for each category.
The relative frequency=(Frequency/sample size)
27
Example
□ The following table lists all 251 vehicles sold
in 2006 by the greater Cincinnati Jeep
dealers
Jeep Model        Frequency
Commander         71
Grand Cherokee    70
Liberty           80
Wrangler          30
Total             251
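
A minimal sketch of how relative and percent frequencies follow from the frequency table above (relative frequency = frequency / sample size). The dictionary name `jeep_sales` is an illustrative choice.

```python
# Frequencies taken from the Jeep example above.
jeep_sales = {"Commander": 71, "Grand Cherokee": 70, "Liberty": 80, "Wrangler": 30}

n = sum(jeep_sales.values())  # sample size; all frequencies must add up to n

# Relative frequency = frequency / sample size.
relative = {model: count / n for model, count in jeep_sales.items()}

for model, count in jeep_sales.items():
    print(f"{model:<15} {count:>4} {relative[model]:.4f} {relative[model]:.2%}")
```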

28
Example: Relative Frequency Table
Jeep Model        Relative Frequency    Percent Frequency
Commander         0.2829                28.29%
Grand Cherokee    0.2789                27.89%
Liberty           0.3187                31.87%
Wrangler          0.1195                11.95%
Total             1.0000                100.00%

29
Bar Charts and Pie Charts
□ Bar chart: A vertical or horizontal rectangle
represents the frequency for each category
■ Height can be frequency, relative frequency, or
percent frequency
■ What to Look For: Frequently and infrequently
occurring categories.
□ Pie chart: A circle divided into slices where the
size of each slice represents its relative frequency
or percent frequency
■ What to Look For: Categories that form large and
small proportions of the data set.

30
Excel Bar Chart

31
Excel Pie Chart

32
Exercise
□ A random sample of 25 female shoppers was
selected on a given day and each
shopper was asked: “what is your
favorite shampoo?”. The data were as
follows:
p, p, s, d, s, d, d, s, p, d, p, d, d, s, d, p, s, s,
d, s, p, d, d, s, d,
where d= Dove, p= Pantene and s=
Sunsilk.
Construct a frequency table, a bar chart and
a pie chart and comment on the plots.
33
Representing Quantitative Data
□ Ordered array
□ Stem and leaf display
□ Frequency distributions and cumulative distributions
■ Histogram
■ Polygon
■ Ogive

34
Frequency Distributions
□ A frequency distribution is a list or a table
■ containing class groupings (categories or ranges
within which the data falls)
■ and the corresponding frequencies with which
data
falls within each grouping or category
□ Why Use Frequency Distributions?
■ A frequency distribution is a way to summarize
data
■ The distribution condenses the raw data into a
more useful form
■ allows for a quick visual interpretation of the
data
■ and easy graphical display
35
Class Intervals and Class Boundaries
□ If each class grouping has the same width
■ Determine the width of each interval by
$\text{Width of interval} = \dfrac{\text{range}}{\text{number of desired class groupings}}$
■ Use at least 5 but no more than 15 groupings
■ Class boundaries never overlap
■ Round up the interval width to get desirable
endpoints

36
Frequency Distribution
Example
□ A manufacturer of insulation randomly selects 20
winter days and records the daily high
temperature
■ Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
32, 35, 37, 38, 41, 43, 44, 46, 53, 58
■ Find range: 58 - 12 = 46
■ Select number of classes: 5 (usually 5 to 15)
■ Compute class interval (width): 10 (46/5 then roundup)
■ Compute class boundaries (limits): 10, 20, 30, 40, 50, 60
■ Compute class midpoints: 15, 25, 35, 45, 55
■ Count observations & assign to classes
37
Frequency Distribution
Example
Ordered Data:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                  Frequency    Relative Frequency    Percentage
10 but less than 20    3            .15                   15
20 but less than 30    6            .30                   30
30 but less than 40    5            .25                   25
40 but less than 50    4            .20                   20
50 but less than 60    2            .10                   10
Total                  20           1.00                  100
38
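
The construction steps for this example can be sketched as follows. The starting boundary of 10 is taken from the slide; the variable names are illustrative, and rounding the width up with `math.ceil` is one simple way to get the slide's width of 10.

```python
import math

data = [12, 13, 17, 21, 24, 24, 26, 27, 27, 30,
        32, 35, 37, 38, 41, 43, 44, 46, 53, 58]

num_classes = 5
data_range = max(data) - min(data)           # 58 - 12 = 46
width = math.ceil(data_range / num_classes)  # 46 / 5 = 9.2, rounded up to 10
first_boundary = 10                          # chosen as a desirable endpoint

# Count observations in each class "lower but less than upper".
counts = [0] * num_classes
for x in data:
    counts[(x - first_boundary) // width] += 1

n = len(data)
for i, count in enumerate(counts):
    lower = first_boundary + i * width
    print(f"{lower} but less than {lower + width}: {count}  {count / n:.2f}")
```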
The Histogram
□ A graph of the data in a frequency distribution is
called a histogram
□ The class boundaries (or class midpoints) are
shown on the horizontal axis
□ frequency is measured on the vertical axis
□ Bars of the appropriate heights can be used to
represent the number of observations within each
class
□ What to Look For: Central or typical value, extent
of spread or variation, general shape, location and
number of peaks, presence of gaps and outliers.
39
Histogram Example
Class                  Class Midpoint    Frequency
10 but less than 20    15                3
20 but less than 30    25                6
30 but less than 40    35                5
40 but less than 50    45                4
50 but less than 60    55                2
[Figure: Histogram: Daily High Temperature; frequency on the vertical axis, class midpoints on the horizontal axis; no gaps between bars]
40
Shapes of Histograms
[Figure: symmetric histograms and skewed histograms]
41

Frequency Polygons
□ Plot a point above each class midpoint at a
height equal to the frequency of the class
□ Useful when comparing
two or more
distributions

42
Cumulative Distributions and Ogive
□ Another way to summarize a distribution is to
construct a cumulative distribution
□ Rather than a count, we record the number of
measurements that are less than the upper
boundary of that class
□ Ogive: A graph of a cumulative distribution
■ Plot a point above each upper class boundary at height
of cumulative frequency
■ Connect points with line segments
■ Can also be drawn using
□ Cumulative relative frequencies
□ Cumulative percent frequencies
43
Cumulative Frequency
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class       Freq    %      Cumulative Class    Cumulative Frequency    Cumulative %
10 - <20    3       15     less than 20        3                       15
20 - <30    6       30     less than 30        9                       45
30 - <40    5       25     less than 40        14                      70
40 - <50    4       20     less than 50        18                      90
50 - <60    2       10     less than 60        20                      100
Total       20      100
44
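
A short sketch of how the cumulative columns can be derived from the class frequencies, using a running total. The variable names are illustrative.

```python
from itertools import accumulate

upper_boundaries = [20, 30, 40, 50, 60]
frequencies = [3, 6, 5, 4, 2]
n = sum(frequencies)

# Running total: number of measurements below each upper class boundary.
cumulative = list(accumulate(frequencies))
cumulative_pct = [100 * c / n for c in cumulative]

for upper, c, pct in zip(upper_boundaries, cumulative, cumulative_pct):
    print(f"less than {upper}: {c:>2}  {pct:.0f}%")
```

Plotting `cumulative_pct` against `upper_boundaries` and joining the points with line segments would give the ogive.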
Graphing Cumulative Frequencies: The Ogive (Cumulative % Polygon)
Class           Class boundary    Cumulative Percentage
less than 10    10                0
less than 20    20                15
less than 30    30                45
less than 40    40                70
less than 50    50                90
less than 60    60                100
[Figure: Ogive: Daily High Temperature; cumulative percentage (0 to 100) on the vertical axis, class boundaries (0 to 60) on the horizontal axis]
45
Exercise
□ A random sample of 25 stocks was selected from
the New York Stock Exchange and the book value
(net worth divided by the number of outstanding
shares) was recorded for each stock. The data
were as follows:
10 8 16 14 4 10 8 12 9 7 14 13
11 7 10 17 8 11 9 15 8 6 18 9 12
■ Construct a frequency table
■ Construct a histogram and describe the distribution.
■ Determine the cumulative frequency table

46
Stem and Leaf Display
□ Purpose is to see the overall pattern of the data,
by grouping the data into classes
■ the variation from class to class
■ the amount of data in each class
■ the distribution of the data within each class
□ What to look for: The display conveys
information about a representative or typical
value in the data set, the extent of spread about
such a value, the presence of any gaps in the
data, the extent of symmetry in the distribution
of values, the number and location of peaks, and
the presence of any outliers (unusual points).
47
Example
Data in ordered array:
21, 24, 24, 26, 27, 27, 30, 32, 38, 41
□ Here, use the 10’s digit for the stem unit:
□ 21 is shown as: stem 2, leaf 1
□ 38 is shown as: stem 3, leaf 8

Stem    Leaves
2       1 4 4 6 7 7
3       0 2 8
4       1
48
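
The display above can be reproduced with a small sketch that uses the 10's digit as the stem and the unit digit as the leaf. The function name `stem_and_leaf` is an illustrative choice, and the sketch assumes non-negative whole numbers.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's unit digit (leaf) under its 10's digit (stem)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

data = [21, 24, 24, 26, 27, 27, 30, 32, 38, 41]
display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```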
Car Mileage: Results
□ Refer to the Car Mileage Case (Table 2.14)
□ Looking at the stem-and-leaf display, the
distribution appears almost “symmetrical”
■ The upper portion (29, 30, 31) is almost a
mirror image of the lower portion of the display
(31, 32, 33)
□ Stems 31, 32*, 32, and 33*
■ But not exactly a mirror reflection
49
Crosstabulation Tables
□ Classifies data on two dimensions
■ Rows classify according to one dimension
■ Columns classify according to a second
dimension
□ Requires three variables
1. The row variable
2. The column variable
3. The variable counted in the cells

50
Example: The Investor Satisfaction Case
□ Investment broker sells several kinds of
investments (stock fund, bond fund, tax-deferred
annuity)
□ Wishes to study whether satisfaction depends on
the type of investment product purchased
Fund Type               High    Medium    Low    Total
Bond Fund               15      12        3      30
Stock Fund              24      4         2      30
Tax Deferred Annuity    1       24        15     40
Total                   40      40        20     100
51
More on Crosstabulation
Tables
□ Row totals provide a frequency distribution for
the different fund types
□ Column totals provide a frequency distribution for
the different satisfaction levels
□ One way to investigate relationships is to
compute row and column percentages
■ Compute row percentages by dividing each
cell’s frequency by its row total and expressing
as a percentage
■ Compute column percentages by dividing by the
column total

52
Row Percentage for Each Fund Type
Fund Type               High     Medium    Low      Total
Bond Fund               50.0%    40.0%     10.0%    100%
Stock Fund              80.0%    13.3%     6.7%     100%
Tax Deferred Annuity    2.5%     60.0%     37.5%    100%

53
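
A minimal sketch of the row-percentage calculation for the investor satisfaction counts: each cell is divided by its row total and expressed as a percentage. The dictionary layout is an illustrative choice.

```python
# Cell counts (High, Medium, Low) from the investor satisfaction example.
crosstab = {
    "Bond Fund": [15, 12, 3],
    "Stock Fund": [24, 4, 2],
    "Tax Deferred Annuity": [1, 24, 15],
}

row_pct = {}
for fund, cells in crosstab.items():
    row_total = sum(cells)
    # Row percentage = cell frequency / row total, as a percentage.
    row_pct[fund] = [round(100 * c / row_total, 1) for c in cells]

for fund, pcts in row_pct.items():
    print(f"{fund:<22}", "  ".join(f"{p:.1f}%" for p in pcts))
```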
Scatter Plots
□ Scatter plots are used for bivariate numerical
data
■ Bivariate data consists of paired observations
taken from two numerical variables
□ The Scatter plot:
■ one variable (dependent) is measured on the
vertical axis and the other variable (independent)
is measured on the horizontal axis.
□ What to look for:
■ Describe the type of the relationship (linear,
nonlinear), the direction (positive, negative) and
the strength (strong, moderate, weak).
54
Examples of Scatter Plots
[Figure: scatter plots describing direction and describing strength]
55

Scatter Plot Example
Volume per day    Cost per day
23                125
26                140
29                146
33                160
38                167
42                170
50                188
55                195
60                200
Strong positive linear relationship

56
Chapter 3
Descriptive Statistics:
Numerical Methods
Summary Measures
Describing Data Numerically
□ Center and Location
■ Mean
■ Median
■ Mode
□ Measures of Variation
■ Range
■ Interquartile Range
■ Variance
■ Standard Deviation
■ Coefficient of Variation
□ Relative Standing
■ Percentiles
■ Quartiles
58
Measures of Central
Tendency
□ In addition to describing the shape of a
distribution, want to describe the data
set’s central tendency
■ A measure of central tendency represents the
center or middle of the data

Central Tendency

Mean Median Mode

59
Mean (Arithmetic Average)
□ The Mean is the arithmetic average of data
values
■ Sample mean (n = sample size):
$\bar{x} = \dfrac{\sum_{i=1}^{n} x_i}{n} = \dfrac{x_1 + x_2 + \cdots + x_n}{n}$
■ Population mean (N = population size):
$\mu = \dfrac{\sum_{i=1}^{N} x_i}{N} = \dfrac{x_1 + x_2 + \cdots + x_N}{N}$
60
Arithmetic Mean
□ The most common measure of central tendency
□ Mean = sum of values divided by the number of
values
□ Affected by extreme values (outliers)

[Figure: two dot plots on a 0 to 10 axis]
■ Values 1, 2, 3, 4, 5: Mean = $\dfrac{1 + 2 + 3 + 4 + 5}{5} = \dfrac{15}{5} = 3$
■ Values 1, 2, 3, 4, 10: Mean = $\dfrac{1 + 2 + 3 + 4 + 10}{5} = \dfrac{20}{5} = 4$
61
Median
□ Not affected by extreme values

[Figure: two dot plots on a 0 to 10 axis; Median = 3 in both]

□ In an ordered array, the median is the
“middle” number (50% above, 50% below)

62
Finding the Median
□ The location of the median:
$\text{Median position} = \dfrac{n+1}{2}$ position in the ordered array
■ If the number of values is odd, the median is the middle
number
■ If the number of values is even, the median is the
average of the two middle numbers

□ Note that (n+1)/2 is not the value of the median,

only the position of the median in the ranked data

63
Mode
□ A measure of central tendency
□ Value that occurs most often
□ Not affected by extreme values
□ Mainly used for grouped numerical data or
categorical data
□ There may be no mode
□ There may be several modes
[Figure: dot plot on a 0 to 14 axis with Mode = 9; dot plot on a 0 to 6 axis with no mode]
64
Review Example
□ Five houses on a hill by the beach
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
[Figure: five houses labeled $2,000 K, $500 K, $300 K, $100 K, $100 K]
65
Example: Summary Statistics
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum 3,000,000
□ Mean: ($3,000,000/5) = $600,000
□ Median: middle value of ranked data = $300,000
□ Mode: most frequent value = $100,000
66
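
The three measures for the house prices can be checked with Python's standard `statistics` module; this is only a sketch, and the variable names are illustrative.

```python
import statistics

house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(house_prices)      # sum of values / number of values
median_price = statistics.median(house_prices)  # middle value of the ranked data
mode_price = statistics.mode(house_prices)      # most frequent value

print(mean_price, median_price, mode_price)
```

The single $2,000,000 house pulls the mean well above the median, which illustrates why the median is often preferred when extreme values are present.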
Which measure is the
“best”?
□ Mean is generally used, unless extreme values
(outliers) exist
□ Then median is often used, since the median is
not sensitive to extreme values.
□ For a relatively small number of extreme
observations (either very small or very large,
but not both), the median is usually better.
□ Choosing:
■ The mode is meaningful on a nominal scale.
■ The median is meaningful on an ordinal scale.
■ The mean is meaningful on an interval/ratio scale.

67
Shape of a Distribution
□ Describes how data is distributed
□ Symmetric or skewed
■ If the distribution is symmetric, then mean=median.
■ If the distribution is skewed to right, then
mode < median < mean
■ If the distribution is skewed to left, then
mode > median > mean

68
Exercise
□ The following data represent the ages of 20
randomly selected managers:
43 44 49 37 45 35 46
32 47 42 39 40 41 45
41 43 50 47 41 51
a) Find the mean, median and mode for the
above data.
b) Which measure would you choose to describe
the data? Why?

69
Measures of Variability
Variability: Range, Variance, Standard Deviation, Coefficient of Variation

70
Measures of Variation
□ Knowing the measures of center is not enough
□ Both of the distributions below have identical
measures of central tendency

Variation: Range, Variance, Standard Deviation, Coefficient of Variation
71
Range
□ Simplest measure of variation
□ Difference between the largest and the
smallest observations:

Range = maximum – minimum

Example:
[Figure: dot plot on a 0 to 14 axis]
Range = 14 - 1 = 13
72
Disadvantages of the Range
□ Ignores the way in which data are distributed

7 8 9 10 11 12 7 8 9 10 11 12
Range = 12 - 7 = 5 Range = 12 - 7 = 5

□ Sensitive to outliers

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
73
Variance
□ Average of squared deviations of values from
the mean
■ Population variance:
$\sigma^2 = \dfrac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$
Where
μ = population mean
N = population size
Xi = ith value of the variable X
■ Sample variance:
$S^2 = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$
Where
$\bar{X}$ = arithmetic mean
n = sample size
Xi = ith value of the variable X
74
Standard Deviation
□ Most commonly used measure of variation
□ The square root of the variance
□ Shows variation about the mean
□ Has the same units as the original data
■ Sample standard deviation:
$S = \sqrt{\dfrac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$
75
Example: Sample Standard
Deviation
Sample Data (Xi): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{X}$ = 16
$S = \sqrt{\dfrac{(10 - \bar{X})^2 + (12 - \bar{X})^2 + (14 - \bar{X})^2 + \cdots + (24 - \bar{X})^2}{n - 1}}$
$= \sqrt{\dfrac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}}$
$= \sqrt{\dfrac{130}{7}} = 4.3095$
76
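
As a check on this example, the sketch below computes the sample standard deviation step by step with the n - 1 divisor and compares it with `statistics.stdev`. For these eight values the squared deviations from the mean of 16 add up to 130. Variable names are illustrative.

```python
import math
import statistics

sample = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(sample)
mean = sum(sample) / n                                 # 128 / 8 = 16

squared_deviations = [(x - mean) ** 2 for x in sample]
sum_sq = sum(squared_deviations)                       # 36+16+4+1+1+4+4+64
sample_variance = sum_sq / (n - 1)                     # divide by n - 1
sample_sd = math.sqrt(sample_variance)

print(mean, sum_sq, round(sample_sd, 4))
# The library function uses the same n - 1 definition.
print(round(statistics.stdev(sample), 4))
```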
Comparing Standard
Deviations
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = .9258
Data C: Mean = 15.5, S = 4.57
[Figure: three dot plots on an 11 to 21 axis]
77
Coefficient of Variation
□ Measures relative variation
□ Always a percentage (%)
□ Shows variation relative to mean
□ Is used to compare two or more sets of
data measured in different units
$CV = \left(\dfrac{S}{\bar{X}}\right) \cdot 100\%$
78
Comparing Coefficients of
Variation
□ Stock A:
■ Average price last year = $50
■ Standard deviation = $5
$CV_A = \left(\dfrac{S}{\bar{X}}\right) \cdot 100\% = \dfrac{\$5}{\$50} \cdot 100\% = 10\%$
□ Stock B:
■ Average price last year = $100
■ Standard deviation = $5
$CV_B = \left(\dfrac{S}{\bar{X}}\right) \cdot 100\% = \dfrac{\$5}{\$100} \cdot 100\% = 5\%$
□ Both stocks have the same standard deviation, but stock B is less variable relative to its price
79
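
The comparison of the two stocks can be sketched with a small helper; the function name `coefficient_of_variation` is an illustrative choice.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return 100 * std_dev / mean

cv_a = coefficient_of_variation(5, 50)    # Stock A: S = $5, mean = $50
cv_b = coefficient_of_variation(5, 100)   # Stock B: S = $5, mean = $100
print(f"CV_A = {cv_a:.0f}%, CV_B = {cv_b:.0f}%")
```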
The Empirical Rule
□ If the data distribution is bell-shaped, then
the interval:
a) (μ − σ, μ + σ) contains about 68.26% of the values in
the population.
b) (μ − 2σ, μ + 2σ) contains about 95.44% of the values
in the population.
c) (μ − 3σ, μ + 3σ) contains about 99.74% of the values
in the population.

80
Example
□ IQs measured on the Stanford Revision of the Binet–
Simon Intelligence Scale have a mean of 100 points and a
standard deviation of 16 points. The interval:
a) (84, 116) contains about 68.26% of the IQ scores.
b) (68, 132) contains about 95.44% of the IQ scores.
c) (52, 148) contains about 99.74% of the IQ scores.
□ The scores of 25 randomly selected people are shown
below.
66 82 86 88 91 95 96 96 97 98
101 102 102 104 105 106 111 112 115
116 118 121 124 127 129
a) 18 scores (72%) fall in the interval (84, 116).
b) 24 scores (96%) fall in the interval (68, 132).
c) 25 scores (100%) fall in the interval (52, 148).
81
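
The counts in this example can be reproduced with the sketch below. It assumes the interval endpoints are counted as inside the interval, which is what gives 18 scores for (84, 116), since one score equals 116 exactly. The helper name `count_within` is illustrative.

```python
scores = [66, 82, 86, 88, 91, 95, 96, 96, 97, 98,
          101, 102, 102, 104, 105, 106, 111, 112, 115,
          116, 118, 121, 124, 127, 129]
mu, sigma = 100, 16  # mean and standard deviation stated for the IQ scale

def count_within(values, center, spread, k):
    """Count values within k standard deviations of the center (endpoints included)."""
    low, high = center - k * spread, center + k * spread
    return sum(1 for v in values if low <= v <= high)

counts = [count_within(scores, mu, sigma, k) for k in (1, 2, 3)]
for k, c in zip((1, 2, 3), counts):
    print(f"within {k} sd: {c} scores ({100 * c / len(scores):.0f}%)")
```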
Exercise
□ The exam scores for the students in an
introductory statistics course are as follows.
88 67 64 76 86 85 82 39 75
34
90 63 89 90 84 81 96 100 70
96
a) Compute the descriptive statistics for the
given exam scores.
b) Apply the empirical rule and check the
consistency with the sample results. Explain
your conclusion.

82
Measures of Relative Standing
□ Percentiles: the pth percentile in a data set (where 0 ≤ p ≤ 100):
■ p% are less than or equal to this value
■ (100 - p)% are greater than or equal to this value
□ Quartiles:
■ 1st quartile = 25th percentile
■ 2nd quartile = 50th percentile = median
■ 3rd quartile = 75th percentile

83
Percentiles
□ The pth percentile in an ordered array of n
values is the value in ith position, where
$i = \dfrac{p}{100}(n + 1)$
□ Example: The 60th percentile in an ordered array
of 19 values is the value in 12th position:
$i = \dfrac{p}{100}(n + 1) = \dfrac{60}{100}(19 + 1) = 12$
□ In Excel, write =percentile(array, k), where array
is the range of data and k is the percentile value
in the range 0-1.
84
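
A small sketch of the position rule i = (p/100)(n + 1). The helper names are illustrative. When the position is not a whole number, the sketch interpolates linearly between the two neighbouring values, which is an assumption that generalizes taking the value halfway between them for a position such as 2.5. Positions below 1 or above n are not handled.

```python
def percentile_position(p, n):
    """1-based position of the pth percentile in an ordered array of n values."""
    return p * (n + 1) / 100

def value_at_position(ordered, position):
    """Value at a 1-based position, interpolating between neighbours if needed."""
    lower = int(position)            # whole-number part of the position
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)

print(percentile_position(60, 19))   # position of the 60th percentile, n = 19

ordered = [11, 12, 13, 16, 16, 17, 18, 21, 22]
q1 = value_at_position(ordered, percentile_position(25, len(ordered)))
print(q1)
```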
Quartiles
□ Quartiles split the ranked data into 4 equal
groups
[Figure: ranked data split into four groups of 25% each by Q1, Q2, and Q3]
□ Example: Find the first quartile
Sample Data in Ordered Array (n = 9): 11 12 13 16 16 17 18 21 22
Q1 = 25th percentile, so find the (25/100)(9+1) = 2.5 position,
so use the value halfway between the 2nd and 3rd values,
so Q1 = 12.5
85
Interquartile Range and Fences
□ Difference between the first and third
quartiles
IQR = Q3 – Q1
□ Inner fences: Located 1.5 × IQR away from
the quartiles:
■ Q1 – (1.5 × IQR)
■ Q3 + (1.5 × IQR)
□ Outer fences: Located 3 × IQR away from the
quartiles:
■ Q1 – (3 × IQR)
■ Q3 + (3 × IQR)
86
Outliers
□ Outliers are measurements that are very different
from other measurements
■ They are either much larger or much smaller than most
of the other measurements
□ Outliers lie beyond the fences of the box-and-
whiskers plot
■ Measurements between the inner and outer fences
are mild outliers
■ Measurements beyond the outer fences are
severe outliers
□ The adjacent values are:
■ The smallest data point that falls above the lower
fence.
■ The largest data point that falls below the upper fence.
87
Box and Whisker Plot
(Boxplot)
□ A Graphical display of data using 5-number
summary:
Minimum -- Q1 -- Median -- Q3 -- Maximum

□ The box plots the:

■ First quartile (Q1), median (Md), third quartile (Q3).
■ Inner fences, outer fences
■ The “whiskers” are dashed lines that plot the range
of the data
□ A dashed line drawn from the box below Q1 down to
the minimum
□ Another dashed line drawn from the box above Q3 up
to the maximum.
88
Distribution shapes and
boxplots

89
How to construct a Boxplot?
1. Determine the quartiles.
2. Determine the outliers and the
potential adjacent values.
3. Draw a horizontal axis on which the numbers
obtained in Steps 1 and 2 can be located. Above
this axis, mark the quartiles and the adjacent
values with vertical lines.
4. Connect the quartiles to each other to make a
box, and then connect the box to the adjacent
values with lines.
5. Plot the potential outlier with an asterisk.
90
Example: Box-and-Whiskers
Plots

91
Example
□ A sample of 20 people yielded the weekly
viewing times, in hours,
25 41 27 32 43 66 35 31 15 5
34 26 32 38 16 30 38 30 20 21
■ The five-number summary is
5 24 30.5 35.75 66
■ IQR=35.75-24=11.75
■ 1.5*IQR=1.5*11.75=17.625
■ Lower Fence=Q1-1.5*IQR=24-17.625=6.375
■ Upper Fence=Q3+1.5*IQR=35.75+17.625=53.375
□ The observations, 5 and 66, lie beyond the inner
fences and hence should be classified as outliers.
The adjacent values are 15 and 43.
92
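
The figures in this example can be reproduced with the sketch below. It assumes the quartiles are computed with the "inclusive" interpolation method of `statistics.quantiles`, which yields the Q1 = 24 and Q3 = 35.75 shown on the slide; the (p/100)(n + 1) position rule would give slightly different quartiles for these data. Variable names are illustrative.

```python
import statistics

viewing_times = [25, 41, 27, 32, 43, 66, 35, 31, 15, 5,
                 34, 26, 32, 38, 16, 30, 38, 30, 20, 21]

# Quartiles using the inclusive method (an assumption matching the slide's values).
q1, median, q3 = statistics.quantiles(viewing_times, n=4, method="inclusive")
iqr = q3 - q1

# Inner fences: 1.5 * IQR beyond the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = sorted(x for x in viewing_times if x < lower_fence or x > upper_fence)
inside = [x for x in viewing_times if lower_fence <= x <= upper_fence]
adjacent_values = (min(inside), max(inside))

print(min(viewing_times), q1, median, q3, max(viewing_times))
print(iqr, lower_fence, upper_fence)
print(outliers, adjacent_values)
```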
Example: Excel output

□ The distribution of the viewing

times is right skewed with two
outliers.

93
Exercise
□ IQs measured on the Stanford Revision of the
Binet–Simon Intelligence Scale. The scores of 25
randomly selected people are shown below.
66 82 86 88 91 95 96 96 97
98 101 102 102 104 105 106
111
112 115 116 118 121 124 127 129
Identify potential outliers, if any, and construct
and interpret a boxplot

Lesson 3
2 pages
Statistics 1 Chapter 1
No ratings yet
Statistics 1 Chapter 1
28 pages
Graph Skewness
No ratings yet
Graph Skewness
6 pages
Lecture 1 Data Overview and Introduction To SPSS VJU
No ratings yet
Lecture 1 Data Overview and Introduction To SPSS VJU
49 pages
Statistics in Education: Distribution
No ratings yet
Statistics in Education: Distribution
79 pages
Statistics Lec 1
No ratings yet
Statistics Lec 1
21 pages
Introduction To Statistics
No ratings yet
Introduction To Statistics
54 pages
Lecture 2
No ratings yet
Lecture 2
23 pages
Topic 1 - Ch1 Ch2
No ratings yet
Topic 1 - Ch1 Ch2
20 pages
PDF Laporan Praktikum Data Mining - Compress
No ratings yet
PDF Laporan Praktikum Data Mining - Compress
142 pages
Statistics
No ratings yet
Statistics
55 pages
TOPIC 1 - Introduction To Statistics in Relation To
No ratings yet
TOPIC 1 - Introduction To Statistics in Relation To
47 pages
Finals MMW Final
No ratings yet
Finals MMW Final
4 pages
Introduction To Statistics: Neeta Pathak
No ratings yet
Introduction To Statistics: Neeta Pathak
22 pages
Stats Book
No ratings yet
Stats Book
102 pages
Lecture1 2 3
No ratings yet
Lecture1 2 3
86 pages
Statistics Lec 1
No ratings yet
Statistics Lec 1
28 pages
Assignment 1 Mean Median Mode
No ratings yet
Assignment 1 Mean Median Mode
9 pages
Introduction Book 1
No ratings yet
Introduction Book 1
41 pages
Chapters 1 and 2chapters 1 and 2chapters 1 and 2chapters 1 and 2chapters 1 and 2
No ratings yet
Chapters 1 and 2chapters 1 and 2chapters 1 and 2chapters 1 and 2chapters 1 and 2
47 pages
Lecture Note On Basic Business Statistics - I Mustafe Jiheeye-1
No ratings yet
Lecture Note On Basic Business Statistics - I Mustafe Jiheeye-1
81 pages
Edup3063i Task 2 Tesl 3 Hanis Hafizah
No ratings yet
Edup3063i Task 2 Tesl 3 Hanis Hafizah
42 pages
Lecture
No ratings yet
Lecture
13 pages
Overall Descriptive Statistics
No ratings yet
Overall Descriptive Statistics
127 pages
Stastical Data Analysis: A Lokeshwari 22N31E0014
No ratings yet
Stastical Data Analysis: A Lokeshwari 22N31E0014
30 pages
Chapter 1 Slides PDF
No ratings yet
Chapter 1 Slides PDF
45 pages
Session 1
No ratings yet
Session 1
11 pages
Basic Business Statistics: Introduction and Data Collection
No ratings yet
Basic Business Statistics: Introduction and Data Collection
33 pages
Business Statistics May Module
No ratings yet
Business Statistics May Module
72 pages
Introduction To Stati Stics: There Are Three Kinds of Lies: Lies, Damned Lies, A ND Statistics." (B.Disraeli)
No ratings yet
Introduction To Stati Stics: There Are Three Kinds of Lies: Lies, Damned Lies, A ND Statistics." (B.Disraeli)
39 pages
Introduction Bus Statistics
No ratings yet
Introduction Bus Statistics
32 pages
Covariance Correlation
No ratings yet
Covariance Correlation
4 pages
Course Introduction Inferential Statistics Prof. Sandy A. Lerio
No ratings yet
Course Introduction Inferential Statistics Prof. Sandy A. Lerio
46 pages
191MA303 - P-S Unit 5
No ratings yet
191MA303 - P-S Unit 5
9 pages
Test 1 Statistics Huflit
No ratings yet
Test 1 Statistics Huflit
8 pages
Aadt1.Csv and Aadt2.Csv From Ublearns - Fit A LR Model Fit1 From Aadt1.Csv
No ratings yet
Aadt1.Csv and Aadt2.Csv From Ublearns - Fit A LR Model Fit1 From Aadt1.Csv
4 pages
Statistics
No ratings yet
Statistics
52 pages
Final Persistent Systmes
No ratings yet
Final Persistent Systmes
7 pages
Nguyễn Phát Thịnh - assignment 11
No ratings yet
Nguyễn Phát Thịnh - assignment 11
6 pages
Lec 1 - Data, Tables and Graphs
No ratings yet
Lec 1 - Data, Tables and Graphs
18 pages
Stat - Prob 11 - Q3 - SLM - WK3
87% (30)
Stat - Prob 11 - Q3 - SLM - WK3
18 pages
1.7 BoxWhisker and Histograms HW KEY
No ratings yet
1.7 BoxWhisker and Histograms HW KEY
4 pages
Unit 4 Business Analytics Notes
No ratings yet
Unit 4 Business Analytics Notes
3 pages
Program To Find The Variance and Standard Deviation of Set of Elements
No ratings yet
Program To Find The Variance and Standard Deviation of Set of Elements
3 pages
Data Sheet Injoon Cha
No ratings yet
Data Sheet Injoon Cha
2 pages
Definition of Statistics
No ratings yet
Definition of Statistics
4 pages
Chapter1 Introduction To Statistics
No ratings yet
Chapter1 Introduction To Statistics
27 pages