Descriptive Statistics: Overview of Using Data

Uploaded by

BUSHRA ZAINAB

Business Analytics

Descriptive Statistics
Overview of Using Data

Lecture # 02
Topics to Be Covered

01 Definitions and Goals

02 Types of Data

03 Types of Measurements

04 Modifying Data in Excel

© 2016 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part,
except for use as permitted in a license distributed with a certain product or service or otherwise on a password-
protected website for classroom use.
Overview of Using Data: Definitions and Goals
• Data
• Variable
• Observation
• Variation
• Random variables

• Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation.
• A variable is any characteristic, number, or quantity that can be measured or counted. It is called a variable because its value may vary between data units in a population and may change over time.
Example: the heights of all the students in a class.
• An observation is a set of values corresponding to a set of variables.
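As a rough illustration of these definitions, the sketch below stores each observation as one record and each variable as one field. The values and the names `observations` and `heights_cm` are invented for illustration and do not come from the lecture.

```python
# Hypothetical class data: each dict is one observation (one student),
# and each key is one variable measured on that student.
observations = [
    {"student": "A", "height_cm": 158, "age": 19},
    {"student": "B", "height_cm": 171, "age": 20},
    {"student": "C", "height_cm": 165, "age": 19},
]

# The values of a single variable across all observations.
heights_cm = [obs["height_cm"] for obs in observations]
print(heights_cm)  # [158, 171, 165]

# The variable "varies": its values differ from one observation to the next.
print(len(set(heights_cm)) > 1)  # True
```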
• Variation is the difference in a variable measured over observations (time, customers, items, etc.).
Example: When we collect data, we are gathering past observed values, or realizations, of a variable. By collecting these past realizations of one or more variables, our goal is to learn more about the variation of a particular business situation.

The role of descriptive analytics is to collect and analyse data to gain a better understanding of variation and its impact on the business setting.

The values of some variables are under the direct control of the decision maker (these are often called decision variables). The values of other variables may fluctuate with uncertainty because of factors outside the direct control of the decision maker. In general, a quantity whose values are not known with certainty is called a random variable, or uncertain variable.
Types of data
• Population and Sample Data
• Quantitative and Categorical Data
• Cross-Sectional and Time Series Data

Data can be categorized in several ways based on how they are collected and the type of data collected.
• In many cases, it is not feasible to collect data from the population of all elements of interest.
• In such instances, we collect data from a subset of the population known as a sample.

What Is a Statistic?

[Figure: several samples drawn from a single population]

Parameter: a value that describes a population

Statistic: a value that describes a sample
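The distinction can be sketched in a few lines of Python. This is only an illustration: the population below is randomly generated, and the names `population`, `sample`, and the sample size of 50 are assumptions, not part of the lecture.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: order values for 1,000 customers.
population = [random.randint(10, 500) for _ in range(1000)]

# Parameter: computed from every element of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample (a subset of the population).
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f}")
print(f"statistic (sample mean):     {sample_mean:.1f}")
```

The two means are usually close but not identical, which is why a statistic is treated as an estimate of the corresponding parameter.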


Quantitative and Categorical Data
Data are considered quantitative if they are numeric and arithmetic operations can be performed on them, such as
• addition,
• subtraction,
• multiplication,
• and division.
For instance, we can sum the values for Volume in the Dow data in Table 2.1 to calculate the total volume of all shares traded by companies included in the Dow.

• If arithmetic operations cannot be performed
on the data, they are considered categorical
data
• For instance, the data in the Industry column
in Table 2.1 are categorical
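A minimal sketch of the difference, using invented rows in the spirit of Table 2.1 (the company names, industries, and volumes below are hypothetical, not the actual Dow data):

```python
from collections import Counter

# Hypothetical rows in the spirit of Table 2.1 (all values are invented).
rows = [
    {"company": "Firm A", "industry": "Technology", "volume": 1_200_000},
    {"company": "Firm B", "industry": "Retail", "volume": 800_000},
    {"company": "Firm C", "industry": "Technology", "volume": 500_000},
]

# Quantitative variable: arithmetic is meaningful, so we can sum it.
total_volume = sum(row["volume"] for row in rows)
print(total_volume)  # 2500000

# Categorical variable: industries cannot be added, but they can be counted.
industry_counts = Counter(row["industry"] for row in rows)
print(industry_counts["Technology"])  # 2
```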

Cross-Sectional and Time Series Data
• Cross-sectional data are collected from several entities at the same, or approximately the same, point in time.

The data in Table 2.1 are cross-sectional because they describe the 30 companies that comprise the Dow at the same point in time (July 2015).

• Time series data are collected over several time periods.

Graphs of time series data are frequently found in business and economic publications.
Types of measurement
• When gathering data, we collect data from individual cases on particular variables.
• A variable is a unit of data collection whose value can vary.
• Variables can be classified into types according to the level of mathematical scaling that can be carried out on the data.
• There are four types of data, or levels of measurement:

1. Categorical (Nominal)
2. Ordinal
3. Interval
4. Ratio

Categorical (Nominal) data
• What does this mean? No mathematical
operations can be performed on the data
relative to each other.
• Therefore, nominal data reflect qualitative
differences rather than quantitative ones.
• Nominal measurements only permit you to
determine whether two individuals are the
same or different.
Nominal data
Examples:
• What is your gender? (please tick): Male / Female
• Did you enjoy the film? (please tick): Yes / No

Ordinal data
• Ordinal data comprise categories that can be rank ordered.
• As with nominal data, the distance between categories cannot be calculated, but the categories can be ranked above or below each other.
• No fixed units of measurement
Examples:
‒ college football rankings
‒ survey responses (poor, average, good, very good, excellent)
What does this mean? We can make statistical judgements and perform limited maths.
Interval and Ratio data
• Both interval and ratio data are examples of scale data.
• Scale data:
• the data are in numeric format ($50, $100, $150)
• the data can be measured on a continuous scale
• the distance between values can be observed and, as a result, measured
• the data can be placed in rank order.

Interval data
• Ordinal data but with constant differences between observations
• Examples:
• Time – moves along a continuous measure of seconds, minutes, and so on, and is without a zero point of time.
• Temperature – moves along a continuous measure of degrees and is without a true zero.
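To see why the missing true zero matters, here is a small sketch comparing two Celsius readings. The readings and the helper `celsius_to_kelvin` are illustrative assumptions; the Kelvin scale is brought in only because it has a true zero.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert Celsius (interval scale) to Kelvin (a scale with a true zero)."""
    return celsius + 273.15


morning_c, afternoon_c = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference = afternoon_c - morning_c
print(difference)  # 10.0

# The Celsius ratio is 2.0, but that does not mean "twice as hot",
# because 0 degrees Celsius is an arbitrary zero point.
naive_ratio = afternoon_c / morning_c
print(naive_ratio)  # 2.0

# On the Kelvin scale, which has a true zero, the ratio is about 1.035.
kelvin_ratio = celsius_to_kelvin(afternoon_c) / celsius_to_kelvin(morning_c)
print(round(kelvin_ratio, 3))  # 1.035
```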

Ratio data
• Ratio data are measured on a continuous scale and have a natural zero point
• Ratios are meaningful
Examples:
‒ Monthly sales
‒ Delivery times
‒ Weight
‒ Height
‒ Age

Data for Business Analytics
Classifying Data Elements in a Purchasing Database

[Figure: purchasing database table with each column labeled by its measurement level (categorical, ratio, or interval)]

• Refer to the textbook, p. 24

Modifying Data in Excel
Sorting Data in Excel
• Step 1. Select cells A1:F21
• Step 2. Click the Data tab in the Ribbon
• Step 3. Click Sort in the Sort & Filter group
• Step 4. Select the check box for My data has headers
• Step 5. In the first Sort by dropdown menu, select Sales
(March 2010)
• Step 6. In the Order dropdown menu, select Largest to
Smallest (see Figure 2.4)
• Step 7. Click OK
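For readers working outside Excel, the same sort can be sketched in Python. The rows and field names below (`car_sales`, `sales_march_2010`) are invented stand-ins for the worksheet in cells A1:F21, which is not reproduced here.

```python
# Hypothetical worksheet rows (invented values standing in for A1:F21).
car_sales = [
    {"model": "Model A", "manufacturer": "Toyota", "sales_march_2010": 120},
    {"model": "Model B", "manufacturer": "Honda", "sales_march_2010": 340},
    {"model": "Model C", "manufacturer": "Ford", "sales_march_2010": 95},
    {"model": "Model D", "manufacturer": "Toyota", "sales_march_2010": 210},
]

# Equivalent of Sort by "Sales (March 2010)" with Order "Largest to Smallest".
sorted_rows = sorted(
    car_sales, key=lambda row: row["sales_march_2010"], reverse=True
)

for row in sorted_rows:
    print(row["model"], row["sales_march_2010"])
```

`sorted` returns a new list and leaves `car_sales` unchanged, whereas Excel's Sort reorders the selected cells in place.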
Filtering
• Step 1. Select cells A1:F21
• Step 2. Click the Data tab in the Ribbon
• Step 3. Click Filter in the Sort & Filter group
• Step 4. Click on the Filter Arrow in column B, next to
Manufacturer
• Step 5. If all choices are checked, you can easily deselect all choices by unchecking (Select All). Then select only the check box for Toyota.
• Step 6. Click OK
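The filter in the steps above can be sketched in Python as a list comprehension. The rows are invented for illustration; only the Manufacturer column and the Toyota selection come from the slide.

```python
# Hypothetical worksheet rows (invented values standing in for A1:F21).
car_sales = [
    {"model": "Model A", "manufacturer": "Toyota", "sales_march_2010": 120},
    {"model": "Model B", "manufacturer": "Honda", "sales_march_2010": 340},
    {"model": "Model C", "manufacturer": "Ford", "sales_march_2010": 95},
    {"model": "Model D", "manufacturer": "Toyota", "sales_march_2010": 210},
]

# Equivalent of ticking only "Toyota" in the Manufacturer filter.
toyota_rows = [row for row in car_sales if row["manufacturer"] == "Toyota"]

for row in toyota_rows:
    print(row["model"], row["sales_march_2010"])
```

As in Excel, the other rows are hidden from the result rather than deleted from the original data.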
Creating Distributions from Data
• Distributions help summarize many
characteristics of a data set by describing how
often certain values for a variable appear in
that data set.
• Distributions can be created for both
categorical and quantitative data, and they
assist the analyst in determining variation.

Frequency Distributions for Categorical Data
• A frequency distribution is a summary of data that shows the number (frequency) of observations in each of several nonoverlapping classes, typically referred to as bins.

Frequency Distribution
Consider a data set of 26 children of ages 1-6 years. Then the frequency
distribution of variable ‘age’ can be tabulated as follows:

Frequency Distribution of Age

Age        1  2  3  4  5  6
Frequency  5  3  7  5  4  2

Grouped Frequency Distribution of Age

Age Group  1-2  3-4  5-6
Frequency    8   12    6
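The grouped table can be derived from the ungrouped one by adding the frequencies that fall in each age group. A short sketch (the variable names are illustrative; the frequencies are those given in the tables):

```python
# Ungrouped frequencies: age -> number of children.
age_frequency = {1: 5, 2: 3, 3: 7, 4: 5, 5: 4, 6: 2}

# Age groups as inclusive (low, high) pairs.
age_groups = [(1, 2), (3, 4), (5, 6)]

# Add up the frequencies of all ages that fall inside each group.
grouped = {
    f"{low}-{high}": sum(
        count for age, count in age_frequency.items() if low <= age <= high
    )
    for low, high in age_groups
}

print(grouped)                      # {'1-2': 8, '3-4': 12, '5-6': 6}
print(sum(age_frequency.values()))  # 26
```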

Example: 1
• A survey was taken on Maple Avenue. In each of 20 homes, people were asked how many cars were registered to their household.
• The results were recorded as follows:
3, 1, 4, 0, 2, 1, 5, 2, 1, 5, 4, 2, 3, 2, 0, 2, 1, 0, 3, 2.
• Present these data in a frequency distribution table.
• Also find the maximum number of cars registered to a household.
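One possible way to check a hand-built table for this example is to tally the survey results in Python. This is a sketch, not the solution given in class; `collections.Counter` does the counting.

```python
from collections import Counter

# Survey results: number of cars registered to each of the 20 households.
cars_per_home = [3, 1, 4, 0, 2, 1, 5, 2, 1, 5, 4, 2, 3, 2, 0, 2, 1, 0, 3, 2]

# Tally how many households reported each number of cars.
frequency = Counter(cars_per_home)

print("Cars  Frequency")
for cars in sorted(frequency):
    print(f"{cars:>4}  {frequency[cars]:>9}")

# Largest number of cars registered to any one household.
max_cars = max(cars_per_home)
print(max_cars)  # 5
```

The frequencies come out as 3, 4, 6, 3, 2, and 2 for 0 through 5 cars, which sum to the 20 homes surveyed.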
Example: 2

Solution ?
• Discussed in class

Relative Frequency and Percent Frequency Distributions

• A relative frequency distribution is a tabular summary of data showing the relative frequency for each bin.

• A percent frequency distribution summarizes the percent frequency of the data for each bin.

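Relative Frequency and Percent Frequency Distributions

The relative frequency of a bin is its frequency divided by the total number of observations. For example, the relative frequency for Coca-Cola is 19/50 = 0.38, for Diet Coke it is 8/50 = 0.16, and so on.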
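The arithmetic can be sketched as follows. Only the two counts quoted on the slide are used; the remaining soft-drink counts are not shown in this lecture text, so they are left out rather than guessed.

```python
total_responses = 50

# Only the two frequencies quoted on the slide; other bins are omitted.
frequencies = {"Coca-Cola": 19, "Diet Coke": 8}

# Relative frequency = frequency / total number of observations.
relative = {drink: count / total_responses for drink, count in frequencies.items()}

# Percent frequency = relative frequency expressed as a percentage.
percent = {drink: 100 * count / total_responses for drink, count in frequencies.items()}

for drink in frequencies:
    print(f"{drink}: relative = {relative[drink]:.2f}, percent = {percent[drink]:.0f}%")
```

Across all bins of a complete distribution, the relative frequencies sum to 1 and the percent frequencies sum to 100.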

Example: 3

Frequency Distributions for Quantitative Data
• Consider the quantitative data in Table 2.6

• These data show the time in days required to complete year-end audits for a sample of 20 clients of Sanderson and Clifford, a small public accounting firm. The three steps necessary to define the classes for a frequency distribution with quantitative data are as follows:

1. Determine the number of nonoverlapping bins.
2. Determine the width of each bin.
3. Determine the bin limits.
• Number of Bins: Bins are formed by specifying the ranges used to group the data.
• Width of the Bins: Choose a width for the bins.

Approximate bin width = (largest value − smallest value)/(number of bins) = (33 − 12)/5 = 4.2, which is rounded up to 5.
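The bin-width calculation can be written as a small helper. The function name `approximate_bin_width` is illustrative; the inputs 33, 12, and 5 are the values used on the slide, read here as the largest value, the smallest value, and the number of bins.

```python
import math


def approximate_bin_width(largest: float, smallest: float, number_of_bins: int) -> float:
    """Range of the data divided by the number of bins."""
    return (largest - smallest) / number_of_bins


raw_width = approximate_bin_width(33, 12, 5)
print(raw_width)  # 4.2

# Round up to a convenient whole number so the bins cover the full range.
bin_width = math.ceil(raw_width)
print(bin_width)  # 5
```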

• Bin Limits: Bin limits must be chosen so that each data item belongs to one and only one class.

Choose lower and upper bin limits to obtain a total of five classes:
10–14,
15–19,
20–24,
25–29,
30–34.
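A sketch of how these limits could be generated and checked, assuming whole-number data (audit times in days) and a first lower limit of 10. The helpers `make_bins` and `find_bin` are hypothetical names introduced for illustration.

```python
def make_bins(first_lower: int, width: int, count: int) -> list[tuple[int, int]]:
    """Build inclusive (lower, upper) limits for whole-number data."""
    return [
        (first_lower + i * width, first_lower + (i + 1) * width - 1)
        for i in range(count)
    ]


def find_bin(value: int, bins: list[tuple[int, int]]) -> tuple[int, int]:
    """Return the single bin containing value; fail if there is not exactly one."""
    matches = [(low, high) for low, high in bins if low <= value <= high]
    if len(matches) != 1:
        raise ValueError(f"{value} belongs to {len(matches)} bins, expected 1")
    return matches[0]


bins = make_bins(first_lower=10, width=5, count=5)
print(bins)  # [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34)]

print(find_bin(12, bins))  # (10, 14)
print(find_bin(33, bins))  # (30, 34)
```

Because each upper limit is one less than the next lower limit, no whole-number value can fall into two bins or between bins.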

Example: 4

• Step 1. Select cells B10:B14
• Step 2. Type the formula =FREQUENCY(A2:D6, A10:A14). The range A2:D6 defines the data set, and the range A10:A14 defines the bins.
• Step 3. Press CTRL+SHIFT+ENTER after typing the formula in Step 2.
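To make the counting rule concrete, here is a rough Python imitation of what FREQUENCY computes. It assumes the bin cells hold the upper limits 14, 19, 24, 29, and 34 in ascending order, and it uses a short invented data list rather than the actual Table 2.6 values; `excel_like_frequency` is a hypothetical name.

```python
def excel_like_frequency(data: list[float], upper_limits: list[float]) -> list[int]:
    """Count values per bin: each bin holds values above the previous limit
    and up to its own limit; one extra final count holds values above the
    last limit. Assumes upper_limits is sorted in ascending order."""
    counts = [0] * (len(upper_limits) + 1)
    for value in data:
        for i, limit in enumerate(upper_limits):
            if value <= limit:
                counts[i] += 1
                break
        else:
            # No limit was large enough: the value goes in the overflow count.
            counts[-1] += 1
    return counts


audit_days = [12, 33, 18, 22, 15, 27, 14, 21]  # hypothetical, not Table 2.6
upper_limits = [14, 19, 24, 29, 34]

print(excel_like_frequency(audit_days, upper_limits))  # [2, 2, 2, 1, 1, 0]
```

The result has one more entry than there are bins; selecting only five output cells, as in Step 1, simply leaves that overflow count undisplayed.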

Data Presentation
There are two types of statistical presentation of data: graphical and numerical.

In a graphical presentation, we look for the overall pattern and for striking deviations from that pattern. The overall pattern is usually described by the shape, center, and spread of the data. An individual value that falls outside the overall pattern is called an outlier.

• Bar charts and pie charts are used for categorical variables.
• Histograms, stem-and-leaf plots, and box plots are used for numerical variables.

Histograms
• Step 1. Click the Data tab in the Ribbon
• Step 2. Click Data Analysis in the Analyze group
• Step 3. When the Data Analysis dialog box opens, choose Histogram from the list of Analysis Tools, and click OK
• In the Input Range: box, enter A2:D6
• In the Bin Range: box, enter A10:A14
• Under Output Options:, select New Worksheet Ply:
• Select the check box for Chart Output (see Figure 2.13)
• Click OK
A common graphical presentation
of quantitative data is a histogram
