Statistics - Basic Concepts Part 1

The document discusses different types of variables and measurement scales used in data analysis including nominal, ordinal, interval, ratio scales as well as histograms and measures of central tendency and dispersion.

Uploaded by

K.P.S Drones
Part 1

Data Analytics for Decision Making
• Statistical concepts and methods
• Measures of central tendency and dispersion
• Histograms
• From histograms to probability distributions
• The Normal distribution
• Confidence intervals
• Hypothesis tests
Introduction to descriptive statistics …
Key definitions
Population
Whole set of individuals or objects to be studied
Example = All PC buyers

Sample
Representative part of the population selected through different methods
Results from sample will be considered for the whole population
Example = PC buyers that were interviewed

Parameter
A descriptive measure that applies to the whole population
Example = Average number of PCs purchased per consumer per year

Statistic
A descriptive measure computed from a sample, used to estimate a parameter
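
The relationship between these four terms can be illustrated with a short Python sketch. The population below is simulated, so every number in it is hypothetical; only the roles (population, sample, parameter, statistic) come from the definitions above.

```python
import random
import statistics

# Hypothetical population: PCs purchased per year by each of 1,000 buyers.
random.seed(42)  # fixed seed so the run is repeatable
population = [random.randint(0, 5) for _ in range(1000)]

# Parameter: a measure computed on the whole population.
population_mean = statistics.mean(population)

# Sample: the subset of buyers that were actually interviewed.
sample = random.sample(population, 50)

# Statistic: the same measure computed on the sample, used as an estimate.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice the parameter is usually unknown, which is why the statistic is needed.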
Measures of central tendency
Mode
The observation that occurs most frequently in a set of data
Qualitative and quantitative data / Not sensitive to extreme values

Median
The middle observation after all data have been placed in rank order
Quantitative or ordinal data / Not sensitive to extreme values

Mean or Average
The sum of all scores divided by the number of these scores
Most used measure / Only quantitative data / Sensitive to extreme values
• Mean versus Median
• When is the Median a better summary description of data than the Mean?
• Let's take a small firm with seven employees and the following salaries:

• 28,000 $
• 33,000 $
• 33,000 $
• 34,000 $
• 37,000 $
• 40,000 $
• 400,000 $
• What is the ‘typical’ salary in this group?
• Mean ≈ 86,429 $
• Median = 34,000 $

The Mean is influenced to a greater extent by extreme observations
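
As a quick check, the seven salaries can be run through Python's standard `statistics` module. This is only a sketch; the second half, which drops the 400,000 $ salary, is an added illustration and not part of the original slide.

```python
import statistics

# Salaries of the seven employees, in dollars.
salaries = [28_000, 33_000, 33_000, 34_000, 37_000, 40_000, 400_000]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)
print(round(mean_salary))   # 86429
print(median_salary)        # 34000

# Added illustration: remove the extreme salary and recompute.
without_outlier = salaries[:-1]
mean_without = statistics.mean(without_outlier)
median_without = statistics.median(without_outlier)
print(round(mean_without))  # 34167
print(median_without)       # 33500.0
```

Dropping one extreme value moves the mean by more than 50,000 $, while the median changes by only 500 $.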


Measures of dispersion or variability
Variance
Average squared deviation from the mean
$s^2 = \sum_i (x_i - m)^2 / N$, where $x_i$ is each score, $m$ is the mean and $N$ is the number of scores
Squaring turns all differences into positive values

Standard deviation
Square root of the variance
SD or $s = \sqrt{s^2} = \sqrt{\sum_i (x_i - m)^2 / N}$
The more dispersed the data, the higher the SD

Range
Difference between highest and lowest values
Strongly influenced by extreme scores
Measures of Dispersion / Spread
Firm 1              Firm 2
$34,500             $35,800
$30,700             $25,500
$32,900             $31,600
$36,000             $41,700
$34,100             $35,300
$33,800             $33,800
$32,500             $30,800
Mean = $33,500      Mean = $33,500
Median = $33,800    Median = $33,800
Both firms have the same mean and median, but the salaries in Firm 2 are spread over a wider range.
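
A minimal Python sketch of the dispersion measures, applied to the two firms. It assumes the population form of the variance (dividing by N); the function names are illustrative.

```python
import math

def variance(scores):
    """Average squared deviation from the mean (divides by N)."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores) / len(scores)

def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))

def value_range(scores):
    """Difference between the highest and lowest values."""
    return max(scores) - min(scores)

firm_1 = [34_500, 30_700, 32_900, 36_000, 34_100, 33_800, 32_500]
firm_2 = [35_800, 25_500, 31_600, 41_700, 35_300, 33_800, 30_800]

for name, salaries in (("Firm 1", firm_1), ("Firm 2", firm_2)):
    sd = standard_deviation(salaries)
    print(f"{name}: SD = {sd:,.0f}  range = {value_range(salaries):,}")
```

Firm 2 has roughly three times the standard deviation and range of Firm 1, even though the two firms share the same mean and median.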
Graphing your data
[Figure: example charts for categories A to E with a vertical axis from 0 to 25; one panel uses a numeric horizontal axis from 0 to 6]
Histograms
Similar to a bar graph, but used to display the frequencies of the values of a
quantitative or qualitative variable rather than the quantitative values themselves

[Figure: two histograms with a vertical axis from 0 to 30. Left: Age of participants (26-29, 30-33, 34-37, 38-41, 42 & more). Right: Country of origin (UK, USA, France, China, Italy)]
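
The counting behind an age histogram can be sketched in Python as follows. The ages are hypothetical and the helper `age_class` is an illustrative name; the sketch assumes every age is 26 or more, so that the classes from the age histogram apply.

```python
from collections import Counter

# Hypothetical ages of survey participants (not taken from the slides).
ages = [26, 27, 29, 30, 31, 31, 33, 34, 35, 36, 36, 37, 38, 40, 42, 45]

def age_class(age):
    """Map an age (assumed >= 26) to its class label."""
    if age >= 42:
        return "42 & more"
    lower = 26 + 4 * ((age - 26) // 4)  # classes are 4 years wide
    return f"{lower}-{lower + 3}"

# Frequency of each class: this is what the bar heights represent.
frequencies = Counter(age_class(a) for a in ages)

for label in ["26-29", "30-33", "34-37", "38-41", "42 & more"]:
    count = frequencies[label]
    print(f"{label:>10} | {'#' * count} ({count})")
```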


Histograms and cumulative probability
A histogram ordered in descending order of number of occurrences
allows the construction of a graph of cumulative probability

[Figure: Number of cups won in UEFA Champions League, by country (E, UK, I, D, NL, P, F, ROM, YUG). Left axis: Number of occurrences (0 to 30). Right axis: Accumulated probability (0 to 100)]
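
The construction can be sketched in Python: sort the categories by descending count, then accumulate. The category labels and counts below are hypothetical placeholders, not the UEFA figures from the chart.

```python
from itertools import accumulate

# Hypothetical number of occurrences per category.
counts = {"A": 5, "B": 20, "C": 10, "D": 15}

# Step 1: order the categories by descending number of occurrences.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Step 2: running total of occurrences, expressed as a percentage.
total = sum(counts.values())
running = list(accumulate(count for _, count in ordered))
cumulative_pct = [100 * r / total for r in running]

for (label, count), pct in zip(ordered, cumulative_pct):
    print(f"{label}: {count:>3}  cumulative {pct:5.1f}%")
```

The last cumulative value is always 100%, since by then every occurrence has been counted.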


Type of variables
• Most common classifications:

A. By the nature of the measurement:
✓ Qualitative
✓ Quantitative

B. By the relationship of variables:
✓ Cause - Effect

BASIC MEASUREMENT SCALES:
• Nominal Scale
• Ordinal Scale
• Interval Scale
• Ratio Scale

Measurement Scales
Qualitative Variables

• Can only be expressed in qualitative terms, by setting categories, levels, hierarchies, etc.

• Nominal Variables: A nominal variable is a type of variable that is used to name, label or categorize particular attributes that are being measured. It takes qualitative values representing different categories, and there is no intrinsic ordering of these categories. You can code nominal variables with numbers, but the order is arbitrary and arithmetic operations cannot be performed on the numbers.
Examples:
• Place of residence:
• 1: Madrid, 2: Barcelona, 3: Valencia
• Gender:
• 1: Male, 2: Female

• Ordinal Variables: The number indicates an ordered relationship. The median and mode can be analyzed.
Example:
How satisfied are you with our service?
5: Very satisfied
4: Satisfied
3: Indifferent
2: Dissatisfied
1: Very dissatisfied
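
A small Python sketch of which summaries make sense for each kind of qualitative variable. The responses are hypothetical; the codes follow the Place of residence and satisfaction examples above.

```python
import statistics

# Nominal: the codes are only labels, so the mode is the only sensible summary.
city_codes = {1: "Madrid", 2: "Barcelona", 3: "Valencia"}
residence = [1, 2, 2, 3, 2, 1, 2]        # hypothetical responses
most_common_city = city_codes[statistics.mode(residence)]

# Ordinal: the codes are ordered, so the median is meaningful as well.
satisfaction = [5, 4, 4, 2, 3, 5, 4]     # hypothetical responses, 1 to 5
median_satisfaction = statistics.median(satisfaction)
mode_satisfaction = statistics.mode(satisfaction)

print(most_common_city)      # Barcelona
print(median_satisfaction)   # 4
print(mode_satisfaction)     # 4
```

Averaging the residence codes would also run without error, but the result would depend entirely on the arbitrary numbering of the cities.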
Quantitative Variables
• These are variables whose values result from counting or measuring something.
Examples: height, weight, time, number of items sold to a shopper...
• They can be classified as:

• Continuous: When the values are measured. They can have decimal values.
Example: Weight (76.2 kg – 84.5 kg – 52.7 kg)
• Discrete: When the values are counted and therefore can only take integer values.
Example: Number of students (25 – 34 – 29)
• Interval: They allow exact distances between the assigned values to be quantified.
Example: Income, Number of inhabitants (1: 0-270.000, 2: 1001-5000)
A ratio scale is a type of measurement scale that has a true zero point and equidistant intervals.
This means that it determines not only the order of values but also the difference between values,
and it allows for meaningful ratios of the measured quantity. Because of the presence of a true zero,
one can say that a value on a ratio scale is twice as much as another value, and a value of zero means
that the measured quantity is absent. This characteristic distinguishes the ratio scale from the nominal,
ordinal, and interval scales.

Examples of ratio scales in marketing research include:


1. Sales figures: If one product sold 200 units and another sold 100 units, one can infer that the first
product sold twice as much as the second.
2. Time taken by a consumer to decide on a purchase.
3. Age of consumers: If one consumer is 20 years old and another is 40, one can state that the second
consumer is twice as old as the first.
4. Money spent by customers: The exact amount a customer spends on a product allows for clear
comparisons and ratios between different customers.

These examples show that with ratio scales we can not only order data and determine differences,
but also make meaningful statements about the ratio of values.
Measurement Scales
Other Measurement Scales
• Likert:
• Likert Scale Examples for Surveys

Agreement
To which extent do you agree with the previous statement?
1 = Strongly Agree
2 = Agree
3 = Undecided
4 = Disagree
5 = Strongly Disagree

Frequency
How often do you purchase product A?
5 = Always
4 = Often
3 = Sometimes
2 = Rarely
1 = Never
Measurement Scales
Other Measurement Scales
• Dichotomous: a question that has two possible responses
• Example:
• Yes/No
• True/False
• Agree/Disagree

• Semantic Differential: A survey or questionnaire rating scale that asks people to rate a product, company, brand, etc., on a multi-point scale. The response options are anchored by grammatically opposite adjectives at each end, for example “very satisfied-very unsatisfied”, with intermediate options in between.
Example:
• How likely are you to purchase our product again?

1. Very likely   2.   3.   4.   5. Very unlikely
Cause – Effect Variables
• Independent Variable: Variables or factors that are the cause, or that explain a phenomenon.

• Dependent Variable: The dependent variable is the effect. Its value depends on changes in the independent variables.

• Examples:
Sales = Dependent Variable
Advertising expenditure, Sales force size, … = Independent Variables
Types of statistical analysis
A. Descriptive Statistics
✓ Frequency Table
✓ Central Tendency Measures
✓ Dispersion Measures
✓ Position Measures

B. Inferential Statistics
✓ Bivariate
✓ Multivariate

Descriptive Statistics

• The main objective of this statistical approach is to summarize or describe the important characteristics of a dataset.

• Some measures that are commonly used to describe a data set are measures of central tendency (mean, median, mode) and measures of variability (standard deviation/variance, minimum, maximum).
Frequency Table
• A frequency distribution is a list, table or graph that displays the frequency of various outcomes in a dataset. Each value in the table contains the frequency or count of the occurrences of values within a particular group or interval.

Value   Degree of agreement   Count   %
1       Strongly agree        200     17%
2       Agree somewhat        350     30%
3       Not sure              240     21%
4       Disagree somewhat     225     19%
5       Strongly disagree     145     13%
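
The percentage column of a frequency table is each count divided by the total number of responses. A short Python sketch using the counts from the agreement table (the variable names are illustrative):

```python
# Counts per degree of agreement, taken from the frequency table.
counts = {
    "Strongly agree": 200,
    "Agree somewhat": 350,
    "Not sure": 240,
    "Disagree somewhat": 225,
    "Strongly disagree": 145,
}

total = sum(counts.values())  # total number of responses

# Share of each category, as a percentage of all responses.
percentages = {label: 100 * count / total for label, count in counts.items()}

for value, (label, count) in enumerate(counts.items(), start=1):
    print(f"{value}  {label:<18} {count:>5}  {percentages[label]:5.1f}%")
```

Printed to one decimal place, the shares are 17.2%, 30.2%, 20.7%, 19.4% and 12.5%, which match the whole-number percentages in the table after rounding.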
Central Tendency Measures
• The most commonly used central tendency measures in market research studies are:

1) Mean (arithmetic): Generally the most important of all numerical measures used to describe data; also called the average.

2) Median: The median is the middle value. To find the median, we order the data from smallest to largest and then find the data point that has an equal number of values above it and below it.

3) Mode: The mode is the value that occurs most frequently in the data set. On a bar chart, the mode is the highest bar.
Measures of Dispersion
In Statistics, dispersion, or variability, is the extent to which a distribution is stretched or squeezed. The main statistics used to describe dispersion are:
. The Standard Deviation (SD), and the Variance (which is the square of the Standard Deviation)
. The Range, which is the difference between the Maximum (largest) and Minimum (smallest) Values
Inferential Statistics
• Statistical inference is the process of using statistical analysis to deduce properties of a larger population from a dataset sampled from it. It infers properties of the population, for example by testing hypotheses and deriving estimates.
Inferential Statistics
1. Dependency Methods:
. A regression analysis allows us to find out to what extent one variable can be determined knowing another (or others). It is used to predict the behavior of certain variables from others, such as the profits of a film from marketing spending and production spending.
. A discriminant analysis can give us a function that can be used to distinguish between two or more groups, and so to support the decisions we make.
. Logistic regression makes it possible to run a regression analysis that estimates and tests the influence of one or more variables on another when the dependent or response variable is dichotomous.
2. Interdependency Methods:
. The analysis of Principal Components seeks to determine a smaller system of variables that summarizes the original system.
. Cluster analysis classifies a sample of entities (individuals or variables) into a small number of groups so that observations belonging to a group are very similar to each other and very dissimilar from the rest. Unlike Discriminant Analysis, the number and composition of these groups is unknown in advance.
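
As an illustration of the simplest dependency method, here is a sketch of a one-variable linear regression fitted by ordinary least squares in plain Python. The advertising and sales figures are hypothetical, and `fit_line` is an illustrative helper, not a library function.

```python
# Hypothetical data: advertising expenditure (independent variable)
# and sales (dependent variable), in arbitrary units.
advertising = [10, 20, 30, 40, 50]
sales = [27, 44, 66, 83, 105]

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(advertising, sales)

# Use the fitted line to predict sales for a new advertising budget.
predicted_sales = intercept + slope * 60
print(f"sales = {intercept:.2f} + {slope:.2f} * advertising")
print(f"predicted sales at 60: {predicted_sales:.1f}")
```

With several explanatory variables, or a dichotomous dependent variable, a statistics package would normally be used in place of a hand-written fit.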
PRACTICE WITH EXCEL
Excel Add-ins Analysis Tool Pack/Solver Installation
PRACTICE WITH EXCEL AND SPSS
OR…SEE ALSO (Optional):
. realstatistics Excel Add-in (free)

https://www.real-statistics.com/