Statistical Measures

The document outlines descriptive statistical measures, including differences between populations and samples, and various measures such as location (mean, median, mode), dispersion (range, variance, standard deviation), shape (skewness, kurtosis), and association (covariance, correlation). It provides examples and applications for calculating these measures, as well as discussing the implications of skewness and kurtosis on data interpretation. Additionally, it emphasizes the importance of understanding these measures for making valid inferences in statistical analysis.


Descriptive Statistical Measures

Learning objectives
• Explain the difference between Populations and Samples
• Understand and be able to distinguish and apply the different measures:
  • Measures of Location (Mean, Median, Mode)
  • Measures of Dispersion (Range, Variance, Standard Deviation, Chebyshev’s Theorem, Coefficient of Variation)
  • Measures of Shape (Skewness, Kurtosis)
  • Measures of Association (Covariance and Correlation)
• Be able to identify outliers
Populations and Samples
• Population – all items of interest for a particular decision or investigation (the entire group of items)
  • all married drivers over 25 years old
  • all subscribers to Netflix
• Sample – a subset of the population
  • a list of married drivers over 25 years old who bought a new car in the past year
  • a list of individuals who rented a comedy from Netflix in the past year
• The purpose of sampling is to obtain sufficient information to draw a valid inference about a population.
Measures of Location – Mean

The average value of a variable.

For a population of size N (population mean):
\mu = \frac{\sum_{i=1}^{N} x_i}{N}

For a sample of n observations (sample mean):
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}

The mean is also commonly known as the average.

Measures of Location – Mean
Example: Computing Mean Cost per Order (Using Purchase Orders data)
- Using formula:
Mean = $2,471,760/94
= $26,295.32
Measures of Location – Median
~ middle value of the data when arranged from least to greatest

Example: Finding the Median Cost per Order (Purchase Orders data)

Sort the data in column B.


Since n = 94,
Median = $15,656.25
(average of 47th &
48th observations)

NOTE: If n is odd, the median is the value of the middle observation; if n is even, the median is the average of the two middle values.
Using R to compute mean and median
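A minimal R sketch of these two functions; the file name and the column name Cost.per.order are assumptions (adjust them to your copy of the Purchase Orders data):

# read the Purchase Orders data (assumed file and column names)
po <- read.csv("PurchaseOrders.csv")

mean(po$Cost.per.order)    # arithmetic mean of Cost per order
median(po$Cost.per.order)  # middle value of the sorted costs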
Measures of Location – Mode
~ observation that occurs most often or, for grouped data, the group with the greatest frequency.

Eg for observation data: Finding the Mode of A/P terms (Purchase Orders data)
Mode of A/P terms = 30 months

NOTE: R has no built-in mode function. Use the table() function to build a frequency table and take the value with the greatest frequency.
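A short sketch of that approach, reusing the assumed po data frame; the column name A.P.Terms is an assumption:

freq <- table(po$A.P.Terms)   # frequency of each A/P terms value
freq                          # inspect the full frequency table
names(which.max(freq))        # value with the greatest frequency = the mode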
Measures of Location – Mode
~ observation that occurs most often or, for grouped data, the group
with the greatest frequency.

Eg for grouped data: Finding the Mode of Cost per order (Purchase Orders data)

• Mode is the group between $0 and $20,000.
Measures of Location: Application
Problem: Quoting Computer Repair Times
Data set (Computer Repair Times) includes 250 repair times for customers.
– What repair time would be reasonable to quote to a new customer?

# Use R functions for Mean & Median
# Compute Mode: use the table() function to obtain frequencies for each value of X

Measures of Location: Application
Problem: Quoting Computer Repair Times
Data set (Computer Repair Times) includes 250 repair times for customers.
– What repair time would be reasonable to quote to a new customer?
– Mean repair time is about 15 days
– Median repair time: 2 weeks
– Mode is 12 and 15 days
Measures of Dispersion
• Dispersion refers to the degree of variation (numerical spread or compactness) in the data
• Range is the difference between the maximum and minimum data values
• Interquartile range (IQR) is the difference between the third and first quartiles (Q3 – Q1); it uses the middle 50% of the data
• Variance is an average of squared deviations from the mean (uses all data values); in R: var()
• Standard Deviation is the square root of the variance; in R: sd()

For X = the computer repair times variable: Range = ? IQR = ? (see the R sketch below)
Install and load the psych package for additional summary functions: install.packages("psych"); library(psych)
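A sketch answering those questions for the repair_time vector defined in the earlier sketch (an assumed name), using base R plus psych:

range(repair_time)            # minimum and maximum values
diff(range(repair_time))      # Range = max - min
IQR(repair_time)              # interquartile range, Q3 - Q1
var(repair_time)              # sample variance
sd(repair_time)               # sample standard deviation

# install.packages("psych")   # once, if not yet installed
library(psych)
describe(repair_time)         # many descriptive measures in one call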
Measures of Dispersion – Variance
~ average of squared deviations from the mean

For a population of size N:
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}

For a sample of n observations:
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}

Note the difference in denominators: N for the population variance, n – 1 for the sample variance.

Computing the Variance
• For a sample: in R, the sample variance is var(X) — note that var() calculates the sample variance only.
• For a population (i.e. the sample data is also the population data, so n = N): [(N-1)/N]*var(X)

Recall that when the sample size n equals the population size N,
\sigma^2 = \frac{n - 1}{n} \, s^2
which is why multiplying the sample variance by (N-1)/N gives the population variance.
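A minimal sketch of that population adjustment; the vector below is toy data treated as an entire population:

x <- c(4, 8, 15, 16, 23, 42)   # toy data treated as a whole population
N <- length(x)

var(x)                          # sample variance (denominator N - 1)
((N - 1) / N) * var(x)          # population variance (denominator N)
sum((x - mean(x))^2) / N        # same value, straight from the definition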
Measures of Dispersion – Standard Deviation
~ square root of the variance (popular measure of risk)

Computing the Standard Deviation
• For a sample: sd(X)
• For a population: sqrt(((N-1)/N)*var(X))
Measures of Dispersion – Standard Deviation
• Which has a higher standard deviation?

Source: Wikipedia (Standard Deviation)


Measures of Dispersion - Application
Mean & Standard Deviation of Closing Stock Prices

Intel (INTC):
Mean = $18.81
Stdev. = $0.50

General Electric (GE):
Mean = $16.19
Stdev. = $0.35

Whose risk may be higher? INTC has the higher standard deviation, so its price is more variable.
Measures of Dispersion
Chebyshev’s Theorem
• For any data set, the proportion of values that lie within k (k > 1) standard deviations of the mean is at least 1 – 1/k².
Substituting values of k, we get:
• For k = 2: at least 3/4 (75%) of the data lie within two standard deviations of the mean
• For k = 3: at least 8/9 (about 89%) of the data lie within three standard deviations of the mean

Why is this useful?
• We can use the mean and standard deviation alone to bound the percentage of total observations that fall within a given interval about the mean.

Example: For Cost per order data in the Purchase Orders database
• Applying the two standard deviation interval (k = 2): 94.68% of the data fall within 2 sd of the mean, versus the 75% minimum guaranteed by Chebyshev’s theorem.
• Applying the three standard deviation interval (k = 3): 97.87% fall within 3 sd of the mean, versus the 89% minimum guaranteed by Chebyshev’s theorem.
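A sketch of how those percentages can be checked in R, reusing the assumed po$Cost.per.order column from the earlier sketch:

cost <- po$Cost.per.order
m <- mean(cost)
s <- sd(cost)

# proportion of observations within k standard deviations of the mean
within_k <- function(k) mean(abs(cost - m) <= k * s)

within_k(2)   # compare with Chebyshev's lower bound of 3/4
within_k(3)   # compare with Chebyshev's lower bound of 8/9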
Measures of Dispersion
• Empirical Rule – for a normally distributed data set, the proportion of values that lie within k standard deviations of the mean follows the empirical rule:
• For k = 1: about 68% lie within one standard deviation of the mean
• For k = 2: about 95% lie within two standard deviations of the mean
• For k = 3: about 99.7% lie within three standard deviations of the mean

Source: Statistics Libretexts


Application of Empirical Rule - Process Capability Index
• The process capability index (Cp) is a measure of how well a manufacturing process can achieve specifications.
• Using a sample of output, measure the dimension of interest and compute the total variation using the third empirical rule (mean ± 3 standard deviations, i.e. a total spread of about 6 standard deviations).
• Compare the results to the specifications using:
Cp = (upper specification – lower specification) / total variation

Eg: Using Empirical Rules to Measure the Capability of a Manufacturing Process
• Part dimension (cm): Cp = 0.4/0.7 ≈ 0.57, where 0.4 is the specification width and 0.7 is the total variation from the third empirical rule.

In practice: aim for Cp ≥ 1.5.
Standardized Values (used in probability distributions)
• A standardized value, commonly called a z-score, provides a relative measure of the distance an observation is from the mean (independent of the units of measurement).
• The z-score for the ith observation in a data set is:
z_i = \frac{x_i - \bar{x}}{s}

Z = -1: the observation is 1 SD to the left of the mean. Z = 1: the observation is 1 SD to the right of the mean.

Eg: Computing z-Scores
• Purchase Orders Cost per order data

Computing the z-score in R:

df$zscore <- (df$cost - mean(df$cost)) / sd(df$cost)
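Equivalently, base R's scale() function centers and scales in one call; a small sketch reusing the same df$cost column from above:

# scale() returns a one-column matrix; as.numeric() flattens it to a plain vector
df$zscore2 <- as.numeric(scale(df$cost))

all.equal(df$zscore, df$zscore2)   # both approaches give the same z-scores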
Coefficient of Variation
• The coefficient of variation (CV) provides a relative measure of dispersion in data relative to the mean:
CV = \frac{s}{\bar{x}}  (standard deviation divided by the mean)
• Sometimes expressed as a percentage (× 100)
• Provides a relative measure of risk to return
• Useful when comparing the variability of two or more data sets with different scales
• Smaller CV → smaller risk
• Reciprocal of CV → return to risk

Eg: Applying the Coefficient of Variation
• Closing Stock Prices database
• Which investment is most risky? Which investment would have the least risk?

Intel (INTC) is slightly riskier than the other stocks.
The index fund has the least risk (lowest CV).
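A sketch of computing CV for each stock, assuming a data frame named prices with one numeric column of closing prices per stock (an assumed layout, not given by the slides):

cv <- function(x) sd(x) / mean(x)   # coefficient of variation

sapply(prices, cv)       # CV for every stock column; smallest CV = least risk
1 / sapply(prices, cv)   # reciprocal of CV = return-to-risk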


Measures of Shape: Skewness
• Skewness describes the lack of symmetry of data.
• Distributions that tail off to the right are called positively skewed; those
that tail off to the left are said to be negatively skewed.

[Figures: positively skewed and symmetrical distributions]


Coefficient of Skewness
• The Coefficient of Skewness (CS) measures the direction and degree of skewness as the average cubed deviation from the mean, scaled by the cube of the standard deviation:
CS = \frac{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^3}{s^3}

• CS is negative for left-skewed data.
• CS is positive for right-skewed data.
• |CS| > 1 suggests a high degree of skewness.
• 0.5 ≤ |CS| ≤ 1 suggests moderate skewness.
• |CS| < 0.5 suggests relative symmetry.

In R, CS can be obtained using the skew() function in the psych package.
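A sketch using psych's skew() on the assumed po columns from earlier (exact values depend on the estimator psych uses):

library(psych)

skew(po$Cost.per.order)   # expected to be large and positive (right-skewed)
skew(po$A.P.Terms)        # smaller positive value, i.e. moderate skewness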
Eg: Measuring Skewness
• Using Purchase Orders database
• Cost per order data: CS = 1.61 (right-skewed)
• A/P terms data: CS = 0.58
• Which has higher skewness? Positive or negative?

CS = 1.61: high positive skewness. CS = 0.58: moderate positive skewness.
Shape and Measures of Location
Comparing measures of location can sometimes reveal information about the shape of the distribution of
observations.
Negatively skewed: Mean < Median < Mode
Positively skewed: Mode < Median < Mean

For example:
• If distribution was perfectly symmetrical and unimodal, the mean, median, and
mode would all be the same.
• If it were negatively skewed, mean < median < mode
• Positive skewness would suggest that mode < median < mean
Measures of Shape: Kurtosis
• Kurtosis refers to the tailedness of the distribution.
• The coefficient of kurtosis (CK) measures the degree of kurtosis.

• CK = 3 for a normal (or mesokurtic) distribution.
• CK < 3 indicates a platykurtic ("flat") distribution, with a flatter peak and shorter, lighter tails (fewer extreme points or outliers).
• CK > 3 indicates a leptokurtic ("slender") distribution, with a sharper peak and longer, heavier tails (more extreme points or outliers).

Source: https://www.analyticsvidhya.com/blog/2021/05/shape-of-data-skewness-and-kurtosis/
Measures of Shape: Kurtosis

Note: several functions in R compute excess kurtosis, which subtracts 3 from CK so that the cut-off is zero:
excess kurtosis = CK – 3
Eg: psych::kurtosi, e1071::kurtosis

• Excess kurtosis > 0: leptokurtic
• Excess kurtosis = 0: mesokurtic (normal)
• Excess kurtosis < 0: platykurtic

For the Purchase Orders data, excess kurtosis > 0 for both Cost per order and A/P terms, hence both distributions are leptokurtic, with longer and heavier tails than the normal distribution.
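A sketch of the two excess-kurtosis functions named above (note psych spells its function kurtosi), applied to the assumed po columns:

library(psych)
kurtosi(po$Cost.per.order)    # excess kurtosis; > 0 suggests a leptokurtic shape
kurtosi(po$A.P.Terms)

# install.packages("e1071")   # alternative implementation
library(e1071)
kurtosis(po$Cost.per.order)   # also reports excess kurtosis by default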
Descriptive Statistics for Grouped Data
When data are summarized in a frequency distribution with values (or group midpoints) x_i and frequencies f_i (a small R sketch follows the formulas):

• Population mean:  \mu = \frac{\sum f_i x_i}{N}
• Sample mean:  \bar{x} = \frac{\sum f_i x_i}{n}
• Population variance:  \sigma^2 = \frac{\sum f_i (x_i - \mu)^2}{N}
• Sample variance:  s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n - 1}
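A minimal sketch of the grouped-data formulas; the values and frequencies below are illustrative only, not from any of the course data sets:

x <- c(5, 10, 15, 20, 25)    # group values (or group midpoints)
f <- c(10, 45, 120, 55, 20)  # frequencies

n    <- sum(f)
xbar <- sum(f * x) / n                    # sample mean from grouped data
s2   <- sum(f * (x - xbar)^2) / (n - 1)   # sample variance from grouped data

c(mean = xbar, variance = s2)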
Eg: Computing Statistical Measures from Frequency Distributions
• Computer Repair Times
• Standard deviation ≈ 5.96 days, so the variance is 5.96² ≈ 35.5 days².

Eg: Computing Home Value by Type and Region (using a grouping function in the `psych` package)
Descriptive Statistics for Categorical Data: The Proportion

• The proportion (p) is the fraction of data that have a certain characteristic.
• Proportions are key descriptive statistics for categorical data,
such as defects or errors in quality control applications or
consumer preferences in market research.
Eg: Computing a Proportion
• Proportion of orders placed by Spacetime Technologies:

Proportion = (Number of orders by Spacetime Technologies) / (Total number of orders)

In R, this can be computed with a logical comparison, or by using the filter() function in dplyr.
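A sketch of both approaches, assuming the po data frame from earlier has a column named Supplier (an assumed name):

# base R: averaging a logical comparison gives the proportion directly
mean(po$Supplier == "Spacetime Technologies")

# dplyr: count the matching rows, then divide by the total number of orders
library(dplyr)
nrow(filter(po, Supplier == "Spacetime Technologies")) / nrow(po)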
Measures of Association
• Data from 49 top liberal arts and research universities can be used to
answer questions:
• Is Top 10% HS related to Graduation %?
• Is Accept. Rate related to Expenditures/Student?
• Is Median SAT related to Acceptance Rate?
Measures of Association - Covariance
• Covariance is a measure of the linear association between two variables, X and Y.

• For a population:  \sigma_{XY} = \frac{\sum_{i=1}^{N} (x_i - \mu_X)(y_i - \mu_Y)}{N}
• For a sample:  s_{XY} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}

• Positive covariance → direct relationship
• Negative covariance → inverse relationship
• Magnitude → degree of association (but it depends on the units of measurement)

Eg: Computing the Covariance
• Scatterplot of the Colleges and Universities data
• Note that R's cov() function computes the sample covariance; multiply by (N-1)/N for a population.
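A sketch of the sample covariance in R; the file name and the two column names are assumptions, not the data set's actual headers:

colleges <- read.csv("CollegesUniversities.csv")   # assumed file name

# sample covariance between two columns (names are assumptions)
cov(colleges$Top10HS, colleges$GradPct)

# scatterplot of the same two variables
plot(colleges$Top10HS, colleges$GradPct,
     xlab = "Top 10% HS", ylab = "Graduation %")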

Measures of Association - Correlation
• Correlation is a measure of the linear association between two variables, X and Y, that does not depend on the units of measurement.
• Correlation coefficient formulas:
• For a population:  \rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}
• For a sample:  r_{XY} = \frac{s_{XY}}{s_X s_Y}

• Range: from -1 (strong negative linear relationship) to +1 (strong positive linear relationship)
• 0 indicates no linear relationship

• Also known as the Pearson product moment correlation, or Pearson's correlation coefficient
Measures of Association
• Correlation as a measure of LINEAR association

Source: Wikipedia (Correlation and dependence)


Computing Correlation of Multiple Variables
• Is Top 10% HS related to Graduation %?
• Is Accept. Rate related to Expenditures/Student?
• Is Median SAT related to Acceptance Rate?
Plotting a correlation matrix
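A sketch of a correlation matrix and one way to plot it, reusing the assumed colleges data frame; the corrplot package is an assumption, not prescribed by the slides:

num_cols <- colleges[sapply(colleges, is.numeric)]   # keep only the numeric columns
r <- cor(num_cols)                                   # matrix of pairwise Pearson correlations
round(r, 2)

# install.packages("corrplot")   # one common way to visualize the matrix
library(corrplot)
corrplot(r, method = "circle")

pairs(num_cols)   # base R alternative: matrix of pairwise scatterplots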
Introduction to Outliers

Outliers
• The mean and range are sensitive to outliers.
• There is no standard definition of what constitutes an outlier.
• How do we identify potential outliers? Some rules of thumb (see the R sketch below):
  • z-scores > +3 or < -3 (less than about 0.3% of observations for normally distributed data)
  • Extreme outliers are more than 3*IQR to the left of Q1 or to the right of Q3
  • Mild outliers are between 1.5*IQR and 3*IQR to the left of Q1 or to the right of Q3
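A sketch of both rules of thumb applied to a numeric vector; here it reuses the assumed repair_time vector from the earlier sketch:

x <- repair_time                     # any numeric variable of interest

# z-score rule of thumb
z <- (x - mean(x)) / sd(x)
x[abs(z) > 3]                        # potential outliers beyond +/- 3 sd

# IQR fences
q1  <- quantile(x, 0.25)
q3  <- quantile(x, 0.75)
iqr <- q3 - q1

extreme <- x[x < q1 - 3 * iqr | x > q3 + 3 * iqr]
mild    <- x[(x >= q1 - 3 * iqr & x < q1 - 1.5 * iqr) |
             (x >  q3 + 1.5 * iqr & x <= q3 + 3 * iqr)]
extreme
mild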
Eg: Investigating Outliers
• Home Market Value data

• None of the z-scores exceed 3. However, while individual variables might not
exhibit outliers, combinations of them might.
• The last observation has a high market value ($120,700) but a relatively small house size
(1,581 square feet) and may be an outlier.
What do you do with outliers?

- Leave them in the data if they are legitimate values that matter to the analysis
- Remove them if they clearly do not belong with the rest of the data
- Correct them if they result from data-entry errors
Statistical Thinking in Business DM
• Statistical Thinking is a philosophy of learning and action for
improvement, based on principles that:
• all work occurs in a system of interconnected processes
• variation exists in all processes
• better performance results from understanding and reducing
variation
• Business Analytics provides managers with insights into facts and relationships that enable them to make better decisions.
Applying Statistical Thinking
• Excel file Surgery Infections
• Is month 12 simply random variation or some explainable phenomenon?
Applying the 3 std dev empirical rule: compute upper and lower limits at the mean ± 3 standard deviations and check whether month 12 falls outside them.
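A sketch of those limits in R, assuming the monthly infection rates are in a vector named infection_rate (an assumed name):

m <- mean(infection_rate)
s <- sd(infection_rate)

upper <- m + 3 * s   # upper limit from the 3 std dev empirical rule
lower <- m - 3 * s   # lower limit

plot(infection_rate, type = "b", xlab = "Month", ylab = "Infection rate")
abline(h = c(lower, upper), lty = 2)   # month 12 stands out if it falls outside the limits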
Variability in Samples
• Different samples from any population will vary
• different means, standard deviations, and other statistical measures
• differences in shapes of histograms
• Samples are extremely sensitive to the sample size – the
number of observations included in the samples.
Eg: Variation in Sample Data
• Samples from Computer Repair Times data
• Population statistics: µ = 14.91 days, σ² = 35.5 days²
• Two samples of size 50 give different sample means and standard deviations.
• Two samples of size 25 differ even more from each other and from the population values.
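A sketch of drawing repeated samples to see this variability, reusing the assumed repair_time vector:

set.seed(123)   # for reproducibility

s1 <- sample(repair_time, 50)   # two samples of size 50
s2 <- sample(repair_time, 50)
c(mean(s1), mean(s2), sd(s1), sd(s2))

s3 <- sample(repair_time, 25)   # two samples of size 25 typically vary even more
s4 <- sample(repair_time, 25)
c(mean(s3), mean(s4), sd(s3), sd(s4))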
