This document introduces basic statistics concepts. It discusses why statistics are useful, key topics like descriptive statistics for continuous and attribute data, and populations vs. samples. Descriptive statistics measures like mean, median, mode, range, deviation, and standard deviation are defined. Histograms and normal distributions are also covered, along with skewed data. The document stresses using appropriate charts like histograms for variables and Pareto charts for attributes. Exercises demonstrate creating charts to analyze datasets.

Basic Statistics

Why Learn About Basic Statistics?

• To describe and quantify practical problems
• To use data more effectively
• To reduce subjectivity in analysis
• To perform more rigorous analysis
• To help improve performance

Statistics are powerful tools for problem solvers.

Topics To Discuss

• Descriptive Statistics for Continuous Data
   • Central Tendency
   • Spread
   • Histograms
• Working with Populations and Samples
• The Normal Distribution
• Descriptive Statistics for Attribute Data
   • Pareto Charts

Descriptive Statistics for Continuous Data
Populations vs. Samples

• Populations consist of every observation (as in a census)
• Samples are subsets of populations
• Project data is most often sample data
• Statistics use sample data to “infer” facts about populations
• Some calculations differ depending on whether the data is sample data or population data
• There are notation differences between sample and population statistics

Understand the difference between populations and samples.

Descriptive Statistics For Variable Data

• Central Tendency: Around which value does the data tend to cluster?

• Spread: How is the data dispersed around the central value?

• Histogram: Graph that visually depicts central tendency and spread.

Measures Of Central Tendency

• Mean: Average of values

• Median: Midpoint of sorted data, where 50% of values are below and 50% are above

• Mode: Most frequently occurring value

Calculating Means

Example:

Data set: 2, 4, 5, 8, 3

Mean = (2 + 4 + 5 + 8 + 3) / 5 = 22 / 5 = 4.4

The Mean is the most commonly used measure of central tendency.
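
The arithmetic in this example can be checked with a few lines of Python. This is a minimal sketch using the standard library's statistics module; the variable names are illustrative and do not come from the original slides.

```python
import statistics

data = [2, 4, 5, 8, 3]  # data set from the example

# Mean = sum of the values divided by the number of values
mean_by_hand = sum(data) / len(data)
mean_from_library = statistics.mean(data)

print(mean_by_hand)       # 4.4
print(mean_from_library)  # 4.4
```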

Calculating Medians

• For an odd number of data points, sorted from low to high, the Median is the middle value
• For an even number of data points, sorted from low to high, the Median is the Mean of the 2 middle values

Examples:
Odd data set: 1, 2, 3, 4, 5, 6, 7 Median = 4
Even data set: 1, 2, 3, 4, 5, 6, 7, 8 Median = 4.5

Data must be ordered by value to calculate Medians.

Medians are more “robust” to outliers than Means.
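
A short Python sketch can reproduce the two Median examples and illustrate robustness to outliers. The outlier data set is a hypothetical variation on the earlier Mean example (8 replaced by 80), added here only for illustration.

```python
import statistics

odd_data = [1, 2, 3, 4, 5, 6, 7]
even_data = [1, 2, 3, 4, 5, 6, 7, 8]

# statistics.median sorts the data internally before picking the midpoint
print(statistics.median(odd_data))   # 4
print(statistics.median(even_data))  # 4.5

# Hypothetical illustration: one extreme value moves the Mean, not the Median
original = [2, 4, 5, 8, 3]
with_outlier = [2, 4, 5, 80, 3]

print(statistics.mean(original), statistics.median(original))          # 4.4 4
print(statistics.mean(with_outlier), statistics.median(with_outlier))  # 18.8 4
```

The Mean jumps from 4.4 to 18.8 while the Median stays at 4, which is what “robust” means in this context.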

Calculating Modes

The Mode is the most frequently occurring value.

Example: 45 47 49 51 46 47 49 52
47 47 50 53 47 48 51 54

Mode = 47

The Mode is the most frequent data value; here 47 occurs five times.
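
The Mode of the example data can be confirmed by counting occurrences. The sketch below uses Python's standard library; the variable names are illustrative.

```python
import statistics
from collections import Counter

# Data set from the example (two rows of eight values)
data = [45, 47, 49, 51, 46, 47, 49, 52,
        47, 47, 50, 53, 47, 48, 51, 54]

counts = Counter(data)              # value -> number of occurrences
mode_value = statistics.mode(data)  # most frequently occurring value

print(mode_value)          # 47
print(counts[mode_value])  # 5
```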

Measures Of Spread

Range: Difference between highest and lowest values

Deviation: Difference between a single data point and the Mean of a group of data

Standard Deviation: Typical deviation of all data points from the Mean of the group (the square root of the average squared Deviation)

Calculating Ranges

Range = Highest Value - Lowest Value

Example: Data set: 2, 4, 5, 8, 3

Range = 8 - 2 = 6

Range only considers highest and lowest values.
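
The following sketch computes the Range for the example data and for a second, hypothetical data set (not from the slides) to show that very different data can share the same Range. The helper name value_range is an illustrative choice.

```python
data = [2, 4, 5, 8, 3]    # data set from the example
other = [2, 2, 2, 8, 8]   # hypothetical data set with the same extremes


def value_range(values):
    """Return the highest value minus the lowest value."""
    return max(values) - min(values)


print(value_range(data))   # 6
print(value_range(other))  # 6
```

Both data sets have a Range of 6 even though their values are distributed differently, because only the two extreme values enter the calculation.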

Calculating Deviations

Deviations are the differences between each data point and the Mean.

Deviation shows more than just extreme differences.
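
As an illustration, the sketch below computes the Deviation of every point in the data set used in the earlier Mean and Range examples (2, 4, 5, 8, 3). Reusing that data set here is an assumption made for continuity; the slide's own worked table is not available.

```python
import math

data = [2, 4, 5, 8, 3]
mean = sum(data) / len(data)  # 4.4

# Deviation = data point minus the Mean
deviations = [x - mean for x in data]

for x, d in zip(data, deviations):
    print(f"{x}: {d:+.1f}")

# Positive and negative Deviations cancel out (up to floating-point rounding)
print(math.isclose(sum(deviations), 0.0, abs_tol=1e-9))  # True
```

Because the Deviations always sum to zero, they cannot simply be averaged to summarize spread, which is why the Standard Deviation works with squared Deviations.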

Calculating Standard Deviations

Standard Deviation measures the typical deviation from the Mean.

It is the square root of the average squared Deviation, so it acts as an “average” deviation across all data points.

Standard Deviation is the most powerful measure of spread.
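
The calculation can be sketched in Python as follows, assuming the same example data set as before (2, 4, 5, 8, 3). The two forms shown are the standard population formula (divide by N) and sample formula (divide by n - 1); which one the original slide displayed is not recoverable from the text.

```python
import math
import statistics

data = [2, 4, 5, 8, 3]
mean = sum(data) / len(data)
squared_deviations = [(x - mean) ** 2 for x in data]

# Population form: divide the sum of squared Deviations by N
population_sd = math.sqrt(sum(squared_deviations) / len(data))

# Sample form: divide the sum of squared Deviations by n - 1
sample_sd = math.sqrt(sum(squared_deviations) / (len(data) - 1))

# The standard library gives the same results
print(round(population_sd, 3), round(statistics.pstdev(data), 3))  # 2.059 2.059
print(round(sample_sd, 3), round(statistics.stdev(data), 3))       # 2.302 2.302
```

Since project data is most often sample data, the sample form is usually the relevant one.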

Histograms

Graphical method for portraying data sets

• Data is divided into groups called “classes”
• The number of data points within each class is counted
• Bars are drawn for each class

Histograms illustrate:
• Central Tendency
• Spread
• General Shape

Histograms require 50+ data points

Histograms create pictures of centering and spread.
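
The three steps (divide into classes, count, draw bars) can be sketched with a text-only histogram in Python. The data here is simulated and entirely hypothetical, and the class width of 5 is an arbitrary choice for illustration.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: 60 simulated measurements (more than the 50 recommended)
data = [random.gauss(170, 8) for _ in range(60)]

class_width = 5


def class_lower_bound(value, width):
    """Return the lower boundary of the class that contains value."""
    return int(value // width) * width


# Step 1 and 2: assign each point to a class and count points per class
counts = Counter(class_lower_bound(x, class_width) for x in data)

# Step 3: draw one bar per class, lowest class first
for lower in sorted(counts):
    print(f"{lower:>3}-{lower + class_width:<3} | {'#' * counts[lower]}")
```

The longest bars show where the data clusters (central tendency), and the number of occupied classes shows how widely it is dispersed (spread).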

Symmetric Data

If data is symmetric with a single peak, then Mean, Median, and Mode are approximately equal.

Symmetric data is often “normally” distributed. This is commonly referred to as a “bell” curve.

Skewed Data


Skewed data are typical for measurements of money and delay.

“Skewed” Histograms lack symmetry. The Mode is not in the center.
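
A small sketch shows how the three measures of central tendency separate when data is skewed. The delay times below are made-up numbers chosen to have a long right tail; they are not from the original slides.

```python
import statistics

# Hypothetical delay times in minutes: most are short, a few are very long
delays = [1, 2, 2, 2, 3, 3, 4, 5, 8, 20]

mode_value = statistics.mode(delays)
median_value = statistics.median(delays)
mean_value = statistics.mean(delays)

print(mode_value)    # 2
print(median_value)  # 3.0
print(mean_value)    # 5
```

With a long tail to the right, the Mean is pulled toward the tail and ends up above the Median, which in turn sits above the Mode.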

The Normal Distribution
Characteristics Of Normal Distributions

This Histogram shows a theoretical Normal distribution


It has several helpful characteristics

“Describing” Normal Distributions

• x̄ (Mean) and s (Standard Deviation) completely describe any Normal distribution

• Larger Standard Deviation indicates more process variation

These distributions have equal Means but different Standard Deviations.

Standard Normal “Bell Shaped” Curve

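Standard normal curves have: x̄ (Mean) = 0 and s (Standard Deviation) = 1

Each Standard Deviation captures a percentage of data: about 68% falls within ±1 Standard Deviation of the Mean, about 95% within ±2, and about 99.7% within ±3.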
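
The share of data captured within a given number of Standard Deviations can be computed from the standard normal curve. This sketch uses statistics.NormalDist from the Python standard library (Python 3.8 or later); the helper name share_within is an illustrative choice.

```python
from statistics import NormalDist

# Standard normal curve: Mean 0, Standard Deviation 1
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Fraction of data within k Standard Deviations of the Mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within ±{k} s: {share_within(k):.1%}")
# within ±1 s: 68.3%
# within ±2 s: 95.4%
# within ±3 s: 99.7%
```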

Descriptive Statistics for Attribute Data
Descriptive Statistics For Attribute Data

Central tendency and spread do not exist for Attribute Data

Often there are only defect rates or counts by category

Pareto Charts are effective for portraying Attribute Data

Attribute Data has no Mean or Standard Deviation.

Pareto Charts

Pareto Charts are based on the 80/20 Rule

• 80/20 Rule: roughly 80% of defects result from roughly 20% of causes
• Pareto Charts help teams focus efforts on the most critical defect types or the most critical defect causes
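
The numbers behind a Pareto Chart are the category counts sorted from largest to smallest, plus a running cumulative percentage. The sketch below builds that table in Python; the defect causes and counts are made-up values for illustration only.

```python
# Hypothetical defect counts by cause (made-up numbers for illustration)
defect_counts = {
    "Carrier delay": 9,
    "Missing parts": 42,
    "Other": 4,
    "Machine breakdown": 27,
    "Quality hold": 6,
    "Paperwork errors": 12,
}

total = sum(defect_counts.values())

# Sort causes from most to least frequent
ordered = sorted(defect_counts.items(), key=lambda item: item[1], reverse=True)

# Build rows of (cause, count, cumulative percentage)
pareto_rows = []
running = 0
for cause, count in ordered:
    running += count
    pareto_rows.append((cause, count, 100 * running / total))

for cause, count, cumulative_pct in pareto_rows:
    print(f"{cause:<18} {count:>3}  {cumulative_pct:5.1f}%")
```

In this made-up data the top two causes account for 69% of all defects, so they are where a team would focus first.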

Best Practices

• Try to create Variable Data
• Select representative samples before calculating
• Always evaluate both centering and spread
• Use Histograms to create pictures of Variable Data
• Use Pareto Charts to create pictures of Attribute Data

Exercise - Histogram

Open the file

In the worksheet ‘Histogram’ there is a set of data giving the heights of a sample from a given population.

• Use a Histogram to create a picture of the data set
• Evaluate both the centering and the spread of the data set

Exercise – Pie chart

Open the file

In the worksheet ‘Pie chart’ there is a set of data on the delivery of spare parts in the listed regions in a given month.

• Use a Pie chart to create a picture of the data set

Exercise – Scatterplot

Open the file

Worksheet ‘Scatterplot’: the restaurant ‘YB’ wants to evaluate whether the order waiting time is correlated with the number of clients.

• Use a Scatter plot to create a picture of the data set

Exercise – Pareto chart

Open the file

Worksheet ‘Pareto chart’: the plant manager of the YZ facility needs to know the reasons for not meeting customer shipping dates in a specific month.

• Use the Pareto chart to identify the 2 main causes and suggest possible corrective actions

Exercise – Time series plot

Open the file

Worksheet ‘Time series plot’: the plant manager of the YZ facility wants to check the trend of shipping volumes over time and the % of rejected material causing shipping delays.

• Use the Time Series plot to check the required trends.
