Unit 2 Part 2

Uploaded by

Amisha Sharma
Histogram

• A histogram is a plot that lets you discover, and show, the underlying frequency distribution (shape) of a set of continuous data.
• This allows the data to be inspected for its underlying distribution (e.g., normal distribution), outliers, skewness, etc.
• In a bar chart, each column represents a group defined by a categorical variable; in a histogram, each column represents a group defined by a continuous, quantitative variable.
Histogram

Wages in Rs.    No. of Workers (f)
0-10            22
10-20           38
20-30           46
30-40           35
40-50           20

[Figure: histogram of No. of Workers (vertical axis, 0 to 50) against Wages in Rs (horizontal axis, classes 0-10 to 40-50)]
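As a rough illustration of how the frequency table maps onto the bars of the histogram, the sketch below prints a text version of the chart. The names `classes`, `workers` and `text_histogram`, and the choice of one `#` per two workers, are assumptions made for this example only.

```python
# Class intervals (wages in Rs.) and frequencies taken from the table above.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
workers = [22, 38, 46, 35, 20]


def text_histogram(classes, freqs, unit=2):
    """Return one text row per class, with one '#' for every `unit` workers."""
    rows = []
    for (low, high), f in zip(classes, freqs):
        rows.append(f"{low:>2}-{high:<2} | {'#' * (f // unit)} {f}")
    return rows


total = sum(workers)
# The modal class is the class with the tallest bar.
modal_class = classes[workers.index(max(workers))]

for row in text_histogram(classes, workers):
    print(row)
print("Total workers:", total)
print("Modal class:", modal_class)
```

The bar lengths rise to a single peak at the 20-30 class and fall away on both sides, which is the kind of shape information the histogram is meant to reveal.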
Box Plot

1. A box plot is a method for graphically depicting groups of numerical data through
their quartiles.
2. Box plots may also have lines extending from the boxes (whiskers) indicating variability
outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-
and-whisker diagram.
3. Outliers may be plotted as individual points.
Types:
• Standard box plot
• Variable width box plot
• Notched box plot
Standard box plot

Displays :
1. Quartiles Q1,Q3
2. Median , M
3. Max and Min
4. Outliers
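[Figure: standard box plot, labelled from top to bottom: outliers (x), maximum, Q3, median, Q1, minimum, outliers (x); the whiskers run from the box to the maximum and to the minimum]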
Important note

• Data sets can sometimes contain outliers that are suspected to be anomalies (perhaps because of data collection errors).
• If outliers are present, the whisker on the appropriate side is drawn to Q1 - 1.5 × IQR or Q3 + 1.5 × IQR, where IQR = Q3 - Q1, rather than to the data minimum or the data maximum.
• Small circles or unfilled dots are drawn on the chart to indicate where suspected outliers lie. Filled circles are used for known outliers.
Variable width box plot

Displays :
1. Quartiles Q1,Q3
2. Median , M
3. Max and Min
4. Outliers
5. Sample size
[Figure: variable width box plots for two samples, n = 100 and n = 50]
Notched box plot

[Figure: notched box plot with the median and the notch labelled]

For 95% confidence interval:
Notch = ± (1.57 × IQR) / √n
Notched box plot

• Overlap of notches indicates no significant change.
• Non-overlap of notches indicates that there is a significant change.
Case study
Ref: www.itl.nist.gov

1. Machine 2 has the smallest median diameter and machine 1 has the largest median diameter.
2. Machines 1 and 2 have comparable variability, while machine 3 has somewhat larger variability.
Case study
Ref: www.itl.nist.gov

1. Neither the location nor the spread seems to differ significantly by day.
Case study
Ref: www.itl.nist.gov

1. Neither the location nor the spread seems to differ significantly by time of day.
Examples

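Construct a box and whisker plot of the concentration of suspended solid material from a lake and state your conclusions.

42.4  65.7  29.8  58.7  52.1  55.8
57    68.7  67.3  67.3  54.3  54
73.1  81.3  59.9  56.9  62.2  69.9
66.9  59    56.3  43.3  57.4  45.3

Lower fence (Q1 - 1.5 × IQR)   35.0625   (data minimum 29.8)
Q1                             54.225
Median                         58.05
Q3                             67
Upper fence (Q3 + 1.5 × IQR)   86.1625   (data maximum 81.3)

The minimum, 29.8, lies below the lower fence and is therefore a suspected outlier; the maximum, 81.3, lies inside the upper fence.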
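The summary values for the lake data can be checked with a short sketch. It assumes the quartiles are interpolated between order statistics (the "inclusive" method of Python's `statistics.quantiles`), a choice made here because it reproduces the Q1 and Q3 quoted for this example; other quartile conventions give slightly different values. The variable names are illustrative.

```python
from statistics import quantiles

# Suspended solids concentrations from the example.
data = [42.4, 65.7, 29.8, 58.7, 52.1, 55.8,
        57, 68.7, 67.3, 67.3, 54.3, 54,
        73.1, 81.3, 59.9, 56.9, 62.2, 69.9,
        66.9, 59, 56.3, 43.3, 57.4, 45.3]

# Assumption: "inclusive" interpolation, which matches the quoted quartiles.
q1, med, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Fences used to decide where the whiskers stop and which points are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(f"Q1 = {q1:.3f}, median = {med:.2f}, Q3 = {q3:.1f}, IQR = {iqr:.3f}")
print(f"fences = ({lower_fence:.4f}, {upper_fence:.4f})")
print("suspected outliers:", outliers)
```

Only values outside the two fences are flagged, so the box plot for this data shows a single suspected outlier on the low side.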
Run Chart

• A run chart, also known as a run-sequence plot, is a graph that displays observed data in a time sequence.
• Often, the data displayed represent some aspect of the output or performance of a manufacturing or other business process. It is therefore a form of line chart.
• By collecting and charting data over time, you can find trends or patterns in the process. Because they do not use control limits, run charts cannot tell you if a process is stable.
[Figure: run chart of No. of complaints (vertical axis) against Days (horizontal axis)]
Run chart rules for interpretation

1. Rule One – A Shift
• A shift on a run chart is six or more consecutive points either all above or all below the median.
• Skip values that fall on the median and continue counting.
• A shift is likely to be attributable to a specific cause and not the result of random variation within the process.
2. Rule Two – Trend
• A trend on a run chart is five or more consecutive points all going up or all going down.
• If the value of two or more successive points is the same, ignore one of the points when counting.
• Like values do not make or break a trend.
[Figures: run charts illustrating a shift and a trend]
Run chart rules for interpretation

3. Rule Three – Runs
• A run is a series of points in a row on one side of the median.
• A non-random pattern or signal of change is indicated by too few or too many runs.
• To determine the number of runs above and below the median, count the number of times the data line crosses the median and add one.
4. Rule Four – Astronomical Point
• This rule aids in detecting unusually large or small numbers.
• An astronomical point is a data point that is clearly different from all or most of the other values.
[Figures: run charts illustrating too few runs and an astronomical point]
Example

An automobile company manufactures engine components. The components are ground, and their out-of-roundness is required to be less than 5 microns. Sample 1 of 19 components was taken, and it was observed that some components do not meet this requirement. The machine was handed over to the maintenance department. Sample 2 was taken afterwards. The out-of-roundness values are recorded below.
Draw a notched box plot for a 95% confidence interval (C = 1.96) and offer your comments.

Component number             1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19
Out of roundness, Sample 1   4  5  7  6  8  7  7  9  4  6   10  9   9   9   4   3   9   8   3
Out of roundness, Sample 2   4  2  2  4  4  3  3  4  1  3   3   2   3   2   2   2   5   5   2
Example

Sorted data:
Sample 1: 3 3 4 4 4 5 6 6 7 7 7 8 8 9 9 9 9 9 10
Sample 2: 1 2 2 2 2 2 2 2 3 3 3 3 3 4 4 4 4 5 5

        Sample 1   Sample 2
Q1      4          2
Md      7          3
Q3      9          4
IQR     5          2

For 95% confidence interval:
Notch = ± (1.57 × IQR) / √n
Sm = (1.25 × IQR) / (1.35 × √n)
Notch = ± Sm × C

Width of notch:
Sample 1 = ± (1.57 × 5) / √19 = ± 1.80
Sample 2 = ± (1.57 × 2) / √19 = ± 0.72
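The notch widths can be reproduced with a few lines of Python. This is a sketch only: it uses the 1.57 × IQR / √n form that the worked solution uses, and it assumes the default ("exclusive") quartile method of `statistics.quantiles`, which happens to give the Q1 and Q3 values tabulated for these samples. The helper names `notch` and `notches_overlap` are hypothetical.

```python
from math import sqrt
from statistics import quantiles

# Out-of-roundness readings (microns) before and after maintenance.
sample_1 = [4, 5, 7, 6, 8, 7, 7, 9, 4, 6, 10, 9, 9, 9, 4, 3, 9, 8, 3]
sample_2 = [4, 2, 2, 4, 4, 3, 3, 4, 1, 3, 3, 2, 3, 2, 2, 2, 5, 5, 2]


def notch(sample):
    """Return (median, notch half-width) with half-width = 1.57 * IQR / sqrt(n)."""
    q1, med, q3 = quantiles(sample, n=4)  # assumption: default "exclusive" method
    half_width = 1.57 * (q3 - q1) / sqrt(len(sample))
    return med, half_width


def notches_overlap(a, b):
    """True if the notch intervals (median ± half-width) of a and b overlap."""
    med_a, w_a = notch(a)
    med_b, w_b = notch(b)
    return med_a - w_a <= med_b + w_b and med_b - w_b <= med_a + w_a


for name, s in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    med, w = notch(s)
    print(f"{name}: median = {med}, notch = ±{w:.2f}")
print("Notches overlap:", notches_overlap(sample_1, sample_2))
```

With these samples the two notch intervals do not overlap, which by the rule stated for notched box plots points to a significant change between the two samples.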
Example

Construct a run chart for the following data showing a process parameter. Comment on whether the process shows common or special causes of variation. Has there been any significant trend? Offer your comments.

Observation  Value    Observation  Value
1            0.2      13           0.37
2            0.36     14           0.24
3            0.32     15           0.42
4            0.38     16           0.26
5            0.23     17           0.42
6            0.37     18           0.28
7            0.38     19           0.68
8            0.22     20           0.4
9            0.24     21           0.21
10           0.26     22           0.39
11           0.27     23           0.3
12           0.3
Example

[Figure: run chart of the process parameter (vertical axis, 0 to 0.8) against observation number (horizontal axis, 1 to 23)]
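The run chart rules for a shift, a trend and the number of runs can be applied to this data mechanically. The sketch below is one possible reading of those rules; the function names are illustrative, points on the median are skipped as the shift rule requires, and repeated successive values are dropped before counting a trend. Judging whether the number of runs is "too few or too many" needs a reference table that is not given here, so the sketch only counts the runs.

```python
from statistics import median

# Process parameter values from the example, in observation order (1 to 23).
values = [0.2, 0.36, 0.32, 0.38, 0.23, 0.37, 0.38, 0.22, 0.24, 0.26, 0.27, 0.3,
          0.37, 0.24, 0.42, 0.26, 0.42, 0.28, 0.68, 0.4, 0.21, 0.39, 0.3]


def sides(data, centre):
    """+1 / -1 for points above / below the median; points on it are skipped."""
    return [1 if x > centre else -1 for x in data if x != centre]


def longest_shift(data, centre):
    """Longest stretch of consecutive points on one side of the median."""
    best = current = 0
    previous = None
    for side in sides(data, centre):
        current = current + 1 if side == previous else 1
        best = max(best, current)
        previous = side
    return best


def count_runs(data, centre):
    """Number of runs = number of median crossings + 1."""
    s = sides(data, centre)
    return sum(1 for a, b in zip(s, s[1:]) if a != b) + 1


def longest_trend(data):
    """Largest number of consecutive points all rising or all falling."""
    distinct = [data[0]]
    for x in data[1:]:
        if x != distinct[-1]:  # like values neither make nor break a trend
            distinct.append(x)
    best = current = 1
    direction = 0
    for a, b in zip(distinct, distinct[1:]):
        step = 1 if b > a else -1
        current = current + 1 if step == direction else 2
        best = max(best, current)
        direction = step
    return best


centre = median(values)
print("median:", centre)
print("longest shift:", longest_shift(values, centre), "(signal if 6 or more)")
print("longest trend:", longest_trend(values), "(signal if 5 or more)")
print("number of runs:", count_runs(values, centre))
print("largest value:", max(values), "at observation", values.index(max(values)) + 1)
```

On this data the sketch finds no shift, since the longest one-sided stretch is four points, but it does find six consecutive rising points (observations 8 to 13), which meets the trend rule. The value 0.68 at observation 19 stands well apart from the rest and is a candidate astronomical point.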
Stem and Leaf plot

A Stem and Leaf Plot is a special table where each data value is split into a "stem" (the first
digit or digits) and a "leaf" (usually the last digit).

Data:
120 103 112 126 110
102 142 119 119 120
126 145 155 132 130
152 119 127 133 140
101 113 105 158 150
121 144 120 129
101 136 137 143
138 116 125 122

[Figure: stem and leaf plot of the data, first with unsorted and then with sorted leaves, using stems 100, 110, 120, 130, 140 and 150]

Key
100 | 2 = 102
Draw a stem and leaf plot for the following data.

 0.28   0.24   0     -0.34
-0.15  -0.19  -0.27   0.2
-0.25  -0.13  -0.19   0.49
 0.19   0.24  -0.31   0.12
-0.26   0.18   0      0.29
-0.2    0.4    0.25  -0.32
-0.33  -0.15  -0.21   0.13
 0.16  -0.14   0.19   0.44
 0.48  -0.16   0.18   0.29
Example 2

-0.3 | 1 2 3 4
-0.2 | 0 1 5 6 7
-0.1 | 3 4 5 5 6 9 9
 0   | 0 0
 0.1 | 2 3 6 8 8 9 9
 0.2 | 0 4 4 5 8 9 9
 0.3 |
 0.4 | 0 4 8 9

Key
0.1 | 2 = 0.12
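A stem and leaf display of this kind can be built directly from the data values. The sketch below is a minimal version under stated assumptions: every value has at most two decimal places and lies strictly between -1 and 1, the stem is the tenths digit with its sign, and the leaf is the hundredths digit. The function name `stem_and_leaf` and the `(sign, stem)` keys are choices made for this example; stems with no leaves are simply not listed.

```python
from collections import defaultdict

# The 36 values from the exercise above.
data = [0.28, 0.24, 0, -0.34, -0.15, -0.19, -0.27, 0.2, -0.25,
        -0.13, -0.19, 0.49, 0.19, 0.24, -0.31, 0.12, -0.26, 0.18,
        0, 0.29, -0.2, 0.4, 0.25, -0.32, -0.33, -0.15, -0.21,
        0.13, 0.16, -0.14, 0.19, 0.44, 0.48, -0.16, 0.18, 0.29]


def stem_and_leaf(values):
    """Map (sign, tenths digit) to the sorted list of hundredths digits."""
    rows = defaultdict(list)
    for x in values:
        h = round(x * 100)  # whole hundredths, which avoids float noise
        sign = -1 if h < 0 else 1
        stem, leaf = divmod(abs(h), 10)
        rows[(sign, stem)].append(leaf)
    return {key: sorted(leaves) for key, leaves in rows.items()}


plot = stem_and_leaf(data)
# Print from the most negative stem to the most positive one.
for sign, stem in sorted(plot, key=lambda k: (k[0], k[0] * k[1])):
    label = f"{'-' if sign < 0 else ' '}0.{stem}"
    print(f"{label} | {' '.join(str(leaf) for leaf in plot[(sign, stem)])}")
```

Keeping the sign separate from the stem digit matters for values near zero: a value such as -0.05 would otherwise share a row with 0.05.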
Example 3
Normal Probability Plot

•The normal probability plot is a graphical technique for assessing whether or not a data set
is approximately normally distributed.
•The data are plotted against a theoretical normal distribution in such a way that the points
should form an approximate straight line.
•There are two ways to assess.
1. On Normal Distribution Probability Paper
2. On regular graph paper
Normal Distribution Probability Paper

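[Figure: data plotted on normal probability paper (source: mathisfun.com)]
• Normality: the plotted points fall close to a straight line.
• Mean: the value read off the line at a cumulative probability of 0.5.
• SD: the value read off at 0.84 minus the value read off at 0.5.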
Test for normality and estimate the parameters from the sample given below.

176  192
191  201
214  190
220  183
205  185

j     xj (X axis)    f = (j - 0.5)/n (Y axis)
1     176            0.05
2     183            0.15
3     185            0.25
4     190            0.35
5     191            0.45
6     192            0.55
7     201            0.65
8     205            0.75
9     214            0.85
10    220            0.95

[Figure: the points plotted on normal probability paper, horizontal axis from 170 to 220]
NPP on regular graph paper

Procedure:
•Arrange your x-values in ascending order.
•Calculate
fi = (i-0.375)/(n+0.25)
where i is the position of the data value in the ordered list and n is the
number of observations.
•Find the z-score for each fi
•Plot your x-values on the horizontal axis and the corresponding z-score
on the vertical axis.
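The procedure above can be sketched in Python using the standard library's `statistics.NormalDist` to convert each fi into a z-score. The ten values are the sample used in the normality example; the function name `normal_scores` is an assumption for illustration.

```python
from statistics import NormalDist

# Sample of ten observations from the normality example.
sample = [176, 192, 191, 201, 214, 190, 220, 183, 205, 185]


def normal_scores(values):
    """Return (x, f, z) rows for a normal probability plot on regular graph paper."""
    ordered = sorted(values)  # step 1: ascending order
    n = len(ordered)
    rows = []
    for i, x in enumerate(ordered, start=1):
        f = (i - 0.375) / (n + 0.25)  # step 2: plotting position
        z = NormalDist().inv_cdf(f)   # step 3: z-score for that position
        rows.append((x, f, z))
    return rows


# Step 4: x goes on the horizontal axis and z on the vertical axis.
for x, f, z in normal_scores(sample):
    print(f"{x:>4}  {f:.3f}  {z:+.2f}")
```

Because the plotting positions are symmetric about 0.5, the z-scores come in pairs of equal size and opposite sign.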
Test for normality and estimate the parameters from the sample given below using regular graph paper.

176  192
191  201
214  190
220  183
205  185

i     xi (X axis)    fi = (i - 0.375)/(n + 0.25)    Z value
1     176            0.060                          -1.55
2     183            0.158                          -1.0
3     185            0.256                          -0.65
4     190            0.353                          -0.38
5     191            0.451                          -0.12
6     192            0.548                           0.12
7     201            0.646                           0.38
8     205            0.743                           0.65
9     214            0.841                           1.0
10    220            0.939                           1.55

[Figure: normal probability plot of z value (vertical axis) against the random variable (horizontal axis)]
Random Variable
Example

A soft drink bottler is studying the internal pressure strength of 1 litre glass bottles. A random sample of 16 bottles is tested and the pressure strengths are obtained. The data collected are shown below. Plot this data on regular graph paper. Does it seem reasonable to conclude that pressure strength is normally distributed?

236  218  221  231
212  205  213  214
229  203  198  212
203  210  234  211

i     x      fi       Z value
1     198    0.038    -1.77
2     203    0.100    -1.28
3     203    0.162    -0.99
4     205    0.223    -0.76
5     210    0.285    -0.57
6     211    0.346    -0.40
7     212    0.408    -0.23
8     212    0.469    -0.08
9     213    0.531     0.08
10    214    0.592     0.23
11    218    0.654     0.40
12    221    0.715     0.57
13    229    0.777     0.76
14    231    0.838     0.99
15    234    0.900     1.28
16    236    0.962     1.77
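One rough way to back up the visual judgement is to fit a straight line through the plotted points. The sketch below is an assumption-laden illustration rather than part of the graphical method: it fits x = mu + sigma × z by least squares, so the intercept serves as an estimate of the mean and the slope as an estimate of the standard deviation, and it reports the correlation r between x and z as an informal measure of straightness. The name `fit_line` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

# Pressure strengths of the 16 bottles from the example.
strengths = [236, 218, 221, 231, 212, 205, 213, 214,
             229, 203, 198, 212, 203, 210, 234, 211]


def fit_line(values):
    """Least-squares line x = mu + sigma * z through the normal probability plot."""
    xs = sorted(values)
    n = len(xs)
    zs = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    x_bar, z_bar = mean(xs), mean(zs)
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
    szz = sum((z - z_bar) ** 2 for z in zs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sigma = sxz / szz               # slope: estimate of the standard deviation
    mu = x_bar - sigma * z_bar      # intercept at z = 0: estimate of the mean
    r = sxz / sqrt(sxx * szz)       # closeness of the points to a straight line
    return mu, sigma, r


mu, sigma, r = fit_line(strengths)
print(f"estimated mean = {mu:.1f}, estimated SD = {sigma:.1f}, r = {r:.3f}")
```

A value of r close to 1 means the points lie close to a straight line. How close is close enough is a judgement call, and the plot itself should still be inspected for curvature or isolated points.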
