Practical Engineering, Process, and Reliability Statistics
Ebook, 442 pages, 2 hours

About this ebook

This book is a convenient and comprehensive guide to statistics. A resource for quality technicians and engineers in any industry, this second edition provides even more equations and examples for the reader—with a continued focus on algebra-based math. Those preparing for ASQ certification examinations, such as the Certified Quality Technician (CQT), Certified Six Sigma Green Belt (CSSGB), Certified Quality Engineer (CQE), Certified Six Sigma Black Belt (CSSBB), Certified Reliability Engineer (CRE), and Certified Supplier Quality Professional (CSQP), will find this book helpful as well.

Inside you’ll find:
• Complete calculations for determining confidence intervals, tolerances, sample size, outliers, process capability, and system reliability

• Newly added equations for hypothesis tests (such as the Kruskal-Wallis test and Levene’s test for equality of variances), the Taguchi method, and Weibull and log-normal distributions
• Hundreds of completed examples to demonstrate practical use of each equation
• 20+ appendices, including distribution tables, critical values tables, control charts, sampling plans, and a beta table
Language: English
Release date: Mar 31, 2022
ISBN: 9781636940168
Author

Mark Allen Durivage

Mark Allen Durivage has worked as a practitioner, educator, and consultant. He is Managing Principal Consultant at Quality Systems Compliance LLC. He is an American Society for Quality (ASQ) Fellow and holds several ASQ certifications.

    Book preview

    Practical Engineering, Process, and Reliability Statistics - Mark Allen Durivage

    1

    Point Estimates and Measures of Dispersion

    When performing statistical tests, we usually work with data that are samples drawn from a population. We use sample data to make estimates about the population. The first estimate is usually a point estimate (central tendency).

    As we will see in Chapter 2, Confidence Intervals, these point estimates are subject to sampling error and should be interpreted with caution, especially for small sample sizes. The accuracy of the point estimates becomes greater as the sample size gets larger.

    There are several point estimates commonly made by quality technicians, quality engineers, and reliability engineers. These include estimates of central tendency, such as the mean (average), median, and mode.

    Estimates of dispersion include the range, variance, standard deviation, coefficient of variation, and others. Descriptions of the shape of the distribution include skewness and kurtosis.

    Estimates of Central Tendency for Variables Data

    The most common measure of central tendency is the average or sample mean. The true (unknown) population mean is denoted by the letter μ and is estimated by X-bar ($\bar{X}$). To estimate the parameter μ using X-bar, we use the following formula:

    $$\bar{X} = \frac{\sum X}{n}$$

    where

    X = Data point

    n = Number of data points

    Example: Using the following seven data points, estimate the population mean μ by finding X-bar:

    43, 45, 40, 39, 42, 44, 41

    $$\bar{X} = \frac{43 + 45 + 40 + 39 + 42 + 44 + 41}{7} = \frac{294}{7} = 42$$

    Another estimate of central tendency or location is the median. The median is simple to determine because it requires little or no calculation. The median is most useful when there are outlying data points that could artificially inflate or deflate the arithmetic mean. To find the median, place the data points in order, generally with the lowest value on the left and the greatest value on the right.

    Example: Using the following seven data points, determine the median:

    43, 45, 40, 39, 42, 44, 41

    Order the data points and select the point in the middle:

    39, 40, 41, 42, 43, 44, 45

    This example yields 42 as the median value for these seven data points. In the case where there is an even number of data points, add the two values in the middle and divide by two.

    Example: Using the following six data points, determine the median:

    43, 40, 39, 42, 44, 41

    Order the data points, add the two points in the middle, and divide by two:

    39, 40, 41, 42, 43, 44

    $$\text{Median} = \frac{41 + 42}{2} = 41.5$$

    The calculated median value is 41.5.

    The mode is the most frequently occurring value(s) in a set of data. A set of data may contain one mode, two modes (bimodal), many modes, or no mode.

    Example: Given the following data points, determine the mode:

    39, 40, 41, 41, 42, 43

    The mode is 41, as it is the most frequently appearing value in the data set.

    When the population set is unimodal and symmetrical, such as in the normal (Gaussian) distribution, the values of mean, median, and mode will occur at the same location, as shown in Figure 1.1. When the distribution is skewed, these values diverge, as shown in Figures 1.2 and 1.3.
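    The three estimates of central tendency can be checked with a short script. This is a minimal sketch that uses Python's standard `statistics` module (Python 3.8 or later for `multimode`); the variable names are illustrative and do not come from the book.

```python
from statistics import mean, median, multimode

data = [43, 45, 40, 39, 42, 44, 41]

x_bar = mean(data)                                # sum of the values divided by n
median_odd = median(data)                         # middle value of the ordered data
median_even = median([43, 40, 39, 42, 44, 41])    # mean of the two middle values
modes = multimode([39, 40, 41, 41, 42, 43])       # list of the most frequent value(s)

print(x_bar, median_odd, median_even, modes)
```

    `multimode` returns a list, so bimodal data yield both modes. When every value occurs exactly once it returns all of the values, which corresponds to the no-mode case described above.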

    Range for Variables Data

    The range is the simplest method of measuring the spread of sample or population data. To calculate the range, we use the following formula:

    r = Xh – Xl

    where

    r = Range

    Xh = Largest value

    Xl = Smallest value

    Example: Using the following seven data points, determine the range:

    43, 45, 40, 39, 42, 44, 41

    Order the data points and select the largest and smallest values:

    39, 40, 41, 42, 43, 44, 45

    r = Xh – Xl = 45 – 39 = 6

    The range of this set of data is 6.
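    The same calculation can be sketched in a couple of lines. The function name `data_range` is a hypothetical helper chosen for illustration; note that the data do not need to be sorted first, since `max` and `min` find the extremes directly.

```python
def data_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)

r = data_range([43, 45, 40, 39, 42, 44, 41])
print(r)  # 6
```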

    Variance and Standard Deviation for Variables Data

    This section focuses on the two primary measures of dispersion or variation for individual values: the variance σ², which is estimated from sample data by the statistic s², and the standard deviation σ, which is estimated by the statistic s. The standard deviation is the square root of the variance. For variables data, when all values are available, the formula for calculating the population variance is

    $$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$

    where

    μ = Population mean

    x = Data point

    N = Number of data points in the population

    The formula for calculating the population standard deviation is

    $$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

    or

    $$\sigma = \sqrt{\sigma^2}$$

    Example: Using the following seven data points, determine the population variance and standard deviation:

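    43, 45, 40, 39, 42, 44, 41

    $$\sigma^2 = \frac{\sum (x - \mu)^2}{N} = \frac{28}{7} = 4$$

    The calculated variance is 4.

    $$\sigma = \sqrt{\sigma^2} = \sqrt{4} = 2$$

    The calculated standard deviation is 2.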
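    As a cross-check on the example, the following sketch computes the population variance and standard deviation twice: once by dividing the sum of squared deviations by N directly, and once with `pvariance` and `pstdev` from Python's standard `statistics` module. The variable names are illustrative only.

```python
from math import sqrt
from statistics import pstdev, pvariance

data = [43, 45, 40, 39, 42, 44, 41]

# Direct translation of the population formula: divide by N.
mu = sum(data) / len(data)
sigma_sq = sum((x - mu) ** 2 for x in data) / len(data)
sigma = sqrt(sigma_sq)

# The standard library population functions give the same values.
lib_sigma_sq = pvariance(data)
lib_sigma = pstdev(data)

print(sigma_sq, sigma)
```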

    Generally, though, the focus is on using a sample drawn from the population to make inferences about the population. Because the sample mean is itself calculated from the sample data, only n – 1 degrees of freedom (df) remain from which to calculate the spread of the data. It would not be statistically correct to divide by n.

    The mean calculated from the sample data is not the true mean (parameter); rather, it is an estimate of the mean based on samples of data (statistic). The estimate is biased toward fitting the sample data because the sample data were used for the calculation. The spread around the true mean is estimated by calculating the spread around the estimated mean, which yields a biased value that is slightly too low. Using n – 1 instead of n compensates for the bias. To make an unbiased estimate of the population variance from sample data, the formula is

    $$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

    where

    X-bar = Sample mean

    X = Data point

    n = Number of data points in the sample

    The formula for calculating the sample standard deviation is

    $$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$

    Example: Using the following seven data points, determine the sample variance and standard deviation:

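    43, 45, 40, 39, 42, 44, 41

    $$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{28}{6} = 4.67$$

    The calculated variance is 4.67.

    $$s = \sqrt{s^2} = \sqrt{4.67} = 2.16$$

    The calculated standard deviation is 2.16.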
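    The effect of dividing by n – 1 rather than n can be seen by computing both from the same sum of squared deviations. This is a minimal sketch; `variance` and `stdev` are the sample (n – 1) functions in Python's standard `statistics` module, and the other names are illustrative.

```python
from statistics import mean, stdev, variance

data = [43, 45, 40, 39, 42, 44, 41]
n = len(data)
x_bar = mean(data)

sum_sq = sum((x - x_bar) ** 2 for x in data)   # sum of squared deviations
divide_by_n = sum_sq / n                       # too low as an estimate of the population variance
s_sq = sum_sq / (n - 1)                        # sample variance with n - 1 degrees of freedom

print(round(divide_by_n, 2), round(s_sq, 2), round(stdev(data), 2))
```

    For these seven points the two divisors give 4 and about 4.67. The gap narrows as n grows, since n and n – 1 become nearly equal.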

    The coefficient of variation is a normalized measure of dispersion. It expresses the amount of variation relative to the mean as a percentage. A smaller percentage indicates less variation relative to the mean. The coefficient of variation is calculated by the following formulas:

    Population

    $$CV = \frac{\sigma}{\mu} \times 100$$

    Sample

    $$CV = \frac{s}{\bar{X}} \times 100$$

    where

    μ = Population mean

    X-bar = Sample mean

    σ = Population standard deviation

    s = Sample standard deviation

    Example: Calculate the coefficient of variation for the following population values:

    μ = 42 and σ = 2

    $$CV = \frac{2}{42} \times 100 = 4.76\%$$

    The calculated population coefficient of variation is 4.76%.

    Example: Calculate the coefficient of variation for the following sample values:

    X-bar = 42 and s = 2.16

    $$CV = \frac{2.16}{42} \times 100 = 5.14\%$$

    The calculated sample coefficient of variation is 5.14%.
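    Both examples reduce to the same ratio of standard deviation to mean, so a single helper covers the population and sample cases. The function name `coefficient_of_variation` is hypothetical and chosen only for this sketch.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation as a percentage of the mean."""
    return std_dev / mean * 100

cv_population = coefficient_of_variation(2, 42)      # sigma = 2, mu = 42
cv_sample = coefficient_of_variation(2.16, 42)       # s = 2.16, X-bar = 42

print(round(cv_population, 2), round(cv_sample, 2))
```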

    Skewness and Kurtosis for Variables Data

    Skewness and kurtosis are two measures that describe the shape of a distribution. Skewness is a measure of symmetry about the center of the distribution (Figure 1.4). The skewness of a normal distribution is zero. Negative skewness indicates the data are skewed to the left (the left tail is longer than the right tail) and positive skewness indicates the data are skewed to the right (the right tail is longer than the left tail). Note that other sources use different formulae. The formula to calculate skewness is

    Skewness =

    where

    X-bar = Sample mean

    s = Sample standard deviation

    X = Data point

    n = Number of data points in the sample

    Caution must be exercised when analyzing the calculated skewness from a sample of the population. The sample may not be indicative of the population. The following table provides a guideline for interpreting the calculated skewness value.

    Kurtosis is a measure that describes the flatness or peakedness of a distribution relative to the normal distribution (Figure 1.5). Generally, kurtosis describes the closeness of the data points relative to the center of the distribution. A normal distribution is called mesokurtic. A distribution that is flattened is platykurtic. A distribution that has a sharp peak is leptokurtic. Higher values indicate peakedness, and lower values indicate a less pronounced peak. The formula for calculating kurtosis is

    Kurtosis =

    where

    X-bar = Sample mean

    s = Sample standard deviation

    X = Data point

    n = Number of data points in the sample

    Caution must be exercised when analyzing the calculated kurtosis from a sample of the population. The sample may not be indicative of the population. The following table provides a guideline for interpreting the calculated kurtosis value.

    Example: Using the following seven data points, determine the skewness and kurtosis:

    43, 45, 40, 39, 42, 44, 41

    Skewness =

    Kurtosis =
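    The book's own skewness and kurtosis formulas are not reproduced in this preview, so the sketch below does not attempt to match them. It assumes instead the common moment-based definitions, skewness = m3 / m2^1.5 and kurtosis = m4 / m2^2, where mk is the k-th central moment computed with a divisor of n. Formulas that use s and a sample-size correction will give somewhat different numbers for small samples. All function names here are hypothetical.

```python
def central_moment(data, k):
    """k-th central moment with divisor n (an assumed definition, not the book's)."""
    n = len(data)
    x_bar = sum(data) / n
    return sum((x - x_bar) ** k for x in data) / n

def moment_skewness(data):
    """Moment-based skewness: m3 / m2^1.5."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def moment_kurtosis(data):
    """Moment-based kurtosis: m4 / m2^2 (equals 3 for a normal distribution)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

data = [43, 45, 40, 39, 42, 44, 41]
skew = moment_skewness(data)
kurt = moment_kurtosis(data)
print(skew, kurt)
```

    Under these assumed definitions the seven data points are perfectly symmetric about 42, so the skewness is 0, and the kurtosis of 1.75 is below the normal-distribution value of 3, which points to a flatter (platykurtic) shape.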

    Estimates of Central Tendency for Attributes Data

    When working with attributes data, the measures of central tendency analogous to the average are the proportion and the expected number of occurrences. For nonconforming product, the fraction nonconforming is estimated by the statistic p, where p is found as

    $$p = \frac{\text{Number of nonconforming units}}{\text{Number of units tested}}$$

    A more general equation could be written as

    $$p = \frac{\text{Number of occurrences}}{\text{Total number of opportunities}}$$

    Example: If we test 700 units and find 16 to be nonconforming, the estimate of the population fraction nonconforming would be

    $$p = \frac{16}{700} = 0.023$$

    To convert to a percentage, multiply the p value by 100:

    0.023 * 100 = 2.3%

    When there are several inspection points per unit, such as on a circuit board, there is a possibility of more than one nonconformance per unit. When the opportunity for occurrence is the same, as in identical units, the average number of nonconformances is estimated using the c statistic. The c statistic is calculated using the following formula:

    $$c = \frac{\text{Number of nonconformances}}{\text{Number of units inspected}}$$

    Example: If 500 circuit boards reveal 1270 nonconformances, the estimate of the number of nonconformances per unit in the population is

    $$c = \frac{1270}{500} = 2.54 \text{ nonconformances per unit}$$
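    Both attributes estimates are simple ratios, which the following sketch makes explicit using the counts from the two examples. The function names are hypothetical helpers, not terms from the book.

```python
def fraction_nonconforming(nonconforming_units, units_tested):
    """Estimate p: the share of tested units that are nonconforming."""
    return nonconforming_units / units_tested

def nonconformances_per_unit(nonconformances, units_inspected):
    """Estimate c: the average count of nonconformances on one unit."""
    return nonconformances / units_inspected

p = fraction_nonconforming(16, 700)
c = nonconformances_per_unit(1270, 500)

print(round(p, 3), round(p * 100, 1), c)
```

    Unlike p, the c estimate can exceed 1 because a single unit may carry several nonconformances.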

    Estimates of Dispersion for Attributes Data

    When working with fraction nonconforming, the estimate of the variance is a function of the binomial distribution and is given as

    Sample variance

    $$s^2 = p(1 - p)$$

    Sample standard deviation

    $$s = \sqrt{p(1 - p)}$$

    or

    $$s = \sqrt{s^2}$$

    where

    p = Fraction nonconforming

    Example: If we test 700 units and find 16 to be nonconforming, the estimate of the population fraction nonconforming is p = 0.023. We calculate the variance as

    $$s^2 = 0.023(1 - 0.023) = 0.0225$$

    The standard deviation is the square root of the variance:

    $$s = \sqrt{0.0225} = 0.15$$

    If we are interested in the number of nonconformances per unit or similar count data, we may model the variance by the Poisson distribution. When the data can be modeled by the Poisson distribution, the variance is equal to the mean value.

    Example: If 500 circuit boards reveal 1270 nonconformances, the estimate of the number of nonconformances per unit in the population is 2.54. Calculate the variance and standard deviation:

    s² = 2.54 is the calculated sample variance.

    s = √2.54 = 1.59 is the calculated sample standard deviation.
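    The two attributes dispersion examples can be sketched together. The binomial part assumes the per-unit variance p(1 – p), which is an assumption on my part because the formula itself is not visible in this preview; the Poisson part uses the stated property that the variance equals the mean. The function names are hypothetical.

```python
from math import sqrt

def binomial_dispersion(p):
    """Assumed per-unit binomial variance p(1 - p) and its square root."""
    variance = p * (1 - p)
    return variance, sqrt(variance)

def poisson_dispersion(mean_count):
    """Poisson model: the variance equals the mean count."""
    return mean_count, sqrt(mean_count)

var_p, sd_p = binomial_dispersion(0.023)   # fraction nonconforming example
var_c, sd_c = poisson_dispersion(2.54)     # nonconformances per unit example

print(round(var_p, 4), round(sd_p, 2), var_c, round(sd_c, 2))
```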

    Standard Error

    The standard error (SE) is the standard deviation of the sampling distribution of a statistic. A low SE means there is relatively less spread in the sampling distribution. The SE indicates the likely accuracy of the sample mean (X-bar) as compared with the population mean (μ). The SE decreases as the sample size increases. Figure 1.6 shows the inverse relationship between sample size and the SE. The SE for the population is calculated by

    $$SE = \frac{\sigma}{\sqrt{n}}$$

    where

    σ = Population standard deviation

    n = Number of samples

    Example: The standard deviation for a population of 7 was found to be 2. Calculate the SE.

    $$SE = \frac{2}{\sqrt{7}} = 0.756$$

    Generally, the true population standard deviation (σ) will not be known, so we must use the sample standard deviation (s). The SE for a sample is calculated by

    $$SE = \frac{s}{\sqrt{n}}$$

    where

    s = Sample standard deviation

    n = Number of samples

    Example: The standard deviation for a sample of 7 was found to be 2.16. Calculate the SE.

    $$SE = \frac{2.16}{\sqrt{7}} = 0.816$$
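    The inverse relationship between sample size and the SE is easy to see numerically. This sketch assumes the usual form of the standard error of the mean, the standard deviation divided by the square root of n; the function name and the extra sample sizes in the loop are illustrative only.

```python
from math import sqrt

def standard_error(std_dev, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return std_dev / sqrt(n)

se_population = standard_error(2, 7)      # sigma = 2, n = 7
se_sample = standard_error(2.16, 7)       # s = 2.16, n = 7
print(round(se_population, 3), round(se_sample, 3))

# Quadrupling n halves the SE when the standard deviation is unchanged.
for n in (7, 28, 112):
    print(n, round(standard_error(2.16, n), 3))
```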

    2

    Confidence Intervals

    In Chapter 1, we made estimates from a single sample of data drawn from a population. If we
