Solution 1

The document contains results from several statistical tests and analyses: 1) a two-sample t-test comparing the means of Materials A and B, finding no significant difference between them; 2) a Gage R&R study using ANOVA to assess measurement system variability, finding that most of the variation is due to differences between parts; 3) a one-way ANOVA comparing C2F6 gas flows, where the difference between the flow means is borderline (P = 0.053); and 4) explanations of box plots, descriptive statistics, histograms, and the normal distribution, with a normality check of a Layer Thickness data set.


Solution 1) Null Hypothesis: Material A and Material B give the same output.

Alternative Hypothesis: There is a difference between the outputs of the two materials.

Two-Sample T-Test and CI: Mat-A, Mat-B


Two-sample T for Mat-A vs Mat-B

         N   Mean  StDev  SE Mean
Mat-A   10  10.63   2.45     0.78
Mat-B   10  11.04   2.52     0.80

Difference = mu (Mat-A) - mu (Mat-B)
Estimate for difference:  -0.41
95% CI for difference:  (-2.75, 1.93)
T-Test of difference = 0 (vs not =): T-Value = -0.37  P-Value = 0.717  DF = 17

Since the P-value (0.717) is greater than alpha (0.05), we fail to reject the null hypothesis: the data provide no evidence of a significant difference between the outputs of the two materials.
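For illustration only (not part of the original solution), the same kind of two-sample comparison can be run in Python with scipy. The mat_a and mat_b arrays below are placeholders standing in for the ten measurements per material, which are not listed in the document.

```python
# Illustrative sketch: a two-sample t-test in the style of the Minitab output
# above. The data are simulated placeholders, not the original measurements.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mat_a = rng.normal(loc=10.63, scale=2.45, size=10)  # stand-in for Mat-A
mat_b = rng.normal(loc=11.04, scale=2.52, size=10)  # stand-in for Mat-B

# equal_var=False requests Welch's t-test, which matches the DF = 17 reported
# above (a pooled-variance test on 10 + 10 observations would use DF = 18).
t_stat, p_value = stats.ttest_ind(mat_a, mat_b, equal_var=False)
print(f"T-Value = {t_stat:.2f}, P-Value = {p_value:.3f}")
```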

Individual Value Plot of Mat-A, Mat-B

[Figure: individual value plot of the Mat-A and Mat-B samples; y-axis "Data" runs from 6 to 15]

Boxplot of Mat-A, Mat-B

[Figure: boxplots of the Mat-A and Mat-B samples; y-axis "Data" runs from 6 to 15]

Solution 2)

Gage R&R Study - ANOVA Method

Two-Way ANOVA Table With Interaction

Source            DF       SS       MS        F      P
Part               9  88.3619  9.81799  492.291  0.000
Operator           2   3.1673  1.58363   79.406  0.000
Part * Operator   18   0.3590  0.01994    0.434  0.974
Repeatability     60   2.7589  0.04598
Total             89  94.6471

Alpha to remove interaction term = 0.25

Two-Way ANOVA Table Without Interaction

Source         DF       SS       MS        F      P
Part            9  88.3619  9.81799  245.614  0.000
Operator        2   3.1673  1.58363   39.617  0.000
Repeatability  78   3.1179  0.03997
Total          89  94.6471

Gage R&R

Source             VarComp  %Contribution (of VarComp)
Total Gage R&R     0.09143    7.76
  Repeatability    0.03997    3.39
  Reproducibility  0.05146    4.37
    Operator       0.05146    4.37
Part-To-Part       1.08645   92.24
Total Variation    1.17788  100.00

Source             StdDev (SD)  Study Var (9 * SD)  %Study Var (%SV)
Total Gage R&R         0.30237             2.72134             27.86
  Repeatability        0.19993             1.79940             18.42
  Reproducibility      0.22684             2.04154             20.90
    Operator           0.22684             2.04154             20.90
Part-To-Part           1.04233             9.38095             96.04
Total Variation        1.08530             9.76770            100.00

Number of Distinct Categories = 4

Gage R&R (ANOVA) for Response

[Figure: Minitab Gage R&R (ANOVA) graphical report for Response. Panels: Components of Variation (%Contribution and %Study Var for Gage R&R, Repeatability, Reproducibility, and Part-to-Part); R Chart by Operator (UCL = 0.880, R-bar = 0.342, LCL = 0); Xbar Chart by Operator (UCL = 0.351, X-bar = 0.001, LCL = -0.348); Response by Part; Response by Operator; Operator * Part Interaction]

Interpretation:
Number of distinct categories = 4. This value tells you how many separate groups of parts the measurement system can reliably distinguish. Four categories is generally considered too few for estimating process parameters and indices; the system only provides coarse estimates (five or more categories are normally recommended).

Components of variation: this is a good measurement system, because the largest component of variation is part-to-part variation (92.24% of the total variance) rather than the gage itself (7.76%).
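As a cross-check (not part of the original solution), the variance components reported in the Gage R&R table can be recovered from the mean squares of the reduced two-way ANOVA. The sketch below assumes the study layout implied by the degrees of freedom: 10 parts, 3 operators, and 3 replicates per part-operator combination.

```python
# Illustrative sketch: recompute the Gage R&R variance components from the
# mean squares of the two-way ANOVA without interaction (values from above).
ms_part, ms_operator, ms_repeat = 9.81799, 1.58363, 0.03997
n_parts, n_operators, n_reps = 10, 3, 3   # layout implied by DF = 9, 2, 78

var_repeatability = ms_repeat
var_operator = (ms_operator - ms_repeat) / (n_parts * n_reps)   # reproducibility
var_part = (ms_part - ms_repeat) / (n_operators * n_reps)       # part-to-part
var_gage = var_repeatability + var_operator                     # total Gage R&R
var_total = var_gage + var_part

print(f"Total Gage R&R: {var_gage:.5f}  ({100 * var_gage / var_total:.2f}%)")
print(f"Part-to-Part:   {var_part:.5f}  ({100 * var_part / var_total:.2f}%)")

# Number of distinct categories = 1.41 * (part-to-part SD / gage SD), truncated
ndc = int(1.41 * (var_part ** 0.5) / (var_gage ** 0.5))
print(f"Number of distinct categories = {ndc}")
```

Running this reproduces the 7.76% / 92.24% split and the four distinct categories shown above.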

Solution 3)
One-way ANOVA: Observations versus C2F6 Flow

Source      DF      SS     MS     F      P
C2F6 Flow    2   3.648  1.824  3.59  0.053
Error       15   7.630  0.509
Total       17  11.278

S = 0.7132   R-Sq = 32.34%   R-Sq(adj) = 23.32%

                          Individual 95% CIs For Mean
                          Based on Pooled StDev
Level   N    Mean   StDev
125     6  3.3167  0.7600
160     6  4.4167  0.5231
200     6  3.9333  0.8214

Pooled StDev = 0.7132

[Text-based interval plot omitted; its axis runs from 3.00 to 4.80]

Interpretation:
The ANOVA P-value is 0.053, which is just above the conventional alpha of 0.05. Strictly speaking, then, we fail to reject the null hypothesis at the 5% level; the evidence for a difference among the means of the three C2F6 flows is borderline rather than clearly significant.

In the given question R-Sq is 0.3234. R-Sq (R squared) is the fraction of the variance in the response that is explained by the model, here by C2F6 flow; in a two-variable setting it is the square of the correlation coefficient and tells us how well one variable predicts the other. If R-Sq is 1.0, then given the value of one variable you can perfectly predict the value of the other; if R-Sq is 0.0, knowing one variable does not help you predict the other at all. The higher the R-Sq value, the stronger the relationship between the variables. In the given question adjusted R-Sq is 0.2332. Adjusted R-Sq is a modification of R-Sq that adjusts for the number of explanatory terms in the model: unlike R-Sq, it increases only if a new term improves the model more than would be expected by chance. Adjusted R-Sq can be negative and is always less than or equal to R-Sq.
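For illustration only (not from the original document), a one-way ANOVA of this form can be run in Python with scipy. The three lists below are placeholders for the six observations recorded at each C2F6 flow, which are not reproduced here.

```python
# Illustrative sketch: one-way ANOVA across the three C2F6 flow levels.
# The observation lists are simulated placeholders, not the original data.
from scipy import stats

flow_125 = [2.8, 3.1, 3.3, 3.4, 3.5, 3.8]   # stand-in for the 6 runs at 125
flow_160 = [3.9, 4.2, 4.4, 4.5, 4.6, 4.9]   # stand-in for the 6 runs at 160
flow_200 = [3.2, 3.6, 3.9, 4.0, 4.2, 4.7]   # stand-in for the 6 runs at 200

f_stat, p_value = stats.f_oneway(flow_125, flow_160, flow_200)
print(f"F = {f_stat:.2f}, P = {p_value:.3f}")
```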

Boxplot of Observations

[Figure: boxplots of Observations (y-axis from 2.5 to 5.0) for the three C2F6 Flow levels 125, 160, and 200]

Residual Plots for Observations

[Figure: four-panel residual plot for Observations: normal probability plot of the residuals, residuals versus fitted values, histogram of the residuals, and residuals versus observation order]

Solution 4)

Box Plot:
A box plot is a way of summarizing a set of data measured on an interval scale and is often used in exploratory data analysis. It is a type of graph that shows the shape of the distribution, its central value, and its variability. The picture produced consists of the most extreme values in the data set (the maximum and minimum values), the lower and upper quartiles, and the median.

Example:

These MINITAB boxplots represent lottery payoffs for winning numbers for three time periods (May 1975-March 1976, November 1976-September 1977, and December 1980-September 1981). The median of each data set is indicated by the black center line, and the first and third quartiles are the edges of the red area, which is known as the interquartile range (IQR). The extreme values (within 1.5 times the interquartile range of the upper or lower quartile) are the ends of the lines extending from the IQR. Points farther than 1.5 times the IQR from the nearer quartile are plotted individually as asterisks; these points represent potential outliers. In this example, the three box plots have nearly identical median values. The IQR decreases from one time period to the next, indicating reduced variability of payoffs in the second and third periods. In addition, the extreme values are closer to the median in the later time periods.
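As a small illustration of the quantities a box plot summarizes (not part of the original text), the sketch below computes the five-number summary and the 1.5 x IQR outlier fences for an arbitrary sample.

```python
# Illustrative sketch: the statistics a box plot displays, computed with numpy.
import numpy as np

data = np.array([3.1, 3.4, 3.6, 3.8, 4.0, 4.1, 4.3, 4.6, 5.0, 9.2])  # arbitrary sample

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1                                  # interquartile range (the "box")
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]  # plotted as asterisks

print(f"min={data.min()}, Q1={q1}, median={median}, Q3={q3}, max={data.max()}")
print(f"IQR={iqr:.3f}, fences=({lower_fence:.3f}, {upper_fence:.3f}), outliers={outliers}")
```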

Descriptive Statistics:
"True" Mean and Confidence Interval. Probably the most often used descriptive statistic is the mean. The mean is a particularly informative measure of the "central tendency" of the variable if it is reported along with its confidence intervals. As mentioned earlier, usually we are interested in statistics (such as the mean) from our sample only to the extent to which they can infer information about the population. The confidence intervals for the mean give us a range of values around the mean where we expect the "true" (population) mean is located (with a given level of certainty, see also Elementary Concepts). For example, if the mean in your sample is 23, and the lower and upper limits of the p=.05 confidence

interval are 19 and 27 respectively, then you can conclude that there is a 95% probability that the population mean is greater than 19 and lower than 27. If you set the p-level to a smaller value, then the interval would become wider thereby increasing the "certainty" of the estimate, and vice versa; as we all know from the weather forecast, the more "vague" the prediction (i.e., wider the confidence interval), the more likely it will materialize. Note that the width of the confidence interval depends on the sample size and on the variation of data values. The larger the sample size, the more reliable its mean. The larger the variation, the less reliable the mean (see also Elementary Concepts). The calculation of confidence intervals is based on the assumption that the variable is normally distributed in the population. The estimate may not be valid if this assumption is not met, unless the sample size is large, say n=100 or more. Shape of the Distribution, Normality. An important aspect of the "description" of a variable is the shape of its distribution, which tells you the frequency of values from different ranges of the variable. Typically, a researcher is interested in how well the distribution can be approximated by the normal distribution (see the animation below for an example of this distribution) (see also Elementary Concepts). Simple descriptive statistics can provide some information relevant to this issue. For example, if the skewness (which measures the deviation of the distribution from symmetry) is clearly different from 0, then that distribution is asymmetrical, while normal distributions are perfectly symmetrical. If the kurtosis (which measures "peakedness" of the distribution) is clearly different from 0, then the distribution is either flatter or more peaked than normal; the kurtosis of the normal distribution is 0.
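As a worked illustration of the confidence-interval idea described above (not from the original text), the sketch below computes a 95% confidence interval for a sample mean using the t distribution; the sample values are arbitrary.

```python
# Illustrative sketch: 95% confidence interval for a population mean,
# computed from an arbitrary sample (not data from the document).
import numpy as np
from scipy import stats

sample = np.array([21.0, 25.5, 19.8, 26.2, 22.4, 24.1, 20.9, 23.7])
mean = sample.mean()
sem = stats.sem(sample)                          # standard error of the mean
t_crit = stats.t.ppf(0.975, df=len(sample) - 1)  # two-sided 95% critical value
ci_low, ci_high = mean - t_crit * sem, mean + t_crit * sem
print(f"mean = {mean:.1f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```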

More precise information can be obtained by performing one of the tests of normality to determine the probability that the sample came from a normally distributed population of observations (e.g., the Kolmogorov-Smirnov test or the Shapiro-Wilk W test). However, none of these tests can entirely substitute for a visual examination of the data using a histogram (i.e., a graph that shows the frequency distribution of a variable).

The graph allows you to evaluate the normality of the empirical distribution because it also shows the normal curve superimposed over the histogram. It also allows you to examine various aspects of the distribution qualitatively. For example, the distribution could be bimodal (have 2 peaks). This might suggest that the sample is not homogeneous but possibly its elements came from two different populations, each more or less normally distributed. In such cases, in order to understand the nature of the variable in question, you should look for a way to quantitatively identify the two sub-samples.
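For illustration only, the tests of normality mentioned above are available in scipy; the sketch below applies the Shapiro-Wilk and Kolmogorov-Smirnov tests to an arbitrary simulated sample.

```python
# Illustrative sketch: formal normality tests applied to a simulated sample.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = rng.normal(loc=50.0, scale=5.0, size=100)   # placeholder data

w_stat, p_shapiro = stats.shapiro(x)
# Kolmogorov-Smirnov against a normal with the sample's own mean and SD
# (note: estimating the parameters from the data makes this p-value approximate).
ks_stat, p_ks = stats.kstest(x, 'norm', args=(x.mean(), x.std(ddof=1)))

print(f"Shapiro-Wilk: W = {w_stat:.3f}, p = {p_shapiro:.3f}")
print(f"Kolmogorov-Smirnov: D = {ks_stat:.3f}, p = {p_ks:.3f}")
```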

Histogram:
A histogram is a graphical representation, similar in structure to a bar chart, that organizes a group of data points into user-specified ranges. It condenses a data series into an easily interpreted visual by grouping many data points into logical ranges, or bins.
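A minimal sketch (not from the original text) of the binning a histogram performs, using numpy:

```python
# Illustrative sketch: grouping data points into bins, as a histogram does.
import numpy as np

values = np.array([4.2, 4.8, 5.1, 5.5, 5.9, 6.0, 6.3, 6.8, 7.2, 7.9])
counts, bin_edges = np.histogram(values, bins=4)   # four equal-width bins
for count, lo, hi in zip(counts, bin_edges[:-1], bin_edges[1:]):
    print(f"[{lo:.2f}, {hi:.2f}): {count}")
```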

Normal Probability:
A random variable X whose distribution has the shape of a normal curve is called a normal random variable.

Properties of a Normal Distribution
1. The normal curve is symmetrical about the mean μ;
2. The mean is at the middle and divides the area into two halves;
3. The total area under the curve is equal to 1;
4. It is completely determined by its mean μ and standard deviation σ (or variance σ²).

Note: in a normal distribution, only 2 parameters are needed, namely μ and σ².

Normal Curve

A random variable X is said to be normally distributed with mean μ and standard deviation σ if its probability density is given by

    f(x) = (1 / (σ√(2π))) * exp(-(x - μ)² / (2σ²)),   for -∞ < x < ∞
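As a quick numeric check of the properties listed above (not part of the original text), the sketch below evaluates this normal density and confirms that the total area under the curve is 1; the example parameters are taken from the Layer Thickness plot that follows.

```python
# Illustrative sketch: the normal probability density and its unit total area.
import numpy as np
from scipy import stats

mu, sigma = 450.0, 13.43                      # example parameters (Layer Thickness plot)
x = np.linspace(mu - 6 * sigma, mu + 6 * sigma, 10001)
pdf = stats.norm.pdf(x, loc=mu, scale=sigma)  # f(x) as defined above

area = pdf.sum() * (x[1] - x[0])              # simple numerical integration
symmetric = np.isclose(stats.norm.pdf(mu - sigma, mu, sigma),
                       stats.norm.pdf(mu + sigma, mu, sigma))
print(f"area under the curve ~ {area:.4f}")   # ~ 1.0
print(f"symmetric about the mean: {symmetric}")
```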

Layer Thickness

[Figure: normal probability plot of Thickness (x-axis from about 410 to 500; y-axis Percent, 0.1 to 99.9). Legend: Mean = 450.0, StDev = 13.43, N = 100, AD = 0.165, P-Value = 0.939]

Yes, the Layer Thickness distribution given in the question is normal, as judged from the probability plot: the points follow the reference line closely, and the Anderson-Darling test reported in its legend (AD = 0.165, P-Value = 0.939) gives no evidence against normality.
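For illustration only, a similar check can be made in Python: scipy.stats.anderson returns the Anderson-Darling statistic reported in the plot's legend, and scipy.stats.probplot produces the points of a normal probability plot. The thickness values below are simulated stand-ins, not the 100 measurements from the question.

```python
# Illustrative sketch: Anderson-Darling normality check and probability-plot
# data for simulated stand-in thickness values (not the original data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
thickness = rng.normal(loc=450.0, scale=13.43, size=100)   # stand-in sample

ad = stats.anderson(thickness, dist='norm')
print(f"AD statistic = {ad.statistic:.3f}")
print("critical values:", dict(zip(ad.significance_level, ad.critical_values)))

# osm/osr are the theoretical and ordered sample quantiles of the plot;
# points close to a straight line (r near 1) indicate normality.
(osm, osr), (slope, intercept, r) = stats.probplot(thickness, dist='norm')
print(f"probability-plot correlation r = {r:.3f}")
```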
