

DESIGN OF EXPERIMENTS

Glossary

Analysis of Variance (ANOVA) A statistical technique for comparing a number
of groups to determine whether they all may be part of a single parent
population.

Aliasing Inability to distinguish between a factor or interaction effect and another
factor or interaction effect as a result of using a fractional design. AKA
Confounding.

Alpha Risk The probability of wrongly concluding, based on statistics, that a
factor is important when it actually has no effect; the probability of a
Type I error. The lower the probability, the higher the statistical
significance. AKA Significance Level.

Average The sum of the observations divided by the number of observations. A
measure of the central tendency of the data. AKA Mean.

Bias Error Systematic variation in process output caused by an unintended pattern of
changes in process conditions.

Block A portion of the experimental samples recognized as likely to be more
similar (i.e., homogeneous) to one another than the samples as a whole.
Blocking is a technique used to increase the precision of the experiment.

Blocking Deliberate association of some elements of an experimental design so
that a particular contrast is reinforced in the design or an unavoidable
contrast is specifically taken into account and quantified.

Box-Behnken Design An advanced design that uses three levels of each factor to generate a full
data set suitable for response surface methods.

Central Composite Design An advanced design that uses five levels of each factor to generate a full
data set suitable for response surface methods.

Central Limit Theorem A mathematical principle stating that averages are much more reliable
indicators of the value of some property than any individual
measurement, and that the larger the sample size used to calculate the
average, the better the estimate it gives. This is because averages tend to
be normally distributed whether or not the population is.
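
The effect described above can be checked with a small simulation. This is only a sketch under assumed settings: the uniform(0, 1) population, the sample size of 30, and the 2000 repetitions are illustrative choices, not values from the glossary. It compares the observed spread of the averages with the spread the theorem predicts (population standard deviation divided by the square root of the sample size).

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable


def sample_mean(n):
    """Average of n draws from a uniform(0, 1) population (not normal)."""
    return statistics.fmean(random.random() for _ in range(n))


n = 30
means = [sample_mean(n) for _ in range(2000)]

population_sd = (1 / 12) ** 0.5         # SD of a uniform(0, 1) population
expected_sd = population_sd / n ** 0.5  # predicted spread of the averages
observed_sd = statistics.stdev(means)   # spread actually seen in the simulation

print(round(expected_sd, 4), round(observed_sd, 4))
```

The two printed values should be close, and a histogram of `means` would look roughly bell-shaped even though the population is flat.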

Coding Use of special designations for variable levels to make examination of
the pattern easier, e.g., using -1 for 350 °F, 0 for 375 °F, and +1 for
400 °F.
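
A coded level is usually obtained by centering the physical setting on the midpoint of its range and dividing by half the range. The sketch below assumes that convention; the function name `code_level` is hypothetical.

```python
def code_level(value, low, high):
    """Map a physical setting onto the coded -1..+1 scale."""
    center = (low + high) / 2.0      # midpoint of the range maps to 0
    half_range = (high - low) / 2.0  # distance from the center to either extreme
    return (value - center) / half_range


# Oven temperatures from the glossary example, in degrees Fahrenheit.
coded = [code_level(t, 350, 400) for t in (350, 375, 400)]
print(coded)  # [-1.0, 0.0, 1.0]
```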

Coefficient The numerical part of a term, usually written before the literal part, such
as 3 in the term 3x.

Confidence Interval A range of values within which a particular number of interest is
expected to fall, at some specific level of probability.

Confounding Inability to distinguish between a factor or interaction effect and another
factor or interaction effect as a result of using a fractional design. AKA
Aliasing.

Continuous Factor A control variable that can be changed over many levels, such as voltage,
temperature, or ingredient percentage.

Correlation Coefficient A statistic that provides a normalized and scale-free measurement of the
linear association between two variables. The coefficient values fall
between -1 and +1. A positive correlation indicates that the variables
vary in the same direction while a negative correlation indicates that the
variables vary in the opposite direction. Statistically independent
variables have an expected correlation of 0.
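
As a minimal sketch of the definition above, the Pearson correlation coefficient can be computed from the sums of squares and cross-products of the deviations from each mean. The function name `pearson_r` and the data are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cross-product and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # y rises with x
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # y falls as x rises
print(r_up, r_down)
```

A perfectly increasing linear relationship gives a value at the +1 end of the scale and a perfectly decreasing one gives a value at the -1 end.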

Correlation Matrix A table of the Pearson correlation coefficients for the estimated
coefficients in a model; a table of correlation coefficients that shows all
the pairs of correlations for a set of variables; a matrix in which each row
and column corresponds to a variable.

Degrees of Freedom (df) A parameter that, in general, is the number of independent comparisons
that are available to estimate a specific parameter. For a fitted model, df
is equal to the number of independent observations minus the number of
estimated parameters. For an experimental factor, df is equal to the
number of levels minus one.

Design of Experiments (DOE, DOX) The methodology for doing experiments in
patterns instead of one at a time, with the benefit of extracting more
useful information at less cost than from traditional methods.

Design Resolution The degree of confounding in a two-level fractional design (screening
design).
• A Resolution III design does not confound main effects with one another,
but does confound main effects with two-factor interactions.
• A Resolution IV design does not confound main effects with one another or
with two-factor interactions, but does confound two-factor interactions
with other two-factor interactions.
• A Resolution V design does not confound main effects and two-factor
interactions with each other, but does confound two-factor interactions
with three-factor interactions.

Designed Experiment The complete specification of experimental runs, including blocking,
randomization, replication, and the assignment of factor-level
combinations.

DFITS A statistic that measures the scaled change in the fitted value for an
observation when that observation is removed from the data. It estimates
the influence of an individual observation on the fitted line in
regression. Both leverage and prediction error affect DFITS. An
observation has significant influence if its absolute value of DFITS is
greater than two times the square root of the ratio of the number of
coefficients to the number of observations.
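
Written as a formula, the cutoff stated above reads as follows, where the symbols p (number of coefficients in the model) and n (number of observations) are notation introduced here for illustration:

```latex
\left| \mathrm{DFITS}_i \right| > 2 \sqrt{\frac{p}{n}}
```

For example, with p = 3 coefficients and n = 27 observations the cutoff is 2 × sqrt(3/27) ≈ 0.67.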

Discrete Factor A factor with clearly separate levels. Not continuous.

Distribution The pattern in which numbers in a group vary within the overall
grouping.

Durbin-Watson Statistic A test for autocorrelation (serial correlation) in the
residuals of a least squares regression analysis. The statistic ranges
from 0 to 4, with values near 2 indicating no autocorrelation. As positive
autocorrelation increases, the Durbin-Watson statistic goes down toward 0.
The larger the correlation, the less reliable the results of the
regression analysis.
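
The statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below uses that standard formula on two made-up residual sequences; the function name `durbin_watson` and the data are illustrative only.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time (run) order."""
    # Numerator: squared changes between consecutive residuals.
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    # Denominator: sum of squared residuals.
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


trending = [1, 1, 1, -1, -1, -1]     # neighbors tend to share a sign
alternating = [1, -1, 1, -1, 1, -1]  # neighbors always switch sign

dw_trending = durbin_watson(trending)
dw_alternating = durbin_watson(alternating)
print(round(dw_trending, 3), round(dw_alternating, 3))
```

Residuals whose neighbors resemble each other give a value below 2, and residuals that flip sign every run give a value above 2.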

Error Sum of Squares In analysis of variance, the within-group sum of squares; that is, the part
that the treatment effects cannot explain. AKA Residual Sum of Squares.
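
To show how the error sum of squares fits into a one-way analysis of variance, the sketch below splits the total sum of squares into a treatment part and an error part, then forms the mean squares and the F ratio. The three groups and their values are made up for illustration.

```python
groups = {
    "A": [10, 12, 11],
    "B": [14, 15, 16],
    "C": [20, 19, 21],
}


def mean(values):
    return sum(values) / len(values)


observations = [y for ys in groups.values() for y in ys]
grand_mean = mean(observations)

# Total: deviations of every observation from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in observations)
# Error (within-group): deviations of each observation from its own group mean.
ss_error = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
# Treatment (between-group): deviations of group means from the grand mean.
ss_treatment = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())

df_treatment = len(groups) - 1             # number of levels minus one
df_error = len(observations) - len(groups)

ms_treatment = ss_treatment / df_treatment  # mean square = SS / df
ms_error = ss_error / df_error
f_ratio = ms_treatment / ms_error

print(ss_total, ss_treatment, ss_error, f_ratio)
```

The treatment and error sums of squares add up to the total, which is the decomposition the ANOVA table reports.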

Experimental Design A series of experimental trials arranged systematically to provide specific
information about a natural phenomenon.

Experimental Error The normal amount of variation seen in test results from experiments or
measurements done at the same conditions.

Experimental Run A specific combination of settings for each factor in an experiment. A
recipe.

Face-Centered Cubic Design An advanced design that uses three levels of each
factor to generate a full data set suitable for response surface methods.

Factor The process or environmental variables that affect the responses (can be
either qualitative or quantitative). Depending on the particular
environment, factors may also be referred to as treatments, process
variables, ingredients/components, input variables, independent
variables, or predictor variables. Examples include temperature,
pressure, operator, and machine type.

Fitted Value The estimated value for the response at observed values for the
independent variables.

Foldover Design A way to obtain a resolution IV design based upon two designs of
resolution III. Used when confirmation runs from a resolution III design
differ substantially from their prediction, and when you want to de-alias
the two-way interactions from the main effects.

Fractional-Factorial Design A fully balanced yet partial set of experiments. An orthogonal subset of a
full-factorial design.

F-Test A test based on the ratio of two sample variances, used to determine
if the variances of their populations are significantly different.

Full-Factorial Design A design that combines the levels of each factor with all the levels of
every other factor.
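
A full-factorial run list is the Cartesian product of the factor levels. In the sketch below the factor names and levels are hypothetical; any number of factors and levels works the same way.

```python
from itertools import product

# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "temperature": [350, 400],
    "time": [10, 20],
    "catalyst": ["A", "B"],
}

# Every level of each factor combined with every level of the others.
runs = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for run in runs:
    print(run)
print(len(runs), "runs")
```

Three two-level factors give 2 x 2 x 2 = 8 runs; the count is always the product of the numbers of levels.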

Gaussian Distribution A symmetric, bell-shaped distribution, completely determined by its
mean and standard deviation. Used to calculate probabilities of events
that tend to occur around a mean value and trail off with decreasing
likelihood. AKA Normal Distribution.

Half-Factorial Design The most basic fractional-factorial design. The number of runs is one
half that of a full-factorial design. Only simple confounding patterns
occur.

Histogram A bar diagram that represents a frequency distribution. The width of the
bars is equal to the class interval and the height proportional to the
number of values in the class.

Hyper-Graeco-Latin Squares A series of multilevel screening designs that permit examination of n
variables at n-1 levels. These designs have no capacity for detecting
interactions.

Interaction The existence of joint factor effects in which the effect of each factor
depends upon the level of other factors. A change in the response due to
the combination of two or more factors. An interaction involving two
factors is called a two-factor interaction, three factors is a three-factor
interaction, and so on.

Level A particular form or value of the factor studied. AKA Treatment

Leverage A statistic that measures how far an observation's values for the
independent variables lie from those of the other observations, and
therefore how much potential that observation has to influence the fitted
line in regression. A point has high leverage when its leverage value is
greater than three times the number of coefficients divided by the number
of observations.
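
For a straight-line fit with one independent variable, the leverage of each point has a simple closed form: 1/n plus the squared distance of x from its mean divided by the sum of squared distances. The sketch below assumes that simple-regression case (two coefficients: intercept and slope) and applies the three-times cutoff stated above; the data are made up.

```python
def leverages(xs):
    """Leverage (hat) values for a straight-line fit y = b0 + b1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1 / n + (x - mean_x) ** 2 / sxx for x in xs]


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # the last setting sits far from the rest
h = leverages(x)

n_coefficients = 2                    # intercept and slope
cutoff = 3 * n_coefficients / len(x)  # three times p/n
flagged = [xi for xi, hi in zip(x, h) if hi > cutoff]

print([round(hi, 3) for hi in h])
print("cutoff:", cutoff, "flagged x values:", flagged)
```

Only the remote setting exceeds the cutoff. The leverage values always sum to the number of coefficients, which is why the cutoff is expressed as a multiple of p/n.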

Mahalanobis Distance A standardized form of Euclidean distance. Data are standardized by
scaling responses in terms of standard deviations and adjustments are
made for intercorrelations between the variables.

Mean The sum of the observations divided by the number of observations. A
measure of the central tendency of the data. AKA Average.

Mean Absolute Error (MAE) The average of the absolute values of the residuals. A measure of
forecast accuracy calculated by summing the absolute values of the
individual forecast errors of the time series and dividing by the number
of observations. The MAE is appropriate when the cost of a forecast error
is linear and symmetric.
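
A minimal sketch of the calculation, with made-up actual and forecast values; the function name `mean_absolute_error` is illustrative.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute residuals (actual minus predicted)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return sum(abs(r) for r in residuals) / len(residuals)


actual = [10, 12, 15, 11]
forecast = [11, 12, 13, 12]
mae = mean_absolute_error(actual, forecast)
print(mae)  # 1.0
```

The residuals here are -1, 0, 2, and -1, so over- and under-forecasts of the same size count equally.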

Mean Square The result of dividing the sum of the squares by its associated degrees of
freedom.

Median The middle value when the observations are arranged in order of size
(or the average of the two middle values when their number is even); the
50th percentile. A measure of the central tendency of the data.

Mixture Designs An experiment in which you assume that the response depends only on
the relative proportions of the ingredients (components) in the mixture
and not on the amount of the mixture.

Mode A value with the highest frequency. An observation in a sample which
occurs most frequently. A measure of the central tendency of the data.
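
The three measures of central tendency defined above (mean, median, and mode) can give quite different answers for skewed data. The sketch below uses Python's standard `statistics` module on a made-up sample with one unusually large value.

```python
import statistics

data = [2, 3, 3, 4, 5, 7, 25]  # one large value pulls the mean upward

mean_value = statistics.mean(data)      # sum divided by count
median_value = statistics.median(data)  # middle value of the sorted data
mode_value = statistics.mode(data)      # most frequent value

print(mean_value, median_value, mode_value)  # 7 4 3
```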

Mu The population mean.

Normal Distribution A symmetric, bell-shaped distribution, completely determined by its
mean and standard deviation. Used to calculate probabilities of events
that tend to occur around a mean value and trail off with decreasing
likelihood. AKA Gaussian Distribution.

P Value A component of an ANOVA table that serves as a measure of
significance. The probability, computed assuming the null hypothesis is
true, of observing a value for a test statistic that is at least as
inconsistent with the null hypothesis as the value of the test statistic
actually observed.

Parent Population The real or hypothetical group containing all possible members fitting the
description of the group.

Plackett-Burman Designs An orthogonal, balanced design that has a multiple of four runs. Used
for estimating main effects only. Interaction effects are dispersed widely
throughout the data making them less likely to interfere with the main
effects.

Random Error Differences in process output caused only by the normal amount of
variation inherent to the process.

Randomization Performing the sequence of experiments and/or the assignment of
runs to treatment combinations in a purely chance manner (i.e., with equal
opportunity of selection).

Reflected Design An experiment where the factor levels from another experiment are
reversed.

Regression Analysis A mathematical tool that quantifies the relationship between a dependent
variable and one or more independent variables. The process of
estimating the parameters for a model by optimizing the value for an
objective function, and then testing the resulting predictions for statistical
significance against an appropriate null hypothesis model.

Regression Coefficient A parameter or its estimate for a regression model, often denoted by the
Greek letter, beta. A number that indicates the values for a dependent
variable that are associated with the values for an independent variable or
variables.

Repeat Multiple readings of the same run.

Replication One or more complete duplications of an experimental design. Multiple
readings of the same run are a repeat, not a replicate.

Residual The difference between the observed and the fitted value in a model.

Residual Sum of Squares In analysis of variance, the within-group sum of squares; that is, the part
that the treatment effects cannot explain. AKA Error Sum of Squares.

Response Surface Methodology (RSM) A process of locating an optimal value in
a higher-order model. The methodology utilizes regression, contour plots,
and/or the method of steepest ascent/descent.

Response Surface Plot A three-dimensional plot of a response surface that is useful in locating
optimal regions. It shows the relationship between the estimated
dependent variable and two variables you select. A plot that represents a
three-dimensional grid surface for the function Z = f(X,Y). You view the
plot from outside the plot area at an angle oblique to the X- and Y-axes.

Responses The product or process characteristics that are of interest to the
experimenter. Depending on the particular environment, response
variables may also be referred to as product properties, product
characteristics, output variables, dependent variables, or Y1…YX.
Examples include yield, cost, and tensile strength.

Robust Process A process that has been made relatively insensitive to minor changes in
some factor levels by selection of particular levels of other factors that
affect process variability more than process average.

Rotatable Design A design used in the mapping of response surfaces in which fitted models
estimate the response with equal precision at all points in the
experimental region that are equidistant from the center of the design.

R-Squared A statistic that measures the proportion of the variability in the
dependent variable (y) that is explained by the model.
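
One common way to compute R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that form; the observed and fitted values are made up.

```python
def r_squared(observed, fitted):
    """Proportion of the variability in `observed` explained by `fitted`."""
    mean_y = sum(observed) / len(observed)
    ss_total = sum((y - mean_y) ** 2 for y in observed)                 # about the mean
    ss_residual = sum((y - f) ** 2 for y, f in zip(observed, fitted))   # unexplained
    return 1 - ss_residual / ss_total


observed = [1, 2, 3, 4, 5]
fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2 = r_squared(observed, fitted)
print(round(r2, 4))
```

Here the total sum of squares is 10 and the residual sum of squares is about 0.10, so the model accounts for about 99% of the variability.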

Sampling Distribution The distribution of a statistic, such as the mean, over
all possible samples of a certain size that could be taken from a parent
population. The sampling distribution of the mean is narrower than the
parent population because averaging reduces variation. See Central Limit
Theorem.

Saturated Design A design in which every column for factor settings has been assigned to
an individual factor with no columns left for interactions or error terms.

Screening Design Any fractional-factorial design being used solely or primarily to allow
evaluation of several factor main effects as an initial evaluation in order
to determine what factors are most important.

Sigma The population standard deviation.

Signal to Noise Ratio The relative size of an effect as compared to background scatter in the
response being measured.

Significance Level The probability of wrongly concluding, based on statistics,
that a factor is important when it actually has no effect; the probability
of a Type I error. The lower the probability, the higher the statistical
significance. AKA Alpha Risk.

Standard Deviation A measurement of the spread or dispersion of observations in a sample
that is the positive square root of the sample variance.

Standard Error The standard deviation divided by the square root of the sample size.
The standard deviation of a sampling distribution.
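
A minimal sketch of the standard error of a sample mean, using the sample standard deviation (n - 1 divisor) from Python's standard `statistics` module; the sample values are made up.

```python
import math
import statistics

sample = [4, 8, 6, 5, 7]

sd = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)
se = sd / math.sqrt(len(sample))   # standard error of the sample mean

print(round(sd, 4), round(se, 4))
```

Quadrupling the sample size would halve the standard error, since it shrinks with the square root of the sample size.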

Standard Error of Estimate An estimate of the standard deviation of the
residuals; that is, of the deviations from a line, curve, or surface of
regression. Usually estimated by the square root of the mean squared
error. You can use this value to construct prediction limits for new
observations.

Standard Order The order of the experimental runs of the design matrix for a two-level
factorial design. The first column consists of successive low and high
settings, the second column consists of successive pairs of low and high
settings, the third column consists of four low settings followed by four
high settings, and so on.
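
The pattern described above can be generated directly: column j changes sign every 2^j runs. The sketch below codes low as -1 and high as +1, which is an assumed convention; the function name `standard_order` is hypothetical.

```python
def standard_order(k):
    """Two-level design matrix for k factors in standard order.

    Low is coded -1 and high +1. Column 0 alternates every run,
    column 1 every two runs, column 2 every four runs, and so on.
    """
    runs = []
    for i in range(2 ** k):
        # Bit j of the run index decides the setting of factor j.
        runs.append([1 if (i >> j) & 1 else -1 for j in range(k)])
    return runs


design = standard_order(3)
for run in design:
    print(run)
```

For three factors this prints eight runs, starting with all factors low and ending with all factors high.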

Steepest Descent A method of nonlinear regression analysis that searches for the minimum
least squares criterion measure by iteratively determining the direction in
which the regression coefficients should be changed. This method is
particularly effective when the initial values are not "good"; that is, they
are far from the final values.

Sum of Squares The sum of the squared deviations from the mean of the sample
observations.

T Distribution A distribution useful in forming confidence intervals for the mean when
the variance is unknown, testing to determine if two sample means are
significantly different, or testing to determine the significance of
coefficients in a regression. The distribution is similar in shape to a
Normal distribution.

Taguchi Designs A series of designs related to Plackett-Burman or other classic screening
designs, which use different conventions and slightly different analytical
techniques. Taguchi techniques are a sizeable body of philosophical
principles involving more specific examination of signal-to-noise and
concepts of robust design.

Total Sum of Squares The sum of squares for the deviations of the individual items from the
mean of all the data.

Treatment A particular form or value of the factor studied. AKA Level

T-Test A test that determines whether the means of two independent,
normally distributed samples differ. For large sample sizes the t
distribution is practically equivalent to the standard normal
distribution (z).

Unsaturated Design A fractional-factorial design where some columns remain unassigned to
individual control factors, which allows estimates to be made on
interactions or scatter in the experiment.

Variance A measure of the spread (dispersion) of scores in a distribution of scores.
The larger the variance, the further the individual scores are from the
mean. The smaller the variance, the closer the individual scores are to
the mean.
