PIFACE Manual (English)

This document provides instructions for using software to calculate sample sizes needed to estimate proportions or means within a specified margin of error, and to perform power analyses for a variety of statistical tests. It explains how to determine sample sizes for both finite and infinite populations using normal, beta, or binomial calculations, and discusses considerations for choosing an appropriate effect size and measurement precision.

This dialog provides for sample-size determination for estimating a proportion to within a specified margin of error, for either a finite population of specified size, or an infinite population (or sampling with replacement).

The confidence interval is of the form p +/- ME, where p is the sample proportion and ME is the
margin of error:

ME = z * sqrt{p*(1-p)/(n-1)} (infinite pop)

ME = z * sqrt{p*(1-p)/(n-1)} * sqrt{1 - n/N} (finite pop)

where z is a critical value from the normal distribution, p is the sample proportion, n is the sample size, and N is the population size.
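
For illustration, here is a minimal Python sketch of these formulas (this is not Piface's own code, and the inputs shown are hypothetical):

from math import sqrt
from scipy.stats import norm

def prop_me(p, n, conf=0.95, N=None):
    """Margin of error for a proportion; N=None means an infinite population."""
    z = norm.ppf(1 - (1 - conf) / 2)          # two-sided critical value
    me = z * sqrt(p * (1 - p) / (n - 1))      # infinite-population formula
    if N is not None:
        me *= sqrt(1 - n / N)                 # finite-population correction
    return me

def prop_n(p, target_me, conf=0.95, N=None):
    """Smallest n whose ME does not exceed the target (simple search)."""
    n = 2
    while prop_me(p, n, conf, N) > target_me:
        n += 1
    return n

print(prop_n(0.5, 0.05))          # worst case (pi = .5): 386 with the (n-1) formula
print(prop_n(0.5, 0.05, N=2000))  # a finite population yields a smaller n

The search above simply increments n; for large populations, a closed-form starting value such as n ~ (z/ME)^2 * p * (1-p) would speed it up.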

USING THE GRAPHICAL INTERFACE

The dialog is designed such that a sample size n is computed whenever you change any of the other input values. If you change n, a new ME is computed (using pi in place of p in the above formulas).

Finite population

If "Finite population" is checked, calculations are based on the population size N entered in the
adjacent input field. If the box is unchecked, the "N" field is hidden, and calculations are based
on an infinite population.

Worst case

If "Worst case" is checked, computations are based on the assumption that the true sample
proportion, pi, is .5. If it is not checked, then a pi value other than .5 may be entered in the field
to the right of the checkbox.

Confidence

Choose the desired confidence coefficient for the ME. This determines the value of z.

Margin of Error

Enter the target value of the margin of error here. Note that the actual value achieved after you
collect the data will be more or less than this amount, because it will be based on p (estimated
from the data) rather than pi.

Sample size (n)

Enter the sample size here if you want to see what margin of error is achieved with that size sample. When you change n, it is rounded to the nearest integer; otherwise, n is not rounded.

Note on sliders

Sliders may be varied by clicking or dragging along the center of the scale. If you drag along at
the level of the tick labels, you can change the endpoints of the scale. The small button at the
upper right of each slider converts it to and from a keyboard-input mode. In that mode, you can
also change the scale endpoints and number of digits displayed.

Graphics

A simple graphics facility is available from the Options menu. Enter the variables for the x and y
axes, and the range of values for the x variable. More information is available through the
graphics dialog's help menu.

Cohen's effect sizes

--------------------

Jacob Cohen has proposed effect-size measures for several statistical procedures, and has
defined "small," "medium," and "large" effect sizes accordingly. These effect-size measures are
standardized quantities, e.g. an absolute difference of means divided by the standard deviation.
They are quite popular in sample-size problems because they are so easy to use; you don't have to think very hard to get an answer.

And that's the rub. You don't have to think nearly enough! Planning a study always requires
careful thought: what is the goal, how do we operationalize the research question, what do we
measure and how, what study design is needed, what result would be of scientific importance,
etc. None of those issues are addressed in specifying small, medium, or large on a
standardized scale. If you really care about the scientific merits of your work, then you should
not take this easy route. And I certainly will not help you do it using my software.

Suppose that a study involves measuring the thickness of fibers. There are various instruments
that could be used to do that. It makes sense that if an inaccurate instrument is used, you
should have more observations in the experiment than if you use really accurate
measurements. However, using, say, a "medium" effect size in the planning, you get the SAME
sample size regardless of whether you use a micrometer, caliper, or 6-inch plastic ruler.

That's because Cohen's measures are standardized. Using a micrometer, a medium effect is
perhaps a thousandth of an inch in absolute terms; whereas, using a ruler, a medium effect is
perhaps an eighth of an inch. If a .01-inch difference in mean fiber thicknesses is considered to be important, then the plastic-ruler study is useless, while the micrometer study is over-powered and could be done adequately with fewer data.

To do a responsible job of planning the study, you need to decide (1) what effect size, in
ABSOLUTE units (e.g., inches in the above example), is of importance from a scientific point of
view; and (2) how variable are the measurements (e.g., accuracy of instrumentation). Typically,
these are both hard questions. Question (1) requires a lot of thought and discussion. Question
(2) requires some experience with similar measurements, and/or a pilot study.

It is certainly a lot easier to talk about the ratio of (1) and (2), as Cohen does, rather than the
two quantities separately. But it is not good science.

For more discussion, see my article in a refereed publication of the American Statistical
Association:

Lenth, R.V. (2001), "Some Practical Considerations for Effective Sample Size Determination,"
The American Statistician, 55, 187-193.

This dialog provides for power analysis of a one-sample test of proportions. You have your choice of the normal approximation, the beta approximation (which is more accurate), or two exact calculations based on the binomial distribution. In the latter cases, the size of the test (i.e., the probability of a type I error) is often much smaller than the stated significance level.

MORE DETAILS

Common notations: Suppose that X is a binomial random variable with "success" probability p
and n independent trials, and let P = X/n.

Normal approximation:

We approximate the distribution of P with N(mu = p, sigma^2 = p*(1-p)/n). This has the same mean and variance as the true distribution of P. The critical region is obtained using this approximation with p = p0, then the power is computed using this approximation with p = p1.
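
As a hedged sketch (right-tailed alternative p > p0, illustrative inputs), the normal-approximation power can be computed as:

from math import sqrt
from scipy.stats import norm

def power_normal(p0, p1, n, alpha=0.05):
    # Critical value from the approximating normal with p = p0 ...
    crit = norm.ppf(1 - alpha, loc=p0, scale=sqrt(p0 * (1 - p0) / n))
    # ... then the power from the approximating normal with p = p1.
    return norm.sf(crit, loc=p1, scale=sqrt(p1 * (1 - p1) / n))

print(power_normal(p0=0.5, p1=0.6, n=100))   # roughly 0.64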

Beta approximation:

We approximate the distribution of P with Beta(a = (n-1)*p, b = (n-1)*(1-p)). This has the same mean and variance as the true distribution of P. The critical region is obtained using this approximation with p = p0, then the power is computed using this approximation with p = p1.
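
A parallel sketch for the beta approximation (same illustrative inputs; note that Beta((n-1)p, (n-1)(1-p)) indeed has mean p and variance p(1-p)/n, matching P):

from scipy.stats import beta

def power_beta(p0, p1, n, alpha=0.05):
    crit = beta.ppf(1 - alpha, (n - 1) * p0, (n - 1) * (1 - p0))  # critical value under p0
    return beta.sf(crit, (n - 1) * p1, (n - 1) * (1 - p1))        # power under p1

print(power_beta(p0=0.5, p1=0.6, n=100))   # close to the normal-approximation answer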

Exact:

In the exact test, the significance level alpha is taken as an upper bound on the size of the test
(its power under the null hypothesis). Since X has a discrete distribution, the size cannot be
controlled exactly and is often much lower than the specified alpha.

For the alternative p < p0, let xl denote the largest x for which Pr(X <= x | p = p0) <= alpha.
Then the power is equal to Pr(X <= xl | p = p1) and the size of the test is Pr(X <= xl | p = p0).

For the alternative p > p0, let xu denote the smallest x for which Pr(X >= x | p = p0) <= alpha. Then the power is equal to Pr(X >= xu | p = p1), and the size of the test is Pr(X >= xu | p = p0).

For the alternative p != p0, compute both powers as above but replace alpha by alpha/2; then
sum the powers and sizes.
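
A minimal sketch of the exact calculation for the alternative p > p0 (illustrative inputs; the p < p0 case is the mirror image):

from scipy.stats import binom

def exact_power_upper(p0, p1, n, alpha=0.05):
    # xu = smallest x with Pr(X >= x | p0) <= alpha; binom.sf(x - 1, ...) is Pr(X >= x)
    xu = next(x for x in range(n + 2) if binom.sf(x - 1, n, p0) <= alpha)
    power = binom.sf(xu - 1, n, p1)   # Pr(X >= xu | p = p1)
    size = binom.sf(xu - 1, n, p0)    # Pr(X >= xu | p = p0), at most alpha
    return xu, power, size

print(exact_power_upper(p0=0.5, p1=0.6, n=100))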

Exact (Wald CV):

This is like the exact method, except the critical values are calculated based on an adjusted
Wald statistic. This does NOT guarantee that the size of the test is less than alpha.

Let

pAdj = (x + 2) / (n + 4),

z = (pAdj - p0) / sqrt{pAdj * (1 - pAdj) / (n + 4)}

zCrit = (1 - alpha) quantile of the N(0,1) distribution (for a two-sided test, use the 1 - alpha/2 quantile).

Then xl is the largest x such that z <= -zCrit, and xu is the smallest x such that z >= zCrit. Compute the power as in the exact method using these critical values.

Note: This is the exact method used in the JMP statistical software.
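
A sketch of the right-tailed case of this method, under the same conventions as the exact sketch above (illustrative inputs):

from math import sqrt
from scipy.stats import binom, norm

def wald_cv_power_upper(p0, p1, n, alpha=0.05):
    zcrit = norm.ppf(1 - alpha)
    def z(x):
        p_adj = (x + 2) / (n + 4)
        return (p_adj - p0) / sqrt(p_adj * (1 - p_adj) / (n + 4))
    xu = min(x for x in range(n + 1) if z(x) >= zcrit)  # smallest x with z >= zCrit
    return binom.sf(xu - 1, n, p1)                      # Pr(X >= xu | p = p1)

print(wald_cv_power_upper(p0=0.5, p1=0.6, n=100))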

This dialog provides approximate power computations for a two-sample comparison of proportions. The normal approximation is used.

The test statistic considered is

Z = (ph1 - ph2) / sqrt{pbar(1-pbar)(1/n1 + 1/n2)}

where ph1 and ph2 are estimates of p1 and p2 based on n1 and n2 trials, and

pbar = (n1*ph1 + n2*ph2) / (n1 + n2)

If the continuity correction is used, the numerator is decreased in absolute value by .5 * (1/n1 +
1/n2) (at most).

Under the null hypothesis, Z is taken to be N(0,1). Under the alternative hypothesis, the variance of Z is not equal to 1, because the true variance of ph1 - ph2 involves p1 and p2 separately rather than their pooled average pbar.

Note: For numerical stability, the ranges of p1 and p2 are limited so that min[n1p1, n1(1 - p1)]
>= 5 and min[n2p2, n2(1 - p2)] >= 5.
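
A hedged sketch of this approximation (right-tailed test of p1 > p2, no continuity correction, illustrative inputs):

from math import sqrt
from scipy.stats import norm

def power_two_prop(p1, p2, n1, n2, alpha=0.05):
    pbar = (n1 * p1 + n2 * p2) / (n1 + n2)
    sd0 = sqrt(pbar * (1 - pbar) * (1 / n1 + 1 / n2))    # null SD of ph1 - ph2
    sd1 = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # alternative SD
    zcrit = norm.ppf(1 - alpha)
    return norm.sf((zcrit * sd0 - (p1 - p2)) / sd1)

print(power_two_prop(p1=0.6, p2=0.4, n1=50, n2=50))   # roughly 0.64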

This dialog provides for sample-size determination for estimating a mean to within a specified
margin of error, for either a finite population of specified size, or an infinite population (or
sampling with replacement).

The confidence interval is of the form y-bar +/- ME, where y-bar is the sample mean and ME is
the margin of error:

ME = t * s / sqrt(n) (infinite pop)

ME = t * s * sqrt{(1 - n/N) / n} (finite pop)

where t is a critical value from the t distribution, s is the sample SD, n is the sample size, and N
is the population size.
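
Because t depends on n through its degrees of freedom, solving for n is iterative. A minimal Python sketch (not Piface's own code; illustrative inputs):

from math import sqrt
from scipy.stats import t as t_dist

def mean_me(sigma, n, conf=0.95, N=None):
    t = t_dist.ppf(1 - (1 - conf) / 2, df=n - 1)   # t critical value
    if N is None:
        return t * sigma / sqrt(n)                 # infinite population
    return t * sigma * sqrt((1 - n / N) / n)       # finite population

def mean_n(sigma, target_me, conf=0.95, N=None):
    n = 2                                          # need at least 1 df
    while mean_me(sigma, n, conf, N) > target_me:
        n += 1
    return n

print(mean_n(sigma=10, target_me=2))   # -> 99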

USING THE GRAPHICAL INTERFACE

The dialog is designed such that a sample size n is computed whenever you change any of the
other input values. If you change n, a new ME is computed (using sigma in place of s in the
above formulas).
Finite population

If "Finite population" is checked, calculations are based on the population size N entered in the
adjacent input field. If the box is unchecked, the "N" field is hidden, and calculations are based
on an infinite population.

Confidence

Choose the desired confidence coefficient for the ME. This determines the value of t.

Sigma

Enter your best guess at the population SD (based on historical or pilot data). For a finite
population, sigma^2 is defined as the sum of squared deviations from the population mean,
divided by N-1.

Margin of Error

Enter the target value of the margin of error here. Note that the actual value achieved after you
collect the data will be more or less than this amount, because it will be based on s (estimated
from the data) rather than sigma.

Sample size (n)

Enter the sample size here if you want to see what margin of error is achieved with that size sample. When you change n, it is rounded to the nearest integer; otherwise, n is not rounded.

Note on sliders

Sliders may be varied by clicking or dragging along the center of the scale. If you drag along at
the level of the tick labels, you can change the endpoints of the scale. The small button at the
upper right of each slider converts it to and from a keyboard-input mode. In that mode, you can
also change the scale endpoints and number of digits displayed.

Graphics

A simple graphics facility is available from the Options menu. Enter the variables for the x and y
axes, and the range of values for the x variable. More information is available through the
graphics dialog's help menu.

This dialog provides for power analysis of a one-sample t test or a paired t test. The effect size is the difference between the null and the target mean (in the one-sample test), or the mean difference of the pairs (in the paired t test).

You may choose whether to solve for effect size or sample size when you click on the "Power" slider. The current values are scaled upward or downward to make the power come out right, while preserving (at least approximately) the ratio of the two SDs or two ns.

This dialog provides for power analysis of a two-sample t test or a two-sample t test of equivalence. If the "equal SDs" box is checked, then the pooled t test is used; otherwise, the calculations are based on the Satterthwaite approximation.

You have three choices for sample-size allocation. "Independent" lets you specify n1 and n2
separately; "Equal" forces n1 = n2; and "Optimal" sets the ratio n1/n2 equal to sigma1/sigma2
(which minimizes the SE of the difference of the sample means).

You have a choice between using power or ROC area as the criterion for deciding sample size
(see the section below on ROC area for additional explanation).

You may choose whether to solve for effect size or sample size when you click on the "Power"
(or "ROC area") slider. The current values are scaled upward or downward to make the power
come out right, while preserving (at least approximately) the ratio of the two SDs or two ns.
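
For the equal-SDs (pooled) case, the power computation behind the dialog can be sketched with a noncentral t distribution (illustrative inputs; the Satterthwaite case replaces the df and SE accordingly):

from math import sqrt
from scipy.stats import t as t_dist, nct

def two_t_power(diff, sigma, n1, n2, alpha=0.05):
    df = n1 + n2 - 2
    ncp = diff / (sigma * sqrt(1 / n1 + 1 / n2))   # noncentrality parameter
    tcrit = t_dist.ppf(1 - alpha / 2, df)          # two-tailed critical value
    return nct.sf(tcrit, df, ncp) + nct.cdf(-tcrit, df, ncp)

print(two_t_power(diff=1.0, sigma=1.5, n1=25, n2=25))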

To study a test of equivalence, check the "Equivalence" checkbox. A "Threshold" window will appear for entering the negligible difference of means. An equivalence test is a test of the hypotheses H0: |mu1 - mu2| >= threshold versus H1: |mu1 - mu2| < threshold. The test is done by performing two one-sided t tests:

H01: mu1 - mu2 <= -threshold vs. H11: > -threshold

H02: mu1 - mu2 >= threshold vs. H12: < threshold

Then H0 is rejected only if both H01 and H02 are rejected. If both tests are of size alpha, then the size of the two tests together is at most alpha -- that's because H01 and H02 are disjoint. Another way to look at this test is to construct a 100(1 - 2*alpha) percent CI for (mu1 - mu2), and reject H0 if this interval lies entirely inside the interval [-threshold, +threshold].
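
A sketch of the TOST decision rule just described, for two samples with a pooled SD (illustrative, not Piface's internal code):

import numpy as np
from scipy.stats import t as t_dist

def tost_pooled(x1, x2, threshold, alpha=0.05):
    n1, n2 = len(x1), len(x2)
    df = n1 + n2 - 2
    sp = np.sqrt(((n1 - 1) * np.var(x1, ddof=1) + (n2 - 1) * np.var(x2, ddof=1)) / df)
    se = sp * np.sqrt(1 / n1 + 1 / n2)
    d = np.mean(x1) - np.mean(x2)
    tcrit = t_dist.ppf(1 - alpha, df)
    t1 = (d + threshold) / se          # tests H01: mu1 - mu2 <= -threshold
    t2 = (d - threshold) / se          # tests H02: mu1 - mu2 >= +threshold
    return t1 > tcrit and t2 < -tcrit  # reject both => conclude equivalence

rng = np.random.default_rng(1)
print(tost_pooled(rng.normal(0, 1, 40), rng.normal(0.1, 1, 40), threshold=0.5))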

Options menu

"Use integrated power" is an experimental enhancement. Consider the plot of power versus
alpha; integrated power is the area under that curve. (This curve is also known as the ROC -
receiver operating characteristic - curve.) The integrated power, or area under the ROC curve,
is the average power over all possible values of alpha (therefore, it does not depend on alpha).

Integrated power has the following useful interpretation:

Consider two hypothetical studies of identical size and design. Suppose that in one of them,
there is no difference between the means, and in the other, the difference is the value specified
in the dialog. We compute the t statistic in each study. Then the integrated power is the
probability that the t statistic for the second (non-null) study is "more significant" than the one
from the null study. That is, it is the chance that we will correctly distinguish the null and non-
null studies. The lowest possible integrated power is .5 except in unreasonable situations (such
as using a right-tailed test when the difference is negative).
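
This interpretation lends itself to a direct Monte Carlo check. A sketch for a two-sample pooled t statistic (illustrative settings; the exact integrated power would come from integrating the power curve over alpha):

import numpy as np

rng = np.random.default_rng(0)
n, delta, sigma, reps = 20, 1.0, 1.0, 20000

def t_stats(shift):
    x = rng.normal(0.0, sigma, (reps, n))
    y = rng.normal(shift, sigma, (reps, n))
    sp2 = (x.var(axis=1, ddof=1) + y.var(axis=1, ddof=1)) / 2
    return (y.mean(axis=1) - x.mean(axis=1)) / np.sqrt(sp2 * 2 / n)

t_null, t_alt = t_stats(0.0), t_stats(delta)
print((t_alt > t_null).mean())   # Monte Carlo estimate of the integrated power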

An advantage of using integrated power instead of power is that it doesn't require you to specify
the value of alpha (so the "alpha" widget disappears when you select the integrated power
method). Also, it is somewhat removed from the trappings of hypothesis testing, in that in the
above description, the t statistic is only used as a measure of effect size, not as a decision rule.
This may make it more palatable to some people (please tell me what you think!). A suggested target integrated power is 0.95 -- roughly comparable to a power of .80 at alpha = .05.

The above description of integrated power is for a regular t test, as opposed to an equivalence
test. In an equivalence test, the analogous definition and interpretation applies, but in cases
where the threshold is too small relative to sigma1 and sigma2, the power function is severely
bounded and this can make the integrated power less than .5.

Linear regression dialog

This is a simple interface for studying power and sample-size problems for simple or multiple linear regression models. It is designed to study the power of testing one predictor, x[j], in the presence of other predictors. The power of the t test of a regression coefficient depends on the error SD, the SD of the predictor itself, and the multiple correlation between that predictor and the other predictors in the model. The latter is related to the variance-inflation factor. It is assumed that the intercept is included in the model.
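
A hedged sketch of this power calculation, assuming SE(b[j]) ~ sigma * sqrt(VIF[j]) / (SD(x[j]) * sqrt(n)); the exact scaling (n versus n-1) is a modeling choice, so treat this as an approximation rather than Piface's exact formula:

from math import sqrt
from scipy.stats import t as t_dist, nct

def reg_power(beta_j, sd_x, sigma, vif, n, n_pred, alpha=0.05):
    df = n - n_pred - 1                                  # error df (intercept included)
    ncp = beta_j * sd_x * sqrt(n) / (sigma * sqrt(vif))  # noncentrality
    tcrit = t_dist.ppf(1 - alpha / 2, df)                # two-tailed critical value
    return nct.sf(tcrit, df, ncp) + nct.cdf(-tcrit, df, ncp)

print(reg_power(beta_j=0.5, sd_x=1.0, sigma=2.0, vif=1.0, n=60, n_pred=3))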

The components in the dialog are as follows:

No. of predictors: Enter the total number of predictors (independent variables) in the regression model.

SD of x[j]: Enter the standard deviation of the values of the predictor of interest.

VIF[j]: (This slider appears only when there is more than one predictor.) Enter the variance-
inflation factor for x[j]. In an experiment where you can actually control the x values, you
probably should use an orthogonal design where all of the predictors are mutually uncorrelated
-- in which case all the VIFs are 1. Otherwise, you need some kind of pilot data to understand
how the predictors are correlated, and you can estimate the VIFs from an analysis of those
data.

Alpha: The desired significance level of the test.

Two-tailed: Check or uncheck this box depending on whether you plan to use a two-tailed or a one-tailed test. If it is one-tailed, it is assumed right-tailed; if a left-tailed test is to be studied, reverse the signs and think in terms of a right-tailed test.

Error SD: The SD of the errors from the regression model. You likely need pilot data or some
experience using the same measurement instrument.

Detectable beta[j]: The clinically meaningful value of the regression coefficient that you want to
be able to detect.

Sample size: The total number of observations in the regression dataset. This is forced to be at
least 2 greater than the number of predictors.

Power: The power of the t test, at the current settings of the parameter values.

Solve for: Determines what parameter is solved for when you change the value of the Power
slider.
This dialog is used to specify an ANOVA model for study in a power analysis. Once you fill in the fields, clicking on one of the buttons at the bottom generates a graphical user interface (GUI) based on the model you specify.

The "Differences/Contrasts" buttons generates a GUI designed for studying the powers of
comparisons or contrasts among the levels of fixed factors, or combinations thereof. This is
probably what you want for most sample-size planning.

The "F tests" button creates a GUI for studying the powers of the F tests in the ANOVA table.
This is most useful when you want to study the powers of tests of random effects.

There are several built-in models; you may find what you want among them. These also serve
as examples of how to specify models.

"Model" is the only field that is required; there are defaults for the rest.

Title

This will be displayed on the title bar of the GUI.

Model

The terms in this model define the dialog. Separate the terms using "+" signs. Use "*" for interactions, e.g., "A*B". Use "()" for nesting, e.g., "Subj(Treat)". A "|" generates all main effects and interactions, e.g., "A|B|C" is the same as "A + B + A*B + C + A*C + B*C + A*B*C".
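
The "|" expansion rule is easy to state in code; here is a tiny illustrative Python sketch (not part of Piface):

from itertools import combinations

def expand_bar(factors):
    terms = []
    for r in range(1, len(factors) + 1):   # all orders of interaction
        terms += ["*".join(c) for c in combinations(factors, r)]
    return " + ".join(terms)

print(expand_bar(["A", "B", "C"]))
# A + B + C + A*B + A*C + B*C + A*B*C  (same terms as above, grouped by order)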

Levels

You can set the starting number of levels for any factors in the model. (Since the levels can be manipulated in the GUI, it is not mandatory to specify them here. The default for any factor is 2 levels.) Specify levels in the form "name levels name levels ...", e.g., "A 2 B 3".

Two special provisions exist:

(1) Locking levels: A specification like "A=B 3" sets A and B to always have the same
number of levels, starting at 3.

(2) Fractional designs: If the name of a factor is preceded by "/", then the total number of
observations in the experiment is divided by the number of levels of that factor. For example,
"row= col=/treat 5" specifies a 5x5 Latin square.

Random factors

Any factors listed here are taken to have random levels. Give their names, separated by
spaces. These settings can be altered later in the F-test GUI, but NOT in the
differences/contrasts GUI.

Replicated

If this box is checked, then a "within-cell error" term is added to the model, and an additional
window appears to specify the starting number of replications. If the box is NOT checked, then
the design is taken to be unreplicated, and a "residual" term will be added to the model. If the
model is saturated, nothing can be tested unless one or more of the factors is random.
Finally, if there are replications but the model is not saturated, the GUI assumes a residual term
that pools the within-cell error with the unspecified terms.

This dialog provides approximate power computations for a two-sample comparison of variances, assuming independent random samples from normal populations, and that the F ratio F = (s1/s2)^2 is used as the test statistic, where s1 and s2 are the sample SDs.

The sample sizes are entered as n1 and n2. The "equal ns" checkbox forces them to be equal when checked.

The variances to be compared are entered via the sliders for Variance 1 and Variance 2. Use the drop-down list to specify the alternative hypothesis of interest. (The null hypothesis in all cases is that Var1 = Var2.) Use the alpha slider to set the desired significance level for the test.

The power slider displays the power of the test for the current parameter settings. This slider is
not clickable. To determine, say, the sample size for a given power, vary n1 and/or n2 until the
desired power is achieved.
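
The computation itself is straightforward: under the alternative, (s1^2/s2^2) divided by (Var1/Var2) has an F distribution. A sketch for the right-tailed alternative Var1 > Var2 (illustrative inputs):

from scipy.stats import f

def var_ratio_power(var1, var2, n1, n2, alpha=0.05):
    df1, df2 = n1 - 1, n2 - 1
    fcrit = f.ppf(1 - alpha, df1, df2)            # critical value under Var1 = Var2
    return f.sf(fcrit / (var1 / var2), df1, df2)  # power at the specified variances

print(var_ratio_power(var1=4.0, var2=1.0, n1=20, n2=20))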

This dialog provides rudimentary power analysis for a test of a coefficient of multiple
determination (R-square). The underlying model is that we have a sample of N iid multivariate
random vectors of length p, and that the pth variable is regressed on the first p-1 variables.

R^2 = 1 - SS(error) / SS(total) is the coefficient of multiple determination.

The usual way to test a hypothesis about R^2 is to transform it to an F statistic:

F = (n - k - 1) * R^2 / {k * (1 - R^2)}

This is the usual ANOVA F. The distinction that makes this dialog different from the one for
regular ANOVA is that the predictors are random. The power computed here is unconditional,
rather than conditional.

The GUI components are as follows:

Alpha: The desired significance level of the test.

True rho^2 value: The population value of R^2 at which we want to compute the power.

Sample size: The number N of multivariate observations in the data set.

No. of regressors: The value of k = p - 1.

Power (output only): The power of the test.
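
The exact unconditional power involves the distribution of R^2 under random predictors (see the references below); as a rough cross-check, it can be estimated by simulation. A hedged Monte Carlo sketch (illustrative settings):

import numpy as np
from scipy.stats import f

rng = np.random.default_rng(0)
rho2, n, k, alpha, reps = 0.3, 40, 3, 0.05, 5000
b = np.sqrt(rho2 / k)                 # equal coefficients on unit-variance predictors
sig_e = np.sqrt(1 - rho2)             # error SD chosen so the population R^2 is rho2
fcrit = f.ppf(1 - alpha, k, n - k - 1)

hits = 0
for _ in range(reps):
    X = rng.normal(size=(n, k))
    y = X @ np.full(k, b) + rng.normal(0.0, sig_e, n)
    Q, _ = np.linalg.qr(np.column_stack([np.ones(n), X]))   # fit with intercept
    resid = y - Q @ (Q.T @ y)
    r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    hits += (n - k - 1) * r2 / (k * (1 - r2)) > fcrit       # the F statistic above
print(hits / reps)   # Monte Carlo estimate of the unconditional power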

References:

Gatsonis, C. and Sampson, A. (1989), Multiple correlation: Exact power and sample size
calculations. Psychological Bulletin, 106, 516--524.

Benton, D. and Krishnamoorthy, K. (2003), Computing discrete mixtures of continuous distributions..., Computational Statistics and Data Analysis, 43, 249--267.

Note (9-18-06): This may still have some rough edges; the values obtained by my algorithms
seem to differ slightly from those provided in the Gatsonis and Sampson paper.

This dialog provides rudimentary power analysis of a chi-square test. Using prototype data of
sample size n*, compute the chi-square statistic Chi2*. Enter n* and Chi2* in the windows
provided; these define the effect size to use in the power calculations.

The prototype data should be fake data constructed to reflect the effect size of clinical or scientific importance. Use the analysis you plan to do on these fake data to obtain the chi-square value Chi2* for the dialog. The prototype data should include the expected frequencies (or whatever), but should not include random fluctuations.
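
A sketch of the resulting power calculation, under the usual assumption (not stated explicitly above) that the noncentrality parameter scales as Chi2* * (n / n*); df is the degrees of freedom of your planned analysis, and the inputs are illustrative:

from scipy.stats import chi2, ncx2

def chisq_power(chi2_star, n_star, n, df, alpha=0.05):
    ncp = chi2_star * n / n_star       # prototype effect size scaled to the new n
    crit = chi2.ppf(1 - alpha, df)     # central chi-square critical value
    return ncx2.sf(crit, df, ncp)      # power from the noncentral chi-square

print(chisq_power(chi2_star=8.0, n_star=100, n=200, df=3))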

This dialog provides rudimentary power analysis for an exact test of a Poisson parameter. Specifically, our data are assumed to be x_1, x_2, ..., x_n iid Poisson(lambda), so that x = sum{x_i}, which is Poisson with mean n*lambda, serves as the test statistic. The critical value(s) for x are obtained using quantiles of the null Poisson distribution so as to cut off tail(s) of probability less than or equal to alpha. The power of the test is then the probability of the critical region, computed from a Poisson distribution with the specified value of lambda.
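
A sketch for the right-tailed alternative lambda > lambda0 (illustrative inputs; the left tail is handled symmetrically, and the two-tailed case splits alpha):

from scipy.stats import poisson

def poisson_power_upper(lambda0, lam, n, alpha=0.05):
    mu0 = n * lambda0
    xu = int(poisson.ppf(1 - alpha, mu0)) + 1    # candidate critical value
    while poisson.sf(xu - 1, mu0) > alpha:       # ensure tail probability <= alpha
        xu += 1
    size = poisson.sf(xu - 1, mu0)               # actual size, at most alpha
    power = poisson.sf(xu - 1, n * lam)          # power at the specified lambda
    return xu, size, power

print(poisson_power_upper(lambda0=1.0, lam=1.5, n=20))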

The GUI components are as follows:

lambda0: The value of the Poisson parameter under the null hypothesis.

alternative: Choose a two-tailed alternative (lambda != lambda0) or one of the one-tailed alternatives.

Boundaries of acceptance region (output values): These are the lower and/or upper values of x for which the null hypothesis would NOT be rejected.

size (output value): The actual probability of the critical region under the null hypothesis.

lambda: The Poisson parameter at which the power is to be calculated.

n: The sample size.

power: The probability of the critical region, given lambda and n.
