
ASSIGNMENT

ON

“RMD”
ON
DETERMINATION OF
SAMPLE SIZE

SUBMITTED BY: SUSHANT SHARMA
SEC: SD2
COURSE: ISBE
Determination of Sample Size
Determining the sample size for a study is a crucial component of study design. The
goal is to include sufficient numbers of subjects so that statistically significant results
can be detected. Using too few subjects results in wasted time, effort, research
dollars, and animal lives, and yields statistically inconclusive results. Statistically
inconclusive findings make it difficult to determine whether a particular treatment or
intervention was effective and to identify directions for future studies. Studies with
insufficient subjects also may result in potentially important research advances that
go undetected. In statistical language, these studies are referred to as “under-
powered.” That is, the probability that they will detect an existing treatment effect is
lower than optimal.
Using too many subjects may result in statistically significant conclusions and clear
future study directions. However, if the same answer could have been obtained with
fewer subjects, then time, effort, research dollars, and animal lives also have been
wasted. In statistical language, these studies are referred to as “over-powered.” That
is, the probability that they will detect a treatment effect is higher than optimal.
Using the appropriate number of subjects optimizes the probability that a study will
yield interpretable results and minimizes research waste. From a statistical
perspective, studies with the optimal number of subjects have sufficient -- neither
too much nor too little -- statistical “power” to detect findings.
Under federal regulations, one of the responsibilities of the Institutional Animal Care
and Use Committee (IACUC) is to ensure that study sample sizes have been rigorously
determined. One of the roles of the Data Management Services (DMS) Statistical
Consulting group is to assist investigators with these determinations.

The Role of Variability


In a perfect research environment in which measurement devices were errorless,
subjects were identical and exhibited identical responses to a treatment, and
treatments were implemented flawlessly and consistently, there would be no
variability in responses. All subjects treated the same would manifest exactly the
same response. In such a world – without variability -- there would be no need for
statistical analysis because whether a treatment altered responses or not would be a
certainty. This ideal world, however, does not exist. In the absence of absolute
consistency – that is, in the presence of variability – uncertainty exists about whether
or not a treatment altered responses. Statistical analyses address this uncertainty.
(As an aside, because statistical analyses require the presence of at least some
variability, it is not possible to use statistical approaches when only one subject is
used per treatment group.)
The measurement of almost any attribute reveals variability. For example, it is not
surprising that the body weights of animals of the same age and sex are not exactly
the same. Nor is it surprising that the viral titers or immune parameters of animals
that were infected with the same agent at the same time with the same dose also
exhibit some variability. In the simplest experimental design, containing a control
group and a treatment group, no investigator is surprised to find that the control
group values exhibit some variability and that the treatment group values also
exhibit some variability (see Figure 1). Keeping this variability in mind, the
investigator is more interested, however, in whether the treatment group values are
generally higher than the control group values.
The problem for the investigator is: given the variability that exists among subjects
treated the same (within the control group and within the experimental group), is
the difference between the two groups consistent enough to be certain that the
treatment had an effect?
Holding variability within groups constant, the larger the difference between group
means, the more certainty the investigator has that a treatment worked. Figure 2
depicts this situation. On the left side of the figure the difference between the group
means is five units. The variability within each group is indicated visually by error
bars that represent the standard error of the mean (sem) – a way of quantifying
variability among subjects treated the same. On the right side of the figure, the
difference between the group means is 10 units. Note that the within-group
variability – the error bars – is exactly the same. Intuitively, the data on the right
side of the figure reveal more certainty that the treatment worked than do the data
on the left side of the figure.
What if the difference between group means is the same, but the within-group
variability differs?
Figure 3 illustrates this situation. On both sides of the figure, the mean difference
between treatment groups is five units. On the left side of the figure, however, the
error bars are much larger than on the right side of the figure. In this hypothetical
situation, there is much more within-group variability on the left side than on the
right side. Intuitively, the data on the right side of the figure indicate more certainty
that the treatment worked – because the subjects responded with more consistency
-- than do the data on the left side of the figure.

The statistician comes to the same conclusion but expresses it in somewhat different
terms. The ultimate purpose of most studies is to use a sample (a subgroup) to make
inferences about a population (the larger group of interest). When data exhibit large
amounts of within-group variability relative to treatment variability, then any
generalizations made to the population must be made with uncertainty. In other
words, the reliability with which the sample can be used to make inferences about
the population is less than when within-group variability is relatively small.
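To make the idea of within-group variability concrete, the minimal Python sketch below (using invented numbers, not the values behind Figures 1-3) shows how a group mean and its standard error of the mean (sem) are computed:

import numpy as np

# Hypothetical responses for two groups (invented values, for illustration only)
control = np.array([18.2, 21.5, 19.8, 22.1, 20.4, 19.1])
treated = np.array([24.7, 26.3, 23.9, 27.2, 25.5, 24.1])

for name, values in [("control", control), ("treated", treated)]:
    mean = values.mean()
    # sem = sample standard deviation / sqrt(n):
    # quantifies variability among subjects treated the same
    sem = values.std(ddof=1) / np.sqrt(len(values))
    print(f"{name}: mean = {mean:.2f}, sem = {sem:.2f}")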
Statistical Analysis and Variability
Many statistical analyses grapple with this problem – given that we know that
subjects will vary in their responses to the same treatment, are the observed
differences between treatment groups consistent enough to state with relative
certainty that the treatment worked?
The statistician conceptualizes the problem in terms of variability. Within a particular
study, there are two major influences on the variability of measured responses:
1) the treatment, and 2) error. The treatment contributes to the variability of
measured responses, if it was effective, by systematically increasing or
decreasing them. Error contributes to the variability of measured responses in
several ways. It is important to note that the term "error" does not indicate that
mistakes were made in the study. "Error" is the term used to refer to all of the
influences, other than those that result from the treatment, that could alter
measured responses. Error
includes, therefore, the inconsistency inherent in measurements obtained
with a measurement device or technique that is not perfect, procedural
differences in how the same treatment was administered to subjects, and
inherent differences among subjects that are not related to the treatment.
Error is considered a non-systematic influence on responses because it can
increase or decrease them. The total variability in responses in a particular
study can be divided into these two components: 1) variability that is
associated with, or is the result of, the treatment, and 2) variability that is
not the result of the treatment, referred to as error variability.

Many statistical analyses address the same question: is the variability associated with
the treatment large enough relative to the variability associated with error to be
relatively certain that the treatment worked?
Notice that this is the same question that was stated above using different
terminology. The intuitive grasp that the situation in the right side of Figure 3 reflects
more certainty about the treatment effectiveness than the situation on the left side
of the figure illustrates this point.
Also note that the absolute size of treatment variability and error variability is not
important – only their relative relationship. A useful analogy is a signal-to-noise ratio.
The treatment variability is the signal; the error variability is the noise. Noisy data –
data that exhibit a great deal of within-group variability – require that the signal –
the treatment variability -- be strong in order to be detected.
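A two-sample t-test is one simple way to see this signal-to-noise idea in action: the t statistic is the difference between group means (the signal) divided by a measure of within-group variability (the noise). The sketch below uses simulated data, not values from the figures, to show that the same five-unit mean difference is far more convincing when the within-group noise is small:

import numpy as np
from scipy import stats

# Simulated data: same 5-unit mean difference, different within-group variability
rng = np.random.default_rng(0)
low_noise_control = rng.normal(loc=20, scale=1.0, size=10)
low_noise_treated = rng.normal(loc=25, scale=1.0, size=10)
high_noise_control = rng.normal(loc=20, scale=5.0, size=10)
high_noise_treated = rng.normal(loc=25, scale=5.0, size=10)

# t = (difference in means) / (pooled within-group variability);
# a larger signal-to-noise ratio yields a smaller p value
t_low, p_low = stats.ttest_ind(low_noise_treated, low_noise_control)
t_high, p_high = stats.ttest_ind(high_noise_treated, high_noise_control)
print(f"low noise:  t = {t_low:.2f}, p = {p_low:.4f}")
print(f"high noise: t = {t_high:.2f}, p = {p_high:.4f}")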

Parameters for Sample Size Determination


Sample size determinations depend on four parameters. These parameters are: 1)
the desired level of statistical power, 2) the p level, 3) treatment variability, and 4)
error variability.
Statistical power refers to the probability that a treatment effect will be detected if it
is there. By convention, power is generally set at about 0.80, or an 80% probability
that a treatment effect will be detected if present. When a study is under-powered,
it has less than an 80% chance of detecting an existing treatment effect. When it is
over-powered, it has a greater than 80% chance of detecting a treatment effect.
P level refers to the probability of detecting a statistically significant difference that is
the result of chance – not the result of the treatment. In other words, the p level
determines the probability of obtaining an erroneously significant result. In statistical
language, this error is called Type I error. By convention, p levels generally are set at
0.05, or a 5% probability that a significant difference will occur by chance.
Two of the four parameters – power and p level – are pre-determined. The other two parameters – treatment
variability and error variability – must be estimated in order to complete the sample
size determination. Treatment and error variability can be estimated in three ways.
1) Pilot Studies: The most accurate determination of sample size is obtained when
the investigator has collected relevant data from which an estimate of
treatment variability and an estimate of error variability can be made. These
data generally are obtained in a pilot or small-scale preliminary study. Note
that the results of a pilot study do not have to be statistically significant in
order for the data to be used to estimate treatment and error variability. This
procedure is the best way to determine sample size.
2) Relevant Literature: Another means of making treatment and error variability
estimates is to use the relevant scientific literature. Estimates could be made
from the published work of investigators who have conducted similar studies
or who have addressed related questions. This is the second-best way to
determine sample size.
3) Rule-of-Thumb Estimates: The third means of making variability estimates is to
use rough approximations or rules-of-thumb that are accepted in a particular
field in the absence of data or published work. This procedure is, by far, the
least accurate means of determining sample size, but sometimes must be
used in the absence of data and relevant literature.

In general, if the variability associated with the treatment is large relative to the
error variability, then relatively few subjects will be required to obtain statistically
significant results. Conversely, if the variability associated with the treatment is small
relative to the error variability, then relatively more subjects will be required to
obtain statistically significant results.
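As a rough end-to-end sketch of how these parameters combine for a simple two-group comparison, the Python example below estimates a standardized effect size from hypothetical pilot values (invented purely for illustration) and then uses the statsmodels power routines to solve for the number of subjects per group at the conventional power of 0.80 and p level of 0.05:

import numpy as np
from statsmodels.stats.power import TTestIndPower

# Hypothetical pilot estimates (invented for illustration)
mean_control, mean_treated = 20.0, 25.0  # group means from the pilot
sd_pooled = 6.0                          # pooled within-group (error) standard deviation

# Standardized effect size (Cohen's d): treatment "signal" relative to error "noise"
effect_size = (mean_treated - mean_control) / sd_pooled

# Solve for the per-group sample size at power = 0.80 and p level = 0.05
n_per_group = TTestIndPower().solve_power(effect_size=effect_size, alpha=0.05, power=0.80)
print(f"effect size d = {effect_size:.2f}, n per group = {int(np.ceil(n_per_group))}")

Note how a larger error (within-group) standard deviation shrinks the effect size and drives the required sample size up, exactly as described above; the computed n is rounded up to the next whole subject.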

Stats: Sample Size Determination

The sample size determination formulas come from the formulas for the maximum
error of the estimate, each solved for n. Be sure to round the answer
obtained up to the next whole number, not off to the nearest whole number. If you
round off, then you will exceed your maximum error of the estimate in some cases.
By rounding up, you will have a smaller maximum error of the estimate than allowed,
but this is better than having a larger one than desired.
Population Mean

Here is the formula for the sample size which is obtained by solving the maximum
error of the estimate formula for the population mean for n.
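The maximum error of the estimate for a population mean is E = z_(α/2) · σ / √n, where z_(α/2) is the critical value for the chosen confidence level, σ is the population standard deviation (in practice estimated from a pilot study or the literature), and E is the largest acceptable error. Solving for n gives

n = ( z_(α/2) · σ / E )^2

For example (illustrative values only): with σ estimated at 15, E = 2, and z_(α/2) = 1.96 for 95% confidence, n = (1.96 × 15 / 2)^2 = 216.09, which rounds up to 217 subjects.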

Population Proportion

Here is the formula for the sample size which is obtained


by solving the maximum error of the estimate formula for the population proportion
for n. Some texts use p hat and q hat, but since the sample hasn't been taken, there is
no value for the sample proportion. p and q are taken from a previous study, if one is
available. If there is no previous study or estimate available, then use 0.5 for p and q,
as these are the values which will give the largest sample size, and it is better to have
too large of a sample size and come under the maximum error of the estimate than
to have too small of a sample size and exceed the maximum error of the estimate.
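The maximum error of the estimate for a population proportion is E = z_(α/2) · √(p·q / n), where q = 1 − p. Solving for n gives

n = p · q · ( z_(α/2) / E )^2

For example (illustrative values only): with no prior estimate, use p = q = 0.5; at 95% confidence (z_(α/2) = 1.96) and E = 0.05, n = 0.25 × (1.96 / 0.05)^2 = 384.16, which rounds up to 385 subjects.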
