Data Collection and Analysis in Obstetrics and Gynecology

This document discusses data collection and analysis in obstetrics and gynecology. It defines key terms like data, categorical data, numerical data, variables, and different types of data collection systems. Regular or routine data collection provides ongoing information from sources like medical records. Ad hoc systems use surveys to gather additional information as needed. Proper questionnaire design is important for data collection, with guidelines like using simple language and avoiding leading questions. The planning process for surveys includes determining objectives, required information, the study population, and sampling methods. Analysis of collected data can then be both descriptive and analytical.

Data Collection and Analysis in Obstetrics and Gynecology

S. M. Ogbonmwan
Department of Mathematics
University of Benin
Benin City, Nigeria

10/14/08 1
• What is Data?
Data are measurable characteristics of a sampling unit (or subject) of a population that yield information about the population.
• Types of Data:
Broadly, data are either categorical or numerical.
• Categorical Data:
The simplest observation made on a subject who comes to the clinic is the allocation (classification) of the subject to one of only two categories that relate to the presence or absence of some attribute.
• Examples:
  - Pregnant/Not pregnant
  - Married/Single
  - Hypertensive/Normotensive
  - Diabetic/Non-diabetic
A categorical variable may also have more than two categories:
  - Marital status: Married/Single/Divorced/Separated
  - Blood group: A/B/AB/O
  - Degree of pain: Minimal/Moderate/Severe/Unbearable
• Numerical Data:
There are two main types: discrete and continuous.
• Discrete Data:
Discrete data arise when observations take certain numerical values through counting.
• Examples: number of children, number of visits to the antenatal clinic (ANC) in a year, number of ectopic heartbeats in 24 hours, number of threatened abortions in the last two years, etc.
• Continuous Data:
Continuous data are usually obtained by some form of measurement.
• Examples: height, weight, age, body temperature, blood pressure, serum cholesterol, etc.
• Other Types of Data:
• Censored Data:
In many sets of life (time-to-event) data, not all of the subjects in the sample will have failed. That is, the event of interest may not be observed for some subjects, or their exact times-to-failure may not be known. Such data are commonly called censored data, and they are of three types: right censored (or suspended), interval censored and left censored.
• Right Censored (Suspended) Data: these are the cases (of life data) made up of subjects that did not fail.
  - Example: if 5 of 8 breast cancer cases had failed by the end of the experiment, the remaining 3 would be regarded as suspended (right-censored) data.
• Interval Censored Data: these arise when there is uncertainty about the exact times at which units failed within an interval.
  - Example: suppose units are inspected every 6 hours, say at 6:00 am, 12:00 noon, 6:00 pm and so on. If 8 units were surviving at 6:00 am and only 7 at the 12:00 noon inspection, then one can only say that one unit failed between 6:00 am and 12:00 noon. The exact time of that failure is not known.
• Left Censored Data: in this case the failure time is only known to be before a certain time.
  - Example: suppose a unit scheduled for inspection after 12 hours is found to have already failed at that inspection. All that is known is that it failed some time before 12 hours (i.e. between 0 and 12 hours), but not exactly when.
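The three kinds of censoring can all be written as an interval that is known to contain the failure time. The following is a minimal sketch of that idea, not a standard library structure: the class name `Observation`, its fields and the sample values are all hypothetical, and times are taken as hours on a common clock.

```python
from dataclasses import dataclass
from math import inf


@dataclass(frozen=True)
class Observation:
    """Failure time known only to lie between `lower` and `upper` (hours)."""
    lower: float  # last time the subject was known NOT to have failed
    upper: float  # first time the subject was known to have failed (inf if never)

    @property
    def kind(self) -> str:
        if self.lower == self.upper:
            return "exact"
        if self.upper == inf:
            return "right-censored"     # still surviving when follow-up ended
        if self.lower == 0:
            return "left-censored"      # failed some time before first inspection
        return "interval-censored"      # failed between two inspections


# Illustrative values only (assumed, not taken from any real study).
observations = [
    Observation(6.0, 12.0),   # failed between the 6:00 and 12:00 inspections
    Observation(0.0, 12.0),   # already failed at the first inspection at 12 hours
    Observation(24.0, inf),   # had not failed when the experiment stopped
    Observation(9.5, 9.5),    # failure time observed exactly
]
kinds = [obs.kind for obs in observations]
print(kinds)
```

Storing a lower and an upper bound for every subject keeps exact, left-, right- and interval-censored cases in one table, which is how many survival-analysis routines expect them.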

• Variable:
A variable is any attribute, phenomenon or event that can take different values. A variable can be either quantitative or qualitative.
• A quantitative variable describes a characteristic in terms of a numerical value. The value may vary from subject to subject, or from time to time in the same subject, and is expressed in units of measurement.
  - Examples: height in meters, blood pressure in mm Hg, weight in kilograms, etc.
• A qualitative variable describes an attribute of a characteristic by classifying it into categories to which the subject either belongs or does not belong.
  - Examples: state of origin, tribe or ethnic group, etc.

• Types of Variables: there are two types, continuous and discrete.
• Continuous Variable:
A variable with a potentially infinite number of possible values in any interval. It can assume either integral or fractional values and can be measured to different levels of accuracy. A continuous variable is realized through actual measurement.
  - Example: the weights of babies delivered in a health facility could be 3.14, 2.98, 2.94 and 3.10 kg.
• Discrete Variable:
A variable that can take only a limited (countable) number of values in any interval. The values are invariably whole numbers (integers). A discrete variable is usually realized through counting.
  - Examples: number of children in a family, number of clinics in a community, number of children delivered within a given period in a teaching hospital, etc.
• Collection of Data (in O & G)
• Sources of Data:
There are two main sources of data in healthcare delivery, including O & G: regular (or routine) systems and ad hoc systems.
• Regular or Routine Data Collection Systems:
A regular or routine data collection system usually consists of established procedures for collecting data (in the clinics) as they become available. This could be at national, sub-national or institutional level. Such a system provides a rough indication of the frequency of occurrence of diseases and of their descriptive epidemiology, which serve as leads concerning disease etiology. The sources of data in this system include hospital (medical) records, autopsy reports, physician records, etc.

• Example (part of a patient's form):
  - Patient's Name: -----------------------------------------
  - Patient's Number:
  - Date of Registration:
  - Date of Birth:
  - Sex (1 = male, 2 = female):
  - Marital status:
  - Religion:
  - Ethnic group/Tribe:
  - Height (m):
  - Weight (kg):
  - Blood pressure (mm Hg): Systolic ____ Diastolic ____
  - Number of Pregnancies:
  - Number of Deliveries:
  - Number of Children Alive:
  - Number of Children Dead:
  - Number of Abortions:
– The advantage of this system of data collection is that it guarantees the availability of data in every specific area of healthcare delivery.
• Ad Hoc Data Collection Systems:
Ad hoc data collection is usually in the form of a (research) survey to gather information that may not be available on a regular basis. It may at times include special investigative studies, or it could simply be the collection of additional information as part of the routine data collection. This system gives a large coverage of the population.
• Examples:
  - An investigation of the effects of female genital mutilation (FGM) on complications during delivery.
  - An investigation of breastfeeding practices among women who registered a birth in the previous year.
  - A study to investigate whether the use of hormonal contraceptives affects the fertility status of the users.
– Ad hoc data collection systems can be extensive, intensive and expensive. However, an advantage of the ad hoc system is that, when well conducted, it provides accurate and reliable data in response to the specific needs of the users. An important tool for ad hoc data collection is a well-designed questionnaire.
• Good Questionnaire Design
• Guidelines for Designing a Questionnaire:
  - Use simple language.
  - Avoid long, complicated questions (avoid double negatives).
  - Be unambiguous: be clear and simple.
  - Do not ask general questions if you want specific answers. Ask only valid questions.
  - Do not ask leading questions.
  - Avoid hypothetical questions about situations outside the respondents' direct experience.
  - Be careful with embarrassing questions. Do not make it too difficult for the respondents.
  - Use the minimum number of questions.
  - Pre-coded questions make the replies easy to analyse by computer, but they may force people to give wrong answers.
  - People tend to choose the first response offered.
  - Ask easy questions first and difficult questions last.
  - Pre-test your questionnaires.

– Steps in the Planning of a Survey
  - Step 1: Preparation of a detailed written statement of the objectives of the survey.
  - Step 2: Determination of the items of information required and the methods of collection.
  - Step 3: Definition of the reference population on which information is to be sought.
  - Step 4: Decision on whether the reference population is to be studied as a whole or in part (sample).
  - Step 5: Determination of the number of units in the population to be selected for study during the survey (sample size).
  - Step 6: Decision on how respondents will be selected from the population (sampling method).
  - Step 7: Design, testing and validation of the questionnaires on which observations will be recorded.
  - Step 8: Selection and training of enumerators (interviewers).
  - Step 9: Collection of data.
  - Step 10: Preparation for data analysis.

• Analysis of Data:
The general methodology for the analysis of data (in O & G) is of two types: descriptive and inferential.
• Descriptive Statistics Approach to Data Analysis:
• Descriptive Statistics:
Descriptive statistics are the statistical tools for the organization and summarization of data. They describe a set of data, which eventually provides a basis for a generalization about a population when only a sample is observed. In data analysis, descriptive statistics are presented in tables that provide summary statistics for continuous, numeric variables. The summary statistics include:
  - measures of central tendency, such as the mean, median and mode;
  - measures of dispersion (spread of the distribution), such as the range, the variance and the standard deviation;
  - measures of the shape of the distribution, such as skewness and kurtosis, which indicate how much a distribution varies from a normal distribution.
• In summary, descriptive statistics describe a set of data in a way that provides a basis for a generalization about a population when only a sample is observed. They point up a characteristic of the population being studied and summarize a mass of data into a few simple ideas.

• Organization and Presentation of Data
Useful information is usually not immediately evident from a mass of raw data. Collected data need to be organized in such a way that the patterns of variation in the distribution are clearly revealed. Organizing the data aids the understanding of their structure and characteristics. Data are usually presented in either tabular or diagrammatic form.
• Tabular Presentation
This is the presentation of data in tables so as to organize them into a compact and readily comprehensible form. For example, a frequency distribution table gives the number of observations at different values or classes of the variable. Tabular presentation could be handled as:
• (a) Single-variable frequencies:
  - For a qualitative variable (such as the distribution of the state of origin of 100 women who visited the ANC in the last year).
  - For a large data set of a quantitative variable, which requires grouping of the data into classes (such as the distribution of the weight of newborn babies in a teaching hospital).
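As an illustration of single-variable frequency tables, the sketch below tabulates one qualitative and one quantitative variable with Python's `collections.Counter`. All values are made up for the example, and the 500 g class width starting at 2000 g is an arbitrary assumption.

```python
from collections import Counter

# Hypothetical data, for illustration only.
states = ["Edo", "Delta", "Edo", "Ondo", "Edo", "Delta"]
weights_g = [2400, 2900, 3100, 3300, 2700, 3800,
             3000, 3500, 2600, 3200, 4100, 3400]   # birth weights in grams

# Qualitative variable: count each category directly.
state_freq = Counter(states)


def class_label(w, start=2000, width=500):
    """Return the non-overlapping class 'lower-upper' (grams) that contains w."""
    lower = start + ((w - start) // width) * width
    return f"{lower}-{lower + width - 1}"


# Quantitative variable: group into classes first, then count.
weight_freq = Counter(class_label(w) for w in weights_g)

for label in sorted(weight_freq):
    f = weight_freq[label]
    print(f"{label} g  frequency={f}  relative={f / len(weights_g):.2f}")
```

Working in whole grams keeps the class boundaries exact; with weights in kilograms, values sitting on a boundary can be misclassified by floating-point rounding.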

• (b) Cross-tabulation:
  - Two-dimensional tables, in which two variables are cross-tabulated (such as the cross-classification of the weight of babies at birth and the economic status of their parents).
  - Three-dimensional tables, in which three variables are cross-classified (such as outcome of treatment by sex and by age group).
– Diagrammatic Presentation
Diagrammatic presentation is the use of a diagram to show the distribution of data. The methods of diagrammatic presentation are:
• (a) Qualitative or categorical data
• Pie Charts
A circle is divided into sectors with areas proportional to the frequencies or the relative frequencies of the categories of the variable.
• Bar Charts
The bars are constructed to show the frequency or relative frequency for each category of the attribute. The bars are usually equal in width. It is important that the vertical scale starts at zero; otherwise the heights of the bars will not be proportional to the frequencies.

• (b) Quantitative data
• Frequency Histograms
The chosen class intervals should not overlap and should cover the full range of the data. The area of each bar (not just its height) should be proportional to the frequency. Unequal class intervals are taken into account by the areas of the bars.
• Frequency Polygons (Line Charts)
A frequency polygon is constructed by joining the midpoints of the tops of the bars of a histogram. This chart makes it easy to compare visually two or more distributions drawn on the same chart.
• Cumulative Frequency Polygons and Cumulative Frequency Charts (Ogives)
This is the chart in which the cumulative frequencies are plotted against the upper tabulated limit of each class. In principle, the ogive can be used to estimate, by interpolation, the frequency of occurrence of values of the variable less than or equal to a specified value.

• Measures of Location:
One of the first statistics usually computed for a set of data is a measure of central tendency such as the mean, the median or the mode.
• The Mean:
The mean is the most frequently used measure in data analysis. It may be considered as the center of gravity of the distribution.

For raw data:
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$$

For grouped data, where $x_i$ is the midpoint and $f_i$ the frequency of the $i$-th of $k$ classes:
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$
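The two forms of the mean can be checked numerically. The sketch below is only illustrative: the raw values are the four baby weights quoted earlier in the slides (reading the first as 3.14 kg), while the grouped table (class midpoints and frequencies) is hypothetical.

```python
def mean_raw(xs):
    """Mean of raw data: sum of the observations divided by their number."""
    return sum(xs) / len(xs)


def mean_grouped(midpoints, freqs):
    """Mean of grouped data: frequency-weighted average of class midpoints."""
    return sum(f * x for f, x in zip(freqs, midpoints)) / sum(freqs)


raw_weights = [3.14, 2.98, 2.94, 3.10]        # kg
midpoints = [2.25, 2.75, 3.25, 3.75, 4.25]    # kg, hypothetical classes of width 0.5
freqs = [1, 3, 5, 2, 1]                       # hypothetical class frequencies

raw_mean = mean_raw(raw_weights)
grouped_mean = mean_grouped(midpoints, freqs)
print(round(raw_mean, 2), round(grouped_mean, 4))
```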
• The Median:
The median is the point in the distribution with 50% of the measures or scores on each side of it; that is, it is the midpoint of the distribution. For an even number of observations, the median occupies the point between the $\frac{n}{2}$-th and $\frac{n+2}{2}$-th positions when the values of the observations are arranged in order of magnitude. When the number of observations is odd, the median occupies the $\frac{n+1}{2}$-th position in the ordered arrangement. For grouped data, the median is estimated by using the expression:

$$\text{Median} = L_1 + \left(\frac{\frac{n}{2} - C_f}{f_i}\right) C_i$$

where
$L_1$ = lower class boundary of the median class
$n$ = number of observations
$C_f$ = cumulative frequency of the class just before the median class
$C_i$ = median class interval (width)
$f_i$ = frequency of the median class
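The grouped-median expression can be sketched in code as follows. The function name and the frequency table are hypothetical, and all classes are assumed to have the same width.

```python
def grouped_median(lower_bounds, width, freqs):
    """Estimate the median as L1 + ((n/2 - Cf) / fi) * Ci for equal-width classes."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("frequency table is empty")
    cum = 0  # cumulative frequency of the classes before the current one (Cf)
    for L1, f in zip(lower_bounds, freqs):
        if cum + f >= n / 2:          # first class whose cumulative count reaches n/2
            return L1 + (n / 2 - cum) / f * width
        cum += f
    raise ValueError("median class not found")


# Hypothetical birth-weight table: classes 2.0-2.5, 2.5-3.0, ... kg.
lower_bounds = [2.0, 2.5, 3.0, 3.5, 4.0]
freqs = [1, 3, 5, 2, 1]

median = grouped_median(lower_bounds, 0.5, freqs)
print(round(median, 2))   # n/2 = 6, Cf = 4, fi = 5, so 3.0 + (2/5)(0.5)
```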
• The Mode:
The mode is simply the value that occurs most frequently in the distribution. For grouped frequency data, the mode is estimated by using the expression:

$$\text{Mode} = L_1 + \frac{(f - f_b)}{(f - f_b) + (f - f_a)} \times C$$

where
$L_1$ = lower class boundary of the modal class
$f$ = modal frequency
$f_a$ = frequency of the class after the modal class
$f_b$ = frequency of the class before the modal class
$C$ = modal class interval (width)
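A small sketch of the grouped-mode estimate follows. It uses the usual textbook form, in which the numerator is the difference between the modal frequency and the frequency of the class before it; the function name and the table are hypothetical, and equal class widths are assumed.

```python
def grouped_mode(lower_bounds, width, freqs):
    """Estimate the mode as L1 + (f - fb) / ((f - fb) + (f - fa)) * C."""
    i = max(range(len(freqs)), key=freqs.__getitem__)    # first modal class
    f = freqs[i]
    f_b = freqs[i - 1] if i > 0 else 0                   # class before (0 if none)
    f_a = freqs[i + 1] if i + 1 < len(freqs) else 0      # class after (0 if none)
    # Note: a completely flat table (f == f_b == f_a) has no mode and divides by zero.
    return lower_bounds[i] + (f - f_b) / ((f - f_b) + (f - f_a)) * width


# Hypothetical birth-weight table: classes 2.0-2.5, 2.5-3.0, ... kg.
lower_bounds = [2.0, 2.5, 3.0, 3.5, 4.0]
freqs = [1, 3, 5, 2, 1]

mode = grouped_mode(lower_bounds, 0.5, freqs)
print(round(mode, 2))   # 3.0 + (5-3) / ((5-3) + (5-2)) * 0.5
```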

• Measures of Variability (Measures of Spread)
• The Range:
The simplest way to describe the spread of a set of data is to quote the lowest and highest values. The difference between the highest and lowest values gives the range of the distribution. It is, however, not a satisfactory measure and is therefore not widely used.

• Variance:
This is the mean of the squared differences (deviations) between the mean and each observed value, using $n - 1$ as the divisor. It is expressed as:

$$S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1} \quad \text{(raw data)}, \qquad S^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{X})^2}{\sum_{i=1}^{k} f_i - 1} \quad \text{(grouped data)}$$

• Standard Deviation:
The square root of the variance:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}}$$
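The variance and standard deviation formulas can be sketched as below and cross-checked against Python's `statistics` module, which also uses the `n - 1` divisor. The raw values are the four baby weights quoted earlier in the slides (reading the first as 3.14 kg); the grouped table is hypothetical.

```python
import statistics
from math import sqrt


def variance_raw(xs):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def variance_grouped(midpoints, freqs):
    """Sample variance from class midpoints and frequencies (divisor: sum(f) - 1)."""
    n = sum(freqs)
    m = sum(f * x for f, x in zip(freqs, midpoints)) / n
    return sum(f * (x - m) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1)


raw_weights = [3.14, 2.98, 2.94, 3.10]        # kg
midpoints = [2.25, 2.75, 3.25, 3.75, 4.25]    # kg, hypothetical
freqs = [1, 3, 5, 2, 1]                       # hypothetical

var_raw = variance_raw(raw_weights)
sd_raw = sqrt(var_raw)
var_grp = variance_grouped(midpoints, freqs)
print(round(var_raw, 5), round(sd_raw, 4), round(var_grp, 4))
```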
• Inferential Statistics:
When samples are studied, the investigator is usually interested in going beyond the sample to make inferences about the population from which the sample was drawn. Thus, from descriptive statistics such as the mean and variance of the sample values, inferences are made about the same traits in the population. The use of inferential statistics is basic to medical research. The tools of inferential statistics include: confidence intervals, tests of hypotheses, contingency tables, nonparametric tests, regression and correlation analysis, ANOVA, etc.
• Confidence Interval:
A confidence interval combines the features of estimates from a sample with known properties of the normal distribution to give an idea of the uncertainty associated with a single sample estimate of the population parameter. A confidence interval gives a range of values which one can be confident includes the true value.

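CI for a Single Mean ($\mu$)

The $100(1-\alpha)\%$ CI is
$$\bar{X} \pm Z(\alpha/2)\,\frac{\sigma}{\sqrt{n}} \qquad \text{or} \qquad \bar{X} \pm t_{n-1}(\alpha/2)\,\frac{s}{\sqrt{n}}$$

CI for the Difference of Two Means ($\mu_1 - \mu_2$)

The $100(1-\alpha)\%$ CI is
$$\bar{X}_1 - \bar{X}_2 \pm Z(\alpha/2)\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
or
$$\bar{X}_1 - \bar{X}_2 \pm t_{n_1+n_2-2}(\alpha/2)\, S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}},$$
where
$$S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$$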

CI for a Single Proportion ($P$)

The $100(1-\alpha)\%$ CI is
$$P \pm Z(\alpha/2)\sqrt{\frac{p_0 q_0}{n}}$$

CI for the Difference of Two Proportions ($P_1 - P_2$)

The $100(1-\alpha)\%$ CI is
$$P_1 - P_2 \pm Z(\alpha/2)\sqrt{\frac{P_1(1 - P_1)}{n_1} + \frac{P_2(1 - P_2)}{n_2}}$$
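The proportion intervals can be sketched with the standard library's `statistics.NormalDist` for the normal quantile. This is a large-sample sketch under the assumption that the sample proportion is used in the standard error; the function names and the counts (40 of 200, 25 of 200) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ci_proportion(x, n, conf=0.95):
    """Large-sample CI for one proportion: p +/- Z(alpha/2) * sqrt(p(1-p)/n)."""
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half = z * sqrt(p * (1 - p) / n)
    return p - half, p + half


def ci_diff_proportions(x1, n1, x2, n2, conf=0.95):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - half, (p1 - p2) + half


# Hypothetical counts: 40 of 200 women hypertensive in one clinic, 25 of 200 in another.
lo, hi = ci_proportion(40, 200)
dlo, dhi = ci_diff_proportions(40, 200, 25, 200)
print(round(lo, 4), round(hi, 4))
print(round(dlo, 4), round(dhi, 4))
```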

• Tests of Statistical Significance
  - Tests of significance are standard statistical procedures for drawing inferences from sample estimates about unknown population parameters.
  - In medical research, tests of significance allow us to decide whether sample estimates, or differences between estimates, are within their normal biological variation, commonly called variability due to chance.
• Procedure for testing a statistical hypothesis
– State the null hypothesis.
– State the alternative hypothesis (indicate one-tailed or two-tailed).
– State the level of significance (explain Type I and Type II errors).
– Choose the test statistic (explain parametric and nonparametric tests).
– Compute the numerical value of the statistic from the observed data.
– Compare the calculated value of the test statistic with the tabulated value in the appropriate standard distribution table at the specified level of significance.
– Decide whether or not to reject the null hypothesis (equivalently, according to the p-value).

Test for a Single Mean:

In each case $H_0: \mu = \mu_0$, and the test statistic is
$$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \quad (\sigma \text{ known}) \qquad \text{or} \qquad T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \quad (\sigma \text{ unknown}).$$

Case 1 (right tail): $H_1: \mu = \mu_1 > \mu_0$.
Reject $H_0$ if $Z > Z(\alpha)$, or if $T > t_{n-1}(\alpha)$.

Case 2 (left tail): $H_1: \mu = \mu_1 < \mu_0$.
Reject $H_0$ if $Z < -Z(\alpha)$, or if $T < -t_{n-1}(\alpha)$.

Case 3 (two-tailed): $H_1: \mu = \mu_1 \neq \mu_0$.
Reject $H_0$ if $|Z| > Z(\alpha/2)$, or if $|T| > t_{n-1}(\alpha/2)$.
Test for the Difference of Two Means:
$H_0: \mu_1 = \mu_2$
$H_1$: (a) $\mu_1 > \mu_2$
(b) $\mu_1 < \mu_2$
(c) $\mu_1 \neq \mu_2$

• Test statistics are constructed along the lines given for the test for a single mean, and the decisions follow accordingly.
• Finally, tests of proportions are handled by the use of the Z-test for large samples or by the use of the t-test for small samples.

• Contingency Tables:
The test for association between two categorical variables uses the $\chi^2$ distribution.

The test statistic is:

$$\chi^2 = \sum_{i=1}^{n} \frac{(o_i - e_i)^2}{e_i}$$

where $o_i$ and $e_i$ are the observed and expected frequencies in the $i$-th cell. The null hypothesis of no association is rejected whenever the calculated value of $\chi^2 > \chi^2_{\upsilon}(\alpha)$, where $\chi^2_{\upsilon}(\alpha)$ is the value of the chi-squared distribution with $\upsilon$ degrees of freedom at the $\alpha$ level of significance.
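To make the chi-squared statistic concrete, the sketch below computes expected frequencies from the row and column totals of a contingency table and sums the cell contributions. The 2 x 2 counts are hypothetical, and the critical value 3.841 is the standard tabulated value for 1 degree of freedom at the 5% level.

```python
def chi_squared(table):
    """Chi-squared statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # e = row total * column total / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: rows = exposed / not exposed, columns = disease yes / no.
table = [[30, 70],
         [15, 85]]
stat, df = chi_squared(table)

CRITICAL_1DF_5PCT = 3.841        # from a chi-squared table (1 d.f., alpha = 0.05)
reject_h0 = stat > CRITICAL_1DF_5PCT
print(round(stat, 3), df, reject_h0)
```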

• Nonparametric Tests:
The tests for means, proportions and association rest on a fundamental assumption: that the distribution of the test statistic, and indeed the functional form of the distribution of the variables under consideration, is known. When there is no knowledge of the functional form of the underlying density function of the variables, it is usually advisable to resort to a nonparametric test such as:
  - The Wilcoxon (rank sum) test
  - The Mann-Whitney U test
  - The Median test
  - The Sign test

• The Wilcoxon Test (Two Samples)

Let the $X$ sample have $m$ observations and the $Y$ sample $n$ observations, with $N = m + n$.

Test statistic: $S_W = \sum_{j=1}^{m} R_j$, where $R_j$, $j = 1, 2, \ldots, m$, are the ranks of the $X$'s in the combined sample.

Reject $H_0$ when
$$|Z| = \left| \frac{S_W - \frac{m(N+1)}{2}}{\sqrt{\frac{mn(N+1)}{12}}} \right| > Z(\alpha/2)$$

• The Mann-Whitney U Test (Two Samples)

Test statistic: $U = S_W - \frac{m(m+1)}{2}$, where $S_W$ is as in the Wilcoxon test.

Reject $H_0$ when
$$|Z| = \left| \frac{U - \frac{mn}{2}}{\sqrt{\frac{mn(N+1)}{12}}} \right| > Z(\alpha/2)$$
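The rank-sum and U statistics can be sketched as follows, taking m as the size of the X sample, n as the size of the Y sample and N = m + n. The data are hypothetical, tied values receive average ranks, and no tie correction is applied to the variance; with samples this small the normal approximation is only illustrative.

```python
from math import sqrt


def midranks(values):
    """Ranks 1..N of the values, with tied values given their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (S_W, U, Z) for the Wilcoxon rank-sum / Mann-Whitney U test."""
    m, n = len(x), len(y)
    N = m + n
    ranks = midranks(list(x) + list(y))
    s_w = sum(ranks[:m])                                  # rank sum of the X sample
    u = s_w - m * (m + 1) / 2
    z = (s_w - m * (N + 1) / 2) / sqrt(m * n * (N + 1) / 12)
    return s_w, u, z


x = [1.1, 2.3, 3.0]          # hypothetical sample X (m = 3)
y = [2.0, 3.5, 4.2, 5.1]     # hypothetical sample Y (n = 4)
s_w, u, z = rank_sum_test(x, y)
print(s_w, u, round(z, 4))
```

Because U - mn/2 equals S_W - m(N+1)/2, the two tests give the same Z value and therefore the same decision.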
• Regression and Correlation:
A high proportion of data analyses are carried out to study the relationship between two variables. The purposes of such analyses are:
  - To assess whether the two variables are associated.
  - To enable the value of one variable to be predicted from any known value of the other variable.
  - To assess the amount of agreement between the values of the two variables.
• Correlation:
Correlation is the method of analysis used to measure the relationship (association) between two continuous variables, e.g. percentage of body fat and age of normal adults. The association is measured by calculating the correlation coefficient $r$, which can take any value between $-1$ and $+1$.

Pearson's correlation coefficient is expressed as:

$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$

while the Spearman rank correlation coefficient is expressed as:

$$r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

where $d_i$ is the difference between the ranks of $X_i$ and $Y_i$.
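Both coefficients can be sketched in a few lines. The ages and body-fat percentages below are invented for illustration, and the Spearman function uses the shortcut formula with n(n² - 1) in the denominator, which assumes there are no tied values.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rs(x, y):
    """Spearman rank correlation via 1 - 6*sum(d^2) / (n(n^2 - 1)); no ties allowed."""
    if len(set(x)) != len(x) or len(set(y)) != len(y):
        raise ValueError("the shortcut formula assumes no tied values")

    def rank(v):
        ordered = sorted(v)
        return [ordered.index(a) + 1 for a in v]

    n = len(x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank(x), rank(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


age = [23, 31, 39, 45, 58]                   # years, hypothetical
body_fat = [19.5, 17.8, 27.2, 25.1, 33.0]    # percent, hypothetical

r = pearson_r(age, body_fat)
rs = spearman_rs(age, body_fat)
print(round(r, 3), round(rs, 3))
```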

• Regression:
Linear regression describes the linear relationship between variables and can be used to predict the value of one variable for an individual when we know only the other variable. Consider a simple case: fetal weight (kg) and non-pregnant maternal weight. Here we consider the fetal weight as the response (or outcome) variable, while the maternal weight is the predictor variable. These are also called the dependent and independent variables respectively. The linear relationship between the dependent ($Y$) and the independent ($X$) variables is given as:

$$Y = \alpha + \beta X$$

The estimates of $\alpha$ and $\beta$ are:

$$\hat{\beta} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}, \qquad \hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X}$$

Hence, $\hat{Y} = \hat{\alpha} + \hat{\beta} X$, which is used for prediction.

Multiple Regression:
$$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \ldots + \beta_p X_p$$
e.g. obesity, smoking and snoring:
$$Y_{\text{Snoring}} = \alpha + \beta_1 X_{\text{Smoking}} + \beta_2 X_{\text{Obesity}}$$

Logistic Regression:
Suitable for predicting dichotomous (two-category) outcome variables.

• Simple Experimental Design
• One-Way ANOVA
In research work or in the handling of patients, comparisons are often made between several sets of data collected from basically similar populations, such as groups of patients having the same ailment where a different drug was used for each group. Generally, any experiment designed to compare several treatments (sources of variation) must embody two important principles of experimental design: (i) replication and (ii) randomization. The simplest experimental design that incorporates these two principles is the completely randomized design, also called the one-way classification or the one-way analysis of variance, which involves one factor appearing at different levels.

The null hypothesis we would wish to test is:

$H_0: \mu_1 = \mu_2 = \ldots = \mu_k = \mu$ versus
$H_1$: at least one of the $\mu_i$ differs from $\mu$.

Test for One-Way Classification
1. State $H_0$ and $H_1$
$H_0: \mu_1 = \mu_2 = \ldots = \mu_k$

2. Choose the level of significance, $\alpha$

3. Complete the ANOVA table
ANOVA TABLE
Source of variation (S.V.)   d.f.        SS     MS                      F-ratio
Treatment                    k - 1       SStr   MStr = SStr/(k - 1)     Fcal = MStr/MSE
Error                        k(n - 1)    SSE    MSE = SSE/[k(n - 1)]
Total                        kn - 1      SST

5. Under $H_0$, and provided the underlying assumptions are correct, $F_{cal}$ (the F-ratio in the ANOVA table) has the $F_{k-1,\,k(n-1)}$ distribution. Hence, we find the critical point by reading off $F_{k-1,\,k(n-1)}(\alpha)$ from the F-distribution table for the appropriate level of significance.

6. Compare the value of $F_{cal}$ from the ANOVA table with $F_{k-1,\,k(n-1)}(\alpha)$ from the statistical table.
If $F_{cal} > F_{k-1,\,k(n-1)}(\alpha)$, reject the null hypothesis.

7. Draw a conclusion.

Remark
When the sample sizes (i.e. the numbers of observations in each treatment) are not all equal, the necessary adjustment must be made in the computation of the sums of squares.

Example
Six patients each were tested on four types of oral contraceptive to investigate the average reaction time.
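The slides do not give the data for this example, so the sketch below uses invented reaction times purely to show how the ANOVA table entries are computed for k = 4 treatments with n = 6 patients each. The critical value 3.10 is the tabulated F value for (3, 20) degrees of freedom at the 5% level; the function name is hypothetical.

```python
def one_way_anova(groups):
    """Return (SStr, SSE, F) for a balanced one-way layout (equal group sizes)."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch handles equal sample sizes only")
    grand_mean = sum(sum(g) for g in groups) / (k * n)
    means = [sum(g) / n for g in groups]
    ss_tr = n * sum((m - grand_mean) ** 2 for m in means)           # between treatments
    ss_e = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)  # within treatments
    ms_tr = ss_tr / (k - 1)
    ms_e = ss_e / (k * (n - 1))
    return ss_tr, ss_e, ms_tr / ms_e


# Made-up reaction times (illustrative only, NOT data from the slides).
groups = [
    [10, 12, 11, 13, 12, 14],   # contraceptive A
    [14, 15, 13, 16, 15, 17],   # contraceptive B
    [11, 10, 12, 11, 13, 9],    # contraceptive C
    [16, 18, 17, 19, 17, 15],   # contraceptive D
]
ss_tr, ss_e, f_cal = one_way_anova(groups)

F_CRITICAL_3_20_5PCT = 3.10      # from an F table: F(3, 20) at alpha = 0.05
reject_h0 = f_cal > F_CRITICAL_3_20_5PCT
print(ss_tr, ss_e, f_cal, reject_h0)
```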
Risk Estimation:

                      Disease
                  Yes       No        Total
Exposure   Yes    a         b         a + b
           No     c         d         c + d
Total             a + c     b + d     n = a + b + c + d

• Relative Risk (RR)

RR estimates the magnitude of an association between exposure and disease. It indicates the likelihood of developing the disease in the exposed group relative to those who are not exposed. It is the ratio of the incidence of disease in the exposed group to the corresponding incidence of disease in the non-exposed group.

Thus,
$$RR = \frac{a/(a+b)}{c/(c+d)} = \frac{a}{a+b} \cdot \frac{c+d}{c} = \frac{a(c+d)}{c(a+b)}$$
• Remarks:
  1. An RR of 1.0 indicates that the incidence rates of disease in the exposed and non-exposed groups are identical, and thus that no association is observed between the exposure and the disease.
  2. A value of RR greater than 1.0 indicates a positive association, or an increased risk among those exposed (to a factor).
  3. Analogously, an RR less than 1.0 means that there is an inverse association, or a decreased risk among those exposed.
  4. RR may change (in some cases) with time, e.g. the RR for 1 year of exposure might be different from the RR for 10 years of exposure.
Odds Ratio (for case-control studies)
This applies to studies in which participants are selected on the basis of their disease status.
OR ≡ ratio of the odds of exposure among the cases to that among the controls.
$$OR = \frac{a/c}{b/d} = \frac{ad}{bc}$$
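Both risk measures follow directly from the four cells of the exposure-by-disease table. The sketch below assumes the layout a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease; the counts are hypothetical.

```python
def relative_risk(a, b, c, d):
    """RR = incidence in the exposed / incidence in the non-exposed."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR = (a/c) / (b/d) = ad / bc."""
    return (a * d) / (b * c)


# Hypothetical 2 x 2 table: exposed (30 diseased, 70 not), unexposed (15 diseased, 85 not).
a, b, c, d = 30, 70, 15, 85
rr = relative_risk(a, b, c, d)
oratio = odds_ratio(a, b, c, d)
print(round(rr, 3), round(oratio, 3))
```

With these counts the exposed group has twice the incidence of the non-exposed group, while the odds ratio is somewhat larger than the relative risk, as is typical when the disease is not rare.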

• Worked Examples
• Example 1:
Blood pressure levels were measured in 100 diabetic and 100 non-diabetic women aged 40-49 years. Mean systolic blood pressures were 146.4 mm Hg (with standard deviation of 18.5) among the diabetics and 140.4 mm Hg (with standard deviation of 16.8) among the non-diabetics. By making the necessary assumptions, calculate the 95% confidence interval for the difference of the mean blood pressures of the two groups of women.

• Solution:
Assume that the blood pressures in each of the two groups of women are normally distributed. Hence, the difference of the mean blood pressures is also normally distributed.

Given: $100(1-\alpha)\% = 95\%$
$\Rightarrow 1 - \alpha = 0.95$
$\Rightarrow \alpha = 0.05$
$\Rightarrow \alpha/2 = 0.025$

The formula for the $100(1-\alpha)\%$ CI for the difference of two means is:

$$\bar{X}_1 - \bar{X}_2 \pm Z(\alpha/2)\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}$$

This is appropriate since $n_1 = n_2 = 100$ are considered to be large values.

Substituting, with $Z(\alpha/2) = Z(0.025) = 1.96$, we have

$$146.4 - 140.4 \pm 1.96\sqrt{\frac{18.5^2}{100} + \frac{16.8^2}{100}}$$

i.e. 6 ± 1.96 × 2.498979792
i.e. 6 ± 4.898
(1.102, 10.898)
∴ the 95% confidence interval for the difference of means is 1.1 to 10.9 mm Hg.
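The arithmetic of Example 1 can be reproduced with a short script. It uses the figures given in the example and the tabulated value Z(0.025) = 1.96; the variable names are only illustrative.

```python
from math import sqrt

# Figures from Example 1.
mean_diabetic, sd_diabetic, n_diabetic = 146.4, 18.5, 100
mean_other, sd_other, n_other = 140.4, 16.8, 100
z = 1.96  # Z(0.025), as used in the worked solution

diff = mean_diabetic - mean_other
se = sqrt(sd_diabetic ** 2 / n_diabetic + sd_other ** 2 / n_other)  # standard error
half_width = z * se

lower, upper = diff - half_width, diff + half_width
print(round(se, 6), round(half_width, 3))
print(round(lower, 3), round(upper, 3))
```

Since the interval excludes zero, the same data would lead to rejection of the hypothesis of equal means at the 5% level in a two-sided test.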
• Example 2:
A team of medical researchers wished to measure the weight gained by users of oral contraceptives. The weights of 12 women were taken before and after one year of use of the contraceptive. Unfortunately, one of the women died before the end of the year, and therefore there was no result for her (this is indicated by * in the data set). Estimate the weight of the woman who died before the experiment was concluded.

Weights of Women (kg)
Before (X)    After (Y)
50            61
55            61
60            59
65            71
70            80
75            76
79.5          *
80            90
85            106
90            98
95            100
100           114

Solution: First, we find the regression line $Y = \alpha + \beta X$ by estimating $\alpha$ and $\beta$.

Complete the table (the woman with the missing value is excluded, so n = 11):

         x      y      x²      y²      xy
         50     61     2500    3721    3050
         55     61     3025    3721    3355
         60     59     3600    3481    3540
         65     71     4225    5041    4615
         70     80     4900    6400    5600
         75     76     5625    5776    5700
         80     90     6400    8100    7200
         85     106    7225    11236   9010
         90     98     8100    9604    8820
         95     100    9025    10000   9500
         100    114    10000   12996   11400
Total    825    916    64625   80076   71790

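Using the results of the table (with n = 11) we get

$$\hat{\beta} = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} = \frac{11(71790) - (825)(916)}{11(64625) - (825)^2} = 1.1236$$

$$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{X} = 83.2727 - 1.1236 \times 75 = -0.9973$$

(With the unrounded slope the intercept is exactly $-1.0000$; the prediction below is the same to two decimal places either way.)

$$\therefore \hat{Y} = \hat{\alpha} + \hat{\beta} X = -0.9973 + 1.1236 X$$

Hence, when $X = 79.5$ we have
$$\hat{Y} = -0.9973 + 1.1236 \times 79.5 = 88.3289$$
That is, the estimated weight of the woman who died (after one year) would have been 88.33 kg.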
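The regression estimates in Example 2 can be recomputed from the eleven complete (before, after) pairs. The sketch below keeps full precision throughout: the unrounded slope gives an intercept of exactly -1.0000 rather than the -0.9973 obtained from the rounded slope 1.1236, but the predicted weight is the same to two decimal places.

```python
# The eleven complete pairs from Example 2 (the woman with X = 79.5 is excluded).
x = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
y = [61, 61, 59, 71, 80, 76, 90, 106, 98, 100, 114]

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xx = sum(a * a for a in x)
sum_xy = sum(a * b for a, b in zip(x, y))

# Least-squares estimates using the computational form of the formulas.
beta = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
alpha = sum_y / n - beta * sum_x / n

predicted = alpha + beta * 79.5   # estimated "after" weight for X = 79.5 kg
print(round(beta, 4), round(alpha, 4), round(predicted, 2))
```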

• Example 3: Serum amylase determinations were made on a sample of 15 apparently healthy subjects. The sample yielded a mean of 96 units/100 ml and a standard deviation of 35 units/100 ml. The population variance was unknown. Can one conclude that the mean of the population from which the sample of serum amylase determinations came is different from 120?
Solution:
$H_0: \mu = \mu_0 = 120$
$H_1: \mu \neq 120$

The test statistic is
$$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$

Let $\alpha = 0.05$. Since we have a two-sided test, we put $\alpha/2 = 0.025$ in each tail of the distribution.
∴ we find $t_{14}(0.025) = 2.1448$ (obtained from the statistical table).

Computed t:
$$t = \frac{96 - 120}{35/\sqrt{15}} = -2.66$$
∴ $|t| = 2.66$

Decision rule:
Since $|t| = 2.66 > t_{14}(0.025) = 2.1448$, we reject the null hypothesis.

Conclusion: Based on the given data, we conclude that the mean of the population from which the sample came is not 120.
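The test statistic in Example 3 can be recomputed directly. The critical value is the tabulated t value with 14 degrees of freedom quoted in the solution (2.1448), since the Python standard library has no t-distribution quantile function.

```python
from math import sqrt

# Figures from Example 3.
sample_mean, mu_0, s, n = 96, 120, 35, 15
t_critical = 2.1448   # t with 14 d.f. at 0.025 in each tail, from a t table

t = (sample_mean - mu_0) / (s / sqrt(n))
reject_h0 = abs(t) > t_critical
print(round(t, 2), reject_h0)
```

Carried to more decimal places the statistic is about -2.656, so it rounds to -2.66.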
Exercise:
At admission, two groups of women on two different family planning methods in clinical trials show the following characteristics.

                           Mean      SD       No. of women
Weight (kg)
  Cycloprovera             56.83     12.48    42
  HRP 102                  59.29     15.47    48
Height (cm)
  Cycloprovera             155.86    5.17     42
  HRP 102                  155.83    6.39     48
Age (years)
  Cycloprovera             27.71     4.10     42
  HRP 102                  28.46     4.66     48
Systolic BP (mm Hg)
  Cycloprovera             118.7     9.2      42
  HRP 102                  121.9     9.8      48
Diastolic BP (mm Hg)
  Cycloprovera             78.1      7.3      42
  HRP 102                  78.9      7.9      48

Find whether the two groups differ substantially at admission.

10/14/08 47

