PCM 5
PHINMA
School of Medicine
MODULE 5
Summarizing data and Sampling
(Variables, Frequency Distribution, Sampling in
Public Health)
Inferential statistics
• The investigation of specified elements of a sample, which allows us to make inferences about a
larger population (i.e., beyond the sample itself)
• Here we compare groups of subjects or individuals
• It is normally not possible to include every subject or individual in a population in a study,
so we use statistics to infer that the results we obtain apply to the larger population.
Sample
• A sample is a selection of members from within the population (I'll discuss different ways of
selecting a sample a bit later in this course)
• Research is conducted using that sample of members, and any results can be generalized to the
population from which the sample was taken
• This use of statistical analysis makes clinical research possible, as it is usually nearly
impossible to include the complete population
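To illustrate the idea of studying a sample instead of the complete population, here is a minimal Python sketch. It assumes a simple random sample, which is only one of several ways of selecting a sample, and the population values are simulated (hypothetical systolic blood pressures), not data from any study.

```python
import random

# Hypothetical population: systolic blood pressure (mmHg) for 1,000 people.
# These values are simulated purely for illustration.
rng = random.Random(42)  # fixed seed so the example is repeatable
population = [rng.gauss(120, 15) for _ in range(1000)]

# Draw a simple random sample of 50 members, without replacement.
sample = rng.sample(population, 50)

# Compare the sample mean with the (normally unknown) population mean.
population_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)

print(f"Population mean: {population_mean:.1f}")
print(f"Sample mean:     {sample_mean:.1f}")
```

In real research the population mean is not available; the sample mean is used as the estimate of it.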
Variable
• There are many ways to define a variable, but for use in this course I will refer to a variable
as a group name for any data values that are collected for a study
• Examples would include age, presence of risk factor, admission temperature, infective
organism, systolic blood pressure
• Variables invariably become the column names in a data spreadsheet, with each row
representing the findings for one individual in the study
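The spreadsheet layout described above can be sketched in Python as a list of records. The variable names follow the examples listed above, but the key spellings and all of the values are invented for illustration.

```python
# Hypothetical study data: each dictionary is one row (one individual),
# and each key is one variable (one column). All values are invented.
records = [
    {"age": 54, "risk_factor_present": True,
     "admission_temp_c": 38.2, "infective_organism": "E. coli",
     "systolic_bp": 142},
    {"age": 37, "risk_factor_present": False,
     "admission_temp_c": 37.1, "infective_organism": "S. aureus",
     "systolic_bp": 118},
    {"age": 68, "risk_factor_present": True,
     "admission_temp_c": 39.0, "infective_organism": "E. coli",
     "systolic_bp": 155},
]

# The variables are the column names.
variables = list(records[0].keys())
print(variables)

# All the data values collected for one variable form one column.
systolic_values = [row["systolic_bp"] for row in records]
print(systolic_values)
```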
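Arithmetic mean
• The arithmetic mean is the more technical name for what is commonly called the mean
or average.
• The arithmetic mean is the value that is closest to all the other values in a distribution:
the deviations of the observations above and below it balance exactly (they sum to zero).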
• Method for calculating the mean:
Step 1. Add all of the observed values in the distribution.
Step 2. Divide the sum by the number of observations.
• Because of this centering property, the mean is sometimes called the center of gravity of a
frequency distribution.
• The arithmetic mean is the best descriptive measure for data that are normally distributed.
• The mean is not the measure of choice for data that are severely skewed or have extreme
values in one direction or another.
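The two-step method for the mean, and its sensitivity to extreme values, can be sketched in Python as follows. The function name `arithmetic_mean` and both sets of values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def arithmetic_mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("cannot compute the mean of an empty list")
    total = sum(values)          # Step 1: add all of the observed values
    return total / len(values)   # Step 2: divide by the number of observations


# Hypothetical, roughly symmetric data.
symmetric = [4, 5, 6, 7, 8]
mean_symmetric = arithmetic_mean(symmetric)
print(mean_symmetric)  # 6.0

# Centering property: deviations from the mean sum to zero.
print(sum(v - mean_symmetric for v in symmetric))  # 0.0

# Same data with one extreme value: the mean is pulled toward it.
skewed = [4, 5, 6, 7, 58]
print(arithmetic_mean(skewed))  # 16.0
```

In the second data set, four of the five observations lie between 4 and 7, yet the mean is 16, which is why the mean describes severely skewed data poorly.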
Rounding to one decimal, the 95% confidence interval is 200.1 to 211.9. In other words, this study's best estimate of the true
population mean is 206, but the data are consistent with values ranging from as low as 200.1 to as high as 211.9. Thus, the confidence
interval indicates how precise the estimate is. (This confidence interval is narrow, indicating that the sample mean of 206 is fairly
precise.) It also indicates how confident the researchers should be in drawing inferences from the sample to the entire population.
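As a rough check on the interval quoted above, the sketch below computes a 95% confidence interval as the sample mean plus or minus 1.96 standard errors (the usual normal approximation). The standard error of 3 is an assumption: it is not stated in this section and is used here only because it reproduces the reported limits. The function name `confidence_interval_95` is likewise hypothetical.

```python
def confidence_interval_95(sample_mean, standard_error):
    """Return (lower, upper) limits of an approximate 95% confidence interval."""
    z = 1.96  # multiplier for 95% confidence under the normal approximation
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


# Sample mean of 206 is from the text; standard error of 3 is assumed.
lower, upper = confidence_interval_95(206, 3)
print(round(lower, 1), round(upper, 1))  # 200.1 211.9
```

A larger standard error would widen the interval, meaning a less precise estimate of the population mean.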