
Handling Data

This document discusses handling data and statistics. It defines statistics as the collection, organization, analysis, and interpretation of numerical data. It outlines key learning objectives like different scales of measurement, data collection methods, techniques for presenting data, probability distributions, sampling techniques, statistical inference, estimation, and hypothesis testing. The document also discusses what statistics is used for, including organizing information, assessing health status, program evaluation, and drawing inferences about populations from samples.

Uploaded by

fayan

Handling Data

By:
Abdusamed M., PhD fellow

May, 2022
Harar, Ethiopia
11/21/2023 1
Learning Outcomes
After completing this session, learners will be able to:
• Discuss the scales of measurement
• Describe the methods of data collection
• List the techniques for presenting and summarizing data
• Identify the probability distributions
• Differentiate the sampling techniques
• Draw statistical inferences
• Distinguish between point and interval estimation
• Explain hypothesis testing
• Determine sample size for different study designs
• Discuss the measures of association


What is Statistics?

• Statistics: A field of study concerned with:
  • The collection, organization, analysis, summarization and interpretation of numerical data, and
  • The drawing of inferences about a body of data when only a small part of the data is observed
• Statistics helps us use numbers to communicate ideas

Biostatistics:
• The application of statistical methods to the biological, medical and public health sciences
• Concerned with the interpretation of biological data and the communication of information derived from these data
• Has a central role in medical investigations
• The numbers must be presented in such a way that valid interpretations are possible
• Statistics are everywhere: just look at any newspaper or the current medical and public health literature

Question:

Why study statistics?

Please describe:
• The rationale for studying statistics
• The limitations of statistics

Uses of Biostatistics
• Provide methods of organizing information
• Assessment of health status
• Health program evaluation
• Resource allocation
• Magnitude of association
  • Strong vs. weak association between exposure and outcome
• Assessing risk factors: cause-and-effect relationship
• Evaluation of a new vaccine or drug
  • What can be concluded if the proportion of people free from the disease is greater among the vaccinated than the unvaccinated?
  • How effective is the vaccine (drug)?
  • Is the effect due to chance or some bias?
• Drawing of inferences: Information from sample to population
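
The vaccine question in the list above ("How effective is the vaccine?") is commonly answered by comparing the disease risk in the two groups. The sketch below uses the standard relative risk reduction, 1 − (risk in vaccinated / risk in unvaccinated). The function names and the trial counts are hypothetical, invented only for illustration; the slides do not prescribe a particular formula.

```python
def attack_rate(cases: int, total: int) -> float:
    """Proportion of a group that developed the disease."""
    return cases / total


def vaccine_efficacy(cases_vax: int, n_vax: int,
                     cases_unvax: int, n_unvax: int) -> float:
    """Relative reduction in risk among the vaccinated."""
    risk_vax = attack_rate(cases_vax, n_vax)
    risk_unvax = attack_rate(cases_unvax, n_unvax)
    return 1 - risk_vax / risk_unvax


# Hypothetical counts (not from any real study).
ve = vaccine_efficacy(cases_vax=10, n_vax=1000,
                      cases_unvax=50, n_unvax=1000)
print(f"Estimated vaccine efficacy: {ve:.0%}")
```

A point estimate like this does not by itself answer whether the effect is due to chance or bias; that is the job of the inferential methods introduced later in the session.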
Types of Statistics

Descriptive statistics:
• Ways of organizing and summarizing data
• Helps to identify the general features and trends in a set of data and to extract useful information
• Also very important in conveying the final results of a study
• Example: tables, graphs, numerical summary measures

Inferential statistics:
• Methods used for drawing conclusions about a population based on the
information obtained from a sample of observations drawn from that population
• Example: Principles of probability, estimation, confidence interval,
comparison of two or more means or proportions, hypothesis testing,
etc.
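
To make the distinction concrete, the sketch below first summarizes a sample (descriptive) and then uses it to say something about the population mean (inferential). The blood pressure values are hypothetical, and the interval uses a normal approximation for simplicity; with a sample this small a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample: systolic blood pressure (mmHg) of 10 patients.
sample = [118, 125, 130, 122, 135, 128, 140, 119, 127, 133]

# Descriptive statistics: summarize the observed sample.
x_bar = mean(sample)
s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)

# Inferential statistics: approximate 95% confidence interval
# for the unknown population mean.
z = NormalDist().inv_cdf(0.975)  # about 1.96
margin = z * s / sqrt(len(sample))
ci = (x_bar - margin, x_bar + margin)

print(f"Sample mean = {x_bar:.1f}, SD = {s:.1f}")
print(f"Approximate 95% CI for the population mean: {ci[0]:.1f} to {ci[1]:.1f}")
```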
Data
• Data are numbers obtained by measurement or by counting
• The raw material for statistics
• Can be obtained from: routinely kept records, literature, surveys, counting,
experiments, reports, observation, etc.
Statistical Data
• Refers to numerical descriptions of things
• These descriptions may take the form of counts or measurements

Types of Data
1. Primary data: collected directly from the items or individual respondents by the researcher for the purpose of a study
2. Secondary data: data that were collected and statistically treated by another person or organization, and that are then used by other people for a different purpose
Characteristics of Statistical Data
• Numerical descriptions must possess the following characteristics to be called statistics:
i. They must be in aggregates: statistics are a 'number of facts'
   • A single fact, even though numerically stated, cannot be called statistics
ii. They must be affected to a marked extent by a multiplicity of causes
   • This means that statistics are aggregates only of such facts as grow out of a 'variety of circumstances'
iii. They must be enumerated or estimated according to a reasonable standard of accuracy
iv. They must have been collected in a systematic manner for a predetermined purpose
v. They must be placed in relation to each other, i.e. they must be comparable
For example:
– When a hospital administrator counts the number of patients (counting)
– When a nurse weighs a patient (measurement)

Sources of Data
• We search for suitable data to serve as the raw material for our investigation
• Such data are available from one or more of the following sources:
1. Routinely kept records:
• Hospital medical records contain immense amounts of information on patients
• Hospital accounting records contain a wealth of data on the facility’s business activities
2. External sources: existing data in the form of
• Published reports
• Commercially available data banks, or
• The research literature, i.e. someone else has already asked the same question

3. Surveys:
• The source may be a survey, if the data are needed to answer certain questions
• For example:
– If the administrator of a clinic wishes to obtain information regarding the mode of transportation used by patients to visit the clinic, then a survey may be conducted among patients to obtain this information
4. Experiments:
• The data needed to answer a question are available only as the result of an experiment
• For example:
– If a professional wishes to know which of several strategies is best for maximizing patient compliance, he or she might conduct an experiment in which the different strategies of motivating compliance are tried with different patients
Variable
• Characteristic that takes on different values in different persons,
places, or things
• It is not the same when observed in different possessors of it
• Examples:
– Diastolic blood pressure,
– Heart rate,
– The heights of adult males,
– The weights of preschool children,
– The ages of patients seen in a dental clinic

Random Variable
• The values obtained arise as a result of chance factors, so that they cannot be
exactly predicted in advance
– Example:
• Adult height
• When a child is born, we cannot predict exactly his or her height at maturity
(genetic and environmental factors)

Types of Data

Quantitative Data
• Can be measured in the usual sense
For example:
• The heights of adult males,
• The weights of preschool children,
• The ages of patients seen in a dental clinic

Qualitative Data
• Characteristics are not capable of being measured
• Some of them can be ordered or ranked
• Convey information regarding an attribute
For example:
• Classification of people into socio-economic groups,
• Social classes based on income, education, etc.

Types of Data
• Quantitative data
  • Continuous
  • Discrete
• Qualitative data
  • Nominal
  • Ordinal

Types of Quantitative Data

Discrete data
• Characterized by gaps or interruptions in the values that it can assume
For example:
- The number of daily admissions to a general hospital,
- The number of decayed, missing or filled teeth per child in an elementary school

Continuous data
• Can assume any value within a specified relevant interval of values assumed by the variable
For example:
- Height,
- Weight,
- Skull circumference
• No matter how close together the observed heights of two people, we can find another person whose height falls somewhere in between

Types of Qualitative Data

Nominal
• As the name implies, it consists of “naming” or classifying observations into various mutually exclusive categories
Example:
• Male – female
• Sick – well
• Married – single – divorced

Ordinal
• Whenever qualitative observations can be ranked or ordered according to some criterion
Example:
• Blood pressure: (high – normal – low)
• Grades: (excellent – very good – good – fail)

Measurement and Measurement Scale
Measurement
• A procedure in which qualities or quantities are assigned to the characteristics of objects or events
• Not all measurements are the same
• Example: weight in kg, height in meters, ...
• Measuring the status of a patient on the scale: “improved”, “stable”, “unimproved”
Measurement scales
• Are important for the statistical analysis of data
• There are four types of measurement scales:
  • Nominal scale
  • Ordinal scale
  • Interval scale
  • Ratio scale
The Nominal Scale
• The lowest measurement scale
• Consists of naming of observations or classifying them into various mutually
exclusive and collectively exhaustive categories
• The values fall into unordered categories or classes
• Uses names, labels or symbols to assign each measurement
• Example: blood types, sex, race, marital status, religion, causes of illnesses and
causes of death
• Dichotomous or binary: if nominal data can take only two possible values
• Example:
  • Male / Female
  • Yes / No
  • Cured from the disease or not
  • Well / Sick
  • Child / Adult
  • Married / Not married
The Ordinal Scale
• Assigns each measurement to one of a limited number of categories that are ranked in order
• Observations are not only different from category to category but can be ranked according to some criterion
• Although non-numerical, considered to have a natural ordering
• Example:
– Patient status: unimproved, improved, and much improved
– Cancer stages
– Social classes

Example:
Pain level
1. None
2. Very mild
3. Mild
4. Moderate
5. Severe
6. Very severe
• The numbers have limited meaning
• 6>5>4>3>2>1 is all we know apart from their utility as labels

Likert scales are ordinal scales
• Indicate the extent of agreement or disagreement (have 5- or 7-point ordered categories)
• Response to a given question
1. Strongly agree
2. Agree
3. Neutral
4. Disagree
5. Strongly disagree
• Difficulty of walking for a patient
1. No difficulty
2. Slight difficulty
3. Moderate difficulty
4. Great difficulty
5. Unable to walk at all
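
Because ordinal codes carry order but not distance, summaries based on rank (median) or frequency (mode) are appropriate, while an arithmetic mean of the codes is hard to interpret. The sketch below illustrates this with the six-level pain scale; the list of responses is hypothetical.

```python
from collections import Counter
from statistics import median_low

# Ordered categories of the pain scale, from lowest to highest.
LEVELS = ["None", "Very mild", "Mild", "Moderate", "Severe", "Very severe"]

# Hypothetical responses from seven patients.
responses = ["Mild", "Moderate", "None", "Severe", "Mild", "Very mild", "Mild"]

# Map each label to its rank (1..6). Ranks encode order only.
rank = {label: i for i, label in enumerate(LEVELS, start=1)}
ranks = sorted(rank[r] for r in responses)

# median_low always returns an observed rank, so it maps back to a label.
median_label = LEVELS[median_low(ranks) - 1]
mode_label = Counter(responses).most_common(1)[0][0]

print("Median pain level:", median_label)
print("Most frequent pain level:", mode_label)
```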
The Interval Scale
• More sophisticated than the nominal or ordinal scale
• The distance between any two measurements is known
• Uses a unit distance and a zero point, both of which are arbitrary
• Truly quantitative
• Measured on a continuum, and the difference between any two numbers on the scale is of known size
• Example: Temperature in °C on three consecutive days

Day          Monday   Tuesday   Wednesday
Temp in °C   18       20        23

• For these data, Monday (18 °C) is not only cooler than Wednesday (23 °C), but exactly 5 °C cooler
• It has no true zero
• Example: IQ score, calendar year, etc.
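
The practical consequence of an arbitrary zero is that differences are meaningful on an interval scale but ratios are not. The sketch below checks this with the Monday and Wednesday temperatures from the example: converting Celsius to Kelvin (which has a true zero) leaves the difference unchanged but changes the ratio. The helper function name is an assumption made for illustration.

```python
def celsius_to_kelvin(temp_c: float) -> float:
    """Shift the zero point; the size of one degree is unchanged."""
    return temp_c + 273.15


monday_c, wednesday_c = 18.0, 23.0

# Differences survive a change of zero point.
difference_c = wednesday_c - monday_c
difference_k = celsius_to_kelvin(wednesday_c) - celsius_to_kelvin(monday_c)

# Ratios do not: "23 °C is 1.28 times as hot as 18 °C" is not a valid claim.
ratio_c = wednesday_c / monday_c
ratio_k = celsius_to_kelvin(wednesday_c) / celsius_to_kelvin(monday_c)

print(f"Difference: {difference_c:.1f} °C vs {difference_k:.1f} K")
print(f"Ratio: {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```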

The Ratio Scale
• The highest level of measurement is the ratio scale

• Characterized by:
  • Equality of ratios can be determined
  • Equality of intervals can be determined
  • A true zero point

• Example: height, weight, and length

Exercise - 1

• Identify the type of data (nominal, ordinal, interval and ratio) represented by each of
the following. Confirm your answers by giving your own examples.
1. Blood group
2. Temperature (Celsius)
3. Ethnic group
4. Job satisfaction index (1-5)
5. Number of heart attacks
6. Calendar year
7. Serum uric acid (mg/100ml)
8. Number of accidents in 3 - year period
9. Number of cases of each reportable disease reported by a health worker
10. The average weight gain of six 1-year-old dogs (with a special diet supplement) was 950 grams last month
Population and Sample
Population: Refers to any collection of objects
• The largest collection of entities in which we have an interest at a particular time
Finite population
• A population of values that consists of a fixed number of these values
Infinite population
• A population that consists of an endless succession of values

Target Population:
• A collection of items that have something in common for which we wish to draw
conclusions at a particular time
• E.g., All hospitals in Ethiopia
• The whole group of interest

Study (Sampled) Population:


• The subset of the target population that has at least some chance of being sampled
• The specific population group from which samples are drawn and data are collected

Sample:
• A subset of a study population, about which information is actually obtained
• The individuals who are actually measured and comprise the actual data
• Role of statistics: using information from a sample to make inferences about the population
• E.g., in a study of the prevalence of Covid-19 among adults in Ethiopia, a random sample of adults in Dire Dawa was included
  • Target population: All adults in Ethiopia
  • Study population: All adults in Dire Dawa
  • Sample: Adults in Dire Dawa who were included in the study
[Diagram: the sample sits inside the study population, which sits inside the target population; information from the sample is used to make inferences about the population]
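
Drawing a random sample from the study population can be sketched in a few lines. The sampling frame below is a hypothetical list of ID numbers, and the sizes (5,000 adults, 200 sampled) are invented for illustration; the slides do not state how the Dire Dawa sample was actually drawn.

```python
import random

random.seed(2022)  # fixed seed so the sketch is reproducible

# Hypothetical sampling frame: ID numbers of adults in the study population.
study_population = list(range(1, 5001))

# Simple random sample without replacement: every adult in the frame
# has the same chance of being selected.
sample_ids = random.sample(study_population, k=200)
sampling_fraction = len(sample_ids) / len(study_population)

print(f"Sampled {len(sample_ids)} of {len(study_population)} adults "
      f"({sampling_fraction:.0%} of the study population)")
```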
Generalizability
• Is a two-stage procedure:
• We need to be able to generalize from:
  • The sample to the study population, and
  • Then from the study population to the target population
• If the sample is not representative of the population, the conclusions are restricted to the sample and don’t have general applicability
• Collect information from a relatively SMALL sample → draw conclusions about a rather LARGE population

Parameter and Statistic
• Parameter: A descriptive measure computed from the data of a
population
• E.g., the mean (µ) age of the target population

• Statistic: A descriptive measure computed from the data of a sample
• E.g., the sample mean age (x̄)
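
The difference between a parameter and a statistic can be shown on a small artificial population where, unusually, every value is known. In the sketch below the ages are randomly generated (hypothetical data), `mu` plays the role of the parameter and `x_bar` the role of the statistic.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical finite population: ages of 1,000 adults.
population_ages = [random.randint(18, 80) for _ in range(1000)]

# Parameter: computed from every member of the population.
mu = mean(population_ages)

# Statistic: computed from a sample of 50, used to estimate the parameter.
sample_ages = random.sample(population_ages, k=50)
x_bar = mean(sample_ages)

print(f"Population mean age (parameter): {mu:.1f}")
print(f"Sample mean age (statistic):     {x_bar:.1f}")
```

In practice the population values are not all available, which is why the statistic is used to estimate the parameter.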

Major Steps in Statistical Methods
• A statistical investigation is an investigation conducted according to the
statistical technique
• The main steps in statistical investigation are:
i. Collection of data,
ii. Organization of data,
iii. Presentation of data,
iv. Analysis of data, and
v. Interpretation of data

Collection of Data
• This is the process of obtaining measurements, counts, or information
• Is the first step in a statistical investigation
• Valid conclusions can only result from properly collected data
Organization of Data
• This step involves data editing, classification, and tabulation
• Data editing: correcting or adjusting omissions, inconsistencies, irrelevant
answers and wrong computations
• Data classification: arranging data according to some common characteristics
possessed by the items constituting the data
• Data tabulation: arranging the data in columns and rows so that there is
absolute clarity in the data presented
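
The three organization steps can be sketched on a small set of survey answers about the mode of transport used to reach a clinic. The raw answers and the cleaning rules below are hypothetical; real editing rules depend on the questionnaire.

```python
from collections import Counter

# Hypothetical raw answers, with omissions and inconsistent spelling.
raw = ["Bus", "bus ", "Walk", "", "Taxi", "walk", "BUS", None, "Car", "Walk"]

# Editing: drop omissions and standardize spacing and capitalization.
edited = [r.strip().capitalize() for r in raw if r and r.strip()]

# Classification: group answers that share the same characteristic.
counts = Counter(edited)

# Tabulation: arrange the classified data in rows and columns.
print(f"{'Mode':<8}{'Frequency':>10}")
for mode, n in counts.most_common():
    print(f"{mode:<8}{n:>10}")
```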

Presentation of Data
• Is about arranging the data using graphs and diagrams
• The main purpose of data presentation is to facilitate statistical analysis
Analysis of Data
• Extraction of summarised and comprehensible numerical descriptions of the
data
• The purpose: to dig out information useful for decision-making
• Methods used in analysing data: observation, measures of central tendency,
measures of variation, correlation and regression
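
As a minimal illustration of the first two families of methods, the sketch below computes measures of central tendency and variation with Python's standard `statistics` module. The daily admission counts are hypothetical.

```python
from statistics import mean, median, mode, stdev

# Hypothetical data: daily admissions to a hospital over 10 days.
admissions = [12, 15, 11, 15, 18, 14, 15, 13, 16, 11]

# Measures of central tendency.
centre = {
    "mean": mean(admissions),
    "median": median(admissions),
    "mode": mode(admissions),
}

# Measures of variation.
spread = {
    "range": max(admissions) - min(admissions),
    "standard deviation": stdev(admissions),  # sample SD (n - 1)
}

for name, value in {**centre, **spread}.items():
    print(f"{name}: {value:.2f}")
```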
Interpretation of Data
• Refers to making conclusions about the data
• This step usually involves decision-making about
• A large collection of objects (population) and
• Information gathered from a small collection of similar objects (sample)
What does Biostatistics cover?
• Biostatistical thinking contributes to every step of a research project:
  1. Research planning
  2. Design
  3. Execution (data collection)
  4. Data processing
  5. Data analysis
  6. Presentation
  7. Interpretation
  8. Publication
• The best way to learn about biostatistics is to follow the flow of a research project from inception to the final publication
