END OF SEMESTER EXAMINATION
MODULE NAME: INTRODUCTION TO BIOSTATISTICS AND RESEARCH METHODOLOGY : GST 06114: THEORY
TUTOR: MR. RICHARD MOSHI
TIME: 2 HOURS
NUMBER OF INVIGILATORS - 3
1. Write TRUE for the correct statement and FALSE for the incorrect statement. (10 marks)
i. If you square the variance, the answer obtained is the standard deviation. (_________)
ii. Nominal measurement is used to reflect rank or order categories within a variable. (________)
iii. Arithmetic mean is the sum of the observations multiplied by their number. (________)
iv. There can be more than one mode in a data set. (__________)
v. Median is an average value when the data is arranged in ascending or descending order. (________)
vi. Height is a continuous variable. (________)
vii. Health status is a qualitative variable. (___________)
viii. Latin square is an experimental design (___________)
ix. Randomization helps to reduce error in an experiment. (_________)
x. A frequency polygon is plotted with cumulative frequencies against class limits. (_______)
2. Fill in the blanks with the correct words. (10 marks)
i. The middle value of a data set arranged in ascending or descending order is called ____________________
ii. The square root of variance is called _________________________
iii. The median for an even number of items is obtained by dividing the sum of the two central values by ___________
iv. The single value which represents the group value is called ____________________________
v. In experimental design, the independent variable serves both the experimental group and the ________________
vi. The process of examining the truth of a statistical hypothesis is called ______________________
vii. Data collected from existing information is called __________________________________
viii. Pie diagrams are also known as ______________________________
ix. A variable taking only fixed values, i.e. whole numbers, is called ___________________________
x. The coefficient of variation is expressed in terms of ___________________________
3. Choose the letter bearing the correct answer. (10 marks)
i. It acts as a guide to research work –
A) Literature review
B) Research data ( )
C) Research proposal
D) Research reference
ii. One of the following is probability sampling –
A) Accidental sampling
B) Purposive sampling ( )
C) Stratified sampling
D) Quota sampling
iii. The arithmetic mean of a set of ‘n’ numbers is –
A) The middle number ( )
B) The number that occurs most frequently
C) (n + 1) / 2
D) The sum of the numbers divided by n
iv. The class interval of the following data: 20-24, 15-19, 10-14, 05-09, 00-04 is
A) 0.5
B) 5 ( )
C) 4
D) 24.5
v. Qualitative variable takes
A) Only fixed values
B) Value within meaningful extremes
C) Numerical values ( )
D) No numerical values
vi. Primary data is collected from
A) Government circulars, documents etc
B) Personal or individual diaries ( )
C) Interviews and observations in the field
D) Private institutions
vii. Which of the following is a measure of dispersion –
A) Mean
B) Variance ( )
C) Mode
D) Median
viii. Numerical facts collected for reference are called –
A) Data
B) Experiment ( )
C) Treatment
D) Frequency
ix. An assumption to be proved or disproved in a research is known as –
A) Blocking
B) Hypothesis ( )
C) Testing
D) Experiment
x. The type of graph which represents only one item –
A) Compound bar graph
B) Comparative bar graph ( )
C) Divergent bar graph
D) Simple bar graph
4. a) Match the following terms in column I to their correct corresponding
statements in column II. Use match panel provided below. (03 marks)
COLUMN I                   COLUMN II
A. Data editing            a) Summarizing collected raw data and displaying them
B. Data coding             b) Arranging data in groups or classes on the basis of common characteristics
C. Data classification     c) Identifying errors and omissions, if any, and finding ways to rectify them
                           d) Assigning numerical or other symbols to classes
                           e) Computing certain measures and searching for patterns of relationship
Match panel
COLUMN I    COLUMN II
A
B
C
b) Match the following terms in column I to their corresponding statement in column II.
Use the match panel provided below. (03 marks)
COLUMN I              COLUMN II
A. Elements           a) Individual persons, objects or units on which information is collected
B. Sample size        b) The total number of items to be selected from the population
C. Sample frame       c) Logical grouping of the attributes e.g. sex, gender etc
                      d) Complete list of all elements in the study population
                      e) A set of selected elements from the population for study
Match panel
COLUMN I    COLUMN II
A
B
C
c) Match the following graphs in column I to their correct corresponding definitions
in column II. Use the match panel provided below. (06 marks)
COLUMN I                 COLUMN II
A. Ogive                 a) The components of the bar are proportional to the constituent parts of the variable
B. Frequency polygon     b) Data presentation with the help of diagrams
C. Histogram             c) The y-coordinate at the intersection of the two cumulative frequency curves corresponds to the median number
                         d) Consists of rectangles erected with areas proportional to the corresponding frequencies
                         e) Figure formed by joining the graph points with smooth free-hand curves
Match panel
COLUMN I    COLUMN II
A
B
C
5. i) List four (4) diagrammatic methods of presenting data. (4 marks)
ii) List four (4) advantages of interview method. (4 marks)
iii) List four (4) types of complex random sampling. (4 marks)
6. Define the following terms. (2 marks)
a) Sample
b) Variable
c) Range
d) Data
e) Biostatistics
f) Frequency
7. a) Explain any three (3) advantages and three (3) disadvantages of questionnaire. (10 marks)
b) Explain the differences between primary and secondary data. (10 marks)
c) Explain the importance of research proposal. (10 marks)
8. a) Describe any five (5) salient points of interview technique (10 marks)
b) Describe randomized block design in experiments (10 marks)
c) Describe any five (5) sub-headings to be included in chapter one of a research report. (10 marks)
9. a) Discuss the probability sampling method. (10 marks)
b) Discuss the variable measurement levels in statistics. (10 marks)
c) Discuss the precautionary measures to be taken when dealing with table presentation. (10 marks)
MODEL ANSWERS
1.
i. FALSE
ii. FALSE
iii. FALSE
iv. TRUE
v. FALSE
vi. TRUE
vii. TRUE
viii. TRUE
ix. TRUE
x. FALSE
2. FILL THE BLANKS
I. MEDIAN
II. STANDARD DEVIATION
III. TWO (2)
IV. MEAN
V. CONTROL
VI. EXPERIMENT
VII. SECONDARY
VIII. PIE CHART
IX. DISCRETE
X. PERCENTAGE (%)
3.
i. C
ii. C
iii. D
iv. B
v. D
vi. C
vii. B
viii. A
ix. B
x. D
4. a)
Matching panel
COLUMN I COLUMN II
A c
B d
C b
b) Matching panel
COLUMN I COLUMN II
A a
B b
C d
c) Matching panel
COLUMN I COLUMN II
A c
B e
C d
5. i) List four (4) diagrammatic methods of presenting data. (4 marks)
Diagrammatic methods are –
Bar diagram
Component Bar Diagram
Pie chart diagram
Pictogram (picto chart)
ii) List four (4) advantages of interview method. (4 marks)
The method includes both literate and illiterate respondents
There is a high response rate
The researcher can give clarification when a question is not well understood
It can be used to collect data on sudden issues e.g. diseases
The method allows great control over the interviewing environment, as the researcher can study the behaviour or emotions of respondents
The researcher can make sure that all questions are answered and is able to compare answers from two or more respondents
The researcher can use qualitative rather than quantitative analysis to ensure that the questions are well understood
The researcher can get as many answers as possible because the respondent speaks freely
iii) List four (4) types of complex random sampling. (4 marks)
THEY ARE –
Stratified sampling
Multi stage sampling
Cluster sampling
Systematic sampling
6. Define the following terms. (2 marks)
a) SAMPLE: A set of elements selected from the population for study
b) VARIABLE: An observation, characteristic or phenomenon that can take different values for different individuals, times or places
c) RANGE: The difference between the highest and the lowest value
d) DATA: The body of information, usually in numerical form
e) BIOSTATISTICS: The application of statistics to biological problems.
f) FREQUENCY: The number of individuals or objects that have the same measurement or enumeration count, or that lie in the same measurement group
7. EXPLAIN
a) Explain any three (3) advantages and three (3) disadvantages of questionnaire. (10 marks)
ADVANTAGES
Easy to analyse, as the responses give minimum explanation
It can be used even where the respondents are scattered over a wide area
It costs less in terms of time and money
It avoids researcher bias, as there is no face-to-face contact
Uniformity of responses is achieved, as questions are controlled in closed questionnaires
It covers a wider geographical area and more people than the interview method
Participants (respondents) feel free to express their opinions because the researcher is not present at the time
It attracts more respondents, as it does not require the name, sex or job of the respondent
DISADVANTAGES
A researcher ends up with many answers, which become difficult to compare
They are time consuming in completing the research task
The researcher controls the research process, hence no new answers, especially for closed-ended questionnaires
Respondents may cheat, as the researcher does not observe their behaviour
b) Explain the differences between primary and secondary data. (10 marks)
ASPECT: Source of data collection
PRIMARY DATA: Data is collected from the field using interview, questionnaire and observation methods
SECONDARY DATA: Data is collected by extracting information from existing documents e.g. books, papers etc
ASPECT: Reliability of the data collected
PRIMARY DATA: Mostly reliable, as it is (i) first-hand information, (ii) real information, (iii) up to date
SECONDARY DATA: Less reliable, as it is (i) second-hand information, (ii) biased, (iii) sometimes outdated
ASPECT: Cost of data collection
PRIMARY DATA: More expensive to collect, as it requires more resources in terms of time and money
SECONDARY DATA: Less expensive to collect, as it requires fewer resources in terms of time and money
c) Explain the importance of research proposal. (10 marks)
A research proposal is important for the following reasons –
It helps the researcher to think over important issues about the study
It helps the researcher to get the go-ahead from the client
It is used for administrative purposes, as it is the basis for a written agreement or contract between researcher and client
It allows the sponsor to assess the sincerity of the researcher's proposal
It is used to make a choice among competing suppliers and to influence positively the decision to fund the proposed study
It helps the researcher to evaluate the study by looking at the difficulties which are likely to crop up during the whole process
8. DESCRIBE
a) Describe any five (5) salient points of interview technique (10 marks)
The aim of an interview is to get information from respondents; therefore:
The researcher must understand the respondent's own situation
The researcher must grasp the totality of the respondent's situation
There must be a sense of closeness between researcher and respondent
There must be flexibility, i.e. the researcher must be able to put himself in the situation of the respondent
The researcher must consider the place where the interview will be conducted, i.e. it must be common and familiar to the respondent
The interview must take place between two parties
The time must be convenient to both respondent and researcher
b) Describe randomized block design in experiments (10 marks)
In animal experiments, randomized block design (RBD) is done as follows –
Animals are divided and grouped into blocks or units on the basis of e.g. same litter, same sex, same age or same body weight
The blocks or groups of the experimental material must be homogeneous
The grouping aims at homogeneity within the blocks
Treatments are then allocated at random to the animals within each block
ADVANTAGES OF RBD
Reduction of experimental error due to the formation of homogeneous blocks
If one block is affected, analysis can still be done using data from the remaining blocks
DISADVANTAGES OF RBD
The larger the block, the less the homogeneity and hence the greater the experimental error
At times it can be affected by missing values
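The allocation step in a randomized block design can be pictured with a short Python sketch. It is only an illustration: the litters, animal labels and diets are hypothetical, and it assumes each block holds exactly as many animals as there are treatments, so every treatment appears once per block.

```python
import random

def randomized_block_design(blocks, treatments, seed=None):
    """Allocate every treatment once, in random order, within each block."""
    rng = random.Random(seed)
    layout = {}
    for block_name, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError("each block needs one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)  # independent randomization inside this block
        layout[block_name] = dict(zip(units, order))
    return layout

# Hypothetical example: three litters (blocks) of three animals each.
blocks = {
    "litter 1": ["a1", "a2", "a3"],
    "litter 2": ["b1", "b2", "b3"],
    "litter 3": ["c1", "c2", "c3"],
}
treatments = ["diet A", "diet B", "diet C"]

layout = randomized_block_design(blocks, treatments, seed=1)
for block_name, allocation in layout.items():
    print(block_name, allocation)
```

Because the shuffle is repeated separately for each block, differences between litters are kept apart from differences between diets.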
c) Describe any five (5) sub-headings to be included in chapter one of a research report. (10 marks)
Chapter one of a research report would include the following sub-headings –
Introduction: Overview, brief introduction of the content of the chapter plus background of the study, providing a brief history, the status of the topic and a hint of the problem being investigated
Statement of the research problem: Shows what has been done, what is missing and what is going to be done or solved by the research or study
Main objectives of the study: The focus of the study and the source of information for the construction of the research title
Research questions or hypothesis: A statement of the research questions or hypothesis that will guide the investigation
Significance of the study: The contribution of the findings of the study to policy, theory or the expansion of knowledge in general
Limitations of the study: Problems encountered during the study and how you managed them
9. DISCUSS
a) Discuss the probability sampling method. (10 marks)
Definition: Sampling is the process of selecting elements from the population or universe.
There are two types of probability sampling, viz.
Simple random
Complex random
Simple random: This gives every possible combination of elements in a population an equal chance of being included in the sample.
Complex random: Complex random sampling includes –
1. Stratified sampling: The members or units are grouped into homogeneous strata, e.g. by sex or age, and a sample is drawn from each stratum.
2. Multi-stage sampling: The population is divided into units from which the first-stage sample is drawn at random; the second-stage sample is drawn from the first, the third from the second, and so on in that sequence until the final sample is obtained.
3. Cluster sampling: This is sampling in which the first level of units to be sampled is a larger grouping or cluster; a cluster can be a herd of cattle, a farm etc.
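The difference between simple random, stratified and cluster sampling can be sketched in Python with the standard `random` module. The population, strata and herds below are hypothetical and serve only to show how the unit of selection changes from one method to the next.

```python
import random

def simple_random_sample(population, n, rng):
    """Every element has the same chance of selection."""
    return rng.sample(population, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate random sample inside each homogeneous stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Select whole clusters at random and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population of 20 cows.
population = [f"cow{i}" for i in range(1, 21)]
strata = {"male": population[:8], "female": population[8:]}
clusters = {
    "herd A": population[:5],
    "herd B": population[5:10],
    "herd C": population[10:15],
    "herd D": population[15:],
}

rng = random.Random(42)  # fixed seed so the draw can be repeated
srs = simple_random_sample(population, 4, rng)
strat = stratified_sample(strata, 2, rng)
clus = cluster_sample(clusters, 2, rng)
print(srs)
print(strat)
print(clus)
```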
b) Discuss the variable measurement levels in statistics. (10 marks)
INTRODUCTION: The term measurement is used here in a numerical sense, and variables are measured on different levels of scales.
The following are the levels of variable measurement:
1. Nominal measurement: This is used for identifying the various categories that make up a given variable e.g. Religion, where
1 = Muslim
2 = Christian
3 = Others
The numbering (codes) here does not signify ranking; it only shows that the categories making up a nominal variable are mutually exclusive and unrelated.
2. Ordinal measurement: Used to reflect rank or order among the categories comprised within a variable e.g.
Degree of heat
1 = No heat
2 = Initial heat
3 = Peak heat
3. Interval measurement: Here the numbers used are more meaningful than in nominal and ordinal measurement. Arithmetic operations, e.g. addition and subtraction (+ or -), can be performed, and the distance between two consecutive points is the same.
4. Ratio measurement: This is the most sophisticated level of measurement. It has all the characteristics of interval measurement but also has an absolute zero point that represents the absence of the measured quantity.
c) Discuss the precautionary measures to be taken when dealing with table presentation. (10 marks)
Precautions in table presentation are –
1. The titles of the tables should be short and clear and should convey the general contents of the table.
2. Sub-titles should be given so that whenever a part of the information is required it can readily be obtained from the margin totals of the tables.
3. The various items in a table should follow a logical sequence e.g. alphabetical, chronological or numerical order.
4. Footnotes should be given at the end of the table whenever a word or figure has to be explained more elaborately.
5. Space should be left after every item in each column and row of the table in order to separate them.
PRACTICAL
SET I.
1. Draw a cumulative frequency curve (ogive) using the data below. State the median value. (20 marks)
Average Milk Production of Dairy Herd.
Number of cows      6     9    10     25      19      15       8
Class             0-3   3-6   6-9   9-12   12-15   15-18   18-21
2. Draw a histogram showing the frequency distribution of cattle owner groups based on class of stock ownership. (10 marks)
Group (class)       0-2   2-4   4-6   6-8   8-10   10-12
Number of owners     40    48    25    18     12       7
SET II
1. Study carefully the following data on live weights obtained from goats on a certain farm, then answer the questions that follow.
Goat       A   B   C   D   E   F   G   H   I   J   K   L   M   N
Live wt   20  16  16  20  13  22  11  20  24  16  18  25  20  25
Find the following:
I. Arithmetic mean
II. Variance
III. Standard deviation
IV. Mode
2. Using a bar graph, show the relationship between the production and processing of milk in the data given below. (12 marks)
Production   85   93   104   44
Processed    80   88    99   39
SET III
1. Draw a PIE CHART showing the different kinds of livestock population in a certain region in Tanzania, using percentages.
LIVESTOCK KINDS     POPULATION NUMBERS (000s)
1. Pigs             1
2. Sheep            2
3. Goats            3
4. Cattle           4
5. Chicken          5
2. Find the mean, median, mode and range of the following set of numbers.
Set of numbers: 9, 8, 12, 12, 16, 15, 14, 5, 10, 10, 6, 12 (15 marks)
PRACTICAL
MODEL ANSWER
SET I
1. DRAW A CUMULATIVE FREQUENCY CURVE (OGIVE)
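The points to plot for the ogive, and a check on the median read from it, can be worked out with a short Python sketch. It uses the Set I milk production data and assumes linear interpolation inside the median class, which is the usual grouped-data estimate and should agree closely with the value read off a carefully drawn curve.

```python
from itertools import accumulate

# Class limits and frequencies from Practical Set I, question 1.
class_limits = [(0, 3), (3, 6), (6, 9), (9, 12), (12, 15), (15, 18), (18, 21)]
frequencies = [6, 9, 10, 25, 19, 15, 8]

# Running totals; each is plotted against the upper limit of its class.
cumulative = list(accumulate(frequencies))
ogive_points = [(upper, cf) for (_, upper), cf in zip(class_limits, cumulative)]

def grouped_median(limits, freqs):
    """Estimate the median by linear interpolation inside the median class."""
    half = sum(freqs) / 2
    below = 0  # cumulative frequency before the current class
    for (lower, upper), f in zip(limits, freqs):
        if below + f >= half:
            return lower + (half - below) / f * (upper - lower)
        below += f
    raise ValueError("frequencies must not be empty")

median_estimate = grouped_median(class_limits, frequencies)
print(ogive_points)
print(round(median_estimate, 2))  # 11.52
```

The total frequency is 92, so the median lies at the 46th cow, which falls in the 9-12 class.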
2. DRAW A HISTOGRAM
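Since all classes in the Set I ownership data have the same width, the height of each rectangle is simply the class frequency. The sketch below prints a rough text version of the histogram as a check on the drawing; the scale of two owners per `#` is an arbitrary choice for illustration.

```python
# Classes and owner counts from Practical Set I, question 2.
classes = ["0-2", "2-4", "4-6", "6-8", "8-10", "10-12"]
owners = [40, 48, 25, 18, 12, 7]

def text_histogram(labels, counts, scale=2):
    """Return one line per class; each '#' stands for `scale` owners, rounded down."""
    lines = []
    for label, count in zip(labels, counts):
        bar = "#" * (count // scale)
        lines.append(f"{label:>5} | {bar} {count}")
    return lines

histogram_lines = text_histogram(classes, owners)
print("\n".join(histogram_lines))
```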
SET II
1. Goat A B C D E F G H I J K L M N
Weight 20 16 16 20 13 22 11 20 24 16 18 25 20 25
SOLUTION
I. Arithmetic mean
Formula: $\bar{x} = \frac{\sum x}{N}$
Where x = each observation (live weight)
N = number of items
Sum = 20 + 16 + 16 + 20 + 13 + 22 + 11 + 20 + 24 + 16 + 18 + 25 + 20 + 25 = 266
Mean = 266 / 14 = 19
II. Variance
Formula: $s^2 = \frac{\sum (x - \bar{x})^2}{N}$
No.   Weight (x)   Mean   Deviation d = x - mean   $d^2$
1.    20           19      1                        1
2.    16           19     -3                        9
3.    16           19     -3                        9
4.    20           19      1                        1
5.    13           19     -6                       36
6.    22           19      3                        9
7.    11           19     -8                       64
8.    20           19      1                        1
9.    24           19      5                       25
10.   16           19     -3                        9
11.   18           19     -1                        1
12.   25           19      6                       36
13.   20           19      1                        1
14.   25           19      6                       36
Total                                             238
Variance = 238 / 14 = 17
III. Standard deviation
Formula: $SD = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$
= $\sqrt{17}$
= 4.123
IV. Mode = 20
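The Set II answers can be checked with a short Python sketch that follows the same steps as the hand calculation. It assumes the population formulas (dividing by N rather than N - 1), as in the formulas used in this solution.

```python
import math
import statistics

# Live weights of goats A to N from Practical Set II.
weights = [20, 16, 16, 20, 13, 22, 11, 20, 24, 16, 18, 25, 20, 25]

n = len(weights)
mean = sum(weights) / n                    # arithmetic mean
deviations = [x - mean for x in weights]   # d = x - mean
squared = [d ** 2 for d in deviations]     # d squared
sum_sq = sum(squared)
variance = sum_sq / n                      # population variance (divide by N)
std_dev = math.sqrt(variance)
mode = statistics.mode(weights)            # most frequent weight

for x, d, d2 in zip(weights, deviations, squared):
    print(f"{x:>3} {d:>4.0f} {d2:>4.0f}")
print("mean:", mean)
print("sum of squared deviations:", sum_sq)
print("variance:", variance)
print("standard deviation:", round(std_dev, 3))
print("mode:", mode)
```

Note that the two goats weighing 25 each deviate from the mean by 6, so each contributes 36 to the sum of squared deviations.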
SET III
1. Draw a PIE - CHART
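The percentage and the angle of each sector of the pie chart follow from each kind's share of the total. A minimal Python sketch using the Set III livestock figures:

```python
# Livestock numbers (in thousands) from Practical Set III, question 1.
livestock = {"Pigs": 1, "Sheep": 2, "Goats": 3, "Cattle": 4, "Chicken": 5}

total = sum(livestock.values())
shares = {
    kind: {"percent": count / total * 100, "angle": count / total * 360}
    for kind, count in livestock.items()
}

for kind, share in shares.items():
    print(f"{kind:<8} {share['percent']:6.2f}%  {share['angle']:6.1f} degrees")
```

With a total of 15 (000s), goats for example take 3/15 of the circle, that is 20% or 72 degrees.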
2. Find the mean, median, mode and range of the following set of numbers.
Set of numbers: 9, 8, 12, 12, 16, 15, 14, 5, 10, 10, 6, 12 (15 marks)
i. Mean
Formula: $\bar{x} = \frac{\sum x}{N}$
= (9 + 8 + 12 + 12 + 16 + 15 + 14 + 5 + 10 + 10 + 6 + 12) / 12
= 129 / 12
= 10.75
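ii. Median
- Arrange the numbers in ascending order
- Pick the middle number; with an even number of items, average the two middle numbers
Arrangement in ascending order:
Value      5   6   8   9  10  10  12  12  12  14  15  16
Position   1   2   3   4   5   6   7   8   9  10  11  12
The two middle values are the 6th and 7th: (10 + 12) / 2 = 22 / 2 = 11
Median = 11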
iii. Mode = 12
iv. Range
Formula: Range = Highest value - Lowest value
= 16 - 5
= 11
Range = 11
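The four answers for this set of numbers can be verified with a small Python sketch. The `median` helper is written out by hand, as an illustration only, to show the step of averaging the two central values when the number of items is even.

```python
from collections import Counter

numbers = [9, 8, 12, 12, 16, 15, 14, 5, 10, 10, 6, 12]

def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

mean = sum(numbers) / len(numbers)
middle = median(numbers)
mode, mode_count = Counter(numbers).most_common(1)[0]  # value with the highest count
value_range = max(numbers) - min(numbers)

print("mean:", mean)
print("median:", middle)
print("mode:", mode, "occurring", mode_count, "times")
print("range:", value_range)
```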