
BIO 610 Lab Edited (Student)

About laboratory procedure.

Uploaded by

nikmunirah

FAKULTI SAINS GUNAAN

UNIVERSITI TEKNOLOGI MARA

BIO 610
EXPERIMENTAL BIOLOGY:
DESIGN & ANALYSIS
LAB MANUAL

Prepared by: WAN RAZARINAH BT. WAN ABDUL RAZAK

INTRODUCTION
________________________________________________________________________

The lab activity is an important part of this course.

The premise of this lab is this:


“You take a simple random sample from a population and then analyze the data in your sample.
You will see that learning about the population from data in any given sample is not as easy as
you might think”.

All lab work must be completed exactly as specified in this manual.

The students are also expected to read the labs carefully and understand them well before coming
into the lab.

Attendance is compulsory!!!!

LABORATORY SCHEDULE

WEEK TOPIC

2 Sampling, Frequency Tables and Stem-and-Leaf Plots.

3 Summary Statistics.

4 Probability.

5 One-sample Inference: Confidence Interval for a Mean.

6 Inference about a Proportion.

7 Paired Samples and their Differences

8 Independent Samples and their Differences

9 Chi-Square Analysis

10 Biology Investigation and Experimental Design

Lab report:-
i) should be completed and handed in for marking the following week, during the
next practical.

ii) handed in late without a valid reason WILL NOT be marked.

iii) will provide the basis for your course work assessment mark (lab report – 10%) in
BIO 610.

Lab Reports Format

Introduction
• The following format is to be used for writing "full" lab reports.
• A typed report is preferred; however, a handwritten report is also acceptable.
• Lab report grades are based on the following criteria:
o completeness,
o neatness,
o clarity,
o results,
o and answers to questions (if any).

I. Front Page
• The students must write:
o Full name,
o Title,
o Date of experiment,
o Group.

II. Title
Give the full title of the lab exercise.

III. Purpose/Objectives: State in a complete sentence the reason for doing the lab exercise, or

Hypothesis: Include a hypothesis if it is appropriate for the exercise (otherwise omit this item).
The hypothesis should be written as an "If ______, then ______" statement.

IV. Introduction
An introduction of 1-2 paragraphs should be sufficient. It should provide the background of the
underlying lab and techniques that are used in it.

V. Materials and Methods


A. The procedure must be in a step form (1,2,3, ...etc.)
B. You must use complete sentences.
C. Summarize the procedure from the lab handout in your own words.

VI. Results and Observations


A. Give all appropriate observations made; both qualitative and quantitative.

B. Organize data in an easy to read format. Use data tables whenever possible and be sure to give
all tables good titles.
C. All measurements must be in proper metric units.
D. Whenever you need to perform experiments with multiple replicates, condense your data into
averages and standard deviations across replicates.
E. Do not show all raw data unless I specifically ask you to.

VII. Data Analysis


A. Show proper setup for all calculations. All numbers must include unit identification.
B. Answer all questions asked on the lab handout. Use the question numbers provided on the handout
and be sure to use complete sentences.
C. Graphs go in this section. Be sure to give all graphs good titles and label all axes.

VIII. Discussion
A. Conclusion:
• Present a brief conclusion that ties together the reason for doing the exercise and the results
obtained.
• You must either accept or reject your hypotheses.
• Use specific examples from your data, data processing and observations to support your
conclusion.
• Use complete sentences.

B. Source of Error:
• Give at least one source of error.
• Be specific, and explain why it is a source of error. Also, explain how this error affected your
results.

IX. References
• Include complete citations of any works you cite, such as textbooks, journals and internet
materials.

Scheme of Lab Report
• Marks will be given as follows:

Report Component                                   Marks

Title                                                1
Introduction                                         3
Procedure/Materials & Methods                        2
Result:                                              5
  • Data Observation (diagram/table, etc.)
  • Data Analysis (calculation/graph, etc.)
Discussion:                                          5
  • Include answers to any post-lab questions
Conclusion                                           2
References                                           1
Overall structure:                                   1
  • Includes grammar & quality

TOTAL                                               20

(For the purpose of continuous assessment – 10% of 20 marks)

LAB 1: SAMPLING, FREQUENCY TABLES AND STEM-AND-LEAF PLOTS

Introduction

Two of the most common types of mistake in statistical calculations are simple arithmetical errors
and errors in copying numbers, especially the very large numbers which tend to arise at
intermediate stages in statistical calculations. It is therefore important to cultivate good work
habits which keep these to a minimum. One such habit is the tabulation of data in a properly
constructed table, preferably on ruled paper.

Besides that, in statistics we are concerned not with the particular results of individual
measurements but with the distribution of the measured values. A great deal of work in statistics
is spent in identifying and describing the distribution associated with a particular set of
measurements or observations and the first thing we must do is to consider ways of representing
distributions of random numbers.

Objective:
i) To select a simple random sample from the population.

ii) To explore the LEAF data in your sample with a stem-and-leaf plot and frequency
table.

Materials:
• Leaves
• Ruler

Activities

1. Data Gathering
a. Find a tree or shrub (if possible, each group should pick leaves from a different tree or shrub).

b. Pick 50 leaves per tree keeping the petiole attached. Try to pick a "random" assortment of
sizes if they exist on your chosen tree.

c. For each leaf, record the length of leaf (in tenths of mm).

d. Enter the leaf data into a computer file using an Excel/SPSS format, if possible.

Tasks:
1. Present the results as:
a. Table of raw data.
b. Frequency distribution table which contains measured length, implied length,
class mark, frequency, relative frequency and cumulative frequency.

2. Construct a stem-and-leaf plot of the above data. Describe the shape, location, and spread
of the distribution.
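
The two hand constructions above can be sketched in a few lines of Python as a cross-check. This is only an illustration: the leaf lengths are made-up values (not real measurements), lengths are assumed to be whole millimetres, and the class width of 10 mm is an assumption that mirrors the 0-9 mm, 10-19 mm coding used in the SPSS steps. The function names are hypothetical.

```python
from collections import Counter, defaultdict

# Illustrative leaf lengths in mm (made-up values, not real measurements).
lengths = [23, 31, 35, 38, 42, 44, 44, 47, 51, 53, 55, 58, 62, 66, 71]

def frequency_table(values, width=10):
    """Return rows of (lower, upper, class mark, freq, rel. freq, cum. freq)."""
    counts = Counter(v // width for v in values)
    rows, cumulative = [], 0
    for k in range(min(counts), max(counts) + 1):
        lower, upper = k * width, k * width + width - 1
        freq = counts.get(k, 0)
        cumulative += freq
        rows.append((lower, upper, (lower + upper) / 2, freq,
                     freq / len(values), cumulative))
    return rows

def stem_and_leaf(values):
    """Return {stem: sorted leaves}; stem = tens digit, leaf = units digit."""
    plot = defaultdict(list)
    for v in sorted(values):
        plot[v // 10].append(v % 10)
    return dict(plot)

for row in frequency_table(lengths):
    print("{:>3}-{:<3} mark={:5.1f} f={} rel={:.3f} cum={}".format(*row))

for stem, leaves in stem_and_leaf(lengths).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

The last cumulative frequency should equal the number of leaves measured, which is a quick way to catch copying errors in the hand-made table.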

Computerized analysis with SPSS

1. Stem-and-leaf plot
a. Start SPSS
b. Open the file which contains your data and click Analyze > Descriptive Statistics >
Explore.
c. Place the LENGTH OF LEAF in the Dependent List and click OK.
d. After the program runs, go to the OUTPUT window and navigate to the Stem-and-leaf
plot.
(How does this plot compare with the one you constructed by hand?)

2. Frequency table
a. Click the Variable View tab (toward the bottom of the screen) and create a new
variable named LENGTHGRP. Make this a numerical variable with width 8 and 0
decimals.
b. In the column called “Label,” enter “Length Group” to give the variable a descriptive
label.
c. Click the Data View tab at the bottom of the screen and classify each leaf with the
appropriate codes: e.g. 1 = 0-9 mm, 2 = 10-19 mm, and so on.
d. Click Analyze > Descriptive Statistics > Frequencies, select the LENGTHGRP
variable, and click OK.
e. Go to the Output Window and navigate to the frequency table for LENGTHGRP.
View the frequency table compiled by SPSS.
(How does this frequency table compare with the one you prepared by hand?)

LAB 2: SUMMARY STATISTICS

Introduction

In biology, we generally classify the objects around us, and we need to do the same in
statistics. The ‘objects’ we are concerned with in statistics are probability distributions. Collected
data can be described by various probability distributions (binomial, normal, chi-square, etc.).
Summary statistics provide ways of classifying probability distributions so that we do not
need to specify the distributions in every detail but can pick out the key properties as we
need them.

Probability distributions can be classified in a few ways:


i) Determine the center of a distribution by calculating the mean, median and mode.
ii) Measure the spread or width of a distribution by calculating the variance, standard
deviation and range.
iii) Determine the shape of a distribution by measuring the skewness and kurtosis (the
degree of peakedness of a distribution).

So, once a large set of data have been collected, we need to use some descriptive statistics to
convey the important aspects of the distribution of the data.
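
A minimal sketch of these descriptive statistics, using only Python's standard library, is given here for checking hand calculations. The pulse rates are made-up values, and `moment_shape` is a hypothetical helper that uses the simple moment-based definitions of skewness and kurtosis; SPSS reports bias-adjusted versions, so its shape values may differ slightly.

```python
import statistics

# Illustrative resting pulse rates in beats per minute (made-up values).
pulse = [62, 68, 70, 72, 72, 74, 76, 80, 84, 92]

def moment_shape(values):
    """Return (skewness, excess kurtosis) from the central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3

summary = {
    "mean": statistics.mean(pulse),
    "median": statistics.median(pulse),
    "mode": statistics.mode(pulse),
    "variance": statistics.variance(pulse),   # sample variance, divisor n - 1
    "sd": statistics.stdev(pulse),            # sample standard deviation
    "range": max(pulse) - min(pulse),
}
skewness, excess_kurtosis = moment_shape(pulse)

for name, value in summary.items():
    print(f"{name:>8}: {value:.2f}")
print(f"skewness: {skewness:.2f}, excess kurtosis: {excess_kurtosis:.2f}")
```

When the whole class is treated as the population, `statistics.pvariance` and `statistics.pstdev` (divisor n) are the matching functions.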

Objectives:
i) To calculate and interpret summary statistics (descriptive statistics) of the PULSE-RATE
data.
ii) To determine whether the distribution is approximately normal.

Materials:
• Stopwatch

Activities

1. Data Gathering

a. Count the resting pulse rate of the students in the class – number of beats per
minute

b. Get all the counts and show the raw data in your lab report.

c. Enter the pulse-rate data into a computer file using an Excel/SPSS format, if
possible.

Tasks
1. Calculate the summary statistics (mean, mode, median, variance, standard deviation and
range) for the class data set.
• The class value is the population while the group value is the sample.

2. Calculate the summary statistics for males and females.

3. 5-point summary & boxplot:


a. Determine the 5-point summary for your PULSE-RATE data - population and sample.
b. Draw a boxplot of each of your PULSE-RATE data sets.
c. Are there any outside values in your data set?
d. In plain English, describe the distribution’s shape, location, spread.

4. SPSS descriptive statistics:


a. Start SPSS and open the data file PULSE-RATE.
b. Click Analyze > Descriptive Statistics > Descriptives and select the PULSE-RATE
variable for analysis. Compare the mean and standard deviation reported by SPSS to your
hand calculations. If results differ, track down the error and make corrections.

5. SPSS exploratory analysis:


a. Click Analyze > Descriptive Statistics > Explore.
b. Place the PULSE-RATE variable into the Dependent List.
c. Click the Statistics button, check the Percentiles box, and click OK.
d. Go to the output window and navigate to the Percentiles section of the output.
This section reports quartiles using two methods of calculation. Our method
corresponds to Tukey’s Hinges and NOT to the Weighted Average percentiles!
Ignore the weighted average percentiles. Make certain your quartiles match
Tukey’s hinges.
e. Navigate to the output region with the boxplot and compare the boxplot you drew by
hand with the boxplot created by SPSS. Are they similar?
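
The 5-point summary with Tukey's hinges can be sketched as follows, as a cross-check on the hand calculation. The function names and the pulse data are illustrative. The rule used for "outside values" (more than 1.5 hinge-spreads beyond a hinge) is the usual boxplot convention and is an assumption here, since this manual does not state the rule.

```python
import statistics

def five_point_summary(values):
    """Return (min, lower hinge, median, upper hinge, max) using Tukey's hinges."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2      # the median belongs to both halves when n is odd
    lower, upper = data[:half], data[n - half:]
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

def outside_values(values):
    """Return values beyond the fences, 1.5 hinge-spreads from the hinges."""
    _, q1, _, q3, _ = five_point_summary(values)
    spread = q3 - q1
    low_fence, high_fence = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low_fence or v > high_fence]

pulse = [62, 68, 70, 72, 72, 74, 76, 80, 84, 92]   # made-up values
print(five_point_summary(pulse))   # (62, 70, 73, 80, 92)
print(outside_values(pulse))       # []
```

Each hinge is the median of one half of the sorted data, which is why the result can differ from the Weighted Average percentiles in the SPSS output.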

Post-lab Question
1. Draw a histogram for the class data set (population) and for your group (sample).
• On a separate graph, mark the class mean (population), the mean for males, the mean for
females and the means for the different groups (samples).
Describe the general shape of the data distribution: normal (bell-shaped), uniform, skewed with a
long tail to the left or right, flat-topped with light tails (platykurtic), peaked with heavy tails
(leptokurtic), or bimodal.

2. What are the consequences of too few intervals in a histogram? Too many?

3. How strongly is the histogram affected by changes in interval start points?

4. Does the box plot change when you manipulate the histogram? Why?

LAB 4: ONE-SAMPLE INFERENCE

Confidence interval for the mean

Whenever we carry out an experiment or make an observation, we should always take at least three
readings. One reading gives us an estimate of the mean but no indication of the dispersion. Two
readings enable us to calculate the sample standard deviation s, and a 95% confidence interval for the
mean m is then

$$m \pm t_{1}(0.975)\,\frac{s}{\sqrt{2}} = m \pm 12.7\,\frac{s}{\sqrt{2}} = m \pm 9.0\,s.$$

Two readings are thus the absolute minimum number we should take. However, with
three readings the confidence interval for the mean is

$$m \pm t_{2}(0.975)\,\frac{s}{\sqrt{3}} = m \pm 4.3\,\frac{s}{\sqrt{3}} = m \pm 2.5\,s.$$

Here $t_{\nu}(0.975)$ is the 97.5th percentile of the t distribution with $\nu = n - 1$ degrees of
freedom, and the standard error of the mean is $s/\sqrt{n}$. With 50% more effort, we have narrowed
the interval by a factor of about 3.6!

The standard error (SE) is a measure of the reliability or precision of x̄ as an estimate of μ. The
smaller the SE, the more precise the estimate.

Under certain circumstances, the SE can be given a definite quantitative interpretation by using it
to construct a confidence interval for the population mean. A confidence interval for μ is an
interval whose upper and lower limits are computed from the data. The interval always contains x̄.

Unfortunately, the mean calculated from a sample, x̄, will differ from the population mean μ.
The expected discrepancy between x̄ and μ depends on the size of the sample and the variability
of X. If the sample size is small and X has high variance, then x̄ may be quite far from the population
mean. In contrast, if the sample size is large and X has low variance, x̄ will probably be close to μ.

The first step in constructing a confidence interval is to choose a value called the confidence
coefficient, which measures our “confidence” that the confidence interval contains µ.

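Student’s t distribution provides the method for constructing a confidence interval for μ. First, suppose we
have chosen a confidence coefficient equal to 95%. To construct a 95% confidence interval for μ, we
compute the upper and lower limits of the interval as

$$\bar{x} \pm t_{0.05}\,SE_{\bar{x}}$$

that is,

$$\bar{x} \pm t_{0.05}\,\frac{s}{\sqrt{n}}$$

where $t_{0.05}$ is the two-tailed 5% critical value of the t distribution with n − 1 degrees of freedom.
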
The confidence interval (CI) combines information on sample size and variability to put
probabilistic bounds on estimates of the population mean. CIs can be calculated for any desired
degree of confidence, but 95% confidence intervals are most common. If your sample is random
and the population has a normal distribution, you can be “95% confident” that your confidence
interval includes the population mean. More accurately, if you sample repeatedly and generate a
95% CI each time, you can expect the CI to include the population mean in 95% of the cases,
and not in the other 5% of cases.

Since you usually don't know the population mean, you'll never know when this happens. If the
data are not from a normal distribution, then the 95% CI will include the true mean in
approximately 95% of cases only if sample size is large (follows from the Central Limit
Theorem).
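
The repeated-sampling interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is taken to be normal with a made-up mean and standard deviation, the sample size is fixed at n = 20 so that the t-table value 2.093 (19 degrees of freedom) applies, and the function names are hypothetical.

```python
import math
import random
import statistics

T_CRIT_DF19 = 2.093   # two-tailed 5% t value for 19 degrees of freedom (t table)

def ci_covers(sample, true_mean, t_crit):
    """Return True if the 95% t interval from `sample` contains `true_mean`."""
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - t_crit * se <= true_mean <= mean + t_crit * se

def coverage(true_mean=70.0, sd=10.0, repeats=4000, seed=1):
    """Estimate the fraction of repeated samples (n = 20) whose CI holds the mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, sd) for _ in range(20)]
        hits += ci_covers(sample, true_mean, T_CRIT_DF19)
    return hits / repeats

print(coverage())   # expected to be close to 0.95
```

Any single interval either contains the population mean or it does not; the 95% describes the long-run proportion of such intervals that do.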

Hypothesis testing for the mean


Samples from a population may also be taken to test hypotheses about the population mean. For
example, the sample may be the result of an experiment designed to test for a proposed treatment
effect against the null hypothesis of “no effect”:
Ho: μ = 0
Ha: μ ≠ 0

If the random sample is from a normally distributed population, then the (two-tailed) one-sample
t test may be used:
reject Ho if t ≥ t0.05(2),ν or t ≤ −t0.05(2),ν, where

$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$

μ0 is the value of μ specified by Ho (here, 0) and ν = n − 1 is the degrees of freedom. If the data are
not from a normal distribution, the same procedure may be used only if n is large (this follows from
the Central Limit Theorem).
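
A minimal sketch of the one-sample t test follows. The data are made-up treatment effects for 10 subjects, the critical value 2.262 is the two-tailed 5% t-table value for 9 degrees of freedom, and `one_sample_t` is a hypothetical helper name.

```python
import math
import statistics

def one_sample_t(sample, mu0=0.0):
    """Return (t statistic, degrees of freedom) for Ho: mu = mu0."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # standard error s / sqrt(n)
    return (statistics.mean(sample) - mu0) / se, n - 1

# Made-up treatment effects (change from baseline) for 10 subjects.
effects = [1.2, -0.4, 0.8, 2.1, 0.3, 1.5, -0.2, 0.9, 1.1, 0.7]
T_CRIT_DF9 = 2.262   # two-tailed 5% critical value for 9 df, from a t table

t_stat, df = one_sample_t(effects)
reject_h0 = abs(t_stat) >= T_CRIT_DF9
print(f"t = {t_stat:.3f}, df = {df}, reject Ho: {reject_h0}")
```

Comparing |t| with the critical value is equivalent to the two-sided rule stated above, because the t distribution is symmetric about zero.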

If the data are not from a normal distribution and the sample size is not large, the Wilcoxon
signed-rank test may be used instead. We will learn more about the Wilcoxon and other “non-
parametric” tests in future lab exercises. These tests are based on ranks and do not require the
assumption that the population is normally distributed. However, rank tests are generally less
powerful than tests based on the normal distribution, and the latter are therefore preferred if the
assumption of normality can be met.

Objective:
i) To learn about distributions of sample means and confidence intervals for means.

Material:
• Weighing scale (health scale)

Methods:
1. Data Gathering

a. Measure the weight (kg) of each student of the class carefully.

b. Get all the measurements and show the raw data in your lab report.

c. Enter the weight data into a computer file using an Excel/SPSS format, if possible.

d. Determine the population mean and population standard deviation for your class.

Tasks
1. Confidence interval for µweight, σ estimated.
Calculate the 95% confidence interval for the mean weight for your group. Use both your
group standard deviation (s) and the population standard deviation (σ) as comparisons.
Does the confidence interval contain/capture the population mean for both the cases?
(Refer examples of calculation given).

2. Then repeat for your group combined with the 2nd group; then your group, 2nd group and 3rd group;
then your group, 2nd group, 3rd group and 4th group. Does the SE decrease as the sample
size increases? Report your results as μ = x̄ ± SE, and graph your answer with SE
on the y-axis and n on the x-axis.

3. Calculate confidence interval for µweight, σ known.


Using the mean in your sample and the standard deviation in the population (σ = 13.95),
calculate a 95% confidence interval for the population mean weight. Did your confidence
interval capture the value of population mean?

4. t Table and t percentiles.


Let tdf,p denote a t percentile with df degrees of freedom with a left tail area of p. Then,
using the t table, determine the percentiles listed below.
t9,.90 = ________________ t9, .95 = ____________________

t9,.99 = _______________ t9,.995 = ____________________

5. Open your data set in SPSS and check your confidence interval calculations with Analyze
> Descriptive Statistics > Explore (select the WEIGHT variable). The confidence
interval is reported in the output area labeled “Descriptives”.

Example calculation for Confidence Interval.

a) Confidence interval for µ, when σ is known.

σx̄ = 6.21 n = 10 x̄ = 159.40

Using x̄ as the point estimate of µ, and with knowledge of the population sd, the upper
limit (UL) and lower limit (LL) for a 95% confidence interval of the mean are
given by:

UL0.95 = x̄ + (1.96 * σx̄) **σx̄ = σ/√n

and LL0.95 = x̄ - (1.96 * σx̄)


For the example;
UL0.95 = 159.40 + (1.96 * 6.21)
= 171.57

LL0.95 = 159.40 - (1.96 * 6.21)


= 147.23

» 147.23 < µ < 171.57


or simply µ = x̄ ± 12.17.

Conclusion: We feel 95% confident that the mean of this population is
included in the interval 147.23 to 171.57.

*** 95% of the standard normal distribution lies between z scores -1.96 and
1.96.
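
The arithmetic in example (a) can be checked with a short script. It is a sketch that takes 6.21 to be the standard error of the mean (σ/√n), as the calculation above does, and obtains the z value from Python's `statistics.NormalDist` instead of a table; `z_interval` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_interval(mean, sigma_mean, confidence=0.95):
    """Return (lower, upper) limits given the standard error of the mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return mean - z * sigma_mean, mean + z * sigma_mean

lower, upper = z_interval(159.40, 6.21)
print(round(lower, 2), round(upper, 2))   # 147.23 171.57
```

For Task 3, the standard error passed to the function would be σ/√n, with n equal to the size of your group.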

b) Confidence interval for µ, σ unknown.

n = 20 x̄ = 21.0 mm s = 1.76 mm

We wish to construct a 95% confidence interval for the population mean.

The upper limit for a 95% confidence interval is given by:

UL0.95 = x̄ + (t(0.05,n-1) * SEx̄)

and the lower limit is given by:

LL0.95 = x̄ - (t(0.05,n-1) * SEx̄) **SEx̄ = s/√n

For the example;


UL0.95 = 21.0 + 2.093 * 0.394
= 21.825

and,
LL0.95 = 21.0 – 2.093 * 0.394
= 20.175

Or, 20.175 < µ < 21.825

Or, µ = x̄ ± 0.825

Thus, we are 95% confident that the interval from 20.175 mm to
21.825 mm includes the population mean.
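
Example (b) can be checked in the same way. The sketch below takes the critical value 2.093 (two-tailed 5%, 19 degrees of freedom) from the example itself, and `t_interval` is a hypothetical helper name. Because the script does not round the standard error to 0.394 before multiplying, its limits may differ from the hand calculation in the third decimal place.

```python
import math

def t_interval(mean, s, n, t_crit):
    """Return (lower, upper) limits for the mean using SE = s / sqrt(n)."""
    se = s / math.sqrt(n)
    return mean - t_crit * se, mean + t_crit * se

# t_crit = 2.093 is the two-tailed 5% value for n - 1 = 19 df used in the example.
lower, upper = t_interval(21.0, 1.76, 20, 2.093)
print(round(lower, 3), round(upper, 3))
```

Carrying full precision through the calculation and rounding only the final limits is generally the safer habit.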

LAB 7: INDEPENDENT SAMPLES AND THEIR DIFFERENCES

Introduction
Two-sample inference deals with estimating and testing differences
between two means. For example, we may be interested in comparing the effects of two different
medications on patient mean blood pressure. Or, we may wish to compare the effects of different
fertilizers on mean plant growth. There are two completely different ways of carrying out such
comparisons of means. The first approach is to randomly assign independent observations (e.g.,
patients, field plots) to different treatments. In this case we have two samples of individuals,
each from separate populations: one sample of individuals given drug #1 and a second sample
of individuals given drug #2 (or, one sample of field plots treated with fertilizer #1 and another
sample treated with fertilizer #2). This is the two-sample design, the subject of the present lab
exercise. Our goal is to compare the two population means (μ1 and μ2) using two random
samples of patients (or field plots).

Distribution of Differences Between Sample Means


The foundation for analysis of means of two populations is the fact that if X has a normal
distribution in each of two populations, with equal variance σ², then the difference between
sample means, X̄1 − X̄2, also has a normal distribution.

You will have only a single estimate of each mean, but keep in mind that if you were to go back
and collect two more random samples, the value of X̄1 − X̄2 obtained the second time would be
different from that obtained the first time. The mean of the distribution of possible values for X̄1
− X̄2 is μ1 − μ2, and its standard deviation is $\sigma_{\bar{X}_1 - \bar{X}_2}$.

In this case, the quantity

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{SE_{\bar{X}_1 - \bar{X}_2}}$$

has a t-distribution with n1 + n2 − 2 degrees of freedom. This fact is the basis of the two-sample t
test for a difference between population means, and of the confidence interval for the difference
between two means. The quantity $SE_{\bar{X}_1 - \bar{X}_2}$ is computed from the pooled sample
variance, $s_p^2$, where

$$SE_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}$$


If X is normal in both populations but the variances are unequal, then a modified version of the above
equation yields Welch’s t-statistic, which has an approximate t-distribution.
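
The pooled calculation can be sketched in Python as a cross-check for the hand calculations in this lab, in the same spirit as the SPSS step. The data are the Na+ intake values listed in the Appendix. The pooled variance is computed with the standard formula sp² = [(n1 − 1)s1² + (n2 − 1)s2²] / (n1 + n2 − 2), the value 2.086 is the two-tailed 5% t-table value for 20 degrees of freedom, and `pooled_t` is a hypothetical helper name.

```python
import math
import statistics

# Na+ intake values from the Appendix of this lab.
normal = [10.2, 2.2, 0.0, 2.6, 0.0, 43.1, 45.8, 63.6, 1.8, 0.0, 3.7, 0.0]
hypertensive = [92.8, 54.8, 51.6, 61.7, 250.8, 84.5, 34.7, 62.2, 11.0, 39.1]

def pooled_t(x1, x2, mu_diff=0.0):
    """Return (t, df, SE of the difference) using the pooled sample variance."""
    n1, n2 = len(x1), len(x2)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * statistics.variance(x1)
           + (n2 - 1) * statistics.variance(x2)) / df
    se = math.sqrt(sp2 / n1 + sp2 / n2)
    t = (statistics.mean(x1) - statistics.mean(x2) - mu_diff) / se
    return t, df, se

t_stat, df, se_diff = pooled_t(normal, hypertensive)
T_CRIT_DF20 = 2.086   # two-tailed 5% t value for 20 df (t table)
diff = statistics.mean(normal) - statistics.mean(hypertensive)
ci = (diff - T_CRIT_DF20 * se_diff, diff + T_CRIT_DF20 * se_diff)
print(f"t = {t_stat:.3f}, df = {df}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

The sign of the difference depends on which group is labeled 1, so the same labeling should be kept when comparing with hand or SPSS results.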

Non-parametric Alternative to the Two-sample t Test


If the populations are not normally distributed, and sample size is not large enough to appeal to
the Central Limit Theorem, then an alternative approach is to use a nonparametric test.
Nonparametric tests are based on the ranks of the data rather than the data themselves, and they
assume only that X is a continuous variable. The nonparametric equivalent of the two-sample t-test
is the Wilcoxon rank sum test (equivalent to the Mann-Whitney U test). Under optimal
conditions for the t-test (normally distributed populations), the Wilcoxon rank sum test is about
95% as powerful as the two-sample t-test, although it
may be less powerful in specific settings.

Power Analysis
When researchers carry out an experiment to test the difference between two treatment means,
how do they decide on the appropriate sample sizes to take? How confident are they about their
abilities to detect a difference if one is present? Power is the probability of correctly rejecting the
null hypothesis when it is false (power is 1−β, where β is the probability of making a Type II
error). The power of the two-sample t test depends on:

1. The sample size (n1+n2). Greater sample size increases power of a test.
2. The significance level (α). Power decreases with decreasing α. For example, reducing α from
0.05 to 0.01 reduces the probability of making a Type I error but increases the probability of
making a Type II error.
3. The within-population variation (σ). Higher variation reduces power.
4. The difference between means, μ1−μ2. The larger the difference between the population
means, the greater the probability of rejecting Ho.
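
The effect of the difference between means on power can be illustrated by simulation. This is a sketch under stated assumptions: both populations are normal with the same standard deviation, each group has n = 10 (so the t-table value 2.101 for 18 degrees of freedom applies), and all numbers and function names are made up for illustration.

```python
import math
import random
import statistics

T_CRIT_DF18 = 2.101   # two-tailed 5% t value for 18 df (n1 = n2 = 10 only)

def pooled_t_stat(x1, x2):
    """Two-sample t statistic for Ho: mu1 = mu2 using the pooled variance."""
    n1, n2 = len(x1), len(x2)
    sp2 = ((n1 - 1) * statistics.variance(x1)
           + (n2 - 1) * statistics.variance(x2)) / (n1 + n2 - 2)
    se = math.sqrt(sp2 / n1 + sp2 / n2)
    return (statistics.mean(x1) - statistics.mean(x2)) / se

def estimated_power(true_diff, sigma=1.0, repeats=2000, seed=7):
    """Fraction of simulated experiments (n = 10 per group) that reject Ho."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(repeats):
        x1 = [rng.gauss(true_diff, sigma) for _ in range(10)]
        x2 = [rng.gauss(0.0, sigma) for _ in range(10)]
        rejections += abs(pooled_t_stat(x1, x2)) >= T_CRIT_DF18
    return rejections / repeats

for d in (0.0, 0.5, 1.0, 1.5):
    print(d, estimated_power(d))
```

With a true difference of zero, the rejection rate estimates α rather than power, so it should be near 0.05. Raising `sigma` while keeping `true_diff` fixed shows the loss of power described in point 3.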

Objective
i) To describe independent samples.
ii) To estimate a mean difference with 95% confidence.
iii) To conduct an independent t test.

Tasks
1. Independent samples and Side-by-side boxplot.
Use the data from the Appendix (the average daily Na+ intakes (in milligrams) of 12
Normal and 10 Hypertensive subjects) to:
a. Determine 5-point summaries of the Normal (n1 = 12) and Hypertensive (n2 = 10) groups in
the sample.
b. Then, construct a side-by-side boxplot of these distributions. Do the distributions
overlap? How do the medians compare?

2. Mean and standard deviation


Calculate the mean and standard deviation of each group.

3. Confidence interval for independent mean difference


a. Calculate the pooled estimate of variance, standard error of the mean difference, and
95% confidence limits for μ1−μ2.
b. Interpret your confidence interval. Did your confidence interval capture the true mean
difference of 50 mg?

The mean difference in Na+ in Normal and Hypertensive in the population (μ1−μ2) = 50 mg.

4. Statistical hypothesis test
a. Test H0: μ1−μ2 = 0. List all hypothesis testing steps.
Were you able to reject the null hypothesis? Does this imply the null hypothesis is
correct? Did you make a type I or type II error?

5. SPSS

a. Enter the data into a computer file using an Excel/SPSS format, if possible and make
note of Na+ values for the Normal and Hypertensive in this sample.
b. Open the .sav file in SPSS and click Analyze > Descriptive Statistics > Explore.
c. Put the variable Na+ in the Dependent List and the grouping variable People in the Factor List.
d. Go to the output window and navigate to the boxplot. How does this boxplot compare
with the one you produced by hand?

e. Check your statistical hypothesis test calculations with SPSS by clicking Analyze >
Compare Means > Independent Samples T test. The Test variable is Na+ and the Grouping
Variable is People. You must use the Define Groups button to tell SPSS that Group 1 is
coded “Y” and Group 2 is coded “N”.
f. After the program runs, go to the output window and navigate to the region labeled
“Independent Samples Test”. The first row of the output table (labeled “Equal Variances
Assumed”) contains the confidence interval and test statistics. These should match the
statistics you calculated by hand in Tasks 3 and 4, respectively.

Appendix

The average daily Na+ intakes (in milligrams) of 12 normal and 10 hypertensive subjects.
_____________________________________________________________________
Normal:
10.2 2.2 0.0 2.6 0.0 43.1 45.8 63.6 1.8 0.0 3.7 0.0
_____________________________________________________________________
Hypertensive:
92.8 54.8 51.6 61.7 250.8 84.5 34.7 62.2 11.0 39.1
_____________________________________________________________________
* The two groups were isolated for a week and compared with respect to Na+ intake.
These data deal with sodium chloride preference as related to hypertension.
