Second Public Examination: Tuesday, 30 April 2019, 9.30 A.M. - 12.30 P.M

This document describes an exam for the Honour School of Biological Sciences at the University of Oxford. The exam is Paper 2: Quantitative Methods, held on Tuesday, 30 April 2019, from 9.30 a.m. to 12.30 p.m. Candidates must answer three questions, including at least one from each of the two sections, A and B. Calculators may be used.
A10425W1

SECOND PUBLIC EXAMINATION

Honour School of Biological Sciences

PAPER 2: QUANTITATIVE METHODS

TRINITY TERM 2019

Tuesday, 30th April 2019, 9.30 a.m. – 12.30 p.m.

Answer THREE questions

At least one question to be answered from each of the two sections, A and B

Calculators may be used

Write YOUR CANDIDATE NUMBER and NOT YOUR NAME on each answer book.
Write the numbers of all the questions you have answered on the front of your first answer book.

Do not turn over until told that you may do so.

SECTION A

Question 1

(i) A student is studying the spatial distribution of snake’s-head fritillaries at Iffley Meadows in Oxford.
At the scale of the whole meadow, the plants are aggregated into clumps.

a) Suggest a biological reason for this large-scale aggregation. (1 mark)

The student then chooses to study a single 1m by 1m area of the meadow. They produce a map
showing the locations of all fritillaries within that area. They use their map to find that the plants
are more regularly spaced than we would expect if their locations within the 1m by 1m area were
random.

b) Suggest a biological reason for this small-scale regularity (1 mark), and state which probability
distribution can be used to test the hypothesis of random spatial distribution (1 mark).

A second student is snorkelling on a coral reef. They are mapping the spatial distribution of a fish
that lives on the reef.

c) Suggest how the spatial distribution of juvenile and adult fish might differ, and why. (2 marks)

(ii) Sketch the following probability distributions. Show as much detail as you can.

a) a normal distribution with mean 10 and standard deviation 5. (2 marks)

b) the distribution you would expect to see if you were recording the number of successful
outcomes from a series of 10 Bernoulli trials where the probability of success was 0.5
(2 marks).

(iii) You are reading a report that describes the results of a previous survey of two adjacent coral reefs
(North Reef and South Reef). The survey investigated 80 different locations on the North Reef and
75 locations on the South Reef. The report states that “Evidence of coral bleaching was found at
30% of surveyed locations on the North Reef and 84% of locations on the South Reef.” The raw
data used to calculate these proportions were not included in the report.

a) Devise and carry out an appropriate test of the question: does the prevalence of coral bleaching
differ between the north and south reefs? Explain the logic behind the test. You should attempt
to calculate a test statistic but you do not need to determine the significance level associated
with that statistic, nor make a concluding statement. (4 marks)

(iv) For this question and question (v), the following information may be helpful.

The binomial distribution is given by:

$$p(x) = \binom{k}{x}\, p^{x} (1 - p)^{k - x}, \qquad x = 0, 1, 2, \ldots, k$$

where $\binom{k}{x} = \dfrac{k!}{x!\,(k - x)!}$
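
As a minimal sketch of how the formula above can be evaluated numerically, the following Python snippet computes p(x) for every x from 0 to k. The function name `binomial_pmf` and the values k = 5 and p = 0.2 are hypothetical, chosen only for illustration; they are not taken from any question in this paper.

```python
from math import comb


def binomial_pmf(x: int, k: int, p: float) -> float:
    """Probability of exactly x successes in k trials with success probability p."""
    if not 0 <= x <= k:
        return 0.0  # outcomes outside 0..k are impossible
    # comb(k, x) is k! / (x! (k - x)!), the number of ways to place x successes
    return comb(k, x) * p**x * (1 - p) ** (k - x)


# Hypothetical illustration values (assumptions, not exam data).
k, p = 5, 0.2
probabilities = [binomial_pmf(x, k, p) for x in range(k + 1)]
for x, prob in enumerate(probabilities):
    print(f"p({x}) = {prob:.4f}")
```

The probabilities over x = 0, 1, ..., k should sum to 1, which is a quick check that the formula has been applied correctly.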

a) Show all the ways that it is possible to obtain 3 successes from 4 trials. (1 mark)

b) If the number of successes (x) from k trials were to follow a Binomial distribution, which two
key assumptions must be met? (2 marks)

(v) Jonathan is interested in fish behaviour. He has a captive population of 500 guppies which he keeps
in a holding tank. He catches eight fish at a time and places them into a smaller tank. He records
how many of the eight fish approach and investigate a strange object placed in the tank that none
of the fish have previously seen. These fish are then returned to the holding tank. He patiently
repeats his experiment a large number of times. These are his data:

Number of fish approaching object   Frequency
0                                   36
1                                   10
2                                   7
3                                   5
4                                   2
5                                   4
6                                   8
7                                   20
8                                   32

a) Carry out an appropriate statistical test to assess whether or not fish make independent
decisions when choosing whether or not to approach a strange object. Show your working
and outline your logic (5 marks). Draw an appropriate conclusion from your test result (2
marks). Table 1 may help you.

b) Comment on his experimental design, highlighting any obvious problems and suggesting a
remedy. (2 marks)

Table 1: Critical values of chi-square for two significance levels.

Degrees of freedom   0.95     0.99
1                    3.841    6.635
2                    5.991    9.210
3                    7.815    11.345
4                    9.488    13.277
5                    11.070   15.086
6                    12.592   16.812
7                    14.067   18.475
8                    15.507   20.090
9                    16.919   21.666
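
A table like Table 1 is used by comparing a calculated chi-square statistic with the critical value for the relevant degrees of freedom. The sketch below illustrates that comparison in Python. It assumes the column headings 0.95 and 0.99 are cumulative probabilities, corresponding to the 5% and 1% significance levels; the names `CRITICAL_VALUES` and `exceeds_critical` and the example statistic are hypothetical.

```python
# Critical values transcribed from Table 1, keyed by degrees of freedom.
# Assumption: columns 0.95 and 0.99 correspond to 5% and 1% significance.
CRITICAL_VALUES = {
    1: {0.95: 3.841, 0.99: 6.635},
    2: {0.95: 5.991, 0.99: 9.210},
    3: {0.95: 7.815, 0.99: 11.345},
    4: {0.95: 9.488, 0.99: 13.277},
    5: {0.95: 11.070, 0.99: 15.086},
    6: {0.95: 12.592, 0.99: 16.812},
    7: {0.95: 14.067, 0.99: 18.475},
    8: {0.95: 15.507, 0.99: 20.090},
    9: {0.95: 16.919, 0.99: 21.666},
}


def exceeds_critical(statistic: float, df: int, level: float = 0.95) -> bool:
    """Return True if the statistic is larger than the tabulated critical value."""
    return statistic > CRITICAL_VALUES[df][level]


# Hypothetical statistic, for illustration only.
print(exceeds_critical(4.0, df=1))  # compared against 3.841
print(exceeds_critical(4.0, df=2))  # compared against 5.991
```

A statistic above the critical value would lead to rejecting the null hypothesis at the corresponding significance level.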

Question 2

Some commentators have argued that science is going through a “reproducibility crisis” because
statistically significant results turn out to be unrepeatable in later studies more often than expected.

(i) Define and briefly explain the concept of the conventional (frequentist) probability value (p-value)
generated by a null hypothesis significance test (e.g. a one-tailed t-test). (4 marks)

(ii) Explain THREE ways in which the use of conventional p-values may contribute to the reproducibility
crisis. (6 marks)

(iii) Explain THREE key principles of experimental design. (3 marks)

(iv) How could failure to follow these principles of experimental design contribute to the reproducibility
crisis? (6 marks)

(v) Suggest THREE simple actions that could be taken to try and reduce the reproducibility problem
when conducting the analysis of experimental data. (6 marks)

Question 3

Climate change researchers have taken measurements of the relationship between leaf temperature
difference (the difference between leaf and air temperature: a continuous response variable
‘Difference’) and vapour pressure (a continuous explanatory variable ‘Pressure’) under high and
low concentrations of CO2 (a categorical explanatory variable ‘CO2’ with two levels ‘high’ and ‘low’).
Figure 1 shows a graph of the data and the R summary() function produces the following output:

   CO2       Pressure         Difference
 low :20   Min.   :1.330   Min.   :-0.040
 high:21   1st Qu.:1.810   1st Qu.: 0.720
           Median :2.070   Median : 1.360
           Mean   :2.047   Mean   : 1.373
           3rd Qu.:2.260   3rd Qu.: 1.940
           Max.   :2.700   Max.   : 3.220

Fig. 1 Temperature difference as a function of vapour pressure at low and high CO2 concentration.

A linear model (calculated using the function lm() in R) analysing Difference as a function of
Pressure and CO2 produces the following ANOVA table and coefficients. Note that the researchers
have re-levelled the CO2 factor to over-ride the R default setting.

              Df    SumSq   MeanSq   F-value     Pr(>F)
Pressure       1   3.8497   3.8497    7.6194   0.008930
CO2            1   6.4158   6.4158   12.6983   0.001029
Pressure:CO2   1   2.0988   2.0988    4.1541   0.048731
Residuals     37  18.6943   0.5053

                  coef.est  coef.se
(Intercept)           1.00     1.13
Pressure             -0.02     0.54
CO2high               3.69     1.44
Pressure:CO2high     -1.41     0.69
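
For orientation on how a coefficient table of this shape is read: under R's default treatment contrasts, the intercept and slope rows describe the reference level of the factor, and the factor and interaction rows are differences from that reference level. The Python sketch below illustrates this arithmetic with made-up coefficient values; the function name `lines_by_level` and all numbers in it are hypothetical and deliberately not those in the table above.

```python
def lines_by_level(intercept: float, slope: float,
                   level_offset: float, interaction: float):
    """Return (intercept, slope) pairs for the reference and the other factor level.

    level_offset is the coefficient for the non-reference level;
    interaction is the slope-by-level interaction coefficient.
    """
    reference = (intercept, slope)
    other = (intercept + level_offset, slope + interaction)
    return reference, other


# Hypothetical coefficients, for illustration only (not the exam values).
reference, other = lines_by_level(intercept=2.0, slope=0.5,
                                  level_offset=1.0, interaction=-0.25)
print("reference level: intercept", reference[0], "slope", reference[1])
print("other level:     intercept", other[0], "slope", other[1])
```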

(i) Give a concise report of the results of the analysis including the key information from the ANOVA
table. Explain what the p value means. (5 marks)

(ii) Use the coefficients to calculate the intercept and slope of the relationship between leaf
temperature difference and vapour pressure for the low and high CO2 treatments (show your
workings). (4 marks)

(iii) Calculate an approximate 95% confidence interval for the estimate in the last row of the table of
coefficients (show your workings). What null hypothesis does it test, and what is the result? (4
marks)

(iv) Draw a copy of Figure 1 (the exact number or position of the data points is not important). For each
of the two treatments, calculate the beginning and end points of the regression lines. Add the
calculated regression lines to your copy of Figure 1, including the values on the x axis for the
beginning and end points. Show and briefly explain your working. (6 marks)

(v) Give a concise critical assessment of the experimental design with regard to the explanatory
variables CO2 and Pressure. (2 marks)

(vi) List FOUR assumptions of the statistical model used here and briefly explain how the validity of
these assumptions might be investigated. (4 marks)

SECTION B

Question 4

Researchers have obtained protein sequences from samples of bone-derived collagen of two extinct
species: a 400,000-year-old sample believed to be from a mastodon (Mammut americanum, a relative of
the elephant Loxodonta africana) and a 68-million-year-old sample believed to be from the dinosaur
Tyrannosaurus rex. The two protein sequences represent the collagen alpha-1 chain. The reliability of
biomolecular sequences recovered from ancient biological samples has been controversial. You have
therefore been asked to use bioinformatics approaches to investigate the authenticity and evolutionary history
of these two samples.

The BLAST best match to the putative mastodon collagen protein sequence was a sequence obtained from
a marsupial, the short-tailed opossum Monodelphis domestica.

BLAST RESULT 1

The BLAST best match to the putative dinosaur collagen protein sequence was to a sequence obtained
from the domestic chicken Gallus gallus domesticus.

BLAST RESULT 2

(i) Evaluate the strength of the match between the protein sequences returned by BLAST and the
collagen sequences recovered from Mammut americanum and Tyrannosaurus rex. (2 marks)

(ii) What do you infer from the BLAST results about the likely authenticity of the collagen protein
sequences from the two extinct species? (5 marks)

You generated a multiple sequence alignment containing the collagen protein sequences from the two
extinct species, together with collagen sequences from a range of extant vertebrate genera. Two
phylogenetic trees were then obtained from this alignment. The first phylogeny was estimated using
maximum likelihood (Figure 1); its scale bar represents expected amino acid changes per site. The
second phylogeny was generated using maximum parsimony and is drawn as a cladogram (Figure 2). In
both cases, the results of a phylogenetic bootstrap analysis (with 100 replicates) are shown next to
internal nodes in the tree. For each branch, the name of the genus of the species analysed is reported,
while the common name is indicated within brackets.

Figure 1: Maximum likelihood phylogenetic tree

(iii) What conclusions can you draw from the maximum likelihood phylogenetic tree in Figure 1 about
the likely authenticity of the mastodon (Mammut) sequence? (4 marks)

(iv) How do your conclusions from the maximum likelihood phylogenetic analysis in Figure 1 differ from
those obtained using BLAST? (3 marks)

(v) What conclusions can you draw from the maximum likelihood phylogenetic tree in Figure 1 about
the likely authenticity of the dinosaur (Tyrannosaurus) sequence? (4 marks)

Figure 2: Maximum parsimony phylogenetic tree

[Cladogram with bootstrap values at internal nodes. Tips, from top to bottom: Pan (chimpanzee), Homo (human), Macaca (macaque), Bos (cow), Canis (dog), Mus (mouse), Rattus (rat), Echinops (tenrec), Mammut (mastodon), Loxodonta (elephant), Monodelphis (opossum), Alligator (alligator), Tyrannosaurus (t-rex), Struthio (ostrich), Gallus (chicken), Anolis (lizard), Cynops (fire newt), Rana (pond frog), Xenopus (clawed frog), Raja (skate), Paralichthys (flounder), Danio (zebrafish), Oncorhynchus (salmon).]

(vi) Do your conclusions regarding the reliability of the sequences from both species change when you
consider the maximum parsimony tree in Figure 2? Explain your reasoning. (3 marks)

(vii) How might you determine whether the phylogenetic placement of the sequences obtained from
the two extinct species is the result of long branch attraction? (4 marks)

Question 5

(i) How is evolution defined in the breeder’s equation? Give the mathematical notation (2 marks) and
a verbal description (3 marks).

(ii) Two quantities with the mathematical notations S and h² make up the right-hand side of the
breeder’s equation. What are the names of these two quantities? (2 marks)

(iii) Explain what S measures (2 marks). Show how it can be calculated from the following data (5
marks).

Phenotypic trait   Fitness
6.19               0
6.42               0
8.06               0
8.96               0
9.13               0
10                 1
10.87              1
11.18              1
13.2               1
13.35              1

(iv) Explain what h² measures (2 marks). It can be calculated as the ratio of two variances. What are
these two variances called? (3 marks)

(v) Draw an annotated graph to illustrate how h² can be calculated using linear regression. (6 marks)

Question 6

(i) Draw a representation of a Leslie matrix model with 5 ages, where only individuals aged 4 and
5 are capable of reproducing. Use a zero to identify impossible transitions, an ‘S’ to represent
survival transitions and an ‘R’ to represent reproduction transitions. (5 marks)

(ii) Name THREE quantities that can be calculated from a Leslie matrix model, and describe what
they represent. (6 marks)

(iii) How does a stochastic matrix model differ from a deterministic matrix model? (4 marks)

(iv) Explain in words what the elasticity of a prediction of a matrix model to the value of a matrix
element describes. (5 marks)

(v) Using an example, explain how elasticities of matrix predictions to the value of a matrix
element can be used in population or conservation biology. (5 marks)
