QT 1 - Group 5 - R Assignment 1

Uploaded by

Gagan Hasija

Quantitative Techniques 1

Increasing COVID-19 testing capability through pooled testing


R Assignment

Group 5

B24061 Aman Sharma       B24071 Gagan Hasija
B24081 Manu Agarwal      B24092 Rishabh Saboo
B24102 Soham Parija      BL24004 Anjaney Srinivas

Task 1

We have used a function (“expected_tests_first_twenty_pool_size”) which
will return the best pool size value.
The total sample is N=10000 and the number of people in a group is denoted
by n.
Total number of groups (NG) = floor(N/n)
Some samples would have to be tested individually as they can’t be clubbed
together in any group.
Total number of individual samples(N_INDV) = N - NG*n
P(An individual tests positive) = p = 0.01
P(An individual tests negative) = 1-p = 0.99
Therefore, P(A group tests negative) = (1-p)^n
P(A group tests positive) = 1-(1-p)^n
A group requires either 1 test, if the pooled test is negative, or n+1 tests,
if the pooled test is positive (1 pooled test plus n individual retests).
Using the concept of expectation, we find the expected number of tests in a
group.

E(number of tests in a group) = 1*(1-p)^n + (n+1)*(1-(1-p)^n)

This expectation is multiplied by the total number of groups, and the
individually tested samples are added, to give the total expected number of
tests.
E(total number of tests) = E(number of tests in a group) * NG + N_INDV
Further, we have used a loop to compute the expected number of tests for
pool sizes ranging from 1 to 20 and stored them in a vector. We have
created a function (“best_pool_size_row_pool”) to return the pool size
that gives the greatest saving in tests.
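
The calculation described above can be sketched as follows. This is a minimal Python illustration, not the group's R code: the names `expected_tests_row_pool` and `best_pool_size` are hypothetical, and treating n = 1 as plain individual testing (N tests, no pooled test) is an assumption.

```python
def expected_tests_row_pool(n, p=0.01, N=10000):
    """Expected total number of tests when N samples are pooled in groups of n."""
    if n == 1:
        # Assumption: a pool of one sample is just individual testing.
        return float(N)
    groups = N // n                  # NG = floor(N/n)
    leftover = N - groups * n        # N_INDV, tested individually
    p_neg = (1 - p) ** n             # P(a group tests negative)
    per_group = 1 * p_neg + (n + 1) * (1 - p_neg)
    return per_group * groups + leftover


def best_pool_size(p=0.01, N=10000, max_n=20):
    """Pool size in 1..max_n with the smallest expected number of tests."""
    return min(range(1, max_n + 1),
               key=lambda n: expected_tests_row_pool(n, p, N))


if __name__ == "__main__":
    for n in range(1, 21):
        print(n, round(expected_tests_row_pool(n), 3))
    print("best pool size:", best_pool_size())
```

The leftover samples are added at full cost, which is why a pool size that divides N evenly can beat a neighbouring size with a slightly lower per-person cost.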

Result:
Pool size (n)    Expected number of tests
1                10000.000
2                5199.000
3                3630.980
4                2894.040
5                2490.100
6                2254.964
7                2111.075
8                2022.553
9                1976.741
10               1956.179
11               1956.513
12               1972.697
13               1996.422
14               2030.017
15               2074.017
16               2110.422
17               2161.940
18               2218.208
19               2269.271
20               2320.931

The above data can be better understood by looking at the graph attached below:

For p=0.01, the optimal pool size is 10

Task 2

This task is an extension of Task 1: instead of finding the best option
among pool sizes from 1 to 20, here we need to find the optimal value of n
for a given prevalence rate p.

The function (“optimal_pool_size_row_pool”) iterates over all possible
pool sizes and returns the one that minimizes the expected number of
tests.

It uses a while loop running over n from 2 to 10000. Whenever the expected
number of tests for the current n is lower than the stored minimum, we
update the stored minimum and record the current n as the optimal pool
size.
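
The running-minimum search described above can be sketched as follows. This is a Python illustration under stated assumptions, not the group's R function: the names `expected_tests` and `optimal_pool_size` are hypothetical, and the search starts from individual testing (n = 1, N tests) as the initial stored minimum.

```python
def expected_tests(n, p, N=10000):
    """Expected total tests for pool size n >= 2 at prevalence p."""
    groups = N // n
    leftover = N - groups * n
    p_neg = (1 - p) ** n
    return (p_neg + (n + 1) * (1 - p_neg)) * groups + leftover


def optimal_pool_size(p, N=10000):
    """Scan n = 2..N, keeping the pool size with the fewest expected tests."""
    best_n, best_tests = 1, float(N)   # assumption: start from individual testing
    n = 2
    while n <= N:
        tests = expected_tests(n, p, N)
        if tests < best_tests:         # strictly better: update the stored minimum
            best_n, best_tests = n, tests
        n += 1
    return best_n, best_tests


if __name__ == "__main__":
    for p in (0.02, 0.05, 0.10):
        print(p, optimal_pool_size(p))
```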

Results:

● For p=0.02, the optimal pool size is 8.
● For p=0.05, the optimal pool size is 5.
● For p=0.10, the optimal pool size is 4.

Task 3
CALCULATING THE CUT-OFF VALUE OF ‘p’ FOR ANY GIVEN VALUE OF ‘n’

Let p be the prevalence of the disease (the probability that an individual
is infected) and n be the size of the group.

X = number of tests that are needed for the group

P(Group result negative) = (1-p)^n

Number of tests when the group has negative result = 1

P(Group has at least one positive) = 1-(1-p)^n

Number of tests when the group has at least one positive = n+1

Therefore expected number of tests, E = 1*(1-p)^n + (n+1)*(1-(1-p)^n)

Pooling stops being worthwhile once the expected number of tests for a
group is greater than or equal to the number of individuals in the group.
The cut-off value of p is the smallest p for which this happens.

According to the previous statement, E >= n

=> 1*(1-p)^n + (n+1)*(1-(1-p)^n) >= n

=> n+1-n*(1-p)^n >= n

=> n+1-n*(1-p)^n-n >= n-n

=> 1-n*(1-p)^n >= 0

=> 1 >= n*(1-p)^n

=> 1/n >= (1-p)^n (dividing by n, valid since n >= 1)

=> (1/n)^(1/n) >= 1-p

=> p >= 1-(1/n)^(1/n)

Therefore, the cut-off value of p = 1-(1/n)^(1/n)

We use the above formula to find the cut-off value of p for a given value
of n.

n     Prevalence rate (Task 2)     Cut-off value of p
8     0.02                         0.2288946
5     0.05                         0.2752203
4     0.10                         0.2928932
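
The cut-off formula derived in this task can be evaluated directly. The following is a minimal Python sketch; the function names `cutoff_prevalence` and `expected_tests_per_group` are hypothetical. It also checks that, at the cut-off, the expected number of tests per group equals n, which is the boundary case of the inequality E >= n.

```python
def expected_tests_per_group(n, p):
    """E = 1*(1-p)^n + (n+1)*(1-(1-p)^n) for one group of size n."""
    p_neg = (1 - p) ** n
    return p_neg + (n + 1) * (1 - p_neg)


def cutoff_prevalence(n):
    """Prevalence at which pooling n samples no longer saves tests."""
    return 1 - (1 / n) ** (1 / n)


if __name__ == "__main__":
    for n in (8, 5, 4):
        p_cut = cutoff_prevalence(n)
        # At the cut-off the expected tests per group should equal n.
        print(n, p_cut, expected_tests_per_group(n, p_cut))
```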

Task 4

We take n*n samples from the N samples at a time and arrange them in an n x n
square. Each such set of n*n samples is referred to as a group.
Here N=10000 and p=0.01, the prevalence rate of the disease.
Number of groups (NG) = floor(N/n^2)
Number of individual tests (N_INDV) = N - NG*n^2, the samples which are not a
part of any group
All samples of a row are tested together. Similarly, all samples of a column are
tested together. If both the row and the column of a sample test positive, then
that sample (their intersection) is tested again individually.
FOR A GROUP:
P(row negative) = (1-p)^n
P(column negative) = (1-p)^n
P(both row and column negative) = (1-p)^(2*n-1)
P(row or column negative) = 2*(1-p)^n - (1-p)^(2*n-1)
P(row and column positive) = 1 - P(row or column negative)
Number of row and column tests = 2*n
E(number of intersection tests) = n*n*P(row and column positive)
E(number of tests per group) = 2*n + n*n*P(row and column positive)
FOR THE WHOLE SAMPLE:
E(number of tests) = E(number of tests per group)*NG + N_INDV

We iterate over a loop, where n varies from 1 to 30 and we compute the expected
number of tests for each value of n. The value of n which gives the minimum result
is taken as our best pool size.
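
The row-and-column calculation above can be sketched as follows. This is a Python illustration rather than the group's R code: the names `expected_tests_cross_pool` and `best_square_size` are hypothetical, and treating n = 1 as plain individual testing is an assumption.

```python
def expected_tests_cross_pool(n, p=0.01, N=10000):
    """Expected total tests when samples are arranged in n x n squares."""
    if n == 1:
        # Assumption: a 1 x 1 square is just individual testing.
        return float(N)
    groups = N // (n * n)                  # NG = floor(N/n^2)
    leftover = N - groups * n * n          # N_INDV, tested individually
    q = 1 - p
    # P(row negative or column negative), by inclusion-exclusion
    p_row_or_col_neg = 2 * q ** n - q ** (2 * n - 1)
    p_both_pos = 1 - p_row_or_col_neg      # P(row and column positive)
    per_group = 2 * n + n * n * p_both_pos # 2n pooled tests + intersection retests
    return per_group * groups + leftover


def best_square_size(p=0.01, N=10000, max_n=30):
    """Square size in 1..max_n with the smallest expected number of tests."""
    return min(range(1, max_n + 1),
               key=lambda n: expected_tests_cross_pool(n, p, N))


if __name__ == "__main__":
    for n in range(1, 31):
        print(n, round(expected_tests_cross_pool(n), 3))
    print("best square size:", best_square_size())
```

Because N is not a multiple of n*n for most n, the number of leftover samples jumps around as n changes, so the expected total is not a smooth function of n.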

Results
Square size (n)    Expected number of tests
1                  10000
2                  10100.99
3                  6770.91
4                  5108.733
5                  4115.371
6                  3475.433
7                  2993.85
8                  2657.457
9                  2409.498
10                 2174.045
11                 2071.028
12                 1927.111
13                 1790.133
14                 1680.411
15                 1687.848
16                 1557.408
17                 1642.901
18                 1694.564
19                 1640.729
20                 1399.152
21                 1637.501
22                 1643.745
23                 1772.168
24                 1534.84
25                 1354.745
26                 1821.143
27                 1815.904
28                 1884.139
29                 2030.509
30                 1485.499

Also attaching the graph for better understanding:

Running the R code, the best square size (among n = 1 to 30) is 25.

Task 5

● The goal is to find the optimal square size for a given prevalence rate p.
● The function “optimal_pool_size_cross_pool” iterates over possible
square sizes and returns the one that minimizes the expected number of
tests.
● It builds on the previous task, with the only change that, instead of
finding the optimum value in the range of 1 to 30, in this task, we find
the optimum value in the range of 1 to 100 (floor value of square root
of N).

Results:

● For p = 0.02, the optimal square size is 16.
● For p = 0.05, the optimal square size is 10.
● For p = 0.10, the optimal square size is 7.
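
The search over square sizes from 1 to floor(sqrt(N)) can be sketched as follows. This is a Python illustration under stated assumptions, not the group's R function: the names `expected_tests_square` and `optimal_square_size` are hypothetical, and n = 1 is again treated as plain individual testing.

```python
import math


def expected_tests_square(n, p, N=10000):
    """Expected total tests for n x n squares at prevalence p."""
    if n == 1:
        return float(N)                    # assumption: individual testing
    groups = N // (n * n)
    leftover = N - groups * n * n
    q = 1 - p
    p_both_pos = 1 - (2 * q ** n - q ** (2 * n - 1))
    return (2 * n + n * n * p_both_pos) * groups + leftover


def optimal_square_size(p, N=10000):
    """Square size in 1..floor(sqrt(N)) with the fewest expected tests."""
    max_n = math.isqrt(N)                  # 100 when N = 10000
    return min(range(1, max_n + 1),
               key=lambda n: expected_tests_square(n, p, N))


if __name__ == "__main__":
    for p in (0.02, 0.05, 0.10):
        print(p, optimal_square_size(p))
```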

Task 6
From this table, we observe that the Acbott Test is conducted on 2000 individuals
and the truth table looks as follows.

TOTAL ACTUAL POSITIVES: 1000               TOTAL ACTUAL NEGATIVES: 1000
TOTAL PREDICTED TRUE POSITIVES: 990        TOTAL PREDICTED TRUE NEGATIVES: 950
PROBABILITY_TRUE_POSITIVE: p1 = 0.99       PROBABILITY_TRUE_NEGATIVE: p3 = 0.95
PROBABILITY_FALSE_NEGATIVE: p2 = 0.01      PROBABILITY_FALSE_POSITIVE: p4 = 0.05

p_pos = P(individual tests positive) = p1*p + p4*(1-p), where p is the
prevalence rate of disease in the population (assumed as 0.01)

We have assumed that whenever a pool of samples tests positive, all the
individuals of that pool are retested individually.

With this information, we approach the problem in a way similar to the first
task.
E(number of tests per group) = 1*(1-p_pos)^n + (n+1)*(1-(1-p_pos)^n)

Number of groups (NG) = floor(N/n)

Number of individual tests (N_INDV) = N - NG*n

E(total number of tests) = E(number of tests per group)*NG + N_INDV
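
The formulas above can be sketched as follows. This is a Python illustration, not the group's R code: the names `prob_test_positive` and `expected_tests_imperfect` are hypothetical, and the pool is modelled exactly as in the formulas above, with (1-p_pos)^n as the probability that a pool tests negative.

```python
def prob_test_positive(p, p1=0.99, p4=0.05):
    """p_pos = p1*p + p4*(1-p): true positives plus false positives."""
    return p1 * p + p4 * (1 - p)


def expected_tests_imperfect(n, p=0.01, N=10000):
    """Expected total tests for pool size n with an imperfect test."""
    p_pos = prob_test_positive(p)
    groups = N // n                        # NG = floor(N/n)
    leftover = N - groups * n              # N_INDV
    pool_neg = (1 - p_pos) ** n            # P(pool tests negative)
    per_group = 1 * pool_neg + (n + 1) * (1 - pool_neg)
    return per_group * groups + leftover


if __name__ == "__main__":
    print("p_pos:", prob_test_positive(0.01))
    for n in range(2, 21):
        print(n, round(expected_tests_imperfect(n), 3))
```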

After computing the expected number of tests for one value of n in a
function (‘acbott_test_optimization’), we run a loop in another
function (‘expected_test_for_first_twenty_values_acbott_test1’) which stores
the computed expected values for n from 1 to 20. A third
function (‘best_pool_size_acbott_test1’) finds the best pool size among
these twenty values.

For the general optimization of pool size, we keep track of the minimum
expected number of tests found so far. We iterate over all possible values
of n, and whenever a value gives fewer expected tests than the current
minimum, we update the minimum as well as the optimal pool size. This
happens in another function (“best_pool_size_acbott_test2”).
