Tutorial On Loops and Functions: September 28, 2007
This tutorial gives a few examples of typical uses of loops in simple data analysis
problems.
Problem 1. In the first example, suppose we are given numerous vectors, say 10,
and asked to perform a t test on all possible pairs. As output, produce a table of
illustrative test components, sorted by p-value.
In practice we’d be given the vectors; here I’ll create some sample data. The 10
vectors will be stored in a list. First create an empty list. Then execute a loop that
generates a vector of length 20, randomly from normal distributions with varying
means. We start from a mean of −2 and increase it by 0.5 at each step. We’ll leave
the standard deviation at 1 (the default) in each sample.
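The data-generation step just described might look like the following. This is a
minimal sketch: the names x and m and the fixed seed are assumptions made for
illustration, not part of the original tutorial.

```r
set.seed(1)          # assumption: fixed seed so the sample data are reproducible

x <- list()          # start with an empty list
m <- -2              # mean of the first sample
for (i in 1:10) {
  x[[i]] <- rnorm(20, mean = m)   # 20 normal draws, sd left at its default of 1
  m <- m + 0.5                    # increase the mean by 0.5 at each step
}

length(x)            # number of vectors stored in the list
sapply(x, mean)      # sample means, which should drift upward from about -2
```

Storing the vectors in a list means the later loops can refer to the i-th vector
as x[[i]] rather than by ten separate variable names.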
We’re given 10 vectors, x1, ..., x10, and asked to perform a t test on each pair. If
we execute a t test on x1, x2, there is no need to repeat the test on x2, x1. Order only
matters in how certain numbers are reported (in a two-sided test, for example, the
sign of the t statistic flips while the p-value is unchanged). The following
array displays the pairs on which a t test should be run.
x1 x2    x1 x3    x1 x4    ...    x1 x10
         x2 x3    x2 x4    ...    x2 x10
                           ...      ...
                          x8 x9   x8 x10
                                  x9 x10
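One way to walk through exactly the pairs in this array is a double loop in which
the second index always starts one past the first. The sketch below only collects
the pair labels; the names n and pairs are assumptions made for illustration.

```r
n <- 10
pairs <- character(0)            # labels of the pairs visited, in order

for (i in 1:(n - 1)) {           # first index: 1, ..., 9
  for (j in (i + 1):n) {         # second index: i + 1, ..., 10
    pairs <- c(pairs, paste(i, j, sep = "-"))
  }
}

length(pairs)                    # number of pairs, the same as choose(10, 2)
head(pairs)                      # the first few labels
```

Starting j at i + 1 skips both the pairs (i, i) and the repeated pairs (j, i).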
Each pass through the loop performs one t test. We’ll want to store selected components
of the test object as rows in a data frame. To keep track of the variables in a particular
t test we need to associate it with a descriptive name. As the row names of the data frame
we use the name “1-2” for the t test with x1, x2, and similarly for the other pairs. As
useful components of the t test we select statistic, p.value, and estimate.
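To see what these components look like, here is a small sketch on two made-up
samples. The vectors a and b are hypothetical stand-ins for x1 and x2, and the seed
is an assumption. Note that estimate holds two numbers (the two sample means). In
the results table the Estimate column appears to hold only the first of them,
since every row whose label begins with 1 shows the same value.

```r
set.seed(1)                       # assumption: any seed would do
a <- rnorm(20, mean = -2)         # hypothetical stand-in for x1
b <- rnorm(20, mean = -1.5)       # hypothetical stand-in for x2

tt <- t.test(a, b)                # two-sample t test (Welch version, R's default)
names(tt)                         # all components stored in the test object

tt$statistic                      # the t statistic
tt$p.value                        # the two-sided p-value
tt$estimate                       # two values: mean of a, mean of b

label <- paste(1, 2, sep = "-")   # descriptive row name for this test: "1-2"

# Swapping the order flips the sign of the statistic but not the p-value
t.test(b, a)$statistic
t.test(b, a)$p.value
```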
To store the t test results we create a data frame with 45 rows (the total number of
tests) having 0’s as the entries and the characters “1” to “45” as the initial row
names. The counter l keeps track of which of the 45 tests we are running and identifies
the row in which the data should be stored.
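Putting the pieces together, the whole computation might be sketched as follows.
This is one possible version under stated assumptions, not necessarily the code
used to produce the table: the column names are taken from the table header, the
names x and res and the seed are made up, and only the first element of estimate
is stored. Because the data are random, the numbers will differ from those in the
table.

```r
set.seed(1)                                   # assumption: fixed seed
x <- list()                                   # sample data as described earlier
for (i in 1:10) x[[i]] <- rnorm(20, mean = -2 + 0.5 * (i - 1))

# Data frame of zeros with 45 rows and placeholder row names "1", ..., "45"
res <- data.frame(Statistic = rep(0, 45), P.value = rep(0, 45),
                  Estimate = rep(0, 45), row.names = as.character(1:45))

l <- 0                                        # counter: which test we are on
for (i in 1:9) {
  for (j in (i + 1):10) {
    l <- l + 1
    tt <- t.test(x[[i]], x[[j]])
    res[l, "Statistic"] <- tt$statistic
    res[l, "P.value"]   <- tt$p.value
    res[l, "Estimate"]  <- tt$estimate[1]     # assumption: keep the first mean only
    rownames(res)[l]    <- paste(i, j, sep = "-")
  }
}

res <- res[order(res$P.value), ]              # sort rows by p-value
head(res, 10)                                 # first 10 rows
tail(res, 10)                                 # last 10 rows
```

Creating the data frame at its full size before the loop, and filling row l on each
pass, avoids growing the object one row at a time.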
The first 10 rows and the last 10 rows of the table are reported on the following
page.
        Statistic    P.value    Estimate
1-10      -15.40    6.33e-18      -1.74
4-10      -14.00    1.41e-16      -0.89
1-9       -12.85    2.09e-15      -1.74
2-10      -12.79    2.44e-15      -0.96
3-10      -11.72    3.44e-14      -0.99
4-9       -11.13    1.60e-13      -0.89
1-8       -10.61    6.50e-13      -1.74
2-9       -10.43    1.04e-12      -0.96
3-9        -9.73    7.22e-12      -0.99
1-7        -9.27    2.69e-11      -1.74

Problem 2. Write a function in two variables, x and n, that applies the exponential
function to x successively n times, so that n = 2 gives exp(exp(x)). NOTE: The value
of this function gets very large as n increases.
Samples:
> iterExp(5, 2)
[1] 2.851124e+64
> iterExp(2, 5)
[1] Inf
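A function with this behavior could be sketched with a simple loop, as below. Only
the name iterExp and its two arguments come from the samples above; the body is one
possible implementation.

```r
iterExp <- function(x, n) {
  # Apply exp() to x a total of n times; n = 0 returns x unchanged
  for (i in seq_len(n)) {
    x <- exp(x)
  }
  x
}

iterExp(5, 2)    # same as exp(exp(5))
iterExp(2, 5)    # overflows to Inf in double precision
```

Using seq_len(n) rather than 1:n means the loop body is skipped when n is 0,
whereas 1:0 would run it twice.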