
2.5: Accessing a Data Frame


We access the individual elements in a data frame using square brackets to identify a specific cell. For instance, the
following accesses the data in the cell in row 15, column 12:

> int92.dat[15,12]
[1] 180

We can also access cells by name by putting quotes around the name:

> int92.dat["71","perf"]
[1] 105.1

This expression returns the data in the row labeled 71 and the column labeled perf. Note that this is not the 71st row,
but rather the row that contains the data for the processor whose name is 71.
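
The difference between a row's label and its position can be seen in a small sketch. The data frame toy.dat and all of its values are hypothetical, invented only for illustration; they are not taken from int92.dat.

```r
# Hypothetical toy data frame; the values are made up for illustration.
toy.dat <- data.frame(
  perf  = c(36.7, 105.1, 180.0),
  clock = c(100, 125, 166),
  row.names = c("70", "71", "72")
)

# Quoted "71" selects the row whose label is "71".
by.name <- toy.dat["71", "perf"]

# Unquoted 2 selects the second row, which here is the same row.
by.position <- toy.dat[2, "perf"]

# Unquoted 71 asks for the 71st row, which does not exist in this
# three-row data frame, so R returns NA rather than the row labeled "71".
out.of.range <- toy.dat[71, "perf"]

print(by.name)
print(by.position)
print(out.of.range)
```

Quoting the index is what tells R to match against the row labels instead of counting rows.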

We can access an entire column by leaving the first parameter in the square brackets empty. For instance, the following
prints the value in every row for the column labeled clock:

> int92.dat[,"clock"]
[1] 100 125 166 175 190 ...

Similarly, this expression prints the values in all of the columns for row 36:

> int92.dat[36,]
      nperf     perf clock threads cores ...
36 13.07378 79.86399    80       1     1 ...
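
Selecting a column and selecting a row do not return the same kind of object, which the following sketch illustrates. It assumes a hypothetical data frame toy.dat with made-up values.

```r
# Hypothetical toy data frame; the values are made up for illustration.
toy.dat <- data.frame(perf = c(10, 20, 30), clock = c(100, 125, 166))

# Empty first index: every row of one column, returned as a plain vector.
clock.col <- toy.dat[, "clock"]

# Empty second index: every column of one row, returned as a data frame.
second.row <- toy.dat[2, ]

print(is.numeric(clock.col))
print(is.data.frame(second.row))
```

This is why the column output above is printed as a simple list of values, while the row output keeps its column headings.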

The functions nrow() and ncol() return the number of rows and columns, respectively, in the data frame:

> nrow(int92.dat)
[1] 78
> ncol(int92.dat)
[1] 16
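
These counts can also be used as indices, for example to reach the last row or column without knowing the size in advance. The sketch below uses a hypothetical data frame toy.dat with made-up values; dim() is a standard R function that returns both counts at once.

```r
# Hypothetical toy data frame; the values are made up for illustration.
toy.dat <- data.frame(perf = c(10, 20, 30), clock = c(100, 125, 166))

n.rows <- nrow(toy.dat)
n.cols <- ncol(toy.dat)

# Use the counts as indices to reach the last row and the last column.
last.row <- toy.dat[n.rows, ]
last.col <- toy.dat[, n.cols]

# dim() returns the row count followed by the column count.
shape <- dim(toy.dat)

print(last.row)
print(shape)
```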

Because R functions can typically operate on a vector of any length, we can use built-in functions to quickly compute
some useful results. For example, the following expressions compute the minimum, maximum, mean, and standard
deviation of the perf column in the int92.dat data frame:

> min(int92.dat[,"perf"])
[1] 36.7
> max(int92.dat[,"perf"])
[1] 366.857
> mean(int92.dat[,"perf"])
[1] 124.2859
> sd(int92.dat[,"perf"])
[1] 78.0974
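
The same idea extends to several columns at once. As a minimal sketch, assuming a hypothetical data frame toy.dat with made-up values, the statistics for one column can be collected into a named vector, and sapply() can apply one function to every column.

```r
# Hypothetical toy data frame; the values are made up for illustration.
toy.dat <- data.frame(perf = c(10, 20, 30), clock = c(100, 125, 166))

# Collect the four statistics for one column into a single named vector.
perf.stats <- c(
  min  = min(toy.dat[, "perf"]),
  max  = max(toy.dat[, "perf"]),
  mean = mean(toy.dat[, "perf"]),
  sd   = sd(toy.dat[, "perf"])
)

# sapply() applies a function to each column and returns a named vector.
col.means <- sapply(toy.dat, mean)

print(perf.stats)
print(col.means)
```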

This square-bracket notation can become cumbersome when you do a substantial amount of interactive computation within
the R environment. R provides an alternative notation using the $ symbol to more easily access a column. Repeating the
previous example using this notation:

David Lilja 2.5.1 12/17/2021 https://stats.libretexts.org/@go/page/4406


> min(int92.dat$perf)
[1] 36.7
> max(int92.dat$perf)
[1] 366.857
> mean(int92.dat$perf)
[1] 124.2859
> sd(int92.dat$perf)
[1] 78.0974
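
A short sketch can confirm that the $ notation returns the same vector as the square-bracket form. The data frame toy.dat and the variable col.name are hypothetical, introduced only for illustration. One caveat worth knowing is that $ expects the literal column name, so a name held in a variable needs bracket indexing instead.

```r
# Hypothetical toy data frame; the values are made up for illustration.
toy.dat <- data.frame(perf = c(10, 20, 30), clock = c(100, 125, 166))

# The $ form and the bracket forms all return the same numeric vector.
same.as.bracket <- identical(toy.dat$perf, toy.dat[, "perf"])
same.as.double  <- identical(toy.dat$perf, toy.dat[["perf"]])

# When the column name is stored in a variable, use [[ ]] (or [ , ]).
col.name <- "clock"
by.variable <- toy.dat[[col.name]]

# toy.dat$col.name looks for a column literally named "col.name",
# which does not exist here, so the result is NULL.
literal.lookup <- toy.dat$col.name

print(same.as.bracket)
print(by.variable)
print(is.null(literal.lookup))
```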

This notation says to use the data in the column named perf from the data frame named int92.dat. We can
make yet a further simplification using the attach() function. This function adds the corresponding data frame to R's
search path so that its columns can be referenced directly by name, thereby eliminating the need to use the potentially
awkward $ or square-bracket indexing notation. The following example shows how this works:

> attach(int92.dat)
> min(perf)
[1] 36.7
> max(perf)
[1] 366.857
> mean(perf)
[1] 124.2859
> sd(perf)
[1] 78.0974
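
One caveat is that attach() works on a copy of the data frame, so changes made to the data frame afterward are not visible through the bare column names. The sketch below illustrates this with a hypothetical data frame toy.dat and made-up values; it assumes no other variable named perf exists in the workspace.

```r
# Hypothetical toy data frame; the values are made up for illustration.
toy.dat <- data.frame(perf = c(10, 20, 30))

attach(toy.dat)

# Modify the original data frame after it has been attached.
toy.dat$perf <- toy.dat$perf * 2

# The bare name still refers to the copy made when attach() was called.
attached.mean <- mean(perf)

# The $ notation sees the updated data frame.
direct.mean <- mean(toy.dat$perf)

detach(toy.dat)

print(attached.mean)
print(direct.mean)
```

Detaching and re-attaching the data frame after a change brings the bare names back in sync.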

To change to a different data frame, you should first detach the current one so that columns with the same name, such as perf, do not conflict:

> detach(int92.dat)
> attach(fp00.dat)
> min(perf)
[1] 87.54153
> max(perf)
[1] 3369
> mean(perf)
[1] 1217.282
> sd(perf)
[1] 787.4139
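
When switching between data frames often, the standard with() function is one alternative to the attach and detach sequence. It evaluates a single expression using the columns of the given data frame and leaves the search path unchanged. This is a sketch of that alternative, not the approach used in this text; the data frames toy.a and toy.b and their values are hypothetical.

```r
# Hypothetical toy data frames that share a column name.
toy.a <- data.frame(perf = c(10, 20, 30))
toy.b <- data.frame(perf = c(100, 200, 300))

# with() looks up perf inside the named data frame for this one expression,
# so nothing needs to be detached afterward.
mean.a <- with(toy.a, mean(perf))
mean.b <- with(toy.b, mean(perf))

print(mean.a)
print(mean.b)
```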

Now that we have the necessary data available in the R environment, and some understanding of how to access and
manipulate this data, we are ready to generate our first regression model.
