
R Intro Edx-Datacamp

The document discusses creating and managing variables in the R programming environment's workspace. It provides examples of assigning values to variables, listing the contents of the workspace, and removing variables from the workspace. The key points covered are: 1) Variables are stored in the R workspace and the command ls() lists the contents of the workspace. 2) The rm() command allows users to remove objects from the workspace. 3) Examples are given of creating variables, listing the workspace contents, and removing a variable to demonstrate how objects can be built up, inspected, and managed in the workspace.

Uploaded by

Lau Chavez

Introduction To R – datacamp+edx

The 2nd chapter is not available for free on DataCamp, but it is on edX, and goes as follows:

The workspace
If you assign a value to a variable, this variable is stored in the workspace. It's the
place where all user-defined variables and objects live. The command ls() lists
the contents of this workspace. rm(<var_name>) allows you to remove objects from
the workspace again. Try the following code in the console:
a <- 1
b <- 2
ls()
rm(a)
ls()
The first two lines create the variables a and b. Calling ls() now shows you
that a and b are in the workspace. After removing a using rm(a), the
same ls() command will show you that only b remains in the workspace. You
could also remove both a and b in a one-liner: rm(a,b).
The first line of the sample code is rm(list = ls()). This is a very useful
command to clear everything from your workspace!
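If you would rather not wipe your real workspace while experimenting, the same commands can be tried in a separate environment. This is only a sketch: the environment name scratch is made up for the illustration, and it relies on the envir argument that assign(), ls() and rm() accept.

```r
# Hypothetical scratch environment, so the real workspace is left alone
scratch <- new.env()

assign("a", 1, envir = scratch)
assign("b", 2, envir = scratch)
before <- ls(envir = scratch)        # both names are present

rm("a", envir = scratch)             # remove a single object
after_rm <- ls(envir = scratch)      # only b is left

# Same idea as rm(list = ls()), restricted to the scratch environment
rm(list = ls(envir = scratch), envir = scratch)
after_clear <- ls(envir = scratch)   # empty character vector

print(before)
print(after_rm)
print(after_clear)
```

Working in a throwaway environment keeps the behaviour of ls() and rm() visible without deleting variables you still need.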
Instructions
100 XP

• List the contents of the workspace to check that the workspace is empty.
• Create a variable, horses, equal to 3.
• Create another variable, dogs, which you set to 7.
• Create a new variable, animals, that is equal to the sum of horses and dogs.
• Inspect the contents of the workspace again to see that indeed, these three
variables are available.
• Eliminate the dogs variable from the workspace.
• Finally, inspect the objects in your workspace once more to see that
only horses and animals remain.

Take Hint (-30 XP)

> # Clear the entire workspace

> rm(list = ls())

>

> # List the contents of your workspace

> ls()

character(0)
>

> # Create the variable horses

> horses <- 3

>

> # Create the variable dogs

> dogs <- 7

>

> # Create the variable animals

> animals <- horses + dogs

>

> # Inspect the contents of the workspace again

> ls()

[1] "animals" "dogs" "horses"

>

> # Remove dogs from the workspace

> rm(dogs)

>

> # Inspect the objects in your workspace once more

> ls()

[1] "animals" "horses"

Awesome! Now that you know how you can build up, inspect and manage your workspace, it's
time for your first challenge!

Build and destroy your workspace


Apples and oranges, dogs and horses, you can model practically everything in R.
The only limit is your own imagination! However, how you create and manage the
variables you're creating is always the same. If fruits are not your kind of thing,
you're in luck! In this final coding exercise, you will compute the volume of a donut.
The volume of a donut can be expressed as:

$V = 2\pi^2 r^2 R$
where $r$ is the minor radius and $R$ is the major radius. This is the same as
computing the area of the cylindrical portion of the donut ($\pi r^2$) and multiplying it
by the circumference of the donut ($2\pi R$). Top off this theory with some
workspace management and you've got one tasty challenge! One last tip: $\pi$ is
available in R by default as pi.
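As a quick sanity check of the formula, the sketch below multiplies the cross-section area by the circumference and compares the product with the closed form. The radii 2 and 6 are the values used in the exercise; the variable names cross_section, circumference, vol_product and vol_formula are made up for this illustration.

```r
r <- 2   # minor radius
R <- 6   # major radius

cross_section <- pi * r^2     # area of the circular cross-section
circumference <- 2 * pi * R   # length of the path around the donut

vol_product <- cross_section * circumference
vol_formula <- 2 * pi^2 * r^2 * R

# all.equal() compares numbers with a small tolerance for rounding error
print(all.equal(vol_product, vol_formula))
```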
Instructions
200 XP

• Define the variables r (minor radius) and R (major radius) and set them to 2 and 6
using the assignment operator (<-).
• Calculate the volume of the donut based on the formula above and assign the
result to vol_donut. You can use intermediary variables for this if you want.
• Remove all intermediary variables that you've used to calculate vol_donut using
the rm() function.
• Finally, use ls() to list the elements in your workspace. Only vol_donut should
remain in your workspace at this point.

Take Hint (-60 XP)

> # Create the variables r and R

> r <- 2

> R <- 6

>

> # Calculate the volume of the donut: vol_donut

> pi_2 <- pi^2

> r_2 <- r^2

> vol_donut <- 2 * pi_2 * r_2 * R

>

> # Remove all intermediary variables that you've used with rm()

> rm(pi_2)

> rm(r_2)

> rm(r)

> rm(R)

>

> # List the elements in your workspace


> ls()

[1] "vol_donut"

Awesome! In this exercise, the true power of variables became apparent. Close this window to
head back to edX and continue to learn more about R's basic data types.

You have finished the chapter "R: The true basics"!

ANOTHER CHAPTER

Coercion: Taming your data


As Filip explained in the video, you can use coercion to transform your data from one type to
another. Next to the class() function and the is.*() functions, you can
use the as.*() functions to force data to change type. For example,
var <- "3"
var_num <- as.numeric(var)
converts the character string "3" in var to
a numeric 3 and assigns it to var_num.
Beware however, that it is not always possible to convert the types without
information loss or errors:
as.integer("4.5")
as.numeric("three")
The first line will convert the character string "4.5" to the integer 4. The second
one will convert the character string "three" to an NA.
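You can check both claims in the console. The sketch below stores the results in two illustrative variables (int_from_chr and num_from_chr are names chosen here) and wraps the second call in suppressWarnings() only to keep the output quiet; without it R prints the "NAs introduced by coercion" warning.

```r
# The fractional part is dropped when converting to an integer
int_from_chr <- as.integer("4.5")

# "three" cannot be parsed as a number, so the result is NA
num_from_chr <- suppressWarnings(as.numeric("three"))

print(int_from_chr)
print(class(int_from_chr))
print(is.na(num_from_chr))
```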
Instructions
100 XP

• Convert var1, a logical, to a character and assign it to the variable var1_char.
• Next, see whether var1_char actually is a character by using
the is.character() function on it.
• Convert var2, a numeric, to a logical and assign it to the variable var2_log.
• Inspect the class of var2_log using class().
• Finally, try to coerce var3 to a numeric and assign the result to var3_num. Was it
successful?

Take Hint (-30 XP)

> # Create variables var1, var2 and var3

> var1 <- TRUE

> var2 <- 0.3

> var3 <- "i"

>
> # Convert var1 to a character: var1_char

> var1_char <- as.character(var1)

>

> # See whether var1_char is a character

> is.character(var1_char)

[1] TRUE

>

> # Convert var2 to a logical: var2_log

> var2_log <- as.logical(var2)

>

> # Inspect the class of var2_log

> class(var2_log)

[1] "logical"

>

> # Coerce var3 to a numeric: var3_num

> var3_num <- as.numeric(var3)

Warning message: NAs introduced by coercion

Bellissimo! The final coercion you tried did not succeed, hence the warning. Head over to the
challenge that concludes this chapter.

Coercion for the sake of cleaning


Coercion can come in pretty handy when you're dealing with messy datasets
where supposedly numerical variables have been stored as character strings,
logicals have been stored as numerics etc. To prepare you for such problems, try
this coding exercise: your first modest steps in data cleaning! In the workspace,
some variables concerning the answers on a questionnaire have been defined;
have a look at them in the R console with ls().
Instructions
200 XP
• Use as.numeric() to convert the character age; assign the result to a new
variable age_clean.
• With the help of as.logical(), convert the numeric employed and store the result
to a new variable employed_clean.
• Using the as.numeric() function, convert the respondent's salary to a numeric;
assign the resulting numeric to the variable salary_clean.

Take Hint (-60 XP)

> ls()

[1] "age" "employed" "location" "salary"

> # Convert age to numeric: age_clean

> age_clean <- as.numeric(age)

>

> # Convert employed to logical: employed_clean

> employed_clean <- as.logical(employed)

>

> # Convert salary to numeric: salary_clean

> salary_clean <- as.numeric(salary)

Perfect! Sit back and relax for a while after this first introduction to R, but not for too long: there
is much more to come! Close this tab and head over to edX again to learn more about vectors.
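The questionnaire values themselves are not shown in these notes, so the conversions cannot be inspected as written. The sketch below replays the exercise with made-up values (the "29", 1 and "2500" are assumptions, not the course data) to show what class each cleaned variable ends up with.

```r
# Hypothetical questionnaire answers; the real values are not given in the text
age      <- "29"     # number stored as a character string
employed <- 1        # logical stored as a numeric
salary   <- "2500"   # number stored as a character string

age_clean      <- as.numeric(age)
employed_clean <- as.logical(employed)
salary_clean   <- as.numeric(salary)

print(c(class(age_clean), class(employed_clean), class(salary_clean)))
```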

You have finished the chapter "Basic Data Types"!

ANOTHER CHAPTER

Create a vector (1)


Feeling lucky?

You better, because this chapter takes you on a trip to Sin City, also known
as "Statistician's Paradise" ;-).

Thanks to R and your new data science skills, you will learn how to uplift your
performance at the tables and fire off your career as a professional gambler. This
chapter will show how you can easily keep track of your betting progress and how
you can do some simple analyses on past actions. Next Stop, Vegas Baby...
VEGAS!!

On your way from rags to riches, you will make extensive use of vectors. As Filip
explained, vectors are one-dimensional arrays that can hold numeric data,
character data, or logical data. You create a vector with the combine function c().
You place the vector elements, separated by commas, between the parentheses. For
example:

numeric_vector <- c(1, 2, 3)


character_vector <- c("a", "b", "c")
boolean_vector <- c(TRUE, FALSE)
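To see which kind of data each vector holds, you can ask for its class. The sketch below also adds one vector that is not part of the course code, mixed_vector, to illustrate that a vector holds a single type: mixing types inside c() coerces every element to a common type.

```r
numeric_vector   <- c(1, 2, 3)
character_vector <- c("a", "b", "c")
boolean_vector   <- c(TRUE, FALSE)

print(class(numeric_vector))
print(class(character_vector))
print(class(boolean_vector))

# Illustrative extra: numbers and logicals become character strings here
mixed_vector <- c(1, "a", TRUE)
print(mixed_vector)
print(class(mixed_vector))
```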
Instructions
100 XP

Build a vector, boolean_vector, that contains the three
elements: TRUE, FALSE and TRUE (in that order).

Take Hint (-30 XP)

numeric_vector <- c(1, 10, 49)

character_vector <- c("a", "b", "c")

# Create boolean_vector

boolean_vector <- c(TRUE, FALSE, TRUE)

Perfect! Notice that adding a space after the commas in the c() function improves the readability
of your code. Let's practice some more with vector creation in the next exercise.

Create a vector (2)


After one week in Las Vegas and still zero Ferraris in your garage, you decide that
it is time to start using your data science superpowers.

Before doing a first analysis, you decide to first collect all the winnings and losses
for the last week:

For poker_vector:

• On Monday you won $140
• Tuesday you lost $50
• Wednesday you won $20
• Thursday you lost $120
• Friday you won $240

For roulette_vector:

• On Monday you lost $24
• Tuesday you lost $50
• Wednesday you won $100
• Thursday you lost $350
• Friday you won $10

You only played poker and roulette, since there was a delegation of mediums that
occupied the craps tables. To be able to use this data in R, you decide to create
the variables poker_vector and roulette_vector.
Instructions
100 XP

Assign the winnings/losses for roulette to the variable roulette_vector.


Take Hint (-30 XP)

# Poker winnings from Monday to Friday

poker_vector <- c(140, -50, 20, -120, 240)

# Roulette winnings from Monday to Friday: roulette_vector

roulette_vector <- c(-24, -50, 100, -350, 10)

Very good! To check out the contents of your vectors, remember that you can always simply type
the variable in the console and hit Enter. Proceed to the next exercise!

Naming a vector (1)


As a data analyst, it is important to have a clear view on the data that you are
using. Understanding what each element refers to is therefore essential.

In the previous exercise, we created a vector with your winnings over the week.
Each vector element refers to a day of the week but it is hard to tell which element
belongs to which day. It would be nice if you could show that in the vector itself.
Remember the names() function to name the elements of a vector?
some_vector <- c("Johnny", "Poker Player")
names(some_vector) <- c("Name", "Profession")
You can accomplish the exact same thing by using the equals sign inside c():
some_vector <- c(Name = "Johnny", Profession = "Poker Player")
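A short check that the two approaches really give the same result. The names v1 and v2 are chosen here only to keep both versions side by side.

```r
# Approach 1: create the vector, then attach names
v1 <- c("Johnny", "Poker Player")
names(v1) <- c("Name", "Profession")

# Approach 2: name the elements directly inside c()
v2 <- c(Name = "Johnny", Profession = "Poker Player")

print(identical(v1, v2))
print(names(v2))
```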
Instructions
100 XP
Go ahead and assign the days of the week as names
to poker_vector and roulette_vector. In case you are not sure, the days of the
week are: Monday, Tuesday, Wednesday, Thursday and Friday.
Take Hint (-30 XP)

# Poker winnings from Monday to Friday

poker_vector <- c(140, -50, 20, -120, 240)

# Roulette winnings from Monday to Friday

roulette_vector <- c(-24, -50, 100, -350, 10)

# Add names to both poker_vector and roulette_vector

names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

names(roulette_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

Naming a vector (2)


If you want to become a good statistician, you have to become lazy. (If you are
already lazy, chances are high you are one of those exceptional, natural-born
statistical talents.)

In the previous exercises you probably experienced that it is boring and frustrating
to type and retype information such as the days of the week. However, there is a
more efficient way to do this, namely, to assign the days of the week vector to a
variable!

Just like you did with your poker and roulette returns, you can also create a
variable that contains the days of the week. This way you can use and re-use it.

Instructions
100 XP

• Create a variable days_vector that contains the days of the week, from
Monday to Friday.
• Use that variable days_vector to set the names
of poker_vector and roulette_vector.

Take Hint (-30 XP)
# Poker winnings from Monday to Friday

poker_vector <- c(140, -50, 20, -120, 240)

# Roulette winnings from Monday to Friday

roulette_vector <- c(-24, -50, 100, -350, 10)

# Create the variable days_vector

days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

# Assign the names of the day to roulette_vector and poker_vector

names(poker_vector) <- days_vector

names(roulette_vector) <- days_vector

Nice one! A word of advice: try to avoid code duplication at all times. Continue to the next exercise
and learn how to do arithmetic with vectors!

Different ways to create and name vectors

The previous exercises outlined different ways of creating and naming vectors.
Have a look at this chunk of code:

poker_vector1 <- c(140, -50, 20, -120, 240)
names(poker_vector1) <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                          "Friday")

poker_vector2 <- c(Monday = 140, -50, 20, -120, 240)

roulette_vector1 <- c(-24, -50, 100, -350, 10)
days_vector <- names(poker_vector1)
names(roulette_vector1) <- days_vector

roulette_vector2 <- c(-24, -50, 100, -350, 10)
names(roulette_vector2) <- "Monday"

Which of the following statements is true?

Instructions
100 XP
Possible Answers

The code to define poker_vector2 is syntactically invalid.

poker_vector1 and poker_vector2 have different lengths.

poker_vector1 and roulette_vector1 have the same names,
while poker_vector2 and roulette_vector2 show a names mismatch.

You can only use names() to set the names of a vector, making days_vector <-
names(poker_vector1) invalid.

Take Hint (-30 XP)
Correct! You might expect the
vectors roulette_vector1 and roulette_vector2 to be named the same, but the different approaches
treat missing name information differently. Also, notice here how you can
use names() to get the names of a vector! Head back to edX by closing this tab.
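The names mismatch mentioned in the answer options can be made visible by printing the names of the two partially named vectors. This is a small sketch that reuses the code from the question; nothing else is assumed.

```r
# Naming only the first element inside c() leaves the rest as empty strings
poker_vector2 <- c(Monday = 140, -50, 20, -120, 240)

# Assigning a single name with names() leaves the rest as NA
roulette_vector2 <- c(-24, -50, 100, -350, 10)
names(roulette_vector2) <- "Monday"

print(names(poker_vector2))
print(names(roulette_vector2))
```

Both vectors still have five elements; only the way the missing names are represented differs.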

CHAPTER 3

Summing and subtracting vectors


Now that you have the poker and roulette winnings nicely as a named vector, you
can start doing some data science magic.

You want to find out the following type of information:

• How much has been your overall profit or loss per day of the week?
• Have you lost money over the week in total?
• Are you winning/losing money on poker or on roulette?
You'll have to do arithmetic calculations on vectors to solve these problems.
Remember that this happens element-wise; the following three statements are
completely equivalent:

c(1, 2, 3) + c(4, 5, 6)
c(1 + 4, 2 + 5, 3 + 6)
c(5, 7, 9)
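If you want R to confirm the equivalence, store the three results and compare them. The names sum_a, sum_b and sum_d are placeholders chosen for this sketch.

```r
sum_a <- c(1, 2, 3) + c(4, 5, 6)    # vector plus vector, element by element
sum_b <- c(1 + 4, 2 + 5, 3 + 6)     # the same additions written out
sum_d <- c(5, 7, 9)                 # the expected result

print(identical(sum_a, sum_b) && identical(sum_b, sum_d))
```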
Instructions
100 XP

• Take the element-wise sum of the variables A_vector and B_vector and
assign it to total_vector. The result should be a vector.
• Inspect the result by printing total_vector to the console.
• Do the same thing, but this time subtract B_vector from A_vector and
assign the result to diff_vector.
• Finally, print diff_vector to the console as well.

Take Hint (-30 XP)
# A_vector and B_vector have already been defined for you
A_vector <- c(1, 2, 3)
B_vector <- c(4, 5, 6)

# Take the sum of A_vector and B_vector: total_vector


total_vector <- A_vector + B_vector

# Print total_vector
total_vector

# Calculate the difference between A_vector and B_vector: diff_vector


diff_vector <- A_vector - B_vector

# Print diff_vector
diff_vector
Calculate your earnings
Now that you understand how R does arithmetic calculations with vectors, it is time
to get those Ferraris in your garage! First, you need to understand what the overall
profit or loss per day of the week was. The total daily profit is the sum of the
profit/loss you realized on poker per day, and the profit/loss you realized on
roulette per day.

Instructions
100 XP

Assign to the variable total_daily how much you won or lost on each day in total
(poker and roulette combined).
Take Hint (-30 XP)

> # Casino winnings from Monday to Friday


> poker_vector <- c(140, -50, 20, -120, 240)
> roulette_vector <- c(-24, -50, 100, -350, 10)
> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
> names(poker_vector) <- days_vector
> names(roulette_vector) <- days_vector
>
> # Calculate your daily earnings: total_daily
> total_daily <- poker_vector + roulette_vector
> total_daily
   Monday   Tuesday Wednesday  Thursday    Friday
      116      -100       120      -470       250

Calculate total winnings: sum()


Based on the previous analysis, it looks like you had a mix of good and bad days.
This is not what your ego expected, and you wonder if there may be a (very very
very) tiny chance you have lost money over the week in total?

You can answer this question using the sum() function. As mentioned in the video,
it calculates the sum of all elements of a vector.
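A small sketch of sum() on the casino data used throughout this chapter. It computes the weekly total in two ways, per game and per day, to show that both routes give the same number; total_by_game and total_by_day are names chosen for this illustration.

```r
poker_vector    <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)

# Sum each game first, then add the two totals
total_by_game <- sum(poker_vector) + sum(roulette_vector)

# Add the games per day first, then sum the daily results
total_by_day <- sum(poker_vector + roulette_vector)

print(c(total_by_game, total_by_day))
```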
Instructions
100 XP

• Calculate the total amount of money that you have won/lost with poker and
assign it to the variable total_poker.
• Do the same thing for roulette and assign the result to total_roulette.
• Now that you have the totals for roulette and poker, you can easily
calculate total_week (which is the sum of all gains and losses of the week).
• Print the variable total_week.

Take Hint (-30 XP)
> # Casino winnings from Monday to Friday
> poker_vector <- c(140, -50, 20, -120, 240)
> roulette_vector <- c(-24, -50, 100, -350, 10)
> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
> names(poker_vector) <- days_vector
> names(roulette_vector) <- days_vector
>
> # Total winnings with poker: total_poker
> total_poker <- sum(poker_vector)
>
> # Total winnings with roulette: total_roulette
> total_roulette <- sum(roulette_vector)
>
> # Total winnings overall: total_week
> total_week <- total_poker + total_roulette
>
> # Print total_week
> total_week
[1] -84
Oops, it seems like you are losing money. Time to rethink and adapt your strategy!
This will require some deeper analysis…
Comparing total winnings
The previous exercise showed that you are losing money. Now what? After a short
brainstorm in your hotel's jacuzzi, you realize that a possible explanation might be
that your skills in roulette are not as well developed as your skills in poker. You
choose to use the > operator to reveal this.

Instructions
100 XP

• Create a new vector containing logicals, poker_better, that tells whether
your poker gains exceeded your roulette results on a daily basis.
• Calculate total_poker and total_roulette as in the previous exercise.
• Using total_poker and total_roulette, check if your total gains in poker
are higher than for roulette by using a comparison. Assign the result of this
comparison to the variable choose_poker and print it out. What do you
conclude, should you focus on roulette or on poker?

Take Hint (-30 XP)
> # Casino winnings from Monday to Friday
> poker_vector <- c(140, -50, 20, -120, 240)
> roulette_vector <- c(-24, -50, 100, -350, 10)
> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
> names(poker_vector) <- days_vector
> names(roulette_vector) <- days_vector
>
> # Calculate poker_better
> poker_better <- poker_vector > roulette_vector
>
> # Calculate total_poker and total_roulette, as before
> total_poker <- sum(poker_vector)
> total_roulette <- sum(roulette_vector)
>
> # Calculate choose_poker
> choose_poker <- total_poker > total_roulette
>
> # Print choose_poker
> choose_poker
[1] TRUE
Great! Your hunch seemed to be right. It appears that the poker game is more
your cup of tea than roulette. Ready for a challenge? Head over to the next
exercise!
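The daily comparison poker_better is computed above but never printed. The sketch below rebuilds it from the same data so you can see that > compares the two vectors element by element and keeps the day names.

```r
days_vector     <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
poker_vector    <- c(140, -50, 20, -120, 240)
roulette_vector <- c(-24, -50, 100, -350, 10)
names(poker_vector)    <- days_vector
names(roulette_vector) <- days_vector

# One logical per day: did poker do strictly better than roulette?
poker_better <- poker_vector > roulette_vector
print(poker_better)
```

Note that Tuesday is FALSE: both games lost 50 that day, and > is a strict comparison.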

First steps in rational gambling


In the previous exercise, you found out that roulette is not really your forte.
However, you have some vague memories from visits in Vegas where you actually
excelled at this game. You plan to dig through your receipts of when you withdrew
and cashed chips and found out about your actual performance in the previous
week you were in Sin City. In that week, you also only played poker and roulette;
the information is stored in poker_past and roulette_past. The information for the
current week, with which you have been working all along, is
in poker_present and roulette_present. All these variables are available in your
workspace.

Instructions
200 XP

• Use the sum() function twice in combination with the + operator to calculate
the total gains for your entire past week in Vegas (this means for both poker
and roulette). Assign the result to total_past.
• Calculate the difference of past to present poker performance: using
the - operator, subtract poker_past from poker_present, to
calculate diff_poker. diff_poker should be a vector with 5 elements.

Take Hint (-60 XP)
> # Calculate total gains for your entire past week: total_past
> total_past <- sum(poker_past) + sum(roulette_past)
>
> # Difference of past to present performance: diff_poker
> diff_poker <- poker_present - poker_past
> diff_poker
   Monday   Tuesday Wednesday  Thursday    Friday
      210      -140       -90         0       210
Awesome! It seems that indeed, your roulette skills have worsened if you compare
to your previous week in Vegas. Go back to edX to learn about new ways of
investigating your gambling performance.
You have finished the chapter "Vector Arithmetic"!

CHAPTER 4

Selection by index (1)


After you figured out that roulette is not your forte, you decide to compare your
performance at the beginning of the working week with your performance at the end of it. You
did have a couple of Margarita cocktails at the end of the week...

To answer that question, you only want to focus on a selection of
the total_vector. In other words, our goal is to select specific elements of the
vector.
Instructions
100 XP

• Assign the poker results of Wednesday to the variable poker_wednesday.
• Assign the roulette results of Friday to the variable roulette_friday.

Take Hint (-30 XP)

> # Casino winnings from Monday to Friday

> poker_vector <- c(140, -50, 20, -120, 240)

> roulette_vector <- c(-24, -50, 100, -350, 10)

> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

> names(poker_vector) <- days_vector

> names(roulette_vector) <- days_vector

>

> # Poker results of Wednesday: poker_wednesday

> poker_wednesday <- poker_vector["Wednesday"]


> poker_wednesday

Wednesday

20

>

> # Roulette results of Friday: roulette_friday

> roulette_friday <- roulette_vector["Friday"]

> roulette_friday

Friday

10

Great! R also makes it possible to select multiple elements from a vector at once, remember? Put
the theory to practice in the next exercise!

Selection by index (2)


How about analyzing your midweek results?

Instead of using a single number to select a single element, you can also select
multiple elements by passing a vector inside the square brackets. For example,

poker_vector[c(1,5)]
selects the first and the fifth element of poker_vector.
Instructions
100 XP

• Assign the poker results of Tuesday, Wednesday and Thursday to the
variable poker_midweek.
• Assign the roulette results of Thursday and Friday to the
variable roulette_endweek.

Take Hint (-30 XP)

> # Casino winnings from Monday to Friday

> poker_vector <- c(140, -50, 20, -120, 240)

> roulette_vector <- c(-24, -50, 100, -350, 10)


> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

> names(poker_vector) <- days_vector

> names(roulette_vector) <- days_vector

>

> # Mid-week poker results: poker_midweek

> poker_midweek <- poker_vector[c(2,3,4)]

> poker_midweek

  Tuesday Wednesday  Thursday
      -50        20      -120

> # End-of-week roulette results: roulette_endweek

> roulette_endweek <- roulette_vector[c(4,5)]

> roulette_endweek

Thursday   Friday
    -350       10

Well done! Continue to the next exercise to specialize in vector selection some more!

Vector selection: the good times (3)


Now, selecting multiple successive elements of poker_vector with c(2,3,4) is not
very convenient. Many statisticians are lazy people by nature, so they created an
easier way to do this: c(2,3,4) can be abbreviated to 2:4, which generates a
vector with all natural numbers from 2 up to 4. Try it out in the console!
So, another way to find the mid-week results is poker_vector[2:4]. Notice how the
vector 2:4 is placed between the square brackets to select element 2 up to 4. You
don't have to use the c() function if you're using the shortcut with the colon.
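The sketch below checks that both ways of writing the index select the same elements. It also prints the class of each index vector, a detail not mentioned above: 2:4 produces integers while c(2, 3, 4) produces doubles, which makes no difference for subsetting.

```r
poker_vector <- c(Monday = 140, Tuesday = -50, Wednesday = 20,
                  Thursday = -120, Friday = 240)

print(2:4)

# Both index vectors pick Tuesday, Wednesday and Thursday
same_selection <- identical(poker_vector[2:4], poker_vector[c(2, 3, 4)])
print(same_selection)

# The index vectors themselves differ in storage type only
print(class(2:4))
print(class(c(2, 3, 4)))
```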
Instructions
100 XP

• Assign to roulette_subset the roulette results from Tuesday to Friday inclusive
by making use of :.
• Print the resulting variable to the console.

Take Hint (-30 XP)
> # Casino winnings from Monday to Friday
> poker_vector <- c(140, -50, 20, -120, 240)

> roulette_vector <- c(-24, -50, 100, -350, 10)

> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

> names(poker_vector) <- days_vector

> names(roulette_vector) <- days_vector

>

> # Roulette results for Tuesday to Friday inclusive: roulette_subset

> roulette_subset <- roulette_vector[2:5]

>

> # Print roulette_subset

> roulette_subset

  Tuesday Wednesday  Thursday    Friday
      -50       100      -350        10

Awesome! The colon operator is extremely useful and very often used in R programming, so
remember it well. Have you noticed that the elements in poker_vector and roulette_vector also
have names associated with them? You can also subset vectors using these names, remember?

Selection by name (1)


Another way to tackle the previous exercise is by using the names of the vector
elements (Monday, Tuesday, ...) instead of their numeric positions. For example,

poker_vector["Monday"]
will select the first element of poker_vector since "Monday" is the name of that first
element.
Instructions
100 XP

• Select the fourth element, corresponding to Thursday, from roulette_vector.
Name it roulette_thursday.
• Select Tuesday's poker gains using subsetting by name. Assign the result
to poker_tuesday.

Take Hint (-30 XP)
> # Casino winnings from Monday to Friday

> poker_vector <- c(140, -50, 20, -120, 240)

> roulette_vector <- c(-24, -50, 100, -350, 10)

> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

> names(poker_vector) <- days_vector

> names(roulette_vector) <- days_vector

>

> # Select Thursday's roulette gains: roulette_thursday

> roulette_thursday <- roulette_vector["Thursday"]

>

> # Select Tuesday's poker gains: poker_tuesday

> poker_tuesday <- poker_vector["Tuesday"]

>

Selection by name (2)


Just like selecting single elements using numerics extends naturally to selecting
multiple elements, you can also use a vector of names. As an example, try

roulette_vector[c("Monday","Wednesday")]
Of course you can't use the colon trick here: "Monday":"Wednesday" will generate
an error.
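If you do want a range of days by name, one possible workaround is to look up the positions of the two names first and apply the colon to those positions. This is only a sketch, not the course's approach: match() is a base R function, while the variable names colon_on_names and positions are made up here. The first part uses tryCatch() to show the error without stopping the script.

```r
poker_vector <- c(Monday = 140, Tuesday = -50, Wednesday = 20,
                  Thursday = -120, Friday = 240)

# The colon operator needs numbers, so applying it to names fails
colon_on_names <- tryCatch(
  suppressWarnings("Monday":"Wednesday"),
  error = function(e) "error"
)
print(colon_on_names)

# Workaround: translate the names to positions, then use the colon
positions   <- match(c("Monday", "Wednesday"), names(poker_vector))
poker_start <- poker_vector[positions[1]:positions[2]]
print(poker_start)
```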
Instructions
100 XP

• Create a vector containing the poker gains for the first three days of the week;
name it poker_start.
• Using the function mean(), calculate the average poker gains during these first
three days. Assign the result to a variable avg_poker_start.

Take Hint (-30 XP)

> # Casino winnings from Monday to Friday

> poker_vector <- c(140, -50, 20, -120, 240)

> roulette_vector <- c(-24, -50, 100, -350, 10)

> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")


> names(poker_vector) <- days_vector

> names(roulette_vector) <- days_vector

>

> # Select the first three elements from poker_vector: poker_start

> poker_start <- poker_vector[1:3]

>

> # Calculate the average poker gains during the first three days: avg_poker_start

> avg_poker_start <- mean(poker_start)

Good job! Next to subsetting vectors by index or by name, you can also use logical vectors. The
next exercises will test you on this.

Selection by logicals (1)


There are basically three ways to subset vectors: by using the indices, by using the
names (if the vectors are named) and by using logical vectors. Filip already told
you about the internals in the instructional video. As a refresher, have a look at the
following statements to select elements from poker_vector, which are all
equivalent:
# selection by index
poker_vector[c(1,3)]

# selection by name
poker_vector[c("Monday", "Wednesday")]

# selection by logicals
poker_vector[c(TRUE, FALSE, TRUE, FALSE, FALSE)]
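To verify that the three selections return the same thing, you can store each result and compare them. The names by_index, by_name and by_logical exist only in this sketch.

```r
poker_vector <- c(Monday = 140, Tuesday = -50, Wednesday = 20,
                  Thursday = -120, Friday = 240)

by_index   <- poker_vector[c(1, 3)]
by_name    <- poker_vector[c("Monday", "Wednesday")]
by_logical <- poker_vector[c(TRUE, FALSE, TRUE, FALSE, FALSE)]

print(identical(by_index, by_name) && identical(by_name, by_logical))
```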
Instructions
100 XP

• Assign the roulette results from the first, third and fifth day to roulette_subset.
• Select the first three days from poker_vector using a vector of logicals. Assign the
result to poker_start.

Take Hint (-30 XP)

> # Casino winnings from Monday to Friday

> poker_vector <- c(140, -50, 20, -120, 240)

> roulette_vector <- c(-24, -50, 100, -350, 10)

> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")


> names(poker_vector) <- days_vector

> names(roulette_vector) <- days_vector

>

> # Roulette results for day 1, 3 and 5: roulette_subset

> roulette_subset <- roulette_vector[c(TRUE, FALSE, TRUE, FALSE, TRUE)]

>

> # Poker results for first three days: poker_start

> poker_start <- poker_vector[c(TRUE, TRUE, TRUE, FALSE, FALSE)]

Nice one! Using logical vectors to perform subsetting might seem somewhat tedious, but its true
power will become clear in the next exercise!

Selection by logicals (2)


By making use of a combination of comparison operators and subsetting using
logicals, you can investigate your casino performance in a more pro-active way.

The (logical) comparison operators known to R are:

• < for less than
• > for greater than
• <= for less than or equal to
• >= for greater than or equal to
• == for equal to each other
• != not equal to each other

Experiment with these operators in the console. The result will be a logical vector,
which you can use to perform subsetting! This means that instead of selecting a
subset of days to investigate yourself like before, you can simply ask R to return
only those days where you realized a positive return for poker.
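A few of these operators in action on the poker data, as a sketch. The thresholds are arbitrary choices for illustration; poker_losses and poker_big_wins are names made up here.

```r
poker_vector <- c(Monday = 140, Tuesday = -50, Wednesday = 20,
                  Thursday = -120, Friday = 240)

# A comparison returns one logical per element
is_loss <- poker_vector < 0
print(is_loss)

# The logical vector can go straight into the square brackets
poker_losses   <- poker_vector[is_loss]
poker_big_wins <- poker_vector[poker_vector >= 140]

print(poker_losses)
print(poker_big_wins)
```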

Instructions
100 XP

• Check if your poker winnings are positive on the different days of the week (i.e. > 0),
and assign this to selection_vector.
• Assign the amounts that you won on the profitable days, so a vector, to the
variable poker_profits, by using selection_vector.

> # Casino winnings from Monday to Friday

> poker_vector <- c(140, -50, 20, -120, 240)

> roulette_vector <- c(-24, -50, 100, -350, 10)

> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

> names(poker_vector) <- days_vector

> names(roulette_vector) <- days_vector

>

> # Create logical vector corresponding to profitable poker days: selection_vector

> selection_vector <- poker_vector > 0

> selection_vector

   Monday   Tuesday Wednesday  Thursday    Friday
     TRUE     FALSE      TRUE     FALSE      TRUE

>

> # Select amounts for profitable poker days: poker_profits

> poker_profits <- poker_vector[selection_vector]

> poker_profits

   Monday Wednesday    Friday
      140        20       240

>
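
A logical vector can be used as an index as it is; comparing it to TRUE first is not needed. The short sketch below, which reuses the poker data from the exercise, checks that both ways of indexing give the same result. The names profits_direct, profits_compared, same_selection and same_profits are introduced here only for illustration.

```r
# Same poker data as in the exercise
poker_vector <- c(140, -50, 20, -120, 240)
names(poker_vector) <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

selection_vector <- poker_vector > 0

# Comparing a logical vector to TRUE gives back the same logical values
same_selection <- identical(selection_vector == TRUE, selection_vector)

# Indexing with or without the extra comparison selects the same days
profits_direct   <- poker_vector[selection_vector]
profits_compared <- poker_vector[selection_vector == TRUE]
same_profits     <- identical(profits_direct, profits_compared)
```

Writing out TRUE rather than T is also a little safer, because T is an ordinary variable that can be reassigned.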

Selection by logicals (3)


To fully prepare you for the challenge that's coming, you'll do a final analysis of
your casino ventures. This time, you'll use your newly acquired skills to perform
advanced selection on roulette_vector.
Along the way, you'll need the sum() function. You used it before to calculate the
total winnings, that is, on a numeric vector. However, you can also use sum() on a
logical vector; it simply counts the number of vector elements that are TRUE.
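
The counting works because R treats TRUE as 1 and FALSE as 0 in arithmetic. A minimal sketch, using a made-up logical vector named flags that is not part of the exercise:

```r
# Hypothetical logical vector, used only for illustration
flags <- c(TRUE, FALSE, TRUE, TRUE)

# In arithmetic, TRUE counts as 1 and FALSE as 0
as_numbers <- as.numeric(flags)   # 1 0 1 1

# So sum() on a logical vector counts the TRUE elements
n_true <- sum(flags)              # 3
```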
Instructions
100 XP
• Assign the amounts that you made on the days that you ended positively for
roulette to the variable roulette_profits. This vector thus contains the positive
winnings of roulette_vector. You can do this with a one-liner!
• Calculate the sum of the amounts on these profitable days; assign the result
to roulette_total_profit.
• Find out how many roulette days were profitable, using the sum() function. Store
the result in a variable num_profitable_days.


> # Casino winnings from Monday to Friday

> poker_vector <- c(140, -50, 20, -120, 240)

> roulette_vector <- c(-24, -50, 100, -350, 10)

> days_vector <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

> names(poker_vector) <- days_vector

> names(roulette_vector) <- days_vector

>

> # Select amounts for profitable roulette days: roulette_profits

> roulette_profits <- roulette_vector[roulette_vector > 0]

>

> # Sum of the profitable roulette days: roulette_total_profit

> roulette_total_profit <- sum(roulette_profits)

>

> # Number of profitable roulette days: num_profitable_days

> num_profitable_days <- sum(roulette_vector > 0)

>

> roulette_profits

Wednesday    Friday
      100        10

> roulette_total_profit

[1] 110

> num_profitable_days

[1] 2

Awesome! If you inspect the variable num_profitable_days, you'll see that it is equal to 2, meaning
that you only had two profitable roulette days. You can conclude that roulette is not your game,
right?

Vectors: place your bets!


By now, you should have gained some insights on how your casino habits are
actually working out for you. In fact, why not decide on changing your game
completely? Let's dive into the world of Blackjack for once, and analyze some
game outcomes here. In short, blackjack is a game where you have to ask for
cards until you arrive at a sum that is as close to 21 as possible. However, if you
exceed 21, you've lost. You can be greedy and go for 21, or you can be careful
and settle for 16 or so. A player wins when his or her sum, or score, exceeds that
of the house.

The sums for the player's last 7 games are stored in player; the house's scores
are contained in house. Both are available in the workspace. In both cases, the
scores were never higher than 21.
Instructions
200 XP

• With square brackets, select the player's score for the third game, using any of the
techniques that you've learned about. Store the result in player_third.
• Subset the player vector to only select the scores that exceeded the scores
of house, so the scores that had the player win. Use subsetting in combination with
the relational operator >. Assign the subset to the variable winning_scores.
• Count the number of times the score inside player was lower than 18. This time,
you should use a relational operator in combination with sum(). Save the resulting
value in a new variable, n_low_score.


> # Select the player's score for the third game: player_third

> player_third <- player[3]

>

> # Select the scores where player exceeds house: winning_scores

> winning_scores <- player[player > house]

>

> # Count number of times player < 18: n_low_score

> n_low_score <- sum(player < 18)


>

> player_third

[1] 20

> winning_scores

[1] 17 21 18

> n_low_score

[1] 3

Awesome! This exercise concludes the chapter on vectors. The next module will introduce you
to the two-dimensional version of vectors: matrices. Close this tab to continue your learning on
edX.
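
The player and house vectors are preloaded in the course workspace and are not shown in this document. To try the blackjack exercise outside the course, they have to be defined first. The sketch below uses assumed scores, chosen only so that they agree with the output printed above (third score 20, winning scores 17 21 18, three scores below 18); the actual course data may differ.

```r
# Assumed scores for 7 games; NOT taken from the source, only chosen to
# agree with the output printed above
player <- c(14, 17, 20, 21, 20, 18, 14)
house  <- c(20, 15, 21, 20, 20, 17, 19)

# Score of the third game
player_third <- player[3]                 # 20

# Scores of the games where the player beat the house
winning_scores <- player[player > house]  # 17 21 18

# Number of games with a player score below 18
n_low_score <- sum(player < 18)           # 3
```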

You have finished the chapter "Subsetting Vectors"!
