Loops. Programming in R
Loops allow you to repeat a set of instructions multiple times. In R, loops are used to iterate over a sequence
of elements, perform operations on them, or generate outputs.
What is i in a Loop?
● i is a common placeholder variable used in loops to represent the current iteration or item in a
sequence.
● It can be any name, but i is widely used by convention.
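To illustrate that the name of the loop variable is only a convention, here is a minimal sketch of two loops that differ only in that name. The fruits vector and the variable names are made up for this illustration.

```r
# Hypothetical example vector (not from the notes above)
fruits <- c("apple", "banana", "cherry")

# Loop variable named i
upper_i <- character(0)
for (i in fruits) {
  upper_i <- c(upper_i, toupper(i))
}

# Same loop, but the loop variable is named fruit
upper_fruit <- character(0)
for (fruit in fruits) {
  upper_fruit <- c(upper_fruit, toupper(fruit))
}

# Both loops produce the same result
print(identical(upper_i, upper_fruit))
```

A descriptive name such as fruit can make the loop body easier to read when it refers to the variable several times.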
Types of Loops in R
1. for Loop
● Iterates over a sequence (e.g., numbers, elements of a vector).
Syntax:
for (i in sequence) {
  # Code to run for each element in the sequence
}
Example:
strings <- c("abcd", "cdab", "cabdab", "c ab", "ab")
for (i in strings) {
  print(grepl("cd", i)) # TRUE if "cd" appears in the current string
}
2. while Loop
● Repeats as long as a condition is TRUE.
Syntax:
while (condition) {
  # Code to execute while the condition is TRUE
}
Example:
x <- 1
while (x <= 5) {
  print(x)
  x <- x + 1
}
3. repeat Loop
● Repeats indefinitely until explicitly stopped with a break statement.
Syntax:
repeat {
  # Code to execute
  if (condition) break
}
Example:
x <- 1
repeat {
  print(x)
  x <- x + 1
  if (x > 5) break
}
# Output:
# [1] 1
# [1] 2
# [1] 3
# [1] 4
# [1] 5
The following examples illustrate what a loop can iterate over. A for loop over a vector visits the values themselves, as in this loop over the strings vector from the for loop example:
for (i in strings) {
  print(grepl("cd", i)) # Check if 'cd' is in each string
}
1. Vectors
A vector is a one-dimensional collection of data, such as numbers, text, or logical values.
Explanation (for a loop of the form for (value in sales)):
● value: This is not a column or a predefined variable. It is a temporary name (placeholder) that represents one element of the sales vector in each iteration.
○ In the first iteration, value equals the first sale in the sales vector.
○ In the second iteration, value equals the second sale, and so on.
● if (value > 600): Checks whether the current value is greater than 600.
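The explanation above describes a loop whose code is not shown here. A minimal sketch of what such a loop could look like follows. The contents of sales are simulated, and counting the matches is an assumed use of the check, not necessarily what the original example did.

```r
# Hypothetical data: 1000 simulated sales values (assumption)
set.seed(1)
sales <- sample(100:1000, 1000, replace = TRUE)

high_count <- 0
for (value in sales) {      # value takes each element of sales in turn
  if (value > 600) {        # check the current element
    high_count <- high_count + 1
  }
}

print(high_count)           # number of sales above 600
```

The same count can be obtained without a loop as sum(sales > 600), which is a handy way to check the loop's result.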
2) Example: a loop that adds a column called "Level", set to "small" if the value in the Sales column is less than 20, "medium" if it is between 20 and 40, and "high" if it is above 40.
# Create a sample data frame with a "Sales" column
set.seed(42) # For reproducibility
data <- data.frame(
  Product = paste("Product", 1:10),        # Product names
  Sales = sample(1:50, 10, replace = TRUE) # Random sales values between 1 and 50
)
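The data frame above is set up, but the loop that fills the Level column is not shown. One possible version is sketched below. It assumes that the boundary values 20 and 40 both count as "medium", which the task description leaves open.

```r
# Same sample data as above, repeated so the sketch runs on its own
set.seed(42)
data <- data.frame(
  Product = paste("Product", 1:10),
  Sales = sample(1:50, 10, replace = TRUE)
)

# Start with an empty character column, then fill it row by row
data$Level <- NA_character_
for (i in seq_len(nrow(data))) {
  if (data$Sales[i] < 20) {
    data$Level[i] <- "small"
  } else if (data$Sales[i] <= 40) {  # assumption: 20 and 40 count as "medium"
    data$Level[i] <- "medium"
  } else {
    data$Level[i] <- "high"
  }
}

print(data)
```

The conditions are checked in order, so the "medium" branch only needs an upper bound: any value reaching it is already known to be at least 20.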
3) Example: a loop that adds a column called "Online", set to "yes" if the Type column (with values "offline" and "online") says "online", and to "no" otherwise. A further line counts how many online and how many offline instances there are. A separate code chunk then sums the values in the Sales column by Type (one total for "online" and one for "offline") and shows the output.
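A minimal sketch of this task follows. The six-row data frame and its Type values are invented for the illustration, and table() and tapply() are one of several ways to count and sum by group.

```r
# Hypothetical data with a Type column (values invented for this sketch)
data <- data.frame(
  Product = paste("Product", 1:6),
  Sales = c(12, 35, 48, 20, 7, 41),
  Type = c("online", "offline", "online", "online", "offline", "offline"),
  stringsAsFactors = FALSE
)

# Add the Online column row by row
data$Online <- NA_character_
for (i in seq_len(nrow(data))) {
  if (data$Type[i] == "online") {
    data$Online[i] <- "yes"
  } else {
    data$Online[i] <- "no"
  }
}

# Count the instances of each type
type_counts <- table(data$Type)
print(type_counts)    # offline: 3, online: 3

# Separate chunk: sum Sales within each Type
sales_by_type <- tapply(data$Sales, data$Type, sum)
print(sales_by_type)  # offline: 83, online: 80
```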
2. Sequences
A sequence is a set of numbers generated systematically. Instead of processing the values directly, you can
loop through their positions (indices).
Explanation:
● 1:length(sales): Creates a sequence from 1 to the total number of sales (1, 2, 3, ..., 1000).
● i: Represents the current position in the sequence.
● sales[i]: Accesses the sales value at position i.
○ For example, if i = 10, then sales[i] is the 10th value in the sales vector.
● This approach is useful when you need the position of each value in addition to the value itself.
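The bullets above can be sketched as a loop over positions. The sales values are simulated, and the threshold of 900 is an arbitrary choice for this illustration.

```r
# Hypothetical data: 1000 simulated sales values (assumption)
set.seed(1)
sales <- sample(100:1000, 1000, replace = TRUE)

high_positions <- integer(0)
for (i in 1:length(sales)) {   # i is the position, not the value
  if (sales[i] > 900) {        # sales[i] is the value at position i
    high_positions <- c(high_positions, i)
  }
}

print(head(high_positions))    # first few positions with sales above 900
```

When the vector might be empty, seq_along(sales) is a safer choice than 1:length(sales), because 1:0 counts down through 1 and 0 instead of producing an empty sequence.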
4. Data Frames
A data frame is a table where columns represent variables and rows represent observations. You can loop
through rows or columns.
Explanation:
● products: A data frame containing 10,000 rows and columns for Product, Category, and Price.
● 1:nrow(products): Loops through row indices (1, 2, 3, ..., 10000).
● products$Category[i]: Accesses the Category column for the ith row.
● products$Price[i]: Accesses the Price column for the ith row.
● products$Product[i]: Accesses the Product column for the ith row.
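A minimal sketch of a row-by-row loop over such a data frame follows. The products data is simulated, and the category names, price range, and filter condition are all assumptions made for the illustration.

```r
# Hypothetical data frame with 10,000 rows (all values simulated)
set.seed(1)
n <- 10000
products <- data.frame(
  Product = paste("Product", 1:n),
  Category = sample(c("Electronics", "Clothing", "Food"), n, replace = TRUE),
  Price = round(runif(n, min = 1, max = 500), 2),
  stringsAsFactors = FALSE
)

# Collect the names of electronics products priced above 450
expensive_electronics <- character(0)
for (i in 1:nrow(products)) {
  if (products$Category[i] == "Electronics" && products$Price[i] > 450) {
    expensive_electronics <- c(expensive_electronics, products$Product[i])
  }
}

print(length(expensive_electronics))
print(head(expensive_electronics))
```

Each iteration reads several columns of the same row i, which is the main reason to loop over row indices and not over a single column's values.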
5. Expressions
You can use dynamic expressions or subsets of data directly in a loop.
Explanation:
● sales > 400: Creates a logical vector indicating which sales are greater than 400.
● sales[sales > 400]: Subsets the sales vector to include only values greater than 400.
● value: Represents one value from the filtered vector during each iteration.
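The bullets above can be sketched as a loop over a filtered vector. The sales values are simulated, and summing the filtered values is an assumed use for this illustration.

```r
# Hypothetical data: 1000 simulated sales values (assumption)
set.seed(1)
sales <- sample(100:1000, 1000, replace = TRUE)

total_high <- 0
for (value in sales[sales > 400]) {  # loop only over values above 400
  total_high <- total_high + value
}

print(total_high)                    # sum of all sales above 400
```

The subset sales[sales > 400] is evaluated once, before the first iteration, so changing sales inside the loop body would not change which values the loop visits.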