Loops. Programming in R

The document explains different types of loops in R, including for loops, while loops, repeat loops, and apply-family functions, detailing when to use each type. It provides examples of how to implement these loops for various data structures such as vectors and data frames, along with explanations of key concepts like the placeholder variable 'i'. Additionally, it covers practical applications of loops for processing sales data and categorizing values based on conditions.

Uploaded by

soloviovalada
Copyright
© All Rights Reserved

Loops

When to Use Which Loop?

● for loops: When you need to iterate over a fixed sequence.
● while loops: When the number of iterations depends on a condition that is checked before each pass.
● repeat loops: When the loop should run until a specific condition is met (not necessarily known beforehand). The body always runs at least once, and a break statement ends it.
● Apply-family functions: When dealing with vectors, lists, or data frames and performance or conciseness is essential.

What are Loops?

Loops allow you to repeat a set of instructions multiple times. In R, loops are used to iterate over a sequence
of elements, perform operations on them, or generate outputs.

What is i in a Loop?

● i is a common placeholder variable used in loops to represent the current iteration or item in a
sequence.
● It can be any name, but i is widely used by convention.
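
As a small illustration of the naming point, the sketch below uses a descriptive loop variable instead of i. The vector fruits and the name fruit are hypothetical and chosen only for this example.

```r
# Hypothetical data for illustration only
fruits <- c("apple", "banana", "cherry")

# Collect an uppercase copy of each element
collected <- character(0)
for (fruit in fruits) {
  # `fruit` holds the current element, exactly as `i` would
  collected <- c(collected, toupper(fruit))
}

print(collected)

# After the loop, the variable still holds the last element
print(fruit)
```

A descriptive name such as fruit can make the loop body easier to read when the loop runs over values rather than positions.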

Types of Loops in R

There are several types of loops in R:

1. for Loop
● Iterates over a sequence (e.g., numbers, elements of a vector).

Syntax:
for (i in sequence) {
  # Code to run for each element in the sequence
}

Example:
strings <- c("abcd", "cdab", "cabdab", "c ab", "ab")

# Print each string in the vector
for (i in strings) {
  print(i)
}

2. while Loop
● Repeats as long as a condition is TRUE.

Syntax:
while (condition) {
  # Code to execute while the condition is TRUE
}

Example:
x <- 1
while (x <= 5) {
  print(x)      # Prints 1 through 5
  x <- x + 1    # Without this update the loop would never end
}
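
The example above could just as well be a for loop, because the number of passes is known. A while loop is more natural when the count is not known in advance. The sketch below assumes a hypothetical starting balance of 1000 growing by 5% per year and counts the years until it reaches 2000.

```r
# Hypothetical values for illustration only
balance <- 1000   # starting amount
rate <- 0.05      # yearly growth rate
years <- 0

# Keep going until the balance has at least doubled
while (balance < 2000) {
  balance <- balance * (1 + rate)
  years <- years + 1
}

cat("Years needed:", years, "\n")
```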

3. repeat Loop
● Repeats indefinitely until explicitly stopped with a break statement.

Syntax:
repeat {
  # Code to execute
  if (condition) break
}

Example:
x <- 1
repeat {
  print(x)
  x <- x + 1
  if (x > 5) break   # Stop once x passes 5
}

# Output:
# [1] 1
# [1] 2
# [1] 3
# [1] 4
# [1] 5
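
A repeat loop fits cases where the stopping point depends on something that happens inside the body. The sketch below simulates rolling a die until a 6 appears. The seed value is an arbitrary assumption, used only to make the run reproducible.

```r
set.seed(1)   # arbitrary seed, for reproducibility only

rolls <- 0
repeat {
  roll <- sample(1:6, 1)   # one simulated die roll
  rolls <- rolls + 1
  if (roll == 6) break     # stop as soon as a 6 comes up
}

cat("Rolled a 6 after", rolls, "attempt(s)\n")
```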

4. Using apply Functions

● While not technically loops, functions like sapply, lapply, and apply are often used to perform operations over lists, vectors, or data frames.

Example with sapply:

strings <- c("abcd", "cdab", "cabdab", "c ab", "ab")

# Convert each string to uppercase
upper_case <- sapply(strings, toupper)
print(upper_case)
# Output:
# abcd cdab cabdab c ab ab
# "ABCD" "CDAB" "CABDAB" "C AB" "AB"
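
Only sapply is shown above. The following sketch, offered as an illustration rather than a complete tour, compares it with lapply, vapply, and apply. The matrix m is a hypothetical example.

```r
strings <- c("abcd", "cdab", "cabdab", "c ab", "ab")

# lapply always returns a list, one element per input
lengths_list <- lapply(strings, nchar)

# sapply simplifies the result to a vector when it can
lengths_vec <- sapply(strings, nchar)

# vapply also returns a vector, but checks the declared result type
lengths_checked <- vapply(strings, nchar, integer(1))

# apply works over the rows (1) or columns (2) of a matrix
m <- matrix(1:6, nrow = 2)   # hypothetical 2 x 3 matrix
row_sums <- apply(m, 1, sum)
col_sums <- apply(m, 2, sum)

print(lengths_vec)
print(row_sums)
print(col_sums)
```

vapply is often preferred inside functions because it fails loudly if a result does not have the expected type or length.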

Key Examples

The following examples illustrate the loop concepts above:

Basic Loop to Print Elements

strings <- c("abcd", "cdab", "cabdab", "c ab", "ab")

# Using for loop over the values
for (i in strings) {
  print(i)
}

# Using for loop with indices
for (i in seq_along(strings)) {
  print(strings[i])
}

Key Considerations

● If you need the index: Use seq_along(string_name). This is particularly useful when modifying the vector or needing both the index and the value. It is safer than 1:length(string_name), which counts down from 1 to 0 when the vector is empty.
● If you only need the values: Use for (i in string_name) for simplicity and readability.

Loop with Conditional Logic

for (i in strings) {
  print(grepl("cd", i))  # Check if 'cd' is in each string
}
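
Because grepl accepts a whole vector, the same check can be written without a loop. A minimal sketch:

```r
strings <- c("abcd", "cdab", "cabdab", "c ab", "ab")

# One call checks every string and returns a logical vector
has_cd <- grepl("cd", strings)
print(has_cd)
# [1]  TRUE  TRUE FALSE FALSE FALSE

# The logical vector can then be used to keep only the matches
print(strings[has_cd])
# [1] "abcd" "cdab"
```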

Using sapply for Vectorized Operations

# Convert strings to uppercase
sapply(strings, toupper)

for() examples and cases

1. Vectors
A vector is a one-dimensional collection of data, such as numbers, text, or logical values.

Example: Processing Daily Sales Data

1) Imagine you have a vector of 1,000 sales values.

# Generate a large vector of daily sales (random sales amounts)
sales <- rnorm(1000, mean = 500, sd = 100)  # 1000 sales values

# Loop to process each sales value
for (value in sales) {
  if (value > 600) {
    print(value)  # Print sales values greater than 600
  }
}

Explanation:

sales: A numeric vector containing 1,000 randomly generated sales values.

value: This is not a column or a predefined variable. It’s a temporary name (placeholder) that represents one
element of the sales vector in each iteration. For example:

a) In the first iteration, value equals the first sale in the sales vector.
b) In the second iteration, value equals the second sale, and so on.

if (value > 600): Checks if the current value is greater than 600.

print(value): Displays the value if the condition is true.
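
For comparison, the same filter can be expressed with logical indexing instead of a loop. This is a sketch under the assumption that the values only need to be collected, not processed one at a time. The seed is arbitrary and added for reproducibility.

```r
set.seed(123)  # arbitrary seed, for reproducibility only
sales <- rnorm(1000, mean = 500, sd = 100)

# Loop version: collect values greater than 600 one by one
high_loop <- numeric(0)
for (value in sales) {
  if (value > 600) {
    high_loop <- c(high_loop, value)
  }
}

# Vectorized version: one comparison over the whole vector
high_vec <- sales[sales > 600]

# Both approaches select the same values in the same order
print(identical(high_loop, high_vec))
```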

2) A loop that adds a column called "Level", set to "small" if the value in the "Sales" column is less than 20, "medium" if it is between 20 and 40 (inclusive), and "high" if it is above 40.

# Create a sample data frame with a "Sales" column
set.seed(42)  # For reproducibility
data <- data.frame(
  Product = paste("Product", 1:10),          # Product names
  Sales = sample(1:50, 10, replace = TRUE)   # Random sales values between 1 and 50
)

# Initialize the "Level" column as empty
data$Level <- NA  # Set the new column with NA values

# Loop through each row to assign "Level"
for (i in seq_len(nrow(data))) {
  if (data$Sales[i] < 20) {
    data$Level[i] <- "small"
  } else if (data$Sales[i] <= 40) {
    data$Level[i] <- "medium"   # Reached only when Sales is 20 to 40
  } else {
    data$Level[i] <- "high"
  }
}

# Print the updated data frame
print(data)
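
The same categorization can also be sketched without a loop by nesting ifelse, which works on the whole column at once. The data frame is rebuilt here so that the block runs on its own.

```r
set.seed(42)  # For reproducibility
data <- data.frame(
  Product = paste("Product", 1:10),
  Sales = sample(1:50, 10, replace = TRUE)
)

# ifelse evaluates the condition for every row in one call
data$Level <- ifelse(data$Sales < 20, "small",
                     ifelse(data$Sales <= 40, "medium", "high"))

print(data)
```

For more than three categories, cut() with explicit breaks may be easier to read than deeply nested ifelse calls.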

3) A loop that adds a column called "Online", set to "yes" if the "Type" column (with values "online" and "offline") says "online" and to "no" otherwise. Afterwards, count how many instances are online and how many are offline, then sum the values in the "Sales" column by "Type" (one total for "online", one for "offline").

Step 1: Adding an "Online" Column Using a Loop


# Sample data frame with "Type" and "Sales" columns
data <- data.frame(
  Product = paste("Product", 1:10),
  Sales = sample(1:50, 10, replace = TRUE),                  # Random sales values
  Type = sample(c("online", "offline"), 10, replace = TRUE)  # Randomly assign "online" or "offline"
)

# Initialize the "Online" column as empty
data$Online <- NA  # Set the new column with NA values

# Loop through each row to assign "yes" or "no" to "Online"
for (i in seq_len(nrow(data))) {
  if (data$Type[i] == "online") {
    data$Online[i] <- "yes"
  } else {
    data$Online[i] <- "no"
  }
}

# Print the updated data frame
print(data)

Step 2: Count Instances of "online" and "offline"


# Count instances of "online" and "offline"
online_count <- sum(data$Type == "online")
offline_count <- sum(data$Type == "offline")

# Print the counts
cat("Number of online instances:", online_count, "\n")
cat("Number of offline instances:", offline_count, "\n")

Step 3: Sum "Sales" by "Type"

# Calculate total sales by "Type"
online_sales <- sum(data$Sales[data$Type == "online"])
offline_sales <- sum(data$Sales[data$Type == "offline"])

# Print the results
cat("Total sales for online:", online_sales, "\n")
cat("Total sales for offline:", offline_sales, "\n")
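
Steps 2 and 3 handle each type by hand. As a sketch of a more general approach, table and tapply produce the counts and totals for every value of "Type" in one call each. The data frame is rebuilt here so that the block runs on its own, and the seed is an arbitrary assumption.

```r
set.seed(42)  # arbitrary seed, for reproducibility only
data <- data.frame(
  Product = paste("Product", 1:10),
  Sales = sample(1:50, 10, replace = TRUE),
  Type = sample(c("online", "offline"), 10, replace = TRUE)
)

# Count rows per type
type_counts <- table(data$Type)
print(type_counts)

# Sum Sales within each type
sales_by_type <- tapply(data$Sales, data$Type, sum)
print(sales_by_type)

# The same totals as a data frame
print(aggregate(Sales ~ Type, data = data, FUN = sum))
```

This version keeps working unchanged if a third type is added later.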

2. Sequences
A sequence is a set of numbers generated systematically. Instead of processing the values directly, you can
loop through their positions (indices).

Example: Processing Sales by Index

Suppose you need to access sales values using their position.

# Loop through indices instead of directly through values
for (i in seq_along(sales)) {
  if (sales[i] > 600) {
    print(paste("Sale at position", i, "has value", sales[i]))
  }
}

Explanation:

● seq_along(sales): Creates a sequence from 1 to the total number of sales (1, 2, 3, ..., 1000). Unlike 1:length(sales), it produces an empty sequence when the vector is empty.
● i: Represents the current position in the sequence.
● sales[i]: Accesses the sales value at position i.
○ For example, if i = 10, then sales[i] is the 10th value in the sales vector.
● This approach is useful when you need the position of each value in addition to the value itself.
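
When only the matching positions are needed, which returns them directly. A minimal sketch, with an arbitrary seed added for reproducibility:

```r
set.seed(123)  # arbitrary seed, for reproducibility only
sales <- rnorm(1000, mean = 500, sd = 100)

# Positions of all sales greater than 600
positions <- which(sales > 600)

# Pair each position with its value
high_sales_table <- data.frame(
  position = positions,
  value = sales[positions]
)

print(head(high_sales_table))
```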

3. Lists - not relevant

4. Data Frames
A data frame is a table where columns represent variables and rows represent observations. You can loop
through rows or columns.

Example: Processing Product Data by Row

Suppose you have a data frame with product details.

# Create a data frame of products
products <- data.frame(
  Product = paste("Product", 1:10000),                         # Product names
  Category = sample(c("A", "B", "C"), 10000, replace = TRUE),  # Categories
  Price = runif(10000, min = 50, max = 500)                    # Random prices
)

# Loop through each row of the data frame
for (i in seq_len(nrow(products))) {
  if (products$Category[i] == "A" && products$Price[i] > 300) {
    print(paste(products$Product[i], "is in Category A and costs",
                products$Price[i]))
  }
}

Explanation:

● products: A data frame containing 10,000 rows and columns for Product, Category, and Price.
● seq_len(nrow(products)): Loops through row indices (1, 2, 3, ..., 10000).
● products$Category[i]: Accesses the Category column for the ith row.
● products$Price[i]: Accesses the Price column for the ith row.
● products$Product[i]: Accesses the Product column for the ith row.
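
The row-by-row check can also be sketched as a single logical subset. Note the use of & (element-wise) rather than && (single value) when comparing whole columns. The data frame is rebuilt with fewer rows here, and the seed is an arbitrary assumption.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
products <- data.frame(
  Product = paste("Product", 1:100),
  Category = sample(c("A", "B", "C"), 100, replace = TRUE),
  Price = runif(100, min = 50, max = 500)
)

# Keep rows in Category A that cost more than 300
expensive_a <- products[products$Category == "A" & products$Price > 300, ]

print(nrow(expensive_a))
print(head(expensive_a))
```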

5. Expressions
You can use dynamic expressions or subsets of data directly in a loop.

Example: Processing Filtered Sales

Suppose you want to process only sales greater than 400.

# Subset sales values greater than 400
high_sales <- sales[sales > 400]

# Loop through high sales values
for (value in high_sales) {
  print(value)  # Print each high sale value
}

Explanation:
● sales > 400: Creates a logical vector indicating which sales are greater than 400.
● sales[sales > 400]: Subsets the sales vector to include only values greater than 400.
● value: Represents one value from the filtered vector during each iteration.
