Subset Creation in R

Uploaded by

Kakashi Hatake

Subset Creation in R

November 19, 2024


Consider the following data frame given below:

subject class marks
1 1 56
2 2 75
3 1 48
4 2 69
5 1 84
6 2 53

i. Create a subset of subject less than 4 by using the subset() function and demonstrate the output.
ii. Create a subset where the subject column is less than 3 and the class equals 2 by using [ ] brackets and demonstrate the output.

To create the subsets from the given data frame, we'll use the `subset()` function and indexing with
square brackets `[ ]` in R. Here's how you can accomplish it:

Original Data Frame


# Create the data frame

df <- data.frame(
subject = c(1, 2, 3, 4, 5, 6),
class = c(1, 2, 1, 2, 1, 2),
marks = c(56, 75, 48, 69, 84, 53)
)
print(df)

Output

  subject class marks
1       1     1    56
2       2     2    75
3       3     1    48
4       4     2    69
5       5     1    84
6       6     2    53

Part i: Subset where `subject` is less than 4 using `subset()`

# Create subset where subject < 4

subset1 <- subset(df, subject < 4)
print(subset1)

Output

  subject class marks
1       1     1    56
2       2     2    75
3       3     1    48

Part ii: Subset where `subject` is less than 3 and `class` equals 2 using `[ ]`
brackets

# Create subset where subject < 3 and class == 2

subset2 <- df[df$subject < 3 & df$class == 2, ]
print(subset2)

Output

  subject class marks
2       2     2    75
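As a cross-check, the same condition can be written with either approach. The sketch below reuses the `df` from the question; the `df_na` copy with a missing value is a hypothetical addition, included only to show one difference that matters in practice: `subset()` drops rows where the condition is `NA`, while `[ ]` keeps them as rows filled with `NA`.

```r
# Data frame from the question
df <- data.frame(
  subject = c(1, 2, 3, 4, 5, 6),
  class   = c(1, 2, 1, 2, 1, 2),
  marks   = c(56, 75, 48, 69, 84, 53)
)

# Same filter written both ways
by_brackets <- df[df$subject < 3 & df$class == 2, ]
by_subset   <- subset(df, subject < 3 & class == 2)
same_result <- isTRUE(all.equal(by_brackets, by_subset))

# Hypothetical copy with one missing subject value
df_na <- df
df_na$subject[1] <- NA

rows_brackets <- nrow(df_na[df_na$subject < 3, ])  # NA condition kept as a row of NAs
rows_subset   <- nrow(subset(df_na, subject < 3))  # NA condition dropped
```

With this data `rows_brackets` is 2 and `rows_subset` is 1, so the two forms only agree when the filtered columns contain no missing values.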

Let me know if you'd like further clarifications or examples!

The data analyst of Argon technology Mr. John needs to enter the salaries of 10 employees in R. The salaries of the employees are given in the following table:

Sr. No. Name of employees Salaries
1 Vivek 21000
2 Karan 55000
3 James 67000
4 Soham 50000
5 Renu 54000
6 Farah 40000
7 Hetal 30000
8 Mary 70000
9 Ganesh 20000
10 Krish 15000

i. Which R command will Mr. John use to enter these values? Demonstrate the output.
ii. Now Mr. John wants to add the salaries of 5 new employees in the existing table. Which command will he use to join datasets with new values in R? Demonstrate the output.

To enter and manage the given salary data in R, Mr. John can use a combination of commands to create
the initial data frame and then append additional data to it. Here's how he can proceed:

Part i: Create the Initial Dataset

To enter the salaries of the 10 employees, Mr. John can use the `data.frame()` function.

# Create the initial dataset

employees <- data.frame(
Sr_No = 1:10,
Name = c("Vivek", "Karan", "James", "Soham", "Renu", "Farah", "Hetal", "Mary", "Ganesh",
"Krish"),
Salaries = c(21000, 55000, 67000, 50000, 54000, 40000, 30000, 70000, 20000, 15000)
)
print(employees)

Output

Sr_No Name Salaries

1 1 Vivek 21000
2 2 Karan 55000
3 3 James 67000
4 4 Soham 50000
5 5 Renu 54000
6 6 Farah 40000
7 7 Hetal 30000
8 8 Mary 70000
9 9 Ganesh 20000
10 10 Krish 15000

Part ii: Add Salaries of 5 New Employees

To add new rows of employee data, Mr. John can use the `rbind()` function. The question does not give the new employees' details, so the five rows below are illustrative values.

New Employee Data:

Sr. No. Name Salaries

11 Rahul 45000
12 Pooja 35000
13 Ajay 60000
14 Simran 52000
15 Meera 47000

R Code to Append New Data

# Create new employee data

new_employees <- data.frame(
Sr_No = 11:15,
Name = c("Rahul", "Pooja", "Ajay", "Simran", "Meera"),
Salaries = c(45000, 35000, 60000, 52000, 47000)
)

# Append new employees to the existing dataset

updated_employees <- rbind(employees, new_employees)
print(updated_employees)

Output

Sr_No Name Salaries

1 1 Vivek 21000
2 2 Karan 55000
3 3 James 67000
4 4 Soham 50000
5 5 Renu 54000
6 6 Farah 40000
7 7 Hetal 30000
8 8 Mary 70000
9 9 Ganesh 20000
10 10 Krish 15000
11 11 Rahul 45000
12 12 Pooja 35000
13 13 Ajay 60000
14 14 Simran 52000
15 15 Meera 47000
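One detail worth knowing when appending rows: for data frames, `rbind()` lines columns up by name, so the column order may differ but the names must match. The sketch below uses a shortened two-row `employees` table and hypothetical `reordered` and `misnamed` rows to illustrate both cases.

```r
# Shortened version of the employees table
employees <- data.frame(
  Sr_No    = 1:2,
  Name     = c("Vivek", "Karan"),
  Salaries = c(21000, 55000)
)

# Columns in a different order: rbind() aligns them by name
reordered <- data.frame(Salaries = 45000, Name = "Rahul", Sr_No = 11L)
updated   <- rbind(employees, reordered)

# A misspelled column name ("Salary") makes rbind() stop with an error
misnamed <- data.frame(Sr_No = 12L, Name = "Pooja", Salary = 35000)
outcome  <- tryCatch(
  rbind(employees, misnamed),
  error = function(e) "column names do not match"
)
```

Checking `names()` on both data frames before calling `rbind()` is a cheap way to catch this kind of mismatch early.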

Let me know if you'd like further clarification or assistance!

i. Write the script to sort the values contained in the following vector in ascending
order and descending order: (23, 45, 10, 34, 89, 20, 67, 99). Demonstrate the
output.
ii. Name and explain the operators used to form data subsets in R.

Part i: Sorting a Vector in R

Script for Ascending and Descending Order


# Define the vector

vec <- c(23, 45, 10, 34, 89, 20, 67, 99)

# Sort in ascending order

ascending <- sort(vec)

# Sort in descending order

descending <- sort(vec, decreasing = TRUE)

# Print the results

print("Ascending Order:")
print(ascending)

print("Descending Order:")
print(descending)

Output

[1] "Ascending Order:"
[1] 10 20 23 34 45 67 89 99
[1] "Descending Order:"
[1] 99 89 67 45 34 23 20 10
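`sort()` returns the sorted values themselves. When the positions are needed instead, for example to reorder the rows of a data frame, `order()` is the usual companion. A minimal sketch follows; the `scores` data frame is a hypothetical example built from the same vector.

```r
vec <- c(23, 45, 10, 34, 89, 20, 67, 99)

idx <- order(vec)             # positions of the elements in ascending order
ascending_again <- vec[idx]   # same values as sort(vec)

# Hypothetical scores table, sorted by score from highest to lowest
scores <- data.frame(id = 1:8, score = vec)
ranked <- scores[order(scores$score, decreasing = TRUE), ]
```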

Part ii: Operators Used to Form Data Subsets in R

In R, several operators are used to form subsets of data. Here’s an explanation of the most common
ones:

1. Square Brackets `[ ]`

Used to extract specific elements from vectors, matrices, data frames, or lists.
Syntax: `object[row, column]`
Example:

# Subsetting a vector
vec <- c(10, 20, 30, 40)
vec[2] # Returns the second element

# Subsetting a data frame

df <- data.frame(A = 1:3, B = c("x", "y", "z"))
df[1, 2] # Returns the element in the first row, second column

2. Dollar Sign `$`

Used to extract a column from a data frame or list by its name.

Example:


df <- data.frame(Name = c("A", "B"), Age = c(25, 30))

df$Name # Returns the 'Name' column

3. Double Square Brackets `[[ ]]`

Used to extract a single element from a list.

Example:

my_list <- list(a = 1:5, b = "Hello")

my_list[[1]] # Returns the first element (1:5)
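The difference between `[ ]` and `[[ ]]` on a list is easy to miss, so here is a short sketch using the same `my_list`: single brackets return a smaller list, double brackets return the element itself.

```r
my_list <- list(a = 1:5, b = "Hello")

single <- my_list[1]       # a list of length 1 that still carries the name "a"
inner  <- my_list[[1]]     # the integer vector 1:5 itself
named  <- my_list[["b"]]   # extraction by name works as well

class(single)  # "list"
class(inner)   # "integer"
```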

4. Comparison and Logical Operators

Conditions built from comparison and logical operators can be used inside square brackets to filter data.

Operators:
`<`, `>`, `<=`, `>=`: Comparison operators
`==`: Equality operator
`!=`: Not equal to
`&`, `|`: Logical AND and OR
Example:

vec <- c(10, 20, 30, 40)

vec[vec > 20] # Returns elements greater than 20

5. Subset Function `subset()`

A function used for creating subsets of data frames based on logical conditions.
Example:

df <- data.frame(Age = c(25, 30, 35), Name = c("A", "B", "C"))

subset(df, Age > 25) # Subset rows where Age is greater than 25
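`subset()` can also pick columns through its `select` argument. The sketch below reuses the small `df` from the example above; the result names are only illustrative.

```r
df <- data.frame(Age = c(25, 30, 35), Name = c("A", "B", "C"))

# Rows where Age > 25, keeping only the Name column
names_over_25 <- subset(df, Age > 25, select = Name)

# All rows, dropping the Age column
without_age <- subset(df, select = -Age)
```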

These operators make subsetting data in R flexible and powerful, allowing fine-grained data selection
for analysis.

Q) i. Create a data frame from the following 4 vectors and demonstrate the output:
emp_id = c (1:5)
emp_name = c("Rick","Dan","Michelle", "Ryan","Gary")
start_date = c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11", "2015-03-27")
salary = c(60000, 45000, 75000, 84000, 20000)

ii. Display structure and summary of the above data frame.

iii. Extract the emp_name and salary columns from the above data frame.
iv. Extract the employee details whose salary is less than or equal to 60000

i. Create a Data Frame

To create the data frame from the given vectors:

# Create the vectors

emp_id <- c(1:5)
emp_name <- c("Rick", "Dan", "Michelle", "Ryan", "Gary")
start_date <- as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
"2015-03-27"))
salary <- c(60000, 45000, 75000, 84000, 20000)

# Create the data frame

employees <- data.frame(emp_id, emp_name, start_date, salary)

# Print the data frame

print(employees)

Output

  emp_id emp_name start_date salary
1      1     Rick 2012-01-01  60000
2      2      Dan 2013-09-23  45000
3      3 Michelle 2014-11-15  75000
4      4     Ryan 2014-05-11  84000
5      5     Gary 2015-03-27  20000

ii. Display Structure and Summary of the Data Frame

R Code

# Display the structure of the data frame

str(employees)

# Display the summary of the data frame

summary(employees)

Output

Structure:

'data.frame': 5 obs. of 4 variables:
 $ emp_id    : int 1 2 3 4 5
 $ emp_name  : chr "Rick" "Dan" "Michelle" "Ryan" ...
 $ start_date: Date, format: "2012-01-01" "2013-09-23" ...
 $ salary    : num 60000 45000 75000 84000 20000

Summary:

     emp_id      emp_name            start_date             salary
 Min.   :1.0   Length:5           Min.   :2012-01-01   Min.   :20000
 1st Qu.:2.0   Class :character   1st Qu.:2013-09-23   1st Qu.:45000
 Median :3.0   Mode  :character   Median :2014-05-11   Median :60000
 Mean   :3.0                      Mean   :2014-01-14   Mean   :56800
 3rd Qu.:4.0                      3rd Qu.:2014-11-15   3rd Qu.:75000
 Max.   :5.0                      Max.   :2015-03-27   Max.   :84000

iii. Extract the `emp_name` and `salary` Columns

R Code

# Extract emp_name and salary columns

emp_name_salary <- employees[, c("emp_name", "salary")]
print(emp_name_salary)

Output

  emp_name salary
1     Rick  60000
2      Dan  45000
3 Michelle  75000
4     Ryan  84000
5     Gary  20000

iv. Extract Employee Details with Salary ≤ 60000

R Code


# Subset employees with salary <= 60000

low_salary_employees <- subset(employees, salary <= 60000)
print(low_salary_employees)

Output

  emp_id emp_name start_date salary
1      1     Rick 2012-01-01  60000
2      2      Dan 2013-09-23  45000
5      5     Gary 2015-03-27  20000
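Parts iii and iv can also be combined, since `[ ]` accepts a row condition and a column selection in one call. The sketch below rebuilds the same `employees` data frame; the `joined_2014` filter is a hypothetical extra to show that `Date` columns compare like numbers.

```r
employees <- data.frame(
  emp_id     = 1:5,
  emp_name   = c("Rick", "Dan", "Michelle", "Ryan", "Gary"),
  start_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                         "2014-05-11", "2015-03-27")),
  salary     = c(60000, 45000, 75000, 84000, 20000)
)

# Row condition and column selection in a single step
low_paid <- employees[employees$salary <= 60000, c("emp_name", "salary")]

# Hypothetical extra filter: employees who started during 2014
joined_2014 <- subset(
  employees,
  start_date >= as.Date("2014-01-01") & start_date < as.Date("2015-01-01")
)
```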

Let me know if you need further assistance!

Q) List and explain various functions that allow users to handle data in R workspace with
appropriate examples.

In R, the workspace refers to the environment where all the data objects (vectors, matrices, data frames,
etc.) are stored. R provides a variety of functions that allow users to manage, manipulate, and query the
data stored in the workspace. Below is a list of key functions, along with explanations and examples, to
handle data in the R workspace:

1. `ls()` - List Objects in the Workspace

This function lists all the objects (variables, functions, etc.) currently in the R workspace.
Syntax: `ls()`

Example:

x <- 10
y <- 20
ls() # Lists the objects in the workspace

Output:

[1] "x" "y"

2. `rm()` - Remove Objects from the Workspace

This function removes specified objects from the R workspace, effectively deleting them.
Syntax: `rm(object_name)`

Example:

x <- 10
y <- 20
rm(x) # Removes the object 'x' from the workspace
ls() # Check the workspace after removal

Output:

[1] "y"

3. `rm(list = ls())` - Remove All Objects from the Workspace

This function removes all objects from the workspace by passing the `ls()` output to `rm()`.
Syntax: `rm(list = ls())`

Example:

x <- 10
y <- 20
rm(list = ls()) # Removes all objects from the workspace
ls() # Check if the workspace is empty

Output:

character(0)

4. `objects()` - List Objects in the Workspace (Alternative to `ls()`)

This function is similar to `ls()`, listing the names of all objects in the current environment.
Syntax: `objects()`

Example:


a <- 5
b <- 6
objects() # List all objects in the workspace

Output:

[1] "a" "b"

5. `exists()` - Check If an Object Exists

This function checks if an object exists in the workspace.

Syntax: `exists("object_name")`

Example:

x <- 100
exists("x") # Check if 'x' exists in the workspace

Output:

[1] TRUE

6. `save()` - Save Objects to a File

This function saves one or more objects from the workspace to an external file (RData format).
Syntax: `save(object_name, file = "filename.RData")`

Example:

a <- 10
b <- 20
save(a, b, file = "data.RData") # Save 'a' and 'b' to 'data.RData'

7. `load()` - Load Objects from a File

This function loads objects from an external file into the workspace.
Syntax: `load("filename.RData")`

Example:

load("data.RData") # Load the objects from 'data.RData'

ls() # Check if the objects have been loaded

Output:

[1] "a" "b"

8. `saveRDS()` - Save a Single R Object to a File

This function saves a single R object to a file in RDS format, which can later be read back into R with
`readRDS()`.
Syntax: `saveRDS(object, file = "filename.rds")`

Example:

a <- c(1, 2, 3, 4, 5)
saveRDS(a, file = "vector.rds") # Save the vector 'a'

9. `readRDS()` - Read an R Object from an RDS File

This function reads an R object saved in RDS format back into R.

Syntax: `readRDS("filename.rds")`

Example:

a <- readRDS("vector.rds") # Load the vector from the RDS file

print(a)

Output:


[1] 1 2 3 4 5
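The two save formats behave differently on the way back in: `readRDS()` returns the object so the caller chooses its name, while `load()` recreates the objects under their original names. A minimal round-trip sketch, assuming temporary files from `tempfile()` rather than the file names used above, and a separate environment (`restored`, a hypothetical name) so nothing in the workspace is overwritten:

```r
a <- c(1, 2, 3, 4, 5)
b <- "label"

# RDS: one object, name chosen when reading
rds_path <- tempfile(fileext = ".rds")
saveRDS(a, file = rds_path)
a_copy <- readRDS(rds_path)

# RData: several objects, names restored by load()
rdata_path <- tempfile(fileext = ".RData")
save(a, b, file = rdata_path)
restored <- new.env()
loaded_names <- load(rdata_path, envir = restored)  # returns the names it created
```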

10. `attach()` - Attach a Data Frame to the Search Path

This function allows the variables of a data frame to be accessed directly by name without
referencing the data frame.
Syntax: `attach(data_frame)`

Example:

df <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))

attach(df) # Makes the columns 'x' and 'y' accessible directly
print(x) # Can access 'x' directly without using df$x
detach(df) # Detach the data frame

11. `detach()` - Detach a Data Frame from the Search Path

This function removes the data frame from the search path, making it no longer accessible by the
variable names alone.
Syntax: `detach(data_frame)`

Example:

detach(df) # Detach the data frame so its columns are no longer accessible directly
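Because `attach()` changes the search path until `detach()` is called, `with()` is often used as a scoped alternative: the column names are visible only inside the one expression. A short sketch with the same `df`:

```r
df <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))

# Columns are looked up inside df for this expression only
mean_y <- with(df, mean(y))
total  <- with(df, sum(x * y))

# Nothing was added to the search path
attached <- "df" %in% search()
```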

12. `ls.str()` - List Structure of All Objects

This function lists the names of all objects in the workspace along with their structure, which can
be helpful for understanding the content of larger objects.
Syntax: `ls.str()`

Example:

x <- 1:10
y <- data.frame(a = 1:5, b = letters[1:5])
ls.str() # List structure of all objects in the workspace

13. `rm(list = ls())` - Clear the Workspace

Base R has no `clear()` function; the workspace is cleared with the same call shown in item 3. It removes every object that `ls()` lists, effectively resetting the environment.
Syntax: `rm(list = ls())`

Example:

rm(list = ls()) # Clears the workspace

ls() # Verify that the workspace is empty

14. `setwd()` - Set Working Directory

This function sets the working directory, which determines where files will be saved or read from.
Syntax: `setwd("path/to/directory")`

Example:

setwd("C:/Users/YourName/Documents") # Set the working directory

Conclusion

These functions are crucial for managing the workspace in R. They allow users to list, remove, save, and
load data, making it easier to organize and manipulate data in the R environment. Removing objects that
are no longer needed with `rm()` also lets R reclaim their memory when working with large datasets.

Q) i. What are the advantages of using functions over scripts?

ii. Suppose you have two datasets A and B.
Dataset A has the following data: 6 7 8 9
Dataset B has the following data: 1 2 4 5.
Which function is used to combine the data from both datasets into dataset C.
Demonstrate the function with the input values and write the output.

i. Advantages of Using Functions Over Scripts

Using functions in R offers several advantages over using standalone scripts:

1. Reusability:
Functions allow you to encapsulate code logic that can be reused multiple times with different
inputs. This reduces redundancy and promotes cleaner, more modular code.
Example: Instead of copying and pasting the same code multiple times, you can write a
function once and call it whenever needed.
2. Readability:
Functions help in making your code more readable and organized. With functions, you can
abstract complex operations and give them meaningful names.
Example: A function `calculate_sum()` is more readable and easier to understand than
manually writing out the summing logic multiple times.
3. Maintainability:
Functions make the code more maintainable since you only need to update or fix logic in one
place, rather than in multiple locations in a script.
Example: If you need to change the logic for calculating a value, you only need to update the
function, not every instance where that logic is used.
4. Modularity:
Functions break down large scripts into smaller, manageable pieces. Each function can handle
a specific task, making the code easier to debug and test.
Example: Instead of having one long script, you can have several smaller functions that each
handle different tasks.
5. Flexibility:
Functions can accept parameters and return values, which allows for flexible operations on
different data inputs. This makes them ideal for dynamic and data-driven tasks.
Example: You can write a function that sorts a dataset, and call it with different datasets
without modifying the function.
6. Avoiding Code Duplication:
By using functions, you avoid repeating the same code in different parts of your script. This
helps prevent errors and inconsistencies.
Example: If you need to calculate the mean of different columns in a dataset, you can write a
function that does this for any column, reducing repetitive code.
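The points above can be illustrated with a small sketch. `column_mean()` and the `scores` data frame are hypothetical names invented for this example; the idea is simply that the averaging logic is written once and reused for any column.

```r
# Hypothetical helper: mean of one column, ignoring missing values
column_mean <- function(data, column) {
  mean(data[[column]], na.rm = TRUE)
}

scores <- data.frame(
  maths   = c(56, 75, NA),
  physics = c(48, 69, 84)
)

maths_mean   <- column_mean(scores, "maths")
physics_mean <- column_mean(scores, "physics")

# The same function applied to every column without copying code
all_means <- sapply(names(scores), function(col) column_mean(scores, col))
```

If the rule for handling missing values changes later, only the body of `column_mean()` needs to be edited.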

ii. Combining Datasets A and B into Dataset C

To combine the two datasets (A and B), the `c()` function in R is typically used to concatenate elements.
This function combines data from two or more vectors into one.

Given Data:

Dataset A: `6, 7, 8, 9`
Dataset B: `1, 2, 4, 5`

We can use the `c()` function to combine these datasets into Dataset C.

R Code:

# Define the datasets

A <- c(6, 7, 8, 9)
B <- c(1, 2, 4, 5)

# Combine A and B into C

C <- c(A, B)

# Print the combined dataset C

print(C)

Output:

[1] 6 7 8 9 1 2 4 5

Explanation:

The `c()` function combines the two datasets, `A` and `B`, into a single vector `C`. This is a simple
concatenation of the values from both datasets in the order they are provided.
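If "combine" is read as keeping A and B as separate rows or columns rather than one long vector, `rbind()`, `cbind()`, or `data.frame()` are the usual alternatives. The sketch below shows the shapes each one produces; the `C_*` names are only illustrative.

```r
A <- c(6, 7, 8, 9)
B <- c(1, 2, 4, 5)

C_vector <- c(A, B)           # one vector of 8 values
C_rows   <- rbind(A, B)       # 2 x 4 matrix, one row per dataset
C_cols   <- cbind(A, B)       # 4 x 2 matrix, one column per dataset
C_table  <- data.frame(A, B)  # data frame with columns A and B
```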

Q) Consider the following data frame given below:

course id class marks
1 11 1 56
2 12 2 75
3 13 1 48
4 14 2 69
5 15 1 84
6 16 2 53

i. Create a subset of course less than 3 by using [ ] brackets and demonstrate the output.
ii. Create a subset where the course column is less than 3 or the class equals 2 by using the subset() function and demonstrate the output.
Q) List and discuss various types of data structures in R.

i. Subset of `course` less than 3 using `[]` brackets

To create a subset where the `course` column is less than 3 using `[]` brackets, you can filter the data
based on the condition directly within the indexing.

R Code:

# Create the data frame (all four columns from the question)

data <- data.frame(
course = c(1, 2, 3, 4, 5, 6),
id = c(11, 12, 13, 14, 15, 16),
class = c(1, 2, 1, 2, 1, 2),
marks = c(56, 75, 48, 69, 84, 53)
)

# Subset where course < 3

subset_course <- data[data$course < 3, ]
print(subset_course)

Output:

  course id class marks
1      1 11     1    56
2      2 12     2    75

Explanation:

The expression `data$course < 3` evaluates to a logical vector that is TRUE for the rows whose `course`
value is less than 3.
The `[]` brackets keep the rows where that vector is TRUE; leaving the column position empty keeps all columns.

ii. Subset where `course` is less than 3 or `class` equals 2 using `subset()` function

To create a subset where the `course` column is less than 3 or the `class` equals 2 using the `subset()`
function, you can define the condition inside the `subset()` function.

R Code:

# Create the data frame (class holds 1 or 2; the values 11 to 16 belong to id)

data <- data.frame(
course = c(1, 2, 3, 4, 5, 6),
id = c(11, 12, 13, 14, 15, 16),
class = c(1, 2, 1, 2, 1, 2),
marks = c(56, 75, 48, 69, 84, 53)
)

# Subset where course < 3 or class == 2

subset_condition <- subset(data, course < 3 | class == 2)
print(subset_condition)

Output:

  course id class marks
1      1 11     1    56
2      2 12     2    75
4      4 14     2    69
6      6 16     2    53

Explanation:

The `subset()` function allows for logical conditions to filter rows. The condition `course < 3 |
class == 2` selects rows where the `course` is less than 3 or the `class` is equal to 2.
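To see which rows an OR condition picks up compared with an AND condition, it can help to look at the logical vector directly. The sketch below assumes the four-column table from the question (course, id, class, marks) and uses `which()` to list the matching row positions.

```r
data <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  id     = c(11, 12, 13, 14, 15, 16),
  class  = c(1, 2, 1, 2, 1, 2),
  marks  = c(56, 75, 48, 69, 84, 53)
)

# OR: a row qualifies if either condition holds
either_rows <- which(data$course < 3 | data$class == 2)

# AND: a row qualifies only if both conditions hold
both_rows <- which(data$course < 3 & data$class == 2)
```

Here `either_rows` is 1, 2, 4, 6 and `both_rows` is 2.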

Q) List and Discuss Various Types of Data Structures in R

R provides several types of data structures to store and manipulate data. Here are the main data
structures in R:

1. Vector

A vector is a basic data structure in R. It contains elements of the same type (e.g., numeric,
character, logical).
Types:
Numeric vector: Contains numbers.
Character vector: Contains strings.
Logical vector: Contains TRUE or FALSE values.
Example:

numeric_vector <- c(1, 2, 3, 4)

character_vector <- c("apple", "banana", "cherry")
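Because a vector holds a single type, mixing types in `c()` does not fail; R converts everything to the most general type present. A short sketch (the variable names are only illustrative):

```r
# Logical values become numbers when combined with numbers
mixed_numbers <- c(1, TRUE, FALSE)

# Anything combined with a string becomes a string
mixed_text <- c(1, "apple", TRUE)

class(mixed_numbers)  # "numeric"
class(mixed_text)     # "character"
```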

2. Matrix

A matrix is a two-dimensional array where elements are of the same type (e.g., all numeric). It is
useful for mathematical computations.
Example:

m <- matrix(1:6, nrow = 2, ncol = 3)

print(m)

Output:

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

3. Array

An array is a multi-dimensional data structure that can store elements of the same type. It can be
viewed as an extension of matrices to more than two dimensions.
Example:

arr <- array(1:8, dim = c(2, 2, 2))

print(arr)

Output:

, , 1

[,1] [,2]
[1,] 1 3
[2,] 2 4

, , 2

[,1] [,2]
[1,] 5 7
[2,] 6 8

4. Data Frame

A data frame is a two-dimensional data structure, similar to a table in a database or a data frame in
Python's Pandas. It can store columns of different data types (numeric, character, etc.).
Example:

df <- data.frame(Name = c("Alice", "Bob", "Charlie"), Age = c(25, 30, 35))

print(df)

Output:

Name Age
1 Alice 25
2 Bob 30
3 Charlie 35

5. List

A list is a flexible data structure that can hold elements of different types, including other lists. Lists
are often used to store mixed-type data.
Example:

lst <- list(Name = "Alice", Age = 25, Scores = c(90, 85, 92))
print(lst)

Output:


$Name
[1] "Alice"

$Age
[1] 25

$Scores
[1] 90 85 92

6. Factor

A factor is used to represent categorical data. Factors are treated as integers with corresponding
levels, making them useful for modeling categorical variables.
Example:

f <- factor(c("low", "medium", "high", "medium"))

print(f)

Output:

[1] low medium high medium

Levels: high low medium
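By default the levels are sorted alphabetically, which is why `high` comes first in the output above. When the categories have a natural order, the levels can be given explicitly. A minimal sketch, assuming the intended order is low < medium < high:

```r
sizes <- c("low", "medium", "high", "medium")

# Explicit level order, marked as ordered so comparisons are allowed
f_ordered <- factor(sizes, levels = c("low", "medium", "high"), ordered = TRUE)

codes    <- as.integer(f_ordered)        # underlying integer codes
is_lower <- f_ordered[1] < f_ordered[3]  # "low" < "high"
counts   <- table(f_ordered)             # frequency of each level
```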

7. Tibble

A tibble is a modern version of a data frame provided by the `tibble` package. It offers more
convenient printing and handling of data frames.
Example:

library(tibble)
tb <- tibble(Name = c("John", "Paul", "George"), Age = c(28, 30, 32))
print(tb)

Output:

# A tibble: 3 × 2
  Name     Age
  <chr>  <dbl>
1 John      28
2 Paul      30
3 George    32

8. Data Table

Data tables are an enhanced version of data frames, which are optimized for speed and memory
usage. They are part of the `data.table` package.
Example:

library(data.table)
dt <- data.table(Name = c("Tom", "Jerry"), Age = c(5, 3))
print(dt)

Output:

Name Age
1: Tom 5
2: Jerry 3

Conclusion

These data structures provide different ways to handle data in R, depending on the type and complexity
of the task at hand. Understanding when and how to use them is crucial for efficient data analysis in R.

i. The following table shows the number of units of different products sold on
different days:
Product
Monday Tuesday Wednesday Thursday
Bread
12
3
5
Friday
Milk
11
21
27
18
9
Cola Cans
20

10
1
33
6
15
Chocolate
bars
12
6
7
4
13
Detergent
5
8
12
12
Create five sample numeric vectors from this data.
20
23
ii. Name and explain the operators used to form data subsets in R.

i. Creating Sample Numeric Vectors from the Given Data

We can create five numeric vectors, each corresponding to the data for one of the products. Each vector
will represent the number of units sold on different days (Monday, Tuesday, Wednesday, Thursday, and
Friday).

Given Data:

Product Monday Tuesday Wednesday Thursday Friday

Bread 12 3 5 20 23

Milk 11 21 27 18 9
Cola Cans 20 10 1 33 6
Chocolate Bars 12 6 7 4 13
Detergent 5 8 12 12 12

Now, let's create five numeric vectors, one for each product.

R Code:

# Create numeric vectors for each product

bread_sales <- c(12, 3, 5, 20, 23)
milk_sales <- c(11, 21, 27, 18, 9)
cola_cans_sales <- c(20, 10, 1, 33, 6)
chocolate_bars_sales <- c(12, 6, 7, 4, 13)
detergent_sales <- c(5, 8, 12, 12, 12)
# Print the vectors
print(bread_sales)
print(milk_sales)
print(cola_cans_sales)
print(chocolate_bars_sales)
print(detergent_sales)

Output:

[1] 12 3 5 20 23
[1] 11 21 27 18 9
[1] 20 10 1 33 6
[1] 12 6 7 4 13
[1] 5 8 12 12 12

Explanation:

Each vector represents the number of units sold for a specific product across the five days of the
week (Monday to Friday).
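Once the five vectors exist, they can be stacked into a labelled matrix so that days and products are addressable by name. The sketch below uses the values exactly as laid out in the "Given Data" table above; the `sales`, `weekly_totals`, and `monday_sales` names are assumptions made for illustration.

```r
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

bread_sales          <- c(12, 3, 5, 20, 23)
milk_sales           <- c(11, 21, 27, 18, 9)
cola_cans_sales      <- c(20, 10, 1, 33, 6)
chocolate_bars_sales <- c(12, 6, 7, 4, 13)
detergent_sales      <- c(5, 8, 12, 12, 12)

# One row per product, one column per day
sales <- rbind(
  Bread          = bread_sales,
  Milk           = milk_sales,
  Cola_Cans      = cola_cans_sales,
  Chocolate_Bars = chocolate_bars_sales,
  Detergent      = detergent_sales
)
colnames(sales) <- days

weekly_totals <- rowSums(sales)      # units sold per product
monday_sales  <- sales[, "Monday"]   # units sold per product on Monday
```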

ii. Operators Used to Form Data Subsets in R

In R, there are several operators and functions that allow you to subset data, whether it is in a vector,
matrix, data frame, or list. The most commonly used operators for subsetting data are:

1. Indexing Operator `[]`

The `[]` operator is used to subset data frames, vectors, and matrices. You can specify the row and
column indexes inside the brackets to extract specific elements.
Examples:
For a vector:

vec <- c(10, 20, 30, 40, 50)

subset_vec <- vec[2:4] # Extracts elements 2 to 4
print(subset_vec)

Output:

[1] 20 30 40

For a data frame:


df <- data.frame(A = 1:5, B = 6:10)

subset_df <- df[2:4, ] # Extracts rows 2 to 4
print(subset_df)

Output:

A B
2 2 7
3 3 8
4 4 9

2. Column Subsetting Using `$` (for Data Frames)

The `$` operator is used to extract specific columns from a data frame by name.
Example:

df <- data.frame(A = 1:5, B = 6:10)

col_A <- df$A # Extracts the 'A' column
print(col_A)

Output:

[1] 1 2 3 4 5

3. Logical Subsetting

Logical conditions can be used inside the `[]` operator to subset data based on conditions. The
condition returns a logical vector (TRUE/FALSE) and is used to filter data.
Example:

vec <- c(10, 20, 30, 40, 50)

subset_vec <- vec[vec > 30] # Extracts values greater than 30
print(subset_vec)

Output:


[1] 40 50

4. `subset()` Function (for Data Frames)

The `subset()` function is used for filtering data frames based on a condition, and it is more
readable and intuitive for complex filtering.
Example:

df <- data.frame(A = 1:5, B = 6:10)

subset_df <- subset(df, A > 2) # Subsets rows where A > 2
print(subset_df)

Output:

A B
3 3 8
4 4 9
5 5 10

5. `which()` Function

The `which()` function is used to identify the indices of elements that meet a condition.
Example:

vec <- c(10, 20, 30, 40, 50)

indices <- which(vec > 30) # Identifies the indices where elements are greater than 30
print(indices)

Output:

[1] 4 5
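The indices returned by `which()` can be fed straight back into `[ ]`. For data without missing values this gives the same result as logical subsetting; with `NA` present the two differ, because `which()` ignores `NA`. A short sketch, where `vec_na` is a hypothetical vector added to show that difference:

```r
vec <- c(10, 20, 30, 40, 50)

by_index   <- vec[which(vec > 30)]   # subsetting by position
by_logical <- vec[vec > 30]          # subsetting by condition

# Hypothetical vector with a missing value
vec_na     <- c(10, NA, 50)
keeps_na   <- vec_na[vec_na > 30]          # NA 50
drops_na   <- vec_na[which(vec_na > 30)]   # 50

largest_at <- which.max(vec)         # position of the largest element
```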

6. Matrix Subsetting

In a matrix, rows and columns are selected using the `[]` operator, with row and column indices.
Example:


mat <- matrix(1:9, nrow = 3, byrow = TRUE)

subset_mat <- mat[1:2, 2:3] # Extracts the first two rows and columns 2 and 3
print(subset_mat)

Output:

[,1] [,2]
[1,] 2 3
[2,] 5 6
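One thing to watch with matrix subsetting: selecting a single row or column drops the result to a plain vector unless `drop = FALSE` is given. A short sketch with the same `mat`:

```r
mat <- matrix(1:9, nrow = 3, byrow = TRUE)

first_row     <- mat[1, ]                 # plain vector: 1 2 3
first_row_mat <- mat[1, , drop = FALSE]   # still a 1 x 3 matrix
second_col    <- mat[, 2]                 # plain vector: 2 5 8
```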

Summary of Operators for Data Subsetting:

1. `[]` – Indexing operator for vectors, matrices, data frames, and lists.
2. `$` – Extracts specific columns from data frames.
3. Logical conditions – Used within `[]` or `subset()` for conditional subsetting.
4. `subset()` – A function for filtering data frames with conditions.
5. `which()` – Identifies indices based on a condition.

Each of these methods has its use cases depending on the data structure and the task at hand, making
data manipulation in R flexible and efficient.
