Homework R1
02/6/2024
This data set contains measurements for three iris species (setosa, versicolor, and
virginica). Each row is one sample, and the columns are Sepal Length, Sepal Width,
Petal Length, Petal Width, and Species.
Question 1:
a. How many rows and columns does the data have?
b. Use the apply function to find the minimum and maximum values of the four
features (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) for the three species.
c. Use the tapply function to calculate the mean of each feature for each of the three
species.
a.
data <- iris
dim(data) # This will show the number of rows and columns
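As a small cross-check on dim(), the two counts can also be read separately with nrow() and ncol(). This is only a sketch; the names n_rows and n_cols are introduced here for illustration and are not part of the assignment.

```r
# Load the built-in iris data set
data <- iris

# nrow() and ncol() return the two elements of dim() individually
n_rows <- nrow(data)
n_cols <- ncol(data)

cat("Rows:", n_rows, "Columns:", n_cols, "\n")
```

For the built-in iris data this reports 150 rows and 5 columns (four measurements plus Species).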
b.
# Find minimum values of the four features
min_values <- apply(data[, 1:4], 2, min)
# Find maximum values of the four features
max_values <- apply(data[, 1:4], 2, max)
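The code above gives the minimum and maximum over all 150 samples. If part b is read as asking for the values within each species, one possible approach is to split the data by species first and then use apply on each piece. This is a sketch under that interpretation; by_species, min_by_species, and max_by_species are names chosen here for illustration.

```r
data <- iris

# Split the four numeric columns into one data frame per species
by_species <- split(data[, 1:4], data$Species)

# For each species, apply min/max over the columns (MARGIN = 2).
# sapply collects the results into a matrix: features in rows, species in columns.
min_by_species <- sapply(by_species, function(df) apply(df, 2, min))
max_by_species <- sapply(by_species, function(df) apply(df, 2, max))

print(min_by_species)
print(max_by_species)
```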
c.
# Calculate the mean of each feature for each species
mean_sepal_length <- tapply(data$Sepal.Length, data$Species, mean)
mean_sepal_width <- tapply(data$Sepal.Width, data$Species, mean)
mean_petal_length <- tapply(data$Petal.Length, data$Species, mean)
mean_petal_width <- tapply(data$Petal.Width, data$Species, mean)
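The four tapply calls above follow the same pattern, so they could also be collected into a single table. The sketch below wraps tapply in sapply to loop over the feature columns; mean_by_species is a name introduced here, and this is an optional alternative rather than the required answer.

```r
data <- iris

# For each of the four feature columns, compute the per-species mean with tapply.
# The result is a matrix with species in rows and features in columns.
mean_by_species <- sapply(data[, 1:4], function(feature) {
  tapply(feature, data$Species, mean)
})

print(mean_by_species)
```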
Question 2:
a. Write an if-else function to compare the average sepal length of 'setosa' and
'versicolor'. Print:
"Setosa sepal length is greater than versicolor" if true.
"Setosa sepal length is less than versicolor" if false.
"Setosa sepal length is equal to versicolor" if they are the same.
b. Create a for loop with nested if-else statements to print rows where the petal width
is 0.2 and the species is 'setosa'.
c. Use nested if-else within a for loop to create a new column 'classification' that:
Assigns "Small" if both sepal length and sepal width are less than their
respective averages.
Assigns "Medium" if only one of sepal length or sepal width is greater than or
equal to their respective averages.
Assigns "Large" if both sepal length and sepal width are greater than or equal
to their respective averages.
a.
# If-else comparison of the average sepal lengths (species selected by name)
if (mean_sepal_length["setosa"] > mean_sepal_length["versicolor"]) {
  print("Setosa sepal length is greater than versicolor")
} else if (mean_sepal_length["setosa"] < mean_sepal_length["versicolor"]) {
  print("Setosa sepal length is less than versicolor")
} else {
  print("Setosa sepal length is equal to versicolor")
}
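Since the question asks for an if-else function, the same comparison could be wrapped in a reusable function. The sketch below assumes a hypothetical function name, compare_sepal_length, that takes the two averages as arguments and returns the message; it recomputes the means so that it runs on its own.

```r
# Hypothetical helper: returns the message for two average sepal lengths
compare_sepal_length <- function(setosa_mean, versicolor_mean) {
  if (setosa_mean > versicolor_mean) {
    "Setosa sepal length is greater than versicolor"
  } else if (setosa_mean < versicolor_mean) {
    "Setosa sepal length is less than versicolor"
  } else {
    "Setosa sepal length is equal to versicolor"
  }
}

# Per-species average sepal length, selected by species name
means <- tapply(iris$Sepal.Length, iris$Species, mean)
result <- compare_sepal_length(means[["setosa"]], means[["versicolor"]])
print(result)
```

Returning the string instead of printing inside the function keeps the comparison easy to check with other inputs.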
b.
# For loop with nested if statements to print rows based on conditions
for (i in seq_len(nrow(data))) {
  if (data$Species[i] == "setosa") {
    if (data$Petal.Width[i] == 0.2) {
      print(data[i, ])
    }
  }
}
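One way to check the loop's output is to select the same rows with logical indexing and compare. This is a sketch for verification only, not a replacement for the required for loop; matches is a name introduced here.

```r
data <- iris

# Keep only the rows where both conditions hold.
# The single & is the element-wise (vectorized) AND, unlike && used inside if().
matches <- data[data$Species == "setosa" & data$Petal.Width == 0.2, ]

print(nrow(matches))
print(head(matches))
```

The rows in matches should be the same rows the loop prints, in the same order.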
c.
# Create a new column 'classification' in the data
data$classification <- NA