Assignment R Solutions

The document contains four R code exercises that analyze different datasets using ggplot2. Exercise 1 analyzes the mpg dataset and creates boxplots of highway MPG by manufacturer and class. Exercise 2 analyzes diamonds data and creates histograms and bar charts. Exercise 3 performs PCA on iris data and creates scatter plots and a correlation heatmap of the PCs. Exercise 4 transforms variables in the Animals dataset and checks normality before making scatter plots.

Uploaded by

aieditor audio

```{r}

# Exercise 1

# Load required library
library(ggplot2)

# Load the mpg dataset
data(mpg)

# (a) Plotting hwy mpg against manufacturers

# Calculate median hwy mpg for each manufacturer
manufacturer_median <- tapply(mpg$hwy, mpg$manufacturer, median)
# Order manufacturers based on median hwy mpg
ordered_manufacturers <- names(sort(manufacturer_median, decreasing = TRUE))
# Plotting
ggplot(mpg, aes(x = reorder(manufacturer, -hwy, FUN = median), y = hwy)) +
  geom_boxplot() +
  coord_flip() +
  labs(x = "Manufacturer", y = "Highway MPG") +
  theme_minimal() +
  theme(axis.text.y = element_text(size = 8)) +
  scale_x_discrete(limits = ordered_manufacturers)

# (b) Plotting hwy mpg against class

# Calculate median hwy mpg for each class
class_median <- tapply(mpg$hwy, mpg$class, median)
# Order classes based on median hwy mpg
ordered_class <- names(sort(class_median, decreasing = TRUE))
# Plotting
ggplot(mpg, aes(x = reorder(class, -hwy, FUN = median), y = hwy)) +
  geom_boxplot() +
  coord_flip() +
  labs(x = "Class", y = "Highway MPG") +
  theme_minimal() +
  theme(axis.text.y = element_text(size = 8)) +
  scale_x_discrete(limits = ordered_class)

# (c) Bar chart of manufacturers in terms of numbers of different types of cars manufactured
ggplot(mpg, aes(x = manufacturer)) +
  geom_bar() +
  labs(x = "Manufacturer", y = "Count") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  coord_flip()

```

```{r}
# Exercise 2

# Load required library
library(ggplot2)

# Load the diamonds dataset
data(diamonds)

# (a) Histograms for carat and price
ggplot(diamonds, aes(x = carat)) +
  geom_histogram(binwidth = 0.1, fill = "blue", color = "black") +
  labs(x = "Carat", y = "Frequency") +
  theme_minimal()

ggplot(diamonds, aes(x = price)) +
  geom_histogram(binwidth = 1000, fill = "green", color = "black") +
  labs(x = "Price", y = "Frequency") +
  theme_minimal()

# (b) Bar charts of cut proportioned in terms of color
ggplot(diamonds, aes(x = cut, fill = color)) +
  geom_bar(position = "fill") +
  labs(x = "Cut", y = "Proportion") +
  theme_minimal()

# Bar charts of cuts proportioned in terms of clarity
ggplot(diamonds, aes(x = cut, fill = clarity)) +
  geom_bar(position = "fill") +
  labs(x = "Cut", y = "Proportion") +
  theme_minimal()

# (c) Scatter plot of cut, carat, and price
ggplot(diamonds, aes(x = carat, y = price, color = cut)) +
  geom_point() +
  labs(x = "Carat", y = "Price") +
  theme_minimal()

```

```{r}
# Exercise 3

# Load required library
library(ggplot2)

# Load the iris dataset
data(iris)

# (a) Obtain PC scores

# Perform PCA on the four numeric columns (column 5, Species, is dropped)
pca <- prcomp(iris[, -5], scale. = TRUE)
# Extract PC scores
pc_scores <- as.data.frame(pca$x)

# (b) Scatter plot representing PC1 vs. PC2 with data clusters marked

# Combine PC scores with species
pc_scores$Species <- iris$Species
# Plotting
ggplot(pc_scores, aes(x = PC1, y = PC2, color = Species)) +
  geom_point() +
  labs(x = "PC1", y = "PC2") +
  theme_minimal()

# (c) Correlation heatmap between PC scores

# Correlate only the numeric PC columns; Species is a factor and must be excluded
correlation_matrix <- cor(pc_scores[, c("PC1", "PC2", "PC3", "PC4")])
# Reshape the matrix to long format (columns Var1, Var2, value) for ggplot
correlation_long <- as.data.frame(as.table(correlation_matrix), responseName = "value")
# Plotting
ggplot(correlation_long, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  scale_fill_gradient(low = "blue", high = "red") +
  labs(x = "PC Scores", y = "PC Scores", fill = "Correlation") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

```
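
As a sanity check on part (c): principal component scores are uncorrelated by construction, so the heatmap is expected to show ones on the diagonal and values near zero everywhere else. The sketch below uses only base R and the built-in `iris` data to confirm this numerically, and also prints the proportion of variance explained by each component. The object names (`pca_check`, `cor_pc`, `max_offdiag`, `var_explained`) are illustrative and not part of the original solution.

```r
# Refit the PCA on the four numeric iris columns (standardised)
pca_check <- prcomp(iris[, 1:4], scale. = TRUE)

# Correlation matrix of the PC scores
cor_pc <- cor(pca_check$x)

# Largest absolute off-diagonal correlation; expected to be zero up to rounding error
max_offdiag <- max(abs(cor_pc[upper.tri(cor_pc)]))
print(max_offdiag)

# Proportion of variance explained by each component (the proportions sum to 1)
var_explained <- pca_check$sdev^2 / sum(pca_check$sdev^2)
print(round(var_explained, 3))
```

Because the off-diagonal correlations vanish, a heatmap of the variable loadings or of the original variables may be more informative than one of the scores themselves.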

```{r}
# Exercise 4

# Load required libraries
library(ggplot2)
library(MASS)  # provides the Animals dataset and boxcox()

# Load the Animals dataset (body weight in kg, brain weight in g)
data(Animals)

# Task 1: Check normality of variables

# Histogram and Q-Q plot for brain weight
ggplot(Animals, aes(x = brain)) +
  geom_histogram(fill = "blue", color = "black") +
  labs(x = "Brain Weight (g)", y = "Frequency") +
  theme_minimal()

qqnorm(Animals$brain)
qqline(Animals$brain)

# Histogram and Q-Q plot for body weight
ggplot(Animals, aes(x = body)) +
  geom_histogram(fill = "green", color = "black") +
  labs(x = "Body Weight (kg)", y = "Frequency") +
  theme_minimal()

qqnorm(Animals$body)
qqline(Animals$body)

# Task 2: Find lambda values for power transformation

# MASS::boxcox() takes a formula or lm object and returns the lambda grid (x)
# with the profile log-likelihood (y); take the lambda that maximises it
bc_brain <- boxcox(brain ~ 1, data = Animals, plotit = FALSE)
lambda_brain <- bc_brain$x[which.max(bc_brain$y)]
bc_body <- boxcox(body ~ 1, data = Animals, plotit = FALSE)
lambda_body <- bc_body$x[which.max(bc_body$y)]

# Task 3: Apply power transformation and check normality

# Power transformation for brain weight (log is used when lambda is effectively 0,
# because x^0 would collapse every value to 1)
transformed_brain <- if (abs(lambda_brain) < 1e-8) log(Animals$brain) else Animals$brain^lambda_brain
# Histogram and Q-Q plot for transformed brain weight
ggplot(data.frame(transformed_brain), aes(x = transformed_brain)) +
  geom_histogram(fill = "blue", color = "black") +
  labs(x = "Transformed Brain Weight", y = "Frequency") +
  theme_minimal()

qqnorm(transformed_brain)
qqline(transformed_brain)

# Power transformation for body weight (same log fallback)
transformed_body <- if (abs(lambda_body) < 1e-8) log(Animals$body) else Animals$body^lambda_body
# Histogram and Q-Q plot for transformed body weight
ggplot(data.frame(transformed_body), aes(x = transformed_body)) +
  geom_histogram(fill = "green", color = "black") +
  labs(x = "Transformed Body Weight", y = "Frequency") +
  theme_minimal()

qqnorm(transformed_body)
qqline(transformed_body)

# Task 4: Create scatter plot of transformed data
ggplot(data.frame(brain = transformed_brain, body = transformed_body),
       aes(x = brain, y = body)) +
  geom_point() +
  labs(x = "Transformed Brain Weight", y = "Transformed Body Weight") +
  theme_minimal()

```
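
A plain power `x^lambda` collapses to a constant when lambda is zero, which is why the Box-Cox family is usually written as `(x^lambda - 1) / lambda`, with `log(x)` as the limiting case at `lambda = 0`. The helper below is a minimal sketch of that form. `box_cox_transform` and its `tol` argument are hypothetical names introduced here for illustration; they are not part of the original solution or of any package.

```r
# Hypothetical helper: Box-Cox transform of a strictly positive numeric vector
box_cox_transform <- function(x, lambda, tol = 1e-8) {
  stopifnot(is.numeric(x), all(x > 0))
  if (abs(lambda) < tol) {
    log(x)                    # limiting case as lambda approaches 0
  } else {
    (x^lambda - 1) / lambda   # standard Box-Cox form
  }
}

x <- c(1, 4, 9)
bc_half <- box_cox_transform(x, 0.5)  # (sqrt(x) - 1) / 0.5
bc_zero <- box_cox_transform(x, 0)    # falls back to log(x)
print(bc_half)
print(bc_zero)
```

Under this form the transformed values vary smoothly as lambda passes through zero, so a lambda estimated near zero does not need special handling by the caller.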
