This document contains R code to analyze several datasets:
1) It analyzes ratings data by finding the most frequent rating and creating a bar plot of the rating frequencies.
2) It analyzes health data by creating a bar plot of quality frequencies.
3) It analyzes income data by creating frequency distributions, relative frequency distributions, and a histogram of the income variable.
4) It analyzes gas price data by creating frequency distributions, relative frequency distributions, and a histogram of the price variable.
5) It analyzes asset return data by creating a scatterplot with a trend line and computing the correlation coefficient.
6) It analyzes health and lifestyle data by creating scatter plots of health against exercise and health against smoking.
Question No1
# Question 01
# Show the current working directory (use setwd() to change it)
getwd()

# Sample data of ratings
ratings <- c(4, 5, 3, 4, 5, 4, 3, 2, 4, 5, 4, 4, 3, 5, 4, 5, 4, 3, 4, 5)

# Construct the frequency distribution
rating_counts <- table(ratings)

# Display the frequency distribution
print("Frequency Distribution:")
print(rating_counts)

# Find the rating with the highest frequency
most_frequent_rating <- names(which.max(rating_counts))
print(paste("Rating with the most frequency:", most_frequent_rating))

# Construct a bar chart
barplot(rating_counts, main = "Quality Ratings of Entrées", xlab = "Rating", ylab = "Frequency", col = "brown", ylim = c(0, max(rating_counts) + 5))

# Question 02
# Show the current working directory
getwd()

# Import the dataset and label it
ssdd <- read.csv("Health.CSV", header = TRUE, sep = ",")

# Create the frequency table for Quality
Quality_Frequency <- table(ssdd$Quality)
Quality_Frequency
View(Quality_Frequency)

# Create the bar plot from the frequency table (barplot() needs counts, not the raw column)
barplot(Quality_Frequency, col = "red")
# On average, the quality reported for the patients is good

# Question 04
# Show the current working directory
getwd()

# Import the dataset and label it
ssdd <- read.csv("TRANSACTIONS.CSV", header = TRUE, sep = ",")

# Create the frequency and relative frequency distributions
# (cut() has no `left` argument; right = TRUE alone gives intervals of the form (a, b])
intervals <- seq(22000, 27000, by = 1000)
Income.cut <- cut(ssdd$Income, intervals, right = TRUE)
Income.frequency <- table(Income.cut)
Income.frequency
View(Income.frequency)
Income.prop <- prop.table(Income.frequency)
Income.prop
View(Income.prop)

# Create the histogram
hist(ssdd$Income, right = TRUE, main = "Histogram for the Income Variable", xlab = "Annual Income (in $1000s)", col = "black")

# Question 05
# Show the current working directory
getwd()

# Import the dataset and label it
ssdd <- read.csv("Gas_2019.csv", header = TRUE, sep = ",")

# Create the frequency and relative frequency distributions
intervals <- seq(1.70, 4.00, by = 0.30)
Price.cut <- cut(ssdd$Price, intervals, right = TRUE)
Price.frequency <- table(Price.cut)
Price.frequency
View(Price.frequency)
Price.prop <- prop.table(Price.frequency)
Price.prop
View(Price.prop)

# Create the histogram
hist(ssdd$Price, breaks = intervals, right = TRUE, main = "Histogram for the Price", xlab = "Price", col = "blue")

# Question 06
# Show the current working directory
getwd()

# Define the returns for Asset A and Asset B
return_A <- c(10, 8, 6, 4, 2)
return_B <- c(2, 4, 6, 8, 10)

# Create a scatterplot with a trend line
plot(return_A, return_B, xlab = "Return A (%)", ylab = "Return B (%)", main = "Scatterplot of Returns")
abline(lm(return_B ~ return_A), col = "red")

# Compute the correlation coefficient
# (this line is absent from the source text; cor() is assumed here)
correlation <- cor(return_A, return_B)

# Describe the direction of the relationship
message <- ifelse(correlation < 0, "negative", ifelse(correlation == 0, "no", "positive"))
cat("Including both assets in the portfolio may help diversify risk as their returns have a", message, "relationship.")

# Question 07
# Show the current working directory
getwd()

# Import the Healthy Living data and label it
ssdd <- read.csv("Healthy_Living.csv", header = TRUE, sep = ",")

# Plot the scatter plots (in y ~ x, Health is on the y-axis, so the axis labels follow that order)
plot(ssdd$Health ~ ssdd$Exercise, main = "Scatterplot of Health and Exercise", xlab = "Exercise", ylab = "Health", col = "red", pch = 16)
plot(ssdd$Health ~ ssdd$Smoking, main = "Scatterplot of Health and Smoking", xlab = "Smoking", ylab = "Health", col = "blue", pch = 16)
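
The frequency and relative frequency steps in Questions 04 and 05 can be tried without the CSV files. The sketch below is a minimal illustration, assuming a small made-up `income` vector (hypothetical values, not taken from TRANSACTIONS.CSV). It shows how `cut()` with `right = TRUE` places a boundary value in the lower bin and how `prop.table()` converts the counts into proportions.

```r
# Hypothetical incomes for illustration only (not from TRANSACTIONS.CSV)
income <- c(22500, 23100, 23900, 24000, 24750, 25200, 26800)

# Same break points as in Question 04
intervals <- seq(22000, 27000, by = 1000)

# right = TRUE builds intervals of the form (lower, upper],
# so the value 24000 is counted in (23000,24000]
income_cut <- cut(income, breaks = intervals, right = TRUE, dig.lab = 5)

# Frequency distribution: number of observations per interval
income_frequency <- table(income_cut)

# Relative frequency distribution: each count divided by the total
income_prop <- prop.table(income_frequency)

print(income_frequency)
print(round(income_prop, 3))
```

The `dig.lab = 5` argument is only there so the interval labels print as full numbers rather than in scientific notation; it does not change which interval a value falls into.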