
S11202415 - Lab 2

The document creates a data frame from numeric vectors containing death counts by cause for years 2018-2020. It then calculates summary statistics like means and standard deviations for the death counts each year. Finally, it generates a boxplot to visualize the death counts across the years.

Uploaded by

lycanderek08
Copyright
© All Rights Reserved

> cod.2018 <- c(655381,599274,167127,159486,147810,122019,83564)
> cod.2019 <- c(659041,599601,173040,156979,150005,121499,87647)
> cod.2020 <- c(696962,598932,192176,156979,150005,133182,101106)
> Deaths <- factor(c("Heart","Cancer","Injury","Respiratory","Stroke","Alzheimer","Diabetes"))
>
> my.df <- cbind.data.frame(Deaths,cod.2018,cod.2019,cod.2020)
>
> View(my.df)
>
> # Rename the columns. Only the values are used as the new names,
> # so the mislabelled "cod.2016"/"cod.2017" element names are dropped.
> colnames(my.df) <- c("Deaths", "2018", "2019", "2020")
>
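
The construction and renaming steps above can also be done in one call. The following is a minimal sketch, not the lab's required method: the name `my.df2` is hypothetical, and `check.names = FALSE` is used so that R keeps the numeric column names instead of rewriting them as `X2018` and so on.

```r
# Same counts as in the transcript above
cod.2018 <- c(655381, 599274, 167127, 159486, 147810, 122019, 83564)
cod.2019 <- c(659041, 599601, 173040, 156979, 150005, 121499, 87647)
cod.2020 <- c(696962, 598932, 192176, 156979, 150005, 133182, 101106)
Deaths <- factor(c("Heart", "Cancer", "Injury", "Respiratory",
                   "Stroke", "Alzheimer", "Diabetes"))

# my.df2 is a hypothetical name; backticks allow non-syntactic
# column names such as `2018` to be given directly.
my.df2 <- data.frame(Deaths,
                     `2018` = cod.2018,
                     `2019` = cod.2019,
                     `2020` = cod.2020,
                     check.names = FALSE)

names(my.df2)
```

Because the column names start with a digit, they still have to be quoted with backticks when accessed, as in my.df2$`2018`.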
> # To find the Mean
> mean(my.df$`2018`)
[1] 276380.1
> mean(my.df$`2019`)
[1] 278258.9
> mean(my.df$`2020`)
[1] 289906
>
>
> # To find the Standard Deviation:
> #A. 2018
> sd(my.df$`2018`)
[1] 241880.9
>
> #B. 2019
> sd(my.df$`2019`)
[1] 242002.5
>
> #C. 2020
> sd(my.df$`2020`)
[1] 247720.5
>
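> # We can round these to one decimal place using the round function.
> # (sd.2018, sd.2019 and sd.2020 were never assigned, so sd() is called directly.)
> round(sd(my.df$`2018`),digits=1)
[1] 241880.9
> round(sd(my.df$`2019`),digits=1)
[1] 242002.5
> round(sd(my.df$`2020`),digits=1)
[1] 247720.5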
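
The per-year `mean()` and `sd()` calls above can be collapsed by applying each function to every year column at once. This is a sketch under the assumption that the year columns hold the same counts as in the transcript; the names `years`, `year.means`, `year.sds` and `stats` are hypothetical.

```r
# Year columns only, with the same counts as in the transcript
years <- data.frame(
  `2018` = c(655381, 599274, 167127, 159486, 147810, 122019, 83564),
  `2019` = c(659041, 599601, 173040, 156979, 150005, 121499, 87647),
  `2020` = c(696962, 598932, 192176, 156979, 150005, 133182, 101106),
  check.names = FALSE
)

# sapply() applies the function to each column and returns a named vector
year.means <- sapply(years, mean)
year.sds   <- sapply(years, sd)

# One row per statistic, one column per year, rounded to one decimal place
stats <- round(rbind(mean = year.means, sd = year.sds), digits = 1)
stats
```

The rounded values should agree with the individual results printed above.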
>
> #We can also summarize our column data with the summary function
> summary(my.df)
         Deaths       2018             2019             2020
 Alzheimer  :1   Min.   : 83564   Min.   : 87647   Min.   :101106
 Cancer     :1   1st Qu.:134915   1st Qu.:135752   1st Qu.:141594
 Diabetes   :1   Median :159486   Median :156979   Median :156979
 Heart      :1   Mean   :276380   Mean   :278259   Mean   :289906
 Injury     :1   3rd Qu.:383201   3rd Qu.:386321   3rd Qu.:395554
 Respiratory:1   Max.   :655381   Max.   :659041   Max.   :696962
 Stroke     :1
Question 5
> # Visualize the data with a boxplot
> boxplot(x = my.df[,-1],          # data frame without the first (Deaths) column
+         xlab = "Years",          # x-axis label
+         ylab = "Deaths",         # y-axis label
+         ylim = c(83000,700000),  # y-axis limits (83,000 to 700,000 deaths)
+         border = "blue",         # box border colour
+         boxfill = "yellow",      # box fill colour
+         frame.plot = TRUE)       # draw a frame around the plot
>

Question 6
The average number of deaths across the seven causes increased each year, from 276,380 in 2018 to 278,259 in 2019 and 289,906 in 2020.
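
The statement above can be checked by computing the year-over-year change in the means. This is a small sketch using the same counts as the transcript; `year.means`, `abs.change` and `pct.change` are hypothetical names.

```r
# Same counts as in the transcript
cod.2018 <- c(655381, 599274, 167127, 159486, 147810, 122019, 83564)
cod.2019 <- c(659041, 599601, 173040, 156979, 150005, 121499, 87647)
cod.2020 <- c(696962, 598932, 192176, 156979, 150005, 133182, 101106)

# Mean deaths per cause for each year
year.means <- c(`2018` = mean(cod.2018),
                `2019` = mean(cod.2019),
                `2020` = mean(cod.2020))

# Absolute change from one year to the next (2019 vs 2018, 2020 vs 2019)
abs.change <- diff(year.means)

# Percentage change relative to the earlier year
pct.change <- round(100 * abs.change / head(year.means, -1), 2)

abs.change
pct.change
```

Both differences are positive, and the rise from 2019 to 2020 (roughly 4%) is much larger than the rise from 2018 to 2019 (under 1%).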
