MOOC Course - Introduction to R Software

July 2017

Solution to Assignment 7

1. The command table is used to generate frequency tables in R. The data should be in
vector form. Thus table(c(4,5,5,1,2,1,1,6,3,1)) is the correct command to obtain
the absolute frequencies of the given data in R.
Hence option a is correct.
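
As a quick check, the sketch below runs the command on the data from this question. The variable names x and freq are only illustrative.

```r
# Data from the question, stored as a vector
x <- c(4, 5, 5, 1, 2, 1, 1, 6, 3, 1)

# table() counts how often each distinct value occurs
freq <- table(x)
print(freq)
```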

2. The command table is used to generate frequency tables in R. The corresponding data
should be in vector form.
The output of the command
table(c(5,4,6,5,3,3,5,1,4,4,2,1,5,5,6,3,1,1,2,1,1)) in R is
1 2 3 4 5 6
6 2 3 3 5 2
Hence option c is correct.

3. The command quantile(x,probs=c(0.30,0.50,0.70)) is the correct command
to obtain the 3rd, 5th and 7th deciles of a data vector x in R.

Hence option d is correct.
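
A minimal sketch of the decile computation follows. The data vector 1:11 is an assumption chosen for illustration and does not come from the assignment.

```r
# Illustrative data (not from the assignment)
x <- 1:11

# Deciles are quantiles at multiples of 0.10, so the
# 3rd, 5th and 7th deciles use probs 0.30, 0.50 and 0.70
deciles <- quantile(x, probs = c(0.30, 0.50, 0.70))
print(deciles)

# The 5th decile coincides with the median
print(median(x))
```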

4. The command probs = seq(0, 1, 0.20) generates the probabilities in a sequence from 0 to 1
as 0.0, 0.2, 0.4, 0.6, 0.8 and 1.0, so it provides the 0th, 20th, 40th, 60th, 80th and 100th
percentiles, that is, the quantiles at these probabilities.

Hence option b is correct.
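
The sequence of probabilities can be inspected directly. In the sketch below the data vector 1:11 is again an illustrative assumption.

```r
# Probabilities from 0 to 1 in steps of 0.20
probs <- seq(0, 1, 0.20)
print(probs)

# Illustrative data (not from the assignment)
x <- 1:11

# One quantile per probability: the 0th, 20th, ..., 100th percentiles
q <- quantile(x, probs = probs)
print(q)
```

The first and last entries are the minimum and the maximum of the data, since they correspond to the probabilities 0 and 1.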

5. The command
table(c(5,4,6,5,3,3,5,1,4,4,2,1,5,5,6,3,1,1,2,1,1))/length(c(5,4,6,5,3,3,5,1,4,4,2,1,5,5,6,3,1,1,2,1,1))
will give the relative frequencies of the given data in tabular form. The command
barplot(table(c(5,4,6,5,3,3,5,1,4,4,2,1,5,5,6,3,1,1,2,1,1))/length(c(5,4,6,5,3,3,5,1,4,4,2,1,5,5,6,3,1,1,2,1,1)))
is used to obtain the bar plot of the given data in R based on the relative frequencies.

Hence option d is correct.
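
Storing the data in a variable first makes the same computation easier to read. The names x, rel_freq and rel_freq2 in this sketch are illustrative.

```r
# Data from the question
x <- c(5, 4, 6, 5, 3, 3, 5, 1, 4, 4, 2, 1, 5, 5, 6, 3, 1, 1, 2, 1, 1)

# Relative frequency = absolute frequency / number of observations
rel_freq <- table(x) / length(x)
print(round(rel_freq, 3))

# prop.table() is a base R shortcut for the same division
rel_freq2 <- prop.table(table(x))
print(sum(rel_freq2))
```

Passing rel_freq to barplot() then draws the bar plot from the stored table.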

6. The hist command gives the histogram of the given data in R. The data should
be stored in a vector.

Hence option b is correct.

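7. Writing the command hist(data,freq=FALSE) or hist(data,prob=TRUE) gives
the histogram of the given data in R based on the relative frequencies. The bars are drawn
on the density scale, so the area of each bar equals the relative frequency of its class.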

Hence option a is correct.
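
The link between the density scale and the relative frequencies can be checked without drawing anything. This sketch reuses the data of solution 5 as an assumed example and relies on plot = FALSE, which makes hist() return the computed bins instead of plotting them.

```r
# Example data (taken from solution 5 for illustration)
x <- c(5, 4, 6, 5, 3, 3, 5, 1, 4, 4, 2, 1, 5, 5, 6, 3, 1, 1, 2, 1, 1)

# Compute the histogram bins without plotting
h <- hist(x, plot = FALSE)

bin_width <- diff(h$breaks)            # width of each class
rel_freq  <- h$counts / sum(h$counts)  # relative frequency of each class

# density * width recovers the relative frequency of each class
print(all.equal(h$density * bin_width, rel_freq))

# so the total area of the bars is 1
print(sum(h$density * bin_width))
```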

8. To calculate the mean of the non-missing values in the data, the na.rm argument is
passed to mean as mean(data,na.rm=TRUE) or mean(data,na.rm=T).

Hence options a and b are both correct.
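
The effect of na.rm is easy to see on a small vector. The values below are an assumption for illustration, since the solution only refers to a generic vector called data.

```r
# Illustrative data with one missing value
data <- c(4, 5, 8, 6, NA)

# Without na.rm the missing value propagates and the result is NA
m_all <- mean(data)
print(m_all)

# With na.rm = TRUE the NA is dropped before averaging: (4 + 5 + 8 + 6) / 4
m_obs <- mean(data, na.rm = TRUE)
print(m_obs)
```

T works as a shorthand for TRUE, but T is an ordinary variable that can be reassigned, so writing TRUE in full is the safer habit.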

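9. The command length(c(4,5,8,6,NA),na.rm=TRUE) gives an error message because length
does not accept an na.rm argument, and length(c(4,5,8,6,NA)) returns 5 because the NA is
counted as an element. Neither command handles the missing value as required. Thus none of
the commands can be used.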

Hence option d is correct.
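
Assuming the goal is to count the non-missing values, the sketch below shows what the two commands do and one common base R alternative. The names n_all, n_obs and res are illustrative.

```r
x <- c(4, 5, 8, 6, NA)

# length() counts every element, including the NA
n_all <- length(x)
print(n_all)

# length() has no na.rm argument, so this call fails;
# tryCatch() captures the error instead of stopping the script
res <- tryCatch(length(x, na.rm = TRUE), error = function(e) "error")
print(res)

# is.na() flags missing entries; summing the negation counts the rest
n_obs <- sum(!is.na(x))
print(n_obs)
```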

10. To calculate the product of the non-missing values in the data, the na.rm argument
is used as prod(data,na.rm=TRUE) or prod(data,na.rm=T).

Hence option c is correct.

11. The command mean(1/c(24,15,35,51,42)) will give the mean of the reciprocals of the
individual data values. Thus 1/mean(1/c(24,15,35,51,42)) is the correct command to
obtain the harmonic mean.

Hence option d is correct.
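
The two steps can be written out separately. The variable names in this sketch are illustrative.

```r
# Data from the question
x <- c(24, 15, 35, 51, 42)

# Step 1: mean of the reciprocals
mean_recip <- mean(1 / x)

# Step 2: the harmonic mean is the reciprocal of that mean
hm <- 1 / mean_recip
print(hm)

# Equivalent textbook form: n divided by the sum of reciprocals
hm2 <- length(x) / sum(1 / x)
print(hm2)
```

For positive data that are not all equal, the harmonic mean is smaller than the arithmetic mean, which gives a simple sanity check on the result.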

12. To calculate the median of the non-missing values in the data, the na.rm
argument is used as median(data,na.rm=TRUE) or median(data,na.rm=T).

Hence option b is correct.

13. To calculate the variance of the non-missing values in the data, the na.rm argument is
used as var(data,na.rm=TRUE) or var(data,na.rm=T). The standard deviation is defined
as the square root of the variance.

Hence option c is correct.
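
A short sketch of the variance and standard deviation with missing values follows. The data values are an assumption for illustration.

```r
# Illustrative data with one missing value
data <- c(4, 5, 8, 6, NA)

# Sample variance of the non-missing values (divisor n - 1)
v <- var(data, na.rm = TRUE)
print(v)

# Standard deviation as the square root of the variance
s <- sqrt(v)
print(s)

# sd() with na.rm = TRUE gives the same value directly
print(sd(data, na.rm = TRUE))
```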

14. The command max(c(4,3,6,7,NA),na.rm=TRUE) will give the maximum element
among the non-missing numbers and similarly min(c(4,3,6,7,NA),na.rm=TRUE) will
give the minimum element among the non-missing numbers. Thus the range is
max(c(4,3,6,7,NA),na.rm=TRUE) - min(c(4,3,6,7,NA),na.rm=TRUE).

Hence option c is correct.
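
The computation can be checked as follows. The names r1 and r2 are illustrative, and the second form is offered only as an equivalent alternative.

```r
# Data from the question
x <- c(4, 3, 6, 7, NA)

# Range as maximum minus minimum of the non-missing values
r1 <- max(x, na.rm = TRUE) - min(x, na.rm = TRUE)
print(r1)

# range() returns c(min, max); diff() turns that pair into max - min
r2 <- diff(range(x, na.rm = TRUE))
print(r2)
```

Note that the base R function range() returns the pair of extremes rather than their difference.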

15. The IQR command gives the interquartile range of the data.
IQR(c(14,15,NA,11,12,11,11,16,NA,12),na.rm=TRUE) is the correct
command to obtain the interquartile range of the given data in R.

Hence option d is correct.
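
The interquartile range is the difference between the third and the first quartile, which the sketch below verifies with quantile() and its default settings. The variable names are illustrative.

```r
# Data from the question
x <- c(14, 15, NA, 11, 12, 11, 11, 16, NA, 12)

# Interquartile range of the non-missing values
iqr <- IQR(x, na.rm = TRUE)
print(iqr)

# Same value from the first and third quartiles
quartiles <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
iqr2 <- unname(quartiles[2] - quartiles[1])
print(iqr2)
```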

16. In this command, the differences between the data values and their mean are computed
first, abs takes the absolute values of these differences, and sum adds them up. Averaging
this sum over the number of observations gives the mean absolute deviation.

Hence option b is correct.
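
The exact command from the question is not reproduced here, so the following is only a sketch of the steps just described, on assumed data.

```r
# Illustrative data (not from the assignment)
x <- c(4, 5, 8, 6)

# Step 1: deviations from the mean
dev <- x - mean(x)

# Step 2: absolute values, Step 3: their sum, Step 4: average over n
mean_abs_dev <- sum(abs(dev)) / length(x)
print(mean_abs_dev)

# Compact equivalent
print(mean(abs(x - mean(x))))
```

The base R function mad() computes a scaled median absolute deviation, which is a different measure, so it is not a substitute here.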

17. The boxplot() command gives the boxplot of the given data. To obtain the
boxplot of the non-missing values in the data, na.rm is used as an argument, as
boxplot(data,na.rm=TRUE) or boxplot(data,na.rm=T).

Hence option c is correct.

18. The skewness() function in the 'moments' package gives the coefficient of
skewness of the given data. To obtain the coefficient of skewness of the non-missing values in
the data, na.rm is used as an argument, as
skewness(data,na.rm=TRUE) or skewness(data,na.rm=T).

Hence option b is correct.
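
For readers without the 'moments' package installed, the moment-based coefficient of skewness can be sketched in base R. The helper moment_skewness below is hypothetical and is not the package function; it assumes the definition m3 / m2^(3/2) with central moments divided by n, and the data are illustrative.

```r
# Hypothetical base R helper, not the 'moments' package function itself
moment_skewness <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]   # drop missing values on request
  d <- x - mean(x)               # deviations from the mean
  mean(d^3) / mean(d^2)^(3 / 2)  # third central moment, standardised
}

sym   <- c(1, 2, 3, 4, 5, NA)      # symmetric data
right <- c(1, 1, 2, 2, 3, 10, NA)  # long right tail

print(moment_skewness(sym, na.rm = TRUE))
print(moment_skewness(right, na.rm = TRUE))
print(moment_skewness(sym))  # NA, because the missing value is kept
```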

19. The kurtosis() function in the 'moments' package gives the coefficient
of kurtosis of the given data.

Hence option a is correct.

20. The kurtosis() function in the 'moments' package gives the coefficient of
kurtosis of the given data. To obtain the coefficient of kurtosis of the non-missing values in the
data, na.rm is used as an argument, as
kurtosis(data,na.rm=TRUE) or kurtosis(data,na.rm=T).

The obtained value is 1.668406.

Hence option c is correct.
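
The data behind the value above are not given in this solution, so the sketch below uses assumed data. The helper moment_kurtosis is hypothetical and written in base R; it assumes the moment-based definition m4 / m2^2 with central moments divided by n, which is not the excess kurtosis.

```r
# Hypothetical base R helper, not the 'moments' package function itself
moment_kurtosis <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]  # drop missing values on request
  d <- x - mean(x)              # deviations from the mean
  mean(d^4) / mean(d^2)^2       # fourth central moment, standardised
}

# Illustrative data with one missing value
x <- c(1, 2, 3, 4, 5, NA)

# m4 = 6.8 and m2 = 2 for the non-missing values, so the result is 6.8 / 4
k <- moment_kurtosis(x, na.rm = TRUE)
print(k)
```

Under this definition a normal distribution has kurtosis 3, so values below 3 indicate a flatter distribution with lighter tails.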
