
week 7 assignment solution


Week 7: Advanced Data Visualization

1. For a frequency distribution of variable x, mean = 32, median = 30, mode = 26. The
distribution is:
a) Positively skewed (Hint: Mean > Median > Mode)
b) Negatively skewed (Hint: Mean < Median < Mode)
c) Mesokurtic (Hint: A distribution whose kurtosis matches that of a normal distribution,
so its excess kurtosis is close to zero)
d) Platykurtic (Hint: A distribution that is flatter than a normal distribution, with
thinner tails)
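
The ordering in option (a) can be checked on a small sample. The sketch below uses made-up values chosen only so that the mean, median and mode match the question (32, 30, 26). Base R has no built-in function for the statistical mode, so it is taken here as the most frequent value in a frequency table.

```r
# Hypothetical right-skewed sample (values invented for illustration)
x <- c(26, 26, 26, 28, 30, 30, 32, 34, 40, 48)

x_mean   <- mean(x)     # arithmetic mean
x_median <- median(x)   # middle value of the sorted data

# Most frequent value: tabulate, pick the largest count, convert the label back to a number
x_mode <- as.numeric(names(which.max(table(x))))

# Mean > Median > Mode indicates positive (right) skew
x_mean > x_median && x_median > x_mode
```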

2. Identify the incorrect statement.

a) mutate() is used to create new column variables in a data frame.
b) filter() is used to subset the rows of a data frame that satisfy a given rule/condition.
c) arrange() is used to order the rows according to a variable.
d) None of the above

Hint: mutate() creates new column variables in a data frame. filter() extracts the subset of
rows of a data frame that satisfy a certain rule or condition. arrange() reorders the rows of a
data frame (e.g., alphabetically, or by a numerical variable from high to low). All three
statements are therefore correct.
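
A minimal sketch of the three verbs, assuming the dplyr package is installed. The data frame and its column names are invented for illustration.

```r
library(dplyr)

# Hypothetical data frame (names and scores are made up)
df <- data.frame(name = c("a", "b", "c"), score = c(70, 55, 90))

with_pct <- mutate(df, pct = score / 100)   # adds a new column 'pct'
passed   <- filter(df, score >= 60)         # keeps only rows meeting the condition
ordered  <- arrange(df, desc(score))        # reorders rows from high to low score
```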

3. Which of the following options is correct for writing a comma-separated file in R?

a) read.csv("myfile.csv") [Hint: read.csv() is used for reading a comma-separated file into R]

b) write.csv(object, "myfile.csv") [Hint: write.csv() is used for writing an existing data
object into a comma-separated file in R]

c) saveRDS("myfile.csv") [Hint: saveRDS() is used for saving a single R object in RDS
format, not as a comma-separated file]

d) None of these
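
The three functions can be compared in a short round trip. This is only a sketch: the data frame is invented, and tempfile() is used so that nothing is written into the working directory.

```r
# Hypothetical data object (values made up for illustration)
df <- data.frame(id = 1:3, value = c(2.5, 3.0, 4.5))

# Write to a comma-separated file, then read it back
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)  # row.names = FALSE avoids an extra index column
df_back <- read.csv(csv_path)

# saveRDS()/readRDS() store and restore a single R object in RDS format
rds_path <- tempfile(fileext = ".rds")
saveRDS(df, rds_path)
df_rds <- readRDS(rds_path)
```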

4. To identify the levels in a variable, the variable should be of class:

a) Integer [Hint: Objects of class “integer” have no levels in R]
b) Character [Hint: Objects of class “character” contain data as strings, and hence the
levels() command is not suitable for identifying levels]

c) Factor [Hint: The levels() command is suitable for categorical variables. These variables
are defined as class “factor” in R]

d) Numeric [Hint: Objects of class “numeric” contain data as decimal numbers and hence do
not include categories/levels]
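
As an illustration, levels() returns NULL for a character vector and the categories for a factor. The variable names below are made up.

```r
# Same hypothetical data stored as character and as factor
gender_chr <- c("Male", "Female", "Female", "Male")
gender_fct <- factor(gender_chr)

chr_levels <- levels(gender_chr)  # NULL: a character vector has no levels
fct_levels <- levels(gender_fct)  # the distinct categories, sorted alphabetically
fct_class  <- class(gender_fct)   # "factor"
```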

5. Which of the following codes is correct to remove all the objects from the current
working environment?

a) rm(list = ls()) [Hint: This will remove all the objects from the current working
environment]

b) list(rm=T) [Hint: This will create a list containing one logical element named rm; it
removes nothing]

c) Ctrl+L [Hint: This will clear the console window]

d) None of these.
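
A sketch of options (a) and (b). The rm(list = ls()) call is wrapped in a hypothetical helper function here so that it only clears that function's own local variables rather than the reader's workspace; typed at the console, the same call clears the current working environment.

```r
# Hypothetical helper: creates two local objects, removes them all, then lists what is left
clear_demo <- function() {
  a <- 1
  b <- "text"
  rm(list = ls())  # removes every object in the calling (here: local) environment
  ls()             # nothing should remain
}
remaining <- clear_demo()

# Option (b) only builds a list; it does not remove anything
obj <- list(rm = TRUE)
obj_class <- class(obj)  # "list"
```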

6. Which of the following statements is TRUE about factors in R?

a. Factors are used to represent numeric data in R. [Hint: Factors represent categorical
data, such as 1/0 or Male/Female]
b. Factors are used to represent categorical data in R. [Hint: Factors represent
categorical data, such as 1/0 or Male/Female]
c. Factors are used to represent missing values in R. [Hint: Factors represent categorical
data, such as 1/0 or Male/Female]
d. Factors are used to represent character data in R. [Hint: Factors represent categorical
data, such as 1/0 or Male/Female]

7. Which of the following functions in R can be used to generate a sequence of numbers?

a. summary() [Hint: summary() is used to provide a descriptive summary of a
vector according to its class]
b. seq() [Hint: seq() is used to generate a sequence with the given
parameters]
c. dim() [Hint: dim() is used to find the row/column dimensions of a data frame]
d. length() [Hint: length() is used to find the length of a vector]
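
A short sketch of seq() next to the other functions in the options. The vectors and the data frame are invented for illustration.

```r
s1 <- seq(1, 10, by = 3)          # from 1 to 10 in steps of 3
s2 <- seq(0, 1, length.out = 5)   # 5 equally spaced values between 0 and 1

v <- c(4, 8, 15)
v_len <- length(v)                # number of elements in the vector

df <- data.frame(a = 1:3, b = 4:6)
df_dim <- dim(df)                 # rows, then columns
```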

8. Which of the following is the correct way to select the first row and first column
element of a data frame named ‘df’ in R?

a. df[1,] [Hint: df[1,] will select the first row of data frame ‘df’ and all the columns]
b. df[,1] [Hint: df[,1] will select all the rows and the first column from ‘df’]
c. df[-1,] [Hint: df[-1,] will select all the columns and all the rows except the first row
from ‘df’]
d. df[1,1] [Hint: df[1,1] will select the first row and first column element from ‘df’]
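
The four indexing forms can be tried on a small data frame. The contents of ‘df’ below are made up for illustration.

```r
# Hypothetical data frame named df
df <- data.frame(x = c(10, 20, 30), y = c("a", "b", "c"))

first_row   <- df[1, ]    # first row, all columns (still a data frame)
first_col   <- df[, 1]    # all rows of the first column (returned as a vector)
without_one <- df[-1, ]   # every row except the first, all columns
element     <- df[1, 1]   # single element in row 1, column 1
```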

9. Which of the following best describes the output of the following R code: sqrt(-16)
a. NaN [Hint: The square root of -16 is not defined for real numbers, so it will result in NaN]
b. 4 [Hint: The square root of -16 is not defined for real numbers, so it will result in NaN]
c. NA [Hint: The square root of -16 is not defined for real numbers, so it will result in NaN]
d. FALSE [Hint: The square root of -16 is not defined for real numbers, so it will result in NaN]
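
A brief sketch of this behaviour. R also issues a warning when sqrt() is given a negative number, which is suppressed here only to keep the output quiet. The complex-number line is an added aside, not part of the question.

```r
res <- suppressWarnings(sqrt(-16))  # NaN, normally accompanied by a warning

res_is_nan <- is.nan(res)  # TRUE: the result is "not a number"
res_is_na  <- is.na(res)   # also TRUE: is.na() treats NaN as missing

# Converting to complex first gives the imaginary root instead of NaN
root_complex <- sqrt(as.complex(-16))
```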

10. Which of the following is incorrect about the mean of a distribution?

(a) The extreme values (or outliers) may affect the mean. Hint: Extreme values distort the
estimate of the mean and may result in a mean that does not accurately represent the
population.
(b) For a symmetric distribution (e.g., normal distribution) mean and mode are the same.
Hint: For a symmetric unimodal distribution like the normal distribution, the mean and mode
are the same.
(c) Mean divides observations in two halves. Hint: It is the median that divides the
observations into two halves; if the distribution is not symmetric, the mean does not
necessarily do so.
(d) For standard normal distribution (z-distribution), mean is zero. Hint: The
z-transformation subtracts the mean and divides by the standard deviation, so the
standardized variable has zero mean irrespective of the original mean.
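
Points (a), (c) and (d) can be illustrated with a small sample containing one outlier. The values are invented for illustration.

```r
# Hypothetical sample with one extreme value
x <- c(2, 3, 4, 5, 100)

x_mean   <- mean(x)     # pulled upward by the outlier
x_median <- median(x)   # unaffected by the outlier

# The mean does not split the data in half: count the observations below it
n_below_mean <- sum(x < x_mean)

# z-transformation: subtract the mean, divide by the standard deviation
z <- (x - mean(x)) / sd(x)
z_mean <- mean(z)       # zero up to floating-point rounding
```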
