week 7 assignment solution

Week 7: Advanced Data Visualization

1. For a frequency distribution of variable x, mean = 32, median = 30, mode = 26. The
distribution is:
a) Positively skewed (Hint: Mean>Median>Mode)
b) Negatively skewed (Hint: Mean < Median < Mode)
c) Mesokurtic (Hint: A distribution whose kurtosis matches that of the normal distribution,
i.e., excess kurtosis close to zero)
d) Platykurtic (Hint: A distribution that is flatter than the normal distribution, with
thinner tails)
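
Illustration: the ordering in option (a) can be checked on a small sample. This is a minimal sketch in R; the vector x is hypothetical and is not the distribution from the question.

```r
# Hypothetical right-skewed sample (not taken from the assignment).
x <- c(1, 2, 2, 2, 3, 4, 5, 9, 17)

sample_mean   <- mean(x)      # 5
sample_median <- median(x)    # 3
# Base R has no built-in statistical mode, so take the most frequent value.
sample_mode   <- as.numeric(names(which.max(table(x))))  # 2

# Mean > Median > Mode is the pattern of a positively skewed distribution.
sample_mean > sample_median && sample_median > sample_mode  # TRUE
```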

2. Identify the incorrect statement.


a) mutate() is employed to create new column variables in a dataframe.
b) filter() is employed to subset the rows of a dataframe that satisfy a given rule/condition.
c) arrange() is employed to order the rows according to a variable.
d) None of the above

Hint: mutate() creates new column variables in a dataframe. The filter() function is used to
extract subsets of rows from a data frame based on a certain rule or condition. arrange() is used
to reorder rows of a data frame (e.g., alphabetically or a numerical variable from high to low,
etc.).
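
Illustration: the three functions can be chained on a small data frame. This is a minimal sketch that assumes the dplyr package is installed; the data frame scores and its columns are hypothetical.

```r
library(dplyr)  # assumption: the dplyr package is installed

# Hypothetical data frame used only for illustration.
scores <- data.frame(
  student = c("A", "B", "C", "D"),
  marks   = c(45, 82, 67, 90)
)

result <- scores %>%
  mutate(percent = marks / 100) %>%  # create a new column variable
  filter(marks > 60) %>%             # keep only rows that satisfy the condition
  arrange(desc(marks))               # order the rows from high to low

print(result)  # students D, B, C in that order, with the new percent column
```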

3. Which of the following options is correct for writing a comma-separated file in R?

a) read.csv("myfile.csv") [Hint: read.csv is used for reading a comma-separated file in R]

b) write.csv(object, "myfile.csv") [Hint: write.csv is used for writing an existing data
object into a comma-separated file in R]

c) saveRDS("myfile.csv") [Hint: saveRDS is used for saving a single R object in RDS format]

d) None of these
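
Illustration: a minimal round-trip sketch in base R. The data frame scores and the temporary file paths are hypothetical stand-ins for object and "myfile.csv".

```r
# Hypothetical data object and temporary file paths.
scores   <- data.frame(id = 1:3, marks = c(45.5, 82, 67.25))
csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")

write.csv(scores, csv_path, row.names = FALSE)  # write a comma-separated file
from_csv <- read.csv(csv_path)                  # read it back into a data frame

saveRDS(scores, rds_path)                       # save one R object in RDS format
from_rds <- readRDS(rds_path)                   # restore that object

all.equal(scores, from_csv)
all.equal(scores, from_rds)
```

Passing row.names = FALSE keeps write.csv from adding an extra column of row names to the file.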

4. To identify the levels in a variable, the variable should be of class:

a) Integer [Hint: Objects of class “integer” have no levels in R]
b) Character [Hint: Objects of class “character” contain data as strings, and hence the
levels() command is not suitable for them]

c) Factor [Hint: The levels() command is suitable for categorical variables. These variables
are defined as class “factor” in R]

d) Numeric [Hint: Objects of class “numeric” contain decimal data and hence do not include
categories/levels]
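
Illustration: a minimal sketch of levels() applied to each class; the values are hypothetical.

```r
sizes_chr <- c("low", "high", "medium", "low")  # hypothetical values
sizes_fct <- factor(sizes_chr)

class(sizes_fct)     # "factor"
levels(sizes_fct)    # "high" "low" "medium" (sorted alphabetically by default)
levels(sizes_chr)    # NULL: a character vector has no levels
levels(1:3)          # NULL: an integer vector has no levels
levels(c(1.5, 2.5))  # NULL: a numeric vector has no levels
```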

5. Which of the following codes is correct to remove all the objects from the current
working environment?

a) rm(list = ls()) [Hint: This will remove all the objects from the current working
environment]

b) list(rm=T) [Hint: This will create a list containing a single logical element named rm; it
does not remove any objects]

c) Ctrl+L [Hint: This will clear the console window]

d) None of these.
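
Illustration: a minimal sketch contrasting options (a) and (b). The objects x, y and l are hypothetical, and running this in a live session removes every object in the working environment.

```r
x <- 1:5          # hypothetical objects
y <- "text"

# list(rm = T) removes nothing: it builds a list holding one logical value.
l <- list(rm = TRUE)
class(l)          # "list"
class(l$rm)       # "logical"

rm(list = ls())   # remove every object from the current working environment
ls()              # character(0)
```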

6. Which of the following statements is TRUE about factors in R?

a. Factors are used to represent numeric data in R. [Hint: Factors represent categorical
data, such as 1/0 or Male/Female]
b. Factors are used to represent categorical data in R. [Hint: Factors represent
categorical data, such as 1/0 or Male/Female]
c. Factors are used to represent missing values in R. [Hint: Factors represent categorical
data, such as 1/0 or Male/Female]
d. Factors are used to represent character data in R. [Hint: Factors represent categorical
data, such as 1/0 or Male/Female]

7. Which of the following functions in R can be used to generate a sequence of numbers?

a. summary() [Hint: summary() command is used to provide descriptive summary of a
vector according to its class]
b. seq() [Hint: seq() command is used to generate sequence with certain given
parameters]
c. dim() [Hint: dim() command is used to find row/column dimensions of a dataframe]
d. length() [Hint: length() command is used to find the length of a vector]
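
Illustration: a minimal sketch of the four functions in base R; the vector v and the small data frame are hypothetical.

```r
seq(1, 10, by = 2)          # 1 3 5 7 9
seq(0, 1, length.out = 5)   # 0.00 0.25 0.50 0.75 1.00

v <- seq(2, 20, by = 2)     # hypothetical vector: 2, 4, ..., 20
length(v)                   # 10
summary(v)                  # minimum, quartiles, mean and maximum of v

dim(data.frame(a = 1:3, b = 4:6))  # 3 2 (rows, columns)
```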

8. Which of the following is the correct way to select the first row and first column
element of a data frame named ‘df’ in R?

a. df[1,]. [Hint: df[1,] will select the first row of dataframe ‘df’ and all the columns]
b. df[,1]. [Hint: df[,1] will select all the rows and first column from ‘df’]
c. df[-1,]. [Hint: df[-1,] will select all the columns and all the rows except the first row
from ‘df’]
d. df[1,1]. [Hint: df[1,1] will select the first row and first column element from ‘df’]
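
Illustration: a minimal sketch of the four indexing forms; the contents of df are hypothetical.

```r
df <- data.frame(id = c(10, 20, 30), name = c("a", "b", "c"))  # hypothetical data

df[1, 1]   # 10: the first row, first column element
df[1, ]    # the first row with all columns (a one-row data frame)
df[, 1]    # 10 20 30: all rows of the first column, returned as a vector
df[-1, ]   # all columns, every row except the first
```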

9. Which of the following best describes the output of the following R code: sqrt(-16)
a. NaN. [Hint: The square root of -16 is not defined for real numbers, so R returns NaN]
b. 4. [Hint: The square root of -16 is not defined for real numbers, so R returns NaN]
c. NA. [Hint: The square root of -16 is not defined for real numbers, so R returns NaN]
d. FALSE. [Hint: The square root of -16 is not defined for real numbers, so R returns NaN]
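
Illustration: a minimal sketch of how R reports this result and how NaN differs from NA. The variable r is hypothetical; suppressWarnings() is used only to hide the "NaNs produced" warning that sqrt(-16) raises.

```r
r <- suppressWarnings(sqrt(-16))  # sqrt(-16) on its own also emits a warning
r            # NaN
is.nan(r)    # TRUE
is.na(r)     # TRUE: is.na() is also TRUE for NaN
is.nan(NA)   # FALSE: NA marks a missing value, not an undefined result

sqrt(as.complex(-16))  # 0+4i: the root exists once the input is complex
```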

10. Which of the following is incorrect about the mean of a distribution?


(a) The extreme values (or outliers) may affect the mean. Hint: Extreme values distort the
estimation of the mean and may result in a mean which does not accurately represent the
population.
(b) For a symmetric distribution (e.g., normal distribution), the mean and mode are the same.
Hint: For a symmetric distribution like the normal distribution, the mean and mode are the
same.
(c) The mean divides the observations into two halves. Hint: If the distribution is not
symmetric, then the mean does not necessarily split the observations into two
halves.
(d) For the standard normal distribution (z-distribution), the mean is zero. Hint: The
z-transformation leads to a mean of zero irrespective of the mean of the original
distribution.
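
Illustration: a minimal sketch of statements (a), (c) and (d) on a hypothetical sample with one extreme value.

```r
x <- c(1, 2, 3, 4, 100)     # hypothetical sample with one outlier

mean(x)                     # 22: pulled upward by the outlier
median(x)                   # 3
mean(x < mean(x))           # 0.8: 80% of the observations lie below the mean

z <- (x - mean(x)) / sd(x)  # z-transformation
mean(z)                     # zero, up to floating-point error
```

Here the mean leaves four of the five observations on one side, so it does not split the sample into two halves, while the median does.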
