1. Simple Random Sampling is best described as:
a) Selecting every nth individual from a population
b) Selecting individuals based on specific characteristics
c) Randomly selecting individuals, giving each an equal chance of
being chosen
d) Dividing the population into clusters and selecting a few for study
2. Stratified Sampling is used to:
a) Increase the efficiency of sampling by selecting random samples
from each subgroup
b) Select samples based on convenience or accessibility
c) Choose samples based on a predetermined quota
d) Randomly select samples without considering subgroups
3. Which of the following is a limitation of Simple Random
Sampling?
a) It is complex to implement
b) It may not capture specific subgroups of interest
c) It is only suitable for small populations
d) It requires detailed prior knowledge of the population structure
4. In Hypothesis Testing, the Null Hypothesis (H0) usually
states that:
a) There is a significant effect or difference
b) There is no significant effect or difference
c) The sample data is biased
d) The observed effect is due to chance
5. In R programming, which function is typically used for
random sampling?
a) rnorm()
b) sample()
c) set.seed()
d) seq()
6. The 'set.seed()' function in R is used to:
a) Generate a sequence of numbers
b) Set the starting point for producing a sequence of random
numbers
c) Sample a set number of observations from data
d) Create a variable
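Note on questions 5 and 6: a minimal sketch of how seeding interacts with random sampling. The seed value 42 and the variable names are arbitrary choices made for illustration.

```r
# Re-seeding the generator with the same value reproduces the same draw.
set.seed(42)                     # 42 is an arbitrary seed
first_draw <- sample(1:100, 5)   # 5 values drawn without replacement

set.seed(42)                     # reset to the same starting point
second_draw <- sample(1:100, 5)

identical(first_draw, second_draw)  # TRUE
```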
7. Cluster sampling involves:
a) Dividing the population into groups and selecting all members
from randomly chosen groups
b) Selecting every nth individual from a list
c) Randomly selecting individuals without any group division
d) Dividing the population into strata and randomly selecting from
each stratum
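Note on question 7: one way to sketch cluster sampling in base R. The population below (200 students in 10 schools) is hypothetical and exists only to make the example runnable.

```r
set.seed(1)  # arbitrary seed so the draw is reproducible

# Hypothetical population: 200 students spread evenly over 10 schools (the clusters)
population <- data.frame(
  id     = 1:200,
  school = rep(paste0("school_", 1:10), each = 20)
)

# Step 1: randomly choose whole clusters
chosen_schools <- sample(unique(population$school), 3)

# Step 2: keep every member of the chosen clusters
cluster_sample <- population[population$school %in% chosen_schools, ]

nrow(cluster_sample)  # 60: 3 schools x 20 students each
```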
8. In hypothesis testing, Type I error occurs when:
a) The null hypothesis is incorrectly rejected
b) The null hypothesis is incorrectly accepted
c) There is a mistake in data collection
d) The test statistic is miscalculated
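Note on question 8: a rough simulation sketch of the Type I error rate. It assumes a one-sample t-test at a 0.05 significance level on data for which the null hypothesis (mean 0) really is true; the sample size, number of repetitions, and seed are arbitrary.

```r
set.seed(123)   # arbitrary seed
alpha <- 0.05   # significance level

# Draw 2000 samples for which H0 (mean = 0) is true and test each one
p_values <- replicate(2000, t.test(rnorm(20))$p.value)

# Share of tests that rejected a true null hypothesis
false_positive_rate <- mean(p_values < alpha)
false_positive_rate  # should land close to alpha
```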
9. Which function in R is used for linear regression?
a) lm()
b) glm()
c) lapply()
d) regress()
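Note on question 9: a short sketch of fitting a simple linear regression. It uses the built-in `mtcars` dataset, which is chosen here only for convenience.

```r
# Model fuel economy (mpg) as a linear function of car weight (wt)
fit <- lm(mpg ~ wt, data = mtcars)

coef(fit)     # intercept and slope
summary(fit)  # standard errors, t-values, R-squared
```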
10. For data visualization in R, which package provides
extensive graphing capabilities?
a) dplyr
b) tidyr
c) ggplot2
d) shiny
11. In stratified sampling, strata are formed based on:
a) Random selection
b) Characteristics irrelevant to the study
c) Specific characteristics that are relevant to the research question
d) Geographic location only
12. In R, which command is used to read a CSV file?
a) read.csv()
b) fileopen()
c) opencsv()
d) loadcsv()
13. The main advantage of stratified sampling over simple
random sampling is:
a) It is easier to implement
b) It ensures representation of all subgroups
c) It requires less knowledge about the population
d) It is less time-consuming
14. What is the primary use of the 't.test()' function in R?
a) To perform ANOVA
b) To compare means between two groups
c) To create plots
d) To perform cluster analysis
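Note on question 14: a minimal sketch of a two-sample comparison. The built-in `mtcars` dataset and the choice of grouping variable (`am`, transmission type) are assumptions made for illustration.

```r
# Compare mean mpg between automatic (am = 0) and manual (am = 1) cars.
# With the formula interface, t.test() runs a Welch two-sample test by default.
result <- t.test(mpg ~ am, data = mtcars)

result$estimate  # the two group means
result$p.value   # p-value for the difference in means
```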
15. Which of the following is not a principle of sampling?
a) Representativeness
b) Bias elimination
c) Maximum variability
d) Randomization
16. In R, 'NA' represents:
a) A named argument
b) A non-applicable function
c) Missing or undefined data
d) Negative association
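Note on question 16: a small sketch of how `NA` behaves in calculations. The vector `x` is made up for the example.

```r
x <- c(4, NA, 10)

is.na(x)               # FALSE TRUE FALSE
mean(x)                # NA, because one value is missing
mean(x, na.rm = TRUE)  # 7, missing values are dropped first
```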
17. Which sampling method is best for ensuring each
subgroup within a population is represented?
a) Simple random sampling
b) Stratified sampling
c) Cluster sampling
d) Systematic sampling
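Note on question 17: one possible base R sketch of drawing a fixed number of rows from each subgroup. The population, the group labels, and the per-group sample size of 5 are all hypothetical.

```r
set.seed(7)  # arbitrary seed

# Hypothetical population with three subgroups of unequal size
population <- data.frame(
  id    = 1:90,
  group = rep(c("A", "B", "C"), times = c(60, 20, 10))
)

# Split row indices by subgroup, then draw 5 rows at random from each
rows_by_group <- split(seq_len(nrow(population)), population$group)
picked <- unlist(lapply(rows_by_group, function(rows) sample(rows, 5)))
stratified_sample <- population[picked, ]

table(stratified_sample$group)  # 5 rows from each of A, B and C
```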
18. A p-value in hypothesis testing signifies:
a) The probability that the null hypothesis is true
b) The probability of obtaining data at least as extreme as those
observed, assuming the null hypothesis is true
c) The likelihood of a Type II error
d) The effect size of the test
19. In R, which command is used for installing packages?
a) install.packages()
b) library()
c) require()
d) loadpackages()
20. The 'ggplot2' package in R is primarily used for:
a) Data manipulation
b) Statistical modeling
c) Data visualization
d) Machine learning
21. Simple random sampling ensures:
a) Each cluster is represented
b) Each stratum is represented
c) Each individual has an equal chance of being selected
d) The sample is convenient to collect
22. In hypothesis testing, a Type II error occurs when:
a) The null hypothesis is falsely rejected
b) The null hypothesis is falsely accepted
c) The alternative hypothesis is falsely rejected
d) The alternative hypothesis is falsely accepted
23. Which R function is used for generating normal
distribution values?
a) rnorm()
b) runif()
c) rbinom()
d) rpois()
24. Which of the following is an advantage of cluster
sampling?
a) It eliminates bias
b) It is cost-effective for large populations
c) It guarantees every subgroup is represented
d) It is the most accurate sampling method
25. In R, how do you view the structure of a dataset?
a) view()
b) str()
c) head()
d) summary()
26. What is the primary purpose of ANOVA?
a) To compare means of two groups
b) To compare variances within groups
c) To compare means across more than two groups
d) To assess correlation between two variables
27. What does 'ANOVA' stand for?
a) Analysis Of Variance
b) Aggregate Normalization of Variants
c) Association of Variable Analysis
d) Analytical Observation of Variables
28. ANOVA is used to compare:
a) Two means
b) Two variances
c) Means of three or more groups
d) Variances of three or more groups
29. Which of the following is a key assumption of ANOVA?
a) All groups are dependent
b) Normal distribution of data
c) All variables are qualitative
d) Data is collected through observation only
30. In ANOVA, if the null hypothesis is true, then:
a) There is a significant difference between group means
b) There is no significant difference between group means
c) The p-value is less than the significance level
d) The F-statistic is equal to zero
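Note on questions 26 to 30: a short sketch of a one-way ANOVA in R. It uses the built-in `PlantGrowth` dataset (plant weight under three conditions), picked here only because it ships with R.

```r
# One-way ANOVA: does mean plant weight differ across the three groups?
fit <- aov(weight ~ group, data = PlantGrowth)

summary(fit)  # F-statistic and p-value for the group effect
```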
31. The `ggvis` package in R is used primarily for:
a) Data manipulation
b) Data visualization
c) Statistical modeling
d) Machine learning algorithms
32. Which of the following is a key feature of the `dplyr`
package in R?
a) Creating interactive web applications
b) Data manipulation and transformation
c) Statistical tests
d) Generating reports
33. In R, the `%%` operator is used for:
a) Matrix multiplication
b) Exponentiation
c) Integer division
d) Finding the remainder of division
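Note on question 33: a quick sketch contrasting `%%` with the related arithmetic operators. The numbers are arbitrary.

```r
remainder <- 17 %% 5    # 2, remainder of the division
quotient  <- 17 %/% 5   # 3, integer division
power     <- 2 ^ 3      # 8, exponentiation

# %*% is matrix multiplication; multiplying by the identity returns the matrix
m <- matrix(1:4, nrow = 2)
product <- m %*% diag(2)
```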
34. A scatter plot in R can be created using which function?
a) plot()
b) scatter()
c) graph()
d) scatterplot()
35. In R, the `cor()` function is used to:
a) Calculate correlation
b) Conduct regression analysis
c) Plot data
d) Clean data
36. The `RMarkdown` framework is used for:
a) Statistical analysis
b) Data visualization
c) Creating dynamic documents
d) Database connections
37. In R, the `sapply()` function is a type of:
a) Loop
b) Conditional statement
c) User-defined function
d) Apply function
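Note on question 37: a small sketch comparing `sapply()` with `lapply()`. The squaring function is just a stand-in for any per-element computation.

```r
# sapply() applies a function to each element and simplifies the result to a vector
squares <- sapply(1:5, function(i) i^2)   # 1 4 9 16 25

# lapply() does the same work but always returns a list
squares_list <- lapply(1:5, function(i) i^2)

is.list(squares)       # FALSE
is.list(squares_list)  # TRUE
```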
38. What does a boxplot visually represent in
data analysis?
a) Correlation between two variables
b) Distribution of a dataset
c) Linear regression model
d) Time series data
39. What is the purpose of the `hist()` function in R?
a) To create scatter plots
b) To generate histograms
c) To perform hypothesis testing
d) To compute correlations
40. In R, which function is used for calculating the
mean?
a) sum()
b) median()
c) mean()
d) average()
41. What does the function `sd()` compute in R?
a) Sum of data
b) Standard deviation
c) String distance
d) Sequential digits
42. In R, what does the function `var()` compute?
a) Variance
b) Vector length
c) Variable type
d) Value assignment
43. What does the `summary()` function in R
provide for a dataset?
a) A detailed analysis of each variable
b) A textual plot of the data
c) Summary statistics like Min, Max, Mean
d) A list of variables and functions used
44. What type of plot does the `boxplot()` function
in R create?
a) Histogram
b) Pie chart
c) Boxplot
d) Scatter plot
45. What is the main purpose of the `pairs()`
function in R?
a) To pair columns and rows in a data frame
b) To create a matrix of scatter plots
c) To calculate pairwise correlation
d) To merge two datasets
46. In R, the `rbind()` function is used for:
a) Binding vectors
b) Randomly shuffling data
c) Row-wise binding of matrices or data frames
d) Recursive binding
47. The `barplot()` function in R is used to create:
a) Bar charts
b) Line graphs
c) 3D plots
d) Heatmaps
48. In R, the `scale()` function is commonly used for:
a) Resizing plots
b) Standardizing variables
c) Scaling up computations
d) Scanning data frames
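Note on question 48: a minimal sketch of what `scale()` does with its default arguments (centering on the mean and dividing by the standard deviation). The input vector is made up.

```r
x <- c(2, 4, 6, 8)

# scale() returns a one-column matrix; as.numeric() flattens it to a vector
z <- as.numeric(scale(x))

mean(z)  # approximately 0
sd(z)    # 1
```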
49. In R, the `pie()` function is used for creating:
a) 3D plots
b) Pie charts
c) Histograms
d) Network diagrams
50. The function `t.test()` in R is used for:
a) Time-series testing
b) Testing table structures
c) Performing t-tests for comparing means
d) Testing for data trends
51. In R, a data.frame is:
a) A function for data analysis
b) A tabular data structure for storing datasets
c) A type of plot
d) An R package
52. The function prcomp() in R is used for:
a) Linear regression
b) Principal component analysis
c) Plotting data
d) Calculating p-values
53. In R, the apply() function is typically used for:
a) Applying a function to margins of an array
b) Plotting data
c) Database management
d) Machine learning
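Note on question 53: a short sketch of `apply()` on a small matrix. The second argument selects the margin: 1 for rows, 2 for columns.

```r
m <- matrix(1:6, nrow = 2)    # filled column-wise: rows are (1, 3, 5) and (2, 4, 6)

row_sums <- apply(m, 1, sum)  # 9 12
col_sums <- apply(m, 2, sum)  # 3 7 11
```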
54. In R, the table() function is used for:
a) Creating pivot tables
b) Data frame creation
c) Generating contingency tables
d) Data visualization
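Note on question 54: a brief sketch of cross-tabulating two categorical variables. The built-in `mtcars` dataset and the two columns used (`cyl`, `am`) are chosen only for illustration.

```r
# Count cars by number of cylinders (rows) and transmission type (columns)
counts <- table(mtcars$cyl, mtcars$am)

counts       # 3 x 2 table of counts
sum(counts)  # 32, one count per car in mtcars
```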
55. The shiny package in R is used to:
a) Clean data
b) Build interactive web applications
c) Conduct statistical tests
d) Generate reports
56. The survival package in R is used for:
a) Survival analysis
b) Data manipulation
c) Creating plots
d) Machine learning
57. In R, the dbplyr package is primarily used for:
a) Database interfacing with dplyr
b) Data visualization
c) Statistical modeling
d) Text analysis