Que. Bank of PA
Question Bank
UNIT 1
UNIT 2
6. What is the formula for calculating the mean of a dataset, and what does
the mean represent?
7. How would you summarize the distribution of a categorical variable? What
are some techniques or visualizations that can be used?
8. What is a crosstab, and how can it be used to analyze the relationship
between two categorical variables?
9. What is the purpose of using scatter plots to visualize relationships between
two continuous variables?
10. How does a heatmap help in visualizing relationships between multiple
variables, and what type of data is best suited for this visualization?
UNIT 3
1. What are the common techniques used for handling incorrect or inconsistent
values in datasets?
2. What is the difference between univariate and multivariate outlier
detection?
3. What are some common ways to fix skewness in numerical data?
4. How can ZIP codes be used as a predictive feature in machine learning
models?
5. What are the different sampling techniques, and when should each be used?
UNIT 4
UNIT 5