EDA - With Python Question Bank

This document outlines the topics and questions covered in 5 units on exploratory data analysis (EDA). Unit 1 discusses the significance of EDA, steps involved, data types, measurement scales, software tools, pandas operations, and importance of visual aids. Unit 2 covers reading email data, filtering, handling missing values and outliers. Unit 3 discusses distributions, measures of central tendency and dispersion, matrix operations. Unit 4 differentiates between data reduction types and discusses time series analysis. Unit 5 defines statistical concepts, types of machine learning, and applications. Long questions provide examples and ask to explain concepts in more detail using Python functions.

Unit – I

Short Questions

1. What is the significance of EDA?
2. What are the steps involved in EDA?
3. What are the different types of data?
4. Discuss the different measurement scales of data.
5. Discuss the software tools available for EDA.
6. Write any three operations that can be performed on a pandas DataFrame.
7. What is the importance of visual aids in EDA?

Long Questions

8. Discuss the purpose of the NumPy and pandas libraries.
i. What is broadcasting? Write a Python program to illustrate it.
ii. How do you read a CSV file and create a CSV file? Write a Python program.
9. Explain the following concepts:
i. Numerical data
ii. Categorical data
iii. Measurement scales
10. What is the importance of visual aids in EDA? Visualize a line chart, bar chart, scatter plot, pie chart, and histogram with meaningful data using Python.
11. Discuss guidelines for choosing the best plot. Explain the different types of charts based on their purpose.
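
As a starting point for question 8, the following is a minimal sketch rather than a model answer. It assumes NumPy and pandas are installed; the student names, marks, and the file name students.csv are made up for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Broadcasting: a (3, 1) column and a (3,) row are stretched to a common (3, 3) shape.
column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])              # shape (3,)
broadcast_sum = column + row           # shape (3, 3)

# Create a CSV file from a DataFrame, then read it back.
# The column names and values are made-up illustration data.
students = pd.DataFrame({"name": ["Asha", "Ravi", "Meena"], "marks": [82, 74, 91]})
csv_path = os.path.join(tempfile.mkdtemp(), "students.csv")
students.to_csv(csv_path, index=False)   # index=False keeps the row index out of the file
loaded = pd.read_csv(csv_path)

print(broadcast_sum)
print(loaded)
```

Writing to a temporary directory keeps the example from touching the working directory; in an exam answer a plain relative path would do.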

Unit – II

Short Questions

1. What are the steps to read email data?
2. How do we perform filtering?
3. How do you filter data to weed out the noise?
4. Discuss column-wise and row-wise filtering using Python.
5. How do you diagnose missing values using Python?
6. How do you handle missing values using Python?
7. How do you detect and handle outliers?
8. Discuss the most frequently used aggregations.
9. Discuss some group-wise operations on a car dataset (body-style, drive-wheels, length, width, height, price).
10. Discuss some group-wise transformations.

Long Questions

11. Explain the following concepts using Python, with an Employee dataset (name, age, income, gender, department, grade, performance_score):
i. Row-wise and column-wise filtering
ii. Handling missing values: dropping missing values, filling missing values
iii. Handling outliers
iv. Feature encoding: one-hot encoding, label encoding, ordinal encoding
v. Filtering data: row-wise filtering, column-wise filtering
12. Discuss some group-wise operations on a car dataset (body-style {convertible, hardtop, sedan, wagon}, drive-wheels {fwd, rwd}, length, width, height, price): filtering groups, group-wise aggregations, and group-wise transformations.
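
The three group-wise operations named in question 12 could be sketched as follows. This is one possible approach using pandas `groupby`, not a prescribed answer. The rows are made up and only borrow the feature names from the question.

```python
import pandas as pd

# Made-up rows using the feature names from the question; not real car data.
cars = pd.DataFrame({
    "body-style": ["sedan", "sedan", "wagon", "convertible", "hardtop", "sedan"],
    "drive-wheels": ["fwd", "rwd", "fwd", "rwd", "rwd", "fwd"],
    "length": [170.0, 180.0, 175.0, 168.0, 172.0, 174.0],
    "price": [10000.0, 20000.0, 12000.0, 30000.0, 25000.0, 12000.0],
})

by_body = cars.groupby("body-style")

# Group-wise aggregation: one summary row per body style.
price_summary = by_body["price"].agg(["mean", "min", "max"])

# Filtering groups: keep only rows whose body style occurs at least twice.
frequent = by_body.filter(lambda group: len(group) >= 2)

# Group-wise transformation: each price minus the mean price of its own group.
cars["price_vs_group_mean"] = cars["price"] - by_body["price"].transform("mean")

print(price_summary)
print(frequent)
print(cars)
```

Aggregation returns one row per group, while a transformation returns one value per original row, which is why its result can be stored as a new column.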

Unit – III

Short Questions

1. How do you fit polynomials with NumPy?
2. How do you find the determinant, inverse, and rank of a matrix using Python?
3. How do you solve linear equations using NumPy?
4. Discuss matrix decomposition using SVD with Python.
5. Discuss the uniform distribution.
6. Discuss the normal distribution.
7. Discuss the exponential distribution.
8. Discuss the binomial distribution.
9. Discuss measures of central tendency.
10. Discuss measures of dispersion.
11. What is skewness? How do you measure skewness using a Python function?
12. What is kurtosis, and how do you measure it with a Python function?
13. What is a quartile, and how do you measure the interquartile range using Python?
14. D
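
For questions 1 to 3 of this unit, a minimal NumPy sketch might look like the following. The matrix, right-hand side, and sample points are arbitrary values chosen for illustration.

```python
import numpy as np

# Arbitrary 2x2 system chosen for illustration: 2x + y = 3, x + 3y = 5.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])

determinant = np.linalg.det(A)
inverse = np.linalg.inv(A)
rank = np.linalg.matrix_rank(A)
solution = np.linalg.solve(A, b)      # solves A @ solution = b

# Fit a degree-2 polynomial to points that lie exactly on y = x**2 + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
coeffs = np.polyfit(x, x**2 + 1, deg=2)   # coefficients, highest power first

print(determinant, rank, solution, coeffs)
```

`np.linalg.solve` is generally preferable to multiplying by the explicit inverse, since it avoids forming the inverse at all.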

Long Questions

15. How do you decompose a matrix using SVD? Explain the uniform, normal, exponential, and binomial distributions.
16. Discuss the following with suitable Python functions: skewness, kurtosis, interquartile range, and visualizing quartiles.
17. Define correlation. Explain bivariate analysis and multivariate analysis with correlation using Python.
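
For question 16, one possible set of "suitable Python functions" is the pandas Series methods below. This is a sketch on a made-up sample. The 1.5 × IQR outlier rule is a common convention added here as an assumption, not something the question specifies.

```python
import pandas as pd

# Made-up sample with one large value so the distribution is right-skewed.
values = pd.Series([2, 3, 3, 4, 4, 4, 5, 5, 6, 20])

skewness = values.skew()        # sample skewness (bias-corrected)
kurtosis = values.kurt()        # excess kurtosis (about 0 for normally distributed data)

q1, q3 = values.quantile(0.25), values.quantile(0.75)
iqr = q3 - q1                   # interquartile range

# Common rule of thumb: flag points beyond 1.5 * IQR from the quartiles.
outliers = values[(values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)]

print(skewness, kurtosis, iqr, outliers.tolist())
```

A positive skewness value indicates a longer right tail, which here comes from the single large observation.
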

Unit – IV

Short Questions

1. Differentiate between data reduction and data redundancy.
2. How do you perform numerosity data reduction?
3. How do you perform dimensionality data reduction?
4. Discuss time series data analysis.
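
Questions 2 and 3 can be illustrated with NumPy alone. The sketch below uses random sampling as one example of numerosity reduction and PCA computed through SVD as one example of dimensionality reduction. Other techniques exist for both, and the data values are made up.

```python
import numpy as np

# Made-up data: 6 samples, 2 strongly correlated features.
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

# Numerosity reduction by sampling: keep 3 of the 6 rows, chosen at random.
rng = np.random.default_rng(0)
sample = X[rng.choice(len(X), size=3, replace=False)]

# Dimensionality reduction by PCA.
# 1. Center each feature at zero mean.
X_centered = X - X.mean(axis=0)

# 2. SVD of the centered data; rows of Vt are the principal directions.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# 3. Fraction of total variance captured by each component.
explained_variance_ratio = S**2 / np.sum(S**2)

# 4. Project onto the first principal direction only: 2 features become 1.
X_reduced = X_centered @ Vt[:1].T      # shape (6, 1)

print(sample.shape, X_reduced.shape, explained_variance_ratio)
```

Sampling reduces the number of rows, while PCA reduces the number of columns.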

Long Questions

5. What is data reduction? How do you perform numerosity and dimensionality data reduction using Python?
6. What is PCA? How do you perform PCA on a toy dataset?
7. What is time series data analysis? Analyze the Open Power System Data (date, consumption, wind, solar, solar + wind).
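
Two typical first steps in time series analysis, resampling and rolling averages, can be sketched with pandas as below. The values are made up and do not come from the Open Power System Data. Only the column name consumption is borrowed from question 7.

```python
import pandas as pd

# Made-up daily values; only the column name echoes the question's data set.
dates = pd.date_range("2020-01-01", periods=14, freq="D")
series = pd.DataFrame({"consumption": range(100, 114)}, index=dates)

# Downsample daily values to weekly means (weeks end on Sunday by default).
weekly_mean = series["consumption"].resample("W").mean()

# 7-day rolling mean smooths short-term fluctuations.
rolling_7d = series["consumption"].rolling(window=7).mean()

print(weekly_mean)
print(rolling_7d)
```

The first six rolling values are missing because a full 7-day window is not yet available.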

Unit – V

Short Questions

1. Define the terms null hypothesis, alternative hypothesis, and level of significance.
2. Differentiate between Type I and Type II errors.
3. What is machine learning, and what are the types of machine learning?
4. Discuss accuracy in regression.
5. What are linear and nonlinear regression?
6. Discuss reinforcement learning.
7.

Long Questions

8. Explain hypothesis testing and the types of hypothesis testing with an example.
9. What is linear regression? Explain regression with an example.
10. What is machine learning? Discuss the types and applications of machine learning.
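
For question 9, a simple linear regression example might be sketched with NumPy as follows. The data points are made up to lie close to y = 2x + 1. R² is used here as one common accuracy measure for regression, which is an assumption rather than a metric named in the questions.

```python
import numpy as np

# Made-up data that roughly follows y = 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])

# Least-squares fit of a degree-1 polynomial: returns slope, then intercept.
slope, intercept = np.polyfit(x, y, deg=1)

y_pred = slope * x + intercept

# R^2: share of the variance in y explained by the fitted line.
ss_res = np.sum((y - y_pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(slope, intercept, r_squared)
```

An R² close to 1 means the fitted line explains almost all of the variation in y for this sample.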
