The document provides a comprehensive overview of various concepts in exploratory data analysis (EDA), including types of data, data visualization techniques, data cleaning, and imputation methods. It outlines key differences between primary, secondary, and tertiary data, discusses the significance of graphical representations, and highlights the importance of data quality and preprocessing. Additionally, it covers challenges in data accessing, feature engineering, and dimensionality reduction techniques, emphasizing their roles in effective data analysis.
EDA Question Bank Answers
1. Key Differences Between Primary, Secondary, and Tertiary Data
- Question: What are the key differences between Primary, Secondary, and Tertiary data? Provide examples of each.
- Answer:
  - Primary Data: Data collected directly by the researcher for a specific purpose.
    - Example: Conducting surveys or interviews to gather customer feedback.
  - Secondary Data: Data collected by someone else for a different purpose but reused by the researcher.
    - Example: Using government census data for market research.
  - Tertiary Data: Summarized or analyzed data, often derived from primary and secondary sources.
    - Example: Reading a review article that summarizes findings from multiple studies.
  - Diagram: A flowchart showing the relationship between primary, secondary, and tertiary data.

2. Bidimensional Graphical Representations
- Question: Explain Bidimensional Graphical Representations and their significance in data visualization.
- Answer:
  - Definition: Graphical representations that display the relationship between two variables.
  - Examples:
    - Scatter Plot: Shows the relationship between two continuous variables.
    - Heatmap: Displays values across two categorical dimensions using color intensity.
  - Significance: Helps identify patterns, trends, and correlations between variables.
    - Example: A scatter plot showing the relationship between advertising spend and sales revenue.
  - Graph: A scatter plot with a trend line.
3. Distributional Assumptions in EDA
- Question: Discuss the Distributional Assumptions in Exploratory Data Analysis (EDA).
- Answer:
  - Definition: Assumptions about the distribution of the data, such as normality, uniformity, or skewness.
  - Importance: These assumptions guide the choice of statistical tests and models.
  - Example: In hypothesis testing, the assumption of normality is crucial for tests like the t-test.
  - Graph: A histogram showing a normal distribution.

4. Exploratory Data Analysis (EDA)
- Question: Explain Exploratory Data Analysis (EDA) in detail, highlighting its importance and techniques.
- Answer:
  - Definition: EDA is the process of analyzing and summarizing datasets to understand their main characteristics, often using visual methods.
  - Techniques:
    - Summary Statistics: Mean, median, mode, standard deviation.
    - Visualizations: Histograms, box plots, scatter plots.
    - Data Cleaning: Handling missing values, outliers, and inconsistencies.
  - Importance: EDA helps identify patterns, trends, and anomalies in data.
  - Example: A data scientist uses EDA to analyze customer purchase behavior and identify key trends.
  - Diagram: A flowchart showing the steps in EDA (Data Collection → Data Cleaning → Visualization → Analysis).

5. Outliers and Their Types
- Question: Explain Outliers and their types. What techniques can be used to identify outliers in a dataset?
- Answer:
  - Definition: Outliers are data points that deviate significantly from the other data points in a dataset.
  - Types:
    - Univariate Outliers: Outliers in a single variable (e.g., an extremely high income).
    - Multivariate Outliers: Outliers in a combination of variables (e.g., a person with high income but very low spending).
  - Techniques to Identify Outliers:
    - Box Plot: Visualizes outliers using the interquartile range (IQR).
    - Z-Score: Flags outliers based on their distance from the mean in standard deviations.
  - Example: In a dataset of house prices, a house priced at $10 million while most houses are priced between $100,000 and $500,000 is an outlier.
  - Graph: A box plot showing outliers.
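The two detection techniques above can be sketched in plain Python. Note that the 1.5×IQR fences and the |z| > 3 cutoff are the usual conventions, not values taken from this document:

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]

def zscore_outliers(values, threshold=3.0):
    """Flag points more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs((v - mean) / sd) > threshold]

# The house-price example from the answer: the $10M house is flagged by the IQR rule.
prices = [120_000, 150_000, 200_000, 250_000, 300_000, 450_000, 10_000_000]
print(iqr_outliers(prices))  # → [10000000]
```

One caveat worth knowing: a single extreme value inflates the mean and standard deviation, so the z-score rule can miss the very outlier that distorts it; the IQR rule is more robust in that case.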
6. Imputation Techniques for Missing Data
- Question: Describe the different types of imputation techniques used to handle missing data. Provide examples of scenarios where each technique would be appropriate.
- Answer:
  - Definition: Imputation is the process of replacing missing data with substituted values.
  - Techniques:
    - Mean/Median Imputation: Replace missing values with the mean or median of the variable.
      - Example: Replacing missing age values with the median age.
    - Regression Imputation: Predict missing values using regression models.
      - Example: Predicting missing income values from education level.
    - K-Nearest Neighbors (KNN): Replace missing values with the average of the nearest neighbors.
      - Example: Filling missing values in a dataset of customer transactions.
  - Diagram: A flowchart showing different imputation techniques.
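The simplest of the techniques above, median imputation, can be sketched in a few lines of plain Python (libraries such as scikit-learn's SimpleImputer offer production-ready versions of mean, median, and KNN imputation):

```python
import statistics

def impute_median(values):
    """Replace None entries with the median of the observed values.

    A minimal sketch of median imputation on a single variable.
    """
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]

# The "missing age" example from the answer; 34.5 is the median of the observed ages.
ages = [23, None, 31, 45, None, 38]
print(impute_median(ages))  # → [23, 34.5, 31, 45, 34.5, 38]
```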
7. Nominal, Ordinal, Interval, and Ratio Data
- Question: Differentiate between Nominal, Ordinal, Interval, and Ratio data with examples.
- Answer:
  - Nominal Data: Categories with no inherent order (e.g., types of fruits).
  - Ordinal Data: Categories with a meaningful order (e.g., education levels: high school, bachelor's, master's).
  - Interval Data: Numerical data with no true zero (e.g., temperature in Celsius).
  - Ratio Data: Numerical data with a true zero (e.g., height, weight).
  - Example: A survey collects nominal data (gender), ordinal data (satisfaction level), interval data (temperature), and ratio data (income).
  - Diagram: A table comparing the four types of data.
8. Types of Data
- Question: Identify the type of data for each case:
  - a. Quarterly GDP growth rates of a country over five years.
  - b. Employment status of individuals tracked over five years.
  - c. Types of vegetables sold in a market.
  - d. Interviewing a scientist for their research findings.
  - e. Reading a scientific review article on a topic.
- Answer:
  - a. Interval Data: Quarterly GDP growth rates are numerical and can be negative, so there is no true zero.
  - b. Nominal Data: Employment status is categorical with no order.
  - c. Nominal Data: Types of vegetables are categorical with no order.
  - d. Primary Data: Interviewing a scientist collects data directly from the source.
  - e. Tertiary Data: A scientific review article summarizes findings from other studies.
9. Steps in Data Discovery
- Question: Explain the steps involved in Data Discovery and how they help in data analysis.
- Answer:
  - Steps:
    1. Data Collection: Gather raw data from various sources.
    2. Data Cleaning: Handle missing values, outliers, and inconsistencies.
    3. Data Exploration: Use visualizations and summary statistics to understand the data.
    4. Data Analysis: Apply statistical techniques to uncover patterns and insights.
  - Example: A data scientist discovers patterns in customer data by following these steps.
  - Diagram: A flowchart showing the data discovery process.
10. Unidimensional Graphical Representations
- Question: Explain Unidimensional Graphical Representations and their importance in data visualization.
- Answer:
  - Definition: Graphical representations that display the distribution of a single variable.
  - Examples:
    - Histogram: Shows the distribution of a continuous variable.
    - Bar Chart: Displays the frequency of each category of a categorical variable.
  - Importance: Helps in understanding the distribution and central tendency of a single variable.
  - Example: A histogram showing the distribution of ages in a population.
  - Graph: A histogram.
11. Data Quality Issues
- Question: Describe the common types of data quality issues encountered in raw datasets. Provide examples of how each issue can affect data analysis.
- Answer:
  - Common Issues:
    - Missing Data: Data points that were never recorded; gaps can bias averages and model estimates.
    - Duplicate Data: Repeated entries in the dataset; duplicates inflate counts and totals.
    - Inconsistent Data: Data that does not follow a consistent format (e.g., "NY" vs. "New York"); inconsistencies break grouping and joins.
  - Example: A dataset with many missing values can lead to inaccurate analysis if the gaps are ignored.
  - Diagram: A flowchart showing data quality issues and solutions.
12. Challenges in Data Accessing
- Question: Mention the challenges and issues related to Data Accessing in business analytics.
- Answer:
  - Challenges:
    - Data Privacy: Ensuring that sensitive data is protected.
    - Data Security: Preventing unauthorized access to data.
    - Data Accessibility: Ensuring that data is easily available to authorized users.
  - Example: A company faces challenges in accessing customer data due to privacy regulations.
13. Data Preprocessing
- Question: Define Data Preprocessing and explain its role in improving data quality.
- Answer:
  - Definition: The process of cleaning and transforming raw data into a usable format.
  - Role: Improves data quality and ensures that the data is ready for analysis.
  - Example: A dataset of customer transactions is preprocessed by removing duplicates and handling missing values.
  - Diagram: A flowchart showing the data preprocessing steps.

14. Types of Missing Data
- Question: What are the different types of missing data (MCAR, MAR, MNAR)? Explain with examples.
- Answer:
  - MCAR (Missing Completely at Random): The probability of a value being missing is unrelated to any variable, observed or unobserved.
    - Example: A survey respondent accidentally skips a question.
  - MAR (Missing at Random): The probability of missingness depends on other observed variables.
    - Example: Younger respondents are less likely to report their income.
  - MNAR (Missing Not at Random): The probability of missingness depends on the missing values themselves.
    - Example: High-income individuals are less likely to report their income.
  - Diagram: A flowchart showing the types of missing data.
15. Feature Engineering
- Question: Discuss the importance of feature engineering in data analysis.
- Answer:
  - Definition: The process of creating new features or transforming existing ones to improve model performance.
  - Importance: Enhances the predictive power of machine learning models.
  - Example: Creating a "day of the week" feature from a timestamp to predict customer behavior.
  - Diagram: A flowchart showing the feature engineering process.
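The "day of the week" example above can be sketched with the standard library; the field names (`timestamp`, `day_of_week`) are illustrative, not from the document:

```python
from datetime import datetime

def add_day_of_week(records):
    """Derive a 'day_of_week' feature from each record's ISO timestamp."""
    for r in records:
        ts = datetime.fromisoformat(r["timestamp"])
        r["day_of_week"] = ts.strftime("%A")  # e.g. "Saturday"
    return records

orders = [{"timestamp": "2024-01-06T14:30:00", "amount": 42.0}]
print(add_day_of_week(orders)[0]["day_of_week"])  # 2024-01-06 was a Saturday
```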
16. Data Transformation Techniques
- Question: Explain data transformation techniques and their applications in data preprocessing.
- Answer:
  - Definition: Techniques used to transform data into a format suitable for analysis.
  - Examples:
    - Normalization: Scaling data to a fixed range (e.g., 0 to 1).
    - Standardization: Scaling data to have a mean of 0 and a standard deviation of 1.
  - Example: Normalizing pixel values in an image dataset for machine learning.
  - Diagram: A flowchart showing data transformation techniques.

17. Data Normalization vs. Standardization
- Question: What is data normalization? How does it differ from data standardization?
- Answer:
  - Normalization: Rescales data to a fixed range, typically 0 to 1, using the minimum and maximum values.
    - Example: Scaling customer age data to a range of 0 to 1.
  - Standardization: Rescales data to have a mean of 0 and a standard deviation of 1, using the mean and standard deviation.
    - Example: Standardizing features in a dataset for linear regression.
  - Diagram: A comparison chart showing normalization and standardization.
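The difference between the two rescalings is easiest to see side by side; a minimal sketch in plain Python (scikit-learn's MinMaxScaler and StandardScaler are the usual library equivalents):

```python
import statistics

def normalize(values):
    """Min-max normalization to [0, 1]: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Z-score standardization: (x - mean) / stdev."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / sd for v in values]

ages = [18, 25, 40, 60]
print(normalize(ages))    # smallest age maps to 0.0, largest to 1.0
print(standardize(ages))  # result has mean 0 and unit standard deviation
```

Normalization preserves the shape of the data within a fixed range but is sensitive to extreme min/max values; standardization centers the data and is the common choice when a model assumes roughly zero-mean features.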
18. Correlation Analysis
- Question: Discuss the role of correlation analysis in data exploration.
- Answer:
  - Definition: Measures the strength and direction of the linear relationship between two variables, typically on a scale from -1 to +1.
  - Role: Helps identify relationships between variables in data exploration.
  - Example: A correlation analysis between advertising spend and sales revenue.
  - Graph: A scatter plot showing the correlation between two variables.
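The most common measure, Pearson's correlation coefficient, can be computed directly from its definition; the advertising/revenue numbers below are made up for illustration:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

ad_spend = [10, 20, 30, 40, 50]
revenue = [105, 210, 290, 420, 495]
print(round(pearson_r(ad_spend, revenue), 3))  # close to 1: strong positive correlation
```

Values near +1 indicate a strong positive linear relationship, near -1 a strong negative one, and near 0 little linear relationship (a nonlinear relationship may still exist).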
19. Data Sampling Techniques
- Question: Explain the different types of data sampling techniques used in data analysis.
- Answer:
  - Random Sampling: Every individual in the population has an equal chance of being selected.
    - Example: A lottery system where each ticket has an equal chance of being drawn.
  - Stratified Sampling: The population is divided into strata (subgroups), and samples are taken from each stratum.
    - Example: A researcher divides a population into age groups (e.g., 18-25, 26-35) and samples from each group.
  - Cluster Sampling: The population is divided into clusters, and entire clusters are randomly selected for analysis.
    - Example: A company divides its customers by region and randomly selects a few regions to survey.
  - Diagram: A comparison chart showing random, stratified, and cluster sampling.

20. Importance of Data Cleaning
- Question: Discuss the importance of Data Cleaning in business analytics.
- Answer:
  - Definition: The process of detecting and correcting (or removing) errors, inconsistencies, and inaccuracies in a dataset.
  - Importance:
    - Improves Data Quality: Clean data ensures accurate and reliable analysis.
    - Enhances Decision-Making: Clean data leads to better insights and decisions.
    - Saves Time and Resources: Cleaning data upfront reduces rework during analysis.
  - Example: A dataset of customer transactions is cleaned by removing duplicate entries, handling missing values, and correcting inconsistent formatting.
  - Diagram: A flowchart showing the data cleaning process.
21. Handling Duplicate Data
- Question: What are the various techniques to handle duplicate data in a dataset?
- Answer:
  - Techniques:
    - Removing Duplicates: Deleting repeated entries.
      - Example: Removing duplicate customer records from a database.
    - Merging Duplicates: Combining duplicate records into a single entry.
      - Example: Merging duplicate customer records that share the same name and address.
  - Importance: Handling duplicates ensures data accuracy and consistency.
  - Example: A retail company removes duplicate entries from its sales dataset to ensure accurate revenue calculations.
  - Diagram: A flowchart showing the process of handling duplicate data.
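The "remove duplicates" technique above can be sketched as keep-first deduplication on a chosen key; field names are illustrative, and pandas' DataFrame.drop_duplicates covers the same idea for tabular data:

```python
def drop_duplicates(records, key_fields):
    """Keep only the first record for each unique combination of key_fields."""
    seen = set()
    unique = []
    for r in records:
        key = tuple(r[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

customers = [
    {"name": "Ada", "address": "1 Main St", "spend": 100},
    {"name": "Ada", "address": "1 Main St", "spend": 100},  # duplicate entry
    {"name": "Bob", "address": "2 Elm St", "spend": 50},
]
print(len(drop_duplicates(customers, ["name", "address"])))  # → 2
```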
22. Data Aggregation
- Question: Explain the concept of data aggregation and its significance in analytics.
- Answer:
  - Definition: Data aggregation is the process of summarizing data into a more usable form, such as totals, averages, or counts.
  - Significance:
    - Simplifies Analysis: Aggregated data is easier to analyze and interpret.
    - Identifies Trends: Aggregation helps reveal patterns and trends in large datasets.
  - Example: A company aggregates daily sales data into monthly sales totals to analyze seasonal trends.
  - Diagram: A bar chart showing monthly sales totals.
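The daily-to-monthly example above can be sketched with a grouped sum; dates are assumed to be ISO "YYYY-MM-DD" strings, so the first seven characters give the month key:

```python
from collections import defaultdict

def monthly_totals(daily_sales):
    """Aggregate (date, amount) pairs into per-month totals."""
    totals = defaultdict(float)
    for date, amount in daily_sales:
        totals[date[:7]] += amount  # "2024-01-05" → "2024-01"
    return dict(totals)

sales = [("2024-01-05", 100.0), ("2024-01-20", 250.0), ("2024-02-03", 80.0)]
print(monthly_totals(sales))  # → {'2024-01': 350.0, '2024-02': 80.0}
```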
23. Visualization Techniques for Univariate and Multivariate Data
- Question: What are the common visualization techniques used for univariate and multivariate data?
- Answer:
  - Univariate Data: Data involving a single variable.
    - Visualization Techniques:
      - Histogram: Shows the distribution of a continuous variable.
      - Bar Chart: Displays the frequency of categorical variables.
    - Example: A histogram showing the distribution of ages in a population.
  - Multivariate Data: Data involving multiple variables.
    - Visualization Techniques:
      - Scatter Plot: Shows the relationship between two continuous variables.
      - Heatmap: Displays values across two dimensions using color intensity.
    - Example: A scatter plot showing the relationship between advertising spend and sales revenue.
  - Graph: A scatter plot with a trend line.
24. Box Plot and Histogram
- Question: Discuss Box Plot and Histogram as graphical tools for data distribution analysis.
- Answer:
  - Box Plot:
    - Definition: A graphical representation of data using quartiles to show the distribution and identify outliers.
    - Use: Helps visualize the spread and skewness of data.
    - Example: A box plot showing the distribution of house prices.
  - Histogram:
    - Definition: A graphical representation of the frequency distribution of a continuous variable.
    - Use: Helps understand the distribution and central tendency of data.
    - Example: A histogram showing the distribution of employee salaries.
  - Diagram: A side-by-side comparison of a box plot and a histogram.

25. Dimensionality Reduction Techniques (PCA)
- Question: Explain the importance of dimensionality reduction techniques like PCA in data analysis.
- Answer:
  - Definition: Dimensionality reduction techniques reduce the number of features in a dataset while preserving the most important information.
  - Principal Component Analysis (PCA):
    - Definition: A technique that projects data onto a lower-dimensional space along the directions (principal components) that maximize variance.
    - Use: Reduces the complexity of data while retaining its structure.
    - Example: Reducing the number of features in an image dataset from 1000 to 50 for faster processing.
  - Diagram: A graph showing the original data and the reduced-dimensional data after PCA.
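A minimal PCA sketch via the eigendecomposition of the covariance matrix, assuming NumPy is available (scikit-learn's PCA is the usual production route, and the 100×5 data below is random, purely for illustration):

```python
import numpy as np

def pca_reduce(X, k):
    """Project X (n_samples x n_features) onto its top-k principal components."""
    Xc = X - X.mean(axis=0)                 # center each feature
    cov = np.cov(Xc, rowvar=False)          # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]  # top-k variance directions
    return Xc @ top                         # coordinates in the reduced basis

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X_reduced = pca_reduce(X, 2)
print(X_reduced.shape)  # → (100, 2)
```

By construction the first returned component captures the most variance, the second the next most, which is why dropping trailing components loses the least information.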
26. Steps in Data Wrangling
- Question: Describe the steps involved in data wrangling and their importance.
- Answer:
  - Definition: Data wrangling is the process of cleaning, transforming, and integrating raw data into a usable format for analysis.
  - Steps:
    1. Data Collection: Gather raw data from various sources.
    2. Data Cleaning: Handle missing values, outliers, and inconsistencies.
    3. Data Transformation: Normalize, standardize, or aggregate data.
    4. Data Integration: Combine data from multiple sources.
  - Importance: Ensures that the data is ready for analysis.
  - Example: A data scientist wrangles customer data by cleaning, transforming, and integrating it into a single dataset for analysis.
  - Diagram: A flowchart showing the data wrangling process.
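The four wrangling steps above can be sketched as one small pipeline over lists of dicts; the function and field names (`wrangle`, `id`, `value`) are illustrative, not from the document:

```python
def wrangle(raw_sources):
    """Collect, clean, transform, and integrate lists of {id, value} records."""
    # 1. Collection: flatten records gathered from several sources.
    records = [r for source in raw_sources for r in source]
    # 2. Cleaning: drop records with a missing value.
    records = [r for r in records if r.get("value") is not None]
    # 3. Transformation: min-max normalize the value field.
    vals = [r["value"] for r in records]
    lo, hi = min(vals), max(vals)
    for r in records:
        r["value"] = (r["value"] - lo) / (hi - lo)
    # 4. Integration: index by id so sources can be joined on a key.
    return {r["id"]: r for r in records}

a = [{"id": 1, "value": 10}, {"id": 2, "value": None}]
b = [{"id": 3, "value": 30}]
print(sorted(wrangle([a, b])))  # → [1, 3]  (record 2 dropped in cleaning)
```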
27. Data Integrity and Security
- Question: Explain the role of data integrity and security in analytics.
- Answer:
  - Definition: Data integrity refers to the accuracy and consistency of data, while data security involves protecting data from unauthorized access or breaches.
  - Role:
    - Data Integrity: Ensures that data is accurate and reliable for analysis.
    - Data Security: Protects sensitive data from breaches and cyberattacks.
  - Example: A company implements encryption and access controls to protect customer data.
  - Diagram: A flowchart showing data integrity and security measures.
28. Impact of Biased Data
- Question: Discuss the impact of biased data in decision-making and machine learning models.
- Answer:
  - Definition: Biased data is data that is not representative of the population, leading to skewed results and inaccurate conclusions.
  - Impact:
    - Inaccurate Decisions: Biased data can lead to poor business decisions.
    - Biased Machine Learning Models: Models trained on biased data produce biased predictions.
  - Example: A biased dataset of job applicants leads to discriminatory hiring practices.
  - Diagram: A graph showing the impact of biased data on model predictions.
29. Handling Categorical Data
- Question: What are the best practices for handling categorical data in business analytics?
- Answer:
  - Techniques:
    - One-Hot Encoding: Converts each category of a variable into its own binary column.
      - Example: Converting "gender" into binary columns (Male: 1 or 0, Female: 1 or 0).
    - Label Encoding: Assigns a unique integer to each category.
      - Example: Converting "product type" into numerical values (e.g., Electronics: 1, Clothing: 2).
  - Importance: Proper handling of categorical data ensures accurate analysis and model performance.
  - Example: A machine learning model uses one-hot encoding to process categorical data like product categories.
  - Diagram: A flowchart showing the process of handling categorical data.

30. Data Governance
- Question: Explain the role of data governance in ensuring high-quality data management.
- Answer:
  - Definition: Data governance refers to the policies, processes, and standards for managing data quality, security, and accessibility.
  - Role:
    - Ensures Data Quality: Maintains accurate and consistent data.
    - Protects Data Security: Implements measures to prevent unauthorized access.
    - Promotes Data Accessibility: Ensures that data is easily available to authorized users.
  - Example: A company implements data governance policies to ensure that customer data is accurate, secure, and accessible.
  - Diagram: A flowchart showing the components of data governance.