Data Wrangling Model Question Paper
Q.No. Question Bloom’s Taxonomy Level Course Outcomes Marks
UNIT-1
1.1 a) Define data wrangling and explain its significance in data analysis. L4 CO1 7M
b) List any five tasks involved in data wrangling and briefly describe them. L3 CO1 7M
(OR)
1.2 a) Explain why data wrangling is important for ensuring data quality in a machine learning pipeline. L1 CO1 7M
b) Illustrate the differences between CSV, JSON, and XML data formats with examples. L4 CO2 7M
UNIT-2
2.1 a) Define relational and non-relational databases. Give two examples of each. L3 CO2 7M
b) List the steps for installing Python packages required for Excel and PDF parsing. L2 CO2 7M
(OR)
2.2 a) Explain the differences between MySQL and PostgreSQL as relational databases. L2 CO2 7M
b) Describe the challenges of parsing PDF files programmatically and how Python helps to overcome them. L3 CO2 7M
UNIT-3
3.1 a) Define data cleanup and explain why it is essential in data preprocessing. L6 CO3 7M
b) List and briefly describe the steps involved in normalizing and standardizing data. L3 CO3 7M
(OR)
3.2 a) Explain the difference between finding duplicates and fuzzy matching during data cleanup. L6 CO3 7M
b) Illustrate the role of regular expressions (RegEx) in identifying patterns for data cleanup. L3 CO4 7M
UNIT-4
4.1 a) Define data exploration and list any four key functions involved in exploring data. L5 CO4 7M
b) Name three open-source platforms used for presenting and publishing data, and explain their basic purposes. L3 CO4 7M
(OR)
4.2 a) Explain how identifying correlations and outliers helps in data analysis. L3 CO4 7M
b) Discuss the significance of charts and maps in data visualization with examples. L5 CO4 7M
UNIT-5
5.1 a) Define web scraping and list any five common tasks it can be used for. L3 CO5 7M
b) Name the tools and libraries commonly used for advanced web scraping, such as browser-based parsing and spidering. L3 CO5 7M
(OR)
5.2 a) Discuss the significance of analyzing a web page before initiating a scraping process. L3 CO5 7M
b) Assess the effectiveness of PySpider for large-scale web crawling compared to Scrapy. L3 CO5 7M