AD3301 - Data Exploration and Visualization
L T P C: 3 0 2 4
OBJECTIVES:
To give an overview of exploratory data analysis.
To implement data visualization using Matplotlib.
To perform univariate data exploration and analysis.
To apply bivariate data exploration and analysis.
To use data exploration and visualization techniques for multivariate and time series data.
EDA fundamentals – Understanding data science – Significance of EDA – Making sense of data
– Comparing EDA with classical and Bayesian analysis – Software tools for EDA – Visual aids
for EDA – Data transformation techniques: merging databases, reshaping and pivoting –
Transformation techniques – Grouping datasets – Data aggregation – Pivot tables and
cross-tabulations.
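As an illustration of the transformation topics listed above (merging, grouping, aggregation, pivot tables, cross-tabulations, reshaping), here is a minimal sketch using pandas. The data frames `sales` and `stores` and all their values are hypothetical, invented only for this example; they are not part of the course material.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
sales = pd.DataFrame({
    "store": ["A", "A", "B", "B", "B"],
    "product": ["pen", "ink", "pen", "pen", "ink"],
    "units": [10, 4, 7, 3, 5],
})
stores = pd.DataFrame({"store": ["A", "B"], "city": ["Chennai", "Madurai"]})

# Merging: attach each store's city to its sales rows.
merged = sales.merge(stores, on="store", how="left")

# Grouping and aggregation: total units sold per store.
units_per_store = merged.groupby("store")["units"].sum()

# Pivot table: stores as rows, products as columns, summed units as values.
pivot = merged.pivot_table(index="store", columns="product",
                           values="units", aggfunc="sum", fill_value=0)

# Cross-tabulation: number of sales rows in each (city, product) cell.
counts = pd.crosstab(merged["city"], merged["product"])

# Reshaping: turn the wide pivot table back into long format.
long_form = pivot.reset_index().melt(id_vars="store", var_name="product",
                                     value_name="units")

print(pivot)
print(counts)
```

The pivot table sums a numeric column, whereas the cross-tabulation only counts rows; the two can look alike but generally answer different questions.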
Importing Matplotlib – Simple line plots – Simple scatter plots – Visualizing errors – Density and
contour plots – Histograms – Legends – Colors – Subplots – Text and annotation – Customization
– Three-dimensional plotting – Geographic data with Basemap – Visualization with Seaborn.
Introduction to single variable: Distributions and variables – Numerical summaries of level and
spread – Scaling and standardizing – Inequality – Smoothing time series.
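The single-variable topics above (level, spread, standardizing, smoothing) can be sketched with Python's standard library alone. This is a minimal illustration under stated assumptions: the `values` list is hypothetical, the median and population standard deviation are just one possible choice of level and spread, and the running median of three is one simple smoother among several; `running_median` is a helper written for this example, not a library function.

```python
from statistics import mean, median, pstdev

# Hypothetical observations, invented for illustration (note the outlier 30).
values = [12.0, 15.0, 11.0, 30.0, 14.0, 13.0, 16.0]

# Level and spread: one possible pair of numerical summaries.
level = median(values)
spread = pstdev(values)

# Standardizing: subtract the mean and divide by the standard deviation,
# so the result has mean 0 and standard deviation 1.
mu = mean(values)
z_scores = [(v - mu) / spread for v in values]

def running_median(series, window=3):
    """Smooth a series with a centered running median (odd window).

    End points that lack a full window are left unchanged.
    """
    half = window // 2
    smoothed = list(series)
    for i in range(half, len(series) - half):
        smoothed[i] = median(series[i - half:i + half + 1])
    return smoothed

smoothed = running_median(values)
print(level, round(spread, 2))
print(smoothed)
```

A median-based smoother is used here because a single extreme value, such as the 30 above, does not drag the smoothed series the way it would with a running mean.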
1. Install a data analysis and visualization tool: R / Python / Tableau Public / Power BI.
2. Perform exploratory data analysis (EDA) on datasets such as an email data set. Export all your
emails as a dataset, import them into a pandas data frame, visualize them and get different
insights from the data.
3. Work with NumPy arrays and pandas data frames, and create basic plots using Matplotlib.
4. Explore various variable and row filters in R for cleaning data. Apply various plot features in R
on sample data sets and visualize them.
5. Perform Time Series Analysis and apply the various visualization techniques.
6. Perform data analysis and representation on a map using various map data sets with mouse
rollover effect, user interaction, etc.
7. Build cartographic visualizations for multiple datasets involving various countries of the world,
states and districts in India, etc.
8. Perform EDA on the Wine Quality Data Set.
9. Use a case study on a data set and apply the various EDA and visualization techniques and
present an analysis report.
COURSE OUTCOMES:
TOTAL: 75 PERIODS
TEXT BOOKS:
1. Suresh Kumar Mukhiya, Usman Ahmed, “Hands-On Exploratory Data Analysis with Python”,
Packt Publishing, 2020. (Unit 1)
2. Jake VanderPlas, “Python Data Science Handbook: Essential Tools for Working with Data”,
O’Reilly, 1st Edition, 2016. (Unit 2)
3. Catherine Marsh, Jane Elliott, “Exploring Data: An Introduction to Data Analysis for Social
Scientists”, Wiley Publications, 2nd Edition, 2008. (Units 3, 4, 5)
REFERENCES:
1. Eric Pimpler, “Data Visualization and Exploration with R”, Geospatial Training Services, 2017.
2. Claus O. Wilke, “Fundamentals of Data Visualization”, O’Reilly Publications, 2019.
3. Matthew O. Ward, Georges Grinstein, Daniel Keim, “Interactive Data Visualization:
Foundations, Techniques, and Applications”, 2nd Edition, CRC Press, 2015.