Exploratory Data Analytics    L T P C    3 0 0 3
COURSE OUTCOMES
Upon completion of this course, the student will be able to
CO1: Understand the key concepts of exploratory data analysis. (K2)
CO2: Implement data visualization using Matplotlib. (K3)
CO3: Perform univariate data exploration and analysis. (K3)
CO4: Apply bivariate data exploration and analysis. (K3)
CO5: Apply data exploration and visualization techniques for multivariate and
time series data. (K3)
UNIT II    EDA USING PYTHON    9
Data Manipulation using Pandas – Pandas Objects – Data Indexing and
Selection – Operating on Data – Handling Missing Data – Hierarchical
Indexing – Combining Datasets – Concat, Append, Merge and Join –
Aggregation and Grouping – Pivot Tables – Vectorized String Operations.
UNIT III    UNIVARIATE ANALYSIS    9
Introduction to Single Variable: Distribution Variables – Numerical Summaries
of Level and Spread – Scaling and Standardizing – Inequality.
UNIT IV    BIVARIATE ANALYSIS    9
Relationships between Two Variables – Percentage Tables – Analyzing
Contingency Tables – Handling Several Batches – Scatterplots and Resistant
Lines.
UNIT V    MULTIVARIATE AND TIME SERIES ANALYSIS    9
Third Variable – Causal Explanations – Three-Variable Contingency Tables and
Beyond – Fundamentals of TSA – Characteristics of Time Series Data –
Data Cleaning – Time-Based Indexing – Visualizing – Grouping –
Resampling.
L: 45; TOTAL: 45 PERIODS
TEXT BOOKS
1. Suresh Kumar Mukhiya, Usman Ahmed, "Hands-On Exploratory Data Analysis
with Python", Packt Publishing, 2020.
2. Jake VanderPlas, "Python Data Science Handbook: Essential Tools for
Working with Data", 2nd Edition, O'Reilly, 2022.
3. Catherine Marsh, Jane Elliott, "Exploring Data: An Introduction to Data
Analysis for Social Scientists", Wiley Publications, 2nd Edition, 2008.
REFERENCES
1. Ernesto Pellegrino, Manuel Andre Bottiglieri, et al., "Managing and
Visualizing Your BIM Data: Describe the fundamentals of computer science
for data visualization using Autodesk Dynamo, Revit, and Microsoft Power
BI", 2021.
2. Eric Pimpler, "Data Visualization and Exploration with R", Geo Spatial
Training Service, 2017.
3. Claus O. Wilke, "Fundamentals of Data Visualization", O'Reilly
Publications, 2019.
Introduction and Tools: Exploratory Data Analysis - Definition and Importance,
Data Types and Their Significance, Role of EDA, EDA vs. Confirmatory Data
Analysis. Tools and Technologies - Python, R, Jupyter Notebooks, etc.

Data Collection and Preparation: Data Sources and Collection - Databases,
APIs, Web Scraping. Importing Data in Python (pandas) and R. Data Cleaning -
Identifying and Handling Missing Data, Removing Duplicates, Correcting Data
Errors, Data Types and Conversions. Data Transformation - Scaling and
Normalization, Encoding Categorical Variables, Feature Engineering.

Data Visualization: Importance and Types of Data Visualizations. Tools and
Libraries - Python (Matplotlib), R (ggplot2), etc. Visualizing Univariate
Data - Histograms, Box Plots, Bar Charts. Visualizing Multivariate Data -
Scatter Plots, Pair Plots, Heatmaps. Advanced Visualizations - Interactive
Visualizations, Geospatial Data Visualization.

Data Summarization: Descriptive Statistics - Measures of Central Tendency
(Mean, Median, Mode), Measures of Dispersion (Range, Variance, Standard
Deviation), Skewness and Kurtosis. Data Summarization - Grouping and
Aggregation, Pivot Tables.

Exploring Relationships: Correlation Analysis, Covariance.

Time Series Analysis and Dimensionality Reduction: Time Series Analysis -
Time Series Decomposition, Trend and Seasonality. Anomaly Detection -
Outliers, Anomaly Detection. Dimensionality Reduction - PCA, ICA, SVD, etc.

Hypothesis Testing and Statistical Analysis: Null and Alternative Hypotheses,
Type I and Type II Errors. Common Statistical Tests - t-tests, Chi-square
Tests, ANOVA.