Data Science with Data Visualization
1) Advanced Excel
a) Introduction to MS Excel, Cell Ref, Basic Functions and Usage
b) Sorting, Filtering, Advanced Filtering, Subtotal
c) Pivot Tables and Slicers
d) Goal Seek and Solver
e) Different Charts and Graphs – Which One to Use and When
f) VLOOKUP, HLOOKUP, MATCH, INDEX
g) Conditional Formatting
h) Worksheet & Workbook Reference, Error Handling
i) Logical Operators & Functions – IF and Nested IF
j) Data Validation
k) Text Functions
l) Form Controls
m) Dashboard
n) 6 Case Studies from App-Based Cab Aggregators, Insurance, Sports, Sales, Marketing,
and Web Analytics Industries
i) Working with Tables
ii) Creating Data Tables
f) Table Calculations
g) Calculated Field
h) Logical Calculations
i) If/Then
ii) IIF
iii) Case/When
i) Date Calculations
i) Date
ii) DateAdd
iii) DateDiff
iv) DateParse
v) Today()
vi) Now()
j) Parameters
i) Pre-defined Lists for Faster Filtering
ii) Top N Filter
iii) Reference Line Parameter
iv) Swapping Dimensions or Measures in a View
k) Using Actions to Create Interactive Dashboards
i) Filter Actions
ii) Highlight Actions
l) Advanced Charts
i) Heat Maps, Treemaps, Waterfall Charts, etc.
ii) Working with Latitude and Longitude
iii) Symbol and filled maps
m) Working with data
i) Joining multiple tables
ii) Blending of Data
n) Sets
i) In/Out Sets
ii) Combined Sets
o) Drilling Up/Down using Hierarchies
p) Grouping
q) Bins/Histograms
r) Analytics
i) Reference Lines
ii) Clustering
iii) Trend Line
s) Building dashboards
i) Layout and Formatting
ii) Interactivity with Actions
iii) Best Practices
t) Storytelling with Data
i) Working with story
ii) Highlighting important insights
u) Data Interpreter
i) Data Preparation
ii) Data Cleaning
iii) Pivoting
v) 4 Case Studies on Retail, Airline, Bank datasets
ii) Hypothesis - One-way ANOVA
iii) Two-way ANOVA
iv) Assignment on Hypothesis Testing
j) Regression Models
i) Regression, Linear Regression, Multiple Linear Regression
ii) Coefficient of Determination, R-square, Adjusted R-square
iii) Example using Excel
iv) Assignment on Correlation & Simple Regression
ii) Residual Analysis
iii) Multiple Regression
iv) Model Building
v) In-class Case Study: Predict Academic Performance of School Students
vi) Self Case Study: Predict Customer Value for an Insurance Firm
g) Logistic Regression in R
i) Model theory, Model Fit Statistics
ii) Reject Inference, Binning, Classing
iii) Dummy Creation, Dummy Correlation
iv) Model Development (Multicollinearity, WOE, IV, HLT, Gini, KS, Rank Ordering,
Clustering Check)
v) Model Validation (Rerun, Scoring)
vi) Final Dashboard
vii) In-class Case Study: Predict Customer Churn for a Telecom firm
viii) Self Case Study: Predict Propensity to Buy Financial Product among Existing
Bank Customers
h) Time Series theory discussion overview
i) ARIMA, Stationarity & Non-stationarity check concepts
ii) Forecasting
iii) Components of Time Series
iv) Measurement of Secular Trend
v) Time Series codes overview
vi) Exponential smoothing theory discussion
vii) Case Study - Random walk in Time Series
viii) Case Study - Forecasting sales for retail
i) Clustering Concepts and Case Study
i) K-means Clustering
ii) Types of Clustering
iii) Centroids
iv) Case Study - Airline customer segmentation
j) Feature Engineering & Dimension Reduction and Case Study
i) Factor Analysis
ii) PCA
iii) Methods of Variable Reduction
iv) Dimensionality Reduction
k) Decision Trees
i) Pre-reading on basics of segmentation and decision trees
ii) Intro to Objective Segmentation
iii) CHAID and CART concept, example, and exercise
iv) Implement Decision Trees
v) Advantages and disadvantages of Decision Trees over Prediction
vi) Multiple Decision Trees
vii) Case Study – Predict Earnings of an Individual
iii) Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo,
IPython, etc.)
iv) Understand Jupyter notebook & Customize Settings
v) Concept of Packages/Libraries - Important packages (NumPy, SciPy, scikit-learn,
pandas, Matplotlib, etc.)
vi) Installing & loading Packages & Namespaces
vii) Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries)
viii) List and Dictionary Comprehensions
ix) Variable & Value Labels – Date & Time Values
x) Basic Operations - Mathematical - string - date
xi) Reading and writing data
xii) Simple plotting
xiii) Control flow & conditional statements
xiv) Debugging & Code profiling
xv) How to create classes and modules and how to call them
b) Scientific Distribution
i) NumPy, SciPy, pandas, scikit-learn, statsmodels, NLTK, etc.
c) Accessing / Importing and Exporting Data using Python modules
i) Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
ii) Database Input (Connecting to database)
iii) Viewing Data objects - subsetting, methods
iv) Exporting Data to various formats
v) Important Python modules: pandas, Beautiful Soup
d) Data Manipulation
i) Cleansing Data with Python
ii) Data Manipulation steps (Sorting, filtering, duplicates, merging, appending,
subsetting, derived variables, sampling, Data type conversions, renaming,
formatting, etc.)
iii) Data manipulation tools (Operators, Functions, Packages, control structures,
Loops, arrays, etc.)
iv) Python Built-in Functions (Text, numeric, date, utility functions)
v) Python User Defined Functions
vi) Stripping out extraneous information
vii) Normalizing data
viii) Formatting data
ix) Important Python modules for data manipulation (pandas, NumPy, re, math,
string, datetime, etc.)
e) Visualization using Python
i) Introduction to exploratory data analysis
ii) Descriptive statistics, Frequency Tables and summarization
iii) Univariate Analysis (Distribution of data & Graphical Analysis)
iv) Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical
Analysis)
v) Creating Graphs (bar/pie/line chart/histogram/boxplot/scatter/density, etc.)
vi) Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib,
seaborn, pandas, scipy.stats, etc.)
f) Introduction to Predictive Modeling
i) Concept of a model in analytics and how it is used
ii) Common terminology used in analytics & modeling process
iii) Popular modeling algorithms
iv) Types of Business problems - Mapping of Techniques
v) Different Phases of Predictive Modeling
g) Modeling on Linear Regression
i) Introduction - Applications
ii) Assumptions of Linear Regression
iii) Building Linear Regression Model
iv) Understanding standard metrics (Variable significance, R-square/Adjusted
R-square, Global hypothesis, etc.)
v) Assess the overall effectiveness of the model
vi) Validation of Models (Rerunning vs. Scoring)
vii) Standard Business Outputs (Decile Analysis, Error distribution (histogram),
Model equation, drivers etc.)
viii) Interpretation of Results - Business Validation - Implementation on new data
h) Modeling on Logistic Regression
i) Introduction - Applications
ii) Linear Regression vs. Logistic Regression vs. Generalized Linear Models
iii) Building Logistic Regression Model (Binary Logistic Model)
iv) Understanding standard model metrics (Concordance, Variable significance,
Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
v) Validation of Logistic Regression Models (Rerunning vs. Scoring)
vi) Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs,
Lift charts, Model equation, Drivers or variable importance, etc)
vii) Interpretation of Results - Business Validation - Implementation on new data
i) Time Series Forecasting
i) Introduction - Applications
ii) Time Series Components (Trend, Seasonality, Cyclicity and Level) and
Decomposition
iii) Classification of Techniques (Pattern-based vs. Patternless)
iv) Basic Techniques - Averages, Smoothing, etc.
v) Advanced Techniques - AR Models, ARIMA, etc.
vi) Understanding Forecasting Accuracy - MAPE, MAD, MSE, etc.
(2) Automate a Task using Macro
ii) Programming Concepts
(1) Data Types
(2) Procedure of VBA
(3) Conditional Logic
(4) Conditional Statements - If/Then/Else and Select/Case
(5) Definite and Conditional Loops – FOR, DO Loop, GOTO
iii) Analysis Using VBA
(1) VBA Objects – Workbooks, Worksheets, Range, Pivot
(2) Interacting with Users
(3) Creating User Defined Functions
(4) Automation Using Events
(5) Creating User Interface using Form Controls
iv) Creating Dashboards
v) 4 Industry Projects