Ivy - Data Science and Data Visualization Certification Course
3) Business Statistics
a) Types of data, Graphical representation
i) Introduction of data
ii) Types of data
iii) Data Presentation
iv) Charts & Diagrams
v) Assignment on Type of Data and Type of Charts
b) Correlation, Data Modeling & Index Numbers
i) Correlation
ii) Data Modeling
iii) Index Number
c) Measures of Central Tendency & Dispersion
i) Measures of Central Tendency
ii) Measures of Dispersion
iii) Measures of Dispersion (Variance)
iv) Normal Distribution
v) Assignment on Central Tendency and Dispersion
d) Forecasting & Time Series Analysis
i) Forecasting
ii) Components of Time Series
iii) Measurement of Secular Trend
iv) Forecasting Software
e) Probability, Bayesian Theory
i) Probability
ii) Computing joint & marginal probabilities
iii) Bayes’ Theorem
f) Probability Distribution and Mathematical Expectation
i) Random Variables
ii) Probability Distribution (Discrete)
iii) Probability Distribution (Continuous)
iv) Finding Normal Probabilities
g) Sampling and Sampling Distribution
i) Sample, Types of sample
ii) Sampling Distribution
iii) Example of Sampling
iv) Assignment on Probability Distribution, Binomial & Poisson, Normal
Distribution
h) Theory of Estimation and Testing of Hypothesis
i) Theory of Estimation, Estimation Process, Statistical Inference
ii) Test of Hypothesis, Decision Errors, One Level of Significance
iii) Two-tail test, Testing of hypothesis
iv) Degrees of freedom
i) Analysis of Variance
i) ANOVA
ii) Hypothesis - One-way ANOVA
iii) Two-way ANOVA
iv) Assignment on Hypothesis Testing
j) Regression Models
i) Regression, Linear Regression, Multiple Linear Regression
ii) Coefficient of Determination, R-square, Adjusted R-square
iii) Example using Excel
iv) Assignment on Correlation & Simple Regression
ii) Variables in R
iii) Creating columns with conditions AND, OR
iv) Different numeric functions in R like exp, log, sqrt, sum, prod etc. Sorting in R.
Ranking and concatenating strings in R.
v) Exercises on Import / Export of Data
vi) Exercises on Data Handling in R
c) Overview of Analytics and Statistics
i) Types of data variables
ii) What is Population
iii) Mean, Median, or Mode – Their applications
iv) Basic Statistics Exercises
d) String and character functions in R
i) Substring, string split
ii) Change name of column and checking mode of variable
iii) Dividing variable into different buckets
iv) Creating user defined functions in R
v) Loops in R
vi) SQL in R using sqldf
vii) Scatter plot, Box plot, Histogram, Pie chart in R; T-Test in R
viii) Exercise: Data Summarization using Financial Retail Datasets
e) Overview of Analytics and Statistics
i) Standard deviation interpretation
ii) Population vs Sample
iii) Univariate & Bivariate Analysis
iv) Normal distribution
v) What is Confidence Interval
vi) Hypothesis Testing
vii) In-class Case Study: Academic Performance Case Study
viii) Self Case Study: Health Care Case Study
f) Linear regression in R
i) Regression
ii) Residual Analysis
iii) Multiple Regression
iv) Model Building
v) In-class Case Study: Predict Academic Performance of School Students
vi) Self Case Study: Predict Customer Value for an Insurance Firm
g) Logistic Regression in R
i) Model theory, Model Fit Statistics
ii) Reject Reference, Binning, Classing
iii) Dummy Creation, Dummy Correlation
iv) Model Development (Multicollinearity, WOE, IV, HLT, Gini, KS, Rank Ordering,
Clustering Check)
v) Model Validation (Rerun, Scoring)
vi) Final Dashboard
vii) In-class Case Study: Predict Customer Churn for a Telecom firm
viii) Self Case Study: Predict Propensity to Buy Financial Product among Existing
Bank
h) Time Series theory discussion overview
i) ARIMA, Stationarity & Non-stationarity check concepts
ii) Forecasting
iii) Components of Time Series
iv) Measurement of Secular Trend
v) Time Series codes overview
vi) Exponential smoothing theory discussion
vii) Case Study - Random walk in Time Series
viii) Case Study - Forecasting sales for retail
i) Clustering Concepts and Case Study
i) K-means Clustering
ii) Types of Clustering
iii) Centroids
iv) Case Study - Airline customer segmentation
j) Feature Engineering & Dimension Reduction and Case Study
i) Factor Analysis
ii) PCA
iii) Methods of Variable Reduction
iv) Dimensionality Reduction
k) Decision Trees
i) Pre-reading on basics of segmentation and decision trees
ii) Intro to Objective Segmentation
iii) CHAID and CART concept, example, and exercise
iv) Implement Decision Trees
v) Advantages and disadvantages of Decision Trees over Prediction
vi) Multiple Decision Trees
vii) Case Study – Predict earning of an individual
c) Accessing / Importing and Exporting Data using Python modules
i) Importing Data from various sources (CSV, TXT, Excel, Access, etc.)
ii) Database Input (Connecting to database)
iii) Viewing Data objects - subsetting, methods
iv) Exporting Data to various formats
v) Important Python modules: Pandas, BeautifulSoup
d) Data Manipulation
i) Cleansing Data with Python
ii) Data Manipulation steps (Sorting, filtering, duplicates, merging, appending,
subsetting, derived variables, sampling, Data type conversions, renaming,
formatting etc)
iii) Data manipulation tools (Operators, Functions, Packages, control structures,
Loops, arrays etc)
iv) Python Built-in Functions (Text, numeric, date, utility functions)
v) Python User Defined Functions
vi) Stripping out extraneous information
vii) Normalizing data
viii) Formatting data
ix) Important Python modules for data manipulation (Pandas, NumPy, re, math,
string, datetime etc)
e) Visualization using Python
i) Introduction to exploratory data analysis
ii) Descriptive statistics, Frequency Tables and summarization
iii) Univariate Analysis (Distribution of data & Graphical Analysis)
iv) Bivariate Analysis (Cross Tabs, Distributions & Relationships, Graphical
Analysis)
v) Creating Graphs (Bar/pie/line chart/histogram/boxplot/scatter/density etc.)
vi) Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib,
seaborn, Pandas and scipy.stats etc)
f) Introduction to Predictive Modeling
i) Concept of a model in analytics and how it is used
ii) Common terminology used in analytics & modeling process
iii) Popular modeling algorithms
iv) Types of Business problems - Mapping of Techniques
v) Different Phases of Predictive Modeling
g) Modeling on Linear Regression
i) Introduction - Applications
ii) Assumptions of Linear Regression
iii) Building Linear Regression Model
iv) Understanding standard metrics (Variable significance, R-square/Adjusted
R-square, Global hypothesis, etc.)
v) Assess the overall effectiveness of the model
vi) Validation of Models (Re-running Vs. Scoring)
vii) Standard Business Outputs (Decile Analysis, Error distribution (histogram),
Model equation, drivers etc.)
viii) Interpretation of Results - Business Validation - Implementation on new data
h) Modeling on Logistic Regression
i) Introduction - Applications
ii) Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
iii) Building Logistic Regression Model (Binary Logistic Model)
iv) Understanding standard model metrics (Concordance, Variable significance,
Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve etc)
v) Validation of Logistic Regression Models (Re-running Vs. Scoring)
vi) Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs,
Lift charts, Model equation, Drivers or variable importance, etc)
vii) Interpretation of Results - Business Validation - Implementation on new data
i) Time Series Forecasting
i) Introduction - Applications
ii) Time Series Components (Trend, Seasonality, Cyclicity and Level) and
Decomposition
iii) Classification of Techniques (Pattern based - Pattern less)
iv) Basic Techniques - Averages, Smoothing, etc
v) Advanced Techniques - AR Models, ARIMA, etc
vi) Understanding Forecasting Accuracy - MAPE, MAD, MSE, etc
f) Table Calculations
g) Calculated Field
h) Logical Calculations
i) If/Then
ii) IIF
iii) Case/When
i) Date Calculations
i) Date
ii) DateAdd
iii) DateDiff
iv) DateParse
v) Today()
vi) Now()
j) Parameters
i) Pre-defined Lists for Faster Filtering
ii) Top N Filter
iii) Reference Line Parameter
iv) Swapping Dimensions or Measures in a View
k) Using Actions to Create Interactive Dashboards
i) Filter Actions
ii) Highlight Actions
l) Advanced Charts
i) Heat maps, Tree Maps, Waterfall Charts etc.
ii) Working with Latitude and Longitude
iii) Symbol and filled maps
m) Working with data
i) Joining multiple tables
ii) Blending of Data
n) Sets
i) In/Out Sets
ii) Combined Sets
o) Drilling Up/Down using Hierarchies
p) Grouping
q) Bins/Histograms
r) Analytics
i) Reference lines
ii) Clustering
iii) Trend Line
s) Building dashboards
i) Layout and Formatting
ii) Interactivity with Actions
iii) Best Practices
t) Story Telling with Data
i) Working with story
ii) Highlighting important insights
u) Data Interpreter
i) Data Preparation
ii) Data Cleaning
iii) Pivoting
v) 4 Case Studies on Retail, Airline, Bank datasets
2) Data Visualization using Power BI
a) Introduction to Power BI Interface
b) Connecting and Shaping Data
i) Types of data connector
ii) Power query
iii) Basic table transformation
iv) Working with Text data
v) Working with Numerical data
vi) Working with Date and Time
vii) Generating Index and Conditional Columns
viii) Grouping and Aggregating Records
ix) Pivoting and Unpivoting Data
x) Defining Hierarchy
c) Creating Table Relations & Data Models
i) What is ‘Data Model’?
ii) Principles of Database Normalization
iii) Creating Table Relationships
iv) Connecting Multiple Data Tables
v) Understanding Filters
d) Analysing Data with DAX Calculations
i) Introduction to DAX
ii) DAX Calculated Columns
iii) DAX Measures
iv) Context Filters
v) DAX Syntax and Operators
vi) DAX Functions
vii) DAX Date and Time functions
viii) DAX Conditional and Logical Functions
ix) DAX Common Text Functions
x) Joining Data with Related
xi) Basic Maths and Stats Functions
xii) Count Functions
xiii) Calculate
xiv) Calculate and Filter
xv) Iterator Functions
xvi) Time Intelligence Formula
e) Visualizing Data with Power BI Reports
i) Report View
ii) Inserting Charts and Images
iii) Filtering
iv) Conditional Formatting
v) Slicers
vi) Different Types of Charts
vii) Drillthrough Filters
viii) Data Visualization Best Practices
INCLUDES BONUS MATERIAL WORTH INR 10,000
1) Automation of Excel Using VBA (Live Online/Self Paced)
a) Making Macro do Automated Tasks for You
i) Record a Macro
ii) Automate a Task using Macro
b) Programming Concepts
i) Data Types
ii) Procedure of VBA
iii) Conditional Logic
iv) Conditional Statements - If/Then/Else and Select/Case
v) Definite and Conditional Loops – FOR, DO Loop, GOTO
c) Analysis Using VBA
i) VBA Objects – Workbooks, Worksheets, Range, Pivot
ii) Interacting with Users
iii) Creating User Defined Functions
iv) Automation Using Events
v) Creating User Interface using Form Controls
d) Creating Dashboards
e) 4 Industry Projects
(TxD);Matrices; Similarity measures, Low-level processes (Sentence Splitting;
Tokenization; Part-of-Speech Tagging; Stemming; Chunking)
ii) Finding patterns in text: text mining, text as a graph
iii) Natural Language processing (NLP)
iv) Text Analytics – Sentiment Analysis using Python
v) Text Analytics – Word cloud analysis using Python
vi) Text Analytics - Segmentation using K-Means/Hierarchical Clustering
vii) Text Analytics - Classification (Spam/Not spam)
viii) Applications of Social Media Analytics
ix) Metrics (Measures, Actions) in social media analytics
x) Examples & Actionable Insights using Social Media Analytics
xi) Important Python modules for Machine Learning (scikit-learn, statsmodels,
SciPy, NLTK, etc.)
xii) Fine-tuning the models using hyperparameters, grid search, pipelines, etc.
xiii) Case Study on Text Analytics