Data Science Course in Hyderabad


RESEARCH LABS

DATA SCIENCE

Awarded by THE TIMES GROUP (Optimal Media Solutions Private Limited)
EDUCATION ICON OF THE YEAR 2019-2020
BEST DATA SCIENCE INSTITUTE
BEST DIGITAL MARKETING COURSE

#206 A, 2nd floor, Fortune Signature, Above Pista House, Beside JNTU Metro, Opp: More Mega Store,
Kukatpally, Hyderabad, Telangana - 500085

DATA SCIENCE CURRICULUM

Machine Learning, Deep Learning, Statistics, Tableau, NLP, Python, SQL


Course Objective

To understand the vital nature of data for organizations.

To learn the conceptual framework of machine learning.

To explore and analyze data using supervised and unsupervised learning techniques.

To develop and deploy knowledge learning models using Python.

To work on unstructured data such as text, processing it with NLTK and building models.

To understand neural networks, build deep networks using TensorFlow and Keras, and work with image processing using Keras.

Key features in the Training

Duration: 4 Months

Class Duration: 2 - Hrs based on topic. Week-Days

Projects: Python: Data Analysis Project; Machine Learning: Regression, Classification, Time Series; NLP: Sentiment Analysis / Chatbot; Deep Learning: Face Recognition.

Use Cases Covered: Python and Statistics: 4, Machine Learning: 10, NLP: 2, Deep Learning: 3.

One Big Hackathon Challenge on Machine Learning

In addition: topic-wise assignments and quizzes for each module (Python, Statistics, Machine Learning, NLP and Deep Learning).

You will work on nearly 20 use cases during the course.

Best training materials are provided, with lab exercises, data sets, code, quizzes and case studies on real data.

Recorded video and live running notes will be provided for every online session.

Real time Training with live Scenarios and Applications.

Job Assistance after completion of the course.

Online help on Doubt Clearance, Career Guidance, Preparation and Interview Preparation

IBM Credentials and Certification after completion of the Course.



www.innomatics.in Contact us : +91 9951666670



INTRODUCTION & WALK THROUGH THE COURSE

INTRODUCTION

Introduction to Data Science

Life cycle of data science

Skills required for data science

Applications of data science in different industries

MODULE 1: PYTHON CORE & ADVANCED


INTRODUCTION
What is Python?
Why does Data Science require Python?
Installation of Anaconda
Understanding Jupyter Notebook
Basic commands in Jupyter Notebook
Understanding Python Syntax

Data Types and Data Structures

Variables and Strings


Lists, Sets, Tuples and Dictionaries

Control Flow and Conditional Statements


Conditional Operators, Arithmetic Operators and Logical Operators
If, Elif and Else Statements
While Loops
For Loops
Nested Loops and List and Dictionary Comprehensions

Functions
What is a function and types of functions
Code optimization and function arguments
Scope
Lambda Functions
Map, Filter and Reduce
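
As a brief illustration of the lambda, map, filter and reduce topics listed above, here is a minimal sketch. The sample list and variable names are made up for this example and are not course material.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]  # hypothetical sample data

# map: apply a function to every element
squares = list(map(lambda n: n * n, numbers))

# filter: keep only the elements for which the function returns True
even_squares = list(filter(lambda n: n % 2 == 0, squares))

# reduce: fold the sequence into a single value (here, a running sum)
total = reduce(lambda acc, n: acc + n, even_squares, 0)

print(squares, even_squares, total)
```

In everyday Python, a list comprehension often replaces map and filter, which is why comprehensions appear earlier in this module.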

File Handling
Create, Read, Write files and Operations in File Handling
Errors and Exception Handling


MODULE 2: DATA ANALYSIS IN PYTHON

NumPy - Numerical Python


Introduction to Array
Creation and Printing of ndarray
Basic Operations in Numpy
Indexing
Mathematical Functions of Numpy

OpenCV (Computer vision)


Introduction to Computer Vision
OpenCV Library in Python
Getting Started with Image / Videos
Operations on Images
Reshaping and Resizing images
Normalizing Images


Data Manipulation with Pandas


Series and DataFrames
Data Importing and Exporting through Excel, CSV Files
Data Understanding Operations
Indexing, Slicing and further filtering with Conditional Slicing
Groupby, Pivot table and Cross Tab
Concatenating, Merging and Joining
Descriptive Statistics
Removing Duplicates
String Manipulation
Missing Data Handling

DATA VISUALIZATION


Data Visualization using Matplotlib and Pandas


Introduction to Matplotlib
Basic Plotting
Properties of plotting
About Subplots
Line plots
Pie Chart and Bar Graph
Histograms
Box and Violin Plots
Scatterplot

Case Study on Exploratory Data Analysis (EDA) and Visualizations

What is EDA?
Univariate Analysis
Bivariate Analysis
More on Seaborn based plotting, including Pair Plots, Catplot, Heat Maps and Count Plot, along with Matplotlib plots.


UNSTRUCTURED DATA PROCESSING

Regular Expressions
Structured Data and Unstructured Data
Literals and Meta Characters
How to use Regular Expressions with Pandas?
Inbuilt Methods
Pattern Matching
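
To make the literals, meta characters and pattern matching topics above concrete, here is a small sketch using Python's built-in `re` module. The sample text and the deliberately simple e-mail pattern are assumptions chosen for illustration, not a production-grade validator.

```python
import re

# Hypothetical unstructured text
text = "Contact: alice@example.com, bob@test.org; phone 99516-66670"

# Literal characters ("@") match themselves; meta characters such as
# \w (word character), \d (digit), + (one or more) and [] (character set)
# describe classes and repetitions.
email_pattern = re.compile(r"[\w.]+@[\w.]+\.\w+")

emails = email_pattern.findall(text)   # every non-overlapping match
digit_runs = re.findall(r"\d+", text)  # runs of consecutive digits

print(emails)
print(digit_runs)
```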

CAPSTONE PROJECT: DATA MINING and EXPLORATORY DATA ANALYSIS


Data Mining (WEB - SCRAPING)
This project starts from scratch: it involves collecting raw data from different sources
and converting the unstructured data into a structured format so that Machine Learning
and NLP models can be applied.
This project covers the four main steps of the Data Science life cycle:
Data Collection
Data Mining
Data Preprocessing
Data Visualization
Ex: Text, CSV, TSV, Excel Files, Matrices, Images


MODULE 3: ADVANCED STATISTICS

Data Types and Data Structures

Statistics in Data science:


What is Statistics?
How is Statistics used in Data Science?
Population and Sample
Parameter and Statistic
Variable and its types


Data Gathering Techniques

Data types
Data Collection Techniques
Sampling Techniques:
Convenience Sampling, Simple Random Sampling
Stratified Sampling, Systematic Sampling and Cluster Sampling

Descriptive Statistics

What are Univariate and Bivariate Analysis?

Measures of Central Tendency
Measures of Dispersion
Skewness and Kurtosis
Box Plots and Outlier Detection
Covariance and Correlation

Probability Distribution
Probability and Limitations
Discrete Probability Distributions
Bernoulli, Binomial Distribution, Poisson Distribution
Continuous Probability Distributions
Normal Distribution, Standard Normal Distribution

Inferential Statistics
Sampling variability and Central Limit Theorem
Confidence Intervals
Hypothesis Testing
Z-test, t-test
Chi-Square Test
F-Test and ANOVA


MODULE 4: SQL

SQL for Data Science


Introduction to Databases
Basics of SQL
DML, DDL, DCL and Data Types
Common SQL commands using SELECT, FROM and WHERE
Logical Operators in SQL

SQL Joins
INNER and OUTER joins to combine data from multiple tables
RIGHT, LEFT joins to combine data from multiple tables
Filtering and Sorting
Advanced filtering using IN, OR and NOT
Sorting with GROUP BY and ORDER BY

SQL Aggregations
Common Aggregations including COUNT, SUM, MIN and MAX
CASE and DATE functions, as well as working with NULL values

Subqueries and Temp Tables


Subqueries to run multiple queries together
Temp tables to access a table with more than one query

SQL Data Cleaning


Perform Data Cleaning using SQL

MODULE 5: MACHINE LEARNING SUPERVISED & UNSUPERVISED LEARNING

INTRODUCTION
What Is Machine Learning?
Supervised Versus Unsupervised Learning
Regression Versus Classification Problems
Assessing Model Accuracy


REGRESSION TECHNIQUES

Linear Regression
Simple Linear Regression:
Estimating the Coefficients
Assessing the Coefficient Estimates
R Squared and Adjusted R Squared
MSE and RMSE

Multiple Linear Regression

Estimating the Regression Coefficients

OLS Assumptions

Multicollinearity

Feature Selection


Evaluating the Metrics of Regression Techniques

Homoscedasticity and Heteroscedasticity of error terms

Residual Analysis

Q-Q Plot

Cook's distance and Shapiro-Wilk Test

Identifying the line of best fit

Other Considerations in the Regression Model

Qualitative Predictors

Interaction Terms

Non-linear Transformations of the Predictors

Polynomial Regression
Why Polynomial Regression?
Creating Polynomial Linear Regression
Evaluating the Metrics

Time Series (Forecasting)

What is Time Series Data?

Stationarity in Time Series Data and Augmented Dickey-Fuller Test
The AR Process
ACF & PACF
Decomposition of Time Series: Trend, Seasonality and Cyclic Components
Moving Average, EWMA
Exponential Smoothing
ARIMA

Case Study: A Time Series Study using Python


Regularization Techniques

Lasso Regularization
Ridge Regularization
ElasticNet Regularization

Case Study on Linear, Multiple Linear and Polynomial Regression using Python.

CAPSTONE PROJECT: A project on a use case that will challenge your Data Understanding,
EDA, Data Processing and the Regression Techniques above.

CLASSIFICATION TECHNIQUES


Logistic regression
An Overview of Classification
Difference Between Regression and classification Models.
Why Not Linear Regression?
Logistic Regression:
The Logistic Model
Estimating the Regression Coefficients and Making Predictions
Logit and Sigmoid functions
Setting the threshold and understanding decision boundary
Logistic Regression for >2 Response Classes
Evaluation Metrics for Classification Models:
Confusion Matrix
Accuracy and Error rate
TPR and FPR
Precision and Recall, F1 Score
AUC – ROC
Kappa Score
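
The evaluation metrics listed above all derive from the four cells of the confusion matrix. The sketch below computes several of them in plain Python; the labels, predictions and the helper name `confusion_counts` are hypothetical and chosen only for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive and actual != positive:
            fp += 1
        elif predicted != positive and actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical ground truth and model predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / (tp + fp + fn + tn)
error_rate = 1 - accuracy
precision = tp / (tp + fp)
recall = tp / (tp + fn)   # also called the true positive rate (TPR)
fpr = fp / (fp + tn)      # false positive rate
f1 = 2 * precision * recall / (precision + recall)

print(tp, fp, fn, tn, accuracy, precision, recall, fpr, f1)
```

The sketch assumes at least one positive prediction and one positive label; a fuller version would guard against division by zero.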

Naive Bayes
Principle of Naive Bayes Classifier

Bayes Theorem

Terminology in Naive Bayes

Posterior probability

Prior probability of class

Likelihood

Types of Naive Bayes Classifier

Multinomial Naive Bayes

Bernoulli Naive Bayes and Gaussian Naive Bayes


TREE BASED MODELS

Decision Trees
Decision Trees (Rule Based Learning):
Basic Terminology in Decision Tree
Root Node and Terminal Node
Regression Trees and Classification Trees
Trees Versus Linear Models
Advantages and Disadvantages of Trees
Gini Index
Overfitting and Pruning
Stopping Criteria
Accuracy Estimation using Decision Trees

Case Study: A Case Study on Decision Tree using Python

Resampling Methods:
Cross-Validation
The Validation Set Approach
Leave-One-Out Cross-Validation
k-Fold Cross-Validation
Bias-Variance Trade-Off for k-Fold Cross-Validation
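
As an illustration of k-fold cross-validation, the following sketch splits sample indices into k folds so that each sample is used for testing exactly once. The function name `k_fold_indices` is hypothetical, and the sketch does not shuffle the data, which a real workflow would usually do first.

```python
def k_fold_indices(n_samples, k):
    """Yield-style helper: return a list of (train_indices, test_indices) per fold."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        folds.append((train_idx, test_idx))
        start += size
    return folds


folds = k_fold_indices(10, 3)
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Setting k equal to the number of samples gives Leave-One-Out Cross-Validation as a special case.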


Ensemble Methods in Tree Based Models

What is Ensemble Learning?


What are Bootstrap Aggregation (Bagging) Classifiers and how do they work?

Random Forest

What is it and how does it work?

Variable selection using Random Forest

Boosting: AdaBoost, Gradient Boosting

What is it and how does it work?


Hyperparameters and Pros and Cons

Case Study: Ensemble Methods - Random Forest Techniques using Python


DISTANCE BASED MODELS

K Nearest Neighbors
K-Nearest Neighbor Algorithm
Eager Vs Lazy learners
How does the KNN algorithm work?
How do you decide the number of neighbors in KNN?
Curse of Dimensionality
Pros and Cons of KNN
How to improve KNN performance

Case Study: A Case Study on k-NN using Python


Support Vector Machines

The Maximal Margin Classifier


HyperPlane
Support Vector Classifiers and Support Vector Machines
Hard and Soft Margin Classification
Classification with Non-linear Decision Boundaries
Kernel Trick
Polynomial and Radial
Tuning Hyperparameters for SVM
Gamma, Cost and Epsilon
SVMs with More than Two Classes

Case Study: A Case Study on SVM using Python

CAPSTONE PROJECT: A project on a use case that will challenge your Data Understanding, EDA,
Data Processing and the Classification Techniques above.


INTRODUCTION TO UNSUPERVISED LEARNING

Why Unsupervised Learning?

How is it Different from Supervised Learning?
The Challenges of Unsupervised Learning

Principal Components Analysis

Introduction to Dimensionality Reduction and its necessity


What Are Principal Components?
Demonstration of 2D PCA and 3D PCA
EigenValues, EigenVectors and Orthogonality
Transforming Eigen values into a new data set
Proportion of variance explained in PCA

Case Study: A Case Study on PCA using Python


K-Means Clustering

Centroids and Medoids


Deciding optimal value of 'k' using Elbow Method
Linkage Methods

Hierarchical Clustering

Divisive and Agglomerative Clustering


Dendrograms and their interpretation
Applications of Clustering
Practical Issues in Clustering

Case Study: A Case Study on clustering using Python

Association Rules

Market Basket Analysis

Apriori
Metrics: Support, Confidence and Lift
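
The three association-rule metrics can be computed directly from transaction counts. The sketch below evaluates the rule "bread -> milk" on a tiny, made-up basket data set; the transactions and the helper name `support` are assumptions for illustration only.

```python
# Hypothetical market-basket transactions
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]


def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


antecedent, consequent = {"bread"}, {"milk"}

# Support: how often bread and milk appear together
rule_support = support(antecedent | consequent)
# Confidence: how often milk appears in baskets that contain bread
confidence = rule_support / support(antecedent)
# Lift: confidence relative to how common milk is overall
lift = confidence / support(consequent)

print(rule_support, confidence, lift)
```

A lift below 1, as in this toy data, suggests the two items occur together slightly less often than they would if they were independent.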

Improving Supervised Learning algorithms with clustering

Case Study: A Case Study on association rules using Python

CAPSTONE PROJECT: A project on a use case that will challenge your Data Understanding,
EDA, Data Processing and Unsupervised algorithms.


MODULE 6: NATURAL LANGUAGE PROCESSING (NLP)

INTRODUCTION
What is Text Mining?
Libraries
NLTK
Structured and Unstructured Data
Extracting Unstructured text from files and websites


Text Preprocessing

Regular Expressions for Pattern Matching


Text Normalization
Text Tokenization
Sentence Tokenization
Word Tokenization
Text Segmentation
Stemming
Lemmatization

Natural Language Understanding (NLP Statistical)

Bag of Words
Word Vectorizer
TF-IDF
Automatic Tagging
N-grams Tagging
Transformation based Tagging
POS Tagging
Cosine Similarity
Named Entity Recognition
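
Two of the topics above, Bag of Words and Cosine Similarity, fit together naturally: each document becomes a vector of word counts, and the cosine of the angle between two vectors measures how similar the documents are. Here is a minimal standard-library sketch; the sentences and function names are hypothetical, and tokenization is a plain whitespace split rather than an NLTK tokenizer.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each word."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


doc1 = bag_of_words("the course covers machine learning")
doc2 = bag_of_words("the course covers deep learning")

similarity = cosine_similarity(doc1, doc2)
print(similarity)
```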

Text Classification
Case Studies :
Text Mining
Sentiment Analysis

CAPSTONE PROJECT: A project on a use case that will challenge your Natural Language
Processing based Sentiment Analysis skills.


MODULE 7: DEEP LEARNING

Introduction to Neural Networks

Introduction to Neural Network


Introduction to Neuron and Perceptron
Sigmoid Neuron
Types of Activation functions used in deep learning networks
Cost Functions
Gradient Descent
Stochastic Gradient Descent
The feedforward model of neural network
Disadvantages of feedforward model
Applying weights to the feedforward model
Backpropagation algorithm
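
To give a feel for the Cost Functions and Gradient Descent topics above, here is a minimal sketch that minimizes a one-parameter quadratic cost. The cost function, learning rate and step count are arbitrary choices for illustration; a real network has many weights and obtains its gradients through backpropagation.

```python
def cost(w):
    """A toy cost function with its minimum at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the cost with respect to w."""
    return 2.0 * (w - 3.0)


w = 0.0              # initial weight (arbitrary starting point)
learning_rate = 0.1  # step size (assumed value)

for step in range(100):
    # Move against the gradient to reduce the cost
    w = w - learning_rate * gradient(w)

print(w, cost(w))
```

Stochastic Gradient Descent follows the same update rule but estimates the gradient from one sample or a small batch at a time.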


TensorFlow 2.0

Introducing Google Colab


TensorFlow basic syntax
TensorFlow Graphs
TensorBoard

Artificial Neural Network with TensorFlow

Neural Network for Regression


Neural Network for Classification
Evaluating the ANN
Improving and tuning the ANN
Saving and Restoring Graphs

Convolutional Neural Networks

Convolution Operation
ReLU Layer
Pooling
Flattening
Full Connection
Softmax and Cross Entropy

Building Convolutional Neural Network in Python

Building Convolutional Network with TensorFlow


Training CNN for Image Classification

Case Studies :
Image Classification


Recurrent Neural Network

What is a sequence-based model?


Vanishing Gradient
Exploding Gradient
The Idea behind Recurrent Neural Networks
LSTM (Long Short-Term Memory)

CAPSTONE PROJECT
Face Recognition

The Face Recognition project gives details of a person and can recognize gender
and name. This project involves:

Collection of images
Preprocessing the data
Applying the Model (Machine Learning or Deep Learning)
Training and Testing using the model

Ex: Security Unlock, Gender Recognition, Identity Recognition


MODULE 8: TABLEAU

Tableau for Data Science

Install Tableau Desktop 10

Tableau to Analyze Data
Connect Tableau to a variety of datasets
Analyze, Blend, Join and Calculate Data
Tableau to Visualize Data
Visualize Data in the form of Various Charts, Plots and Maps
Data Hierarchies
Work with Data Blending in Tableau
Work with Parameters
Create Calculated Fields
Adding Filters and Quick Filters
Create Interactive Dashboards
Adding Actions to Dashboards
