Digital Vidya Python Data Analyst Course

This 6-course, 60+ hour program teaches Python for data science through live online classes, projects, and career support. Instructor-led sessions cover Python programming, NumPy, Pandas, Matplotlib, statistics, machine learning algorithms such as linear regression and decision trees, and Tableau for data visualization. The course is designed for working professionals to gain practical skills through hands-on labs, assignments, and 3 weeks of capstone project work, with lifetime access to updated materials and career mentoring.

Data Science Master Course

Python Specialisation
35,000+ Participants | 3,000+ Trainings | 55+ Countries | 9+ Years
Data Science Master Course - Our Offering

6 Courses | 60+ Live Class Hours | 3 Capstone Projects | 50+ Placement Partners | 10+ Industry Experts | 100+ Assignment Hours

www.digitalvidya.com
Course Highlights

This Course is for Anyone with Programming Knowledge

Salient Features

3 Hrs/Week Live Instructor-Led Online Sessions
3 Weeks of Project Work
Active Q/A Forum
Class Labs/Home Assignment (10 hours/Week Learning Time)
Placement Support
Govt. of India (Vskills Certified Course)
Individual Attention to Each Learner
Lifetime Access to Updated Content and Videos
Industry and Academia Faculty
Top Python Tools Covered
Internal Competitions with Prizes
Industry’s Top Python Advisors
Industry Relevant Curriculum
Career Mentoring
Hands-on Approach
Money Back Guarantee

Data Science using Python (18 Sessions)
Instructor-Led Online Course

Introduction to Data Science
  Getting started with Jupyter Notebook
  Introduction to the Open Data Science learning and competitive platforms

Python Programming
  Introduction
  Operators
  Data Types
  Loops: while & for
  Conditionals: if-else
  Functions: defining functions, anonymous functions

Scientific computing with Python: Numerical Python (NumPy)
  Array Creation
  Data Types
  Shape Manipulation
  Array Indexing
  Broadcasting
  Universal Functions
  Statistical Methods

Introduction to Pandas
  Data Analysis workflow in Python using Pandas Data Structures
  Indexing and selecting data
  Statistical Operations
  Applying Functions
  Groupby: split-apply-combine
  Handling missing data
  Merging multiple datasets

Data Visualization
  Simple & multi-line plots, Multiple figures
  Simple plot with X and Y axis
  Linestyles and color
  Multiple lines on same plot
  Controlling line properties
  Adding Labels, gridlines, annotations
  X and Y ticks and rotations
  Splines
  Legends
  Working with Multiple figures and axes
  Share X and Y axis
  Adding subplots

Matplotlib and Seaborn
  Line Graphs
  Bar plots
  Histograms
  Box plot
  Stacked plots
  Scatter plot
  Pie Chart

Statistics
  Normal Distribution
  Hypothesis testing
  Introduction to z-test and t-test
  Introduction to Chi-Square distribution
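
The "Groupby: split-apply-combine" topic listed under Introduction to Pandas names a general pattern: split the rows into groups by a key, apply a function to each group, and combine the per-group results. The sketch below illustrates that pattern using only the Python standard library, so it is an approximation of the idea rather than the course's own material or the pandas API. The sample records, the field names, and the helper `split_apply_combine` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample records (not taken from the course material).
sales = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": 60.0},
    {"region": "South", "amount": 110.0},
]

def split_apply_combine(rows, key, value, func):
    """Group rows by `key`, apply `func` to each group's values,
    and combine the per-group results into one dict."""
    groups = defaultdict(list)
    for row in rows:                        # split: bucket values by key
        groups[row[key]].append(row[value])
    # apply `func` to every bucket, then combine into a single mapping
    return {k: func(v) for k, v in groups.items()}

avg_by_region = split_apply_combine(sales, "region", "amount", mean)
print(avg_by_region)
```

In pandas the same steps are usually written as a `groupby` on the key column followed by an aggregation on the value column.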

Machine Learning
  Simple Linear Regression
  Hypothesis testing in Linear regression
  Interpreting slope and intercept coefficients
  Cost Function in Linear regression
  Residuals analysis
  Interpreting R-square
  Dummy variables encoding
  Multiple Linear regression
  Multicollinearity issue
  Interpreting Adjusted R-square
  Outlier detection and treatment
  Missing values treatment

Logistic Regression
  Introduction
  Sigmoid function
  Logistic regression

Model Evaluation
  Evaluation Metrics
  Scoring
  Confusion Matrix
  Gain
  Lift Chart
  Concordant – Discordant Ratio
  Kolmogorov Smirnov Chart
  AUC – ROC Curve

Text Mining

Clustering
  K Means Clustering
  Elbow method
  Hierarchical clustering

Decision Trees
Gini Index

Entropy concept

Classification mechanism

Issues – Overfitting

Bias-variance trade-off

Different types of decision trees

Bagging and Boosting concept

Random Forest

Introduction to boosting algorithms
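
The Gini Index and Entropy topics in the Decision Trees list are both impurity measures computed from the class proportions at a tree node. The sketch below shows the two standard formulas (Gini impurity as 1 minus the sum of squared proportions, and Shannon entropy in bits). It is an illustration under assumptions and not the course's code: the function names and the sample labels are hypothetical.

```python
from collections import Counter
from math import log2

def gini_index(labels):
    """Gini impurity: 1 - sum(p_k ** 2) over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k)) over class proportions."""
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())

# Hypothetical node containing two classes in equal proportion.
mixed_node = ["yes", "yes", "no", "no"]
print(gini_index(mixed_node))   # 0.5
print(entropy(mixed_node))      # 1.0
```

Both measures are zero for a node that contains a single class and grow as the classes become more evenly mixed, which is why a split that lowers them is preferred.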

Introduction to Tableau (3 Sessions)
Instructor-led Online Course

Introduction to BI
  Connecting to Data
  Getting Started with Data
  Managing Extracts
  Saving and Publishing Data Sources
  Data Prep with Text and Excel Files
  Join Types with Union
  Cross-database Joins
  Data Blending

Dashboards and Stories
  Getting Started with Dashboards and Stories
  Building a Dashboard
  Dashboard Objects
  Dashboard Formatting
  Dashboard Interactivity Using Actions
  Story Points

Visual Analytics
  Drill Down and Hierarchies
  Sorting
  Grouping
  Additional Ways to Group
  Creating Sets
  Working with Sets
  Parameters
  Formatting
  The Formatting Pane
  Tooltips
  Trend Lines
  Reference Lines
  Forecasting
  Clustering

Calculations
  Getting Started with Calculations

Python Programming Foundation
(10 Chapters) Self Study Course

Python – Starting your journey in Programming
  Install Jupyter Notebook
  Basic Data type and Variables
  Basic Math Operators
  Comparison Operators

String Manipulations
  String datatype
  String operations
  Date and Time

Compound DataTypes
  Set
  Tuples
  Lists
  List Comprehension
  Dictionary

Databases
  Connection with database
  Import data from CSV to database
  Create, read, update, delete (CRUD)
  Orderby
  Groupby
  Where
  String operations
  Joins

Functions
  Lambda Expressions
  Why Function
  Writing simple functions
  Advanced functions – map, reduce, filter and zip
  Input, Output parameters

Errors, File handling
  Syntax Errors
  Exceptions
  Input/Output
  File Handling

Object-Oriented Programming
  Introduction to Object Oriented Programming
  Creating a Class

Generators & Iterators

Regular Expressions
  Introduction to regular expressions
  How to write a regular expression
  Various operations on strings using re module
  re.search vs re.findall

Conditionals and Control Flow
  If else
  For Loop
  Range
  Break and continue
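
The Regular Expressions chapter lists "re.search vs re.findall". As a small illustration of that contrast, the sketch below runs both functions from Python's standard `re` module on the same text: `re.search` returns a match object for the first match (or `None`), while `re.findall` returns a list of all non-overlapping matches. The sample text and variable names are hypothetical and not taken from the course.

```python
import re

# Hypothetical sample text.
text = "Order 12 shipped, order 345 pending, order 6 cancelled"
pattern = r"\d+"   # one or more digits

# re.search stops at the first match and returns a match object (or None).
first = re.search(pattern, text)

# re.findall scans the whole string and returns every match as a string.
all_numbers = re.findall(pattern, text)

print(first.group(), first.span())   # 12 (6, 8)
print(all_numbers)                   # ['12', '345', '6']
```

A match object carries position information (`span`, `start`, `end`), which `findall` does not; `re.finditer` is the usual choice when both all matches and their positions are needed.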

Statistics Foundation
(17 Chapters) Self Study Course

Data and Statistics
  Elements, Variables, and Observations
  Scales of Measurement
  Categorical and Quantitative Data
  Cross-Sectional and Time Series Data
  Descriptive Statistics
  Statistical Inference

Descriptive Statistics: Tabular and Graphical
  Summarizing Categorical Data
  Summarizing Quantitative Data
  Crosstabulations and Scatter Diagrams

Introduction to Probability

Descriptive Statistics: Numerical Measures
  Measures of Location
  Measures of Variability
  Measures of Distribution Shape, Relative Location, and Detecting Outliers
  Box Plot
  Measures of Association Between Two Variables

Discrete Probability Distributions
  Random Variables
  Discrete Probability Distributions
  Expected Value and Variance
  Binomial Probability Distribution
  Poisson Probability Distribution

Continuous Probability Distributions
  Uniform Probability Distribution
  Normal Curve
  Standard Normal Probability Distribution
  Computing Probabilities for Any Normal Probability Distribution

Sampling and Sampling Distributions
  Uniform Probability Distribution
  Normal Curve
  Standard Normal Probability Distribution
  Computing Probabilities for Any Normal Probability Distribution
  Simple random sample and its importance
  Difference between descriptive and inferential statistics
  Sampling distribution
  Mean and standard deviation
  Central Limit Theorem and its importance
  Mean and standard deviation for the sampling distribution of the sample proportion
  Sampling distributions of sample variances

Anova
  An Introduction to Analysis of Variance
  Analysis of Variance: Testing for the Equality of k Population Means
  Multiple Comparison Procedures
  An Introduction to Experimental Design
  Completely Randomized Designs
  Randomized Block Design

Interval Estimation
  Point estimate and confidence interval estimate
  Construct and interpret confidence interval estimate
  Form and interpret confidence interval estimate
  Confidence Intervals for the Population Mean, μ
  Confidence Intervals for the Population Proportion, p (large samples)

Confidence Interval
  Point estimate and confidence interval estimate
  Construct and interpret confidence interval estimate
  Form and interpret confidence interval estimate
  Confidence Intervals for the Population Mean, μ
  Confidence Intervals for the Population Proportion, p (large samples)

Hypothesis Tests
  Developing Null and Alternative Hypotheses
  Type I and Type II Errors
  Population Mean: σ Known
  Population Mean: σ Unknown

Inference About Means and Proportions with Two Populations
  Inferences About the Difference Between Two Population Means
  Inferences About a Population Variance
  Inferences About Two Population Variances

Multiple Regression
  Multiple Regression Model
  Least Squares Method
  Multiple Coefficient of Determination
  Model Assumptions
  Testing for Significance
  Categorical Independent Variables
  Residual Analysis

Simple Linear Regression
  Simple Linear Regression Model
  Regression Model and Regression Equation
  Estimated Regression Equation
  Least Squares Method
  Coefficient of Determination
  Correlation Coefficient
  Model Assumptions
  Testing for Significance
  Using the Estimated Regression Equation for Estimation and Prediction
  Residual Analysis: Validating Model Assumptions
  Residual Analysis: Outliers and Influential Observations

Model building in Regression
  Regression model-building methodology
  Dummy variables for categorical variables with more than two categories
  Dummy variables usage in experimental design models
  Lagged values of the dependent variable as regressors
  Specification bias and multicollinearity
  Heteroscedasticity and autocorrelation

Nonparametric Methods
  Sign Test
  Wilcoxon Signed-Rank Test
  Mann-Whitney-Wilcoxon Test
  Kruskal-Wallis Test
  Rank Correlation

Tests of Goodness of Fit and Independence
  Goodness of Fit Test: A Multinomial Population
  Test of Independence

SQL Foundation
(6 Chapters) Self Study Course

Database Basics: Concepts and need of a database
  What is a database?
  What is SQL?
  Database Learn Data Modeling
  What is Normalization? 1NF, 2NF, 3NF & BCNF

SQL Fundamental: Selecting and Filtering Data
  MySQL Installation
  MySQL Create Database & MySQL Data Types
  MySQL SELECT Statement
  MySQL WHERE Clause with AND, OR, IN, NOT IN

SQL Fundamental: Updating Data
  MySQL query INSERT INTO Table
  MySQL UPDATE & DELETE Query
  Sorting in MySQL ORDER BY, DESC and ASC

SQL: Data Aggregation and Functions
  MySQL GROUP BY and HAVING Clause
  MySQL Wildcards: LIKE, NOT LIKE, Escape, ( % ), ( _ )
  MySQL Regular Expressions (REGEXP)
  MySQL Functions
  MySQL Aggregate Functions: SUM, AVG, MAX, MIN, COUNT, DISTINCT
  MySQL IS NULL & IS NOT NULL
  MySQL AUTO_INCREMENT
  MySQL – ALTER, DROP, RENAME, MODIFY
  MySQL LIMIT & OFFSET

SQL: Complex Query Building
  MySQL SubQuery
  MySQL INNER, OUTER, LEFT, RIGHT, CROSS
  MySQL UNION – Complete

SQL: Query Optimization
  Views in MySQL: Create, Join & Drop
  MySQL INDEXES – Create, Drop & Add Index

Data Science using R
(15 Chapters) Self Study Course

Introduction to Data Science
  Briefing about analytics domain
  Business solving day to day problems using data
  Technology platforms

Introduction to R programming
  The basics of coding on R studio platform
  R nuts and bolts
  Basics of R programming
  Installing predefined packages
  Inputs and R objects (vector, matrix, dataframes and factors)
  R datatypes
  Using dplyr package
  Text manipulations using Stringr

RStudio
  Reading data (csv file) in R

Data manipulations and looping in R
  Data manipulations
  Subsetting dataset
  Date and time in R
  Loops: while & for
  Conditionals: if-else
  Functions: defining functions, anonymous functions
  Apply family of functions

Exploratory analysis in R
  Descriptive Statistical analysis
  Sampling in R
  Merging data
  Reshaping data
  Central tendencies
  Measurements of Dispersion
  Test of Normality
  Null value treatment
  Outlier treatment
  Correlation analysis

Visualization
  Visualizations
    Categorical data: Barplot, Pie chart
    Numeric: boxplot
    Histogram
    Scatter plot
    Line chart
    Libraries like ggplot2, RColorBrewer
  Interactive dashboard
    Shiny for interactive graphical dashboards

Inferential Analysis in R
  Parametric Statistical tests
    Basic theory of inferential statistics
    Hypothesis tests using Z test
    T-statistics test
    Two sampled z test and T test
    ANOVA
    Post-hoc test

Non-Parametric Statistical Test
  Wilcoxon test
  Mann-Whitney U test
  K.S. test
  Runs test
  Chi-square test

Data Loading and file formats
  Loading JSON files
  XML and HTML web scraping
  Interacting with HTML and web APIs
  Interacting with Databases

Text mining/text analytics in R

Machine Learning
  What is Machine Learning
  Machine Learning Real World Example

Supervised learning techniques
  Linear regression
    Linear regression assumptions checks
    Building Linear Regression model
    Case Study - Linear Regression
  Logistic Regression
    Understanding Logistic Regression
    Classification model building using logistic model
    Confusion matrix
  Random Forest
  Decision Tree
  Support Vector Machines
  Naïve Bayes

Unsupervised learning techniques
  Clustering
  K-means Clustering
  Hierarchical Clustering

Time series analysis

Course Advisors and Instructors

Course Advisors

SHWETA GUPTA
Vice President, Tech.
Shweta Gupta has 19+ years of Technology Leadership experience. She holds a patent and a number of publications in ACM, IEEE and IBM journals like Redbook and developerWorks.

MANAS GARG
Architect
Manas Garg heads the Analytics for Marketing at Paypal. He takes Data Driven Decisions for Marketing Success.

VISHAL MISHRA
CEO & Co-Founder
Vishal is a Technology Influencer and CEO of Right Relevance (a platform used by millions for content & influencer discovery).

Course Instructors

GANESH NAIK
Ganesh Naik is the author of several books such as “Learning Linux Shell Scripting”, “Bash Cookbook” and “Mastering Python Scripting for System Administrators”. He is an awesome techie working on various Smart City Projects in India. He has also worked as a corporate trainer for ISRO, Intel, GE, Samsung, Motorola, PSDC (Malaysia), and various companies in Singapore, Malaysia and India.

VAISHALI GARG
Vaishali Garg is a self-taught data analyst with a health-care background. She uses Python with Pandas, Numpy, Matplotlib and Scikit. She has a keen interest in data analysis using Pandas and actively answers Pandas-related questions on StackOverflow (Vaishaligarg, alias: A-Za-z). Some of her analysis is available on Kaggle.

PRITESH SHRIVASTAVA
Pritesh is a Data Science enthusiast with an ability to turn data into actionable insights and meaningful stories. He possesses solid knowledge and hands-on experience of both quantitative and qualitative analysis and data mining. Apart from his profession, he has a passion and talent for dramatics, travel, storytelling, and martial arts.

Capstone Projects (3 Weeks)

Every participant is mandated to solve one Capstone Project for Certification. The learner is encouraged to solve all available projects to sharpen the skills across several domains.

Natural Language Processing

Project Description:
This is one of the most applied areas for AI, Data Science, and Machine Learning across domains and industries. The real world is filled with mostly messy text data, and handling text is an important step towards making smarter algorithms. Using the IMDB dataset from the movie domain, the learner will apply the most common concepts of NLP.

Key Takeaway:
This project will empower the learners to build intermediate skills in the natural language processing domain. A few of the fundamentals of working with textual data covered in this project are:
  Remove stop words
  Apply Stemming and Lemmatization
  Create a cluster of words
  Build a sentiment analysis model and a clustering model

Bank Marketing

Project Description:
The banking industry is working in a very competitive environment and needs to strategize to grow its business. This project is related to the marketing campaigns related to term deposits, making an interesting multi-disciplinary work that mixes both the finance and the marketing domain.

Key Takeaway:
The approach to this project is to think, define, design, code, test and tune your solution, in such a way that you apply all aspects of the data science process. The data is real-world data with unclean and null values.
  Build the model to predict if a customer will
  Identify influential factors to form marketing
  Improve long-term relationship with the clients
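
The "Remove stop words" step named in the Natural Language Processing project can be pictured with a few lines of code. The sketch below is a minimal standard-library illustration under stated assumptions, not the project's actual pipeline: the stop-word set is a deliberately tiny hypothetical list, the tokenizer is a simple regular expression, and the review text is invented. Real projects typically take the stop-word list and tokenizer from an NLP library.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "was", "and", "of", "this", "it"}

def remove_stop_words(text):
    """Lowercase the text, split it into word tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]

# Invented example sentence (not from the IMDB dataset).
review = "This movie was a surprise and the ending is great"
print(remove_stop_words(review))   # ['movie', 'surprise', 'ending', 'great']
```

Dropping these high-frequency words leaves the tokens that carry most of the sentiment signal, which is why the step usually precedes stemming, lemmatization, and model building.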

Healthcare Analysis

Project Description:
Electroencephalography (EEG) is an electrophysiological monitoring method to record the electrical activity of the brain. For this project, we will use the large EEG database at the UCI Machine Learning Repository. This data arises from a large study to examine EEG correlates of genetic predisposition to alcoholism. One fascinating question is whether the patterns are different for an alcoholic and a regular subject.

Key Takeaway:
This capstone project focuses on EEG data analysis, giving an opportunity for students to learn through complexities in dealing with such complex real-world data. The project contains the following exercises:
  Parse and store in an easily understandable and readable form
  Exploratory data analysis to better understand the data
  Using statistical concepts like hypothesis testing
  Identify features to predict whether a subject is alcoholic or not
  Use machine learning algorithms to develop a suitable classifier

Deep Learning Based Project

Project Description:
E-Commerce has experienced considerable growth since the dawn of the internet as a commercial enterprise. Deep Learning excels at identifying patterns in unstructured data and can predict the class of an uploaded image applied on eCommerce context. This project is an attempt to replicate virtual store assistance through image recognition over an eCommerce Fashion MNIST dataset.

Key Takeaway:
This project focuses on the implementation of Neural Networks to solve complex unstructured data problems. The objective is to:
  Build the model to classify the various categories (analytic vertical) of clothing/fashion related images.
  Understanding the implementation of deep learning concepts through Tensorflow and Keras.
  Model optimization by tuning hyper-parameters and implementing dropout layers.

Duration: 3 Weeks
Price: ₹5000 (Including Tax)

Tools Covered

Language: Python
Python is becoming the first choice for Data Scientists. The learners will be learning to use all the relevant libraries: NumPy, Pandas, scikit-learn, Matplotlib.

Tool: Jupyter Notebook
An open-source web application that contains live code, visualizations and narrative text. Learners will be using this for all their data science work.

Platform: Kaggle
Kaggle is an online community of data scientists and machine learners, owned by Google. Learners will be introduced and mentored to use the platform for practice and competitions.

Language: R
R is a language and environment for statistical computing and graphics. Learners will have the opportunity to build skills for using R for data science.

Tool: RStudio
RStudio provides open source and enterprise-ready professional software for the R environment. Learners will be using this editor for Data Science using R assignments.

Tool: Tableau
Tableau is the analytics platform that disrupted the world of business intelligence. Learners will be introduced to this tool.

Placement Services

We partner with 10+ organizations who directly source their Data Science manpower needs from us. From resume creation to helping you crack the final interview, our dedicated placement team is always on its toes to connect talent with the right opportunity.

The Placement Process

The candidate's resume is refined and polished as per Market Standards to help them be searchable.
The resume is shared with relevant organisations by our placement team.
The candidates are prepared for an initial quiz and a coding test.

What Makes us Proud?

“Good to see Digital Vidya becoming increasingly more involved in covering data science vertical,
look forward to collaborate with DV to help shape this industry.”

- Naresh Mehta
AVP – Data Science & Analytics ,

“Yes, I like the huge investment Digital Vidya is doing to create the next generation of talent. Initial
feedback suggests Digital Vidya produces high-quality Data Analysts.”
-Ajay Ohri
Data Scientist,

“I can see a good course structure and well-designed syllabus for those who are passionate
enough to enter into the analytics world. The platform helps people grow professionally and in
very less time.”
-Madhu Vadlamani
Lead Analytics,

Participants Speak

“I was looking for customized content and I found the same in Digital Vidya. Content is structured
and well planned. Classes were very interactive and trainer’s presentation skills were very good.
People who are new to the subject can also understand clearly. Thank you so much!”

-Vani Ananthamurthy
(Business Operations Senior Analyst, Accenture)

“This course gets you started from very basics, makes you think and solve the assignments, and
suddenly you find yourself doing Data Science all by yourself!”
-Nanddeep Nasnodkar
(Sr. Software Developer - Remote Software Solutions)

Interested? Contact Us!
+91-84680-02880
[email protected]
www.digitalvidya.com

Duration: 18 Weeks
Fee: Rs. 34,900+GST
Batch Options: Weekend
