Data Analytics TOC
Curriculum Details
Scope and Objective
Enable students to explore the fundamentals of Big Data Analytics and
provide them with a base from which they can upskill for specific Big
Data Analytics job roles.
Intended Audience
University students enrolled in streams such as
Engineering, Computer Science, Statistics, Sciences or
Mathematics
List of Tools Suggested (Indicative)
SQL, MongoDB, Hadoop, MapReduce, HDFS, Apache Spark, PySpark, SparkR,
Java, Apache Pig, DynamoDB, Spark MLlib, GraphX, Postgres, Pandas
Indicative TOC
Data Analytics
Module 1: Data Analytics: An Overview
Data Analytics: What and Why?
Different components of a modern data ecosystem, and the role Data Analysts play in this
ecosystem.
Different types of data analysis and the key steps in a data analysis process.
Roles, responsibilities, and skillsets required to be a Data Analyst
Data Analytics Tools
Chapter 2: HDFS
o HDFS Concepts & Design
o Architecture, HDFS Daemons
o Overview of Hadoop Distributed File System
Name nodes
Data nodes
The Command-Line Interface
o Data Flow (File Read, File Write)
o Fault Tolerance, Shell Commands
o Data Flow, Archives, Coherency, Data Integrity
o Role of Secondary NameNode
Tableau
o What is Tableau?
o Tableau Architecture
o Workspace & Navigation
o Tableau Data Connections
o Filter data in Tableau
o Sort data in Tableau
o Data Visualization with Tableau
o Dynamic Data Manipulation and Presentation in Tableau
Module 4: Mining & Visualizing Data and Communicating Results
Chapter 1: Introduction to Statistical Modelling
o Linear Regression
Simple Linear Regression
Multiple Linear Regression
o Classification
Logistic Regression
Discriminant Analysis
o Resampling Methods
Bootstrapping
Cross-Validation
o Tree-based Methods
Bagging
Boosting
o Unsupervised Learning
Principal Component Analysis
K-Means Clustering
Hierarchical Clustering
o Types of Variables
Dependent Variable, also known as Response Variable
Explanatory Variable, also known as Independent Variable
o Model Parameters and Model Residuals
R Programming
o Understanding R as a programming environment
o R basics
Math, Variables, and Strings
Vectors and Factors
Vector operations
o Data structures in R
o Arrays & Matrices
o Lists
o Dataframes
o R programming fundamentals
Conditions and loops
Functions in R
Objects and Classes
Debugging
o Working with data in R
Reading CSV and Excel Files
Reading text files
Writing and saving data objects to file in R
o Strings and Dates in R
String operations in R
Regular Expressions
Dates in R
o Descriptive Statistics using R
o Career opportunities in the field of Data Analysis and the different paths that
you can take to become skilled as a Data Analyst.
o Hands-on project with scenario-based use cases in gathering, wrangling, mining,
analyzing, and visualizing data.