Course Contents of Hadoop and Big Data
1. Introduction to Hadoop & Big Data
Introduction to Hadoop
Introduction to Big Data
Hadoop Ecosystem - Concepts
Hadoop Map Reduce Concepts and Features
Developing the Map Reduce Applications
PIG Concepts
HIVE Concepts
Flume concepts
HUE Concepts
Hbase concepts
Real Life Use Cases
How Hadoop can solve problems associated with traditional large-scale systems
Other Open Source Software related to Hadoop
In-depth knowledge of how Big Data solutions work in the cloud
How to create your own Hadoop Cluster
2. Hadoop Architecture
Understand the main Hadoop Components
Setup Hadoop
Pseudo Mode
Cluster Mode
IPv6
Installation of Java, Hadoop
Configuration of Hadoop
Hadoop Processes
Name Node, Secondary Name Node
Job Tracker, Task Tracker
Data Node
HDFS - Hadoop Distributed File System
Learn How HDFS Works
HDFS Design and Architecture
HDFS Concepts
Interacting with HDFS using the command line
Interacting with HDFS using Java APIs
Dataflow
Blocks
Replica
List Data Access patterns for which HDFS is Designed
Learn how data is stored in an HDFS Cluster
Learn HDFS Commands
Linux Basics, Installation and Commands
3. Querying Data
An overview of Pig, Hive and JAQL
Working with Pig
Working with Hive
Working with JAQL
Working with Pig, Hive and JAQL Transcript
Querying Data with Pig, Hive and JAQL
4. Introduction to MapReduce
Understand the concepts of map and reduce operations
Developing MapReduce Applications
Describe how Hadoop executes a MapReduce Job
Phases in the MapReduce Framework
MapReduce Input and Output Formats
Advanced Concepts
Sample Application
Combiner
Joining Datasets in MapReduce Jobs
Map-Side Join
Reduce-Side Join
MapReduce customization
Custom Input Format class
Hash Partitioner
Custom Partitioner
Sorting Technique
Custom Output Format class
Writing a MapReduce program
The MapReduce Flow
Examining a sample MapReduce program
Basic MapReduce API concepts
The Mapper
The Reducer
Hadoop Streaming API
Using Eclipse for rapid development
Hands-on Exercise
Common MapReduce Algorithms
Sorting and Searching
Indexing
Machine Learning with Mahout
Term Frequency
List MapReduce Fundamental Data Types
Explain a MapReduce Data flow
List MapReduce fault tolerance and scheduling features
5. HIVE
Introduction to Hive
Installation and Configuration
Interacting with HDFS using Hive
Map Reduce Programs through Hive
Hive Commands
Loading, Filtering, Grouping
Data Types, Operators
Joins, Groups
Sample Program in Hive
Hive Query Language
Alter and Delete in Hive
Partition in Hive
Indexing
Joins in Hive, Unions in Hive
Authentication and Authorization
Statistics with Hive
Archiving in Hive
Hands on exercise
6. PIG
Introduction to PIG
Installation and Configuration
Commands
Data Loading in PIG
Data Extraction in PIG
Data Transformation in PIG
Hands on Exercise on PIG
7. Shifting Data into Hadoop
Understand how to transfer data into Hadoop using Flume
Introduction to Flume
Introduction to Flume Transcript
Working with Flume
Flume mode of operation and configuration
8. Working with Sqoop
Introduction to Sqoop
Import Data
Export Data
Sqoop Syntax
Database connections
Hands on Exercise
9. Working with Flume
Introduction to Flume
Configuration and Setup
Flume sink with example
Channel
Flume source with example
Complex Flume Architecture
IMPALA Concepts
HUE Concepts
OOZIE Concepts
10. Graph Techniques used in Hadoop
Course Content:
Course Objective Summary
During this course, you will learn:
Introduction to Big Data and Analytics
Introduction to Hadoop
Hadoop ecosystem - Concepts
Hadoop Map-reduce concepts and features
Developing the map-reduce Applications
Pig concepts
Hive concepts
Sqoop concepts
Flume Concepts
Oozie workflow concepts
Impala Concepts
Hue Concepts
HBASE Concepts
ZooKeeper Concepts
Real Life Use Cases
Reporting Tool
Tableau
1. VirtualBox/VMware
Basics
Installations
Backups
Snapshots
2. Linux
Basics
Installations
Commands
3. Hadoop
Why Hadoop?
Scaling
Distributed Framework
Hadoop v/s RDBMS
Brief history of Hadoop
4. Setup Hadoop
Pseudo mode
Cluster mode
IPv6
SSH
Installation of Java, Hadoop
Configuration of Hadoop
Hadoop Processes (NN, SNN, JT, DN, TT)
Temporary directory
UI
Common errors when running a Hadoop cluster, and their solutions
5. HDFS - Hadoop Distributed File System
HDFS Design and Architecture
HDFS Concepts
Interacting with HDFS using the command line
Interacting with HDFS using Java APIs
Dataflow
Blocks
Replica
6. Hadoop Processes
Name node
Secondary name node
Job tracker
Task tracker
Data node
7. Map Reduce
Developing Map Reduce Application
Phases in Map Reduce Framework
Map Reduce Input and Output Formats
Advanced Concepts
Sample Applications
Combiner
8. Joining datasets in MapReduce jobs
Map-side join
Reduce-side join
9. MapReduce customization
Custom Input format class
Hash Partitioner
Custom Partitioner
Sorting techniques
Custom Output format class
10. Hadoop Programming Languages: I. HIVE
Introduction
Installation and Configuration
Interacting with HDFS using HIVE
Map Reduce Programs through HIVE
HIVE Commands
Loading, Filtering, Grouping
Data types, Operators
Joins, Groups
Sample programs in HIVE
II. PIG
Basics
Installation and Configurations
Commands
OVERVIEW HADOOP DEVELOPER
11. Introduction
12. The Motivation for Hadoop
Problems with traditional large-scale systems
Requirements for a new approach
13. Hadoop: Basic Concepts
An Overview of Hadoop
The Hadoop Distributed File System
Hands-On Exercise
How MapReduce Works
Hands-On Exercise
Anatomy of a Hadoop Cluster
Other Hadoop Ecosystem Components
14. Writing a MapReduce Program
The MapReduce Flow
Examining a Sample MapReduce Program
Basic MapReduce API Concepts
The Driver Code
The Mapper
The Reducer
Hadoop's Streaming API
Using Eclipse for Rapid Development
Hands-on exercise
The New MapReduce API
15. Common MapReduce Algorithms
Sorting and Searching
Indexing
Machine Learning With Mahout
Term Frequency Inverse Document Frequency
Word Co-Occurrence
Hands-On Exercise
16. PIG Concepts
Data loading in PIG
Data Extraction in PIG
Data Transformation in PIG
Hands-on exercise on PIG
17. Hive Concepts
Hive Query Language
Alter and Delete in Hive
Partition in Hive
Indexing
Joins in Hive, Unions in Hive
Industry-specific configuration of Hive parameters
Authentication & Authorization
Statistics with Hive
Archiving in Hive
Hands-on exercise
18. Working with Sqoop
Introduction
Import Data
Export Data
Sqoop Syntax
Database connections
Hands-on exercise
19. Working with Flume
Introduction
Configuration and Setup
Flume Sink with example
Channel
Flume Source with example
Complex Flume architecture
20. OOZIE Concepts
21. IMPALA Concepts
22. HUE Concepts
23. HBASE Concepts
24. ZooKeeper Concepts
Reporting Tool
Tableau
This course is designed for the beginner to intermediate-level Tableau user. It is for anyone who works
with data regardless of technical or analytical background. This course is designed to help you
understand the important concepts and techniques used in Tableau to move from simple to complex
visualizations and learn how to combine them in interactive dashboards.
Course Topics
Overview
What is visual analysis?
Strengths/weaknesses of the visual system
Laying the Groundwork for Visual Analysis
Analytical Process
Preparing for analysis
Getting, Cleaning and Classifying Your Data
Cleaning, formatting and reshaping
Using additional data to support your analysis
Data classification
Visual Mapping Techniques
Visual Variables: Basic Units of Data Visualization
Working with Color
Marks in action: Common chart types
Solving Real-World Problems with Visual Analysis
Getting a Feel for the Data - Exploratory Analysis
Making comparisons
Looking at (co-)Relationships
Checking progress
Spatial Relationships
Try, try again
Communicating Your Findings
Fine-tuning for more effective visualization
Storytelling and guided analytics
Dashboards