Hadoop Updated Course Content

The document provides an overview of the content covered in an online Hadoop training course. The course introduces Hadoop and its ecosystem including HDFS, MapReduce, HBase, Pig, Hive, Sqoop and HUE. It covers topics such as installation, configuration, programming with MapReduce, analyzing structured and unstructured data with Hive, and monitoring and troubleshooting Hadoop clusters. Hands-on exercises are included to help students develop skills working with Hadoop frameworks and tools.

Uploaded by

nandy39

Hadoop Online Training Course Content

1) Introduction

Problems with traditional large-scale systems
Requirements for a new approach
Introduction to Existing Systems
Challenges in Traditional Databases

2) Welcome to Hadoop

Introduction to Hadoop
RDBMS Comparison with Hadoop
Motivation for Hadoop
Hadoop Terminology

3) Hadoop Ecosystem

NameNode
DataNode
Secondary NameNode
JobTracker
TaskTracker
Hands-On Exercise

4) HDFS

HDFS Configuration
Monitoring HDFS
HDFS Permissions and Security
Scalability
Blocks
Replication
HDFS Architecture with Distributed Nodes
HDFS Shell
Hands-On Exercise

5) MapReduce

What Is MapReduce?
Features of MapReduce
Basic MapReduce Concepts
Architectural Overview
Fault Tolerance
Hands-On Exercise

6) Planning your Hadoop Cluster

General Planning Considerations
Choosing the Right Hardware
Network Considerations
Configuring Nodes

7) Hadoop Framework Full Installation

Installation Methods
Method 1: Using a Preconfigured Virtual Machine
Method 2: Manual Installation and Configuration
Installation on Windows/Linux Machines
HDFS Configuration
MapReduce Configuration
Hands-On Exercise

8) Advanced Configuration

Advanced Parameters
Configuring Rack Awareness
Configuring Federation
Configuring High Availability

9) Getting Started With Eclipse IDE

Configuring the Hadoop File System in the Eclipse IDE
Connecting the Eclipse IDE to HDFS
Developing MapReduce Jobs in the Eclipse IDE

10) Writing a MapReduce Program

The MapReduce Flow
Examining a Sample MapReduce Program
Basic MapReduce API Concepts
The Driver Code
The Mapper
The Reducer
Hadoop's Streaming API
Using Eclipse for Rapid Development
Hands-On Exercise

11) MapReduce

Parallel Programming Language
MapReduce Overview and Architecture
Developing MapReduce Jobs
Input and Output Data Formats
Job Configuration with Map and Reduce Functions
Job Submission on HDFS
Job Monitoring

12) MapReduce Advanced Programming

Partitioner
Combiner
Indexing
Searching
Sorting
Grouping/Shuffling

13) Hadoop Streaming With Mapper

14) Distributed Debugging of a Hadoop Cluster

15) Cluster Monitoring and Troubleshooting

General System Monitoring
Managing Hadoop's Log Files
Using the NameNode and JobTracker Web UIs
Hands-On Exercise
Common Troubleshooting Issues
Benchmarking Your Cluster

16) Using Yahoo Web Services

17) Hadoop Security

18) Pig

Pig Overview
Installation
Pig Latin
Pig with HDFS
Loading HDFS Data into Pig
Grunt Shell
Practice with Pig Scripting
Seeing Pig in Action: Example of Computing Similar Patents

19) Hive

Hive Overview
Installation
HiveQL
Hive with HDFS
Analyzing Structured Data with Hive
Analyzing Unstructured Data with Hive
Analyzing Semi-structured Data with Hive
Practice with HiveQL

20) HBase

HBase Overview and Architecture
HBase Installation
HBase Shell
CRUD operations
Scanning and Batching
Filters
HBase Key Design

21) Sqoop

Sqoop Overview
Installation
Imports and Exports

22) HUE

The GUI System
Monitoring Data with HUE

23) Conclusion
