62-BigData Hadoop Course

This document outlines a 23-session training course on Hadoop administration. It covers traditional databases and SQL, the challenges of traditional databases, Hadoop architecture and HDFS, single-node and multi-node cluster configuration, cluster maintenance, the Hadoop ecosystem (Sqoop, Flume, Hive, and Pig), cluster monitoring and optimization, data backup and restore, log management, and troubleshooting.

Uploaded by krishn

Big Data: Hadoop Administration

Session-1: Introduction to Traditional Databases


Introduction to Databases
3-Tier Architecture
Data Models
Entity Relationship Model
ER Diagram
Session-2: SQL (Structured Query Language)
Create Database, Drop Database
Create Table and Insert Values
Queries
Logical Operators (AND, OR, NOT)
Update & Delete Queries
LIKE and TOP Clauses
Session-3: SQL (Continued)
ORDER BY
GROUP BY
DISTINCT Keyword
SQL Constraints
Using Joins
UNION
Session-4: SQL (Continued)
UNION Clause
NULL Values
Using Aliases and TRUNCATE
HAVING Clause
Table Cloning
Subqueries
Session-5: Data Backup
Backup Entire Database
Backup Single Database
Backup Single Table
Session-6: Challenges in Traditional Databases
Fragmented Resources
The Emerging Data Libraries
Database Engine Architecture
Unstructured Data
Data Loss/Theft
Data Security
Session-7: Challenges (Continued)
Capacity Planning
Backup for Backup
Unpredictable Cost
Bandwidth Saturation
Data Storage
Data Retrieval
Session-8: Introduction to Hadoop
Hadoop Architecture
MapReduce
Hadoop Distributed File System
Environment Setup
Session-9: HDFS Overview
HDFS Architecture
DataNode
Importing Data into HDFS
MapReduce
MapReduce Job Management
HDFS Commands
Session-10: Single Node Cluster Configuration
Hadoop Prerequisites
Hadoop Installation & Configuration
Session-11: Multi Node Cluster Configuration
Hadoop Prerequisites
Hadoop Installation & Configuration
Session-12: Cluster Maintenance
Checking HDFS Status
Breaking the Cluster
Adding and Removing Cluster Nodes
Rebalancing the Cluster
Copying Data Between Clusters
Cluster Upgrading
Session-13: Hadoop Ecosystem (Sqoop)
Introduction to Sqoop
Downloading & Installing the Package
Server Installation
Client Installation
Upgrading the Server
Session-14: Hadoop Ecosystem (Flume)
The Need for Apache Flume
Downloading & Installing Flume
Data Management Using Flume
Session-15: Hadoop Ecosystem (Hive)
Introduction to Data Warehousing
Hive Architecture
Installing Hive
Data Management Using Hive
Session-16: Hadoop Ecosystem (Pig)
Pig Overview
The Need for Apache Pig
Apache Pig Architecture
Downloading & Installing Pig
Pig Latin Basics
Pig Latin Built-in Functions and Data Management
Session-17: Cluster Monitoring, Troubleshooting & Optimisation
Checking HDFS with fsck
Breaking the Cluster
Copying Data with distcp
Rebalancing Cluster Nodes
Adding and Removing Cluster Nodes
Cluster Self-Healing Feature
Session-18: Data Backup
Understanding the Process
Prerequisites for Data Backup
Backing Up a Hadoop Cluster
Session-19: Restoring Data
Understanding the Process
Prerequisites for Data Restore
Restoring Data
Session-20: Manage Hadoop Log Files
Understanding Server Logs
The Need for Data Visualisation
Apache Zeppelin
Challenges in Processing Log Files
Session-21: Observium
Introduction to Observium
Hardware & Software Requirements
OS Installation
Customisation
Session-22: Ganglia
Overview
Downloading & Installing Ganglia
Customisation
Session-23: Troubleshooting Cluster
Validate Environment Information
Validate Hadoop Cluster Health
Troubleshooting HDFS
Troubleshooting Hive
