
Advanta Innovation: Course Objective Summary

This course covers the fundamentals of big data, Hadoop, and related technologies like MapReduce, HDFS, Pig, Hive, HBase and Oozie. Students will learn about big data challenges and how Hadoop provides a framework to process vast amounts of data in a distributed manner. The course includes hands-on exercises for developing MapReduce applications, interacting with HDFS and running jobs on a Hadoop cluster. Students will also learn conceptual aspects of Pig, Hive, HBase and scheduling workflows with Oozie. Real-life use cases are discussed to demonstrate how companies employ Hadoop technologies at scale.

Uploaded by

Faraz Matin
Copyright
© All Rights Reserved

Advanta Innovation

Course Objective Summary


During this course, you will learn:
Introduction to Big Data and Hadoop
Hadoop ecosystem concepts
Hadoop MapReduce concepts and features
Developing MapReduce applications
Pig concepts
Hive concepts
Oozie workflow concepts
HBase concepts
Real-life use cases

Introduction to Big Data and Hadoop

What is Big Data?
What are the challenges for processing big data?
What technologies support big data?
What is Hadoop?
Why Hadoop?
History of Hadoop
Use cases of Hadoop
Hadoop ecosystem
HDFS
MapReduce
Statistics

Understanding the Cluster

Typical workflow
Writing files to HDFS
Reading files from HDFS
Rack awareness
5 daemons

Let's Talk MapReduce

Before MapReduce
MapReduce overview
Word count problem
Word count flow and solution
MapReduce flow
Algorithms for simple and complex problems

Developing the MapReduce Application

Data types
File formats
Explanation of the Driver, Mapper and Reducer code
Configuring the development environment (Eclipse)
Writing unit tests
Running locally
Running on a cluster
Hands-on exercises

How MapReduce Works

Anatomy of a MapReduce job run
Job submission
Job initialization
Task assignment
Job completion
Job scheduling
Job failures
Shuffle and sort
Hands-on exercises

MapReduce Types and Formats

MapReduce types
Input formats: input splits and records, text input, binary input, multiple inputs and database input
Output formats: text output, binary output, multiple outputs, lazy output and database output
Hands-on exercises

MapReduce Features

Counters
Sorting
Joins: map-side and reduce-side
Side data distribution
MapReduce combiner
MapReduce partitioner
MapReduce distributed cache
Hands-on exercises

Hive and Pig

Fundamentals
When to use Pig and Hive
Concepts
Oozie workflows
Hands-on exercises

HBase

CAP theorem
Introduction to NoSQL
HBase architecture and concepts
Programming and hands-on exercises

Case Study Discussions
Certification Guidance
