
Hadoop Development Training in Bangalore

IGEEKS is a Bangalore-based Training & Recruitment company. We offer software training courses from absolute beginner to advanced levels. Providing high-quality training at affordable fees is our core value. All our trainers are working IT professionals with rich experience. We work with our students to develop the skills they need to build their careers in the present competitive environment. We have flexible batch times to suit the schedules of graduating students and working professionals.

IGEEKS TECHNOLOGIES

Software Training Division


Academic Live Projects for BE, ME, MCA, BCA and PhD Students


IGeekS Technologies (Make Final Year Project)
No: 19, MN Complex, 2nd Cross,
Sampige Main Road, Malleswaram, Bangalore- 560003.
Phone No: 080-32487434 / 9739066172
Mail: [email protected], [email protected]
Website: www.igeekstechnologies.com
Landmark: Near Mantri Mall, Malleswaram, Bangalore

Hadoop Development Training
IGeekS Technologies provides a certificate course in Hadoop development. The course duration is 36
hours, and aspirants will be trained in the latest industry-standard Hadoop technologies. By the
end of the course, aspirants will have a basic understanding of the Hadoop framework.

What is Big Data?
Big data is an all-encompassing term for any collection of data sets so large and complex that it
becomes difficult to process using on-hand data management tools or traditional data processing
applications.
The challenges include capture, curation, storage, search, sharing, transfer, analysis and visualization.
The trend to larger data sets is due to the additional information derivable from analysis of a single
large set of related data, as compared to separate smaller sets with the same total amount of data,
allowing correlations to be found to spot business trends, prevent diseases, combat crime and so on.

What is Hadoop?
Hadoop is a framework designed to solve problems related to Big Data. Every day, large
amounts of raw data are generated from many different kinds of sources. This data
contains a lot of useful information that can help solve many kinds of problems. Hadoop helps
analyze this data and extract useful information from it.

Prerequisites
To learn Hadoop, one needs basic knowledge of OOP concepts and programming in Core
Java. SQL knowledge is also preferred.

Goal of the Hadoop program at IGeekS Technologies
The goal of this program is to make the candidate a complete Big Data professional by imparting all
the knowledge required to become a successful Hadoop Developer.

Job Responsibilities of a Hadoop Developer


- Analytical and problem-solving skills, applied to a Big Data environment
- Deep understanding of, and related experience with, the Hadoop stack: HBase, Hive, Pig, Sqoop
- Hands-on experience with related/complementary open source software platforms and programming (e.g. Java, Linux)
- Good experience in writing MapReduce-based algorithms and programs
- Knowledge of and hands-on experience with ETL (Extract-Transform-Load) tools (e.g. Sqoop, Flume)
- Understanding of BI tools and reporting software and their capabilities (e.g. Business Objects)
- Sound knowledge of NoSQL databases and relational databases (RDBMS), as well as SQL
- Experience with agile/scrum methodologies to iterate quickly on product changes, developing user stories and working through backlogs
- Strong analytical ability to understand and interpret business data

Course Contents
Hadoop Development
i. Introduction
ii. Hadoop Installation - configuration
iii. HDFS
iv. HDFS design considerations
v. MapReduce
vi. Pig Installation, Configuration
vii. Grunt shell
viii. Data Model of Pig
ix. Advanced features of Pig Latin
x. Development of PL scripts
xi. Performance tuning in Pig
xii. Load-store functions in Pig
xiii. Hive Introduction, Installation, Configuration
xiv. Data types and file formats
xv. HiveQL - DDL
xvi. HiveQL - DML
xvii. Views in Hive
xviii. Indexes in Hive
xix. Performance tuning in Hive
xx. Sqoop Installation
xxi. Sqoop - import data
xxii. Sqoop free form query import
xxiii. Sqoop export data

NoSQL
i. Introduction to NoSQL
ii. Interacting with NoSQL
iii. Storage Architecture
iv. CRUD operations
v. Query NoSQL stores
vi. Modifying data stores
vii. Indexing
viii. Managing Transactions
ix. NoSQL in cloud
x. Parallel processing
xi. Performance tuning
xii. Tools and Utilities

Pig
i. Introduction
ii. Installation, Configuration
iii. Grunt
iv. Data Model of Pig
v. Pig Latin
a. I/O
b. Relational operations
c. UDFs
vi. Advanced features of Pig Latin
vii. Development of PL scripts
viii. Testing of PL scripts
ix. Performance tuning
x. PL in Python
xi. Filter functions
xii. Load-store functions
xiii. Pig and NoSQL

Hive
i. Introduction, Installation, Configuration
ii. Data types and file formats
iii. HiveQL - DDL
1. Database
2. Table & Index
3. Partitions
iv. HiveQL - DML
1. Joins
2. Where clause
3. Group by
4. Casting
v. Views
vi. Indexes
vii. Schema design
viii. Performance tuning
ix. Compression
x. Hive development: building Hive from source
xi. Functions in Hive
xii. Streaming
xiii. Hive Thrift service
xiv. Storage handlers
xv. Security
xvi. Locking
xvii. Hive Integration
xviii. HCatalog
xix. Serialization

HBase
i. HBase fundamentals
ii. Data manipulation
iii. Data coordinates
iv. Data Models
v. ACID semantics
vi. Distributed HBase
vii. HBase & MapReduce
viii. HBase schema design
ix. HBase table design
x. De-normalization
xi. Heterogeneous data
xii. I/O considerations
xiii. Advanced column family configurations
xiv. Extending HBase
xv. HBase clients
xvi. HBase deployment
xvii. HBase distribution and configuration
xviii. Monitoring cluster
xix. Backup
xx. Replication
xxi. Migration