Big Data and Spark Developers

This 10-day training course covers Apache Spark and big data concepts with 16 lessons totaling 60 hours. The course introduces students to core big data technologies like Hadoop, HDFS, YARN, MapReduce, Hive, and Spark. It teaches students how to work with different data formats and optimize Spark applications. Hands-on exercises are included to help students apply what they are learning.

Uploaded by

Balaji Arun
Copyright
© All Rights Reserved

Days: 10

Number of Hours: 60

Lesson 01 - Introduction to Big data and Hadoop Ecosystem

1.1 Introduction

1.2 Overview of Big Data and Hadoop

1.3 Pop Quiz

1.4 Hadoop Ecosystem

Lesson 02 - HDFS and YARN

2.1 Introduction

2.2 HDFS Architecture and Components

2.3 Pop Quiz

2.4 Block Replication Architecture

2.5 YARN Introduction

2.6 Quiz

2.7 Key Takeaways

2.8 Hands-on Exercise

Lesson 03 - MapReduce and Sqoop

3.1 Introduction

3.2 Why MapReduce

3.3 Small Data and Big Data

3.4 Pop Quiz

3.5 Data Types in Hadoop

3.6 Joins in MapReduce

3.7 What is Sqoop

3.8 Quiz

3.9 Key Takeaways

3.10 Hands-on Exercise

Lesson 04 - Basics of Hive and Impala

4.1 Introduction

4.2 Pop Quiz

4.3 Interacting with Hive and Impala

4.4 Quiz

4.5 Key Takeaways

Lesson 05 - Working with Hive and Impala

5.1 Working with Hive and Impala

5.2 Pop Quiz

5.3 Data Types in Hive

5.4 Validation of Data

5.5 What Is HCatalog and Its Uses

5.6 Quiz

5.7 Key Takeaways

5.8 Hands-on Exercise

Lesson 06 - Types of Data Formats

6.1 Introduction

6.2 Types of File Format

6.3 Pop Quiz

6.4 Data Serialization

6.5 Importing MySQL and Creating hivetb

6.6 Parquet With Sqoop

6.7 Quiz

6.8 Key Takeaways

6.9 Hands-on Exercise

Lesson 07 - Advanced Hive Concepts and Data File Partitioning

7.1 Introduction

7.2 Pop Quiz

7.3 Overview of the Hive Query Language

7.4 Quiz

7.5 Key Takeaways

7.6 Hands-on Exercise

Lesson 08 - Apache Flume and HBase

8.1 Introduction

8.2 Pop Quiz

8.3 Introduction to HBase

8.6 Hands-on Exercise

Lesson 09 - Pig

9.1 Introduction

9.2 Pop Quiz

9.3 Getting Datasets for Pig Development

9.4 Quiz

9.5 Key Takeaways

9.6 Hands-on Exercise

Lesson 10 - Basics of Apache Spark

10.1 Introduction

10.2 Spark - Architecture, Execution, and Related Concepts

10.3 Pop Quiz

10.4 RDD Operations

10.5 Functional Programming in Spark

10.6 Quiz

10.7 Key Takeaways

10.8 Hands-on Exercise

Lesson 11 - RDDs in Spark

11.1 Introduction

11.2 RDD Data Types and RDD Creation

11.3 Pop Quiz

11.4 Operations in RDDs

11.5 Quiz

11.6 Key Takeaways

11.7 Hands-on Exercise

Lesson 12 - Implementation of Spark Applications

12.1 Introduction

12.2 Running Spark on YARN

12.3 Pop Quiz

12.4 Running a Spark Application

12.5 Dynamic Resource Allocation

12.6 Configuring Your Spark Application

12.7 Quiz

12.8 Key Takeaways

Lesson 13 - Spark Parallel Processing

13.1 Introduction

13.2 Pop Quiz

13.3 Parallel Operations on Partitions

13.4 Quiz

13.5 Key Takeaways

13.6 Hands-on Exercise

Lesson 14 - Spark RDD Optimization Techniques

14.1 Introduction

14.2 Pop Quiz

14.3 RDD Persistence

14.4 Quiz

14.5 Key Takeaways

14.6 Hands-on Exercise

Lesson 15 - Spark Algorithm

15.1 Introduction

15.2 Spark: An Iterative Algorithm

15.3 Introduction to Graph Parallel System

15.4 Pop Quiz

15.5 Introduction to Machine Learning

15.6 Introduction to Three C's

15.7 Quiz

15.8 Key Takeaways

Lesson 16 - Spark SQL

16.1 Introduction

16.2 Pop Quiz

16.3 Interoperating with RDDs

16.4 Quiz

16.5 Key Takeaways

16.6 Hands-on Exercise

The End
