Big Data and Hadoop Course

The document outlines a training program on big data analytics that consists of 5 modules over 14-15 weeks. Module 1 covers database architecture and SQL commands over 1 week. Module 2 covers big data, Hadoop architecture, and HDFS over 2 weeks. Module 3 provides an in-depth study of SQL concepts and commands over 3 weeks. Module 4 focuses on Hive concepts, loading and querying data in Hive, and Hive UDFs over 3 weeks. Module 5 examines other Hadoop technologies like MapReduce, Pig, Flume, HBase, and Spark over 5-6 weeks.

Uploaded by

Ramakant Sharma

Big Data Analytics

Module 1 :
Duration:1 week

Understanding Databases and SQL:
I. Architecture of a Database
II. What is SQL (Structured Query Language)
III. SQL commands
SQL overview
SQL SELECT statements
SQL functions and expressions
SQL updating
SQL joins
SQL with multiple tables
SQL summarization
SQL: preparing for the real world

Module 2 :
Duration:2 weeks

Understanding Big Data and Hadoop:

Topics :
I. Big Data
II. Limitations and Solutions of existing Data Analytics Architecture
III. Hadoop
IV. Hadoop Architecture and HDFS
Hadoop Cluster Architecture
Important Configuration files in a Hadoop Cluster
Data Loading Techniques.
Module 3 :
Duration:3 weeks

Understanding SQL:
I. SQL Overview
Relational database concepts, specific products
SQL syntax rules
Data definition, data manipulation, and data control statements
Getting acquainted with the course database and editor
II. SQL SELECT statements
Clauses
The SELECT clause: columns and aliases
WHERE expressions
ORDER BY expressions
How null values behave
III. SQL Functions and Expressions
Eliminating duplicates with DISTINCT
Arithmetic expressions
Replacing null values
Literals, concatenation, other string functions
Numeric operations, including rounding
Date and time functions
Nested table expressions
Case logic
Other expressions in specific DBMS products
IV. SQL Updating
The INSERT, UPDATE and DELETE statements
Column constraints and defaults
Referential integrity constraints

V. SQL Joins
Inner joins with original and SQL-92 syntax
Table aliases
Left, right and full outer joins
Self-joins

VI. SQL Subqueries and Unions


Intersection with IN and EXISTS subqueries
Difference with NOT IN and NOT EXISTS subqueries
The purpose and usage of UNION and UNION ALL
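
As an illustration of the subquery and union topics in this section, the following is a minimal sketch using Python's built-in sqlite3 module. The tables enrolled_sql and enrolled_hive and their rows are hypothetical and are not part of the course material; other database products may differ in details.

```python
import sqlite3

# Hypothetical sample data: two enrollment lists (not from the course material).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enrolled_sql (student TEXT);
    CREATE TABLE enrolled_hive (student TEXT);
    INSERT INTO enrolled_sql VALUES ('Asha'), ('Ben'), ('Chen');
    INSERT INTO enrolled_hive VALUES ('Ben'), ('Chen'), ('Dev');
""")

def names(query):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(query)]

# Intersection: students present in both tables, expressed with IN.
both = names("""
    SELECT student FROM enrolled_sql
    WHERE student IN (SELECT student FROM enrolled_hive)
    ORDER BY student
""")

# Difference: students only in enrolled_sql, expressed with NOT EXISTS.
sql_only = names("""
    SELECT s.student FROM enrolled_sql AS s
    WHERE NOT EXISTS (
        SELECT 1 FROM enrolled_hive AS h WHERE h.student = s.student
    )
    ORDER BY s.student
""")

# UNION removes duplicate rows; UNION ALL keeps them.
union_rows = names(
    "SELECT student FROM enrolled_sql UNION SELECT student FROM enrolled_hive"
)
union_all_rows = names(
    "SELECT student FROM enrolled_sql UNION ALL SELECT student FROM enrolled_hive"
)

print(both, sql_only, len(union_rows), len(union_all_rows))
```

With this sample data, the two shared names appear once in the UNION result and twice in the UNION ALL result, which is the practical difference between the two operators.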

VII. SQL Summarization


The column functions MIN, MAX, AVG, SUM and COUNT
The GROUP BY and HAVING clauses
Grouping in combination with joining
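
Grouping combined with joining can be sketched as follows, again with Python's built-in sqlite3 module. The modules and topics tables and their contents are hypothetical, chosen only for illustration; this is one possible example, not the course's own exercise.

```python
import sqlite3

# Hypothetical schema: each topic row points at the module it belongs to.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE modules (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE topics (module_id INTEGER REFERENCES modules(id), name TEXT);
    INSERT INTO modules VALUES (1, 'SQL basics'), (2, 'Hadoop'), (3, 'Hive');
    INSERT INTO topics VALUES
        (1, 'SELECT'), (1, 'Joins'), (1, 'Summarization'),
        (2, 'HDFS'),
        (3, 'Partitions'), (3, 'UDF');
""")

# Join the two tables, group the joined rows by module title, count the
# topics per group, and keep only groups with at least two topics.
rows = conn.execute("""
    SELECT m.title, COUNT(*) AS topic_count
    FROM modules AS m
    JOIN topics AS t ON t.module_id = m.id
    GROUP BY m.title
    HAVING COUNT(*) >= 2
    ORDER BY topic_count DESC, m.title
""").fetchall()

for title, topic_count in rows:
    print(title, topic_count)
```

WHERE filters individual rows before grouping, while HAVING filters whole groups after the aggregate has been computed, which is why the count condition sits in HAVING here.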
Module 4 :
Duration:3 weeks

HIVE:
Understanding Hive concepts, loading and querying data in Hive, and Hive UDFs.

Topics :
Hive Background
Hive Use Case
About Hive
Hive Vs Pig
Hive Architecture and Components
Metastore in Hive
Limitations of Hive
Comparison with Traditional Database
Hive Data Types and Data Models
Partitions and Buckets
Hive Tables (Managed Tables and External Tables)
Importing Data
Querying Data
Managing Outputs
Hive Script
Hive UDF
Hive Demo on Healthcare Dataset

Module 5 :
Duration:5-6 weeks

Other technologies associated with Hadoop:


Hadoop MapReduce framework
Advanced MapReduce
Pig
Advanced Hive and Data file partitioning
Apache Flume and HBase
Processing Distributed data with Apache Spark
RDDs in Apache Spark
Spark SQL
