Unit 5
1. Pig
Introduction to Pig
Execution Modes
Grunt
Pig Latin
• Filtering (FILTER), grouping (GROUP), joining (JOIN), sorting (ORDER BY), etc.
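A minimal sketch of what these operators do, written as plain Python over in-memory tuples rather than as Pig Latin. The sample relations (`employees`, `departments`) and their fields are hypothetical and chosen only for illustration; each comment names the Pig Latin statement that the Python line roughly mirrors.

```python
from collections import defaultdict

# Hypothetical sample relations: (name, dept, salary) and (dept, city).
employees = [
    ("asha", "eng", 120),
    ("ravi", "ops", 80),
    ("meena", "eng", 95),
    ("john", "ops", 60),
]
departments = [("eng", "Pune"), ("ops", "Delhi")]

# Roughly: filtered = FILTER employees BY salary > 70;
filtered = [e for e in employees if e[2] > 70]

# Roughly: grouped = GROUP filtered BY dept;
grouped = defaultdict(list)
for e in filtered:
    grouped[e[1]].append(e)

# Roughly: joined = JOIN filtered BY dept, departments BY dept; (inner join)
city_by_dept = dict(departments)
joined = [e + (e[1], city_by_dept[e[1]]) for e in filtered if e[1] in city_by_dept]

# Roughly: ordered = ORDER filtered BY salary DESC;
ordered = sorted(filtered, key=lambda e: e[2], reverse=True)
```

In Pig itself each statement defines a new relation and nothing runs until output is requested; the Python version evaluates eagerly, so it only illustrates the semantics of each operator, not the execution model.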
2. Hive
Apache Hive Architecture
Hive Components
• Hive Shell
• Hive Services (Driver, Compiler, Execution Engine)
• Metastore: Stores schema and metadata.
• Schema-on-read (Hive) vs schema-on-write (traditional databases).
• Optimized for batch processing, not OLTP.
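The schema-on-read idea can be sketched in plain Python: raw lines are stored without validation, and a schema is applied only when the data is read. The `schema` list and the `read_with_schema` helper are hypothetical names for this illustration and are not part of Hive. Fields that do not fit the declared type become `None`, which is meant to resemble the NULL values Hive typically returns for unparseable fields.

```python
# Raw lines are stored untouched; nothing is validated at load time.
raw_lines = ["1,asha,120", "2,ravi,not_a_number", "3,meena"]

# Hypothetical schema, applied only when the data is read: (column, type).
schema = [("id", int), ("name", str), ("salary", int)]


def read_with_schema(line, schema):
    """Parse one raw line against the schema; bad or missing fields become None."""
    fields = line.split(",")
    row = {}
    for i, (col, cast) in enumerate(schema):
        try:
            row[col] = cast(fields[i])
        except (IndexError, ValueError):
            row[col] = None  # analogous to NULL in a query result
    return row


rows = [read_with_schema(line, schema) for line in raw_lines]
```

A schema-on-write system would instead reject the second and third lines at load time, which makes loads slower but guarantees that every stored row already matches the schema.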
HiveQL
3. HBase
HBase Concepts
HBase vs RDBMS
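• HBase: flexible schema (column families fixed per table, columns added freely), horizontal scalability, real-time random read/write; RDBMS: fixed structured schema and full ACID transactions.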
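A toy in-memory model of the HBase-style layout, assuming the usual description of a cell as row key, column family, column qualifier, and timestamp. `TinyTable` and its methods are hypothetical and are not the HBase client API; the sketch only shows that column families are declared up front while qualifiers can vary from row to row, and that each cell keeps timestamped versions.

```python
import time


class TinyTable:
    """Toy model: row key -> 'family:qualifier' -> {timestamp: value}."""

    def __init__(self, families):
        self.families = set(families)  # column families are fixed up front
        self.rows = {}

    def put(self, row_key, column, value, ts=None):
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        ts = ts if ts is not None else time.time_ns()
        self.rows.setdefault(row_key, {}).setdefault(column, {})[ts] = value

    def get(self, row_key, column):
        """Return the newest version of a cell, or None if it is absent."""
        versions = self.rows.get(row_key, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]


table = TinyTable(families=["info"])
table.put("user1", "info:email", "a@example.com", ts=1)
table.put("user1", "info:email", "b@example.com", ts=2)  # newer version
table.put("user2", "info:phone", "555-0100", ts=1)       # different qualifier
```

Here `user1` and `user2` hold different columns within the same family, which a fixed relational table would need NULL-filled columns or a schema change to express.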
Advanced Usage
Schema Design
Advanced Indexing
Zookeeper
InfoSphere
BigInsights
BigSheets