
PIG ARCHITECTURE
Group 9
What is Apache Pig?
Apache Pig is a high-level data processing platform that simplifies the analysis of large datasets stored in Hadoop. It provides a scripting language called Pig Latin. All Pig Latin scripts are internally converted into Map and Reduce tasks: Apache Pig has a component known as the Pig Engine that accepts Pig Latin scripts as input and converts them into MapReduce jobs.
Rich set of operators − Pig provides many operators to perform operations such as join, sort, filter, etc.
Ease of programming − Pig Latin is similar to SQL, so it is easy to write a Pig script if you are comfortable with SQL.
Handles all kinds of data − Apache Pig analyzes all kinds of data, both structured and unstructured, and stores the results in HDFS.
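As a quick illustration of these operators, here is a minimal Pig Latin sketch; the file paths, schema, and field names are made up for this example:

    -- load a hypothetical CSV of songs: band name and rating
    songs = LOAD 'hdfs:///data/songs.csv' USING PigStorage(',') AS (band:chararray, rating:int);
    -- keep only highly rated songs
    good = FILTER songs BY rating > 6;
    -- sort by rating, highest first
    ranked = ORDER good BY rating DESC;
    -- write the result back to HDFS
    STORE ranked INTO 'hdfs:///output/ranked_songs';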
Contd....
Scalability: Pig programs are designed to run on Hadoop clusters, which
can scale horizontally by adding more nodes. This allows Pig to handle very
large datasets efficiently.
Parallelization: Pig programs are automatically parallelized by the
underlying execution engine (typically MapReduce or Spark). This enables
parallel processing of data across multiple nodes in a Hadoop cluster,
significantly improving processing speed.
Extensibility: Pig allows users to write user-defined functions (UDFs) in languages such as Java or Python, extending Pig's functionality to handle specific data processing needs (see the sketch after this list).
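The following Pig Latin sketch illustrates both parallelization and extensibility: the PARALLEL clause requests a number of reduce tasks, and a hypothetical Java UDF (myudfs.CleanText, packaged in myudfs.jar) is registered and applied; both names are placeholders for code you would write yourself:

    -- register a hypothetical jar containing a user-defined function
    REGISTER 'myudfs.jar';
    DEFINE CleanText myudfs.CleanText();

    logs = LOAD 'hdfs:///data/logs' AS (line:chararray);
    -- apply the UDF to every record
    cleaned = FOREACH logs GENERATE CleanText(line) AS line;
    -- PARALLEL hints how many reduce tasks to use for this grouping
    grouped = GROUP cleaned BY line PARALLEL 10;
    counts = FOREACH grouped GENERATE group, COUNT(cleaned);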
Why Apache Pig?
Hadoop uses MapReduce to analyze and process big data.
Programmers who are not comfortable with Java often struggle with Hadoop, especially when writing MapReduce tasks.
MapReduce is difficult for non-programmers.
MapReduce code is difficult to maintain and optimize.
Using Pig Latin, programmers can perform MapReduce tasks easily without having to write complex Java code.
Apache Pig uses a multi-query approach, thereby reducing the length of the code. For example, an operation that would require about 200 lines of code (LoC) in Java can be done in as few as 10 LoC in Apache Pig. Ultimately, Apache Pig reduces development time by a factor of almost 16.
Pig Latin is an SQL-like language, so it is easy to learn Apache Pig if you are already familiar with SQL.
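To illustrate the conciseness claim, here is the classic word-count job in a handful of Pig Latin lines (the input and output paths are placeholders); the equivalent hand-written Java MapReduce program typically runs to well over a hundred lines:

    lines  = LOAD 'hdfs:///data/input.txt' AS (line:chararray);
    words  = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
    grpd   = GROUP words BY word;
    counts = FOREACH grpd GENERATE group AS word, COUNT(words) AS cnt;
    STORE counts INTO 'hdfs:///output/wordcount';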
Pig Architecture
Grunt Shell − Pig's interactive shell, used to execute Pig scripts.
Parser − Checks the syntax of the script; the output of the parser is a DAG (directed acyclic graph) of the script's statements.
Optimizer − The DAG is passed to the logical optimizer, where logical optimizations take place.
Compiler − The optimized DAG is converted into a series of MapReduce jobs.
Execution − Results are displayed using the DUMP statement or stored in HDFS using the STORE statement.
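For example, at the Grunt shell the same relation can either be printed to the console or written to HDFS (the paths and schema are placeholders):

    grunt> songs = LOAD 'hdfs:///data/songs.csv' USING PigStorage(',') AS (band:chararray, rating:int);
    grunt> DUMP songs;   -- triggers execution and prints the results to the console
    grunt> STORE songs INTO 'hdfs:///output/songs_copy';   -- triggers execution and writes the results to HDFS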
Apache Pig Data Model
Field (Atom) − A piece of data or a simple atomic value is known as a field.
Example − 'Linkin Park' or '7'
Tuple − An ordered set of fields is known as a tuple.
Example − (Linkin Park, 7)
Bag − A collection of tuples (non-unique) is known as a bag.
Example − {(Linkin Park, 7), (Metallica, 8)}
Map − A set of key-value pairs.
Example − ['Bands'#'Linkin Park', 'Members'#7]
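A short sketch of how these types appear in a Pig Latin schema; the file name and field names are illustrative:

    -- each record has an atomic field, a tuple, a bag of tuples, and a map
    bands = LOAD 'hdfs:///data/bands.txt' AS (
        name:chararray,
        origin:tuple(city:chararray, country:chararray),
        albums:bag{t:(title:chararray, year:int)},
        info:map[]
    );
    -- atomic and complex fields can both be projected; # looks up a key in a map
    names_and_members = FOREACH bands GENERATE name, info#'Members';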
THANK YOU!!!
