
Class Interface: Difference Between New API and Old API

1. Hive provides a mechanism to project structure onto the data and query the data using a SQL-like language called HiveQL.
2. Hive architecture consists of three main components: the Metastore service, the Hive client, and HiveServer2. The Metastore service manages metadata about the tables and partitions.
3. The Hive client communicates with HiveServer2 to compile and execute HiveQL queries. HiveServer2 translates the queries into MapReduce jobs or Spark jobs and executes them on the cluster.

2. Briefly explain the difference between Hive and Pig Latin.

Uploaded by

Devi Kondaveti
Copyright
© All Rights Reserved

Difference between the new API and the old API

Mapper and Reducer
New API: Mapper and Reducer are classes. A method (with a default implementation) can therefore be added to an abstract class without breaking old implementations of the class.
Old API: Mapper and Reducer are interfaces (they still exist alongside the new API as well).

Package
New API: The new API is in the org.apache.hadoop.mapreduce package.
Old API: The old API can still be found in org.apache.hadoop.mapred.

User code communicating with the MapReduce system
New API: User code uses a "context" object to communicate with the MapReduce system.
Old API: JobConf, the OutputCollector, and the Reporter object are used to communicate with the MapReduce system.

Control of Mapper and Reducer execution
New API: Both mappers and reducers can control the execution flow by overriding the run() method.
Old API: Mappers can be controlled by writing a MapRunnable, but no equivalent exists for reducers.

Job control
New API: Job control is done through the Job class.
Old API: Job control was done through JobClient (which does not exist in the new API).

Job configuration
New API: Job configuration is done through the Configuration class, via some of the helper methods on Job.
Old API: The JobConf object was used for job configuration. JobConf is an extension of the Configuration class: java.lang.Object, extended by org.apache.hadoop.conf.Configuration, extended by org.apache.hadoop.mapred.JobConf.

Output file name
New API: Map outputs are named part-m-nnnnn, and reduce outputs are named part-r-nnnnn (where nnnnn is an integer designating the part number, starting from zero).
Old API: Both map and reduce outputs are named part-nnnnn.

Values passed to the reduce() method
New API: The reduce() method passes values as a java.lang.Iterable.
Old API: The reduce() method passes values as a java.util.Iterator.
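Two of the points in the comparison above, the abstract class with default method bodies and the Iterable versus Iterator argument of reduce(), can be sketched in plain Java. This is only an illustration under stated assumptions: the classes below are hypothetical stand-ins written for this note, not the real Hadoop Mapper and Reducer types, and they deliberately leave out the context object and the key/value type parameters.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java stand-ins, NOT Hadoop classes. Every name here is hypothetical.
class ReduceSketch {

    // New-API shape: an abstract class whose values arrive as an Iterable.
    abstract static class SketchReducer {
        abstract int reduce(String key, Iterable<Integer> values);

        // A method with a default (empty) body. Adding one like this later
        // does not break subclasses that were written before it existed.
        void cleanup() { }

        // Drives the calls; a subclass may override run() to change the flow.
        int run(String key, Iterable<Integer> values) {
            int result = reduce(key, values);
            cleanup();
            return result;
        }
    }

    static class SumReducer extends SketchReducer {
        @Override
        int reduce(String key, Iterable<Integer> values) {
            int sum = 0;
            for (int v : values) {   // for-each works because values is Iterable
                sum += v;
            }
            return sum;
        }
    }

    // Old-API shape: values arrive as an Iterator and are drained by hand.
    static int oldStyleReduce(String key, Iterator<Integer> values) {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> counts = Arrays.asList(1, 1, 1);
        System.out.println(new SumReducer().run("word", counts));
        System.out.println(oldStyleReduce("word", counts.iterator()));
    }
}
```

Both calls compute the same sum. The practical difference is that the Iterable form can be used directly in a for-each loop, while the Iterator form needs an explicit hasNext()/next() loop.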
MapReduce vs. Pig

1. MapReduce: It is a data processing language.
   Pig: It is a data flow language.

2. MapReduce: It converts the job into map-reduce functions.
   Pig: It converts the query into map-reduce functions.

3. MapReduce: It is a low-level language.
   Pig: It is a high-level language.

4. MapReduce: It is difficult for the user to perform join operations.
   Pig: It makes it easy for the user to perform join operations.

5. MapReduce: The user has to write 10 times more lines of code than in Pig to perform a similar task.
   Pig: The user has to write fewer lines of code because it supports the multi-query approach.

6. MapReduce: It has several jobs, therefore execution time is longer.
   Pig: It has less compilation time, as the Pig operator converts it into MapReduce jobs.

7. MapReduce: It is supported by recent versions of Hadoop.
   Pig: It is supported with all versions of Hadoop.
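To make the point about join operations concrete, here is a minimal sketch of the bookkeeping a hand-written reduce-side join involves. It is plain Java over in-memory arrays, not Hadoop or Pig code; the class name, method names, and sample records are all hypothetical and chosen only for illustration. In Pig Latin the same inner join is written as a single JOIN ... BY ... statement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for a reduce-side join; not Hadoop or Pig code.
// All names and the sample records are hypothetical.
class ManualJoinSketch {

    // "Map" step: tag every record with its source and group by join key,
    // which is roughly what the shuffle does between map and reduce.
    static Map<Integer, List<String[]>> groupByKey(String[][] users, String[][] orders) {
        Map<Integer, List<String[]>> groups = new TreeMap<>();
        for (String[] u : users) {
            groups.computeIfAbsent(Integer.parseInt(u[0]), k -> new ArrayList<>())
                  .add(new String[] {"U", u[1]});
        }
        for (String[] o : orders) {
            groups.computeIfAbsent(Integer.parseInt(o[0]), k -> new ArrayList<>())
                  .add(new String[] {"O", o[1]});
        }
        return groups;
    }

    // "Reduce" step: per key, separate the tagged records and pair them up.
    static List<String> join(String[][] users, String[][] orders) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<Integer, List<String[]>> e : groupByKey(users, orders).entrySet()) {
            List<String> names = new ArrayList<>();
            List<String> items = new ArrayList<>();
            for (String[] tagged : e.getValue()) {
                if (tagged[0].equals("U")) {
                    names.add(tagged[1]);
                } else {
                    items.add(tagged[1]);
                }
            }
            for (String n : names) {
                for (String i : items) {
                    out.add(e.getKey() + "," + n + "," + i);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] users = {{"1", "ann"}, {"2", "bob"}};
        String[][] orders = {{"1", "book"}, {"1", "pen"}, {"3", "cup"}};
        System.out.println(join(users, orders));
    }
}
```

Only key 1 appears in both inputs, so only its records are paired. The tagging, grouping, and pairing steps are what the user must write by hand in MapReduce.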
1. What is Hive? Draw the Hive architecture.
