
Bda 1

Uploaded by

Nilay
Copyright
© © All Rights Reserved


1. Explain MapReduce with a suitable example.


MapReduce is essentially an algorithm (a programming model) that runs on
the YARN framework. The major feature of MapReduce is that it performs
distributed processing in parallel across a Hadoop cluster, which is what
makes Hadoop so fast.

[Figure: big data (input) passes through map() and then reduce()]

Map task -

RecordReader - The purpose of the record reader is to break the input into
records. It is responsible for providing key-value pairs to the map()
function.
Map - It is a user-defined function whose work is to process the tuples
obtained from the record reader.

Combiner - It is used for grouping the data in the map workflow. It is
similar to a local reducer. The intermediate key-value pairs that are
generated in the map are combined with the help of this combiner.
Partitioner - It is responsible for fetching the key-value pairs generated
in the mapper phases. The partitioner generates the shards corresponding
to each reducer.

Reduce task
Shuffle and sort - Using the shuffling process, the system can sort the
data by its key. Shuffling begins as soon as some of the mapping tasks
are done, which is why it is a faster process: it does not wait for the
completion of every task performed by the mappers.
Reduce - It gathers the tuples generated by map and then performs some
sorting and aggregation sort of process on those key-value pairs,
depending on their key element.
Output format - Once all operations are performed, the key-value pairs are
written into a file with the help of the record writer, each record on a
new line, with the key and value in a space-separated manner.
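
The stages above can be sketched as a word count in plain Python. This is only an in-memory illustration, not Hadoop's API: the function names (record_reader, map_fn, combine, partition, reduce_fn, word_count) and the toy hash used by the partitioner are assumptions made for this sketch.

```python
from collections import defaultdict


def record_reader(text):
    """Break the input into records: yield (character offset, line) pairs."""
    offset = 0
    for line in text.splitlines():
        yield offset, line
        offset += len(line) + 1


def map_fn(_key, line):
    """User-defined map: emit (word, 1) for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs):
    """Local reducer: sum the counts per word within one mapper's output."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return list(totals.items())


def partition(key, num_reducers):
    """Choose the reducer (shard) for a key with a deterministic toy hash."""
    return sum(key.encode()) % num_reducers


def reduce_fn(word, counts):
    """User-defined reduce: aggregate all counts for one word."""
    return word, sum(counts)


def word_count(splits, num_reducers=2):
    # One shard per reducer; grouping values by key stands in for shuffling.
    shards = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # each input split is handled by one "mapper"
        mapped = [kv for key, line in record_reader(split)
                  for kv in map_fn(key, line)]
        for word, count in combine(mapped):
            shards[partition(word, num_reducers)][word].append(count)
    lines = []
    for shard in shards:
        for word in sorted(shard):  # sort by key before reducing
            key, total = reduce_fn(word, shard[word])
            lines.append(f"{key} {total}")  # record writer: "key value"
    return lines
```

Calling word_count(["big data big\ndata", "hadoop big"]) returns one "word count" line per distinct word; the order of the lines depends on which shard each word lands in.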

2. Difference between SQL and NoSQL

1. SQL: Structured Query Language.
   NoSQL: Not only SQL, or non-relational database.
2. SQL: It is a declarative query language.
   NoSQL: It is not a declarative query language.
3. SQL: It works on the ACID properties: Atomicity, Consistency,
   Isolation, Durability.
   NoSQL: It follows the CAP theorem: Consistency, Availability,
   Partition tolerance.
4. SQL: Structured and organized data.
   NoSQL: Unstructured and unpredictable data.
5. SQL: A relational database is table based.
   NoSQL: Key-value pair storage, column store, document store, graph DB.
6. SQL: Data and its relationships are stored in separate tables.
   NoSQL: No pre-defined schema.
7. SQL: Tight consistency.
   NoSQL: Eventual consistency rather than the ACID properties.
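
A minimal sketch of the schema difference, assuming Python's built-in sqlite3 module for the SQL side and an ordinary dict standing in for a document store on the NoSQL side. The table names, keys and sample values are invented for illustration.

```python
import sqlite3

# SQL: a fixed schema, with data and its relationships in separate tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE marks ("
    "student_id INTEGER REFERENCES students(id), subject TEXT, score INTEGER)"
)
with conn:  # the inserts commit together or roll back together
    conn.execute("INSERT INTO students VALUES (1, 'Asha')")
    conn.executemany(
        "INSERT INTO marks VALUES (?, ?, ?)",
        [(1, "BDA", 82), (1, "DBMS", 75)],
    )

# Declarative query: state what is wanted, not how to fetch it.
sql_rows = conn.execute(
    "SELECT s.name, m.subject, m.score "
    "FROM students s JOIN marks m ON m.student_id = s.id "
    "ORDER BY m.subject"
).fetchall()

# NoSQL (document style): no pre-defined schema, records may differ in shape.
doc_store = {}
doc_store["student:1"] = {"name": "Asha", "marks": {"BDA": 82, "DBMS": 75}}
doc_store["student:2"] = {"name": "Ravi", "hobbies": ["chess"]}
```

The SQL side rejects a row that does not fit the declared columns, while the dict accepts the second document even though it has no "marks" field.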

3. Explain the 5 V's of BDA

1. Volume -
- Big data refers to a lot of information, much more than what we are
  used to handling.
- It can come from various sources like social media, sensors or online
  transactions.
- Analyzing this large volume of data can provide valuable insights and
  patterns.
2. Velocity -
- Big data is generated and updated quickly, almost in real time.
- Processing and analyzing this data promptly is crucial for making
  timely decisions.
- Examples include monitoring social media and tracking stock market
  changes.
3. Variety -
- Big data can be structured or unstructured.
- It encompasses diverse sources such as emails, videos, tweets or
  documents.
- Techniques like text mining, image analysis or NLP are used to extract
  information from it.

4. Veracity -
- Veracity refers to the reliability and accuracy of data, as big data can
  sometimes be messy or incomplete.
- Ensuring data quality is essential to derive meaningful and actionable
  insights.
- Techniques like data cleaning, validation and verification are employed
  to address inaccuracies.
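
The cleaning and validation mentioned under veracity can be sketched as follows. The record layout (sensor, ts, temp_c), the function name clean_readings and the accepted temperature range are all assumptions chosen for this example.

```python
def clean_readings(records, low=-50.0, high=60.0):
    """Split raw sensor records into (valid, rejected) lists.

    A record is rejected when its temperature is missing or not numeric,
    when it falls outside [low, high], when the sensor name is empty,
    or when the same (sensor, ts) pair has already been accepted.
    """
    valid, rejected, seen = [], [], set()
    for rec in records:
        sensor = str(rec.get("sensor") or "").strip()  # cleaning: trim spaces
        try:
            temp = float(rec.get("temp_c"))            # validation: numeric
        except (TypeError, ValueError):
            rejected.append(rec)
            continue
        key = (sensor, rec.get("ts"))
        if not sensor or not (low <= temp <= high) or key in seen:
            rejected.append(rec)
            continue
        seen.add(key)
        valid.append({"sensor": sensor, "ts": rec.get("ts"), "temp_c": temp})
    return valid, rejected
```

Keeping the rejected records instead of silently dropping them makes it possible to verify later how much of the incoming data was unusable.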

Explain
1. Structured data:
- Structured data can be crudely defined as the data that resides in a
  fixed field within a record.
- It is the type of data most familiar to our everyday lives.
- A certain schema binds it, so all the data has the same set of
  properties. Structured data is also called relational data.
- Relationships are enforced by the application of table constraints.
- The business value of structured data lies in how well an organization
  can utilize its existing systems and processes for analysis purposes.
2. Semi-structured data:
- Semi-structured data is not bound by any rigid schema for data storage
  and handling.
- Since semi-structured data doesn't need a structured query language, it
  is commonly called NoSQL data.
- A data serialization language is used to exchange semi-structured data
  across systems that may even have varied underlying infrastructure.
- Semi-structured content is often used to store metadata about a
  business process, but it can also include files.

3. Unstructured data:
- Unstructured data is the kind of data that doesn't adhere to any
  definite schema or set of rules. Its arrangement is unplanned and
  haphazard.
- Photos, videos, text documents and log files can be considered
  unstructured data.
- Additionally, unstructured data is also known as "dark data" because it
  cannot be analyzed without proper software tools.
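
The three kinds of data can be contrasted with a small sketch using only Python's standard library. CSV stands in for structured data, JSON for a data serialization language carrying semi-structured data, and a free-text note for unstructured data. All sample values are invented.

```python
import csv
import io
import json

# Structured: every record has the same fixed fields (a schema).
csv_text = "id,name,city\n1,Asha,Pune\n2,Ravi,Surat\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: self-describing keys, but records may differ in shape.
json_text = (
    '[{"id": 1, "name": "Asha", "tags": ["hadoop"]},'
    ' {"id": 2, "name": "Ravi", "address": {"city": "Surat"}}]'
)
docs = json.loads(json_text)

# Unstructured: no fields at all; software has to interpret the content.
note = "Met Ravi in Surat on Monday; he will send the report later."
mentions_surat = "surat" in note.lower()
```

A query such as "all people in Surat" is a field lookup on rows, a key-path lookup on docs, and a text search on note.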


5. Explain Hadoop architecture.


1. Hadoop Distributed File System (HDFS)
HDFS is the storage layer of Hadoop, designed to handle large data sets by
distributing them across multiple machines.
NameNode (master):
- manages the metadata of the file system
- it doesn't store the data itself but keeps track of file locations
  across the cluster
DataNode (slaves):
- stores the actual data in blocks
- it communicates with the NameNode to report the health and status of
  the stored data blocks
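
The split between metadata and data can be illustrated with a toy in-memory model. This is not the HDFS API: the class and function names, the 4-byte block size and the round-robin block placement are assumptions made to keep the sketch short.

```python
class DataNode:
    """Stores the actual block contents."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block id -> bytes

    def block_report(self):
        """What a DataNode would report to the NameNode: the blocks it holds."""
        return sorted(self.blocks)


class NameNode:
    """Holds only metadata: which blocks make up a file and where they live."""

    def __init__(self):
        self.block_map = {}  # filename -> list of (block id, DataNode name)


def put(name_node, data_nodes, filename, data, block_size=4):
    """Split data into blocks and spread them over the DataNodes."""
    locations = []
    for i, start in enumerate(range(0, len(data), block_size)):
        block_id = f"{filename}#{i}"
        node = data_nodes[i % len(data_nodes)]  # round-robin placement
        node.blocks[block_id] = data[start:start + block_size]
        locations.append((block_id, node.name))
    name_node.block_map[filename] = locations


def get(name_node, data_nodes, filename):
    """Ask the NameNode where the blocks are, then read them in order."""
    by_name = {node.name: node for node in data_nodes}
    return b"".join(
        by_name[node_name].blocks[block_id]
        for block_id, node_name in name_node.block_map[filename]
    )
```

In this model the NameNode never sees file contents; a read first consults block_map and then fetches each block from the DataNode that holds it.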
2. MapReduce:
MapReduce is the data processing layer of Hadoop, which allows parallel
processing of large data sets.
Map task:
- takes an input dataset, splits it into key-value pairs and processes
  each pair in parallel.
Reduce task:
- aggregates and combines the outputs of the map tasks to produce the
  final result.

3. YARN (Yet Another Resource Negotiator):
- Resource management layer introduced in Hadoop 2.x to improve the
  scalability and efficiency of Hadoop clusters.
ResourceManager (master):
- manages the cluster's resources and scheduling.
NodeManager (slaves):
- manages the resources on a single node and monitors the resource usage
  of the containers where the tasks run.
Hadoop ecosystem
- Hive - SQL-like interface for Hadoop
- Pig - scripting language for complex data transformation
- HBase - NoSQL DB that runs on top of HDFS
- ZooKeeper - coordination service for distributed applications
- Oozie - workflow scheduling system
