
DS Assignment-1


1. Explain HDFS:
- Hadoop comes with a distributed file system called HDFS.
- In HDFS, data is distributed over several machines and replicated to ensure durability against failure and high availability to parallel applications.
- The Hadoop distributed file system is a block-structured file system in which each file is divided into blocks of a predetermined size. These blocks are stored across a cluster of one or several machines.

- The default block size is 128 MB in Apache Hadoop 2.x, which you can configure as per your requirement.

Example: Suppose I have a file 'example.txt' of size 514 MB, and we are using the default configuration of block size, which is 128 MB. Then 5 blocks will be created: the first four blocks will be 128 MB each, but the last block will be only 2 MB.

[Figure: example.txt (514 MB) split into five blocks: 128 MB, 128 MB, 128 MB, 128 MB, 2 MB.]
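The block arithmetic in the example can be checked with a short Python sketch. The function name `split_into_blocks` is made up for illustration and is not part of any Hadoop API; it only mirrors the division described in the notes.

```python
def split_into_blocks(file_size_mb, block_size_mb=128):
    """Return the sizes (in MB) of the blocks a file of the given size would occupy."""
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        # The last block only holds the leftover data, not a full block.
        sizes.append(remainder)
    return sizes


blocks = split_into_blocks(514)
print(len(blocks), blocks)  # 5 [128, 128, 128, 128, 2]
```

The last block is smaller than the configured block size because it stores only the remaining 2 MB.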

HDFS Architecture:
- The Hadoop architecture is a package of the file system, the MapReduce engine and HDFS.
- HDFS follows the master-slave architecture and it has these components: the NameNode (metadata) and the DataNodes (data).

[Figure: HDFS architecture, showing the client, the NameNode (metadata) and the DataNodes (data).]

The elements of the HDFS architecture are:

NameNode: All the blocks on the DataNodes are handled by the NameNode, which is known as the master node.
- It performs the following functions: it monitors and controls all the DataNodes, and it permits the user to access a file.
- It simplifies the architecture of the system.
- The NameNode is mainly used for storing the metadata, i.e. the data about the data. Metadata can be the transaction logs that keep track of a user's activity in a Hadoop cluster.
- Metadata can also be the name of the file, its size, and the information about the location of the DataNode.
- The NameNode instructs the DataNodes with operations like delete, create, replicate, etc.

DataNodes: DataNodes work as slaves.
- DataNodes are mainly utilized for storing the data in a Hadoop cluster. The number of DataNodes can be from 1 to 500 or even more than that.
- Each DataNode contains multiple data blocks. These data blocks are used to store data.
- It is the responsibility of the DataNode to serve read and write requests from the clients.
- It performs block creation, deletion and replication upon instruction from the NameNode.
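The split between metadata (NameNode) and stored blocks (DataNodes) can be pictured with a toy Python model. Everything in it is an assumption made for illustration: the class `ToyNameNode`, the round-robin placement, and the replication factor of 3 are not taken from the notes, and real HDFS uses a more elaborate placement policy.

```python
from itertools import cycle


class ToyNameNode:
    """Hypothetical stand-in for a NameNode: it stores only metadata, never file data."""

    def __init__(self, datanodes, replication=3):
        if replication > len(datanodes):
            raise ValueError("replication factor exceeds the number of DataNodes")
        self.datanodes = list(datanodes)
        self.replication = replication  # assumed value, for illustration only
        self.block_map = {}  # (file name, block index) -> DataNode names holding a replica
        self._next_start = cycle(range(len(self.datanodes)))

    def add_file(self, name, num_blocks):
        """Record where each block of a file is replicated (simple round-robin)."""
        for index in range(num_blocks):
            start = next(self._next_start)
            replicas = [
                self.datanodes[(start + i) % len(self.datanodes)]
                for i in range(self.replication)
            ]
            self.block_map[(name, index)] = replicas

    def locate(self, name, index):
        """Answer a client's question: which DataNodes hold this block?"""
        return self.block_map[(name, index)]


namenode = ToyNameNode(["dn1", "dn2", "dn3", "dn4"])
namenode.add_file("example.txt", num_blocks=5)
print(namenode.locate("example.txt", 0))  # ['dn1', 'dn2', 'dn3']
```

A client would ask the NameNode where a block lives and then read the block from one of the listed DataNodes.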
Job Tracker: The role of the Job Tracker is to accept the MapReduce jobs from the client and process the data by using the NameNode.
- In response, the NameNode provides metadata to the Job Tracker.
Task Tracker: It works as a slave node for the Job Tracker.
- It receives the task and code from the Job Tracker and applies that code on the file. This process can also be called a Mapper.
2. Big data layers
i) Data sources layer:
- Organizations generate a huge amount of data on a daily basis. The basic function of the data sources layer is to absorb the data coming from various sources, at varying velocity and in different formats.
- The data obtained from the data sources has to be validated and cleaned before introducing it for any logical use in the enterprise.
- The task of validating, sorting and cleaning the data is done by the ingestion layer. The role of the ingestion layer is to absorb the huge inflow of data and sort it out into different categories.
- This layer separates noise from relevant information.
- It can handle huge volume, high velocity and a variety of data. The ingestion layer validates, cleanses, transforms, reduces and integrates the unstructured data into the Big Data stack for further processing.
- The functions are: identification, filtration, validation, noise reduction, transformation, compression and integration.
- Ingestion of data in the Hadoop world means ELT (Extract, Load and Transform) as opposed to ETL in traditional warehouses.
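A few of the ingestion functions (validation, noise reduction, transformation) can be sketched in plain Python. The record layout with `user` and `text` fields and the function name `ingest` are hypothetical, chosen only to make the idea concrete.

```python
def ingest(records):
    """Validate, filter noise from, and transform raw records (illustrative only)."""
    cleaned = []
    for record in records:
        # Validation: keep only dictionaries that carry a "user" field.
        if not isinstance(record, dict) or "user" not in record:
            continue
        # Noise reduction: drop records whose text is empty or only whitespace.
        text = str(record.get("text", "")).strip()
        if not text:
            continue
        # Transformation: normalise the user name and store the trimmed text.
        cleaned.append({"user": str(record["user"]).lower(), "text": text})
    return cleaned


raw = [
    {"user": "Alice", "text": "  hello  "},
    {"text": "no user field"},
    {"user": "Bob", "text": "   "},
    "garbage",
]
print(ingest(raw))  # [{'user': 'alice', 'text': 'hello'}]
```

Only the first record survives: the others fail validation or are treated as noise.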

- The Hadoop storage layer supports fault tolerance and parallelization, which enable high-speed distributed processing algorithms to execute over large-scale data.
- There are two components of Hadoop: a scalable HDFS that can support petabytes of data and a MapReduce engine that computes results in batches.
- HDFS is a file system that is used to store huge volumes of data across a large number of commodity machines in a cluster.
- It stores data in the form of blocks of files and follows the write-once-read-many model to access data from these blocks of files.
Security infrastructure:
- Security and privacy requirements for big data are similar to the requirements for conventional data environments.
- The security requirements have to be closely aligned to specific business needs.
- Some unique challenges arise when big data becomes part of the strategy:
a) Data access: User access to raw or computed big data has about the same level of technical requirements as non-big data implementations.
b) Application access: Application access to data is also relatively straightforward from a technical perspective.
c) Data encryption: It is the most challenging aspect of security in a big data environment.
d) Threat detection: The inclusion of mobile devices and social networks exponentially increases both the amount of data and the opportunities for security threats.
Monitoring layer:
- The monitoring layer consists of a number of monitoring systems. These systems remain automatically aware of all the configurations and functions of different operating systems and hardware.
- They also provide the facility of machine communication with the help of a monitoring tool through high-level protocols such as XML (Extensible Markup Language).
- Monitoring systems also provide tools for data storage and visualization.

Visualization layer:
- The visualization layer handles the task of interpreting and visualizing Big Data.
- Visualization of data is done by data analysts to have a look at different aspects of data in various visual modes.

[Figure: visualization layer. Visualization tools (traditional BI tools, analysis tools) sit on top of operational data stores, data warehouses, data scoop, data lakes, relational databases (structured data) and NoSQL databases (unstructured data).]
3. Explain MapReduce with a suitable example.
MapReduce:
- MapReduce is a data processing tool which is used to process data in parallel in a distributed form. It is an important component of Hadoop.
- A MapReduce program works in two phases:
  1) Map phase
  2) Reduce phase
- Map tasks deal with splitting and mapping the data, while reduce tasks shuffle and reduce the data.
- The input to each phase is key-value pairs.
- To the mapper, the input is a key-value pair.
- The output of the mapper is fed to the reducer as input. The reducer runs only after the mapper is over.
- The reducer too takes input in key-value format, and the output of the reducer is the final output.
Architecture:
- The MapReduce framework operates on <key, value> pairs.
- The whole process goes through 4 phases of execution.

[Figure: input data is passed to several map() tasks, whose output goes to the reduce tasks.]

- The map takes data in the form of pairs and returns a list of <key, value> pairs. The keys will not be unique in this case.
- Using the output of map, sort and shuffle are applied by the Hadoop architecture on these lists of <key, value> pairs, which sends out unique keys and a list of values associated with each unique key.

- The data goes through the following phases:
i) Input splits: the data set provided to the MapReduce job is divided into splits.
ii) Mapping: in this phase, data in each split is passed to a mapping function to produce output.
iii) Shuffling: consumes the output of the mapping phase.
iv) Reducing: the output from the shuffling phase is used as input and reduced to the final output.

The MapReduce framework operates on <key, value> pairs, that is, the framework views the input to the job as a set of <key, value> pairs and produces a set of <key, value> pairs as output.

1) First, we divide the data into splits.
2) Then some processing is performed based on the user requirement, and a list of key-value pairs is created.
3) After the mapper phase, a partition process takes place.
4) After the sorting and shuffling phase, each reducer will have a unique key and a list of values.
5) Each reducer counts the values in that list.
6) Finally, all the output key/value pairs are collected.

[Figure: word count example. The input "Deer Bear River", "Car Car River", "Deer Car Bear" is split, mapped, shuffled and reduced to a count per word.]
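The word count example can be reproduced with a small in-memory Python sketch of the map, shuffle and reduce phases. It runs on a single machine and does not use Hadoop; the function names `map_phase`, `shuffle_phase` and `reduce_phase` are chosen here for illustration.

```python
from collections import defaultdict


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in split.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle and sort: group all values by key, giving key -> list of values."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(key, values):
    """Reduce: count the values collected for one unique key."""
    return key, sum(values)


splits = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
mapped = [pair for split in splits for pair in map_phase(split)]
grouped = shuffle_phase(mapped)
result = dict(reduce_phase(key, values) for key, values in grouped.items())
print(result)  # {'Bear': 2, 'Car': 3, 'Deer': 2, 'River': 2}
```

After the shuffle step each key is unique and carries a list of ones, for example 'Car' maps to [1, 1, 1], which the reducer sums.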
4) CAP theorem
- The CAP theorem states that a distributed database has to make a trade-off between consistency and availability when a partition occurs.
- A distributed database system is bound to have partitions in the real world due to network failure or some other reason. Therefore, partition tolerance is a property we cannot avoid while building our system. So, a distributed system will either choose to give up on consistency or availability, but not on partition tolerance.
- The theorem provides a way of thinking about the trade-offs involved in designing and building distributed systems. It helps to explain why certain types of systems may be more appropriate for certain use cases.
- According to Brewer, the theorem states that a distributed system can have at most two of these three guarantees.

Properties of CAP theorem: the three distributed system characteristics to which the CAP theorem refers are:

[Figure: CAP theorem, relating Consistency, Availability and Partition tolerance.]

1. Consistency
- It defines that all clients see the same data simultaneously, no matter which node they connect to in a distributed system.
- For eventual consistency the guarantees are a bit loose. An eventual consistency guarantee means clients will eventually see the same data on all the nodes at some point of time in the future.

[Figure: all nodes see the same data.]
2. Availability
- It defines that a non-failing node in a distributed system returns a response for all read and write requests in a bounded amount of time, even if one or more other nodes are down.
3. Partition Tolerance
- It defines that the system continues to operate despite arbitrary message loss or failure in parts of the system.
- Distributed systems guaranteeing partition tolerance can recover from partitions once the partition heals.
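The trade-off during a partition can be illustrated with a toy two-replica store in Python. This is a sketch under stated assumptions, not a real database: the class `ToyStore`, its `CP` and `AP` modes, and the "last write wins" rule used when the partition heals are all made up for illustration.

```python
class Replica:
    """One node holding a single value."""

    def __init__(self):
        self.value = None


class ToyStore:
    """Two replicas that either stay consistent (CP) or stay available (AP)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False
        self.latest = None  # most recent accepted write

    def write(self, replica, value):
        if self.partitioned:
            if self.mode == "CP":
                # Give up availability: refuse the write to keep replicas identical.
                raise RuntimeError("unavailable: cannot reach the other replica")
            # Give up consistency: accept locally, so the replicas may diverge.
            replica.value = value
        else:
            self.a.value = self.b.value = value
        self.latest = value

    def heal(self):
        """Partition heals: reconcile with an assumed 'last write wins' rule."""
        self.partitioned = False
        self.a.value = self.b.value = self.latest


ap = ToyStore("AP")
ap.write(ap.a, 1)
ap.partitioned = True
ap.write(ap.a, 2)
print(ap.a.value, ap.b.value)  # 2 1  (available, but temporarily inconsistent)
ap.heal()
print(ap.a.value, ap.b.value)  # 2 2  (eventually consistent)
```

In `CP` mode the same write during the partition raises an error instead, so both replicas keep the value 1.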
