0% found this document useful (0 votes)
6 views28 pages

@vtudeveloper - in BDA Solved MQP

The document discusses Big Data, its characteristics, challenges, and analytics methods, including descriptive, diagnostic, predictive, and prescriptive analytics. It also explains the Hadoop ecosystem components, including HDFS, YARN, and various tools for data processing and analysis. Additionally, it covers NoSQL databases like MongoDB, highlighting their advantages and unique features for handling data.

Uploaded by

7darshan.dc
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
6 views28 pages

@vtudeveloper - in BDA Solved MQP

The document discusses Big Data, its characteristics, challenges, and analytics methods, including descriptive, diagnostic, predictive, and prescriptive analytics. It also explains the Hadoop ecosystem components, including HDFS, YARN, and various tools for data processing and analysis. Additionally, it covers NoSQL databases like MongoDB, highlighting their advantages and unique features for handling data.

Uploaded by

7darshan.dc
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
You are on page 1/ 28
Qi> a. Wha is Biy Dala ? Fp ae chloe ote = neyo ny ve lage dotacets tna ore quested vapid, and Come frm nuetriple ounces « They are too cornpu% Jor badibonal Systems fo tomndle . — pata Cam be Sbuckued (tables), Sui-shudured (XML, TSON), ow unsbuchived (videos , images, ourclio , Legs) — pate is Ceated ob a high sped, (Acel- Hue’ or mean real-time) —Thaditonal data tools Uke RDBMS Comth, bandle tt oppetnely. Chalten wit Biq Data @ Volume t- + marsine dalacets ale t +o Stove ¢ mramage: e nud dishibuled ple Sqstonns Lite HDFS. + Backutp 4 MeLOveny yoke more He £ space - + Gxt lot data, online pansaution Logs - : @ Naocity ie. data is genet combi narourrly ne Lalgh speed! e casera equi prawns 4 rash 9 Mt Aul-lime , « Roquires Wigy Hecoughepuak procsing wnganes . 2 ehi- Feud Asection Ain bannichns . @® Nodteky i o multi ple : Heal, fenages y Vidoes Logs . » Hod to Create ome oluk om {7 aul types . Dela hans Aion tools ae needed, ee ceeds free oe data, © Nesatity i= + Data may be sneonsistent , Invomplele oF | te Gita paolevenk of incorcect dala, @ Scanned with OKEN Scanner anineceee ere ar a © Taucstunttriness $ aceracy ase ard to ensure. 0 Requires data validahor teelaniques « © Natue i- + Gxlrailing resceuting pa Informalion ts dijpsuel, » nol alt data has busi nins Heal value . eo noel 4 tools ty deviue eneiahte . e Busines TnFelll gence % needed 4p jena hidden patterns. 4b> pty Data Analatics 2 PAplain the et cc 4 Anetylics. =) _Biq Data Analytics — Amalyxtng lye dala to Jind pabtions, cometations 4 rend - — provides buy news 4 Adeendice taigiits - — Us ML, Atelisdics , deta ratvivng teehiniques — £xi- Retommendation eystems, faudmdctation Classification o| ties 4. Deswipive Analytics t- > explauins vohak bappeneal ine past © Uses past data +o qurttale Auports 4 Gummaates. 6 hal undestond “bends 4 peyormence - xs Sales yu a haw Ladle. oy Diagnostic AnalytCs t+ @xplains Why somethin pened, © Users Axttl-doutn 4 data dove ce aj hep fe helps ident fy reasons ~~ suceerses® oy joilaes, © £x+ Dp to ueble Payie analysis . 3, Predictive Aralyics t- ¢ PorecasK what will hanpen 3n petute 2 uses ML rodels 4 Adolistical Lédrniques - @ Scanned with OKEN Scanner Y. Povscuipve Analytics :— + Rewmmends Getionsty act awe disived outcomes. + uses ophwization amd ginulation algortth ms, : 5s automate = deef tons + + xi Route Opkiwization {rr ee 4 Me Broly ho }40, 2.0 43/0 day ex plainy CAP Toren, => CAP Theorem — proposed by Eric Brau. — In dishibured sates , only fuse out-of tre collouing Hee wre quexanted + Concistenay 3 Availability Part Hon Tolesance - Consistency s— « Gus nrodet the susie has tre same data te Aome time. 2 Reads always sebum the moet recent wile. Ayolabi lity: — se cys awxponds +n ever requert » : y qome nodes are clout. ° eneuares qk Aes porses y though magy not Paxttion Tolerance t— « syten conkinues to smenwages bl nodes are Lor. Ue netuorie jailurs doa 2 mutt hand a x ayy: CAP Combinalions .— 1 CA (No Partition Tolerance) ensures CO @ Scanned with OKEN Scanner 7 CRC No Availabitite ) 3 shmy ee but no albicans eee d ee + AP_(No Consistency ): Always acsponseve beet mma retum old data, Congistenty Q2> b) What is Nuos@r? explarrs te charaduicnes 0) NUOsQl, => NwSOL — wed RDBMS Combines hadi tonal SQL jeakures wd) NoSgl payorynance bend K + — Acpporks SQL Uses the Atowdaad SAL query language. — awmproved Perjonmante > Scales hovizen tally 4 uppers Vag Unroug puts 5 a daxigned bs cloud built yor dishrbuted envionment . CAP § Chatakdienu 0) NuoSQb (ucID cempliane — ensures Adbeloility 4 Consistoney 4 hansactions, (CO Hip Scolmb ty — wens qypcenity wilh letge cLatas eld. ® ada Coneuweniy — managers many users 4 operations Umultancously - @ Real-time Daka Handling a Supports Hme~ Aensitive applicakions : @ Scanned with OKEN Scanner Qed Explain the Hadoop Ecosystem Components yor Dada preasing 4 Dela Analysis . = igs Cove components a ole J HODES (Hadoop dishributed File Syetem) — ftores biq data Auoss dishibuted machines. — Breaks les into blocks 4 Stores teen a i — provides eee + ge areas Lay) Utes : eee i data Sets in parallel. Sc rrving motel For prousting. lange ; S ee °4 Map Cpilieming 4,Sovting) g Restuse Capgrey ting) portions. — works ex erty aus clusters - gB. YARN yet omornnr Resource neq oHator) ey CUBA FESOUACLS - & rgclules and wmeonitors jobs. . ; = Sie nego Ce momnagenaurt pon J to prowoning . Additional tools : : ies oy. Hue i= + provides sQL-dike ddan for queaypng n : uel ge oct qummarizalion and omalytts « “ i dedas Rigi -suugting Payor jor cwvslfting Vrs . surpts converted to MapPeduie Internally . sane aT @ Scanned with OKEN Scanner f én — Nes@L database on top ‘4 HDes , — Seepports Atal-H me read | vorite 4. Sqoop — bensfers data jo RDBMS 4 Hadoop. ge. Flume — Collects 4 moves lanqe volumes 4 404 data tp HDES- : gq. Dorie ~- Cutork How engine +0 schedule op fobs \o » Zovleeep — Coordinates Aishtbuted Zerices 4 maoimbeins Conptquration 7 Bad vottn a meal diagram 1 eplain HDPS Asclniteclinne « => HDES Ardulidmre d.NameNode + ads as martin seme, + manages tie metadata “Be Gpmry blocks , Locations . . keeps ade 24 whith patanode hy whéch loleck . 2. DwaANode 5 6 Srerts adual data blocks. « Sends periodic fucartberts +> NameNode, £ prjorns Block scation, deletion , and ruplical on on tnshouel | 3. Sevendary NameNode > + Takes pertodic Srapchots ‘ NaweNode wutadata . + helps tr recovery Crot a bactup AOU): 4 .Ubink > + Interads wath NamuNode jo vend |uurtte dada , | comamunicakes directly with DakaNodes qr fe operations. 5. Blok Chrage —> « Hus axe Apr Ato bloccs ( ult ¢ [2eMmB) + Gary block %s Aaplioned Cagault 2 3 Copies) yor 4 WI nce. @ Scanned with OKEN Scanner GY Client 9s Srberasls unity NowmeNode te read | Vile date + Communi ces duiectty Wik Dakalodes pr pile operations 4 Datatwede 3. hres auual chita blots . 1 Gunds periodic ~“Wenrtbeads bo pyamuetlody . eye Biot Canon , delitior 4 naplicotions 3b) Explain HDPS Daum. = Damo rome Fondlion Nw Node —?> + parker rode tak Shoes eehaal, 4, Nowe range’ Le Auyvtero rckadala © maivkaines nanny ace qe le 6 ee ae as ke en 9 elas | NowreNode —> entodt copies meladota 2. Seeordaay + pave eel ie fe ma thy | rg «helps tn wip tots 4 rehrtoun blocks as fed +g. DataNode fee j a2 by NaneNoele . «Blinds block 2eporis 4 Wuaatbe @ Scanned with OKEN Scanner 4. Cneck porat Node — Pos is pm mg pagal yernngps 4 edt dogs + Supports paslet rem ey « 5. ea Node ——9_ » maintains peal H me Irnage ay le ae NaweNede qn asad Fre, ing onal gcy Give HDFS Commands jor fue yollouuing P ~ i b 4 4 Hag voor 0 > © To ger mae He of daceics tps => Nadoop po--R] i To ovale a ditutey tn HDPS = bodorp promedin | Aerie = no ot ape for locale syvero => nadoop yw apur |serpel pes tet { Sample 4 (D) To display canwenis./ey om HDPS fle console ) nacleoP ymca T Som | deshtyed - } O. To remove a divest yom HDFS =) whoop ye — an | Somaple d es ee yleread amd wile. file Read Opsaddion ie yead > nuts anes Namuetlod 4, Clink sequerts ‘ Wr ble Locals. sy @ Scanned with OKEN Scanner 3 CARRE ME AWN MEE EE : Nowe Neds vespennds —> prrvides Uist q) DatatVodey hawirng a Hur tgle lolocks - (uel contacts PataNodes —» Conneds to Closer DalaNode bo _ read data _peads_in_ Blocks — data tc vead block ~by- block is Sequence. « Paraltel reading > is can be read tn parallel pon 5 oa raullple Data Nodes - file wuite Opuabion 4, Cenk rusk to acate pie. > sends Suquint 4p NameNode to Ocede a fle. 2. Block allocation > aneNode allocates lolocks 4 sds palaNodes, weve = @ Scanned with OKEN Scanner 3. Data pipeline ~ Clif Lovkts to your DalaNlode , voldietn foronnds yo Second 4com, fie 5 : a: - oxplicalinn 5 tach block %S veplicaled to owe wedles (- faut ta aeplicas). I 5s . Block Conkirmalton > DataNodes send a aero welgernent ayer aucun jl vor , 4b) Tnaplnent Word Count Progen in Hadocp . S Word Count Progam Ovewteng dA. Obyetve 3- « count tne numba o' Sn a text fle 2 Mappa, Class:- «Splits each lane into words. «Gals Kaya valine pats as,( “ed 1), 4 Hrs ear tural appears protic class Tokent 2AMapper. extends Mappet £ pubic void Atduce CText Kou, erable < IntWrifabled values, lonlert context) § qnt 4WN =0 5 gov CIntOvitatele val + values) § Sum + = val. gerd), conkext. LOrike Chey, neud JnhOrtable (sur) 5 4 { fea 4.driva Clas: > Con gquars i Ns, Be Exeudion Steps + Use Hadoop commands to compl ALD Bs a ita ates a dataser. 5y a) What ds ea dane hy ManageDB ? Wushate ime prs of eventing oY quawating a vanigye ey. > oe sri! = MangodB ds adocument-ovtented Nest he a. Uses BSON Da crte tere de an Json— Like BSON fermeb fi CBinauy ISON). | |g feema les Shructure — Colleutions don't require apre-dapined fi oda - | | a Detigned for Atalnbllity — Sup | | 4 sharding . | an Open Source —_ ports dishibuled quchitecHae freely available 4 muindained by Morea Did dnc: @ Scanned with OKEN Scanner | 3. flotible Schema = picts can voay betuuen douments tn Avantages s~ J ‘pl ora - yale yead | vovile eae comm pated } RDBMS . a. Sealabhity - basil Aca horicontally using Shading « the dame colle. 4 Rida Quiny Lemquage — Auppeds poupal qpnies 5 Indariong aes f 5. Real-time Analyt — g£ulta \e for by data 4 yeal-B me Apps. Creating ov Gevwvaling a Unique Did pads — buay document a wigue std fia which ate as the pone . ay Automnatic: Cuastation +—< if id 48 inal spetiiged, MeorgoPB gudo- quan a ianbyte Weraderimal Objet la . 3 Cushorn ke 52 You cam mainly asin oun Uaague Value to id. 4y@xux & Maas (ol, "name" % “eharjal® 7 "bromch” + “Algps" o) sy Uniqueness t— MengeDB evrurss SA values are crate uritidy a Collection . J @ Scanned with OKEN Scanner Nhe 5 bY Anplont eek ows + Count = Sat = Lindt Shap : Agge i using MenqoDB - yen) a sLowdl) Fwnnubio +> lounks douuments in a collation e * , ~ cb. collection name , Count) } LXE Alo» Studien. Count () || 2. 600 faniliop 4- ging clounenk’ an astunding (4) 9 I AtscunMng (4) oder. ~ tes telleaion mame «pond 0. SOL(§ petd tL @ -13) Etim Ab. Students ind OO. sat C4 masts + -43) 8 Mra O fmactin t- mats the sembet * euumenis ached — db. colucton_ename nde, Umit Cn) Exy- absttudenls. Find C). init C3) 4. Skip CD panctioy .— atcips a Iwsals of updates a doumunt . — db. Colledion . Save (document) ex db tudents » Save CSiids 1, names "Rahul 9) 2. Find mttrod —> Rebiours douuments frm. Collerkion . — db. wlteton . pind Cquerq) oat ato -rudants . ind (fname, * abut”) 2. pretty O mtteod ~> formats dine output of yond) fo bees oreaclabili — db. Collum, bral, pretgo Bee- 4 “counkt) mated Couns makeing dowuments+ —db students . pnd CR grade * "A™Y). countC) 5 Seipl) wuttived > grips the yer asus. db Mudenls pend. Seip (8) 6bS MongoDB Commands por Hae following opaations (o» Studints Colleton) , Given rudst > ShudRoNNO, ChudNanae j Grade , Hobbies , Dos. Dihes he dels on adit 3 db .Hulends Tnrserk (£ Studketlias tol) StudNome "Rahul", Gres A", Hobbies + 0" Gvicket — DoT : hwo Date ("20a3 -o8-01 ") De. @ Scanned with OKEN Scanner =e . ae a on) Display he details 4 Atudent having avllne fol, pd Stuclenls spond C£ StudRol{vo +1013) ji) Chamage duc rebby of Atudunts "Rahul" prom Cricket +0 Preiball. _ydb. students update C & StudName + "Rahul" 3, ggeer t S Hobbies ¢ [Fortbau"I 44 ) gy) Delele tre recorel era belonging to grade’ V 1 Sab Students remove (¢ Grade t"V"3) vy) Rusia the deteils 0) Atucdens vohose nam Afarle vith! Bp) a db tents. pind Cg StuctNomes/% B14) Fay What 4s Hive 7 ist Hae yeatines oy Mae — Hiver— — Daka warehousing Toe! — biue is system bualt on a % Haderp « _ mlows wes fo wile quuts fn a Language Kinsler tosQl. — converts qweuis iho Reduce jobs 7 euulion . — weimly vied to handle Mine Ey saat sideband dads. — wok g wh with hasloop exec stom 4 Supports Large dodar els. an open-source dala yoarehoure yes > Osquinierouss (Hive) —yomtlion +o SAL MAAS y malig | 3 casttr gor amaiyeds « : © sualability — procenses pelabytes dain uring Haoop clusters ® Envensability — Apps UDP S Coser dtginned potions) to vrithond, jynstionaltty A @ Scanned with OKEN Scanner LT Y.pattt Tolereml > Truits jaullt —tolrvemnce 44 Hadeop Ss _ Supports Difloent Shyu 2 Types winks with HOPS; Apa, Hpace , and Amazon $3, Thy Caplan Hine file porwalls « => Hive Ple_jonmats 4. Teak File format > defor Storage format in Hive. + Shores plate teak dota Separated by dolinditers Clite Cav), « Sdmmple. but Slow yor large scale CG: ‘ 2. Sequante Fie Pormak > + Binary fle Font + Supprs Compremion 4 Hoes value palsy’, « fasten than Next fomat fe Lage Z.RCRie CRovw Column lomat) > hybrid format i a wo 4 Colum ig Atorge. pen ier vending a Beek 4 columns , a, Re Cophrized Rove bern + Stores dala Mp Coluume sruiise tow 2 hgh lompremion rate 4 gat re ans. J i + Sudtalola for Read — doe optakonis . 5. Parquet Fommal->Anowur columnar op wiced Com pie nested data. . ae r : Compt‘ ble with Hue 4 or tools Uke Spark. acy Explain Burketing with an Craamaple . =) Bee 3 — tt dtuidoy Arta into pix numba “4 ples oe Nh bbascA m A hash yumtion a column. TEE RI ITO @ Scanned with OKEN Scanner ee - === ” uscd \e beter data mannagensen +4 ydant query proanny. = PartiHonimg diuides data bu “column ie 5s buckess Smallra YU write cy Mairdes anto yh im @ pation. makes fetns 4 dampling pater by eductarg ple ACaMA . f ¢at— | CREATE TABLE Students -masks C name STRING 9e\lno INT, marks INT. { 7) CLUSTERED BY (rllno) INTO 4 BUCKETS, , Tis Command erates 4 buckets bard om oollnox ROU GAL asigned to lourkels Using ahasling puritron 7 anne - oprinciting quuris with pitts or joi om yollno. gay what ds Pig 2 Explain He key feats “4 Pig. =) gt ne ideo. — Apache Pia dsq Wigl-tenel platorm jor procasteg large ddan » — Pig uses Peg Latin Suipts that ore converted {nto. Vie — inks With Logs , tak, ISON, Cond rare. 4 = Scripts \oreute 0 MapRedatee yees- Fonts 9, peas son amp ae ‘ com po Saua MapReduice 10° ;@ banish allows otakion 4 “pas yas Yoj @ opkws zakion by aulornabically ppHinwizes eniution plans - @ Scanned with OKEN Scanner | 4. Schema Pevibility = dousnly xtquine atid Adnunag ike RDBMS. 5 .Induading 4 Baten Mode — Cam run In local os Haclovp tuk environments. | Bb with a neat Atagram , expleim Ha Anatonny 1) Pig . > _frrafone of Pig mt Ptq Latin Sori pt i fh oO Pig Lat Seip me apt uaritton by User tn Pq Latin’, @ Pare > cheers syntax + builds a logical plam . @ opioniner > ephwatzes logical plan to create, eypictint Maple Jobs. @© Lmeiler + eonverts opHwmaized plan to prysical plan | ® Exeuton Engine 9 Exeuts He plan Ph ‘Hadlow p Crap Recluce ane): band) Novo ¢te. t © -Datujtoro jollovo psp Tape Cog --HDPS) to output SOU 4 . * v3 Reduee ‘8 Gur p 3° Teuton Sell Sent engine a] |u ‘l 4 L Parser ; [Ps foprncer Ea tometer | Bey Explain amy ye welattonal, opusters 4 Pra with any | syovnple yer tauhn . ba a @ Scanned with OKEN Scanner : Load loads data bro HDPS oy local ile. > A= OAD | Students 1X1" SING Pig Stage CAs (name Chagastaul , ages sat); 5 PUTER > Piers Acids based om tmdition. oo ae FILTER A BY age 7185 g, TORCACH... GENERATE —> Pramstornns dala, Aclecs feds. -> C= PORCACH A GENERATE nawe a al 4. GROUP > epoups oetonds by a peld. —> De GRouP A BY ages 5. orden > Bos data baad on one or more yidda. 5s € = ORDER A BY name Accs §@-ay vite 4 meat diagram , cplatn svat radia, seals 4 Salk. => Mam Pauw Qi Apatine ponte O speed eS In - Mum procuning boots ees lo-100X yur thm Mapkedice « . iu + Teta Yor Teecline algorsbiun 4 tnbractine quads. ® Case 0) Und —> + Supports AP\s tn JavayP , Scala ond R. . fase won’ applica ons asang built ~{9 Ubsartes « | wromeed Aral bics > « faappots SQL) MLB (ML) ,4 | Copaph prouasiing) ond Atreanning ‘ @ Fault Toren > «uses Uncage ante CReuilent Dishibuked D alasels) Jor Accoy ie ere tet eat Sr ES SSEeHeSeE SH @ Scanned with OKEN Scanner @ Lempalible with Nadeop + Com Aun on Hedoop yaRM, arad HDRS , Base, Hine dala. Cove_Components 24 Spark | + Spork Cow engine Jor ea 4 | mamaguensing . Spank. SQL > Modecle yor qurying Aluutued "dada, + Spark Sheaming —> Real-Hme data prewsieg + MULib—» Machine Learning brary . hX > Graph prowmng — oe @@ aka, rage HDFS, : Hadoop | Includes ouMing engine 4 youl? to cna ae Compalivle data souoe, dt 4h UMA Che climbed dadanel Goo, \ staat Sulnedubing + 7 >. ww t/ Resource wut \ 7, tery Flav, YAR or { | @O_ explode’ tre S-layer Ardaledure a Rng Application’ Using S prak Shack » = 0 Applicaton Layer toe Conlning Urtr writen appli calions. | + wes APle in Sala » Pyle yJava jorR, . Wem dyin Logie ) hanstorm alons ractous , © Libmuries Layer i apvidis advonud prowrsiing tools s @ Scanned with OKEN Scanner i. spark sg jm data querying - icadi : spel MLUb yor Machine, Leammbing « j App ” ot layer | I A wl 7 | speck Gheaming 1 cveal—-lime « rr) engine Laser CProeevig, rayne) # Spark Cove ongine eae . Adreduling jobs ; + managing mauwnry 4 Vasks y handling failures © Custer M ( paoalel Tasks 4 Prrvourse naonag emeust) ¢ Avdeyaus Wilh cluslr 4o0lss _fpacne Mesos yHadovp YARN y Kubenstts ) or Sramdabone er ree 2 ee ipa LHDeS; SB) HOME Camondia ,Hime, Ue- Anal, BP, 81, Data Visualizasin , R-Desuip- wile" Aahistics ) Maubuine Learnang Applicakiony SOAKCGL » Spank Ahrtowsing 1 SpaHleR » Spa, Agpicaki ows Gpaphx Spore Lib boy oj ML axppiratn) , Spare trove Applicaxion, KUSH apgsicadions ll “Applicat Spas. 6) \ i) TDP Compabible dala Atores such as Hpase , Cassomdra ,Coph, $3 ee ale oe @ Scanned with OKEN Scanner Ao>-@) with @ need aiingoewnn 1 expladin fe Text => Tort wim | Precis Steps © Text pepecasin + Converts aero jexk fo Alructurcd fm. + Sips ined indude ¢- + Tokenization + breaks +zt into uimds. * Stopuuord Removal + Removes Common tords (eq. "and", "ye") * Storaiaving | LummalivaHion : Reduces wuts to bare porn - @® feokeive xhaukon +. etonvnk Hatt Into ena cords B01). « Tee, 4- # we . Te ol (tem Frequency Tovense Douument ; Prrequantd is @ Tour Transform akon 4 % Applies Aindinionnlity Aeduléon . ; 2 Teeny PCA Crema nerpad Conaponent Analysis) ® Vda Mine | : Atehistica_ models ; = fe pps scation (eq or aqboo dds?) o Ousermny eqs Ne naadelions) Miriing Pro cers | ® Lvobuslivo 4 a8 nner yard sang oh niined resus. — Wes pedir reoal) ;f4- Aue ad evaluahion @ Scanned with OKEN Scanner sre | [R Peatiunes| TS teatuy| fur. dada S tno pepo ng quasabion Adu wh ing | | ells oe Text Cleanee +800 © Reoluiee «clus ev iquabian’ fel + Tokenizak? r 8 Stems | | dfenunsin od : a «POs Tagging © Removing "| —ality « el ( oe Js word conse be ae Ne disambiqual” 1 ane “Space ed ee | | : 10 aie A ak Aaluer Ao> () Explamo tae Tage Rene Algorithm dang) Relative 4 oy Pas Over Livkid Cmildsen - => PageRank = tn agoritione by Fig awnportaue. ws based on Haun Ip" Ne 4p roe wae ee gota ree Liaw fit: woe ¢ a w% yer es point | -aer es OD sutnottare mony oresd PA i - Ae page 46 * yrildoen) casi Orin Raion Mating Ce fw Pageko flow oll ib Links bp, ’ Se pe a ules THs rene equally otf mae - ea 9. penile Clulotion : depandl om als 4 all pages liner to ila = Ram 4 2629 @ Scanned with OKEN Scanner @ TaagRovnl. formal meee ied) pRcp=(2- aig es pete) — eae ate p4e he ~ Board Mayor) REQ) ~ PagcRanle a Hankin el LOg)- 4 Ouccing Linke pm @ © on June —PageRomlt ane updated our reualtiple ieantionys uti Stable . © ie 0 gost! al pg) os 6 4 4y Inge O : > FP Tage a 7 oy [Fog ox oS fee @ Scanned with OKEN Scanner LIMIT : Limit aperator ws ysed to limit the number of alp tuples: pis : ; “a Ke tg Display fiat 3 tuples hom the “tudent’ elation eth ‘A Tard "| pigdeina [chudent stiv' ns Goollna: iat: name: chavonay, a: lont ); Qe LIMIT AS; We : Sa pump Bi ig ra. With neat diagram explain the main featinic Mo Apaik — aed - 1g APT's’ for Tn-memo yy f ca Apache spark shiny Gini (RDD's) engi Gentval- ae Bap. duler Apache Spark comping, 7 Stack Syrin Concise and Consistent ae An timized engine that SY pots qeneval Nefeat scale, i execution graphs, java. python and J = Main frahuats of Spank. — @ Scanned with OKEN Scanner The features of spark : 1. Spark grovisans for creating applications thal i the complex data. Th memowy Apache spank compubin engine enables up to | 100- times -.pefonmanc with aupeet “to ‘Madaop. Be ~ Execution yur uses both in “memory. an.“ On-dis k computtin A Data tpleaclin fom _an object Store for immediate ust as a gpavk object) jastence. - Trovides high performance “when” an “pplication accesses «memo, each. rom the disk. Processes any \okind of lata, namely shuctured, Semi + shad sor unshuctused dato aviving fromm 4 VERDI" spurte. * Ns - Supports mn new ees ia addition ‘to Map “and Reduce — funchions . o > wD => ¢ @ Scanned with OKEN Scanner 4b. Explain the five layer owchitectuxe fox vunni applications using spark stack. nit ve [tin layer — user appliationg f ipgliaton for the ETL ‘Analytics, BP, 82 ; Layer | Machine Leaning: applications ye Spark sgl 1 Spank Streaming , ipaakk i Applications Spark Gaph k, Spaak MLib’, ! y Sippost pfasponk, feo 4. layer. SET Beco ANSP | Des compatible data thous | | ‘Data 7 : ee) Storage 4 r Pawallel tasks : Hadoop wm oY Apache Mesos Deira actin a Grok Yeah eae ol Pe 1 Spadk, tack, » imbibts qeoerality ta spaak, grouping of the following, « forms. spark” stacks ys ” = Spatk SQL... for tha. structured, rai quns thee querits ‘pn Spark date in ee sings § “OMaLytis andl = wisqattzation “-cppbications, a v — @ Scanned with OKEN Scanner “Souk Streamine ba fg for’ preeasing ah a. theiming — data’. Processing As based yi ‘dy batihis syle © of computing andl processing. “pik ~ Yetmanmeer jackie used as Light -neight front, end + for Apache park Anon R.. Soak AD Uses, > SpavkR trou gh the Roo. class. . Spark MLib + is Sparks scapalable machine! Pity tibwogy |. 0 consiat of common=tearmioyy alg itis and atitities «-Muib includes clatsification » eq tsian, tlutering >} coll abovative Gtr: the. (Qa. With neat diagram exglain lext fain process 2, Features renevation 5» Analyzi ' Reiki y+ Clustering, “Vidalia ec TEE Cleanu ba of no _ | Tokenizationf— i + Skemniin + Reafovin Neqrams } | Classi fiat stop words } , [super wal | To Application @ Scanned with OKEN Scanner

You might also like