How To Set Up A Hadoop Cluster Using Oracle Solaris
Hands-On Labs of the System Admin and Developer Community of OTN
by Orgad Kimchi

How to set up a Hadoop cluster using the Oracle Solaris Zones, ZFS, and network virtualization technologies.

Published October 2013

Table of Contents
    Lab Introduction
    Prerequisites
    System Requirements
    Summary of Lab Exercises
    The Case for Hadoop
    Exercise 1: Install Hadoop
    Exercise 2: Edit the Hadoop Configuration Files
    Exercise 3: Configure the Network Time Protocol
    Exercise 4: Create the Virtual Network Interfaces
    Exercise 5: Create the NameNode and Secondary NameNode Zones
    Exercise 6: Set Up the DataNode Zones
    Exercise 7: Configure the NameNode
    Exercise 8: Set Up SSH
    Exercise 9: Format HDFS from the NameNode
    Exercise 10: Start the Hadoop Cluster
    Exercise 11: Run a MapReduce Job
    Exercise 12: Use ZFS Encryption
    Exercise 13: Use Oracle Solaris DTrace for Performance Monitoring
    Summary
    See Also
    About the Author
Expected duration: 180 minutes

Lab Introduction

This hands-on lab presents exercises that demonstrate how to set up an Apache Hadoop cluster using Oracle Solaris 11 technologies such as Oracle Solaris Zones, ZFS, and network virtualization. Key topics include the Hadoop Distributed File System (HDFS) and the Hadoop MapReduce programming model.

We will also cover the Hadoop installation process and the cluster building blocks: NameNode, a secondary NameNode, and DataNodes. In addition, you will see how you can combine the Oracle Solaris 11 technologies for better scalability and data security, and you will learn how to load data into the Hadoop cluster and run a MapReduce job.

Prerequisites

This hands-on lab is appropriate for system administrators who will be setting up or maintaining a Hadoop cluster in production or development environments. Basic Linux or Oracle Solaris system administration experience is a prerequisite. Prior knowledge of Hadoop is not required.

System Requirements

This hands-on lab is run on Oracle Solaris 11 in Oracle VM VirtualBox. The lab is self-contained. All you need is in the Oracle VM VirtualBox instance.

For those attending the lab at Oracle OpenWorld, your laptops are already preloaded with the correct Oracle VM VirtualBox image.

If you want to try this lab outside of Oracle OpenWorld, you will need an Oracle Solaris 11 system. Do the following to set up your machine:

- If you do not have Oracle Solaris 11, download it here.
- Download the Oracle Solaris 11.1 VirtualBox Template (file size 1.7 GB).
- Install the template as described here. (Note: On step 4 of Exercise 2 for installing the template, set the RAM size to 4 GB in order to get good performance.)
Notes for Oracle OpenWorld Attendees

- Each attendee will have his or her own laptop for the lab.
- The login name and password for this lab are provided in a "one-pager."
- Oracle Solaris 11 uses the GNOME desktop. If you have used the desktops on Linux or other UNIX operating systems, the interface should be familiar. Here are some quick basics in case the interface is new for you.
- In order to open a terminal window in the GNOME desktop system, right-click the background of the desktop, and select Open Terminal in the pop-up menu.
- The following source code editors are provided on the lab machines: vi (type vi in a terminal window) and emacs (type emacs in a terminal window).

Summary of Lab Exercises

This hands-on lab consists of 13 exercises covering various Oracle Solaris and Apache Hadoop technologies:

1. Install Hadoop.
2. Edit the Hadoop configuration files.
3. Configure the Network Time Protocol.
4. Create the virtual network interfaces (VNICs).
5. Create the NameNode and the secondary NameNode zones.
6. Set up the DataNode zones.
7. Configure the NameNode.
8. Set up SSH.
9. Format HDFS from the NameNode.
10. Start the Hadoop cluster.
11. Run a MapReduce job.
12. Secure data at rest using ZFS encryption.
13. Use Oracle Solaris DTrace for performance monitoring.

The Case for Hadoop

The Apache Hadoop software is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.

To store data, Hadoop uses the Hadoop Distributed File System (HDFS), which provides high-throughput access to application data and is suitable for applications that have large data sets.

For more information about Hadoop and HDFS, see http://hadoop.apache.org/.

The Hadoop cluster building blocks are as follows:

NameNode: The centerpiece of HDFS, which stores file system metadata, directs the slave DataNode daemons to perform the low-level I/O tasks,
http://www.oracle.com/technetwork/systems/hands-on-labs/hol-setup-hadoop-solaris-2041770.html
1/15
2/21/2014
Figure 2

Open a terminal window by right-clicking any point in the background of the desktop and selecting Open Terminal in the pop-up menu.

Figure 3

Next, switch to the root user using the following command.

Note: For Oracle OpenWorld attendees, the root password has been provided in the one-pager associated with this lab. For those running this lab outside of Oracle OpenWorld, enter the root password you entered when you followed the steps in the "System Requirements" section.

    root@global_zone:~# su
    Password:
    Oracle Corporation      SunOS 5.11      11.1    September 2012

Set up the virtual network interface card (VNIC) in order to enable network access to the global zone from the non-global zones.

Note: Oracle OpenWorld attendees can skip this step (because the preloaded Oracle VM VirtualBox image already provides configured VNICs) and go directly to step 16, "Browse the lab supplement materials."

    root@global_zone:~# dladm create-vnic -l net0 vnic0
    root@global_zone:~# ipadm create-ip vnic0
    root@global_zone:~# ipadm create-addr -T static -a local=192.168.1.100/24 vnic0/addr

Verify the VNIC creation:

    root@global_zone:~# ipadm show-addr vnic0
    ADDROBJ           TYPE     STATE        ADDR
    vnic0/addr        static   ok           192.168.1.100/24

Create the hadoophol directory; we will use it to store the lab supplement materials associated with this lab, such as scripts and input files.
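A minimal sketch of this directory setup, assuming the Bin, Doc, and Scripts subdirectories that later steps refer to and a base path of /usr/local/hadoophol. The function name make_hol_dirs is hypothetical, and the base path can be overridden to try the sketch in a scratch location first.

```shell
#!/bin/sh
# Sketch only: create the lab's working directories.
# Assumption: the subdirectory names (Bin, Doc, Scripts) are taken from
# paths and listings that appear later in this lab.
make_hol_dirs() {
    base=${1:-/usr/local/hadoophol}
    for sub in Bin Doc Scripts; do
        mkdir -p "$base/$sub" || return 1
    done
    # Print the resulting subdirectories, one per line.
    ls "$base"
}
```

Running make_hol_dirs with no argument as root would create the directories under /usr/local/hadoophol; passing a path such as /tmp/hadoophol-test keeps the trial run out of /usr/local.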
Figure 4

Copy the Hadoop tarball to /usr/local/hadoophol/Bin.

    root@global_zone:~# cp /export/home/oracle/Downloads/hadoop-1.2.1.tar.gz /usr/local/hadoophol/Bin/

Note: By default, the file is downloaded to the user's Downloads directory.

Next, we are going to create the lab scripts, so create a directory for them:

    root@global_zone:~# mkdir /usr/local/hadoophol/Scripts

Create the createzone script using your favorite editor, as shown in Listing 1. We will use this script to set up the Oracle Solaris Zones.

    root@global_zone:~# vi /usr/local/hadoophol/Scripts/createzone

Listing 1

    #!/bin/ksh

    # FILENAME:    createzone
    # Create a zone with a VNIC
    # Usage:
    # createzone <zone name> <VNIC>

    if [ $# != 2 ]
    then
        echo "Usage: createzone <zone name> <VNIC>"
        exit 1
    fi

    ZONENAME=$1
    VNICNAME=$2

    zonecfg -z $ZONENAME > /dev/null 2>&1 << EOF
    create
    set autoboot=true
    set limitpriv=default,dtrace_proc,dtrace_user,sys_time
    set zonepath=/zones/$ZONENAME
    add fs
    set dir=/usr/local
    set special=/usr/local
    set type=lofs
    set options=[ro,nodevices]
    end
    add net
    set physical=$VNICNAME
    end
    verify
    exit
    EOF
    if [ $? == 0 ] ; then
        echo "Successfully created the $ZONENAME zone"
    else
        echo "Error: unable to create the $ZONENAME zone"
        exit 1
    fi

Create the verifycluster script using your favorite editor, as shown in Listing 2. We will use this script to verify the Hadoop cluster setup.

    root@global_zone:~# vi /usr/local/hadoophol/Scripts/verifycluster

Listing 2

    #!/bin/ksh

    # FILENAME:    verifycluster
    # Verify the hadoop cluster configuration
    # Usage:
    # verifycluster

    RET=1
    2 Jul  8 15:11 Bin
    2 Jul  8 15:11 Doc
    2 Jul  8 15:12 Scripts
Concept Break: Oracle Solaris 11 Networking Virtualization Technology

Oracle Solaris provides a reliable, secure, and scalable infrastructure to meet the growing needs of data center implementations. Its powerful network stack architecture, also known as Project Crossbow, provides the following.

- Network virtualization with virtual NICs (VNICs) and virtual switching
- Tight integration with Oracle Solaris Zones and Oracle Solaris 10 Zones
- Network resource management, which provides an efficient and easy way to manage integrated QoS to enforce bandwidth limits on VNICs and traffic flows
- An optimized network stack that reacts to network load levels
- The ability to build a "data center in a box"

Oracle Solaris Zones on the same system can benefit from very high network I/O throughput (up to four times faster) with very low latency compared to systems with, say, 1 Gb physical network connections. For a Hadoop cluster, this means that the DataNodes can replicate the HDFS blocks much faster.

For more information about network virtualization benchmarks, see "How to Control Your Application's Network Bandwidth."

Create a series of virtual network interfaces (VNICs) for the different zones:

    root@global_zone:~# dladm create-vnic -l net0 name_node1
    root@global_zone:~# dladm create-vnic -l net0 secondary_name1
    root@global_zone:~# dladm create-vnic -l net0 data_node1
    root@global_zone:~# dladm create-vnic -l net0 data_node2
    root@global_zone:~# dladm create-vnic -l net0 data_node3

Verify the VNICs creation:

    root@global_zone:~# dladm show-vnic
    LINK              OVER   SPEED  MACADDRESS        MACADDRTYPE  VID
    name_node1        net0   1000   2:8:20:c6:3e:f1   random       0
    secondary_name1   net0   1000   2:8:20:b9:80:45   random       0
    data_node1        net0   1000   2:8:20:30:1c:3a   random       0
    data_node2        net0   1000   2:8:20:a8:b1:16   random       0
    data_node3        net0   1000   2:8:20:df:89:81   random       0
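Because the five VNICs differ only by name, the creation commands can also be generated in a loop. The following is a minimal sketch that only prints the commands (a dry run) so they can be reviewed first; the function name print_vnic_commands is hypothetical, while the link name net0 and the VNIC names are the ones used in this exercise.

```shell
#!/bin/sh
# Sketch only: print (do not run) one dladm command per VNIC name.
# First argument: the underlying link; remaining arguments: VNIC names.
print_vnic_commands() {
    link=$1
    shift
    for vnic in "$@"; do
        echo "dladm create-vnic -l $link $vnic"
    done
}

print_vnic_commands net0 name_node1 secondary_name1 data_node1 data_node2 data_node3
```

Keeping the sketch as a dry run means it can be tried on any system; on the lab machine, the printed lines could be reviewed and then run as root in the global zone.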
We can see that we have five VNICs now. Figure 5 shows the architecture layout:

Figure 5

Exercise 5: Create the NameNode and Secondary NameNode Zones

Concept Break: Oracle Solaris Zones

Oracle Solaris Zones let you isolate one application from others on the same OS, allowing you to create an isolated environment in which users can log in and do what they want from inside an Oracle Solaris Zone without affecting anything outside that zone. In addition, Oracle Solaris Zones are secure from external attacks and internal malicious programs. Each Oracle Solaris Zone contains a complete resource-controlled environment
    BRAND    IP
    solaris  shared
    solaris  excl
    solaris  excl
    solaris  excl
    solaris  excl
    solaris  excl

    root@global_zone:~# zlogin -C namenode

Provide the zone host information by using the following configuration for the namenode zone:

- For the host name, use namenode.
- Select manual network configuration.
- Ensure the network interface name_node1 has an IP address of 192.168.1.1 and a netmask of 255.255.255.0.
- Ensure the name service is based on your network configuration. In this lab, we will use /etc/hosts for name resolution, so we won't set up DNS for host name resolution. Select Do not configure DNS.
- For Alternate Name Service, select None.
- For Time Zone Region, select Americas.
- For Time Zone Location, select United States.
- For Time Zone, select Pacific Time.
- Enter your root password.

After finishing the zone setup, you will get the login prompt. Log in to the zone as user root.

    namenode console login: root
    Password:

Developing for Hadoop requires a Java programming environment. You can install Java Development Kit (JDK) 6 using the following command:

    root@namenode:~# pkg install jdk-6

Verify the Java installation:

    root@namenode:~# which java
    /usr/bin/java
    root@namenode:~# java -version
    java version "1.6.0_35"
    Java(TM) SE Runtime Environment (build 1.6.0_35-b10)
    Java HotSpot(TM) Client VM (build 20.10-b01, mixed mode)

Create a Hadoop user inside the namenode zone:
Figure 6

- For the host name, use secnamenode.
- Select manual network configuration and for the network interface, use secondary_name1.
- Use an IP address of 192.168.1.2 and a netmask of 255.255.255.0.
- Select Do not configure DNS in the DNS name service window.
- Ensure Alternate Name Service is set to None.
- For Time Zone Region, select Americas.
- For Time Zone Location, select United States.
- For Time Zone, select Pacific Time.
- Enter your root password.

Note: Press ~. to exit from the secnamenode console and return to the global zone.

Perform similar steps for datanode1, datanode2, and datanode3:

Do the following for datanode1:

    root@global_zone:~# zoneadm -z datanode1 clone namenode
    root@global_zone:~# zoneadm -z datanode1 boot
    root@global_zone:~# zlogin -C datanode1

- For the host name, use datanode1.
- Select manual network configuration and for the network interface, use data_node1.
- Use an IP address of 192.168.1.3 and a netmask of 255.255.255.0.
- Select Do not configure DNS in the DNS name service window.
- Ensure Alternate Name Service is set to None.
- For Time Zone Region, select Americas.
- For Time Zone Location, select United States.
- For Time Zone, select Pacific Time.
- Enter your root password.

Do the following for datanode2:

    root@global_zone:~# zoneadm -z datanode2 clone namenode
    root@global_zone:~# zoneadm -z datanode2 boot
    root@global_zone:~# zlogin -C datanode2

- For the host name, use datanode2.
- For the network interface, use data_node2.
- Use an IP address of 192.168.1.4 and a netmask of 255.255.255.0.
- Select Do not configure DNS in the DNS name service window.
- Ensure Alternate Name Service is set to None.
- For Time Zone Region, select Americas.
- For Time Zone Location, select United States.
- For Time Zone, select Pacific Time.
- Enter your root password.

Do the following for datanode3:

    root@global_zone:~# zoneadm -z datanode3 clone namenode
    root@global_zone:~# zoneadm -z datanode3 boot
    root@global_zone:~# zlogin -C datanode3

- For the host name, use datanode3.
- For the network interface, use data_node3.
- Use an IP address of 192.168.1.5 and a netmask of 255.255.255.0.
- Select Do not configure DNS in the DNS name service window.
- Ensure Alternate Name Service is set to None.
- For Time Zone Region, select Americas.
- For Time Zone Location, select United States.
- For Time Zone, select Pacific Time.
- Enter your root password.

Boot the namenode zone:

    root@global_zone:~# zoneadm -z namenode boot

Verify that all the zones are up and running:

    root@global_zone:~# zoneadm list -cv
      ID NAME          STATUS    PATH                 BRAND    IP
       0 global        running   /                    solaris  shared
      10 secnamenode   running   /zones/secnamenode   solaris  excl
      12 datanode1     running   /zones/datanode1     solaris  excl
      14 datanode2     running   /zones/datanode2     solaris  excl
      16 datanode3     running   /zones/datanode3     solaris  excl
      17 namenode      running   /zones/namenode      solaris  excl

To verify your SSH access without using a password for the Hadoop user, do the following.

From namenode, log in via SSH into namenode (that is, to itself):

    root@global_zone:~# zlogin namenode
    root@namenode:~# su - hadoop
Concept Break: Hadoop Distributed File System (HDFS)

HDFS is a distributed, scalable file system. HDFS stores metadata on the NameNode. Application data is stored on the DataNodes, and each DataNode serves up blocks of data over the network using a block protocol specific to HDFS. The file system uses the TCP/IP layer for communication. Clients use Remote Procedure Call (RPC) to communicate with each other.

The DataNodes do not rely on data protection mechanisms, such as RAID, to make the data durable. Instead, the file content is replicated on multiple DataNodes for reliability.

With the default replication value (3), which is set up in the hdfs-site.xml file, data is stored on three nodes. DataNodes can talk to each other in order to rebalance data, to move copies around, and to keep the replication of data high. In Figure 7, we can see that every data block is replicated across three data nodes based on the replication value.

An advantage of using HDFS is data awareness between the JobTracker and TaskTracker. The JobTracker schedules map or reduce jobs to TaskTrackers with an awareness of the data location. For example, if node A contained data (x,y,z) and node B contained data (a,b,c), the JobTracker would schedule node B to perform map or reduce tasks on (a,b,c) and node A to perform map or reduce tasks on (x,y,z). This reduces the amount of traffic that goes over the network and prevents unnecessary data transfer. This data awareness can have a significant impact on job completion times, which has been demonstrated when running data-intensive jobs.

For more information about Hadoop HDFS, see https://en.wikipedia.org/wiki/Hadoop.
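To make the replication arithmetic concrete, the following minimal sketch estimates how many block replicas HDFS would store for a single file. It assumes a 64 MB block size (the Hadoop 1.x default, not stated in this lab) and the replication value of 3 described above; the function name hdfs_replica_count is hypothetical, and sizes are given in whole megabytes.

```shell
#!/bin/sh
# Sketch only: estimate the number of block replicas stored for one file.
# Assumptions: block size defaults to 64 MB, replication defaults to 3.
hdfs_replica_count() {
    size_mb=$1
    block_mb=${2:-64}
    replication=${3:-3}
    # Ceiling division: a partial final block still counts as one block.
    blocks=$(( (size_mb + block_mb - 1) / block_mb ))
    echo $(( blocks * replication ))
}

hdfs_replica_count 200   # 200 MB -> 4 blocks, each stored 3 times -> prints 12
```

With three DataNodes and a replication value of 3, as in this lab, each of those blocks would end up on every DataNode.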
Figure 7

To format HDFS, run the following commands and answer Y at the prompt:

    root@global_zone:~# zlogin namenode
    root@namenode:~# mkdir -p /hdfs/name
    root@namenode:~# chown -R hadoop:hadoop /hdfs
    root@namenode:~# su - hadoop
    hadoop@namenode:$ /usr/local/hadoop/bin/hadoop namenode -format
    13/10/13 09:10:52 INFO namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    STARTUP_MSG:   host = namenode/192.168.1.1
    STARTUP_MSG:   args = [-format]
    STARTUP_MSG:   version = 1.2.1
    STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Ju
    STARTUP_MSG:   java = 1.6.0_35
    ************************************************************/
    hadoop@namenode:$ Re-format filesystem in /hdfs/name ? (Y or N) Y

On every DataNode (datanode1, datanode2, and datanode3), create a Hadoop data directory to store the HDFS blocks:

    root@global_zone:~# zlogin datanode1
    root@datanode1:~# mkdir -p /hdfs/data
    root@datanode1:~# chown -R hadoop:hadoop /hdfs
Figure 8

Exercise 11: Run a MapReduce Job

Concept Break: MapReduce

MapReduce is a framework for processing parallelizable problems across huge data sets using a cluster of computers.

The essential idea of MapReduce is using two functions to grab data from a source: using the Map() function and then processing the data across a cluster of computers using the Reduce() function. Specifically, Map() will apply a function to all the members of a data set and post a result set, which Reduce() will then collate and resolve.

Map() and Reduce() can be run in parallel and across multiple systems.

For more information about MapReduce, see http://en.wikipedia.org/wiki/MapReduce.

We will use the WordCount example, which reads text files and counts how often words occur. The input and output consist of text files, each line of
Concept Break: ZFS Encryption

Oracle Solaris 11 adds transparent data encryption functionality to ZFS. All data and file system metadata (such as ownership, access control lists, quota information, and so on) is encrypted when stored persistently in the ZFS pool.

A ZFS pool can support a mix of encrypted and unencrypted ZFS data sets (file systems and ZVOLs). Data encryption is completely transparent to applications and other Oracle Solaris file services, such as NFS or CIFS. Since encryption is a first-class feature of ZFS, we are able to support compression, encryption, and deduplication together. Encryption key management for encrypted data sets can be delegated to users, Oracle Solaris Zones, or both. Oracle Solaris with ZFS encryption provides a very flexible system for securing data at rest, and it doesn't require any application changes or qualification.

For more information about ZFS encryption, see "How to Manage ZFS Data Encryption."

The output data can contain sensitive information, so use ZFS encryption to protect the output data.

Create the encrypted ZFS data set:

Note: You need to provide the passphrase; it must be at least eight characters.

    root@namenode:~# zfs create -o encryption=on rpool/export/output
    Enter passphrase for 'rpool/export/output':
    Enter again:

Verify that the ZFS data set is encrypted:

    root@namenode:~# zfs get all rpool/export/output | grep encry
    rpool/export/output  encryption  on  local

Change the ownership:

    root@namenode:~# chown hadoop:hadoop /export/output

Copy the output file from HDFS into ZFS:

    root@namenode:~# su - hadoop
    Oracle Corporation      SunOS 5.11      11.1    September 2012
    hadoop@namenode:$ hadoop dfs -getmerge /outputdata/output1 /export/output

Analyze the output text file. Each line contains a word and the number of times the word occurred, separated by a tab.

    hadoop@namenode:$ head /export/output/output1
    "A      2
    "Alpha  1
    "Alpha,"        1
    "An     2
    "And    1
    "BOILING"       2
    "Batesian"      1
    "Beta   2

Protect the output text file by unmounting the ZFS data set, and then unload the wrapping key for an encrypted data set using the following command:

    root@namenode:~# zfs key -u rpool/export/output
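The tab-separated word-count format described in this exercise is easy to post-process with standard tools while the data set is still mounted. The following is a minimal sketch that lists the most frequent words; the function name top_words is hypothetical, and the input is assumed to be one word, a tab, and a count per line.

```shell
#!/bin/sh
# Sketch only: print the N most frequent words from WordCount-style
# output (word<TAB>count per line) read from standard input.
top_words() {
    n=${1:-5}
    tab=$(printf '\t')
    # Sort numerically on the count column, highest first, then keep N lines.
    sort -t "$tab" -k2,2nr | head -n "$n"
}
```

For example, top_words 10 < /export/output/output1 would show the ten most frequent words, assuming the merged output file from the step above is still readable.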
Concept Break: Oracle Solaris DTrace

Oracle Solaris DTrace is a comprehensive, advanced tracing tool for troubleshooting systematic problems in real time. Administrators, integrators, and developers can use DTrace to dynamically and safely observe live production systems, including both applications and the operating system itself, for performance issues.

DTrace allows you to explore a system to understand how it works, track down problems across many layers of software, and locate the cause of any aberrant behavior. Whether it's at a high-level global overview, such as memory consumption or CPU time, or at a much finer-grained level, such as what specific function calls are being made, DTrace can provide operational insights that have been missing in the data center by enabling you to do the following:

- Insert 80,000+ probe points across all facets of the operating system.
- Instrument user- and system-level software.
- Use a powerful and easy-to-use scripting language and command-line interfaces.

For more information about DTrace, see http://www.oracle.com/technetwork/server-storage/solaris11/technologies/dtrace-1930301.html.

Open another terminal window and log in to namenode as user hadoop.

Run the following MapReduce job:

    hadoop@namenode:$ hadoop jar /usr/local/hadoop/hadoop-examples-1.2.1.jar wordcount /inputdata/pg20417.txt /outputdata/output2

When the Hadoop job is run, determine what processes are executed on the NameNode.
In the terminal window, run the following DTrace command:

    root@globalzone:~# dtrace -n 'proc:::exec-success/strstr(zonename,"namenode")>0/ { trace(curpsinfo->pr_psargs); }'
    dtrace: description 'proc:::exec-success' matched 1 probe
    CPU     ID   FUNCTION:NAME
      0   4473   exec_common:exec-success   /usr/bin/env bash /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/hadoop-exa
      0   4473   exec_common:exec-success   bash /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/hadoop-examples-1.1.2.j
      0   4473   exec_common:exec-success   dirname /usr/local/hadoop-1.1.2/libexec/
      0   4473   exec_common:exec-success   dirname /usr/local/hadoop-1.1.2/libexec/
      0   4473   exec_common:exec-success   sed -e s/ /_/g
      1   4473   exec_common:exec-success   dirname /usr/local/hadoop/bin/hadoop
      1   4473   exec_common:exec-success   dirname -- /usr/local/hadoop/bin/../libexec/hadoop-config.sh
      1   4473   exec_common:exec-success   basename -- /usr/local/hadoop/bin/../libexec/hadoop-config.sh
      1   4473   exec_common:exec-success   basename /usr/local/hadoop-1.1.2/libexec/
      1   4473   exec_common:exec-success   uname
      1   4473   exec_common:exec-success   /usr/java/bin/java -Xmx32m org.apache.hadoop.util.PlatformName
      1   4473   exec_common:exec-success   /usr/java/bin/java -Xmx32m org.apache.hadoop.util.PlatformName
      0   4473   exec_common:exec-success   /usr/java/bin/java -Dproc_jar -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop -Dhado
      0   4473   exec_common:exec-success   /usr/java/bin/java -Dproc_jar -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop -Dhado
    ^C
Note: Press Ctrl-c in order to see the DTrace output.

When the Hadoop job is run, determine what files are written to the NameNode.

Note: If the MapReduce job is finished, you can run another job with a different output directory (for example, /outputdata/output3).

For example:

    hadoop@namenode:$ hadoop jar /usr/local/hadoop/hadoop-examples-1.2.1.jar wordcount /inputdata/pg20417.txt /outputdata/output3

    root@globalzone:~# dtrace -n 'syscall::write:entry/strstr(zonename,"namenode")>0/ { @write[fds[arg0].fi_pathname]=count(); }'
    dtrace: description 'syscall::write:entry' matched 1 probe
    ^C

      /zones/namenode/root/tmp/hadoop-hadoop/mapred/local/jobTracker/.job_201307181457_0007.xml.crc    1
      /zones/namenode/root/var/log/hadoop/history/.job_201307181457_0007_conf.xml.crc                  1
      /zones/namenode/root/dev/pts/3                                                                    5
      /zones/namenode/root/var/log/hadoop/job_201307181457_0007_conf.xml                               6
      /zones/namenode/root/tmp/hadoop-hadoop/mapred/local/jobTracker/job_201307181457_0007.xml         8
      /zones/namenode/root/var/log/hadoop/history/job_201307181457_0007_conf.xml                      11
      /zones/namenode/root/var/log/hadoop/hadoopjobtrackernamenode.log                                13
      /zones/namenode/root/hdfs/name/current/edits.new                                                25
      /zones/namenode/root/var/log/hadoop/hadoopnamenodenamenode.log                                  45
      /zones/namenode/root/dev/poll                                                                  207
      <unknown>                                                                                  3131655

Note: Press Ctrl-c in order to see the DTrace output.

When the Hadoop job is run, determine what processes are executed on the DataNode:

    root@globalzone:~# dtrace -n 'proc:::exec-success/strstr(zonename,"datanode1")>0/ { trace(curpsinfo->pr_psargs); }'
    dtrace: description 'proc:::exec-success' matched 1 probe
    CPU     ID   FUNCTION:NAME
      0   8833   exec_common:exec-success   dirname /usr/local/hadoop/bin/hadoop
      0   8833   exec_common:exec-success   dirname /usr/local/hadoop/libexec/
      0   8833   exec_common:exec-success   sed -e s/ /_/g
      1   8833   exec_common:exec-success   dirname -- /usr/local/hadoop/bin/../libexec/hadoop-config.sh
      2   8833   exec_common:exec-success   basename /usr/local/hadoop/libexec/
      2   8833   exec_common:exec-success   /usr/java/bin/java -Xmx32m org.apache.hadoop.util.PlatformName
      2   8833   exec_common:exec-success   /usr/java/bin/java -Xmx32m org.apache.hadoop.util.PlatformName
      3   8833   exec_common:exec-success   /usr/bin/env bash /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/hadoop-exa
      3   8833   exec_common:exec-success   bash /usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/hadoop-examples-1.0.4.j
      3   8833   exec_common:exec-success   basename -- /usr/local/hadoop/bin/../libexec/hadoop-config.sh
      3   8833   exec_common:exec-success   dirname /usr/local/hadoop/libexec/
      3   8833   exec_common:exec-success   uname
      3   8833   exec_common:exec-success   /usr/java/bin/java -Dproc_jar -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop -Dhado
    root@globalzone:~# dtrace -n 'syscall::write:entry / ( strstr(zonename,"datanode1")!=0 || strstr(zonename,"datanode2")!=0 || strstr(zonename,"datanode3")!=0 ) && strstr(fds[arg0].fi_pathname,"hdfs")!=0 && strstr(fds[arg0].fi_pathname,"blocksBeingWritten")>0/ { @write[fds[arg0].fi_pathname]=sum(arg2); }'
    ^C

Summary

In this lab, we learned how to set up a Hadoop cluster using Oracle Solaris 11 technologies such as Oracle Solaris Zones, ZFS, network virtualization, and DTrace.

See Also

- Hadoop and HDFS
- Hadoop framework
- "How to Control Your Application's Network Bandwidth"
- "How to Get Started Creating Oracle Solaris Zones in Oracle Solaris 11"
- "How to Set Up a Hadoop Cluster Using Oracle Solaris Zones"
- "How to Build Native Hadoop Libraries for Oracle Solaris 11"
- MapReduce
- WordCount
- "How to Manage ZFS Data Encryption"
- DTrace

About the Author

Orgad Kimchi is a principal software engineer on the ISV Engineering team at Oracle (formerly Sun Microsystems). For 6 years he has specialized in virtualization, big data, and cloud computing technologies.

Revision 1.0, 10/21/2013