
Linux Virtual Server Tutorial

Horms (Simon Horman)
VA Linux Systems Japan, K.K. www.valinux.co.jp
July 2003. Revised March 2004

http://www.ultramonkey.org/
with assistance from

Abstract:
The Linux Virtual Server Project (LVS) allows load balancing of networked services such as web and mail servers using Layer 4 Switching. It is extremely fast and allows such services to be scaled to service 10s or 100s of thousands of simultaneous connections. The purpose of this tutorial is to demonstrate how to use various features of LVS to load balance Internet services, and how this can be made highly available using tools such as heartbeat and keepalived. It will also cover more advanced topics which have been the subject of recent development, including maintaining active connections in a highly available environment and using active feedback to better distribute load.

Introduction
The Linux Virtual Server Project (LVS) implements layer 4 switching in the Linux Kernel. This allows TCP and UDP sessions to be load balanced between multiple real servers. Thus it provides a way to scale Internet services beyond a single host. HTTP and HTTPS traffic for the World Wide Web is probably the most common use, though it can also be used for more or less any service, from email to the X Window System.
LVS itself runs on Linux. However, it is able to load balance connections from end users running any operating system to real servers running any operating system. As long as the connections use TCP or UDP, LVS can be used.
LVS is very high performance. It is able to handle upwards of 100,000 simultaneous connections. It is easily able to load balance a saturated 100Mbit ethernet link using inexpensive commodity hardware. It is also able to load balance a saturated 1Gbit link and beyond using higher-end commodity hardware.

LVS Basics
This section covers the basics of how LVS works, how to obtain and install LVS, and how to configure it for its main modes of operation. In short, it covers how to set up LVS to load balance TCP and UDP services.

Terminology
Linux Director: Host with Linux and LVS installed which receives packets from end users and forwards them to real servers.
End User: Host that originates a connection.
Real Server: Host that terminates a connection. This will be running some sort of daemon such as Apache.
A single host may act in more than one of the above roles at the same time.
Virtual IP Address (VIP): The IP address assigned to a service that a Linux Director will handle.
Real IP Address (RIP): The IP address of a Real Server.

Layer 4 Switching

Figure 1: LVS-NAT
Layer 4 Switching works by multiplexing incoming TCP/IP connections and UDP/IP datagrams to real servers. Packets are received by a Linux Director and a decision is made as to which real server to forward the packet to. Once this decision is made, subsequent packets for the same connection will be sent to the same real server. Thus, the integrity of the connection is maintained.
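
The behaviour described above can be sketched as a connection table keyed on protocol, addresses and ports. This is only an illustration of the idea, not the kernel's data structure: the `Director` class and its method names are hypothetical, and a plain round-robin choice stands in for whatever scheduler is configured.

```python
import itertools


class Director:
    """Toy layer 4 switch: the real server is chosen once per connection."""

    def __init__(self, real_servers):
        # Hypothetical stand-in for the scheduler: plain round robin.
        self._next_server = itertools.cycle(real_servers)
        # (proto, src_ip, src_port, dst_ip, dst_port) -> chosen real server
        self.connections = {}

    def forward(self, proto, src_ip, src_port, dst_ip, dst_port):
        key = (proto, src_ip, src_port, dst_ip, dst_port)
        if key not in self.connections:
            # First packet of a connection: make the scheduling decision.
            self.connections[key] = next(self._next_server)
        # Later packets of the same connection follow the existing entry.
        return self.connections[key]


director = Director(["192.168.6.4", "192.168.6.5"])
first = director.forward("tcp", "10.2.3.4", 34802, "172.17.60.201", 80)
again = director.forward("tcp", "10.2.3.4", 34802, "172.17.60.201", 80)
other = director.forward("tcp", "10.2.3.4", 34803, "172.17.60.201", 80)
print(first, again, other)
```

The second packet with the same source port reuses the table entry, while the connection from a new source port triggers a fresh scheduling decision.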

Forwarding Packets
The Linux Virtual Server has three different ways of forwarding packets: network address translation (NAT), IP-IP encapsulation (tunnelling) and direct routing.

Network Address Translation (NAT): A method of manipulating the source and/or destination port and/or address of a packet. The most common use of this is IP masquerading, which is often used to enable RFC 1918 [2] private networks to access the Internet. In the context of layer 4 switching, packets are received from end users and the destination port and IP address are changed to that of the chosen real server. Return packets pass through the linux director, at which time the mapping is undone so the end user sees replies from the expected source.

Direct Routing: Packets from end users are forwarded directly to the real server. The IP packet is not modified, so the real servers must be configured to accept traffic for the virtual server's IP address. This can be done using a dummy interface or packet filtering to redirect traffic addressed to the virtual server's IP address to a local port. The real server may send replies directly back to the end user. Thus, the linux director does not need to be in the return path.

IP-IP Encapsulation (Tunnelling): Allows packets addressed to an IP address to be redirected to another address, possibly on a different network. In the context of layer 4 switching the behaviour is very similar to that of direct routing, except that when packets are forwarded they are encapsulated in an IP packet, rather than just manipulating the ethernet frame. The main advantage of using tunnelling is that real servers can be on different networks.

Figure 2: LVS Direct Routing

Virtual Services
On the Linux Director a virtual service is defined by either an IP address, port and protocol, or a firewall mark. A virtual service may optionally have a persistence timeout associated with it. If this is set and a connection is received from the same IP address before the timeout has expired, then the connection will be forwarded to the same real server as the original connection.

IP Address, Port and Protocol: A virtual server may be specified by:
An IP Address: The IP address that end users will use to access the service.
A port: The port that end users will connect to.
A protocol: Either UDP or TCP.

Firewall Mark: Packets may be marked with a 32-bit unsigned value using ipchains or iptables. The Linux Virtual Server is able to use this mark to designate packets destined for a virtual service and route them accordingly. This is particularly useful if a large number of contiguous IP-based virtual services are required with the same real servers, or to group persistence between different ports, for instance to ensure that a given end user is sent to the same real server for both HTTP and HTTPS.
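
The persistence timeout can be illustrated with a small sketch. It assumes a simplified model in which each new connection from a client refreshes that client's timeout. The `PersistentService` class is hypothetical and does not mirror the kernel implementation; times are plain numbers of seconds supplied by the caller.

```python
class PersistentService:
    """Toy virtual service with a per-client persistence timeout."""

    def __init__(self, real_servers, timeout):
        self.real_servers = real_servers
        self.timeout = timeout
        self.templates = {}  # client IP -> (real server, expiry time)
        self._turn = 0

    def _schedule(self):
        # Stand-in scheduler: round robin over the real servers.
        server = self.real_servers[self._turn % len(self.real_servers)]
        self._turn += 1
        return server

    def connect(self, client_ip, now):
        entry = self.templates.get(client_ip)
        if entry is not None and now < entry[1]:
            server = entry[0]           # timeout not expired: same real server
        else:
            server = self._schedule()   # no valid entry: schedule afresh
        # Assumption: every connection restarts the client's timeout.
        self.templates[client_ip] = (server, now + self.timeout)
        return server


service = PersistentService(["192.168.6.4", "192.168.6.5"], timeout=300)
a = service.connect("10.2.3.4", now=0)
b = service.connect("10.2.3.4", now=100)   # within the timeout
c = service.connect("10.2.3.4", now=1000)  # after the timeout has expired
print(a, b, c)
```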

Scheduling
The virtual service is assigned a scheduling algorithm that is used to allocate incoming connections to the real servers. In LVS the schedulers are implemented as separate kernel modules. Thus new schedulers can be implemented without modifying the core LVS code.

There are many different scheduling algorithms available to suit a variety of needs. The simplest are round robin and least connected. Round robin allocates connections to each real server in turn, while least connected allocates each connection to the real server with the fewest connections. Weighted variants of these schedulers allow connections to be allocated in proportion to the weighting of the real server: more powerful real servers can be given a higher weight and thus will be allocated more connections.
More complex scheduling algorithms have been designed for specialised purposes, for instance to ensure that requests for the same IP address are sent to the same real server. This is useful when using LVS to load balance transparent proxies.
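
As a rough illustration of the simple schedulers, the following sketch implements round robin, least connected and a weighted least connected variant. The function names are hypothetical, and the weighted variant simply divides the connection count by the weight, which is a simplification of what the kernel modules compute.

```python
def round_robin(servers, turn):
    """Allocate connections to each real server in turn."""
    return servers[turn % len(servers)]


def least_connection(active):
    """Pick the real server with the fewest connections.

    active maps real server -> current number of connections.
    """
    return min(active, key=active.get)


def weighted_least_connection(active, weights):
    """Pick the server with the fewest connections relative to its weight."""
    candidates = [server for server in active if weights[server] > 0]
    return min(candidates, key=lambda server: active[server] / weights[server])


servers = ["192.168.6.4", "192.168.6.5"]
rr_order = [round_robin(servers, turn) for turn in range(4)]
lc_choice = least_connection({"192.168.6.4": 7, "192.168.6.5": 3})
wlc_choice = weighted_least_connection(
    {"192.168.6.4": 7, "192.168.6.5": 3},
    {"192.168.6.4": 4, "192.168.6.5": 1},
)
print(rr_order, lc_choice, wlc_choice)
```

In the weighted case 192.168.6.4 is chosen despite having more connections, because 7 connections at weight 4 is a lighter relative load than 3 connections at weight 1.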

Installing LVS
Some distributions, such as SuSE, ship with kernels that have LVS compiled in. In these cases installation should be as easy as installing the supplied ipvsadm package. At the time of writing Ultra Monkey provides packages built against Debian Sid (Unstable) and Woody (Stable/3.0) and Red Hat 7.3 and 8.0. Detailed information on how to obtain and install these packages can be found on www.ultramonkey.org. The rest of this section will discuss how to install LVS from source as it is useful to understand how this process works.
Early versions of LVS worked with Linux 2.2 series kernels. This implementation involved extensive patching of the Kernel sources. Thus, each version of LVS was closely tied to a version of the Kernel. The netfilter packet filtering architecture [4], which is part of the 2.4 kernels, has allowed LVS to be implemented almost exclusively as a set of kernel modules. The result is that LVS is no longer tied closely to an individual kernel release. LVS may also be compiled directly into the kernel. However, this discussion will focus on using LVS as a module as this approach is easier and more flexible.
1. Obtain and Unpack Kernel
It is always easiest to start with a fresh kernel. You can obtain this from www.kernel.org. This example will use the 2.4.20 kernel. It can be unpacked using the following command, which should unpack the kernel into the linux-2.4.20 directory.
tar -jxvf linux-2.4.20.tar.bz2

2. Obtain and Unpack LVS
LVS can be obtained from www.linuxvirtualserver.org. This example will use 1.0.9. It can be unpacked using the following command, which should unpack LVS into the ipvs-1.0.9 directory.
tar -zxvf ipvs-1.0.9.tar.gz

3. Apply LVS Patches to Kernel
Two minor kernel patches are required in order for the LVS modules to compile. To apply these patches use the following:
cd linux-2.4.20/
patch -p1 < ../ipvs-1.0.9/linux_kernel_ksyms_c.diff
patch -p1 < ../ipvs-1.0.9/linux_net_netsyms_c.diff

A third patch is applied to allow interfaces to be hidden. Hidden interfaces do not respond to ARP requests and are used on real servers with LVS direct routing.
patch -p1 < ../ipvs-1.0.9/contrib/patches/hidden-2.4.20pre10-1.diff

4. Configure the kernel
First ensure that the tree is clean:
make mrproper

Now configure the kernel. There are a variety of ways of doing this including make menuconfig, make xconfig and make config. Regardless of the method that you use, be sure to compile in netfilter support, with at least the following options. It is suggested that where possible these options are built as modules.
Networking options --->
  Network packet filtering (replaces ipchains)
  <m> IP: tunnelling
  IP: Netfilter Configuration --->
    <m> Connection tracking (required for masq/NAT)
    <m> FTP protocol support
    <m> IP tables support (required for filtering/masq/NAT)
    <m> Packet filtering
    <m> REJECT target support
    <m> Full NAT
    <m> MASQUERADE target support
    <m> REDIRECT target support
    <m> NAT of local connections (READ HELP) (NEW)
    <m> Packet mangling
    <m> MARK target support
    <m> LOG target support

5. Build and Install the Kernel
As the kernel has been reconfigured the build dependencies need to be reconstructed.
make dep

The kernel and modules may now be built using:
make bzImage modules

To install the newly built modules and kernel run the following command. This should install the modules under /lib/modules/2.4.20/ and the kernel in /boot/vmlinuz-2.4.20
make install modules_install

6. Update boot loader
If grub is used as the boot loader then a new entry should be added to /etc/grub.conf. This example assumes that the /boot partition is /dev/hda3. Existing entries in /etc/grub.conf should be used as a guide.
title 2.4.20 LVS
root (hd0,0)
kernel /vmlinuz-2.4.20 ro root=/dev/hda3

If the boot loader is lilo then a new entry should be added to /etc/lilo.conf. This example assumes that the / partition is /dev/hda2. Existing entries in /etc/lilo.conf should be used as a guide.
image=/boot/vmlinuz-2.4.20
label=2.4.20-lvs
read-only
root=/dev/hda2

Once /etc/lilo.conf has been updated run lilo.
lilo
Added Linux-LVS *
Added Linux
Added LinuxOLD

7. Reboot the system.
At your boot loader's prompt be sure to boot the newly created kernel.

8. Build and Install LVS
The commands to build LVS should be run from the ipvs-1.0.9/ipvs/ directory. To build and install use the following commands. /kernel/source/linux-2.4.20 should be the root directory that the kernel was just built in.
make KERNELSOURCE=/kernel/source/linux-2.4.20 all
make KERNELSOURCE=/kernel/source/linux-2.4.20 modules_install

9. Build and Install Ipvsadm
Ipvsadm is the user-space tool that is used to configure LVS. The source can be found in the ipvs-1.0.9/ipvs/ipvsadm/ directory. To build and install use the following commands.
make all
make install

LVS-NAT
LVS-NAT is arguably the simplest way to configure LVS. Packets from end users are received by the linux director and the destination IP address is rewritten to be that of one of the real servers. The return packets from the real server have their source IP address changed from that of the real server to the VIP.
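
The two rewrites performed in NAT mode can be sketched with a minimal packet model. The `Packet` tuple and the two helper functions are hypothetical and carry only the address fields. The addresses are taken from this tutorial's NAT example.

```python
from collections import namedtuple

# Minimal packet model: only the fields that NAT mode touches.
Packet = namedtuple("Packet", "src_ip src_port dst_ip dst_port")

VIP = ("172.17.60.201", 80)


def forward_to_real_server(packet, real_server):
    """Incoming packet: only the destination is rewritten."""
    ip, port = real_server
    return packet._replace(dst_ip=ip, dst_port=port)


def rewrite_reply(packet):
    """Reply from a real server: the source is rewritten back to the VIP."""
    return packet._replace(src_ip=VIP[0], src_port=VIP[1])


request = Packet("10.2.3.4", 34802, "172.17.60.201", 80)
forwarded = forward_to_real_server(request, ("192.168.6.5", 80))

reply = Packet("192.168.6.5", 80, "10.2.3.4", 34802)
seen_by_end_user = rewrite_reply(reply)
print(forwarded)
print(seen_by_end_user)
```

The forwarded packet still carries the end user's address as its source, so the real server replies to the end user's address and the reply must pass back through the linux director to have its source restored.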

Figure 3: LVS-NAT Example

Linux Director

Enable IP forwarding. This can be done by adding the following to /etc/sysctl.conf and then running sysctl -p.
net.ipv4.ip_forward = 1

Bring up 172.17.60.201 on eth0:0. This is best done as part of the networking configuration of your system, but it can also be done manually.
ifconfig eth0:0 172.17.60.201 netmask 255.255.0.0 broadcast 172.17.255.255

Configure LVS
ipvsadm -A -t 172.17.60.201:80
ipvsadm -a -t 172.17.60.201:80 -r 192.168.6.4:80 -m
ipvsadm -a -t 172.17.60.201:80 -r 192.168.6.5:80 -m

Real Servers

Make sure return packets are routed through the linux director. Typically this is done by setting the VIP on the server network as the default gateway.

Make sure that the desired daemon is listening on port 80 to handle connections from end users.

Testing and Debugging
Testing can be done by connecting to 172.17.60.201:80 from outside the server network.
Running a packet tracing tool on the linux directors and real servers is very useful for debugging purposes. Many setup problems can be resolved by tracing the path of a connection and observing at which step packets fail to appear. Using tcpdump will be discussed here as an example; there are a variety of tools available for various operating systems.
The following trace shows a connection being opened by an end user 10.2.3.4 to the VIP 172.17.60.201, which is forwarded to the real server 192.168.6.5. It shows packets being received by the linux director and then forwarded to the real server and vice versa. Note that the packets forwarded to the real server still have the end user's IP address as the source address. The linux director only changes the destination IP address of the packet. Similarly, replies from the real servers have the destination address set to that of the end user. The linux director only rewrites the source IP address of reply packets so that it is the VIP.
tcpdump -n -i any port 80
12:40:40.965499 10.2.3.4.34802 > 172.17.60.201.80:
  S 2555236140:2555236140(0) win 5840
  <mss 1460,sackOK,timestamp 16690997 0,nop,wscale 0>
12:40:40.967645 10.2.3.4.34802 > 192.168.6.5.80:
  S 2555236140:2555236140(0) win 5840
  <mss 1460,sackOK,timestamp 16690997 0,nop,wscale 0>
12:40:40.966976 192.168.6.5.80 > 10.2.3.4.34802:
  S 2733565972:2733565972(0) ack 2555236141 win 5792
  <mss 1460,sackOK,timestamp 128711091 16690997,nop,wscale 0> (DF)
12:40:40.968653 172.17.60.201.80 > 10.2.3.4.34802:
  S 2733565972:2733565972(0) ack 2555236141 win 5792
  <mss 1460,sackOK,timestamp 128711091 16690997,nop,wscale 0> (DF)
12:40:40.971241 10.2.3.4.34802 > 172.17.60.201.80:
  . ack 1 win 5840 <nop,nop,timestamp 16690998 128711091>
12:40:40.971387 10.2.3.4.34802 > 192.168.6.5.80:
  . ack 1 win 5840 <nop,nop,timestamp 16690998 128711091>
ctrl-c

ipvsadm -L -n can be used to show the number of active connections.
ipvsadm -L -n
IP Virtual Server version 1.0.9 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port     Forward Weight ActiveConn InActConn
TCP  172.17.60.201:80 rr
  -> 192.168.6.5:80         Masq    1      7          3
  -> 192.168.6.4:80         Masq    1      8          4

ipvsadm -L --stats will show the total number of connections, packets and bytes sent and received.
ipvsadm -L -n --stats
IP Virtual Server version 1.0.9 (size=4096)
Prot LocalAddress:Port      Conns   InPkts  OutPkts  InBytes OutBytes
  -> RemoteAddress:Port
TCP  172.17.60.201:80         114     1716     1153   193740   112940
  -> 192.168.6.5:80            57      821      567    94642    55842
  -> 192.168.6.4:80            57      895      586    99098    57098

ipvsadm -L --rate will show the number of connections, packets and bytes sent and received per second.
ipvsadm -L -n --rate
IP Virtual Server version 1.0.9 (size=4096)
Prot LocalAddress:Port        CPS    InPPS   OutPPS    InBPS   OutBPS
  -> RemoteAddress:Port
TCP  172.17.60.201:80          56      275      275    18739    41283
  -> 192.168.6.5:80            28      137      137     9344    20634
  -> 192.168.6.4:80            28      138      137     9395    20649

ipvsadm -L --zero will zero all the statistics counters.

LVS Direct Routing

Figure 4: LVS Direct Routing Example
LVS Direct Routing works by forwarding packets, unchanged, to the MAC addresses of real servers. As the packet is unmodified, the real servers need to be configured to accept traffic addressed to the VIP. This is most commonly done by using a hidden interface.
As the incoming packets are not modified by the linux director, the return packets do not need to pass through the linux director. Thus, higher throughput can be obtained. It is also easier to load balance services for end users on the same local network, as the return packets can be sent directly to the end user rather than forcing them to go through the linux director.

Linux Director

Enable IP forwarding. This can be done by adding the following to /etc/sysctl.conf and then running sysctl -p.
net.ipv4.ip_forward = 1

Bring up 172.17.60.201 on eth0:0. This is best done as part of the networking configuration of your system, but it can also be done manually.
ifconfig eth0:0 172.17.60.201 netmask 255.255.0.0 broadcast 172.17.255.255

Configure LVS
ipvsadm -A -t 172.17.60.201:80
ipvsadm -a -t 172.17.60.201:80 -r 172.17.60.199:80 -g
ipvsadm -a -t 172.17.60.201:80 -r 172.17.60.200:80 -g

The real servers can send reply packets directly to the end users without them needing to be altered by the linux director. Thus, the linux director does not need to be the gateway for the real servers.
However, in some situations, for instance because the linux director really is the gateway to the real servers' network, it is desirable to route return packets from the real servers via the linux director. The source address of these packets will be the VIP. However, the VIP belongs to an interface on the linux director. Thus, it will drop the packets as being bogus.
There are several approaches to this problem. Probably the best is to apply a kernel patch supplied by Julian Anastasov which adds proc entries that allow this packet-dropping behaviour to be disabled on a per-interface basis. This patch can be obtained from http://www.ssi.bg/~ja/#lvsgw

Real Servers

Make sure return packets are not routed through the linux director unless you have patched the kernel as described above.

Make sure that the desired daemon is listening on port 80 to handle connections from end users.

Bring up 172.17.60.201 on the loopback interface. This is best done as part of the networking configuration of your system, but it can also be done manually. On Linux this can be done using the following command.
ifconfig lo:0 172.17.60.201 netmask 255.255.255.255

Note that the netmask should be 255.255.255.255, regardless of the actual netmask of the network that 172.17.60.201 belongs to. This is because on the loopback interface all the addresses covered by the netmask are bound to the interface. The typical case is 127.0.0.1 with a netmask of 255.0.0.0, which sets up the loopback interface to accept all of 127.0.0.0/8. Thus, as we only want lo:0 to accept packets for 172.17.60.201, the netmask must be 255.255.255.255.

Hide loopback. On Linux real servers it is necessary to hide the loopback interface to prevent them from responding to ARP requests for the VIP. This can be done by applying the hidden interface patch discussed in the Installing LVS section. To activate the patch, add the following lines to /etc/sysctl.conf and then run sysctl -p.
# Enable configuration of hidden devices
net.ipv4.conf.all.hidden = 1
# Make the loopback interface hidden
net.ipv4.conf.lo.hidden = 1
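
The netmask requirement for lo:0 described above can be checked with Python's standard ipaddress module. The `covered` helper is hypothetical. It simply tests whether a candidate address falls inside the network implied by an address and netmask, which on the loopback interface is the set of addresses that would be accepted.

```python
import ipaddress


def covered(interface_address, netmask, candidate):
    """Return True if candidate lies in the network address/netmask."""
    network = ipaddress.ip_network(
        f"{interface_address}/{netmask}", strict=False
    )
    return ipaddress.ip_address(candidate) in network


# The usual loopback setup accepts all of 127.0.0.0/8.
loopback_wide = covered("127.0.0.1", "255.0.0.0", "127.1.2.3")

# With a host netmask, lo:0 accepts the VIP and nothing else.
vip_only = covered("172.17.60.201", "255.255.255.255", "172.17.60.201")
neighbour = covered("172.17.60.201", "255.255.255.255", "172.17.60.202")

# With the network's real netmask, lo:0 would claim the whole network.
too_wide = covered("172.17.60.201", "255.255.0.0", "172.17.1.1")
print(loopback_wide, vip_only, neighbour, too_wide)
```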

Testing and Debugging
Testing can be done by connecting to 172.17.60.201:80 from any network.
Debugging can be done using ipvsadm and packet tracing as per LVS-NAT. However, note that when the packets are forwarded no address translation takes place. Also note that the return packets are not handled by LVS, as they are sent directly to the end user by the real server, so the outgoing packet and byte statistics will be zero.

LVS Tunnel

Figure 5: LVS Tunnel Example: Same Topology as the LVS Direct Routing Example
LVS tunnelling works in a very similar manner to direct routing. The main difference is that packets are forwarded to the real servers using IP encapsulated in IP, rather than just sending a new ethernet frame. The main advantage of this is that real servers may be on a different network to the linux director.

Linux Director

Enable IP forwarding. This can be done by adding the following to /etc/sysctl.conf and then running sysctl -p.
net.ipv4.ip_forward = 1

Bring up 172.17.60.201 on eth0:0. Again, this is best done as part of the networking configuration of your system, but it can also be done manually.
ifconfig eth0:0 172.17.60.201 netmask 255.255.0.0 broadcast 172.17.255.255

Configure LVS
ipvsadm -A -t 172.17.60.201:80
ipvsadm -a -t 172.17.60.201:80 -r 172.17.60.199:80 -i
ipvsadm -a -t 172.17.60.201:80 -r 172.17.60.200:80 -i

If you wish to use the linux director as a gateway router for the real servers, which is not necessary, please see the information on how to patch the kernel to do this in the direct routing section.

Real Servers

Make sure return packets are not routed through the linux director unless you have patched the kernel as described in the direct routing section.

Make sure that the desired daemon is running on port 80 to accept connections from the end users.

Bring up 172.17.60.201 on tunl0. Again, this is best done as part of the networking configuration of your system, but it can also be done manually.
ifconfig tunl0 172.17.60.201 netmask 255.255.255.255

Enable forwarding and hide the tunl0 interface. This can be done by adding lines to /etc/sysctl.conf and then running sysctl -p.
net.ipv4.ip_forward = 1
# Enable configuration of hidden devices
net.ipv4.conf.all.hidden = 1
# Make the tunl0 interface hidden
net.ipv4.conf.tunl0.hidden = 1

Testing and Debugging
Testing can be done by connecting to 172.17.60.201:80 from any network. Debugging is as per LVS direct routing.

High Availability
LVS is an effective way to load balance networked services. Typically this means that several servers will act, as far as end users are concerned, as if they were a single server. Unfortunately, the more servers that are in the system, the greater the chance that a single server will fail. Thus, it is important to make use of high availability techniques to ensure that the virtual service is maintained even if individual servers fail.

Heartbeat
Heartbeat can be used to monitor a pair of linux directors and ensure that one of them owns the VIP at any given time. It works by each host periodically sending a heartbeat message. If no heartbeat message is received for a predetermined period of time then the host is considered to have failed. When this occurs resources can be taken over. Heartbeat has a modular design that allows arbitrary resources to be defined.
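
The failure detection rule can be sketched in a few lines. The `PeerMonitor` class is hypothetical and only models the timing rule. The node names and the 10 second dead time are borrowed from this tutorial's sample heartbeat configuration, and times are plain numbers of seconds supplied by the caller.

```python
class PeerMonitor:
    """Toy failure detector: a node is failed once it has been silent
    for longer than deadtime seconds."""

    def __init__(self, deadtime):
        self.deadtime = deadtime
        self.last_seen = {}  # node name -> time of last heartbeat message

    def heartbeat(self, node, now):
        self.last_seen[node] = now

    def failed_nodes(self, now):
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.deadtime
        )


monitor = PeerMonitor(deadtime=10)
monitor.heartbeat("walter", now=0)
monitor.heartbeat("wendy", now=0)
monitor.heartbeat("wendy", now=8)   # wendy keeps sending, walter goes silent

still_ok = monitor.failed_nodes(now=9)
after_deadtime = monitor.failed_nodes(now=11)
print(still_ok, after_deadtime)
```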
For the sake of this discussion we will be using an IP address as a resource. When failover occurs the IP address is obtained using a method known as IP address takeover. This works by the newly activated linux director sending gratuitous ARP packets for the VIP. All hosts on the network should receive these ARP packets and thus send subsequent packets for the VIP to the new linux director.
Heartbeat can be obtained from www.linux-ha.org. It can also be installed by using the packages provided or built from source using the following commands.
./ConfigureMe build
make
make install

Sample Configuration

Figure 6: Heartbeat Example
Configuration is done using three files that can be found in /etc/ha.d.

ha.cf: This configures the base parameters for heartbeat such as which interfaces to use for communication, how often to send messages and where to write logs to. Note that the node names used must match the output of uname -n on the member nodes.
logfacility local0
keepalive 2
deadtime 10
warntime 10
initdead 10
nice_failback on
mcast eth0 225.0.0.7 694 1 1
node walter
node wendy

haresources: Sets the resources that are managed by heartbeat.
walter 172.17.60.201/24/eth0

authkeys: Sets the security mechanism for inter-heartbeat communication. This file must be mode 600.
auth 2
2 sha1 ultramonkey

LVS should be configured the same way on both linux directors. For this example the LVS tunnel configuration discussed earlier will be used. Direct Routing and NAT may also be used.
As the VIP, 172.17.60.201, is managed by heartbeat it should not be brought up on the linux directors by other means.
Heartbeat should be started on both linux directors. After a few moments the VIP should be brought up on whichever linux director is the master.

Testing and Debugging
Reboot the active linux director and observe that the other linux director takes over. You can examine the progress of the takeover by examining the logs sent to syslog, typically found in /var/log/messages. As nice_failback is on, the currently active linux director will now act as the master and when the failed linux director comes back online it will act as a standby.

Ipfail
The design of heartbeat is such that if any communication channel is available to a host, then it will be considered to be available. This is not always the desired behaviour. For example, if a pair of hosts have links on the internal and external network, it may be desirable for failover to occur if either link fails on one host. After all, it can no longer route traffic between end users and the real servers.

Figure 7: Heartbeat without IPfail
The ipfail plug-in for heartbeat makes this possible by monitoring one or more external hosts known as ping nodes. Typically this would be a router or the switch itself. The ping node is treated as a quorum device. That is, if a host cannot access a ping node, it is not eligible to hold any resources. Thus, if an interface fails on the active linux director, then one of the ping nodes should become unavailable and failover will occur.
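
The eligibility rule can be sketched as follows. This is a simplified reading of the paragraph above and not ipfail's actual algorithm. The function names and the `can_reach` callback are hypothetical, and the host names and ping node address are borrowed from this tutorial's sample configuration.

```python
def is_eligible(host, ping_nodes, can_reach):
    """A host may hold resources only if it can reach every ping node."""
    return all(can_reach(host, node) for node in ping_nodes)


def resource_holder(active, standby, ping_nodes, can_reach):
    """Keep resources on the active host unless it has lost a ping node."""
    if is_eligible(active, ping_nodes, can_reach):
        return active
    if is_eligible(standby, ping_nodes, can_reach):
        return standby
    return None  # neither host can see the ping nodes


ping_nodes = ["172.17.0.254"]

# Simulated link state: walter's external interface has failed.
links = {
    ("walter", "172.17.0.254"): False,
    ("wendy", "172.17.0.254"): True,
}


def can_reach(host, node):
    return links[(host, node)]


holder = resource_holder("walter", "wendy", ping_nodes, can_reach)
print(holder)
```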

Figure 8: Heartbeat with IPfail
The ipfail module is shipped as part of heartbeat. Additional information is available from http://pheared.net/devel/c/ipfail/. In the long term this will be integrated into the heartbeat documentation.

Sample Configuration
To use ipfail with the heartbeat setup discussed previously, the following should be added to heartbeat's ha.cf file.
ping 172.17.0.254
respawn hacluster /usr/lib/heartbeat/ipfail

A ping directive should be added for each ping node. I have only defined one for the external network, as there are no suitable quorum devices on the internal network in this demonstration.
The respawn directive tells heartbeat to run /usr/lib/heartbeat/ipfail as user hacluster, to rerun it if it exits with a status other than 100, and to kill it when heartbeat exits.
After adding these options heartbeat needs to be restarted.
/etc/init.d/heartbeat restart

Testing and Debugging
Testing and debugging can be done as per Heartbeat itself.

Ldirectord
Heartbeat is used to monitor the health of linux directors. Ldirectord can be used to monitor the health of real servers and manipulate the LVS kernel table accordingly. Ldirectord and heartbeat are often used in tandem to create a high availability LVS cluster.
Ldirectord checks services on the real servers by connecting to them, making a known request and checking the result for a known string. Checks are provided for HTTP, HTTPS, FTP, IMAP, POP, SMTP, LDAP and NNTP. Additional checks can be added by modifying the code, which is usually quite straightforward. In fact many of the checks incorporated by ldirectord have been supplied as patches by users.
The check semantics above are known as a negotiate check. Another type of check, the connect check, simply checks to make sure a connection can be opened to the service on the real server. This is useful if there is not a check for the protocol supplied by ldirectord.
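
The difference between the two check types can be sketched without touching the network. The functions below are hypothetical and are not ldirectord's code. The `fetch` and `connect` callables stand in for real protocol clients, and the fake services simulate a healthy server, a server returning the wrong page, and a server that refuses connections.

```python
def negotiate_check(fetch, request, receive):
    """Make a known request and look for a known string in the result."""
    try:
        body = fetch(request)
    except OSError:
        return False  # could not even talk to the service
    return receive in body


def connect_check(connect):
    """Only verify that a connection to the service can be opened."""
    try:
        connect()
    except OSError:
        return False
    return True


def healthy(path):
    return "<html>Test Page</html>"


def wrong_content(path):
    return "<html>Internal Server Error</html>"


def down(path):
    raise ConnectionRefusedError("connection refused")


results = [
    negotiate_check(service, "index.html", "Test Page")
    for service in (healthy, wrong_content, down)
]
# A connect check passes for the server with the wrong page, because the
# connection itself still succeeds.
connect_ok = connect_check(lambda: wrong_content("index.html"))
print(results, connect_ok)
```

The server returning the wrong page passes the connect check but fails the negotiate check, which is why the negotiate check is preferable wherever the protocol is supported.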

Sample Configuration
Ldirectord is configured using the ldirectord.cf file. It has global directives which either set global options, such as where to log errors to, or set defaults for the virtual services. Each virtual section describes a virtual service provided by LVS and contains the real servers which are checked.
# Global Directives
checktimeout=10
checkinterval=2
autoreload=no
logfile="local0"
quiescent=yes
# Virtual Server for HTTP
virtual=172.17.60.201:80
        fallback=127.0.0.1:80
        real=192.168.6.4:80 masq
        real=192.168.6.5:80 masq
        service=http
        request="index.html"
        receive="Test Page"
        scheduler=rr
        protocol=tcp
        checktype=negotiate

Ldirectord may be started by running the ldirectord command, the ldirectord init script or by adding it as a resource to heartbeat. There is no particular advantage to the latter as ldirectord can happily run on the master and standby linux directors at the same time.
Once ldirectord has started the LVS kernel table will be populated.
ipvsadm -L -n
IP Virtual Server version 1.0.7 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port     Forward Weight ActiveConn InActConn
TCP  172.17.60.201:80 rr
  -> 192.168.6.5:80         Masq    1      0          0
  -> 192.168.6.4:80         Masq    1      0          0
  -> 127.0.0.1:80           Local   0      0          0

By default ldirectord uses the quiescent feature of LVS to add and remove real servers. That is, when a real server is to be removed its weight is set to zero and it remains part of the virtual service. This has the effect that existing connections to the real server may continue, but no new connections will be allocated. This is particularly useful for gracefully taking real servers offline. This behaviour can be changed to remove the real server from the virtual service by setting the global configuration option quiescent=no.
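
The two behaviours can be contrasted with a toy model of the LVS table as a mapping from real server to weight. The `mark_failed` function is hypothetical and only mirrors the description above.

```python
def mark_failed(table, server, quiescent=True):
    """Take a real server out of service in a toy LVS table.

    table maps real server -> weight. A copy is returned so that the
    two behaviours can be compared side by side.
    """
    updated = dict(table)
    if quiescent:
        # Weight zero: the entry stays, so existing connections continue,
        # but the scheduler allocates no new connections to it.
        updated[server] = 0
    else:
        # The entry is removed from the virtual service outright.
        updated.pop(server, None)
    return updated


table = {"192.168.6.4:80": 1, "192.168.6.5:80": 1}
quiesced = mark_failed(table, "192.168.6.4:80", quiescent=True)
removed = mark_failed(table, "192.168.6.4:80", quiescent=False)
print(quiesced)
print(removed)
```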

Testing and Debugging
Testing can be done by bringing the real servers up and down: by changing the contents of the known URL that is being requested such that it does not contain the expected string, by killing the daemon that serves end users' requests, or by powering down the host altogether.
In each case ldirectord should update the LVS kernel table accordingly, which can be examined using ipvsadm -L -n. Ldirectord also logs its activities. The configuration above sets these logs to be written to syslog; typically they will show up in /var/log/syslog.
For extra debugging information ldirectord can be run in debugging mode, in which case it will log verbosely to the terminal and will not detach from the terminal. This is done by using the -d command line option. This example starts ldirectord in debugging mode with the configuration file ldirectord.cf, which should be in /etc/ha.d/. Debugging can be terminated using ctrl-c.
ldirectord -d ldirectord.cf start

Keepalived
Keepalived provides an implementation of the VRRPv2 protocol, which is specified in RFC 2338 [1]. It is an alternative method of managing a VIP on a network so that it is owned by only one host at any given time. This can be used to switch between active and standby linux directors.
VRRPv2 works on a simple state engine. Hosts advertise their availability. The highest priority host wins the resource and advertises this fact. All other nodes then go into the backup state.
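
The outcome of this state engine can be sketched as a simple election. The host names, priorities and function names below are hypothetical, and the sketch ignores protocol details such as advertisement timers and tie-breaking.

```python
def elect_master(advertised):
    """Return the advertising host with the highest priority, if any.

    advertised maps host name -> priority for hosts currently advertising.
    """
    if not advertised:
        return None
    return max(advertised, key=advertised.get)


def vrrp_states(advertised):
    """The winner becomes master, every other host goes into backup."""
    master = elect_master(advertised)
    return {
        host: ("MASTER" if host == master else "BACKUP")
        for host in advertised
    }


both_up = vrrp_states({"director-a": 150, "director-b": 100})
# director-a stops advertising, so the remaining host wins the resource.
after_failure = vrrp_states({"director-b": 100})
print(both_up, after_failure)
```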
There is another implementation of VRRPv2 for Linux from http://off.net/jme/vrrpd/. However, at the time of writing the keepalived implementation appears to be much more complete.
Keepalived also features service-level monitoring of real servers and manipulates the LVS kernel table accordingly. The service tests that are implemented are:

TCP_CHECK: Check to make sure a connection can be opened to the service on the real server.
HTTP_GET: Fetch a known URL from the real server and compare the checksum of the page to the expected checksum.
SSL_GET: SSL version of HTTP_GET.
MISC_CHECK: Check using an external script.

It also provides an API to implement new checks.
The VRRPD and LVS/Health Check features can be used individually or in combination.
Keepalived is available from keepalived.sourceforge.net. Its compilation is quite straightforward using ./configure and make. A .spec file for Red Hat is also provided. Packages for Debian are available in the main Debian tree.
To configure keepalived, /etc/keepalived/keepalived.conf should be modified. This file is divided up into sections.

global_defs: Global definitions such as where to send email alerts, if at all, and the name of the cluster.
vrrp_instance: Encapsulates a set of virtual IP addresses associated with a particular interface. Each instance should have a unique id.
vrrp_sync_group: Groups together vrrp_instances such that all the instances will be owned by a single host at any given time. This can be used to ensure that virtual IP addresses on different interfaces always end up on the same machine.
virtual_server: A virtual service handled by LVS.
real_server: A real server to check. Contained within a virtual_server.

Note that the VRRP implementation works on a master/slave system, so each vrrp_instance should be marked as a "MASTER" on one node and a "SLAVE" on the other nodes. During testing, it did not appear possible to configure keepalived to have behaviour analogous to heartbeat's nice_failback, that is, for a node to hold a resource until it fails, in which case another node takes it over until it in turn fails. It was also found that the slave nodes should be given a lower priority than the master to avoid spurious failovers.

Sample Configuration
For the sake of brevity, the example configuration files are in Appendix A.
To create the checksums for the configuration file, the genhash programme can be used. Genhash will connect to the server and request the URL. It will then produce a lot of output, showing the data that is being used to construct the checksum. The final line is the checksum which should be included in keepalived.conf. For example, to generate the hash for the URL http://192.168.6.5:80/ the following command is used.
genhash -s 192.168.6.5 -p 80 -u /
[ lots of output omitted ]
90bfbce6bc089a41f1fddca9aeaba452
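
The comparison made by an HTTP_GET style check can be sketched as below. This assumes the checksum is an MD5 digest of the fetched page, which is consistent with the 32 hexadecimal digits shown above. The helper names and page contents are hypothetical, and the real check may hash the response differently.

```python
import hashlib


def page_digest(body):
    """Return the MD5 digest of a page body as 32 hexadecimal digits."""
    return hashlib.md5(body).hexdigest()


def http_get_check(body, expected_digest):
    """Pass only if the fetched page hashes to the expected digest."""
    return page_digest(body) == expected_digest


# Digest recorded while the real server was known to be healthy.
expected = page_digest(b"<html>Test Page</html>")

unchanged_ok = http_get_check(b"<html>Test Page</html>", expected)
changed_ok = http_get_check(b"<html>Error</html>", expected)
print(len(expected), unchanged_ok, changed_ok)
```

Because the whole page is hashed, any change to the page content requires the expected checksum in the configuration to be regenerated.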

To start keepalived run the keepalived daemon or init script. Messages are logged to syslog and typically can be found in /var/log/messages. After a few moments the LVS kernel table should be populated on both machines. This can be inspected using ipvsadm.
ipvsadm -L -n
IP Virtual Server version 1.0.7 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port     Forward Weight ActiveConn InActConn
TCP  172.17.60.201:80 lc
  -> 192.168.6.5:80         Masq    1      0          0
  -> 192.168.6.4:80         Masq    1      0          0

On the master machine the virtual IP addresses should have been added. This can be checked using the ip command.
ip addr sh
[ lo: omitted ]
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:50:56:4f:30:19 brd ff:ff:ff:ff:ff:ff
    inet 172.17.60.207/16 brd 172.17.255.255 scope global eth0
    inet 172.17.60.201/32 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
    link/ether 00:50:56:4f:30:1a brd ff:ff:ff:ff:ff:ff
    inet 192.168.6.3/24 brd 192.168.6.255 scope global eth1
    inet 192.168.6.1/32 scope global eth1

If a failover occurs the same addresses should appear on the slave, and then back on the master once it is restored.

New Developments
Active Feedback
Ldirectord, keepalived and other tools monitor the health of real servers. The weight parameter allows the relative capacity of real servers to be taken into account. However, these tools do not monitor the real-time serving capacity of the real servers and do not allocate connections proportional to this.
This can be particularly problematic in situations where some connections require significantly more resources on a real server than others. For instance, some connections may fetch a plain HTML file from disk, or more likely memory, while other connections involve processing of information, such as scaling an image or retrieving part of the page from a database.
Feedbackdimplementsaframeworkthatallowsrealtimeinformationfromfromtherealserversto
determinehowmanyconnectionstheyshouldbeallocatedrelativetoeachother.Assuch,feedbackd
implementsanactivefeedbacksystem.Feedbackdisavailablefrom
http://www.redfishsoftware.com.au/projects/feedbackd/
Feedbackdhastwokeycomponents,feedbackdagentwhichrunsontherealserversandmonitorstheir
servingcapacity.Themonitoringismodularsoarbitrarycheckscanbedefined.Thedefaultcheck
suppliedsimplymonitorsCPUloadusing/proc/stat.Thesecondcomponent,feedbackdmasterrunson
thelinuxdirectors.Itcollatesinformationfromthefeedbackdagent'swhichconnectandmanipulates
theweightsoftherealserversintheLVSkerneltableaccordingly.
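To make the idea of load-driven weights concrete, here is a minimal Python sketch that derives a CPU busy fraction from two samples of the aggregate cpu line in /proc/stat and maps it to an LVS weight. The linear mapping, the max_weight default and all function names are assumptions made for illustration. This is not feedbackd's actual algorithm.

```python
def read_cpu_times(stat_text: str):
    """Return (idle, total) jiffies from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            values = [int(v) for v in fields[1:]]
            idle = values[3]  # columns are user, nice, system, idle, ...
            return idle, sum(values)
    raise ValueError("no aggregate cpu line found")


def cpu_busy_fraction(before: str, after: str) -> float:
    """Fraction of time the CPU was not idle between two samples."""
    idle0, total0 = read_cpu_times(before)
    idle1, total1 = read_cpu_times(after)
    elapsed = total1 - total0
    if elapsed <= 0:
        return 0.0
    return 1.0 - (idle1 - idle0) / elapsed


def weight_for_load(busy: float, max_weight: int = 100) -> int:
    """Map a busy fraction to a weight: an idle server gets max_weight.

    The result never drops below 1, because a weight of 0 would stop
    LVS from sending the server any new connections at all.
    """
    busy = min(max(busy, 0.0), 1.0)
    return max(1, round(max_weight * (1.0 - busy)))
```

On a real server the two samples would come from reading /proc/stat a few seconds apart, and the resulting weight is what an agent would report to the master.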

A little massaging was required to get feedbackd to compile. Minor
enhancements were also made to allow feedbackd-master to be restarted without giving "address in use"
errors and to allow feedbackd-agent to time out the master. The latter is a workaround to allow
feedbackd to work with Active/Stand-By Linux Directors. Both of these changes have been forwarded
to the author and will hopefully show up in the next version.
The only configuration required for feedbackd-master is to establish the LVS virtual services that will
be used. This is done using ipvsadm. There is no need to add the real servers as this will be done by
feedbackd-master by matching the protocol and port information sent by the feedbackd-agents running
on the real servers. As such feedbackd can be used to add and remove real servers on the fly without any
configuration of the linux director. For example:
ipvsadm -A -t 172.17.60.201:80

To start feedbackd-master simply run the daemon on the command line. No init script is supplied with
the current distribution.
Feedbackd-agent is configured by modifying /etc/feedbackd-agent.conf. In this file the
Linux Director running feedbackd-master is specified, as are the services that the real server should
join.
director=192.168.6.1
service=http
protocol=TCP
port=80
module=cpuload.so
forwarding=NAT

Again, to run feedbackd-agent simply run the command on the command line.

Testing
As a primitive test, one of the real servers can be loaded manually and the effects of this on the LVS
table on the linux director can be observed using ipvsadm. An in-depth analysis of the effects of using
feedbackd can be found in Jeremy Kerr's paper on feedbackd [3].

Connection Synchronisation: Existing Solution
Configuring two linux directors in an active/stand-by configuration is a useful way to provide high
availability. If the active linux director fails, the stand-by can automatically take over the IP address of
the virtual services and the cluster can continue to function. However, when such a failover occurs
connections that are currently in progress are terminated.
This is because the stand-by linux director does not know anything about these connections. By
synchronising connection information between the active and stand-by linux directors this problem can
be averted. Thus, when a stand-by linux director becomes the active linux director, it will have
information about the currently active connections and will be able to continue to forward their packets.
The critical piece of information required is which real server to forward packets for a given connection
to. This information is quite small and thus can be synchronised with little overhead.
There is an implementation of connection synchronisation within the current LVS code. It works on a
master/slave system whereby the linux director configured as the master sends synchronisation
information for connections. The linux directors configured as slaves receive this information and
update their LVS connection table accordingly.
A connection is synchronised once the number of packets passes a threshold (3) and then every
frequency (50) packets. The synchronisation information for the connections is added to a queue and
periodically flushed. The synchronisation information for up to 50 connections can be packed into a
single packet. The packets are sent to the slaves using multicast.
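The threshold, frequency and packing rules can be sketched in a few lines of Python. This is one reading of the description above, in which a connection is first synchronised on its third packet and then again every 50 packets after that. The constant and function names are hypothetical and the kernel code may count slightly differently.

```python
SYNC_THRESHOLD = 3         # packets seen before the first synchronisation
SYNC_FREQUENCY = 50        # packets between later synchronisations
MAX_CONNS_PER_PACKET = 50  # connection entries per synchronisation packet


def should_sync(packet_count: int) -> bool:
    """Decide whether a connection is queued for synchronisation."""
    if packet_count < SYNC_THRESHOLD:
        return False
    return (packet_count - SYNC_THRESHOLD) % SYNC_FREQUENCY == 0


def batch_queue(queue):
    """Split queued entries into groups that each fit in one packet."""
    return [queue[i:i + MAX_CONNS_PER_PACKET]
            for i in range(0, len(queue), MAX_CONNS_PER_PACKET)]
```

Under this reading very short connections, such as a single small HTTP request, may never be synchronised, which keeps overhead low for traffic that would not survive a failover anyway.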
Sending and receiving synchronisation information by the master and slaves respectively is done by a
kernel thread. The kernel synchronisation thread is started on the master and slaves using the following
commands.
ipvsadm --start-daemon master # Run on the Master Linux Director
ipvsadm --start-daemon backup # Run on the Slave Linux Director

Testing and Debugging
The synchronisation of connections can be monitored using ipvsadm -L -c -n, which lists the LVS
connection table. Connections should first appear on the master linux director. Then after a few
moments, when synchronisation has occurred, they should also appear on the slaves.
ipvsadm -L -c -n # On the Master Linux Director
IPVS connection entries
pro expire state       source             virtual          destination
TCP 01:00  TIME_WAIT   172.16.4.222:34939 172.17.60.201:80 192.168.6.5:80
TCP 01:01  TIME_WAIT   172.16.4.222:34940 172.17.60.201:80 192.168.6.4:80
TCP 15:00  ESTABLISHED 172.16.4.222:34941 172.17.60.201:80 192.168.6.5:80
ipvsadm -L -c -n # On the Slave Linux Director
IPVS connection entries
pro expire state       source             virtual          destination
TCP 01.20  ESTABLISHED 172.16.4.222:34939 172.17.60.201:80 192.168.6.5:80
TCP 01.23  ESTABLISHED 172.16.4.222:34940 172.17.60.201:80 192.168.6.4:80
TCP 08.99  ESTABLISHED 172.16.4.222:34941 172.17.60.201:80 192.168.6.5:80

The output shows two connections on the master linux director that are in the TIME_WAIT state, that
is, they have been closed by the end user. It also shows one connection in the ESTABLISHED state,
that is, the end user and the real server still have an open connection to each other.
Each of these connections has been synchronised to the slave. Note that on the slave, all the
connections are in the ESTABLISHED state. This is due to an optimisation in the LVS code whereby
connections are only synchronised when they are in the ESTABLISHED state. This cuts down
unnecessary synchronisation overhead as the state of the connections on the slave is not critical.
You can further monitor which linux director is handling connections by adding the following iptables
rule to each linux director.
iptables -A INPUT -d 172.17.60.201 -j ACCEPT

This can be monitored using iptables -L INPUT -v -n.
iptables -L INPUT -v -n # On the Active Linux Director
Chain INPUT (policy ACCEPT 1553 packets, 211K bytes)
 pkts bytes target prot opt in out source    destination
    5  1551 ACCEPT all  --  *  *   0.0.0.0/0 172.17.60.201
iptables -L INPUT -v -n # On the Stand-By Linux Director
Chain INPUT (policy ACCEPT 2233 packets, 328K bytes)
 pkts bytes target prot opt in out source    destination
    0     0 ACCEPT all  --  *  *   0.0.0.0/0 172.17.60.201

To test that connection synchronisation is working correctly, open a connection to the virtual service
while the master linux director is active. Then cause a failover to occur. This can be done by a variety of
means including powering down the master linux director. At this point the connection should stall.
Once the VIP has failed over to the slave linux director the connection should continue.
Streaming is a useful way to test this, as streaming connections by their nature are open for a long time.
It also provides intuitive feedback as the video and/or music pause and then continue. Note that
increasing the buffer size of the streaming client software can eliminate the pause.

Problems
The main problem with this implementation is the master/slave relationship. If the master Linux
Director fails and then comes back on line, connections to the slave will not be synchronised to the
master. The next time that a failover occurs, this will cause connections to be terminated. This
could be avoided by starting and stopping the master and backup daemons as failovers occur, but a
peer-to-peer relationship between the synchronisation daemons would be a cleaner approach.

Connection Synchronisation: New Solution
To improve this situation I have written a new synchronisation daemon for LVS. It works on a
peer-to-peer basis where any node may send or receive synchronisation information.
The new synchronisation daemon runs in user space rather than the kernel. Information is received
from LVS in the kernel via a netlink socket. It is then sent to other nodes using multicast UDP. When a
daemon receives information over multicast it reverses this process by sending the information into
LVS in the kernel via a netlink socket.
The idea of moving the code to user space was to allow more sophisticated synchronisation
processing to take place. This is easier to implement and in many ways more appropriately done in user
space than the kernel. Given that synchronisation is not a particularly intensive task, there is no
particular advantage to keeping it in the kernel.
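The information carried for each connection is small: the protocol plus the client, virtual and real server endpoints. As a rough illustration, the Python sketch below packs one such record into bytes suitable for a UDP datagram and unpacks it again. The 19-byte layout and all names are invented for this example and do not reflect the wire format used by LVS or ip_vs_user_sync_simple.

```python
import socket
import struct
from typing import NamedTuple

# Hypothetical layout in network byte order: protocol, then three
# (IPv4 address, port) pairs for client, virtual service and real server.
RECORD = struct.Struct("!B4sH4sH4sH")


class Conn(NamedTuple):
    protocol: int
    client: str
    client_port: int
    virtual: str
    virtual_port: int
    real: str
    real_port: int


def pack_conn(conn: Conn) -> bytes:
    """Serialise one connection entry."""
    return RECORD.pack(
        conn.protocol,
        socket.inet_aton(conn.client), conn.client_port,
        socket.inet_aton(conn.virtual), conn.virtual_port,
        socket.inet_aton(conn.real), conn.real_port)


def unpack_conn(data: bytes) -> Conn:
    """Rebuild a connection entry from its serialised form."""
    proto, caddr, cport, vaddr, vport, raddr, rport = RECORD.unpack(data)
    return Conn(proto,
                socket.inet_ntoa(caddr), cport,
                socket.inet_ntoa(vaddr), vport,
                socket.inet_ntoa(raddr), rport)
```

Because every peer can both pack and unpack records, the same code serves the sending and receiving side, which is what makes a peer-to-peer arrangement straightforward in user space.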
The code comprises:

* Modified LVS kernel modules to allow the synchronisation daemon to get information about
  connections. This has been done by allowing LVS to have arbitrary synchronisation methods
  defined and inserted as modules. The default behaviour is the existing master/slave in-kernel
  daemons.

* Kernel patch to register the new netlink socket.

* libip_vs_user_sync: Convenience library for communicating using the netlink socket.

* ip_vs_user_sync_simple: Simple synchronisation daemon implemented using this framework.

Available from www.ultramonkey.org

Running
Installing and compiling is a bit tricky as this is new code and there are a number of support libraries
required. Once built, make sure that the LVS kernel synchronisation daemons are not running using
ipvsadm --stop-daemon and start the user space daemon from the command line or using the
ip_vs_user_sync_simple init script.

Testing and Debugging
Debugging messages for ip_vs_user_sync_simple are sent to syslog by default and are typically written
to /var/log/messages. If the daemon is not functioning correctly, it is recommended to run it
with the debug option enabled and have messages logged to the terminal. This can be done by
modifying ip_vs_user_sync_simple.conf or on the command line.
ip_vs_user_sync_simple --debug --log_facility -

Testing is as for the existing connection synchronisation code described previously. However, as there
is no master/backup relationship, connections can be maintained through multiple failovers.

Active-Active
Active/Stand-By Linux Directors offer a good way to provide high availability. However, if one assumes
that a linux director does not fail or get taken down for maintenance very often, then most of the time
one linux director will be idle. Arguably this is a waste of resources. It also means that the maximum
throughput of the network is limited to that of one linux director.
Having Active-Active linux directors addresses this problem by allowing more than one linux director
to load balance connections, for the same virtual services, at the same time.

Figure 9: Active-Active Block Diagram
I have made an implementation of this which works as follows:

* Each linux director is given the same hardware (MAC) address and IP address.
    - This means that all the linux directors will receive packets for connections for the virtual
      service.
    - It also means that there is no longer any need for IP address failover or VRRPv2.

* A heartbeat helper, Saru, runs with heartbeat on each linux director.
    - Heartbeat doesn't allocate any resources, it just provides a mechanism to determine which
      linux directors are available.
    - Saru uses this information to divide the space of all possible incoming connections
      between the linux directors.
    - This is done by electing a master which will make the allocations.
    - The allocations are done by dividing up blocks of source or destination ports or
      addresses.

* A netfilter kernel module is used to only accept packets as dictated by Saru.
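The division of the connection space can be illustrated with a short Python sketch in which blocks of source ports are shared out among the available linux directors and each director accepts only packets that fall in its own blocks. The block size, the round-robin assignment and the function names are assumptions for illustration. Saru's real allocation scheme is not described here.

```python
BLOCK_SIZE = 4096   # hypothetical width of one block of source ports
PORT_SPACE = 65536  # number of possible TCP or UDP source ports


def allocate_blocks(directors):
    """Assign every block of source ports to one available director.

    This plays the role of the elected master: given the set of
    directors that heartbeat reports as available, it returns a map
    from block number to the director responsible for that block.
    """
    if not directors:
        raise ValueError("at least one director must be available")
    ordered = sorted(directors)
    return {block: ordered[block % len(ordered)]
            for block in range(PORT_SPACE // BLOCK_SIZE)}


def accepts(director, src_port, allocation):
    """Filter decision: does this director own the packet's block?"""
    return allocation[src_port // BLOCK_SIZE] == director
```

Every packet is seen by all directors, so correctness depends on exactly one director owning each block. When a director fails the master recomputes the allocation over the remaining nodes.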

Running
Saru can be run directly from the command line or using its own init script. Messages are logged to
syslog after startup and these typically appear in /var/log/messages. Saru can log more
verbosely by setting the debug option, either in saru.conf or directly on the command line. For
debugging purposes this option is recommended in conjunction with having saru log to the terminal.
saru --debug --log_facility -

By default saru waits 30 seconds after startup before joining the cluster. This is to allow time for
connection synchronisation to occur when a Linux Director boots up. This can be configured at run
time, again either in saru.conf or directly on the command line.
The MAC and IP address of an interface can be set using the ip command.
ip link set eth0 down
ip link set eth0 address 00:50:56:14:03:40
ip link set eth0 up
ip route add default via 172.16.0.254
ip addr add dev eth0 192.168.20.40/24 broadcast 192.168.20.255

Rules to filter out all traffic to the VIP that is not accepted by Saru are inserted using the iptables
command. These rules assume that connection synchronisation will be used. If this is not the case then
netfilter's connection tracking should be used to ensure that a given connection will always be handled
by the same linux director.
iptables -F
iptables -A INPUT -d 172.17.60.201 -p tcp -m saru --id 1 -j ACCEPT
iptables -A INPUT -d 172.17.60.201 -p udp -m saru --id 1 -j ACCEPT
iptables -A INPUT -d 172.17.60.201 -p icmp -m icmp --icmp-type echo-request \
    -m saru --id 1 --sense src-addr -j ACCEPT
iptables -A INPUT -d 172.17.60.201 -p icmp -m icmp --icmp-type ! echo-request \
    -j ACCEPT
iptables -A INPUT -d 172.17.60.201 -j DROP

If LVS-NAT is being used then the following rules are also required to prevent all the Linux Directors
sending replies on behalf of the real servers.
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -d 192.168.6.0/24 -j ACCEPT
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -m state --state INVALID \
    -j DROP
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -m state --state ESTABLISHED \
    -j ACCEPT
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -m state --state RELATED \
    -j ACCEPT
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -p tcp -m state --state NEW \
    --tcp-flags SYN,ACK,FIN,RST SYN -m saru --id 1 -j MASQUERADE
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -p udp -m state --state NEW \
    -m saru --id 1 -j MASQUERADE
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -p icmp \
    -m icmp --icmp-type echo-request -m state --state NEW \
    -m saru --id 1 --sense dst-addr -j MASQUERADE
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -p icmp \
    -m icmp --icmp-type ! echo-request -m state --state NEW \
    -j MASQUERADE
iptables -t nat -A POSTROUTING -s 192.168.6.0/24 -j DROP

Testing and Debugging
Which linux director is accepting packets for an individual connection can be monitored using
iptables -L INPUT -n -v. The output below shows a connection that was load balanced by
Linux Director A.
iptables -L INPUT -n -v # On Linux Director A
Chain INPUT (policy ACCEPT 92541 packets, 14M bytes)
 pkts bytes target prot opt in out source    destination
    5  1551 ACCEPT tcp  --  *  *   0.0.0.0/0 172.17.60.201 saru id 1 sense src-port
    0     0 ACCEPT udp  --  *  *   0.0.0.0/0 172.17.60.201 saru id 1 sense src-port
    0     0 ACCEPT icmp --  *  *   0.0.0.0/0 172.17.60.201 icmp type 8 saru id 1 sense src-addr
    0     0 ACCEPT icmp --  *  *   0.0.0.0/0 172.17.60.201 icmp !type 8
    0     0 DROP   all  --  *  *   0.0.0.0/0 172.17.60.201
iptables -L INPUT -n -v # On Linux Director B
Chain INPUT (policy ACCEPT 92700 packets, 15M bytes)
 pkts bytes target prot opt in out source    destination
    0     0 ACCEPT tcp  --  *  *   0.0.0.0/0 172.17.60.201 saru id 1 sense src-port
    0     0 ACCEPT udp  --  *  *   0.0.0.0/0 172.17.60.201 saru id 1 sense src-port
    0     0 ACCEPT icmp --  *  *   0.0.0.0/0 172.17.60.201 icmp type 8 saru id 1 sense src-addr
    0     0 ACCEPT icmp --  *  *   0.0.0.0/0 172.17.60.201 icmp !type 8
    5  1551 DROP   all  --  *  *   0.0.0.0/0 172.17.60.201

Conclusion
LVS is an effective way to implement clustering of Internet services. Tools such as heartbeat,
ldirectord and keepalived can be used to give the cluster high availability. There are a number of other
techniques that can be used to further enhance LVS clusters, including using active feedback to
determine the proportion of connections allocated to each of the real servers, as well as connection
synchronisation and active-active techniques that allow multiple linux directors to work better together.
LVS itself is a very powerful tool and has many features that were not within the scope of this
presentation. These include firewall marks to group virtual services, specialised scheduling algorithms
and various tuning parameters. Beyond that there is much scope for further expanding the functionality
of LVS to meet the new needs of users and to reflect the ever-increasing complexity of the Internet.

Sample Configuration Files for keepalived
Sample configuration file for keepalived (Master)
global_defs {
    notification_email {
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server 210.128.90.2
    smtp_connect_timeout 30
    lvs_id LVS_DEVEL
}
vrrp_sync_group VG1 {
    group {
        VI_1
        VI_2
    }
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.17.60.201
    }
}
vrrp_instance VI_2 {
    state MASTER
    interface eth1
    virtual_router_id 52
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.6.1
    }
}
virtual_server 172.17.60.201 80 {
    delay_loop 6
    lb_algo lc
    lb_kind NAT
    nat_mask 255.255.255.0
    ! persistence_timeout 50
    protocol TCP
    real_server 192.168.6.4 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                digest 55fd843c4e99e96c1ef28e7dbb10c51b
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.6.5 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                digest 90bfbce6bc089a41f1fddca9aeaba452
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    sorry_server 127.0.0.1 80
}
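The HTTP_GET parameters in a keepalived configuration describe a retrying health check: fetch the page, compare its digest, and try again after a delay if that fails. The Python sketch below shows one possible shape for that loop. It treats nb_get_retry as the total number of attempts, which is an assumption, and the fetch callable and function name are hypothetical stand-ins rather than keepalived internals.

```python
import hashlib
import time
from typing import Callable, Optional


def http_get_check(fetch: Callable[[], Optional[bytes]],
                   digest: str,
                   nb_get_retry: int = 3,
                   delay_before_retry: float = 3.0,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True if any attempt yields a body with the expected digest.

    fetch() should return the response body, or None when the request
    fails or exceeds the connect timeout.
    """
    for attempt in range(nb_get_retry):
        body = fetch()
        if body is not None and hashlib.md5(body).hexdigest() == digest:
            return True
        if attempt + 1 < nb_get_retry:
            sleep(delay_before_retry)  # wait before the next attempt
    return False
```

A server that fails every attempt would be removed from the LVS table, and if all real servers fail the sorry_server takes over the virtual service.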

Sample configuration file for keepalived (Slave)
global_defs {
    notification_email {
        [email protected]
    }
    notification_email_from [email protected]
    smtp_server 210.128.90.2
    smtp_connect_timeout 30
    lvs_id LVS_DEVEL
}
vrrp_sync_group VG1 {
    group {
        VI_1
        VI_2
    }
}
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.17.60.201
    }
}
vrrp_instance VI_2 {
    state BACKUP
    interface eth1
    virtual_router_id 52
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.6.1
    }
}
virtual_server 172.17.60.201 80 {
    delay_loop 6
    lb_algo lc
    lb_kind NAT
    nat_mask 255.255.255.0
    ! persistence_timeout 50
    protocol TCP
    real_server 192.168.6.4 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                digest 55fd843c4e99e96c1ef28e7dbb10c51b
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 192.168.6.5 80 {
        weight 1
        HTTP_GET {
            url {
                path /
                digest 90bfbce6bc089a41f1fddca9aeaba452
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    sorry_server 127.0.0.1 80
}

Bibliography
1
S. Knight et al.
RFC 2338: Virtual Router Redundancy Protocol.
http://www.ietf.org/, April 1998.
2
Y. Rekhter et al.
RFC 1918: Address Allocation for Private Internets.
http://www.ietf.org/, February 1996.
3
Jeremy Kerr.
Using dynamic feedback to optimise load balancing decisions.
http://www.redfishsoftware.com.au/projects/feedbackd/lcapaper.pdf, January 2003.
4
Netfilter Core Team.
Netfilter: firewalling, NAT and packet mangling for Linux 2.4.
http://www.netfilter.org/, 2003.

Horms 2004-06-23
