Lightning Fast Synch With Csync2 and Lsyncd - Axivo Community PDF
We recently had a client who hired us to build his web site cluster from scratch. Among the many performance problems they were struggling with on their vBulletin based site, one was file synchronization between the web nodes. The site does not contain any physical attachments, and all static files were manually uploaded by the owner to all 8 web servers. In the past they tried to synchronize the data with the most popular solutions used by web hosts, but they ran into performance issues, and a SAN based solution was not in their budget. As a result, they were forced to store the avatars in the MySQL database, creating unnecessary stress at the database level and hurting overall site performance.
So I had to come up with a solution that does not eat server resources like an elephant, while providing instant synchronization across an entire cluster. After some research and a variety of tests, I quickly discarded similar solutions like drbd, rsync and unison. Their performance was not at the level I wanted to provide. The only client that performed flawlessly was csync2.
I selected csync2 as the prime synchronization candidate for the following reasons:
1. ease of configuration setup and the large variety of automated tasks
2. unlimited number of nodes to be synchronized
3. great flexibility in how the synchronization is performed
4. level of security used for external nodes
5. data storage of currently synced files in each node, for fast file comparison
https://www.axivo.com/community/threads/lightning-fast-synch-with-csync2-and-lsyncd.121/
11/7/13
Now that the synchronization part was covered, I had to find a way for csync2 to trigger a sync event as soon as a new file appears on any cluster node. Most Linux based operating systems ship with inotify in their kernel. Basically, inotify is a kernel subsystem that watches filesystems for changes and reports those changes to an application, in our case csync2.
With that established, I had to find a robust C based application that properly handles the inotify events and can also be daemonized, so I could run it as a Linux service. Again, I quickly discarded all Python or Perl based scripts, for the same reasons I mentioned earlier: speed and performance. The main candidates I selected were inotify-tools and lsyncd. Both are written in C and have options to run as a daemon. Based on preliminary tests, I discovered that inotify-tools would not satisfy my selection criteria, as its daemon features contained bugs. I even tried to daemonize inotifywait myself with a shell wrapper, but the performance was just not there. The server load generated was too high, so I was forced to discard it, and I notified Radu Voicila (inotify-tools developer) about the issue.
I selected lsyncd as the prime inotify candidate for the following reasons:
1. runs as a daemon, so it can be started as a Linux service
2. does not hamper local filesystem performance like other similar solutions
3. aggregates and combines events in batches, avoiding unnecessary server load
4. uses Lua as its configuration scripting tool
Once the solution logic was finished, it was time to turn everything into reality. First, I created all the necessary rpm's, allowing me to install csync2 and lsyncd on any cluster in a matter of seconds. CentOS 5.6 lacked several important packages, so I had to write everything from scratch except for xinetd, which was available in the base repository. All packages are currently available in the Axivo repository.
If you are a host provider, please contact us if you want to help us host our packages on your servers.
The dependencies I wrote/used for the csync2.x86_64 1.34 rpm are:
xinetd.x86_64 2:2.3.14+ (CentOS repo)
gnutls.x86_64 1.4.1+ (Axivo repo)
librsync.x86_64 0.9.7+ (Axivo repo)
libtasn1.x86_64 2.9+ (Axivo repo)
sqlite2.x86_64 2.8.17+ (Axivo repo)
The dependencies I wrote/used for the lsyncd.x86_64 2.04 rpm are:
rsync.x86_64 2.6.8+ (CentOS repo)
lua.x86_64 5.1.4+ (Axivo repo)
Note: lsyncd was designed mainly for rsync, so I had to make rsync a requirement of the Axivo rpm.
Next, it was time to actually install all packages and test how well they perform. For that, I set up a small 3 node cluster on an internal network and installed all of the rpm's listed above on each node:
# yum --enablerepo=axivo install csync2 lsyncd
I will explain the logic in detail, because it is important to understand how the actual setup works. For simplicity, I will call the 3 nodes apollo, chronos and hermes, the names of the test nodes I used in the cluster. See below the detailed setup steps I used for csync2 and lsyncd.
Important: Please be aware that I refer to several file locations in the next steps. These locations are unique to the Axivo rpm builds and will probably not match the locations used by a source build. If you install the Axivo rpm's, you will not encounter any setup difficulties.
Floren, May 16, 2011 #1
Floren
Axivo Developer
Csync2 Setup
Csync2 is very straightforward, but it required some precise steps once I had installed all the Axivo rpm's. The csync2 rpm automates several tasks that you have to perform manually when you install the software from source. For example, it adds the default csync2 TCP port to the /etc/services file, creates the SSL certificates (in case you connect to an external node) and manages the sqlite database needed to store the sync information for each node.
To avoid frustration, it is important that you study the Csync2 documentation and understand how its process works before you proceed with the tutorial.
1) Generate the csync2.key on the apollo node. To do that, I simply issued the command:
# csync2 -k /etc/csync2/csync2.key
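Key generation can take a while, because the system has to gather enough random data to create the key. To help it along, I simply opened a new terminal window and ran the usual commands like: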
# du -h /
If that is not enough, run top and wait until enough random data is created. You will know the process is complete once your shell prompt returns in the window where you started the csync2 key generation.
2) Edit the default /etc/csync2/csync2.cfg configuration file and insert the following content:
Code:
nossl * *;
group web
{
    host apollo;
    host chronos;
    host hermes;
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
I used this file for the initial synchronization step on all nodes, explained in the final setup steps. Personally, I did not want the encryption overhead on an internal network, so I disabled SSL with the nossl directive you noticed in the csync2 configuration file above.
3) Create a custom configuration file called csync2_apollo.cfg for the apollo node. This part is the key to the success of our solution; I will explain the logic in detail at the end of this article.
Code:
nossl * *;
group apollo
{
    host apollo;
    host (chronos);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
4) On each additional node, copy all files present in the apollo:/etc/csync2 directory. To be safe, I ran the following commands on chronos and hermes:
I did that because I wanted the SSL certificates and csync2.key to be identical on all nodes. csync2 requires this if you decide to use an SSL connection while syncing the files. Once the file transfer has completed on each node, rename the csync2 configuration files to reflect the name of each node. On the chronos node, the configuration file was renamed csync2_chronos.cfg and had the content:
Code:
nossl * *;
group chronos
{
    host chronos;
    host (hermes);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
while on the hermes node, the configuration file was renamed csync2_hermes.cfg and had the content:
Code:
nossl * *;
group hermes
{
    host hermes;
    host (apollo);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
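The three per-node files above follow one pattern: each group lists its own node plus the next node in parentheses, so the nodes form a ring (apollo → chronos → hermes → apollo). As a minimal sketch of that pattern, the helper below prints such a file for any node in an ordered list. The function name chain_config is hypothetical and not part of csync2; the key path, include and exclude values are simply copied from the configurations above.

```shell
#!/bin/sh
# chain_config NODE NODE1 NODE2 ... (hypothetical helper, not a csync2 tool)
# Prints a chained csync2 group for NODE. Its peer is the node that follows
# it in the list; the last node wraps around to the first one.
chain_config() {
    node=$1
    shift
    first=$1
    peer=
    found=0
    for n in "$@"; do
        if [ "$found" -eq 1 ]; then
            peer=$n
            break
        fi
        if [ "$n" = "$node" ]; then
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "chain_config: $node is not in the node list" >&2
        return 1
    fi
    # NODE was the last entry, so close the ring with the first node.
    [ -n "$peer" ] || peer=$first
    cat <<EOF
nossl * *;
group $node
{
    host $node;
    host ($peer);
    key /etc/csync2/csync2.key;
    include /var/www/html;
    exclude *~ .*;
    auto none;
}
EOF
}
```

For example, `chain_config hermes apollo chronos hermes` prints a group named hermes whose peer is (apollo), matching the csync2_hermes.cfg content shown above.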
As detailed in the Csync2 documentation, nodes will be able to communicate with each other if the chained configuration files are present on each node. For example, the csync2_apollo.cfg file should be present on both the apollo and chronos nodes. You will probably say: "Floren, what is this nonsense, simply use the global csync2 configuration file across the entire network!" "Patience, Obiwan. I will explain the logic in detail at the end of this article."
5) On each node, edit the xinetd csync2 service file:
Code:
service csync2
{
    flags           = REUSE
    socket_type     = stream
    wait            = no
    user            = root
    group           = root
    server          = /usr/sbin/csync2
    server_args     = -i
    port            = 30865
    type            = UNLISTED
    #log_on_failure += USERID
    disable         = no
    #only_from      = 192.168.199.3 192.168.199.4
}
Basically, all you have to do is change the disable setting from yes to no. You could also run the command:
# chkconfig csync2 on
but I wanted you to familiarize yourself with the options inside the service file, in case you want to customize some settings in the future.
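If you edit the service file by hand on several nodes, a quick check that the disable setting really ended up as no can save some head scratching. The sketch below only inspects a file that you pass to it. The function name is hypothetical, and the path in the usage note is the usual xinetd location, which is an assumption and may differ on your system.

```shell
#!/bin/sh
# xinetd_service_enabled FILE (hypothetical helper)
# Succeeds when FILE contains an uncommented "disable = no" line.
xinetd_service_enabled() {
    grep -Eq '^[[:space:]]*disable[[:space:]]*=[[:space:]]*no[[:space:]]*$' "$1"
}
```

Usage, assuming the service file lives in /etc/xinetd.d: `xinetd_service_enabled /etc/xinetd.d/csync2 && echo "csync2 service enabled"`. A commented line such as "#disable = no" is deliberately not counted.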
Floren, May 16, 2011 #2
Lsyncd Setup
On each node, edit the /etc/lsyncd/lsyncd.conf file and insert the following Lua script:
Code:
initSync = {
    delay = 1,
    maxProcesses = 1,
    action = function(inlet)
        local config = inlet.getConfig()
        local elist = inlet.getEvents(function(event)
            return event.etype ~= "Blanket"
        end)
        local directory = string.sub(config.source, 1, -2)
        local paths = elist.getPaths(function(etype, path)
            return "\t" .. config.syncid .. ":" .. directory .. path
        end)
        log("Normal", "Processing syncing list:\n", table.concat(paths, "\n"))
        spawn(elist, "/usr/sbin/csync2", "-C", config.syncid, "-x")
    end,
    collect = function(agent, exitcode)
        local config = agent.config
        if not agent.isList and agent.etype == "Blanket" then
            if exitcode == 0 then
                log("Normal", "Startup of '", config.syncid, "' instance finished.")
            elseif config.exitcodes and config.exitcodes[exitcode] == "again" then
                log("Normal", "Retrying startup of '", config.syncid, "' instan…
                return "again"
            else
                log("Error", "Failure on startup of '", config.syncid, "' insta…
                terminate(-1)
            end
            return
        end
        local rc = config.exitcodes and config.exitcodes[exitcode]
        if rc == "die" then
            return rc
        end
        if agent.isList then
            if rc == "again" then
                log("Normal", "Retrying events list on exitcode = ", exitcode)
            else
                log("Normal", "Finished events list = ", exitcode)
            end
        else
            if rc == "again" then
                log("Normal", "Retrying ", agent.etype, " on ", agent.sourcePat…
            else
                log("Normal", "Finished ", agent.etype, " on ", agent.sourcePat…
            end
        end
        return rc
    end,
    init = function(inlet)
        local config = inlet.getConfig()
        …
Rename the corresponding source key and value accordingly. You can specify several sources, using this format:
Code:
local sources = {
    ["/var/www/html/images"] = "apollo",
    ["/var/www/html/customavatars"] = "apollo"
}
The sources array allows you to create specific csync2 configurations for each directory you monitor. The key defines a specific directory to be synced, while the value is the csync2 configuration file id:
/etc/csync2/csync2_apollo.cfg
Obviously, you can specify whatever values you want. You can even define a specific syncid per directory, if your configuration needs to have specific actions.
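To make the link between a monitored directory, its syncid and the csync2 call more concrete, here is a small shell sketch that mirrors the sources table. It only prints the command line instead of running it. Both function names are hypothetical, and the configuration file name follows the /etc/csync2/csync2_<id>.cfg layout of the Axivo rpm described in this article.

```shell
#!/bin/sh
# syncid_for DIR: hypothetical lookup that mirrors the "sources" table.
syncid_for() {
    case $1 in
        /var/www/html/images)        echo apollo ;;
        /var/www/html/customavatars) echo apollo ;;
        *) return 1 ;;
    esac
}

# csync2_command DIR: print (do not run) the csync2 call used for DIR,
# followed by the configuration file that the syncid refers to.
csync2_command() {
    id=$(syncid_for "$1") || return 1
    printf '/usr/sbin/csync2 -C %s -x\n' "$id"
    printf 'config: /etc/csync2/csync2_%s.cfg\n' "$id"
}
```

Calling `csync2_command /var/www/html/images` prints the same `-C apollo -x` arguments that the Lua script passes to spawn, while a directory that is not listed makes the function fail without printing anything.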
Final Setup Steps
So, the hardcore stuff is done. All we have left to do is start the daemons on each node and let the sync beast unleash its power throughout the entire network.
1) We need to prepare the csync2 sqlite database on each node. On apollo, run the csync2 initialization command. It will detect all new files, mark them as dirty and then sync them to the rest of the nodes:
# csync2 -xv
Depending on the size of your /var/www/html directory, csync2 will take from seconds to minutes to replicate the entire list of files. That includes not only the file content, but also the permissions. Don't worry, subsequent syncs will be very fast, as csync2 stores the information about each file in its database and syncs only the files that changed, resulting in dramatic performance gains over a large cluster. Once the sync had completed on apollo, I ran the same command on each of the other nodes, in our case chronos and hermes. That resulted in a global sync, allowing lsyncd to start with fresh, perfectly synchronized data.
2) Start the daemons on each node:
# chkconfig xinetd on
# service xinetd start
# chkconfig lsyncd on
# service lsyncd start
You can check that everything is OK by consulting the /var/log/lsyncd/lsyncd.log file.
Final Test Results
I created a very simple shell script that generates 100 .html files in a /test directory. The goal was to see how fast the files are replicated through the entire network and how many resources are consumed:
Code:
#!/bin/bash
# The C-style for loop below is a bash feature, hence the bash shebang.
for ((i=0; i<100; i++)); do
    touch /var/www/html/test/${i}.html
done
I started top on all 3 nodes and opened another terminal window where I manually created the /test directory, then executed the shell script. Once the script executed, the new directory and the set of 100 html files were replicated in less than 1 second, generating only a 0.08 server load on each node:
Sun May 15 10:57:16 2011 Normal: Recursive startup sync: apollo:/var/www/html/
Sun May 15 10:57:19 2011 Normal: Startup of 'apollo' instance finished.
Sun May 15 10:59:21 2011 Normal: Processing syncing list:
    apollo:/var/www/html/test/
While syncing file /var/www/html/test:
ERROR from peer chronos: File is also marked dirty here!
Auto-resolving conflict: Won 'master/slave' test.
Sun May 15 10:59:22 2011 Normal: Finished events list = 0
Sun May 15 10:59:38 2011 Normal: Processing syncing list:
    apollo:/var/www/html/test/0.html
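On a busy node the log grows quickly, so it can help to summarize it rather than read it line by line. The sketch below counts two message types that appear in the log excerpt above: the "Processing syncing list:" batches and the csync2 "File is also marked dirty here!" conflicts. The function name is hypothetical, and the matching relies only on the wording visible in that excerpt.

```shell
#!/bin/sh
# log_summary FILE (hypothetical helper): count sync batches and csync2
# "marked dirty" conflicts in an lsyncd log file.
log_summary() {
    # grep -c exits non-zero when it counts 0 matches, hence the "|| true".
    batches=$(grep -c 'Processing syncing list:' "$1" || true)
    conflicts=$(grep -c 'File is also marked dirty here!' "$1" || true)
    echo "batches=$batches conflicts=$conflicts"
}
```

Run it against the lsyncd log file of each node, for example `log_summary /var/log/lsyncd/lsyncd.log` if that is where your build writes its log. A conflict count that keeps growing would be a hint that nodes are racing each other.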
Still, that was not enough for my taste. I really wanted to push the system to its limits and see how the nodes react under heavy sync processing. So I created a more advanced script that will generate a batch of 1000 html files, repeated in a loop 10 times at random time intervals between 1.01 and 1.60 seconds:
Code:
#!/bin/bash
for ((i=0; i<10; i++)); do
    # Random pause length in microseconds, read from /dev/random
    number=$((1000000 + ($(od -An -N2 -i /dev/random)) % (10000 + 1000)))
    # Create one batch of 1000 files inside the synced tree
    for ((j=0; j<1000; j++)); do
        touch /var/www/html/test/${i}${j}.html
    done
    usleep ${number}
done
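A run like this is easy to verify from the shell: count the .html files in the test directory on every node and compare the numbers. The helper below is a minimal sketch with a hypothetical name. It relies on the -maxdepth option of find, which GNU find provides but which is not part of POSIX.

```shell
#!/bin/sh
# count_html DIR (hypothetical helper): number of .html files directly in DIR.
count_html() {
    find "$1" -maxdepth 1 -type f -name '*.html' | wc -l | tr -d ' '
}
```

Running `count_html /var/www/html/test` on apollo, chronos and hermes should print the same number once the sync has settled.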
I selected a random time interval higher than 1 second because I wanted to avoid the lsyncd batch processing implemented with the new Lua script. The only difference with the new script, from a performance point of view, was the increase in server load. In the previous test, I experienced a 0.08 load. With the new script, the load went up to nearly 0.50 on all nodes, rapidly dropping once the shell script finished its execution. Yet the sync speed was unaffected; files were instantly synced over the entire network. Dropping the entire /var/www/html/test directory was a little harsher on the load. csync2 had to remove 10,000 files from its database, which pushed the load to nearly 1 (0.97, to be precise). Still, the files were instantly deleted from all
nodes. As you can see for yourself, the final results are simply amazing. However, things were not going so well when I started developing this project... see below.
The Dark Side
Originally, I started with two basic csync2 and lsyncd configurations, as explained in the manual. This particular setup created insane race conditions among nodes, generating a huge load and repeated delays in the global sync. It was not looking good at all. So I started analyzing how both csync2 and lsyncd work under the covers, by testing each product in different circumstances. There was a lot of back and forth discussion with Axel Kittenberger (lsyncd developer) and Gordan Bobic on the mailing lists (where my weekend was spent, not funny I know). Basically, the race conditions were caused by:
1. cyclic csync2 executions, based on "false" lsyncd alarms
2. lsyncd events processed on a file-by-file basis
While testing, I noticed initial race conditions among nodes, generated by csync2. In particular, they appeared when the update completed from hermes to apollo. At that time, I was using the global csync2 configuration (group web). So Gordan came up with the idea of using a "chained" configuration setup instead, forcing csync2 to execute only on specific nodes and in a specific order. I created the proper configuration files and started the tests again. The results were not promising at all: the races were still there, although at a lower intensity. Then I noticed that, by default, lsyncd operates on a batched file-by-file basis when the stock configuration files are used. In other words, when lsyncd finds that file /foo/bar has changed, it will initiate a csync2 sync of that specific file, rather than of everything. That was very bad. As a solution, I had to write a custom Lua based configuration script and force lsyncd to process the event calls in batches, instead of the default per-file events.
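The difference between the two strategies can be illustrated with a few lines of shell. This is only a sketch of the idea, not how lsyncd is implemented: both function names are hypothetical, they print commands instead of running them, and passing a file name after -x in the per-file variant is an assumption made for illustration.

```shell
#!/bin/sh
# per_file_commands SYNCID: read changed paths on stdin and print one
# csync2 call per path (the per-file behaviour that fed the races).
per_file_commands() {
    while IFS= read -r path; do
        if [ -n "$path" ]; then
            printf 'csync2 -C %s -x %s\n' "$1" "$path"
        fi
    done
}

# batched_command SYNCID: read the same paths, drop duplicates, and print a
# single csync2 call for the whole batch, the approach described above.
batched_command() {
    count=$(sort -u | grep -c . || true)
    if [ "$count" -gt 0 ]; then
        printf 'csync2 -C %s -x   # covers %s changed path(s)\n' "$1" "$count"
    fi
}
```

Feeding three events for two distinct files into per_file_commands yields three csync2 calls, while batched_command yields exactly one, regardless of how many events arrived within the delay window.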
I am so glad that Axel decided to use Lua as the scripting base for configurations; you can do extraordinary things with it. As a bonus for my wasted weekend, I found out today that my script will be featured in the next lsyncd package, as a proper example of how to bind csync2 and lsyncd for an advanced cluster sync solution. Thank you Axel! The result is an efficient sync process, constantly monitored by Linux services, that generates no race conditions and updates an entire cluster in an instant. Personally, I'm very satisfied with the results and I'm sure my client will enjoy the ease and improved manageability of files.