.NET Dynamic Software Load Balancing
Stoyan Damov
9 Dec 2002
A Draft Implementation of an Idea for .NET Dynamic Software Load Balancing
Download source code (zipped) - ~100 KB
Latest source code and documentation (would be) available here soon.
In this article
1. Introduction
a. A Teeny-Weeny Intro to Clustering and Load Balancing
b. My Idea for Dynamic Software Load Balancing
2. Architecture and Implementation
a. Architecture Outlook
b. Assemblies & Types
c. Collaboration
d. Some Implementation Details
3. Load Balancing in Action - Balancing a Web Farm
4. Building, Configuring and Deploying the Solution
a. Configuration
What's so common in Common.h?
Tweaking the configuration file
... and the other configuration file:)
b. Deployment
5. Some thoughts about MC++ and C#
a. Managed C++ to C# Translation
b. C#'s readonly fields vs MC++ non-static const members
6. "Bugs suck. Period."
7. TODO(s)
8. Conclusion
a. A (final) word about C#
9. Disclaimer
"Success is the ability to go from one failure to another with no loss of enthusiasm."
Winston Churchill
Introduction
<blog date="2002-12-05"> Yay! I passed 70-320 today and I'm now MCAD.NET. Expect the next article to
cover XML Web Services, Remoting, or Serviced Components:) </blog>
This article is about Load Balancing. Neither "Unleashed", nor "Defined" -- "Implemented":) I'm not going to
discuss in detail what load balancing is, its different types, or the variety of load balancing algorithms. I'm not
going to talk about proprietary software like WLBS, MSCS, COM+ Load Balancing or Application Center
either. What I am going to do in this article is present a custom .NET Dynamic Software Load Balancing
solution that I implemented in less than a week, and the issues I had to resolve to make it work. Though the
source code is only about 4 KLOC, by the end of this article you'll see that the solution is good enough to
balance the load of the web servers in a web farm. Enjoy reading...
...but not everybody will understand everything. To read and understand the article, you're expected to
know what load balancing is in general, but even if you don't, I'll explain it shortly -- so keep reading. To
read the code, you should have some experience with multithreading and network programming (TCP, UDP
and multicasting) and a basic knowledge of .NET Remoting. Contrary to what C# developers think, you
don't need to know Managed C++ to read the code. When you're writing managed-only code, C# and MC++
source code look almost the same, with very few differences, so I have even included a section for C#
developers which explains how to convert (most of) the MC++ code to C#.
A final warning before you focus on the article -- I'm not a professional writer, I'm just a dev, so don't expect too
much from me (this is my 3rd article). If you feel that you don't understand something, that's probably because
I'm not a native English speaker (I'm Bulgarian), so I haven't been able to express what I was thinking.
If you find grammatical nonsense, or even a typo, report it to me as a bug and I'll be more than glad to
"fix" it. And thanks for bearing with this paragraph!
For those who don't have a clue what Load Balancing means, I'm about to give a short explanation of
clustering and load balancing. Very short indeed, because I lack the time to write more about it, and because I
don't want to waste the space of the article with arid text. You're reading an article at www.CodeProject.com,
not at www.ArticleProject.com:) The enlightened may skip the following paragraph, and I encourage the rest to
read it.
Mission-critical applications must run 24x7, and networks need to be able to scale performance to handle large
volumes of client requests without unwanted delays. A "server cluster" is a group of independent servers
managed as a single system for higher availability, easier manageability, and greater scalability. It consists of
two or more servers connected by a network, and a cluster management software, such as WLBS, MSCS or
Application Center. The software provides services such as failure detection, recovery, load balancing, and the
ability to manage the servers as a single system. Load balancing is a technique that allows the performance of
a server-based program, such as a Web server, to be scaled by distributing its client requests across multiple
servers within a cluster of computers. Load balancing is used to enhance scalability, which boosts throughput
while keeping response times low.
I should warn you that I haven't implemented complete clustering software, but only the load balancing part
of it, so don't expect anything more than that. Now that you have an idea of what load balancing is, I'm sure you
don't know what my idea for implementing it is. So keep reading...
How do we know that a machine is busy? When we feel that our machine is getting very slow, we launch the
Task Manager and look for a hung instance of iexplore.exe:) Seriously, we look at the CPU utilization. If it is
low, then memory is probably low, and the disk must be thrashing. If we suspect anything else to be the reason, we run
the System Monitor and add some performance counters to look at. Well, this works if you're around the
machine, and if you have one or two machines to monitor. When you have more machines you'll have to hire a
person, buy him 20-dioptre glasses to stare at all the machines' System Monitor consoles, and watch him go crazy in
about a week :). But even if you could monitor your machines constantly, you couldn't distribute their workload
manually, could you? Well, you could use some expensive software to balance their load, but I assure you that
you can do it yourself, and that's what this article is all about. While you are able to "see" the performance
counters, you can also collect their values programmatically. And I think that if we combine some of them in a
certain way and do some calculations, they could give us a value that could be used to determine the
machine's load. Let's check if that's possible!
Let's monitor the \\Processor\% Processor Time\_Total and \\Processor\% User Time\_Total performance
counters. You can monitor them by launching Task Manager and looking at the CPU utilization in the
"Performance" tab. (The red curve shows the % Processor Time, and the green one the % User Time.) Stop or
pause all CPU-intensive applications (WinAMP, MediaPlayer, etc.) and start monitoring the CPU utilization.
You have noticed that the counter values stay almost constant, right? Now, close Task Manager, wait about 5
seconds and start it again. You should notice a big peak in the CPU utilization. In several seconds, the peak
vanishes. Now, if we were reporting performance counter values instantly (as we get each counter sample),
one could think that our machine was extremely busy (almost 100%) at that moment, right? That's why we're
not going to report instant values; instead, we will collect several samples of the counter's values and report
their average. That would be fair enough, don't you think? No?! I don't either, I was just checking you:) What
about available memory, I/O, etc.? Because the CPU utilization alone is not enough for a realistic calculation of the
machine's workload, we should monitor more than one counter at a time, right? And because, let's say, the
current number of ASP.NET sessions is less important than the CPU utilization, we will give each counter a
weight. Now the machine load will be calculated as the sum of the weighted averages of all monitored
performance counters. You should already be guessing my idea for dynamic software load balancing.
However, a picture is worth a thousand words, and an ASCII one is worth two thousand:) Here is a real sample, and
the machine load calculation algorithm. In the example below, the machine load is calculated by monitoring 4
performance counters, each configured to collect its next sample value at equal intervals, and all counters
collect the same number of samples (this would be your usual case):
Legend:
Sum
the sum of all counter samples
Avg
the average of all counter samples (Sum/Count)
WA
the weighted average of all counter samples (Sum/Count * Weight)
% Proc Time
(Processor\% Processor Time\_Total), the percentage of elapsed time that the processor spends
executing a non-idle thread. It is calculated by measuring the duration that the idle thread is active in the
sample interval, and subtracting that time from the interval duration. (Each processor has an idle thread
that consumes cycles when no other threads are ready to run.) This counter is the primary indicator of
processor activity, and displays the average percentage of busy time observed during the sample
interval.
% User Time
(Processor\% User Time\_Total) is the percentage of elapsed time the processor spends in user
mode. User mode is a restricted processing mode designed for applications, environment subsystems,
and integral subsystems. The alternative, privileged mode, is designed for operating system
components and allows direct access to hardware and all memory. The operating system switches
application threads to privileged mode to access operating system services. This counter displays the
average busy time as a percentage of the sample time.
ASP Req.Ex.
(ASP.NET Applications\Requests Executing\__Total__) is the number of requests currently executing
% Disk Time
(Logical Disk\% Disk Time\_Total) is the percentage of elapsed time that the selected disk drive was
busy servicing read or write requests
I wondered for about half a day how to explain the architecture to you. Not because it is so
complex, but because it would take too much space in the article, and I wanted to
show you some code, not a technical specification or even a DSS. So I wondered
whether to explain the architecture using a "top-to-bottom" or "bottom-to-top"
approach, or should I think up something else? Finally, as most of you have
already guessed, I decided to explain it in my own mixed way:) First, you should
learn which assemblies the solution is comprised of, and then you can read
about their collaboration, the types they contain and so on... And even before that, I
recommend you read and understand two terms I've used throughout the article
(and the source code's comments).
Machine Load
the overall workload (utilization) of a machine - in our case, this is the sum of the weighted averages of
all performance counters (monitored for load balancing); if you've skipped the section "My Idea for
Dynamic Software Load Balancing", you may want to go back and read it
Fastest machine
the machine with the least current load
Architecture Outlook
First, I'd like to apologize about the "diagrams". I know of only two
software products that can draw the diagrams I needed in this article. I can't
afford the first (and my company is not willing to pay for it either:), and the
second bedeviled me so much that I dropped from the article one UML
static structure diagram, a UML deployment diagram and a couple of activity
diagrams (and they were nearly complete). I won't tell you the name of the
product, because I like the company that developed it very much. Just
accept my apologies, and the pseudo-ASCII art which replaced the
original diagrams. Sorry:)
The load balancing software comes in three parts: a server that reports the
load of the machine it is running on; a server that collects such loads, no
matter which machine they come from; and a library which asks the
collecting server which is the least loaded (fastest) machine. The server that
reports the machine's load is called the "Machine Load Reporting Server"
(MLRS), and the server that collects machine loads is called the "Machine
Load Monitoring Server" (MLMS). The library's name is the "Load Balancing
Library" (LBL). You can deploy these three parts of the software as you like.
For example, you could install all of them on all machines.
The MLRS server on each machine joins a special multicast group,
designated for load balancing, and sends messages containing the
machine's load to the group's multicast IP address. Because all MLMS
servers join the same group at startup, they all receive each machine's load,
so if you run both MLRS and MLMS servers on all machines, they will know
each other's load. So what? We have the machine loads, but what do we do
with them? Well, all MLMS servers store the machine loads in a special data
structure, which lets them quickly retrieve the least machine load at any
time. So all machines now know which is the fastest one. Who cares? We
haven't really used that information to balance any load, right? How do we
query MLMS servers for the fastest machine? The answer is that each
MLMS registers a special singleton object with the .NET Remoting runtime,
so the LBL can create (or get) an instance of that object and ask it for the
least loaded machine. The problem is that the LBL cannot ask all machines
about this simultaneously (yet, but I'm thinking about this issue), so it should
choose one machine (which could, of course, be the machine it is running
on) and will hand that load to the client application that needs the
information to perform whatever load balancing activity is suitable. As you
will see later, I've used the LBL in a web application to distribute the
workload between all web servers in a web farm. Below is a "diagram"
which depicts in general the collaboration between the servers and the
library:
Note: You should see the strange figure between the machines as a cloud,
i.e. it represents a LAN :) And one more thing -- if you don't understand
what multicasting is, don't worry, it is explained later in
the Collaboration section.
Now look at the "diagram" again. Let me remind you that when a machine
joins a multicast group, it receives all messages sent to that group, including
the messages that the machine itself has sent. Machine A receives its own
load, and the load reported by C. Machine B receives the loads of A and C
(it does not report its own load, because there's no MLRS server installed
on it). Machine C does not receive anything, because it has no MLMS
server installed. Because machine C's LBL should connect (via Remoting)
to an MLMS server, and C has no such server installed, it could connect to
machine A or B and query the remoted object for the fastest machine. On
the "diagram" above, the LBLs of A and C communicate with the remoted
object on machine A, while the LBL of B communicates with the remoted
object on its own machine. As you will see later in the Configuration section,
there are very few things that are hardcoded in the solution's source code,
so don't worry -- you will be able to tune almost everything.
The solution consists of 8 assemblies, but only three of them are of
interest to us now: MLMS, MLRS, and LBL, located respectively in two
console applications
(MachineLoadMonitoringServer.exe and MachineLoadReportingServer.exe)
and one dynamic link library (LoadBalancingLibrary.dll). Surprisingly, MLMS
and MLRS do not contain any types. However, they use several types to get
their job done. You may wonder why I have designed them that way. Why
didn't I just implement both servers directly in the executables? Well, the
answer is quite simple and reflects both my strengths and weaknesses as a
developer. If you have the time to read about it, go ahead; otherwise click
here to skip the slight detour.
GUI programming is what I hate (though I've written a bunch of GUI apps).
For me, it is mundane work, more suitable for a designer than for a
developer. I love to build complex "things". Server-side applications are my
favorite ones. Multi-threaded, asynchronous programming -- that's the
"stuff" I love. Rare applications that nobody "sees" except for a few
administrators, who configure and/or control them using some sort of
administration console. If these applications work as expected, the end-
user will almost never know s/he is using them (e.g. in most cases, a user
browsing a web site does not realize that an IIS or Apache server is
processing her requests and serving the content). Now, I've written
several Windows C++ services in the past, and I've written some .NET
Windows services recently, so I could easily convert MLMS and MLRS to
one of these. On the other hand, I love console (CUI) applications so much,
and I like seeing hundreds of tracing messages on the console, so I left
MLMS and MLRS in their CUI form for two reasons. The first reason is that
you can quickly see what's wrong when something goes wrong (and it will,
at least once:), and the second is that I haven't debugged .NET
Windows services (and because I have debugged C++ Windows services, I
can assure you that it's no "piece of cake"). Nevertheless, one can easily
convert both CUI applications into Windows services in less than half an hour.
I haven't implemented the server classes in the executables, to make it
easier for the guy who would convert them into Windows services. S/he'll
need to write just 4 lines of code in the Windows Service class to get the
job done:
Xxx is either Monitoring or Reporting. I'm sure you now understand why
I have implemented the servers' code in separate classes in separate
libraries, and not directly in the executables.
SharedLibrary (SL) - contains common and helper types, used by LML, LRL
and/or LBL. A list of the types (explained further) follows:
NOTE: CounterInfo is not exactly what C++ developers call a "struct" class,
because it does a lot of work behind the scenes. Its implementation is non-
trivial and touches topics like timers, synchronization, and performance
counter monitoring; look at the Some Implementation Details section for
more information about it.
Collaboration
Configurator
Plain Text
Pass Queue         Running Total (Sum)
---- -----         -------------------
     []            = 0
1    [1]           = 0 + 1 = 1
2    [2 1]         = 1 + 2 = 3
3    [3 2 1]       = 3 + 3 = 6
4    [4 3 2 1]     = 6 + 4 = 10
5    [5 4 3 2 1]   = 10 + 5 = 15
6    [6 5 4 3 2]   = 15 - 1 + 6 = 20
7    [7 6 5 4 3]   = 20 - 2 + 7 = 25
MachineLoadsCollection
Plain Text
[C:30][A:40]
Plain Text
[D:20]
[C:30][A:40]
[D:20][C:30][A:40]
Plain Text
[C:30][A:40][D:45]
Plain Text
^
|
M
A
C . . . .
H . . . .
I . . . .
N . . . .
E . . B .
S D C A .
LOAD 20 30 40 . . . -->
Plain Text
^
|
M
A
C . . . .
H . . . .
I . . . .
N . . . .
E . . B .
S . C A D
LOAD 20 30 40 45 . . -->
C++ / CLI
void MachineLoadsCollection::Add (MachineLoad __gc* machineLoad)
{
    DEBUG_ASSERT (0 != machineLoad);
    if (0 == machineLoad)
        return;

    rwLock->AcquireWriterLock (Timeout::Infinite);

    // load value
    //
    //
    if (!loads->ContainsKey (boxedLoad))
    {
        // no, this is the first load with this value - create new list
        //
        //
        //
        if (!mappings->ContainsKey (name))
        {
            // no, the machine is reporting for the first time
            //
            loadList->Add (machineLoad);
            mappings->Add (name, new LoadMapping (machineLoad, loadList));
        }
        else
        {
            // yes, the machine has reported its load before
            //
            //
            //
            mappings->Remove (name);
            //
            //
            loadList->Add (machineLoad);
            //
            //
            if (oldList->Count == 0)
                loads->Remove (__box (oldLoad->Load));
        }
        rwLock->ReleaseWriterLock ();
    }

    rwLock->AcquireReaderLock (Timeout::Infinite);
    //
    if (loads->Count > 0)
    {
        // the 1st element should contain one of the least
        // in this list
        //
        rwLock->ReleaseReaderLock ();
        return (load);
    }
C++ / CLI
void MachineLoadsCollection::GrimReaper (Object __gc* state)
{
    // get the state we need to continue
    //
    //
    mlc->grimReaper->Change (Timeout::Infinite, Timeout::Infinite);

    // check if we are forced to stop
    //
    if (!mlc->keepGrimReaperAlive)
        return;

    // get the rest of the fields to do our work
    //
    rwLock->AcquireWriterLock (Timeout::Infinite);
    //
    //
    //
    DateTime dtNow = DateTime::Now;
    IDictionaryEnumerator __gc* dic = mappings->GetEnumerator ();
    while (dic->MoveNext ())
    {
        LoadMapping __gc* map = static_cast<LoadMapping __gc*> (dic->Value);
        // check whether the dead timeout has expired for this machine
        //
        //
        //
        //
        deadMachines->Add (name);
        //
        //
        if (oldList->Count == 0)
            loads->Remove (__box (oldLoad->Load));
    }
    //
    // cleanup
    //
    deadMachines->Clear ();
    rwLock->ReleaseWriterLock ();
    //
    mlc->grimReaper->Change (reportTimeout, reportTimeout);
}
Load Balancing in Action - Balancing a Web Farm
c#
protected void Session_Start (object sender, EventArgs e)
{
    // get the fastest machine from the load balancer
    //
    string fastestMachineName = Helper.Instance.GetFastestMachineName ();
    //
    string thisMachineName = Environment.MachineName;
    if (String.Compare (thisMachineName, fastestMachineName, false) != 0)
    {
        // it is another machine and we should redirect the request
        //
        string fasterUrl = Helper.Instance.ReplaceHostInUrl (
            Request.Url.ToString (), fastestMachineName);
        Response.Redirect (fasterUrl);
    }
}
And here's the code in the sample web page:
c#
private void OnPageLoad (object sender, EventArgs e)
{
    // get the fastest machine and generate the new links with it
    //
    string fastestMachineName = Helper.Instance.GetFastestMachineName ();
    link.Text = String.Format (
        "Next request will be processed by machine '{0}'",
        fastestMachineName);
    // navigate to the same URL, but with the host being the fastest machine
    //
    link.NavigateUrl = Helper.Instance.ReplaceHostInUrl (
        Request.Url.ToString (), fastestMachineName);
}
c#
class Helper
{
    private Helper ()
    {
        // assume failure(s)
        //
        loadBalancer = null;
        try
        {
            NameValueCollection settings = ConfigurationSettings.AppSettings;
            // object on it
            //
            string machine = Environment.MachineName;
            int port = 14000;
            RemotingProtocol protocol = RemotingProtocol.TCP;

    //
    //
    string fastestMachineName = Environment.MachineName;
    if (loadBalancer != null)
    {
        MachineLoad load = loadBalancer.GetLeastMachineLoad ();
        if (load != null)
            fastestMachineName = load.Name;
    }
    return (fastestMachineName);
}
Configuration
C++ / CLI
#define USING_UDP 1
#define USING_MULTICASTS 1
C++ / CLI
#if defined(USING_UDP)
#   define SOCKET_CREATE(sock) SOCKET_CREATE_UDP(sock)
#else
#   define SOCKET_CREATE(sock) SOCKET_CREATE_TCP(sock)
#endif
C++ / CLI
#define TRACE_EXCEPTION_AND_RETHROW_IF_NEEDED(e) \
    System::Type __gc* exType = e->GetType (); \
    if (exType == __typeof (OutOfMemoryException) || \
        exType == __typeof (ExecutionEngineException)) \
        throw; \
    Console::WriteLine ( \
        S"\n{0}\n{1} ({2}/{3}): {4}\n{0}", \
        new String (L'-', 79), \
        new String ((char *) __FUNCTION__), \
        new String ((char *) __FILE__), \
        __box (__LINE__), \
        e->Message);
C++ / CLI
#define DEBUG_ASSERT(x) Debug::Assert (x, S#x)
c#
Debug.Assert (null != objRef, "null != objRef");

everywhere I needed to assert. In MC++, I just write

C++ / CLI
DEBUG_ASSERT (0 != objRef);

and it is automatically expanded into

C++ / CLI
Debug::Assert (0 != objRef, S"0 != objRef");

Not to speak of the __LINE__, __FILE__ and __FUNCTION__ macros I could use in
the DEBUG_ASSERT macro! Now let's everybody scream loudly with me: "C# sucks!":)
XML
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <LoadReportingServer>
    <IpAddress>127.0.0.1</IpAddress>
    <Port>12000</Port>
    <ReportingInterval>2000</ReportingInterval>
  </LoadReportingServer>
  <LoadMonitoringServer>
    <IpAddress>127.0.0.1</IpAddress>
    <CollectorPort>12000</CollectorPort>
    <CollectorBacklog>40</CollectorBacklog>
    <ReporterPort>13000</ReporterPort>
    <ReporterBacklog>40</ReporterBacklog>
    <MachineReportTimeout>4000</MachineReportTimeout>
    <RemotingProtocol>tcp</RemotingProtocol>
    <RemotingChannelPort>14000</RemotingChannelPort>
    <PerformanceCounters>
      <counter alias="cpu"
               category="Processor"
               name="% Processor Time"
               instance="_Total"
               load-weight="0.3"
               interval="500"
               maximum-measures="5" />
      <!-- ... -->
    </PerformanceCounters>
  </LoadMonitoringServer>
</configuration>
Element/Attribute -- Meaning/Usage

LRS/IpAddress
    When you're using UDP + multicasting (the default), the IpAddress is the IP address of the multicast
    group that MLMS and MLRS join in order to communicate. If you're not using multicasting, but are
    still using UDP or TCP, this element specifies the IP address (or the host name) of the MLMS server
    that MLRS reports to. Note that because you don't use multicasting, there's no way for the MLRS
    servers to "multicast" their machine loads to all MLMS servers. In any case, this element's text
    should be equal to LMS/IpAddress.

LRS/Port
    Using UDP + multicasting, UDP only, or TCP, that's the port to which MLRS servers send, and at
    which MLMS servers receive, machine loads.

LRS/ReportingInterval
    MLRS servers report machine loads to MLMS ones. The ReportingInterval specifies the interval (in
    milliseconds) at which an MLRS server should report its load to one or more MLMS servers. If you
    have paid attention in the Some Implementation Details section, I said that even if the interval has
    elapsed, a machine may not report its load, because it has not gathered the raw data it needs to
    calculate its load. See the counter element's interval attribute for more information.

LMS/IpAddress
    In the UDP + multicasting scenario, that's the multicast group's IP address, as in the LRS/IpAddress
    element. When you're using UDP or TCP only, this address is ignored.

LMS/CollectorPort
    The port on which MLMS servers accept TCP connections, or receive data from when using UDP.

LMS/CollectorBacklog
    This element specifies the maximum number of sockets an MLMS server will use when configured
    for TCP communication.

LMS/ReporterPort
    If you haven't been reading the article carefully, you're probably wondering what this element
    specifies. Well, in my first design, I was not thinking that Remoting would serve me so well to build
    the Load Balancing Library (LBL). I wrote a mini TCP server, which accepted TCP requests and
    returned the least loaded machine. Because the LBL had to connect to an MLMS server and ask
    which is the fastest machine, you can imagine that I'd written several overloads of the
    GetLeastLoadedMachine method, accepting timeouts and default machines for when there are no
    available machines at all. The moment I finished the LBL client, I decided that the design was too
    lame, so I rewrote the LBL library from scratch (yeah, shit happens:), using Remoting. Now, I regret
    to tell you that I've overwritten the original library's source files. However, I left the TCP server
    completely working -- it lives as the ReporterWorker class, and persists in the ReporterWorker.h/.cpp
    files in the LoadMonitoringLibrary project. If you want to write an alternative LBL library, be my
    guest -- just write some code to connect to the LMS reporter worker and it will report the fastest
    machine's load immediately. Note that the worker accepts TCP sockets, so you should always
    connect to it using TCP.

LMS/ReporterBacklog
    It's not difficult to figure out that this is the backlog of the TCP server I was talking about above.

LMS/MachineReportTimeout
    Now that's an interesting setting. The MachineReportTimeout is the biggest interval (in milliseconds)
    at which a machine should report its successive load in order to stay in the load balancing. This
    means that if a machine reported 5 seconds ago, and the timeout interval is set to 3 seconds, the
    machine is removed from the load balancing. If it later reports, it is back in business. I think this is a
    bit lame, because one would like to configure each machine to report at different intervals, but I
    don't have time (now) to fix this, so you should learn to live with this "feature". One way to work
    around my "lameness" is to give this setting a large enough value. Be warned though, that if a
    machine is down, you won't be able to remove it from the load balancing until this interval elapses --
    so don't give it too big a value.

LMS/RemotingProtocol
    Originally, I intended to use Remoting only over TCP. I thought that HTTP would be too slow (it is
    one level above TCP in the OSI stack). Then, after I recalled how complex Remoting is, I realized
    that the HTTP protocol is blazingly faster than the Remoting machinery itself. So I decided to give
    you an option for which protocol to use. Currently, the solution supports only the TCP and HTTP
    protocols, but you can easily extend it to use any protocol you wish. This setting accepts a string,
    which is either "tcp" or "http" (without the quotes, of course).

LMS/RemotingChannelPort
    That's the port MLMS uses to register and activate the load balancing object with the Remoting
    runtime.

LMS/PerformanceCounters
    This element contains a collection of performance counters, used to calculate the machine's load.
    Below are the attributes of the counter XML element, used to describe a CounterInfo object, which I
    wrote about somewhere above.

counter/alias
    Though currently not used, this attribute specifies an alias for the otherwise too long performance
    counter path. See the TODO(s) section for the reason I've put this attribute in.

counter/category
    The general category of the counter, e.g. Processor, Memory, etc.

counter/name
    The specific counter in the category, e.g. % Processor Time, Page reads/sec, etc.

counter/instance
    If there are two or more instances of the counter, the instance attribute specifies the exact instance
    of the counter. For example, if you have two CPUs, then the first CPU's instance is "0", the second
    one is "1", and both together are named "_Total".

counter/load-weight
    The weight that balances the counter's values. E.g. you can give more weight to the values of
    Processor\% Processor Time\_Total than to the Processor\% User Time\_Total ones. You get the
    idea.

counter/interval
    The interval (in milliseconds) at which a performance counter is asked to return its next sample
    value.

counter/maximum-measures
    The size of the cyclic queue (I talked about it above) that stores the transient state of a performance
    counter. In other words, the element specifies how many counter values should be collected in
    order to get a decent weighted average (WA). The counter does not report its WA until it collects at
    least maximum-measures sample values. If the CounterInfo class is asked to return its WA before it
    has collected the necessary number of sample values, it blocks and waits until it has collected
    them.
XML
<appSettings>
  <add key="LoadBalancingMachine" value="..." />
  <add key="LoadBalancingPort" value="..." />
  <add key="LoadBalancingProtocol" value="..." />
</appSettings>
Deployment
MC++                 C#
----                 ----
::                   .
->                   .
__gc*                (removed)
__gc                 (removed)
__sealed             sealed
__value              struct
using namespace      using
: public             :
S"..."               "..."
__box (x)            x
c#
public class PerfCounter
{
    public PerfCounter (String fullPath, int sampleInterval)
    {
        // validate parameters
        //
        //
        FullPath = fullPath;
        SampleInterval = sampleInterval;
    }
    //
C++ / CLI
public __gc class PerfCounter
{
public:
    PerfCounter (String __gc* fullPath, int sampleInterval) :
        FullPath (fullPath),
        SampleInterval (sampleInterval)
    {
        // validate parameters
        //
        Debug::Assert (0 != fullPath);
        if (0 == fullPath)
            throw (new ArgumentNullException (S"fullPath"));
        Debug::Assert (sampleInterval > 0);
        if (sampleInterval <= 0)
            throw (new ArgumentOutOfRangeException (S"sampleInterval"));
    //
public:
    const String __gc* FullPath;
    const int SampleInterval;
};
c#
public class PerfCounter
{
    public PerfCounter (String fullPath, int sampleInterval)
    {
        // validate parameters
        //
        Debug.Assert (null != fullPath);
        if (null == fullPath)
            throw (new ArgumentNullException ("fullPath"));
        Debug.Assert (sampleInterval > 0);
        // change to a reasonable default value
        //
        if (sampleInterval <= 0)
            sampleInterval = DefaultSampleInterval;
        //
        FullPath = fullPath;
        SampleInterval = sampleInterval;
    }
C++ / CLI
public __gc class CrashingPerfCounter
{
public:
    CrashingPerfCounter (String __gc* fullPath, int sampleInterval) :
        FullPath (fullPath),
        SampleInterval (sampleInterval)
    {
        // validate parameters
        //
        Debug::Assert (0 != fullPath);
        if (0 == fullPath)
            throw (new ArgumentNullException (S"fullPath"));
        Debug::Assert (sampleInterval > 0);
        // the second line below will cause the compiler to
        // report "error C2166: l-value specifies const object"
        //
        if (sampleInterval <= 0)
            SampleInterval = DefaultSampleInterval;
    //
public:
    const String __gc* FullPath;
    const int SampleInterval;
private:
    static const int DefaultSampleInterval = 1000;
};
C++ / CLI
SampleInterval (sampleInterval > 0 ? sampleInterval : DefaultSampleInterval)
C++ / CLI
public __gc class LamePerfCounter
{
public:
    LamePerfCounter (String __gc* fullPath, int sampleInterval)
    {
        // validate parameters
        //
        Debug::Assert (0 != fullPath);
        if (0 == fullPath)
            throw (new ArgumentNullException (S"fullPath"));
        Debug::Assert (sampleInterval > 0);
        if (sampleInterval <= 0)
            sampleInterval = DefaultSampleInterval;
        //
        this->fullPath = fullPath;
        this->sampleInterval = sampleInterval;
    }
private:
    String __gc* fullPath;
    int sampleInterval;
    static const int DefaultSampleInterval = 1000;
};
"Bugs suck. Period."
John Robins
C++ / CLI
using namespace SLB = SoftwareLoadBalancing;

C++ / CLI
SLB::SLB::X __gc* x = new SLB::SLB::X ();

C++ / CLI
using namespace SLB = SoftwareLoadBalancing::SoftwareLoadBalancing;

C++ / CLI
SLB::X __gc* x = new SLB::X ();
TODO(s)
Disclaimer