Big Brothers
Notes:
I’ve been working with Unix since 1983 as everything from a programmer to
Systems Administrator and Manager of Technical Support, and I’m currently
consulting as a Technical Project Manager.
I’m grateful to be here. I’m surprised by how popular Big Brother has
become, and really impressed by how excellent the Big Brother community
is.
Overview
Notes:
These talks are tough. I don’t want to bore those people who already know
about Big Brother, but don’t want to do a disservice to those who’ve never
heard of it either.
Hopefully you’ll leave here with an appreciation for where Big Brother comes
from, what it is, how it was designed, how it works, why it works, and why it
is so wonderful.
The irony, of course, is that in the time it takes to present this talk, you could
go to the web site, download it, compile it, and have it up and running on your
network. Not only that, but this PowerPoint presentation is larger than the
entire source code to BB.
Ultimately, Big Brother is interesting because it was built by a Sys Admin for
Systems Administration. That changed the whole focus of the tool; it’s really
a holistic approach to monitoring - that something broken in one place will
naturally show up somewhere else and get noticed...
What is Big Brother?
Notes:
Big Brother is the same stuff that every Sys Admin has been writing since
entropy first attacked Unix machines, and things started to fall apart.
Watching over machines isn’t Rocket Science. It’s like the description of
being a pilot: Dull and Boring with occasional moments of sheer terror.
The general task of an admin is to make sure that the machines stay up and
running, and that everybody else, users and managers alike, doesn’t bother the
admin. Generally, my goal has been to keep my phone from ringing.
That results in two things. Everyone knows how well the network is doing,
and the phone doesn’t ring as often for stupid questions.
The Need for Big Brother
Notes:
I was told that we had to get something to monitor this network that was
supposed to be available 24/7. There was a copy of HP OpenView around,
that I tried to install, but it had expired three months earlier.
I had seen quite a few “performance monitors”, and they had three things in
common. First, they almost always listed themselves as taking up half the
system resources, second, they were almost always a lot of money, and third,
most insisted on tweaking the kernel.
In addition to being the acting admin, I was also working on the Netscape
NSAPI and hacking their Proxy Server (to make it do unnatural things with
Oracle), and it surprised me that there weren’t Web-based tools out there to do
simple monitoring. The Web allows the creation of incredibly simple GUI
interfaces, really quickly and easily, not to mention the instant publishing of
information.
The Real Reason for BB
Notes:
Then he came back with his quote: $313,000 to monitor fewer than 20
servers, all from a central console, with redundant servers, not to mention at
least 90 man-days of consulting at well over $100 an hour. He was very
pleased with himself.
I freaked. Unfortunately, where I was working at the time, they happily spent
money: provided the correct number of technobabble buzzwords was uttered
within the specified amount of time, out came the checkbook. The only way
they wouldn’t do this was if the problem was solved another way, and fast.
The infrastructure of Big Brother was built over that weekend.
Design considerations
Notes:
It needed some simple thresholds to watch, a way to send this data to a central
location, a method of testing the network, and finally a visual display that
would allow me to know what was going on from across the room, or via a
pager if things got bad. It also needed to be as redundant as possible.
What BB doesn’t use
SNMP
SNMP is a network protocol and is not well suited to collecting system
information. It was scary and strange to me.
perl
Perl is a great language, but isn’t shipped with every Unix machine. I knew sh
better than perl at the time BB was written.
Notes:
Hard-core network people have asked why BB doesn’t use SNMP for
monitoring. The official reason is that it’s very tough to do the kind of testing
BB does on-the-fly using SNMP traps.
The real reason for not using SNMP is that at the time BB was created, I had
no idea how SNMP worked. Meanwhile great products like MRTG do the
SNMP thing very elegantly, and people like my friend Robert-Andre Croteau
have integrated BB and MRTG together very nicely.
So why wasn’t perl used when writing BB? A couple of reasons. The first is
that my shell was better than my perl. More importantly, though, /bin/sh is on
every machine. I wouldn’t have to port perl to every machine that BB was
running on. Finally, the BB scripts are really simple. Issue a command, check
the result, and send it somewhere. Bourne Shell is just fine for that.
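The "issue a command, check the result, and send it somewhere" pattern can be sketched in a few lines of shell. This is only an illustration, not code from the BB distribution: the function names and the 90 and 95 percent thresholds are made up for the example, and the last step just prints the result instead of shipping it anywhere.

```shell
#!/bin/sh
# Sketch only: names and thresholds are illustrative, not taken from BB.

# disk_color PERCENT [WARN] [PANIC]
# Map a disk-usage percentage to a severity color.
# WARN and PANIC are hypothetical thresholds (defaults 90 and 95).
disk_color() {
    pct=$1
    warn=${2:-90}
    panic=${3:-95}
    if [ "$pct" -ge "$panic" ]; then
        echo red
    elif [ "$pct" -ge "$warn" ]; then
        echo yellow
    else
        echo green
    fi
}

# Issue a command, check the result, and send it somewhere.
# "df -P" prints one line per filesystem; column 5 is the capacity in percent.
check_root_disk() {
    pct=$(df -P / | awk 'NR == 2 { sub("%", "", $5); print $5 }')
    echo "$(disk_color "$pct") / is ${pct}% full"
}
```

Calling check_root_disk prints a color followed by a one-line description, which is all a report really needs to carry.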
Same with the C programs... write the client and server in C, that’s what C is
for. (yes I know you can write it in perl!) On a Sun box, the client was 4K, the
server was 7K. The load was imperceptible. That was the idea.
Setting up Big Brother
Notes:
The source code lives on the Big Brother Internet Site. Don’t worry if you
can’t remember the address: http://www.iti.qc.ca/iti/users/sean/bb-dnld You
can always get back there by clicking my face on any BB site you see.
You’ll need a C compiler for Unix versions, kermit and a modem for paging,
and a little time and patience.
Set BBHOME in runbb.sh. runbb.sh is the main program that runs everything
else based on the information you put in the etc/bb-hosts file. You can tell Big
Brother who to page, and under what circumstances by editing parameters in
etc/bbdef.sh. Most of the defaults are sensible.
BBDISPLAY is the machine running a Web server where the BB output will
be generated. BBNET is the machine that will test the network, and
BBPAGER is the machine you’ve configured to do the paging.
In order for the web pages created by Big Brother to be visible, you might
have to link the www directory underneath the Document Root directory of the
Web server running on the machine defined as BBDISPLAY.
Directory Structure
[Diagram: the bb directory tree rooted at BBHOME, which contains runbb.sh]
Notes:
This structure allows you to isolate the web pages which are created by a
process that has absolutely nothing to do with the web server. The only
suggestion is that the pages be readable by persons accessing these files via the
web. This was done for security.
The logs directory contains the actual data as reported to Big Brother from the
network monitor and the local clients. Files in this directory are owned by
whichever user the Big Brother daemon runs as. These files are named
“machine.area”; i.e. a disk report for a machine called coffee would be called
coffee.disk, and would have a corresponding row and column on the Big
Brother output.
Architectural Overview
[Diagram: bb-local (cpu, disk, msgs, procs) and bb-network (ftp, http, pop3,
smtp daemons) report via bb on port 1984 to bbd running on BBDISPLAY, which
writes machine.area log files; mkbb, running every 5 min, turns them into web
pages. bbd running on BBPAGER notifies the admin by pager or e-mail.]
Notes:
runbb.sh runs on every machine that runs Big Brother. It decides what to do
based on the information you’ve defined in the etc/bb-hosts file.
If the machine is defined as BBPAGER, then it too will be running bbd, but
only to receive and process pager requests. This machine notifies whoever
has been defined in the PAGER variable in etc/bbdef.sh.
Big Brother Components
Notes:
The scripts are not complex. bb-network.sh tests the entire network for
connectivity by default, and for the specific services listed in the etc/bb-hosts
file. bb-local.sh does the same for each machine.
Big Brother uses port 1984 (what would you expect it to use?), so make
sure it’s available and not blocked by any firewalls. Otherwise it won’t work,
eh?
The BBPAGER machine receives requests to notify admins via bbd, and calls
bin/bb-page.sh to do the actual paging. If the PAGER variable contains a
numeric string, BB assumes it’s a pager number, and handles numeric paging
using kermit. If it appears to contain an e-mail address, then mail is sent. The
PAGER variable may contain multiple numbers and addresses, and the
bb-page.sh script has already been modified to support other paging methods
(including sendpage).
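The numeric-versus-e-mail decision described above can be sketched roughly as follows. This is not the code in bin/bb-page.sh: the helper names are hypothetical, and the sketch assumes that multiple recipients in PAGER are separated by spaces.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not the contents of bin/bb-page.sh.

# recipient_type ENTRY
# Classify one PAGER entry as "email", "pager" (all digits) or "unknown".
recipient_type() {
    case "$1" in
        *@*)          echo email ;;
        ''|*[!0-9]*)  echo unknown ;;
        *)            echo pager ;;
    esac
}

# notify_plan "ENTRY ENTRY ..."
# Print each recipient with the delivery method that would be used.
# Assumes entries are separated by spaces.
notify_plan() {
    for r in $1; do
        echo "$r $(recipient_type "$r")"
    done
}
```

A real script would replace the echo in notify_plan with a call to kermit for pager numbers and to a mailer for addresses.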
What Big Brother tests
Notes:
bb-local.sh runs on every system Big Brother is installed on, checking that the
local machine is sane, that the disk hasn’t exploded, the CPU isn’t too
overloaded, or that important processes haven’t dropped dead.
bb-network.sh uses the program bbnet to test all the daemons listed for each
machine in the etc/bb-hosts file in addition to pinging each of them.
The Big Brother Web pages will always have a background color
corresponding to the most severe condition on the network at that time.
Remember you can click on any dot on the Big Brother Web page to get more
details about the results of any particular test.
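Picking the background color comes down to taking the worst color among all the reports. A minimal sketch, assuming a three-level green, yellow, red scale (the notes only say "most severe", so the exact set of colors and their order are assumptions here):

```shell
#!/bin/sh
# Sketch only: the color names and their ranking are assumed.

# severity COLOR
# Rank a color; anything unrecognized is treated like green.
severity() {
    case "$1" in
        red)    echo 2 ;;
        yellow) echo 1 ;;
        *)      echo 0 ;;
    esac
}

# worst_color COLOR...
# Print the most severe color among the arguments (green if none given).
worst_color() {
    worst=green
    for c in "$@"; do
        if [ "$(severity "$c")" -gt "$(severity "$worst")" ]; then
            worst=$c
        fi
    done
    echo "$worst"
}
```

One red dot anywhere is enough to turn the whole page red, which is exactly what you want to see from across the room.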
bb-hosts controls BB
Notes:
The etc/bb-hosts file controls the execution of Big Brother. This file should be
the same on all machines running BB. If Big Brother is having trouble, this is
the first place to look.
BBDISPLAY is the Web Server where the Big Brother Display will live and
where the BB pages will be created.
BBNET is the machine that will do the network testing. This can be the same
as BBDISPLAY, and often is.
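As a rough illustration of how a script could work out which machine holds a given role, here is a sketch that assumes each host line has the form "address hostname # keywords". The function names are hypothetical and this is not how runbb.sh itself is written.

```shell
#!/bin/sh
# Sketch only: assumes host lines look like "address hostname # keywords".

# host_with_role ROLE FILE
# Print the hostname of the first line that carries ROLE (for example BBNET).
host_with_role() {
    awk -v role="$1" '
        $3 == "#" {
            for (i = 4; i <= NF; i++)
                if ($i == role) { print $2; exit }
        }
    ' "$2"
}

# has_role ROLE FILE HOSTNAME
# Succeed when HOSTNAME is the machine that carries ROLE.
has_role() {
    [ "$(host_with_role "$1" "$2")" = "$3" ]
}
```

A startup script could then decide whether to launch the network tests with something like has_role BBNET "$BBHOME/etc/bb-hosts" "$(hostname)", assuming hostname prints the same name that is used in the file.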
Sample bb-hosts file
#
# BIG BROTHER HOSTS FILE COMMENTS WILL INTERFERE WITH PROCESSING!!
#
group <H3><I>The Big Brother Display Server</I></H3>
204.19.116.1 iti-s01.iti.qc.ca # BBPAGER BBNET BBDISPLAY ftp smtp pop3
group <H3><I>Local Server Group</I></H3>
204.19.116.2 iti-s02.iti.qc.ca # ftp smtp pop3 http:/iti-s02/
204.19.117.1 ns.iti.qc.ca # ftp smtp
group <H3><I>Test Modem Banks</I></H3>
dialup modem-bank-1 204.19.50.1 16
dialup modem-bank-2 204.19.50.17 16
summary canada.bc 204.19.116.1 http://www.iti.qc.ca/iti/users/sean/bb/
summary america.ny 204.19.116.1 http://www.iti.qc.ca/iti/users/sean/bb/
summary europe.uk 204.19.116.1 http://www.iti.qc.ca/iti/users/sean/bb/
Notes:
Comments confuse the scripts since they blindly search for keywords.
Therefore, do not comment out a line you don’t want in the file. Remove it
completely!
Groups are in effect until the next group line is reached. This will give the
display a pleasant table structure. HTML codes are permitted only on the
group lines.
Make sure that daemons are listed precisely as they appear in the /etc/services
file. A common error is misspelling pop3 as pop-3 or even just pop.
Since we look for precisely these daemons, spelling counts.
dialup lines are used to test banks of modems for connectivity. It’s nothing
special, it just pings banks of IP addresses to see which are active.
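To make the dialup idea concrete, here is a sketch that expands a starting address and a count, as on a line like "dialup modem-bank-1 204.19.50.1 16", into the individual addresses to ping. The function name is hypothetical, and the sketch assumes the whole bank fits within the last octet.

```shell
#!/bin/sh
# Sketch only: assumes the bank never runs past the last octet.

# expand_bank BASE COUNT
# Print COUNT consecutive addresses starting at BASE, one per line.
expand_bank() {
    base=$1
    count=$2
    prefix=${base%.*}      # first three octets
    start=${base##*.}      # last octet
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$prefix.$(($start + $i))"
        i=$(($i + 1))
    done
}
```

Each address printed by expand_bank would then be handed to ping to see which modems are active.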
Big Brother Protocol
Notes:
The protocol is pretty trivial. Make sure port 1984 is available and not
blocked. Otherwise it won’t work, eh?
Most of the work is handled locally by the scripts; the severity levels and data
arrive pre-formatted. All bbd has to do is take one of two very simple actions:
it either writes a log file with status information or calls the pager.
Extending BB to test for other functions is easy; just have bb send the new
data to bbd with a new function name. So to add a function called bobo, do
the test and have bb send this data to “machine.bobo” and the display will be
updated automatically the next time the mkbb.sh script runs.
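A sketch of what a home-grown bobo test might look like. The message layout ("status machine.bobo color text") and the way the bb client is invoked are assumptions made for illustration; check the scripts in your own copy for the real format. BB and BBDISPLAY are assumed to be environment variables naming the client program and the display machine.

```shell
#!/bin/sh
# Sketch only: the message layout and the bb invocation are assumptions.

# bobo_report MACHINE COLOR TEXT...
# Build a one-line report addressed to "MACHINE.bobo".
bobo_report() {
    machine=$1
    color=$2
    shift 2
    echo "status $machine.bobo $color $*"
}

# send_report MESSAGE
# Hand the message to the bb client named in $BB, addressed to $BBDISPLAY.
# When BB is not set, just print the message so the sketch can be tried safely.
send_report() {
    if [ -n "$BB" ]; then
        "$BB" "$BBDISPLAY" "$1"
    else
        echo "$1"
    fi
}
```

For a machine called coffee, something like send_report "$(bobo_report coffee green bobo is happy)" would do the whole job.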
Big Brother Security
Notes:
The etc/security file just contains lines with IP or network addresses of clients
permitted to connect to bbd running on that machine. If the file exists, then
only those hosts and networks listed will be allowed to connect. All others
will be silently dropped. For example:
204.101.110.101 Allow client to connect
204.101.112.0 Allow subnet to connect
If Big Brother is not running as root, it might have trouble reading log files on
certain machines depending on permissions.
bbd checks to see if the bb client is trying to do funny things with the
pathnames or is attempting to overflow buffers.
All BB commands are stored in environment variables, and are executed using
their full pathnames to avoid possible Trojan horses.
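bbd does this check in C, but the rule is easy to sketch in shell. The matching convention below is an assumption: an entry ending in .0 is treated as a whole network whose first three octets must match, and anything else must match exactly. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: the ".0 means whole network" convention is assumed.

# is_allowed ADDRESS FILE
# Succeed if ADDRESS may connect according to the security file FILE.
is_allowed() {
    addr=$1
    file=$2
    # No security file means no restriction.
    [ -f "$file" ] || return 0
    while read -r entry rest; do
        case "$entry" in
            '') continue ;;
            *.0)
                # Network entry: compare the first three octets.
                [ "${addr%.*}" = "${entry%.*}" ] && return 0 ;;
            *)
                # Host entry: must match exactly.
                [ "$addr" = "$entry" ] && return 0 ;;
        esac
    done < "$file"
    return 1
}
```

With the two example lines above in the file, 204.101.110.101 and anything in 204.101.112.x get through, and everyone else is turned away.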
Big Brother Benefits
Notes:
Probably the greatest benefit is the ability to publish system status information
in a format that is understandable by all. No more calls about whether or not a
service is available.
There’s enough information on the Web Pages to finally let managers know
how well their complex network and even more complex administrators are
doing. Green screens are good.
Even when there is trouble, management and the help desks can be confident
that the admin has already been notified by Big Brother and that they are
working on the problem.
Since installing BB, I’ve never been caught by surprise by a user. It also
allowed me to go for coffee in peace knowing I’ll get paged if need be. Being
proactive, that’s what they call it.
Big Brother Statistics
Notes:
The articles in Sys Admin, and Paul Sittler’s article in Linux Journal certainly
haven’t hurt getting the word out, either. And I’m here ‘cause Shawn Welsh is
running BB @home.
Best of all is the community that has spontaneously appeared around BB.
These Brothers are wise and good.
Getting Big Brother
Notes:
The best way to understand Big Brother is to download it and try it out. It’s
pretty simple and does a lot of what admins need to do.
The code has been ported to just about every Unix box around, from the very
old, to Crays... there’s even an OpenVMS port out there somewhere.
The install is simple. You may have to adjust some paths in etc/bbdef.sh in
case there is some screaming, but that should be the extent of the mods
required.
Commercial use is restricted. You can’t charge others for the services BB
provides or include it in a product for sale without first obtaining a
Commercial License.
And if you have any questions, hit the BB mailing list, or the fine archives run
by Nick Silberstein. They live at http://www.fusioni.com/~bb/
Future Directions
Notes:
Changes to Big Brother happen relatively slowly. It seems any time I touch
the source code I’m as likely to break something as to fix something.
Windows NT is looking more and more like a likely target for Big Brother.
Robert-Andre Croteau has already done an excellent client, the bbd server
ports out of the box, and all that really remains to be done is the Web-page
creation programs.
Logging, along with elegant and useful historical system info, has been on the
“to do” list for a long time. Maybe this year.
Maybe Sun will license Big Brother to replace Sun Net Manager :)
In Conclusion
Notes:
Thanks for listening, and at this point I’d be happy to answer any questions
you might have about BB.