Software Engineering For Internet Applications
2006
Contents
Preface
Acknowledgments
Introduction
Basics
Planning
Software Structure
Content Management
Software Modularity
Discussion
Voice (VoiceXML)
Scaling Gracefully
Search
Planning Redux
Writeup
Reference Chapters
HTML
Grading Standards
Glossary
To the Instructor
Preface
This is the textbook for the MIT course "Software Engineering for Internet
Applications." The course is intended for juniors and seniors in computer
science. We assume that they know how to write a computer program and
debug it. We do not assume knowledge of any particular programming languages, standards, or protocols. The most concise statement of the course
goal is that "The student finishes knowing how to build amazon.com by him
or herself."
Other people who might find this book useful include the following:
• professional software developers building online communities or other multiuser Internet applications
• managers who are evaluating packaged software aimed at supporting online
communities; various chapters contain criteria for judging the features of
products such as Microsoft Sharepoint or Microsoft Content Management
Server
• university students and faculty looking to add some structure to a capstone
project at the end of a computer science degree
If you're confused by the "student knows how to build amazon.com" statement, we can break it down in terms of principles and skills. The fundamental
difference between server-based Internet applications and the desktop applications that students have already learned to build is that server-based applications have multiple simultaneous users. Coupled with the unreliability of
networks, this gives rise to the problems of concurrency and transactions.
Stateless communications protocols such as HTTP mean that the student must
learn how to build a stateful user experience on top of stateless protocols. For
persistence between clicks and management of concurrency and transactions,
the student needs to learn how to use the relational database management system. Finally, though this goes beyond the simple stand-alone amazon.com-style
service, students ought to learn about object-oriented distributed computing
where each object is a Web service.
In addition to learning these principles, we'd like the student to learn some
skills. This is a laboratory course, and we want students who graduate to be
competent software engineers. We'd like our students to be able to take vague
and ambitious specifications and turn them into a system design that can be
built and launched within a few months, with the features most important to
users and easiest to develop built first and the difficult bells and whistles deferred to a second version. We'd like our students to know how to test prototypes with end-users and refine their application design once or twice within
even a three-month project. When business requirements are extreme, for
example, "build me amazon.com by yourself in three months," we want our
students to understand how to cope with the challenge via automatic code generation and use of open-source toolkits where appropriate.
We can recast the "student knows how to build amazon.com" statement in
terms of technologies used. By the time someone has finished reading and doing
the exercises in this book, he or she will understand HTTP, HTML, SQL, mobile browsers on telephones, VoiceXML, data modeling, page flow and interaction design, server-side scripting, and usability analysis.
Eve Andersson, Philip Greenspun, Andrew Grumet
Cambridge, Massachusetts
December 2005
Acknowledgments
Introduction
"The concern for man and his destiny must always be the chief interest of all technical
effort. Never forget it between your diagrams and equations."
Albert Einstein
A twelve-year-old can build a nice Web application using the tools that came
standard with any Linux or Windows machine. Thus it is worth asking ourselves, "What is challenging, interesting, and inspiring about Internet-based
applications?"
There are some easy-to-identify technology-related challenges. For example,
in many situations it would be more convenient to interact with an information
system by talking and listening. You're in the bathtub reading the New Yorker.
You want to know whether there are any early morning appointments on
your calendar that would prevent you from staying in the tub and finishing
an interesting article. You've bought a new DVD player. You could read the
manual and master the remote control. But in a dark room, wouldn't it be
easier if you could simply ask the house or the machine to "back up thirty
seconds"? You're driving in your car and curious to know the population of
Thailand and the country's size relative to the state of California; voice is your
only option.
There are some easy-to-identify missing features in typical Web-based applications. For example, shareable and portable sessions. You can use the Internet
to share your photos. You can use the Internet to share your music. You can
use the Internet to share your documents. The one thing that you can't typically share on the Internet is your experience of using the Internet. Suppose
that you're surfing a travel site, planning a trip for yourself and three friends.
Wouldn't it be nice if your companions could see what you're looking at,
page-by-page, and speak comments into a shared voice-session? If everyone
has the same brand of computer and special software, this is easy enough. But
shareable sessions ought to be a built-in feature of sites that are usable from
any browser. The same infrastructure could be used to make sessions portable.
You could start browsing on a desktop computer with a big screen and finish
your session in a taxi on a mobile phone.
Speaking of mobile browsers, their small screens raise the issues of multimodal user interfaces and personalization. With the General Packet Radio Service or GPRS, rolled out across the world in late 2001, it became possible for
a mobile user to simultaneously speak and listen in a voice connection while
using text screens delivered via a Web connection. As an engineer, you'll have
to decide when it makes sense to talk to the user, listen to the user, print out a
screen of options to the user, and ask the user to highlight and click to choose
from that screen of options. For example, when booking an airline flight it is
much more convenient to speak the departure and arrival cities than to choose
from a menu of thousands of airports worldwide. But if there are ten options
for making the connection you don't want to wait for the computer to read
out those ten and you don't want to have to hold all the facts about those ten
options in your mind. It would be more convenient for the travel service to
send you a Web page with the ten options printed and scrollable.
On the personalization front, consider the corporate knowledge sharing or
knowledge management system. Initially, workers are happy simply to have
this kind of system in place. But after a few years, the system becomes so filled
with stuff that it is difficult to find anything relevant. Given an organization in
which 1,000 documents are generated every day, wouldn't it be nice to have a
computer system smart enough to figure out which three are likely to be most
interesting to you? And display the titles on the three lines of your phone's
display?
A more interesting challenge is presented by asking the question, "Can a
computer help me be all that I can be?" Engineers often build things that are
easy to engineer. Fifty years after the development of television, we started
building high-definition television (HDTV). Could engineers build a higher
resolution standard? Absolutely. Did consumers care? So far it seems that not
too many do care.
Let's put it this way: Given a choice between watching Laverne and Shirley
in HDTV and being twenty pounds thinner, which would you prefer?
Thought so.
If you take a tape measure down to the self-help section of your local bookstore you'll discover a world of unmet human goals. A lot of these goals are
tough to reach because we lack willpower. Olympic athletes also lack willpower
at times. But they get to the Olympics, and we're still fat. Why? Maybe because
they have a coach and we don't. Where are the engineering challenges in building a network-based diet coach? First look at a proposed interaction with the
computer system that we'll call "Dr. Rachel":
0900: you're walking to work; you call Dr. Rachel from your mobile:
Dr. Rachel: "What did you have for breakfast this morning?" (She knows that it is
morning in your typical time zone; she knows that you've not called in so far today.)
You: "Glass of orange juice. Two eggs. Two slices of bread. Coffee with milk and
sugar."
Dr. Rachel: "Was the orange juice glass small, medium, or large?"
You: "Medium."
1045: your programmer officemate brings in a box of donuts; you eat one. Since you're
at your computer anyway, you pull down the Dr. Rachel bookmark from the Web
browser's favorites menu. You quickly inform Dr. Rachel of your consumption. She
confirms the donut and shows you a summary page with your current estimated weight,
what you've reported eating so far today, the total calories consumed so far today, and
how many are left in your budget. The page shows a warning-red "Don't eat more than
one small sandwich for lunch" hint.
1330: you're at the cafe down the street, having a small sandwich and a Diet Coke. It
is noisy and you don't want to disturb people at the neighboring tables. You use your
mobile phone's browser to connect to Dr. Rachel. She knows that it is lunchtime and
that you've not told her about lunch so the lunch menus come up first. You report
your consumption.
1600: your desktop machine has crashed (again). Fortunately the software company
where you work provides free snacks and soda. You go into the kitchen and power
down on a bag of potato chips and some Mountain Dew. When you get back to your
desk, your computer is still dead. You call Dr. Rachel from your wired phone and
tell her about the snack and soda. She cautions you that you'll have to go to the gym
tonight.
1900: driving back from the gym, you call Dr. Rachel from your car and tell her that
you worked out for 45 minutes.
2030: you're finished with dinner and weigh yourself. You use the Web browser on
your home computer to report the food consumption and weight as measured by the
scale. Dr. Rachel responds with a Web page informing you that the measured weight is
higher than she would have predicted. She's going to adjust her assumptions about your
portion estimates, e.g., in the future when you say "medium" she'll assume "large."
From the sample interaction, you can infer that Dr. Rachel must include the
following components: an adaptive model of the user; a database of calorie
counts for different foods; some knowledge about effective dieting, for example,
how many calories can be consumed per day if one intends to reach Weight X
by Date Y; a Web browser interface; a mobile browser interface; a conversational voice interface (though perhaps one could get by with a simple VoiceXML
interface).
What if, after two months, you're still fat? Should Dr. Rachel call you up in
the middle of meals to suggest that you don't need to clean your plate? Where's
the line between being effective and annoying? Can the computer system read
your facial expression to figure out when to back off?
What are the enduring unmet human goals? To connect with other people
and to learn. Email and reference library were the two universally appealing
applications of the Internet, according to a December 1999 survey conducted
by Norman Nie and Lutz Erbring and reported in "Internet and Society," a January 2000 report of the Stanford Institute for the Quantitative Study of Society
(http://www.stanford.edu/group/siqss/Press_Release/Preliminary_Report.pdf).
Entertainment and business-to-consumer e-commerce were far down the list.
Let's consider the "connecting with other people" goal. Suppose the people
already know each other. They may be able to meet face-to-face. They can almost surely pick up the telephone and call each other using a system that dates
from the nineteenth century. They may choose to exchange email, a system
that dates from the 1960s. It doesn't look as though there is any challenge for
twenty-first century engineers here.
Suppose the people don't already know each other. Can technology help?
First we might ask "Should technology help?" Why would you want to talk to
a bunch of strangers rather than your close friends and family? The problem
with your friends and family is that by and large they (a) know the same things
that you know, and (b) know the same people that you know. Mark Granovetter's classic 1973 study "The Strength of Weak Ties" (American Journal of Sociology 78: 1360-80) showed that most people got their jobs from people whom
they did not know very well. Friends of friends of friends, perhaps. There are
got tired, you'd go to bed. Teaching is fun if you don't have to do it forty hours
per week for thirty years.
Imagine if every learning photographer had a group of experienced photographers answering his or her questions. That's the online community photo.net,
started by one of the authors as a collection of tutorial articles and a question-and-answer forum in 1993 and, as of August 2005, home to 426,000 registered
users engaged in answering each other's questions and critiquing each other's
photographs. Imagine if every current MIT student had an alumnus mentor.
That's what some folks at MIT have been working on. It seems like a much
more effective strategy to get some volunteer labor out of the 90,000 alumni
than to try to squeeze more from the 930 faculty members. Most of MIT's
alumni don't live in the Boston area. Students can benefit from the volunteerism of distant alumni only if (1) student-faculty interaction is done in a
computer-mediated fashion so that it becomes visible to authorized mentors,
and (2) mentors can use the same information system as the students and faculty to get access to handouts, assignments, and lecture notes. We're coordinating people separated in space and time who share a common purpose. Again,
that's an online community.
Online communities are challenging because learning is difficult and people
are idiosyncratic. Online communities are challenging because the software
that works for a community of 200 won't work for a community of 2,000 or
20,000. Online communities are inspiring engineering projects because they
deliver to users two of the things that they want most out of life: connections
to other people and education.
If your interest in this book stems from the desire to build a straightforward e-commerce
site, don't despair. It turns out that the most successful e-commerce and collaborative
commerce sites are, at their core, actually online communities. Amazon is the best
known example. In 1995 there were dozens of online bookstores with comprehensive
catalogs. Amazon had a catalog but, with its reader review facility, Amazon also had a
mechanism for users to communicate with each other. Thus did the programmers at
Amazon crush their competition.
As you work through this book, you're going to build an online learning
community. Along the way, you'll pick up all the important principles, skills,
and technologies for building desktop Web, mobile Web, and voice applications of all types.
More
• on GPRS: "Emerging Technology: Clear Signals for General Packet Radio
Service" by Peter Rysavy in the December 2000 issue of Network Magazine,
available at http://www.rysavy.com/Articles/GPRS2/gprs2.html
• on the state-of-the-art in easy-to-build voice applications: Chapter 10 on
VoiceXML (stands by itself reasonably well)
Basics
The most important thing to know about HTTP is that it is stateless. If you
view ten Web pages, your browser makes ten independent HTTP requests of
the publisher's Web server. At any time in between those requests, you are
free to restart your browser program. At any time in between those requests,
the publisher is free to restart its server program.
Here's the anatomy of a typical HTTP session:
• user types "www.yahoo.com" into a browser
• browser translates www.yahoo.com into an IP address and tries to open a
TCP connection with port 80 of that address (TCP is "Transmission Control
Protocol" and is the fundamental system via which two computers on the
Internet send streams of bytes to each other.)
• once a connection is established, the browser sends the following byte stream:
"GET / HTTP/1.0" (plus two carriage-return line-feeds). The "GET" means
that the browser is requesting a file. The "/" is the name of the file, in this
case simply the root index page. The "HTTP/1.0" says that this browser
would prefer to get a result back adhering to the HTTP 1.0 protocol.
• Yahoo responds with a set of headers indicating which protocol is actually
being used, whether or not the file requested was found, how many bytes are
contained in that file, and what kind of information is contained in the file
(the Multipurpose Internet Mail Extensions or "MIME" type)
• Yahoo's server sends a blank line to indicate the end of the headers
In this case we've used the Unix telnet command with an optional argument
specifying the port number for the target host; everything typed by the programmer is here indicated in bold. We typed the "GET ..." line ourselves and
then hit Enter twice on the keyboard. Yahoo's first header back is "HTTP/1.0
200 OK". The HTTP status code of 200 means that the file was found
("OK").
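The same exchange can be sketched in a few lines of Python rather than typed into telnet. This is only an illustration under stated assumptions: instead of contacting Yahoo, it starts a throwaway server on the local machine, and the names HelloHandler, fetch_raw, and demo are invented for this sketch. The client side opens a TCP connection, sends the hand-built request followed by a blank line, and splits the reply at the blank line that ends the headers.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the publisher's Web server."""

    def do_GET(self):
        body = b"<html><body>hello</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")  # the MIME type
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()  # writes the blank line that ends the headers
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch_raw(host, port, path="/"):
    """Send a hand-built HTTP/1.0 GET; return (status_line, headers, body)."""
    # The request line plus two carriage-return line-feeds.
    request = "GET {} HTTP/1.0\r\n\r\n".format(path).encode("ascii")
    with socket.create_connection((host, port)) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed the connection: it is over
                break
            chunks.append(data)
    raw = b"".join(chunks)
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = dict(line.split(": ", 1) for line in lines[1:])
    return lines[0], headers, body


def demo():
    """Run one request against a local server listening on a free port."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_raw("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the status line, a dictionary of headers, and the page body. Nothing in fetch_raw survives the call: a second request would open a new connection and start from scratch.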
Don't get too lost in the details of the HTTP example. The point is that
when the connection is over, it is over. If the user follows a hyperlink from the
Yahoo front page to "Photography," for example, that's a brand new HTTP
request. If Yahoo is using multiple servers to operate its site, the second request
might go to an entirely different machine. This sounds fine for browsing Yahoo. But suppose you're shopping at an e-commerce site such as Amazon. If
you put something in your shopping cart on one HTTP request, you still want
it to be there ten clicks later. Or suppose you've logged into photo.net on Click
23 and on Click 45 are responding to a discussion forum posting. You don't
want the photo.net server to have forgotten your identity and demand your
username and password again.
This presents you, the engineer, with a challenge: creating a stateful user experience on top of a fundamentally stateless protocol.
Where can you store state from request to request? Perhaps in a log file on
the Web server. The server would write down "Joe Smith wants three copies
of Bus Nine to Paradise by Leo Buscaglia." On any subsequent request by Joe
Smith, the server-side script can simply check the log and display the contents
of the shopping cart. A problem with this idea, however, is that HTTP is anonymous. A Web server doesn't know that it is Joe Smith connecting. The server
only knows the IP address of the computer making the request. Sometimes this
translates into a host name. If it is joe-smiths-desktop.stanford.edu, perhaps
you can identify subsequent requests from this IP address as coming from the
same person. But what if it is cache-rr02.proxy.aol.com, one of the HTTP
proxy servers connecting America Online's 20 million users to the public Internet? The same user's next request will very likely come from a different IP
address, that is, another physical computer within AOL's racks and racks
of proxy machines. The next request from cache-rr02.proxy.aol.com will very
likely come from a different person, that is, another physical human being
among AOL's 20 million subscribers who share a common pool of proxy
machines.
Somehow you need to write some information out to an individual user that
will be returned on that user's next request.
If all of your pages are generated by computer programs as opposed to being
static HTML, one idea would be to rewrite all the hyperlinks on the pages
served. Instead of sending the same files to everyone, with the same embedded
URLs, customize the output so that a user who follows a link is sending
extra information back to the server. Here is an example of how amazon.com
embeds a session key in URLs:
1. Suppose that a shopper follows a link to a page that displays a single book
for sale, e.g., http://www.amazon.com/exec/obidos/ASIN/1588750019/.
Note that 1588750019 is an International Standard Book Number (ISBN)
and completely identifies the product to be presented.
2. The amazon.com server redirects the request to a URL that includes a
session ID after the last slash, e.g.,
http://www.amazon.com/exec/obidos/ASIN/1588750019/103-9609966-7089404
3. If the shopper rolls a mouse over the hyperlinks on the page served, he or
she will notice that all the hyperlinks contain, at the end, this same session
ID.
Note that this session ID does not change in length no matter how long a shopper's session or how many items are placed in the shopping cart. The session
ID is being used as a key to look up the shopping basket contents in a database
within amazon.com. An alternative implementation would be to encode the
complete contents of the shopping cart in the URLs instead of the session ID.
Suppose, for example, that Joe Shopper puts three books in his shopping cart.
Amazon's server could simply add three ISBNs to all the hyperlink URLs that
he might follow, separated by slashes. The URLs will be getting a bit long but
Amazon's programmers can take encouragement from this quote from the
HTTP spec:
The HTTP protocol does not place any a priori limit on the length of a URI. Servers
MUST be able to handle the URI of any resource they serve, and SHOULD be able to
handle URIs of unbounded length if they provide GET-based forms that could generate
such URIs. A server SHOULD return 414 (Request-URI Too Long) status if a URI is
longer than the server can handle (see section 10.4.15).
There is no need to worry about turning away Amazon's best customers, the
ones with really big shopping carts, with a return status of "414 Request-URI
Too Long". Or is there? Here is a comment from the HTTP spec:
Note: Servers ought to be cautious about depending on URI lengths above 255 bytes,
because some older client or proxy implementations might not properly support these
lengths.
Perhaps this is why the real live amazon.com stores only a session ID in the
URLs.
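The session-ID-in-the-URL idea can be sketched in Python as follows. This is a minimal illustration, not a description of how amazon.com is actually implemented: the CARTS dictionary stands in for the server-side database, the session ID format is arbitrary, and the function names are invented here. The sketch rewrites only site-relative links in double-quoted href attributes.

```python
import re
import secrets

# Hypothetical in-memory stand-in for the server-side database of
# shopping carts, keyed by session ID.
CARTS = {}

# Matches href attributes whose target starts with "/" (site-relative).
HREF = re.compile(r'href="(/[^"]*)"')


def new_session():
    """Issue a fresh session ID and create an empty cart for it."""
    session_id = secrets.token_hex(8)
    CARTS[session_id] = []
    return session_id


def rewrite_links(html, session_id):
    """Append the session ID to every site-relative hyperlink in a page."""
    def add_id(match):
        path = match.group(1)
        if not path.endswith("/"):
            path += "/"
        return 'href="{}{}"'.format(path, session_id)
    return HREF.sub(add_id, html)


def session_from_path(path):
    """Return the session ID in the last path segment, or None if the
    segment is not an ID that this server issued."""
    candidate = path.rstrip("/").rsplit("/", 1)[-1]
    return candidate if candidate in CARTS else None
```

Because only the key travels in the URL, the links stay the same length however many books go into CARTS[session_id].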
Cookies
Instead of playing games with rewriting hyperlinks in HTML pages we can
take advantage of an extension to HTTP known as cookies. We said that
we needed a way to write some information out to an individual user that will
be returned on that user's next request. The first paragraph of Netscape's
"Persistent Client State HTTP Cookies: Preliminary Specification"
(http://wp.netscape.com/newsref/std/cookie_spec.html) reads:
Cookies are a general mechanism which server side connections (such as CGI scripts) can
use to both store and retrieve information on the client side of the connection. The addition
of a simple, persistent, client-side state significantly extends the capabilities of Web-based
client/server applications.
How does it work? After Joe Smith adds a book to his shopping cart, the server
writes a Set-Cookie header into its response.
As long as Joe does not quit his browser, on every subsequent request to your
server, the browser adds a header:
Cookie: cart_contents=1588750019
Your server-side scripts can read this header and extract the current contents of
the shopping cart.
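Both halves of this exchange can be sketched with Python's standard http.cookies module. The function names are invented for this illustration, and the Path attribute is an assumption rather than something taken from the text above.

```python
from http.cookies import SimpleCookie


def set_cookie_header(isbn):
    """Build the response header a server might send after a book is added."""
    cookie = SimpleCookie()
    cookie["cart_contents"] = isbn
    cookie["cart_contents"]["path"] = "/"  # assumed: valid for the whole site
    return cookie.output()  # a single "Set-Cookie: ..." header line


def read_cart(cookie_header_value):
    """Extract the cart contents from the value of an incoming Cookie header.

    Returns None when the browser sent no cart_contents cookie.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header_value)
    morsel = cookie.get("cart_contents")
    return morsel.value if morsel is not None else None
```

A server-side script would call read_cart with whatever follows "Cookie: " in the request, for example read_cart("cart_contents=1588750019").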
Sound like the perfect solution? In some ways it is. If you're a computer
science egghead you can take pride in the fact that this is a distributed database
management system. Instead of keeping a big log file on your server, you're
keeping bits of information on thousands of users' machines worldwide. But
one problem with cookies is that the spec limits you to asking each browser to
store no more than 20 cookies on behalf of your server and each of those
cookies must be no more than 4 kilobytes in size. A minor problem is that
cookie information will be passed back up to your server on every page load.
If you have indeed indulged yourself by parking 80 kilobytes of information
in 20 cookies and your user is on a modem, this is going to slow down Web
interaction.
A deeper problem with cookies is that they aren't portable for the user. If Joe
Smith starts shopping from his desktop computer at work and wants to continue from a mobile phone in a taxi or from a Web browser at home, he can't
retrieve the contents of his cart so far. The shopping cart resides in the memory
of his computer at work.
A final problem with cookies is that a small percentage of users have disabled them due to the privacy problems illustrated in figure 2.2.
Figure 2.2 Cookies coupled with the open-hearted behavior of 1990s browsers meant
the end of privacy on the Internet. Suppose that three publishers cooperate and agree
to serve all of their banner ads from http://noprivacy.com. When Joe User visits
search-engine.com and types in "acne cream", the page comes back with an IMG referencing noprivacy.com. Joe's browser will automatically visit noprivacy.com and ask for
"the GIF for SE9734". If this is Joe's first time using any of these three cooperating
services, noprivacy.com will issue a Set-Cookie header to Joe's browser. Meanwhile,
search-engine.com sends a message to noprivacy.com saying "SE9734 was a request for
acne cream pages." The "acne cream" string gets stored in noprivacy.com's database
along with "browser_id 7586". When Joe visits bigmagazine.com, he is forced to register
and give his name, email address, snail mail address, and credit card number. There are
no ads in bigmagazine.com. They have too much integrity for that. So they include in
their pages an IMG referencing a blank GIF at noprivacy.com. Joe's browser requests
"the blank GIF for BM17377" and, because it is talking to noprivacy.com, the site
that issued the Set-Cookie header, the browser includes a cookie header saying "I'm
browser_id 7586". When all is said and done, the noprivacy.com folks know Joe User's
name, his interests, and the fact that he has downloaded six spanking JPEGs from
kiddieporn.com.
A reasonable engineering approach to using cookies is to send a unique identifier for the data rather than the data, just as in the amazon.com session ID in
the URL example previously described. Information about the contents of the
shopping cart will be kept in some sort of log on the server. This means that it
can be picked up from another location. To see how this works in practice, go
to an operating system shell and request the home page of eveandersson.com:
Note that two cookies are set. The first one, ad_browser_id, is given an explicit expiration date in January 2010. This instructs the browser to record the
cookie value, in this case "3291092", on the hard drive. The cookie's value will
continue to be sent back up to the server for the next four years, even if the user
quits and restarts the browser. What's the point of having a browser cookie? If
the user says "I prefer text-only" or "I prefer French language" that's probably
worthwhile information to keep with the browser. The text-only preference
Server-Side Storage
You've got ID information going out to and coming back from browsers, via
either the cookie extension to HTTP or URL rewriting. Now you have to figure out a way to keep associated information on the Web server.
For flexibility in how you present and analyze user-contributed data, you'll
probably want to keep the information in a structured form. For example, it
would be nice to have a table of all the items put into shopping carts by various
users. And another table of orders. And another table of reader-contributed
product reviews. And another table of questions and answers.
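As a rough sketch of what such a table might look like, here is a cart table keyed by the session ID that travels in the cookie or URL, using Python's built-in sqlite3 module. The table name, column names, and helper functions are assumptions made for this illustration, and SQLite merely stands in for whatever database management system a real service would use.

```python
import sqlite3


def open_store():
    """Create an in-memory database holding one table of cart items."""
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE cart_items (
            session_id TEXT    NOT NULL,
            isbn       TEXT    NOT NULL,
            quantity   INTEGER NOT NULL CHECK (quantity > 0),
            PRIMARY KEY (session_id, isbn)
        )""")
    return db


def add_to_cart(db, session_id, isbn, quantity=1):
    """Add copies of a book to the cart belonging to one session."""
    with db:  # commit on success, roll back if anything raises
        updated = db.execute(
            "UPDATE cart_items SET quantity = quantity + ? "
            "WHERE session_id = ? AND isbn = ?",
            (quantity, session_id, isbn)).rowcount
        if updated == 0:  # first copy of this book in this cart
            db.execute(
                "INSERT INTO cart_items (session_id, isbn, quantity) "
                "VALUES (?, ?, ?)",
                (session_id, isbn, quantity))


def cart_contents(db, session_id):
    """Return the cart as a list of (isbn, quantity) rows, sorted by ISBN."""
    rows = db.execute(
        "SELECT isbn, quantity FROM cart_items "
        "WHERE session_id = ? ORDER BY isbn",
        (session_id,))
    return rows.fetchall()
```

Because the cart lives on the server, any browser that presents the same session ID sees the same contents.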
What's a good tool for storing tables of information? Consider first a spreadsheet program. These are inexpensive and easy to use. One should never apply
more complex technology than necessary for solving a problem. Something like
Visicalc, Lotus 1-2-3, Microsoft Excel, or StarOffice Calc would seem to serve
nicely.
The problem with a spreadsheet program is that it is designed for one user.
The program listens for user input from two sources: mouse and keyboard. The
program reports its results to one place: the screen. Any source of persistence
for a Web server has to contend with potentially thousands of simultaneous
users both reading and writing to the database. This is the problem that database management systems (DBMS) were intended to solve.
...
Programs written in this style have two drawbacks. First, they quickly become
complex and then can be developed and maintained only by professional programmers. Second, they contain a lot of errors. For example, the program
sketched above may have quite a few bugs. It is not yet after March 17, 2023, so
we can't be sure that the steps specified in the THEN clause of the IF statement
are error-free.
An alternative style of programming is "declarative." We tell the computer
what we want, for example, a report of users who've been registered for more
than one year but who haven't answered any questions in the discussion forum.
We don't tell the RDBMS whether to scan the users table first and then check
the discussion forum table or vice versa. We just specify the desired characteristics of the report and it is the job of the RDBMS to prepare it.
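The report just described might be written as follows, here run through Python's sqlite3 module so that it can be tried out. The table names, column names, and sample rows are invented for this sketch. Note that the SQL states only which rows are wanted; it says nothing about which table to scan first.

```python
import sqlite3


def sample_db():
    """Build a tiny in-memory database with made-up users and answers."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE users (
            user_id           INTEGER PRIMARY KEY,
            name              TEXT NOT NULL,
            registration_date TEXT NOT NULL  -- ISO dates, e.g. 2003-05-01
        );
        CREATE TABLE forum_answers (
            answer_id INTEGER PRIMARY KEY,
            user_id   INTEGER NOT NULL REFERENCES users(user_id),
            body      TEXT NOT NULL
        );
        INSERT INTO users VALUES
            (1, 'alice', '2003-05-01'),
            (2, 'bob',   '2005-06-01'),
            (3, 'carol', '2002-01-15');
        INSERT INTO forum_answers VALUES (1, 3, 'Try a tripod.');
    """)
    return db


def quiet_veterans(db, today):
    """Names of users registered more than one year before `today`
    who have not answered any question in the forum."""
    rows = db.execute("""
        SELECT u.name
        FROM users u
        WHERE u.registration_date < date(?, '-1 year')
          AND NOT EXISTS (SELECT 1
                          FROM forum_answers a
                          WHERE a.user_id = u.user_id)
        ORDER BY u.name""", (today,))
    return [name for (name,) in rows]
```

With the sample rows above and today set to '2005-12-01', only alice qualifies: bob registered too recently and carol has answered a question.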
Stop someone in the street. Pick someone with fashionable clothing so you
can be sure he or she is not a professional programmer. Ask this person,
"Have you ever programmed in a declarative computer language?" Follow
that up with "Have you ever used a spreadsheet program?" Chances are that
you can find quite a few people who will tell you that they've never written
any kind of computer program but yet they've developed fairly sophisticated
spreadsheet models. Why? The spreadsheet language is declarative: "Make
this cell be the sum of these three other cells." The user doesn't tell the spreadsheet program in what order to perform the computation, merely the desired
result.
The declarative language of the spreadsheet created an explosion in the
number of people who were able to develop working computer programs.
Through the mid-1970s, organizations that worked with data kept a staff of
programmers. If you wanted some analysis performed you'd call one into your
office, explain the assumptions and formulae to be used, then wait a few days
for a report. In 1979 Dan Bricklin (MIT EECS '73) and Bob Frankston (MIT
EECS '70) developed Visicalc and suddenly most of the people who'd been
hollering for programming services were able to build their own models.
With an RDBMS the metaphoric little strips of paper pushed under the door
are declarative programs in the SQL language. (See "SQL for Web Nerds" at
http://philip.greenspun.com/sql/ for a SQL language tutorial.)
The second pillar of RDBMS popularity is isolation of important data from
programmers' mistakes. With other kinds of database management systems, it
is possible for a computer program to make arbitrary changes to the data set.
This can be convenient for applications such as computer-aided design systems
with very complex data structures. However, if your goal is to preserve a data
set over a twenty-ve-year period, letting arbitrarily buggy imperative programs make arbitrary changes isnt a good idea. The RDBMS limits programmers to uttering very simple statements of the form INSERT, DELETE,
and UPDATE. Furthermore, if youre unhappy with the contents of your database you can simply review all the strips of paper that were pushed under the
door. Each strip will contain an SQL statement and the name of the program
or programmer that authored the strip. This makes it easy to correct mistakes
and reform oenders.
The third and final pillar of RDBMS popularity is good performance with
many thousands of simultaneous users. This is more a reflection on the refined
state of commercial development of systems such as IBM DB2, Oracle, Microsoft SQL Server, and the open-source PostgreSQL than an inherent feature of
the RDBMS itself.
The Steps
When building any Internet application you're going to go through the following steps:
1. Develop a data model. What information are you going to store and how
will you represent it?
2. Develop a collection of legal transactions on that model, e.g., inserts and
updates.
3. Design the page flow. How will the user interact with the system? What
steps will lead up to one of those legal transactions? (Note that "page flow"
embraces interaction design on Web and mobile browsers, and also via hierarchical voice menus in VoiceXML, but not conversational speech systems.)
4. Implement the individual pages. You'll be writing scripts that query
information from the data model, wrap that information in a template (in
HTML for a Web application), and return the combined result to the user.
It is very unlikely that you'll have a choice of tools for persistent storage. You
will be using an RDBMS and won't be making any fundamental technology
decisions at Steps 1 or 2. Designing the page flow is a purely abstract exercise.
There are some technology-imposed limits on the interface, but those are generally derived from public standards such as HTML, XHTML Mobile Profile,
and VoiceXML. So you need not make any technology choices for Step 3.
Step 4 is intellectually uninteresting and also uninteresting from an engineering point of view. An Internet service lives or dies by Steps 1 through 3. What
can the service do for the user? Is the page flow comprehensible and usable?
The answers to these questions are determined at Steps 1 through 3. However,
Step 4 is where you have a huge range of technology choices and therefore it
seems to generate a lot of discussion. This course and this book are neutral on
the subject of how you go about Step 4, but we provide some guidance on how
to make choices.
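To make Step 4 concrete, here is a minimal sketch of one such page script. It assumes Python with the built-in sqlite3 module standing in for the RDBMS. The function name render_courses_page is invented for illustration, and the my_courses table is borrowed from the exercises later in this chapter. Treat it as one possible shape for a script, not as a recommended tool.

```python
import html
import sqlite3


def render_courses_page(conn):
    """Query the data model, wrap the rows in an HTML template, return the page."""
    rows = conn.execute(
        "select course_number from my_courses order by course_number"
    ).fetchall()
    # Escape every value pulled from the database before placing it in markup.
    items = "\n".join(f"<li>{html.escape(row[0])}</li>" for row in rows)
    return (
        "<html>\n<head><title>My Courses</title></head>\n"
        f"<body>\n<h2>My Courses</h2>\n<ul>\n{items}\n</ul>\n</body>\n</html>"
    )


# Hypothetical setup: an in-memory SQLite database stands in for the real RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("create table my_courses (course_number varchar(20))")
conn.execute("insert into my_courses (course_number) values ('6.171')")
page = render_courses_page(conn)
```

Whatever tool you choose, the script keeps this three-part shape: query, template, return.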
First, though, let's step back and make sure that everyone knows HTML.
HTML
Here is some legal HTML:
My Samoyed is <I>really</I> hairy.
HTML stands for Hypertext Markup Language. The <I> is markup. It tells
the browser to start rendering words in italics. The </I> closes the <I> element
and stops the italics. If you want to be more tasteful, you can tell the browser
to emphasize the word really:
My Samoyed is <EM>really</EM> hairy.
Most browsers use italics to emphasize, but some use boldface and browsers
for ancient ASCII terminals (e.g., Lynx) have to ignore this tag or come up
with a clever rendering method. A picky user with the right browser program
can even customize the rendering of particular tags.
There are a few dozen more tags in HTML. You can learn them by choosing
View Source from your Web browser when visiting sites whose formatting you
admire. You can look at the HTML reference chapter of this book. You can
learn them by starting at Yahoo's directory of HTML guides and tutorials,
http://dir.yahoo.com/Computers_and_Internet/Data_Formats/HTML/Guides_and_Tutorials/. Or you can buy HTML & XHTML: The Definitive Guide
(Chuck Musciano and Bill Kennedy [O'Reilly, 2002]).
Document Structure
Armed with a big pile of tags, you can start strewing them among your words
more or less at random. Though browsers are extremely forgiving of technically
illegal markup, it is useful to know that an HTML document officially consists
of two pieces: the head and the body. The head contains information about the
document as a whole, such as the title. The body contains information to be
displayed by the user's browser.
Another structure issue is that you should try to make sure that you close
every element that you open. If your document has a <BODY> it should have
a </BODY> at the end. If you start an HTML table with a <TABLE> and
don't have a </TABLE>, a browser may display nothing. Tags can overlap,
but you should close the most recently opened before the rest, for example, for
something both boldface and italic:
My Samoyed is <B><I>really</I></B> hairy.
Something that confuses a lot of new users is that the <P> element used to
surround a paragraph has an optional closing tag </P>. Browsers by convention assume that an open <P> element is implicitly closed by the next <P> element. This leads a lot of publishers (including lazy old us) to use <P> elements
as paragraph separators.
Here's the source HTML from a simply formatted Web document:
<html>
<head>
<title>Nikon D1 Digital Camera Review</title>
</head>
<body bgcolor=white text=black>
<h2>Nikon D1</h2>
by <a href="http://philip.greenspun.com/">Philip Greenspun</a>
<hr>
Little black spots are appearing at the top of every ...
<h3>Basics</h3>
The Nikon D1 is a good digital camera for ...
<p>
The camera's 15.6x23.7mm CCD image sensor ...
<h3>User Interface</h3>
If you wanted a camera with lots of buttons, switches, and dials ...
<hr>
<address>
<a href="mailto:[email protected]">[email protected]</a>
</address>
</body>
</html>
hyperlink. If the reader clicks anywhere from here up to the </A> the browser
should fetch http://philip.greenspun.com/.
After the headline, author, and optional navigation, we put in a horizontal
rule tag: <HR>. One of the good things that we learned from designer Dave
Siegel (see http://philip.greenspun.com/wtr/getting-dates) is not to overuse
horizontal rules: Real graphic designers use whitespace for separation. We use
<H3> headlines in the text to separate sections and only put an <HR> at the
very bottom of the document.
Underneath the last <HR>, we sign our documents with the email address of
the author. This way a reader can scroll to the bottom of a browser window
and find out who is responsible for what they've just read and where to send
corrections. The <ADDRESS> tag usually results in an italics rendering by
browser programs. Note that this one is wrapped in an anchor tag with a target
of mailto: rather than http:. If the user clicks on the anchor text (Philip's
email address), the browser will pop up a "send mail to [email protected]"
window.
until 1:30 p.m. Further, suppose that User 356712 comes in at 12:30 p.m. and
changes his email address, thus updating a row in the users table. If the usage
tracking query arrives at this row at 12:45 p.m., Oracle will notice that the
row was last modified after the query started. Under the "I" in ACID, Oracle
is required to isolate the publisher from the user's update. Oracle does this
by reaching into the rollback segment and producing data from user row
356712 as it was at 12:00 p.m. when the query started. Here's the scenario in a
table:
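Time        Event
12:00 p.m.  Publisher's usage tracking query starts
12:30 p.m.  User 356712 changes his email address, updating his row in the users table
12:45 p.m.  Query arrives at row 356712; Oracle produces the row as it was at 12:00 p.m. from the rollback segment
1:30 p.m.   Query finishes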
How would this play out in Microsoft SQL Server? When you're reading, you
take read locks on the information that you're about to read. Nobody can write
until you release them. When you're writing, you take write locks on the information that you're about to update. Nobody can read or write until you release
the locks. In the preceding example, User 356712 would submit his request
for the address change at 12:30 p.m. The thread on the Web server would be
blocked waiting for the read locks to clear. How long would it wait? A full
hour with a spinning/waving "browser still receiving information" icon in the
upper right corner of the browser window. If you're thoughtful, you can program around this locking architecture in SQL Server, but most Internet service
operators would rather just install Oracle than train their programmers to think
more carefully about concurrency.
• Perl CGI
• Microsoft Active Server Pages
• Java Server Pages
• AOLserver ADP templates and .tcl scripts
A notable exception to this property is Java servlets. One servlet typically processes several URLs. This proves cumbersome in practice because it slows you
down when trying to fix a bug in someone else's code. The ideas of modularity
and code reuse are nice, but try to think about how many files a programmer
must wade through in order to fix a bug. One is great. Two is probably okay. N
where N is uncertain is not okay.
Filters. We said that modularity and code reuse could be tossed in favor of
preserving the sacred principle of "one URL, one file." The way that you
get modularity and code reuse back is via filters, the ability to instruct the
Web server to "run this fragment of code before serving any URL that starts
with /yow/." This is particularly useful for access control code. Suppose that
you have fifteen scripts that constitute the administration experience for a
contest system. You want to make sure that only authorized administrators
can use the pages. Checking for administrative access requires an SQL query.
You could write a procedure called CheckForContestAdminAuthority and
instruct your script authors to include a call to this procedure in each of the
fifteen admin scripts. You've still got fifteen copies of some code: one IF
statement, one procedure call, and a call to an error message procedure if
CheckForContestAdminAuthority returns "unauthorized." But the SQL
query occurs only in one place and can be updated centrally.
The main problem with this approach is not the fifteen copies of the IF statement and its consequents. The problem is that inevitably one of the script
authors will forget to include the check. So your site has a security hole. You
close the hole and eliminate fourteen copies of the IF statement by installing
the code as a server filter. Note that for this to work the filter mechanism must
include an API for aborting service of the requested page. Your filter needs to
be able to tell the Web server "Don't proceed with serving the user with the
script or document requested."
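The filter idea can be sketched in a few lines. This is a toy dispatcher in Python, not the API of any real Web server: the Server class, the /contest/admin/ prefix, and the ADMINS set are all invented for illustration, and the admin check stands in for the SQL query described above. The point to notice is the abort mechanism: a filter that returns a response stops the request before the page script runs.

```python
ADMINS = {"alice"}  # hypothetical; a real check would run the SQL query


def check_for_contest_admin_authority(user):
    """Stand-in for the SQL-backed check described in the text."""
    return "authorized" if user in ADMINS else "unauthorized"


class Server:
    """Toy Web server: URL-prefix filters run before the page script."""

    def __init__(self):
        self.filters = []  # list of (url_prefix, filter_function)
        self.pages = {}    # url -> page script

    def register_filter(self, prefix, func):
        self.filters.append((prefix, func))

    def serve(self, url, user):
        for prefix, func in self.filters:
            if url.startswith(prefix):
                aborted = func(url, user)
                if aborted is not None:
                    # The filter said not to proceed with the requested page.
                    return aborted
        return self.pages[url](user)


def admin_filter(url, user):
    if check_for_contest_admin_authority(user) == "unauthorized":
        return "403 Forbidden"
    return None  # None means "carry on and serve the page"


server = Server()
server.register_filter("/contest/admin/", admin_filter)
server.pages["/contest/admin/winners"] = lambda user: "winners page for " + user
server.pages["/contest/enter"] = lambda user: "entry form"
```

The admin scripts themselves contain no access check at all, so a forgetful script author cannot open a hole.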
Abstract URLs. As an engineer your primary contributions to an Internet service will be data model and interaction design (Steps 1 through 3). When you're
sketching the page flow for a discussion forum on a whiteboard you give the
Exercises
After solving these problems you will know
• How to log into your development server
• Rudiments of whatever programming language you've chosen
Figure 2.3 The Bill Gates Personal Wealth Clock. This program queries a public stock
quote server to find the price of Microsoft stock and the U.S. Census Bureau's server for
the current U.S. population, then combines the numbers on one page.
Sources: Population: U.S. Census Bureau, http://www.census.gov/cgi-bin/popclock. N
shares of Microsoft owned by Bill Gates: 1995 Microsoft Proxy Statement (141,159,990
shares adjusted for splits in December 1996, February 1998, March 1999, and February
2003). Microsoft Stock Price: Yahoo! Finance, http://yahoo.finance.com.
The final point worth mentioning about this program is that part of the hour
of coding went into building a general-purpose caching or memoization system
to record the results of evaluating any Tcl expression in a global variable. Why?
It seemed like bad netiquette to write a program that had the potential to impose an unreasonable load on the Census Bureau and stock quote servers. Also,
in the event that the Wealth Clock became popular, it would be asking the
underlying servers several times a second for the same data. Lastly it seemed
that users shouldn't have to wait for the two subsidiary pages to be fetched if
they didn't need up-to-the-minute data. With the complete HTML page stored
in a global variable, it is available from AOLserver's virtual memory space and
can be accessed much faster than even a static file. Users who want a real-time
answer can demand one with an extra mouse click. The calculation performed
for them then updates the cache for casual users.
The caching mechanism might sound like overengineering, but from time to
time the Wealth Clock would be linked to from extremely popular news sites
and receive several requests per second. The ability to handle a reasonably
high load like that, back in the mid-1990s, without an enormous server farm
was rather rare. Had those requests been passed directly through to the Census
Bureau, for example, the entire service would have slowed to a crawl.
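The caching idea can be sketched as follows. This is a Python analogue written for illustration, not the Wealth Clock's Tcl source: the memoize function, its parameters, and the max_age_seconds policy are our assumptions. A result is kept in a global dictionary and recomputed only when it is too old or when a user explicitly demands a fresh answer.

```python
import time

_cache = {}  # key -> (time_stored, value); lives in the server's memory


def memoize(key, compute, max_age_seconds, now=time.time, force=False):
    """Return the cached value for key, recomputing only if stale or forced.

    compute is a zero-argument function that does the expensive work,
    e.g. fetching a page from a remote server. force=True models the
    user who asks for a real-time answer; the fresh result then
    replaces the cache entry for everyone else.
    """
    current = now()
    entry = _cache.get(key)
    if not force and entry is not None and current - entry[0] <= max_age_seconds:
        return entry[1]
    value = compute()
    _cache[key] = (current, value)
    return value
```

With a one-hour max age, several requests per second translate into at most one fetch per hour from each underlying server, plus whatever the real-time users demand.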
The source code for this program is available at http://philip.greenspun.com/examples-basics/wealth-clock.tcl.txt and may prove helpful in doing the next
exercise.
Remember that it is a mistake to compare Harry Potter to Shakespeare. . . . That's because Harry Potter is a fictional character whereas Shakespeare was an author. What
you really ought to be doing is comparing J. K. Rowling to Shakespeare.
Jin S. Choi
that you can always type M-x shell again and get an operating system shell.
In the sql-shell buffer, type "sqlplus" to start SQL*Plus, the Oracle shell client. If you're using Windows, look for the program SQLPLUS.EXE or
PLUS80.EXE.
SQL*Plus will prompt you for a username and password. If you're using a
school-supplied development server, you may need to get these from your TA.
If you set up the RDBMS yourself, you might need to create a new tablespace
and user before you can do this exercise.
Type the following at the SQL*Plus prompt to create a table for keeping
track of the classes you're taking this semester:
create table my_courses (
course_number varchar(20)
);
Note that you have to end your SQL commands with a semicolon in SQL*Plus.
The semicolons are not part of the SQL language and you shouldn't use them when writing SQL in your Web scripts.
Insert a few rows, for example,
insert into my_courses (course_number) values ('6.171');
Note that until you typed this COMMIT, another connected database user
wouldn't have been able to see the row that you inserted. "Connected database
user" includes the Web server. A common source of student consternation with
Oracle is that they've inserted information with SQL*Plus and neglected to
COMMIT. The new information does not appear on any of their Web pages,
and they tear their hair out debugging. Of course nothing is wrong with their
scripts. It is just that the ACID guarantees mean that the Web server sees a different view of the database than the user who is in the middle of a transaction.
Your view of the table shouldn't change after a COMMIT, but maybe check
again:
select * from my_courses;
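If you'd like to watch the COMMIT effect without two terminal windows, the following sketch reproduces it with two connections to one database. It assumes Python's built-in sqlite3 module and a throwaway SQLite file rather than Oracle. The isolation machinery is different, but the visible behavior is the one described above. The connection names shell and web_server are ours.

```python
import os
import sqlite3
import tempfile


def count_rows(conn):
    # fetchall() exhausts the cursor so that no read lock lingers.
    return conn.execute("select count(*) from my_courses").fetchall()[0][0]


path = os.path.join(tempfile.mkdtemp(), "demo.db")
shell = sqlite3.connect(path)       # plays the role of SQL*Plus
web_server = sqlite3.connect(path)  # plays the role of the Web server

shell.execute("create table my_courses (course_number varchar(20))")
shell.commit()

# The insert opens a transaction on the shell connection.
shell.execute("insert into my_courses (course_number) values ('6.171')")
seen_by_shell_before = count_rows(shell)     # the shell sees its own row
seen_by_web_before = count_rows(web_server)  # the Web server does not

shell.commit()
seen_by_web_after = count_rows(web_server)   # now the row is visible
```

The Web server's query is never "wrong"; it simply isn't shown work that hasn't been committed.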
• "The newest computer can merely compound, at speed, the oldest problem
in the relations between human beings, and in the end the communicator
will be confronted with the old problem, of what to say and how to say
it." (Edward R. Murrow)
• "Egotism is the anesthetic that dulls the pain of stupidity." (Frank Leahy)
• "Some for renown, on scraps of learning dote, And think they grow immortal as they quote." (Edward Young)
Return to your RDBMS shell client (e.g., SQL*Plus for Oracle) and select
* from the table to see that your quotation has been inserted into the table.
In your RDBMS shell client, insert a quotation with some hand-coded SQL.
To see the form of the SQL INSERT command you should use, examine the
code on the page quotation-add. After creating this new table row, do select
* again, and you should now see two rows.
Hint: Don't forget that SQL quotes strings using single quotes, not double
quotes.
Now reload the quotations URL from your Web browser. If you don't see
your new quotation here, that's because you didn't type commit; at SQL*Plus
and the Web server is being protected from seeing the unfinished transaction.
Exercise 6a: Eliminating the lock table via a Sequence
Read about Oracle's sequence database object in the Data Modeling and
Transactions chapters of SQL for Web Nerds at http://philip.greenspun.com/sql/data-modeling and http://philip.greenspun.com/sql/transactions. By
creating a sequence, you should be able to edit the quotation-add script to
• eliminate the need for lock table
• eliminate the transaction machinery (since you're no longer tying multiple
SQL statements together)
• generate a unique key for the new quotation within the INSERT statement
itself
entry box labeled "new category." Make sure to modify quotation-add so that
it recognizes when a new category is being defined.
Exercise 8: Searching
Add a small form at the top of /basics/quotations that takes a single query
word from the user. Build a target for this form that returns all quotes containing the specified word. Your search should be case-insensitive and also
look through the authors column. Hints: like '%foo%' and SQL's UPPER and
LOWER functions.
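The hint can be sketched as below, assuming Python's sqlite3 module in place of your RDBMS and the column names from the XML exercise later in this problem set (quotation_id, author_name, quote). The function name search_quotations and the sample rows are ours. Note the bind variables: the user's word is never pasted into the SQL string.

```python
import sqlite3


def search_quotations(conn, word):
    """Case-insensitive substring search over quote text and author name."""
    # Uppercase both sides so the comparison does not depend on the
    # RDBMS's default case sensitivity for LIKE.
    pattern = "%" + word.upper() + "%"
    return conn.execute(
        "select quotation_id, author_name, quote from quotations "
        "where upper(quote) like ? or upper(author_name) like ? "
        "order by quotation_id",
        (pattern, pattern),
    ).fetchall()


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table quotations (quotation_id integer primary key, "
    "author_name varchar(100), quote varchar(4000))"
)
conn.executemany(
    "insert into quotations (quotation_id, author_name, quote) values (?, ?, ?)",
    [
        (1, "Frank Leahy", "Egotism is the anesthetic that dulls the pain of stupidity."),
        (2, "Edward Young", "Some for renown, on scraps of learning dote."),
    ],
)
```

One caveat of this sketch: a % or _ typed by the user still acts as a LIKE wildcard. Whether that is a bug or a feature is up to you.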
Exercise 9: Personalizing Your Service with Cookies
Now implement per-browser personalization of the quotation database. The
overall goal should be
• A user can "kill" a quotation and have it never show up again either from
the top-level page or the search page.
• Killing a quotation is persistent and survives the quitting and restarting of a
browser.
• Quotations killed by one user have no effect on what is seen by other users.
• Users can erase their personalizations and see the complete quotation database again by clicking on an "erase my personalization" link on the main
page. This link should appear only if the user has personalized the quotation
database.
You'll implement this using cookies. From your technology supplement you'll
need to learn how to read the incoming HTTP request headers and then parse
out the Cookie header, or perhaps you'll have an API that makes it easy to get
the value of a particular cookie. Note that you can expire a cookie by reissuing
it with an expiration date that has already passed.
Hint 1: It is possible to build this system using an ID cookie for the browser
and keeping the set of killed quotations in the RDBMS. However, if you're not
going to allow users to log in and claim their profile, there really isn't much
point in keeping data on the server.
Hint 2: It isn't strictly copacetic with the cookie spec, but browsers accept
cookie values containing spaces. So you can store the killed quotations as a
space-separated list if you like.
Hint 3: Don't filter the quotations in your Web script. It is generally a sign of
incompetent programming when you query more data from the RDBMS than
you're going to display to the end-user. SQL is a very powerful query language.
You can use the NOT IN feature to exclude a list of quotations.
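Hints 2 and 3 together might look like the sketch below. It assumes Python's sqlite3 module, a cookie value that has already been pulled out of the request headers, and the space-separated format from Hint 2. The function names parse_killed and visible_quotations are invented for illustration. The cookie comes from the browser, so it is treated as untrusted: only tokens that are plain digits survive, and they reach the query as bind variables.

```python
import sqlite3


def parse_killed(cookie_value):
    """Turn '3 17 42' into [3, 17, 42], silently dropping anything odd."""
    return [
        int(token)
        for token in cookie_value.split()
        if token.isascii() and token.isdigit()
    ]


def visible_quotations(conn, cookie_value):
    """Let the RDBMS exclude killed quotations via NOT IN."""
    killed = parse_killed(cookie_value)
    sql = "select quotation_id, author_name, quote from quotations"
    if killed:
        placeholders = ", ".join("?" for _ in killed)
        sql += f" where quotation_id not in ({placeholders})"
    sql += " order by quotation_id"
    return conn.execute(sql, killed).fetchall()


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table quotations (quotation_id integer primary key, "
    "author_name varchar(100), quote varchar(4000))"
)
conn.executemany(
    "insert into quotations (quotation_id, author_name, quote) values (?, ?, ?)",
    [(1, "A", "first"), (2, "B", "second"), (3, "C", "third")],
)
```

The Web script never sees a row that it isn't going to display.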
Exercise 10: Publishing Data in XML
As you learned above from querying bookstores, data on the Web have not traditionally been formatted for convenient use by computer programs. In theory,
people who wish to exchange data over the Web can cooperate using XML, a
1998 standard from the Web Consortium (http://www.w3.org/XML/). In practice, you'll be hard pressed to get any XML-based cooperation from the average Web site right now (2005). Fortunately for your sake in completing this
problem set, you can cooperate with your fellow students: the overall goal is
to make quotations in your database exportable in a structured format so that
other students' applications can read them.
Here's what we need in order to cooperate:
• an agreed-upon URL at everyone's server where the quotations database
may be obtained: /basics/quotations-xml
• an agreed-upon format for the quotations
(In point of fact, we could avoid the need for prior agreement by setting up
infrastructures for service discovery and by employing techniques for self-describing data, both of which we'll deal with later in the semester, but we'll
keep things simple for now.)
We'll format the quotations using XML, a conventional notation for describing structured data. XML structures consist of data strings enclosed in HTML-like tags of the form <foo> and </foo>, describing what kind of thing the data
is supposed to be.
Here's an informal example, showing the structure we'll use for our
quotations:
<quotations>
<onequote>
<quotation_id>1</quotation_id>
<insertion_date>2004-01-26</insertion_date>
<author_name>Britney Spears</author_name>
<category>Pop Musician Leisure Activities</category>
<quote>I shop, go to movies, and go out to eat.</quote>
</onequote>
<onequote>
... another row from the quotations table ...
</onequote>
... some more rows
</quotations>
Notice that there's a separate tag for each column in our SQL data model:
<quotation_id>
<insertion_date>
<author_name>
<category>
<quote>
There's also a "wrapper" tag that identifies each row as a <onequote> structure, and an outer wrapper that identifies a sequence of <onequote> structures
as a <quotations> document.
Building a DTD
We can give a formal description of our XML structure, rather than an informal example, by means of an XML Document Type Definition (DTD).
Our DTD will start with a definition of the quotations tag:
<!ELEMENT quotations (onequote)+>
This says that the quotations element must contain at least one occurrence of
onequote, but may contain more than one. Now we have to say what constitutes a legal onequote element:
<!ELEMENT onequote (quotation_id,insertion_date,author_name,category,quote)>
This says that the sub-elements, such as quotation_id, must each appear exactly once and in the specified order. Now we have to define an XML element
that actually contains something other than other XML elements:
<!ELEMENT quotation_id (#PCDATA)>
<!ELEMENT quotation_id (#PCDATA)>
<!ELEMENT insertion_date (#PCDATA)>
<!ELEMENT author_name (#PCDATA)>
<!ELEMENT category (#PCDATA)>
<!ELEMENT quote (#PCDATA)>
You will find this extremely useful. Hey, actually you won't find this DTD useful at all for completing this part of the problem set. The only situation in
which a DTD is useful is when feeding documents to an XML parser because
then the parser can automatically tokenize each XML document. For implementing your quotations-xml page, you will only need to look at the informal
example.
The meat of this exercise: Write a script that queries the quotations table,
produces an XML document in the preceding form, and returns it to the client with a MIME type of application/xml. Place this in the file system at
/basics/quotations-xml, so that other users can retrieve the data by visiting
that agreed-upon URL.
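One way to produce the document, sketched in Python with the standard xml.etree.ElementTree module and sqlite3 standing in for your RDBMS. The function name quotations_xml and the CONTENT_TYPE constant are ours. How you hand the string and MIME type back to the client depends on your Web server's API, which this sketch does not cover. Building the tree with a library, rather than gluing strings together, means that characters such as & and < in a quotation are escaped for you.

```python
import sqlite3
import xml.etree.ElementTree as ET

CONTENT_TYPE = "application/xml"  # MIME type to send with the document
COLUMNS = ["quotation_id", "insertion_date", "author_name", "category", "quote"]


def quotations_xml(conn):
    """Return the quotations table as a <quotations> document (a string)."""
    root = ET.Element("quotations")
    query = "select " + ", ".join(COLUMNS) + " from quotations order by quotation_id"
    for row in conn.execute(query):
        one = ET.SubElement(root, "onequote")
        # One sub-element per column, in the order the DTD requires.
        for name, value in zip(COLUMNS, row):
            ET.SubElement(one, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


# Sample row taken from the informal example above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table quotations (quotation_id integer primary key, "
    "insertion_date date, author_name varchar(100), "
    "category varchar(100), quote varchar(4000))"
)
conn.execute(
    "insert into quotations values (1, '2004-01-26', 'Britney Spears', "
    "'Pop Musician Leisure Activities', 'I shop, go to movies, and go out to eat.')"
)
document = quotations_xml(conn)
```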
Exercise 11: Importing XML
Write a program to import the quotations from another student's XML output
page. Your program must
• Grab /basics/quotations-xml from another student's server.
• Parse the resulting XML structure into records and then parse the records
into fields.
• If a quote from the foreign server has identical author and content as a quote
in your own database, ignore it; otherwise, insert it into your database with a
new quotation_id. (You don't want keys from the foreign server conflicting with what is already in your database.)
Hint: You can set up a temporary table using create table quotations_temp as select * from quotations and then drop it after you're done
debugging, so that you don't mess up your own quotations database.
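The parse-and-merge steps might be sketched like this, again in Python with xml.etree.ElementTree and sqlite3 as stand-ins. The function name import_quotations is invented, the XML is passed in as a string (fetching it from the other server is left out), and the sketch leans on SQLite's integer primary key to hand out a fresh quotation_id. With Oracle you would draw the key from a sequence instead.

```python
import sqlite3
import xml.etree.ElementTree as ET


def import_quotations(conn, xml_text):
    """Insert foreign quotations that we don't already have; return the count."""
    inserted = 0
    for one in ET.fromstring(xml_text).findall("onequote"):
        author = one.findtext("author_name", default="")
        quote = one.findtext("quote", default="")
        category = one.findtext("category", default="")
        already_have = conn.execute(
            "select 1 from quotations where author_name = ? and quote = ?",
            (author, quote),
        ).fetchone()
        if already_have:
            continue
        # The foreign quotation_id is deliberately ignored; omitting the
        # key column lets the database assign a new one.
        conn.execute(
            "insert into quotations (insertion_date, author_name, category, quote) "
            "values (date('now'), ?, ?, ?)",
            (author, category, quote),
        )
        inserted += 1
    conn.commit()
    return inserted


# Hypothetical local database holding one quotation with key 1.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table quotations (quotation_id integer primary key, "
    "insertion_date date, author_name varchar(100), "
    "category varchar(100), quote varchar(4000))"
)
conn.execute(
    "insert into quotations values (1, '2004-01-26', 'Frank Leahy', 'Vanity', "
    "'Egotism is the anesthetic that dulls the pain of stupidity.')"
)
conn.commit()
```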
You are not expected to write an XML parser as part of this exercise. You
will either use a general-purpose XML parser or your TAs will give you a simple program that is capable only of parsing this particular format. If you aren't
getting any help from your TAs and you're using Oracle, keep in mind that the
Oracle RDBMS has extensive built-in support for processing XML. Read the
Oracle documentation, notably the Oracle XML DB Developer's Guide: Oracle
XML DB. If you're using Java or Perl, there are plenty of free open-source
XML parsers available. The Microsoft .NET Framework Class Library contains classes that provide a full set of XML tools.
Exercise 12: Taking Credit
Please go through your source code files. Make sure that there is a header at
the top explaining (1) who wrote the code, (2) on what date it was written,
and (3) what problem it is trying to solve. Please go through your Web pages.
Make sure that at the bottom of each page there is a mailto: link to your permanent email address.
It is your professional obligation to other programmers to take responsibility
for your source code. It is your professional obligation to end-users to take responsibility for their experience with your program.
Database Exercises
We're going to shift gears now into a portion of the problem set designed to
teach you more about the RDBMS and SQL. See your supplement if you're
using an RDBMS other than Oracle.
To facilitate turning in your problem set, keep a text file transcript of
relevant parts of your database session at http://yourhostname.com/basics/db-exercises.txt.
DB Exercise 1: SQL*Loader
• Use a standard text editor to create a plain text file containing five lines, each
line to contain your favorite stock symbol, an integer number of shares
owned, and a date acquired (in the form MM/DD/YYYY). Separate the
fields on each line with tabs.
Depending on how resourceful you are with skimming documentation, this exercise can take fifteen minutes or a lifetime. The book Oracle: The Complete
Reference, discussed in the More section of this chapter, is very helpful. You
can also read about SQL*Loader in the official Oracle docs, linked from http://www.oracle.com/, typically in the Utilities book. Note that finding Oracle
documentation online requires a bit of persistence and oftentimes registration
(free). Look for links that say "view library" and tabs that say "books."
DB Exercise 2: Copying Data from One Table to Another
This exercise exists because we found that, when faced with the task of moving
data from one table to another, programmers were dragging the data across
SQL*Net from Oracle into their Web server, manipulating it in a Web script,
then pushing it back into Oracle over SQL*Net. This is not the way! SQL is a
very powerful language and there is no need to bring in any other tools if what
you want to do is move data around within the RDBMS.
• using only one SQL statement, create a table called stock_prices with
three columns: symbol, quote_date, price. Within this one statement,
fill the table you're creating with one row per symbol in my_stocks. The
date and price columns should be filled with the current date and a nominal
price. Hint: select symbol, sysdate as quote_date, 31.415 as price
from my_stocks;.
• create a new table:
create table newly_acquired_stocks (
    symbol         varchar(20) not null,
    n_shares       integer not null,
    date_acquired  date not null
);
• using a single insert into ... select ... statement (with a WHERE
clause appropriate to your sample data), copy about half the rows from
my_stocks into newly_acquired_stocks
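To see the shape of these statements without an Oracle installation, here is a sketch that runs them against SQLite from Python. It is an illustration under assumptions, not a model answer: the sample rows are made up, the my_stocks column names are inferred from the newly_acquired_stocks definition, and SQLite's date('now') stands in for Oracle's sysdate. The point is that the data never leaves the RDBMS; the Web script only issues statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table my_stocks (symbol varchar(20) not null, "
    "n_shares integer not null, date_acquired date not null)"
)
# Hypothetical sample holdings.
conn.executemany(
    "insert into my_stocks values (?, ?, ?)",
    [
        ("IBM", 50, "2004-01-15"),
        ("MSFT", 200, "2004-02-01"),
        ("ORCL", 300, "2004-03-10"),
        ("AAPL", 25, "2004-04-05"),
    ],
)

# One statement both creates stock_prices and fills it, one row per symbol.
conn.execute(
    "create table stock_prices as "
    "select symbol, date('now') as quote_date, 31.415 as price from my_stocks"
)

conn.execute(
    "create table newly_acquired_stocks (symbol varchar(20) not null, "
    "n_shares integer not null, date_acquired date not null)"
)
# insert ... select copies rows table to table; the WHERE clause picks
# the more recent half of this particular sample data.
conn.execute(
    "insert into newly_acquired_stocks (symbol, n_shares, date_acquired) "
    "select symbol, n_shares, date_acquired from my_stocks "
    "where date_acquired >= '2004-03-01'"
)
conn.commit()
```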
DB Exercise 3: JOIN
With a single SQL statement JOINing my_stocks and stock_prices, produce a report showing symbol, number of shares, price per share, and current
value.
DB Exercise 4: OUTER JOIN
Insert a row into my_stocks. Rerun your query from the previous exercise.
Notice that your new stock does not appear in the report. This is because
you've JOINed them with the constraint that the symbol appear in both tables.
Modify your statement to use an OUTER JOIN instead so that you'll get
a complete report of all your stocks, but won't get price information if none is
available.
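The difference between the two joins is easy to see on a tiny data set. The sketch below uses Python's sqlite3 module with SQLite's LEFT OUTER JOIN syntax and made-up rows. The outer-join notation your Oracle version accepts may differ, so check your documentation. Where no price exists, the outer join fills in NULL, which Python shows as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table my_stocks (symbol varchar(20), n_shares integer)")
conn.execute("create table stock_prices (symbol varchar(20), price number)")
# Hypothetical data: GOOG was just bought and has no price row yet.
conn.executemany(
    "insert into my_stocks values (?, ?)",
    [("IBM", 50), ("MSFT", 200), ("GOOG", 10)],
)
conn.executemany(
    "insert into stock_prices values (?, ?)",
    [("IBM", 90.0), ("MSFT", 25.0)],
)

REPORT = (
    "select s.symbol, s.n_shares, p.price, s.n_shares * p.price as current_value "
    "from my_stocks s {join} stock_prices p on s.symbol = p.symbol "
    "order by s.symbol"
)

# Plain JOIN: only symbols present in both tables.
inner_report = conn.execute(REPORT.format(join="join")).fetchall()
# OUTER JOIN: every row of my_stocks, with NULLs where no price is available.
outer_report = conn.execute(REPORT.format(join="left outer join")).fetchall()
```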
DB Exercise 5: PL/SQL
Inspired by Wall Street's methods for valuing Internet companies, we've developed our own valuation method for this problem set: a stock is valued at the
sum of the ASCII codes of the characters making up its symbol. (Note that students who've
used lowercase letters to represent symbols will have higher-valued portfolios
than those who've used all uppercase symbols; "IBM" is worth only $216
whereas "ibm" is worth $312!)
• define a PL/SQL function that takes a trading symbol as its argument and
returns the stock value. Hint: Oracle's built-in ASCII function will be helpful.
• with a single UPDATE statement, update stock_prices to set each stock's
value to whatever is returned by this PL/SQL function
• define a PL/SQL function that takes no arguments and returns the aggregate
value of the portfolio (n_shares * price for each stock). You'll want to
define your JOIN from DB Exercise 3 (above) as a cursor, and then use the
PL/SQL Cursor FOR LOOP facility. Hint: when you're all done, you can
run this function from SQL*Plus with select portfolio_value() from
dual;.
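Before wrestling with PL/SQL syntax, it can help to pin down the arithmetic. The following Python sketch is only a reference calculation for checking your PL/SQL results, not a substitute for the exercise. The function names mirror the ones suggested above, and the holdings list is a made-up argument that replaces the cursor over your JOIN.

```python
def stock_value(symbol):
    """Sum of the ASCII codes of the characters in a trading symbol."""
    return sum(ord(ch) for ch in symbol)


def portfolio_value(holdings):
    """Aggregate value: n_shares * price, where price is stock_value(symbol).

    holdings is an iterable of (symbol, n_shares) pairs.
    """
    return sum(n_shares * stock_value(symbol) for symbol, n_shares in holdings)


# I = 73, B = 66, M = 77, so "IBM" is worth 73 + 66 + 77 = 216.
# i = 105, b = 98, m = 109, so "ibm" is worth 105 + 98 + 109 = 312.
ibm_upper = stock_value("IBM")
ibm_lower = stock_value("ibm")
```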
SQL*Plus Tip: though it is not part of the SQL language, you will find it very
useful to type "/" after your PL/SQL definitions if you're feeding them to
Oracle via the SQL*Plus application. Unless you write perfect code, you'll
also want to know about the SQL*Plus command show errors. For exposure
to the full range of this kind of obscurantism, see the SQL*Plus User's Guide
and Reference, one of the books included in Oracle's database documentation.
DB Exercise 6: Buy More of the Winners
Rather than taking your profits on the winners, buy more of them!
• use SELECT AVG( ) to figure out the average price of your holdings
• using a single INSERT with SELECT statement, double your holdings in all
the stocks whose price is higher than average (with date_acquired set to
sysdate)
Rerun your query from DB Exercise 4. Note that in some cases you will have
two rows for the same symbol. If what you're really interested in is your current position, you want a report with at most one row per symbol.
• use a select ... group by ... query from my_stocks to produce a report
of symbols and total shares held
• use a select ... group by ... query JOINing with stock_prices to produce a report of symbols and total value held per symbol
• use a select ... group by ... having ... query to produce a report of
symbols, total shares held, and total value held per symbol restricted to symbols in which you have at least two blocks of shares (i.e., the "winners")
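The effect of GROUP BY and HAVING is easiest to see on a handful of rows. This sketch uses Python's sqlite3 module with made-up holdings in which IBM appears twice, as it would after you had doubled up on a winner. It shows one way the first and last reports could be phrased and is not the only correct form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table my_stocks (symbol varchar(20), n_shares integer)")
conn.execute("create table stock_prices (symbol varchar(20), price number)")
# Hypothetical data: two blocks of IBM, one block of MSFT.
conn.executemany(
    "insert into my_stocks values (?, ?)",
    [("IBM", 50), ("IBM", 50), ("MSFT", 200)],
)
conn.executemany(
    "insert into stock_prices values (?, ?)",
    [("IBM", 90.0), ("MSFT", 25.0)],
)

# GROUP BY collapses the blocks into one row per symbol.
shares_report = conn.execute(
    "select symbol, sum(n_shares) from my_stocks "
    "group by symbol order by symbol"
).fetchall()

# HAVING filters whole groups: keep symbols held in at least two blocks.
winners_report = conn.execute(
    "select s.symbol, sum(s.n_shares), sum(s.n_shares * p.price) "
    "from my_stocks s join stock_prices p on s.symbol = p.symbol "
    "group by s.symbol having count(*) >= 2 order by s.symbol"
).fetchall()
```

WHERE filters rows before grouping; HAVING filters groups afterward, which is why the block count belongs in HAVING.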
More
• on HTTP: The Web Consortium's canonical standard at http://www.w3.org/Protocols/
Planning
If you're reading this chapter, we assume that you've completed the Basics
problem set and are going to stay with the course for the rest of the semester.
Welcome. Now it is time to plan your work during the core of the course.
Everyone in this course will be building an online learning community, a site
where users teach each other. The work may be done alone or in groups of two
or three students. Ideally, you or your instructors will find a real client for you,
someone who wants to publish and administer the community on an ongoing
basis. A good client would be a non-profit organization that wants to educate
people about the subject surrounding its mission. A good client would be a
medium-sized company that wants a knowledge-sharing system for employees.
A good client would be a student group at your university. If you can't find a
client, pick something that you're passionate about. It could be Islamic architecture. It could be African Cichlids (a family of freshwater fishes, living mostly
in the rift lakes of East Africa; see www.cichlid.org). It could be cryptography.
Pick something where you think that you can easily get or generate magnet content, some tutorial information that will attract users to your service.
You are building the same type of project as everyone else in the class. Thus
it will be easy for you to compare approaches to, for example, user registration
or content management.
Before you start writing code, however, we'd like you to do some planning and competitive analysis. Fundamentally you need to answer the questions "Who is going to teach what to whom?" and "What alternatives are
currently available for this kind of learning?"
User Classes
Start by dividing your users into classes. Two users should fall into the same
class if you expect them to want substantially the same experience with your
service. It is almost always useful to think about different levels of administrative privileges as you are dividing the users into classes. It is almost never useful
to think about teachers versus learners; the whole point of an online community is that each user is learning some of the time and each user is teaching
some of the time.
Example: User Class Decomposition on photo.net
To give you an idea of what a user class decomposition might look like, we'll
walk through one for the photo.net service.
First, consider the overall objective of photo.net: "A place where a person can
go and get the answer to any question about photography."
Second, consider levels of administrative privilege. There are site-wide administrators, who are free to edit or delete any content on the site. These administrators also have the power to adjust the authority of other users. We have
moderators who have authority to approve or delete postings in particular discussion forums. Finally there are regular users who can read, post, and edit
their own contributions. A less popular service could probably get away with
only two levels of admin privilege.
A different way of dividing the users is by purpose in visiting the service:
• wanna-be point-and-shooter: wants quick advice on what point-and-shoot
camera to buy and where to buy it; wants to invest minimal time, effort, and
money in photography
• novice photographer shopper: wants to begin taking pictures for purposes of
artistic expression, but does not have a camera with flexible controls right now
• novice photographer learner: has the right equipment, but wants ideas for
where, when, and how to use it; wants critiques of finished work
• expert photographer: wants new ideas, to see what is new in the world of
hardware; wants to share expertise; wants community
• wanna-be commercial photographer: might be a high school or college student curious about the future or an older person wanting to change careers;
Planning
A final way of dividing users that may be useful is by how they connect. In
the case of photo.net, it is easy to envision the Web-browser user. This user is
uploading and downloading photos, participating in discussions, reading tutorials, shopping for equipment, and so on. The same person may connect via
a mobile phone, in which case he or she becomes a mobile user. If the mobile
user is in the middle of a photographic project, we want to provide information
about nearby camera shops, processing labs, repair shops, time of sunset, good
locations, and other useful data. If the mobile user is connecting for social purposes, we need to think about what are practical ways for a person on a mobile
phone to participate in an online community. Our engineering challenge is similar for the telephone user.
Usage Scenarios
For each class of user, you should write down a rough idea of what a person in
this class would get from your new service. You may want to hint at page flow.
Example: Novice Photographer Shopper at photo.net
The novice should start by reading a bunch of carefully authored camera-buying advice articles and then reviews of specific cameras. Much of the best
shopping advice is contained in question-and-answer exchanges within the discussion forums, so editors will need a way to pick out and point to the best
threads in the forum archives. After our user has read all of this stuff, it would
be ideal if he or she could be directed into a Q&A forum where "here's what I've
decided to buy; what do you think?" questions are welcomed. That could be
implemented as an explicitly social shopping system with one column for responses from other readers and an adjacent column for bids from camera shops.
Example: Site-Wide Administrator at photo.net
The site-wide administrator should log in and see a page that gives the pulse
of the community with statistics on the number of new users registered, the
quantity of photos uploaded into the photo-sharing system, the activity in
the discussion forums, and the relative efforts of the moderators (volunteers from
the community). If there are unbanned users who have been responsible for an
onerous amount of moderator work in deleting off-topic postings, and so forth,
these should be listed with a summary of their problematic activities and an
option to ban.
Exercise 1a
Answer the following questions:
• What subject will people be able to learn in the community that you're
building?
• What do you want people to say about your service after a visit?
• What are the relevant distinct user classes?
• What should a user on a mobile phone be able to do? Is it productive to
mix voice and text interaction? (See "Multimodal Requirements for Voice
Markup Languages" from the Web Consortium at http://www.w3.org/TR/multimodal-reqs
for some hints as to what will be possible.)
Make sure that your answers to this and all subsequent exercises are Web
accessible. It is a good idea to get into the discipline of ensuring that all documents relevant to your project are available on the project server itself, perhaps
in the /doc/ directory.
people for whom you're building the application concrete, you should build
two or three profile pages. A profile page contains the following information:
(a) a picture of the user, (b) the user's name, age, occupation, marital status,
housing situation, and income, (c) the user's short-term and long-term goals
relevant to the online community that you're building, (d) the immediate questions that this user will bring to the site, (e) the kind of computer equipment
and connection in this person's house, and (f) any other information that will
help to humanize this fictitious person.
To assist you in this task we've created a couple of examples for an online
learning community in the area of general aviation:
• Rachel Lipschitz (http://philip.greenspun.com/seia/examples-planning/user-profile-1)
• Melvin Cohen (http://philip.greenspun.com/seia/examples-planning/user-profile-2)
• Mindy Silverblatt (http://philip.greenspun.com/seia/examples-planning/user-profile-3)
If you don't have a good photo library of your own, lift photos (with credit)
from photo.net and other online sources.
Don't spend more than one hour on this exercise; plenty of truly awful software has
been written with fancy user profiles on the programmers' desks. There is no substitute
for launching a service to real users and then watching their behavior attentively.
Exercise 1c
For each class of user identified in Exercise 1a, produce a textual or graphical
usage scenario for how that user will experience your service.
Figure 3.1
The dotcom boom is over. You ought to have a good reason for building an
information system. If a curmudgeon wants to know why you need all these
fancy computers instead of a book, some chalk, and pencil and paper, it would
be nice to have a convincing answer.
There are good reasons to look at the best elements of offline resources and
systems. After several millennia, many of these systems are exquisitely refined
and extremely effective. Your online community and technology-aided learning environment can be much improved by careful study of the best offline
alternatives.
Example: Popular Photography Magazine
The largest circulation offline publication in the U.S. world of photography is
the sixty-five-year-old magazine Popular Photography. It is extremely effective
at answering the following questions: "What is the price of a Nikon 50/1.4
lens?" "What are the latest cameras available?" "How does the new Canon Elan 7
body perform on a test bench?"
Exercise 2
Write down the best features of offline alternatives for learning the subject matter of the service that you're building. Indicate those features that
you think can be translated into an online community and, if so, how. Write
a three-sentence justification for why your online learning community will
be an improvement over offline alternatives for at least some group of
people.
Exercise 3
Find the best existing online communities in your subject area. Note how
closely they conform to the six elements of sustainability listed above. Also
write down anything strikingly good or bad about the registration process and
the mechanisms of collaboration, for example, in discussion forums, comments
on articles, and chat rooms. Look for voice and mobile interfaces. If present,
try them out. (The "Adding Mobile Users To Your Community" chapter provides a list of desktop browser-based phone emulators so that you won't have
to use your mobile phone; alternatively type "WAP emulator" or "Mobile
browser emulator" into a public search engine.) Look for evidence of personalization and direct controls over preferences.
It is not hard to see why the government of a region becomes less and less manageable with
size. In a population of N persons, there are of the order of N² person-to-person links
needed to keep channels of communication open. Naturally, when N goes beyond a certain
limit, the channels of communication needed for democracy and justice and information
are simply too clogged, and too complex; bureaucracy overwhelms human process. . . .
We believe the limits are reached when the population of a region reaches some 2 to
10 million. Beyond this size, people become remote from the large-scale processes of
government. Our estimate may seem extraordinary in the light of modern history: the
nation-states have grown mightily and their governments hold power over tens of millions,
sometimes hundreds of millions, of people. But these huge powers cannot claim to have a
natural size. They cannot claim to have struck the balance between the needs of towns and
communities, and the needs of the world community as a whole. Indeed, their tendency has
been to override local needs and repress local culture, and at the same time aggrandize
themselves to the point where they are out of reach, their power barely conceivable to the
average citizen.
If it were possible for everyone to pile into a single community and have a great
learning experience, America Online would long ago have subsumed all the
smaller communities on the Internet. One of the later chapters of this book is
devoted to the topic of growing an online community gracefully to a large size.
But, for now, rest assured that it is a hard problem that nobody has solved.
Given sufficiently high quality magnet content and an initial group of people
dedicated to teaching, there will always be room for a new learning community.
Exercise 4
Identify sources of magnet content for your community this semester. If some
of this content is going to come from other people, write to them and ask for
permission. Even if you're only using their work experimentally, one concern
that an author or publisher might have is that your site will get indexed by
search engines and readers will be misdirected to your site instead of theirs. In
practice, this is not a problem if your server isn't accessible from the public
Internet or if you include a robots.txt file that will instruct search engines
to exclude certain content. You may get a friendlier response from copyright
holders if you agree to provide a hyperlinked credit and to ensure that their
content does not become multiply indexed.
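One way to convince yourself (and a nervous copyright holder) that a robots.txt file excludes what you intend is to test it before deploying. The following is a minimal sketch using Python's standard urllib.robotparser module. The /borrowed/ directory, the example.edu host, and the ExampleBot crawler name are assumptions made up for illustration; they do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ask every crawler to stay out of the
# directory that holds content borrowed from other authors.
ROBOTS_TXT = """\
User-agent: *
Disallow: /borrowed/
"""


def crawler_may_fetch(robots_txt, url, agent="ExampleBot"):
    """Return True if a crawler that honors robots.txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.edu/borrowed/essay.html",
                "http://example.edu/forum/"):
        print(url, crawler_may_fetch(ROBOTS_TXT, url))
```

Keep in mind that robots.txt is only a request to well-behaved crawlers; it does not prevent a person or a rude robot from fetching the pages.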
If you have a client who is supplying all the magnet content, write down a
summary of what is going to be available and when. Next to each class of
documents note the person responsible for assembling and delivering them. As
an engineer, it isn't your job to assemble and develop content, but it is your job
to identify risks to a project, such as "not enough magnet content" or "nobody
has thought about magnet content."
that you produce as a work for hire, you won't have a personal toolkit of
software that you can reuse on new projects. If you don't give away any rights,
nobody will be able to run your software, which probably means that you
won't be able to solve social or organizational problems. A good negotiator
gives away things that are valuable to the other side, but that aren't valuable
to his or her side.
During this course, for example, you will ideally want to retain ownership of
all software that you produce. You will therefore be free to reuse the code in
any way, shape, or form. The client, however, is going to be putting in a lot of
time and effort working with you over a period of months and is thus entitled
to some benefit. Your university tuition payments have probably drained away
all of the cash in your bank account and, therefore, you won't be giving the
client money as compensation for his or her time. What you can do is give the
client a license to use your software. This obviously benefits the client, but it
also benefits you. The more people that are out there happily running your
software, the better your professional resume looks.
Should you try to limit what the client can do with your software? Generally,
this isn't worthwhile. Any organization that comes to you for programming assistance is probably not an organization that will want to hang out a shingle
and offer to develop software for others. If they do decide that it would make
sense to adapt your software to another application within the company, it is
very likely that they will call you first to offer a consulting fee in exchange for
your assistance.
How about limiting your liability? Oftentimes software engineers are called
upon to write programs whose failure would have catastrophic results. Suppose
that you are offered $100,000 to write a trading program for an investment
bank. That may seem like a great deal until the bank sues you for $100 million,
alleging that a bug in your program cost them $100 million in lost profits. In
the biomedical field a bug can be much more serious. There is the famous case
of the Therac-25 radiation treatment machine, bugs in whose control software
cost lives (see http://sunnyday.mit.edu/therac-25.html).
Disclaiming liability is difficult, even for trained lawyers, and hence this is
best left to professionals. Nearly every commercial software license includes a
disclaimer of warranty. Here's a snippet from the Microsoft End User License
Agreement (EULA):
19. DISCLAIMER OF WARRANTIES. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, MICROSOFT AND ITS SUPPLIERS PROVIDE
THE SOFTWARE AND SUPPORT SERVICES (IF ANY) AS IS AND WITH ALL
FAULTS, AND HEREBY DISCLAIM ALL OTHER WARRANTIES AND CONDITIONS, WHETHER EXPRESS, IMPLIED, OR STATUTORY, INCLUDING,
BUT NOT LIMITED TO, ANY (IF ANY) IMPLIED WARRANTIES, DUTIES OR
CONDITIONS OF MERCHANTABILITY, OF FITNESS FOR A PARTICULAR
PURPOSE, OF RELIABILITY OR AVAILABILITY, OF ACCURACY OR COMPLETENESS OF RESPONSES, OF RESULTS, OF WORKMANLIKE EFFORT,
OF LACK OF VIRUSES, AND OF LACK OF NEGLIGENCE, ALL WITH REGARD TO THE SOFTWARE, AND THE PROVISION OF OR FAILURE TO PROVIDE SUPPORT OR OTHER SERVICES, INFORMATION, SOFTWARE, AND
RELATED CONTENT THROUGH THE SOFTWARE OR OTHERWISE ARISING
OUT OF THE USE OF THE SOFTWARE. ALSO, THERE IS NO WARRANTY OR
CONDITION OF TITLE, QUIET ENJOYMENT, QUIET POSSESSION, CORRESPONDENCE TO DESCRIPTION, OR NON-INFRINGEMENT WITH REGARD
TO THE SOFTWARE.
20. EXCLUSION OF INCIDENTAL, CONSEQUENTIAL, AND CERTAIN OTHER
DAMAGES. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE
LAW, IN NO EVENT SHALL MICROSOFT OR ITS SUPPLIERS BE LIABLE
FOR ANY SPECIAL, INCIDENTAL, PUNITIVE, INDIRECT, OR CONSEQUENTIAL DAMAGES WHATSOEVER (INCLUDING, BUT NOT LIMITED TO,
DAMAGES FOR LOSS OF PROFITS OR CONFIDENTIAL OR OTHER INFORMATION, FOR BUSINESS INTERRUPTION, FOR PERSONAL INJURY, FOR
LOSS OF PRIVACY, FOR FAILURE TO MEET ANY DUTY INCLUDING OF
GOOD FAITH OR OF REASONABLE CARE, FOR NEGLIGENCE, AND FOR
ANY OTHER PECUNIARY OR OTHER LOSS WHATSOEVER) ARISING OUT
OF OR IN ANY WAY RELATED TO THE USE OF OR INABILITY TO USE
THE SOFTWARE, THE PROVISION OF OR FAILURE TO PROVIDE SUPPORT
OR OTHER SERVICES, INFORMATION, SOFTWARE, AND RELATED CONTENT THROUGH THE SOFTWARE OR OTHERWISE ARISING OUT OF THE
USE OF THE SOFTWARE, OR OTHERWISE UNDER OR IN CONNECTION
WITH ANY PROVISION OF THIS EULA, EVEN IN THE EVENT OF THE
FAULT, TORT (INCLUDING NEGLIGENCE), MISREPRESENTATION, STRICT
LIABILITY, BREACH OF CONTRACT, OR BREACH OF WARRANTY OF
MICROSOFT OR ANY SUPPLIER, AND EVEN IF MICROSOFT OR ANY SUPPLIER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
21. LIMITATION OF LIABILITY AND REMEDIES. NOTWITHSTANDING
ANY DAMAGES THAT YOU MIGHT INCUR FOR ANY REASON WHATSOEVER (INCLUDING, WITHOUT LIMITATION, ALL DAMAGES REFERENCED
HEREIN AND ALL DIRECT OR GENERAL DAMAGES IN CONTRACT OR
ANYTHING ELSE), THE ENTIRE LIABILITY OF MICROSOFT AND ANY OF
ITS SUPPLIERS UNDER ANY PROVISION OF THIS EULA AND YOUR EXCLUSIVE REMEDY HEREUNDER (EXCEPT FOR ANY REMEDY OF REPAIR
OR REPLACEMENT ELECTED BY MICROSOFT WITH RESPECT TO ANY
BREACH OF THE LIMITED WARRANTY) SHALL BE LIMITED TO THE
GREATER OF THE ACTUAL DAMAGES YOU INCUR IN REASONABLE
RELIANCE ON THE SOFTWARE UP TO THE AMOUNT ACTUALLY PAID BY
YOU FOR THE SOFTWARE OR U.S.$5.00. THE FOREGOING LIMITATIONS,
EXCLUSIONS, AND DISCLAIMERS SHALL APPLY TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, EVEN IF ANY REMEDY FAILS
ITS ESSENTIAL PURPOSE.
This is so important to Microsoft that it is the only part of a twelve-page agreement that is printed in boldface and the only part that is presented in a French
translation for Canadian customers as well.
If you don't want to cut and paste Microsoft's verbiage, which might expose
you to a copyright infringement action from Redmond, consider employing a
standard free software or open-source license, of which the GNU General Public License is the best-known example. Note that using a free software license
doesn't mean that your software is now free to the world. You may have
licensed one client under the GNU GPL, but whether or not you decide to offer
anyone else a license is a decision for the future.
If you wish, you can use the sample contract at the end of this book as
a starting point in negotiating rights with your client. And remember that
old bromide of business: "You don't get what you deserve; you get what you
negotiate."
More
• "The case for on-line communities," McKinsey Quarterly, Shona Brown
et al., 2002, Number 1, http://www.mckinseyquarterly.com/article_abstract.asp?ar=1143
To the Instructor
It is helpful during the second meeting of the class to bring clients on campus
to give three-minute presentations pitching their projects. Here is a suggested
outline to the client for the presentation:
1. introduce the speaker and the organization he or she represents (15 seconds)
2. explain who the users are and why they need to interact via an Internet application, i.e., what problem is this online community solving (1.5 minutes)
3. describe how users will be attracted to the site initially, e.g., is there a collection of magnet content that these people need that isn't available anywhere
else? (30 seconds)
4. after the site has been up and running for a few months, what will a typical
interaction look like for a new user? (30 seconds)
5. what will happen after the semester is over; how will the system be funded
and sustained? (15 seconds)
The client should be prepared to answer questions for a minute or two after the
presentation.
Software Structure
Before embarking on a development project it is a good idea to sketch the overall structure of the system to be built.
Gross Anatomy
Any good online learning community will have roughly the same core
structure:
1. user database
2. content database
3. user/content map
4. user/user map
As used above, "database" is an abstract term. The user database, for example,
could be implemented as a set of SQL tables within a relational database management system. The tables for the user database need not be separated in any
way from tables used to implement other modules, that is, they would all be
owned by the same user and reside within the same tablespace. On the other
hand, the user database might be external to the online learning community's
core source of persistence. A common case in which the user database can become external is that of a corporation's knowledge-management system, where
employees are authenticated by checking a central LDAP server.
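To make the four-part anatomy concrete, here is a minimal sketch that creates one table per core structure in a single schema, which is the "not separated in any way" case described above. It uses Python's built-in sqlite3 module so that it runs without a database server. Every table and column name is an assumption chosen for illustration; a production data model would carry far more columns.

```python
import sqlite3

# One table per core structure; all names here are hypothetical.
SCHEMA = """
create table users (
    user_id      integer primary key,
    name         text not null,
    email        text not null unique
);
create table content_items (
    content_id   integer primary key,
    title        text not null
);
-- user/content map: how a particular user relates to a particular item
create table user_content_map (
    user_id      integer not null references users,
    content_id   integer not null references content_items,
    relation     text not null,
    primary key (user_id, content_id, relation)
);
-- user/user map: how one user relates to another
create table user_user_map (
    from_user_id integer not null references users,
    to_user_id   integer not null references users,
    relation     text not null,
    primary key (from_user_id, to_user_id, relation)
);
"""


def build_core_schema():
    """Create the four core structures in a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


if __name__ == "__main__":
    conn = build_core_schema()
    for (name,) in conn.execute(
            "select name from sqlite_master where type = 'table' order by name"):
        print(name)
```

If the user database were external, for example an LDAP server, the users table would shrink to a local key plus whatever identifier the external system hands back.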
A more modern example of how these core databases might become split
up would be in the world of Web services. Microsoft Hailstorm, for example,
offers to provide user database services to the rest of the Internet. A university
might set up complementary communities, one for high school students and
Chapter 4
one for colleagues at other schools, both anchored by the same database of
genomics content. The genomics content database might be running on a
computer that is physically separate from the computer supporting the online
communities and it might advertise its services via WSDL and provide those
services via SOAP.
User Database
At a bare minimum the user database has to record the real name and email
address of the user. Remember that the more identified, authenticated, and accountable people are, the better the opportunity for building a community out
of an aggregate. An environment where anonymous users shout at each other
from behind screen names isn't worth the programming and system administration effort. The user database should have a facility for recording the reliability
of a user's name and email address, since the name is likely to become more reliably known over time and the email address less so.
To contribute to an accountable and identified environment the user database should be able to store a personal URL for each user. If this is a Yahoo!
Geocities page it won't contribute too much to accountability and identification. On the other hand, if the URL starts with http://research.hp.com/personal/
it will give other users some confidence. Since one of the sad features of the Web as architected in 1990 is that URLs rot, a user database needs
an extra field to keep track of what has happened when a robot tries to visit the
recorded URL. If a URL has not been reachable on several separate occasions
over a one-week period, it is probably safe for a computer program to assume
that the URL is out of date and stop displaying it publicly.
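The rule of thumb above can be sketched as a small function over the robot's visit log. This is only one reading of "several separate occasions over a one-week period": we assume "several" means failures on at least three different days, and that any successful visit within the week rescues the URL. Both thresholds and the function name are our assumptions.

```python
from datetime import datetime, timedelta


def url_looks_dead(checks, now, window=timedelta(days=7), min_failed_days=3):
    """Decide whether to stop displaying a user's personal URL.

    checks is a list of (when, reachable) pairs recorded by the robot.
    The URL is presumed dead if, within the window ending at now, no
    visit succeeded and visits failed on at least min_failed_days days.
    """
    recent = [(when, ok) for when, ok in checks if now - window <= when <= now]
    if any(ok for _, ok in recent):
        return False
    failed_days = {when.date() for when, _ in recent}
    return len(failed_days) >= min_failed_days


if __name__ == "__main__":
    now = datetime(2006, 3, 8, 12, 0)
    log = [(datetime(2006, 3, 2, 9, 0), False),
           (datetime(2006, 3, 4, 9, 0), False),
           (datetime(2006, 3, 7, 9, 0), False)]
    print(url_looks_dead(log, now))
```

Note that the function only hides the URL from public display; keeping the stored value lets the robot try again later in case the remote server was merely down for maintenance.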
The user database should record privacy and contact preferences. Is
Jane User willing to let you show her email address to the public? To other
registered users? Is Joe User willing to let you spam him with news of the
site?
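As a rough illustration, the privacy and contact preferences can be stored as a few boolean flags per user and consulted whenever a page is about to display an email address. The flag names below are hypothetical, and defaulting every flag to the most private setting is our design assumption rather than a requirement stated in this chapter.

```python
from dataclasses import dataclass


@dataclass
class ContactPrefs:
    # Hypothetical per-user flags; the most private setting is the default.
    show_email_to_public: bool = False
    show_email_to_members: bool = False
    accepts_site_news: bool = False


def may_show_email(prefs, viewer_is_registered):
    """Return True if this viewer may be shown the user's email address."""
    if prefs.show_email_to_public:
        return True
    return viewer_is_registered and prefs.show_email_to_members


if __name__ == "__main__":
    jane = ContactPrefs(show_email_to_members=True)
    print(may_show_email(jane, viewer_is_registered=False))
    print(may_show_email(jane, viewer_is_registered=True))
```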
Content Database
The content of an online learning community always includes questions
and answers in a discussion forum. A programmer might start by building a
table for discussion forum postings. Of the six required elements of online
community, magnet content is listed first. Most online learning communities
offer published articles that are distinguished from user-contributed questions. A programmer would therefore create a separate table to hold
articles. Any well-crafted site that publishes articles provides a facility for
users to contribute comments on those articles. This will be another separate
table.
Is a pattern emerging here? We distinguish a question in the discussion forum
table because it is an item of content that is not a response to any other discussion forum posting. We distinguish articles from comments because an article is
an item of content that is not a response to any other content item. Perhaps the
representation of articles, comments on articles, questions, answers, and so
forth should be unified to the maximum extent possible. Each is a content
item. Each has one or more authors. Each may optionally be a response to another content item.
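A minimal sketch of this unification, again using sqlite3 and hypothetical names: a single content_items table whose refers_to column is NULL for an article or question and points at the parent item for a comment or answer, plus a separate table so that an item can have more than one author.

```python
import sqlite3


def demo_unified_content():
    """Store articles, comments, questions, and answers in one table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table content_items (
            content_id integer primary key,
            -- null for an article or question; otherwise the item
            -- that this one responds to
            refers_to  integer references content_items,
            body       text not null
        );
        create table content_authors (
            content_id integer not null references content_items,
            user_id    integer not null,
            primary key (content_id, user_id)
        );
    """)
    conn.executemany("insert into content_items values (?, ?, ?)", [
        (1, None, "Article: choosing a first camera"),
        (2, 1, "Comment on the article"),
        (3, None, "Question: which lens for portraits?"),
        (4, 3, "Answer to the question"),
    ])
    # Item 1 has two authors; the others have one each.
    conn.executemany("insert into content_authors values (?, ?)",
                     [(1, 10), (1, 11), (2, 21), (3, 192), (4, 451)])
    top_level = [row[0] for row in conn.execute(
        "select content_id from content_items "
        "where refers_to is null order by content_id")]
    responses_to_3 = [row[0] for row in conn.execute(
        "select content_id from content_items "
        "where refers_to = ? order by content_id", (3,))]
    return top_level, responses_to_3


if __name__ == "__main__":
    print(demo_unified_content())
```

With this representation "article" and "question" stop being table names and become, at most, a type label on a row, so services such as approval or full-text indexing need to be written only once.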
Here are some services that it would be nice to centralize in a single content
repository within the content database:
• versioning of content
• whether an item of content is a reply to, a comment on, or an attachment to
some other item
• whether an item of content has been approved or disapproved by the site
moderators
• to whom may the content be shown? Is it only for members of a group, a particular user, or "le grand public" (as they say in France)?
• who has the right to edit the content?
• who has the right to change who has the right to view or edit?
• who has the right to comment on an item? Who must review comments that
have been posted before they go live?
• timing the content: When does it go live? When does it expire?
• quality or importance of the content: Should this be highlighted to users?
Should it be withheld until it graduates from draft status?
• full-text indexing of the content
• summaries, descriptions, and keywords for the content
• a consistent site-wide taxonomy
plus some things that really belong in the user/content map below:
• who authored or contributed an item, with a distinction among publisher-authored, group-authored, and user-authored stuff
• who should be notified via email when a comment on or response to an item
is posted
• whether a content item is rated of high interest by a user or low/no interest;
given these stats, this implies the ability to pick out new content that is
likely to be of interest to User 17 (depends on text-processing software that
can compute document similarity)
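To give a feel for what "text-processing software that can compute document similarity" might do, here is a deliberately naive sketch: cosine similarity over raw word counts, used to rank new items against items a user has rated as interesting. This is a stand-in technique chosen by us for illustration; it ignores stemming, stop words, and term weighting, and a real system would use a full-text engine.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(doc_a, doc_b):
    """Cosine of the angle between the two documents' word-count vectors."""
    a, b = word_counts(doc_a), word_counts(doc_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_new_items(liked_docs, new_docs):
    """Order new_docs by their best similarity to any liked document."""
    scored = [(max((cosine_similarity(doc, liked) for liked in liked_docs),
                   default=0.0), doc)
              for doc in new_docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


if __name__ == "__main__":
    liked = ["choosing a portrait lens"]
    fresh = ["tax tips for freelancers", "portrait lens reviews"]
    print(rank_new_items(liked, fresh))
```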
User/Content Map
An online learning community generally needs to be able to record the following statements:
• User 21 contributed Comment 37 on Article 529
• User 192 asked Question 512
• User 451 posted Answer 3 to Question 924
• User 1392 has read Article 456
• User 8923 is interested in being alerted when a change is made to Article 223
• User 8923 is interested in being alerted when an answer to Question 9213 is
posted
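The statements above can be recorded as rows of (user, content item, relation). Here is a minimal sketch with sqlite3, under the assumption that comments, questions, answers, and articles all share one space of content IDs, so that "Comment 37" is simply content item 37; the fact that it is a comment on Article 529 would live in the content database, not here. The relation names are our own hypothetical labels.

```python
import sqlite3

# (user_id, content_id, relation); relation names are hypothetical.
STATEMENTS = [
    (21, 37, "contributed"),         # Comment 37
    (192, 512, "contributed"),       # Question 512
    (451, 3, "contributed"),         # Answer 3
    (1392, 456, "read"),             # Article 456
    (8923, 223, "alert_on_change"),  # Article 223
    (8923, 9213, "alert_on_answer"), # Question 9213
]


def build_user_content_map():
    """Load the example statements into an in-memory user/content map."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        create table user_content_map (
            user_id    integer not null,
            content_id integer not null,
            relation   text not null,
            primary key (user_id, content_id, relation)
        )""")
    conn.executemany("insert into user_content_map values (?, ?, ?)",
                     STATEMENTS)
    return conn


def users_with_relation(conn, content_id, relation):
    """Return the users who stand in the given relation to an item."""
    rows = conn.execute(
        "select user_id from user_content_map "
        "where content_id = ? and relation = ? order by user_id",
        (content_id, relation))
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = build_user_content_map()
    print(users_with_relation(conn, 223, "alert_on_change"))
```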
Sam,
I notice that you have four assignments due on Monday and that you
have not even looked at two of them. I hope that you aren't planning
to go to a fraternity party tonight instead of studying.
Very truly yours,
Some SQL Code
The new documents are then presented to Jane ranked by descending score.
If you're an Intel stockholder you'll be pleased to consider the computational
implications of this personalization scheme. Every new document must be
User/User Map
Relationships among users become increasingly important as communities
grow. Someone who is in a discussion forum with 100 others may wish to say
"I am offended by User 45's perspective; I want the system to suppress his contributions in pages served to me or email alerts sent to me." The technical term
for this is "bozo filtration" and it dates back at least to the early 1980s and the
USENET (Netnews) distributed discussion forum system. Someone who is in
a discussion forum with 100,000 others may wish to say "I am overwhelmed;
I never want to see anything more from this forum unless User 67329 has contributed to a thread."
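Both requests amount to filtering the forum through one reader's rows in the user/user map. The sketch below assumes, purely for illustration, that postings arrive as (author_id, text) pairs, that threads are lists of such postings keyed by a thread ID, and that the reader's preferences have already been fetched from the database.

```python
def suppress_bozos(postings, bozo_ids):
    """Drop postings written by anyone this reader has marked as a bozo.

    postings is a list of (author_id, text) pairs.
    """
    return [posting for posting in postings if posting[0] not in bozo_ids]


def threads_featuring(threads, favorite_id):
    """Keep only the threads to which the favorite user has contributed.

    threads maps a thread ID to that thread's list of postings.
    """
    return {thread_id: postings
            for thread_id, postings in threads.items()
            if any(author == favorite_id for author, _ in postings)}


if __name__ == "__main__":
    postings = [(45, "You are all wrong."), (12, "Try a tripod.")]
    print(suppress_bozos(postings, bozo_ids={45}))
    threads = {1: [(12, "Which film?"), (67329, "Try Velvia.")],
               2: [(45, "First!")]}
    print(sorted(threads_featuring(threads, 67329)))
```

In a real service the filtering would more likely be pushed into the SQL query, so that suppressed postings never leave the database.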
Grouping of users is the most fundamental operation within the User/User
database. In a collaborative medical records system, you need to be able to say
"All of these users work at the same hospital and can have access to records
for patients at that hospital." In a corporate knowledge-sharing system, you
need to be able to say "All of these users work in the same department and
therefore should have access to private departmental documents, a private discussion forum devoted to departmental issues, and should receive email notifications of departmental events."
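A minimal sketch of grouping, assuming for illustration that memberships are held as a mapping from group name to a set of user IDs and that each private document or forum names the single group that owns it. The same membership data answers both "who may read this?" and "who should be emailed about this?"

```python
def may_access(user_id, owning_group, memberships):
    """Return True if the user belongs to the group that owns the item.

    memberships maps a group name to the set of user IDs in that group.
    """
    return user_id in memberships.get(owning_group, set())


def users_to_notify(owning_group, memberships):
    """Return, in sorted order, everyone who should hear about a group event."""
    return sorted(memberships.get(owning_group, set()))


if __name__ == "__main__":
    memberships = {"cardiology": {17, 42}, "radiology": {42, 99}}
    print(may_access(17, "cardiology", memberships))
    print(may_access(17, "radiology", memberships))
    print(users_to_notify("radiology", memberships))
```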
Let's move on from the core data model to some tips for the software that
you're soon to be building on top of the database.
There are several minor things wrong with this approach, which mixes SQL
and string literals obtained from the user:
• the programmer must remember to escape any single quote characters in the
uploaded string, replacing ' with '' [these are two single quotes, not one double quote]
• the statement might become too long for some RDBMS SQL parsers to
handle and/or the string literals might exceed limits (Oracle 9.x imposes a
4,000-character limit on string literals) if the user is waxing expansive at the
browser
• repeated invocations of this script will result in the RDBMS being fed versions of this SQL statement that are morphologically the same but differ in
actual text; depending on how the RDBMS is implemented, this might prevent the query plan from being reused
Much more serious, however, is the possibility that a malicious user could craft
a form submission that would result in destruction of data or violation of privacy. For example, consider the following code:
string EventQuery = "select * " +
                    "from events " +
                    "where event_id = " + EventIDfromBrowser;
Suppose that an evil-minded person submits a form with EventIDfromBrowser set to "42; select * from user_passwords". The semicolon near
the beginning of this string could potentially terminate the first SELECT and
the unauthorized select * from user_passwords query might then be executed. If the unauthorized query is well-crafted, the information resulting from
it might be presented in a browser window. Another scary construct would be
"42; delete from customers".
You can solve all of these problems by separating SQL code and variable
data. Here's a pseudo-code example of how it has been done using standard
libraries going back to the late 1970s:
// associate the name "event_query" with a string of SQL
PrepareStatement("event_query","select * from events where event_id = :event_id");
// associate the bind variable :event_id with the particular value for this page
BindVar("event_query",":event_id",3722);
// ask the RDBMS to execute the completed query
ExecuteStatement("event_query");
... fetch results ...
Note that the structure of the SQL seen by the RDBMS is fixed as "select *
from events where event_id = :event_id", regardless of what input is
received in the form. Only the value of :event_id changes.
This is an example of using bind variables, which is standard practice in most
software that talks to an RDBMS.
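For readers who want to try this without installing a database server, here is a small runnable sketch of the same idea using Python's built-in sqlite3 module, which happens to accept the colon style of named bind variable. The events table and its single row are made up for the demonstration. Note what happens to the hostile input: it is treated as a value to compare against event_id, never as SQL, so it matches nothing and deletes nothing.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory database with one hypothetical event."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table events (event_id integer primary key, title text)")
    conn.execute("insert into events values (3722, 'Photo walk')")
    return conn


def fetch_event(conn, event_id_from_browser):
    """Look up an event; the SQL text never changes, only the bound value."""
    cursor = conn.execute(
        "select * from events where event_id = :event_id",
        {"event_id": event_id_from_browser})
    return cursor.fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    print(fetch_event(conn, 3722))
    print(fetch_event(conn, "42; delete from events"))
    print(conn.execute("select count(*) from events").fetchone()[0])
```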
Bind Variables in C#
using System;
using System.Configuration;
using System.Data;
using System.Data.SqlClient;

namespace ExecuteScalar
{
    ///
    /// An example of how to use named parameters in ADO.NET.
    ///
    class Class1
    {
        ///
        /// The main entry point for the application.
        ///
        [STAThread]
        static void Main(string[] args)
        {
            object objResult = null;
            string strResult = null;
            string strEmployeeID = "PMA42628M";

            //Initialize the database connection, command and parameter objects.
            SqlConnection conn = new SqlConnection(
                ConfigurationSettings.AppSettings["connStr"]
            );
            SqlCommand cmd = new SqlCommand(
                "select fname from employee where emp_id = @emp_id"
            );
            SqlParameter param = new SqlParameter("@emp_id", strEmployeeID);

            //Associate the connection with the command.
            cmd.Connection = conn;

            //Bind the parameter value to the command.
            cmd.Parameters.Add(param);

            //Connect to the database and run the command.
            try
            {
                conn.Open();
                objResult = cmd.ExecuteScalar();
            }
            catch (Exception e)
            {
                Console.WriteLine("Database error: {0}", e.ToString());
            }
            finally
            {
                //Clean up.
                if (!conn.State.Equals(ConnectionState.Closed))
                {
                    conn.Close();
                }
            }
        }
    }
}
Not too much to note here except that Microsoft seems to like @emp_id rather
than Oracle's :emp_id, that is, they use the at-sign rather than the colon to indicate that something is a bind variable.
you to be skeptical of vendor claims for the advantages of new languages and
development tools.
Get the Database Username and Password Out of the Page Scripts
Suppose that you have the following code in one of your page scripts:
dbconn = OpenDBConn("sysid=local,username=joestest,password=joerocks");
query, it grabs one of the connections from the pool and uses it until page
service is complete. The connection is then returned to the pool. This scheme
is called connection pooling.
Often a good way to get the database username and password out of page
scripts is to use the Web server's database connection pooling system.
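If you have never looked inside a connection pool, the following toy version shows the moving parts. It is a sketch only: the ConnectionPool class, the COMMUNITY_DB_PATH environment variable, and the use of sqlite3 as a stand-in for a real RDBMS are all our assumptions, and a production Web server's pool also handles timeouts and broken connections. The point to notice is that the page script borrows a connection and never sees a username or password; those would be read once, inside the connect function, from configuration kept outside the page root.

```python
import os
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Open a fixed number of connections up front and lend them out."""

    def __init__(self, size, connect):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self):
        conn = self._idle.get()    # blocks while every connection is lent out
        try:
            yield conn
        finally:
            self._idle.put(conn)   # return it to the pool, even after an error

    def idle_count(self):
        return self._idle.qsize()


def connect_from_environment():
    # Hypothetical configuration: with a real RDBMS this is where the
    # username and password would be read, not in any page script.
    path = os.environ.get("COMMUNITY_DB_PATH", ":memory:")
    return sqlite3.connect(path, check_same_thread=False)


if __name__ == "__main__":
    pool = ConnectionPool(2, connect_from_environment)
    with pool.connection() as conn:
        print(conn.execute("select 1 + 1").fetchone()[0])
    print(pool.idle_count())
```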
As noted in the "Software Structure" chapter, the more identified, authenticated, and accountable people are, the better the opportunity for building a
community out of an aggregate. Thus the user database should record as
much information as possible that might help Person A assess Person B's
credibility.
As you will see in the chapter on scaling, it may become important to facilitate occasional face-to-face meetings among subgroups of users. Thus it will be
helpful to record their country of residence and postal code (what Americans
call "Zoning Improvement Plan code" or "ZIP code").
Notice that the comment about password encryption is placed above, rather than below,
the column name and that the primary key constraint is clearly visible to other programmers. It is good to get into the habit of writing data model files in a text editor
and including comments and examples of the queries that you expect to support. If you
use a desktop application with a graphical user interface to create tables, you're losing a
lot of important design information. Remember that the data model is the most critical
part of your application. You need to think about how you're going to communicate
your design decisions to other programmers.
After a few weeks online, someone says, "Wouldn't it be nice to see the user's
picture and hyperlink through to his or her home page?"
create table users (
    user_id             integer primary key,
    first_names         varchar(50),
    last_name           varchar(50) not null,
    email               varchar(100) not null unique,
    password            varchar(30) not null,
    -- user's personal homepage elsewhere on the Internet
    url                 varchar(200),
    registration_date   timestamp(0),
    -- an optional photo; if Oracle Intermedia Image is installed
    -- use the image datatype instead of BLOB
    portrait            blob
);
The table just keeps getting fatter. As the table gets fatter, more and more columns are likely to be NULL for any given user. With Oracle 9i you're unlikely
to run up against the hard database limit of 1,000 columns per table. Nor is
there a storage efficiency problem. Nearly every database management system
is able to record a NULL value with a single bit, even if the column is defined
char(500) or whatever. Still, something seems unclean about having to add
more and more columns to deal with the possibility of a user having more and
more phone numbers.
Medical informaticians have dealt with this problem for many years. The example above is referred to as a "fat data model." In the hospital world you'll
very likely find something like this for storing patient demographic and insurance coverage data. But for laboratory tests, the fat approach begins to get
ugly. There are thousands of possible tests that a hospital could perform on
a patient. New tests are done every day that a patient is in the hospital. Some
hospitals have experimented with a "skinny" data model for lab tests. The
table looks something like the following:
create table labs (
lab_id
patient_id
test_date
test_name
test_units
test_value
note
);
Note that this table doesn't have a lot of integrity constraints. If you were
to specify patient_id as unique, that would limit each hospital patient to
having only one test done. Nor does it work to specify the combination of
patient_id and test_date as unique because there are fancy machines that
can do multiple tests at the same time on a single blood sample, for example.
We can apply this idea to user registration:
create table users (
user_id
first_names
last_name
email
password
registration_date
);
An example of how such a data model might be filled is shown in figure 5.1.

users table
user_id   first_names   last_name   password
          Wile E.       Coyote      IFUx42bQzgMjE

users_extra_info table
user_info_id  user_id  field_name       field_type  varchar_value  blob_value                    date_value
                       birthdate        date        --             --                            1949-09-17
                       biography        blob_text   --             Created by Chuck Jones . . .  --
                       aim_screen_name  string      iq207          --                            --
                       annual_income    number      35000          --                            --

Figure 5.1 Example of a user record that is split between a skinny table and a second
table.

Note that numbers are stored in a column of type VARCHAR. Won't this preclude
queries such as "Find the average income of a registered user"? Not if
you're using Oracle. Oracle is smart about automatically casting between character strings and numbers. It will work just fine to
select avg(varchar_value)
from users_extra_info
where field_name = 'annual_income'
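The same query can be tried out in a short Python sketch. The assumptions here are loud ones: SQLite stands in for Oracle, the column types are guesses (the text does not give them), and an explicit cast is written out rather than relying on automatic conversion, which varies between database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types below are assumptions made for this sketch.
conn.execute("""
    create table users_extra_info (
        user_info_id  integer primary key,
        user_id       integer not null,
        field_name    varchar(100) not null,
        field_type    varchar(100) not null,
        varchar_value varchar(4000)
    )""")
rows = [
    (1, "annual_income", "number", "35000"),
    (2, "annual_income", "number", "45000"),
    (2, "aim_screen_name", "string", "iq207"),
]
conn.executemany(
    "insert into users_extra_info"
    " (user_id, field_name, field_type, varchar_value) values (?, ?, ?, ?)",
    rows)

# Numbers stored as strings are cast back to numbers before averaging.
average_income = conn.execute("""
    select avg(cast(varchar_value as real))
    from users_extra_info
    where field_name = 'annual_income'""").fetchone()[0]
```

The where clause matters: without it the screen name "iq207" would be swept into the average as well.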
One complication of this kind of data model is that it is tough to use simple
built-in integrity constraints to enforce uniqueness if you're also going to use
the users_extra_info table for many-to-one relations.
For example, it doesn't make sense to have two rows in the info table, both
for the same user ID and both with a field name of "birthdate". A user can
only have one birthday. Maybe we should
create unique index users_extra_info_user_id_field_idx on
users_extra_info (user_id, field_name);
(Note that this will make it really fast to fetch a particular field for a particular
user as well as enforcing the unique constraint.)
If you're using a fancy commercial RDBMS and wish to make queries like this really
fast, check out bitmap indices, often documented under "Data Warehousing". These
are intended for columns of low cardinality, i.e., not too many distinct values compared
to the number of rows in the table. You'd build a bitmap index on the field_name
column.
But what about home_phone? Nothing should prevent a user from getting
two home phone numbers and listing them both. If we try to insert two rows
with the "home_phone" value in the field_name column and 451 in the
user_id column, the RDBMS will abort the transaction due to violation of
the unique constraint defined above.
How to deal with this apparent problem? One way is to decide that the
users_extra_info table will be used only for single-valued properties. Another approach would be to abandon the idea of using the RDBMS to enforce integrity constraints and put logic into the application code to make sure
that a user can have only one birthdate. A complex but complete approach is
to define RDBMS triggers that run a short procedural program inside the
RDBMS; in Oracle this would be a program in the PL/SQL or Java programming languages. This program can check that uniqueness is preserved for fields
that indeed must be unique.
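The trigger approach can be sketched as follows. This is an illustration under stated assumptions, not Oracle PL/SQL: it uses SQLite's trigger syntax from Python, the trigger name and the add_info helper are hypothetical, and the list of single-valued fields contains only birthdate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users_extra_info (
        user_info_id  integer primary key,
        user_id       integer not null,
        field_name    varchar(100) not null,
        varchar_value varchar(4000)
    );

    -- Reject a second row only for fields that must be single-valued.
    create trigger users_extra_info_single_valued
    before insert on users_extra_info
    when new.field_name in ('birthdate')
    begin
        select raise(abort, 'field is single-valued')
        where exists (select 1 from users_extra_info
                      where user_id = new.user_id
                        and field_name = new.field_name);
    end;
""")


def add_info(user_id, field_name, value):
    """Insert one row; return False if the trigger rejects it."""
    try:
        with conn:
            conn.execute(
                "insert into users_extra_info"
                " (user_id, field_name, varchar_value) values (?, ?, ?)",
                (user_id, field_name, value))
        return True
    except sqlite3.DatabaseError:
        return False
```

With this in place, user 451 can list two home_phone rows, while a second birthdate row for the same user is refused.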
Suppose that you were storing all of your application data in a single table:
create table my_data (
    key_id        integer,
    field_name    varchar,
    field_type    varchar,
    field_value   varchar
);
This is an adequate data model in the same sense that a set of raw instructions
for a Turing machine is an adequate programming language. Querying the data
dictionary would be of no help toward understanding the purpose of the application. One would have to sample the contents of the rows of my_data to see
what was being stored. Suppose, by contrast, you were poking around in an
unfamiliar database and encountered this table definition:
create table address_book (
    address_book_id             integer primary key,
    user_id                     not null references users,
    first_names                 varchar(30),
    last_name                   varchar(30),
    email                       varchar(100),
    email2                      varchar(100),
    line1                       varchar(100),
    line2                       varchar(100),
    city                        varchar(100),
    state_province              varchar(20),
    postal_code                 varchar(20),
    country_code                char(2) references country_codes(iso),
    phone_home                  varchar(30),
    phone_work                  varchar(30),
    phone_cell                  varchar(30),
    phone_other                 varchar(30),
    birthdate                   date,
    days_in_advance_to_remind   integer,
    date_last_reminded          date,
    notes                       varchar(4000)
);
Note the use of ISO country codes, constrained by reference to a table of valid codes, to
represent country in the table above. You don't want records with "United States",
"US", "us", "USA", "Umited Stares", etc. These are maintained by the ISO 3166
Maintenance agency, from which you can download the most current data in text format. See http://www.iso.ch/iso/en/prods-services/iso3166ma/index.html.
The author's source code comments have been stripped out, yet it is reasonably
clear that this table exists to support an online address book. Moreover the
purpose of each column can be inferred from its name. Quite a few columns
will be NULL for each address book entry, but not so many that the table
will be absurdly sparse. Because NULL columns take up so little space in the
database, you shouldn't decide between skinny and fat based on presumed data
storage efficiency.
Skinny is good when you are storing wildly disparate data on each user, such
that you'd expect more than 75 percent of columns to be NULL in a fat data
model. Skinny can result in strange-looking SQL queries and data dictionary
opacity.
User Groups
One of the most powerful constructs in an online community is a user group. A
group of users might want to collaborate on publishing some content. A group
of users might want a private discussion forum. A group of users might be the
only people authorized to perform certain actions or view certain files. The
bottom line is that you'll want to be able to refer to groups of users from other
objects in your database.
When building user groups you might want to think about on-the-fly groups.
You definitely want to have a user group where each member is represented
by a row in a table: "user 37 is part of user group 421". With this kind of
data model, people can explicitly join and separate from user groups. It is also
useful, however, to have groups generated on-the-fly from queried properties.
For example, it might be nice to be able to say "this discussion forum is limited
to those users who live in France" without having to install database triggers to
insert rows in a user group map table every time someone registers a French
address. Rather than denormalizing the data, it will be much cleaner to query
for users who live in France every time group membership is needed.
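One way to picture the two kinds of group side by side is the sketch below. It assumes a country_code column on the users table, a hypothetical DYNAMIC_GROUPS registry that maps an on-the-fly group to its membership query, and SQLite as a stand-in database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users (
        user_id      integer primary key,
        email        varchar(100) not null unique,
        country_code char(2)
    );
    create table user_group_map (
        user_id       integer not null references users,
        user_group_id integer not null,
        unique (user_id, user_group_id)
    );
    insert into users values (37, 'a@example.com', 'fr');
    insert into users values (38, 'b@example.com', 'us');
    insert into user_group_map values (38, 421);
""")

# Hypothetical registry of on-the-fly groups: group name -> membership query.
DYNAMIC_GROUPS = {
    "lives_in_france":
        "select 1 from users where user_id = ? and country_code = 'fr'",
}


def is_member(user_id, group):
    """Explicit groups are looked up in the map table; on-the-fly groups
    are answered by running their query each time."""
    if group in DYNAMIC_GROUPS:
        sql, args = DYNAMIC_GROUPS[group], (user_id,)
    else:
        sql = ("select 1 from user_group_map"
               " where user_id = ? and user_group_id = ?")
        args = (user_id, group)
    return conn.execute(sql, args).fetchone() is not None
```

No trigger is needed: if user 37 moves out of France, updating the users row is enough to end the membership.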
To get the data model into First Normal Form, in which there are no multivalued columns, you'd create a mapping table:
create table user_group_map (
    user_id         not null references users,
    user_group_id   not null references user_groups,
    unique(user_id, user_group_id)
);
Note that in Oracle the unique constraint results in the creation of an index. Here it will
be a concatenated index starting with the user_id column. This index will make it fast
to ask the question, "To which groups does User 37 belong?" but will be of no use in
answering the question, "Which users belong to Group 22?"
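The effect of column order in a concatenated index can be observed with a query planner. The sketch below uses SQLite's explain query plan from Python purely as a stand-in; Oracle's optimizer is a different program and may behave differently, and the index name user_group_map_group_idx is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table user_group_map (
        user_id       integer not null,
        user_group_id integer not null,
        unique (user_id, user_group_id)
    )""")


def plan(sql):
    """Return the query planner's description of how it would run sql."""
    rows = conn.execute("explain query plan " + sql).fetchall()
    return " ".join(row[3] for row in rows)


# Leading column of the unique index is user_id, so this can seek directly.
by_user = plan("select user_group_id from user_group_map where user_id = 37")
# Restricting only on the second column forces a walk over every entry.
by_group_before = plan(
    "select user_id from user_group_map where user_group_id = 22")

# A second index that leads with user_group_id serves the other question.
conn.execute("create index user_group_map_group_idx"
             " on user_group_map (user_group_id, user_id)")
by_group_after = plan(
    "select user_id from user_group_map where user_group_id = 22")
```

Printing the three plan strings shows a search for the first and third queries and a full scan for the second.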
Derivable Data
Storing users and groups in three tables seems as though it might be inefficient and ugly. To answer the question "To which groups does Norman
Horowitz belong?" we must JOIN the following tables: users, user_groups,
user_group_map:
select user_groups.group_name
from users, user_groups, user_group_map
where users.first_names = 'Norman' and users.last_name = 'Horowitz'
and users.user_id = user_group_map.user_id
and user_groups.user_group_id = user_group_map.user_group_id;
If this is a popular group, there is a temptation among new database programmers to denormalize the data model by adding a column to the users table,
for example, tanganyikan_group_member_p. This column will be set to "t"
when a user is added to the Tanganyikans group and reset to "f" when a user
unsubscribes from the group. This feels like progress. We can answer our questions by querying one table instead of three. Historically, however, RDBMS
programmers have been bitten badly any time that they stored derivable data,
that is, information in one table that can be derived by querying other, more
fundamental, tables. Inevitably a programmer comes along who is not aware
of the unusual data model and writes application code that updates the information in one place but not another.
Note the use of the _p suffix to denote a boolean column. Oracle does not support a
boolean data type and therefore we simulate it with a CHAR(1) that is restricted to "t"
and "f". The "p" in the suffix stands for "predicate" and is a naming convention that
dates back to Lisp programmers circa 1960.
What if you know that you're going to need this information almost every time
that you query the USERS table?
This results in a virtual table containing all the columns of users plus an additional column called tanganyikan_group_membership that is 1 for users
who are members of the group in question and 0 for users who aren't. In Oracle, if you want the column to bear the standard ANSI boolean data type
values, you can wrap the DECODE function around the query in the select
list:
decode(select count(*) ..., 1, 't', 0, 'f') as tanganyikan_group_membership_p
Notice that we've added an _p suffix to the column name, harking back to
the Lisp programming language in which functions that could return only
boolean values conventionally had names ending in "p".
Keep in mind that data model complexity can always be tamed with views.
Note, however, that views are purely syntactic. If a query is running slowly
when fed directly to the RDBMS, it won't run any faster simply by having
been renamed into a view. Were you to have 10,000 members of a group, each
of whom was requesting one page per second from the group's private area on
your Web site, doing three-way JOINs on every page load would become a
substantial burden on your RDBMS server. Should you fix this by denormalizing, thus speeding up queries by perhaps 5X over a join of indexed tables? No.
Speed it up by 1,000X by caching the results of authorization queries in the
virtual memory of the HTTP server process.
Clean up ugly queries with views. Clean up ugly performance problems with
indices. If you're facing Yahoo! or Amazon levels of usage, look into unloading
the RDBMS altogether with application-level caching.
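Application-level caching of authorization queries might look roughly like the following sketch. Everything named here is an assumption made for illustration: the sixty-second staleness tolerance, the in-process dictionary, and the stats counter that exists only to show how rarely the database is consulted.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table user_group_map (
        user_id       integer not null,
        user_group_id integer not null,
        unique (user_id, user_group_id)
    );
    insert into user_group_map values (541, 90);
""")

CACHE_SECONDS = 60            # assumed tolerance for stale answers
_cache = {}                   # (user_id, group_id) -> (answer, time cached)
stats = {"db_queries": 0}     # how often the RDBMS is actually consulted


def is_member(user_id, group_id, now=None):
    """Answer from process memory when possible; otherwise ask the database."""
    now = time.time() if now is None else now
    key = (user_id, group_id)
    hit = _cache.get(key)
    if hit is not None and now - hit[1] < CACHE_SECONDS:
        return hit[0]
    stats["db_queries"] += 1
    row = conn.execute(
        "select 1 from user_group_map"
        " where user_id = ? and user_group_id = ?", key).fetchone()
    _cache[key] = (row is not None, now)
    return row is not None
```

The price of the speedup is staleness: a user removed from the group may keep access until the cached entry expires, so the expiry time is a policy decision as much as a tuning knob.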
Figure 5.2 A finite-state machine approach to user registration. A reader starts in the
"not a user" state. After filling out a registration form, he progresses to the "Need Email
Verification/Need Admin Approval" state. After responding to an email message from
the server he is moved into the "Need Admin Approval" state. Suppose that on this site
we have a rule that anyone whose email ends in "mit.edu" is automatically approved. In
that case the reader is moved to the "Authorized" state, which is where he will stay unless he decides to leave the service ("Deleted") or is deemed to be an unreasonable burden on moderators ("Banned").
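The states in the caption can be written down as a small transition table. The sketch below is one possible encoding, not the book's implementation: the event names are hypothetical, and only the transitions that the caption mentions are included.

```python
# States named in the figure caption.
NOT_A_USER = "not a user"
NEED_EMAIL_AND_ADMIN = "need email verification / need admin approval"
NEED_ADMIN = "need admin approval"
AUTHORIZED = "authorized"
DELETED = "deleted"
BANNED = "banned"

# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (NOT_A_USER, "submit_registration_form"): NEED_EMAIL_AND_ADMIN,
    (NEED_EMAIL_AND_ADMIN, "verify_email"): NEED_ADMIN,
    (NEED_ADMIN, "admin_approve"): AUTHORIZED,
    (AUTHORIZED, "leave_service"): DELETED,
    (AUTHORIZED, "ban"): BANNED,
}


def next_state(state, event, email=""):
    """Return the state that follows event; refuse undefined transitions."""
    if (state, event) not in TRANSITIONS:
        raise ValueError(f"cannot {event} while {state}")
    new_state = TRANSITIONS[(state, event)]
    # Site rule from the caption: mit.edu addresses skip admin approval.
    if new_state == NEED_ADMIN and email.lower().endswith("mit.edu"):
        new_state = AUTHORIZED
    return new_state
```

Storing the current state in a single column of the users table, and refusing any transition not in the table, keeps a banned user from being quietly re-approved by a stray script.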
time in their session with a site. For example, consider the following flow of
pages on a shopping site:
- choose a book
- enter shipping address
- enter credit card number
- confirm
- thank you
A user who notices a typo in the shipping address on the confirm page should
be able to return to the shipping address entry form with the Back button or
the "click right" menu attached to the Back button, correct the address, and
proceed from there. See the "Choosing between GET and POST" section later
in this chapter.
A second general principle is: Have users pick the object first and then the
verb. For example, consider the customer service area of an e-commerce site.
Assume that Jane Consumer has already identified herself to the server. The
merchant can show Jane a list of all the items that she has ever purchased.
Jane clicks on an item (picking the object) and gets a page with a list of
choices, for example, "return for refund" or "exchange". Jane clicks on "exchange" (picking the verb) and gets a page with instructions on how to schedule a pickup of the unwanted item and pages offering replacement goods.
How original is this principle? It is lifted straight from the Apple Macintosh,
circa 1984, and is explicated clearly in Macintosh Human Interface Guidelines
(Apple Computer, Inc. [Addison-Wesley, 1993]; full text available online at
http://developer.apple.com/documentation/mac/HIGuidelines/HIGuidelines-2.html).
In a Macintosh word processor, for example, you select one word from
the document with a double-click (object). Then from the pull-down menus you
select an action to apply to this word, for example, "put it into italics" (verb).
Originality is valorized in contemporary creative culture, but it was not a value
for medieval authors and it does not help users. The Macintosh was enormously popular to begin with, and its user interface was copied by the developers of Microsoft Windows, which spread the object-then-verb idea to tens of
millions of people. Web publishers can be sure that the vast majority of their
users will be intimately familiar with the "pick the object then the verb" style
of interface. Sticking with a familiar user interface cuts down on user time and
confusion at a site.
These principles are especially easy to apply to user administration pages, for
example. The administrator looks at a list of users and clicks on one to select it.
The server produces a new page with a list of possible actions to apply to that
user.
A GET implies that you are getting information. You can resubmit a
GET any number of times: you are just querying information, not
performing any actions on the back-end.
A POST implies that you are performing some action with side-effect:
inserting a row, updating a row, launching a missile, etc. That's
why when you try to reload a POST page, your browser warns you: are
you sure you want to launch another missile?
In general, you should strive to respect the above principles. Here
are two key examples:
- Searching users or content. That should be a GET.
- Inserting a user or updating a profile. That should be a POST.
Of course, HTML and HTTP have some restrictions that complicate
things:
a) GET forms are limited in length by how much your browser can
send in a URL field. This can be a problem for very complicated
search forms, though probably not an issue at this stage. If you
do hit that limit though, then it's okay to use a POST.
b) POST forms can only be performed by having an HTML button, or by
using JavaScript to submit a form. JavaScript is not ideal.
Thus, sometimes you want to have a link that is effectively an
action with side-effect (e.g. "ban user"), but you make it a
GET.
You can use redirects (HTTP return code 302) to make your life
easier. The nice thing about correct 302s is that the URL that issues
a 302 is never kept in a browser's history, so it is never queried
twice unless the user does something really conscious (like click
back and actively resubmit the form). Specifically:
1) when you POST data for an insert or update, have your script
process the POST, then redirect to a thank-you page. That way,
if the user clicks "reload", they are simply reloading the
thank-you page, which is just a GET and won't cause side-effects
or warnings. You can also redirect to something more meaningful,
perhaps the list of recently registered users once you've edited
one.
2) when you use a GET link to actually perform an action with
side-effect, you can also have that target script perform its action
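The POST-then-redirect advice in item 1 can be sketched as a tiny WSGI application in Python. The URL paths /register and /thank-you, the registered list standing in for the users table, and the form field named email are all assumptions made for the example.

```python
from urllib.parse import parse_qs

registered = []   # stands in for the users table


def app(environ, start_response):
    """Tiny WSGI application: POST /register does the work, then redirects."""
    method, path = environ["REQUEST_METHOD"], environ["PATH_INFO"]
    if method == "POST" and path == "/register":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        form = parse_qs(environ["wsgi.input"].read(length).decode("utf-8"))
        registered.append(form["email"][0])          # the side effect
        # Redirect so that "reload" re-requests the thank-you page,
        # not the POST that inserted the row.
        start_response("302 Found", [("Location", "/thank-you")])
        return [b""]
    if method == "GET" and path == "/thank-you":
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Thank you for registering."]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found."]
```

To try it in a browser, the application can be served with the standard library's wsgiref.simple_server.make_server("", 8000, app).serve_forever(); reloading the thank-you page then repeats only the harmless GET.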
Exercise 3
Build the basic user registration and login pages. Use HTTP cookies to make
the rest of the semester's work easier.
Questions: Can someone sniffing packets learn your user's password? Gain
access to the site under your user's credentials? What happens to a user who
forgets his or her password?
Exercise 4
Build the site administrator's pages for working with users. The site administrator should be able to (1) see recently registered users, (2) look up a particular
user, (3) exclude a user from the site, and (4) see current and historical statistics
on user registration.
Exercise 5
Look at your tables again for referential integrity constraints and query performance. How long will it take to look up a user by email address? What if this
email address is capitalized differently from what you've stored in the database?
Is it possible to have two users with the same email address? (Note that by
Internet standards a lowercase email address or hostname is the same as an
uppercase email address or hostname.)
Many Web applications contain content that can be viewed only by members of a specific user group. With your data model, how many table rows
will the RDBMS have to examine to answer the question "Is User 541 a
member of Group 90?" If the answer is "every row in a big table", that is,
a sequential scan, what kind of index could you add to speed up the
query?
More
- SQL for Web Nerds, data modeling chapter, at http://philip.greenspun.com/sql/data-modeling
- for a discussion of indices, see SQL for Web Nerds, tuning chapter, at
http://philip.greenspun.com/sql/tuning
- Normal forms: Chapter 4 of Steve Roman, Access Database Design &
Programming (O'Reilly, 1999), available online at
http://www.oreilly.com/catalog/accessdata2/chapter/ch04.html and chapter 1 of Kevin Kline et al.,
Transact-SQL Programming (O'Reilly, 1999), available online at
http://www.oreilly.com/catalog/wintrnssql/chapter/ch01.html
- "Reverse Engineering a Data Model" by Eve Andersson at
http://eveandersson.com/writing/data-model-reverse-engineering is useful for understanding how to work with the Oracle Data Dictionary.
Content Management
There are two fundamental elements to content management: (1) storing stuff
in a content repository, and (2) supporting the workflow of a group of people
engaged in putting stuff into that repository. This chapter will treat the storage
problem first and then the workflow support problem. We'll also look at version control for both content and software, at look and feel design for individual pages, and at navigation design and information architecture.
Part of the art of content management for an online learning community is
reducing the number of types of content. For example, consider a community
where the publisher says, "I want articles [magnet content], comments from
users on articles, news from the publisher, comments on news from users,
questions from users, and answers to questions." A naive implementation
from these specifications would result in the creation of six database tables:
articles, comments_on_articles, news, comments_on_news, questions, answers. From the RDBMS's perspective, there is nothing overwhelming
about six tables. But consider that every new table defined in the
RDBMS implies roughly twenty Web scripts. Ten of these scripts will constitute a user experience: view a directory of content in Table A, view one category, view one item, view the newest items, grab a form to insert an item,
confirm insertion, request an e-mail alert of comments on an item. Ten of these
scripts will constitute an administrator's experience: view a directory of content
in Table A, view one category, view one item, view the newest items, approve
an item, disapprove an item, delete an item, confirm deletion of an item, and so
on. It will be a bit tough to code these twenty scripts in a general fashion because the SQL statements will differ in at least the table names used.
Consider further that to offer a complete index of site content, you'll have to
write a program that pulls text from at least six tables into a single index.
Figure 6.1
How different are these six kinds of content, really? We'll look at the tables
that we need to define for storing articles, then proceed to the other types of
content.
    -- could be text/html or text/plain or some sort of XML document
    mime_type           varchar(100) not null,
    -- will hold the title in most cases
    one_line_summary    varchar(200) not null,
    -- the entire article; 4 GB limit
    body                clob
);
Should all articles in the database be shown to all users? Perhaps it would be
nice to have the ability to store an article and hold it for editorial examination:
create table articles (
    article_id          integer primary key,
    creation_user       not null references users,
    creation_date       date not null,
    language            char(2) references language_codes,
    mime_type           varchar(100) not null,
    one_line_summary    varchar(200) not null,
    body                clob,
    editorial_status    varchar(30)
        check (editorial_status in
               ('submitted','rejected','approved','expired'))
);
If you change your mind about how to represent approval status, you won't
need to update dozens of Web scripts; you need only change the definition of
the articles_approved view. (See the views chapter of SQL for Web Nerds
at http://philip.greenspun.com/sql/views for more on this idea of using SQL
views as a means of programming abstraction.)
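A minimal sketch of a view used as an abstraction layer follows. It assumes a base table named articles_raw with a cut-down set of columns, and it uses SQLite from Python in place of Oracle; only the view name articles_approved comes from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table articles_raw (
        article_id       integer primary key,
        one_line_summary varchar(200) not null,
        editorial_status varchar(30)
            check (editorial_status in
                   ('submitted','rejected','approved','expired'))
    );

    -- The one place where "approved" is defined.
    create view articles_approved as
        select * from articles_raw where editorial_status = 'approved';

    insert into articles_raw values (1, 'Aquarium filters', 'approved');
    insert into articles_raw values (2, 'Draft on pumps', 'submitted');
""")


def approved_summaries():
    # Page scripts name only the view, never the approval rule itself.
    rows = conn.execute(
        "select one_line_summary from articles_approved order by article_id")
    return [row[0] for row in rows]
```

If approval were later recorded in a separate table of editorial decisions, only the create view statement would change; approved_summaries and every script like it would be left alone.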
Comments on Articles
Recall the six required elements of online community:
1. magnet content authored by experts
2. means of collaboration
3. powerful facilities for browsing and searching both magnet content and contributed content
4. means of delegation of moderation
5. means of identifying members who are imposing an undue burden on the
community and ways of changing their behavior and/or excluding them
from the community without them realizing it
6. means of software extension by community members themselves
A facility that lets a user post an alternative perspective to a published article is
a means of collaboration that distinguishes a one-way publishing site from an
online community. More interestingly, the facility lifts the Internet application
out of the constraints of the literate culture within which Western culture has
operated ever since Gutenberg (1452). A literate culture produces such works
as the Michelin Green Guide to Italy: "Extending below the town is the park
of the 16th-century Villa Orsini (Parco dei Mostri) which is a Mannerist creation with a series of fantastically shaped sculptures" (Michelin Travel Publications, 2003). Compare that description to the images in figure 6.2 showing just
a tiny portion of the Parco dei Mostri ("Park of Monsters"). If a friend of
yours came back from this place and showed these slides, you'd expect to hear
something much richer and more interesting than the Michelin Guide's sentence. A literate culture operates with the implicit assumption that knowledge
is closed, that Italian tourism can fit into a book. Perhaps the 350 pages of the
Green Guide aren't enough, but some quantity of writers and pages would suffice to encapsulate everything worth knowing about Italy.
Comments are often the most interesting material on a site. Here's one from
http://philip.greenspun.com/humor/bill-gates:
I must say, that all of you who do not recognize the absolute genius of Bill Gates are
stupid. You say that bill gates stole this operating system. Hmm.. i find this interesting. If
he stole it from steve jobs, why hasn't Mr. Jobs relentlessly sued him and such. Because
Mr. Jobs has no basis to support this. Macintosh operates NOTHING like Windows 3.1
or Win 95/NT/98. Now for the mac dissing. Macs are good for 1 thing. Graphics. That's
all. Anything else a mac sucks at. You look in all the elementary schools of america.. You
won't see a PC. You'll see a mac. Why? Because Macs are only used by people with undeveloped brains.
Allen ([email protected]), August 10, 1998
Oral cultures do not share this belief. Knowledge is open ended. People may
hold differing opinions without one person being wrong. There is not necessarily one truth; there may be many truths. Though he didn't grow up in an oral
culture, Shakespeare knew this. Watch Troilus and Cressida and its five perspectives on the nature of a woman's love and try to figure out which perspective Shakespeare thinks is correct.
Feminists, chauvinists, warmongers, pacifists, Jew-haters, inclusivists, cautious people, heedless people, misers, doctors, medical malpractice lawyers,
atheists, and the pious are all able to quote Shakespeare in support of their
beliefs. That's because Shakespeare uses the multiple characters in each of his
plays to show his culture's multiple truths.
In the 400 years since Shakespeare we've become much more literate. There
is usually one dominant truth. Sometimes this is because we've truly figured
something out. It is tough to argue that a physics textbook on Newtonian mechanics should be an open-ended discussion (though a user comment facility
might still be very useful in providing clarifying explanations for confusing sections). Yet even in the natural sciences, one can find many examples in which
the culture of literacy distorts discourse.
Academic journals of taxonomic botany reveal disagreement on whether
Specimen 947 collected from a particular field in Montana is a member of
species X or species Y. But the journals imply agreement on the taxonomy,
that is, on how to build a categorization tree for the various species. If you
were to eavesdrop on a cocktail party in a university's department of botany,
you'd discover that even this agreement is illusory. There is widespread disagreement on what constitutes the correct taxonomy. Hardly anyone believes
that the taxonomy used in journals is correct, but botanists have to stick with it
for publication because otherwise older journal articles would be rendered
incomprehensible. Taxonomic botany based on an oral culture or a computer
system capable of showing multiple views would look completely different.
Figure 6.2
The Internet and computers, used competently and creatively, make it much
easier and cheaper to collect and present multiple truths than in the old world
of print, telephone, and snail mail. Multiple-truth Web sites are much more interesting than single-truth Web sites and, per unit of effort and money invested,
much more effective at educating users.
Implementing Comments
Comments on articles will be represented in a separate table:
create table comments_on_articles_raw (
    comment_id          integer primary key,
    -- on what article is this a comment?
    refers_to           not null references articles,
    creation_user       not null references users,
    creation_date       date not null,
    language            char(2) references language_codes,
    mime_type           varchar(100) not null,
    one_line_summary    varchar(200) not null,
    body                clob,
    editorial_status    varchar(30)
        check (editorial_status in
               ('submitted','rejected','approved','expired'))
);
create view comments_on_articles_approved
as
select *
from comments_on_articles_raw
where editorial_status = 'approved';
This table differs from the articles table only in a single column: refers_to.
How about combining the two:
create table content_raw (
    content_id          integer primary key,
    -- if not NULL, this row represents a comment
    refers_to           references content_raw,
    -- who contributed this and when
    creation_user       not null references users,
    creation_date       date not null,
    -- what language is this in?
    -- visit http://www.w3.org/International/O-charset-lang
    -- to see the allowable 2-character codes (en is English, ja is Japanese)
    language            char(2) references language_codes,
    -- could be text/html or text/plain or some sort of XML document
    mime_type           varchar(100) not null,
    one_line_summary    varchar(200) not null,
    -- the entire article; 4 GB limit
    body                clob,
    editorial_status    varchar(30)
        check (editorial_status in
               ('submitted','rejected','approved','expired'))
);
-- if we want to be able to write some scripts without having to think
-- about the fact that different content types are merged
create view articles_approved
as
select *
from content_raw
where refers_to is null
and editorial_status = 'approved';
create view comments_on_articles_approved
as
select *
from content_raw
where refers_to is not null
and editorial_status = 'approved';
-- let's build a single full-text index on both articles and comments
-- using Oracle Intermedia Text (formerly known as "Context")
create index content_ctx on content_raw (body)
indextype is ctxsys.context;
create table content_raw (
    content_id          integer primary key,
    refers_to           references content_raw,
    creation_user       not null references users,
    creation_date       date not null,
    -- NULL means "immediate"
    release_time        date,
    -- NULL means "never expires"
    expiration_time     date,
    language            char(2) references language_codes,
    mime_type           varchar(100) not null,
    one_line_summary    varchar(200) not null,
    body                clob,
    editorial_status    varchar(30)
        check (editorial_status in
               ('submitted','rejected','approved','expired'))
);
How do we find news stories among all the content rows? What distinguishes a
news story with a scheduled release time and expiration date from an article on
the Windows 2003 operating system with a scheduled release time and expiration date? We'll need one more column:
create table content_raw (
    content_id          integer primary key,
    content_type        varchar(100) not null,
    refers_to           references content_raw,
    creation_user       not null references users,
    creation_date       date not null,
    release_time        date,
    expiration_time     date,
    language            char(2) references language_codes,
    mime_type           varchar(100) not null,
    one_line_summary    varchar(200) not null,
    body                clob,
    editorial_status    varchar(30)
        check (editorial_status in
               ('submitted','rejected','approved','expired'))
);
create view news_current_and_approved
as
select *
from content_raw
where content_type = 'news'
and (release_time is null or sysdate >= release_time)
and (expiration_time is null or sysdate <= expiration_time)
and editorial_status = 'approved';
Notice the explicit checks for NULL in the view definition above. You'd think that something simpler such as
    and sysdate between release_time and expiration_time
would work. The problem here is SQL's three-valued logic. For the RDBMS to return a row, all of the AND clauses must return true. NULL is not true. Any comparison or calculation including a NULL evaluates to NULL. Thus
    where sysdate >= release_time
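A small sketch makes the three-valued logic concrete. It uses Python's built-in sqlite3 module as a stand-in for Oracle, with date('now') playing the role of sysdate and dates stored as ISO-format text; the cut-down table and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Cut-down, hypothetical version of content_raw: only the columns the view tests.
conn.execute("""
    create table content_raw (
        content_id      integer primary key,
        release_time    text,   -- NULL means "immediate"
        expiration_time text    -- NULL means "never expires"
    )""")
conn.executemany("insert into content_raw values (?, ?, ?)", [
    (1, None, None),                  # live now, never expires
    (2, "2000-01-01", None),          # released long ago, never expires
    (3, "2000-01-01", "2000-12-31"),  # expired
    (4, "2000-01-01", "9999-12-31"),  # both bounds given, currently live
])

# A comparison against NULL is neither true nor false: it is NULL.
null_comparison = conn.execute("select date('now') >= null").fetchone()[0]

# The "simpler" WHERE clause silently drops every row with a NULL bound.
naive = [row[0] for row in conn.execute(
    "select content_id from content_raw "
    "where date('now') between release_time and expiration_time "
    "order by content_id")]

# The explicit NULL checks used in news_current_and_approved keep them.
explicit = [row[0] for row in conn.execute(
    "select content_id from content_raw "
    "where (release_time is null or date('now') >= release_time) "
    "and (expiration_time is null or date('now') <= expiration_time) "
    "order by content_id")]

print(null_comparison, naive, explicit)
```

Rows 1 and 2 are exactly the "immediate" and "never expires" cases that the NULL convention was meant to support, and they are the ones the shorter clause loses.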
One good thing about the file system is that there are a lot of tools for users with different levels of skill to add, update, remove, and rename files. Programmers can use text editors. Designers can use Web design tools and FTP the results. Page authors can use HTML editors such as Microsoft FrontPage.
One bad thing about giving many people access to the file system is the potential for chaos. A designer is supposed to upload a template, but ends up removing a script by mistake. Now users can't log into the site anymore. The standard Windows and Unix file systems aren't versioned. It isn't possible to go back and ask "What did this file look like six months ago?" The file system does not by itself support any workflow (see below). You authorize someone to modify a file or not. You can't say "User 37 is authorized to update this article on aquarium filters, but the members shouldn't see that update until it is approved by an editor."
The deepest problem with using the file system as a cornerstone of your content management system is that files are outside of the database. You will need to store a lot of references to content in the database, for example, "User 960 is the author of Article 231," "Comment 912 is a comment on Article 529," and so on. It is very difficult to keep a set of consistent references to things outside the RDBMS. Suppose that your RDBMS tables are referring to file system files by file name. Someone renames a file. The database doesn't know. The database's referential integrity constraint mechanisms cannot be invoked to protect against this circumstance. It is much easier to keep a set of data structures consistent if they are all within the RDBMS.
Static .html files also have the problem of being, well, static. Suppose that you want a standard header and footer on every page. You can cut and paste these into every .html file on the system. But what if you want to change "Copyright 2003" to "Copyright 2006" in the site-wide footer? You may have to update thousands of files. Suppose that you want the header to include a "Login" link if the request comes in with no user authorization cookie and a "Logout" link if the request comes in from a registered user.
Some of the problems with publisher maintenance of static .html files can be solved by periodically writing and running clever Perl scripts. Deeper problems with the user experience remain, however. First and foremost is the fact that with a static .html file every person who views the page thinks that he or she might be the only person ever to have viewed the page. This makes for a very lonely Internet experience and, generally speaking, not a very profitable one for the publisher.
A sustainable online business will typically offer some sort of online community interaction anchored by its content and will offer a consistently personalized user experience. These requirements entail some sort of computer program executing on every page load. So you might as well take this to its logical conclusion and build every URL in your application the same way: a script in the file system executes and pulls content from the RDBMS.
Exercise 1
Develop a data model for the content that you'll be storing on your site. Note that at a bare minimum your content repository needs to be capable of handling a discussion forum since we'll be building that in a later chapter.
You might find that, in making the data model precise with SQL table definitions, questions for the client arise. You realize that your earlier discussions with the client were too vague in some areas. This is a natural consequence of building a SQL data model. Pick up the phone and call your client to get clarifications. Email with several alternative concrete scenarios. Get your client accustomed to fielding questions in a timely manner.
Show the draft data model to your teaching assistant and discuss with other students before proceeding.
Fortunately for companies and programmers that hope to make a nice living from providing content management solutions, the preceding conditions seldom obtain at better-financed Web sites. What is more typical are the following conditions:
The publisher decides what major content sections are available, when a content section goes live, and the relative prominence to be assigned each content section. The information designer decides what navigational links are available from every document on the page, how to present the available content sections, and what graphic design elements are required. The graphic designer contributes drawings, logos, and other artwork in service of the information designer's objectives. The graphic designer also produces mock-up templates (static HTML files) in which these artwork elements are used. The programmer builds production templates and computer programs that reflect the instructions of publisher, information designer, and graphic designer. Editors approve content and decide when specific pages go live. Editors assign relative prominence among pages within sections. In keeping with their relative financial compensation, we consider the needs and contributions of authors second to last. Authors stuff fragments of HTML, plain text, photographs, music, and sound into the database. These authored entities will be viewed by users only through the templates developed by the programmers.
Below is an example workflow that we used to assign to students at MIT:
7. log out
8. log in as programmer and visit /cm
9. make two templates for the movie section, one called movie_review and one called actor_profile; make one template for the dining section called restaurant_review
10. log out
11. log in as author and visit /cm
12. add two movie reviews and two actor profiles to the movies section and a review of your favorite restaurant to the dining section
13. log out
14. log in as editor and visit /cm
15. approve two of the movie reviews, one of the actor profiles, and the restaurant review
16. log out
17. without logging in (i.e., you're just a regular public Web surfer now), visit the /movies section and, ideally, you should see that the approved content has gone live
18. follow a hyperlink from a movie review to the dining section and note that you can find your restaurant review
19. log in as author and visit /cm
20. edit the restaurant review to reflect a new and exciting dessert
21. log out
22. visit the /dining section and note that the old (approved) version of the restaurant review is still live
23. log in as editor and visit /cm and approve the edited restaurant review
24. log out
25. visit the /dining section and check that the new (with dessert) version of the restaurant review is being served
Among the 300,000 people who visit photo.net every month, surely there are people capable of writing each of the preceding articles. We want a system where
1. Joe User can transactionally sign up to write "Platinum prints," thus marking the article "assignment requested pending editorial approval," supply a brief outline, and commit to completing a draft by July 1.
2. Jane Editor can approve the outline and schedule, thus generating an email alert back to Joe.
3. Joe User gets periodic email reminders of what he has signed up to do and by when.
4. Jane Editor is alerted when Joe's first draft is submitted on July 17 (Joe is unlikely to be the first author in the history of the world to submit work on time).
5. Joe User gets an email alert asking him to review Jane's corrected version and sign off his approval.
6. The platinum printing article shows up at the top of Jane Editor's workspace page as "signed off by author" and she clicks to push it live.
Notice the intricacies of the workflow and also the idiosyncrasies. The New York Times and the Boston Globe put out very similar-looking products. They are owned by the same corporation. What do you think the chances are that software that supports one newspaper's workflow will be adequate to support the other's?
Exercise 2
Lay out the workflow for each content item that will be user visible in your online learning community. For each workflow step, specify (1) who needs to give approval, (2) what e-mail alerts are generated, (3) what happens if approval is given, and (4) what happens if approval is denied.
Tip: we recommend modeling workflow as a finite-state machine in which a content item can be in only one state at a time and that single state tells you everything that you need to know about the item. In other words, your software can take action without ever needing to go back and look to see what states the article was in previously.
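The tip above can be sketched in a few lines of Python. The state names reuse the editorial_status values from the data model; the event names and the particular transitions allowed are hypothetical and would come from your own answers to the exercise.

```python
# Hypothetical workflow: (current state, event) -> next state.
# States reuse the editorial_status values; events are invented for illustration.
TRANSITIONS = {
    ("submitted", "approve"): "approved",
    ("submitted", "reject"): "rejected",
    ("rejected", "resubmit"): "submitted",
    ("approved", "expire"): "expired",
}


def next_state(state, event):
    """Return the state that follows `state` when `event` occurs.

    Only the current state is consulted; no history of earlier states is needed.
    """
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed in state {state!r}")


def allowed_events(state):
    """List the events that may legally occur in `state`, e.g. to decide which buttons to show."""
    return sorted(event for (s, event) in TRANSITIONS if s == state)


status = "submitted"
status = next_state(status, "reject")
status = next_state(status, "resubmit")
status = next_state(status, "approve")
print(status, allowed_events(status))
```

Keeping the table of transitions as data rather than scattering it through page scripts means that a change in the publisher's workflow is a change to one dictionary (or, in a real system, one database table).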
Unfortunately, Version C (the typo fix) is what future users will see; all of Shoshana's work was wasted.
Programmers and technical writers at large companies are familiar with the problem of lost updates when multiple people are editing the same document. File-system-based version control systems were developed to help coordinate multiple contributors. These systems include Walter Tichy's Revision Control System (RCS; early 1980s), Dick Grune and Brian Berliner's Concurrent Versions System (CVS; 1986), and Marc Rochkind's Source Code Control System (SCCS; 1972). These systems require more training than is practical for casual users. For example, RCS mandates explicit check-out and check-in. While a file is checked out by User A it is locked and nobody but User A can check it back in. Suppose that User A goes out to lunch, but there is some important news that absolutely must be put on the site. What if User A leaves for a two-week vacation and forgets to check a bunch of files back in? These problems can be worked around manually, but it becomes a challenge when the collaborators are on opposite sides of the globe and cannot see "Oh, Schlomo's coat is still on the back of his chair so he's not yet left for the day."
For distributed authorship of Web content by geographically distributed casually connected users, the most practical system turns out to be one in which
check-in is allowed at any time by any authorized person. However, all versions
of every document are kept in the database so that one can always revert to an
earlier version or pull a section out of an earlier version. This implies that your
content management system will have an audit trail: a record of past values
held by row-column intersections in a database table, who was responsible for
any changes in those values, and when the values were changed.
There are two classical ways to implement an audit trail in an RDBMS. The first is to set up separate audit tables, one for each production table. Every time an update is made to a production table, the old row is written out to an audit table, with a time stamp. This can be accomplished transparently via RDBMS triggers, which are described in the "Triggers" chapter of SQL for Web Nerds at http://philip.greenspun.com/sql/triggers and demonstrated in practice in an open-source audit trail package documented at http://philip.greenspun.com/seia/examples-content-management/audit-acs-doc. The second classical approach is to keep current and archived information in the same table. This is more expensive in terms of computing resources required because the information that you want for the live site is interspersed with seldom-retrieved archived information. But it is easier if you want to program in the capability to show the site as it was on a particular day. Your templates won't have to query a different table; they will merely need a different WHERE clause.
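The first approach, a separate audit table filled by a trigger, might look roughly like the sketch below. It runs against SQLite through Python's sqlite3 module rather than Oracle, so the trigger syntax differs from what you would write in PL/SQL, and the articles table and its columns are invented for illustration; this is not the audit package referenced above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical production table
    create table articles (
        article_id          integer primary key,
        body                text,
        last_modifying_user integer
    );

    -- one audit table per production table: same columns plus a time stamp
    create table articles_audit (
        article_id          integer,
        body                text,
        last_modifying_user integer,
        audit_time          text
    );

    -- on every update, write the OLD row out to the audit table
    create trigger articles_audit_tr
    after update on articles
    begin
        insert into articles_audit
            (article_id, body, last_modifying_user, audit_time)
        values
            (old.article_id, old.body, old.last_modifying_user, datetime('now'));
    end;
""")

conn.execute("insert into articles values (1, 'first draft', 37)")
conn.execute(
    "update articles set body = 'second draft', last_modifying_user = 960 "
    "where article_id = 1")

current_body = conn.execute(
    "select body from articles where article_id = 1").fetchone()[0]
audit_rows = conn.execute(
    "select article_id, body, last_modifying_user from articles_audit").fetchall()
print(current_body, audit_rows)
```

The application code that performs the update never mentions the audit table, which is what "transparently" means here.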
We're not really interested in the largest ZIP code for a particular content item version. In fact, unless there has been some kind of mistake in our application code, we assume that all ZIP codes for multiple versions of the same content item are the same. However, GROUP BY is a mechanism for collapsing information from multiple rows. The SELECT list can contain column names only for those columns that are being GROUPed BY. Anything else in the SELECT list must be the result of aggregating the multiple values for columns that aren't GROUPed. The choices with most RDBMSes are pretty limited: MAX, MIN, AVG, SUM. There is no "pick any" function. So we use MAX.
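As a rough illustration of the point about GROUP BY, here is a sketch against a cut-down, hypothetical version of the one-table model, with a composite key of content_id and version_number and the ZIP code repeated on every version row. It uses Python's sqlite3 module in place of Oracle, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table content_raw (
        content_id     integer,
        version_number integer,
        zip_code       varchar(5),
        body           text,
        primary key (content_id, version_number)
    )""")
conn.executemany("insert into content_raw values (?, ?, ?, ?)", [
    (1, 1, "02139", "first version"),
    (1, 2, "02139", "second version"),   # same ZIP code, stored again
    (2, 1, "20016", "only version"),
])

# One row per content item. zip_code is not in the GROUP BY, so it has to be
# wrapped in an aggregate; MAX is used purely as a way to pick one value.
rows = conn.execute("""
    select content_id, max(zip_code) as zip_code, count(*) as n_versions
    from content_raw
    group by content_id
    order by content_id""").fetchall()
print(rows)
```

The MAX is harmless only as long as every version of an item really does carry the same ZIP code, which is exactly the assumption the data model cannot enforce.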
Updates are similarly problematic. The U.S. Postal Service periodically redraws the ZIP code maps. Updating one piece of information, for example, "20016" to "20816," will touch more than one row per content item.
This data model is in First Normal Form. Every value is available at the intersection of a table name, column name, and key (the composite primary key of content_id and version_number). However, it is not in Second Normal Form, which is why our queries and updates appear strange.
In Second Normal Form, all columns are functionally dependent on the whole key. Less formally, a Second Normal Form table is one that is in First Normal Form with a key that determines all non-key column values. Even less formally, a Second Normal Form table contains statements about only one kind of thing.
Our current content_raw table contains some information that depends on the whole key of content_id and version_number, for example, the body and the language code. But much of the information depends only on the content_id portion of the key: author, creation time, release time, ZIP code.
When we need to store statements about two different kinds of things, it makes sense to create two different tables, that is, to use Second Normal Form:
-- stuff about an item that doesn't change from version to version
create table content_raw (
    content_id          integer primary key,
    content_type        varchar(100) not null,
    refers_to           references content_raw,
    creation_user       not null references users,
    creation_date       date not null,
    release_time        date,
    expiration_time     date,
    mime_type           varchar(100) not null,
    zip_code            varchar(5)
);

-- stuff about a version of an item
create table content_versions (
    version_id          integer primary key,
    content_id          not null references content_raw,
    version_date        date not null,
    language            char(2) references language_codes,
    one_line_summary    varchar(200) not null,
    body                blob,
    editorial_status    varchar(30)
        check (editorial_status in ('submitted','rejected','approved','expired')),
    -- audit the person who made the last change to editorial status
    editor_id           references users,
    editorial_status_date   date
);
How does one query into the versions table and find the latest version? A first try might look something like the following:

select *
from content_versions
where content_id = 5657
and editorial_status = 'approved'
and version_date = (select max(version_date)
                    from content_versions
                    where content_id = 5657
                    and editorial_status = 'approved')

Is this guaranteed to return only one row? No! There is no unique constraint on content_id, version_date. In theory, two editors or authors could submit new versions of an item within the same second. Remember that the date datatype in Oracle is precise only to within one second. Even more likely is that an editor doing a revision might click on an editing form submit button twice with the mouse or perhaps use the Reload command impatiently. Here's a slight improvement:

select *
from content_versions
where content_id = 5657
and editorial_status = 'approved'
and version_id = (select max(version_id)
                  from content_versions
                  where content_id = 5657
                  and editorial_status = 'approved')
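The difference between the two queries is easy to see with a double-clicked submission. The sketch below runs both against SQLite via Python's sqlite3 module; the table is cut down to the columns the queries touch, and the sample rows, including the duplicated time stamp, are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table content_versions (
        version_id       integer primary key,
        content_id       integer not null,
        version_date     text not null,
        editorial_status varchar(30)
    )""")
conn.executemany("insert into content_versions values (?, ?, ?, ?)", [
    (1, 5657, "2005-06-01 10:00:00", "approved"),
    (2, 5657, "2005-06-02 09:30:00", "approved"),
    (3, 5657, "2005-06-02 09:30:00", "approved"),   # submit button clicked twice
    (4, 5657, "2005-06-03 12:00:00", "submitted"),  # newer, but not yet approved
])

# First try: latest approved version by date. Ties on version_date all match.
by_date = [row[0] for row in conn.execute("""
    select version_id
    from content_versions
    where content_id = 5657
    and editorial_status = 'approved'
    and version_date = (select max(version_date)
                        from content_versions
                        where content_id = 5657
                        and editorial_status = 'approved')
    order by version_id""")]

# Improvement: version_id is the primary key, so max(version_id) is unique.
by_id = [row[0] for row in conn.execute("""
    select version_id
    from content_versions
    where content_id = 5657
    and editorial_status = 'approved'
    and version_id = (select max(version_id)
                      from content_versions
                      where content_id = 5657
                      and editorial_status = 'approved')""")]
print(by_date, by_id)
```

The second query relies on version_id values being handed out in increasing order, which holds when they come from a sequence but is an assumption worth stating in the data model documentation.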
but a deeper reading of the manual would reveal that the rownum pseudocolumn is set before the ORDER BY clause is processed. An accepted way to do this in one query is the nested SELECT:

select *
from (select *
      from content_versions
      where content_id = 5657
      and editorial_status = 'approved'
      order by version_date desc)
where rownum = 1

2. fetch one row from the cursor (this will be the one with the max value in version_date)
3. close the cursor
the public pages may be querying for and delivering the latest version ten times per second. Wouldn't it make more sense to compute and tag the most current approved version at insertion/update time?

create table content_versions (
    version_id          integer primary key,
    content_id          not null references content_raw,
    version_date        date not null,
    ...
    editorial_status    varchar(30)
        check (editorial_status in ('submitted','rejected','approved','expired')),
    current_version_p   char(1) check (current_version_p in ('t','f')),
    ...
);
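Tagging at update time means that whatever code approves a version must also move the flag, and both changes need to happen in one transaction. The following is a minimal sketch of that idea using Python's sqlite3 module in place of Oracle. The approve_version function is hypothetical, the table is cut down to the relevant columns, and the sketch assumes that the version being approved is always the newest one for its item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table content_versions (
        version_id        integer primary key,
        content_id        integer not null,
        editorial_status  varchar(30)
            check (editorial_status in ('submitted','rejected','approved','expired')),
        current_version_p char(1) check (current_version_p in ('t','f'))
    )""")


def approve_version(conn, version_id):
    """Approve one version and make it the only current version of its item."""
    with conn:  # commits on success, rolls back if any statement raises
        row = conn.execute(
            "select content_id from content_versions where version_id = ?",
            (version_id,)).fetchone()
        if row is None:
            raise ValueError(f"no such version: {version_id}")
        # Untag whichever version was live until now ...
        conn.execute(
            "update content_versions set current_version_p = 'f' "
            "where content_id = ? and current_version_p = 't'", (row[0],))
        # ... and tag the newly approved one.
        conn.execute(
            "update content_versions "
            "set editorial_status = 'approved', current_version_p = 't' "
            "where version_id = ?", (version_id,))


conn.executemany("insert into content_versions values (?, ?, ?, ?)", [
    (1, 5657, "approved", "t"),    # currently live
    (2, 5657, "submitted", "f"),   # edited version awaiting approval
])
conn.commit()

approve_version(conn, 2)
live = conn.execute(
    "select version_id from content_versions "
    "where content_id = 5657 and current_version_p = 't'").fetchall()
print(live)
```

The public pages then need only a simple test on current_version_p and never have to compute a maximum.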
(current_version_p)
(partition old_crud values less than 's'
    tablespace slow_extra_disk_tablespace,
 partition live_site values less than (maxvalue)
    tablespace fast_new_disk_tablespace)
;

All of the rows for the live site will be kept together in relatively compact blocks. Even if the ratio of old versions to live content is 99:1 it won't affect performance or the amount of RAM consumed for caching database blocks from the disk. As soon as Oracle sees a "WHERE CURRENT_VERSION_P = 't'" clause it knows that it can safely ignore an entire tablespace and won't bother checking any of the irrelevant blocks.
Have we reached Nirvana? Not according to the database eggheads, whose relational calculus formulae do not embrace such factors as how data are spread among physical disk drives. The database theoretician would note that our data model is in Second Normal Form, but not in Third Normal Form. In a table that is part of a Third Normal Form data model, all columns are directly dependent on the whole key. The column current_version_p is not dependent on the table key, but rather on two other non-key columns (editorial_status and version_date). SQL programmers refer to this kind of performance-enhancing storage of derivable data as "denormalization."
If you want to serve ten million requests per day directly from an RDBMS running on a server of modest capacity, you may need to break some rules. However, the most maintainable production data models usually result from beginning with Third Normal Form and adding a handful of modest and judicious denormalizations that are documented and justified.
Note that any data model in Third Normal Form is also in Second Normal Form. A data model in Second Normal Form is also in First Normal Form.
for the computer programs that implement the site. These are most likely in the operating system file system and are edited by a handful of professional software developers. During this class you may decide that it is not worth the effort to set up and use version control, in which case your de facto version control system becomes backup tapes, so make sure that you've got daily backups. However, in the long run you need to learn about approaches to version control for Internet application development.
Throughout this section, keep in mind that a project with a very clear publishing objective, specs that never change, and one very smart developer does not need version control. A project with evolving objectives, changing specifications, and multiple contributors needs version control.
Classical Solution: One Development Area per Developer
Classically, version control is used by C developers with each C programmer working from his or her own directory. This makes sense because there is no persistence in the C world. Code is compiled. A binary runs that builds data structures in RAM. When the program terminates, it doesn't leave anything behind. The entire tree of software is checked out from a version control repository into the file system of the development computer. Changed files are checked back into the repository when the programmer is satisfied.
A shallow objection to this development method in the world of database-backed Internet applications is that it becomes very tedious to make a small change. The programmer checks out the tree onto a development server. The programmer installs an RDBMS, then creates an RDBMS user and a tablespace. The programmer exports the RDBMS from the production site into a dump file, transfers that dump file over the network to the development machine, and imports it into the RDBMS installation on the development server. Keep in mind that for many Internet applications the database may approach one terabyte in size, and therefore it could take hours or days to transfer and import the dump file. Finally, the programmer finds a free IP address or port and sets up an HTTP server rooted at the development tree. Ready to code!
A deeper objection to applying this development method to our world is that it is an obstacle to collaboration. In the Internet application business, developers always work with the publisher and users. Those collaborators need to know, at all times, where to find the latest running version of the software so that they can offer criticism and advice. If there are ten software developers on a service it is not reasonable to ask the publishers and users to check ten separate development sites.
A Solution for Our Times
1. three HTTP servers (they can be on one physical computer)
2. two or three RDBMS users/tablespaces (they can be in one RDBMS instance)
3. one version control repository
Let's go through these item by item.
Item 1: Three HTTP Servers
Suppose that a publisher's overall objective is to serve an Internet application accessible at foobar.com. This requires a production server, rooted in the file system at /web/foobar/ (Server 1). It is too risky to have programmers making changes on the live production site. This requires a development server, rooted at /web/foobar-dev/ (Server 2). Perhaps this is enough. When everyone is happy with the way that the development server is functioning, declare a code freeze, test a bit, then copy the development code over to the production directory and restart.
What's wrong with the two-server plan? Nothing if the development and testing teams are the same, in which case there is no possibility of simultaneous development and testing. For a complex site, however, the publisher may wish to spend a week testing before launching a revision. It isn't acceptable to idle authors and developers while a handful of testers bangs away at the development server. The addition of a staging server, rooted at /web/foobar-staging/ (Server 3), allows development to proceed while testers are preparing for the public launch of a new version.
Here's how the three servers are used:
1. developers work continuously in /web/foobar-dev/
2. when the publisher is mostly happy with the development site, a named version or branch is created and installed at /web/foobar-staging/
3. the testers bang away at the /web/foobar-staging/ server, checking fixes back into the version control repository, but only into the staging branch
4. when the testers and publishers sign off on the staging server's performance, the site is released to /web/foobar/ (production)
5. any fixes made to the staging branch of the code that have not already been fixed by the development team are merged back into the development branch in the version control repository
Item 2: Two or Three RDBMS Users/Tablespaces
Suppose that the publisher has a working production site running version 1.0 of the software. One could connect the development server rooted at /web/foobar-dev/ to the production database. After all, the raison d'être of the RDBMS is concurrency control. It will be happy to handle eight simultaneous connections from a production Web server plus two or three from a development server. The fly in this ointment is that one of the developers might get sloppy and write a program that sends "drop table users" rather than "drop table users_experimental_extra_table" to the database. Or, less dramatically, a junior developer might leave out a WHERE clause in an SQL statement and inadvertently request a result set of 10^9 rows, thus slowing down the production site.
So it would seem that this publisher will need at least one new database. Here are the steps:
1. create a new database user and tablespace; if this is on a separate physical computer from your production RDBMS server it will protect your production server's performance from inadvertent denial-of-service attacks by sloppy development SQL statements
2. export the production database into a file system file, which is a good periodic practice in any case as it will verify the integrity of the database
3. import the database export into the new development database
4. every time that a developer alters a table, adds a table, or populates a new table, record the operation in a patches.sql file
5. when ready to move code from staging to production, hastily apply all the data model modifications from patches.sql to the production RDBMS
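Steps 4 and 5 can be sketched as follows, again using Python's sqlite3 module as a stand-in for the production RDBMS. The contents of the patch script, the content_keywords table, and the apply_patches function are all hypothetical. Note one difference from Oracle: SQLite lets data-definition statements be rolled back as part of a transaction, whereas in Oracle each DDL statement commits implicitly, so the all-or-nothing behavior shown here should not be assumed there.

```python
import sqlite3

# Hypothetical contents of patches.sql: every data model change made on the
# development database since the last release, recorded in order.
PATCHES_SQL = """
alter table content_raw add column zip_code varchar(5);
create table content_keywords (
    content_id integer references content_raw,
    keyword    varchar(100)
);
"""


def apply_patches(conn, patches_sql):
    """Run the whole patch script inside one transaction; undo it all on error."""
    try:
        conn.executescript("begin;\n" + patches_sql + "\ncommit;")
    except sqlite3.Error:
        if conn.in_transaction:
            conn.rollback()
        raise


# Stand-in for the production database, still running the old data model.
production = sqlite3.connect(":memory:")
production.execute(
    "create table content_raw (content_id integer primary key, body text)")

apply_patches(production, PATCHES_SQL)

columns = [row[1] for row in production.execute("pragma table_info(content_raw)")]
tables = sorted(row[0] for row in production.execute(
    "select name from sqlite_master where type = 'table'"))
print(columns, tables)
```

Keeping the patches in one ordered file that is itself under version control means that the data model changes travel with the code that depends on them.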
Should there be three databases, that is, one for development, one for staging, and one for production? Not necessarily. Unless one expects radical data model evolution, it may be acceptable to use the same database for development and staging. Keep in mind that adding a column to a relational database table seldom breaks old queries. This was one of the objectives set forth by E. F. Codd in 1970 in "A Relational Model of Data for Large Shared Data Banks" (http://www.acm.org/classics/nov95/toc.html) and certainly modern implementations of the relational model have lived up to Codd's hopes in this respect.
Item 3: One Version Control Repository
The function of the version control repository is to
• remember what all the previous checked-in versions of a file contained
• show the difference between what's in a checked-out tree and what's in the repository
• help merge changes made simultaneously by multiple authors who might have been unaware of each other's work
• group a snapshot of currently checked-in versions of files as, e.g., "Release 2.1" or "JuneIssue"
An example of a system that meets the preceding requirements is Concurrent Versions System (CVS), which is free and open source. CVS uses a single file system directory as its repository or "CVS root." CVS can run over the Internet so that the repository is on Computer A and development, staging, and production servers are on Computers B, C, and D. Alternatively, you can run everything in separate file system directories on one physical computer.
Good things about this solution
Let's summarize the good things about the version control (for computer programs) solution proposed here:
• if something is screwy with the production server, one can easily revert to a known and tested version of the software
• programmers can protect and comment their changes by explicitly checking files in after significant changes
• teams of programmers and testers can work independently
Further reading: Open Source Development With CVS (Karl Fogel and Moshe Bar [Coriolis, 2001]), a portion of which is available online at http://cvsbook.red-bean.com/cvsbook.html.
Note that generally most teams must write some additional SQL code to complete this exercise, augmenting the data model that they built in Exercise 1.
A skeletal implementation should have stable and consistent URLs, that is, the home page should be just the hostname of the server and filenames should be consistent. If you haven't had a chance to make abstract URLs work (see the "Basics" chapter), this is a good time to do it. Every page should have a descriptive title so that the browser's Back button and bookmarks ("favorites") are fully functional. Every page should have a "View Source" link at the bottom and a way to contact the persons responsible for page function and content. Some sort of consistent navigation system should be in place (also see below). The look and feel of a skeletal implementation will be plain, but it need not be ugly or inconsistent. Look to Google for inspiration, not the personal home pages of fellow students at your university.
Screen Space
In the 1960s a computer user could tap into a 1/100th share of a computer with
1 MB of memory and capable of executing 1 million instructions per second,
viewing the results on a 19-inch monitor. In 2005, a computer user gets a full
share of a computer with 2000 MB of memory (2 GB) and capable of executing
4 billion instructions per second. This is roughly a 400,000-fold improvement in
available computing capability. How does our modern computer user view the
results of his or her computations? On a 19-inch monitor.
Programmers of most applications no longer need concern themselves too much with processor and memory efficiency, which were obsessions in the 1960s. CPU and RAM are available in abundance. But screen real estate is as precious as ever. Look at your page designs. Is the most important information available without scrolling? (In the newspaper business, the term for this is "above the fold.") Are you making the best use of the screen space that you have? Are there large swaths of empty space on the page? Could you be using HTML tables to present two or three columns of information at the same time?
One particularly egregious waste of screen space is the use of icons. Typically, users can't understand what the icons mean so they need to be supplemented with plain language annotation. Generally the best policy is to let the information be the interface, for example, display a list of article categories (the information) where clicking on a category is the way to navigate to a page showing articles within that category.
Time
Most people prefer fast to slow. Most people prefer consistent service time to inconsistent service time. These two preferences contribute substantially to the popularity of McDonald's restaurants worldwide. When people are done with their lunch they bring those same preferences to computer applications: fast is better than slow; response time should be consistent from session to session.
Computer and network speeds will change over the years, but human beings will evolve much more slowly. Thus we should start by considering limits derived from the humanity of our users. The experimental psychologists will tell us that short-term memory is good for remembering only about seven things at once (George A. Miller, "The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information," Psychological Review 63 [1956]: 81-97; http://www.well.com/user/smalin/miller.html) and that this memory is good for only about twenty seconds. It is thus unwise to build any computer application in which users are required to remember too much from one page to another. It is also unwise to build any computer application where the interpage delay is more than twenty seconds. People might forget what task they were trying to accomplish!
IBM Corporation carried out some studies around 1970 and discovered the following required computer response times:
• 0.1 seconds for direct manipulation, e.g., moving objects around on a screen with a pointer
• 1 second for maximum productivity in screen-click-screen systems such as they had on the IBM 3270 terminal back in 1970 and we have on the Web in 2005
• less than 10 seconds to hold the full attention of a user; when response times extended beyond 10 seconds users would try to engage in another task, such as reading a magazine, while also using the computer application
A reasonable goal to strive for in an Internet application is sub-second response time. This goal is based partly on IBM's research, partly on the inability
to achieve (in 2005) the 0.1-second mark at which direct manipulation becomes
possible, and partly on what is being achieved by the best practitioners. Your
users will have used Amazon and Yahoo! and eBay. Any service that is slower
than these is going to set off alarm bells in the user's mind: "Maybe this site is
going to fail altogether? Maybe I should try to find a competitive site that
does the same job but is faster?"
One factor that affects page-loading time is end-to-end bandwidth between
your server and the user. You can't do much about this except measure and average. Some Web servers can be configured or reprogrammed to log the total
time spent serving a page. By looking at the times spent serving large photographs, for example, you can infer average bandwidth available between your
server and the users. If the tenth percentile users are getting 50 Kbits per second, you know that, even if your server were infinitely fast at preparing pages,
you should try to make sure that your pages, with graphics, are either no larger
than 50 Kbits in size or that the HTML is designed such that the page will
render incrementally. (A page that is one big TABLE is bad; a page in which
any images have WIDTH and HEIGHT attributes is good because the text will be
rendered immediately with blank spaces that will be gradually filled in as the
images are loaded.)
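The arithmetic behind the tenth-percentile rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the log format (a list of bytes-sent and seconds-spent pairs), the function names, and the one-second target are ours, not something a particular Web server provides.

```python
import math


def kbits_per_second(bytes_sent, seconds):
    """Effective bandwidth for one logged request, in kilobits per second."""
    return (bytes_sent * 8 / 1000) / seconds


def percentile(values, fraction):
    """Nearest-rank percentile; fraction=0.10 gives the tenth percentile."""
    ordered = sorted(values)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


def page_budget_kbits(log_entries, target_seconds=1.0, fraction=0.10):
    """Largest page, in Kbits, that the slow users can fetch in target_seconds.

    log_entries is an assumed format: (bytes_sent, seconds_spent) pairs taken
    from requests for large files such as photographs.
    """
    rates = [kbits_per_second(b, s) for b, s in log_entries]
    return percentile(rates, fraction) * target_seconds
```

Feeding in only the large-file requests matters: for small files the logged time is dominated by latency and server overhead, so the computed rate would understate the bandwidth.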
You can verify your decisions about page layout and graphics heaviness by
comparing your pages to those of the most successful Internet service operators
such as eBay, Yahoo!, and Amazon.
Remember that in the book and magazine world every page design loads at
the same speed, which means that page design is primarily a question of aesthetics. In the Internet world page design and application speed are inextricably
linked, which makes page design an engineering problem.
Words
As a programmer, there are two kinds of text that you will be putting into the
services that you build: instructions and error messages.
For instructions, you can choose active or passive voice and first, second, or
third person. Instructions should be second person imperative. Leave out the
pronouns, for example, "Enter departure date" rather than "Enter your departure date."
Figure 6.3 Different ways of asking the user to specify a date. Generally it is best to ask
in such a way that the user cannot possibly make a mistake and necessitate the serving
of an error page reading "date not properly formatted," "invalid date," or "date in the
past."
Oftentimes you can build a system such that error messages are unnecessary.
The best user interfaces are those where the user can't make a mistake. For example, suppose that an application needs to prompt for a date. One could do
this with a blank text entry box and no hint, expecting the user to type MM/DD/YYYY, for example, 09/28/1963 for September 28, 1963. If the user's input did not match this pattern or the date did not exist, for example, 02/30/2002, the application returns a page explaining the requirements. A minor improvement would be to add a note next to the box: "MM/DD/YYYY". If the
application logs showed that the number of error pages served was reduced, but
not eliminated, perhaps defaulting the text entry box to today's date in MM/DD/YYYY format would be better. Surf over to your favorite travel site, however, and you'll probably find that they've chosen none of the above. Users
are asked to pick a date from a JavaScript calendar widget or pull down month
and day from HTML menus.
Sadly, you won't be able to eliminate the need for all error messages. Thus
you'll have to make a choice between terse or verbose and between lazy or energetic. A lazy system will respond "syntax error" to any user input that won't
work. An energetic system will try to autocorrect the user's input or at least
figure out what is likely to be wrong.
Studies have shown that it is worthwhile to develop sophisticated error-handling pages, e.g., ones that correct the user's input and serve a confirmation
page. At the very least, it is worth running some regular expressions against the
user's offending input to see if its defects fall into a common pattern that can be
explained on an error page. It is best to avoid anthropomorphism: the computer shouldn't say "I didn't understand what you typed."
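As a rough sketch of what "energetic" might mean for the date example, the Python below accepts a couple of common separators and digit widths, then falls back on regular expressions to classify the defect. The accepted formats, the function name `check_date`, and the exact wording of the messages are our assumptions; a real service would tune them against its own error logs.

```python
import re
from datetime import date

# Each entry pairs a pattern with the meaning of its captured groups.
PATTERNS = [
    (re.compile(r"^(\d{1,2})[/\-.](\d{1,2})[/\-.](\d{4})$"), ("month", "day", "year")),
    (re.compile(r"^(\d{4})-(\d{1,2})-(\d{1,2})$"), ("year", "month", "day")),
]


def check_date(raw, today):
    """Return (date, None) on success or (None, explanation) on failure."""
    text = raw.strip()
    for pattern, order in PATTERNS:
        match = pattern.match(text)
        if match:
            parts = dict(zip(order, map(int, match.groups())))
            try:
                parsed = date(parts["year"], parts["month"], parts["day"])
            except ValueError:
                return None, "invalid date: that day does not exist in the calendar"
            if parsed < today:
                return None, "date in the past"
            return parsed, None
    # Nothing matched, so look for a common defect worth explaining.
    if re.search(r"[A-Za-z]", text):
        return None, "date not properly formatted: use digits only, MM/DD/YYYY"
    if re.match(r"^\d{1,2}[/\-.]\d{1,2}[/\-.]\d{2}$", text):
        return None, "date not properly formatted: use a four-digit year"
    return None, "date not properly formatted: expected MM/DD/YYYY"
```

Quietly accepting "9-28-2030" with stray spaces is the autocorrecting half of the idea; the fallback patterns are the "figure out what is likely to be wrong" half.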
Color
Navigation
As with page design, the best strategy for navigation is to copy the most successful and therefore familiar-to-your-users Internet applications. Best practice
for a site home-page circa 2005 seems to boil down to the following elements:
1. a navigation directory to the rest of the site
2. news and events
3. a single text input box for site-wide search
4. a quick form targeting the most frequently requested service on the site, e.g.,
on an airline site, a quick fare/schedule finder with form inputs for cities and
dates
In building the navigation directory, look at www.yahoo.com. Note that Yahoo! does not use icons for category navigation. To get to the photography category, underneath Arts & Humanities, you click on the word "Photography".
The information is the interface. This principle is articulated in Edward Tufte's
classic Visual Explanations (Graphics Press, 1997). Tufte notes that if you were
to have icons you'd also need a text explanation underneath. Why not let the
text alone be the interface? Tufte also argues for broad and flat presentation
of information; a user shouldn't have to click through eight screens each with
only a handful of choices.
On interior pages, it is important to answer the following questions:
• Where am I?
• Where have I been?
• Where can I go?
To answer "Where am I?" relative to other sites on the Internet, you can include a logo graphic or font-distinguished site name in the upper left corner of
each page, hyperlinked to the site home-page. See the interior pages at amazon.com for how this works. To answer "Where am I?" relative to other pages
on the same site, you can include a site map with the current page highlighted.
On a complex site, this won't scale very well: better to use the Yahoo-style navigation bar, also known as "hierarchical path" or "bread crumbs". For example, http://dir.yahoo.com/Arts/Visual_Arts/Photography/Panoramic/ contains
the following navigation bar:
Home > Arts > Visual Arts > Photography > Panoramic
Note that this bar grows in size as O[log N] where N is the number of
pages on the site. Showing a full site map or top tabs results in linear
growth.
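When the URL structure mirrors the content hierarchy, as in the Yahoo example, the navigation bar can be derived from the path alone. The Python below is a minimal sketch under that assumption; the function names are ours, and a site whose URLs do not mirror its hierarchy would need to consult its category tables instead.

```python
import html


def breadcrumbs(path, home_label="Home"):
    """Turn '/Arts/Visual_Arts/Photography/Panoramic/' into (label, href) pairs."""
    segments = [s for s in path.split("/") if s]
    trail = [(home_label, "/")]
    href = "/"
    for segment in segments:
        href += segment + "/"
        trail.append((segment.replace("_", " "), href))
    return trail


def render_breadcrumbs(trail):
    """Link every ancestor; leave the current page as plain text."""
    *ancestors, (current_label, _) = trail
    links = ['<a href="%s">%s</a>' % (href, html.escape(label))
             for label, href in ancestors]
    return " &gt; ".join(links + [html.escape(current_label)])
```

The length of the trail equals the depth of the page in the hierarchy, which is where the logarithmic growth comes from.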
To answer "Where have I been?", start by making sure not to instruct the
browser to change the standard link colors. The user will thus be cued by the
browser for any links that have already been visited. If you're careful with
your programming and consistent with your page titles, the user will be able
to right-click on the Back button and optionally return to any previous place
on your service. Note further that the Yahoo-style navigation bar is effective
at answering "Where have I been?" for users who have actually clicked down
from the home page.
To answer "Where can I go?" you need . . . links! Let the browser default to
standard colors so that users will perceive the links as links. It is generally a bad
idea to use rollovers, select boxes, or graphics. These controls won't work the
same from site to site and therefore users may not understand how to use them.
These controls don't have the property that visited links turn a different color;
they generally can't or don't tap into the browser's history database. Finally,
these controls aren't effective at showing the user where he or she can go because many of the choices are hidden.
Exercise 5: Criticism
Take or get a tour of the other projects being built by your classmates in this
course. For each project make sure that you familiarize yourself with the overall service objectives and the data model. Then register as a user and author an
article. (If you get stuck on any of these steps, contact the team members behind
the project by phone and email and ask them to add links or hints to their
server.)
Working with your project team members, write a plain-text critique of each
project that you review. Look for situations in which the client's requirements,
as expressed in the planning exercise solutions, can't be fulfilled with the data
model that you see. Look for opportunities to provide constructive criticism.
Remember that your classmates don't need a self-esteem boost; they need the
benefit of your engineering skills.
Sign the critique with the name of your project team and also the names of all
team members.
Email your critique to the team members whose work you've just reviewed.
Archive these in a file and make them available at http://yourservername/doc/critiques/cm-sent.txt. Watch your own inbox for critiques coming in from the
rest of the class. Please assemble these into one file and make them available
at http://yourservername/doc/critiques/cm-received.txt.
• categorize by what's in front of the camera and present the items separated
by subheadlines, e.g., "Portraits", "Architecture", "Wedding", "Family",
"Animals"
• categorize by type of camera used and present items separated by subheadlines such as "Digital point and shoot", "Digital SLR", "35mm point and
shoot", "35mm SLR", "Medium Format", "Large Format"
With such a large part of the user experience driven from database tables, testing an alternative is as easy as inserting some rows into the database from the
information architecture admin pages. If during a site's conceptualization people can't agree on the best categorization of content, it becomes possible to
launch with two alternatives. Half the users see IA 1 and half see IA 2. If users
who've experienced IA 1 are more likely to register and return, we can assume
that IA 1 is superior, at least for first-time users.
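One way to split visitors between the two alternatives, sketched in Python, is to hash a stable visitor identifier so that the same person always sees the same information architecture. Everything here is an assumption made for illustration: the identifier might be a cookie value or user ID, and the return-rate tally stands in for whatever measure of "register and return" the service actually logs.

```python
import hashlib


def assigned_ia(user_key, alternatives=("IA 1", "IA 2")):
    """Deterministically map a visitor to one alternative, splitting traffic evenly."""
    digest = hashlib.sha256(user_key.encode("utf-8")).digest()
    return alternatives[digest[0] % len(alternatives)]


def return_rates(visits):
    """visits: iterable of (ia_name, returned_p); gives the fraction returning per IA."""
    totals, returned = {}, {}
    for ia, came_back in visits:
        totals[ia] = totals.get(ia, 0) + 1
        returned[ia] = returned.get(ia, 0) + (1 if came_back else 0)
    return {ia: returned[ia] / totals[ia] for ia in totals}
```

A difference in return rates means something only with enough visitors in each group; with a handful of users it may simply be noise.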
For the application that you build in this course, it is acceptable to take the
expedient path of pounding out scripts with an implicit information architecture. However, we'd like you to be aware of the power for development and
testing that can be gained from an explicit information architecture.
Note that before embarking on this you may want to read at least the "Separating the Designers and the Programmers" section on templates in the "Software Modularity" chapter.
Software Modularity
At this point in the course, you've built enough software that things may be
starting to get unwieldy. What will life be like for those who maintain your
code? Will they be able to figure out what modules you've written? Will they
be able to find your documentation? Will it be simple to make small changes
site wide?
This chapter is about ways to group all the code for a module, to record the
existence of documentation for that module, to publish APIs to other parts of
the system, and methods for storing configuration parameters.
Grouping Code
Each module in your system will contain the following kinds of software:
• RDBMS table definitions
• stored procedures that run in the database (in Oracle these would be PL/SQL
or Java programs)
• procedures that run inside your Web or application server program that are
shared by more than one page (we'll call these "shared procedures")
• scripts that generate individual pages
• (possibly) templates that work in conjunction with page scripts
• documentation explaining the objectives of the module
Here are some examples of the modules that might be behind a large online
community:
Chapter 7
• user registration
• articles and comments
• discussion forum (shares the same tables with articles, but has radically different workflow for moderation and different presentation scripts)
• chat (separate tables from other content, optimized for extremely rapid
queries, custom JavaScript client software)
• adserver for selling, placing, and logging banner advertisements
• calendar (personal, group, and site-wide events)
• classified ads and auctions
• e-commerce (catalogue of products, table of orders, presentation of product
pages with reviews from community members, billing and accounting)
• email, server-based email (like Hotmail) for community members
• survey (opinion polls and other types of surveys among the members)
• weblog, private blogs for each community member who wants one, possibly
sharing tables with articles, but different editing, approval workflow, and presentation interfaces plus RSS feeds, trackback, and the rest of the machine-to-machine interfaces that are expected in the blog world
• (trouble) ticket tracker for bug and feature request tracking
Good software developers might disagree on the division into modules. For example, rather than create a separate classified ads module, a person might decide that classifieds and discussion are so similar that adding price and bid
columns to an existing content table makes more sense than constructing new
tables and that adding a lot of IF statements to the scripts that present discussion questions and answers makes more sense than writing new scripts.
If the online community is used to support a group of university students and
teachers, additional specialized modules would be added, for example, for
recording which courses are being taught by whom and when, which students
are registered in which courses, what handouts are associated with each class,
what assignments are due and by when, and what grades have been assigned
and by which teachers.
Recall that the software behind an Internet service is frequently updated as
the community grows and new ideas are developed. Frequently updated software is going to have bugs, which means that the system will be frequently
debugged, oftentimes at 2:00 a.m. and usually by a programmer other than the
one who wrote the software. It is thus important to publish and abide by conventions that make it easy for a new programmer to figure out where the relevant source code files are. It might take only fifteen minutes to figure out what
is wrong and patch the system. But if it takes three hours to find the source
code files to begin with, what would have been an insignificant bug becomes a
half-day project.
Let's walk through an example of how the software is arranged on the photo.net service. The server is configured to operate multiple Internet services. Each
one is located at /web/service-name/ which means that all the directories
associated with photo.net are underneath /web/photonet/. The page root for
the site is /web/photonet/www/. The Web server is configured to look for
library procedures (shared by multiple pages) in /web/photonet/tcl/, a
name derived from the fact that photo.net is run on AOLserver, whose default
extension language is Tcl.
RDBMS table, index, and stored procedure definitions for a module are
stored in a single file in the /doc/sql/ directory (directory names in this
chapter are relative to the Web server page root unless specified as absolute).
The name for this file is the module name followed by a .sql extension, for
example, chat.sql for the chat module. Shared procedures for all modules
are stored in the single library directory /web/photonet/tcl/, with each file
named modulename-defs.tcl, for example, chat-defs.tcl.
Scripts that generate individual pages are parked at the following locations:
/module-name/ for the user pages; /module-name/admin/ for the moderator
pages, for example, where a user with moderator privileges would go to delete a
posting; /admin/module-name/ for the site administrator pages, for example,
where the service operator would go to enable or disable a service, delegate
moderation authority to another user, and so forth.
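Conventions like these are regular enough to be written down as a function, which a new programmer (or a script that checks for missing files) can call instead of memorizing the layout. The Python below is merely a restatement of the photo.net conventions described above; the function name and the dictionary keys are hypothetical.

```python
def module_layout(module_name, server_root="/web/photonet"):
    """Where the software for one module lives, following the conventions above."""
    page_root = server_root + "/www"
    return {
        # table, index, and stored procedure definitions
        "sql": f"{page_root}/doc/sql/{module_name}.sql",
        # shared procedures, loaded from the single library directory
        "shared_procedures": f"{server_root}/tcl/{module_name}-defs.tcl",
        # page scripts, by audience
        "user_pages": f"{page_root}/{module_name}/",
        "moderator_pages": f"{page_root}/{module_name}/admin/",
        "site_admin_pages": f"{page_root}/admin/{module_name}/",
    }
```

For the chat module this yields /web/photonet/www/doc/sql/chat.sql and /web/photonet/tcl/chat-defs.tcl, matching the examples in the text.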
A high-level document explaining each module is stored in /doc/module-name.html and linked from the index page in /doc/. This document is intended as a starting point for programmers who are considering using the
module or extending a feature of the module. The document has the following
structure:
1. Where to find all the software associated with this module (site-wide conventions are nice, but it doesn't hurt to be explicit).
2. Big picture information: Why was this module built? Why aren't/weren't
existing alternatives adequate for solving the problem? What are the high-level good and bad features of this module? What choices were considered
in developing the data model?
you are sure that a piece of code is only useful for the particular Web application that you're building, keep it in the Web server as a shared procedure.
Documentation
"As we enter the 21st century we find that rifle marksmanship has been largely lost in the
military establishments of the world. The notion that technology can supplant incompetence is upon us in all sorts of endeavors, including that of shooting."
Jeff Cooper in The Art of the Rifle (Paladin Press, 1997)
Given a system with 1,000 procedures and no documentation, the typical manager will lay down an edict to the programmers: "you must write a doc string
for every procedure saying what inputs it takes, what outputs it generates, and
how it transforms those inputs into outputs." Virtually every programming
environment going back to the 1960s has support for this kind of thinking.
The fancier doc string systems will even parse through directories of source
code, extract the doc strings, and print a nice-looking manual of 1,000 doc
strings.
How useful are doc strings? Useful, but not sufficient. The programmer new
to a system won't have any idea which of the 1,000 procedures and corresponding doc strings are most important. The new programmer won't have any idea
why these procedures were built, what problem they solve, and whether the
whole system has been deprecated in favor of newer software from another
source. Certainly the 1,000 doc strings aren't going to convince any programmers to adopt a piece of software. It is much more important to present clear
English prose that demonstrates the quality of your thinking and design work
in attacking a real problem. The prose does not have to be more than a few
pages long, but it needs to be carefully crafted.
information system to its operators, you will thus very seldom be required to
entertain a suggestion in this area. Only someone with years of relevant experience is likely to propose that a column be added to an SQL table or that five
tables can be replaced with three tables. A much larger number of people are
capable of writing Web scripts. So you'll sometimes be derided for your choice
of programming environment, regardless of what it is or how state-of-the-art it
was supposed to be at the time you adopted it. Virtually every human being on
the planet, however, understands that mauve looks different from fuchsia and
that Helvetica looks different from Times Roman. Thus the largest number of
suggestions for changes to a Web application will be design related. Someone
wants to add a new logo to every page on the site. Someone wants to change
the background color in the discussion forum section. Someone wants to make
a headline larger on a particular page. Someone wants to add a bit of whitespace here and there.
Suppose that you've built your Web application in the simplest and most direct manner. For each URL there is a corresponding script, which contains
SQL statements, some procedural code in the scripting language (IF statements, basically), and static strings of HTML that will be combined with the
values returned from the database to form the completed page. If you break
down what is inside a Visual Basic Active Server Page or a Java Server Page
or a Perl CGI script, you always nd these three items: SQL, IF statements,
HTML.
Development of an application with this style of programming is easy. You
can see all the relevant code for a page in one text editor buffer. Maintenance is
also straightforward. If a user sends in a bug report saying "There is a spelling
error on http://www.yourcommunity.org/foo/bar" you know that you need
only look in one file in the file system (/foo/bar.asp or /foo/bar.jsp or /foo/bar.pl or whatever) and you are guaranteed to find the source of the user's problem. This goes for SQL and procedural programming errors as well.
What if people want site-wide changes to fonts, colors, headers and footers?
This could be easy or hard depending on how you've crafted the system.
Suppose that default colors are read from a configuration parameter system
and headers, footers, and per-page navigation aids are generated by the page
script calling shared procedures. In this happy circumstance, making site-wide
changes might take only a few minutes.
What if people want to change the wording of some annotation in the static
HTML for a page? Or make a particular headline on one page larger? Or add a
bit of white space in one place on one page? This will require a programmer
because the static HTML strings associated with that page are embedded in a
file that contains SQL and procedural language code. You don't want someone
to bring a section of the service down because of a botched attempt to fix a
typo or add a hint.
The Small Hammer
The simplest way to separate the programmers from the designers is to create
two files for each URL. File 1 contains SQL statements and some procedural
code that fills local variables or a data structure with information from the
RDBMS. The last statement in File 1 is a call to a procedure that will fetch
File 2, a template file that looks like standard HTML with simple references
to data prepared in File 1.
Suppose that File 1 is named index.pl and is a Perl script. By convention,
File 2 will be named index.template. In preparing a template, a designer
needs to know (a) the names of the variables being set in index.pl, (b) that one
references a variable from the template with a dollar sign, for example,
$standard_navbar, and (c) that to send an actual dollar sign or at-sign
character to the user it should be escaped with a backslash. The merging of
the template and local variables established in index.pl can be accomplished
with a single call to Perl's built-in eval procedure, which performs standard
Perl string interpolation, that is, replacing $foo with the value of the variable
foo.
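The same two-file idea can be sketched in Python with the standard library's string.Template class, which also references variables with a dollar sign. This is an analogue, not the Perl mechanism described above: there is no eval involved, a literal dollar sign is written as $$ rather than escaped with a backslash, and an at-sign needs no escaping. The template text, the variable first_names, and the function names are made up for illustration.

```python
from string import Template

# Stand-in for File 2 (index.template): plain HTML plus $variable references.
INDEX_TEMPLATE = """<html>
<body>
$standard_navbar
<h2>Welcome, $first_names</h2>
<p>Membership costs $$25 per year.</p>
</body>
</html>
"""


def build_page_variables():
    """Stand-in for File 1 (index.pl); a real script would query the RDBMS here."""
    return {
        "standard_navbar": '<a href="/">Home</a>',
        "first_names": "Joe",
    }


def merge_template(template_text, variables):
    """Replace each $name in the template with the value prepared by the script."""
    return Template(template_text).substitute(variables)
```

Because substitute raises KeyError when the template mentions a variable the script never set, a typo by the designer shows up as an error rather than as a silently empty spot on the page.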
The Medium Hammer
If the SQL/procedural script and the HTML template are in separate files in
the same directory, there is always a risk that a careless designer will delete,
rename, or modify a computer program. It may make more sense to establish
a separate directory and give the designers permission only on that parallel
tree. For example on photo.net you might have the page scripts in /web/photonet/www/ and templates underneath /web/photonet/templates/. A
script at /ecommerce/checkout.tcl finishes by calling the shared procedure
return_template. This procedure first invokes the Web server API to find
out what URI is being served. A configuration parameter specifies the start of
the templates tree. return_template uses the URL plus the template tree root
to probe in the file system for a template to evaluate. If found, the template, in
AOLserver ADP format (same syntax as Microsoft ASP), is evaluated in the
Figure 7.1
sponding template would be substituted for the <slave> tag in the master
template and the result of evaluating the master template returned to the
user
• when a master template was not found in the same directory as the script,
the server would search at successively higher levels in the file system until a
master template was found, then apply that one
Figure 7.1 is an example of how what the user viewed would be divided by
master and slave templates. Content in gray is derived from the master template. Note that this doesn't mean that it is static or not page specific. If a template
is an ASP or JSP fragment, it can execute arbitrarily complex computer programs to generate what appears within its portion of the page. Content in white
comes from the per-page template.
This sounds inefficient due to the large number of file system probes. However, once a system is in production, it is easy for the Web server to cache,
per-URL, the results of the file system investigation. In fact, the Web server
could cache all of the templates in its virtual memory for maximum speed.
The reason that one wouldn't do this during development is that it would
make debugging difficult. Every time you changed a template you'd have
to restart the Web server or clear the cache in order to view the results of the
change.
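The upward search and the production-time cache can be sketched together in Python. The file name master.adp, the function name, and the use_cache switch are assumptions for illustration; the text does not say what the master template is called or how the real server keys its cache.

```python
import os

_master_cache = {}  # script directory -> master template path (or None)


def find_master_template(script_dir, page_root, name="master.adp", use_cache=True):
    """Search script_dir, then each parent up to page_root, for a master template."""
    if use_cache and script_dir in _master_cache:
        return _master_cache[script_dir]
    page_root = os.path.abspath(page_root)
    current = os.path.abspath(script_dir)
    found = None
    while True:
        candidate = os.path.join(current, name)
        if os.path.isfile(candidate):
            found = candidate
            break
        if current == page_root:
            break  # do not look above the page root
        parent = os.path.dirname(current)
        if parent == current:
            break  # reached the top of the file system
        current = parent
    if use_cache:
        _master_cache[script_dir] = found
    return found
```

With use_cache=True a master template added later in a closer directory goes unnoticed until the cache is cleared, which is exactly the debugging nuisance described above; during development one would pass use_cache=False.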
Intermodule APIs
Recall from the "User Registration and Management" chapter that we want
people to be accountable for their actions within an online community. One
way to enhance accountability is by offering a "user contributions" page that
will show all contributions from a particular user. Wherever a person's name
appears within the application it will be a hyperlink to this user contributions
page.
Given that all site content is stored in relational database tables, the most obvious way to start writing the user contributions page script is by looking at the
SQL data models for each individual module. Then we can write a program
that queries a few dozen tables to nd all contributions by a particular user.
A drawback to this approach is that we now have code that may break if we
change a module's data model, yet this code is not within that module's subdirectory, and this code is probably being authored by a programmer other
than the one maintaining the individual module.
Let's consider a different application: email alerts. Suppose that your community offers a discussion forum and a classified ad system, coded as separate
modules. A user wishes to get a daily summary of activity in both areas. Each
module could offer a completely separate alerts mechanism. However, this
would mean that the user would get two email messages every night when a single combined email was desired. If we build a combined email alert system,
however, we have the same problem as with the user history page: shared code
that depends on the data models of individual modules.
Finally, let's look at the site administrator's job. The site administrator is
probably a busy volunteer. He or she does not want to waste twenty mouse
clicks to see today's new content. The site administrator ought to be able to
view recently contributed content from all modules on a single page. Does
that mean we will yet again have a script that depends on every table definition
from every module?
Here's a hint at a solution. On the photo.net site each module defines a "new
stuff" procedure, which takes the following arguments:
The output of such a procedure can be simple: HTML for a Web page or plain
text for an email message. The output of such a procedure can be a data structure. The output of such a procedure could be an XML document, to be rendered with an XSL style sheet. The important thing is that pages interested in
new stuff site wide need not be familiar with the data models of individual
modules, only the name of the new stuff procedure corresponding to each
module. This latter task is made easy on photo.net: as each module is loaded
by the Web server, it adds its new stuff procedure name to a site-wide list.
A page that wants to display site-wide new stuff loops through this list, calling
each named procedure in turn.
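The registration pattern can be sketched in Python as follows. This is not the photo.net code, which is written in Tcl; the single `since` argument is a placeholder of ours, and the two modules return canned strings where real ones would query their own tables.

```python
from datetime import datetime

# Site-wide list; each module appends to it as the server loads the module.
NEW_STUFF_PROCEDURES = []


def register_new_stuff(module_name, procedure):
    NEW_STUFF_PROCEDURES.append((module_name, procedure))


# --- would live with the discussion module's shared procedures ---
def discussion_new_stuff(since):
    # A real version would query the discussion tables for rows newer than `since`.
    return ["<li>New question: best lens for panoramas?</li>"]

register_new_stuff("discussion", discussion_new_stuff)


# --- would live with the classified ads module's shared procedures ---
def classifieds_new_stuff(since):
    return []  # nothing new today

register_new_stuff("classifieds", classifieds_new_stuff)


def site_wide_new_stuff(since):
    """Loop through the list, calling each procedure; knows nothing about data models."""
    sections = []
    for module_name, procedure in NEW_STUFF_PROCEDURES:
        items = procedure(since)
        if items:
            sections.append("<h3>%s</h3>\n<ul>\n%s\n</ul>" % (module_name, "\n".join(items)))
    return "\n".join(sections)
```

Adding a new module to the site then requires no change to the page that displays new content; the module only has to register its own procedure.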
Configuration Parameters
It is possible, although not very tasteful, to build a working Internet application with the following items hard-coded into each individual page:
• RDBMS username and password
• email addresses of site administrators who wish notifications on events such
as new user registration or new content posting
• the email address of a sysadmin to notify if the Web server can't connect to
the RDBMS or in case of other errors
• IP addresses of users we don't like
• legacy URLs and the new URLs to which requests for the old ones should be
redirected
• the name of the site
• the names of the editors and publishers
• the maximum attachment size that the site is willing to accept (maybe you
don't want a user uploading an 800 MB TIFF image as an attachment to a
bboard posting)
• whether or not to serve a link offering the source code behind the page
The ancient term for this approach to building software is "putting magic numbers in the code". With magic numbers in the code, it is tough to grab a few
scripts from one service and apply them to another application. With magic
numbers in the code, it is tough to know how many programs you have to examine and modify after a personnel change. With magic numbers in the code, it
is tough to know if rules are being enforced consistently site wide.
Where should you store parameters such as these? Except for the database
username and password, an obvious answer would seem to be "in the database". There are a bunch of keys (the parameter names) and a bunch of values
(the parameters). This is the very problem for which a database management
system is ideal.
-- use Oracle's unique key generator
create sequence config_param_seq start with 1;

create table config_param_keys (
    config_param_key_id    integer primary key,
    key_name               varchar(4000) not null,
    param_comment          varchar(4000)
);

-- we store the values in a separate table because there might
-- be more than one for a given key
create table config_param_values (
    config_param_key_id    not null references config_param_keys,
    value_index            integer default 1 not null,
    param_value            varchar(4000) not null
);

-- we use the Oracle operator "nextval" to get the next
-- value from the sequence generator
insert into config_param_keys
values
(config_param_seq.nextval, 'view_source_link_p', 'damn 6.171 instructor is making me do this');

-- we use the Oracle operator "currval" to get the last
-- value from the sequence generator (so that rows inserted in this transaction
If the script gets a row with "t" back, it includes a "View Source" link at the
bottom of the page. If not, no link.
Recording a redirect required the storage of two rows in the config_param_values table, one for the "from" and one for the "to" URL. When a
request comes in, the Web server will want to query to figure out if a redirect
exists:
select cpk.config_param_key_id
from config_param_keys cpk, config_param_values cpv
where cpk.config_param_key_id = cpv.config_param_key_id
and key_name = 'redirect'
and value_index = 1
and param_value = :requested_url
N-way joins notwithstanding, how tasteful is this approach to storing parameters? The surface answer is "extremely tasteful". All of our information is in
the RDBMS where it belongs. There are no magic numbers in the code. The
parameters are amenable to editing from admin pages that have the same
form as all the other pages on the site: SQL queries and SQL updates. After a
little more time spent with this problem, however, one asks "Why are we
querying the RDBMS one million times per day for information that changes
once per year?"
Questions of taste aside, an extra five to ten RDBMS queries per request is a
significant burden on the database server, which is the most difficult part of an
Internet application to distribute across multiple physical computers (see the
"Scaling" chapter) and therefore the most expensive layer in which to expand
capacity.
A good rule of thumb is that Web scripts shouldn't be querying the RDBMS
to figure out what to do; they should query the RDBMS only for content and
user data.
For reasonable performance, configuration parameters should be accessible
to Web scripts from the Web server's virtual memory. Implementing such a
scheme with a threaded Web server is pretty straightforward because all the
code is executing within one virtual memory space:
• look in the server API documentation to find a mechanism for saying "run
this bit of code at server startup time"
• build an in-memory hash table where the parameter keys are the hash table
keys
• load the parameter values associated with a key into the hash table as a
list
• document an API to the hash table that takes a key as an input and returns a
value or a list of values as an output
A hash table is best because it oers O[1] access to the data, that is, the time
that it takes to answer the question what is the value associated with the
key foobar does not grow as the number of keys grows. In some hobbyist
computer languages, built-in hash tables might be known as associative
arrays.
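The steps above can be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: it assumes Python, with the built-in sqlite3 module standing in for the RDBMS. The function names load_config_params and get_param, the demo helper, and the sample key view_source_link_p are hypothetical. The table and column names follow the config_param_keys and config_param_values query shown earlier.

```python
import sqlite3

# Hypothetical in-memory cache: parameter key -> list of values.
_PARAMS = {}

def load_config_params(conn):
    """Run once at server startup: read every parameter into a dict."""
    rows = conn.execute(
        """select cpk.key_name, cpv.param_value
           from config_param_keys cpk, config_param_values cpv
           where cpk.config_param_key_id = cpv.config_param_key_id
           order by cpk.key_name, cpv.value_index"""
    )
    _PARAMS.clear()
    for key_name, param_value in rows:
        _PARAMS.setdefault(key_name, []).append(param_value)

def get_param(key_name, default=None):
    """Return the single value for a key, the list if there are several,
    or default if the key is unknown. No database query happens here."""
    values = _PARAMS.get(key_name)
    if values is None:
        return default
    return values[0] if len(values) == 1 else list(values)

def demo_connection():
    """Build a throwaway database so the sketch can be run on its own."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """create table config_param_keys (
               config_param_key_id integer primary key,
               key_name text not null);
           create table config_param_values (
               config_param_key_id integer references config_param_keys,
               value_index integer not null,
               param_value text not null);
           insert into config_param_keys values (1, 'view_source_link_p');
           insert into config_param_keys values (2, 'redirect');
           insert into config_param_values values (1, 1, 't');
           insert into config_param_values values (2, 1, '/old-page');
           insert into config_param_values values (2, 2, '/new-page');"""
    )
    return conn
```

A page script would then call get_param("view_source_link_p") instead of querying the database on every request. The price of this design is that a changed parameter is not seen until the table is reloaded.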
If you expect to have a lot of configuration parameters, it might be best
to add a "section" column to the config_param_keys table and query by section and key. Thus, for example, you can have a parameter called
bug_report_email in both the discussion and user_registration sections. The
key to the hash table then becomes a composite of the section name and key
name.
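One way to picture the composite key, assuming Python and a dict keyed by (section, key_name) tuples. The helper names build_param_table and lookup are hypothetical, and the example.com addresses are placeholders:

```python
def build_param_table(rows):
    """rows: iterable of (section, key_name, param_value) tuples,
    for example the result of a query run at server startup."""
    table = {}
    for section, key_name, param_value in rows:
        # The composite of section and key name is the hash table key.
        table.setdefault((section, key_name), []).append(param_value)
    return table

def lookup(table, section, key_name):
    """Return the list of values for a section/key pair (empty if unset)."""
    return table.get((section, key_name), [])

# Placeholder data: the same key name used in two different sections.
rows = [
    ("discussion", "bug_report_email", "forum-bugs@example.com"),
    ("user_registration", "bug_report_email", "signup-bugs@example.com"),
]
params = build_param_table(rows)
```

Because the section is part of the key, the two bug_report_email parameters never collide, and lookup remains a single hash table probe.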
With Microsoft .NET
Configuration parameters are added to IIS/ASP.NET applications in the
Web.config file for the application.
For example, if you place the following in c:\Inetpub\wwwroot\Web.config
(assuming default IIS installation)
<configuration>
  <appSettings>
    <add key="publisherEmail"
         value="[email protected]" />
  </appSettings>
</configuration>
More:
• "ASP.NET Configuration" from .NET Framework Developer's Guide at
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpguide/html/cpconaspnetconfiguration.asp
(note that the MSDN guys haven't figured out how to do abstract URLs and they also haven't converted to .aspx
yet!)
You can also specify the parameter in the WEB-INF/web.xml file for your
application:
<context-param>
  <param-name>companyName</param-name>
  <param-value>My Company, Inc.</param-value>
</context-param>
The override attribute in the first example specifies that you do not want this
value to be overridden by a context-param tag in the web.xml file. The default
value is true (allow overrides).
To retrieve parameters from a servlet or JSP, you can call:
getServletContext().getInitParameter("companyName");
More:
• documentation for Context: http://jakarta.apache.org/tomcat/tomcat-4.0-doc/config/context.html
• javadoc for ServletContext: http://jakarta.apache.org/tomcat/tomcat-4.0-doc/servletapi/javax/servlet/ServletContext.html
Exercise 1
Create a /doc/ directory on your team server. Create an index page in this directory that links to a development standards document (/doc/development-standards would be a reasonable URL, but you can use whatever you like so
long as it is clearly linked from /doc/).
In this development standards document, cover at least the following issues:
1. naming of URLs: abstract versus non-abstract (bleah), dashes versus
underscores (hard for many users to read), spelled out or abbreviated
2. naming of URLs used in forms and form processing: will these be at
the same URL or will a user working through a sequence of forms proceed
/foo/bar, /foo/bar-1, /foo/bar-2, etc.?
3. RDBMS used
4. computer languages used for Web scripts and procedural code within the
RDBMS
5. means of connecting to the RDBMS (libraries, bind variables, etc.)
6. variable-naming conventions
7. how to document a module
8. how to document a shared procedure
9. how to document a Web script (author, valid inputs)
10. how Web form inputs are validated by scripts
11. templating strategy chosen (if any)
12. how to add a configuration variable and how to name it so that at least all
parameters associated with a particular module can be identified quickly
Step back from your document before moving on to the next exercise. Ask
yourself "If a new programmer joined this project tomorrow, and I asked her
to build a surveys module, would she be able to be an effective consistent developer in my environment without talking to me?" Remember that a surveys
module will require an extensive administrative interface for creation of surveys, questions, and possible answers, both admin and user interfaces for looking at results, and a user interface for answering surveys. If the answer to the
question is "Gee, this new programmer would have to ask me a lot of questions," go back and make your development standards document more explicit
and add some more examples.
Exercise 2
Document your team's intermodule API within the /doc/ directory, perhaps at
/doc/intermodule-API, linked from the doc index page. Your strategy must
be able to handle at least the following cases:
• production of a site administrator's page containing all content going back a
selectable number of days, with administration links next to each item, without the page script having any dependence on any module's data model
• production of a user-level page showing new content site wide
• a centralized email alert system in which a user gets a nightly summary combining new content from multiple modules
If your page scripts call this procedure any time they are writing user-uploaded
content out to a browser, no browser will ever interpret user-uploaded data as
an HTML tag.
That works great for fields such as first_names, last_name, street_address,
subject summary lines, and so forth, where there is no value to having an HTML tag. For some longer documents obtained from users, however,
it might be nice to enable them to use a restricted set of HTML tags such as B,
I, EM, P, BR, UL, LI, and so on. If you're going to store HTML in the database once and serve it back out thousands of times per day, it is better to check
for legal tags at upload time. The problem with checking for disallowed tags
such as SCRIPT, DIV, and FONT is that HTML keeps getting extended in
de jure and de facto ways. Unless you want the responsibility of keeping current with all of the ways in which new HTML tags can make browsers behave,
it may be better to check for approved tags. Either way, you'll want the
allowed or disallowed tags list to be kept in an easy-to-modify configuration
file. Further, you probably want to perform a bit of validation on the use of
allowed tags such as B or I. A user who makes a mistake and forgets to close
one of these tags might render 100 comments underneath in an unusual font
style.
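An upload-time check for approved tags might look something like the sketch below, which assumes Python and its standard html.parser module. The function name check_user_html is hypothetical, and the two sets stand in for lists read from a configuration file. Rejecting attributes on allowed tags and choosing which tags must be closed are extra assumptions of this sketch, not requirements stated above.

```python
from html.parser import HTMLParser

# Assumed contents of the configuration file: tags users may type,
# and the subset that must be explicitly closed.
ALLOWED_TAGS = {"b", "i", "em", "p", "br", "ul", "li"}
MUST_CLOSE = {"b", "i", "em", "ul"}

class _TagChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.problems = []
        self.open_tags = []  # stack of must-close tags still open

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag not in ALLOWED_TAGS:
            self.problems.append("tag not allowed: " + tag)
        elif attrs:
            self.problems.append("attributes not allowed on: " + tag)
        elif tag in MUST_CLOSE:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in ALLOWED_TAGS:
            self.problems.append("tag not allowed: " + tag)
        elif tag in MUST_CLOSE:
            if self.open_tags and self.open_tags[-1] == tag:
                self.open_tags.pop()
            else:
                self.problems.append("unexpected closing tag: " + tag)

def check_user_html(text):
    """Return a list of problems; an empty list means the upload passes."""
    checker = _TagChecker()
    checker.feed(text)
    checker.close()
    for tag in checker.open_tags:
        checker.problems.append("tag never closed: " + tag)
    return checker.problems
```

A form-processing script could refuse to store any upload for which check_user_html returns a non-empty list and show the problems back to the user.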
Exercise 3
Document your team's approach to preventing one user from attacking other
users with malicious HTML. Your documentation of this infrastructure should
include procedure names and examples of how those procedures are to be used.
Discussion
A discussion forum is one of the most basic tools for computer-supported cooperation among human beings. User A can post a question. User B can post
an answer. User C can view both question and answer and learn from the exchange. In a threaded forum, User D has the choice of posting a response to
User A's question or to User B's response. In a Q&A format forum, Users D,
E, and F can post responses to User A's question, and the responses will simply
be presented in the order that they were submitted. With minor tweaks to the
presentation layer, a discussion forum system can function as a personal commentable weblog.
In this chapter you'll prototype a discussion forum, conduct a usability test,
and then refine your system based on what you learned from observing the
users.
Chapter 8
Aviation in itself is not inherently dangerous. But to an even greater degree than the sea, it
is terribly unforgiving of any carelessness, incapacity or neglect.
Captain A. G. Lamplugh, 1930s
In a USENET group the magnet content can be any longish posting from a
recognized expert. Keep in mind that the number of people using a group such
as rec.aviation.soaring is fairly small; most people get nervous in little
planes and even more nervous in a little plane with no engine. An analysis of
October 2004's activity by Marc Smith's Netscan service (netscan.research.microsoft.com)
shows that the group had only 174 "Returnees." Thus it will
be fairly straightforward for these core users to recognize each other by name
or e-mail address. A typical magnet content posting in a newsgroup is the FAQ
or "frequently asked questions" summary in which each question has an agreed-upon-by-the-group-experts answer.
The means of collaboration in the USENET group is the ability for any
member to start a new thread or reply to a message within an existing thread.
In the early days of USENET, the means of browsing and searching were reasonably good for recent messages, but terrible or nonexistent for learning from
older exchanges. Starting in the mid-1990s, Web-based search engines such as
DejaNews provided fast and easy access to old messages.
USENET has traditionally been weak on the fourth required element
(means of delegation of moderation). Not enough people have volunteered
to moderate, software to divide the effort of moderating a single forum among
multiple moderators was nonexistent, and the news protocols had security holes
that let commercial spam messages through even on moderated groups. For
If the engine stops for any reason, you are due to tumble, and that's all there is to it!
Clyde Cessna
Beyond USENET
If the online learning community that you build is only as good as USENET,
congratulate yourself. The Google USENET archive contains 700 million
messages from twenty years. Hundreds of thousands of people have gotten the
answers to their questions, as shown in figure 8.1.
When building our own database-backed discussion forum system, there
are some simple improvements that we can add over the traditional USENET
system:
• an optional "mail me when a response is posted" field
• email summaries or instant alerts
• up-to-the-second full text indexing (assuming your RDBMS supports it)
Figure 8.1 A December 25, 2001 USENET exchange in the group rec.aviation.soaring
regarding mounting a camera on the wing of a glider. Notice that the first answer comes
less than two hours after the question was posted. See http://groups.google.com/group/rec.aviation.soaring/browse_thread/thread/666ed14e676ce92e.
forum with an interstitial page explaining how to search and browse archived
threads. If the online community is short on moderator time, it will make a lot
of sense to query for those users whose postings have resulted in moderator intervention. If it turns out that 0.1 percent of the users consume 50 percent of
the moderators' time, perhaps it is better to ban that handful of users and
thereby double the community's available moderation resources.
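The "which few users consume most of the moderators' time" question can be sketched as below. This assumes Python and assumes that each moderator intervention has been logged with the user ID of the posting's author. The function name heaviest_users and the log format are hypothetical.

```python
from collections import Counter

def heaviest_users(intervention_user_ids, share=0.5):
    """Return the smallest group of users, most-moderated first, whose
    postings account for at least `share` of all moderator interventions."""
    counts = Counter(intervention_user_ids)
    total = sum(counts.values())
    selected, covered = [], 0
    for user_id, n in counts.most_common():
        if covered >= share * total:
            break
        selected.append(user_id)
        covered += n
    return selected
```

Comparing the length of the returned list with the total number of registered users shows whether a small handful really dominates the moderation workload.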
As the semester proceeds, you'll discover another advantage of building
your own discussion forum, which is that it becomes an integrated part of
your service. All of a user's contributions in different areas, including the discussion forum, are queryable from a single database and viewable on a single
page.
Flying is inherently dangerous. We like to gloss that over with clever rhetoric and comforting statistics, but these facts remain: gravity is constant and powerful, and speed kills. In
combination, they are particularly destructive.
Dan Manningham
Exercise 1
Visit five sites on the public Internet with discussion forums, one of which can
be the Medium Format Digest forum at photo.net (http://www.photo.net/bboard/q-and-a?topic_id=35). For each site gather the following statistics:
• given an already-registered user, the number of clicks required to post a
message
• the number of clicks required to go from the top-level forum page to a single
thread
• if there are 20 postings within a thread, the number of clicks required to view
all the text within all of the postings
• the number of clicks required to view the subject lines of all archived postings
in a particular category
List the user interface and customer service features that you think are the
best from these five sites and give a brief explanation of why each feature is
good.
I certainly had no feeling for harmony, and Schoenberg thought that that would make it
impossible for me to write music. He said, "You'll come to a wall you won't be able to
get through." So I said, "I'll beat my head against that wall."
John Cage
If something is boring after two minutes, try it for four. If still boring, then eight. Then
sixteen. Then thirty-two. Eventually one discovers that it is not boring at all.
John Cage
It would be easy to justify the creation of 100 separate forums on our music
site. And indeed USENET contains more than 50 rec.music.* groups, including
rec.music.beatles.moderated, for example. That turns out to be the tip of
the iceberg, for the alternative hierarchy sports more than 700 alt.music.*
groups, including alt.music.celine-dion and alt.music.j-s-bach. If
USENET can support nearly 1,000 discussion forums, surely a popular comprehensive music site ought to have at least 100.
Maybe not.
When discussion is fragmented, it is hard for a community to get off the
ground. If there are 50 users and 100 forums, how will those users find each
other? The average visit will result in a user concluding that the community
isn't active. Such a user is unlikely to return or refer a friend to the site. Even
when a community is large enough to support numerous forums, presenting
discussion in a fragmented manner leads to extra work for the user whose interests are diverse. Suppose that a music scholar comes to USENET looking to
see if there has been any recent discussion of Bach's Schübler Chorales and
their influence on later composers. That's as simple as visiting alt.music.j-s-bach. If that scholar wants to check up on recent postings concerning Celine
Dion's "My Heart Will Go On," he or she will have to scan alt.music.celine-dion
separately.
A good example of a thriving community with a single discussion forum is
slashdot.org. It is very easy to find the topics being actively discussed on slashdot: look at the front page.
It is possible to take the "one forum" and "many forum" approaches on
the same site at the same time. For example, look at http://www.photo.net/bboard/
(static copy at http://philip.greenspun.com/seia/images-discussion/photonet-bboard-original.htm).
There are separate Medium Format, Nature
Photography, and Photo Critique forums. For a user to browse the new postings in these three forums will require seven mouse clicks: down into this page,
down into Medium Format, back, down into Nature, back, down into Critique. With a different SQL query, however, postings from all these very same
forums can be combined on one page, as in http://www.photo.net/bboard/unified/
(static copy at http://philip.greenspun.com/seia/images-discussion/photonet-bboard-unified.htm).
Postings from particular forum topics may be
distinguished with a special publisher-chosen color or icon. Suppose that the
user finds the Photo Critique forum overwhelming and uninteresting. These
postings can be excluded from his or her personalized unified view via clicking on the "Customize forums" link at the top
(static copy at http://philip.greenspun.com/seia/images-discussion/unified-forum-personalization.htm) and
unchecking those forums that are no longer of interest.
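The "different SQL query" behind a unified, personalized view might resemble the sketch below. It assumes Python, with the built-in sqlite3 module standing in for the RDBMS. The tables forum_postings and user_excluded_forums, their columns, and the sample rows are invented for illustration and are not the photo.net data model.

```python
import sqlite3

def unified_postings(conn, user_id, limit=50):
    """Newest postings across all forums, minus forums this user unchecked."""
    return conn.execute(
        """select forum, subject
           from forum_postings
           where forum not in (select forum
                               from user_excluded_forums
                               where user_id = ?)
           order by posted_at desc
           limit ?""",
        (user_id, limit),
    ).fetchall()

def demo_forum_db():
    """Throwaway database with three forums and one personalization row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """create table forum_postings (
               posting_id integer primary key,
               forum text not null,
               subject text not null,
               posted_at text not null);
           create table user_excluded_forums (
               user_id integer not null,
               forum text not null,
               primary key (user_id, forum));
           insert into forum_postings values
               (1, 'Medium Format', 'Which 6x7 camera?', '2005-03-01 09:00'),
               (2, 'Nature Photography', 'Tripods for birding', '2005-03-01 10:00'),
               (3, 'Photo Critique', 'Please critique my sunset', '2005-03-01 11:00');
           insert into user_excluded_forums values (42, 'Photo Critique');"""
    )
    return conn
```

One query serves both the default and the personalized page: a user with no rows in user_excluded_forums simply sees every forum.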
She had a voice like the New Jersey State Anthem played on an electric razor.
Bright Lights, Big City by Jay McInerney
This is your community and these are your users. So in the long run only you
can know what administrative actions are most needed. At a minimum, however, you should support the following:
• find the most active contributors
• select a contributor to become a co-moderator (presumably from the above
list)
• approve or disapprove a posting or a thread (this might be handled by more
general pages from your content management system, though remember that
moderating a discussion forum ought to be a very streamlined process); note
that these functions could be worked into the user pages, but only enabled for
those logged-in users who have moderator privileges
In-Class Presentations
At this point we recommend that teams present their functioning discussion
forum implementations. So that the audience can evaluate the workability of
the interface, the forums should be preloaded with questions and answers of
realistic length, with material copied from Google Groups if necessary.
A suggested outline for the presentation is the following:
• explain the kinds of people who are expected to use the discussion subsystem,
e.g., it might be only the site administrators (30 seconds)
• without logging in or logged in as a casual visitor, demonstrate the pages that
show all the forums (if more than one), questions within a forum, and questions and answers within a single thread (1 minute)
• demonstrate responding to an existing question/adding to an existing thread
(30 seconds)
• demonstrate asking a new question/starting a new thread (30 seconds)
• log in as a forum moderator or site administrator (15 seconds)
• demonstrate disapproving or moderating down a posting (30 seconds)
• demonstrate viewing statistics on forum usage and participation level by user
(1 minute)
• show the source code for the page that shows a single thread (one question,
many answers), with the SQL query (or queries) highlighted (1 minute)
• show the execution plan for that query or those queries, i.e., the output of
whatever SQL performance-tracing tool is available in the RDBMS chosen
for this project (1 minute)
The presentation should be accompanied by a handout that shows (a) the data
model that supports discussion, (b) any SQL code invoked by the URL that
displays one thread of discussion (pulled out of whatever imperative language
scripts it is embedded in), and (c) the results of the query trace.
Usability
At this point your discussion forum should work. Users can register. Users can
ask questions. Users can post answers. Is it usable? Well, consider that most
computer programs were considered perfect at one time by their creator(s). It
is only in encounters with real users that most problems become evident.
These encounters between freshly minted Internet applications and first users
have become increasingly startling for all parties. One reason is the large and
growing user experience gap. In 1994 the average Web user was a researcher
with a Unix machine on his or her desk. Very likely the user knew how to write
at least simple computer programs. The average Web page was straight HTML
2.0 with no scripts or other active components. All Web pages worked the
same: you read the black text, you clicked on the blue text, you were reminded
by the purple text that you'd already visited a link. Once you learned how to
use your first Web site, you knew how to use all subsequently visited sites.
The user experience gap has grown larger because the users are less sophisticated while the applications have grown more complex. In 2005 the average
Web user is a first-time computer user and the Web browser may be the only
application that he or she knows how to use. Despite the manifest inability of
these users to cope with a complex user interface, Web sites have been tarted up
with JavaScript, ActiveX, Java, Flash, to the point where they are as hard to
use and different from each other as old Unix applications. Users unable or unwilling to deal with the horrors of custom user interfaces have voted with their
mice. They buy at Amazon. They search at Google. They get their information
from Yahoo! and nytimes.com.
Idiosyncratic ideas make sense for magazine and television advertisements.
Different is good when it takes the user the same 30 seconds to absorb the message. But different is bad if it means the user needs extra time or extra clicks to
get to the desired task. Some studies show that on each extra click there is a 50
percent chance that a user will abandon the site altogether. In mid-2000, Webvan purchased HomeGrocer, a competing grocery delivery company, and converted the old HomeGrocer users to the new Webvan user interface. Orders fell
by more than half. The HomeGrocer business went from breaking even to losing lots of cash simply because of the inferior usability of the Webvan software.
Ultimately Webvan went bankrupt, taking with it $1.2 billion in invested cash.
Figure 8.2 "Think of others, you could be a user yourself" and "It is an offense not
to flush the toilet after use." Men's room interior, Singapore. Photo copyright Philip
Greenspun.
Figure 8.3 As the Internet gets older, applications become more complex and difficult
to use while the average user becomes less and less experienced. Source: Mark Hurst,
www.goodexperience.com.
How is it possible that people follow what they imagine to be their own good
taste instead of either copying the successful Internet services (e.g., Yahoo!,
Amazon, Google) or listening to the users? And that people continue to believe
in the value of their own ideas even as the red ink starts to dominate their financial reports? Justin Kruger and David Dunning, experimental psychologists
at Cornell University, wondered the same thing and wrote up their findings in
"Unskilled and Unaware of It: How Difficulties in Recognizing One's Own
Incompetence Lead to Inflated Self-Assessments" (Journal of Personality and
Social Psychology 77, no. 6 [1999]: 1121-1134; http://www.phule.net/mirrors/unskilled-and-unaware.html).
Kruger and Dunning found that people in the
12th percentile of skill estimated themselves to be in the 62nd. Furthermore,
these incompetent people failed to recalibrate themselves when shown the range
of performance by their peer group. The authors concluded that "those with
limited knowledge in a domain suffer a dual burden: Not only do they reach
mistaken conclusions and make regrettable errors, but their incompetence robs
them of the ability to realize it."
Figure 8.4 Source: "Why You Only Need to Test With 5 Users" by Jakob Nielsen;
http://www.useit.com/alertbox/20000319.html.
too many subconscious expectations). Run your usability test subjects in series,
one after the other, with your entire team observing and writing down what
happens. Ask your subjects to voice their thoughts aloud. How long does it
take the subject to complete a task? Does the subject get stuck on any step?
Does the subject indicate confusion as to the appropriate next step at any time?
A scientist is someone who measures her results against Nature. An engineer is someone
who measures her results against human needs. A computer scientist is someone who
doesn't measure his results.
us
Use the following script of tasks (cut and paste these into a separate document, and print it out, after filling in the bracketed sections), with no extra hints:
1. starting as an unregistered user at the site home page, find the area on the
site where one would ask questions of other users (if you can't accomplish
this task, or any other task on this page, within 3 minutes, just move on)
2. read through the existing questions and answers to determine whether
or not [some question that has been asked already] has been asked and
answered already; if not, post a question on that subject (registering if
necessary)
3. read through the existing questions and answers to determine whether or
not [some question that has not been asked already] has been asked and
answered already; if not, post a question on that subject
4. log out
5. log in with the existing username/password of [user/password] and try to
find all the unanswered questions in the discussion forum
6. answer the question(s) that you yourself posted a few minutes earlier, pretending to be this other user
7. log out
8. log in with the existing username/password of [admin user/password] and
find the administrator's pages
9. delete the discussion forum thread(s) that you created earlier
10. log out
In between test subjects, clean up any rows that they may have left in database
tables. If your first subject has a disastrous experience, consider taking a few
hours off to fix your software, add links and annotation, and so on, before proceeding with the second subject.
Stand as far away from the subject as you possibly can while still being able
to see the computer screen and hear the subject's comments. Force yourself to
remain absolutely silent. If the subject is completely confused and clicking
around randomly, let the subject continue until he or she figures it out. Keep
track of the number of seconds each subject requires to complete each task.
Post a report on your team server at /doc/testing/discussion-usability. This report will contain a summary of what you learned from
this test with average task times and average total time (we can use these to
compare the efficiency of various teams' solutions). The report should contain
hyperlinks to subpages that contain transcripts of individual user sessions, what
each test subject said, and what happened. Link to your report from your main
documentation index page.
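The average task times and average total time for the report can be computed with a few lines of code. The sketch below assumes Python and assumes that each subject's timings were recorded as a list of seconds in task order. The function name summarize_sessions is hypothetical.

```python
from statistics import mean

def summarize_sessions(sessions):
    """sessions: one list per test subject, holding the seconds spent on
    each task in script order. Returns (average seconds per task,
    average total seconds per subject)."""
    # zip(*sessions) regroups the timings task by task.
    per_task = [mean(times) for times in zip(*sessions)]
    avg_total = mean(sum(times) for times in sessions)
    return per_task, avg_total
```

With two subjects and three tasks, summarize_sessions([[40, 120, 90], [60, 180, 30]]) gives per-task averages of 50, 150, and 60 seconds and an average total of 260 seconds.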
• Jane Experienced visits the "be a mentor" page and browsing through the
requests sees that most people asking for help want to keep South American
Cichlids, with which she has no experience. However, Jane has had an African tank for five years and feels confident that she can help Joe. She agrees to
mentor Joe.
• Jane's workspace page now contains a subsection relating to her mentoring of Joe and lists his currently open questions. Jane clicks on a question
title and, seeing that none of the current responses are truly adequate, posts
her own authoritative answer.
• A week later Joe returns to world-o-cichlids.org and finds that his list of
open questions has gotten quite long and that in fact many of these questions are no longer relevant for him. He clicks on the "close" button next to
a question, and the server asks him "Which of the responses actually
answered the question for you?" Joe clicks on a response from Ned Malawinut, and the database records (1) that the question has been adequately
answered and should no longer appear in a mentor's workspace, and (2)
that Ned Malawinut has contributed an answer that was seen as useful by another member.
• Joe has a question that he thinks might be ridiculous and is afraid to try it
out on the community at large. When posting he checks the "initially show
only to my mentor" option, and the question gets sent via email to Jane and
appears in her workspace.
• Jane returns to the server and decides that Joe's question is not so easy to answer. She marks it for release to the general membership.
• Two weeks later Jane gets an email from the world-o-cichlids.org server. A
summary of some discussion threads that she has been following constitutes
the bulk of her email, but right at the top is a note: "You haven't logged in
for more than a week and Joe, whom you're supposed to be mentoring, has
accumulated three questions that haven't been adequately answered after five
days." (This prodding mechanism addresses the issue revealed when a large
management consulting firm surveyed its employees asking "Whom are you
mentoring?" and "Who is mentoring you?" When matching the responses,
there was surprisingly little overlap!)
How can you estimate the effort required in building the full user experience
example? Start by looking at the number of new tables and columns that you'd
be adding to the system and the number of new URLs to which the server
would be responding. Then try to find a subsystem that you've already built
for this project with a similar number of tables and page scripts. The implementation effort should be comparable.
Let's start with the data model first. To support requests for and assignment of mentors, you'll need at least one table, mentor_mentee_map, with
the following columns: mentee, mentor (NULL, if not assigned), date_of_request,
date_of_assignment, mentee_goal. To support the query "who
is the currently connected member mentoring" and build the workspace subsection page for Jane, you'll want to add an index on the mentor column. To
support the query "are there any mentors who should be notified about a message posted by a member," you would add an index on the mentee column. If
you were to make this a concatenated index on mentee, mentor, it would help
the database identify outstanding requests for mentors (mentor is NULL) efficiently for the "be a mentor" page.
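A rough rendering of this table and its indexes, assuming Python with the built-in sqlite3 module standing in for the project's RDBMS. The column types, the index names, and the helper functions open_requests and mentees_of are assumptions made for illustration. Only the table name and column names come from the description above.

```python
import sqlite3

MENTOR_SCHEMA = """
create table mentor_mentee_map (
    mentee              integer not null,
    mentor              integer,            -- NULL until a mentor signs up
    date_of_request     text not null,
    date_of_assignment  text,
    mentee_goal         text
);
-- supports: who is the currently connected member mentoring?
create index mentor_mentee_map_by_mentor on mentor_mentee_map(mentor);
-- concatenated index; supports: who mentors the member who just posted?
create index mentor_mentee_map_by_mentee on mentor_mentee_map(mentee, mentor);
"""

def open_requests(conn):
    """Members still waiting for a mentor, oldest request first."""
    return conn.execute(
        """select mentee, mentee_goal from mentor_mentee_map
           where mentor is null order by date_of_request"""
    ).fetchall()

def mentees_of(conn, mentor_id):
    """User IDs of the members this mentor has agreed to help."""
    return [row[0] for row in conn.execute(
        "select mentee from mentor_mentee_map where mentor = ? order by mentee",
        (mentor_id,))]
```

Whether a particular RDBMS actually uses these indexes for the queries shown is worth confirming with its query-tracing tool rather than assuming.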
Attempting to support the open/closed question status display and the query
"Which members have answered a lot of questions well?" might make you regret some of the data model decisions that you made in the preceding exercises
and/or in the "Content Management" chapter exercises. In the "Content Management" chapter, we have a headline asking "What Is Different about Discussion?" above the suggestion that the content_raw table can be used to support
forum questions and answers. If you went down that route and were implementing the mentoring user experience, this is where discussion would diverge
a bit from the rest of the content on the site. You need a way to represent in the
database management system whether a discussion forum question is open or
closed. If you add a discussion_forum_question_status column to the
content_raw table, you'll have a NULL column value whenever the content
item is not a discussion forum question. That's not very clean. You may also
be adding a closed_question_p boolean column to indicate that a forum
posting had been identified by the original questioner as having answered the
question. This will be NULL for more than 99 percent of content items. That's
not a storage efficiency problem, but it is sort of ugly.
An alternative to adding columns is to build some sort of "bag-on-the-side"
table recording which questions are open and closed and which answers closed
them. To decide whether or not this is a reasonable approach, it is worth starting by asking "In what percentage of queries will the helper table need to be
JOINed in?" When presenting articles and comments, you wouldn't need the
table. When presenting the discussion forum to a public user, that is, someone
who wasn't logged in, the discussion forum page scripts wouldn't need the table
data. You might need these data only when serving workspace pages to members and when serving an individual discussion forum thread to a logged-in
member. It might be worth considering a table of the following form:
-- content_id is the primary key here; it is possible to have at most
-- one row in this table for a row in the content_raw table
create table discussion_question_status (
    content_id    not null primary key references content_raw,
    status        varchar(10) check (status in ('open', 'closed')),
    -- if the question is closed the next column will contain
    -- the content_id of the posting that closed it
    closed_by     references content_raw
);
-- make it fast to figure out whether a posting closed a question
create index discussion_question_status_by_closed_by on
    discussion_question_status(closed_by);
As the community gains experience with this system, it will probably eventually want to give greater prominence to responses from members with a history
of writing good answers. In a fully normalized data model, for each answer displayed, the server would have to count up the number of old answers from the
author and query the discussion_question_status table to figure out what
percentage of those were marked as closing the question. In practice, you'd
probably want to maintain a denormalized metric as an extra column or
columns in the users table, perhaps columns for n_answers_posted and
n_answers_closing, counts maintained by nightly batch updates or database
triggers.
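A nightly batch update of these counters could be sketched as follows, assuming Python with the built-in sqlite3 module standing in for the RDBMS. The content_type and creation_user columns of content_raw, the 'forum_answer' type value, and the function names are assumptions for illustration. Only n_answers_posted, n_answers_closing, and discussion_question_status come from the text.

```python
import sqlite3

def refresh_answer_counts(conn):
    """Nightly batch job: recompute the denormalized counters in users."""
    conn.execute(
        """update users set
             n_answers_posted = (select count(*) from content_raw c
                                 where c.creation_user = users.user_id
                                 and c.content_type = 'forum_answer'),
             n_answers_closing = (select count(*)
                                  from content_raw c, discussion_question_status s
                                  where c.creation_user = users.user_id
                                  and s.closed_by = c.content_id)"""
    )
    conn.commit()

def demo_answers_db():
    """Throwaway database: user 1 asks two questions, user 2 answers twice."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        create table users (user_id integer primary key,
                            n_answers_posted integer default 0,
                            n_answers_closing integer default 0);
        create table content_raw (content_id integer primary key,
                                  content_type text not null,
                                  creation_user integer references users);
        create table discussion_question_status (
            content_id integer primary key references content_raw,
            status text check (status in ('open', 'closed')),
            closed_by integer references content_raw);
        insert into users (user_id) values (1), (2);
        insert into content_raw values (10, 'forum_question', 1),
                                       (11, 'forum_answer', 2),
                                       (12, 'forum_answer', 2),
                                       (13, 'forum_question', 1);
        insert into discussion_question_status values (10, 'closed', 11),
                                                      (13, 'open', null);
    """)
    return conn
```

Page scripts can then rank answers using two columns already present in the users row, at the cost of counts that may be up to a day stale.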
Supporting the "initially show only to my mentor" option for new content would require the addition of a show_only_to_mentor column to the
content_raw table, where it could be used for discussion forum postings, comments on articles, and any other content item. Rather than changing all of the
pages that use the content tables, it would be easier to update the SQL views
that those pages use, for example, articles_approved, so as to exclude content that should be shown only to a mentor.
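The effect of changing a view rather than the pages can be sketched as follows. The example again uses an in-memory SQLite database so that it is self-contained. The content_type, title, and approved_p columns, the 't'/'f' flag convention, and the exact view definition are assumptions for illustration; only the show_only_to_mentor column and the articles_approved view name come from the text.

```python
import sqlite3

MENTOR_SCHEMA = """
create table content_raw (
    content_id           integer primary key,
    content_type         varchar(100) not null,
    title                varchar(200),
    approved_p           char(1) default 'f' check (approved_p in ('t', 'f')),
    show_only_to_mentor  char(1) default 'f' check (show_only_to_mentor in ('t', 'f'))
);

-- The only change needed for the new feature is the last line of the view.
create view articles_approved as
select * from content_raw
where content_type = 'article'
  and approved_p = 't'
  and show_only_to_mentor = 'f';
"""


def build_mentor_demo_db():
    """Three articles: one public, one mentor-only, one not yet approved."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(MENTOR_SCHEMA)
    conn.executemany(
        "insert into content_raw (content_id, content_type, title, approved_p,"
        " show_only_to_mentor) values (?, ?, ?, ?, ?)",
        [(1, "article", "Public article", "t", "f"),
         (2, "article", "Draft for my mentor", "t", "t"),
         (3, "article", "Unapproved article", "f", "f")])
    conn.commit()
    return conn


def approved_article_titles(conn):
    """The query an existing page script might run; it knows only the view."""
    rows = conn.execute("select title from articles_approved order by content_id")
    return [row[0] for row in rows]
```

The page-level query is untouched, yet the mentor-only draft no longer appears in its results. A mentor-facing page would query content_raw directly, or use a second view, to see the hidden items.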
Some new page scripts would be required, at least the following:
• /workspace: a page or sidebar providing a logged-in member with links to
previously asked questions and possibly other information as well, e.g., new
content since last visit, recent content by members previously marked as interesting, etc. A mentor viewing this page would also be offered links to content marked "show only to my mentor" by the author.
• /mentoring/request-form: a page whereby a member can sign up to request
a mentor
• /mentoring/request-confirm: a script that processes the preceding form and
adds a row to the mentor_mentee_map table
• /mentoring/sign-up: a page that shows members who are requesting mentors, with at least the first 200 characters of their request underneath
• /mentoring/request-detail: a click-down page showing more details of a
member's request for a mentor
• /mentoring/sign-up-confirm: a script that accepts a member's agreement to
serve as a mentor, updating a row in the mentor_mentee_map table
• /mentoring/admin/: a page showing summary statistics for the service
For the purposes of this course, you need not implement all of these grand
ideas, and indeed some of them don't make sense when a community is just getting started because the number of members is so small. If, however, some of
these ideas strike you as interesting, consider adding them to your project implementation plan.
planning/YYYYMMDD-discussion. (If you name files with year-month-day in
Exercise 9: Execute
After consultation with your teaching assistant, execute your planned
improvements.
Among the principles of sustainable online community in the Planning chapter of this textbook, notice that the following are not mentioned:
7. means of waiting for machines to boot up
8. means of chaining users to their desks
9. means of producing repetitive strain injury
Though the alternatives vary in popularity from country to country as we write
this chapter (February 2005), there is no reason to believe that desktop computer programs such as Mozilla Firefox and Microsoft Internet Explorer are
the best way of participating in online communities.
In this chapter you'll learn how to open your community to users connecting
from small mobile devices.
Be the User
If you were to close your eyes and visualize a person participating in your community, what would this participation look like? The users you've considered
thus far would probably be sitting at a desk with their hands keyboarding sixty
words per minute and their gazes set upon a twenty-inch screen. By contrast, a
mobile user might be walking along a busy street or looking down from a
mountain top. Their screen will be a few inches across, and they may be able
to type only five or ten words per minute. What kinds of content and means
of participation will best suit this class of users?
Chapter 9
Exercise 1
Either using your phone or one of the emulators discussed later in this chapter,
use the mobile Internet to
• find the weather forecast for your city
• get a stock quote for IBM
• look up "ineluctable" in the dictionary
• order a book from amazon.com (at least up to the final checkout page)
• visit www.photo.net and find the latest question that has been asked
For each task, write down how long it takes you to accomplish the task. Then
repeat the tasks with a desktop HTML browser and write down how long each
task takes.
Exercise 2
Come up with a list of two or three services from your learning community
that will be valuable to mobile users. You may find the following guidelines
useful:
• Timeliness. A community is sustained by the active participation of its members. Though the members will often be separated in time, anyone who has
participated in a heated bulletin board debate, an online auction, or a chat
session can appreciate the value of timely interaction. Mobile browsers are
particularly well suited to this type of interaction because they allow the
user to stay connected in a wide variety of settings.
• Brevity. Users with small screens will have a difficult time receiving, reading,
and entering large amounts of content.
• Native applications. Mobile browsers are commonly bundled with cellular
telephones. Until phone companies provide General Packet Radio Service
(GPRS) in your users' region, it is impossible to deliver an application that
simultaneously uses voice and hypertext. However, it is possible to produce
a hypertext document that provides one-click dialing to a publisher-specified
phone number.
Figure 9.1 Content to mobile devices goes from an HTTP server on the public Internet
via TCP/IP and is sometimes translated into proprietary formats and protocols within
a phone company's wireless network before reaching the handset.
Standards
Though the bits may be transported through a proprietary network, anyone
can serve content to mobile devices with a standard Web server (figure 9.1).
As illustrated in figure 9.1, the cell phone connects to your server through the
service provider's wireless network. Depending on the phone and network, the
"Wireless Network" cloud may contain standard Internet Protocol (IP) routing, a standard HTTP proxy, or a WAP gateway. In the last case, the gateway
and phone communicate using a special set of protocols that, among other
things, compresses data before transmission over the wireless network. The net
effect is that the phone's browser (sometimes called a microbrowser) looks to a
public HTTP server like a standard Web browser issuing HTTP GETs and
POSTs.
The mobile industry is consuming markup languages at a rapid rate. The progression
has taken us from the Handheld Device Markup Language (HDML; 1997) to the Wireless Markup Language (WML; 1998) to the current recommendation, XHTML Mobile
Profile (XHTML-MP; 2001). We can take heart from the fact that XHTML-MP is
derived from XHTML, the World Wide Web Consortium recommendation for standard
browsers. Gone are the bad old days when a developer had to learn a new markup language, and servers had to be configured to send new Content-Type headers, in order to
deliver mobile content. We expect that XHTML-MP will thereby enjoy wider adoption
and greater stability.
The text in bold (above) is what the programmer types, simulating a microbrowser request. The exchange looks a lot like what we'd see for a regular
HTML browser. The main differences are the inclusion of the XML declaration and document-type definition in the first two lines of the document, and
the use of the namespace attribute, xmlns, in the opening html tag.
A server wishing to distinguish between desktop and mobile users could
search the contents of the HTTP Accept header for the string application/xhtml+xml; profile="http://www.wapforum.org/xhtml", which is supposedly required by the XHTML Mobile Profile specification (http://www.openmobilealliance.org/tech/affiliates/wap/wap-277-xhtmlmp-20011029-a.pdf). By contrast, a desktop browser, if it lists XHTML among the formats
that it accepts, will generally not refer to the mobile profile. Here's what Microsoft Internet Explorer 6.0 supplies as an Accept header:
image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword,
application/x-shockwave-flash, */*
Note that Mozilla is making full use of the original conception of the Web in which the
server and the client would negotiate to provide the user with the best possible file in response to a request for an abstract URL. The order of the MIME types in the Accept
header is irrelevant. The browser indicates its preference with quality values; for example,
in the value text/html;q=0.9, Mozilla is indicating that plain vanilla HTML is less
preferred than the three preceding XML types, which default to a quality of 1.0. To
learn more about this system, see the section on "Quality Values" in the HTTP 1.1 specification, at http://www.w3.org/Protocols/rfc2616/rfc2616.html
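A server-side check of the Accept header along these lines can be sketched in a few lines of Python. This is a simplified illustration, not a complete implementation of HTTP content negotiation: the function names are hypothetical, and the parser assumes that no parameter value contains a comma or semicolon, which holds for the headers shown in this chapter but not for every legal header.

```python
def parse_accept(header):
    """Return (media_type, q, params) tuples, most preferred first.

    A media type without an explicit q parameter gets the default quality 1.0.
    """
    entries = []
    for part in header.split(","):
        pieces = [piece.strip() for piece in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        quality = 1.0
        params = {}
        for piece in pieces[1:]:
            name, _, value = piece.partition("=")
            name = name.strip().lower()
            value = value.strip().strip('"')
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q value as "not acceptable"
            else:
                params[name] = value
        entries.append((media_type, quality, params))
    # sorted() is stable, so types with equal quality keep their header order
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def accepts_mobile_profile(header):
    """True if the client lists XHTML with the WAP Forum mobile profile."""
    for media_type, quality, params in parse_accept(header):
        if (media_type == "application/xhtml+xml"
                and quality > 0
                and "wapforum.org/xhtml" in params.get("profile", "")):
            return True
    return False
```

Called with the Internet Explorer header shown above, accepts_mobile_profile returns False, because that header never mentions XHTML. A header that includes application/xhtml+xml with the mobile profile parameter yields True.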
No Extra Headers
% telnet www.google.com 80
Trying 216.239.57.99...
Connected to www.google.com.
Escape character is '^]'.
GET / HTTP/1.1

HTTP/1.1 200 OK
Date: Tue, 22 Apr 2003 01:20:53 GMT
Cache-control: private
Content-Type: text/html
Server: GWS/2.0
Content-length: 2691

<html><head>
<meta http-equiv="content-type"
content="text/html; charset=ISO-8859-1">
<title>Google</title><style>...</style>...
</head><body>...</body></html>
Connection closed by foreign host.

Claiming to be a Palm
% telnet www.google.com 80
Trying 216.239.57.99...
Connected to www.google.com.
Escape character is '^]'.
GET / HTTP/1.1
User-Agent: UPG1 UP/4.0 (compatible; Blazer 1.0)
Though neither request indicates a preferred media type, Google's server recognizes the Blazer browser that ships with Handspring palm-top devices and
redirects the browser, via the response lines HTTP/1.1 302 Found and Location: http://www.google.com/palm. Sadly, there is no centrally maintained registry of user agents, and therefore success with this method is largely
a matter of programmer diligence.
Exercise 3
Summary
<?xml...> declaration and running through the closing </html> tag, into a
file called ex1.html on your Web server and load the example into different
kinds of browsers. We recommend that you place this file in a /mbl/ subdirectory.
Figure 9.2
Step 1: mobile browser. Load the page into a mobile browser and admire
your handiwork. If you do not have access to a Web-enabled phone, install or
locate emulator software, either a PC microbrowser emulator or Web-based
tool. See the links at the end of this chapter for suggestions. Suppose for a moment that you had placed the document at /mbl/software-engineering-for-internet-applications/examples/example1.html. Would that affect the amount of time required to complete this exercise?
Step 2: desktop browser. Now load the page into your favorite desktop
browser program. Marvel at the cross-browser compatibility of your document.
Compare your subjective experience of the content in the two cases, then
answer the following question: In a world where desktop browsers and mobile
browsers can parse the same markup syntax, do we need to distinguish between
the two, or can we serve the same document to every type of user?
Keypad Hyperlinks
Let's look at a page with hyperlinks:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.0//EN"
  "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>Student Life</title>
</head>
<body>
<ol>
<li><a href="calendar" accesskey="1">Calendar</a></li>
<li><a href="/academics/grades" accesskey="2">Grades</a></li>
<li><a href="urgent-messages" accesskey="3">Urgent Messages</a></li>
<li><a href="events/frat-parties" accesskey="4">Fraternity Parties</a></li>
<li><a href="http://news.google.com/" accesskey="5">News</a></li>
</ol>
</body>
</html>
A numbered series of choices is presented in a list, with each choice hyperlinked to the appropriate target. We take advantage of the anchor tag's accesskey attribute to improve usability by letting the user link to any of the choices
with a single keypress.
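When the menu comes out of a database rather than a static file, a page script can assign the access keys as it writes the list. The following Python sketch shows one way to do that; the function name and the limit of nine items (one per digit key) are choices made for this illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def keypad_menu(links):
    """Render (href, label) pairs as an <ol> with accesskey 1 through 9.

    Raises ValueError for more than nine links, since this sketch maps each
    item to a single digit key.
    """
    if len(links) > 9:
        raise ValueError("this sketch assigns only the keys 1 through 9")
    items = []
    for key, (href, label) in enumerate(links, start=1):
        # quoteattr adds the surrounding quotes and escapes &, <, and >
        items.append('<li><a href=%s accesskey="%d">%s</a></li>'
                     % (quoteattr(href), key, escape(label)))
    return "<ol>\n" + "\n".join(items) + "\n</ol>"
```

For example, keypad_menu([("calendar", "Calendar"), ("/academics/grades", "Grades")]) produces the first two list items of the page above. Escaping matters because XHTML-MP is XML: a bare ampersand in a label or query string would make the whole document ill-formed.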
Exercise 4
Forms and server-side processing work the same way for mobile browsers as
they do for desktop browsers. Write an XHTML-MP document that prompts
for an email address (or screen name, if you've decided to ignore the sociologists' advice about anonymity) and password, then POSTs these to a target on
your server. The server's response should print back the email address entered
and the first character of the password, followed by one period for each subsequent password character. We recommend that you place your code so that it is
accessible via URIs starting with /mbl/.
Figure 9.3
The mobile interface should be accessible to the mobile user who types only the
hostname of your site, that is, the user should not have to type in the /mbl/
subdirectory. This is typically accomplished by an IF statement in the top-level
script of your Web server's page root.
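The IF statement might look something like the following Python sketch. It is deliberately framework-neutral: the function names are hypothetical, the request headers are assumed to arrive as a dictionary keyed by the usual header names, and the list of user-agent hints contains only the one browser mentioned in this chapter, so a real site would need a longer, maintained list.

```python
# Substrings that suggest a microbrowser. Illustrative only and incomplete.
MOBILE_AGENT_HINTS = ("blazer",)


def wants_mobile(headers):
    """Guess from the Accept and User-Agent headers whether to serve /mbl/."""
    accept = headers.get("Accept", "").lower()
    agent = headers.get("User-Agent", "").lower()
    if "application/xhtml+xml" in accept and "wapforum.org/xhtml" in accept:
        return True
    return any(hint in agent for hint in MOBILE_AGENT_HINTS)


def top_level_target(path, headers):
    """Return the path to serve: /mbl/ for mobile visitors to the page root."""
    if path == "/" and wants_mobile(headers):
        return "/mbl/"
    return path
```

Only requests for the page root are rerouted, so a mobile user who follows a link to a desktop page still gets that page. Whether to issue an HTTP redirect or to serve the mobile content directly at the root URL is a design choice left to the implementer.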
This is a good opportunity to be creative. Browsing from a phone can be
slow, expensive, and painful. Every line of information has to be critically important to the user. Here are a few ideas to get you started:
• someone who has asked a question in an online community will be very interested in new answers to that question
• in a small community, a simple list of users and their phone numbers that can
be dialed with one keypress from a mobile browser might be very useful
The Future
In most countries the mobile Internet has not lived up to expectations of wide
success. The standout exception is the i-mode system, which has become the
dominant means of Internet access in Japan. We think that two reasons explain
i-mode's relative success: always-on connectivity and revenue opportunities for
publishers.
Western mobile Internet systems typically involved a dialup and sign-on delay of as long as two minutes for the first page; with the always-on i-mode system, the user gets consistent performance and relatively quick results for initial
requests. Early Western mobile systems charged per minute, which was painful
for users who typed text slowly on numeric keypads and received pages at 9600
baud. Always-on systems such as i-mode tend to charge a per-byte or flat per-month rate for access, which greatly reduces the possibility of a huge end-of-month bill.
In most mobile Internet systems, the phone company decides what sites are
going to be interesting to users and places them on a set of default bookmarks.
The phone company often charges the site publisher to be promoted to its customers. The result? Every early system in the United States made it easy to connect to amazon.com and shop for books, which turned out not to be a popular
activity. DoCoMo, the Japanese company that runs the i-mode service, took a
different approach. DoCoMo decided that they weren't creative enough to figure out what consumers would want out of the mobile Internet. They therefore
came up with a system in which content providers are more or less equally
available. Content providers can earn revenue via banner advertisements or by
charging for premium content. When a provider wants to charge, DoCoMo
handles the payment, taking a 5 to 9 percent commission.
The combination of always on and non-starvation for content providers created an explosion of creativity on the part of publishers. The most popular
services seem to be those that connect people with other people, rather than
business-to-consumer amazon.com-style e-commerce.
Is there hope that the mobile Internet will eventually become as popular as
i-mode is in Japan? The first ray of hope was provided by General Packet
Radio Service (GPRS). GPRS takes advantage of lulls in voice traffic within a
cell to deliver a theoretical maximum of 160 Kbits/second via whatever frequencies are unused at any particular moment. GPRS requires new handsets that are equipped
to listen on the dedicated circuit-switched connection in
use for a voice call and simultaneously monitor GPRS frequencies for incoming packets.
In practice, GPRS may provide only three or four times faster throughput than
existing WAP systems. More important is the fact that GPRS can, in theory,
deliver an always-on experience similar to that of i-mode or a hardwired
desktop computer.
As noted above, with GPRS the wireless Internet will become a place that
supports simultaneous voice and text interaction. For example, the following
scenario can be realized:
• User dials an airline phone number
• Airline: "Please speak your departure city"
• User: "London"
Notice that voice prompting and recognition are convenient when a user is
choosing from among hundreds of alternatives, for example, the world's airports. However, voice becomes agonizing if the user must listen to a long list
of detailed choices; prompting with text may be much better when more than
two or three choices are available, especially if each choice requires elaborate
specification. Keep in mind "The Magical Number Seven, Plus or Minus
Two: Some Limits on Our Capacity for Processing Information" by George
A. Miller (Psychological Review 63 [1956]: 81-97, http://www.well.com/user/smalin/miller.html).
There is no evidence that the phone companies outside Japan will wise up to
the power of revenue sharing. However, with the introduction of GPRS the
wireless Internet will become something better than a novelty. For more on
the subject of GPRS, see Peter Rysavy, "Emerging Technology: Clear Signals
for General Packet Radio Service," Network Magazine, December 2000 (http://www.rysavy.com/Articles/GPRS2/gprs2.html).
More
Standards information:
• http://www.openmobilealliance.org: Open Mobile Alliance, the standards-making body for mobile computing
• http://www.wapforum.org/what/technical.htm: legacy site for the WAP
Forum, a predecessor of the Open Mobile Alliance. Much of the WAP technical documentation, including the XHTML-MP, WTAI, and WAP architecture specifications, resides here.
• http://www.w3.org/TR/css-mobile: CSS Mobile Profile 1.0 specification, for
controlling the display style of XHTML-MP documents
10
Voice (VoiceXML)
In every computing era, programmers have been responsible for writing the
fundamental application logic. During the desktop application era (1980s), the
attention given to this logic was generally dwarfed by that given to the user
interface, event handling, and graphics code that a programming team needed
to write to get a computer program into the hands of users. Result: very little
innovation at the individual level; most widely used computer programs were
written by large companies.
During the Web era (1990s), the user interface and graphics were rendered
by the Web browser, for example, Netscape Navigator or Microsoft Internet
Explorer. Programmers were able to deliver a complete system to end-users
after writing only the application logic and some simple HTML specifying the
user interface behavior. Result: a revolution in innovation, with most Web
applications written in a few months by a handful of people.
Suppose that you'd observed that telephones are much more common and
portable than personal computers and Web browsers. Furthermore, you'd
noticed that telephones are able to be used by almost everyone, whereas many
consumers have little patience for the complexities of the PC. Thus, you'd want
to make your information system accessible to a user with only a telephone.
How would you have done it? In the 1980s, you'd rent a telephone line, buy a
big specialized box to recognize utterances, buy another specialized box to talk
to the user, and park those boxes right next to the main server for your application. In the 1990s, you'd have had to rent a telephone line, buy specialized
software, and park a standard computer running that software next to the
server running your application. Result in both decades: very little innovation,
with only the largest organizations offering voice/telephone interfaces to their
information systems.
With the advent of today's voice browsers, the coming years promise to be
a period of tremendous innovation in the development of telephone-accessible
Internet applications. With a Web application, you operate the HTTP server
and run the application code; someone else runs the browser. The idea of the
voice browser is the same. You operate a server and the application. Someone else, perhaps the phone company, runs the telephone lines and voice
browser.
Bottom line: voice browsers allow you to build telephone voice applications
with nothing more than an HTTP server. From this, great innovation shall
spring.
Illustration
Suppose Tracy, a vice president at a Boston-based firm, has just flown into Los
Angeles. She wants to know the telephone number and address of her company's Los Angeles office, as well as the direct number for one of the employees. Since her company intranet is not telephone-accessible, she has to call
up her assistant and ask him to open up a Web browser to look up the information in the intranet.
With VoiceXML, it can take as little as a few hours for a developer to
take virtually any information available on the Web and make it available by
telephone, not just to callers with high-tech cellphones, but to anyone with
any kind of telephone. Tracy would be able to dial a number and say which office or employee she is looking for. After searching through some of the intranet's database tables, the VoiceXML application can read aloud the phone
numbers and addresses she wants. And next time Tracy arrives confused in a
foreign city, she won't have to rely on her assistant being at his desk.
What Is VoiceXML?
VoiceXML, or VXML, is a markup language like HTML. The difference:
HTML is rendered by your Web browser to format content and user-input
forms; VoiceXML is rendered by a voice browser. Your application can speak
to the user via synthesized speech or by prerecorded audio files. Your software
can receive input from the user via speech or by the tones from their telephone
keypad. If you've ever built a Web application, you're ready to get started with
your phone application.
Figure 10.1 HTML: Publisher owns the HTTP server, which uses HTML to specify
a user experience that is rendered on the reader's desktop computer. VoiceXML: Publisher owns the HTTP server, which uses VoiceXML to specify a user experience that is
rendered on a third-party gateway system and delivered as audio to the user's telephone.
Exercise 1
Use Tellme (1-800-555-TELL) to
• get driving directions between two bastions of higher education: Caltech
(1201 East California Boulevard, Pasadena, Calif.) and Pasadena City College (1570 East Colorado Boulevard, Pasadena, Calif.)
• find the latest price for a share of stock in Oracle Corporation
• listen to your horoscope
• listen to today's top news stories
Record the amount of time required to complete the first three tasks.
Exercise 2
Come up with a list of two or three services from your learning community
that will be valuable to telephone users. You may find the following guidelines
useful:
• It is difficult for users to log on. With voice applications, entering a username
is even more tedious and error prone than with mobile applications. You
may want to restrict your voice services to ones that can be accessed by the
entire community and not just registered users. An alternative to the standard
username/password authentication is to assign a numeric user_id and pin to
each registered user, but that makes it more cumbersome to do Web/mobile/phone services all in one.
• It is easy to give information to the user, but it is hard for them to give information back to your service. It is typically practical for them to pick options
from a menu, but impractical for them to provide any meaningful unstructured data.
A positive development in this area is that a number of voice gateways (e.g., VoiceGenie, www.voicegenie.com) are now partnering with providers of biometric voice
authentication software such as VoiceTrust (www.voice-trust.com/) and Vocent (www.vocent.com).
VoiceXML Basics
The format of a VoiceXML document is simple. Here's how to say "Hello,
World" to your visitors:
<?xml version="1.0"?>
<vxml version="2.0">
<form>
<block>
<audio>Hello, World</audio>
</block>
</form>
</vxml>
The first tag, <?xml version="1.0"?>, specifies that the document to follow conforms to the XML 1.0 standard. All VoiceXML documents follow this
standard.
As in any XML document, every opening tag (e.g., <vxml>) has to be closed,
either with a closing tag like </vxml>, or with a slash (/) at the end of the tag,
as in the <else/> tag in the next example. The other important rule to remember is that all attribute values must be enclosed in quotation marks, as in
version="2.0". XML is much stricter than HTML in these two regards.
The <vxml version="2.0"> tag specifies that this is a VoiceXML 2.0 document. Within that is a <form>, which can be either an interactive element
(requesting input from the user) or informational. You can have as many
forms as you want within a VoiceXML document. A <block> is a container
for your executables, meaning that all your tags that make your application
do something, such as <audio>, <goto>, and a variety of others, can be
clumped together inside of a block. <audio>text</audio> will read the text
with a TTS converter, whereas <audio src="wav_file_URL"/> will play a
prerecorded .wav audio file.
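Because a VoiceXML page is ordinary XML returned by an HTTP server, a page script can build one the same way it would build any other XML document. The sketch below uses Python's standard ElementTree module to produce the single-form, single-block structure just described; the function name and its two optional parameters are inventions for this example.

```python
import xml.etree.ElementTree as ET


def say(text=None, wav_url=None):
    """Build a minimal VoiceXML 2.0 document containing one <audio> element.

    text is read by the gateway's text-to-speech converter; wav_url, if
    given, becomes the src attribute naming a prerecorded audio file.
    """
    vxml = ET.Element("vxml", version="2.0")
    form = ET.SubElement(vxml, "form")
    block = ET.SubElement(form, "block")
    audio = ET.SubElement(block, "audio")
    if wav_url is not None:
        audio.set("src", wav_url)
    if text is not None:
        audio.text = text  # ElementTree escapes &, <, and > when serializing
    return '<?xml version="1.0"?>\n' + ET.tostring(vxml, encoding="unicode")
```

Calling say("Hello, World") yields a document equivalent to the example above. Building the tree with a library rather than by string concatenation guarantees that every tag is closed and every attribute value is quoted, the two rules that XML enforces more strictly than HTML.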
Exercise 3
Sign up for a developer account at one of the VoiceXML gateways (see the list
at the end of this chapter). All of the gateways have free developer accounts
and many useful services for developers. We prefer BeVocal for its extensive
documentation and the plethora of tools it provides, including: a syntax
More VoiceXML
Here's an example that accepts user input and behaves differently depending on
what the user says:
<?xml version="1.0"?>
<vxml version="2.0">
  <form id="animal_questionnaire">
    <field name="favorite_animal">
      <prompt>
        <audio>Which do you like better, dogs or cats?</audio>
      </prompt>
      <grammar>
        <![CDATA[
        [
          [dog dogs] {<option "dogs">}
          [cat cats] {<option "cats">}
        ]
        ]]>
      </grammar>
      <!-- if the user gave a valid response, the filled block
           is executed. -->
      <filled>
        <if cond="favorite_animal == 'dogs'">
          <!-- this would take the user to a form called
               popular_dog_facts within the same VoiceXML
               document -->
          <goto next="#popular_dog_facts"/>
        <else/>
          <!-- this expression is an ECMAScript (JavaScript)
               expression, composed of a concatenated string
               and variable; this will take the user to the
               URI psychological_evaluation.cgi?affliction=cats
          -->
          <goto expr="'psychological_evaluation.cgi?affliction='
                      + favorite_animal"/>
        </if>
      </filled>
      <!-- if the user responded but it didn't match the
           grammar, the nomatch block is executed -->
      <nomatch>
        I'm sorry, I didn't understand what you said.
        <reprompt/>
      </nomatch>
      <!-- if there is no response for a few seconds, the
           noinput block is executed -->
      <noinput>
        I'm sorry, I didn't hear you.
        <reprompt/>
      </noinput>
    </field>
  </form>
  <!-- additional forms can go here -->
</vxml>
Note on Grammars
In VoiceXML 1.0, the W3C did not specify the grammar format, allowing each VoiceXML platform to implement grammars as they chose. In VoiceXML 2.0, each platform
is required to implement the XML format of the W3C's Speech Recognition Grammar
Format (SRGF), the latest draft of which is available from http://www.w3.org/TR/grammar-spec/.
In one vendor's implementation, the following SRGF grammar can be used in place
of the grammar in the example:
<grammar xml:lang="en-US"
type="application/srgs+xml" version="1.0">
<rule id="animal" scope="public">
<one-of>
<item>
<one-of tag="dogs">
<item>dog</item>
<item>dogs</item>
</one-of>
</item>
<item>
<one-of tag="cats">
<item>cat</item>
<item>cats</item>
</one-of>
</item>
</one-of>
</rule>
</grammar>
However, other vendors have implemented the SRGF slightly differently. As the SRGF
specification graduates from a candidate recommendation, vendors' implementations
of SRGF should converge.
That's all there is to getting user input. Now we can use the value of
their response in our program. In this example, if their answer is "dogs",
they will be sent to a form named popular_dog_facts within the same VoiceXML document. If they answer "cats", they will be sent to a different URL,
psychological_evaluation.cgi?affliction=cats. Note how we used
a JavaScript expression in the goto tag in order to use the value of the
favorite_animal variable.
Those two examples are enough to give you the gist of VoiceXML and hopefully an appreciation for the simplicity of voice application development using
VoiceXML.
Excellent tutorial and reference material can be found on the developer sites
at Tellme (http://studio.tellme.com/) and BeVocal (http://cafe.bevocal.com/).
Your application should respond to the user with something like "Yes, that is a
Canadian city" or "I've never heard of that city."
Try out your application. Name some cities that are not on your list and see
if it mistakenly thinks they are valid cities. Now add some more cities to your
list (e.g., Calgary, Winnipeg, Victoria, Saskatoon). As you make your list
longer and longer, you'll tend to start getting a few false positives.
Decide on a rule of thumb for how many elements it's reasonable to have in
one grammar.
There are applications that have thousands of elements in a grammar. However, they've
typically gone through a process of grammar tuning using representative probabilities
for grammar matches. For this exercise, just extend the standard grammar above.
Consider that if you're authenticating users over the phone, the contributions
that might be most interesting are any new responses to questions asked by that
user.
your VoiceXML interface. If that isn't practical, email your client explicit
instructions and then follow up with a phone call.
Write down the client's answers to the following questions:
• How useful do you think the voice interface that you just tried will be?
• What extra information should we make available via voice?
• What are the most crucial tasks that users would like to be able to accomplish from a standard phone using only touch tones and voice?
Figure 10.2
JUPITER: ... Boston.
You: ...
JUPITER: ... Detroit.
You: ...
JUPITER: ...
You: ...
...
Notice how the system, more fully described at http://groups.csail.mit.edu/sls/applications/jupiter.shtml, assumed that you were still interested in rain when
asking about Detroit, context carried over from the Boston question.
In the long run, as these more natural conversational technologies are perfected, the syntax of VoiceXML will have to grow to accommodate the full
power of speech interpreters or be eclipsed by another standard.
More
VoiceXML gateways:
• Voxeo (http://www.voxeo.com/)
• BeVocal Cafe (http://cafe.bevocal.com/)
• Tellme (http://studio.tellme.com/)
• VoiceGenie (http://developer.voicegenie.com/)
Related links:
• VoiceXML Forum (http://www.voicexml.org/)
• Voice articles at developer.com (http://www.developer.com/voice/)
• Specifications and news from the Web Consortium, http://www.w3.org/Voice/. Notably interesting specs at press time include
• source code and case studies from an earlier version of this article, "VoiceXML: Letting People Talk to Your HTTP Server through the Telephone,"
available at http://eveandersson.com/arsdigita/asj/vxml
11
Scaling Gracefully
Let's also remind ourselves of the empirical evidence that enormous online
communities cannot satisfy every need. America Online has not subsumed all
the smaller communities on the Internet. People unsubscribe from mailing lists
when the traffic level becomes too high. Early adopters of USENET discussion
groups (called "NetNews" or "Newsgroups" back in the 1970s and "Google
Groups" to most people in 2005) stopped participating because they found the
utility of the groups diminished when the community size grew beyond a certain point.
So the good news is that, no matter how large one's competitors, there will
always be room for a new online community. The bad news is that growth
In this chapter we will first consider the straightforward hardware and software
issues, then move on to the more subtle challenges that grow progressively
more difficult as the user community expands.
At a modestly visited site, it would be possible to have one CPU performing all
of these tasks. In fact, for ease of maintenance and reliability it is best to have
as few and as simple servers as possible. Consider your desktop PC, for example. How long has it been since the hardware failed? If you look into a
room with 50 simple PCs or single-board workstations, how often do you see
one that is unavailable due to hardware failure? Suppose, however, that you
combine computers to support your application. If one machine is 99 percent
reliable, a site that depends on 10 such machines will be only $0.99^{10}$ reliable,
or about 90 percent. The probability analysis here is the same as flipping coins, but
with a heavy 0.99 bias towards heads. You need to get 10 heads in a row in
order to have a working service. What if you needed 100 machines to be up
and running? That's only going to happen $0.99^{100}$ of the time, or roughly
37 percent.
It isn't challenging to throw hardware at a performance problem. What is
challenging is setting up that hardware so that the service is working if any of
the components are operational rather than only if all of the components are
operational.
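The difference between "all must work" and "any may work" can be put in numbers with a short Python sketch. The function names are illustrative, and the calculation assumes that machines fail independently of one another, which machines sharing a rack or a power feed will not quite do.

```python
def all_must_work(p: float, n: int) -> float:
    """Availability of a service that needs every one of n machines up."""
    return p ** n


def any_can_work(p: float, n: int) -> float:
    """Availability of a service that needs at least one of n machines up."""
    return 1 - (1 - p) ** n


if __name__ == "__main__":
    # Each machine is assumed to be up 99 percent of the time.
    for n in (1, 10, 100):
        print(n, round(all_must_work(0.99, n), 3), any_can_work(0.99, n))
```

With 99-percent machines, the first function falls toward zero as machines are added while the second climbs toward one, which is why the rest of the chapter concentrates on arranging hardware so that the second formula applies.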
We'll examine each layer individually.
Persistence Layer
For most interactive Web applications, the persistence layer is a relational
database management system (RDBMS). The RDBMS server program is parsing SQL queries, writing transactions to the disk, rooting around on the disk(s)
for seldom-used data, gluing together data in RAM, and returning it to the
RDBMS client program. The average engineer's top-of-the-head viewpoint is
that RDBMS performance is limited by the speed of the disk(s). The programmers at Oracle disagree: "A properly configured Oracle server will run
CPU-bound."
Suppose that we have a popular application and need 16 CPUs to support all
the database queries. And let's further suppose that we've decided that the
RDBMS will run all by itself on one or more physical computers. Should we
buy 16 small computers, each with one CPU, or one big computer with 16
CPUs inside? The local computer shop sells 1-CPU PCs for about $500, implying a total cost of $8,000 for 16 CPUs. If we visit the Web site for Sun Microsystems (www.sun.com) we find that the price of a 16-CPU Sunfire 6800 is too
high even to list, but if the past is any guide we won't get away for less than
$200,000. We will pay 25 times as much to get 16 CPUs of the same power,
but all inside one physical computer.
Why would anyone do this?
Let's consider the peculiarities of the RDBMS application. The RDBMS
server talks to multiple clients simultaneously. If Client A updates a record
in the database and, a split-second later, Client B requests that record, the
RDBMS is required to deliver the updated information to Client B. If we were
to spread the RDBMS server program across multiple physical computers, it is
possible that Client A would be served from Computer I and Client B would be
served from Computer II. A database transaction cannot be committed unless
it has been written out to the hard disk drive. Thus all that these computers
need do is check the disk for updates before returning any results to Client B.
Disk drives are 100,000 times slower than RAM. A single computer running an
RDBMS keeps an up-to-date version of the commonly used portions of the
database in RAM. So our multi-computer RDBMS server that ensures database coherency across processors via reference to the hard disk will start out
100,000 times slower than a single-computer RDBMS server.
Typical commercial RDBMS products, such as Oracle Parallel Server, work
via each computer keeping copies of the database in RAM and informing each
other of updates via high-speed communications networks. The machine-to-machine communication can be as simple as a high-speed Ethernet link or as
complex as specialized circuit boards and cables that achieve memory bus
speeds.
Don't we have the same problem of inter-CPU synchronization with a multi-CPU single box server? Absolutely. CPU I is serving Client A. CPU II is serving Client B. The two CPUs need to apprise each other of database updates.
They do this by writing into the multiprocessor machine's shared RAM. It
turns out that the CPU-CPU bandwidth available on typical high-end servers
circa 2002 is 100 Gbits/second, which is 100 times faster than the fastest available Gigabit Ethernet, FireWire, and other inexpensive machine-to-machine
interconnection technologies.
Bottom line: if you need more than one CPU to run the RDBMS, it usually
makes most sense to buy all the CPUs in one physical box.
Abstraction Layer
Suppose that you have a complex calculation that must be performed in several
different places within a computer program. Most likely you'd encapsulate that
calculation into a procedure and then call that procedure from every part of the
program where the calculation was required. The benefits of procedural abstraction are that you only have to write and debug the calculation code once and
that, if the rules change, you can be sure that by updating the single procedure
you've updated your entire application.
The abstraction layer is sometimes referred to as "business logic." Something
that is complex and fundamental to the business ought to be separated out so
that it can be used in multiple places consistently and updated in one place if
necessary. Below is an example from an e-commerce system that Eve Andersson wrote. This system offered substantially all of the features of amazon.com
circa 1999. Eve expected that a lot of ham-fisted programmers who adopted her
open-source creation would be updating the page scripts in order to give their
site a unique look and feel. Eve expected that laws and accounting procedures
regarding sales tax would change. So she encapsulated the looking up of sales
tax by state, the figuring out if that state charges tax on shipping, and the multiplication of tax rate by price into an Oracle PL/SQL function:
create or replace function ec_tax
  (v_price IN number, v_shipping IN number, v_order_id IN integer)
return number
IS
  taxes         ec_sales_tax_by_state%ROWTYPE;
  tax_exempt_p  ec_orders.tax_exempt_p%TYPE;
BEGIN
  -- tax-exempt orders owe nothing
  SELECT tax_exempt_p INTO tax_exempt_p
  FROM ec_orders
  WHERE order_id = v_order_id;

  IF tax_exempt_p = 't' THEN
    return 0;
  END IF;

  -- find the tax rules for the state in the shipping address; the
  -- outer join leaves the columns NULL for a state with no tax row
  SELECT t.* into taxes
  FROM ec_orders o, ec_addresses a, ec_sales_tax_by_state t
  WHERE o.shipping_address = a.address_id
  AND a.usps_abbrev = t.usps_abbrev(+)
  AND o.order_id = v_order_id;

  -- shipping_p records whether this state also taxes shipping
  IF nvl(taxes.shipping_p,'f') = 'f' THEN
    return nvl(taxes.tax_rate,0) * v_price;
  ELSE
    return nvl(taxes.tax_rate,0) * (v_price + v_shipping);
  END IF;
END;
The Web script or other PL/SQL procedure that calls this function need only
know the proposed cost of an item, the proposed shipping cost, and the order
ID to which this item might be added (these are the three arguments to
ec_tax). That sales taxes for each state are stored in the
ec_sales_tax_by_state table, for example, is hidden from the rest of the application. If an
organization that adopted this software decided to switch to using third-party
software for calculating tax, that organization would need to change only this
one function rather than wading through hundreds of Web scripts looking for
tax-related code.
Should the abstraction layer run on its own physical computer? For most
applications, the answer is no. These procedures are not sufficiently CPU-intensive to make splitting them off onto a separate computer worthwhile in
terms of system administration effort and increased vulnerability to hardware
failure. What's more, these procedures often do not even warrant a new execution environment. Most procedures in the abstraction layer of an Internet
service require intimate access to relational database tables. That access is fastest when the procedures are running inside the RDBMS itself. All modern
RDBMSs provide for the execution of standard procedural languages within
the database server. This trend was pioneered by Oracle with PL/SQL and
then Java. With the latest Microsoft SQL Server, one can supposedly run any
.NET-supported computer language inside the database.
When should you consider a separate environment ("application server" process) for the abstraction layer? Suppose that a big bank, the result of several
mergers, has an IBM mainframe to manage checking accounts, an Oracle
RDBMS for managing credit accounts, and a SQL Server-based customer support system. If Jane Customer phones up the bank and asks to pay her credit
card bill from her checking account, a computer program needs to perform a
transaction on the mainframe (debit checking), a transaction on the Oracle system (credit Visa card), and a transaction on the SQL Server database (payment
handled during a phone call with Agent 451). It is technically possible for, say,
a Java program running inside the Oracle RDBMS to connect to these other
database management systems, but traditionally this kind of problem has been
attacked by a stand-alone application server, usually a custom-authored C
program. The term "application server" has subsequently become used to describe the physical computers on which such a program might run and, in the
late 1990s, execution environments for Java or C programs that served some
function on a Web site other than page presentation or persistence.
Another example of where a separate physical application server might be
desirable is where substantial computation must be performed. On most photo
sharing sites, every time a photo is uploaded, the server must create scaled versions in standard sizes. The performance challenge at the orbitz.com travel site
is even more serious. Every user request results in the execution of a Lisp program written by MIT Artificial Intelligence Lab alumni at itasoftware.com.
This Lisp program searches through a database of two billion flights and fares.
The database machines that are performing transactions such as ticket bookings would collapse if they had to support these searches as well.
If separate physical CPUs are to be employed in the abstraction layer,
should they all come in the same box or will it work just as well to rack and
stack cheap 1-CPU machines? That rather depends on where state is kept. Remember that HTTP is a stateless protocol. Somewhere the server needs to remember things such as "Registered User 137 wants to see pages in the French
language" or "Unregistered user who started Session 6781205 has placed the
hardcover edition of The Cichlid Fishes in his or her shopping cart." In a
multi-process multi-computer server farm, it is impossible to guarantee that a
particular user will always be returned to the same running computer program,
if for no other reason than you want the user experience to be robust to failure
of an individual physical computer. If session state is being kept anywhere
other than in a cookie or the persistence layer (RDBMS), your application
server programs will need to communicate with each other constantly to make
sure that their ad hoc database is coherent. In that case, it might make sense to
get an expensive multi-CPU machine to support the application server. However, if all the layers are stateless except for the persistence layer, the application server layer can be handled by multiple cheap one-CPU machines. At
orbitz.com, for example, racks of cheap computers are loaded with identical
local copies of the fare and schedule database. Each time a user clicks to see
the options for traveling from New York to London, one of those application
server machines is randomly selected for action.
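The stateless arrangement can be sketched in Python. Everything named here (SessionStore, AppServer, handle) is hypothetical, and the dictionary stands in for a table in the RDBMS; the point is only that a machine holding no session state can be picked at random for each request.

```python
import random


class SessionStore:
    """Stand-in for the persistence layer (an RDBMS table in real life)."""

    def __init__(self):
        self._rows = {}

    def get(self, session_id):
        # Create an empty session the first time an ID is seen.
        return self._rows.setdefault(session_id, {"cart": []})


class AppServer:
    """One cheap machine; it keeps no per-user state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        cart = self.store.get(session_id)["cart"]
        cart.append(item)
        return list(cart)


store = SessionStore()
farm = [AppServer(f"app{i}", store) for i in range(4)]


def handle(session_id, item):
    # Any machine will do, because none of them remembers anything.
    return random.choice(farm).add_to_cart(session_id, item)
```

Two successive calls to handle for the same session may land on different machines and still see the same cart. If the cart lived in an AppServer attribute instead, the machines would have to keep each other informed, which is the situation the paragraph above warns about.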
Presentation Layer
Computer programs in the presentation layer pull information from the persistence layer (RDBMS) and merge those results with a template appropriate to
the user's preferences and client software. In a Web application these computer
programs are doing an SQL query and merging the results with an HTML template for delivery to the user's Web browser. Such a program is so simple that it
is often referred to as a "script." You can think of the presentation layer as
where the scripts execute.
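A presentation-layer script of this kind might look like the following Python sketch. The books table, its columns, and the template are invented for illustration, and SQLite stands in for the RDBMS so that the example is self-contained.

```python
import sqlite3
from html import escape
from string import Template

# The template is the "look and feel"; the query supplies the content.
PAGE = Template("<html><body><h1>$title</h1><ul>$items</ul></body></html>")


def render_titles(db, language):
    """Run one query and pour the rows into an HTML template."""
    rows = db.execute(
        "SELECT title FROM books WHERE language = ? ORDER BY title",
        (language,),
    ).fetchall()
    items = "".join(f"<li>{escape(title)}</li>" for (title,) in rows)
    return PAGE.substitute(title=escape(f"Books ({language})"), items=items)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE books (title TEXT, language TEXT)")
db.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("The Cichlid Fishes", "en"), ("Les Poissons", "fr")],
)
print(render_titles(db, "en"))
```

Note that the script holds nothing between requests: the user's language preference arrives as an argument and the data comes from the database.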
The most common place for script execution is within the operating system
process occupied by the Web server. In other words, the script language interpreter is built into the Web server. Examples of this architecture are Microsoft
Internet Information Server (IIS) and Active Server Pages, AOLserver and
its built-in Tcl interpreter, Apache and the mod_perl add-in. If you've
chosen to use one of these popular styles of Web development, you've chosen
to merge the presentation layer with the HTTP service layer, and spreading the
load among multiple CPUs for one layer will automatically spread it for the
other.
The multi-CPU box versus multiple-separate-box decision here should again
be based on whether or not the presentation layer holds state. If no session
state is held by the running presentation scripts, it is more economical to add
CPUs inside separate physical computers.
HTTP Service
HTTP service per se is so simple that it hardly warrants its own layer, unless
you're delivering audio and video files to a mass audience. A high performance
pure HTTP server program such as Zeus Web Server (see www.zeus.com) can
handle more than 6,000 requests per second and saturate a 100 Mbps network
link on a single 500 MHz Intel Celeron processor (that 100 Mbps link would
cost about $50,000 annually as of February 2005, by the way). Why then would
anyone ever need to deploy multiple CPUs to support HTTP service of basic
HTML pages with embedded images?
The main reason that people run out of capacity on a single front-end
Web server is that HTTP server programs are usually packaged with software
to support computationally more expensive layers. For example, the Oracle
RDBMS server, capable of supporting the persistence layer and the abstraction
layer, also includes the necessary software for interpreting Java Server Pages
and performing HTTP service. If you were running a popular service directly
from Oracle you'd probably need more than one CPU. More common examples are Web servers such as IIS and AOLserver that are capable of handling
the presentation and HTTP service layers from the same operating system process. If your scripts involve a lot of template parsing, it is easy to overload a
single CPU with the demands of the Web server/script interpreter.
If no state is being stored in the HTTP Service layer it is cheapest to add
CPUs in separate physical boxes. HTTP is stateless and user interaction is
entirely mediated by the RDBMS. Therefore there is no reason for a CPU serving a page to User A to want to communicate with a CPU serving a page to
User B.
Transport-Layer Encryption
Whenever a Web page is served, two application programs on separate computers have communicated with each other. As discussed in the "Basics" chapter, the client opens a Transmission Control Protocol (TCP) connection to the
server, specifies the page desired, and receives the data back over that connection. TCP is one layer up from the basic unreliable Internet Protocol (IP).
What TCP adds is reliability: if a packet of data is not acknowledged, it will
be retransmitted. Neither TCP nor the IP of the 1990s, IPv4, provides any
encryption of the data being transmitted. Thus anyone able to monitor the
packets on the local-area network of the server or client or on the backbone
routers may be able to learn, for example, the particular pages requested by a
particular user. If you were running an online community about a degenerative
disease, this might cause one of your users to lose his or her job.
There are two ways to protect your users' privacy from packet sniffers. The
first is by using a newer version of Internet Protocol, IPv6, which provides
native data security as well as authentication. In the glorious IPv6 world, we
can be sure of the origin of a packet, whether it is from a legitimate user or a
denial-of-service attacker. In the glorious IPv6 world, we can be sure that it will
be impractical to sniff credit card numbers or other user-sensitive data from
Web traffic. As of spring 2005, however, it isn't possible to sign up for a home
IPv6 connection. Thus we are forced to fall back on the 1990s-style approach
of adding a layer between HTTP and TCP. This was pioneered by Netscape
Communications as Secure Sockets Layer (SSL) and is now being standardized
as TLS 1.0 (see http://www.ietf.org/html.charters/tls-charter.html).
However it is performed, encryption is processor-intensive. On the client
side, that's not a big deal. The client machine probably has a 2 GHz processor
that is 98 percent idle. However on the server end, performing encryption can
tie up a whole CPU per user for the duration of a request.
If you've run out of processing power the only thing to do is . . . add processing power. The question is what kind and where. Adding general-purpose
processors to a multi-CPU computer is very expensive as mentioned earlier.
Adding additional single-CPU front-end servers to a two-tier server farm might
not be a bad strategy, especially because, if you're already running a two-tier
server farm, it requires no new thinking or system administration skills. It is
possible, however, that special-purpose hardware will be more cost-effective or
easier to administer. In particular it is possible to do encryption in the router
for IPv6. SSL encryption for HTTP connections can be done with plug-in
boards, an example of which is the Compaq AXL300 PCI card, available in
2005 for $1,400 and capable (it is claimed) of handling 330 SSL connections
per second. Finally it is possible to interpose a hardware encryption machine
between the Web server, which communicates via ordinary HTTP, and the client, which makes requests via HTTPS. This feature is, for example, an option
on load-balancing routers from F5 Networks (www.f5.com).
We struggled to handle 10 hits per second. In 2002 we had 2 GHz CPUs. The
programmers had decided to follow the XML/XSLT fashion. We struggled to
handle 10 hits per second . . .
It seems reasonable to expect that hardware engineers will continue to deliver substantial performance improvements and that fashions in software development and business complexity will continue to rob users of any
enjoyment of those improvements. So stick to 10 requests per second per CPU
until you've got your own application-specific benchmarks that demonstrate
otherwise.
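As a rough planning aid, the rule of thumb can be turned into a few lines of Python. The peak-to-average ratio is an assumption introduced for this sketch, not a figure from the text; substitute a measured value when one is available.

```python
import math

REQUESTS_PER_SECOND_PER_CPU = 10  # the rule of thumb from the text


def cpus_needed(requests_per_day, peak_to_average=3.0):
    """Estimate front-end CPUs from daily traffic.

    peak_to_average is an assumed ratio of the busiest second to the
    average second over the day.
    """
    average_per_second = requests_per_day / 86_400
    peak_per_second = average_per_second * peak_to_average
    return max(1, math.ceil(peak_per_second / REQUESTS_PER_SECOND_PER_CPU))
```

For example, 8,640,000 requests per day average 100 per second; with an assumed threefold peak that is 300 per second, or 30 CPUs at 10 requests per second each.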
Load Balancing
As noted earlier in this chapter, an Internet service with 100 CPUs spread
among fifteen physical computers isn't going to be very reliable if all 100
CPUs must be working for the overall service to function. We need to develop
a strategy for load balancing so that (1) user requests are divided more or less
evenly among the available CPUs, (2) when a piece of hardware fails, it doesn't
result in too many errors returned to users, and (3) we can reconfigure hardware and network without breaking users' bookmarks and links from other
sites.
We will start by positing a two-tier server farm with a single multi-CPU machine running the RDBMS and multiple single-CPU front-end machines, each
of which runs the Web server program, interprets page scripts, performs SSL
encryption, and generally does any computation not being performed within
the RDBMS. This is shown in figure 11.1.
Load Balancing in the Persistence Layer
Our persistence layer is the multi-CPU computer running the RDBMS. The
RDBMS itself is typically a multi-process or multi-threaded application. For
each database client, the RDBMS spawns a separate process or thread. In this
case, each front-end machine presents itself to the RDBMS as one or more
database clients. If we assume that the load of user requests is spread among
the front-end machines, the load of database work will be spread among the
multiple CPUs of the RDBMS server by the operating system process or thread
scheduler.
Figure 11.1 A typical server configuration for a medium-to-high volume Internet application. A powerful multi-CPU server supports the relational database management system. Multiple small 1-CPU machines run the HTTP server program.
the local name server for a translation of the hostname www.cnn.com into a
32-bit IP address. (Remember that all Internet communication is machine-to-machine and requires numeric IP addresses; alphanumeric hostnames such as
www.amazon.com or web.mit.edu are used only for user interface.) The
MIT name server would contact the InterNIC registry to learn the IP addresses
of the name servers for the cnn.com domain. The MIT name server would then
contact CNN's name servers and learn that www.cnn.com was available at
the IP address 207.25.71.5. Subsequent users within the same subnetwork at
MIT would, for a period of time designated by CNN, get the same answer of
207.25.71.5 without the MIT name server going back to the CNN name
servers.
Where is the load balancing in this system? Suppose that a Biology major
at Harvard University requested http://www.cnn.com/HEALTH/. Harvard's
name server would also contact CNN's name servers to learn the translation
of www.cnn.com. This time, however, the CNN server would provide a different answer: 207.25.71.20, leading that user, and subsequent users within
Harvard's network, to a different front-end server than the machine providing
pages to users at MIT.
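The mechanism can be simulated in a few lines of Python. This is a sketch of the idea rather than of any real DNS software: the two classes are hypothetical, time is passed in as a plain number of seconds, and the 900-second lifetime mirrors a 15-minute time-to-live.

```python
import itertools


class AuthoritativeServer:
    """Hands out the front-end addresses in rotation."""

    def __init__(self, addresses, ttl_seconds):
        self._cycle = itertools.cycle(addresses)
        self.ttl_seconds = ttl_seconds

    def resolve(self):
        return next(self._cycle), self.ttl_seconds


class CachingResolver:
    """A campus or ISP name server that remembers answers until they expire."""

    def __init__(self, upstream):
        self.upstream = upstream
        self._address = None
        self._expires_at = 0.0

    def lookup(self, now):
        # Ask upstream only when there is no cached answer or it has expired.
        if self._address is None or now >= self._expires_at:
            self._address, ttl = self.upstream.resolve()
            self._expires_at = now + ttl
        return self._address


cnn = AuthoritativeServer(
    ["207.25.71.5", "207.25.71.20", "207.25.71.29"], ttl_seconds=900
)
mit, harvard = CachingResolver(cnn), CachingResolver(cnn)
```

Here mit.lookup(0) and harvard.lookup(1) receive different addresses, and every later lookup through the MIT resolver keeps returning the cached address until 900 seconds have passed. Balance is therefore per name server, not per user.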
Round-robin DNS is not a very popular load balancing method today. For
one thing, it is not very balanced. Suppose that the CNN name server tells
America Online's name server that www.cnn.com is reachable at 207.25.71.29.
AOL is perfectly free to provide that translation to all of its more than 20 million customers. Another problem with round-robin DNS is the impact on users
when a front-end machine dies. If the box at 207.25.71.29 were to fail, none of
AOL's customers would be able to reach www.cnn.com until the expiration
time on the translation had elapsed; the site would be up and running and
providing pages to hundreds of thousands of users worldwide, but not to those
users who'd received an unlucky DNS translation to the dead machine. For a
typical domain, this period of time might be anywhere from 6 hours to 1 week.
CNN, aware of this problem, could shorten the expiration and minimum
time-to-live on cnn.com, but if these were cut down to, say, 30 seconds, the
load on CNN's name servers might start approaching the intensity of the load
on its Web servers. Nearly every user page request would be preceded by a request for a DNS translation. (In fact, CNN set their minimum time-to-live to
15 minutes.)
A final problem with round-robin DNS is that it does not provide abstraction. Suppose that CNN, whose primary servers were all Unix machines,
wished to run some discussion forum software that was only available for
Windows. The IP addresses of all of its servers are publicly exposed. The only
way to direct users to a different machine for a particular part of the service
would be to link them to a different hostname, which could therefore be translated into a distinct IP address. For example, CNN would link users to
http://forums.cnn.com. Users who enjoyed these forums would bookmark the URL,
and other sites on the Internet would insert hyperlinks to this URL. After a
year, suppose that the Windows servers were dying and the people who knew
how to maintain them had moved on to other jobs. Meanwhile, the discussion
forum software has become available for Unix as well. CNN would like to pull
the discussion service back onto its main server farm, at a URL of
http://www.cnn.com/discuss/. Why should users be aware of this reshuffling of hardware (see fig. 11.2)?
Figure 11.2 To preserve the freedom of rearranging components within the server
farm, typically users on the public Internet only talk to a load balancing router, which
is the public face of the service and whose IP address is what
www.popularservice.com translates to.
The modern approach to load balancing is the load balancing router. This
machine, typically built out of standard PC hardware running a free Unix
operating system and a thin layer of custom software, is the only machine
that is visible from the public Internet. All of the server hardware is behind
the load balancer and has IP addresses that aren't routable from the rest of
the Internet. If a user requests www.photo.net, for example, this is translated
to 216.127.244.133, which is the IP address of photo.net's load balancer. The
load balancer accepts the TCP connection on port 80 and waits for the Web
client to provide a request line, for example, "GET / HTTP/1.0". Only after
that request has been received does the load balancer attempt to contact a
Web server on the private network behind it.
Notice rst that this sort of router provides some inherent security. The Web
servers and RDBMS server cannot be directly contacted by crackers on the
public Internet. The only ways in are via a successful attack on the load
balancer, an attack on the Web server program (Microsoft Internet Information Server suffered from many buffer overrun vulnerabilities), or an attack on
publisher-authored page scripts. The router also provides some protection
against denial-of-service attacks. If a Web server is configured to spawn a maximum of 100 simultaneous threads, a malicious user can effectively shut down
the site simply by opening 100 TCP connections to the server and then never
sending a request line. The load balancers are smart about reaping such idle
connections and in any case have very long queues.
The load balancer can execute arbitrarily complex algorithms in deciding
how to route a user request. It can forward the request to a set of front-end
servers in a round-robin fashion, taking a server out of the rotation if it fails
to respond. The load balancer can periodically pull load and health information from the front-end servers and send each incoming request to the least
busy server. The load balancer can inspect the URI requested and route to a
particular server, for example, sending any request that starts with /discuss/
to the Windows machine that is running the discussion forum software. The
load balancer can keep a table of where previous requests were routed and try
to route successive requests from a particular user to the same front-end machine (useful in cases where state is built up in a layer other than the RDBMS).
Whatever algorithm the load balancer is using, a hardware failure in one of
the front-end machines will generally result in the failure of only a handful of
user requests, that is, those in-process on the machine that actually fails.
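Two of the routing policies just described, URI inspection and round-robin with failed servers taken out of rotation, can be sketched in Python. The class and its method names are hypothetical and the server names are placeholders; a real load balancer would also probe server health itself rather than being told.

```python
import itertools


class LoadBalancer:
    """Chooses a back-end server name for each requested URI."""

    def __init__(self, pool, prefix_routes=None):
        self.pool = list(pool)                        # general-purpose servers
        self.prefix_routes = dict(prefix_routes or {})  # URI prefix -> server
        self.healthy = set(self.pool) | set(self.prefix_routes.values())
        self._rotation = itertools.cycle(self.pool)

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        self.healthy.add(server)

    def route(self, uri):
        # URI inspection first: e.g. /discuss/ goes to a dedicated machine.
        for prefix, server in self.prefix_routes.items():
            if uri.startswith(prefix) and server in self.healthy:
                return server
        # Otherwise round-robin, skipping servers that are out of rotation.
        for _ in range(len(self.pool)):
            server = next(self._rotation)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy front-end servers")
```

A balancer built as LoadBalancer(["web1", "web2", "web3"], {"/discuss/": "forum1"}) sends forum requests to forum1 and rotates everything else through the pool. After mark_down("web2"), requests simply alternate between web1 and web3, so only the requests that were in process on web2 when it failed are lost.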
How are load balancers actually built? It seems that we need a computer program that waits for a Web request, takes some action, then returns a result to
the user. Isn't this what Web server programs do? So why not add some code
to a standard Web server program, run the combination on its own computer,
and call that our load balancer? That's precisely the approach taken by the
Zeus Load Balancer (http://www.zeus.com/products/zlb/) and mod_backhand
(http://www.backhand.org/mod_backhand/), a load balancer module for the
Apache Web server. An alternative is exemplified by F5 Networks, a company
that sells out-of-the-box load balancers built on PC hardware, the NetBSD
Unix operating system, and unspecified magic software.
Failover
Remember our strategic goals: (1) user requests are divided more or less evenly
among the available CPUs; (2) when a piece of hardware fails it doesn't result
in too many errors returned to users; (3) we can reconfigure hardware and network without breaking users' bookmarks and links from other sites.
It seems as though the load-balancing router out front and the load-balancing operating system on the RDBMS server in back have allowed us to
achieve goals 1 and 3. And if the hardware failure occurs in a front-end single-CPU machine, we've achieved goal 2 as well. But what if the multi-CPU
RDBMS server fails? Or what if the load balancer itself fails?
Failover from a broken load balancer to a working one is essentially a
network configuration challenge, beyond the scope of this textbook. Basically
what is required are two identical load balancers and cooperation with the
next routing link in the chain that connects your server farm to the public Internet. Those upstream routers must know how to route requests for the same IP
address to one or the other load balancer depending upon which is up and running. What keeps this from becoming an endless spiral of load balancing is that
the upstream routers aren't actually looking into the TCP packets to find the
GET request. They're doing the much simpler job of IP routing.
Ensuring failover from a broken RDBMS server is a more difficult challenge
and one where a large variety of ideas has been tried and found wanting. The
first idea is to make sure that the RDBMS server never fails. The machine will
have three power supplies, only two of which are required. Each disk drive will
be mirrored. If a CPU board fails, the operating system will gracefully fail back
to running on the remaining CPUs. There will be several network cards. There
will be two paths to each disk drive. Considering the number of moving parts
inside, the big complex servers are remarkably reliable, but they aren't 100 percent reliable.
Given that a single big server isn't reliable enough, we can buy a whole
bunch of them and plug them all into the same disk subsystem, then run something like Oracle Parallel Server. Database clients connect to whichever physical server machine is available. If they can't get a response from a particular
server, the client retries after a few seconds to another physical server. Thus an
RDBMS server machine that fails causes the return of errors to any in-process
user requests being handled by that machine and perhaps a few seconds of
interrupted or slow service for users who've been directed to the clients of that
down machine, but it causes no longer term site unavailability.
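The client-side retry can be sketched in Python. Nothing here is Oracle's actual client API: each "server" is modeled as a plain callable that either returns rows or raises the hypothetical ServerDown exception, and the delay is a parameter so that it can be shortened for testing.

```python
import time


class ServerDown(Exception):
    """Raised by a (hypothetical) driver when a server does not respond."""


def query_with_failover(servers, sql, retry_delay=2.0, sleep=time.sleep):
    """Try each database server in turn until one answers."""
    last_error = None
    for server in servers:
        try:
            return server(sql)
        except ServerDown as error:
            last_error = error
            sleep(retry_delay)  # wait briefly before trying the next machine
    raise ServerDown("all database servers are unavailable") from last_error
```

A request that first reaches a dead machine costs the user one retry delay and then succeeds on the next machine; an error is returned only if every machine in the list is down.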
As discussed in the "Persistence Layer" section of this chapter, this approach
entails a lot of wasted CPU time and bandwidth as the physical machines keep
each other apprised of database updates. A compromise approach introduced
by Oracle in 2000 was to configure a two-node parallel server. The first machine would process online transactions. The second machine would be allowed
to lag as much as, say, ten minutes behind the first in terms of updates. If you
wanted a CPU-intensive report querying last month's user activity, you'd talk
to the backup machine. If Machine 1 failed, however, Machine 2 would notice
almost immediately and start rolling its own state forward from the transaction
log on the hard disk. Once Machine 2 was up to date with the last committed
transaction, it would begin offering service as the primary database server.
Oracle proudly stated that, for customers willing to spend twice as much for
RDBMS server hardware, the two-node failover configuration was only a little bit slower than a single machine.
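The roll-forward step can be illustrated with a toy model in Python. This is not how Oracle implements it; the transaction log is reduced to a list of committed (key, value) pairs and the class name is invented for the sketch.

```python
class StandbyDatabase:
    """A lagging copy that catches up from the primary's transaction log."""

    def __init__(self):
        self.rows = {}
        self.applied = 0  # number of log entries already applied

    def roll_forward(self, log, up_to=None):
        """Apply committed (key, value) entries that have not been seen yet."""
        end = len(log) if up_to is None else min(len(log), up_to)
        for key, value in log[self.applied:end]:
            self.rows[key] = value
        self.applied = max(self.applied, end)

    def take_over(self, log):
        """On primary failure: catch up completely, then serve as primary."""
        self.roll_forward(log)
        return self.rows
```

During normal operation the standby calls roll_forward with a limit that keeps it some way behind; on failure it calls take_over, applies whatever remains in the log, and only then starts answering queries.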
Exercise 3: eBay
Visit www.ebay.com and familiarize yourself with their basic services of auction bidding and user ratings. Assume that you need to support 100 million
registered users, 800 million page views per day, 10 million bids per day, 10
million searches per day, and 0.5 million new user ratings per day. Design a
server hardware and software infrastructure that will represent a reasonable
compromise among reliability (including graceful degradation), initial cost,
and cost of administration.
Be explicit about the number of computers employed, the number of CPUs
within each computer, and the connections among the computers. If you're curious about the real numbers, remember that eBay is a public corporation and
publishes annual reports, which are available at http://investor.ebay.com/.
Your answer to this exercise should be no longer than one page.
Exercise 4: eBay Proxy Bidding
eBay offers a service called "proxy bidding" or "automatic bidding" in which
you specify a maximum amount that you're willing to pay and the server itself
will submit bids for you in increments that depend on the current high bid.
How would you implement proxy bidding on the infrastructure that you
designed for the preceding exercises? Rough out any SQL statements or triggers
that you would need. Be explicit about where the code for proxy bidding would
execute: on which server? in which execution environment?
Exercise 5: Uber-eBay
Suppose that eBay went up to one billion bids per day. How would that change
your design, if at all?
Exercise 6: Hotmail
Suppose that Hotmail were an RDBMS-backed Internet service with 200 million active users. What would be the minimum cost hardware configuration
that still provided reasonable reliability and maintainability? What is the fundamental difference between Hotmail and eBay?
Note: http://philip.greenspun.com/ancient-history/webmail/ describes an
Oracle-backed Web mail system built by Jin S. Choi.
Exercise 7: Scorecard
Justifying your decisions, provide a one-paragraph design for the server infrastructure behind www.scorecard.org.
Translating the Elements of Good Communities from the Offline to the Online World
A face-to-face community is almost always one in which people are identified,
authenticated, and accountable. Suppose that you're a 50-year-old, 6-foot tall,
250 lb. guy, known to everyone in town as "Fred Jones". Can you walk up to
the 12-year-old daughter of one of your neighbors and introduce yourself as a
13-year-old girl? Probably not very successfully. Suppose that you fly a Nazi
flag out in front of your house. Can you express an opinion at the next town
meeting without people remembering that you were the Nazi flag guy? Seems
unlikely.
How do we translate the features of identifiability, authentication, and accountability into the online world? In private communities, such as corporate
knowledge management systems or university coordination services, it is easy.
We don't let anyone use the system unless they are an employee or a registered
student and, in the online environment, we identify users by their full names.
Such heavyweight authentication is at odds with the practicalities of running a
public online community. For example, would it be practical to schedule face-to-face meetings with each potential registrant of photo.net, where the new user
would show an ID? On the other hand, as discussed in the "User Registration
and Management" chapter, we can take a stab at authentication in a public
This causes a properly configured browser to launch the AIM client (try it). Although the AIM-based chat offered superior interactivity, it was not as successful due to (1) some users not having the AIM software on their computers, (2)
some users being behind firewalls that prevented them from using AIM, but
mostly because (3) photo.net users knew each other by real names and could
not recognize their friends by their AIM screen names. It seems that providing
a "breakout and reassemble" chat room is useful, but that it needs to be tightly
integrated with the rest of the online community and that, in particular, user
identity must be preserved across all services within a community.
People like computers and the Internet because they are fast. If you want an
answer to a question, you turn to the search engine that responds quickest and
with the most relevant results. In the offline world, people generally desire
speed. A Big Mac delivered in thirty seconds is better than a Big Mac delivered
in ten minutes. However, when emotions and stakes are high, we as a society
often choose delay. We could elect a president in two weeks, but instead we
choose presidential campaigns that last nearly two years. We could have tried
and sentenced Thomas Junta immediately after July 5, 2000, when he beat
Michael Costin, father of another ten-year-old hockey player, to death in a
Boston-area ice rink. After all, the crime was witnessed by dozens of people
and there was little doubt as to Junta's guilt. But it was not until January
2002 that Junta was brought to trial, convicted, and sentenced to six to ten
years in prison. Instant messaging, chat rooms, and Web-based discussion forums don't always lend themselves to thoughtful discourse, particularly when
the topic is emotional. For some communities, it may be appropriate to consider adding an artificial delay in posting. Suppose that you respond to Joe
Ranter's message by comparing Ranter to Adolf Hitler. Twenty-four hours
later you get an email message from the server: "Does the message below truly
represent your best thinking? Choose an option by clicking on one of the URLs
below: confirm | edit | discard." You've had some time to cool down and think.
Is Joe Ranter really similar to Adolf Hitler in relevant and significant ways?
Upon reflection, you find that the comparison to Hitler was inapt, and so you
choose to edit the message before it becomes public.
A user could bookmark any of these pages and enter the site periodically to
participate in as wide a discussion as interest dictated.
Another way to look at geospatialization is of the users themselves. Consider, for example, an online learning community centered around the breeding
of African Cichlids. Most of the articles and discussion would be of interest to
all users worldwide. However, it would be nice to help members who were geographically proximate find each other. Geographical clumps of members can
share information about the best aquarium shops and can arrange to get together on weekends to swap young fish. To facilitate geospatialization of users,
your software should solicit country of residence and postal code from each
new user during registration. It is almost always possible to find a database of
latitude and longitude centroids for each postal code in a country. In the United
States, for example, you should look for the "Gazetteer" files on www.census.gov,
in particular those for ZIP Code Tabulation Areas (ZCTAs).
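Once each member has a postal code and each postal code has a centroid, finding geographically proximate members reduces to a distance calculation. The following is a minimal sketch in Python, not a production design: the CENTROIDS table, its approximate coordinates, the member names, and the 50 km default radius are all assumptions made for illustration. A real table would be loaded from a gazetteer file.

```python
import math

# Hypothetical centroid table: (country, postal_code) -> (latitude, longitude).
# The coordinates are rough illustrative values, not gazetteer data.
CENTROIDS = {
    ("US", "02139"): (42.364, -71.104),
    ("US", "02138"): (42.380, -71.135),
    ("US", "94305"): (37.424, -122.166),
}

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearby_members(name, members, radius_km=50.0):
    """Return the sorted names of other members whose postal-code centroid
    lies within radius_km of the named member's centroid.

    members maps a member name to a (country, postal_code) pair.
    Members whose postal code is not in CENTROIDS are skipped.
    """
    home = CENTROIDS.get(members[name])
    if home is None:
        return []
    found = []
    for other, location in members.items():
        point = CENTROIDS.get(location)
        if other != name and point is not None and haversine_km(home, point) <= radius_km:
            found.append(other)
    return sorted(found)
```

With a large membership, scanning every member on every request would be slow; a real service would more likely precompute or index the latitude and longitude columns in the database.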
Despite applying the preceding tricks, it is always possible for growth in a
community to outstrip an old user's ability to cope with all the new users and
their contributions. Every Internet collaboration system going back to the early
1970s has drawn complaints of the form "I used to like this [mailing list|newsgroup|MUD|Web community] when it was smaller, but now it is big and full of
flaming losers; the interesting thoughtful material is buried under a heavy layer
of dross." The earliest technological fix for this complaint was the bozo filter. If
you didn't like what someone had to say, you added them to your bozo list and
the software would hide their contributions from your view of the community.
In mid-2001 we added an "inverse bozo filter" facility to the photo.net community. If you find a work of great creativity in the photo sharing system or a
thoughtful response in a discussion forum you can mark the author as "interesting". On subsequent logins you will find a "Your Friends" section in your
personal workspace on the site. The people that you've marked as interesting
are listed in order of their most recent contribution to the site. Six months after
the feature was added, 5,000 users had established 25,000 "I think that other
user is interesting" relationships.
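Both filters amount to a per-user set of author names consulted when a page is built. Here is a minimal sketch, assuming each posting is a dictionary with hypothetical "author" and "posted" keys (the latter an ISO date string); it is not the photo.net implementation.

```python
def visible_postings(postings, bozo_list):
    """Bozo filter: hide contributions from authors on the viewer's bozo list."""
    return [p for p in postings if p["author"] not in bozo_list]

def friends_by_recency(postings, interesting):
    """Inverse bozo filter: list the authors the viewer marked as interesting,
    ordered by their most recent contribution (newest first)."""
    latest = {}
    for p in postings:
        author = p["author"]
        if author in interesting:
            # ISO date strings compare correctly as plain strings.
            latest[author] = max(latest.get(author, p["posted"]), p["posted"])
    return sorted(latest, key=latest.get, reverse=True)
```

In an RDBMS-backed site the two lists would naturally live in a table relating one user to another, with the filtering done in the SQL query rather than in application code.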
an online learning community? What are the features that are helpful? What
features would you add if this were your service?
What is it about a newspaper that makes it particularly tough for that organization to act as the publisher of an online community?
Exercise 9: Amazon.com
List the features of amazon.com that would seem to lead to more graceful scaling of their online community. Explain how each feature helps.
Exercise 10: Scaling Plan for Your Community
Create a document at the abstract URL /doc/planning/YYYYMMDD-scaling
on your server and start writing a scaling plan for your community. This plan
should list those features that you expect to modify or add as the site grows.
The features should be grouped by phases.
Add a link to your new plan from /doc/ or a planning subindex page.
Exercise 11: Implement Phase 1
Implement Phase 1 of your scaling plan. This could be as simple as ensuring
that every time a user's name or e-mail address appears on your service, the
text is an anchor to a page showing all of that person's contributions to the
community (accountability). Or it could be as complex as complete geospatialization. It really depends on how large a community your client expects to
serve in the coming months.
Let's look at some concrete scenarios. Let's assume that we have a public
community in which user-contributed content goes live immediately, without
having to be approved by a moderator. The problem of spam is greatly reduced
in any community where content must be pre-approved before appearing to
other members, but such communities require a larger staff of moderators if
discussion is to flow freely.
Scenario 1: Sarah Moneylover has registered as User 7812 and posted 50 article comments and discussion forum messages with links to her "natural Viagra" sales site. Sarah clicked around by hand and pasted in a text string from a
word processor open on her desktop, investing about 20 minutes in her spamming activity. The appropriate tool for dealing with Sarah is a set of efficient
administration pages. Here's how the clickstream would proceed:
1. site administrator visits an "all content posted within the last 30 days" link,
resulting in page after page of stuff
2. site administrator clicks a control up at the top to limit the display to only
content from newly registered users, who are traditionally the most problematic, and that results in a manageable 5-screen listing
3. site administrator reviews the content items, each presented with a summary headline at the top and the first 200 words of the body with a "more"
hyperlink to view the complete item and a hyperlinked author's name at the
end
4. site administrator clicks on the name "Sarah Moneylover" underneath a
posting that is clearly off-topic and commercial spam; this brings up a page
summarizing Sarah's registration on the server and all of her contributed
content
5. site administrator clicks the "nuke this user" link from Sarah Moneylover
and is presented with a "Do you really want to delete Sarah Moneylover,
User 7812, and all of her contributed content?" prompt
6. site administrator confirms the nuking and a big SQL transaction is executed in which all rows related to Sarah Moneylover are deleted from the
RDBMS. Note that this is different from a moderator marking content as
"unapproved" and having that content remain in the database, but not displayed on pages. The assumption is that commercial spam has no value and
that Sarah is not going to be converted into a productive member of the
community. In fact the row in the users table associated with User 7812
ought to be deleted as well.
The site administrator, assuming he or she was already reviewing all new content on the site, spent less than 30 seconds removing content that took the
spammer 20 minutes to post, a ratio of 40:1. As long as it is much easier to remove spam than to post it, the community is relatively spam-proof. Note that
Sarah would not have been able to deface the community if a policy of preapproval for content contributed by newly registered users was established.
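The "nuke this user" step in Scenario 1 can be sketched as a single transaction. The example below uses Python's built-in sqlite3 module in place of a production RDBMS, and the two-table schema (users, content) and the demo_connection helper are hypothetical simplifications; a real community would have many more tables referencing the user.

```python
import sqlite3

def demo_connection():
    """Build an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table users (user_id integer primary key, name text)")
    conn.execute("create table content (content_id integer primary key, "
                 "user_id integer, body text)")
    conn.executemany("insert into users values (?, ?)",
                     [(7812, "Sarah Moneylover"), (4189, "Herschel Mellowman")])
    conn.executemany("insert into content values (?, ?, ?)",
                     [(1, 7812, "Buy natural pills"),
                      (2, 7812, "Buy more natural pills"),
                      (3, 4189, "A thoughtful reply")])
    conn.commit()
    return conn

def nuke_user(conn, user_id):
    """Delete a user and all of that user's content in one transaction.

    Returns the number of content rows removed. Using the connection as a
    context manager commits if both deletes succeed and rolls back if either
    raises, so the user is never left half-deleted.
    """
    with conn:
        removed = conn.execute("delete from content where user_id = ?",
                               (user_id,)).rowcount
        conn.execute("delete from users where user_id = ?", (user_id,))
    return removed
```

Calling nuke_user(demo_connection(), 7812) removes Sarah's two postings and her row in users while leaving Herschel's content untouched.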
Scenario 2: Ira Angrywicz, User 3571, has developed a grudge against Herschel Mellowman, User 4189. In every discussion forum thread where Herschel
has posted, Ira has posted a personal attack on Herschel right underneath. The
procedure followed to deal with Sarah Moneylover is not appropriate here because Ira, prior to getting angry with Herschel, posted 600 useful discussion
forum replies that we would be loath to delete. The right tool to deal with
this problem is an administration page showing all content contributed by
User 3571 sorted by date. Underneath each content item's headline are the first
200 words of the body so that the administrator can evaluate without clicking
down whether or not the message is anti-Herschel spam. Adjacent to each content item is a checkbox and at the bottom of all the content is a button marked
"Disapprove all checked items". For every angry reply that Ira had to type, the
administrator had to click the mouse only once on a checkbox, perhaps a 100:1
ratio between spammer effort and admin effort.
Scenario 3: A professional programmer hired to boost a company's search
engine rank writes scripts to insert content all around the Internet with hyperlinks to his client's Web site. The programs are sophisticated enough to work
through the new user registration pages in your community, registering 100
new accounts each with a unique name and email address. The programmer
has also set up robots to respond to email address verification messages sent
by your software. Now you've got 100 new (fake) users each of whom has
posted two messages. If the programmer has been a bit sloppy, it is conceivable
that all of the user registrations and content were posted from the same IP address, in which case you could defend against this kind of attack by adding an
originating_ip_address column to your content management tables and
building an admin page letting you view and potentially delete all content
from a particular IP address. Discovering this problem after the fact, you might
deal with it by writing an admin page that would summarize the new user registrations and contributions with a checkbox bulk-nuke capability to remove
those users and all of their content. After cleaning out the spam you'd probably
add a "verify that you're a human" step in the user registration process in
which, for example, a hard-to-read word was obscured inside a patterned bitmap image and the would-be registrant had to recognize the word amidst the
noise and type it in. This would prevent a robot from establishing 100 fake
accounts.
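The report behind such an admin page could start from a grouping of recent registrations by originating IP address. Here is a minimal sketch; the (user_id, originating_ip_address) pairs would come from the registration tables, and the threshold of 10 accounts per address is an arbitrary assumption that a real site would tune.

```python
from collections import defaultdict

def suspicious_ips(registrations, threshold=10):
    """Group (user_id, originating_ip_address) pairs by address and return
    only the addresses with at least `threshold` registrations, each mapped
    to the list of user IDs registered from it."""
    by_ip = defaultdict(list)
    for user_id, ip_address in registrations:
        by_ip[ip_address].append(user_id)
    return {ip: users for ip, users in by_ip.items() if len(users) >= threshold}
```

Each address returned could then be shown with a checkbox next to every user, feeding the bulk-nuke page described in the scenario. Many legitimate users can share one address (a university proxy, for example), so the result is a candidate list for human review rather than a verdict.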
No matter how carefully and intelligently programmed a public online community is to begin with, it will eventually fall prey to a new clever form of spam.
Planning otherwise is like being an American circa 1950 when antibiotics, vaccines, and DDT were eliminating one dreaded disease after another. The optimistic new suburbanites never imagined that viruses would turn out to be
smarter than human beings. Budget at least a few programmer days every six
months to write new admin pages or other protections against new ideas in
the world of spam.
More
- "Face-to-Face and Computer-Mediated Communities, a Comparative Analysis" by Amitai Etzioni and Oren Etzioni, from The Information Society
15, no. 4 (October–December 1999): 241–248 or http://www.gwu.edu/~ccps/etzioni/E31.html.
- The Linux Virtual Server, a very simple load balancer based purely on packet
rewriting; www.linuxvirtualserver.org
12
Search
Recall from the "Planning" chapter our principles of sustainable online community:
1. magnet content authored by experts
2. means of collaboration
3. powerful facilities for browsing and searching both magnet content and contributed content
4. means of delegation of moderation
5. means of identifying members who are imposing an undue burden on the
community and ways of changing their behavior and/or excluding them
from the community without them realizing it
6. means of software extension by community members themselves
A sustainable online community is one that can accommodate new users. If
Joe Novice, via browsing and searching, cannot find existing content relevant
to his needs, he will ask questions that will annoy other community members:
"Didn't you search the archives?" "Haven't you read the FAQ?" Long-term
community members, instead of being stimulated by discussion of new and interesting topics, find their membership a tiresome burden of directing new users
to pages that they should have been able to find on their own.
A community's first line of defense is high quality information architecture
and navigation, as discussed at the end of the "Content Management" chapter.
Users are better at browsing than formulating search queries. A community's
second line of defense, however, is a superb full-text search facility. The search
database must include both publisher-authored and user-contributed content.
Here are some example query categories:
On a large site a user might wish to restrict the search in some way. If the
search form is at the top of a document that is a chapter of an online book, it
might make sense to offer "whole site" and "within the chapters of this book"
options. If the publisher or the other users have gone to the trouble of rating
content, the default search might limit results to those documents that have
been rated of high quality. If there are multiple discussion forums on the site,
each of which is essentially a self-contained subcommunity, the search boxes
on those pages might offer a "restrict searching to postings in this forum" option. If a user hasn't visited the site for a month and wants to see if there is anything new and relevant, the site should perhaps offer a "restrict searching to
content added within the last 30 days" option.
which, by the time the bind variable :user_query is substituted, turns into
select *
from content
where body like '%running%'
In Oracle this won't pick up a row whose message contains the same word, but
with a different capitalization. Instead we do
select *
from content
where upper(body) like upper('%running%')
would not pick up a message that contained the phrase "shoes for running".
Instead we'll need multiple where clauses:
select *
from content
where upper(body) like upper('%running%')
and upper(body) like upper('%shoes%')
This AND clause isn't quite right. If there are lots of documents that contain
both "running" and "shoes", these are the ones that we'd like to see. However,
if there aren't any rows with all query terms, we should probably offer the user
rows that contain some of the query terms. We might need to use OR, a scoring
function, and an ORDER BY so that the rows containing both query terms are
returned first. If we insist on the AND clause, we've created a situation in
which the more the user tells us about her interests the fewer documents we'll
return in response to a search, eventually returning "0 results found" if she
keeps adding words. (Note that public search engines circa 2005, such as
Google, Yahoo, A9, and MSN, do implicitly use AND and do return 0 results
if a user keeps adding words to a query and there aren't any documents in the
database that contain each and every one of those words.)
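One way to combine OR, a scoring function, and an ORDER BY is to count how many query terms each row matches and sort on that count. The sketch below runs against SQLite through Python's sqlite3 module rather than Oracle, and it assumes a content table with hypothetical content_id and body columns. It relies on SQLite evaluating a comparison to 0 or 1; Oracle would need a CASE expression in the same position.

```python
import sqlite3

def search(conn, user_query):
    """Return (content_id, score) rows, where score is the number of query
    words found in body, best matches first. Rows matching no word are omitted."""
    terms = user_query.split()
    if not terms:
        return []
    # One parenthesized LIKE per term; each evaluates to 0 or 1 in SQLite,
    # so their sum is the number of matching terms.
    score_expr = " + ".join("(upper(body) like upper(?))" for _ in terms)
    sql = ("select content_id, score from "
           f"(select content_id, {score_expr} as score from content) "
           "where score > 0 order by score desc, content_id")
    # Terms are passed as bind variables, never pasted into the SQL text.
    # Note that % and _ typed by the user still act as LIKE wildcards here.
    patterns = ["%" + term + "%" for term in terms]
    return conn.execute(sql, patterns).fetchall()
```

This still performs a sequential scan with one LIKE comparison per term per row, so it addresses only the ranking problem, not the performance or word-variant problems.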
There are some deeper problems with the "Caveperson SQL Programmer"
approach to full-text search. Suppose that a message contains the phrase "My
brother-in-law Billy Bob ran 20 miles yesterday" but not the word "running".
Or a message contains the phrase "My cousin Gertrude runs 15 miles every
day". These should be returned as relevant to the query "running", but the
LIKE clause won't do the job. What is needed is a system for stemming both
the query terms and the indexed terms: "running", "runs", and "ran" would
all be bashed down to the stem word "run" for indexing and retrieval.
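To make the idea concrete, here is a deliberately tiny stemmer. It is a toy written for this illustration, with a handful of made-up suffix rules and a one-entry table of irregular forms; production systems use far more careful algorithms and much larger exception lists.

```python
# Hypothetical table of irregular forms; a real one would be much longer.
IRREGULAR = {"ran": "run"}

def stem(word):
    """Reduce a word to a crude stem using a few illustrative rules."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    if w.endswith("ies") and len(w) > 4:
        return w[:-3] + "y"            # families -> family
    if w.endswith("ing") and len(w) > 5:
        w = w[:-3]                     # running -> runn
        if len(w) >= 2 and w[-1] == w[-2]:
            w = w[:-1]                 # runn -> run
        return w
    if w.endswith("s") and not w.endswith("ss") and len(w) > 3:
        return w[:-1]                  # runs -> run
    return w
```

The same function must be applied both when a document is indexed and when a query arrives; otherwise the stored stems and the query stems will not line up.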
The RDBMS must examine every row in the content table to answer this
query, that is, it must perform a sequential table scan (O[N] time, where N is
the number of rows in the table). Suppose that a standard RDBMS index is
defined on the body column. The values of body will be used as keys for a B-tree and we could perform
select *
from content
where body = 'running'
in O[log N] time. But the user's interest isn't restricted to documents whose
only word is "running" or documents that begin with the word "running".
The user wants documents in which the word "running" may be buried. A single B-tree index is not going to help.
RDBMS B-tree. A full-text index can answer the question "Find me the documents containing the word 'running'" in time that approaches O[1], that is, an
amount of time that does not vary with the size of the corpus indexed. If there
are 10 million documents in the corpus, a search through those 10 million
documents will not take much longer than a search through a corpus of 1,000
documents. (Getting close to constant time in this situation would require that
the 10-million-document collection did not use a larger vocabulary than the
1,000-document collection and that it was not the case that, say, 90 percent of
the documents contained the word "running".)
How does it work? Like every other indexing strategy: extra work at insertion time is traded for less work at query time. Consider constructing a big
table of every word in the English language next to the database keys of those
documents that contain the word:
Word              Document IDs
absquatulate      612
bedizen           36, 9211
cryptogenic
dactylioglyph     7214
exheredate
feuilleton
genetotrophic     5000
hartebeest        710
inspissate
...
samoyed
sesquipedalian    723
the
uberous           6, 800
velutinous        45, 2307
widdershins       7300
xenial            3611
ypsiliform        5607
zibeline          4782
If we build this as a hash table, we have O[1] access to a row in the table. If we
merely keep the rows in sorted order, we have O[log W] access to any row in
the table, where W is the number of words in our vocabulary. Performance
does not vary with the number of documents in the collection . . . or does it?
Just about every English document will contain the word "the" and therefore
simply returning the value of the document_ids column for the word "the"
will take O[N] time, where N is the number of documents in the corpus. This
row isn't useful anyway because it isn't selective, that is, we could get the same
information almost as fast with a sequential scan of the documents table, collecting all the document IDs. While indexing a document, a full-text search system will refer to a list of stopwords, words that are too common to be worth
indexing. For standard English, the stopword list includes such words as "a",
"and", "as", "at", "for", "or", "the", and so forth.
Inserting a new document into the collection will be slow. We'll have to go
through the document, word by word, and update as many rows in the index
as there are distinct words in the document. But that extra work at insertion
time pays off in a reduction in query time from O[N] to O[1].
Given a data structure of the preceding form, we can quickly find all documents containing the word "running". We can also quickly find all documents
containing the word "shoes". We can intersect these result sets quickly, giving
us the documents that contain both "running" and "shoes". With some fancier
indexing data structures, we can restrict our search to documents that contain
the contiguous phrase "running shoes" as opposed to documents where those
words appear separately. But suppose that there are 1,000 documents in the
collection containing these two words. Which are the most relevant to the
user's query of "running shoes"?
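The word-to-document-IDs table and the intersection of result sets can be sketched in a few lines. This is an in-memory illustration only, using a Python dictionary as the hash table; the tokenizer and the short stopword list (taken from the examples in the text) are simplifying assumptions, and no stemming is applied.

```python
import re
from collections import defaultdict

# Stopwords named in the text; a real list would be longer.
STOPWORDS = {"a", "and", "as", "at", "for", "or", "the"}

def tokenize(text):
    """Split text into lowercase runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each non-stopword to the set of IDs of documents containing it.
    documents maps a document ID to its body text."""
    index = defaultdict(set)
    for doc_id, body in documents.items():
        for word in tokenize(body):
            if word not in STOPWORDS:
                index[word].add(doc_id)
    return index

def search_all(index, query):
    """Return the IDs of documents containing every non-stopword in the query."""
    words = [w for w in tokenize(query) if w not in STOPWORDS]
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())   # intersect the per-word result sets
    return result
```

Each lookup in the dictionary is a hash-table probe, so the cost of a query depends on the number of query words and the sizes of the sets being intersected, not on a scan of every document.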
We need a new data structure: the word-frequency histogram. This will tell us
which words occur in a document and how frequently they occur in a way that
is easily adjusted for the total length of a document.
Here's a word-frequency histogram for the first sentence of Tolstoy's Anna
Karenina:
Word        Count   Frequency
all         1       1/16
another     1       1/16
but         1       1/16
each        1       1/16
families    1       1/16
family      1       1/16
happy       1       1/16
in          1       1/16
is          1       1/16
its         1       1/16
one         1       1/16
own         1       1/16
resemble    1       1/16
unhappy     2       2/16
way         1       1/16
One might argue that this sentence makes better literature as "All happy
families resemble one another, but each unhappy family is unhappy in its own
way", but the full-text search software finds it more useful in this form.
After the crude histogram is made, it is typically adjusted for the prevalence
of words in standard English. So, for example, the appearance of "resemble" is
more interesting than "happy" because "resemble" occurs less frequently in
standard English. Stopwords such as "is" are thrown away altogether. Stemming is another useful refinement. In the index and in queries, we convert all
words to their stems. The stem word for "families", for example, is "family".
With stemming, a query for "families" would match a document containing
"family" and vice versa.
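The crude histogram, before any adjustment for word prevalence, stopword removal, or stemming, is straightforward to compute. A minimal sketch follows; the tokenizing regular expression is an assumption, and Python's Fraction reduces 2/16 to 1/8.

```python
import re
from collections import Counter
from fractions import Fraction

def histogram(text):
    """Return {word: (count, frequency)} for the text, where frequency is
    the count divided by the total number of words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words)
    return {word: (count, Fraction(count, total))
            for word, count in sorted(counts.items())}
```

Applied to the Tolstoy sentence, this yields 15 distinct words out of 16, with "unhappy" the only word that occurs twice. Dividing by the total length is what makes histograms of short and long documents comparable.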
Given a body of histograms it is possible to answer queries such as "Show
me documents that are similar to this one" or "Show me documents whose histogram is closest to a user-entered string". The inter-document similarity query
can be handled by comparing histograms already stored in the text database.
The search string "platinum mines in New Zealand" might be processed first
by throwing away the stopwords "in" and "new". By using histogram comparison, the software would deliver articles that have the most occurrences of
"platinum", "mines", and "Zealand". Suppose that "Zealand" is a rarer word
than "platinum". Then a document with one occurrence of "Zealand" is favored over one with one occurrence of "platinum". A document with one occurrence of each word is preferred to an article where only one of those words
shows up. A document that contains only the words "platinum mines Zealand"
is a better match than a document that contains 100,000 words, three of which
happen to match the query terms.
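These preferences can be reproduced with a simple scoring rule: for each query term, multiply its frequency in the document by a weight that grows as the term gets rarer in the collection, then add the products. The sketch below uses the logarithm of (number of documents / number of documents containing the term) as that weight. This is one common choice made for illustration, not a description of how any particular product scores documents, and the stopword list is an assumption.

```python
import math
import re
from collections import Counter

# Assumed stopword list, including the "in" and "new" mentioned in the text.
STOPWORDS = {"a", "and", "as", "at", "for", "in", "new", "or", "the"}

def terms(text):
    """Lowercase words with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS]

def score_documents(documents, query):
    """Return {doc_id: score} for documents matching at least one query term.

    documents maps a document ID to its body text. Each matching term
    contributes (occurrences / document length) * log(N / documents
    containing the term), so rare terms and short documents score higher.
    """
    histograms = {doc_id: Counter(terms(body))
                  for doc_id, body in documents.items()}
    n_docs = len(documents)
    scores = {}
    for doc_id, hist in histograms.items():
        length = sum(hist.values())
        score = 0.0
        for term in set(terms(query)):
            if length and hist[term]:
                doc_freq = sum(1 for h in histograms.values() if h[term])
                score += (hist[term] / length) * math.log(n_docs / doc_freq)
        if score > 0:
            scores[doc_id] = score
    return scores
```

Because each score is divided by document length, a very long document that mentions all the query terms once can rank below a short document that mentions only one of them. Whether that is desirable is a tuning decision; the preferences listed in the text hold when the other factors are equal.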
The power of this kind of system is enticing and raises the question "Can we
run our entire Web application from a specialized full-text search database system?" Indeed, why not chuck the RDBMS altogether?
We don't chuck the RDBMS because we put it in to handle the problem of
concurrency: two users trying to update the same item simultaneously. A better
query tool is nice, but we can't adopt it as our primary database management
system unless it handles the concurrency problem as well as the RDBMS.
A pragmatic approach would seem to start by keeping all the documents in
the RDBMS: articles, user comments, discussion forum postings, and so on. Either once per night or every time a new document was added, update a full-text
search system's collection. Pages that are part of the standard user experience
and workflow operate from the RDBMS. The search box at the upper right
corner of every page, however, queries against the full-text search system. Let's
call this a split-system design, shown in figure 12.1.
One argument against the split-system approach is that two copies of the
document collection are being kept. In an age of $200 disk drives of absurdly
high capacity, this isn't a powerful argument. It is nearly impossible to fill a
modern disk drive with words typed by humans. One can fill up a disk drive
with video or audio streams, but not text. And in any case some full-text search
systems can build an index to a document collection without themselves keeping the original document around, that is, you would in fact have only one
copy of the document in the RDBMS.
A second argument against using RDBMS and full-text search systems
simultaneously is that the collections will get out of sync. If the Web server
crashes in the middle of an RDBMS transaction, all work is rolled back. If the
Web server was simultaneously inserting a document into a full-text search system, it is possible that the full-text database will contain a document that is not
in fact available on the main pages of the site (the site being generated from
the RDBMS). Alternatively, the RDBMS insert might succeed while the full-text insert fails, leading to a document that is available on the site, but not
searchable. This argument, too, ultimately lacks power. It is true that the
RDBMS is a convenient and nearly foolproof means of managing transactions
and concurrency. However, it is not the only way. If one were to hire sufficiently careful programmers and sufficiently dedicated system and database
administrators, it would be possible to keep two databases in sync.
A third argument against the split system is the disparity of interfaces. Suppose that our RDBMS is Oracle. The Web developers know how to talk to
Oracle through Active Server Pages. The desktop programmers know how to
talk to Oracle through the C API. The marketing people know how to talk
to Oracle through various reporting tools. Some individual users have figured
out how to talk to Oracle from standard desktop programs such as Microsoft Excel
and Microsoft Access. The cost of bringing in a new programmer grows if you
have to teach that person not only about an RDBMS, but also about specialized tools, each with its own library of interfaces.
However, the best argument against using both an RDBMS and a "bag-on-the-side" full-text search system is that the split system does not naturally support the kinds of queries that are necessary:
In the preceding example, Oracle Text builds its own index on the body column
of the content table. When a Text index is defined on a table, it becomes possible to use the contains operator in a WHERE clause. The Oracle RDBMS
SQL query processor is smart enough to know how to use the Text index to answer this query without doing a sequential table scan. It is possible to have
more than one call to contains in the same query. Thus the last argument of
contains is an integer identifying the query, in this case 1. It is possible to
get a relevance score out in the select list or in an ORDER BY clause with the
function score and an argument identifying from which contains call the
score should be pulled.
Oracle Text is one of the more difficult and complex Oracle RDBMS products to use. For example, if you want to be able to search for a phrase that
occurs in either the one_line_summary or body and combine the relevance
score, you need to build a multi-column index:
ctx_ddl.create_preference('content_multi', 'MULTI_COLUMN_DATASTORE');
ctx_ddl.set_attribute('content_multi', 'COLUMNS', 'one_line_summary, body');
create index content_text
on content(modified_date)
indextype is ctxsys.context
parameters('datastore content_multi');
Notice that the index itself is built on the column modified_date, which is not
itself indexed. The call to ctx_ddl.set_attribute in which the COLUMNS
attribute is set is what determines which columns get indexed.
For an example of a system that tackles the challenge of indexing text from disparate
Oracle tables, see http://philip.greenspun.com/seia/examples-search/site-wide-search.
Oracle Text also has the property that its default search mode is exact phrase
matching. A user who types "zippy pinhead" into a search engine will expect to
find documents that contain the phrase "Zippy the Pinhead". This won't happen if your script passes the raw user query right through to the contains operator. More problematic is what happens when a user types a query string that
contains characters that Oracle Text treats specially. This can result in an error
being raised by the SQL query and a "Server Error 500" returned to the user if
you don't catch the error in your procedural script. It would be nice if Oracle
Text had a built-in procedure called ProcessRawQueryFromWebForm or
something. But it doesn't, at least we couldn't find one in the documentation
for Oracle version 10g. The next best thing is a procedure called pavtranslate, available from http://technet.oracle.com/sample_code/products/text/htdocs/query_syntax_translators/query_syntax_translators.html.
Oracle Text, via the INSO filters option, has the capability to index a remarkable variety of documents in a BLOB column. For example, the software can recognize a Microsoft Excel spreadsheet, pull the text out, and add it
to the index. At the same time it is smart enough to know when to ignore a
document entirely, for example, if the BLOB column were filled with a JPEG
photograph.
whereby the site administrators could view a report of search strings and the
users who typed them in.
Update your /doc/search file to reflect the addition of this facility.
Exercise 5: Linkage
Find logical places among your community's pages to link to the search facility. For example, on many sites it will make sense to have a "quick search" box in
the upper-right corner of every page served. On most sites, it makes sense to
link back to search from the search results page with a "search again" box
filled in by default with the original query.
Make sure that your main documentation page links to the docs for this new
module.
Notice the user-agent header at the end: Googlebot/2.1, with its included suggestion that Web publishers check http://www.google.com/bot.html for more
information. Because some search engines archive what they index, you would
not want to provide registration-free access to content that is truly private to
members. In theory a <META NAME="ROBOTS" CONTENT="NOARCHIVE"> placed
in the HEAD of your HTML documents would prevent search engines from
archiving the page, but robots are not guaranteed to follow such directives.
Some search engines allow you to provide indexing hints and hints for presentation once a user is looking at a search results page. For example, in the
online table of contents page for this book, we have the following META tags
in the HEAD:
<meta name="keywords" content="web development
online communities MIT 6.171 textbook">
<meta name="description" content="This is the textbook for the MIT
course Software Engineering for Internet Applications">
The keywords tag adds some words that are relevant to the document, but
not present in the visible text. This would help someone who decided to search
for "MIT 6.171 textbook," for example. The description tag can be used by
a search engine when summarizing a page. If it isn't present, a search engine
may show the first 20 words on the page or follow some heuristics to build
a reasonable summary. These tags have been routinely abused. A publisher
might add popular search terms such as "sex" to a site that is unrelated to
those terms, in hopes of capturing more readers. A company might add the
names of its competitors as keywords. Users wouldn't see these dirty tricks
unless they went to the trouble of using the View Source command in their
browser. Because of this history of abuse, many public search engines ignore
these tags.
See http://searchenginewatch.com/resources/metasuits.html for accounts of various lawsuits that have been fought over the contents of meta tags.
The User-agent line specifies for which robots the injunctions are intended.
Each Disallow asks a robot not to look in a particular directory. Nothing
requires a robot to observe these injunctions, but the standard seems to have
been adopted by all the major indices nonetheless.
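As an illustration of how a well-behaved robot applies these injunctions, the following Python sketch feeds a small robots.txt to the standard library's urllib.robotparser module. The file contents, the robot name ExampleBot, and the paths are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: every robot is asked to stay out of two directories.
SAMPLE = """\
# created for illustration only
User-agent: *
Disallow: /admin/
Disallow: /doc/
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())


def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given robot may fetch the given path."""
    return parser.can_fetch(agent, path)


print(allowed("/search"))       # not covered by any Disallow line
print(allowed("/admin/users"))  # falls under Disallow: /admin/
```

Nothing in the parser enforces anything; it only reports what the site has asked for, which is the point made above about robots not being required to comply.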
Visit http://www.ibm.com/robots.txt to get a bit of insight into how a site may evolve
over time.
Exercise 6: robots.txt
Place a file on your server at /robots.txt that excludes robots from appropriate portions of your server. Put some comments at the top of the file explaining
who created this, when it was created, and the rationale behind the
exclusions.
If you're doing a 100 percent database-backed content management system, you are free
to put the content of the robots.txt file in the RDBMS, just so long as it is served when
the URI /robots.txt is requested.
The Future
As an online community grows older and larger, it becomes ever more likely
that a user will be overwhelmed with "100,000 documents matched your
query." When a community is new and small, it is possible to search for an answer merely by reading the titles of everything on the site, that is, by browsing.
As a community grows, therefore, information retrieval tools become more
important. The exercises in this chapter focus on answering a user's query
by presenting links to relevant documents. Suppose that we build a search facility that always returns the very most relevant document in the corpus. Is that
an optimal solution? Only if you believe that users like to read.
Suppose that Joe User visits photo.net and types "At what shutter speeds is a
tripod required?" into the search box. Is it reasonable to assume that Joe wants
to read a 10,000-word document that contains the answer to this question? Or
would Joe rather get . . . the answer to his question? The answer "at shutter
speeds slower than 1/lens-focal-length" is a lot smaller and quicker to read
than a document containing this information.
To get a feel for how a question-answering system can be built on top of a
full-text indexer, read "Scaling Question Answering to the Web" (Cody
Kwok, Oren Etzioni, Dan Weld, WWW10 conference, May 2001,
http://www.www10.org/cdrom/papers/pdf/p120.pdf), which describes a system built at the
University of Washington. This system includes all of the expected linguistic
gymnastics plus code to sort out the Internet-specific problem of noise. Traditional information retrieval systems are designed to work with authoritative
documents, for example, the Encyclopedia Britannica, a binder of corporate
policies, or the design notes for a jetliner. The documents in the corpus are presumed to be authoritative. There won't be four different answers, three of them
flat wrong, to questions such as "In what year was Gioacchino Rossini born?",
"How many signatures are required for a purchase of $57,300?", or "How wide
is the wingspan of the airplane?" With user-authored content in an online community, however, it seems safe to assume that while the average answer is likely
to be correct, for every 100 correct answers there will be at least three or four
incorrect ones. Even when the data require no interpretation, there will be
typos. For example, a Google search for "rossini 1792-1868" returned 50,900
documents in February 2005; a search for "rossini 1792-1869" returned 43
documents. A question-answering system built on top of lightly moderated
user-authored content will have to exercise the same sort of judgment as do
humans: How many documents contain Answer A versus Answer B? What is
the relative authority of conflicting documents? Which of two conflicting documents is more recent?
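The first of those judgments, counting how many documents support each candidate answer, can be sketched in a few lines of Python. The document counts are the ones quoted above; the function name and the idea of reporting a simple share of documents are assumptions for illustration, not a description of the University of Washington system.

```python
def most_supported(doc_counts: dict) -> tuple:
    """Pick the answer backed by the most documents and report its share."""
    best = max(doc_counts, key=doc_counts.get)
    share = doc_counts[best] / sum(doc_counts.values())
    return best, share


# Counts reported for the two Google searches in February 2005.
counts = {"1792-1868": 50900, "1792-1869": 43}
answer, share = most_supported(counts)
print(answer, round(share, 4))
```

A real system would also have to weigh authority and recency, the other two questions raised above, rather than rely on counts alone.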
Mobile Internet devices put an even greater stress on information retrieval. Connection speeds are slower. Screens are smaller. It isn't practical for
a user to drill down into 20 documents returned by a search engine as possibly
relevant to a query, especially if the user is driving a car and using a voice
browser.
If you want to emerge as a hero from the dust of the next Internet collapse,
work on information retrieval.
More
• http://www.oracle.com/technology/products/text/, technical overviews for
Oracle Text
• http://trec.nist.gov/, for the proceedings of the Text REtrieval Conferences
(TREC)
13
Planning Redux
A lot has changed since the Planning chapter. You have a better understanding of the challenge, which may have sparked new service ideas in your
mind. Your clients have had a chance to see a prototype of the ultimate service,
which may have sparked new ideas in their minds. Your clients should have an
increased respect for your abilities and therefore an increased willingness to devote thought and attention to this project. Consider that most computer programmers suffer from profound deficits in the following areas:
• thinking critically about what a computer application should do
• writing down a design
• writing down an implementation plan
• documenting important features or design decisions
• clean modular design
• exercising good judgment (e.g., don't try to build something complete and
complex when you only have a week or two)
• communicating project status
To the extent that you've demonstrated that you're a cut above software developers with whom your clients have worked in the past, you'll find that their
confidence in you has increased since the beginning of the class.
you how to handle concurrency, but only observations of and interactions with
users can teach you how to build a better user experience. Your client holds the
keys to the kingdom: (1) content to attract people; (2) authority to launch the
service; (3) editorial power over existing Web sites that can link to the new service; (4) email addresses and phone numbers of people who would be likely to
find the new service useful.
If you can launch your online learning community before the end of the
course, you'll have an opportunity to learn from the first users and, by making
minor changes, end up with a vastly improved application by the last day of the
class.
Fix the small discrepancies and record the large ones for inclusion in your rest-of-course implementation plan (see below).
Find someone who has never seen your project before and ask them to work
through the tasks in /doc/testing/representative-tasks with your entire
team observing. Write down a brief report of how it went at
/doc/testing/planning-redux-usability.
If a client proposes a feature that is unnecessary for meeting these requirements, ask the question "Why does this keep us from launching?" Every day
the service isn't launched is a day that you're not learning from users. Every
day the service isn't launched is a day that the client's organization isn't learning how to operate the service.
In collaboration with your client, develop a feature grid dividing the desired
features into the following categories:
1. Minimum Launchable Feature Set, i.e., things that are required for the
launch
2. Version 1.0 (try to finish by the end of this course)
3. Version 2.0 (write down so that a planned follow-on implementation can be
accomplished)
Most admin pages can be excluded from the Minimum Launchable Feature
Set. Until there are users, there won't be any user activity and therefore little
need for statistics or moderation and organization of content. Things that are
valuable to the users and client and reasonably easy to implement should be in
Version 1.0. Anything that requires serious programming effort or that cannot
be completely specified right now should be pushed out to Version 2.0.
Place your feature grid at /doc/planning/YYYYMMDD-feature-grid.
plan before or they wouldn't be in the to-be-implemented plan. The best tool
for estimating a new project is a record of how long it took to do a bunch of
old projects. To what is the new project most similar? Suppose that it took
you three days to build a discussion forum system, for example, and you're
asked to build a classified ad system. Both systems need a comparable number
of database tables. Both systems accept content from users and require some
sort of administrator approval. If built on the same server that is currently running the discussion forum, the classified ad system doesn't require any new software, subsystems, or other tools that you haven't already installed and used.
Thus it would probably be safe to estimate the classified ad system as a three-day project.
When it's ready, place your completed plan at /doc/planning/YYYYMMDD-implementation and email your client(s) and instructors notifying them that
the plan is available for final review.
Is This Necessary?
Suppose that your team is only two people and your client is one team member's mother, owner of a local scuba diving shop. Is it necessary to engage in
such a formal process? Wouldn't it be possible to obtain a successful result by
sitting down in one room and hacking out code, periodically calling Mom over
to look at what's been done?
Absolutely.
Why the emphasis on process then when the teams are so small? It is a good
habit for every software developer to get into, especially as modern software
projects tend to stretch across corporate and international borders.
Consider a software project from Jane Decision-Maker's perspective. Jane
doesn't know enough to distinguish between good code and bad code. Nor can
she look at a mostly finished project and figure out how much more coding is
required to make it work. Jane Decision-Maker is not going to be comforted by
a team of programmers with a track record of pulling everything together with
a last-minute miracle. How does she know that the miracle will happen again
on her project?
What Jane will be comforted by is process and programmers who appear to
operate in a manner that is predictable to them and their client. The more
detailed the plain-language plans, the more comforted Jane will be, especially
if the work has been contracted out to a separate corporation.
In summary, larger teams require more process, longer projects require more
process, and work that is spread across enterprises and/or international borders
requires more process. Your project for this class is being done by a small team
on a condensed schedule and, ideally, within the same city as the client. What
benefit is there to you from using a process that isn't absolutely necessary?
One benefit from using a more thorough process is that you'll tend to impress people a lot more in presentations of your work. People who conduct programmer job interviews have seen plenty of code monkeys, but they won't have
seen too many who show up with printouts of their clear plans and schedules
and then can talk about how they met those plans and schedules.
A deeper benefit is that you'll get good at the process and it will become less
of an effort on succeeding projects.
The deepest benefit is that working with a written plan will become an
unconscious habit. Pilots are trained to follow checklists and procedures extremely carefully and consistently. The plane won't fall out of the sky if things
aren't done in the same order or same way on every flight, and a lot of the stuff
doesn't matter if you're flying on a sunny day in a well-maintained airplane.
Unless the checklists and procedures have become a habit, however, the pilot
who encounters bad weather or mechanical problems has a good chance of
dying. People tell themselves "I'm being sloppy today because this is an unchallenging flight, but I'll be careful when I need to be," but in fact the skills
of carefulness aren't very useful unless they are habitual.
5. I understand what work has been done, what is going to be done by the
end of the course, and what is left for a Version 2.0.
6. My student team has made it easy for me to check on their progress myself.
7. My student team has kept me well informed of their progress.
8. My student team has involved me appropriately in design and feature
decisions.
9. I was impressed by the thoroughness of the user testing done by my student
team.
10. I am impressed by the clarity and thoroughness of the documentation.
11. I think it would be easy for a new programmer to take this project over in
the event that my student team disappeared.
12. I am impressed by the mobile phone interface to my service.
13. I am impressed by the VoiceXML interface to my service.
14. My student team is the best group of engineers that I have ever worked
with.
15. My student team consists of people that I would very much like to work
with again.
Score this exercise by adding scores from each question: 0 for "disagree" or
wishy-washy agreement (clients won't want to say bad things about young volunteers), 1 for "agree," 2 for "strongly agree."
14
Figure 14.1 A Web services interaction. Human users talk to servers A and B via the
HTTP protocol receiving results in HTML pages. When Server A needs to invoke a procedure on Server B it first tries to figure out what the names of the functions are and
their arguments. This information comes back in a Web Services Description Language
(WSDL) document. Using the information in that WSDL document, Server A is able to
formulate a legal Simple Object Access Protocol (SOAP) request and process the results.
Distributed Computing
<soap:Body>
<WhosOnline xmlns="http://jpfo.org/">
<n_seconds>600</n_seconds>
</WhosOnline>
</soap:Body>
</soap:Envelope>
HTTP/1.1 200 OK
Content-Type: text/xml; charset=utf-8
Content-Length: length
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/2001/XMLSchema"
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>
<WhosOnlineResponse xmlns="http://jpfo.org/">
<WhosOnlineResult>
<user>
<first_names>Eve</first_names>
<last_name>Andersson</last_name>
<email>[email protected]</email>
</user>
<user>
<first_names>Philip</first_names>
<last_name>Greenspun</last_name>
<email>[email protected]</email>
</user>
<user>
<first_names>Andrew</first_names>
<last_name>Grumet</last_name>
<email>[email protected]</email>
</user>
</WhosOnlineResult>
</WhosOnlineResponse>
</soap:Body>
</soap:Envelope>
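If your platform has no SOAP toolkit, a response like the one above can still be processed with an ordinary XML parser. The following Python sketch uses the standard library's xml.etree.ElementTree to pull the names out of an abridged copy of that response. The email elements are omitted, and the function name users_online and the svc prefix are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the WhosOnline response, without the email elements.
RESPONSE = """<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <WhosOnlineResponse xmlns="http://jpfo.org/">
      <WhosOnlineResult>
        <user><first_names>Eve</first_names><last_name>Andersson</last_name></user>
        <user><first_names>Philip</first_names><last_name>Greenspun</last_name></user>
      </WhosOnlineResult>
    </WhosOnlineResponse>
  </soap:Body>
</soap:Envelope>"""

# Prefixes used in the search paths; "svc" is our own label for the
# service's default namespace.
NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "svc": "http://jpfo.org/",
}


def users_online(xml_text: str) -> list:
    """Return (first_names, last_name) pairs found in a WhosOnline response."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [
        (u.findtext("svc:first_names", namespaces=NS),
         u.findtext("svc:last_name", namespaces=NS))
        for u in root.findall(".//svc:user", NS)
    ]


print(users_online(RESPONSE))
```

Note that every element inside WhosOnlineResponse inherits the service's default namespace, which is why the search paths must carry the svc prefix.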
You may wish to start your exploration of the Amazon SOAP API by locating
the Web Services Description Language (WSDL) file for the service. The
WSDL file is a formal description of the callable functions, argument names
and types, and return value type. Most Internet application development environments provide a SOAP toolset that transforms the WSDL file into a set of
proxy classes or function libraries that can be called as if the service were
implemented in the local runtime. In Microsoft Visual Studio .NET, this operation is referred to as "Adding a Web Reference." If you're not a Microsoft
Achiever, you might find the SOAP Implementations links at the end of the
chapter useful.
A good rule of thumb is that every table you add to your data model implies roughly 5
user-accessible URLs and 5 administrative URLs. So far we're up to 4 user pages, and if
you were to launch this feature, you'd need to build some admin pages.
The ISBN goes after the ASIN, and the Associates ID in this example is
pgreenspun-20.
Your development platform may provide tools that, once you've mapped the
external Web service to the internal procedure call, handle the HTTP and
SOAP mechanics transparently. If not, you will need to skim the examples in
the SOAP specification and read the introductory articles linked below.
Exercise 7: Self-Description
Write a WSDL contract that describes the inputs and outputs for your new-content service. Note that if you are using Microsoft .NET, these WSDL contracts will be automatically generated in most cases. You need only expose them.
Your WSDL should be available either by adding a ?WSDL to the URL of
the service itself (convenient for Microsoft .NET users) or available by adding
a .wsdl extension to the URL of the service itself.
Validate your WSDL contract and SOAP methods by inviting another team
to test your service. Do the same for them. Alternatively, look for and employ
validation tools out on the Web.
that aggregate information from multiple weblogs. This standard, pushed forward primarily by UserLand's Dave Winer, is known as Really Simple Syndication or RSS and is documented at http://blogs.law.harvard.edu/tech/rss.
Remember to escape any markup in your titles and descriptions, so that, for
example, <em>Whoa!</em> becomes &lt;em&gt;Whoa!&lt;/em&gt;.
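Most languages ship a routine for this escaping, so there is no need to write your own. As a sketch, Python's standard xml.sax.saxutils.escape replaces the ampersand and the two angle brackets. The helper name rss_title_element is hypothetical.

```python
from xml.sax.saxutils import escape


def rss_title_element(title: str) -> str:
    """Wrap a title in an RSS <title> element, escaping any markup in it."""
    return "<title>" + escape(title) + "</title>"


print(escape("<em>Whoa!</em>"))
print(rss_title_element("Q&A: <em>Whoa!</em>"))
```

Escaping at the moment the feed is generated, rather than when content is stored, keeps the database copy usable for ordinary HTML pages.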
More
• http://www.soapware.org/bdg: A Busy Developer's Guide to SOAP
• http://www-106.ibm.com/developerworks/library/ws-soap/?dwzone=components#2: Using WSDL in SOAP Applications (An Introduction to
WSDL for SOAP Programmers)
• http://www.w3.org/TR/SOAP: Simple Object Access Protocol (SOAP)
• http://www.w3.org/TR/wsdl.html: Web Service Description Language
(WSDL)
• http://www.sun.com/software/sunone/wp-arch/: Sun Open Net Environment (Sun ONE) White Papers
• http://www.xmlrpc.com: XML-RPC
• http://dmoz.org/Computers/Programming/Internet/Web_Services/SOAP/Implementations: A directory of SOAP implementations
• http://www.jabber.org: an instant messaging client plus open platform for
XML messaging and presence information that interoperates with AOL Instant Messenger, MSN Messenger, Yahoo Messenger, ICQ, and IRC
• http://www.ietf.org/rfc/rfc0822.txt?number=822: RFC 822 standard for the
format of Internet text messages
• http://blogs.law.harvard.edu/tech/directory/5/aggregators: a directory of
RSS readers
• http://rss.scripting.com: RSS validator
15
In this section you'll build a machine-readable representation of the requirements of an application and then build a computer program to generate the
computer programs that implement that application. We'll treat this material
in the context of building a knowledge management system, one of the most
common types of online communities, and try to introduce you to terminology
used by business people in this area.
Organizations have complex requirements for their information systems. A
period of rapid economic growth can result in insane schedules and demands
that a new information system be ready within weeks. Finally, organizations
are fickle and have no compunction about changing the requirements midstream.
Technical people have traditionally met these challenges . . . by arguing over
programming tools. The data model can't represent the information that the
users need, the application doesn't do what the users need it to do, and
instead of writing code, the "engineers" are arguing about Java versus Lisp versus ML versus C# versus Perl versus VB. If you want to know why computer
programmers get paid less than medical doctors, consider the situation of two
trauma surgeons arriving at an accident scene. The patient is bleeding profusely. If surgeons were like programmers, they'd leave the patient to bleed
out in order to have a really satisfying argument over the merits of two different kinds of tourniquet.
If you're programming one Web page at a time, you can switch to the language du jour in search of higher productivity. But you won't achieve significant gains unless you quit writing code for one page at a time. Think about
ways to write down a machine-readable description of the application and
user experience, then let the computer generate the application automatically.
War Story
The authors were asked to help Siemens and Boston Consulting Group (BCG) realize a
knowledge sharing system for 17,000 telephone switch salespeople spread among 84
countries. This was back in the 1990s when (a) telephone companies were expanding
capacity, and (b) corporations invested in information systems as a way of beating
competitors.
Siemens had spent 6 months working with a Web development contractor that was
expert in building HTML pages but had trouble programming SQL. They'd promised
to launch the elaborately specified system 6 weeks from our first meeting. We concluded
that many of the features that they wanted could be adapted from the source code behind the photo.net online community but that adding the knowledge repository would
require 4 programmers working full-time for 6 weeks. What's worse, in looking at the
specs we decided that the realized system would be unusably complex, especially for
busy salespeople.
Instead of blindly cranking out the code, we assigned only one programmer to the
project, our friend Tracy Adams. She turned the human-readable design notebooks
into machine-readable database metadata tables. Tracy proceeded to build a program-to-write-the-program with no visible results. Siemens and BCG were nervous until
Week 4 when the completed system was available for testing.
"How do you like it?" we asked. "This is the worst information system that we've
ever used," they replied. "How do you compare it to your specs?" we asked. "Hmmm
. . . maybe we should simplify the specification," they replied.
After two more iterations the system, dubbed "ICN Sharenet," was launched on time
and was adopted quickly, credited by Siemens with $122 million in additional sales during its first year of operation.
One thing that we hope you've learned during this course is the value of testing with users and iterative improvement of an application. If an application is
machine-generated, you can test it with users, edit the specification based on
their feedback, and regenerate the application in a matter of minutes, ready
for a new test.
We're going to explore metadata (data about the data model) and automatic
code generation in the problem domain of knowledge management.
Metadata
For each of these types, we will define a table and call a row in one of those
tables an "object." To say that "John McCarthy developed the Lisp programming language," the author would create two objects: one of type language
and one of type person. Why not link to the users table instead? John
McCarthy might not be a registered user of the system. Some of the people
you'll be referencing, for example, John von Neumann, are dead.
Each object comprises a set of elements. An element is stored in a column.
For every object in the system, we want to record the following elements:
• name (a short description of the thing)
• overview (a longer description)
date_of_birth, title
for the algorithm type: examples include "Quicksort" and "binary search";
elements for pseudo_code and high_level_explanation. In general, objects of
type algorithm will be linked to objects of type problem (what need the algorithm addresses), publication (papers that describe the algorithm or implementations of it), and person (people who were involved in developing the
algorithm).
For an example of what a completed system of this nature might look like,
visit Paul Black's Dictionary of Algorithms, Data Structures, and Problems at
http://www.nist.gov/dads/.
Example Ontology 2: Flying
We want a system that will enable pilots to assist each other by relating experience, for example, "The autopilot in N123 is not to be trusted," "Avoid
the nachos at the airport cafe in Hopedale," with the comments anchored by
official U.S. government information regarding airports, runways, and radio
beacons for navigation.
Object types include:
• person
• publication
• airplane design
• airplane
• airport
• runway
• navigation aid (navaid)
• restaurant
• hotel
• flight school
• flight instructor
need elements to specify performance such as stall_speed (how slow you can go
before tumbling out of the sky), approach_speed (how fast you should go when
coming near the runway to land), and cruise_speed. We want elements such as
date_certified, manufacturer_name, and manufacturer_address to describe the
design.
for the airplane type: An entry in this table is a specific airplane, very
likely a rental machine belonging to a flight school. We want elements such
as date_manufactured, ifr_capable_p (legal to fly in the clouds?), and
optional_equipment.
for the airport type: We want to know where the airport is: lat_long; elevation; relation_to_city (distance and direction from a named town). We want to
know whether the airport is military only, private, or public. We want to know
whether or not the airport has a rotating green/white beacon and runway
lights. We want to store the frequencies for weather information, contacting
other pilots (if non-towered) or the control tower (if towered), and air traffic
control for instrument flight clearances. We need to record runway lengths
and conditions. An airport may have several runways, however, thus giving
rise to a many-to-one relation, which is why we model runways separately and
link them to airports.
for the runway type: number (e.g., "09/27"), length, condition. Note that the
runway number implies the magnetic orientation: 09 implies a heading of 090
or landing facing magnetic east; if the wind favors a landing on the same strip
of asphalt in the opposite direction, you're on 27, which implies a heading of
270 or due west (36 faces north; 18 faces south).
for the navigation aid type: The U.S. Federal Aviation Administration
maintains a nationwide network of Very High Frequency Omni Ranging beacons (VORs). These transmit two signals, one of which is constant in phase regardless of an airplane's bearing to the VOR. The second signal varies in phase
as one circles a VOR. Thus a VOR receiver in the airplane can compare
the phase of the two signals and determine that an airplane is, for example, on
the 123-degree radial from the VOR. If you didn't have a Global Positioning
System receiver in your airplane, you'd determine your position on the chart
by plotting radials out from two VORs. For a navaid, we need to store its
type (could be an old non-directional beacon, which simply puts out an AM
radio-style broadcast), frequency, position, and Morse code ID (you want
to listen to the dot-dash pattern to make sure that you're receiving the proper
navaid).
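The runway numbering convention described above for the runway type is simple arithmetic, which matters if the application is to validate what users type. Here is a small Python sketch; the function names are hypothetical, and it assumes runway numbers run from 1 to 36 as in the examples given.

```python
def runway_heading(number: int) -> int:
    """Approximate magnetic heading in degrees implied by a runway number."""
    return number * 10


def opposite_runway(number: int) -> int:
    """Number of the same strip of asphalt when used in the other direction."""
    # Add 18 (180 degrees) and wrap around so the result stays in 1..36.
    return (number + 17) % 36 + 1


print(runway_heading(9), opposite_runway(9))
print(runway_heading(36), opposite_runway(36))
```

A form handler could use opposite_runway to reject an entry such as "09/28" before it reaches the database.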
-- we'll have one row in this table for every object type
-- and thus for every new SQL table that gets defined; an
-- object type and its database table name are the same;
-- Oracle limits schema objects to 30 characters and thus
-- here is the table for elements that are unique to an object type
-- (the housekeeping elements can be defined implicitly in the source
-- code for the application generator); there will be one row in
-- the metadata table per element
    -- if we did this, it would preclude an interface in which users create rows incrementally
    mandatory_p             char(1) check (mandatory_p in ('t','f')),
    -- ordering in Oracle table creation, 0 would be on top, 1 underneath, etc.
    sort_key                integer,
    -- ordering within a form, lower number = higher on page
    form_sort_key           integer,
    -- if there are N forms, starting with 0, to define this object,
    -- on which does this go? (relevant for very complex objects where
    -- you need more than one page to submit)
    form_number             integer,
    -- for full text index
    include_in_ctx_index_p  char(1) check (include_in_ctx_index_p in ('t','f')),
    -- add forms should be prefilled with the default value
    default_value           varchar(200),
    check ((abstract_data_type not in ('user') and oracle_data_type is not null)
           or
           (abstract_data_type in ('user'))),
    unique(table_name,column_name)
);
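To see how rows in an element metadata table like this one can drive code generation, consider the following Python sketch. It takes element rows as dictionaries keyed by the column names used above (column_name, oracle_data_type, sort_key) and emits a CREATE TABLE statement. The function name, the sample rows, and the varchar(4000) types are assumptions for illustration; a real generator would also add the housekeeping columns and read its input from the database.

```python
def create_table_sql(table_name: str, elements: list) -> str:
    """Emit a CREATE TABLE statement from element metadata rows."""
    # sort_key decides column order: 0 on top, 1 underneath, and so on.
    ordered = sorted(elements, key=lambda e: e["sort_key"])
    columns = [f"    {e['column_name']} {e['oracle_data_type']}" for e in ordered]
    return f"create table {table_name} (\n" + ",\n".join(columns) + "\n);"


# Hypothetical metadata rows for the algorithm object type.
algorithm_elements = [
    {"column_name": "high_level_explanation",
     "oracle_data_type": "varchar(4000)", "sort_key": 1},
    {"column_name": "pseudo_code",
     "oracle_data_type": "varchar(4000)", "sort_key": 0},
]

print(create_table_sql("algorithm", algorithm_elements))
```

Because the statement is derived entirely from data, changing the specification means editing rows and rerunning the generator rather than editing SQL by hand.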
Notice that this table allows the users to map an object to any other object in
the system, regardless of type.
For simplicity, assume that associations are bidirectional. Suppose that a
knowledge author associates the Huffman encoding algorithm (used in virtually
every compression scheme, including JPEG) with the person David A. Huffman (1925-1999; an MIT graduate student at the time of his invention, which
was submitted as a term paper). We should also interpret that to mean that the
person David A. Huffman is associated with the algorithm for Huffman encoding. This is why the columns in km_object_object_map have names such as
object_id_a instead of from_object.
In an Oracle database, the primary key constraint above has the side effect of
creating an index that makes it fast to ask the question "What objects are related to Object 17, where Object 17 happens to appear in the A slot?" For efficiency in querying "What objects are related to Object 17, where Object 17
happens to appear in the B slot?" create a concatenated index on the columns
in the reverse order from that of the primary key constraint.
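The two-slot lookup can be tried out without an Oracle installation. The Python sketch below uses the standard sqlite3 module as a stand-in. Only the table name and object_id_a come from the text; the column name object_id_b, the index name, the sample rows, and the reduced column list are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table km_object_object_map (
        object_id_a integer not null,
        object_id_b integer not null,
        primary key (object_id_a, object_id_b)
    );
    -- concatenated index in the reverse order of the primary key,
    -- for lookups where the object sits in the B slot
    create index km_map_b_a_idx
        on km_object_object_map (object_id_b, object_id_a);
    insert into km_object_object_map values (17, 42);
    insert into km_object_object_map values (5, 17);
""")


def related_to(object_id: int) -> list:
    """Return ids of all objects associated with object_id, in either slot."""
    rows = conn.execute(
        """
        select object_id_b from km_object_object_map where object_id_a = ?
        union
        select object_id_a from km_object_object_map where object_id_b = ?
        order by 1
        """,
        (object_id, object_id),
    )
    return [row[0] for row in rows]


print(related_to(17))
```

Each half of the union can be answered from one of the two indices, which is the reason for creating the second one.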
Dimensional Controls
When displaying a long list of information on a page, consider adding dimensional controls to the top. Suppose for example that you wish to help an administrator browse among the registered users of a site. You have a feeling that the
user community will grow too large for the complete list to be useful. You
therefore add an intermediate page with the following options:
• show users who've registered in the last 30 days
• show users from the same geographical area as me (site administration tends
to be divided up by region)
• show users who have contributed more than 5 items
• show users whose content has been rated below C (looking for people who
add a lot of crud to the database)
A well-designed page of this form will have a number placed discreetly next to
each option, showing the number of users who will be displayed if that option
is selected. A poorly designed page will simply leave the administrator guessing
as to how much information will be shown after an option is selected.
This traditional approach has some drawbacks. First, it adds a mouse click
before the administrator can see any user names. Ideally, you want every page
of an application to display information and/or potential actions rather than
pure bureaucracy and navigation. Second, and more seriously, this approach
doesn't scale very well. When an administrator says "I need to see users who've
registered within the last 30 days, who've contributed more than 4 product
reviews, and who've bought at least $100 of stuff so that I can spam them
with a coupon," another option must be added to the list. Eventually the
navigation page groans with choices.
Imagine instead that the very first mouse click takes the administrator to a
page that shows all the users who've registered in the last 30 days, in one big
long list. At the top are sliders. Each slider controls a dimension, each of which
can restrict or expand the number of items in the list. Here are some example
dimensions for a community e-commerce site such as amazon.com:
• recency of registration, from 1 day ago (restrictive) to the beginning of time
(loose)
• geographical proximity, from same postal code (restrictive) to same city to
same state to anywhere in the world (loose)
• total purchases, from at least $10,000 (restrictive) to at least $500 to $0 or
more (loose)
• review activity, from Top 100 Reviewer (restrictive) to Top 1000 Reviewer to
0 or more reviews (loose)
• content quality, from average review rated 4.5 stars or better (restrictive) to
any average
If the default page shows too many names, the administrator will adjust a slider
or two to be more restrictive. If the administrator wants to see more names, he
or she will adjust a slider towards the loose end of that dimension.
How to implement dimensional controls? Sadly, there is no HTML tag that
will generate a little continuous slider. You can simulate a slider by offering, for
each dimension, a set of discrete points along the dimension, each of which is a
simple text hyperlink anchor. For example, for content quality you might offer
"4 or better", "3 or better", "2 or better", "all".
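A simulated slider of this kind can be sketched as a function that renders one dimension as a row of text links. This is an illustration only: the /admin/users URL, the quality parameter name, and the dimension_links helper are hypothetical. Every link carries the current settings of the other dimensions forward, so each click changes one dimension at a time.

```python
from html import escape
from urllib.parse import urlencode

def dimension_links(base_url, params, dimension, options):
    """Render one dimension as text links; the current choice is shown in bold.

    params holds the current setting of every dimension; options is a list of
    (value, label) pairs ordered from restrictive to loose. When the dimension
    is not set yet, the loosest option is treated as current.
    """
    current = params.get(dimension, options[-1][0])
    parts = []
    for value, label in options:
        if value == current:
            parts.append(f"<b>{escape(label)}</b>")
        else:
            query = urlencode({**params, dimension: value})
            href = escape(base_url + "?" + query)
            parts.append(f'<a href="{href}">{escape(label)}</a>')
    return " | ".join(parts)

QUALITY = [("4", "4 or better"), ("3", "3 or better"),
           ("2", "2 or better"), ("all", "all")]
row = dimension_links("/admin/users", {"quality": "3"}, "quality", QUALITY)
```

The page script would then translate each parameter value into a restriction in the query that produces the list.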
• object-summarize
• object-edit-element
• link-add
• one-type-browse
Start by creating an index page in your /km/ directory. At the very least, the
index page should display an unordered list of object types and, next to each
type, options to browse or create. You don't have any information in the
database, so you should build a script called object-create first. This page
will query the metadata tables to build a data entry form to create a single
object of a particular type.
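As a rough sketch of what object-create might do, the function below turns metadata rows into an HTML form. The metadata is passed in as plain dictionaries so that the example stands alone; in a real script the rows would come from a query against the metadata tables. The pretty_name key, the object-create-2 target, and the km_articles table name are hypothetical.

```python
from html import escape

def build_entry_form(table_name, elements):
    """Return an HTML form with one text input per metadata row."""
    lines = [
        '<form method="post" action="object-create-2">',  # hypothetical target
        f'<input type="hidden" name="table_name" value="{escape(table_name)}">',
    ]
    for element in elements:
        name = escape(element["column_name"])
        label = escape(element["pretty_name"])  # hypothetical metadata column
        # add forms are prefilled with the default value, if there is one
        default = escape(element.get("default_value") or "")
        lines.append(
            f'{label}: <input type="text" name="{name}" value="{default}"><br>'
        )
    lines.append('<input type="submit" value="Create">')
    lines.append("</form>")
    return "\n".join(lines)

# Rows as they might come back from a query against the metadata table.
elements = [
    {"column_name": "title", "pretty_name": "Title", "default_value": "untitled"},
    {"column_name": "summary", "pretty_name": "Summary", "default_value": None},
]
form_html = build_entry_form("km_articles", elements)
```

A fuller version would choose the input widget from abstract_data_type and split elements across pages by form_number.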
When your object creation pipeline is done inserting the row into the database,
it should redirect the author's browser to a page where the object is displayed
(name the script object-display if you don't have a better idea).
Presumably, the original author has authority to edit this object and therefore
this page should display small hyperlinks to edit single fields. All of these links
can target the URL object-edit-element with different arguments. The object
display page should also summarize all the currently linked objects and
have an "add link" hyperlink whose target is link-add.
The page returned by link-add will look virtually identical to the index
page, that is, a list of object types. Each object type can be a hyperlink to
a multi-purpose script at one-type-browse. When called with only a
table_name argument, this page will display a table of object names with
dimensional controls at the top. The dimensions should be "mine|everyone's"
and "creation date". The user ought to be able to click on a table header and
sort by that column.
When called with extra arguments, one-type-browse will pass those
arguments through to object-summarize, a script very similar to
object-display, but only showing enough information that the author can positively
identify the object and with the additional ability to accept arguments for a
potential link, for example, table_name_a and object_id_a.
time to ask questions in venues where they can expect answers. If nobody is
answering, nobody will ask, thus leading to a chicken-and-egg problem.
It is important to create an incentive system that rewards users for exhibiting
the desired behavior. At amazon.com, for example, the site owners want users
to write a lot of reader reviews. At the same time, they apparently don't want
to pay people to write reviews. The solution circa 2005 is to recognize
contributors with a "reviewer rank". If a lot of other Amazon users have clicked to
say that they found your reviews useful, you may rise above 1,000 and a "Top
1000 Reviewer" icon appears next to your name. From the home page of
Amazon, navigate to "Friends and favorites" (under "Special Features").
Then, underneath "Explore", click on "Top Reviewers". Notice that some of
the top 10 reviewers have written more than 5,000 reviews, all free of charge
to Amazon!
What makes sense to reward in an online community? We could start with a
couple of obvious activities: content authoring and question answering. Every
night our system could query the content tables and update user ranks according
to how many articles and answers they'd posted into the database. Is it
really a good idea to reward users purely on the basis of volume? Shouldn't we
give more weight to content that has actually helped people? For example,
suppose that there are ten answers to a discussion forum question. It makes sense
to give the maximum reward to the author of the answer that the person asking
the question felt was most valuable. If a question can be marked "urgent" by
the asker, it probably makes sense to give greater rewards to people who answer
urgent questions than non-urgent ones. An article is nice, but an article
that prompts another user to say "I reused this idea in my area of the
organization" is much nicer and should be encouraged with a greater reward.
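One way to express this kind of weighting is a nightly scoring function such as the sketch below. All of the point values and field names are hypothetical; they only illustrate the idea that helpful content should earn more than volume alone.

```python
# Hypothetical point values; choose your own to match the behavior you want.
POINTS = {"article": 5, "answer": 2}
BEST_ANSWER_BONUS = 5   # the asker picked this answer as most valuable
URGENT_BONUS = 3        # the question was marked urgent
REUSE_BONUS = 10        # per user who reported reusing the idea

def contribution_score(contributions):
    """Sum base points plus bonuses over one user's contributions."""
    total = 0
    for c in contributions:
        total += POINTS[c["kind"]]
        if c.get("best_answer"):
            total += BEST_ANSWER_BONUS
        if c.get("urgent"):
            total += URGENT_BONUS
        total += REUSE_BONUS * c.get("reuse_count", 0)
    return total

contributions = [
    {"kind": "article", "reuse_count": 2},
    {"kind": "answer", "best_answer": True, "urgent": True},
    {"kind": "answer"},
]
score = contribution_score(contributions)
```

With these values an article reused twice outweighs several routine answers, which matches the priorities argued for above.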
user_id
-- two columns
object_id
table_name
view_time
reuse_p
);
Figure 15.2
There are minor syntactic differences from the Oracle statement above, but the
structure is the same. A new row is inserted only if no matching rows are found
within the last twenty-four hours.
SQL Server achieves the same isolation level as Oracle (Read Committed),
but in a different way. Instead of creating virtual versions of the database, SQL
Server holds exclusive locks during data-modification operations. In the
example above, Session B's INSERT cannot begin until Session A's INSERT
has completed. Once it is allowed to begin, Session B will see the result of
Session A's insert, and will therefore not insert a duplicate row.
More: See the "Understanding Locking in SQL Server" chapter of SQL
Server Books Online, the Microsoft SQL Server documentation.
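The insert-only-if-no-recent-row structure can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions, not the Oracle or SQL Server statement discussed above: the column types of km_object_views are guesses, the log_view helper is hypothetical, and SQLite's locking differs from both Oracle's and SQL Server's, so the example shows only the shape of the statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column types; only the column names come from the text.
conn.execute("""
    create table km_object_views (
        user_id     integer not null,
        object_id   integer not null,
        table_name  text not null,
        view_time   text not null,   -- UTC, 'YYYY-MM-DD HH:MM:SS'
        reuse_p     text default 'f' check (reuse_p in ('t','f'))
    )""")

def log_view(conn, user_id, object_id, table_name):
    """Insert a view row unless this user viewed this object in the last 24 hours."""
    conn.execute(
        """
        insert into km_object_views (user_id, object_id, table_name, view_time)
        select ?, ?, ?, datetime('now')
         where not exists (
               select 1 from km_object_views
                where user_id = ? and object_id = ? and table_name = ?
                  and view_time > datetime('now', '-1 day'))
        """,
        (user_id, object_id, table_name, user_id, object_id, table_name),
    )
    conn.commit()

log_view(conn, 345, 17, "km_articles")
log_view(conn, 345, 17, "km_articles")   # ignored: viewed within the last day
log_view(conn, 345, 18, "km_articles")   # different object, so a new row
n_rows = conn.execute("select count(*) from km_object_views").fetchone()[0]
```

Because the existence check and the insert are a single statement, the application never has to run a separate query and then decide whether to insert.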
Whenever you are performing logging, it is considerate to do it on the
server's time, not the user's. In many Web development environments, you
can do this by calling an API procedure that will close the TCP connection
to the user, which stops the upper-right browser corner icon from spinning/
waving. Meanwhile your thread (IIS, AOLserver, Apache 2) or process
(Apache 1.x) is still alive on the server and can run whatever code is necessary
to perform the logging. Many Web servers allow you to define filters that run
after the delivery of a page to the user.
Help with date/time arithmetic: see the "Dates" chapter of SQL for Web
Nerds at http://philip.greenspun.com/sql/dates.
Write up your solutions to these non-coding exercises either in your km module
overview document or in a file named metadata-exercises in the same
directory.
pull some of the generated code into a text editor and change it by hand?
Absolutely! The point of using metadata is to tackle extreme requirements and get
a prototype in front of real users as quickly as possible. Don't feel like a failure
because you haven't solved the fifty-year-old research problem of automating
programming altogether.
16
This chapter looks at ways that you can monitor user activity within your
community and how that information can be used to personalize a user's
experience.
Chapter 16
(answer leads to action: buy more ads from the place that sends high-profit
users)
147.102.16.28 - - [06/Mar/2003:09:12:31 -0500] "GET /wtr/
application-servers.html HTTP/1.1" 200 0 "http://www.google.com/
search?q=application+servers&ie=ISO-8859-7&hl=el&lr=" "Mozilla/4.0
(compatible; MSIE 5.01; Windows NT)"
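A log line in this format can be pulled apart with a regular expression. The sketch below assumes the common "combined" access log layout (host, identity, user, timestamp, request, status, bytes, referrer, user agent) and treats the entry above as a single unwrapped line; the pattern and variable names are illustrative, not part of any server's API.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed layout: host ident user [time] "request" status bytes "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# The sample entry, joined back into the single line the server wrote.
line = (
    '147.102.16.28 - - [06/Mar/2003:09:12:31 -0500] '
    '"GET /wtr/application-servers.html HTTP/1.1" 200 0 '
    '"http://www.google.com/search?q=application+servers&ie=ISO-8859-7&hl=el&lr=" '
    '"Mozilla/4.0 (compatible; MSIE 5.01; Windows NT)"'
)

hit = LOG_PATTERN.match(line).groupdict()
referer = urlparse(hit["referer"])
# The q form variable in the referring URL holds what the user searched for.
search_terms = parse_qs(referer.query).get("q", [""])[0]
```

Here hit["path"] is the page requested, referer.netloc is the referring site, and search_terms is the query that brought the reader in.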
• augment your software to log additional user activity into the RDBMS and
construct ad hoc query pages in the site administrator area of the service
• construct a full dimensional data warehouse of user activity
If all that you need is the user ID for every request, it is often a simple matter
to configure the HTTP server program, for example, Apache or Microsoft
Internet Information Server, to append the contents of the entire cookie header
or just one named cookie to each line in the access log.
When that isn't sufficient, you can start adding columns to database
tables. In a sense you've already started this process. You probably have a
registration_date column in your users table, for example. This information
could be derived from the access logs, but if you need it to show a "member
since 2001" annotation as part of their user profile, it makes more sense to
keep it in the RDBMS. If you want to offer members a page of "new items
since your last visit" you'll probably add last_login and
second_to_last_login columns to the users table. Note that you need
second_to_last_login because as soon as User 345 returns to the site your
software will update last_login. When he or she clicks the "new since last
visit" page, it might be only thirty seconds since the timestamp in the
last_login column. What User 345 will more likely expect is new content since
the preceding Monday, his or her previous session with the service.
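The rotation of the two login columns fits in a single UPDATE, sketched here with Python's built-in sqlite3 module. The table is reduced to the columns under discussion, and the record_login helper and the sample timestamps are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table users (
        user_id               integer primary key,
        last_login            text,
        second_to_last_login  text
    )""")

def record_login(conn, user_id, now):
    """Call once at the start of a session, not on every page request."""
    # Within one UPDATE the right-hand sides see the old row values, so
    # second_to_last_login receives the previous last_login.
    conn.execute(
        """
        update users
           set second_to_last_login = last_login,
               last_login = ?
         where user_id = ?
        """,
        (now, user_id),
    )
    conn.commit()

conn.execute("insert into users values (345, '2005-03-07 10:00:00', null)")
record_login(conn, 345, "2005-03-14 09:00:00")
row = conn.execute(
    "select last_login, second_to_last_login from users where user_id = 345"
).fetchone()
```

The "new since last visit" page then compares content dates with second_to_last_login rather than last_login.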
Suppose the marketing department starts running ad campaigns on ten different
sites with the goal of attracting new members. They'll want a report of how
many people registered who came from each of those ten foreign sites. Each ad
would be a hyperlink to an encoded URL on your server. This would set a
session cookie saying "source=nytimes" ("I came from an ad on the New York
Times Web site"). If that person eventually registered as a member, the token
"nytimes" would be written into a source column in the users table. After a
month you'll be asked to write an admin page querying the database and
displaying a histogram of registration by day, by month, by source, and so forth.
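The query behind such an admin page might look like the sketch below, run here against SQLite from Python so that it is self-contained. The users table is reduced to the relevant columns, and the sample rows and source tokens other than nytimes are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table users (
        user_id            integer primary key,
        registration_date  text not null,   -- 'YYYY-MM-DD'
        source             text              -- null when no ad was followed
    )""")
conn.executemany(
    "insert into users (registration_date, source) values (?, ?)",
    [
        ("2005-03-01", "nytimes"),
        ("2005-03-02", "nytimes"),
        ("2005-03-05", "slashdot"),   # made-up source token
        ("2005-04-01", None),
    ],
)

# Registrations by month and by source; swap the strftime format to '%Y-%m-%d'
# for a by-day report.
REPORT = """
select coalesce(source, '(none)') as src,
       strftime('%Y-%m', registration_date) as month,
       count(*) as n
  from users
 group by src, month
 order by month, n desc
"""
report = conn.execute(REPORT).fetchall()
```

Each tuple in report is (source, month, number of registrations), ready to be drawn as one bar of the histogram.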
The road of adding columns to transaction-processing tables and building
ad hoc SQL queries to answer questions is a long and tortuous one. The
traditional way back to a manageable information system with users getting
the answers they need is the dimensional data warehouse, discussed at some
length in the data warehousing chapter of SQL for Web Nerds at
http://philip.greenspun.com/sql/data-warehousing. A data warehouse is a heavily
denormalized copy of the information in the transaction-processing tables, arranged
so as to facilitate queries rather than updates.
The exercises in this chapter will walk you through these three alternatives,
each of which has its place.
be adequate. It might be worth building error notification into the software
itself. Serious errors can be caught and the error handler can call a
notify_the_maintainers procedure that sends email. This might be worth including,
for example, in a centralized facility that allows page scripts to connect to the
relational database management system (RDBMS). If the RDBMS is unavailable,
the sysadmins, dbadmins, and programmers ought to be notified immediately
so that they can figure out what went wrong and bring the system back
up.
Suppose that an RDBMS failure were combined with a naive implementation
of notify_the_maintainers on a site that gets 10 requests per second.
Suppose further that all of the people on the email notification list have gone
out for lunch together for one hour. Upon their return, they will find
60 × 60 × 10 = 36,000 identical email messages in their inbox.
To avoid this kind of debacle, it is probably best to have
notify_the_maintainers record a last_notification_sent timestamp in the HTTP
server's memory or on disk and use it to ignore or accumulate requests for
notification that come in, say, within 15 minutes of a previous request. A
reasonable assumption is that a programmer, once alerted, will visit the server and
start looking at the full error logs. Thus notify_the_maintainers need not
actually send out information about every problem encountered.
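A throttled notify_the_maintainers can be sketched as a small class that keeps last_notification_sent in memory. The class name, the injected send function, and the injected clock are assumptions made so that the example can run without a mail server; the sketch is not thread-safe, so a multi-threaded server would need a lock around the check.

```python
import time

class MaintainerNotifier:
    """Send at most one notification per quiet period; count the rest."""

    def __init__(self, send, quiet_period=15 * 60, clock=time.time):
        self.send = send                    # e.g. a function that sends email
        self.quiet_period = quiet_period    # seconds between notifications
        self.clock = clock
        self.last_notification_sent = None
        self.suppressed = 0

    def notify_the_maintainers(self, message):
        now = self.clock()
        recently_sent = (
            self.last_notification_sent is not None
            and now - self.last_notification_sent < self.quiet_period
        )
        if recently_sent:
            # Accumulate instead of sending; the full story is in the error log.
            self.suppressed += 1
            return False
        if self.suppressed:
            message += f"\n({self.suppressed} further errors since the last notification)"
        self.send(message)
        self.last_notification_sent = now
        self.suppressed = 0
        return True

# Simulate the lunch hour: 10 failing requests per second for 60 minutes.
sent = []
fake_now = [0.0]
notifier = MaintainerNotifier(sent.append, clock=lambda: fake_now[0])
for i in range(36000):
    fake_now[0] = i / 10
    notifier.notify_the_maintainers("RDBMS unavailable")
```

In this simulation the maintainers return to 4 messages rather than 36,000, and each later message says how many errors were folded into it.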
warehousing for inspiration. The resulting data model should be able to answer
the questions put forth by your client in Exercise 3.
The biggest design decision that you'll face during this exercise is the
granularity of the fact table. If you're interested in how users get from page to page
within a site, the granularity of the fact table must be "one request". On a site
such as the national "don't call me" registry, www.donotcall.gov, launched in
2003, one would expect a person to visit only once. Therefore the user activity
data warehouse might store just one row per registered user, summarizing their
appearance at the site and completion of registration, a fact table granularity of
"one user". For many services, an intermediate granularity of "one session"
will be appropriate.
With a "one session" granularity and appropriate dimensions it is possible
to ask questions such as "What percentage of the sessions were initiated in
response to an ad at Google.com?" (source field added to the fact table),
"Compare the likelihood that a purchase was made by users on their fourth
versus fifth sessions with the service" (nth-session field added to the fact table),
and "Compare the value of purchases made in sessions by foreign versus
domestic customers" (purchase amount field added to the fact table plus a
customer dimension).
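A one-session fact table and two of these questions can be sketched with SQLite from Python. The table name, column names, and sample rows are all hypothetical, and the customer dimension is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical fact table with one row per session.
conn.execute("""
    create table session_facts (
        session_id       integer primary key,
        user_id          integer not null,
        source           text,              -- where the session was initiated
        nth_session      integer not null,  -- 1 for a user's first session
        purchase_amount  real default 0
    )""")
conn.executemany(
    "insert into session_facts values (?, ?, ?, ?, ?)",
    [
        (1, 10, "google", 1, 0),
        (2, 10, "direct", 2, 25.0),
        (3, 11, "google", 1, 0),
        (4, 12, "direct", 4, 40.0),
        (5, 13, "direct", 5, 0),
    ],
)

# What percentage of the sessions were initiated from an ad at Google?
pct_google = conn.execute(
    "select 100.0 * sum(source = 'google') / count(*) from session_facts"
).fetchone()[0]

# Fraction of sessions with a purchase, fourth versus fifth session.
purchase_rate = conn.execute(
    """
    select nth_session, avg(purchase_amount > 0)
      from session_facts
     where nth_session in (4, 5)
     group by nth_session
     order by nth_session
    """
).fetchall()
```

Both questions are answered by a single scan of the fact table with no joins, which is the point of choosing the granularity and the added fields up front.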
More
• www.analog.cx (download the analog Web server log analyzer)
• http://www.microsoft.com/technet/scriptcenter/tools/logparser/ (Microsoft
Log Parser)
• www.cygwin.com (standard Unix tools for Windows)
17
Writeup
Do you believe that the world owes you attention? If not, why do you think
that anyone is going to spend thirty minutes surfing around the community
that you've built in order to find the most interesting features? In any case, if
much of your engineering success is embodied in administration pages, how
would someone without admin privileges ever see them?
In code reviews at the beginning of this class, we often find students
producing source code files without attribution ("I know who wrote it") and Web
pages without email signatures ("nobody is actually going to use this").
Maimonides's commentary on Hillel's quote above is that a person acquires habits
of doing right or wrong (virtues and vices) while young; youths should do
good deeds now, and not wait until adulthood. That is, if you don't take steps
to help other users and programmers now, as a university student, there is
no reason to believe that you'll develop habits of virtue post-graduation. An
alternative way of thinking about this is to ask yourself how you feel when
you're stuck trying to use someone else's Web page and there is no clear way
to send feedback or get help, or how much fun it is to be reading the source
code for an application and not have any idea who wrote it, why, or where to
ask questions. Continuing the Talmudic theme of the chapter, keep in mind
Hillel's response to a gentile interested in Judaism: "That which is hateful to
you, do not do to your neighbor. That is the whole Torah; the rest is
commentary. Go and study it."
A comment header at the top of every source code file and an email address
at the bottom of every page. That's a good start toward building a professional
reputation. But it isn't enough. For every computer application that you build,
you ought to prepare an overview document. This will be a single HTML page
containing linear text that can be read simply by scrolling, that is, the reader
need not follow any hyperlinks in order to understand your achievement. It is
probably reasonable to expect that you can hold the average person's attention
for four or five screens' worth of text and illustrations. What to include in the
overview illustrations? In-line images of Web or mobile browser screens that
display the application's capabilities. If the application supports a complex
workflow, perhaps a graphic showing all the states and transitions.
Here are some examples done by folks just like yourself:
• any of the reports in the 6.171 Project Galleries at
http://philip.greenspun.com/seia/gallery/spring2002/ and
http://philip.greenspun.com/seia/gallery/fall2003/
If you don't invest some time in writing (prose, not code), however, you'll
never have any reputation outside your immediate circle of colleagues, who
themselves may end up working at McDonald's and be unable to help you get
an engineering job during a recession.
Exercise 1
Prepare an overview document for the application that you built this semester.
Place the document at /doc/overview on your server.
Try to make sure that your audience can stop reading at any point and still
get a complete picture. Thus the first paragraph or two should say what you've
built and why it is important to this group of users. This introduction should
say a little something about the community for whom the application has
been built and why they can't simply get together in the same room at the
same time.
It is probably worth concentrating on screen shots that illustrate your
application's unique and surprising features. Things such as stand-alone discussion
forums or full-text search pages can be described in a single bullet item or
sentence and easily imagined by the reader.
If you find that your screen shots aren't very compelling and that it takes 5
or 6 screen shots to tell a story, consider redesigning some of your pages! If it
makes sense to see all the site's most important features and information on
one screen in your overview document, it probably makes sense for the
everyday users of the site to see them on one screen as well.
You have two basic options for structure. If it is more or less obvious how
people are going to use the service, you might be able to get away with the
Laundry List Structure: list the features of the application, punctuated by
screen shots. In general, however, the Day-in-the-Life Structure is probably
more compelling and understandable. Here you walk through a scenario of
how several users might come to the application and accomplish tasks. For
example, on a photo critique site you might show the following:
1. Schlomo Mendelssohn uploads his latest photograph of his dog (screen shot
of photo upload page)
2. Winston Wu views a page of the most recently submitted photos and picks
Schlomo's
3. Winston uploads a comment on Schlomo's photo, attaching an edited
version of the photo (screen shot of the "attach a file to your comment" page)
4. Schlomo checks in from his mobile phone's browser to see who has critiqued
his photo
5. Winona Horowitz calls in from a friend's telephone and finds out from the
VoiceXML interface that a lot of new content has been posted in the last 24
hours
6. Winona goes home to a Web browser and visits the administration page and
deletes a duplicate posting and three off-topic posts (screen shot of the "all
recently uploaded content" page)
7. . . .
You can work in all of the site's important features in such a scenario, while
giving the reader an idea of how these features are relevant to user and
administrator goals.
Note how the example above works in the mobile and VoiceXML interfaces
of the site. All of your readers will have used Web sites before, but mobile and
VoiceXML are relative novelties.
Most of their fellow physicians would agree that Surgeon 3 is the most professional doctor of the group. Surgeon 3 has practiced at the state of the art,
improved the state of the art, and taught others how to improve their skills. Is
there a way for a programmer to excel along these dimensions?
the programmer was giving up the opportunity to work at the state of the art as
well as innovate and teach.
2. Staying lean on the sales, account management, user interface, and user
experience specialists; a programming team was in direct contact with the
Internet service operator and oftentimes with end-users. Our programmers had a
lot of control over and responsibility for the end-user experience.
3. Hiring good people and paying them well; it is only possible to build a
high-quality system if one has high-quality colleagues. Despite a tough late 1990s
recruiting market, we limited ourselves to hiring people who had
demonstrated an ability to produce high-quality code on a trio of problem sets
(originally developed for this course's predecessor at MIT).
4. Giving little respect to our old code and not striving for compatibility with too
many substrate systems; we let our programmers build their professional
reputation for innovation rather than become embroiled in worrying about
whether a new thing will inconvenience legacy users (we had support
contracts for them) or how to make sure that new code works on every brand
of RDBMS.
5. Having a strict open-source software policy; reusable code was documented
and open-sourced in the hope that it would aid other programmers worldwide.
6. Dragging people out to writing retreats; most programmers say that they
can't write, but experience shows that people's writing skills improve
dramatically if only they will practice writing. We had a beach house near our
headquarters and dragged people out for long weekends to finish writing
projects with help from other programmers who were completing their own
writing projects.
7. Establishing our own university, assistant teaching at existing universities,
and mentoring within our offices; a lot of Ph.D. computer scientists are
reluctant to leave academia because they won't be able to teach. But we
started our own one-year post-baccalaureate program teaching the normal
undergraduate computer science curriculum, and we were happy to pay a
developer to spend a month there teaching a course. We encouraged our
developers to volunteer as teaching assistants or lecturers at universities
near our offices. We insisted that senior developers review junior developers'
code internally.
How did it work out? Adhering to these principles, we built a profitable
business with $20 million in annual revenue. Being engineers rather than business
people we thought we were being smart by turning the company over to profes-
Exercise 2
Write down your own definition of software engineering professionalism.
Explain how you would put it into practice and how you could build a sustainable
organization to support that definition.
Final Presentation
In any course using this textbook, we suggest allocating 20 minutes of class
time at the end of the course, per project, for a final presentation to a panel
of outsiders. Each team then has an opportunity to polish its presentation skills
in front of an audience of decision-makers, as distinct from the audience of
technical peers that have listened to earlier in-class presentations.
Young engineers need practice in convincing people with money to write
checks that will fund their work. Consequently, the best panelists are people
who, in their daily lives, listen to proposals from technical people and decide
whether or not to write checks. Examples of such people include executives at
large companies and venture capitalists.
We suggest the following format for each presentation:
1. "elevator pitch", a 30-second explanation of what problem has been
solved and why the system is better than existing mechanisms available to
people
2. demo of the completed system (see the Content Management chapter for
some tips on making crisp demonstrations of multi-user applications) (5
minutes; make it clear whether or not the system has been publicly launched)
3. a slide showing system architecture and what components were used to build
the system (1 minute)
4. discussion of the toughest technical challenges faced during the project and
how they were addressed (2 minutes; possibly additional slides)
5. tour of documentation (2 minutes): you want to convince the audience that
there is enough for long-term maintenance
6. the future (1 minute): what are the next milestones? Who is carrying on the
work?
Total time: 12 minutes max.
Notice that the technical stuff is at the end. Nobody cares about technology
until they've seen what problem has been solved.
You need to distinguish your application from packaged software and other
systems that the panelists expect are easily available. Don't spend five minutes
showing a discussion forum, for example. Every panelist will have seen that.
Show one page of the forum, explain that there is a forum, that there are
several levels of moderator and privacy, and then move on to what is unique
about what you've built. After one presentation, a panelist said "Everything
that you showed is built into Microsoft Sharepoint." A venture capitalist on
the panel chimed in "If at any time during a pitch someone points out that
there is a Microsoft product that solves the same problem, the meeting is
over."
At the same time, unless you're being totally innovative, a good place to start
is by framing your achievement in terms of something that the audience is
already familiar with, for example, Yahoo! Groups or generic online community
toolkits, and then talk about what is different. You don't want the
decision-maker to think to herself "Hey, I think I've seen this before in Microsoft
Sharepoint" and have that thought in her head unaddressed the whole time.
Decision-makers often bring senior engineers with them to attend
presentations, and these folks can get stuck on personal leitmotifs. Suppose Joe Panelist
chose to build his last project by generating XML from the database and then
turning that into HTML via some expensive industry-leading middleware and
XSLT, plus lots of Java and Enterprise Java Beans. This approach probably
consumes 100 times more server resources than using Microsoft Visual Basic
in Active Server Pages or a Perl script from 1993, but it is arguably cleaner
and more modern. After a 12-minute presentation, no listener could have
learned enough to say for sure that a project would have benefited from the
XML/XSLT approach, but out he comes with the challenge. You could call
him a pinhead because he doesn't know enough about your client and the
original goals, for example, not having to buy a 100-CPU server farm to support a
small community. You could demonstrate that he is a pinhead by pointing out
large and successful applications that use a similar architecture to what you've
chosen. But as a junior engineer these probably aren't the best ways to handle
unfair or incorrect criticism from a senior engineer at a meeting, especially if
that person has been brought along by the decision-maker. It is much better
to flatter this person by asking them to schedule a 30-minute meeting where
you can really discuss the issue. Use that 30-minute meeting to show why you
designed the thing the way that you did initially. You might turn the senior
engineer around to your way of thinking. At the very least, you won't be arguing
in front of the decision-maker or appearing to be arrogant/overconfident.
To the Panelists
Imagine that each student team was hired by your predecessor. You're trying to
figure out what they did, whether to fund the next version, and, if so, whether
this is the right team to build and launch that next version.
As a presentation proceeds, write down numerical scores (1–10) for how well
a team has done at the following:
• This team has communicated clearly what problem they've solved.
• The demo gave me a good feeling for how the system works.
• This team has done an impressive job tackling engineering challenges.
• This team has documented their system clearly and thoroughly.
• I'd really like to hire these people for my own organization.
Following a team's 12-minute presentation, tell them what they could have
done better.
Don't be shy about interrupting with short questions during a team's
presentation. If the presentation were from one of your subordinates or a startup
company asking for funds and you'd interrupt them, then interrupt our
students.
Parting Words
Work on something that excites you enough that you want to work 24/7 on it.
Become an expert on data model and page flow. Build some great systems by
yourself and link to their overview documents from your resume; be able to
say "I built X" or "Susan and I built X" rather than "I built a piece of X as
part of a huge team".
More
• 6.171 Project Gallery, Spring 2002 at
http://philip.greenspun.com/seia/gallery/spring2002/
• 6.171 Project Gallery, Fall 2003 at
http://philip.greenspun.com/seia/gallery/fall2003/
Reference Chapters
HTML
Typical Rendering
<p>
Don't look at your instruments
and adjust the flight controls
to, for example, keep the
altimeter steady. The
instruments have a tendency to
<b>lag behind reality</b> and
therefore you're overcorrecting
and oscillating.
</p>
HTML consists of tags, such as <p>, interspersed with plain text. The <p>
tag begins a paragraph; </p> ends the paragraph. Similarly, <b> starts text
emboldening and </b> ends it.
Basics
In HTML, almost every opening tag has a closing tag, as in the example above.
There are a few exceptions, which we will encounter shortly, but the overwhelming majority of tags must be closed.
Reference Chapter A
Some tags have attributes, such as the face attribute of the <font> tag.
Example:
<font face=arial>
Logical Markup
HTML has two kinds of markup: logical markup and physical markup. Physical
markup, such as the bold (<b>) tag, specifies how the browser is supposed to
render text. In contrast, logical markup, or semantic tags, specifies something
about the meaning of what is being marked up; the browser is free to choose a
rendering that is sensible for the user's hardware, for example, italics might be
a good choice on a desktop PC, but reverse video might work better on a
low-resolution mobile phone.
Here are a few examples of semantic tags:
Tag                      Code Example            Typical Rendering
Emphasis <em>
Strong <strong>
Code <code>
Headline Level 1 <h1>    <h1>Flight Plan</h1>    Flight Plan
Headline Level 2 <h2>    <h2>Flight Plan</h2>    Flight Plan
Headline Level 3 <h3>    <h3>Flight Plan</h3>    Flight Plan
Headline Level 4 <h4>    <h4>Flight Plan</h4>    Flight Plan
Headline Level 5 <h5>    <h5>Flight Plan</h5>    Flight Plan
Headline Level 6 <h6>    <h6>Flight Plan</h6>    Flight Plan
Physical Markup
Here are some common physical markup tags and attributes:
Tag                      Code Example            Typical Rendering
Bold <b>
Italics <i>
Underline <u>
Note: Generally its best to avoid the <u> tag; underlining should be reserved for
hyperlinks.
Superscript <sup>
    Code example: Avogadro's number is approximately equal to 6.022 × 10<sup>23</sup>
    Typical rendering: Avogadro's number is approximately equal to 6.022 × 10²³
Subscript <sub>
    Code example: log<sub>e</sub>x
    Typical rendering: logₑ x
Font Size <font size=...>
    Code example: I want a <font size=+2>huge</font> house, a <font size=+1>big</font> dog, and a <font size=-1>small</font> waist.
    Typical rendering: I want a huge house, a big dog, and a small waist.
Font Color <font color=...>
    Code example: An airplane's navigation lights are <font color=green>green</font> on the right wing and <font color="#ff0000">red</font> on the left.
Font Face <font face=...>
Typewriter Text <tt>
Preformatted Text <pre>
Blockquote <blockquote>

       3000     6000     9000
BUF    0517     0215+01  3306-01
BOS    2218     2325+08  2321+03
ACK    2118     2012+08  1917+03
It's generally considered more tasteful to use logical markup instead of physical markup. It has become especially important now that there is such a wide
variety of devices on which to browse Web sites, for example, mobile phones
and handheld devices. A phone might ignore <font size> tags, but it will
probably try to make headlines (<h1>) stand out.
Hyperlinks
Hyperlinks, often just called links, allow the user to jump to a new page or a
new location within the same page. Hyperlinks are generally represented by
blue, underlined text. Although it is possible to change how hyperlinks appear
to the user, we recommend against it; users expect a consistent user interface
for Web pages.
An absolute link is a hyperlink that specifies the full URL of the destination.
Example:
<a href="http://aviationweather.gov/">aviationweather.gov</a>
Note that if you want to link to another location within the same file you can
omit the file name, for example, <a href="#DNS">DNS</a>.
You will often see a question mark followed by form variables at the end of
a URL; this is called the query string. For example,
<a href="http://groups.google.com/groups?hl=fr&group=rec.aviation.student">rec.aviation.student newsgroup</a>
The variables in this query string are hl (headline language?) and group. Most
Web programming APIs provide convenient facilities for reading the values of
query string variables.
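As a rough illustration of such a facility, here is a minimal sketch in Python using the standard library's urllib.parse module. The helper name query_variables is our own invention for this example; other languages and Web frameworks offer comparable functions under different names.

```python
from urllib.parse import urlsplit, parse_qs


def query_variables(url):
    """Return the query string variables of a URL as a dict.

    Each variable maps to a list of values, because a name may
    legally appear more than once in a query string.
    """
    return parse_qs(urlsplit(url).query)


url = "http://groups.google.com/groups?hl=fr&group=rec.aviation.student"
variables = query_variables(url)
print(variables["hl"])     # ['fr']
print(variables["group"])  # ['rec.aviation.student']
```

Note that each value comes back as a list; a script that expects a single value would take the first element.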
Breaks
All whitespace is treated equally in HTML, meaning that spaces, tabs, and
linebreaks are all rendered as single spaces. To force a newline to occur, you
need to use a tag.
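The whitespace rule can be approximated in a few lines of Python. This is only a sketch of the idea, not how any browser is implemented: the function name collapse_whitespace is hypothetical, it also trims leading and trailing whitespace, and it ignores special cases such as preformatted text.

```python
def collapse_whitespace(text):
    """Collapse every run of spaces, tabs, and line breaks into one space."""
    # str.split() with no arguments splits on any run of whitespace,
    # so rejoining with single spaces mimics the rule described above.
    return " ".join(text.split())


source = "Carson's Plumbing\n123   Main St.\n\tSeattle, WA 98101"
print(collapse_whitespace(source))
# Carson's Plumbing 123 Main St. Seattle, WA 98101
```

The three-line address comes out as a single line, which is why an explicit break tag is needed to keep the lines apart.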
Here are some common breaks:
Paragraph <p>
    Code Example:
        <p>
        "I'll be seeing you,"
        he said.
        </p>
        <p>
        Then he walked away.
        </p>
Line Break <br>
    Code Example:
        Carson's Plumbing<br>
        123 Main St.<br>
        Seattle, WA 98101
    Typical Rendering:
        Carson's Plumbing
        123 Main St.
        Seattle, WA 98101
Horizontal Rule <hr>
    Typical Rendering:
        The End
Notice that <br> and <hr> have no closing tags. Additionally, the </p> tag
is optional; the browser assumes that, when it encounters a new <p> tag, the
old paragraph has ended.
Lists
The most common types of lists are ordered lists, in which the browser places a
number before each list item, and unordered lists, which appear as a series of
bulleted items. You can also create definition lists, useful for online dictionaries
or glossaries.
Ordered List <ol>
    Code Example:
        <ol>
        <li>rations for each
        occupant
        <li>one axe or hatchet
        <li>one first aid kit
        </ol>
    Code Example:
        Common training airplanes:
        <ol type=A>
        <li>Cessna 172
        <li>Diamond DA20
        <li>Piper Tomahawk
        </ol>
    Code Example:
        Class B VFR Weather Minimums:
        <ol type=i>
        <li>3 statute miles
        visibility
        <li>clear of clouds
        </ol>
Definition List <dl>
    Code Example:
        <dl>
        <dt>IFR
        <dd>Instrument Flight
        Rules
        <dt>VFR
        <dd>Visual Flight Rules
        <dt>VOR
        <dd>Very High Frequency Omni
        Ranging radio navigation
        beacon
        </dl>
    Typical Rendering:
        IFR
            Instrument Flight Rules
        VFR
            Visual Flight Rules
        VOR
            Very High Frequency Omni Ranging radio navigation beacon
Images
Images are stored as separate files, not as part of the HTML page. An image can
be included in a page as follows:
<img src="http://www.eveandersson.com/alex.jpg">
This tag instructs the user's browser to make a new request for the image, possibly to a different server than the one from which the HTML document was obtained.
There are many optional attributes for images. The most important are the
width and height attributes; if you tell the browser the size of the image, it
can render the entire Web page, leaving space for the image, before it has
downloaded the image file itself.
Attribute
Dimensions                             width/height
Border                                 border
Alignment                              align
Horizontal Space (on both sides)       hspace
Vertical Space (top and bottom)        vspace
Tables
Here are the tags used when creating HTML tables:
<table>, </table>    table
<tr>, </tr>          table row
<td>, </td>          table cell
<th>, </th>          table header cell
Many of these tags can have attributes, for example, to specify alignment, borders, cell spacing and padding, and background colors. Examples:
Code Example
<table border=2
cellspacing=5
cellpadding=5>
<tr>
<th>Year</th>
<th>Revenue</th>
<th>Expenditures</th>
<th>Profits</th>
</tr>
<tr>
<td>1999</td>
<td>$58,295</td>
<td>$73,688</td>
<td>$(15,393)</td>
</tr>
<tr>
<td>2000</td>
<td>$902,995</td>
<td>$145,400</td>
<td>$757,595</td>
</tr>
</table>
<!-- reduce the cellspacing and right-align
the text in the cells -->
<table border=2
cellspacing=2
cellpadding=5>
<tr>
<th>Year</th>
<th>Revenue</th>
<th>Expenditures</th>
<th>Profits</th>
</tr>
<tr>
<td>1999</td>
<td align=right>
$58,295</td>
<td align=right>
$73,688</td>
<td align=right>
$(15,393)</td>
</tr>
<tr>
<td>2000</td>
<td align=right>
$902,995</td>
<td align=right>
$145,400</td>
<td align=right>
$757,595</td>
</tr>
</table>
<!-- remove the
border -->
<table border=0
cellspacing=2
cellpadding=5>
<tr>
<th>Year</th>
<th>Revenue</th>
<th>Expenditures</th>
<th>Profits</th>
</tr>
<tr>
<td>1999</td>
<td>$58,295</td>
<td>$73,688</td>
<td>$(15,393)</td>
</tr>
<tr>
<td>2000</td>
<td>$902,995</td>
<td>$145,400</td>
<td>$757,595</td>
</tr>
</table>
<!-- shade every
other row -->
<table border=0
cellspacing=2
cellpadding=5>
<tr bgcolor="#cecece">
<th>Year</th>
<th>Revenue</th>
<th>Expenditures</th>
<th>Profits</th>
</tr>
<tr bgcolor=white>
<td>1999</td>
<td>$58,295</td>
<td>$73,688</td>
<td>$(15,393)</td>
</tr>
<tr bgcolor="#cecece">
<td>2000</td>
<td>$902,995</td>
<td>$145,400</td>
<td>$757,595</td>
</tr>
</table>
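On a database-backed site, table markup like this is usually generated by a program rather than typed by hand. The following Python sketch shows one way a script might emit a table with every other row shaded. The function name shaded_table and its parameters are hypothetical, chosen for this example only; the attribute values are taken from the table above.

```python
from html import escape


def shaded_table(headers, rows, shade="#cecece", plain="white"):
    """Return HTML for a borderless table with alternating row colors."""
    lines = ["<table border=0 cellspacing=2 cellpadding=5>"]
    # The heading row is shaded, as in the hand-written example.
    lines.append(f'<tr bgcolor="{shade}">')
    lines.extend(f"<th>{escape(h)}</th>" for h in headers)
    lines.append("</tr>")
    for i, row in enumerate(rows):
        # Even-numbered data rows are plain, odd-numbered ones shaded.
        color = plain if i % 2 == 0 else shade
        lines.append(f'<tr bgcolor="{color}">')
        lines.extend(f"<td>{escape(str(cell))}</td>" for cell in row)
        lines.append("</tr>")
    lines.append("</table>")
    return "\n".join(lines)


print(shaded_table(
    ["Year", "Revenue", "Expenditures", "Profits"],
    [["1999", "$58,295", "$73,688", "$(15,393)"],
     ["2000", "$902,995", "$145,400", "$757,595"]],
))
```

Passing cell text through escape keeps characters such as < and & in the data from being interpreted as markup.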
Forms
To collect data from users, use the form tag:
<form method=POST action=/register/new>
The action is the URL to which the form is submitted, which may correspond
to a computer program in the server file system, for example, a Java Server
Page, a PHP or Perl script, and so forth.
The form's method can be either GET or POST. The main visible difference is that,
with method=GET, the variables that the user submits are presented in the
query string of the following page's URL, whereas with method=POST they travel
in the body of the request. GET is useful if you want the user
to be able to bookmark the resulting page. However, if the user is expected to
enter long strings of data, method=POST is more appropriate because some old
browsers only handle query strings containing fewer than 256 characters (newer
browsers can handle a few thousand). Note further that if you use the GET
method, the form variable values will appear in the server access log and could
create a security or privacy risk.
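To make the GET/POST distinction concrete, here is a small Python sketch of where the encoded form variables end up for each method. It is a simplification under stated assumptions: build_request is a hypothetical helper, not a real browser or library API, and it ignores headers and everything else in an actual HTTP request.

```python
from urllib.parse import urlencode


def build_request(method, action, variables):
    """Return (url, body) roughly as a browser would submit a form.

    With GET the encoded variables are appended to the URL as a
    query string; with POST they are sent in the request body.
    """
    encoded = urlencode(variables, doseq=True)
    if method.upper() == "GET":
        return f"{action}?{encoded}", None
    return action, encoded


print(build_request("GET", "/search", {"q": "DA40 review"}))
# ('/search?q=DA40+review', None)
print(build_request("POST", "/register/new",
                    {"user_id": 2205, "interest": ["IFR", "seaplanes"]}))
# ('/register/new', 'user_id=2205&interest=IFR&interest=seaplanes')
```

Only the GET URL carries the user's data, which is why it can be bookmarked and why it shows up in the server access log.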
Code Example
<form method=POST action=/survey/demographic>
<input type=hidden name=user_id value=2205>
Age: <input type=text name=age size=2><br>
Sex: <input type=radio name=sex value=m>male
<input type=radio name=sex value=f>female<br>
What are you interested in (check all that apply)?
<input type=checkbox name=interest value="aerobatics">Aerobatics
<input type=checkbox name=interest value="helicopters">Helicopters
<input type=checkbox name=interest value="IFR">IFR
<input type=checkbox name=interest value="seaplanes">Seaplanes
<br>
Where do you live?
<select name=continent_live>
<option value=north_america>North America
<option value=south_america>South America
<option value=africa>Africa
<option value=europe>Europe
<option value=asia>Asia
<option value=australia>Australia
</select>
<br>
Which continents have you visited?<br>
<select multiple size=3 name=continent_visited>
<option value=north_america>North America
<option value=south_america>South America
<option value=africa>Africa
<option value=europe>Europe
<option value=asia>Asia
<option value=australia>Australia
</select>
<br>
Describe your favorite airplane trip:<br>
<textarea name=favorite_trip_story rows=5 cols=50></textarea>
<p>
<input type=submit value="Continue">
</form>
Special Characters
A wide variety of non-alphanumeric characters can be specified in HTML.
Here is a small sampling:
Entity                               Code Example                Typical Rendering
&ntilde; (n, tilde)                  pi&ntilde;ata               piñata
&eacute; (e, acute accent)           caf&eacute;                 café
&iquest; (inverted question mark)    &iquest;Qu&eacute; pasa?    ¿Qué pasa?
&nbsp; (non-breaking space)          a&nbsp;b                    a b
&gt; (greater-than)                  4 &gt; 3                    4 > 3
&lt; (less-than)                     5 &lt; 6                    5 < 6
&copy; (copyright)                   &copy; 2004                 © 2004
&pound; (pound sterling)             &pound;50                   £50
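Programs that generate or read HTML need to translate between characters and entities. As a hedged illustration, the Python standard library's html module provides escape and unescape for this; the wrapper names to_html and from_html below are our own, introduced only for the example.

```python
import html


def to_html(text):
    """Replace the characters <, >, and & (and quotes) with entities."""
    return html.escape(text)


def from_html(markup):
    """Replace named and numeric entities with the characters they denote."""
    return html.unescape(markup)


print(to_html("4 > 3 and 5 < 6"))    # 4 &gt; 3 and 5 &lt; 6
print(from_html("pi&ntilde;ata"))    # piñata
print(from_html("&copy; 2004"))      # © 2004
```

Escaping user-supplied text before placing it in a page keeps a stray < from being read as the start of a tag.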
News pages often include instructions that the browser refetch the page.
Here's a tag, located in the head, from news.google.com:
<meta HTTP-EQUIV="refresh" CONTENT="900">
If you load this page into a browser and step back from the computer, you
should notice it updating itself every 900 seconds (15 minutes).
Also within the head, you can specify keywords and a description of the
page. These tags were originally intended to help search engines index pages,
but now they are often ignored due to abuse such as page authors using incorrect keywords to get more hits.
<META NAME="description" CONTENT="An owner's review of the Diamond Star DA40">
<META NAME="keywords" CONTENT="Diamond Star DA40 review Cirrus SR20 SR22">
You can modify properties of the Web page by using <body> tag attributes.
For example:
<body bgcolor=white text=black link=blue vlink=purple alink=red>
However, you should use this sparingly; users are accustomed to the standard text colors and may become frustrated if they can't tell what's a link and
what isn't.
A site-wide cascading style sheet addresses all of these issues. Here's part of the
cascading style sheet for the online version of this book (http://philip.greenspun.com/seia/style-sheet.css):
body {margin-left: 10% ; margin-right: 10%}
P { margin-top: 0.3em; text-indent : 2em }
P.stb { margin-top: 12pt }
P.mtb { margin-top: 24pt; text-indent : 0in}
P.ltb { margin-top: 36pt; text-indent : 0in}
p.marginnote { background: #E0E0E0;
text-indent: 0in ; padding-left: 5%; padding-right: 5%;
padding-top: 3pt; font-size: 75%}
p.bodynote { background-color: #E0E0E0 }
...
Each line of the style sheet gives formatting instructions for one HTML element and/or a subclass of an HTML element. The body tag is augmented so
that all of the pages will have extra left and right whitespace margins. The
next directive, for the P tag, tells browsers not to separate paragraphs with a
full blank line, but rather to indent the first line of a new paragraph by 2em
and add only a smidgen of blank vertical space (margin-top: 0.3em). Now
paragraphs will be mushed together like those in a printed book or magazine.
Books and magazines do sometimes use whitespace, however, mostly to show
thematic breaks in the text. We therefore define three classes of thematic breaks
and tell browsers how to render them. The first, stb (for "small thematic
break"), will insert 12 points of white space. A paragraph of class stb will inherit the 2em first-line indent of the regular P element. For medium and large
thematic breaks, more whitespace is specified, as well as an override for the
first-line indent.
How does one use a style sheet? Park it somewhere on the server in a file with
the extension .css. This extension will tell the Web server program to MIME-type it text/css. Inside each document that uses the cascading style sheet, put
the following link element inside the document head:
<LINK REL=STYLESHEET HREF="/seia/style-sheet.css" TYPE="text/css">
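The mapping from file extension to MIME type can be seen in miniature with Python's standard mimetypes module. This is only a sketch of the general idea; a real Web server consults its own configuration, and the helper name mime_type_for is hypothetical.

```python
import mimetypes


def mime_type_for(filename):
    """Guess a MIME type from the file extension, as a Web server might."""
    content_type, _encoding = mimetypes.guess_type(filename)
    return content_type


print(mime_type_for("/seia/style-sheet.css"))  # text/css
```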
The first time the user's browser sees a page that references this style sheet, it
will come back and request http://philip.greenspun.com/seia/style-sheet.css
before rendering any of the page. Note that this will slow down page viewing
a bit, although if all of our pages refer to the same site-wide style sheet, users'
browsers should be smart enough to cache it. If you read ten chapters from this
book online, for example, the browser should request the common style sheet
only once.
Okay, now the browser knows where to get the style sheet and that a small
thematic break should be rendered with an extra bit of whitespace. How do we
tell the browser that a particular paragraph is of class stb? Instead of <P>,
we use
<P CLASS="stb">
Frames
Frames consist of independent windows within a single Web page. Usually
each window can be scrolled separately. Often, when you click a link, only
one frame is updated with a new URL; the rest of the page content stays the
same.
Frames sounded like a good idea at the time (mid-1990s), but have proven to
be painful for both users and developers for the following reasons:
• Frames waste screen space. Often frames have their own scrollbars, which
take up valuable space within the browser window. Furthermore, if you are
only interested in one frame and you scroll down within that frame, the other
frames remain in place, leaving less space for the content you want.
• Frames make it difficult to bookmark pages. When the user follows links that
only update one frame, the URL of the page does not change. Suppose Joe
User visits a travel site, follows five links within frames to get to a page about
a tour of Mexico's Copper Canyon, and then bookmarks that page; the
bookmark will point to the front page of the travel site, not the Copper
Canyon page.
• Frames make it difficult to share pages. Suppose Joe User wants to see if his
friend is interested in going on the Copper Canyon tour. While looking at the
tour advertisement, he cuts and pastes the URL from the browser's Address
field into an email message. Joe's friend clicks on the URL and gets the travel
site home page, not the interior page about the Copper Canyon tour.
The Future
In the practical world, HTML is king. In the conference rooms of standards
committees, however, it has been superseded by Extensible Hypertext Markup
Language (XHTML). Should you wish to keep up with events in this area, visit
http://www.w3.org/MarkUp/.
More
• visit your favorite Web page and use the browser command View Source
• HTML tag reference: http://www.w3schools.com/html/html_reference.asp
(Web) and HTML & XHTML: The Definitive Guide by Chuck Musciano
and Bill Kennedy (O'Reilly, 2002) (print)
• Colors and their hexadecimal equivalents: http://falco.elte.hu/COMP/HTML/colors.html
• Special characters: http://hotwired.lycos.com/webmonkey/reference/special_characters/
• Cascading Style Sheets: http://www.w3schools.com/css/default.asp
Engagement Management
Reference Chapter B
We recommend you go through this formally with your team at least once a
week. You can also use it to structure introductory and update meetings with
your client, though the worksheet is primarily for your team.
Worksheet columns: Topic, What We Think, Persuasion Plan*
Topics (one row each):
• Capabilities for Site-Wide Administrator
• Capabilities for Registered Community Member
• Capabilities for Unregistered Casual Visitor
• Capabilities for User Class N
• Design Preferences
• Performance Requirements
• Technical Infrastructure Constraints
• Application Maintenance Plans/Resources
• Budget Through First Year
• Deadlines
Capabilities for site-wide administrator. For this, as for other user class items,
list those features that are needed first (must-have to launch the service in any
form), next (what you'd do if you had a little bit of extra time and effort
available), and nice-to-have. One example for the site-wide administrator user
class is the following:
First = publish / manage content
Next = spam members with news/offers
Nice = track activity at individual registered user level
Design preferences. If your organization has an existing Web site or sites, you
can probably infer their design style. If they suggest Flash, frames, or a lot of
JavaScript, you've got a potential problem and might want to point out that
Google, Amazon, eBay, and the other successful Internet applications stick to
a plain, fast-loading, easily understood design.
Performance requirements/expectations. Start by suggesting your own standards of loading times in seconds for the index page and more complex pages
on the site. Let the client react to these suggestions. If everyone agrees on subsecond page loading times, that will make it a lot easier to kick out the worst
user interface ideas, such as Flash introductions.
Technical infrastructure constraints. A small or medium-sized organization will
generally have only expertise and staff appropriate for maintaining one kind of
server. If you're not building the project on top of that server, you're implicitly
asking the organization to spend $100,000 per year to bring in additional maintenance staff and/or push the new service out to a contract hosting organization. It is best to be clear up front about what will need to happen when it
comes time to move the system into production.
Application maintenance plans/resources. Who's going to look after what you
deliver? How experienced is this person?
Budget. What is the total budget for hardware, software, integration, launch
(including populating with content), training, and maintenance?
Deadlines. You'll probably use other tools to keep a detailed schedule. Use
this worksheet to keep track of some high-level scheduling goals that both you
and the client are working towards. Avoid the temptation of stereotypical technical people to think in terms of their own requirements and tasks. Your client
and sponsor don't care about SQL. They care about the date on which full
business benefit (FBB) is realized for this application, that is, when the system is adding to profitability or otherwise contributing to organizational goals.
Working back from that date and recognizing that one or two version launches
will probably be necessary to achieve FBB, establish a public launch date.
Working back from the public launch date, establish a soft launch or full user
test date. Working back from that date, establish a feature-complete build
date on which the programmers are only testing and fixing bugs rather than
adding new features.
Persuasion plan. For each item in this section, if differences of opinion arise
during initial meetings, document a persuasion plan. Here are some elements
of the plan that should be sketched in this worksheet:
• battle worth fighting?
• objective: total victory? acceptable compromise?
• agreement driven by facts/logic? emotion/relationship?
• who else should be involved? (e.g., course staff or experienced alumni
engineers)
Sign-Offs
Try to schedule comprehensive project reviews every three weeks or so, ideally
face-to-face. Notes and decisions from those reviews should be signed by both
sides (team or team leader and client). Requiring a signature has a way of forcing issues to closure.
Assets Developed
In building a profitable business or a professional reputation it is important to
learn from and build on experience. Here are some of the things that you can
take away from a project:
• experience with the problem domain and knowledge of how to solve a similar
problem in the future
• lessons about dealing with this particular organization
• lessons about working with this particular team
• general lessons about teamwork and working with organizations of a particular size
• data models, stored procedures, and maybe even some page scripts for re-use
on the next Internet applications that you build
• a good reference from the client
• magazine or newspaper articles describing the application
• a white paper describing your team's achievement to a technical audience
• some sort of written summary describing your team's achievement to a business audience
At the midpoint of the project, write down what you're hoping to take away
from the experience. At the end, write down what you actually did take away.
Grading Standards
These are the grading standards used by the authors in 6.171 at MIT. If you're
a student in 6.171, please keep these in mind throughout the semester. If you're
an instructor at another school, you might find these a useful model.
Our overall goal in 6.171 is producing professionally competent software
engineers. If, by the end of the semester, you have the skills of a professional
programmer, you will get an A for the class.
A professional programmer ought to be able to pick worthwhile problems
to attack. Engineering is the art of building cost-effective solutions to problems
that society regards as significant. A person who blindly does what he or she
is told, without independently figuring out the context and significance of the
problem, is not doing engineering. A professional programmer needs to be able
to sit at a meeting with decision-makers, prepared with substantial domain
knowledge, and make significant contributions to the discussion. In evaluating
your performance in 6.171, we look at (i) how well you've steered your client
into solving the most important problems for users first, (ii) what you've said
during in-class discussions of potential projects, and (iii) whether you've made
useful suggestions to other teams in the realm of service design.
A professional programmer needs to be skilled at realizing clean-sheet-of-paper designs: (i) taking vague organizational aspirations and turning them
into concrete specifications, (ii) selecting appropriate tools for a substrate, (iii)
building and testing a prototype, (iv) using that prototype to obtain feedback
from users and the sponsoring organization, (v) implementing and launching
Release 1.0, (vi) refining the specs for Release 2.0 based on experience with
1.0. In evaluating your performance in 6.171, we look at whether you managed
to launch your service to real users and how successful your project was at
meeting technical and organizational goals. The mid-term exam is also aimed
at figuring out whether you can look at a desired user experience and perform
the most critical aspects of system design such as data modeling.
Notice that a critical element of the realization process, selecting appropriate tools, requires that a programmer maintain a network of professional colleagues. It is extremely risky to pick software tools based on vendor claims, 99
percent of which have proven to be, uh, optimistic. A programmer who can
draw on a group of friends and get unbiased information as to which tools are
reliable is much more effective than a programmer working in isolation, reading press releases and advertising. You'll get extra credit in 6.171 if you can say
"I really liked feature X from Team Y's project so I asked them how they did it
and adapted their ideas and code for our project."
A professional programmer needs to have a dedication to the quality of the
end-user experience. A coder, ripe for outsourcing to the Third World, can
unthinkingly implement whatever system design results from management
and graphic designer whims. An engineer, however, makes sure that what he or
she is doing makes effective use of the end-user's time, partially by reference to
established principles of user interface design and partially by conducting prototype tests with a handful of potential users. In evaluating your performance
in 6.171, we look at whether you made effective use of user testing, ideally beyond the minimum required in the exercises.
A professional programmer is skilled in communicating. This means writing
documentation that will enable another programmer to take over a project.
Communicating also means writing white papers that explain the significance
of a problem, how it was attacked, and what the results were. A programmer
also ought to be good at making short oral presentations that communicate the
main points of a project to a technical or non-technical audience. Finally, a
programmer should know how to make good use of face-to-face interactions
with users and customers. In evaluating your performance in 6.171, we ask
"Can we understand all of this structure and source code purely by reading
what is in the /doc directory?" We also look at (i) whether you gave clear and
compelling presentations in class, (ii) whether your client felt that he or she
was kept apprised of project status, and (iii) the quality of your final overview
document.
A professional programmer is not afraid of a challenge. An MIT graduate
certainly should never be afraid of a challenge. You get extra points in 6.171
for tackling a hard problem and solving it in an elegant or clever way.
Speaking of challenges . . . most software projects are too difficult for one
person to tackle alone. Consequently, a professional programmer is good at
Glossary
See Middleware.
guarantee and document that they will work a certain way. We reserve the right to
change the core program, but we will endeavor to preserve the behavior of the API
call. If we can't, then we'll tell you in the release notes that we broke an old API call.
ASP Active Server Pages, introduced by Microsoft in the mid-1990s. This is the standard programming system for Internet applications hosted on Windows servers. It is
bundled with Internet Information Server (IIS) when you buy Windows. The fundamental idea is that you write HTML pages with little embedded bits of Visual Basic, C# or
other languages, that are interpreted by the server.
Audit Trail A record of past activity. For instance, a log of all past values held by columns in a database row. Or a sequence of all cash register transactions over the last
three months. Or a print-out of all customer service interactions related to a given order,
regardless of whether communication takes place by telephone, email, or live chat with a
representative.
Blog An online journal, published frequently (often daily). Readers can post comments
on each journal entry. Some blogs gain a wide readership, such as this one: http://blogs.law.harvard.edu/philg/. The term "blog" is a shortening of "weblog".
Bozo Filter An individual user request that the server filter out contributions from
some particular other community member.
Cable Modem A cable modem is an Internet connection provided by a cable TV operator, typically with at least 1.5 Mbits per second of download bandwidth (50–100 times
faster than modems that work over analog telephone lines).
Cache Computer systems typically incorporate capacious storage devices that are slow
(e.g., disk drives) and smaller storage devices that are fast (e.g., memory chips, which are
100,000 times faster than disk). File systems and database management systems keep
recently used information from the slow devices in a cache in the fast device.
CGI Common Gateway Interface. This is a standard that lets programmers write Web
scripts without depending on details of the Web server program being used. Thus, for
example, an Internet service implemented in CGI could be moved from a site running
AOLserver to a site running Apache. CGI scripts, which run as separately launched
operating system processes, are typically very slow compared to scripts that run inside
a Web server program.
Client/Server In the 1960s, computers were so expensive that each company could
have only one. The computer ran one program at a time, typically reading instructions and data from punch cards. This was batch processing. In the 1970s, that computer
was able to run several programs simultaneously, responding to users at interactive terminals. This was timesharing (it would be nice if modesty prevented one of the authors
from noting that this was developed by his lab at MIT circa 1960). In the 1980s, companies could afford lots of computers. The big computers were designated servers and
would wait for requests to come in from a network of client computers. The client computer might sit on a user's desktop and produce an informative graph of the information
retrieved from the server. The overall architecture was referred to as client/server. Because of the high cost of designing, developing, and maintaining the programs that run
on the client machines, Corporate America is rapidly discarding this architecture in favor of Intranet: Client machines run a simple Web browser and servers do more of the
work required to present the information.
Code Freeze The point at which all coding stops, usually to allow software testing
without the introduction of new bugs.
Collaborative Filtering If you can persuade a group of people to rate movies on a 1–10
scale, for example, it becomes possible to identify people whose tastes are similar. Given
a new movie that only a few people have seen and rated, a collaborative filter can identify others in the community who might like it. Some e-commerce sites provide this service, noting for example that "customers who bought the product you're looking at right
now also tended to buy these other three things". Collaborative filtering is easy to program, but ultimately is a poor substitute for human reviewers and editors.
Community Site A community site exists to support the interaction of an online community of users. These users typically come together because of a shared interest and are
most vibrant when there is an educational dimension, i.e., when the more experienced
users are helping the novices improve their skills.
Compression When storing information in digital form, it is often possible to reduce
the amount of space required by exploiting regular patterns in the data. For example,
documents written in English frequently contain "the". A compression system might notice this fact and represent the complete word "the" (24 bits) with a shorter code. A picture containing your friend's face plus a lot of blue sky could be compressed if the upper
region were described as "a lot of blue sky". All popular Web image, video, and sound
formats incorporate compression.
Content Repository Instead of having one SQL table for every different kind of content
on a site, e.g., articles, comments, news, questions, answers, it is possible to define a single content repository table that is flexible enough to store all of these in one
place. This approach to data modeling makes it simpler to perform queries such as
"show me all the new stuff since yesterday" or "show me all the content contributed by
User 37". With a content repository, it is also easier to program and enforce consistent
site-wide policies regarding approval, editing, and administration of content.
Cookie The Cookie protocol allows a Web application to conveniently maintain a
"session" with a particular user. The Web server sends the client a "magic cookie"
(piece of information) that the client is required to return on subsequent requests. The
original specification is at http://home.netscape.com/newsref/std/cookie_spec.html.
Data Model A data model is the structure in which a computer program stores persistent information. In a relational database, data models are built from tables. Within a table, information is stored in homogeneous columns, e.g., a column named
registration_date would contain information only of type date. A data model is interesting because it shows what kinds of information a computer application can process. For example, if there is no place in the data model for the program to store the IP
address from which content was posted, the publisher will never be able to automatically
delete all content that came from the IP address of a spammer.
DNS Domain Name System. Translates hostnames such as www.google.com into IP addresses. DNS is distributed: no single computer
holds translations for all possible hostnames. A domain registrar, e.g., www.register.com,
records that the name servers for the google.com domain are at particular IP
addresses. A user's local name server will query the name servers for google.com to find
the translation for the hostname www.google.com. Note that there is nothing magic
about "www"; it is merely a conventional name for a computer that runs a Web server.
The procedure for translating a hostname such as froogle.google.com is the same as
that applied for "www." Round robin DNS was an early load-balancing technique in which
multiple computers at different IP addresses were configured to serve an application;
browsers asking the DNS servers to translate the site's hostname would get different
answers depending on when they asked, thus spreading out the users among the multiple
computers hosting the application.
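Round robin DNS can be pictured with a toy resolver that hands out a site's addresses in rotation. This is a simulation only: the class name, the hostname www.example.com, and the 192.0.2.x addresses (a range reserved for documentation) are all assumptions, and no real DNS queries are made.

```python
from itertools import cycle


class RoundRobinResolver:
    """Toy name server that rotates through each hostname's addresses."""

    def __init__(self, records):
        # records maps a hostname to the list of IP addresses serving it.
        self._cycles = {host: cycle(addrs) for host, addrs in records.items()}

    def resolve(self, hostname):
        # Each query gets the next address in the rotation.
        return next(self._cycles[hostname])


resolver = RoundRobinResolver(
    {"www.example.com": ["192.0.2.10", "192.0.2.11", "192.0.2.12"]}
)

# Four successive browsers asking for the same hostname.
answers = [resolver.resolve("www.example.com") for _ in range(4)]
print(answers)
```

Successive askers land on different machines, which is the entire load-balancing effect; nothing in the scheme notices when one of the machines is down.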
DTD Document Type Definition. The specification of an XML document's schema,
including its elements, attributes, and data structure. DTDs are used for validating that
an XML document conforms to that schema. You can also share a DTD with your collaborators
in order to agree upon the structure of XML documents that will be exchanged.
Dynamic Site A dynamic site is one that is able to collect information from User A,
serve it back to Users B and C immediately, and hide it from User D because the server
knows that User D isn't interested in this kind of content. Dynamic sites are typically
built on top of relational database management systems because these programs make
it easy to organize content submitted by hundreds of concurrent users. An example of a
simple dynamic site would be a classified ad system.
Emacs World's most powerful text editor, written by Richard Stallman (RMS) in 1976
for the Incompatible Timesharing System (ITS) on the PDP-10s at MIT. Emacs has
been subsequently ported to virtually every kind of computer hardware and operating
system between 1976 and the present (including the Macintosh, Windows 95/NT, and
every flavor of Unix). Good programmers tend to spend their entire working lives in
Emacs, which is capable of functioning as a mail reader, USENET news reader, Web
browser, shell, calendar, calculator, and Lisp evaluator. Emacs is infinitely customizable
because users can write their own commands in Lisp. You can find out more about
Emacs at ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-519A.pdf (Stallman's
1979 MIT AI Lab report), at www.gnu.org (where you can download the source code
for free), or by reading Learning Emacs (Debra Cameron et al. [O'Reilly, 1996]). If you
want to program Emacs, then you'll want Writing Gnu Emacs Extensions (Bob Glickstein [O'Reilly, 1997]).
Filter The best Web server APIs allow the programmer to say "run this little piece of
code before [or after] serving files that match a particular URL pattern." Filters that run
after a file is served are useful if you want to add extra logging to an application. Filters
that run before a file is served or a script is run are useful for implementing a security
policy in a consistent fashion, rather than relying on the authors of individual scripts to
insert an authentication check.
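The following sketch shows the shape of a "before" filter. It does not use the API of any real Web server; the request dictionary, the function names, and the /admin/* pattern are all invented for illustration.

```python
from fnmatch import fnmatch

# Registry of (url_pattern, function) pairs, consulted before every request.
before_filters = []


def register_before_filter(pattern, func):
    before_filters.append((pattern, func))


def require_login(request):
    """Return an error response to stop the request, or None to continue."""
    if request.get("user_id") is None:
        return (401, "Please log in")
    return None


def serve(request, handler):
    # Run every filter whose pattern matches the requested path.
    for pattern, func in before_filters:
        if fnmatch(request["path"], pattern):
            response = func(request)
            if response is not None:
                return response  # the filter vetoed the request
    return handler(request)


# One line protects every page under /admin/, present and future.
register_before_filter("/admin/*", require_login)

page = lambda request: (200, "ok")
print(serve({"path": "/admin/users", "user_id": None}, page))
print(serve({"path": "/index.html", "user_id": None}, page))
```

The page scripts themselves contain no authentication code, so an author who forgets the check cannot open a hole in the site.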
Firewall A computer that sits between a company's internal network of computers and
the public Internet. The firewall's job is to make sure that internal users can get out to
enjoy the benefits of the Internet while external crackers are unable to make connections
to machines behind the firewall.
Flat-file A flat-file database keeps information organized in a structured manner, typically in one big file. A desktop spreadsheet application is an example of a flat-file database management system. These are useful for Web publishers preparing content
because a large body of information can be assembled and then distributed in a consistent format. Flat-file databases typically lack support for processing transactions (inserts
and updates) from concurrent users. Thus, collaboration or e-commerce Web sites generally rely on a relational database management system as a back-end.
GIF Graphics Interchange Format. Developed in 1987 by CompuServe, this is a way
of storing compressed images with up to 256 colors. It became popular on the Web because it was the only format that could be displayed in-line by the first multi-platform
Web browser (NCSA Mosaic). For photographs, the JPEG image file format results in much better looking images with much smaller-sized files.
HTML Hyper Text Markup Language. Developed by Tim Berners-Lee, this specifies
a format for the most popular kind of document distributed over the Web (via HTTP).
Documented sketchily in this book, documented badly at http://www.w3.org, and documented well in HTML: The Definitive Guide (Chuck Musciano and Bill Kennedy
[O'Reilly, 2002]).
HTTP Hyper Text Transfer Protocol. Developed by Tim Berners-Lee, this specifies
how a Web browser asks for a document from a Web server. Questions such as "How
does a server tell the browser that a document has moved?" or "How does a browser
ask the time that a document was last modified?" may be answered by reference to this
protocol, which is documented badly at http://www.w3.org and documented well in various books such as HTTP: The Definitive Guide (David Gourley and Brian Totty
[O'Reilly, 2002]).
IIS Internet Information Server. A threaded Web server program that is included by
Microsoft when you purchase the Windows operating system.
Java Java is first a programming language, developed by Sun Microsystems around
1992, intended for use on the tiny computers inside cell phones and similar devices.
Java is second an interpreter, the Java virtual machine, formerly compiled into popular Web browsers (back when Netscape Navigator was popular and before Sun sued
Microsoft). Java is third a security system that purports to guarantee that a program
downloaded from an untrusted source on the Internet can run safely inside the interpreter. Java is the only realistic way for a Web publisher to take advantage of the computing power available on a user's desktop. Java is generally a cumbersome language for
server-side software development. For more background on the language, see the
Java chapter from Database Backed Web Sites at
http://philip.greenspun.com/wtr/dead-trees/53008.htm.
JPEG Joint Photographic Experts Group. A bunch of people who sat down and
designed a standard for image compression, conveniently titled IS 10918-1 (ITU-T
T.81). This standard works particularly well for 24-bit color photographs. C-Cube
Microsystems came up with the JFIF standard for encoding color images in a file. Such
a file is what people commonly refer to as "a JPEG" and typically ends in .jpg or
.jpeg. The main problem with JFIF files is that they record only 8 bits per color, a
vastly smaller range of intensities than is present in the natural world and significantly
smaller than the 12- and 14-bits-per-color signals that come out of the best digital scanners and cameras. This defect and more are remedied in the JPEG 2000 standard. See
www.jpeg.org for more about the standard.
LDAP Lightweight Directory Access Protocol. A typical LDAP server is a simple
network-accessible database where an organization stores information about its authorized users and what privileges each user has. Thus, rather than create a new employee
account on 50 different computers, the new employee is entered into LDAP and granted
rights to those 50 systems. If the employee leaves, revoking all privileges is as simple as
removing one entry in the LDAP directory. LDAP is a bit confusing because original
implementations were presented as alternatives to the Web and the relational database
management system. Nowadays many LDAP servers are implemented using standard
RDBMSs underneath, and they talk to the rest of the world via XML documents served
over HTTP.
Linux A free version of the Unix operating system, primarily composed of tools developed over a 15-year period by Richard Stallman and Project GNU. However, the final
spectacular push was provided by Linus Torvalds who wrote a kernel (completed in
1994), organized a bunch of programmers Internet-wide, and managed releases.
Lisp Lisp is the most powerful and also easiest-to-use programming language ever
developed. Invented by John McCarthy at MIT in the late 1950s, Lisp is today used by
the most sophisticated programmers pushing the limits of computers in mathematical
physics, computer-aided engineering, and computer-aided genetics. Lisp is also used by
thousands of people who don't think of themselves as programmers at all, e.g., people
who want to define shortcuts in AutoCAD or the Emacs text editor. The best introduction to Lisp is also the best introduction to computer science: Structure and Interpretation of Computer Programs (Harold Abelson and Gerald Sussman [MIT Press, 1996]).
Log Analyzer A program that reads a Web server's access log file (one line per request
served) and produces a comprehensible report with summary statistics, e.g., "You served
234,812 requests yesterday to 2,039 different computers; the most popular file was
/samoyed-faces.html."
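A toy log analyzer fits in a few lines. The sketch below assumes the widely used Common Log Format, in which the client address is the first whitespace-separated field and the requested path is the seventh; the three sample lines are fabricated for the example.

```python
from collections import Counter

# Hypothetical access log excerpt, one line per request served.
sample_log = """\
192.0.2.1 - - [10/Nov/2005:13:55:36 -0500] "GET /samoyed-faces.html HTTP/1.0" 200 2326
192.0.2.2 - - [10/Nov/2005:13:56:01 -0500] "GET /samoyed-faces.html HTTP/1.0" 200 2326
192.0.2.1 - - [10/Nov/2005:13:57:12 -0500] "GET /index.html HTTP/1.0" 200 1043
"""


def summarize(log_text):
    """Return (requests served, distinct client computers, most popular file)."""
    clients = set()
    paths = Counter()
    requests = 0
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) < 7:
            continue  # skip blank or malformed lines
        requests += 1
        clients.add(fields[0])   # client IP address
        paths[fields[6]] += 1    # path from the quoted request line
    top_path = paths.most_common(1)[0][0] if paths else None
    return requests, len(clients), top_path


requests, computers, top = summarize(sample_log)
print(f"You served {requests} requests to {computers} different computers; "
      f"the most popular file was {top}.")
```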
Magnet Content Material authored by a publisher in hopes of establishing an online
community. In the long run, a majority of the content in a successful community site
will be user authored.
Middleware A vague term that, when used in the context of Internet applications,
means software sold to people who don't know how to program by people who don't
know how to program. In theory, middleware sits between your relational database
management system and your application program and makes the whole system run
more reliably, just like adding a bunch of extra moving parts to your car would make
it more reliable.
MIME Multipurpose Internet Mail Extensions. Developed in 1991 by Nathaniel Borenstein of Bellcore so that people could include images and other non-plain-text documents in email messages. MIME is a critical standard for the World Wide Web
because an HTTP server answering a request always includes the MIME type of the
document served. For example, if a browser requests "foobar.jpg," the server will return a MIME type of "image/jpeg." The Web browser will decide, based on this type,
whether or not to attempt to render the document. A JPEG image can be rendered
by all modern Web browsers. If, for example, a Web browser sees a MIME type of
"application/x-pilot" (for the .prc files that PalmPilots employ), the browser will invite
the user to save the document to disk or select an appropriate application to launch for
this kind of document.
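Both halves of this decision can be sketched with Python's standard mimetypes module. The .prc mapping is registered by hand because it may not be in the default table, and the set of renderable types and the browser_action function are simplifications invented for the example.

```python
import mimetypes

# Server side: map a file extension to the MIME type sent with the response.
types = mimetypes.MimeTypes()
types.add_type("application/x-pilot", ".prc")  # assumed mapping, added by hand

# Browser side: a (hypothetical, much shortened) list of types it can render.
RENDERABLE = {"text/html", "text/plain", "image/jpeg", "image/gif"}


def browser_action(filename):
    """Decide what a browser would do with the served document."""
    mime_type, _encoding = types.guess_type(filename)
    if mime_type in RENDERABLE:
        return mime_type, "render"
    return mime_type, "offer to save or launch a helper application"


print(browser_action("foobar.jpg"))
print(browser_action("datebook.prc"))
```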
Multi-modal A multi-modal user interface allows you to interact with a piece of software in a variety of means simultaneously. For example, you may be able to communicate using a keyboard or stylus, or with your voice, or even with hand or face gestures.
These are all modes of communication. The advent of GPRS makes simultaneous
voice/keypad interaction possible on cellular telephones.
Operating System (OS) A big, complicated computer program that lets multiple,
simultaneously executing, big, complicated computer programs coexist peacefully on
one physical computer. The operating system is also responsible for hiding the details
of the computer hardware from the application programmers, e.g., letting a programmer
say "I want to write ABC into a file named XYZ" without the programmer having to
know how many disk drives the computer has or what company manufactured those
drives. Examples of operating systems are Unix and Windows XP. Examples of things
that try to be operating systems but mostly fail to fulfill the "coexist peacefully" condition are Windows 98 and the Macintosh OS.
Oracle Oracle is the most popular relational database management system (RDBMS).
It was developed by Larry Ellison's Oracle Corporation in the late 1970s.
Perl Perl is a scripting language developed by Larry Wall in 1986 to make his Unix
sysadmin job a little easier. It unifies a bunch of capabilities from disparate older Unix
tools. Like Unix, Perl is perhaps best described as "ugly but fast and useful." Perl is free,
has particularly powerful string processing operators, and quickly developed a large following and, therefore, a large library for CGI scripting. For more info, see www.perl.com
or www.perl.org.
Historical note: Lisp programmers forced to look at Perl code would usually say "if
there were any justice in this world, the guys who wrote this would go to jail." In a
rare case of Lisp programmers getting their wish, in 1995 Intel Corporation persuaded
local authorities to send Randal Schwartz, author of Learning Perl (O'Reilly, 2001), to
the Big House for 90 days (plus 5 years of probation, 480 hours of community service,
and $68,000 of restitution to Intel). Sadly, however, it seems that Schwartz's official
crime was not corrupting young minds with Perl syntax and semantics. Most Unix
sysadmins periodically run a program called crack that tries to guess user passwords.
When crack is successful, the sysadmins send out email saying "your password has been
cracked; please change it to something harder to guess." Obviously they do not need the
passwords since they have root access to all the boxes and can read any of the data contained on them. At a university, you get paid about $50,000/year for doing this. In Oregon if you do this for a multi-billion-dollar company that has recently donated $100,000
to the local law enforcement authorities, you've committed a crime. See
http://www.lightlink.com/spacenka/fors/ for more on State of Oregon v. Randal Schwartz.
Robot In the technologically optimistic portion of the 20th century, robots were intelligent anthropomorphic machines that understood human speech, interpreted visual
scenes, and manipulated objects in the real world. In the technologically realistic 21st
century, robots are absurdly primitive programs that do things like "Go look up this
book title at three different online bookstores and see who has the lowest price; fail completely if any one of the online bookstores has added a comma to their HTML page."
Also known as "intelligent agents" (an intellectually vacuous term, but useful for getting
tenure if you're a university professor). Some simple, but very useful examples of robots
are the spiders or Web crawlers that fill the content database at public search engine
sites such as AltaVista.
Scalable A marketing term used to sell defective software to executives at big companies. Internet applications are fundamentally concerned with processing updates from
thousands of concurrent users. This is what database management systems were built
for. Smart engineers build Web applications so that if the database is up and running,
the Web site will be up and running. Period. Adding more users to the site will inevitably require adding capacity to the database management system, no matter what
other software is employed. The thoughtful engineer will realize that a provably scalable site is one that relies on no other software besides the database management system
and the thinnest of software layers on top, such as Apache, AOLserver, or Microsoft
IIS.
Semantic Tag The most popular Web markup language is HTML, which provides for
formatting tags, e.g., "this is a headline" or "this should be rendered in italics." This is
useful for humans reading Web pages. What would be more useful for computer programs trying to read Web pages is a semantic tag, e.g., "the following numbers represent
the price of the product in dollars," or "the following characters represent the date this
document was initially authored." More: http://www.w3.org/RDF/.
SOAP Simple Object Access Protocol. A way for a Web server to call a procedure on
another, physically separate Web server, and get back a machine-readable result in a
standardized XML format. Useful for building a Web page that combines dynamic information pulled from multiple foreign sites. Also useful for building a single Web form
that can perform multiple actions at foreign sites on behalf of a user. See
http://msdn.microsoft.com/soap/ and http://www.w3.org/TR/SOAP/.
SGML Standard Generalized Markup Language, standardized in 1986 (ISO 8879). A language
for marking up documents so that they could be parsed by computer programs. Each
community of people that wishes to author and parse documents must agree on a Document Type Definition (DTD), which is itself a machine-parsable description of what
tags a marked-up document must or may have. HTML is an example of an SGML
DTD. XML is a simplified descendant of SGML.
Soft Launch Placing a server on the public Internet, but only telling a handful of people about it gives the developers a chance to see how real users interact with the system,
fix bugs, and see how the servers handle a gradually increasing load. A soft launch like
this is much safer than a Big Bang-style launch in which the server is made public just as
a massive advertising campaign airs.
Spider A spider or Web crawler is a program that exhaustively surfs all the links from
a page and returns them to another program for processing. For example, all of the
Internet search engine sites rely on spider robots to discover new Web sites and add
them to their index. Another typical use of a spider is by a publisher against his or her
own site. The spider program makes sure that all of the links function correctly and
reports dead links.
SQL Structured Query Language. Developed by IBM in the mid-1970s as a way to get
information into and out of relational database management systems. A fundamental
difference between SQL and standard programming languages is that SQL is declarative. You specify what kind of data you want from the database; the RDBMS is responsible for figuring out how to retrieve it. A full tutorial on SQL is available at
http://philip.greenspun.com/sql/.
Static Site A static Web site comprises content that does not change depending on the
identity of the user, the time of day, or what other users might have contributed recently.
A static Web site is typically built using static documents in HTML format with graphics in GIF format and images in JPEG format. Collectively, these are referred to as
static files. Contrast with a dynamic site, in which content can be automatically collected
from users, personalized for the viewer, or changed as a function of the time of day.
TCP/IP Transmission Control Protocol and Internet Protocol. These are the standards
that govern transmission of data among computer systems. They are the foundation of
the Internet. IP is a way of saying "send these next 1000 bits from Computer A to Computer B." TCP is a way of saying "send this stream of data reliably between Computer
A and Computer B" (it is built on top of IP). TCP/IP is a beautiful engineering achievement, documented beautifully in TCP/IP Illustrated, Volume 1 (W. Richard Stevens
[Addison-Wesley, 1994]).
Transaction A set of operations for which it is important that all succeed or all fail. On
an e-commerce site, when a customer confirms a purchase, you'd like to send an order to
the shipping department and simultaneously bill the customer's credit card. If the credit
card can't be billed, you want to make sure that the order doesn't get shipped. If the
shipping database can't accept the order, you want to make sure that the credit card
doesn't get billed. RDBMSs such as Oracle provide significant support for implementing
transactions.
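The all-or-nothing behavior can be demonstrated with Python's built-in sqlite3 module. This is a simplified sketch: it keeps billing and shipping in one database, whereas a real site would be coordinating with an external credit card processor, and the table names, column names, and confirm_purchase function are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table charges (order_id integer primary key, amount_cents integer not null)"
)
conn.execute(
    "create table shipments (order_id integer primary key, address text not null)"
)
conn.commit()


def confirm_purchase(conn, order_id, amount_cents, address):
    """Bill the card and queue the shipment; both succeed or neither does."""
    try:
        # "with conn" commits if the block finishes and rolls back on an exception.
        with conn:
            conn.execute(
                "insert into charges values (?, ?)", (order_id, amount_cents)
            )
            conn.execute(
                "insert into shipments values (?, ?)", (order_id, address)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = confirm_purchase(conn, 1, 2500, "77 Massachusetts Avenue")
# The shipping table rejects a missing address, so the charge is rolled back too.
failed = confirm_purchase(conn, 2, 1000, None)

charge_count = conn.execute("select count(*) from charges").fetchone()[0]
print(ok, failed, charge_count)
```

Without the transaction, the second call would leave a charge on the books for an order that can never ship.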
UDDI Universal Description, Discovery, and Integration. Like a worldwide Yellow
Pages, this is an XML-based registry where companies can list the Web services they
provide. More: uddi.org.
Unix An operating system developed by Ken Thompson and Dennis Ritchie at Bell
Laboratories in 1969, vaguely inspired by the advanced MULTICS system built by
MIT. Unix really took off after 1979, when Bill Joy at University of California, Berkeley
released a version for Digital's VAX minicomputer. Unix fragmented into a bewildering
variety of mutually incompatible versions, thus enabling Microsoft Windows to take
over most of the server market. The only surviving variants of Unix are Sun's Solaris
and Linux.
URL Uniform Resource Locator, also Uniform Resource Identifier (URI). A way of
specifying the location of something on the Internet, e.g., http://philip.greenspun.com/seia/glossary
is the URL for this glossary. The part before the colon specifies the protocol (HTTP). Legal alternatives include encrypted protocols such as HTTPS and legacy
protocols such as FTP, news, gopher, etc. The part after the "//" is the server hostname
(philip.greenspun.com). The part after the next "/" is the name of the file on the
remote server. Also see Abstract URL. More: http://www.w3.org/Addressing/.
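The same dissection can be performed with Python's standard urllib.parse module, shown here on this glossary's own URL. Note that the library reports the path with its leading slash.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://philip.greenspun.com/seia/glossary")

print(parts.scheme)    # the part before the colon: the protocol
print(parts.hostname)  # the part after the "//": the server hostname
print(parts.path)      # the rest: the name of the file on the remote server
```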
USENET A threaded discussion system that today connects millions of users from
around the Internet into newsgroups such as rec.photo.equipment.35mm. The original
system was built in the late 1970s and ran on one of the wide-area computer networks
later subsumed into the Internet.
Version Control System A system for keeping track of multiple versions of a file, usually source code. Version control systems are most useful when many developers are
working together on a project, to help prevent one developer from overwriting another
developer's changes, and to make it easy to revert to a previous version of a file. An excellent open-source version control system is CVS, Concurrent Versions System:
www.cvshome.org.
VoiceXML A markup language used for the development of voice applications. Using
only a traditional Web infrastructure, you can create applications that are accessible
over the telephone. With VoiceXML, you can specify call flow, speech recognition, and
text-to-speech. See the Voice chapter for more.
W3C The World Wide Web Consortium. The W3C is a vendor-neutral industry consortium that promotes standards for the World Wide Web. Popular W3C standards include HTML, HTTP, URL, XML, SOAP, VoiceXML, and many more: www.w3.org.
WAP Wireless Application Protocol. A set of standard communication protocols for
wireless devices. See the Mobile chapter for more.
Web Service These days, the term Web service typically refers to a modular application
that can be invoked through the Internet. The consumers of Web services are other computer applications that communicate, usually over HTTP, using XML standards including SOAP, WSDL, and UDDI. Sometimes Web service will still be used in the older
sense of the word, as a user-facing application like amazon.com or photo.net.
Weblog
See Blog.
Windows NT/2000/XP A real operating system that can run the same programs with
more or less the same user interface as the popular Windows 95/98 system. Windows
NT was developed from scratch by a programming team at Microsoft that was mostly
untainted by the people who brought misery to the world in the form of Windows
3.1/95. The latest versions of Windows work surprisingly well.
WML Wireless Markup Language. An out-of-date markup language for the development of mobile browser applications. Replaced by XHTML-MP.
Workflow The management of steps in a business process. A workflow specifies what
tasks need to be done, in what order (sometimes linearly, sometimes in parallel), and
who has permission to perform each task. Most tasks are performed by humans, but
they can also be automated processes.
WSDL Web Services Description Language. A way for a Web server to answer, in a
machine-readable form, the question "what services do you provide?" with said services
ultimately to be provided by SOAP. See http://www.w3.org/TR/wsdl.
WYSIWYG What You See Is What You Get. A WYSIWYG word processor, for example, lets a user view an on-screen document as it will appear on the printed page, e.g.,
with text in italics appearing on-screen in italics. This approach to software was pioneered by Xerox Palo Alto Research Center in the 1970s and widely copied since then,
notably by the Apple Macintosh. WYSIWYG is extremely effective for structurally
simple documents that are printed once and never worked on again. WYSIWYG is extremely ineffective for the production of complex documents and documents that must
be maintained and kept up-to-date over many years. Thus Quark Xpress and Adobe
Framemaker facilitated a tremendous boom in desktop publishing, while Microsoft
FrontPage and similar WYSIWYG tools for Web page construction have probably hindered development of interesting Web applications.
XHTML The next generation of HTML, compliant with XML standards. Although it
is very similar to the current HTML, it follows a stricter set of rules, thus allowing for
better automatic code validation. This structure also makes it possible to embed other
XML-based languages such as MathML (for equations) and SMIL (for multimedia) inside of XHTML pages. More: www.wdvl.com/Authoring/Languages/XML/XHTML/.
XHTML-MP XHTML Mobile Profile. A strict subset of XHTML, used as a markup
language for wireless application development. See the Mobile chapter for more.
XML Extensible Markup Language, a simplified version of SGML with enhanced features for defining hyperlinks. As with SGML, it solves the trivial problem of defining a
syntax for exchanging structured information, but doesn't do any of the hard work of
getting users to agree on semantic structure.
To the Instructor
Thank you for considering this textbook. This section is intended to help you
use it effectively for students at the following levels:
• juniors and seniors in Computer Science taking a one-term cram course in
Internet application design (the MIT way)
• juniors and seniors in Computer Science taking a one-year capstone course
in software engineering
• seniors in Computer Science doing a capstone independent study project or
bachelor's thesis
• sophomores in Computer Science or non-majors spending a semester learning about building modern information systems
With respect to these goals, we will treat the following issues: (1) what to do
during lectures, (2) how to find clients for your students, (3) what to put on
exams, (4) how to find and use alumni mentors, and (5) evaluation and grading.
Before plunging into these issues, let's take a step back and reflect on the
rationale for teaching this material at all.
A Step Back
Why is software engineering part of the undergraduate computer science curriculum? There are enough mathematical and theoretical aspects of computer
science to occupy students through a bachelor's degree. Yet most schools have
always included at least some hands-on programming. Why? Perhaps there is
a belief that someone with an engineering degree ought to be able to engineer
the sorts of systems that society demands. In the 1980s, users wanted desktop
applications. Universities adapted by teaching students how to build a computer program that interacted with a single user at a time, processing input
from the mouse and keyboard and displaying results graphically. Starting in
the early 1990s, however, demand shifted toward server-based Internet applications. With 1,000 users potentially attempting the same action at the same instant, the technical challenge shifts to managing concurrency and transactions.
Given stateless protocols such as HTTP, software engineers must learn to develop stateful user experiences. Given the ubiquitous network and evolving
standards for remote procedure calls, students can learn practical ways of implementing distributed computing.
Once we've taught students how to build Internet applications, it is gratifying
to observe their enormous potential. A computer science graduate in 1980 was,
by his or her efforts alone, able to reach only a handful of users. Thanks to the
ubiquitous Internet, a computer science student today is able to write a program that hundreds of thousands of people will use before that student ever
graduates. One of our student teams, for example, built a photo-sharing service
launched to the users of photo.net. Through November 2005, the software built
by the students is holding more than one million photographs on behalf of
roughly 87,000 users.
Universities have long taught theoretical methods for dealing with concurrency
and transactions. The Internet raises new challenges in these areas. A dozen
users may simultaneously ask for the same airline seat. Twenty responses to
a discussion forum question may come in simultaneously. The radio or hardwired connection to a user may be interrupted halfway through an attempt to
register at a site. Starting in 1994 there has been a convergence of solutions to
these problems, with the fundamental element of the solution being the relational database management system (RDBMS). At a school like MIT, where
the RDBMS has not been taught, this textbook gives an opportunity to introduce SQL and data modeling with tables. At a school with an existing database course, this textbook can be used to get students excited about using the
RDBMS as a black box before they embark on a more formal course where
the underpinnings are explained.
Scientists measure their results against nature. Engineers measure their results
against human needs. Programmers . . . don't measure their results. As a final
overarching deep principle, we need to teach students to measure their results
against the end-user experience. Anyone can build an Internet application. The
applications that are successful and have impact are those whose data model
and page flow permit the users to accomplish their tasks with a minimum of
time and confusion.
Students in MIT course 6.001 (Structure and Interpretation of Computer Programs, based on the Abelson/Sussman textbook of the same name) learn all of
the above in one semester, albeit not very thoroughly. By the end of the semester, they're either really excited about the challenges in computer science
or . . . they've wised up and switched to biology.
Survey courses have been similarly successful on the electrical engineering
side of our department. In the good old days, MIT offered 6.01, a linear networks course. Students learned RLC networks in detail. But they forgot why
they'd wanted to major in electrical engineering. Today the first hardware
course is 6.002, where students play with op-amps before learning about the
transistor!
One of the most celebrated courses at MIT is the Aeronautics and Astronautics department's Unified Engineering. Here is the first semester's description
from the course catalog:
Presents the principles and methods of engineering, as well as their interrelationships
and applications, through lectures, recitations, design problems, and labs. Disciplines
introduced include: statics, materials and structures, dynamics, fluid dynamics, thermodynamics, materials, propulsion, signal and system analysis, and circuits. Topics: mechanics
of solids and fluids; statics and dynamics for bodies systems and networks; conservation of
mass and momentum; properties of solids and fluids; temperature, conservation of energy;
stability and response of static and dynamic systems. Applications include particle and
rigid body dynamics; stress and deformations in truss members; airfoils and nozzles in
high-speed flow; passive and active circuits. Laboratory exposure to empirical methods in
engineering; illustration of principles and practice. Design of typical aircraft or spacecraft
elements.
Note that this is all presented in one semester, albeit with double the standard
credit hours. For almost every topic in the course description, MIT has one or
more full-semester courses exclusively devoted to that topic.
Experiences like these led us to develop Software Engineering for Internet
Applications and the corresponding survey course in building computer systems
for collaboration.
We consider Mozart to have been creative although he did not develop new
musical forms, relying instead on the structure laid down by Haydn. A student
will accomplish more if he or she can spend the first months of a project working rather than figuring out in which field to work, roughly what the scope
of the project should be, what tools to choose from an unlimited palette, and so
forth.
Java. A quick glance at Unified Modeling Language (UML) might make one
think that this is a useful nod in the direction of formal specification of Internet
applications. However, UML cannot be compiled into a working system nor
can it be verified against a system built in executable languages such as SQL
and Java. Even if students mastered the 150 primitives of UML, the only thing
that they would learn is that people in the IT industry can get paid high salaries, despite never having learned to write clear English prose. Object Role
Modeling (ORM), however, is a high-level formal specification language that
looks promising for automatic code generation in the coming years.
shared workspaces for students. This is a shame because for learning most technical material it is much more effective for students to work together and live
separately. A student working in a common laboratory with teaching assistants
and fellow students nearby won't get stuck on something simple, such as "How
do I launch SQL*Plus?" If you can possibly arrange a room with a bunch of
desks and PCs and make that the center of your class, this will be an enormous
help to less experienced students.
to the idea that they are going to be speaking up in class, we pick a few
examples of online communities from the public Internet and ask students
to criticize the features and user interface. We follow this with a 20-minute
introduction of the RDBMS. Remind students that they must turn in the
Basics problems in one week or be dropped from the class.
• Week 1, Meeting 2: In grappling with the Basics problem set, the students
have now had a chance to work with SQL. We give a 20-minute lecture on
serialization and concurrency control in the RDBMS, pointing out the practical differences between optimistic and pessimistic locking. The rest of the
class time is devoted to pitches by prospective clients. The clients introduce
themselves and explain what they want to accomplish with their Internet application. Each client should get about 5 minutes. For those projects where
the client is unable to present in person, an instructor gives the pitch on behalf of the client.
• Week 2, Meeting 1: Students turn in the Basics problems. Today is the
day that you assign teams to clients, and hence today is the day that you decide who is staying in the class. Drop anyone who did not turn in the problem set. They are not capable of building database-backed Web pages and
hence are very unlikely to catch up. Most of the class time is devoted to code
review on the Basics problems. You have secretly been surfing around before class looking at source code from various students. You're looking to get
a discussion going on at least the following issues: (a) lack of commenting or
identified authorship, (b) error handling in the comparative shopping problem, (c) different approaches to generating unique keys in the face of concurrency, (d) escaping single-quote characters in the search pages, (e) user
interface design for the quote personalization system (tables versus bulleted
lists, kill buttons versus checkboxes and a submit button), (f) different
ways of parsing XML. Spend the last 5 to 10 minutes of class with some hints
on working with the client. Students often have the most trouble contacting
their client. They'll say, "I sent him email a week ago, but he hasn't responded." Remind them to pick up the phone twice per day until they get a
phone or in-person meeting with their client.
(Giving students one week to do the Basics problem set seemed harsh to
us, and hence we decided one term to give them two weeks to do it. Rather
than spreading the work out, the result was that most students did nothing
until two or three days before the due date and ended up staying up all
night.)
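The difference between optimistic and pessimistic locking raised in the Week 1, Meeting 2 lecture can be made concrete with a small sketch. This is an illustration only, not code from the course: it uses Python's built-in sqlite3 module rather than the RDBMS used in class, and the quotes table, its version column, and the helper names are all hypothetical. Instead of holding a lock on a row while a user edits it (the pessimistic approach), each writer remembers the version number it read, and the update succeeds only if that version is still current. Passing values as bind variables also addresses issue (d) from the code review, because a single quote in the text is never parsed as SQL syntax.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one row to edit (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table quotes ("
        " quote_id integer primary key,"
        " body text not null,"
        " version integer not null default 0)"
    )
    conn.execute("insert into quotes (quote_id, body) values (1, 'original')")
    conn.commit()
    return conn


def read_quote(conn, quote_id):
    """Return (body, version) so the caller can remember which version it saw."""
    return conn.execute(
        "select body, version from quotes where quote_id = ?", (quote_id,)
    ).fetchone()


def update_if_unchanged(conn, quote_id, new_body, version_seen):
    """Optimistic update: succeed only if nobody else changed the row first."""
    cur = conn.execute(
        "update quotes set body = ?, version = version + 1 "
        "where quote_id = ? and version = ?",
        (new_body, quote_id, version_seen),  # bind variables, no manual quote escaping
    )
    conn.commit()
    # rowcount is 1 if our version was still current, 0 if someone got there first
    return cur.rowcount == 1
```

If two editors both read version 0, the first call to update_if_unchanged returns True and the second returns False, so the second editor can be shown the newer text rather than silently overwriting it.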
• we want to make sure that students are reading and re-reading the principles
outlined in this textbook
• we want to make sure that students understand data modeling and
concurrency
• we want to see if a student is capable of writing good analyses of Internet
applications and compelling justifications of his or her design work
• by giving take-home exams rather than in-class quizzes we are able to create
an experience that will add to the students' skills
A good style of question involves asking the students to try out a particular
public Internet service and then build a data model that would support what
they've just seen. The students should then load their data model and try to
solve some SQL puzzles against it.
Another good question asks the students to visit a public Internet application, try it out, and write a critique of the user experience. In our exam we include the following admonition: "Your critique should be clear concerning what
is wrong with the current system. Your critique should be explicit about what
to change, such that a junior programmer could implement your improvements
without depending on his or her own taste and judgment."
You might also want to ask the students to propose and justify a hardware
and software architecture to handle a specific service and user load.
Note that all of these questions are sufficiently open-ended to lead to interesting classroom discussion. Note further that these exams must be graded by
someone experienced with software engineering and data modeling.
Finding Clients
A real-world client has much to offer your students. A real-world client will
phrase problems in vague and general terms. A real-world client will bring content and users to flesh out what would otherwise be a purely academic exercise.
A real-world client can provide students with performance feedback. A real-world client forces students to confront the challenge of demonstrating their
achievement to a non-technical audience.
What can your students offer real-world clients? In some cases, a student
team will build a launchable, documented, maintainable, high-performance system that the client can run for years. This happy result, however, is not necessary
in order for a client to get value from participating in a course based on
this textbook. Oftentimes working with a student team will enable a client to
make decisions and formulate precise specifications. Most people are unable
to make good decisions about information systems without seeing a prototype.
We don't promise clients that their student team will solve their problem, but
we do promise clients that the experience will clarify their goals and, whatever
else, will be over in 3.5 months.
Working groups within your own university can be a good source of clients.
Groups that need to work with off-campus people, such as alumni, parents, or
colleagues at other institutions, are especially logical candidates for online community support. Non-profit organizations can also be good sources of projects
because they are usually much more patient than for-profit corporations and
can afford to (a) wait for your semester to start, and (b) start over if necessary
at the end of your semester in the event that the student team does not produce
a launchable system. For-profit organizations can provide well-organized and
highly motivated clients. Both cash-starved startups and small neglected departments within larger companies may be attracted to working with a student
team. With any potential client, however, try to make sure that they have
enough resources to gather content and users.
A bit of diversity among the client projects is nice, but at their cores all of the
client projects should be online communities. At the very least, a project needs
to have a discussion forum where User A can ask a question that User B will
answer. Much of the value in this course comes from student teams comparing
their differing approaches to the similar challenges of user registration, content
management, and discussion support. If a client wants a 100-percent voice interface, their team won't be able to learn from other teams very effectively nor
will other teams building primarily Web browser sites be able to learn from the
voice-browser-only team. If a client says "I want an online store," just respond
"no." If a client says "I want an online store where the customers talk to each
other," respond with "Okay, but the students aren't going to build the checkout
pages until the end of the term, and you'll have to offer them summer jobs if
you want e-commerce admin pages."
Here are some criteria for selecting among clients:
• spirit of the project; does it look like an online learning community in which
the users share a common purpose and the more experienced will teach the
less experienced?
Alumni Mentors
In 1950 tuition at Ivy League schools was about $500 and the average new car
cost nearly $2,000 (4 times tuition). In 2003 tuition is approaching $30,000 per
year and a beautiful Honda Accord can be had for $15,000 (1/2 of tuition).
Thanks to improvements in design and manufacturing engineering, the relative
price of an automobile has fallen by a factor of 8 while its quality has improved
dramatically. Why has the cost of a university education soared relative to automobiles and other manufactured goods? Consider the classroom circa 1950:
25 students, 1 teacher, 1 blackboard, 25 chairs. Compare to the classroom experience circa 2005: 25 students, 1 teacher, 1 blackboard, 25 chairs. Even if universities were to exercise restraint in the hiring of administrative staff, the cost
of tuition is doomed to outstrip inflation because education is the only industry
in America where there are no productivity improvements.
This problem is not too severe for teaching Physics 101. The school pays one
instructor and fills a room with 300 tuition-paying students. But teaching software engineering effectively requires that students be given an apprenticeship.
No school will want to pay the army of instructors that would represent an
optimum-sized teaching staff for a software engineering project course like this
one. Even if a school had an infinite amount of money, professors and graduate
students are probably the wrong people for the job. How much experience
does the average academic computer scientist have in comparing a collection
of software source code to a statement of user requirements and suggesting
improvements?
We can solve the staffing and expertise problems in one stroke by bringing
in alumni volunteers. A typical school has 10 or 20 times as many alumni as
current students. If students are broken up into teams of 3 and each volunteer
can assist two teams, we only need to convince approximately 1 percent of our
alumni to volunteer each semester. As working software engineers, our graduates will likely do a much better job of assisting students than a fresh graduate
student would and perhaps even a better job in some areas than a seasoned
professor.
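The 1 percent figure can be checked with a few lines of arithmetic. This sketch assumes, as the estimate above seems to, that every current student is on a project team; the function and parameter names are made up for illustration.

```python
def alumni_fraction_needed(alumni_per_student, team_size=3, teams_per_volunteer=2):
    """Fraction of alumni who must volunteer so that every team has a mentor."""
    # One volunteer covers team_size * teams_per_volunteer students.
    students_per_volunteer = team_size * teams_per_volunteer
    # Each student corresponds to alumni_per_student alumni in the pool.
    return 1.0 / (students_per_volunteer * alumni_per_student)


for ratio in (10, 20):
    share = alumni_fraction_needed(ratio)
    print(f"{ratio} alumni per student: {share:.2%} of alumni must volunteer")
```

With 10 alumni per student the share is 1/60 (about 1.7 percent) and with 20 it is 1/120 (about 0.8 percent), which brackets the approximately 1 percent quoted. If only part of the student body takes the course in a given semester, the required share is smaller still.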
A course based on Software Engineering for Internet Applications is uniquely
amenable to alumni mentoring because all of the students' work is accessible
from any Web browser anywhere on the Internet. Between the plans and the
/doc directory and the mandated "View Source" links at the bottom of every
student-authored page, an alumnus 3,000 miles away ought to be able to contribute almost as effectively as someone who is willing to come down to campus
two nights per week.
Similarly if a usability study shows that test users are able to accomplish tasks
quickly and reliably, what does your opinion of the page flow matter? During
most of this course we try to act as coaches to help our students achieve high
performance as perceived by their clients and end-users. We use every opportunity to arrange for students to get real-world feedback rather than letter grades
from us.
The principal area where we must retain the role of evaluator is in looking at
a team's documentation. The main question here is "How easy would it be for
a new team of programmers, with access only to what is in the /doc directory
on a team's server, to take over the project?"
Sample Contract
Date:
Print name:
Signature:
Date:
Print name:
Signature:
Date:
Print name:
Signature:
Date:
Print name:
Signature:
Date:
Eve Andersson
Eve is Senior Vice President and
Chair of the Bachelor of Science in
Computer Science at Neumont
University in Salt Lake City, Utah.
She has engineered dozens of
enterprise Web applications and a
handful of voice applications. Her
open-source software for building
online communities and e-commerce
sites has been adopted by thousands
of Internet application operators
worldwide. Eve is a co-author of
Stephen Breitenbach et al., Early
Adopter VoiceXML (Wrox Press,
2001).
Eve holds a B.S. from Caltech in
Engineering and Applied Science,
and an M.S. from U.C. Berkeley in
Mechanical Engineering (1998). She
was Visiting Professor of Computer
Science at Galileo University in
Guatemala in 2002, where she led
the development of the university's
learning management system. She
can recite the first few hundred
digits of pi from memory, although