Wrox - Web Standards Programmer's Reference - HTML CSS JavaScript Perl Python and PHP
Steven M. Schafer
Web Standards Programmer’s Reference:
HTML, CSS, JavaScript®, Perl,
Python®, and PHP
Published by
Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com
Copyright © 2005 by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN-13: 978-0-7645-8820-4
ISBN-10: 0-7645-8820-6
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1
1MA/SQ/QX/QV/IN
Library of Congress Cataloging-in-Publication Data:
Schafer, Steven M.
Web standards programmer's reference : HTML, CSS, JavaScript, Perl, Python, and PHP / Steven M. Schafer.
p. cm.
Includes index.
ISBN-13: 978-0-7645-8820-4 (paper/website)
ISBN-10: 0-7645-8820-6 (paper/website)
1. HTML (Document markup language) 2. Web site development. I. Title.
QA76.76.H94S2525 2005
006.7'4--dc22
2005012600
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means,
electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of
the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization
through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA
01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Legal
Department, Wiley Publishing, Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4355, or
online at http://www.wiley.com/go/permissions.
LIMIT OF LIABILITY/DISCLAIMER OF WARRANTY: THE PUBLISHER AND THE AUTHOR MAKE NO REPRESEN-
TATIONS OR WARRANTIES WITH RESPECT TO THE ACCURACY OR COMPLETENESS OF THE CONTENTS OF
THIS WORK AND SPECIFICALLY DISCLAIM ALL WARRANTIES, INCLUDING WITHOUT LIMITATION WAR-
RANTIES OF FITNESS FOR A PARTICULAR PURPOSE. NO WARRANTY MAY BE CREATED OR EXTENDED BY
SALES OR PROMOTIONAL MATERIALS. THE ADVICE AND STRATEGIES CONTAINED HEREIN MAY NOT BE
SUITABLE FOR EVERY SITUATION. THIS WORK IS SOLD WITH THE UNDERSTANDING THAT THE PUBLISHER IS
NOT ENGAGED IN RENDERING LEGAL, ACCOUNTING, OR OTHER PROFESSIONAL SERVICES. IF PROFES-
SIONAL ASSISTANCE IS REQUIRED, THE SERVICES OF A COMPETENT PROFESSIONAL PERSON SHOULD BE
SOUGHT. NEITHER THE PUBLISHER NOR THE AUTHOR SHALL BE LIABLE FOR DAMAGES ARISING HEREFROM.
THE FACT THAT AN ORGANIZATION OR WEBSITE IS REFERRED TO IN THIS WORK AS A CITATION AND/OR A
POTENTIAL SOURCE OF FURTHER INFORMATION DOES NOT MEAN THAT THE AUTHOR OR THE PUBLISHER
ENDORSES THE INFORMATION THE ORGANIZATION OR WEBSITE MAY PROVIDE OR RECOMMENDATIONS IT
MAY MAKE. FURTHER, READERS SHOULD BE AWARE THAT INTERNET WEBSITES LISTED IN THIS WORK MAY
HAVE CHANGED OR DISAPPEARED BETWEEN WHEN THIS WORK WAS WRITTEN AND WHEN IT IS READ.
For general information on our other products and services please contact our Customer Care Department within the
United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, Programmer to Programmer, and related trade dress are trade-
marks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries,
and may not be used without written permission. All other trademarks are the property of their respective owners. Wiley
Publishing, Inc., is not associated with any product or vendor mentioned in this book.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available
in electronic books.
Credits
Senior Acquisitions Editor: Jim Minatel
Project Coordinator: Kristie Rees
Introduction
Chapter 5: Images
Image Formats
Web Formats
Transparency
Interlaced and Progressive Storage and Display
Animation
Creating Images
Commercial Applications
Open Source Applications
Operating System Built-In Applications
Using Premade Images
Inserting Images into Web Documents
Image Attributes
Specifying Text for Nongraphical Browsers
Image Size
Image Alignment and Borders
Image Maps
Specifying an Image Map
Specifying Clickable Regions
Putting It All Together
Summary
Chapter 6: Links
Understanding URLs
Absolute versus Relative Paths
Using the Anchor Tag
Attributes of the Anchor Tag
Link Titles
Keyboard Shortcuts and Tab Orders
Link Colors
Document Relationships
The Link Tag
Summary
Chapter 7: Text
Methods of Formatting Text
The Font Tag
Inline Text Attributes
CSS Text Control
Special In-Line Text Elements
Nonbreaking Spaces
Soft Hyphens
Bold and Italic
Monospaced Text
Superscript, Subscript, Big, and Small Text
Insertions and Deletions
Abbreviations
Grouping In-Line Elements
Summary
Chapter 8: Tables
Parts of a Table
Formatting Tables
Table Width and Alignment
Cell Spacing and Padding
Borders and Rules
Rows
Cells
Captions
Header, Footer, and Body Sections
Backgrounds
Spanning Columns and Rows
Grouping Columns
Using Tables for Page Layout
Floating Page
Odd Graphic and Text Combinations
Navigational Blocks
Multiple Columns
A Word About Frames
Summary
Chapter 9: Forms
Understanding Forms
Form Handling
Passing Form Data
The Form Tag
The Input Tag
The name and id Attributes
Text Input Boxes
Password Input Boxes
Radio Buttons
Checkboxes
List Boxes
Large Text Areas
Hidden Fields
Buttons
Images
File Fields
Submit and Reset Buttons
Field Labels
Fieldsets and Legends
Tab Order and Keyboard Shortcuts
Preventing Changes to Fields
Summary
Nonparsed Data
Entities
Namespaces
Style Sheets
Using XML
Extensible Stylesheet Language Transformations (XSLT)
XML Editing
XML Parsing
Summary
Chapter 14: Text
Aligning Text
Horizontal Alignment
Vertical Alignment
Indenting Text
Controlling White Space
Floating Objects
The white-space Property
Letter and Word Spacing
Capitalization
Text Decorations
Formatting Lists
Any Element Can Be a List Item
The list-style-type Property
Positioning of Markers
Images as List Markers
Autogenerating Text
Define and Display Quotation Marks
Automatic Numbering
Fonts
Font Selection
Font Sizing
Font Styling
Line Spacing
Font Embedding
Summary
Determining the Document Object Model
Uses for JavaScript
Incorporating JavaScript in Your Documents
Anatomy of the <script> Tag
Placement of the Script Tag
Execution of Scripts
Summary
Basic Perl Syntax
Data Types and Variables
Data Types
Variables
Special Variables
Calculations and Operators
Control Structures
While and Until
For
Foreach
If Else
More Loop Control: Continue, Next, Last, Redo
Regular Expressions
Regular Expression Operations
Regex Special Characters
Example Expressions
Modifying Expressions
Memorizing Substrings
Built-in Functions
User-Defined Functions
File Operations
Standard Operating Procedure
Opening a File
Reading from a Text File
Writing to a Text File
Closing a File
Working with Binary Files
Getting File Information
Other File Functions
Objects
Perl’s Object Nomenclature
Perl Constructors
Accessing Property Values
Modules
Using Perl for CGI
Perl Errors and Troubleshooting
Maximum Error Reporting
The Apache Internal Server Error Message
Summary
Chapter 27: Scripting with Other Executable Code
Requirements for CGI
Sample CGI Using Bash Shell Scripting
Configuring Apache to Deliver Bash Scripts
Getting Data into the Script
Getting Data Out of the Script
Doing Useful Things
Summary
PHP Examples
Date and Time Handling
Handling Form Data
Using Form Data
Accessing Databases
Summary
FTP
Function Handling
HTTP
Iconv Library
Image
IMAP
Mail
Math
MIME
Miscellaneous
MS SQL
MySQL
Network Functions
ODBC
Output Buffering
PCRE
PHP Options and Info
Program Execution
Regular Expressions
Sessions
Simple XML
Sockets
SQLite
Streams
Strings
URL
Variable Functions
XML
ZLib
Index
Introduction
The Web has matured quickly from a textual reference to a medium suitable for publishing just about
any document imaginable, conveying any idea, containing any type of information. As the Web grows,
it envelops people of all types: research professionals, companies, and even individuals. People from
all walks of life and with all levels of technical ability are expected to have a Web presence.
As a consequence, most technical people have been relied upon to know more about the technologies
involved in publishing on the Web. Unfortunately, despite what most non-technical people think, tech-
nical people don’t automatically understand all things Web-related. The evolving standards, increasing
number of platforms that are Web capable, and number of technologies that can be employed in Web
publishing conspire to create a morass of technologies that must be addressed.
This book does its best to cover the basics of all of the technologies central to Web publishing:
This book is not designed to be a comprehensive beginner’s tutorial for every standard in Web publish-
ing. That would require six or seven books each this size or larger. In fact, if you’ve never done any Web
publishing or other programming, this may not be the right book for you to start with. (See the next sec-
tion, “Who Is This Book For?”) However, if you don’t need an exhaustive tutorial on each language and
are looking for core usage examples and syntax for several popular Web standards, this book is the all-
in-one reference for Web standards that programmers should turn to when needing to learn or reference
information on the core publishing technologies.
Wiley and WROX have several additional books that should be considered as supplements to this
book. Almost all of the technologies covered in this book have appropriate Beginning and Professional
titles that cover the technology in more depth. Browse for the subjects that most interest you at
http://www.wrox.com.
❑ Programmers familiar with traditional programming languages who wish to learn more about
Web technologies so they can expand their programming capabilities to deliver standards-
compliant Web content
❑ Web designers familiar with HTML and related technologies who wish to become familiar with
scripting languages to expand their capabilities on the Web
The first third of this book covers XHTML and CSS, the backbone technologies for Web content.
Programmers who want to learn about the current XHTML and CSS standards and how they are used
to format and convey content should spend time working through the chapters in these parts.
The second third of the book covers scripting: client-side JavaScript, server-side CGI (Perl and Python),
and PHP. These parts of the book introduce the programming technologies prevalent on the
Web. These programming languages (commonly known as scripting languages) can be used to help create
and deploy more dynamic and powerful content via the Web. Anyone looking to learn how to use
scripting to expand the capabilities of their online documents should read this part of the book.
The last third of the book contains reference Appendices useful for looking up the syntax and capabili-
ties of the specific technologies.
See the section, “How This Book Is Organized” later in this Introduction for a full breakdown of the
book’s contents.
Personally, I’ve been coding for the Web for several years. However, only recently have I begun to adhere
to the W3C specifications and produce standards-compliant HTML. The road to this point has been a bit
painful, but also very rewarding and something I hope to communicate in every example within this
book.
It’s important to understand that you can code for the Web while ignoring the standards, but you shouldn’t.
Most browsers (especially the oft-used Microsoft Internet Explorer) will allow sloppy coding, actually
correcting common code errors. However, this doesn’t guarantee that your document will appear the
way you intended. The auto-correcting behavior of some browsers can also make designers complacent
in their non-conformance. I’ve often heard designers claim, “It looks fine in browser X,” when trying to
defend a non-standards-conforming document.
You should do the following:
❑ Code to the standards, not the browsers. Trust that most browsers will support the W3C stan-
dards and correctly render documents that are coded to said standards.
❑ Test your documents against the most popular browsers (Microsoft IE and Mozilla Firefox) or
your target browser(s), if known.
This book does its best to cover only the published standards for XHTML and CSS, ignoring browser-
specific extensions, transitional DTDs, quirks-modes, and anything else non-standard. There are a few
areas where this approach is difficult to achieve:
❑ Some tags/attributes that have been deprecated have not had their features replaced with CSS.
In those cases, the deprecated tags or attributes are covered, but discouraged.
❑ JavaScript is especially testy when required to be cross-platform (Internet Explorer and Mozilla).
As such, the JavaScript chapters cover tricks to help your scripts exist peacefully on both
platforms.
In the event that a desired effect can only be achieved by deprecated methods, the methods are covered
but disclaimed as deprecated, and their use is not recommended.
The author recognizes that there are more user agent platforms than just Internet Explorer and Mozilla
Firefox, such as Opera, for example. However, the author also recognizes that the most popular browsers
are IE and Firefox themselves, or are based on the IE or Mozilla (Gecko) codebase. As such, this book
highlights these two browsers—the few times we highlight any specific browser.
chapter covers the Document Object Model, a standard method of identifying document elements and
working with their attributes and properties (Chapter 21). The concept of Dynamic HTML—the practice
of creating dynamic content through the synergy of JavaScript and HTML—is covered in the next chap-
ter (Chapter 22). The last chapter in this part shows you practical examples of JavaScript in use, includ-
ing sample code and explanations thereof (Chapter 23).
Part V—PHP
This part of the book covers the relatively new, but exciting and powerful, Web scripting language PHP.
The first chapter in this part covers the basics of the PHP language (Chapter 29), the second chapter
covers the language in-depth (Chapter 30), and the third chapter covers practical examples of using PHP
(Chapter 31).
Part VI—Appendixes
The reference appendixes of this book provide comprehensive referential material on the technologies
covered in this book. These references are designed to be used with the chapters where the technology is
covered. The chapters cover learning and using the technologies while the appendixes provide the com-
prehensive reference into the technologies as a whole.
Terminology
This book uses somewhat unusual terminology regarding the World Wide Web and the content published on it.
Most references of this type refer to content on the Web in terms of pages or sites. However, the author
maintains that the Web has grown into an actual publishing medium in which rich content can be easily
developed and deployed, so this book uses the term “document” in lieu of page or site. On today’s
Web, content can be as rich as any book, magazine, or other document-based medium.
In fact, with the abundance of multimedia options available, the Web often exceeds “document” publish-
ing standards.
To the same end, this book routinely refers to XHTML tags by their name, not their coding. For example,
you will see descriptions of the span element, instead of <span>. Also, because XHTML tags must come in
opening and closing pairs, when we do refer to the tags by their codes we will refer only to
the opening tag (for example, <body> tags, instead of <body> and </body> tags).
This book also avoids using the familiar term browser when referring to the application rendering XHTML
and other Web-related technologies into visual presentations. Instead, the book refers to such applica-
tions as user agents. This is due to the fact that a wide range of software and devices now render Web
technologies into presentation formats. The scope of serviceable XHTML rendering tools isn’t as narrow
as it once was—reserved for a few applications known as browsers (Internet Explorer, Mozilla, Opera,
and so on). This book assumes that the reader wants to provide content for as many platforms as possi-
ble, even those outside the familiar application (browser) setting.
Code Listings
There are several ways that code is conveyed in this book.
When code is represented in line, within normal text, it is presented in a special, monospaced font such as:
The Wiley Web site can be found at http://www.wiley.com.
Inline code is reserved for short examples, URLs, and other short pieces of text.
When longer listings are required, they appear in a listing format similar to the following:
If particular sections of the listing need to be specially referenced, they will appear with a gray back-
ground like the <title> section in the preceding listing.
Within code listings we often need to show that the listings contain placeholder information that may be
different in actual use. For example, the following code shows that the margin-top property needs an
argument indicating what the margin should be set to:
margin-top: margin-value;
In such cases, we will use italic keywords representing the variable information. In the preceding listing,
margin-value is the placeholder for the value of the margin. In actual use, margin-value would be
replaced by an actual value, such as the following:
margin-top: 25px;
This paragraph contains important information that deserves the reader’s attention. It is reserved for
special notes outside the normal flow of the text, cautions that the reader should be aware of, and other
information of special importance.
Source Code
Code from this book can be found on the WROX Web site, namely http://www.wrox.com. You can use
the search function to search for this book or use the topical listings to find it. Note that many books are
similar in title and searching for the ISBN (0-7645-8820-6) instead of the title might yield quicker
results.
The Basics of HTML
Before you begin to code HTML pages for the Web, it is important to understand some of the tech-
nology, standards, and syntax behind the Web. This chapter introduces you to HTML and answers
the following questions:
Subsequent chapters in this section delve into the specifics of HTML, covering various tags you
can use to format your pages.
The World Wide Web is a network of computers that, using the Internet, are able to exchange text,
graphics, and even multimedia content using standard protocols. Web servers — special computers
that are set up for the distinct purpose of delivering content — are placed on the Internet with spe-
cific content for others to access. Web clients — which are generally desktop computers but can also
be dedicated terminals, mobile devices, and more — access the servers’ content via a browser. The
browser is a specialized application for displaying Web content.
For example, Google maintains many Web servers that connect to their database of content found
on the Web. You use your home or office PC to connect to the servers via a browser such as
Microsoft’s Internet Explorer or Mozilla’s Firefox (shown in Figure 1-1).
Chapter 1
Figure 1-1
If you were to make a diagram of the relationships between all the technical components involved in
requesting and delivering a document over the Web, it would resemble the diagram shown in Figure 1-2.
Figure 1-2 (diagram: the user agent sends requests to the Web server, which returns documents from server storage)
Creating a Web
The Web was created as a replacement for the aging Gopher protocol. Gopher allowed documents across
the Internet to be linked to each other and searched. The inclusion of hyperlinks — embedded links to other
documents on the Web — gives the resulting technology its name because it resembles a spider’s web.
Figure 1-3 shows a graphic representation of a handful of sites on the Web. When a line is drawn
between the sites that link to one another, the web becomes more obvious.
Figure 1-3 (diagram of linked sites: National Business Directory, Company X (Acme Partner), ISO and Standards, Acme Inc. (U.S.), and Bob’s Web log (Acme COO))
However, the Web doesn’t operate as the diagram would have you believe. One Web site doesn’t go to
another for information; your browser requests the information directly from the server where the infor-
mation can be found. For example, suppose you are on the Acme Inc US site in Figure 1-3 and click the
link to Company X. The Acme Inc US server doesn’t handle the request for the external page; your
browser reads the address of the new page from the hyperlink and requests the information from the
server that actually hosts that page (Company X in the example from Figure 1-3).
Hyperlinks contain several pieces of vital information that instruct the Web browser where to go for the
content. The following information is provided:
The information is assembled together in a URL. The information is presented in the following form:
❑ The path to the file being requested, beginning with a slash, with a slash between each directory
in the path and a slash at the end (for example, /options/)
❑ The name of the file being requested (for example, index.html)
Most Web servers are configured to deliver specific documents if the browser doesn’t explicitly request a
document. These specific documents differ between server applications and configurations but are gen-
erally documents such as index.html and home.html. For example, the following two URLs will
return the same document (index.html):
http://www.google.com/options/
http://www.google.com/options/index.html
Figure 1-4 (the URL http://www.google.com/options/index.html labeled by part: protocol http, server www.google.com, path /options/, file index.html)
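The four URL parts labeled in Figure 1-4 can be pulled apart programmatically. The following is a minimal sketch in Python using the standard urllib.parse module; the function names split_url and requested_document are hypothetical, and the default document name index.html is an assumption that depends on how a particular server is configured.

```python
from urllib.parse import urlsplit

# Assumption for illustration: the server's default document is index.html.
DEFAULT_DOCUMENT = "index.html"


def split_url(url):
    """Break a URL into the four parts labeled in Figure 1-4."""
    parts = urlsplit(url)
    # An empty path is treated as the site root.
    full_path = parts.path or "/"
    # Everything up to and including the last slash is the path;
    # whatever follows it is the file name (possibly empty).
    directory, slash, filename = full_path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "server": parts.netloc,
        "path": directory + slash,
        "file": filename,
    }


def requested_document(url, default=DEFAULT_DOCUMENT):
    """Return the document a server would deliver, filling in the default."""
    pieces = split_url(url)
    return pieces["path"] + (pieces["file"] or default)


if __name__ == "__main__":
    print(split_url("http://www.google.com/options/index.html"))
    print(requested_document("http://www.google.com/options/"))
```

Under these assumptions, both example URLs resolve to the same document, which mirrors the default-document behavior described above.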
Although HTTP is the protocol of choice for the Web, most browsers support additional protocols such
as the File Transfer Protocol (FTP).
Much like other protocols, an HTTP conversation consists of a handful of commands from the client and
a stream of data from the server. Although discussing the whole HTTP protocol is beyond this book’s
scope, it is important to grasp the basics of how the protocol operates. By using a telnet client, you can
“talk” to a Web server and try the protocol manually as shown in the following code (text typed by the
user appears with a gray background):
telnet localhost 80
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /index.html HTTP/1.1
Accept: text/plain,text/html
Host: localhost
User-Agent: Telnet
HTTP/1.1 200 OK
Date: Sun, 17 Oct 2004 23:47:49 GMT
Server: Apache/1.3.26 (Unix) Debian GNU/Linux PHP/4.1.2
Last-Modified: Sat, 26 Oct 2002 09:12:14 GMT
ETag: "19b498-100e-3dba5c6e"
Accept-Ranges: bytes
Content-Length: 4110
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
<BR>
<HR NOSHADE>
<BR>
<P>This is a placeholder page installed by the <A
HREF="http://www.debian.org/">Debian</A>
release of the <A HREF="http://www.apache.org/">Apache</A> web server package,
because no home page was installed on this host.
...
</BODY>
</HTML>
Connection closed by foreign host.
The telnet client is started with the name of the host to connect to and the port number (80):
telnet localhost 80
Once the client is connected, the server waits for a command. In this case, the client (our telnet session)
sends a block of commands, including the following:
❑ The document to be retrieved and the protocol to return the document (GET and HTTP 1.1)
❑ The types of documents the client expects or can support (plain text or HTML text)
❑ The host the request is destined for (typically the fully qualified domain name of the server)
❑ The name of the user agent (browser) doing the requesting (Telnet)
Note that only the request line and the Host header are strictly required by HTTP/1.1; the Accept header is
optional, and the user agent name is provided only as a courtesy to the Webmaster, because it is recorded in the server logs.
This block of commands is known as the header and is required to be followed by a blank line, which
indicates to the server that the client is done with the header. The server then responds with information
of its own, including the following:
HTTP/1.1 200 OK
❑ The server identification string, which usually identifies the type and capabilities of the server
but can be configured differently
❑ Information about the document being delivered (date modified, size, encoding, and so on)
Last-Modified: Sat, 26 Oct 2002 09:12:14 GMT
ETag: "19b498-100e-3dba5c6e"
Accept-Ranges: bytes
Content-Length: 4110
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1
❑ The content of the document itself (in this case, the default Debian/GNU Linux Apache wel-
come page)
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
. . .
A few seconds after the full document is delivered, the server closes the connection.
This dialog is HTTP at its simplest, but it does a good job of illustrating how the protocol works.
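The same exchange can be sketched in code. The following Python example is a simplified illustration, not a full HTTP client: the function names build_request and parse_response are hypothetical, the header values simply repeat those typed in the telnet session, and the parser assumes a well-formed response that arrives in one piece.

```python
def build_request(path, host, user_agent="Telnet"):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        "Accept: text/plain,text/html",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "",  # the blank line tells the server the header is complete
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw response into status code, reason, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


if __name__ == "__main__":
    print(build_request("/index.html", "localhost").decode("ascii"))
    sample = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Length: 4110\r\n"
        b"Content-Type: text/html; charset=iso-8859-1\r\n"
        b"\r\n"
        b"<HTML>"
    )
    print(parse_response(sample))
```

The request bytes could be sent over a connection opened with Python's socket.create_connection function; the sketch leaves the network step out so that it can be run without a server.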
To date, HTML has gone through four major standards, including the latest, 4.01. In addition to HTML,
Cascading Style Sheets (CSS) and Extensible Markup Language (XML) have also made valuable
contributions to the Web.
Most of the standards used on the Web are developed and/or ratified by the World Wide Web Consortium
(W3C). The resulting specifications can be found online at the W3C Web site, www.w3c.org.
HTML 1.0
HTML 1.0 was never specified by the W3C, as it predated the organization. The standard supported a few
basic tags and graphics, although the latter needed to be in GIF format if used in-line or JPEG format if the
image was out-of-line. You couldn’t specify the font, background images, or colors, and there were no
tables or forms. At the time, only one browser, Mosaic 1.0, was available to view Web documents.
However, the standard became the stepping-stone to the modern Web.
HTML 2.0
The HTML 2.0 standard provided a wealth of improvement over the 1.0 version. Background colors and
images were supported, as were tables and rudimentary forms. Between 1.0 and 2.0, a new browser was
launched (Netscape), and several HTML features created to support features in the new browser became
part of the 2.0 standard.
HTML 3.2
The HTML 3.2 standard significantly increased the capability of HTML and the Web. Many of the new
features enabled Web designers to create feature-rich and elegant designs via new layout tags. Although
the 3.2 specification introduced Cascading Style Sheets (CSS level 1), browsers were slow to adopt the
new way of formatting. The standard did not include frames, but the various browsers implemented
the feature anyway.
There was an HTML 3.0 proposed standard, but it could not be ratified by the W3C in time. Hence, the
next ratified standard was HTML 3.2.
HTML 4.0
HTML 4.0 did not introduce many new features, but it ushered in a new way of implementing Web
design. Instead of using explicit formatting parameters in HTML tags, HTML 4.0 encouraged moving
the formatting parameters to style sheets instead. HTML 3.2 had become burdensome to support with
several dozen tags with several parameters each. The 4.0 standard emphasized the use of CSS, where
formatting changes could be made within one document (the style sheet) instead of individually editing
every page on a site.
XML 1.0
The Extensible Markup Language (XML) was created as a stepping-stone to bring Standard Generalized
Markup Language (SGML) concepts to HTML. Although it was the precursor to HTML, SGML was not
widely endorsed. As such, the W3C sought to create a usable subset of SGML targeted specifically
toward the Web. The new standard was meant to have enough flexibility and power to provide tradi-
tional publishing applications on the Web. XML became part of the new XHTML standard.
When the formatting needs to change, the CSS document alone can be updated, and the changes are
then reflected in all documents that use that style sheet. The “cascade” in the name refers to the feature
that allows styles to be overridden by subsequent styles. For example, the HR department Web pages
for Acme Inc. can use the company style sheet but also use styles specific for the individual department.
The result is that all of the Acme Inc. Web pages look similar, but each department has its own slightly
different look and feel.
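The overriding behavior can be modeled in a few lines. The following Python sketch treats each style sheet as a dictionary and lets later sheets override earlier ones; it is a deliberately simplified model (the real CSS cascade also weighs factors such as selector specificity and the origin of each rule), and the function name cascade and the sample styles are hypothetical.

```python
def cascade(*style_sheets):
    """Merge style sheets in order; later declarations override earlier ones."""
    merged = {}
    for sheet in style_sheets:
        for selector, declarations in sheet.items():
            merged.setdefault(selector, {}).update(declarations)
    return merged


# Hypothetical company-wide styles for Acme Inc.
company = {
    "body": {"font-family": "Arial", "color": "black"},
    "h1": {"color": "navy"},
}

# Hypothetical HR department styles that override only the heading color.
hr_department = {
    "h1": {"color": "green"},
}

if __name__ == "__main__":
    print(cascade(company, hr_department))
```

In this model the HR pages keep the company's body styles and change only the heading color, which matches the similar-but-distinct look described above.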
HTML 4.01
Heralded as the last of the HTML standards, 4.01 fixed errors inherent in the 4.0 specification and made
the final leap to embracing CSS as the vehicle for document formatting (instead of using parameters in
HTML tags).
XHTML 1.0
Extensible Hypertext Markup Language is the latest standard for Web documents. This standard infuses
the HTML 4.01 standard with extensible language constructs courtesy of XML. It was designed to be
used in XML-compliant environments yet be compatible with standard HTML 4.01 user agents. As of
this writing, adoption of the XHTML standard for Web documents has been slow. Although most
browsers natively support HTML 4.01, most do not support the extensibility features of XHTML 1.0.
HTML Tags
Each tag begins with a left-pointing angle bracket (<) and ends with a right-pointing angle bracket (>).
Between the brackets are keywords that indicate the type of tag. Beginning tags include any parameters
necessary for the tag; ending tags contain only the keyword prefixed by a slash.
For example, if you want a word to be bold in a document, you would surround it with bold tags (<b>
and </b>) similar to the following:
Many tags require child tags to operate. For example, the <table> tag itself only marks the position
in the document where a table will appear; it does nothing to format the table into rows and columns.
Several child tags (<tr> for rows, <td> for cells/columns, and so on) are used between the beginning
and ending <table> tags accordingly:
<table>
<tr>
<td>Cell 1</td>
<td>Cell 2</td>
</tr>
<tr>
<td>Cell 3</td>
<td>Cell 4</td>
</tr>
</table>
Notice how the tags are closed in the opposite order they were opened. Although this order is intuitive for structures
like tables, it is less obvious in other cases. For example, consider the following fragment where the phrase
“italic and bold” is formatted with italic and bold tags:
This sentence uses <b><i>italic and bold</b></i> tags for emphasis.
Although this example would generally not cause a problem when rendered via a user agent, there are
many instances where overlapping tags can cause problems. Well-formed HTML always uses nested
tags — tags are closed in the exact opposite order that they were opened.
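For comparison, a properly nested version of the same fragment closes the italic tag before the bold tag. This is a sketch of the corrected markup, not a listing from the original text:

```html
<!-- <i> was opened last, so it is closed first -->
<p>This sentence uses <b><i>italic and bold</i></b> tags for emphasis.</p>
```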
❑ Use liberal white space. Browsers ignore superfluous white space, so you can make use of it to
create documents that are more easily read and maintained. Insert blank lines and follow stan-
dard coding rules for indentation whenever possible.
❑ Use well-formed code. This means following the XHTML standard to the letter — not taking
shortcuts that some browsers allow. In particular, you should pay attention to the following:
❑ Always include a <doctype> tag (<doctype> and other document-level tags are discussed
in Chapter 2).
❑ Elements (tags) must be nested, not overlapping.
❑ All nonempty elements need to be terminated. Most browsers allow for nonclosed
elements, but to meet the XHTML standard you need to supply closing tags for each
open one (for example, supply a closing paragraph tag [</p>] for every open one
[<p>]).
❑ All tags need to be closed. Although the HTML standard allows tags such as <hr>
without a closing tag — in fact, the <hr> tag has no closing mate — in XML all tags
must be closed. In the case of tags like <hr>, you close the tag by putting a slash at the
end of the tag: <hr />.
❑ All attribute values must be quoted. Again, most browsers allow nonquoted
attributes, but the XHTML standard does not.
❑ All attributes must have values. Older HTML standards allowed for tags similar to
the following:
<input type="checkbox" checked>
However, XHTML does not allow attributes without values (for example, checked).
Instead, you must supply a value, such as the following:
<input type="checkbox" checked="checked" />
❑ Comment your code. Using the comment tag pair (<!-- and -->) should be as natural as com-
menting code in programming languages. Especially useful are comments at the end of large
blocks, such as nested tables. It can help identify which part of the document you are editing:
</table> <!-- End of floating page -->
Source
Type the following code into a document and save it, in plain text format, as sample.html on
your local hard drive.
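A minimal stand-in for such a document might look like the following sketch. The title, heading, and paragraph text are placeholders chosen for illustration, so the rendered result will differ in detail from the figure:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <!-- placeholder title -->
  <title>Sample Document</title>
</head>
<body>
  <!-- placeholder content -->
  <h1>A Sample Document</h1>
  <p>This is a <b>sample</b> paragraph of text.</p>
</body>
</html>
```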
Output
Now open the document in a Web browser. In most graphical operating environments, you can
simply use a file manager to find the sample.html file and then double-click on it. Your default
Web browser should open and load the file. If not, select Open (or Open File) from the File menu
and find the sample.html file using the browser’s interface.
Figure 1-5
At this point, you may be asking yourself, “Why don’t I need a Web server?” The reason is simple: The
browser loads and interprets the HTML file from the local hard drive; it doesn’t have to request the file
from a server. However, the file uses only HTML, which is interpreted only by the client side. If you
used any server-side technologies (Perl, PHP, and so on), you would have to load the sample file onto a
Web server that had the appropriate capabilities to process the file before giving it to the client. More
information on server-side technologies can be found in Parts V and VI of this book.
Summary
This chapter introduced you to the World Wide Web and the main technology behind it, HTML. You saw
how the Web works, how clients and servers interact, and what makes up a hyperlink. You also learned
how HTML evolved and where it is today. This basic background serves as a foundation for the rest of
the chapters in this section, where you will learn more about specific HTML coding.
Document Tags
HTML documents are much like word processing documents — they contain information about
the document itself, not just its contents. Understanding the layout of the document is as impor-
tant as forming the document itself. This chapter delves into the details of document-level tags.
The tags that make up the framework of a typical HTML document include the following:
The following sections detail the various tags and sections in a typical Web document.
Web document is used in this book to refer to HTML documents due to the Web becoming closer
to a true publishing platform.
Chapter 2
The DTD specifies each valid element that can be contained in the document, including the attributes for
the element and types of values each can contain. For example, the XHTML 1.0 Strict DTD contains the
following section for the anchor tag (<a>):
<!ELEMENT a %a.content;>
<!ATTLIST a
%attrs;
%focus;
charset %Charset; #IMPLIED
type %ContentType; #IMPLIED
name NMTOKEN #IMPLIED
href %URI; #IMPLIED
hreflang %LanguageCode; #IMPLIED
rel %LinkTypes; #IMPLIED
rev %LinkTypes; #IMPLIED
shape %Shape; "rect"
coords %Coords; #IMPLIED
>
This section specifies the relationship the <a> tag has to the document (in-line) as well as the valid
attributes (charset, type, name, and so on). The structure of the sections within the DTD also indicates
where elements can appear in relationship to one another.
The XHTML Basic 1.0 DTD is a bit different because it applies to a modular standard: its sections
refer to other modular DTDs that contain the actual specification. For example, the DTD contains the
following section on tables:
<!-- Tables Module ............................................... -->
<!ENTITY % xhtml-table.module "INCLUDE" >
<![%xhtml-table.module;[
<!ENTITY % xhtml-table.mod
PUBLIC "-//W3C//ELEMENTS XHTML Basic Tables 1.0//EN"
"xhtml-basic-table-1.mod" >
%xhtml-table.mod;]]>
The DTD is important because without it validation tools and certain clients won’t know how to validate
or otherwise handle your document. You should get in the habit of always including a valid document
type tag at the beginning of your documents.
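For instance, a document written against the XHTML 1.0 Strict DTD would typically begin with a declaration like the one sketched here. The public identifier and URL are those published by the W3C for that DTD; other DTDs use different values:

```html
<!-- document type declaration for XHTML 1.0 Strict; it must precede the <html> tag -->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
```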
You can find a list of valid, public DTDs on the W3C Web site at
http://www.w3.org/QA/2002/04/valid-dtd-list.html.
HTML Tag
The HTML tag (<html>) is the tag that indicates the beginning and end of an XHTML document. Your
documents should always begin with the opening HTML tag (<html>) and end with the closing HTML
tag (</html>).
A variety of browsers (including Microsoft’s Internet Explorer) correctly handle documents that are
missing one, or both, of these tags. However, you should never count on your audience using a particu-
lar user agent or browser and should therefore strive to always write standards-compliant code.
The <title> and <body> elements are discussed in the appropriately titled sections later in this chapter.
The following sections detail some of the various elements found in the head section.
For example, a document with the following title code would cause Mozilla’s Firefox to display “A syn-
opsis of last quarter’s earnings” in its title bar:
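A sketch of such title code appears below. Only the title text comes from the example above; the surrounding head tags are shown for context:

```html
<head>
  <!-- the text between the title tags appears in the browser's title bar -->
  <title>A synopsis of last quarter's earnings</title>
</head>
```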
The document title is also routinely used as a label for the document when added to user favorites, as
the descriptive text for the document in search engines, and so forth. Because of the limited space
granted to document titles, it’s important to keep the title to a reasonable length and on one line.
However, given the wide range of places in which the title can appear, you should also make it
describe the document content as accurately as possible.
Meta Tags
Meta tags enable Web authors to embed extra information in their documents. Meta tags are used to pro-
vide information to search engines, control browser caching of documents, and much more. Most of a
document’s meta information is generated by the Web server that delivers the document. However, by
using <meta> tags, you can supply different or additional information about the document.
For example, a meta tag that provides a description of a document’s content would resemble the following:
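A description tag of this kind might be sketched as follows. The name/content attribute pair is the standard form of the <meta> tag; the description text is a hypothetical value:

```html
<!-- hypothetical description text -->
<meta name="description" content="A synopsis of last quarter's earnings for Acme Inc." />
```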
The full breadth of meta tag uses is outside the scope of this book. Because meta tags are simple
data containers, any entity, program, or user agent can define and accept its own meta tags. For example,
a user agent may require the author’s name on every page. For pages displayed with that user agent,
you could use the following tag:
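A sketch of such a tag, assuming the user agent looks for a meta tag named author; the author name is a placeholder:

```html
<!-- placeholder author name -->
<meta name="author" content="Jane Doe" />
```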
The following sections detail some of the more popular meta tag uses.
Over the last few years, meta tag use and support has been declining. As a result, you should never
depend on meta tags to drive any functionality of your documents.
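One common use is supplying keywords for a document. A sketch, with hypothetical keyword values:

```html
<!-- hypothetical, comma-separated keywords -->
<meta name="keywords" content="earnings, quarterly report, Acme Inc" />
```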
The preceding tag provides keywords (to those agents requesting them) for the document. Some search
engines use these keywords to categorize the document for searching.
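Another common use is giving instructions to search engine robots. The noindex and nofollow values below are the widely recognized robots directives; treat the tag as an illustrative sketch:

```html
<!-- noindex: do not index this document; nofollow: do not follow its links -->
<meta name="robots" content="noindex, nofollow" />
```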
The preceding tag tells conforming robots not to index the current document or follow any links from it.
Note that not all robots follow such directions.
http://www.example.com/products/images/imscr.jpg
Relative paths provide information relative to the location of the current document. For example, if the
same image was referenced in a document contained in the products directory, you could use a relative
path such as the following:
images/imscr.jpg
When the user agent receives that path, it typically appends the path to the content onto the path of the
current document (http://www.example.com/products/). This creates the absolute path to the content
being referenced.
❑ When you move a document, for example, to the legacy_products directory, internal absolute
paths will be broken — they will still refer to resources in the products directory.
❑ Some servers do not handle relative links properly, resulting in broken links due to the server or
user agent incorrectly building the absolute path. This is mostly due to configuration issues but
is something to consider if you don’t have control over the server configuration.
One method to help ensure that all your links continue to function is to provide the correct context to all
parties (server and user agent) via the base tag (<base>). For example, the following tag indicates that
the document exists in the products directory and all relative links should be applied against that path:
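A sketch of such a base tag, using the example.com products path from the earlier path examples:

```html
<!-- relative links in this document are resolved against this URL -->
<base href="http://www.example.com/products/" />
```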
Thereafter, if you have to move the documents in a particular path, you can simply change the base tag
at the top of each:
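For instance, assuming the documents were moved to the legacy_products directory mentioned earlier, the updated tag might read:

```html
<!-- assumed new location: the legacy_products directory -->
<base href="http://www.example.com/legacy_products/" />
```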
The rest of the links in the document (if relative) will be correctly handled.
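Meta tags can also ask a user agent not to cache a document. A common pairing is sketched here, on the assumption that the two tags discussed next are the usual Cache-Control and Pragma directives:

```html
<!-- HTTP 1.1 directive -->
<meta http-equiv="Cache-Control" content="no-cache" />
<!-- HTTP 1.0 fallback -->
<meta http-equiv="Pragma" content="no-cache" />
```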
The first tag (Cache-Control) is generally understood by modern browsers. The second should be used
for browsers that are only HTTP 1.0 compliant, as they do not recognize Cache-Control. (When in doubt
as to the user agents accessing your content, include both.)
Furthermore, you can control how long the document stays in the cache:
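A sketch of a tag that sets a cache lifetime, using the max-age directive and a value of 86,400 seconds:

```html
<!-- 86400 seconds = 24 hours -->
<meta http-equiv="Cache-Control" content="max-age=86400" />
```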
The value of max-age is given in seconds from the user agent receiving the document. In the preceding
example, the user agent is instructed to cache the document for one day (86,400 seconds), at which
point the cached copy should expire.
As previously stated, it is up to the user agent as to whether it will abide by such requests; you should
never design documents that rely on behavior specified in meta tags.
For example, the following tag will reload the document specified after 5 seconds:
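A sketch of such a refresh tag follows. The content attribute holds the delay in seconds followed by the document to load; the target URL is hypothetical:

```html
<!-- wait 5 seconds, then load the (hypothetical) URL -->
<meta http-equiv="refresh" content="5; url=http://www.example.com/newpage.html" />
```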
❑ Notifying users of a moved page. I’m sure you have seen at least one page resembling the
following:
Such pages use the meta tag to direct the user to the appropriate location after the specified
time. Note that the redirection is done by the user agent, not the server.
❑ Refreshing content that changes. A document that tracks stock prices or inventory quantities
should be refreshed automatically so that the data is reasonably accurate. In this case, the docu-
ment specified in the tag should be the current document, causing the document to simply
reload, refreshing its contents. Note that this use generally requires a no-cache directive as
well, helping prevent the user agent from simply loading the same copy from cache.
For a comprehensive list of HTTP 1.1 headers, including cache and other directives, see the HTTP 1.1
definition on the W3C Web site: http://www.w3.org/Protocols/rfc2616/rfc2616.html.
Style Section
The head section is also the area where you should declare any general and local styles for the docu-
ment. All style definitions should be contained within style tags accordingly:
<head>
<title>ACME Products Corporate Web Site</title>
<style type="text/css">
...style definitions go here ...
</style>
...
</head>
Note that the opening style tag includes a type definition so that the user agent knows what to expect —
textual information in CSS format.
You can also refer to an external document containing style definitions (commonly referred to as a style
sheet) using the <link> tag. For example, the following code refers the user agent to an external style
sheet named site.css:
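A sketch of such a link tag, assuming site.css sits in the same directory as the document:

```html
<!-- rel identifies the linked document as a style sheet; href is assumed to be a relative path -->
<link rel="stylesheet" type="text/css" href="site.css" />
```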
The <link> tag can also be used to provide information on documents that are related to the current
document — an index, the next or previous document in a series, and so on. For more information on
attributes and values necessary to specify such information, see Appendix A.
Script Section
You should also place any global scripts inside the head section of the document. For example, if you use
JavaScript for certain features, your head would include a section similar to the following:
<script type="text/javascript">
...script code goes here...
</script>
As with other non-HTML content containers, the opening <script> tag includes identifiers of the content
contained in the section (in this case, textual JavaScript code).
The <script> tag can also be used to refer to an external document containing the script code by
adding the src (source) attribute:
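A sketch of such a tag, assuming the script file is named myscripts.js and sits alongside the document:

```html
<!-- the tag body is left empty because the code lives in the external file -->
<script type="text/javascript" src="myscripts.js"></script>
```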
The preceding code would direct the user agent to find the code for the scripts in the file myscripts.js.
Note that you can (and usually should) include an absolute or relative path to the external script file.
Body Section
The body section of the document is where the visible content appears. This content is typically a series
of block tags containing in-line content — similar to paragraphs containing words and sentences.
The body section is delimited by opening and closing <body> tags and appears after the <head> section
but within the <html> tags. For example, the following code shows the typical structure for a Web
document:
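A skeleton along these lines illustrates the arrangement. The doctype shown is the XHTML 1.0 Strict declaration, and the title and paragraph text are placeholders:

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <!-- placeholder title; meta, style, and script sections also go here -->
  <title>Document Title</title>
</head>
<body>
  <!-- visible content: block tags containing in-line content -->
  <p>Document content goes here.</p>
</body>
</html>
```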
Prior to HTML 4.01 the <body> tag played host to a wealth of document format information, including
the following:
❑ Background color
❑ Background image
❑ The color of text
❑ The color of links in the document
However, those attributes have been deprecated, and the appropriate CSS styles are now used instead.
The <body> tag does retain all of its event attributes — onload, onunload, onclick, and so on. These
events can be used to trigger scripts upon the appropriate action. For example, the following <body> tag
will cause the current document to close when the user clicks anywhere in the document:
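A sketch of such a tag, assuming the script simply calls the browser's window.close() method. Many user agents refuse to close a window that was not opened by a script, so the effect may vary:

```html
<!-- onclick fires when the user clicks anywhere in the document body -->
<body onclick="window.close();">
  <p>Click anywhere in this document to close it.</p>
</body>
```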
Events and scripts are covered in depth in Part III of this book. Dynamic HTML, which makes good use
of events and scripting, is covered in Chapters 24 and 25.
Summary
This chapter discussed the various document-level tags and how they are used to set up the basic format
of HTML documents. You learned that a <doctype> tag should be mandatory for all Web documents
and how the other document tags relate to one another. As you progress through the rest of the chapters
in this section, you will learn about content-level tags and how to construct and format the actual docu-
ment contents.
Paragraphs and Lines
In Chapter 2, you learned how to correctly set up an HTML document. Now that you have the
basic framework for a document, you can get to work on filling it with content. The first elements
that you need to learn are block tags, which define blocks of content within the body of a docu-
ment. This chapter teaches you about the top-level block elements — paragraphs, line breaks, and
divisions — as well as some of the additional block elements.
“Thus did he speak, and they did even as he had said. Those who were about Ajax and King
Idomeneus, the followers moreover of Teucer, Meriones, and Meges peer of Mars called all
their best men about them and sustained the fight against Hector and the Trojans, but the
main body fell back upon the ships of the Achaeans.
“The Trojans pressed forward in a dense body, with Hector striding on at their head. Before
him went Phoebus Apollo shrouded in cloud about his shoulders. He bore aloft the terrible
aegis with its shaggy fringe, which Vulcan the smith had given Jove to strike terror into the
hearts of men. With this in his hand he led on the Trojans.”
Note how the lines within the paragraph are spaced using single line spacing, with double-
spacing between the paragraphs.
Chapter 3
Source
To display the paragraphs similarly in a Web document, you would simply place each paragraph
within paragraph tags:
<p>Thus did he speak, and they did even as he had said. Those who were about
Ajax and King Idomeneus, the followers moreover of Teucer, Meriones, and Meges
peer of Mars called all their best men about them and sustained the fight
against Hector and the Trojans, but the main body fell back upon the ships of
the Achaeans.</p>
<p>The Trojans pressed forward in a dense body, with Hector striding on at their
head. Before him went Phoebus Apollo shrouded in cloud about his shoulders. He
bore aloft the terrible aegis with its shaggy fringe, which Vulcan the smith had
given Jove to strike terror into the hearts of men. With this in his hand he led
on the Trojans.</p>
Output
In a browser, the code generates the document shown in Figure 3-1.
Figure 3-1
Despite the name of the tag implying text (paragraph), the <p> tag can enclose any distinct run of
in-line content, such as text, images, and links. In fact, the <p> tag is the block tag most used within
documents.
Each piece of in-line content within the body of an HTML document (anything between the <body> tags)
must be enclosed within block tags. Elements that are themselves block level, such as tables, cannot be
placed inside a paragraph; use a division (<div>) when you need to wrap them.
For example, a document body with two tables would resemble the following code:
<body>
<div>
<table>
...body of table one...
</table>
</div>
<div>
<table>
...body of table two...
</table>
</div>
</body>
Early Web developers used <p> tags only to create space. For example, two paragraphs would be coded
similarly to the following:
...paragraph one...
<p>
...paragraph two...
No closing tags (</p>) were used and therefore no elements were enclosed within block tags. This practice
will still render basic text properly in most browsers but is not standards compliant and impacts some of
the features of cascading style sheets.
Fran.
I think I hear them.--Stand, ho! Who is there?
Hor.
Friends to this ground.
Mar.
And liegemen to the Dane.
Fran.
Give you good-night.
Mar.
O, farewell, honest soldier;
This text has the distinct format of a play script where each paragraph is formatted like the fol-
lowing example:
Actor
Dialog
Fran.
Source
To format text such as in the preceding example, you make each actor-dialog pair a separate para-
graph with a line break between the two:
<p>Fran.<br />
I think I hear them.--Stand, ho! Who is there?</p>
<p>[Enter Horatio and Marcellus.]</p>
<p>Hor.<br />
Friends to this ground.</p>
<p>Mar.<br />
And liegemen to the Dane.</p>
<p>Fran.<br />
Give you good-night.</p>
<p>Mar.<br />
O, farewell, honest soldier;</p>
Notice the following two things about the preceding code: (1) The code uses white space to help break
up the document; this will have no effect on how the browser renders the text. (2) The <br> tag,
because it has no closing tag, includes the slash (<br />) so that it closes itself and is XHTML
compliant.
Output
When rendered by a Web browser, the preceding code results in the display shown in Figure 3-2.
Figure 3-2
Unlike the <p> tag, you can use multiple <br /> tags to create vertical white space in documents.
However, the use of CSS is still preferred for spacing issues; see the information in Part II of this
book, especially Chapters 15 and 16.
Headings
Standard HTML tags allow for six levels of headings, <h1> through <h6>. The higher the heading number,
the smaller the heading. Figure 3-3 shows a simple page with all six headers and a line of standard text.
The user agent’s settings affect the size of the different headings.
The code to generate the document shown in Figure 3-3 appears here:
<html>
<head>
...
</head>
<body>
<h1>Heading One</h1>
<h2>Heading Two</h2>
<h3>Heading Three</h3>
<h4>Heading Four</h4>
<h5>Heading Five</h5>
<h6>Heading Six</h6>
<p>A line of normal text.</p>
</body>
</html>
Notice how the headings have implicit line breaks, so they need no enclosing paragraph tags; only the
line of normal text is set inside a paragraph. Although there are no attributes that you can use to modify
the format and behavior of heading tags, you can change their appearance and behavior with styles
(which are discussed in Part II of this book).
As a general rule, you should not include any other tags within a heading.
Figure 3-3
Horizontal Rules
Horizontal rules appear as lines across the user agent screen, and they are generally used to separate
information visually. A typical <hr/> is shown in Figure 3-4.
The tag to generate a horizontal rule is <hr />. Like any other nonpaired tag, the <hr /> tag should
include a slash so that it operates as an open and close tag.
Previous versions of HTML included various attributes that could be used to modify the width, thick-
ness, and look of the line. These attributes have been deprecated in favor of applicable styles.
You will learn more about styles in Part II of this book. Figure 3-5 shows a few sample rules using differ-
ent styles.
Figure 3-4
Figure 3-5
Preformatted Text
Occasionally you will need to present content that has already been formatted — tabbed or spaced data,
for example — that you do not want the user agent to reformat. For example, suppose you had the fol-
lowing output from a SQL query:
+---------------+-------------------+
| name          | value             |
+---------------+-------------------+
| newsupdate    | 1069455632        |
| releaseupdate | Tue, 1/28, 8:18pm |
| status        | 0                 |
| feedupdate    | 1069456261        |
+---------------+-------------------+
If you allow a user agent to reformat this text, it will end up looking something like what is shown in
Figure 3-6, which is nothing like what was intended.
Figure 3-6
Keep in mind that all white space (spaces, line breaks, and so on) will usually be condensed by the user
agent into one single space. Use the <pre> tag as required to have the user agent interpret and render
white space verbatim.
In such cases, use the preformatted text tag (<pre>). This tag tells the user agent not to reformat the text
within the <pre> block but to render it verbatim as it appears in the document.
If you use a <pre> block with the example text (as shown in the following code), the user agent will ren-
der it correctly, as shown in Figure 3-7.
<pre>
+---------------+-------------------+
| name          | value             |
+---------------+-------------------+
| newsupdate    | 1069455632        |
| releaseupdate | Tue, 1/28, 8:18pm |
| status        | 0                 |
| feedupdate    | 1069456261        |
+---------------+-------------------+
</pre>
Figure 3-7
Block Divisions
You may sometimes want to format a large block of text in a similar fashion but in a way that is different
from other block(s) in the same document. For example, you might want to set apart a quote so that it
appears in a different style than the text around it.
You could change the format of the paragraphs manually or you could set them off in their own block
using the division tag (<div>).
Source
The following shows a few paragraphs of text, surrounding a quote that should be set off in a dif-
ferent format.
<p>Despite recent setbacks, Acme Inc still intends on releasing its super-
duper gaming console on Tuesday. Company CEO Morgan Webb had this to say:</p>
<p>Although Acme has not released a single product in over four years, the
new X22-B12 console holds the promise of revolutionizing gameplay--that is,
if it arrives on time and garners enough support from the masses.</p>
The styles for the document, inserted into the document <head> section, include the format for
the quote class of <div>:
<style type="text/css">
div.quote { padding-right: 4em; padding-left: 4em;
font-style: italic; }
</style>
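With that style in place, the quote itself would be wrapped in a division of class quote, placed between the two paragraphs. The sketch below shows the general shape only; the quotation text is a placeholder, not the wording used in the example:

```html
<!-- class="quote" matches the div.quote style rule; the quotation text is a placeholder -->
<div class="quote">
  <p>"Placeholder text standing in for the quotation."</p>
</div>
```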
Output
The resulting output, once rendered in a user agent, is shown in Figure 3-8. Note how the
quote is indented from the right and left margins and is in italic type, as defined by the style.
Figure 3-8
Summary
This chapter introduced you to the basic block tags and how best to use them within your content. The
coverage in this chapter was pretty basic, formatting-wise. You saw how the block tags can be used to sep-
arate text but not to format it. That is because most of the formatting attributes have been deprecated in
favor of styles, which are covered in Part II of this book. Chapter 4 introduces the list tags, and subsequent
chapters introduce other elements of HTML.
Lists
XHTML supports many different block text elements due to its roots as a text document description
and formatting language. One of the most frequently used is the list, of which XHTML supports
three different varieties:
❑ Ordered lists — Lists whose elements must appear in a certain order. Such lists usually
have their items prefixed with a number or letter.
❑ Unordered lists — Lists whose elements can appear in any order, usually referred to as
bulleted or laundry-style lists. Such lists usually have their items prefixed with a bullet
or other graphic symbol.
❑ Definition lists — Lists that contain two pieces of information — a term and a definition
of said term — for each list element.
This chapter covers all three lists, their syntax, and various options that can be used to customize
their appearance.
This chapter introduces several Cascading Style Sheet concepts. For more information about
Cascading Style Sheets, see Part II of this book.
Understanding Lists
Both ordered and unordered lists share a similar syntax in XHTML, as shown in the following
pseudocode example:
<list_tag>
<item_tag>List item</item_tag>
<item_tag>List item</item_tag>
<item_tag>List item</item_tag>
</list_tag>
Chapter 4
Definition lists are different in syntax due to their unique structure — that is, two items for each list
element. See the section on definition lists later in this chapter for more information.
Each list is encapsulated in opening and closing list tags, and each list element in turn is encapsulated in
opening and closing list item tags.
Ordered lists have their list items prefixed by incrementing numbers or letters indicating the order of the
items. Unordered lists have their list items prefixed by a bullet or other symbol, indicating that their
order does not matter.
❑ Banana
❑ Chocolate
❑ Strawberry
Definition lists have two pieces of information per list item, usually a term and a definition, as shown in
the following example:
❑ Mozilla
❑ Developed by the Mozilla Project, an open source browser for multiple platforms
Ordered and unordered lists have many options that can be used to customize their appearance.
More information on the individual list types is provided in the following sections.
<ol>
<li>List item 1</li>
<li>List item 2</li>
<li>List item 3</li>
</ol>
The default numbering method uses Arabic numbers. When rendered in a user agent, this basic list
resembles that shown in Figure 4-1.
Figure 4-1
Previous versions of HTML did not require the closing item tag (</li>). However, as you should know
by now, XHTML requires each tag to have a closing mate. Therefore, it is important always to close
your list items with an appropriate tag.
Source
The source code for the list would resemble the following code. Note the use of the style
attribute in the <ol> tag.
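A sketch of the list portion of such a document follows. The upper-roman value is an assumption chosen for illustration, so the numbering may not match the figure:

```html
<!-- the style attribute selects the numbering scheme; upper-roman is an assumed value -->
<ol style="list-style-type: upper-roman">
  <li>List item 1</li>
  <li>List item 2</li>
  <li>List item 3</li>
</ol>
```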
</html>
Output
When rendered in a user agent, the list appears as shown in Figure 4-2.
Figure 4-2
❑ decimal
❑ decimal-leading-zero
❑ lower-roman
❑ upper-roman
❑ lower-greek
❑ lower-alpha
❑ lower-latin
❑ upper-alpha
❑ upper-latin
❑ hebrew
❑ armenian
❑ georgian
❑ cjk-ideographic
❑ hiragana
❑ katakana
❑ hiragana-iroha
❑ katakana-iroha
❑ none
The values are self-explanatory. For example, the decimal-leading-zero value produces numbers with
leading zeros (“01” instead of “1”). Keep in mind that the default style is decimal on most user agents,
but if you want to ensure that the list displays as decimal on all agents, you should explicitly set it
using the list-style-type property.
Some of the list-style-type values are font dependent (that is, they are supported only on certain
fonts). If you are using a type such as hiragana with a Latin-based font, you will not get the results
you intend.
The inherit value causes the list to adopt the list style of its parent(s). The inside value moves the
ordinal inside the paragraph of the item, for a more compact list. The outside value (the default) places
the ordinal outside the list item. A sample of inside and outside items is shown in Figure 4-3, which uses
the following list code:
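<html>
<body>
<p>Outside positioning
<ol style="list-style-position: outside">
<li>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis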
aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.</li>
<li>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis
aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.</li>
<li>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis
aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.</li>
</ol></p>
<p>Inside positioning
<ol style="list-style-position: inside">
<li>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis
aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.</li>
<li>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis
aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.</li>
<li>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis
aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.</li>
</ol></p>
</body>
</html>
The various list properties can all be defined within one property, list-style. The list-style
property has the following syntax:
You can use this one property to specify one, two, or all three list-style properties in one style
declaration.
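As a sketch, the shorthand accepts the type, position, and image values in a single declaration. The example below sets two of the three; the chosen values are illustrative:

```html
<!-- equivalent to setting list-style-type: lower-alpha and list-style-position: inside -->
<ol style="list-style: lower-alpha inside">
  <li>List item 1</li>
  <li>List item 2</li>
</ol>
```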
Figure 4-3
However, the start attribute of the <ol> tag has been deprecated. To date, no replacement CSS style has
been defined. Although you can use the start attribute, your document will no longer be XHTML
compliant.
To implement flexible numbering, use the new CSS2 automatic counters and numbering feature. This
feature uses the content property along with the new counter-increment and counter-reset
properties to provide a flexible yet powerful automatic counter function.
Chapter 4
The following style code will define a counter and cause any <ol> list to begin with 10:
<style type="text/css">
ol { counter-reset: list 9; }
li { list-style-type: none; }
li:before {
content: counter(list, decimal) ". ";
counter-increment: list; }
</style>
This code introduces quite a few CSS2 concepts — pseudoelements, counters, and related properties and
methods. However, it isn’t as complex as it might first appear:
❑ The ol definition sets the counter (list) to be reset to 9 every time an <ol> tag is used in the
document.
❑ The li definition sets the list-style-type to none — the counter will display our number. If
the type was left alone or set to decimal, there would be an additional number displayed with
each item.
❑ The li:before definition accomplishes two distinct purposes:
❑ It causes the counter to be displayed before the item (using the :before pseudoelement
and the content property) along with a period and a space.
❑ It also increments the counter. Note that the counter increment happens first, before the
item is displayed. That is why the counter is initialized to one lower than the desired
starting number (9 instead of 10).
Using the preceding styles along with the following list code in a document results in a list with items
numbered 10–15:
<ol>
<li>Item 10</li>
<li>Item 11</li>
<li>Item 12</li>
<li>Item 13</li>
<li>Item 14</li>
<li>Item 15</li>
</ol>
Unfortunately, at the time of this writing, only the Opera browser fully supports counters. However, the
other user agents should adopt this feature in the future.
❑ Chocolate
❑ Vanilla
❑ Strawberry
❑ Mocha
This same list can be implemented in HTML documents using the unordered list tag (<ul>), as shown in
the following HTML code:
<ul>
<li>Chocolate</li>
<li>Vanilla</li>
<li>Strawberry</li>
<li>Mocha</li>
</ul>
Note that the use of the unordered list tag is very similar to the use of the ordered list tag; only the
output is different, as shown in Figure 4-4.
Figure 4-4
❑ disc
❑ circle
❑ square
❑ none
You can also use the list-style-image property to specify a graphic image for use as the list item marker.
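As a brief illustration of these properties, the following sketch sets a marker type on one list and a marker image on another. The file name sphere.png is an assumption made for the example, not a file supplied with the text:

```html
<!-- A list using one of the predefined marker types -->
<ul style="list-style-type: square;">
  <li>Chocolate</li>
  <li>Vanilla</li>
</ul>

<!-- A list using an image as its marker; sphere.png is hypothetical -->
<ul style="list-style-image: url(sphere.png);">
  <li>Strawberry</li>
  <li>Mocha</li>
</ul>
```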
Source
The following code uses several different types of list markers:
<li>Mocha</li>
</ul>
</p>
</body>
</html>
Note that to use an image for a marker, the image must conform to the following:
❑ Accessible to the document via HTTP (be on the same Web server or deliverable from
another Web server)
❑ In a suitable format for the Web (jpg, gif, or png)
❑ Sized appropriately for use as a bullet
If the image used in the list-style-image property is not found, most user agents will substitute
the default marker.
Output
The code results in a document that resembles that shown in Figure 4-5.
Figure 4-5
As you can see from Figure 4-5, not all user agents use a standard round bullet for the default <ul>
item marker. In this case, Mozilla Firefox uses a solid diamond for the standard bullet. If you want to
ensure that your document is always displayed using a particular marker in its unordered lists, you
should explicitly define it.
Definition Lists
Definition lists seem more complex than the other two lists due to their having two elements per list item.
However, the sparse number of options available for definition lists makes them easy to implement.
The definition list itself is encapsulated within definition list tags (<dl>). The list items consist of a definition
term (<dt>) and definition (<dd>), each delimited by its own tag pair.
For example, the following code results in the document shown in Figure 4-6:
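A minimal sketch of such a definition list follows. The terms and definitions are illustrative assumptions, so the rendered result will not match Figure 4-6 exactly:

```html
<dl>
  <!-- Each term (dt) is followed by its definition (dd) -->
  <dt>HTML</dt>
  <dd>Hypertext Markup Language</dd>
  <dt>CSS</dt>
  <dd>Cascading Style Sheets</dd>
</dl>
```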
Figure 4-6
Nesting Lists
You can nest lists of the same or different types as necessary. For example, you can generate a list similar
to the following, incorporating an ordered list within an unordered one:
This combination of lists can be constructed using the following code and results in the display shown in
Figure 4-7:
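A sketch of an ordered list nested within an unordered one is shown here. The list text is hypothetical and chosen only for illustration. Note that the nested list is placed inside a list item of the outer list:

```html
<ul>
  <li>Send us a letter that includes:
    <!-- The nested ordered list lives inside the outer list item -->
    <ol>
      <li>Your name</li>
      <li>Your order number</li>
      <li>Your contact information</li>
    </ol>
  </li>
  <li>Call us during business hours</li>
</ul>
```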
Figure 4-7
Summary
As you learned in this chapter, XHTML lists are flexible text constructs that can be used for a variety of
purposes. Using the many optional formatting options, you can construct a list, or nested series of lists,
for just about any purpose. The new counters feature of CSS can also be used to extend even more functionality
and flexibility to your lists.
Images
Previously a haven of text-only media, the Internet became mainstream due to graphic content
on the Web. Today capabilities exist to deliver much more than static graphics; multimedia
of every variety can be found on Web pages. This chapter covers the inclusion of the most basic graphic
format, static images, in Web documents.
Image Formats
Although static images can seem unexciting in today’s world of Web-delivered content, the Web
would be a very boring place without their use. As you will see in the following sections, there are
plenty of options and formats to consider when using images in your Web documents.
Web Formats
The Web supports three main formats of graphics — GIF, JPEG, and PNG. The following sections
detail the capabilities and suggested uses for each type of graphic.
GIF
The Graphics Interchange Format (GIF) was created in the late 1980s. It was originally used by the
CompuServe online service to deliver graphic content to its subscribers. The GIF format uses
LZW (Lempel-Ziv-Welch) compression to help keep the file size small. Version 89a of the GIF format added the ability
to encapsulate several images within one file, giving the format animation functionality.
Patent problems plagued the GIF format’s adoption in the late 1990s. Unisys, the patent holder of LZW
compression, chose to terminate its royalty-free licenses and charge royalties for use of the format.
This practice spurred the development of alternative formats for platforms like the Web, resulting in
new and/or more robust JPEG and PNG format support.
JPEG
The Joint Photographic Experts Group (JPEG) graphic file format is actually two standards: one specifies
how an image is translated into a series of bytes, and the second (JPEG File Interchange Format [JFIF])
specifies how the data is encapsulated into a file format. JPEG files are stored using one of several lossy
compression methods — to keep images at a reasonable size, the compression scheme sacrifices being
able to accurately reconstruct all data present in the original image.
The JPEG format is not as versatile as the GIF format, but its support of 24-bit color and compression
make it a good format for quality images across a bandwidth-constrained medium.
PNG
The Portable Network Graphics (PNG) format was developed during the GIF patent confusion. PNG
was created specifically for delivery over online services such as the Web and specifically to solve some
of the problems with the GIF format. PNG uses a nonpatented lossless compression scheme to keep file
sizes smaller while maintaining image quality.
PNG is a relatively new format, and as such, support for the format and its various features is still somewhat
spotty. For example, as of this writing, Microsoft’s Internet Explorer supports only the single-color
transparency option of the PNG format. However, this format promises to be a large step in the evolution
of online graphics.
In 2004 an animation standard was proposed for the PNG format. As of this writing, the standard has
not made it into the mainstream and is not in use on the Web.
Transparency
Transparent graphics can be displayed with one or more of their colors transparent, causing what is
under the image to be shown instead of the image data. The effect is as if the specified colors were
turned to clear glass. For example, consider the two images in Figure 5-1.
Figure 5-1
The first image was not saved with transparency information — its white background clearly outlines
the image. The second image contains transparency information — the white background is transparent,
allowing the grid to show through areas of the image that are completely white.
As previously mentioned in the “Web Formats” section of this chapter, both GIF and PNG formats support
transparency, though many user agents do not fully support PNG’s transparency feature set.
Chapter 5
Figure 5-2 shows how an interlaced graphic and a noninterlaced graphic load in a user agent.
Figure 5-2: An interlaced image is revealed top to bottom in alternating scan lines, so the bottom of the
image appears piecemeal while the image loads. A non-interlaced image is revealed top to bottom one scan
line at a time, so the bottom of the image cannot be seen until the entire image is loaded.
The effect of an interlaced image is similar to horizontal blinds being opened. Pieces of the image, top to
bottom, are revealed at the same time, allowing the user to get the general idea of what the image contains
without having to wait for the entire image to load.
The JPEG image format supports a similar feature. Such a JPEG image is saved in progressive format
and displays similarly to an interlaced GIF image.
Interlaced and progressive images are not used very often, due to the proliferation of broadband
throughout the Web’s intended audience (consumers). Most images load fast enough that the advantages
provided by interlacing or progressive display are negligible. However, you should avoid going
overboard with images to keep your documents’ load times to a minimum.
Animation
The GIF format supports animation by encapsulating several images within one image file. The images
can then be displayed one after another, resulting in a rudimentary animation technique. The technique
is similar to that of an animator’s flipbook: basic sketches are drawn on the pages of the book, and when
the book’s pages are rapidly flipped, the images seem to animate.
Keep in mind that an animated GIF file contains one image for each frame of animation. As such, animated
GIF files are significantly larger than static images.
Animated GIFs have several options that can be used to aid in the animation:
❑ Each image within the file can be displayed with its own delay value so the animation can be
slowed or sped up as necessary.
❑ Each image can replace the previous image in a variety of ways: by overwriting it with transparency,
a palette color, and so on.
❑ The animation can be set to play a limited number of times or set to repeat indefinitely.
Source
You need to assemble all the images in your animation. For this example, you will take a clock
face and move the hour hand through a 12-hour cycle, saving each image as shown in Figure 5-3
(images viewed in Jasc Software’s Media Center application).
Figure 5-3
Figure 5-4
Output
When placed in a Web document, the finished image animates the clock by showing each
image/frame in order (see Figure 5-5).
Figure 5-5
Animated GIFs do not interpolate motion between the individual frames. As such, extreme changes
between frame images will appear very jerky and sudden. When creating the images for animation
frames, try to keep your motions slow and spanning several frames.
There are several graphic editing programs that support animated GIFs. This example shows JASC
Software’s Animation Shop, available as part of their Paint Shop Pro product (www.jascsoftware.com).
Creating Images
Many graphic editing packages and applications are available to create images for your Web documents.
This section lists a few of the more popular solutions.
Commercial Applications
Several commercial editing applications can be purchased to create and edit graphic images:
❑ Adobe Photoshop — Known for its high-end feature set, Adobe Photoshop is the extreme top
end of graphic editing products. Its numerous features, large number of add-ons, and huge
install base are among its benefits; its large price tag is among its deficits.
❑ Adobe Illustrator — Another Adobe product, Illustrator, is known for its vector editing abilities.
Its native editing of Postscript-compatible files is one of its benefits, though it also is fairly
expensive and doesn’t handle raster graphic formats as well.
❑ JASC Software Paint Shop Pro — Paint Shop Pro has been a contender for second place in the
graphic editor championships. However, its latest versions all but put it in the same class as
Adobe’s products. Its ability to handle both vector and raster graphics and its compatibility
with many Photoshop add-ons mean that Paint Shop Pro can handle almost any task.
❑ Macromedia Freehand and Fireworks — Although known mainly for its Flash and Director
products, Macromedia has its own suite of graphic editing programs as well. Of those products,
Freehand and Fireworks are of particular note. Freehand is similar to Adobe’s Illustrator product,
excelling in editing vector graphics. Fireworks, by comparison, edits raster graphics and has
a host of animation features as well.
Several of the commercial packages are available in suite form. For example, the Macromedia suite bundles
Dreamweaver, Fireworks, Flash, and Freehand in one package (for less than the individual applications
bought separately). If you need more than one capability (vector editing, raster editing, animation,
and so on), look for one of the suites instead of individual applications.
The most full-featured and supported open source application is the GNU Image Manipulation
Program (GIMP). Rivaling Photoshop in features and capabilities, GIMP is a great solution for the
cash conscious — its price (free) belies its wide range of features and Photoshop compatibility. Visit
www.gimp.org for more information.
One important issue when dealing with other people’s content is rights. It’s important to be aware of
what rights are granted for the images’ use. For example, some image products and sites do not allow
their images to be used for commercial purposes. You can use them for internal company use or for
things like greeting cards or party invitations. However, you might not be able to use the images for a
commercial Web site.
Before using an image for online and/or commercial use, read the license info that accompanies the
graphic (if from a retail package) or query the author for license information. When in doubt, don’t use
the image.
In addition, although it may be tempting to use images you find elsewhere on the Web, you should
remember that someone else holds the rights on almost every image, rights that don’t automatically
transfer to anyone who downloads and reuses them.
The two parameters, src and alt, define where to find the image and text to display if the user agent
can’t display the image (or is set not to display images at all). These two parameters make up the minimal
set of parameters for any image tag.
As with other tags that lack a closing mate, you should end the <img> tag with a slash to be XHTML
compliant.
For example, the following tag will insert an image, cat.jpg, into the document with the alternate text
“A picture of a cat”:
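Based on the description above, such a tag would look like the following sketch, which uses only the two required attributes and the XHTML closing slash:

```html
<img src="cat.jpg" alt="A picture of a cat" />
```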
In this case, because the src does not contain a server and full path, the image is assumed to be in the
same directory, on the same server, as the XHTML document. The following tag gives the full URL to the
image, which could conceivably be on a different server than the document:
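A sketch of the full-URL form is shown here. The server name and the images directory are assumptions chosen for illustration:

```html
<!-- www.example.com/images/ is a hypothetical location -->
<img src="http://www.example.com/images/cat.jpg" alt="A picture of a cat" />
```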
Note that it is a good practice to store images separately from documents. Most Web authors use a
directory such as images to store all images for their documents.
The user agent will attempt to display the image in-line (that is, alongside elements around it). For
example, consider the following code snippet:
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. <img
src="book.jpg" alt="A book" /> Duis aute irure dolor in reprehenderit in
voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint
occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
est laborum.</p>
When displayed in a user agent, the code results in the document being rendered similarly to that shown
in Figure 5-6.
Figure 5-6
As you can see, the image doesn’t quite fit where placed. However, the user agent dutifully renders it
where the tag appears. A better choice would be to place the image at the beginning or end of the text,
as in the following code (whose results are shown in Figure 5-7):
<p><img src="book.jpg" alt="A book" /> Lorem ipsum dolor sit amet, consectetur
adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in
voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint
occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
est laborum.</p>
Figure 5-7
The <img> tag has several attributes to help control how user agents will display an image; those
attributes are covered in the next section. You can also use several CSS style attributes to control how
an image is formatted in relation to other elements. Part II of this book covers CSS in detail.
Image Attributes
The <img> tag supports several attributes that can be used to help adjust how an image is rendered in
a user agent. The basic XHTML attributes are described in the following sections.
Earlier versions of HTML supported additional attributes such as align, border, hspace, and
vspace to help position the image. However, those attributes have been deprecated; to adjust the factors
controlled by those attributes, you must now use their CSS equivalents.
Because of the utility of the attribute, you should endeavor to always include the alt attribute with a descriptive
value.
Resist the urge to embed extra information in an alt attribute value. Doing so will obscure the information
from browsers that display the image and otherwise don’t use the alt-text. Additionally, it will
not give alternative user agents (nongraphical) the information they need to understand the purpose of
the graphic that they cannot see.
If you have a lot of information to convey, consider using the longdesc (long description) attribute as
well as the alt attribute. The longdesc attribute specifies a URL to a document that is to be used as
the long description for the figure. Note that it is up to the user agent to decide how to enable access to
the long description, if at all.
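A minimal sketch of the two attributes used together follows. Both file names are hypothetical; the longdesc value simply points to a separate document that describes the image in detail:

```html
<!-- chart.png and chart-description.html are illustrative names -->
<img src="chart.png" alt="Quarterly sales chart"
     longdesc="chart-description.html" />
```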
Image Size
Two attributes exist to control the physical image size. The attributes are suitably named width and
height. Both attributes support pixel and percentage values. You can use a pixel value to specify an
exact size that the image should be rendered at in the user agent. A percentage value will size the image
according to the size of the user agent’s window.
Note that changing the image’s display size, via tag attributes, does not alter the amount of data transmitted
to the user, only the size at which it displays.
For example, if you wanted a square image to be rendered to 100 pixels square, you could use the following
tag:
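A sketch of such a tag, assuming a hypothetical image file named square.jpg. Values without a percent sign are interpreted as pixels:

```html
<img src="square.jpg" alt="A square" width="100" height="100" />
```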
Note that you can use only one of the size attributes if you want; the user agent will use the image’s
proportion to determine the other dimension’s correct value. However, the value in specifying both
dimensions is that the user agent can reserve the space for the image — rendering the rest of the page as
it waits for the image data.
Contrary to what you might think, using percentage values for width and height scales the image
according to the user agent’s window. For example, consider the following tag, shown in two differently
sized browser windows in Figure 5-9:
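A sketch of a tag using percentage values, assuming a hypothetical 200-pixel-square image named square.jpg:

```html
<!-- The percentages are not relative to the image's own dimensions -->
<img src="square.jpg" alt="A square" width="25%" height="25%" />
```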
Figure 5-9
As you can see, instead of resizing the image to 50 pixels square (25% of 200), the image is sized to 25%
of the browser’s width.
Be careful to preserve an image’s aspect ratio when specifying both dimensions in an <img> tag. Specifying
dimensions that do not adhere to the same aspect ratio as the original image will cause the image to
appear distorted in the user agent much like a funhouse mirror distorts reflections.
The images shown in the figures of this chapter are vertically aligned to the baseline of neighboring text.
Although some user agents use this default alignment, not all user agents can be relied on to exhibit
this default behavior; if you need an image aligned in a certain manner, it is best to explicitly code the
alignment.
The CSS styles for positioning are covered in Chapter 19. The CSS styles for controlling margins and
borders are covered in Chapter 16.
Image Maps
Image maps provide a method to map areas of images to actions. For example, a company Web site
might want to provide a map of the United States that allows customers to click on a state to find a local
office or store.
There are two types of image maps: client-side and server-side. Client-side image maps rely on the user
agent to process the image, the area where the user clicks, and the expected action. Server-side image
maps rely on the user agent only to tell the server where the user clicked; an agent on the Web server
does all the processing.
Between the two methods, client-side image maps are preferred. The user agent is able to offer immediate
feedback to the user (such as indicating that the pointer is over a clickable area), and most user agents support
client-side image maps. Server-side image maps, by contrast, can bog down a server if the map draws consistent
traffic, hide many details necessary to provide immediate feedback to the user, and might not be compatible
with some user agents.
If you want images to be clickable and take the user to one particular destination, you don’t have to use an
image map. Instead, embed the <img> tags in appropriate anchor tags (<a>) similarly to the following:
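A minimal sketch of an image wrapped in an anchor follows. The destination document and the image file name are assumptions made for the example:

```html
<!-- Clicking anywhere on the image follows the link -->
<a href="products.html"><img src="products.jpg" alt="Our products" /></a>
```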
Note that all src attributes should usually include a full relative or absolute path to the resource.
Inside the <map> tag pair, you specify the various clickable regions of the image, as covered in the next
section.
❑ rect — Defines a rectangular area by specifying the coordinates of the upper-left and lower-
right corners of the rectangle
❑ circle — Defines a circular area by specifying the coordinates of the center of the circle and the
circle’s radius
❑ poly — Defines a free-form polygon area by specifying the coordinates of each point of the
polygon
All coordinates of the image map are relative to the top-left corner of the image (referenced as 0, 0) and
are measured in pixels. For example, suppose you wanted to create an image map for a travel site with
an icon of a car, plane, and hotel. When users click on one of the icons, they are taken to the reservation
page for auto rentals, airfare, or hotel reservations, respectively. Such an image would resemble the
image shown in Figure 5-10.
Figure 5-10
The regions that will be used for the map are within the three icon squares (the white squares around the
icons). The regions are all rectangular, are uniform in size (121 pixels square), and have the following
upper-left coordinates:
❑ car — 35 x, 11 y
❑ plane — 190 x, 11 y
❑ hotel — 345 x, 11 y
Knowing the upper-left corner coordinates and the size of each rectangle, you can easily figure out the
coordinates of the bottom-right corner of each rectangle by adding the width (121) and height (121) to
the upper-left coordinates.
Several tools are available to help create image map coordinates. Use your favorite search engine to find
software dedicated to mapping regions, or examine your graphics program to see if it can create regions for
you. Paint Shop Pro is an excellent Windows-based image editor that has image mapping tools built in.
<map name="map1">
<a href="car.html" shape="rect" coords="35,11,156,132">
Rental Cars</a>
<a href="plane.html" shape="rect" coords="190,11,311,132">
Plane Reservations</a>
<a href="hotel.html" shape="rect" coords="345,11,466,132">
Hotel Reservations</a>
</map>
The link text (between the anchor tags) helps the user determine what the clickable area leads to, as
shown by the Internet Explorer tooltip in Figure 5-11.
<map name="map1">
<area href="car.html"
shape="rect" coords="35,11,156,132"
alt="Rental Cars" />
<area href="plane.html"
shape="rect" coords="190,11,311,132"
alt="Plane Reservations" />
<area href="hotel.html"
shape="rect" coords="345,11,466,132"
alt="Hotel Reservations" />
</map>
Using the alt attribute helps the user determine what the clickable area leads to, as shown by the
Internet Explorer tooltip in Figure 5-11.
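To tie a map named map1 to an image, the <img> tag refers to the map through its usemap attribute. A minimal sketch, assuming a hypothetical image file named travel.gif:

```html
<!-- The value of usemap is the map's name prefixed with a pound sign -->
<img src="travel.gif" alt="Travel reservations" usemap="#map1" />
```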
Figure 5-11
The image map example in this chapter is somewhat simplistic. Image maps can be used for more complex
purposes, such as letting customers click on a U.S. map as mentioned earlier in this chapter or
allowing users to click on various buildings on a map or parts on an exploded diagram of a machine for
more information on the building or part clicked.
Summary
This chapter introduced you to the <img> tag and the various graphic formats it supports. You also
learned about image qualities such as transparency, interlacing, and animation, which can be used to
make your image use more inventive and visually appealing. As you can see, adding graphics to a Web
document is straightforward, but using them to increase the value and usability of your documents can
be more challenging. Use this chapter as a basis for including images, but supplement it with information
from Part II of this book for how to effectively position images within your documents.
Links
Links are what turn plain documents into Web-enabled content. Each document on the Web can
contain one or more links to other documents — allowing users to easily access information
related to the current document or entirely different information. As you will see in this chapter,
you can also include information within links to describe the actual relationship between the
document doing the linking and the document being linked to.
Understanding URLs
A Uniform Resource Locator (URL) is the unique address of a resource (usually a document) on
the Web. This addressing scheme allows user agents and other Internet-enabled programs to find
documents and ask for their contents.
URLs are made up of several different parts, all working together to provide a unique address for
Internet content. Figure 6-1 shows an example of a typical URL and its various parts.
❑ The protocol section is a protocol abbreviation followed by a colon. For example, the
standard HTTP protocol is designated as http:. Another popular protocol supported
by many user agents is File Transfer Protocol (FTP), designated in URLs as ftp:.
❑ The server name is prefixed with two slashes and typically includes a fully qualified
domain name, as in //www.example.com. The www is the server name, and example.com
is the domain. Note that it is a misconception that Web servers need to be named www;
although www (World Wide Web) is a common convention, the server name can be any
valid name. For example, the fully qualified name of the U.S.-based server for the Internet
Movie Database is us.imdb.com. Note that an IP address can be specified instead of a
server name.
Chapter 6
http://www.example.com:85/products/details/inventory.cgi?product_id=123887
The URL can also include a username and password before the server name. This
is especially common for FTP URLs. The username and password should appear after the protocol, separated
by a colon (:) and followed by an at sign (@). When used in this form, the URL would resemble the
following:
http://username:password@www.example.com/...
❑ If necessary, the server name is followed by a port number — a colon separates the name and
port. For example, some Web servers run their HTTP services on a port other than port 80. In
those cases, the URL needs to include the alternate port number. In the example shown in
Figure 6-1, the port number is 85.
The standard port for HTTP is port 80. For FTP the standard port is 21. Most user agents know the
default ports and will use the default if no port is specified.
❑ After the server name (and optional port number) is the path on the server where the document
or file can be found. In this case, the path to the document is /products/details/, that is, the
details subdirectory of the products directory, which is off the root of the server. Note that
the path of the URL doesn’t directly correspond to the path on the file system of the server — the
Web server software is configured to remap file system directories into URLs.
❑ The next piece of the URL puzzle is the actual document or filename. In this example, the name
is inventory.cgi and the server looks for that file in the directory specified to return to the
requesting user agent.
❑ After the filename, the URL can contain optional arguments for the server to pass to the file. If
the file is an executable (CGI or other script), the arguments can be used for a variety of purposes.
The argument list is separated from the filename by a question mark; the arguments
appear in name/value pairs (separated by equal signs), the pairs separated by ampersands (&).
For example, suppose you need to pass inventory.cgi the following name/value pairs:
❑ product_id = 123887
❑ description = long
❑ lang = EN
?product_id=123887&description=long&lang=EN
Strictly speaking, the arguments are not a part of the URL — the URL itself contains only information
about where to find a resource. Arguments are covered here for the sake of completeness. See Part IV of
this book for more on URL arguments and programs to interpret them.
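When a URL with several arguments is placed in an XHTML attribute value, each ampersand between the pairs must be written as the entity &amp; for the document to remain valid. A minimal sketch, with link text chosen only for illustration:

```html
<!-- The user agent sends plain ampersands; &amp; is only the markup form -->
<a href="inventory.cgi?product_id=123887&amp;description=long&amp;lang=EN">Product
details</a>
```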
http://www.example.com/products/gizmo.html
Suppose the document has a link to another document, doodad.html, which resides on the same server,
in the same directory. All of the following URLs can be used to reference the other document:
http://www.example.com/products/doodad.html
doodad.html
./doodad.html
http:doodad.html
The first URL uses an absolute path to the document: the protocol, server name, and
path to the document are all specified. The other three URLs are relative; they contain only enough information
for the document to be found relative to the location of the current document.
If you don’t specify the protocol in a URL, the user agent will attempt to use its default protocol to
request the document.
Note that relative paths can be used only with documents on the same Web server because documents
on other servers require substantially more information to guide the user agent to them. Relative paths
are best used on sections of Web sites where the documents in the section never change relationships to
one another. In most cases, absolute paths should be used in URLs.
<a href="url_to_resource">textual_description_of_link</a>
The anchor tag can appear by itself or around other HTML elements. For example, a link to product
information could appear in a document as follows:
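A sketch of such a paragraph is shown here. The sentence wording and the target document, products.html, are assumptions made for illustration:

```html
<p>For more information on our products, click
<a href="products.html">here</a>.</p>
```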
In this case, the paragraph would appear as shown in Figure 6-2, with the word here being the link to
the other document.
According to the XHTML standard, anchor links need to be placed within block elements (headings,
paragraphs, and so on).
Figure 6-2
As previously mentioned, URLs can refer to other resources besides HTML documents. You can refer to
other resources to be delivered via other protocols by specifying the correct protocol and server in the
anchor tag. For example, the following tag would refer to a ZIP compressed file delivered via the FTP
protocol:
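A sketch of such a tag might look like the following; the server name ftp.example.com and the file path are assumptions made for illustration:

```html
<!-- The ftp: protocol tells the user agent to fetch the file via FTP -->
<p><a href="ftp://ftp.example.com/pub/gizmo_manual.zip">Download the
manual (ZIP file)</a></p>
```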
You can also use an anchor tag to spawn helper applications on the user’s computer. For example, this
anchor would open the default e-mail application on the user’s computer to send an e-mail message to
[email protected]:
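A minimal sketch of such an anchor; the address someone@example.com is a placeholder assumed for illustration:

```html
<!-- The mailto: protocol hands the address to the user's e-mail program -->
<p><a href="mailto:someone@example.com">Send us an e-mail message</a></p>
```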
You can also embed other elements within the anchor to use as links. For example, you can include an
image in the anchor so that the user can click on the image to activate the link:
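One way this could look, assuming hypothetical files named gizmo.jpg and gizmo.html:

```html
<!-- The image takes the place of the link text; alt text describes it -->
<p><a href="gizmo.html"><img src="gizmo.jpg" alt="The Gizmo" /></a></p>
```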
There are many other ways to link documents, including using image maps (covered in Chapter 5) and
using event attributes in other elements (covered in Chapters 20 and 21).
Link Titles
The title attribute can be used to give more information about the document being linked to. It takes
one argument, a string of characters, to title the link. For example, the following anchor tag uses a title
attribute:
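A minimal sketch, in which the file name and the title text are assumptions made for illustration:

```html
<!-- The title value describes the target of the link -->
<p><a href="gizmo.html" title="Specifications and pricing for the Gizmo">
Gizmo product page</a></p>
```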
The use of the title is left up to the user agent. Some agents, such as Mozilla Firefox, use the title as a
tooltip, as shown in Figure 6-3.
Figure 6-3
However, that isn’t always the case. In fact, some users who visit your site may not even have a mouse
to aid in browsing. The reason could be a physical handicap, a text-only browser, or just a fondness for
using the keyboard. It is important to accommodate these users by adding additional methods to access
links on your page.
The anchor tag includes two attributes to aid non-mouse users: keyboard shortcuts and tab ordering.
Keyboard shortcuts define a single key that can be used to access the link. The accesskey attribute
takes one letter as its value. For example, the following link defines “C” as the access key:
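A minimal sketch of such a link; the target document contact.html is hypothetical:

```html
<!-- accesskey assigns the letter C as the keyboard shortcut for this link -->
<p><a href="contact.html" accesskey="C">Contact Us</a></p>
```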
Note that different user agents and different operating systems treat shortcut keys differently. For exam-
ple, Windows users on Internet Explorer need to hold the Alt key while they press the access key. Note
also that different browsers handle the actual access of the link differently; some browsers will activate
the link as soon as the access key is pressed, while others only select the link, requiring another key to be
pressed to actually activate the link (usually Enter).
As with most graphical operating systems, the Tab key can be used to move through elements of the
interface, including links. Typically, the tab order of links corresponds to the order in which the links
appear in the document. The tabindex attribute can be used to define an alternate order in which the
links in a document should be accessed. The tabindex attribute uses a number to define the position
the link should occupy in the tab order. For example, the following three links have the tab order
reversed — pressing the tab key several times will select the last link, then the second, and finally
the first:
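A sketch of three such links, using hypothetical file names; lower tabindex values are visited first:

```html
<p>
  <a href="first.html" tabindex="3">First link</a>
  <a href="second.html" tabindex="2">Second link</a>
  <a href="third.html" tabindex="1">Third link</a>
</p>
```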
As with most interface elements in XHTML, the browser defines how tabindex is implemented and
how tabbed elements are accessed.
Link Colors
To differentiate text used for links from other text in the document, user agents use different text colors.
Different colors are used to show different modes of links:
❑ Link: The standard link in the document that is not active and has not been visited (see other
modes).
❑ Active: The link is in the process of being activated (for example, while the mouse button is
held down on it).
❑ Visited: The target of the link has been previously visited (typically, this means the target can
be found in the browser’s history).
❑ Hover: The mouse pointer is over the link.
The various links are colored differently so that the user can tell the status of each link on your page. The
standard colors of each link status are as follows:
As with other presentation attributes in HTML, the user agent plays a significant role in setting link col-
ors and text decorations. Most user agents follow the color scheme outlined in this section, but there are
those that don’t conform to this scheme.
To change the color of links, you use CSS. For example, to choose another color (and possibly display
property) for visited links, you could use something similar to the following:
<head>
<style type="text/css">
a:visited { color: yellow; font-weight: bold; }
</style>
</head>
This changes the visited links in the document to yellow, bold text. The a:link, a:active, and a:hover
pseudo-class selectors can be used in the same way to change the other link modes.
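As a sketch, all four link modes could be styled together as follows. The colors chosen here are arbitrary assumptions; the order of the rules matters because later rules override earlier ones when more than one mode applies.

```html
<head>
<style type="text/css">
  /* Order: link, visited, hover, active */
  a:link    { color: blue; }
  a:visited { color: purple; }
  a:hover   { color: green; }
  a:active  { color: red; font-weight: bold; }
</style>
</head>
```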
Document Relationships
There are a host of other attributes that you can add to your anchor tags to describe the form of the tar-
get being linked to, the relationship between the current document and the target, and more.
The following table lists these descriptive attributes and their possible values.
type The MIME type of the target Any valid MIME type
An example of how the relationship attributes (rel, rev) can be used is shown in the following code
snippet:
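A sketch of how such anchors might look in a document assumed to be chapter10.html; the file names are hypothetical:

```html
<p>
  <!-- rel: what the target is to this document; rev: the reverse relationship -->
  <a href="chapter9.html" rel="prev" rev="next">Previous chapter</a>
  <a href="contents.html" rel="contents" rev="chapter">Table of contents</a>
  <a href="chapter11.html" rel="next" rev="prev">Next chapter</a>
</p>
```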
The anchor tags define the relationships between the chapters (next, previous) and the table of con-
tents (chapter, contents).
❑ The link tag must appear in the <head> section of the document.
❑ The link tag does not encapsulate any text.
❑ The link tag does not have a matching close tag.
For example, the following code could be used in chapter10.html to define that document’s relation-
ship to chapter9.html and chapter11.html:
<head>
<title>Chapter 10</title>
<link href="chapter9.html" rel="prev" rev="next" />
<link href="chapter11.html" rel="next" rev="prev" />
</head>
Link tags do not result in any text being rendered by the user agents but can be used to provide other
information, such as to provide alternate content for search engines. For example, the following link ref-
erences a French version of the current document (chapter10.html):
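A minimal sketch of such a link tag; the file name chapter10-fr.html and the title text are assumptions made for illustration:

```html
<head>
  <title>Chapter 10</title>
  <!-- rel="alternate" with hreflang marks a translation of this document -->
  <link href="chapter10-fr.html" rel="alternate" hreflang="fr"
        title="Chapter 10 in French" />
</head>
```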
Other relationship attribute values (start, contents, and so on) can likewise be used to provide rele-
vant information on document relationships to search engines.
Summary
This chapter reviewed the anchor tag (<a>), how it is used to provide links to other documents, and how
additional information can be provided to illustrate the relationship between linked documents. You
also learned how to provide alternative navigational methods through the use of the tabindex and
accesskey attributes. As you learn the rest of the XHTML tags and elements, you will see how to effec-
tively weave links into your documents’ structure.
Text
Although most Web documents are chock-full of graphics and multimedia, text still plays a very
important part in communicating on the Web. Previous versions of HTML (prior to 4.01) included
several tags for direct text formatting. Many of those tags have been deprecated in favor of CSS,
but several remain and can still be used in XHTML-compliant documents.
The font size is given relative to the default document font size. The default size is typically con-
trolled via the <basefont> tag (also deprecated). The <basefont> tag supports the same argu-
ments as the <font> tag, but it has no closing mate.
Default font types and sizes are left up to the user agent. No standard correlation exists between
the size used in a <font> tag and the actual font size used by the user agent.
To be XHTML compliant, you should use CSS methods for font control.
Chapter 7
Tag Use
<cite> Citation
<code> Computer code text
<dfn> Definition term
<em> Emphasized text
<kbd> Keyboard text
<samp> Sample computer code text
<strong> Strongly emphasized text
<var> Variable(s)
Figure 7-1
The adoption and support of these tags is very haphazard across the various user agents. As such, these
tags are best avoided. Use of CSS instead of these tags is strongly encouraged.
Tags for italic and bold text are still part of the current XHTML specification and are covered in their
own section later in this chapter.
Control of text via CSS involves creating style definitions in style sections within your document, in an
externally linked style sheet, or within individual tags. For example, both of the paragraphs in the fol-
lowing code will be rendered in all caps, the first via a definition in a <style> section and the second
from style code directly in the paragraph tag:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<style type="text/css">
p.caps { text-transform: uppercase; }
</style>
</head>
<body>
<p class="caps"> Lorem ipsum dolor sit amet, consectetur adipisicing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip
ex ea commodo consequat.</p>
<p style="text-transform: uppercase;"> Lorem ipsum dolor sit amet, consectetur
adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi
ut aliquip ex ea commodo consequat.</p>
</body>
</html>
Smaller sections of text can use the span tag (<span>) to incorporate text changes in-line. For example,
to specify that a handful of words should be rendered in red, you would use a span tag similar to the
following:
<p> Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt <span style="color: red">ut labore et dolore magna aliqua</span>.
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip
ex ea commodo consequat.</p>
More information on the <span> tag appears in a separate section later in this chapter. More informa-
tion on CSS can be found in Part II of this book.
Nonbreaking Spaces
Just as you will want to break some text into discrete chunks, other times you will want to keep text
together. For example, you wouldn’t want words separated in dates (December 25, 2004), awkward
phrases that include letters and numbers (24 hours), or in some company names (International Business
Machines Corporation).
Suppose you were to use the phrase “12 Angry Men.” You would not want a user agent to split the “12”
and “Angry” across two lines as shown in the following:
Whenever you don’t want the user agent to break text, you should use a nonbreaking space entity
(&nbsp;) instead of a normal space. For example, when coding the “12 Angry Men” paragraph, you
could use something similar to the following:
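A minimal sketch; the surrounding sentence is an assumption made for illustration:

```html
<!-- &nbsp; keeps "12" and "Angry" on the same line -->
<p>A classic example of courtroom drama is the film 12&nbsp;Angry Men.</p>
```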
As discussed in previous chapters, user agents tend to collapse white space. This is typically a desirable
effect — allowing you to be more liberal with white space when formatting your documents. However,
sometimes you need to explicitly include spaces in your documents. The nonbreaking space entity can
also be used to space-fill text. For example, to indent a line of text by three spaces, you could use code
similar to the following:
However, space-fill formatting techniques should be avoided — the use of CSS instead is highly
recommended.
The nonbreaking space code, &nbsp;, is known as an entity in HTML-speak. There are entities for
many characters that can’t be typed on a conventional keyboard. Many of the supported entities are
listed in Appendix A.
Soft Hyphens
Soft hyphens can be used to indicate where a user agent can hyphenate a word, if necessary. For exam-
ple, consider the following code and its resulting output shown in Figure 7-2:
Figure 7-2
To tell the user agent where a word can be hyphenated, you insert a soft-hyphen entity (&shy;). Using
the preceding example, you can hyphenate the word “triskaidekaphobia” with soft hyphens, as follows:
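A sketch of the technique; the sentence and the choice of break points are assumptions made for illustration:

```html
<!-- Each &shy; marks a point where the word may be hyphenated if needed -->
<p>The fear of the number 13 is known as
tris&shy;kai&shy;deka&shy;pho&shy;bia.</p>
```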
The resulting output, shown in Figure 7-3, shows how the optional hyphens are used to achieve better
justification results.
Figure 7-3
Previous versions of HTML contained an underline tag (<u>). However, the underline tag has been
deprecated. Underlining should be accomplished using CSS.
Not every font has a bold and/or italic variant. Whenever possible, the user agent will substitute a simi-
lar font when bold or italic is asked for but not available. However, not all user agents are font savvy. In
short, your mileage with these tags may vary depending on the user agent being used.
For the same reasons mentioned elsewhere, it is advisable to use CSS instead of hard-coded bold and
italic tags.
Figure 7-4
Monospaced Text
Another text formatting tag that has survived deprecation in XHTML is the teletype (<tt>), or
monospaced, tag. This tag tells the user agent that text should be rendered in a monospaced font. You
can use this tag to format reserved words in documentation, code listings, and so on. The following code
shows an example of the teletype tag in use:
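A minimal sketch, with sample text assumed for illustration:

```html
<!-- Text inside <tt> is rendered in a monospaced font -->
<p>To list the files in a directory, type <tt>ls -l</tt> at the prompt.</p>
```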
This tag is named for the teletype terminals used with the first computers, which were capable of print-
ing only in a monospaced font.
Again, the use of styles is preferred over individual in-line tags. If you need text rendered in a
monospaced font, consider using styles instead of the <tt> tag.
Figure 7-5
The big (<big>) and small (<small>) tags are used as you would expect: to delimit text you want rendered
bigger or smaller than the default text. For example, consider the following example code and the result
shown in Figure 7-6:
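A minimal sketch of the two tags, with sample text assumed for illustration:

```html
<p>This is default text, <big>this is bigger text</big>, and
<small>this is smaller text</small>.</p>
```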
Figure 7-6
The following paragraph has been marked up with text to be inserted (underlined) and deleted
(strikethrough). The output of this code is shown in Figure 7-7.
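A sketch of such markup using the insert (<ins>) and delete (<del>) tags; the sample sentence is an assumption made for illustration:

```html
<!-- <del> marks removed text; <ins> marks added text -->
<p>The meeting is scheduled for <del>Tuesday</del> <ins>Wednesday</ins>
at noon.</p>
```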
Figure 7-7
Abbreviations
The abbreviation tag (<abbr>) can be used to mark a word as an abbreviation and to give users the
expansion of the acronym. For example, consider the following code:
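A minimal sketch; the abbreviation chosen here is an assumption made for illustration:

```html
<!-- The title attribute holds the expansion of the abbreviation -->
<p>This document is written in
<abbr title="Extensible Hypertext Markup Language">XHTML</abbr>.</p>
```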
It is up to the user agent as to how the title attribute’s value will be shown, if at all. Some user agents
will display the value when the mouse is over the acronym.
The <span> tag is used like any other in-line tag (<b>, <i>, <tt>, and so on), surrounding the text/
elements that it should affect. You use the style or class attribute to define what style should be
applied. For example, both of the paragraphs in the following code sample would cause the word “red”
to be rendered in red, bold text:
<head>
<style type=”text/css”>
.boldredtext { color: red; font-weight: bold; }
</style>
</head>
<body>
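<!-- Paragraph 1, using direct style coding -->
<p>We should paint the document <span style="color: red; font-weight: bold;">
red</span>.</p>
<!-- Paragraph 2, using the boldredtext class defined in the style section -->
<p>We should paint the document <span class="boldredtext">red</span>.</p>
</body>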
Of the two methods, the use of the class attribute is preferred over the style attribute because the class
attribute avoids directly (and individually) coding the text. Instead, it references a separate style
definition that can be repurposed for other text or changed globally, as required.
Summary
As Web publishing evolves, so do the tools to adequately provide publishing capabilities to Web
documents. As you have seen, most of the document-formatting capabilities — including textual
formatting — have been relegated to CSS instead of direct coding via HTML tags. However, there are
still quite a few formatting tags that can be used to format text in your Web documents. This chapter
introduced those remaining tags that are still XHTML compliant. However, you should always strive
to encode text formatting in CSS instead of directly coding text using tags.
Tables
Tables were created in HTML as a means to display tabular data — typically scientific or academic
data. However, as the Web became more of a traditional publishing medium, tables evolved from
only supporting plain textual data to being a flexible platform for arranging all sorts of elements to
accomplish all sorts of layouts.
Today, XHTML tables can be used to display tabular data, align elements in a form, or even pro-
vide entire document layout structures. This chapter introduces you to tables and their various
uses and formats.
Parts of a Table
A table in XHTML can be made up of the following parts:
❑ Header row(s)
❑ Column groupings
❑ Body row(s)
❑ Header cells
❑ Body cells
❑ Rows
❑ Columns
❑ Footer row(s)
❑ Caption
Figure 8-1 shows an example of a table with its various parts labeled.
Chapter 8
Figure 8-1
The table in Figure 8-1 was rendered from the following code:
<!-- Table body -->
<tbody>
<tr><th>Header Cell 1</th><th>Header Cell 2</th></tr>
<tr><td>Row 1, Cell 1</td><td>Row 1, Cell 2</td></tr>
<tr><td>Row 2, Cell 1</td><td>Row 2, Cell 2</td></tr>
</tbody>
</table>
</p>
</body>
</html>
Not all of the parts contained in this example are mandatory. It is possible to create a table using only the
table tag (<table>) and row (<tr>) and cell/column (<td>) tags. For example, the following table is
completely valid:
<table>
<tr><td>Row 1, Cell 1</td><td>Row 1, Cell 2</td></tr>
<tr><td>Row 2, Cell 1</td><td>Row 2, Cell 2</td></tr>
</table>
However, as you will see in the rest of this chapter, the breadth and depth of table tags and options
allow you to encapsulate a lot of information within XHTML tables.
It is possible to nest tables within one another. In fact, a particularly popular XHTML technique is to use
tables for sophisticated page layout (covered later in this chapter) — doing so depends on nested tables.
It’s important to note that most user agents build tables in memory before displaying them. This can
cause a delay in displaying a large table.
Formatting Tables
Tables are one of the most versatile elements in XHTML. You can use them to simply align other ele-
ments or as layout control for a full document. Along with functionality usually comes complexity, and
tables are no exception — you can use many options and attributes to format tables. The following sec-
tions detail the various formatting options available.
<p>
Short Text Table<br />
<table border="1">
<tr><td>Short Text 1</td><td>Short Text 2</td></tr>
</table>
</p>
<p>
Longer Text Table<br />
<table border="1">
<tr><td>Much Longer Text 1</td><td>Much Longer Text 2</td></tr>
</table>
</p>
Figure 8-2
Once a table expands to the limits of the user agent’s window, the content of its cells will wrap within
their respective cells.
Note that both tables are left-aligned in the user agent window.
However, there are times when you want to explicitly define a table’s width and possibly its alignment.
<p>
50% Table Width<br />
<table border="1" width="50%">
<tr>
<td>Cell 1</td><td>Cell 2</td>
<td>Cell 3</td><td>Cell 4</td>
</tr>
</table>
</p>
The containing object is a nonconstrained paragraph that spans the width of the user agent. The result is
that the table will occupy 50 percent of the user agent’s window width, as shown in Figure 8-3.
Figure 8-3
To specify an exact width of a table, use pixel width specifications instead. For example, if you need a
table to be 500 pixels wide, you could use a table definition similar to the following:
<table width="500">
If the specified table width exceeds the user agent’s window width, it is up to the user agent to handle
the overflow, via resizing the table, wrapping it, or providing scroll bars as shown in Figure 8-4.
Figure 8-4
Besides specifying the width of the table as a whole, you can also specify the width of each column
within the table, using width attributes in <th> and <td> tags or specifying width within <col> or
<colgroup> tags. These techniques are covered in the “Cells” and “Grouping Columns” sections later
in this chapter.
For example, if you wanted a table to be centered in the user agent’s window, you could use code similar
to the following (whose result is shown in Figure 8-5):
<p>Centered Table</p>
<p>
<table border="1" align="center">
<tr>
<td>Cell 1</td><td>Cell 2</td>
<td>Cell 3</td><td>Cell 4</td>
</tr>
</table>
</p>
Figure 8-5
Note that the align attribute has no visible effect on a table that occupies the full width of its container
object.
Figure 8-6 shows a graphical representation of cell padding and spacing.
Figure 8-6 (diagram labels: Cell Spacing, Cell Border, Cell Padding, Cell Contents)
Cell padding is controlled with the <table> tag’s cellpadding attribute and can be specified in pixels
or percentages. When specified by percentage, the browser uses half of the specified percentage for each
side of the cell. The percentage is of the available space for the dimension (size of the cell), vertical or
horizontal.
Cell spacing is controlled with the cellspacing attribute. Like cellpadding, the cellspacing attribute
can be specified in pixels or percentages. Figure 8-7 shows a table whose cellspacing attribute has been
set to 20 percent using the following <table> tag:
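A sketch of such a tag; the border attribute is an assumption added so the spacing is visible, and how a percentage value is rendered is left to the user agent:

```html
<table border="1" cellspacing="20%">
  <tr><td>Cell 1</td><td>Cell 2</td></tr>
  <tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
```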
Figure 8-7
CSS offers several additional formatting options for tables and their elements. CSS is covered in Part II
of this book.
Table Borders
The <table> tag’s border attribute can be used to control the width of the border surrounding the
table. For example, consider the following three tables and the resulting output shown in Figure 8-8:
<p>
No Borders<br />
<table border="0">
<tr><td>Cell 1</td><td>Cell 2</td></tr>
<tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
</p>
<p>
Border = 1<br />
<table border="1">
<tr><td>Cell 1</td><td>Cell 2</td></tr>
<tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
</p>
<p>
Border = 5<br />
<table border="5">
<tr><td>Cell 1</td><td>Cell 2</td></tr>
<tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
</p>
The border attribute’s value specifies the width of the border in pixels. The default border width is 0,
or no border.
Borders can be an effective troubleshooting tool when dealing with table problems in XHTML. If you
are having trouble determining what is causing a problem in a table, try turning on the borders to bet-
ter visualize the individual rows and columns. If you are using nested tables, turn on the borders of
individual tables (possibly using different border values for different tables) until you narrow down the
scope of the problem.
Figure 8-8
To specify which outside borders are displayed, use the frame attribute with the <table> tag. The
frame attribute supports the values displayed in the following table:
Value Definition
void Display no borders.
above Display a border on the top of the table only.
below Display a border on the bottom of the table only.
hsides Display borders on the horizontal sides (top and bottom) only.
lhs or rhs Display only the left side or the right side border.
vsides Display borders on the vertical sides (right and left) only.
box or border Display borders on all sides of the table (the default when the border
attribute is set without specifying frame).
Not all user agents use the same defaults for table borders. If you want a table rendered a particular way,
use care to explicitly define each border option.
Table Rules
The <table> tag’s rules attribute controls which rules (borders between cells) are displayed within a
table. The rules attribute supports the values shown in the following table:
Value Definition
none Display no rules.
groups Display rules between row groups and column groups only.
rows Display rules between rows only.
cols Display rules between columns only.
all Rules will appear between all rows and columns.
For example, the following table code will cause the table to render with rules between columns only, as
shown in Figure 8-9:
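A minimal sketch of such a table; the cell contents and the border value are assumptions made for illustration:

```html
<!-- rules="cols" draws rules between columns but not between rows -->
<table border="1" rules="cols">
  <tr><td>Cell 1</td><td>Cell 2</td><td>Cell 3</td></tr>
  <tr><td>Cell 4</td><td>Cell 5</td><td>Cell 6</td></tr>
</table>
```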
Figure 8-9
Note that the width of rules is governed by the setting of the cellspacing attribute. For example,
setting cellspacing to a value of 3 will result in rules 3 pixels wide.
Rows
Rows are the horizontal elements of the table grid and are delimited with table row tags (<tr>). For
example, a table with five rows would use the following pseudocode:
<table>
<tr> row 1 </tr>
<tr> row 2 </tr>
<tr> row 3 </tr>
<tr> row 4 </tr>
<tr> row 5 </tr>
</table>
The rows are divided into columns (individual cells within the row) via table data (<td>) or table head-
ing (<th>) tags, which are covered in the next section.
The <tr> tag supports the options shown in the following table:
Attribute Definition
align Set to right, left, center, justify, or char (character), this attribute
controls the horizontal alignment of data in the row. Note that if you use
char alignment, you should also specify the alignment character with
the char attribute described below.
char Specifies the alignment character to use with character (char) alignment.
charoff Specifies the offset from the alignment character to align the data on. Can
be specified in pixels or as a percentage.
valign Set to top, middle, bottom, or baseline, this attribute controls the
vertical alignment of data in the row. Baseline vertical alignment aligns
the baseline of the text across the cells in the row.
Bottom vertical alignment aligns the row to the bottom of neighboring elements. Setting the vertical
alignment to baseline will cause the row to be aligned to the baseline of neighboring text (the line
text rests upon when written on ruled paper).
You can use the align value of char to align columns on a particular character — a decimal (.) if you
want to align numbers, for example. If you set alignment to char, you will also need to specify the align-
ment character using the char attribute. For example, to align a cell’s data on a decimal point, you
would use something similar to the following:
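A sketch of character alignment on a decimal point; the values are assumptions made for illustration, and support for char alignment varies among user agents:

```html
<table border="1">
  <!-- align="char" aligns the data on the character given in char -->
  <tr><td align="char" char=".">123.45</td></tr>
  <tr><td align="char" char=".">7.5</td></tr>
</table>
```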
You can also use the charoff attribute to offset the alignment from a particular character. When
using charoff, you also need to set align to char and use the char attribute to specify the
character to offset from.
Note that using alignment attributes in a table row tag will cause all cells in that row to be formatted
accordingly. If you want to format individual cells in the row differently, use attributes in the appropri-
ate table data or table header tags instead.
Cells
The cells of a table are the elements that actually hold data. The cell definitions also define the column in
which they reside. Table cells are delimited by table data tags (<td>), as shown in the following example:
<table>
<tr> <!-- Row 1 -->
<td>Column 1</td><td>Column 2</td><td>Column 3</td>
</tr>
<tr> <!-- Row 2 -->
<td>Column 1</td><td>Column 2</td><td>Column 3</td>
</tr>
</table>
Formatting your tables with ample white space (line breaks and indents) will help you accurately for-
mat and understand your tables. There are just as many ways to format a table in XHTML as there are
Web programmers — find a style that suits your tastes and stick to it.
This code defines a table with two rows and three columns, due to the three sets of <td> tags within
each row (<tr>).
You can also use table header tags (<th>) to define cells that are to be used as headers for the columns.
Expanding on the previous example, the following adds column headers:
<table>
<tr> <!-- Header Row -->
<th>Header 1</th><th>Header 2</th><th>Header 3</th>
</tr>
<tr> <!-- Body Row 1 -->
<td>Column 1</td><td>Column 2</td><td>Column 3</td>
</tr>
<tr> <!-- Body Row 2 -->
<td>Column 1</td><td>Column 2</td><td>Column 3</td>
</tr>
</table>
Most user agents render the table header cells (those delimited by <th> tags) in a different font, usually
bold. This allows an easy method to format headings without using additional character formatting tags.
However, as with all formatting defaults, each user agent is free to define its own default formatting for
table headers. If you want your headers to appear with specific textual formatting, you should take care
to explicitly code them as such.
Some user agents will not properly render an empty cell (for example, <td></td>). When you find
yourself needing an empty cell, get in the habit of placing a nonbreaking space entity (&nbsp;) in the cell
(for example, <td>&nbsp;</td>) to help ensure that the user agent will render your table correctly.
Although cells represent the smallest element in a table, they have the most attributes for their tags.
Supported attributes include those shown in the following table:
Attribute Definition
abbr An abbreviated form of the cell’s contents. User agents can use the
abbreviation where appropriate (using a voice synthesizer to speak a
short form of the contents, displaying on a small device, and so on).
As such, the value of the abbr attribute should be as short and concise
as possible.
align The horizontal alignment of the cell’s contents — left, center, right,
justify, or char (character).
axis Used to define a conceptual category for the cell, which can be used to
place the cell’s contents into dimensional space. How the categories are
used (if at all) is up to the individual user agent.
char The character used to align the cell’s contents if the alignment is set
to char.
charoff The offset from the alignment character to use when aligning the cell’s
contents by character.
colspan How many columns the cell should span (the default is 1). See the
“Spanning Columns and Rows” section of this chapter for more
information.
headers A space-separated list of header cell id attributes that correspond with
the cells used as headers for the current cell. User agents use this informa-
tion at their discretion — a verbal agent might read the contents of all
header cells before the current cell’s contents.
rowspan How many rows the cell should span (the default is 1). See the “Spanning
Columns and Rows” section of this chapter for more information.
scope The scope of the current cell’s contents when used as a header — row,
col (column), rowgroup, or colgroup (column group). If set, the cell’s
contents are treated as a header for the corresponding element(s).
valign The vertical alignment of the cell’s contents — top, middle, bottom, or
baseline.
Previous versions of HTML also supported a nowrap attribute for cell tags. In HTML version 4.01
(and hence, XHTML) that attribute was deprecated in favor of CSS formatting.
Captions
Captions allow you to annotate your tables, detailing a table’s contents or meaning for the reader. The
caption section of an XHTML table is encapsulated in caption tags (<caption>) within the table tags
(<table>). For example, consider the following table and the resulting output shown in Figure 8-10:
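A minimal sketch of a captioned table; the caption text and the data are assumptions made for illustration:

```html
<table border="1">
  <!-- The caption comes immediately after the opening table tag -->
  <caption>Table 1: Items in stock</caption>
  <tr><th>Item</th><th>Quantity</th></tr>
  <tr><td>Gizmos</td><td>2</td></tr>
  <tr><td>Doodads</td><td>3</td></tr>
</table>
```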
Note that the caption must come immediately after the <table> tag so that the user agent will know
to reserve space for it. Also, the caption generally appears centered above the table, but different user
agents may display it differently.
You can use styles to format the caption. For more information on styles, see Part II of this book.
Figure 8-10
Each section supports the same tags delimiting columns and rows — table rows (<tr>), table headings
(<th>), and table data (<td>). For example, a table heading section might resemble the following:
<thead>
<tr>
<th>Cust #</th>
<th>Customer Name</th>
<th>Last Order Date</th>
</tr>
</thead>
Note that, in this case, <th> tags are used to ensure that the cells are formatted as headings. However,
you could just as easily use <td> tags if you wanted.
A sample use of these tags is shown in the following code, and the result is displayed in a user agent
within Figure 8-11:
Notice the use of the rules="groups" attribute in the <table> tag. This causes the rules to be
inserted between the sections (row groups) only (see Figure 8-11).
<tr>
<td>4</td><td>20</td><td>7</td><td>3</td>
</tr>
</tbody>
</table>
</body>
</html>
Although counterintuitive, the <tfoot> section should be placed before the <tbody> section in the table
code. This allows the user agent to anticipate the footer section when rendering the table.
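The ordering of the three sections can be sketched as follows; the data is an assumption made for illustration:

```html
<table border="1" rules="groups">
  <thead>
    <tr><th>Item</th><th>Quantity</th></tr>
  </thead>
  <!-- The footer is coded before the body but rendered after it -->
  <tfoot>
    <tr><td>Total</td><td>5</td></tr>
  </tfoot>
  <tbody>
    <tr><td>Gizmos</td><td>2</td></tr>
    <tr><td>Doodads</td><td>3</td></tr>
  </tbody>
</table>
```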
Figure 8-11
Backgrounds
Previous versions of HTML supported a bgcolor attribute in table, row, header, and cell tags. The
attribute was used to define a color for the element it was included with. However, in HTML 4.01 that
attribute was deprecated. To specify background colors in table elements, you must now use CSS.
For example, the following style definition defines a CSS class for a table with a red background:
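A minimal sketch of such a definition; the class name redbg is hypothetical:

```html
<style type="text/css">
  /* Applies to any table coded as <table class="redbg"> */
  table.redbg { background-color: red; }
</style>
```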
Using CSS, you can also use graphic images as backgrounds for tables:
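A sketch of an image background; the class name imagebg and the file name background.gif are assumptions made for illustration:

```html
<style type="text/css">
  /* Applies to any table coded as <table class="imagebg"> */
  table.imagebg { background-image: url("background.gif"); }
</style>
```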
Many user agents do not currently support color or image backgrounds in tables.
Spanning Columns
Using the colspan attribute in table header (<th>) and table data (<td>) tags, you can span a cell over
two or more columns. For example, consider the following table and the result in a user agent, shown in
Figure 8-12:
<td>Amy</td>
<td>7</td><td>7</td><td>0</td><td>0</td><td>0</td><td>0</td>
</tr>
<tr>
<td>Ted</td>
<td>2</td><td>2</td><td>4</td><td>2</td><td>2</td><td>2</td>
</tr>
<tr>
<td>Thomas</td>
<td>7</td><td>3</td><td>4</td><td>0</td><td>0</td><td>0</td>
</tr>
<tr>
<td>Corinna</td>
<td>0</td><td>0</td><td>4</td><td>10</td><td>0</td><td>0</td>
</tr>
</table>
The colspan attributes were added to table header tags so the result is formatted as a header. The row
where the colspan attributes are used has fewer columns (by necessity, one fewer for each column
spanned).
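A small sketch of the idea, using hypothetical headings and data: the header row holds only two cells because the second one spans three columns.

```html
<table border="1">
  <tr>
    <th>Name</th>
    <!-- One header cell spans the three day columns -->
    <th colspan="3">Hours by day</th>
  </tr>
  <tr>
    <td>Amy</td><td>7</td><td>7</td><td>0</td>
  </tr>
</table>
```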
Figure 8-12
Spanning Rows
You can use the rowspan attribute in table data (<td>) and table header (<th>) tags to span a cell
across several rows. For example, consider the following table and the results shown in Figure 8-13:
Figure 8-13
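A minimal rowspan sketch, again with hypothetical data, might look like this. Note that the second row omits its first cell, because the spanned cell from the row above already occupies that position:

```html
<table border="1">
  <tr>
    <!-- This header cell spans both data rows beside it -->
    <th rowspan="2">Week 1</th>
    <td>Amy</td><td>7</td>
  </tr>
  <tr>
    <!-- No first cell here: the spanned cell above already fills it -->
    <td>Ted</td><td>2</td>
  </tr>
</table>
```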
Grouping Columns
HTML 4.01 added a few extra tags to make defining and formatting groups of columns easier. The two
tags, <colgroup> and <col>, are used together to define and optionally format column groups and
individual columns.
The <colgroup> tag is used to define and optionally format groups of columns. The tag supports the
same formatting attributes as the <tr> and <td>/<th> tags (align, valign, width, and so on). Any
columns defined by the <colgroup> will inherit the formatting contained in the <colgroup> tag’s
attributes and styles.
The <colgroup> tag’s span attribute indicates how many columns are in the group. For example, the
following code defines the first three columns in a group and sets their alignment to center:
<table>
<colgroup span="3" align="center">
</colgroup>
...
Additional <colgroup> tags create additional column groups. You must use additional column groups
if the columns you are grouping are not contiguous or do not start with the first column. For example,
the following HTML table code creates three column groups:
<table>
<colgroup span="2" align="center">
<!-- This group contains columns 1 & 2 -->
</colgroup>
<colgroup span="3" align="char" char=".">
<!-- This group contains columns 3 - 5 -->
</colgroup>
<colgroup span="5" align="right" style="font-weight: bold;">
<!-- This group contains columns 6 - 10 -->
</colgroup>
...
Column groups that do not have explicit formatting attributes defined in their respective <colgroup>
tags inherit the standard formatting for the columns of the table. However, the group is still defined as a
group and will respond accordingly to table attributes that affect groups (rules="groups", and so on).
What if you don’t want all the columns within the group formatted identically? For example, in a group
of three columns, suppose you wanted the center column (column number 2 in the group) to have its
text formatted as bold text? To define specific formatting for the columns in the group, you use the
<col> tag. To format a group using the preceding example (middle column bold), you could use code
similar to the following:
<table>
<colgroup span="3">
<!-- This group contains columns 1 - 3 -->
<col></col>
<col style="font-weight: bold;"></col>
<col></col>
</colgroup>
...
The <col> tag follows rules similar to those of the <colgroup> tag:
In standard HTML the <col> tag has no closing tag. However, in XHTML the tag must be appropriately
closed.
Using the <colgroup> or <col> tags does not eliminate or change the necessity of <td> tags (which
actually form the columns). You must still take care in placing the rest of the tags within the table to
ensure proper formatting of your tables.
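The following sketch shows one way these points fit together in XHTML: each <col /> tag is closed, and the <td> tags in the row still supply the actual cells. The widths and cell contents are hypothetical.

```html
<table border="1">
  <colgroup>
    <col width="50" />
    <col width="150" />
    <col width="50" />
  </colgroup>
  <!-- The cells below form the three columns described above -->
  <tr><td>1</td><td>2</td><td>3</td></tr>
</table>
```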
Figure 8-14
At first glance you wouldn’t think that many tables were involved in this document’s creation. However,
if you enable borders on all the tables, their use and multitude becomes quite apparent, as shown in
Figure 8-15.
Figure 8-15
This section covers some of the more popular uses of tables for page layout purposes. However, the
possibilities for using tables are endless; feel free to experiment with different layouts or with combining
layouts.
Most table layout schemes use nested tables to accomplish their formatting. Remember that tables can
be nested only within cells of other tables (between <th> or <td> tags).
Floating Page
The floating page layout (as shown in Figure 8-16) is quite popular and used for documents of all kinds,
from corporate sites to personal online diaries.
Figure 8-16
The effect is fairly easy to create using a few nested tables, as shown in the following code, the output of
which is shown in Figure 8-17.
/* Sets "desktop" color (behind page) */
body { background-color: #B0C4DE; }
</style>
</head>
<body>
<p>
<!-- Body container -->
<!-- (background = border, padding = border width
margin = centered table) -->
<table border="0" cellpadding="4" cellspacing="0"
style="background-color: black;
margin: 0 auto;">
<tr>
<td>
</td>
</tr>
</table>
<!-- /Floating page -->
</td>
</tr>
</table>
<!-- /Body container -->
</p>
</body>
</html>
The comments in the code delimit the individual tables and content areas. It is a good practice to follow
standard code formatting (indentation, liberal white space, and so on) and to include sufficient
comments to easily keep track of all your tables, how they are formatted, and what they accomplish.
For more of a drop-shadow effect, set two adjacent borders to a nonzero value, as shown in the
following code:
Figure 8-17
This will increase the width of the right and bottom borders, giving the page a more realistic, drop-
shadow effect.
Note that not all browsers correctly support attaching a background-color CSS style to the
body tag.
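One way to get the drop-shadow effect described above is sketched here. It assumes the page table is given CSS borders, with only the right and bottom borders set to a nonzero width; the 4-pixel width and the colors are arbitrary choices.

```html
<table cellpadding="10" cellspacing="0"
       style="background-color: white;
              border-style: solid;
              border-color: black;
              border-width: 0 4px 4px 0;">
  <!-- border-width order is top, right, bottom, left -->
  <tr>
    <td>Page content</td>
  </tr>
</table>
```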
For example, consider the logo shown in Figure 8-18, which is typical of current Web document
mastheads using a nonrectangular graphic. A sidebar containing nonvital information appears under the
planet on the logo, while the main body of the document appears under the logo text.
Figure 8-18
Using a graphic editor like Paint Shop Pro, you can break the image into three parts, as shown in
Figure 8-19.
Figure 8-19
Those parts can then be placed into a table; the top of the planet and the text are placed in the first row,
and the bottom of the planet and the main body of the document in the second row, as shown in the
following code, which renders similarly to Figure 8-20.
</td>
<td valign="top">
<!-- Main Page Content Starts Here -->
<p>Main Page Content</p>
</td>
</tr>
</tbody>
</table>
The appropriate content replaces the placeholders, creating a seamless page design like that shown in
Figure 8-19.
Many graphic editing programs have a Slice feature that can help break apart an image, and some
applications will even build the appropriate HTML for you. The Slice feature in Paint Shop Pro (accessed via
File ➪ Export ➪ Image Slicer) is shown in Figure 8-20.
Figure 8-20
Note that white space in your code can create inadvertent problems when embedding graphics in table
cells. Be careful not to leave any white space between table data tags (<td>) and the image tags (<img>).
For example, the following code will result in a small margin between the image and the edge of the
table cell due to the line breaks and spaces used to indent the <img> tag:
<td>
<img src="logo_top.jpg" alt="top piece of logo">
</td>
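By contrast, writing the same cell with no white space between the tags avoids the unwanted margin. This is only a sketch of the fix; it reuses the file name from the example above and closes the tag in XHTML style:

```html
<td><img src="logo_top.jpg" alt="top piece of logo" /></td>
```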
Navigational Blocks
Tables can also be used to provide simpler layouts for navigational panes. For example, you can
provide a navigational pane on top of a document or at either margin. Figure 8-21 shows an example of
a navigational pane at the top of a document. Figure 8-22 shows an example of a navigational pane on
the left margin of the document.
Figure 8-21
The table borders in both examples (Figures 8-21 and 8-22) have been turned on to show the layout of
the tables involved. Although most layout designs use no borders, it may be advantageous to turn some
borders on to help delimit certain sections of your documents.
The top navigation pane (Figure 8-21) provides an area where a menu can be placed. The left-margin
pane (Figure 8-22) provides individual cells for individual menu items; you can use this approach to
uniquely position the individual menu items.
Figure 8-22
Multiple Columns
Tables can also be used to provide a newspaper-like format for your documents. This layout is quite
simple, relying on two (or more) parallel columns, as shown in Figure 8-23.
Figure 8-23
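A minimal two-column sketch follows; the widths, padding, and placeholder text are assumptions. Keep in mind that text does not flow from one cell to the next automatically, so the content must be divided between the columns by hand.

```html
<table border="0" cellpadding="10" cellspacing="0" width="100%">
  <tr>
    <!-- valign="top" keeps both columns starting at the same height -->
    <td width="50%" valign="top">
      <p>Text of the first column...</p>
    </td>
    <td width="50%" valign="top">
      <p>Text of the second column...</p>
    </td>
  </tr>
</table>
```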
XHTML allows a special document type definition (DTD) for frame support, the Extensible HTML
version 1.0 Frameset DTD, available at http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd.
Although once very popular with Web designers, frames have become an outdated construct and should
not be used for the following reasons:
❑ Frames are hard to code (requiring a special frameset document in addition to content documents)
and are reasonably hard to manage.
❑ Frame support in user agents cannot be relied on as the Web moves to more
resource-constrained platforms (mostly in the mobile arena).
❑ Frames are going the route of deprecation and aren’t XHTML 1.1 compliant.
Summary
This chapter introduced you to one of the most powerful and flexible XHTML elements, the table.
You learned about the various pieces that make up the table whole, as well as how to format each. You
also learned about the evolution of the table and how some of the table-formatting attributes have
migrated — much like other elements’ formatting attributes — into CSS. In addition, you learned how
to stretch the boundaries of tables to provide layout structures for text and even entire documents.
Forms
The Web was built as a one-way communication medium, designed to deliver content to a user
but not to gather data from the user. However, the growing usefulness of the World Wide Web soon
drove the creation of constructs that enable users to send information as well as receive it. Enter the
form, which places familiar graphical controls in Web documents so that users can
interact with databases, submit orders to retailers, and more. This
chapter details the ins and outs of XHTML forms and their controls.
Understanding Forms
HTML forms allow users to interact with Web documents by providing GUI controls for data
entry. The HTML side of forms simply collects the data. A separate handler, usually a script of
some sort, is used to do something useful with the data. A typical interaction with an HTML form
resembles that shown in Figure 9-1.
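[Diagram: the Web Server sends the Document containing the Form to the User Agent over HTTP; the Form Data returns to the Web Server, which passes the Data to the Form Handler]
Figure 9-1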
1. The Web server sends the HTML document (containing the form) to the user agent.
2. The user uses the form’s GUI controls to enter data and submits the completed form.
Chapter 9
3. The form is submitted to a specified server (typically the same server that delivered the form
document) to be passed to a handler.
4. The server passes the data stream to a specified handler, which uses the data in a prescribed
manner.
A sample form, using the various form fields, is shown in the following code and rendered in a user
agent in Figure 9-2.
</td>
<td>
<option id="Periph">Peripherals</option>
</select>
</p>
</td>
</tr>
<tr>
<td></td>
<td>
<!-- Submit and Reset buttons -->
<p>
<input type="submit" name="submit" id="submit" value="Submit"/>
<input type="reset" name="reset" id="reset" />
</p>
</td>
</tr>
</table>
</form>
</p>
</body>
</html>
The various fields and options are covered in appropriate sections later in this chapter.
Form Handling
As previously mentioned, a separate form handler is necessary to do something useful with the data.
Form handlers are generally script files designed to interact with e-mail, databases, or some other
system. For example, a Perl program might be used to query a database based on user input and then pass
the results back to the user via a separate document.
A simple PHP form handler that logs form data to a file might resemble the following:
<?php
?>
Note that this form handler is very basic: it doesn’t do any error checking, convert encoded values,
or provide any feedback to the user. It simply takes the field data passed to it and puts it in a
comma-separated value (CSV) log file.
Common form handlers are created in Perl, Python, PHP, or other server-side programming languages.
More information on scripting languages that can be used for form handling can be found in Parts IV
and V of this book.
Security is an issue that should be considered when creating form handlers. One of the earliest, most
popular form handlers, formmail.cgi, was found to have a vulnerability that allowed anyone to send
data to the script and have it e-mail the data to whomever the sender wanted. This functionality was an
instant hit with e-mail spammers, who still use unsecured formmail scripts to send anonymous spam.
If you want a generic form handler to simply store or e-mail the data, you can choose from a few routes.
Several sites on the Internet have generic form handlers available. For example, CGI Resource Index, at
http://cgi.resourceindex.com/, has several dozen scripts that you can download and use for your
form handling.
Several services are also available that allow you to process your form data through their server
and scripts. You may need such a service if you cannot run scripts on your server or want a generic,
no-hassle solution. A partial list of script services is also available at the CGI Resource Index,
http://cgi.resourceindex.com/. From the main page, select Remotely Hosted and browse for a
service that meets your needs.
The HTTP GET method transfers data by attaching it to the URL text passed to the form handler. You
have probably noticed URLs that resemble the following:
http://www.example.com/forms.cgi?id=45677&character=Taarna
The data appears after the question mark and is in name/value pairs. For example, the variable named
id has the value of 45677, and the variable character has the value of Taarna. In most cases, the
variable name corresponds to field names from the form, but how they translate to values within the form
handler is up to the handler itself.
Because the data is passed as plain text in the URL, it is easy to implement: you can pass data by
simply adding the appropriate coding to the URL used to call the data handler. However, GET is also
inherently insecure. You should never use GET to send confidential data to a handler, because the data is
clearly visible in most user agents and can be easily sniffed by hackers.
The HTTP POST method passes data by encoding it in the body of the HTTP request. As such, it is not
normally visible to a user and is less exposed than GET data, although it is still sent as plain text
unless the connection is encrypted. POST can be harder to implement by hand but, thankfully, most Web
technologies make passing data via POST trivial.
Note that GET data is also limited in size due to being encapsulated in the URL.
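To make the link between form fields and the URL concrete, here is a hypothetical form whose field names mirror the sample URL shown earlier. The handler address is the example URL from the text; the default values and the unnamed submit button are assumptions for illustration.

```html
<form action="http://www.example.com/forms.cgi" method="get">
  <p>
    <input type="text" name="id" id="id" value="45677" />
    <input type="text" name="character" id="character" value="Taarna" />
    <!-- No name attribute here, so the button adds nothing to the URL -->
    <input type="submit" value="Send" />
  </p>
</form>
<!-- Submitting with the default values requests:
     http://www.example.com/forms.cgi?id=45677&character=Taarna -->
```

Changing method="get" to method="post" sends the same name/value pairs in the request body instead of the URL.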
The action attribute provides a URL to a suitable form handler that will process the form data
accordingly. The method attribute specifies how the form data should be passed to the handler, via GET or POST.
The <form> tag has several additional attributes, shown in the following table:
Attribute        Values
accept           A comma-separated list of content types that the handler’s server will accept
accept-charset   A comma-separated list of character sets the form data may be in
enctype          The content type the form data is in
id               The ID of the form (used instead of name)
name             The name of the form (deprecated, use the id attribute instead)
target           Where to open the handler URL (deprecated)
Although you may not need these attributes in all forms, they can be very useful. The accept,
accept-charset, and enctype attributes are invaluable for processing nontextual and international data. The
id attribute is used to uniquely identify a form in your document. This is essential for scripting,
especially if you use more than one form in the same document.
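As a sketch, a <form> tag that combines the attributes discussed above might look like the following. The handler path and the id value are hypothetical; application/x-www-form-urlencoded is the default encoding type in HTML 4.01.

```html
<form action="/cgi-bin/handler.cgi" method="post" id="orderform"
      enctype="application/x-www-form-urlencoded" accept-charset="UTF-8">
  <!-- form fields go here -->
</form>
```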
❑ button
❑ checkbox
❑ file
❑ hidden
❑ image
❑ password
❑ radio
❑ reset
❑ submit
❑ text
For example, the following two tags define a text field and a submit button:
More information on the various fields supported by the input tag appears in appropriate sections later
in this chapter.
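A minimal sketch of two such <input> tags, a text field and a submit button, is shown here. The name and id values are assumptions chosen for illustration.

```html
<input type="text" name="username" id="username" />
<input type="submit" name="submit" id="submit" value="Submit" />
```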
HTML requires that all fields contain name attributes for their data to be submitted with the form. Any
field that does not have a name attribute will not be included in the form data submission. Furthermore,
HTML uses the name attribute to identify the value — as a sort of variable name, if you will. Therefore, it
is important that you include name attributes in all your form fields. It is also suggested that the name
values be succinct and machine-readable — that is, devoid of spaces and nonalphanumeric characters.
Some applications (some scripts and the <label> tag) require that fields also contain an id attribute. For
example, user agents use the <label> tag’s for attribute to match another field’s id attribute, resulting in
a label-field match. JavaScript and other scripting languages can use the id attribute to directly access
form fields.
To be on the safe side, it’s usually best to include both attributes in all form fields.
Although all the attributes previously listed are not required, they represent the minimum attributes that
you should always use with text input fields. The following sample text box is displayed 30 characters
long, accepts a maximum of 40 characters, and has no initial value:
The following code example defines a text box that is displayed as a box 40 characters long, only accepts
40 characters, and has an initial value of “[email protected]” (supplied via the value attribute):
Note that the password field only visibly obscures the data to help stop casual snoops from seeing what
a user inputs into a field. It does not encode or in any way obscure the information at the data level. As
such, be careful how you use this field.
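The fields described above could be written along these lines. This is only a sketch: the field names, the sizes of the password field, and the sample address user@example.com are hypothetical.

```html
<!-- Displayed 30 characters wide, accepts up to 40, no initial value -->
<input type="text" name="fullname" id="fullname" size="30" maxlength="40" />

<!-- Displayed 40 characters wide, accepts up to 40, with an initial value -->
<input type="text" name="email" id="email" size="40" maxlength="40"
       value="user@example.com" />

<!-- Characters are masked on screen only; the value is sent as entered -->
<input type="password" name="password" id="password" size="20" maxlength="20" />
```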
Radio Buttons
The radio input field defines one in a series of radio buttons. When one is selected, the others in the
group are deselected, making the buttons mutually exclusive.
The value attribute defines what value is returned to the handler if the button is selected. This attribute
should be unique between buttons in the same group. Note that all radio buttons within a group share
the same name attribute value, which defines them as a group.
The following code defines a group of radio buttons that allows a user to select their gender:
<p>Gender:
<input type="radio" name="gender" id="male" value="male" /> Male
<input type="radio" name="gender" id="female" value="female" /> Female</p>
If you want a radio button selected by default, use the checked attribute within the appropriate button’s
tag. Remember that XML and its variants do not allow attributes without values. Although HTML will
allow the checked attribute to be used with or without a value, you should specify the checked
attribute as checked="checked" instead of just checked to remain XHTML compliant.
Fieldsets are handy elements to use with radio buttons. More information on fieldsets appears in a sepa-
rate section later in this chapter.
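For instance, a hypothetical group of shipping options with the first button selected by default might be sketched as follows (the names and values are assumptions):

```html
<p>Shipping:
<input type="radio" name="shipping" id="ground" value="ground" checked="checked" /> Ground
<input type="radio" name="shipping" id="air" value="air" /> Air</p>
```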
Checkboxes
The checkbox field has the following format:
Checkboxes are very similar in definition to radio buttons; however, unlike radio buttons, multiple
checkboxes can be selected from the same group. The following example displays a checkbox allowing
the user to select whether they should receive promotional e-mails:
You can use the checked attribute to preselect checkboxes in your forms. Also, just like radio buttons,
the value attribute is used as the value of the checkbox if it is selected. If no value is given, selected
checkboxes are typically given the value of “on” by the user agent.
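A minimal checkbox sketch, with a hypothetical field name and value, preselected via the checked attribute:

```html
<p>
<input type="checkbox" name="optin" id="optin" value="yes" checked="checked" />
Send me e-mail about new products
</p>
```

If the user leaves the box checked, the handler receives optin=yes; if the box is cleared, the field is not sent at all.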
List Boxes
List boxes are used to allow a user to pick one or more textual items from a list. The list can be presented
in its entirety, with each element visible, or as a drop-down list where users must scroll to their choices.
List boxes are delimited using select (<select>) tags, with their options delimited using option
(<option>) tags. Optionally, you can use the option group (<optgroup>) tag to group related options
within the list.
The <select> tag provides the container for the list and has the following format:
The size attribute determines how many items will initially be displayed by the control. If the number
of items in the list exceeds the number of lines to display, the user agent will provide scroll bars so that
the user can navigate to the additional items in the list. If the size attribute is set to 1, the list will
become a drop-down list; clicking the list will expand it to show multiple items with a scroll bar.
The select tag does not include an attribute to control the width of the control. The select box is
automatically sized according to the longest element (<option>) it contains. If you wish a select list to be wider,
a common practice is to include a placeholder option of the appropriate length, similar to the following:
However, including such an option places an additional burden on the form handling; you must ensure
that this option is not selected if the field is not optional.
The <option> tag delimits the items to be contained in the list. Each item is given its own <option> tag
pair. The option tag has the optional attributes shown in the following table:
Attribute   Values
label       A shorter label for the item that the user agent can use
selected    Indicates that the item should be initially selected
value       The value that should be sent to the handler if the item is selected; if omitted, the text of the item is sent
The value attribute is useful for fields where you need to provide human-readable text (including
spaces, punctuation, and so on) in the field for the user’s benefit but wish to return a more succinct
value to the form handler.
<option value="sun">Sunday</option>
<option value="mon">Monday</option>
<option value="tue">Tuesday</option>
<option value="wed" selected="selected">Wednesday</option>
<option value="thr">Thursday</option>
<option value="fri">Friday</option>
<option value="sat">Saturday</option>
Occasionally, you will want to group options of a list together for clarity. For this, you can use option
group (<optgroup>) tags to delimit the groups of options. For example, the following code defines two
groups for the preceding list of options, weekend and weekday:
<optgroup label="Weekend">
<option>Sunday</option>
<option>Saturday</option>
</optgroup>
<optgroup label="Weekday">
<option>Monday</option>
<option>Tuesday</option>
<option>Wednesday</option>
<option>Thursday</option>
<option>Friday</option>
</optgroup>
It is up to the user agent as to how to display the option groups. A popular method of displaying the
groups is to display each group label above the options to which it applies, as shown in Figure 9-3.
Figure 9-3
Combining the various list tags to create a list would look similar to the following code:
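A sketch of such a combined list, reusing the day options and values from the earlier fragments (the name and id of the <select> tag are assumptions):

```html
<p>
<select name="day" id="day" size="1">
  <optgroup label="Weekend">
    <option value="sun">Sunday</option>
    <option value="sat">Saturday</option>
  </optgroup>
  <optgroup label="Weekday">
    <option value="mon">Monday</option>
    <option value="tue">Tuesday</option>
    <option value="wed" selected="selected">Wednesday</option>
    <option value="thr">Thursday</option>
    <option value="fri">Friday</option>
  </optgroup>
</select>
</p>
```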
The cols and rows attributes define the size of the text box in the user agent. If the content of the box
exceeds its dimensions, the user agent will provide a vertical scroll bar to scroll the content
appropriately. Note that the text area tag is one of the few form tags that has a formal closing tag. If the field
should have a default value, it is placed between the tags. The tags should be adjacent to one another
if the field is to be blank.
It is important to carefully watch the formatting of your code around a text area tag. For example, if you
want the field to be initially blank, you cannot place the open and close tags on separate lines in the code:
<textarea>
</textarea>
This would result in the field containing a newline character — it would not be blank.
The text entered into the <textarea> field wraps within the width of the box, but the text is sent as one
long string to the handler. However, where the user enters line breaks, those breaks are also sent to the
handler, embedded in the string.
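The two cases described above, a blank field and a field with a default value, can be sketched like this. The names, dimensions, and default text are hypothetical.

```html
<!-- Initially blank: the opening and closing tags are adjacent -->
<textarea name="comments" id="comments" cols="50" rows="5"></textarea>

<!-- With a default value placed between the tags -->
<textarea name="address" id="address" cols="50" rows="3">Enter your address here</textarea>
```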
Some user agents supported a nonstandard wrap attribute for the <textarea> tag. This attribute was
used to control how text wrapped in the text box as well as how it was sent to the handler. However,
user agent support for this attribute was inconsistent; you could not rely on an agent to follow the
intent of the attribute. The attribute is not part of HTML 4.01 or XHTML 1.0 and should not be used.
Hidden Fields
You can place additional, nonvisible data in your forms using hidden fields. The hidden field has the
following format:
Other than not being visibly displayed, hidden fields are much like any other field. Hidden fields are
used mostly for tracking data and the state of a process. For example, in a multipage form, a userid
field can be hidden in the form to ensure that subsequent forms, when submitted, are tied to the same
user data. For instance, the following code could be used to track a user by a unique number:
Keep in mind that while hidden fields do not display in the user agent interface, they are still visible in
the code of the document. Hidden fields should never be used for sensitive data.
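A hidden field that carries a user number through a multipage form might be sketched as follows. The field name follows the userid example in the text; the value is made up.

```html
<input type="hidden" name="userid" id="userid" value="45677" />
```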
Buttons
You can add custom text buttons on your forms using the button field. The button field has the
following format:
This tag results in a simple button being displayed on the form using the style of the current GUI. The
following code results in the button shown in Figure 9-4:
Figure 9-4
Buttons by themselves are relatively useless on a form. To have the button actually perform an action,
you need to link it to a script via the onclick or other event attribute. For example, the following code
results in a button that, when clicked, executes the JavaScript function buynow():
More information on JavaScript and events can be found in Part III of this book.
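As an illustration, a button tied to a script might be sketched like this. The text names the function buynow() but does not show its body, so the body below, along with the button's name and label, is purely hypothetical.

```html
<script type="text/javascript">
  // Hypothetical handler body, for illustration only
  function buynow() {
    alert("Thank you for your order!");
  }
</script>

<input type="button" name="buy" id="buy" value="Buy Now!" onclick="buynow()" />
```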
Images
You can include additional graphic images in your form to help convey a message. The image field
displays a graphic image much like the image tag (<img>) and has the following format:
Under HTML 4.01, the image field acts as a graphical submit button: clicking the image submits the form,
along with the coordinates of the click. To have it run a script, tie the field to an event handler, much like
the button field. The following example causes the image buynow.jpg to be displayed on a form. When the
image is clicked, the JavaScript function buynow() is executed:
Images by themselves are not intuitive user interface mechanisms. The image field exists to help
encapsulate graphics into the form element within the document object model. If you use images for user interface
purposes, be sure to include enough hints as to their purpose using nongraphical (text, and so on) means.
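A sketch of an image field along these lines, using the buynow.jpg file and buynow() function named in the text (the name, id, and alt text are assumptions):

```html
<input type="image" name="buynow" id="buynow" src="buynow.jpg"
       alt="Buy now" onclick="buynow()" />
```

The alt attribute supplies the nongraphical hint recommended above for users who cannot see the image.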
File Fields
File fields allow files to be attached to form data and sent along with the data to the handler. File fields
have the following syntax:
The file field renders as a text box with a button that enables the user to browse for a file using their
platform’s file browser. Alternatively, the user can manually type the full path and name of the file in the text
box. Figure 9-5 shows an example of a file field.
Figure 9-5
However, to use this control in your forms, you must do the following:
❑ Specify your form encoding as multipart/form-data, which allows the file to be attached to the rest of
the data.
❑ Use the POST, not the GET, method of form delivery. File information cannot be encapsulated
using the GET method.
In other words, when using a file field, your <form> tag should resemble the following:
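A minimal sketch of a form that meets both requirements is shown here. The handler path and field names are hypothetical; the enctype and method values are the ones the requirements above call for.

```html
<form action="/cgi-bin/upload.cgi" method="post" enctype="multipart/form-data">
  <p>
    <input type="file" name="attachment" id="attachment" size="40" />
    <input type="submit" name="submit" id="submit" value="Upload" />
  </p>
</form>
```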
<input type="submit" name="submit" id="submit" [value="text_for_button"] />
and
The value attribute for both tags is optional; if this attribute is omitted, the buttons will display default
text (usually “Submit” and “Reset,” but the text is ultimately determined by the user agent).
The submit button, when clicked, causes the form to be submitted to the handler specified in the <form>
tag’s action attribute. You can also use the onclick event attribute to call a script to preprocess the
form data prior to submission.
The reset button, when clicked, causes the form to be reloaded and its fields reset to their default values.
You can also use the onclick event attribute to change the button’s behavior, calling a script instead of
reloading the form. However, the user will expect the reset button to ultimately reset the form; if you tie
a script to the button using the onclick event, you should ensure that the script also resets the form.
Field Labels
The label tag (<label>) is used to define text labels for fields. This tag has the following format:
<label for="id_of_related_tag">text_label</label>
For example, the following code defines a label for a text box:
The label field’s for attribute should match the id of the field for which it is intended. The main
purpose of the label tag is accessibility. Most users will be able to ascertain the purpose of fields in your
forms by sight. However, if the user agent does not have a visual component, or if the user is visually
impaired, the visual layout of the form cannot be relied on to match labels and fields. Note that if the
user agent supports it, the user can also click on the field label to select the appropriate field.
The <label> tag’s for attribute ensures that the user agent can adequately match labels with fields for
the user, if necessary.
Notice the use of both the id and name attributes in the text input field tag. HTML requires a field to
have a name attribute for its data to be submitted. However, the label tag requires an id value in its matching
input field.
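A labeled text box might be sketched as follows, with the label's for attribute matching the field's id. The field name and label text are assumptions.

```html
<p>
<label for="firstname">First name: </label>
<input type="text" name="firstname" id="firstname" size="30" />
</p>
```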
Figure 9-6
The fieldset tag (<fieldset>) is used as a container for form elements and results in a thin border being
displayed around the contained elements. For example, the following code results in the output shown
in Figure 9-7.
<fieldset>
<p>Gender: <br />
<input type="radio" name="gender" id="male" value="male" /> Male <br />
<input type="radio" name="gender" id="female" value="female" /> Female</p>
</fieldset>
Figure 9-7
The legend tag (<legend>) allows the surrounding fieldset box to be captioned. For example, the
following code adds a “Gender” caption to the previous example. The output of this change is shown in
Figure 9-8.
<fieldset>
<legend>Gender</legend>
<p>
<input type="radio" name="gender" id="male" value="male" /> Male <br />
<input type="radio" name="gender" id="female" value="female" /> Female</p>
</fieldset>
Figure 9-8
The tabindex attribute defines what order the fields are selected in when the user presses the Tab key.
This attribute takes a numeric argument that specifies the field’s order on the form. The fields are then
accessed in their numeric, tabindex order — tabindex 1, then 2, and so forth.
The accesskey attribute defines a key that the user can press to directly access the field. This attribute
takes a single letter as an argument; that letter becomes the key the user can press to directly access the
field. Keys specified in accesskey attributes typically require an additional key to be pressed with the
specified key. For example, user agents running on Microsoft Windows typically require the Alt key to
be pressed along with the letter specified by accesskey. Other platforms require similar keys; such keys
typically follow the GUI interface conventions of the platform.
The following example defines a text box that can be accessed by pressing Alt+F (on Windows platforms)
and is third in the tab order:
Note the use of the <span> tag to delimit the corresponding letter (“F”) in the field’s label. Deprecation
of the underline element caused a slight problem when using accesskey attributes. It is customary
to underline shortcut keys in GUI interfaces so that the user knows what key is mapped to what
field/function. However, with the deprecation of the underline element, you must use CSS (hence the
span tag) to appropriately code the letter corresponding to the access key.
The <span> tag is covered in Chapter 7, while CSS is covered in Part II of this book.
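Putting these pieces together, a field that is third in the tab order and reachable with Alt+F (on Windows platforms) might be sketched as follows. The field name and label text are hypothetical; the <span> tag underlines the access key letter using CSS.

```html
<p>
<label for="firstname"><span style="text-decoration: underline;">F</span>irst name: </label>
<input type="text" name="firstname" id="firstname" size="30"
       tabindex="3" accesskey="F" />
</p>
```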
You can add the readonly attribute to text fields to keep the user from being able to edit the data con-
tained therein. This method has the advantage of displaying the data in field form while prohibiting the
user from being able to modify it.
The disabled attribute causes the corresponding field to appear as disabled (usually graying out the
control, consistent with the user agent’s platform method of showing disabled controls) so the user can-
not use the control.
The following code shows examples of both a read-only and a disabled control. The output of this code
is shown in Figure 9-9.
</table>
</form>
</p>
</body>
</html>
Figure 9-9
Although the two attributes make the fields look similar on-screen, the readonly field can be selected
but not edited. The disabled field cannot be selected at all. You should also note the field’s read-only or
disabled status in text — whether in the field label or additional text near the field. This courtesy is for
non-GUI users or users of agents that do not plainly indicate the field’s status.
Summary
This chapter detailed XHTML forms, showing you how to define a form and populate it with appropri-
ate controls for gathering data from users. You learned the basics of form handling and how to create
documents to effectively gather and submit data. Parts IV and V of this book cover scripting and give
examples of how to create script handlers for various purposes.
Objects and Plugins
The Web isn’t just for text anymore. Today’s user agents support many different types of data —
from sound files to rich multimedia presentations. Including such content in your Web documents
is not only welcome but also expected.
Many helpful applications — known as plugins — help extend a user agent’s capability. The most
popular plugin, Macromedia’s Flash Player, allows for complex animation and even full naviga-
tion through non-HTML content delivered via the user agent.
This chapter introduces you to the world of non-HTML content, plugins, and how to use them in
your documents.
Understanding Plugins
Plugins are small applications that extend the capabilities of user agents by running on the client
machine and handling data delivered via HTTP supplied by the user agent. A typical plugin
works with a user agent as shown in the diagram in Figure 10-1.
[Diagram: User, Plugin, Browser, and Web Server, connected by Data Request, Data, and Content/Presentation flows]
Figure 10-1
The user agent requests the content as normal but receives a file it doesn’t know how to deal with.
However, it has a plugin registered for the file it receives. The browser launches the plugin and passes
the file to the plugin for processing. The plugin presents the data to the user in an environment native
to the data while remaining in the browser environment.
Note that plugins are specialized applications requiring the end user to install and maintain them on
their system(s). Although plugins (especially Flash and the like) are common on the Web, you should
use caution in deciding to use plugin-enabled content in your pages, as it does have an impact on the
end user.
The first plugins enabled the Netscape browser to deliver content other than text and basic graphics. The
earliest plugins included programs from Macromedia for its Shockwave product and from Adobe for its
Acrobat (PDF) product. These programs enabled users to view Shockwave graphic presentations and
Adobe PDF files.
Today, plugins are available for almost all types of data. The old standbys are still available (Adobe
Acrobat Reader, Macromedia Flash/Shockwave, and so on), but a host of new plugins exists to allow
files to be transferred via HTTP and viewed by the end user.
There are other means for user agents to handle nonnative files. For example, Windows users can rely
on file associations. If a received file is a registered file type, Windows will automatically spawn the cor-
rect application to handle it. However, note that the file is then being handled outside of the user agent,
requiring another full application with all the associated overhead.
For example, the following could be used to embed a MIDI file (jinglebells.mid) in a Web document:
When the document is displayed in Microsoft Internet Explorer on Windows, a small media player control
appears, as shown in Figure 10-2, and the MIDI file begins to play. Note that the hidden attribute could
be used to hide the player from the user and the space the player occupies can be modified with the
width and height attributes (but using values that are too small will hide some of the control).
Note, however, that many other platforms will not handle the embedded file as shown in Figure 10-2 —
Windows handles it deftly because of the built-in media player control available to applications. Other
user agents on other platforms will require a separate plugin to utilize the content. For example, Mozilla
Firefox will display a prompt, as shown in Figure 10-3, and will attempt to install Apple’s QuickTime
player if the user chooses Install Missing Plugins.
Figure 10-2
Figure 10-3
Another seldom-used tag for embedding non-HTML content in HTML documents was the <applet>
tag. This tag was used mainly to call on a small application (applet) to do something useful with or in
addition to the document’s contents.
HTML 4 deprecated the <applet> tag and never adopted the nonstandard <embed> tag. Both were superseded
by the <object> tag, which was designed to be more flexible.
The object tag encapsulates alternate text that is used if the object cannot be handled by the destination
user agent and parameters defined by parameter tags (<param>).
The object tag can be used in the <head> or <body> section of a document. If used in the <body>, the
object appears where placed in the document and uses the format of its containing block. If it is placed in
the head, its positioning is determined by other criteria — options in the <object> block, styles dictat-
ing object placement, and so on.
The classid and codebase attributes are essential — they tell the user agent what plugin should be
used to display the content. The classid attribute corresponds to the internal identifier of the plugin.
For example, on Windows platforms this value is stored in the Windows Registry along with the location
of the plugin. The codebase attribute points to the Flash player (plugin) that can be downloaded if the
plugin isn’t already available on the platform.
The rest of the attributes are important to help tailor the appearance of the object (width, height,
declare) or to provide the user agent more information about the object’s data.
Note that the <object> tag can be formatted like any other block tag using CSS.
Parameters
Most objects require parameters to customize their appearance and operation. The <param> tag is
used within the <object> tag to provide the appropriate parameters. The <param> tag has the follow-
ing syntax:
The <param> tag has no closing mate. To be XHTML compliant, the tag should end with the slash.
The name and value attributes are necessary; they are the two attributes that provide the actual param-
eter for the object. The other attributes are necessary in certain circumstances to help define the type and
scope of the parameter.
Object Examples
More data/media formats are delivered via the Web than can be readily counted. Each of the non-HTML-
based formats has its own plugin, format for the <object> tag, and parameters. The best way to deter-
mine the correct format of the <object> tag is to consult the owner of the data format or applicable
plugin (Macromedia for Flash content, Apple Computer Inc. for QuickTime, and so on).
Many GUI-based HTML editors include features to help embed non-HTML content in Web documents.
For example, Macromedia’s Dreamweaver provides several features to embed and control various objects
within your documents.
The following two examples show how to embed commonly used data types: a MIDI file and a Flash file.
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>A MIDI Object</title>
</head>
<body>
<p>
<object classid="clsid:22D6F312-B0F6-11D0-94AB-0080C74C7E95" id="jinglebells"
height="45" width="300">
Jingle Bells!
<param name="autostart" value="true" />
<param name="filename" value="jinglebells.mid" />
</object>
</p>
</body>
</html>
This example will work only on Windows, with Microsoft Internet Explorer or a browser with an
appropriate plugin allowing access to the Windows Media Player controls.
Output
The code results in a media player panel being displayed in the document at the <object> tag’s
location, as shown in Figure 10-4. The MIDI file begins to play as soon as the document is loaded.
Figure 10-4
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>A Flash Object</title>
</head>
<body>
<p>
<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"
width="150" height="150">
Radar Screen
<param name="movie" value="radar.swf" />
<param name="quality" value="high" />
<param name="loop" value="1" />
<param name="play" value="1" />
</object>
</p>
</body>
</html>
Output
The preceding document displays as shown in Figure 10-5, with the Flash movie in the place of
the <object> tag.
Figure 10-5
The answer is to include an appropriate <embed> tag within the <object> tag. Newer browsers will
ignore the <embed> tag because it doesn’t belong within the <object> tag, while older browsers will
ignore the <object> and <param> tags but will process the <embed> tag.
For example, to use <embed> with the earlier Flash example, you would use code similar to the following:
<object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"
width="150" height="150">
Radar Screen
<param name="movie" value="radar.swf" />
<param name="quality" value="high" />
<param name="loop" value="1" />
<param name="play" value="1" />
<embed src="radar.swf" quality="high" width="150" height="150"
type="application/x-shockwave-flash"
pluginspage="http://www.macromedia.com/go/getflashplayer" />
</object>
Placing the <embed> tag within the <object> tag will cause newer user agents to ignore it; they will
perceive it as an invalid part of the <object> block due to its context. Older user agents (that don’t sup-
port <object>) will ignore the <object> tags but will recognize and use the embed block. Note that if
you place the embed section outside of the object block, the older user agents will still handle it properly,
but the newer user agents will understand both tags individually, displaying the object twice.
Use of the <embed> tag will cause your code to be noncompliant with XHTML.
Summary
This chapter introduced you to non-HTML content and how you can embed it in your documents. You
learned how plugins operate, how to tell the user agent that it needs a plugin, as well as how to pass
parameters to the plugin to help control the content.
Unfortunately, multimedia is one of the areas where the user agent market has fragmented. Microsoft
builds a lot of functionality into Internet Explorer through the native Windows platform, while other
user agents rely on plugins to handle non-HTML content. Properly coding for all cases becomes a chore
not easily accomplished. The best advice is to stick to popular formats (such as Flash) or ensure that a
plugin exists (and is accessible via your code) for most platforms.
XML
The Extensible Markup Language (XML) is a popular scheme for representing data. Although
created as a more portable version of SGML, XML lives mostly on the application side of the com-
puter world. XML is used to store preferences and data from applications, provide unified data
structure for transferring data, encapsulate syndicated feeds from Web sites, and more. The XML
standards are being adopted by other data formats such as HTML (creating XHTML).
This chapter presents a primer on XML, including its format, methods, and tools.
Full coverage of XML is outside the scope of this book — full coverage of XML can occupy an entire
book of its own. In the case of the Web, XML is a bystander technology, useful to know but not
entirely critical for publishing on the Web. However, because XHTML is XML compliant, coverage
is mandatory. If you desire more information about XML, you would do well to pick up a book dedi-
cated to the subject, such as WROX Beginning XML, 3rd Edition, WROX XSLT 2.0 Programmer’s
Reference, 3rd Edition, or Wiley’s XML Weekend Crash Course or XML Programming Bible.
XML Basics
The Extensible Markup Language (XML) was created to bring the advantages of the bloated
Standard Generalized Markup Language (SGML) standard to smaller platforms such as Web
browsers. XML retains the flexibility of its older sibling but has been redesigned for the Web with
the ability to be easily transmitted via the Internet’s architecture and displayed with less overhead.
❑ Form should follow function. In other words, the language should be flexible enough to
encapsulate many types of data. Instead of shoehorning multiple forms of data into one
structure, the structure should be able to change to adequately fit the data.
❑ Documents should be easily understood by their content alone. The markup should be
constructed in such a way that there is no doubt about the content it frames. XML docu-
ments are often referred to as self-describing because of this attribute.
❑ Format should be separated from presentation. The markup language should represent the dif-
ference in pieces of data only and should make no attempt to describe how the data will be pre-
sented. For example, elements should be marked with tags such as <emphasis> instead of <b>
(bold), leaving the presentation of the data (which should be emphasized, but not necessarily
bold) to the platform using the data.
❑ The language should be simple and easily parsed, with intrinsic error checking.
These attributes are evident in the goals stated in the W3C’s Recommendation for XML 1.0 (found at
http://www.w3.org/TR/1998/REC-xml-19980210):
As-is, XML is ill-suited for the World Wide Web. Because XML document elements can be author-
defined, user agents cannot possibly interpret and display all XML documents in the way the author
would have intended. However, standardized XML structures are excellent for storing application data.
For example, consider the following applications of XML:
❑ The popular RSS syndication format defines particular element tags in XML format to encapsu-
late syndicated news and blog feeds. This enables many applications to easily disseminate the
information contained within the feed.
❑ Several online statistic sites (computer game stats, and so on) store their information in XML
because it can be easily parsed and understood by a variety of applications.
❑ Many applications store their preferences in XML-formatted files. This format proves to be easily
parsed, changed, and rewritten, as necessary.
❑ Many word processing and other document-based applications (spreadsheets, and so on) store
their documents in XML format.
❑ Many B2B applications use XML to share and transfer data between each other.
Note that while XML provides an ideal data structure, it should be used only for smaller, sequential col-
lections of data. Data collections that require random access or have thousands of records would benefit
from an actual database format instead of XML.
XHTML was designed to bring HTML into XML compliance (each element being properly closed, and
so on), not the other way around (add extensibility to HTML). In short, XHTML adheres to XML stan-
dards, but it is not itself an extensible markup language.
XML Syntax
XML follows guidelines we have already set forth for XHTML:
Within documents, the structure is similar to that of HTML, where element tags are used to encapsulate
content that may itself contain tag-delimited content.
The following sections outline the particular syntax of the various XML document elements.
The declaration is <?xml?>, with version and encoding attributes. The version attribute specifies the
version of XML the document uses, and the encoding attribute specifies the character encoding used
within the document’s content.
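As a rough illustration of how an application encounters the declaration and XML's built-in error checking, the following Python sketch feeds two small documents to the standard library's xml.etree.ElementTree parser. The <note> documents and the is_well_formed() helper are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Both documents start with an XML declaration carrying version and encoding.
# The second one never closes its <to> element, so it is not well formed.
WELL_FORMED = b'<?xml version="1.0" encoding="UTF-8"?>\n<note><to>Terri</to></note>'
MALFORMED = b'<?xml version="1.0" encoding="UTF-8"?>\n<note><to>Terri</note>'

def is_well_formed(data):
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(WELL_FORMED))
print(is_well_formed(MALFORMED))
```

The parser rejects the second document outright rather than guessing at the author's intent, which is the strictness that separates XML from traditional HTML parsing.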
As with other markup languages, XML supports document type definitions (DTDs), which specify the
rules used for the elements within documents using the DTD. Applications can then use the DTD to check
the document’s syntax. An XML document’s DTD declaration resembles that of an XHTML document,
specifying a SYSTEM or PUBLIC definition. For example, the following DTD is used for OpenOffice
documents:
Elements
XML elements resemble XHTML elements. However, due to the nature of XML (extensible), elements
are generally not of the HTML variety. For example, consider the following snippet from an RSS feed,
presented in XML format:
In this case, the following elements are used. <channel>, the container for the channel (that is, the feed
itself), has the following subcontainers:
The feed then encapsulates each news item within an <item> element, which has the following
subelements:
Note that several elements have multiple contexts. For example, the <channel> and <item> elements
both provide context for <title> elements; the placement of each <title> element (usually its parent)
determines what element the <title> refers to.
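The way a parent element supplies context for <title> can be sketched with Python's standard xml.etree.ElementTree module. The feed below is a made-up stand-in, not the feed discussed in the text; its titles and URLs are hypothetical, and only the element names (channel, item, title, link, description) follow the RSS 2.0 format.

```python
import xml.etree.ElementTree as ET

# A hypothetical, heavily trimmed RSS feed.
FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://www.example.com/</link>
    <description>A made-up feed</description>
    <item>
      <title>First story</title>
      <link>http://www.example.com/1</link>
    </item>
    <item>
      <title>Second story</title>
      <link>http://www.example.com/2</link>
    </item>
  </channel>
</rss>"""

root = ET.fromstring(FEED)
channel = root.find("channel")

# findtext("title") looks only at direct children, so the parent element
# decides which <title> is returned.
channel_title = channel.findtext("title")
item_titles = [item.findtext("title") for item in channel.findall("item")]

print(channel_title)
print(item_titles)
```

The same element name yields the feed's title when read from <channel> and a story's title when read from an <item>.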
Attributes
XML elements support attributes much like XHTML. Again, the difference is that the attributes can be
defined in accordance with the document’s purpose. For example, consider the following code snippet:
<employee sex="female">
<lastName>Moore</lastName>
<firstName>Terri</firstName>
<hireDate>2003-02-20</hireDate>
</employee>
<employee sex="male">
<lastName>Robinson</lastName>
<firstName>Branden</firstName>
<hireDate>2000-04-30</hireDate>
</employee>
In this example, the sex of the employee is coded as an attribute of the <employee> tag.
In most cases, the use of attributes instead of elements is arbitrary. For example, the preceding example
could have been coded with sex as a child element instead of as an attribute:
<employee>
<sex>female</sex>
<lastName>Moore</lastName>
<firstName>Terri</firstName>
<hireDate>2003-02-20</hireDate>
</employee>
<employee>
<sex>male</sex>
<lastName>Robinson</lastName>
<firstName>Branden</firstName>
<hireDate>2000-04-30</hireDate>
</employee>
The deciding factor in how to code data is whether the content is ever to be used as data,
instead of just a modifier. If an application will use the content as data, it’s best to code it within an element,
where it is more easily parsed as such.
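From the consuming application's side, the two codings differ mainly in how the value is read. A minimal Python sketch, assuming a <staff> root element (an addition for the example, since the fragments above have no root) and using the standard xml.etree.ElementTree module:

```python
import xml.etree.ElementTree as ET

# The <staff> wrapper is an assumption; a parsable document needs one root.
AS_ATTRIBUTE = """<staff>
  <employee sex="female">
    <lastName>Moore</lastName>
  </employee>
</staff>"""

AS_ELEMENT = """<staff>
  <employee>
    <sex>female</sex>
    <lastName>Moore</lastName>
  </employee>
</staff>"""

# An attribute is read from the element that carries it...
attr_value = ET.fromstring(AS_ATTRIBUTE).find("employee").get("sex")

# ...while a child element is located and its text content is read.
elem_value = ET.fromstring(AS_ELEMENT).find("employee").findtext("sex")

print(attr_value, elem_value)
```

Both forms deliver the same value; the element form has the advantage that it can later grow children or attributes of its own.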
Comments
XML supports the same comment tag as HTML:
You can embed comments anywhere inside an XML document as long as the standard XML conventions
and corresponding DTD rules are not violated by doing so.
Nonparsed Data
On occasion, you will need to define content that should not be parsed (interpreted by the application
reading the data). Such data is defined as character data or CDATA. Nonparsed data is formatted within
a CDATA element, which has the following syntax:
<![CDATA[non_parsed_data]]>
CDATA elements are generally used to improve the legibility of documents by placing reserved characters
within a CDATA element instead of using cryptic entities. For example, both of the following paragraph
elements result in identical data, but the first is more legible due to the CDATA elements:
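<p>The <![CDATA[<table>]]> element should be used instead of the <![CDATA[<pre>]]>
element whenever possible.</p>
<p>The &lt;table&gt; element should be used instead of the &lt;pre&gt;
element whenever possible.</p>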
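The claim that a CDATA section and the equivalent entities produce identical data can be checked with a short Python sketch. It uses the standard xml.etree.ElementTree module; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

# The same sentence, once with CDATA sections and once with entities.
WITH_CDATA = (
    "<p>The <![CDATA[<table>]]> element should be used instead of the "
    "<![CDATA[<pre>]]> element whenever possible.</p>"
)
WITH_ENTITIES = (
    "<p>The &lt;table&gt; element should be used instead of the "
    "&lt;pre&gt; element whenever possible.</p>"
)

cdata_text = ET.fromstring(WITH_CDATA).text
entity_text = ET.fromstring(WITH_ENTITIES).text

# The parser hands the application plain character data in both cases.
print(cdata_text == entity_text)
print(cdata_text)
```

After parsing, neither the CDATA delimiters nor the entities remain; the application sees only the literal <table> and <pre> text.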
Entities
XML also allows for user-defined entities. Entities are content mapped to mnemonics; the mnemonics
can then be used as shorthand for the content within the rest of the document. Entities are defined using
the following syntax:
Entities are defined within a document’s DTD. For example, the following document prologue defines
“Acme, Inc.” as the entity company:
<?xml version="1.0"?>
<!DOCTYPE report SYSTEM "/xml/dtds/reports.dtd" [
<!ENTITY company "Acme, Inc.">
]>
Elsewhere in the document, the entity (referenced by &entityname;) can be used to insert the company
name:
<report>
<title>TPS Report</title>
<date>2005-01-25</date>
<summary>The latest run of the regression tests has yielded perfect results. The
job for &company; can now be completed and the final code
delivered.</summary>
...
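A parser expands such an entity before the application sees the text. The Python sketch below shows this with the standard xml.etree.ElementTree module. It is a simplified variant of the report above: the external DTD reference is left out (an assumption, so that the example does not depend on a file that may not exist), and the summary text is shortened.

```python
import xml.etree.ElementTree as ET

# The entity is declared in the internal DTD subset and used in <summary>.
DOC = """<?xml version="1.0"?>
<!DOCTYPE report [
  <!ENTITY company "Acme, Inc.">
]>
<report>
  <title>TPS Report</title>
  <summary>The job for &company; can now be completed.</summary>
</report>"""

summary = ET.fromstring(DOC).findtext("summary")
print(summary)
```

The parsed text contains the company name in place of the entity reference, so changing the one ENTITY declaration changes every place the entity is used.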
Entities can also be declared as external resources. Such external resources are generally larger than a
few words or a phrase, like complete documents. A system entity, used for declaring external resources,
is defined using the following syntax:
For example, the following code defines a chapter01 entity that references a local document named
chapter01.xml:
The chapter01 entity can then be used to insert the contents of chapter01.xml in the current document.
Namespaces
The concept of namespaces is relatively new to XML; they allow you to group elements together by their
purpose using a unique name. Such groupings can serve a variety of purposes, but they are commonly
used to distinguish elements from one another.
For example, an element named <table> can refer to a data construct or a physical object (such as a
dining room table):
If both elements are used in the same document, there will be a conflict because the two refer to two
totally different things. This is a perfect place to specify namespaces.
Namespace designations are added as prefixes to element names. For example, you could use a furniture
namespace to identify the table elements that refer to furnishings:
<furniture:table>
<type>Dining</type>
<width>4</width>
<length>8</length>
<color>Cherry</color>
</furniture:table>
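To a namespace-aware parser, a prefix is only shorthand for a namespace name that must be declared with an xmlns attribute. The Python sketch below, using the standard xml.etree.ElementTree module, shows how the two kinds of table are told apart. The <inventory> root element and the namespace URI are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace name; any unique URI would serve.
FURNITURE_NS = "http://www.example.com/ns/furniture"

DOC = """<inventory xmlns:furniture="http://www.example.com/ns/furniture">
  <table><tr><td>cell</td></tr></table>
  <furniture:table>
    <type>Dining</type>
  </furniture:table>
</inventory>"""

root = ET.fromstring(DOC)

# ElementTree reports a qualified name as {namespace}localname.
tags = [child.tag for child in root]

# A prefix-to-URI mapping lets find() address the namespaced element.
dining = root.find("furniture:table", {"furniture": FURNITURE_NS})
dining_type = dining.findtext("type")

print(tags)
print(dining_type)
```

The unprefixed <table> keeps its plain name, while the furniture table is reported with the namespace name attached, so the two can no longer be confused.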
Style Sheets
XML also offers support for style sheets. Style sheets are linked to XML documents using the xml-
stylesheet tag, which has the following syntax:
For example, to link a CSS style sheet to a document, you could use a tag similar to the following:
Using XML
Actual use of an XML document requires that the document be transformed into a usable format. There
are many means and formats to translate XML — the limits are governed only by your imagination and
tools at hand.
Viewing XML documents doesn’t require special tools. Many of the modern user agents can view XML
documents and even add capabilities such as tag highlighting and the ability to collapse portions of the
document, as shown in Figure 11-1, where Internet Explorer is displaying an RSS document.
Figure 11-1
There are many tools that can help you manage XML documents and perform XSLT, including many
open source solutions (search for “XSLT” on sourceforge.net).
XML Editing
You have many choices for editing XML files. Because XML is a text-only format, you can use any text
editor (Emacs, vi, Notepad, and so on) to create and edit XML documents. However, dedicated XML editors
make the editing job easier by adding syntax highlighting, syntax checking, validation, auto-code
completion, and more.
❑ Many open source XML editors are available (search “XML editor” on sourceforge.net).
❑ Lennart Staflin has developed a major mode for Emacs called PSGML
(http://www.lysator.liu.se/projects/about_psgml.html).
❑ XMetal, formerly owned by Corel and now owned by Blast Radius, is a well-known, capable
(albeit commercial and expensive) XML editor (http://www.xmetal.com).
❑ XMLSpy, by Altova, is another capable XML editor in the same price range as XMetal, though
the personal edition is free (http://www.altova.com).
❑ <oXygen/>, by SyncRO Soft Ltd., is a lower-cost, multiplatform XML editor and XSLT debugger
(http://www.oxygenxml.com).
XML Parsing
You may choose to use XML to store various types of data, or you may have the need to access other
people’s data that is stored in XML.
Many XML parsing applications are available, including many open source applications (search for
“XML parsing” on sourceforge.net). In addition, there are XML parsing modules for most programming
languages:
❑ James Clark’s XML parser, expat, is well known as the standard for XML parsing
(http://expat.sourceforge.net and http://www.jclark.com/xml/expat.html).
❑ Many XML modules are available for Perl via CPAN (http://www.cpan.org).
❑ Several XML tools are available for Python, including the many found at the Python Web site
(http://pyxml.sourceforge.net/topics).
❑ PHP has a handful of XML functions built in as extensions to support expat
(http://www.php.net/manual/en/ref.xml.php).
❑ The PHP Extension and Application Repository has several additional extensions for XML
maintenance and manipulation (http://pear.php.net).
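As a small taste of what these modules do, the following Python sketch uses xml.parsers.expat, the standard library's binding to the expat parser mentioned above. The sample document and the element_names() helper are assumptions made up for the example.

```python
import xml.parsers.expat

# A made-up document, loosely based on the report example earlier in the chapter.
SAMPLE = "<report><title>TPS Report</title><date>2005-01-25</date></report>"

def element_names(data):
    """Return the element names in the order their start tags are seen."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # expat is event driven: it calls this handler for every start tag.
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)  # True marks this as the final chunk of input
    return names

print(element_names(SAMPLE))
```

An event-driven parser of this kind never builds the whole document in memory, which suits large feeds; tree-based modules trade that economy for easier navigation.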
Summary
This chapter covered the basics of XML, its format, structure, and use. The basics of creating and main-
taining an XML document and tools for working with XML data were all covered. You should now have
a basic understanding of the standard and how it is affecting other data schemes and Web technologies.
CSS Basics
In the grand scheme that is the World Wide Web, Cascading Style Sheets (CSS) are a relatively new
invention. The Web was founded on HTML and plain text documents. Over the last few years, the
Web has become a household mainstay and has matured into a viable publishing platform thanks
in no small part to CSS.
CSS enables Web authors and programmers to finely tune elements for publishing both online
and across several different types of media, including print format. This chapter serves as the
introduction to CSS. Subsequent chapters in this section will show you how to use styles with
specific elements.
The advantage of styles is that you can change the definition once and the change affects every
element using that style. Coding each element individually, by contrast, would require that each
element be recoded when you wanted them all to change. Thus, styles provide an easy means to
update document formatting and maintain consistency across a site.
Also, coding individual elements is best done while the document is being created. This means
that the document must be formatted by the author — not always the best choice. Instead, the ele-
ments can be tagged with appropriate styles (such as heading) while the document is created, and
the final formatting can be left up to another individual who defines the styles.
Styles can be grouped into purpose-driven style sheets. Style sheets are just that, groups of styles
relating to a common purpose. Style sheets allow for multiple styles to be attached to a document
all at once. They also allow all the style formatting in a document to be changed at once. This
allows documents to be quickly formatted for different purposes — one style sheet can be used for
online documents, another for brochures, and so on.
<p><b><u>Heading One</u></b></p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>
<p><b><u>Heading Two</u></b></p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>
<p><b><u>Heading Three</u></b></p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor
incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud
exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.</p>
For the purpose of this example, ignore the fact that most of the text formatting tags (underline, center,
and so on) have been deprecated.
Note that all three headings are coded bold and underlined. Now suppose that you wanted the headings
to be larger and italic. Each heading would have to be recoded similar to the following:
Although using a decent editor with global search and replace makes this change pretty easy, consider
managing an entire site with several documents, if not tens or hundreds, each with several headings.
Every additional document adds to the work and to the chance of missing a heading.
Now, let’s look at how styles would change the example. Using styles, the example could be coded simi-
larly to the following:
There are several ways to apply styles to document elements. Various ways to define and use styles are
covered in Chapter 13.
The style is defined in the head section of the document, similar to the following:
<head>
<style type="text/css">
p.heading { font-weight: bold; text-decoration: underline; }
</style>
</head>
This definition defines a heading class that formats text as bold and underlined.
Style definitions are covered in Chapter 13. Style properties are covered in appropriate chapters (“Text,”
and so on) later in this part of the book.
To change all the headings in the document to a larger, italic font, the one definition can be recoded:
<head>
<style type="text/css">
p.heading { font-size: larger; font-style: italic; }
</style>
</head>
❑ CSS1 defines basic style functionality, with limited font and limited positioning support.
❑ CSS2 adds aural properties, paged media, and better font and positioning support. Many other
properties have been refined as well.
❑ CSS3 adds presentation-style properties, allowing you to effectively build presentations from
Web documents (similar to Microsoft PowerPoint presentations).
You don’t have to specify the level of CSS you are using. However, you should be conscientious about
what user agents will be accessing your site. Most modern browsers support CSS, but the level of sup-
port varies dramatically between user agents. It’s always best to test your implementation on target user
agents before widely deploying your documents.
When using styles, it is important to keep in mind that not all style properties are well supported by all
user agents. This book attempts to point out major inconsistencies and differences in the most popular
user agents, but the playing field is always changing. One invaluable reference for style compatibility is
Brian Wilson’s excellent resources at http://www.blooberry.com/indexdot/index.html.
Defining Styles
Styles can be defined in several different ways and attached to a document. The most popular method
for defining styles is to add a style block to the head of a document:
<head>
<style type="text/css">
Style definitions
</style>
</head>
Using this method, all style definitions are placed within a style element, delimited by <style> tags.
The opening style tag has the following syntax:
In most cases, the MIME type is “text/css.” The media attribute is typically not used unless the destina-
tion media is nontextual. The media attribute supports the following values:
❑ all
❑ aural
❑ braille
❑ embossed
❑ handheld
❑ print
❑ projection
❑ screen
❑ tty
❑ tv
Note that multiple definitions, each defining a style for a different medium, can appear in the same doc-
ument. This powerful feature allows you to easily define styles for a variety of document usage and
deployment.
Alternately, the style sheet can be contained in a separate document and linked to documents using the
link (<link>) tag:
<head>
<link rel="stylesheet" type="text/css" href="mystyles.css" />
</head>
CSS Basics
Then, when the style definitions in the external style sheet change, all documents that link to the external
sheet reflect the change. This presents an easy way to modify a document’s format, whether to apply
new formatting for purely visual reasons or for a specific purpose.
Attaching external style sheets via the link tag should be your preferred method of applying styles to a
document, as it provides the most scalable use of styles — you have to change only one external style
sheet to affect many documents.
You can add comments to your style section or style sheet by delimiting the comment with /* and */.
For example, the following is a style comment:
/* Define a heading style with a border */
Cascading Styles
So where does the “cascading” in Cascading Style Sheets come from? It comes from the fact that styles
can stack, or override, each other. For example, suppose that an internal corporate Web site’s appearance
varies depending on the department that owns the various documents. It is important that all the docu-
ments follow the corporate look and feel, but the Human Resources department might use people-
shaped bullets or other small changes unique to that department. The HR department doesn’t need a
separate, complete style sheet for its documents. The department needs only a sheet containing the dif-
ferences from the corporate sheet. For example, consider the following two style sheet fragments:
corporate.css
body {
font-family:verdana, palatino, georgia, arial, sans-serif;
font-size:10pt;
}
p {
font-family:verdana, palatino, georgia, arial, sans-serif;
font-size:10pt;
}
p.quote {
font-family:verdana, palatino, georgia, arial, sans-serif;
font-size:10pt;
border: solid thin black;
background: #5A637B;
padding: .75em;
}
h1, h2, h3 {
margin: 0px;
padding: 0px;
}
ul {
list-style-image: url("images/corp-bullet.png");
}
...
humanresources.css
ul {
list-style-image: url("images/hr-bullet.png");
}
The humanresources.css sheet contains only the style definitions that differ from the corporate.css
sheet, in this case, only a definition for ul elements (using the different bullet). The two sheets are linked
to the HR documents using the following <link> tags:
<head>
<link rel="stylesheet" type="text/css" href="corporate.css" />
<link rel="stylesheet" type="text/css" href="humanresources.css" />
</head>
When a user agent encounters multiple styles that could be applied to the same element, it uses CSS
rules of precedence, covered later in this section.
Likewise, other departments would have their own style sheets, and their documents would link to the
corporate and individual department sheets. As another example, the engineering department might use
their own style sheet and declare it in the head of their documents:
<head>
<link rel="stylesheet" type="text/css" href="corporate.css" />
<link rel="stylesheet" type="text/css" href="engineering.css" />
</head>
Styles embedded directly in an element (using its style attribute) take precedence over all previously
declared styles.
CSS distinguishes style declarations by their origin:
❑ Author origin: The author of a document includes styles in a <style> section or linked sheets
(via <link>).
❑ User origin: The user (the viewer of the document) specifies a style sheet.
❑ User agent origin: The user agent specifies a default style sheet (used when no other exists).
Styles that are critical to the document’s presentation should be coded as important by placing the text
!important at the end of the declaration. For example, the following style is marked as important:
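A minimal sketch of such a declaration follows; the selector and the value are illustrative assumptions:

```css
/* The !important flag follows the value and precedes the semicolon */
p.warning { color: red !important; }
```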
Such styles are treated differently from normal styles when the correct style to use is determined from
the cascade.
The CSS standard uses the following rules to determine which style to use when multiple styles exist for
an element:
1. Find all style declarations from all origins that apply to the element.
2. For normal declarations, author style sheets override user style sheets, which override the
default style sheet. For !important style declarations, user style sheets override author style
sheets, which override the default style sheet.
3. More specific declarations take precedence over less specific declarations.
4. Styles specified last have precedence over otherwise equal styles.
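Rules 3 and 4 can be sketched with a few competing declarations for the same elements. The selectors, class name, and colors here are assumptions chosen only for illustration:

```css
/* Rule 3: p.note is more specific than p, so paragraphs with
   class="note" are blue even though the p rule comes later. */
p.note { color: blue; }
p      { color: black; }

/* Rule 4: these two selectors are equally specific, so the
   declaration specified last (green) is the one applied. */
h1 { color: red; }
h1 { color: green; }
```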
Summary
This chapter taught you the basics of CSS — how styles are attached to a document, how they are best
used, what the different levels of CSS are, and how the cascade in Cascading Style Sheets works. You
learned the various ways to embed and define styles and more about the separation between content and
formatting that CSS can provide. Chapter 13 delves into the ins and outs of style definitions. Subsequent
chapters in this part of the book will show you how styles are best used with various elements.
Style Definitions
By this point in the book, you should recognize the power, consistency, and versatility that styles
can bring to your documents. You have seen how styles can make format changes easier and
how they adhere to the content versus formatting separation. Now it’s time to learn how to create
styles — the syntax and methods used to define styles for your documents.
selector_expression {
element_property: property_value(s);
element_property: property_value(s);
...
}
The selector_expression is an expression that can be used to match specific elements in the
document. Its simplest form is an element’s name, such as h1 to match all <h1> elements. At its
most complex, you can match individual subelements of particular elements or elements that
have particular relationships to other elements.
Selectors are covered in depth within the next section of this chapter.
The element_property specifies which properties of the element the definition will affect. For
example, to change the color of an element, the color property is used. Note that some properties
affect only one aspect of an element, while others combine several properties into one declaration.
For example, the border property can be used to define the width, style, and color of an element’s
border; each of the properties (width, style, color) has its own property declarations, as well
(border-width, border-style, and border-color).
Individual properties are covered within chapters relating to the type of element they affect. For
example, the font properties are covered in the next chapter, “Text.”
Chapter 13
The property_value(s) specify how the property should affect the element to which it applies. For
example, to specify an element’s color as red, you would use the value red as the property value for the
color property.
More information on property values can be found in the “Property Values” section later in this chapter.
Now let’s look at the elements of a style declaration in a real example. The following style definition can
be used to change all the heading-one (<h1>) elements in a document to have red text:
h1 {
color: red;
}
The actual formatting of style declarations can vary. By convention, the selector is separated from the
left brace (which begins the property/value section) by white space, and each property/value pair ends
in a semicolon. The property/value pair section ends in a right brace. Extra white space can appear
between all elements, and the amount of white space (whether spaces, new lines, or tabs) doesn’t matter.
For example, both of the following definitions produce identical results, but they are formatted quite
differently:
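h1 {
font-family: helvetica, sans-serif;
border: thin dotted black;
text-align: right;
color: red;
}
h1 { font-family: helvetica, sans-serif; border: thin dotted black; text-align: right; color: red; }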
Property Values
Throughout this chapter, you will see how to apply values to properties using CSS. First, it is important
to talk a bit about the values themselves. Property values can be expressed in several different metrics
according to the individual property and the desired result.
❑ CSS keywords and other properties, such as thin, thick, transparent, ridge, and so forth
❑ Real-world measures
❑ Inches (in)
❑ Centimeters (cm)
❑ Millimeters (mm)
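❑ Points (pt): the points used by CSS2 are equal to 1/72 of an inch
❑ Picas (pc): 1 pica is equal to 12 points
❑ Screen measures in pixels (px)
❑ Measures relative to the font size (em) or to the font’s x-height (ex)
❑ Percentages
❑ Color codes (#rrggbb or rgb(r, g, b))
❑ Angles
❑ Degrees (deg)
❑ Grads (grad)
❑ Radians (rad)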
❑ Time values (seconds (s) and milliseconds (ms)) — Used with aural style sheets
❑ Frequencies (hertz (Hz) and kilohertz (kHz)) — Used with aural style sheets
❑ Textual strings
Which metric is used depends on the value you are setting and your desired effect. For example, it doesn’t
make sense to use real-world measures (inches, centimeters, and so on) unless the user agent is calibrated
to use such measures or your document is meant to be printed. The em unit can be quite powerful, allow-
ing a value that changes as the element sizes change. However, using the em unit can have unpredictable
results. The em metric is best used when you need a relational, not absolute, value.
In the case of relational property values (percentages, em, and so on), the value is calculated on the
element’s parent values.
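A short sketch of how several of these metrics might appear in practice. The selectors and values are illustrative assumptions:

```css
/* Absolute, real-world measure: most useful for printed output */
h1 { margin-bottom: 0.25in; }

/* Relative measure: the indent scales with the element's font size */
p { text-indent: 2em; }

/* Percentage: computed from the parent element's font size */
p.small { font-size: 80%; }

/* Keyword and color-code values */
div.box { border: thin solid #5A637B; }
```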
Understanding Selectors
Selectors are essentially patterns that enable a user agent to identify what elements get what styles. For
example, the following style in effect says, “If it is a paragraph tag, give it this style”:
p { text-indent: 2em; }
The selector is the first element before the brace, in this case, p (which matches the <p> tag).
This section shows you how to construct selectors of different types to best match styles to your elements
within your documents.
h1 { color: red; }
Using the actual element name (h1) as the selector causes all occurrences of those tags to be formatted
with the property/values section of the definition (color: red). You can also specify multiple selectors
by listing them all in the selector area, separated by commas. For example, this definition will affect all
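levels of heading tags in the document:
h1, h2, h3, h4, h5, h6 { color: red; }
The universal selector (*) matches any element. For example, the following definition applies to every
element in the document:
* { color: red; }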
Every tag will have the color: red property/value applied to it. Of course, you would rarely want a
definition to apply to all elements of a document. You can also use the universal selector as a wildcard
within a larger selector. First, consider the following selector, which matches any <ol> tag that is a
descendant of a <td> tag, which is in turn a descendant of a <tr> tag:
tr td ol { color: red; }
More information on child/descendant selectors can be found in the “Matching Child, Descendant, and
Adjacent Sibling Elements” section later in this chapter.
However, this selector rule is very strict, requiring all three elements. If you also wanted to include
<ol> elements that are descendants of <th> elements, you would need to specify a separate selector or use
the universal selector to match all elements between <tr> and <ol>, as in the following example:
tr * ol { color: red; }
In essence, the universal selector is a wildcard, used to represent any one or more elements. For exam-
ple, the selector immediately preceding would also match <ol> elements embedded within a paragraph
element within a cell, within a row (tr td p ol). You can use the universal selector with any of the selec-
tor forms discussed in this chapter.
To specify a class to match with a selector, you append a period and the class name to the selector. For
example, this style will match any paragraph tag with a class of dark_area:
For example, suppose that this paragraph was in the area of the document with the dark background:
The specification of the dark_area class with the paragraph tag will cause the paragraph’s text to be
rendered in white.
The universal selector can be used to match multiple elements with a given class. For example, the fol-
lowing style definition will apply to all elements that specify the dark_area class:
You can also omit the universal selector, specifying only the class itself (beginning with a period):
You could also take the example one step further, specifying the background and foreground in the same
style:
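The class-based definitions described in this passage can be sketched as follows. The white text color comes from the discussion above; the dark background color is an assumption added for illustration:

```css
/* Matches only paragraphs with class="dark_area" */
p.dark_area { color: white; }

/* Matches any element with class="dark_area" (explicit universal selector) */
*.dark_area { color: white; }

/* Equivalent shorthand: the universal selector is implied */
.dark_area { color: white; }

/* Foreground and background set together in one definition */
.dark_area { color: white; background-color: #333333; }
```

A paragraph opts in to these styles with markup such as <p class="dark_area">.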
element[attribute="value"]
For example, if you want to match any table with a border attribute set to 3, you could use this selector:
table[border="3"]
You can also match elements that contain the attribute no matter what the value of the attribute by omit-
ting the equal sign and attribute value. To match any table with a border attribute, you could use this
selector:
table[border]
Combine two or more selector formats for even more specificity. For example, the following selector will
match table elements with a class value of datalist and a border value of 3:
table.datalist[border="3"]
You can also specify multiple attributes for more specificity. Each attribute is specified in its own brack-
eted expression. For example, if you wanted to match tables with a border value of 3 and a width value
of 100%, you could use this selector:
table[border="3"][width="100%"]
You can also match single values within a space- or hyphen-separated list value. To match a value in a
space-separated list, use tilde equal (~=) instead of the usual equal sign (=). To match a value at the
start of a hyphen-separated list, use bar equal (|=) instead of the usual equal sign (=). For example, the
following selector would match any element whose language attribute contains “us” as one item in a
space-separated list:
[language~="us"]
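As a sketch of both list-matching operators, consider the following definitions. The attribute names and values are assumptions chosen for illustration:

```css
/* Matches <p class="note urgent"> because "urgent" is one of the
   space-separated words in the class attribute */
p[class~="urgent"] { font-weight: bold; }

/* Matches lang="en" and lang="en-US": the value is exactly "en"
   or begins with "en" followed by a hyphen */
*[lang|="en"] { color: green; }
```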
<html>
<body>
<div class="div1">
<h1>Heading 1</h1>
<table>
<tr><td>Cell 1</td><td>Cell 2</td></tr>
<tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam nonummy nibh euismod tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
</div>
<div class="div2">
<h1>Heading 2</h1>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam nonummy nibh euismod tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
<ol>An ordered list
<li>First element
<li>Second element
<li>Third element
</ol>
</div>
</body>
</html>
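[Document tree for the preceding code: body contains div1 (h1, table, p) and div2 (h1, p, ol); the table contains two tr elements with two td elements each, and the ol contains three li elements.]
Figure 13-1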
Ancestors and Descendants
Ancestors and descendants are elements that are linked by lineage, no matter the distance. For example,
in Figure 13-1 the list elements under div2 are descendants of the body element, and the body element
is their ancestor, even though multiple elements separate the two.
Siblings
Siblings are children that share the same, direct parent. In Figure 13-1, the list elements under div2 are
siblings of each other. The header, paragraph, and table elements are also siblings because they share
the same, direct parent (div1).
Selecting by Hierarchy
You can use several selector mechanisms to match elements by their hierarchy in the document.
To specify ancestor and descendant relationships, you list all involved elements separated by spaces. For
example, the following selector matches the list elements in Figure 13-1 (li elements within a div with a
class of div2):
div.div2 li
To specify parent and child relationships, list all involved elements separated by a right angle bracket
(>). For example, the following selector matches the table element in Figure 13-1 (a table element that is
a direct descendant of a div with a class of div1):
To specify adjacent sibling relationships, list the involved elements separated by plus signs (+). For
example, the following selector matches the paragraph element under div1 in Figure 13-1 (a paragraph
that immediately follows a table with the same parent):
table + p
You can mix and match the hierarchy selector mechanisms for even more specificity. For example, the
following selector will match only table and paragraph elements that are children of the div with a class
value of div1:
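The hierarchy selectors discussed in this section can be sketched against the document in Figure 13-1. The property values are assumptions used only to make the rules complete:

```css
/* Descendant: any li inside the div with class="div2", at any depth */
div.div2 li { color: red; }

/* Child: a table whose direct parent is the div with class="div1" */
div.div1 > table { border: thin solid black; }

/* Adjacent sibling: a p that immediately follows a table */
table + p { text-indent: 2em; }

/* Combined: a p that immediately follows a table that is itself
   a child of the div with class="div1" */
div.div1 > table + p { font-weight: bold; }
```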
Note that this inheritance rule is valid only for foreground properties. Background properties (background
color, image, and so on) are not automatically inherited by descendant elements.
You can override inheritance by defining a style for an element with a different value for the otherwise
inherited property. For example, the following definitions result in all elements, except for paragraphs
with a nogreen class, being rendered in green:
Properties that are not in conflict are cumulatively inherited by descendant elements. For example, the
following rules result in paragraphs with an emphasis class being rendered in bold, green text:
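Both inheritance cases described above can be sketched as follows, assuming the inherited color is set on the body element and that black is the overriding color (neither detail is given in the text):

```css
/* Every descendant of body inherits green text... */
body { color: green; }

/* ...except paragraphs with class="nogreen", which override the
   inherited value with their own. */
p.nogreen { color: black; }

/* No conflict here: these paragraphs keep the inherited green and
   add bold, so they render as bold, green text. */
p.emphasis { font-weight: bold; }
```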
Using Pseudoclasses
You can use a handful of pseudoclasses to match attributes of elements in your document. Pseudoclasses
are identifiers that are understood by user agents and apply to elements of certain types without the ele-
ments having to be explicitly styled. Such classes are typically dynamic and tracked by other means than
the actual class attribute.
For example, there are pseudoclasses used to modify visited and unvisited anchors in the document
(explained in the next section). Using the pseudoclasses, you don’t have to specify classes in individual
anchor tags — the user agent determines which anchors are in which class (visited or not) and applies
the style(s) appropriately.
Anchor Styles
A handful of pseudoclasses can be used with anchor elements (<a>). The anchor pseudoclasses begin
with a colon and are listed in the following table.
Pseudoclass Matches
:link Unvisited links
:visited Visited links
:active Active links
:hover The link that the browser pointer is hovering over
:focus The link that currently has the user interface focus
For example, the following definition will cause all unvisited links in the document to be rendered in
blue, visited links in red, and when hovered over, green:
The order of the definitions is important; because the link membership in the classes is dynamic, :hover
must be the last definition. If the order of :visited and :hover were reversed, visited links would not
turn green when hovered over because the :visited color attribute would override the :hover color
attribute. Ordering is also important when using the :focus pseudoclass — it should be placed last in
the definitions.
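The anchor definitions described above might look like the following sketch, with the colors taken from the text and :hover placed after :link and :visited so that it can override them:

```css
a:link    { color: blue; }   /* unvisited links */
a:visited { color: red; }    /* visited links */
a:hover   { color: green; }  /* link under the pointer; listed last */
```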
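Pseudoclass selectors can also be combined with other selector methods. For example, if you wanted all
nonvisited anchor tags with a class attribute of important to be rendered in a bold font, you could use
a selector such as a.important:link with the declaration font-weight: bold.
The :first-child pseudoclass matches an element only when it is the first child of its parent. For example,
applying text-indent: 25px with the selector div > p:first-child indents only those paragraphs that are the
first child of a div element.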
The pseudoelements :before and :after are covered in the “Pseudoelements” section later in this
chapter.
The :lang selectors apply to all elements with a quote class within the document. The second two defi-
nitions in the preceding example add quote characters to any quote classed element.
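The example this passage refers to is not reproduced here. A sketch of what such definitions might look like, assuming English and French quote styles for elements with a quote class:

```css
/* Language-specific quote characters for elements with class="quote" */
.quote:lang(en) { quotes: '"' '"'; }
.quote:lang(fr) { quotes: '«' '»'; }

/* Insert the language-appropriate quote characters around the element */
.quote:before { content: open-quote; }
.quote:after  { content: close-quote; }
```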
Pseudoelements
Pseudoelements are another virtual construct to help apply styles dynamically to elements within a
document. For example, the :first-line pseudoelement applies a style to the first line of an element
dynamically: as the first line changes length (longer or shorter), the user agent adjusts the style
coverage accordingly.
:first-line
The :first-line pseudoelement specifies a different set of property values for the first line of elements.
This is a powerful feature; as the browser window changes widths, the “first line” of an element can grow
or shrink accordingly, and the style is applied appropriately. This is illustrated in the following code and
in Figure 13-2, which shows two browser windows of different widths:
<p>When in the Course of human events, it becomes necessary
for one people to dissolve the political bands which have
connected them with another, and to assume among the powers
of the earth, the separate and equal station to which
the Laws of Nature and of Nature’s God entitle them, a decent
respect to the opinions of mankind requires that they should
declare the causes which impel them to the separation.</p>
</body>
</html>
The preceding code example manages element formatting by exception. Most paragraphs in the docu-
ment should have their first line underlined. A universal selector is used to select all paragraph tags. A
different style, using a class selector (noline), is defined to select elements that have a class of
noline. Using this method, you only have to add class attributes to the exceptions (the minority)
instead of the rule (the majority).
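The style definitions for this example are not shown above. A sketch consistent with the description, assuming underlining is applied through text-decoration, might be:

```css
/* The rule: the first line of every paragraph is underlined */
p:first-line { text-decoration: underline; }

/* The exception: paragraphs with class="noline" are left plain */
p.noline:first-line { text-decoration: none; }
```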
Figure 13-2
The :first-line pseudoelement has a limited range of properties it can affect. Only properties in the
following groups can be applied using :first-line:
❑ Font properties
❑ Color properties
❑ Background properties
❑ word-spacing
❑ letter-spacing
❑ text-decoration
❑ vertical-align
❑ text-transform
❑ line-height
❑ text-shadow
❑ clear
:first-letter
The :first-letter pseudoelement is used to affect the properties of the first letter of an element. This
selector can be used to achieve typographic effects such as drop caps, as illustrated in the following code
and Figure 13-3:
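A minimal drop-cap sketch follows. The class name and the specific sizes are assumptions; the referenced figure may use different values:

```css
/* Enlarge the first letter and float it so that the following
   lines of the paragraph wrap around it */
p.dropcap:first-letter {
  font-size: 300%;
  float: left;
  margin-right: 0.1em;
}
```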
Figure 13-3
Notice the use of the content property. This property assigns the actual value to content-generating
selectors. In this case, quote marks are assigned as the content to add before and after elements with a
quote class. The following code and Figure 13-4 illustrate how a supporting user agent (Opera, in this
case) generates content from the :before and :after pseudoelements:
opinions of mankind requires that they should declare the causes which impel them
to the separation.</p>
</body>
</html>
Figure 13-4
Generated content breaks the division of content and presentation. However, adding presentation con-
tent is sometimes necessary to enhance the content being presented. Besides adding elements such as
quote marks, you can also create counters for custom numbered lists and other more powerful features.
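As a sketch of the counter feature mentioned above, the following definitions number each h2 heading automatically. The counter name section is an arbitrary assumption, and support for counters varies between user agents:

```css
/* Start the counter at zero for the document */
body { counter-reset: section; }

/* Increment the counter at each h2 and print it before the heading text */
h2:before {
  counter-increment: section;
  content: "Section " counter(section) ": ";
}
```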
❑ border
❑ border-collapse
❑ border-spacing
❑ border-top
❑ border-right
❑ border-bottom
❑ border-left
❑ border-color
❑ border-top-color
❑ border-right-color
❑ border-bottom-color
❑ border-left-color
❑ border-style
❑ border-top-style
❑ border-right-style
❑ border-bottom-style
❑ border-left-style
❑ border-width
❑ border-top-width
❑ border-right-width
❑ border-bottom-width
❑ border-left-width
Several of these properties can be used to set multiple properties within the same definition. For example,
to set an element’s border, you could use code similar to the following:
p.bordered {
border-top-width: 1px;
border-top-style: solid;
border-top-color: black;
border-right-width: 2px;
border-right-style: dashed;
border-right-color: red;
border-bottom-width: 1px;
border-bottom-style: solid;
border-bottom-color: black;
border-left-width: 2px;
border-left-style: dashed;
border-left-color: red;
}
Alternatively, you could use the per-side shorthand properties (border-top, border-right, border-bottom, and border-left) to shorten this definition considerably:
p.bordered {
border-top: 1px solid black;
border-right: 2px dashed red;
border-bottom: 1px solid black;
border-left: 2px dashed red;
}
This definition could be further simplified by use of the border property, which sets all sides of an ele-
ment to the same values:
p.bordered {
border: 1px solid black;
border-right: 2px dashed red;
border-left: 2px dashed red;
}
This code first sets all sides to the same values and then sets the exceptions (right and left borders).
As with all things code, avoid being overly ingenious when defining your styles. Doing so will dramati-
cally decrease the legibility of your code.
Summary
This chapter taught you the basics of defining styles — from formatting and using the various selector
methods to formatting property declarations and setting their values. You also learned about special
pseudoclasses and elements that can make your definitions more dynamic. The next series of chapters
in this part delve into specific style use — for text, borders, tables, and more.
Text
Although the Web is rife with multimedia of all types, plain text is still the main medium used to
convey messages across the Internet. As with other elements, CSS provides many properties for
controlling how your text is rendered in your document, including alignment, letter and word
spacing, white-space control, and even the font itself. This chapter covers how to use CSS to for-
mat text in your documents.
Aligning Text
Multiple properties in CSS can be used to align text, both horizontally and vertically. This section
covers the various properties used to align text to other elements around it.
Horizontal Alignment
The text-align property can be used to align text horizontally using four different values/styles:
left (default), right, center, and justify (full justification). Consider the following code and the results shown in Figure 14-1:
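The code for this example is not reproduced here. A sketch of the four alignment styles, using assumed class names, might look like this:

```css
p.left    { text-align: left; }     /* ragged right edge (default) */
p.right   { text-align: right; }    /* ragged left edge */
p.center  { text-align: center; }   /* both edges ragged */
p.justify { text-align: justify; }  /* both edges aligned (full justification) */
```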
Figure 14-1
Note that justification is specified by how the text aligns to a specific margin. For example, left-aligned
text aligns against the left margin while right-aligned text aligns to the right margin. Any side of the text
not justified remains ragged.
You can also use the text-align property to align columns of text on a specific character, for example,
monetary amounts aligned to a decimal point. The following code causes the numbers in the Amount
Due column to align on their decimal points:
This use of text-align (character alignment) is not well supported in today’s browsers. As such, you
should avoid depending on it.
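For reference, CSS2 expresses character alignment by giving text-align a string value on table cells. The following sketch assumes a class named amount on the Amount Due cells; because of the limited support noted above, a conventional right alignment is declared first as a fallback:

```css
/* Fallback for user agents that ignore string alignment */
td.amount { text-align: right; }

/* CSS2 string alignment: align cell contents on the decimal point */
td.amount { text-align: "."; }
```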
Vertical Alignment
The vertical-align property can be used to align text on the vertical axis. The vertical-align
property supports the values shown in the following table.
Chapter 14
Value         Effect
baseline      The default vertical alignment; this value aligns the text’s baseline to other
              objects around it.
bottom        Causes the bottom of the element’s bounding box to be aligned with the bottom of
              the element’s parent bounding box.
length        Causes the element to ascend (positive value) or descend (negative value) by the
              value specified.
middle        Causes the text to be aligned using the middle of the text and the midline of
              objects around it.
percentage    Causes the element to ascend (positive value) or descend (negative value) by the
              percentage specified. (The percentage is computed from the line height of the
              element.)
sub           Causes the text to descend to the level appropriate for subscripted text, based on
              its parent’s font size and line height. (Has no effect on the actual size of the
              text, only the position of the element.)
super         Causes the text to ascend to the level appropriate for superscripted text, based on
              its parent’s font size and line height. (Has no effect on the actual size of the
              text, only the position of the element.)
text-bottom   Causes the bottom of the element’s bounding box to be aligned with the bottom of
              the element’s parent text.
text-top      Causes the top of the element’s bounding box to be aligned with the top of the
              element’s parent text.
top           Causes the top of the element’s bounding box to be aligned with the top of the
              element’s parent bounding box.
The following code and Figure 14-2 illustrate the effect of each vertical-align value:
/* All elements get a border */
body * { border: 1px solid black; }
/* Reduce the spans’ font by 50% */
p * { font-size: 50%; }
</style>
</head>
<body>
<p>Baseline: Parent
<span class="baseline">aligned text</span> text</p>
<p>Sub: Parent
<span class="sub">aligned text</span> text</p>
<p>Super: Parent
<span class="super">aligned text</span> text</p>
<p>Top: Parent
<span class="top">aligned text</span> text</p>
<p>Text-top: Parent
<span class="text-top">aligned text</span> text</p>
<p>Middle: Parent
<span class="middle">aligned text</span> text</p>
<p>Bottom: Parent
<span class="bottom">aligned text</span> text</p>
<p>Text-bottom: Parent
<span class="text-bottom">aligned text</span> text</p>
<p>Length: Parent
<span class="length">aligned text</span> text</p>
<p>Percentage: Parent
<span class="percentage">aligned text</span> text</p>
</body>
</html>
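The class definitions that precede this fragment are not shown. A sketch of what they might contain follows; the length and percentage values are assumptions:

```css
.baseline    { vertical-align: baseline; }
.sub         { vertical-align: sub; }
.super       { vertical-align: super; }
.top         { vertical-align: top; }
.text-top    { vertical-align: text-top; }
.middle      { vertical-align: middle; }
.bottom      { vertical-align: bottom; }
.text-bottom { vertical-align: text-bottom; }
.length      { vertical-align: 0.5em; }   /* raised by a fixed length */
.percentage  { vertical-align: -50%; }    /* lowered by half the line height */
```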
Figure 14-2
Text isn’t the only type of element that you can affect with the vertical-align property. For example,
note the document displayed in Figure 14-3; the sphere image has its vertical-align property set to
middle. Note how the image and the text are aligned on their vertical midpoints.
Figure 14-3
Indenting Text
The text-indent property can be used to indent the first line of an element. For example, to indent the
first line of a paragraph by 5 percent of its overall width, you could use code similar to the following
(whose results are shown in Figure 14-4):
The text-indent property indents only the first line of the element to which it is applied. If you want
to indent the entire element, use the margin property instead. The margin property is discussed in
Chapter 15.
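A sketch of such a first-line indent, with a margin-based rule for comparison; the class name is an assumption:

```css
/* Indent only the first line, by 5 percent of the containing block's width */
p { text-indent: 5%; }

/* Indent the entire element instead, using the left margin */
p.indented { margin-left: 5%; }
```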
Figure 14-4
You can use any of the valid property value metrics in defining the value of the indentation. Note that
the user agent’s window size can play a significant role in the size of the actual indent if you use values
computed from the containing block (percentages, em, and so on).
Possible metrics for CSS properties were covered in the “Property Values” section of Chapter 13.
Floating Objects
Allowing elements to float in your documents can make them seem more dynamic. Floating elements
float against a margin, allowing other elements to flow around them. For example, consider the follow-
ing code and the results shown in Figure 14-5:
<p><b>Non-Floating Image</b><br />
<img src="smsphere.jpg" alt="small sphere" style="float: none;" /> Lorem ipsum
dolor sit amet, consectetuer adipiscing elit, sed diam nonummy nibh euismod
tincidunt ut laoreet dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex
ea commodo consequat. Duis autem vel eum iriure dolor in hendrerit in vulputate
velit esse molestie consequat, vel illum dolore eu feugiat nulla facilisis at vero
eros et accumsan et iusto odio dignissim qui blandit praesent luptatum zzril
delenit augue duis dolore te feugait nulla facilisi.</p>
<p>
Figure 14-5
Floating images ignore the normal flow of the document, sticking to the margin closest to their location and
allowing items to flow around them. You can set an element’s float property to right, left, or none.
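A sketch of the float values in use. The class names and margins are assumptions; the image in the example above uses an inline style attribute instead:

```css
/* Image sits against the left margin; text flows down its right side */
img.floatleft  { float: left;  margin-right: 1em; }

/* Image sits against the right margin; text flows down its left side */
img.floatright { float: right; margin-left: 1em; }

/* Image stays in the normal flow of the text */
img.inline     { float: none; }
```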
However, there are times when you might not want an element to flow around a floating element. For
example, headings look odd when floated away from their home margin, as shown in Figure 14-6.
When you don’t want an element to flow around floating elements, you should set its clear property.
The clear property has four possible values: none (default), right, left, or both. This property makes
sure that the specified side (or sides, if set to both) is clear of floated elements before the element is
placed. For example, adding this style to the example shown in Figure 14-6 ensures that both sides of all
headings are clear and results in the appearance shown in Figure 14-7 (the heading isn’t placed until the
left margin is clear):
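A sketch of such a style, assuming it is applied to all six heading levels:

```css
/* Do not place a heading until both margins are free of floated elements */
h1, h2, h3, h4, h5, h6 { clear: both; }
```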
Figure 14-6
Figure 14-7
❑ normal
❑ nowrap
❑ pre
The default value, normal, allows the user agent to compress white space normally. Using the pre value
causes the text to be formatted as preformatted text, preserving all the white space in the element. The
nowrap value results in the element not wrapping at the border of the user agent’s screen; the text
continues on the current line until the next line break. Most user agents will add horizontal scroll bars to
allow the user to scroll such content.
For example, the following paragraph will be rendered as is, with all its superfluous white space intact,
but otherwise will inherit the formatting of the paragraph element (<p>):
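A sketch of such a definition; the class name is an assumption:

```css
/* Preserve spaces and line breaks exactly as typed, while keeping the
   paragraph's other inherited formatting (font, color, and so on) */
p.verbatim { white-space: pre; }
```

A paragraph opts in with markup such as <p class="verbatim">.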
For example, consider the following code and output shown in Figure 14-8, which illustrates three dif-
ferent letter-spacing values:
<h3>Normal</h3>
<p class=”normal”>Lorem ipsum dolor sit amet, consectetuer
adipiscing elit, sed diam nonummy nibh euismod tincidunt
ut laoreet dolore magna aliquam erat volutpat. Ut wisi
enim ad minim veniam, quis nostrud exerci tation
ullamcorper suscipit obortis nisl ut aliquip ex ea commodo
consequat.</p>
<h3>Tight</h3>
<p class="tight">Lorem ipsum dolor sit amet, consectetuer
adipiscing elit, sed diam nonummy nibh euismod tincidunt
ut laoreet dolore magna aliquam erat volutpat. Ut wisi
enim ad minim veniam, quis nostrud exerci tation
ullamcorper suscipit obortis nisl ut aliquip ex ea commodo
consequat.</p>
<h3>Loose</h3>
<p class="loose">Lorem ipsum dolor sit amet, consectetuer
adipiscing elit, sed diam nonummy nibh euismod tincidunt
ut laoreet dolore magna aliquam erat volutpat. Ut wisi
enim ad minim veniam, quis nostrud exerci tation
ullamcorper suscipit obortis nisl ut aliquip ex ea commodo
consequat.</p>
</body>
</html>
Figure 14-8
Note that the user agent can govern how much letter spacing is allowed to change. Also, changing the
spacing by too drastic a value (as in the tight paragraph in Figure 14-8) can have unpredictable results.
The word-spacing property behaves exactly like the letter-spacing property, except that it controls
spacing between words.
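The style definitions for the three paragraph classes might resemble the following sketch. The class names match the listing above, but the spacing values are assumptions chosen only for illustration. The last rule shows word-spacing used the same way, with a hypothetical class name:

```css
p.normal { letter-spacing: normal; }  /* user agent default spacing */
p.tight  { letter-spacing: -0.2em; }  /* negative value pulls letters together */
p.loose  { letter-spacing: 0.5em; }   /* positive value spreads letters apart */

/* word-spacing takes the same kinds of values but acts between words */
p.wide-words { word-spacing: 1em; }
```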
Capitalization
The text-transform property can be used to force particular capitalization on elements. This property
has four possible values:
❑ none (default)
❑ capitalize
❑ uppercase
❑ lowercase
Setting the appropriate value will force the user agent to render the text (if possible) using that setting.
For example, you may want all your headings in title case (as in this book, where most words begin with
a capital letter). To do so, you could use a style similar to the following:
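A minimal sketch of such a style, assuming it is applied to the first three heading levels:

```css
/* Capitalize the first letter of every word in these headings */
h1, h2, h3 { text-transform: capitalize; }
```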
This won’t quite have the desired effect, as conjunctions (and, or, and so on) and other words not com-
monly capitalized in initial-caps schemes will still be capitalized.
Text Decorations
You can add additional text effects with the text-decoration and text-shadow properties. The
text-decoration property accepts the following values:
❑ none (default)
❑ underline
❑ overline
❑ line-through
❑ blink
However, the use of this property isn’t recommended. Blinking text has never had a welcome place on
the Web, underlined text can be confused with links, and the delete tag (<del>) should be used to gener-
ate strikethrough text.
The advice to use a specific tag (namely, <del>) for formatting seems contrary to what we preach about
styles — that is, use styles instead of dedicated tags. However, in this case, the recommendation is made
because of the meaning of the tag, namely deletion, instead of its ornamentation function (strikethrough).
The same way you would use <em> instead of <b> when you wanted text to be emphasized (but
not necessarily bold), you would use <del> to indicate that text is to be deleted, not just decorated in
strikethrough. The advantage in this case is that the user agent could be configured not even to show text
contained in <del> tags.
The text-shadow property, used to provide a drop shadow on affected text, is more complex than most
properties discussed in this chapter, having the following syntax:
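The general form can be sketched as follows. The class name and the values are assumptions for illustration only:

```css
/* General form (color and blur are optional):
   text-shadow: <color> <horizontal offset> <vertical offset> <blur>;
   Several shadows can be listed, separated by commas. */
h2.shadowed { text-shadow: #999999 0.1em 0.1em 0.2em; }
```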
At its most basic, the text-shadow property takes two distance arguments: the first horizontal, the
second vertical. Positive values place the shadow to the right and down; negative values place the
shadow to the left and up. The color value sets the color for the shadow, and the blur value specifies
the area of effect. You can also use multiple definitions to spawn multiple shadows of the same element.
When using multiple definitions, you should separate them with a comma.
For example, consider the following code, which defines a shadow above and to the right (2em, -2em) of
the heading and another lighter shadow directly underneath the text:
<h1 style="text-shadow: #666666 2em -2em, #AAAAAA 0em 0em 1.5em;">A Drop
Shadow Heading</h1>
Most user agents do not support the text-shadow property. If you desire this effect, you would be better
off creating it in graphic image form.
Formatting Lists
Chapter 4 of this book covered XHTML lists, both of the ordered (or numbered) and unordered (or bul-
leted) variety. You learned how to embed list items (<li>) in both list types to construct lists. Using CSS,
you can be much more creative with your lists, as this section will quickly demonstrate.
There is a list-style shortcut property that you can use to set list properties with one value assign-
ment. You can use the list-style property to define the other list properties, as follows:
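A sketch of the shortcut form, assuming a hypothetical marker image named bullet.gif and a hypothetical class named pictured:

```css
/* General form: list-style: <type> <position> <image>;
   any of the three values may be omitted */
ul { list-style: square inside; }
ul.pictured { list-style: disc outside url(bullet.gif); }
```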
To create a new list item, you can define a class as a list item:
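A minimal sketch, assuming a hypothetical class named item:

```css
/* Any element given this class is displayed as a list item */
.item { display: list-item; }
```

An element such as <p class="item"> would then be treated as a list item and could take the list properties described in this section.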
Then you can use that class to declare elements as list items:
As you read through the rest of this section, keep in mind that the list properties can apply to any ele-
ment defined as a list-item.
Both bullets and numbers that precede list items are known as markers. Markers have additional value
with CSS, as shown in the “Generated Content” section later in this chapter.
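The list-style-type property sets the type of marker used for a list. The following values are valid:
❑ armenian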
❑ circle
❑ cjk-ideographic
❑ decimal
❑ decimal-leading-zero
❑ disc (default)
❑ georgian
❑ hebrew
❑ hiragana
❑ hiragana-iroha
❑ katakana
❑ katakana-iroha
❑ lower-alpha
❑ lower-greek
❑ lower-latin
❑ lower-roman
❑ none
❑ square
❑ upper-alpha
❑ upper-latin
❑ upper-roman
Setting the style provides a list with appropriate item identifiers. For example, consider this code and the
output shown immediately after:
HTML Code:
<ol style="list-style-type:lower-roman;">
A Roman Numeral List
<li>Step 1
<li>Step 2
<li>Step 3
</ol>
Output:
You can also use the none value of list-style-type to suppress bullets or numbers for individual
items. However, this does not change the number of those items; the numbers are just not displayed.
For example, consider the following revised code and output:
HTML Code:
<ol style="list-style-type:lower-roman;">
A Roman Numeral List
<li>Step 1
<li style="list-style-type:none;">Step 2
<li>Step 3
</ol>
Output:
Note that the third item is still number 3 (Roman iii), despite the fact that the number for item 2 is
suppressed.
Positioning of Markers
The list-style-position property sets the position of the marker in relation to the list item. The
valid values for this property are inside or outside. The outside value is the more typical list style;
the marker is offset from the list item, and the entire text of the item is indented. The inside value sets
the list to a more compact style; the marker is indented with the first line of the item. Figure 14-9 shows
an example of both list marker positions:
Figure 14-9
To set the marker position for an entire list, use the list-style-position property in the list ele-
ment (<ol> or <ul>) instead of in the list item element.
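As a sketch, the two positions could be set for whole lists with style definitions such as these (the class names are hypothetical):

```css
/* Marker sits inside the text block, indented with the first line */
ol.compact { list-style-position: inside; }

/* Marker hangs outside the text block (the more typical style) */
ol.standard { list-style-position: outside; }
```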
<ol>
<li style="list-style-image: url(sphere.jpg)">
Lorem ipsum dolor sit amet, consectetuer
adipiscing elit, sed diam nonummy nibh euismod tincidunt
ut laoreet dolore magna aliquam erat volutpat.
<li style="list-style-image: url(cone.jpg)">
Lorem ipsum dolor sit amet, consectetuer
adipiscing elit, sed diam nonummy nibh euismod tincidunt
ut laoreet dolore magna aliquam erat volutpat.
</ol>
Note that you can use any URL-accessible image with the list-style-image property. Remember to
use images sized appropriately for your lists.
Autogenerating Text
One of the strengths of CSS is its ability to generate additional text, not just format existing text. You saw
how the :before and :after pseudoelements could be used to add text in Chapter 13. This section
expands on the autogenerated text mechanisms.
The quotes property takes a list of arguments in string format to use for the open and close quotes at
multiple levels. This property has the following form:
The standard definition for most English uses (double quotes on first level, single quotes on second
level) is as follows:
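A sketch of such a definition, assuming it is attached to the inline quote element (the selector is an assumption):

```css
/* General form: quotes: <open-1> <close-1> <open-2> <close-2> ... ;
   first pair is used for the outer level, second pair for nested quotes */
q { quotes: '"' '"' "'" "'"; }
```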
The opposite quote type is used to encapsulate each quote character (single quote enclosing double and
vice versa).
Once you define the quotes, you can use them along with the :before and :after pseudoelements, as
in the following example:
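A minimal sketch, assuming the quotes are applied to blockquote elements (the selector is an assumption):

```css
blockquote { quotes: '"' '"' "'" "'"; }

/* Insert the current open and close quote around each blockquote */
blockquote:before { content: open-quote; }
blockquote:after  { content: close-quote; }
```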
The open-quote and close-quote words are mnemonics for the values set in the quotes property.
The content property also accepts string values, so you can use almost anything for its value.
Automatic Numbering
The content property can also be used to automatically generate numbers, which, in turn, can be used
to automatically number elements. The advantage to using automatic counters over standard list num-
bering comes in the form of flexibility, enabling you to start at an arbitrary number, combine numbers
(for example, 1.1), and so on.
As with all generated content, most user agents do not support counters.
content: counter(counter_name);
This places the current value of the counter specified in the content object. For example, the following
style definition will cause the user agent to display “Chapter” and the current value of the counter
named chapter_num at the beginning of each <h1> element:
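A sketch of such a definition, using the counter name given in the text (the trailing space string is an assumption about layout):

```css
/* Display "Chapter" and the current value of chapter_num before each h1 */
h1:before { content: "Chapter " counter(chapter_num) " "; }
```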
Of course, it’s of no use to always assign the same number to the element. The counter-increment and
counter-reset properties are used to change the value of the counter.
If the increment value is not specified, the counter is incremented by 1. You can increment several coun-
ters with the same statement by specifying the additional counters after the first, separated by spaces.
For example, to increment the chapter and section counters each by 2, you could use the following:
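A minimal sketch, assuming the increment is attached to h1 elements (the selector is an assumption):

```css
/* Each counter name is followed by its own increment value */
h1 { counter-increment: chapter 2 section 2; }
```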
You can also specify negative numbers to decrement the counter(s). To decrement the chapter counter
by 1, you could use the following:
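A minimal sketch, again assuming the property is attached to h1 elements:

```css
/* A negative increment counts the chapter counter down by 1 */
h1 { counter-increment: chapter -1; }
```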
The other method for changing a counter’s value is to use the counter-reset property. This property
resets the counter to 0 or a number expressly specified with the property. The counter-reset property
has the following format:
For example, to reset the chapter counter to 1, you could use this definition:
counter-reset: chapter 1;
You can reset multiple counters with the same property by specifying all the counters on the same line,
separated by spaces.
If a counter is used and incremented or reset in the same context (in the same definition), the counter is
first incremented or reset before being used. For example, the following code will not use the value of
the chapter counter before the heading; it will increment the counter and use the incremented value
despite the fact that the content property comes before the counter-increment property:
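A sketch of such a definition, modeled on the chapter heading style used later in this section:

```css
/* The counter is incremented first, then its new value is displayed,
   even though content is listed before counter-increment */
h1:before { content: "Chapter " counter(chapter) ": ";
            counter-increment: chapter; }
```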
Autonumbering Examples
Using counters, you can easily implement autonumbering schemes for many things. This section shows
two examples — one for chapters and sections, the other for lists.
This definition will display “Chapter chapter_num:” before the text in each <h1> element. The
chapter counter is incremented and the section counter is reset for each <h1> element.
The next step is to set up the section numbering, which is similar to the chapter numbering:
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Chapter Auto-Numbering</title>
<style type="text/css">
h1:before {content: "Chapter " counter(chapter) ": ";
counter-increment: chapter;
counter-reset: section; }
h2:before {content: "Section " counter(chapter) "."
counter(section) ": ";
counter-increment: section; }
</style>
</head>
<body>
<h1>First Chapter</h1>
<h2>Section Name</h2>
<h2>Section Name</h2>
<h1>Second Chapter</h1>
<h2>Section Name</h2>
<h1>Third Chapter</h1>
</body>
</html>
Output
The code results in the output shown in Figure 14-10.
Figure 14-10
In this example, starting both counters at 0 is ideal. However, if you needed to start the counters at
another value, the resets should be attached to a higher tag in the document hierarchy (such as <body>):
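A minimal sketch, assuming the chapters should begin at 10. The starting value is an assumption; because the heading style increments the counter before displaying it, the reset value is one less than the first number wanted:

```css
/* First h1 is incremented to 10 before its number is displayed */
body { counter-reset: chapter 9; }
```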
Source
For example, consider the following code, which starts numbering the list at 30:
<html>
<head>
<title>List Auto-Number</title>
<style type="text/css">
li:before {content: counter(list) ": ";
counter-increment: list; }
</style>
</head>
<body>
<ol style="counter-reset: list 29;
list-style-type:none;">
<li>First item
<li>Second item
<li>Third item
</ol>
</body>
</html>
Output
The output of the preceding code appears in Figure 14-11.
Figure 14-11
You can use multiple instances of a counter in your documents, and each instance can operate inde-
pendently. The key is each counter’s scope: A counter’s scope is within the element that initialized
the counter with the first reset. In the list example, it is the <ol> tag. If you nested another <ol>
tag within the first, the nested list could have its own instance of the list counter.
Fonts
Fonts are stylized collections of letters and symbols. Different fonts can be used to convey different
information; specialized fonts can be used to provide special characters or symbols. Although fonts can
be quite different from each other, they share the same basic characteristics, as shown in Figure 14-12.
Figure 14-12
Fonts are mapped according to a system similar to ruled paper. The line that the characters or symbols
sit on is called the baseline. The distance between the baseline and the top of the highest characters
(usually capital letters and lowercase letters such as l, f, or t) is known as the ascent. The distance
between the baseline and the lowest point of characters that dip below it (such as p, g, or q) is known
as the descent.
Vertical font measurements, such as line spacing or leading, are typically measured between the baselines
of text, at least as far as CSS is concerned.
Just as CSS offers many properties to control lines and paragraphs, it also offers many properties to con-
trol the font(s) of the text in your documents.
Font Selection
CSS supports five different font family types. These general types can be used to apprise a user agent of
the type of font face it should use. Those five families are as follows:
❑ Serif — Serif fonts have small ornamental strokes (serifs) on each glyph (a glyph is a character,
including letters, numbers, and symbols). Typically, serif fonts are used in body text; the finishing
strokes, flared or tapering ends, or serifed endings, make the lines of characters flow and tend to be
easier on the eyes.
❑ Sans serif — These fonts are fairly plain, having little or no ornamentation on their glyphs. Sans
serif fonts are typically used for headings or other large areas of emphasis.
❑ Cursive — Cursive fonts are quite ornate, approximating cursive writing. Such fonts should be
used only in extreme cases where emphasis is on ornamentation rather than legibility.
❑ Fantasy — Fantasy fonts, much like cursive fonts, emphasize ornamentation over legibility.
Fantasy fonts come in many styles but still retain the basic shape of letters. Like cursive fonts,
fantasy fonts are generally used for logos and other ornamentation purposes where legibility is
secondary.
❑ Monospace — Monospace fonts come in serif and sans serif varieties but all share the same
attribute: all characters in the font have the same width. The effect is much like characters on a
text-based terminal or typewriter. Such fonts are generally used in code listings and other list-
ings approximating terminal output.
The font-family property defines the font or fonts that should be used for elements in the document.
This property has the following format:
For example, to select a sans serif font, you might use a definition similar to the following:
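A sketch of such a definition, using the font names discussed in this section and assuming it is applied to the whole document body (the selector is an assumption):

```css
/* Fonts are tried left to right; the generic family is the last resort */
body { font-family: Verdana, Arial, Helvetica, Sans-Serif; }
```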
Note that this definition uses three family names (Verdana, Arial, Helvetica) and a generic family
name (Sans-Serif) for versatility. The definition instructs the user agent that the sans serif font
Verdana should be used. If it is unavailable, the Arial font (popular on Windows-based platforms)
should be used. If neither of those fonts is available, the Helvetica font should be used (popular on
Macintosh-based platforms and other PostScript systems). If none of the previously specified fonts are
available, the user agent should use its default sans serif font.
The preceding font-family definition is a good, universal sans serif font specification that can be used for
any platform. Likewise, the following definition can be used for a universal serif font specification:
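A sketch of a comparable serif definition. The specific font names are assumptions chosen because they are widely installed; only the generic Serif fallback is implied by the text:

```css
/* Font names containing spaces are quoted */
body { font-family: "Times New Roman", Times, Serif; }
```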
Note that the font-family definition doesn’t control the font variant (bold, italic, and so on) but the
font that should be used as the basis for fonts in the element where the font-family definition is
placed. Individual font variant tags and elements (<b>, <i>, and so on) determine the variant of the font
used when such elements are encountered by the browser. If the base font cannot be used, one of the
alternative fonts (if any) listed in the definition is used in its stead.
Style definitions to set up a document in traditional serif font body text with sans serif font headings
would resemble the following:
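A minimal sketch, assuming the serif font names shown here (they are illustrative) and the sans serif list discussed earlier in this section:

```css
/* Body text in a serif font */
body { font-family: "Times New Roman", Times, Serif; }

/* Headings in a sans serif font */
h1, h2, h3, h4, h5, h6 { font-family: Verdana, Arial, Helvetica, Sans-Serif; }
```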
Font Sizing
Two properties can be used to control font sizing: font-size and font-size-adjust. Both properties
can adjust a font absolutely or relative to the current font size. Possible value metrics are shown in the
following table:
Metric                    Description
Absolute size keywords    Keywords corresponding to user agent absolute font sizes. These keywords include xx-small, x-small, small, medium, large, x-large, and xx-large.
Relative size keywords    Keywords corresponding to user agent relative font sizes. These keywords include larger and smaller.
Length absolute           An absolute value corresponding to a font size. Negative values are not supported, but supported values include point sizes (for example, 12pt) and optionally (though not as exact) other size values such as pixels (for example, 10px).
Percentage relative       A value relative to the current font size. These values can be expressed in actual percentages (for example, 150%) or other relative metrics such as ems (for example, 1.5em).
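Each metric in the table might be used as in the following sketch. The class names are hypothetical; the values are the examples given in the table:

```css
p.keyword  { font-size: x-large; }  /* absolute size keyword */
p.relative { font-size: larger; }   /* relative size keyword */
p.points   { font-size: 12pt; }     /* absolute length in points */
p.pixels   { font-size: 10px; }     /* length in pixels */
p.percent  { font-size: 150%; }     /* percentage of the current size */
p.ems      { font-size: 1.5em; }    /* same effect expressed in ems */
```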
Font Styling
Four properties can be used to affect font styling: font-style, font-variant, font-weight, and
font-stretch. The syntax of each is shown in the following listing:
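The comments in the following sketch summarize the value keywords defined for these properties in CSS2; the rule itself uses a hypothetical class name and arbitrary values:

```css
/* font-style:   normal | italic | oblique */
/* font-variant: normal | small-caps */
/* font-weight:  normal | bold | bolder | lighter | 100 through 900 */
/* font-stretch: normal | wider | narrower | ultra-condensed ... ultra-expanded */
p.styled {
  font-style: italic;
  font-variant: small-caps;
  font-weight: bold;
  font-stretch: condensed;
}
```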
The font-style property is used to control the italic style of the text, while the font-weight property
is used to control the bold style of the text. The other two properties control other display attributes of
the font; font-variant controls whether the font is displayed in small caps, and font-stretch does
exactly what its name implies.
The various values for the font-weight property can be broken down as follows:
❑ 100-900 — The darkness of the font, where 100 is the lightest and 900 the darkest. Various
numbers correspond to other values as described below.
❑ lighter — Specifies the next lightest setting for a font unless the font weight is already near the
weight value corresponding to 100, in which case, it stays at 100.
❑ normal — The normal darkness for the current font (corresponds to weight 400).
❑ bold — The darkness corresponding to the bold variety of the font (corresponds to weight 700).
❑ bolder — Specifies the next darkest setting for a font unless the font weight is already near the
weight value corresponding to 900, in which case, it stays at 900.
The font-style and font-weight properties can be used to control a font’s bold and italic properties
without coding document text directly with italic (<i>) and bold (<b>) elements. For example, you
might define a bold variety of a style using definitions similar to the following:
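A minimal sketch, assuming a class named bold as in the discussion that follows and an illustrative base font for paragraphs:

```css
/* Base font for all paragraphs (font names are assumptions) */
p { font-family: "Times New Roman", Times, Serif; }

/* Bold variety of the paragraph style */
p.bold { font-weight: bold; }
```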
The bold class of the paragraph element inherits the base font from its parent, the paragraph element.
The font-weight property in the bold class of the paragraph element simply makes such styled ele-
ments render as a bold variety of the base font.
Line Spacing
The line-height property controls the line height of text. The line height is the distance between the
baselines of two vertically stacked lines of text. This value is also known as leading.
This property sets the size of the surrounding box of the element to which it is applied. The normal
value sets the line height to the default size for the current font. Specifying a number (for example, 2)
causes the element’s font size to be multiplied by the number specified to obtain the line height. Length
values (for example, 18pt or 1.2em) cause the line height to be set to that length. A percentage value is
handled like a number value; the percentage is applied to the size of the current font.
For example, the following two definitions both set a class up to double-space text:
p.doublespace { line-height: 2; }
p.doublespace { line-height: 200%; }
Font Embedding
There are two technologies for providing fonts embedded into your documents. Embedding fonts allows
your readers to download the specific font to their local machine so that your documents use the exact
font you designate. Unfortunately, as with most progressive Web technologies, the market is split into
two distinct factions:
❑ OpenType is a standard developed by Microsoft and Adobe Systems. OpenType fonts, thanks to
the creators of the standard, share similar traits to PostScript and TrueType fonts used in other
publishing applications. Currently, only Internet Explorer supports OpenType.
❑ TrueDoc is a standard developed by Bitstream, a popular font manufacturer. Currently, only
Netscape-based browsers natively support TrueDoc fonts; however, Bitstream does make an
ActiveX control for support on Internet Explorer.
Even though a font is available for low cost or even for free, it doesn’t mean you can reuse it, especially
in a commercial application. When acquiring fonts for use on the Web, you need to ensure that you will
have the appropriate rights for the use you intend.
To embed OpenType fonts in your document, you use an @font-face definition in the style section of
your document. The @font-face definition has the following syntax:
@font-face { font-definition }
The font-definition contains various information on the font, including stylistic information and the
path to the font file. This information is contained in typical style property: value form, similar to
the following:
@font-face {
font-family: Dax;
font-weight: bold;
src: url('http://www.example.com/fontdir/Dax.pfr');
}
To embed TrueDoc fonts in your document, you use the link (<link>) tag in a format similar to the
following:
To use TrueDoc fonts in Internet Explorer, you also have to include the TrueDoc ActiveX control using
code similar to the following:
Several fonts are available for use on the TrueDoc Web site. Visit www.truedoc.com for more
information.
Embedding fonts is not recommended for several reasons, including the following:
Instead, it is recommended that you stick to CSS definitions for specifying font attributes. If you know
your audience and their platform and you need your document to look exactly as you intend, investigate
embedded fonts.
Summary
This chapter introduced you to the various CSS properties and definitions used to control text in your
documents. You learned how to align text, control the spacing, and specify the font used. You also
learned about two different font technologies that enable you to embed specific fonts in your documents.
Padding, Margins, and Borders
All elements in an XHTML document can be formatted in a variety of ways using CSS. Previous
chapters in this part of the book covered the CSS basics — how to write a style definition and how
to apply it to various elements within your documents. This chapter begins coverage of the area
that surrounds elements and how it can be formatted, including customizing the space around an
element and giving it a border. Chapter 16 continues this discussion with colors and background
images.
To illustrate this point, take a look at Figure 15-1. This figure shows a document that isn’t overtly
boxy.
Chapter 15
Figure 15-1
The same document is shown in Figure 15-2, but a thin border has been added to all elements, courtesy
of the following style:
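A sketch of a style with this effect, assuming the universal selector is used to reach every element:

```css
/* Draw a thin, solid, black border around every element */
* { border: thin solid black; }
```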
Note how all the XHTML elements in the document pick up the border in a rectangular box shape.
As previously mentioned in this part of the book, all elements have a margin, padding, and border prop-
erty. These properties control the space around the element’s contents and the elements around it. These
properties stack around elements, as shown in Figure 15-3.
Figure 15-2
Figure 15-3 (box model diagram: content, surrounded in turn by padding, border, and margin)
The element contents (text, image, and so on) are immediately surrounded by padding. The padding
defines the distance between the element’s contents and border.
The element’s border (if any) is drawn right outside the element’s padding.
The element’s margin surrounds the element’s border, or the space the border would occupy if no border
is defined. The margin defines the distance between the element and neighboring elements.
Element Padding
An element’s padding defines the space between the element and the space its border would occupy. This
space can be increased or decreased, or set to an absolute value, using the following padding properties:
❑ padding-top
❑ padding-right
❑ padding-left
❑ padding-bottom
❑ padding
The first four properties are predictable in their behavior; for example, padding-top will change the
padding on the top of the element, padding-right will change the padding on the right side of the ele-
ment, and so forth. The fifth property, padding, is a shortcut for all sides; its effect is determined by the
number of values provided, as explained in the following table.
For example, the following style will set the top and bottom padding value to 5 pixels and the right and
left padding to 10 pixels:
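A sketch of such a style, together with the other forms of the shortcut. The class names are hypothetical; the value-to-side mappings are the standard CSS shorthand rules:

```css
/* Two values: top and bottom 5 pixels, right and left 10 pixels */
p.two-values { padding: 5px 10px; }

/* One value: all four sides */
p.one-value { padding: 5px; }

/* Three values: top, then right and left, then bottom */
p.three-values { padding: 5px 10px 15px; }

/* Four values: top, right, bottom, left (clockwise from the top) */
p.four-values { padding: 5px 10px 15px 20px; }
```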
Although changing an element’s padding value will change its distance from neighboring elements, you
should use an object’s margin property to increase or decrease the distance from neighboring elements.
Note, however, that an element’s background color extends to the edge of the element’s padding. Therefore,
increasing an element’s padding can extend the background away from an element. This is one reason to
use padding instead of margins to increase space around an element. For more information on back-
grounds, see Chapter 16.
As with most CSS properties, you can specify an absolute value (as in the preceding example) or a relative
value. Relative values in ems are computed from the element’s font size, whereas percentage values are
computed from the width of the element’s containing block, not from the default value of the padding.
For example, the following code would define padding as two times the element’s font size:
padding: 2em;
Element Borders
Borders are among the most versatile CSS properties. As you saw in Figure 15-2, every element in an
XHTML document can have a border. However, that figure showed only one type of border, a single, thin,
black line around the entire element. Each side of an element can have a different border, all controlled by
CSS properties corresponding to width, style (solid, dashed, dotted, and so on), and color of the border.
The following sections detail how each of the respective CSS properties can be used to affect borders.
Border Width
The width of an element’s border can be specified using the border width properties, which include the
following:
❑ border-top-width
❑ border-right-width
❑ border-bottom-width
❑ border-left-width
❑ border-width
As with other properties that affect multiple sides of an element, there are border width properties for
each side and a shortcut property that can be used for all sides, border-width.
The border-width shortcut property accepts one to four values. The way the values are mapped to
the individual sides depends on the number of values specified. The rules for this behavior are the same
as those used for the padding property. See the “Element Padding” section earlier in this chapter for
the specific rules.
As with other properties, the width can be specified in absolute or relative units, although border
widths accept lengths and not percentages. For example, the style in the following code example sets
all of an element’s borders to 2 pixels wide; a relative value such as 0.5em would instead set the
borders to half of the element’s font size:
p.two-pixel { border-width: 2px; }
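A relative border width might be sketched as follows. The class name is hypothetical, and em units are used because border-width accepts lengths rather than percentages:

```css
/* Borders half as wide as the element's font size */
p.half-em { border-width: 0.5em; }
```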
You can also use keywords such as thin, medium, or thick to roughly indicate a border’s width. The
actual width used when the document is rendered is up to the user agent. However, if you want exact
control over a border’s width, you should specify it using absolute values.
Border Style
There are 10 different types of predefined border styles. These types are shown in Figure 15-4, generated
by the following code:
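A sketch of style definitions covering the ten border-style keywords. The class names and the width are assumptions for illustration:

```css
p { border-width: 4px; }

p.none   { border-style: none; }
p.hidden { border-style: hidden; }
p.dotted { border-style: dotted; }
p.dashed { border-style: dashed; }
p.solid  { border-style: solid; }
p.double { border-style: double; }
p.groove { border-style: groove; }
p.ridge  { border-style: ridge; }
p.inset  { border-style: inset; }
p.outset { border-style: outset; }
```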
The border type hidden is identical to the border type none, except that the border type hidden is
treated like a border for border conflict resolutions. Border conflicts happen when adjacent elements
share a common border (when there is no spacing between the elements). In most cases, the most eye-
catching border is used. However, if either conflicting element has the conflicting border set to hidden,
the border between the elements is unconditionally hidden.
Figure 15-4
As with other properties of this type, there are several different border style properties:
❑ border-top-style
❑ border-right-style
❑ border-bottom-style
❑ border-left-style
❑ border-style
The first four properties affect the side for which they are named. The last, border-style, acts as a
shortcut for all sides, following the same rules as other shortcuts covered in this chapter. See the section
“Element Padding” for more information.
Border Color
The border color properties allow you to set the color of the element’s visible border. As with the other
properties in this chapter, there are border color properties for each side of an element (border-top-
color, border-right-color, and so on) as well as a shortcut property (border-color) that can affect
all sides.
You can choose from three different methods to specify colors in the border colors properties:
❑ Color keywords — Black, white, maroon, and so on. Note that the exact color (mix of red,
green, and blue) is left up to the browser and its default colors. (See Appendix A for a list of
common color keywords.)
❑ Color hexadecimal values — Values specified in the form #rrggbb, where rrggbb is two digits
(in hexadecimal notation) for each of the colors red, green, and blue. For example, #FF0000
specifies red (255 red, 0 green, 0 blue) and #550055 specifies purple (equal parts of red and
blue, no green).
❑ Color decimal or percentage values — Values specified using the rgb( ) function. This function
takes three values, one each for red, green, and blue. The value can be an integer between 0 and
255 or a percentage. For example, the following specifies the color purple (equal parts red and
blue, no green) in integer form and then again in percentages:
rgb(128, 0, 128)
rgb(50%, 0%, 50%)
Most graphic editing programs supply color values in multiple formats, including percentage
RGB values and perhaps even HTML-style hexadecimal format. Lynda Weinman’s site,
www.lynda.com, contains a multitude of information on Web colors, especially the following page:
http://www.lynda.com/hex.html.
For example, the following two styles set the same border for different paragraph styles:
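A minimal sketch, assuming two hypothetical paragraph classes and a red border. The first uses hexadecimal notation and the second uses the rgb( ) function for the same color:

```css
p.hex-border { border-style: solid;
               border-color: #FF0000; }

p.rgb-border { border-style: solid;
               border-color: rgb(255, 0, 0); }
```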
Border Spacing
Two additional border properties bear mentioning here, both of which are primarily used with tables:
❑ border-spacing — This property controls how the user agent renders the space between cells
in tables.
❑ border-collapse — This property selects whether adjacent table borders are collapsed into a single
border or kept separate.
These properties are covered in more depth along with other table properties in Chapter 17.
Element Margins
Margins are the space between an element’s border and neighboring elements. Margins are an important
property to consider and adjust as necessary within your documents. Most elements have suitable
default margins, but sometimes you will find it necessary to increase or decrease an element’s margin(s)
to suit your unique needs.
For example, consider the image and text shown in Figure 15-5, rendered using the following code:
Figure 15-5
Notice how the “T” in “Text” is almost touching the image next to it. In this case, an additional margin
would be welcome.
Chapter 15
As with other properties in this chapter, margin properties exist for each individual side (margin-top,
margin-left, and so on) as well as a shortcut property to set all sides at once (margin). As with the
other shortcut properties described herein, the margin property accepts one to four values, and the
number of values specified determines how the property is applied to an element. See the “Element
Padding” section earlier in this chapter for more information.
For example, you can increase the margins of the image in Figure 15-5 using a style similar to the
following:
margin-right: 5px;
This would set the right margin of the image (the edge next to the text) to 5 pixels. Likewise, you can
change all four margins using a shortcut such as the following:
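A minimal sketch of such a shortcut follows. The class name and the pixel values are assumptions chosen for illustration; with four values, they apply to the top, right, bottom, and left margins in that order.

```css
/* Hypothetical class name; values apply to top, right, bottom, left in that order */
img.spaced { margin: 2px 5px 2px 5px; }
```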
There are no guidelines for which margins you should adjust on what elements. However, it’s usually
best to modify the least number of margins or to be consistent with which margins you do change.
Dynamic Outlines
Outlines are another layer that exists around an element to allow the user agent to highlight the element,
if necessary. This generally happens when a form element receives focus. The position of the outline can-
not be moved, but it can be influenced by the position of the element’s border. Note that outlines do not
occupy any space; the element occupies the same amount of space whether its outline is visible or not.
Figure 15-6 shows an example of a dynamic outline around the Phone label.
Figure 15-6
Using CSS you can modify the look of outlines. However, unlike other properties covered in this chapter,
all sides of an outline must be the same. The CSS properties governing outlines include outline-color,
outline-style, outline-width, and the shorthand property outline. These properties operate
much like the other properties in this chapter, allowing the same values and having the same effects.
The format of the outline shortcut property is as follows:
To use the outline properties dynamically, use the :focus and :active pseudo-classes. These two
pseudo-classes specify when an element’s outline should be visible: when it has focus or when it is
active. For example, the following definitions specify a thick green outline when form elements have
focus and a thin blue outline when they are active:
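A minimal sketch of such definitions, assuming the form elements in question are input elements (the selectors are an assumption, not the original listing):

```css
/* Sketch only: the input selectors are assumptions about which form elements to style */
input:focus  { outline: thick solid green; }
input:active { outline: thin solid blue; }
```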
However, as of this writing, user agent support for outlines is very inconsistent, when it exists at all. If
you intend to use outlines in your documents, you should test your code extensively on all platforms
you expect your audience to use.
Summary
This chapter introduced you to the box model of CSS and how you can use various properties of an ele-
ment’s surrounding box to help format your documents. You learned how padding, borders, and margins
comprise a layered structure around an element and how each can be manipulated to change how ele-
ments render in the document. You learned about the extensive border options and finished with cover-
age of dynamic outlines. Chapter 16 covers the other customizable pieces of the box model, namely the
foreground and background, both colors and images.
Colors and Backgrounds
In Chapter 15, you learned about the box-formatting model of CSS and how you can manipulate
an element’s containing box to format your XHTML documents. This chapter continues that dis-
cussion, teaching you about element foreground and background colors and using images for ele-
ment backgrounds.
Element Colors
Most elements in an XHTML document have two color properties: a foreground property and a
background property. Both of these properties can be controlled using CSS styles. The following
sections discuss both types of color properties.
Foreground Colors
The foreground color of an element is typically used as the visible portion of an element — in most
cases, the color of the font or other visible part of the element. You can control the foreground
color of an element using the CSS color property, which has the following format:
color: <color_value>;
As with other properties using color values, the value can be expressed using one of three methods:
❑ A color keyword (red)
❑ A hexadecimal value in the form #rrggbb (#FF0000 for red)
❑ An RGB value using the rgb() function (rgb(100%,0%,0%) or rgb(255,0,0) for red)
More information on color values can be found in the “Border Color” section of Chapter 15.
Chapter 16
For example, the following style defines a class of the paragraph element, which will be rendered with a
red font:
The following paragraph, when used with the preceding style, will be rendered with red text:
As with all style properties, you are not limited to element-level definitions. As shown in the following
code, you can define a generic class that can be used with elements, spans, divisions, and more:
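The difference between the two kinds of definition can be sketched as follows. The class name redtext is hypothetical; any name would do.

```css
/* Element-level class: applies only to paragraphs carrying class="redtext" */
p.redtext { color: red; }

/* Generic class: applies to any element (span, div, and so on) carrying class="redtext" */
.redtext { color: red; }
```

With the generic form in place, a span or division marked with the same class picks up the red foreground color without a separate rule.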
When defining an element’s foreground color, you should pay attention to what that element’s back-
ground color will be, avoiding dark foregrounds on dark backgrounds and light foregrounds on light
backgrounds. However, matching foreground and background colors can have its uses — see the note
near the end of the following section for an example of this practice.
Keep in mind that the user settings of the user agent can affect the color of elements, as well. If you don’t
explicitly define an element’s color using appropriate styles, the user agent will use its default colors.
Background Colors
An element’s background color can be thought of as the color of the virtual page the element is rendered
upon. For example, consider Figure 16-1, which shows two paragraphs: the first is rendered against the
user agent’s default background (in this case, white) and the second against a light-gray background.
Figure 16-1
Saying that a document has a default color of white is incorrect. The document will have the color speci-
fied in the user agent’s settings if not otherwise instructed to change it.
You can use the CSS background-color property to define a particular color that should be used for
an element’s background. The background-color property’s syntax is similar to other element color
properties:
background-color: <color_value>
For example, you could use this property to define a navy blue background for the entire document
(or at least its body section):
Note that this definition also sets a foreground color so that the default text will be visible against the
dark background.
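A definition along these lines, offered as a sketch rather than the original listing (the choice of white for the text is an assumption), might be:

```css
/* Navy background for the document body, with light text so it stays readable */
body { background-color: navy; color: white; }
```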
Sometimes it can be advantageous to use similar foreground and background colors together. For exam-
ple, on a forum that pertains to movie reviews, users may wish to publish spoilers — pieces of the plot
that others may not wish to know prior to seeing the movie. On such a site, a style can be defined such
that the text cannot be viewed until it is selected in the user agent, as shown in Figure 16-2. The style
could be defined as follows:
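A minimal sketch of such a style, assuming a hypothetical class named spoiler and a white page background:

```css
/* Hypothetical class; text matches the background until the user selects it */
.spoiler { color: white; background-color: white; }
```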
Figure 16-2
Note that an element’s background extends to the end of its padding. If you want to enlarge the back-
ground of an element, expand its padding accordingly. For example, both paragraphs in Figure 16-3
have a lightly colored background. However, the second paragraph has had its padding expanded, as
laid out in the following code:
Background Images
In addition to solid colors, you can specify that an element use an image as its background. To do so, you
use the background-image property. This property has the following syntax:
background-image: url("<url_to_image>");
For example, the following code results in the document rendered in Figure 16-4, where the paragraph is
rendered over a light gradient:
Background images can be used for interesting effects, such as that shown in Figure 16-5, rendered from
the following code:
<img src="catframe.gif" alt="gradient" width="490" height="231" /></p>
</body>
</html>
Note how the various sides of the paragraph were padded to ensure that the text appears in the correct
position relative to the background.
Figure 16-4
Figure 16-5
Using the background-repeat property is straightforward — its values specify how the image repeats.
For example, to repeat our smiley face across the top of the paragraph, specify repeat-x, as shown in
the following definition code and Figure 16-6:
Figure 16-6
Specifying repeat-y would repeat the image vertically instead of horizontally. If you specify just
repeat, the image tiles both horizontally and vertically. Specifying no-repeat will cause the image to
be placed once only, not repeating in either dimension.
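The four values can be compared side by side in a sketch like the following. The class names are hypothetical, and the image file name smiley.gif is assumed to match the one used elsewhere in this chapter.

```css
/* Hypothetical class names, one per background-repeat value */
p.across { background-image: url("smiley.gif"); background-repeat: repeat-x; }  /* horizontal only */
p.down   { background-image: url("smiley.gif"); background-repeat: repeat-y; }  /* vertical only */
p.tiled  { background-image: url("smiley.gif"); background-repeat: repeat; }    /* both directions */
p.once   { background-image: url("smiley.gif"); background-repeat: no-repeat; } /* placed once */
```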
The background-attachment property specifies how the background image is attached to the element.
Specifying scroll allows the image to scroll with the contents of the element, as shown with the second
paragraph in Figure 16-7. Both paragraphs were rendered with the following paragraph definition; the
second paragraph has been scrolled a bit, vertically shifting both text and image:
background-image: url("smiley.gif");
background-attachment: scroll;
/* Border for clarity only */
border: thin solid black; }
Figure 16-7
Specifying a value of fixed for the background-attachment property will fix the background image
in place, causing it not to scroll if/when the element’s content is scrolled. This value is particularly use-
ful for images used as the background for entire documents for a watermark effect.
The use of the overflow property in the code for Figure 16-7 controls what happens when an element’s
content is larger than its containing box. The scroll value enables scroll bars on the element so that the
user can scroll to see the entire content. The overflow property also supports the values visible
(which causes the element to be displayed in its entirety, despite its containing box size) and hidden
(which causes the portion of the element that overflows to be clipped and remain inaccessible to the user).
❑ Two percentages are used to specify where the upper-left corner of the image should be placed
in relation to the element’s padding area.
❑ Two lengths (in inches, centimeters, pixels, em, and so on) specify where the upper-left corner of
the image should be placed in relation to the element’s padding area.
❑ Keywords specify absolute measures of the element’s padding area. The supported keywords
include top, left, right, bottom, and center.
No matter what format you use for the background-position values, the format is as follows:
If only one value is given, it is used for the horizontal placement and the image is centered vertically.
The first two formats can be mixed together (for example, 10px 25%). Under CSS2, keywords cannot be
mixed with other values (for example, center 30% is invalid there), although CSS 2.1 relaxes this restriction.
For example, to center a background image behind an element, you can use either of the following
definitions:
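As a hedged sketch (the class names and image file are assumptions), centering can be written with keywords or with percentages:

```css
/* Hypothetical class names; both rules center the image behind the paragraph */
p.centered-keywords { background-image: url("smiley.gif");
                      background-repeat: no-repeat;
                      background-position: center center; }
p.centered-percent  { background-image: url("smiley.gif");
                      background-repeat: no-repeat;
                      background-position: 50% 50%; }
```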
If you want to specify an absolute position behind the element, you can do so as well:
You can combine the background image properties to achieve diverse effects. For example, you can use
background-position to set an image to appear in the center of the element’s padding, and you can
specify background-attachment: fixed to keep it there. Furthermore, you could use background-
repeat to repeat the same image horizontally or vertically, creating striping behind the element.
Summary
This chapter completed the discussion of the CSS box-formatting model and how you can manipulate
the foreground and background of the containing box of elements. You learned about foreground and
background colors as well as how to use images as the background for elements. Chapter 17 covers
table-formatting properties, and the CSS coverage wraps up in Chapter 18 with an explanation of ele-
ment positioning.
Tables
In Chapter 8, you learned about all the formatting attributes available for table elements in your
XHTML documents. It should come as no surprise that CSS has analogous properties to match
each of the table element attributes. However, the various CSS properties do not apply to tables
exactly like the element attributes. This chapter breaks down the CSS properties into their respec-
tive groups and shows you how to use them to format tables using CSS instead of tag attributes.
Because many of the table element’s attributes have not been deprecated in XHTML, you may be
tempted to embed all of your document’s table formatting within individual table tags. Resist that
temptation. Using tag attributes increases the editing difficulty of the document — each table using
tag attributes instead of CSS properties must be edited individually. If you use CSS properties
Chapter 17
instead, you can modify many tables by editing only one style (or a few styles). Furthermore, if you use
external style sheets, you can effect changes in multiple documents by editing only a few styles.
The next few sections detail the CSS properties for formatting tables.
Defining Borders
Tables use border properties to control the border of document tables and their subelements. For example,
to surround every table and their subelements with a single 1pt border, you could use a style definition
similar to the following:
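A definition of this kind, sketched here under the assumption that a solid black line is wanted, pairs the table selector with a descendant selector:

```css
/* The table itself plus every element inside it gets a 1pt border */
table, table * { border: 1pt solid black; }
```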
The results of this definition can be seen on the table shown in Figure 17-1.
Figure 17-1
Note that the style specifies the table element (table) as well as its descendants (table *) to ensure that
the table itself as well as all of its subelements receive a border. You can define your selectors in creative
ways to create unique borders — placing one style around cells, another around the table, a third around
the caption, and so on.
To use CSS to create table borders similar to borders created with a border=”1” attribute, you can use
styles similar to those in the following code, whose results are shown in Figure 17-2:
<style type="text/css">
/* More padding for legibility */
table td { padding: 5px; }
/* Formatting similar to border attribute */
table.attrib-similar { border: outset 1pt; }
table.attrib-similar td { border: inset 1pt; }
</style>
</head>
<body>
<p><table border="1">
<caption>border="1" attribute</caption>
<tr><td>Cell 1</td><td>Cell 2</td><td>Cell 3</td></tr>
<tr><td>Cell 4</td><td>Cell 5</td><td>Cell 6</td></tr>
</table></p>
<p><table class="attrib-similar">
<caption>CSS styles</caption>
<tr><td>Cell 1</td><td>Cell 2</td><td>Cell 3</td></tr>
<tr><td>Cell 4</td><td>Cell 5</td><td>Cell 6</td></tr>
</table></p>
</body>
</html>
Figure 17-2
The border-spacing property has the following syntax:
For example, the following definition will create more space between columns than between rows, as
shown in the table in Figure 17-3:
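A minimal sketch of such a definition follows. The pixel values are assumptions chosen for illustration; when two lengths are given, the first sets the horizontal spacing (between columns) and the second the vertical spacing (between rows).

```css
/* Illustrative values: 15px between columns, 5px between rows */
table { border-collapse: separate;
        border-spacing: 15px 5px; }
```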
Figure 17-3
The border-spacing property is not supported in current versions of Microsoft Internet Explorer.
Also note that this property works only in concert with the border-collapse property (described in
the next section) set to separate.
The table padding properties function exactly as they do with other elements. For example, to increase
the space between a table cell’s contents and its border, you could explicitly specify the padding value in
an appropriate style:
Collapsing Borders
As you may have noticed, the default CSS border handling leaves spaces between the borders of adjacent
elements. For example, consider the table in Figure 17-4.
When you want adjacent elements to collapse their borders into one border, you enter the border-
collapse property, which has the following syntax:
The two values do what you would expect; separate causes the borders of each element to be rendered
separately, spaced according to the user agent default or the border-spacing property’s value, while
collapse causes adjacent elements to be separated by one border. The table in Figure 17-4 is identical
to the table in Figure 17-3 except that the border-collapse property has been set to collapse, as
demonstrated in the following code:
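The property accepts the values separate and collapse. A sketch of a collapsed table follows; the border and spacing values are assumptions, not the original listing.

```css
/* Adjacent borders merge into one; any border-spacing value no longer has an effect */
table { border-collapse: collapse;
        border-spacing: 15px 5px; }
table, table td { border: 1pt solid black; }
```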
Figure 17-4
Notice how setting border-collapse to collapse causes the user agent to ignore the border-spacing
property.
When two adjacent elements have different borders, it is up to the user agent to decide which border to
render when the borders are collapsed. Typically, the most ornate border is chosen. For example, consider
the following code and the results shown in Figure 17-5:
</style>
</head>
<body>
<p><pre>
table, table td { border: solid 2pt black;
border-collapse: separate; }
table th { border: inset 3pt; }
</pre></p>
<p>
<table class="one">
<tr><th>Employee</th><th>Start Date</th><th>Next Review</th></tr>
<tr><td>Vicki S.</td><td>2/15/04</td><td>2/28/04</td></tr>
<tr><td>Teresa M.</td><td>11/15/03</td><td>3/31/04</td></tr>
<tr><td>Tamara D.</td><td>8/25/02</td><td>n/a</td></tr>
<tr><td>Steve H.</td><td>11/02/00</td><td>3/31/04</td></tr>
</table>
</p>
<p><pre>
table, table td { border: solid 2pt black;
border-collapse: collapse; }
table th { border: inset 3pt; }
</pre></p>
<p>
<table class="two">
<tr><th>Employee</th><th>Start Date</th><th>Next Review</th></tr>
<tr><td>Vicki S.</td><td>2/15/04</td><td>2/28/04</td></tr>
<tr><td>Teresa M.</td><td>11/15/03</td><td>3/31/04</td></tr>
<tr><td>Tamara D.</td><td>8/25/02</td><td>n/a</td></tr>
<tr><td>Steve H.</td><td>11/02/00</td><td>3/31/04</td></tr>
</table>
</p>
</body>
</html>
Figure 17-5
Notice how the border between the header row and first data row of the second table (Figure 17-5) is
inset. This is because the header row’s border was more ornate and won the conflict between the header
row and data row borders when collapsed.
The empty-cells property controls whether empty cells will have a border rendered for them or not.
This property has the following syntax:
As you would expect, setting the property to show (the default) will cause borders to be rendered, while
setting the property to hide will cause them to be hidden.
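The property takes the values show and hide. As a hedged sketch, the following hides the borders of empty cells; it relies on the property being inherited by the cells and on the table using separated borders.

```css
/* empty-cells is inherited, so setting it on the table reaches every cell */
table { border-collapse: separate;
        empty-cells: hide; }
```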
Current versions of Microsoft Internet Explorer disregard this property. To ensure that borders are
rendered around “empty” cells in Internet Explorer, you can insert nonbreaking space entities (&nbsp;) in
each otherwise empty cell.
Table Layout
Typically, the user agent is in charge of how to render the table to best fit its platform’s display based in
part on the contents of cells and in part on the default table rendering settings of the platform. You can
force the user agent to render the table using only the width values of its elements by using the table-
layout property. This property has the following syntax:
Setting the property value to auto (the default) allows the user agent to consider the contents of the
table cells when formatting the table. Setting the property’s value to fixed causes the user agent to dis-
regard the contents of the table and format it only according to explicit width values given within the
document (via CSS) or the table itself (CSS or tag attributes).
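A minimal sketch of a fixed-layout table follows. The class name and the widths are hypothetical.

```css
/* Hypothetical class; column widths come from explicit widths, not cell contents */
table.fixedgrid    { table-layout: fixed; width: 600px; }
table.fixedgrid td { width: 200px; }
```

Because the cell contents are not measured, long content can overflow a cell rather than widen its column.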
Setting the appropriate value will position the caption to the corresponding side of the table. If you wish to
change the default alignment (center) of the caption, you can use text alignment properties such as text-
align or vertical-align. The text alignment properties are covered in more depth in Chapter 14.
Note that you can use the text alignment properties to help control where a table is placed by placing the
table within paragraph tags that use appropriate text-align properties.
For example, the following definition will position the corresponding table’s caption to the right of the
table, with a left-justified, horizontal alignment and a top vertical alignment:
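Such a definition might be sketched as follows. Note the assumption involved: the left and right values come from CSS2, and many user agents honor only top and bottom, so test before relying on this.

```css
/* Caption to the right of the table, text left-justified and aligned to the top */
caption { caption-side: right;
          text-align: left;
          vertical-align: top; }
```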
Summary
This chapter covered the CSS formatting properties of tables. You learned what properties are available
and how most of them match up to the table element’s attributes. In most cases, you can accomplish the
same formatting with either method (CSS or attributes), though the use of CSS is strongly encouraged.
Because of the diversity possible with the various combinations of table properties, the examples in this
chapter only scratched the surface of formatting possibilities. However, using these examples, you should
be able to construct style definitions for just about any table-formatting chore.
Element Positioning
In Chapter 11 you learned how XHTML tables could be used to create document layouts, position-
ing elements in a grid-like pattern to format a document. The table layout method allows for fairly
diverse and complex layouts. However, CSS provides several sizing and positioning properties
that allow you much more control over your document. CSS-based document layout has several
other advantages, as well, especially when used in conjunction with other technologies (such as
Dynamic HTML, covered in Chapters 21 and 22). This chapter covers the various positioning,
sizing, and visibility properties available in CSS.
The following sections detail the various positioning methods (static, relative, absolute, and
fixed) available via the position property.
Not all user agents support all positioning models. If you choose to use a positioning method, you
should test your code on all platforms you wish to support.
Static Positioning
Static positioning is the default positioning model used if no other method is specified. This
method causes elements to be rendered within their in-line or other containing block, placed in
the document as the user agent would normally flow them. The three paragraphs shown in
Figure 18-1 are all positioned statically, though the second paragraph has its positioning model
explicitly defined with the following style:
Chapter 18
p.static { width: 400px; height: 200px;
border: 1pt solid black;
position: static;
}
Figure 18-1
Several styles have been added to the demonstration paragraphs within this section to help illustrate the
difference in positioning models. Sizing and border properties have been implemented to help visualize
the paragraphs’ position. Sizing properties are covered later in this chapter, and border properties are
covered in Chapter 15.
Other than the sizing and border properties added for clarity, the position of the paragraph is similar to
its position using no styles — hence the default static positioning — as shown in Figure 18-2, which has
no styles added to the second paragraph.
Relative Positioning
Relative positioning moves an element from its normal position by using measurements relative to that
normal position. For example, you can nudge an element a bit to the right of its normal position by set-
ting the positioning model to relative and specifying a value for the left edge of the element, as in the
following example:
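A minimal sketch, assuming a hypothetical class name:

```css
/* Shifts the paragraph 25 pixels to the right of its normal position */
p.nudged { position: relative;
           left: 25px; }
```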
Figure 18-2
This example places the left edge of the paragraph 25 pixels to the right of where it would have been
placed using static positioning.
When specifying relative measures, you can use the side properties (top, left, bottom, and right)
to move the corresponding side of the element. Any unspecified sides of the element will be positioned
according to other factors affecting their position — their size, margins, neighboring elements, bounding
box, and so on.
Figure 18-3 shows an example of relative positioning; the second paragraph has been moved down and
to the right using the following styles:
Note that the movement of the second paragraph causes it to overlap (and cover) the text of the third
paragraph. Using layer properties, you can control which paragraph ends up on top. (Element layer
properties are covered in the “Element Layers” section later in this chapter.) This example also intro-
duces element transparency; without a defined background color, the top element has a transparent
background, allowing elements beneath it to show through.
Figure 18-3
Also note that the user agent does not flow elements into the hole created due to the element(s) being
repositioned — the third paragraph remains in the position it would occupy had the second paragraph
not been repositioned.
Absolute Positioning
Absolute positioning uses absolute measures to position an element in relation to its containing block,
which corresponds to the view port of the user agent when no ancestor element is itself positioned. The
normal (static) position of the element is not taken into account when this positioning method is used.
The second paragraph in Figure 18-4 has been positioned using absolute positioning with the following
styles:
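A sketch of such styles follows. The class name is hypothetical, and the size and border values are assumed to mirror the static example earlier in this chapter.

```css
/* Upper-left corner placed 50 pixels down and 50 pixels right of the origin */
p.absolute { width: 400px; height: 200px;
             border: 1pt solid black;
             background-color: white;
             position: absolute;
             top: 50px; left: 50px; }
```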
A white background has been added to the demonstration paragraph — overriding the transparent back-
ground — to help clarify its position in the document.
Figure 18-4
Note that the upper-left corner of the user agent view port is referenced as zero; the preceding code
results in the paragraph’s upper-left corner being positioned 50 pixels down and to the right from the
upper-left corner of the user agent view port. Also note that absolute positioning removes elements from
the normal flow of the document; the user agent flows neighboring elements as though the repositioned
element did not exist. As with other methods, the repositioned element floats to the top layer of the ele-
ment stack, overlapping elements below it.
Absolute positioning specifies only the initial position of an element when it is rendered. If the user
agent scrolls its view port or the display otherwise changes, the element will move accordingly. See the
next section on fixed positioning for a method to fix an element in place.
Fixed Positioning
Although not immediately evident, elements repositioned using other positioning methods are still
subject to the flow of the document and scrolling of the user agent’s view port. For example, consider
Figures 18-5 and 18-6, which both show an element repositioned using absolute positioning. The view
port in Figure 18-6 has been scrolled down a bit, causing the repositioned element to scroll accordingly.
Figure 18-5
Figure 18-6
Using fixed positioning, you can force an element to retain its initial position despite any movement of
the user agent’s view port. For example, Figures 18-7 and 18-8 show the same document as the previous
two figures (the document has been scrolled a bit in Figure 18-8). However, in the following two figures,
the repositioned paragraph uses fixed positioning defined with the following styles:
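A minimal sketch of a fixed-position style, with a hypothetical class name and assumed offsets:

```css
/* Stays 50 pixels from the top and left of the view port, even while scrolling */
p.fixed { width: 400px; height: 200px;
          border: 1pt solid black;
          background-color: white;
          position: fixed;
          top: 50px; left: 50px; }
```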
As you can see in Figure 18-8, despite the document being scrolled in the user agent, the repositioned
element retains the position defined by its style thanks to the fixed-positioning method.
Figure 18-7
Figure 18-8
The specified side of the element (top, right, bottom, or left) is the side used to position the element. The
element’s other properties (size, borders, and so on) determine the position of the sides not explicitly
positioned. The positioning method being employed also plays a role in the actual position of the ele-
ment (see the previous section).
For absolutely positioned elements, the side values are related to the element’s containing block. For rel-
atively positioned elements, the side values are related to the outer edges of the element itself.
For example, the following styles result in positioning an element 50 pixels down from its normal position
in the document flow:
Using percentages causes the user agent to position an element according to a percentage of the size of
its containing block. For example, to move an element left by 50% of that width, the following style can
be used, whose result is shown in Figure 18-9:
Figure 18-9
As you might expect, changing the positioning model changes the effect of the positioning, as shown
with the following code whose result appears in Figure 18-10:
In this example, the right side of the element is positioned at the 50% mark (center) of the user agent
view port because the positioning method is specified as absolute.
Positioning alone can also drive an element’s size. For example, the following code will result in para-
graph elements being scaled horizontally to 25% of the view port, the left side of each positioned at the
25% horizontal mark, and the right at the 50% horizontal mark.
p { position: absolute;
left: 25%; right: 50%; }
However, when left, right, and width are all specified, the element is over-constrained and the user agent
must drop one value. In left-to-right documents it is the right value that is ignored, regardless of the
order of the properties. So, the following definition will result in paragraph elements that have their
left side positioned at the view port’s horizontal 25% mark, but each will be 400 pixels wide (despite the
size of the view port) because the width property overrides the setting of the right property:
p { position: absolute;
left: 25%; right: 50%;
width: 400px; }
Properties explicitly defining an element’s size are covered in the “Controlling an Element’s Size” sec-
tion later in this chapter.
Floating Elements
Occasionally, it is useful to float an element outside of the normal flow of a document’s elements. When
elements are floated, they are removed from the normal flow and are placed against the specified margin
of the user agent’s view port.
The float property is used to control the floating behavior of elements and has the following syntax:
The default behavior of elements is none — the element is positioned in the normal flow of elements. If
the float property is set to right, the element is floated to the right margin of the user agent’s view
port; if the float property is set to left, the element is floated to the left margin.
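The property accepts left, right, and none. As a hedged sketch (the class name and the margin are assumptions), an image can be floated left so that neighboring text flows down its right side:

```css
/* Hypothetical class; the margin keeps the flowing text from touching the image */
img.floatleft { float: left;
                margin-right: 5px; }
```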
For example, the sphere image in Figure 18-11 is not floated; it appears in the position where it is placed
in the document’s code — in-line with neighboring elements.
Figure 18-11
The same image appears in Figure 18-12 with the following style applied:
Figure 18-12
Neighboring elements flow around floated elements instead of being rendered in-line with them. The
flow is still subject to appropriate margin and other values of the associated elements.
If you do not want elements to flow around neighboring floating elements, you can use the clear prop-
erty to inhibit this behavior. The clear property has the following syntax:
Setting clear to left or right will ensure that the affected element is positioned after any floated ele-
ments on the specified side so that they will not flow around them. Setting clear to both ensures that
both sides of the element are clear of floaters. Setting clear to none (the default) allows elements to
flow normally around floating elements.
Headings are one type of element that can benefit from the clear property’s setting; typically, you
would want headings to avoid flowing around floating elements. You can use the following style to
ensure that headings avoid flowing around floating elements:
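One way to write such a style, sketched here for all six heading levels:

```css
/* Headings start below any floated elements on either side */
h1, h2, h3, h4, h5, h6 { clear: both; }
```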
As with most properties, you can use various metrics to specify an element’s size. For example, the fol-
lowing style specifies that an element should be rendered 100 pixels square:
The following code sets an element to 150 percent of its normal width:
Specifying auto causes the element’s dimension (width or height) to be sized according to its contents or
other relevant properties.
❑ min-height
❑ max-height
❑ min-width
❑ max-width
For example, if you want an element to be at least 200px square, you could use the following style:
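A minimal sketch, assuming a hypothetical class name:

```css
/* The element may grow beyond these values but never shrinks below them */
p.atleast { min-width: 200px;
            min-height: 200px; }
```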
Controlling Overflow
Whenever you take the chore of sizing elements away from the user agent, you run the risk of element
contents overflowing the size of the element. The overflow property can be used to help control what
the user agent does when content overflows an element. This property has the following syntax:
This property controls what the user agent should do with the content that overflows. The visible value
ensures that all the content remains visible, even if it must flow outside the bounds of the element's box.
The hidden value clips any content that overflows, making it inaccessible to the user.
The scroll value clips overflowing content and provides scroll bars so that the user can reach it; most
user agents display the scroll bars whether or not content actually overflows. Lastly, the auto value
leaves the behavior to the user agent, which typically adds scroll bars only when content overflows.
Figure 18-13 shows an example of the first three values of the overflow property.
Figure 18-13
As with many CSS properties, support for the overflow property isn’t consistent. If you rely on this
property in your code, you should test it on all intended platforms.
Element Layers
CSS also supports a third dimension for elements, allowing you to control what elements are placed on
top of what elements. You can usually anticipate how elements will stack and leave control of the stack-
ing up to the user agent. However, if you want more control over the element stack, you can use the
z-index property to specify an element’s position in the stack.
Named for the stacking dimension (z-axis), this property has the following syntax:
If an integer is specified as the value for the z-index property, the affected elements will be stacked
accordingly: elements with higher z-index values are stacked on top of elements with lower z-index
values. Note that z-index affects only positioned elements.
Figure 18-14 provides an illustrative example of how elements stack given different z-index values,
using the following code:
</style>
</head>
<body>
<p><b>z-index: 0 (default)</b><br />
Lorem ipsum dolor sit amet, consectetur
adipisicing elit, sed do eiusmod tempor incididunt ut labore et
dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis
aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
cupidatat non proident, sunt in culpa qui officia deserunt
mollit anim id est laborum.</p>
<p class="zleveltwo"><b>z-index: 2</b><br />
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit
in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
Excepteur sint occaecat cupidatat non proident, sunt in culpa qui
officia deserunt mollit anim id est laborum.</p>
Figure 18-14
Note that the elements have an explicitly coded background color (white). This is because overlapping
objects (and, therefore, also those stacked using the z-index property) have a transparent background
by default, so elements under them can be seen through the transparency. If the background-color
setting is omitted, the document will render similarly to the one shown in Figure 18-15.
Figure 18-15
Controlling Visibility
You can use the visibility property to control whether an element is visible in the user agent. This
property has the following syntax:
The first two values accomplish exactly what their names imply: visible makes the affected element(s)
visible in the user agent, and hidden hides them (makes them invisible). Setting this property to the
value collapse will have the same effect as hidden on any element except for table columns or rows. If
this value is used on a table row or column, the content in that column or row is removed from the table
and that space is made available for other content.
You can use positioning, sizing, and visibility properties with JavaScript to create impressive anima-
tions in your document. See Chapters 28 and 29 for examples of this and other automation techniques
using Dynamic HTML.
Summary
This chapter covered the basics of CSS-based document layout. Using the positioning properties, you can
accurately position elements within your documents. Combining the positioning properties with sizing
and other visible CSS properties, you can create complex layouts with ease. As you will see in the next
section, JavaScript works well with CSS and can be used for further control, automation, and animation of
your documents.
JavaScript Basics
Up to this point in the book, you have learned only about static technology for Web documents.
Starting in this section, you will learn about technologies that can be used to dynamically deliver
content and manipulate content based on various criteria. This section of the book covers
JavaScript, a mainstay of XHTML document scripting. This chapter covers the origins of
JavaScript, its typical uses, and methods to incorporate scripts into your documents.
History of JavaScript
In the early days of the World Wide Web, it became obvious to the Netscape team that rudimentary
scripting would improve the medium greatly. JavaScript was created in 1995 and released with
Netscape 2.0 to fill that role, and it is still the most popular scripting language used on the Web. The
Netscape team that created JavaScript was the same team that created the Netscape browser,
a team that understood the innovations the Web was bringing to the Internet. The scripting language
was designed to be integrated into the user agent, to parse details of any document the agent
rendered, and to effect changes in some elements. The parsing was possible because
of the time the team took to construct the Document Object Model (DOM), a method to access
document links, anchors, forms and form objects, and other objects.
Shortly after the base language was constructed, it was turned over to the European Computer
Manufacturers Association (ECMA) for standardization. The ECMA produced the ECMAScript
standard, which embodied most of the features and capabilities of the JavaScript language.
Additional capabilities were added to JavaScript over the next few years and matured as other
technologies (such as CSS) matured as well.
Despite its naming, JavaScript is not Java. It inherited the moniker “Java” due to many similarities
with Sun’s Java language. However, the similarities today are slight — noticeable to most pro-
grammers, but slight.
Different Implementations
Unfortunately, as with many of the Web technologies, development of JavaScript and the DOM fractured
over the next few years as other entities adopted and expanded the technologies. The largest gap was
created around the DOM, an area of JavaScript long neglected by the Netscape team.
Contrary to popular belief, Microsoft did not initially fork the DOM for its own benefit. The DOM that
existed in JavaScript around the release of Internet Explorer was plagued with bugs and was poorly
implemented. The open source community and other user agent programmers banded together and
embraced a DOM guideline constructed by the World Wide Web Consortium. The benefits of forking
the DOM from the flawed Netscape implementation were seen and seized by many.
Despite the existence of a well-known standard, several entities have made subtle changes to their
JavaScript implementations. Microsoft, for example, has tweaked its version of JavaScript (JScript),
creating quite a few inconsistencies in implementation and use.
Unfortunately, this causes many problems for the scripting programmer who must create code that
works on the majority of user agents.
The following table outlines the Netscape, Internet Explorer, and Mozilla support for the different
Document Object Model specifications.
Most user agents report some of their capabilities via their headers. However, this information cannot be
relied on — some user agents can be configured as to what information to report and implementations
vary. The easiest way to determine a user agent’s capabilities is to test for specific capabilities. For example,
the following code tests for the layer and W3C DOM so that you can program your scripts accordingly:
if (document.all) {
  // IE4+ code (IE4+ uses the document.all collection)
} else if (document.layers) {
  // NS4+ code (Older versions of Netscape use the layers collection)
} else if (document.getElementById) {
  // NS6+ code and IE5+ (The latest browsers use getElementById)
}
Implicit in the preceding code is the support for the getElementById function. This is one break
afforded to script programmers; using this one function, you can retrieve a reference to an element by
its assigned ID and manipulate the element through that reference. Of course, only the current crop of
browsers (Netscape 6+/Firefox and Internet Explorer 5+) supports getElementById.
Even with those caveats taken into account, there are still many uses for JavaScript, including the
following:
❑ Form verification. JavaScript can parse form data prior to the data being submitted to a handler
on the server, ensuring that there are no obvious errors (missing or improperly formatted data).
Form verification is covered in Chapter 22.
❑ Document animation and automation. Accessing element data and properties via the DOM,
JavaScript can effect changes in elements’ content, appearance, size, position, and so forth. By
doing so, simple scripts can create simple animations — menu items that change color, images
and text that move, and so on. Rudimentary dynamic content can also be achieved via
JavaScript — custom content can be generated according to other behavior initiated by the user.
Dynamic HTML (the name for the technology of JavaScript manipulating document objects and
elements) is covered in Chapters 23 and 24.
❑ Basic document intelligence. As mentioned in the preceding bullet, JavaScript can initiate
changes in documents based on other, dynamic criteria. Using JavaScript you can embed a base
level of intelligence and an extra layer of user interface in your documents. Elements can be
linked to scripts via events (onclick, onmouseover, and so on) to help the user better use your
documents. Sample uses of triggers can be found in Chapter 22.
Chapter 19
<script type="MIME_type">
...script content...
</script>
Current versions of XHTML support the following Multipurpose Internet Mail Extensions (MIME) types
for the <script> tag:
❑ text/ecmascript
❑ text/javascript
❑ text/jscript
❑ text/vbscript
❑ text/vbs
❑ text/xml
Of course, for JavaScript you will want to use the text/javascript MIME type; the rest of the MIME
types are for other scripting options and languages.
The <script> tag also supports the optional attributes listed in the following table:
Note the versatility inherent in the availability of the src attribute. If you have scripts that you want to
make available to several documents, you can place them in an external document and include them via
a tag similar to the following:
Then, whenever you need to change a script used across multiple documents, you have to edit it only
once, in the external file.
Execution of Scripts
Unless otherwise instructed, user agents execute JavaScript code as soon as it is encountered in a docu-
ment. The exceptions to this rule include the following:
For example, the first code segment in the following code listing will not be executed until an event
explicitly calls the function enclosing the code, while the second code segment will be executed as soon
as it is encountered by the rendering user agent:
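The difference can be sketched as follows. This is a minimal illustration rather than the original listing; the names showGreeting, callCount, and loadedAt are hypothetical.

```javascript
var callCount = 0; // hypothetical counter used only for illustration

// First segment: the body runs only when showGreeting() is called,
// for example from an event attribute.
function showGreeting() {
  callCount++;
  return "Hello";
}

// Second segment: this statement runs as soon as it is encountered.
var loadedAt = new Date();
```

Until something calls showGreeting(), callCount stays at 0, while loadedAt already holds a value.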
Keep in mind that you can force a script to run immediately after the document is loaded by using the
onload event in the <body> tag:
<body onload="script_function_name()">
Short scripts can also be embedded directly within event attributes, as in the following example:
Events are covered in more detail in Chapter 20 and shown in examples in Chapter 21.
Summary
This chapter began the coverage of scripting within this book. This section concentrates on JavaScript,
starting with the basics of how JavaScript can be incorporated into your documents and working toward
using the language to manipulate a document’s object model and using JavaScript for animation and
other advanced uses.
The JavaScript Language
The previous chapter introduced you to JavaScript. This chapter dives into the language itself, out-
lining the language’s syntax, structure, functions, objects, and more. Note that this chapter focuses
on the ins and outs of the language, not specific uses thereof. Subsequent chapters in this section
will introduce you to specific ways to use the language in your documents.
❑ All code should appear within appropriate constructs, namely between <script> tags or
within event attributes. (Events are discussed in the “Event Handling” section later in this
chapter.)
❑ With few exceptions, code lines should end with a semicolon (;). Notable exceptions to
this rule are lines that end in a block delimiter ({ or }).
❑ Blocks of code (usually under control structures such as functions, if statements, and
so on) are enclosed in braces ({ and }).
❑ Although it is not absolutely necessary, explicit declaration of all variables is a good idea.
❑ The use of functions to delimit code fragments is recommended; it increases the ability to
execute those fragments independently from one another and encourages reuse of code.
(Functions are discussed in the “User-Defined Functions” section later in this chapter.)
❑ Comments can be inserted in JavaScript code by prefixing the comment with a double-
slash (//) or surrounding the comment with /* and */ pairs. In the former case (//), the
comment ends at the next line break. Use the latter case for multiline comments.
Chapter 20
Data Types
JavaScript, like most other programming languages, supports a wide range of data types. However,
JavaScript employs little data type checking. It doesn’t care too much about what you store in variables
or how you use the stored data within your scripts. As such, it becomes important to monitor your data
types as you write your scripts to ensure that you pass the appropriately typed data to functions and
methods.
❑ Booleans
❑ Integers
❑ Floating-point numbers
❑ Strings
❑ Arrays
JavaScript also supports objects — an unordered container of various data types. Objects are covered in
the “Objects” section later in this chapter.
There is little difference between JavaScript’s integer and floating-point data types because numbers are
stored in floating-point format. The JavaScript standard specifies the upper and lower bounds of
floating-point numbers as ±1.7976931348623157E+308, but the user agent implementation may vary. If you need
to ensure that a value is an integer, use the parseInt() function accordingly. (Functions are covered
later in this chapter — see Appendix C for reference coverage of JavaScript functions.)
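As a rough sketch of these points, the following lines read the largest representable number from the built-in Number.MAX_VALUE property and use parseInt() to obtain integers. The variable names and sample values are assumptions made for illustration.

```javascript
var largest = Number.MAX_VALUE;           // upper bound of the number type
var whole = parseInt("42.9 apples", 10);  // parsing stops at the first non-digit
var fromFloat = parseInt(19.99, 10);      // the fractional part is dropped
```

The second argument (10) tells parseInt() to treat the text as a decimal number.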
JavaScript arrays can contain a mix of the various data types and are declared using the new operator in
your declaration statement, as in the following:
a = new Array();
If you want, you can also define the array’s values at declaration time by specifying the values within
the Array() declaration, as shown in the following code:
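A minimal sketch of such a declaration, using hypothetical values of mixed types, might look like this:

```javascript
// Hypothetical values; an array can mix data types
var mixed = new Array("Steve", 20, true);
var first = mixed[0];      // elements are numbered from zero
var count = mixed.length;  // number of elements in the array
```

Be aware that a single numeric argument, as in new Array(5), sets the length of the array rather than storing the value 5.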
You can similarly use the String() and Number() functions within your declarations to ensure that a
variable is declared as a specific type. For example, to declare the variable s as a string, you could use
the following:
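One possible form is sketched here; the values are hypothetical, and the conversion (non-constructor) form of String() and Number() is assumed.

```javascript
var s = String("Hello");  // s holds a string value
var n = Number("42");     // converts the text "42" to the number 42
var t = String(3.5);      // converts the number 3.5 to the text "3.5"
```

Calling new String("Hello") instead creates a String object wrapper rather than a plain string value.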
Explicitly setting a variable’s type helps ensure that it will always contain the data you expect it to. If
left to its own devices, JavaScript will adapt a variable’s type as needed, perhaps causing rounding
errors or more grievous mistakes.
Variables
JavaScript variable names are case sensitive and can contain letters, digits, underscores, and dollar
signs, although they cannot begin with a digit. You should take care to avoid variable names that use
JavaScript reserved words. Unlike some other popular scripting languages, JavaScript does not require
a special prefix character, such as a dollar sign ($), to identify its variables; they are referenced by
their names only.
You may wish to use a naming convention for your variables, for example one that describes what sort
of data the variable will hold. There are several different naming methods and schemes you can use.
Although the preference of which you use is subjective, you would do well to pick one and stick with it.
One common method is Hungarian notation, where each variable name begins with a short prefix
(three letters in one common variant) indicating the data type.
Appendix C covers built-in JavaScript operators, functions, objects, and more. This appendix can be
used as a source of JavaScript reserved words.
JavaScript uses the var statement to explicitly declare variables. As previously mentioned, explicit decla-
ration of variables is not necessary in JavaScript. However, it is good practice to do so.
You can declare multiple variables within one var statement by separating the variable declarations
with commas. The var statement also supports assigning an initial value to the declared variables. The
following three lines are all valid var statements:
var x;
var x = 20;
var x = 20, firstname = "Steve";
JavaScript variables have global scope unless they are declared within a function, in which case they
have local scope within the function in which they were declared. For example, in the following code the
variable x is global while the variable y is local to the function:
<script type="text/JavaScript">
var x = 100;
function spacefill(text,amount) {
  var y = 0;
  ...
}
</script>
Note that variables with global scope transcend <script> sections; that is, variables declared within
one <script> section are accessible in other <script> sections within the document.
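The scope rules can be sketched as follows. The function name showScope and the values used are hypothetical.

```javascript
var x = 100; // global: visible to all code in the document

function showScope() {
  var y = 5;     // local: exists only while showScope() runs
  return x + y;  // the global x is readable inside the function
}

var total = showScope();  // 105
var yOutside = typeof y;  // "undefined": y is not visible out here
```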
String Operators
Operator Use
+ Concatenation
String Tokens
Token Character
\\ Backslash
\' Single quote
\" Double quote
\b Backspace
\f Form Feed
\n Line Feed
\r Carriage Return
\t Horizontal Tab
\v Vertical Tab
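A few of these tokens in use, as a short sketch with hypothetical string values:

```javascript
var quoted = "She said \"hello\"";  // embedded double quotes
var path = "C:\\temp";               // one backslash in the stored string
var twoLines = "first\nsecond";      // line feed between the two words
var parts = twoLines.split("\n");    // split the text at the line feed
```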
Control Structures
Like many other languages, JavaScript supports many different control structures that can be used to
execute particular blocks of code based on decisions or repeat blocks of code while a particular condition
is true. The following sections cover the various control structures available in JavaScript.
Do While
The do while loop executes one or more lines of code as long as a specified condition remains true. This
structure has the following format:
do {
// statement(s) to execute
} while (<expression>);
Due to the expression being evaluated at the end of the structure, statement(s) in a do while loop are
executed at least once. The following example will loop a total of 20 times — incrementing the variable x
each time until x reaches the value 20:
var x = 0;
do {
x++; // increment x
} while (x < 20);
While
The while loop executes one or more lines of code while a specified expression remains true. The while
loop has the following syntax:
while (<expression>) {
// statement(s) to execute
}
Because the <expression> is evaluated at the beginning of the loop, the statement(s) will not be
executed if the <expression> is false at the beginning of the loop. For example, the following loop will
execute 20 times, each iteration of the loop incrementing x until it reaches 20:
var x = 0;
while (x < 20) { // repeat while x is less than 20 (will not execute when x = 20)
  x++; // increment x
}
The <initial_value> expression is evaluated once, before the first iteration of the loop. The
<condition> is evaluated at the beginning of each loop iteration. If the condition returns true, the
current iteration is executed; if the condition returns false, the loop exits and the script execution
continues after the loop’s block. At the end of each loop iteration, the <loop_expression> is evaluated.
Although their usage can vary, for loops are generally used to step through a range of values (such as
an array) via a specified increment. For example, the following example begins with the variable x equal
to 1 and exits when x equals 20; each loop iteration increments x by 1:
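A minimal sketch of such a loop follows, assuming the usual for (<initial_value>; <condition>; <loop_expression>) layout. The iterations counter is a hypothetical addition used only to show how many times the block runs.

```javascript
var iterations = 0; // hypothetical counter for illustration
for (var x = 1; x < 20; x++) {
  iterations++;     // runs for x = 1 through 19
}
// the loop exits once x equals 20
```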
Note that the <loop_expression> is not limited to an increment expression. The expression should
advance the appropriate values toward the exit condition but can be any valid expression. For example,
consider the two following snippets of code:
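Two such loops might look like the following sketch. The array names and limits are hypothetical; the first loop steps by two and the second multiplies instead of adding.

```javascript
var evens = [];
for (var i = 0; i <= 10; i += 2) {    // step by two instead of one
  evens.push(i);
}

var powers = [];
for (var p = 1; p <= 100; p *= 10) {  // multiply instead of add
  powers.push(p);
}
```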
Another variation of the for loop is the for in loop. The for in loop executes statement(s) while
assigning to a variable, in turn, the name of each property of an object or the index of each element of
an array. For example, the following code will assign the variable i the index of each element in the
names array:
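As a sketch with a hypothetical names array, note that the loop variable receives each index, so the element itself is read as names[i]:

```javascript
var names = new Array("Steve", "Terri", "Jill"); // hypothetical values
var seen = [];
for (var i in names) {
  // i holds the index ("0", "1", "2"); names[i] is the element
  seen.push(i + "=" + names[i]);
}
```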
The following loop code will assign i the name of each property of the document object:
for (i in document) {
// statement(s) to execute
}
If Else
The if and if else constructs execute a block of code depending on the evaluation (true or false) of an
expression. The if construct has the following syntax:
if (<expression>) {
// statement(s) to execute if expression is true
} [ else {
// statement(s) to execute if expression is false
} ]
For example, the following code tests if the value stored in i is the number 2:
if (i == 2) {
// statement(s) to execute if the value in i is 2
}
The following code will execute one block of code if the value of i is an odd number, another block of
code if the value of i is an even number:
if ((i % 2) != 0) {
// statement(s) to execute if i is odd
} else {
// statement(s) to execute if i is even
}
You can also use complex expressions in an if statement, as in the following example:
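A sketch of one such expression, using hypothetical values for i and firstname, shows how grouping changes the result when or/and logic is mixed:

```javascript
var i = 12, firstname = "Terri"; // hypothetical values
var matched = false;

if ((i == 12 || i == 24) && firstname == "Steve") {
  matched = true; // not reached: firstname is not "Steve"
}

// Without the inner parentheses, && is evaluated before ||
var ungrouped = i == 12 || i == 24 && firstname == "Steve";
```

Here matched stays false, while ungrouped is true because the expression is read as i == 12 || (i == 24 && firstname == "Steve").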
Note the use of the parentheses in the previous example. You can use parentheses to explicitly define the
precedence in the expression — important when using or/and logic.
In addition, you can create else if constructs in JavaScript by nesting if statements within one
another, as shown in the following code:
if ((i % 2) != 0) {
  // statement(s) to execute if i is odd
} else {
  if (i == 12) {
    // statement(s) to execute if i is 12
  }
}
However, in most cases where you are comparing against one variable, using switch (covered in the
next section) is a better choice.
Switch
The switch construct executes specific block(s) of code based on the value of a particular expression.
This structure has the following syntax:
switch (<expression>) {
case <value_1>: {
// statement(s) to be executed if <expression> = <value_1>
break; }
case <value_2>: {
// statement(s) to be executed if <expression> = <value_2>
break; }
...
default: {
// statement(s) to be executed if <expression> does not match any other case
}
}
For example, the following structure will perform the appropriate code based on the value of firstname:
switch (firstname) {
  case "Steve": {
    // statement(s) to execute if firstname = "Steve"
    break; }
  case "Terri": {
    // statement(s) to execute if firstname = "Terri"
    break; }
  default: {
    /* statement(s) to execute if firstname does not
       equal "Steve" or "Terri" */
  }
}
Note that the switch statement is an efficient structure to perform tasks based on the value of one
variable — much more efficient than a series of nested if statements.
Note that the break statements and the default section are optional. If you omit the break statements,
the matching case section and every section after it will be executed. For example, in the preceding
code section, if the breaks were removed and firstname was equal to “Steve,” the code in all sections
(“Steve,” “Terri,” and default) would execute.
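This fall-through behavior can be observed with a small sketch. The function visitedSections is hypothetical; it records which sections run when the break statements are left out.

```javascript
function visitedSections(firstname) {
  var sections = [];
  switch (firstname) {
    case "Steve":
      sections.push("Steve");   // no break: execution falls through
    case "Terri":
      sections.push("Terri");   // no break here either
    default:
      sections.push("default");
  }
  return sections.join(",");
}
```

Calling visitedSections("Steve") runs all three sections, while visitedSections("Terri") starts at the second section.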
Note that the switch construct can only be used to compare against one value. If you need to make
decisions based on several different values, use a nested if construct instead.
For example, the following code will skip processing the number 7, but all other numbers between 1 and
20 will be processed:
var x = 0;
while (x < 20) {
  x++; // advance x first so that continue cannot stall the loop
  if (x == 7) continue; // skip the number 7
  // statement(s) to execute if x does not equal 7
}
In the following code, the loop will be exited if the variable x ever equals 100 during the loop’s execution:
var y = 1;
while (y <= 20) {
  if (x == 100) break; // if x = 100, leave the loop
  // statement(s) to execute
  y++; // advance y toward the exit condition
}
// execution continues here when y > 20 or x = 100
The break statement can also be used with labels to specify which loop should be broken out of. (See the
next section, “Labels,” for more information on labels.)
Labels
JavaScript supports labels, which can be used to mark statements so that break and continue statements
nested inside them can refer to them. Labels have the following format:
<label_name>:
To use a label, place it before the statement you wish to identify with the label. For example, the follow-
ing references the while loop with the label code_loop:
var x = 100;
code_loop:
while (x <= 1000) {
// statement(s)
}
You can reference labels using the break statement to exit structures outside of the current structure. For
example, both loops in the following code will be broken out of if the variable z ever equals 100:
var x = 0, y = 0;
top_loop:
while (x <= 100) {
while (y <= 50) {
...
if (z == 100) break top_loop;
}
}
/* execution resumes here after loops are complete or if
z = 100 during the loops execution */
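A complete, runnable variation of this pattern is sketched below. The calculation of z is a hypothetical stand-in for whatever work the loops perform.

```javascript
var x = 0, y = 0, z = 0;
top_loop:
while (x <= 100) {
  y = 0;
  while (y <= 50) {
    z = x * y;                     // hypothetical calculation
    if (z == 100) break top_loop;  // leaves both loops at once
    y++;
  }
  x++;
}
```

With these values, both loops end as soon as x is 2 and y is 50, long before the outer condition becomes false.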
You can also use labels to mark blocks of code, as in the following example:
code_block: {
// block of code here
}
The break statement can then be used to break out of the block, if necessary.
Built-in Functions
JavaScript provides a few built-in functions for data manipulation. Most of the built-in functions exist to
convert data between the various data types and to check if data is of a particular type. The following
table lists the supported functions:
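A few of the standard conversion and checking functions are sketched here with hypothetical sample values:

```javascript
var price = parseFloat("19.95 USD");  // leading numeric text is converted
var bad = parseInt("abc", 10);        // no digits to convert: result is NaN
var isBad = isNaN(bad);               // isNaN() detects the failed conversion
var finite = isFinite(1 / 0);         // division by zero yields Infinity
```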
User-Defined Functions
JavaScript supports user-defined functions. User-defined functions allow you to better organize your
code into discrete, reusable chunks.
For example, the following function will space-fill the string passed to it to 25 characters and return the
filled string:
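One way such a function could be written is sketched below. It assumes the spaces are appended on the right; the choice of side and the variable name filled are assumptions.

```javascript
function spacefill(text) {
  var filled = String(text);
  while (filled.length < 25) {
    filled = filled + " ";  // append spaces until 25 characters long
  }
  return filled;
}
```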
Elsewhere in your code, you can use this function similarly to the following:
address = spacefill(address);
This would cause the variable address to be space-filled to 25 characters and reassigned to itself.
Strictly speaking, the return statement is optional. However, it is usually a good idea to at least
include a status code return (success/fail) for all your functions.
The arguments passed to a function can be of any type. If multiple arguments are passed to the function,
separate them with commas in both the calling statement and function definition, as shown in the fol-
lowing examples:
Calling syntax:
spacefill(address, 25)
Function syntax:
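A minimal sketch of a matching two-argument definition follows. It assumes right-side padding, and the sample address is hypothetical.

```javascript
function spacefill(text, amount) {
  var filled = String(text);
  while (filled.length < amount) {
    filled += " ";  // pad until the requested length is reached
  }
  return filled;
}

var address = spacefill("123 Main St.", 25); // hypothetical address
```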
Note that the number of arguments in the calling statement and in the function definition should match.
If you supply fewer arguments than the number expected by the function, the remaining arguments will
be undefined. If you specify more arguments than the number expected by the function, the extra
values are not assigned to any named argument.
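The effect of a mismatch can be seen in this short sketch; the function describe is hypothetical and simply reports the type of each argument it received.

```javascript
function describe(a, b) {
  return typeof a + "," + typeof b;
}

var tooFew = describe(1);             // b is left undefined
var tooMany = describe(1, "two", 3);  // the third value has no named argument
```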
The variables used by the function for the arguments and any other variables declared and used by the
function are considered local variables — they are inaccessible to code outside the function and exist
only while the function is executing.
Objects
JavaScript is an object-driven language. As you will see in Chapter 21, “The Document Object Model,”
the user agent supplies a host of objects that your scripts can reference. However, you will encounter
many objects built into JavaScript that are outside of the Document Object Model.
Built-in Objects
JavaScript has several built-in objects. For example, two specific objects exist for manipulating data:
one for performing math operations (Math) on numeric data and another for performing operations on
string values (String).
These objects have various methods for acting upon data. For example, to find the square root of vari-
able x, you could use the Math.sqrt method:
Or, to convert a string to lowercase, you could use the String.toLowerCase() method:
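Both calls are sketched here with hypothetical values:

```javascript
var x = 144;
var root = Math.sqrt(x);      // square root of x
var s = "Hello World";
var lower = s.toLowerCase();  // returns a new, lowercase string
```

Note that toLowerCase() does not change s itself; it returns the converted text.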
JavaScript supports the with statement. Using the with statement can facilitate using multiple
methods of the same object, as shown in the following code:
with (Math) {
  y = random() * 200; // random() returns a value from 0 up to (but not including) 1
  x = round(sqrt(y));
}
The same code without using the with statement would look like the following code:
y = Math.random() * 200;
x = Math.round(Math.sqrt(y));
Although the Math object was referenced only three times in the code, you can see how repeatedly refer-
encing the object could get tedious when constructing complex mathematical operations.
Another very useful object is the Date object. This object has several methods that can be used to manip-
ulate dates and times in various formats. For example, the following code will output the current date
(in month, day, year format) wherever it is placed in the document:
<script type="text/JavaScript">
months = new Array("January","February","March","April","May","June",
  "July","August","September","October","November","December");
var today = new Date(); // create new date object (with values = today)
// Set day, month, and year from today's value
var day = today.getDate();
var month = today.getMonth();
var year = today.getFullYear(); // four-digit year; getYear() may return the year minus 1900
// Output "month day, year" (month is textual value)
document.write(months[month]+" "+day+", "+year);
</script>
You can use the millisecond methods of the Date object to do calculations on dates — the number of
days between two dates or the number of days until a particular date, for example.
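A sketch of such a calculation follows. The two dates are hypothetical, and Date.UTC is used here so that daylight-saving changes do not affect the result.

```javascript
var MS_PER_DAY = 24 * 60 * 60 * 1000;  // milliseconds in one day

// Months are numbered from 0, so 0 is January and 11 is December
var start = new Date(Date.UTC(2005, 0, 1));   // January 1, 2005
var end = new Date(Date.UTC(2005, 11, 25));   // December 25, 2005

// getTime() returns milliseconds since January 1, 1970
var days = Math.round((end.getTime() - start.getTime()) / MS_PER_DAY);
```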
Appendix C, “JavaScript Language Reference,” lists the available built-in objects, their properties, and
methods.
User-Created Objects
The new declaration statement can be used to create new objects based on existing, built-in objects. For
example, to create a new array, you could use code similar to the following:
Teaching the concept of objects and object-oriented programming is beyond the scope of this book. As a
consequence, this section concentrates only on how to implement objects in JavaScript.
The preceding code creates a new array object, based on the built-in JavaScript array object, and
assigns values to the new object.
Creation of new, custom objects requires the existence of an object constructor. This is unnecessary when
creating objects based on built-in objects — JavaScript also includes built-in constructors for native objects.
For example, the following function can be used to construct totally new objects of a movie class:
The constructor can then be called via new, as in the following example:
You can also create a direct instance of an object, bypassing creation and use of a constructor, if you
want. For example, the following also creates a movie object, but without use of a constructor:
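The three steps described above can be sketched as follows. The property names and sample values are assumptions chosen to match the movie examples used elsewhere in this chapter, and the variable names m and m2 are hypothetical.

```javascript
// Constructor: "this" refers to the object being created
function movie(title, genre, releasedate) {
  this.title = title;
  this.genre = genre;
  this.releasedate = releasedate;
}

// Calling the constructor via new
var m = new movie("Aliens", "Scifi", "1986-07-18");

// Direct instance: start from a generic object and add properties
var m2 = new Object();
m2.title = "Aliens";
m2.genre = "Scifi";
m2.releasedate = "1986-07-18";
```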
You can access the new object’s properties via the normal property syntax:
if (m.genre == "Horror") {
  // do something if genre is Horror
}
Object properties can be objects themselves. For example, you could create a director object that in
turn is a property of the movie object:
function director(name,age) {
  this.name = name;
  this.age = age;
}
function movie(title, genre, director, releasedate) {
  this.title = title;
  this.genre = genre;
  this.director = director;
  this.releasedate = releasedate;
}
dir1 = new director("James Cameron",51);
mov1 = new movie("Aliens","Scifi",dir1,"1986-07-18");
if (mov1.director.name == "James Cameron") { // if director of mov1 is Cameron
  // statement(s) to execute
}
New methods can be assigned to objects via functions. For example, if you have a function named beep
that causes the user agent to play the sound of a horn, you could assign that function as a method by
using the assignment operator:
car.honk = beep; // no parentheses: assign the function itself, not its return value
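A runnable sketch of this idea follows. The beep function here is a hypothetical stand-in that returns text instead of playing a sound.

```javascript
function beep() {
  return "beep!";  // stand-in for playing a horn sound
}

var car = new Object();
car.honk = beep;         // assign the function itself (no parentheses)
var sound = car.honk();  // call it as a method of car
```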
However, in most cases, you will find that your JavaScript objects fall into the plain old data object
model — not needing methods to be manipulated.
One very important object available to JavaScript is the Document Object Model (DOM). Using the
DOM, your scripts can access a wealth of information about the current document — every element and
every attribute is available for reading and manipulation. The Document Object Model is covered in
Chapter 21.
Event Handling
Event handling is one of the more powerful and frequently used JavaScript techniques. Using event
attributes in XHTML tags, such as onmouseover and onclick, you can create interactive documents
that respond to the user’s actions.
Event         Trigger
onAbort       Abort selected in browser (stop loading of image or document), usually by
              clicking the Stop button
onBlur        When the object loses focus
onChange      When the object is changed (generally a form element)
onClick       When the object is clicked
onDblClick    When the object is double-clicked
onDragDrop    When an object is dropped into the user agent window (generally a file)
onError       When a JavaScript error occurs (not a browser error; only JavaScript code
              errors will trigger this event)
onFocus       When an object receives focus
onKeyDown     When the user presses a key
onKeyPress    When the user presses and/or holds down a key
onKeyUp       When the user releases a key
onLoad        When the object is loaded into the user agent (typically used with the
              <body> element to run a script when the document has completed loading)
For example, table text in the following document will turn red when the user moves the mouse over the
table (onmouseover) and back to black when the mouse is moved outside of the table (onmouseout):
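A minimal sketch of such a table, assuming placeholder cell content; each handler sets the color of the element it is attached to through this.style.color:

```html
<table border="1"
  onmouseover="this.style.color='red';"
  onmouseout="this.style.color='black';">
  <tr><td>Cell 1</td><td>Cell 2</td></tr>
  <tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
```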
This example demonstrates another important technique: incorporating raw code in event attribute
values. This technique is useful if the code is unique to the element in which it appears and is
fairly small. However, in the preceding case, what if you wanted to change other elements' colors
as the user moves the mouse over them? In that case, you would be better off defining functions for
the color change and calling the functions from within the events, as in the following:
Note that the event must pass the object ID to the function to correctly identify the object whose color is
to change.
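The function-based approach might look like the following sketch. The function name setcolor and the id value table1 are hypothetical; any element with an id attribute could call the same function:

```html
<script type="text/JavaScript">
// Change the text color of the element with the given id
function setcolor(id, color) {
  document.getElementById(id).style.color = color;
}
</script>

<table id="table1" border="1"
  onmouseover="setcolor('table1','red');"
  onmouseout="setcolor('table1','black');">
  <tr><td>Cell 1</td><td>Cell 2</td></tr>
</table>
```

Because the id is passed as an argument, one function can serve every element in the document.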
Some of the tools in the following list are open source and others are commercial. Most of the commercial
applications offer free trial versions. Capabilities between the various editors vary — pick an editor that
offers the capabilities you need in the price range that works for you.
❑ TextPad - http://www.textpad.com
❑ PSPad - http://www.pspad.com/
❑ Homesite - http://www.macromedia.com/software/homesite/
❑ vim - http://www.vim.org/
❑ Emacs - http://www.gnu.org/software/emacs/emacs.html
❑ Bluefish - http://bluefish.openoffice.nl/
❑ Matching braces — Often you might find that a block section of code is missing its beginning or
ending brace ({ or }). Adhering to strict syntax formatting will help; it's easier to notice a
missing brace if it doesn't appear where it should. For example, consider placing the braces on lines
of their own where they are very conspicuous, similar to the following snippet:
if (windowName == "menu")
{
// Conditional code here
}
❑ Missing semicolons — When writing quick and dirty code, it’s easy to forget the little things,
such as the semicolons on the end of statements. That is one reason why I never treat semicolons
as optional; I use them at the end of every statement even when they technically are optional.
❑ Variable type conflicts — Because JavaScript allows for loose variable typing, it is easy to make
mistakes by assuming a variable contains data of one type when it actually contains data of
another type. For example, if you apply a string operation to a numeric variable, JavaScript
will silently convert the number to a string, and the result may be concatenated, truncated,
or otherwise different from what you expected.
❑ Incorrect object references — You will find that the syntax of referencing objects can sometimes
be tricky. What works in one document or user agent may not work in another. It's important to
always reference objects starting with the top of the object hierarchy (for example, starting with
the document object) or use tools such as getElementById() to uniquely identify and reference
objects.
❑ Working with noncompliant HTML — Sometimes the problem is not in the JavaScript but in
the XHTML that JavaScript is trying to interact with. It is important to work within the XHTML
standards to ensure that elements in your document can be referenced appropriately by your
scripts. It is also important to define and adhere to naming conventions for element id and
name attributes, helping avoid typos between your document elements and scripts.
❑ Your own idiosyncrasies — After writing several scripts, you may find several personal coding
idiosyncrasies that end up constantly biting you. Try to remember those issues and check your
code for your consistent problems as you go.
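The variable type conflict described in the list above can be sketched in a few lines. The variable names are illustrative only:

```javascript
var count = "5";                 // a string, perhaps read from a form field
var sum = count + 3;             // + concatenates when either operand is a string
var fixed = Number(count) + 3;   // convert explicitly before doing arithmetic
var kind = typeof count;         // typeof reveals what the variable really holds
```

Here sum is the string "53" rather than the number 8; checking typeof at the point of use is a quick way to catch this kind of mistake.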
Identifying Problems
One big problem with JavaScript is the lack of feedback when problems do exist. Mozilla Firefox will
simply not run a script that has syntax errors, providing little to no feedback as to what the error is.
Internet Explorer will display an error icon in its status bar when a JavaScript syntax error is found;
clicking on the icon will usually display a message regarding the error (as shown in Figure 20-1),
however cryptic the message might be.
The error shown in Figure 20-1, Object expected, is a very common error reported by Internet
Explorer. In most cases, the error results from an event call (for example, onClick) to a function or
other external piece of code that failed to compile due to syntax errors. Beginning JavaScript
programmers may spend a lot of time adjusting the syntax of the event call when the problem is
actually in the code being called.
There are several methods you can employ to track down the source of an error. The most common are
outlined in the following sections.
Using Alert
The alert function is a valuable tool that can be used for basic troubleshooting. You can use this function
to display values of variables or to act as simple breakpoints within the script. For example, if you need to
track the value of variable x, you could place lines similar to the following in key areas of your script:
Other alert functions can be used to create a kind of breakpoint in your script, letting you know when
and where the script enters key areas. For example, the following line could be used before a key loop
construct:
When you see the appropriate alert displayed, you know your script has at least executed to that point.
When using the alert function, be sure to include enough information to distinguish the alert from
other alerts of its type. For example, an alert reporting simply Now entering FOR loop doesn’t
tell you which for loop is actually being reported on.
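A minimal sketch of both uses of alert follows. The helper debugMessage and the location label are hypothetical; the fallback to console.log is an assumption that lets the same code run where alert is not available:

```javascript
// Hypothetical helper: builds a message that names the location and the variable
function debugMessage(where, name, value) {
  return where + ": " + name + " = " + value + " (" + typeof value + ")";
}

// Use alert() in a browser; fall back to console.log elsewhere
var show = (typeof alert === "function") ? alert : console.log;

var x = 17;
// Report the value of x at a key point in the script
show(debugMessage("before price loop", "x", x));
// Act as a simple breakpoint, naming the specific loop being entered
show("Now entering FOR loop in price calculation");
```

Including the location in every message keeps one alert distinguishable from the next.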
try {
//code you want to troubleshoot
// If x<23, there’s an error
if (x < 23) { throw(x); } // Throw value of x
}
catch (err) {
// Catch the error, and make a decision based on
// value passed -- in this case, just report value
alert("Error: " + err);
}
If your code traps an error in the try section of the script, you can throw an exception using the throw
statement. Execution of the script moves immediately to the catch section, where the error can be further
diagnosed and reported on. Note that multiple throws can be implemented in the try section, and the
catch section can perform more actions than demonstrated in the preceding code. For example,
conditional statements can be used in the catch section to report different messages depending on the
value passed by the throw.
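As a sketch of that last point, the following function throws a different value for each kind of problem and lets the catch section choose the message. The function name checkValue and the thrown strings are hypothetical:

```javascript
// Hypothetical check: throws a different value for each kind of problem
function checkValue(x) {
  try {
    if (typeof x != "number") { throw "notnumber"; }
    if (x < 23) { throw "toolow"; }
    return "OK";
  }
  catch (err) {
    // Choose a message based on the value passed by throw
    if (err == "notnumber") {
      return "Error: value is not a number";
    } else if (err == "toolow") {
      return "Error: value is below 23";
    }
    return "Error: " + err;
  }
}
```

Returning the message instead of displaying it keeps the sketch easy to test; in a page, the result could be passed to alert.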
Using JSUnit, you can define assertions and use those assertions to test the functionality of your code.
Note that the assertions will not protect you from simple typos and other syntactical errors; assertions
will test only for proper values going in and coming out of functions. The JSUnit site has several
examples of how assertions can be defined and used. Visit the site for more information.
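The idea behind an assertion can be sketched without any framework. The assertEqual helper below is hypothetical and is not part of JSUnit; it only illustrates checking the value coming out of a function against the value expected:

```javascript
// Hypothetical helper, not JSUnit's API: throws when a check fails
function assertEqual(expected, actual, label) {
  if (expected !== actual) {
    throw new Error(label + ": expected " + expected + " but got " + actual);
  }
  return true;
}

// Function under test (an illustrative placeholder)
function toFahrenheit(celsius) {
  return celsius * 9 / 5 + 32;
}

// Check known input/output pairs
assertEqual(212, toFahrenheit(100), "boiling point");
assertEqual(32, toFahrenheit(0), "freezing point");
```

A failed check throws an error that names the test, which is the kind of feedback an assertion framework is meant to provide.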
Summary
This chapter covered the basics of the JavaScript language to familiarize you with its syntax, structure,
data objects, and more. After reading this chapter, you should understand how JavaScript compares to
other programming languages and be ready to apply that knowledge. The next chapter discusses the
Document Object Model, the most powerful data object available to JavaScript. Subsequent chapters
cover typical uses of JavaScript as well as using JavaScript with HTML to achieve Dynamic HTML
(DHTML).
The Document Object Model
Most Web programmers are familiar with Dynamic HTML (DHTML) and the underlying
Document Object Models developed by Netscape and Microsoft for their respective browsers.
However, there is a unifying Document Object Model (DOM) developed by the W3C that is
less well known and, hence, used less often. The W3C DOM has several advantages over the
DHTML DOM: using its node structure, it is possible to easily navigate and change documents
regardless of the user agent used to display them. This chapter covers the basics of the W3C DOM
and teaches you how to use JavaScript to manipulate it.
The W3C DOM is much more complex than shown within this chapter. There are several
additional methods and properties at your disposal to use in manipulating documents, many more
than we have room to address in this chapter. Further reading and information on the standard
can be found on the W3C Web site at
http://www.w3.org/TR/2000/WD-DOM-Level-1-20000929/Overview.html. The next chapter covers
the details of the DHTML DOM.
It’s important to note that the DOM is a type of application program interface (API) allowing
any programming language access to the structure of a Web document. The main advantage of
using the DOM is the ability to manipulate a document without another trip to the document’s
server. As such, the DOM is typically accessed and used by client-side technologies, such as
JavaScript. Therefore, the coverage of the DOM in this book appears in the JavaScript part of the
book and is very JavaScript-centric.
The first DOM specification (Level 0) was developed at the same time as JavaScript and early
browsers. It is supported by Netscape 2 onward.
Chapter 21
There were two intermediate DOMs supported by Netscape 4 onward and Microsoft Internet
Explorer (IE) versions 4 and 5 onward. These DOMs were proprietary to the two sides of the browser
coin — Netscape and Microsoft IE. The former used a collection of elements referenced through a
document.layers object, while the latter used a document.all object. To be truly cross-browser
compatible, a script should endeavor to cover both of these DOMs instead of one or the other.
Techniques for accessing these DOMs are covered in Chapter 22, "Dynamic HTML."
The latest DOM specification (Level 1) is supported by Mozilla and Microsoft Internet Explorer version 5
onward. Both browser developers participated in the creation of this level of the DOM and as such
support it. However, Microsoft chose to continue to support its document.all model as well, while
Netscape discontinued its document.layers model.
Keep in mind also that the DOM was originally intended to allow programs to navigate and change
XML, not HTML, documents, so it contains many features a Web developer dealing only with HTML
may never need.
<html>
<head>
<title>Sample DOM Document</title>
<style type="text/css">
</style>
<script type="text/JavaScript">
</script>
</head>
<body>
<div class="div1">
<h1>Heading 1</h1>
<table>
<tr><td>Cell 1</td><td>Cell 2</td></tr>
<tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam <b>nonummy nibh euismod</b> tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
</div>
<div class="div2">
<h1>Heading 2</h1>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam nonummy nibh euismod tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
<ol id="sortme">An ordered list
<li>Gamma</li>
<li>Alpha</li>
<li>Beta</li>
</ol>
</div>
</body>
</html>
Figure 21-1
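Figure 21-2 (node tree of the sample document: HTML is the root, with HEAD, containing STYLE and
SCRIPT, and BODY beneath it; the BODY branches include H1, TABLE expanding to TBODY, TR, and TD,
P with a B child, and OL with three LI children; Text nodes form the leaves)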
As you can see, each node is joined to its neighbors using a familiar parent, child, sibling relationship. For
example, the first DIV node is a child of the BODY node, and the DIV node in turn has three children: an
H1 node, a TABLE node, and a P node. Those three children (H1, TABLE, and P) have a sibling relationship
to one another.
Plain text, usually the content of nodes such as paragraphs (P), is referenced as textual nodes and is
broken down as necessary to incorporate additional nodes. This can be seen in the first P node, which
contains a bold (B) element. The children of the P node include the first bit of text up to the bold element,
the bold element, and the text after the bold element. The bold element (B) in turn contains a text child,
which contains the bolded text.
The relationships between nodes can be explored and traversed using the DOM JavaScript bindings, as
described in the next section.
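The way text is split around an inline element can be sketched with plain objects. This is a model for illustration only, not the real DOM, and the shortened text values are placeholders for the paragraph content:

```javascript
// Plain-object model (not the real DOM) of: <p>Lorem <b>nonummy</b> tincidunt</p>
var p = {
  nodeName: "P",
  childNodes: [
    { nodeName: "#text", nodeValue: "Lorem ", childNodes: [] },
    { nodeName: "B", childNodes: [
      { nodeName: "#text", nodeValue: "nonummy", childNodes: [] }
    ] },
    { nodeName: "#text", nodeValue: " tincidunt", childNodes: [] }
  ]
};

// List the names of a node's immediate children
function childNames(node) {
  var names = [];
  for (var i = 0; i < node.childNodes.length; i++) {
    names.push(node.childNodes[i].nodeName);
  }
  return names.join(",");
}

var summary = childNames(p);
```

The P node has three children (text, B, text), and the bolded text is a child of B rather than of P.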
A full list of DOM JavaScript bindings can be found on the W3C's Document Object Model Level 1
pages, at
http://www.w3.org/TR/2000/WD-DOM-Level-1-20000929/ecma-script-language-binding.html.
Property          Description
attributes        This read-only property returns a named node map (NamedNodeMap)
                  containing the specified node's attributes.
childNodes        This read-only property returns a node list containing all the
                  children of the specified node.
firstChild        This read-only property returns the first child node of the
                  specified node.
lastChild         This read-only property returns the last child node of the
                  specified node.
nextSibling       This read-only property returns the next sibling of the specified
                  node.
nodeName          This read-only property returns a string containing the name
                  of the node, which is typically the name of the element
                  (P, DIV, TABLE, and so on).
nodeType          This read-only property returns a number corresponding to the
                  node type (for example, 1 = element, 3 = text, 9 = document).
nodeValue         This property returns a string containing the contents of the
                  node. It is meaningful for text nodes; for element nodes it is null.
ownerDocument     This read-only property returns the root document node object
                  of the specified node.
parentNode        This read-only property returns the parent node of the
                  specified node.
previousSibling   This read-only property returns the previous sibling of the
                  specified node. If there is no node, the property returns null.
The second table describes JavaScript’s methods.
Method                            Description
appendChild(newChild)             Given a node, this method inserts the newChild node at the
                                  end of the children and returns a node.
cloneNode(deep)                   This method clones the node object. The parameter deep
                                  (a Boolean) specifies whether the clone should include the
                                  source node's children; the node's attributes are copied in
                                  either case. The return value is the cloned node(s).
hasChildNodes()                   This method returns true if the node object has children
                                  nodes, false if the node object has no children nodes.
insertBefore(newChild, refChild)  Given two nodes, this method inserts the newChild node
                                  before the specified refChild node and returns a node object.
removeChild(oldChild)             Given a node, this method removes the oldChild node from
                                  the DOM and returns a node object containing the node
                                  removed.
replaceChild(newChild, oldChild)  Given two nodes, this method replaces the oldChild node
                                  with the newChild node and returns a node object. Note that if
                                  the newChild is already in the DOM, it is removed from its
                                  current location to replace the oldChild.
Source
This example uses the document example from earlier in the chapter with scripting necessary
to navigate the DOM:
<style type="text/css">
</style>
<script type="text/JavaScript">
var s = new String();
// Add node to string (s), recursively adding children
// (function header and recursion reconstructed from the description that follows the listing)
function showChildren(node,lvl) {
  // Only track elements (1), text (3), and the document (9)
  if ( node.nodeType == 1 || node.nodeType == 3 ||
       node.nodeType == 9 ) {
    var mynodeType;
    // Add dashes to represent node level
    for (var x = 0; x < lvl; x++) { s = s + "--"; }
    // Report text nodes, shortening text longer than 20 chars
    if ( node.nodeType == 3 ) {
      mynodeType = node.nodeValue;
      if (mynodeType.length > 20) {
        mynodeType = mynodeType.slice(0,16) + "...";
      }
    } else {
      // Report "Element/Tag" for elements
      mynodeType = "Element/Tag";
    }
    s = s + "+ " + node.nodeName + " (" + mynodeType + ")\n";
    // Visit each child, one level deeper
    if (node.hasChildNodes()) {
      for (var i = 0; i < node.childNodes.length; i++) {
        showChildren(node.childNodes[i],lvl+1);
      }
    }
  }
}
function domwalk() {
  // Navigate through the DOM and report it in another window
  alert("Click OK to display the document's DOM");
  showChildren(document,0);
  displaywin = window.open("","displaywin",
    "width=400,height=400,scrollbars=yes,resizable=yes");
  displaywin.document.write("<pre>"+s+"</pre>");
}
</script>
</head>
<body onload="domwalk()">
<div class="div1">
<h1>Heading 1</h1>
<table>
<tr><td>Cell 1</td><td>Cell 2</td></tr>
<tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam <b>nonummy nibh euismod</b> tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
</div>
<div class="div2">
<h1>Heading 2</h1>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam nonummy nibh euismod tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
<ol id="sortme">An ordered list
<li>Gamma</li>
<li>Alpha</li>
<li>Beta</li>
</ol>
</div>
</body>
</html>
This code works by recursively calling the showChildren() function for each node that has
children in the document (identified by the hasChildNodes() method). The nodes are
added to a global string (s) until the end of the document is reached (there are no more nodes
or children). The script then spawns a new window to display the full DOM as recorded in the
string. (Note that your user agent must allow pop-up windows for this code to work.)
Output
The script displays the windows shown in Figure 21-3. The DOM is displayed with
representative levels (dashes and pluses) in the new window.
Figure 21-3
You can also use node properties such as nodeName, nodeType, and nodeValue to search the DOM for
nodes, as demonstrated by the next example.
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>DOM Find Node</title>
<style type="text/css">
</style>
<script type="text/JavaScript">
function findNode(startnode,nodename,nodeid) {
  // ... (search code not shown) ...
}
function dofind() {
  alert("Click OK to find 'sortme' node");
  var node = findNode(document,"OL","sortme");
  alert("Found node: " + node.nodeName);
}
</script>
</head>
<body onload="dofind()">
<div class="div1">
<h1>Heading 1</h1>
<table>
<tr><td>Cell 1</td><td>Cell 2</td></tr>
<tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam <b>nonummy nibh euismod</b> tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
</div>
<div class="div2">
<h1>Heading 2</h1>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam nonummy nibh euismod tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
<ol id="sortme">An ordered list
<li>Gamma</li>
<li>Alpha</li>
<li>Beta</li>
</ol>
</div>
</body>
</html>
This script works by traversing the DOM (using the same mechanisms from the previous
example) looking for a node with the specified name (nodeName) and ID (id). When found,
the search stops and the node is reported as found along with the type of node (element
name). In this example OL is returned because the node with the ID sortme is an OL element.
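The search just described can be sketched as a short recursive function. This is an assumption about how such a search might be written, not the listing's own findNode() body; it relies only on the nodeName, id, and childNodes properties:

```javascript
// Sketch of a recursive search (an assumption, not the original function body)
function findNode(startnode, nodename, nodeid) {
  // Does the starting node match the name and ID sought?
  if (startnode.nodeName == nodename && startnode.id == nodeid) {
    return startnode;
  }
  // Otherwise search each child in turn, stopping at the first match
  var children = startnode.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var found = findNode(children[i], nodename, nodeid);
    if (found) { return found; }
  }
  return null;   // no match in this branch
}

// In a browser: var node = findNode(document, "OL", "sortme");
```

Because the function returns as soon as a match is found, the remainder of the tree is never visited.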
The DOM provides another, easier mechanism to find an element with a particular id, namely the
getElementById() method of the document object. In fact, the entire search function in the
preceding script can be replaced with one line:
node = document.getElementById("sortme");
The previous method of traversing the DOM was used to illustrate how you can manually search
the DOM, if necessary. More information and uses of the getElementById() method can be found in
Chapter 22, “Dynamic HTML.”
Output
This script simply outputs the alert box shown in Figure 21-4. However, after execution the
variable node contains a reference to the node being sought and can be manipulated, as shown
in the next section.
Figure 21-4
Changing Nodes
As previously mentioned, you can manipulate document nodes on the fly, adding, removing, and
changing them as needed. The following sections show examples of changing nodes.
Source
This example uses previously discussed methods to find and change the text of an OL node.
<style type="text/css">
</style>
<script type="text/JavaScript">
function findNode(startnode,nodename,nodeid) {
  // ... (search code not shown) ...
}
function dofindNchange() {
  alert("Click OK to change 'sortme' node's text");
  var node = document.getElementById("sortme");
  if (node.firstChild.nodeType == 3) {
    node.firstChild.nodeValue = "Changed text";
  }
}
</script>
</head>
<body onload="dofindNchange()">
<div class="div1">
<h1>Heading 1</h1>
<table>
<tr><td>Cell 1</td><td>Cell 2</td></tr>
<tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam <b>nonummy nibh euismod</b> tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
</div>
<div class="div2">
<h1>Heading 2</h1>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam nonummy nibh euismod tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
<ol id="sortme">An ordered list
<li>Gamma</li>
<li>Alpha</li>
<li>Beta</li>
</ol>
</div>
</body>
</html>
The change of the node takes place in the dofindNchange() function, after finding the node.
The found node's firstChild is checked to ensure it is text, and then its value is changed.
Output
Figure 21-5 shows the document after the change; note the OL node’s text now reads
“Changed text.”
Using the DOM, you can also rearrange nodes within the document, as demonstrated in the next
example.
Figure 21-5
Source
This example uses functions used in previous examples but expands upon them by using a
sort routine to sort the OL node’s children.
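<style type="text/css">
</style>
<script type="text/JavaScript">
function findNode(startnode,nodename,nodeid) {
  // ... (search code not shown) ...
}
function sortlist(node) {
  // Compare each pair of children (loop structure and LI check
  // reconstructed from the description that follows the listing)
  for (var i = 0; i < node.childNodes.length - 1; i++) {
    for (var j = i + 1; j < node.childNodes.length; j++) {
      // Only sort LI nodes
      if (node.childNodes[i].nodeName == "LI" &&
          node.childNodes[j].nodeName == "LI") {
        // Sort needed?
        if (node.childNodes[i].firstChild.nodeValue >
            node.childNodes[j].firstChild.nodeValue) {
          // Use temporary nodes to swap nodes
          tempnode_i = node.childNodes[i].cloneNode(true);
          tempnode_j = node.childNodes[j].cloneNode(true);
          node.replaceChild(tempnode_i, node.childNodes[j]);
          node.replaceChild(tempnode_j, node.childNodes[i]);
        }
      }
    }
  }
}
function dofindNsort() {
  alert("Click OK to sort list");
  // Find and sort node
  var node = findNode(document,"OL","sortme");
  sortlist(node);
}
</script>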
</head>
<body onload="dofindNsort()">
<div class="div1">
<h1>Heading 1</h1>
<table>
<tr><td>Cell 1</td><td>Cell 2</td></tr>
<tr><td>Cell 3</td><td>Cell 4</td></tr>
</table>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam <b>nonummy nibh euismod</b> tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
</div>
<div class="div2">
<h1>Heading 2</h1>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing
elit, sed diam nonummy nibh euismod tincidunt ut laoreet
dolore magna aliquam erat volutpat. Ut wisi enim ad minim
veniam, quis nostrud exerci tation ullamcorper suscipit
lobortis nisl ut aliquip ex ea commodo consequat.</p>
<ol id="sortme">An ordered list
<li>Gamma</li>
<li>Alpha</li>
<li>Beta</li>
</ol>
</div>
</body>
</html>
This example works similarly to the previous examples. After the node is found, its LI
children are sorted in ascending order. Note the checks built into the sort routine to ensure only
the LIs are sorted; the text of the OL and other non-LI children is ignored in the sort routine.
This is necessary because you can't always predict the elements a user agent identifies as
nodes. For example, compare the DOM from Mozilla's Firefox browser shown in Figure 21-6
to that of Microsoft's Internet Explorer earlier in the chapter (Figure 21-3).
Figure 21-6
Notice the extra text nodes scattered throughout the Firefox DOM, especially between
elements like the LI nodes. If the script didn't check for valid LI nodeNames, the text nodes
would be sorted into the LIs, disrupting the structure.
Also note that temporary nodes are used to swap the node contents. The swap could not be
done in the traditional way, with a single temporary value, using the replaceChild() function:
tempnode = node.childNodes[j].cloneNode(true);
node.replaceChild(node.childNodes[i], node.childNodes[j]);
node.replaceChild(tempnode, node.childNodes[i]);
The second line in the preceding code removes node.childNodes[i] from the document to
replace node.childNodes[j]. Therefore, that node ([i]) would not exist for the third line of
the code to operate on.
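The effect can be reproduced with a small stand-in for DOM nodes. FakeNode below is a hypothetical model written for this illustration, not the real DOM; it implements only the replaceChild() behavior described in the methods table, and uses nodeName to carry each item's label:

```javascript
// Minimal stand-in for a DOM node (an assumption for illustration, not the real DOM)
function FakeNode(name) {
  this.nodeName = name;
  this.parentNode = null;
  this.childNodes = [];
}
FakeNode.prototype.appendChild = function (child) {
  child.parentNode = this;
  this.childNodes.push(child);
  return child;
};
FakeNode.prototype.cloneNode = function () {
  return new FakeNode(this.nodeName);   // the copy is not attached to any parent
};
FakeNode.prototype.replaceChild = function (newChild, oldChild) {
  // A newChild already in the tree is removed from its old position first
  if (newChild.parentNode) {
    var from = newChild.parentNode.childNodes;
    from.splice(from.indexOf(newChild), 1);
  }
  this.childNodes[this.childNodes.indexOf(oldChild)] = newChild;
  newChild.parentNode = this;
  oldChild.parentNode = null;
  return oldChild;
};

function makeList() {
  var list = new FakeNode("OL");
  list.appendChild(new FakeNode("Gamma"));
  list.appendChild(new FakeNode("Alpha"));
  list.appendChild(new FakeNode("Beta"));
  return list;
}
function names(list) {
  var out = [];
  for (var k = 0; k < list.childNodes.length; k++) { out.push(list.childNodes[k].nodeName); }
  return out.join(",");
}

// One temporary node: moving childNodes[0] shortens the list, so an item is lost
var broken = makeList();
var tempnode = broken.childNodes[1].cloneNode();
broken.replaceChild(broken.childNodes[0], broken.childNodes[1]);
broken.replaceChild(tempnode, broken.childNodes[0]);

// Two temporary copies: both positions are replaced by detached clones
var fixed = makeList();
var tempnode_i = fixed.childNodes[0].cloneNode();
var tempnode_j = fixed.childNodes[1].cloneNode();
fixed.replaceChild(tempnode_i, fixed.childNodes[1]);
fixed.replaceChild(tempnode_j, fixed.childNodes[0]);
```

In this model the single-temporary version ends with only two items (Alpha, Beta), while the two-clone version ends with Alpha, Gamma, Beta, which is why the sort routine clones both nodes before replacing either.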
Output
The output of this script is shown in Figure 21-7. Note how the OL node’s LI children have
been sorted into ascending order.
Figure 21-7
Summary
This chapter introduced you to the Document Object Model, its various levels, and how you can use it to
manipulate the underlying structure of documents using client-side scripting (JavaScript). You learned
how to navigate through the DOM, find nodes of interest, and change the document by manipulating
the nodes.
Although the W3C DOM isn’t used as much as the DHTML DOM implemented by both major browser
developers, the W3C DOM is used by both browser developers and contains much power to manipulate
documents using minimal scripting. Chapter 22 delves into Dynamic HTML and the various collections,
properties, and methods at the disposal of JavaScript to effect even more changes in your documents.
JavaScript Objects and
Dynamic HTML
Previous chapters in this part of the book detailed how to program using JavaScript and explained
how to hook the language into the W3C Document Object Model. In addition to being a robust
language and having access to the document object, JavaScript also has a host of built-in objects
that can be accessed and manipulated to achieve a variety of effects, including what has come to
be known as Dynamic HTML (DHTML) — the ability to manipulate a document in a dynamic
fashion within the client viewing the document. This chapter covers those built-in objects and
how they can be used to achieve DHTML.
For in-depth listings of all built-in objects, properties, and methods, see Appendix C. For more
examples of how to use JavaScript for Dynamic HTML, see Chapter 23.
Window Object
The window object is the top-level object for an XHTML document. It includes properties and
methods to manipulate the user agent window. The window object is also the top-level object for
most other objects.
Chapter 22
Using the window object, you can not only work with the current user agent window, but you can also
open and work with new windows. The following code will open a new window displaying a specific
document:
NewWin = window.open("example.htm","newWindow",
"width=400,height=400,scrollbars=no,resizable=no");
The open method takes three arguments: a URL of the document to open in the window, the name of the
new window, and options for the window. For example, the preceding code opens a window named
newWindow containing the document example.htm; the window will be 400 pixels square, nonresizable,
and without scrollbars.
❑ toolbar = yes|no - Controls whether the new window will have a toolbar
❑ location = yes|no - Controls whether the new window will have an address bar
❑ status = yes|no - Controls whether the new window will have a status bar
❑ menubar = yes|no - Controls whether the new window will have a menu bar
❑ resizable = yes|no - Controls whether the user can resize the new window
❑ scrollbars = yes|no - Controls whether the new window will have scrollbars
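Because the options are passed as a single comma-separated string, it can help to assemble that string from named values. The buildFeatures helper below is hypothetical, a sketch of one way to do so:

```javascript
// Hypothetical helper: turns an options object into a window.open() option string
function buildFeatures(options) {
  var parts = [];
  for (var name in options) {
    if (!options.hasOwnProperty(name)) { continue; }
    var value = options[name];
    // Booleans become yes/no; numbers (width, height) are used as-is
    if (value === true) { value = "yes"; }
    if (value === false) { value = "no"; }
    parts.push(name + "=" + value);
  }
  return parts.join(",");
}

var features = buildFeatures({ width: 400, height: 400, scrollbars: false, resizable: false });
// In a browser: var newWin = window.open("example.htm", "newWindow", features);
```

Keeping the option names in one object makes a misspelled option easier to spot than it would be inside a long string.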
The window object can also be used to size and move a user agent window. One interesting DHTML
effect is to shake the current window. The following function can be used to cause the user agent
window to visibly shudder:
function shudder() {
// Move the document window up and down 5 times
for (var i=1; i<= 5; i++) {
window.moveBy(8,8);
window.moveBy(-8,-8);
}
}
You can use other methods to scroll a window (scroll, scrollBy, scrollTo) and to resize a window
(resizeBy, resizeTo).
Document Object
You can use the JavaScript document object to access and manipulate the current document in the user
agent window. Many of the collection objects (form, image, and so on) are children of the document
object.
The document object supports write and writeln methods, both of which can be used to write
content to the current document. For example, the following code results in the current date being
displayed (in month/day/year format, without leading zeros) wherever the code is inserted in the
document:
<script type="text/JavaScript">
today = new Date;
document.write((today.getMonth()+1) + "/" + today.getDate() +
"/" + today.getFullYear());
</script>
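If a strict mm/dd/yyyy form with leading zeros is wanted, the month and day can be padded before they are joined. The helper names pad2 and formatDate are hypothetical:

```javascript
// Hypothetical helper: pads a number to two digits
function pad2(n) {
  return (n < 10 ? "0" : "") + n;
}

// Formats a Date object as mm/dd/yyyy
function formatDate(d) {
  return pad2(d.getMonth() + 1) + "/" + pad2(d.getDate()) + "/" + d.getFullYear();
}

var sample = formatDate(new Date(2005, 0, 9));   // months are zero-based, so 0 is January
// In a document: document.write(formatDate(new Date()));
```

The getMonth() method counts from zero, which is why 1 is added before the month is displayed.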
The open and close methods can be used to open and then close a document for writing. Building on
the examples in the earlier “Window Object” section, the following code can be used to spawn a new
document window and write the current date to the new window:
<script type="text/JavaScript">
today = new Date;
newWin = window.open("","","width=400,height=400,scrollbars=no,resizable=no");
newDoc = newWin.document.open();
newDoc.write((today.getMonth()+1) + "/" + today.getDate() +
"/" + today.getFullYear());
newDoc.close();
</script>
Form Object
You can use the form object to access form elements in a document. The form object supports length
and elements properties: the former returns how many elements (fields) are in the form, and
the latter contains an array of form element objects, one per field. You can also access the form elements
by their name attribute. For example, the following code will set the addlength field to the length of the
address field, using the form name and element names to address the various values:
...
<head>
<script type="text/JavaScript">
function dolength() {
document.form1.addlength.value =
document.form1.address.value.length;
}
</script>
</head>
<body>
<p>
<form name="form1" action="handler.cgi" method="post">
Length: <input type="text" name="addlength" size="5" /><br />
Address: <input type="text" name="address" size="30" onkeyup="dolength();"/>
</form>
</p>
...
The form object can be used for a variety of form automation techniques. For example, a button can be
created to check (or uncheck) all of a series of checkboxes:
<head>
<script type="text/JavaScript">
function checkall(field) {
for (i=0; i<field.length; i++) {
field[i].checked = true;
}
}
</script>
</head>
<body>
As you can see by the checkbox object’s checked property, JavaScript has built-in properties and meth-
ods to manipulate all manner of form fields. A comprehensive list of these properties and methods
appears in Appendix C, “JavaScript Language Reference.”
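The checkall() listing only checks boxes. A variant that can either check or uncheck them might look like the following sketch. The name setall and its state parameter are assumptions for illustration; the sketch also allows for the case where a form contains a single checkbox of a given name, which the form object exposes as one element rather than a collection.

```javascript
// Hypothetical variant of checkall(): `state` is true to check the
// boxes and false to uncheck them.
function setall(field, state) {
  // A lone checkbox is a single object with no length property.
  if (typeof field.length === "undefined") {
    field.checked = state;
    return;
  }
  for (var i = 0; i < field.length; i++) {
    field[i].checked = state;
  }
}

// In a page (form and field names are hypothetical):
//   <input type="button" value="Clear all"
//     onclick="setall(document.form1.option, false);" />
```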
Location Object
The location object can be used to manipulate the URL information about the current document in the
user agent. Various properties of the location object are used to store individual pieces of the docu-
ment’s URL (protocol, hostname, port, and so on). For example, you could use the following code to
piece the URL back together:
with (document.location) {
  var url = protocol + "//";
  url += hostname;
  if (port) { url += ":" + port; }
  url += pathname;
  if (hash) { url += hash; }
}
JavaScript Objects and Dynamic HTML
The preceding example is only to illustrate how the various pieces relate to one another — the
location.href property contains the full URL.
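The same reassembly can be written without the with statement and can also include the query string, which the location object stores in its search property. The following is a sketch only; buildUrl is an illustrative name, and the function accepts any object that carries the location properties so that it can be tried outside a browser.

```javascript
// Hypothetical helper: rebuild a URL from the pieces stored on a
// location-like object (protocol, hostname, port, pathname, search, hash).
function buildUrl(loc) {
  var url = loc.protocol + "//" + loc.hostname;
  if (loc.port) { url += ":" + loc.port; }
  url += loc.pathname;
  if (loc.search) { url += loc.search; }  // query string, including "?"
  if (loc.hash) { url += loc.hash; }      // fragment, including "#"
  return url;
}

// In a page: var url = buildUrl(document.location);
```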
One popular method of using the location object is to cause the user agent to load a new page. To do
so, your script simply has to set the document.location object to the desired URL. For example, the
following code will cause the user agent to load the yahoo.com home page:
document.location = "http://www.yahoo.com";
History Object
The history object is tied to the history function of the user agent. Using the history object your script
can navigate up and down the history list. For example, the following code acts as though the user used
the browser’s back feature, causing the user agent to load the previous document in the history list:
history.back();
Other properties and methods of the history object allow more control over the history list. For exam-
ple, the history.length property can be used to determine the number of entries in the history list.
As with other objects in this chapter, a full list of properties and methods supported by the object
appears in Appendix C.
The function can then use that reference to operate on the object that initiated the call:
function dosomething(el) {
  // ... do something with the element referenced by el ...
}
For example, the following function can be used to change the color of an element when called with a
reference to that element:
function changecolorRed(el) {
  el.style.color = "red";
}
That function can then be added to an event of any element, similar to the following onclick event
example:
element = document.getElementById("elementID");
For example, the following code would assign a reference to the address field to the element variable:
element = document.getElementById("address");
...
<input type="text" size="30" id="address" />
Once assigned, the element variable can be used to access the referenced field’s properties and
methods:
addlength = element.value.length;
Before using getElementById() you should test the user agent to ensure the function is available.
The following if statement will generally ensure that the user agent supports the appropriate DOM
level and, thus, getElementById():
if (document.all || document.getElementById) {
  // getElementById should be available; use it
}
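The test above is a general indicator rather than a guarantee: a user agent could provide document.all without providing getElementById(). A more defensive lookup checks for each feature separately, as in the following sketch. The helper name findElement and its doc parameter are assumptions made for illustration.

```javascript
// Hypothetical helper: look up an element by id, falling back to the
// older document.all collection when getElementById() is missing.
function findElement(doc, id) {
  if (doc.getElementById) {
    return doc.getElementById(id);
  }
  if (doc.all) {
    return doc.all[id];
  }
  return null;  // no supported lookup method
}

// In a page: var element = findElement(document, "address");
```

Callers should still check the returned value before using it, because the element may not exist in the document.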
Dynamic HTML
Dynamic HTML (DHTML) involves using scripts to manipulate elements within a document. The result
is the creation of dynamic content (document automation, animation, and so on). Such manipulation
usually involves CSS styles — the manipulation of an element’s style is very efficient.
To access and manipulate document elements, you can use either of the two DOMs, the W3C DOM dis-
cussed in Chapter 21 or the JavaScript DOM provided via the objects discussed earlier in this chapter.
One popular DHTML technique is to hide or reveal document elements. You can use this to create drop-
down text, collapsible outlines, and more.
Source
This code uses two classes, one to show the list items, the other to hide the list items.
function hideNreveal(list) {
  if (list.className == "hidelist") {
    list.className = "showlist";
  } else {
    list.className = "hidelist";
  }
}
</script>
</head>
<body>
<p>
<ul id="list1" class="hidelist"
onclick="hideNreveal(this);">An unordered list.
<li>Item 1</li>
<li>Item 2</li>
<li>Item 3</li>
<li>Item 4</li>
</ul>
</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit
in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur
sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt
mollit anim id est laborum.</p>
</body>
</html>
Output
This example results in a document containing a collapsible list. When the list header (the UL
text) is clicked, the list expands or collapses. The two states of the list are shown in Figures 22-1
(collapsed) and 22-2 (expanded).
Figure 22-1
Figure 22-2
Another popular means of manipulating elements is to manipulate their styles directly, changing the
values instead of changing the styles applied in a wholesale manner (via className). For example, you
can move an element by changing its positioning styles, as in the following example.
Source
This code uses a relatively positioned element’s top and left values to move an element when it is
clicked.
<script type="text/JavaScript">
function moveme(el) {
  if (el.style) { el = el.style; }
  el.top = "-5px"; el.left = "20px";
}
</script>
</head>
<body>
<p class="movable" onclick="moveme(this);">Lorem ipsum dolor sit amet,
consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore
et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure
dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat
nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit
in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur
sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt
mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit
in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur
sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt
mollit anim id est laborum.</p>
</body>
</html>
This code uses an onclick event to call the moveme function. That function uses the object reference
passed to it (this) to access the appropriate object’s top and left properties. The properties are
changed, creating a dynamic shift in the element (5 pixels up, 20 pixels to the right).
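Because moveme() always assigns the same values, a second click has no further effect. A variant that shifts the element a little more on each call can read the current offsets first, as in this sketch. The name nudge and its dx and dy parameters are illustrative, and the sketch assumes the element is relatively positioned with offsets expressed in pixels.

```javascript
// Hypothetical variant of moveme(): shift the element by (dx, dy)
// pixels from its current offset each time it is called.
function nudge(el, dx, dy) {
  var style = el.style;
  // parseInt("") yields NaN, so an unset offset is treated as 0.
  var left = parseInt(style.left, 10) || 0;
  var top = parseInt(style.top, 10) || 0;
  style.left = (left + dx) + "px";
  style.top = (top + dy) + "px";
}

// In a page: <p class="movable" onclick="nudge(this, 20, -5);">...</p>
```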
Output
Figure 22-3 shows the document immediately after loading, and Figure 22-4 shows the document
after the first paragraph is clicked and the script modified its position.
Figure 22-3
Figure 22-4
Summary
This chapter rounds out the coverage of JavaScript programming, covering the built-in objects that can
be used to access and manipulate document elements. Using the information in this chapter and previ-
ous chapters in this part, you should be able to construct scripts to perform a variety of useful functions
for your documents.
Using JavaScript
An entire section of this book has been dedicated to JavaScript. You learned about the language
itself as well as various ways to use it with other technologies. This chapter wraps up that cover-
age by providing useful examples of JavaScript in action. Feel free to use any of the techniques
covered on your own projects.
The code from the examples can be downloaded from this book’s companion Web site.
Many smaller footprint user agents (cell phones, PDAs, and so on) do not support JavaScript.
Many more user agents have JavaScript disabled by default. The net result is that JavaScript is sig-
nificantly less accessible today than it was in the early days of the Web and graphical user agents.
Those user agents that do support JavaScript cannot be counted on to adhere to any one standard.
As previously mentioned in Chapter 21, there are at least four different Document Object Models
implemented across the current user agents. Writing code that is truly cross-platform compatible is
very difficult at best and nearly impossible at worst.
What this means is that JavaScript cannot be relied on to perform medium- to high-level tasks, and
your documents should never depend on its presence to be usable.
Chapter 23
The preceding statement assumes your documents are meant for public consumption where any number
of user agents may be employed to access the documents. If you are coding in a known environment where
you can control the number and type of user agents in use, you can reasonably rely on JavaScript code.
Also, JavaScript is very limited in scope. It cannot access external information, whether on the client or
server. For more complex tasks needing an interface to external resources, you should consider CGI,
PHP, or other server-based technology.
❑ Is the task JavaScript will perform absolutely necessary? Image rollovers and dynamic text pro-
vide cool features to your documents but don’t necessarily add functionality.
❑ Can the same thing be accomplished using simpler means (usually straight XHTML code with-
out as many bells and whistles)? If so, you are generally better off using the simpler means.
❑ Can you reasonably code cross-platform scripts? Remember that the more complex the script
(for example, DHTML), the more likely the script is not to work properly on some user agents.
❑ Can you offer alternatives to the JavaScript-enabled features? When using JavaScript for fancy
navigation menus, for example, can you also include a basic text menu for non-JavaScript-
enabled users?
Although it may sound as if I am trying to talk every reader of this book out of using JavaScript, I’m not.
I’m simply trying to make the point that just because you can easily implement JavaScript doesn’t mean you
always should use JavaScript. That’s all.
JavaScript Resources
Following is a partial list of some of the best JavaScript resources the Web has to offer:
❑ The ECMA Specification — The ECMA specification from the ECMA Web site gives the
reader an in-depth look at the standards behind JavaScript. It should be kept in mind that
not all user agents adhere to the specification, but this background document will help any
coder better understand JavaScript. You can find the ECMA specification at
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf.
❑ The W3C DOM Specification — This reference document explains the design and workings of
the Document Object Model Level 1. The DOM is described in detail and the bindings specific
to ECMA (and therefore, JavaScript) are covered in Appendix D. You can find the W3C DOM
Specification at http://www.w3.org/TR/2000/WD-DOM-Level-1-20000929/Overview.html.
❑ The MSDN Web Development Library — The Web Development section of the MSDN Library
(click on the Web Development link in the left pane) has a lot of useful information on the
implementation of JavaScript, CSS, and DHTML within Microsoft Internet Explorer. If you need
to find out how IE will handle something in particular, this is the source to consult. Find it at
http://msdn.microsoft.com/library/default.asp.
❑ The Gecko (Mozilla Firebird) DOM Reference — This document describes the Gecko DOM
implementation. If you need to write code specific to the Gecko browsers, this is the document
to consult. Note that many new browsers (many in the mobile arena) are adopting the Gecko
browser standard. You can find this reference at http://www.mozilla.org/docs/dom/domref/.
❑ The DevGuru JavaScript Language Index — DevGuru does an excellent job of providing quick
references for various online technologies. Their comprehensive JavaScript Quick Reference
is indispensable for quick lookups of JavaScript events, functions, methods, objects, and more.
Find this index at
http://www.devguru.com/Technologies/ecmascript/quickref/javascript_index.html.
❑ Quirksmode.org — The reference for browser quirks. This site contains information about
almost every user agent quirk known to man. Helpful examples and tutorials abound to help
even the most inexperienced JavaScript programmer adapt code for cross-platform capability.
Find it at http://www.quirksmode.org/.
❑ The getElementById.com Web site — This site contains many useful DHTML scripts,
techniques, and tutorials. Find it at http://getelementbyid.com/news/index.aspx.
JavaScript Examples
The following sections provide example documents that include scripts to perform various tasks. Each
example is presented with source code, output, an explanation of how the script works, and ways that
the script can be extended or improved.
The scripts in this section have been written and verified to run on the two most popular browsers:
Mozilla Firefox 1.0 and Microsoft Internet Explorer 6.0+. To keep the examples straightforward and
simple, no additional cross-platform coding has been added. However, the scripts should run on any
user agent compatible with these two “standards.”
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Display Date (Text)</title>
<script type="text/JavaScript">
// Set up arrays
var months = new Array("January","February","March","April",
  "May","June","July","August","September","October",
  "November","December");
var days = new Array("Sunday","Monday","Tuesday","Wednesday",
  "Thursday","Friday","Saturday");
function writedate() {
  // Get current values, build output, and display
  var today = new Date();
  var thisMonth = today.getMonth();
  var thisDay = today.getDate();
  var thisYear = today.getFullYear();
  var thisWeekday = today.getDay();
  // Note the space between the month name and the day number
  var datetext = days[thisWeekday] + ", " + months[thisMonth] +
    " " + thisDay + ", " + thisYear;
  document.write(datetext);
}
</script>
</head>
<body>
<script type="text/JavaScript">writedate();</script>
</body>
</html>
Output
This script simply writes the current date to the user agent window, as shown in Figure 23-1.
Figure 23-1
How It Works
Using the built-in JavaScript date methods, the writedate() function assembles the date
into a string that is written to the browser via the document.write method. Wherever the
date is needed in the document, the code to call the writedate() function is inserted in lieu
of the date:
<script type="text/JavaScript">writedate();</script>
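If the formatted date is needed for something other than immediate output, the string assembly can be separated from the call to document.write. The following sketch shows one way to do so; formatLongDate, MONTHS, and DAYS are illustrative names and are not part of the original listing.

```javascript
// Hypothetical helper: return a date as "Weekday, Month D, YYYY".
var MONTHS = ["January", "February", "March", "April", "May", "June",
  "July", "August", "September", "October", "November", "December"];
var DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
  "Thursday", "Friday", "Saturday"];

function formatLongDate(d) {
  return DAYS[d.getDay()] + ", " + MONTHS[d.getMonth()] + " " +
    d.getDate() + ", " + d.getFullYear();
}

// In a page: document.write(formatLongDate(new Date()));
```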
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Obscure Email</title>
<script type="text/JavaScript">
<body>
<!-- Use an array to break up the address and a for loop to reassemble -->
<p><script type="text/JavaScript">
var a = new Array("<a h","ref=",'"',"ma","il","to:",
  "ss","ch","afer","@ex","ampl","e.c","om",'">');
for (var i = 0; i < a.length; i++) {
  document.write(a[i]);
}
</script>Email me</a></p>
<!-- Use a function to reassemble domain, address, and link text -->
<p><script type="text/JavaScript">eaddr("example.com","sschafer",
"Email me");</script></p>
</body>
</html>
Output
Both methods result in an Email me link being displayed in the user agent window, as shown
in Figure 23-2.
Figure 23-2
How It Works
The first method builds the script into the body of the document. This script uses an array to
break up the e-mail address into chunks that are unrecognizable as an e-mail address, so the
spam robots won’t recognize it as such:
The script reassembles and writes the address link to the document window.
The second means of obscuring e-mail addresses uses a similar method but encapsulates the
code into a reusable function. The function takes the address, domain name, and link text as
separate arguments to reassemble and output accordingly.
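The body of the eaddr() function does not appear in the listing as reproduced here. A function of that kind might be sketched as follows. This is a reconstruction under assumptions, not the original code: the argument order follows the call in the listing (domain, then address, then link text), and buildMailLink is a helper name introduced only for illustration.

```javascript
// Hypothetical reconstruction. Argument order follows the call in the
// listing: eaddr("example.com", "sschafer", "Email me").
function buildMailLink(domain, user, text) {
  var addr = user + "@" + domain;
  // "mailto:" is split so the literal never appears whole in the source.
  return '<a href="mai' + 'lto:' + addr + '">' + text + "</a>";
}

function eaddr(domain, user, text) {
  document.write(buildMailLink(domain, user, text));
}
```

Keeping the string assembly in its own function makes it possible to check the generated markup without a browser.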
Using forms and CGI or PHP presents other unique challenges. See the chapters in the next two
parts of this book for more details on form handling with CGI and PHP.
Opening another user agent window is akin to pop-up windows, which have a very bad reputation
in the Web world due to the amount of advertising spam associated with their use. However, pop-up
windows can be used for very legitimate reasons; they are used all the time in OS graphical user
interfaces (dialog boxes). The key is to use them sparingly and with ample warning, so that the user
expects them and finds the content (and their use) useful.
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Open New Window</title>
<script type="text/JavaScript">
function newwindow(title,url,options) {
  // options syntax:
  // width=x,height=y,scrollbars=yes|no,resizable=yes|no
  if (!options) {
    options = "width=650,height=550,scrollbars=yes,resizable=yes";
  }
  return window.open(url,title,options);
}
</script>
</head>
<body>
<p>
<input type="button" value="Default window"
onclick="newwindow('NewWindow','','');" />
<input type="button" value="Yahoo small"
onclick="newwindow('YahooWindow','http://www.yahoo.com',
'width=200,height=200,scrollbars=no,resizable=no');" />
</p>
</body>
</html>
Output
The preceding code opens a default, empty window 650 pixels wide by 550 pixels high or a win-
dow 200 pixels square displaying the contents of the Yahoo main page, as shown in Figure 23-3.
Figure 23-3
How It Works
The script uses input buttons with onclick() events to run the newwindow() script. The
script takes three arguments, specifying the title of the new window, the URL to display in the
new window, and options for the new window. If no options are given, the function opens a
blank, 650x550, resizable window with scroll bars.
It should be noted (as an exception) that the code in the preceding example is not XHTML compliant,
as the input element is not properly contained in a block element. This is done intentionally throughout
this section to eliminate XHTML coding overhead, improving the legibility of the examples.
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Write to New Window</title>
<script type="text/JavaScript">
function newwindow(title,url,options) {
  // options syntax:
  // width=x,height=y,scrollbars=yes|no,resizable=yes|no
  if (!options) {
    options = "width=650,height=550,scrollbars=yes,resizable=yes";
  }
  return window.open(url,title,options);
}
</script>
</head>
<body>
<p>
<input type="button" value="New window"
onclick="doNewWin();" />
</p>
</body>
</html>
Output
The preceding code opens a default, empty window 650 pixels wide by 550 pixels high and
then writes a heading “Text to New Window” in the window, as shown in Figure 23-4.
Figure 23-4
How It Works
This script uses methods similar to the previous script (a button with an onclick() handler)
to execute a function. That function (doNewWin()) uses the newwindow() function to open a
default window and then uses the reference it returns to access the new window with the
document.write method:
Note the use of the document.close method necessary to let the user agent know that the
document is complete and no more content is forthcoming.
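The doNewWin() function itself is not shown in the listing as reproduced here. A sketch of how such a function might look follows. It is a reconstruction under assumptions: writeHeading is a helper name introduced for illustration, the h1 markup is a guess at how the heading is produced, and the check for a missing window reference is an addition.

```javascript
// Hypothetical helper: write one complete heading into a document object.
function writeHeading(doc, text) {
  doc.open();
  doc.write("<h1>" + text + "</h1>");
  doc.close();  // tell the user agent no more content is coming
}

// Hypothetical doNewWin(); relies on the newwindow() function from
// the listing to open the window and return a reference to it.
function doNewWin() {
  var win = newwindow("NewWindow", "", "");
  // window.open() may return null, for example when a pop-up blocker
  // prevents the window from opening.
  if (win) {
    writeHeading(win.document, "Text to New Window");
  }
}
```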
Images
JavaScript is also routinely used to manipulate images within a document. The scripts in this section
show you several ways to access images using JavaScript.
❑ When a site contains a lot of graphics and the user will usually navigate to many of the
documents on the site. If a majority of the images can be preloaded, subsequent pages
will load more quickly.
❑ When creating animations in a document. Without preloading the images, there will be
a delay between image transitions as the new image loads.
❑ When manipulating images using JavaScript. Preloading the images allows your scripts
more control over the image data; you know where the image will be stored without
needing to navigate the DOM and can perform operations (size the window, change
descriptive text, and so on) before displaying the image.
Preloading images does not cause the images to display; it only causes the user agent to load them
into its cache for easy retrieval if and when they are needed for display. Your code can use the images
either directly (using <img> tags) or indirectly (by changing the src attribute of other <img>
tags).
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Image Preload</title>
<script type="text/JavaScript">
</script>
</head>
<body>
</body>
</html>
Output
This script produces no visible output.
How It Works
The script works by creating new image objects and loading the specified image files into the
objects. This results in a request to the server for the image, which is then cached on the user’s
system, available to display from local cache instead of across the slower Internet.
To gain the maximum benefit from cached images, it is important to ensure that the URL used in the
preload script is identical to the URL used in the document’s image (<img>) elements. A URL that
is even slightly different may result in the image being requested from the server again, instead of
being served from cache.
Note that the image is not displayed using this code, but methods and functions (size, and so
on) are available to be used on the image. The image src property can be used to move an
image from the cached object to an object in the document using code similar to the following:
The script has some basic error checking included (it looks for the presence of the
document.images collection before preloading).
Additionally, the script container that holds the preload logic should include the defer
attribute so that the user agent will continue to load the page (because it expects no output
from the script within).
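The preload logic described in this section might be sketched as follows. This is an illustration under assumptions rather than the original script: preloadImages is a hypothetical name, the availability check tests for the Image constructor instead of the document.images collection, and the ImageCtor parameter exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical preload helper. ImageCtor defaults to the browser's
// Image constructor when one is available.
function preloadImages(urls, ImageCtor) {
  var Ctor = ImageCtor || (typeof Image !== "undefined" ? Image : null);
  var cache = [];
  if (!Ctor) { return cache; }  // images not supported; do nothing
  for (var i = 0; i < urls.length; i++) {
    cache[i] = new Ctor();
    cache[i].src = urls[i];     // assigning src triggers the request
  }
  return cache;
}

// In a page (the element id is hypothetical):
//   var cached = preloadImages(["./AboutUs.jpg"]);
//   document.getElementById("AboutUs").src = cached[0].src;
```

The returned array keeps a reference to each image object so that its properties remain available to later code.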
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Image Rollover</title>
<script type="text/JavaScript">
</script>
</head>
<body>
<p>
<img src="./AboutUs.jpg" width="200" height="50" alt="About Us"
id="AboutUs" onmouseover="rollover('AboutUs');"
onmouseout="rollout('AboutUs');"
onclick="window.location='./aboutus.htm';" />
</p>
</body>
</html>
Output
Figures 23-5 and 23-6 show this script in action. Figure 23-5 shows the image without the
mouse over it, while Figure 23-6 shows the image when the mouse is over it. Note that when
the mouse leaves the image, it reverts to the image shown in Figure 23-5.
Figure 23-5
How It Works
As previously stated, this script works by tying into the onmouseover() and onmouseout()
events of the <img> tag. When the mouse is placed over the image in the document, the
onmouseover() event calls the rollover() function, which changes the image element’s src
property to that of the white-filled text. When the mouse leaves the image, the onmouseout()
event calls the rollout() function, which changes the image src back to the original image.
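The rollover() and rollout() functions are not shown in the listing as reproduced here. They might be sketched as follows. The file-naming scheme is an assumption made for illustration: the normal image is taken to be "<id>.jpg" (matching AboutUs.jpg in the listing) and the highlighted image "<id>_over.jpg", which is a hypothetical name.

```javascript
// Hypothetical naming scheme: "./<id>.jpg" for the normal state and
// "./<id>_over.jpg" for the highlighted state.
function rolloverSrc(id, over) {
  return "./" + id + (over ? "_over" : "") + ".jpg";
}

// Swap in the highlighted image when the mouse enters.
function rollover(id) {
  document.getElementById(id).src = rolloverSrc(id, true);
}

// Restore the original image when the mouse leaves.
function rollout(id) {
  document.getElementById(id).src = rolloverSrc(id, false);
}
```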
Figure 23-6
The script preloads the images so that they are ready for the rollover. As a result, there should not be
a delay when the user rolls over the image the first time.
The result is animated text that responds to the presence of the mouse pointer. The onclick()
event changes the window.location property, making the image act like a hyperlink.
Notice the addition of the border style to the image tag. This removes the pesky border
resulting from the encapsulation in the anchor element, giving you a cleaner display.
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Graphical Date Display</title>
<script type="text/JavaScript">
function ShowDate() {
  // Check for required browser functionality
  if (document.all || document.getElementById) {
    // Set up our variables
    var img = new Array("");
    var today = new Date();
    var date = String(today.getDate());
    var month = String(today.getMonth()+1);
    var year = String(today.getFullYear());
    var date1 = date.charAt(0);
    var date2 = date.charAt(1);
    // Prime the img array
    for (var x = 1; x <= 7; x++) {
      img[x] = document.getElementById("img"+String(x));
    }
    // Set src of appropriate images
    // 1&2 = day, 3 = month, 4-7 = year
    if (date2 == "") {
      // Single-digit day: blank the first slot
      img[2].src = "./images/dateimgs/" + date1 + ".gif";
      img[1].src = "./images/dateimgs/space.gif";
    } else {
      img[1].src = "./images/dateimgs/" + date1 + ".gif";
      img[2].src = "./images/dateimgs/" + date2 + ".gif";
    }
    img[3].src = "./images/dateimgs/month" + month + ".gif";
    img[4].src = "./images/dateimgs/" + year.charAt(0) + ".gif";
    img[5].src = "./images/dateimgs/" + year.charAt(1) + ".gif";
    img[6].src = "./images/dateimgs/" + year.charAt(2) + ".gif";
    img[7].src = "./images/dateimgs/" + year.charAt(3) + ".gif";
  }
}
</script>
</head>
<body onload="ShowDate()">
<p>
<!-- Simple line table to hold date -->
Output
This script results in the one-line date display shown in Figure 23-7. The graphics used to
display the various elements (numbers, months) are shown within Windows Explorer in
Figure 23-8.
Figure 23-7
Figure 23-8
How It Works
This script is similar to the script from Example 1; it uses built-in JavaScript functions to ascer-
tain the current date. It then uses premade images for the digits and text of the month —
images named to be easy to access (1.gif for the number 1, month1.gif for the month
January, and so on). The appropriate image URL(s) are then swapped into the appropriate
image elements in the document. In this case, the image elements are encased in a table for for-
matting purposes.
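The mapping from a date to image files can be isolated in a function that returns the seven paths in slot order, which makes the naming scheme easier to see. This is a sketch only; dateImagePaths and its base parameter are illustrative names, and the sketch assumes a four-digit year.

```javascript
// Hypothetical helper: return the seven image paths for a date.
// Slots 1-2 hold the day, slot 3 the month, slots 4-7 the year.
function dateImagePaths(d, base) {
  var day = String(d.getDate());
  var year = String(d.getFullYear());
  var names = [];
  if (day.length === 1) {
    names.push("space", day);  // single-digit day: blank first slot
  } else {
    names.push(day.charAt(0), day.charAt(1));
  }
  names.push("month" + (d.getMonth() + 1));
  for (var i = 0; i < 4; i++) {
    names.push(year.charAt(i));
  }
  for (var j = 0; j < names.length; j++) {
    names[j] = base + names[j] + ".gif";
  }
  return names;
}
```

For January 5, 2005 and a base of "./images/dateimgs/", the result starts with space.gif and 5.gif, followed by month1.gif and the four year digits.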
❑ The script could be modified to display more date formats. The format displayed would
be driven by an argument passed to the script. Note that this would mean changing the
month images (removing the comma), adding an image or two (comma, dash, slashes,
and so on), and perhaps employing a larger table whose elements are used dynamically
(not all formats would use all the cells in the table; unused cells would have to be filled by
blank images).
Although it is possible to change default behavior of form elements using JavaScript, resist the temp-
tation. Users are generally used to their platforms’ GUI behavior, and unannounced changes (such
as automatically moving between fields of a phone number) aren’t always welcome.
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Form Enhancement</title>
<script type="text/JavaScript">
// Check size (length) of text box and report
// in another box
function checksize() {
  if (document.all || document.getElementById) {
    var counter = document.getElementById("size");
    var text = document.getElementById("msg");
    counter.value = text.value.length;
  }
}
</script>
</head>
<body onload="checksize()">
<p>
Edit the message below (limited to 120 characters!).</p>
<form name="form1" id="form1" method="post"
action="bogus.htm" onkeyup="checksize()">
<p>
<input type="text" name="size" id="size"
disabled="disabled" size="6" /> Characters
</p>
<p>
<textarea cols="40" rows="3" name="msg" id="msg"
wrap="virtual"></textarea>
</p>
<p>
<input type="submit" name="submit" value="Submit" />
<input type="reset" name="reset" value="Clear" />
<input type="button" name="close" value="Close"
onclick="self.close()" />
</p>
</form>
</body>
</html>
Output
This example results in a document with two form fields, a text box for text entry and
a smaller text box for tallying the number of characters in the other field, as shown in
Figure 23-9.
Figure 23-9
How It Works
The script works by accessing the length property of the text area’s value and assigning it to the
value of the counter box. The script is encapsulated in a function that is called by the onkeyup
event handler on the form element; key events in the text area bubble up to the form. Every time a
keystroke is entered into the text box, the counter is updated. The counter text box is disabled so
that it cannot be edited by the user.
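The example mentions a 120-character limit, but the script as shown only counts characters. One way to report how much room is left, without altering what the user has typed, is sketched here. The name messageStatus and the shape of the object it returns are assumptions made for illustration.

```javascript
// Hypothetical helper: report the length of `text` against a limit.
function messageStatus(text, max) {
  return {
    count: text.length,            // characters entered so far
    remaining: max - text.length,  // negative when over the limit
    over: text.length > max
  };
}

// In a page, checksize() could display the remaining count instead:
//   var status = messageStatus(text.value, 120);
//   counter.value = status.remaining;
```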
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Basic Form Validation</title>
<script type="text/JavaScript">
if (document.all || document.getElementById) {
elid = document.getElementById(el);
if (!filter.test(elid.value)) {
err = "The '" + name + "' field must be a valid email address<br />";
}
}
return err;
}
if (document.all || document.getElementById) {
el1id = document.getElementById(el1);
el2id = document.getElementById(el2);
if (el1id.value != el2id.value) {
err = "The values of '" + name1 + "' and '" + name2 + "' ";
err += "do not match<br />";
}
}
return err;
}
if (document.all || document.getElementById) {
elid = document.getElementById(el);
if ((elid.value.length < minlength) &&
(minlength != 0)) {
err = "The '" + name + "' field must be at least ";
err += minlength + " characters long<br />";
}
if ((elid.value.length > maxlength) &&
(maxlength != 0)) {
err += "The '" + name + "' field must be no more than ";
err += maxlength + " characters long<br />";
}
}
return err;
}
if (document.all || document.getElementById) {
elid = document.getElementById(el);
if (elid.value == "") {
err = "The '" + name + "' field cannot be blank<br />";
}
}
return err;
}
// Validation engine
function FRMvalidate() {
if (document.all || document.getElementById) {
if (err.length != 0) {
errid = document.getElementById("errtext");
errid.innerHTML = err;
return(false);
} else {
validateid = document.getElementById("validated");
validateid.value = "true";
}
}
return(true);
}
</script>
<style type="text/css">
.errtext { color: red; }
td { padding: 5px; }
</style>
<body>
<p>
<form action="http://www.example.com/handler.cgi" method="GET"
onsubmit="return FRMvalidate();">
<table border="0">
<tr>
<td>First name:</td>
<td><input type="text" id="firstname" name="firstname" size="20"
maxlength="20" /></td>
</tr><tr>
<td>Last name:</td>
<td><input type="text" id="lastname" name="lastname" size="20"
maxlength="20" /></td>
</tr><tr>
<td>Email:</td>
<td><input type="text" id="email" name="email" size="20"
maxlength="40" /></td>
</tr><tr>
<td>Password:</td>
<td><input type="password" id="password" name="password" size="20"
maxlength="20" /></td>
</tr><tr>
<td>Confirm Password:</td>
<td><input type="password" id="confpass" name="confpass" size="20"
maxlength="20" /></td>
</tr><tr>
<td><input type="submit" id="submit" name="submit" value="Submit" /></td>
<td><input type="reset" id="reset" name="reset" /></td>
</tr>
</table>
Output
This example generates a form with an error section above it, as shown in Figure 23-10. If the
user has an error in a field and clicks Submit, the error text appears to help the user identify
and fix the problem.
Figure 23-10
How It Works
The example works by tying into the onsubmit() event handler of the form element. This
handler runs the specified code — in this case calling the FRMvalidate() function — when the
user tries to submit the form. If the expression of the onsubmit() handler (the result of the
function) returns true, the form is submitted to the appropriate form handler as defined in the
form tag. If the expression evaluates to false, the form is not submitted; it is as if the user never
clicked the Submit button.
The FRMvalidate() function calls smaller functions to check for specified conditions. For
example, the FRMnonBlank() function tests to see if the specified field is not blank. Each validation
function takes the ID and name of an element to check and returns a string containing
the error(s) found, if any. The error text helpfully includes the supplied form element’s name.
The FRMvalidate() function appends each error to a combined error log, which is displayed
to the user if necessary. If the error string remains blank after all validation routines have been
run, the FRMvalidate() function returns true, allowing the form to be submitted. The function
also sets the hidden validated field to true so that the form handler knows the data has
passed the basic validation.
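The pattern described above (small checks that each return an error string, combined by one engine function) can be sketched without any browser objects. This is a minimal illustration only, not the listing's code: the names checkNonBlank(), checkLength(), and validateFields() are hypothetical, and the functions take plain values instead of element IDs so that the logic can be run anywhere.

```javascript
// Hypothetical, DOM-free versions of the checks described above.
// Each function returns an error string, or "" when the value passes.
function checkNonBlank(value, name) {
  return value === "" ? "The '" + name + "' field cannot be blank<br />" : "";
}

function checkLength(value, name, minlength, maxlength) {
  var err = "";
  // A limit of 0 means "no limit", as in the listing.
  if (minlength !== 0 && value.length < minlength) {
    err += "The '" + name + "' field must be at least " +
           minlength + " characters long<br />";
  }
  if (maxlength !== 0 && value.length > maxlength) {
    err += "The '" + name + "' field must be no more than " +
           maxlength + " characters long<br />";
  }
  return err;
}

// Collect every error; an empty result means the form may be submitted.
function validateFields(fields) {
  var err = "";
  for (var i = 0; i < fields.length; i++) {
    var f = fields[i];
    err += checkNonBlank(f.value, f.name);
    err += checkLength(f.value, f.name, f.min, f.max);
  }
  return err;
}

var errors = validateFields([
  { name: "First name", value: "Steve", min: 2, max: 20 },
  { name: "Password", value: "abc", min: 6, max: 20 }
]);
var okToSubmit = (errors.length === 0);  // false: the password is too short
```

In a page, the value of okToSubmit is what an onsubmit handler would return, and the errors string is what would be written into the error section.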
JavaScript validation should never be relied on by itself. Because it is run on the client, it is possible for an
unscrupulous user to modify the code to circumvent the validation. It is important that your form
handler do its own in-depth validation before doing anything useful with the data. See the chapters
in the next two parts of this book for help on writing form handlers in CGI and PHP.
If the script is run on a nonsupported browser, it will still enable the form data to be passed to the
handler, but the validated field will remain false.
Source
<html>
<head>
<title>Swapping Styles</title>
<style type="text/css">
.initialbox { width: 20; height: 20;
position: absolute;
top: 200; left: 200;
visibility: visible;
border: thick solid black;
background-color: red;
overflow: hidden;
z-index: 3;
}
.finalbox { width: 200; height: 200;
top: 200; left: 200;
position: absolute;
visibility: visible;
border: none;
background-image: url(A45.jpg);
background-position: center center;
background-repeat: no-repeat;
overflow: hidden;
z-index: 3;
}
</style>
<script type="text/JavaScript">
// Determine box by id = hiddenbox
// if browser supports it
function gethiddenbox() {
  if (document.all || document.getElementById) {
    return document.getElementById("hiddenbox");
  } else {
    return false;
  }
}
box.className = “initialbox”;
}
}
}
</script>
</head>
<body>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed
do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed
do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed
do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed
do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.</p>
<p>
<input type="button" id="swapstyles" value="Swap Styles"
onclick="swapStyles();">
</p>
You may have noticed that this example’s code does not include a DOCTYPE declaration. This,
unfortunately, is by design: including the DOCTYPE declaration causes this example not to work
in Mozilla Firefox or Microsoft Internet Explorer. Although a DOCTYPE nominally only informs
agents and validation tools of the standard the document is following, both browsers also use it
to switch from quirks rendering to standards-compliant rendering. The most likely culprit here is
the unitless lengths in the style classes (width: 20, top: 200, and so on): quirks mode treats them
as pixels, whereas standards mode ignores them as invalid CSS. Adding px to each length
should allow the example to run with a DOCTYPE in place.
Output
This example toggles a paragraph element’s attributes each time the button is clicked. The two
different styles of the element are shown in Figures 23-11 and 23-12.
Figure 23-11
How It Works
This example works by examining the current style class assigned to the element and changing
it to another, toggling between the two sets of styles. The button’s onclick() event
handler calls the swapStyles() function to swap the styles via the className property,
significantly changing the appearance of the element.
Note that each class needs to include every style necessary for the appropriate state of the
paragraph element. This example does not take advantage of style inheritance.
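The toggle itself reduces to a single comparison on className. The following is a minimal sketch of that idea rather than the listing's own function: nextClass() and swapStylesSketch() are hypothetical names, and a plain object stands in for the element that a browser would return from document.getElementById("hiddenbox").

```javascript
// Pure helper: given the current class name, return the other one.
function nextClass(current) {
  return current === "initialbox" ? "finalbox" : "initialbox";
}

// Stand-in for a DOM element (an assumption made so the sketch runs
// outside a browser).
var box = { className: "initialbox" };

function swapStylesSketch() {
  box.className = nextClass(box.className);
}

swapStylesSketch();   // box.className is now "finalbox"
var afterFirstClick = box.className;
swapStylesSketch();   // box.className is back to "initialbox"
var afterSecondClick = box.className;
```

Keeping the decision in a small pure function makes it easy to extend the toggle to more than two classes later.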
Figure 23-12
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Drop Down Menu</title>
<style type="text/css">
</style>
<script type="text/JavaScript">
// Netscape layers
if( document.layers ) {
return document.layers[eID];
}
// DOM; IE5, NS6, Mozilla, Opera
if( document.getElementById ) {
return document.getElementById(eID);
}
// Proprietary DOM; IE4
if( document.all ) {
return document.all[eID];
}
// Netscape alternative
if( document[eID] ) {
return document[eID];
}
return false;
}
</script>
</head>
<body onload="initmenu();">
<div class=”main”>
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu
fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.</p>
</div>
</body>
</html>
Output
This document displays a MENU tab at the top of the screen that, when moused over, drops
down a menu of links. The tab is shown in Figure 23-13 and the full menu is shown in
Figure 23-14.
Figure 23-13
How It Works
The script works by placing the tab and menu tables within a division (<div>) that incorporates
onmouseover() and onmouseout() event handlers. When the mouse moves appropriately,
the corresponding function is called to drop down or roll up the menu. The menu
animation is accomplished by toggling the display property of the menu table. The current
state of the menu is kept in the menushown variable to keep the functions from performing
unnecessary work (showing the menu when it is already shown, and vice versa).
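The role of the menushown variable can be sketched as a simple guard around each state change. This is an illustration under stated assumptions, not the listing's code: showMenu() and hideMenu() are hypothetical names, the menu object is a stand-in for the menu table element, and the "block" display value is an assumption.

```javascript
var menushown = false;   // current state of the menu
var changes = 0;         // counts how often the display value really changes

// Stand-in for the menu table element (hypothetical).
var menu = { style: { display: "none" } };

function showMenu() {
  if (menushown) { return; }        // already shown, nothing to do
  menu.style.display = "block";
  menushown = true;
  changes++;
}

function hideMenu() {
  if (!menushown) { return; }       // already hidden, nothing to do
  menu.style.display = "none";
  menushown = false;
  changes++;
}

// Mouse events often fire repeatedly; the guard absorbs the duplicates.
showMenu();
showMenu();
hideMenu();
```

After these three calls, only two of them have touched the display property, which is the saving the state variable is there to provide.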
Figure 23-14
❑ Actually moving the menu using positioning properties (top, left, and so on).
❑ Resizing the menu’s containing block using size properties (width, height).
❑ Using the visibility property (set to hidden) instead of the display property (hiding the menu by
making it invisible instead of moving it off-screen). Note that using the display property
would require a change in the position properties, as well: positioning the menu
permanently in the visible position.
The mechanism for dropping down or rolling up the menu can be changed, as well. Although
the current mechanism is quite fluid in Microsoft Internet Explorer, it is a bit erratic in Mozilla
Firefox. An alternative would be to use an onclick() event handler on the MENU tab table that
toggles the state of the menu.
Source
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>Div Movement</title>
<style type="text/css">
/* Set initial styles for div */
.hiddendiv { width: 0px; height: 0px;
  position: absolute;
  top: 0px; left: 1px;
  color: white;
  visibility: hidden;
  border: thin dotted black;
  background-image: url(A45.jpg);
  background-position: center center;
  background-repeat: no-repeat;
  overflow: hidden;
  z-index: 3;
}
</style>
<script type="text/JavaScript">
// Netscape layers
if( document.layers ) {
return document.layers[divID];
}
// DOM; IE5, NS6, Mozilla, Opera
if( document.getElementById ) {
return document.getElementById(divID);
}
// Proprietary DOM; IE4
if( document.all ) {
return document.all[divID];
}
// Netscape alternative
if( document[divID] ) {
return document[divID];
}
return false;
}
function movediv() {
// Set up measurements
var noPx = document.childNodes ? 'px' : 0;
var div = getdivID("hiddendiv");
</head>
<body>
<div>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed
do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed
do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed
do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
enim ad minim veniam, quis nostrud exercitation ullamco laboris
nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.</p>
<p><form>
<input type="button" id="reveal" value="Reveal Div"
onclick="revealdiv();" />
</form></p>
</div>
Output
This example animates a division (<div>), moving it from the upper-left corner of the screen
to a position in the middle of the document while it grows from a small size to its normal size.
Figures 23-15, 23-16, and 23-17 illustrate the animation in progress.
How It Works
This example incorporates more error checking and cross-platform code than any other exam-
ple in this chapter. The initial state of the division is set via the hiddendiv style class; the
element is hidden, set to a size of 10 pixels square, and placed in the upper-left corner of the
document’s display.
Figure 23-15
Figure 23-16
Figure 23-17
The button’s onclick() event handler calls the revealdiv() function that prepares the divi-
sion for movement (makes the element visible, sets initial size and position values) and then
calls the movement function, movediv().
The movediv() function increases the size of the element and moves it 10 pixels right and
down each time it is called. A timer is used to call the movediv() function every 20 millisec-
onds until the element reaches the specified location on-screen (200 pixels from the left mar-
gin). Once the element arrives at its destination, the function stops calling itself and the
operation is done.
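The timer-driven movement described above can be sketched as a pure step function plus a self-rescheduling caller. The step size, delay, and destination come from the text; everything else is an assumption made for illustration, including the names nextState(), animate(), and framesNeeded(), the use of a plain state object instead of a real element, and growing the box by the same 10 pixels per call.

```javascript
var STEP = 10;      // pixels moved per call
var TARGET = 200;   // destination, in pixels from the left margin
var DELAY = 20;     // milliseconds between calls

// Compute the next position and size from the current one.
function nextState(s) {
  return { left: s.left + STEP, top: s.top + STEP,
           width: s.width + STEP, height: s.height + STEP };
}

// Timer-driven version: reschedules itself until the target is reached.
// onFrame and onDone are caller-supplied callbacks (hypothetical).
function animate(state, onFrame, onDone) {
  if (state.left >= TARGET) { onDone(state); return; }
  var next = nextState(state);
  onFrame(next);
  setTimeout(function () { animate(next, onFrame, onDone); }, DELAY);
}

// Synchronous check of how many calls the animation needs.
function framesNeeded(state) {
  var count = 0;
  while (state.left < TARGET) { state = nextState(state); count++; }
  return count;
}

var callsNeeded = framesNeeded({ left: 0, top: 0, width: 10, height: 10 });
var totalMs = callsNeeded * DELAY;
```

Starting from the left edge, the sketch needs 20 calls, so at 20 milliseconds apiece the whole movement takes roughly 400 milliseconds plus timer overhead.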
Most of the error checking built into the script centers around the getdivID() function, which
returns a reference to the element with a given ID regardless of the DOM that the user agent
supports. Note that this function supports far more diverse user agents than previous examples
and can be retrofitted into other scripts needing more than Firefox and IE compatibility.
Additional error checking is built into checking for appropriate use of the style property (for
example, using div.width or div.style.width) and appropriate values for the visibility
style. Also, because of a quirk in Firefox you must set the initial values of the division element
using JavaScript; Firefox seems incapable of reading the size and position values from the style
class set by XHTML.
Lastly, the appropriate measures (pixels or more specifically px) must be used and worked
around when present. The initial assignment of the noPx variable is done according to the user
agent capabilities. Just in case px appears in the values, parseInt is used to parse the integer
values out of the appropriate properties before calculations are performed.
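The unit handling can be sketched in isolation. The helper names readPixels() and writePixels() are hypothetical, and treating an unparsable value as 0 is an assumption; the unit argument plays the role of the noPx variable, which is either 'px' or 0.

```javascript
// Read a CSS length that may or may not carry a "px" suffix.
function readPixels(value) {
  var n = parseInt(value, 10);   // "200px" and "200" both give 200
  return isNaN(n) ? 0 : n;       // treat "" or "auto" as 0 (an assumption)
}

// Write a length back, appending the unit only when the agent expects one.
function writePixels(n, unit) {
  return unit ? n + unit : n;
}

var left = readPixels("190px") + 10;     // numeric arithmetic, not string
var withUnit = writePixels(left, "px");  // for agents that expect a unit
var withoutUnit = writePixels(left, 0);  // for agents that expect a number
```

Without the parseInt() step, "190px" + 10 would be string concatenation and produce "190px10", which is why the values are parsed before any calculation.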
This example can also be combined with other examples in this chapter. The division can have
content appear after being moved and/or the error checking routines present in this example
can be used for effect in other examples.
Summary
This chapter wraps up the book’s coverage of JavaScript. Using the examples and resources within
this chapter, you should be able to utilize JavaScript to the best of its potential.
CGI Basics
The Common Gateway Interface (CGI) is an important tool in a Web programmer’s bag of tricks.
When HTTP was created, it was introduced as a simple way to receive and respond to queries for
documents. As the Web grew, it became important that the protocol be able to interface with addi-
tional resources beyond those of simple textual documents. This chapter introduces CGI as a con-
cept along with the techniques and technology involved in its operation.
Perl was perhaps the earliest means of accomplishing CGI, although any programming language
capable of reading from standard input (STDIN) and writing to standard output (STDOUT) can be
used for CGI purposes.
Note that this section provides the bare essentials regarding HTTP requests and responses. More
detail on this subject is covered in Chapter 1 in the section “The Web Protocol.”
Chapter 24
Figure 24-1: A Web client sends an HTTP request to the Web server, which returns an HTTP response.
This request asks for a document in a particular directory (governed by somepath) and tells the server
that the client speaks HTTP version 1.1. For example, to request the top-level page from a server, a client
might use the following:
GET /index.html HTTP/1.1
This request asks for the index.html file from the server’s root directory.
The server then responds with a header indicating the success or failure of the request. When the request
can be fulfilled, the server usually responds with the following header:
HTTP/1.1 200 OK
This header indicates it is in HTTP 1.1 format and includes the status code in both numeric (200) and
textual (OK) form. If appropriate, the header is followed by the content requested. If the request cannot
be fulfilled, the server instead sends an error code in the status line, such as the familiar 404
(document not found) error:
HTTP/1.1 404 Not Found
In some cases, such as the preceding 404 error, the server is configured to perform other actions besides
simply returning an error. For example, most Web servers are configured to send a special page to a
client along with the 404 error, as shown in Figure 24-2.
Using the first method (GET), data is embedded in the URL by separating the request info and the data
with a question mark (?). The data is then contained in name/value pairs, separated by ampersands (&).
For example, to pass my first and last name in a URL, I could use the following:
http://www.example.com/index.html?firstname=Steve&lastname=Schafer
Figure 24-2
❑ lastname=Schafer — The second data field (lastname) and its data (Schafer)
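The way a GET URL breaks down into a resource and its name/value pairs can be sketched in a few lines. This is an illustration only: parseGetUrl() is a hypothetical helper, and real CGI libraries handle more cases (repeated names, malformed escapes) than this sketch does.

```javascript
// Split a GET URL into the requested resource and its name/value pairs.
function parseGetUrl(url) {
  var q = url.indexOf("?");
  var resource = q === -1 ? url : url.substring(0, q);
  var query = q === -1 ? "" : url.substring(q + 1);
  var data = {};
  if (query !== "") {
    var pairs = query.split("&");          // pairs are separated by ampersands
    for (var i = 0; i < pairs.length; i++) {
      var eq = pairs[i].indexOf("=");
      var name = eq === -1 ? pairs[i] : pairs[i].substring(0, eq);
      var value = eq === -1 ? "" : pairs[i].substring(eq + 1);
      // "+" stands for a space in form data; %XX escapes are decoded.
      data[decodeURIComponent(name)] =
        decodeURIComponent(value.replace(/\+/g, " "));
    }
  }
  return { resource: resource, data: data };
}

var parsed = parseGetUrl(
  "http://www.example.com/index.html?firstname=Steve&lastname=Schafer");
// parsed.data.firstname is "Steve" and parsed.data.lastname is "Schafer"
```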
Using the second method (POST), the data is sent to the server in the body of the HTTP request, following
the headers. For example, the following request carries the same data as in the preceding GET example:
firstname=Steve&lastname=Schafer
The minimal headers tell the destination (script or page) that the incoming data is encoded; the actual
data is passed as name/value pairs after the headers (and the end-header blank line).
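The layout of such a request (headers, a blank line, then the encoded pairs) can be sketched as a string-building function. The header names used here are standard HTTP, but the exact set of headers is an assumption, and buildPostRequest() is a hypothetical helper written only to show the structure.

```javascript
// Build the text of a minimal form-encoded POST request.
function buildPostRequest(host, path, fields) {
  var pairs = [];
  for (var name in fields) {
    if (Object.prototype.hasOwnProperty.call(fields, name)) {
      pairs.push(encodeURIComponent(name) + "=" +
                 encodeURIComponent(fields[name]));
    }
  }
  var body = pairs.join("&");
  return [
    "POST " + path + " HTTP/1.1",
    "Host: " + host,
    "Content-Type: application/x-www-form-urlencoded",
    // The encoded body is pure ASCII, so its length equals its byte count.
    "Content-Length: " + body.length,
    "",                       // blank line marks the end of the headers
    body
  ].join("\r\n");
}

var request = buildPostRequest("www.example.com", "/index.html",
  { firstname: "Steve", lastname: "Schafer" });
```

Unlike the GET form, none of the data appears in the request line, so it does not show up in the address bar or in typical server access logs.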
In either case, the data is available to the CGI script. In most cases, dedicated libraries or variables exist
to address the data. For example, Perl has a CGI module for addressing GET and POST data, while PHP
provides the $_GET and $_POST arrays.
The individual means of handling GET and POST data are covered in the respective script language
chapters in this part of the book.
This is best seen with an example. Suppose that a client submits the following request:
GET /cgi-bin/somescript.cgi HTTP/1.1
If the server can find the script /cgi-bin/somescript.cgi, the script is executed according to local
system policies — usually allowing the script to communicate directly with the client. This process
resembles that shown in Figure 24-3.
Figure 24-3
The best thing about CGI is that most of the communication sent back to the client can be handled
directly through the standard output (STDOUT) device. For example, this simple Perl script sends a short,
concise document to any client requesting it:
#!/usr/bin/perl -w
Note, however, that the burden of HTTP protocol adherence is now the script’s responsibility. It has to
send the appropriate header(s), as necessary, so that the client can appropriately understand the data.
The following script shows how easy it can be to handle data passed to a script:
example.pl
#!/usr/bin/perl -w
As previously mentioned, Perl has several libraries that make dealing with CGI tasks easier. In the preceding
example, the CGI library is used to access the data passed via GET.
Figure 24-4 shows the result when this script is accessed using the URL
test.pl?firstname=Steve&lastname=Schafer.
Figure 24-4
❑ Having the required scripting language installed
❑ Being configured to allow CGI scripts to run (usually restricted to certain directories on the server)
❑ Having appropriate permissions to execute the script(s)
The real power of CGI comes from the capabilities of the scripting language, which is usually capable of
the following:
Due to the power inherent in such actions, many system administrators do not have CGI enabled on
their servers or choose to restrict its usage.
Another risk of CGI scripting is the load such scripts can put on the server. Improperly written scripts
can use too many resources, bringing a server to its knees.
It is also imperative that script authors take all necessary actions to ensure that their scripts are well
written, well behaved, and pose no serious security risks to the server on which they are run.
The separate scripting chapters in this part of the book contain tips and techniques that can be employed
to help write safe and secure scripts.
Source
#!/usr/bin/perl
HTML
# Close document
print "</body>\n</html>";
Output
The output of this script is shown in Figure 24-5. Note that the parameter (name), passed via
GET, appears on the address line (appended to the URL).
GET parameter
Figure 24-5
Summary
This chapter introduced CGI and showed the basics of how the technology works. Subsequent chapters
in this part of the book cover the specifics of particular scripting languages. Useful examples are covered
in each chapter and compiled in Chapter 28, “Using CGI.”
Perl Language
Perl is one of the mainstays of CGI scripting. Originally conceived as a simple reporting language,
it has grown to be one of the most popular Web and system scripting languages. Although some-
what quirky in nature, the community around Perl has created many additional resources and
code to help even the neophyte programmer tap its capabilities.
The Perl community has created hundreds of modules, constructs that can easily be added to your
scripts to increase their capabilities without having to code the capabilities yourself. There are
modules to access other application data (databases, and so on), emulate protocol stacks, commu-
nicate with hardware, and more.
However, rapid growth from humble beginnings brings with it some growing pains. Perl is quirky
and somewhat dated; although powerful, a lot of Perl’s power comes in the form of patches and
hacks. Still, Perl is one of the most powerful and versatile scripting solutions available.
❑ The Perl.org Web site (http://www.perl.org): The main site for information and
documentation on Perl. Download Perl, read online documentation, and follow Perl’s
development here.
Chapter 25
❑ The Perl.com O’Reilly Web site (http://www.perl.com): A great resource for Perl documentation,
code examples, and so on.
❑ The Comprehensive Perl Archive Network (CPAN) (http://www.cpan.org): The one and
only comprehensive resource for Perl modules. Search the module archive based on keywords,
capabilities, or module names. The archive includes entries for almost all previous versions of
modules as well as the most current.
❑ With few exceptions, code lines should end with a semicolon (;). Notable exceptions to the
semicolon rule are lines that end in a block delimiter ({ or }).
❑ Blocks of code (usually under control structures such as functions, if statements, and so on)
are enclosed in braces ({ and }).
❑ Explicit declaration of all variables is a good idea, and necessary if using strict data
declarations.
❑ The use of functions to delimit code fragments is recommended and increases the ability to
execute those fragments independently from one another. (Functions are discussed in the
“User-Defined Functions” section, later in this chapter.)
❑ Comments can be inserted in Perl code by prefixing the comment with a pound sign (#); the
comment ends at the next line break. Perl has no /* */ comment syntax. For multiline comments,
prefix each line with # or enclose the text in POD directives (for example, =pod and =cut).
Data Types
Perl supports three types of data:
❑ Scalars
❑ Arrays
❑ Associative arrays
Perl also supports objects — an unordered container of various data types. Objects are covered in the
“Objects” section later in this chapter.
Scalar values are individual numeric or string values. Perl has no separate Boolean type; any scalar can
be tested for truth. Example scalar values include 3, 3.14159, and “character data.”
Arrays are collections of scalars. Individual elements in an array are referenced by an index relating to
their position in the array, counting from zero. For example, in the following string array, “Steve” is
element number 3:
"Terri","Ian","Branden","Steve","Tammy"
Associative arrays are arrays that use string identifiers instead of numeric identifiers for elements of
the array. Associative arrays are generally used to store data associated with a particular identifier. For
example, you could define an associative array to store anniversary dates of employees so that accessing
the array with an index of “Steve” would return the date associated with Steve.
Variables
Perl variables are largely untyped; you can use the same variable to contain different scalar values at dif-
ferent times in your scripts. Of course, doing so is generally frowned upon — the point is that variable
typing in Perl is left up to the programmer.
Perl variable names are prefixed with a specific character depending on the use of the variable, as shown
in the following table:
Character   Use
$           Individual scalar value, individual value in an array or associative array
@           Entire array or slice of an array
%           Entire associative array or slice of an associative array
The index delimiter of normal arrays is the square bracket ([ and ]) while the index delimiter of
associative arrays is the curly bracket ({ and }). For example, the first following print statement
prints the fourth value (index 3) of array name while the second prints the “Steve” value of the
associative array birthdate:
print $name[3];
print $birthdate{'Steve'};
Variables in Perl are declared using standard programming nomenclature (that is, variable name and
equal sign and then the desired value). It is recommended that each variable declaration include the
generic variable constructor (my), as shown in the following examples:
The use of strict pragma (use strict) causes Perl to require explicit variable declarations (before
use), enforces local variable scope, and requires explicit references. All three aspects of the strict
pragma help enforce good programming practices and should be enabled in all of your Perl scripts.
Special Variables
Perl has many special variables to contain internal data, specific to the script being run. Special variables
can be used to hold results of searches, values of environment variables, debugging flags, and more. The
following table details most of these variables.
More information on these variables can be found in the perlvar documentation that comes with Perl
distributions. The format and the location of this documentation depend on your platform and the Perl
distribution you are using. Generally speaking, Linux users can access the perlvar docs by using
man perlvar.
Variable Use
$_ The default parameter for a lot of functions.
$. Holds the current record or line number of the file handle that was last read. It is
read-only and will be reset to 0 when the file handle is closed.
$/ Holds the input record separator. The record separator is usually the newline
character. However, if $/ is set to an empty string, two or more newlines in the
input file will be treated as one.
$, The output separator for the print() function. Normally, this variable is an
empty string. However, setting $, to a newline might be useful if you need to
print each element in the parameter list on a separate line.
$\ Added as an invisible last element to the parameters passed to the print()
function. Normally, an empty string, but if you want to add a newline or some
other suffix to everything that is printed, you can assign the suffix to $\.
$# The default format for printed numbers. Normally, it’s set to %.20g, but you can
use the format specifiers covered in the section “Example: Printing Revisited” in
Chapter 9 to specify your own default format.
$% Holds the current page number for the default file handle. If you use select()
to change the default file handle, $% will change to reflect the page number of
the newly selected file handle.
$= Holds the current page length for the default file handle. Changing the default
file handle will change $= to reflect the page length of the new file handle.
$- Holds the number of lines left to print for the default file handle. Changing the
default file handle will change $- to reflect the number of lines left to print for
the new file handle.
$~ Holds the name of the default line format for the default file handle. Normally,
it is equal to the file handle’s name.
$^ Holds the name of the default heading format for the default file handle.
Normally, it is equal to the file handle’s name with _TOP appended to it.
$| If nonzero, will flush the output buffer after every write() or print()
function. Normally, it is set to 0.
$$ This UNIX-based variable holds the process number of the process running the
Perl interpreter.
$? Holds the status of the last pipe close, back-quote string, or system() function.
$& Holds the string that was matched by the last successful pattern match.
$` Holds the string that preceded whatever was matched by the last successful
pattern match.
$' Holds the string that followed whatever was matched by the last successful
pattern match.
$+ Holds the string matched by the last bracket in the last successful pattern
match. For example, the statement /Fieldname: (.*)|Fldname: (.*)/ &&
($fName = $+); will find the name of a field even if you don’t know which of
the two possible spellings will be used.
$* Changes the interpretation of the ^ and $ pattern anchors. Setting $* to 1 is
the same as using the /m option with the regular expression matching and
substitution operators. Normally, $* is equal to 0.
$0 Holds the name of the file containing the Perl script being executed.
$<number> This group of variables ($1, $2, $3, and so on) holds the regular expression
pattern memory. Each set of parentheses in a pattern stores the string that
matches the components surrounded by the parentheses into one of the
$<number> variables.
$[ Holds the base array index. Normally, it’s set to 0. Most Perl authors
recommend against changing it without a very good reason.
$] Holds a string that identifies which version of Perl you are using. When used
in a numeric context, it will be equal to the version number plus the patch level
divided by 1000.
$" This is the separator used between list elements when an array variable is
interpolated into a double-quoted string. Normally, its value is a space
character.
$; Holds the subscript separator for multidimensional array emulation. Its use
is beyond the scope of this book.
$! When used in a numeric context, holds the current value of errno. If used in
a string context, will hold the error string associated with errno.
$@ Holds the syntax error message, if any, from the last eval() function call.
$< This UNIX-based variable holds the real uid of the current process.
$> This UNIX-based variable holds the effective uid of the current process.
$( This UNIX-based variable holds the real gid of the current process. If the
process belongs to multiple groups, $( will hold a string consisting of the group
IDs separated by spaces.
$: Holds a string that consists of the characters that can be used to end a word
when word wrapping is performed by the ^ report formatting character.
Normally, the string consists of the space, newline, and dash characters.
$^D Holds the current value of the debugging flags.
$^F Holds the value of the maximum system file descriptor. Normally, it’s set to 2.
The use of this variable is beyond the scope of this book.
$^I Holds the file extension used to create a backup file for the in-place editing
specified by the -i command line option. For example, it could be equal to
".bak".
$^L Holds the string used to eject a page for report printing.
$^P This variable is an internal flag that the debugger clears so that it will not debug
itself.
$^T Holds the time, in seconds, at which the script begins running.
$^W Holds the current value of the -w command line option.
$^X Holds the full pathname of the Perl interpreter being used to run the current
script.
$ARGV Holds the name of the current file being read when using the diamond
operator (<>).
@ARGV This array variable holds a list of the command line arguments. You can use
$#ARGV to determine the number of arguments minus one.
@F This array variable holds the list returned from autosplit mode. Autosplit mode
is associated with the -a command line option.
@INC This array variable holds a list of directories where Perl can look for scripts to
execute. The list is mainly used by the require statement.
%INC This hash variable has entries for each filename included by do or require
statements. The keys of the hash entries are the filenames, and the values are the
paths where the files were found.
%ENV This hash variable contains entries for your current environment variables.
Changing or adding an entry affects only the current process or a child process,
never the parent process. See the section “Example: Using the %ENV Variable”
later in this chapter.
%SIG This hash variable contains entries for signal handlers.
_ This file handle (the underscore) can be used when testing files. If used, the
information about the last file tested will be used to evaluate the new test.
DATA This file handle refers to any data following __END__.
STDERR This file handle is used to send output to the standard error file. Normally, this
is connected to the display, but it can be redirected if needed.
STDIN This file handle is used to read input from the standard input file. Normally, this
is connected to the keyboard, but it can be changed.
STDOUT This file handle is used to send output to the standard output file. Normally, this
is the display, but it can be changed.
String Operators
Operator Use
. Concatenation
x Repetition
String Tokens
Token Character
\b Backspace
\e Escape
\t Horizontal Tab
\n Line feed
\v Vertical Tab
\f Form feed
\r Carriage return
\" Double quote
\' Single quote
\$ Dollar sign
\@ At sign
\\ Backslash
Control Structures
Like other languages, Perl supports many different control structures that can be used to execute partic-
ular blocks of code based on decisions or repeat blocks of code while a particular condition is true. The
following sections cover the various control structures available in Perl.
while (<expression>) {
# statement(s) to execute
}
Because the <expression> is evaluated at the beginning of the loop, the statement(s) will not be
executed if the <expression> is false at the beginning of the loop. For example, the following loop will
execute 21 times (for $x equal to 0 through 20), each iteration incrementing $x until it reaches 21:
my $x = 0;
while ($x <= 20) { # loop while $x is at most 20 (exits once $x reaches 21)
  $x++; # increment $x
}
The until loop is similar to the while loop except that the loop is executed until the expression is true
(that is, for as long as it is false):
until (<expression>) {
  # statement(s) to execute while expression is false
}
For
The for loop executes statement(s) a specific number of times and is governed by two expressions and a
condition:
for (<initial_value>; <condition>; <loop_expression>) {
  # statement(s) to execute
}
The <initial_value> expression is evaluated at the beginning of the loop; this event occurs only
before the first iteration of the loop. The <condition> is evaluated at the beginning of each loop
iteration. If the condition returns true, the current iteration is executed; if the condition returns false,
the loop exits and the script execution continues after the loop’s block. At the end of each loop iteration,
the <loop_expression> is evaluated.
Although their usage can vary, for loops are generally used to step through a range of values via a
specified increment. For example, the following loop begins with the variable $x equal to 2 and exits once
$x exceeds 40; each loop iteration increments $x by 2:
for ($x = 2; $x <= 40; $x+=2) { # for $x = 2 to 40, by 2 (even numbers only)
# statement(s) to execute
}
Foreach
The foreach loop is similar to a normal for loop, but it assigns values to the controlling variable from
a list, one after another, until all values have been assigned. This loop is handy for finding the largest
element, printing all the elements (or performing a particular task on all elements), or simply seeing if a
given value is a member of a list. The foreach structure has the following syntax:
foreach <variable> (<list>) {
  # statement(s) to execute
}
For example, the following code would print all the values in the @names array, each followed by a
newline (\n):
my @names = ("Steve","Terri","Ian","Angie","Branden");
foreach my $name (@names) {
  print $name."\n"; # Will print all values from the names array
}
If Else
The if and if else constructs execute a block of code depending on the evaluation (true or false) of an
expression. The if construct has the following syntax:
if (<expression>) {
# statement(s) to execute if expression is true
} [ else {
# statement(s) to execute if expression is false
} ]
For example, the following code tests if the value stored in $i is the number 2:
if ($i == 2) {
  # statement(s) to execute if the value in $i is 2
}
The following code will execute one block of code if the value of $i is an odd number, another block of
code if the value of $i is an even number:
if (($i % 2) != 0) {
  # statement(s) to execute if $i is odd
} else {
  # statement(s) to execute if $i is even
}
You can also use complex expressions in an if statement, as in the following example:
You can also create else if constructs in Perl by nesting if statements within one another, as
shown in the following code:
if (($i % 2) != 0) {
  # statement(s) to execute if $i is odd
} else {
  if ($i == 12) {
    # statement(s) to execute if $i is 12
  }
}
However, Perl provides the elsif directive for just this purpose:
if (($i % 2) != 0) {
  # statement(s) to execute if $i is odd
} elsif ($i == 12) {
  # statement(s) to execute if $i is 12
}
Control Use
continue Performs additional statements at the end of each loop iteration (see notes after
this table).
next Skip immediately to the conditional statement, bypassing any other statements.
last Exit the current loop as though the condition had been met.
redo Repeat the current iteration of the loop. (Note that neither the increment/
decrement expression nor the conditional expression is evaluated before
restarting the block.)
The continue statement creates a special block of code that is executed at the end of each loop iteration,
immediately before the loop condition expression is evaluated. A continue block is executed even in
the event that the loop does a next, but not when it does a redo or a last (both skip the continue
block). A while loop with a continue block would resemble the following:
while (<expression>) {
# redo jumps to here
# statements;
} continue {
# next jumps to here
# statements;
# standard loop jumps back to <expression>
}
# last jumps to here
Regular Expressions
One technology that Perl is renowned for is regular expressions (commonly abbreviated regex). Regular
expressions are special strings used as a template to match content, a kind of search expression in a way.
Regular expressions are used in one of three ways: to match, to substitute, and to translate. Using regu-
lar expressions, you can construct advanced pattern-matching algorithms. The following sections cover
the basics of Perl regular expressions.
Coverage of regular expressions can fill a book in itself. This section serves only as an introduction to
the subject.
By default, Perl uses the built-in variable $_ as the space to be searched/matched by a regular
expression and the variables $n ($1, $2, and so on) as variables to store data returned from regular
expression operations. You can also use the binding operator (=~) to perform a regex operation on any
variable’s content. For example, to perform a regex operation on the text stored in the string variable
$content, you could use the following code:
$content =~ /<regex_pattern>/;
If the pattern matches content in the searched variable, the expression will return a true value.
Therefore, you can also use regular expressions within control structures, as in the following code example:
if ($content =~ /^Yabba(.*)Doo!$/) {
# do something if regex found in $content
}
Character Use
^ Match beginning of line
$ Match end of line
. Match any character (except newline)
| Specify alternatives to match
[] Specify group or range to match
\ Escape the next character
Other character Match literal character
You can modify each matching character or expression with a matching quantifier, specifying how many
matches should be found. The valid quantifiers are described in the following table.
Quantifier Meaning
* Match 0 or more times
+ Match 1 or more times
? Match 0 or 1 times
{n} Match exactly n times
{n,} Match at least n times
{n,m} Match between n and m times (inclusive)
Perl regular expressions also have meta and control characters that can be used in expressions. The meta
and control characters are described in the next table.
Character Meaning
\t Tab
\n Newline
\r Carriage return
\f Form feed
\d and \D Digit and nondigit
\s and \S Space and nonspace (white space and non–white space)
\w and \W Word and nonword
\b and \B Word and nonword boundary
Example Expressions
Before moving on to how regular expressions can be used in Perl, it is important to understand how the
pattern-matching strings are constructed and what they match.
As mentioned in the previous section, an asterisk quantifier (*) is used to match 0 or more of a particular
character. Consider the following expression:
ba*h
This expression would match bah, baah, baaah, and so on, as well as bh. The * quantifies that the a
should be matched 0 (bh) or more (baaaaah) times.
If the + quantifier were used instead of * (for example, ba+h), bh would not be matched because the a
would have to be matched at least once.
Using the group/range construct ([ and ]), you can specify a group of characters or a contiguous range
of characters to match. To specify a group of characters, you simply list them in between the brackets.
For example, consider the following expression:
b[uo]t
This expression would match but and bot (either a u or an o between the b and t). To specify a range,
you list the beginning and ending character separated by a dash (-). Ranges can be included in groups
simply by listing them next to other elements. For example, consider the following expressions and com-
ments indicating what they would match:
A caret (^) can be used as the first character within a group/range construct to negate it. For example,
the following expression would match any single character that is not a digit:
[^0-9]
Modifying Expressions
There are a few characters that can be appended after the regular expression (after the closing /), modi-
fying how the regular expression is applied. The modifiers are listed in the following table:
Modifier Meaning
i Case-insensitive match
g Global match (useful mainly with substitutions)
m Treat string as multiple lines (^ and $ will then match at each line boundary)
s Treat string as one line (. will then match newlines)
For example, the global switch (g) can be used so that a substitution will replace all substrings
matched, not just the first substring (as is the default):
Memorizing Substrings
If you enclose parts of an expression in parentheses, whatever is matched by the parenthetical pattern
will be returned in the appropriate $n variable (first parenthetical in $1, second parenthetical in $2, and
so forth). For example, the following code would print “abba” and “Doo!”:
#!/usr/bin/perl
print $1."\n".$2."\n";
The parentheticals can also be used in the expression as shorthand match expressions by prefixing the
appropriate number with a backslash (\1 for the first parenthetical, and so on). Of course, the shorthand
must come after the parenthetical for which it is a match. For example, the following expression matches
the earlier “Yabba Dabba Doo!” example:
/^Y(.*)\sD\1.*/ # match first Y"abba" and use shorthand (\1) for second D"abba"
Built-in Functions
Perl has a host of built-in functions for manipulating data. The functions are accessed in typical fashion
for programming languages:
function_name(<argument(s)>)
Most functions return values; therefore, you can use functions to make assignments, use them as/in
expressions in loop statements, and so on.
Strictly speaking, all functions return values. However, not all functions can be relied on to return mean-
ingful values.
A fairly comprehensive list of the more popular built-in Perl functions can be found in Appendix D,
“Perl Language Reference.”
User-Defined Functions
You can create your own functions by using the sub directive. User-defined functions have the following
syntax:
sub <function_name> {
# function statements
return <value>;
}
Note that Perl does not include function arguments in the function definition as in most other languages.
That is because the arguments sent to the function are automatically stored in the @_ array. The
first argument is stored in $_[0], the second argument is stored in $_[1], and so forth. Within the
function, it is the programmer’s duty to distill the array into useful variables. For example, to create a
function to find the area of a circle, you could use code similar to the following:
sub areaofcircle {
my $radius = $_[0];
# area = pi * (r squared)
my $area = 3.1415 * ($radius ** 2);
return $area;
}
A popular and efficient way of distilling function parameters is to use code similar to the following:
my(
$firstparam,
$secondparam,
$thirdparam,
$fourthparam,
) = @_;
This code transfers the contents of the @_ array (whose individual elements are $_[0], $_[1], and so on)
to the variables specified. (Of course, you would want to use variable names that are more meaningful to
your function.)
It is worth noting that use of the return statement is not necessary if the last expression evaluated
contains the desired return value (as is the case in the areaofcircle() function example). However, it is
usually best to be explicit with your code and always use the return statement.
File Operations
One of the advantages of using CGI programs is that they can read and write to the filesystem, and Perl is
no exception. Perl includes quite a few functions for dealing with file IO, covered in the following sections.
In an effort to make programs more uniform, there are three connections that always exist when your
program starts. These are STDIN, STDOUT, and STDERR (these names are open file handles).
Opening a File
To open a file, Perl uses (strangely enough) the open function. The open function has the following syntax:
open(FILEHANDLE, "<filename>")
For example, to open the file test.txt you would use code similar to the following:
open (FILE,"test.txt") || die "cannot open file";
The preceding syntax will result in the script exiting and displaying “cannot open file” if the open
function does not succeed.
The FILEHANDLE can be any valid variable name but should be descriptive to be easily identified as a
file handle. One standard practice is to capitalize the file handle variables.
The default operation of the open function is to open a file for reading. To open a file for writing, you
need to preface the filename with a >; to append to a file, you would preface the name with >>. Both of
these conventions are standard output redirectors.
open(FILETOWRITE,">test.txt");
open(FILETOAPPEND,">>test.txt");
For example, the following snippet of code will read all lines from the file test.txt:
...
open (FILE,"test.txt") || die "cannot open file";
while (<FILE>) {
# do something useful with input
}
...
For example, the following code snippet will write the value of the $contents variable to the file
test.txt:
...
my $contents = "This is a line to write to a file";
open (FILE,">test.txt") || die "cannot open file";
print FILE $contents;
...
The select function can be used to change the default output handle from STDOUT to the specified
file handle. For example, both of the following code snippets accomplish the same thing, directing the
print function’s output to the file handled by the file handle FILE:
# name the file handle in the print call
print FILE "output string";
# or make FILE the default output handle first
select(FILE);
print "output string";
Note that the select function also returns the current file handle so you can reset it later if you would
like.
Closing a File
To close a file, you simply use the close function with the appropriate file handle. For example, to close
a file opened with the FILE file handle, you would use the following code:
close(FILE);
Once a file has been closed, it cannot be read from or written to. However, until a written file is closed,
its contents cannot be relied on — the operating system may not write its buffers until the file is closed.
One important difference in using the read function with binary files is the addition of buffer and length
parameters within the function. When used with binary files, the read function has the following syntax:
read(FILEHANDLE, <buffer>, <length>)
The buffer variable can be any scalar variable, but it must be declared before being used.
For example, the following code snippet will read a binary file in 4K chunks:
...
my $buffer = "";
open(FILE,"test.bin") || die "cannot open file";
while ( read(FILE, $buffer, 4096) ) {
# do something useful with contents ($buffer)
}
...
This code reads 4K chunks until read returns 0 at the end of the file (or an undefined value on error), at
which time the while statement exits.
Operator Meaning
-A Returns time since last access, in days
-b Is a block device
-B Is a binary file
-c Is a character device
-C Returns time since last inode change, in days
-d Is a directory
-e Exists
-f Is a regular file
-g Is setgid bit set
-k Is sticky bit set
-l Is a symbolic link
-M Returns age of file (time since last modification), in days
-o Is owned by the effective user
-O Is owned by the real user
-p Is a named pipe
-r Can be read by the effective user
-R Can be read by the real user
-s Returns size of file
-S Is a socket
-t Is open to a tty
-T Is a text file
-u Is setuid bit set
-w Can be written to
-W Can be written to by the current user
-x Can be executed
-X Can be executed by the current user
-z Is size zero
For example, the following code tests whether a regular file exists and can be written to:
$_ = "test.txt";
if ((-f) && (-w)) {
  print "File exists and can be written to";
}
Objects
Perl has robust support for object data types. Although a full description of object-oriented program-
ming is beyond the scope of this book, the following sections provide a primer on Perl’s handling of
objects.
This syntax uses -> to separate objects and methods. For example, when creating a new object of class
dog, you use the new() method similarly to the following:
$doberman = dog_object->new();
Object properties are assigned using the associative assignment operator, =>. This creates the properties
as hashes (associative arrays) and has the benefit of providing a ready-made method for accessing prop-
erty values using simple statements such as the following:
print $doberman->{'color'};
Perl Constructors
One easy way to create constructors is to use a separate namespace in Perl to create a new() function
and associated initialization routines. For example, to create a constructor for the dog class, you could
use code similar to the following:
package dog_object;
sub new {
  my $class = shift;
  my %params = @_;
  bless {
    "color" => $params{"color"},
    "size" => $params{"size"}
  }, $class;
}
This code defines a new namespace (dog_object) where the new() function can be initialized and
distinguished from other new() functions in the script. The new() function itself shifts the class name off
the parameter stack and assigns the rest of the parameters to the params associative array. This array is
then used to initialize the object’s properties, and the bless function marks the resulting hash as an
object of the class named in $class.
#!/usr/bin/perl
package dog_object;
sub new {
  my $class = shift;
  my %params = @_;
  bless {
    "color" => $params{"color"},
    "size" => $params{"size"}
  }, $class;
}
# back to normal namespace
package main;
Modules
As mentioned early in this chapter, a lot of Perl’s power comes from the abundance of prefab modules
available for use with your scripts. Many of these modules are available from CPAN (www.cpan.org).
To use a module, you must install it into your Perl modules directory and then declare the module
within your script. You can employ the use directive to import a module, its classes, variables, and
methods for use in your script. The syntax for the use directive is as follows:
For example, to use the popular CGI module in your scripts, you could use the following at the
beginning of your script:
use CGI qw(:standard);
The qw function (quote words) is used to quickly expand a list into single-quoted words. Most Perl
modules have functions bundled into lists such as :standard; specifying qw(:standard) in a use
statement will expand the list into the individual single-quoted functions to include. Alternatively,
you could specify the individual functions (comma-separated as single words if using qw or comma-
separated single-quoted names if not using qw). Note that you shouldn’t use commas inside a qw with
slashes as delimiters (qw//). Doing so will cause an error.
Perl has the same requirements for CGI as any other scripting language:
By itself, Perl can accomplish a lot of system integration with Web documents — directory listings, read-
ing/writing files, and so on. However, to truly utilize the CGI power of Perl, it is highly suggested that
you use the CGI.pm module.
This module, available via CPAN (www.cpan.org) or packaged specifically for most Linux distributions,
provides the interface for receiving POST and GET data, as well as outputting most XHTML tags and
related data. The latest version of CGI.pm operates using the Perl object model — a new CGI object is
created and acted upon to output the appropriate XHTML.
A quick example of how easy CGI.pm can make your CGI work is shown in the following code:
#!/usr/bin/perl -w
Retrieving data sent via POST or GET is equally simple by accessing the passed data via the Vars
method of the CGI object ($cgi->Vars). The type of data (GET/POST) can be ascertained by examining the
Request Method environment variable ($ENV{REQUEST_METHOD}, which will contain GET or POST).
Example 2 in Chapter 28 shows how to parse GET and POST data using the CGI module.
The first way is to use the -w flag when running Perl. In your scripts, you simply add the flag to the
interpreter line at the top of the script:
#!/usr/bin/perl -w
The second way to increase data reporting is to use the CGI::Carp fatalsToBrowser routine. Including
this function causes Perl to attempt to output any errors to the browser window. This directive can be spec-
ified using the following line in your scripts:
For example, the following script will output the error in the die function, in this case ---An error
occurred here---, as shown in Figure 25-1:
#!/usr/bin/perl -w
use CGI::Carp qw(fatalsToBrowser);
die "---An error occurred here---";
Figure 25-1
The fatalsToBrowser directive helps avoid the problem described in the next section — the dreaded
Apache Internal Server Error message.
If you want to, you can change the default message supplied by the fatalsToBrowser routine by also
importing the CGI::Carp set_message function and using it to define a more appropriate message, as in the
following sample:
#!/usr/bin/perl -w
use CGI::Carp qw(fatalsToBrowser set_message);
set_message("This is a custom error message.");
Figure 25-2
This error doesn’t include a lot of useful information, but it does throw up a flag that should cause you
to check the Apache error log. The following error typically accompanies the Internal Server browser
message:
The root of the actual problem is that the Perl script emitted an error message before it sent the
appropriate HTTP headers that allow the Web server and browser to communicate properly. One way to
troubleshoot the actual error is to run the script from the command line and see what error the
interpreter is reporting.
Summary
This chapter provided a primer to the Perl scripting language. Chapter 26 covers the basics of Python,
another popular scripting language used on the Web. Insight into using other executable code for CGI
purposes is covered in Chapter 27, and practical examples using all forms of CGI are presented in
Chapter 28.
The Python Language
Python is a rising star in CGI scripting. Touted for its uncomplicated nature, readability, and
robustness, it has grown to be one of the most widely used Web and system scripting languages.
Python is interpreted and object oriented. Python has more features than Perl or Tcl and is easier to
learn and use. The documentation and tutorials available to aid in the learning process are much
clearer than those available for PHP. Python also edges out PHP in the elegant way it handles
namespaces. Python has a wide variety of uses, including the following:
❑ The Python.org Web site (http://www.python.org): The main site for information and
documentation on Python, downloading Python, reading online documentation to include a
detailed Module index, and following Python’s development.
❑ The Starship Python Web site (http://starship.python.net/): A pretty good resource
for Python documentation, code examples, and so on, especially the search engine at
http://starship.python.net/crew/theller/pyhelp.cgi.
Modules
A lot of Python’s functionality comes from the prefab modules available for use with your
scripts. Many of these modules are listed in Appendix E, “Python Language Reference.” A more complete
list of modules for the current Python version is available at http://docs.python.org/modindex.html.
Many modules are available, but if you don’t find one you need, you can write it yourself. The distutils
package provides support for building and installing your own modules. The new modules may be written
purely in Python or may be written as extension modules in C. You can also build them from a combination
of Python and C.
Python Interpreter
On a UNIX system, the Python interpreter is usually installed as /usr/local/bin/python or
/usr/bin/python. If its location is in your PATH, you can start it by typing the following command: python.
Python also comes with an Integrated DeveLopment Environment called IDLE, a name usually thought to
honor Eric Idle of the Monty Python comedy troupe, for which Python itself was named.
On a Windows system, the Python IDE is GUI-based and is called PythonWin. PythonWin is usually
installed in the same directory as Python. The executable is called pythonwin.exe.
Python is available for Mac, too. That project is called MacPython, and it contains the Mac version of
IDLE.
Whichever operating system you choose, you will find that, in interactive mode, the interpreter will
display a prompt of >>>. This prompt allows you to type in your Python commands. If any of these
commands are not built in, you must first import the module that contains that command, like this:
>>> import re
To exit the Python interpreter, type an end-of-file character (Control+D on UNIX, Control+Z on
Windows) at the primary prompt. The interpreter will exit with a zero exit status.
You can also invoke a Python script directly by making the first line of the script look like this,
#!/usr/local/bin/python
with the path representing the path to the Python interpreter. Although the preceding hard-coded method
works, on UNIX systems the preferred method is to use the env command, which will look for the Python
interpreter in your PATH. The env command is usually found in /bin or /usr/bin. In this case, the first line
of your script would look like this:
#! /usr/bin/env python
The names of Python scripts usually end with the extension .py. A script may be passed to the
interpreter like this:
python file2execute
Or if it has the executable attribute set, it may simply be executed like this:
/path2script/file2execute
If you are planning to use Tkinter, you might need to set some environmental variables if Tcl/Tk is not
in a standard location. Set TK_LIBRARY and TCL_LIBRARY variables to point to the local Tcl and Tk
library file destinations. In some instances, the PYTHONPATH environment variable is used to specify
possible module locations. In Windows, the PYTHONPATH variable is stored in the Registry.
The os module’s environ object presents the shell environment as a simple Python mapping (it behaves
like a dictionary). Its keys() method will list the environment variables as follows:
>>> import os
>>> os.environ.keys()
['LESS', 'MINICOM', 'LESSOPEN', 'SSH_CLIENT', 'LOGNAME', 'USER', 'INPUTRC',
'QTDIR', 'PS2', 'PATH', 'PS1', 'LANG', 'KDEDIR', 'TERM', 'SHELL', 'XAUTHORITY',
'SHLVL', 'EDITOR', 'MANPATH', 'JAVA_HOME', 'HOME', 'PYTHONPATH', 'T1LIB_CONFIG',
'LS_OPTIONS', '_', 'SSH_CONNECTION', 'WINDOW_MANAGER', 'GDK_USE_XFT', 'SSH_TTY',
'LC_COLLATE', 'HOSTNAME', 'CPLUS_INCLUDE_PATH', 'PWD', 'MAIL', 'LS_COLORS']
Because the PYTHONPATH previously looked at is listed, let’s look to see what it is set to by indexing
environ, as follows:
>>> import os
>>> os.environ['PYTHONPATH']
'/usr/lib/python2.4/;/usr/local/lib/mylib'
Python searches the directories listed in sys.path, which is built from the PYTHONPATH variable (if it is
set) plus installation-dependent defaults. Typically, developers will add their own modules to the
site-packages directory of the Python installation.
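In recent Python versions, you can also use this functionality to set or change the PYTHONPATH like this
(the new value is seen by child processes; it does not alter the module search path of the interpreter that
is already running):
>>> os.environ['PYTHONPATH']="/usr/lib/python2.4"
>>> os.environ['PYTHONPATH']
'/usr/lib/python2.4'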
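As a minimal sketch of the behavior described above (written for Python 3 rather than the 2.4 release shown in the session, and using a made-up variable name, DEMO_SETTING, that is not part of the original text), the following shows os.environ being read and written like a dictionary. It also checks that assigning PYTHONPATH at run time leaves the running interpreter's sys.path unchanged:

```python
import os
import sys

# DEMO_SETTING is a hypothetical variable name used only for illustration.
os.environ["DEMO_SETTING"] = "on"
demo_value = os.environ["DEMO_SETTING"]

# get() returns a default instead of raising KeyError for a missing name.
missing_value = os.environ.get("NO_SUCH_VARIABLE_HERE", "not set")

# The interpreter built sys.path at startup; changing PYTHONPATH now only
# affects processes started from this one.
path_before = list(sys.path)
os.environ["PYTHONPATH"] = "/usr/local/lib/mylib"
path_after = list(sys.path)

print(demo_value, missing_value, path_before == path_after)
```

To change where the running interpreter looks for modules, appending a directory to sys.path is the usual route.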
❑ Comments are preceded by hash marks (#) and can begin anywhere in a line, although no active
code may follow them on the same line. Multiline comments should also be indented, with each
line preceded by a hash mark.
❑ Blocks of code, called suites, are delimited with indentation, typically four spaces. Spaces and
tabs should not be intermixed. Each time the level of indentation is increased, a new code block
begins. The end of that code block is marked by the reduction of indentation to match the previ-
ous level.
❑ A colon (:) separates the header of a code block from the rest of the suite.
❑ Newline (\n) is the standard line separator.
❑ Python statements are delimited by newlines, but single statements can be broken into multiple
lines by using backslashes (\) to continue a line.
❑ Functions are organized as importable modules. Each module is a separate Python file.
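The rules in the list above can be seen together in one short sketch. The function name describe and its sample data are invented for illustration; the code is written for Python 3 but uses nothing specific to that version.

```python
# A comment may occupy a whole line...
def describe(values):              # ...or follow code on the same line
    # The colon ends the header; the indented lines below form the suite.
    total = 0
    for value in values:
        if value % 2 == 0:         # deeper indentation opens a nested block
            total += value
    # Returning to this indentation level closed the for and if blocks.
    summary = "even total: " + \
        str(total)
    return summary


result = describe([1, 2, 3, 4])
print(result)
```

The backslash at the end of the summary line joins it with the next physical line into a single statement.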
Data Types
Python supports the following five types of data:
❑ Numbers
❑ Strings
❑ Lists
❑ Dictionaries
❑ Tuples
Numbers
Python supports four numerical types:
❑ int (signed integers): Most 32-bit computers will offer an integer range from -2^31 to 2^31 - 1
(-2,147,483,648 to 2,147,483,647).
❑ long (long integers): The range of Python longs is limited only by the amount of virtual
memory a system has. Longs may be represented in decimal, octal, or hexadecimal notation.
Long integers in decimal form are denoted by a trailing uppercase L or lowercase l.
Some examples follow:
❑ 99999999999L
❑ -23456l
❑ 0xD34F867CA0
❑ float (floating-point real values): Floating-point numbers in Python are denoted by a decimal
point, an exponent (a lowercase e or an uppercase E followed by an optionally signed integer), or both.
Examples follow:
❑ 0.99
❑ -953.234
❑ 96.7e3
❑ -1.609E-19
❑ complex (complex numbers): Complex numbers have real and imaginary components.
Complex number attributes are accessible like this:
>>> myComplex = -7.345-1.53j
>>> myComplex
(-7.345-1.53j)
❑ If you want to see only the real portion of the number, it may be returned this way:
>>> myComplex.real
-7.345
❑ Likewise, to see only the imaginary portion, use the imag attribute, as follows:
>>> myComplex.imag
-1.53
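The numeric types above can be exercised in a few lines. This sketch assumes Python 3, where the separate long type and its L suffix no longer exist because int itself grows as needed. The variable names are illustrative only.

```python
big = 2 ** 31                 # one past the 32-bit signed maximum
bigger = big * big            # still exact: 2 ** 62, no overflow

charge = -1.609E-19           # exponent notation produces a float
my_complex = -7.345 - 1.53j   # same value as the session above

real_part = my_complex.real   # attributes, not method calls
imag_part = my_complex.imag

print(big, bigger, type(charge).__name__, real_part, imag_part)
```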
Strings
Strings are immutable sequences of characters. The elements of a Python string may be
contained either within single quotes (tick marks) or double quotes. A single-quoted string may contain
double quotes and vice versa. Strings can also be contained in triple quotes. Because strings are immutable,
string functions do not change the string passed to them but instead return a new string. Due to Python’s
memory management capabilities, you might not notice when this happens. Consider the following
example where two strings are added together:
You can see from this example that the resulting string is not the same string that we started with.
Because strings are immutable, a new string with a new identity is created. Still, you’ll most likely refer
to it primarily by its variable name of string, so the fact that it is not the original string might be trans-
parent to you.
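The concatenation case discussed above can be sketched as follows. The listing is an illustration written for Python 3, not the book's original example. It keeps a second name, original, bound to the first string object so that the two objects can be compared after the concatenation.

```python
string = "pre"
original = string                 # second name for the same string object

string = string + "fix"           # concatenation builds a brand-new object

same_object = string is original  # identity test, like comparing id() values
print(string, original, same_object)
```

The name string now refers to the new object, while the original three-character string is unchanged.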
String Operators
Operator Use
+ Concatenation
% Format
* Repetition
r or R Raw string
u or U Unicode string
A string can be preceded by the character r, which means that backslash escape sequences are taken literally.
A string can also be preceded by u to indicate that it is of type Unicode. Here are some examples:
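The following sketch illustrates the prefixes under Python 3, where the u prefix is accepted but optional. The variable names and sample values are assumptions, apart from string1 being set to "pre".

```python
string1 = "pre"
plain = "a\tb"          # \t is an escape sequence: a single tab character
raw = r"a\tb"           # raw string: the backslash and the t are kept literally
uni = u"caf\u00e9"      # Unicode string containing an accented e

print(len(plain))       # 3
print(len(raw))         # 4
print(len(uni))         # 4
print(string1 + "fix")  # prefix
```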
The Python Language
The repetition operator can be used to create a new string with multiple instances of the specified string.
Here is an example using string1, which we set in the preceding code to “pre”:
>>> string1*3
'preprepre'
String Methods
Method Use
string.capitalize() Returns string with first character capitalized. For
example, 'string'.capitalize() returns 'String'.
string.center(width[,fillchar]) Returns string centered in a new string padded with
fillchar on each side to length width. For example,
'string'.center(10) returns '  string  '.
string.join(words) Concatenates specified words with separator of
string. For example,
'foo'.join(['str','ing','str','ing'])
returns 'strfooingfoostrfooing'.
string.split(sep) Returns a list of the pieces of string, separated at the specified separator.
For example, 'string'.split('i') returns ['str', 'ng'].
Python provides a powerful tool that is similar to the printf function in C. Strings may be formatted
using these format operators as in the following example:
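As a rough illustration of the % operator, assuming hypothetical sample values (name, count, and price are not from the text) and Python 3's print() function:

```python
name = "Jacob"     # hypothetical sample values
count = 3
price = 4.5

# %s inserts a string, %d an integer, %.2f a float with two decimals.
line = "%s bought %d items at $%.2f each" % (name, count, price)
print(line)        # Jacob bought 3 items at $4.50 each

# Width and padding: zero-padded integer, left-justified string, hexadecimal.
padded = "%05d|%-6s|%x" % (42, "ab", 255)
print(padded)      # 00042|ab    |ff
```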
The preceding is only one example. The following table illustrates some other options:
Lists
Python lists are zero-based, mutable sequences of elements, which might be numbers, strings, or
something else.
Lists are denoted by square brackets, and their elements are comma delimited. They look something
like this:
[1,2,3,4]
['first','second','third']
A number of methods are associated with a list instance: append, extend, insert, remove, pop, sort,
reverse, and so on.
Passing a list as a parameter to the append function causes the list to be seen as one member.
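The difference between append and extend can be sketched as follows. The names base and extra are hypothetical, chosen so they do not clash with the lists used in the surrounding session, and the code assumes Python 3's print() function.

```python
base = [1, "two", 3, "four"]
extra = [1, "two", 3]

nested = list(base)       # copy, so base itself is left untouched
nested.append(extra)      # the whole list becomes ONE new member
print(nested)             # [1, 'two', 3, 'four', [1, 'two', 3]]
print(len(nested))        # 5

flat = list(base)
flat.extend(extra)        # each element of extra is added individually
print(flat)               # [1, 'two', 3, 'four', 1, 'two', 3]
print(flat == base + extra)  # True
```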
>>>mylist+myotherlist
[1, 'two', 3, 'four', 'a', 'n', 'o', 't', 'h', 'e', 'r', 1, 'two', 3]
>>>mylist[0]
1
A negative index indicates that the count is to begin with the last element in the list.
>>>mylist[-2]
'e'
A list can even be treated as a stack with the pop method, which removes and returns the last member of the list.
>>>mylist.pop()
'r'
Dictionaries
Dictionaries are collections of key-value pairs in which each key maps to exactly one value. The values
may be any type of Python object; the keys may be any immutable (hashable) type, such as strings, numbers,
or tuples. Dictionaries are declared like this:
myDictionary={'name':'Jacob','hobby':'Pokemon','sport':'basketball'}
There are several ways to access the information in a dictionary. Among the most common are the
following:
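A few common access patterns are sketched here for the dictionary declared earlier. Which methods the original listing showed is an assumption; the calls used below all exist on Python dictionaries, and the code is written for Python 3.

```python
myDictionary = {'name': 'Jacob', 'hobby': 'Pokemon', 'sport': 'basketball'}

print(myDictionary['name'])             # Jacob  (raises KeyError if missing)
print(myDictionary.get('age', 'n/a'))   # n/a    (default instead of an error)
print('sport' in myDictionary)          # True
print(sorted(myDictionary.keys()))      # ['hobby', 'name', 'sport']

# Walk through every key-value pair in a predictable order.
pairs = sorted(myDictionary.items())
for key, value in pairs:
    print(key, value)
```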
Tuples
Tuples are sequences of values similar to lists, except that tuples are immutable. A tuple is declared like so:
>>>vegetables = ('carrot','potato','broccoli','celery','cauliflower')
To get the third value of this (zero-indexed) tuple, use the following:
>>>vegetables[2]
'broccoli'
However, because tuples are immutable, the third value cannot be overwritten directly; the following
assignment raises a TypeError:
>>>vegetables[2] = 'newvalue'
Instead, you can use slicing and concatenation to simulate the same thing. Slicing and concatenation
create a new tuple with a new identity, and the original tuple isn't changed; simply bind the result to
the same name as the old tuple.
>>> v = t = (1,2,3,4)
>>> t
(1, 2, 3, 4)
>>> id(t)
1077685140
>>> v
(1, 2, 3, 4)
>>> id(v)
1077685140
At this point, the two tuples are still the same. Now let’s do some slicing and concatenating of t, leaving
v alone. To do this, we rebind the return value to t in each step.
>>> t = t[:3]
>>> t
(1, 2, 3)
>>> id(t)
1077858988
>>> t = t[:2]
>>> t
(1, 2)
>>> id(t)
1077883052
>>> t = t + (3,4)
>>> t
(1, 2, 3, 4)
>>> id(t)
1077879348
In rechecking the value of tuple v and its identifier, you see that they haven’t changed.
>>> v
(1, 2, 3, 4)
>>> id(v)
1077685140
Variables
Python variables are untyped. You can use the same variable to contain different Python data types at
different times in your scripts. Of course, doing so can be quite confusing and is generally not consid-
ered good programming practice. Python takes care of typing issues behind the scenes, so no errors will
be generated by such a practice.
Another difference between Python variables and those of some other programming languages is that
Python variables don’t have to be declared. Just start using the variable and it exists. The type will be
assigned at run-time.
Variable Scope
The scope of a variable may be defined as the area of a program in which the variable is visible and the
length of time for which it is accessible. There are three scopes in Python, shown here in the order
resolved by Python:
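The lookup order can be sketched as follows, assuming the three scopes meant here are local, global (module level), and built-in, searched in that order. The function and variable names are hypothetical, and the code is written for Python 3.

```python
label = "global"              # module-level (global) scope

def show_local():
    label = "local"           # a local binding shadows the global name
    return label

def show_global():
    return label              # no local binding, so the global one is found

def show_builtin():
    return len("abc")         # len is neither local nor global: built-in scope

print(show_local())    # local
print(show_global())   # global
print(show_builtin())  # 3
```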
import sys
sys.exit()
The privilege of not having to declare variables is not without some cost. With Python, if you acciden-
tally reuse a variable name within the same scope, the first will be overwritten by the second. This can
lead to program bugs that are difficult to find and fix.
Another common mistake is to accidentally misspell a variable name in one place and spell it correctly in
another, again within the same scope. An assignment to the misspelled name is seen as creating a new variable
and does not generate an error, as it would in a language that requires variables to be declared.
Although Python’s primary assignment operator is the equal sign, Python supports a number of others,
as shown in the following table. It is worth noting that Python supports multiple assignment, whereby
all objects are assigned the same value, as in this example:
a = b = c = 11
Operator Use
%= Modulus assignment
**= Exponential assignment
<<= Left shift assignment
>>= Right shift assignment
&= And assignment
^= Xor assignment
|= Or assignment
Python supports a pretty standard set of comparison operators. They behave much as expected, except
in the case of the < or >, which may be used in combination. The sequence 3 > 2 > 1, for example,
is invalid in many languages; in Python it is valid and actually means (3 > 2) and (2 > 1).
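A short sketch of chained comparisons, together with multiple assignment and a few of the assignment operators from the table, follows. The variable names are illustrative only, and the code assumes Python 3's print() function.

```python
chained = 3 > 2 > 1                 # evaluated as (3 > 2) and (2 > 1)
spelled_out = (3 > 2) and (2 > 1)
broken_chain = 1 < 5 < 3            # False, because 5 < 3 fails
print(chained, spelled_out, broken_chain)   # True True False

a = b = c = 11      # multiple assignment
a %= 4              # modulus assignment: a is now 3
b **= 2             # exponential assignment: b is now 121
c <<= 1             # left shift assignment: c is now 22
print(a, b, c)      # 3 121 22
```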
Control Structures
Like many other languages, Python supports many different control structures that can be used to exe-
cute particular blocks of code based on decisions or repeat blocks of code while a particular condition is
true. The following sections cover the various control structures available in Python.
While Loop
The while loop executes one or more lines of code while a specified expression remains true. The
expression is tested prior to execution of the code stanza and again when the flow returns to the top of
the stanza. This loop has the following syntax:
while expression:
    executeMe #executes if the while expression evaluates to True.
Because the expression is evaluated at the beginning of the loop, the statement(s) will not be executed at
all if the expression evaluates to false at the beginning of the loop. For example, the following loop will
execute 20 times, each iteration incrementing x, until x = 20:
x = 0
while x < 20:
    x += 1 # increment x if the while expression evaluates to True.
For Loop
The Python for loop is a bit different from the same loop in C or Perl. Whereas these languages execute
their loop based on the specified step and halting condition, in Python the for loop iterates its loop over
a sequence such as a list or a string. The statement(s) are executed a specific number of times depending
on the number of members in the sequence or the range of values passed to it:
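Both forms can be sketched as follows. The list contents are hypothetical, and the code targets Python 3, where range(4) produces the values 0 through 3 lazily rather than as a prebuilt list.

```python
colors = ["red", "green", "blue"]   # hypothetical sample list

# First case: iterate directly over the members of a sequence.
seen = []
for color in colors:
    seen.append(color.upper())
print(seen)       # ['RED', 'GREEN', 'BLUE']

# Second case: iterate over a range of integers.
counts = []
for i in range(4):
    counts.append(i)
print(counts)     # [0, 1, 2, 3]
```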
The loop iterates for each member in the specified range. In the first case, the range is determined by
the quantity of elements in the array. In the second case, the range() built-in function passes the array
of [0,1,2,3] with exactly four elements, causing the loop to execute exactly four times.
If Statement
The if and elif constructs execute a block of code depending on the evaluation (true or false) of the
specified expression. The if construct has the following syntax:
if expression:
    executeMeifTrue
else:
    executeMeifFalse
Also available is the elif clause, of which there may be zero or more. An elif is used to mean execute
the code block following it only if the primary expression evaluates to False and the expression follow-
ing the elif evaluates to True.
if x==0:
    print 'zero'
elif x==1:
    print 'one'
The elif and else clauses may be used in the same if clause as long as the else is last:
if x==0:
    print 'zero'
elif x==1:
    print 'one'
else:
    print 'neither'
In the preceding code, if x is 0, the string ‘zero’ would be printed; if x is 1, the string ‘one’ would be
printed; and if x is something else, the string ‘neither’ would be printed.
Try Statement
Python provides a rather unique statement to allow for the specification of error handlers for a group of
statements. The try statement with an except clause is commonly used when a file is being opened so
that an error that prevents the file from being accessed allows you to gracefully break out of the block of
code intended to process that file’s contents.
try:
    file = open('/tmp/filename.txt', 'r')
    line = file.readline()
except IOError:
    print 'Error opening the file.'
Multiple except clauses are allowed to test for different conditions, as follows:
try:
    file = open('/tmp/filename.txt', 'r')
    line = file.readline()
except IOError:
    print 'Error opening the file.'
except SystemError:
    print 'System error while opening the file.'
Another form of the try statement allows some functionality to be specified as a way out. The try clause
is executed, and when no exception occurs, the finally clause is executed. If an exception occurs in the
try clause, the exception is temporarily saved while the finally clause is executed and then reraised. If
the finally clause raises another exception or executes a return or break statement, the saved excep-
tion is lost.
try:
    file = open('/tmp/filename.txt', 'r')
    line = file.readline()
finally:
    line = ''
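The save-and-reraise behavior of finally can be observed with a sketch like the one below. The function name, the events list, and the deliberately missing file are all assumptions made for illustration; the code is written for Python 3.

```python
import os
import tempfile

events = []   # records the order in which the clauses run

def first_line(path):
    handle = None
    try:
        handle = open(path, "r")
        return handle.readline()
    finally:
        # Runs whether or not open() raised; any exception is reraised afterward.
        events.append("finally ran")
        if handle is not None:
            handle.close()

# A path inside a fresh temporary directory, so the file cannot exist.
missing = os.path.join(tempfile.mkdtemp(), "missing.txt")
try:
    first_line(missing)
except IOError:
    events.append("exception reraised")

print(events)   # ['finally ran', 'exception reraised']
```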
Control Use
continue Continues with next iteration of the loop
break Breaks out of the closest for or while loop
pass Indicates “do nothing”
The continue statement must be nested in a for or while loop but not in a function or class definition
or try statement within that loop. It continues with the next cycle of the nearest enclosing loop.
A break is used to terminate the current loop and continue execution at the next statement. It may be
found in either while or for loops.
A while loop with a continue block and a break statement would look like this:
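One possible shape for such a loop is sketched here. The variable names and the odd-number filter are hypothetical, and the code assumes Python 3.

```python
x = 0
odds = []
while True:
    x += 1
    if x > 9:
        break           # leave the loop entirely
    if x % 2 == 0:
        continue        # skip even numbers and start the next iteration
    odds.append(x)

print(odds)   # [1, 3, 5, 7, 9]
```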
The pass statement is necessary in Python when a statement expects a block of code and one is not pre-
sent, as follows:
if user != "Jacob":
    print "That's not him."
else:
    pass
Regular Expressions
For text analysis, Python provides the regex and the re modules. The regex module is old and some-
what deprecated, although still available. The regex module uses an emacs-style format, which some
users find difficult to read. Using regular expressions from the re module, you can construct advanced
pattern-matching algorithms in a less arcane syntax. Regular expressions are handled via a small, highly
specialized programming language embedded in Python and are made available through the re mod-
ule. Using the re module, you specify the rules for the set of possible strings that you want to match.
You can use it to determine whether the string matches the pattern or whether there is a match for the
pattern anywhere in the string. You can also use the re module to modify a string or to split it apart in
various ways. The following sections cover basic Python regular expressions.
>>> import re
>>> reobj = re.compile('foo*')
>>> print reobj
<_sre.SRE_Pattern object at 0x403c38c0>
Character Use
^ Match beginning of line.
$ Match end of line.
. Match any character. Match newline only if DOTALL flag is specified.
\ Escape the next character.
\\ Match a literal \.
[] Specify group of characters to match.
[^ ] Match character not in set.
Other character Match literal character.
The following regular expressions are commonly used to represent the indicated classes of characters or
numbers:
A caret (^) can be used within a group/range construct to negate it. For example, the following expres-
sion would match anything except a single digit:
[^0-9]
Each matching character or expression can be modified by a matching quantifier, specifying how many
matches should be found. The valid quantifiers are described in the following table:
Quantifier Meaning
* Match 0 or more times
+ Match 1 or more times
? Match 0 or 1 times
{n} Match exactly n times
{m,n} Match between m and n times (inclusive)
The * quantifier works like it does in most languages. For example, foo* will match fo, foo, or fooooooo.
(In fact, it would match fo followed by any number of the letter o.)
The + quantifier is very similar to the * except that it doesn’t match if the character preceding the plus sign
doesn’t appear at least once. In contrast to the preceding example, which matches fo even though it has zero instances
of the second o, foo+ will match only foo or foo followed by any number of the letter o.
The ? quantifier matches the preceding character 0 or 1 times. So bar? will match either ba or bar.
The most complicated quantifier is {m,n}, where m and n are decimal integers. This quantifier means
there must be at least m repetitions, and at most n. For example, a/{1,3}b will match a/b, a//b, and
a///b. It won’t match ab, which has no slashes, or a////b, which has four.
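The quantifier examples above can be checked with a small sketch. The helper name whole is hypothetical; it relies on re.fullmatch, which is available in Python 3.4 and later, so that only complete matches count.

```python
import re

def whole(pattern, text):
    """Return True if the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

print(whole("foo*", "fo"))          # True:  zero extra o's is acceptable
print(whole("foo+", "fo"))          # False: at least one extra o is required
print(whole("foo+", "fooo"))        # True
print(whole("bar?", "ba"))          # True:  the r is optional
print(whole("a/{1,3}b", "a//b"))    # True:  two slashes is within 1 to 3
print(whole("a/{1,3}b", "ab"))      # False: no slashes
print(whole("a/{1,3}b", "a////b"))  # False: four slashes
```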
Python regular expressions also have meta and control characters that can be used in expressions. The
meta and control characters are described in the next table.
Character Meaning
\A Matches only at start of string
\d and \D Digit and nondigit
\s and \S Space and nonspace
\w and \W Alphanumeric character and nonalphanumeric character
\b and \B Word and nonword boundary
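A brief sketch of several of these characters in use follows. The sample sentence is made up, raw strings are used so the backslashes reach the re module unchanged, and the code assumes Python 3.

```python
import re

text = "Order 66 shipped to bay 7"   # hypothetical sample text

digits = re.findall(r"\d+", text)            # runs of digits
three = re.findall(r"\b\w{3}\b", text)       # whole words of exactly 3 characters
words = re.split(r"\s+", text)               # split on runs of whitespace
at_start = re.search(r"\AOrder", text) is not None

print(digits)     # ['66', '7']
print(three)      # ['bay']
print(words)      # ['Order', '66', 'shipped', 'to', 'bay', '7']
print(at_start)   # True
```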
Built-in Functions
Python has many built-in functions that are accessible from any namespace. Two of the most common
are explained here:
The __init__ function is one of the most widely used special functions in Python. Python calls it automatically
when a class is instantiated, so its purpose is to define what should be done at that point.
__init__(self)
The built-in function dir() returns a list of strings representing the functions and attributes that the
module defines. It is called like this:
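A minimal sketch of such a call, using the sys module and Python 3's print() function, might look like this. The exact list of names varies between Python versions, so only two well-known entries are checked.

```python
import sys

names = dir(sys)            # list of attribute and function names, as strings
print(type(names))          # <class 'list'>
print("path" in names)      # True
print("exit" in names)      # True
print(len(names) > 10)      # True
```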
After importing the sys module, any of these items may be called by the item name prefaced with sys
and a period.
A more comprehensive list of the more popular built-in Python functions can be found in Appendix E,
“Python Language Reference.”
User-Defined Functions
You can create your own functions by using the def directive. User-defined functions have the following
syntax:
def function_name(parameters):
    # function statements
For example, to create a function to find the area of a circle you could use code similar to the following:
def areaofcircle():
    radius = input("Please enter the radius: ")
    area = 3.14*(radius**2)
    print "The area of the circle is", area
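As a variation, the same calculation can take the radius as a parameter and return the result, which makes the function easier to reuse and test. This is only a sketch for Python 3; the name area_of_circle and the use of math.pi are assumptions, not the listing above.

```python
import math

def area_of_circle(radius):
    """Return the area of a circle with the given radius."""
    return math.pi * radius ** 2

print(round(area_of_circle(2), 2))   # 12.57
print(area_of_circle(0))             # 0.0
```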
Attribute Description
__doc__ documentation string
__name__ string version of function name
func_code byte-compiled code object
func_defaults default argument tuple
func_globals global namespace dictionary
>>> areaofcircle.__doc__
>>> areaofcircle.__name__
'areaofcircle'
>>> areaofcircle.func_code
<code object areaofcircle at 0x403eab60, file "<stdin>", line 1>
>>> areaofcircle.func_defaults
>>> areaofcircle.func_globals
{'lambdaFunc': <function <lambda> at 0x403ebbfc>, '__builtins__': <module
'__builtin__' (built-in)>, 'areaofcircle': <function areaofcircle at 0x403ebd4c>,
'datetime': <module 'datetime' from '/usr/lib/python2.4/lib-dynload/datetime.so'>,
'sys': <module 'sys' (built-in)>, 'time': <module 'time' from '/usr/lib/python2.4/
lib-dynload/time.so'>, '__name__': '__main__', 'os': <module 'os' from '/usr/lib/
python2.4/os.pyc'>, '__doc__': None}
Lambda Functions
Python allows for the creation of anonymous functions using the lambda keyword. Lambda expressions
are similar to user-defined functions without the __name__ (actually the __name__ when invoked from
a lambda function returns the string '<lambda>'). An example of a lambda function follows:
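A small sketch follows. The body of lambdaFunc and the sample pairs are hypothetical, chosen only to show the syntax, and the code assumes Python 3.

```python
# A lambda is a single expression; its value is returned implicitly.
lambdaFunc = lambda x: x ** 2

print(lambdaFunc(4))          # 16
print(lambdaFunc.__name__)    # <lambda>

# Lambdas are often passed directly to other functions, here as a sort key.
pairs = [(2, "b"), (1, "a"), (3, "c")]
ordered = sorted(pairs, key=lambda pair: pair[0])
print(ordered)                # [(1, 'a'), (2, 'b'), (3, 'c')]
```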
File Operations
One of the advantages of using CGI programs is that they can read and write to the filesystem, and
Python is no exception. Python includes quite a few functions for dealing with file IO, which are covered
in the following sections.
Opening a File
To open a file, Python uses the open function. The open function has the following syntax:
FILEHANDLE=open("filename",mode)
For example, to open the file /tmp/filename.txt, you would use code similar to the following:
try:
    FILEHANDLE = open('/tmp/filename.txt', 'r')
except IOError:
    print 'cannot open file.'
The preceding code displays "cannot open file." if the open function does not succeed. The FILEHANDLE
can be any valid variable name but should be descriptive enough to be easily identified as a file handle.
One standard practice is to capitalize the file handle variables.
The default operation of the open function is to open a file for reading. To open a file for writing you
need to specify a mode of 'w'; to append to a file you would specify a mode of 'a'; to open the file
read-only you would specify a mode of 'r'.
For example, the following snippet of code will read all lines from the file filename.txt into a buffer
where it can be acted upon:
...
try:
    FILEHANDLE = file('/tmp/filename.txt', 'r')
    data = FILEHANDLE.readlines()
    #Do something useful with data.
except IOError:
    print 'cannot open file.'
...
After a file is opened, the file object maintains state information about the file it has opened: <open file
'/tmp/filename.txt', mode 'r' at 0x403e73c8>. The file object supports the following methods:
Methods Meaning
close() Closes the file, preventing reading and writing. Can be called more
than once.
flush() Flushes the internal buffer.
fileno() Returns the integer “file descriptor” that is used by the underlying
implementation to request I/O operations from the operating system.
isatty() Returns True if the file is connected to a tty-like device; otherwise
returns False.
next() Returns the next line of the file.
read([size]) Returns size characters from file at current position.
readline([size]) Returns one line from file or size characters, which might be a partial
line. Returns empty string if first character read is EOF.
readlines([sizehint]) Returns a list of all the lines in the file.
seek(offset[,whence]) Sets the file’s current position.
tell() Tells the file’s current position.
truncate([size]) Truncates the file as represented in the buffer to the current position or
to the location represented by size.
write(str) Writes str to the fileptr to be flushed.
writelines(sequence) Writes lines represented by sequence to the fileptr to be flushed.
Attributes Usage
closed Returns True if the file is closed; otherwise returns False.
encoding Returns the encoding method for the file.
mode Returns the mode that the file was opened in.
name Returns the file name.
newlines If Python was configured with the --with-universal-newlines
option, it keeps track of the types of newline encountered while reading the file.
softspace Returns a Boolean that indicates whether a space character needs to be
printed before another value when using the print statement.
>>> myList=["one","two","three"]
>>> OutputFile=file("/tmp/outputfile.txt","w")
>>> for item in myList:
...     OutputFile.write(item)
Closing a File
To close the file, you use the close method from the instance of the file class representing the file to be
closed.
>>> OutputFile.close()
>>> OutputFile=file("/tmp/outputfile.txt","r")
>>> OutputFile.readlines()
['onetwothree']
Once a file has been closed, it cannot be read from or written to. However, until a written file is closed, its
contents cannot be relied upon — the operating system may not write its buffers until the file is closed.
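The write, close, and read cycle can be sketched as follows for Python 3, where the open function is used because the file built-in no longer exists. The temporary directory and the variable names are assumptions; a newline is written after each item so that readlines returns separate lines.

```python
import os
import tempfile

# Hypothetical location: a file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "outputfile.txt")

output_file = open(path, "w")
for item in ["one", "two", "three"]:
    output_file.write(item + "\n")   # write() adds no newline on its own
output_file.close()                  # ensures buffered data reaches the file
print(output_file.closed)            # True

input_file = open(path, "r")
lines = input_file.readlines()
input_file.close()
print(lines)                         # ['one\n', 'two\n', 'three\n']
```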
One important difference in instantiating a file class for a binary file is the addition of a b in the mode for
that file. When used with binary files, the file function has the following syntax:
OutputFile=file("/tmp/outputfile.txt","rb")
Objects
Python has robust support for object data types. Although a full description of object-oriented program-
ming is beyond the scope of this book, the following sections provide an introduction to Python’s han-
dling of objects.
Python is classified as an object-oriented programming language, although you can write useful Python
code without using classes and instances. Many Python programmers make good use of Python without
taking advantage of its object-oriented features. In Python, everything is an object: list, tuple, string,
class, or instance of class.
Python classes are instantiated in the following way. First, define the class like this:
class MyClass:
    def __init__(self):
        pass  #code to execute
    def function1(self,args):
        pass  #code to execute
if __name__ == '__main__':
    results = MyClass()
In the following code, Python attempts to execute the statements between the try and the first except
statement. If an error is encountered, execution of the try statements stops and the error is checked
against the except statements. Execution progresses through each except statement until it finds one
that matches the generated exception. If a matching except statement is found, the code block for that
exception is executed. If the try statements finish without raising any exception, the final else block is
executed. In practice, the else statement is not often used.
try:
    # program statements
except ExceptionType:
    # exception processing for named exception
except AnotherType:
    # exception processing for a different exception
else:
    # clean up if no exceptions are raised
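A concrete instance of this pattern is sketched below for Python 3. The function name parse_number and the choice of ValueError and TypeError are assumptions made purely for illustration.

```python
def parse_number(text):
    try:
        value = int(text)
    except ValueError:            # text is a string that is not a number
        return "not a number"
    except TypeError:             # text is not a string or number at all
        return "wrong type"
    else:                         # runs only if no exception was raised
        return value * 2

print(parse_number("21"))    # 42
print(parse_number("abc"))   # not a number
print(parse_number(None))    # wrong type
```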
Troubleshooting in Python
There are several methods to diagnose problematic Python code. The methods most commonly used for
diagnosing CGI code are listed in order from least to most robust in the following section.
import cgitb
cgitb.enable()
import sys
sys.stderr = sys.stdout
print "Content-Type: text/plain"
print
...your code here...
Because no HTML interpretation is going on, the traceback will be more readable, not requiring you to
wade through HTML code to get to the important stack trace.
These are not the only troubleshooting techniques available to you, but these are the most commonly
used with regard to CGI code.
Summary
This chapter provided a primer to the Python scripting language. It discussed the basics of Python syn-
tax, how to implement Python on your system, Python’s object-oriented nature, and how to troubleshoot
CGI code written in Python. That, put together with previous chapters, gives you an idea how to create
CGI code in Perl and Python. Chapter 27 covers using other languages for CGI, and Chapter 28 gives
you several examples of Perl and Python CGI scripts.
Scripting with Other
Executable Code
This book concentrates on the programming/scripting languages that are most used for CGI on
the Web. However, you can effectively use any program, interpreted script, or other executable
supported by the platform that is running the Web server. This chapter demonstrates some tech-
niques that can be used with other programming and scripting languages to accomplish CGI.
Of course, just because a program or script can fulfill the requirements for CGI doesn’t mean that it should
be used for CGI purposes. Due to the fact that CGI scripts can access privileged areas of the operating
and file systems, CGI scripts pose security risks usually not inherent in the Web server itself.
Scripting languages such as Perl and Python have been tailored for CGI use and, as such, have dealt
with several of the security issues relating to CGI. Therefore, using more standard languages for CGI
should be encouraged.
The examples in this section were created on a GNU/Debian Linux system and may need to be modified
to run on other versions of Linux.
It is possible to cheat and simply name your scripts using filenames representative of other scripting
languages. For example, if the Web server is already configured to deliver .cgi files as scripts, you can
simply name your Bash scripts using the .cgi extension. However, it is usually advisable to appropri-
ately name your scripts following conventions for the language being used and to configure Apache
appropriately for that language. For example, use .sh extensions for shell scripts, .py extensions for
Python scripts, and so on.
The essential steps to configure Apache to deliver shell scripts are as follows:
1. Ensure that the Apache CGI module is installed and active. The module should be compiled
into Apache or appear in a LoadModule line within the Apache configuration file similar to the
following:
LoadModule cgi_module /usr/lib/apache/1.3/mod_cgi.so
2. Enable CGI scripting in the directory(ies) where you will place your scripts. This is typically
accomplished by adding the ExecCGI option to the appropriate directory configuration sections
of the Apache configuration file. Such configuration sections resemble the following:
<Directory /usr/lib/cgi-bin/>
AllowOverride None
Options ExecCGI
Order allow,deny
Allow from all
</Directory>
3. Add an appropriate handler for an .sh file type. If you have a handler defined for CGI scripts,
simply add the .sh extension to the existing list of extensions:
AddHandler cgi-script .cgi .pl .sh
After making changes to the Apache configuration file, don’t forget to restart Apache for the changes to
take effect.
It is highly recommended to restrict shell script CGI as much as possible. If you wish to try out the
examples in this section, you would do well to allow shell scripting only in the directory containing the
examples and perhaps should protect that directory with an .htaccess or other protection scheme to
ensure the scripts cannot be accessed by unscrupulous users of your Web server.
To pass data to a script, you would use the standard URL GET encoding; that is, following the URL to
the script with a question mark and then the arguments to pass to the script:
<url-to-script>?<command-line-argument(s)>
The script would then use its standard methods for dealing with command line arguments. In the case of
the Bash shell, the methods are in the form of the variables listed in the following table:
Variable Use
$@ The full list of arguments passed to the script.
$1 - $9 The first nine arguments passed to the script. The first argument is
stored in $1, the second in $2, and so on.
$_ The full path to the root program being run (in this case, the Bash shell
because it is interpreting the script).
$0 The full path to the script.
$# The number of arguments passed to the script.
To illustrate how these variables work with the command line, consider the following script:
Chapter 27
cmdline.sh
#!/bin/bash
When executed with the following command line, the script provides the output shown here:
Scripts that are not enabled with specific CGI libraries and methods will be bound to the limits of their
respective command lines. For example, Bash command line arguments cannot contain certain charac-
ters unless they are quoted because those characters mean special things to the Bash shell. Still, the Web
server may censor other characters prior to their arriving at the script. As such, it’s important to keep
your parameter-passing simple when working with non–CGI-enabled scripts.
echocmdline.sh
#!/bin/bash
This script will run from a command line just fine:
However, if accessed via a Web server, it will generate the error shown in Figure 27-1.
Figure 27-1
This error is a typical response if the Web server doesn’t detect that the script has provided adequate
header information. If you examine the Apache error log, you can immediately see the root cause of the
problem; the script didn’t provide adequate headers:
[Wed Mar 9 03:43:16 2005] [error] [client 192.168.3.141] malformed header from
script. Bad header=/var/www/test/echocmdline.sh : /var/www/test/echocmdline.sh
To correct this error, you must include appropriate headers in the output from your script. In the case of
this simple example script, the output can be plain text, so adding a Content-type header is all that is
required:
echocmdline.sh
#!/bin/bash
# Echo the command line
echo -e "$0 $@ \n\n"
Notice the newline (\n) at the end of the content-type echo statement. This provides a blank line after
the Content-type header, which is required to let the user agent know that the end of the headers has
been reached.
Figure 27-2
Shell scripting is quite powerful. This chapter covers only the basic input/output CGI functionality.
Using the full power of shell scripting, however, you can accomplish some truly amazing things.
Listing a File
The following script will list the file passed to it via GET by using the Linux cat command.
#!/bin/bash
Toggling a State
The following script will simply toggle a state, in this case alternately creating or deleting a lock file on
the file system. This simplistic example can be extended to allow a server to be started and stopped, a
process to be paused, and so on, using Web access.
#!/bin/bash
# File to toggle
FILE="/var/www/file.lock"
#!/bin/bash
echo -e "--------------------------------\n"
eval $@
else
echo -e "Nothing to do!\n"
fi
This example is shown only to illustrate the extreme danger that scripts can pose to a system. In most
cases, the script will be run as the same user the Web server is running as, which can limit the damage
that can be done by such scripts. However, never underestimate the power of allowing command line
access to a system.
Summary
As you can see by the examples in this chapter, providing basic CGI functionality via any executable
code is trivial. However, it is difficult to pass complex data to scripts that do not have CGI libraries or
modules to handle HTTP GET or POST operations. Also, scripting using native OS tools (such as shell
scripts) can pose hazards that are typically mitigated with the CGI functions of more robust program-
ming and scripting languages.
Using CGI
The previous chapters in this part demonstrated how CGI works behind the scenes — the syntax
and functionality of Perl, Python, and other CGI-enabled technologies. This chapter rounds out
the CGI coverage by showing some basic but useful examples of CGI in action.
❑ You need to provide dynamic content or interactive functions to your static
documents.
❑ You need content from other resources, databases, hardware, and so on.
❑ You need more interactivity between your documents and their audience than straight
XHTML technologies can provide.
That said, you should also consider the following before deploying a CGI solution:
All that said, CGI provides a great resource to infuse your documents with interactivity and
dynamic content.
Chapter 28
One popular CGI technique to decrease server load is the use of Server Side Includes (SSI). SSI lets you
embed scripts in static documents, allowing the script to deliver the dynamic portion of the document
but relying upon the standard HTTP server to deliver the static content. SSI coverage is beyond the
scope of this book, but for more information on SSI, I’d suggest visiting the Apache Web site,
specifically the SSI tutorial at http://httpd.apache.org/docs-2.0/howto/ssi.html.
Sample Data
This section details the sample data used in this chapter and in Chapter 31, “Using PHP.” Note that the
code listed in this section is available on the book’s companion Web site, but it is also listed here for
immediate reference.
Sample Form
The following form is used in the form examples — data from this form is sent to the form handler.
interested in? <br />
<select name="prod" id="prod" multiple="multiple" size="4">
<option id="MB">Motherboards</option>
<option id="CPU">Processors</option>
<option id="Case">Cases</option>
<option id="Power">Power Supplies</option>
<option id="Mem">Memory</option>
<option id="HD">Hard Drives</option>
<option id="Periph">Peripherals</option>
</select>
</p>
</td></tr>
<tr><td>
</td></tr>
</table>
</form>
</body>
</html>
This form will be filled out as shown in Figure 28-1, and the resulting data will be passed to the example
script.
Figure 28-1
USE mysqlsamp;
--
-- Table structure for table ‘computers’
--
comp_CD enum('CD','DVD','CDRW','CDR','CDRW/DVD') default NULL,
comp_location varchar(10) default NULL,
comp_ts timestamp(10) NOT NULL,
PRIMARY KEY (comp_id)
) TYPE=MyISAM;
--
-- Dumping data for table ‘computers’
--
INSERT INTO computers VALUES
(05295,'Super-puter','XL','P4',2400,1000,180,'CDRW/DVD','ITLAB','0000000000');
INSERT INTO computers VALUES
(04780,'Super-puter','XS','P4',2400,1000,180,'CDRW','LAB01','0000000000');
INSERT INTO computers VALUES
(05021,'Custom','3','P3',1000,512,0,'CD','ITLAB','0000000000');
--
-- Table structure for table ‘keyboards’
--
--
-- Dumping data for table ‘keyboards’
--
--
-- Table structure for table ‘mice’
--
CREATE TABLE mice (
mouse_id int(5) unsigned zerofill NOT NULL default '00000',
mouse_type char(3) NOT NULL default 'PS2',
mouse_model varchar(5) NOT NULL default '2BS',
mouse_comp int(5) unsigned zerofill NOT NULL default '00000',
mouse_ts timestamp(10) NOT NULL,
PRIMARY KEY (mouse_id)
) TYPE=MyISAM;
--
-- Dumping data for table ‘mice’
--
--
-- Table structure for table ‘mice_type’
--
--
-- Dumping data for table ‘mice_type’
--
--
-- Table structure for table ‘monitors’
--
--
-- Dumping data for table ‘monitors’
--
The following listing shows the permission statement used to create the sample user in the MySQL
examples:
Perl Examples
This section demonstrates a few examples of how Perl can be used for CGI. The examples show how you
can leverage Perl’s capabilities and add-on modules to deliver dynamic document content.
Source — Perl-Example01.cgi
#!/usr/bin/perl
# Default is today
my $day = localtime->mday();
my $month = localtime->mon();
my $year = localtime->year();
my $today = $month."/".$day."/".$year;
my $month = shift;
my $year = shift;
my $t = Time::Piece->strptime($month,"%m");
$month_text = $t->strftime("%B");
HTML
} # End docheader()
print <<HTML;
<!-- Close all open tags, end document -->
</table>
</body>
</html>
HTML
} # End docfooter()
print <<HTML;
<td align="right" valign="top"> </td>
HTML
} # End emptyday()
my $today = shift;
my $month = shift;
my $day = shift;
my $year = shift;
my $font = "";
my $curday = $month."/".$day."/".$year;
if ( $curday eq $today ) {
$font = " style=\"color: red;\"";
}
print <<HTML;
<td align="right" valign="top" $font>$day</td>
HTML
} # End day()
my $cmd = shift;
if ($cmd eq "open") {
print "<tr class=\"week\">\n";
}
if ($cmd eq "close") {
print "</tr>\n";
}
} # End weekrow()
my $first_weekday = $t->strftime("%w") + 1;
my $day = 1;
my $last_day = $t->month_last_day;
# Close document
docfooter();
} # End main();
Output
This script outputs an XHTML table similar to that shown in Figure 28-2.
Figure 28-2
How It Works
This script uses the date and time functions in the Time::Piece library (found on CPAN) to
determine these parameters about the current month:
Using those parameters, the script can create a calendar for any month — past, present, or
future. The rest of the script is fairly straightforward:
1. Output the document header (Content-type, doctype, head tags, and so on).
2. Determine the first weekday of the month.
3. Output blank cells for weekdays up to the first day of the month.
4. Output cells for each day of the month.
5. Close the month by outputting blank cells to fill the last week.
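The numbered steps above can be sketched in a few lines of Perl. This is a minimal sketch rather than the book's listing: the month_grid() helper and its week-row return format are assumptions made for illustration, and the sketch prints plain text instead of XHTML table cells. It builds the date with the year included so that the first weekday and the month length belong to the requested year.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::Piece;

# Hypothetical helper (not part of Perl-Example01.cgi): returns one array
# reference per week; undef marks a blank cell before or after the month.
sub month_grid {
    my ($month, $year) = @_;

    # First day of the requested month
    my $t = Time::Piece->strptime(sprintf("%04d-%02d-01", $year, $month), "%Y-%m-%d");
    my $first_weekday = $t->day_of_week;     # 0 = Sunday ... 6 = Saturday
    my $last_day      = $t->month_last_day;  # 28 through 31

    # Blank cells up to the first day, then one cell per day of the month
    my @cells = ((undef) x $first_weekday, 1 .. $last_day);

    # Pad the last week with blank cells
    push @cells, undef while @cells % 7;

    # Cut the flat list into rows of seven
    my @weeks;
    push @weeks, [ splice(@cells, 0, 7) ] while @cells;
    return @weeks;
}

# Print July 2005 as plain text
for my $week (month_grid(7, 2005)) {
    print join(" ", map { defined $_ ? sprintf("%2d", $_) : "  " } @$week), "\n";
}
```

Keeping the grid as data, separate from the markup that displays it, would let the same helper feed the weekrow(), emptyday(), and day() output routines.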
❑ Thumbnail calendars for the months on either side of the current month could be displayed
in the calendar header, much like paper calendars.
❑ Multiple months could be assembled into a longer calendar.
❑ The calendar functions could be applied to other Web-enabled features.
❑ The calendar could be extended to be dynamic, able to output any month. An example
of a more dynamic calendar appears in Example 3 in this chapter.
Source — Perl-Example02.cgi
#!/usr/bin/perl
my %params = $cgi->Vars;
HTML
# Close document
print "</body>\n</html>";
Output
This script, when passed the example data outlined in the “Sample Data” section earlier in this
chapter, displays the document shown in Figure 28-3.
How It Works
This script uses the CGI.pm module to access the HTTP information passed to the script. The
request method (available in the $ENV{REQUEST_METHOD} environment variable) identifies the
method used to send the data to the script (GET or POST), while the param() method is used
to return the data itself. If the data contains multiple values (such as from a select form element),
each value is displayed.
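To make the mechanics concrete, the following sketch decodes GET-style form data by hand and groups repeated fields, which is roughly the service that param() provides. It is an illustration under stated assumptions, not CGI.pm's implementation: parse_form_data() is a hypothetical helper, the fallback query string (including the note field) is invented sample input, and POST bodies, which arrive on STDIN, are not handled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for CGI.pm's param(): decode a URL-encoded string
# into a hash that maps each field name to an array of its values.
sub parse_form_data {
    my ($raw) = @_;
    my %fields;
    for my $pair (split /[&;]/, $raw) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                                 # + encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;     # %XX encodes one byte
        }
        push @{ $fields{$name} }, $value;            # keep every value
    }
    return %fields;
}

# Invented sample input is used when the script runs outside a Web server
my $method = $ENV{REQUEST_METHOD} || 'GET';
my $raw    = $ENV{QUERY_STRING}   || 'prod=Motherboards&prod=Hard+Drives&note=50%25+off';
my %form   = parse_form_data($raw);

print "Method: $method\n";
for my $name (sort keys %form) {
    print "$name = ", join(', ', @{ $form{$name} }), "\n";
}
```

In production code a maintained module such as CGI.pm remains the safer choice, because it also handles POST data, file uploads, and character encodings.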
This script can be used to troubleshoot XHTML form data. Simply specify it as the form handler
(in the action attribute of the form tag).
Figure 28-3
Source — Perl-Example03.cgi
#!/usr/bin/perl
use Time::Piece;
use CGI;
# Default is today
my $day = localtime->mday();
my $month = localtime->mon();
my $year = localtime->year();
my $today = $month."/".$day."/".$year;
my $cgi = CGI->new();
my %params = $cgi->Vars;
my $cmd = 0;
if (exists($params{month})) {
$month = $params{month};
}
if (exists($params{year})) {
$year = $params{year};
}
if (exists($params{next})) {
$cmd = 1;
}
if (exists($params{prev})) {
$cmd = -1;
}
if ($cmd != 0) {
$month += $cmd;
# Adjust month over or underrun
if ($month >= 13) {
$month = 1;
$year += 1;
} else {
if ($month <= 0) {
$month = 12;
$year -= 1;
}
}
}
my $month = shift;
my $year = shift;
my $t = Time::Piece->strptime($month,"%m");
$month_text = $t->strftime("%B");
HTML
} # End docheader()
print <<HTML;
<!-- Close all open tags, end document -->
</table>
</form>
</body>
</html>
HTML
} # End docfooter()
print <<HTML;
<td align="right" valign="top"> </td>
HTML
} # End emptyday()
my $today = shift;
my $month = shift;
my $day = shift;
my $year = shift;
my $font = "";
my $curday = $month."/".$day."/".$year;
if ( $curday eq $today ) {
$font = " style=\"color: red;\"";
}
print <<HTML;
<td align="right" valign="top" $font>$day</td>
HTML
} # End day()
my $cmd = shift;
if ($cmd eq "open") {
print "<tr class=\"week\">\n";
}
if ($cmd eq "close") {
print "</tr>\n";
}
} # End weekrow()
my $first_weekday = $t->strftime("%w") + 1;
my $day = 1;
my $last_day = $t->month_last_day;
# Close document
docfooter();
} # End main();
Output
This script results in a document that displays a monthly calendar, as shown in Figure 28-4.
Figure 28-4
Appendix C
Event Handlers
onFocus
onKeyDown
onKeyPress
onKeyUp
onSelect
Window Object
As the top-level object in the JavaScript client hierarchy, a Window object is created for every user agent
window and frame (every instance of an XHTML <body> or <frameset> tag).
Properties
Properties Description
closed Returns a Boolean value corresponding to whether a window has
been closed. If the window has been closed, this property is true.
defaultStatus Returns or sets the message displayed in a window’s status bar.
[= “message”]
JavaScript Language Reference
Properties Description
scrollbars[.visible = Sets the visibility of the window’s scroll bars.
true|false]
toolbar[.visible = Sets the visibility of the window’s toolbar. Note that this property
true|false] can be set only prior to the window being opened and requires the
UniversalBrowserWrite privilege.
Methods
Methods Description
alert(“message”) Displays an alert box containing message and an OK button (to
clear the box).
blur Removes the focus from the specified window.
captureEvents Instructs the window to capture all events of a particular type. See
(event_types) the Event object for a list of event types.
clearInterval Used to cancel an interval previously set with the setInterval
(intervalID) method.
clearTimeout Used to cancel a timeout previously set with the setTimeout
(timeoutID) method.
close Causes the specified window to close.
confirm(“message”) Displays a dialog box containing message along with OK and Cancel
buttons. If the user clicks the OK button, this method returns
true; if the user clicks the Cancel button (or otherwise closes the
dialog box), the method returns false.
disableExternalCapture Disables the capturing of events previously enabled using the
enableExternalCapture method.
handleEvent(event) Used to call the handler for the specified event.
home Mimics the user pressing the Home button, causing the window to
display the document designated as the user’s home page.
moveBy(horizPixels, Moves the window horizontally by horizPixels and vertically by
vertPixels) vertPixels in relation to its current position.
moveTo(Xposition, Moves the window’s upper left corner to the position Xposition
Yposition) (horizontal) and Yposition (vertical).
open(URL, windowname [, Opens a new window named windowname, displaying the
features]) document referred to by URL, with the optional specified features.
The specified features are contained in a string, with the features
separated by commas. Features can include the following:
For example, to create a new window that is 400 pixels square, is not
resizable, and has no scroll bars, you could use the following string
for features:
"height=400,width=400,resizable=no,scrollbars=no"
print Calls the print routine for the user agent to print the current
document.
prompt(message[, Displays a dialog box containing message and a text box with the
input]) default input (if specified). The content of the text box is returned
if the user clicks OK. If the user clicks Cancel or otherwise closes
the dialog box, the method returns null.
releaseEvents Used to release any captured events of the specified type and to
(event_type) send them on to objects further down the event hierarchy.
resizeBy(horizPixels, Resizes the specified window by the specified horizontal and
vertPixels) vertical pixels. The window retains its upper left position; the resize
moves the lower right corner appropriately.
resizeTo(horizPixels, Resizes the specified window to the specified dimensions.
vertPixels)
routeEvent(event_type) Used to send an event further down the normal event hierarchy.
scrollBy(horizPixels, Scrolls the specified window by the amount (horizontally and
vertPixels) vertically) specified. The visible property of the window’s scrollbar
must be set to true for this method to work. Note that this method
has been largely deprecated in favor of scrollTo.
scrollTo(Xposition, Scrolls the specified window to the specified coordinates, with the
Yposition) specified coordinate becoming the top left corner of the viewable
area.
setInterval Causes the expression to be evaluated or the function to be called
(expression/function, repeatedly, at intervals of the specified number of milliseconds. Returns
milliseconds) the ID of the interval. Use the clearInterval method to stop the iterations.
Event Handlers
Event Handlers
onBlur
onDragDrop
onError
onFocus
onLoad
onMove
onResize
onUnload
Perl Language Reference
This appendix provides a comprehensive reference to the Perl language. Within this appendix you
will find listings for Perl’s many language conventions, including its variables, statements, and
functions. For more information on using the language, see Chapters 25 and 28 of this book.
#!/usr/bin/perl -U
Argument Use
-a Turns on autosplit mode. Used with the -n or -p options. (Splits
to @F.)
-c Checks syntax. (Does not execute program.)
-d Starts the Perl symbolic debugger.
-D number Sets debugging flags.
-e command Enters a single line of script. Multiple -e arguments can be used
to create a multiline script.
-F regexp Specifies a regular expression to split on if -a is used.
-i[extension] Edits < > files in place.
-I[directory] Used with -P, specifies where to look for include files. The directory
is prepended to @INC.
-l [octnum] Enables line-end processing on octnum.
Appendix D
Argument Use
-n Assumes a while (<>) loop around the script. Does not print lines.
-p Similar to -n, but lines are printed.
-P Executes the C preprocessor on the script before Perl.
-s Enables switch parsing after program name.
-S Enables PATH environment variable searching for program.
-T Forces taint checking.
-u Compiles program and dumps core.
-U Enables Perl to perform unsafe operations.
-v Outputs the version of the Perl executable.
-w Enables checks and warning output for spelling errors and other error-prone
constructs in the script.
-x [directory] Extracts a Perl program from input stream. Specifying directory changes
to that directory before running the program.
-X Disables all warnings.
-0[octal] Designates an initial value for the record separator, $/. See also -l.
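The -n, -a, and -l arguments are easier to remember once you see the loop they imply. The sketch below spells out roughly what a one-line script run with -lane and the body print $F[0] does. It is an approximation for illustration only: the sample input is invented, and the script reads from an in-memory string rather than from the files named on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Invented sample input standing in for the files that <> would read
my $input = "alpha 1\nbeta 2\ngamma 3\n";
open(my $in, '<', \$input) or die "Cannot open in-memory file: $!";

local $\ = "\n";                 # -l: print appends the line ending
my @first_fields;
while (my $line = <$in>) {       # -n: implicit loop over the input lines
    chomp $line;                 # -l: strip the line ending on input
    my @F = split ' ', $line;    # -a: autosplit each line on white space
    push @first_fields, $F[0];
    print $F[0];                 # the one-line script itself
}
close $in;
```

With -p in place of -n, the loop would also print each line after the script body runs.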
Command Use
h Prints out a help message.
T Prints a stack trace.
s Single-steps forward.
n Single-steps forward around a subroutine call.
RETURN (key) Repeats the last s or n debugger command.
r Returns from the current subroutine.
c [ line ] Continues until line, breakpoint, or exit.
p expr Prints expr.
l [ range ] Lists a range of lines. range may be a number, a subroutine name, or one
of the following formats: start-end, start+amount. (Omitting range
lists the next window.)
w Lists window around current line.
- Lists previous window.
f file Switches to file.
l sub Lists the subroutine sub.
S List the names of all subroutines.
/pattern/ Searches forward for pattern.
?pattern? Searches backward for pattern.
b [ line [ Sets breakpoint at line for the specified condition. If line is omitted, the
condition ]] current line is used.
b sub [ condition ] Sets breakpoint at the subroutine sub for the specified condition.
d [ line ] Deletes breakpoint at line.
D Deletes all breakpoints.
L Lists lines that currently have breakpoints or actions.
a line command Sets an action for line.
A Deletes all line actions.
< command Sets command to be executed before every debugger prompt.
> command Sets command to be executed before every s, c, or n command.
V [ package [ vars ]] Lists all variables or specified vars in package. If package is omitted, lists
main.
X [ vars ] Similar to V, but lists the current package.
! [ [-]number ] Re-executes a command. If number is not specified, the previous command
is used.
H [ -number ] Displays the last -number commands of more than one letter.
t Toggles trace mode.
= [ alias value ] Sets alias to value, or lists current aliases.
q Quits the debugger.
command Executes command as a Perl statement.
Operators
The following tables detail the various operators present in the Perl language.
Operator Use
< Numeric is less than
>= Numeric is greater than or equal to
<= Numeric is less than or equal to
eq String equality
ne String nonequality
gt String greater than
lt String less than
ge String greater than or equal to
le String less than or equal to
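The difference between the numeric and the string operators matters whenever values look like numbers. The following short sketch uses values chosen for illustration; truth() is a hypothetical helper that only makes the results printable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper that turns a Boolean result into a printable word
sub truth { return $_[0] ? "true" : "false" }

my $num_less  = truth(10 < 9);         # numeric: 10 is not less than 9
my $str_less  = truth(10 lt 9);        # string: "10" sorts before "9"
my $num_equal = truth("1.0" == "1");   # numeric: both values are 1
my $str_equal = truth("1.0" eq "1");   # string: the characters differ

print "10 < 9       is $num_less\n";    # false
print "10 lt 9      is $str_less\n";    # true
print "'1.0' == '1' is $num_equal\n";   # true
print "'1.0' eq '1' is $str_equal\n";   # false
```

As a rule of thumb, use the symbolic operators for quantities and the word operators for text such as names, codes, and form values.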
String Operators
Operator Use
. Concatenation
x Repetition
String Tokens
Token Character
\b Backspace
\e Escape
\t Horizontal tab
\n Line feed
\v Vertical tab
\f Form feed
\r Carriage return
\" Double quote
\' Single quote
\$ Dollar sign
\@ At sign
\\ Backslash
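The string operators and tokens listed above combine as in the following sketch. The variable names and the e-mail address are invented for illustration. Note that the tokens are expanded only inside double-quoted strings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name = "Perl";
my $rule = "-" x 10;                   # repetition: ----------
my $line = "Hello, " . $name . "!";    # concatenation: Hello, Perl!

# Double-quoted strings interpolate variables and expand tokens
my $double = "Name:\t$name\n";

# Single-quoted strings do neither, apart from \\ and \'
my $single = 'Name:\t$name\n';

# Escaping the sigils keeps them literal inside double quotes
my $literal = "Send \$5 to user\@example.com";

print $rule, "\n", $line, "\n", $double, $single, "\n", $literal, "\n";
```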
Standard Variables
The following tables detail the various standard variables in the Perl language.
Global Variables
Variable Use
$_ The default input and pattern-searching space.
$. The current input line number of the last filehandle read.
$/ The input record separator (newline is the default).
$, The output field separator for the print operator.
$" The separator joining elements of arrays interpolated in strings.
$\ The output record separator for the print operator.
$? The status returned by the last `...` command, pipe close, or system operator.
$] The Perl version number.
$; The subscript separator for multidimensional array emulation (default is \034).
$! In a numeric context, is the current value of errno. In a string context, is the
corresponding error string.
$@ The Perl error message from the last eval or do command.
$: The set of characters after which a string may be broken to fill continuation
fields in a format.
$0 The name of the file containing the Perl script being executed.
$$ The process ID of the currently executing Perl program.
$< The real user ID of the current process.
$> The effective user ID of the current process.
$( The real group ID of the current process.
$) The effective group ID of the current process.
$^A The accumulator for formline and write operations.
$^D The debug flags; passed to Perl using the -D command line argument.
$^F The highest system file descriptor.
$^I In-place edit extension, passed to Perl using the -i command line argument.
$^L Formfeed character used in formats.
$^P Internal debugging flag.
$^T The time (as delivered by time) when the program started. Value is used by the
file test operators -M, -A, and -C.
$^W The value of the -w command line argument.
$^X The name used to invoke the current program.
Context-Dependent Variables
Variable Use
$% The current page number of the current output channel.
$= The page length of the current output channel. (Default is 60.)
$- The number of lines remaining on the page.
$~ The name of the current report format.
$^ The name of the current top-of-page format.
$| Used to force a flush after every write or flush on the current output channel.
Set to nonzero to force flush.
$ARGV The name of the file when reading from < >.
Localized Variables
Variable Use
$& The string matched by the last successful pattern match.
$` The string preceding what was matched by the last successful pattern match.
$' The string following what was matched by the last successful pattern match.
$+ The last bracket matched by the last search pattern.
$1...$9 Contains the subpatterns from the corresponding parentheses in the last successful
pattern match. (Subpatterns greater than $9 are available if the match
contained more than 9 matched subpatterns.)
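A single pattern match fills in all of the localized variables at once. The sketch below uses an invented sample string and copies each variable into a named scalar immediately after the match, because the next successful match overwrites them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $text = "Order 05295 shipped to ITLAB";   # invented sample string

my ($match, $before, $after, $first, $second, $last) = ('') x 6;
if ($text =~ /(\d+) shipped to (\w+)/) {
    # Copy the values right away; a later match would replace them
    ($match, $before, $after) = ($&, $`, $');
    ($first, $second, $last)  = ($1, $2, $+);
}

print "matched: [$match]\n";    # [05295 shipped to ITLAB]
print "before:  [$before]\n";   # [Order ]
print "after:   [$after]\n";    # []
print "groups:  [$first] [$second], last group [$last]\n";
```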
Special Arrays
Array Use
@ARGV Contains the command-line arguments for the program. Does not include
the command name.
@EXPORT Names of methods a package exports by default.
@EXPORT_OK Names of methods a package can export upon explicit request.
@INC Contains a list of places to look for Perl scripts for require or do.
@ISA Contains a list of the base classes of a package.
@_ Contains the parameter array for subroutines. Also used by split (not in
array context).
%ENV Contains the current environment.
%INC Contains a list of files that have been included with require or do. The
key to each entry is the filename of the inclusion, and the value is the location
of the actual file used. (The require command uses this hash to
determine if a particular file has already been included or not.)
%OVERLOAD Overload operators in a package.
%SIG Sets signal handlers for various signals.
Statements
The following tables detail the various statements present in the Perl language.
Function/Statement Use
package name Designates the remainder of the current block as a package.
require expr Can be used in multiple contexts: If expr is numeric, statement
requires Perl to be at least the version in expr. If expr is nonnumeric,
it indicates a name of a file to be included from the Perl library.
(The .pm extension is assumed if none is given.)
return expr Returns from a subroutine with the value specified.
sub name { expr ; ... } Designates name as a subroutine. Parameters are passed by reference
as array @_. Returns the value of the last expression evaluated
in the subroutine or the value indicated with the return
statement.
[ sub ] BEGIN { expr ; ... } Defines a setup block to be called before execution of the rest of
the script.
[ sub ] END { expr ; ... } Defines a cleanup block to be called upon termination of the script.
tie var, package, [ list ] Ties a variable to a package that will handle it.
untie var Breaks the binding between var and its package.
use module [ [ Imports semantics from module into the current package.
version ] list ]
Statement Use
for (init_expr; cond_expr; loop_expr) {
    # loop code
}
The for loop is a complex loop structure typically used to iterate over a sequence
of numbers (for example, 1–10). At the start of the loop the init_expr is evaluated
and the cond_expr is also evaluated. The loop executes as long as cond_expr remains
true, evaluating the loop_expr on the second and subsequent iterations. For example,
the following loop executes 10 times, assigning the variable x values of 1 through 10:
foreach [ var ] (mixed) {
    # loop code
}
Performs the loop code once for every item in mixed, assigning the variable var to
each item in turn. For example, the following code will output all the values in
array @arr:
redo [ label ] Causes the loop to redo the current iteration of the loop
(not evaluating the conditional statement in the process).
If label is specified, it performs the action on the appropriately
labeled loop instead of the current one.
until (expr) {
    # statement(s) to execute
    # until expression is true
} [ continue {
    # statements to do at end
    # of loop or explicit continue
}]
Performs the loop code until expr evaluates to true. Note that because the
conditional statement is at the beginning of the loop, the loop code may not
execute (if expr is initially true).
while (expr) {
    # statement(s) to execute
} [ continue {
    # statements to do at end
    # of loop or explicit continue
}]
Performs the loop code while expr evaluates to true. Note that because the
conditional statement is at the beginning of the loop, the loop code may not
execute (if expr is initially false).
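The loop statements in this table can be compared side by side in a short sketch. The variable names and array contents are chosen for illustration; the for loop matches the description above (10 iterations, with the loop variable running from 1 through 10), and the foreach loop outputs every value of an array named @arr.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# for: runs 10 times, with $x taking the values 1 through 10
my $sum = 0;
for (my $x = 1; $x <= 10; $x++) {
    $sum += $x;
}
print "sum = $sum\n";        # sum = 55

# foreach: visits every element of @arr in turn
my @arr = ("CD", "DVD", "CDRW");
foreach my $item (@arr) {
    print "$item\n";
}

# while: the test comes first, so the body may never run
my $n = 3;
while ($n > 0) {
    $n--;
}

# until: the same kind of loop written with the opposite test
my $m = 0;
until ($m >= 3) {
    $m++;
}
print "n = $n, m = $m\n";    # n = 0, m = 3
```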
Functions
The following tables detail the various default functions available in the Perl language.
Arithmetic Functions
Function Use
abs expr Returns the absolute value of expr.
atan2 x,y Returns the arctangent of x/y.
cos expr Returns the cosine of expr.
exp expr Returns e (the natural logarithm base) to the power of expr.
int expr Returns the integer portion of expr.
log expr Returns the natural logarithm of expr.
rand [ expr ] Returns a random number between 0 and the value of expr. If expr is
omitted, returns a value between 0 and 1.
sin expr Returns the sine of expr.
sqrt expr Returns the square root of expr.
srand [ expr ] Sets the random number seed for the rand operator.
time Returns the number of seconds since January 1, 1970.
Conversion Functions
Function Use
chr expr Returns the character represented by the decimal value expr.
gmtime expr Returns a 9-element array (0 = $sec, 1 = $min, 2 = $hour, 3 = $mday, 4 =
$mon, 5 = $year, 6 = $wday, 7 = $yday, 8 = $isdst) with the time formatted
for the Greenwich time zone. Note that expr should be in a form
returned from a time function.
hex expr Returns the decimal value of expr, with expr interpreted as a hex string.
localtime expr Returns a 9-element array (0 = $sec, 1 = $min, 2 = $hour, 3 = $mday, 4 =
$mon, 5 = $year, 6 = $wday, 7 = $yday, 8 = $isdst) with the time formatted
for the local time zone. Note that expr should be in a form
returned from a time function.
oct expr Returns the decimal value of expr, with expr interpreted as an octal
string. If expr begins with 0x, expr is interpreted as a hex string instead
of an octal string.
ord expr Returns the ASCII value of the first character of expr.
vec expr, offset, Using expr as a vector of unsigned integers, each bits wide, returns the
bits element at offset. Note that bits must be a power of 2 between 1 and 32.
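The nine-element list returned by gmtime and localtime has two well-known quirks: $mon counts from 0 (January) and $year counts from 1900. The sketch below adjusts both before formatting. It uses epoch value 0 rather than the current time so that the result is the same on every system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# gmtime in list context, in the element order given in the table
my ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = gmtime(0);

# Adjust the 0-based month and the 1900-based year before printing
my $date = sprintf("%04d-%02d-%02d", $year + 1900, $mon + 1, $mday);
print "$date (weekday $wday, day of year $yday)\n";   # 1970-01-01 (weekday 4, day of year 0)

# The other conversion functions map between number bases and characters
print hex("ff"), "\n";      # 255
print oct("755"), "\n";     # 493
print oct("0x1F"), "\n";    # 31
print ord("A"), "\n";       # 65
print chr(65), "\n";        # A
```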
Structure Conversion
Function Use
pack template, list Returns a binary structure, packing the list of values using template.
See the template listing in the next table.
unpack template, expr Returns an array unpacking the structure expr, using template.
See the template listing in the next table.
For the pack and unpack functions, template is a sequence of characters containing the characters in
the following table.
Character Use
B A bit string (descending bit order inside each byte).
h A hex string (low nybble first).
H A hex string (high nybble first).
c A signed char value.
C An unsigned char value. (See U for Unicode chars.)
s A signed short value.
S An unsigned short value.
i A signed integer value.
I An unsigned integer value.
l A signed long value.
L An unsigned long value.
n An unsigned short in “network” (big-endian) order.
N An unsigned long in “network” (big-endian) order.
v An unsigned short in “VAX” (little-endian) order.
V An unsigned long in “VAX” (little-endian) order.
q A signed quad (64-bit) value.
Q An unsigned quad value.
j A Perl internal signed integer value (IV).
J A Perl internal unsigned integer value (UV).
f A single-precision float (native format).
d A double-precision float (native format).
F A Perl internal floating-point value (NV) in native format.
D A long double-precision float (native format).
p A pointer to a null-terminated string.
P A pointer to a structure (fixed-length string).
u A uuencoded string.
U A Unicode character number.
w A BER compressed integer.
x A null byte.
X Back up a byte.
@ Null fill to absolute position, counted from the start of the innermost
group.
( Start of a group.
) End of a group.
Each character can be followed by a decimal number that is used as a repeat count. An asterisk specifies all
remaining arguments. If the format begins with %N, the unpack function will return an N-bit checksum.
Spaces can be included in the template for legibility — they are ignored when the template is processed.
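A few templates in action make the repeat counts, byte orders, and checksum prefix easier to follow. The values below are chosen for illustration. The sketch also uses A, Perl's space-padded text string format, and prints packed data through the H (hex string) format so that the bytes are visible.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Three text characters, a big-endian 16-bit value, and one unsigned byte;
# the spaces in the template are only there for legibility
my $record = pack("A3 n C", "abc", 258, 7);
my $hex    = unpack("H*", $record);
print "$hex\n";                               # 616263010207

# The same template reverses the operation
my ($tag, $size, $flag) = unpack("A3 n C", $record);
print "$tag $size $flag\n";                   # abc 258 7

# Byte order: n is big-endian, v is little-endian
print unpack("H*", pack("n", 258)), "\n";     # 0102
print unpack("H*", pack("v", 258)), "\n";     # 0201

# An asterisk as the repeat count consumes everything that remains
my @codes = unpack("C*", "abc");
print join(",", @codes), "\n";                # 97,98,99

# A %32 prefix returns a 32-bit checksum instead of the values
my $checksum = unpack("%32C*", "abc");
print "$checksum\n";                          # 294
```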
String Functions
Function Use
chomp string|list Removes the trailing record separator (as set in $/) from string
or all elements of list. Returns the total number of characters
removed.
chop list Removes the last character from all elements of list. Returns the
last character removed.
crypt string, salt Encrypts string with a one-way hash in the manner of the C library’s crypt(3), using salt.
eval expr Parses expr and executes it as if it contained Perl code. Returns
the value of the last expression evaluated.
index string, substr [, Returns the position of substr in string at or after offset. If substr
offset ] is not found, index returns -1.
length expr Returns the length of expr in characters.
lc expr Returns a lowercase version of expr.
lcfirst expr Returns expr with the first character in lowercase.
quotemeta expr Returns expr with all regular expression metacharacters
quoted.
rindex string, substr [, Returns the position of the last substr in string at or before offset.
offset ]
substr expr, offset [, Extracts a substring of length len, starting at offset, out of expr and
len ] returns it. If len is omitted, everything to the end of the string is returned.
If offset is negative, substr counts from the end of the string.
uc expr Returns an uppercase version of expr.
ucfirst expr Returns expr with the first character in uppercase.
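The string functions are zero-based throughout, which the following sketch demonstrates with an invented sample string. Positions are counted from 0, and index returns -1 when nothing is found.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $s = "hello world";

print index($s, "o"), "\n";         # 4
print index($s, "o", 5), "\n";      # 7
print index($s, "xyz"), "\n";       # -1
print rindex($s, "o"), "\n";        # 7
print substr($s, 0, 5), "\n";       # hello
print substr($s, -5), "\n";         # world
print ucfirst(lc("PERL")), "\n";    # Perl
print quotemeta("a.b*c"), "\n";     # a\.b\*c
print length($s), "\n";             # 11

# chomp returns the number of characters it removed
my $line    = "last field\n";
my $removed = chomp($line);
print "$removed [$line]\n";         # 1 [last field]
```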
Function Use
splice @array, offset [, Removes the elements of @array designated by offset and length
length [, list ]] and replaces them with list (if specified). Returns the elements
removed from @array.
split [ pattern [, Splits a string into an array and returns the array. If limit is
expr [, limit ]]] specified, split creates at most the number of fields specified. If
pattern is omitted, the string is split at white space. If split is
not in array context, it returns the number of fields and splits to @_.
unshift @array, list Prepends list to the front of @array, and returns the number
of elements in the new array.
values %hash Returns an array containing all the values of hash.
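The array functions above are often used together. The following sketch is built on invented sample data (the product codes echo the sample form, and the %stock hash is hypothetical). It shows split with and without a limit, splice replacing elements in place, and the return values of unshift and values.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $codes = "MB,CPU,Case,Power,Mem";

# split: every field, then at most two fields
my @parts     = split /,/, $codes;
my @first_two = split /,/, $codes, 2;
print scalar(@parts), " fields; limited: $first_two[1]\n";   # 5 fields; limited: CPU,Case,Power,Mem

# splice: remove two elements starting at offset 1 and insert two others
my @removed = splice(@parts, 1, 2, "HD", "Periph");
print "removed: @removed\n";        # removed: CPU Case
print "now:     @parts\n";          # now:     MB HD Periph Power Mem

# unshift: prepend and return the new number of elements
my $count = unshift(@parts, "First");
print "count:   $count\n";          # count:   6

# values: all values of a hash, in no particular order, so sort for display
my %stock  = (MB => 3, CPU => 5);
my @levels = sort { $a <=> $b } values %stock;
print "levels:  @levels\n";         # levels:  3 5
```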
The following table explains the options mentioned in the preceding “Search and Replace Functions” table.
Test Use
-p File is a named pipe (FIFO), or filehandle is a pipe.
-S File is a socket.
-b File is a block special file.
-c File is a character special file.
-t Filehandle is opened to a tty.
-u File has a setuid bit set.
-g File has a setgid bit set.
-k File has a sticky bit set.
-T File is an ASCII text file.
-B File is a binary file (opposite of -T).
-M Script start time minus file modification time (in days).
-A Script start time minus access time (in days).
-C Script start time minus inode change time (in days).
File Operations
Function Use
chmod list Changes the permissions of the files in list. The first element of list is
the mode to use.
chown list Changes the owner and group of the files in list. The first two elements
of the list are the numerical userid and groupid to set.
truncate file, size Truncates file to size. The file can be a filename or a filehandle.
link oldfile, Creates newfile as a link to oldfile.
newfile
lstat file Identical to the stat function, but lstat does not traverse symbolic
links.
mkdir directory, Creates directory with permissions in mode. Sets $! if operation fails.
mode
readlink expr Returns the value of a symbolic link. Sets $! on system error, uses $_ if
expr is omitted.
rename oldname, Changes the name oldname to newname.
newname
stat file Returns a 13-element array where 0 = $dev, 1 = $ino, 2 = $mode, 3 =
$nlink, 4 = $uid, 5 = $gid, 6 = $rdev, 7 = $size, 8 = $atime, 9 =
$mtime, 10 = $ctime, 11 = $blksize, 12 = $blocks. Note that file
can be a filehandle, an expression evaluating to a filename, or _
(underline filehandle), which will use the file referred to in the last file
test operation or stat call. Returns a null list on failure.
symlink oldfile, Creates newfile symbolically linked to oldfile.
newfile
getc [ filehandle ] Returns the next character from filehandle, or an empty string if
EOF. Reads from STDIN if filehandle is not specified.
ioctl filehandle, Performs ioctl(2) on filehandle, using the supplied parameters,
function, $var with nonstandard return values.
open filehandle [ , Opens a file and associates it with filehandle. If filename is not
filename ] specified, the scalar variable filehandle must contain the filename.
Returns true on success or undef on failure.
read filehandle, $var, Reads length binary bytes from filehandle into $var at offset. Returns
length [ , offset ] the number of bytes read.
seek filehandle, Arbitrarily positions the file pointer. Returns true if the operation
position, whence was successful.
select [ filehandle ] Returns the current default filehandle. If filehandle is specified, it
becomes the current default filehandle.
select rbits, wbits, Performs a select(2) system call with the parameters specified.
nbits, timeout
tell [ filehandle ] Returns the current file pointer position for filehandle. Assumes the
last file accessed if filehandle is omitted.
write [ filehandle ] Writes a formatted record to filehandle, using the data format asso-
ciated with that filehandle.
Directory Functions
Function Use
closedir dirhandle Closes a directory opened by opendir.
opendir dirhandle, Opens dirname on the dirhandle specified.
dirname
readdir dirhandle Returns the next entry or an array of entries from dirhandle.
rewinddir dirhandle Positions the directory pointer at the beginning of the dirhandle list.
seekdir dirhandle, pos Sets the directory pointer on dirhandle to pos.
telldir dirhandle Returns the directory pointer position in the dirhandle list.
System Functions
Function Use
alarm expr Schedules a SIGALRM after expr seconds.
chdir [ expr ] Changes the working directory to expr. If expr is omitted, chdir
uses $ENV{"HOME"} or $ENV{"LOGDIR"}.
chroot dirname Changes the root directory to dirname for the process and its
children.
die [ list ] Prints list to STDERR and exits with the current value of $!.
exec list Executes the system command(s) list. Does not return.
exit [ expr ] Exits the program immediately with the value of expr. Calls
appropriate end routines and object destructors before exiting.
fork Performs a fork(2) system call. Returns the process ID of the
child to the parent process and 0 to the child process.
getlogin Returns the effective login name.
getpgrp [ pid ] Returns the process group for process pid. If pid is 0 or omitted,
getpgrp returns the process group of the current process.
getpriority which, who Returns the current priority for a process, a process group, or a
user.
glob pattern Returns a list of filenames that match the pattern pattern.
kill list Sends a signal to the processes in list. The first element of the list
is the signal to send in numeric or name form.
setpgrp pid, pgrp Sets the process group to pgrp for the process specified by pid. If
pid is omitted or 0, setpgrp uses the current process.
setpriority which, Sets the current priority for a process, a process group, or a user.
who, priority
sleep [ expr ] Causes the program to sleep for expr seconds. If expr is omitted,
the program sleeps forever. Returns the number of seconds slept.
syscall list Calls a system call. The first element in list is the system call; the
rest of list is used as arguments.
system list Similar to exec except that a fork is performed first, and the par-
ent process waits for the child process to complete.
times Returns a four-element array giving the user and system times, in
seconds, for this process and the children of this process (0=
$user, 1= $system, 2= $cuser, 3= $csystem).
umask [ expr ] Sets the umask for the process. Returns the old umask. Omitting
expr causes the current umask to be returned.
wait Behaves like the wait(2) system call: waits for a child process
to terminate. Returns the process ID of the terminated process (-1
if none). The status is returned in $?.
waitpid pid, flags Performs the same function as the waitpid(2) system call.
warn [ list ] Similar to die, warn prints list on STDERR but doesn’t exit.
Networking Functions
Function Use
accept newsocket, genericsocket Accepts a new socket similar to the accept(2) system
call.
bind socket, name Binds name to socket. The name should be a packed
address of an appropriate type for socket.
connect socket, name Attempts to connect name to socket, similar to the system
call.
getpeername socket Returns the socket address of the other end of socket.
getsockname socket Returns the name of socket.
getsockopt socket, level, Returns the socket option identified by optionname,
optionname queried at level.
listen socket, queuesize Similar to the listen system call, starts listening on
socket. Returns true or false depending on success.
recv socket, scalar, Attempts to receive length characters of data into scalar
length, flags from the specified socket. Specified flags are the same as
the recv system call.
send socket, msg, flags [ , to ] Attempts to send msg to socket. Takes the same flags as
the send system call. Use to when necessary to specify
an unconnected socket.
setsockopt socket, level, Sets the socket option optionname to optionvalue using the
optionname, optionvalue level specified.
shutdown socket, method Shuts down a socket using the specified method. The
method can be any valid method for the shutdown sys-
tem call.
socket socket, domain, Similar to the socket system call, creates a socket in
type, protocol domain with the type and protocol specified.
socketpair socket1, socket2, Similar to socket but creates a pair of bidirectional
domain, type, protocol sockets.
Miscellaneous Functions
Function Use
defined expr Tests whether expr has an actual value.
do filename Executes filename as a Perl script.
dump [ label ] Performs an immediate core dump to a new binary executable. When the
new binary runs, execution starts at the optional label or at the beginning
of the executable if label is not specified.
eval { expr1 ; Evaluates and executes any code between the braces ({ and }).
expr2; ... exprN}
ref expr Tests expr and returns true if expr is a reference. Returns a package
name if expr has been blessed into a package. (See the “Subroutines,
Packages, and Modules” table in the “Statements” section.)
reset [ list ] Resets all variables and arrays that begin with a letter in list.
scalar expr Evaluates expr in scalar context.
undef [ value ] Undefines value. Returns undefined.
wantarray Tests the current context to see if an array is expected. Returns true if
array is expected or false if array is not expected.
Regular Expressions
The following tables provide information used with Perl’s regular expressions and regex handling
functions.
Escape Characters
Escaped Character Use
\w Matches alphanumeric (including underscore).
\W Matches nonalphanumeric.
\s Matches white space.
\S Matches non–white space.
\d Matches numeric.
\D Matches nonnumeric.
\A Matches the beginning of the string.
\Z Matches the end of the string.
\b Matches word boundaries.
\B Matches nonword boundaries.
\G Matches where a previous m//g search left off.
\1 ... \9 Refer to previously matched subexpressions (grouped with
parentheses inside the match pattern).
Note: \10 and up can be used if the pattern matches more than nine
subexpressions.
Python Language Reference
This appendix lists various functions and variables available for CGI programming in Python.
Their syntax and general use are shown, and short examples are included, where necessary, for
clarity. Although care has been taken to include those functions most likely to be used in CGI pro-
gramming, Python is evolving, so older code might include deprecated functions that aren’t listed
here. Because Python is highly modularized, the functions and variables are grouped by the mod-
ule to which they belong. In most cases, the module must be imported before the functions listed
become available.
Built-in Functions
The following sections cover the functions built into Python. These functions are always available
and do not require that a specific module be imported for their use.
Syntax Description
__import__(mod) Imports the module represented by the string mod,
especially useful for dynamically importing a list
of modules:
myModules = ['sys', 'os', 'cgi', 'cgitb']
modules = map(__import__, myModules)
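basestring() Abstract superclass of str and unicode. It cannot be called or
instantiated, but it can be used to test whether an object is a string of either type:
if isinstance(obj, basestring):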
bool([x]) Returns True or False depending on the value of x. If x
is a false statement or empty, returns False; otherwise,
returns True.
callable(obj) Returns True if obj can be called; otherwise, returns False.
chr(i) Returns a string of one character whose ASCII code is the
integer i.
classmethod(func) Returns a class method for func in the following format:
class C:
@classmethod
def func(cls, arg1, arg2...):
cmp(a,b) Compares values a and b, returning a negative value if a
< b, 0 if a == b, and a positive value if a > b.
compile(string, filename, kind[,flags[,dont_inherit]]) Compiles string into a code object. filename is the file
containing the code. kind denotes what kind of code to
compile: 'exec', 'eval', or 'single'.
complex([real[,imag]]) Returns a complex number with value real+imag*j.
delattr(obj,string) Removes the attribute of obj whose name is string.
dict([mapping or sequence]) Returns a dictionary whose initial value is set to mapping
or sequence. Returns empty dictionary if no mapping or
sequence is provided.
dir(obj) Returns attributes and methods of obj. Works on nearly
any data type.
divmod(a,b) Returns quotient and remainder of two noncomplex
numbers, a and b.
enumerate(iterable) Returns an enumerate object based on the specified iter-
able object.
eval(expression[,globals[,locals]]) Returns evaluation of expression by Python rules using
globals (which must be a dictionary) as global namespace
and locals (which can be any mapping object) as local
namespace.
execfile(filename[,globals[,locals]]) Parses filename, evaluating it as a series of Python statements,
using globals and locals as the global and local namespaces.
globals and locals are dictionaries. If no locals are
given, locals is set to the provided globals.
file(filename[,mode[,bufsize]]) Returns a new file object opened in the specified mode. Modes are r for
reading, w for writing, and a for appending. A + appended
to mode opens the file for updating (both reading and writing), and a b appended
to mode opens the file in binary mode. bufsize may be 0
for unbuffered, 1 for line buffered, or any other positive
value for a specific buffer size.
filter(function, list) Returns a list of items from list where function is True.
float([x]) Returns string or number x as converted to a floating-point
value. If no argument x is given, returns 0.0.
frozenset([iterable]) Returns set (with elements taken from iterable) that has no
update methods but can be hashed and used as the mem-
ber of other sets or as dictionary keys (called a frozen set).
getattr(obj,name[,default]) Returns the value of the named attribute of obj or default if
it does not exist.
globals() Returns a dictionary containing the current global symbol
table.
hasattr(obj, name) Returns True if name is the name of one of obj’s attributes;
otherwise, returns False.
hash(obj) Returns the hash value of obj.
help([obj]) Invokes the built-in help system.
hex(x) Converts x to a hexadecimal value.
id(obj) Returns the unique integer representing obj.
input([prompt]) Returns the equivalent of eval(raw_input(prompt)).
int([x[,radix]]) Returns x converted to an integer. If x is a string, radix
specifies the base in which it is written.
isinstance(obj, classinfo) Returns True if obj is an instance of classinfo; otherwise,
False. For example, if issubclass(A,B), then
isinstance(x,A) => isinstance(x,B).
long([x[,radix]]) Returns a long integer converted from x to base radix. x
may be a string or a regular or long integer or a floating-
point number. A floating-point number is truncated
toward zero.
map(func, list) Returns list of results from applying func to every mem-
ber of list.
max(s[, args]) Returns the largest member of sequence s. If additional
args are specified, returns the largest of args.
min(s[, args]) Returns the smallest member of sequence s. If additional
args are specified, returns the smallest of args.
object() Returns a new featureless object.
oct(x) Returns octal form of integer x.
open(filename[,mode[,bufsize]]) Alias for the file function.
ord(c) Returns the integer ASCII or Unicode code of the one-character
string or Unicode character c. For example, ord('c') returns
the integer 99.
pow(x,y[,z]) Returns x to the power y; if z is present, returns x to the
power y, modulo z.
property([fget[,fset[,fdel[,doc]]]]) Returns a property attribute for a new-style class with
functions to get, set, and delete an attribute.
class C(object):
    def getx(self): return self.__x
    def setx(self, val): self.__x = val
    def delx(self): del self.__x
    x = property(getx, setx, delx, "doc of the 'x' property.")
range([start,]stop[,step]) Creates a progression list of plain integers often used for
loops. If all arguments are specified, the list looks like
[start,start+step,start+step+step,...].
raw_input(prompt) Prints prompt to standard output with no newline. Then
reads the next line from standard input and returns it
(without a newline).
>>> s = raw_input('prompt> ')
prompt> Phrase I typed in.
>>> s
'Phrase I typed in.'
reduce(function, sequence[,initializer]) Applies function of two arguments cumulatively to
items of sequence, from left to right, for the purpose of
reducing sequence to a single value.
str(x) Converts x into string form. Works on any available
data type.
sum(sequence[, start]) Returns the sum of start (0 by default) and the items of sequence.
The items are not allowed to be strings.
super(type[,obj or type]) Returns the superobject of type. If the second argument is
an object, isinstance(obj, type) must be true. If the second
argument is a type, issubclass(type2,type) must be true.
tuple([sequence]) Returns a tuple whose items are the same and in the
same order as those in sequence.
type(obj) Returns datatype of obj. Works on any available data type.
unichr(i) Returns the Unicode string of one character whose Unicode
code is the integer i. For example, unichr(99)
returns the string u'c'.
unicode([obj[,encoding[,errors]]]) Returns the Unicode version of obj. If no optional parameters
are specified, this function behaves like the
str() function except that it returns Unicode strings
instead of eight-bit strings. If encoding and/or errors are
given, unicode() decodes the object, which can
be either an eight-bit string or a character buffer, using
the codec for encoding.
vars([obj]) Without arguments, this function returns a dictionary
corresponding to the current local symbol table. If an
object such as a module, class, or instance is specified, this function returns a dictionary
corresponding to the specified object's symbol table.
xrange([start,]stop[,step]) This function is very similar to range(), but returns an
"xrange object" instead of a list.
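A few of the built-in functions from the preceding table can be sketched together as follows. This is only an illustration, written so that it also runs on Python 3, which is why it sticks to built-ins that behave the same in both versions; the Page class and its attributes are hypothetical names invented for the example.

```python
# Illustrative sketch only; Page is a hypothetical class, not part of Python.

# divmod returns the quotient and the remainder in a single call.
quotient, remainder = divmod(17, 5)

# With a third argument, pow returns (x ** y) modulo z.
modular = pow(2, 10, 1000)

# enumerate pairs each item of an iterable with its index.
indexed = list(enumerate(["html", "css", "js"]))

# ord and chr convert between a one-character string and its integer code.
code = ord("c")
char = chr(code)


class Page(object):
    """Hypothetical class used only to demonstrate attribute functions."""
    title = "Home"


page = Page()

# getattr returns the default when the named attribute does not exist.
title = getattr(page, "title", "untitled")
author = getattr(page, "author", "unknown")

# hasattr and isinstance both return Boolean values.
has_title = hasattr(page, "title")
is_page = isinstance(page, Page)

print(quotient, remainder, modular, indexed, code, char, title, author)
```

Supplying a default to getattr() avoids the AttributeError that a plain attribute access would raise for a missing name.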
array Module
These functions define functionality in support of the array object type, which can represent an array of
characters, integers, and floating-point numbers. Arrays are sequence types, behave very much like lists,
and are supported by the following functions:
Syntax Description
array(typecode[,initializer]) Returns a new array whose items are restricted by typecode and
initialized from the optional initializer value, which must be a list
or string in versions prior to 2.4 but may now also be any iterable
over elements of the appropriate type. The typecode is a character
that defines the item type.
append(x) Appends a new item of value x to the end of the array.
buffer_info() Returns a tuple (address, length) giving the current memory
address and the length in elements of the buffer used to hold the
array's contents.
byteswap() Swaps the byte order (endianness) of every item in the array. Supported
only for items that are 1, 2, 4, or 8 bytes in size.
count(x) Returns the number of times x occurs in the array.
extend(iterable) Appends items from iterable (which prior to version 2.4 had to be
another array but has been changed to include any iterable con-
taining elements of the same type as those in array) to end of
array.
fromfile(file,num) Appends num items from file to array or raises EOFError if fewer
than num are available.
fromlist(list) Appends items from list to array; equivalent to "for x in list:
array.append(x)".
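As a brief, hedged illustration of the array methods listed above, the following sketch builds a small integer array and inspects it. The variable names are arbitrary, and the typecodes 'i' (signed integer) and 'H' (unsigned short) are chosen only for the example.

```python
from array import array

# 'i' restricts the array to signed integers.
numbers = array("i", [1, 2, 3])

numbers.append(2)            # add a single item to the end
numbers.extend([5, 8])       # add items from any iterable of integers
numbers.fromlist([13, 21])   # same effect as appending each list item in turn

# count reports how many times a value occurs in the array.
twos = numbers.count(2)

# buffer_info returns the memory address and the length in elements.
address, length = numbers.buffer_info()

# byteswap reverses the byte order of every item in place.
words = array("H", [1])
words.byteswap()

print(numbers.tolist(), twos, length, words.itemsize, words[0])
```

Because the typecode fixes the item type, trying to append a value of another type (for example, a string to an 'i' array) raises a TypeError.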
asyncore Module
This module provides the basic functionality in support of writing asynchronous socket service clients
and servers.
Syntax Description
loop([timeout[,use_poll[,map[,count]]]]) Enters a polling loop that terminates after count
passes or after all open channels have been closed.
The use_poll parameter, which defaults to False,
can be set to True to indicate that the poll()
should be used in preference to select(). timeout
specifies the number of seconds before the
select() or poll() function should timeout. map
is a dictionary of channels to watch. If the map
parameter is omitted, a global map is used. This
map is updated by the default class __init__() —
make sure you extend, rather than override,
__init__() if you want to retain this behavior.
recv(buffer_size) Reads at most buffer_size bytes from the socket's remote
end-point. If the data returned is an empty string, the connection
has been closed from the remote end.
listen(backlog) Listens for backlog connections made to the socket. back-
log should be at least one and not more than the number
allowed by the operating system.
bind(address) Binds the socket to address as long as it is not already
bound.
accept() Accepts a connection for a socket that is bound to an
address and is listening for connection.
close() Closes the socket.
asynchat Module
This module builds on the basic functionality of the asyncore module for the purpose of simplifying
asynchronous communication between clients and servers. It especially helps with protocols whose ele-
ments are terminated by arbitrary strings or are of variable length.
Syntax Description
CLASS async_chat() Abstract subclass of asyncore.dispatcher. To make practical use of
it, you should subclass async_chat, providing meaningful
collect_incoming_data() and found_terminator() methods.
close_when_done() Pushes a None on the producer fifo, which, when popped off,
causes the channel to be closed.
collect_incoming_ Called with data holding some amount of received data. The
data(data) default method, which must be overridden, raises a
NotImplementedError exception.
discard_buffers() Discards any data held in the input or output buffers and the pro-
ducer fifo.
found_terminator() Called when the incoming data stream matches the termination
condition set by the set_terminator() method.
get_terminator() Returns the current terminator for the channel.
handle_close() Called when the channel is closed.
handle_read() Called when a read event fires on the channel’s socket in the asyn-
chronous loop. By default, it checks the termination condition set by
set_terminator(), which can be the appearance of a particular
string as input or the receipt of a particular number of characters,
and upon finding it, calls found_terminator().
handle_write() Called when the application may write to the channel.
handle_write() calls the initiate_send() method.
binascii Module
The following section covers the functions in Python’s binascii module. The binascii module pro-
vides functionality for conversion between binary and various ASCII-encoded representations.
Syntax Description
a2b_uu(string) Returns the binary data converted from the single line of
uuencoded data string.
b2a_uu(data) Returns a line of ASCII characters ending in a newline, as con-
verted from binary data.
a2b_base64(string) Returns binary data as converted from the block of base64 data
specified by string.
b2a_base64(data) Returns a line of ASCII characters in base64 coding ending in a
newline, as converted from binary data of 57 bytes or fewer.
a2b_qp(string[, header]) Returns binary data as converted from a block of quoted-printable
data. If the optional argument header is present and true, under-
scores will be decoded as spaces.
b2a_qp(data[,quotetabs, Returns a line or lines of ASCII characters in quoted-printable
istext, header]) format as converted from a line or lines of binary data. If the
optional argument header is present and true, spaces will be
encoded as underscores.
a2b_hqx(string) Converts binhex4-formatted ASCII data string to binary, without
doing RLE-decompression.
rledecode_hqx(data) Performs RLE-decompression on the data. The
decompression algorithm uses 0x90 after a byte
as a repeat indicator, followed by a count. A count of 0 specifies a
byte value of 0x90. The routine returns the decompressed data,
unless the input data ends in an orphaned repeat indicator, in
which case the Incomplete exception is raised.
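The conversions in the table above come in matched pairs, which the following sketch illustrates by encoding a short piece of binary data and decoding it again. It is written for Python 3, where the data is a bytes object; the sample data and variable names are arbitrary.

```python
import binascii

data = b"Web standards"

# base64: b2a_base64 returns one line ending in a newline.
b64_line = binascii.b2a_base64(data)
b64_back = binascii.a2b_base64(b64_line)

# uuencoding: b2a_uu handles a single line of up to 45 bytes of input.
uu_line = binascii.b2a_uu(data)
uu_back = binascii.a2b_uu(uu_line)

# quoted-printable: with header=True, spaces are encoded as underscores,
# and a2b_qp with header=True decodes them back to spaces.
qp_line = binascii.b2a_qp(b"a b", header=True)
qp_back = binascii.a2b_qp(qp_line, header=True)

print(b64_line, uu_line, qp_line)
```

Each a2b_ function reverses the corresponding b2a_ function, so the decoded values should equal the original input.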
cgi Module
The following section covers the functions in Python’s Common Gateway Interface module. This mod-
ule defines a number of utilities for use by CGI scripts written in Python.
Syntax Description
parse(file[,keep_blanks Parses a query from the specified file or from sys.stdin if none
[,strict_parsing]]) is specified. For details on the keep_blanks and strict_parsing
parameters, see the parse_qs function.
parse_qs(querystr[,keep_ Returns a dictionary containing data from the parsed query string
blanks[,strict_parsing]]) querystr. The keep_blanks parameter is a flag indicating whether
blank values in URL encoded queries should be treated as blank
strings. The strict_parsing parameter indicates whether or not to
raise an exception when parsing errors are found.
parse_qsl(querystr[,keep_ Returns a list of name,value pairs from the parsed query string
blanks[,strict_parsing]]) querystr. The keep_blanks parameter is a flag indicating whether
blank values in URL encoded queries should be treated as blank
strings. The strict_parsing parameter indicates whether or not to
raise an exception when parsing errors are found.
parse_multipart(file,pdict) Parses multipart/form-data for file uploads. The arguments are the input
file, file, and a dictionary, pdict, containing other parameters from the
Content-Type header. Returns a dictionary just like the parse_qs()
function: keys are the field names, and each value is a list of values for that
field.
parse_header(string) Parses MIME Header string into a main value and a dictionary of
parameters.
test() Writes minimal HTTP headers and formats all information pro-
vided to the script in HTML form for use in testing.
print_environ() Formats the shell environment in HTML format.
print_form() Formats a form in HTML.
print_directory() Formats the current directory in HTML.
print_environ_usage() Prints a list of cgi environment variables in HTML.
escape(s[,quote]) Converts the characters &, <, and > in string s to HTML-safe
sequences. If quote is True, double-quotes are translated as well.
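The query-parsing and escaping behavior described above can be sketched as follows. One assumption to note: the cgi module is not available in current Python 3 releases, so this sketch uses the standard-library equivalents urllib.parse.parse_qs(), urllib.parse.parse_qsl(), and html.escape() instead of the cgi functions themselves. In those functions, the flag this table calls keep_blanks is named keep_blank_values. The query string is invented for the example.

```python
# Sketch using Python 3 stand-ins for cgi.parse_qs, cgi.parse_qsl, cgi.escape.
from urllib.parse import parse_qs, parse_qsl
from html import escape

query = "name=Ada&lang=perl&lang=python&note="

# By default, fields with blank values are dropped from the result.
as_dict = parse_qs(query)

# keep_blank_values=True keeps them as empty strings.
with_blanks = parse_qs(query, keep_blank_values=True)

# parse_qsl returns (name, value) pairs in order instead of a dictionary.
pairs = parse_qsl(query)

# escape converts &, <, and >; quote=True also converts double quotes.
safe = escape('<a href="x">Tom & Jerry</a>', quote=True)

print(as_dict)
print(with_blanks)
print(pairs)
print(safe)
```

Every dictionary value is a list because a field name, such as lang here, may appear more than once in a query string.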
cgitb Module
The following section covers the functions in Python’s CGI Traceback module. Although this module
was originally developed to provide extensive traceback information in HTML for troubleshooting
Python CGI scripts, it has more recently been generalized to also provide information in plain text.
Output includes a traceback showing excerpts of the source code for each level, as well as the values of
the arguments and local variables to currently running functions to assist in debugging. To use this mod-
ule, add the following to the top of the script to be debugged:
import cgitb; cgitb.enable()
Syntax Description
enable([display[,logdir This function causes the cgitb module to take over the
[,context[,format]]]]) interpreter’s default handling for exceptions by setting the value
of sys.excepthook. The display argument may be 1, which
enables sending the traceback to the browser, or 0 to disable it.
The logdir argument specifies to write tracebacks to files in the
directory named by logdir. The context value is the number of lines
of context to display around the current line of source code in the
traceback. The format option may be either “html” to format the
output to HTML or any other value that formats it as plain text.
handler([info]) This function handles an exception using the default settings
(show a report in the browser, but don’t log to a file). The optional
info argument should be a 3-tuple containing an exception type,
exception value, and traceback object.
Cookie Module
The Cookie module provides a mechanism for state management in HTTP, primarily on the server side.
It supports simple string-only cookies and provides an abstraction for using any serializable data
type as a cookie value. The Cookie module originally applied the parsing rules described in the RFC
2109 and RFC 2068 specifications strictly, but modifications have made its parsing less strict. Due to security
concerns, two classes have been deprecated from this module: CLASS SerialCookie([input]) and
CLASS SmartCookie([input]). For backwards compatibility, the Cookie module exports a class
named Cookie, which is just an alias for SmartCookie. This is probably a mistake and will likely be
removed in a future version. You should not use the Cookie class in your applications, for the same
reason you should not use the SerialCookie class.
Syntax Description
exception Exception raised when the cookie in question is invalid according
CookieError to RFC 2109.
CLASS BaseCookie([input]) This class is a dictionary-like object with keys that are strings and
values that are Morsel instances. Upon setting a key to a value,
the value is converted to a Morsel containing the key and the
value. If input is given, this class passes it to the load() method.
CLASS SimpleCookie This class derives from BaseCookie and overrides
([input]) value_decode() and value_encode() to be the identity and
str(), respectively.
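The relationship between a cookie object and its Morsel values can be sketched as shown below. As an assumption for the sake of a runnable example, the sketch imports from http.cookies, the name under which the Cookie module is available in Python 3; the cookie names and values are invented.

```python
# Sketch using http.cookies, the Python 3 name of the Cookie module.
from http.cookies import SimpleCookie, Morsel

cookie = SimpleCookie()

# Assigning a string stores it as a Morsel holding the key and the value.
cookie["session"] = "abc123"
cookie["session"]["path"] = "/"   # Morsel attributes are set by key

morsel = cookie["session"]

# output() renders the cookie as HTTP header text.
header = cookie.output()

# load() parses cookie data, such as the contents of a Cookie request header.
parsed = SimpleCookie()
parsed.load("theme=dark; lang=en")

print(header)
print(parsed["theme"].value, parsed["lang"].value)
```

Passing a string to the SimpleCookie constructor has the same effect as calling load() with that string.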
cookielib Module
The cookielib module defines classes in support of the automatic handling of HTTP cookies, for
accessing Web sites that require cookies to be set on the client machine by an HTTP response from a Web
server and then to be returned to the server in later HTTP requests.
Syntax Description
exception LoadError Error returned if the cookies fail to load from the spec-
ified file.
CLASS CookieJar(policy=None) The CookieJar class stores HTTP cookies. It extracts
cookies from HTTP responses and returns them in later HTTP requests.
Instances of the CookieJar class automatically expire
contained cookies when necessary.
CLASS FileCookieJar(filename, A CookieJar that can load cookies from and save
delayload=None,policy=None) cookies to a file. Cookies are NOT loaded from the
named file until either the load() or revert()
method is called.
CLASS CookiePolicy() This class is responsible for deciding whether each
cookie should be accepted from or returned to the
server.
CLASS DefaultCookiePolicy(blocked_domains=None, allowed_domains=None, netscape=True,
rfc2965=False, hide_cookie2=False, strict_domain=False, strict_rfc2965_unverifiable=True,
strict_ns_unverifiable=False, strict_ns_domain=DefaultCookiePolicy.DomainLiberal,
strict_ns_set_initial_dollar=False, strict_ns_set_path=False)
Constructor arguments should be passed as keyword arguments only. blocked_domains is a
sequence of domain names that we never accept cookies from or return cookies to.
allowed_domains is either None or a sequence of the only domains for which we accept and
return cookies.
CLASS Cookie() This class represents Netscape, RFC 2109, and RFC
2965 cookies.
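The way a policy object is attached to a cookie jar can be sketched as follows. This assumes Python 3, where cookielib is available under the name http.cookiejar; the domain names are placeholders, and no network request is made.

```python
# Sketch using http.cookiejar, the Python 3 name of the cookielib module.
from http.cookiejar import CookieJar, DefaultCookiePolicy

# Policy arguments are passed by keyword, as the table above recommends.
policy = DefaultCookiePolicy(
    blocked_domains=["ads.example.com"],  # placeholder domain
    rfc2965=False,
)

# The jar consults the policy whenever cookies are accepted or returned.
jar = CookieJar(policy=policy)

# is_blocked reports whether a domain is on the blocked list.
blocked = policy.is_blocked("ads.example.com")
not_blocked = policy.is_blocked("www.example.com")

# A new jar holds no cookies until responses are processed.
cookie_count = len(jar)

print(blocked, not_blocked, cookie_count)
```

In practice the jar is usually handed to an HTTP client library, which fills it from server responses; that step is omitted here.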
email Module
The following functions are available from the email module for use in setting and querying header
fields and for accessing message bodies. This module replaces the functionality of the mimetools mod-
ule from before Python version 2.3.
Syntax Description
CLASS Message() The basic message constructor. Message objects provide a
mapping style interface for accessing the message headers and
an explicit interface for accessing both the headers and the pay-
load, which can be either a string or a list of Message objects
for MIME container documents (for example, multipart/*
and message/rfc822).
as_string([unixfrom]) Return the entire message flattened as a string. When optional
unixfrom is True, the envelope header is included in the
returned string. unixfrom defaults to False.
__str__() Equivalent to as_string with unixfrom set to True.
is_multipart() Returns True if the message's payload is a list of sub-Message
objects; otherwise (if it is a string) returns False.
add_header(_name,_value, The add_header() method is similar to __setitem__()
**_params) except that additional header parameters can be provided as
keyword arguments. _name is the header field to add and
_value is the primary value for the header.
msg.add_header('Content-Disposition',
'attachment', filename='example.gif')
adds a header which looks like:
Content-Disposition: attachment;
filename="example.gif"
replace_header(_name,_value) Replaces the first header found in the message that matches
_name, retaining header order and field name case.
get_content_type() Returns the message’s content type in lowercase of the form
maintype/subtype or the default content type if there is no
Content-Type Header in the message.
get_content_maintype() Returns the message’s main content type. This is the maintype
part of the string returned by get_content_type().
get_content_subtype() Returns the message’s sub-content type. This is the subtype
part of the string returned by get_content_type().
get_default_type() Returns the default content type. Most messages have a default
content type of text/plain, except those that are subparts of
multipart/digest containers and have a default content type of
message/rfc822.
set_type(type[,header[,requote]]) Sets the main type and subtype for the Content-Type:
header where type is a string in the form maintype/subtype.
If requote is False, this leaves the existing header's quoting
as is; otherwise, the parameters will be quoted.
get_filename([failobj]) Returns the value of the filename parameter of the
Content-Disposition: header of the message, or failobj if
either the header is missing or has no filename parameter.
get_boundary([failobj]) Returns the value of the boundary parameter of the
Content-Type: header of the message, or failobj if the header is
missing or has no boundary parameter.
set_boundary(boundary) Sets the boundary parameter of the Content-Type: header
to boundary. set_boundary() always quotes boundary if necessary.
get_content_charset([failobj]) Returns the charset parameter of the Content-Type:
header in lowercase if it exists; otherwise returns failobj.
get_charsets([failobj]) Returns a list containing the character set names in the mes-
sage, one element for each subpart of the payload or failobj if
no content header exists.
walk() This method is an all-purpose generator used to iterate over
all parts and subparts of the message object tree.
preamble MIME document format allows some text between the
blank line following the headers and the first multipart
boundary string. The preamble attribute contains this lead-
ing extra-armor text.
epilogue Text that appears between the last boundary and the end of
the message is stored in the epilogue attribute.
defects The defects attribute contains a list of all problems occurring
during message parsing.
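Several of the Message methods described above can be sketched together as shown below. The header values and the file name are invented for the example, and the sketch runs on Python 3, where the email package provides the same Message class.

```python
from email.message import Message

msg = Message()

# The mapping-style interface sets and reads headers by name.
msg["Subject"] = "Report"

# Without a Content-Type header, the default content type is reported.
default_type = msg.get_content_type()

# add_header accepts extra header parameters as keyword arguments.
msg.add_header("Content-Disposition", "attachment", filename="example.gif")

# set_type sets the Content-Type header from a maintype/subtype string.
msg.set_type("image/gif")

main_type = msg.get_content_maintype()
sub_type = msg.get_content_subtype()
filename = msg.get_filename()

# walk visits the message itself and then any subparts.
parts = list(msg.walk())

# as_string flattens the headers and payload into text.
text = msg.as_string()

print(default_type, main_type, sub_type, filename, msg.is_multipart())
```

For a message with no subparts, walk() yields only the message itself, so the same loop works for simple and multipart messages.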
file Object
The following section covers the functions available when the built-in function file() is called. The
file object returned has the inherent functionality described in the following table. These functions are
called like this:
filename.function()
Syntax Description
close() Closes the file.
flush() Flushes the internal file buffer.
fileno() Returns the integer “file descriptor” that is used by the underlying
implementation to request I/O operations from the operating system.
isatty() Returns True if the filename describes a TTY device; otherwise,
returns False.
read([size]) Returns, in the form of a string object, size bytes from the file or fewer
if the read hits EOF before obtaining that many bytes. If the size argu-
ment is negative or omitted, returns all data until EOF is reached.
readline([size]) Reads and returns a line from the file, including the trailing newline
character.
readlines([size]) Reads and returns all lines from file as a list, including the trailing
newline characters.
seek(offset[,pos]) Sets the file's current position, like stdio's fseek(). The pos
argument defaults to 0, which selects absolute file positioning. Other
values are 1, to seek relative to the current position, and 2, to seek relative to
the end of the file.
tell() Returns the file’s current position.
truncate([size]) Truncates the file to length size bytes or to the current position if size is not specified.
write(string) Writes string to the file.
writelines(sequence) Writes a sequence of strings to the file. The sequence can be any iter-
able object producing strings, typically a list of strings.
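The positioning methods in the table above are easiest to follow in sequence, so here is a minimal sketch. It assumes Python 3 and therefore uses open() rather than file(), opens a temporary file in binary mode so that positions are plain byte offsets, and removes the file afterward; the file contents are arbitrary.

```python
import os
import tempfile

# Create a scratch file so the example leaves nothing behind.
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "w+b") as f:
    # writelines writes each item as is; no newlines are added.
    f.writelines([b"first line\n", b"second line\n"])
    f.flush()
    end = f.tell()            # position after everything written

    f.seek(0)                 # absolute positioning (pos defaults to 0)
    first = f.readline()      # includes the trailing newline
    rest = f.readlines()      # remaining lines as a list

    f.seek(5)
    f.truncate()              # no size given: cut at the current position

    f.seek(0, 2)              # pos 2: relative to the end of the file
    size_after = f.tell()

os.remove(path)

print(end, first, rest, size_after)
```

Calling truncate() without an argument depends on the current position, so it is worth calling seek() first whenever the intended length matters.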
gc Module
Syntax Description
enable() Enables the garbage collector.
disable() Disables the garbage collector.
isenabled() Returns True if the garbage collector is enabled.
collect() Runs a full garbage collection, examining all generations and returning
the number of unreachable objects found.
set_debug() Sets debugging flags for garbage collection and writing the resulting
debugging information out to stderr. The flags may be any of the
following:
httplib Module
Syntax Description
CLASS HTTPConnection(host[,port]) An instance of this class represents one transaction with
an HTTP server. If port is not provided and host is of
the form host:port, the port to connect to is taken from
this string. If host does not contain this port section
and the port parameter is not provided, the connection
is made to the default HTTP port, 80.
request(method,url[,body[,headers]]) This will send a request to the server using the HTTP
request method method and the selector url. If the body
argument is present, it should be a string of data to
send after the headers are finished. The headers argu-
ment should be a mapping of extra HTTP headers to
send with the request.
getresponse() Should be called after a request is sent to get the
response from the server. Returns an HTTPResponse
instance.
set_debuglevel(level) Sets the default debugging level; defaults to no debug-
ging data printing out.
connect() Connects to the server specified when the object was
created.
close() Closes the connection to the server.
send(data) Sends specified data to the server. This method should
be called directly only after the endheaders() method
and before the getresponse() method.
putrequest(request,selector[,skip_host[,skip_accept_encoding]]) First call made to the server after a
connection has been made. It sends a line to the server consisting of the
request string, the selector string, and the HTTP version. skip_host and
skip_accept_encoding are Boolean flags that suppress the automatic Host: and
Accept-Encoding: headers.
putheader(header,arguments) Sends an RFC 822-style header to the server. It sends a
line to the server consisting of the header, a colon and
a space, and the first argument. If more arguments are
given, continuation lines are sent, each consisting of a
tab and an argument.
endheaders() Sends a blank line to the server, signaling the end of
the headers.
CLASS HTTPSConnection(host[,port,key_file,cert_file]) An instance of this class represents one
transaction with a secure HTTP server. If port is not provided and host
is of the form host:port, the port to connect to is taken
from this string. If host does not contain this port section
and the port parameter is not provided, the connection
is made to the default HTTPS port, 443.
key_file is the name of a Privacy Enhanced Mail
(PEM) formatted file that contains
your private key. cert_file is a PEM formatted certificate
chain file.
CLASS HTTPResponse(sock[,debuglevel=0][,strict=0]) Class whose instance is returned upon successful
connection. Not instantiated directly.
read([byte]) Reads and returns the response body, or up to the next byte bytes if byte is given.
getheader(name[,default]) Gets the content of the header name, or default if there is no
matching header.
getheaders() Returns a list of (header, value) tuples.
msg Instance of mimetools.Message (deprecated) or
email.message, which contains the response headers.
exception CannotSendHeader Subclass of ImproperConnectionState.
exception ResponseNotReady Subclass of ImproperConnectionState.
exception BadStatusLine Subclass of HTTPException raised when the server
responds with an unknown HTTP status code.
HTTP_PORT Variable holding default value for HTTP Port, 80.
HTTPS_PORT Variable holding default value for HTTPS Port, 443.
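The request and response methods above can be exercised without leaving the local machine. The following is a minimal sketch, not a definitive recipe: HelloHandler and fetch() are hypothetical names, the throwaway server exists only so the client has something to talk to, and the fallback import assumes a later Python release in which httplib is named http.client.

```python
import threading

try:                                   # Python 2 names, as used in this appendix
    import httplib
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
except ImportError:                    # assumption: later releases renamed the modules
    import http.client as httplib
    from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a fixed body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):      # keep the example quiet
        pass


def fetch(path="/"):
    """Start a local server, issue one GET, and return (status, type, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)   # port 0: pick a free port
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        conn = httplib.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", path, headers={"Accept": "text/plain"})
        response = conn.getresponse()              # an HTTPResponse instance
        result = (response.status,
                  response.getheader("Content-Type"),
                  response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, content_type, body = fetch()
```

Against a real server, only the HTTPConnection, request(), getresponse(), and close() calls would be needed.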
imaplib Module
The imaplib module defines three classes, IMAP4, IMAP4_SSL, and IMAP4_stream, which encapsulate
a connection to an IMAP4 server and implement a large subset of the IMAP4rev1 client protocol, as
defined in RFC 2060.
Syntax Description
CLASS IMAP4([host[,port]]) Initializes the instance thereby creating the connection and deter-
mining the protocol (IMAP4 or IMAP4rev1). If host is not specified,
localhost is used. If port is omitted, the standard IMAP4 port (143)
is used.
exception IMAP4.error Exception raised on any error.
exception IMAP4.abort Subclass of IMAP4.error, which is raised upon IMAP4 server
errors.
exception IMAP4.readonly Subclass of IMAP4.abort, which is raised when a writable mailbox
has its status changed by the server.
CLASS IMAP4_SSL([host[,port[,keyfile[,certfile]]]]) This is a subclass derived from IMAP4 that
connects over an SSL-encrypted socket (to use this class, you need a socket module
that was compiled with SSL support).
CLASS IMAP4_stream(command) This is a subclass derived from IMAP4 that connects to the
stdin/stdout file descriptors created by passing command to
os.popen2().
mimetools Module
Deprecated. Use the email module now.
os Module
The following table covers the functions in Python’s os module. This module provides more portable
access to the underlying operating system functionality than the posix module. Extensions specific to
particular operating systems exist, but using them makes a script much less portable. These functions
handle tasks such as file processing, directory traversal, and access/permissions assignment.
Syntax Description
remove() Deletes the file.
unlink() Same as remove().
rename() Renames the file.
stat() Returns file statistics for the file.
lstat() Like stat(), but returns statistics for a symbolic link itself rather than its target.
symlink() Creates a symbolic link.
utime() Updates the timestamp for the file.
chdir() Changes the working directory.
listdir() Lists the entries in the specified directory.
getcwd() Returns the current working directory.
mkdir(dir) Creates directory as specified by dir.
makedirs() Same as mkdir() except that any missing intermediate directories are also created.
rmdir(dir) Removes directory as specified by dir.
removedirs() Same as rmdir() except that parent directories left empty are also removed.
access() Verifies permission modes for the file.
chmod() Changes permission modes for the file.
umask() Sets the default permission mask applied to newly created files.
basename() Removes the directory path and returns the leaf name.
dirname() Removes leaf name and returns directory path.
join() Joins separate components into a single pathname.
split() Returns a tuple containing a dirname() and a basename().
splitdrive() Returns tuple containing drivename and pathname.
splitext() Returns tuple containing filename and extension.
getatime() Returns last access time for file. This varies a bit with different operating
systems.
getmtime() Returns last file modification time for file.
getsize() Returns file size in bytes.
exists() Returns True if pathname, file, or directory exists.
isdir() Returns True if pathname exists and is a directory.
isfile() Returns True if pathname exists and is a file.
islink() Returns True if pathname exists and is a symbolic link.
samefile() Returns True if both pathnames refer to the same file.
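As a rough illustration of how several of these functions work together, the following sketch creates a nested directory, renames a file inside it, and cleans up again. The directory and file names are invented, and the work happens inside a temporary directory so nothing else on the system is touched.

```python
import os
import tempfile

start_dir = os.getcwd()
base = tempfile.mkdtemp()          # hypothetical scratch directory
os.chdir(base)                     # work with relative names from here on

nested = os.path.join("a", "b")
os.makedirs(nested)                # creates "a", then "a/b"

old_name = os.path.join(nested, "draft.txt")
new_name = os.path.join(nested, "final.txt")
handle = open(old_name, "w")
handle.write("hello")
handle.close()

os.rename(old_name, new_name)
listing = os.listdir(nested)               # bare names, no directory prefix
size = os.stat(new_name).st_size           # same number os.path.getsize() reports
readable = os.access(new_name, os.R_OK)    # True if the file may be read

os.remove(new_name)
os.removedirs(nested)              # removes "a/b", then the now-empty "a"
leftovers = os.listdir(".")

os.chdir(start_dir)                # restore the working directory
os.rmdir(base)
```

Relative names are used with removedirs() on purpose: the function keeps removing empty parent directories, so a relative path limits how far up the tree it can go.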
os.path Module
The following table covers the functions in Python’s os.path module. This module provides support
for common pathname manipulations.
Syntax Description
abspath(path) Returns a normalized absolute version of the pathname path.
basename(path) Returns the base name of pathname path, the second half of the
pair returned by split(path).
commonprefix(list) Returns the longest path prefix that is a prefix of all paths in list. If
there is none, an empty string is returned.
dirname(path) Return the directory name of pathname path, the first half of the
pair returned by split(path).
exists(path) Returns True if path refers to an existing path and False if it does not or if
path is a broken symbolic link.
lexists(path) Same as exists(), except that it returns True for broken symbolic links.
Equivalent to exists() on platforms lacking os.lstat().
expanduser(path) Returns path with an initial component of “~” or “~user”
replaced by that user’s home directory.
expandvars(path) Returns path with environment variables expanded.
getatime(path) Returns the time path was last accessed, in seconds
since the epoch.
getmtime(path) Returns the time path was last modified, in seconds
since the epoch.
getctime(path) Returns the system’s ctime, which, on some systems (such as
Unix), is the time of the last change, and on others (such as Windows)
is the creation time for path, in seconds since the epoch.
getsize(path) Returns the size in bytes of path.
isabs(path) Returns True if path is an absolute path (on Unix, one that starts with /).
isfile(path) Return True if path is an existing regular file. This function fol-
lows symbolic links, so both islink() and isfile() can be
True for the same path.
samestat(stat1,stat2) Returns True if the stat tuples stat1 and stat2 refer to the same file.
Stat tuples are returned from stat, lstat, and fstat functions.
split(path) Splits the pathname path into a pair (head, tail), where tail is the last
pathname component and head is everything leading up to that. The
tail part will never contain a slash; if path ends in a slash, tail will
be empty.
splitdrive(path) Splits the pathname path into a pair (drive, tail), where drive
is either a drive specification or an empty string. On systems that
do not use drive specifications, drive will always be an empty
string.
splitext(path) Splits the pathname path into a pair (root, ext) such that root + ext == path,
where ext is either empty or begins with a period and contains at most one
period. For example:
splitext("etc/myfile.txt") == ("etc/myfile", ".txt")
walk(path, visit, arg) Calls the function visit with arguments (arg, dirname, names) for
each directory in the directory tree rooted at path (including path
itself, if it is a directory). The argument dirname specifies the vis-
ited directory; the argument names lists the files in the directory.
The visit function may modify names to influence the set of direc-
tories visited below dirname.
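The relationships among split(), basename(), dirname(), and splitext() are easiest to see side by side. The following sketch uses a made-up relative path; none of the calls touch the filesystem except exists().

```python
import os.path

sample = os.path.join("etc", "conf", "myfile.txt")   # made-up relative path

head, tail = os.path.split(sample)        # (directory part, last component)
root, ext = os.path.splitext(sample)      # the extension keeps its leading period

same_tail = os.path.basename(sample) == tail   # basename() is the second half of split()
same_head = os.path.dirname(sample) == head    # dirname() is the first half of split()
rebuilt = os.path.join(head, tail)             # join() reverses split()

# commonprefix() compares character by character, not component by component.
prefix = os.path.commonprefix(["/usr/lib/python", "/usr/local/lib"])

absolute = os.path.isabs(os.path.abspath(sample))
missing = os.path.exists(os.path.join(sample, "no-such-entry"))
```

Because commonprefix() works on characters, its result is not guaranteed to be a valid directory name.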
poplib Module
The following table covers the functions in Python’s poplib module, which defines a class, POP3, that
supports connecting to a POP3 server and implements the protocol as defined in RFC 1725, and a class
POP3_SSL, which supports connecting to a POP3 server that uses SSL as an underlying protocol layer as
defined in RFC 2595. Instances of the POP3 class include all of the methods listed. Instances of POP3_SSL
have no additional methods. The interface of this subclass is identical to its parent.
Syntax Description
CLASS POP3(host[,port]) This class is for implementing a connection to the mail server host
using the POP3 protocol. The connection is created when an
instance of the class is initialized. If port is omitted, the standard
POP3 port (110) is used.
CLASS POP3_SSL(host[,port[,keyfile[,certfile]]]) This class is for implementing a connection to the
mail server host using the POP3 protocol over an SSL-encrypted port. The connection
is created when an instance of the class is initialized. If port is
omitted, the standard POP3 over SSL port (995) is used. A PEM
formatted private key and certificate chain file for the SSL connection
may be provided.
set_debuglevel(level) Sets the instance’s debugging level to 0, which produces no debugging
output, 1, which produces a moderate amount of debugging
output, or 2, which produces the maximum amount of debugging
output. Any number higher than 2 produces the same amount as
specifying 2.
getwelcome() Returns the greeting string sent by the POP3 server to which the
connection is made.
user(username) Sends the user command for the specified username to the POP3
server.
pass_(password) Sends the password command with the specified password to the
POP3 server. The mailbox will be locked until quit is sent.
apop(user,secret) Uses the more secure APOP authentication to log into the POP3
server.
rpop(user) Uses RPOP commands to log into POP3 server.
stat() Returns tuple representing mailbox status (message count, mail-
box size)
list([msg]) Requests the message list. The response is in the form
(response, ['mesg_num octets', ...], octets). If msg is given, only that message is listed.
retr(msg) Retrieves the whole message msg and marks it as seen. Response
is in format (response, [‘line’,...], octets).
dele(msg) Flags message number msg for deletion, which, on most servers,
occurs when the quit command is issued.
rset() Resets the deletion marks on any messages in the mailbox.
noop() Does nothing. Is sometimes used to keep the connection alive.
quit() Commits changes, unlocks mailbox, and drops the connection.
top(msg,amount) Retrieves the message header plus amount lines of the message
after the header of message number msg.
uidl([msg]) Returns message digest list for message identified by msg or for all
if msg is not specified.
smtpd Module
This module offers classes for implementing SMTP servers in Python.
Syntax Description
CLASS SMTPServer(localaddr,remoteaddr) Creates a new SMTPServer object that binds to localaddr,
treating remoteaddr as an upstream SMTP relayer. SMTPServer inherits
from asyncore.dispatcher and is thus inserted into
asyncore’s event loop when instantiated.
smtplib Module
These functions supply Simple Mail Transfer Protocol (SMTP) functionality for use in Python scripts.
Syntax Description
CLASS SMTP([host[,port[,local_hostname]]]) Encapsulates an SMTP connection. Has methods that
support SMTP and ESMTP operations. If the optional host and port
parameters are included, the connect() method is called with them
during initialization.
set_debuglevel(level) Sets the level of debug output. If level is set to True, debug log-
ging is enabled for the connection and all messages sent to and
received from the server.
connect([host[,port]]) Connects to host on port port. If host contains a colon followed by
a number, the number will be interpreted as the port number.
docmd(cmd[,argstring]) Sends the command cmd and optional arguments argstring to
the server and returns a 2-tuple containing the numeric
response code and the actual response line.
helo([hostname]) Identifies the client to the SMTP server using “HELO”. This is usually
called by sendmail() and not directly.
ehlo([hostname]) Identifies the client to the ESMTP server using “EHLO”. This is usually
called by sendmail() and not directly.
has_extn(name) Returns True if name is in the set of SMTP service extensions
returned by the server; otherwise, returns False.
verify(address) Verifies address on the server using SMTP “VRFY”, which
returns a tuple of code 250 and a full address if the user
address is valid.
login(user,password) Logs in to an SMTP server that requires authentication, using
user and password. Automatically tries either “EHLO” or “HELO”
if this login attempt was not preceded by it. May raise the following exceptions:
SMTPHeloError Raised if the server doesn’t reply correctly to the
“HELO” greeting.
SMTPAuthenticationError Raised if the server doesn’t accept the
username/password combination.
SMTPException Raised if no suitable authentication method was found.
starttls([keyfile[,certfile]]) Puts the SMTP connection into Transport Layer Security
mode. Requires that ehlo() be called again afterwards. If keyfile
and certfile are provided, they are passed to the socket
module’s ssl() function.
sendmail(from_addr,to_addr,msg[,mail_opts,rcpt_opts]) Sends mail to to_addr from from_addr
consisting of msg. Automatically tries either “EHLO” or “HELO” if this sendmail
attempt was not preceded by it. May raise the following exceptions:
SMTPRecipientsRefused Raised if all recipient addresses are refused.
SMTPHeloError Raised if the server doesn’t reply correctly to the “HELO”
greeting.
SMTPSenderRefused Raised if the server doesn’t accept from_addr.
SMTPDataError Raised if the server replies with an unexpected error code.
quit() Terminates the SMTP session and closes the connection.
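The usual order of calls (connect, optionally log in, send, quit) might be wrapped as in the following sketch. The function names build_message() and send_note(), the addresses, and the host argument are all hypothetical; send_note() is defined but not called here because it needs a reachable mail server.

```python
import smtplib


def build_message(from_addr, to_addrs, subject, body):
    """Return an RFC 822-style message string suitable for sendmail()."""
    headers = [
        "From: %s" % from_addr,
        "To: %s" % ", ".join(to_addrs),
        "Subject: %s" % subject,
    ]
    # A blank line separates the headers from the body.
    return "\r\n".join(headers) + "\r\n\r\n" + body


def send_note(host, from_addr, to_addrs, subject, body, user=None, password=None):
    """Send one message; returns the dictionary of refused recipients."""
    server = smtplib.SMTP(host)              # connects immediately because host is given
    try:
        if user is not None:
            server.login(user, password)     # may raise SMTPAuthenticationError
        return server.sendmail(from_addr, to_addrs,
                               build_message(from_addr, to_addrs, subject, body))
    finally:
        server.quit()                        # always end the session


message = build_message("me@example.com", ["you@example.com"], "Hi", "Hello there.")
```

sendmail() does not build headers for you, which is why the message string is assembled separately before it is sent.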
socket Module
These functions define functionality in support of the creation and usage of sockets in Python.
Syntax Description
socket(socket_family,socket_type,protocol) Creates a socket of socket_family AF_UNIX or AF_INET, as
specified. The socket_type is either SOCK_STREAM or
SOCK_DGRAM. The protocol is usually left to default to 0.
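A datagram socket is the quickest way to see socket() in action, because no connection has to be established. The following sketch sends one packet over the loopback interface; bind(), sendto(), recvfrom(), settimeout(), and getsockname() are standard socket methods that this table does not list, and the payload is arbitrary.

```python
import socket

# Receiving end: a UDP socket bound to the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)   # protocol defaults to 0
receiver.bind(("127.0.0.1", 0))            # port 0: let the OS choose a free port
receiver.settimeout(5)                     # avoid hanging if the datagram is lost
address = receiver.getsockname()           # (host, port) actually assigned

# Sending end: a second UDP socket; no connection is needed.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"ping", address)

data, source = receiver.recvfrom(1024)     # payload and the sender's address

sender.close()
receiver.close()
```

For a stream (TCP) connection you would pass SOCK_STREAM instead and add the usual listen/accept/connect steps.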
string Module
The following table covers the functions in Python’s string module. The term string is used to repre-
sent the string variable to be acted upon by the function. For example, the function to capitalize a string
variable named myString would be called as follows:
myString.capitalize()
Syntax Description
string.capitalize() Returns a copy of string with only its first character capitalized.
string.center(width) Returns a copy of string centered in a string of length width.
string.count(sub[,start[,end]]) Returns number of occurrences of substring sub in string.
string.find(sub[,start[,end]]) Returns the lowest index in string where substring sub is found.
Returns -1 if sub is not found.
string.index(sub[,start[,end]]) Behaves like string.find() but raises ValueError if sub is not
found within string.
string.isalnum() Returns True if all characters in string are alphanumeric; otherwise,
returns False.
string.isalpha() Returns True if all characters in string are alphabetic; otherwise,
returns False.
string.isdigit() Returns True if all characters in string are digits; otherwise,
returns False.
string.islower() Returns True if all characters in string are lowercase; otherwise,
returns False.
string.isspace() Returns True if all characters in string are whitespace characters; otherwise,
returns False.
string.istitle() Returns True if string is title cased (each word starts with an uppercase
character followed by lowercase characters); otherwise, returns False.
string.isupper() Returns True if all characters in string are uppercase; otherwise,
returns False.
separator.join(seq) Returns a concatenation of strings in seq, separated by separator
(for example, "+".join(["H", "I", "!"]) -> "H+I+!")
string.ljust(width) Returns string left justified in a string of length width.
string.lower() Returns string with each character converted to lowercase.
string.lstrip([chars]) Returns a copy of string with leading chars removed. By default,
whitespace is removed.
string.replace(old,new[,count]) Returns a copy of string with all occurrences of substring old
replaced by new. If count is given, only the first count occurrences are replaced.
string.rfind(sub[,start[,end]]) Returns the highest index in string where substring sub is found.
Returns -1 if sub is not found.
string.rindex(sub[,start[,end]]) Behaves like rfind(), but raises ValueError when the substring
sub is not found.
string.rjust(width) Returns string right justified in a string of length width.
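A few of these methods are compared in the following short sketch. The sample text is arbitrary; the point is the difference between find() and index() and the effect of the optional third argument to replace().

```python
text = "web standards reference"

capitalized = text.capitalize()         # only the first character is uppercased
centered = "ab".center(6)               # padded with spaces on both sides
count_e = text.count("e")               # number of non-overlapping occurrences
first_s = text.find("s")                # lowest index, or -1 if absent
last_s = text.rfind("s")                # highest index, or -1 if absent
absent = text.find("z")

try:
    text.index("z")                     # like find(), but raises instead of returning -1
    raised = False
except ValueError:
    raised = True

joined = "+".join(["H", "I", "!"])      # the separator is the object; the list is the argument
stripped = "  padded".lstrip()          # leading whitespace removed
replaced = text.replace("e", "E", 2)    # third argument limits the number of replacements
```

All of these methods return new strings or numbers; the original string is never modified.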
sys Module
This module provides access to some variables used or maintained by the interpreter and to functions
that interact strongly with the interpreter. It is always available.
Syntax Description
argv This variable represents the list of command line arguments
passed to a Python script. argv[0] is the script name or has zero
length if none was passed. Any other arguments are available as
argv[1] through argv[n], where n is the number of arguments.
byteorder This variable is set to 'big' or 'little' depending on the endianness of
the system.
builtin_module_names This variable is a tuple of strings giving the names of all modules
that are compiled into this Python interpreter.
copyright This variable is a string containing the copyright information for
the Python interpreter.
dllhandle This variable is an integer representing the handle of the Python
DLL (only in Windows).
displayhook(value) If value is not None, this function writes it to stdout and
saves it in __builtin__._.
excepthook(type,value,traceback) When an exception is raised and uncaught, the interpreter calls
sys.excepthook with three arguments: the exception class, the
exception instance, and a traceback object. In a Python program,
this happens just before the program exits.
__displayhook__ The original value of displayhook is stored in this variable.
__excepthook__ The original value of excepthook is stored in this variable.
exc_info() This function returns a tuple of three values that give information
about the exception that is currently being handled. The informa-
tion returned is specific both to the current thread and to the cur-
rent stack frame. If the current stack frame is not handling an
exception, the information is taken from the calling stack frame, or
its caller, and so on until a stack frame is found that is handling an
exception.
exc_clear() This function clears all information relating to the current or last
exception that occurred in the current thread.
exec_prefix A variable giving the site-specific directory prefix where the plat-
form-dependent Python files are installed; by default, this is
‘/usr/local’.
getdefaultencoding() Returns the name of the current default string encoding used by
the Unicode implementation.
getdlopenflags() Returns the current value of the flags that are used for
dlopen() calls.
getfilesystemencoding() Returns the name of the encoding used to convert Unicode file-
names into system filenames. Returns None if the system default
encoding is used. The return value is dependent on the filesystem.
getrefcount(obj) Returns the reference count of the object obj.
getrecursionlimit() Returns the current value of the recursion limit, the maximum
depth of the Python interpreter stack. This limit prevents infinite
recursion from overflowing the C stack and crashing Python.
getwindowsversion() Returns a tuple of five components describing the Windows version
currently running: major, minor, build, platform, and text.
maxunicode This variable is an integer giving the largest supported code point
for a Unicode character.
modules This variable is a dictionary of all the loaded modules.
path This variable is a list of strings that specifies the search path for
modules as initialized from the environment variable PYTHONPATH
plus an installation-dependent default.
warnoptions An implementation detail of the warnings framework; this value
is not to be modified.
winver A variable containing the version number used to form registry
keys on Windows platforms, stored as string resource 1000 in the
Python DLL. winver is normally the first three characters of version.
It is provided in the sys module for informational purposes
only and has no effect on Windows registry keys.
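Several of the sys variables and functions can be inspected directly, as in this minimal sketch. The values you see depend on your interpreter and platform; the deliberately failing division exists only to give exc_info() something to report.

```python
import sys

script_name = sys.argv[0]                 # may be an empty string
extra_args = sys.argv[1:]                 # everything after the script name
order = sys.byteorder                     # "big" or "little"
limit = sys.getrecursionlimit()           # maximum interpreter stack depth
known_builtin = "sys" in sys.builtin_module_names
loaded = "sys" in sys.modules             # sys itself is always loaded

try:
    1 / 0
except ZeroDivisionError:
    # Inside the handler, exc_info() describes the exception being handled.
    exc_type, exc_value, exc_tb = sys.exc_info()
    handled = exc_type.__name__
    del exc_tb                            # avoid keeping the traceback alive
```

Outside an except block, exc_info() reports that no exception is being handled.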
random Module
This module provides functionality in support of obtaining pseudo-random numbers. By default, the
generator is seeded from a randomness source provided by the operating system when one is
available and from the current system time otherwise.
Syntax Description
seed([x]) Initializes the basic random number generator using hashable object
x if it is provided.
getstate() Returns the current internal state of the random number
generator.
setstate(state) Resets the internal state of the random number generator to the
supplied state.
jumpahead(n) Changes the internal state to one different from the current
state. n is a nonnegative integer that is used to scramble the
current state vector. This is commonly used in multithreaded
programs, in conjunction with multiple instances of the Random
class: setstate() or seed() can be used to force all instances
into the same internal state, and then jumpahead() can be used
to force the instances’ states far apart.
getrandbits(x) Returns a Python long int with x random bits.
randrange([start,]stop[,step]) Returns a randomly selected element from range(start, stop, step).
randint(a,b) Returns a random integer N such that a <= N <= b.
choice(seq) Returns a random element from seq.
shuffle(x[,random]) Shuffles the sequence x in place, using the random function random if
specified.
sample(sequence,length) Returns a list of length unique elements from sequence.
random() Returns a random floating-point number N where 0.0 <= N < 1.0.
uniform(a,b) Returns a random real number N where a <= N < b.
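The following sketch shows how seed(), getstate(), and setstate() make a sequence of random values repeatable, along with a few of the selection functions. The seed value and ranges are arbitrary choices made for illustration.

```python
import random

random.seed(42)                       # any hashable object works as a seed
state = random.getstate()             # snapshot of the generator
first = [random.randint(1, 6) for _ in range(5)]

random.setstate(state)                # rewind to the snapshot
second = [random.randint(1, 6) for _ in range(5)]   # same sequence again

deck = list(range(10))
random.shuffle(deck)                  # shuffles in place, returns None
hand = random.sample(deck, 3)         # three distinct elements
pick = random.choice(deck)            # one element of the sequence
step = random.randrange(0, 10, 2)     # one of 0, 2, 4, 6, 8
fraction = random.random()            # 0.0 <= fraction < 1.0
bits = random.getrandbits(8)          # integer built from eight random bits
```

Restoring a saved state is a convenient way to replay a simulation or a test with identical random input.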
urllib Module
This module provides high-level functionality for fetching data across the World Wide Web. The
functionality is similar to the built-in function open() except that it accepts Uniform Resource Locators
(URLs) instead of filenames.
Syntax Description
CLASS URLopener([proxies[,**x509]]) Base class for opening and reading URLs. In most cases, you
will want to use the CLASS FancyURLopener instead. An
empty proxies variable turns off proxies completely.
CLASS FancyURLopener([proxies[,**x509]]) Class for opening and reading URLs providing default
handling for the following HTTP response codes: 301, 302,
303, 307, and 401. An empty proxies variable turns off proxies
completely.
>>> import urllib
>>> proxies = {'http': 'http://proxy.server.com:8080/'}
>>> opener = urllib.FancyURLopener(proxies)
urllib2 Module
This module defines functions and classes needed to facilitate opening URLs, including support for
basic and digest authentication, redirections, and cookies. If a class is listed, functions belonging
to that class are grouped with it.
Syntax Description
urlopen(url[,data]) Opens the specified url, which can be a string or a Request
object. The data parameter is for use in passing extra information
as needed for HTTP and should be a buffer in the format
of application/x-www-form-urlencoded. The
urlopen function returns a file-like object with a geturl()
method to retrieve the URL of the resource received and an
info() method to return the meta-information of the page.
install_opener(opener) Installs an OpenerDirector instance as the default global
opener.
build_opener(handlers) Returns an OpenerDirector instance, which chains the han-
dlers in the order given. These handlers can be instances of
BaseHandler or subclasses of it. The following exceptions
may be raised: URLError (on handler errors), HTTPError (a
subclass of URLError that handles exotic HTTP errors), and
GopherError (another subclass of URLError that handles
errors from the Gopher handler).
CLASS Request(url[,data[,headers[,origin_req_host[,unverifiable]]]]) This class is an abstraction of a
URL Request. url should be a valid URL in string format. For a description of data, see
the add_data() description. headers should be in dictionary
format. origin_req_host should be the request-host of the origin
transaction. unverifiable should indicate whether or not
the request is unverifiable, as specified in RFC 2965.
add_data(data) Sets the Request data to data.
get_method() Returns a string indicating the HTTP request method, which
may be either GET or POST.
has_data() Returns True when the instance has data and False when
its data is None.
get_data() Returns the instance’s data.
add_header(key, val) Adds another header to the request. Later calls will overwrite
earlier ones with the same key value.
add_unredirected_header(key,header) Adds a header that will not be added in the case of a
redirected request.
has_header(header) Returns whether the instance (either regular or redirected)
has a header.
get_full_url() Returns the URL given in the constructor.
get_type() Returns the type (or scheme) of the URL.
get_host() Returns the host to which the connection will be made.
get_selector() Returns the selector, the part of the URL that is sent to
the server.
set_proxy(host,type) Prepares the request by connecting to a
proxy server. host and type will replace the
ones for the instance.
get_origin_req_host() Returns the request-host of the origin
transaction.
is_unverifiable() Returns whether the request is unverifi-
able as defined in RFC 2965.
CLASS OpenerDirector() The OpenerDirector class opens URLs
via BaseHandlers chained together and
is responsible for managing their
resources and error recovery.
add_handler(handler) Searches and adds to the possible chains
any of the following handlers:
close() Removes any parents.
parent A valid OpenerDirector to be used to open
a URL using a different protocol.
default_open(req) This method is not defined in BaseHandler,
but should be defined to catch all URLs.
protocol_open(req) This method is not defined in BaseHandler,
but should be defined to handle URLs of the
given protocol.
unknown_open(req) This method is not defined in BaseHandler,
but should be defined to handle catching
URLs with no specific registered handler to
open them.
http_error_default(req,fp,code,msg,hdrs) This method is not defined in BaseHandler
but should be overridden by the subclass to
provide a catchall for otherwise unhandled
HTTP errors. req is a Request object, fp is a
file-like object with the HTTP error body,
code is the three-digit error code, msg is a
user-visible explanation of the error code,
and hdrs is the mapping object with the
header of the error.
http_error_nnn(req,fp,code,msg,hdrs) This method is not defined in BaseHandler
but is called on an instance of subclass when
an HTTP error with code nnn is encountered.
protocol_request(req) This method is not defined in BaseHandler
but should be defined by a subclass that wants to
pre-process requests of the specified protocol.
protocol_response(req,response) This method is not defined in BaseHandler
but should be defined by a subclass that wants to
post-process responses of the specified protocol.
CLASS HTTPDefaultErrorHandler() A class that defines a default handler for
HTTP error responses; all responses are
turned into HTTPError exceptions.
CLASS HTTPRedirectHandler() A class to handle redirection.
redirect_request(req,fp,code,msg,hdrs) Returns a Request or None in response to a
redirect request.
http_error_301(req,fp,code,msg,hdrs) Redirects to the new URL for a “moved permanently” response.
http_error_302(req,fp,code,msg,hdrs) Redirects for a “found” response.
http_error_303(req,fp,code,msg,hdrs) Redirects for “see other” response.
http_error_307(req,fp,code,msg,hdrs) Redirects for “temporary redirect”
response.
CLASS HTTPCookieProcessor(cookies) A class to handle HTTP cookies.
cookiejar The cookielib.CookieJar in which
cookies are stored.
CLASS ProxyHandler(proxies) A class to handle routing requests
through a proxy. If proxies is specified, it
must be in the form of a dictionary that
maps protocol names to URLs of proxies.
protocol_open(req) ProxyHandler will assign a method
protocol_open() for every protocol that has
a proxy in the proxies dictionary given in
the constructor.
CLASS HTTPPasswordMgr() A class that supports a database of
(realm, url) -> (user, password)
mappings.
add_password(realm,uri,user,password) Sets user authentication token for given
user, realm, and uri.
CLASS HTTPBasicAuthHandler([password_mgr]) A class that supports HTTP authentication
with the remote host. The password_mgr
variable must be compatible with HTTPPasswordMgr.
CLASS FileHandler() A class that handles opening local files.
file_open(req) Opens the local file if localhost or no host
is specified, and otherwise initiates an ftp
connection and reattempts opening the
file.
CLASS FTPHandler() A class that handles FTP URLs.
ftp_open(req) Opens the file indicated by req using an
empty username and password.
CLASS CacheFTPHandler() A class that handles FTP URLs and caches
FTP connections to minimize delay.
setTimeout(t) Sets timeout of connections to t seconds.
setMaxConns(n) Sets the maximum number of cached con-
nections to n.
CLASS GopherHandler() A class that handles gopher URLs.
gopher_open(req) Opens the gopher resource denoted
by req.
CLASS UnknownHandler() A catchall class to handle unknown URLs.
unknown_open() Raises a URLError exception on URLs
with no specific registered handler.
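The Request, HTTPPasswordMgr, and build_opener() pieces described above can be tried without opening a network connection. The following is a sketch under stated assumptions: the URLs, realm, and credentials are invented, and the fallback import assumes a later Python release in which these classes live in urllib.request.

```python
try:
    import urllib2                     # Python 2 name, as used in this appendix
except ImportError:
    import urllib.request as urllib2   # assumption: later releases moved the classes here

# Hypothetical URLs and form data; nothing is sent over the network.
plain = urllib2.Request("http://www.example.com/search?q=python")
form = urllib2.Request("http://www.example.com/submit", b"name=value")
form.add_header("User-Agent", "reference-demo/0.1")

plain_method = plain.get_method()      # no data attached
form_method = form.get_method()        # data attached
full_url = plain.get_full_url()        # the URL given in the constructor
has_agent = form.has_header("User-agent")   # header names are stored capitalized

# A password manager maps (realm, uri) to (user, password).
manager = urllib2.HTTPPasswordMgr()
manager.add_password("Private Area", "http://www.example.com/private/",
                     "alice", "secret")
credentials = manager.find_user_password(
    "Private Area", "http://www.example.com/private/page")

# Chain a basic-authentication handler into an opener.
# install_opener(opener) would make it the default used by urlopen().
opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(manager))
```

The request method is chosen for you: a Request with data attached reports POST, and one without reports GET.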
PHP Language Reference
This appendix cannot provide an exhaustive list of all of the PHP functions (or it would be
hundreds of pages long), but it presents a subset of functions that the authors think you will encounter
in your everyday use of PHP, along with brief descriptions of what those functions do. The core
language functions are included, as well as the functions for PHP extensions that are in popular use.
This appendix is meant as a quick reference for checking each function's parameter types and
return type. It is more a reminder of how the function works than a verbose description of how
to use the function in your scripts. If you need to see how a particular function should be used,
or to read up on a function that isn’t covered here, check out the PHP documentation online at
www.php.net/docs.php.
This appendix originally appeared in Beginning PHP 5 written by Dave W. Mercer, Allan Kent,
Steven D. Nowicki, David Mercer, Dan Squier, and Wankyu Choi and published by Wrox,
ISBN: 0-7645-5783-1, copyright 2004, Wiley Publishing, Inc. The author is grateful to those
authors and the publisher for allowing it to be reused here.
Apache
Function Returns Description
apache_child_terminate(void) bool Terminates the Apache process after
this request.
apache_note(string note_name string Gets and sets Apache request
[, string note_value]) notes.
virtual(string filename) bool Performs an Apache subrequest.
getallheaders(void) array Alias for
apache_request_headers().
Arrays
Function Returns Description
krsort(array array_arg [, int bool Sorts an array by key value in reverse
sort_flags]) order.
ksort(array array_arg [, int bool Sorts an array by key.
sort_flags])
count(mixed var [, int mode]) int Counts the number of elements in a
variable (usually an array).
natsort(array array_arg) void Sorts an array using natural sort.
natcasesort(array array_arg) void Sorts an array using case-insensitive
natural sort.
asort(array array_arg [, int bool Sorts an array and maintains index
sort_flags]) association.
arsort(array array_arg [, int bool Sorts an array in reverse order and
sort_flags]) maintains index association.
sort(array array_arg [, int bool Sorts an array.
sort_flags])
in_array(mixed needle, array bool Checks if the given value exists in the
haystack [, bool strict]) array.
array_search(mixed needle, mixed Searches the array for a given value and
array haystack [, bool strict]) returns the corresponding key if
successful.
extract(array var_array [, int int Imports variables into symbol table from
extract_type [, string prefix]]) an array.
compact(mixed var_names [, array Creates a hash containing variables and
mixed ...]) their values.
array_fill(int start_key, array Creates an array containing num
int num, mixed val) elements starting with index start_key,
each initialized to val.
range(mixed low, mixed high array Creates an array containing the range of
[, int step]) integers or characters from low to high
(inclusive).
shuffle(array array_arg) bool Randomly shuffles the contents of an
array.
array_values(array input) array Returns just the values from the input
array.
array_count_values(array input) array Returns each distinct value of input as a
key, with the frequency of that value as
its value.
array_reverse(array input array Returns input as a new array with the
[, bool preserve_keys]) order of the entries reversed.
array_pad(array input, int array Returns a copy of input array padded
pad_size, mixed pad_value) with pad_value to size pad_size.
array_flip(array input) array Returns array with key <-> value
flipped.
array_change_key_case(array array Returns an array with all string keys
input [, int case=CASE_LOWER]) lower-cased (or uppercased).
array_unique(array input) array Removes duplicate values from array.
array_intersect(array arr1, array Returns the entries of arr1 that have
array arr2 [, array ...]) values that are present in all the other
arguments.
array_uintersect(array arr1, array Returns the entries of arr1 that have
array arr2 [, array ...], values that are present in all the other
callback data_compare_func) arguments. Data is compared by using a
user-supplied callback.
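The one-line descriptions above hide a few behaviors that are easier to see in code, such as the difference between sorting by value and by key, or between loose and strict searching. The following is a minimal sketch, not a complete tour of the array functions; the $stock data and the variable names are invented for the illustration, and array_keys() is a standard PHP function used here only to show the resulting order.

```php
<?php
// Sample data invented for this illustration.
$stock = array('pear' => 1, 'apple' => 3, 'fig' => 2);

asort($stock);    // by value, keys stay attached: pear=>1, fig=>2, apple=>3
$by_value = array_keys($stock);

ksort($stock);    // by key: apple=>3, fig=>2, pear=>1
$by_key = array_keys($stock);

$loose  = in_array('2', $stock);        // true: '2' == 2 under loose comparison
$strict = in_array('2', $stock, true);  // false: a string is never identical to an int
$key    = array_search(2, $stock);      // 'fig'

$letters = range('a', 'e');             // array('a', 'b', 'c', 'd', 'e')
$filled  = array_fill(5, 3, 0);         // array(5 => 0, 6 => 0, 7 => 0)
$counts  = array_count_values(array('a', 'b', 'a'));  // array('a' => 2, 'b' => 1)
```

Note that the sort functions work on the array in place and return a bool, so the sorted data is in $stock itself, not in the return value.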
BCMath
Function Returns Description
bcadd(string left_operand, string Returns the sum of two arbitrary
string right_operand precision numbers.
[, int scale])
BZip2
Function Returns Description
bzopen(string|int file| resource Opens a new BZip2 stream.
fp, string mode)
bzread(resource bz[, int length]) string Reads up to length bytes from a BZip2
stream, or 1024 bytes if length is not
specified.
bzwrite(resource bz, string data int Writes the contents of the string data to
[, int length]) the BZip2 stream.
bzerrno(resource bz) int Returns the error number.
bzerrstr(resource bz) string Returns the error string.
Calendar
Function Returns Description
unixtojd([int timestamp]) int Converts UNIX timestamp to Julian Day.
jdtounix(int jday) int Converts Julian Day to UNIX timestamp.
cal_info(int calendar) array Returns information about a particular
calendar.
cal_days_in_month int Returns the number of days in a month
(int calendar, int month, for a given year and calendar.
int year)
Class/Object
Function Returns Description
class_exists bool Checks if the class exists.
(string classname)
Character Type
Function Returns Description
ctype_alnum(string text) bool Checks for alphanumeric character(s).
ctype_alpha(string text) bool Checks for alphabetic character(s).
ctype_cntrl(string text) bool Checks for control character(s).
ctype_digit(string text) bool Checks for numeric character(s).
ctype_graph(string text) bool Checks for any printable character(s)
except space.
ctype_lower(string text) bool Checks for lowercase character(s).
ctype_print(string text) bool Checks for printable character(s).
ctype_punct(string text) bool Checks for any printable character that is
not whitespace or an alphanumeric
character.
ctype_space(string text) bool Checks for whitespace character(s).
ctype_upper(string text) bool Checks for uppercase character(s).
ctype_xdigit(string text) bool Checks for character(s) representing a
hexadecimal digit.
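Each ctype function returns true only when every character in the string belongs to the named class, which makes them handy for simple input checks. The sketch below assumes the ctype extension is available (it usually is by default); is_plain_token() is a hypothetical helper written for this illustration, not part of PHP.

```php
<?php
// Hypothetical validator for this illustration: accept only letters and digits.
function is_plain_token($value)
{
    // The ctype functions are meant for strings, so check the type first.
    return is_string($value) && $value !== '' && ctype_alnum($value);
}

$a = ctype_digit('12345');       // true
$b = ctype_digit('12.5');        // false: the period is not a digit
$c = ctype_xdigit('ff00AA');     // true: every character is a hexadecimal digit
$d = ctype_upper('ABC');         // true
$e = ctype_space(" \t\n");       // true
$f = ctype_punct('!?');          // true
$g = is_plain_token('user42');   // true
$h = is_plain_token('user 42');  // false: the space is not alphanumeric
```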
Curl
Function Returns Description
curl_version([int version]) array Returns CURL version information.
curl_init([string url]) resource Initializes a CURL session.
Directory
Function Returns Description
opendir(string path) mixed Opens a directory and returns a
dir_handle.
Error Handling
Function Returns Description
error_log(string message bool Sends an error message somewhere.
[, int message_type
[, string destination
[, string extra_headers]]])
Filesystem
Function Returns Description
flock(resource fp, bool Portable file locking.
int operation
[, int &wouldblock])
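Because "portable file locking" says little about how flock() is actually used, here is a minimal sketch of the usual lock, write, unlock sequence. The scratch file and the log line are invented for the example; LOCK_EX and LOCK_UN are the standard PHP locking constants, and tempnam(), sys_get_temp_dir(), and unlink() are used only to keep the example self-contained.

```php
<?php
// Create a scratch file for the demonstration (the location is arbitrary).
$path = tempnam(sys_get_temp_dir(), 'lock');

$fp = fopen($path, 'a');
$written = false;
if ($fp !== false && flock($fp, LOCK_EX)) {   // wait for an exclusive lock
    fwrite($fp, "one log entry\n");
    fflush($fp);                              // push the data out before unlocking
    flock($fp, LOCK_UN);                      // release the lock
    $written = true;
}
if ($fp !== false) {
    fclose($fp);
}

$contents = file_get_contents($path);         // read back what was written
unlink($path);                                // clean up the scratch file
```

The lock is advisory: it only protects the file from other code that also calls flock() on it.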
fgetcsv(resource fp array Gets line from file pointer and parses for
[, int length CSV fields.
[, string delimiter
[, string enclosure]]])
disk_total_space(string path) float Gets total disk space for filesystem that
path is on.
disk_free_space(string path) float Gets free disk space for filesystem that
path is on.
chgrp(string filename, bool Changes file group.
mixed group)
FTP
Function Returns Description
ftp_connect(string host resource Opens an FTP stream.
[, int port [, int timeout]])
ftp_get(resource stream, bool Retrieves a file from the FTP server and
string local_file, writes it to a local file.
string remote_file, int mode
[, int resume_pos])
ftp_nb_fput(resource stream, int Stores a file from an open file to the FTP
string remote_file, server (non-blocking).
resource fp, int mode
[, int startpos])
Function Handling
Function Returns Description
call_user_func mixed Calls the user function named by the
(string function_name first parameter.
[, mixed parameter]
[, mixed ...])
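As a brief illustration of the call_user_func() entry, the sketch below calls a function whose name is held in a string and passes the remaining arguments through to it. The add_tax() function and its values are hypothetical and exist only for this example.

```php
<?php
// Hypothetical callback used only for this illustration.
function add_tax($amount, $rate)
{
    return $amount + $amount * $rate;
}

$callback = 'add_tax';                          // function name held in a string
$total = call_user_func($callback, 100, 0.25);  // same as add_tax(100, 0.25)

// Built-in functions can be called the same way.
$largest = call_user_func('max', 3, 9, 4);      // 9
```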
HTTP
Function Returns Description
header(string header void Sends a raw HTTP header.
[, bool replace
[, int http_response_code]])
Iconv Library
Function Returns Description
iconv(string in_charset, string Returns str converted to the
string out_charset, string str) out_charset character set.
Image
Function Returns Description
exif_tagname(index) string Gets the header name for index, or false
if not defined.
exif_read_data(string array Reads header data from the JPEG/TIFF
filename [, sections_needed image filename and optionally reads the
[, sub_arrays internal thumbnails.
[, read_thumbnail]]])
imagecolorclosesthwb int Gets the index of the color that has the
(resource im, int red, hue, whiteness, and blackness nearest to
int green, int blue) the given color.
imagecolordeallocate bool De-allocates a color for an image.
(resource im, int index)
imagefttext(resource im, array Writes text to the image using fonts via
int size, int angle, int x, freetype2.
int y, int col,
string font_file,
string text [, array extrainfo])
IMAP
Function Returns Description
imap_open(string mailbox, resource Opens an IMAP stream to a mailbox.
string user, string password
[, int options])
Mail
Function Returns Description
ezmlm_hash(string addr) int Calculates EZMLM list hash value.
mail(string to, string subject, bool Sends an e-mail message.
string message
[, string additional_headers
[, string additional_parameters]])
Math
Function Returns Description
abs(int number) int Returns the absolute value of the number.
ceil(float number) float Returns the next highest integer value by
rounding the number up.
floor(float number) float Returns the next lowest integer value by
rounding the number down.
round(float number float Returns the number rounded to
[, int precision]) specified precision.
sin(float number) float Returns the sine of the number (an angle
in radians).
cos(float number) float Returns the cosine of the number (an
angle in radians).
tan(float number) float Returns the tangent of the number (an
angle in radians).
asin(float number) float Returns the arc sine of the number in
radians.
acos(float number) float Returns the arc cosine of the number in
radians.
atan(float number) float Returns the arc tangent of the number in
radians.
atan2(float y, float x) float Returns the arc tangent of y/x, with the
resulting quadrant determined by the
signs of y and x.
sinh(float number) float Returns the hyperbolic sine of the
number, defined as
(exp(number) - exp(-number))/2.
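Two details in the math entries above are easy to trip over: the trigonometric functions work in radians, and ceil() and floor() return floats rather than integers. The sketch below illustrates both, and checks sinh() against the formula given in its description. The variable names are invented; deg2rad() is a standard PHP function used here to convert degrees for the example.

```php
<?php
$angle = deg2rad(30);           // convert degrees to radians first
$sine  = round(sin($angle), 4); // 0.5

$up   = ceil(4.2);              // 5.0, returned as a float
$down = floor(-4.2);            // -5.0: floor() rounds toward negative infinity
$two_places = round(3.14159, 2);  // 3.14

// atan2() uses the signs of both arguments to pick the quadrant:
// y = 1, x = -1 lies in the second quadrant, so the result is 3*pi/4.
$theta = atan2(1, -1);

// sinh() compared with the formula from its description.
$h = sinh(1.0);
$formula = (exp(1.0) - exp(-1.0)) / 2;
```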
MIME
Function Returns Description
mime_content_type string Returns content-type for file.
(string filename
|resource stream)
Miscellaneous
Function Returns Description
get_browser([string mixed Gets information about the capabilities
browser_name [, bool return_array]]) of a browser.
constant(string const_name) mixed Given the name of a constant this
function returns the constant’s
associated value.
getenv(string varname) string Gets the value of an environment
variable.
putenv(string setting) bool Sets the value of an environment
variable.
getopt(string options array Gets options from the command line
[, array longopts]) argument list.
flush(void) void Flushes the output buffer.
sleep(int seconds) void Delays for a given number of seconds.
usleep(int micro_seconds) void Delays for a given number of
microseconds.
time_nanosleep mixed Delays for a number of seconds and
(long seconds, nanoseconds.
long nanoseconds)
MS SQL
Function Returns Description
mssql_connect int Establishes a connection to an MS-SQL
([string servername server.
[, string username
[, string password]]])
MySQL
Function Returns Description
mysql_connect resource Opens a connection to a MySQL Server.
([string hostname[:port]
[:/path/to/socket]
[, string username
[, string password [, bool new
[, int flags]]]]])
Network Functions
Function Returns Description
define_syslog_variables(void) void Initializes all syslog-related variables.
openlog(string ident, bool Opens connection to system logger.
int option, int facility)
ODBC
Function Returns Description
odbc_close_all(void) void Closes all ODBC connections.
odbc_binmode bool Handles binary column data.
(int result_id, int mode)
Output Buffering
Function Returns Description
ob_list_handlers() false| Lists all output handlers in use in an
array array.
ob_start([string|array bool Turns on output buffering
user_function [, int chunk_size (specifying an optional output
[, bool erase]]]) handler).
ob_flush(void) bool Flushes (sends) contents of the
output buffer. The last buffer
content is sent to next buffer.
ob_clean(void) bool Cleans (deletes) the current output
buffer.
ob_end_flush(void) bool Flushes (sends) the output buffer,
and deletes current output buffer.
ob_end_clean(void) bool Cleans the output buffer, and
deletes current output buffer.
ob_get_flush(void) string Gets current buffer contents,
flushes (sends) the output buffer,
and deletes current output buffer.
ob_get_clean(void) string Gets current buffer contents and
deletes current output buffer.
ob_get_contents(void) string Returns the contents of the output
buffer.
ob_get_level(void) int Returns the nesting level of the
output buffer.
ob_get_length(void) int Returns the length of the output
buffer.
ob_get_status([bool full_status]) false| Returns the status of the active or
array all output buffers.
ob_implicit_flush([int flag]) void Turns implicit flush on/off and is
equivalent to calling flush() after
every output call.
output_reset_rewrite_vars(void) bool Resets (clears) URL rewriter values.
output_add_rewrite_var bool Adds URL rewriter values.
(string name, string value)
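The buffering functions are easiest to understand as a stack: ob_start() pushes a new buffer, and the ob_end_* and ob_get_clean() functions pop it again. The following sketch shows one common use, capturing what a function echoes so that it can be post-processed. The capture_output() and say_hello() functions are hypothetical helpers written for this illustration.

```php
<?php
// Hypothetical helper: run a callback and return whatever it echoes.
function capture_output($callback)
{
    ob_start();                  // push a new output buffer
    call_user_func($callback);   // anything echoed now lands in the buffer
    return ob_get_clean();       // return the contents and discard the buffer
}

// Hypothetical function that writes directly to the output.
function say_hello()
{
    echo 'Hello, ';
    echo 'world';
}

$level_before = ob_get_level();
$text = capture_output('say_hello');   // 'Hello, world', nothing sent yet
$level_after = ob_get_level();         // back to the level we started at

echo strtoupper($text);                // now send the modified text
```

Pairing every ob_start() with exactly one call that removes the buffer keeps the nesting level reported by ob_get_level() unchanged, which is a simple way to check that no buffer was left open.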
PCRE
Function Returns Description
preg_match(string pattern, int Performs a Perl-style regular expression
string subject [, array match.
subpatterns [, int flags
[, int offset]]])
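To show how the optional preg_match() parameters fit together, here is a small sketch that pulls a date out of a string. The subject text and pattern are invented for the example; PREG_OFFSET_CAPTURE is the standard flag that adds the byte offset of each match.

```php
<?php
// Sample text and pattern invented for this illustration.
$subject = 'Order 2005-06-15 shipped';
$pattern = '/(\d{4})-(\d{2})-(\d{2})/';

// Returns 1 on a match and fills $parts with the full match and each group.
$found = preg_match($pattern, $subject, $parts);
// $parts is array('2005-06-15', '2005', '06', '15')

// With PREG_OFFSET_CAPTURE each entry becomes array(matched text, byte offset).
preg_match($pattern, $subject, $with_offsets, PREG_OFFSET_CAPTURE);
// $with_offsets[0] is array('2005-06-15', 6)
```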
Program Execution
Function Returns Description
exec(string command [, array string Executes an external program.
&output [, int &return_value]])
Regular Expressions
Function Returns Description
ereg(string pattern, int Matches a regular expression.
string string [, array
registers])
Sessions
Function Returns Description
session_set_cookie_params(int void Sets session cookie parameters.
lifetime [, string path [, string
domain [, bool secure]]])
Simple XML
Function Returns Description
simplexml_load_file simplexml_element Loads a file and returns a
(string filename) simplexml_element object to allow
for processing.
simplexml_load_string simplexml_element Loads a string and returns a
(string data) simplexml_element object to allow
for processing.
simplexml_import_dom simplexml_element Gets a simplexml_element object
(domNode node) from dom to allow for processing.
Sockets
Function Returns Description
socket_select(array int Runs the select() system call on the
&read_fds, array &write_fds, given socket arrays, with a timeout
array &except_fds, int specified by tv_sec and tv_usec.
tv_sec[, int tv_usec])
SQLite
Function Returns Description
sqlite_popen(string filename resource Opens a persistent handle to a SQLite
[, int mode [, string database. Will create the database if it
&error_message]]) does not exist.
sqlite_open(string filename resource Opens a SQLite database. Will create
[, int mode [, string the database if it does not exist.
&error_message]])
Streams
Function Returns Description
stream_socket_client(string resource Opens a client connection to a
remoteaddress [, long &errcode, remote address.
string &errstring, double timeout,
long flags, resource context])
Strings
Function Returns Description
crc32(string str) int Calculates the crc32 polynomial
of a string.
crypt(string str [, string salt]) string Encrypts a string.
convert_cyr_string(string str, string Converts from one Cyrillic
string from, string to) character set to another.
lcg_value() float Returns a value from the combined
linear congruential generator.
levenshtein(string str1, string str2) int Calculates Levenshtein distance
between two strings.
md5(string str [, bool raw_output]) string Calculates the md5 hash of a string.
md5_file(string filename [, bool string Calculates the md5 hash of given
raw_output]) filename.
metaphone(string text [, int phones]) string Breaks English phrases down into
their phonemes.
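A few of the string functions above are sketched below to show their return types in practice. The input strings are arbitrary examples; bin2hex() and sprintf() are standard PHP functions used here only to display the results.

```php
<?php
// Number of single-character edits needed to turn one string into the other.
$distance = levenshtein('kitten', 'sitting');   // 3

$hex = md5('hello');          // '5d41402abc4b2a76b9719d911017c592'
$raw = md5('hello', true);    // the same digest as 16 raw bytes
$same = (bin2hex($raw) === $hex);

// crc32() returns an integer; %u prints it as an unsigned number.
$checksum = sprintf('%u', crc32('hello'));
```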
URL
Function Returns Description
http_build_query(mixed string Generates a form-encoded query string
formdata [, string prefix]) from an associative array or object.
parse_url(string url) array Parses a URL and returns its components.
get_headers(string url) array Fetches all the headers sent by the
server in response to an HTTP request.
urlencode(string str) string URL-encodes all non-alphanumeric
characters except -_. and encodes spaces
as plus signs (+).
urldecode(string str) string Decodes URL-encoded string.
rawurlencode(string str) string URL-encodes all non-alphanumeric
characters except -_. and encodes spaces
as %20.
rawurldecode(string str) string Decodes URL-encoded string.
base64_encode(string str) string Encodes string using MIME base64
algorithm.
base64_decode(string str) string Decodes string using MIME base64
algorithm.
get_meta_tags(string filename array Extracts all meta tag content attributes
[, bool use_include_path]) from a file and returns an array.
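The practical difference between urlencode() and rawurlencode() is how a space is encoded, which the sketch below makes visible alongside a few of the other URL functions. The sample strings and the example.com address are invented for the illustration, and the http_build_query() result assumes the default argument separator (&).

```php
<?php
$text = 'a b&c=d/e';
$form_style = urlencode($text);      // 'a+b%26c%3Dd%2Fe'   (space becomes +)
$raw_style  = rawurlencode($text);   // 'a%20b%26c%3Dd%2Fe' (space becomes %20)
$round_trip = urldecode($form_style);

// parse_url() splits a URL into an associative array of its parts.
$parts = parse_url('http://www.example.com:8080/path/page.php?x=1#top');
// scheme, host, port (an integer), path, query, and fragment

// Build a query string from an associative array.
$query = http_build_query(array('name' => 'Ann Lee', 'page' => 2));
// 'name=Ann+Lee&page=2'

$encoded = base64_encode('PHP');     // 'UEhQ'
$decoded = base64_decode($encoded);  // 'PHP'
```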
Variable Functions
Function Returns Description
gettype(mixed var) string Returns the type of the variable.
settype(mixed var, string type) bool Sets the type of the variable.
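As a short illustration of the two entries above, the sketch below inspects a variable's type, converts it in place, and inspects it again. The values are arbitrary; note that settype() changes the variable itself and returns only a success flag.

```php
<?php
$value = '42';
$before = gettype($value);          // 'string'

$ok = settype($value, 'integer');   // true; $value is now the integer 42
$after = gettype($value);           // 'integer'

$float_name = gettype(1.5);         // 'double', PHP's historical name for float
$null_name  = gettype(null);        // 'NULL'
```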
XML
Function Returns Description
xml_parser_create([string encoding]) resource Creates an XML parser.
xml_parser_create_ns([string resource Creates an XML parser with namespace
encoding [, string sep]]) support.
ZLib
Function Returns Description
gzfile(string filename array Reads and uncompresses entire
[, int use_include_path]) .gz file into an array.
Index
  HTML divisions, 31–33
  PHP, 474
  Python, delimiting, 400
blocking changes to form fields, 141–142
body section
  HTML tags, 20–21
  with two tables, 25
  visible content, 533–534
  XHTML tables, 103–105
bold text
  CSS, 212–213, 585
  HTML, 84–85
  XHTML, 531
Boolean object, JavaScript, 614–615
borders
  CSS
    collapsing, 241–243
    color, 220–221
    defining, 238–239, 594–596
    differing adjacent, 242
    predefined styles, 220–222
    shortcut, 222
    space to neighboring elements (margins), 223–224
    spacing, 223, 239–240
    width, 219–220
  images, 62–63
  property shortcut, 222
  XHTML tables
    drop-shadow effect, 115–116
    setting, 96–99
bottom, positioning elements, 252–254
box formatting model, CSS, 215–218
braces ({})
  JavaScript, 287
  Perl, 372
  PHP, 474, 501
break statement
  JavaScript, 278
  PHP, 486
  Python, 412–413
browser, Web
  caching with meta tags, 18
  default path, setting, 17
  DOM support, listed, 264
  formatting shortcuts, reasons to avoid, 10
  JavaScript support lacking, 323
  meta tags caching, 18
  refreshing and reloading after specified time, 18–19
  scripting language, unsupported, 552
  Web servers, connecting, 1–2
  window, another
    opening, 329–331
    outputting text, 331–333
  XML, usefulness of, 154, 160
built-in functions
  Perl, 386–387
  PHP, listed, 486–490
  Python, 415–416
built-in objects, JavaScript
  current document, 312–313
  form elements, 313–314
  history list, navigating, 315
  reference-making element, 315–316
  URL information, manipulating, 314–315
  XHTML document, 311–312
bulleted list
  described, 35, 42–43, 569
  item marker, changing, 43–46
  ordered lists within, 47–48
  ordinal, changing position of, 46
button, form
  checkbox clicks, automating, 314
  custom text, 135–136
  JavaScript object, 615–616
  radio, 131–132, 635–636
  reset, 137–138
  submit, 137–138
  XHTML element, 534–535
BZip2, PHP functions, 723–724
C
calendar
  dynamic, user-interactive
    Perl, 447–453
    PHP, 517–523
    Python, 463–466
  simple
    Perl, 439–444
    PHP, 509–514, 724–725
    Python, 457–460
capitalization
  changing, 583
  denoting, 200, 585
captions
  CSS tables, 598
  XHTML tables, 102–103, 244, 535
caret (^), 385
Cascading Style Sheets. See CSS
cells
  CSS tables, rendering, 597
  spacing, 239–240
  XHTML tables
    defining, 564
    delimiting, 100–101
    rules, 98–99
    spacing and padding, 94–95
    width and alignment, 91–94
CGI (Common Gateway Interface). See also Perl; Python
  form, sample, 432–434
  history, 363
  HTTP
    data encapsulation, 364–366
    request and response, 363–364
  mechanics, 366–367
  MySQL data, sample, 434–438
  Python functions, 685–687
  scripting
    basic requirements, 423–424
    Linux Bash shell, 424–430
  servers, 367–368
  when to use, 431–432
cgitb troubleshooting tool, Python, 421–422
changes, text, incorporating in line (HTML) (<span>), 82
changing nodes, DOM, 302–309
chapter and section, automatic numbering, 207–208
characters
  matching with Perl, 673
  type functions, PHP, 726
checkbox, form
  JavaScript, 616–617
  XHTML, 132
child sibling elements, matching by, 176–178
circular area, image map, 64
citation, source, 536
class
  functions, PHP, 725–726
  matching elements, CSS, 174–175
  object definitions, PHP, 491
clickable regions, image map, 64–66
clients, Web, 1
clipping boundary, CSS blocks, 589
closing files
  Perl, 389
  PHP, 497
  Python, 420
code
  inline snippets, XHTML element, 536–537
  running in interpreter, 421
code block
  CSS, handling, 588–591
  DHTML, animating, 356–361
  HTML divisions, 31–33
  PHP, 474
  Python, delimiting, 400
collapsible lists, creating with DHTML, 317–319
collapsing borders, 241–243
colon (:), 400
colors
  background, 228–230
  changing following mouse movement, 285–286
  foreground, 227–228
  links, 74
  style, defining, 171–172
  text, 584
  transparency, graphics, 53
  XHTML codes, 571–576
columns
  HTML table headers, 100–101
  XHTML tables
    attributes, specifying, 537
    grouping, 109–111, 537–538
    multiple, newspaper-like, 120–121
    spanning, 106–109
command terminal character and blocks of code, 474
commenting code
  Perl, 372
  PHP, 474
  Python, 400
  style sheets, 167
  XML, 157
commercial image applications, 57
Common Gateway Interface. See CGI
comparison operators
  JavaScript, 604
  Perl, 378, 652–653
  Python, 409
compressing white space, 198
conditions
  loop, breaking (continue), 278
  Perl, 658–660
constructors
  Perl objects, 392
  PHP objects, 491–492
content
  block, enclosing (<div>), 539–540
  body section
    HTML tags, 20–21
    with two tables, 25
    visible content, 533–534
    XHTML tables, 103–105
  CSS, 581–582
  document, writing (write and writeln methods), 313
  overflow, controlling, 257–258
  visible, tag containing (<body>), 533–534
continue statement
  JavaScript, 278
  Perl, 382–383
  PHP, 486
  Python, 412–413
control structures
  JavaScript
    breaking out (break), 278
    do while loop, 274
    expression, executing code based on value of (switch), 277–278
    for loop, 275
    for in loop, 276
    if and if else loops, 276–277
    while loop, 274–275
  Perl
    continue, 382–383
    for loop, 380–381
    foreach loop, 381
    if and if else, 381–382
    last, 382–383
    next, 382–383
    redo, 382–383
    while and until loops, 380
  Python
    continue and break, 412–413
    for loop, 411
    if and elif statements, 411–412
    try statement, 412
    while loop, 410–411
conversion functions, 661
cookie functions, Python, 687–688
copyright, image, 58
core attributes, XHTML, 571
count modifiers, matching, 674
counter object, automatic numbering, 206
CSS (Cascading Style Sheets)
  border settings, 219–223, 238–239
  box formatting model, 215–218
  cascading, 167–169
  collapsing borders, 241–243
  defining styles, 166–167
  dynamic outlines, 224–225
  HTML and, 164–165
  inheritance, 179
  inline text formatting, 81–82
  levels, 165
  margins, 223–224
  matching elements, 173–178
  padding elements, 218–219
  positioning elements
    absolute, 248–249
    fixed, 249–252
    floating, 255–256
    layering, 258–261
    relative, 246–248
    specifying (top, right, bottom, and left properties), 252–254
    static, 245–246
    syntax, 245
    visibility, 261–262
  properties and table attributes, 237–238
  property values, 172–173
  pseudoclasses, 179–181
  pseudoelements, 181–185
  purpose, 163
  selectors, 173, 577
  shorthand expressions, 185–187
  spacing borders and cells, 239–240
  style definition format, 171–172
  tables properties, 237–243
  text, 189–214
  versions 1.0 and 2.0, 8–9
curl function, PHP, 726–727
current date, writing to document in JavaScript, 325–327
current document object, JavaScript, 312–313
cursive fonts, 210
cursor, mouse, 599
D
data encapsulation, HTTP, 364–366
data entry. See form
data, extracting from Perl files, 390–391
data passing
  HTTP (GET and POST), 364–366, 444–453
  Linux Bash shell scripting, 425–426
  to software, 555–556
  XHTML forms, 128
data types
  JavaScript, 270–271
  Perl, 372–373
  PHP, 475
  Python
    dictionaries, 405–406
    lists, 404–405
    numbers, 401
    strings, 402–404
    tuples, 406–407
database access, query, and report
  MySQL, 434–438
  Perl, 453–456
  PHP, 523–526
  Python, 467–469
date
  current
    in document header, 475–477
    writing to document, 325–327
  graphical display, 338–341
  handling
    JavaScript, 282, 617–620
    Perl examples, 439–444
    PHP, 509–514, 727–728
    Python examples, 457–460
debugger, Perl symbolic, 651
decorations, text, 200–201
default meta tag path, 17–18
defining styles, CSS, 166–167
definition lists, 35, 538, 540–541
deleting text, 86–87, 538–539
descendant elements, matching by, 176–178
design strategy, XML, 153–154
destructors, PHP objects, 491–492
DevGuru JavaScript Language Index, 325
DHTML (Dynamic HTML)
  collapsible lists, 317–319
  JavaScript
    code blocks, animating, 356–361
    menus, animating, 351–355
    styles, swapping, 348–351
  moving elements, 319–321
  uses, 316–317
dictionary data, Python, 405–406
directory functions
  Perl, 670
  PHP, 729
directory, HTTP requests, 363–364
document
  CSS
    layering, 258–261
    matching, 173–178
    padding, 218–219
    sizing, 256–258
    visibility, 261–262
  DOM, accessing by ID, 316
  external content, embedding, 552–553
  form, inserting, 129
  header containing current date and time, 475–477
  HTML tags
    block divisions, 31–33
    body section, 20–21
    body with two tables, 25
    described, 13
    DOCTYPE, 10, 14–15
    head section, 15–19
    headings, 27–28
    horizontal rules, 28–29
    HTML, 10–11, 15
    manual line breaks, 25–27
    paragraphs, 23–24
    preformatted text, 30–31
    script section, 20
    style section, 19
  image map code sample, 66–67
  intelligence, in JavaScript, 265
  JavaScript object, 620–622
  master element, 545
  moving with JavaScript, 319–321
  navigational pane, 119–120
  original/desired location, defining in XHTML (<base>), 531–532
  URL, 70
  XML, 156–157
document type definitions. See DTDs
dollar sign ($), 475
DOM (Document Object Model)
  changing nodes, 302–309
  history, 291–292
  JavaScript, 263, 264–265
  node properties and methods, 295–296
  sample, 292–294
  traversing nodes, 296–302
double forward slash (//), 474
do-while loop, 274, 482
drop caps, adding to first letter, 183–184
drop shadow text, 201
drop-down menu effect, 351–355
DTDs (document type definitions)
  document type tags, 14–15
  XHTML Basic 1.0, 14
  XHTML frame support, 122
  XML support, 155
Dynamic HTML (DHTML)
  collapsible lists, 317–319
  JavaScript
    code blocks, animating, 356–361
    menus, animating, 351–355
    styles, swapping, 348–351
  moving elements, 319–321
  uses, 316–317
dynamic outlines, CSS, 224–225
E
ECMA Specification, JavaScript, 324
editing
  animated GIFs, 54–57
  XML, 161
elements. See also tags
  CSS
    layering, 258–261
    matching, 173–178
    padding, 218–219
    sizing, 256–258
    visibility, 261–262
  DOM, accessing by ID, 316
  moving with JavaScript, 319–321
  XML, 156–157
elif statement, 411–412
e-mail
  address, obscuring, 327–329
  errors and troubleshooting report, sending, 504–505
  PHP functions, 750
  Python functions, 688–691
<embed> tag
  older browsers, supporting, 150–151
  representing non-HTML data with, 144–145
embedding
  external content in document, 552–553
  fonts, 213–214
emphasis, text, 560–561
enclosing scripts, 266–267
end of document, 15
ending elements
  PHP, 473–474
  pseudoelements, 184–185
entities, user-defined, 158–159
errors and troubleshooting
  JavaScript
    elements, animating, 360–361
    form validating, 347
    need, 286
    syntax, 287–288
    tools, 287
  Perl
    Apache Internal Server error message, 395–396
    maximum reporting, 394–395
  PHP
    custom handling, 505
    error level, controlling, 503–504
    handling, 729–730
    identifying, 502–503
    level, controlling, 503–504
    sending to file or e-mail address, 504–505
    syntax, common, 501
    tools, 500–501
  Python
    cgitb module, 421–422
    code, running in interpreter, 421
    error stream, redirecting, 422
escape characters
  JavaScript, 606
  Perl, 674
events
  attributes, XHTML, 570–571
  handlers, JavaScript, 284–286, 610–611
  object, JavaScript, 622–623
exact size, element, 257
exception handling
  JavaScript form validating, 347
  Python, 421
executing scripts, 267–268
exporting data, Linux Bash shell, 426–428
expressions
  executing code based on value of (switch), in JavaScript, 277–278
  matching, in Perl, 673
expressions, regular
  Perl
    examples, 385
    modifying, 385–386
    operators, 383–384
    special characters, 384–385
    substrings, memorizing, 386
  PHP, 769–770
  Python
    described, 413
    operations, 414
    special characters, 414–415
Extensible Markup Language (XML)
  attributes, 157
  comments, 157
  design strategy, 153–154
  DTDs, 155
  editing, 161
  elements, 156–157
  entities, user-defined, 158–159
  namespaces, 159
  nonparsed data, 158
  non-Web applications, 154
  parsing, 161–162
  PHP functions, 771, 786–787
  style sheets, 159–160
  versions, 8
  viewing documents, 160
  XSLT, 161
Extensible Stylesheet Language (XSLT), 161
F
fantasy fonts, 210
field labels, 138
fieldsets, form, 138–140
file
  accessing, 387–388
  binary
    Perl, 389–390
    PHP, 497–498
    Python, 420
  closing
    Perl, 389
    PHP, 497
    Python, 420
  errors and troubleshooting report, 504–505
  form fields, 137
  functions
    Perl, 667–668
    PHP, 730–733
    Python, 691–692
  handle and handle test functions, 666–667
  information, getting, 390–391
  listing, 428–429
  locking, 499
  miscellaneous, listed, 499–500
file (continued)
  opening
    Perl, 388
    PHP, 495–496
    Python, 417–418
  reading text
    Perl, 388
    PHP, 496–497
    Python, 418–419
  upload, 623–624
  writing text
    Perl, 389
    PHP, 497
    Python, 419
File Transfer Protocol (FTP), 733–736
filename, in URL, 70
filesystem functions, PHP, 730–733
first letter of elements, property values, 183–184
first line of elements
  indenting, 194–195
  property values, 181–183
first-child pseudoclasses, 180
fixed positioning elements, CSS, 249–252
Flash, Shockwave
  <embed> and <object> tags, 150–151
  plugins, 149–150
floating CSS elements, 255–256, 592, 593
floating page layout tables, 113–116
floating text objects, 195–197
fonts
  described, 210
  embedding, 213–214
  formatting tag, 79
  line spacing, 213
  lists, formatting, 39
  selection, 210–211
  sizing
    CSS, 211–212, 584
    XHTML, 79, 532–533, 559–560
  styling
    CSS, 212–213, 582–587
    XHTML, 582–587
footer, XHTML table, 103–105
for in loop, 276
for loop
  JavaScript, 275
  Perl, 380–381
  PHP, 482–483
  Python, 411
foreach loop
  Perl, 381
  PHP, 483–484
foreground colors, 227–228
form
  blocking changes to fields, 141–142
  button, custom text, 135–136
  CGI, sample, 432–434
  checkboxes, 132
  deciphering and handling data
    Perl, 444–447
    PHP, 514–516
    Python examples, 460–462
  defining, 542–543
  described, 123–126
  dynamic calendar, creating
    Perl, 447–453
    PHP, 517–523
    Python, 463–466
  features, adding, 341–343
  field labels, 138, 548–549
  fieldsets and legends, 138–140
  file fields, 137
  footer, 565–566
  header, 567
  heading, 566–567
  hidden fields, 135
  id attribute, 130
  images, 136
  input mechanism, 129–130, 546–547
  keyboard shortcuts, 140–141
  label, 541–542
  legends, 138–140
  list boxes, 132–134
  name attribute, 130
  object, JavaScript, 313–314, 624–625
  options list, 558–559
  passing data, 128
  password input boxes, 131
  PHP handler, XHTML, 127–128
  radio buttons, 131–132
  reset button, 137–138
  selection options, hierarchy of, 554
  submit button, 137–138
  tab order, 140–141
  tags, XHTML, 129
  text areas, large, 134–135
  text input boxes, 130–131, 565
  validating, 265, 343–347
  values, setting options, 554–555
formatting. See also CSS
  HTML documents, 10–11
  strings, Python operators, 404
  text
    CSS inline control, 81–82
    font tag, 79
    inline attributes, 80–81
    nonbreaking spaces, 82–83
    soft hyphens, 83–84
  XHTML table column groups, 110
forward slash, asterisk (/*), 474 GIMP (GNU Image Manipulation Program), 58
frames, 121–122 graphical date display, 338–341
free-form polygonal area, image map, 64 graying out form field controls, 141–142
FTP (File Transfer Protocol), 733–736 grouping
functions columns, XHTML tables, 109–111, 537–538
JavaScript in-line text elements, 87–88
data manipulation, built-in, 279–280 Gutmans, Andi (PHP re-writer), 471
object, 625–626
top-level, 611–612
Perl H
arithmetic, 660 head section, document tags
array and list, 664–665 meta tags, 16–19, 543–544
conversion, 661 structure, 15–16
directory, 670 title, specifying, 16, 567
file and file handle test, 666–667 header
file operations, 667–668 cells, table, 100
input and output, 668–670 documents
miscellaneous, 672–673 current date and time, 475–477
networking, 671–672 XHTML, 543
search and replace, 665–666 HTTP data, encapsulating, 6, 364–366
string, 663 tables
structure, 661–663 columns, spanning, 106–107
system, 670–671 described, 103–105
Python headings
array, 680–681 capitalizing, CSS, 200
asynchronous communication, 681–684 HTML, 27–28
binary and ASCII code conversion, 684–685 XHTML table, 566
built-in listed, 675–680 height
CGI, 685–687 elements, specifying, 257, 590
cookie, 687–688 line spacing, controlling, 213, 590–591
email, 688–691 XHTML images, 62
file, 691–692 hidden form fields, 135
garbage collection, 692–693 hidden object, JavaScript, 626–627
HTTP and HTTPS protocol, 693–696 history
IMAP, 696 JavaScript object, 627
interpreter, 705–708 list, navigating, 315
operating system, 697–700 horizontal rules
POP3 server, 700–701 HTML, 28–29
pseudo-random numbers, obtaining, 708 XHTML, 544
SMTP, 701–703 horizontal table elements
sockets, 701, 703 columns spanning, 106–109
strings, 704–705 described, 99–100
URL, 709–715 horizontal text alignment, 189–191
hovering over link, coloring, 74
HTML (HyperText Markup Language)
G block divisions, 31–33
garbage collection, Python, 692–693 creation, 7
Gecko (Mozilla Firebird) DOM Reference, 325 CSS and, 164–165
GIF (Graphics Interchange Format) document tags
animation block divisions, 31–33
assembling, 55–56 body section, 20–21
described, 54–55 body with two tables, 25
output, 56–57 described, 13
source, 55 DOCTYPE, 10, 14–15
Web images, 51–52 head section, 15–19
HTML (HyperText Markup Language), document tags (continued) matching elements, CSS, 175
node, finding in DOM, 300–302
headings, 27–28 IDE (integrated development environment), PHP, 501
horizontal rules, 28–29 identifying problems
HTML, 10–11, 15 JavaScript
manual line breaks, 25–27 elements, animating, 360–361
paragraphs, 23–24 form validating, 347
preformatted text, 30–31 need, 286
script section, 20 syntax, 287–288
style section, 19 tools, 287
headings, 27–28 Perl
horizontal rules, 28–29 Apache Internal Server error message, 395–396
manual line breaks, 25–27 maximum reporting, 394–395
non-HTML content, 144–147 PHP
output for simple Web page, 11–12 custom handling, 505
paragraphs, 23–25 error level, controlling, 503–504
preformatted text, 30–31 handling, 729–730
source for simple Web page, 11 identifying, 502–503
standards, listed by number, 8–9 level, controlling, 503–504
tables, 89 sending to file or e-mail address, 504–505
tag, 9–10, 15 syntax, common, 501
text, 79 tools, 500–501
versions, listed, 8–9 Python
HTTP (HyperText Transfer Protocol). See also plugins cgitb module, 421–422
data encapsulation, 364–366 code, running in interpreter, 421
described, 4–7 error stream, redirecting, 422
form data IDLE (Integrated DeveLopment Environment), 398, 399
creating dynamic calendar, 517–523 if else loop
passing (GET and POST), 128, 364–366, 444–453 JavaScript, 276–277
PHP functions, 737 Perl, 381–382
port, standard, 70 PHP, 484–485
Python functions, 693–696 if loop
request and response, 363–364 JavaScript, 276–277
HTTPS protocol, Python function, 693–696 Perl, 381–382
hyperlinks Python, 411–412
anchor tag, 71–73 image maps
colors, 74 clickable regions, specifying, 64–66
described, 3–4 document code sample, 66–67
JavaScript object, 629–630 navigation, defining, 550–551
keyboard shortcuts and tab orders, 73–74 physical area, describing, 530–531
style sheet to documents, 166–167 specifying, 63
tag, 76–77, 550 images
target details, 75–76 aligning, 62–63
titles, 72–73 animation, 54–57
URLs, 69–71 background, 231–236
visited and unvisited, styles of, 180 borders, 62–63
XHTML elements, 528, 546 forms, 136
HyperText Markup Language. See HTML graphical date display, 338–341
HyperText Transfer Protocol. See HTTP inserting into Web documents, 58–60
interlaced and progressive storage and display, 54
irregularly shaped layouts, 116–118
I list item markers, 44–46, 204
Iconv library, 737 object, JavaScript, 627–628
ID PHP functions, 738–745
element, accessing by, 316 preloading, 333–335
form attributes, 130 rollovers, 335–337
size, 61–62 Java language, 263
table backgrounds, 106 JavaScript
text, specifying for nongraphical browsers, 60–61 calculations and operators, 272–274
transparency, 53 conditional expression, breaking loop to (continue), 278
Web formats, 51–53
XHTML documents, 57–58 constants, 603
IMAP control structures
PHP functions, 745–750 breaking out of, 278
Python functions, 696 do while loop, 274
indenting text expression, executing code based on value of (switch), 277–278
CSS, 583
XHTML, 194–195 for loop, 275
inheritance for in loop, 276
CSS, 179 if and if else loops, 276–277
list style, 39–40 while loop, 274–275
in-line text data manipulation functions, 279–280
formatting, 80–81, 560 data types, 270–271
grouping, 87–88 DHTML
input functions, Perl, 668–670 code blocks, animating, 356–361
<input> tag, 129–130 collapsible lists, 317–319
input, user. See form menus, animating, 351–355
inserting moving elements, 319–321
images, 58–60 styles, swapping, 348–351
text or content, 86–87, 547–548 uses, 316–317
inside value, list style, 39–40 document head section, 20
integers, Python supported, 401 DOM, 264–265
integrated development environment (IDE), PHP, 501 drawbacks of using, 323–324
Integrated DeveLopment Environment (IDLE), 398, 399 elements, accessing by ID, 316
interlaced image storage and display, 54 enclosing scripts, 266–267
internationalization attributes, XHTML, 571 errors and troubleshooting
Internet Explorer (Microsoft) need, 286
DOM support, 264 syntax, 287–288
TrueDoc fonts, using, 214 tools, 287
interpreter, Python, 398–400, 705–708 event handlers, 284–286, 610–611
invisible data, 135 executing scripts, 267–268
invisible elements forms
CSS, 261–262, 591 features, adding, 341–343
hidden form fields, 135 validating, 343–347
JavaScript object, 626–627 functions, 611–612
irregularly shaped graphic and text layouts, 116–118 guidelines for using, 324
italic text history, 263
CSS, 212–213, 585 identifying problems, 288–290
HTML, 84–85 images
XHTML, 545–546 graphical date display, 338–341
item, lists, 201–202 preloading, 333–335
item marker rollovers, 335–337
image, 204, 581 implementations, different, 264
positioning, 203–204, 580 methods, 610
style, 43–46 objects
anchor, 612–613
area, 613
J array, 613–614
JASC Software Boolean, 614–615
Animation Shop, 54–57 button, 615–616
Paint Shop Pro, 57 checkbox, 616–617
JavaScript, objects (continued)
date, 617–620
described, 281
document, 620–622
event, 622–623
file upload, 623–624
form, 624–625
function, 625–626
hidden, 626–627
history, 627
image, 627–628
link, 629–630
location, 630–631
math, 631–632
navigator, 632
number, 633
object, 633
option, 634
password, 634–635
radio, 635–636
RegExp, 636
Reset, 637
Screen, 637–638
Select, 638–639
String, 639–641
Submit, 641–642
Text, 642–643
Textarea, 643–644
user-created, 283–284
Window, 644–647
objects, built-in
current document, 312–313
form elements, 313–314
history list, navigating, 315
math operations, 282
reference-making element (self), 315–316
URL information, manipulating, 314–315
XHTML document (window), 311–312
operators, 604–606
properties, 610
statements, marking for reference (label_name), 278–279
syntax, 269
text, writing to document
current date, 325–327
e-mail address, obscuring, 327–329
user-defined functions, 280–281
uses, 265
variables, 271
Web resources, 324–325
window
opening another, 329–331
text, writing, 331–333
JPEG (Joint Photographic Experts Group) format, 52
JSUnit troubleshooting tool, 290
K
keyboard shortcuts
forms, 140–141
input, indicating, 548
links, 73–74
L
label
document, 16
form, 541–542
lambda keyword, Python anonymous functions, 417
language
element style, 181
encoding, 557
large text areas, form, 134–135
last control structure, Perl, 382–383
layering elements, CSS, 258–261, 593
layout
irregularly shaped graphic and text, 116–118
multiple-column pages, 120–121
layout tables, page
described, 111–112, 244
floating page, 113–116
multiple columns, 120–121
navigational blocks, 119–120
odd graphic and text combinations, 116–118
left, positioning elements, 252–254
legends, form, 138–140
Lerdorf, Rasmus (PHP creator), 471
letter
first of elements, property values, 183–184
spacing style, 198–199, 586
levels, CSS, 165
line
breaks, manual, 25–27, 534
first of elements
indenting, 194–195
property values, 181–183
spacing, fonts, 213
links
anchor tag, 71–73
colors, 74
JavaScript object, 629–630
keyboard shortcuts and tab orders, 73–74
style sheet to documents, 166–167
tag, 76–77, 550
target details, 75–76
titles, 72–73
URLs, 69–71
visited and unvisited, styles of, 180
Linux
Bash shell scripting
Apache, configuring, 424–425
exporting data, 426–428
file, listing, 428–429 if else
passing data, 425–426 JavaScript, 276–277
state, toggling, 429 Perl, 381–382
user-specific command, running, 429–430 PHP, 484–485
PHP Perl, 658–660
running, 472 lowercase text, changing to, 583
troubleshooting tool, 501
list box, form, 132–134
lists M
collapsible, creating with DHTML, 317–319 Macintosh
custom numbering, automatic, 208–209 PHP troubleshooting tool, 501
definition, 46–47 Python interpreter, 399
described, 35–36 MacPython, 399
item, formatting, 201–202 Macromedia Freehand and Fireworks, 57
markers manual line breaks, 25–27, 534
image, 204, 581 maps, image
positioning, 203–204, 580 clickable regions, specifying, 64–66
setting, 202–203, 580 document code sample, 66–67
nesting, 47–48 navigation, defining, 550–551
ordered, 37–42 physical area, describing, 530–531
Perl functions, 664–665 specifying, 63
Python data types, 404–405 margins
text, formatting, 201–204 CSS, 223–224, 588
unordered, 42–46, 569 floating objects, 195–197
XHTML element, 549–550 marker, list
location object, JavaScript, 630–631 image, 204, 581
locking files, PHP, 499 positioning, 203–204, 580
logical operators style, 43–46
JavaScript, 604 master element, XHTML documents, 545
Perl, 378, 653 math operations
Python, 409 JavaScript object, 631–632
logo, page layout, 116–118 numbers, strings, and dates, 282
loops PHP functions, 751–753
breaking out of (break), 278 menus
continue statement animating, 351–355
JavaScript, 278 navigational blocks, 119–120
Perl, 382–383 meta tags
PHP, 486 automatic refresh and redirect, 18–19
Python, 412–413 default path, 17–18
do-while, 274, 482 search engine information, 17
for server, overriding, 19
JavaScript, 275 syntax, 16
Perl, 380–381 user agent caching, 18
PHP, 482–483 metadata, document, 551–552
Python, 411 methods
for in, 276 JavaScript, 610
foreach objects, assigning, 284
Perl, 381 PHP objects, 492–493
PHP, 483–484 Python strings, 403–404
if Microsoft Internet Explorer
JavaScript, 276–277 DOM support, 264
Perl, 381–382 TrueDoc fonts, using, 214
Python, 411–412 Microsoft Windows
PHP, 472, 500
Python interpreter, 398
Microsoft Windows Picture Viewer, 58 ordinal, changing position of, 39–41
MIDI sound file, adding, 147–148 starting number, changing, 41–42
MIME (Multipurpose Internet Mail Extensions) within unordered list, 47–48
PHP functions, 753 numbering, automatic
XHTML script tag support, 266 chapter and section number example, 207–208
minimum or maximum size, element, 257, 590 counter object, 206
modules counting, 582
Perl, 393, 657–658 described, 205
Python, 398 list, custom, 208–209
monospaced text, 85, 211, 568–569 text, 205–209
mouse value, changing counter’s, 206–207
cursor, CSS, 599 numbers. See also math operations
images, changing in JavaScript, 335–337 JavaScript object, 270, 633
menus, animating with drop-down effect, 351–355 ordered list style, changing, 37–39
movie data, storing, 493–494 Perl, 372–373
moving DHTML elements, 319–321 Python, 401
Mozilla Firebird, 264, 325
Mozilla Firefox, 288
MS SQL function, PHP, 755–757 O
MSDN Web Development Library, 324 object
multiple column page layout, 120–121 JavaScript
Multipurpose Internet Mail Extensions (MIME) anchor, 612–613
PHP functions, 753 area, 613
XHTML script tag support, 266 array, 613–614
music, background, 147–148 Boolean, 614–615
MySQL built-in, 282
CGI data managing, 434–438 button, 615–616
data access with Perl, 453–456 checkbox, 616–617
PHP functions, 757–760 date, 617–620
described, 281
document, 620–622
N event, 622–623
name file upload, 623–624
form attribute, 130 form, 624–625
matching elements, CSS, 173–174 function, 625–626
namespaces, XML, 159 hidden, 626–627
navigating history, 627
history list, 315 image, 627–628
JavaScript object, 632 link, 629–630
nodes, DOM, 296–302 location, 630–631
navigational blocks math, 631–632
image map, 550–551 navigator, 632
page layout tables, 119–120 number, 633
nesting lists, 47–48 object, 633
networking functions option, 634
Perl, 671–672 password, 634–635
PHP, 760–761 radio, 635–636
newspaper-like columns, documents, 120–121 references, incorrect, 288
next control structure, Perl, 382–383 RegExp, 636
node properties and methods, DOM, 295–296 Reset, 637
nonbreaking spaces, 82–83 Screen, 637–638
nonparsed data, XML, 158 Select, 638–639
numbered list String, 639–641
described, 35, 37, 553–554 Submit, 641–642
number style, changing, 37–39 Text, 642–643
Textarea, 643–644 bitwise, 410
user-created, 283–284 comparison, 409
Window, 644–647 logical, 409
JavaScript built-in miscellaneous, 410
current document, 312–313 strings, 402–403
form elements, 313–314 option
history list, navigating, 315 JavaScript object, 634
reference-making element, 315–316 PHP functions, 766–768
URL information, manipulating, 314–315 ordered list
XHTML document, 311–312 described, 35, 37, 553–554
Perl number style, changing, 37–39
constructors, 392 ordinal, changing position of, 39–41
nomenclature, 391–392 starting number, changing, 41–42
property values, accessing, 392–393 within unordered list, 47–48
PHP ordinal, changing position of
class definitions, 491 ordered lists, 39–41
constructors and destructors, 491–492 unordered lists, 46
functions, 725–726 orphans, CSS printing, 598–599
methods and properties, 492–493 output functions
movie data, storing, 493–494 Perl, 668–670
Python, 420 PHP buffering, 765
<object> tag outside value, list style, 39–40
history of, 146–147 overflow, controlling, 257–258, 589
older browsers, supporting, 150–151
oblique text, 585
ODBC function, PHP, 761–764 P
Official PHP Web Site, 508 packages, Perl, 657–658
online resources, 325 padding, CSS blocks, 588–589
open source image applications, 57–58 page layout tables
opening described, 111–112, 244
another window, JavaScript, 329–331 floating page, 113–116
files multiple columns, 120–121
Perl, 388 navigational blocks, 119–120
PHP, 495–496 odd graphic and text combinations, 116–118
Python, 417–418 pages, printing, 598–599
OpenType fonts, 213, 214 Paint Shop Pro (JASC Software), 57
operating systems paragraphs
image applications, 58 color, defining style, 228
keyboard shortcuts, differentiating, 73–74 HTML, 23–25
Python functions, 697–700 XHTML, 555
operations, Python regular expressions, 414 parentheses (()), 386
operators parsing XML, 161–162
JavaScript, 604–606 passing data
Perl HTTP (GET and POST), 364–366, 444–453
arithmetic, 377, 652 Linux Bash shell scripting, 425–426
assignment, 377, 652 to software, 555–556
bitwise, 378, 653 XHTML forms, 128
comparison, 378, 652–653 password
logical, 378, 653 form input boxes, 131
miscellaneous, 379, 654 JavaScript object, 634–635
regular expressions, 383–384 path
string, 379, 654 absolute versus relative, 71
PHP, 479–481 default, 17–18
Python PCRE function, PHP, 766
arithmetic, 408 PEAR (PHP Extension and Application Repository), 508
assignment, 408–409
Perl (Practical Extraction and Report Language) logical, 378, 653
built-in functions, 386–387 miscellaneous, 379, 654
for CGI, 393–394 string, 379, 654
command line arguments, 649–650 regular expressions
control structures examples, 385
continue, 382–383 listed, 673–674
for loop, 380–381 modifying, 385–386
foreach loop, 381 operators, 383–384
if and if else, 381–382 special characters, 384–385
last, 382–383 substrings, memorizing, 386
next, 382–383 resources, 371–372
redo, 382–383 special variables, 374–377
while and until loops, 380 string tokens, 379
data types, 372–373 syntax, 372
debugger, symbolic, 650–651 user-defined functions, 387
errors and troubleshooting variables, 373, 655–657
maximum reporting, 394–395 PHP Extension and Application Repository (PEAR), 508
message, Apache Internal Server, 395–396 PHP (PHP: Hypertext Preprocessor)
examples beginning and ending tags, 473–474
database access, 453–456 break and continue statements, 486
date and time handling, simple calendar, 439–444 built-in functions, listed, 486–490
form data, creating dynamic calendar, 447–453 command terminal character and blocks of code, 474
form data, deciphering and dealing with, 444–447 commenting code, 474
script, 368–369 database, querying and reporting, 523–526
file operations date and time handling (simple calendar), 509–514
binary, manipulating, 389–390 do-while loop, 482
closing, 389 errors and troubleshooting
information, getting, 390–391 custom handling, 505
opening, 388 error level, controlling, 503–504
reading text, 388 identifying, 502–503
writing text, 389 sending to file or e-mail address, 504–505
functions syntax, common, 501
arithmetic, 660 tools, 500–501
array and list, 664–665 file operations
conversion, 661 binary, 497–498
directory, 670 closing, 497
file and file handle test, 666–667 locking, 499
file operations, 667–668 miscellaneous, listed, 499–500
input and output, 668–670 opening, 495–496
miscellaneous, 672–673 reading text, 496–497
networking, 671–672 writing text, 497
search and replace, 665–666 for loop, 482–483
string, 663 foreach loop, 483–484
structure, 661–663 form handler
system, 670–671 creating dynamic calendar, 517–523
history, 371 deciphering and handling, 514–516
modules, 393 logging data to file, 127
objects security, 128
constructors, 392 functions
nomenclature, 391–392 Apache, 717–718
property values, accessing, 392–393 array, 718–722
operators BCMath, 722–723
arithmetic, 377, 652 BZip2, 723–724
assignment, 377, 652 calendar, 724–725
bitwise, 378, 653 character type, 726
comparison, 378, 652–653 class/object, 725–726
curl, 726–727 plugins
date and time, 727–728 described, 143–144
directory, 729 <embed> tag, representing non-HTML data with, 144–145
email, 750
error handling, 729–730 MIDI sound file, adding, 147–148
filesystem, 730–733 <object> tag, 146–147
FTP, 733–736 older, Netscape-based browsers, supporting, 150–151
handling, 736 parameters, 147
HTTP, 737 Shockwave Flash, adding, 149–150
Iconv library, 737 plus sign (+), 385
image, 738–745 PNG (Portable Network Graphics) format, 52–53
IMAP, 745–750 polygonal area, image map, 64
math, 751–753 POP3 server function, Python, 700–701
MIME, 753 port number, URL, 70
miscellaneous, 754–755 positioning
MS SQL, 755–757 background images, 236
MySQL, 757–760 blocks, 591–593
network, 760–761 elements
ODBC, 761–764 absolute, 248–249
options and info, 766–768 fixed, 249–252
output buffering, 765 floating, 255–256
PCRE, 766 layering, 258–261
programs, executing, 769 relative, 246–248
session, 770–771 specifying (top, right, bottom, and left properties), 252–254
simple XML, 771
socket, 772–773 static, 245–246
SQLite, 774–776 syntax, 245
streams, 776–778 visibility, 261–262
strings, 778–784 pound sign (#)
URL, 784 identifiers, matching style elements by, 175
variable, 784–785 PHP commenting, 474
XML, 786–787 Practical Extraction and Report Language. See Perl
ZLib, 787 preformatted text, HTML, 30–31
history, 471–472 preloading images with JavaScript, 333–335
if/else construct, 484–485 premade images, 58
objects printing
class definitions, 491 CSS, 598–599
constructors and destructors, 491–492 to text file, in Perl, 389
methods and properties, 492–493 programs, PHP functions executing, 769
movie data, storing, 493–494 progressive image storage and display, 54
operators, 479–481 property
regular expressions, 769–770 borders shortcut, 222
requirements, 472–473 CSS attributes, 237–238
resources, 508 JavaScript, 610
script, sample, 475–477 PHP objects, 492–493
switch construct, 485–486 values
user-defined functions CSS, 172–173
arguments, 490–491 Perl objects accessing, 392–393
return value, 490 protocol, URL section, 69–70
variable scope, 491 pseudoclasses
variables, 475 anchor styles, 180
when to use, 507–508 described, 179
while loop, 482 first-child element, 180
white space, use of, 474 language, changing by, 181
PHPBuilder Web site, 508
pseudoelements operators
beginning and ending elements, 184–185 arithmetic, 408
CSS, 181–185 assignment, 408–409
first letter, specifying, 183–184 bitwise, 410
first line, specifying, 181–183 comparison, 409
pseudo-random numbers, Python functions logical, 409
obtaining, 708 miscellaneous, 410
Python regular expressions
anonymous functions (lambda keyword), 417 described, 413
built-in functions, 415–416 operations, 414
control structures special characters, 414–415
continue and break, 412–413 resources, 398
for loop, 411 syntax, 400
if and elif statements, 411–412 troubleshooting
try statement, 412 cgitb module, 421–422
while loop, 410–411 code, running in interpreter, 421
data types error stream, redirecting, 422
dictionaries, 405–406 user-defined functions, 416–417
lists, 404–405 variable scope, 407–408
numbers, 401 PythonWin, 398
strings, 402–404
tuples, 406–407
errors and exception handling, 421 Q
examples query, database
date and time handling, 457–460 MySQL, 434–438
form data, deciphering and dealing with, 460–462 Perl, 453–456
file operations PHP, 523–526
binary, handling, 420 Python, 467–469
closing, 420 quirks, browser reference, 325
opening, 417–418 quotation, enclosing, 533
reading from text file, 418–419 quotation marks (“)
writing to text file, 419 adding with styles, 184–185
functions autogenerating in text, 205
array, 680–681 block, offsetting, 31–33
asynchronous communication, 681–684
binary and ASCII code conversion, 684–685
built-in listed, 675–680 R
CGI, 685–687 radio buttons
cookie, 687–688 forms, 131–132
email, 688–691 JavaScript object, 635–636
file, 691–692 reading file text
garbage collection, 692–693 Perl, 388
HTTP and HTTPS protocol, 693–696 PHP, 496–497
IMAP, 696 Python, 418–419
interpreter, 705–708 read-only form fields, XHTML, 141–142
operating system, 697–700 rectangular area, image map, 64
POP3 server, 700–701 redirect meta tags, automatic, 18–19
pseudo-random numbers, obtaining, 708 redo control structure, Perl, 382–383
SMTP, 701–703 references
sockets, 701, 703 back to reference (self), 315–316
strings, 704–705 external, declaring as system entity, 158–159
URL, 709–715 labeling, 278–279
history, 397 object, incorrect, 288
interpreter, 398–400 refresh meta tags, automatic, 18–19
modules, 398 RegExp object, JavaScript, 636
objects, 420
regular expressions identifier, 175
Perl name, 173–174, 600
examples, 385 universal selector (*), 174
modifying, 385–386 semicolon (;)
operators, 383–384 JavaScript, 287
special characters, 384–385 Perl, 372
substrings, memorizing, 386 PHP, 474, 501, 502
PHP, 769–770 serif fonts, 210
Python server
described, 413 Apache
operations, 414 Internal Server error message, Perl and, 395–396
special characters, 414–415 Linux Bash shell, configuring to deliver, 424–425
relative paths, 17, 71 PHP functions, 472, 717–718
relative positioning, CSS elements, 246–248 CGI, 367–368
repeated background images, 232–235 defined, 1
request and response, HTTP, 363–364 documents, delivering, 4
reset button, forms, 137–138 meta tags overriding, 19
Reset object, JavaScript, 637 name in URL, 69–70
return value, PHP user-defined function, 490 session function, PHP, 770–771
right, positioning elements, 252–254 SGML (Standard Generalized Markup Language), 7
rollovers Shockwave Flash
images, changing in JavaScript, 335–337 <embed> and <object> tags, 150–151
menus, animating with drop-down effect, 351–355 plugins, 149–150
rows, XHTML tables shortcuts, keyboard
columns spanning, 106–109 forms, 140–141
described, 99–100 input, indicating, 548
RSS feed, 156–157 links, 73–74
rules, XHTML shorthand expressions, CSS, 185–187
documents, 544 sibling elements, matching by, 176–178
tables, 96–99 Simple Mail Transfer Protocol (SMTP) functions, 701–703
simple XML function, PHP, 771
S sizing
sans serif fonts, 210 elements, 257–258
scalar values, 372 fonts
Screen object, JavaScript, 637–638 CSS, 211–212, 584
script section, 20, 558 XHTML, 79, 532–533, 559–560
scripting, Bash shell images, 61–62
Apache, configuring, 424–425 small text (<small>), 86, 559–560
exporting data, 426–428 SMTP (Simple Mail Transfer Protocol) functions, 701–703
file, listing, 428–429
passing data, 425–426 socket functions
state, toggling, 429 PHP, 772–773
user-specific command, running, 429–430 Python, 701, 703
scrolling soft hyphens, text formatting, 83–84
background images, 232–235 software
viewers’, fixing document positions despite, 249–252 output, sample, 557
search and replace functions, Perl, 665–666 values, passing, 555–556
search engine meta tags, 17 software plugins
Select object, JavaScript, 638–639 described, 143–144
selection fonts, 210–211 <embed> tag, representing non-HTML data with, 144–145
selectors, CSS matching MIDI sound file, adding, 147–148
attributes by specific, 175–176, 600 <object> tag, 146–147
child, descendant, and adjacent sibling elements, 176–178, 600–601 older, Netscape-based browsers, supporting, 150–151
parameters, 147
class, 174–175 Shockwave Flash, adding, 149–150
source citation, 536 styling fonts, 212–213
source code, JavaScript, 265 submit button
spaced data, presenting, 30–31 forms, 137–138
spacing JavaScript object, 641–642
borders, 223 subroutines, Perl, 657–658
letter and word styles, setting, 198–199, 239–240, 586 subscript (<sub>), 85–86, 561–562
substrings, memorizing, 386
spam, 327–329 suites, code, 400
special characters superscript (<sup>), 85–86, 562
Perl, 384–385 Suraski, Zeev (PHP re-writer), 471
Python regular expressions, 414–415 switch construct, PHP, 485–486
special variables, Perl, 374–377 system functions, Perl, 670–671
spelling errors, Python variable name, 408
SQLite function, PHP, 774–776
square brackets ([]) T
element attributes, matching styles, 175–176 tab order
Perl regular expression, 385 forms, 140–141
stacking elements, CSS, 258–261, 593 input, indicating (kbd), 548
Standard Generalized Markup Language (SGML), 7 links, 73–74
starting list number, 41–42 tabbed data, presenting, 30–31
state, Linux Bash shell scripting, 429 tabbed elements, accessing by keyboard, 73–74
statements tables
JavaScript, 278–279, 606–609 captions
Perl, 657–660 aligning and positioning, 244
static positioning, CSS elements, 245–246 defining, 535–536
stream functions, PHP, 776–778 CSS properties, 237–243, 596–598
strings HTML, 89
JavaScript layout, 244
math operations, 282 MySQL, populating, 434–438
object, 639–641 text, changing color following mouse movement, 285
operators, 606 XHTML
support, 270 backgrounds, 105–106
Perl body, main, 563–564
described, 373 borders and rules, 96–99
functions, 663 captions, 102–103
operators, 379, 654 cell spacing and padding, 94–95
PHP functions, 778–784 cells, specifying, 100–101, 564
Python content, defining, 562–563
described, 402 database, accessing and reporting data, 523–526
format operators, 404 frames, 121–122
functions, 704–705 grouping columns, 109–111
methods, 403–404 header, footer, and body sections, 103–105, 567
operators, 402–403 page layout, using for, 111–121
structure functions, Perl, 661–663 parts, 89–91
style rows, 99–100, 568
borders, 220–221 spanning columns and rows, 106–109
definition format, 171–172 width and alignment, 91–94
DHTML, swapping, 348–351 tags
differences from main sheet, enabling, 167–169 form, 129
document tag section, 19 HTML document
rules, defining in XHTML, 561 block divisions, 31–33
style sheets. See also CSS body section, 20–21
external, referring to, 19 body with two tables, 25
purpose, 163 described, 9–10, 13
XHTML formatting, 164–165 DOCTYPE, 10, 14–15
XML, 159–160 head section, 15–19
headings, 27–28 writing to document, JavaScript
horizontal rules, 28–29 current date, 325–327
HTML, 10–11, 15 e-mail address, obscuring, 327–329
manual line breaks, 25–27 Textarea object, JavaScript, 643–644
paragraphs, 23–24 tiling background images, 232–235
preformatted text, 30–31 time
script section, 20 current, in document header, 475–477
style section, 19 handling
link, 76–77 JavaScript, 282
paragraph, 23–24 Perl examples, 439–444
target details, 75–76 PHP, 509–514
TCP/IP (Transmission Control Protocol/Internet Protocol), 1 Python examples, 457–460
PHP functions, 727–728
teletype tag (<tt>), 85 title, document, 16, 567
telnet client, 5–7 tools, errors and troubleshooting
term, XHTML definition, 539, 540 JavaScript, 290
test functions, Perl file and file handle, 666–667 PHP, 500–501
text top, positioning elements, 252–254
abbreviations, 87 Transmission Control Protocol/Internet Protocol (TCP/IP), 1
big, 85–86, 561–562
bold and italic, 84–85 transparency, image, 53
emphasis, 541, 560 trapping errors, 289
files, reading from and writing to traversing nodes, DOM, 296–302
PHP, 496–497 troubleshooting
Python, 418–419 JavaScript
writing to, 419 elements, animating, 360–361
form input boxes, 130–131 form validating, 347
formatting, XHTML, 79–84 need, 286
grouping in-line elements, 87–88 syntax, 287–288
insertions and deletions, 86–87 tools, 287
irregularly shaped layouts, 116–118 Perl
italic, 545–546 Apache Internal Server error message, 395–396
JavaScript maximum reporting, 394–395
object, 642–643 symbolic debugger, 651
writing in window, 331–333 PHP
monospaced, 85 custom handling, 505
rendering, specified (bdo), 532 error level, controlling, 503–504
small, 85–86, 561–562 handling, 729–730
specifying for nongraphical browsers, 60–61 identifying, 502–503
styles, setting level, controlling, 503–504
aligning, 189–194 sending to file or e-mail address, 504–505
capitalization, 200 syntax, common, 501
decorations, 200–201 tools, 500–501
direction, handling different languages, 587 Python
displaying, 581 cgitb module, 421–422
floating objects, 195–197 code, running in interpreter, 421
fonts, 210–214 error stream, redirecting, 422
indenting, 194–195 XHTML tables, 96
letter and word spacing, 198–199 TrueDoc fonts, 213, 214
lists, formatting, 201–204 true/false condition, executing code. See if else loop; if loop
numbering, automatic, 205–209, 582
quotation marks, 205, 582 try statement, Python, 412
white space, preserving, 198 try/catch statement, 289
subscript, 85–86, 561–562 tuple data type, 406–407
superscript, 85–86, 561–562
809
typefaces
typefaces user-defined entities, XML, 158–159
described, 210 user-defined functions
embedding, 213–214 JavaScript, 280–281
formatting tag, 79 Perl, 387
line spacing, 213 PHP, 490–491
lists, formatting, 39 Python, 416–417
selection, 210–211 user-specific command, running in Linux Bash shell
sizing script, 429–430
CSS, 211–212, 584
XHTML, 79, 532–533, 559–560
styling V
CSS, 212–213, 582–587 validating forms, JavaScript, 343–347
XHTML, 582–587 value
automatic numbering, changing, 206–207
stepping through range (for loop), 275
U troubleshooting, basic, 289
underlining text, 583 van Rossum, Guido (Python language inventor),
unicode text, 587 397, 418
Uniform Resource Locator. See URL variable
universal selector (*), matching elements by, 174 JavaScript, 271, 288, 289
UNIX, 398, 399 Perl, 373, 655–657
unordered list PHP, 475, 501, 784–785
described, 35, 42–43, 569 Python, 407
item marker, changing, 43–46 XHTML, 569–570
ordered lists within, 47–48 variable scope
ordinal, changing position of, 46 PHP, 491
until loop, Perl, 380 Python, 407–408
uppercase text, changing to, 583 vertical text alignment, 191–194, 591
URL (Uniform Resource Locator) viewers’ scrolling, fixing document positions despite,
absolute versus relative paths, 71 249–252
components, 69–70 viewing documents, 160
data, passing to CGI script, 425–426 visibility, positioning elements, 261–262
described, 4 visited link, coloring, 74
form data, passing, 128
HTTP data, encapsulating, 364–366
information, manipulating (location), 314–315 W
links, 69–71 Wall, Larry (Perl language inventor), 371
PHP functions, 784 W3C (World Wide Web Consortium)
Python functions, 709–715 DOM specification, 291, 324
user agent Web specifications, overall, 7
caching with meta tags, 18 Web
default path, setting, 17 creating, 3–4
DOM support, listed, 264 described, 1–2
formatting shortcuts, reasons to avoid, 10 HTTP, 4–7
JavaScript support lacking, 323 Web browser
meta tags caching, 18 caching with meta tags, 18
refreshing and reloading after specified time, 18–19 default path, setting, 17
scripting language, unsupported, 552 DOM support, listed, 264
Web servers, connecting, 1–2 formatting shortcuts, reasons to avoid, 10
window, another JavaScript support lacking, 323
opening, 329–331 meta tags caching, 18
outputting text, 331–333 refreshing and reloading after specified time, 18–19
XML, usefulness of, 154, 160 scripting language, unsupported, 552
user input. See form Web servers, connecting, 1–2
user-created objects, JavaScript, 283–284
810
Index
XHTML
window, another white space
opening, 329–331 compressing, 198
outputting text, 331–333 CSS, 587
XML, usefulness of, 154, 160 explicitly including, 83
Web document floating objects, 195–197
CSS HTML documents, reasons to use, 10
layering, 258–261 PHP, 474
matching, 173–178 preserving, 198, 556
padding, 218–219 XHTML tables used for layout, 118
sizing, 256–258 widows, CSS printing, 599
visibility, 261–262 width
DOM, accessing by ID, 316 borders, 219–220
external content, embedding, 552–553 elements, specifying, 257
form, inserting, 129 images, XHTML, 62
header containing current date and time, 475–477 margins, CSS, 588
HTML tags tables, XHTML, 91–94
block divisions, 31–33 wildcard, style matching, 174
body section, 20–21 window, browser
body with two tables, 25 opening another, 329–331
described, 13 text, writing, 331–333
DOCTYPE, 10, 14–15 Window object, JavaScript, 644–647
head section, 15–19 Windows (Microsoft)
headings, 27–28 PHP, 472, 500
horizontal rules, 28–29 Python interpreter, 398
HTML, 10–11, 15 Windows Picture Viewer (Microsoft), 58
manual line breaks, 25–27 word spacing, CSS, 198–199, 586
paragraphs, 23–24 World Wide Web
preformatted text, 30–31 creating, 3–4
script section, 20 described, 1–2
style section, 19 HTTP, 4–7
image map code sample, 66–67 World Wide Web Consortium (W3C)
intelligence, in JavaScript, 265 DOM specification, 291, 324
JavaScript object, 620–622 Web specifications, overall, 7
master element, 545 writing
moving with JavaScript, 319–321 to browser window in JavaScript, 331–333
navigational pane, 119–120 to document in JavaScript
original/desired location, defining in XHTML (<base>), date, current, 325–327
531–532 e-mail address, obscuring, 327–329
URL, 70 text, writing to document
XML, 156–157 to text file
Web image formats Perl, 389
GIF, 51–52 PHP, 497
JPEG, 52 Python, 419
PNG, 52–53
Web resources
JavaScript, 324–325 X
Perl, 371–372 XHTML
PHP, 508 attributes
Python, 398 core, 571
while loop events, 570–571
JavaScript, 274–275 internationalization, 571
Perl, 380 document, built-in JavaScript object, 311–312
PHP, 482 hyperlink (<a>), 528
Python, 410–411 PHP date and time routines, 509–514
811
XHTML (continued)
XHTML (continued) design strategy, 153–154
tables DTDs, 155
backgrounds, 105–106 editing, 161
body, main, 563–564 elements, 156–157
borders and rules, 96–99 entities, user-defined, 158–159
captions, 102–103 namespaces, 159
cell spacing and padding, 94–95 nonparsed data, 158
cells, specifying, 100–101, 564 non-Web applications, 154
content, defining, 562–563 parsing, 161–162
database, accessing and reporting data, 523–526 PHP functions, 771, 786–787
frames, 121–122 style sheets, 159–160
grouping columns, 109–111 versions, 8
header, footer, and body sections, 103–105, 567 viewing documents, 160
page layout, using for, 111–121 XSLT, 161
parts, 89–91 XSLT (Extensible Stylesheet Language), 161
rows, 99–100, 568
spanning columns and rows, 106–109
width and alignment, 91–94 Z
tips for using, 527 z-axis, 258–261, 593
versions, 9 ZLib functions, PHP, 787
XML (Extensible Markup Language)
attributes, 157
comments, 157
812
Appendix C
Event Handlers
onFocus
onKeyDown
onKeyPress
onKeyUp
onSelect
Window Object
As the top-level object in the JavaScript client hierarchy, a Window object is created for every user agent
window and frame (every instance of an XHTML <body> or <frameset> tag).
Properties
Properties Description
closed Returns a Boolean value corresponding to whether a window has
been closed. If the window has been closed, this property is true.
defaultStatus [= “message”] Returns or sets the message displayed in a window’s status bar.
JavaScript Language Reference
scrollbars[.visible = true|false] Sets the visibility of the window’s scroll bars.
toolbar[.visible = true|false] Sets the visibility of the window’s toolbar. Note that this property
can be set only prior to the window being opened and requires the
UniversalBrowserWrite privilege.
Methods
Methods Description
alert(“message”) Displays an alert box containing message and an OK button (to
clear the box).
blur Removes the focus from the specified window.
captureEvents(event_types) Instructs the window to capture all events of a particular type. See
the Event object for a list of event types.
clearInterval(intervalID) Used to cancel an interval previously set with the setInterval
method.
clearTimeout(timeoutID) Used to cancel a timeout previously set with the setTimeout
method.
close Causes the specified window to close.
confirm(“message”) Displays a dialog box containing message along with OK and Cancel
buttons. If the user clicks the OK button, this method returns
true; if the user clicks the Cancel button (or otherwise closes the
dialog box), the method returns false.
disableExternalCapture Disables the capturing of events previously enabled using the
enableExternalCapture method.
handleEvent(event) Used to call the handler for the specified event.
home Mimics the user pressing the Home button, causing the window to
display the document designated as the user’s home page.
moveBy(horizPixels, vertPixels) Moves the window horizontally by horizPixels and vertically by
vertPixels in relation to its current position.
moveTo(Xposition, Yposition) Moves the window’s upper left corner to the position Xposition
(horizontal) and Yposition (vertical).
open(URL, windowname [, features]) Opens a new window named windowname, displaying the
document referred to by URL, with the optional specified features.
The specified features are contained in a string, with the features
separated by commas. Features can include the following:
For example, to create a new window that is 400 pixels square, is not
resizable, and has no scroll bars, you could use the following string
for features:
“height=400,width=400,resizable=no,scrollbars=no”
print Calls the print routine for the user agent to print the current
document.
prompt(message[, input]) Displays a dialog box containing message and a text box with the
default input (if specified). The content of the text box is returned
if the user clicks OK. If the user clicks Cancel or otherwise closes
the dialog box, the method returns null.
releaseEvents(event_type) Used to release any captured events of the specified type and to
send them on to objects further down the event hierarchy.
resizeBy(horizPixels, vertPixels) Resizes the specified window by the specified horizontal and
vertical pixels. The window retains its upper left position; the resize
moves the lower right corner appropriately.
resizeTo(horizPixels, vertPixels) Resizes the specified window to the specified dimensions.
routeEvent(event_type) Used to send an event further down the normal event hierarchy.
scrollBy(horizPixels, vertPixels) Scrolls the specified window by the amount (horizontally and
vertically) specified. The visible property of the window’s scrollbar
must be set to true for this method to work. Note that this method
has been largely deprecated in favor of scrollTo.
scrollTo(Xposition, Yposition) Scrolls the specified window to the specified coordinates, with the
specified coordinate becoming the top left corner of the viewable
area.
setInterval(expression/function, milliseconds) Causes the expression to be evaluated or the function
called repeatedly, once every interval of the given number of
milliseconds. Returns the ID of the interval. Use the
clearInterval method to stop the iterations.
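The setInterval and clearInterval methods are typically used as a pair. The following is a minimal sketch, not a definitive implementation: it polls the closed property of a window reference and stops the iterations once the window has been closed. The helper name whenClosed, the window name example, and the 500-millisecond delay are assumptions made for illustration only.

```javascript
// Hypothetical helper (not part of the Window object): calls onClosed once,
// after the window reference win reports closed === true.
function whenClosed(win, milliseconds, onClosed) {
  var intervalID = setInterval(function () {
    if (win.closed) {
      clearInterval(intervalID); // stop the iterations
      onClosed();
    }
  }, milliseconds);
  return intervalID; // callers can cancel early with clearInterval
}

// Browser-only usage; skipped where no window object exists (for example, Node.js).
if (typeof window !== "undefined" && typeof window.open === "function") {
  var popup = window.open("about:blank", "example", "height=400,width=400");
  if (popup) {
    whenClosed(popup, 500, function () {
      alert("The example window was closed.");
    });
  }
}
```

Returning the interval ID from the helper lets the caller cancel the polling early with clearInterval, mirroring the way setInterval itself is used.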
Event Handlers
onBlur
onDragDrop
onError
onFocus
onLoad
onMove
onResize
onUnload
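Returning to the open method listed under Methods: its features argument is a single comma-separated string, which can be assembled from an object rather than typed by hand. The sketch below assumes a helper named buildFeatures (hypothetical, not a built-in) and uses the feature name resizable, the spelling user agents recognize.

```javascript
// Hypothetical helper: turns an options object into a features string for open.
// Boolean values are written as yes/no; other values are used as given.
function buildFeatures(options) {
  var parts = [];
  for (var name in options) {
    if (Object.prototype.hasOwnProperty.call(options, name)) {
      var value = options[name];
      if (value === true) { value = "yes"; }
      if (value === false) { value = "no"; }
      parts.push(name + "=" + value);
    }
  }
  return parts.join(",");
}

var features = buildFeatures({
  height: 400,
  width: 400,
  resizable: false,
  scrollbars: false
});
// features is "height=400,width=400,resizable=no,scrollbars=no"
```

The resulting string can then be passed as the third argument to open in a user agent that supports it.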
Perl Language Reference
This appendix provides a comprehensive reference to the Perl language. Within this appendix you
will find listings for Perl’s many language conventions, including its variables, statements, and
functions. For more information on using the language, see Chapters 25 and 28 of this book.
#!/usr/bin/perl -U
Argument Use
-a Turns on autosplit mode. Used with the -n or -p options. (Splits
to @F.)
-c Checks syntax. (Does not execute program.)
-d Starts the Perl symbolic debugger.
-D number Sets debugging flags.
-e command Enters a single line of script. Multiple -e arguments can be used
to create a multiline script.
-F regexp Specifies a regular expression to split on if -a is used.
-i[extension] Edits <> files in place.
-I[directory] Used with -P, specifies where to look for include files. The directory
is prepended to @INC.
-l [octnum] Enables line-end processing on octnum.
Appendix D
-n Assumes a while (<>) loop around the script. Does not print lines.
-p Similar to -n, but lines are printed.
-P Executes the C preprocessor on the script before Perl.
-s Enables switch parsing after program name.
-S Enables PATH environment variable searching for program.
-T Forces taint checking.
-u Compiles program and dumps core.
-U Enables Perl to perform unsafe operations.
-v Outputs the version of the Perl executable.
-w Enables checks and warning output for spelling errors and other error-
prone constructs in the script.
-x [directory] Extracts a Perl program from input stream. Specifying directory changes
to that directory before running the program.
-X Disables all warnings.
-0[octal] Designates an initial value for the record separator, $/. See also -l.
Command Use
h Prints out a help message.
T Prints a stack trace.
s Single-steps forward.
n Single-steps forward around a subroutine call.
RETURN (key) Repeats the last s or n debugger command.
r Returns from the current subroutine.
c [ line ] Continues until line, breakpoint, or exit.
p expr Prints expr.
l [ range ] Lists a range of lines. range may be a number, a subroutine name, or one
of the following formats: start-end, start+amount. (Omitting range
lists the next window.)
w Lists window around current line.
- Lists previous window.
f file Switches to file.
l sub Lists the subroutine sub.
S Lists the names of all subroutines.
/pattern/ Searches forward for pattern.
?pattern? Searches backward for pattern.
b [ line [ condition ]] Sets breakpoint at line for the specified condition. If line is omitted, the
current line is used.
b sub [ condition ] Sets breakpoint at the subroutine sub for the specified condition.
d [ line ] Deletes breakpoint at line.
D Deletes all breakpoints.
L Lists lines that currently have breakpoints or actions.
a line command Sets an action for line.
A Deletes all line actions.
< command Sets command to be executed before every debugger prompt.
> command Sets command to be executed before every s, c, or n command.
V [ package [ vars ]] Lists all variables or specified vars in package. If package is omitted, lists
main.
X [ vars ] Similar to V, but lists the current package.
! [ [-]number ] Re-executes a command. If number is not specified, the previous command
is used.
H [ -number ] Displays the last -number commands of more than one letter.
t Toggles trace mode.
= [ alias value ] Sets alias to value, or lists current aliases.
q Quits the debugger.
command Executes command as a Perl statement.
Operators
The following tables detail the various operators present in the Perl language.
Operator Use
< Numeric is less than
>= Numeric is greater than or equal to
<= Numeric is less than or equal to
eq String equality
ne String nonequality
gt String greater than
lt String less than
ge String greater than or equal to
le String less than or equal to
String Operators
Operator Use
. Concatenation
x Repetition
String Tokens
Token Character
\b Backspace
\e Escape
\t Horizontal tab
\n Line feed
\v Vertical tab
\f Form feed
\r Carriage return
\" Double quote
\' Single quote
\$ Dollar sign
\@ At sign
\\ Backslash
Standard Variables
The following tables detail the various standard variables in the Perl language.
Global Variables
Variable Use
$_ The default input and pattern-searching space.
$. The current input line number of the last filehandle read.
$/ The input record separator (newline is the default).
$, The output field separator for the print operator.
$" The separator joining elements of arrays interpolated in strings.
$\ The output record separator for the print operator.
$? The status returned by the last `...` command, pipe close, or system operator.
$] The Perl version number.
$; The subscript separator for multidimensional array emulation (default is \034).
$! In a numeric context, is the current value of errno. In a string context, is the
corresponding error string.
$@ The Perl error message from the last eval or do command.
$: The set of characters after which a string may be broken to fill continuation
fields in a format.
$0 The name of the file containing the Perl script being executed.
$$ The process ID of the currently executing Perl program.
$< The real user ID of the current process.
$> The effective user ID of the current process.
$( The real group ID of the current process.
$) The effective group ID of the current process.
$^A The accumulator for formline and write operations.
$^D The debug flags; passed to Perl using the -D command line argument.
$^F The highest system file descriptor.
$^I In-place edit extension, passed to Perl using the -i command line argument.
$^L Formfeed character used in formats.
$^P Internal debugging flag.
$^T The time (as delivered by time) when the program started. Value is used by the
file test operators -M, -A, and -C.
$^W The value of the -w command line argument.
$^X The name used to invoke the current program.
Context-Dependent Variables
Variable Use
$% The current page number of the current output channel.
$= The page length of the current output channel. (Default is 60.)
$- The number of lines remaining on the page.
$~ The name of the current report format.
$^ The name of the current top-of-page format.
$| Used to force a flush after every write or print on the current output channel.
Set to nonzero to force flush.
$ARGV The name of the current file when reading from <>.
Localized Variables
Variable Use
$& The string matched by the last successful pattern match.
$` The string preceding what was matched by the last successful pattern match.
$' The string following what was matched by the last successful pattern match.
$+ The last bracket matched by the last search pattern.
$1...$9 Contains the subpatterns from the corresponding parentheses in the last
successful pattern match. (Subpatterns greater than $9 are available if the match
contained more than 9 matched subpatterns.)
Special Arrays
Array Use
@ARGV Contains the command-line arguments for the program. Does not include
the command name.
@EXPORT Names of methods a package exports by default.
@EXPORT_OK Names of methods a package can export upon explicit request.
@INC Contains a list of places to look for Perl scripts for require or do.
@ISA Contains a list of the base classes of a package.
@_ Contains the parameter array for subroutines. Also used by split (not in
array context).
%ENV Contains the current environment.
%INC Contains a list of files that have been included with require or do. The
key to each entry is the filename of the inclusion, and the value is the location
of the actual file used. (The require command uses this array to
determine if a particular file has already been included or not.)
%OVERLOAD Overload operators in a package.
%SIG Sets signal handlers for various signals.
Statements
The following tables detail the various statements present in the Perl language.
Function/Statement Use
package name Designates the remainder of the current block as a package.
require expr Can be used in multiple contexts: If expr is numeric, statement
requires Perl to be at least the version in expr. If expr is nonnumeric,
it indicates a name of a file to be included from the Perl library.
(The .pm extension is assumed if none is given.)
return expr Returns from a subroutine with the value specified.
sub name { expr ; ... } Designates name as a subroutine. Parameters are passed by reference
as array @_. Returns the value of the last expression evaluated
in the subroutine or the value indicated with the return
statement.
[ sub ] BEGIN { expr ; ... } Defines a setup block to be called before execution of the rest of
the script.
[ sub ] END { expr ; ... } Defines a cleanup block to be called upon termination of the script.
tie var, package, [ list ] Ties a variable to a package that will handle it.
untie var Breaks the binding between var and its package.
use module [ [ version ] list ] Imports semantics from module into the current package.
Statement Use
for (init_expr; cond_expr; loop_expr) {
    # loop code
}
The for loop is a complex loop structure typically used to iterate over a sequence of numbers (for example, 1–10). At the start of the loop, init_expr is evaluated and then cond_expr is evaluated. The loop executes as long as cond_expr remains true, evaluating loop_expr before the second and subsequent iterations. For example, the following loop executes 10 times, assigning the variable $x values of 1 through 10:
for ($x = 1; $x <= 10; $x++) {
    # loop code
}
foreach [ var ] (mixed) {
    # loop code
}
Performs the loop code once for every item in mixed, assigning the variable var to each item in turn. For example, the following code will output all the values in array @arr:
foreach $item (@arr) {
    print "$item\n";
}
redo [ label ] Causes the loop to redo the current iteration of the loop
(not evaluating the conditional statement in the process).
If label is specified, it performs the action on the appropriately
labeled loop instead of the current one.
until (expr) {
    # statement(s) to execute
    # until expr is true
} [ continue {
    # statements to do at end
    # of loop or explicit continue
} ]
Performs the loop code until expr evaluates to true. Note that because the conditional statement is at the beginning of the loop, the loop code may not execute (if expr is initially true).
while (expr) {
    # statement(s) to execute
} [ continue {
    # statements to do at end
    # of loop or explicit continue
} ]
Performs the loop code while expr evaluates to true. Note that because the conditional statement is at the beginning of the loop, the loop code may not execute (if expr is initially false).
Functions
The following tables detail the various default functions available in the Perl language.
Arithmetic Functions
Function Use
abs expr Returns the absolute value of expr.
atan2 x,y Returns the arctangent of x/y.
cos expr Returns the cosine of expr.
exp expr Returns e (the natural logarithm base) to the power of expr.
int expr Returns the integer portion of expr.
log expr Returns the natural logarithm of expr.
rand [ expr ] Returns a random number between 0 and the value of expr. If expr is
omitted, returns a value between 0 and 1.
sin expr Returns the sine of expr.
sqrt expr Returns the square root of expr.
srand [ expr ] Sets the random number seed for the rand operator.
time Returns the number of seconds since January 1, 1970.
Conversion Functions
Function Use
chr expr Returns the character represented by the decimal value expr.
gmtime expr Returns a 9-element array (0 = $sec, 1 = $min, 2 = $hour, 3 = $mday, 4 =
$mon, 5 = $year, 6 = $wday, 7 = $yday, 8 = $isdst) with the time formatted
for the Greenwich time zone. Note that expr should be in a form
returned from a time function.
hex expr Returns the decimal value of expr, with expr interpreted as a hex string.
localtime expr Returns a 9-element array (0 = $sec, 1 = $min, 2 = $hour, 3 = $mday, 4 =
$mon, 5 = $year, 6 = $wday, 7 = $yday, 8 = $isdst) with the time formatted
for the local time zone. Note that expr should be in a form
returned from a time function.
oct expr Returns the decimal value of expr, with expr interpreted as an octal
string. If expr begins with 0x, expr is interpreted as a hex string instead
of an octal string.
ord expr Returns the ASCII value of the first character of expr.
vec expr, offset, bits Using expr as a vector of unsigned integers, returns the element at offset,
where each element is bits wide. Note that bits must be a power of two from 1 to 32.
Structure Conversion
Function Use
pack template, list Returns a binary structure, packing the list of values using template.
See the template listing in the next table.
unpack template, expr Returns an array unpacking the structure expr, using template.
See the template listing in the next table.
For the pack and unpack functions, template is a sequence of characters containing the characters in
the following table.
Character Use
B A bit string (descending bit order inside each byte).
h A hex string (low nybble first).
H A hex string (high nybble first).
c A signed char value.
C An unsigned char value. (See U for Unicode chars.)
s A signed short value.
S An unsigned short value.
i A signed integer value.
I An unsigned integer value.
l A signed long value.
L An unsigned long value.
n An unsigned short in “network” (big-endian) order.
N An unsigned long in “network” (big-endian) order.
v An unsigned short in “VAX” (little-endian) order.
V An unsigned long in “VAX” (little-endian) order.
q A signed quad (64-bit) value.
Q An unsigned quad value.
j A Perl internal signed integer value (IV).
J A Perl internal unsigned integer value (UV).
f A single-precision float (native format).
d A double-precision float (native format).
F A floating-point value (native format).
D A long double-precision float (native format).
p A pointer to a null-terminated string.
P A pointer to a structure (fixed-length string).
u A uuencoded string.
U A Unicode character number.
w A BER compressed integer.
x A null byte.
X Back up a byte.
@ Null fill to absolute position, counted from the start of the innermost
group.
( Start of a group.
) End of a group.
Each character can be followed by a decimal number that is used as a repeat count. An asterisk specifies all
remaining arguments. If the format begins with %N, the unpack function will return an N-bit checksum.
Spaces can be included in the template for legibility — they are ignored when the template is processed.
String Functions
Function Use
chomp string|list Removes the trailing record separator (as set in $/) from string
or all elements of list. Returns the total number of characters
removed.
chop list Removes the last character from all elements of list. Returns the
last character removed.
crypt string, salt Encrypts string.
eval expr Parses expr and executes it as if it contained Perl code. Returns
the value of the last expression evaluated.
index string, substr [, offset ] Returns the position of substr in string at or after offset. If substr
is not found, index returns -1.
length expr Returns the length of expr in characters.
lc expr Returns a lowercase version of expr.
lcfirst expr Returns expr with the first character in lowercase.
quotemeta expr Returns expr with all regular expression metacharacters
quoted.
rindex string, substr [, offset ] Returns the position of the last substr in string at or before offset.
substr expr, offset [, len ] Extracts a substring of length len out of expr and returns it. If
offset is negative, substr counts from the end of the string.
uc expr Returns an uppercase version of expr.
ucfirst expr Returns expr with the first character in uppercase.
Function Use
splice @array, offset [, length [, list ]] Removes the elements of @array designated by offset and length
and replaces them with list (if specified). Returns the elements
removed from @array.
split [ pattern [, expr [, limit ]]] Splits a string into an array and returns the array. If limit is
specified, split creates at most the number of fields specified. If
pattern is omitted, the string is split at white space. If split is
not in array context, it returns the number of fields and splits to @_.
unshift @array, list Prepends list to the front of @array, and returns the number
of elements in the new array.
values %hash Returns an array containing all the values of hash.
The following table explains the options mentioned in the preceding “Search and Replace Functions” table.
Test Use
-p File is a named pipe (FIFO), or filehandle is a pipe.
-S File is a socket.
-b File is a block special file.
-c File is a character special file.
-t Filehandle is opened to a tty.
-u File has a setuid bit set.
-g File has a setgid bit set.
-k File has a sticky bit set.
-T File is an ASCII text file.
-B File is a binary file (opposite of -T).
-M Script start time minus file modification time (in days).
-A Script start time minus access time (in days).
-C Script start time minus inode change time (in days).
File Operations
Function Use
chmod list Changes the permissions of the files in list. The first element of list is
the mode to use.
chown list Changes the owner and group of the files in list. The first two elements
of the list are the numerical userid and groupid to set.
truncate file, size Truncates file to size. The file can be a filename or a filehandle.
link oldfile, newfile Creates newfile as a link to oldfile.
lstat file Identical to the stat function, but lstat does not traverse symbolic
links.
mkdir directory, mode Creates directory with permissions in mode. Sets $! if operation fails.
readlink expr Returns the value of a symbolic link. Sets $! on system error, uses $_ if
expr is omitted.
rename oldname, newname Changes the name oldname to newname.
stat file Returns a 13-element array where 0 = $dev, 1 = $ino, 2 = $mode, 3 =
$nlink, 4 = $uid, 5 = $gid, 6 = $rdev, 7 = $size, 8 = $atime, 9 =
$mtime, 10 = $ctime, 11 = $blksize, 12 = $blocks. Note that file
can be a filehandle, an expression evaluating to a filename, or _
(underline filehandle), which will use the file referred to in the last file
test operation or stat call. Returns a null list on failure.
symlink oldfile, newfile Creates newfile symbolically linked to oldfile.
getc [ filehandle ] Returns the next character from filehandle, or an empty string if
EOF. Reads from STDIN if filehandle is not specified.
ioctl filehandle, function, $var Performs ioctl(2) on filehandle, using the supplied parameters,
with nonstandard return values.
open filehandle [, filename ] Opens a file and associates it with filehandle. If filename is not
specified, the scalar variable filehandle must contain the filename.
Returns true on success or undef on failure.
read filehandle, $var, length [, offset ] Reads length binary bytes from filehandle into $var at offset. Returns
the number of bytes read.
seek filehandle, position, whence Arbitrarily positions the file pointer. Returns true if the operation
was successful.
select [ filehandle ] Returns the current default filehandle. If filehandle is specified, it
becomes the current default filehandle.
select rbits, wbits, nbits, timeout Performs a select(2) system call with the parameters specified.
tell [ filehandle ] Returns the current file pointer position for filehandle. Assumes the
last file accessed if filehandle is omitted.
write [ filehandle ] Writes a formatted record to filehandle, using the data format associated
with that filehandle.
Directory Functions
Function Use
closedir dirhandle Closes a directory opened by opendir.
opendir dirhandle, dirname Opens dirname on the dirhandle specified.
readdir dirhandle Returns the next entry or an array of entries from dirhandle.
rewinddir dirhandle Positions the directory pointer at the beginning of the dirhandle list.
seekdir dirhandle, pos Sets the directory pointer on dirhandle to pos.
telldir dirhandle Returns the directory pointer position in the dirhandle list.
System Functions
Function Use
alarm expr Schedules a SIGALRM after expr seconds.
chdir [ expr ] Changes the working directory to expr. If expr is omitted, chdir
uses $ENV{“HOME”} or $ENV{“LOGDIR”}.
chroot dirname Changes the root directory to dirname for the process and its
children.
die [ list ] Prints list to STDERR and exits with the current value of $!.
exec list Executes the system command(s) list. Does not return.
exit [ expr ] Exits the program immediately with the value of expr. Calls
appropriate end routines and object destructors before exiting.
fork Performs a fork(2) system call. Returns the process ID of the
child to the parent process and 0 to the child process.
getlogin Returns the effective login name.
getpgrp [ pid ] Returns the process group for process pid. If pid is 0 or omitted,
getpgrp returns the process group of the current process.
getpriority which, who Returns the current priority for a process, a process group, or a
user.
glob pattern Returns a list of filenames that match the pattern pattern.
kill list Sends a signal to the processes in list. The first element of the list
is the signal to send in numeric or name form.
setpgrp pid, pgrp Sets the process group to pgrp for the process specified by pid. If
pid is omitted or 0, setpgrp uses the current process.
setpriority which, Sets the current priority for a process, a process group, or a user.
who, priority
sleep [ expr ] Causes the program to sleep for expr seconds. If expr is omitted,
the program sleeps forever. Returns the number of seconds slept.
syscall list Calls a system call. The first element in list is the system call; the
rest of list is used as arguments.
system list Similar to exec except that a fork is performed first, and the par-
ent process waits for the child process to complete.
times Returns a four-element array giving the user and system times, in
seconds, for this process and the children of this process (0=
$user, 1= $system, 2= $cuser, 3= $csystem).
umask [ expr ] Sets the umask for the process. Returns the old umask. Omitting
expr causes the current umask to be returned.
wait Behaves like the wait(2) system call: waits for a child process
to terminate. Returns the process ID of the terminated process (-1
if none). The status is returned in $?.
waitpid pid, flags Performs the same function as the waitpid(2) system call.
warn [ list ] Similar to die, warn prints list on STDERR but doesn’t exit.
Networking Functions
Function Use
accept newsocket, genericsocket Accepts a new socket similar to the accept(2) system
call.
bind socket, name Binds name to socket. The name should be a packed
address of an appropriate type for socket.
connect socket, name Attempts to connect socket to the address name, similar to the
connect(2) system call.
getpeername socket Returns the socket address of the other end of socket.
getsockname socket Returns the name of socket.
getsockopt socket, level, Returns the socket option identified by optionname,
optionname queried at level.
listen socket, queuesize Similar to the listen system call, starts listening on
socket. Returns true or false depending on success.
recv socket, scalar, Attempts to receive length characters of data into scalar
length, flags from the specified socket. Specified flags are the same as
the recv system call.
send socket, msg, flags [ , to ] Attempts to send msg to socket. Takes the same flags as
the send system call. Use to when necessary to specify
an unconnected socket.
setsockopt socket, level, Sets the socket option optionname to optionvalue using the
optionname, optionvalue level specified.
shutdown socket, method Shuts down a socket using the specified method. The
method can be any valid method for the shutdown sys-
tem call.
socket socket, domain, Similar to the socket system call, creates a socket in
type, protocol domain with the type and protocol specified.
socketpair socket1, socket2, Similar to socket but creates a pair of bidirectional
domain, type, protocol sockets.
Miscellaneous Functions
Function Use
defined expr Tests whether expr has an actual value.
do filename Executes filename as a Perl script.
dump [ label ] Performs an immediate core dump to a new binary executable. When
new binary runs, execution starts at optional label or at the beginning
of the executable if label is not specified.
eval { expr1 ; Evaluates and executes any code between the braces ({ and }).
expr2; ... exprN}
ref expr Tests expr and returns true if expr is a reference. Returns a package
name if expr has been blessed into a package. (See the “Subroutines,
Packages, and Modules” table in the “Statements” section.)
reset [ list ] Resets all variables and arrays that begin with a letter in list.
scalar expr Evaluates expr in scalar context.
undef [ value ] Undefines value. Returns undefined.
wantarray Tests the current context to see if an array is expected. Returns true if
array is expected or false if array is not expected.
Regular Expressions
The following tables provide information used with Perl’s regular expressions and regex handling
functions.
Escape Characters
Escaped Character Use
\w Matches alphanumeric (including underscore).
\W Matches nonalphanumeric.
\s Matches white space.
\S Matches non–white space.
\d Matches numeric.
\D Matches nonnumeric.
\A Matches the beginning of the string.
\Z Matches the end of the string.
\b Matches word boundaries.
\B Matches nonword boundaries.
\G Matches where a previous m//g search left off.
\1 ... \9 Refer to previously matched subexpressions (grouped with
parentheses inside the match pattern).
Note: \10 and up can be used if the pattern matches more than nine
subexpressions.
Python Language Reference
This appendix lists various functions and variables available for CGI programming in Python.
Their syntax and general use are shown, and short examples are included, where necessary, for
clarity. Although care has been taken to include those functions most likely to be used in CGI pro-
gramming, Python is evolving, so older code might include deprecated functions that aren’t listed
here. Because Python is highly modularized, the functions and variables are grouped by the mod-
ule to which they belong. In most cases, the module must be imported before the functions listed
become available.
Built-in Functions
The following sections cover the functions built into Python. These functions are always available
and do not require that a specific module be imported for their use.
Syntax Description
__import__(mod) Imports the module represented by the string mod,
especially useful for dynamically importing a list
of modules:
myModules = ['sys','os','cgi','cgitb']
modules = map(__import__,myModules)
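basestring Abstract superclass of str and unicode. It cannot be called or
instantiated, but it can be used to test whether an object is an
instance of str or unicode:
if isinstance(obj, basestring):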
bool([x]) Returns True or False depending on the truth value of x. If x
is false or omitted, returns False; otherwise,
returns True.
callable(obj) Returns 1 if obj can be called; otherwise, returns 0.
chr(i) Returns a string of one character whose ASCII code is the
integer i.
classmethod(func) Returns a class method for func in the following format:
class C:
    @classmethod
    def func(cls, arg1, arg2, ...): ...
cmp(a,b) Compares values a and b, returning a negative value if a
< b, 0 if a == b, and a positive value if a > b.
compile(string, filename, Compiles string into a code object. filename is the file
kind[,flags[,dont_inherit]]) from which the code was read. kind denotes what kind of code to
compile: 'exec', 'eval', or 'single'.
complex([real[,imag]]) Returns a complex number with value real+imag*j.
delattr(obj,string) Removes the attribute of obj whose name is string.
dict([mapping or sequence]) Returns a dictionary whose initial value is set to mapping
or sequence. Returns empty dictionary if no mapping or
sequence is provided.
dir(obj) Returns attributes and methods of obj. Works on nearly
any data type.
divmod(a,b) Returns quotient and remainder of two noncomplex
numbers, a and b.
enumerate(iterable) Returns an enumerate object based on the specified iter-
able object.
eval(expression[,globals[,locals]]) Returns evaluation of expression by Python rules using
globals (which must be a dictionary) as global namespace
and locals (which can be any mapping object) as local
namespace.
execfile(filename[,globals[,locals]]) Parses filename and evaluates it as a series of Python
statements, using globals and locals as the global and local
namespaces. globals and locals are dictionaries. If locals is
omitted, it defaults to globals.
file(filename[,mode[,bufsize]]) Returns new file object of specified mode. Modes are r for
reading, w for writing, and a for appending: + appended
to mode indicates that file is changeable, and b appended
to mode indicates that file is to be binary. bufsize may be 0
for unbuffered, 1 for line buffered, or any other positive
value for a specific buffer size.
filter(function, list) Returns a list of the items of list for which function returns true.
float([x]) Returns string or number x as converted to a floating-point
value. If no argument x is given, returns 0.0.
frozenset([iterable]) Returns set (with elements taken from iterable) that has no
update methods but can be hashed and used as the mem-
ber of other sets or as dictionary keys (called a frozen set).
getattr(obj,name[,default]) Returns the value of the named attribute of obj or default if
it does not exist.
globals() Returns a dictionary containing the current global symbol
table.
hasattr(obj, name) Returns True if name is the name of one of obj’s attributes;
otherwise, returns False.
hash(obj) Returns the hash value of obj.
help([obj]) Invokes the built-in help system.
hex(x) Converts x to a hexadecimal value.
id(obj) Returns the unique integer representing obj.
input([prompt]) Returns the equivalent of eval(raw_input(prompt)).
int([x[,radix]]) Returns x converted to an integer. If x is a string, radix gives
the base in which it is interpreted.
isinstance(obj, classinfo) Returns True if obj is an instance of classinfo; otherwise,
False. For example, if issubclass(A,B), then
isinstance(x,A) => isinstance(x,B).
long([x[,radix]]) Returns a long integer converted from x to base radix. x
may be a string or a regular or long integer or a floating-
point number. A floating-point number is truncated
toward zero.
map(func, list) Returns list of results from applying func to every mem-
ber of list.
max(s[, args]) Returns the largest member of sequence s. If additional
args are specified, returns the largest of args.
min(s[, args]) Returns the smallest member of sequence s. If additional
args are specified, returns the smallest of args.
object() Returns a new featureless object.
oct(x) Returns octal form of integer x.
open(filename[,mode[,bufsize]]) Alias for the file function.
ord(c) Returns the integer ASCII code of the one-character string c, or the
Unicode code point if c is a Unicode character. ord('c') returns
the integer 99.
pow(x,y[,z]) Returns x to the power y; if z is present, returns x to the
power y, modulo z.
property([fget[,fset[,fdel[,doc]]]]) Returns a property attribute for a new-style class with
functions to get, set, and delete an attribute.
class C(object):
    def getx(self): return self.__x
    def setx(self, val): self.__x = val
    def delx(self): del self.__x
    x = property(getx, setx, delx, "doc of the 'x' property.")
range([start,]stop[,step]) Creates a progression list of plain integers often used for
loops. If all arguments are specified, the list looks like
[start,start+step,start+step+step,...].
raw_input(prompt) Prints prompt to standard output with no newline. Then
reads the next line from standard input and returns it
(without a newline).
>>> s = raw_input('prompt> ')
prompt> Phrase I typed in.
>>> s
'Phrase I typed in.'
reduce(function, sequence[,initializer]) Applies function of two arguments cumulatively to
items of sequence, from left to right, for the purpose of
reducing sequence to a single value.
str(x) Converts x into string form. Works on any available
data type.
sum(sequence[, start]) Returns the sum of start (which defaults to 0) and the items of
sequence. The items may not be strings.
super(type[,obj or type]) Returns the superobject of type. If the second argument is
an object, isinstance(obj, type) must be true. If the second
argument is a type, issubclass(type2,type) must be true.
tuple([sequence]) Returns a tuple whose items are the same and in the
same order as those in sequence.
type(obj) Returns datatype of obj. Works on any available data type.
unichr(i) Returns the Unicode string of one character whose Unicode
code is the integer i. For example, unichr(99)
returns the string u'c'.
unicode([obj[,encoding[,errors]]]) Returns the Unicode version of obj. If no optional
parameters are specified, this function behaves like the
str() function except that it returns Unicode strings
instead of eight-bit strings. If encoding and/or errors are
given, unicode() decodes the object, which can
be either an eight-bit string or a character buffer, using
the codec for encoding.
vars([obj]) Without arguments, this function returns a dictionary
corresponding to the current local symbol table. If an
object (a module, class, or class instance) is specified, this
function returns a dictionary corresponding to that object's symbol table.
xrange([start,]stop[,step]) This function is very similar to range(), but returns an
"xrange object" instead of a list.
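A few of the built-ins from the preceding table can be tried together in a short script. The following is a minimal sketch rather than a complete tour; the variable names are illustrative only, and the fallback import assumes that releases newer than those covered here provide reduce() in the functools module instead of as a built-in.

```python
# Illustrative only: the variable names are not taken from the table above.
try:
    reduce                          # a built-in on the Python 2 releases covered here
except NameError:
    from functools import reduce    # assumption: newer releases keep it in functools

quotient, remainder = divmod(17, 5)          # 17 == 3 * 5 + 2
indexed = list(enumerate(['a', 'b', 'c']))   # (index, item) pairs
modular = pow(2, 10, 1000)                   # (2 ** 10) % 1000
code = ord('c')                              # character to integer code
char = chr(99)                               # integer code back to a character
from_hex = int('ff', 16)                     # parse a string in base 16

# getattr() returns the default when the attribute does not exist.
fallback = getattr(object(), 'no_such_attribute', 'default')

# reduce() folds a sequence into one value; filter() keeps matching items.
total = reduce(lambda a, b: a + b, [1, 2, 3, 4], 0)
evens = list(filter(lambda n: n % 2 == 0, range(10)))
```

The list() calls are redundant on releases where filter() already returns a list, but they keep the sketch independent of the interpreter version.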
array Module
These functions define functionality in support of the array object type, which can represent an array of
characters, integers, and floating-point numbers. Arrays are sequence types, behave very much like lists,
and are supported by the following functions:
Syntax Description
array(typecode[,initializer]) Returns a new array whose items are restricted by typecode and
initialized from the optional initializer value, which must be a list
or string in versions prior to 2.4 but may also be an iterable
over elements of the appropriate type. The typecode is a character
that defines the item type.
append(x) Appends a new item of value x to the end of the array.
buffer_info() Returns a tuple (address, length) giving the current memory
address and the length in elements of the buffer used to hold the
array's contents.
byteswap() Swaps the byte order (endianness) of every item in the array. Supported
only for items that are 1, 2, 4, or 8 bytes in size.
count(x) Returns the number of times x occurs in the array.
extend(iterable) Appends items from iterable (which prior to version 2.4 had to be
another array but has been changed to include any iterable con-
taining elements of the same type as those in array) to end of
array.
fromfile(file,num) Appends num items from file to array, or raises EOFError if fewer
than num are available.
fromlist(list) Appends items from list to array; equivalent to “for x in list:
array.append(x)”.
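The array methods above can be combined as in the following minimal sketch. The typecode 'i' (signed integer) and the sample values are assumptions chosen for illustration.

```python
from array import array

# 'i' restricts the array to signed integers; the initializer is a list.
numbers = array('i', [1, 2, 3])

numbers.append(4)                      # one new item at the end
numbers.extend(array('i', [2, 2]))     # items from another array of the same type
numbers.fromlist([5, 6])               # same effect as appending each list item

twos = numbers.count(2)                # how many times 2 occurs
address, length = numbers.buffer_info()  # length is measured in elements
```

Passing another array to extend() works on every release; passing an arbitrary iterable relies on the version 2.4 behavior described in the table.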
asyncore Module
This module provides the basic functionality in support of writing asynchronous socket service clients
and servers.
loop([timeout[,use_poll[,map[,count]]]]) Enters a polling loop that terminates after count
passes or after all open channels have been closed.
The use_poll parameter, which defaults to False,
can be set to True to indicate that the poll()
should be used in preference to select(). timeout
specifies the number of seconds before the
select() or poll() function should timeout. map
is a dictionary of channels to watch. If the map
parameter is omitted, a global map is used. This
map is updated by the default class __init__() —
make sure you extend, rather than override,
__init__() if you want to retain this behavior.
recv(buffer_size) Reads at most buffer_size bytes from the socket's remote
end-point. If the returned data is an empty string, the connection
has been closed from the remote end.
listen(backlog) Listens for backlog connections made to the socket. back-
log should be at least one and not more than the number
allowed by the operating system.
bind(address) Binds the socket to address as long as it is not already
bound.
accept() Accepts a connection for a socket that is bound to an
address and is listening for connection.
close() Closes the socket.
asynchat Module
This module builds on the basic functionality of the asyncore module for the purpose of simplifying
asynchronous communication between clients and servers. It especially helps with protocols whose ele-
ments are terminated by arbitrary strings or are of variable length.
Syntax Description
CLASS async_chat() Abstract subclass of asyncore.dispatcher. To make practical use of
it, you should subclass async_chat, providing meaningful
collect_incoming_data() and found_terminator() methods.
close_when_done() Pushes a None on the producer fifo, which, when popped off,
causes the channel to be closed.
collect_incoming_ Called with data holding some amount of received data. The
data(data) default method, which must be overridden, raises a
NotImplementedError exception.
discard_buffers() Discards any data held in the input or output buffers and the pro-
ducer fifo.
found_terminator() Called when the incoming data stream matches the termination
condition set by the set_terminator() method.
get_terminator() Returns the current terminator for the channel.
handle_close() Called when the channel is closed.
handle_read() Called when a read event fires on the channel’s socket in the asyn-
chronous loop. By default, it checks the termination condition set by
set_terminator(), which can be the appearance of a particular
string as input or the receipt of a particular number of characters,
and upon finding it, calls found_terminator().
handle_write() Called when the application may write to the channel.
handle_write() calls the initiate_send() method.
binascii Module
The following section covers the functions in Python’s binascii module. The binascii module pro-
vides functionality for conversion between binary and various ASCII-encoded representations.
a2b_uu(string) Returns the binary data converted from the single line of uuencoded
data string.
b2a_uu(data) Returns a line of ASCII characters ending in a newline, as con-
verted from binary data.
a2b_base64(string) Returns binary data as converted from the block of base64 data
specified by string.
b2a_base64(string) Returns a line of ASCII characters in base64 coding ending in a
newline, as converted from a 57-character or shorter binary string.
a2b_qp(string[, header]) Returns binary data as converted from a block of quoted-printable
data. If the optional argument header is present and true, under-
scores will be decoded as spaces.
b2a_qp(data[,quotetabs, Returns a line or lines of ASCII characters in quoted-printable
istext, header]) format as converted from a line or lines of binary data. If the
optional argument header is present and true, spaces will be
encoded as underscores.
a2b_hqx(string) Converts binhex4-formatted ASCII data string to binary, without
doing RLE-decompression.
rledecode_hqx(data) Performs RLE-decompression on the data and returns the decom-
pressed data. The decompression algorithm uses 0x90 after a byte
as a repeat indicator, followed by a count. A count of 0 specifies a
byte value of 0x90. The routine returns the decompressed data,
unless the input data ends in an orphaned repeat indicator, in
which case the Incomplete exception is raised.
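The paired a2b_ and b2a_ functions are inverses of each other, which the following minimal sketch illustrates with a round trip. The sample payload is hypothetical, and the b'' prefix is an assumption about the interpreter: it is accepted by Python 2.6 and later, whereas older releases would use a plain string.

```python
import binascii

data = b'Web standards'                 # hypothetical sample payload (13 bytes)

# Binary data to one line of base64 text, and back again.
encoded = binascii.b2a_base64(data)     # the returned line ends in a newline
decoded = binascii.a2b_base64(encoded)

# Binary data to one uuencoded line, and back again.
uu_line = binascii.b2a_uu(data)         # the returned line ends in a newline
uu_back = binascii.a2b_uu(uu_line)
```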
cgi Module
The following section covers the functions in Python’s Common Gateway Interface module. This mod-
ule defines a number of utilities for use by CGI scripts written in Python.
Syntax Description
parse(file[,keep_blanks Parses a query from the specified file or from sys.stdin if none
[,strict_parsing]]) is specified. For details on the keep_blanks and strict_parsing
parameters, see the parse_qs function.
parse_qs(querystr[,keep_ Returns a dictionary containing data from the parsed query string
blanks[,strict_parsing]]) querystr. The keep_blanks parameter is a flag indicating whether
blank values in URL encoded queries should be treated as blank
strings. The strict_parsing parameter indicates whether or not to
raise an exception when parsing errors are found.
parse_qsl(querystr[,keep_ Returns a list of name,value pairs from the parsed query string
blanks[,strict_parsing]]) querystr. The keep_blanks parameter is a flag indicating whether
blank values in URL encoded queries should be treated as blank
strings. The strict_parsing parameter indicates whether or not to
raise an exception when parsing errors are found.
parse_multipart(file,pdict) Parses multipart/form-data input, as used for file uploads. The
arguments are file, the input file, and pdict, a dictionary containing other
parameters from the Content-Type header. Returns a dictionary just like the
parse_qs() function: keys are the field names, and each value is a list of
values for that field.
parse_header(string) Parses MIME Header string into a main value and a dictionary of
parameters.
test() Writes minimal HTTP headers and formats all information pro-
vided to the script in HTML form for use in testing.
print_environ() Formats the shell environment in HTML format.
print_form() Formats a form in HTML.
print_directory() Formats the current directory in HTML.
print_environ_usage() Prints a list of cgi environment variables in HTML.
escape(s[,quote]) Converts the characters &, <, and > in string s to HTML-safe
sequences. If quote is True, double-quotes are translated as well.
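The difference between parse_qs() and parse_qsl(), and the effect of the keep_blanks flag, can be seen in the following minimal sketch. The query string is a made-up example. The import fallback is an assumption about the interpreter in use: newer releases provide these two functions in urllib.parse, while the releases covered here provide them in cgi.

```python
try:
    from urllib.parse import parse_qs, parse_qsl   # assumption: newer releases
except ImportError:
    from cgi import parse_qs, parse_qsl            # releases covered by this appendix

query = 'name=Steve&lang=perl&lang=python&empty='  # hypothetical query string

# Dictionary form: every value is a list, and blank values are dropped.
fields = parse_qs(query)

# Passing a true second argument keeps blank values as empty strings.
with_blanks = parse_qs(query, True)

# List form: (name, value) pairs in the order they appear in the query.
pairs = parse_qsl(query)
```

The flag is passed by position here so that the sketch does not depend on the parameter's keyword name.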
cgitb Module
The following section covers the functions in Python’s CGI Traceback module. Although this module
was originally developed to provide extensive traceback information in HTML for troubleshooting
Python CGI scripts, it has more recently been generalized to also provide information in plain text.
Output includes a traceback showing excerpts of the source code for each level, as well as the values of
the arguments and local variables to currently running functions to assist in debugging. To use this
module, add the following to the top of the script to be debugged:
import cgitb; cgitb.enable()
enable([display[,logdir This function causes the cgitb module to take over the
[,context[,format]]]]) interpreter’s default handling for exceptions by setting the value
of sys.excepthook. The display argument may be 1, which
enables sending the traceback to the browser, or 0 to disable it.
The logdir argument specifies to write tracebacks to files in the
directory named by logdir. The context value is the number of lines
of context to display around the current line of source code in the
traceback. The format option may be either “html” to format the
output to HTML or any other value that formats it as plain text.
handler([info]) This function handles an exception using the default settings
(show a report in the browser, but don’t log to a file). The optional
info argument should be a 3-tuple containing an exception type,
exception value, and traceback object.
Cookie Module
The Cookie module provides a mechanism for HTTP state management, primarily on the server side.
It supports simple string-only cookies and provides an abstraction for using any serializable data
type as a cookie value. The Cookie module originally applied the parsing rules described in the RFC
2109 and RFC 2068 specifications strictly, but modifications have made its parsing less strict. Due to
security concerns, two classes have been deprecated from this module: CLASS SerialCookie([input]) and
CLASS SmartCookie([input]). For backward compatibility, the Cookie module exports a class
named Cookie, which is just an alias for SmartCookie. This is probably a mistake and will likely be
removed in a future version. You should not use the Cookie class in your applications, for the same
reason you should not use the SerialCookie class.
Syntax Description
exception Exception raised when the cookie in question is invalid according
CookieError to RFC 2109.
CLASS BaseCookie This class is a dictionary-like object with keys that are strings and
([input]) values that are Morsel instances. Upon setting a key to a value,
the value is converted to a Morsel containing the key and the
value. If input is given, this class passes it to the load() method.
CLASS SimpleCookie This class derives from BaseCookie and overrides
([input]) value_decode() and value_encode() to be the identity and
str(), respectively.
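The relationship between a cookie object and its Morsel values is sketched below. The cookie names and values are hypothetical, and the import fallback assumes that newer releases provide the same SimpleCookie class in http.cookies rather than in Cookie.

```python
try:
    from http.cookies import SimpleCookie   # assumption: newer releases
except ImportError:
    from Cookie import SimpleCookie         # releases covered by this appendix

# Setting a key stores the value as a Morsel.
cookie = SimpleCookie()
cookie['session'] = 'abc123'
cookie['session']['path'] = '/'             # Morsels also carry cookie attributes

morsel = cookie['session']
header = cookie.output()                    # text suitable for an HTTP response header

# A string passed to the constructor is handed to load() and parsed.
incoming = SimpleCookie('theme=dark; session=abc123')
```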
cookielib Module
The cookielib module defines classes in support of the automatic handling of HTTP cookies, for
accessing Web sites that require cookies to be set on the client machine by an HTTP response from a Web
server and then to be returned to the server in later HTTP requests.
exception LoadError Error returned if the cookies fail to load from the spec-
ified file.
CLASS CookieJar(policy=None) The CookieJar class stores HTTP cookies. It extracts cookies
from HTTP requests and returns them in HTTP responses.
Instances of the CookieJar class automatically expire
contained cookies when necessary.
CLASS FileCookieJar(filename, A CookieJar that can load cookies from and save
delayload=None,policy=None) cookies to a file. Cookies are NOT loaded from the
named file until either the load() or revert()
method is called.
CLASS CookiePolicy() This class is responsible for deciding whether each
cookie should be accepted from or returned to the
server.
CLASS DefaultCookiePolicy(blocked_domains=None, allowed_domains=None, netscape=True,
rfc2965=False, hide_cookie2=False, strict_domain=False, strict_rfc2965_unverifiable=True,
strict_ns_unverifiable=False, strict_ns_domain=DefaultCookiePolicy.DomainLiberal,
strict_ns_set_initial_dollar=False, strict_ns_set_path=False)
Constructor arguments should be passed as keyword arguments only. blocked_domains is a
sequence of domain names from which cookies are never accepted and to which cookies are
never returned. allowed_domains is either None or a sequence of the only domains for which
cookies are accepted and returned.
CLASS Cookie() This class represents Netscape, RFC 2109, and RFC
2965 cookies.
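A policy object is normally created first and then handed to the jar, as in the following minimal sketch. The domain names are placeholders, is_blocked() is a DefaultCookiePolicy method that is not listed in the table above, and the import fallback assumes that newer releases provide the same classes in http.cookiejar.

```python
try:
    from http.cookiejar import CookieJar, DefaultCookiePolicy   # assumption: newer releases
except ImportError:
    from cookielib import CookieJar, DefaultCookiePolicy        # releases covered here

# Keyword arguments only, as the table recommends.
policy = DefaultCookiePolicy(blocked_domains=['ads.example.com'], rfc2965=True)

# The jar consults the policy whenever it accepts or returns a cookie.
jar = CookieJar(policy)

blocked = policy.is_blocked('ads.example.com')      # listed in blocked_domains
not_blocked = policy.is_blocked('www.example.org')  # not listed
```

A freshly created jar holds no cookies until it extracts some from a response.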
email Module
The following functions are available from the email module for use in setting and querying header
fields and for accessing message bodies. This module replaces the functionality of the mimetools mod-
ule from before Python version 2.3.
Syntax Description
CLASS Message() The basic message constructor. Message objects provide a
mapping style interface for accessing the message headers and
an explicit interface for accessing both the headers and the pay-
load, which can be either a string or a list of Message objects
for MIME container documents (for example, multipart/*
and message/rfc822).
as_string([unixfrom]) Returns the entire message flattened as a string. When optional
unixfrom is True, the envelope header is included in the
returned string. unixfrom defaults to False.
__str__() Equivalent to as_string with unixfrom set to True.
is_multipart() Returns True if the message’s payload is a list of sub-Mes-
sage objects; otherwise (if it is a string) returns False.
add_header(_name,_value, The add_header() method is similar to __setitem__()
**_params) except that additional header parameters can be provided as
keyword arguments. _name is the header field to add and
_value is the primary value for the header.
msg.add_header('Content-Disposition',
'attachment',filename='example.gif')
adds a header which looks like:
Content-Disposition: attachment;
filename="example.gif"
replace_header(_name,_value) Replaces the first header found in the message that matches
_name, retaining header order and field name case.
get_content_type() Returns the message’s content type in lowercase of the form
maintype/subtype or the default content type if there is no
Content-Type Header in the message.
get_content_maintype() Returns the message’s main content type. This is the maintype
part of the string returned by get_content_type().
get_content_subtype() Returns the message’s sub-content type. This is the subtype
part of the string returned by get_content_type().
get_default_type() Returns the default content type. Most messages have a default
content type of text/plain, except those that are subparts of
multipart/digest containers and have a default content type of
message/rfc822.
set_type(type[,header[,requote]]) Sets the main type and subtype for the Content-Type:
header where type is a string in the form maintype/subtype.
If requote is False, this leaves the existing header’s quoting
as is; otherwise, the parameters will be quoted.
get_filename([failobj]) Returns the value of the filename parameter of the
Content-Disposition: header of the message, or failobj if
either the header is missing or has no filename parameter.
get_boundary([failobj]) Returns the value of the boundary parameter of the
Content-Type: header of the message, or failobj if the header is
missing or has no boundary parameter.
set_boundary(boundary) Sets the boundary parameter of the Content-Type: header
to boundary.
get_content_charset([failobj]) Returns the charset parameter of the Content-Type:
header in lowercase if it exists; otherwise returns failobj.
get_charsets([failobj]) Returns a list containing the character set names in the mes-
sage, one element for each subpart of the payload or failobj if
no content header exists.
walk() This method is an all-purpose generator used to iterate over
all parts and subparts of the message object tree.
preamble MIME document format allows some text between the
blank line following the headers and the first multipart
boundary string. The preamble attribute contains this lead-
ing extra-armor text.
epilogue Text that appears between the last boundary and the end of
the message is stored in the epilogue attribute.
defects The defects attribute contains a list of all problems occurring
during message parsing.
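Several of the Message methods above can be seen working together in the following minimal sketch. The header values and payload are made up, set_payload() is a Message method not listed in the table, and the import fallback assumes that releases before 2.5 spell the module email.Message.

```python
try:
    from email.message import Message   # assumption: Python 2.5 and later
except ImportError:
    from email.Message import Message   # older releases

msg = Message()
msg['Subject'] = 'Monthly report'       # mapping-style access to headers
msg.add_header('Content-Disposition', 'attachment', filename='example.gif')
msg.set_type('text/plain')              # sets the Content-Type: header
msg.set_payload('Report body')          # a string payload, so not multipart

content_type = msg.get_content_type()
main_type = msg.get_content_maintype()
filename = msg.get_filename()
flattened = msg.as_string()
```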
file Object
The following section covers the functions available when the built-in function file() is called. The
file object returned has the inherent functionality described in the following table. These functions are
called like this:
fileobject.function()
Syntax Description
close() Closes the file.
flush() Flushes the internal file buffer.
fileno() Returns the integer “file descriptor” that is used by the underlying
implementation to request I/O operations from the operating system.
isatty() Returns True if the file is connected to a TTY (or TTY-like) device; otherwise,
returns False.
read([size]) Returns, in the form of a string object, size bytes from the file or fewer
if the read hits EOF before obtaining that many bytes. If the size argu-
ment is negative or omitted, returns all data until EOF is reached.
readline([size]) Reads and returns a line from the file, including the trailing newline
character.
readlines([size]) Reads and returns all lines from file as a list, including the trailing
newline characters.
seek(offset[,pos]) Sets the file's current position, like stdio's fseek(). The pos
argument defaults to 0, which selects absolute file positioning. Other
values are 1 (seek relative to the current position) and 2 (seek relative to
the end of the file).
tell() Returns the file’s current position.
truncate([size]) Truncates the file to at most size bytes. If size is omitted, truncates at the current position.
write(string) Writes string to the file.
writelines(sequence) Writes a sequence of strings to the file. The sequence can be any iter-
able object producing strings, typically a list of strings.
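A typical write-then-read sequence using these methods is sketched below. The scratch file name is hypothetical, and open() is used in place of file() on the assumption that the alias is the more portable spelling.

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'example.txt')

# Write two lines; writelines() does not add newlines itself.
f = open(path, 'w')
f.writelines(['first line\n', 'second line\n'])
f.close()

# Read the file back in several ways.
f = open(path, 'r')
first = f.readline()        # one line, trailing newline included
f.seek(0)                   # back to the start (absolute positioning)
everything = f.read()       # all data up to EOF
f.seek(0)
lines = f.readlines()       # list of lines, newlines included
f.close()
```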
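gc Module
The following functions, available from the gc module, provide an interface to Python's garbage collector.
Syntax Description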
enable() Enables the garbage collector.
disable() Disables the garbage collector.
isenabled() Returns True if the garbage collector is enabled.
collect() Runs a full garbage collection, examining all generations and returning
the number of unreachable objects found.
set_debug(flags) Sets the debugging flags for garbage collection. The resulting
debugging information is written to stderr. flags is a bitwise
combination of the DEBUG_ constants defined in the gc module.
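The effect of collect() is easiest to see with a deliberate reference cycle, as in the following minimal sketch. The Node class is hypothetical; automatic collection is switched off first so that the cycle is still waiting when collect() runs.

```python
import gc

was_enabled = gc.isenabled()     # remember the original setting
gc.disable()                     # no automatic collection from here on
disabled_state = gc.isenabled()

class Node(object):              # hypothetical class used only to build a cycle
    pass

# Two objects that refer to each other cannot be freed by reference counting.
a = Node()
b = Node()
a.partner = b
b.partner = a
del a, b

unreachable = gc.collect()       # full collection; returns the number of objects found

if was_enabled:
    gc.enable()                  # restore the original setting
```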
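httplib Module
The httplib module defines classes that implement the client side of the HTTP and HTTPS protocols.
Syntax Description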
CLASS HTTPConnection(host[,port]) An instance of this class represents one transaction with
an HTTP server. If port is not provided and host is of
the form host:port, the port to connect to is taken from
this string. If host does not contain this port section
and the port parameter is not provided, the connection
is made to the default HTTP port, usually 80.
request(method,url[,body[,headers]]) This will send a request to the server using the HTTP
request method method and the selector url. If the body
argument is present, it should be a string of data to
send after the headers are finished. The headers argu-
ment should be a mapping of extra HTTP headers to
send with the request.
getresponse() Should be called after a request is sent to get the
response from the server. Returns an HTTPResponse
instance.
set_debuglevel(level) Sets the default debugging level; defaults to no debug-
ging data printing out.
connect() Connects to the server specified when the object was
created.
close() Closes the connection to the server.
send(data) Sends specified data to the server. This method should
be called directly only after the endheaders() method
and before the getresponse() method.
putrequest(request,selector[,skip_host[,skip_accept_encoding]]) First call made to server after a connection has been
made. It sends a line to the server consisting of the
request string, the selector string, and the HTTP version.
skip_host and skip_accept_encoding are Boolean
variables.
putheader(header,arguments) Sends an RFC 822-style header to the server. It sends a
line to the server consisting of the header, a colon and
a space, and the first argument. If more arguments are
given, continuation lines are sent, each consisting of a
tab and an argument.
endheaders() Sends a blank line to the server, signaling the end of
the headers.
CLASS HTTPSConnection(host[,port,key_file,cert_file]) An instance of this class represents one transaction with
a secure HTTP server. If port is not provided and host
is of the form host:port, the port to connect to is taken
from this string. If host does not contain this port section
and the port parameter is not provided, the connection
is made to the default HTTPS port, usually
443. key_file is the name of a Privacy Enhanced Mail
(PEM) Security Certificate formatted file that contains
your private key. cert_file is a PEM formatted certificate
chain file.
CLASS HTTPResponse(sock[,debuglevel=0][,strict=0]) Class whose instances are returned upon successful
connection. Not instantiated directly.
read([byte]) Reads and returns the response body, or up to the next byte bytes if byte is given.
getheader(name[,default]) Gets the content of the header name, or default if there is no
matching header.
getheaders() Returns a list of (header, value) tuples.
msg Instance of mimetools.Message (deprecated) or
email.message.Message, which contains the response headers.
exception CannotSendHeader Subclass of ImproperConnectionState.
exception ResponseNotReady Subclass of ImproperConnectionState.
exception BadStatusLine Subclass of HTTPException raised when the server
responds with an unknown HTTP status code.
HTTP_PORT Variable holding default value for HTTP Port, 80.
HTTPS_PORT Variable holding default value for HTTPS Port, 443.
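The host:port rule described for HTTPConnection can be checked without any network traffic, because the constructor only records where to connect. The sketch below assumes the host and port instance attributes (which the table does not list) and falls back to the Python 3 module name http.client when httplib is unavailable; www.example.com is a placeholder host.

```python
try:
    import httplib                   # module name used in this appendix (Python 2)
except ImportError:
    import http.client as httplib    # assumption: the same classes under their Python 3 name

# Nothing is sent here; connect() or request() would open the socket.
explicit = httplib.HTTPConnection("www.example.com:8080")   # port taken from the host string
default = httplib.HTTPConnection("www.example.com")         # falls back to HTTP_PORT

print(explicit.host, explicit.port)
print(default.port == httplib.HTTP_PORT)
print(httplib.HTTPS_PORT)
```

A real exchange would continue with request("GET", "/") followed by getresponse(), which needs a reachable server.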
imaplib Module
The imaplib module defines three classes, IMAP4, IMAP4_SSL, and IMAP4_stream, which encapsulate
a connection to an IMAP4 server and implement a large subset of the IMAP4rev1 client protocol, as
defined in RFC 2060.
Syntax Description
CLASS IMAP4([host[,port]]) Initializes the instance thereby creating the connection and deter-
mining the protocol (IMAP4 or IMAP4rev1). If host is not specified,
localhost is used. If port is omitted, the standard IMAP4 port (143)
is used.
exception IMAP4.error Exception raised on any error.
exception IMAP4.abort Subclass of IMAP4.error, which is raised upon IMAP4 server
errors.
exception IMAP4.readonly Subclass of IMAP4.error, which is raised when a writable mail-
box has its status changed by the server.
CLASS IMAP4_SSL([host[,port[,keyfile[,certfile]]]]) This is a subclass derived from IMAP4 that connects over an
SSL-encrypted socket (to use this class, you need a socket module
that was compiled with SSL support).
CLASS IMAP4_stream(command) This is a subclass derived from IMAP4 that connects to the
stdin/stdout file descriptors created by passing command to
os.popen2().
mimetools Module
Deprecated. Use the email module now.
os Module
The following table covers the functions in Python’s os module. This module provides more portable
access to the underlying operating system functionality than the posix module. Extensions to particular
operating systems exist, but relying on them makes a program much less portable. These functions handle
tasks such as file processing, directory traversal, and access/permissions assignment.
Syntax Description
remove() Deletes the file.
unlink() Same as remove().
rename() Renames the file.
stat() Returns file statistics for the file.
lstat() Returns file statistics for a symbolic link.
symlink() Creates a symbolic link.
utime() Updates the timestamp for the file.
chdir() Changes the working directory.
listdir() Lists the entries in the specified directory.
getcwd() Returns the current working directory.
mkdir(dir) Creates directory as specified by dir.
makedirs() Same as mkdir(), except that any missing intermediate directories are created as well.
rmdir(dir) Removes directory as specified by dir.
removedirs() Same as rmdir(), except that parent directories are also removed for as long as they are empty.
access() Verifies permission modes for the file.
chmod() Changes permission modes for the file.
umask() Sets the mask that determines the default permission modes for newly created files.
basename() Removes the directory path and returns the leaf name.
dirname() Removes leaf name and returns directory path.
join() Joins separate components into a single pathname.
split() Returns a tuple containing a dirname() and a basename().
splitdrive() Returns tuple containing drivename and pathname.
splitext() Returns tuple containing filename and extension.
getatime() Returns last access time for file. This varies a bit with different operating
systems.
getmtime() Returns last file modification time for file.
getsize() Returns file size in bytes.
exists() Returns True if pathname, file, or directory exists.
isdir() Returns True if pathname exists and is a directory.
isfile() Returns True if pathname exists and is a file.
islink() Returns True if pathname exists and is a symbolic link.
samefile() Returns True if both pathnames refer to the same file.
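The directory and file functions in this table are typically used together. The following is a small sketch, run inside a temporary directory so nothing else is touched; the directory and file names (a, b, draft.txt, final.txt, keep.txt) are made up for illustration.

```python
import os
import tempfile

base = tempfile.mkdtemp()                       # scratch directory for the demo
keep = os.path.join(base, "keep.txt")
open(keep, "w").close()                         # keeps base non-empty (see removedirs below)

nested = os.path.join(base, "a", "b")
os.makedirs(nested)                             # creates "a" and "a/b" in one call

old_name = os.path.join(nested, "draft.txt")
new_name = os.path.join(nested, "final.txt")
open(old_name, "w").close()
os.rename(old_name, new_name)

listing = os.listdir(nested)                    # entries of the given directory
size = os.stat(new_name).st_size                # the file is empty

os.remove(new_name)
os.removedirs(nested)                           # removes "b", then "a"; stops at base, which is not empty
remaining = os.listdir(base)

os.remove(keep)
os.rmdir(base)

print(listing, size, remaining)
```

The keep.txt file is a deliberate guard: removedirs() keeps climbing toward the root until it meets a directory it cannot remove.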
os.path Module
The following table covers the functions in Python’s os.path module. This module provides support
for common pathname manipulations.
Syntax Description
abspath(path) Returns a normalized absolute version of the pathname path.
basename(path) Returns the base name of pathname path, the second half of the
pair returned by split(path).
commonprefix(list) Returns the longest path prefix that is a prefix of all paths in list. If
there is none, an empty string is returned.
dirname(path) Returns the directory name of pathname path, the first half of the
pair returned by split(path).
exists(path) Returns True if path refers to an existing path and False otherwise, including
when path is a broken symbolic link.
lexists(path) Returns True if path refers to an existing path, including a broken symbolic
link. Equivalent to exists() on platforms lacking os.lstat().
expanduser(path) Returns path with an initial component of “~” or “~user”
replaced by that user’s home directory.
expandvars(path) Returns path with environment variables expanded.
getatime(path) Returns the time path was last accessed, in seconds
since the epoch.
getmtime(path) Returns the time path was last modified, in seconds
since the epoch.
getctime(path) Returns the system’s ctime for path, in seconds since the epoch. On some
systems (such as Unix) this is the time of the last change, and on others (such as
Windows) it is the creation time.
getsize(path) Returns the size in bytes of path.
isabs(path) Returns True if path is an absolute path (on Unix, one starting with /).
isfile(path) Returns True if path is an existing regular file. This function follows
symbolic links, so both islink() and isfile() can be
True for the same path.
samestat(stat1,stat2) Returns True if the stat tuples stat1 and stat2 refer to the same file.
Stat tuples are returned from stat, lstat, and fstat functions.
split(path) Splits the pathname path into a pair (head, tail), where tail is the last pathname
component and head is everything leading up to that. The
tail part will never contain a slash; if path ends in a slash, tail will
be empty.
splitdrive(path) Splits the pathname path into a pair (drive, tail), where drive
is either a drive specification or an empty string. On systems that
do not use drive specifications, drive will always be an empty
string.
splitext(path) Splits the pathname path into a pair (root, ext) such that
root + ext == path. ext is either empty or begins with a period and
contains at most one period.
splitext("etc/myfile.txt") == ("etc/myfile", ".txt")
walk(path, visit, arg) Calls the function visit with arguments (arg, dirname, names) for
each directory in the directory tree rooted at path (including path
itself, if it is a directory). The argument dirname specifies the vis-
ited directory; the argument names lists the files in the directory.
The visit function may modify names to influence the set of direc-
tories visited below dirname.
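The relationship between split(), basename(), dirname(), and splitext() is easiest to see on a concrete path. This is a short sketch using a made-up POSIX-style path; none of these calls touch the filesystem.

```python
import os.path

p = "/var/www/site/index.html"          # illustrative path, does not need to exist

head, tail = os.path.split(p)           # ('/var/www/site', 'index.html')
root, ext = os.path.splitext(p)         # ('/var/www/site/index', '.html')

same_as_tail = os.path.basename(p)      # second half of split(p)
same_as_head = os.path.dirname(p)       # first half of split(p)

# commonprefix() compares character by character, not component by component.
prefix = os.path.commonprefix(["/var/www/site", "/var/www/static"])

print(head, tail, root, ext, prefix)
```

Because commonprefix() works on characters, its result here is '/var/www/s', which is not a valid directory; apply dirname() to it when a real directory is needed.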
poplib Module
The following table covers the functions in Python’s poplib module, which defines a class, POP3, that
supports connecting to a POP3 server and implements the protocol as defined in RFC 1725, and a class
POP3_SSL, which supports connecting to a POP3 server that uses SSL as an underlying protocol layer as
defined in RFC 2595. Instances of the POP3 class include all of the methods listed. Instances of POP3_SSL
have no additional methods. The interface of this subclass is identical to its parent.
Syntax Description
CLASS POP3(host[,port]) This class is for implementing a connection to the mail server host
using the POP3 protocol. The connection is created when an
instance of the class is initialized. If port is omitted, the standard
POP3 port (110) is used.
CLASS POP3_SSL(host[,port[,keyfile[,certfile]]]) This class is for implementing a connection to the mail server host
using the POP3 protocol over an SSL-encrypted port. The connection
is created when an instance of the class is initialized. If port is
omitted, the standard POP3 over SSL port (995) is used. A PEM
formatted private key and certificate chain file for the SSL connection
may be provided.
set_debuglevel(level) Sets the instance’s debugging level to 0, which produces no debugging
output, 1, which produces a moderate amount of debugging
output, or 2, which produces the maximum amount of debugging
output. Any number higher than 2 produces the same amount as
specifying 2.
getwelcome() Returns the greeting string sent by the POP3 server to which the connection
is made.
user(username) Sends the user command for the specified username to the POP3
server.
pass_(password) Sends the password command with the specified password to the
POP3 server. The mailbox will be locked until quit is sent.
apop(user,secret) Uses the more secure APOP authentication to log into the POP3
server.
rpop(user) Uses RPOP commands to log into POP3 server.
stat() Returns a tuple representing mailbox status (message count, mailbox
size).
list([msg]) Requests the message list. With no argument, the response is in the form
(response, ['mesg_num octets', ...], octets). If msg is given, only that message is listed.
retr(msg) Retrieves the whole message msg and marks it as seen. Response
is in format (response, [‘line’,...], octets).
dele(msg) Flags message number msg for deletion, which, on most servers,
occurs when the quit command is issued.
rset() Resets the deletion marks on any messages in the mailbox.
noop() Does nothing. Is sometimes used to keep the connection alive.
quit() Commits changes, unlocks mailbox, and drops the connection.
top(msg,amount) Retrieves the message header plus amount lines of the message
after the header of message number msg.
uidl([msg]) Returns message digest list for message identified by msg or for all
if msg is not specified.
smtpd Module
This module defines classes for implementing SMTP servers in Python.
Syntax Description
CLASS SMTPServer(localaddr,remoteaddr) Creates a new SMTPServer object that binds to localaddr, treating
remoteaddr as an upstream SMTP relayer. SMTPServer inherits
from asyncore.dispatcher and is thus inserted into
asyncore’s event loop when instantiated.
smtplib Module
These functions supply Simple Mail Transfer Protocol (SMTP) functionality for use in Python scripts.
Syntax Description
CLASS SMTP([host[,port[,local_hostname]]]) Encapsulates an SMTP connection. Has methods that support
SMTP and ESMTP operations. If the optional host and port
parameters are included, they are passed to the connect()
method when it is called.
set_debuglevel(level) Sets the level of debug output. If level is set to True, debug log-
ging is enabled for the connection and all messages sent to and
received from the server.
connect([host[,port]]) Connects to host on port port. If host contains a colon followed by
a number, the number will be interpreted as the port number.
docmd(cmd[,argstring]) Sends the command cmd and optional arguments argstring to
the server and returns a 2-tuple containing the numeric
response code and the actual response line.
helo([hostname]) Identifies user to SMTP server using “HELO”. This is usually
called by sendmail and not directly.
ehlo([hostname]) Identifies user to ESMTP server using “EHLO”. This is usually
called by sendmail and not directly.
has_extn(name) Returns True if name is in the set of SMTP service extensions
returned by the server; otherwise, returns False.
verify(address) Verifies address on the server using SMTP “VRFY”, which
returns a tuple of code 250 and a full address if the user
address is valid.
login(user,password) Logs onto an SMTP server, which requires authentication using
user and password. Automatically tries either “EHLO” or “HELO”
if this login attempt was not preceded by it.
SMTPHeloError Error returned if the server doesn’t reply correctly to the
“HELO” greeting.
SMTPAuthenticationError Error most likely returned if the server doesn’t accept the user-
name/password combination.
SMTPException Error returned if no suitable authentication method was found.
starttls([keyfile[,certfile]]) Puts the SMTP connection into Transport Layer Security
mode. Requires that ehlo() be called again afterwards. If keyfile
and certfile are provided, they are passed to the socket module’s
ssl() function.
sendmail(from_addr,to_addr,msg[,mail_opts,rcpt_opts]) Sends mail to to_addr from from_addr consisting of msg.
Automatically tries either “EHLO” or “HELO” if this sendmail
attempt was not preceded by it.
SMTPRecipientsRefused Error returned if all recipient addresses are refused.
SMTPHeloError Error returned if the server doesn’t reply correctly to the “HELO”
greeting.
SMTPSenderRefused Error returned if the server doesn’t accept from_addr.
SMTPDataError Error returned if the server replies with an unexpected error code.
quit() Terminates the SMTP session and closes the connection.
socket Module
These functions define functionality in support of the creation and usage of sockets in Python.
Syntax Description
socket(socket_family,socket_type,protocol) Creates a socket of socket_family AF_UNIX or AF_INET, as
specified. The socket_type is either SOCK_STREAM or
SOCK_DGRAM. The protocol is usually left to default to 0.
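To illustrate the socket() call, here is a minimal sketch that sends one datagram between two AF_INET/SOCK_DGRAM sockets on the loopback interface. The methods bind(), getsockname(), sendto(), recvfrom(), settimeout(), and close() are standard socket methods that the table above does not list, and the payload is arbitrary.

```python
import socket

# Receiving side: a UDP socket bound to the loopback address.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))        # port 0 asks the OS for any free port
receiver.settimeout(5.0)               # avoid blocking forever if nothing arrives
address = receiver.getsockname()       # (host, port) actually assigned

# Sending side: protocol is left at its default of 0.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"ping", address)

data, peer = receiver.recvfrom(1024)   # payload and the sender's address
sender.close()
receiver.close()

print(data)
```

SOCK_STREAM sockets follow the same creation pattern but need connect() on one side and listen()/accept() on the other before data can flow.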
string Module
The following table covers the functions in Python’s string module. The term string is used to repre-
sent the string variable to be acted upon by the function. For example, the function to capitalize a string
variable named myString would be called as follows:
myString.capitalize()
Syntax Description
string.capitalize() Returns a copy of string with only its first character capitalized.
string.center(width) Returns a copy of string centered in a string of length width.
string.count(sub[,start[,end]]) Returns number of occurrences of substring sub in string.
string.find(sub[,start[,end]]) Returns the lowest index in string where substring sub is found.
Returns -1 if sub is not found.
string.index(sub[,start[,end]]) Behaves like string.find() but raises ValueError if sub is not
found within string.
string.isalnum() Returns True if all characters in string are alphanumeric, other-
wise, returns False.
string.isalpha() Returns True if all characters in string are alphabetic; otherwise,
returns False.
string.isdigit() Returns True if all characters in string are digits; otherwise,
returns False.
string.islower() Returns True if all characters in string are lowercase; otherwise,
returns False.
string.isspace() Returns True if all characters in string are space characters; other-
wise, returns False.
string.istitle() Returns True if string is title cased (each word starts with an uppercase
character followed by lowercase characters); otherwise, returns False.
string.isupper() Returns True if all characters in string are uppercase; otherwise,
returns False.
separator.join(seq) Returns a concatenation of strings in seq, separated by separator
(for example, “+”.join( [‘H’, ‘I’, ‘!’] ) -> “H+I+!”)
string.ljust(width) Returns string left justified in a string of length width.
string.lower() Returns string with each character converted to lowercase.
string.lstrip([chars] ) Returns a copy of string with leading chars removed. Default chars
is set to whitespace.
string.replace(old,new[,count]) Returns a copy of string with all occurrences of substring old
replaced by new. If count is given, only the first count occurrences are replaced.
string.rfind(sub[,start[,end]]) Returns the highest index in string where substring sub is found.
Returns -1 if sub is not found.
string.rindex(sub[,start[,end]]) Behaves like rfind(), but raises ValueError when the substring
sub is not found.
string.rjust(width) Returns string right justified in a string of length width.
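The difference between find() and index(), and the effect of the optional count argument to replace(), can be seen in a few lines. The sample strings below are arbitrary.

```python
title = "web standards"

capitalized = title.capitalize()          # 'Web standards'
position = title.find("stand")            # 4
missing = title.find("xyz")               # -1: find() reports failure with a value

try:
    title.index("xyz")                    # index() reports failure with an exception
except ValueError:
    index_raised = True
else:
    index_raised = False

joined = "+".join(["H", "I", "!"])        # 'H+I+!'
padded = "ab".ljust(5) + "|"              # 'ab   |'
centered = "ab".center(6)                 # '  ab  '
swapped = "a-b-c".replace("-", "+", 1)    # 'a+b-c': only the first occurrence

print(capitalized, position, missing, index_raised)
print(joined, padded, centered, swapped)
```

Use find() when a missing substring is an expected case and index() when it should be treated as an error.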
sys Module
This module provides access to some variables used or maintained by the interpreter and to functions
that interact strongly with the interpreter. It is always available.
Syntax Description
argv This variable represents the list of command line arguments
passed to a Python script. argv[0] is the script name or has zero
length if none was passed. If other arguments were passed, they
are assigned to argv[1] through argv[n], where n is the number of arguments.
byteorder This variable is set to big or little depending on the endian-ness of
the system.
builtin_module_names This variable is a tuple of strings giving the names of all modules
that are compiled into this Python interpreter.
copyright This variable is a string containing the copyright information for
the Python interpreter.
dllhandle This variable is an integer representing the handle of the Python
DLL (only in Windows).
displayhook(value) If value is not None, this function writes it to stdout and
saves it in __builtin__._.
excepthook(type,value,traceback) When an exception is raised and uncaught, the interpreter calls
sys.excepthook with three arguments, the exception class,
exception instance, and a traceback object. In a Python program,
this happens just before the program exits.
__displayhook__ The original value of displayhook is stored in this variable.
__excepthook__ The original value of excepthook is stored in this variable.
exc_info() This function returns a tuple of three values that give information
about the exception that is currently being handled. The informa-
tion returned is specific both to the current thread and to the cur-
rent stack frame. If the current stack frame is not handling an
exception, the information is taken from the calling stack frame, or
its caller, and so on until a stack frame is found that is handling an
exception.
exc_clear() This function clears all information relating to the current or last
exception that occurred in the current thread.
exec_prefix A variable giving the site-specific directory prefix where the plat-
form-dependent Python files are installed; by default, this is
‘/usr/local’.
getdefaultencoding() Returns the name of the current default string encoding used by
the Unicode implementation.
getdlopenflags() Returns the current value of the flags that are used for
dlopen() calls.
getfilesystemencoding() Returns the name of the encoding used to convert Unicode file-
names into system filenames. Returns None if the system default
encoding is used. The return value is dependent on the filesystem.
getrefcount(obj) Returns the reference count of the object obj.
getrecursionlimit() Returns the current value of the recursion limit, the maximum
depth of the Python interpreter stack. This limit keeps infinite recursion
from overflowing the C stack and crashing Python.
getwindowsversion() Returns a tuple of five components (major, minor, build, platform, text)
describing the Windows version currently running. The platform component may be one of the following:
maxunicode This variable is an integer giving the largest supported code point
for a Unicode character.
modules This variable is a dictionary of all the loaded modules.
path This variable is a list of strings that specifies the search path for
modules as initialized from the environment variable PYTHONPATH
plus an installation-dependent default.
warnoptions An implementation detail of the warnings framework; this value
is not to be modified.
winver A variable containing the version number used to form registry
keys on Windows platforms, stored as string resource 1000 in the
Python DLL. winver is normally the first three characters of ver-
sion. It is provided in the sys module for informational purposes
only and has no effect on Windows registry keys.
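A few of the sys members above are commonly used together at the top of a script. The sketch below reads the extra command-line arguments and shows what exc_info() returns while an exception is being handled; the helper function name and the error message are invented for the example.

```python
import sys

extra_args = sys.argv[1:]             # everything after the script name in argv[0]
byte_order = sys.byteorder            # 'little' or 'big'
limit = sys.getrecursionlimit()       # maximum interpreter stack depth

def describe_current_exception():
    """Hypothetical helper: report the exception being handled right now."""
    try:
        raise ValueError("demo error")
    except ValueError:
        exc_type, exc_value, tb = sys.exc_info()
        return exc_type.__name__, str(exc_value)

info = describe_current_exception()

print(extra_args, byte_order, limit)
print(info)
```

exc_info() is only meaningful inside an except block (or in code called from one), which is why the call sits inside the handler.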
random Module
This module provides functionality in support of obtaining pseudo-random numbers. When no seed is
supplied, the generator is seeded from an operating system randomness source if one is available and
from the current system time otherwise.
Syntax Description
seed([x]) Initializes the basic random number generator using the hashable object
x if it is provided.
getstate() Returns the current internal state of the random number
generator.
setstate(state) Resets the internal state of the random number generator to the
supplied state.
jumpahead(n) Changes the internal state to one different from the current
state. n is a nonnegative integer that is used to scramble the
current state vector. This is commonly used in multithreaded
programs, in conjunction with multiple instances of the Random
class: setstate() or seed() can be used to force all instances
into the same internal state, and then jumpahead() can be used
to force the instances’ states far apart.
getrandbits(x) Returns a Python long int with x random bits.
randrange([start,]stop[,step]) Returns a randomly selected element from range(start, stop, step).
randint(a,b) Returns a random integer int such that a <= int <= b.
choice(seq) Returns a random element from seq.
shuffle(x[,random]) Shuffles the sequence x, using the random function random if
specified.
sample(sequence,length) Returns a list of length unique elements from sequence.
random() Returns a random floating-point number in the range [0.0, 1.0).
uniform(a,b) Returns a random real number N where a <= N < b.
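Saving and restoring the generator state is what makes a random sequence repeatable. The following sketch, with an arbitrary seed of 42 and made-up sample data, replays the same five dice rolls twice and then draws a few values with the other functions in the table.

```python
import random

random.seed(42)                        # arbitrary seed, chosen only for the demo
state = random.getstate()              # snapshot of the generator

first = [random.randint(1, 6) for _ in range(5)]
random.setstate(state)                 # rewind to the snapshot
second = [random.randint(1, 6) for _ in range(5)]   # same rolls as first

pick = random.choice(["html", "css", "perl"])
hand = random.sample(range(10), 3)     # three distinct elements
x = random.uniform(2.0, 3.0)

print(first == second)
print(pick, hand, x)
```

The actual values drawn depend on the Python version and its generator, so the sketch prints them without assuming any particular numbers.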
urllib Module
This module provides high-level functionality for fetching data across the World Wide Web. The functionality
is similar to the built-in function open() except that it accepts Uniform Resource Locators
(URLs) instead of filenames.
Syntax Description
CLASS URLopener([proxies[,**x509]]) Base class for opening and reading URLs. In most cases, you
will want to use the CLASS FancyURLopener instead. An
empty proxies variable turns off proxies completely.
CLASS FancyURLopener([proxies[,**x509]]) Class for opening and reading URLs providing default
handling for the following HTTP response codes: 301, 302,
303, 307, and 401. An empty proxies variable turns off proxies
completely.
>>> import urllib
>>> proxies = {'http': 'http://proxy.server.com:8080/'}
>>> opener = urllib.FancyURLopener(proxies)
urllib2 Module
This module defines functions and classes needed to facilitate opening URLs, including basic and digest
authentication, redirections, cookies, and such. If a class is listed, functions belonging to that class are grouped
with it.
urlopen(url[,data]) Opens the specified url, which can be a string or a Request
object. The data parameter is for use in passing extra information
as needed for http and should be a buffer in the format
of application/x-www-form-urlencoded. The
urlopen function returns a file-like object with a geturl()
method to retrieve the URL of the resource received and an
info() method to return the meta-information of the page.
install_opener(opener) Installs an OpenerDirector instance as the default global
opener.
build_opener(handlers) Returns an OpenerDirector instance, which chains the han-
dlers in the order given. These handlers can be instances of
BaseHandler or subclasses of it. The following exceptions
may be raised: URLError (on handler errors), HTTPError (a
subclass of URLError that handles exotic HTTP errors), and
GopherError (another subclass of URLError that handles
errors from the Gopher handler).
CLASS Request(url[,data[,headers[,origin_req_host[,unverifiable]]]]) This class is an abstraction of a URL Request. url should be
a valid URL in string format. For a description of data, see
the add_data() description. headers should be in dictionary
format. origin_req_host should be the request-host of the origin
transaction. unverifiable should indicate whether or not
the request is verifiable, as specified in RFC 2965.
add_data(data) Sets the Request data to data.
get_method() Returns a string indicating the HTTP request method, which
may be either GET or POST.
has_data() Returns True when the instance has data and False when
its data is None.
get_data() Returns the instance’s data.
add_header(key, val) Adds another header to the request. Later calls will over-
write earlier ones with the same key value.
add_unredirected_header(key,header) Adds a header that will not be added in the case of a
redirected request.
has_header(header) Returns whether the instance (either regular or unredirected)
has a header.
get_full_url() Returns the URL given in the constructor.
get_type() Returns the type (or scheme) of the URL.
get_host() Returns the host to which the connection will be made.
get_selector() Returns the selector, the part of the URL that is sent to
the server.
set_proxy(host,type) Prepares the request by connecting to a
proxy server. host and type will replace the
ones for the instance.
get_origin_req_host() Returns the request-host of the origin
transaction.
is_unverifiable() Returns whether the request is unverifi-
able as defined in RFC 2965.
CLASS OpenerDirector() The OpenerDirector class opens URLs
via BaseHandlers chained together and
is responsible for managing their
resources and error recovery.
add_handler(handler) Searches and adds to the possible chains
any of the following handlers:
close() Removes any parents.
parent A valid OpenerDirector to be used to open
a URL using a different protocol.
default_open(req) This method is not defined in BaseHandler,
but should be defined to catch all URLs.
protocol_open(req) This method is not defined in BaseHandler,
but should be defined to handle URLs of the
given protocol.
unknown_open(req) This method is not defined in BaseHandler,
but should be defined to handle catching
URLs with no specific registered handler to
open them.
http_error_default(req,fp,code,msg,hdrs) This method is not defined in BaseHandler
but should be overridden by the subclass to
provide a catchall for otherwise unhandled
HTTP errors. req is a Request object, fp is a
file-like object with the HTTP error body,
code is the three-digit error code, msg is a
user-visible explanation of the error code,
and hdrs is the mapping object with the
header of the error.
http_error_nnn(req,fp,code,msg,hdrs) This method is not defined in BaseHandler
but is called on an instance of subclass when
an HTTP error with code nnn is encountered.
protocol_request(req) This method is not defined in BaseHandler
but should be defined by subclasses that need to pre-process
requests of the specified protocol.
protocol_response(req,response) This method is not defined in BaseHandler
but should be defined by subclasses that need to post-process
responses of the specified protocol.
CLASS HTTPDefaultErrorHandler() A class that defines a default handler for
HTTP error responses; all responses are
turned into HTTPError exceptions.
CLASS HTTPRedirectHandler() A class to handle redirection.
redirect_request(req,fp,code,msg,hdrs) Returns a Request or None in response to a
redirect.
http_error_301(req,fp,code,msg,hdrs) Redirects to the URL.
http_error_302(req,fp,code,msg,hdrs) Redirects for a “found” response.
http_error_303(req,fp,code,msg,hdrs) Redirects for “see other” response.
http_error_307(req,fp,code,msg,hdrs) Redirects for “temporary redirect”
response.
CLASS HTTPCookieProcessor(cookies) A class to handle HTTP cookies.
cookiejar The cookielib.CookieJar in which
cookies are stored.
CLASS ProxyHandler(proxies) A class to handle routing requests
through a proxy. If proxies is specified, it
must be in the form of a dictionary that
maps protocol names to URLs of proxies.
protocol_open(req) ProxyHandler will assign a method pro-
tocol_open() for every protocol that has
a proxy in the proxies dictionary given in
the constructor.
CLASS HTTPPasswordMgr() A class that supports a database of
(realm, url) -> (user, password)
mappings.
add_password(realm,uri,user,password) Sets user authentication token for given
user, realm, and uri.
CLASS HTTPBasicAuthHandler([password_mgr]) A class that supports HTTP authentication
with the remote host. The password_mgr
variable must be compatible with HTTP-
PasswordMgr.
CLASS FileHandler() A class that handles opening local files.
file_open(req) Opens the local file if localhost or no host
is specified, and otherwise initiates an ftp
connection and reattempts opening the
file.
CLASS FTPHandler() A class that handles FTP URLs.
ftp_open(req) Opens the file indicated by req using an
empty username and password.
CLASS CacheFTPHandler() A class that handles FTP URLs and caches
FTP connections to minimize delay.
setTimeout(t) Sets timeout of connections to t seconds.
setMaxConns(n) Sets the maximum number of cached con-
nections to n.
CLASS GopherHandler() A class that handles gopher URLs.
gopher_open(req) Opens the gopher resource denoted
by req.
CLASS UnknownHandler() A catchall class to handle unknown URLs.
unknown_open() Raises a URLError exception on URLs
with no specific registered handler.
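The Request, HTTPPasswordMgr, and build_opener() entries above can be exercised without contacting a server. This is a sketch under a few assumptions: the fallback import maps urllib2 to urllib.request for Python 3, find_user_password() is a lookup method of HTTPPasswordMgr that the table does not list, and the URLs, header, realm, and credentials are placeholders.

```python
try:
    import urllib2                       # module name used in this appendix (Python 2)
except ImportError:
    import urllib.request as urllib2     # assumption: Python 3 home of the same classes

url = "http://www.example.com/search"    # placeholder URL; nothing is fetched

req = urllib2.Request(url, headers={"Accept": "text/html"})
method_without_data = req.get_method()   # 'GET' when no data is attached
req.add_header("X-Demo", "1")            # hypothetical header
has_demo = req.has_header("X-demo")      # names are stored with only the first letter capitalized

post = urllib2.Request(url, b"q=python")
method_with_data = post.get_method()     # 'POST' once data is present

mgr = urllib2.HTTPPasswordMgr()
mgr.add_password("members", "http://www.example.com/private/", "alice", "s3cret")
creds = mgr.find_user_password("members", "http://www.example.com/private/page.html")

# Handlers passed to build_opener() are chained into an OpenerDirector.
opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(mgr))

print(method_without_data, method_with_data, has_demo, creds)
```

Calling opener.open(req), or install_opener(opener) followed by urlopen(req), would perform the actual fetch.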
PHP Language Reference
This appendix cannot provide an exhaustive list of all of the PHP functions (or it would be hun-
dreds of pages long), but it presents a subset of functions that the authors think you will encounter
in your everyday use of PHP along with brief descriptions of what those functions do. The core lan-
guage functions are included, as well as the functions for PHP extensions that are in popular use.
This appendix is meant as a quick reference for you to check the input and return types of
parameters — more of a reminder of how the function works than a verbose description of how
to use the function in your scripts. If you need to see how a particular function should be used,
or to read up on a function that isn’t covered here, check out the PHP documentation online at
www.php.net/docs.php.
This appendix originally appeared in Beginning PHP 5 written by Dave W. Mercer, Allan Kent,
Steven D. Nowicki, David Mercer, Dan Squier, and Wankyu Choi and published by Wrox,
ISBN: 0-7645-5783-1, copyright 2004, Wiley Publishing, Inc. The author is grateful to those
authors and the publisher for allowing it to be reused here.
Apache
Function Returns Description
apache_child_terminate(void) bool Terminates the Apache process after this request.
apache_note(string note_name [, string note_value]) string Gets and sets Apache request notes.
virtual(string filename) bool Performs an Apache subrequest.
getallheaders(void) array Alias for apache_request_headers().
Arrays
Function Returns Description
krsort(array array_arg [, int sort_flags]) bool Sorts an array by key in reverse order.
ksort(array array_arg [, int sort_flags]) bool Sorts an array by key.
count(mixed var [, int mode]) int Counts the number of elements in a variable (usually an array).
natsort(array array_arg) void Sorts an array using natural sort.
natcasesort(array array_arg) void Sorts an array using case-insensitive natural sort.
asort(array array_arg [, int sort_flags]) bool Sorts an array and maintains index association.
arsort(array array_arg [, int sort_flags]) bool Sorts an array in reverse order and maintains index association.
sort(array array_arg [, int sort_flags]) bool Sorts an array.
in_array(mixed needle, array haystack [, bool strict]) bool Checks if the given value exists in the array.
array_search(mixed needle, array haystack [, bool strict]) mixed Searches the array for a given value and returns the corresponding key if successful.
extract(array var_array [, int extract_type [, string prefix]]) int Imports variables into the symbol table from an array.
compact(mixed var_names [, mixed ...]) array Creates a hash containing variables and their values.
array_fill(int start_key, int num, mixed val) array Creates an array containing num elements starting with index start_key, each initialized to val.
range(mixed low, mixed high [, int step]) array Creates an array containing the range of integers or characters from low to high (inclusive).
shuffle(array array_arg) bool Randomly shuffles the contents of an array.
Appendix F
array_values(array input) array Returns just the values from the input array.
array_count_values(array input) array Returns an array that uses each value of input as a key and the frequency of that value as the corresponding value.
array_reverse(array input [, bool preserve_keys]) array Returns input as a new array with the order of the entries reversed.
array_pad(array input, int pad_size, mixed pad_value) array Returns a copy of the input array padded with pad_value to size pad_size.
array_flip(array input) array Returns an array with keys and values exchanged.
array_change_key_case(array input [, int case=CASE_LOWER]) array Returns an array with all string keys lowercased (or uppercased).
array_unique(array input) array Removes duplicate values from an array.
array_intersect(array arr1, array arr2 [, array ...]) array Returns the entries of arr1 that have values that are present in all the other arguments.
array_uintersect(array arr1, array arr2 [, array ...], callback data_compare_func) array Returns the entries of arr1 that have values that are present in all the other arguments. Data is compared by using a user-supplied callback.
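The sorting and searching functions listed above differ mainly in what happens to the keys. The following is a minimal sketch of that difference, not a complete tour of the array functions; the $stock array and its contents are made up for illustration.

```php
<?php
// Hypothetical sample data: item name => quantity in stock.
$stock = array('pear' => 12, 'apple' => 7, 'fig' => 12, 'kiwi' => 3);

// asort() orders by value and keeps each key attached to its value.
$byValue = $stock;
asort($byValue);

// ksort() orders by key instead.
$byKey = $stock;
ksort($byKey);

// sort() also orders by value, but discards the keys and reindexes from 0.
$reindexed = $stock;
sort($reindexed);

// array_search() returns the key of the first match, or false if none.
$firstTwelve = array_search(12, $stock);

// With strict set to true, in_array() also compares types,
// so the string '7' no longer matches the integer 7.
$looseMatch = in_array('7', $stock);
$strictMatch = in_array('7', $stock, true);

// array_count_values() maps each distinct value to how often it occurs.
$frequency = array_count_values($stock);

print_r($byKey);
print_r($frequency);
```

Note that the sort functions modify the array passed to them and return a success flag, which is why the sketch copies $stock before each call.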
BCMath
Function Returns Description
bcadd(string left_operand, string right_operand [, int scale]) string Returns the sum of two arbitrary precision numbers.
BZip2
Function Returns Description
bzopen(string|int file|fp, string mode) resource Opens a new BZip2 stream.
bzread(int bz [, int length]) string Reads up to length bytes from a BZip2 stream, or 1024 bytes if length is not specified.
bzwrite(int bz, string data [, int length]) int Writes the contents of the string data to the BZip2 stream.
bzerrno(resource bz) int Returns the error number.
bzerrstr(resource bz) string Returns the error string.
Calendar
Function Returns Description
unixtojd([int timestamp]) int Converts UNIX timestamp to Julian Day.
jdtounix(int jday) int Converts Julian Day to UNIX timestamp.
cal_info(int calendar) array Returns information about a particular calendar.
cal_days_in_month(int calendar, int month, int year) int Returns the number of days in a month for a given year and calendar.
Class/Object
Function Returns Description
class_exists(string classname) bool Checks if the class exists.
Character Type
Function Returns Description
ctype_alnum(string text) bool Checks for alphanumeric character(s).
ctype_alpha(string text) bool Checks for alphabetic character(s).
ctype_cntrl(string text) bool Checks for control character(s).
ctype_digit(string text) bool Checks for numeric character(s).
ctype_graph(string text) bool Checks for any printable character(s) except space.
ctype_lower(string text) bool Checks for lowercase character(s).
ctype_print(string text) bool Checks for printable character(s).
ctype_punct(string text) bool Checks for any printable character that is not whitespace or an alphanumeric character.
ctype_space(string text) bool Checks for whitespace character(s).
ctype_upper(string text) bool Checks for uppercase character(s).
ctype_xdigit(string text) bool Checks for character(s) representing a hexadecimal digit.
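Each ctype function returns true only when every character of the string belongs to the class being tested. The sketch below shows one possible use, validating a user name; the helper is_valid_username() and its length limits are assumptions made for this example and are not part of PHP.

```php
<?php
// Hypothetical helper: accept only short, purely alphanumeric user names.
function is_valid_username($name)
{
    // The ctype functions are meant for strings, so check the type first.
    return is_string($name)
        && strlen($name) >= 3
        && strlen($name) <= 16
        && ctype_alnum($name);
}

$ok = is_valid_username('web2005');
$bad = is_valid_username('web 2005');  // the space is not alphanumeric
$hex = ctype_xdigit('1aF9');           // hexadecimal digits only
$digits = ctype_digit('12.5');         // the period is not a digit

var_dump($ok, $bad, $hex, $digits);
```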
Curl
Function Returns Description
curl_version([int version]) array Returns CURL version information.
curl_init([string url]) resource Initializes a CURL session.
Directory
Function Returns Description
opendir(string path) mixed Opens a directory and returns a dir_handle.
Error Handling
Function Returns Description
error_log(string message [, int message_type [, string destination [, string extra_headers]]]) bool Sends an error message to PHP's error log, an e-mail address, or a file, depending on message_type.
Filesystem
Function Returns Description
flock(resource fp, int operation [, int &wouldblock]) bool Portable file locking.
fgetcsv(resource fp [, int length [, string delimiter [, string enclosure]]]) array Gets line from file pointer and parses for CSV fields.
disk_total_space(string path) float Gets total disk space for filesystem that path is on.
disk_free_space(string path) float Gets free disk space for filesystem that path is on.
chgrp(string filename, mixed group) bool Changes file group.
FTP
Function Returns Description
ftp_connect(string host [, int port [, int timeout]]) resource Opens an FTP stream.
ftp_get(resource stream, string local_file, string remote_file, int mode [, int resume_pos]) bool Retrieves a file from the FTP server and writes it to a local file.
ftp_nb_fput(resource stream, string remote_file, resource fp, int mode [, int startpos]) int Stores a file from an open file to the FTP server in non-blocking mode.
Function Handling
Function Returns Description
call_user_func(string function_name [, mixed parameter] [, mixed ...]) mixed Calls the user function named by the first parameter.
HTTP
Function Returns Description
header(string header [, bool replace [, int http_response_code]]) void Sends a raw HTTP header.
Iconv Library
Function Returns Description
iconv(string in_charset, string out_charset, string str) string Returns str converted to the out_charset character set.
Image
Function Returns Description
exif_tagname(index) string Gets the header name for index, or false if not defined.
exif_read_data(string filename [, sections_needed [, sub_arrays [, read_thumbnail]]]) array Reads header data from the JPEG/TIFF image filename and optionally reads the internal thumbnails.
imagecolorclosesthwb(resource im, int red, int green, int blue) int Gets the index of the color that has the hue, whiteness, and blackness nearest to the given color.
imagecolordeallocate(resource im, int index) bool De-allocates a color for an image.
imagefttext(resource im, int size, int angle, int x, int y, int col, string font_file, string text [, array extrainfo]) array Writes text to the image using fonts via freetype2.
IMAP
Function Returns Description
imap_open(string mailbox, string user, string password [, int options]) resource Opens an IMAP stream to a mailbox.
Mail
Function Returns Description
ezmlm_hash(string addr) int Calculates EZMLM list hash value.
mail(string to, string subject, string message [, string additional_headers [, string additional_parameters]]) bool Sends an e-mail message.
Math
Function Returns Description
abs(int number) int Returns the absolute value of the number.
ceil(float number) float Returns the next highest integer value of the number.
floor(float number) float Returns the next lowest integer value from the number.
round(float number [, int precision]) float Returns the number rounded to the specified precision.
sin(float number) float Returns the sine of the number, which is given in radians.
cos(float number) float Returns the cosine of the number, which is given in radians.
tan(float number) float Returns the tangent of the number, which is given in radians.
asin(float number) float Returns the arc sine of the number; the result is in radians.
acos(float number) float Returns the arc cosine of the number; the result is in radians.
atan(float number) float Returns the arc tangent of the number; the result is in radians.
atan2(float y, float x) float Returns the arc tangent of y/x, with the resulting quadrant determined by the signs of y and x.
sinh(float number) float Returns the hyperbolic sine of the number, defined as (exp(number) - exp(-number))/2.
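Two details of the functions above are easy to get wrong: the rounding functions return floats rather than integers, and the trigonometric functions work in radians. The following short sketch illustrates both; it also uses deg2rad() and rad2deg(), two standard PHP conversion functions that are not listed in this table.

```php
<?php
// ceil() and floor() return floats, even though the values are whole numbers.
$up = ceil(4.2);
$down = floor(-4.2);      // the next lowest integer below -4.2 is -5

// round() keeps the requested number of decimal places.
$rounded = round(3.14159, 2);

// atan2() takes y first and then x, and uses both signs to pick the quadrant.
// The point x = -1, y = 1 lies in the second quadrant.
$angle = atan2(1, -1);
$degrees = rad2deg($angle);

// sin() expects radians, so convert an angle given in degrees first.
$sine = sin(deg2rad(30));

var_dump($up, $down, $rounded, $degrees, $sine);
```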
MIME
Function Returns Description
mime_content_type(string filename|resource stream) string Returns content-type for file.
Miscellaneous
Function Returns Description
get_browser([string browser_name [, bool return_array]]) mixed Gets information about the capabilities of a browser.
constant(string const_name) mixed Returns the value associated with the named constant.
getenv(string varname) string Gets the value of an environment variable.
putenv(string setting) bool Sets the value of an environment variable.
getopt(string options [, array longopts]) array Gets options from the command line argument list.
flush(void) void Flushes the output buffer.
sleep(int seconds) void Delays for a given number of seconds.
usleep(int micro_seconds) void Delays for a given number of microseconds.
time_nanosleep(long seconds, long nanoseconds) mixed Delays for a number of seconds and nanoseconds.
MS SQL
Function Returns Description
mssql_connect([string servername [, string username [, string password]]]) int Establishes a connection to an MS-SQL server.
MySQL
Function Returns Description
mysql_connect([string hostname[:port][:/path/to/socket] [, string username [, string password [, bool new [, int flags]]]]]) resource Opens a connection to a MySQL Server.
Network Functions
Function Returns Description
define_syslog_variables(void) void Initializes all syslog-related variables.
openlog(string ident, int option, int facility) bool Opens connection to system logger.
ODBC
Function Returns Description
odbc_close_all(void) void Closes all ODBC connections.
odbc_binmode(int result_id, int mode) bool Handles binary column data.
Output Buffering
Function Returns Description
ob_list_handlers() false|array Lists all output handlers in use in an array.
ob_start([string|array user_function [, int chunk_size [, bool erase]]]) bool Turns on output buffering (specifying an optional output handler).
ob_flush(void) bool Flushes (sends) contents of the output buffer. The last buffer content is sent to the next buffer.
ob_clean(void) bool Cleans (discards the contents of) the current output buffer.
ob_end_flush(void) bool Flushes (sends) the output buffer, and deletes the current output buffer.
ob_end_clean(void) bool Cleans the output buffer, and deletes the current output buffer.
ob_get_flush(void) string Gets the current buffer contents, flushes (sends) the output buffer, and deletes the current output buffer.
ob_get_clean(void) string Gets the current buffer contents and deletes the current output buffer.
ob_get_contents(void) string Returns the contents of the output buffer.
ob_get_level(void) int Returns the nesting level of the output buffer.
ob_get_length(void) int Returns the length of the output buffer.
ob_get_status([bool full_status]) false|array Returns the status of the active or all output buffers.
ob_implicit_flush([int flag]) void Turns implicit flush on/off and is equivalent to calling flush() after every output call.
output_reset_rewrite_vars(void) bool Resets (clears) URL rewriter values.
output_add_rewrite_var(string name, string value) bool Adds URL rewriter values.
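A common use of these functions is to capture text that a piece of code prints, so that it can be stored or altered before it is sent. The following is a minimal sketch of that pattern; render_greeting() is a hypothetical function standing in for any code that echoes its output directly.

```php
<?php
// Hypothetical function that prints its result instead of returning it.
function render_greeting($name)
{
    echo 'Hello, ' . htmlspecialchars($name) . '!';
}

$levelBefore = ob_get_level();

ob_start();                      // begin capturing output
$levelInside = ob_get_level();   // one level deeper than before
render_greeting('Ann & Bob');
$length = ob_get_length();       // number of bytes captured so far
$html = ob_get_clean();          // fetch the contents and remove the buffer

$levelAfter = ob_get_level();    // back to the starting level

// Nothing was sent while the buffer was active; send it now.
echo $html;
```

Because buffers nest, the sketch records the nesting level before it starts rather than assuming that no buffer is already active.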
PCRE
Function Returns Description
preg_match(string pattern, string subject [, array subpatterns [, int flags [, int offset]]]) int Performs a Perl-style regular expression match.
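The optional subpatterns argument of preg_match() is filled with the text matched by each parenthesized group. The sketch below shows one way to use it; the request line and the pattern are made up for illustration.

```php
<?php
// Hypothetical input: the first line of an HTTP request.
$line = 'GET /docs/index.html HTTP/1.1';

// Using # as the delimiter avoids escaping the slash in the pattern.
$pattern = '#^(GET|POST) (\S+) HTTP/(\d\.\d)$#';

$method = $path = $version = null;

// preg_match() returns 1 on a match and 0 otherwise.
$found = preg_match($pattern, $line, $parts);

if ($found === 1) {
    // $parts[0] holds the whole match; the groups start at index 1.
    $method = $parts[1];
    $path = $parts[2];
    $version = $parts[3];
    echo "$method $path (HTTP $version)\n";
}

$noMatch = preg_match($pattern, 'HELLO');
```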
Program Execution
Function Returns Description
exec(string command [, array &output [, int &return_value]]) string Executes an external program.
Regular Expressions
Function Returns Description
ereg(string pattern, string string [, array registers]) int Matches a regular expression.
Sessions
Function Returns Description
session_set_cookie_params(int lifetime [, string path [, string domain [, bool secure]]]) void Sets session cookie parameters.
Simple XML
Function Returns Description
simplexml_load_file(string filename) simplexml_element Loads a file and returns a simplexml_element object to allow for processing.
simplexml_load_string(string data) simplexml_element Loads a string and returns a simplexml_element object to allow for processing.
simplexml_import_dom(domNode node) simplexml_element Gets a simplexml_element object from dom to allow for processing.
Sockets
Function Returns Description
socket_select(array &read_fds, array &write_fds, array &except_fds, int tv_sec [, int tv_usec]) int Runs the select() system call on the sets mentioned with a timeout specified by tv_sec and tv_usec.
SQLite
Function Returns Description
sqlite_popen(string filename [, int mode [, string &error_message]]) resource Opens a persistent handle to a SQLite database. Will create the database if it does not exist.
sqlite_open(string filename [, int mode [, string &error_message]]) resource Opens a SQLite database. Will create the database if it does not exist.
Streams
Function Returns Description
stream_socket_client(string remoteaddress [, long &errcode, string &errstring, double timeout, long flags, resource context]) resource Opens a client connection to a remote address.
Strings
Function Returns Description
crc32(string str) int Calculates the crc32 polynomial of a string.
crypt(string str [, string salt]) string Creates a one-way hash of a string.
convert_cyr_string(string str, string from, string to) string Converts from one Cyrillic character set to another.
lcg_value() float Returns a value from the combined linear congruential generator.
levenshtein(string str1, string str2) int Calculates Levenshtein distance between two strings.
md5(string str [, bool raw_output]) string Calculates the md5 hash of a string.
md5_file(string filename [, bool raw_output]) string Calculates the md5 hash of given filename.
metaphone(string text, int phones) string Breaks English phrases down into their phonemes.
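The hashing and comparison functions above can be tried out with a few lines of code. This is only an illustrative sketch, and the input strings are arbitrary; bin2hex() is a standard PHP function that is not listed in this table.

```php
<?php
$text = 'Web Standards';   // arbitrary sample input

// md5() returns 32 hexadecimal characters by default,
// or 16 raw bytes when raw_output is true.
$hash = md5($text);
$raw = md5($text, true);

// crc32() returns the checksum as an integer.
$checksum = crc32($text);

// levenshtein() counts the single-character edits needed
// to turn one string into the other.
$distance = levenshtein('kitten', 'sitting');

// metaphone() returns a key that approximates how a word sounds.
$sounds = metaphone('Thompson');

echo "$hash\n$checksum\n$distance\n$sounds\n";
```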
URL
Function Returns Description
http_build_query(mixed formdata [, string prefix]) string Generates a form-encoded query string from an associative array or object.
parse_url(string url) array Parses a URL and returns its components.
get_headers(string url) array Fetches all the headers sent by the server in response to an HTTP request.
urlencode(string str) string URL-encodes all non-alphanumeric characters except the hyphen, underscore, and period (spaces become plus signs).
urldecode(string str) string Decodes URL-encoded string.
rawurlencode(string str) string URL-encodes all non-alphanumeric characters.
rawurldecode(string str) string Decodes URL-encoded string.
base64_encode(string str) string Encodes string using MIME base64 algorithm.
base64_decode(string str) string Decodes string using MIME base64 algorithm.
get_meta_tags(string filename [, bool use_include_path]) array Extracts all meta tag content attributes from a file and returns an array.
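The following sketch shows how several of the URL functions fit together when building and taking apart a link. The host name, script name, and form data are placeholders chosen for this example.

```php
<?php
// Hypothetical form data to pass along in a link.
$params = array('q' => 'web standards', 'page' => 2);

// http_build_query() encodes each value and joins the pairs.
$query = http_build_query($params);
$url = 'http://www.example.com/search.php?' . $query;

// parse_url() splits the URL into an associative array of components.
$parts = parse_url($url);

// urlencode() turns a space into a plus sign; rawurlencode() uses %20.
$plus = urlencode('a b&c');
$percent = rawurlencode('a b&c');

// base64 encoding is reversible with base64_decode().
$token = base64_encode('user:secret');
$back = base64_decode($token);

print_r($parts);
```

Use urlencode() for query string values and rawurlencode() for path segments, where a plus sign would be taken literally.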
Variable Functions
Function Returns Description
gettype(mixed var) string Returns the type of the variable.
settype(mixed var, string type) bool Sets the type of the variable.
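Note that settype() changes the variable that is passed to it and returns only a success flag, whereas gettype() returns the type name as a string. A minimal sketch, using a made-up input value:

```php
<?php
// Values that arrive from a form are strings, even when they look numeric.
$value = '42';
$before = gettype($value);

// settype() converts $value in place; the return value reports success.
$converted = settype($value, 'integer');
$after = gettype($value);

echo "$before -> $after\n";
```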
XML
Function Returns Description
xml_parser_create([string encoding]) resource Creates an XML parser.
xml_parser_create_ns([string encoding [, string sep]]) resource Creates an XML parser with namespace support.
ZLib
Function Returns Description
gzfile(string filename [, int use_include_path]) array Reads and uncompresses an entire .gz file into an array.
Index
HTML divisions, 31–33 built-in functions
PHP, 474 Perl, 386–387
Python, delimiting, 400 PHP, listed, 486–490
blocking changes to form fields, 141–142 Python, 415–416
body section built-in objects, JavaScript
HTML tags, 20–21 current document, 312–313
with two tables, 25 form elements, 313–314
visible content, 533–534 history list, navigating, 315
XHTML tables, 103–105 reference-making element, 315–316
bold text URL information, manipulating, 314–315
CSS, 212–213, 585 XHTML document, 311–312
HTML, 84–85 bulleted list
XHTML, 531 described, 35, 42–43, 569
Boolean object, JavaScript, 614–615 item marker, changing, 43–46
borders ordered lists within, 47–48
CSS ordinal, changing position of, 46
collapsing, 241–243 button, form
color, 220–221 checkbox clicks, automating, 314
defining, 238–239, 594–596 custom text, 135–136
differing adjacent, 242 JavaScript object, 615–616
predefined styles, 220–222 radio, 131–132, 635–636
shortcut, 222 reset, 137–138
space to neighboring elements (margins), 223–224 submit, 137–138
spacing, 223, 239–240 XHTML element, 534–535
width, 219–220 BZip2, PHP functions, 723–724
images, 62–63
property shortcut, 222
XHTML tables C
drop-shadow effect, 115–116 calendar
setting, 96–99 dynamic, user-interactive
bottom, positioning elements, 252–254 Perl, 447–453
box formatting model, CSS, 215–218 PHP, 517–523
braces ({}) Python, 463–466
JavaScript, 287 simple
Perl, 372 Perl, 439–444
PHP, 474, 501 PHP, 509–514, 724–725
break statement Python, 457–460
JavaScript, 278 capitalization
PHP, 486 changing, 583
Python, 412–413 denoting, 200, 585
browser, Web captions
caching with meta tags, 18 CSS tables, 598
default path, setting, 17 XHTML tables, 102–103, 244, 535
DOM support, listed, 264 caret (^), 385
formatting shortcuts, reasons to avoid, 10 Cascading Style Sheets. See CSS
JavaScript support lacking, 323 cells
meta tags caching, 18 CSS tables, rendering, 597
refreshing and reloading after specified time, 18–19 spacing, 239–240
scripting language, unsupported, 552 XHTML tables
Web servers, connecting, 1–2 defining, 564
window, another delimiting, 100–101
opening, 329–331 rules, 98–99
outputting text, 331–333 spacing and padding, 94–95
XML, usefulness of, 154, 160 width and alignment, 91–94
CGI (Common Gateway Interface). See also Perl; Python style, defining, 171–172
form, sample, 432–434 text, 584
history, 363 transparency, graphics, 53
HTTP XHTML codes, 571–576
data encapsulation, 364–366 columns
request and response, 363–364 HTML table headers, 100–101
mechanics, 366–367 XHTML tables
MySQL data, sample, 434–438 attributes, specifying, 537
Python functions, 685–687 grouping, 109–111, 537–538
scripting multiple, newspaper-like, 120–121
basic requirements, 423–424 spanning, 106–109
Linux Bash shell, 424–430 command terminal character and blocks of code, 474
servers, 367–368 commenting code
when to use, 431–432 Perl, 372
cgitb troubleshooting tool, Python, 421–422 PHP, 474
changes, text, incorporating in line (HTML) (<span>), 82 Python, 400
changing nodes, DOM, 302–309 style sheets, 167
chapter and section, automatic numbering, 207–208 XML, 157
characters commercial image applications, 57
matching with Perl, 673 Common Gateway Interface. See CGI
type functions, PHP, 726 comparison operators
checkbox, form JavaScript, 604
JavaScript, 616–617 Perl, 378, 652–653
XHTML, 132 Python, 409
child sibling elements, matching by, 176–178 compressing white space, 198
circular area, image map, 64 conditions
citation, source, 536 loop, breaking (continue), 278
class Perl, 658–660
functions, PHP, 725–726 constructors
matching elements, CSS, 174–175 Perl objects, 392
object definitions, PHP, 491 PHP objects, 491–492
clickable regions, image map, 64–66 content
clients, Web, 1 block, enclosing (<div>), 539–540
clipping boundary, CSS blocks, 589 body section
closing files HTML tags, 20–21
Perl, 389 with two tables, 25
PHP, 497 visible content, 533–534
Python, 420 XHTML tables, 103–105
code CSS, 581–582
inline snippets, XHTML element, 536–537 document, writing (write and writeln
running in interpreter, 421 methods), 313
code block overflow, controlling, 257–258
CSS, handling, 588–591 visible, tag containing (<body>), 533–534
DHTML, animating, 356–361 continue statement
HTML divisions, 31–33 JavaScript, 278
PHP, 474 Perl, 382–383
Python, delimiting, 400 PHP, 486
collapsible lists, creating with DHTML, 317–319 Python, 412–413
collapsing borders, 241–243 control structures
colon (:), 400 JavaScript
colors breaking out (break), 278
background, 228–230 do while loop, 274
changing following mouse movement, 285–286 expression, executing code based on value of
foreground, 227–228 (switch), 277–278
links, 74 for loop, 275
for in loop, 276 selectors, 173, 577
if and if else loops, 276–277 shorthand expressions, 185–187
while loop, 274–275 spacing borders and cells, 239–240
Perl style definition format, 171–172
continue, 382–383 tables properties, 237–243
for loop, 380–381 text, 189–214
foreach loop, 381 versions 1.0 and 2.0, 8–9
if and if else, 381–382 curl function, PHP, 726–727
last, 382–383 current date, writing to document in JavaScript,
next, 382–383 325–327
redo, 382–383 current document object, JavaScript, 312–313
while and until loops, 380 cursive fonts, 210
Python cursor, mouse, 599
continue and break, 412–413
for loop, 411
if and elif statements, 411–412 D
try statement, 412 data encapsulation, HTTP, 364–366
while loop, 410–411 data entry. See form
conversion functions, 661 data, extracting from Perl files, 390–391
cookie functions, Python, 687–688 data passing
copyright, image, 58 HTTP (GET and POST), 364–366, 444–453
core attributes, XHTML, 571 Linux Bash shell scripting, 425–426
count modifiers, matching, 674 to software, 555–556
counter object, automatic numbering, 206 XHTML forms, 128
CSS (Cascading Style Sheets) data types
border settings, 219–223, 238–239 JavaScript, 270–271
box formatting model, 215–218 Perl, 372–373
cascading, 167–169 PHP, 475
collapsing borders, 241–243 Python
defining styles, 166–167 dictionaries, 405–406
dynamic outlines, 224–225 lists, 404–405
HTML and, 164–165 numbers, 401
inheritance, 179 strings, 402–404
inline text formatting, 81–82 tuples, 406–407
levels, 165 database access, query, and report
margins, 223–224 MySQL, 434–438
matching elements, 173–178 Perl, 453–456
padding elements, 218–219 PHP, 523–526
positioning elements Python, 467–469
absolute, 248–249 date
fixed, 249–252 current
floating, 255–256 in document header, 475–477
layering, 258–261 writing to document, 325–327
relative, 246–248 graphical display, 338–341
specifying (top, right, bottom, and left proper- handling
ties), 252–254 JavaScript, 282, 617–620
static, 245–246 Perl examples, 439–444
syntax, 245 PHP, 509–514, 727–728
visibility, 261–262 Python examples, 457–460
properties and table attributes, 237–238 debugger, Perl symbolic, 651
property values, 172–173 decorations, text, 200–201
pseudoclasses, 179–181 default meta tag path, 17–18
pseudoelements, 181–185 defining styles, CSS, 166–167
purpose, 163 definition lists, 35, 538, 540–541
deleting text, 86–87, 538–539 document type definitions. See DTDs
descendant elements, matching by, 176–178 dollar sign ($), 475
design strategy, XML, 153–154 DOM (Document Object Model)
destructors, PHP objects, 491–492 changing nodes, 302–309
DevGuru JavaScript Language Index, 325 history, 291–292
DHTML (Dynamic HTML) JavaScript, 263, 264–265
collapsible lists, 317–319 node properties and methods, 295–296
JavaScript sample, 292–294
code blocks, animating, 356–361 traversing nodes, 296–302
menus, animating, 351–355 double forward slash (//), 474
styles, swapping, 348–351 do-while loop, 274, 482
moving elements, 319–321 drop caps, adding to first letter, 183–184
uses, 316–317 drop shadow text, 201
dictionary data, Python, 405–406 drop-down menu effect, 351–355
directory functions DTDs (document type definitions)
Perl, 670 document type tags, 14–15
PHP, 729 XHTML Basic 1.0, 14
directory, HTTP requests, 363–364 XHTML frame support, 122
document XML support, 155
CSS Dynamic HTML (DHTML)
layering, 258–261 collapsible lists, 317–319
matching, 173–178 JavaScript
padding, 218–219 code blocks, animating, 356–361
sizing, 256–258 menus, animating, 351–355
visibility, 261–262 styles, swapping, 348–351
DOM, accessing by ID, 316 moving elements, 319–321
external content, embedding, 552–553 uses, 316–317
form, inserting, 129 dynamic outlines, CSS, 224–225
header containing current date and time, 475–477
HTML tags
block divisions, 31–33 E
body section, 20–21 ECMA Specification, JavaScript, 324
body with two tables, 25 editing
described, 13 animated GIFs, 54–57
DOCTYPE, 10, 14–15 XML, 161
head section, 15–19 elements. See also tags
headings, 27–28 CSS
horizontal rules, 28–29 layering, 258–261
HTML, 10–11, 15 matching, 173–178
manual line breaks, 25–27 padding, 218–219
paragraphs, 23–24 sizing, 256–258
preformatted text, 30–31 visibility, 261–262
script section, 20 DOM, accessing by ID, 316
style section, 19 moving with JavaScript, 319–321
image map code sample, 66–67 XML, 156–157
intelligence, in JavaScript, 265 elif statement, 411–412
JavaScript object, 620–622 e-mail
master element, 545 address, obscuring, 327–329
moving with JavaScript, 319–321 errors and troubleshooting report, sending, 504–505
navigational pane, 119–120 PHP functions, 750
original/desired location, defining in XHTML (<base>), Python functions, 688–691
531–532 <embed> tag
URL, 70 older browsers, supporting, 150–151
XML, 156–157 representing non-HTML data with, 144–145
embedding operators, 383–384
external content in document, 552–553 special characters, 384–385
fonts, 213–214 substrings, memorizing, 386
emphasis, text, 560–561 PHP, 769–770
enclosing scripts, 266–267 Python
end of document, 15 described, 413
ending elements operations, 414
PHP, 473–474 special characters, 414–415
pseudoelements, 184–185 Extensible Markup Language (XML)
entities, user-defined, 158–159 attributes, 157
errors and troubleshooting comments, 157
JavaScript design strategy, 153–154
elements, animating, 360–361 DTDs, 155
form validating, 347 editing, 161
need, 286 elements, 156–157
syntax, 287–288 entities, user-defined, 158–159
tools, 287 namespaces, 159
Perl nonparsed data, 158
Apache Internal Server error message, 395–396 non-Web applications, 154
maximum reporting, 394–395 parsing, 161–162
PHP PHP functions, 771, 786–787
custom handling, 505 style sheets, 159–160
error level, controlling, 503–504 versions, 8
handling, 729–730 viewing documents, 160
identifying, 502–503 XSLT, 161
level, controlling, 503–504 Extensible Stylesheet Language (XSLT), 161
sending to file or e-mail address, 504–505
syntax, common, 501
tools, 500–501 F
Python fantasy fonts, 210
cgitb module, 421–422 field labels, 138
code, running in interpreter, 421 fieldsets, form, 138–140
error stream, redirecting, 422 file
escape characters accessing, 387–388
JavaScript, 606 binary
Perl, 674 Perl, 389–390
events PHP, 497–498
attributes, XHTML, 570–571 Python, 420
handlers, JavaScript, 284–286, 610–611 closing
object, JavaScript, 622–623 Perl, 389
exact size, element, 257 PHP, 497
exception handling Python, 420
JavaScript form validating, 347 errors and troubleshooting report, 504–505
Python, 421 form fields, 137
executing scripts, 267–268 functions
exporting data, Linux Bash shell, 426–428 Perl, 667–668
expressions PHP, 730-733
executing code based on value of (switch), in Python, 691–692
JavaScript, 277–278 handle and handle test functions, 666–667
matching, in Perl, 673 information, getting, 390–391
expressions, regular listing, 428–429
Perl locking, 499
examples, 385 miscellaneous, listed, 499–500
modifying, 385–386
file (continued) form
opening blocking changes to fields, 141–142
Perl, 388 button, custom text, 135–136
PHP, 495–496 CGI, sample, 432–434
Python, 417–418 checkboxes, 132
reading text deciphering and handling data
Perl, 388 Perl, 444–447
PHP, 496–497 PHP, 514–516
Python, 418–419 Python examples, 460–462
upload, 623–624 defining, 542–543
writing text described, 123–126
Perl, 389 dynamic calendar, creating
PHP, 497 Perl, 447–453
Python, 419 PHP, 517–523
File Transfer Protocol (FTP), 733–736 Python, 463–466
filename, in URL, 70 features, adding, 341–343
filesystem functions, PHP, 730–733 field labels, 138, 548–549
first letter of elements, property values, 183–184 fieldsets and legends, 138–140
first line of elements file fields, 137
indenting, 194–195 footer, 565–566
property values, 181–183 header, 567
first-child pseudoclasses, 180 heading, 566–567
fixed positioning elements, CSS, 249–252 hidden fields, 135
Flash, Shockwave id attribute, 130
<embed>and <object>tags, 150–151 images, 136
plugins, 149–150 input mechanism, 129–130, 546–547
floating CSS elements, 255–256, 592, 593 keyboard shortcuts, 140–141
floating page layout tables, 113–116 label, 541–542
floating point values, 401 legends, 138–140
floating text objects, 195–197 list boxes, 132–134
fonts name attribute, 130
described, 210 object, JavaScript, 313–314, 624–625
embedding, 213–214 options list, 558–559
formatting tag, 79 passing data, 128
line spacing, 213 password input boxes, 131
lists, formatting, 39 PHP handler, XHTML, 127–128
selection, 210–211 radio buttons, 131–132
sizing reset button, 137–138
CSS, 211–212, 584 selection options, hierarchy of, 554
XHTML, 79, 532–533, 559–560 submit button, 137–138
styling tab order, 140–141
CSS, 212–213, 582–587 tags, XHTML, 129
XHTML, 582–587 text areas, large, 134–135
footer, XHTML table, 103–105 text input boxes, 130–131, 565
for in loop, 276 validating, 265, 343–347
for loop values, setting options, 554–555
JavaScript, 275 formatting. See also CSS
Perl, 380–381 HTML documents, 10–11
PHP, 482–483 strings, Python operators, 404
Python, 411 text
foreach loop CSS inline control, 81–82
Perl, 381 font tag, 79
PHP, 483–484 inline attributes, 80–81
foreground colors, 227–228 nonbreaking spaces, 82–83
soft hyphens, 83–84
XHTML table column groups, 110
forward slash, asterisk (/*), 474 GIMP (GNU Image Manipulation Program), 58
frames, 121–122 graphical date display, 338–341
free-form polygonal area, image map, 64 graying out form field controls, 141–142
FTP (File Transfer Protocol), 733–736 grouping
functions columns, XHTML tables, 109–111, 537–538
JavaScript in-line text elements, 87–88
data manipulation, built-in, 279–280 Gutmans, Andi (PHP re-writer), 471
object, 625–626
top-level, 611–612
Perl H
arithmetic, 660 head section, document tags
array and list, 664–665 meta tags, 16–19, 543–544
conversion, 661 structure, 15–16
directory, 670 title, specifying, 16, 567
file and file handle test, 666–667 header
file operations, 667–668 cells, table, 100
input and output, 668–670 documents
miscellaneous, 672–673 current date and time, 475–477
networking, 671–672 XHTML, 543
search and replace, 665–666 HTTP data, encapsulating, 6, 364–366
string, 663 tables
structure, 661–663 columns, spanning, 106–107
system, 670–671 described, 103–105
Python headings
array, 680–681 capitalizing, CSS, 200
asynchronous communication, 681–684 HTML, 27–28
binary and ASCII code conversion, 684–685 XHTML table, 566
built-in listed, 675–680 height
CGI, 685–687 elements, specifying, 257, 590
cookie, 687–688 line spacing, controlling, 213, 590–591
email, 688–691 XHTML images, 62
file, 691–692 hidden form fields, 135
garbage collection, 692–693 hidden object, JavaScript, 626–627
HTTP and HTTPS protocol, 693–696 history
IMAP, 696 JavaScript object, 627
interpreter, 705–708 list, navigating, 315
operating system, 697–700 horizontal rules
POP3 server, 700–701 HTML, 28–29
pseudo-random numbers, obtaining, 708 XHTML, 544
SMTP, 701–703 horizontal table elements
sockets, 701, 703 columns spanning, 106–109
strings, 704–705 described, 99–100
URL, 709–715 horizontal text alignment, 189–191
hovering over link, coloring, 74
HTML (HyperText Markup Language)
G block divisions, 31–33
garbage collection, Python, 692–693 creation, 7
Gecko (Mozilla Firebird) DOM Reference, 325 CSS and, 164–165
GIF (Graphics Interchange Format) document tags
animation block divisions, 31–33
assembling, 55–56 body section, 20–21
described, 54–55 body with two tables, 25
output, 56–57 described, 13
source, 55 DOCTYPE, 10, 14–15
Web images, 51–52 head section, 15–19
HTML (HyperText Markup Language), document tags matching elements, CSS, 175
(continued) node, finding in DOM, 300–302
headings, 27–28 IDE (integrated development environment), PHP, 501
horizontal rules, 28–29 identifying problems
HTML, 10–11, 15 JavaScript
manual line breaks, 25–27 elements, animating, 360–361
paragraphs, 23–24 form validating, 347
preformatted text, 30–31 need, 286
script section, 20 syntax, 287–288
style section, 19 tools, 287
headings, 27–28 Perl
horizontal rules, 28–29 Apache Internal Server error message, 395–396
manual line breaks, 25–27 maximum reporting, 394–395
non-HTML content, 144–147 PHP
output for simple Web page, 11–12 custom handling, 505
paragraphs, 23–25 error level, controlling, 503–504
preformatted text, 30–31 handling, 729–730
source for simple Web page, 11 identifying, 502–503
standards, listed by number, 8–9 level, controlling, 503–504
tables, 89 sending to file or e-mail address, 504–505
tag, 9–10, 15 syntax, common, 501
text, 79 tools, 500–501
versions, listed, 8–9 Python
HTTP (HyperText Transfer Protocol). See also plugins cgitb module, 421–422
data encapsulation, 364–366 code, running in interpreter, 421
described, 4–7 error stream, redirecting, 422
form data IDLE (Integrated DeveLopment Environment), 398, 399
creating dynamic calendar, 517–523 if else loop
passing (GET and POST), 128, 364–366, 444–453 JavaScript, 276–277
PHP functions, 737 Perl, 381–382
port, standard, 70 PHP, 484–485
Python functions, 693–696 if loop
request and response, 363–364 JavaScript, 276–277
HTTPS protocol, Python function, 693–696 Perl, 381–382
hyperlinks Python, 411–412
anchor tag, 71–73 image maps
colors, 74 clickable regions, specifying, 64–66
described, 3–4 document code sample, 66–67
JavaScript object, 629–630 navigation, defining, 550–551
keyboard shortcuts and tab orders, 73–74 physical area, describing, 530–531
style sheet to documents, 166–167 specifying, 63
tag, 76–77, 550 images
target details, 75–76 aligning, 62–63
titles, 72–73 animation, 54–57
URLs, 69–71 background, 231–236
visited and unvisited, styles of, 180 borders, 62–63
XHTML elements, 528, 546 forms, 136
HyperText Markup Language. See HTML graphical date display, 338–341
HyperText Transfer Protocol. See HTTP inserting into Web documents, 58–60
interlaced and progressive storage and display, 54
irregularly shaped layouts, 116–118
I list item markers, 44–46, 204
Iconv library, 737 object, JavaScript, 627–628
ID PHP functions, 738–745
element, accessing by, 316 preloading, 333–335
form attributes, 130 rollovers, 335–337
size, 61–62 Java language, 263
table backgrounds, 106 JavaScript
text, specifying for nongraphical browsers, 60–61 calculations and operators, 272–274
transparency, 53 conditional expression, breaking loop to
Web formats, 51–53 (continue), 278
XHTML documents, 57–58 constants, 603
IMAP control structures
PHP functions, 745–750 breaking out of, 278
Python functions, 696 do while loop, 274
indenting text expression, executing code based on value of
CSS, 583 (switch), 277–278
XHTML, 194–195 for loop, 275
inheritance for in loop, 276
CSS, 179 if and if else loops, 276–277
list style, 39–40 while loop, 274–275
in-line text data manipulation functions, 279–280
formatting, 80–81, 560 data types, 270–271
grouping, 87–88 DHTML
input functions, Perl, 668–670 code blocks, animating, 356–361
<input> tag, 129–130 collapsible lists, 317–319
input, user. See form menus, animating, 351–355
inserting moving elements, 319–321
images, 58–60 styles, swapping, 348–351
text or content, 86–87, 547–548 uses, 316–317
inside value, list style, 39–40 document head section, 20
integers, Python supported, 401 DOM, 264–265
integrated development environment (IDE), PHP, 501 drawbacks of using, 323–324
Integrated DeveLopment Environment (IDLE), 398, 399 elements, accessing by ID, 316
interlaced image storage and display, 54 enclosing scripts, 266–267
internationalization attributes, XHTML, 571 errors and troubleshooting
Internet Explorer (Microsoft) need, 286
DOM support, 264 syntax, 287–288
TrueDoc fonts, using, 214 tools, 287
interpreter, Python, 398–400, 705–708 event handlers, 284–286, 610–611
invisible data, 135 executing scripts, 267–268
invisible elements forms
CSS, 261–262, 591 features, adding, 341–343
hidden form fields, 135 validating, 343–347
JavaScript object, 626–627 functions, 611–612
irregularly shaped graphic and text layouts, 116–118 guidelines for using, 324
italic text history, 263
CSS, 212–213, 585 identifying problems, 288–290
HTML, 84–85 images
XHTML, 545–546 graphical date display, 338–341
item, lists, 201–202 preloading, 333–335
item marker rollovers, 335–337
image, 204, 581 implementations, different, 264
positioning, 203–204, 580 methods, 610
style, 43–46 objects
anchor, 612–613
area, 613
J array, 613–614
JASC Software Boolean, 614–615
Animation Shop, 54–57 button, 615–616
Paint Shop Pro, 57
JavaScript, objects (continued)
date, 617–620
described, 281
document, 620–622
event, 622–623
file upload, 623–624
form, 624–625
function, 625–626
hidden, 626–627
history, 627
image, 627–628
link, 629–630
location, 630–631
math, 631–632
navigator, 632
number, 633
object, 633
option, 634
password, 634–635
radio, 635–636
RegExp, 636
Reset, 637
Screen, 637–638
Select, 638–639
String, 639–641
Submit, 641–642
Text, 642–643
Textarea, 643–644
user-created, 283–284
Window, 644–647
objects, built-in
current document, 312–313
form elements, 313–314
history list, navigating, 315
math operations, 282
reference-making element (self), 315–316
URL information, manipulating, 314–315
XHTML document (window), 311–312
operators, 604–606
properties, 610
statements, marking for reference (label_name), 278–279
syntax, 269
text, writing to document
current date, 325–327
e-mail address, obscuring, 327–329
user-defined functions, 280–281
uses, 265
variables, 271
Web resources, 324–325
window
opening another, 329–331
text, writing, 331–333
JPEG (Joint Photographic Experts Group) format, 52
JSUnit troubleshooting tool, 290
K
keyboard shortcuts
forms, 140–141
input, indicating, 548
links, 73–74
L
label
document, 16
form, 541–542
lambda keyword, Python anonymous functions, 417
language
element style, 181
encoding, 557
large text areas, form, 134–135
last control structure, Perl, 382–383
layering elements, CSS, 258–261, 593
layout
irregularly shaped graphic and text, 116–118
multiple-column pages, 120–121
layout tables, page
described, 111–112, 244
floating page, 113–116
multiple columns, 120–121
navigational blocks, 119–120
odd graphic and text combinations, 116–118
left, positioning elements, 252–254
legends, form, 138–140
Lerdorf, Rasmus (PHP creator), 471
letter
first of elements, property values, 183–184
spacing style, 198–199, 586
levels, CSS, 165
line
breaks, manual, 25–27, 534
first of elements
indenting, 194–195
property values, 181–183
spacing, fonts, 213
links
anchor tag, 71–73
colors, 74
JavaScript object, 629–630
keyboard shortcuts and tab orders, 73–74
style sheet to documents, 166–167
tag, 76–77, 550
target details, 75–76
titles, 72–73
URLs, 69–71
visited and unvisited, styles of, 180
Linux
Bash shell scripting
Apache, configuring, 424–425
exporting data, 426–428
file, listing, 428–429 if else
passing data, 425–426 JavaScript, 276–277
state, toggling, 429 Perl, 381–382
user-specific command, running, 429–430 PHP, 484–485
PHP Perl, 658–660
running, 472 lowercase text, changing to, 583
troubleshooting tool, 501
list box, form, 132–134
lists M
collapsible, creating with DHTML, 317–319 Macintosh
custom numbering, automatic, 208–209 PHP troubleshooting tool, 501
definition, 46–47 Python interpreter, 399
described, 35–36 MacPython, 399
item, formatting, 201–202 Macromedia Freehand and Fireworks, 57
markers manual line breaks, 25–27, 534
image, 204, 581 maps, image
positioning, 203–204, 580 clickable regions, specifying, 64–66
setting, 202–203, 580 document code sample, 66–67
nesting, 47–48 navigation, defining, 550–551
ordered, 37–42 physical area, describing, 530–531
Perl functions, 664–665 specifying, 63
Python data types, 404–405 margins
text, formatting, 201–204 CSS, 223–224, 588
unordered, 42–46, 569 floating objects, 195–197
XHTML element, 549–550 marker, list
location object, JavaScript, 630–631 image, 204, 581
locking files, PHP, 499 positioning, 203–204, 580
logical operators style, 43–46
JavaScript, 604 master element, XHTML documents, 545
Perl, 378, 653 math operations
Python, 409 JavaScript object, 631–632
logo, page layout, 116–118 numbers, strings, and dates, 282
loops PHP functions, 751–753
breaking out of (break), 278 menus
continue statement animating, 351–355
JavaScript, 278 navigational blocks, 119–120
Perl, 382–383 meta tags
PHP, 486 automatic refresh and redirect, 18–19
Python, 412–413 default path, 17–18
do-while, 274, 482 search engine information, 17
for server, overriding, 19
JavaScript, 275 syntax, 16
Perl, 380–381 user agent caching, 18
PHP, 482–483 metadata, document, 551–552
Python, 411 methods
for in, 276 JavaScript, 610
foreach objects, assigning, 284
Perl, 381 PHP objects, 492–493
PHP, 483–484 Python strings, 403–404
if Microsoft Internet Explorer
JavaScript, 276–277 DOM support, 264
Perl, 381–382 TrueDoc fonts, using, 214
Python, 411–412 Microsoft Windows
PHP, 472, 500
Python interpreter, 398
Microsoft Windows Picture Viewer, 58 ordinal, changing position of, 39–41
MIDI sound file, adding, 147–148 starting number, changing, 41–42
MIME (Multipurpose Internet Mail Extensions) within unordered list, 47–48
PHP functions, 753 numbering, automatic
XHTML script tag support, 266 chapter and section number example, 207–208
minimum or maximum size, element, 257, 590 counter object, 206
modules counting, 582
Perl, 393, 657–658 described, 205
Python, 398 list, custom, 208–209
monospaced text, 85, 211, 568–569 text, 205–209
mouse value, changing counter’s, 206–207
cursor, CSS, 599 numbers. See also math operations
images, changing in JavaScript, 335–337 JavaScript object, 270, 633
menus, animating with drop-down effect, 351–355 ordered list style, changing, 37–39
movie data, storing, 493–494 Perl, 372–373
moving DHTML elements, 319–321 Python, 401
Mozilla Firebird, 264, 325
Mozilla Firefox, 288
MS SQL function, PHP, 755–757 O
MSDN Web Development Library, 324 object
multiple column page layout, 120–121 JavaScript
Multipurpose Internet Mail Extensions (MIME) anchor, 612–613
PHP functions, 753 area, 613
XHTML script tag support, 266 array, 613–614
music, background, 147–148 Boolean, 614–615
MySQL built-in, 282
CGI data managing, 434–438 button, 615–616
data access with Perl, 453–456 checkbox, 616–617
PHP functions, 757–760 date, 617–620
described, 281
document, 620–622
N event, 622–623
name file upload, 623–624
form attribute, 130 form, 624–625
matching elements, CSS, 173–174 function, 625–626
namespaces, XML, 159 hidden, 626–627
navigating history, 627
history list, 315 image, 627–628
JavaScript object, 632 link, 629–630
nodes, DOM, 296–302 location, 630–631
navigational blocks math, 631–632
image map, 550–551 navigator, 632
page layout tables, 119–120 number, 633
nesting lists, 47–48 object, 633
networking functions option, 634
Perl, 671–672 password, 634–635
PHP, 760–761 radio, 635–636
newspaper-like columns, documents, 120–121 references, incorrect, 288
next control structure, Perl, 382–383 RegExp, 636
node properties and methods, DOM, 295–296 Reset, 637
nonbreaking spaces, 82–83 Screen, 637–638
nonparsed data, XML, 158 Select, 638–639
numbered list String, 639–641
described, 35, 37, 553–554 Submit, 641–642
number style, changing, 37–39 Text, 642–643
Textarea, 643–644 bitwise, 410
user-created, 283–284 comparison, 409
Window, 644–647 logical, 409
JavaScript built-in miscellaneous, 410
current document, 312–313 strings, 402–403
form elements, 313–314 option
history list, navigating, 315 JavaScript object, 634
reference-making element, 315–316 PHP functions, 766–768
URL information, manipulating, 314–315 ordered list
XHTML document, 311–312 described, 35, 37, 553–554
Perl number style, changing, 37–39
constructors, 392 ordinal, changing position of, 39–41
nomenclature, 391–392 starting number, changing, 41–42
property values, accessing, 392–393 within unordered list, 47–48
PHP ordinal, changing position of
class definitions, 491 ordered lists, 39–41
constructors and destructors, 491–492 unordered lists, 46
functions, 725–726 orphans, CSS printing, 598–599
methods and properties, 492–493 output functions
movie data, storing, 493–494 Perl, 668–670
Python, 420 PHP buffering, 765
<object> tag outside value, list style, 39–40
history of, 146–147 overflow, controlling, 257–258, 589
older browsers, supporting, 150–151
oblique text, 585
ODBC function, PHP, 761–764 P
Official PHP Web Site, 508 packages, Perl, 657–658
online resources, 325 padding, CSS blocks, 588–589
open source image applications, 57–58 page layout tables
opening described, 111–112, 244
another window, JavaScript, 329–331 floating page, 113–116
files multiple columns, 120–121
Perl, 388 navigational blocks, 119–120
PHP, 495–496 odd graphic and text combinations, 116–118
Python, 417–418 pages, printing, 598–599
OpenType fonts, 213, 214 Paint Shop Pro (JASC Software), 57
operating systems paragraphs
image applications, 58 color, defining style, 228
keyboard shortcuts, differentiating, 73–74 HTML, 23–25
Python functions, 697–700 XHTML, 555
operations, Python regular expressions, 414 parentheses (()), 386
operators parsing XML, 161–162
JavaScript, 604–606 passing data
Perl HTTP (GET and POST), 364–366, 444–453
arithmetic, 377, 652 Linux Bash shell scripting, 425–426
assignment, 377, 652 to software, 555–556
bitwise, 378, 653 XHTML forms, 128
comparison, 378, 652–653 password
logical, 378, 653 form input boxes, 131
miscellaneous, 379, 654 JavaScript object, 634–635
regular expressions, 383–384 path
string, 379, 654 absolute versus relative, 71
PHP, 479–481 default, 17–18
Python PCRE function, PHP, 766
arithmetic, 408 PEAR (PHP Extension and Application Repository), 508
assignment, 408–409
Perl (Practical Extraction and Report Language) logical, 378, 653
built-in functions, 386–387 miscellaneous, 379, 654
for CGI, 393–394 string, 379, 654
command line arguments, 649–650 regular expressions
control structures examples, 385
continue, 382–383 listed, 673–674
for loop, 380–381 modifying, 385–386
foreach loop, 381 operators, 383–384
if and if else, 381–382 special characters, 384–385
last, 382–383 substrings, memorizing, 386
next, 382–383 resources, 371–372
redo, 382–383 special variables, 374–377
while and until loops, 380 string tokens, 379
data types, 372–373 syntax, 372
debugger, symbolic, 650–651 user-defined functions, 387
errors and troubleshooting variables, 373, 655–657
maximum reporting, 394–395 PHP Extension and Application Repository (PEAR), 508
message, Apache Internal Server, 395–396 PHP (PHP: Hypertext Preprocessor)
examples beginning and ending tags, 473–474
database access, 453–456 break and continue statements, 486
date and time handling, simple calendar, 439–444 built-in functions, listed, 486–490
form data, creating dynamic calendar, 447–453 command terminal character and blocks of code, 474
form data, deciphering and dealing with, 444–447 commenting code, 474
script, 368–369 database, querying and reporting, 523–526
file operations date and time handling (simple calendar), 509–514
binary, manipulating, 389–390 do-while loop, 482
closing, 389 errors and troubleshooting
information, getting, 390–391 custom handling, 505
opening, 388 error level, controlling, 503–504
reading text, 388 identifying, 502–503
writing text, 389 sending to file or e-mail address, 504–505
functions syntax, common, 501
arithmetic, 660 tools, 500–501
array and list, 664–665 file operations
conversion, 661 binary, 497–498
directory, 670 closing, 497
file and file handle test, 666–667 locking, 499
file operations, 667–668 miscellaneous, listed, 499–500
input and output, 668–670 opening, 495–496
miscellaneous, 672–673 reading text, 496–497
networking, 671–672 writing text, 497
search and replace, 665–666 for loop, 482–483
string, 663 foreach loop, 483–484
structure, 661–663 form handler
system, 670–671 creating dynamic calendar, 517–523
history, 371 deciphering and handling, 514–516
modules, 393 logging data to file, 127
objects security, 128
constructors, 392 functions
nomenclature, 391–392 Apache, 717–718
property values, accessing, 392–393 array, 718–722
operators BCMath, 722–723
arithmetic, 377, 652 BZip2, 723–724
assignment, 377, 652 calendar, 724–725
bitwise, 378, 653 character type, 726
comparison, 378, 652–653 class/object, 725–726
curl, 726–727 plugins
date and time, 727–728 described, 143–144
directory, 729 <embed>tag, representing non-HTML data with,
email, 750 144–145
error handling, 729–730 MIDI sound file, adding, 147–148
filesystem, 730–733 <object> tag, 146–147
FTP, 733–736 older, Netscape-based browsers, supporting, 150–151
handling, 736 parameters, 147
HTTP, 737 Shockwave Flash, adding, 149–150
Iconv library, 737 plus sign (+), 385
image, 738–745 PNG (Portable Network Graphics) format, 52–53
IMAP, 745–750 polygonal area, image map, 64
math, 751–753 POP3 server function, Python, 700–701
MIME, 753 port number, URL, 70
miscellaneous, 754–755 positioning
MS SQL, 755–757 background images, 236
MySQL, 757–760 blocks, 591–593
network, 760–761 elements
ODBC, 761–764 absolute, 248–249
options and info, 766–768 fixed, 249–252
output buffering, 765 floating, 255–256
PCRE, 766 layering, 258–261
programs, executing, 769 relative, 246–248
session, 770–771 specifying (top, right, bottom, and left proper-
simple XML, 771 ties), 252–254
socket, 772–773 static, 245–246
SQLite, 774–776 syntax, 245
streams, 776–778 visibility, 261–262
strings, 778–784 pound sign (#)
URL, 784 identifiers, matching style elements by, 175
variable, 784–785 PHP commenting, 474
XML, 786–787 Practical Extraction and Report Language. See Perl
ZLib, 787 preformatted text, HTML, 30–31
history, 471–472 preloading images with JavaScript, 333–335
if/else construct, 484–485 premade images, 58
objects printing
class definitions, 491 CSS, 598–599
constructors and destructors, 491–492 to text file, in Perl, 389
methods and properties, 492–493 programs, PHP functions executing, 769
movie data, storing, 493–494 progressive image storage and display, 54
operators, 479–481 property
regular expressions, 769–770 borders shortcut, 222
requirements, 472–473 CSS attributes, 237–238
resources, 508 JavaScript, 610
script, sample, 475–477 PHP objects, 492–493
switch construct, 485–486 values
user-defined functions CSS, 172–173
arguments, 490–491 Perl objects accessing, 392–393
return value, 490 protocol, URL section, 69–70
variable scope, 491 pseudoclasses
variables, 475 anchor styles, 180
when to use, 507–508 described, 179
while loop, 482 first-child element, 180
white space, use of, 474 language, changing by, 181
PHPBuilder Web site, 508
pseudoelements operators
beginning and ending elements, 184–185 arithmetic, 408
CSS, 181–185 assignment, 408–409
first letter, specifying, 183–184 bitwise, 410
first line, specifying, 181–183 comparison, 409
pseudo-random numbers, Python functions logical, 409
obtaining, 708 miscellaneous, 410
Python regular expressions
anonymous functions (lambda keyword), 417 described, 413
built-in functions, 415–416 operations, 414
control structures special characters, 414–415
continue and break, 412–413 resources, 398
for loop, 411 syntax, 400
if and elif statements, 411–412 troubleshooting
try statement, 412 cgitb module, 421–422
while loop, 410–411 code, running in interpreter, 421
data types error stream, redirecting, 422
dictionaries, 405–406 user-defined functions, 416–417
lists, 404–405 variable scope, 407–408
numbers, 401 PythonWin, 398
strings, 402–404
tuples, 406–407
errors and exception handling, 421 Q
examples query, database
date and time handling, 457–460 MySQL, 434–438
form data, deciphering and dealing with, 460–462 Perl, 453–456
file operations PHP, 523–526
binary, handling, 420 Python, 467–469
closing, 420 quirks, browser reference, 325
opening, 417–418 quotation, enclosing, 533
reading from text file, 418–419 quotation marks (“)
writing to text file, 419 adding with styles, 184–185
functions autogenerating in text, 205
array, 680–681 block, offsetting, 31–33
asynchronous communication, 681–684
binary and ASCII code conversion, 684–685
built-in listed, 675–680 R
CGI, 685–687 radio buttons
cookie, 687–688 forms, 131–132
email, 688–691 JavaScript object, 635–636
file, 691–692 reading file text
garbage collection, 692–693 Perl, 388
HTTP and HTTPS protocol, 693–696 PHP, 496–497
IMAP, 696 Python, 418–419
interpreter, 705–708 read-only form fields, XHTML, 141–142
operating system, 697–700 rectangular area, image map, 64
POP3 server, 700–701 redirect meta tags, automatic, 18–19
pseudo-random numbers, obtaining, 708 redo control structure, Perl, 382–383
SMTP, 701–703 references
sockets, 701, 703 back to reference (self), 315–316
strings, 704–705 external, declaring as system entity, 158–159
URL, 709–715 labeling, 278–279
history, 397 object, incorrect, 288
interpreter, 398–400 refresh meta tags, automatic, 18–19
modules, 398 RegExp object, JavaScript, 636
objects, 420
regular expressions identifier, 175
Perl name, 173–174, 600
examples, 385 universal selector (*), 174
modifying, 385–386 semicolon (;)
operators, 383–384 JavaScript, 287
special characters, 384–385 Perl, 372
substrings, memorizing, 386 PHP, 474, 501, 502
PHP, 769–770 serif fonts, 210
Python server
described, 413 Apache
operations, 414 Internal Server error message, Perl and, 395–396
special characters, 414–415 Linux Bash shell, configuring to deliver, 424–425
relative paths, 17, 71 PHP functions, 472, 717–718
relative positioning, CSS elements, 246–248 CGI, 367–368
repeated background images, 232–235 defined, 1
request and response, HTTP, 363–364 documents, delivering, 4
reset button, forms, 137–138 meta tags overriding, 19
Reset object, JavaScript, 637 name in URL, 69–70
return value, PHP user-defined function, 490 session function, PHP, 770–771
right, positioning elements, 252–254 SGML (Standard Generalized Markup Language), 7
rollovers Shockwave Flash
images, changing in JavaScript, 335–337 <embed> and <object> tags, 150–151
menus, animating with drop-down effect, 351–355 plugins, 149–150
rows, XHTML tables shortcuts, keyboard
columns spanning, 106–109 forms, 140–141
described, 99–100 input, indicating, 548
RSS feed, 156–157 links, 73–74
rules, XHTML shorthand expressions, CSS, 185–187
documents, 544 sibling elements, matching by, 176–178
tables, 96–99 Simple Mail Transfer Protocol (SMTP) functions,
701–703
simple XML function, PHP, 771
S sizing
sans serif fonts, 210 elements, 257–258
scalar values, 372 fonts
Screen object, JavaScript, 637–638 CSS, 211–212, 584
script section, 20, 558 XHTML, 79, 532–533, 559–560
scripting, Bash shell images, 61–62
Apache, configuring, 424–425 small text (<small>), 86, 559–560
exporting data, 426–428 SMTP (Simple Mail Transfer Protocol) functions,
file, listing, 428–429 701–703
passing data, 425–426 socket functions
state, toggling, 429 PHP, 772–773
user-specific command, running, 429–430 Python, 701, 703
scrolling soft hyphens, text formatting, 83–84
background images, 232–235 software
viewers’, fixing document positions despite, 249–252 output, sample, 557
search and replace functions, Perl, 665–666 values, passing, 555–556
search engine meta tags, 17 software plugins
Select object, JavaScript, 638–639 described, 143–144
selection fonts, 210–211 <embed> tag, representing non-HTML data with, 144–145
selectors, CSS matching MIDI sound file, adding, 147–148
attributes by specific, 175–176, 600 <object> tag, 146–147
child, descendant, and adjacent sibling elements, older, Netscape-based browsers, supporting, 150–151
176–178, 600–601 parameters, 147
class, 174–175 Shockwave Flash, adding, 149–150
source citation, 536 styling fonts, 212–213
source code, JavaScript, 265 submit button
spaced data, presenting, 30–31 forms, 137–138
spacing JavaScript object, 641–642
borders, 223 subroutines, Perl, 657–658
letter and word styles, setting, 198–199, subscript (<sub>), 85–86, 561–562
239–240, 586 substrings, memorizing, 386
spam, 327–329 suites, code, 400
special characters superscript (<sup>), 85–86, 562
Perl, 384–385 Suraski, Zeev (PHP re-writer), 471
Python regular expressions, 414–415 switch construct, PHP, 485–486
special variables, Perl, 374–377 system functions, Perl, 670–671
spelling errors, Python variable name, 408
SQLite function, PHP, 774–776
square brackets ([]) T
element attributes, matching styles, 175–176 tab order
Perl regular expression, 385 forms, 140–141
stacking elements, CSS, 258–261, 593 input, indicating (kbd), 548
Standard Generalized Markup Language (SGML), 7 links, 73–74
starting list number, 41–42 tabbed data, presenting, 30–31
state, Linux Bash shell scripting, 429 tabbed elements, accessing by keyboard, 73–74
statements tables
JavaScript, 278–279, 606–609 captions
Perl, 657–660 aligning and positioning, 244
static positioning, CSS elements, 245–246 defining, 535–536
stream functions, PHP, 776–778 CSS properties, 237–243, 596–598
strings HTML, 89
JavaScript layout, 244
math operations, 282 MySQL, populating, 434–438
object, 639–641 text, changing color following mouse movement, 285
operators, 606 XHTML
support, 270 backgrounds, 105–106
Perl body, main, 563–564
described, 373 borders and rules, 96–99
functions, 663 captions, 102–103
operators, 379, 654 cell spacing and padding, 94–95
PHP functions, 778–784 cells, specifying, 100–101, 564
Python content, defining, 562–563
described, 402 database, accessing and reporting data, 523–526
format operators, 404 frames, 121–122
functions, 704–705 grouping columns, 109–111
methods, 403–404 header, footer, and body sections, 103–105, 567
operators, 402–403 page layout, using for, 111–121
structure functions, Perl, 661–663 parts, 89–91
style rows, 99–100, 568
borders, 220–221 spanning columns and rows, 106–109
definition format, 171–172 width and alignment, 91–94
DHTML, swapping, 348–351 tags
differences from main sheet, enabling, 167–169 form, 129
document tag section, 19 HTML document
rules, defining in XHTML, 561 block divisions, 31–33
style sheets. See also CSS body section, 20–21
external, referring to, 19 body with two tables, 25
purpose, 163 described, 9–10, 13
XHTML formatting, 164–165 DOCTYPE, 10, 14–15
XML, 159–160 head section, 15–19
headings, 27–28 writing to document, JavaScript
horizontal rules, 28–29 current date, 325–327
HTML, 10–11, 15 e-mail address, obscuring, 327–329
manual line breaks, 25–27 Textarea object, JavaScript, 643–644
paragraphs, 23–24 tiling background images, 232–235
preformatted text, 30–31 time
script section, 20 current, in document header, 475–477
style section, 19 handling
link, 76–77 JavaScript, 282
paragraph, 23–24 Perl examples, 439–444
target details, 75–76 PHP, 509–514
TCP/IP (Transmission Control Protocol/Internet Python examples, 457–460
Protocol), 1 PHP functions, 727–728
teletype tag (<tt>), 85 title, document, 16, 567
telnet client, 5–7 tools, errors and troubleshooting
term, XHTML definition, 539, 540 JavaScript, 290
test functions, Perl file and file handle, 666–667 PHP, 500–501
text top, positioning elements, 252–254
abbreviations, 87 Transmission Control Protocol/Internet Protocol
big, 85–86, 561–562 (TCP/IP), 1
bold and italic, 84–85 transparency, image, 53
emphasis, 541, 560 trapping errors, 289
files, reading from and writing to traversing nodes, DOM, 296–302
PHP, 496–497 troubleshooting
Python, 418–419 JavaScript
writing to, 419 elements, animating, 360–361
form input boxes, 130–131 form validating, 347
formatting, XHTML, 79–84 need, 286
grouping in-line elements, 87–88 syntax, 287–288
insertions and deletions, 86–87 tools, 287
irregularly shaped layouts, 116–118 Perl
italic, 545–546 Apache Internal Server error message, 395–396
JavaScript maximum reporting, 394–395
object, 642–643 symbolic debugger, 651
writing in window, 331–333 PHP
monospaced, 85 custom handling, 505
rendering, specified (bdo), 532 error level, controlling, 503–504
small, 85–86, 561–562 handling, 729–730
specifying for nongraphical browsers, 60–61 identifying, 502–503
styles, setting level, controlling, 503–504
aligning, 189–194 sending to file or e-mail address, 504–505
capitalization, 200 syntax, common, 501
decorations, 200–201 tools, 500–501
direction, handling different languages, 587 Python
displaying, 581 cgitb module, 421–422
floating objects, 195–197 code, running in interpreter, 421
fonts, 210–214 error stream, redirecting, 422
indenting, 194–195 XHTML tables, 96
letter and word spacing, 198–199 TrueDoc fonts, 213, 214
lists, formatting, 201–204 true/false condition, executing code. See if else
numbering, automatic, 205–209, 582 loop; if loop
quotation marks, 205, 582 try statement, Python, 412
white space, preserving, 198 try/catch statement, 289
subscript, 85–86, 561–562 tuple data type, 406–407
superscript, 85–86, 561–562
typefaces user-defined entities, XML, 158–159
described, 210 user-defined functions
embedding, 213–214 JavaScript, 280–281
formatting tag, 79 Perl, 387
line spacing, 213 PHP, 490–491
lists, formatting, 39 Python, 416–417
selection, 210–211 user-specific command, running in Linux Bash shell
sizing script, 429–430
CSS, 211–212, 584
XHTML, 79, 532–533, 559–560
styling V
CSS, 212–213, 582–587 validating forms, JavaScript, 343–347
XHTML, 582–587 value
automatic numbering, changing, 206–207
stepping through range (for loop), 275
U troubleshooting, basic, 289
underlining text, 583 van Rossum, Guido (Python language inventor),
Unicode text, 587 variable
Uniform Resource Locator. See URL variable
universal selector (*), matching elements by, 174 JavaScript, 271, 288, 289
UNIX, 398, 399 Perl, 373, 655–657
unordered list PHP, 475, 501, 784–785
described, 35, 42–43, 569 Python, 407
item marker, changing, 43–46 XHTML, 569–570
ordered lists within, 47–48 variable scope
ordinal, changing position of, 46 PHP, 491
until loop, Perl, 380 Python, 407–408
uppercase text, changing to, 583 vertical text alignment, 191–194, 591
URL (Uniform Resource Locator) viewers’ scrolling, fixing document positions despite,
absolute versus relative paths, 71 249–252
components, 69–70 viewing documents, 160
data, passing to CGI script, 425–426 visibility, positioning elements, 261–262
described, 4 visited link, coloring, 74
form data, passing, 128
HTTP data, encapsulating, 364–366
information, manipulating (location), 314–315 W
links, 69–71 Wall, Larry (Perl language inventor), 371
PHP functions, 784 W3C (World Wide Web Consortium)
Python functions, 709–715 DOM specification, 291, 324
user agent Web specifications, overall, 7
caching with meta tags, 18 Web
default path, setting, 17 creating, 3–4
DOM support, listed, 264 described, 1–2
formatting shortcuts, reasons to avoid, 10 HTTP, 4–7
JavaScript support lacking, 323 Web browser
meta tags caching, 18 caching with meta tags, 18
refreshing and reloading after specified time, 18–19 default path, setting, 17
scripting language, unsupported, 552 DOM support, listed, 264
Web servers, connecting, 1–2 formatting shortcuts, reasons to avoid, 10
window, another JavaScript support lacking, 323
opening, 329–331 meta tags caching, 18
outputting text, 331–333 refreshing and reloading after specified time, 18–19
XML, usefulness of, 154, 160 scripting language, unsupported, 552
user input. See form Web servers, connecting, 1–2
user-created objects, JavaScript, 283–284
window, another
opening, 329–331
outputting text, 331–333
XML, usefulness of, 154, 160
Web document
CSS
layering, 258–261
matching, 173–178
padding, 218–219
sizing, 256–258
visibility, 261–262
DOM, accessing by ID, 316
external content, embedding, 552–553
form, inserting, 129
header containing current date and time, 475–477
HTML tags
block divisions, 31–33
body section, 20–21
body with two tables, 25
described, 13
DOCTYPE, 10, 14–15
head section, 15–19
headings, 27–28
horizontal rules, 28–29
HTML, 10–11, 15
manual line breaks, 25–27
paragraphs, 23–24
preformatted text, 30–31
script section, 20
style section, 19
image map code sample, 66–67
intelligence, in JavaScript, 265
JavaScript object, 620–622
master element, 545
moving with JavaScript, 319–321
navigational pane, 119–120
original/desired location, defining in XHTML (<base>), 531–532
URL, 70
XML, 156–157
Web image formats
GIF, 51–52
JPEG, 52
PNG, 52–53
Web resources
JavaScript, 324–325
Perl, 371–372
PHP, 508
Python, 398
while loop
JavaScript, 274–275
Perl, 380
PHP, 482
Python, 410–411
white space
compressing, 198
CSS, 587
explicitly including, 83
floating objects, 195–197
HTML documents, reasons to use, 10
PHP, 474
preserving, 198, 556
XHTML tables used for layout, 118
widows, CSS printing, 599
width
borders, 219–220
elements, specifying, 257
images, XHTML, 62
margins, CSS, 588
tables, XHTML, 91–94
wildcard, style matching, 174
window, browser
opening another, 329–331
text, writing, 331–333
Window object, JavaScript, 644–647
Windows (Microsoft)
PHP, 472, 500
Python interpreter, 398
Windows Picture Viewer (Microsoft), 58
word spacing, CSS, 198–199, 586
World Wide Web
creating, 3–4
described, 1–2
HTTP, 4–7
World Wide Web Consortium (W3C)
DOM specification, 291, 324
Web specifications, overall, 7
writing
to browser window in JavaScript, 331–333
to document in JavaScript
date, current, 325–327
e-mail address, obscuring, 327–329
text, writing to document
to text file
Perl, 389
PHP, 497
Python, 419

X
XHTML
attributes
core, 571
events, 570–571
internationalization, 571
document, built-in JavaScript object, 311–312
hyperlink (<a>), 528
PHP date and time routines, 509–514
XHTML (continued)
tables
backgrounds, 105–106
body, main, 563–564
borders and rules, 96–99
captions, 102–103
cell spacing and padding, 94–95
cells, specifying, 100–101, 564
content, defining, 562–563
database, accessing and reporting data, 523–526
frames, 121–122
grouping columns, 109–111
header, footer, and body sections, 103–105, 567
page layout, using for, 111–121
parts, 89–91
rows, 99–100, 568
spanning columns and rows, 106–109
width and alignment, 91–94
tips for using, 527
versions, 9
XML (Extensible Markup Language)
attributes, 157
comments, 157
design strategy, 153–154
DTDs, 155
editing, 161
elements, 156–157
entities, user-defined, 158–159
namespaces, 159
nonparsed data, 158
non-Web applications, 154
parsing, 161–162
PHP functions, 771, 786–787
style sheets, 159–160
versions, 8
viewing documents, 160
XSLT, 161
XSLT (Extensible Stylesheet Language), 161

Z
z-axis, 258–261, 593
ZLib functions, PHP, 787