1 files changed, 604 insertions, 295 deletions
diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml
index 82c4ffe697..04fcce1985 100644
--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@@ -1,102 +1,106 @@
 <!--
-$Header: /cvsroot/pgsql/doc/src/sgml/query.sgml,v 1.17 2001/01/13 23:58:55 petere Exp $
+$Header: /cvsroot/pgsql/doc/src/sgml/query.sgml,v 1.18 2001/09/02 23:27:49 petere Exp $
 -->
 
- <chapter id="query">
-  <title>The Query Language</title>
-
-  <para>
-   The  <productname>Postgres</productname>  query language is a variant of
-   the <acronym>SQL</acronym> standard. It
-   has many extensions to <acronym>SQL</acronym> such as an
-   extensible type  system,
-   inheritance,  functions and production rules. These are
-   features carried over from the original
-   <productname>Postgres</productname>  query
-   language,  <productname>PostQuel</productname>.
-   This section provides an overview
-   of how to use <productname>Postgres</productname>
-   <acronym>SQL</acronym>  to  perform  simple  operations.
-   This manual is only intended to give you an idea of our
-   flavor of <acronym>SQL</acronym> and is in no way a complete  tutorial  on
-   <acronym>SQL</acronym>.  Numerous  books  have  been  written  on
-   <acronym>SQL92</acronym>, including
-   <xref linkend="MELT93" endterm="MELT93"> and
-   <xref linkend="DATE97" endterm="DATE97">.
-   You should be  aware  that  some language features 
-   are extensions to the standard.
-  </para>
-
-  <sect1 id="query-psql">
-   <title>Interactive Monitor</title>
-
-   <para>
-    In the examples that follow, we assume  that  you  have
-    created  the mydb database as described in the previous
-    subsection and have started <application>psql</application>.
-    Examples  in  this  manual  can  also   be   found   in source distribution
-    in the directory <filename>src/tutorial/</filename>.    Refer   to   the
-    <filename>README</filename> file in that directory for how to use them.   To
-    start the tutorial, do the following:
+ <chapter id="tutorial-sql">
+  <title>The <acronym>SQL</acronym> Language</title>
+
+  <sect1 id="tutorial-sql-intro">
+   <title>Introduction</title>
+
+   <para>
+    This chapter provides an overview of how to use
+    <acronym>SQL</acronym> to perform simple operations.  This
+    tutorial is only intended to give you an introduction and is in no
+    way a complete tutorial on <acronym>SQL</acronym>.  Numerous books
+    have been written on <acronym>SQL92</acronym>, including <xref
+    linkend="MELT93" endterm="MELT93"> and <xref linkend="DATE97"
+    endterm="DATE97">.  You should be aware that some language
+    features are extensions to the standard.
+   </para>
+
+   <para>
+    In the examples that follow, we assume that you have created a
+    database named <quote>mydb</quote>, as described in the previous
+    chapter, and have started <application>psql</application>.
+   </para>
+
+   <para>
+    Examples in this manual can also be found in the source distribution
+    in the directory <filename>src/tutorial/</filename>.  Refer to the
+    <filename>README</filename> file in that directory for how to use
+    them.  To start the tutorial, do the following:
 
 <screen>
-<prompt>$</prompt> <userinput>cd <replaceable>...</replaceable>/src/tutorial</userinput>
+<prompt>$</prompt> <userinput>cd <replaceable>....</replaceable>/src/tutorial</userinput>
 <prompt>$</prompt> <userinput>psql -s mydb</userinput>
 <computeroutput>
-Welcome to the POSTGRESQL interactive sql monitor:
-  Please read the file COPYRIGHT for copyright terms of POSTGRESQL
-
-   type \? for help on slash commands
-   type \q to quit
-   type \g or terminate with semicolon to execute query
- You are currently connected to the database: postgres
+...
 </computeroutput>
 
 <prompt>mydb=&gt;</prompt> <userinput>\i basics.sql</userinput>
 </screen>
+
+    The <literal>\i</literal> command reads in commands from the
+    specified file.  The <literal>-s</literal> option puts you in
+    single-step mode, which pauses before sending each query to the
+    server.  The commands used in this section are in the file
+    <filename>basics.sql</filename>.
    </para>
+  </sect1>
+
+
+  <sect1 id="tutorial-concepts">
+   <title>Concepts</title>
 
    <para>
-    The  <literal>\i</literal>  command  read  in  queries  from the specified
-    files. The <literal>-s</literal> option puts you in single step mode which
-    pauses  before  sending a query to the backend. Queries
-    in this section are in the file <filename>basics.sql</filename>.
+    <indexterm><primary>relational database</primary></indexterm>
+    <indexterm><primary>hierarchical database</primary></indexterm>
+    <indexterm><primary>object-oriented database</primary></indexterm>
+    <indexterm><primary>relation</primary></indexterm>
+    <indexterm><primary>table</primary></indexterm>
+
+    <productname>PostgreSQL</productname> is a <firstterm>relational
+    database management system</firstterm> (<acronym>RDBMS</acronym>).
+    That means it is a system for managing data stored in
+    <firstterm>relations</firstterm>.  Relation is essentially a
+    mathematical term for <firstterm>table</firstterm>.  The notion of
+    storing data in tables is so commonplace today that it might
+    seem inherently obvious, but there are a number of other ways of
+    organizing databases.  Files and directories on Unix-like
+    operating systems form an example of a hierarchical database.  A
+    more modern development is the object-oriented database.
    </para>
 
    <para>
-    <application>psql</application>
-    has a variety of <literal>\d</literal> commands for showing system information.
-    Consult these commands for more details;
-    for a listing, type <literal>\?</literal> at the <application>psql</application> prompt.
+    <indexterm><primary>row</primary></indexterm>
+    <indexterm><primary>column</primary></indexterm>
+
+    Each table is a named collection of <firstterm>rows</firstterm>.
+    Each row has the same set of named <firstterm>columns</firstterm>,
+    and each column is of a specific data type.  Whereas columns have
+    a fixed order in each row, it is important to remember that SQL
+    does not guarantee the order of the rows within the table in any
+    way (unless they are explicitly sorted).
    </para>
-  </sect1>
-
-  <sect1 id="query-concepts">
-   <title>Concepts</title>
 
    <para>
-    The fundamental notion in <productname>Postgres</productname> is
-    that of a <firstterm>table</firstterm>, which is a named
-    collection of <firstterm>rows</firstterm>.  Each row has the same
-    set of named <firstterm>columns</firstterm>, and each column is of
-    a specific type.  Furthermore, each row has a permanent
-    <firstterm>object identifier</firstterm> (<acronym>OID</acronym>)
-    that is unique throughout the database cluster.  Historially,
-    tables have been called classes in
-    <productname>Postgres</productname>, rows are object instances,
-    and columns are attributes.  This makes sense if you consider the
-    object-relational aspects of the database system, but in this
-    manual we will use the customary <acronym>SQL</acronym>
-    terminology.  As previously discussed,
-    tables are grouped into databases, and a collection of databases
-    managed by a single <application>postmaster</application> process
-    constitutes a database cluster.
+    <indexterm><primary>cluster</primary></indexterm>
+
+    Tables are grouped into databases, and a collection of databases
+    managed by a single <productname>PostgreSQL</productname> server
+    instance constitutes a database <firstterm>cluster</firstterm>.
    </para>
   </sect1>
 
-  <sect1 id="query-table">
+
+  <sect1 id="tutorial-table">
    <title>Creating a New Table</title>
 
+   <indexterm zone="tutorial-table">
+    <primary>CREATE TABLE</primary>
+   </indexterm>
+
    <para>
     You  can  create  a  new  table by specifying the table
     name, along with all column names and their types:
@@ -110,39 +114,82 @@ CREATE TABLE weather (
     date            date
 );
 </programlisting>
+
+    You can enter this into <command>psql</command> with the line
+    breaks.  <command>psql</command> will recognize that the command
+    is not terminated until the semicolon.
+   </para>
+
+   <para>
+    White space (i.e., spaces, tabs, and newlines) may be used freely
+    in SQL commands.  That means you can type the command aligned
+    differently than above, or even all on one line.  Two dashes
+    (<quote><literal>--</literal></quote>) introduce comments.
+    Whatever follows them is ignored up to the end of the line.  SQL
+    is also case insensitive about key words and identifiers, except
+    when identifiers are double-quoted to preserve the case (not done
+    above).
+   </para>
+
+   <para>
+    <type>varchar(80)</type> specifies a data type that can store
+    arbitrary character strings up to 80 characters in length.
+    <type>int</type> is the normal integer type.  <type>real</type> is
+    a type for storing single precision floating point numbers.
+    <type>date</type> should be self-explanatory.  (Yes, the column of
+    type <type>date</type> is also named <literal>date</literal>.
+    This may be convenient or confusing -- you choose.)
    </para>
 
    <para>
-    Note that both keywords and identifiers are case-insensitive;
-    identifiers can preserve case by surrounding them with
-    double-quotes as allowed
-    by <acronym>SQL92</acronym>.
-    <productname>Postgres</productname>  <acronym>SQL</acronym>
-    supports the usual
+    <productname>PostgreSQL</productname> supports the usual
     <acronym>SQL</acronym> types <type>int</type>,
-    <type>float</type>,  <type>real</type>,  <type>smallint</type>,
-<type>char(N)</type>,  
-    <type>varchar(N)</type>,  <type>date</type>, <type>time</type>,
-    and <type>timestamp</type>, as well as other types of general utility and
-    a rich set of geometric types.  As we will 
-    see later, <productname>Postgres</productname> can be customized
-    with an  
-    arbitrary  number  of
-    user-defined  data types.  Consequently, type names are
-    not syntactical keywords, except where required to support special
-    cases in the <acronym>SQL92</acronym> standard.
-    So far, the <productname>Postgres</productname>
-    <command>CREATE</command> command
-    looks exactly  like
-    the  command  used  to  create a table in a traditional
-    relational system.  However, we will presently see that
-    tables  have  properties  that  are  extensions of the
-    relational model.
+    <type>smallint</type>, <type>real</type>, <type>double
+    precision</type>, <type>char(<replaceable>N</>)</type>,
+    <type>varchar(<replaceable>N</>)</type>, <type>date</type>,
+    <type>time</type>, <type>timestamp</type>, and
+    <type>interval</type> as well as other types of general utility
+    and a rich set of geometric types.
+    <productname>PostgreSQL</productname> can be customized with an
+    arbitrary number of user-defined data types.  Consequently, type
+    names are not syntactical keywords, except where required to
+    support special cases in the <acronym>SQL</acronym> standard.
+   </para>
+
+   <para>
+    The second example will store cities and their associated
+    geographical location:
+<programlisting>
+CREATE TABLE cities (
+    name            varchar(80),
+    location        point
+);
+</programlisting>
+    The <type>point</type> type is an example of a
+    <productname>PostgreSQL</productname>-specific data type.
+   </para>
+
+   <para>
+    <indexterm>
+     <primary>DROP TABLE</primary>
+    </indexterm>
+
+    Finally, it should be mentioned that if you don't need a table any
+    longer or want to recreate it differently, you can remove it using
+    the following command:
+<synopsis>
+DROP TABLE <replaceable>tablename</replaceable>;
+</synopsis>
    </para>
   </sect1>
 
-  <sect1 id="query-populate">
-   <title>Populating a Table with Rows</title>
+
+  <sect1 id="tutorial-populate">
+   <title>Populating a Table With Rows</title>
+
+   <indexterm zone="tutorial-populate">
+    <primary>INSERT</primary>
+   </indexterm>
 
    <para>
     The <command>INSERT</command> statement is used to populate a table  with
@@ -151,129 +198,184 @@ CREATE TABLE weather (
 <programlisting>
 INSERT INTO weather VALUES ('San Francisco', 46, 50, 0.25, '1994-11-27');
 </programlisting>
+
+    Note that all data types use rather obvious input formats.  The
+    <type>date</type> column is actually quite flexible in what it
+    accepts, but for this tutorial we will stick to the unambiguous
+    format shown here.
    </para>
 
    <para>
-    You can also use <command>COPY</command> to load large
-    amounts of data from flat (<acronym>ASCII</acronym>) files.
-    This is usually faster because the data is read (or written) as a
-    single atomic
-    transaction directly to or from the target table. An example would be:
+    The <type>point</type> type requires a coordinate pair as input,
+    as shown here:
+<programlisting>
+INSERT INTO cities  VALUES ('San Francisco', '(-194.0, 53.0)');
+</programlisting>
+   </para>
 
+   <para>
+    The syntax used so far requires you to remember the order of the
+    columns.  An alternative syntax allows you to list the columns
+    explicitly:
 <programlisting>
-COPY weather FROM '/home/user/weather.txt' USING DELIMITERS '|';
+INSERT INTO weather (city, temp_lo, temp_hi, prcp, date)
+    VALUES ('San Francisco', 43, 57, 0.0, '1994-11-29');
+</programlisting>
+    You can also list the columns in a different order if you wish or
+    even omit some columns, e.g., if the precipitation is unknown:
+<programlisting>
+INSERT INTO weather (date, city, temp_hi, temp_lo)
+    VALUES ('1994-11-29', 'Hayward', 54, 37);
+</programlisting>
+    Many developers consider explicitly listing the columns better
+    style than relying on the order implicitly.
+   </para>
+
+   <para>
+    Please enter all the commands shown above so you have some data to
+    work with in the following sections.
+   </para>
+
+   <para>
+    <indexterm>
+     <primary>COPY</primary>
+    </indexterm>
+
+    You could also have used <command>COPY</command> to load large
+    amounts of data from flat text files.  This is usually faster
+    because the <command>COPY</command> command is optimized for this
+    application while allowing less flexibility than
+    <command>INSERT</command>.  An example would be:
+
+<programlisting>
+COPY weather FROM '/home/user/weather.txt';
 </programlisting>
 
     where the path name for the source file must be available to the
-    backend server
-    machine, not the client, since the backend server reads the file directly.
+    backend server machine, not the client, since the backend server
+    reads the file directly.  You can read more about the
+    <command>COPY</command> command in the <citetitle>Reference
+    Manual</citetitle>.
    </para>
   </sect1>
 
-  <sect1 id="query-query">
+
+  <sect1 id="tutorial-select">
    <title>Querying a Table</title>
 
    <para>
-    The <classname>weather</classname> table can be queried with normal relational
-    selection  and projection queries.  A <acronym>SQL</acronym>
-    <command>SELECT</command> 
-    statement is used to do this.  The statement is divided into
-    a target list (the part that lists the columns to be
-    returned) and a qualification (the part that  specifies
-    any  restrictions).   For  example, to retrieve all the
-    rows of weather, type:
+    <indexterm><primary>query</primary></indexterm>
+    <indexterm><primary>SELECT</primary></indexterm>
+
+    To retrieve data from a table, the table is
+    <firstterm>queried</firstterm>.  An <acronym>SQL</acronym>
+    <command>SELECT</command> statement is used to do this.  The
+    statement is divided into a select list (the part that lists the
+    columns to be returned), a table list (the part that lists the
+    tables from which to retrieve the data), and an optional
+    qualification (the part that specifies any restrictions).  For
+    example, to retrieve all the rows of
+    <classname>weather</classname>, type:
 <programlisting>
 SELECT * FROM weather;
 </programlisting>
+    (where <literal>*</literal> means <quote>all columns</quote>) and
+    the output should be:
+<screen>
+     city      | temp_lo | temp_hi | prcp |    date
+---------------+---------+---------+------+------------
+ San Francisco |      46 |      50 | 0.25 | 1994-11-27
+ San Francisco |      43 |      57 |    0 | 1994-11-29
+ Hayward       |      37 |      54 |      | 1994-11-29
+(3 rows)
+</screen>
+   </para>
 
-    and the output should be:
-<programlisting>
-+--------------+---------+---------+------+------------+
-|city          | temp_lo | temp_hi | prcp | date       |
-+--------------+---------+---------+------+------------+
-|San Francisco | 46      | 50      | 0.25 | 1994-11-27 |
-+--------------+---------+---------+------+------------+
-|San Francisco | 43      | 57      | 0    | 1994-11-29 |
-+--------------+---------+---------+------+------------+
-|Hayward       | 37      | 54      |      | 1994-11-29 |
-+--------------+---------+---------+------+------------+
-</programlisting>
-    You may specify any arbitrary expressions in the  target list. For 
+   <para>
+    You may specify arbitrary expressions in the select list.  For
     example, you can do:
 <programlisting>
 SELECT city, (temp_hi+temp_lo)/2 AS temp_avg, date FROM weather;
 </programlisting>
+    This should give:
+<screen>
+     city      | temp_avg |    date
+---------------+----------+------------
+ San Francisco |       48 | 1994-11-27
+ San Francisco |       50 | 1994-11-29
+ Hayward       |       45 | 1994-11-29
+(3 rows)
+</screen>
+    Notice how the <literal>AS</literal> clause is used to relabel the
+    output column.  (It is optional.)
    </para>
 
    <para>
-    Arbitrary  Boolean  operators
-    (<command>AND</command>,  <command>OR</command> and 
-    <command>NOT</command>) are
-    allowed in the qualification of any query.   For  example,
+    Arbitrary Boolean operators (<literal>AND</literal>,
+    <literal>OR</literal>, and <literal>NOT</literal>) are allowed in
+    the qualification of a query.  For example, the following
+    retrieves the weather of San Francisco on rainy days:
 
 <programlisting>
 SELECT * FROM weather
     WHERE city = 'San Francisco'
     AND prcp > 0.0;
 </programlisting>
-results in:
-<programlisting>
-+--------------+---------+---------+------+------------+
-|city          | temp_lo | temp_hi | prcp | date       |
-+--------------+---------+---------+------+------------+
-|San Francisco | 46      | 50      | 0.25 | 1994-11-27 |
-+--------------+---------+---------+------+------------+
-</programlisting>
+    Result:
+<screen>
+     city      | temp_lo | temp_hi | prcp |    date
+---------------+---------+---------+------+------------
+ San Francisco |      46 |      50 | 0.25 | 1994-11-27
+(1 row)
+</screen>
    </para>
 
    <para>
-    As  a final note, you can specify that the results of a
-    select can be returned in a <firstterm>sorted order</firstterm>
-    or with duplicate rows removed.
+    <indexterm><primary>ORDER BY</primary></indexterm>
+    <indexterm><primary>DISTINCT</primary></indexterm>
+    <indexterm><primary>duplicate</primary></indexterm>
+
+    As a final note, you can request that the results of a query be
+    returned in sorted order or with duplicate rows removed.  (Just
+    to make sure the following won't confuse you,
+    <literal>DISTINCT</literal> and <literal>ORDER BY</literal> can be
+    used separately.)
 
 <programlisting>
 SELECT DISTINCT city
     FROM weather
     ORDER BY city;
 </programlisting>
-   </para>
-  </sect1>
-
-  <sect1 id="query-selectinto">
-   <title>Redirecting SELECT Queries</title>
-
-   <para>
-    Any <command>SELECT</command> query can be redirected to a new table
-<programlisting>
-SELECT * INTO TABLE temp FROM weather;
-</programlisting>
-   </para>
 
-   <para>
-    This forms an implicit <command>CREATE</command> command, creating a new
-    table temp with the column names and types specified
-    in  the target list of the <command>SELECT INTO</command> command.  We can
-    then, of course, perform any operations on the  resulting 
-    table that we can perform on other tables.
+<screen>
+     city
+---------------
+ Hayward
+ San Francisco
+(2 rows)
+</screen>
    </para>
   </sect1>
 
-  <sect1 id="query-join">
+
+  <sect1 id="tutorial-join">
    <title>Joins Between Tables</title>
 
+   <indexterm zone="tutorial-join">
+    <primary>join</primary>
+   </indexterm>
+
    <para>
-    Thus far, our queries have only accessed one table at a
-    time.  Queries can access multiple tables at once,  or
-    access  the  same  table  in  such  a way that multiple
-    rows of the table are being processed at the  same
-    time.   A query that accesses multiple rows of the
-    same or different tables at one time is called a  join
-    query.
-    As an example, say we wish to find all the records that
-    are in the  temperature  range  of  other  records.  In
-    effect,  we  need  to  compare  the temp_lo and temp_hi
-    columns of each WEATHER  row  to  the  temp_lo  and
-    temp_hi  columns of all other WEATHER columns.
+    Thus far, our queries have only accessed one table at a time.
+    Queries can access multiple tables at once, or access the same
+    table in such a way that multiple rows of the table are being
+    processed at the same time.  A query that accesses multiple rows
+    of the same or different tables at one time is called a
+    <firstterm>join</firstterm> query.  As an example, say we wish to
+    list all the weather records together with the location of the
+    associated city.  In effect, we need to compare the city column of
+    each row of the weather table with the name column of all rows in
+    the cities table.
     <note>
      <para>
       This  is only a conceptual model.  The actual join may
@@ -281,102 +383,189 @@ SELECT * INTO TABLE temp FROM weather;
       to the user.
      </para>
     </note>
-
-    We can do this with the following query:
+    This would be accomplished by the following query:
 
 <programlisting>
-SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
-    W2.city, W2.temp_lo AS low, W2.temp_hi AS high
-    FROM weather W1, weather W2
-    WHERE W1.temp_lo < W2.temp_lo
-    AND W1.temp_hi > W2.temp_hi;
+SELECT *
+    FROM weather, cities
+    WHERE city = name;
+</programlisting>
 
-+--------------+-----+------+---------------+-----+------+
-|city          | low | high | city          | low | high |
-+--------------+-----+------+---------------+-----+------+
-|San Francisco | 43  | 57   | San Francisco | 46  | 50   |
-+--------------+-----+------+---------------+-----+------+
-|San Francisco | 37  | 54   | San Francisco | 46  | 50   |
-+--------------+-----+------+---------------+-----+------+
-</programlisting>     
+<screen>
+     city      | temp_lo | temp_hi | prcp |    date    |     name      | location
+---------------+---------+---------+------+------------+---------------+-----------
+ San Francisco |      46 |      50 | 0.25 | 1994-11-27 | San Francisco | (-194,53)
+ San Francisco |      43 |      57 |    0 | 1994-11-29 | San Francisco | (-194,53)
+(2 rows)
+</screen>
 
-    <note>
-     <para>
-      The semantics of such a join are 
-      that the qualification
-      is a truth expression defined for the Cartesian  product  of
-      the tables indicated in the query.  For those rows in
-      the Cartesian product for which the qualification  is  true,
-      <productname>Postgres</productname>  computes  and  returns the
-      values specified in the target list.  
-      <productname>Postgres</productname> <acronym>SQL</acronym>
-      does not assign  any  meaning  to
-      duplicate values in such expressions. 
-      This means that <productname>Postgres</productname> 
-      sometimes recomputes the same target list several times;
-      this frequently happens when Boolean expressions are connected 
-      with an "or".  To remove such duplicates, you must  use
-      the <command>SELECT DISTINCT</command> statement.
-     </para>
-    </note>
    </para>
 
    <para>
-    In this case, both <literal>W1</literal> and
-    <literal>W2</literal>  are  surrogates for  a
-    row of the table weather, and both range over all
-    rows of the table.  (In the  terminology  of  most
-    database  systems, <literal>W1</literal> and <literal>W2</literal> 
-    are known as <firstterm>range variables</firstterm>.)  
-    A query can contain an  arbitrary  number  of
-    table names and surrogates.
+    Observe two things about the result set:
+    <itemizedlist>
+     <listitem>
+      <para>
+       There is no result row for the city of Hayward.  This is
+       because there is no matching entry in the
+       <classname>cities</classname> table for Hayward, so the join
+       ignores the unmatched rows in the weather table.  We will see
+       shortly how this can be fixed.
+      </para>
+     </listitem>
+
+     <listitem>
+      <para>
+       There are two columns containing the city name.  This is
+       correct because the lists of columns of the
+       <classname>weather</classname> and the
+       <classname>cities</classname> tables are concatenated.  In
+       practice this is undesirable, though, so you will probably want
+       to list the output columns explicitly rather than using
+       <literal>*</literal>:
+<programlisting>
+SELECT city, temp_lo, temp_hi, prcp, date, location
+    FROM weather, cities
+    WHERE city = name;
+</programlisting>
+      </para>
+     </listitem>
+    </itemizedlist>
    </para>
-  </sect1>
 
-  <sect1 id="query-update">
-   <title>Updates</title>
+   <formalpara>
+    <title>Exercise:</title>
+
+    <para>
+     Attempt to find out the semantics of this query when the
+     <literal>WHERE</literal> clause is omitted.
+    </para>
+   </formalpara>
 
    <para>
-    You can update existing rows using the
-    <command>UPDATE</command> command. 
-    Suppose you discover the temperature readings are
-    all  off  by 2 degrees as of Nov 28, you may update the
-    data as follow:
+    Since the columns all had different names, the parser
+    automatically found out which table they belong to, but it is good
+    style to fully qualify column names in join queries:
 
 <programlisting>
-UPDATE weather
-    SET temp_hi = temp_hi - 2,  temp_lo = temp_lo - 2
-    WHERE date > '1994-11-28';
+SELECT weather.city, weather.temp_lo, weather.temp_hi, weather.prcp, weather.date, cities.location
+    FROM weather, cities
+    WHERE cities.name = weather.city;
 </programlisting>
    </para>
-  </sect1>
-
-  <sect1 id="query-delete">
-   <title>Deletions</title>
 
    <para>
-    Deletions are performed using the <command>DELETE</command> command:
+    Join queries of the kind seen thus far can also be written in this
+    alternative form:
+
 <programlisting>
-DELETE FROM weather WHERE city = 'Hayward';
+SELECT *
+    FROM weather INNER JOIN cities ON (weather.city = cities.name);
 </programlisting>
 
-    All weather recording belonging to Hayward are removed.
-    One should be wary of queries of the form
+    This syntax is not as commonly used as the one above, but we show
+    it here to help you understand the following topics.
+   </para>
+
+   <para>
+    <indexterm><primary>join</primary><secondary>outer</secondary></indexterm>
+
+    Now we will figure out how we can get the Hayward records back in.
+    What we want the query to do is to scan the
+    <classname>weather</classname> table and for each row to find the
+    matching <classname>cities</classname> row.  If no matching row is
+    found we want some <quote>empty values</quote> to be substituted
+    for the <classname>cities</classname> table's columns.  This kind
+    of query is called an <firstterm>outer join</firstterm>.  (The
+    joins we have seen so far are inner joins.)  The command looks
+    like this:
+
 <programlisting>
-DELETE FROM <replaceable>tablename</replaceable>;
+SELECT *
+    FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name);
+
+     city      | temp_lo | temp_hi | prcp |    date    |     name      | location
+---------------+---------+---------+------+------------+---------------+-----------
+ Hayward       |      37 |      54 |      | 1994-11-29 |               |
+ San Francisco |      46 |      50 | 0.25 | 1994-11-27 | San Francisco | (-194,53)
+ San Francisco |      43 |      57 |    0 | 1994-11-29 | San Francisco | (-194,53)
+(3 rows)
 </programlisting>
 
-    Without a qualification, <command>DELETE</command> will simply
-    remove  all  rows from the given table, leaving it
-    empty.  The system will not request confirmation before
-    doing this.
+    This query is a <firstterm>left outer join</firstterm> because
+    the table mentioned on the left of the join operator will have
+    each of its rows in the output at least once, whereas the table
+    on the right will only have those rows output that match some row
+    of the left table.  For a left-table row without a match, empty
+    values are substituted for the right-table columns.
+   </para>
+
+   <formalpara>
+    <title>Exercise:</title>
+
+    <para>
+     There are also right outer joins and full outer joins.  Try to
+     find out what those do.
+    </para>
+   </formalpara>
+
+   <para>
+    <indexterm><primary>join</primary><secondary>self</secondary></indexterm>
+    <indexterm><primary>alias</primary><secondary>for table name in query</secondary></indexterm>
+
+    We can also join a table against itself.  This is called a
+    <firstterm>self join</firstterm>.  As an example, suppose we wish
+    to find all the weather records that are in the temperature range
+    of other weather records.  So we need to compare the
+    <structfield>temp_lo</> and <structfield>temp_hi</> columns of
+    each <classname>weather</classname> row to the
+    <structfield>temp_lo</structfield> and
+    <structfield>temp_hi</structfield> columns of all other
+    <classname>weather</classname> rows.  We can do this with the
+    following query:
+
+<programlisting>
+SELECT W1.city, W1.temp_lo AS low, W1.temp_hi AS high,
+    W2.city, W2.temp_lo AS low, W2.temp_hi AS high
+    FROM weather W1, weather W2
+    WHERE W1.temp_lo < W2.temp_lo
+    AND W1.temp_hi > W2.temp_hi;
+
+     city      | low | high |     city      | low | high
+---------------+-----+------+---------------+-----+------
+ San Francisco |  43 |   57 | San Francisco |  46 |   50
+ Hayward       |  37 |   54 | San Francisco |  46 |   50
+(2 rows)
+</programlisting>     
+
+    Here we have relabeled the weather table as <literal>W1</> and
+    <literal>W2</> to be able to distinguish the left and right side
+    of the join.  You can also use these kinds of aliases in other
+    queries to save some typing, e.g.:
+<programlisting>
+SELECT *
+    FROM weather w, cities c
+    WHERE w.city = c.name;
+</programlisting>
+    You will encounter this style of abbreviating quite frequently.
    </para>
   </sect1>
 
-  <sect1 id="query-agg">
-   <title>Using Aggregate Functions</title>
+
+  <sect1 id="tutorial-agg">
+   <title>Aggregate Functions</title>
+
+   <indexterm zone="tutorial-agg">
+    <primary>aggregate</primary>
+   </indexterm>
 
    <para>
+    <indexterm><primary>average</primary></indexterm>
+    <indexterm><primary>count</primary></indexterm>
+    <indexterm><primary>max</primary></indexterm>
+    <indexterm><primary>min</primary></indexterm>
+    <indexterm><primary>sum</primary></indexterm>
+
     Like  most  other relational database products, 
     <productname>PostgreSQL</productname> supports
     aggregate functions.
@@ -388,94 +577,214 @@ DELETE FROM <replaceable>tablename</replaceable>;
    </para>
 
    <para>
-    It is important to understand the interaction between aggregates and
-    SQL's <command>WHERE</command> and <command>HAVING</command> clauses.
-    The fundamental difference between <command>WHERE</command> and
-    <command>HAVING</command> is this: <command>WHERE</command> selects
-    input rows before groups and aggregates are computed (thus, it controls
-    which rows go into the aggregate computation), whereas
-    <command>HAVING</command> selects group rows after groups and
-    aggregates are computed.  Thus, the
-    <command>WHERE</command> clause may not contain aggregate functions;
-    it makes no sense to try to use an aggregate to determine which rows
-    will be inputs to the aggregates.  On the other hand,
-    <command>HAVING</command> clauses always contain aggregate functions.
-    (Strictly speaking, you are allowed to write a <command>HAVING</command>
-    clause that doesn't use aggregates, but it's wasteful; the same condition
-    could be used more efficiently at the <command>WHERE</command> stage.)
-   </para>
-
-   <para>
     As an example, we can find the highest low-temperature reading anywhere
     with
 
-    <programlisting>
+<programlisting>
 SELECT max(temp_lo) FROM weather;
-    </programlisting>
+</programlisting>
+
+<screen>
+ max
+-----
+  46
+(1 row)
+</screen>
+   </para>
+
+   <para>
+    <indexterm><primary>subquery</primary></indexterm>
 
     If we want to know what city (or cities) that reading occurred in,
     we might try
 
-    <programlisting>
-SELECT city FROM weather WHERE temp_lo = max(temp_lo);
-    </programlisting>
+<programlisting>
+SELECT city FROM weather WHERE temp_lo = max(temp_lo);     <lineannotation>WRONG</lineannotation>
+</programlisting>
 
     but this will not work since the aggregate
-    <function>max</function> can't be used in
-    <command>WHERE</command>. However, as is often the case the query can be
-    restated to accomplish the intended result; here by using a
-    <firstterm>subselect</firstterm>:
+    <function>max</function> cannot be used in the
+    <literal>WHERE</literal> clause.  However, as is often the case,
+    the query can be restated to accomplish the intended result, here
+    by using a <firstterm>subquery</firstterm>:
 
-    <programlisting>
+<programlisting>
 SELECT city FROM weather
     WHERE temp_lo = (SELECT max(temp_lo) FROM weather);
-    </programlisting>
+</programlisting>
+
+<screen>
+     city
+---------------
+ San Francisco
+(1 row)
+</screen>
 
-    This is OK because the sub-select is an independent computation that
-    computes its own aggregate separately from what's happening in the outer
-    select.
+    This is OK because the subquery is an independent computation
+    that computes its own aggregate separately from what is happening
+    in the outer query.
    </para>
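+
+   <para>
+    The same pattern can be used with other aggregates.  As a sketch,
+    assuming the same <classname>weather</classname> table, the
+    following looks for the city (or cities) with the lowest
+    high-temperature reading:
+<programlisting>
+-- The inner query computes one value; the outer query compares each row to it.
+SELECT city FROM weather
+    WHERE temp_hi = (SELECT min(temp_hi) FROM weather);
+</programlisting>
+   </para>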
 
    <para>
-    Aggregates are also very useful in combination with
-    <command>GROUP BY</command> clauses.  For example, we can get the
-    maximum low temperature observed in each city with
+    <indexterm><primary>GROUP BY</primary></indexterm>
+    <indexterm><primary>HAVING</primary></indexterm>
+
+    Aggregates are also very useful in combination with <literal>GROUP
+    BY</literal> clauses.  For example, we can get the maximum low
+    temperature observed in each city with
 
-    <programlisting>
+<programlisting>
 SELECT city, max(temp_lo)
     FROM weather
     GROUP BY city;
-    </programlisting>
+</programlisting>
+
+<screen>
+     city      | max
+---------------+-----
+ Hayward       |  37
+ San Francisco |  46
+(2 rows)
+</screen>
 
     which gives us one output row per city.  We can filter these grouped
-    rows using <command>HAVING</command>:
+    rows using <literal>HAVING</literal>:
 
-    <programlisting>
+<programlisting>
 SELECT city, max(temp_lo)
     FROM weather
     GROUP BY city
-    HAVING min(temp_lo) < 0;
-    </programlisting>
+    HAVING max(temp_lo) < 40;
+</programlisting>
+
+<screen>
+  city   | max
+---------+-----
+ Hayward |  37
+(1 row)
+</screen>
 
-    which gives us the same results for only the cities that have some
-    below-zero readings.  Finally, if we only care about cities whose
-    names begin with "<literal>P</literal>", we might do
+    which gives us the same results for only the cities whose
+    <structfield>temp_lo</structfield> readings are all below 40.
+    Finally, if we only care about cities whose
+    names begin with <quote><literal>S</literal></quote>, we might do
 
-    <programlisting>
+<programlisting>
 SELECT city, max(temp_lo)
     FROM weather
-    WHERE city like 'P%'
+    WHERE city LIKE 'S%'
     GROUP BY city
-    HAVING min(temp_lo) < 0;
-    </programlisting>
+    HAVING max(temp_lo) < 40;
+</programlisting>
+   </para>
 
-    Note that we can apply the city-name restriction in
-    <command>WHERE</command>, since it needs no aggregate.  This is
-    more efficient than adding the restriction to <command>HAVING</command>,
+   <para>
+    It is important to understand the interaction between aggregates and
+    SQL's <literal>WHERE</literal> and <literal>HAVING</literal> clauses.
+    The fundamental difference between <literal>WHERE</literal> and
+    <literal>HAVING</literal> is this: <literal>WHERE</literal> selects
+    input rows before groups and aggregates are computed (thus, it controls
+    which rows go into the aggregate computation), whereas
+    <literal>HAVING</literal> selects group rows after groups and
+    aggregates are computed.  Thus, the
+    <literal>WHERE</literal> clause must not contain aggregate functions;
+    it makes no sense to try to use an aggregate to determine which rows
+    will be inputs to the aggregates.  On the other hand,
+    <literal>HAVING</literal> clauses always contain aggregate functions.
+    (Strictly speaking, you are allowed to write a <literal>HAVING</literal>
+    clause that doesn't use aggregates, but it's wasteful; the same condition
+    could be used more efficiently at the <literal>WHERE</literal> stage.)
+   </para>
+
+   <para>
+    Note that we can apply the city name restriction in
+    <literal>WHERE</literal>, since it needs no aggregate.  This is
+    more efficient than adding the restriction to <literal>HAVING</literal>,
     because we avoid doing the grouping and aggregate calculations
-    for all rows that fail the <command>WHERE</command> check.
+    for all rows that fail the <literal>WHERE</literal> check.
+   </para>
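+
+   <para>
+    For comparison, here is a sketch of the less efficient form, in
+    which the city name restriction is written in
+    <literal>HAVING</literal> instead.  It is accepted because
+    <structfield>city</structfield> is a grouping column, and it
+    should select the same cities, but conceptually every city is
+    grouped and aggregated before the restriction is applied:
+<programlisting>
+-- Restriction applied after grouping; prefer WHERE for conditions
+-- that do not need an aggregate.
+SELECT city, max(temp_lo)
+    FROM weather
+    GROUP BY city
+    HAVING city LIKE 'S%';
+</programlisting>
+   </para>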
+  </sect1>
+
+
+  <sect1 id="tutorial-update">
+   <title>Updates</title>
+
+   <indexterm zone="tutorial-update">
+    <primary>UPDATE</primary>
+   </indexterm>
+
+   <para>
+    You can update existing rows using the
+    <command>UPDATE</command> command.
+    Suppose you discover that the temperature readings are
+    all off by 2 degrees after November 28.  You can correct the
+    data as follows:
+
+<programlisting>
+UPDATE weather
+    SET temp_hi = temp_hi - 2, temp_lo = temp_lo - 2
+    WHERE date > '1994-11-28';
+</programlisting>
+   </para>
+
+   <para>
+    Look at the new state of the data:
+<programlisting>
+SELECT * FROM weather;
+
+     city      | temp_lo | temp_hi | prcp |    date
+---------------+---------+---------+------+------------
+ San Francisco |      46 |      50 | 0.25 | 1994-11-27
+ San Francisco |      41 |      55 |    0 | 1994-11-29
+ Hayward       |      35 |      52 |      | 1994-11-29
+(3 rows)
+</programlisting>
    </para>
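+
+   <para>
+    Because <command>UPDATE</command> changes every row that matches
+    its <literal>WHERE</literal> clause, it can help to run a
+    <command>SELECT</command> with the same condition first and check
+    which rows would be affected.  A minimal sketch, assuming the
+    <classname>weather</classname> table as it stood before the
+    update above:
+<programlisting>
+-- Preview the rows the UPDATE would touch; same condition, no changes made.
+SELECT * FROM weather
+    WHERE date > '1994-11-28';
+</programlisting>
+   </para>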
   </sect1>
+
+  <sect1 id="tutorial-delete">
+   <title>Deletions</title>
+
+   <indexterm zone="tutorial-delete">
+    <primary>DELETE</primary>
+   </indexterm>
+
+   <para>
+    Rows can be removed from a table using the <command>DELETE</command>
+    command.  Suppose you are no longer interested in the weather of
+    Hayward.  Then you can do the following to delete those rows from
+    the table:
+<programlisting>
+DELETE FROM weather WHERE city = 'Hayward';
+</programlisting>
+
+    All weather records belonging to Hayward are removed.
+
+<programlisting>
+SELECT * FROM weather;
+</programlisting>
+
+<screen>
+     city      | temp_lo | temp_hi | prcp |    date
+---------------+---------+---------+------+------------
+ San Francisco |      46 |      50 | 0.25 | 1994-11-27
+ San Francisco |      41 |      55 |    0 | 1994-11-29
+(2 rows)
+</screen>
+   </para>
+
+   <para>
+    One should be wary of statements of the form
+<synopsis>
+DELETE FROM <replaceable>tablename</replaceable>;
+</synopsis>
+
+    Without a qualification, <command>DELETE</command> will
+    remove all rows from the given table, leaving it
+    empty.  The system will not request confirmation before
+    doing this.
+   </para>
+  </sect1>
+
  </chapter>
 
 <!-- Keep this comment at the end of the file