Barry Stinson - PostgreSQL Essential Reference-Sams (2001)



PostgreSQL Essential Reference


By Barry Stinson

Publisher : New Riders Publishing


Pub Date : October 19, 2001
ISBN : 0-7357-1121-6
Pages : 400

PostgreSQL Essential Reference is the fastest, most comprehensive guide to
PostgreSQL. This book provides you with:

A complete SQL command reference

A full listing of the built-in PostgreSQL operators, data types, and functions

The necessary tools for programming PostgreSQL with Perl, Python, PHP, C and
C++, ODBC, and JDBC

A noteworthy accomplishment of this book is its layout: all the subjects are
clearly presented, and the reference style makes looking up information quick
and easy. Each topic is arranged both by task and alphabetically and includes
important information such as feature notes, syntax, and example code.
• Table of Contents


Copyright
About the Author
About the Technical Reviewers
Acknowledgments
Tell Us What You Think
Introduction
What's Inside?
Who Is This Book For?
Who Is This Book Not For?
Conventions

Part I: SQL Reference


Chapter 1. PostgreSQL SQL Reference
Table of Commands
Alphabetical Listing

Part II: PostgreSQL Specifics


Chapter 2. PostgreSQL Data Types
Table of Data Types
Geometric Data Types
Logical Data Types
Network Data Types
Numeric Data Types
String Data Types
Time Data Types
Other Data Types
More Data Types

Chapter 3. PostgreSQL Operators


Geometric Operators
Logical Operators
Network Operators
Numerical Operators
String Operators
Time Operators

Chapter 4. PostgreSQL Functions


Map of Functions Grouped by Category
Aggregate Functions
Conversion Functions
Geometric Functions
Network Functions
Numerical Functions
SQL Functions
String Functions
Time Functions
User Functions
Other Functions

Chapter 5. Other PostgreSQL Topics


Arrays in Fields
Inheritance
PostgreSQL Indexes
OIDs
Multiversion Concurrency Control

Part III: PostgreSQL Administration


Chapter 6. User Executable Files
Alphabetical Listing of Files

Chapter 7. System Executable Files


Alphabetical Listing of Files

Chapter 8. System Configuration Files and Libraries


System Configuration Files
Library Files
Chapter 9. Databases and Log Files
PostgreSQL Data Directory
Log Files

Chapter 10. Common Administrative Tasks


Compiling and Installation
Creating Users
Granting User Rights
Database Maintenance
Database Backup/Restore
Performance Tuning

Part IV: Programming with PostgreSQL


Chapter 11. Server-Side Programming
Benefits of Procedural Languages
Installing Procedural Languages
PL/pgSQL
PL/Tcl
PL/Perl

Chapter 12. Creating Custom Functions


Creating Custom Functions
Creating Custom Triggers
Creating Custom Rules

Chapter 13. Client-Side Programming

ecpg
JDBC

libpq
libpq++
libpgeasy
ODBC
Perl
Python (PyGreSQL)
PHP

Chapter 14. Advanced PostgreSQL Programming


Extending Functions
Extending Types
Extending Operators
Part V: Appendices
Appendix A. Additional Resources
PostgreSQL versus Other RDBMSs
Online PostgreSQL Resources
Books

Appendix B. PostgreSQL Version Information


Version 7.1.2 (Released May 2001)
Version 7.1.1 (Released May 2001)
Version 7.1 (Released April 2001)
Version 7.0.3 (Released November 2000)
Version 7.0.2 (Released June 2000)
Version 7.0 (Released May 2000)
Version 6.5.2 (Released September 1999)
Version 6.5.1 (Released July 1999)
Version 6.5 (Released June 1999)
Version 6.4.1 (Released December 1998)
Version 6.4 (Released October 1998)
Version 6.3 (Released March 1998)
Version 6.2.1 (Released October 1997)
Version 6.2 (Released June 1997)
Version 6.1 (Released June 1997)
Version Postgres95 0.01 (Released May 1995)
Copyright

Copyright © 2002 by New Riders Publishing

FIRST EDITION: October, 2001

All rights reserved. No part of this book may be reproduced or transmitted in any
form or by any means, electronic or mechanical, including photocopying, recording,
or by any information storage and retrieval system, without written permission from
the publisher, except for the inclusion of brief quotations in a review.

Library of Congress Catalog Card Number: 2001086186

06 05 04 03 02 7 6 5 4 3 2 1

Interpretation of the printing code: The rightmost double-digit number is the year of
the book's printing; the rightmost single-digit number is the number of the book's
printing. For example, the printing code 02-1 shows that the first printing of the
book occurred in 2002.

Printed in the United States of America

Trademarks

All terms mentioned in this book that are known to be trademarks or service marks
have been appropriately capitalized. New Riders Publishing cannot attest to the
accuracy of this information. Use of a term in this book should not be regarded as
affecting the validity of any trademark or service mark.

Warning and Disclaimer

This book is designed to provide information about PostgreSQL. Every effort has
been made to make this book as complete and as accurate as possible, but no
warranty of fitness is implied.

The information is provided on an as-is basis. The authors and New Riders
Publishing shall have neither liability nor responsibility to any person or entity with
respect to any loss or damages arising from the information contained in this book
or from the use of the discs or programs that may accompany it.

Publisher

David Dwyer

Associate Publisher

Stephanie Wall

Managing Editor

Kristy Knoop
Acquisitions Editor

Deborah Hittel-Shoaf

Development Editor

Chris Zahn

Product Marketing Manager

Stephanie Layton

Publicity Manager

Susan Nixon

Project Editor

Sarah Kearns

Copy Editor

Amy Lepore

Indexer

Chris Morris

Manufacturing Coordinator

Jim Conway

Book Designer

Louisa Klucznik

Cover Designer

Brainstorm Design, Inc.

Cover Production

Aren Howell

Proofreader

Jeannie Smith

Composition

Ron Wise
About the Author

Barry Stinson graduated from Louisiana State University in 1995 with a master's
degree in music composition. During his tenure there, he was fortunate enough to
help design the Digital Arts studio with Dr. Stephen David Beck. Designing a
full-fledged music and graphic-arts digital studio afforded him exposure to a diverse set
of unique computing systems—particularly those from NeXT, SGI, and Apple. It was
during this time that he discovered Linux and subsequently PostgreSQL, both of
which were still in an early stage of development.

After graduation, Barry set up his own consulting company, Silicon Consulting, which
is based in Lafayette, Louisiana. Over the years, he has worked as a consultant for
many companies throughout southern Louisiana.

Increasingly, much of the work Barry has done over the years has centered on
databases. In the time from his original exposure to Postgres95 to its present form
as PostgreSQL, an amazing amount of development has taken place on open-source
database systems.

The recent rise of high-quality, open-source computing systems has produced a
renaissance in the high-tech industry. However, according to his girlfriend,
Pamela, his continued insistence on relying on renegade operating systems, such
as Linux, has only served to strengthen the unruly aspects already present in
his personality.
About the Technical Reviewers
These reviewers contributed their considerable hands-on expertise to the entire
development process for PostgreSQL Essential Reference. As the book was being
written, these dedicated professionals reviewed all the material for technical
content, organization, and flow. Their feedback was critical to ensuring that
PostgreSQL Essential Reference fits our readers' need for the highest-quality
technical information.

Jeremy Murrish is a Software Engineer and Project Manager at Direct Data, Inc., in
St. Louis, Missouri. Jeremy has spent his five years at Direct Data developing
web-based digital asset management and workflow solutions on a UNIX platform.
Jeremy's experience lies mainly in the publishing, pre-press, and printing industries.
He has a B.S. in Computer Science from the University of Missouri-Rolla. Jeremy
lives in St. Louis with his wife, Sherri, and their two dogs, Sloan and Thorn.

Lamar Owen basically grew up breathing computer programming. His first real
experiences with computers involved the old 8-bit TRS-80 Models I and III and a
hexadecimal debugger, and he programmed in hand-assembled machine language.
Lamar wrote a Z80 disassembler, and patched/rewrote portions of the TRSDOS
operating system for his personal use. After graduating from Rosman High School in
1986, he earned his Bachelor's degree in Electronics Engineering Technology from
DeVRY Institute of Technology in Decatur, Georgia, where he graduated Summa
Cum Laude in 1989. Lamar has owned, administered, and programmed various
UNIX systems for nearly 15 years. He has been employed by Anchor Baptist
Broadcasting for 11 years, for which he is currently Technical Director. In addition,
Lamar has maintained the PostgreSQL RPM set for two years.
Acknowledgments
This book would not have been possible if it weren't for the tireless efforts of the
New Riders staff, specifically Stephanie Wall, Deborah Hittel-Shoaf, and Chris Zahn.
Let me also acknowledge the efforts of the rest of the New Riders team—Sarah
Kearns, Amy Lepore, Chris Morris, Jeannie Smith, and Ron Wise. Moreover, the
technical reviewers, Jeremy Murrish and Lamar Owen, deserve a great deal of
thanks for providing invaluable insights, corrections, and a great deal of expertise in
making this book a success.

I would like to also thank The Logan Law Firm in Lafayette, Louisiana, for providing
me with all the coffee I could drink while writing this book and for providing an
environment "free from distraction."

Thanks to all the people who have worked so hard on the PostgreSQL web site,
mailing lists, and documentation project, and who answered so many of my questions.

Mom and Dad, thanks for always believing in me and for buying that first computer.
I told you it would pay off one day!

Last, but certainly not least, I would like to thank Pamela Beadle, my beautiful
girlfriend, whose intelligence, patience, and beauty are enough to inspire any man.
Tell Us What You Think
As the reader of this book, you are the most important critic and commentator. We
value your opinion and want to know what we're doing right, what we could do
better, what areas you'd like to see us publish in, and any other words of wisdom
you're willing to pass our way.

As the Associate Publisher for New Riders Publishing, I welcome your comments.
You can fax, email, or write me directly to let me know what you did or didn't like
about this book— as well as what we can do to make our books stronger.

Please note that I cannot help you with technical problems related to the topic of
this book, and that due to the high volume of mail I receive, I might not be able to
reply to every message.

When you write, please be sure to include this book's title and author as well as
your name and phone or fax number. I will carefully review your comments and
share them with the author and editors who worked on the book.

Fax: 317-581-4663

Email: [email protected]

Mail: Stephanie Wall


Associate Publisher
New Riders Publishing
201 West 103rd Street
Indianapolis, IN 46290 USA
Introduction
Databases are an essential staple of the technological world. In some sense, you
could say that the most universal function of computers is to serve as a database
system. The early punch card readers, which predate the modern computer, were
originally used by the U.S. government and large businesses as a method to collect,
aggregate, and report data. This general-purpose role formed the basis from
which the modern computer developed.

In the early days of the modern computer, each program was expected to be able to
handle its own data storage and retrieval functions. This, of course, placed a
significant burden on early programmers, who had to write extra code that had
nothing to do with the true function of their application. Moreover, it turns out that it
is difficult to store data efficiently and reliably; therefore, it was only natural that
the idea of a database was born.

Early database systems were a unique but welcome concept. Because the database
existed as an entirely separate application, programmers were no longer
required to write all the code necessary to perform the low-level file access
functions for data storage. Instead, programmers could devote their time to
writing code that pertained directly to the application they were designing.
Their application simply had to tell the data engine what data it needed to
store or retrieve, and the database system would handle the request. By the
mid-1960s, several database management systems being sold by companies like IBM
addressed many of these needs.

There were two problems with these early systems: They caused problems with
portability, and they tended to work better with static data structures. Because each
vendor was selling a unique and proprietary database management system,
applications had to be specially written to interface with each. If your application
was written to interface with the IBM database, it could not easily be configured to
work with a competitor's and vice versa. Moreover, the early databases worked on
datasets that were implemented as "flat files." This meant that if you wanted to
capture a different set of data from your application, modifications to the base data
structure were difficult and time consuming. In many cases, significant sections of
your application's source code would need to be rewritten to enable such
modifications.

In the early 1970s, a paper written by E.F. Codd, an IBM researcher, fundamentally
changed the history of how database systems would be implemented. Codd
suggested that datasets be represented relationally by the database system. This
meant that tables within a database could be linked together with various indexes to
produce an underlying data structure that was much more dynamic and extensible
than previously possible with the early flat-file systems.

IBM set out to design a system that incorporated many of Codd's visions, and this
system came to be known as System-R. Completed in the mid-1970s, System-R
also contained a new feature known as a structured query language (SQL). This new
language provided two very radical concepts to the database world: (1) It was
declarative, and (2) It was accepted by the American National Standards Institute
(ANSI) to be a standard.

Before the advent of the SQL language, programmers needed to procedurally define
how the data stored by the database would be accessed. With SQL, however,
programmers could simply request what criteria needed to be present in the
returned dataset, and the database engine would perform the work of actually
translating that request into returned data. This removed an additional burden from
programmers because it meant that the actual mechanics of data input and retrieval
were abstracted from their control. As a result, a tremendous amount of work went
into designing query planners that could perform these requests in the most
efficient manner possible. Through this concentration of effort, databases achieved
levels of efficiency and reliability that were outside the grasp of what any individual
developer could have achieved independently.
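As a small, hypothetical illustration of this declarative style (the employees table and its columns are invented for the example), the statement below describes only which rows are wanted; the database's query planner works out how to retrieve them:

```sql
-- Declarative: state what is wanted, not how to fetch it.
-- The planner chooses the access path (index scan, sequential scan, and so on).
SELECT name, age
  FROM employees
 WHERE age > 40
 ORDER BY name;
```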

Being declared an ANSI standard allowed competing products to be interfaced in
the same way. Developers could now design their applications so that the code
became much more portable. This meant that as database systems improved and as
requirements changed, the application could be transferred to a new database
system much more easily than previously possible.

These two developments—relational data structures and the SQL language—are what
formed the basis for the modern relational database management system (RDBMS).
Consequently, the RDBMS has developed into an entire industry unto itself.

One of the early RDBMSs was called Ingres, and it included many of the features
available in database systems at that time. A project was started in the mid-1980s
at the University of California at Berkeley to further these concepts; this project was
dubbed Postgres as a play on words that implied after Ingres.

During the middle of that decade, the project, then known as Postgres95, was
renamed to PostgreSQL and was released to the world at large as an open-sourced
project. In the ensuing years, a tremendous amount of development was put into
this project, which has resulted in an open-source and freely available feature-rich
database system. The result is an RDBMS that rivals the features and performance
typically only found in high-dollar commercial systems. This is a monumental
achievement, and a great deal of admiration and respect should go to the countless
developers who have contributed their time and efforts to this project.

In fact, PostgreSQL has advanced to such a degree that there is now much
commercial interest in further supporting and developing PostgreSQL. Among the
companies interested are Great Bridge and Red Hat, both of which have a deep
commitment to open source and to PostgreSQL. With an active development
community backed by serious commercial support, PostgreSQL is destined to be one
of the shining stars of the open-source arena. PostgreSQL will undoubtedly be
considered a success in the same manner as the Apache and Linux projects.
What's Inside?

A lot of effort has been put into making this book a truly great "reference" manual.
There is a certain art to making a reference manual that differs from writing a
traditional book. Namely, the author of a reference manual needs to keep in mind,
even more than usual, how the book will actually be used.

Reference manuals aren't read sequentially; rather, the reader usually jumps from
topic to topic as his or her needs arise. Consequently, a great deal of care needs to
be taken in how the book is laid out, in how it is organized, and in giving the reader
the "right amount" of information on each topic.

I've worked hard at following this approach to the best of my ability. The
information collected has been organized into the following structure.

SQL Reference

This section outlines all the SQL commands supported by PostgreSQL Version 7.1 in
a single chapter— Chapter 1, "PostgreSQL SQL Reference." Each command is listed
in alphabetical order, along with usage notes and an example.

PostgreSQL Specifics

The following PostgreSQL-specific information is covered:

Chapter 2, "PostgreSQL Data Types," is a listing of the valid PostgreSQL data
types and their typical use.

Chapter 3, "PostgreSQL Operators," is a listing of the operators that exist in
PostgreSQL, as well as examples that highlight their use.

Chapter 4, "PostgreSQL Functions," is a listing of the included functions
within PostgreSQL, as well as examples that highlight their use.

Chapter 5, "Other PostgreSQL Topics," covers table inheritance, B-Tree indexes,
and OIDs. It also includes a discussion of how Multiversion Concurrency Control
(MVCC) works.

PostgreSQL Administration

Information in this section is designed to aid the database administrator (DBA)
in understanding how the PostgreSQL system operates. The following topics are
covered:

Chapter 6, "User Executable Files," covers files specifically designed to be
executed by database users.

Chapter 7, "System Executable Files," covers files specific to system or server
functions.

Chapter 8, "System Configuration Files and Libraries," covers the configuration
files needed by PostgreSQL.

Chapter 9, "Databases and Log Files," provides information on where the
database and log files are stored locally.

Chapter 10, "Common Administrative Tasks," is a brief synopsis of common
administrative tasks that the DBA might need to perform.

Programming with PostgreSQL

This section outlines the options available to programmers who need to develop
custom applications with PostgreSQL. Covered topics include the following:

Chapter 11, "Server-Side Programming," outlines the PL/pgSQL, PL/Tcl, and
PL/Perl procedural scripting languages.

Chapter 12, "Creating Custom Functions," outlines the use of custom-written
functions, triggers, and rules.

Chapter 13, "Client-Side Programming," describes how client applications can
interface with the back end. It explores the Python, Perl, libpq, libpq++,
libpgeasy, ecpg, ODBC, and JDBC interfaces.

Chapter 14, "Advanced PostgreSQL Programming," delves into extending PostgreSQL
through the creation of custom types, operators, and aggregates.

Appendices

Two appendices provide further information about PostgreSQL:

Appendix A, "Additional Resources," includes a list of various support
resources, from mailing lists to commercial sites.

Appendix B, "PostgreSQL Version Information," provides you with a historical
overview of the changes in the various PostgreSQL releases.
Who Is This Book For?

This book is designed to be used by those who already possess a basic
understanding of SQL-based databases but who need specific information
regarding PostgreSQL.

Ideally, the reader should have some familiarity with a UNIX-style operating system.
This is not a strict requirement, but it will make certain tasks like installation or
administration much easier.

As previously mentioned, this book is better suited as a technical reference
than as a teaching manual. Moreover, due to the practical organization of this
book, it is probably better to flip from section to section rather than trying
to read it sequentially. Regardless, I sincerely hope that this book becomes a
well-worn addition to your desktop.
Who Is This Book Not For?

If you are new to database systems in general, this book will probably not be of
immediate benefit to you. Additionally, because PostgreSQL is both a relational
database and SQL based, if these concepts are not familiar to you as well, then this
might not be the book for you, at least not yet.
Conventions

The following typographical conventions are used in this book:

Monospaced font indicates web sites, keywords, commands, file paths, and
options. Italicized font indicates where you should substitute a value of your own
choosing.

In SQL statements, SQL keywords and function names are uppercase. Database,
table, and column names are lowercase.
Part I: SQL Reference



Chapter 1. PostgreSQL SQL Reference
The structured query language (SQL) is used by most relational databases to
perform specific database operations. Before SQL came into existence, each
database system used its own proprietary methods for accessing the underlying
data. This caused many problems for developers who were trying to create portable
front-end systems that could communicate with multiple databases.

The solution was to create a standard method of accessing database functions that
each database vendor would support. The result was originally dubbed SQL-86,
named after the year of its inception. Later, the standard was amended with
additional features and renamed to SQL-89.

In 1992, the SQL specification was expanded significantly to handle extra data
types, outer joins, catalog specification, and other enhancements. This version of
SQL, called SQL-92 (a.k.a. SQL-2), is the foundation of many modern relational
database management systems (RDBMSs).

PostgreSQL supports the majority of the functions outlined in the SQL-92 standard.
The following pages list the SQL commands, their syntax, their options, and
examples of how SQL is used in PostgreSQL. Although all the major functional
specifications of SQL-92 are supported in PostgreSQL, there are occasions when
PostgreSQL has SQL commands that have no counterpart in the formal SQL-92
specification. The following alphabetical listing notes these areas and points the user
to synonymous commands.
Table of Commands

The following table organizes the supported SQL commands by task.

                     Creation            Destruction        Use/Modify
Database             CREATE DATABASE     DROP DATABASE      COMMENT
                                                            LOAD
                                                            VACUUM
Table/Index          CREATE TABLE        DROP TABLE         COMMENT
                     CREATE INDEX        DROP INDEX         SELECT
                     CREATE VIEW         DROP VIEW          EXPLAIN
                                         TRUNCATE           ALTER TABLE
                                                            ALTER INDEX
                                                            COPY
                                                            UPDATE
                                                            INSERT
                                                            DELETE
                                                            SELECT INTO
                                                            VACUUM
                                                            CLUSTER
Constraints          CREATE TRIGGER      DROP TRIGGER       COMMENT
                     CREATE CONSTRAINT   DROP CONSTRAINT
                     CREATE RULE         DROP RULE
                     CREATE SEQUENCE     DROP SEQUENCE
User                 CREATE USER         DROP USER          ALTER USER
                     CREATE GROUP        DROP GROUP         ALTER GROUP
                                                            GRANT
                                                            REVOKE
Session/Transaction  SET                 CLOSE              SHOW
                     DECLARE             ABORT              RESET
                     BEGIN               COMMIT             FETCH
                                         ROLLBACK           MOVE
                                         END                LISTEN
                                                            UNLISTEN
                                                            NOTIFY
                                                            LOCK
                                                            UNLOCK
Misc                 CREATE AGGREGATE    DROP AGGREGATE     COMMENT
                     CREATE OPERATOR     DROP OPERATOR
                     CREATE TYPE         DROP TYPE
                     CREATE LANGUAGE     DROP LANGUAGE
                     CREATE FUNCTION     DROP FUNCTION

Alphabetical Listing

The following pages comprise an alphabetical listing of the SQL commands supported in PostgreSQL (Version
7.1).

ABORT

Syntax

ABORT [WORK | TRANSACTION]

Description

ABORT is used to halt a transaction in process and roll back the tables involved to their original state.

Input(s)

None. WORK and TRANSACTION are optional and have no effect.

Output(s)

ROLLBACK (Message returned if successful.)

NOTICE: ROLLBACK: no transaction in process (Returned if no transaction is in process.)

Notes

Must be used within a BEGIN…COMMIT series.

SQL-92 Compatibility

Not used in SQL-92; use ROLLBACK instead.

Example

The following code shows how the ABORT command could be used to halt a transaction in progress and return
the table back to its original state. First you see the values in the table mytable in their original state. Next,
those values are modified with the UPDATE command. However, the UPDATE command is issued from within a
BEGIN…COMMIT transaction; therefore, it is possible for us to ABORT the current transaction and return the table
to its original state.

SELECT * FROM mytable;

name | age
----------------------
Barry | 29

BEGIN TRANSACTION;
UPDATE mytable SET age=30 WHERE name='Barry';
SELECT * FROM mytable;

name | age
----------------------
Barry | 30
ABORT TRANSACTION;
SELECT * FROM mytable;

name | age
----------------------
Barry | 29

ALTER GROUP

Usage

ALTER GROUP groupname [ ADD USER | DROP USER ] username [,…]

Description

ALTER GROUP adds or removes users from a specified group.

Input(s)

groupname—The name of the group to modify.

username—The name of the user to add or drop.

Output(s)

ALTER GROUP (Message returned if successful.)

Notes

Only the superuser can issue this command—all other attempts will fail. The user and the group must exist
before this command can be issued. Dropping a user will only remove the user from the group, not drop him or
her from the database.

SQL-92 Compatibility

There is no ALTER GROUP in SQL-92. SQL-92 employs the concept of roles.

Example

The following code shows how multiple users can be added or dropped from the group admins.

ALTER GROUP admins ADD USER frank, mike, bill;


ALTER GROUP admins DROP USER mike;

ALTER TABLE

Usage

ALTER TABLE table [ * ] ADD [COLUMN] column coltype


ALTER TABLE table [ * ] ALTER [COLUMN] column
{ SET DEFAULT value | DROP DEFAULT }
ALTER TABLE table [ * ] RENAME [COLUMN] column TO newcolumn
ALTER TABLE table [ * ] RENAME TO newtable
ALTER TABLE table [ * ] ADD table constraint

Starting with PostgreSQL 7.1, some new possibilities were added to the ALTER TABLE command:

ALTER TABLE [ ONLY ] table [ * ] ADD [COLUMN] column coltype


ALTER TABLE [ ONLY ] table [ * ]
ALTER [COLUMN] column { SET DEFAULT value | DROP DEFAULT }
ALTER TABLE table ADD table-constraint-definition
ALTER TABLE table OWNER TO new-owner

Description

ALTER TABLE modifies a table or column. It enables columns to be modified, renamed, or added to an existing
table. Additionally, the table itself can be renamed by using the ALTER TABLE…RENAME syntax. If a table or
column is renamed, none of the underlying data will be affected.

By using the SET DEFAULT or DROP DEFAULT options, the default value for that column can be set, modified,
or removed. (See the "Notes" section.)

If an asterisk (*) is included after the table name, then all tables that inherit their column properties from the
current table will be modified as well. (See the "Notes" section.) This changes with Version 7.1 of PostgreSQL,
which cascades all changes to inherited tables by default. To limit changes to a specific table in PostgreSQL 7.1
and later, use the ONLY keyword.
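As a sketch of the 7.1 behavior (the cities table and its population column are invented for this example), note how ONLY limits a change that would otherwise cascade to inheriting tables:

```sql
-- By default in 7.1, this change also applies to every table
-- that inherits from cities:
ALTER TABLE cities ALTER COLUMN population SET DEFAULT 0;

-- Restrict the change to cities itself:
ALTER TABLE ONLY cities ALTER COLUMN population SET DEFAULT 0;
```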

Input(s)

coltype—The type of column to be added.

column—The name of the column to modify.

constraint—The new constraint to add to the table.

newcolumn—The new name of the column after it is renamed.

new-owner—Change ownership of the table to this user.

newtable—The new table name after it is renamed.

table—The name of the table to modify.

value—The value to set as the default for a particular column.

Output(s)

ALTER (Message returned if the modification was successful.)

ERROR (Message returned if the column, table, or column type does not exist.)

Notes

The ALTER TABLE command can only be issued by users who own the table or class of tables being modified.

The [COLUMN] keyword is optional and can safely be omitted.

Changing the default value for a column will not retroactively affect existing data in that column. DEFAULT
VALUE will only affect newly inserted rows. To change the default value for all rows, the DEFAULT VALUE clause
should be followed with an UPDATE command to reset the existing rows to the desired value.
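A brief sketch of that point, using the hypothetical authors table: the new default applies only to rows inserted afterward, so existing rows must be updated explicitly if they should match.

```sql
-- Future inserts that omit state will now receive 'TX';
-- rows already in the table keep their current values:
ALTER TABLE authors ALTER COLUMN state SET DEFAULT 'TX';

-- Optionally bring existing rows in line with the new default:
UPDATE authors SET state = 'TX' WHERE state IS NULL;
```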
The asterisk (*) should always be included if the table is a superclass; otherwise, queries will fail if performed on
subtables that depend on the newly modified column.

Only FOREIGN KEY constraints can be added to a table; to add or remove a unique constraint, a unique index
must be created. When adding a FOREIGN KEY constraint, the column name must exist in the foreign table. To
add check constraints to a table, you must re-create and reload the table using the CREATE TABLE command.
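The unique-constraint workaround described above can be sketched as follows (the table, column, and index names are invented for this example):

```sql
-- PostgreSQL 7.1 has no ALTER TABLE ... ADD UNIQUE;
-- a unique index enforces the same rule:
CREATE UNIQUE INDEX authors_email_idx ON authors (email);

-- Dropping the index removes the uniqueness requirement:
DROP INDEX authors_email_idx;
```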

SQL-92 Compatibility

The ALTER COLUMN form is fully compliant with SQL-92.

The ADD COLUMN form is compliant, except that it does not support defaults or constraints. A subsequent ALTER
COLUMN command must be issued to achieve the desired results.

ALTER TABLE does not support some of the functionality as specified in SQL-92. Specifically, SQL-92 allows
constraints to be dropped from a table. To achieve this result in PostgreSQL, indexes must be dropped, or the
table must be re-created and reloaded.

Examples

To add a column statecode of type VARCHAR[2] to the table authors, you would issue the following
command:

ALTER TABLE authors ADD COLUMN statecode VARCHAR[2];

To rename the column statecode to state, you would use the following command:

ALTER TABLE authors RENAME COLUMN statecode TO state;

To change the default value of column state to TX, do the following:

ALTER TABLE authors ALTER COLUMN state SET DEFAULT 'TX';

To add a FOREIGN KEY constraint to the table authors, which ensures that the field state is a valid entry (as
defined by the foreign table us_states), issue this command:

ALTER TABLE authors ADD CONSTRAINT statechk FOREIGN KEY (state) REFERENCES
us_states (state) MATCH FULL;

To rename the table authors to writers, do the following:

ALTER TABLE authors RENAME TO writers;

ALTER USER

Usage

ALTER USER username [ WITH PASSWORD password]


[ CREATEDB | NOCREATEDB ]
[ CREATEUSER | NOCREATEUSER ]
[ VALID UNTIL abstime]

Description

ALTER USER modifies an existing user account in the database.

The optional clauses CREATEDB or NOCREATEDB determine whether the user is allowed to create databases.

The optional clauses CREATEUSER or NOCREATEUSER determine whether the user will be allowed to create users
of his or her own.
The optional clause VALID UNTIL supplies the date and/or time when the password will expire.

Input(s)

username—The username whose attributes will be modified.

password—The new password for this user.

abstime—The date and/or time that this password will expire.

Output(s)

ALTER USER (Message returned if the action was a success.)

ERROR: ALTER USER: user 'username' does not exist (Message returned if the username does not
exist in current database.)

Notes

Only a database administrator or superuser can modify privileges and account expiration.

To create or drop a user from the database, use CREATE USER or DROP USER, respectively.

SQL-92 Compatibility

SQL-92 does not define the concept of USERS; it is left for each implementation to decide.

Examples

To change the password of the user charles to qwerty, issue the following command:

ALTER USER charles WITH PASSWORD 'qwerty';

To set the password of the user charles to expire on January 1, 2005, issue the following command:

ALTER USER charles VALID UNTIL 'Jan 1 2005';

To cause the password of the user charles to expire at 12:35 on January 1, 2005, in a time zone that is six
hours ahead of UTC, issue the following command:

ALTER USER charles VALID UNTIL 'Jan 1 12:35:00 2005 +6';

To give the user charles the capability to create his own users but not his own databases, issue the following
command:

ALTER USER charles CREATEUSER NOCREATEDB;

BEGIN

Usage

BEGIN [ WORK | TRANSACTION ]

Description

By default, all commands issued in PostgreSQL are performed in an implicit transaction. The explicit use of the
BEGIN…COMMIT clauses encapsulates a series of SQL commands to ensure proper execution. If any of the
commands in the series fail, it can cause the entire transaction to ROLLBACK, bringing the database back to its
original state.

PostgreSQL transactions are normally set to be READ COMMITTED, which means that in-process transactions can
see the effect of other committed transactions. The behavior can be changed by issuing a SET TRANSACTION
ISOLATION LEVEL SERIALIZABLE command after a transaction has started. This would have the effect of
preventing the current transaction from seeing any changes to the database while it is in process. (See the
"Examples" section.)

Input(s)

None. WORK and TRANSACTION are optional and have no effect.

Output(s)

BEGIN (Message issued once transaction series has begun.)

NOTICE: BEGIN: already a transaction in progress (Message indicates that a transaction is
already in progress and that the BEGIN just issued has no effect on the existing transaction.)

Notes

See ABORT, COMMIT, and ROLLBACK for more information regarding transactions.

SQL-92 Compatibility

The BEGIN keyword is implicit in SQL-92; its explicit use is a PostgreSQL extension. Normally, in SQL-92, every
transaction begins with an implicit BEGIN command but requires a COMMIT or ROLLBACK command to actually
commit the transaction to the database.

Examples

These examples focus on two users who are performing operations on the database concurrently. They
highlight how transactions affect the data that other users see, particularly with respect to the
READ COMMITTED and SET TRANSACTION ISOLATION LEVEL SERIALIZABLE settings. (Note: In the following
examples, the SELECT commands would also display the column names and return the actual data. This output
has been abbreviated to make these listings more readable.)

The following example shows User 1 as he or she is engaged in an explicit transaction series and User 2, who
is using only implicit transactions. The example shows what data each user can see.

User 1                                    User 2
------                                    ------
BEGIN TRANSACTION;
INSERT INTO mytable VALUES ('Pam');
(1) row inserted
                                          SELECT * FROM mytable;
                                          (0) rows found
SELECT * FROM mytable;
(1) row found
COMMIT TRANSACTION;
SELECT * FROM mytable;                    SELECT * FROM mytable;
(1) row found                             (1) row found

The following example shows how two explicit transactions, both using READ COMMITTED (which is the default),
affect each other. In particular, note how User 2 can view the effects of User 1 after User 1 has issued a
COMMIT command. Compare this example with the next one, which uses the SERIALIZABLE command.
User 1                                    User 2
------                                    ------
BEGIN TRANSACTION;                        BEGIN TRANSACTION;
SELECT * FROM mytable;                    SELECT * FROM mytable;
(0) results found                         (0) results found
INSERT INTO mytable VALUES ('Pam');
(1) result inserted
COMMIT;
                                          SELECT * FROM mytable;
                                          (1) result found
                                          INSERT INTO mytable VALUES ('Barry');
                                          (1) result inserted
SELECT * FROM mytable;
(1) result found
                                          COMMIT;
SELECT * FROM mytable;                    SELECT * FROM mytable;
(2) results found                         (2) results found

In this example, the SERIALIZABLE command is used to show how it prevents changes from being seen while
the transaction is in process. In effect, the SERIALIZABLE command takes a snapshot of the database as it
existed before the transaction was started and isolates it from the effects of other COMMIT commands.

User 1                                    User 2
------                                    ------
BEGIN TRANSACTION;                        BEGIN TRANSACTION;
                                          SET TRANSACTION ISOLATION LEVEL
                                              SERIALIZABLE;
SELECT * FROM mytable;                    SELECT * FROM mytable;
(0) results found                         (0) results found
INSERT INTO mytable VALUES ('Pam');
(1) result inserted
COMMIT;
                                          SELECT * FROM mytable;
                                          (0) results found
                                          INSERT INTO mytable VALUES ('Barry');
                                          (1) result inserted
SELECT * FROM mytable;
(1) result found
                                          COMMIT;
SELECT * FROM mytable;                    SELECT * FROM mytable;
(2) results found                         (2) results found

CLOSE

Usage

CLOSE cursor;

Description

This closes cursors that were opened by using the DECLARE command. Closing a cursor frees resources within
PostgreSQL and should be performed when the current cursor is no longer needed.

Input(s)

cursor—The name of the cursor to close.


Output(s)

CLOSE (Message returned if successful.)

NOTICE: PerformPortalClose: portal 'cursor' not found (Message returned if no cursor is found.)

Notes

By default, a cursor is closed if a COMMIT or ROLLBACK command is issued. See DECLARE for more discussion on
cursors.

SQL-92 Compatibility

CLOSE is fully SQL-92 compliant.

Example

To close the cursor newchecks:

CLOSE newchecks;
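Because cursors exist only within a transaction, a typical session brackets the DECLARE and CLOSE between
BEGIN and COMMIT. The following sketch assumes the authors table used elsewhere in this chapter:

```sql
BEGIN TRANSACTION;

-- Open a cursor over the authors table
DECLARE newchecks CURSOR FOR SELECT * FROM authors;

-- Retrieve the first two rows through the cursor
FETCH FORWARD 2 IN newchecks;

-- Release the cursor's resources explicitly
CLOSE newchecks;

COMMIT TRANSACTION;
```

If the CLOSE were omitted, the cursor would be closed implicitly when COMMIT ends the transaction.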

CLUSTER

Usage

CLUSTER index ON table;

Description

Normally, PostgreSQL physically stores data in tables in an unordered manner. CLUSTER forces PostgreSQL to
physically reorder the tables so that the data is grouped according to the index specified. Generally speaking,
database performance will improve after a CLUSTER command is issued. However, any subsequent inserts are
not physically grouped in the same manner. In effect, the CLUSTER command creates a static index based on the
criteria specified. If subsequent data is inserted or updated, the CLUSTER command must be reissued to
physically reorder the table.

Input(s)

index—The name of the index on which to perform the cluster.

table—The table name.

Output(s)

CLUSTER (Message returned when the command was successful.)

ERROR: relation <tablerelation_number> inherits "table"

ERROR: relation "table" does not exist

Notes

To perform the reordering of data, PostgreSQL copies the table in index order to a temporary table and then re-
creates and reloads the table in the new order. This causes any grant permissions and other indexes to be lost in
the transfer.

Because the CLUSTER command produces a static ordering, most users would only benefit from this command
for specific cases. Dynamic clusters can be created by using the ORDER BY clause within a SELECT command.
(See the section on SELECT later in this chapter.)

The CLUSTER command can take several minutes to complete. This depends on the size of the table and/or the
hardware speed of the system.

SQL-92 Compatibility

SQL-92 has no CLUSTER command.

Example

The following example shows a table named authors that has an index called name. The same effect could be
achieved by using a SELECT…ORDER BY name command.

SELECT * FROM authors;


Name | Age
-------------
Pam | 25
Barry | 29
Tom | 32
Amy | 43
CLUSTER name ON authors;

SELECT * FROM authors;


Name | Age
----------------
Amy | 43
Barry | 29
Pam | 25
Tom | 32

COMMENT

Usage

COMMENT ON DATABASE | INDEX | RULE | SEQUENCE | TABLE | TYPE | VIEW obj_name IS text

Or

COMMENT ON COLUMN table_name.column_name IS text |


AGGREGATE agg_name agg_type IS text |
FUNCTION func_name (arg1, arg2 [,…]) IS text |
OPERATOR op_name (leftoperand_type, rightoperand_type) IS text |
TRIGGER trigger_name ON table_name IS text

Description

COMMENT enables users or administrators to associate text descriptions with an object.

Input(s)

obj_name—The name of the object on which to place a comment.


table_name—The name of the table to be affected.

column_name—The specific column to affect.

agg_name—The aggregate name on which to comment.

agg_type—The aggregate type to affect.

func_name—The function name.

op_name—The operator name.

trigger_name—The name of the trigger.

text—The actual text of the comment to place on the object.

Output(s)

COMMENT (Message returned if successful.)

Notes

COMMENT has no effect on an object; it is used only for documentation purposes.

Comments on an object can be retrieved from within psql by using the \dd command. (See "psql" in Chapter
6, "User Executable Files," for more.)

SQL-92 Compatibility

SQL-92 has no COMMENT command; this is an extension by PostgreSQL.

Examples

To add a comment to the table authors:

COMMENT ON TABLE authors IS 'Listing of our authors';

Here are some other examples of COMMENT:

COMMENT ON DATABASE newriders IS 'Database for web-site';


COMMENT ON COLUMN authors.email IS 'Email address for author';
COMMENT ON FUNCTION book_sales(varchar) IS 'Returns books sold';
COMMENT ON OPERATOR > (int, int) IS 'Compares two integers';
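A comment can also be removed by storing a NULL in place of the comment text. For example, to clear the
comment placed on the table authors:

```sql
COMMENT ON TABLE authors IS NULL;
```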

COMMIT

Usage

COMMIT [ WORK | TRANSACTION ]

Description

By default, all commands issued in PostgreSQL are performed in an implicit transaction. The explicit use of the
BEGIN…COMMIT clauses encapsulates a series of SQL commands to ensure proper execution. If any of the
commands in the series fail, it can cause the entire transaction to ROLLBACK, bringing the database back to its
original state.
Input(s)

None. WORK and TRANSACTION are optional and have no effect.

Output(s)

COMMIT (Message returned if successful.)

NOTICE: COMMIT: no transaction in progress (Message returned if there's no current transaction.)

Notes

See ABORT, BEGIN, and ROLLBACK for more information regarding transactions.

SQL-92 Compatibility

SQL-92 only specifies the forms COMMIT and COMMIT WORK. Otherwise, this command is fully compliant.

Example

This example shows two users who are concurrently using the table mytable. The row inserted by User
1 is not seen by User 2 until a COMMIT command is issued. (This assumes the default READ COMMITTED
isolation level; see the section on BEGIN for more information.)

User 1                                    User 2
------                                    ------
BEGIN TRANSACTION;
INSERT INTO mytable VALUES ('Pam');
(1) row inserted
                                          SELECT * FROM mytable;
                                          (0) rows found
SELECT * FROM mytable;
(1) row found
COMMIT TRANSACTION;
SELECT * FROM mytable;                    SELECT * FROM mytable;
(1) row found                             (1) row found

COPY

Usage

COPY [ BINARY ] table [ WITH OIDS ] FROM {filename | stdin}


[ [USING ] DELIMITERS delimiter]
[ WITH NULL AS nullstring]

Or

COPY [ BINARY ] table [ WITH OIDS ] TO {filename | stdout}


[ [USING ] DELIMITERS delimiter]
[ WITH NULL AS nullstring]

Description

The COPY command enables users to import or export tables from PostgreSQL. By using the BINARY keyword,
data will be used in a binary format and will not be human readable. For ASCII formats, the delimiters can be
specified by including the USING DELIMITERS keyword. Additionally, null strings can be specified by using the
WITH NULL clause. The inclusion of the WITH OIDS clause will cause PostgreSQL to export or expect the Object
IDs to be present.

Text File Structures

When COPY…TO is used without the BINARY keyword, PostgreSQL will generate a text file in which each row
(instance) is contained on a separate line of the text file. If a character embedded in a field also matches the
specified delimiter, the embedded character will be preceded with a backslash (\). OIDs, if included, will be the
first item on the line. The format of a generated text file will look like this:

<OID.Row1><delimiter><Field1.Row1><delimiter>…<Field N.Row1><newline>
<OID.Row2><delimiter><Field1.Row2><delimiter>…<Field N.Row2><newline>
…
<OID.RowN><delimiter><Field1.RowN><delimiter>…<Field N.RowN><newline>
(EOF)

If COPY…TO is sending to standard output (stdout) instead of a text file, the End-Of-File (EOF) will be
designated by \.<newline> (backslash followed by a period followed by a new line).

If COPY…FROM is being used, it will expect the text file to have this same format. Similarly, if being copied from
standard-input (stdin), COPY…FROM will expect the last row to be \.<newline> (backslash followed by a
period followed by a new line). However, in the case of COPY…FROM using a file, the process will terminate if a \.
<newline> is received or when an <EOF> occurs.
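The stdin format described above can be sketched as follows. Two comma-delimited rows are supplied inline,
and the backslash-period line marks the end of the input (this assumes the authors table used in the later
examples):

```sql
COPY authors FROM stdin USING DELIMITERS ',';
Amy,43
Barry,29
\.
```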

Binary File Structures

If COPY…TO is used with the BINARY clause, PostgreSQL will generate the resulting file as a binary file type. The
format for a binary file will be as follows:

DATA TYPE DESCRIPTION


Uint32 Total number of tuples (instances) in file
Uint32 Total length of tuple data
Uint32 OID (if specified)
Uint32 Number of NULL attributes
[Uint32,…Uint32] Attribute numbers of the NULL attributes, counting from 0, followed by the tuple data

Uint32 is an unsigned 32-bit integer.

Input(s)

table—The name of the table to copy into or copy from.

filename—The filename to copy into or from.

stdin—Specifies that the file should come from the standard input or pipe.

stdout—Specifies that the file should be copied to the standard output or pipe.

delimiter—A one-character delimiter to use for separating fields.

nullstring—A string used to signify NULL values. Default = \N (backslash N).

Output(s)

COPY (Message returned if the command was successful.)

ERROR: reason (Message returned if the copy failed with reason for failure.)

Notes
The user must have SELECT permission to execute a COPY…TO command and SELECT INTO permission to
execute a COPY…FROM command.

By default, the delimiter is the tab (\t). If the delimiter specified with the USING DELIMITER option is more
than one character long, only the first character will be used.

When a filename is given, PostgreSQL assumes the current directory (such as $PGDATA). In general, it is best to
use the full pathname of the file so that confusion does not occur. Accordingly, the user executing the COPY
command must have sufficient permissions to create, modify, or delete a file in the specified directory. This is, of
course, more related to the permissions granted to the user by the underlying OS than to a specific issue related
to PostgreSQL.

Using a COPY command will not invoke table rules or defaults. However, triggers will still continue to function.
Therefore, additional operations might need to take place after a COPY command is issued (to replace defaults,
for instance).

Generally, using the BINARY keyword will result in a faster execution time. However, this depends on the data
stored in the table.

SQL-92 Compatibility

There is no specification for the COPY command in SQL-92. It is left for each implementation to decide how to
import and export data.

Examples

To copy the data from the table authors to the file /home/sqldata.txt in a comma-delimited format:

COPY authors TO '/home/sqldata.txt' USING DELIMITERS ',';

This produces a file that contains the following entries:

Amy,43
Barry,29
Pam,25
Tom,32

If the WITH OIDS clause is added, then it becomes the following:

COPY authors WITH OIDS TO '/home/sqldata.txt'


USING DELIMITERS ',';

This produces a file that contains the following entries:

15001,Amy,43
15002,Barry,29
15003,Pam,25
15004,Tom,32

Alternatively, if the BINARY keyword is added, then the statement becomes the following (the USING
DELIMITERS clause does not apply to binary output):

COPY BINARY authors TO '/home/sqldata.txt';

And if you view the data with the UNIX od -c command:

004 \0 \0 \0 \f \0 \0 \0 \0 \0 \0 \0 \a \0 \0 \0
A m y \0 + \0 \0 \0 020 \0 \0 \0 \0 \0 \0 \0
\t \0 \0 \0 B a r r y \0 \0 \0 035 \0 \0 \0
\f \0 \0 \0 \0 \0 \0 \0 \a \0 \0 \0 P a m \0
031 \0 \0 \0 \f \0 \0 \0 \0 \0 \0 \0 \a \0 \0 \0
120 T o m \0 \0 \0 \0
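The WITH NULL clause mentioned in the description works the same way as the other options. As a sketch, the
following exports NULL fields as the literal string NULL rather than the default \N:

```sql
COPY authors TO '/home/sqldata.txt'
     USING DELIMITERS ','
     WITH NULL AS 'NULL';
```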

CREATE AGGREGATE

Usage
CREATE AGGREGATE name (BASETYPE = input_data_type
[ , SFUNC1= sfunc1, STYPE1=state1_type]
[ , SFUNC2= sfunc2, STYPE2=state2_type]
[, FINALFUNC= ffunc]
[, INITCOND1= initial_condition1]
[, INITCOND2= initial_condition2] )

Description

PostgreSQL includes a number of built-in aggregates, such as sum(), avg(), min(), and max(). By using the
CREATE AGGREGATE command, users can extend PostgreSQL to include user-defined aggregate functions.

An aggregate is composed of at least one function but can include up to three. There are two state-transition
functions, sfunc1 and sfunc2, and a final calculation function, ffunc. They are used as follows:

sfunc1 (internal-state-1, next-data-item) --> next-internal-state-1


sfunc2 (internal-state-2) --> next-internal-state-2
ffunc (internal-state-1, internal-state-2) --> aggregate-value

next-internal-state-1 and next-internal-state-2 are temporary variables created by PostgreSQL


that hold the current internal state of the aggregate as it is computed. (These variables are of type
state1_type and state2_type, respectively.) After all the data has been calculated through the related
functions, the final function can be invoked to calculate the aggregate's output value.

Additionally, an aggregate function can provide one or two initial values for the related functions. If only one
sfunc is used, this initial value is optional. However, if sfunc2 is specified, then initial_condition2 is a
mandatory inclusion.

Input(s)

name—The name of the new aggregate to create.

input_data_type—The data type that this aggregate operates on (that is, INT, VARCHAR, and so on).

sfunc1—The first state-transition function to operate on all non-NULL values (see the next "Notes" section).

state1_type—The data type for the first state-transition function (that is, INT, VARCHAR, and so on).

sfunc2—The second state-transition function to operate on all non-NULL values (see the next "Notes" section).

state2_type—The data type for the second state-transition function (that is, INT, VARCHAR, and so on).

ffunc—The final function to compute the aggregate after all input is completed (see "Notes" below).

initial_condition1—The initial value for state_value_1.

initial_condition2—The initial value for state_value_2.

Output(s)

CREATE (Message returned if successful.)

Notes

ffunc must be included if both sfunc functions are included. If only one transition function is used, then it is
optional. When ffunc is not included, the aggregate's output value is derived from the last value as computed
by sfunc1.

Two aggregates can have the same name if they each operate on different data types. In this way, PostgreSQL
allows for an aggregate name to be used, but will choose the correct version depending on the data type it is
given. In other words, if you have two functions— both named the same, but each accepting a different data type
such as foo([varchar]) and foo([int])—then you only need to call the aggregate foo([our-data-
type]), and PostgreSQL will choose the appropriate version to compute the output aggregate value.

SQL-92 Compatibility

This is a PostgreSQL extension; there is no equivalent concept in SQL-92.

Examples

The following code creates an aggregate called complex_sum, which extends the standard sum() function by
adding complex number support.

CREATE AGGREGATE complex_sum (sfunc1=complex_add,


basetype=complex, stype1=complex, initcond1='(0,0)');

Then, when the code is run, you get the following output:

SELECT complex_sum(salary) FROM authors;

complex_sum
-----------
(34,53.9)
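An aggregate that uses both state-transition functions and a final function can be sketched with the built-in
int4 support routines: int4pl sums the inputs, int4inc counts them, and int4div divides the two states to
yield an average. (These function names are the built-in integer routines; verify that they exist in your
version before relying on this sketch.)

```sql
CREATE AGGREGATE my_average (basetype=int4,
    sfunc1=int4pl,  stype1=int4, initcond1='0',
    sfunc2=int4inc, stype2=int4, initcond2='0',
    finalfunc=int4div);
```

Note that initcond2 is supplied because sfunc2 is specified, as the rule described above requires.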

CREATE DATABASE

Usage

CREATE DATABASE name [ WITH LOCATION='path' ]

Description

CREATE DATABASE is used to create new PostgreSQL databases. The creator must have sufficient permissions
to perform such an action. Accordingly, once the database is created, the user will then become its owner.

By default, PostgreSQL will create the database in the standard data directory (that is, $PGDATA). However,
alternate paths can be identified by including the WITH LOCATION keywords.

Input(s)

name—The name of the database to create.

path—The path and/or filename of the database file to create.

Output(s)

CREATE DATABASE (Message returned if successful.)

ERROR: user 'username' not allowed to create/drop databases (Message returned if the user
doesn't have permission to create or drop databases.)

ERROR: createdb: database 'name' already exists (Message returned if the database already exists.)

ERROR: Single quotes are not allowed in database name (Message returned if the database name
contains single quotes.)

ERROR: Single quotes are not allowed in database path (Message returned if the pathname
contains single quotes.)
ERROR: The path 'pathname' is invalid (Message returned if the path doesn't exist.)

ERROR: createdb: May not be called in transaction block (Message returned if trying to create a
database while in an explicit transaction.)

ERROR: Unable to create database directory 'path'

Or

ERROR: Could not initialize database directory (Message returned usually because user doesn't
have sufficient permissions in the specified directory.)

Notes

If the location definition contains a slash (/), then the leading part is assumed to be an environment variable,
which must be known to the server process. However, if PostgreSQL is compiled with the option
ALLOW_ABSOLUTE_PATHS set to true, then absolute pathnames are also allowed (for example,
/home/barry/pgsql). By default, this option is set to false.

Before an alternate location can be used, it must be prepared with the initlocation command. For more
information, see Chapter 7, "System Executable Files," in the section "initlocation."

SQL-92 Compatibility

Databases are equivalent to the SQL-92 concept of catalogs, which are left for the specific implementation to
define.

Examples

The following is a simple example that creates a new database named sales:

CREATE DATABASE sales;

The following example creates a database in an alternate location, based on an environment variable that is
known to the server.

CREATE DATABASE sales WITH LOCATION='PGDATA2/sales';

CREATE FUNCTION

Usage

CREATE FUNCTION name ( [ftype [,…] ] )


RETURNS rtype AS definition
LANGUAGE lang_name [ WITH (attrib [,…]) ]

Or

CREATE FUNCTION name ( [ftype [,…] ] )


RETURNS rtype AS obj, link_symbol
LANGUAGE 'C' [ WITH (attrib [,…]) ]

Description

CREATE FUNCTION enables users to create functions in PostgreSQL. PostgreSQL allows for the concept of
function overloading; that is, the same name can be used for several different functions as long as they
each operate on different data types. However, this must be used with caution with respect to C namespaces.
See Chapter 12, "Creating Custom Functions," for more information.
Input(s)

name—The name of the function to create.

ftype—The data type that the function arguments require.

rtype—The data type that the function returns.

definition—Either the actual code, a function name, or the path to the object file that defines the function.

obj—When used with C code, the actual object file that defines the function.

link_symbol—Used to define the objects link symbol, if applicable.

lang_name—The name of the language used.

attrib—Optional information used for optimization purposes (see the "Notes" section for more information).

Output(s)

CREATE (Message returned if successful.)

Notes

The user who creates a function will become the subsequent owner of the function.

Use DROP FUNCTION to remove user-defined functions from PostgreSQL.

SQL-92 Compatibility

CREATE FUNCTION is a PostgreSQL language extension.

Examples

The following code creates a simple SQL function that returns the date of the last check for a given employee.
First the function needs to be defined:

CREATE FUNCTION last_check(varchar)


RETURNS datetime AS
'BEGIN;
SELECT max(check_date) FROM authors WHERE authors.name=$1;
END;'
LANGUAGE 'sql';

Now you can test it out by passing it some test data:

SELECT last_check('Pam') AS CHECK_DATE;

CHECK_DATE
----------
11/14/2001
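A single-statement SQL function does not need the BEGIN…END wrapper. As a minimal sketch, the following
defines a function that adds two integers:

```sql
CREATE FUNCTION add_em(int4, int4) RETURNS int4
    AS 'SELECT $1 + $2;'
    LANGUAGE 'sql';

SELECT add_em(2, 3) AS total;
```

The SELECT returns 5 in the total column.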

CREATE GROUP

Usage

CREATE GROUP name [ WITH SYSID gid] [ USER username [,… ] ]


Description

CREATE GROUP is used to initiate a new group in the current database. Additionally, users can be added to the
newly created group by specifying the USER keyword. By default, the group will be given the next group ID
(gid); however, if the clause WITH SYSID is specified, the user can declare the gid to use (if available).

Input(s)

name—The name of the new group to create.

gid—If specified, the group ID to assign.

username—If specified, the user to add to the new group.

Output(s)

CREATE GROUP (Message returned if successful.)

Notes

If the username is specified, it must already exist before it can be used.

The user of this command must have superuser access to the database.

SQL-92 Compatibility

There are no GROUPS in SQL-92; however, the concept of ROLES is similar.

Examples

Create a new group called book_authors:

CREATE GROUP book_authors;

Create a new group and assign users to it:

CREATE GROUP book_authors WITH USER barry, pam, tom;
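To choose the group ID explicitly, combine the WITH SYSID clause and the USER list in a single statement. A
sketch follows (the gid 500 is arbitrary and must not already be in use, and the named users must already
exist):

```sql
CREATE GROUP editors WITH SYSID 500 USER barry, pam;
```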

CREATE INDEX

Usage

CREATE [ UNIQUE ] INDEX indexname ON tablename


[ USING idx_method] (columnname [oprname] [,… ])

Or

CREATE [ UNIQUE ] INDEX indexname ON tablename


[ USING idx_method] (funcname (columnname [,…])
[opr_name] [,… ])

Description

This command creates an index on the particular column and table specified. Generally, this will improve
database performance if the affected columns are used in query operations.
In addition to creating indexes on specific columns, PostgreSQL also allows for the creation of indexes based on
the results generated by a function. This allows dynamic indexes to be created for data that would normally
require significant transformation to generate via standard operations.

By default, PostgreSQL creates indexes using the BTREE method. However, with the inclusion of the USING
idx_method clause, it is possible to specify other methods. The following index methods are possible (see the
"Notes" section for more information):

BTREE. Implementation of Lehman-Yao high-concurrency B-Tree method.

RTREE. Guttman's quadratic-split R-Tree method.

HASH. Litwin's linear hash method.

In addition to being able to specify index methods, PostgreSQL also allows for the specification of which operator
classes to use. Normally, it is sufficient to accept the base operator classes for the field's data type; however,
there are cases in which such a specification would be useful. For instance, in the case of a complex number that
needs to be indexed based on the absolute and the real value, it would be beneficial to specify the particular
operator class at index creation time to achieve the most efficient indexing method.

Input(s)

UNIQUE—The addition of this keyword mandates that all data contained in the specified column will always hold a
unique value. If subsequent data is inserted that is not unique, an error message will be generated.

indexname—The name of the index to create.

tablename—The table on which the index is contained.

idx_method—The index method to use: BTREE (default), RTREE, or HASH.

columnname—The specific column to index.

funcname—Index the result of this supplied function.

oprname—The specific operator class to use when performing index.

Output(s)

CREATE (Message returned if successful.)

ERROR: Cannot create index: 'index_name' already exists. (Message returned if index is already
in existence.)

Notes

The BTREE method is the most common (and default) type of index used. Additionally, the BTREE method is the
only one that supports multicolumn indexes (up to 16 by default). When data is searched with one of the
following operators, BTREE index use is preferred:

< Less than

<= Less than or equal to

= Equal to

>= Greater than or equal to


> Greater than

The RTREE method is most useful for determining geometric relations. In particular, if the following operators are
used, the RTREE index method is preferred:

<< Object lies to the left

&< Object overlaps to the left

&> Object overlaps to the right

>> Object lies to the right

&& Object overlaps

@ Object contains or is on

~= Same as

The HASH method provides for a very quick comparison but is only useful when the following operator is invoked:

= Equal to

SQL-92 Compatibility

CREATE INDEX is a PostgreSQL language extension. SQL-92 has no such command.

Examples

To create an index on the column lastname on the table authors:

CREATE INDEX name_idx ON authors(lastname);

To create a unique index on the column check_num on the table payroll:

CREATE UNIQUE INDEX check_num_idx ON payroll(check_num);
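The function-based form of CREATE INDEX described above can be sketched as follows. This indexes the
lowercased value of lastname so that case-insensitive lookups can be served from the index (lower() is a
built-in string function):

```sql
CREATE INDEX lower_name_idx ON authors (lower(lastname));
```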

CREATE LANGUAGE

Usage

CREATE [ TRUSTED ] PROCEDURAL LANGUAGE 'lang-name'


HANDLER handler-name
LANCOMPILER 'comment'

Description
The capability to add a new language to PostgreSQL is one of its more advanced features. The CREATE
LANGUAGE command enables the administrator to catalog a new language in PostgreSQL, which can then be
used to create functions.

Care must be taken when using the TRUSTED keyword. Its inclusion indicates that the particular language offers
unprivileged users no functionality to bypass access restrictions. When the TRUSTED keyword is not used, it
indicates that only superusers can use this language to create new functions.

See Part IV, "Programming with PostgreSQL," for more information on registering new languages in PostgreSQL.

Input(s)

TRUSTED—Keyword that indicates whether the language can be trusted with unprivileged users.

lang-name—The new language name to add to the system. A new language name cannot override a built-in
PostgreSQL language.

HANDLER handler-name—The name of an existing function that is called to execute the newly registered
language.

comment—At this time, comment performs no function and is purely optional.

Output(s)

CREATE (Message returned if the command was successful.)

ERROR: PL handler function func() doesn't exist (Message returned if the handler function is not
registered.)

Notes

Handler functions must take no arguments and return the opaque type, a placeholder for an unspecified data
type. This eliminates the possibility of calling a handler function as a standard function within a query.

However, arguments must be specified on the actual call from the PL function in the desired language.
Specifically, the following arguments must be included:

Triggers. When called from the trigger manager, the only argument required is the object ID from that
procedure's pg_proc entry.

Functions. When called from the function manager, the arguments needed are as follows:

The object ID from pg_proc.

The number of arguments given to the PL function.

The actual arguments, given in a FmgrValues structure.

A pointer to a Boolean value that indicates to the caller whether the return value is SQL NULL.

SQL-92 Compatibility

There is no CREATE LANGUAGE statement in SQL-92. This is a PostgreSQL extension.

Example

This example implies that the handler function pl_call_hand already exists. First, you need to register the
pl_call_hand as a function, and then it can be used to define a new language:
CREATE FUNCTION pl_call_hand () RETURNS opaque
AS '/usr/local/pgsql/lib/my_pl_handler.so'
LANGUAGE 'C';

CREATE PROCEDURAL LANGUAGE 'my_pl_lang'


HANDLER pl_call_hand
LANCOMPILER 'PL/Sample';

CREATE OPERATOR

Usage

CREATE OPERATOR name ( PROCEDURE=function_name


[, LEFTARG=type1 ]
[, RIGHTARG=type2 ]
[, COMMUTATOR=comut_op ]
[, NEGATOR=negat_op ]
[, RESTRICT=rest_func ]
[, JOIN=join_func ]
[, HASHES ]
[, SORT1=l_sort_op ]
[, SORT2=r_sort_op ] )

Description

This command names a new operator from the following possible candidates:

| ` ? $ : + - * / < > = ~ ! @ # % ^ &

There are some exceptions concerning how the operator can be named:

The character sequences -- and /* cannot appear anywhere in an operator name. (These sequences signify
the start of a comment and are therefore ignored.)

A dollar sign ($) or a colon (:) cannot be defined as a single-character name. However, it is permissible to
use them as part of a multicharacter name (such as $%).

A multicharacter name cannot end with a plus (+) or minus (-) sign unless certain conditions are met. This
is due to how PostgreSQL parses operators for queries. For an operator name to end with a plus or minus
sign, the name must also contain at least one of the following characters:

~ ! @ # % ^ & | ` ? $ :

In addition to the restrictions on naming conventions, the right-hand and/or left-hand data types must be
defined. For unary operators, either the LEFTARG or RIGHTARG data type must be defined. Subsequently, both
must be defined for binary operators. Binary operators have a data type on each side of the operator (that is, x
OPERATOR y), whereas unary operators only contain data on one side.

Other than the preceding items, the only other required member of a CREATE OPERATOR command is the
PROCEDURE. The function_name specifies a previously created function that handles the underlying work
necessary to deliver the correct answer.
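To make the unary case concrete, the following sketch registers a postfix (left-argument-only) operator. The operator name and the function int4_fact_proc are hypothetical and assume such a function has already been created:

```sql
-- Hypothetical postfix operator: only LEFTARG is defined, so the
-- operator is written after its argument (for example, SELECT 5 !;).
-- int4_fact_proc is assumed to already exist.
CREATE OPERATOR ! (
    PROCEDURE = int4_fact_proc,
    LEFTARG = int4
);
```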

The remaining options (COMMUTATOR, NEGATOR, RESTRICT, JOIN, HASHES, SORT1, and SORT2) are used to
help the query optimization process. Generally, it is not necessary to define these optimization helpers; the
downside of omitting them is that queries will take longer than needed to complete. Care should be taken when
defining these options, because incorrect use of these optimization parameters can result in core dumps and/or
other server mishaps.

Consult Part IV of this book for more information on creating operators.

Input(s)
name—The name of the operator to create (see the preceding naming conventions).

function_name—The function to process the operator's utility.

type1—The data type of the left-hand argument, if any.

type2—The data type of the right-hand argument, if any.

comut_op—The equivalent operator for switched left-hand and right-hand data placement.

negat_op—The operator that negates the current operator (for example, != negates =).

rest_func—The function used to estimate the selectivity restriction in determining how many rows will pass
when the operator is part of a WHERE clause (see Part IV for more information).

join_func—The function used to estimate the selectivity of joins that would result if the operator were used
on fields from a pair of tables.

HASHES—Indicates to PostgreSQL that it is permissible to use hash-level equality matching for a join based on
this operator.

l_sort_op—Defines the left-hand sort operator that is needed to optimize merge joins.

r_sort_op—Defines the right-hand sort operator that is needed to optimize merge joins.

Output(s)

CREATE (Message returned if successful.)

Notes

The function function_name must exist before an operator can be defined. Likewise, rest_func and
join_func must exist if their associated options are to be set.

Both SORT1 and SORT2 must be defined if either is to be defined.

The RESTRICT, JOIN, and HASHES clauses may be used only on binary operators that return Boolean
values.

SQL-92 Compatibility

There is no CREATE OPERATOR syntax present in SQL-92; this is a PostgreSQL extension.

Example

This example shows the creation of a binary operator = that is used for comparing int4 data types. (Note: This
operator is already defined as part of the base PostgreSQL operator set; this example is for demonstration
purposes only.)

CREATE OPERATOR = (PROCEDURE=int4_equal_proc,
    LEFTARG = int4,
    RIGHTARG = int4,
    COMMUTATOR = =,
    NEGATOR = !=,
    RESTRICT = int4_restrict_proc,
    JOIN = int4_join_proc,
    HASHES,
    SORT1 = <,
    SORT2 = <);

CREATE RULE
Usage

CREATE RULE rulename AS ON event
    TO object
    [ WHERE condition ]
    DO [ INSTEAD ] [ action | NOTHING ]

Description

PostgreSQL enables users to define action rules that are executed automatically once fired by a specific event.
Although the concept of RULES is close to TRIGGERS, there are some important differences that make each
suitable for different tasks.

RULES are primarily useful for performing cascading chains of events to ensure that certain SQL actions are
always carried out. TRIGGERS are more useful for performing data validation before or after an action is
committed. However, the two overlap enough that each can perform the other's functionality in certain
cases.

The events that can be used to trigger rules are SELECT, UPDATE, INSERT, and DELETE. These events can be
bound either to a specific column or to an entire table.

One curious aspect of rule creation is the DO INSTEAD keywords. Normally, the action specified in the rule
definition is carried out in addition to the event that originally fired the trigger. However, with the inclusion of the
DO INSTEAD keywords, PostgreSQL can be directed to perform an alternate action that will supplant the action
that originally fired the event. Additionally, if the NOTHING keyword is included, no action at all will be
performed.

Input(s)

rulename—The name of the rule to create.

event—The specific event(s) that causes the action to initiate. Must be SELECT, UPDATE, INSERT, or DELETE.

object—The column or table to bind to this rule.

condition—The condition that satisfies the WHERE clause.

action—The SQL statement that performs the desired action.

Output(s)

CREATE (Message returned if the command was successful.)

Notes

When specifying the condition for the rule, it is permissible to use the new or old temporary variable for
performing dynamic queries (see the "Examples" section for more).

Care needs to be taken when designing cascading rules. It is possible to create infinite loops by defining multiple
rule actions that operate on circular definitions. In such cases, PostgreSQL will simply refuse to execute the rule
if it determines that it would result in an infinite loop.

You must have rule definition permissions for a table or column to define rules on it.

A SQL rule cannot reference an array and cannot pass parameters.

System attributes generally cannot be referenced in a rule definition (for example, func(cls) where cls is a
class). However, OIDs can be accessed from a rule.

SQL-92 Compatibility
CREATE RULE is a PostgreSQL extension; there is no SQL-92 command.

Examples

The following example shows how rules can be used to enforce referential integrity. In this case, if an author is
deleted from the authors table, it also marks that author's status as 'inactive' in the payroll table:

CREATE RULE del_author AS
    ON DELETE TO authors
    DO UPDATE payroll SET status='Inactive'
    WHERE payroll.auth_id = OLD.oid;

The next example shows how to redirect a user's action by using the DO INSTEAD clause. In this case, if the
user is not a manager, then no action is performed (notice the use of current_user, which is a built-in
environmental variable that contains the current user):

CREATE RULE upd_payroll AS
    ON UPDATE TO payroll
    WHERE current_user <> 'Manager'
    DO INSTEAD NOTHING;

In this example, you see how rules can be used to help managers keep track of important information as it changes
throughout the database. This rule definition will log all high-dollar orders to a separate table, which can then be
printed and purged daily for a manager's review:

CREATE RULE log_highdlr AS
    ON INSERT TO orders
    WHERE new.invoice_total > 1000
    DO
    INSERT INTO rep_table (amt, date, description)
    VALUES
    (new.invoice_total, new.invoice_date, 'Big $$ Invoice');

CREATE SEQUENCE

Usage

CREATE SEQUENCE name
    [ INCREMENT invalue ]
    [ MINVALUE mnvalue ]
    [ MAXVALUE mxvalue ]
    [ START stvalue ]
    [ CACHE cavalue ]
    [ CYCLE ]

Description

Sequences are number generators that PostgreSQL can use to produce series of sequential numbers for use
throughout the database. Most often, the CREATE SEQUENCE command is used to generate unique number
series for use in table inserts. However, sequences can be used for many different reasons and are independent
of any table-related functions.

After a sequence has been created, it will respond to the following function calls:

nextval(sequence)—Increments the sequence and returns the next number.

currval(sequence)—Returns the current value of the sequence (no modification is made to the existing
sequence).

setval(sequence, newvalue)—Sets the current sequence to a new value.
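As a quick sketch, these calls can be issued directly from a query; the sequence name order_seq here is hypothetical:

```sql
CREATE SEQUENCE order_seq;

SELECT nextval('order_seq');      -- returns 1 and advances the sequence
SELECT nextval('order_seq');      -- returns 2
SELECT currval('order_seq');      -- returns 2; the sequence is not advanced
SELECT setval('order_seq', 100);  -- sets the current value to 100
SELECT nextval('order_seq');      -- returns 101
```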

Input(s)

name—The name of the sequence to create.

invalue—The increment value, which also determines the direction of the sequence. A positive value (default =
1) will result in an ascending sequence. A negative value results in a descending sequence.

mnvalue—The minimal value that the sequence will reach. The value -2147483647 is the default for descending
sequences, and 1 is the default for ascending sequences.

mxvalue—The maximum value that the sequence will reach. The value 2147483647 is the default for ascending
sequences, and -1 is the default for descending sequences.

stvalue—The initial value that the sequence starts with.

cavalue—The number of sequence values that PostgreSQL preallocates and stores in memory for faster
access. The minimum and default value is 1.

CYCLE—Indicates whether the sequence should continue past the max or min values. If the outer bound (min or
max) is reached, the sequence will wrap around and begin again at the opposite bound (minvalue or maxvalue).

Output(s)

CREATE (Message returned if successful.)

ERROR: Relation 'sequence' already exists (Message returned if the sequence already exists.)

ERROR: DefineSequence: MINVALUE (start) can't be >= MAXVALUE (max) (Message returned if the
starting value is out of range.)

ERROR: DefineSequence: START value (start) can't be < MINVALUE (min) (Message returned if
the starting value is out of range.)

ERROR: DefineSequence: MINVALUE (min) can't be >= MAXVALUE (max) (Message returned if the
minimum and maximum values conflict with each other.)

Notes

Sequences actually exist in a database as a one-row table. An alternative method for determining the current
value would be to issue the following:

SELECT last_value FROM sequence_name;

SQL-92 Compatibility

There is no CREATE SEQUENCE statement in SQL-92.

Example

The following example shows how a sequence can be created and then bound to the default value of a table:

CREATE SEQUENCE chk_num INCREMENT 1 START 1;

CREATE TABLE mytable
(
    check_number int DEFAULT NEXTVAL('chk_num'),
    description VARCHAR(40),
    amount MONEY
);
CREATE TABLE

Usage

CREATE [ TEMPORARY | TEMP ] TABLE tablename
(
    columnname columntype [ NULL | NOT NULL ] [ UNIQUE ]
    [ DEFAULT defvalue ] [ column_constraint | PRIMARY KEY ] [,…]
    [, PRIMARY KEY (column [,…]) ] [, CHECK (condition) ]
    [, table_constraint ]
)
[ INHERITS (inheritable [,…]) ]

Description

CREATE TABLE is a comprehensive command that is used to enter a new table class into the current database.
In its most basic form, CREATE TABLE can simply be the listing of column names and data types. However,
specifying PRIMARY KEY, DEFAULT, and CONSTRAINT clauses can become considerably more complex and
requires more explanation.

Using the TEMP or TEMPORARY keyword signifies to PostgreSQL that the table being created should only
exist for the length of this session. Once the current session is completed, the table will automatically be dropped
from the database.

The syntax of the CREATE TABLE command can be broken up according to column-level or table-wide directives.

Column-Level Commands

At the column level, you can specify many clauses that act to constrain the acceptable data that might be
inserted in that field. Use NULL or NOT NULL clauses to specify whether or not null values are permitted in a
column.

Also at the column level, the UNIQUE keyword can be used to mandate that all values in that column be unique.
In actuality, this is performed by PostgreSQL creating a unique index on the desired column. In addition to the
UNIQUE keyword, you can also specify that the current column is intended to be a primary key. A primary key
implies that values will be unique and non-null, but it also indicates that other tables might rely on this column
for referential integrity reasons.

By using the DEFAULT keyword, default values can be specified for a particular column. These include either
hard-coded defaults or the results of functions.

The CONSTRAINT clause can be used to define more advanced constraints than are possible through the NULL,
DEFAULT, and UNIQUE keywords. However, note that explicitly named constraints can have significant overlap
with the existing keywords present in the CREATE TABLE command. For instance, it is possible to designate a
column as non-null by using either of the two methods:

CREATE TABLE mytable (myfield1 VARCHAR(10) NOT NULL);

Or

CREATE TABLE mytable (myfield1 VARCHAR(10)
    CONSTRAINT no_nulls NOT NULL);

Essentially, both methods are valid ways to ensure that non-null values are rejected from the column. However,
the CONSTRAINT clause offers many features that are more advanced. The full syntax of the columnar
CONSTRAINT command is as follows:

CONSTRAINT name
{
    [ NULL | NOT NULL ] | UNIQUE | PRIMARY KEY | CHECK constraint |
    REFERENCES reftable(refcolumn)
        [ MATCH mtype ]
        [ ON DELETE delaction ]
        [ ON UPDATE upaction ]
        [ [NOT] DEFERRABLE ]
        [ INITIALLY chktime ]
}
[, …]

By using the CHECK constraint clause, it is possible to include a conditional expression that resolves to a
Boolean result. If the result returned is TRUE, then the CHECK constraint will pass.

The following is a more detailed list containing examples of valid column-level constraint clauses:

NOT NULL constraint

The NOT NULL constraint at the column level takes the following syntax:

CONSTRAINT name NOT NULL

UNIQUE constraint

The UNIQUE constraint at the column level takes the following syntax:

CONSTRAINT name UNIQUE

PRIMARY KEY constraint

The PRIMARY KEY constraint at the column level takes the following syntax:

CONSTRAINT name PRIMARY KEY

CHECK constraint

The CHECK constraint evaluates a conditional expression that returns a Boolean value. At the column level, it
takes the following syntax:

CONSTRAINT name CHECK(condition[,…])

REFERENCES constraint

The REFERENCES keyword allows external columns to be bound with the current column for referential integrity
purposes. The general syntax of REFERENCES at the columnar level is as follows:

CONSTRAINT name REFERENCES reftable [(refcolumn)]
    [MATCH mtype]
    [ON DELETE delaction]
    [ON UPDATE upaction]
    [[NOT] DEFERRABLE]
    [INITIALLY chktime]

Table 1.1 shows the valid options that the REFERENCES command can take.

Table 1.1. Valid Options for the REFERENCES Command

Option                Explanation

MATCH mtype           Where mtype is one of the following:

                      <default type>   Partial matches are possible for multikey foreign
                                       references (that is, some columns might be null
                                       and so on).

                      MATCH FULL       All columns in a multikey foreign reference must
                                       match (that is, all columns must be non-null and
                                       so on).

                      MATCH PARTIAL    Not currently implemented.

ON DELETE delaction   Where delaction is one of the following:

                      NO ACTION        The default. Produces an error if the foreign key
                                       is violated.

                      RESTRICT         Same as NO ACTION.

                      SET DEFAULT      Sets the column values to the default if
                                       referenced columns are deleted.

                      SET NULL         Sets the column values to null if referenced
                                       columns are deleted.

                      CASCADE          Deletes the current row if the referenced row is
                                       deleted.

ON UPDATE upaction    Where upaction is one of the following:

                      NO ACTION        The default. Produces an error if the foreign key
                                       is violated.

                      RESTRICT         Same as NO ACTION.

                      SET DEFAULT      Sets the column value to the default if referenced
                                       columns are updated.

                      SET NULL         Sets the column value to null if referenced
                                       columns are updated.

                      CASCADE          Cascades updates of referenced columns to the
                                       current field. The current row is updated if the
                                       referenced column is updated; if the referenced
                                       row is updated but no changes are made to the
                                       referenced column, no changes are made.

INITIALLY chktime     Where chktime is one of the following:

                      DEFERRED         Check the constraint only at the end of the
                                       current transaction.

                      IMMEDIATE        The default. Check the constraint after each
                                       statement.
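To illustrate how these options combine, here is a sketch of a column-level REFERENCES constraint; the tables and names are hypothetical:

```sql
-- Deleting an author's row deletes that author's books; updating
-- the referenced id cascades into books.author_id.
CREATE TABLE authors
(
    id INT PRIMARY KEY,
    name VARCHAR(40) NOT NULL
);

CREATE TABLE books
(
    title VARCHAR(80) NOT NULL,
    author_id INT CONSTRAINT fk_author REFERENCES authors (id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);
```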

Table-Level Commands

Many of the commands at the column level directly overlap commands issued at the table level. In most cases,
the syntax is the same, with the exception that table-level commands must also specify the column they are
acting upon. Commands issued at the columnar level will be implicitly bound to the current column.

Issuing a PRIMARY KEY is essentially the same as it is in the columnar specification. However, in this case, the
syntax is slightly different. Use the format …PRIMARY KEY(columnname)… instead of …columnname coltype
PRIMARY KEY… to specify a primary key at the table-level.

Additionally, the CONSTRAINT clause differs slightly at the table level. The following listing shows the table-level
CONSTRAINT clause:
CONSTRAINT name { PRIMARY KEY | UNIQUE } (columnname [,…])
[CONSTRAINT name ] CHECK (constraint_clause)
[CONSTRAINT name ] FOREIGN KEY (column[,…])
    REFERENCES reftable(refcolumn[,…])
    [MATCH mtype]
    [ON DELETE delaction]
    [ON UPDATE upaction]
    [[NOT] DEFERRABLE]
    [INITIALLY chktime]

UNIQUE constraint

The UNIQUE constraint at the table level takes the following syntax:

CONSTRAINT name UNIQUE(column[,…])

PRIMARY KEY constraint

The PRIMARY KEY constraint at the table level takes the following syntax:

CONSTRAINT name PRIMARY KEY(column[,…])

FOREIGN KEY constraint

The FOREIGN KEY constraint at the table level takes the following syntax:

CONSTRAINT name FOREIGN KEY(column[,…])

REFERENCES constraint

The REFERENCES keyword enables external columns to be bound with the current column for referential integrity
purposes. The general syntax of REFERENCES at the table level is as follows:

CONSTRAINT name REFERENCES reftable [(refcolumn)]
    [MATCH mtype]
    [ON DELETE delaction]
    [ON UPDATE upaction]
    [[NOT] DEFERRABLE]
    [INITIALLY chktime]

For a listing of the valid options that the REFERENCES command can take in the table-level version, please refer
to Table 1.1.
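As a sketch of the table-level form, the following hypothetical example combines a multicolumn primary key with a FOREIGN KEY constraint; it assumes a table orders with a primary key column id already exists:

```sql
CREATE TABLE order_items
(
    order_id INT,
    line_no INT,
    item_id INT,
    CONSTRAINT pk_items PRIMARY KEY (order_id, line_no),
    CONSTRAINT fk_order FOREIGN KEY (order_id)
        REFERENCES orders (id)
        ON DELETE CASCADE
);
```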

Input(s)

TEMP—Indicates whether the new table is a temporary table.

TEMPORARY—Indicates whether the new table is a temporary table.

tablename—The name of the table to create.

columnname—The column name to create in the new table.

columntype—What data type the column will hold. (For more information on data types, see Chapter 2,
"PostgreSQL Data Types.")

NULL—Indicates that the column should allow null values.

NOT NULL—Indicates that the column should not allow null values.

UNIQUE—Indicates that all values should be unique.


defvalue—The value or function that supplies the default value for this column.

Column constraint—A constraint clause that operates on the current column.

PRIMARY KEY—Indicates that all values for a column will be unique and non-null.

CHECK (conditional)—Used to signify that a conditional expression will be evaluated for the column or table
to determine whether an INSERT or UPDATE is permitted.

Table constraint—A constraint clause that operates on the current table.

INHERITS (inheritable)—Specifies a table(s) from which the current table will inherit all of its fields.

Output(s)

CREATE (Message returned if successful.)

ERROR: DEFAULT: type mismatch (Message returned if the default value type doesn't match the column data
type.)

Notes

Arrays can be specified as a valid columnar data type; however, consistent array dimensions are not enforced.

Up to the 7.0.X version of PostgreSQL, there was a compile-time limit of 8KB of data per row. By changing this
option and recompiling the source, a 32KB limit per row was possible. The newest release of PostgreSQL—Version
7.1—has introduced a new functionality dubbed TOAST (The Oversized-Attribute Storage Technique), which
promises virtually unlimited row-size limits.

Although it is possible to overlap columns with both the UNIQUE and PRIMARY KEY clauses, it is best not to
directly overlap indexes in such a way. Generally, there is a performance hit associated with overlapped indexes.

Ideally, columns referenced by the MATCH clause should carry UNIQUE or PRIMARY KEY bindings.
However, this is not enforced in PostgreSQL.

SQL-92 Compatibility

The CREATE TABLE command has so many attributes relating to SQL-92 compatibility that it is more
beneficial to discuss specific cases, as outlined in the following sections.

The TEMPORARY Clause

In PostgreSQL, temporary tables are only locally visible. SQL-92, however, also defines the idea of globally visible
temporary tables. Additionally, SQL-92 further defines global temporary tables with the ON COMMIT clause,
which can be used to delete table rows after a transaction is completed.

The UNIQUE Clause

In SQL-92, the UNIQUE clause at both the table and column level can take these additional options:
INITIALLY DEFERRED, INITIALLY IMMEDIATE, DEFERRABLE, and NOT DEFERRABLE.

The NOT NULL Clause

In the SQL-92 specification, the NOT NULL clause can also have the following options: INITIALLY DEFERRED,
INITIALLY IMMEDIATE, DEFERRABLE, and NOT DEFERRABLE.

The CONSTRAINT Clause


SQL-92 defines some additional functionality not present in the PostgreSQL implementation of the CONSTRAINT
clause. SQL-92 supports the concepts of ASSERTIONS and DOMAINS, which are not directly supported
in PostgreSQL.

The CHECK Clause

In the SQL-92 specification, the CHECK clause can also have the following options: INITIALLY DEFERRED,
INITIALLY IMMEDIATE, DEFERRABLE, and NOT DEFERRABLE.

The PRIMARY KEY Clause

In the SQL-92 specification, the PRIMARY KEY clause can also have the following options: INITIALLY
DEFERRED, INITIALLY IMMEDIATE, DEFERRABLE, and NOT DEFERRABLE.

Examples

The following is a basic example of how the command is used to create the table authors. It creates four fields:
one primary key (bound to a sequence default value), two mandatory non-null fields, and one date field.

CREATE TABLE authors
(
    Author_id INT PRIMARY KEY DEFAULT NEXTVAL('serial'),
    Author_name VARCHAR(40) NOT NULL,
    Author_SSN VARCHAR(11) NOT NULL,
    Author_DOB DATE
);

The following code creates a temporary table that has a field that can hold a two-dimensional array:

CREATE TEMPORARY TABLE mytemp
(
    id INT NOT NULL,
    matrix INT[][]
);

The following shows a basic example of how CREATE TABLE can be used to enforce data integrity by including
the use of the CHECK constraint. This example shows how a column constraint is used to mandate that an author
be at least 18 years old:

CREATE TABLE author
(
    id INT PRIMARY KEY,
    name VARCHAR(40) NOT NULL,
    age INT CHECK (age>17)
);

This example is similar to the preceding, except it shows how a table constraint can be used. Note how the table
constraint is based on two field conditions returning a Boolean true value:

CREATE TABLE author
(
    id INT PRIMARY KEY,
    name VARCHAR(40) NOT NULL,
    age INT,
    CONSTRAINT chk_it CHECK(age>17 AND name<>'')
);

This last example shows how tables can inherit fields from other tables. Additionally, it demonstrates how the
table-level PRIMARY KEY clause can be used to create multicolumn primary keys:
CREATE TABLE new_author
(
    new_id INT,
    new_name VARCHAR(40) NOT NULL,
    CONSTRAINT multikey PRIMARY KEY(new_id, id)
)
INHERITS(author);

CREATE TABLE AS

Usage

CREATE TABLE tablename [ (columnname [,…]) ] AS select_criteria

Description

The CREATE TABLE AS command is functionally very similar to SELECT INTO; it enables the results of a query
to be used to populate a new table.

If the columnname clause is left out, the column names of the new table are taken from the output columns of the SELECT query.

Input(s)

tablename—The name of the new table to create.

columnname—The name(s) of the columns to select.

select_criteria—The SELECT statement that will be used to generate the table data.

Output(s)

See CREATE TABLE and SELECT for output messages.

Notes

The user who executes this command will own the resulting table. Likewise, users must have permissions to
create tables and be able to select data from the tables.

See SELECT INTO for more information.

SQL-92 Compatibility

This command is a PostgreSQL extension; there is no CREATE TABLE AS specified in the SQL-92 specification.

Example

The following example shows how to create a table called tmp_authors from the existing table authors only
where the author is older than 40:

CREATE TABLE tmp_authors AS
    SELECT * FROM authors WHERE age>40;

CREATE TRIGGER

Usage
CREATE TRIGGER trigname { BEFORE | AFTER } {event [OR …] }
ON table
FOR EACH { ROW | STATEMENT }
EXECUTE PROCEDURE func(args)

Description

CREATE TRIGGER specifies that an action is to be bound to a particular table-related event. This concept is close
to the idea of RULES in many ways, but each is better suited for different uses. TRIGGERS are most commonly
used for maintaining referential integrity, either before or after a table event has occurred. RULES are better
suited to performing cascading SQL commands while an event is in progress.

The CREATE TRIGGER command specifies when to fire a trigger (that is, BEFORE or AFTER) and what event will
trigger it (that is, INSERT, UPDATE, or DELETE). Finally, the user-specified function fires when these conditions
are met.

If a trigger is set to fire before an event, it is possible for the trigger to change (or ignore) the data before it is
inserted. Likewise, if the trigger is set to fire after an event, all of the changes made—including deletions, inserts,
and updates—are visible to the trigger.

Input(s)

trigname—The name of the trigger to create.

table—The name of the table to which to bind the trigger.

event—One of INSERT, DELETE, or UPDATE.

func(args)—The function and arguments to fire when event conditions are met.

Output(s)

CREATE (Message returned if successful.)

Notes

The creator of the trigger must also have sufficient rights to the relations in question. Currently, STATEMENT
triggers are not implemented.

SQL-92 Compatibility

SQL-92 does not contain a CREATE TRIGGER statement. This is an extension by PostgreSQL.

Examples

This example uses a function named state_check() to verify that newly inserted state names are greater than
three characters in length.

First, you need to define the function:

CREATE FUNCTION state_check()
RETURNS opaque
AS 'BEGIN
    IF length(new.statename) <= 3
    THEN RAISE EXCEPTION ''State names must be greater than 3 characters'';
    END IF;
    RETURN new;
END;'
LANGUAGE 'plpgsql';

You can proceed with the definition of your trigger.

CREATE TRIGGER statecheck_trigger BEFORE INSERT ON authors
FOR EACH ROW EXECUTE PROCEDURE state_check();

Now you can test the trigger by inserting some test data:

INSERT INTO authors (statename) VALUES ('Alabama');
(1 row inserted ok)

INSERT INTO authors (statename) VALUES ('Al');
ERROR: State names must be greater than 3 characters

CREATE TYPE

Usage

CREATE TYPE typename
(
    INPUT = in_function,
    OUTPUT = out_function,
    INTERNALLENGTH = { in_length | VARIABLE }
    [, EXTERNALLENGTH = { ext_length | VARIABLE } ]
    [, DEFAULT = defaultval ]
    [, ELEMENT = element ]
    [, DELIMITER = delimiter ]
    [, SEND = send_function ]
    [, RECEIVE = rec_function ]
    [, PASSEDBYVALUE ]
)

Description

PostgreSQL includes a number of built-in data types; however, users can register their own by using the CREATE
TYPE command.

CREATE TYPE requires that two functions (in_function and out_function) exist before a new type can be
defined. The in_function is used to convert the data to an internal data type so that it can be used by the
operators and functions defined for that type. Likewise, out_function converts the data back to its external
representation.
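These conversion functions are typically written in C and registered with CREATE FUNCTION before the type is defined. The following sketch shows what that registration step might look like for the deweydec example later in this section; the shared-library path is an assumption:

```sql
-- Hypothetical registration of the conversion pair; both functions
-- take and return the placeholder type opaque.
CREATE FUNCTION dewey_in (opaque)
    RETURNS opaque
    AS '/usr/local/pgsql/lib/dewey.so'
    LANGUAGE 'C';

CREATE FUNCTION dewey_out (opaque)
    RETURNS opaque
    AS '/usr/local/pgsql/lib/dewey.so'
    LANGUAGE 'C';
```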

Newly created data types can either be fixed or variable length. Fixed-length types must be explicitly specified
during the definition of the new data type. By using the VARIABLE keyword, PostgreSQL will assume that the
data type is a TEXT type and therefore variable in length.

The ELEMENT and DELIMITER keywords are used when specifying a new data type that is an array. The
ELEMENT keyword specifies the data type of the elements in an array, and the DELIMITER keyword is used to
denote what delimiter is used to separate array elements.

When an external machine will be making use of the newly created data type, it is necessary to specify
send_function and rec_function. These functions are used to convert the data to and from a format that is
suitable for the external system. If these functions are not specified, it is assumed that the internal data-type
format is acceptable on all machine architectures.

Use the PASSEDBYVALUE keyword to specify to PostgreSQL that operators and functions that make use of the
new data type should be explicitly passed the value—instead of the reference.

Input(s)
typename—The name of the newly created data type.

in_function—The function used to convert from the external representation of a data type to the internal data
type.

out_function—The function used to convert from an internal data type to the external representation.

in_length—Either a literal value or the keyword VARIABLE used to specify the internal length of the data type.

ext_length—Either a literal value or the keyword VARIABLE used to specify the external length of the data
type.

defaultval—The default value to be displayed when data is not present.

element—If the newly created type is an array, this specifies the type of elements in that array.

delimiter—If the newly created type is an array, this indicates what delimiter appears between the array
elements. The default is a comma (,).

send_function—Specifies the function to convert data to a form for use by an external machine.

rec_function—Specifies the function to convert data from a form for use by an external machine to the format
needed by the local machine.

PASSEDBYVALUE—This variable, if present, indicates that functions or operators using the new data type should
be passed arguments by value instead of by reference.

Output(s)

CREATE (Message returned if successful.)

Notes

The data type specified must be a name that is unique, it must be fewer than 31 characters in length, and it
cannot begin with an underscore (_).

The in_function and out_function must both be defined to accept either one or two arguments of type
opaque.

You cannot use PASSEDBYVALUE to pass values whose internal representation is greater than 4 bytes.

SQL-92 Compatibility

SQL-92 does not specify CREATE TYPE; however, it is defined in the SQL3 proposal.

Example

The following example creates a data type called deweydec, which will be used to hold Dewey decimal numbers.
This example assumes that the functions dewey_in and dewey_out have previously been defined.

CREATE TYPE deweydec
(
    INTERNALLENGTH = 16,
    INPUT = dewey_in,
    OUTPUT = dewey_out
);

CREATE USER

Usage
CREATE USER username
    [ WITH [ SYSID uid ] [ PASSWORD password ] ]
    [ CREATEDB | NOCREATEDB ]
    [ CREATEUSER | NOCREATEUSER ]
    [ IN GROUP groupname [,…] ]
    [ VALID UNTIL abstime ]

Description

The CREATE USER command adds a new user to the current PostgreSQL database. The only required variable is
the name of the new user, which must be unique. By default, PostgreSQL will assign the user the next user
identification number (UID); however, it can be specified by including the WITH SYSID clause.

Additionally, a new user can be included in an existing group by specifying the IN GROUP command. Likewise,
certain user rights can be specified at creation time. Users can be given permission to create users of their own
by including the CREATEUSER clause. Likewise, users can be assigned permission to create their own databases
with the CREATEDB option.

PostgreSQL enables usernames to be set to automatically expire at a given time. By using the VALID UNTIL
clause, an absolute time can be specified that sets the expiration time.

Input(s)

username—The name of the user to be created.

uid—The user identification number of the new user.

password—The password of the new user.

groupname—The group(s) to which the new user belongs.

abstime—If present, specifies the absolute time at which the new username is set to expire. Otherwise, the
username is valid forever.

Output(s)

CREATE USER (Message returned if successful.)

Notes

The username must be unique in the current database.

The creator must have sufficient permissions to execute the CREATE USER command. Additionally, creators
become owners of the objects created with the CREATE USER command.

Both NOCREATEDB and NOCREATEUSER are the defaults.

SQL-92 Compatibility

There is no CREATE USER command in SQL-92. This command is a PostgreSQL extension.

Examples

Create a new user with a specified password in the current database. Assign the user to the group managers:

CREATE USER bryan WITH PASSWORD '08f30w0'
    IN GROUP managers;

Create a new user with a password. Give the user the capability to create new users but not to create new
databases. Set the username to expire on January 20, 2002:

CREATE USER bryan WITH PASSWORD '08f30w0'
    NOCREATEDB CREATEUSER
    VALID UNTIL 'Jan 20 2002';
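
A user identification number can also be fixed at creation time with the WITH SYSID clause mentioned in the description; a minimal sketch (the username and UID below are illustrative):

```sql
-- Assign an explicit UID instead of letting PostgreSQL pick the next one.
CREATE USER carol WITH SYSID 501 PASSWORD 'm0nk3y';
```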

CREATE VIEW

Usage

CREATE VIEW viewname AS SELECT selectquery

Description

Views are useful as a method to implement commonly used queries. Instead of writing out the full query each
time, you can define it as a view and invoke it with a much simpler syntax whenever it is needed.

Input(s)

viewname—The name of the view to create.

selectquery—The SQL query that provides the columns and row specifications for the newly created view.

Output(s)

CREATE (Message returned if successful.)

ERROR: Relation 'viewname' already exists (Message returned if the specified view name is already in
use.)

NOTICE create: attribute name 'column' has an unknown type (Message returned if an explicit
query does not define the data type of the static variable; see the "Examples" section.)

Notes

Views are sometimes referred to as virtual tables; however, thinking of them as macro substitutions is probably
more conceptually correct.

Currently, views are read-only.

SQL-92 Compatibility

SQL-92 specifies that VIEWS are to be updateable. Currently, PostgreSQL views are read-only.

Examples

Create a view of the books table, where only those books that are fiction are returned:

CREATE VIEW fictionbooks AS
    SELECT * FROM books WHERE genre='fiction';

SELECT * FROM fictionbooks WHERE author_name='Shakespeare, W.';


author_name title genre call_number
------------------------------------------------------------------------
Shakespeare, W. Complete Works Vol. 1 Fiction 842.12 Sha
Shakespeare, W. Complete Works Vol. 2 Fiction 842.13 Sha

Create an explicit query that returns a static result:

CREATE VIEW errmsg AS SELECT text 'Error: Not Found';

DECLARE

Usage

DECLARE cursorname [ BINARY ] [ INSENSITIVE ] [ SCROLL ]
    CURSOR FOR selectquery
    [ FOR { READ ONLY | UPDATE [ OF column [,…] ] } ]

Description

The DECLARE statement enables a user to create a cursor to store and navigate a query result. By default,
PostgreSQL returns data in a text format; however, data can also be returned in a binary format by including the
BINARY keyword.

Returning the data as binary information requires that the calling application be able to convert and manipulate it
(the standard psql front end cannot handle binary data). However, returning data in a binary-only format has
specific advantages: it usually requires less work from the server and usually results in a smaller data
transfer.

Input(s)

cursorname—The name of the cursor to create.

INSENSITIVE—A reserved keyword for SQL-92. This is ignored by PostgreSQL.

SCROLL—A reserved keyword for SQL-92. This is ignored by PostgreSQL.

selectquery—A SQL query that defines what row and column selections to use for the cursor to be created.

READ ONLY—A keyword that denotes that the cursor is read-only. PostgreSQL, at this time, only generates read-
only cursors. This word is ignored by PostgreSQL.

UPDATE—A keyword that denotes that the cursor should be updateable. PostgreSQL produces only read-only
cursors; this keyword is ignored.

column—For use with the UPDATE keyword. PostgreSQL ignores this word at this time.

Output(s)

SELECT (Message returned if the SELECT command was successful.)

NOTICE: BlankPortalAssignName: portal 'cursorname' already exists (Message returned if the
cursor name already exists.)

NOTICE: Named portals may only be used in begin / end transaction blocks (Message
returned if the cursor is not declared in the transaction block.)

Notes

Binary data returned by PostgreSQL is architecture specific. Therefore, there can be issues related to big-endian or
little-endian byte ordering. All text returns, however, are architecture neutral.

SQL-92 Compatibility
The INSENSITIVE, SCROLL, READ ONLY, UPDATE, and column keywords are reserved for future SQL-92
compatibility. At this time, PostgreSQL creates only read-only cursors.

SQL-92 only allows cursors to be in embedded SQL commands or in modules. PostgreSQL, however, also allows
cursors to exist in interactive methods.

SQL-92 specifies that cursors are to be opened with the OPEN command. PostgreSQL assumes that cursors are
considered open upon declaration. However, ecpg (embedded SQL preprocessor for Postgres) supports the OPEN
command to be in compliance with the SQL-92 specification.

The BINARY keyword is a PostgreSQL extension; no such keyword exists in the SQL-92 specification.

Example

This example creates a cursor for use with the authors table:

DECLARE newauthors CURSOR FOR
    SELECT * FROM authors WHERE status='New';
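
A cursor can also be declared with the BINARY keyword discussed above. A minimal sketch, assuming the same authors table; note that named cursors must live inside a transaction block, and the client must be able to decode binary data (psql cannot):

```sql
BEGIN;
-- Rows come back in binary rather than text format.
DECLARE binauthors BINARY CURSOR FOR
    SELECT * FROM authors WHERE status='New';
-- FETCH and CLOSE as usual, then COMMIT to end the transaction.
```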

DELETE

Usage

DELETE FROM table [ WHERE condition]

Description

DELETE is used to remove all or certain rows from a table. Use the WHERE condition to specify which rows are to
be deleted.

Input(s)

table—The table that contains the rows to delete.

condition—Optional SQL WHERE condition to specify the rows to delete.

Output(s)

DELETE count (Message returned if successful with the number of rows deleted.)

Notes

The user deleting the rows must have permissions to the table in question as well as to any tables present in the
WHERE condition.

Using DELETE without a WHERE condition results in all rows being deleted. Although not part of the SQL-92
specification, TRUNCATE performs this same function much more efficiently.
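
As noted, when every row must go, TRUNCATE is the faster alternative; a minimal sketch against the authors table used in the examples:

```sql
-- Removes every row from authors without the row-by-row work of DELETE.
-- TRUNCATE cannot be restricted by a WHERE condition.
TRUNCATE authors;
```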

SQL-92 Compatibility

DELETE is SQL-92 compatible. However, SQL-92 also allows DELETE to be issued through a cursor; cursors in
PostgreSQL are read-only, so this form is not supported.

Examples

Delete all the rows from the table authors:

DELETE FROM authors;

Delete all the rows from the table where the author's salary is less than $10,000:

DELETE FROM authors WHERE salary<10000;

DROP AGGREGATE

Usage

DROP AGGREGATE aggname type

Description

The DROP AGGREGATE command deletes all references to the aggregate named from the current database.

Input(s)

aggname—The name of the aggregate to delete.

type—The data type of the aggregate.

Output(s)

DROP (Message returned if the command was successful.)

NOTICE: RemoveAggregate: aggregate 'agg' for 'type' does not exist (Message returned if the
aggregate does not exist in the current database.)

Notes

Only owners of the aggregate or superusers can execute this command.

SQL-92 Compatibility

There is no CREATE or DROP AGGREGATE in the SQL-92 specification. This is a PostgreSQL extension.

Example

Drop the aggregate complex_sum:

DROP AGGREGATE complex_sum complex;

DROP DATABASE

Usage

DROP DATABASE databasename

Description

The DROP DATABASE command deletes the database and all related data named.
Input(s)

databasename—The name of the database to remove.

Output(s)

DROP DATABASE (Message returned if successful.)

ERROR: user 'username' is not allowed to create/drop databases (Message returned if the user
does not have sufficient rights to drop a database.)

ERROR: dropdb: cannot be executed on the template database (Message returned if the user
attempts to drop the template database.)

ERROR: dropdb: cannot be executed on an open database (Message returned if the command is
attempted on a database that is currently open.)

ERROR: dropdb: database 'name' does not exist (Message returned if the specified database name
cannot be found.)

Notes

You cannot issue a DROP DATABASE command on the current database. Usually, the command is performed
while connected to another database or from the command line with the dropdb command.
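
For instance, the command-line route mentioned above might look like this from the shell (the database name is illustrative):

```shell
# Wrapper around DROP DATABASE; run while not connected to publisher.
dropdb publisher
```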

Due to the need to physically delete files, the DROP DATABASE command cannot take place inside of a
transaction. Usually a DROP command only modifies the system catalogs; therefore, they can be rolled back.
Because a ROLLBACK cannot recover deleted file system objects, this command must be issued as an atomic
entity and not be embedded in an explicit BEGIN…COMMIT clause.

The user must own the database or have superuser permissions to execute the DROP DATABASE command.

SQL-92 Compatibility

The SQL-92 specification does not define a method for the DROP DATABASE command.

This is a PostgreSQL extension.

Example

This example removes the database called publisher.

DROP DATABASE publisher

DROP FUNCTION

Usage

DROP FUNCTION funcname ( [ type [,…] ] )

Description

Removes the specified function from the current database. PostgreSQL allows functions to be overloaded;
the optional type argument therefore allows PostgreSQL to distinguish between functions that share the same name.

Input(s)
funcname—The name of the function to delete.

type—The data type required by the function, if applicable.

Output(s)

DROP (Message returned if the command is successful.)

NOTICE: RemoveFunction: Function "name" ("types") does not exist: (Message returned if the
function name or data type is not valid.)

Notes

The user must own the function to be dropped or have superuser rights to the database.

SQL-92 Compatibility

DROP FUNCTION is a PostgreSQL language extension; the SQL-92 specification does not define it.

Example

This example drops the function called last_check from the current database:

DROP FUNCTION last_check();
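
Because PostgreSQL allows overloading, the argument type can be supplied to select which version to drop. A hedged sketch with a hypothetical overloaded function name:

```sql
-- Drops only the int4 version of the hypothetical calc_bonus function,
-- leaving any other versions (for example, calc_bonus(float8)) intact.
DROP FUNCTION calc_bonus(int4);
```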

DROP GROUP

Usage

DROP GROUP name

Description

Removes the group specified from the current database.

Input(s)

name—The name of the group to remove.

Output(s)

DROP GROUP (Message returned if successful.)

Notes

The DROP GROUP command does not remove the users that make up the group from the database.

SQL-92 Compatibility

DROP GROUP is a PostgreSQL language extension.

Example
The following example deletes the group managers from the current database:

DROP GROUP managers;

DROP INDEX

Usage

DROP INDEX name

Description

The DROP INDEX command removes an index from the current database.

Input(s)

name—The name of the index to remove.

Output(s)

DROP (Message returned if successful.)

ERROR: index 'index_name' nonexistent (Message returned if the index name does not exist.)

Notes

To execute this command, the user must own or have superuser rights to the index.

SQL-92 Compatibility

SQL-92 leaves the concept of indexes up to the specific implementation. Therefore, DROP INDEX is a PostgreSQL
implementation.

Example

This example removes the index named checknumber from the current database:

DROP INDEX checknumber;

DROP LANGUAGE

Usage

DROP LANGUAGE langname

Description

The DROP LANGUAGE command is used to remove a user-defined language from the current database.

Input(s)

langname—The name of the language to remove.


Output(s)

DROP (Message returned if the command was successful.)

ERROR: Language 'name' doesn't exist (Message returned if the language name specified cannot be
found.)

Notes

Warning: PostgreSQL does not do any checks to see if functions depend on the language to be dropped.
Consequently, it is possible to remove a language that is still needed by the system.

To execute the DROP LANGUAGE command, the user needs to own the object or have superuser access to the
database.

SQL-92 Compatibility

There is no DROP LANGUAGE in SQL-92; this is a PostgreSQL extension.

Example

This example removes the language mylang from the system:

DROP LANGUAGE mylang;

DROP OPERATOR

Usage

DROP OPERATOR id ( { type | NONE } [,…] )

Description

This command is used to remove an existing operator from the current database. The type keyword specifies the
data type of each operand; use the NONE keyword in place of a type for the missing side of a unary operator.

Input(s)

id—The identifier of the operator to remove.

type—The data type of the left or right operand.

Output(s)

DROP (Message returned if the command was successful.)

ERROR: RemoveOperator: binary operator 'oper' taking type 'type' and 'type2' does not
exist (Message returned if the specified operator does not exist in the current database.)

ERROR: RemoveOperator: left unary operator 'oper' taking type 'type' does not exist
(Message returned if the left unary operator specified does not exist.)

ERROR: RemoveOperator: right unary operator 'oper' taking type 'type' does not exist
(Message returned if the right unary operator specified does not exist.)

Notes
The DROP OPERATOR command does not check for dependencies that rely on the operator to be dropped.
Therefore, it is the user's responsibility to ensure that all dependencies will continue to be satisfied after the
operation is completed.

SQL-92 Compatibility

DROP OPERATOR is a PostgreSQL extension. There is no such command in SQL-92.

Examples

This example drops the operator = for int4:

DROP OPERATOR = (int4, int4);

To remove only the left unary operator =:

DROP OPERATOR = (none, int4);

DROP RULE

Usage

DROP RULE name

Description

The DROP RULE command removes the specified rule from the current database. Once the rule is removed,
PostgreSQL immediately stops applying its actions when the triggering event occurs.

Input(s)

name—The name of the rule to drop.

Output(s)

DROP (Message returned if the command was successful.)

ERROR: RewriteGetRuleEventRel: rule 'name' not found (Message returned if PostgreSQL cannot
find the rule name specified.)

Notes

The user of this command must either own the rule or have superuser access to the current database in order to
execute the DROP RULE command.

SQL-92 Compatibility

The DROP RULE command is a PostgreSQL extension; there is no specification for this command in SQL-92.

Example

This example drops the rule called del_author from the database.

DROP RULE del_author;


DROP SEQUENCE

Usage

DROP SEQUENCE name [,…]

Description

The DROP SEQUENCE command removes the named sequence from the current database. PostgreSQL actually
uses a table to hold the current value of the sequence, so in effect, DROP SEQUENCE works like a specific DROP
TABLE command.

Input(s)

name—The name of the sequence to remove from the current database.

Output(s)

DROP (Message returned if the command was successful.)

NOTICE: Relation 'name' does not exist. (Message returned if PostgreSQL could not find the sequence
name specified.)

Notes

PostgreSQL does not do any dependency checking on dropped sequences. Therefore, it is the user's responsibility
to ensure that nothing depends on the sequence before issuing a DROP SEQUENCE command.

The user of this command must either own the sequence named or have superuser rights to the database.

SQL-92 Compatibility

DROP SEQUENCE is a PostgreSQL extension; there is no equivalent command in the SQL-92 specification.

Example

This example removes the sequence named check_numb_seq from the database:

DROP SEQUENCE check_numb_seq;

DROP TABLE

Usage

DROP TABLE name [,…]

Description

The DROP TABLE command removes the table named, related indexes, and any associated views from the
current database.

Input(s)
name—The name of the table to remove.

Output(s)

DROP (Message returned if the command was successful.)

ERROR: Relation 'name' Does Not Exist! (Message returned if the table name cannot be located in the
current database.)

Notes

PostgreSQL does not check or warn for FOREIGN KEY relationships that could be affected by executing the DROP
TABLE command; therefore, it is the user's responsibility to ensure that other relations will not be affected by
the command.

Due to the need to physically delete files, the DROP TABLE command cannot take place inside of a transaction.
Usually a DROP command modifies only the system catalogs; therefore, they can be rolled back. Because a
ROLLBACK cannot recover deleted file system objects, this command must be issued as an atomic entity and not
be embedded in an explicit BEGIN…COMMIT clause.

The user of this command must own the table and associated objects or have superuser rights to the current
database.

SQL-92 Compatibility

DROP TABLE is mostly SQL-92 compliant. However, the SQL-92 specification also includes the keywords
RESTRICT and CASCADE in the command. These keywords are used to limit or cascade the removal of a table to
other referenced objects. At this time, PostgreSQL does not support these keywords.

Examples

To drop the table authors:

DROP TABLE authors;

To drop the tables authors and payroll:

DROP TABLE authors, payroll;

DROP TRIGGER

Usage

DROP TRIGGER trigname ON tablename

Description

The DROP TRIGGER command will remove the trigger specified from the current database.

Input(s)

trigname—The name of the trigger to remove from the database.

tablename—The name of the table that holds the named trigger.

Output(s)
DROP (Message returned if the command was executed successfully.)

ERROR: Drop Trigger: there is no trigger 'name' on relation 'table': (Message returned if
PostgreSQL cannot locate the trigger name specified.)

Notes

The user of this command must either own the object or have superuser access to the current database.

SQL-92 Compatibility

There is no DROP TRIGGER definition in the SQL-92 specification. This is a PostgreSQL language extension.

Example

This example removes the trigger state_checktrigger from the payroll table:

DROP TRIGGER state_checktrigger ON payroll;

DROP TYPE

Usage

DROP TYPE name

Description

The DROP TYPE command is used to remove the type specified from the current database.

Input(s)

name—The name of the data type to remove.

Output(s)

DROP (Message returned if successful.)

ERROR: RemoveType: type 'name' does not exist (Message returned if PostgreSQL cannot locate the
name specified.)

Notes

The user must own the objects or have superuser access to the type of object that is to be removed.

PostgreSQL does not do any dependency checking on the removal of TYPE objects. Therefore, it is the user's
responsibility to ensure that any operators, functions, aggregates, or other objects that depend on the data type
will not be left in an inconsistent state as a result of the removal of that data type.

SQL-92 Compatibility

SQL-92 does not specify a DROP TYPE command; however, it is part of the SQL3 specification.

Example
To remove the data type int4 from the database:

Warning

This action could be very dangerous; the int4 object is an important part of the PostgreSQL system.
This example is provided for sample purposes only—DO NOT EXECUTE IT! Removing the int4 object
will result in serious corruption of your database.

DROP TYPE int4;

DROP USER

Usage

DROP USER username

Description

The DROP USER command is used to remove a user from the current database.

Input(s)

username—The name of the user to remove.

Output(s)

DROP USER (Message returned if the command was successful.)

ERROR: DROP USER: user 'name' does not exist (Message returned if the username specified cannot
be found.)

DROP USER: user 'name' owns database 'name' (Message returned if the user who attempted to drop a
database owns any database.)

Notes

PostgreSQL will not allow a user who owns a database to be dropped. However, PostgreSQL does not do a
dependency check for objects owned by the user. Therefore, it is the user's responsibility to ensure that other
database objects will not be left in an inconsistent state after the DROP USER command is completed.

SQL-92 Compatibility

The DROP USER command is a PostgreSQL language extension. There is no SQL-92 command for DROP USER.

Example

Remove the user bill from the current database:

DROP USER bill;

DROP VIEW
Usage

DROP VIEW name

Description

The DROP VIEW command removes the view specified from the current database.

Input(s)

name—The name of the view to remove.

Output(s)

DROP (Message returned if the command was successful.)

ERROR: RewriteGetRuleEventRel: rule '_RETname' not found (Message returned if the view named
does not exist in the current database.)

Notes

The DROP VIEW command removes the named view from the current database.

SQL-92 Compatibility

The SQL-92 specification defines some additional features for the DROP VIEW command: RESTRICT and
CASCADE. These keywords determine whether items that reference the view in question are also dropped. By
default, PostgreSQL only deletes the view explicitly named.

It is the user's responsibility to ensure that other database objects will not be left in an inconsistent state after
the DROP VIEW command is completed.

Example

The following command removes the view fictionbooks from the current database:

DROP VIEW fictionbooks;

END

Usage

END [ WORK | TRANSACTION ]

Description

The END keyword is used to complete an explicit PostgreSQL transaction.

The explicit BEGIN…END clauses are used to encapsulate a series of SQL commands to ensure proper execution.
If any of the commands in the series fail, it can cause the entire transaction to roll back, bringing the database
back to its original state.

By default, all commands issued in PostgreSQL are performed in an implicit transaction. The END keyword is
equivalent to COMMIT.
Input(s)

None. WORK and TRANSACTION are optional and have no effect.

Output(s)

COMMIT (Message returned if successful.)

NOTICE: COMMIT: no transaction in progress (Message returned if there is no current transaction.)

Notes

Generally, it is best to use the COMMIT PostgreSQL keyword, thereby maintaining SQL-92 compatibility.

See ABORT, BEGIN, and ROLLBACK for more information regarding transactions.

SQL-92 Compatibility

The END keyword is a PostgreSQL extension. It is equivalent to the SQL-92 keyword COMMIT.

Example

This example shows how END can be used to terminate a PostgreSQL transaction:

SELECT * FROM authors;

Name     LastCheck    Status
-----------------------------------
Frank    $800.00      Active
Bill     $500.00      Inactive

BEGIN;

INSERT INTO authors (name, lastcheck, status)
    VALUES ('Sam', 700.00, 'Active');

END;

SELECT * FROM authors;

Name     LastCheck    Status
-----------------------------------
Frank    $800.00      Active
Bill     $500.00      Inactive
Sam      $700.00      Active
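
For contrast, the same transaction abandoned with ROLLBACK instead of END leaves the table untouched; a sketch assuming the same authors table:

```sql
BEGIN;
INSERT INTO authors (name, lastcheck, status)
    VALUES ('Pat', 600.00, 'Active');
ROLLBACK;   -- the INSERT is undone; authors is unchanged
```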

EXPLAIN

Usage

EXPLAIN [ VERBOSE ] query

Description
The EXPLAIN command is used to profile and trace how queries are being executed. It gives insight into how the
PostgreSQL planner generates an execution plan for the supplied query. It also displays what indexes will be used
and what join algorithms it will employ.

The output of the EXPLAIN command shows the estimated startup cost incurred before the first tuple can be
returned, the estimated total cost for all tuples, and what type of scan is being used (that is, sequential, index, and so on).

The VERBOSE argument will cause EXPLAIN to dump the full internal representation of the plan tree instead of
just the summary. This option is typically used for performance tuning and advanced debugging scenarios.

Input(s)

VERBOSE—Optional keyword that will produce the full execution plan and all internal states. Useful mostly for
debugging.

query—The query that EXPLAIN should profile.

Output(s)

NOTICE: QUERY PLAN: plan (Message returned along with the execution plan.)

EXPLAIN (Message returned after the successful execution plan.)

Notes

See Chapter 10, " Common Administrative Tasks," and its section titled "Performance Tuning" for more
information on query optimization.

SQL-92 Compatibility

SQL-92 has no EXPLAIN command. This is a PostgreSQL extension.

Examples

This example supposes that the authors table has a single field of an int4 data type and 1,000 rows of data.
Additionally, this example assumes that the authors table has no index set:

EXPLAIN SELECT * FROM authors;

NOTICE: QUERY PLAN:
Seq Scan on authors (cost=0.00..4.68 rows=1000 width=4)
EXPLAIN

The following example includes the addition of a WHERE constraint and an index on the single field in the
authors table. Notice the improvement in total cost and the fact that only one row is returned:

EXPLAIN SELECT * FROM authors WHERE i=100;

NOTICE: QUERY PLAN:
Index Scan using fi on authors (cost=0.00..0.38 rows=1 width=4)
EXPLAIN

This final example adds a sum() aggregate to the preceding example. Notice how the startup cost for the
aggregate is 0.38, which is also the total cost of the index scan. This, of course, is because an
aggregate cannot produce a result until data is provided to it.

EXPLAIN SELECT sum(i) FROM authors WHERE i=100;

NOTICE: QUERY PLAN:
Aggregate (cost=0.38..0.38 rows=1 width=4)
  Index Scan using fi on authors (cost=0.00..0.38 rows=1 width=4)
EXPLAIN
FETCH

Usage

FETCH [ FORWARD | BACKWARD | RELATIVE ]
    [ number | ALL | NEXT | PRIOR ]
    { IN | FROM } cursor

Description

The FETCH command retrieves rows from a defined cursor. The cursor should have previously been defined in a
DECLARE statement.

The number of rows to retrieve can either be specified by a signed integer or be one of the following: ALL, NEXT,
or PRIOR.

In addition to the number of rows to retrieve, the direction of the next retrieval can also be specified. By default,
PostgreSQL searches in a FORWARD direction. However, by using a signed integer, the resulting direction of the
search can be changed from what is specified by the keywords alone. For instance, FORWARD -1 is functionally
the same as BACKWARD 1.

Input(s)

FORWARD—Retrieve rows forward from the current relative position.

BACKWARD—Retrieve rows backward from the current relative position.

RELATIVE—Included for SQL-92 compatibility. No functional use.

number—A signed integer to indicate the number of rows to retrieve in the specified direction.

ALL—Retrieve all the rows remaining in the specified direction.

NEXT—Retrieve the next single row in the specified direction (equivalent to a count of 1).

PRIOR—Retrieve the previous single row in the specified direction (equivalent to a count of -1).

IN or FROM—Use either word.

cursor—The name of the predefined cursor.

Output(s)

If successful, the FETCH command will return the rows requested.

NOTICE: PerformPortalFetch: portal 'cursor' not found (Message returned if the cursor specified
has not been declared.)

NOTICE: FETCH/ABSOLUTE not supported, using RELATIVE (Message returned because PostgreSQL
does not support absolute positioning in cursors.)

ERROR: FETCH/RELATIVE at current position is not supported (Message returned if the user tried
to execute a FETCH RELATIVE 0 command. This command, although valid in SQL-92, is not supported in
PostgreSQL.)

Notes

By using a signed integer with a directional statement, search directions can be reversed. For instance, the
following commands are all functionally identical:
FETCH FORWARD 1 IN mycursor
FETCH FORWARD NEXT IN mycursor
FETCH BACKWARD PRIOR IN mycursor
FETCH BACKWARD -1 IN mycursor

PostgreSQL currently supports read-only cursors but not updateable cursors. Therefore, updates must be entered
explicitly and cannot take place in a cursor.

Use the MOVE command to navigate through a cursor without retrieving rows of data.

SQL-92 Compatibility

PostgreSQL allows cursors to exist outside of embedded use, which is an extension from the original SQL-92
specification.

Additionally, SQL-92 declared some additional features for the FETCH command. Absolute cursor positioning
through the ABSOLUTE command and storing results in variables through the INTO command were both defined
in SQL-92 but do not exist in PostgreSQL.

Example

This example shows a cursor created from the authors table and then FETCH being used to retrieve specific
rows:

BEGIN;
DECLARE mycursor CURSOR FOR SELECT * FROM authors;
FETCH FORWARD 3 IN mycursor;

Name    SSN            HireDate
----------------------------------
Bill    666-66-6666    01/01/1980
Sam     123-45-6789    05/21/1994
Amy     999-99-9999    06/05/2001

FETCH BACKWARD 1 IN mycursor;

Name    SSN            HireDate
----------------------------------
Sam     123-45-6789    05/21/1994

FETCH FORWARD NEXT IN mycursor;

Name    SSN            HireDate
----------------------------------
Amy     999-99-9999    06/05/2001

CLOSE mycursor;
COMMIT;

GRANT

Usage

GRANT privilege [,…] ON object [,…]
    TO { PUBLIC | GROUP groupname | username}

Description
The GRANT command is used to assign specific privileges to groups, users, or the public at large. By default,
creators of an object get all privileges assigned to them for that object. Users other than the creator need to be
given explicit rights, or belong to a group that inherits such rights, to access an object.

The GRANT command allows the following privileges to be assigned:

SELECT—The capability to access columns in a table.

INSERT—The capability to insert rows into a table.

UPDATE—The capability to modify data in a table.

DELETE—The capability to remove rows from a table.

RULE—The capability to define rules on a table.

ALL—All of the preceding.

These privileges can be assigned on the following objects:

Tables

Views

Sequences

Input(s)

privilege—One of the following: SELECT, INSERT, UPDATE, DELETE, RULE, or ALL.

object—One of the following object classes: table, view, or sequence.

PUBLIC—Optional keyword indicates that the privilege applies to everyone.

groupname—The name of the group to which to apply privileges.

username—The specific user to which to apply privileges.

Output(s)

CHANGE (Message returned if the command was successful.)

ERROR: ChangeAcl: class 'object' not found: (Message returned if the object specified cannot be
located to assign permission to.)

Notes

To grant access to only a specific column, the following procedure must be carried out:

1. Do not grant access to the table for the user.

2. Create a VIEW of the table with specific fields present.

3. GRANT the user access to the VIEW.
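
The procedure above can be sketched as follows (the table, view, column, and user names are hypothetical):

```sql
-- 1. Make sure the user has no direct access to the base table.
REVOKE ALL ON payroll FROM bill;

-- 2. Create a view that exposes only the permitted columns.
CREATE VIEW payroll_names AS
    SELECT name, hiredate FROM payroll;

-- 3. Grant the user access to the view instead of the table.
GRANT SELECT ON payroll_names TO bill;
```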

See the REVOKE command for information on how to remove permissions assigned with GRANT.

SQL-92 Compatibility
SQL-92 defines some additional settings for the GRANT command. Specifically, it allows privileges to be set down
to the column level. Additionally, the SQL-92 specification includes the following:

Privileges: REFERENCES and USAGE

Objects: CHARACTER SET, COLLATION, TRANSLATION, and DOMAIN

The WITH GRANT OPTION clause

Examples

The following example gives the user bill certain rights to the authors table:

GRANT SELECT, UPDATE ON authors TO bill;

To grant all privileges of the managers group to the authors table:

GRANT ALL ON authors TO GROUP managers;
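
To apply a privilege to everyone at once, the PUBLIC keyword described above can be used; for example:

```sql
-- Every user may read the authors table; no other rights are granted.
GRANT SELECT ON authors TO PUBLIC;
```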

INSERT

Usage

INSERT INTO tablename [ (column [,…]) ]
    { VALUES (data [,…] ) | SELECT query}

Description

The INSERT command is used to append new rows to a table. Additionally, by using a SELECT query, numerous
rows can be appended simultaneously.

Particular columns can be specified during an insert; for any column not listed, PostgreSQL will attempt to insert
that column's default value.

If an attempt is made to insert the wrong data type into a column, PostgreSQL will automatically try to convert
the data to the correct data type.

Input(s)

tablename—The name of the table into which to insert rows.

column—A listing of columns that matches the data.

data—The actual data to be inserted into the table.


query—A SQL query that generates data to be used for insertion.

Output(s)

INSERT oid 1 (Message returned if one row was inserted along with the OID of that object.)

INSERT 0 number (Message returned if multiple rows were inserted; includes the number of rows inserted.)

Notes

The user executing this command must have insert privileges to the table specified.

SQL-92 Compatibility

The INSERT command is fully compatible with the SQL-92 specification.

Examples

This example shows a basic use for the INSERT command. Data is inserted into the three-column table
authors:

INSERT INTO authors (Name, SSN, LastCheck)


VALUES ('Sam', '333-33-3333', 450.00);

This example shows how the SELECT command is used in conjunction with the INSERT command. Notice how
the columns returned from the SELECT statement should match the columns specified in the INSERT command.

INSERT INTO authors (Name, SSN, LastCheck)


SELECT Name, SSN, LastCheck from tempTable;

LISTEN

Usage

LISTEN name

Description

The LISTEN command is used in conjunction with NOTIFY. LISTEN registers a name to the PostgreSQL back
end and listens for a notification from a NOTIFY command.

Multiple clients can all listen on the same LISTEN name. When a notification comes for that name, all clients will
be notified.

Input(s)

name—The name to register with PostgreSQL.

Output(s)

LISTEN (Message returned if successful.)

NOTICE: Async_Listen: We are already listening on 'name' (Message returned if the back end
already has that LISTEN name registered.)
Notes

The name can be any string of up to 31 characters; if it is not a valid SQL identifier, it must be enclosed in double quotes.

SQL-92 Compatibility

LISTEN is a PostgreSQL extension; there is no such command in the SQL-92 specification.

Example

This example registers a name with the LISTEN command and then sends a notification:

LISTEN IAmWaiting;

NOTIFY IAmWaiting;

Asynchronous NOTIFY 'IAmWaiting' from backend with pid '2342' received.

LOAD

Usage

LOAD filename

Description

The LOAD command is used to load an object file (a .o from a C-compiled file) for use by PostgreSQL. After the
file has been loaded, all functions contained therein will be available for use.

Alternatively, if no LOAD command is explicitly given, PostgreSQL will automatically load the necessary object file
once the function is called.

If the code in an object file has changed, the LOAD command can be issued to refresh PostgreSQL and make
those changes visible.

Input(s)

filename—The path and filename of the object file to load.

Output(s)

LOAD (Message returned if the command was successful.)

ERROR: LOAD: could not open file 'name' (Message returned if the filename specified could not be
found.)

Notes

The object file must be reachable from the PostgreSQL back end; therefore, the user needs to take into account
pathnames and permissions before specifying the file.

Care should be taken in designing object files to prevent errors. Functions in a user-defined object file should not
call other user-defined object files. Ideally, all function calls should exist in the same object file or be linked to
one of the standard C, math, or PostgreSQL library files.

SQL-92 Compatibility
The SQL-92 specification does not define a LOAD command; this is a PostgreSQL extension.

Example

To load a user-defined object file for use:

LOAD '/home/bill/myfile.o';

LOCK

Usage

LOCK [ TABLE ] tablename

Or

LOCK [ TABLE ] tablename IN


[ ROW | ACCESS ]
{ SHARE | EXCLUSIVE } MODE

Or

LOCK [ TABLE ] tablename IN SHARE ROW EXCLUSIVE MODE

Description

The LOCK TABLE command is used to control simultaneous access to the specified table. PostgreSQL, by default,
automatically handles many table-locking scenarios. However, there are cases when the capability to specify is
helpful.

PostgreSQL provides the following lock types:

EXCLUSIVE—Prevents any other lock type from being granted on the table for the duration of the
transaction.

SHARE—Allows others to share the lock as well, but prevents exclusive locks for the duration of the
transaction.

The preceding lock types work on the following levels of granularity:

ACCESS—The entire table schema.

ROW—Locks only individual rows.

The following table lists the common lock modes, their typical uses, and what conflicts they produce with other
lock modes:

Lock Mode                  Database Operation         Conflicts With
--------------------------------------------------------------------------
ACCESS SHARE MODE          SELECT (any table query)   (This is the least
                                                      restrictive lock.)

ACCESS EXCLUSIVE MODE      LOCK TABLE,                (This is the most
                           ALTER TABLE,               restrictive lock.)
                           DROP TABLE,
                           VACUUM

SHARE MODE                 CREATE INDEX               ROW EXCLUSIVE,
                                                      SHARE ROW EXCLUSIVE,
                                                      EXCLUSIVE,
                                                      ACCESS EXCLUSIVE

SHARE ROW EXCLUSIVE MODE                              ROW EXCLUSIVE,
                                                      SHARE,
                                                      SHARE ROW EXCLUSIVE,
                                                      EXCLUSIVE,
                                                      ACCESS EXCLUSIVE

EXCLUSIVE MODE                                        ROW SHARE,
                                                      ROW EXCLUSIVE,
                                                      SHARE,
                                                      SHARE ROW EXCLUSIVE,
                                                      EXCLUSIVE,
                                                      ACCESS EXCLUSIVE

ROW SHARE MODE             SELECT…FOR UPDATE          EXCLUSIVE,
                                                      ACCESS EXCLUSIVE

ROW EXCLUSIVE MODE         INSERT, UPDATE, DELETE     SHARE,
                                                      SHARE ROW EXCLUSIVE,
                                                      EXCLUSIVE,
                                                      ACCESS EXCLUSIVE

Input(s)

tablename—The name of the table on which to perform the lock.

SHARE ROW EXCLUSIVE MODE—Like an EXCLUSIVE lock, but it allows SHARE ROW locks by others.

Output(s)

LOCK TABLE (Message returned if the command was successful.)

ERROR 'tablename': Table does not exist (Message returned if the LOCK command could not locate
the table specified.)
Notes

To prevent deadlocks (pauses that occur when two transactions each wait for the other to complete), it is
important for transactions to acquire locks on objects in the same order. For instance, if a transaction updates
Row 1 and then Row 2, then a separate transaction should also update Row 1 and then Row 2 in that order and
not vice versa.
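The same principle applies to table-level locks. A sketch, assuming the authors and payroll tables used elsewhere in this chapter: every transaction that needs both tables should lock them in the same order.

```sql
-- Both Transaction A and Transaction B follow the same lock order:
BEGIN;
LOCK TABLE authors IN SHARE ROW EXCLUSIVE MODE;
LOCK TABLE payroll IN SHARE ROW EXCLUSIVE MODE;
-- ... update authors, then payroll ...
COMMIT;
```

If one transaction instead locked payroll first and authors second, each could end up waiting on a lock held by the other, producing a deadlock.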

Additionally, if multiple locks are involved from a single transaction, the most restrictive lock should be used.

PostgreSQL will detect deadlocks and roll back at least one of the waiting transactions to resolve it.

Most LOCK modes (except ACCESS SHARE/EXCLUSIVE) are compatible with Oracle's LOCK modes.

SQL-92 Compatibility

The SQL-92 specification uses the SET TRANSACTION clause to specify concurrent table access, which is
supported by PostgreSQL. (See the SET command.)

The LOCK TABLE command is a PostgreSQL extension.

Example

This example locks the entire table authors to prevent any other access while the updates occur:

BEGIN;
LOCK TABLE authors;
UPDATE authors SET status='active';
COMMIT;

MOVE

Usage

MOVE [direction] [count] { IN | FROM } cursorname

Description

The MOVE command enables a user to navigate through a cursor without actually retrieving any of the data. It
works somewhat like the FETCH command, except it only positions the cursor.

Input(s)

direction—Specifies the direction to move: FORWARD or BACKWARD.

count—Either a signed integer or the keyword NEXT or PRIOR; these specify how many rows to move from the
current position.

IN or FROM—Use either option; they are both functionally the same.

cursorname—The name of the cursor to move through; it should already have been defined with a DECLARE
statement.

Output(s)

MOVE (Returned if the command was successful.)

Notes
By using a signed integer with a directional statement, movement directions can be reversed. For instance, the
following commands are all functionally identical:

MOVE FORWARD 1 IN mycursor


MOVE FORWARD NEXT IN mycursor
MOVE BACKWARD PRIOR IN mycursor
MOVE BACKWARD -1 IN mycursor

MOVE works in a very similar fashion to FETCH. Refer to FETCH for more information.

SQL-92 Compatibility

SQL-92 does not specify a MOVE command. However, it is possible to FETCH rows starting from a defined
position. The effect is an implied MOVE to that defined position.

Example

The following example defines a cursor mycursor and then navigates through it, using the MOVE command to
retrieve specific rows:

BEGIN;
DECLARE mycursor CURSOR FOR SELECT * FROM authors;
MOVE FORWARD 3 IN mycursor;
FETCH NEXT IN mycursor;

Name SSN HireDate


------------------------------------
Sam S. 123-45-6789 12/01/1998

COMMIT;

NOTIFY

Usage

NOTIFY name

Description

The NOTIFY command is used in conjunction with LISTEN to send a notification message to clients who have
registered a name to listen on. It provides a way to implement a basic messaging system between client and
server processes. A typical use might be to inform client applications that a specific table has changed, prompting
the client applications to redisplay their data.

The information passed to the client application includes the notification name and the PID of the back-end
process.

Input(s)

name—The name previously registered with LISTEN to which to send a notification.

Output(s)

NOTIFY (Message returned if the command was executed successfully.)

Notes
NOTIFY events are actually executed inside a PostgreSQL transaction; therefore, this has some important
implications.

First, notifications will not be sent until the entire transaction is committed. Particularly, this is relevant if the
NOTIFICATION is part of a RULE or TRIGGER associated with a table. The notification will not be sent until the
entire transaction involving that table has completed.

Second, if a listening front-end receives a notification while it is in a transaction, the NOTIFY event will be
delayed until its transaction is completed.

It is not a good practice to have a front-end application depend on the number of notifications it receives. It is
possible, if many notifications are sent in quick succession, that the client would only receive one notification.
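The deferred-delivery behavior can be seen directly; the notification name here is illustrative:

```sql
LISTEN tablechanged;

BEGIN;
NOTIFY tablechanged;   -- nothing is delivered to listeners yet
-- ... other work in the same transaction ...
COMMIT;                -- listeners receive the notification only now
```

Had the transaction ended with ROLLBACK instead of COMMIT, the notification would never have been sent.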

SQL-92 Compatibility

The NOTIFY command is a PostgreSQL extension; there is no such command in the SQL-92 specification.

Example

This example registers a name with the LISTEN command and then sends a notification:

LISTEN IAmWaiting;

NOTIFY IAmWaiting;

Asynchronous NOTIFY 'IAmWaiting' from backend with pid '2342' received.

REINDEX

Usage

REINDEX { TABLE | DATABASE | INDEX } name [ FORCE ]

Description

The REINDEX command is used to recover from corruptions of system indexes. To run this command, the
postmaster process must be shut down, and PostgreSQL should be launched with the -O and -P options. (This
is to prevent PostgreSQL from reading system indexes upon startup.)

Input(s)

TABLE—Re-create all table indexes on the specified table.

DATABASE—Re-create all system indexes on the specified database.

INDEX—Re-create the specified index.

name—The name of the specific table, database, or index to re-create.

FORCE—Forces PostgreSQL to overwrite the current index, even if PostgreSQL determines that it is still valid.

Output(s)

REINDEX (Message returned if the command was successful.)

SQL-92 Compatibility
This is a PostgreSQL language extension. No such command is defined in the SQL-92 specification.

Example

This example forces a REINDEX on the database acme:

REINDEX DATABASE acme FORCE;

RESET

Usage

RESET variable

Description

RESET changes a run-time variable back to its default setting. It is functionally equivalent to a SET variable
TO DEFAULT command.

Input(s)

variable—The name of the variable to reset to its default value.

Output(s)

SET (Message returned if successful.)

Notes

See the SET command for further discussion and for a list of run-time variables.

SQL-92 Compatibility

RESET is a PostgreSQL language extension. There is no RESET command in the SQL-92 specification.

Example

This example restores the variable DateStyle back to its default setting:

RESET DateStyle;

REVOKE

Usage

REVOKE privilege [,…]


ON object [,…]
FROM { PUBLIC | GROUP groupname | username}

Description
The REVOKE command enables the owner of an object (or a superuser) to remove permissions granted to a user,
a group, or the public on a specific object.

The REVOKE command allows the following privileges to be removed:

SELECT—The capability to access columns in a table.

INSERT—The capability to insert rows into a table.

UPDATE—The capability to modify data in a table.

DELETE—The capability to remove rows from a table.

RULE—The capability to define rules on a table.

ALL—All of the preceding.

These privileges can be revoked from the following objects:

Tables

Views

Sequences

Input(s)

privilege—One of the following: SELECT, INSERT, UPDATE, DELETE, RULE, or ALL.

object—One of the following object classes: table, view, or sequence.

PUBLIC—Optional keyword indicates that the privilege applies to everyone.

groupname—The name of the group from which to remove privileges.

username—The specific user from which to remove privileges.

Output(s)

CHANGE (Message returned if the command was successful.)

ERROR (Message returned if an object was not found or if the permissions specified could not be revoked.)

Notes

Refer to the GRANT command for more information on assigning privileges to a user or group.

SQL-92 Compatibility

The SQL-92 specification for the REVOKE command has some additional functionality. It allows privileges to be
removed at the column level, as well as removing additional privileges not mentioned here. Specifically, these are
the following:

Usage

Grant Option
References

Examples

This example shows how to remove the user bill's privileges for changing data in the table authors:

REVOKE UPDATE, INSERT, DELETE ON authors FROM bill;

To remove all users from being able to see or modify the table payroll:

REVOKE ALL ON payroll FROM PUBLIC;

ROLLBACK

Usage

ROLLBACK [ WORK | TRANSACTION ]

Description

The ROLLBACK command is used to stop and reverse a PostgreSQL transaction that is currently in progress.
When PostgreSQL receives a ROLLBACK command, any changes made to tables are automatically reverted to
their original state.

By default, all commands issued in PostgreSQL are performed in an implicit transaction. The explicit use of the
BEGIN…COMMIT clauses encapsulates a series of SQL commands to ensure proper execution. If any of the
commands in the series fail, a ROLLBACK command can be issued, thereby bringing the database back to its
original state.

Input(s)

None. WORK and TRANSACTION are optional keywords that have no functional effect.

Output(s)

ABORT (Message returned if the command was successful.)

NOTICE: ROLLBACK: no transaction in progress: (Message returned if there was no current


transaction in progress.)

Notes

The COMMIT command is used to ensure that transactional actions are completed successfully.

See ABORT, BEGIN, and COMMIT for more information regarding transactions.

SQL-92 Compatibility

The ROLLBACK command is fully SQL-92 compliant. SQL-92 also specifies ROLLBACK WORK as a valid statement,
which is also supported by PostgreSQL.

Example

This example shows a transaction in progress that is terminated by using a ROLLBACK command:
BEGIN;
SELECT * FROM authors;

Name SSN Status


-------------------------------
Greg L. 123-45-6789 Active
Mike D. 999-99-9999 Active
INSERT INTO authors (Name, SSN, Status)
VALUES ('Barry S.', '555-55-5555', 'Inactive');

SELECT * FROM authors;

Name SSN Status


-------------------------------
Greg L. 123-45-6789 Active
Mike D. 999-99-9999 Active
Barry S. 555-55-5555 Inactive

ROLLBACK;
SELECT * FROM authors;

Name SSN Status


-------------------------------
Greg L. 123-45-6789 Active
Mike D. 999-99-9999 Active

SELECT

Usage

SELECT [ ALL | DISTINCT [ ON (expression [,…]) ] ] expression


[ AS name] [,…]
[ INTO [ TEMPORARY | TEMP ] [ TABLE ] new_table]
[ FROM [ ONLY ] fromitem [alias] [,…]]
[ JOIN fromitem [ ON joincondition | USING (joinlist) ] [,…] ]
[ WHERE wherecondition]
[ GROUP BY column [,…] ]
[ HAVING wherecondition [,…] ]
[ {UNION [ALL] | INTERSECT | EXCEPT} secselect]
[ ORDER BY column [ASC | DESC | USING operator] [,…] ]
[ FOR UPDATE [OF tablename[,…] ] ]
[ LIMIT {count | ALL} [ {OFFSET | ,} start] ]

Description

The SELECT command is used to retrieve rows from a single or multiple tables. If the WHERE condition is not
given, all rows are returned.

FROM Clause

The FROM clause identifies what tables are to be included in the query. If the FROM clause is simply a table name,
by default, this includes rows from inherited relations. The ONLY option will limit results to be only from the
specified table.

The FROM clause can also refer to SUB-SELECT, which is useful for performing advanced grouping, aggregation,
and ordering functions.
The FROM clause can also refer to a JOIN statement, which is the combination of two distinct FROM locations.
The following JOIN types are supported:

INNER JOIN | CROSS JOIN. A straight combination of the included row sources with no qualification made
for row removal.

The OUTER JOINs presented here in the next three bullets are a feature of Version 7.1 and above.
Previous versions of PostgreSQL do not support OUTER JOINs.

LEFT OUTER JOIN. The left-hand row source is returned in full, but the right-hand rows are returned only
where they passed the ON qualification. The left-hand rows are fully extended across the width of the
result, using NULLs to pad the areas where right-hand rows are missing.

RIGHT OUTER JOIN. The converse of a LEFT OUTER JOIN. All right-hand rows are returned, but left-
hand rows are only returned where they passed the ON qualification. The right-hand rows are fully extended
across the width of the result, using NULLs to pad the areas where the left-hand rows are missing.

FULL OUTER JOIN. A FULL OUTER JOIN returns all left-hand rows (NULL extended to right) and all
right-hand rows (NULL extended to left).

DISTINCT Clause

The DISTINCT clause allows the user to specify whether duplicate rows are returned or not. The default is to
return ALL, including duplicate rows.

By specifying DISTINCT ON in conjunction with ORDER BY, it is possible to limit duplicate returns based on
specific columns.
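For example, assuming the authors table used elsewhere in this chapter, the following returns one row per distinct name, keeping the most recent HireDate for each:

```sql
SELECT DISTINCT ON (name) name, HireDate
FROM authors
ORDER BY name, HireDate DESC;
```

The ORDER BY ensures that, within each group of duplicate names, the newest row is the one retained.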

WHERE Clause

The WHERE clause is used to limit what rows are returned. An expression that constitutes a valid WHERE clause
evaluates to a Boolean expression. For instance:

WHERE expression1 condition expression2

For example:

WHERE Name='Barry'

The condition can be one of =, <, <=, >, >=, <>, ALL, ANY, IN, and LIKE.

GROUP BY Clause

The GROUP BY clause is used to consolidate duplicate rows into single entries. All fields selected must contain
identical rows for the rows to be consolidated.

In the case in which an aggregate is on a field, the aggregate function will be computed for all members in each
group.

By default, GROUP BY attempts to function on input columns. However, if used with a SELECT AS clause, GROUP
BY can function on output columns. Additionally, GROUP BY can be used with the ordinal column number.

HAVING Clause
The HAVING clause filters out groups of rows generated by a GROUP BY command. Essentially, the HAVING
clause is like a WHERE filter for GROUP BY conditionals. However, the WHERE clause goes into effect before a
GROUP BY is run, whereas HAVING is executed after the GROUP BY has finished.
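For instance, assuming the authors table used elsewhere in this chapter, the following keeps only the groups (names) that occur more than once:

```sql
SELECT name, count(*) AS entries
FROM authors
GROUP BY name
HAVING count(*) > 1;
```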

ORDER BY Clause

The ORDER BY clause instructs PostgreSQL to order the output of a SELECT command by specific columns. If
multiple columns are specified, the output order will match the left-to-right order of the columns specified.

The direction of ordering can be specified by using either the ASC (ascending) or DESC (descending) option. By
default, ASC is assumed.

In addition to specifying column names, the ordinal numbers of the respective columns can also be used. If an
ORDER BY name is ambiguous, it is assumed to refer to an output column. This is the opposite of the GROUP BY
clause, which resolves ambiguous names to input columns.

UNION Clause

The UNION clause allows the output result to be a collection of rows from two or more queries. To function, each
query must have the same number of columns and the same respective data types.

By default, UNION composites do not contain duplicate rows, but they can if the ALL option is specified.

INTERSECT Clause

The INTERSECT clause gathers a composite output result from a collection of like queries. To function, each
query must have the same number of columns and the same respective data types.

INTERSECT differs from UNION because only the rows that are in common to both queries are returned.

FOR UPDATE Clause

The FOR UPDATE clause performs an exclusive lock on the selected rows to facilitate data modifications.
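A sketch of typical use, based on the authors table shown in the ROLLBACK example: the selected rows stay locked against concurrent modification until the transaction ends.

```sql
BEGIN;
-- Lock the matching rows so no other transaction changes them first.
SELECT * FROM authors WHERE name='Sam' FOR UPDATE;
UPDATE authors SET Status='Inactive' WHERE name='Sam';
COMMIT;
```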

EXCEPT Clause

The EXCEPT clause returns composite output resulting from a collection of like queries. To function, each query
must have the same number of columns and the same respective data types.

The EXCEPT clause differs from UNION in that all rows from the first query are returned, but only nonmatching
rows from the second query.
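For instance, assuming the authors and payroll tables used in the examples later in this section (both with an ssn column), the set operators compare like-shaped queries:

```sql
-- SSNs present in both tables:
SELECT ssn FROM authors
INTERSECT
SELECT ssn FROM payroll;

-- SSNs in authors that have no matching payroll row:
SELECT ssn FROM authors
EXCEPT
SELECT ssn FROM payroll;
```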

LIMIT Clause

The LIMIT clause is used to specify the maximum number of rows returned. If the OFFSET option is included,
that many rows will be skipped before the LIMIT command starts to take effect.

LIMIT usually returns meaningful results only when used in conjunction with an ORDER BY command;
otherwise, it is difficult to know what significance the rows being returned have.

Input(s)

expression—The name of a table column or an expression; an asterisk (*) signifies all columns.

name—Specifies an alternate name for a column or expression. Often used to rename the result of an aggregate
(that is, SELECT sum(check) AS TotalPayroll).
TEMPORARY | TEMP—The results of SELECT are sent to a unique temporary table, which is deleted once this
session is complete.

new_table—The results of SELECT are sent to a new query with this specified name. (See the SELECT INTO
command for more information.)

fromitem—The name of a table, sub-select, or JOIN clause to select rows from (see preceding for JOIN).

alias—Defines an optional name for the preceding table. Used to prevent confusion when dealing with same-
table joins.

wherecondition—A SQL expression that evaluates to a Boolean value. This result determines
which rows are filtered by the query.

column—The column name.

secselect—The secondary SELECT statement for use in UNION, INTERSECT, and EXCEPT clauses. It is a
standard SELECT statement, except that it cannot include ORDER BY or LIMIT clauses.

count—The number of rows to return in a limit phrase.

start—Used with the OFFSET command to not begin returning data until the specified number of initial rows
have been returned.

Output(s)

If successful, the data is returned.

XX ROWS (Message returned after the data set, indicating the number of rows returned.)

Notes

The user executing the SELECT command must have permissions to select the tables involved.

SQL-92 Compatibility

The major components of the PostgreSQL SELECT command are SQL-92 compliant, except for the following
areas:

LIMIT…OFFSET—This is a PostgreSQL extension. There are no such commands in SQL-92.

DISTINCT ON—This is a PostgreSQL extension. There are no such commands in SQL-92.

GROUP BY—In SQL-92, this command can only refer to input column names, whereas PostgreSQL can use
both.

ORDER BY—In SQL-92, this command can only refer to output (result) column names, whereas PostgreSQL
can use both.

UNION clause—In SQL-92, this command allows an additional option, CORRESPONDING BY, to be
included. This option is not available in PostgreSQL.

Examples

This example shows a simple SELECT statement from the table authors, selecting rows where they match a
specific name:

SELECT * FROM authors WHERE name='Sam';


Name SSN HireDate
------------------------------------
Sam 111-11-1111 01-01-1990
Sam 123-45-6789 04-23-2001
Sam 999-99-9999 06-22-1971
Sam 333-33-3333 09-19-1995

Here's the same example with an added ORDER BY feature:

SELECT * FROM authors WHERE name='Sam' ORDER BY HireDate;


Name SSN HireDate
------------------------------------
Sam 999-99-9999 06-22-1971
Sam 111-11-1111 01-01-1990
Sam 333-33-3333 09-19-1995
Sam 123-45-6789 04-23-2001

Here's the example again, this time with a LIMIT command to return only the two newest members:

SELECT * FROM authors WHERE name='Sam'


ORDER BY HireDate DESC
LIMIT 2;
Name SSN HireDate
------------------------------------
Sam 123-45-6789 04-23-2001
Sam 333-33-3333 09-19-1995

To join the current authors table with the payroll table to get the last check amount:

SELECT * FROM authors JOIN payroll ON authors.ssn=payroll.ssn;


Name SSN HireDate SSN LastCheck
-----------------------------------------------------------------
Sam 123-45-6789 04-23-2001 123-45-6789 500.00
Sam 999-99-9999 06-22-1971 999-99-9999 674.00
Sam 333-33-3333 09-19-1995 333-33-3333 800.00
Sam 111-11-1111 01-01-1990 111-11-1111 964.15

Use of aggregate functions (like count(), sum(), and so on) provides easy methods of summarizing data that
would be tedious to compute otherwise. In this example, the count() function tells us how many authors are
named Sam:

SELECT count(name) FROM authors WHERE name='Sam';

count
-----
4

Use the SUM function, GROUP BY, and a JOIN to tell us how much all the Sams have been paid:

SELECT authors.name, sum(payroll.LastCheck) AS Total


FROM authors JOIN payroll ON authors.ssn=payroll.ssn
WHERE authors.name='Sam'
GROUP BY authors.name;
Name Total
---------------
Sam 2398.15

This example shows how a sub-select works. All the people from payroll who made more than $900 on their
last check are chosen, and then their names are displayed from a join to authors:

SELECT name, ssn FROM authors


WHERE ssn IN (SELECT ssn FROM payroll WHERE LastCheck>900);
Name SSN
---------------------
Sam 111-11-1111

SELECT INTO

Usage

SELECT [ ALL | DISTINCT [ ON (expression [,…]) ] ] expression


[ AS name] [,…]
[ INTO [ TEMPORARY | TEMP ] [ TABLE ] new_table]
[ FROM [ ONLY ] fromitem [alias] [,…]]
[ JOIN fromitem [ ON joincondition | USING (joinlist) ] [,…] ]
[ WHERE wherecondition]
[ GROUP BY column [,…] ]
[ HAVING wherecondition [,…] ]
[ {UNION [ALL] | INTERSECT | EXCEPT} secselect]
[ ORDER BY column [ASC | DESC | USING operator] [,…] ]
[ FOR UPDATE [OF tablename[,…] ] ]
[ LIMIT {count | ALL} [ {OFFSET | ,} start] ]

Description

The syntax of the SELECT INTO command is essentially the same as for a regular SELECT command; the only
difference is that the output of the query is directed to a new table.

Input(s)

See the SELECT command.

Output(s)

See the "Output" section under the SELECT command.

Notes

The user that executes this command will become the owner of the newly created table.

SQL-92 Compatibility

See the "SQL-92 Compatibility" section under the SELECT command.

Example

Create a new table from only the people named Sam in your authors table:

SELECT * INTO TABLE SamTable FROM authors WHERE name='Sam';

SET

Usage

SET variable { TO | = } {value | 'value' | DEFAULT }

Or
SET CONSTRAINTS { ALL | list} mode

Or

SET TIME ZONE { 'timezone' | LOCAL | DEFAULT }

Or

SET TRANSACTION ISOLATION LEVEL { READ COMMITTED | SERIALIZABLE }

Description

Essentially, the SET command is used to set a run-time variable in PostgreSQL. However, the specific usage
varies greatly depending on what run-time variable is being set.

After a variable has been SET, the SHOW command can be used to display its current setting, and the RESET
command can be used to set it to its default value.

Input(s)

The basic list of valid variables and value combinations follows in the next section.

CLIENT_ENCODING | NAMES

Parameter(s): value

Sets multibyte encoding, which must be enabled during compile time.

DATESTYLE

Parameter(s):

ISO—Use the ISO 8601 style dates and times.

SQL—Use Oracle/Ingres style dates and times.

Postgres—Use the standard PostgreSQL style dates and times.

European—Use dd/mm/yyyy style dates.

NonEuropean—Use mm/dd/yyyy style dates.

German—Use dd.mm.yyyy style dates.

US—Use mm/dd/yyyy style dates (same as NonEuropean).

DEFAULT—Use the ISO style dates and times.

Sets the date and time styles for representation purposes.
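For example, to switch to German-style dates for the current session (the exact rendering of dates may vary by PostgreSQL version):

```sql
SET DATESTYLE TO 'German';
SELECT CURRENT_DATE;
```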

SEED

Parameter(s): value

Sets the random number generator with a specific seed (floating point between 0 and 1).

Additionally, this value can be set using the setseed function.



SERVER_ENCODING

Parameter(s): value

Sets the multibyte encoding to a value.

This option is only available if MULTIBYTE support has also been enabled.

CONSTRAINTS

Parameter(s): constraintlist and mode

Controls the constraint evaluation level in the current transaction, where:

constraintlist—Comma-separated list of constraint names.

mode—Either IMMEDIATE or DEFERRED.
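For example, to defer checking of all deferrable constraints until the end of the current transaction:

```sql
BEGIN;
SET CONSTRAINTS ALL DEFERRED;
-- Rows may temporarily violate deferrable constraints here.
COMMIT;  -- deferred constraints are checked at this point
```

Note that only constraints declared DEFERRABLE can actually be deferred.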

TIME ZONE | TIMEZONE

Parameter(s): value

Sets the time zone depending on your operating system (that is, /usr/lib/zoneinfo or
/usr/share/zoneinfo has valid time-zone values for a Linux-based OS).

PG_OPTIONS

PG_OPTIONS can take several internal optimization parameters. They are as follows:

all

deadlock_timeout

executorstats

hostlookup

lock_debug_oidmin

lock_debug_relid

lock_read_priority

locks

malloc

nofsync

notify

palloc

parserstats
parse

plan

plannerstats

pretty_parse

pretty_plan

pretty_rewritten

query

rewritten

shortlocks

showportnumber

spinlocks

syslog

userlocks

verbose

RANDOM_PAGE_COST

Parameter(s): float-value

Sets the optimizer's estimate of the cost of nonsequentially fetched disk pages.

CPU_TUPLE_COST

Parameter(s): float-value

Sets the optimizer's estimate of the cost of processing each tuple during a query.

CPU_INDEX_TUPLE_COST

Parameter(s): float-value

Sets the optimizer's estimate of the cost of processing each indexed tuple during a query.

CPU_OPERATOR_COST

Parameter(s): float-value

Sets the optimizer's estimate of the cost of processing each operator in a WHERE clause during a query.

EFFECTIVE_CACHE_SIZE

Parameter(s): float-value
Sets the optimizer's assumptions about the effective size of the disk cache.

ENABLE_SEQSCAN

Parameter(s): ON (default) or OFF

Enables/disables the planner's use of sequential scan types. (Note: This capability is actually impossible to turn
off completely, but setting it as disabled discourages its use.)

ENABLE_INDEXSCAN

Parameter(s): ON (default) or OFF

Enables/disables the planner's use of index scans.

ENABLE_TIDSCAN

Parameter(s): ON (default) or OFF

Enables/disables the planner's use of TID scan plans.

ENABLE_SORT

Parameter(s): ON (default) or OFF

Enables/disables the planner's use of explicit sort types. (Note: This capability is actually impossible to turn off
completely, but setting it as disabled discourages its use.)

ENABLE_NESTLOOP

Parameter(s): ON (default) or OFF

Enables/disables the planner's use of nested loops in join plans. (Note: This capability is actually impossible to
turn off completely, but setting it as disabled discourages its use.)

ENABLE_MERGEJOIN

Parameter(s): ON (default) or OFF

Enables/disables the planner's use of merge-join plans.

ENABLE_HASHJOIN

Parameter(s): ON (default) or OFF

Enables/disables the planner's use of hash-join plans.

GEQO

Parameter(s): ON (default), ON=value, or OFF

Sets the threshold for using genetic optimization algorithms.

KSQO

Parameter(s): ON, OFF (default), or DEFAULT (which is OFF)


This sets the Key Set Query Optimizer, which determines whether queries containing OR'd groups of AND clauses
(that is, WHERE (a=1 AND b=1) OR (a>2 AND b>2)) are optimized by rewriting them as a UNION query.

MAX_EXPR_DEPTH

Parameter(s): integer

Sets the maximum nesting depth that the parser will accept.

Caution

Raising this level too high can result in server crashes.

Output(s)

SET VARIABLE (Message returned if successful.)

NOTICE: Bad value for variable (value) (Message returned if the value specified cannot be used with
the declared variable.)

Notes

Use the SHOW command to display the value at which a variable is currently set.

SQL-92 Compatibility

The only use of the SET command defined in the SQL-92 specification is for SET TRANSACTION ISOLATION
LEVEL and SET TIME ZONE. Outside of these specific areas, this command is a PostgreSQL language extension.

Example

This example sets the TIME ZONE information to Central Time:

SET TIME ZONE 'CST6CDT';

SELECT CURRENT_TIMESTAMP As RightNow;

RightNow
----------------------
2001-08-15 09:50:23-05

SHOW

Usage

SHOW variable

Description

The SHOW command is used to display the current value of a run-time variable. It can be used in conjunction with
the SET and RESET commands to change variable settings.
Input(s)

variable—The name of the variable to display.

Output(s)

NOTICE: variable is value (Message returned if the command was successful.)

NOTICE: Unrecognized variable value (Message returned if the variable name specified cannot be
found.)

NOTICE: Time zone is unknown (Message returned if the TZ or PGTZ variables are not set correctly.)

Notes

For a list of valid variables that can be displayed, refer to the SET command.

SQL-92 Compatibility

SHOW is a PostgreSQL extension. There is no SHOW command defined in the SQL-92 specification.

Example

This example shows what the current date style is set to:

SHOW datestyle;

NOTICE: DateStyle is ISO with US (NonEuropean) conventions

TRUNCATE

Usage

TRUNCATE [ TABLE ] name

Description

The TRUNCATE command quickly deletes all rows from the specified table. Functionally, it is the same as a
DELETE command, but it is much faster.

Input(s)

name—The name of the table to TRUNCATE.

Output(s)

TRUNCATE (Message returned if the command was successful.)

Notes

The user of this command must own the table specified or have DELETE privileges to execute this command.

SQL-92 Compatibility
This is a PostgreSQL extension; the SQL-92 method for achieving this same effect would be to issue an
unqualified DELETE command.

Example

To quickly delete all data from the table temptable:

TRUNCATE TABLE temptable;

UNLISTEN

Usage

UNLISTEN { notifyname | * }

Description

The UNLISTEN command is used to stop a front-end from waiting on a LISTEN command. The specific name to
stop listening on can be specified, or a wildcard (*) can be specified, which will stop listening on all previously
registered names.

Input(s)

notifyname—The previously registered name on which to stop listening.

*—Stop listening on all previously registered names.

Output(s)

UNLISTEN (Message returned if successful.)

Notes

Once unregistered, further NOTIFY commands sent by the server will be ignored.

SQL-92 Compatibility

UNLISTEN is a PostgreSQL extension. There is no such command in the SQL-92 specification.

Example

This example shows the name mynotify being registered, sending notification, and then being unregistered:

LISTEN mynotify;
NOTIFY mynotify;
Asynchronous NOTIFY 'mynotify' from backend with pid '7277' received
UNLISTEN mynotify;
NOTIFY mynotify;

UPDATE

Usage
UPDATE table SET column=expression [,…]
[ FROM fromlist ]
[ WHERE condition ]

Description

The UPDATE command is used to change the data in specific rows in a table. If no WHERE condition is specified,
all rows are assumed; otherwise, only those rows matching the WHERE criteria are updated.

By using the FROM keyword, multiple tables can be used to satisfy the WHERE condition.

Input(s)

table—The name of the table to update.

column—The specific column in which to change the data.

expression—A specific value or valid expression to which to change the data.

fromlist—A list of alternate tables to include in the following WHERE condition.

condition—A standard SQL WHERE condition to constrain the updates. (See SELECT for more information on
WHERE conditions.)

Output(s)

UPDATE # (Message returned if successful. Output includes the number of rows where data was changed.)

Notes

The user of the UPDATE command must have write permissions to the table specified, as well as SELECT
permissions on any tables needed in the WHERE clause.

SQL-92 Compatibility

The UPDATE command is mostly compliant with the SQL-92 specification, except the following:

FROM fromlist—PostgreSQL allows multiple tables to satisfy the WHERE condition. This is not supported
in SQL-92.

WHERE CURRENT OF cursor—SQL-92 allows updates to be positioned based on an open cursor. This is
not supported in PostgreSQL.

Example

The following example updates the column status to active for all people named Bill in the authors table:

UPDATE authors SET status='active' WHERE name='Bill';
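The multiple-table FROM form described previously might look like the following sketch (the tables and columns here are hypothetical, for illustration only):

```sql
-- Mark authors inactive when their publisher has closed
UPDATE authors SET status='inactive'
FROM publishers
WHERE authors.pub_id = publishers.id
  AND publishers.closed = TRUE;
```

Here the publishers table appears only in the FROM list; it is consulted to satisfy the WHERE condition, but only rows in authors are changed.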

VACUUM

Usage

VACUUM [ VERBOSE ] [ ANALYZE ] [table [ (column [,…]) ] ]

Description
The VACUUM command serves two purposes: to reclaim wasted disk space and to profile PostgreSQL optimization
performance.

When the VACUUM command is run, all classes in the current database are opened, and old records from rolled-
back transactions are cleared out. Additionally, the system catalog tables are then updated with information
concerning the optimization statistics for each class. Furthermore, if run with the ANALYZE command,
information related to the dispersion of column data will be updated to improve query execution paths.

Input(s)

VERBOSE—Displays a detailed report for each table.

ANALYZE—Updates the column statistics for each table. This information is used by the query optimization
routine to plan the most efficient searches.

table—The name of a table to VACUUM. The default is all tables.

column—The name of a column to ANALYZE. The default is for all columns.

Output(s)

VACUUM (Message returned if the command executed successfully.)

NOTICE: - Relation 'table' (The report header for the specified table.)

NOTICE: Pages XX, Changed XX, Reaped XX, Empty XX, New XX; Tup XXXX: Vac XXXX, Crash
XX, Unused XX, MinLen XXX, MaxLen XXX; Re-using: Free/Avail. Space XXXXXXX/XXXXXXX;
EndEmpty/Avail. Pages X/XX. Elapsed X/X sec (The analysis report for the table.)

NOTICE: Index 'name': Pages XX; Tuples XXXX: Deleted XXXX. Elapsed X/X sec (The analysis
report for an index.)

Notes

The current open database is the default target for VACUUM.

VACUUM is a good candidate for running as a nightly cron job. For running this command outside of a psql or
other front-end application, see the vacuumdb command in Chapter 6, "User Executable Files," and the
section, "vacuumdb."

VACUUM ANALYZE should be run after significant deletions or modifications have been made to a database.

SQL-92 Compatibility

There is no VACUUM statement in SQL-92; this is a PostgreSQL extension.

Example

This example VACUUMS the table authors:


VACUUM VERBOSE ANALYZE authors;

NOTICE: --Relation authors--


NOTICE: Pages 10: Changed 6, reaped 10, Empty 0, New 0, Tup 1037: Vac 54, Keep/VTL
0/0 Crash 0, UnUsed 0, MinLen 64, MaxLen 64; Re-using: Free/Avail. Space
11108/3608; EndEmpty/Avail Pages 0/9. CPU 0.00s/0.01u sec
NOTICE: Index name_idx: Pages 9; Tuples 1037: Deleted 54. CPU 0.00s/0.01u sec
NOTICE: Index ssn_idx: Pages 5; Tuples 1037: Deleted 54. CPU 0.00s/0.00u sec
NOTICE: Rel authors: Pages: 10 --> 9; Tuples moved: 46. CPU 0.00s/0.01u sec
NOTICE: Index name_idx: Pages 10; Tuples 1037: Deleted 46. CPU 0.00s/0.01u sec
NOTICE: Index ssn_idx: Pages 5; Tuples 1037: Deleted 46. CPU 0.00s/0.00u sec
VACUUM
Part II: PostgreSQL Specifics

2 PostgreSQL Data Types

3 PostgreSQL Operators

4 PostgreSQL Functions

5 Other PostgreSQL Topics


Chapter 2. PostgreSQL Data Types
Data types are the basic building blocks of any RDBMS. They provide the
mechanisms needed for the higher-level functionality of modern databases to exist.
Queries, data validation, comparison operators, manipulation functions, and so on
exist only as an extension of the data-type concept.

Without data types, it would be extremely difficult to consistently obtain meaningful
information from a database. Data types ensure that data at the columnar level is
stored in a consistent format, which makes it possible to compare and manipulate
the underlying data in a predictable manner. Without such consistency, it could be
like comparing apples and oranges (or integers and strings, in this case) when
trying to make valid comparisons of the underlying data.

A further benefit of data types is that they provide a significant boost to the overall
performance and efficiency of a database system. Because the database knows
what type of data is stored in a given column, assumptions can be made about how
to most efficiently store and retrieve that data.
Table of Data Types

The following is a table with the PostgreSQL built-in data types sorted according to
what type of data they hold. This chart might be useful if you know the type of data
you want to store but are unsure of the PostgreSQL designation for that data type.

Data Category  PostgreSQL Data Type (Alias Name)    Description

Geometric      BOX                                  Coordinates describing a rectangle.
               CIRCLE                               Coordinates describing a circle.
               LINE                                 Coordinates describing an infinite line.
               LSEG                                 Coordinates describing a line segment.
               PATH                                 Coordinates describing a list of points.
               POINT                                Coordinates describing a single point.
               POLYGON                              Coordinates describing a polygon.

Logical        BOOLEAN                              True or false Boolean value.

Network        CIDR                                 IP / size of network segment.
               INET                                 IP address and netmask value.
               MACADDR                              MAC address of an ethernet card.

Numeric[*]     BIGINT or INT8                       Integer—approximately ±9.2 x 10^18.
               DECIMAL or NUMERIC                   User-defined precision and decimal rounding.
               DOUBLE PRECISION or FLOAT8           Floating-point value—15-digit precision.
               FLOAT                                Same as FLOAT8 or DOUBLE PRECISION.
               INTEGER or INT4                      Integer—-2,147,483,648 to +2,147,483,647.
               REAL or FLOAT4                       Floating-point value—six-digit precision.
               SERIAL                               Auto-incrementing integer—1 to +2,147,483,647.
               SMALLINT or INT2                     Integer—-32,768 to +32,767.

String         CHAR or CHARACTER                    Fixed-length string—blank padded.
               TEXT                                 Variable-length string.
               VARCHAR or CHARACTER VARYING         Variable-length string with specified limit.

Time           DATE                                 Day from 4713 BC to 32767 AD.
               INTERVAL                             Time interval ± 178,000,000 years.
               TIME                                 Time of day 00:00:00 to 23:59:59.
               TIME WITH TIME ZONE                  Time of day and time zone from 00:00:00+12
                                                    to 23:59:59-12.
               TIMESTAMP                            Date/time from 4713 BC to 1465001 AD.

Other          BIT                                  Specified fixed-length binary data type;
                                                    right-padded.
               BIT VARYING                          Variable-length (with maximum) binary
                                                    data type.
               MONEY                                Fixed-precision number ± 21474836.48 (no
                                                    longer used—use FLOAT or NUMERIC(x,2)).
               NAME                                 Internal data type for system names.
               OID                                  PostgreSQL object identifier.

[*]Some of the data-type names for the Numeric category are only supported
in PostgreSQL 7.1. For instance, in Version 6.5, BIGINT is not supported, but
INT8 is supported.

The following pages comprise a listing of the built-in data types. They are broken up
according to the types of data they define. Each element is described with relevant
information such as storage size, range of values, compatibility, and descriptive
notes.
Geometric Data Types

PostgreSQL includes a number of built-in data types for holding geometric-based


data. There are also many built-in functions and operators that aid in the
manipulation and calculation of geometric data.

BOX

Description

Holds the outer-corner coordinates that define a rectangle.

Inputs

((x1,y1),(x2,y2))

x1—X-axis begin point.

y1—Y-axis begin point.

x2—X-axis end point.

y2—Y-axis end point.

Storage Size

32 bytes

Example Data

((1,1),(50,50))

Notes

On input, the data is reordered to store the upper-right corner first and the lower-
left corner second. Therefore, the preceding example, when stored, would be
represented as ((50,50),(1,1)).

CIRCLE

Description
Holds the coordinates that represent a center point and radius of a circle.

Inputs

<(x,y),r>

x—X-axis of center point.

y—Y-axis of center point.

r—Radius

Storage Size

24 bytes

Example Data

<(10,10),5>

LINE

Description

Holds the coordinates for an infinite line.

Inputs

((x1,y1),(x2,y2))

x1, y1—X- and Y-axis of starting point.

x2, y2—X- and Y-axis of ending point.

Storage Size

32 bytes

Example Data

((1,1),(100,100))
Notes

The LINE data type was implemented starting in PostgreSQL 7.1.

LSEG

Description

Holds the coordinates that represent a finite line segment.

Inputs

((x1,y1),(x2,y2))

x1, y1—X- and Y-axis of starting point.

x2, y2—X- and Y-axis of ending point.

Storage Size

32 bytes

Example Data

((1,1),(100,100))

Notes

The LSEG data type is similar to LINE, except that the latter represents an infinite
line as opposed to a specified segment. Essentially, once a line is defined with the
LINE data type, it is assumed that it will continue along the same plane in
perpetuity.

PATH

Description

Paths represent variable line segments, in which there can be numerous points that
create either an open or closed path.

Inputs
((x1,y1),…,(xn,yn))

x1, y1—Represents the starting point of a closed path.

xn, yn—Represents the end point of a closed path.

[(x1,y1),…,(xn,yn)]

x1, y1—Represents the starting point of an open path.

xn, yn—Represents the end point of an open path.

Storage Size

4 + 32n bytes

Example Data

[(1,1),(3,3),(5,10)]—An open path.

((1,1),(3,3),(5,10),(1,1))—A closed path.

Notes

Closed paths begin with an open parenthesis and open paths begin with an open
bracket.

The functions isopen, isclosed, popen, and pclose can be used to test and
manipulate paths.
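For instance, these functions might be used as in the following sketch (the trips table and its route column are hypothetical):

```sql
-- Test whether each stored path is open
SELECT isopen(route) FROM trips;

-- Return each path in its closed form
SELECT pclose(route) FROM trips;
```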

POINT

Description

Holds the coordinates that represent a single point in space.

Inputs

(x,y)

x—X-axis of point.

y—Y-axis of point.
Storage Size

16 bytes

Example Data

(1,5)

POLYGON

Description

Holds a set of coordinates that represents a polygon.

Inputs

((x1,y1),…,(xn,yn))

x1, y1—The starting point of the polygon.

xn, yn—The ending point of the polygon (will usually be the same as the starting
point).

Storage Size

4+32n bytes

Example Data

((1,1),(5,1),(5,5),(1,5),(1,1))—A square polygon.

Notes

A polygon is very similar to a closed path. However, there are some additional
functions that only act on polygons (that is, poly_center, poly_contain,
poly_left, poly_right, and so on).
Logical Data Types

Logical data types are used to represent the concepts of true, false, or NULL.
Typically, this data type is useful as a flag that indicates the current state of a
record. The true and false values are self-explanatory, while the value NULL
usually indicates the equivalent of "unknown."

BOOLEAN

Description

Holds a logical true, a logical false, or a NULL value.

Inputs

TRUE, t, true, y, yes, 1—All valid TRUE values.

FALSE, f, false, n, no, 0—All valid FALSE values.

NULL—Valid NULL value.

Storage Size

1 byte

Notes

Generally, it is best to use the TRUE and FALSE input forms for Boolean data. These
formats are SQL compatible and generally are more accepted, although some
RDBMSs use 1 and 0 for TRUE and FALSE representations. Some of the input values
need to be escaped by enclosing them in single quotes (i.e., 't'). However, the
SQL-compliant TRUE and FALSE forms do not require quotations.
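A minimal sketch of the quoting rule (the flags table is hypothetical):

```sql
CREATE TABLE flags (id INTEGER, active BOOLEAN);
INSERT INTO flags VALUES (1, TRUE);   -- SQL-compliant keyword form; no quotes needed
INSERT INTO flags VALUES (2, 'f');    -- abbreviated forms must be quoted
SELECT * FROM flags WHERE active;     -- returns only the row with id 1
```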
Network Data Types

PostgreSQL is unique among many SQL systems in that it includes built-in data
types for network addresses. CIDR, INET, and MACADDR all represent specific
aspects of network addresses. These data types can be particularly useful when
using PostgreSQL as a back-end database to a web application.

Storing network values in these data types is preferable to storing them as plain
strings because PostgreSQL includes functions and operators that act specifically on
the network data types.

CIDR

Description

Holds dotted-quad data for an IP address and the number of bits in the netmask.
This data type is named for the Classless Inter-Domain Routing (CIDR)
convention.

Inputs

x.x.x.x/y

x.x.x.x—Valid IP address.

y—Bits in the netmask.

Storage Size

12 bytes

Example Data

192.168.0.1/24

128.1 (128.1.0.0/16 assumed)

10 (10.0.0.0/8 assumed)

Notes

If the bits from the netmask are omitted, the netmask bits are assumed by using
the class of the dotted-quad (for example, 255.0.0.0 assumes 8, 255.255.0.0
assumes 16, 255.255.255.0 assumes 24, and so on). However, the assumption will
be large enough to handle all the entries in the expressed octets.
IPv6 is not yet supported.

INET

Description

Holds dotted-quad data for an IP address and an optional netmask.

Inputs

x.x.x.x/y

x.x.x.x—Valid IP address.

y—If given, a netmask; otherwise, a host is assumed.

Storage Size

12 bytes

Example Data

192.168.0.1 (192.168.0.1/32 is assumed)

Notes

The difference between this and CIDR is that an INET can refer to a single host,
whereas CIDR refers to an IP network.
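The distinction can be sketched like this (the hosts table is hypothetical):

```sql
CREATE TABLE hosts (address INET, network CIDR);
INSERT INTO hosts VALUES ('192.168.0.15', '192.168.0.0/24');

-- Is the host address contained within the stored network?
SELECT address << network FROM hosts;
```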

MACADDR

Description

Holds a MAC address, which is an ethernet hardware address.

Inputs

Several different formats are supported, such as the following

xxxxxx:xxxxxx
xxxxxx-xxxxxx

xxxx.xxxx.xxxx

xx-xx-xx-xx-xx-xx

xx:xx:xx:xx:xx:xx (the default)

Storage Size

6 bytes

Example Data

Both of these refer to the same MAC address:

08-00-2d-01-32-22

08002d:013222

Notes

The directory $SOURCE/contrib/mac includes tools to identify the manufacturers
of specific network cards from a given MAC address. (This directory is specific to
Version 7.1 only.)
Numeric Data Types

Numeric data types store a variety of number-related data. The 7.X series of
releases has brought some changes to this area. Namely, PostgreSQL now uses a
more descriptive naming convention for number-related data types (for example,
BIGINT versus INT8).

Some data types have become deprecated over the last few releases. For
instance, use of the MONEY data type is no longer encouraged; instead, it is
preferable to use the DECIMAL data type.

BIGINT (or INT8)

Description

Holds a very large integer.

Inputs

Large integer—range from approximately -9.2 x 10^18 to +9.2 x 10^18

Storage Size

8 bytes

Notes

Versions of PostgreSQL before 7.1 might refer to this data type as INT8.

DECIMAL (or NUMERIC)

Description

Holds a number of user-defined length with a decimal-width specification.

Inputs

(x,y)

x—Precision; the total number of digits.
y—Scale; the number of digits after the decimal point.

Storage Size

8 bytes

Notes

Versions of PostgreSQL before 7.1 might refer to this data type as NUMERIC.

DOUBLE PRECISION (or FLOAT8)

Description

Holds a large floating-point number.

Inputs

Variable precision—15 decimal places.

Storage Size

8 bytes

Notes

Versions of PostgreSQL before 7.1 might refer to this data type as FLOAT8.

INTEGER (or INT4)

Description

Holds an integer.

Inputs

Range from -2,147,483,648 to +2,147,483,647

Storage Size
4 bytes

Notes

Versions of PostgreSQL before 7.1 might refer to this data type as INT4.

REAL (or FLOAT4)

Description

Holds a standard floating-point number.

Inputs

Variable precision up to six decimal places.

Storage Size

4 bytes

Notes

Versions of PostgreSQL before 7.1 might refer to this data type as FLOAT4.

SERIAL

Description

Holds an integer and is used for auto-increment fields.

Inputs

Range from 1 to +2,147,483,647

Storage Size

4 bytes

Notes
The SERIAL data type is actually just a standard INTEGER type with some
additional features: an automatically created SEQUENCE and INDEX on the specified
column. When a table containing a SERIAL type is dropped, the associated SEQUENCE
must also be explicitly dropped—it does not occur automatically.
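A minimal sketch of that cleanup step (the table name is illustrative; PostgreSQL names the sequence tablename_columnname_seq):

```sql
CREATE TABLE items (id SERIAL, label TEXT);

-- Dropping the table does not drop the sequence in these versions:
DROP TABLE items;
DROP SEQUENCE items_id_seq;   -- must be dropped explicitly
```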

SMALLINT (or INT2)

Description

Holds a small integer.

Inputs

Range from -32,768 to +32,767

Storage Size

2 bytes

Notes

Versions of PostgreSQL before 7.1 might refer to this data type as INT2.
String Data Types

PostgreSQL includes three basic data types for storing string-related data. In
compliance with the SQL standard, there are types for fixed-length and variable-
length character strings. Additionally, PostgreSQL defines a more generic data type
named TEXT that requires no specified upper limit regarding maximum size.
However, this data type is PostgreSQL-specific and is not compliant with the SQL-92
standards.

CHAR (or CHARACTER)

Description

Holds a fixed-length string specified with a maximum length.

Inputs

CHAR(n)

n—Integer length of string (if omitted, a 1 is assumed).

Storage Size

(4+n) bytes

Notes

CHAR is a SQL-92-compatible data type. Data that does not fill to the limit specified
is blank padded.

TEXT

Description

Holds a variable-length string.

Storage Size

Variable—depends on the length of string, but no smaller than 4 bytes.

VARCHAR (or CHARACTER VARYING)


Description

Holds a variable-length string with a specified limit.

Inputs

VARCHAR(n)

n—Integer length limit of the string (if omitted, strings of any length are accepted).

Storage Size

Variable—depends on the length of string, but no smaller than 4 bytes.

Notes

VARCHAR and CHARACTER VARYING are both SQL-compatible data types.


Time Data Types

PostgreSQL includes a number of built-in data types specifically designed to handle


time- and date-related data.

A number of built-in constants are useful to know for simplifying date-time entry.
The following is a list of them:

now—Constant that stores a timestamp upon storage.

today—Constant that refers to midnight on the current day.

tomorrow—Constant that refers to midnight on the next day.

yesterday—Constant that refers to midnight of the previous day.

PostgreSQL evaluates constants at the start of a transaction, and this might result in
undesired behavior. For instance, using the now constant in a series inserted inside
a transaction will result in all rows having the same timestamp. A way around this is
to use the now() function, which is evaluated upon each call, not during transaction
creation.
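The difference can be sketched as follows (the log table is hypothetical):

```sql
CREATE TABLE log (tag TEXT, stamp TIMESTAMP);

BEGIN;
INSERT INTO log VALUES ('first', 'now');    -- constant: evaluated once for the transaction
INSERT INTO log VALUES ('second', 'now');   -- same timestamp as the 'first' row
INSERT INTO log VALUES ('third', now());    -- function: evaluated at the time of this call
COMMIT;
```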

DATE

Description

Holds a value that describes a particular day. Many different input formats are
supported (see the following section).

Inputs

Valid range from 4713 BC to 32767 AD

Possible input formats:

June 22, 1971—Standard prose format of date.

June 22, 200 BC—Specifying the era.

1971-06-22—ISO format (yyyy-mm-dd).

6/22/1971—U.S. mode.

22/6/1971—European mode (not a valid date in U.S. mode).

19710622 or 710622—ISO format (yyyymmdd or yymmdd).

1971.174 or 71.174—Year and the day of the year.


Storage Size

4 bytes

Notes

Valid month formats and abbreviations:

January Jan

February Feb

March Mar

April Apr

May May

June Jun

July Jul

August Aug

September Sep or Sept

October Oct

November Nov

December Dec

Valid days of the week and abbreviations:


Monday Mon

Tuesday Tue or Tues

Wednesday Wed or Weds

Thursday Thu,Thur, or Thurs

Friday Fri

Saturday Sat

Sunday Sun

The preceding describes the input formats; the output formats are specified by the
DATESTYLE variable (see the SET SQL command).
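For instance, switching the output style for the current session might look like this (see the SET command for the full syntax):

```sql
SET DATESTYLE TO 'European';
SELECT CURRENT_DATE;   -- now printed with the day before the month
```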

INTERVAL

Description

Holds a time-interval value.

Inputs

The input format for INTERVAL is as follows:

Qnt Unit [Qnt Unit …] Direction

Valid values for Qnt are as follows:

-2147483648 to +2147483647

Valid values for Unit are as follows (plurals are also valid):

Second

Hour

Minute
Day

Week

Month

Year

Decade

Century

Millennium

Valid values for Direction are as follows:

Ago—For items in the past.

[blank]—For future items.

Storage Size

12 bytes

Example Data

1 Week Ago

5 Years 3 Months Ago

30 Days

Notes

INTERVAL is accurate to a resolution of .000001 second (1 microsecond).

TIME

Description

Holds an entry for a time-based value.


Inputs

The valid range for TIME is from 00:00:00.00 to 23:59:59.99.

The valid input formats that TIME can take are as follows:

08:24—ISO format

08:24:50—ISO format

08:24:50.15—ISO format

082450—ISO format

08:24 PM—Standard

20:24—24-hour format

z—Same as 00:00:00

zulu—Same as 00:00:00

Storage Size

4 bytes

Notes

The TIME data type is a SQL-compatible format and is accurate to
a resolution of .000001 second (1 microsecond).

TIME WITH TIME ZONE

Description

Holds an entry for a time-based value with included time-zone information.

Inputs

The valid range for TIME WITH TIME ZONE is from 00:00:00.00+12 to
23:59:59.99-12.

The valid input formats that TIME WITH TIME ZONE can take are as follows:

08:24-6—ISO format

08:24:50-6—ISO format

08:24:50.15-6—ISO format

082450-6—ISO format

Storage Size

4 bytes

Notes

TIME WITH TIME ZONE will accept any time-based input format that is also legal
for the TIME data type, except time zone information is appended to the end.

The TIME WITH TIME ZONE data type is a SQL-compatible format and is accurate
to a resolution of .000001 second (1 microsecond).

TIMESTAMP

Description

Holds values that represent time and date information.

Inputs

The valid range for TIMESTAMP is from 4713-01-01 00:00:00.00 BC to
1465001-12-31 23:59:59.99 AD.

The valid input formats that TIMESTAMP can take are as follows:

Date Time [Era] [Time Zone]

For instance:

2001-11-24 08:23:11—Standard TIMESTAMP.

2001-11-24 08:23:11 AD -6:00—TIMESTAMP with era and time zone.

November 11, 2001 08:23:11—Prose-style TIMESTAMP.

Storage Size

8 bytes
Notes

Because of the inclusion of time, date, era, and time-zone information, the
TIMESTAMP is a popular data type for storage of temporal elements.
Other Data Types

The following are various data types that are used less frequently. Often, these are
used for internal system purposes only, but you might run across them, so they are
listed here for completeness.

BIT and BIT VARYING

The BIT type stores a series of binary-1 and binary-0 values. The BIT data type has
a specified width and pads empty entries with zeros, whereas the BIT VARYING
data type allows flexible-width entries to be made.

MONEY

The MONEY data type is still supported but is no longer considered active. Consider
using a NUMERIC or DECIMAL data type with an appropriately set decimal width.

NAME

The NAME data type stores a 31-character string, but it is only intended to be used
internally. PostgreSQL makes use of the NAME type to store information in the
internal system catalogs.

OID

The OID data type is an integer that ranges in value from zero to 4 billion. Every
object created in PostgreSQL has an OID assigned to it implicitly. OIDs are useful
for maintaining data integrity because that number will be unique in the database.

By default, OIDs are hidden from view, but they can be selected and displayed by
explicitly specifying them in a query. For instance:

SELECT * FROM test;


Name  Age
-----------
Bill  34
Ann   22

SELECT oid, * FROM test;

oid    Name  Age
------------------
19278  Bill  34
19279  Ann   22
Often, OIDs are used instead of creating explicit sequences to ensure data integrity.
However, it is generally advisable not to do this for a number of reasons. OIDs exist
for all objects in an entire database; therefore, they will not be sequential in any
given table.

Moreover, once an OID sequence has reached its upper limit, it starts again at zero
(or another prescribed minimum). Although sequences can do the same thing, the
odds of OIDs wrapping around are much greater because they are distributed
throughout the entire database.

Additionally, when building database-driven applications, there are numerous cases
in which the next value of a sequence is needed before it has actually been
committed into use. Due to the reasons previously stated, it is far more reliable to
query the next value of a single sequence than to try to guess the next value of an
OID.
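Querying the next value of an explicit sequence directly might look like this (the sequence name is hypothetical):

```sql
CREATE SEQUENCE order_id_seq;
SELECT nextval('order_id_seq');   -- reserves and returns the next value
```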
More Data Types

The following is a listing of some of the more obscure PostgreSQL data types (some
of these types are only aliases to previously documented types):

abstime—Holds an absolute system (UNIX) time.

aclitem—Holds an access control list item.

bpchar—Holds a blank-padded fixed-length string.

bytea—Holds a variable-length binary value.

cid—Holds a command identifier type and is used in transactions.

filename—Holds a filename and is used in system tables.

int2vector—Holds an array of 16 INT2 type values.

lztext—Holds variable-length text and is stored compressed.

oidvector—Holds an array of 16 OIDs and is used in system tables.

regproc—Holds registered procedures.

reltime—Holds relative time (UNIX delta).

timetz—Holds an ANSI SQL time with time zone (hh:mm:ss+tz).

tinterval—Holds an array of (abstime, abstime).

varbit—Holds a variable-length bit value.

xid—Holds a transaction ID sequence.


Chapter 3. PostgreSQL Operators
Data types, by themselves, are only useful for holding data. To make comparisons,
orderings, and selections of table data, operators are needed.

Most operators simply return an implicit Boolean true or false given the comparison
criteria. However, some operators, such as the math- and string-related ones,
return new results from the supplied elements.

The following is a map of the default PostgreSQL operators grouped by data type.
After that is a more detailed listing of all the supported PostgreSQL operators,
including information on specific usage, syntax, and notes.

Table 3.1. Map of Operators Grouped by Data Type

Data Type   Operators

Geometric   +  ##  &&  &<  &>  <->  <<  <^  >>  >^  ?#  ?-  ?-|  @-@  ?|  ?||  @  @@  ~=

Logical     AND  OR  NOT

Network     <  <=  >=  >  <>  <<  <<=  >>  >>=

Numerical   !  !!  %  *  |/  ||/

String      <  <=  <>  >  >=  ||  !!=  ~~  !~~  ~*  !~  !~*

Time        #<  #<=  #<>  #=  #>  #>=  <#>  <<  ~=  <?>
Geometric Operators

PostgreSQL includes a number of operators to assist in comparing geometric data


types included with PostgreSQL.

Some of these operators are implicit Boolean returns (for example, << and >>), and
others provide new results from the input elements like math operators (for
example, + and -).

Listing

+ Translation (for example, point '(2,0)' + point '(0,1)')

- Translation (for example, point '(2,0)' - point '(0,1)')

* Scaling/rotation

/ Scaling/rotation

# Intersection

# Number of points in polygon

## Point of closest proximity

&& Overlaps

&< Overlaps to left

&> Overlaps to right

<-> Distance between

<< Is left of
<^ Is below

>> Is right of

>^ Is above

?# Intersects or overlaps

?- Is horizontal

?-| Is perpendicular

@-@ Length or circumference

?| Is vertical

?|| Is parallel

@ Contained or on

@@ Center of

~= Same as

Notes/Examples

Add a point to a box:

box '((0,0),(1,1))' + point '(2,0)'

Select all boxes that lay to the left of the given box:

SELECT * FROM map WHERE thebox << box '((5,5),(4,4))';

Select the lines parallel to the line segment given:


SELECT * FROM map WHERE thelines ?|| lseg '((1,1),(3,3))';
Logical Operators

The logical operators usually are used to combine expressions to get an aggregate
Boolean value from the list.

Listing

AND Functions as a logical AND criteria connector.

OR Functions as a logical OR criteria connector.

NOT Functions as a negator.

Notes/Examples

SELECT * FROM payroll WHERE firstname='Bill' AND lastname='Smith';

SELECT * FROM payroll WHERE firstname='Bill' OR firstname='Sam';
SELECT * FROM payroll WHERE firstname IS NOT NULL;
Network Operators

The network operators that are built into PostgreSQL are useful for making
comparisons between IP addresses. These operators function on INET and CIDR
data types equally.

Listing

< Less than

<= Less than or equal to

= Equals

>= Greater than or equal to

> Greater than

<> Not equal

<< Contained within

<<= Contained within or equals

>> Contains

>>= Contains or equals

Notes/Examples

Use the following to find all the IP addresses less than the one specified:

SELECT * FROM computers WHERE ipaddr < '192.168.0.100';

Similarly, use the following to find all the IP addresses on the given subnet:
SELECT * FROM computers WHERE ipaddr << '192.168.0.1/24';
Numerical Operators

Numerical operators return a new value from the supplied elements.

Listing

! Factorial
!! Factorial (left operator)
% Mod or truncate
* Multiplication
+ Addition
- Subtraction
/ Division
: Exponentiation
@ Absolute value
^ Exponentiation
|/ Square root
||/ Cube root
& Binary AND
| Binary OR
# Binary XOR
~ Binary NOT
<< Binary shift left
>> Binary shift right

Notes/Examples

Here are some examples of numerical operators in action:

Operator Supplied Values Result

Addition 5+5 10

Square root |/81 9

Absolute value @-8 8

Cube root ||/343 7


Factorial 3! 6

Left factorial !3 6

Binary AND 1&5 1

Binary XOR 1#5 4

Binary left shift 1<<4 16

The binary operators also function on BIT and BIT VARYING data types. For
instance:

Operator Supplied Values Result

Binary AND B'10001'&B'01101' B'00001'

Binary NOT ~B'01110' B'10001'


String Operators

Essentially, all string operators return implicit Boolean true or false values given the
supplied comparison. (The exception is the concatenation operators shown in the
next "Listing" section.)

When making comparisons, the characters' location in the ASCII chart is taken into
account. Therefore, an uppercase "A" is seen as less than (<) a lowercase "a."

String operators make use of regular expression matching. This expression
matching includes an internal format and a POSIX-compliant format.

PostgreSQL can make use of two distinct types of pattern matching: an ANSI-SQL
method and a POSIX regex style. The internal ANSI-SQL style makes use of the
LIKE and NOT LIKE keywords. This ANSI-SQL method can use the following
wildcards for pattern matching:

Wildcard Meaning

% Matches any sequence of zero or more characters

_ Matches any single character

Conversely, the POSIX-compliant operators use the standard regex comparisons,
such as the following:

POSIX Regex Symbol Meaning

. Matches any single character.

* Zero or more repetitions of the preceding item.

+ One or more repetitions of the preceding item.

? Zero or one occurrence of the preceding item.

[ ] List of single characters enclosed in brackets. Matches any of
those.

[^ ] List of single characters enclosed in brackets. Rejects matches
to any of those.

…etc…

A full discussion of POSIX-style regular expressions is beyond the scope of this
book. See the man pages for sed, awk, and egrep for more information on POSIX-
style regex.

The regex engine included with most versions of PostgreSQL is the POSIX 1003.2
"egrep" style. This regex library, by Henry Spencer, is included in many other
popular applications. More information on the regex engine included in a specific
version of PostgreSQL can usually be found in the source directory
$SOURCE/backend/regex.

Listing

< Less than

<= Less than or equal to

<> Not equal

= Equal to

> Greater than

>= Greater than or equal to

|| Concatenate strings

!!= Not like

~~ Like
LIKE Like

ILIKE Like; case insensitive

NOT ILIKE Not like; case insensitive

NOT LIKE Not like

!~~ Not like

~ Match using regex; case sensitive

~* Match using regex; case insensitive

!~ No match using regex; case sensitive

!~* No match using regex; case insensitive

Notes/Examples

Select all records from a table where the first name is Bob:

SELECT * FROM authors WHERE firstname='Bob';

Select all records from a table, except those named Bob:

SELECT * FROM authors WHERE firstname<>'Bob';

Select all records where the first name begins with Bo:

SELECT * FROM authors WHERE firstname LIKE 'Bo%';

Select all records where the first name begins with b, regardless of case:

SELECT * FROM authors WHERE firstname ILIKE 'b%';
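The POSIX-style operators accept full regular expressions rather than LIKE wildcards. Assuming the same authors table used in the preceding examples:

```sql
-- First names beginning with "Bo" (case sensitive)
SELECT * FROM authors WHERE firstname ~ '^Bo';

-- First names ending in "b", regardless of case
SELECT * FROM authors WHERE firstname ~* 'b$';
```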


Time Operators

The time operators are used to compare temporal values and usually return a
Boolean true or false.

Listing

< Interval less than

<= Interval less than or equal to

<> Interval not equal

= Interval equal

> Interval greater than

>= Interval greater than or equal to

| Start of interval
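In practice, the temporal comparison operators appear most often in WHERE clauses. A sketch, assuming a hypothetical payroll table with a checkdate column:

```sql
-- All checks issued during May 2001
SELECT * FROM payroll
WHERE checkdate >= '05-01-2001'
  AND checkdate <  '06-01-2001';
```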
Chapter 4. PostgreSQL Functions
PostgreSQL includes a number of built-in functions that manipulate specific data
types and return values.

The following is a map of the built-in functions grouped by category.


Map of Functions Grouped by Category

Function Category

Aggregate functions AVG

COUNT

MAX

MIN

STDDEV

SUM

VARIANCE

Conversion functions CAST

TO_CHAR

TO_DATE

TO_NUMBER

TO_TIMESTAMP

Geometric functions AREA

BOX

CENTER

CIRCLE

DIAMETER

HEIGHT

ISCLOSED

ISOPEN

LENGTH
LSEG

NPOINTS

PATH

PCLOSE

POINT

POLYGON

POPEN

RADIUS

WIDTH

Network functions ABBREV

BROADCAST

HOST

MASKLEN

NETMASK

NETWORK

TEXT

TRUNC

Numerical functions ABS

ACOS

ASIN

ATAN

ATAN2

CBRT

CEIL

COS
COT

DEGREES

EXP

FLOOR

LN

LOG

PI

POW or POWER

RADIANS

RANDOM

ROUND

SIN

SQRT

TAN

TRUNC

SQL functions CASE WHEN

COALESCE

NULLIF

String functions ASCII

CHR

INITCAP

LENGTH, CHAR_LENGTH,or CHARACTER_LENGTH

LOWER

LPAD

LTRIM
OCTET_LENGTH

POSITION

STRPOS

RPAD

RTRIM

SUBSTRING

SUBSTR

TRANSLATE

TRIM

UPPER

Time functions AGE

CURRENT_DATE

CURRENT_TIME

CURRENT_TIMESTAMP

DATE_PART

DATE_TRUNC

EXTRACT

ISFINITE

NOW

TIMEOFDAY

TIMESTAMP

User functions CURRENT_USER

SESSION_USER

USER
Other functions ARRAY_DIMS
Aggregate Functions

PostgreSQL includes a number of aggregate functions. Generally, aggregate functions
calculate a single return value for an entire range of supplied input values. This
behavior contrasts with standard functions, which generally return one output value
for each supplied input.

AVG

Description

The AVG function returns the average value of the supplied column or expression.

Input

AVG(col | expression)

Example

Return the average salary from the payroll table:

SELECT AVG(salary) FROM payroll;

This example shows the use of expressions contained in the AVG function. Specifically,
it returns the average amount over $18,000 that employees earn (notice that the
criteria provided restricts calculations being performed on anyone earning less than
$18,000).

SELECT AVG(salary-18000) FROM payroll WHERE salary>18000;

Notes

The AVG function will work on the following data types: smallint, integer, bigint,
real, double precision, numeric, and interval.

Any integer value (that is, bigint, integer, and so on) returns an integer data
type.

Any floating-point value returns a numeric data type.

Others, such as interval, are returned as their own data type.

COUNT
Description

The COUNT function counts the rows or expressions where a non-NULL value is returned.

Inputs

COUNT(*)—Count all rows.

COUNT(col | expression)—Count a specific column or expression.

Example

SELECT COUNT(*) AS Num_Active FROM payroll WHERE status='active';

MAX

Description

The MAX function returns the greatest value from a column or expression list that was
passed to it.

Input

MAX(col | expression)

Example

SELECT MAX(salary) FROM payroll;

MIN

Description

The MIN function returns the smallest value from a column or expression list that was
passed to it.

Input

MIN(col | expression)
Example

SELECT MIN(salary) FROM payroll;

STDDEV

Description

The STDDEV function returns the standard deviation of the supplied columns or
expression list.

Input

STDDEV(col | expression)

Example

SELECT STDDEV(price) FROM stocks;

Notes

The STDDEV function will work on the following data types: smallint, integer,
bigint, real, double precision, and numeric.

SUM

Description

The SUM function returns the aggregate sum of all the column or expression values
passed to it.

Input

SUM(col | expression)

Example

SELECT SUM(salary) FROM payroll WHERE checkdate='06-01-2001';

Notes
The SUM function will work on the following data types: smallint, integer, bigint,
real, double precision, numeric, and interval.

VARIANCE

Description

The VARIANCE function returns the variance (the square of the standard deviation)
of the supplied column or expression list.

Input

VARIANCE(col | expression)

Example

SELECT VARIANCE(price) FROM stocks;


Conversion Functions

PostgreSQL includes a number of conversion functions. These are used to convert


from one data type to another and to format specific output styles.

CAST

Description

The CAST function can be used to convert from one data type to another. Generally
speaking, CAST is a fairly generic and easy-to-use function that makes most data-
type conversions easy.

Inputs

CAST(value AS newtype)

value—The value that needs converting.

newtype—The new data type to convert to.

Examples

CAST('57' AS INT) 57
CAST(57 AS CHAR) '57'
CAST(57 AS NUMERIC(4,2)) 57.00
CAST('05-23-87' AS DATE) 1987-05-23

Notes

An additional way to perform type conversion is to separate the value and the
desired data type with double colons (::).

Then the preceding examples would appear as follows:

'57'::INT 57
57::CHAR '57'
57::NUMERIC(4,2) 57.00
'05-23-87'::DATE 1987-05-23

TO_CHAR
Description

The TO_CHAR function takes various input data types and converts them to a string
data type. In addition to performing a data conversion, the TO_CHAR function also
has extensive formatting capabilities to output the string in the exact format
desired.

Inputs

The TO_CHAR function shares a common usage pattern regardless of the data type
it is handling. All TO_CHAR functions accept two arguments; the first is the data to
be converted, and the second is a formatting template for PostgreSQL to use when
constructing the output. The following table illustrates this usage pattern.

Usage Description

TO_CHAR(int, texttemplate) Converts from an integer to a specific
string format.

TO_CHAR(numeric, texttemplate) Converts from a numeric to a specific
string format.

TO_CHAR(double precision, texttemplate) Converts from a double to a specific
string format.

TO_CHAR(timestamp, texttemplate) Converts from a timestamp to a specific
string format.

TO_CHAR with Numbers (Int, Numeric, or Double Precision)

Converting to a character string from a numerical data type uses the following
template mask for formatting output.

(In addition to the following specific formatting commands, the TO_CHAR function
will also blindly accept and display any text enclosed in double quotes. This can be
very helpful when trying to perform specific labeling of output data.)

Item Description

0 Leading zero

9 Digit placeholder

. Decimal point

, Thousands separator

G Group separator*

D Decimal point*

S Negative values with minus sign* (–)

PR Negative values in angle brackets* (<>)

L Currency symbol*

MI Minus sign in specified position (if n<0)

PL Plus sign in specified position (if n>0)

SG Plus or minus sign in specified position

RN Output Roman numeral (for 1 <= n <= 3999)

TH Convert to ordinal number

Vn Shift value by 10*n digits

* These items use the locale setting for your particular machine, so your results
might vary.
TO_CHAR with Date/Time Data Types

The TO_CHAR (and TO_DATE, TO_TIMESTAMP) function uses the following date-
time–related template mask for formatting output:

Item Description

SSSS Seconds past midnight (0–86399)

SS Second (00–59)

MI Minute (00–59)

HH Hour of day (01–12)*

HH12 Hour of day (01–12)

HH24 Hour of day (00–23)

AM or A.M. Meridian indicator (uppercase)

PM or P.M. Meridian indicator (uppercase)

am or a.m. Meridian indicator (lowercase)

pm or p.m. Meridian indicator (lowercase)

DAY Uppercase full day name (such as MONDAY)

Day Proper case full day name (such as Monday)

day Lowercase full day name (such as monday)

DY Uppercase abbreviated day name (such as MON)

Dy Proper case abbreviated day name (such as Mon)

dy Lowercase abbreviated day name (such as mon)

D Day of the week (1–7; SUN = 1)

DD Day of the month (01–31)

DDD Day of the year (001–366)

W Week of the month (1–5)

WW Week of the year (1–53; first week starts 01/01)

IW ISO week of year (1–53; first week starts on first Thursday of Jan)

MM Month (01–12)

MONTH Uppercase full month name (such as JUNE)

Month Proper case full month name (such as June)

month Lowercase full month name (such as june)

MON Uppercase abbreviated month name (such as JUN)

Mon Proper case abbreviated month name (such as Jun)

mon Lowercase abbreviated month name (such as jun)


Y Last digit of year (such as 1)

YY Last two digits of year (such as 01)

YYY Last three digits of year (such as 001)

YYYY Full year (four and more digits) (such as 2001)

Y,YYY Full year (four and more digits) (such as 2,001)

CC Century (such as 20)

BC or B.C. Era indicator (uppercase)

bc or b.c. Era indicator (lowercase)

AD or A.D. Era indicator (uppercase)

ad or a.d. Era indicator (lowercase)

J Julian day (days since 01/01/4712 BC)

Q Quarter

RM Uppercase month in Roman numeral (I = Jan)

rm Lowercase month in Roman numeral (I = Jan)

TZ Uppercase time zone

tz Lowercase time zone


* These items use the locale setting for your particular machine, so your results
might vary.

Examples

Numerical and date/time examples are given in the following sections.

TO_CHAR Numerical Examples

Input Output

TO_CHAR(123,'999') 123

TO_CHAR(123,'99 9') 12 3

TO_CHAR(123,'0999') 0123

TO_CHAR(123,'999.9') 123.0

TO_CHAR(1234,'9,999') 1,234

TO_CHAR(1234,'9G999') 1,234

TO_CHAR(1234.5,'9999D99') 1234.50

TO_CHAR(123,'999PL') 123+

TO_CHAR(123,'PL123') +123

TO_CHAR(-123,'999MI') 123-

TO_CHAR(-123,'MI123') -123

TO_CHAR(123,'SG123') +123
TO_CHAR(-123,'SG123') -123

TO_CHAR(-123,'999PR') <123>

TO_CHAR(123,'RN') CXXIII

TO_CHAR(32, '99TH') 32nd

TO_CHAR(123,'9"Hundred and"99') 1 Hundred and 23

TO_CHAR Date/Time Examples

Input Output

TO_CHAR('November 1 2001', 'MM"--"DD"--"YY') 11--01--01

TO_CHAR('Jun 22 2001', '"Year "YYYY" Day "DDD') Year 2001 Day 173

Notes

Any items in double quotes are not interpreted as template codes and are output
literally. Therefore, to output reserved template words, simply enclose them in
double quotes (that is, "YYYY" outputs as YYYY).

Special characters like the backslash (\) can be achieved by enclosing them in
quotes and doubling them (that is, "\\" becomes "\" on output).

The preceding templates are used in many other TO_-style functions (that is,
TO_DATE, TO_NUMBER, and so on).

TO_DATE

Description

The TO_DATE function converts a text string to a date value. The TO_DATE
function takes two arguments; the first is the string to be converted, and the second
is a text template that specifies how the input string is to be interpreted.

Input

TO_DATE(text, texttemplate)

Example

TO_DATE('01 01 2001', 'MM DD YYYY') 2001-01-01

Notes

There are a number of options that the text template string can take. Refer to
TO_CHAR for a full listing of the options that the date-time template can take.

TO_NUMBER

Description

The TO_NUMBER function is used to convert from character input strings to a
numeric output. The TO_NUMBER function accepts two input arguments; the first is
the text to be converted, and the second is the text template that specifies how the
input string is formatted.

Input

TO_NUMBER(text, texttemplate)

Examples

TO_NUMBER('1,234,567','9G999G999') 1234567
TO_NUMBER('1234.50','9999D99') 1234.5

Notes

The text template of the TO_NUMBER function accepts a number of options. For a
full listing of supported layout options, refer to the TO_CHAR function.

TO_TIMESTAMP
Description

The TO_TIMESTAMP function is used to convert from a string format to a timestamp
data type. The TO_TIMESTAMP function accepts two arguments; the first is the
string to be converted, and the second is a date-time template that describes the
format of the input string.

Input

TO_TIMESTAMP (text, texttemplate)

Example

TO_TIMESTAMP('05 December 2001','DD MM YYYY') 12 05 2001

Notes

The date-time template accepts many options for formatting output. Refer to the
TO_CHAR function for a full list of valid date-time formatting options.
Geometric Functions

Somewhat unique among popular RDBMSs, PostgreSQL provides an extensive set of


geometric-oriented functions. These functions and their associated operators are
useful when performing calculations on spatial-related data sets.

AREA

Description

The AREA function computes the area occupied by a given object.

Input

AREA(obj)

Example

AREA(box '((1,1),(3,3))') 4

BOX

Description

There are several versions of the BOX function. Most perform conversions from other
geometric types to the box data type. However, if the BOX function is passed two
overlapping boxes, the result will be a box that represents where the intersection
occurs.

Inputs

BOX(box,box)—Perform an intersection.

BOX(circle)— Convert a circle to a box.

BOX(point,point)—Convert from points to a box.

BOX(polygon)—Convert from a polygon to a box.

Examples
BOX(box'((1,1),(3,3))', box'((2,2),(4,4))') BOX'(3,3),(2,2)'
BOX(circle'(0,0),2') BOX'(1.41, 1.41), (-1.41, -1.41)'
BOX(point'(0,0)', point'(1,1)') BOX'(1,1),(0,0)'
BOX(polygon'(0,0),(1,1),(1,0)') BOX'(1,1),(0,0)'

CENTER

Description

The CENTER function returns the center point of the object passed to it.

Input

CENTER(obj)

Example

CENTER(box'(0,0),(1,1)') point'(.5,.5)'

CIRCLE

Description

The CIRCLE function converts from box data types to a circle.

Input

CIRCLE(box)—Convert from box to circle.

Example

CIRCLE(box'(0,0),(1,1)') CIRCLE'(.5,.5),.707107…'

DIAMETER

Description

The DIAMETER function returns the diameter of a supplied circle.


Input

DIAMETER(circle)

Example

DIAMETER(circle'((0,0),2)') 4

HEIGHT

Description

The HEIGHT function is used to compute the vertical height of a supplied box.

Input

HEIGHT(box)

Example

HEIGHT(box'(0,0),(3,3)') 3

ISCLOSED

Description

The ISCLOSED function returns a Boolean value: true if the supplied path is closed,
false if it is open.

Input

ISCLOSED(path)

Example

ISCLOSED(path'(0,0),(1,1),(1,0),(0,0)') t

ISOPEN
Description

The ISOPEN function returns a Boolean value: true if the supplied path is open,
false if it is closed.

Input

ISOPEN(path)

Example

ISOPEN(path'(0,0),(1,1),(1,0),(0,0)') f

LENGTH

Description

The LENGTH function returns the length of the supplied lseg.

Input

LENGTH(lseg)

Example

LENGTH(lseg'(0,0),(1,1)') 1.4142135623731

Notes

If the LENGTH function is passed a BOX data type, it will interpret the opposite
corners of the box as the lseg to compute.

LSEG

Description

The LSEG function converts from either a box or a pair of points to an lseg data
type.
Inputs

LSEG(box)
LSEG(point,point)

Example

LSEG(box'(0,0),(1,1)') LSEG'(1,1),(0,0)'

NPOINTS

Description

The NPOINTS function returns the number of points that compose the supplied path.

Inputs

NPOINTS(path)
NPOINTS(polygon)

Example

NPOINTS(path'(0,0),(1,1)') 2

PATH

Description

The PATH function converts from a polygon to a path.

Input

PATH(polygon)

Example

PATH(polygon'(0,0),(1,1),(1,0)') PATH'((0,0),(1,1),(1,0))'

Notes
Notice the closed representation "(" in the example provided. For more information
on open or closed path representation, refer to the PATH data type.

PCLOSE

Description

The PCLOSE function converts a path to the closed representation of a path.

Input

PCLOSE(path)

Example

PCLOSE(path'(0,0),(1,1),(1,0)') PATH'((0,0),(1,1),(1,0))'

Notes

See the PATH data type for more information on how paths are represented as being
open or closed.

POINT

Description

The POINT function provides various geometric services, depending on the supplied
object type.

Inputs

POINT(circle)—Return the center of the supplied circle.

POINT(lseg, lseg)—Return the intersection of the supplied lsegs.

POINT(polygon)—Return the center of the supplied polygon.

Examples

POINT(circle'((0,0),2)') POINT'(0,0)'
POINT(polygon'(0,0),(1,1),(1,0)') POINT'(.666…,.333…)'
POLYGON

Description

The POLYGON function converts various geometric types to a polygon.

Inputs

POLYGON(box)—Convert a box to a 12-point polygon.

POLYGON(circle)—Convert a circle to a 12-point polygon.

POLYGON(n, circle)—Convert a circle to an n-point polygon.

POLYGON(path)—Convert a path to a polygon.

Examples

POLYGON(4, circle'((0,0),4)')
POLYGON'((-4,0),(0,4),(4,0),(0,-4))' (values approximate)

POPEN

Description

The POPEN function converts a path to an open path.

Input

POPEN(path)

Example

POPEN(path'(0,0),(1,1),(1,0)') PATH'[(0,0),(1,1),(1,0)]'

Notes

Notice the open representation of the returned path. For more information on open or
closed path representations, refer to the PATH data type.

RADIUS
Description

The RADIUS function returns the radius of a supplied circle.

Input

RADIUS(circle)

Example

RADIUS(circle'((0,0),2)') 2

WIDTH

Description

The WIDTH function returns the horizontal size of a supplied box.

Input

WIDTH(box)

Example

WIDTH(box'(0,0),(2,2)') 2
Network Functions

PostgreSQL includes many functions that are network oriented. Primarily, these are
useful for performing calculations and transformations of IP-related data. The
following sections discuss the included network functions in PostgreSQL.

ABBREV

Description

The ABBREV function returns an abbreviated text format for a supplied inet or
cidr value.

Input

ABBREV(inet | cidr)

Example

ABBREV('192.168.0.0/24') "192.168/24"

BROADCAST

Description

The BROADCAST function returns the broadcast address of the supplied inet or
cidr value.

Input

BROADCAST(inet | cidr)

Example

BROADCAST('192.168.0.1/24') '192.168.0.255/24'

HOST

Description
The HOST function extracts the host address for the supplied inet or cidr value.

Input

HOST(inet | cidr)

Example

HOST('192.168.0.101/24') '192.168.0.101'

MASKLEN

Description

The MASKLEN function extracts the netmask length for the supplied inet or cidr
value.

Input

MASKLEN(inet | cidr)

Example

MASKLEN('192.168.0.1/24') 24

NETMASK

Description

The NETMASK function calculates the netmask for the supplied inet or cidr value.

Input

NETMASK(inet | cidr)

Example

NETMASK('192.168.0.1/24') '255.255.255.0'
NETWORK

Description

The NETWORK function extracts the network from a supplied inet or cidr value.

Input

NETWORK(inet | cidr)

Example

NETWORK('192.168.0.155/24') '192.168.0.0/24'

TEXT

Description

The TEXT function returns the IP and netmask length as a text value.

Input

TEXT(inet | cidr)

Example

TEXT(CIDR '192.168.0.1/24') "192.168.0.1/24"

TRUNC

Description

The TRUNC function sets the last 3 bytes to zero for the supplied macaddr value.

Input

TRUNC(macaddr)
Example

TRUNC(macaddr '33:33:33:33:33:aa') '33:33:33:00:00:00'

Notes

This function is useful for associating a supplied MAC address with a manufacturer.
See the directory $SOURCE/contrib/mac (SOURCE is the location of the
PostgreSQL source code) for more information.
Numerical Functions

PostgreSQL includes many functions that assist in performing numerical


calculations. The following sections discuss the included numerical functions in
PostgreSQL.

ABS

Description

The ABS function returns the absolute value of a supplied number.

Input

ABS(num)

Examples

ABS(-7) 7
ABS(-7.234) 7.234

Notes

The ABS function's return value is the same data type that it is passed.

ACOS

Description

The ACOS function returns an inverse cosine.

Input

ACOS(num)

ASIN

Description
The ASIN function returns an inverse sine.

Input

ASIN(num)

ATAN

Description

The ATAN function returns an inverse tangent.

Input

ATAN(num)
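The inverse trigonometric functions take and return values in radians; for example (results approximate):

```sql
SELECT ACOS(1);  -- 0
SELECT ASIN(1);  -- ~1.5707963 (pi/2)
SELECT ATAN(1);  -- ~0.7853982 (pi/4)
```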

ATAN2

Description

The ATAN2 function returns an inverse tangent of y/x.

Input

ATAN2(y,x)
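Unlike ATAN, ATAN2 uses the signs of both arguments to determine the correct quadrant; for example (results approximate):

```sql
SELECT ATAN2(1, 1);    -- ~0.7853982  (pi/4)
SELECT ATAN2(-1, -1);  -- ~-2.3561945 (-3*pi/4)
```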

CBRT

Description

The CBRT function returns the cube root of the supplied number.

Input

CBRT(num)

Example
CBRT(27) 3

CEIL

Description

The CEIL function returns the smallest integer not less than the supplied value.

Input

CEIL(num)

Example

CEIL(-22.2) -22

COS

Description

The COS function returns a cosine value.

Input

COS(num)

COT

Description

The COT function returns a cotangent value.

Input

COT(num)

DEGREES

Description
The DEGREES function converts from radians to degrees.

Input

DEGREES(num)

Example

DEGREES(PI()/2) 90

EXP

Description

The EXP function returns e raised to the power of the supplied value.

Input

EXP(num)

Example

EXP(0) 1.0

FLOOR

Description

The FLOOR function returns the largest integer not greater than the supplied value.

Input

FLOOR(num)

Example

FLOOR(-22.2) -23
LN

Description

The LN function performs a natural logarithm on the supplied number.

Input

LN(num)

Example

LN(100) 4.6051701860

LOG

Description

The LOG function performs a standard base-10 logarithm on the supplied value.

Input

LOG(num)

Example

LOG(100) 2.0

PI

Description

The PI function returns the standard pi value.

Inputs

None.
Example

PI() 3.14159265358979

POW or POWER

Description

The POW function raises a number by the specified exponent.

Inputs

POW(num, exp)

num—The number on which to perform the exponentiation.

exp—The power to which to raise the number.

Examples

POW(2,2) 4.0
POW(2,3) 8.0

RADIANS

Description

The RADIANS function converts the supplied degrees to radian units.

Input

RADIANS(num)

Example

RADIANS(90) 1.5707963267949

RANDOM
Description

The RANDOM function returns a pseudorandom number between 0.0 and 1.0.

Inputs

None.

Example

RANDOM() .654387
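Because RANDOM returns a fractional value, it is commonly combined with FLOOR to produce random integers, or used in ORDER BY to shuffle rows. A sketch:

```sql
-- A pseudorandom integer from 1 to 10
SELECT FLOOR(RANDOM() * 10) + 1;

-- Return the rows of a table in random order
SELECT * FROM stocks ORDER BY RANDOM();
```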

ROUND

Description

The ROUND function rounds a number to the specified decimal places.

Inputs

ROUND(num, dec)

num—The number to manipulate.

dec—An integer that represents the number of decimal places.

Examples

ROUND(1.589, 1) 1.6
ROUND(1.589, 2) 1.59

SIN

Description

The SIN function performs a sine function on the supplied value.

Input
SIN(num)

SQRT

Description

The SQRT function returns the square root of the supplied value.

Input

SQRT(num)

Example

SQRT(9) 3

TAN

Description

The TAN function performs a tangent calculation on the supplied number.

Input

TAN(num)

TRUNC

Description

The TRUNC function truncates a number to the specified number of decimal places
without rounding.

Inputs

TRUNC(num [, dec])

num—The number on which to perform the truncate.


dec—The number of decimal places to leave remaining, if supplied; otherwise,
truncate all decimal places.

Examples

TRUNC(1.589999, 2) 1.58
TRUNC(1.589999) 1
SQL Functions

PostgreSQL includes several functions that return values based on expressions


supplied in the current SQL statement. Moreover, these functions are not constrained
to acting on specific data types; rather, they act as control structures within a SQL
statement.

CASE WHEN

Description

The CASE WHEN function is a simple conditional evaluation tool. Most programming
languages contain similar constructs. It can be thought of as analogous to the
ubiquitous IF…THEN…ELSE statement.

Inputs

CASE WHEN condition THEN result


[ WHEN condition THEN result ]

[ ELSE result ]
END

Example

This example shows a classic IF…THEN…ELSE paradigm in which the CASE WHEN
function can be used. The age of an employee is compared against certain constants,
and the possible outputs of minor, adult, or unknown are returned depending on
their age.

SELECT name, age,


CASE WHEN age<18 THEN 'minor'
WHEN age>=18 THEN 'adult'
ELSE 'unknown'
END
FROM employees;
name age case
-------------------
Bill 13 minor
Timmy 7 minor
Pam 25 adult
Barry NULL unknown
COALESCE

Description

The COALESCE function accepts an arbitrary number of input arguments and returns
the first one that is evaluated as NOT NULL. The COALESCE function is very useful
for providing display defaults for arbitrary data sources.

Input

COALESCE(arg1, …, argN)

Example

Return a default message to the user:

SELECT COALESCE(book.title, book.description, 'Not Available') FROM book;

NULLIF

Description

The NULLIF function accepts two arguments. It returns a NULL value only if the
value of both arguments is equal. Otherwise, it returns the value of the first
argument.

Input

NULLIF(arg1, arg2)

Example

In this case, the first value will be returned because the values are not equal:

SELECT NULLIF('hello', 'world');

----------------
'hello'

However, when the values are equal, a NULL value is returned:


SELECT NULLIF('hello', SUBSTR('helloworld',1,5));

NULL

Notes

The NULLIF function behaves roughly as the inverse of the COALESCE function.


It is useful for exception testing, in which a variable is being tested against a known
value. If the variable equals the known value, nothing is returned. However, if the
values do not match, the value of the evaluated variable is returned instead.
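A common use of NULLIF is guarding against division by zero, because dividing by NULL yields NULL rather than an error. A sketch, assuming a hypothetical orders table with total and quantity columns:

```sql
-- NULL instead of an error when quantity is 0
SELECT total / NULLIF(quantity, 0) FROM orders;
```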
String Functions

PostgreSQL includes several functions that modify string-related data. These


functions are particularly useful for controlling output displays and/or normalizing
data input.

ASCII

Description

The ASCII function returns the ASCII value for the supplied character.

Inputs

ASCII(chr)

chr—The character to determine the ASCII value of.

Examples

ASCII('A') 65
ASCII('Apple') 65

Notes

In the case of multiple characters being supplied to the ASCII function, only the first
is evaluated.

CHR

Description

The CHR function returns the character that corresponds to the ASCII value
provided.

Inputs

CHR(val)

val—An ASCII value.


Example

CHR(65) 'A'

INITCAP

Description

The INITCAP function returns the supplied string or column with the first character
of each word converted to uppercase and the remaining characters to lowercase.

Inputs

INITCAP(col)

Or

INITCAP(string)

Example

SELECT INITCAP(name) AS Proper_Name FROM authors;

Proper_Name
--------
Bill
Bob
Sam

LENGTH, CHAR_LENGTH, or CHARACTER_LENGTH

Description

The LENGTH (or CHAR_LENGTH, or CHARACTER_LENGTH) function returns the
number of characters in the supplied string or column.

Inputs

LENGTH(col)

col—A column containing a string data type.


Example

SELECT name FROM authors WHERE LENGTH(name)<4;

Name
------
Pam
Sam
Sue
Bob

LOWER

Description

The LOWER function forces a string or column to be returned in lowercase only.

Inputs

LOWER(col)

Or

LOWER(string)

Example

SELECT LOWER(name) AS Low_Name FROM authors;

Low_Name
--------
bill
bob
sam

LPAD

Description

The LPAD function left-pads a string with specified characters or spaces.

Inputs
LPAD(str, len [, fill])

str—The string to left-pad.

len—The total length of the string to return; if str is longer than len, it is
truncated on the right.

fill—By default, a space; however, any characters can be specified.

Examples

LPAD('Hello', 7) '  Hello'

LPAD('ello', 5, 'H') 'Hello'

LTRIM

Description

The LTRIM function removes the specified characters from the left side of a
character string.

Inputs

LTRIM(str [,trim])

str—The string to trim.

trim—By default, a space; however, any character(s) can be specified.

Examples

LTRIM(' Hello') 'Hello'


LTRIM('HHHello', 'H') 'ello'

OCTET_LENGTH

Description

The OCTET_LENGTH function returns the length of a column or string, including any
multibyte data present.

Inputs
OCTET_LENGTH(col)

Or

OCTET_LENGTH(string)

Example

SELECT OCTET_LENGTH('Hello World');

Octet_Length
11

Notes

OCTET_LENGTH and LENGTH will often return the same value. However, a crucial
difference is that OCTET_LENGTH is actually returning the number of bytes in a
string. This can be an important difference if multibyte information is being stored.

POSITION

Description

The POSITION function returns an integer that represents the position of the
supplied character string in the given column (or supplied string).

Inputs

POSITION(str IN col)

str—The character string to locate.

col—The column or string to perform the search on.

Example

Return the names from the table authors where the second letter is an 'a':

SELECT name FROM authors WHERE POSITION('a' IN name)=2;

Name
------
Pam
Sam
Tammy
Barry

STRPOS

Description

The STRPOS function returns an integer that represents the position of a specific
character string in a given column (or supplied string).

Inputs

STRPOS(col, str)

col—The column or string to perform the search on.

str—The character string to locate.

Example

See the examples in the POSITION function section.

Notes

This command is essentially the same as the POSITION function.
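Because STRPOS takes its arguments directly rather than using the IN syntax, the earlier POSITION example could be written as follows:

```sql
-- Names whose second letter is 'a'
SELECT name FROM authors WHERE STRPOS(name, 'a') = 2;

SELECT STRPOS('Hello World', 'World');  -- 7
```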

RPAD

Description

The RPAD function right-fills the specified string with spaces or characters.

Inputs

RPAD(str, len [,fill])

str—The string to right-pad.

len—The total length of the string to return; if str is longer than len, it is
truncated.

fill—By default, a space; however, any character can be used.

Examples

RPAD('Hello', 7) 'Hello  '

RPAD('Hello', 8, '!') 'Hello!!!'

RTRIM

Description

The RTRIM function removes the specified characters from the right side of a
character string.

Inputs

RTRIM(str [,trim])

str—The string to right-trim.

trim—By default, a space; however, any character(s) can be used.

Examples

RTRIM('Hello ') 'Hello'


RTRIM('Hello!!!', '!') 'Hello'

SUBSTRING

Description

The SUBSTRING function extracts a specified portion from an existing character


string.

Inputs

SUBSTRING(str FROM pos [ FOR len])

str—The string to manipulate.

pos—The starting position to begin extraction.


len—By default, the rest of the string is assumed; however, a specific portion can
be specified.

Examples

SUBSTRING('Hello' FROM 2) 'ello'


SUBSTRING('Hello' FROM 2 FOR 2) 'el'

Notes

This is the same as the SUBSTR function.

SUBSTR

Description

The SUBSTR function extracts a specified portion from an existing character string.

Inputs

SUBSTR(str, pos [, len])

str—The string to manipulate.

pos—The starting position to begin extraction.

len—By default, the rest of the string is assumed; however, a specific portion can
be specified.

Examples

SUBSTR('Hello', 2) 'ello'
SUBSTR('Hello', 2, 2) 'el'

Notes

This is the same as the SUBSTRING function.

TRANSLATE

Description
The TRANSLATE function performs a character-by-character search and replace on a
specified string: each character in the string that appears in the search set is replaced
by the character at the same position in the replacement set. See the following
example for more.

Inputs

TRANSLATE(str, searchset, replaceset)

str—The base string to search and modify.

searchset—Either a single character or a multicharacter search set.

replaceset—Each respective member in this set replaces a corresponding
member in the search set.

Examples

TRANSLATE('HelloW', 'W', '!') 'Hello!'


TRANSLATE('Hello', 'Ho', 'Jy') 'Jelly'

TRIM

Description

The TRIM function removes the specified character or whitespace from the left or
right (or both) of a given string.

Inputs

TRIM([ leading | trailing | both ] [trim] FROM str)

leading | trailing | both— The side from which to remove the specified
characters.

trim—By default, whitespace is assumed; however, any character(s) can be
specified.

str—The string to trim.

Examples

TRIM(both FROM ' Hello ') 'Hello'


TRIM(both '!' FROM '!!HELLO!!') 'HELLO'
UPPER

Description

The UPPER function forces a string or column to be returned in uppercase only.

Inputs

UPPER(col)

Or

UPPER(string)

Example

SELECT UPPER(name) AS Upper_Name FROM authors;

Upper_Name
--------
BILL
BOB
SAM
Time Functions

The following functions assist in performing calculations based on time- and date-related
material. These are useful in performing calculations and transformations of temporal-related
data sets.

AGE

Description

The AGE function returns an interval that represents the difference between the current time and
the time argument supplied.

Inputs

AGE(timestamp)—Computes the difference between the timestamp supplied and now().

AGE(timestamp, timestamp)—Computes the difference between the two supplied
timestamps.

Example

AGE('03-01-2001 15:56:00', '11-01-2001 14:22:00') '7 mons 30 days 22:26 ago'

CURRENT_DATE

Description

The CURRENT_DATE function returns the current system date.

Inputs

None.

Example

SELECT CURRENT_DATE; 2001-06-11

Notes

Notice that there are no trailing parentheses "()" with this function. This is to maintain SQL
compatibility.

CURRENT_TIME
Description

The CURRENT_TIME function returns the current system time.

Inputs

None.

Example

SELECT CURRENT_TIME; 22:10:31

Notes

Notice that there are no trailing parentheses "()" with this function. This is to maintain SQL
compatibility.

CURRENT_TIMESTAMP

Description

The CURRENT_TIMESTAMP function returns the current system date and time.

Inputs

None.

Example

SELECT CURRENT_TIMESTAMP; '2001-06-11 22:10:31-06'

Notes

Notice that there are no trailing parentheses "()" with this function. This is to maintain SQL
compatibility. This function is analogous to the NOW function.

DATE_PART

Description

The DATE_PART function extracts a specified section from the supplied date/time argument.

Inputs
DATE_PART(formattext, timestamp)
DATE_PART(formattext, interval)

formattext—One of the valid DATE_PART formatting options; see the following section.

timestamp/interval—The supplied time-related value.

DATE_PART Formatting Options

The following keywords are recognized as valid date-time elements available for extraction:

Item Description

millennium Extracts the year field divided by 1,000.

century Extracts the year field divided by 100.

decade Extracts the year field divided by 10.

year Extracts the year field.

doy The day of the year (1–366) (timestamp only).

quarter The quarter of the year (1–4) (timestamp only).

month The month of the year (1–12) (timestamp only). The number of remaining
months (interval only).

week The week number of the year (timestamp only).

dow The day of the week (0–6; 0 = Sunday) (timestamp only).

day The day of the month (1-31) (timestamp only).

hour The hour field (0–23).

minute The minutes field (0–59).

second The seconds field, including fractional (0–59.99).

milliseconds The seconds field, including fractional, multiplied by 1,000.

microseconds The seconds field, including fractional, multiplied by 1 million.

epoch The number of seconds since 01-01-1970 00:00 (timestamp). The total
number of seconds (interval).

Examples

DATE_PART('second', TIMESTAMP '06-01-2001 12:23:43') 43


DATE_PART('hour', TIMESTAMP '06-01-2001 12:23:43') 12

Notes

When using DATE_PART with interval data types, it is important to recognize that
DATE_PART will not do implicit calculations. DATE_PART only functions as an
extraction tool. For instance, if your interval is 1 month, DATE_PART will return
0 (zero) if you try to extract days.
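A minimal sketch of this behavior, assuming a psql session:

```sql
-- DATE_PART extracts fields; it does not convert between units.
SELECT DATE_PART('day', INTERVAL '1 month');    -- returns 0, not 30
SELECT DATE_PART('month', INTERVAL '1 month');  -- returns 1
```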

DATE_TRUNC

Description

The DATE_TRUNC function truncates the supplied timestamp to the specified precision.

Inputs

DATE_TRUNC(formattext, timestamp)

formattext—The precision value to which to truncate the timestamp; see the following valid
options for formatting.

timestamp—The supplied date/time value to truncate.

DATE_TRUNC Formatting Options

The following is a listing of the various levels of precision that DATE_TRUNC can operate on:

Item Description

millennium Truncates everything finer than the millennium.

century Truncates everything finer than the century.

decade Truncates everything finer than the decade.


year Truncates everything finer than the year.

month Truncates everything finer than the month.

day Truncates everything finer than the day.

hour Truncates everything finer than the hour.

minute Truncates everything finer than the minute.

second Truncates everything finer than the second.

milliseconds Truncates everything finer than milliseconds.

microseconds Truncates the microseconds.

Examples

DATE_TRUNC('hour', TIMESTAMP '2001-11-1 23:11:45')


TIMESTAMP '2001-11-1 23:00:00'

DATE_TRUNC('year', TIMESTAMP '11-1-2001 23:11:45')


TIMESTAMP '2001-01-01 00:00:00'

EXTRACT

Description

The EXTRACT function extracts the specified value from the supplied timestamp or interval.

Inputs

EXTRACT(formattext FROM timestamp)


EXTRACT(formattext FROM interval)

formattext—A valid date field. Refer to DATE_PART for a listing of valid format codes.

interval/timestamp—The supplied time value.

Example

EXTRACT('hour' FROM TIMESTAMP '2001-11-1 23:33:45') 23


Notes

The EXTRACT function performs like the DATE_PART function. Either syntax can be used
interchangeably.
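As a quick illustration of the interchangeable syntax, the following two queries return the same value:

```sql
SELECT EXTRACT(hour FROM TIMESTAMP '2001-11-01 23:33:45');  -- 23
SELECT DATE_PART('hour', TIMESTAMP '2001-11-01 23:33:45');  -- 23
```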

ISFINITE

Description

The ISFINITE function returns a Boolean value that indicates whether the supplied timestamp
or interval represents a finite amount of time.

Inputs

ISFINITE(timestamp)
ISFINITE(interval)

Example

ISFINITE(TIMESTAMP '2002-05-05 23:13:44') t

NOW

Description

The NOW function returns a timestamp that represents the current system time.

Inputs

None.

Example

SELECT now();
'2001-11-1 15:23:54-06'

Notes

The NOW function is conceptually the same as CURRENT_TIMESTAMP.

TIMEOFDAY

Description
The TIMEOFDAY function returns a high-precision date and time value.

Inputs

None.

Example

SELECT TIMEOFDAY(); 'Sun Mar 11 22:23:14.853452 2001 CST'

TIMESTAMP

Description

The TIMESTAMP function works as a conversion routine to convert either date or date and
time data types into a timestamp.

Inputs

TIMESTAMP(date)
TIMESTAMP(date, time)

Example

TIMESTAMP('06-01-2001','23:45:11') '2001-06-01 23:45:11'


User Functions

Several of the included functions in PostgreSQL deal with user and session issues.
The following sections discuss user-related functions.

CURRENT_USER

Description

The CURRENT_USER function returns the user ID being used for permission
checking.

Inputs

None.

Example

SELECT CURRENT_USER;
--------------
webuser

Notes

Currently, CURRENT_USER and SESSION_USER are the same, but in the future,
there might be a distinction as needed for programs running in a setuid mode.

Notice that the preceding function is not called with the trailing parentheses "()".
This is to maintain SQL compatibility.

SESSION_USER

Description

The SESSION_USER function returns the user ID that is currently logged into
PostgreSQL.

Inputs

None.
Example

SELECT SESSION_USER;
--------------
webuser

Notes

Currently, CURRENT_USER and SESSION_USER are the same, but in the future,
there might be a distinction as needed for programs running in a setuid mode.

Notice that the preceding function is not called with the trailing parentheses "()".
This is to maintain SQL compatibility.

USER

Notes

See CURRENT_USER.
Other Functions

Some of the included PostgreSQL functions do not fall neatly into a specific category.
This section outlines one example of this type of function.

ARRAY_DIMS

Description

The ARRAY_DIMS function returns the dimensions of an array field as a text value, such as [1:4] for a four-element array.

Input

ARRAY_DIMS (col)

Example

SELECT ARRAY_DIMS(testscore) FROM students;

array_dims
----------
[1:4]
Chapter 5. Other PostgreSQL Topics
PostgreSQL, like all RDBMSs, has specific ways in which common concepts such as
indexes and transaction control are implemented. Additionally, there are other
concepts that are unique to PostgreSQL.

This chapter contains information related to how PostgreSQL handles the following:

Arrays in fields

Inheritance

Indexes

Object identifiers (OIDs)

Multiversion Concurrency Control (MVCC)


Arrays in Fields

One of the nice features that PostgreSQL supports is the concept of fields that can
hold arrays. This enables multiple values of the same data type to be stored in a
single field.

To insert arrays into a field, the field should be marked as holding arrays during
table creation. After a field has been designated to hold arrays, data can be
inserted, selected, or updated into the array by referring to the specific array
element in the designated table.

Creating an Array

For example, let's create a table named students that has a four-element array field
to hold test scores:

CREATE TABLE students


(name char(20), testscore int[4]);

Notice that, at table creation time, a field is designated as being able to support
arrays by including brackets [] next to the data-type definition.

Utilizing Array Fields

When inserting, updating, or selecting data, the specific array element can be
chosen by referencing it explicitly. PostgreSQL begins numbering array elements at
1. Therefore, the first element in the array testscore would be referenced as
testscore[1].

Now let's insert some sample data into the table students. Notice that whole-array
values are written as literals enclosed in braces {}:

INSERT INTO students (name, testscore)


VALUES ('Bill', '{90,0,0,0}');

INSERT INTO students (name, testscore)


VALUES ('Sam', '{86,0,0,0}');

INSERT INTO students (name, testscore)


VALUES ('Pam', '{95,0,0,0}');

Notice that the array values are enclosed in braces {}. The same notation is used
when updating rows.

In fact, either the entire array can be replaced or just the specific element:

UPDATE students SET testscore='{95,86,0,0}' WHERE name='Pam';


Or you can just update the desired element:

UPDATE students SET testscore[3]=98 WHERE name='Pam';

Selecting specific elements can be done in the same way. For instance, to select all
students who scored higher than an 85 on the first three tests, you would use the
following:

SELECT * FROM students WHERE testscore[1]>85


AND testscore[2]>85 AND testscore[3]>85;

Multidimensional Arrays

PostgreSQL also enables multidimensional arrays to be created and used. Suppose
you wanted to track each student's test scores for each half of the school year. In
this case, it would be easy to create a multidimensional array to hold that
information:

CREATE TABLE students (name char(20), testscore int[2][4]);

You could then insert and access that information as before (notice the use of the
double braces):

INSERT INTO students VALUES ('Bill',


'{{75,85,99,68},{88,91,77,87}}');

Selection can then be made for the specific element in the specific multidimensional
array. For instance, to see who scored greater than a 90 on the third exam in the
second half of school:

SELECT * FROM students WHERE testscore[2][3]>90;

Extending Arrays

One caveat (or benefit) of the PostgreSQL array structure is that element sizes
within an array can be expanded dynamically. Although you might explicitly specify
a maximum array size during the table creation, this size can be altered by using
the UPDATE command.

For instance, here is an example of a table created with an array. The table then
utilizes the UPDATE command to extend the size of the array:

CREATE TABLE students (name char(20), testscore int[3]);

INSERT INTO students VALUES ('Bill', '{96,84,98}');

SELECT * FROM students;

Name testscore
-------------------
Bill {96,84,98}

UPDATE students SET testscore='{96,84,98,100}';

SELECT * FROM students;

Name testscore
-------------------
Bill {96,84,98,100}

Although this can be a useful feature, it can also be problematic unless used
carefully. It would be possible to end up with different rows that each have a
different number of array elements.

One useful function for dealing with arrays is the ARRAY_DIMS function. This
function returns the current number of elements in an array. Refer to the
ARRAY_DIMS function in Chapter 4, "PostgreSQL Functions."
Inheritance

PostgreSQL allows tables to inherit properties and attributes from other tables. This
is useful in cases in which many tables are needed to hold very similar information.
In these cases, it is often possible to create a parent table that holds the common
data structures, allowing the children to inherit these structures from the parent.

For instance, let's create a table called employees:

CREATE TABLE employees (name char(10), salary numeric(9,2));

Now you can create a specific table just for cooks, who happen to need all the
information that other employees need:

CREATE TABLE cooks (specialty char(10)) INHERITS(employees);

SELECT * FROM cooks;

name salary specialty


---------------------
(0 rows)
INSERT INTO cooks VALUES ('Bill', 877.50, 'Steak');

SELECT * FROM cooks;

name salary specialty


-----------------------------
'Bill' 877.50 'Steak'

The real power of inheritance lies in the capability to search parent tables for
information stored in child tables, without having to explicitly name the child table in
the query.

SELECT * FROM employees* WHERE name='Bill';

name salary specialty


-----------------------------
'Bill' 877.50 'Steak'

Notice that the preceding query includes an asterisk (*) after the table employees.
This is to tell PostgreSQL to extend its search to child tables as well.

Starting in PostgreSQL Version 7.1, all queries, by default, extend their searches to
child tables. There is no need to include the extra asterisk after the table name,
although that notational style is still supported.

To limit a query search to a particular table in Version 7.1, there are two options.
One is to set the configuration variable SQL_Inheritance to OFF. The second is
to use the keyword ONLY during a SELECT query, for instance:

SELECT * FROM ONLY employees WHERE name='Bill';

name salary specialty


---------------------
(0 rows)

Or, alternatively, you can use the SET command:

SET SQL_Inheritance TO OFF;

SELECT * FROM employees WHERE name='Bill';

name salary specialty


---------------------
(0 rows)

Although table inheritance is a powerful feature of PostgreSQL, there are some
limitations, chiefly in the area of conceptual planning.

Although not a limitation of PostgreSQL per se, unless table inheritance is carefully
planned, problems will arise. For instance, in the preceding examples, you are
assuming that every cook will also be an employee.

Certainly, it's possible that a new relationship could be formed that would not fall
under the category of employee. Perhaps volunteer or consultant would be
more appropriate for a given relationship. At this point, your previous database
schema is problematic and will need to be redone to fit more accurately. As
mentioned earlier, this is not an inherent problem with PostgreSQL; it just
underlines the need for careful planning when using inheritance.
PostgreSQL Indexes

Essentially, indexes help database systems search tables more efficiently. The
concept of indexes is widely supported among all the popular RDBMSs, and
PostgreSQL is no exception.

By default, PostgreSQL supports three types of indexes: B-Tree, R-Tree, and hash.
During index creation, the specific type of index required can be specified. Each
index type is best suited for a specific type of comparison.
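As a sketch, the index type is chosen with the USING clause of CREATE INDEX (the payroll and cities tables here are hypothetical):

```sql
-- B-Tree is the default when no USING clause is given.
CREATE INDEX name_btree_idx ON payroll (name);

-- R-Tree for geometric data; hash for = comparisons only.
CREATE INDEX boundary_rtree_idx ON cities USING rtree (boundary);
CREATE INDEX ssn_hash_idx ON payroll USING hash (ssn);
```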

A general rule of thumb when using indexes is to determine which queries your
database makes consistent use of. Essentially, an index should exist for every
WHERE criterion in frequently used queries.

B-Tree Indexes

The B-Tree index is based on an implementation of the high-concurrency Lehman-Yao
B-Trees. The B-Tree index is a fully dynamic index that does not need periodic
optimization.

This is the default index of which PostgreSQL most often makes use. In fact, if the
CREATE INDEX command is called with no specification of index types, a B-Tree
index will be generated.

B-Tree indexes are used whenever the following list of comparison operators is
employed:

<, <=, =, >=, >

Currently, B-Tree indexes are the only provided indexes that support multicolumn
indexing. Up to 16 columns can be aggregated into a B-Tree multicolumn index
(although this limit can be altered at compile time).

R-Tree Indexes

R-Tree indexes are especially suited for fast optimization of geometric and/or spatial
comparisons. R-Tree indexes are an implementation of Antonin Guttman's
quadratic-split algorithm. The R-Tree index is a fully dynamic index that does not need periodic
optimization.

R-Tree indexes are preferentially searched whenever one of the following
comparison operators is employed:

<<, &<, &>, >>, @, ~=, &&

Hash Indexes

The hash index is a standard hash index that is an implementation of Litwin's linear
hashing algorithms. This hash index is a fully dynamic index that does not need
periodic optimization.
Hash indexes can be used whenever the = operator is employed in a comparison.
However, there is no substantial evidence that a hash index is any faster than a B-
Tree index, as implemented in PostgreSQL. Therefore, in most cases, it is preferable
to use B-Tree for = comparisons.

Other Index Topics

There are also specific uses of indexes that should be talked about. Namely, they
can also be used on the output of functions and in multicolumn situations.

Functional Indexes

Often it is desirable to index the results of a function if a particular query makes
consistent use of it. Note that only functions of a single row's column values can
be indexed; aggregates such as MAX() cannot appear in an index definition. For
instance, if you were frequently performing case-insensitive matches on a field, it
would be beneficial to create a separate index that contains the output from the
LOWER function:

CREATE INDEX lower_name_idx ON payroll (LOWER(name));

This will result in much faster queries when selection criteria such as WHERE
LOWER(name)='bill' are used.

Functional indexes are not suited for use in multicolumn indexes.

Multicolumn Indexes

The B-Tree indexing scheme utilized by PostgreSQL can support multicolumn
indexes up to 16 columns wide. (This is a compile-time option.) Most frequently,
multicolumn indexes are used only when the AND operator is used within a query.

For instance:

CREATE INDEX name_ssn_idx ON payroll (name, ssn);

SELECT * FROM payroll WHERE name='Bill' AND ssn='555-55-5555';

This would make use of the multicolumn name_ssn_idx. However, the following
would not:

SELECT * FROM payroll WHERE name='Bill' OR ssn='555-55-5555';

Multicolumn indexes generally should be used sparingly. Most often, single-column
indexes are more efficient in terms of speed and storage size than are multicolumn
indexes.

However, multicolumn indexes can be used with great effect to aggregate unique
row keys for tables that contain a lot of similar information. This is generally used to
enforce data integrity. For instance, suppose a table made use of the following
fields:

Age Height Name

It would be difficult to enforce a unique constraint on any individual field. After
all, there will be many people who are 5'10" or who are 25 years old, and so on.
However, there will be decidedly fewer people who are 5'10", 25 years old, and
named Bill Smith. This combination is a good candidate for a multicolumn index
with a unique constraint to enforce data integrity.
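Such a constraint might be sketched as follows (the people table is hypothetical):

```sql
-- No single column is unique on its own, but the combination can be.
CREATE UNIQUE INDEX person_uniq_idx ON people (age, height, name);
```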

Primary Key versus Unique Key Indexes

A common cause of confusion is the fact that there are two key types that, on the
surface, seem to perform the same function.

Both primary and unique keys make use of indexes that enforce rules requiring field
values to be unique to the table. However, there are some important distinctions
and finer differences between them:

Primary keys are mainly used to relate a field value to a specific row (OID) in a
table. This is why primary keys can be used as relational keys when used in
conjunction with foreign tables. Additionally, primary keys will not allow NULL
values to be entered.

Unique keys do not relate a field value to a specific row; they just enforce a
uniqueness clause on the specified column. Although this is useful for
maintaining data integrity, it is not necessarily as useful for foreign table
relations as primary keys are. Moreover, unique keys will generally allow NULL
values to be inserted.

There is no functional difference between a UNIQUE NOT NULL constraint and
a primary key. The keyword PRIMARY KEY exists to serve as a mnemonic
device to remind the user of the purpose for the index constraint.

Here's a basic example of the differences: Suppose you have two fields that are
important in an employee table. One field is for the employee_id, which is
assigned by the system, and the other is an SSN used by humans for data input and
so on.

In this scenario, the employee_id should be designated as a primary key, and the
SSN should be designated as a unique key.
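That scenario might be declared as follows (a sketch; the column types are assumptions):

```sql
CREATE TABLE employees
  (employee_id serial PRIMARY KEY,  -- system-assigned relational key; NULLs rejected
   ssn char(11) UNIQUE,             -- human-entered value; uniqueness enforced
   name char(20));
```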
OIDs

PostgreSQL makes use of object identifiers (OIDs) and temporary identifiers (TIDs)
to correlate table rows with system tables and temporary index entries.

Every row inserted into a table in PostgreSQL has a unique OID associated with it.
In fact, every table in PostgreSQL contains a hidden column named oid. For
instance:

SELECT * FROM authors;

Name Title
---------------------------------------------------
Bill Smith Cooking for 6 Billion
Sam Jones Chicken Soup for the Publishers Soul
SELECT oid, * FROM authors;

Oid Name Title


------------------------------------------------------------
17887 Bill Smith Cooking for 6 Billion
18758 Sam Jones Chicken Soup for the Publishers Soul

The key concept to understand with OIDs is that they are not sequential within a
table. OIDs are issued for every row item in the entire database; they are not
specifically constrained to one table. Therefore, any one table will never contain a
sequential ordering of OIDs. The SERIAL data type or an autonumbering SEQUENCE
is best suited for that type of application.

By default, PostgreSQL reserves the OIDs from 0 to 16384 for system-only use.
Therefore, user table-rows will always be assigned an OID greater than this.

PostgreSQL also uses TIDs to make dynamic relations between rows of data and
index entries. This value fluctuates and is only used for internal system purposes.

A common question is how to create an exact copy of a table, including the original
OIDs. This is made possible by utilizing the OID data type provided by PostgreSQL.

For instance:

CREATE TABLE new_authors
(orig_oid oid, name char(10), title char(30));

INSERT INTO new_authors SELECT oid, name, title FROM authors;

COPY new_authors TO '/tmp/newauth';

DELETE FROM new_authors;


COPY new_authors WITH OIDS FROM '/tmp/newauth';
Multiversion Concurrency Control

PostgreSQL makes use of Multiversion Concurrency Control (MVCC) to maintain data
consistency. Understanding, at least in concept, how MVCC works can be beneficial
to a PostgreSQL administrator or developer.

Most popular RDBMSs make use of table or row locks to maintain database
consistency. Typically, these locks occur at the physical level of the file. These locks
are used to prohibit two or more instances from writing to the same row (or table)
concurrently.

PostgreSQL uses a more advanced method for ensuring database integrity. In MVCC,
each transaction sees a version of the database as it existed at some near point in
the past. It is important to distinguish that the transaction is not seeing the actual
data, just a previous version of that data.

This prevents the current transaction from having to deal with the database arriving
at an inconsistent state due to other concurrent database transactions. In essence,
once a transaction is started, that transaction is an island unto itself. The underlying
data structures are isolated from other transactions' manipulations. Once that
transaction has ended, the changes it made to its specific version of the database
are merged back into the actual data structures.
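The effect can be sketched with two concurrent psql sessions (the accounts table and values are hypothetical):

```sql
-- Session A                        -- Session B
BEGIN;
SELECT balance FROM accounts
  WHERE id = 1;                     -- sees 100
                                    BEGIN;
                                    UPDATE accounts SET balance = 50
                                      WHERE id = 1;
                                    COMMIT;
SELECT balance FROM accounts
  WHERE id = 1;   -- SERIALIZABLE: still 100; READ COMMITTED: now 50
COMMIT;
```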

There are three types of concurrency problems that any RDBMS has to deal with:

Dirty reads. A transaction reads data written by another, uncommitted
transaction.

Phantom reads. A transaction re-executes a query and receives a different
result set because the underlying data has changed.

Nonrepeat reads. A transaction rereads data and finds data that has
changed due to another transaction having been committed since the first read
occurred.

PostgreSQL provides READ COMMITTED and SERIALIZABLE isolation levels, which
offer the following protection:

Level Dirty Read Phantom Read Nonrepeat Read

READ COMMITTED Not possible Possible Possible

SERIALIZABLE Not possible Not possible Not possible

READ COMMITTED Level

This is the default isolation level in PostgreSQL. Under READ COMMITTED, each
query sees only data committed before the query began; it never sees uncommitted
changes made by concurrent transactions. The transaction will, however, see
changes it made itself earlier in the same transaction.

The crucial point in understanding READ COMMITTED isolation is what happens
when a target row is concurrently modified by an UPDATE, DELETE, or SELECT FOR
UPDATE command in another transaction. Instances such as these provide only
partial isolation support. The following steps outline how such a scenario can occur:

1. Transaction A will wait for Transaction B to complete.

2. If Transaction B issues a ROLLBACK, then Transaction A will continue as usual.

3. If Transaction B completes with a COMMIT, then Transaction A will re-execute
its query to make certain that no criterion has changed that would result in it
not needing to be run (for example, if the row was deleted by Transaction B).

4. If the row still matches the criterion, then the update will continue. (Note: See
the paragraph following this list.)

5. The row is then updated by Transaction A as well, and other waiting statements
in Transaction A will continue to execute.

The important point to notice is what happens in step 4. At this point, Transaction A
has a new version of the database it is using. This occurred when Transaction A re-
executed its query. At that point, it was using a new version of the database as its
baseline. Therefore, subsequent statements in Transaction A will operate on the
changes made by Transaction B. So, in this way, the transaction isolation is only
partial. It is possible for transactions to "seep" into each other in specific cases like
these.
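The steps above can be sketched with two sessions (the accounts table is hypothetical):

```sql
-- Session A (READ COMMITTED)       -- Session B
BEGIN;                              BEGIN;
                                    UPDATE accounts
                                      SET balance = balance - 100
                                      WHERE id = 1;
UPDATE accounts
  SET balance = balance + 10
  WHERE id = 1;  -- blocks, waiting on B
                                    COMMIT;
-- A's UPDATE re-evaluates its WHERE clause against B's committed row
-- version, then applies on top of it (step 4 above).
COMMIT;
```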

SERIALIZABLE Level

This isolation level differs from READ COMMITTED in that each transaction must
occur in a scheduled serial manner. No transactions can occur that would result in
one transaction acting on another transaction's modifications.

This is enforced by strict transaction scheduling. This scheduling mandates that if
one transaction completes successfully and thereby contaminates another
transaction's read buffer, then the second transaction issues a ROLLBACK
automatically.

The practical effect of this is that the database system must be constructed in such
a way as to expect transaction failures and retry them afterward. On a heavily used
system, this could mean that a significant percentage of transactions would fail due
to this strict scheduling. Such a burden could make the system much slower than
would occur under a straight READ COMMITTED database.

In the majority of cases, the READ COMMITTED isolation level is appropriate.
However, there might be some queries that mandate such a rigorous approach to
ensure data validity.
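The isolation level is selected per transaction; a minimal sketch:

```sql
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- queries here run under strict serial scheduling and may need
-- to be retried if the transaction is rolled back
COMMIT;
```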
Part III: PostgreSQL Administration


6 User Executable Files

7 System Executable Files

8 System Configuration Files and Libraries

9 Databases and Log Files

10 Common Administrative Tasks


Chapter 6. User Executable Files
PostgreSQL includes a number of executable files that aid in configuration and
administration of the database system. Although this chapter is labeled as "User
Executable Files," do not confuse this terminology to mean "executed by any users."
Usually, these files can only be run from the postgres DBA account. Additionally,
these files can be executed from a client machine and do not necessarily need to be
executed from the same physical machine that holds the backend database system.

Most of these utilities perform operations that could also be performed by executing
a series of SQL commands. They have been included as standalone executables to aid
the DBA in performing routine system tasks.

In the following documentation, the location of the file is noted. The two most
common forms of PostgreSQL installation are installation either from source code or
as part of an RPM package. The appropriate installation type is included in the
"Notes" section of each of the following commands.
Alphabetical Listing of Files

createdb

Description

The createdb utility is a command-line alternative to the CREATE DATABASE SQL
command.

Usage/Options

createdb [options] name [comment]

Option Description

-e, --echo Echoes backend messages to stdout.

-E, --encoding type Character encoding scheme.

-h, --host host The hostname where the server resides.

-p, --port port The port or socket file of the listening server.

-q, --quiet Do not return any responses from the back end.

-U, --username user Connects as this username.

-W, --password Prompts for the password.

-D, --location path Alternate path to the database location.

name The name of the database to create.

comment Description or explanation of the database.


Examples

$ createdb mydatabase 'My database for holding records'
$ createdb -h db.somewebsite.com -p 9333 mydatabase
$ createdb -D /usr/local/mydb mydatabase

Notes/Location

createdb relies on the psql command to actually perform the database creation.
Therefore, psql must be present and able to be executed in order for createdb to
function correctly.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

createlang

Description

The createlang utility registers a new language with the specified PostgreSQL
database.

Usage/Options

createlang [options] [language [dbname]]

Option Description

-h, --host host The hostname where the server resides.

-p, --port port The port or socket file of the listening server.

-U, --username user Connects as this username.

-W, --password Prompts for the password.

-l, --list Lists the languages currently registered for the specified
database.

language The language name to register with the database.

dbname The database name to create a language in.

Examples

$ createlang pltcl mydatabase
$ createlang -h db.someserver.com -p 9999 plpgsql mydatabase
$ createlang -l mydatabase

Notes/Location

Currently the createlang command accepts plpgsql or pltcl.

This command is a wrapper for the CREATE LANGUAGE SQL command; however,
this is the preferred method for adding languages because of certain system checks
it automatically performs.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

createuser

Description

The createuser command adds a new user to PostgreSQL.

Usage/Options

$ createuser [options] username


Option Description

-e, --echo Echoes back-end messages to stdout.

-h, --host host The hostname where the server resides.

-p, --port port The port or socket file of the listening server.

-q, --quiet Do not return any responses from the back end.

-d, --createdb Enables the user to create new databases.

-D, --no-createdb Prohibits the user from creating new databases.

-a, --adduser Enables the user to create additional users.

-A, --no-adduser Prohibits the user from creating new users.

-P, --pwprompt If using authentication, prompts for the new user's password.

-i, --sysid id Enables specification of the UID of the user.

username Unique username to create.

Examples

$ createuser joe
$ createuser -h db.someserver.com -p 9999 joe
$ createuser -a -d joe

Notes/Location
The createuser utility is a wrapper for the psql command. Therefore, the psql
file must be present and able to be executed by the user.

To create users, the executing user must have the user-creation flag set in
pg_shadow.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

dropdb

Description

The dropdb utility is a command-line alternative to the DROP DATABASE SQL
command.

Usage/Options

dropdb [options] name

Option Description

-e, --echo Echoes back-end messages to stdout.

-h, --host host The hostname where the server resides.

-p, --port port The port or socket file of the listening server.

-q, --quiet Do not return any responses from the back end.

-U, --username user Connects as this username.

-W, --password Prompts for the password.

-i, --interactive Interactive verification of the delete process.

name The name of the database to remove.


Examples

$ dropdb mydatabase
$ dropdb -h db.somewebsite.com -p 9333 mydatabase

Notes/Location

dropdb relies on the psql command to actually perform the database deletion.
Therefore, psql must be present and able to be executed for dropdb to function
correctly.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

droplang

Description

The droplang utility removes a language from the specified PostgreSQL database.

Usage/Options

droplang [options] [language [dbname]]

Option Description

-h, --host host The hostname where the server resides.

-p, --port port The port or socket file of the listening server.

-U, --username user Connects as this username.

-W, --password Prompts for the password.


-l, --list Lists the languages currently registered for the specified
database.

language The language name to remove.

dbname Removes the language specified in this database.

Examples

$ droplang pltcl mydatabase


$ droplang -h db.someserver.com -p 9999 plpgsql mydatabase
$ droplang -l mydatabase

Notes/Location

This command is a wrapper for the DROP LANGUAGE SQL command; however, this
is the preferred method for removing languages because of the system checks it
automatically performs.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

dropuser

Description

The dropuser command removes a user from PostgreSQL.

Usage/Options

$ dropuser [options] username

Option Description

-e, --echo Echoes back-end messages to stdout.


-h, --host host The hostname where the server resides.

-p, --port port The port or socket file of the listening server.

-q, --quiet Do not return any responses from the back end.

-i, --interactive Prompts for confirmation before deletion.

username Unique username to delete.

Examples

$ dropuser joe
$ dropuser -h db.someserver.com -p 9999 joe

Notes/Location

The dropuser utility is a wrapper for the psql command. Therefore, the psql file
must be present and able to be executed by the user.

To remove users, the flag in pg_shadow for the executing user must be set as such
to succeed.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

ecpg

Description

The ecpg command is a SQL preprocessor that is used to embed SQL commands
within C programs. Using SQL commands within a C program is essentially a two-
step process. First, the file of SQL commands is passed through the ecpg utility;
then it can be linked and compiled with a standard C compiler.

Usage/Options

ecpg [options] file [,…]


Option Description

-v Prints version information of ecpg.

-t Turns off autotransaction mode.

-I path Specifies an alternate include path.

-d Turns on debugging info.

-o filename Specifies the output filename; if this is omitted, file.c is the
default.

file The file(s) to process.

Examples

$ ecpg myfile.pgc
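The second step, compiling the preprocessor's output, is an ordinary C build. A sketch, assuming a source installation under /usr/local/pgsql (the include/library paths and linker flags are illustrative and depend on the installation):

```shell
# Step 1: preprocess the embedded SQL; produces myfile.c
ecpg myfile.pgc

# Step 2: compile and link against the ecpg and libpq libraries
cc -o myprog myfile.c \
   -I/usr/local/pgsql/include \
   -L/usr/local/pgsql/lib -lecpg -lpq
```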

Notes/Location

A discussion concerning the actual syntax of the ecpg command is outside the
scope of this entry. For a more complete discussion of using embedded SQL in C
programs, see Chapter 13, "Client-Side Programming," and the section "ecpg."

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

pgaccess

Description

The pgaccess utility is a GUI front end that makes many common administration
tasks easier.
Usage/Options

pgaccess [dbname]

dbname—Start pgaccess connected to this database.

Notes/Location

pgaccess provides the following functionality:

Opens any database.

Specifies username and password on login.

Specifies hostname and/or port for connection.

Saves preferences locally (~/.pgaccessrc).

Executes VACUUM command on database.

Provides edit-in-place modification of table data.

Deletes selected table rows.

Appends records to tables.

Filters rows based on supplied criteria.

Specifies sort order of rows.

Imports/exports table data.

Renames tables.

Drops tables.

Defines and edits user-specified queries.

Stores queries as views.

Stores view layouts.

Constructs queries using drag-and-drop support.


Builds queries with table aliasing.

Prompts the user for supplied parameters for dynamic queries (such as "
SELECT * FROM authors WHERE name=[parameter 'Authors
Name']").

Defines, inspects, and deletes sequences.

Designs, views, sorts, and drops views.

Defines, views, and drops functions.

Defines and generates simple reports.

Changes fonts, size, and style of reports.

Loads and saves reports.

Designs custom forms.

Saves and views forms.

Defines, edits, and executes user-defined scripts.

pgaccess depends on the Tcl/Tk language, so it must be installed for pgaccess to
work properly. Additionally, the PostgreSQL-Tcl packages will need to be installed,
or the source will need to be compiled with the --with-tcl option enabled.

Location of the file:

RPM— /usr/bin

Source— /usr/bin or /usr/local/pgaccess

pgadmin

Description

The pgadmin tool is a Windows 95/98/NT tool for performing basic PostgreSQL
administration. (This tool is not included in the base PostgreSQL package; it is a
third-party tool specifically for Windows users.)

Notes/Location
The tool's features include the following:

Executing arbitrary SQL commands.

Creating databases, tables, indexes, sequences, views, triggers, functions, and


languages.

Granting user and group privileges.

Data import and export tools.

Predefined reports on databases, tables, indexes, sequences, languages, and


views.

Revision tracking.

The pgadmin tool is not distributed as part of the standard PostgreSQL system.
Please visit http://www.pgadmin.freeserve.co.uk for more information on obtaining,
installing, and using pgadmin.

pg_dump

Description

The pg_dump utility is a very important tool in the PostgreSQL administrator's
arsenal. It enables database schema and/or data to be dumped out to standard
text. By default, it writes the data to stdout, which can easily be redirected to a
file using the appropriate redirection operator.

Combined with psql or pg_restore, this is the preferred method for performing
database backups and restores (see the next section, "pg_dumpall").

Version 7.1 of PostgreSQL adds some important features to pg_dump and
pg_dumpall. There is a new option that enables dumps to be made in a specified
format. Some of these new format types enable advanced features like dump and
restore of user-defined objects, selective restores, and so on (see the section
"pg_restore" for more information).

Usage/Options

pg_dump [options] database

Option Description
-h host Starts the host where the server is running.

-p port Specifies the port where the server is running.

-u Prompts for user/password authentication.

-v Verbose mode.

-a Dumps only the data; no schema.

-b, --blobs Dumps data and BLOBs (v7.1 feature).

-c Drops schema before creating.

-C, --create Includes commands to create the database (v7.1 feature).

-d Dumps data in a proper INSERT format.

-D Dumps data as INSERTs with attribute names.

-f, --file name Sends output messages to the specified file (v7.1 feature).

-Fp Uses a plain SQL text format. This is the default (v7.1 feature).

-Ft Outputs archive in a tar format (v7.1 feature).

-Fc Outputs the archive in new custom format. This is the most flexible
option (v7.1 feature).

-i Ignores version mismatch with server back end (pg_dump is only
designed to work with the correct version; this is for experimental
use only).
-n Suppresses double quotes in dump.

-N Includes double quotes in dump (default).

-o Dumps the OIDs for every table.

-O, --no-owner Does not set ownership of objects to match the original
database (v7.1 feature).

-R, --no-reconnect Does not attempt to connect to the database (v7.1 feature).

-S, --superuser name Specifies the superuser name (DBA) to use when disabling
triggers and setting ownership information (v7.1 feature).

-s Dumps schema only; no data.

-t table Dumps info for this table only.

-x Does not dump ACL (Grant/Revoke) information.

-Z, --compress [0..9] Specifies compression level (0–9); currently, only the
custom format supports this feature (v7.1 feature).

Examples

$ pg_dump authors
$ pg_dump -a authors
$ pg_dump -t payroll authors
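Since pg_dump writes to stdout, a plain-text dump is usually redirected to a file and later replayed with psql; a sketch (database and file names are illustrative):

```shell
# Dump schema and data of the newriders database to a file
pg_dump newriders > newriders.sql

# Replay the dump into another, already created, database
psql -f newriders.sql newriders_copy
```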

Notes/Location
pg_dump cannot handle large objects (LOs) in the default plain-text format; use
the -b option with an archive format instead (v7.1 feature).

pg_dump cannot correctly handle extracting all system catalog metadata. For
instance, partial indexes are not supported.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

pg_dumpall

Description

The pg_dumpall command is very similar to pg_dump. However, pg_dumpall
extracts all databases to a script file. In addition to the standard items extracted in
a pg_dump, pg_dumpall also includes the contents of the pg_shadow file.

Version 7.1 of PostgreSQL adds some important features to pg_dump and
pg_dumpall. There is a new option that enables dumps to be made in a specified
format. Some of these new format types enable advanced features like dump and
restore of user-defined objects, selective restores, and so on (see the section
"pg_restore" for more information).

Usage/Options

pg_dumpall [options]

Option Description

-h host Starts the host where the server is running.

-p port Specifies the port where the server is running.

-u Prompts for user/password authentication.

-v Verbose mode.

-a Dumps only the data; no schema.


-b, --blobs Dumps data and BLOBs (v7.1 feature).

-c Drops schema before creating.

-C, --create Includes commands to create the database (v7.1 feature).

-d Dumps data in a proper INSERT format.

-D Dumps data as INSERTs with attribute names.

-f, --file name Sends output messages to the specified file (v7.1 feature).

-Fp Uses a plain SQL text format. This is the default (v7.1 feature).

-Ft Outputs the archive in a tar format (v7.1 feature).

-Fc Outputs the archive in new custom format. This is the most flexible
option (v7.1 feature).

-i Ignores version mismatch with the server back end (pg_dump is only
designed to work with the correct version; this is for experimental
use only).

-n Suppresses double quotes in dump.

-N Includes double quotes in dump (default).

-o Dumps the OIDs for every table.

-O, --no-owner Does not set ownership of objects to match the original
database (v7.1 feature).

-R, --no-reconnect Does not attempt to connect to the database (v7.1 feature).

-S, --superuser name Specifies the superuser name (DBA) to use when disabling
triggers and setting ownership information (v7.1 feature).

-s Dumps schema only; no data.

-x Does not dump ACL (Grant/Revoke) information.

-Z, --compress [0..9] Specifies compression level (0–9); currently, only the
custom format supports this feature (v7.1 feature).

Examples

$ pg_dumpall
$ pg_dumpall -a
$ pg_dumpall -o

Notes/Location

pg_dumpall has many of the same limitations that pg_dump has with regard to
system metadata. See the section "pg_dump" for more information.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

pg_restore

Description

This is a new tool included with the PostgreSQL 7.1 release. It is designed to restore
data dumped by the pg_dump or pg_dumpall database utilities.

The new version of pg_dump includes the capability to dump data in a nontext
format that has many advantages over traditional data dumps:
Selective restores are possible using the new pg_dump format and
pg_restore.

The new archive format produced by pg_dump is designed to be portable
among platforms.

The new pg_dump format will produce queries to enable the regeneration of
all user-defined types, functions, tables, indexes, aggregates, and operators.

Usage/Options

pg_restore [options] archive-file

Option Description

-a, --data-only Restores data only; no schema.

-c, --clean Drops schema before invoking createdb.

-C, --create Includes SQL to create schema.

-d, --dbname name The name of the database to connect to.

-f, --file=filename Specifies the file to hold generated output.

-Ft, --format=t Specifies that the format of the archive file is tar.

-Fc, --format=c Specifies that the format of the archive file is the custom
format of pg_dump. This is the most flexible format to restore from.

-i, --index=name Restores information for the named index only.
-l, --list Lists the contents of the archive only.

-L, --use-list file Restores only the elements contained in the specified file,
in the order they appear; comments begin with a semicolon (;) at the start of
the line.

-N, --orig-order Restores items to the original dump order.

-o, --oid-order Restores items to the original OID order.

-O, --no-owner Does not restore ownership information; objects will be owned
by the current user.

-P, --function name Restores named functions only.

-r, --rearrange Restores items in modified OID order. (This is the default.)

-R, --no-reconnect Prohibits pg_restore from attempting any database
connections. Useful if already directly connected to the database.

-s, --schema-only Restores the schema only; no data.

-S, --superuser=name Specifies the superuser name to use when disabling
triggers and applying ownership information. By default, pg_restore uses the
current username, if that user is considered a DBA.

-t, --table=name Restores schema/data for the specified table only.

-T, --trigger=name Restores the specified trigger only.
-v, --verbose Produces verbose output.

-x, --no-acl Prevents restoration of the Access Control List (that is,
Grant/Revoke information).

-h, --host name Specifies the hostname where the server process is running.

-p, --port port Specifies the port to connect to.

-u Forces prompts for authentication.

Examples

Dump a database and restore using custom format:

$ pg_dump -Fc newriders > newriders.cust_fmt


$ pg_restore -d newriders newriders.cust_fmt

Only restore the payroll table:

$ pg_restore -d newriders -t payroll newriders.cust_fmt
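The -l and -L options can be combined for selective restores; a sketch using the archive from the previous example (file names are illustrative):

```shell
# Write the archive's table of contents to a file
pg_restore -l newriders.cust_fmt > archive.list

# Edit archive.list, prefixing unwanted entries with a semicolon (;),
# then restore only the remaining entries in the order listed
pg_restore -L archive.list -d newriders newriders.cust_fmt
```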

Notes/Location

See also the sections discussing pg_dump, pg_dumpall, and pg_upgrade.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

pg_upgrade

Description
The pg_upgrade utility can be used to upgrade to a new version of the database
system without having to reload all the data in the current database.

This command currently will not function on PostgreSQL Version 7.1 and above; see
the section on pg_restore if you are using one of the newer versions.

Usage/Options

pg_upgrade [ -f file] old_data_dir

-f file—This specifies a file containing the schema for the old database.

old_data_dir—This represents the path to the old data directory.

Examples

The usual method of upgrading a database while using the pg_upgrade utility is as
follows:

1. Back up existing data (that is, pg_dumpall).

2. Dump out schema to a file (that is, pg_dumpall -s db.out).

3. Stop the current postmaster.

4. Rename the old data directory (that is, data.old).

5. Build the new binaries with the make tools.

6. Install the new binaries with make install.

7. Run initdb in the new system to create a new database structure.

8. Start the postmaster.

9. Upgrade the old database to the new system using $ pg_upgrade -f
db.out /usr/local/pgsql/data.old.

10. Copy old pg_hba.conf files and pg_options to their new location (that is,
/usr/local/pgsql/data).

11. Stop and start postmaster again.

12. Verify that connections are working correctly.

13. Connect to the restored database and examine its contents carefully.

14. If the database is not valid, restore from your full dump file created in step 1.
15. If the database is valid, issue a vacuum command to update query-planning
statistics (that is, vacuumdb -z mydb).
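Steps 1 through 9 above might look like the following shell session for a source installation (the paths, file names, and use of pg_ctl are illustrative; consult the release notes for your versions):

```shell
pg_dumpall > full_backup.out                         # 1. back up existing data
pg_dumpall -s > db.out                               # 2. dump out schema only
pg_ctl stop                                          # 3. stop the current postmaster
mv /usr/local/pgsql/data /usr/local/pgsql/data.old   # 4. rename the data directory
make && make install                                 # 5-6. build and install binaries
initdb -D /usr/local/pgsql/data                      # 7. create new database structure
pg_ctl start                                         # 8. start the postmaster
pg_upgrade -f db.out /usr/local/pgsql/data.old       # 9. upgrade the old database
```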

Notes/Location

Not all upgrades can be accomplished with this tool. Check the release notes of the
new database version to see if pg_upgrade is supported.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

pgtclsh

Description

The pgtclsh command is a wrapper program, which essentially is a standard Tcl
shell with the libpgtcl library loaded.

Usage/Options

pgtclsh [script arg1 [,…]]

script—The optional Tcl script file to process.

arg1—The optional arguments to pass to the specified script file.

Examples

$ pgtclsh myfile.tcl

Notes/Location

If pgtclsh is launched with no specified script file, it automatically enters into the
interactive Tcl interface.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin
pgtksh

Description

The pgtksh command is essentially a Tk (wish) shell with the libpgtcl libraries
loaded. This is what the pgaccess program is based on.

Usage/Options

pgtksh [script arg1 [,…]]

script—The optional Tcl script file to process.

arg1—The optional arguments to pass to the specified script file.

Examples

$ pgtksh myfile.tcl

Notes/Location

If pgtksh is launched with no specified script file, it automatically enters into the
interactive Tcl interface.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

psql

Description

psql is an interactive front end to the PostgreSQL system. It is a very versatile
interface and has numerous options that can control nearly every aspect of a
PostgreSQL system.

Once psql is started and connected to the specified database, the user is entered
into an interactive shell. In this mode, commands can be issued to the PostgreSQL
back end, and the responses can be seen in real time.

Usage/Options
psql [options] [database [user]]

The psql options fall into two categories: command-line options, which are issued
while starting psql, and shell options, which can be issued inside the psql shell.

Command-
Line Description
Options

-a, -- Echoes all processed lines to the screen. This is useful when running
echo-all a script to monitor progress or for debugging purposes.

-A, --no-align Switches the output to an unaligned layout.

-c, --command query Specifies that psql is to execute a single SQL command or
a single interactive shell command. The query must either be pure SQL or a psql
shell command. Mixing command types is not permitted.

-d, --dbname database Specifies the database to connect to.

-e, --echo-queries Echoes all queries sent to the back end.

-E, --echo-hidden Echoes all queries, even hidden queries issued as a result of
shell commands (that is, \dt and so on).

-f, --file file Reads the specified file and executes the contained SQL
queries, then terminates.

-F, --field-separator sep Uses the specified field separator.
-h, --host hostname Connects to the specified host (where the server is
running).

-H, --html Produces table output in HTML format.

-l, --list Lists all available databases and then exits.

-o, --output file Captures all query output in the specified file.

-p, --port port Connects using the specified port.

-P, --pset val Allows setting the default print (output) style (for example,
aligned, unaligned, html, or latex).

-q Quiet mode.

-R, --record-separator sep Uses the specified separator for record
demarcation.

-s, --single-step Prompts the user before each query is executed. Useful for
debugging or controlling execution of SQL scripts.

-S, --single-line Runs psql in single-line mode, where a carriage return
terminates a query (the default is a semicolon).

-t, --tuples-only Turns off the printing of column names and result totals.
Only prints data (tuples) returned.

-T, --table-attr options Specifies the options to be included in HTML table
output.

-u Forces username and password prompts.

-U, --username name Connects to the database as the specified user.

-v, --variable, --set var=val Assigns a value to a variable. To unset a
variable, include it with no equal sign or value following.

-V, --version Displays the psql version information.

-W, --password Forces psql to prompt the user for a password.

-x, --expanded Turns on extended row format mode.

-X, --no-psqlrc Does not read the startup file ~/.psqlrc.

-?, --help Displays a help screen showing psql options.

Inside the psql shell, most options are prefaced with a backslash (\).
Shell Options Description

\a Toggles alignment mode.

\C title Sets the specified title to print atop each query result set.

\c, \connect db Closes the current connection and connects to the specified
[user] database. Optionally, will connect as the specified user.

\copy table [with oids] {from | to} filename | stdin | stdout
[using delimiters char] [with null as nullstr]
Performs a front-end version of the SQL COPY command.
Specifies the direction of the copy and whether to route to a
file or use stdin or stdout. Additionally, a null string and
delimiters can be specified.

\copyright Displays the PostgreSQL copyright.

\d Same as \dtvs. Displays information about the specified
relations.

\dt Displays information about the tables in the current database.

\dv Displays information about the views in the current database.

\ds Displays information about the sequences in the current
database.

\di Displays information about the indexes in the current
database.
\da [pattern] Displays information about the aggregates in the current
database. Optionally, only shows those that match the pattern
specified (such as max).

\dd [obj] Shows the comments associated with all objects in the
current database. Optionally, only shows comments attached
to specified object.

\df [pattern] Displays information about the functions in the current
database. Optionally, only shows functions that match the
specified pattern.

\dl Lists all large objects in the current database (same as
\lo_list).

\dp [pattern] Displays information concerning permissions on objects within
the current database. Optionally, only displays information on
objects that match the specified pattern (same as \z).

\dS Displays information concerning the system tables in the
current database.

\dT [pattern] Displays information on the data types included in the current
database. Optionally, only displays information on objects that
match the specified pattern.

\e, \edit file Launches an external editor (vi is default) to edit the file
specified.

\echo text Echoes the text specified or performs command substitution.

\encoding type Sets encoding to the specified type or, with no parameter, lists
the current encoding type.

\f str Sets the field separator to the specified string. Default is the
piping symbol (|) (see also \pset).

\g [file | Sends output from the entered query to the specified file or
command] pipes it through the specified command (similar to \o).
\h, \help Displays a list of all the valid SQL commands. Optionally,
[command] displays more detailed help on the specified command.

\i file Reads input from the specified file and executes it.

\l, \list Lists all known databases and their owners (appending a "+"
will also display comments).

\lo_export oid Exports the large object with the specified OID to the
file filename specified.

\lo_import file Imports the large object from the filename specified.
[comment] Optionally, provides a descriptive comment to be associated
with the LO.

\lo_list Lists all known large objects in the current database.

\lo_unlink oid Deletes the large object with the specified OID from the
current database.

\o [file | command] Sends all future results of queries to the filename
specified or pipes them through the command given.

\p Prints the current query buffer.

\pset parameter Allows the user to manually set one of several options that
affect the current database. See this listing of parameters:

format Sets the output mode for tables to the
specified format: unaligned, aligned,
html, or latex (that is, \pset format
latex).

border Sets the border width or type. In HTML, 0 =
no border, 1 = dividing lines, and 2 = new
frame (that is, \pset border 0).
expanded Toggles between regular and expanded
format.

null Specifies how to display a NULL field value
(that is, \pset null 'N/A').

fieldsep Specifies the character(s) to use as a field
separator (that is, \pset fieldsep '#').

recordsep Specifies the character(s) to use as a record
separator (that is, \pset recordsep
'%').

tuples_only Suppresses header and footer information
from being displayed with query output.
Returns data only from queries.

title Specifies the title to use for following tables
(that is, \pset title 'Our Bank
Account').

tableattr Specifies attributes to include in HTML output
(that is, \pset tableattr
bgcolor='#FFFF00').

pager Toggles the use of page-by-page displays. By
default, it uses more to handle page
displays, but the PAGER variable can be
defined to any appropriate handler.

\q Quits the current psql shell.

\qecho text Echoes the specified text to wherever \o output is currently
being directed. Useful for adding comments to redirected
output files.

\r Clears (resets) the current query buffer.


\s file Saves the current psql command history to the specified
filename. (Note: PostgreSQL v7 and greater do this
automatically upon exit.)

\set var value Sets the psql environmental variable to the specified value.
(Note: This is not the same as the SQL SET command.) Here
is a list of valid environmental variables:

DBNAME The name of the currently connected
database.

ECHO The echo mode to which psql is
currently set. all means all output is
echoed; queries means only query
output is echoed.

ECHO_HIDDEN Specifies whether hidden queries are
echoed to stdout (that is, hidden
queries like \dt).

ENCODING Specifies the encoding scheme to use. If
multibyte encoding is not available, this
will be set to SQL_ASCII.

HISTCONTROL Controls what is entered into the
command history buffer. The value
ignorespace ignores any commands
beginning with whitespace. The value
ignoredups will refuse to enter
duplicate entries, and ignoreboth
combines both.

HISTSIZE The number of commands to store in the
history buffer (default is 500).

HOST The host where the current connection is
operating.

IGNOREEOF If not set, sending an EOF (Ctrl+D) will
terminate the current psql session.
Otherwise, if set to a numeric value,
psql will ignore this many EOF insertions
before terminating (default is 10).

LASTOID The value of the last affected OID.

LO_TRANSACTION Specifies the default action to take when
executing a large object event (for
example, \lo_export, \lo_import,
and so on). A rollback value will force
a rollback of any in-progress
transaction, commit will force a
commit, and nothing will specify that
no action is to take place. The latter case
is usually reserved for those cases in
which the user will be providing explicit
BEGIN… COMMIT encapsulation to LO
events.

ON_ERROR_STOP Specifies that if noninteractive scripts
encounter an error, processing is to stop.
By default, psql will continue to process
statements even if a malformed SQL
statement has been encountered.

PORT The port where the current session is
connected.

PROMPT1 Specifies what the normal prompt is to
look like (default is %/%R%#).

PROMPT2 This prompt is issued when psql is
expecting more data (default is
%/%R%#).

PROMPT3 This prompt is issued when a SQL COPY
has been called and the interface
requires tuple input (default is >>).

Prompt types:

%M—Full hostname.
%m—Truncated hostname.

%>—Port number.

%n—Username connected as.

%/—Current database.

%~—Like %/ but prefixes a ~ if this is the
default database.

%#— Prefixes # if DBA; otherwise >.

%R—Sets the following: PROMPT1 "="
(default), PROMPT1 "^" (single-line
mode), PROMPT1 "!" (if session
disconnected), or PROMPT2 "-", "*",
"'", """ (depending on what
continuation condition exists).

%digits— Sets to the digit specified.

%:name—The value of the psql variable
NAME.

%:command—The output of the given
command.

QUIET Sets quiet mode.

SINGLELINE Sets single-line mode. If set, then a
newline indicates query termination
(default is ;).

SINGLESTEP Sets single-step mode. If set, the user is
prompted for confirmation before any
query execution takes place.

USER The user you are currently connected as.

\t Toggles whether to suppress header and footer information
from being displayed with query output. Returns data only
from queries (same as \pset tuples_only).
\T options Specifies options to be placed in HTML table output (same as
the \pset tableattr command).

\w file | command Outputs the current query buffer to the specified filename
or pipes it through the provided command.

\x Toggles extended row format mode.

\z pattern Displays permission information on current database objects.
Optionally, displays information only on those objects that
match the specified pattern.

\! command Escapes to a separate UNIX shell and executes the provided
command.

\? Displays help on psql backslash commands.

Examples

To start psql, execute a query and immediately exit:

$ psql -c 'SELECT * FROM authors' newriders


 name | age
------+-----
 Sam  |  25
 Bill |  67

Alternatively, to run an entire script called mydb.sql from the command line
(executing into the newriders database):

$ psql -f mydb.sql newriders

To perform the same example but this time display it in HTML mode (useful for CGI
programming):

$ psql -H -c 'SELECT * FROM authors' newriders


<table border=1>
<tr>
<th align=center>name</th>
<th align=center>age</th>
</tr>
<tr valign=top>
<td align=left>Sam</td>
<td align=left>25</td>
</tr>
<tr valign=top>
<td align=left>Bill</td>
<td align=left>67</td>
</tr>
</table>

To redirect queries to an output file from inside a psql shell and to include
descriptive titles to each data dump:

psql=>\o mycapture.txt
psql=>\qecho Listing of all authors
psql=>\qecho **********************
psql=>SELECT * FROM authors;
psql=>\qecho And their payroll info
psql=>\qecho **********************
psql=>SELECT * FROM payroll;

To list all files that end in .sql in the current directory from a psql shell interface
(notice the use of backticks, not single quotes):

psql=>\echo `ls *.sql`


authors.table.sql
payroll.table.sql

Notes/Location

The psql shell environment also supports variable substitution. The most basic
form associates a variable name with a value. For instance:

psql=>\set myvar name='Sam'
psql=>\echo :myvar
name='Sam'
psql=>SELECT * FROM authors WHERE :myvar;
name | age
Sam  |  44

As you can see, variable names are referenced by prefixing the name with a colon
(:).

To change the default editor when using the \e command, specify the correct value
to the PSQL_EDITOR variable.

Location of the file:


RPM— /usr/bin

Source— /usr/local/pgsql/bin

vacuumdb

Description

The vacuumdb command is a wrapper program for the VACUUM SQL statement.
Although there is no real difference in how the two operate, vacuumdb is a
shell command and is therefore most often used to run VACUUM from a cron job.
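For instance, a crontab entry such as the following would vacuum and analyze all databases nightly (the schedule, path, and flag combination are illustrative):

```
# Illustrative crontab entry: vacuum and analyze all databases at 2 a.m.
0 2 * * * /usr/bin/vacuumdb -a -z -q
```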

Usage/Options

vacuumdb [connection-options] [analyze options]

Connection Option Description

-h, --host host The hostname where the server resides.

-p, --port port The port or socket file of the listening server.

-U, --username user Connects as specified user.

-W, --password Forces prompt for password.

-e, --echo Echoes back-end messages to stdout.

-q, --quiet Do not return any responses from the back end.

Analyze options:

-d, --dbname name The name of the database to vacuum.

-z, --analyze Calculates statistics for query optimizer.


-a, --alldb Vacuums all databases.

-v, --verbose Verbose output.

-t, --table table Cleans or analyzes table only.

-t, --table table(col) Analyzes column only (must be used with -z).

Examples

Clean the newriders database (both are equivalent):

$ vacuumdb -d newriders
$
$ vacuumdb newriders

Clean all databases and then analyze a specific table:

$ vacuumdb -a
$
$ vacuumdb -z -d newriders -t authors

Notes/Location

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin
Chapter 7. System Executable Files
Although most of these files are executable by user accounts, they have been
collected into a separate listing. They are usually reserved for specific
database system events, as opposed to the everyday use of the commands
discussed in Chapter 6, "User Executable Files." Generally, these files are
used for server control rather than as client utilities.

The following sections include the typical file locations of the discussed commands.
These locations usually differ depending on whether the database system was
installed from source code or from an RPM package.
Alphabetical Listing of Files

initdb

Description

The initdb command is used to prepare a directory location for a new PostgreSQL system.
The initdb command is usually performed with several other steps, as briefly outlined here:

1. Create the directory to hold the data as root.

2. Use chown to change ownership of the directory to the DBA user.

3. Use login (or su) to log in to the DBA user account.

4. Execute initdb with appropriate options.

5. initdb will generate the shared catalog tables.

6. initdb will generate the template1 database. (Each time you create a new database, it
is generated from template1.)
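As a transcript, the steps above might look like this (paths and the postgres username are illustrative; the first two commands run as root, the rest as the DBA user):

```
# mkdir /usr/local/pgsql/data
# chown postgres /usr/local/pgsql/data
# su - postgres
$ initdb -D /usr/local/pgsql/data
```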

Usage/Options

initdb -D path [options]

Option Description

-D, --pgdata path     The path to the PostgreSQL database.

-i, --sysid=id        Specifies the UID of the DBA.

-W, --pwprompt        Forces password prompt.

-E, --encoding=type   Specifies the encoding type to use (the system must have
                      been built with the multibyte encoding flag set to true).

-d, --debug           Prints debugging information from the back end.

-n, --noclean         By default, if initdb fails, the files it created are
                      removed. This option disables that cleanup, which is
                      useful for debugging the failure.

-L path               Specifies the path for initdb to find its input files.
                      (This is a special-case option and is rarely needed.)

Examples

$ initdb -D /usr/local/pgsql/data

Notes/Location

The initdb command should not be run as root.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

initlocation

Description

The initlocation utility is used to initialize a secondary data storage area. In some ways,
this command is similar to initdb, except that many internal catalog operations do not occur
with initlocation. Additionally, this command can be run as often as necessary, as opposed
to initdb, which is generally run only once per installation.

Usage/Options

initlocation path

path— The path where the new database will reside.

Examples

$ initlocation /usr/local/pgsql/data2
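A secondary area is commonly registered through an environment variable set before the postmaster starts, which can then be referenced from SQL. A sketch (the variable name PGDATA2, the path, and the database name are illustrative):

```
$ export PGDATA2=/usr/local/pgsql/data2
$ initlocation PGDATA2
psql=> CREATE DATABASE newriders2 WITH LOCATION = 'PGDATA2';
```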

Notes/Location

This should be run as the DBA account, not as root.

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

ipcclean
Description

The ipcclean command is a shell script designed to clean up orphaned semaphores and
shared memory after a back-end server crash.

Usage/Options

ipcclean

Notes/Location

This command makes certain assumptions about the naming convention used with output from
the ipcs utility. Therefore, this shell script might not be portable across all operating systems.

Warning!

Running the ipcclean command while a database server is currently operational
will result in a general failure of the database system.

pg_ctl

Description

pg_ctl is used to control various aspects of the PostgreSQL postmaster server.

Usage/Options

pg_ctl [-w] [-D path] [-p path] [-o "options"] start


pg_ctl [-w] [-D path] [-m mode] stop
pg_ctl [-w] [-D path] [-m mode] [-o "options"] restart
pg_ctl [-D path] status

Option Description

-w            Watches for creation/destruction of the pid file
              ($PGDATA/postmaster.pid). Times out after 60 seconds.

-D path       The path to the database.

-p path       Specifies the path to the postmaster file.

-m mode       Specifies one of the following shutdown modes:

              s, smart      Waits for clients to log out (default).

              f, fast       Sends SIGTERM to the back end. Active transactions
                            are issued an immediate ROLLBACK.

              i, immediate  Sends SIGUSR1 to all back ends; in this mode,
                            database recovery will be needed on the next
                            system start.

-o "options"  Sends specified options to the postmaster. Options are usually
              quoted in order to ensure proper execution.

start         Starts the postmaster.

stop          Stops the postmaster.

restart       Restarts the postmaster (automatic stop/start).

status        Shows the status of the postmaster.

Examples

$ pg_ctl start
$ pg_ctl -m smart stop
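A fuller invocation might combine options, for example restarting a server in fast mode with an explicit data directory and then checking it (the path is illustrative):

```
$ pg_ctl -D /usr/local/pgsql/data -m fast restart
$ pg_ctl -D /usr/local/pgsql/data status
```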

Notes/Location

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

pg_passwd

Description

The pg_passwd utility is used to create and manipulate the password file needed if
authentication is enabled in PostgreSQL.
Usage/Options

pg_passwd filename

filename—The path and filename of the password file to create or manipulate.

Examples

$ pg_passwd /usr/local/pgsql/data/pg_pword
File "/usr/local/pgsql/data/pg_pword" does not exist. Create? (y/n): Y
Username: barry
Password:
Re-enter password:

Notes/Location

The file must be in the path of the PostgreSQL database to be used for client authentication.
Additionally, the authentication method might need to be entered in the pg_hba.conf
configuration file.
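For example, a pg_hba.conf entry can name the file created above as the alternative password file for the password method (the addresses and filename are illustrative):

```
host all 192.168.0.0 255.255.255.0 password pg_pword
```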

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

postgres

Description

The postgres file is the actual server process for processing queries in PostgreSQL. It is
usually called by the multiprocess postmaster wrapper. (Both are actually the same file;
postmaster is a symlink to the postgres process.)

The postgres server is usually not invoked directly; rather, many of these options are passed
to the postgres process upon execution.

Although generally not invoked directly, the postgres process can be executed in an
interactive mode that will allow queries to be entered and executed. However, such execution
should not be attempted if the postmaster process is running; data corruption could result.
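With the postmaster stopped, a standalone interactive back end can be started as follows (the path and database name are illustrative):

```
$ postgres -D /usr/local/pgsql/data newriders
```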

Usage/Options

postgres [options] database

Option Description
-A 0|1 Specifies whether assert checking should be enabled. (This debugging tool is
available only if enabled at compile time. If it was enabled, the default is on.)

-B val The number of 8KB shared buffers to use. The default is 64.

-c var=val Sets various runtime options. See the Advanced Option list later in
this chapter for these options.

-d Sets the debug level. The higher the value, the more entries are output to the log.
level The default is 0; the range is usually valid up to 4.

-D path The path to the data directory.

-F Disables fsync system calls. Can result in improved performance, but there is a
risk of data corruption. Generally, use this option only if there is a specific reason to
do so; it is not intended for standard operation.

-e Sets the date style to European (that is, dd-mm-yyyy).

-o file Sends all debugging information to the specified file.

-P Disables use of system indexes when scanning and updating system tuples.
(Note: The REINDEX command requires use of this option.)

-s Sends timing statistics to stdout for each processed query. Useful for performance
tuning.

-S val Specifies the amount of kilobytes to be used for internal sorts and hashes before the
system calls on temporary files. This memory amount indicates that every system
sort and/or hash is capable of using up to this much memory. When the system is
processing complex sorts, multiple sort/hash instances will be used, and each one
will use up to this much memory. The default value is 512 kilobytes.

-E Echoes all queries to stdout.

-N Prohibits use of a new line as a query delimiter.

Not for Everyone


These advanced options are not recommended for general use. They typically are
used only for advanced debugging scenarios or by PostgreSQL developers. Moreover,
the inclusion of these advanced options might change from release to release. There
is no guarantee that these options will exist in any specific version of PostgreSQL.

Advanced
Description
Option

-fi Disables index scans.

-fs Disables sequential scans.

-fn Disables nested loop joins.

-fm Disables merge joins.

-fh Disables hash joins.

-i Prevents query execution but shows plan.

-L Prohibits use of the locking system.

-O Enables modification of system tables.

-p Indicates that the specified database has been started by postmaster. Impacts
database buffer sizes, file descriptors, and so on.

-tpa Prints timing information for the system parser. (Cannot be used with the -s
option.)

-tpl Prints timing information for the system planner. (Cannot be used with the -s
option.)

-te Prints timing information for the system executor. (Cannot be used with the -s
option.)

-v val Specifies the protocol version to use.

-W sec Sleeps for the specified number of seconds before starting. Useful for developers
who need to start debugging programs in the interim.
Notes/Location

When starting the postgres process, the current OS username is selected as the PostgreSQL
username. If the current username is not a valid PostgreSQL user, the process will not continue.

postgres and postmaster are the same file (actually postmaster is a symbolic link to the
postgres executable). However, you cannot substitute one command for the other and expect
the same results. The postgres executable registers what name it was invoked by, and if it is
called as postmaster, certain options and assumptions are enabled.

Many of these options can be and are passed to the postgres process by using a configuration
file. (See the "pg_options/postgresql.conf" section in Chapter 8, "System Configuration Files
and Libraries.")

Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin

postmaster

Description

The postmaster is the multiuser implementation of the postgres application. In most cases,
this process is started at boot time, and log files are redirected to an appropriate file.

One postmaster instance is required to manage each database cluster. Starting multiple
instances can be achieved by specifying separate data locations and connection ports.
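For example, two independent clusters might be run side by side like this (the paths and ports are illustrative):

```
$ postmaster -D /usr/local/pgsql/data -p 5432 >log1 2>&1 &
$ postmaster -D /usr/local/pgsql/data2 -i -p 5433 >log2 2>&1 &
```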

Usage/Options

Option Description

-A 0|1 Specifies whether assert checking should be enabled. (This debugging tool is only
available if enabled at compile time. If it was enabled, the default is on.)

-B val The number of 8KB shared buffers to use. The default is 64.

-b path Specifies the path to the back-end executable (usually postgres).

-c var=val Sets various runtime options. See the following listing:
shared_buffers = integer

debug_level = integer

fsync = BOOLEAN
virtual_host = string

tcpip_socket = BOOLEAN

unix_socket_directory = string

ssl = BOOLEAN

max_connections = integer

port = integer

enable_indexscan = BOOLEAN

enable_hashjoin = BOOLEAN

enable_mergejoin = BOOLEAN

enable_nestloop = BOOLEAN

enable_seqscan = BOOLEAN

enable_tidscan = BOOLEAN

sort_mem = integer

show_query_stats = BOOLEAN

show_parser_stats = BOOLEAN

show_planner_stats = BOOLEAN

show_executor_stats = BOOLEAN

(Note: See the configuration options in Chapter 8 for more information about these
settings.)

-d Sets the debug level. The higher the value, the more entries are output to the log.
level The default is 0; the range is usually valid up to 4.

-D path The path to the data directory.

-F Disables fsync system calls. Can result in improved performance, but there is a
risk of data corruption. Generally, only use this option if there is a specific reason to
do so; it is not intended for standard operation.

-h host Specifies the TCP/IP hostname or address on which the server listens
for client connections. The default is to listen on all configured interfaces.

-i Enables clients to connect via TCP/IP. By default, only local UNIX domain
socket connections are accepted.
-k path Specifies the directory the postmaster is to use for listening to UNIX domain
sockets. (The default is /tmp.)

-l Enables use of SSL connections. (Note: This option requires that SSL was enabled at
compile time and that the -i option has also been used.)

-N val Specifies the maximum connections permitted to this database back end. The
default value is 32, but this can be set as high as 1,024 if your system will support
that many processes. (Note: The -B option must be set with at least twice the
number of -N to operate correctly.)

-o Command-line options to pass to the postgres back end. If the option string
options contains any whitespace, quotes must be used. (Note: See postgres for valid
command-line switches.)

-p port The TCP/IP port on which to start listening for connections. The default port is either
5432 or the port set during compile time. (Note: If set to a nondefault port, all client
applications will need to specify this port number to connect successfully.)

-S Starts postmaster as a separate process from the current terminal (like a
daemon). However, all error messages will be redirected to /dev/null instead
of stdout. (Note: Use of this option will make debugging problems nearly
impossible. It is better to start postmaster as an explicit background process
and then redirect error messages to a specific file. See the following section
for an example.)

Examples

Start postmaster as a foreground process:

$ postmaster -D /usr/local/pgsql/data

Start postmaster as a background process with a specified data directory and direct all error
messages to a specified log file:

$ postmaster -D /usr/local/pgsql/data >pglog 2>&1 &

Notes/Location

When starting the postmaster process, the current OS username is selected as the
PostgreSQL username. If the current username is not a valid PostgreSQL user, the process will
not continue.

postgres and postmaster are the same file (actually postmaster is a symbolic link to the
postgres executable). However, you cannot substitute one command for the other and expect
the same results. The postgres executable registers the name it was invoked by, and if it is
called as postmaster, certain options and assumptions are enabled.
Location of the file:

RPM— /usr/bin

Source— /usr/local/pgsql/bin
Chapter 8. System Configuration Files and Libraries
In addition to the executable files mentioned in the previous chapters, PostgreSQL
also includes files that deal with configuration settings. Additionally, depending on
what features are desired, a number of libraries are included. The following is a
listing of the configuration files used and the libraries included.
System Configuration Files

pg_options/postgresql.conf

Description

This is the configuration file that specifies what options are to be used when the server is started as
postmaster. The name of this file has changed between versions and will either be called pg_options,
postmaster.opts, or postgresql.conf, depending on your current version. The exact syntax and
options available within this configuration file also vary depending on the version.

Notes/Location

Essentially, this file is a text file that contains various command-line switches. A standard configuration
file might appear as follows:

-p 5432                            Use TCP/IP port 5432
-D /usr/local/pgsql/data/          Path to data directory
-B 64                              Start with 64 8KB buffers
-b /usr/local/pgsql/bin/postgres   Path to executable
-N 32                              Max of 32 connections

The exact syntax of the configuration file will depend on what version of PostgreSQL is running. The
postgresql.conf file, which is the method used in 7.1, accepts the following options:

CHECKPOINT_SEGMENTS (integer)

The maximum distance between automatic WAL checkpoints.

CHECKPOINT_TIMEOUT (integer)

The maximum time between automatic WAL checkpoints, in seconds.

CPU_INDEX_TUPLE_COST (floating point)

Sets the estimated cost of processing each tuple when used in an index scan.

CPU_OPERATOR_COST (floating point)

Sets the estimated cost for processing each operator in a WHERE clause.

CPU_TUPLE_COST (floating point)

Sets the estimated cost of processing a tuple inside a sequential scan.

DEADLOCK_TIMEOUT (integer)

Specifies the amount of time, in milliseconds, to wait on a lock before checking to see if there is a
deadlock condition or not.

DEBUG_ASSERTIONS (boolean)

The Boolean value to enable or disable various debugging assertion checks.

DEBUG_LEVEL (integer)
The value that determines how verbose the debugging output is. This option is 0 by default, which
means no debugging output. Values up to 4 are valid.

DEBUG_PRINT_PARSE (boolean), DEBUG_PRINT_PLAN (boolean), DEBUG_PRINT_REWRITTEN


(boolean), DEBUG_PRINT_QUERY (boolean), DEBUG_PRETTY_PRINT (boolean)

Specifies what to print in the debug information. Prints the query, the parse tree, the execution plan, or
the query rewriter output to the server log.

EFFECTIVE_CACHE_SIZE (floating point)

Sets the assumed size of the disk cache. This is measured in disk pages, which are normally 8KB apiece.

ENABLE_HASHJOIN (boolean)

The Boolean value to enable or disable hash joins. The default is on.

ENABLE_INDEXSCAN (boolean)

The Boolean value to enable or disable the use of index scan plan types. The default is on.

ENABLE_MERGEJOIN (boolean)

The Boolean value to enable or disable the use of merge-join plan types. The default is on.

ENABLE_NESTLOOP (boolean)

The Boolean value to enable or disable the use of nested-loop join plans. It's not possible to suppress
nested-loop joins entirely, but turning this variable off discourages the planner from using them.

ENABLE_SEQSCAN (boolean)

The Boolean value to enable or disable the use of sequential scan plan types. It's not possible to
suppress sequential scans entirely, but turning this variable off discourages the planner from using them.

ENABLE_SORT (boolean)

The Boolean value to enable or disable the use of sort steps. It's not possible to suppress sorts entirely,
but turning this variable off discourages the planner from using them.

ENABLE_TIDSCAN (boolean)

The Boolean value to enable or disable the use of TID scan plan types. The default is on.

FSYNC (boolean)

The Boolean value that enables or disables PostgreSQL's use of the fsync() system call in several places
to make sure that updates are physically written to disk and do not linger in the kernel buffer cache.
This greatly increases the chance that a database installation will still be usable after an operating
system or hardware crash. However, use of this option will degrade system performance. The default is on.

GEQO (boolean)

The Boolean value to enable or disable genetic query optimization. This is on by default.

GEQO_EFFORT (integer), GEQO_GENERATIONS (integer), GEQO_POOL_SIZE (integer),


GEQO_RANDOM_SEED (integer), GEQO_SELECTION_BIAS (floating point)

Various tuning parameters for the genetic query optimization algorithm.

GEQO_THRESHOLD (integer)
Specifies how many FROM items until the GEQO optimization is used. The default is 11.

HOSTNAME_LOOKUP (boolean)

The Boolean value to specify whether to resolve IP addresses to hostnames. By default, connection logs
show only the IP address.

KRB_SERVER_KEYFILE (string)

Specifies the location of the Kerberos server key file.

KSQO (boolean)

The Key Set Query Optimizer (KSQO) causes the query planner to convert queries whose WHERE
clause contains many OR'ed AND clauses into an equivalent UNION query. KSQO is commonly used
when working with products like Microsoft Access that tend to generate queries of this form. The
default is off.

LOG_CONNECTIONS (boolean)

The Boolean value to enable or disable logging of each successful connection. This is off by default.

LOG_PID (boolean)

The Boolean value that enables or disables the log entry to prefix each message with the process ID of
the back-end process. The default is off.

LOG_TIMESTAMP (boolean)

The Boolean value to enable or disable each log message to include a timestamp. The default is off.

MAX_CONNECTIONS (integer)

Determines how many concurrent connections the database server will allow. The default is 32.

MAX_EXPR_DEPTH (integer)

Sets the maximum expression nesting depth that the parser will accept. The default value is high enough
for any normal query, but you can raise it if you need to. (If you raise it too high, however, you run the
risk of back-end crashes due to stack overflow.)

PORT (integer)

The TCP port on which the server listens. It is 5432 by default.

RANDOM_PAGE_COST (floating point)

Sets the estimated cost of performing random, nonsequential page retrievals.

SHARED_BUFFERS (integer)

Sets the number of 8KB shared memory buffers that the database server will use. The default is 64.

SHOW_QUERY_STATS (boolean), SHOW_PARSER_STATS (boolean), SHOW_PLANNER_STATS


(boolean), SHOW_EXECUTOR_STATS (boolean)

Various Boolean values to set options that write performance statistics of the respective module to the
server log.

SHOW_SOURCE_PORT (boolean)

The Boolean value to enable or disable the showing of the outgoing port of the connected user. The
default is off.
SILENT_MODE (bool)

The Boolean value that determines whether postmaster runs silently. If this option is set, postmaster
will automatically run in the background, and any controlling ttys are disassociated; thus, no messages
are written to stdout or stderr (the same effect as the postmaster's -S option). Unless some logging
system such as syslog is enabled, using this option is discouraged because it makes it impossible to see
error messages.

SORT_MEM (integer)

Specifies the amount of memory to be used by internal sorts and hashes before resorting to temporary
disk files. The value is specified in kilobytes and defaults to 512 kilobytes.

SQL_INHERITANCE (bool)

The Boolean value to determine whether subtables are included in queries by default. By default, 7.1 and
above include this capability; however, this was not the case in prior versions. If you need the old
behavior, you can set this variable to off.

SSL (boolean)

The Boolean value to enable or disable SSL connections. The default is off.

SYSLOG (integer)

The value that determines the postgres use of syslog for logging. If this option is set to 1, messages
go both to syslog and the standard output. A setting of 2 sends output only to syslog. The default is
0, which means syslog is off. To use syslog, the build of postgres must be configured with the
--enable-syslog option.

SYSLOG_FACILITY (string)

This option determines the syslog "facility" to be used when syslog is enabled. You can choose from
LOCAL0, LOCAL1, LOCAL2, LOCAL3, LOCAL4, LOCAL5, LOCAL6, LOCAL7; the default is LOCAL0.

SYSLOG_IDENT (string)

If logging to syslog is enabled, this option determines the program name used to identify PostgreSQL
messages in syslog log messages. The default is postgres.

TCPIP_SOCKET (boolean)

The Boolean value to enable or disable TCP/IP connections. It is off by default.

TRACE_NOTIFY (boolean)

The Boolean value to enable or disable debugging output for the LISTEN and NOTIFY commands. The
default is off.

UNIX_SOCKET_DIRECTORY (string)

Specifies the directory of the UNIX domain socket on which the postmaster is to listen for connections
from client applications. The default is normally /tmp.

UNIX_SOCKET_GROUP (string)

Sets the group owner of the UNIX domain socket.

UNIX_SOCKET_PERMISSIONS (integer)
Sets the access permissions of the UNIX domain socket. The default permissions are 0777, meaning
anyone can connect.

VIRTUAL_HOST (string)

Specifies the TCP/IP hostname or address on which the postmaster is to listen for connections from
client applications. Defaults to listening on all configured addresses (including localhost).

WAL_BUFFERS (integer)

The number of disk-page buffers in shared memory for the WAL log.

WAL_DEBUG (integer)

If nonzero, turn on WAL-related debugging output on standard error.

WAL_FILES (integer)

The number of log files created in advance at checkpoint time.

WAL_SYNC_METHOD (string)

The method used for forcing WAL updates out to disk. Possible values are FSYNC, FDATASYNC,
OPEN_SYNC, and OPEN_DATASYNC. Not all of these choices are available on all platforms.

All the preceding Boolean values will accept the following values:

TRUE Values FALSE Values

ON OFF

TRUE FALSE

YES NO

1 0
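Putting a few of these settings together, a minimal postgresql.conf fragment for 7.1 might look like this (all values are illustrative):

```
# Illustrative postgresql.conf fragment
max_connections = 64
shared_buffers = 128        # at least twice max_connections
sort_mem = 1024             # kilobytes for internal sorts/hashes
log_timestamp = on
syslog = 2                  # send log output to syslog only
```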

Location of the file:

RPM— /var/lib/pgsql/data/

Source— /usr/local/pgsql/data

/etc/logrotate.d/postgres

Description

Responsible for rotating log files.

Notes/Location
Although not an official part of the PostgreSQL distribution, many systems include a file to manage and
rotate log files produced by PostgreSQL.

These are usually cron jobs that are scheduled to run daily or weekly. These RPM additions are usually
configured to run PostgreSQL logging with syslog.

It is generally not advisable to attempt to rotate log files on an installation that is not configured
to use syslog. PostgreSQL keeps its connections to log files open at all times; therefore, rotating a
log file while postmaster is still active could result in unpredictable behavior.

If configuring PostgreSQL to run with syslog is not an option, the next best solution is to briefly stop
the postmaster service, rotate your log files, and then restart the database system.
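If syslog is not in use, the stop-rotate-restart approach can be sketched as follows (the paths and log file names are illustrative):

```
$ pg_ctl -D /usr/local/pgsql/data stop
$ mv pglog pglog.old
$ postmaster -D /usr/local/pgsql/data >pglog 2>&1 &
```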

For more information on PostgreSQL log files and syslog, see Chapter 9, "Databases and Log Files."

Location of the file:

RPM— /etc/logrotate.d/postgres

Note:
This file was subsequently dropped from the most recent RPM package due to the confusion
resulting from the syslog versus PostgreSQL log issues identified here. However, older RPM
packages that are specifically designed to work with syslog will include the preceding file in
the specified location.

pg_hba.conf

Description

The pg_hba.conf file is a configuration file that is responsible for host-based access control. Essentially,
this is a text file that details how users are permitted to connect to the PostgreSQL back end.

This file has separate areas that deal with local or remote (TCP/IP) users, databases allowed to connect,
and authentication methods.

The format of a PostgreSQL access control file differs depending on whether a TCP/IP or a local UNIX
connection is being specified. The basic formats are as follows:

TCP/IP:

Host DB IP Netmask Auth-Type [Auth-Args]

Local:

Local DB Auth-Type [Auth-Args]

Option Description

Host       Either host, hostssl, or local, depending on whether access is
           being specified for a standard TCP/IP connection, a secure SSL
           TCP/IP connection, or a local UNIX connection.

DB         The name of the database for which this access control list is
           valid. Use all to specify all databases or sameuser to specify
           that the user can only connect to databases of the same username.

IP         For TCP/IP connections, this specifies valid client IP addresses
           that can connect to the PostgreSQL back end.

Netmask    The network mask for the valid client machine.

Auth-Type  The authentication method to employ before granting access. This
           can be one of the following:

           trust     No authentication; trust this user.

           password  Match password supplied by host. By default, pg_shadow
                     is checked unless an alternative is supplied in the
                     Auth-Args section.

           crypt     Same as preceding, but the password is not sent as clear
                     text; it is encrypted before being transmitted.

           ident     Uses the ident protocol (RFC 1413). Usually a file named
                     pg_ident.conf exists that maps ident usernames to the
                     corresponding PostgreSQL usernames. (Not supported by
                     local connections.)

           krb4      Kerberos V4 is used. (Not supported by local connections.)

           krb5      Kerberos V5 is used. (Not supported by local connections.)

           reject    Deny connection attempt.

Auth-Args  Various arguments used by the authentication method specified.

Notes/Location

A sample pg_hba.conf file might appear as follows:

local all trust
host web 192.168.0.0 255.255.255.0 trust
host payroll 192.168.0.0 255.255.255.0 crypt

In this case, all local connections are permitted. Similarly, any connection from the 192.168.0.0/24
network is permitted to the web database. However, a user from that address block trying to connect
to the payroll database will need to authenticate using the crypt method.
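Because entries are checked in order and the first matching record is used, more specific lines must come first. For example, to shut out a single host while allowing the rest of the subnet (the addresses are illustrative):

```
host web 192.168.0.99 255.255.255.255 reject
host web 192.168.0.0 255.255.255.0 trust
```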

Location of the file:

RPM— /var/lib/pgsql/data/

Source— /usr/local/pgsql/data/
Library Files

The library files installed by PostgreSQL vary depending on a number of
factors; compile-time options, specific packages, versions, and ancillary
programs all determine what libraries are installed. Therefore, the following
is a listing of the most common libraries installed and their typical
locations. The symbol -> indicates a symbolic link, and the version numbers
have been replaced with "X," "Y," or "Z" to indicate major, minor, and patch
version numbers. Substitute as appropriate.

Source: /usr/local/pgsql/lib

RedHat: /usr/lib

Library files for ecpg:

libecpg.a
libecpg.so -> libecpg.so.X.Y.Z
libecpg.so.X -> libecpg.so.X.Y.Z
libecpg.so.X.Y.Z

Library files for simplified integration with the libpq library:

libpgeasy.a
libpgeasy.so -> libpgeasy.so.X.Y
libpgeasy.so.X -> libpgeasy.so.X.Y
libpgeasy.so.X.Y

Library files for the standard C programming interface:

libpq.a
libpq.so -> libpq.so.X.Y
libpq.so.X -> libpq.so.X.Y
libpq.so.X.Y

Library files for the C++ programming interface:

libpq++.a
libpq++.so -> libpq++.so.X.Y
libpq++.so.X -> libpq++.so.X.Y
libpq++.so.X.Y

Library files for the ODBC interface:

libpsqlodbc.a
libpsqlodbc.so -> libpsqlodbc.so.X.Y
libpsqlodbc.so.X -> libpsqlodbc.so.X.Y
libpsqlodbc.so.X.Y
Library files for the tcl interface:

libpgtcl.a
libpgtcl.so -> libpgtcl.so.X.Y
libpgtcl.so.X -> libpgtcl.so.X.Y
libpgtcl.so.X.Y
plpgsql.so

Library files for the Perl interface:

/usr/lib/perl5/site_perl/5.005/<arch>/Pg.so

Library files for the Python interface:

/usr/lib/python1.5/site-packages/_pgmodule.so

Library files for the PHP interface:

/usr/lib/php3/pgsql.so
/usr/lib/php4/pgsql.so
Chapter 9. Databases and Log Files
Depending on your specific installation, the location of your log and database files
will vary. Normally, they are located in the same directory as your base PostgreSQL
data files.

On a source-based system, this is usually /usr/local/pgsql/data, and on an
RPM-based system, it is usually /var/lib/pgsql/data. However, neither of these
locations is the "officially sanctioned" location. The Linux FHS (Filesystem
Hierarchy Standard) states that /var/log/pgsql or /var/log/postgres is the
proper location for the log files. For the following listings, $PGBASE refers
to the path where the base system is installed.
PostgreSQL Data Directory

Every PostgreSQL installation requires that a base data directory be specified.


Usually this is done with the initdb command during installation. This directory
contains a number of files. The following is a typical listing of the files present in the
default PostgreSQL data directory. (Note: These files may change from version to
version, but a close approximation of these will exist.)

File Description

$PGBASE/PG_VERSION        The file containing the version number of the
                          version of PostgreSQL that created this data
                          directory.

$PGBASE/base/             The directory that contains user-defined databases
                          and the default template1 database.

$PGBASE/base/template1    The default template that is used as a model for
                          all other user-created databases.

$PGBASE/pg_control        The internal control file for PostgreSQL, used for
                          keeping checkpoints of flushed transactions,
                          position tracking, and so on.

$PGBASE/pg_database       The internal control file for PostgreSQL that keeps
                          a record of every database created on the system.

$PGBASE/pg_geqo           The configuration file used to define Genetic Query
                          Optimizations (GEQO).

$PGBASE/pg_geqo.sample    A sample GEQO file.

$PGBASE/pg_group          The internal control file that stores group
                          information and user membership.

$PGBASE/pg_group_name_index   The index file for pg_group names.

$PGBASE/pg_group_sysid_index  The index file for pg_group system IDs (UIDs).

$PGBASE/pg_log            Maintains the current status of transactions,
                          either committed or uncommitted.

$PGBASE/pg_options        The configuration file used when postmaster is
                          started (might be listed as postmaster.opts).

$PGBASE/pg_pwd            The plain-text file containing usernames and
                          passwords.

$PGBASE/pg_shadow         The system catalog containing usernames, passwords,
                          and associated user rights.

$PGBASE/pg_variable       The internal control file used to store current
                          variable settings such as the next OID and so on.

$PGBASE/pg_xlog           The directory to house log files generated by WAL.
                          These files ensure database integrity through the
                          new Write-Ahead Logging (Version 7.1 feature).

$PGBASE/postmaster.opts   The configuration file used when postmaster is
                          started (might be listed as pg_options).

$PGBASE/postgresql.conf   The configuration file used when postmaster is
                          started (Version 7.1 feature).

You will notice that all user-defined databases are stored neatly in the
$PGBASE/base directory. Every database created in PostgreSQL is stored in its own
directory under $PGBASE/base. Within each directory are two main classes of files:
system catalogs and user-created.

Note:
Some changes were made to these file locations starting in Version 7.1. In
particular, there now exists a template0 database that is a read-only copy of
the template1 database. Additionally, many of the preceding files are now
named according to their OID number; this change was made to facilitate
the new Write-Ahead Logging (WAL) implementation. Refer to the latest
documentation included with your system for more information.

System Catalogs

Every time a new database is created, PostgreSQL extracts a base set of system
catalogs from the template1 database. These files are used to track tables,
indexes, aggregates, operators, data types, and so on.

A basic set of system catalogs should look something like the following. (Different
versions contain different catalog files, but this is a representational sample.)

File                      Description

$PGBASE/pg_aggregate      Contains definitions of aggregate functions.

$PGBASE/pg_attrdef        Contains the default value of columns that indicate
                          use of a default value condition.

$PGBASE/pg_attribute      Contains one row for every column in every table,
                          describing its attributes (that is, name, data
                          type, and so on).

$PGBASE/pg_class          Contains information on all classes (tables,
                          indexes, views, and so on).

$PGBASE/pg_database       Shared with the entire cluster; contains
                          information on available databases.

$PGBASE/pg_description    Holds comments created with the COMMENT SQL
                          command.

$PGBASE/pg_group          Defines groups and membership.

$PGBASE/pg_index          Contains information on all defined indexes.

$PGBASE/pg_inherits       Contains information on table inheritance.

$PGBASE/pg_language       Registers the call interfaces for available
                          PostgreSQL languages.

$PGBASE/pg_operator       Contains one row for every operator type in the
                          database.

$PGBASE/pg_proc           Contains information on all defined functions.

$PGBASE/pg_relcheck       Stores information on check constraints.

$PGBASE/pg_shadow         Stores user information, passwords, and effective
                          user rights.

$PGBASE/pg_type           Stores information on all available data types in
                          the database.

It is important to realize that these objects are accessible (from a DBA account) via
a standard SQL interface. From psql, for instance, these system catalogs can be
called like normal SQL tables.
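For instance, a DBA can query a catalog directly from psql. (The pg_shadow column names shown here are from the 7.x catalogs; other versions may differ.)

```sql
SELECT usename, usesuper, usecreatedb FROM pg_shadow;
```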

Warning!

Viewing information is fine, but be careful about making any changes.
Your database could quickly become unusable if the wrong changes are
made.

User-Defined Catalogs

This directory also contains the names of any user-defined tables, indexes,
sequences, and so on. For instance, looking in the directory for your newriders
database, you see the following:

$PGBASE/authors           The authors table.

$PGBASE/auth_idx          The index file for the authors table.

$PGBASE/payroll           The payroll table.

$PGBASE/payroll_idx       The payroll index file.

$PGBASE/next_check_seq    Custom-created sequence to compute the next check
                          number.
Log Files

By default, the postmaster process sends logging information to stdout.
However, it is common to redirect output to a specified log file.

>postmaster -D /usr/local/pgsql/data >pglog 2>&1 &

This will redirect the stdout of the postmaster process to the file named pglog
and will also redirect the stderr facility of postmaster to stdout (which is then
itself redirected to the specified log file).

This type of arrangement is convenient but presents some long-term problems:

Log files tend to grow very large if untended.

The system log files and the database log files would reside in separate areas.
This can make debugging system failures problematic.

It is difficult to redirect these log files to an external "logging" server
designed for this purpose.

Depending on the size of your database, the frequency of its use, and your
networking architecture, this might be a fine solution. However, there are two
methods for dealing with the problems previously presented:

Implement a custom log-rotation solution.

Configure PostgreSQL to use syslog.

Customized Log Rotation

A customized log-rotation solution is possible if some amount of downtime is
permitted. This is due to the fact that the postmaster process must be stopped for
the logs to be rotated.

This type of solution usually mandates the use of cron and shell scripting. The
process usually occurs like this:

1. At a specified hour (usually late at night), postmaster is stopped.

2. The log-rotation script is run.

3. After a few minutes of delay, postmaster is restarted.

Typically, this can be completed in one or two minutes, depending of course on the

system hardware.
Configuring cron to perform these tasks is outside the scope of this book, but
generally it is a straightforward process.

The script to handle the actual log rotation can be done either as a simple shell
script or in a language like Perl or Python. A typical rotation scheme usually
renames (using mv) files, keeping only a specific amount of history. For instance:

logfile (current) -> logfile.1
logfile.1         -> logfile.2
logfile.2         -> logfile.3
logfile.3         -> logfile.4
logfile.4         -> logfile.5
logfile.5         -> /dev/null

Typically, these events are run daily or weekly; however, for heavily used systems, a
more frequent schedule could be advisable.
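The rotation scheme above can be sketched as a small shell script suitable for running from cron. This is only a sketch: the stop/start commands are shown as comments because their paths are installation-specific, and the pglog file name is an assumption.

```shell
#!/bin/sh
# Sketch of a log-rotation script for the postmaster log file.
# Before rotating, stop the server, for example:
#   pg_ctl stop -D /usr/local/pgsql/data

rotate_logs() {
    logfile=$1    # path of the current log file
    keep=$2       # number of old logs to keep
    i=$keep
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        # Push each historical log back one slot; the oldest falls off.
        if [ -f "$logfile.$prev" ]; then
            mv "$logfile.$prev" "$logfile.$i"
        fi
        i=$prev
    done
    # The current log becomes .1.
    if [ -f "$logfile" ]; then
        mv "$logfile" "$logfile.1"
    fi
}

# Demonstration on throwaway files:
dir=$(mktemp -d)
echo "old entry"     > "$dir/pglog.1"
echo "current entry" > "$dir/pglog"
rotate_logs "$dir/pglog" 5
ls "$dir"    # pglog.1 (the former pglog) and pglog.2 (the former pglog.1)

# Afterward, restart the server with a fresh log, for example:
#   pg_ctl start -D /usr/local/pgsql/data -l /usr/local/pgsql/pglog
```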

Configuring PostgreSQL to Use syslog

The preceding approach would be fine for most PostgreSQL server installations;
however, a number of issues are still not addressed by this example, namely:

Log files are not integrated with other system log files (making debugging
more difficult).

It is still relatively difficult to redirect log files to an external logging system.

The solution is to use the syslog facility present on most UNIX (Linux) systems.

Generally, configuring PostgreSQL to use syslog requires three steps:

1. Compile the source using the appropriate options to enable logging (that is, --
enable-syslog) or download the appropriate RPMs that have that
functionality enabled.

2. Enable the syslog option in the pg_options (or equivalent) file (that is,
syslog=1).

3. Edit the /etc/syslog.conf file to correctly capture PostgreSQL syslogging


calls, like this:

local0.* /var/log/postgresql

Using syslog is suggested for larger installations or when remote monitoring of the
database system is a priority.
Chapter 10. Common Administrative Tasks
The administration duties for dealing with a PostgreSQL system can be broken down
as follows:

Compilation and installation

Creating users

Assigning rights to users

Performing regular database maintenance

Performing database backups and restores

Performance tuning

Each of these issues requires specific knowledge of key areas of the system. Please
refer to the following sections that focus on these areas.
Compiling and Installation

A detailed description of compilation and installation, with all included options, is
outside the scope of this book. However, the two most popular forms of installation
will be covered: source-based and package-based installs.

Source-Based Installation

The source files can be retrieved from the PostgreSQL FTP site
(ftp.postgresql.org) or from numerous mirror sites around the world.

Once the file is downloaded, it will probably be in a tarred-gzipped format. In order
to compile, it first must be unpacked. Move the file to a clean directory (for
example, /usr/src/postgres) and issue the following command:

>tar xzf postgresql-7.1.tar.gz

After the code is unpacked, you can delete the original tar.gz file if disk space is
an issue; otherwise, move it to a safe location.

Next, review the INSTALL text file included in the created directory for installation
notes. Briefly, the rest of the procedure is as follows:

1. Create a user account to serve as the DBA's account (postgres is a popular
choice). You can do this using userconf, useradd, or whatever tool your
system provides for user management.

2. Review the installation options for your system. Here is a partial list of options
supported (type ./configure --help for a full list):

--prefix=BASEDIR (where BASEDIR is the path of choice)

--enable-locale

--enable-multibyte (to include support for multibyte characters like
Chinese and so on)

--enable-syslog (turn on syslog feature)

--enable-assert (enable assert checking; debug feature)

--enable-debug (compile with debugging flags on)

--with-perl (include Perl interface support)

--with-tcl (include tcl interface support)

--with-odbc (include ODBC drivers)

3. Configure the source code with those options selected (for example,
./configure --with-odbc).

4. Type make (or gmake) to build binaries.

5. If the make fails, examine the log files generated (usually in ./config.log)
for any reasons why the compile didn't work.

6. Type make install to install the binaries to the location specified (default is
/usr/local/pgsql).

7. Tell your machine where the libraries are located, either by setting the
LD_LIBRARY_PATH environmental variable to the <BASEDIR>/lib path or
by editing the /etc/ld.so.conf file to include it.

8. Include the <BASEDIR>/bin path in the user's or system's search path (that
is, /etc/profile).

9. Create the directory to hold the databases, change the ownership to the DBA,
and initialize the location (assumes a user named postgres exists):

# mkdir /usr/local/pgsql/data
# chown postgres /usr/local/pgsql/data
# su - postgres
> /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data

10. Start the postmaster server (as the DBA account) in the background.
Specify the data directory previously created, such as:

>/usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data &

11. As DBA, create the users you need using the createuser command.

12. Switch to the user created and create the database(s) needed (that is,
createdb).

Error Message Routing

The actions taken in step 10 will result in the routing of error messages to
the terminal where the command was executed. To route error messages
to a log file, append >>server.log 2>&1 & to the postmaster
command. Refer to the INSTALL notes for more information.

Package-Based Installation

Essentially, package-based installation (for example, RPM or DEB) automates the
preceding process. It is still a good idea, however, to read the process outlined in
the preceding section so that you at least understand what things the package will
be doing to your system.
Depending on the package management system installed on your machine, the
commands will be different. The following assumes you have RPM-based package
management tools, but the Debian package management system is very similar in
concept:

1. Download the RPM files that you require (ftp.postgresql.org/pub/binary).

(examples)
postgresql-server-7.0.3-2.i386.rpm Server programs (req)
postgresql-7.0.3-2.i386.rpm Clients & Utilities (req)
postgresql-devel-7.0.3-2.i386.rpm Development Libraries
postgresql-odbc-7.0.3-2.i386.rpm ODBC Libraries
postgresql-perl-7.0.3-2.i386.rpm Perl interface
postgresql-python-7.0.3-2.i386.rpm Python interface
postgresql-tcl-7.0.3-2.i386.rpm TCL Interface
postgresql-tk-7.0.3-2.i386.rpm Tk Interface
postgresql-test-7.0.3-2.i386.rpm Regression Test Routines

2. Install the files.

3. Verify that a user for PostgreSQL was created by examining /etc/passwd (or
the equivalent).

4. Switch to the DBA user account (typically postgres) and create the users you
need (for example, createuser web).

Switch to that user and create the working database (for example, createdb
website).
Creating Users

Database users are separate entities from regular operating system users.
Depending on the particular application, it might be possible to have only one or two
total database users. However, if multiple users need to connect to the database—
each with his or her own set of access rights—it is desirable to create individual user
accounts.

The easiest way to create users is to utilize the command-line utility createuser.

There are three main attributes to consider when creating new users:

Should they be able to create their own users? (Are they superusers?)

Should they be able to create their own databases?

Is authentication required? If so, what type and what password?

The actual act of creating users can take place either at the command line or in an
interactive SQL session.

To create a user from the command line:

>createuser web
>Shall the new user be allowed to create databases (y/n)? N
>Shall the new user be allowed to create users (y/n)? N

Or, alternatively, from a SQL session:

psql=> CREATE USER web NOCREATEDB NOCREATEUSER;
CREATE

Creating a user from a SQL session enables some additional options that are not
available from the command-line utility. For instance, passwords, group
membership, and account expiration can all be set from this method.
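A minimal sketch of those extra options follows; the password, group, and expiration date are purely illustrative, and the IN GROUP clause assumes the group already exists.

```sql
CREATE USER web WITH PASSWORD 'secret'
    IN GROUP webusers
    VALID UNTIL '2002-01-01';
```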

Additionally, PostgreSQL enables users to be collected into logical groups for easier
permission management. To create a group, the following command should be
entered in a SQL session:

CREATE GROUP webusers;

Then users can be added or removed from the group, as follows:

ALTER GROUP webusers ADD USER bill, mary, amy, jane;


ALTER GROUP webusers DROP USER mary;
Granting User Rights

There are four basic rights in the PostgreSQL database system:

Select (read)

Insert (write)

Update/Delete (write)

Rule (write/execute)

By default, the creator of a database is implicitly given all rights to all objects in the
database. These privileges are considered immutable for the DBA superuser
account.

To assign other users rights on database options, use the GRANT and REVOKE SQL
commands, such as:

GRANT SELECT, UPDATE ON authors TO bill;


REVOKE ALL ON payroll FROM joe;

The PostgreSQL system also has a reserved keyword called PUBLIC that applies to
every user in the system (except the DBA). This can make setting blanket rules and
permissions much easier.

GRANT SELECT, UPDATE ON authors TO PUBLIC;


REVOKE UPDATE, DELETE on payroll FROM PUBLIC;

Specifying permissions on a user-by-user basis can be tedious on systems with a
sizable number of user accounts. Using GRANT and REVOKE in combination with
groups can be an effective method for handling rights management.

Typically, the users should be collected into logical groups that seek to match like
users together with respect to their permissions. Rights then can be assigned or
revoked for the entire group without having to specify every individual user.

GRANT SELECT, UPDATE ON authors TO GROUP staff;


REVOKE UPDATE, DELETE on payroll FROM GROUP staff;
GRANT SELECT, UPDATE, DELETE on payroll TO GROUP managers;
Database Maintenance

Proper database maintenance ensures that the system will always function optimally
and that problems can be handled effectively. There are three main areas of
database maintenance:

Monitoring log files.

Scheduling regular VACUUM and VACUUM ANALYZE events.

Routinely backing up.

Regular monitoring of log files can tip off administrators to potential issues that can
be corrected long before they become major problems. Some administrators even
write small custom scripts to parse log files and automatically mail any suspicious
entries to an email address so that the administrator can take further action.

cron is also a useful database maintenance tool, especially for performing routine
tasks like vacuumdb and log rotation. One of the primary reasons the vacuumdb
utility was created as a separate command-line utility was to facilitate its use as an
automated cron job.
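For instance, a nightly crontab entry for the postgres user might look like the following; the schedule, path, and database name are placeholders to adjust for your installation.

```
# min hour day month weekday  command
30 3 * * * /usr/local/pgsql/bin/vacuumdb --analyze newriders
```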
Database Backup/Restore

The most critical component of any database maintenance plan is the database
backup and restore procedures. Once again, PostgreSQL makes the administrator's
job easier by providing command-line tools such as pg_dump, pg_dumpall, and
pg_restore. Like commands such as vacuumdb, these are especially suited to be
run as cron jobs.

By default, pg_dump and pg_dumpall simply dump all output to stdout.
However, this can easily be redirected to files using the appropriate UNIX redirection
symbols.

>pg_dump newriders > nr.backup

After the command has redirected its output to a standard OS file, standard backup
tools can be used to securely archive it.

Here are some factors to consider when trying to evaluate an optimal backup plan.

Does the entire system need to be backed up or just a specific database?

If only one database is needed, the pg_dump command should suffice. If an
entire cluster of databases is needed, however, use the pg_dumpall
command.

The two commands function almost identically; however, pg_dumpall is not
capable of authenticating itself to every database it dumps. To run pg_dumpall
against back ends that require authentication, set the environmental variable
PGPASSWORD to the proper password, and that will be automatically relayed
on each attempted connection.

Will you need to selectively restore database files (that is, specific tables and
so on)?

Version 7.1 of PostgreSQL made some improvements to the pg_dump,
pg_dumpall, and pg_restore commands. These commands allow the
database dump to be stored in a special format.

This new format provides a great deal of flexibility when it comes time to
restore. The database schema, data, functions, or specific tables can be
selectively restored. Additionally, this new format stores data in a compressed
format that causes fewer problems when dealing with very large databases.

For instance, to dump out the newriders database in the special format and
then selectively restore only the payroll table:

>pg_dump -Fc newriders > nr.backup.cust_fmt


>pg_restore -d newriders -t payroll nr.backup.cust_fmt
What will the resultant size of the dump file be?

Many operating systems (like certain versions of Linux) have restrictions on
the maximum allowable size of a single file (such as 2GB). Therefore, on a
large database system, this could be problematic.

There are a number of answers to this problem, ranging from upgrading your
PostgreSQL database system to piping output through special tools.

As previously mentioned, PostgreSQL Version 7.1 introduced some new
features to the pg_dump and pg_dumpall commands, which include the use
of a new custom dump format. These additions include a new command-line
option that indicates the compression level desired.

For instance, to dump the newriders database at the maximum compression
level possible (at the expense of speed):

>pg_dump -Fc -z9 newriders > nr.backup.cust_fmt

Alternatively, another method for achieving the same effect is to pipe the
output of pg_dump through the gzip command. This can be done using any
version of PostgreSQL and standard UNIX system commands:

>pg_dump newriders | gzip > nr.backup.zip

And it can be restored by the following:

>gzip -dc nr.backup.zip | psql newriders

If the resulting zipped files are still too large, the other option is to use the
split command. This example will split the output file into numerous 1GB
files.

>pg_dump newriders | split -b 1024m - nr.backup

And it can be restored by the following:

>cat nr.backup.* | psql newriders

Are your configuration files, such as pg_options, pg_hba.conf, and
pg_pwd, being regularly archived?

For complex installations, lost configuration files can be very time consuming
to try to re-create by hand. Make sure you have secure, offline copies of all
your PostgreSQL configuration files.
Performance Tuning

Generally speaking, there are no surefire methods for obtaining optimal performance from a
database system. However, there are guidelines that can assist administrators in implementing
successful performance-tuning strategies.

Hardware Considerations

If you notice that your database system is consistently running at high CPU loads or that an
excessive amount of hard-disk paging is occurring, it might be necessary to upgrade your
hardware.

These are the four biggest hardware issues related to database performance:

RAM. Not enough RAM will result in the database constantly having to swap memory to
hard disk. This expensive and time-consuming operation always incurs a performance hit.

Hard disk. Slow hard drives and controllers can result in a severe lack of performance.
Upgrading to newer controllers and/or drives can result in a significant boost in system
speed. Particularly, the use of striped RAID arrays can benefit system performance.

CPU. Insufficient CPU resources can slow down system responsiveness, particularly if many
large queries are being processed simultaneously. Because PostgreSQL is not
multithreaded, there is no direct benefit to be gained by running it on a multi-CPU system.
However, each connection receives its own process, and those processes can be spread
across multiple CPUs.

Network. No matter how robust the system's hardware, performance will suffer if there are
networking problems. Upgrading networking cards, adding switches to the LAN, and
increasing bandwidth capacity can all positively impact system performance.

Tuning SQL Code

Although it is common to blame hardware for database sluggishness, more often it is the
underlying SQL code that can be tuned to improve performance.

Some general rules can be followed to help tune SQL database code:

Have indexes on commonly queried fields.

Any fields in which joins are being done or that are the focus of numerous SELECT…WHERE
clauses should be indexed. However, there is a balance to strike between the number of
indexes on a field and performance. Indexes help with selection but penalize insertion or
updates. So having every field indexed is not a good idea.

Use explicit transactions.

If numerous tables are being updated or inserted, encapsulating the statements inside of
one BEGIN…COMMIT clause can significantly improve performance.

Use cursors.

Using cursors can dramatically improve system performance. In particular, using cursors to
generate lists for user selection can be much more efficient than running numerous isolated
queries.
Limit use of triggers and rules.

Although triggers and rules are an important part of data integrity, overuse will severely
impact system performance.

Use explicit JOINs.

Starting with PostgreSQL Version 7.1, it is possible to control how the query planner will
operate by using an explicit JOIN syntax. For instance, both of the following queries
produce the same results, but the second unambiguously gives the query planner the order
to proceed:

SELECT * FROM x,y,z WHERE x.name=y.name AND y.age=z.age;


SELECT * FROM x JOIN (y JOIN z ON (y.age=z.age)) ON (x.name=y.name);

Enforce logic mechanisms on the front end.

Enforcing some minimal standards on the front end of a database application can improve
overall system performance. Checking input fields for valid and/or minimal information
requirements can obviate the need to do tremendously expensive queries. For instance,
enforcing that a front end requires more than three letters on a last name will prevent the
back end from having to process a query to return all records in which the last name begins
with an "S," which could be a very expensive query and not provide any real value to the
user.
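As an illustration of the explicit-transaction guideline above, related write statements can be wrapped in a single BEGIN…COMMIT block so they succeed or fail together. The table and column names here are illustrative only.

```sql
BEGIN;
UPDATE authors SET age = age + 1 WHERE name = 'bill';
INSERT INTO payroll (name, amount) VALUES ('bill', 500);
COMMIT;
```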

Buffer Size and Other Considerations

PostgreSQL comes with certain default or preset settings with regard to buffer size, simultaneous
connections, and sort memory. Usually these settings are fine for a standalone database.
However, they usually are set cautiously low to make as little impact on the system as possible
while idle.

For larger, dedicated servers with several hundred or thousand megabytes of data, these settings
will need to be adjusted.

It is often assumed that setting the options to higher values will automatically improve
performance. Generally, you should not exceed more than 20% of your system limits with any of
these settings. It is important to leave sufficient RAM for kernel needs; a sufficient amount of
memory particularly needs to be available to handle network connections, manage virtual
memory, and control scheduling and process management. Without such tolerances, performance
and responsiveness of the system will be negatively impacted.

There are three crucial run-time settings that impact database performance: shared buffers, sort
memory, and simultaneous connections.

Shared Buffers

The shared buffer option (-B) determines how much RAM is made available to all of the server
processes. Minimally, it should be set to at least twice the number of simultaneous connections
allowed.

Shared buffers can be set either in the postgresql.conf file or by issuing a direct command-
line option to the postmaster back end. By default, many PostgreSQL installations come with a
preset value of 64 for this setting. Each buffer consumes 8KB of system RAM. Therefore, in a
default setting, 512KB of RAM is dedicated for shared buffers.
If you are setting up a dedicated database that is expected to handle very large datasets or
numerous simultaneous connections, it might need to be set as high as 15% of system RAM. For
instance, on a machine with 512MB of RAM, that means a shared buffer setting of 9,000.
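As a quick sanity check on that figure, the arithmetic can be done in the shell (each buffer is 8KB):

```shell
# 15% of 512MB of RAM, expressed as 8KB shared buffers
ram_kb=$((512 * 1024))            # 512MB in KB
target_kb=$((ram_kb * 15 / 100))  # 15% of RAM
buffers=$((target_kb / 8))        # 8KB per buffer
echo "$buffers buffers (about $((buffers * 8 / 1024))MB)"
# prints: 9830 buffers (about 76MB)
```

So the book's round figure of 9,000 buffers (roughly 70MB) is in the right neighborhood.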

Ideally, buffer space should be large enough to hold the most commonly accessed table
completely in memory. Yet it should be small enough to avoid swap (page-in) activity from the
kernel.

Sort Memory

The postgres back end (which typically is only called by the postmaster process) has a
setting (-S) that determines how much memory is made available to query sorts. This value
determines how much physical RAM is exhausted before resorting to disk space, while trying to
process sorts or hash-related functions.

This setting is declared in KB, with the standard default being 512.

For complex queries, many sorts and hashes might be running in parallel, and each one will be
allowed to use this much memory before swapping to hard disk begins. This is an important point
to stress: if you blindly set this value to 4,096, every complex query and sort would be
allowed to take as much as 4MB of RAM. Depending on your machine's available resources, this
might cause the virtual memory subsystem of your kernel to swap this memory out.
Unfortunately, this is usually a much slower process than just allowing PostgreSQL to create
temporary files in the first place.

Simultaneous Connections

There is a postmaster option (-N) that will set the number of concurrent connections that
PostgreSQL will accept. By default, this setting is set to 32. However, it can be set as high as
1,024 connections. (Remember that shared buffers need to be set to at least twice this number.)
Also remember that PostgreSQL does not run multithreaded (yet); therefore, every connection
will spawn a new process. On some systems, like UNIX, this poses no significant problems.
However, on NT, this can often become an issue.
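Taken together, these three settings can also be adjusted in the postgresql.conf file on a Version 7.1 system. The values below are illustrative only, not recommendations:

```
shared_buffers = 9000     # 8KB each; at least twice max_connections
sort_mem = 512            # KB available per sort/hash operation
max_connections = 64
```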

Optimizing Queries with EXPLAIN

The EXPLAIN command describes the query plan being evaluated for the supplied query. It
returns the following information:

Starting cost. This is an estimation of how much time elapsed before an output scan
began. Typically, this number will be nonzero if it was waiting on another query to complete
before it could begin; such is the case with subselects and joins.

Total cost. This is an estimation of how much time would be spent if all rows were
returned. This occurs regardless of whether any other factors, like a LIMIT statement,
would've prevented all rows from being returned.

Output rows. This is the estimated number of rows returned. As in the preceding, this
happens even if factors like LIMIT statements would prevent it.

Estimated average width. This is the width, in bytes, of the average row.

The time units previously mentioned are not related to an objective amount of time; they are an
indication of how many disk page fetches would be needed to complete the request.
For example:

EXPLAIN SELECT * FROM authors;

NOTICE: QUERY PLAN


Seq Scan on authors (cost=0.00..92.10 rows=5510 width=20)

The preceding EXPLAIN statement lists the following estimations:

Using a sequential scan (as opposed to an index).

No delay on start time.

A cost of 92.10 for delivering the entire query.

An estimated 5,510 rows will be returned.

Average width of a row is 20 bytes.

Modifying the query will produce different results:

EXPLAIN SELECT * FROM authors WHERE age<10000;

NOTICE: QUERY PLAN


Seq Scan on authors (cost=0.00..102.50 rows=5510 width=20)

In this example, you can see that the total cost increased slightly. It is interesting to note that
although there is an index on age on this table, the query planner is still using a sequential scan.
This is due to the fact that the search criterion is so broad; an index scan would not be of any
benefit. (Obviously, all values in the age column are less than 10,000.)

If you constrain the search criterion slightly more, you can see some changes:

EXPLAIN SELECT * FROM authors WHERE age<75;

NOTICE: QUERY PLAN


Seq Scan on authors (cost=0.00..102.50 rows=5332 width=20)

Again, the query planner is still using a sequential scan, although the number of rows returned
is now lower. Further constraints can produce results that are more dramatic:

EXPLAIN SELECT * FROM authors WHERE age<30;

NOTICE: QUERY PLAN


Index Scan using age_idx on authors
(cost=0.00..32.20 rows=991 width=20)

A number of things are interesting about this result. First, you have finally constrained the
criterion enough to force the query planner to make use of the age_idx index. Second, both the
total cost and the number of returned rows are dramatically reduced.

Finally, let's try the following:

EXPLAIN SELECT * FROM authors WHERE age=27;


NOTICE: QUERY PLAN
Index Scan using age_idx on authors
(cost=0.00..3.80 rows=71 width=20)

You can see the tremendous speed gain you were able to achieve by using such a limited
criterion.

Using EXPLAIN on more complex queries can sometimes illuminate potential problems with the
underlying database structure.

EXPLAIN SELECT * FROM authors, payroll WHERE


authors.name=payroll.name;

NOTICE: QUERY PLAN

Merge Join (cost=69.83..425.08 rows=85134 width=36)


->Index Scan using name_idx on authors
(cost=0.00..273.80 rows=5510 width=20)
->Sort (cost=69.83..69.83 rows=1000 width=16)
->Seq Scan on payroll
(cost=0.00..20.00 rows=1000 width=16)

This output produces some interesting facts about the underlying database structure. Obviously,
the authors table has an index on name, but the payroll table appears to be resorting to
using sequential scans and sorts to match fields.

After investigating, you determine that the payroll table does not have an
appropriate index for this join. So, after an index is created, you get the following results:

CREATE INDEX pr_name_idx ON payroll(name);

EXPLAIN SELECT * FROM authors, payroll WHERE


authors.name=payroll.name;

NOTICE: QUERY PLAN

Merge Join (cost=0.00..350.08 rows=44134 width=36)


->Index Scan using name_idx on authors
(cost=0.00..273.86 rows=5510 width=20)
->Index Scan using pr_name_idx on payroll
(cost=0.00..29.50 rows=500 width=16)

By including an index on the payroll table, you have reduced the estimated cost of this query by
roughly 20% (from 425.08 to 350.08).

Running EXPLAIN on your queries is a good way to uncover hidden bottlenecks that are
impacting system performance.

In fact, for non-hardware–related problems, the EXPLAIN command is probably the single best
tool that a DBA can use to solve performance problems. EXPLAIN provides the information
necessary to intelligently allocate system resources, such as shared buffers, and optimize your
queries and indexes for greater performance.
One of the best ways to use the EXPLAIN command is as a benchmark generation tool. This way,
when changes are made to table schema, indexes, hardware, or the operating system, a valid
comparison can be made to determine how much these changes affected system performance.
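As a sketch of that benchmarking workflow, you might capture the plan for a representative query before and after a change (the table and index names here are the hypothetical ones used earlier in this chapter):

```sql
-- Record the baseline plan for a representative query
EXPLAIN SELECT * FROM authors WHERE age < 30;

-- Make the change under test (hypothetical index)
CREATE INDEX age_idx ON authors (age);

-- Re-run the identical EXPLAIN and compare the estimated costs
EXPLAIN SELECT * FROM authors WHERE age < 30;
```

Saving the two NOTICE outputs side by side gives a concrete before/after record of how the change affected the planner's estimates.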
Part IV: Programming with PostgreSQL


11 Server-Side Programming

12 Creating Custom Functions

13 Client-Side Programming

14 Advanced PostgreSQL Programming


Chapter 11. Server-Side Programming
In general, there are two ways to program in PostgreSQL: using one of the
internally declared procedural languages (PLs) or using one of the externally
available application programming interfaces (APIs). The most basic difference
between the two approaches is that procedural languages act as a server-side
option, and APIs are used for client-side access.

Server-side programming is code that is actually written, contained, and executed


inside the PostgreSQL back-end system. Typically, this code is meant to extend the
functionality of the base system and to enable other queries or SQL statements to
have access to these customizations.

Client-side programming is used to enable applications that reside outside of the


PostgreSQL back end to insert, manipulate, or retrieve data that resides within the
PostgreSQL database engine.

The choice of language and approach depends heavily on several factors. Procedural
language programming is less complex and therefore enables a faster development
cycle. Procedural language programming is the preferred method for performing
common extensions to the base PostgreSQL system, and as such, it assists in
maximizing code reuse.

Utilizing the external APIs is appropriate when fine-grained control and/or speed of
execution is required. Moreover, utilizing the externally available APIs might be the
only way to interface custom applications with the back-end system in specific
cases.
Benefits of Procedural Languages

This chapter specifically focuses on using the procedural languages (PL) available in
PostgreSQL. Regardless of the specific PL chosen, they all share a common set of
advantages to the developer. Those advantages include the following:

Extensibility. Utilizing the internal PLs enables developers to quickly create


custom functions, triggers, and rules to add functionality not already present
in the base system. Moreover, once these extensions are enabled, they
become available to other SQL statements present in the system.

Control structures. By default, the SQL language does not allow the
programmer to use the rich set of control structures and conditional
evaluations included in other common programming languages. For this
reason, the included PLs allow a developer to marry such traditional control
structures with the SQL language. This is particularly useful when creating
complex computations and triggers.

Productivity and compatibility. By using the included PostgreSQL PLs, the


developer can have access to all the included data types, operators, and
functions already present in the base system. This can significantly increase
productivity because the programmer does not need to re-create common
elements already defined in PostgreSQL in his or her own custom code.
Additionally, the developer can have a high level of assurance that the
returned data types and comparison results will be compatible with the
PostgreSQL back end.

Security. The included PostgreSQL PLs are trusted by the back-end system
and only have access to a limited set of system-wide functions. In particular,
the included PLs operate, on a system level, with the same permissions
granted to the base postgres user. As a result, file system objects outside the
database remain safe from any errant code.
Installing Procedural Languages

In a default installation, PostgreSQL will automatically include the capability for the
system to access code written in the PL/pgSQL language. Both PL/Tcl and PL/Perl
can also be included by setting their respective compile-time options (that is, --
with-tcl or --with-perl).

To enable a specific PL after the system is in production, however, it is necessary to


define it through several steps. There are two methods for accomplishing this, either
through the explicit declaration of SQL statements or by using the createlang
system utility. The createlang utility helps automate many of the steps needed in
the manual creation of a new language and is generally the preferred method. Both
methods are covered in the following sections.

SQL Declaration

The location of the shared library, which acts as a handler, might vary from
installation to installation. RPM-based installations usually place it in the
/usr/lib/pgsql directory. Source-based installations will depend on what path
was supplied to the install script. The UNIX find command can always be used with
great effect in these situations.

Adding a new procedural language to PostgreSQL using explicit SQL statements is


accomplished as follows:

1. Compile the shared object (for example, plpgsql.so) by referring to the


documentation included with the source code of the object. Once compiled,
copy the object to the appropriate library directory.

2. Declare the handler in PostgreSQL by using the CREATE FUNCTION clause


(refer to the "CREATE FUNCTION" section in Chapter 1, "PostgreSQL SQL
Reference," for syntax specifics).

For instance:

CREATE FUNCTION plpgsql_call_handler()


RETURNS OPAQUE AS
'/usr/local/pgsql/lib/plpgsql.so' LANGUAGE 'C';

3. Create the language as a trusted language by using the CREATE LANGUAGE


clause (refer to the "CREATE LANGUAGE" section in Chapter 1 for syntax
specifics).

For instance:

CREATE TRUSTED PROCEDURAL LANGUAGE 'plpgsql'


HANDLER plpgsql_call_handler
LANCOMPILER 'PL/pgSQL';
Using the createlang Utility

Alternatively, languages included in the base PostgreSQL system (such as PL/Tcl)


can be defined by utilizing the createlang utility. This utility simplifies many of the
preceding steps. (Refer to the createlang command in Chapter 6, "User Executable
Files," for specific options.)

The createlang utility can be used currently to register either the PL/Tcl or PL/Perl
language with the back-end server. Moreover, the createlang utility also accepts
an option to declare what database the language is registered in. If a language is
registered in the template1 database, then that language will be available in all
future databases subsequently created.

For instance:

>createlang pltcl template1

This command will automatically register the PL/Tcl language in the template1
database and in all subsequent databases.
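To verify which languages are currently registered in a given database, you can query the pg_language system catalog directly; this is a quick sanity check after running createlang:

```sql
-- lanname is the language name; lanpltrusted indicates a trusted language
SELECT lanname, lanpltrusted FROM pg_language;
```

A row for plpgsql (or pltcl) in the output confirms that the language is available to CREATE FUNCTION statements in that database.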
PL/pgSQL

The PL/pgSQL language is the default language typically used to perform server-side programming. It combines
the ease of SQL with the power of a scripting language.

With PL/pgSQL, it is possible to build custom functions, operators, and triggers. A standard use might be to
incorporate commonly called queries inside the database. Many RDBMSs refer to this as stored-procedures, and
it offers a way for client applications to quickly request specific database services without the need for a lengthy
communication transaction to occur. The overhead involved in establishing a conversation between a client and
server machine can often significantly slow down the apparent speed of the system.

When a PL/pgSQL-based function is created, it is compiled internally as byte code. The resultant near-binary
code is then executed each time the function is called. PostgreSQL will execute the PL/pgSQL compiled code
rather than having to reinterpret individual SQL commands. Therefore, this can result in a significant
performance increase compared to reissuing the same SQL commands time after time.

Another benefit of using PL/pgSQL is when portability is an issue. Because PL/pgSQL is executed entirely within
the PostgreSQL system, this means that PL/pgSQL code can be run on any system running PostgreSQL.

PL/pgSQL Language Specifics

The basic structure of PL/pgSQL code is as follows:

<label declaration>
[DECLARE
…Statements… ]
BEGIN
…Statements…
END;

Any number of these blocks can be encapsulated inside each other, for instance:

<label declaration>
[DECLARE
…Statements… ]
BEGIN
[DECLARE
…Statements… ]
BEGIN
…Statements…
END;
…Statements…
END;

When PostgreSQL encounters multiple nested groups of DECLARE…BEGIN…END statements, it interprets all variables as
local to their respective group. A variable declared in an inner group shadows any outer variable of the same name,
and variables declared in an inner group are not visible to neighboring or parent groups. For instance, in this
example, each myvar variable is local to its respective

CREATE FUNCTION myfunc() RETURNS INTEGER AS '


DECLARE
myvar INTEGER := 1;
BEGIN
RAISE NOTICE ''My Variable is %'', myvar;
DECLARE
myvar VARCHAR := ''Hello World'';
BEGIN
RAISE NOTICE ''My Variable is %'', myvar;
END;
RETURN myvar;
END;' LANGUAGE 'plpgsql';
In this instance, not only do the two instances of the myvar variable contain different data, they are designated
to hold different data types as well.

Comments

PL/pgSQL has two different comment styles: one for inline comments (such as --) and another for comment
blocks (such as /* … */). For instance:

BEGIN
Some-code --this is a comment
<…>
<…>
<…>
Some-more-code
<…>
/* And this
is a comment
block */
END;

Variable Assignment and Declaration

Variable declarations are made in the DECLARE block of the PL/pgSQL statement. Any valid SQL data type can
be assigned to a PL/pgSQL variable. Declaration statements follow this syntax:

name [ CONSTANT ] type [ NOT NULL ] [ {DEFAULT | := } value]

name - The name of the variable being defined.
CONSTANT - Keyword that indicates that this variable is read-only.
type - The SQL data type (e.g., INTEGER, INTERVAL, VARCHAR).
NOT NULL - By default, all variables are initialized as NULL unless a value is set.
Including this keyword disallows assigning a NULL value to this variable and
mandates that either a default or explicit value be set.
DEFAULT - Keyword that indicates that a default value follows.
value - The default value or the explicit initial value.

If a variable's type is not known in advance, the programmer can make use of the %TYPE and %ROWTYPE attributes, which
automatically adopt the type of a specific column or of an entire row from a database table.

For instance, if you wanted to automatically type the variable myvar as the same type as the table/field
payroll.salary, you could use the following:

CREATE FUNCTION yearlysalary(INTEGER, INTEGER) RETURNS INTEGER AS '


DECLARE
myvar payroll.salary%TYPE;
BEGIN
RETURN myvar*2;
END;
' LANGUAGE 'plpgsql';

Alternatively, an entire database row can be typed by using the %ROWTYPE syntax. For instance:

CREATE FUNCTION yearlysalary(INTEGER, INTEGER) RETURNS INTEGER AS '


DECLARE
myvar payroll%ROWTYPE;
BEGIN
RETURN myvar.salary*2;
END;
' LANGUAGE 'plpgsql';
Passing Variables to Functions

PL/pgSQL can accommodate up to 16 passed variables. It refers to variables by their ordinal number. The
numbering sequence starts at 1; therefore, $1 represents the first variable passed, $2 the second, and so on.

There is no need to declare the data type of the passed variable; PL/pgSQL will automatically cast the
appropriate variable number as the proper data type.
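For example, a minimal function can reference the positional parameters directly, without any declarations (the function name is hypothetical):

```sql
-- Doubles its single INTEGER argument, referenced as $1
CREATE FUNCTION double_it(INTEGER) RETURNS INTEGER AS '
BEGIN
    RETURN $1 * 2;
END;
' LANGUAGE 'plpgsql';
```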

Using the ALIAS keyword, however, enables the programmer to alias a more descriptive variable name to the
ordinal number. For instance:

CREATE FUNCTION addnumbers(INTEGER, INTEGER) RETURNS INTEGER AS '


DECLARE
Number_1 ALIAS FOR $1;
Number_2 ALIAS FOR $2;
BEGIN
RETURN Number_1 + Number_2;
END;
' LANGUAGE 'plpgsql';

Additionally, the RENAME command can be used to rename current variables to alternate names. For instance:

CREATE FUNCTION addnumbers(INTEGER, INTEGER) RETURNS INTEGER AS '


DECLARE
Number_1 ALIAS FOR $1;
Number_2 ALIAS FOR $2;
RENAME Number_1 TO Orig_Number;
BEGIN
RETURN Orig_Number + Number_2;
END;
' LANGUAGE 'plpgsql';

Control Statements

PL/pgSQL supports most of the common control structures such as IF...THEN and WHILE loops, and FOR
statements. Most of the syntax of these statements works as it does in other languages. The following sections
outline the basic format expected by these control statements.

IF…THEN…ELSE…ELSIF

In addition to the basic IF…THEN statement, PL/pgSQL also provides the capability to perform ELSE and
ELSIF conditional testing. A string of ELSIF conditional tests is analogous to using a CASE or SWITCH
statement, which is often found in other programming languages.

IF conditional-expression THEN
execute-statement;
END IF;

IF conditional-expression THEN
execute-statement;
ELSE
execute-statement;
END IF;

IF conditional-expression THEN
execute-statement;
ELSIF conditional-expression2 THEN
execute-statement;
END IF;
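As a concrete sketch of these forms (the variable names are hypothetical), note that PL/pgSQL spells the chained test ELSIF:

```sql
-- Classify an age value into one of three categories
IF age < 18 THEN
    category := 'minor';
ELSIF age < 65 THEN
    category := 'adult';
ELSE
    category := 'senior';
END IF;
```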
LOOPS

Like all programming languages, PL/pgSQL includes the capability to create code loops that will only run when
certain conditions are met. Loops can be particularly useful when traversing a series of rows in a table and
performing some manipulation.

An endless loop could be created using the following template:

LOOP
Statements;
END LOOP;

Such as:
LOOP
x:=x+1;
END LOOP;

Or, alternatively, the EXIT directive can be used with an IF…THEN statement to create an exit point.

LOOP
x:=x+1;
IF x>10 THEN
EXIT;
END IF;
END LOOP;

Another way of performing the preceding task is to use the EXIT WHEN statement, such as:

LOOP
x:=x+1;
EXIT WHEN x>10;
END LOOP;

The WHILE clause can be included to offer a cleaner implementation of the preceding, such as:

WHILE x<10 LOOP


x:=x+1;
END LOOP;

In contrast to a WHILE-type loop, a FOR loop is expected to perform a fixed number of iterations. The FOR
statement expects the following syntax:

FOR name IN [ REVERSE ] expression_start..expression_end LOOP

For instance, the following two examples count from 1 to 100 and from 100 to 1, respectively:

FOR a IN 1..100 LOOP


RAISE NOTICE '%', a;
END LOOP;

FOR a IN REVERSE 100..1 LOOP


RAISE NOTICE '%', a;
END LOOP;

Although these examples are similar in functionality to the WHILE loops, the real power of using FOR loops is
for traversing record sets. For instance, this example traverses through the payroll table and summarizes the
total amount paid out for a given payroll period:

CREATE FUNCTION totalpay(DATETIME) RETURNS REAL AS '


DECLARE
recs RECORD;
payroll_period ALIAS FOR $1;
retval REAL := 0;
BEGIN
FOR recs IN SELECT * FROM payroll WHERE
payperiod=payroll_period LOOP

retval := retval + recs.salary;
END LOOP;
RETURN retval;
END;
' LANGUAGE 'plpgsql';

Using SELECT

PL/pgSQL has some slight differences from standard SQL in how the SELECT statement operates inside of a
code block. The SELECT…INTO command normally creates a new table; inside a PL/pgSQL code block,
however, this form assigns the selected row to a variable placeholder. For instance, this example declares
a variable myrecs as a RECORD and fills it with the output of a SELECT query.

CREATE FUNCTION checkemail() RETURNS INTEGER AS '


DECLARE
myrecs RECORD;
BEGIN
SELECT INTO myrecs * FROM authors WHERE
name=''Barry'';
IF myrecs.email IS NULL THEN
RETURN 0;
ELSE
RETURN 1;
END IF;
END;
' LANGUAGE 'plpgsql';

In the preceding example, the existence of an email is determined by comparing it against a SQL NULL value.
Alternatively, the NOT FOUND clause can be used following a SELECT INTO query. For example:

CREATE FUNCTION checkemail() RETURNS INTEGER AS '


DECLARE
myrecs RECORD;
BEGIN
SELECT INTO myrecs * FROM authors WHERE
name=''Barry'';
IF NOT FOUND THEN
RETURN 0;
ELSE
RETURN 1;
END IF;
END;
' LANGUAGE 'plpgsql';

Executing Code Inside Functions

There are two basic methods for executing code within a current code block. If a return value is not required,
the developer can call the code with the PERFORM command.

If dynamic queries are desired, the EXECUTE command can be used.

The following example gives an indication of how the PERFORM command would be used. First, a custom
function is defined, addemp, which accepts the parameters needed to create an employee. If the employee
already exists, however, the function exits with a 0 exit code. However, if the employee was created, the exit
code is a 1. The following is an example of your first function:

CREATE FUNCTION addemp(VARCHAR, INTEGER, INTEGER)


RETURNS INTEGER AS '
DECLARE
Name ALIAS FOR $1;
EmpID ALIAS FOR $2;
Age ALIAS FOR $3;
EmpRec RECORD;
BEGIN
/* Check to see if emp exists */
SELECT INTO EmpRec * FROM employee WHERE
employee.emp_id=EmpID;
IF NOT FOUND THEN
/* Doesn't exist, so add them */
INSERT INTO employee VALUES (Name, EmpID, Age);
RETURN 1;
ELSE
/* Emp already exists, exit status 0 */
RETURN 0;
END IF;
END;
' LANGUAGE 'plpgsql';

After the preceding is created, you can now call this function from another by using the PERFORM statement. As
mentioned earlier, the PERFORM statement ignores any return values from the called function. So, in this case,
the returned 0 or 1 exit code will be ignored. However, due to the nature of how the addemp function is being
used, that is not a concern.

<function is created>
<…Some Code…>

/*Traverse List and run against addemp function */
FOR emps IN SELECT * FROM TempEmps LOOP
PERFORM addemp(emps.name, emps.emp_id, emps.age);
END LOOP;

<…Some Code…>
<End Function>

In the preceding case, no return values are processed from the PERFORM addemp clause. In this instance, this
is a desired behavior because the addemp function will only add employees when it is appropriate to do so.

The EXECUTE statement contrasts with the PERFORM command in that, instead of executing predefined
functions, the EXECUTE statement is designed to handle dynamic queries.

For instance, the following code snippet gives a brief example of how this could be used:

CREATE FUNCTION orderemp(VARCHAR) RETURNS INTEGER AS '


DECLARE
SortOrder ALIAS FOR $1;
QueryStr VARCHAR;
BEGIN
/* Determine Sorting Order */
IF SortOrder = ''Age'' THEN
QueryStr := ''age'';
ELSIF SortOrder = ''ID'' THEN
QueryStr := ''emp_id'';
ELSIF SortOrder = ''FName'' THEN
QueryStr := ''first_name'';
ELSIF SortOrder = ''LName'' THEN
QueryStr := ''last_name'';
ELSE
RAISE NOTICE ''Unknown value: %'', SortOrder;
RETURN 0;
END IF;
EXECUTE ''SELECT * FROM employee ORDER BY '' ||
QueryStr;
RETURN 1;
END;
' LANGUAGE 'plpgsql';

The preceding example shows how a basic dynamic query can be created using the EXECUTE statement.
However, much more complex uses are possible. In fact, it is possible to actually use the EXECUTE statement to
create custom functions within other functions.

Exceptions and Notifications

PL/pgSQL uses the RAISE statement to insert messages into the PostgreSQL log system. The basic format for
the RAISE command is as follows:

RAISE level 'format' [, identifier […]];

level—Either DEBUG, NOTICE, or EXCEPTION.

format—Uses the % character to denote the placeholder for the comma-separated list in identifier.

identifier—The list (text strings and variables) of messages to log.

DEBUG will be silently ignored if debugging is turned off (compile-time option). NOTICE will write the message
to the client application and enter it in the PostgreSQL system log file. EXCEPTION will perform all the actions
of NOTICE and additionally force a ROLLBACK from the parent transaction.

The following are some examples:

RAISE NOTICE 'Warning! Salary change attempted by non-manager';

RAISE NOTICE 'User % not found in payroll table', user_id;

RAISE EXCEPTION 'Invalid Entry in Payroll Table..aborting';

Unfortunately, PL/pgSQL does not have built-in mechanisms for detecting or recovering from an error based on
RAISE events. This can be done either by setting specific return variables or through explicit trapping done in
the client application. However, in most cases—particularly if the transaction is aborted—not much can be done
with regard to automatic recovery; usually human intervention will be required at some level.

Retrieving System Variables

PL/pgSQL also includes the capability for a function to retrieve certain diagnostic settings from the PostgreSQL
back end while in process. GET DIAGNOSTICS can be used to retrieve the ROW_COUNT and the RESULT_OID.
The syntax would be as follows:

GET DIAGNOSTICS mycount = ROW_COUNT;

GET DIAGNOSTICS last_id = RESULT_OID;

The RESULT_OID is only meaningful immediately after an INSERT has been performed in the code.
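For instance, a function might report how many rows an UPDATE touched (the payroll table and pct parameter here are hypothetical):

```sql
CREATE FUNCTION raise_salaries(REAL) RETURNS INTEGER AS '
DECLARE
    pct ALIAS FOR $1;
    num INTEGER;
BEGIN
    UPDATE payroll SET salary = salary * (1 + pct);
    -- Retrieve the number of rows affected by the UPDATE above
    GET DIAGNOSTICS num = ROW_COUNT;
    RETURN num;
END;
' LANGUAGE 'plpgsql';
```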

Notes
The BEGIN and END statements that define a PL/pgSQL code block are not analogous to the BEGIN…END SQL
transaction clause. The SQL BEGIN…END statements define the start and commit of a transactional statement. A
PL/pgSQL function is automatically part of either an explicit or implicit transaction in the SQL query that called
it. Because PostgreSQL does not support nested transactions, it is not possible to begin or commit a
separate transaction inside a called function.

Just like with standard SQL declarations, in PL/pgSQL, arrays can be used by utilizing the standard bracket
notation (for example, myint INTEGER[5];).

The main differences between PL/pgSQL and Oracle's PL/SQL are that PostgreSQL can overload
functions, CURSORS are not needed in PostgreSQL, default function parameters are not supported in
PostgreSQL, and PostgreSQL must escape single quotes. (Because the function itself is already in quotes,
queries inside a function must use a series of quotes to remain at the proper level.) There are other
differences, but most of these deal with specific syntax issues; consult an Oracle PL/SQL book for more
information.
PL/Tcl

The PL/Tcl language allows a trusted version of the popular Tool Command Language
(Tcl) to be used when creating custom functions or triggers in PostgreSQL. Although a
full explanation of the Tcl language is outside the scope of this book, we will highlight
some of the major features and provide some examples.

The major difference between the regular Tcl language and PL/Tcl is that the latter is
running in a trusted mode. This means that no OS-level activity can be performed.
Moreover, only a limited set of Tcl commands are enabled. In fact, Tcl functions cannot
be used to create new data types in PostgreSQL.

If OS-level operations are desired, there is a more expressive version of PL/Tcl


available named PL/TclU (Tcl Untrusted) that can be used for programming in
PostgreSQL. By default, this language is not available in the base distribution and
must explicitly be added to be operational. The reader is urged to use caution, however,
because errant scripts could cause system corruption or failure.

General Tcl Language Primer

Much of the syntax in PL/Tcl is the same as Tcl in general. The following is a brief
synopsis of how to use Tcl.

Comments

Like many scripting languages, the default comment indicator is the pound sign (#).
Any line that begins with this symbol is ignored entirely. For instance:

CREATE FUNCTION addit (INTEGER) RETURNS INTEGER AS '


# Set variable to arg1
set myvar $1
# This is another comment
return $myvar
' LANGUAGE 'pltcl';

Variable Assignment

Tcl accepts variable assignments. For instance, to assign a variable, you could do the
following:

set myval 10
set mystr "This is my string"
set myval_2 myval+100

The first two examples are obvious: The variable myval is set to a numerical value of
10, and the variable mystr is set to a string. However, the last example is deceptive.
On first look, it would appear that the variable myval_2 should be equal to 110, but
actually it is equal to the string myval+100. To perform variable substitution, use the
following syntax:

set myval_2 [expr $myval+100]

PL/Tcl uses the $ symbol to indicate that a variable is being referenced. Additionally,
anything enclosed in brackets ([]) is evaluated as Tcl code.

Control Structures

Like all modern scripting languages, Tcl has the standard flow-control mechanisms for
determining code-path execution. For instance, the standard IF block looks like this:

if {conditional-expression} {
#code block
}

Tcl also supports IF…ELSE control structures, such as:

if {conditional-expression} {
#code block
} else {
#something else
}

Tcl also supports the standard WHILE and FOR loops. For instance:

while {$x < 100} {


#some code
incr x 1
}
#loop has exited - run more code

Or, alternatively, a FOR loop could be used. The FOR loop takes the following syntax:

for {initial condition} {test condition} {modification} {


#some code
}

For instance:

for {set x 0} {$x < 100} {incr x 1} {


#some code
}
#loop has exited - run more code
The preceding uses the incr Tcl command, which increments the variable specified
with the given amount. (Note: 1 is the default; it does not need to be explicitly given.)

The Tcl language also supports a more powerful FOR loop called FOREACH. The basic
syntax is as follows:

foreach variable(s) list(s) {


#some code
}

For example:

foreach month {Apr May Jun} {


#Run quarterly report
}

Additionally, more complex FOREACH structures can be created by using multiple


variable names and lists. For example:

foreach {xpoint ypoint} {10 200 20 400} {


#On first run xpoint=10 , ypoint=200
#On second run xpoint=20 , ypoint=400
}

Or alternatively:

foreach xpoint {10 20} ypoint {200 400} {


#On first run xpoint=10 , ypoint=200
#On second run xpoint=20 , ypoint=400
}

Tcl also supports the SWITCH control structure. The basic syntax is as follows:

switch option test-expression {


test_case1 {code statement}
test_case2 {code statement}
default {default code statement}
}

OPTION usually refers to -exact, -glob, or -regexp, which does exact matching,
pattern matching, or regular expression matching on the supplied test cases.

The DEFAULT keyword can be used to match a case that fails all other comparison
tests. Additionally, a "-" sign as a code statement will indicate that the first following
full code statement is to be run as the appropriate execution initiative. For instance:

set myvar "Barry"


switch -glob $myvar {
arry {puts 1}
*arry -
Ba* -
Bar* {puts 4}
default {puts "Not Barry"}
}

The preceding example will return a 4. Notice how the "-" continuation symbols can be
linked together to form a chain of correct matches.

Strings and List

Tcl has many included list- and string-related commands. A brief listing is included
here:

Command                      Description

list {1 2 3 4}               Returns a list of the supplied elements.

concat {1 2} {3 4}           Returns a concatenated list of elements.

lappend mylist {3 4}         Appends the elements to the list stored in the
                             variable mylist.

lindex {1 2 3} 2             Returns the Nth element specified (start=0).

linsert {1 2 4} 2 {3}        Inserts an item at the index point.

join {1 {2 3} 4}             Joins all elements into a single flat element.

llength {1 2 3 4}            Returns the length (elements) of the list.

lreplace {1 2 3} 1 2 a       Replaces elements at index 1 to 2 with an "a".

lsearch {a b c} b            Returns the index (that is, 1) of the searched
                             value "b".

lsort {b z a c}              Returns the list sorted.

split this,is,split ,        Splits the elements according to a supplied
                             delimiter (a comma in this case).
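A few of these commands can be combined inside a PL/Tcl function. This sketch (the function name is hypothetical) splits a comma-separated string, sorts the items, and rejoins them:

```sql
CREATE FUNCTION sorted_csv(VARCHAR) RETURNS VARCHAR AS '
    # Split the argument on commas, sort the pieces, and rejoin them
    set items [split $1 ","]
    return [join [lsort $items] ","]
' LANGUAGE 'pltcl';
```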

PL/Tcl Language Specifics

Up to this point, we have been discussing general features of the Tcl language; Pl/Tcl
adds some specific functionality to the base language.

Basic Structure

The following is the basic format of the PL/Tcl language:

CREATE FUNCTION function_name (arg1 [, argN]) RETURNS type


AS '
# PL/tcl Code
' LANGUAGE 'pltcl';

This is similar to how all PLs are used within PostgreSQL, and as with PL/pgSQL, care
must be taken to escape quoted character strings correctly.

Arguments passed to PL/Tcl start at $1 and progress sequentially, as in PL/pgSQL.


PL/Tcl will also accept arguments in the form of arrays. The passed array usually will
refer to the specific element needed by using its attribute name. For instance:

CREATE FUNCTION ispaid(payroll_array) RETURNS INTEGER


AS '
if {$1(salary)>0} {
return 1
}
if {$1(hourly)>0} {
return 1
}
return 0
' LANGUAGE 'pltcl';

Global Data (GD) Directives

Due to the nature of performing queries with PL/Tcl, it is important to be able to store
globally accessible data between various operations inside a PL/Tcl code block.

To accomplish this, PL/Tcl uses an internally available array named "GD." This variable
is the recommended method for distributing shared information throughout a
procedure. (See the example in the next section for a procedure that uses the GD
variable.)

Accessing Data from PL/Tcl

Unlike PL/pgSQL, you cannot simply embed standard SQL statements inside of PL/Tcl.
There are special built-in commands that allow access to the database back end.

Executing a Query Directly

The spi_exec command can be used to submit a query directly to the database
query engine. The syntax for the spi_exec command is as follows:

spi_exec -options query {


loop-statements
}

options - One of the following options:
    -count n      Return only N rows from the query
    -array name   Store the results in an associative
                  array with the given name
query - The query string to execute
loop-statements - Execute these commands for
each row returned

The following are some examples of how the spi_exec command works:

spi_exec "SELECT * FROM authors"

spi_exec -count 10 "SELECT * FROM authors ORDER BY name"

spi_exec -array myrecs "SELECT * FROM authors"
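The loop-statements argument lets you process each returned row in turn. This hypothetical function counts authors younger than a given age, using the -array form to expose each row's columns:

```sql
CREATE FUNCTION count_under(int4) RETURNS int4 AS '
    set total 0
    # The loop body runs once per returned row, with columns in the rec array
    spi_exec -array rec "SELECT age FROM authors" {
        if {$rec(age) < $1} { incr total 1 }
    }
    return $total
' LANGUAGE 'pltcl';
```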

Preparing and Executing a Query

The preceding use of spi_exec executed the queries by submitting them directly to
the query engine. In many cases, this approach will work fine. However, if you plan to
execute the same basic query multiple times—with perhaps just a change in criteria—
it is more efficient to prepare the query and then execute it.

When a query is prepared, it is submitted to the query planner, which then prepares
and saves a query plan for the submitted entry. It is then possible to use that query
plan to execute the actual query, which can result in performance increases if used
correctly.
A query is prepared by using the spi_prepare command, which takes the following
syntax:

spi_prepare query typelist

query - The SQL query to execute


typelist - If arguments are going to be passed to the query
from the PL/Tcl code, then a listing of their data
types must be supplied

The following is an example of the spi_prepare command. Notice the use of a
double backslash to properly escape the $ symbol. Additionally, notice that the
VARCHAR data type is supplied because of the $1 PL/Tcl variable:

spi_prepare "SELECT * FROM authors WHERE name=\\$1" VARCHAR

After a query has been prepared, it can be executed with the spi_execp command.
This command is similar to the spi_exec command, with the exception that it is geared
toward executing already-prepared queries. The following is the syntax that the
spi_execp command uses:

spi_execp options queryID value-list {


loop-statements
}

options - One of the following options:
    -count n      Return only N rows from the query
    -array name   Store the results in an associative
                  array with the given name
    -nulls str    Use the named string to represent
                  NULL values
queryID - The query ID returned from spi_prepare
value-list - If a typelist was provided to
spi_prepare, then a list of those values
must be supplied to spi_execp
loop-statement - A PL/Tcl statement that will be
executed for every row returned

The following is an example of using the spi_execp command; notice the use of the
GD global system variable. In particular, the following example will only create the
query plan when first called; on all subsequent calls, the previously saved plan is
simply executed:

CREATE FUNCTION count_checks(int4) RETURNS int4 AS '
    #Check to see if the plan already exists
    if {! [ info exists GD(plan) ]} {
        set GD(plan) [ spi_prepare "SELECT count(*) AS
            chk_count FROM payroll WHERE emp_id=\\$1" int4 ]
    }
    #Plan has been created or already exists
    spi_execp $GD(plan) [ list $1 ]
    return $chk_count
' LANGUAGE 'pltcl';

Constructing Queries

A related command that is useful when accessing the PostgreSQL back end is the
quote statement. This command doubles any single quotes contained in a string,
which makes it safe to substitute variables into query strings. An example of the
quote command is as follows:

set myval "Barry"

spi_exec "SELECT * FROM authors WHERE name='[quote $myval]'"

The preceding would result in the following text if sent to the query parser:

SELECT * FROM authors WHERE name='Barry'

One subtle point to watch out for is when the value of a variable already contains a
single quote. If the variable is substituted directly, without quote, the embedded
quote is reproduced verbatim, which could result in an error being generated from
the PostgreSQL query parser. Consider the following:

set myval "Barry's"

spi_exec "SELECT * FROM authors WHERE name='$myval'"

The preceding would result in the following text if sent to the query parser:

SELECT * FROM authors WHERE name='Barry's'

To correct this obvious problem, use the following syntax:

set myval "Barry's"

spi_exec "SELECT * FROM authors WHERE name='[quote $myval]'"

This sends name='Barry''s' to the parser, which PostgreSQL interprets as the
literal string Barry's.

Accessing the PostgreSQL Log System

Like PL/pgSQL, there are commands present in PL/Tcl that provide access to the
PostgreSQL log system. The elog command uses the following syntax:

elog level message

level - Either NOTICE, ERROR, FATAL, DEBUG, or NOIND


message - The text message to pass to the log
(For more information on the elog levels previously mentioned, refer to the elog C
function discussed in Chapter 13, "Client-Side Programming.")

Notes

When installing PL/Tcl—whether at compile time or after—it is required that the Tcl
language and associated libraries exist on the target system for installation to be
successful.
PL/Perl

Perl is one of the most common scripting languages in use. It runs on almost all
platforms and has wide support in the development community. For these reasons,
PL/Perl can be an effective choice when choosing a PostgreSQL PL language.

Like PL/Tcl, the PostgreSQL implementation of PL/Perl only enables specific commands,
which are deemed trusted. Essentially, any Perl commands that explicitly deal with the
file system, environmental settings, or external modules have been disabled.

It is still possible, however, that errant code created in PL/Perl can negatively impact
the base system. Most of these problems are because PL/Perl will still allow the
exhaustion of memory and endless loops to be created. Therefore, code created in
PL/Perl should be closely inspected to ensure that runaway code could not create a
problem for the parent system.

General Perl Language Primer

Much of the syntax in PL/Perl is the same as Perl in general. The following is a brief
synopsis of how to use Perl. Obviously, if you are new to Perl, consult one of the many
books or web sites available for the new Perl user.

Comments

Like many scripting languages, the default comment indicator is the pound sign (#).
Any line that begins with the pound sign is ignored entirely.

Control Structures

Perl contains most of the common control structures that are present in other
languages. The standard IF structure is as follows:

if (expression) {
code-statement
}

Perl also supports more complex IF statements, like IF…ELSE and ELSIF statements.

For instance:

if (expression) {
    code-statement;
} elsif (another-expression) {
    other-code-statement;
} elsif (yet-another-expression) {
    other-code-statement;
} else {
    final-code;
}

Notice in the preceding code how ELSIF and ELSE statements can be combined to
create a chain of test cases and a final default statement to execute if none of the
cases test true. (Note that Perl spells the keyword elsif and that the final else
clause takes no expression.)

Perl also supports WHILE, UNTIL, DO, FOR, and FOREACH loops; examples are notated
in the following:

while ($a<10) {
print $a;
$a++;
}

until ($a>10) {
print $a;
$a++;
}

do {
print $a;
$a++;
} while ($a<10)

for ($a=0; $a<10; $a++) {


print "Printing 10 times…";
}
@lst = ("Jan", "Feb", "Mar");
foreach $a (@lst) {
print $a;
}

Perl also contains ways to break out of control structures, like LAST, NEXT, and REDO,
and by using labeled blocks. The following is a list of examples:

while ($a<10) {
    print $a;
    if ($a==5){
        #a is 5, so exit loop
        last;
    }
    $a++;
}
print "exited loop";

The preceding code will continue looping until one of two conditions are met: Either $a
is greater or equal to 10 or $a is equal to 5. (Actually, in this example, the code will
never reach 10 because it will always exit at 5.)

The other statements work similarly. The NEXT statement will reiterate the loop and
skip any remaining items; the REDO statement will run the loop again from the
beginning without reevaluating the test condition. For example:

while ($a<10) {
    $a++;
    print $a;
    if ($a==5){
        #a is 5, so skip to the next iteration
        next;
    }
}
print "exited loop";

In addition to just reiterating the loop, label declaratives can be specified in conjunction
with the NEXT, LAST, and REDO statements to control program flow:

OUTER: while ($a<10) {
    $a++;
    print $a;
    if ($a==5){
        if ($b==1){
            last OUTER;
        }
    }
}
print "exited loop";

Associative Arrays

One of the more powerful features of the Perl language is how associative arrays can be
created and manipulated. The next example creates a two-element array and assigns
values to it:

$employee{"name"}="Fred";
$employee{"age"}=29;

To get a listing of the keys contained in an array, use the keys function. For instance:

@lst = keys(%employee);
#lst now equals ("name", "age")

Alternatively, if you wanted to list the values stored in the array, you could use the
values function. For instance:
@lst = values(%employee);
#lst now equals ("Fred", 29)

If you want to return both the key and value pairs together, the each function can be
used. This function is meant to be used inside of a loop, and on each successive call, it
returns the next key/value pair. For instance:

while (($name, $age) = each(%employee)) {


#some code goes here…
}

To remove an element from an associative array, use the delete function, as in the
following:

$employee{"name"}="Fred";
$employee{"age"}=29;
$employee{"shoesize"}=10;
#The employee hash has 3 elements
delete $employee{"shoesize"};
#Now just 2

Array Access Functions

In Perl, array numbering begins at 0 and proceeds sequentially for every element
contained. Lists of elements can be specified by using a comma-separated list. Negative
numbers refer to array elements beginning at the end of the element list. The following
is a brief listing of examples:

@lst=("one", "two", "three", "four");


#Set tmp variable to 'one'
$tmp=$lst[0];
#Set tmp to the list ('two', 'four')
@tmp=@lst[1,3];
#Set tmp to 'four'
$tmp=$lst[-1];

A common use of arrays is as a queue to hold information. Queues typically need to


have elements removed or added in a predictable and specified manner. Perl contains
several functions that assist with this: pop, push, shift, and unshift. The pop and
push functions work on the right side of an array, and shift and unshift process on
the left side.

For example:

@queue=(54,123,65643);
#Return and Remove 65643
$myval=pop(@queue);
#Add 111 to queue
push(@queue, 111);
#
#Return and Remove 54
$myval=shift(@queue);
#Add 222 to left side
unshift(@queue, 222);

To reverse or reorder the list of elements, use the reverse or sort function, as in the
following:

@lst=(10,1,5);
@lst=reverse(@lst); #Now lst = (5,1,10)
@lst=sort(@lst); #Now lst = (1,5,10)

Perl and Regular Expressions

One of the things that has made Perl so widely used is its use of regular expressions
(regex). Essentially, regular expressions are a method to match patterns between a
supplied template and the source text. A full explanation of regex is beyond the scope
of this book; however, Table 11.1 and Table 11.2 provide some examples.

Table 11.1. Characters Used for Pattern Matching in regex

regex Pattern Description

/anytext/ The text anytext is specifically searched for.

. Represents any single character.

* Zero or more of the preceding character.

+ One or more of the preceding character.

? Zero or one of the preceding character.

[^chars] Any single character not listed in chars.

^anytext Starts with the text anytext.


Table 11.2. Example Search and Results

Source Text regex Template Notes/Matches

PostgreSQL is good /POSTGRESQL/ No match. Case sensitive.

/POSTGRESQL/i Match. Case insensitive.

/eS/ Match (PostgreSQL).

/ / Match (space found).

/[efgh]ood/ Match (good).

/[e-h]ood/ Match. Same as above (good).

/Postgre[^SQL]/ No match. "Postgre" is followed by "S".

/^P/ Match. Starts with "P" (PostgreSQL).

/g..d/ Match (good).

/g.*d/ Match (good).

/goo+/ Match (goo in good).

/i*s/ Match (is).

/i+s/ Match (is).

/PostgreSQ?/ Match. The "Q" is optional (PostgreSQ).

!/good/ No match. Negated "good".

PL/Perl Language Specifics

The basic format of the PL/Perl language is as follows:

CREATE FUNCTION name (Arg1 [, ArgN]) RETURNS type AS '
    return $_[0];
' LANGUAGE 'plperl';

Escaping Characters

As with PL/pgSQL and PL/Tcl, it is important to remember that quoted strings inside of
a PL/Perl function need to be properly escaped.

Use of the Perl functions q[], qq[], and qw[] can assist in creating properly escaped
variable-substitution sequences.
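As a brief sketch of why these helpers matter, the following hypothetical function (the name greet is illustrative, not from the text) uses qq[] so that the interpolated string contains no quote characters that would otherwise need doubling inside the function's single-quoted SQL body:

```sql
-- qq[...] delimits an interpolating string with brackets, so no
-- single quotes appear inside the PL/Perl body that would need
-- to be escaped for the surrounding SQL string literal.
CREATE FUNCTION greet(text) RETURNS text AS '
    my $name = shift;
    return qq[Hello, $name!];
' LANGUAGE 'plperl';
```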

Variable Substitution

By default, arguments are passed to the underlying Perl function in the @_ array
and are accessed as $_[0], $_[1], and so on. This is Perl's default argument
namespace, and consequently, this is where PL/Perl arguments appear. For instance:

CREATE FUNCTION getproduct(INTEGER, INTEGER) RETURNS INTEGER AS '


$newval=$_[0] * $_[1];
return $newval;'
LANGUAGE 'plperl';

Additionally, entire tuples can be passed to a PL/Perl function. Within the PL/Perl code,
the keys of the associative array are the field names from the passed tuple. Obviously,
the values of the associative array hold the field data. For instance:

CREATE FUNCTION citystate(employee) RETURNS text AS '
    my $empl = shift;
    return $empl->{"city"} . ", " . $empl->{"state"};
' LANGUAGE 'plperl';

Notes

When installing PL/Perl—whether at compile time or after—it is required that the Perl
language and associated libraries exist on the target system for installation to be
successful. Moreover, the shared library version of libperl (that is, libperl.so)
should be present so that PostgreSQL can have access to it.
Chapter 12. Creating Custom Functions
By itself, a database is nothing more than a container that holds data. The functions
and tools are what make a database truly useful. Much of the work of designing an
effective database is being able to model the business rules needed. Developing the
proper table schema is one of the ways to model the business rules within the
database; the other ways are through the creation of custom functions, triggers,
and rules.

As a basic overview, functions, triggers, and rules compare as follows:

Functions. User-defined functions (also called stored procedures) are most


useful as a way to implement commonly called code within the back end itself.

Triggers. These enable additional actions to be performed when an INSERT,


UPDATE, or DELETE SQL command is issued. These differ from rules in that
they are called on a per-row basis.

Rules. Rules automatically rewrite supplied queries to perform substitute or


additional actions when a SELECT, INSERT, UPDATE, or DELETE command is
issued. Rules differ from triggers in that rules are used when the actions
affect other tables.
Creating Custom Functions

PostgreSQL includes a number of predefined functions to help in data manipulation


(see Chapter 4, "PostgreSQL Functions," for a full listing). Many of the ones included
are general-purpose functions that aid in converting, formatting, or aggregating
data.

There are many cases in which the system would benefit from the existence of user-
defined functions. Functions are particularly useful when the same information
needs to be accessed repeatedly. In these cases, it is possible to create a user-
defined function that is stored within the server.

Typically, when a function is created, a query plan is precompiled and stored,
ready to be executed. This benefits system speed tremendously because the client
application only needs to request that the function be executed instead of having to
supply the SQL code and wait for it to be parsed, planned, executed, and returned.

The benefits of using custom-created functions are not only limited to speed
considerations. In many instances, the standard SQL language does not provide
sufficient control to perform the desired action. For instance, if conditional
branching, loop iteration, or complex variable substitution is needed, creating
custom functions might be the only way to accomplish the task at hand.

As mentioned in Chapter 11, "Server-Side Programming," PostgreSQL includes a


number of procedural languages that can be used to write custom functions.
Although each has its own strengths, on balance, PL/pgSQL is probably the most
accessible. PL/pgSQL allows standard SQL commands to be included, along with
more advanced control structures like loops, if-then-else, and variable substitution.

Example Uses

In this section, you will examine instances of when creating custom functions would
be useful.

Code Reuse

As previously mentioned, one of the situations in which custom functions are


desirable is where you want to avoid the duplication of work.

In this example, we look at a function named homestate, which accepts an


employee id value and returns that employee's home state.

CREATE FUNCTION homestate(int) RETURNS text
AS '
DECLARE
    empid ALIAS FOR $1;
    st text;
BEGIN
    SELECT INTO st state FROM payroll WHERE
        employee_id=empid;
    IF FOUND THEN
        RETURN st;
    ELSE
        RETURN ''N/A'';
    END IF;
END;
' LANGUAGE 'plpgsql';

Upon creation of the preceding function, PostgreSQL preformulates a query plan on


the payroll table and awaits execution.

A carefully designed database will include a number of these prefabricated queries.


They offer the following benefits to the system as a whole:

Speed and efficiency. The preformed query plan already exists in the
database engine and awaits execution.

Format control of returned data. In the preceding example, when an


employee's home state is not found, they will automatically be presented with
an "N/A" value. This is provided directly from the custom function, with no
intervening steps needed on the client end.

Abstraction. In the preceding example, the client application doesn't directly


have any knowledge of the underlying database schema. As a result, the basic
table structures could be changed considerably, and as long as the input and
output interfaces of this function continued to work in the same way, the client
application would not need any modification.

Indirect benefits. It could be argued that the mere inclusion of such a


predefined query function improves the database and application architecture.
That is to say, it forces the programmers/DBA to think about what types of
data will be most accessed. Accordingly, this can lead to other insights that
improve the efficiency of the system as a whole.

Combining Functions

By combining multiple functions, it is possible to build a more flexible, yet


consistent, database.

In this example, there is a specific user interface (UI) feature that you are trying to
create. It comes to the developer's attention that when users are entering
information into the system, they want to be able to enter either the employee
name or the employee ID into the dialog box.

This task is made easier by the fact that all employee IDs can be assumed to be
strictly composed of numbers, whereas employee names will consist entirely of
letters.
Rather than having to re-create this feature for each instance of its use, it is decided
to create a general case function that simply accepts either input (ID or name) and
returns the employee ID.

Furthermore, it is decided that only the employee ID is to be stored in tables that


link against the payroll table. This helps with the concept known as data
normalization, the idea of which is to have consistent representation and
nonredundant data stored across the database.

Therefore, the following function can be developed that will accept either format and
return the employee ID. (The full potential of this function will not be seen until
later.)

CREATE FUNCTION getempid(varchar) RETURNS int
AS '
DECLARE
    empval ALIAS FOR $1;
    empid int;
BEGIN
    /*If empval is a name, look up and
      return the matching emp_id;
      otherwise, empval already is the
      emp_id, so return it directly
    */
    IF empval ~ ''[a-zA-Z]'' THEN
        SELECT INTO empid employee_id FROM payroll WHERE
            last_name=empval;
        RETURN empid;
    ELSE
        RETURN CAST(empval AS int);
    END IF;
END;
' LANGUAGE 'plpgsql';

At first glance, this function doesn't appear to be that useful. It simply determines
whether the variable passed is a digit or alphabetical, and it returns the employee
ID for that person. Moreover, it seems that if this function is already passed the
employee ID, it simply returns that value directly back. On the surface, this might
seem like a waste. However, when combined with other functions, the true potential
for such a function can be seen.

For instance, by combining the first function, homestate, with this latest function,
you can enable it to accept either the last name or the employee ID. In this case,
you use your latest function as a wrapper to ensure a flexible range of input values.
The client-side code would appear as follows:

SELECT homestate(getempid('Stinson'));

Or

SELECT homestate(getempid(592915));
Or, finally

SELECT homestate(getempid(strInputValue));

By combining the two functions, this allows a more flexible range of accepted input
data, while still storing data in a consistent format on the back end. Moreover, if the
developers one day realize that they want to allow users to input the Social Security
number as well, it will only require a modification of the getempid function.
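That change can be sketched as a hedged modification of getempid; the ssn column and the nine-digit test here are assumptions for illustration, not part of the original schema:

```sql
CREATE FUNCTION getempid(varchar) RETURNS int
AS '
DECLARE
    empval ALIAS FOR $1;
    empid int;
BEGIN
    IF empval ~ ''[a-zA-Z]'' THEN
        SELECT INTO empid employee_id FROM payroll
            WHERE last_name=empval;
    ELSIF char_length(empval) = 9 THEN
        /*Assumed: SSNs are stored as nine digits
          in a hypothetical ssn column*/
        SELECT INTO empid employee_id FROM payroll
            WHERE ssn=empval;
    ELSE
        empid := CAST(empval AS int);
    END IF;
    RETURN empid;
END;
' LANGUAGE 'plpgsql';
```

Client code such as SELECT homestate(getempid('123456789')); would then continue to work unchanged.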

Stored Procedures

In reality, stored procedures and functions are exactly the same thing. Namely, they
are a set of code statements that are created with a CREATE FUNCTION command.
The difference is more conceptual than concrete.

In general, functions accept an input value, perform some lookup or manipulation of


it, and return an output value. A classic example of a function can be seen in the
upper function. This function accepts a character string, converts it to uppercase,
and returns the resultant string. For instance:

> select upper('abcdefg');


> ABCDEFG

Stored procedures, however, do more than just accept a value and provide return
data. Generally, they perform some basic procedure or alteration to database tables.
For instance, consider the following example.

In this example, we want an easy way to make adjustments/inserts to the


employee table. Specifically, the users want an easy way to assign new or existing
employees to a new job description. The specifications for such a function are as
follows:

If an employee doesn't exist, add the person and assign him or her to the
specified job.

If an employee already exists, change his or her job description to the new
one.

Given these specs, a sample stored procedure that accomplishes this might appear
as the following:

CREATE FUNCTION assignjob(int, varchar) RETURNS int
AS '
DECLARE
    empid ALIAS FOR $1;
    jobdesc ALIAS FOR $2;
    retval INTEGER := 0;
    emprec RECORD;
BEGIN
    /*Determine if employee exists in table*/
    SELECT INTO emprec * FROM employee WHERE employee_id=empid;
    IF FOUND THEN
        /*Emp exists, modify his job description*/
        UPDATE employee SET job_description=jobdesc
            WHERE employee_id=empid;
        retval := 1;
    ELSE
        /*Emp doesn''t exist, add him*/
        INSERT INTO employee VALUES (empid,jobdesc);
        retval := 1;
    END IF;
    RETURN retval;
END;
' LANGUAGE 'plpgsql';

Although the preceding example is still considered a function, it actually performs


modifications to database tables instead of just calculating return values. For this
reason, examples like the preceding are referred to as stored procedures.

This might be seen as just a difference of semantics. However, it underlines a


conceptual difference between the two approaches.

Stored procedures are very useful in automating table manipulations that must
occur regularly. For instance, a good use might be to perform a task such as voiding
a payroll check. Typically, such an operation requires modifying many tables in a
standard account system setup. Although it could be coded directly into the client
application, that might make for a more rigid application in the end.

For instance, with the current system, modifications might need to be made to the
payroll, employee, AP, and GL tables to fully void an incorrectly printed check.
There would be no problem, per se, with coding this procedure directly from the
client machine. If in the future, however, there is a new table— JobCost—that
needs to be updated, this could be a needlessly complex change to make. It could
require changing the code in dozens or hundreds of client applications.

A better approach would have been to create the task of voiding a check as a stored
procedure (that is, function) within the database back end. The benefit of this setup
is that the clients simply call the voidcheck function and are oblivious to the actual
steps the server is taking to complete their request. On the server side, it is
relatively minor to update the function to affect another table; therefore, the entire
system becomes much more flexible.
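A minimal sketch of such a voidcheck function follows; the table and column names (payroll.voided, ap_pending, gl_adjustments) are illustrative assumptions, since a real accounting schema would differ:

```sql
CREATE FUNCTION voidcheck(int) RETURNS int
AS '
DECLARE
    checkid ALIAS FOR $1;
BEGIN
    /*Each statement below assumes an illustrative schema*/
    UPDATE payroll SET voided=''yes'' WHERE checknum=checkid;
    DELETE FROM ap_pending WHERE checknum=checkid;
    INSERT INTO gl_adjustments
        SELECT checknum, -amount FROM payroll
            WHERE checknum=checkid;
    RETURN 1;
END;
' LANGUAGE 'plpgsql';
```

If a JobCost table is later added, only this function body changes; every client keeps calling SELECT voidcheck(5411); exactly as before.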
Creating Custom Triggers

There is a certain degree of overlap between stored procedures (that is, functions) and
triggers. Both operate as predefined code created with the CREATE FUNCTION statement.
However, triggers are most often used as an automated response to some table-related
event, not as an action directly called by a client application.

Triggers bind these functions to DELETE, UPDATE, or INSERT table events using the
CREATE TRIGGER command. The client application has no direct knowledge of a trigger's
existence; it simply performs the requested action, which results in the server firing the
appropriate trigger event.

Triggers are used for performing actions that pertain to the same table that is being
accessed. Often they are used as a mechanism to ensure data or business-rule integrity. For
instance, consider the following function and trigger pair:

CREATE FUNCTION trig_insert_update_check_emp() RETURNS opaque AS
'BEGIN
    /*Check employee age, state, and name
      and enforce certain checks */
    IF new.age > 20 THEN
        new.adult = ''yes'';
    ELSE
        new.adult = ''no'';
    END IF;
    IF new.state ~ ''^[A-Za-z][A-Za-z]$'' THEN
        new.state = upper(new.state);
    ELSE
        RAISE EXCEPTION ''Alphabetical State Desc Only'';
    END IF;
    IF new.name ~ ''^[a-zA-Z]+$'' THEN
        new.name = initcap(new.name);
    ELSE
        RAISE EXCEPTION ''Alphabetical Name Only'';
    END IF;
    RETURN new;
END;
' LANGUAGE 'plpgsql';
CREATE FUNCTION trig_delete_check_emp() RETURNS opaque AS
'BEGIN
    /*Make sure a manager isn''t deleted*/
    IF old.manager=''yes'' THEN
        RAISE EXCEPTION ''Cannot Delete Managers!'';
    END IF;
    RETURN old;
END;
' LANGUAGE 'plpgsql';

The preceding two functions make use of the new and old keywords. These keywords refer
to data that has just been INSERTED or DELETED, respectively, when called as part of a
trigger event. Next, a trigger event is created and bound to each function.
CREATE TRIGGER employee_insert_update
BEFORE INSERT OR UPDATE
ON employee
FOR EACH ROW EXECUTE PROCEDURE trig_insert_update_check_emp();

CREATE TRIGGER employee_delete
BEFORE DELETE
ON employee
FOR EACH ROW EXECUTE PROCEDURE trig_delete_check_emp();

Now that the triggers have been created, they can be tested as follows:

>INSERT INTO employee (name, age, state, manager)


VALUES ('sean', 29, 'T8', 'yes');
>ERROR: Alphabetical State Desc Only

>INSERT INTO employee (name, age, state, manager)


VALUES ('sean', 29, 'tx', 'yes');
>INSERT 323003 1

>SELECT * FROM employee WHERE name='Sean';


name age state manager adult
------------------------------------
Sean 29 TX yes yes

>DELETE FROM employee WHERE name='Sean';


>ERROR: Cannot Delete Managers!

>UPDATE employee SET manager='no' WHERE name='Sean';


>UPDATE 1

>DELETE FROM employee WHERE name='Sean';


>DELETE 1

In the preceding examples, notice the similarity between how these triggers behave and
how column constraints typically behave. Column constraints generally check a specific
field's validity before an INSERT or UPDATE is allowed.

However, triggers and column constraints are not mutually exclusive in their behavior. If the
BEFORE keyword is used when creating a trigger, it will fire before the field (or table)
constraints are checked. Moreover, the BEFORE keyword means that the trigger will be fired
before the actual insert is completed. Therefore, if a trigger depends on an OID or relies on
a unique index, it will not function correctly.

Likewise, when the AFTER keyword is specified, the trigger event will be activated after the
specified table action (INSERT, UPDATE, or DELETE) has already completed. Moreover, the
AFTER keyword will cause the trigger not to fire until all the table or field constraints have
already been evaluated.
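For example, a trigger that must see the completed row (including its OID) would be declared with AFTER; the procedure log_employee_change() here is a hypothetical audit function, not one defined earlier in this chapter:

```sql
CREATE TRIGGER employee_audit
    AFTER INSERT OR UPDATE
    ON employee
    FOR EACH ROW EXECUTE PROCEDURE log_employee_change();
```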
Creating Custom Rules

Rules are very similar to triggers in concept, with some crucial differences. Triggers usually
refer exclusively to the table being acted on, whereas rules act on external tables.
Additionally, triggers are fired in addition to the action being carried out. For instance, an
INSERT trigger will fire the event either before or after the insert is performed.

Rules, on the other hand, can also be created with the optional keyword INSTEAD. In this
case, the rule action is carried out in lieu of the specified action.

A typical use of rules is to perform actions on external tables when a table-related event
occurs on the specified table. A simple use of a rule set would be to implement the
capability to log an audit trail of changes made to important tables. For instance, suppose
the management wants to see a weekly report of every expenditure over $1000. You could
implement this as follows:

CREATE RULE log_payables AS


ON INSERT TO accounts_payable
WHERE new.amount > 1000 DO
INSERT INTO audit VALUES(new.vendor, new.amount, new.user);

You could test this rule as follows:

>SELECT * FROM accounts_payable;

vendor amount user


--------------------
0 Results Found

>SELECT * FROM audit;

vendor amount user


--------------------
0 Results Found

>INSERT INTO accounts_payable VALUES('Acme, Inc', 2500, 'Sean');


INSERT 231431 1
INSERT 231432 1

SELECT * FROM audit;

vendor amount user


-------------------------
Acme, Inc 2500.00 Sean

Rules can also be used in conjunction with functions to create actions that are more
complicated. For instance, suppose there are two tables in the database, payroll and
paytotals. The payroll table holds an individual record for every payroll check issued.
The paytotals table, however, has the latest year-to-date payroll totals for each
employee.
In this case, it is assumed that it is important for the system to automatically keep the
paytotals table up-to-date. A rule/function combination could be created to accomplish
this, as follows:

CREATE FUNCTION getpaytotal(int) RETURNS real AS
'DECLARE
    empid ALIAS FOR $1;
    paysum real;
BEGIN
    SELECT INTO paysum sum(amount) FROM payroll
        WHERE employee_id=empid;
    RETURN paysum;
END;
' LANGUAGE 'plpgsql';

Now that the function is created, it can be incorporated into a rule set:

CREATE RULE compute_paytotal AS


ON INSERT TO payroll DO
UPDATE paytotals SET amount=getpaytotal(new.employee_id)
WHERE paytotals.employee_id=new.employee_id;

The resultant rule can be tested as follows:

>SELECT * FROM payroll;

empid name amount checknum


--------------------------------
123 Sean 100.00 5411
123 Sean 100.00 5412

>SELECT * FROM paytotals;

empid name amount


--------------------
123 Sean 200.00

>INSERT INTO payroll VALUES (123, 'Sean', 200, 5413);


INSERT 243411 1
UPDATE 1

>SELECT * FROM paytotals;

empid name amount


--------------------
123 Sean 400.00

As previously mentioned, the CREATE RULE command also allows for the inclusion of the
INSTEAD keyword. When this is specified, an alternative action will be performed.
Following from the previous example, let's assume management decided that any entry into
the accounts_payable table that was over $1000 should be deferred into an alternate
table until it was approved. For instance:

CREATE RULE defer_ap AS


ON INSERT TO accounts_payable
WHERE new.amount > 1000 DO INSTEAD
INSERT INTO ap_hold VALUES (new.vendor, new.amount, new.user);

This would result in any insert actions made into the accounts_payable table being
redirected into the ap_hold table, pending management approval.

Unlike triggers, rules can also be defined to occur on SELECT statements. This can have
some interesting implications. For instance, consider the following example:

>SELECT * FROM accounts_payable;

vendor amount user


-------------------------
Widgets 500.00 Barry
Acme, Inc 2500.00 Sean

>CREATE TABLE my_ap () INHERITS (accounts_payable);

>SELECT * FROM my_ap;

vendor amount user


-------------------------
Widgets 500.00 Barry
Acme, Inc 2500.00 Sean

>CREATE RULE my_select AS


ON SELECT TO my_ap DO INSTEAD
SELECT * FROM my_ap WHERE amount>1000;

>SELECT * FROM my_ap;

vendor amount user


-------------------------
Acme, Inc 2500.00 Sean

In the preceding example, we've created a table that INHERITS all the attributes from the
base table. Then a rule is defined on the new table that rewrites any SELECT statements
and enforces a criterion match. The resultant action behaves suspiciously like a standard
VIEW. This is not by accident because PostgreSQL actually uses rule definitions as the way
that the CREATE VIEW command is implemented.
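As a sketch of that equivalence, a view over accounts_payable could be approximated by hand with an empty table plus an ON SELECT DO INSTEAD rule named "_RETURN" (simplified; the column types here are assumed):

```sql
/*Roughly what
    CREATE VIEW big_ap AS
        SELECT * FROM accounts_payable WHERE amount>1000;
  does behind the scenes*/
CREATE TABLE big_ap (vendor text, amount numeric, "user" text);
CREATE RULE "_RETURN" AS
    ON SELECT TO big_ap DO INSTEAD
    SELECT * FROM accounts_payable WHERE amount>1000;
```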

Notes and Considerations


Obviously, such uses for rule sets can be very beneficial in many cases. However, care must
be taken when designing rule sets because there exists the potential for misuse. For
instance, consider the following two related rules:

CREATE RULE insert_1 AS


ON INSERT TO apple DO
INSERT INTO orange VALUES(new.weight);

CREATE RULE insert_2 AS


ON INSERT TO orange DO
INSERT INTO apple VALUES(new.weight);

This example shows a dangerous potential of rule use. In this case, if an INSERT is made to
either table, an infinite loop of cascading insert actions will begin. In actuality, PostgreSQL
is too intelligent to allow this to happen, and the action would automatically fail once too
many recursive queries are executed.

In general, however, rules should only point to tables that do not have any associated rules
already set. That is, rule sets should point away from other rule sets. In large databases,
which might have hundreds of tables, it can be extremely complicated to manage and
predict results if numerous rules are actively engaged.

Additionally, rules only have access to specific system classes, namely to the OID attribute.
This means that rule definitions cannot act directly on any system attributes. Therefore,
functions such as func(table) will fail because table is considered a system class.

The code body for a particular rule can be accessed by viewing the pg_rules catalog.
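For instance, to list every rule definition currently installed, you can query that catalog directly (column names as they appear in the pg_rules system view):

```sql
SELECT tablename, rulename, definition FROM pg_rules;
```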
Chapter 13. Client-Side Programming
PostgreSQL provides a number of interfaces that enable client applications to access
the database back end. In addition to the APIs provided by PostgreSQL, a number of
other languages have provided their own interfaces to PostgreSQL.

The choice of client language depends on many factors. C and C++ excel at fine-
grained control and raw speed, Python and Perl are ideal for rapid prototyping and
flexibility, PHP is great as a web-based solution, and ODBC and JDBC provide access
from Windows or Java clients. Interfaces for each of these languages are addressed
in the following sections.
ecpg

ecpg is a set of applications and libraries that make it easy to include SQL commands
within C source code. Embedded SQL in C, or ecpg, is a multiplatform tool that many
RDBMSs support.

The concept behind Embedded SQL is that a developer can type SQL queries
directly into his or her C source code, and the ecpg preprocessor translates those
simple SQL statements into the more complex function calls, sparing the developer
that work.

The output of the ecpg program is standard C code; this can then be linked against
the libpq and ecpg libraries and compiled directly to binary code.

The general flow of creating a program with ecpg is illustrated in Figure 13.1.

Figure 13.1. The flow of program creation with ecpg.

For a complete discussion of the command-line options accepted by ecpg, refer to the
section titled "ecpg" in Chapter 6, "User Executable Files."

Embedded SQL makes use of the syntax in the following section to perform standard
database operations.

Declaring and Defining Variables

The following code is used to define the variables needed by the underlying C
program when data is passed to or from the PostgreSQL back end.

EXEC SQL BEGIN DECLARE SECTION;
[…Variable Definitions…]
EXEC SQL END DECLARE SECTION;
For instance, the following code defines variables to hold the employee_id and
employee_name of your fictional database:

EXEC SQL BEGIN DECLARE SECTION;
    int empl_id;
    varchar empl_name[30];
EXEC SQL END DECLARE SECTION;

Obviously, this section must occur before any use can be made of the empl_id and
empl_name variables, and the types must match their corresponding PostgreSQL
data type. Here is a brief table that matches PostgreSQL to standard C data types:

PostgreSQL     C
SMALLINT       short
INTEGER        int
INT2           short
INT4           int
FLOAT          float
FLOAT4         float
FLOAT8         double
DOUBLE         double
DECIMAL(p,s)   double
CHAR(n)        char x[n+1]
VARCHAR(n)     struct
DATE           char[12]
TIME           char[9]
TIMESTAMP      char[28]

Connecting to a Database

Embedded SQL in C uses the following syntax for connecting to a back-end server:

EXEC SQL CONNECT TO dbname

The actual database name can be specified as follows:

dbname[@server][:port]

Executing Queries

Once connected, queries can be sent to the back end for processing by using the
following syntax:

EXEC SQL query

In general, almost all query actions require that an explicit COMMIT command be
issued. The exception is SELECT commands, which do not require one.
The following are some examples of typical usage:

EXEC SQL SELECT * FROM payroll WHERE name='Jason Smith';

EXEC SQL INSERT INTO payroll VALUES
    ('Steven', 'Berkeley', 'CA');
EXEC SQL COMMIT;

EXEC SQL UPDATE payroll SET l_name='Wickes'
    WHERE f_name='Steven';
EXEC SQL COMMIT;

EXEC SQL DECLARE my_cur CURSOR FOR
    SELECT * FROM payroll
    WHERE state='CA';
EXEC SQL FETCH my_cur INTO :name;
[…some code…]
EXEC SQL CLOSE my_cur;
EXEC SQL COMMIT;

EXEC SQL DELETE FROM payroll;
EXEC SQL COMMIT;

Error Handling

The ecpg communications area must be defined with the following command:

EXEC SQL INCLUDE sqlca;

Additionally, error reporting then can be turned on with the following:

EXEC SQL WHENEVER sqlerror sqlprint;
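Putting the preceding pieces together, a minimal ecpg source file might look like the
following sketch (the database name newriders and the payroll table layout are
assumptions; the file must first be run through the ecpg preprocessor, then linked
against the ecpg and libpq libraries):

```c
#include <stdio.h>

EXEC SQL INCLUDE sqlca;

int main(void)
{
    EXEC SQL BEGIN DECLARE SECTION;
    int     empl_id;
    varchar empl_name[30];
    EXEC SQL END DECLARE SECTION;

    /* Print any SQL errors as they occur */
    EXEC SQL WHENEVER sqlerror sqlprint;

    EXEC SQL CONNECT TO newriders;

    EXEC SQL SELECT id, name INTO :empl_id, :empl_name
             FROM payroll WHERE name = 'Jason Smith';
    printf("id=%d name=%s\n", empl_id, empl_name.arr);

    EXEC SQL COMMIT;
    EXEC SQL DISCONNECT;
    return 0;
}
```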


JDBC

PostgreSQL provides a type 4 JDBC driver. This indicates that the driver is written in
pure Java and is platform independent; once compiled, the driver can be used on any
system.

Compiling the Driver

To build the driver when compiling PostgreSQL itself, include the --with-java option
of the configure command. Otherwise, if PostgreSQL is already installed, the driver
can still be compiled by entering the src/interfaces/jdbc directory and issuing
the make install command.

If you have installed on a package-based system, there is an appropriate RPM or
DEB package for JDBC installation (for example, postgresql-jdbc-7.1.2-
4PGDG.i386.rpm). Once installed, the driver usually resides in /usr/share/pgsql.

Upon completion, the JDBC driver will be in the current directory, named
postgresql.jar.

Installing the Driver

To use the driver, the jar archive postgresql.jar needs to be included in the
environment variable CLASSPATH. For example, to run a fictional Java application
whose classes are packaged in foo.jar, you would issue the following (this assumes
the Bash shell; Foo stands for the application's main class):

$ CLASSPATH=/usr/local/pgsql/lib/postgresql.jar:foo.jar

$ export CLASSPATH

$ java Foo

Configuring Clients

Any Java source that uses JDBC needs to import the java.sql package using the
following command:

import java.sql.*;

Don't Import the postgresql Package

Do not import the postgresql package. If you do, your source will not
compile.
Connecting

To connect, you need to get a Connection instance from JDBC. To do this, you
would use the DriverManager.getConnection() method:

Connection db = DriverManager.getConnection(url,user,pwd);

Option   Description
url      Database URL.
user     Username to connect as.
pwd      Password of the user.

With JDBC, a database is represented by a uniform resource locator (URL). With
PostgreSQL, this takes one of the following forms:

jdbc:postgresql:database
jdbc:postgresql://host/database
jdbc:postgresql://host:port/database

Option     Description
database   Database name.
host       Hostname of the server (default is localhost).
port       Port number of the server (default is 5432).

Executing Queries

To submit a query to the database, a Statement object is necessary. The
executeQuery() method returns a ResultSet instance that contains the
returned result. For instance:

Statement myst = db.createStatement();
ResultSet myrs = myst.executeQuery("SELECT * FROM payroll");
[…more code…]

Before the actual ResultSet can be accessed, its next() method must be called.
This method returns TRUE if more results are present and prepares the next tuple
for processing.
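Combining these steps, a minimal JDBC client might look like this sketch (the URL,
credentials, payroll table, and driver class name org.postgresql.Driver are
assumptions that may vary by version; a running server and postgresql.jar on the
CLASSPATH are required):

```java
import java.sql.*;

public class PayrollClient {
    public static void main(String[] args) throws Exception {
        // Register the PostgreSQL JDBC driver (class name assumed)
        Class.forName("org.postgresql.Driver");

        // Connect using a PostgreSQL JDBC URL (values assumed)
        Connection db = DriverManager.getConnection(
                "jdbc:postgresql://localhost/newriders", "user", "pwd");

        Statement myst = db.createStatement();
        ResultSet myrs = myst.executeQuery("SELECT * FROM payroll");

        // next() must be called before the first row can be read
        while (myrs.next()) {
            System.out.println(myrs.getString(1));
        }

        myrs.close();
        myst.close();
        db.close();
    }
}
```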

Updating Records

To update a specific element or to execute any statement that does not return a
ResultSet, use the executeUpdate() method. For instance:

Statement myst = db.createStatement();
myst.executeUpdate("UPDATE payroll SET first_name='Steve'");
libpq

The libpq library is a C language API that provides access to the PostgreSQL back
end. In fact, most of the provided client tools (like psql) use this library as their
connection route to the back end.

The libpq library provides many functions that can control nearly every aspect of
the client/server interaction. Although an in-depth discussion of every function is
outside the scope of this chapter, the two most commonly used functions are as follows:

PQconnectdb. Connects to a database.

PQexec. Sends and executes a query to the back end.

The following related functions manage an established connection:

PQreset. Resets client/server communication.

PQfinish. Closes the database connection.

PQconnectdb and PQexec are both discussed in more detail in the following sections.

PQconnectdb

The PQconnectdb function accepts a connection string and returns a pointer to a
PGconn connection object:

PGconn *PQconnectdb(const char *conninfo)

conninfo contains zero or more of the following options (in the form option=value), separated by whitespace:

Option       Description
host         The hostname (or UNIX path) to connect to.
hostaddr     For TCP/IP connections, the IP address.
port         The port of the server.
dbname       The specific database to connect to.
user         Connect as this user.
password     If authentication is required, the password.
options      Trace/debug options.
tty          The file or tty to send debugging information to.
requiressl   Set to 1 to mandate SSL connections.
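As a quick illustration of the option=value format, the following hypothetical helper
(written in Python purely to show how the pairs are assembled — it is not part of libpq,
and it omits the quoting that values containing spaces would need):

```python
# Hypothetical helper: assemble a libpq-style conninfo string from keywords.
# Sorting the keys just makes the output deterministic for this illustration.
def make_conninfo(**options):
    return " ".join(f"{key}={value}" for key, value in sorted(options.items()))

conninfo = make_conninfo(host="localhost", port=5432, dbname="payroll")
print(conninfo)  # dbname=payroll host=localhost port=5432
```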

If the conninfo string is not specified, the following environmental variables can
be set to specify the connection options:

PGDATABASE    Sets the database name.
PGDATESTYLE   Sets the default date style.
PGGEQO        Sets the default Genetic Optimizer.
PGHOST        Sets the database host.
PGOPTIONS     Sets various run-time options.
PGPASSWORD    Sets the user's password.
PGPORT        Sets the server port.
PGREALM       Sets the Kerberos realm.
PGTTY         Sets the debug/error tty or file.
PGTZ          Sets the default time zone.
PGUSER        Sets the username.

The following functions act on the PGconn pointer returned by PQconnectdb.

char *PQdb(const PGconn *conn)

Returns the name of the currently connected database.

char *PQuser(const PGconn *conn)

Returns the name of the currently connected user.

char *PQpass(const PGconn *conn)

Returns the password of the current connection.

char *PQhost(const PGconn *conn)

Returns the hostname of the back-end server.

char *PQport(const PGconn *conn)

Returns the server port of the current connection.

char *PQtty(const PGconn *conn)

Returns the debug file/tty of the connection.

char *PQoptions(const PGconn *conn)

Returns the debug/trace connection options.

char *PQerrorMessage(const PGconn *conn)

Returns the last error message generated by the connection.

int PQbackendPID(const PGconn *conn)

Returns the PID of the back-end server process.

ConnStatusType PQstatus(const PGconn *conn)

Returns one of the following status conditions:


CONNECTION_STARTED

CONNECTION_MADE

CONNECTION_AWAITING_RESPONSE

CONNECTION_AUTH_OK

CONNECTION_SETENV

CONNECTION_OK

CONNECTION_BAD

These connection-management functions are also available:

void PQreset(PGconn *conn)

Resets the communication port with the back end.

void PQfinish(PGconn *conn)

Closes the connection to the back end and frees memory used by the PGconn
object.

PQexec

The PQexec function sends a requested query to a connected back-end server. The
response is returned as a pointer to a PGresult object:

PGresult *PQexec(PGconn *conn, const char *query)

The following functions act on the returned PGresult pointer:

ExecStatusType PQresultStatus(const PGresult *res)

Provides information regarding the last executed query. This will return one of
the following values:

PGRES_EMPTY_QUERY

PGRES_COMMAND_OK

PGRES_TUPLES_OK
PGRES_COPY_OUT

PGRES_COPY_IN

PGRES_BAD_RESPONSE

PGRES_NONFATAL_ERROR

PGRES_FATAL_ERROR

char *PQresultErrorMessage(const PGresult *res)

This function returns the last error message specifically associated with a
particular PGresult. This differs from PQerrorMessage, which returns the
last error associated with a particular connection but not a specific result.

int PQntuples(const PGresult *res)

Returns the number of rows in the query result.

int PQnfields(const PGresult *res)

Returns the number of fields in each row of the query result.

char *PQfname(const PGresult *res, int field_index)

Returns the field name associated with the given field index. Field indices start
at 0.

int PQfnumber(const PGresult *res, const char *field_name)

Returns the field index associated with the specified field name. A value of –1
is returned if the given name does not match any field.

Oid PQftype(const PGresult *res, int field_index)

Returns an integer that represents the field type associated with the field
index. The system table pg_type contains the names and properties of the
various data types. The OIDs of the built-in data types are defined in
src/include/catalog/pg_type.h in the source tree.

int PQfsize(const PGresult *res, int field_index)

Returns the number of bytes of field data in the specified field index.
char *PQgetvalue(const PGresult *res, int tup_num, int field_num)

Returns a single field value of one row of a PGresult. In most instances, the
value returned by PQgetvalue is a null-terminated ASCII representation of
the value.

int PQgetlength(const PGresult *res, int tup_num, int field_num)

Returns the length of a field in bytes.

int PQgetisnull(const PGresult *res, int tup_num, int field_num)

Returns a value of 1 if the field is NULL; otherwise, returns a 0.

char *PQcmdTuples(const PGresult *res)

Returns the number of rows affected by the SQL command. This function only
measures the effects of INSERT, DELETE, or UPDATE commands.

Oid PQoidValue(const PGresult *res)

If the SQL command was an INSERT, returns the OID of the tuple inserted;
otherwise, returns InvalidOid.

void PQclear(PGresult *res)

Frees the storage associated with the PGresult. Every query result should be
freed via PQclear when it is no longer needed. PGresult does not go away
after use, even if the connection is closed. Failure to do this will result in
memory leaks in the front-end application.
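The functions above combine into a typical client as in the following sketch (the
connection string and the payroll table are assumptions; a reachable server is
required, and the program must be linked with -lpq):

```c
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int main(void)
{
    /* Connect; the conninfo values here are assumptions */
    PGconn *conn = PQconnectdb("host=localhost dbname=newriders");
    if (PQstatus(conn) == CONNECTION_BAD) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return EXIT_FAILURE;
    }

    PGresult *res = PQexec(conn, "SELECT * FROM payroll");
    if (PQresultStatus(res) == PGRES_TUPLES_OK) {
        int rows = PQntuples(res);
        int cols = PQnfields(res);
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                printf("%s=%s\n", PQfname(res, c), PQgetvalue(res, r, c));
    }

    PQclear(res);   /* every PGresult must be freed */
    PQfinish(conn); /* close the connection */
    return EXIT_SUCCESS;
}
```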
libpq++

The libpq++ library enables C++ applications to interface with the PostgreSQL
back end. Fundamentally, it operates in the same way as the libpq library except
that much of it is implemented as classes.

libpq++ provides two main classes: PgConnection and PgDatabase.

PgConnection

This class provides the functions needed to connect to a PostgreSQL database.

PgConnection::PgConnection(const char *conninfo)

The connection info can be specified either by the connect-string argument (as in
the preceding) or by expressly setting the following environmental variables:

PGDATABASE    Sets the database name.
PGDATESTYLE   Sets the default date style.
PGGEQO        Sets the default Genetic Optimizer.
PGHOST        Sets the database host.
PGOPTIONS     Sets various run-time options.
PGPASSWORD    Sets the user's password.
PGPORT        Sets the server port.
PGREALM       Sets the Kerberos realm.
PGTTY         Sets the debug/error tty or file.
PGTZ          Sets the default time zone.
PGUSER        Sets the username.

The connect-string argument, if the environmental variables are not used, can be
specified with the following options (in the form option=value):

Option       Description
dbname       The specific database to connect to.
host         The hostname (or UNIX path) to connect to.
hostaddr     For TCP/IP connections, the IP address.
options      Trace/debug options.
password     If authentication is required, the password.
port         The port of the server.
requiressl   Set to 1 to mandate SSL connections.
tty          The file or tty to send debugging information to.
user         Connect as this user.

This class provides several functions that assist in database connection.

int PgConnection::ConnectionBad()

Returns TRUE if the connection to the back end failed; otherwise, returns FALSE.

ConnStatusType PgConnection::Status()

Returns the status of the connection to the back-end server: either
CONNECTION_OK or CONNECTION_BAD.
ExecStatusType PgConnection::Exec(const char* query)

Sends a query to the back-end server for execution and returns the resulting
status, which should be one of the following:

PGRES_EMPTY_QUERY

PGRES_COMMAND_OK

PGRES_TUPLES_OK

PGRES_COPY_OUT

PGRES_COPY_IN

PGRES_BAD_RESPONSE

PGRES_NONFATAL_ERROR

PGRES_FATAL_ERROR

const char *PgConnection::ErrorMessage()

Returns the last error message text.

PgDatabase

The PgDatabase class provides access to the elements residing in a returned set of
data. Specifically, this class is useful for returning information pertaining to how
many rows or fields were affected by a given query. The following are the class
functions:

int PgDatabase::Tuples()

Returns the number of rows in the query result.

int PgDatabase::CmdTuples()

Returns the number of rows affected after an INSERT, UPDATE, or DELETE. If
the command was anything else, it returns –1.

int PgDatabase::Fields()

Returns the number of fields in the query result.


const char *PgDatabase::FieldName(int field_num)

Returns the field name associated with the given index. The field indices start
at 0.

int PgDatabase::FieldNum(const char* field_name)

Returns the field index associated with the field name specified.

Oid PgDatabase::FieldType(int field_num)

Returns the field type associated with the given index. The integer returned is
an internal coding of the type.

short PgDatabase::FieldSize(int field_num)

Returns the number of bytes occupied by the given field. Field indices start at
0.

const char *PgDatabase::GetValue(int tup_num, int field_num)

Returns a single field value from a row of a PGresult. Row and field indices
start at 0. For most queries, the value returned by GetValue is a null-
terminated ASCII string.

int PgDatabase::GetLength(int tup_num, int field_num)

Returns the length of a field in bytes.

void PgDatabase::PrintTuples(FILE *out = 0, int printAttName = 1, int terseOutput = 0, int width = 0)

Prints out all the tuples and/or the attribute names.

int PgDatabase::GetLine(char* string, int length)

Reads a line directly from the socket.

void PgDatabase::PutLine(const char* string)

Writes a line directly to the connection socket.

int PgDatabase::EndCopy()

Ensures that client and server will be synchronized, in case direct access
methods caused communications to get out of sync.
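A minimal libpq++ client combining these two classes might look like the following
sketch (the connection string and payroll table are assumptions; a running server is
required, and the program must be linked against the libpq++ library):

```cpp
#include <iostream>
#include <libpq++.h>

int main()
{
    // PgDatabase derives from PgConnection, so one object serves both roles
    PgDatabase db("host=localhost dbname=newriders");
    if (db.ConnectionBad()) {
        std::cerr << "connection failed: " << db.ErrorMessage() << std::endl;
        return 1;
    }

    if (db.Exec("SELECT * FROM payroll") == PGRES_TUPLES_OK) {
        for (int r = 0; r < db.Tuples(); r++) {
            for (int f = 0; f < db.Fields(); f++)
                std::cout << db.FieldName(f) << "=" << db.GetValue(r, f) << " ";
            std::cout << std::endl;
        }
    }
    return 0;
}
```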
libpgeasy

Essentially, the libpgeasy library provides a simplified interface to the libpq C
library. The typical uses of libpgeasy are as follows:

connectdb. Connects to the database.

doquery. Executes the supplied query.

fetch. Retrieves the results from the back end.

disconnectdb. Closes the database connection.

The following functions are provided by libpgeasy to accomplish the preceding:

PGconn *connectdb(char *options);

PGresult *doquery(char *query);

int fetch(void *param);

int fetchwithnulls(void *param);

void reset_fetch();

void disconnectdb();
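In use, these calls follow the same connect/query/fetch/disconnect shape, as in this
sketch (the connection options and the two-column payroll table are assumptions; a
reachable server is required):

```c
#include <stdio.h>
#include <libpq-fe.h>
#include <libpgeasy.h>

int main(void)
{
    char name[64];
    int  amount;

    connectdb("dbname=newriders");          /* connection options assumed */
    doquery("SELECT name, amount FROM payroll");

    /* fetch() fills the supplied variables row by row until END_OF_TUPLES */
    while (fetch(name, &amount) != END_OF_TUPLES)
        printf("%s earns %d\n", name, amount);

    disconnectdb();
    return 0;
}
```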
ODBC

Open database connectivity (ODBC) is an API that provides a product-neutral
interface between front-end applications and database servers. ODBC drivers are
primarily used to connect Windows-based applications to various RDBMSs. However,
ODBC drivers are available for almost all platforms, including UNIX, Mac, and
others.

Installation

PostgreSQL can be compiled (or installed from packages) with the necessary drivers
for ODBC access. Although PostgreSQL includes some built-in ODBC drivers, other
projects are more supported. One of the more popular ODBC access methods is
currently the unixODBC project (see www.unixodbc.org).

Installation of the ODBC drivers can be broken up into five steps:

1. Install and configure an ODBC manager.

2. Compile the specific PostgreSQL ODBC driver.

3. Add the ODBC extensions to the base catalogs.

4. Install the PostgreSQL ODBC driver on a client machine.

5. Configure the .ini file or use the provided GUI.

Before the actual installation of the chosen ODBC driver can begin, an ODBC
manager must already exist on the system. All versions of Windows from
Windows 95 on include an ODBC manager. For UNIX/Linux clients, there are
several choices, including the unixODBC manager applet and a free ODBC
client called iODBC. (More information can be obtained from www.unixodbc.org or
www.iodbc.org.)

If your system was installed from source code, the option --enable-odbc could've
been supplied at compile time. (See Chapter 10, "Common Administrative Tasks,"
for more compile-time options.) Likewise, most of the package-based installs also
provide an optional package that includes the required ODBC functionality (for
example, postgresql-odbc-7.1.2-4PGDG.i386.rpm).

Alternatively, if the system has previously been compiled without the ODBC option,
it can still be compiled by running the make install command in the appropriate
directory (for example, src/interfaces/odbc).

The base catalogs require some modifications to be completely ODBC compatible.
The file odbc.sql is a collection of modifications that need to be made to the base
catalog structure. It is designed to be executed as a script and should require no
human interaction. To apply these changes automatically, execute the following
command as the PostgreSQL DBA user:

>psql -d template1 -f PATH/odbc.sql


Additionally, make sure you start the postmaster with the -i option (or make the
appropriate change to the postgresql.conf file), which enables access from
TCP/IP connections. Most systems also require that the pg_hba.conf file
be edited. (See Chapter 10 for more information on PostgreSQL administration.)
Otherwise, the client would need to be local to connect successfully.

As for installing the client machines, the easiest method is to download the Windows
executable that automatically installs and configures Windows machines. This
installer can be obtained from the following (check mirrors also):

ftp://ftp.postgresql.org/pub/odbc/versions/full/

Additionally, the MS Installer (MSI) or plain DLL versions of the driver can be
obtained from the following:

ftp://ftp.postgresql.org/pub/odbc/versions/msi/

ftp://ftp.postgresql.org/pub/odbc/versions/dll/

The next step is to configure the odbc.ini file (or preferably, use the provided GUI
management dialog).

The odbc.ini file has three required sections:

[ODBC Data Sources]

A list of database names.

This section must include the following:

Driver = Path (e.g., prefix/lib/libpsqlodbc.so)
Database = DatabaseName
Servername = localhost
Port = 5432

[Data Source Specification]

A configuration section for each ODBC data source.

[ODBC]

Defines the InstallDir keyword.
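Put together, a minimal odbc.ini might look like the following sketch (the data source
name, paths, and database name are all assumptions):

```ini
[ODBC Data Sources]
; map a data source name to a description
newriders = PostgreSQL example DSN

[newriders]
; per-data-source configuration
Driver     = /usr/local/lib/libpsqlodbc.so
Database   = newriders
Servername = localhost
Port       = 5432

[ODBC]
InstallDir = /usr/local/lib
```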

An alternative to specifying all of these options within an .ini file is to utilize the
GUI configuration tool provided with the Windows driver.

The options provided via this method include the following:


Disable Genetic Optimizer. Automatically turns off the back-end genetic
optimizer at connection time.

Keyset Query Optimization (KSQO). Some applications, specifically the MS
Jet Database Engine, use "keyset" queries. Many of these queries will most
likely crash the back end without the KSQO feature.

CommLog. Logs communications to/from the back end to a file.

Recognize Unique Indexes. This setting enables Access 95 and 97 to ask
the user at link time what the index should be.

ReadOnly (default). New data sources will inherit the state of this box for
the data source read-only attribute.

Use Declare/Fetch. If true (default), the driver automatically uses declare
cursor/fetch to handle SELECT statements and keeps 100 rows in a cache.

Parse Statements. If enabled, the driver will parse a SQL query statement to
identify the columns and tables and to gather statistics about them such as
precision, nullability, aliases, and so on.

Unknown Sizes. This controls what SQLDescribeCol and
SQLColAttributes will return for precision for character data types. This
was more of a workaround for pre-6.4 versions of Postgres. Options are as
follows:

Maximum. Allows return of the maximum precision of the data type.

Don't Know. Returns a "Don't Know" value and lets the application decide.

Longest. Returns the longest string length of the column of any row.

Data Type Options. Affects how some data types are mapped. Options are as
follows:

Text as LongVarChar. Postgres TEXT type is mapped to SQLLongVarChar;
otherwise, SQLVarChar.

Unknowns as LongVarChar. Unknown types (arrays and so on) are mapped to
SQLLongVarChar; otherwise, SQLVarChar.

Bools as Char. Boolean returns are mapped to SQL_CHAR; otherwise, to SQL_BIT.

Cache Size. When using cursors, this is the row size of the tuple cache. If not
using cursors, this is how many tuples to allocate memory for at any given
time.
Max VarChar. The maximum precision of the VarChar and BPChar (char[x])
types.

Max LongVarChar. The maximum precision of the LongVarChar type.

SysTable Prefixes. By default, names that begin with pg_ are treated as
system tables. This allows defining additional prefixes. Separate each prefix
with a semicolon (;).

Connect Settings. These commands will be sent to the back end upon a
successful connection. Use a semicolon (;) to separate commands.

The driver also has these data source/connection options:

ReadOnly. Determines whether the data source will allow updates.

Row Versioning. Allows applications to detect whether data has been
modified by other users while you are attempting to update a row. The driver
uses the xmin system field of Postgres to allow for row versioning.

Show System Tables. The driver will treat system tables as regular tables.

OID Options:

Show Column. Shows the OID.

Fake Index. Fakes a unique index on OID. This is mainly useful for older MS
Access–style applications.

Protocol:

6.2. Forces the driver to use Postgres 6.2 protocol, which had different byte
ordering, protocol, and other semantics.

6.3. Use the 6.3 protocol. This is compatible with both 6.3 and 6.4 back ends.

6.4. Use the 6.4 protocol. This is only compatible with 6.4.
Perl

PostgreSQL already includes the procedural language PL/Perl that can run Perl
scripts. Accessing PostgreSQL from an external Perl script, however, requires the
use of the Perl database-independent (DBI) module. The Perl DBI defines a set of
functions, variables, and conventions to Perl scripts, regardless of what back-end
database is actually used.

Perl DBI provides a consistent interface to scripts, resulting in much more portable
and flexible code. The DBI is just a general-purpose interface, however; a database
driver is still needed to connect to a specific database.

The Perl system does include an older, non-DBI PostgreSQL access module named
Pg; however, most recent development work has gone into the newer
DBI-compliant modules.

The overall architecture of the Perl DBI system is illustrated in Figure 13.2.

Figure 13.2. The overall architecture of the Perl DBI system.

The PostgreSQL driver is named DBD::Pg, and it must be present and installed for
execution to be successful. This class and driver set is modeled closely after the
libpq library functions. Therefore, the functional interfaces are analogous to how
things would be done in C.

DBI Class (Connecting)

The DBI class is the base class provided by the interface system. The following are
the methods it provides:

connect($data_source, $username, $password, \%attr);

Options:

data_source: The data source name, including the driver (such as dbi:Pg:dbname=mydb).

username: Connect as this user.

password: The user's password.

%attr: Various options specific to that driver.


Description: Establishes a database connection. If successful, returns a valid
database handle object.

available_drivers

Description: Returns a list of valid database drivers.

data_sources($driver)

Description: Returns a list of valid databases for the specified driver.

trace($level[, $file])

Description: Specifies the trace/debugging level and a logging file, if specified.

DBI Handle Methods (Executing Queries)

Once the DBI class returns a valid handle object, it will provide these methods:

prepare($statement[, \%attr])

Description: This function sends the query statement, along with options, to
the database engine for preparation. (This method does not perform the
prepare with PostgreSQL; it simply caches the query until an execute is
called.)

do($statement[, \%attr][, @bind_values])

Description: Prepares and executes the supplied query statement. Additionally,
optional attributes and where to bind the results can be specified.

commit

Description: Issues a COMMIT to the database back end.

rollback

Description: Issues a ROLLBACK to the database back end.

disconnect

Description: Disconnects from the database.

ping

Description: Determines whether the database server is still running.


quote($string)

Description: Escapes any special characters. Useful for formatting a query
string before delivering it to the back end.

DBI Statement Handle Methods (Results)

After a statement handle has been returned, that object provides the following methods:

execute([@bind_values])

Description: Executes the previously prepared statement. Optionally, binds the
values for each element before executing the statement.

fetchrow_arrayref

Description: Fetches the next row of data holding values; returns a reference
to the array.

fetchrow_array

Description: Fetches the next row of data holding values; returns an array.

fetchrow_hashref

Description: Fetches the next row of data; returns a reference to a hash keyed
by field name.

fetchall_arrayref

Description: Fetches all the rows of data holding values; returns a reference to
an array.

finish

Description: Indicates to the back end that no more rows will be fetched;
allows the server to reclaim resources.

rows

Description: Returns the number of rows affected by the last query.

bind_col($column_number, \$var_to_bind, \%attr);

Description: Binds a specific PostgreSQL column to a Perl variable.


Statement Handle Attributes

The returned statement handles provide the following attributes:

NUM_OF_FIELDS

Description: Returns the number of fields in the current row.

NUM_OF_PARAMS

Description: Returns the number of placeholders in the prepared statement.

NAME

Description: Returns a reference to an array that contains the field's names for
each column.

pg_size

Description: Returns a reference to an array of integer values for each column.
The integer shows the size of the column in bytes. Variable-length columns are
indicated by –1.

pg_type

Description: Returns a reference to an array of strings for each column. The
string shows the name of the data type.

pg_oid_status

Description: PostgreSQL-specific attribute that returns the OID of the last
INSERT command.

pg_cmd_status

Description: PostgreSQL-specific attribute that returns the type of the last
command. Possible types are INSERT, DELETE, UPDATE, and SELECT.
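A typical DBI session stitches these methods together as in the following sketch (the
data source string, credentials, and payroll table are assumptions; the DBD::Pg driver
and a running server are required):

```perl
#!/usr/bin/perl
use strict;
use DBI;

# Connect; the data-source string and credentials are assumptions
my $dbh = DBI->connect("dbi:Pg:dbname=newriders", "user", "password")
    or die $DBI::errstr;

# Prepare and execute a query
my $sth = $dbh->prepare("SELECT name, amount FROM payroll");
$sth->execute();

# Fetch each row as an array
while (my @row = $sth->fetchrow_array) {
    print "$row[0] earns $row[1]\n";
}

$sth->finish;       # release server resources
$dbh->disconnect;   # close the connection
```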
Python (PyGreSQL)

PyGreSQL is a Python interface to the PostgreSQL database. It was written by D'Arcy J.M. Cain and
was based heavily on code written by Pascal Andre.

PyGreSQL is implemented as three parts:

C shared module: _pg.so

Two Python wrappers: pg.py and pgdb.py

Compiling PyGreSQL

In the directory containing pgmodule.c, run the following command:

cc -fpic -shared -o _pg.so -I[pyInc] -I[pgInc] -L[pgLib] -lpq pgmodule.c

Compile-time options include the following:

[pyInc] = Python include path (Python.h)

[pgInc] = PostgreSQL include path (Postgres.h)

[pgLib] = PostgreSQL libraries path (libpq.so /libpq.a)

Some of the following keywords can be specified:

-DNO_DEF_VAR Disable default variables support.

-DNO_DIRECT Disable direct access methods.

-DNO_LARGE Disable large object support.

-DNO_PQSOCKET Older PostgreSQL version.

-DNO_SNPRINTF No snprintf call available.

Python Configuration

Locate the directory where Python's dynamically loaded modules are located (for example,
/usr/lib/python/lib-dynload). Copy the resulting _pg.so file to this location. Copy the pg.py
and pgdb.py files to Python's standard library directory (for example, /usr/local/lib/Python).

The pg.py file uses the traditional interface, whereas the pgdb.py file is compliant with the DB-API
2.0 specification developed by the Python DB-SIG.

The remainder of this section describes only the older pg API. You can read about the new DB-SIG
API at the following:
www.python.org/topics/database/DatabaseAPI-2.0.html

Or a tutorial is located at the following:

www2.linuxjournal.com/lj-issues/issue49/2605.html

PyGreSQL Interfaces

The PyGreSQL module provides two separate interfaces to a PostgreSQL database server. Access is
provided via one of the two included wrapper modules:

pg. Standard PyGreSQL interface.

pgdb. The DB-API 2.0 interface.

Although most of the new development effort is going into the DB-API-compliant interface, the
standard PyGreSQL interface is currently the more established of the two. This section will focus
on the standard interface, although information on the DB-API 2.0 interface can be found at the
following:

www.python.org/topics/database/DatabaseAPI-2.0.html

The standard pg module provides the following attributes:

connect(dbname, host, port, opt, tty, user, passwd)

Description: Opens a PostgreSQL connection.

Parameters:

dbname The name of the connected database.

host The name of the server host.

port The port used by the database server.

opt The connection options.

tty The debug terminal.

user The PostgreSQL user.

passwd The password for the user.

For instance:

>>>import pg
>>>database=pg.connect(dbname="newriders", host="127.0.0.1")
get_defhost()

Description: Returns the current default host information.

get_defport()

Description: Returns the current default port information.

get_defopt()

Description: Returns the current default connection options.

get_deftty()

Description: Returns the current default debug terminal.

set_deftty(tty)

Parameters:

tty The new debug terminal

Description: Sets the debug terminal value for new connections. If None is supplied as a
parameter, environment variables will be used in future connections.

get_defbase()

Description: Returns the current database name.

Sending Queries to a Database Object

Once connected to a database, a pgobject is returned. This object embeds specific parameters that
define this connection. The following parameters are available through function calls:

query(command)

Parameters:

command The SQL command string

Description: Sends the specified SQL query (command) to the database. If the query is an
insert statement, the return value is the OID of the new row. If it is a query that does not
return a result, None is returned. For SELECT statements, a pgqueryobject object is
returned that can be accessed via the getresult or dictresult method.

For instance:

>>>import pg
>>>database=pg.connect("newriders")
>>>result=database.query("SELECT * FROM authors")

close
Description: Closes the database connection. The connection is automatically closed when the
connection is deleted, but this method enables an explicit close to be issued.

fileno

Description: Returns the underlying socket ID used to connect to the database.

getnotify

Description: Receives NOTIFY messages from the server. If no notification is pending, the
method returns None. Otherwise, it returns a tuple (relname, pid), where relname is the
name of the notify and pid is the process ID of the connection that triggered it.
Remember to issue a LISTEN query first; otherwise, getnotify will always return None.

inserttable

Description: Allows quick insertion of large blocks of data in a table. The list is a list of
tuples/lists that define the values for each inserted row.

putline

Description: Writes a string directly to the connection socket.

getline

Description:This method reads a string directly from the server socket.

endcopy

Description: Ensures that the client and server will be synchronized, in case direct access
methods cause communications to get out of sync.

Accessing Large Objects from a Database Connection

To access large objects via a pg connection to a database, the following functions are used:

getlo

Description: Gets a large object through the object's OID.

locreate

Description: Creates a large object in the database.

loimport

Description: This method enables you to create large objects in a very simple way. You just give
the name of a file containing the data to be used.

open

Description: This method opens a large object for reading/writing, in the same way as the UNIX
open() function.
close

Description: This method closes a previously opened large object, in the same way as the UNIX
close() function.

read

Description: This function enables you to read a large object, starting at the current position.

write

Description: This function enables writing a large object, starting at the current position.

tell

Description: This method gets the current position of the large object.

seek

Description: Moves the position cursor in the large object.

unlink

Description: Deletes a PostgreSQL large object.

size

Description: Returns the size of a large object. Currently, the large object needs to be opened.

export

Description: Dumps the content of a large object on the host of the running Python program,
not the server host.

Accessing Results from a pgobject

Once a query has been issued to the database, if results are returned, they can be accessed in the
following ways:

getresult

Description: Returns the list of the values contained in pgqueryobject. More information
about this result can be accessed using listfields, fieldname, or fieldnum methods.

dictresult

Description: Returns the list of the values contained in pgqueryobject, returned as a
dictionary with the field names used as the key indexes.
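The shape difference between getresult and dictresult can be illustrated with plain Python data; the rows and field names below are hypothetical, and no live connection is involved:

```python
# Hypothetical rows in getresult() form: a list of tuples, one per row.
rows = [(1, "Stinson"), (2, "Parody")]

# Field names as listfields() would report them, in column order.
fields = ["id", "author"]

# dictresult() returns the same rows keyed by field name instead of position.
dict_rows = [dict(zip(fields, row)) for row in rows]

print(dict_rows[0]["author"])   # same value as the positional rows[0][1]
```

Keyed access via dictresult is generally more robust against column reordering than positional access via getresult.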

listfields

Description: Lists the field names of the previous query result. The fields are in the same order
as the result values.
fieldname(int)

Description: Finds a field name from its ordinal sequence number (integer). The fields are in
the same order as the result values.

fieldnum(str)

Description: Returns the field number from its name (string).

ntuples

Description: Returns the number of tuples found in a query.

reset

Description: Resets the database connection.

For example:
>>>import pg
>>>database=pg.connect("newriders")
>>>results=database.query("SELECT * FROM payroll")
>>>results.ntuples()
2340
>>>mydict=results.dictresult()

The DB Wrapper

The preceding functions are wrapped within the pg module. This module also provides a special
wrapper named DB. This wrapper streamlines much of the connection and access mechanics needed
to interact with the database. The preceding functions are also included in the namespace, so it isn't
necessary to import both modules. The preferred way to use this module is as follows:

>>>import pg
>>>db=pg.DB('payroll','localhost')
>>>db.query("INSERT INTO checks VALUES ('Erica',200)")
>>>db.query("SELECT * FROM checks")

Name Amount
-------------
Erica 200

The following list describes the methods and variables of this class (these are very similar to the base
pg method, with some slight exceptions):

pkey(table)

Description: This method returns the primary key of a table. Note that this raises an exception
if the table doesn't have a primary key.

get_databases

Description: Returns a list of available databases. Although you can obtain this with a simple
SELECT, it is added here for convenience.

get_tables
Description: Returns a list of tables available in the current database.

get_attnames

Description: Returns a list of attribute names.

get(table, arg, [keyname])

Parameters:

table The name of the table.

arg Either a dictionary or the value to be looked up.

keyname The name of field to use as key (optional).

Description: Gets a single row. It assumes that the key specifies a unique row; if keyname is not
specified, the primary key for the table is used.

insert(table, a)

Parameters:

table The name of the table.

a A dictionary of values.

Description: Inserts values into the specified table, using values from the dictionary. Then the
dictionary is updated with values modified by rules, triggers, and so on.

update(table, a)

Parameters:

table The name of the table.

a A dictionary of values.

Description: Updates an existing row. The update is based on the OID value from get. An array is
returned that reflects any changes caused by the update due to triggers, rules, defaults, and so on.

clear(table, [a])
Parameters:

table The name of the table.

a A dictionary of values.

Description: Clears all fields to their "clear" values, which are determined by the data type:
numeric types are set to 0, dates are set to TODAY, and everything else is set to NULL. If the
argument a is present, it is used as the array, and any entries matching attribute names are
cleared, with everything else left unchanged.

delete(table, a)

Parameters:

table The name of the table.

a A dictionary of values.

Description: Deletes the row from a table based on the OID from get.
PHP

PHP is a scripting language used for building dynamic web pages. It contains a number of advanced
features that rival commercial options such as ASP and ColdFusion.

It contains several built-in database interfaces, including functions specific for communicating with
both MySQL and PostgreSQL. The following is a list of the functions specific to PostgreSQL:

pg_close(connection_id)

Description: Closes a PostgreSQL connection. Returns false if not a valid connection;
otherwise, returns true.

pg_cmdtuples(result_id)

Description: Returns the number of instances affected by an INSERT, UPDATE, or DELETE
query. If no tuple was affected, the function will return 0.

pg_connect([host], [port], [options], [tty], dbname)

Description: Opens a connection to a PostgreSQL database. Returns a connection index on
success or false if the connection could not be made.

Example:

<?php
$dbconn = pg_Connect ("dbname=newriders");
$dbconn2 = pg_Connect ("host=localhost port=5432 dbname=newriders");
?>

pg_dbname(connection_id)

Description: Returns the name of the database connected to the specified connection index.
Otherwise, it returns false if the connection is not a valid connection index.

pg_end_copy([resource connection])

Description: Synchronizes a front-end application with the back end after doing a copy
operation. It must be issued; otherwise, the back end might get out of sync with the front end.

pg_errormessage(connection_id)

Description: Returns a string containing any error messages from previous database operations;
otherwise, returns false.

pg_exec(connection_id, query)

Description: Returns a result index following the execution of the SQL commands contained in
the query. Otherwise, it returns a false value. From a successful execution, the return value
of this function is an index to be used to access the results from other PostgreSQL functions.

pg_fetch_array(result_id, row, [result_type])


Description: Returns an array that corresponds to the fetched row; otherwise, returns false if
there are no more rows.

pg_fetch_array() is an extended version of pg_fetch_row(). In addition to storing the
data in the numeric indices of the result array, it also stores the data in associative indices,
using the field names as keys.

For instance:

<?php
$conn = pg_pconnect ("dbname=newriders");

$rst = pg_exec ($conn, "SELECT * FROM authors");

if (!$rst) {
    echo "An error occurred.\n";
    exit;
}

$rst_array = pg_fetch_array ($rst, 0);
echo $rst_array[0] . " First Row - First Field\n";

$rst_array = pg_fetch_array ($rst, 1);
echo $rst_array["author"] . " Second Row - Author Field\n";
?>

pg_fetch_object(result_id, row, [result_type])

Description: Returns the properties that correspond to the fetched row; otherwise, returns
false if there are no more rows.

pg_fetch_object() is similar to pg_fetch_array() with one difference: an object is
returned instead of an array. As a result, you can only access the data by the field names, not
by their ordinal numbers.

pg_fetch_row(result_id, row)

Description: Returns the specified row as an array. Each result column is stored in an array
offset, starting at offset 0.

pg_fieldisnull(result_id, row, field)

Description: Returns 0 if the field in the given row is not NULL. Returns 1 if the field in the
given row is NULL. Field can be specified as number or fieldname.

pg_fieldname(result_id, field_number)

Description: Returns the field name of the corresponding field-index number specified. Field
numbering starts from 0.

pg_fieldnum(result_id, field_name)

Description: Returns the field number for the column name specified.

pg_fieldprtlen(result_id, row, field_name)

Description: Returns the number of characters of a specific field in the given row.
pg_fieldsize(result_id, field_number)

Description: Returns the internal storage size, in bytes, of the given field number. A field
size of -1 indicates a variable-length field.

pg_fieldtype(result_id, field_number)

Description: Returns a string containing the data type of the field represented by the field
number supplied.

pg_freeresult(result_id)

Description: Frees all memory associated with the result. Generally, this is only needed
when you are certain you are running low on memory, because PHP automatically frees
result memory once a connection is closed.

pg_getlastoid(result_id)

Description: Returns the last OID assigned to an inserted tuple. The result identifier is used
from the last command sent via pg_exec().

pg_host(connection_id)

Description: Returns the hostname of the connected PostgreSQL server.

pg_loclose(file_id)

Description: Closes a large object. file_id is a file descriptor for the large object from
pg_loopen().

pg_locreate(connection_id)

Description: Creates a large object and returns its OID.

pg_loexport(oid, file_path [, int connection_id])

Description: Exports a large object to a file. The oid argument specifies the object ID of the
large object to export, and the file_path argument specifies the pathname of the destination file.

pg_loimport(file_path, [connection_id])

Description: Imports a file as a large object. The file_path argument specifies the pathname of
the file to be imported. All handling of large objects in PostgreSQL must happen inside a transaction.

pg_loopen(connection_id, obj_oid, string mode)

Description: Opens a large object and returns file descriptor. The file descriptor encapsulates
information about the connection. Do not close the connection before closing the large object
file descriptor. obj_oid specifies a valid large object OID. The mode can be "r","w", or "rw".

pg_loread(file_id, length)

Description: Reads the specified length of bytes from a large object and returns it as a string.
The file_id specifies a valid large object file descriptor.
pg_loreadall(file_id)

Description: Reads a large object and passes it straight to the browser.

pg_lounlink(connection_id, large_obj_id)

Description: Deletes a large object with the OID specified in large_obj_id.

pg_lowrite(file_id, buffer)

Description: Writes to a large object from the specified buffer. Returns the number of bytes
actually written or false in the case of an error. file_id refers to the file descriptor for the
large object from pg_loopen().

pg_numfields(result_id)

Description: Returns the number of fields in a result. The result_id is a valid result identifier
returned by pg_exec().

pg_numrows(result_id)

Description: Returns the number of rows in a result. The result_id is a valid result identifier
returned by pg_exec().

pg_options(connection_id)

Description: Returns a string of the specified options valid on the provided connection identifier.

pg_pconnect([host], [port], [options], [tty], dbname, [user], [password])

Description: Opens a persistent connection, needed by other PHP functions, to a PostgreSQL
database.

pg_port(connection_id)

Description: Returns the port number of the PostgreSQL server.

pg_put_line(connection_id, data)

Description: Sends a NULL-terminated string to the PostgreSQL server. This is useful, for
example, for very high-speed insertion of data into a table, initiated by starting a PostgreSQL
copy operation.

For instance:

<?php
$conn = pg_pconnect ("dbname=foo");
pg_exec($conn, "create table bar (a int4, b char(16), d float8)");
pg_exec($conn, "copy bar from stdin");
pg_put_line($conn, "3\thello world\t4.5\n");
pg_put_line($conn, "4\tgoodbye world\t7.11\n");
pg_put_line($conn, "\\.\n");
pg_end_copy($conn);
?>
pg_result(result_id, row_number, fieldname)

Description: Returns values from a result identifier produced by pg_exec(). The row_number
and fieldname specify what elements of the array are returned. Instead of naming the field,
you can use the field index as an unquoted number.

pg_set_client_encoding(connection_id, encoding)

Description: Sets the client encoding type. The encoding can be SQL_ASCII, EUC_JP, EUC_CN,
EUC_KR, EUC_TW, UNICODE, MULE_INTERNAL, LATIN1 … LATIN9, KOI8, WIN, ALT, SJIS,
BIG5, or WIN1250. Returns 0 if success or –1 if error.

pg_client_encoding(connection_id)

Description: Returns the client encoding as a string. Will be one of the values that can be set
with the pg_set_client_encoding function.

pg_trace(filename, [mode, [connection_id]])

Description: Enables tracing of the PostgreSQL front-end/back-end communication to a
debugging file. This is a useful aid in debugging communication problems.

pg_tty(connection_id)

Description: Returns the tty name to which server-side debugging output is sent.

pg_untrace(connection_id)

Description: Stops tracing started by pg_trace.


Chapter 14. Advanced PostgreSQL Programming
One of the true benefits PostgreSQL has over many commercial RDBMSs is that it
can be regularly extended. For instance, PostgreSQL does not natively have a data
type for a Dewey Decimal object. However, if you were creating a database that
was going to serve as the back-end system to a large library that used the Dewey
Decimal system, such a data type could be very useful.

Plugging in new features and objects into the database is known as "extending" it.
Fundamentally, PostgreSQL enables this by allowing users to write new C-based
objects and by using the resultant function as a handler for specific data type,
operator, or aggregate needs.

The basic act of extending PostgreSQL involves the following steps:

1. Creating a C-based shared object that performs the function desired.

2. Registering that function with the PostgreSQL back end through the use of the
CREATE FUNCTION command.

3. Linking the proper SQL command (for example, CREATE TYPE, CREATE
OPERATOR, and so on) with that registered object.

Understanding how extensibility works first requires an overview of the PostgreSQL
catalog system. The system catalogs are essentially just special tables. However,
instead of storing user data, these tables store information regarding how operators,
functions, data types, aggregates, rules, and triggers are defined. Therefore, by
using the provided mechanisms to modify these tables, the PostgreSQL system itself
can be extended.

One type of information that these tables store is pointers to compiled shared
objects that handle specific database functions. In essence, the CREATE
FUNCTION, CREATE OPERATOR, CREATE TYPE, and CREATE AGGREGATE
commands modify these system catalogs to include definitions for this extra
functionality.

The basic breakdown of system catalogs can be defined as shown in Table 14.1.

Table 14.1. System Catalogs

Table Description

pg_aggregate Aggregates and aggregate functions

pg_am Access methods


pg_amop Access method operators

pg_amproc Access method support functions

pg_attribute Table columns

pg_class Tables

pg_database Databases

pg_index Secondary indices

pg_opclass Access method operator classes

pg_operator Operators

pg_proc Procedures (both C and SQL)

pg_type Types (both base and complex)


Extending Functions

Most acts of extension require the defining of special functions. For instance, to define a new
data type, a C shared-object function describing the new data type must first be created.
There are three fundamental types of custom function (also refer to Chapter 12, "Creating
Custom Functions," for a more relevant discussion of created SQL or PL functions):

SQL functions. These functions consist purely of standard SQL code. No external
database objects must exist in order for these to be executed. They can be defined on-
the-fly regardless of the configuration of the base system.

PL functions. These functions are written in a non-native code (for example, PL/pgTCL).
For these functions to execute, an external shared-object handler must exist. The
handler functions must first be registered with the database back end before execution
can proceed.

Compiled functions. These functions are typically compiled C-language functions that
define a specific input-output response. They can be called from standard SQL code
(the upper() function is an example). However, they must first be compiled as a C
shared object and then registered with the database in order to be activated.

SQL Functions

SQL language functions are simply predefined queries that are assigned a name. However,
they do support input parameters and can provide return values. Writing SQL functions requires no
modification to the base system or special features. For instance:

CREATE FUNCTION getpay(int4) RETURNS float8 AS
' SELECT sum(amount) FROM payroll
  WHERE employee_id=$1;
' LANGUAGE 'sql';

Standard SQL functions can also handle classes to be passed to or from it. For instance:

CREATE FUNCTION getshortname(payroll) RETURNS varchar AS
' SELECT left($1.last_name, 4) AS S_Name;
' LANGUAGE 'sql';

SELECT last_name FROM payroll WHERE emp_id=12345;

last_name
-------------------
Parody

SELECT getshortname(payroll) FROM payroll WHERE emp_id=12345;

s_name
-------------------
Paro
Procedural Language Functions

Procedural language functions are offered via loadable modules. For instance, the PL/pgSQL
language depends on the plpgsql.so loadable module. After these shared objects have been
created, they are defined as handlers with the CREATE LANGUAGE command.

The specific steps to create a valid handler object are beyond the scope of this book, but the
basic or general steps would be as follows:

1. Compile the shared-object handler from C (for example, plfoobar.so).

2. Create a function that defines this object. The return type must be set as OPAQUE for
this function. For instance:

CREATE FUNCTION plfoobar_handler() RETURNS OPAQUE AS


'/usr/local/pgsql/lib/plfoobar.so' LANGUAGE 'C';

3. Define a handler that routes a language request for this object to the previously created
function. For instance:

CREATE TRUSTED PROCEDURAL LANGUAGE 'plfoobar'


HANDLER plfoobar_handler
LANCOMPILER 'PL/FooBar';

After a language has been defined, functions and stored procedures can be created with it.
Currently, PostgreSQL supports PL/pgSQL, PL/Tcl, and PL/Perl. For more information on
creating procedural language functions, refer to Chapter 11, "Server-Side Programming."

Compiled Functions

Compiled functions are shared objects that have been registered with the database through
the use of the CREATE FUNCTION command. Creating custom compiled functions is more
complex than creating scripted functions, but they do offer a tremendous benefit in execution
speed.

Creating successful C functions requires that the PostgreSQL and C data types can be
exchanged correctly. Table 14.2 lists the PostgreSQL data type, the corresponding C data type,
and the C header file where it is defined.

Table 14.2. Corresponding PostgreSQL Data Types, C Data Types, and C Header Files

PostgreSQL Data Type   C Data Type          C Header File
abstime                AbsoluteTime         utils/nabstime.h
bool                   bool                 include/c.h
box                    (BOX *)              utils/geo-decls.h
bytea                  (bytea *)            include/postgres.h
char                   char                 N/A
cid                    CID                  include/postgres.h
datetime               (DateTime *)         include/c.h or include/postgres.h
float4                 (float4 *)           include/c.h or include/postgres.h
float8                 (float8 *)           include/c.h or include/postgres.h
int2                   int2 or int16        include/postgres.h
int2vector             (int2vector *)       include/postgres.h
int4                   int4 or int32        include/postgres.h
lseg                   (LSEG *)             include/geo-decls.h
name                   (Name)               include/postgres.h
oid                    oid                  include/postgres.h
oidvector              (oidvector *)        include/postgres.h
path                   (PATH *)             utils/geo-decls.h
point                  (POINT *)            utils/geo-decls.h
regproc                regproc or REGPROC   include/postgres.h
reltime                RelativeTime         utils/nabstime.h
text                   (text *)             include/postgres.h
tid                    ItemPointer          storage/itemptr.h
timespan               (TimeSpan *)         include/c.h or include/postgres.h
tinterval              TimeInterval         utils/nabstime.h
xid                    (XID *)              include/postgres.h

Data is passed to the compiled function internally in one of three ways:

pass-by-value

pass-by-reference (fixed length)

pass-by-reference (variable length)

Generally, data that is passed by value must either be 1, 2, or 4 bytes in length (although
some architectures can support 8 bytes as well). Fixed-length or variable-length calls can be
made with any size data types.
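The distinction can be illustrated with Python's struct module: a pass-by-value int4 is the 4-byte value itself, while a variable-length type such as text is passed by reference to a structure that begins with a 4-byte length word counting the header plus the data. This is a simplified sketch of the classic varlena layout; the exact header details vary by PostgreSQL version:

```python
import struct

# Pass-by-value int4: the datum is the 4-byte value itself.
int4_datum = struct.pack("i", 12345)
print(len(int4_datum))          # 4

# Variable-length datum: a 4-byte length word (counting itself) plus payload.
def make_varlena(payload: bytes) -> bytes:
    return struct.pack("i", 4 + len(payload)) + payload

datum = make_varlena(b"hello")
length, = struct.unpack_from("i", datum)
print(length, datum[4:])        # 9 b'hello'
```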

Calling a C-Based Function

Two separate conventions exist regarding how C-based functions are to be interfaced:

Version-0. This method is the original, but it has now been deprecated. Although this
method was fairly simple to use, functions written with it encountered portability
problems when moved across architectures.

Version-1. This is the newest interface convention. It overcomes many of the shortfalls
of Version-0 calling. It achieves this by relying on macros to encapsulate the passing of
arguments, thereby making the resultant code much more portable.

Because Version-0 calling is now deprecated, the following examples demonstrate some
simple Version-1 functions. (For more information on Version-0 calling, refer to the PostgreSQL
Programmer's Guide at www.postgresql.org.)

Version-1-compliant functions should begin with two macros: PG_FUNCTION_INFO_V1 and
PG_FUNCTION_ARGS. The following is a simple pass-by-value example:

/* Program Name: add_it.c
   Description: Adds two int32 numbers */

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(add_it);

Datum add_it(PG_FUNCTION_ARGS)
{
int32 arg1 = PG_GETARG_INT32(0);
int32 arg2 = PG_GETARG_INT32(1);

PG_RETURN_INT32(arg1 + arg2);
}

After this has been compiled into a shared object, it can be defined and utilized as follows:

CREATE FUNCTION add_it(int4, int4) RETURNS int4 AS
'/usr/local/pgsql/lib/add_it.so'
LANGUAGE 'C';

>SELECT add_it(4, 8) AS Answer;

Answer
------
12

In addition to handling simple pass-by-value transfers, composite objects, such as row objects,
can be passed to and manipulated by C functions. For instance, this example defines a function
named islegal(), which returns TRUE or FALSE depending on whether the employee is 21
or over:

/* Program Name: islegal.c
   Description: Determines if an employee is legal age */

#include "postgres.h"
#include "executor/executor.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(islegal);

Datum
islegal(PG_FUNCTION_ARGS)
{
    /* Get the current table row, assign to pointer t */
    TupleTableSlot *t = (TupleTableSlot *) PG_GETARG_POINTER(0);

    /* Declare variables needed */
    int32 emp_age;
    bool isnull;

    /* Get the 'age' attribute from the row; this function is defined
       in executor.h */
    emp_age = DatumGetInt32(GetAttributeByName(t, "age", &isnull));

    /* If not a valid result, return NULL */
    if (isnull)
    {
        PG_RETURN_NULL();
    }

    /* Return the age comparison */
    PG_RETURN_BOOL(emp_age > 20);
}

After this function is compiled to a shared object, it can be defined and used within
PostgreSQL. For instance:

CREATE FUNCTION islegal(payroll) RETURNS bool AS


'/usr/local/pgsql/lib/islegal.so'
LANGUAGE 'C';

>SELECT islegal(payroll) FROM payroll WHERE name='Barry';

islegal
------
t

Coding Tips and Tricks

The following is a list of tips and pointers garnered from the PostgreSQL Programmer's Guide
(for more information on this guide, visit www.postgresql.org):

Include files are installed under /usr/local/pgsql/include or equivalent.

Use the Postgres routines palloc and pfree instead of the standard C functions
malloc and free when allocating memory. Memory reserved with palloc will
automatically be freed for each transaction, thus preventing memory leaks.

Always zero the bytes of your structures using memset or bzero. Even if you initialize
all fields of your structure, there might be several bytes of alignment padding (holes in
the structure) that contain garbage values.

Usually, programs will always require at least postgres.h and fmgr.h to be included.
The internal Postgres types are declared in postgres.h, and function manager
interfaces (PG_FUNCTION_ARGS and so on) are in fmgr.h. For portability reasons, it's
best to include postgres.h first before any other system or user header files.

Symbol names defined within object files must not conflict with each other or with
symbols defined in the PostgreSQL server executable. You will have to rename your
functions or variables if you get error messages to this effect.
Extending Types

PostgreSQL has a plethora of built-in data types (see Chapter 2, "PostgreSQL Data
Types"). However, in specific cases, it might be advantageous to create custom-
defined data types.

All the data types in PostgreSQL can be defined as belonging to one of the following
cases: base types or composites.

Base types, like int4, are written in C and are compiled into the system. However,
custom data types can be compiled as shared objects and linked to the back end by
using the CREATE TYPE command.

Composite types are created whenever a new table is created. At first it might seem
counterintuitive to think of a table as a type. However, tables are merely collections
of single data types grouped in a specific order. In that way, a table can be seen as
just a "composite," or complex collection, of simpler single-element data types.

Creating Data Types

To create a custom base type, two functions must be defined: an input function
and an output function.

The input function is responsible for accepting a null-terminated character string and
returning a value in the internal representation.

The output function accesses the internal representation of the data element and
returns it as the original null-terminated character string.

The PostgreSQL 7.1 Programmer's Guide contains a good example of how a custom
data type could be created.

First you must define the structure of your complex data type:

typedef struct Complex {


double x;
double y;
} Complex;

Next, the input and output functions must be specified:

Complex *
complex_in(char *str)
{
double x, y;
Complex *result;
if (sscanf(str, " ( %lf, %lf )", &x, &y) != 2) {
elog(ERROR, "complex_in: error in parsing %s", str);
return NULL;
}
result = (Complex *)palloc(sizeof(Complex));
result->x = x;
result->y = y;
return (result);
}

char *
complex_out(Complex *complex)
{
char *result;
if (complex == NULL)
return(NULL);
result = (char *) palloc(60);
sprintf(result, "(%g,%g)", complex->x, complex->y);
return(result);
}

Care should be taken to ensure that the input and output functions are the
reciprocal of each other. If not, data that is dumped out (that is, copied to a file) will
not be able to be read back in.

After the preceding code has been compiled into a shared object, the corresponding SQL
functions must be created to register them with the database:

CREATE FUNCTION complex_in(opaque)


RETURNS complex
AS 'PGROOT/tutorial/obj/complex.so'
LANGUAGE 'c';

CREATE FUNCTION complex_out(opaque)


RETURNS opaque
AS 'PGROOT/tutorial/obj/complex.so'
LANGUAGE 'c';

Lastly, the CREATE TYPE command is used to define the characteristics of the
newly created custom base type:

CREATE TYPE complex (


internallength = 16,
input = complex_in,
output = complex_out);
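The reciprocity requirement noted above can be sketched in Python: the input and output routines must invert each other, so that a value dumped with complex_out is re-read identically by complex_in. This mirrors the C example; it is not PostgreSQL code itself:

```python
import re

def complex_in(s):
    """Parse '(x,y)' into a pair of floats, like the C input function."""
    m = re.fullmatch(r"\s*\(\s*([^,\s]+)\s*,\s*([^)\s]+)\s*\)\s*", s)
    if m is None:
        raise ValueError("complex_in: error in parsing %s" % s)
    return (float(m.group(1)), float(m.group(2)))

def complex_out(c):
    """Format a pair of floats back to '(x,y)', like the C output function."""
    return "(%g,%g)" % c

value = complex_in("( 1.5, -2.25 )")
assert complex_in(complex_out(value)) == value   # dump and reload round-trips
```

If the two functions were not reciprocal, data copied out of the database could not be restored by COPY IN.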
Extending Operators

PostgreSQL uses operators as the method by which data comparisons or
aggregations are done. There are three general classes of PostgreSQL operators:
left unary, right unary, and binary.

Binary operators are perhaps the most common. In essence, an operator is binary
when it sits between two separate data elements (for example, 21 > 20). A classic
example of a binary operator is the greater-than symbol (>); it sits between two
data elements and returns a Boolean value from the evaluation of each element.
Even more basic is the addition operator (+), which sums the values on each side
and returns a result (for example, 2 + 3 returns 5).

Unary operators accept data on only one side, hence the names left unary and right
unary. An example of a right-unary operator is the factorial operator (!); it sits to
the right of an integer and returns the factorial result (for example, 4!).

Operators must be defined for the specific data types they are required to act on.
For instance, the > operator performs different actions depending on whether
integers or geometric elements are being evaluated. Because of that, it is necessary
to explicitly specify the data types that a custom operator is designated to
operate on.

Defining a Custom Operator

Before an operator can be defined, the underlying function must first be created.
These functions either can be defined as procedural functions (for example, SQL,
PL/pgSQL, and so on) or can link to a compiled C object file.

In this example, a function is created that accepts two integers and adds them.
If the sum is greater than 100, a TRUE value is returned; otherwise, it
returns FALSE. A simple PL/pgSQL function is created to perform this action, as follows:

CREATE FUNCTION addhund (int4, int4) RETURNS boolean AS '
BEGIN
    IF ($2 + $1) > 100 THEN
        RETURN ''t'';
    END IF;
    RETURN ''f'';
END;
' LANGUAGE 'plpgsql' WITH (iscachable);

You can then test this function directly:

SELECT addhund(99,99) AS answer;

answer
------
t
SELECT addhund(9,9) AS answer;

answer
------
f

Next, this function is bound to a specific operator character through the use of the
CREATE OPERATOR command:

CREATE OPERATOR +++ (
    leftarg = int4,
    rightarg = int4,
    procedure = addhund,
    commutator = +++);

The preceding command specifies that it is a binary operator that expects int4
data types on both the left and right sides. Additionally, it specifies that the
COMMUTATOR optimization for this operator is itself.

This new operator can be tested as follows:

SELECT 11 +++ 90 AS answer;

answer
------
t

SELECT 9 +++ 90 AS answer;

answer
------
f

Optimization Notes

Operator optimization pertains to giving the database clues as to how the various
operators relate to each other. There are several optimization settings that can be
specified upon operator creation.

COMMUTATOR

In a previous example, you defined the COMMUTATOR optimization for your +++
operator to be itself. Generally, a COMMUTATOR specification only makes sense for
binary operators. It describes what relation should hold if the data on each side of
the operator were switched. For instance, consider the following results from the
standard addition operator:

3 + 8 = 11

8 + 3 = 11

You can see that the addition operator is commutative with itself. It doesn't
matter which side each individual data element is on; the results will be the
same. Contrast this with how the subtraction operator works:

3 – 8 = –5

8 – 3 = 5

In this case, the position of the data elements does make a difference. Therefore,
subtraction is not commutative with itself.

NEGATOR

Another clause that can be specified during operator creation is what, if anything,
negates the current definition. For instance, the equal operator is negated by the
not-equal operator (for example, a = b is negated by a <> b).
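As a sketch (the === and !== operator names are invented here; int4eq and
int4ne are the built-in int4 equality and inequality functions), a pair of
operators can declare each other as negators:

```sql
-- Hypothetical equality/inequality pair on int4.
-- The forward reference to !== in the first command is
-- resolved when the second operator is created.
CREATE OPERATOR === (
    leftarg = int4,
    rightarg = int4,
    procedure = int4eq,
    negator = !==);

CREATE OPERATOR !== (
    leftarg = int4,
    rightarg = int4,
    procedure = int4ne,
    negator = ===);
```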

RESTRICT

The RESTRICT optimization clause is only valid for binary operators that return a
Boolean result (for example, a > b). Restriction provides hints to the query
optimizer about the fraction of rows likely to satisfy a general WHERE
clause. The standard estimators are as follows:

Estimator      Description                        Used For
eqsel          Equal-to selectivity               =
neqsel         Not-equal-to selectivity           <>
scalarltsel    Scalar less-than selectivity       < or <=
scalargtsel    Scalar greater-than selectivity    > or >=
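A Boolean operator declares its restriction estimator directly in the CREATE
OPERATOR call. The following is a sketch (the #= name is invented; int4eq is
the built-in int4 equality function):

```sql
-- Hypothetical: an equality-style operator with a restriction
-- estimator, so the planner can estimate how many rows a
-- "col #= constant" clause will select.
CREATE OPERATOR #= (
    leftarg = int4,
    rightarg = int4,
    procedure = int4eq,
    restrict = eqsel);
```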

JOIN
JOIN optimization is generally only valid for binary operators that return Boolean
results (for example, a = b). The JOIN estimator provides insight into how many
rows would match between a pair of tables joined with a general WHERE clause
(for example, payroll.empid = employee.empid).

The possible values that can be specified for an estimation clause are shown in Table
14.3.

Table 14.3. Values for an Estimation Clause

Estimator        Description                        Used For
eqjoinsel        Equal-to selectivity               =
neqjoinsel       Not-equal-to selectivity           <>
scalarltjoinsel  Scalar less-than selectivity       < or <=
scalargtjoinsel  Scalar greater-than selectivity    > or >=
areajoinsel      2D area comparisons                N/A
positionjoinsel  2D position comparisons            N/A
contjoinsel      2D containment comparisons         N/A
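As a hedged sketch (the <<< name is invented; int4lt is the built-in int4
less-than function), a join estimator is supplied alongside the restriction
estimator in the same CREATE OPERATOR call:

```sql
-- Hypothetical: a less-than-style operator advertising both
-- restriction and join selectivity to the planner.
CREATE OPERATOR <<< (
    leftarg = int4,
    rightarg = int4,
    procedure = int4lt,
    restrict = scalarltsel,
    join = scalarltjoinsel);
```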

HASHES

If the HASHES clause is present, the optimizer is instructed that it is permissible to
attempt hash joins on this operator. HASHES is only valid for binary operators that
return Boolean results.

In general, this only makes sense when the operator represents absolute equality
between the data types (for example, a = b). If the operator does not provide an
equality comparison between its operands, hash joins would be of little use.
SORT1 and SORT2

This set of clauses instructs the optimizer whether it is permissible to attempt
merge joins on either the left or right side of the operator.

Use of these optimization options is very limited. In practice, they are usually only
valid for the equal (=) operator. Moreover, the two referenced sort operators should
always be named <.

The CREATE OPERATOR command does not perform any sanity checks to determine
the validity of optimization options. Therefore, the command might successfully
create the specified operator, but it might still fail on use. In fact, using the
SORT1/SORT2 optimization options will cause failure if either of the following
conditions is not met:

The merge join equality operator must have a commutator (should be itself if
the two data types are the same).

There must be < and > operators that have the same data types as the
specified sort operator.
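Putting these clauses together, a fully optimized equality operator might be
declared as follows. This is a sketch: the &&== name is invented, int4eq is the
built-in int4 equality function, and the SORT1/SORT2 requirements are satisfied
because < and > already exist for int4:

```sql
-- Hypothetical merge- and hash-joinable equality operator.
CREATE OPERATOR &&== (
    leftarg = int4,
    rightarg = int4,
    procedure = int4eq,
    commutator = &&==,   -- equality commutes with itself
    restrict = eqsel,
    join = eqjoinsel,
    hashes,              -- hash joins permitted
    sort1 = <,           -- merge-join sort operator, left side
    sort2 = <);          -- merge-join sort operator, right side
```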
Part V: Appendices

A Additional Resources

B PostgreSQL Version Information


Appendix A. Additional Resources
A sizable collection of resources is available for PostgreSQL. Mailing lists,
web sites, and books cover much of the material here in greater or lesser
detail.
PostgreSQL versus Other RDBMSs

One of the first questions asked by users new to RDBMSs is, "Which one is best?"
That question is nearly impossible to answer without a full understanding of the
database's required functionality.

Comparing databases is like comparing vehicles. Each type of vehicle is suited to a
particular task; motorcycles would be preferable to pickup trucks in some
situations and would be disastrous in others. Likewise, the required functionality of
an RDBMS must be understood before the right fit can be established.

Rather than trying to compare apples to oranges, the following sections give a brief
listing of the popular RDBMSs currently available, along with their strong and weak
points and their typical uses.

PostgreSQL

Pros:

Large support and development community.

Many commercial support options (Great Bridge, Red Hat, and others).

Fully open sourced.

No royalty or license fees.

Has a very robust feature set.

Supports a large number of internal procedural languages that can create
stored procedures, triggers, rules, and functions.

Has a wide array of API access solutions, including ODBC, JDBC, C, Perl, PHP,
and Python.

Fully transactional.

Scales very well with additional users.

Supports database versioning (MVCC) as a concurrency control mechanism.

Fully ACID-compliant database (Atomicity, Consistency, Isolation, and Durability).

Many web tools exist for interfacing with PostgreSQL.

Supports foreign keys.


Online backup and restores possible.

Cons:

No replication in current version (7.1).

Not multithreaded (problematic in NT environments).

Only a handful of GUI administration tools currently available.

Typical uses:

When you examine the features, development tools, reliability, and
performance of PostgreSQL, it's hard to find fault with it. As a midrange
database, PostgreSQL is impossible to beat in many situations. Although it is
still an open-source project, a number of commercial entities now offer
professional support.

The only possible concern that exists when evaluating PostgreSQL is the
specific environment where it will operate. Although an NT version of the
database is available, it tends to run better in a UNIX-style environment.

MySQL

Pros:

Large support and development community.

Very prevalent as a web back end.

Open sourced and therefore can be customized.

No royalty or license fees.

Performs basic SELECTs very quickly.

Small footprint needed to run.

Runs on many different operating systems.

Interfaces with many popular web-development tools.

Cons:
Doesn't support subselects; complex joins are limited.

Doesn't natively support transactions.

No atomicity with regard to locking.

No support for stored procedures.

No triggers.

No foreign keys.

Typical uses:

MySQL makes an excellent choice for serving dynamic web pages, particularly
if there is no need for transactions or complex queries. Additionally, MySQL is
very straightforward to install, configure, and administrate.

Because MySQL is so lean, it can outperform many other RDBMSs in terms of
raw SELECT, INSERT, or UPDATE speed. However, because MySQL lacks real
transactional support and row-level locking, this speed will degrade if
multiple INSERTs or UPDATEs are performed simultaneously.

Generally, MySQL is ideally suited to serve as an engine for small to mid-size
databases, particularly if a high number of INSERT and UPDATE queries will
not run concurrently.

Microsoft SQL Server

Pros:

GUI interface makes installation and administration easy.

Lots of training and support options available.

Has good support for transactions and atomicity.

Integrates extremely well with other Microsoft Office applications.

Has support for replication and clustering.

Cons:

Only able to run on Microsoft operating systems.


Proprietary software—cannot be customized.

Cost for large installations can be a factor.

Typical uses:

Overall, MS-SQL can be an effective database engine for small to mid-size
solutions. It is particularly useful for sites constrained to use only Microsoft
operating systems and applications. If a high degree of customization or
heterogeneous support is needed, other options will be more effective.

Interbase

Pros:

Newly open sourced.

Supports many features: transactions, triggers, user-defined functions, and so on.

No royalty or license fees.

Runs on Windows, Linux, and Solaris.

GUI interface makes installation and configuration easy on Windows platforms.

Interfaces well with Delphi and PowerBuilder.

Commercial support from Borland and others.

Some impressive enterprise features.

Cons:

Newly open sourced.

Lacks some of the more advanced SQL statements (such as CASE, NULLIF,
and COALESCE).

Functions can only be written in C.

Linux version fairly new.

Somewhat fractured development community.


Typical uses:

Interbase is a full-featured RDBMS that used to be a proprietary product and
has now been open sourced. Although this is seen as a plus with regard to
furthering development and tightening security, there are disadvantages. A
coherent project has yet to fully develop around the newly opened code. As a
result, the development community is slightly fractured.

Interbase interfaces well with many popular development languages.
Additionally, it has been in development for a long time and consequently
possesses almost all of the desired features of a modern RDBMS.

DB2

Pros:

Very scalable, including clusters.

Possesses very advanced replication features.

Support for multiterabyte databases.

Support for all features in a modern RDBMS.

Years of product refinement.

Supports multiple platforms, including small embedded-style environments.

7x24 support options available.

Plethora of training and user groups available.

Cons:

Proprietary software.

License cost can be significant.

Configuration can be overly complex for simple jobs.

Typical uses:

Used by enterprises for large, complex, involved projects that need a full-featured
RDBMS. Proper installation and configuration can be complex. As a result, DB2
may not make much sense for small to mid-range database solutions. However,
as a back end to a massive database system, DB2 cannot be beat.

Oracle

Pros:

Very scalable, including clusters.

Support for massively parallel architectures.

Support for multiterabyte databases.

Support for all features in a modern RDBMS.

Years of product refinement.

Supports multiple platforms, including small embedded-style environments.

7x24 support options available.

Plethora of training and user groups available.

Cons:

Proprietary software.

License cost can be significant.

Configuration can be overly complex for simple jobs.

Typical uses:

Oracle is often considered to be the flagship commercial RDBMS available.
Oracle has spent a tremendous amount of time and money in developing a
product that can almost guarantee 100% uptime reliability. Unfortunately,
those advanced features cost a significant amount of money to implement
correctly.

Typically, proper installation and configuration can be complex. As a result,
Oracle may not make much sense for small to midrange database solutions.
However, as a back end to a massive database system, Oracle cannot be beat.
Online PostgreSQL Resources

There are a number of online resources that assist in the installation,
administration, and development of PostgreSQL.

Web Sites

A full collection of mirrors can be found at www.postgresql.org. There are also a
number of commercial support sites.

Mirror Sites

Australia

postgresql.planetmirror.com

Canada

www.ca.postgresql.org/index.html

Germany

postgresql.bnv-bamberg.de

Italy

www.postgresql.uli.it

Russia

postgresql.rinet.ru

United States

postgresql.readysetnet.com

Commercial Support Sites

Great Bridge (www.greatbridge.com/). Commercial products, services, and support
for PostgreSQL.

PostgreSQL, Inc. (www.pgsql.com/). Support for PostgreSQL, database hosting,
and promotional materials.

Software Research Associates (osb.sra.co.jp/). Open-source software support: a
range of services to help customers develop open-source software-based systems
since April 1999.

Cybertec Geschwinde & Schönig OEG (postpress.cybertec.at/) in Vienna,
Austria. Offers training courses, support, consulting, cost-effective high-end
systems, and high-availability solutions in the whole German-speaking region
(Austria, Germany, Switzerland).

dbExperts (www.dbexperts.com.br) in Brazil. Offers training courses, specialized
support for development, and commercial products for PostgreSQL in the
Portuguese language.

Applinet (www.applinet.nl/). Offers PostgreSQL consulting services in the
Netherlands.

Command Prompt, Inc. (www.commandprompt.com/). Offers Linux-managed
services and PostgreSQL support. Located in the Pacific Northwest, Command
Prompt, Inc., specializes in Linux and PostgreSQL support, including custom
programming featuring PostgreSQL, C++, PHP, and Perl.

Mailing Lists

The PostgreSQL user and development community has a very active set of mailing
lists. The procedure for subscribing to any of the following lists is as follows:

1. Send a message to <groupname>[email protected] (where <groupname> is the
   name of one of the following groups, such as pgsql-[email protected]).

2. Put the word "subscribe" or "unsubscribe" in the body of the message.

3. Optionally, include the phrase "set digest" to consolidate multiple messages
   into one daily email instead of numerous individual emails.

4. Optionally, set the phrase "set nomail" in the message body. This will stop the
   flow of email but still keep you subscribed. (This is useful for the following
   newsgroup option.)

The following mailing lists are currently active:

[email protected]

Covers the topic of PostgreSQL administration and related issues.

[email protected]

Announcement group for third-party and related items.

[email protected]

Mailing list to report or check the status of a found bug.

[email protected]

Group dealing with running PostgreSQL on Windows machines using cygwin.


[email protected]

General discussion area. Does not cover installation, compilation, or bugs.
Generally, this is the most active list.

[email protected]

Mailing list for developers or those interested in doing customizations to the
PostgreSQL code base.

[email protected]

List to discuss external APIs to the PostgreSQL back end. (Note: There are
separate lists for the ODBC and JDBC interfaces.)

[email protected]

Used to discuss the external JDBC Java interface.

[email protected]

Used to discuss the external ODBC interface.

[email protected]

Discusses using PostgreSQL and PHP.

[email protected]

Discussion area for aspects of the SQL language.

Newsgroups

You can subscribe to many newsgroups from the PostgreSQL news server
(news://news.postgresql.org). Although anyone can read these groups, you must be
subscribed to one of the preceding mailing lists to post.

comp.databases.postgresql.admin

comp.databases.postgresql.announce

comp.databases.postgresql.bugs

comp.databases.postgresql.committers

comp.databases.postgresql.docs

comp.databases.postgresql.general
comp.databases.postgresql.hackers

comp.databases.postgresql.hackers.fmgr

comp.databases.postgresql.hackers.oo

comp.databases.postgresql.hackers.smgr

comp.databases.postgresql.hackers.wal

comp.databases.postgresql.interfaces

comp.databases.postgresql.interfaces.jdbc

comp.databases.postgresql.interfaces.odbc

comp.databases.postgresql.interfaces.php

comp.databases.postgresql.mirrors

comp.databases.postgresql.novice
comp.databases.postgresql.patches

comp.databases.postgresql.ports

comp.databases.postgresql.ports.cygwin

comp.databases.postgresql.questions

comp.databases.postgresql.sql

FTP Sites

The collection of mirrored FTP sites is the primary way to get source and binary
packages that relate to PostgreSQL. The main web site, www.postgresql.org, lists a
collection of addresses. Here are the more popular sites:

Australia

ftp.planetmirror.com/pub/postgresql

Canada

ftp.jack-of-all-trades.net/www.postgresql.org
looking-glass.usask.ca/pub/postgresql

postgresql.wavefire.com

Germany

ftp.leo.org/pub/comp/os/unix/database/postgresql

ftp-stud.fht-esslingen.de/pub/Mirrors/ftp.postgresql.org

Italy

ftp.postgresql.uli.it

postgresql.theomnistore.com/mirror/postgresql

bo.mirror.garr.it/mirrors/postgres

Japan

ring.asahi-net.or.jp/pub/misc/db/postgresql

Russia

ftp.chg.ru/pub/databases/postgresql

postgresql.rinet.ru

United Kingdom

postgresql.rmplc.co.uk/pub/postgresql

United States

postgresql.readysetnet.com/pub/postgresql

download.sourceforge.net/pub/mirrors/postgresql

ftp.digex.net/pub/packages/database/postgresql

ftp.crimelabs.net/pub/postgresql
Books

Momjian, Bruce. PostgreSQL: Introduction and Concepts. Reading, MA:
Addison-Wesley, 2000.

Lockhart, Thomas. PostgreSQL Programmer's Guide. New York: iUniverse.com, 2000.

Matthew, Neil, et al. Professional Linux Programming. Chicago: Wrox Press, Inc.,
2000.
Appendix B. PostgreSQL Version Information
PostgreSQL is under constant development, and in the last few years, a slew of new
features have been added. The following is a brief listing of the major changes/bug
fixes that each new version has implemented. For a more comprehensive listing,
look in the ChangeLog file (usually located in /usr/local/pgsql/ChangeLogs).
Version 7.1.2 (Released May 2001)

Fixed error in PL/pgSQL SELECTs when returning no rows.

Fixed psql backslash causing core dump.

Fixed referential integrity permission.

pg_dump cleanups and fixes.


Version 7.1.1 (Released May 2001)

pg_dump can operate on 7.0 databases.

EXTRACT can now accept string arguments.

JOIN fixes.

ODBC fixes.

Python fixes.

Whole tuple in function fixes.

AIX, MSWIN, VAX, and N32K fixes.


Version 7.1 (Released April 2001)

WAL (Write-Ahead Logging) implemented. Improves database consistency and
reduces data corruption resulting from a system crash.

TOAST (The Oversized-Attribute Storage Technique) implemented. Enables rows of
any size to be stored in tables; removed the previous fixed row-length limit.

Outer joins implemented.

Fixes implemented for running on 64-bit CPUs.

Optimizations to query engine.

Inherited tables now accessed by default on query of parent.


Version 7.0.3 (Released November 2000)

Large object fixes.

SELECT FOR UPDATE fix.

Added --with-syslog configure option.

Allowed PL/pgSQL to accept non-ASCII identifiers.

Forced VACUUM to always flush buffers.

Fixed != problems in queries.

Fix implemented to stop database restart on write-error.

Fixed TIME aggregate handling.

Fix implemented for inserting long multibyte strings into type CHAR.
Version 7.0.2 (Released June 2000)

Fixed many of the CLUSTER failures.

ALTER TABLE RENAME now works on indexes.

Fixed PL/pgSQL to handle interval and timestamps conversions.

Fixed create user function in pgaccess.

IRIX and QNX fixes.

JDBC result set fixes.

Fixed UNLISTEN failure.

Fixed improper recovery after RENAME TABLE failure.


Version 7.0 (Released May 2000)

Implemented FOREIGN KEYS (except PARTIAL MATCH FKs).

Major overhaul to query optimizer.

Psql client application added many new features.

Date-time data types now SQL-92–compliant.

Removed fixed-length limit of query strings.

Maximum number of keys in an index increased to 16 from 8.

Sorts and hashes fixed to work on greater than 2GB of data.


Version 6.5.2 (Released September 1999)

Fixed subselect and CASE bugs.

Fixed CASE in WHERE join clauses.

Repaired the check for redundant UNIQUE and PRIMARY KEY indices.

Improved referential integrity; it checks for multicolumn constraints.

Fixed Win32 MAKE problem with MB enabled.

Fixed and reduced VACUUM memory consumption.

Fixed timestamp (date, time).

Fixed unary operators in rule deparser.

Updated version of pgaccess 0.98.


Version 6.5.1 (Released July 1999)

Portability fixes for linux_ppc, Irix, alpha, and OpenBSD.

Deprecated QUERY_LIMIT; use SELECT…LIMIT.

Fixed EXPLAIN on inheritance.

Patch added to allow VACUUM on multisegment tables.

Fixed R-Tree optimizer selectivity.

Fixed ACL file descriptor leak.

Avoid disk writes for read-only transactions.

Fix implemented for removal of temp tables if last transaction was aborted.

Fix implemented to prevent too-large tuples from being created.

PL/pgSQL bug fixes.

Allowed port numbers 32KB–64KB for connections.

Added ^ precedence.

Fixed microseconds in time values.

New linux_m68k port.

Fixed sorting of NULL s in some cases.

Fixed shared library dependencies.

Fixed glitches affecting GROUP BY in subselects.


Version 6.5 (Released June 1999)

Multiversion Concurrency Control (MVCC)—Removed older table-level locking
mechanisms.

Hot backups from pg_dump—Capability to back up/restore while the database is
operational.

Numeric data type—Added a new NUMERIC data type.

Temporary tables—Temporary tables are guaranteed to have unique names within a
database session and are destroyed on session exit.

Ports—Expanded ports list, including WinNT/ix86 and NetBSD/arm32.

New SQL features—Added CASE, INTERSECT, and EXCEPT statement support.
There is a new LIMIT/OFFSET, SET TRANSACTION ISOLATION LEVEL, and
SELECT … FOR UPDATE, and an improved LOCK TABLE command.

Added vacuumdb utility.

EXPLAIN now shows all indices used.

New pg_dump table output format.

Added string min()/max() functions.

Update to pgaccess 0.96 (Constantin).

Improved substr() function.

Improved multibyte handling (Tatsuo).

New SERIALIZED transaction mode.

Fixed tables over 2GB.

New SET TRANSACTION ISOLATION LEVEL.

New LOCK TABLE IN … MODE.

Updated ODBC driver.

New SELECT FOR UPDATE.

New TCL_ARRAYS option (Massimo).

New INTERSECT and EXCEPT (Stefan).

New READ COMMITTED isolation level.


New TEMP tables/indexes.

Allowed multiple rule actions.

New routines to convert between int8 and text/varchar types.

Enabled right-hand queries by default.

Added new Postgres -O option to allow system table structure changes.

Support for arrays of char() and varchar() fields.

UNION now supports ORDER BY of columns not in target list.

INET type now respects netmask for comparisons.

Allowed VIEWs on UNIONs.


Version 6.4.1 (Released December 1998)

Added pg_dump -N flag to force double quotes around identifiers. This is the
default.

Fixed NOT in WHERE clause causing crash.

EXPLAIN VERBOSE core dump fix.

Fixed test for table existence to allow mixed-case and whitespace in the table name.

Changed built-in function names from SPI_* to spi_*.

Updated pgaccess to 0.93.

Time zone fixes.

Used implicit type coercion for matching DEFAULT values.


Version 6.4 (Released October 1998)

Views and rules are now functional thanks to extensive new code in the rewrite rules
system from Jan Wieck. He also wrote a chapter on it for the Programmer's Guide.

Second procedural language, PL/pgSQL, to go with the original PL/pgTCL procedural
language.

Optional multiple-byte character set support.

The parser will now perform automatic type coercion to match arguments to
available operators and functions and to match columns and expressions with target
columns. This uses a generic mechanism that supports the type extensibility
features of Postgres. There is a new chapter in the User's Guide that covers this
topic.

Three new data types have been added. Two types, inet and cidr, support
various forms of IP network, subnet, and machine addressing. There is now an 8-
byte integer type available on some platforms. A fourth type, serial, is now
supported by the parser as an amalgam of the int4 type, a sequence, and a unique
index.

Several more SQL-92–compatible syntax features have been added, including
INSERT DEFAULT VALUES.

Show the index used in an EXPLAIN.

EXPLAIN invokes rule system and shows plan(s) for rewritten queries.

Multibyte awareness of many data types and functions, via configure.

New configure --with-mb option.

New initdb --pgencoding option.

New createdb -E multibyte option.

Libpq now allows asynchronous clients.

Allowed cancel from client of back-end query.

NOTIFY now sends sender's PID so you can tell whether it was your own.

Added routines to convert between varchar and bpchar.

Added routines to allow sizing of varchar and bpchar into target columns.

Added bit flags to support timezonehour and minute in data retrieval.

Implemented TIMEZONE_HOUR, TIMEZONE_MINUTE per SQL-92 specs.


Check for and properly ignore FOREIGN KEY column constraints.

New psql command "SET CLIENT_ENCODING TO 'encoding'"; for the multibyte
feature, see /doc/README.mb.

Libpq can now be compiled on Win32.

Better support for quoted table/column names.

Surrounded table and column names with double quotes in pg_dump.

Allowed UNION in subselects.

Added HAVING clause with full support for subselects and unions.

Support for SQL-92 syntax "SET NAMES".

Support for LATIN2-5.

Allowed index use with OR clauses.

EXPLAIN VERBOSE can pretty-print the plan to the postmaster log file.

Allowed GROUP BY on functions.

New rewrite system fixes many problems with rules and views.

System indexes are now multikey.

Removed oidint2, oidint4, and oidname types.

New SERIAL data type; autocreates sequence/index.

New UNLISTEN command.

Createuser options now available on the command line.

Code for 64-bit integer (int8) added and tested.

New pg_upgrade command.

New CREATE TABLE DEFAULT VALUES statement available.

New INSERT INTO TABLE DEFAULT VALUES statement available.

New DECLARE and FETCH feature.

Allowed up to eight key indexes.

Removed ARCHIVE keyword that is no longer used.

New SET QUERY_LIMIT.


Version 6.3 (Released March 1998)

Subselects with EXISTS, IN, ALL, and ANY keywords (Vadim, Bruce, and Thomas).

Added SQL-92 "constants" CURRENT_DATE, CURRENT_TIME,
CURRENT_TIMESTAMP, and CURRENT_USER.

Modified constraint syntax to be SQL-92–compliant.

Implemented SQL-92 PRIMARY KEY and UNIQUE clauses using indices.

Recognized SQL-92 syntax for FOREIGN KEY.

Allowed NOT NULL UNIQUE constraint clause (each allowed separately before).

Allowed Postgres-style casting (::) of nonconstants.

Added support for SQL3 TRUE and FALSE Boolean constants.

Support SQL-92 syntax for IS TRUE/IS FALSE/IS NOT TRUE/IS NOT FALSE.

Allowed shorter strings for Boolean literals (for example, t, tr, tru).

Allowed SQL-92 delimited identifiers.

Implemented SQL-92 binary and hexadecimal string decoding (b'10' and x'1F').

Supported SQL-92 syntax for type coercion of literal strings (for example,
"DATETIME 'now'").

Added conversions for int2, int4, and OID types to and from text.

New SQL statement CREATE PROCEDURAL LANGUAGE.

New Postgres Procedural Language (PL) back-end interface.

Used indices for LIKE and ~, !~ operations.

Added hash functions for datetime and timespan.

Added UNIX domain socket support to back-end and front-end library.

Implemented CREATE DATABASE/WITH LOCATION and initlocation utility.

SET/SHOW/RESET TIME ZONE used TZ back-end environment variable.

Implemented SET keyword = DEFAULT and SET TIME ZONE DEFAULT.

Increased 16-character limit on system table/index names to 32 characters.

Renamed system indices.


Added 'GERMAN' option to SET DATESTYLE.

Defined an ISO-style time-span output format with hh:mm:ss fields.

Implemented day of year as possible input to date_part().

Defined timespan_finite() and text_timespan() functions.

Allowed for a pg_password authentication database that was separate from the
system password file.

Dumped ACLs, GRANT, and REVOKE permissions.

Defined text, varchar, and bpchar string-length functions.

Fixed query handling for inheritance and cost computations.

Implemented CREATE TABLE/AS SELECT (alternative to SELECT/INTO).

Allowed NOT, IS NULL, and IS NOT NULL in constraints.

Implemented UNIONs for SELECT.

Added UNION, GROUP, and DISTINCT to INSERT.

Large patch for JDBC.

New LOCK command and lock manual page describing deadlocks.

Added new psql \da, \dd, \df, \do, \dS, and \dT commands.

Showed NOT NULL and DEFAULT in psql \d table.

New types for IP and MAC addresses in contrib/ip_and_mac.

New python interface (PyGreSQL 2.0).

New front-end/back-end protocol has a version number and network byte order.

Security features in pg_hba.conf enhanced and documented; many cleanups.

ecpg -embedded SQL preprocessor.


Version 6.2.1 (Released October 1997)

Added JDBC driver as an interface.

Added pg_password utility.

Return number of tuples inserted/affected by INSERT/UPDATE/DELETE and so on.

Triggers implemented with CREATE TRIGGER (SQL3).

SPI (server programming interface) allowed execution of queries inside C functions.

NOT NULL implemented per SQL-92 standard.

Implement extended comments (/* … */) using exclusive states.

Added // single-line comments.

Implemented DEFAULT and CONSTRAINT for tables (SQL-92).

Added text concatenation operator and function (SQL-92).

Support WITH TIME ZONE syntax (SQL-92).

Support INTERVAL unit to unit syntax (SQL-92).

Defined types DOUBLE PRECISION, INTERVAL, CHARACTER, and CHARACTER
VARYING (SQL-92).

Defined type FLOAT(p) and rudimentary DECIMAL(p,s), NUMERIC(p,s) (SQL-92).

Defined EXTRACT(), POSITION(), SUBSTRING(), and TRIM() (SQL-92).

Defined CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP (SQL-92).

Added syntax and warnings for UNION, HAVING, INNER, and OUTER JOIN (SQL-92).

Allowed hh:mm:ss time entry for timespan/reltime types.

Added center() routines for lseg, path, and polygon.

Added distance() routines for circle-polygon and polygon-polygon.

Added routine to convert circle-box.

Replaced distance operator <===> with <->.

Replaced above operator !^ with >^ and below operator !| with <^.

Added routines for text trimming on both ends, substring, and string position.
Added conversion routines circle(box) and poly(circle).

Allowed internal sorts to be stored in memory rather than in files.

General trigger functions for referential integrity.

MOVE implementation.
Version 6.2 (Released October 1997)

Added UNIQUE index capability.

Added hostname/user level access control rather than just hostname and user.

Added synonym of != for <>.

Allowed select oid,* from table.

Allowed BY and ORDER BY to specify columns by number or by nonalias
table.column.

Allowed COPY from the front end.

Allowed GROUP BY to use alias column name.

Allowed restriction on who can create C functions.

Changed default decimal constant representation from float4 to float8.

European date format now set when postmaster is started.

Executed lowercase function names if not found with exact case.

identd authentication of local users.

Implemented BETWEEN qualifier.

Implemented IN qualifier.

Pg_dump allowed dump of oids.

Pg_dumpall dumped all databases and the user table.

Prevented postmaster from being run as root.

Psql allowed backslashes and semicolons anywhere on the line.

Secured authentication of local users.

Vacuum now has VERBOSE option.


Version 6.1 (Released June 1997)

BTREE UNIQUE added to bulk load code.

Massive changes to libpg++ (Leo).

New GEQO optimizer speeds multitable optimization.

New WARN message for nonunique insert into unique key.

New plaintext password functions.

New ANSI timestamp function.

New ANSI time and date types.

Multicolumn B-Tree indexes.

New SET var TO value command.

New locale settings for character types.

New SEQUENCE serial number generator.

GROUP BY function now possible.

New MONEY data type.

New VACUUM option for attribute statistics and for certain columns.

New SET, SHOW, and RESET commands.

New \connect database USER option.

New destroydb -i option.

New \dt and \di psql commands.

SELECT \n now escapes newline.


Version Postgres95 0.01 (Released May 1995)

Initial release.
