PostgreSQL
Database Systems Course
S.Shayan Daneshvar 27.1 , 27.2, 27.3, 27.4 ---- 14.1, 14.2, 14.8
Postgres
PostgreSQL is an open-source object-relational database management system. It is
a descendant of one of the earliest such systems, the POSTGRES system developed
under Professor Michael Stonebraker at the University of California, Berkeley.
The name “postgres” is derived from the name of a pioneering relational database
system, Ingres, also developed under Stonebraker at Berkeley.
Currently, PostgreSQL supports many aspects of SQL:2003 and offers features such
as complex queries, foreign keys, triggers, views, transactional integrity, full-text
searching, and limited data replication.
In addition, users can extend PostgreSQL with new data types, functions, operators,
or index methods. PostgreSQL supports a variety of programming languages
(including C, C++, Java, Perl, Tcl, and Python) as well as the database interfaces
JDBC and ODBC.
Introduction
In the course of two decades, PostgreSQL has undergone several major releases.
The first prototype system, under the name POSTGRES, was demonstrated at the
1988 ACM SIGMOD conference. The first version, distributed to users in 1989,
provided features such as extensible data types, a preliminary rule system, and a
query language named POSTQUEL. But who cares about what it was?
...
Today, PostgreSQL is used to implement several different research and production
applications (such as the PostGIS system for geographic information) and as an
educational tool at several universities. The system continues to evolve through the
contributions of a community of about 1000 developers. In this chapter, we explain
how PostgreSQL works, starting from user interfaces and languages and continuing
into the heart of the system (the data structures and the concurrency control
mechanism).
User Interfaces
• Interactive Terminal Interfaces
The main interactive terminal client is psql, which is modeled after the Unix shell
and allows execution of SQL commands on the server, as well as several other
operations.
• Graphical Interfaces
  • Administration Tools: pgAccess, pgAdmin
  • Database Design: TORA, Data Architect
Interactive Terminal Interface (psql)
• Variables: psql provides variable substitution features, similar to common Unix
command shells.
• SQL interpolation: The user can substitute (“interpolate”) psql variables into
regular SQL statements by placing a colon in front of the variable name.
• Command-line editing: psql uses the GNU readline library for convenient line
editing, with tab-completion support.
Programming Language Interfaces
PostgreSQL provides native interfaces for ODBC and JDBC, as well as bindings for
most programming languages, including C, C++, PHP, Perl, Tcl/Tk, ECPG, Python,
and Ruby.
The libpq library provides the C API for PostgreSQL; libpq is also the underlying
engine for most programming-language bindings. The libpq library supports both
synchronous and asynchronous execution of SQL commands and prepared
statements, through a reentrant and thread-safe interface.
SQL Variations & Extensions
The current version of PostgreSQL supports almost all entry-level SQL-92 features,
as well as many of the intermediate- and full-level features.
It also supports many SQL:1999 and SQL:2003 features, including most object-
relational features described in Chapter 22 and the SQL/XML features for parsed
XML data described in Chapter 23.
In fact, some features of the current SQL standard (such as arrays, functions, and
inheritance) were pioneered by PostgreSQL or its ancestors.
It lacks OLAP features (most notably, cube and rollup), but data from PostgreSQL
can be easily loaded into open-source external OLAP servers (such as Mondrian) as
well as commercial products.
SQL Variations and Extensions (Types)
PostgreSQL has support for several nonstandard types, useful for specific
application domains. Furthermore, users can define new types with the create type
command. This includes new low-level base types, typically written in C.
The PostgreSQL Type System:
• Base Types
• Composite types
• Domains
• Enumerated types
• Pseudotypes
• Polymorphic types
Base Types
Base types are also known as abstract data types; that is, modules that encapsulate
both state and a set of operations. These are implemented below the SQL level,
typically in a language such as C.
Examples are int4 (already included in PostgreSQL) or complex (included as an
optional extension type).
A base type may represent either an individual scalar value or a variable-length
array of values. For each scalar type that exists in a database, PostgreSQL
automatically creates an array type that holds values of the same scalar type.
Composite types & Enumerated types
• Composite Types:
These correspond to table rows; that is, they are a list of field names and their
respective types. A composite type is created implicitly whenever a table is created,
but users may also construct them explicitly.
• Enumerated Types:
These are similar to enum types used in programming languages such as C and Java.
An enumerated type is essentially a fixed list of named values. In PostgreSQL,
enumerated types may be converted to the textual representation of their name,
but this conversion must be specified explicitly in some cases to ensure type safety.
For instance, values of different enumerated types may not be compared without
explicit conversion to compatible types.
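The type-safety rule for enumerated types can be pictured with a small Python model. This is only a sketch of the behaviour described above, not PostgreSQL's implementation: the class names `EnumType` and `EnumValue`, the method `as_text`, and the sample types `shirt_size` and `mood` are all hypothetical. Values of one enumerated type compare by the position of their label in the declaration, values of different enumerated types refuse to compare, and an explicit conversion to text makes them comparable again.

```python
class EnumType:
    """A named, ordered list of labels (toy model of an enumerated type)."""

    def __init__(self, name, labels):
        self.name = name
        self.labels = tuple(labels)  # declaration order is kept

    def value(self, label):
        if label not in self.labels:
            raise ValueError(f"{label!r} is not a label of {self.name}")
        return EnumValue(self, label)


class EnumValue:
    """One value of an EnumType."""

    def __init__(self, enum_type, label):
        self.enum_type = enum_type
        self.label = label

    def __lt__(self, other):
        # Values of different enumerated types may not be compared directly.
        if not isinstance(other, EnumValue) or other.enum_type is not self.enum_type:
            raise TypeError("cannot compare values of different enumerated types")
        labels = self.enum_type.labels
        return labels.index(self.label) < labels.index(other.label)

    def as_text(self):
        # Explicit conversion to the textual representation of the name.
        return self.label


if __name__ == "__main__":
    size = EnumType("shirt_size", ["small", "medium", "large"])
    mood = EnumType("mood", ["sad", "ok", "happy"])
    print(size.value("small") < size.value("large"))
    # Comparing across types only works after explicit conversion to text.
    print(size.value("small").as_text() < mood.value("sad").as_text())
```

Comparing `size.value("small")` with `mood.value("sad")` directly raises `TypeError` in this model, which mirrors the need for an explicit conversion.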
Domains & Pseudo-types
• Domains:
A domain type is defined by coupling a base type with a constraint that values of the
type must satisfy. Values of the domain type and the associated base type may be
used interchangeably, provided that the constraint is satisfied. A domain may also
have an optional default value, whose meaning is similar to the default value of a
table column.
• Pseudotypes:
Currently, PostgreSQL supports the following pseudotypes: any, anyarray,
anyelement, anyenum, anynonarray, cstring, internal, opaque, language_handler,
record, trigger, and void. These cannot be used in composite types (and thus cannot
be used for table columns), but can be used as argument and return types of user-
defined functions.
Polymorphic types
Four of the pseudotypes (anyelement, anyarray, anynonarray, and anyenum) are
collectively known as polymorphic. Functions with arguments of these types
(correspondingly called polymorphic functions) may operate on any actual type.
PostgreSQL has a simple type-resolution scheme that requires that:
1. in any particular invocation of a polymorphic function, all occurrences of a
polymorphic type must be bound to the same actual type (that is, a function
defined as f(anyelement, anyelement) may operate only on pairs of the same
actual type), and
2. if the return type is polymorphic, then at least one of the arguments must be of
the same polymorphic type.
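The two rules above can be sketched as a small checker in Python. It is a deliberately simplified model that covers only what the two rules state: the function names `check_signature` and `resolve_call` are hypothetical, types are plain strings, and the additional relationships PostgreSQL enforces between different polymorphic types are not modelled.

```python
POLYMORPHIC = {"anyelement", "anyarray", "anynonarray", "anyenum"}


def check_signature(arg_types, return_type):
    """Rule 2: a polymorphic return type must also appear among the arguments."""
    if return_type in POLYMORPHIC and return_type not in arg_types:
        raise TypeError(f"return type {return_type} is not used by any argument")


def resolve_call(declared_args, actual_args):
    """Rule 1: all occurrences of a polymorphic type bind to one actual type."""
    if len(declared_args) != len(actual_args):
        raise TypeError("wrong number of arguments")
    bindings = {}
    for declared, actual in zip(declared_args, actual_args):
        if declared in POLYMORPHIC:
            bound = bindings.setdefault(declared, actual)
            if bound != actual:
                raise TypeError(f"{declared} bound to both {bound} and {actual}")
        elif declared != actual:
            raise TypeError(f"expected {declared}, got {actual}")
    return bindings


if __name__ == "__main__":
    # f(anyelement, anyelement) accepts pairs of the same actual type only.
    print(resolve_call(["anyelement", "anyelement"], ["int4", "int4"]))
```

Calling `resolve_call(["anyelement", "anyelement"], ["int4", "text"])` raises `TypeError` in this sketch, matching the f(anyelement, anyelement) example in rule 1.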
Nonstandard Types
Thanks to the open nature of PostgreSQL, there are several contributed extension
types, such as complex numbers, and ISBN/ISSNs.
• Geometric data types (point, line, lseg, box, polygon, path, circle) are used in
geographic information systems to represent two-dimensional spatial objects such
as points, line segments, polygons, paths, and circles. Numerous functions and
operators are available in PostgreSQL to perform various geometric operations
such as scaling, translation, rotation, and determining intersections. Furthermore,
PostgreSQL supports indexing of these types using R-trees.
• Full-text searching is performed in PostgreSQL using the tsvector type that
represents a document and the tsquery type that represents a full-text query. A
tsvector stores the distinct words in a document, after converting variants of each
word to a common normal form. PostgreSQL provides functions to convert raw text
to a tsvector and concatenate documents. A tsquery specifies words to search for in
candidate documents, with multiple words connected by Boolean operators.
PostgreSQL natively supports operations on full-text types, including language
features and indexed search.
Nonstandard Types…
• PostgreSQL offers data types to store network addresses (cidr, inet, and macaddr),
covering IPv4 and IPv6 addresses with subnet masks as well as MAC addresses. These
data types allow network-management applications to use a PostgreSQL database as
their data store. Because these types offer input-error checking, they are preferable
to plain text fields.
• The PostgreSQL bit type can store both fixed- and variable-length strings of 1s and
0s. PostgreSQL supports bit-logical operators and string-manipulation functions for
these values.
Rules and Active-Database Features
PostgreSQL supports SQL constraints and triggers (and stored procedures).
Furthermore, it features query-rewriting rules that can be declared on the server.
PostgreSQL allows check constraints, not null constraints, and primary-key and
foreign-key constraints (with restricting and cascading deletes).
Like many other relational database systems, PostgreSQL supports triggers, which
are useful for nontrivial constraints and consistency checking or enforcement.
Trigger functions can be written in a procedural language such as PL/pgSQL or in C,
but not in plain SQL.
Triggers can execute before or after insert, update, or delete operations and either
once per modified row, or once per SQL statement.
Rules
The PostgreSQL rules system allows users to define query-rewrite rules on the
database server. Unlike stored procedures and triggers, the rule system intervenes
between the query parser and the planner and modifies queries on the basis of the
set of rules.
After the original query tree has been transformed into one or more trees, they are
passed to the query planner. Thus, the planner has all the necessary information
(tables to be scanned, relationships between them, qualifications, join information,
and so forth) and can come up with an efficient execution plan, even when complex
rules are involved.
Extensibility
Like most relational database systems, PostgreSQL stores information about
databases, tables, columns, and so forth, in what are commonly known as system
catalogs, which appear to the user as normal tables. Other relational database
systems are typically extended by changing hard-coded procedures in the source
code or by loading special extension modules written by the vendor.
Unlike most relational database systems, PostgreSQL goes one step further and
stores much more information in its catalogs: not only information about tables and
columns, but also information about data types, functions, access methods, and so
on. Therefore, PostgreSQL is easy for users to extend and facilitates rapid
prototyping of new applications and storage structures. PostgreSQL can also
incorporate user-written code into the server, through dynamic loading of shared
objects. This provides an alternative approach to writing extensions that can be
used when catalog-based extensions are not sufficient.
Creating Types
PostgreSQL allows users to define composite types, enumeration types, and even
new base types. A composite-type definition is similar to a table definition (in fact,
the latter implicitly does the former).
The order of the names listed in an enum definition is significant when comparing
values of an enumerated type. This can be useful in statements that compare or sort
by enumerated values.
Create Base Types
Functions
PostgreSQL allows users to define functions that are stored and executed on the
server. PostgreSQL also supports function overloading. Functions can be written as
plain SQL statements, or in several procedural languages. Finally, PostgreSQL has
an application programmer interface for adding functions written in C.
Index Extensions
PostgreSQL currently supports the usual B-tree and hash indices, as well as two
index methods that are unique to PostgreSQL: the Generalized Search Tree (GiST)
and the Generalized Inverted Index (GIN), which is useful for full-text indexing.
Finally, PostgreSQL provides indexing of two-dimensional spatial objects with an R-
tree index, which is implemented using a GiST index behind the scenes.
Adding index extensions for a type requires definition of an operator class, which
encapsulates the following:
Index-method strategies: These are a set of operators that can be used as qualifiers
in where clauses. The particular set depends on the index type. For example, B-tree
indices can retrieve ranges of objects, so the set consists of five operators (<, <=, =,
>=, and >), all of which can appear in a where clause involving a B-tree index. A hash
index allows only equality testing and an R-tree index allows a number of spatial
relationships (for example contained, to-the-left, and so forth).
Index Extensions…
Index-method support routines: The above set of operators is typically not sufficient
for the operation of the index. For example, a hash index requires a function to
compute the hash value for each object. An R-tree index needs to be able to
compute intersections and unions and to estimate the size of indexed objects.
For example, if the following functions and operators are defined to compare the
magnitude of complex numbers, then we can make such objects indexable by the
following declaration:
The operator statements define the strategy methods and the function statements
define the support methods.
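The split between strategy operators and support routines can be illustrated with a toy Python model. It is a sketch under stated assumptions rather than PostgreSQL's operator-class machinery: a sorted list stands in for the B-tree, and the names `complex_abs_cmp` and `MagnitudeIndex` are hypothetical. The comparison function plays the role of the support routine that orders complex numbers by magnitude, and `search` answers the five B-tree strategy operators.

```python
import bisect
from functools import cmp_to_key


def complex_abs_cmp(a, b):
    """Support routine: three-way comparison of two complex numbers by magnitude."""
    return (abs(a) > abs(b)) - (abs(a) < abs(b))


class MagnitudeIndex:
    """A sorted list standing in for a B-tree over complex values."""

    def __init__(self, values):
        # The support routine decides the order in which entries are stored.
        self._entries = sorted(values, key=cmp_to_key(complex_abs_cmp))
        self._keys = [abs(v) for v in self._entries]

    def search(self, op, probe):
        """Strategy operators: return entries e for which |e| op |probe| holds."""
        key = abs(probe)
        lo = bisect.bisect_left(self._keys, key)
        hi = bisect.bisect_right(self._keys, key)
        end = len(self._entries)
        spans = {
            "<": (0, lo),
            "<=": (0, hi),
            "=": (lo, hi),
            ">=": (lo, end),
            ">": (hi, end),
        }
        start, stop = spans[op]
        return self._entries[start:stop]


if __name__ == "__main__":
    index = MagnitudeIndex([3 + 4j, 1 + 0j, 2j, 6 + 8j])
    print(index.search("<", 5 + 0j))
    print(index.search(">=", 4 + 3j))
```

Each strategy maps to a contiguous slice of the sorted entries, which is why a single ordering routine is enough to serve all five range operators in this model.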
Procedural Languages
• PL/pgSQL: This is a trusted language that adds procedural programming
capabilities (for example, variables and control flow) to SQL. It is very similar to
Oracle’s PL/SQL. Although code cannot be transferred verbatim from one to the
other, porting is usually simple.
• PL/Tcl, PL/Perl, and PL/Python: These leverage the power of Tcl, Perl, and Python
to write stored functions and procedures on the server. The first two come in both
trusted and untrusted versions (PL/Tcl, PL/Perl and PL/TclU, PL/PerlU,
respectively), while PL/Python is untrusted at the time of this writing. Each of
these has bindings that allow access to the database system via a language-
specific interface.
Server Programming Interface
The server programming interface (SPI) is an application programmer interface that
allows user-defined C functions to run arbitrary SQL commands inside their
functions. This gives writers of user-defined functions the ability to implement only
essential parts in C and easily leverage the full power of the relational database
system engine to do most of the work.
Transaction Management
Transaction management in PostgreSQL uses both snapshot isolation and two-
phase locking. Which one of the two protocols is used depends on the type of
statement being executed.
For DML statements the snapshot isolation technique is used; the snapshot
isolation scheme is referred to as the multi-version concurrency control (MVCC)
scheme in PostgreSQL.
Concurrency control for DDL statements, on the other hand, is based on standard
two-phase locking.
Remember SQL Commands In General
Transactions? (Atomicity)
Collections of operations that form a single logical unit of work are called
transactions. A database system must ensure proper execution of transactions
despite failures—either the entire transaction executes, or none of it does.
Furthermore, it must manage concurrent execution of transactions in a way that
avoids the introduction of inconsistency.
A transaction is delimited by statements (or function calls) of the form begin
transaction and end transaction. The transaction consists of all operations executed
between the begin transaction and end transaction. This collection of steps must
appear to the user as a single, indivisible unit. Since a transaction is indivisible, it
either executes in its entirety or not at all. Thus, if a transaction begins to execute
but fails for whatever reason, any changes to the database that the transaction may
have made must be undone.
This “all-or-none” property is referred to as atomicity.
Transactions? (Isolation & Durability)
Also, since a transaction is a single unit, its actions cannot appear to be separated by
other database operations not part of the transaction. While we wish to present this
user-level impression of transactions, we know that reality is quite different. Even a
single SQL statement involves many separate accesses to the database, and a
transaction may consist of several SQL statements. Therefore, the database system
must take special actions to ensure that transactions operate properly without
interference from concurrently executing database statements. This property is
referred to as isolation.
Even if the system ensures correct execution of a transaction, this serves little
purpose if the system subsequently crashes and, as a result, the system “forgets”
about the transaction. Thus, a transaction’s actions must persist across crashes. This
property is referred to as durability.
Transactions? … (Consistency)
Because of the above three properties, transactions are an ideal way of structuring
interaction with a database. This leads us to impose a requirement on transactions
themselves. A transaction must preserve database consistency: if a transaction is
run atomically in isolation starting from a consistent database, the database must
again be consistent at the end of the transaction. This consistency requirement goes
beyond the data integrity constraints such as primary-key constraints, referential
integrity, check constraints, and the like.
Rather, transactions are expected to go beyond that to ensure preservation of those
application-dependent consistency constraints that are too complex to state using
the SQL constructs for data integrity. How this is done is the responsibility of the
programmer who codes a transaction. This property is referred to as consistency.
ACID
• Atomicity: Either all operations of the transaction are reflected properly in the
database, or none are.
• Consistency: Execution of a transaction in isolation (that is, with no other
transaction executing concurrently) preserves the consistency of the database.
• Isolation: Even though multiple transactions may execute concurrently, the
system guarantees that, for every pair of transactions Ti and Tj, it appears to Ti that
either Tj finished execution before Ti started or Tj started execution after Ti finished.
Thus, each transaction is unaware of other transactions executing concurrently in
the system.
• Durability: After a transaction completes successfully, the changes it has made to
the database persist, even if there are system failures.
Transactions – An Example
Consistency: The consistency requirement here is that the sum of A and B be
unchanged by the execution of the transaction. Without the consistency
requirement, money could be created or destroyed by the transaction!
Atomicity: Suppose that, just before the execution of transaction Ti, the values of
accounts A and B are $1000 and $2000, respectively. Now suppose that, during the
execution of transaction Ti, a failure occurs that prevents Ti from completing its
execution successfully. Further, suppose that the failure happened after the write(A)
operation but before the write(B) operation. In this case, the values of accounts A
and B reflected in the database are $950 and $2000. The system destroyed $50 as a
result of this failure. In particular, we note that the sum A + B is no longer preserved.
Thus, because of the failure, the state of the system no longer reflects a real state of
the world that the database is supposed to capture. We term such a state an
inconsistent state.
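The failure scenario above can be reproduced in a few lines of Python. This is a minimal sketch, assuming that Ti transfers $50 from account A to account B (as the $1000 to $950 figures suggest) and using Python's built-in sqlite3 module as a stand-in database, since it needs no server; the table name `account` and the helpers `open_accounts`, `transfer`, and `balances` are hypothetical. A simulated failure between write(A) and write(B) is followed by a rollback, so the partial update never becomes visible.

```python
import sqlite3


def open_accounts():
    """Create an in-memory database with A = 1000 and B = 2000."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 1000), ("B", 2000)])
    conn.commit()
    return conn


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


def transfer(conn, amount, fail_midway=False):
    """Move `amount` from A to B as one transaction; return True on commit."""
    try:
        # write(A)
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,))
        if fail_midway:
            raise RuntimeError("simulated failure between write(A) and write(B)")
        # write(B)
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,))
        conn.commit()
        return True
    except RuntimeError:
        conn.rollback()  # undo write(A) so that the sum A + B is preserved
        return False


if __name__ == "__main__":
    conn = open_accounts()
    transfer(conn, 50, fail_midway=True)
    print(balances(conn))  # balances after the rolled-back attempt
    transfer(conn, 50)
    print(balances(conn))  # balances after the successful transfer
```

Without the rollback, the database would be left with A reduced and B unchanged, which is exactly the inconsistent state described above.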
Transactions – An Example …
Durability: Once the execution of the transaction completes successfully, and the
user who initiated the transaction has been notified that the transfer of funds has
taken place, it must be the case that no system failure can result in a loss of data
corresponding to this transfer of funds. The durability property guarantees that,
once a transaction completes successfully, all the updates that it carried out on the
database persist, even if there is a system failure after the transaction completes
execution.
Isolation: Even if the consistency and atomicity properties are ensured for each
transaction, if several transactions are executed concurrently, their operations may
interleave in some undesirable way, resulting in an inconsistent state.
Abstract Transaction Model
• Active, the initial state; the transaction stays in this state while it is executing.
• Partially committed, after the final statement has been executed.
• Failed, after the discovery that normal execution can no longer proceed.
• Aborted, after the transaction has been rolled back and the database has been
restored to its state prior to the start of the transaction.
• Committed, after successful completion.
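The five states can be written down as a small state machine in Python. The transitions used here follow the usual textbook state diagram (the list above names the states but not the arrows), and the names `TRANSITIONS` and `Transaction` are hypothetical; treat it as a sketch of the model, not of any system's internals.

```python
# Allowed moves between the states of the abstract transaction model.
TRANSITIONS = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": set(),  # terminal
    "aborted": set(),    # terminal
}


class Transaction:
    def __init__(self):
        self.state = "active"  # the initial state

    def move_to(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition: {self.state} -> {new_state}")
        self.state = new_state


if __name__ == "__main__":
    t = Transaction()
    t.move_to("partially committed")  # final statement has been executed
    t.move_to("committed")            # successful completion
    print(t.state)
```

A transaction that has reached the failed state can only move to aborted in this model, which reflects the requirement that its effects be rolled back.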
Transaction Isolation Levels
Serializable: usually ensures serializable execution. However, some database systems
implement this isolation level in a manner that may, in certain cases, allow non-serializable
executions.
Repeatable read: allows only committed data to be read and further requires that, between
two reads of a data item by a transaction, no other transaction is allowed to update it.
However, the transaction may not be serializable with respect to other transactions. For
instance, when it is searching for data satisfying some conditions, a transaction may find
some of the data inserted by a committed transaction, but may not find other data inserted
by the same transaction.
Read committed: allows only committed data to be read, but does not require repeatable
reads. For instance, between two reads of a data item by the transaction, another
transaction may have updated the data item and committed.
Read uncommitted: allows uncommitted data to be read. It is the lowest isolation level
allowed by SQL.
All the isolation levels above additionally disallow dirty writes, that is, they disallow writes to
a data item that has already been written by another transaction that has not yet
committed or aborted.
Implementation of Isolation Levels
• Locking
Two-phase locking requires a transaction to have two phases, one where it acquires
locks but does not release any, and a second phase where the transaction releases
locks but does not acquire any.
• Timestamps
• Multiple Versions and Snapshot Isolation
By maintaining more than one version of a data item, it is possible to allow a
transaction to read an old version of a data item rather than a newer version written
by an uncommitted transaction or by a transaction that should come later in the
serialization order. There are a variety of multi-version concurrency control
techniques. One in particular, called snapshot isolation, is widely used in practice.
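The idea of reading an older version of a data item can be sketched with a toy versioned store in Python. This is a heavily simplified illustration, assuming that only committed versions are stored and that a snapshot is just the commit counter at the time it was taken; the class `VersionedStore` and its methods are hypothetical and do not describe how any particular system implements snapshot isolation.

```python
class VersionedStore:
    """Keeps every committed version of each key, tagged with a commit timestamp."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # timestamp of the latest commit

    def commit_write(self, key, value):
        """Install a new committed version and return its commit timestamp."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """A snapshot sees everything committed up to now."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest version committed at or before the snapshot."""
        visible = [value for ts, value in self._versions.get(key, []) if ts <= snapshot_ts]
        if not visible:
            raise KeyError(key)
        return visible[-1]


if __name__ == "__main__":
    store = VersionedStore()
    store.commit_write("x", 1)
    snap = store.snapshot()        # a reader starts here
    store.commit_write("x", 2)     # a later writer commits a newer version
    print(store.read("x", snap))   # the reader still sees its snapshot
    print(store.read("x", store.snapshot()))
```

Because the reader never waits for the writer in this model, reads and writes on the same item do not block each other, which is the main attraction of multi-version schemes.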
Back to PostgreSQL (Concurrency Control)
Since the concurrency control protocol used by PostgreSQL depends on the isolation level
requested by the application, we begin with an overview of the isolation levels offered by
PostgreSQL. We then describe the key ideas behind the MVCC scheme, followed by a
discussion of their implementation in PostgreSQL and some of the implications of MVCC.
The SQL standard defines three weak levels of consistency, in addition to the serializable
level of consistency.
The purpose of providing the weak consistency levels is to allow a higher degree of
concurrency for applications that don’t require the strong guarantees that serializability
provides. Examples of such applications include long-running transactions that collect
statistics over the database and whose results do not need to be precise. The SQL standard
defines the different isolation levels in terms of three phenomena that violate serializability.
The three phenomena are called dirty read, nonrepeatable read, and phantom read.
PostgreSQL Isolation Levels
• Dirty read: The transaction reads values written by another transaction that hasn’t
committed yet.
• Non-repeatable read: A transaction reads the same object twice during execution
and finds a different value the second time, although the transaction has not
changed the value in the meantime.
• Phantom read: A transaction re-executes a query returning a set of rows that
satisfy a search condition and finds that the set of rows satisfying the condition has
changed as a result of another recently committed transaction.
It should be obvious that each of the above phenomena violates transaction isolation, and hence violates
serializability.
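The relationship between the four SQL isolation levels and the three phenomena can be summarized as a lookup table. The Python sketch below records which phenomena the SQL standard permits at each level; the names `ALLOWED`, `LEVELS`, and `weakest_level_preventing` are hypothetical, and a given database system may be stricter than the standard requires at any level.

```python
# Phenomena the SQL standard permits at each isolation level.
ALLOWED = {
    "read uncommitted": {"dirty read", "nonrepeatable read", "phantom read"},
    "read committed": {"nonrepeatable read", "phantom read"},
    "repeatable read": {"phantom read"},
    "serializable": set(),
}

# Levels ordered from weakest to strongest.
LEVELS = ["read uncommitted", "read committed", "repeatable read", "serializable"]


def weakest_level_preventing(phenomena):
    """Return the weakest level at which none of the given phenomena may occur."""
    unwanted = set(phenomena)
    for level in LEVELS:
        if not (ALLOWED[level] & unwanted):
            return level
    return "serializable"  # not reached: serializable permits none of them


if __name__ == "__main__":
    print(weakest_level_preventing(["dirty read"]))
    print(weakest_level_preventing(["nonrepeatable read"]))
    print(weakest_level_preventing(["phantom read"]))
```

Reading the table top to bottom shows that each stronger level rules out one more phenomenon than the level before it.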
More on PostgreSQL
Database System Concepts, Chapter 27 (Parts 2.4, 2.5, …), PostgreSQL Case Study.

PostgreSQL - Case Study

  • 1. PostgreSQL Database Systems Course S.Shayan Daneshvar 27.1 , 27.2, 27.3, 27.4 ---- 14.1, 14.2, 14.8
  • 2. Postgres PostgreSQL is an open-source object-relational database management system. It is a descendant of one of the earliest such systems, the POSTGRES system developed under Professor Michael Stonebraker at the University of California, Berkeley. The name “postgres” is derived from the name of a pioneering relational database system, Ingres, also developed under Stonebraker at Berkeley. Currently, PostgreSQL supports many aspects of SQL:2003 and offers features such as complex queries, foreign keys, triggers, views, transactional integrity, full-text searching, and limited data replication. In addition, users can extend PostgreSQL with new data types, functions, operators, or index methods. PostgreSQL supports a variety of programming languages (including C, C++, Java, Perl, Tcl, and Python) as well as the database interfaces JDBC and ODBC.
  • 3. Introduction In the course of two decades, PostgreSQL has undergone several major releases. The first prototype system, under the name POSTGRES, was demonstrated at the 1988 ACM SIGMOD conference. The first version, distributed to users in 1989, provided features such as extensible data types, a preliminary rule system, and a query language named POSTQUEL. But who cares about what it was? ... Today, PostgreSQL is used to implement several different research and production applications (such as the PostGIS system for geographic information) and as an educational tool at several universities. The system continues to evolve through the contributions of a community of about 1000 developers. In this chapter, we explain how PostgreSQL works, starting from user interfaces and languages and continuing into the heart of the system (the data structures and the concurrency control mechanism).
  • 4. User Interfaces • Interactive Terminal Interfaces The main interactive terminal client is psql, which is modeled after the Unix shell and allows execution of SQL commands on the server, as well as several other operations. • Graphical Interfaces • Administration Tools: • pgAccess • pgAdmin • Database Design: • TORA • Data Architect
  • 5. Interactive Terminal Interface (psql) • Variables: psql provides variable substitution features, similar to common Unix command shells. • SQL interpolation: The user can substitute (“interpolate”) psql variables into regular SQL statements by placing a colon in front of the variable name. • Command-line editing: psql uses the GNU readline library for convenient line editing, with tab-completion support.
  • 6. Programming Language Interfaces PostgreSQL provides native interfaces for ODBC and JDBC, as well as bindings for most programming languages, including C, C++, PHP, Perl, Tcl/Tk, ECPG, Python, and Ruby. The libpq library provides the C API for PostgreSQL; libpq is also the underlying engine for most programming-language bindings. The libpq library supports both synchronous and asynchronous execution of SQL commands and prepared statements, through a reentrant and thread-safe interface.
  • 7. SQL Variations & Extensions The current version of PostgreSQL supports almost all entry-level SQL-92 features, as well as many of the intermediate- and full-level features. It also supports many SQL:1999 and SQL:2003 features, including most object-relational features described in Chapter 22 and the SQL/XML features for parsed XML data described in Chapter 23. In fact, some features of the current SQL standard (such as arrays, functions, and inheritance) were pioneered by PostgreSQL or its ancestors. It lacks OLAP features (most notably, cube and rollup), but data from PostgreSQL can be easily loaded into open-source external OLAP servers (such as Mondrian) as well as commercial products.
  • 8. SQL Variations and Extensions (Types) PostgreSQL has support for several nonstandard types, useful for specific application domains. Furthermore, users can define new types with the create type command. This includes new low-level base types, typically written in C! The PostgreSQL Type System: • Base types • Composite types • Domains • Enumerated types • Pseudotypes • Polymorphic types
  • 9. Base Types Base types are also known as abstract data types; that is, modules that encapsulate both state and a set of operations. These are implemented below the SQL level, typically in a language such as C. Examples are int4 (already included in PostgreSQL) or complex (included as an optional extension type). A base type may represent either an individual scalar value or a variable-length array of values. For each scalar type that exists in a database, PostgreSQL automatically creates an array type that holds values of the same scalar type.
  • 10. Composite types & Enumerated types • Composite Types: These correspond to table rows; that is, they are a list of field names and their respective types. A composite type is created implicitly whenever a table is created, but users may also construct them explicitly. • Enumerated Types: These are similar to enum types used in programming languages such as C and Java. An enumerated type is essentially a fixed list of named values. In PostgreSQL, enumerated types may be converted to the textual representation of their name, but this conversion must be specified explicitly in some cases to ensure type safety. For instance, values of different enumerated types may not be compared without explicit conversion to compatible types.
  • 11. Domains & Pseudo-types • Domains: A domain type is defined by coupling a base type with a constraint that values of the type must satisfy. Values of the domain type and the associated base type may be used interchangeably, provided that the constraint is satisfied. A domain may also have an optional default value, whose meaning is similar to the default value of a table column. • Pseudotypes: Currently, PostgreSQL supports the following pseudotypes: any, anyarray, anyelement, anyenum, anynonarray, cstring, internal, opaque, language_handler, record, trigger, and void. These cannot be used in composite types (and thus cannot be used for table columns), but can be used as argument and return types of user-defined functions.
  • 12. Polymorphic types Four of the pseudotypes (anyelement, anyarray, anynonarray, and anyenum) are collectively known as polymorphic. Functions with arguments of these types (correspondingly called polymorphic functions) may operate on any actual type. PostgreSQL has a simple type-resolution scheme that requires that: 1. in any particular invocation of a polymorphic function, all occurrences of a polymorphic type must be bound to the same actual type (that is, a function defined as f(anyelement, anyelement) may operate only on pairs of the same actual type), and … 2. if the return type is polymorphic, then at least one of the arguments must be of the same polymorphic type.
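The two resolution rules on this slide can be illustrated with a small Python model. This is only a sketch of the rules as stated above, not PostgreSQL's actual resolver; the function name `resolve_call` and the use of plain strings for type names are assumptions made for illustration.

```python
# Toy model of the two type-resolution rules; not PostgreSQL's resolver.
POLYMORPHIC = {"anyelement", "anyarray", "anynonarray", "anyenum"}


def resolve_call(param_types, return_type, actual_types):
    """Return the concrete return type of one call, or raise TypeError."""
    # Rule 2: a polymorphic return type must also appear among the arguments.
    if return_type in POLYMORPHIC and return_type not in param_types:
        raise TypeError("polymorphic return type needs a matching argument")
    if len(param_types) != len(actual_types):
        raise TypeError("wrong number of arguments")

    bindings = {}
    for declared, actual in zip(param_types, actual_types):
        if declared in POLYMORPHIC:
            # Rule 1: every occurrence binds to the same actual type.
            if bindings.setdefault(declared, actual) != actual:
                raise TypeError(f"{declared} bound to two different types")
        elif declared != actual:
            raise TypeError(f"expected {declared}, got {actual}")
    # A non-polymorphic return type is returned unchanged.
    return bindings.get(return_type, return_type)
```

For f(anyelement, anyelement) returning anyelement, a call with two int4 arguments resolves to int4, while a call mixing int4 and text is rejected.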
  • 13. Nonstandard Types Thanks to the open nature of PostgreSQL, there are several contributed extension types, such as complex numbers, and ISBN/ISSNs. • Geometric data types (point, line, lseg, box, polygon, path, circle) are used in geographic information systems to represent two-dimensional spatial objects such as points, line segments, polygons, paths, and circles. Numerous functions and operators are available in PostgreSQL to perform various geometric operations such as scaling, translation, rotation, and determining intersections. Furthermore, PostgreSQL supports indexing of these types using R-trees. • Full-text searching is performed in PostgreSQL using the tsvector type that represents a document and the tsquery type that represents a full-text query. A tsvector stores the distinct words in a document, after converting variants of each word to a common normal form. PostgreSQL provides functions to convert raw text to a tsvector and concatenate documents. A tsquery specifies words to search for in candidate documents, with multiple words connected by Boolean operators. PostgreSQL natively supports operations on full-text types, including language features and indexed search.
  • 14. Nonstandard Types… • PostgreSQL offers data types to store network addresses. These data types allow network-management applications to use a PostgreSQL database as their data store. These types offer input-error checking and are therefore preferable to plain text fields. The types are cidr, inet, and macaddr, covering IPv4 and IPv6 addresses with subnet masks and MAC addresses. • The PostgreSQL bit type can store both fixed- and variable-length strings of 1s and 0s. PostgreSQL supports bit-logical operators and string-manipulation functions for these values.
  • 15. Rules and Active-Database Features PostgreSQL supports SQL constraints and triggers (and stored procedures). Furthermore, it features query-rewriting rules that can be declared on the server. PostgreSQL allows check constraints, not null constraints, and primary-key and foreign-key constraints (with restricting and cascading deletes). Like many other relational database systems, PostgreSQL supports triggers, which are useful for nontrivial constraints and consistency checking or enforcement. Trigger functions can be written in a procedural language such as PL/pgSQL or in C, but not in plain SQL. Triggers can execute before or after insert, update, or delete operations and either once per modified row, or once per SQL statement.
  • 16. Rules The PostgreSQL rules system allows users to define query-rewrite rules on the database server. Unlike stored procedures and triggers, the rule system intervenes between the query parser and the planner and modifies queries on the basis of the set of rules. After the original query tree has been transformed into one or more trees, they are passed to the query planner. Thus, the planner has all the necessary information (tables to be scanned, relationships between them, qualifications, join information, and so forth) and can come up with an efficient execution plan, even when complex rules are involved.
  • 17. Extensibility Like most relational database systems, PostgreSQL stores information about databases, tables, columns, and so forth, in what are commonly known as system catalogs, which appear to the user as normal tables. Other relational database systems are typically extended by changing hard-coded procedures in the source code or by loading special extension modules written by the vendor. Unlike most relational database systems, PostgreSQL goes one step further and stores much more information in its catalogs: not only information about tables and columns, but also information about data types, functions, access methods, and so on. Therefore, PostgreSQL is easy for users to extend and facilitates rapid prototyping of new applications and storage structures. PostgreSQL can also incorporate user-written code into the server, through dynamic loading of shared objects. This provides an alternative approach to writing extensions that can be used when catalog-based extensions are not sufficient.
  • 18. Creating Types PostgreSQL allows users to define composite types, enumeration types, and even new base types. A composite-type definition is similar to a table definition (in fact, the latter implicitly does the former). The order of listed names in enum is significant in comparing values of an enumerated type. This can be useful for a statement such as:
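The statement from the original slide is not reproduced in this transcript. As a stand-in, the sketch below uses Python's `enum` module to show the same idea: values compare according to the order in which the names were listed. The type `Mood`, its members, and the sample rows are hypothetical.

```python
from enum import IntEnum

# Hypothetical enumerated type; members are numbered in declaration order,
# so "sad" < "ok" < "happy".
Mood = IntEnum("Mood", ["sad", "ok", "happy"])

# Hypothetical table rows: (person, current mood).
rows = [("alice", Mood.happy), ("bob", Mood.sad), ("carol", Mood.ok)]


def people_above(rows, threshold):
    """Names whose mood ranks strictly above threshold in declaration order."""
    return sorted(name for name, mood in rows if mood > threshold)
```

Here `people_above(rows, Mood.sad)` selects alice and carol, because both "ok" and "happy" were declared after "sad".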
  • 20. Functions PostgreSQL allows users to define functions that are stored and executed on the server. PostgreSQL also supports function overloading. Functions can be written as plain SQL statements, or in several procedural languages. Finally, PostgreSQL has an application programmer interface for adding functions written in C.
  • 21. Index Extensions PostgreSQL currently supports the usual B-tree and hash indices, as well as two index methods that are unique to PostgreSQL: the Generalized Search Tree (GiST) and the Generalized Inverted Index (GIN), which is useful for full-text indexing. Finally, PostgreSQL provides indexing of two-dimensional spatial objects with an R-tree index, which is implemented using a GiST index behind the scenes. Adding index extensions for a type requires definition of an operator class, which encapsulates the following: Index-method strategies: These are a set of operators that can be used as qualifiers in where clauses. The particular set depends on the index type. For example, B-tree indices can retrieve ranges of objects, so the set consists of five operators (<, <=, =, >=, and >), all of which can appear in a where clause involving a B-tree index. A hash index allows only equality testing and an R-tree index allows a number of spatial relationships (for example contained, to-the-left, and so forth).
  • 22. Index Extensions… Index-method support routines: The above set of operators is typically not sufficient for the operation of the index. For example, a hash index requires a function to compute the hash value for each object. An R-tree index needs to be able to compute intersections and unions and to estimate the size of indexed objects. For example, if the following functions and operators are defined to compare the magnitude of complex numbers, then we can make such objects indexable by the following declaration: The operator statements define the strategy methods and the function statements define the support methods.
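The declaration from the original slide is missing from this transcript. The split between strategies and support routines can still be sketched in Python: below, a sorted list stands in for a B-tree, the magnitude function `abs` plays the role of the support routine, and the lookup methods play the role of strategy operators. The class `MagnitudeIndex` and its method names are assumptions for illustration, not PostgreSQL APIs.

```python
import bisect


class MagnitudeIndex:
    """Sorted list of complex numbers ordered by magnitude (toy B-tree)."""

    def __init__(self, values):
        # Support routine: abs() gives the ordering key for each entry.
        self._entries = sorted(values, key=abs)
        self._keys = [abs(z) for z in self._entries]

    def less_equal(self, bound):
        """Strategy '<=': entries whose magnitude is at most |bound|."""
        hi = bisect.bisect_right(self._keys, abs(bound))
        return self._entries[:hi]

    def between(self, low, high):
        """Range scan: entries with |low| <= magnitude <= |high|."""
        lo = bisect.bisect_left(self._keys, abs(low))
        hi = bisect.bisect_right(self._keys, abs(high))
        return self._entries[lo:hi]
```

Because the keys are kept sorted, each lookup is a binary search followed by a slice, which is the property that makes range operators usable with an ordered index.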
  • 23. Procedural Languages • PL/pgSQL: This is a trusted language that adds procedural programming capabilities (for example, variables and control flow) to SQL. It is very similar to Oracle’s PL/SQL. Although code cannot be transferred verbatim from one to the other, porting is usually simple. • PL/Tcl, PL/Perl, and PL/Python: These leverage the power of Tcl, Perl, and Python to write stored functions and procedures on the server. The first two come in both trusted and untrusted versions (PL/Tcl, PL/Perl and PL/TclU, PL/PerlU, respectively), while PL/Python is untrusted at the time of this writing. Each of these has bindings that allow access to the database system via a language-specific interface.
  • 24. Server Programming Interface The server programming interface (SPI) is an application programmer interface that allows user-defined C functions to run arbitrary SQL commands inside their functions. This gives writers of user-defined functions the ability to implement only essential parts in C and easily leverage the full power of the relational database system engine to do most of the work.
  • 25. Transaction Management Transaction management in PostgreSQL uses both snapshot isolation and two-phase locking. Which one of the two protocols is used depends on the type of statement being executed. For DML statements the snapshot isolation technique is used; the snapshot isolation scheme is referred to as the multi-version concurrency control (MVCC) scheme in PostgreSQL. Concurrency control for DDL statements, on the other hand, is based on standard two-phase locking.
  • 26. Remember SQL Commands In General
  • 27. Transactions? (Atomicity) Collections of operations that form a single logical unit of work are called transactions. A database system must ensure proper execution of transactions despite failures: either the entire transaction executes, or none of it does. Furthermore, it must manage concurrent execution of transactions in a way that avoids the introduction of inconsistency. A transaction is delimited by statements (or function calls) of the form begin transaction and end transaction. The transaction consists of all operations executed between the begin transaction and end transaction. This collection of steps must appear to the user as a single, indivisible unit. Since a transaction is indivisible, it either executes in its entirety or not at all. Thus, if a transaction begins to execute but fails for whatever reason, any changes to the database that the transaction may have made must be undone. This “all-or-none” property is referred to as atomicity.
  • 28. Transactions? (Isolation & Durability) Also, since a transaction is a single unit, its actions cannot appear to be separated by other database operations not part of the transaction. While we wish to present this user-level impression of transactions, we know that reality is quite different. Even a single SQL statement involves many separate accesses to the database, and a transaction may consist of several SQL statements. Therefore, the database system must take special actions to ensure that transactions operate properly without interference from concurrently executing database statements. This property is referred to as isolation. Even if the system ensures correct execution of a transaction, this serves little purpose if the system subsequently crashes and, as a result, the system “forgets” about the transaction. Thus, a transaction’s actions must persist across crashes. This property is referred to as durability.
  • 29. Transactions? … (Consistency) Because of the above three properties, transactions are an ideal way of structuring interaction with a database. This leads us to impose a requirement on transactions themselves. A transaction must preserve database consistency: if a transaction is run atomically in isolation starting from a consistent database, the database must again be consistent at the end of the transaction. This consistency requirement goes beyond the data integrity constraints such as primary-key constraints, referential integrity, check constraints, and the like. Rather, transactions are expected to go beyond that to ensure preservation of those application-dependent consistency constraints that are too complex to state using the SQL constructs for data integrity. How this is done is the responsibility of the programmer who codes a transaction. This property is referred to as consistency.
  • 30. ACID • Atomicity: Either all operations of the transaction are reflected properly in the database, or none are. • Consistency: Execution of a transaction in isolation (that is, with no other transaction executing concurrently) preserves the consistency of the database. • Isolation: Even though multiple transactions may execute concurrently, the system guarantees that, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti started or Tj started execution after Ti finished. Thus, each transaction is unaware of other transactions executing concurrently in the system. • Durability: After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.
  • 31. Transactions – An Example Consistency: The consistency requirement here is that the sum of A and B be unchanged by the execution of the transaction. Without the consistency requirement, money could be created or destroyed by the transaction! Atomicity: Suppose that, just before the execution of transaction Ti, the values of accounts A and B are $1000 and $2000, respectively. Now suppose that, during the execution of transaction Ti, a failure occurs that prevents Ti from completing its execution successfully. Further, suppose that the failure happened after the write(A) operation but before the write(B) operation. In this case, the values of accounts A and B reflected in the database are $950 and $2000. The system destroyed $50 as a result of this failure. In particular, we note that the sum A + B is no longer preserved. Thus, because of the failure, the state of the system no longer reflects a real state of the world that the database is supposed to capture. We term such a state an inconsistent state.
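The failure scenario above (a crash after write(A) but before write(B)) can be tried out with Python's built-in `sqlite3` module. SQLite is used here only as a convenient stand-in for PostgreSQL; the table name `account`, the helper names, and the `fail_midway` switch are assumptions made for this sketch.

```python
import sqlite3


def make_bank():
    """In-memory database with accounts A = 1000 and B = 2000."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("A", 1000), ("B", 2000)])
    conn.commit()
    return conn


def balances(conn):
    """Current balances as a dict such as {'A': 1000, 'B': 2000}."""
    return dict(conn.execute("SELECT name, balance FROM account"))


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move amount from src to dst as a single atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if the body raises
            conn.execute("UPDATE account SET balance = balance - ? "
                         "WHERE name = ?", (amount, src))
            if fail_midway:
                # Simulated failure between write(A) and write(B).
                raise RuntimeError("simulated failure")
            conn.execute("UPDATE account SET balance = balance + ? "
                         "WHERE name = ?", (amount, dst))
    except RuntimeError:
        pass  # the debit has already been rolled back
```

With `fail_midway=True` the debit of A is undone, so A + B stays at 3000 instead of dropping to 2950; without the failure, the balances become 950 and 2050.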
  • 32. Transactions – An Example … Durability: Once the execution of the transaction completes successfully, and the user who initiated the transaction has been notified that the transfer of funds has taken place, it must be the case that no system failure can result in a loss of data corresponding to this transfer of funds. The durability property guarantees that, once a transaction completes successfully, all the updates that it carried out on the database persist, even if there is a system failure after the transaction completes execution. Isolation: Even if the consistency and atomicity properties are ensured for each transaction, if several transactions are executed concurrently, their operations may interleave in some undesirable way, resulting in an inconsistent state.
  • 33. Abstract Transaction Model • Active, the initial state; the transaction stays in this state while it is executing. • Partially committed, after the final statement has been executed. • Failed, after the discovery that normal execution can no longer proceed. • Aborted, after the transaction has been rolled back and the database has been restored to its state prior to the start of the transaction. • Committed, after successful completion.
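The five states can be written down as a small state machine. The slide lists only the states; the allowed transitions below follow the usual textbook state diagram and should be read as an assumption, as should the class name `TransactionState`.

```python
# Allowed transitions between states (assumed from the usual state diagram).
ALLOWED = {
    "active": {"partially committed", "failed"},
    "partially committed": {"committed", "failed"},
    "failed": {"aborted"},
    "committed": set(),  # terminal state
    "aborted": set(),    # terminal state
}


class TransactionState:
    """Tracks one transaction's state and rejects illegal transitions."""

    def __init__(self):
        self.state = "active"  # the initial state

    def move_to(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"cannot go from {self.state} to {new_state}")
        self.state = new_state
        return self.state
```

A successful run moves active → partially committed → committed, while a failed one ends in aborted once the rollback is done; a committed transaction cannot move anywhere else.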
  • 34. Transaction Isolation Levels Serializable: usually ensures serializable execution. However, some database systems implement this isolation level in a manner that may, in certain cases, allow non-serializable executions. Repeatable read: allows only committed data to be read and further requires that, between two reads of a data item by a transaction, no other transaction is allowed to update it. However, the transaction may not be serializable with respect to other transactions. For instance, when it is searching for data satisfying some conditions, a transaction may find some of the data inserted by a committed transaction, but may not find other data inserted by the same transaction. Read committed: allows only committed data to be read, but does not require repeatable reads. For instance, between two reads of a data item by the transaction, another transaction may have updated the data item and committed. Read uncommitted: allows uncommitted data to be read. It is the lowest isolation level allowed by SQL. All the isolation levels above additionally disallow dirty writes, that is, they disallow writes to a data item that has already been written by another transaction that has not yet committed or aborted.
  • 35. Implementation of Isolation Levels • Locking: Two-phase locking requires a transaction to have two phases, one where it acquires locks but does not release any, and a second phase where the transaction releases locks but does not acquire any. • Timestamps • Multiple Versions and Snapshot Isolation: By maintaining more than one version of a data item, it is possible to allow a transaction to read an old version of a data item rather than a newer version written by an uncommitted transaction or by a transaction that should come later in the serialization order. There are a variety of multi-version concurrency control techniques. One in particular, called snapshot isolation, is widely used in practice.
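The two-phase rule for locking can be checked mechanically: once a transaction has released any lock, it may not acquire another. The sketch below tests a single transaction's sequence of lock operations against that rule; the function name and the `("lock" | "unlock", item)` encoding are assumptions for illustration.

```python
def is_two_phase(operations):
    """True if no 'lock' appears after the first 'unlock'.

    operations is a list of (action, item) pairs for one transaction,
    where action is either "lock" or "unlock".
    """
    shrinking = False  # becomes True once the first lock is released
    for action, _item in operations:
        if action == "unlock":
            shrinking = True
        elif action == "lock" and shrinking:
            return False  # acquiring during the shrinking phase breaks 2PL
    return True
```

For example, lock A, lock B, unlock A, unlock B obeys the rule, whereas lock A, unlock A, lock B does not.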
  • 36. Back to PostgreSQL (Concurrency Control) Since the concurrency control protocol used by PostgreSQL depends on the isolation level requested by the application, we begin with an overview of the isolation levels offered by PostgreSQL. We then describe the key ideas behind the MVCC scheme, followed by a discussion of their implementation in PostgreSQL and some of the implications of MVCC. The SQL standard defines three weak levels of consistency, in addition to the serializable level of consistency. The purpose of providing the weak consistency levels is to allow a higher degree of concurrency for applications that don’t require the strong guarantees that serializability provides. Examples of such applications include long-running transactions that collect statistics over the database and whose results do not need to be precise. The SQL standard defines the different isolation levels in terms of three phenomena that violate serializability. The three phenomena are called dirty read, nonrepeatable read, and phantom read.
  • 37. PostgreSQL Isolation Levels • Dirty read: The transaction reads values written by another transaction that hasn’t committed yet. • Non-repeatable read: A transaction reads the same object twice during execution and finds a different value the second time, although the transaction has not changed the value in the meantime. • Phantom read: A transaction re-executes a query returning a set of rows that satisfy a search condition and finds that the set of rows satisfying the condition has changed as a result of another recently committed transaction. It should be obvious that each of the above phenomena violates transaction isolation, and hence violates serializability.
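A toy multi-version store makes the non-repeatable read concrete and shows how reading from a fixed snapshot avoids it. This is a deliberately simplified model with a single logical clock, not PostgreSQL's MVCC implementation; the class `VersionedStore` and its methods are assumptions for illustration.

```python
class VersionedStore:
    """Keeps every committed version of each key as (commit_ts, value)."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # logical time of the latest commit

    def commit_write(self, key, value):
        """Install a new committed version and return its timestamp."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Timestamp identifying the committed state visible right now."""
        return self._clock

    def read(self, key, as_of=None):
        """Newest value committed at or before as_of (default: latest)."""
        limit = self._clock if as_of is None else as_of
        visible = [v for ts, v in self._versions.get(key, []) if ts <= limit]
        return visible[-1] if visible else None
```

Suppose a reader takes a snapshot while x is 10 and another transaction then commits x = 20. Calling `read("x")` again without the snapshot returns 20, which is a non-repeatable read, whereas `read("x", as_of=snap)` still returns 10.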
  • 38. More on PostgreSQL Database System Concepts, Chapter 27 (Parts 2.4, 2.5, …), PostgreSQL Case Study.