Transaction Processing in PostgreSQL
Tom Lane
Great Bridge, LLC
tgl@sss.pgh.pa.us
30 Oct 2000
Outline
Introduction
• What is a transaction?
User’s view
• Multi-version concurrency control
Implementation
• Tuple visibility
• Storage management
• Locks
[Figure: client/server architecture. The client application uses the client interface library to send an initial connection request and authentication to the postmaster daemon process, which spawns a Postgres server (backend) process. The client then exchanges SQL queries and results with that backend.]
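[Figure: server-side storage. The postmaster daemon process creates the shared disk buffers and spawns Postgres server (backend) processes. Backends read and write tables through the shared disk buffers, which sit above the kernel disk buffers and disk storage.]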
Atomic: results of a transaction are seen entirely or not at all within other transactions.
(A transaction need not appear atomic to itself.)
Durable: once a transaction commits, its results will not be lost regardless of
subsequent failures.
Well ... that depends on how much you trust your kernel and hard disk.
• Postgres transactions are only guaranteed atomic if a disk page write is an atomic
action. On most modern hard drives that’s true if a page is a physical sector, but most
people run with disk pages configured as 8K or so, which makes it a little more dubious
whether a page write is all-or-nothing.
• pg_log is safe anyway since we’re only flipping bits in it, and both bits of a
transaction’s status must be in the same sector.
• But when moving tuples around in a data page, there’s a potential for data corruption
if a power failure should manage to abort the page write partway through (perhaps only
some of the component sectors get written). This is one reason to keep page sizes
small ... and to buy a UPS for your server!
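The pg_log point can be made concrete with a small sketch. It assumes two status bits per transaction, packed four transactions to a byte; the status codes and helper names are illustrative assumptions, not PostgreSQL's actual on-disk encoding. Because both bits of one transaction live in a single byte, they necessarily live in a single sector.

```python
# Hypothetical two-bit status codes; the real pg_log encoding may differ.
IN_PROGRESS, COMMITTED, ABORTED = 0b00, 0b01, 0b10

XACTS_PER_BYTE = 4  # two bits per transaction


def set_status(log: bytearray, xid: int, status: int) -> None:
    """Store the two-bit status of transaction `xid` in the log."""
    byte_index, slot = divmod(xid, XACTS_PER_BYTE)
    shift = slot * 2
    cleared = log[byte_index] & ~(0b11 << shift) & 0xFF  # zero the old bits
    log[byte_index] = cleared | (status << shift)


def get_status(log: bytearray, xid: int) -> int:
    """Read back the two-bit status of transaction `xid`."""
    byte_index, slot = divmod(xid, XACTS_PER_BYTE)
    return (log[byte_index] >> (slot * 2)) & 0b11
```

Updating a status touches exactly one byte, so a torn multi-sector page write cannot leave one transaction's two bits half-updated.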
It’s critical that we force a transaction’s data page changes down to disk before we write
pg_log. If the disk writes occur in the wrong order, a power failure could leave us with a
transaction that’s marked committed in pg_log but not all of whose data changes are
reflected on disk --- thus failing the atomicity test.
• Unix kernels allow us to force the correct write order via fsync(2), but the performance
penalty of fsync’ing many files is pretty high.
• We’re looking at ways to avoid needing so many fsync()s, but that’s a different talk.
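The ordering rule above can be sketched as follows. This is a minimal illustration using Python's `os.fsync`, which wraps fsync(2); the `commit` function, its parameters, and the one-file-per-role layout are assumptions made for the example, not how the backend is actually organized.

```python
import os


def commit(data_path: str, log_path: str, page: bytes, log_record: bytes) -> None:
    """Force data changes to disk first, and only then record the commit."""
    # Step 1: write the data page change and force it down to disk.
    fd = os.open(data_path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        os.write(fd, page)
        os.fsync(fd)  # data must be on disk before the commit is recorded
    finally:
        os.close(fd)

    # Step 2: only now append the commit record, and force that out too.
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, log_record)
        os.fsync(fd)
    finally:
        os.close(fd)
```

If power fails between the two steps, the data is on disk but the transaction is not marked committed, so it is simply treated as never having committed.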
• Writers only block each other when updating the same row
and before it commits, transaction B comes along and wants to do the same thing
on the same row.
Ignoring tuples you’re not supposed to be able to see is the key to making
transactions appear atomic.
We must store multiple versions of every row. A tuple can be removed only after
it’s been committed as deleted for long enough that no active transaction
can see it anymore.
If we plan to update rather than delete, we first add the new version of the row to the table,
then set xmax and the forward link in the old tuple. The forward link will be needed by
concurrent updaters (but not by readers).
To avoid repeated consultation of pg_log, there are also some status bits that indicate
"known committed" or "known aborted" for xmin and xmax. These are set by the first
backend that inspects xmin or xmax after the referenced transaction commits/aborts.
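A minimal sketch of the update sequence and the status-bit shortcut follows. The `TupleVersion` class, its field layout, and the use of a plain set to stand in for pg_log are assumptions made for illustration; only xmin, xmax, the forward link, and the "known committed" bit come from the description above.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Set


@dataclass
class TupleVersion:
    data: Any
    xmin: int                      # ID of the transaction that inserted it
    xmax: Optional[int] = None     # ID of the transaction that deleted/updated it
    forward: Optional[int] = None  # index of the newer version, if updated
    xmin_committed: bool = False   # "known committed" status bit for xmin


def update(table: List[TupleVersion], old_index: int, new_data: Any, xid: int) -> int:
    """Add the new row version first, then mark the old one."""
    table.append(TupleVersion(data=new_data, xmin=xid))
    new_index = len(table) - 1
    old = table[old_index]
    old.xmax = xid            # old version is now deleted by xid ...
    old.forward = new_index   # ... and points at its replacement
    return new_index


def xmin_is_committed(tup: TupleVersion, committed_xids: Set[int]) -> bool:
    """Consult the log only if the status bit is not already set."""
    if tup.xmin_committed:
        return True
    if tup.xmin in committed_xids:  # stands in for a pg_log lookup
        tup.xmin_committed = True   # cache the answer on the tuple itself
        return True
    return False
```

The old version is never overwritten in place, so a reader that should still see it can keep doing so while the updater works.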
• Hence, we make a list at transaction start of which transactions are currently being
run by other backends. (Cheap shared-memory communication is essential here: we
just look in a shared-memory table, in which each backend records its current
transaction number.)
• These transaction IDs will never be considered valid by the current transaction,
even if they are shown to be committed in pg_log or on-row status bits.
• Nor will a transaction with ID higher than the current transaction’s ever be
considered valid.
• These rules ensure that no transaction committing after the current transaction’s
start will be considered committed.
• Validity is in the eye of the beholder.
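The rules in this list can be sketched as a snapshot taken at transaction start plus a validity test. The `Snapshot` class and function names are hypothetical, a plain set stands in for pg_log, and transaction-ID wraparound is ignored.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class Snapshot:
    my_xid: int                  # the current transaction's own ID
    in_progress: FrozenSet[int]  # IDs other backends were running at start


def take_snapshot(my_xid: int, running_xids: Iterable[int]) -> Snapshot:
    """Record which other transactions are running at transaction start."""
    return Snapshot(my_xid, frozenset(x for x in running_xids if x != my_xid))


def considered_committed(xid: int, snap: Snapshot, committed_xids: Set[int]) -> bool:
    """Decide whether `xid` counts as committed from this snapshot's viewpoint."""
    if xid in snap.in_progress:
        return False  # was still running when we started
    if xid > snap.my_xid:
        return False  # started after we did
    return xid in committed_xids  # otherwise, trust the commit log
```

Two transactions holding different snapshots can give different answers for the same transaction ID, which is the sense in which validity is in the eye of the beholder.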
Even though readers and writers don’t block each other under MVCC, we still need
table-level locking.
This exists mainly to prevent the entire table from being altered or deleted
out from under readers or writers.
We also offer various lock levels for application use (mainly for porting applications
that take a traditional lock-based approach to concurrency).
Types of locks
No.  Lock mode               Acquired by   Conflicts with
1    AccessShareLock         SELECT        7
5    ShareRowExclusiveLock                 3,4,5,6,7
6    ExclusiveLock                         2,3,4,5,6,7
Locks are held till end of transaction: you can grab a lock, but you can’t release it
except by ending your transaction.
Lock implementation
Locks are recorded in a shared-memory hash table keyed by kind and ID of object
being locked. Each item shows the types and numbers of locks held or pending on
its object. Would-be lockers who have a conflict with an existing lock must wait.
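A rough sketch of such a table follows, using an ordinary in-process dictionary in place of shared memory. The `LockTable` class and its method are hypothetical, and the conflict sets cover only the three lock levels listed under "Types of locks", reading the final column there as the conflicting levels. The sketch also ignores who holds each lock, so it does not model a transaction re-acquiring its own lock.

```python
from collections import Counter, defaultdict

# Partial conflict table: only the three levels listed in the slides.
CONFLICTS = {
    1: {7},                 # AccessShareLock
    5: {3, 4, 5, 6, 7},     # ShareRowExclusiveLock
    6: {2, 3, 4, 5, 6, 7},  # ExclusiveLock
}


def conflicts(a: int, b: int) -> bool:
    """Treat conflicts as symmetric, using whichever row is known."""
    return b in CONFLICTS.get(a, ()) or a in CONFLICTS.get(b, ())


class LockTable:
    def __init__(self) -> None:
        # (kind, object_id) -> count of locks per level, held and pending
        self.held = defaultdict(Counter)
        self.pending = defaultdict(Counter)

    def request(self, kind: str, object_id: int, level: int) -> bool:
        """Return True if granted, False if the caller must wait."""
        key = (kind, object_id)
        for held_level, count in self.held[key].items():
            if count > 0 and conflicts(level, held_level):
                self.pending[key][level] += 1  # record the waiter
                return False
        self.held[key][level] += 1
        return True
```

For example, two level-1 requests on the same table are both granted, while a level-5 request waits if a level-6 lock is already held on that table.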
Deadlock detection
Deadlock is possible if two transactions try to grab conflicting locks in different orders.
If a would-be locker sleeps for more than a second without getting the desired lock,
it runs a deadlock-check algorithm that searches the lock hash table for circular
lock dependencies. If it finds any, then obtaining the lock will be impossible, so it
gives up and reports an error. Else it goes back to sleep and waits till granted the
lock (or till the client application gives up and requests transaction cancel).
• The delay before running the deadlock check algorithm can be tuned to match the
typical transaction time in a particular server’s workload. In this way, unnecessary
deadlock checks are seldom performed, but real deadlocks are detected reasonably
quickly.
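The search for circular lock dependencies can be sketched as a cycle check over a waits-for graph. This is a generic depth-first search under assumed data structures (a dictionary mapping each waiting transaction to the transactions holding the locks it wants), not the algorithm the backend actually uses, and it omits the sleep-then-check delay.

```python
from typing import Dict, Hashable, Set


def finds_deadlock(waiter: Hashable, waits_for: Dict[Hashable, Set[Hashable]]) -> bool:
    """Return True if `waiter` is part of a cycle of lock waits."""
    stack = list(waits_for.get(waiter, ()))
    seen = set()
    while stack:
        current = stack.pop()
        if current == waiter:
            return True  # followed the waits back to ourselves
        if current in seen:
            continue
        seen.add(current)
        stack.extend(waits_for.get(current, ()))
    return False
```

A waiter that finds no cycle through itself simply goes back to sleep; a cycle among other transactions is left for one of those transactions to detect.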
Short-term locks
These locks should only be held for long enough to examine and/or update a
shared item --- in particular a backend should never block while holding one.
Summary