Introduction To Hacking PostgreSQL
Neil Conway
1 Prerequisites
Why Should You Hack On PostgreSQL?
What Skills Will You Need?
What Tools Should You Use?
2 The Architecture of PostgreSQL
System Architecture
Components of the Backend
3 Common Code Conventions
Memory Management
Error Handling
4 Community Processes
5 Sample Patch
6 Conclusion
Why Hack On PostgreSQL?
Possible Reasons
Essential
Some knowledge of C
Fortunately, C is easy
Some familiarity with Unix and basic Unix programming
Postgres development on Win32 is increasingly feasible
The Basics
$CC, Bison, Flex, CVS, autotools
Configure flags: --enable-depend, --enable-debug, --enable-cassert
Consider CFLAGS=-O0 for easier debugging (and faster builds)
With GCC, this suppresses some important warnings that depend on optimizer analysis
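The --enable-cassert flag turns on assertion checks in the server code. As a rough illustration of the idea only (this is not PostgreSQL's own macro), an assertion that compiles to nothing unless a build flag is set might look like the sketch below. SKETCH_ASSERT_CHECKING, SketchAssert, and clamp_percent are hypothetical names invented for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical build flag, standing in for what --enable-cassert switches on. */
#define SKETCH_ASSERT_CHECKING 1

#if SKETCH_ASSERT_CHECKING
/* Checked build: report the failed condition and abort. */
#define SketchAssert(cond) \
    do { \
        if (!(cond)) { \
            fprintf(stderr, "assertion failed: %s (%s:%d)\n", \
                    #cond, __FILE__, __LINE__); \
            abort(); \
        } \
    } while (0)
#else
/* Unchecked build: the assertion disappears entirely. */
#define SketchAssert(cond) ((void) 0)
#endif

/* Clamp a value into the range 0..100; the assertion states the postcondition. */
int
clamp_percent(int value)
{
    int result = value < 0 ? 0 : (value > 100 ? 100 : value);

    SketchAssert(result >= 0 && result <= 100);
    return result;
}
```

Because the check vanishes when the flag is off, assertions of this kind can be used liberally during development without slowing down production builds.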
Development Tools
Profiling
Understatement
Authoring SGML
The Postmaster
Lifecycle
Types of Processes
autovacuum launcher: Periodically start autovacuum workers
bgwriter: Flush dirty buffers to disk, perform periodic checkpoints
stats collector: Accepts run-time stats from backends via UDP
syslogger: Collect log output from other processes, write to file(s)
normal backend: Handles a single client session
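The one-process-per-session model in the list above can be sketched with plain POSIX calls. This is a minimal illustration, not the postmaster's code: run_session_in_child is a hypothetical helper, and unlike a real daemon it waits for the child right away instead of going back to accept more connections.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run one "session" in a child process and return the
 * child's exit status, or -1 on failure.
 */
int
run_session_in_child(int session_id)
{
    pid_t pid = fork();
    int   status;

    if (pid < 0)
        return -1;                  /* fork failed */

    if (pid == 0)
    {
        /* Child: a real backend would serve its client here. */
        _exit(session_id & 0xFF);   /* exit statuses are limited to 8 bits */
    }

    /* Parent: this sketch simply waits for the child to finish. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A crash in the child leaves the parent running, which is one reason a process-per-session design is attractive for a server.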
Daemon Processes
Inter-Process Communication
Advantages
Disadvantages
Backend Lifecycle
Major Components
Tables → Files
Files → Blocks
Each file is divided into blocks of BLCKSZ bytes each
8192 by default; compile-time constant
Blocks consist of items, such as heap tuples (in tables), or
index entries (in indexes), along with metadata
Tuple versions are uniquely identified by the triple (r, p, i): relation
OID, block number, offset within block; the (p, i) pair is known as the ctid
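The addressing scheme above can be sketched as a small struct plus the arithmetic for locating a block. The names SketchTupleId, SKETCH_BLCKSZ, block_start_in_file, and same_tuple_version are hypothetical and are not the server's own definitions. The sketch also assumes the whole relation lives in a single file, which is a simplification.

```c
#include <stdint.h>

#define SKETCH_BLCKSZ 8192          /* default block size given in the text */

/* Hypothetical stand-in for the (r, p, i) triple described above. */
typedef struct SketchTupleId
{
    uint32_t relation_oid;          /* r: which relation */
    uint32_t block_number;          /* p: which block within the relation */
    uint16_t offset_in_block;       /* i: which item within the block */
} SketchTupleId;

/* Byte position at which a block begins, assuming one file per relation. */
uint64_t
block_start_in_file(uint32_t block_number)
{
    return (uint64_t) block_number * SKETCH_BLCKSZ;
}

/* Two identifiers name the same tuple version only if all three parts match. */
int
same_tuple_version(const SketchTupleId *a, const SketchTupleId *b)
{
    return a->relation_oid == b->relation_oid &&
           a->block_number == b->block_number &&
           a->offset_in_block == b->offset_in_block;
}
```

For example, block 3 starts at byte 3 × 8192 = 24576 of the file under this single-file assumption.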
The Buffer Manager
Table-level Locks
Row-level Locks
LWLocks (Latches)
Spinlocks
Makefiles
Content of src/backend
The Postgres Object System: Nodes
Mailing Lists
The TABLESAMPLE Clause
Deficiencies
1 Non-uniform sampling when either
row size is non-uniform
distribution of live tuples is non-uniform
2 Consumes a lot of entropy
3 Could be optimized to reduce random I/O
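The first deficiency can be made concrete with a little arithmetic. The sampling algorithm itself is not spelled out here, so purely for illustration assume a scheme that picks a block uniformly at random and then one live tuple uniformly within that block. The function tuple_pick_probability is a hypothetical helper for that assumed scheme, not part of the patch.

```c
/*
 * Probability that one particular tuple is chosen, ASSUMING a scheme that
 * first picks one of nblocks blocks uniformly and then picks one of the
 * live tuples in that block uniformly. Returns 0.0 for invalid input.
 */
double
tuple_pick_probability(int nblocks, int live_tuples_in_block)
{
    if (nblocks <= 0 || live_tuples_in_block <= 0)
        return 0.0;

    return 1.0 / ((double) nblocks * (double) live_tuples_in_block);
}
```

Under that assumption, a tuple in a block holding 10 wide rows is ten times more likely to be chosen than a tuple in a block holding 100 narrow rows, so the sample is skewed toward wide rows and toward blocks with few live tuples.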
Behavioral Questions
Range Table
The parse-analysis phase constructs a range table consisting of
the FROM clause elements
When converting the FROM clause RangeVars (RVs) into range table entries
(RTEs), attach the TableSampleInfo
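The step above amounts to carrying the sampling information from the raw parse node over to the analyzed one. The sketch below uses simplified, hypothetical structs (SketchRangeVar, SketchRangeTblEntry, SketchTableSampleInfo with an invented percent field); the real node definitions in the backend have many more fields and are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical stand-in for the TableSampleInfo node named in the text. */
typedef struct SketchTableSampleInfo
{
    double percent;                     /* invented field, for illustration */
} SketchTableSampleInfo;

/* Simplified raw-parser view of one FROM clause item. */
typedef struct SketchRangeVar
{
    const char            *relname;
    SketchTableSampleInfo *sample;      /* NULL when no TABLESAMPLE clause */
} SketchRangeVar;

/* Simplified range table entry produced by parse analysis. */
typedef struct SketchRangeTblEntry
{
    const char            *relname;
    SketchTableSampleInfo *sample;      /* carried over from the RangeVar */
} SketchRangeTblEntry;

/* Build an RTE from a RangeVar, attaching the sampling info if present. */
SketchRangeTblEntry
sketch_transform_range_var(const SketchRangeVar *rv)
{
    SketchRangeTblEntry rte;

    rte.relname = rv->relname;
    rte.sample = rv->sample;
    return rte;
}
```

Keeping the pointer NULL for ordinary tables lets later stages treat "no sampling" as the default case without a separate flag.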
Optimizer Terminology
Mandatory
Optional
TABLESAMPLE support
Next Steps
Any questions?