PostgreSQL
Reuven M. Lerner (reuven@lerner.co.il)
             IL-Techtalks
        November 14th, 2012
Who am I?
• Web developer since 1993
• Linux Journal columnist since 1996
• Software architect, developer, consultant
• Mostly Ruby on Rails + PostgreSQL, but
  also Python, PHP, Perl, JavaScript, MySQL,
  MongoDB, and lots more...
• PostgreSQL user since (at least) 1997
What do I do?

• Web development, especially in Rails
• Teaching/training
• Coaching/consulting
What is a database?

Store data confidently → Database → Retrieve data flexibly
Relational databases

Define tables, store data in them → Database → Retrieve data from related tables
Lots of options!

• Oracle
• Microsoft SQL Server
• IBM DB2
• MySQL
• PostgreSQL
How do you choose?
•   Integrity (ACID compliance)

•   Data types

•   Functionality

•   Tools

•   Extensibility

•   Documentation

•   Community
PostgreSQL
• Very fast, very scalable. (Just ask Skype.)
• Amazingly flexible, easily extensible.
• Rock-solid — no crashes, corruption,
  security issues for years
• Ridiculously easy administration
• It also happens to be free (MIT/BSD)
What about MySQL?
• PostgreSQL has many more features
• Not nearly as popular as MySQL
• No single company behind it
 • (A good thing, I think!)
• After using both, I prefer PostgreSQL
 • I’ll be happy to answer questions later
Brief history
• Ingres (Stonebraker, Berkeley)
• Postgres (Stonebraker, Berkeley)
• PostgreSQL project = Postgres + SQL
• About one major release per year
• Version 8.x: Windows port, point-in-time recovery
• Version 9.0: hot standby, streaming replication, in-place upgrades
ACID
• ACID — basic standard for databases
 • Atomicity
 • Consistency
 • Isolation
 • Durability
• Pg has always been ACID compliant
Data types
• Boolean
• Numeric (integer, float, decimal)
• (var)char, text (no declared length limit; up to about 1 GB per value), binary
• sequences (guaranteed to be unique)
• Date/time and time intervals
• IP addresses, XML, enums, arrays
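A few of these types in action. This is only a sketch: the literal values and the `mood` enum are made up for illustration.

```sql
-- IP addresses: is the host inside the subnet?
SELECT '192.168.1.10'::inet << '192.168.1.0/24'::inet;

-- Arrays: does the array contain the value 2?
SELECT ARRAY[1, 2, 3] @> ARRAY[2];

-- Enums: a hypothetical type with a fixed, ordered set of labels
CREATE TYPE mood AS ENUM ('sad', 'ok', 'happy');
SELECT 'happy'::mood > 'sad'::mood;

-- Intervals: date/time arithmetic
SELECT DATE '2012-11-14' + INTERVAL '1 month';
```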
Or create your own!

CREATE TYPE Person AS
    (first_name TEXT, last_name TEXT);

CREATE TABLE Members
    (group_id INTEGER, member Person);
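Storing and reading a composite value might look like the following sketch. It assumes the Person type and Members table defined above; the inserted values are made up.

```sql
-- Assumes the Person type and Members table defined above.
INSERT INTO Members (group_id, member)
    VALUES (1, ROW('Reuven', 'Lerner'));

-- Parentheses are required so that "member" is read as a column,
-- not as a table name.
SELECT group_id, (member).first_name, (member).last_name
  FROM Members;
```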
Strong typing
• PostgreSQL won’t automatically change
  types for you.
• This can be annoying at first — but it is
  meant to protect your data!
• You can cast from one type to another with
  CAST(value AS type) or the :: operator
• You can also define your own casts
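A short sketch of explicit casting, using made-up values. The commented-out statement is there to show the kind of call that strong typing rejects.

```sql
-- Two equivalent spellings of the same cast
SELECT CAST('42' AS INTEGER) + 1;
SELECT '42'::INTEGER + 1;

-- No implicit conversion from integer to text: this raises an error
-- because there is no length(integer) function.
-- SELECT length(42);

-- An explicit cast makes the intent clear.
SELECT length(42::TEXT);
```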
PostGIS
• Some people took this all the way
• Want to include geographical information?
• No problem — we’ve got PostGIS!
• Complete GIS solution, with data types and
  functions
• Keeps pace with main PostgreSQL revisions
Object oriented tables

• Employee table inherits from People table:
 CREATE TABLE Employee
 (employee_id INTEGER,
 department_id INTEGER)
 INHERITS (People);
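To see what inheritance does to queries, here is a minimal sketch. The slides do not show the People table, so its columns here are an assumption, as are the sample rows.

```sql
-- Hypothetical parent table; the slides do not show its definition.
CREATE TABLE People (id SERIAL, name TEXT);

CREATE TABLE Employee
    (employee_id INTEGER,
     department_id INTEGER)
    INHERITS (People);

INSERT INTO People (name) VALUES ('Alice');
INSERT INTO Employee (name, employee_id, department_id)
    VALUES ('Bob', 1001, 7);

-- Querying the parent also returns rows stored in child tables.
SELECT name FROM People;

-- ONLY restricts the query to rows stored in the parent itself.
SELECT name FROM ONLY People;
```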
Foreign keys that work
CREATE TABLE DVDs (id SERIAL, title TEXT,
    store_id INTEGER REFERENCES Stores);

INSERT INTO DVDs (title, store_id)
    VALUES ('Attack of the Killer Tomatoes', 500);

ERROR: insert or update on table "dvds" violates foreign key constraint "dvds_store_id_fkey"
DETAIL: Key (store_id)=(500) is not present in table "stores".

Custom validity checks
CREATE TABLE DVDs (id SERIAL,
    title TEXT CHECK (length(title) > 3),
    store_id INTEGER REFERENCES Stores);

INSERT INTO DVDs (title, store_id) VALUES ('AB', 500);

ERROR: new row for relation "dvds" violates check constraint "dvds_title_check"
No more bad dates!
INSERT INTO UPDATES (created_at) VALUES ('32-feb-2008');

ERROR: date/time field value out of range: "32-feb-2008"
LINE 1: insert into updates (created_at) values ('32-feb-2008');
Timestamp vs. Interval
testdb=# select now();
              now
-------------------------------
 2010-10-31 08:58:23.365792+02
(1 row)

testdb=# select now() - interval '3 days';
           ?column?
-------------------------------
 2010-10-28 08:58:28.870011+02
(1 row)

• Timestamp: a point in time
• Interval: the difference between points in time
Built-in functions
• Math
• Text processing (including regexps)
• Date/time calculations
• Conditionals (CASE, COALESCE, NULLIF)
  for use in queries
• Extensive library of geometrical functions
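A sampler of the built-in functions listed above, as a rough sketch with made-up input values:

```sql
-- Text processing with a regular expression
SELECT regexp_replace('PostgreSQL 9.2', '[0-9.]+$', 'x.y');

-- Date/time calculation: days between two dates
SELECT DATE '2012-11-14' - DATE '2012-01-01';

-- Conditionals
SELECT COALESCE(NULL, 'fallback'),      -- first non-null argument
       NULLIF(0, 0),                    -- NULL when both arguments are equal
       CASE WHEN 5 > 3 THEN 'bigger' ELSE 'smaller' END;
```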
Or write your own!
• PL/pgSQL
• PL/Perl
• PL/Python
• PL/Ruby
• PL/R
• PL/Tcl
Or write your own!
CREATE OR REPLACE FUNCTION remove_cache_tables() RETURNS VOID AS $$
DECLARE
      r pg_catalog.pg_tables%rowtype;
BEGIN
      -- "!_" matches a literal underscore; a bare "_" is a LIKE wildcard
      FOR r IN SELECT * FROM pg_catalog.pg_tables
      WHERE schemaname = 'public'
        AND tablename ILIKE 'cache!_%' ESCAPE '!'
      LOOP
           RAISE NOTICE 'Now dropping table %', r.tablename;
           -- quote_ident() protects against odd or mixed-case table names
           EXECUTE 'DROP TABLE ' || quote_ident(r.tablename);
      END LOOP;
END;
$$ LANGUAGE plpgsql;
Another example
CREATE OR REPLACE FUNCTION store_hostname() RETURNS TRIGGER AS $store_hostname$
    BEGIN
           NEW.hostname := 'http://' ||
            substring(NEW.url, '(?:http://)?([^/]+)');
           RETURN NEW;
    END;
$store_hostname$ LANGUAGE plpgsql;
Triggers

• Yes, that last function was a trigger
• Automatically execute functions upon
  INSERT, UPDATE, and/or DELETE
• Can execute before or after
• Very powerful, very fast
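A trigger function does nothing until it is attached to a table. The sketch below attaches store_hostname() from the previous slide; the `links` table and the trigger name are assumptions made for illustration.

```sql
-- Hypothetical table; only the url and hostname columns matter here.
CREATE TABLE links (id SERIAL, url TEXT, hostname TEXT);

-- Run store_hostname() before each row is inserted or updated.
CREATE TRIGGER store_hostname_trigger
    BEFORE INSERT OR UPDATE ON links
    FOR EACH ROW
    EXECUTE PROCEDURE store_hostname();

-- The trigger fills in hostname; the INSERT never mentions it.
INSERT INTO links (url) VALUES ('http://www.postgresql.org/docs/');
SELECT url, hostname FROM links;
```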
Function possibilities
• Computing values, strings
• Returning table-like sets of values
• Encapsulating queries
• Dynamically generating queries via strings
• Triggers: Modifying data before it is inserted
  or updated
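As one illustration of "returning table-like sets of values," here is a sketch of a set-returning function. The function name `squares` and its columns are hypothetical.

```sql
-- Hypothetical function: returns a table-like set of rows.
CREATE OR REPLACE FUNCTION squares(INTEGER)
RETURNS TABLE (n INTEGER, n_squared INTEGER) AS $$
    SELECT i, i * i FROM generate_series(1, $1) AS i;
$$ LANGUAGE sql;

-- Use it in FROM, just like a table.
SELECT * FROM squares(5) WHERE n_squared > 10;
```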
Why use a PL/lang?

• Other libraries (e.g., CPAN for Perl)
• Faster, optimized functions (e.g., R)
• Programmer familiarity
• Cached query plans
Views and rules
• Views are stored SELECT statements
• Pretend that something is a read-only table
• Rules let you turn it into a read/write table
 • Intercept and rewrite incoming query
 • Check or change data
 • Change where data is stored
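A minimal sketch of a view made writable with a rule. The `customers` table, the view, and the rule name are all assumptions for illustration.

```sql
-- Hypothetical table and view
CREATE TABLE customers (id SERIAL, name TEXT, active BOOLEAN DEFAULT true);

CREATE VIEW active_customers AS
    SELECT id, name FROM customers WHERE active;

-- Rewrite INSERTs against the view into INSERTs on the underlying table.
CREATE RULE active_customers_insert AS
    ON INSERT TO active_customers
    DO INSTEAD
    INSERT INTO customers (name) VALUES (NEW.name);

INSERT INTO active_customers (name) VALUES ('Alice');
SELECT * FROM active_customers;
```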
Full-text indexing

• Built into PostgreSQL
• Handles stop words, different languages,
  synonyms, and even (often) stemming
• Very powerful, but it can take some time to
  get configured correctly
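A small taste of the full-text machinery, as a sketch. The `articles` table and index name are hypothetical; `to_tsvector`, `to_tsquery`, and the `@@` match operator are the built-in pieces.

```sql
-- Stemming: "foxes" and "jumped" match the query terms "fox" and "jump".
SELECT to_tsvector('english', 'The quick brown foxes jumped')
       @@ to_tsquery('english', 'fox & jump');

-- Hypothetical table with a GIN index over the parsed text
CREATE TABLE articles (id SERIAL, body TEXT);
CREATE INDEX articles_body_fts
    ON articles USING gin (to_tsvector('english', body));

SELECT id FROM articles
 WHERE to_tsvector('english', body) @@ to_tsquery('english', 'replication');
```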
Transactions
• In PostgreSQL from the beginning
• Use transactions for just about anything,
  even DDL such as DROP TABLE:
  BEGIN;
  DROP TABLE DVDs;
  ROLLBACK;
  SELECT * FROM DVDs; -- Works!
Savepoints
(or, sub-transactions)
BEGIN;
INSERT INTO table1 VALUES (1);
SAVEPOINT my_savepoint;
INSERT INTO table1 VALUES (2);
ROLLBACK TO SAVEPOINT my_savepoint;
INSERT INTO table1 VALUES (3);
COMMIT;
MVCC
• Readers and writers don’t block each other
• “Multi-version concurrency control”
• xmin, xmax on each tuple: the IDs of the
  transactions that created and deleted it;
  a transaction sees only tuples whose xmin is
  visible to it and whose xmax is not
• Old versions stick around until vacuumed
 • Autovacuum normally handles this for you
MVCC
• Look at a row’s xmin and xmax
• Look at txid_current()
• Start transaction; look at row’s xmin/xmax
• Look at xmin/xmax on that row from
  another session
• Commit, and look again at both!
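The walkthrough above could be tried with something like the following sketch. The `mvcc_demo` table is hypothetical; `xmin`, `xmax`, and `txid_current()` are the real system columns and function.

```sql
-- Hypothetical table with a single row
CREATE TABLE mvcc_demo (id INTEGER, note TEXT);
INSERT INTO mvcc_demo VALUES (1, 'first version');

-- xmin and xmax are hidden system columns; name them explicitly.
SELECT xmin, xmax, * FROM mvcc_demo;

BEGIN;
SELECT txid_current();          -- this transaction's ID
UPDATE mvcc_demo SET note = 'second version' WHERE id = 1;
SELECT xmin, xmax, * FROM mvcc_demo;
-- Run the same SELECT from a second session before committing:
-- it still sees the old version of the row.
COMMIT;
```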
Downsides of MVCC
• MVCC is usually fantastic
• But if you insert or update many rows, and
  then do a COUNT(*), things will be slow
• There are solutions — including more
  aggressive auto-vacuuming
• 9.2 introduced index-only scans, which can help here
Indexing
• Regular, unique indexes
• Functional indexes
 • Index calling a function on a column
• Partial indexes
 • Index only rows matching criteria
• Cluster table on an index
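Each of the index types above, sketched against a hypothetical `users` table (the table, column, and index names are assumptions):

```sql
-- Hypothetical table
CREATE TABLE users (id SERIAL, email TEXT, deleted BOOLEAN DEFAULT false);

-- Regular unique index
CREATE UNIQUE INDEX users_email_idx ON users (email);

-- Functional index: supports WHERE lower(email) = '...'
CREATE INDEX users_lower_email_idx ON users (lower(email));

-- Partial index: covers only rows matching the WHERE clause
CREATE INDEX users_live_email_idx ON users (email) WHERE NOT deleted;

-- Physically reorder the table to match an index (a one-time operation)
CLUSTER users USING users_email_idx;
```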
CTEs
• Adds a “WITH” clause to a query, which
  defines a sorta-kinda temp table
• You can then query that same temp table
• Makes many queries easier to read, write,
  without a real temp table
• Better yet: CTEs can be recursive, for
  everything from Fibonacci to org charts
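Since Fibonacci came up, here is one way a recursive CTE might generate it. This is a sketch; the CTE name `fib` and its column names are arbitrary.

```sql
WITH RECURSIVE fib(a, b) AS (
    SELECT 0, 1                  -- starting pair
    UNION ALL
    SELECT b, a + b              -- each step shifts the pair forward
      FROM fib
     WHERE b < 100               -- stop condition, or it would never end
)
SELECT a FROM fib;
```

The first SELECT seeds the "temp table," and the second keeps adding rows based on the rows produced so far. The same shape works for walking an org chart: seed with the top manager, then repeatedly join employees to the rows already found.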
Speed and scalability
• MVCC + a smart query optimizer makes
  PostgreSQL pretty fast and smart
• Statistics gathered by ANALYZE (sampled from
  table contents) inform the query planner
• Several scan types, join types are weighed
• Benchmarks consistently show excellent
  performance with high mixes of read/write
WAL
• All activity in the database is put in “write-
  ahead logs” before it happens
• If the database server fails, it replays the
  WALs, then continues
• You can change how often WALs are
  written, to improve performance
• PITR — restore database from WALs
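One knob related to how eagerly WAL is flushed is `synchronous_commit`. The sketch below shows it as an example of this kind of tuning; whether it is the setting the slide has in mind is an assumption.

```sql
-- Show the current settings
SHOW wal_level;
SHOW synchronous_commit;

-- For this session only: report COMMIT before its WAL records are flushed
-- to disk. Faster, but the most recent commits can be lost in a crash.
SET synchronous_commit = off;
```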
Log shipping
• Copy WALs to a second, identical server —
  known as “log shipping” — and you have a
  backup
• If the primary server goes down, you can
  bring the secondary up in its place
• This was known as “warm standby,” and
  worked in 8.4
Hot standby, streaming replication
• As of 9.0, you don’t have to do this
• You can have the primary stream the
  information to the secondary
 • Almost-instant updates
• The secondary machine can answer read-
  only queries (“hot standby”), not just
  handle failover
Extensions
• Provides a standardized mechanism for
  downloading, installing, and versioning
  extensions
• New data types, functions, languages are
  possible
• Download, search via pgxn.org
• Similar to CPAN, PyPI, or RubyGems
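Once an extension's files are on the server, installing it into a database is a single statement. A sketch using hstore, which assumes the contrib package is installed:

```sql
-- What can be installed on this server?
SELECT name, default_version, installed_version
  FROM pg_available_extensions
 ORDER BY name;

-- Install hstore (a key/value data type) into the current database.
-- Assumes the contrib package is installed on the server.
CREATE EXTENSION hstore;

SELECT 'a=>1, b=>2'::hstore -> 'a';
```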
SQL/MED

• SQL/MED was introduced in 9.1
• Query information from other databases
  (and database-like interfaces)
• So if you have data in MySQL, Oracle,
  CSV ... just install a wrapper, and you can
  query it like a PostgreSQL table
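For the CSV case, the file_fdw wrapper that ships with PostgreSQL might be used as in this sketch. The server name, table name, columns, and file path are all hypothetical.

```sql
CREATE EXTENSION file_fdw;
CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;

-- Hypothetical CSV file with a header row and two columns
CREATE FOREIGN TABLE csv_sales (
    sold_on DATE,
    amount  NUMERIC
) SERVER csv_files
  OPTIONS (filename '/tmp/sales.csv', format 'csv', header 'true');

SELECT sold_on, sum(amount) FROM csv_sales GROUP BY sold_on;
```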
Unlogged tables

• All actions are logged in WALs
• That adds some overhead, which isn’t
  required by throwaway data
• Unlogged tables (different from temp
  tables!) offer a speedup, in exchange for
  less reliability
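Creating an unlogged table only takes one extra keyword. A sketch with a hypothetical cache table:

```sql
-- Hypothetical cache table: contents are not written to the WAL,
-- so the table is emptied after a crash and is not replicated.
CREATE UNLOGGED TABLE page_cache (
    url        TEXT PRIMARY KEY,
    body       TEXT,
    fetched_at TIMESTAMP DEFAULT now()
);
```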
New in 9.2
• JSON support
• Range types, for handling ranges of values
• Much more scalable: from 24 cores and
  75k queries/sec to 64 cores and 350k
  queries/sec
• Index-only scans (“covering indexes”)
• Cascading replication
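A quick sketch of the two new data types, using made-up values:

```sql
-- JSON: the value is checked for validity when it is stored
SELECT '{"name": "PostgreSQL", "version": 9.2}'::json;

-- Range types: does the range contain the value?
SELECT int4range(10, 20) @> 15;
SELECT daterange('2012-11-01', '2012-12-01') @> DATE '2012-11-14';

-- Do two ranges overlap?
SELECT int4range(10, 20) && int4range(15, 25);
```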
Web problems
• PostgreSQL is great as a Web backend
• But if you use an ORM (e.g., ActiveRecord),
  you are probably losing much of the power
  • e.g., foreign keys, CTE, triggers, and views
• No good way to bridge this gap — for now
• There are always methods, but this is an
  area that definitely needs some work
Tablespaces
• You can create any number of
  “tablespaces,” separate storage areas
• Put tables, indexes on different tablespaces
 • Most useful with multiple disks
• Separate tables (or parts of a partitioned
  table)... or separate tables from indexes
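Putting a table and its index on different storage could look like this sketch. The tablespace name, directory, and `orders` table are assumptions.

```sql
-- Hypothetical directory; it must already exist and be owned by the
-- operating-system user that runs PostgreSQL.
CREATE TABLESPACE fastdisk LOCATION '/mnt/fastdisk/pgdata';

CREATE TABLE orders (id SERIAL, placed_on DATE) TABLESPACE fastdisk;
CREATE INDEX orders_placed_on_idx ON orders (placed_on) TABLESPACE pg_default;
```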
Partitioning
• Combine object-oriented tables, CHECK
  clauses, and tablespaces for partitioning
• Example: Invoices from Jan-June go in table
  “q12”, and July-December go in table “q34”
• Now PostgreSQL knows where to look
  when you SELECT from the parent table
• Note that INSERT requires a trigger
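The invoice example might be built roughly as follows. This is a sketch under stated assumptions: the `invoices` columns, the year 2012, and the function and trigger names are made up; tablespaces are left out for brevity.

```sql
-- Parent table: queries go here
CREATE TABLE invoices (id SERIAL, issued_on DATE NOT NULL, amount NUMERIC(10,2));

-- Children: the CHECK clauses tell the planner which rows live where
CREATE TABLE q12 (
    CHECK (issued_on >= DATE '2012-01-01' AND issued_on < DATE '2012-07-01')
) INHERITS (invoices);

CREATE TABLE q34 (
    CHECK (issued_on >= DATE '2012-07-01' AND issued_on < DATE '2013-01-01')
) INHERITS (invoices);

-- Route each INSERT on the parent to the matching child
CREATE OR REPLACE FUNCTION route_invoice() RETURNS TRIGGER AS $$
BEGIN
    IF NEW.issued_on < DATE '2012-07-01' THEN
        INSERT INTO q12 VALUES (NEW.*);
    ELSE
        INSERT INTO q34 VALUES (NEW.*);
    END IF;
    RETURN NULL;   -- the row was stored in a child, so skip the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER invoices_insert
    BEFORE INSERT ON invoices
    FOR EACH ROW
    EXECUTE PROCEDURE route_invoice();

INSERT INTO invoices (issued_on, amount) VALUES ('2012-11-14', 99.50);
SELECT * FROM invoices WHERE issued_on >= DATE '2012-07-01';
```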
Reflection

• pg_catalog schema contains everything
  about your database
  • Tables, functions, views, etc.
• You can learn a great deal about
  PostgreSQL by looking through the
  pg_catalog schema
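A few starting points for exploring the catalog, as a sketch. The schema name and the function-name prefix in the filters are arbitrary choices.

```sql
-- Tables in the public schema
SELECT tablename FROM pg_catalog.pg_tables WHERE schemaname = 'public';

-- Views and their definitions
SELECT viewname, definition FROM pg_catalog.pg_views
 WHERE schemaname = 'public';

-- Functions whose names start with "txid"
SELECT proname FROM pg_catalog.pg_proc WHERE proname LIKE 'txid%';
```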
Advanced uses

• GridSQL: Split a query across multiple
  PostgreSQL servers
• Very large-scale data warehousing:
  Greenplum
Client libraries
• libpq (in C)
• Others by 3rd parties:
• Python
• Ruby
• Perl
• Java (JDBC)
• .NET (npgsql)
• ODBC
• JavaScript (!)
• Just about any language you can imagine
Tools
• Yeah, tools are more primitive
• If you love GUIs, and hate the command
  line, then PostgreSQL will be hard for you
• PgAdmin and other tools are OK, but not
  really up to the task for “real” work
 • PgAdmin does provide some graphical
    query building and “explain” output
Windows compatibility
• It works on Windows
• .NET drivers work, as well
• Logging is far from perfect (can go to the
  Windows log tool, but not filtered well)
• Configuration is still in a text file, foreign to
  most Windows people
• Windows is still a second-class citizen
Who uses it?
• Afilias
• Apple
• BASF
• Cisco
• CD Baby
• Etsy
• IMDB
• Skype
• SourceForge
• Heroku
• Checkpoint
Who supports it?

• EnterpriseDB — products and services
• 2ndQuadrant

• Many freelancers (like me!)
PostgreSQL problems
• Tuning is still hard (but getting easier)
• Double quotes
• Lack of good GUI-based tools
• Some features (e.g., materialized views) that
  people want without having to resort to
  hacks and triggers/rules
• Multi-master (of course!)
Bottom line
• PostgreSQL: BSD licensed, easy to install,
  easy to use, easy to administer
• Still not quite up to commercial databases
  regarding features — but not far behind
• More than good enough for places like
  Skype and Afilias; probably good enough
  for you!
Want to learn more?
• Mailing lists, wikis, and blogs
 • All at http://postgresql.org/
 • http://planetpostgresql.org
• PostgreSQL training, consulting,
  development, hand-holding, and general
  encouragement
Thanks!
(Any questions?)



     reuven@lerner.co.il
   http://www.lerner.co.il/
        054-496-8405
“reuvenlerner” on Skype/AIM

PostgreSQL

  • 1. PostgreSQL Reuven M. Lerner ([email protected]) IL-Techtalks November 14th, 2012
  • 2. Who am I? • Web developer since 1993 • Linux Journal columnist since 1996 • Software architect, developer, consultant • Mostly Ruby on Rails + PostgreSQL, but also Python, PHP, Perl, JavaScript, MySQL, MongoDB, and lots more... • PostgreSQL user since (at least) 1997
  • 3. What do I do? • Web development, especially in Rails • Teaching/training • Coaching/consulting
  • 4. What is a database? Store data confidently Database Retrieve data flexibly
  • 5. Relational databases Define tables, store data in them Database Retrieve data from related tables
  • 6. Lots of options! • Oracle • Microsoft SQL Server • IBM DB2 • MySQL • PostgreSQL
  • 7. How do you choose? • Integrity (ACID compliance) • Data types • Functionality • Tools • Extensibility • Documentation • Community
  • 8. PostgreSQL • Very fast, very scalable. (Just ask Skype.) • Amazingly flexible, easily extensible. • Rock-solid — no crashes, corruption, security issues for years • Ridiculously easy administration • It also happens to be free (MIT/BSD)
  • 15. What about MySQL? • PostgreSQL has many more features • Not nearly as popular as MySQL • No single company behind it • (A good thing, I think!) • After using both, I prefer PostgreSQL • I’ll be happy to answer questions later
  • 16. Brief history • Ingres (Stonebreaker, Berkeley) • Postgres (Stonebreaker, Berkeley) • PostgreSQL project = Postgres + SQL • About one major release per year • Version 8.x — Windows port, recovery • Version 9.0 — hot replication, upgrades
  • 17. ACID • ACID — basic standard for databases • Atomicity • Consistency • Isolation • Durability • Pg has always been ACID compliant
  • 18. Data types • Boolean • Numeric (integer, float, decimal) • (var)char, text (infinitely large), binary • sequences (guaranteed to be unique) • Date/time and time intervals • IP addresses, XML, enums, arrays
  • 20. Or create your own! CREATE TYPE Person AS (first_name TEXT, last_name TEXT);
  • 21. Or create your own! CREATE TYPE Person AS (first_name TEXT, last_name TEXT);
  • 22. Or create your own! CREATE TYPE Person AS (first_name TEXT, last_name TEXT); CREATE TABLE Members (group_id INTEGER, member Person);
  • 23. Strong typing • PostgreSQL won’t automatically change types for you. • This can be annoying at first — but it is meant to protect your data! • You can cast from one type to another with the “cast” function or the :: operator • You can also define your own casts
  • 24. PostGIS • Some people took this all the way • Want to include geographical information? • No problem — we’ve got PostGIS! • Complete GIS solution, with data types and functions • Keeps pace with main PostgreSQL revisions
  • 25. Object oriented tables • Employee table inherits from People table: CREATE TABLE Employee (employee_id INTEGER department_id INTEGER) INHERITS (People);
  • 26. Foreign keys that work CREATE TABLE DVDs (id SERIAL, title TEXT, store_id INTEGER REFERENCES Stores); INSERT INTO DVDs (title, store_id) VALUES ('Attack of the Killer Tomatoes', 500);
  • 27. Foreign keys that work CREATE TABLE DVDs (id SERIAL, title TEXT, store_id INTEGER REFERENCES Stores); INSERT INTO DVDs (title, store_id) VALUES ('Attack of the Killer Tomatoes', 500); ERROR: insert or update on table "dvds" violates foreign key constraint "dvds_store_id_fkey"
  • 28. Foreign keys that work CREATE TABLE DVDs (id SERIAL, title TEXT, store_id INTEGER REFERENCES Stores); INSERT INTO DVDs (title, store_id) VALUES ('Attack of the Killer Tomatoes', 500); ERROR: insert or update on table "dvds" violates foreign key constraint "dvds_store_id_fkey" DETAIL: Key (store_id)=(500) is not present in table "stores".
  • 29. Foreign keys that work CREATE TABLE DVDs (id SERIAL, title TEXT, store_id INTEGER REFERENCES Stores); INSERT INTO DVDs (title, store_id) VALUES ('Attack of the Killer Tomatoes', 500); ERROR: insert or update on table "dvds" violates foreign key constraint "dvds_store_id_fkey" DETAIL: Key (store_id)=(500) is not present in table "stores". ERROR: insert or update on table "dvds" violates foreign key constraint "dvds_store_id_fkey"
  • 30. Foreign keys that work CREATE TABLE DVDs (id SERIAL, title TEXT, store_id INTEGER REFERENCES Stores); INSERT INTO DVDs (title, store_id) VALUES ('Attack of the Killer Tomatoes', 500); ERROR: insert or update on table "dvds" violates foreign key constraint "dvds_store_id_fkey" DETAIL: Key (store_id)=(500) is not present in table "stores". ERROR: insert or update on table "dvds" violates foreign key constraint "dvds_store_id_fkey" DETAIL: Key (store_id)=(500) is not present in table "stores".
  • 31. Custom validity checks CREATE TABLE DVDs (id SERIAL, title TEXT check (length(title) > 3), store_id INTEGER REFERENCES Stores); INSERT INTO DVDs (title, store_id) VALUES ('AB', 500);
  • 32. Custom validity checks CREATE TABLE DVDs (id SERIAL, title TEXT check (length(title) > 3), store_id INTEGER REFERENCES Stores); INSERT INTO DVDs (title, store_id) VALUES ('AB', 500); ERROR: new row for relation "dvds" violates check constraint "dvds_title_check"
  • 33. No more bad dates! INSERT INTO UPDATES (created_at) values ('32- feb-2008');
  • 34. No more bad dates! INSERT INTO UPDATES (created_at) values ('32- feb-2008'); ERROR: date/time field value out of range: "32-feb-2008"
  • 35. No more bad dates! INSERT INTO UPDATES (created_at) values ('32- feb-2008'); ERROR: date/time field value out of range: "32-feb-2008" LINE 1: insert into updates (created_at) values ('32- feb-2008');
  • 36. Timestamp vs. Interval testdb=# select now(); now ------------------------------- 2010-10-31 08:58:23.365792+02 (1 row) Point in time testdb=# select now() - interval '3 days'; ?column? ------------------------------- 2010-10-28 08:58:28.870011+02 Difference between (1 row) points in time
  • 37. Built-in functions • Math • Text processing (including regexps) • Date/time calculations • Conditionals (CASE, COALESCE, NULLIF) for use in queries • Extensive library of geometrical functions
  • 38. Or write your own! • PL/pgSQL • PL/Perl • PL/Python • PL/Ruby • PL/R • PL/Tcl
  • 39. Or write your own! CREATE OR REPLACE FUNCTION remove_cache_tables() RETURNS VOID AS $$ DECLARE r pg_catalog.pg_tables%rowtype; BEGIN FOR r IN SELECT * FROM pg_catalog.pg_tables WHERE schemaname = 'public' AND tablename ILIKE 'cache_%' LOOP RAISE NOTICE 'Now dropping table %', r.tablename; EXECUTE 'DROP TABLE ' || r.tablename; END LOOP; END; $$ LANGUAGE 'plpgsql';
  • 40. Another example CREATE OR REPLACE FUNCTION store_hostname() RETURNS TRIGGER AS $store_hostname$ BEGIN NEW.hostname := 'http://' || substring(NEW.url, '(?:http://)?([^/]+)'); RETURN NEW; END; $store_hostname$ LANGUAGE plpgsql;
  • 41. Triggers • Yes, that last function was a trigger • Automatically execute functions upon INSERT, UPDATE, and/or DELETE • Can execute before or after • Very powerful, very fast
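A trigger function does nothing until it is attached to a table. The sketch below assumes the store_hostname() function from slide 40 has already been created; the pages table and the trigger name are hypothetical, chosen only for this example.

```sql
-- Hypothetical table; only the url and hostname columns matter here.
CREATE TABLE pages (
    id       SERIAL PRIMARY KEY,
    url      TEXT NOT NULL,
    hostname TEXT
);

-- Run store_hostname() before each row is inserted or updated,
-- so the function can fill in NEW.hostname before the row is stored.
CREATE TRIGGER store_hostname_trigger
    BEFORE INSERT OR UPDATE ON pages
    FOR EACH ROW
    EXECUTE PROCEDURE store_hostname();

INSERT INTO pages (url) VALUES ('http://www.postgresql.org/docs/');
SELECT url, hostname FROM pages;
```

Because the trigger fires BEFORE the write, the function's changes to NEW are what end up in the table; an AFTER trigger could not modify the row this way.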
  • 42. Function possibilities • Computing values, strings • Returning table-like sets of values • Encapsulating queries • Dynamically generating queries via strings • Triggers: Modifying data before it is inserted or updated
  • 43. Why use a PL/lang? • Other libraries (e.g., CPAN for Perl) • Faster, optimized functions (e.g., R) • Programmer familiarity • Cached query plans
  • 44. Views and rules • Views are stored SELECT statements • Pretend that something is a read-only table • Rules let you turn it into a read/write table • Intercept and rewrite incoming query • Check or change data • Change where data is stored
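As a rough sketch of the view-plus-rule idea above, the following defines a view and then a rule that redirects INSERTs on the view to the underlying table. The movies table, the view, and the rule name are all made up for the example.

```sql
CREATE TABLE movies (
    id       SERIAL PRIMARY KEY,
    title    TEXT,
    in_stock BOOLEAN DEFAULT true
);

-- A view is a stored SELECT that can be queried like a table.
CREATE VIEW available_movies AS
    SELECT id, title FROM movies WHERE in_stock;

-- The rule intercepts INSERTs on the view and rewrites them
-- into INSERTs on the base table.
CREATE RULE available_movies_insert AS
    ON INSERT TO available_movies
    DO INSTEAD
    INSERT INTO movies (title) VALUES (NEW.title);

INSERT INTO available_movies (title) VALUES ('Attack of the Killer Tomatoes');
SELECT * FROM available_movies;
```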
  • 45. Full-text indexing • Built into PostgreSQL • Handles stop words, different languages, synonyms, and even (often) stemming • Very powerful, but it can take some time to get configured correctly
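A minimal sketch of full-text search, assuming the built-in 'english' configuration; the articles table and index name are hypothetical.

```sql
-- to_tsvector normalizes a document and to_tsquery normalizes a search;
-- the 'english' configuration applies English stop words and stemming.
SELECT to_tsvector('english', 'The databases were running quickly')
       @@ to_tsquery('english', 'database & run') AS matches;

-- Hypothetical table with a GIN index over the normalized text.
CREATE TABLE articles (id SERIAL PRIMARY KEY, body TEXT);
CREATE INDEX articles_body_fts_idx
    ON articles USING gin (to_tsvector('english', body));

-- The query must use the same expression as the index to benefit from it.
SELECT id
FROM articles
WHERE to_tsvector('english', body) @@ to_tsquery('english', 'postgres & index');
```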
  • 46. Transactions • In PostgreSQL from the beginning • Use transactions for just about anything: BEGIN; DROP TABLE DVDs; ROLLBACK; SELECT * FROM DVDs; -- Works!
  • 47. Savepoints (or, sub-transactions) BEGIN; INSERT INTO table1 VALUES (1); SAVEPOINT my_savepoint; INSERT INTO table1 VALUES (2); ROLLBACK TO SAVEPOINT my_savepoint; INSERT INTO table1 VALUES (3); COMMIT;
  • 48. MVCC • Readers and writers don’t block each other • “Multi-version concurrency control” • xmin, xmax on each tuple; rows are those tuples with txid_current between them • Old versions stick around until vacuumed • Autovacuum removes even this issue
  • 49. MVCC • Look at a row’s xmin and xmax • Look at txid_current() • Start transaction; look at row’s xmin/xmax • Look at xmin/xmax on that row from another session • Commit, and look again at both!
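The steps on this slide can be tried roughly as follows. The mvcc_demo table is a throwaway name invented for the exercise, and the second session has to be opened by hand in another psql window.

```sql
CREATE TABLE mvcc_demo (id INTEGER, note TEXT);
INSERT INTO mvcc_demo VALUES (1, 'first version');

-- xmin and xmax are hidden system columns, so they must be named explicitly.
SELECT xmin, xmax, * FROM mvcc_demo;
SELECT txid_current();

BEGIN;
UPDATE mvcc_demo SET note = 'second version' WHERE id = 1;
-- This session sees the new tuple, whose xmin is the current transaction.
SELECT xmin, xmax, * FROM mvcc_demo;
-- In a second session, run the same SELECT now: it still shows the old
-- tuple, with xmax set to the ID of this uncommitted transaction.
COMMIT;

-- After the commit, both sessions see the new tuple.
SELECT xmin, xmax, * FROM mvcc_demo;
```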
  • 50. Downsides of MVCC • MVCC is usually fantastic • But if you insert or update many rows, and then do a COUNT(*), things will be slow • There are solutions — including more aggressive auto-vacuuming • 9.2 introduced features that improved this
  • 51. Indexing • Regular, unique indexes • Functional indexes • Index calling a function on a column • Partial indexes • Index only rows matching criteria • Cluster table on an index
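Here is a short sketch of the index types listed above. The people table and the index names are hypothetical; people_pkey is the name PostgreSQL gives the primary-key index by default.

```sql
CREATE TABLE people (
    id        SERIAL PRIMARY KEY,
    email     TEXT NOT NULL,
    is_active BOOLEAN NOT NULL DEFAULT true
);

-- Functional index: supports queries such as WHERE lower(email) = '...'
-- and, being UNIQUE, enforces case-insensitive uniqueness.
CREATE UNIQUE INDEX people_lower_email_idx ON people (lower(email));

-- Partial index: only rows matching the WHERE clause are indexed.
CREATE INDEX people_active_email_idx ON people (email) WHERE is_active;

-- Physically reorder the table according to an index.
-- This is a one-time operation; later writes are not kept in order.
CLUSTER people USING people_pkey;
```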
  • 52. CTEs • Adds a “WITH” statement, which defines a sorta-kinda temp table • You can then query that same temp table • Makes many queries easier to read, write, without a real temp table • Better yet: CTEs can be recursive, for everything from Fibonacci to org charts
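To illustrate the org-chart case mentioned above, here is a sketch of a recursive CTE. The employees table and its three rows are invented for the example.

```sql
CREATE TABLE employees (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    manager_id INTEGER REFERENCES employees
);

INSERT INTO employees VALUES
    (1, 'Alice', NULL),
    (2, 'Bob',   1),
    (3, 'Carol', 2);

WITH RECURSIVE org_chart AS (
    -- Anchor: people with no manager sit at the top.
    SELECT id, name, manager_id, 1 AS depth
    FROM employees
    WHERE manager_id IS NULL
  UNION ALL
    -- Recursive step: add everyone who reports to someone already found.
    SELECT e.id, e.name, e.manager_id, oc.depth + 1
    FROM employees e
    JOIN org_chart oc ON e.manager_id = oc.id
)
SELECT depth, name FROM org_chart ORDER BY depth;
```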
  • 53. Speed and scalability • MVCC + a smart query optimizer makes PostgreSQL pretty fast and smart • Statistics based on previous query results inform the query planner • Several scan types, join types are weighed • Benchmarks consistently show excellent performance with high mixes of read/write
  • 54. WAL • All activity in the database is put in “write-ahead logs” before it happens • If the database server fails, it replays the WALs, then continues • You can change how often WALs are written, to improve performance • PITR: restore database from WALs
  • 55. Log shipping • Copy WALs to a second, identical server — known as “log shipping” — and you have a backup • If the primary server goes down, you can bring the secondary up in its place • This was known as “warm standby,” and worked in 8.4
  • 56. Hot standby, streaming replication • As of 9.0, you don’t have to do this • You can have the primary stream the information to the secondary • Almost-instant updates • The secondary machine can answer read-only queries (“hot standby”), not just handle failover
  • 57. Extensions • Provides a standardized mechanism for downloading, installing, and versioning extensions • New data types, functions, languages are possible • Download, search via pgxn.org • Similar to CPAN, PyPI, or Ruby gems
  • 58. SQL/MED • SQL/MED was introduced in 9.1 • Query information from other databases (and database-like interfaces) • So if you have data in MySQL, Oracle, CSV ... just install a wrapper, and you can query it like a PostgreSQL table
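As one possible illustration, the file_fdw wrapper from PostgreSQL's contrib modules exposes a CSV file as a table. The server name, table name, column layout, and file path below are all assumptions made for the sketch, and the statements generally need superuser rights.

```sql
-- file_fdw ships with PostgreSQL's contrib modules.
CREATE EXTENSION file_fdw;

CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;

-- Hypothetical file path and column layout; adjust both to the real file.
CREATE FOREIGN TABLE imported_stores (
    id   INTEGER,
    name TEXT
) SERVER csv_files
  OPTIONS (filename '/tmp/stores.csv', format 'csv', header 'true');

-- The foreign table can now be queried like any other table.
SELECT * FROM imported_stores WHERE name ILIKE 'a%';
```

Wrappers for other sources (MySQL, Oracle, and so on) follow the same pattern of server plus foreign table, but each has its own options.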
  • 59. Unlogged tables • All actions are logged in WALs • That adds some overhead, which isn’t required by throwaway data • Unlogged tables (different from temp tables!) offer a speedup, in exchange for less reliability
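Creating an unlogged table only takes one extra keyword. The page_cache table below is a hypothetical example of throwaway data.

```sql
-- Writes to this table skip the WAL, which makes them faster,
-- but its contents are emptied after a crash and are not replicated.
CREATE UNLOGGED TABLE page_cache (
    cache_key TEXT PRIMARY KEY,
    body      TEXT,
    cached_at TIMESTAMP WITH TIME ZONE DEFAULT now()
);
```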
  • 60. New in 9.2 • JSON support • Range types, for handling • Much more scalable — from 24 cores and 75k queries/sec to 64 cores and 350k queries/sec • Index-only queries (“covering indexes”) • Cascading replication
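A quick, hedged taste of two of the 9.2 features named above, range types and JSON, using only literal values; the column aliases are made up.

```sql
-- Range types: containment (@>) and overlap (&&) operators.
SELECT int4range(1, 10) @> 5 AS contains_five;
SELECT daterange('2012-01-01', '2012-07-01')
       && daterange('2012-06-01', '2012-09-01') AS ranges_overlap;

-- JSON: turn a row into a JSON object.
SELECT row_to_json(t)
FROM (SELECT 1 AS id, 'Attack of the Killer Tomatoes'::text AS title) AS t;
```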
  • 61. Web problems • PostgreSQL is great as a Web backend • But if you use an ORM (e.g., ActiveRecord), you are probably losing much of the power • e.g., foreign keys, CTE, triggers, and views • No good way to bridge this gap — for now • There are always methods, but this is an area that definitely needs some work
  • 62. Tablespaces • You can create any number of “tablespaces,” separate storage areas • Put tables, indexes on different tablespaces • Most useful with multiple disks • Separate tables (or parts of a partitioned table)... or separate tables from indexes
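A sketch of how this looks in practice. The tablespace name, directory, and table are hypothetical; the directory must already exist, be empty, and be owned by the operating-system user that runs PostgreSQL.

```sql
-- Hypothetical location on a separate disk.
CREATE TABLESPACE fast_disk LOCATION '/mnt/fast_disk/pgdata';

-- Store the table on the new tablespace...
CREATE TABLE big_invoices (
    id     SERIAL,
    amount NUMERIC(10,2)
) TABLESPACE fast_disk;

-- ...and keep one of its indexes on the default tablespace.
CREATE INDEX big_invoices_amount_idx
    ON big_invoices (amount) TABLESPACE pg_default;
```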
  • 63. Partitioning • Combine object-oriented tables, CHECK clauses, and tablespaces for partitioning • Example: Invoices from Jan-June go in table “q12”, and July-December go in table “q34” • Now PostgreSQL knows where to look when you SELECT from the parent table • Note that INSERT requires a trigger
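The invoice example might be sketched like this, using table inheritance, CHECK clauses, and an INSERT trigger. Column names, the year 2012, and the function and trigger names are assumptions; tablespaces are left out to keep it short, and the routing logic only handles a single year.

```sql
CREATE TABLE invoices (
    id        SERIAL,
    issued_on DATE NOT NULL,
    amount    NUMERIC(10,2)
);

-- Child tables inherit the parent's columns; the CHECK clauses tell
-- the planner which rows each child can contain.
CREATE TABLE q12 (
    CHECK (issued_on >= DATE '2012-01-01' AND issued_on < DATE '2012-07-01')
) INHERITS (invoices);

CREATE TABLE q34 (
    CHECK (issued_on >= DATE '2012-07-01' AND issued_on < DATE '2013-01-01')
) INHERITS (invoices);

-- Route each new row to the matching child table.
CREATE OR REPLACE FUNCTION invoices_insert_router() RETURNS TRIGGER AS $$
BEGIN
    IF NEW.issued_on < DATE '2012-07-01' THEN
        INSERT INTO q12 VALUES (NEW.*);
    ELSE
        INSERT INTO q34 VALUES (NEW.*);
    END IF;
    RETURN NULL;  -- the row is stored in a child, so skip the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER invoices_insert_trigger
    BEFORE INSERT ON invoices
    FOR EACH ROW EXECUTE PROCEDURE invoices_insert_router();

INSERT INTO invoices (issued_on, amount) VALUES ('2012-03-15', 100.00);

-- Selecting from the parent also searches the children.
SELECT * FROM invoices WHERE issued_on = DATE '2012-03-15';
```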
  • 64. Reflection • pg_catalog schema contains everything about your database • Tables, functions, views, etc. • You can learn a great deal about PostgreSQL by looking through the pg_catalog schema
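Two small queries against pg_catalog give a feel for what is in there. The choice of the public schema in the second query is just an assumption about where your functions live.

```sql
-- List tables outside the system schemas.
SELECT schemaname, tablename
FROM pg_catalog.pg_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY schemaname, tablename;

-- List functions in the public schema, with their argument lists.
SELECT p.proname,
       pg_catalog.pg_get_function_arguments(p.oid) AS arguments
FROM pg_catalog.pg_proc p
JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
WHERE n.nspname = 'public'
ORDER BY p.proname;
```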
  • 65. Advanced uses • GridSQL: Split a query across multiple PostgreSQL servers • Very large-scale data warehousing: Greenplum
  • 66. Client libraries • libpq (in C) • Java (JDBC) • .NET (npgsql) • ODBC • JavaScript (!) • Just about any language you can imagine • Others by 3rd parties: • Python • Ruby • Perl
  • 67. Tools • Yeah, tools are more primitive • If you love GUIs, and hate the command line, then PostgreSQL will be hard for you • PgAdmin and other tools are OK, but not really up to the task for “real” work • PgAdmin does provide some graphical query building and “explain” output
  • 68. Windows compatibility • It works on Windows • .NET drivers work, as well • Logging is far from perfect (can go to the Windows log tool, but not filtered well) • Configuration is still in a text file, foreign to most Windows people • Windows is still a second-class citizen
  • 69. Who uses it? • Afilias • IMDB • Apple • Skype • BASF • Sourceforge • Cisco • Heroku • CD Baby • Checkpoint • Etsy
  • 70. Who supports it? • EnterpriseDB: products and services • 2ndQuadrant • Many freelancers (like me!)
  • 71. PostgreSQL problems • Tuning is still hard (but getting easier) • Double quotes • Lack of good GUI-based tools • Some features that people want (e.g., materialized views) still require hacks and triggers/rules • Multi-master (of course!)
  • 72. Bottom line • PostgreSQL: BSD licensed, easy to install, easy to use, easy to administer • Still not quite up to commercial databases regarding features, but not far behind • More than good enough for places like Skype and Afilias; probably good enough for you!
  • 73. Want to learn more? • Mailing lists, wikis, and blogs • All at http://postgresql.org/ • http://planetpostgresql.org • PostgreSQL training, consulting, development, hand-holding, and general encouragement
  • 74. Thanks! (Any questions?) reuven@lerner.co.il http://www.lerner.co.il/ 054-496-8405 “reuvenlerner” on Skype/AIM

Editor's Notes