PostgreSQL and RAM Uses
27 Feb 2017
The Skiff, Brighton
One fine day early in the morning
One fine day
DB Log:
LOG: server process (PID 18742) was terminated by signal 9: Killed
DETAIL: Failed process was running: some query here
LOG: terminating any other active server processes
FATAL: the database system is in recovery mode
...
LOG: database system is ready to accept connections
Syslog:
Out of memory: Kill process 18742 (postgres) score 669 or sacrifice child
Killed process 18742 (postgres) total-vm:5670864kB, anon-rss:5401060kB, file-rss:1428kB
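The "Killed process" line already states how much memory the backend held when it was killed. A minimal Python sketch for pulling those numbers out of such a line follows; the function name `parse_oom_kill` is hypothetical, and the regular expression assumes exactly the message layout shown above, which differs between kernel versions.

```python
import re

# Matches the "Killed process" layout shown above (assumption: other kernel
# versions may add or reorder fields, in which case this will not match).
OOM_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)

def parse_oom_kill(line):
    """Return pid, process name and sizes in kB, or None if the line does not match."""
    m = OOM_RE.search(line)
    if m is None:
        return None
    return {k: (v if k == "name" else int(v)) for k, v in m.groupdict().items()}

line = ("Killed process 18742 (postgres) total-vm:5670864kB, "
        "anon-rss:5401060kB, file-rss:1428kB")
info = parse_oom_kill(line)
print(info["pid"], info["name"], info["anon_rss"] // 1024, "MB anonymous RSS")
```

Here anon-rss dominates, so the memory was almost entirely anonymous rather than file-backed.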
How to avoid such a scenario?
Outline
What are postgres server processes?
What processes use much RAM and why?
Shared memory
Backends and their bgworkers
What queries require much RAM?
What execution plan nodes might require much RAM?
Nodes: stream-like
Nodes: controlled
Some of the other nodes actively use RAM but control the amount used. They have a fallback behaviour to switch to if they realise they cannot fit work_mem.
- Sort node switches from quicksort to sort-on-disk
- CTE and materialize nodes use temporary files if needed
- Group Aggregation with DISTINCT keyword can use temporary files
Beware of out of disk space problems.
Also
- Exact Bitmap Scan falls back to Lossy Bitmap Scan
- Hash Join switches to batchwise processing if it encounters more data than expected
Nodes: unsafe
Hash Agg, hashed SubPlan and (rarely) Hash Join can use an unlimited amount of RAM.
The optimizer normally avoids them when it estimates that they will process huge sets, but its estimates can easily be wrong.
Unsafe nodes: hashed SubPlan
The backend used 60MB of RAM while work_mem was only 4MB.
Unsafe nodes: hashed SubPlan and partitioned table
Unsafe nodes: Hash Aggregation
Unsafe nodes: Hash Join
Hash Joins can use more memory than expected if there are many
collisions on the hashed side:
postgres=# explain (analyze, costs off)
postgres-# select * from t t1 join t t2 on t1.b = t2.b where t1.a = 1;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Hash Join (actual time=873.321..4223.080 rows=1000000 loops=1)
   Hash Cond: (t2.b = t1.b)
   ->  Seq Scan on t t2 (actual time=0.048..755.195 rows=10500000 loops=1)
   ->  Hash (actual time=873.163..873.163 rows=500000 loops=1)
         Buckets: 131072 (originally 1024)  Batches: 8 (originally 1)  Memory Usage: 3465kB
         ->  Seq Scan on t t1 (actual time=748.700..803.665 rows=500000 loops=1)
               Filter: (a = 1)
               Rows Removed by Filter: 10000000
Unsafe nodes: array_agg
Before a fix in PostgreSQL 9.5, array_agg used at least 1kB per array.
How do we measure the amount of RAM used?
top? ps? htop? atop? No. They show private and shared memory together.
smaps
. . . or this
....
7f8ce656a000-7f8cef300000 rw-s 00000000 00:04 7334558 /dev/zero (deleted)
Size: 144984 kB
Rss: 75068 kB
Pss: 38025 kB
Shared_Clean: 0 kB
Shared_Dirty: 73632 kB
Private_Clean: 0 kB
Private_Dirty: 1436 kB
Referenced: 75068 kB
Anonymous: 0 kB
AnonHugePages: 0 kB
Swap: 0 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Locked: 0 kB
VmFlags: rd wr sh mr mw me ms sd
....
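An smaps entry like the one above is easy to process mechanically. The following is a minimal Python sketch, not a complete parser: the function name `parse_smaps` is hypothetical, the sample is abridged from the entry above, and only numeric "kB" fields are kept.

```python
import re

# Header line of a mapping: "start-end perms offset dev inode [path]"
HEADER_RE = re.compile(r"^[0-9a-f]+-[0-9a-f]+ ")
# Field line: "Name:   123 kB"
FIELD_RE = re.compile(r"^(\w+):\s+(\d+) kB$")

def parse_smaps(text):
    """Split smaps text into mappings: [{'header': str, 'fields': {name: kB}}]."""
    mappings = []
    for line in text.splitlines():
        line = line.strip()
        if HEADER_RE.match(line):
            mappings.append({"header": line, "fields": {}})
        elif mappings:
            m = FIELD_RE.match(line)
            if m:  # non-numeric lines such as VmFlags are skipped
                mappings[-1]["fields"][m.group(1)] = int(m.group(2))
    return mappings

sample = """\
7f8ce656a000-7f8cef300000 rw-s 00000000 00:04 7334558 /dev/zero (deleted)
Size: 144984 kB
Rss: 75068 kB
Pss: 38025 kB
Shared_Dirty: 73632 kB
Private_Dirty: 1436 kB
VmFlags: rd wr sh mr mw me ms sd
"""
entry = parse_smaps(sample)[0]
shared = "s" in entry["header"].split()[1]  # perms field, here "rw-s"
print(shared, entry["fields"]["Rss"], entry["fields"]["Pss"])
```

The trailing "s" in the permissions field marks a shared mapping, which is why Rss here is mostly Shared_Dirty rather than private memory.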
smaps: PSS
$$\sum_{pid} PSS(pid) = \text{total memory used!}$$
PSS support was added to Linux kernel in 2007, but I’m not aware of
a task manager able to display it or sort processes by it.
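In the absence of such a tool, the sum can be computed by hand. Below is a minimal Python sketch, assuming a Linux /proc filesystem; the function names `pss_kb` and `total_pss_kb` are hypothetical, and how the list of postgres pids is obtained is left open.

```python
import os

def pss_kb(smaps_text):
    """Sum the 'Pss:' lines (in kB) of one process's smaps text."""
    total = 0
    for line in smaps_text.splitlines():
        parts = line.split()
        # Exact match on "Pss:" so that other fields are not counted.
        if len(parts) >= 2 and parts[0] == "Pss:":
            total += int(parts[1])
    return total

def total_pss_kb(pids, proc="/proc"):
    """Add up PSS over the given pids; skip unreadable or vanished processes."""
    total = 0
    for pid in pids:
        try:
            with open(os.path.join(proc, str(pid), "smaps")) as f:
                total += pss_kb(f.read())
        except OSError:
            continue
    return total

if __name__ == "__main__":
    # Demonstration on the current process only.
    print(total_pss_kb([os.getpid()]), "kB")
```

Reading the smaps file of another user's process usually requires elevated privileges, so a real run over all postgres processes would likely need to execute as the postgres user or as root.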
smaps: Private
smaps: Private from psql
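You can even get the amount of private memory used by a backend from within the backend itself, using SQL:
do $do$
declare
    -- Shell pipeline: sum the Private_* lines (in kB) of this backend's
    -- smaps and print the total in bytes.
    l_command text :=
        $p$ cat /proc/$p$ || pg_backend_pid() || $p$/smaps $p$ ||
        $p$ | grep '^Private' $p$ ||
        $p$ | awk '{a+=$2}END{print a * 1024}' $p$;
begin
    -- bigint: a byte count above 2GB would overflow int
    create temp table if not exists z (a bigint);
    -- copy ... from program runs on the server and needs superuser rights
    execute 'copy z from program ' || quote_literal(l_command);
    raise notice '%', (select pg_size_pretty(sum(a)) from z);
    truncate z;
end;
$do$;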
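The same arithmetic can be done outside the database. This is a minimal Python sketch of the grep and awk steps used above, assuming a Linux /proc filesystem; the function names `private_bytes` and `backend_private_bytes` are hypothetical, and the sample values are taken from the smaps entry shown earlier.

```python
def private_bytes(smaps_text):
    """Sum every line starting with 'Private' (values in kB); return bytes."""
    total_kb = 0
    for line in smaps_text.splitlines():
        if line.startswith("Private"):
            total_kb += int(line.split()[1])
    return total_kb * 1024

def backend_private_bytes(pid):
    """Read /proc/<pid>/smaps for a given backend pid (Linux only)."""
    with open(f"/proc/{pid}/smaps") as f:
        return private_bytes(f.read())

sample = "Shared_Dirty: 73632 kB\nPrivate_Clean: 0 kB\nPrivate_Dirty: 1436 kB\n"
print(private_bytes(sample), "bytes")
```

The pid to pass to `backend_private_bytes` would be the value returned by pg_backend_pid() in the session of interest.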
How is allocated RAM reclaimed?
The threshold for deciding which mechanism to use is not fixed either. It is initially 128kB, but on Linux glibc's malloc raises it adaptively, up to 32MB, depending on the process's previous allocation history.
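The adaptive behaviour can be illustrated with a toy model. This Python sketch assumes the decision in question is glibc malloc's choice between mmap and the heap, and that the threshold grows when an mmap-served block larger than the current threshold is freed. The class `ThresholdModel` is hypothetical and ignores chunk overhead and all other allocator details.

```python
INITIAL_THRESHOLD = 128 * 1024      # initial value named above
MAX_THRESHOLD = 32 * 1024 * 1024    # upper bound named above

class ThresholdModel:
    """Toy model: requests at or above the threshold go to mmap, others to the heap."""

    def __init__(self):
        self.threshold = INITIAL_THRESHOLD

    def alloc(self, size):
        """Return 'mmap' or 'heap' for a request of `size` bytes."""
        return "mmap" if size >= self.threshold else "heap"

    def free(self, size, how):
        # Assumed rule: freeing an mmap-served block bigger than the current
        # threshold (and within the cap) raises the threshold to its size.
        if how == "mmap" and self.threshold < size <= MAX_THRESHOLD:
            self.threshold = size

m = ThresholdModel()
first = m.alloc(1024 * 1024)    # 1MB request while the threshold is 128kB
m.free(1024 * 1024, first)      # freeing it raises the threshold to 1MB
second = m.alloc(512 * 1024)    # a 512kB request is now below the threshold
print(first, second, m.threshold)
```

Under these assumptions, a process that once allocated and freed a large block will afterwards serve medium-sized requests from the heap rather than from separate mappings.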
Questions?
Relevant ads everywhere:
Used 4GB+4GB laptop DDR2 for sale, £64.95 only.
For your postgres never to run OOM!