Commit e6c039d

committed Mar 28, 2018
Add documentation for the JIT feature.
As promised in earlier commits, this adds documentation about the new build options, the new GUCs, about the planner logic when JIT is used, and the benefits of JIT in general.

Also adds a more implementation oriented README.

I'm sure we're going to want to expand this further, but I think this is a reasonable start.

Author: Andres Freund, with contributions by Thomas Munro
Reviewed-By: Thomas Munro
Discussion: https://postgr.es/m/20170901064131.tazjxwus3k2w3ybh@alap3.anarazel.de
1 parent 1f0c6a9 commit e6c039d

9 files changed

+844
-2
lines changed


‎doc/src/sgml/acronyms.sgml

Lines changed: 10 additions & 0 deletions
@@ -369,6 +369,16 @@
369369
</listitem>
370370
</varlistentry>
371371

372+
<varlistentry>
373+
<term><acronym>JIT</acronym></term>
374+
<listitem>
375+
<para>
376+
<ulink url="https://en.wikipedia.org/wiki/Just-in-time_compilation">Just-in-Time
377+
compilation</ulink>
378+
</para>
379+
</listitem>
380+
</varlistentry>
381+
372382
<varlistentry>
373383
<term><acronym>JSON</acronym></term>
374384
<listitem>

‎doc/src/sgml/config.sgml

Lines changed: 182 additions & 1 deletion
@@ -4136,6 +4136,62 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
41364136
</listitem>
41374137
</varlistentry>
41384138

4139+
4140+
<varlistentry id="guc-jit-above-cost" xreflabel="jit_above_cost">
4141+
<term><varname>jit_above_cost</varname> (<type>floating point</type>)
4142+
<indexterm>
4143+
<primary><varname>jit_above_cost</varname> configuration parameter</primary>
4144+
</indexterm>
4145+
</term>
4146+
<listitem>
4147+
<para>
4148+
Sets the planner's cutoff above which JIT compilation is used as part
4149+
of query execution (see <xref linkend="jit"/>). Performing
4150+
<acronym>JIT</acronym> costs time but can accelerate query execution.
4151+
4152+
The default is <literal>100000</literal>.
4153+
</para>
4154+
</listitem>
4155+
</varlistentry>
4156+
4157+
<varlistentry id="guc-jit-optimize-above-cost" xreflabel="jit_optimize_above_cost">
4158+
<term><varname>jit_optimize_above_cost</varname> (<type>floating point</type>)
4159+
<indexterm>
4160+
<primary><varname>jit_optimize_above_cost</varname> configuration parameter</primary>
4161+
</indexterm>
4162+
</term>
4163+
<listitem>
4164+
<para>
4165+
Sets the planner's cutoff above which JIT compiled programs (see <xref
4166+
linkend="guc-jit-above-cost"/>) are optimized. Optimization initially
4167+
takes time, but can improve execution speed. It is not meaningful to
4168+
set this to a lower value than <xref linkend="guc-jit-above-cost"/>.
4169+
4170+
The default is <literal>500000</literal>.
4171+
</para>
4172+
</listitem>
4173+
</varlistentry>
4174+
4175+
<varlistentry id="guc-jit-inline-above-cost" xreflabel="jit_inline_above_cost">
4176+
<term><varname>jit_inline_above_cost</varname> (<type>floating point</type>)
4177+
<indexterm>
4178+
<primary><varname>jit_inline_above_cost</varname> configuration parameter</primary>
4179+
</indexterm>
4180+
</term>
4181+
<listitem>
4182+
<para>
4183+
Sets the planner's cutoff above which JIT compiled programs (see <xref
4184+
linkend="guc-jit-above-cost"/>) attempt to inline functions and
4185+
operators. Inlining initially takes time, but can improve execution
4186+
speed. It is unlikely to be beneficial to set
4187+
<varname>jit_inline_above_cost</varname> below
4188+
<varname>jit_optimize_above_cost</varname>.
4189+
4190+
The default is <literal>500000</literal>.
4191+
</para>
4192+
</listitem>
4193+
</varlistentry>
4194+
41394195
</variablelist>
41404196

41414197
</sect2>
@@ -4418,6 +4474,23 @@ SELECT * FROM parent WHERE key = 2400;
44184474
</listitem>
44194475
</varlistentry>
44204476

4477+
<varlistentry id="guc-jit" xreflabel="jit">
4478+
<term><varname>jit</varname> (<type>boolean</type>)
4479+
<indexterm>
4480+
<primary><varname>jit</varname> configuration parameter</primary>
4481+
</indexterm>
4482+
</term>
4483+
<listitem>
4484+
<para>
4485+
Determines whether <acronym>JIT</acronym> may be used by
4486+
<productname>PostgreSQL</productname>, if available (see <xref
4487+
linkend="jit"/>).
4488+
4489+
The default is <literal>on</literal>.
4490+
</para>
4491+
</listitem>
4492+
</varlistentry>
4493+
44214494
<varlistentry id="guc-join-collapse-limit" xreflabel="join_collapse_limit">
44224495
<term><varname>join_collapse_limit</varname> (<type>integer</type>)
44234496
<indexterm>
@@ -7412,6 +7485,29 @@ SET XML OPTION { DOCUMENT | CONTENT };
74127485
</note>
74137486
</listitem>
74147487
</varlistentry>
7488+
7489+
<varlistentry id="guc-jit-provider" xreflabel="jit_provider">
7490+
<term><varname>jit_provider</varname> (<type>string</type>)
7491+
<indexterm>
7492+
<primary><varname>jit_provider</varname> configuration parameter</primary>
7493+
</indexterm>
7494+
</term>
7495+
<listitem>
7496+
<para>
7497+
Determines which JIT provider (see <xref linkend="jit-extensibility"/>) is
7498+
used. The built-in default is <literal>llvmjit</literal>.
7499+
</para>
7500+
<para>
7501+
If set to a non-existent library, <acronym>JIT</acronym> will not be
7502+
available, but no error will be raised. This allows JIT support to be
7503+
installed separately from the main
7504+
<productname>PostgreSQL</productname> package.
7505+
7506+
This parameter can only be set at server start.
7507+
</para>
7508+
</listitem>
7509+
</varlistentry>
7510+
74157511
</variablelist>
74167512
</sect2>
74177513

@@ -8658,7 +8754,92 @@ LOG: CleanUpLock: deleting: lock(0xb7acd844) id(24688,24696,0,0,0,1)
86588754
</para>
86598755
</listitem>
86608756
</varlistentry>
8661-
</variablelist>
8757+
8758+
<varlistentry id="guc-jit-debugging-support" xreflabel="jit_debugging_support">
8759+
<term><varname>jit_debugging_support</varname> (<type>boolean</type>)
8760+
<indexterm>
8761+
<primary><varname>jit_debugging_support</varname> configuration parameter</primary>
8762+
</indexterm>
8763+
</term>
8764+
<listitem>
8765+
<para>
8766+
If LLVM has the required functionality, register generated functions
8767+
with <productname>GDB</productname>. This makes debugging easier.
8768+
8769+
The default setting is <literal>off</literal>. This parameter can only be
8770+
set at server start.
8771+
</para>
8772+
</listitem>
8773+
</varlistentry>
8774+
8775+
<varlistentry id="guc-jit-dump-bitcode" xreflabel="jit_dump_bitcode">
8776+
<term><varname>jit_dump_bitcode</varname> (<type>boolean</type>)
8777+
<indexterm>
8778+
<primary><varname>jit_dump_bitcode</varname> configuration parameter</primary>
8779+
</indexterm>
8780+
</term>
8781+
<listitem>
8782+
<para>
8783+
Writes the generated <productname>LLVM</productname> IR out to the
8784+
filesystem, inside <xref linkend="guc-data-directory"/>. This is only
8785+
useful for working on the internals of the JIT implementation.
8786+
8787+
The default setting is <literal>off</literal>, and it can only be
8788+
changed by a superuser.
8789+
</para>
8790+
</listitem>
8791+
</varlistentry>
8792+
8793+
<varlistentry id="guc-jit-expressions" xreflabel="jit_expressions">
8794+
<term><varname>jit_expressions</varname> (<type>boolean</type>)
8795+
<indexterm>
8796+
<primary><varname>jit_expressions</varname> configuration parameter</primary>
8797+
</indexterm>
8798+
</term>
8799+
<listitem>
8800+
<para>
8801+
Determines whether expressions are JIT compiled, subject to costing
8802+
decisions (see <xref linkend="jit-decision"/>). The default is
8803+
<literal>on</literal>.
8804+
</para>
8805+
</listitem>
8806+
</varlistentry>
8807+
8808+
<varlistentry id="guc-jit-profiling-support" xreflabel="jit_profiling_support">
8809+
<term><varname>jit_profiling_support</varname> (<type>boolean</type>)
8810+
<indexterm>
8811+
<primary><varname>jit_profiling_support</varname> configuration parameter</primary>
8812+
</indexterm>
8813+
</term>
8814+
<listitem>
8815+
<para>
8816+
If LLVM has the required functionality, emit required data to allow
8817+
<productname>perf</productname> to profile functions generated by JIT.
8818+
This writes out files to <filename>$HOME/.debug/jit/</filename>; the
8819+
user is responsible for performing cleanup when desired.
8820+
8821+
The default setting is <literal>off</literal>. This parameter can only be
8822+
set at server start.
8823+
</para>
8824+
</listitem>
8825+
</varlistentry>
8826+
8827+
<varlistentry id="guc-jit-tuple-deforming" xreflabel="jit_tuple_deforming">
8828+
<term><varname>jit_tuple_deforming</varname> (<type>boolean</type>)
8829+
<indexterm>
8830+
<primary><varname>jit_tuple_deforming</varname> configuration parameter</primary>
8831+
</indexterm>
8832+
</term>
8833+
<listitem>
8834+
<para>
8835+
Determines whether tuple deforming is JIT compiled, subject to costing
8836+
decisions (see <xref linkend="jit-decision"/>). The default is
8837+
<literal>on</literal>.
8838+
</para>
8839+
</listitem>
8840+
</varlistentry>
8841+
8842+
</variablelist>
86628843
</sect1>
86638844
<sect1 id="runtime-config-short">
86648845
<title>Short Options</title>

‎doc/src/sgml/filelist.sgml

Lines changed: 1 addition & 0 deletions
@@ -48,6 +48,7 @@
4848
<!ENTITY user-manag SYSTEM "user-manag.sgml">
4949
<!ENTITY wal SYSTEM "wal.sgml">
5050
<!ENTITY logical-replication SYSTEM "logical-replication.sgml">
51+
<!ENTITY jit SYSTEM "jit.sgml">
5152

5253
<!-- programmer's guide -->
5354
<!ENTITY bgworker SYSTEM "bgworker.sgml">

‎doc/src/sgml/func.sgml

Lines changed: 8 additions & 0 deletions
@@ -15942,6 +15942,14 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
1594215942
<entry>is schema another session's temporary schema?</entry>
1594315943
</row>
1594415944

15945+
<row>
15946+
<entry><literal><function>pg_jit_available()</function></literal></entry>
15947+
<entry><type>boolean</type></entry>
15948+
<entry>is <acronym>JIT</acronym> available in this session (see <xref
15949+
linkend="jit"/>)? Returns <literal>false</literal> if <xref
15950+
linkend="guc-jit"/> is set to false.</entry>
15951+
</row>
15952+
1594515953
<row>
1594615954
<entry><literal><function>pg_listening_channels()</function></literal></entry>
1594715955
<entry><type>setof text</type></entry>

‎doc/src/sgml/installation.sgml

Lines changed: 53 additions & 0 deletions
@@ -758,6 +758,39 @@ su - postgres
758758
</listitem>
759759
</varlistentry>
760760

761+
<varlistentry id="configure-with-llvm">
762+
<term><option>--with-llvm</option></term>
763+
<listitem>
764+
<para>
765+
Build with support for <productname>LLVM</productname> based
766+
<acronym>JIT</acronym> compilation (see <xref linkend="jit"/>). This
767+
requires the <productname>LLVM</productname> library to be installed.
768+
The minimum required version of <productname>LLVM</productname> is
769+
currently 3.9.
770+
</para>
771+
<para>
772+
<command>llvm-config</command><indexterm><primary>llvm-config</primary></indexterm>
773+
will be used to find the required compilation options.
774+
<command>llvm-config</command>, and then
775+
<command>llvm-config-$major-$minor</command> for all supported
776+
versions, will be searched for on <envar>PATH</envar>. If that does not
777+
yield the correct binary, use <envar>LLVM_CONFIG</envar> to specify a
778+
path to the correct <command>llvm-config</command>. For example
779+
<programlisting>
780+
./configure ... --with-llvm LLVM_CONFIG='/path/to/llvm/bin/llvm-config'
781+
</programlisting>
782+
</para>
783+
784+
<para>
785+
<productname>LLVM</productname> support requires a compatible
786+
<command>clang</command> compiler (specified, if necessary, using the
787+
<envar>CLANG</envar> environment variable), and a working C++
788+
compiler (specified, if necessary, using the <envar>CXX</envar>
789+
environment variable).
790+
</para>
791+
</listitem>
792+
</varlistentry>
793+
761794
<varlistentry>
762795
<term><option>--with-icu</option></term>
763796
<listitem>
@@ -1342,6 +1375,16 @@ su - postgres
13421375
</listitem>
13431376
</varlistentry>
13441377

1378+
<varlistentry>
1379+
<term><envar>CLANG</envar></term>
1380+
<listitem>
1381+
<para>
1382+
path to <command>clang</command> program used to process source code
1383+
for inlining when compiling with <literal>--with-llvm</literal>
1384+
</para>
1385+
</listitem>
1386+
</varlistentry>
1387+
13451388
<varlistentry>
13461389
<term><envar>CPP</envar></term>
13471390
<listitem>
@@ -1432,6 +1475,16 @@ su - postgres
14321475
</listitem>
14331476
</varlistentry>
14341477

1478+
<varlistentry>
1479+
<term><envar>LLVM_CONFIG</envar></term>
1480+
<listitem>
1481+
<para>
1482+
<command>llvm-config</command> program used to locate the
1483+
<productname>LLVM</productname> installation.
1484+
</para>
1485+
</listitem>
1486+
</varlistentry>
1487+
14351488
<varlistentry>
14361489
<term><envar>MSGFMT</envar></term>
14371490
<listitem>

‎doc/src/sgml/jit.sgml

Lines changed: 299 additions & 0 deletions
@@ -0,0 +1,299 @@
1+
<!-- doc/src/sgml/jit.sgml -->
2+
3+
<chapter id="jit">
4+
<title>Just-in-Time Compilation (<acronym>JIT</acronym>)</title>
5+
6+
<indexterm zone="jit">
7+
<primary><acronym>JIT</acronym></primary>
8+
</indexterm>
9+
10+
<indexterm>
11+
<primary>Just-In-Time compilation</primary>
12+
<see><acronym>JIT</acronym></see>
13+
</indexterm>
14+
15+
<para>
16+
This chapter explains what just-in-time compilation is, and how it can be
17+
configured in <productname>PostgreSQL</productname>.
18+
</para>
19+
20+
<sect1 id="jit-reason">
21+
<title>What is <acronym>JIT</acronym>?</title>
22+
23+
<para>
24+
Just-in-time compilation (<acronym>JIT</acronym>) is the process of turning
25+
some form of interpreted program evaluation into a native program, and
26+
doing so at runtime.
27+
28+
For example, instead of using a facility that can evaluate arbitrary SQL
29+
expressions to evaluate an SQL predicate like <literal>WHERE a.col =
30+
3</literal>, it is possible to generate a function that can be natively
31+
executed by the CPU that just handles that expression, yielding a speedup.
32+
</para>
33+
34+
<para>
35+
<productname>PostgreSQL</productname> has built-in support to perform
36+
<acronym>JIT</acronym> using <ulink
37+
url="https://llvm.org/"><productname>LLVM</productname></ulink> when
38+
<productname>PostgreSQL</productname> was built with
39+
<literal>--with-llvm</literal> (see <xref linkend="configure-with-llvm"/>).
40+
</para>
41+
42+
<para>
43+
See <filename>src/backend/jit/README</filename> for further details.
44+
</para>
45+
46+
<sect2 id="jit-accelerated-operations">
47+
<title><acronym>JIT</acronym> Accelerated Operations</title>
48+
<para>
49+
Currently <productname>PostgreSQL</productname>'s <acronym>JIT</acronym>
50+
implementation has support for accelerating expression evaluation and
51+
tuple deforming. Several other operations could be accelerated in the
52+
future.
53+
</para>
54+
<para>
55+
Expression evaluation is used to evaluate <literal>WHERE</literal>
56+
clauses, target lists, aggregates and projections. It can be accelerated
57+
by generating code specific to each case.
58+
</para>
59+
<para>
60+
Tuple deforming is the process of transforming an on-disk tuple (see <xref
61+
linkend="heaptuple"/>) into its in-memory representation. It can be
62+
accelerated by creating a function specific to the table layout and the
63+
number of columns to be extracted.
64+
</para>
65+
</sect2>
66+
67+
<sect2 id="jit-optimization">
68+
<title>Optimization</title>
69+
<para>
70+
<productname>LLVM</productname> has support for optimizing generated
71+
code. Some of the optimizations are cheap enough to be performed whenever
72+
<acronym>JIT</acronym> is used, while others are only beneficial for
73+
longer running queries.
74+
75+
See <ulink url="https://llvm.org/docs/Passes.html#transform-passes"/> for
76+
more details about optimizations.
77+
</para>
78+
</sect2>
79+
80+
<sect2 id="jit-inlining">
81+
<title>Inlining</title>
82+
<para>
83+
<productname>PostgreSQL</productname> is very extensible and allows new
84+
datatypes, functions, operators and other database objects to be defined;
85+
see <xref linkend="extend"/>. In fact the built-in ones are implemented
86+
using nearly the same mechanisms. This extensibility implies some
87+
overhead, for example due to function calls (see <xref linkend="xfunc"/>).
88+
To reduce that overhead <acronym>JIT</acronym> compilation can inline the
89+
bodies of small functions into the expressions using them. That allows a
90+
significant percentage of the overhead to be optimized away.
91+
</para>
92+
</sect2>
93+
94+
</sect1>
95+
96+
<sect1 id="jit-decision">
97+
<title>When to <acronym>JIT</acronym>?</title>
98+
99+
<para>
100+
<acronym>JIT</acronym> is beneficial primarily for long-running CPU-bound
101+
queries. Frequently these will be analytical queries. For short queries
102+
the overhead of performing <acronym>JIT</acronym> will often be higher than
103+
the time it can save.
104+
</para>
105+
106+
<para>
107+
To determine whether <acronym>JIT</acronym> is used, the total cost of a
108+
query (see <xref linkend="planner-stats-details"/> and <xref
109+
linkend="runtime-config-query-constants"/>) is used.
110+
</para>
111+
112+
<para>
113+
The cost of the query will be compared with the <xref
114+
linkend="guc-jit-above-cost"/> GUC. If the cost is higher,
115+
<acronym>JIT</acronym> compilation will be performed.
116+
</para>
117+
118+
<para>
119+
If the planner, based on the above criterion, decided that
120+
<acronym>JIT</acronym> is beneficial, two further decisions are
121+
made. Firstly, if the query is more costly than the <xref
122+
linkend="guc-jit-optimize-above-cost"/> GUC, expensive optimizations are
123+
used to improve the generated code. Secondly, if the query is more costly
124+
than the <xref linkend="guc-jit-inline-above-cost"/> GUC, short functions
125+
and operators used in the query will be inlined. Both of these operations
126+
increase the <acronym>JIT</acronym> overhead, but can reduce query
127+
execution time considerably.
128+
</para>
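The threshold comparison described above can be sketched in a few lines of C. This is only an illustration of the documented rules, not the planner's actual code: the type `SketchJitSettings`, the `SKETCH_JIT_*` flag names, and the function `sketch_jit_flags` are all hypothetical, and the default values are the ones stated for the three GUCs in this commit.

```c
#include <stdbool.h>

/* Hypothetical flag bits; the names are illustrative, not PostgreSQL's. */
enum
{
    SKETCH_JIT_PERFORM = 1 << 0,   /* JIT compile at all */
    SKETCH_JIT_OPTIMIZE = 1 << 1,  /* run expensive optimizations */
    SKETCH_JIT_INLINE = 1 << 2     /* inline short functions and operators */
};

/* Hypothetical container mirroring the jit and jit_*_above_cost GUCs. */
typedef struct SketchJitSettings
{
    bool   jit;
    double jit_above_cost;
    double jit_optimize_above_cost;
    double jit_inline_above_cost;
} SketchJitSettings;

/* Defaults as documented in this commit. */
static const SketchJitSettings sketch_defaults = {true, 100000.0, 500000.0, 500000.0};

/*
 * Decide, from the total plan cost alone, which JIT steps to perform.
 * Optimization and inlining are only considered once JIT itself is chosen.
 */
static int
sketch_jit_flags(const SketchJitSettings *s, double total_cost)
{
    int flags = 0;

    if (!s->jit || !(total_cost > s->jit_above_cost))
        return 0;

    flags |= SKETCH_JIT_PERFORM;
    if (total_cost > s->jit_optimize_above_cost)
        flags |= SKETCH_JIT_OPTIMIZE;
    if (total_cost > s->jit_inline_above_cost)
        flags |= SKETCH_JIT_INLINE;
    return flags;
}
```

With the defaults, a plan costing 16.29 yields no flags; lowering `jit_above_cost` to 10 yields only the "perform" flag, which matches a plan that is JIT compiled without inlining or expensive optimization.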
129+
130+
<para>
131+
This cost based decision will be made at plan time, not execution
132+
time. This means that when prepared statements are in use, and the generic
133+
plan is used (see <xref linkend="sql-prepare-notes"/>), the values of the
134+
GUCs set at prepare time take effect, not the settings at execution time.
135+
</para>
136+
137+
<note>
138+
<para>
139+
If <xref linkend="guc-jit"/> is set to <literal>off</literal>, or no
140+
<acronym>JIT</acronym> implementation is available (for example because
141+
the server was compiled without <literal>--with-llvm</literal>),
142+
<acronym>JIT</acronym> will not be performed, even if considered to be
143+
beneficial based on the above criteria. Setting <xref linkend="guc-jit"/>
144+
to <literal>off</literal> takes effect both at plan and at execution time.
145+
</para>
146+
</note>
147+
148+
<para>
149+
<xref linkend="sql-explain"/> can be used to see whether
150+
<acronym>JIT</acronym> is used or not. As an example, here is a query that
151+
is not using <acronym>JIT</acronym>:
152+
<programlisting>
153+
=# EXPLAIN ANALYZE SELECT SUM(relpages) FROM pg_class;
154+
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
155+
│ QUERY PLAN │
156+
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
157+
│ Aggregate (cost=16.27..16.29 rows=1 width=8) (actual time=0.303..0.303 rows=1 loops=1) │
158+
│ -> Seq Scan on pg_class (cost=0.00..15.42 rows=342 width=4) (actual time=0.017..0.111 rows=356 loops=1) │
159+
│ Planning Time: 0.116 ms │
160+
│ Execution Time: 0.365 ms │
161+
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
162+
(4 rows)
163+
</programlisting>
164+
Given the cost of the plan, it is entirely reasonable that no
165+
<acronym>JIT</acronym> was used; the cost of <acronym>JIT</acronym> would
166+
have been bigger than the savings. Adjusting the cost limits will lead to
167+
<acronym>JIT</acronym> use:
168+
<programlisting>
169+
=# SET jit_above_cost = 10;
170+
SET
171+
=# EXPLAIN ANALYZE SELECT SUM(relpages) FROM pg_class;
172+
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
173+
│ QUERY PLAN │
174+
├─────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
175+
│ Aggregate (cost=16.27..16.29 rows=1 width=8) (actual time=6.049..6.049 rows=1 loops=1) │
176+
│ -> Seq Scan on pg_class (cost=0.00..15.42 rows=342 width=4) (actual time=0.019..0.052 rows=356 loops=1) │
177+
│ Planning Time: 0.133 ms │
178+
│ JIT: │
179+
│ Functions: 3 │
180+
│ Generation Time: 1.259 ms │
181+
│ Inlining: false │
182+
│ Inlining Time: 0.000 ms │
183+
│ Optimization: false │
184+
│ Optimization Time: 0.797 ms │
185+
│ Emission Time: 5.048 ms │
186+
│ Execution Time: 7.416 ms │
187+
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
188+
</programlisting>
189+
As visible here, <acronym>JIT</acronym> was used, but inlining and
190+
optimization were not. If <xref linkend="guc-jit-optimize-above-cost"/> and
191+
<xref linkend="guc-jit-inline-above-cost"/> were lowered, just like <xref
192+
linkend="guc-jit-above-cost"/>, that would change.
193+
</para>
194+
</sect1>
195+
196+
<sect1 id="jit-configuration" xreflabel="JIT Configuration">
197+
<title>Configuration</title>
198+
199+
<para>
200+
<xref linkend="guc-jit"/> determines whether <acronym>JIT</acronym> is
201+
enabled or disabled.
202+
</para>
203+
204+
<para>
205+
As explained in <xref linkend="jit-decision"/> the configuration variables
206+
<xref linkend="guc-jit-above-cost"/>, <xref
207+
linkend="guc-jit-optimize-above-cost"/>, and <xref
208+
linkend="guc-jit-inline-above-cost"/> decide whether <acronym>JIT</acronym>
209+
compilation is performed for a query, and how much effort is spent doing
210+
so.
211+
</para>
212+
213+
<para>
214+
For development and debugging purposes a few additional GUCs exist. <xref
215+
linkend="guc-jit-dump-bitcode"/> allows the generated bitcode to be
216+
inspected. <xref linkend="guc-jit-debugging-support"/> allows GDB to see
217+
generated functions. <xref linkend="guc-jit-profiling-support"/> emits
218+
information so the <productname>perf</productname> profiler can interpret
219+
<acronym>JIT</acronym> generated functions sensibly.
220+
</para>
221+
222+
<para>
223+
<xref linkend="guc-jit-provider"/> determines which <acronym>JIT</acronym>
224+
implementation is used. It rarely needs to be changed. See <xref
225+
linkend="jit-pluggable"/>.
226+
</para>
227+
</sect1>
228+
229+
<sect1 id="jit-extensibility" xreflabel="JIT Extensibility">
230+
<title>Extensibility</title>
231+
232+
<sect2 id="jit-extensibility-bitcode">
233+
<title>Inlining Support for Extensions</title>
234+
<para>
235+
<productname>PostgreSQL</productname>'s <acronym>JIT</acronym>
236+
implementation can inline the implementation of operators and functions
237+
(of type <literal>C</literal> and <literal>internal</literal>). See <xref
238+
linkend="jit-inlining"/>. To do so for functions in extensions, the
239+
definition of these functions needs to be made available. When using <link
240+
linkend="extend-pgxs">PGXS</link> to build an extension against a server
241+
that has been compiled with LLVM support, the relevant files will be
242+
installed automatically.
243+
</para>
244+
245+
<para>
246+
The relevant files have to be installed into
247+
<filename>$pkglibdir/bitcode/$extension/</filename> and a summary of them
248+
to <filename>$pkglibdir/bitcode/$extension.index.bc</filename>, where
249+
<literal>$pkglibdir</literal> is the directory returned by
250+
<literal>pg_config --pkglibdir</literal> and <literal>$extension</literal>
251+
the basename of the extension's shared library.
252+
253+
<note>
254+
<para>
255+
For functions built into <productname>PostgreSQL</productname> itself,
256+
the bitcode is installed into
257+
<literal>$pkglibdir/bitcode/postgres</literal>.
258+
</para>
259+
</note>
260+
</para>
261+
</sect2>
262+
263+
<sect2 id="jit-pluggable">
264+
<title>Pluggable <acronym>JIT</acronym> Provider</title>
265+
266+
<para>
267+
<productname>PostgreSQL</productname> provides a <acronym>JIT</acronym>
268+
implementation based on <productname>LLVM</productname>. The interface to
269+
the <acronym>JIT</acronym> provider is pluggable and the provider can be
270+
changed without recompiling. The provider is chosen via the <xref
271+
linkend="guc-jit-provider"/> <acronym>GUC</acronym>.
272+
</para>
273+
274+
<sect3>
275+
<title><acronym>JIT</acronym> Provider Interface</title>
276+
<para>
277+
A <acronym>JIT</acronym> provider is loaded by dynamically loading the
278+
named shared library. The normal library search path is used to locate
279+
the library. To provide the required <acronym>JIT</acronym> provider
280+
callbacks and to indicate that the library is actually a
281+
<acronym>JIT</acronym> provider it needs to provide a function named
282+
<function>_PG_jit_provider_init</function>. This function is passed a
283+
struct that needs to be filled with the callback function pointers for
284+
individual actions.
285+
<programlisting>
286+
struct JitProviderCallbacks
287+
{
288+
JitProviderResetAfterErrorCB reset_after_error;
289+
JitProviderReleaseContextCB release_context;
290+
JitProviderCompileExprCB compile_expr;
291+
};
292+
extern void _PG_jit_provider_init(JitProviderCallbacks *cb);
293+
</programlisting>
294+
</para>
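A minimal sketch of a provider that fills in these callbacks follows. So that it compiles outside the PostgreSQL source tree, it declares stand-in types itself; the callback signatures shown (no arguments, a `JitContext *`, and an `ExprState *` returning `bool`) are assumptions made for this sketch and should be checked against the server headers. The `noop_*` functions are hypothetical, and a real provider would generate code in `compile_expr` instead of declining.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in opaque types; in the server these come from its own headers. */
typedef struct JitContext JitContext;
typedef struct ExprState ExprState;

/* Assumed callback signatures, declared here only to keep the sketch self-contained. */
typedef void (*JitProviderResetAfterErrorCB) (void);
typedef void (*JitProviderReleaseContextCB) (JitContext *context);
typedef bool (*JitProviderCompileExprCB) (ExprState *state);

typedef struct JitProviderCallbacks JitProviderCallbacks;
struct JitProviderCallbacks
{
    JitProviderResetAfterErrorCB reset_after_error;
    JitProviderReleaseContextCB release_context;
    JitProviderCompileExprCB compile_expr;
};

extern void _PG_jit_provider_init(JitProviderCallbacks *cb);

/* Nothing to reset in a provider that never emits code. */
static void
noop_reset_after_error(void)
{
}

/* Nothing to free either. */
static void
noop_release_context(JitContext *context)
{
    (void) context;
}

/* Returning false tells the caller that the expression was not compiled. */
static bool
noop_compile_expr(ExprState *state)
{
    (void) state;
    return false;
}

/* Entry point looked up by name after the shared library is loaded. */
void
_PG_jit_provider_init(JitProviderCallbacks *cb)
{
    cb->reset_after_error = noop_reset_after_error;
    cb->release_context = noop_release_context;
    cb->compile_expr = noop_compile_expr;
}
```

A provider like this one is only useful for experimenting with the loading mechanism, since every expression falls back to the regular interpreter.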
295+
</sect3>
296+
</sect2>
297+
</sect1>
298+
299+
</chapter>

‎doc/src/sgml/postgres.sgml

Lines changed: 1 addition & 0 deletions
@@ -163,6 +163,7 @@
163163
&diskusage;
164164
&wal;
165165
&logical-replication;
166+
&jit;
166167
&regress;
167168

168169
</part>

‎doc/src/sgml/storage.sgml

Lines changed: 1 addition & 1 deletion
@@ -875,7 +875,7 @@ data. Empty in ordinary tables.</entry>
875875
<filename>src/include/storage/bufpage.h</filename>.
876876
</para>
877877

878-
<para>
878+
<para id="heaptuple">
879879

880880
Following the page header are item identifiers
881881
(<type>ItemIdData</type>), each requiring four bytes.

‎src/backend/jit/README

Lines changed: 289 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,289 @@
1+
What is Just-in-Time Compilation?
2+
=================================
3+
4+
Just-in-Time compilation (JIT) is the process of turning some form of
5+
interpreted program evaluation into a native program, and doing so at
6+
runtime.
7+
8+
For example, instead of using a facility that can evaluate arbitrary
9+
SQL expressions to evaluate an SQL predicate like WHERE a.col = 3, it
10+
is possible to generate a function that can be natively executed by
11+
the CPU that just handles that expression, yielding a speedup.
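To make the contrast concrete, here is a small C sketch of the two approaches for a predicate like "column 0 = 3". It is an illustration only: the step-based interpreter, the `Sketch*` types, and the hand-written specialized function are hypothetical and much simpler than what postgres actually does. The specialized function stands in for what a JIT would generate at runtime.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A tiny, hypothetical expression "program" made of generic steps. */
typedef enum
{
    STEP_LOAD_COLUMN,   /* push row[arg] */
    STEP_LOAD_CONST,    /* push arg */
    STEP_INT_EQ,        /* pop two values, push 1 if equal, else 0 */
    STEP_DONE           /* result is the top of the stack */
} SketchOp;

typedef struct
{
    SketchOp op;
    int32_t  arg;
} SketchStep;

/* Generic path: dispatches on every step, for every row. */
static bool
sketch_interpret(const SketchStep *steps, const int32_t *row)
{
    int32_t stack[8];
    size_t  sp = 0;

    for (;;)
    {
        switch (steps->op)
        {
            case STEP_LOAD_COLUMN:
                stack[sp++] = row[steps->arg];
                break;
            case STEP_LOAD_CONST:
                stack[sp++] = steps->arg;
                break;
            case STEP_INT_EQ:
                sp--;
                stack[sp - 1] = (stack[sp - 1] == stack[sp]);
                break;
            case STEP_DONE:
                return stack[sp - 1] != 0;
        }
        steps++;
    }
}

/* The predicate "column 0 = 3" expressed as generic steps. */
static const SketchStep sketch_col0_eq_3_program[] = {
    {STEP_LOAD_COLUMN, 0},
    {STEP_LOAD_CONST, 3},
    {STEP_INT_EQ, 0},
    {STEP_DONE, 0}
};

/* Specialized path: what generated native code for that one predicate amounts to. */
static bool
sketch_col0_eq_3(const int32_t *row)
{
    return row[0] == 3;
}
```

Both functions return the same answer for every row; the specialized one simply has no dispatch loop, no stack, and no indirect branches left.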
12+
13+
That this is done at query execution time, possibly even only in cases where
14+
the relevant task is done a number of times, makes it JIT, rather than
15+
ahead-of-time (AOT). Given the way JIT compilation is used in
16+
postgres, the lines between interpretation, AOT and JIT are somewhat
17+
blurry.
18+
19+
Note that the interpreted program turned into a native program does
20+
not necessarily have to be a program in the classical sense. E.g. it
21+
is highly beneficial to JIT compile tuple deforming into a native
22+
function just handling a specific type of table, despite tuple
23+
deforming not commonly being understood as a "program".
24+
25+
26+
Why JIT?
27+
========
28+
29+
Parts of postgres are commonly bottlenecked by comparatively small
30+
pieces of CPU intensive code. In a number of cases that is because the
31+
relevant code has to be very generic (e.g. handling arbitrary SQL
32+
level expressions, over arbitrary tables, with arbitrary extensions
33+
installed). This often leads to a large number of indirect jumps and
34+
unpredictable branches, and generally a high number of instructions
35+
for a given task. E.g. just evaluating an expression comparing a
36+
column in a database to an integer ends up needing several hundred
37+
cycles.
38+
39+
By generating native code, large numbers of indirect jumps can be
40+
removed by either making them into direct branches (e.g. replacing the
41+
indirect call to an SQL operator's implementation with a direct call
42+
to that function), or by removing it entirely (e.g. by evaluating the
43+
branch at compile time because the input is constant). Similarly a lot
44+
of branches can be entirely removed (e.g. by again evaluating the
45+
branch at compile time because the input is constant). The latter is
46+
particularly beneficial for removing branches during tuple deforming.
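The tuple deforming case can be sketched in the same spirit. The code below assumes a deliberately simplified, hypothetical tuple format (tightly packed 4-byte and 8-byte integers, no NULLs, no alignment padding), which is not postgres' real on-disk layout. It only shows how knowing the layout in advance turns per-column branches and running offsets into constants.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical attribute descriptor: each column is 4 or 8 bytes wide. */
typedef struct
{
    size_t width;
} SketchAttr;

/*
 * Generic deforming: branches on every column's width and tracks a running
 * offset, because the layout is only known at run time.
 */
static void
sketch_deform_generic(const SketchAttr *attrs, size_t natts,
                      const unsigned char *data, int64_t *values)
{
    size_t off = 0;

    for (size_t i = 0; i < natts; i++)
    {
        if (attrs[i].width == 4)
        {
            int32_t v;

            memcpy(&v, data + off, sizeof(v));
            values[i] = v;
        }
        else
        {
            int64_t v;

            memcpy(&v, data + off, sizeof(v));
            values[i] = v;
        }
        off += attrs[i].width;
    }
}

/*
 * Specialized deforming for a table laid out as (4-byte, 8-byte, ...) when
 * only the first two columns are needed: offsets and widths are constants,
 * so no loop and no branches remain.
 */
static void
sketch_deform_i4_i8(const unsigned char *data, int64_t *values)
{
    int32_t a;
    int64_t b;

    memcpy(&a, data, sizeof(a));
    memcpy(&b, data + 4, sizeof(b));
    values[0] = a;
    values[1] = b;
}
```

The specialized variant depends on both the table layout and the number of columns to extract, which is why one such function would be generated per combination.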
47+
48+
49+
How to JIT
50+
==========
51+
52+
Postgres, by default, uses LLVM to perform JIT. LLVM was chosen
53+
because it is developed by several large corporations and therefore
54+
unlikely to be discontinued, because it has a license compatible with
55+
PostgreSQL, and because its LLVM IR can be generated from C
56+
using the clang compiler.
57+
58+
59+
Shared Library Separation
60+
-------------------------
61+
62+
To avoid the main PostgreSQL binary directly depending on LLVM, which
63+
would prevent LLVM support being independently installed by OS package
64+
managers, the LLVM dependent code is located in a shared library that
65+
is loaded on-demand.
66+
67+
An additional benefit of doing so is that it is relatively easy to
68+
evaluate JIT compilation that does not use LLVM, by changing out the
69+
shared library used to provide JIT compilation.
70+
71+
To achieve this, code intending to perform JIT (e.g. expression
72+
evaluation) calls an LLVM independent wrapper located in jit.c to do so. If
73+
the shared library providing JIT support can be loaded (i.e. postgres
74+
was compiled with LLVM support and the shared library is installed),
75+
the task of JIT compiling an expression gets handed off to the shared
76+
library. This obviously requires that the function in jit.c is allowed
77+
to fail in case no JIT provider can be loaded.
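The "allowed to fail" behaviour of such a wrapper can be sketched as follows. Everything here is hypothetical: `SketchProviderState`, `sketch_provider_init`, and `sketch_jit_compile_expr` are illustrative names, and the `from_library` argument stands in for whatever the dynamic loader found (NULL meaning the shared library could not be loaded). The sketch remembers a failed load and does not retry, which is a choice made for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback type: returns true if the expression was compiled. */
typedef bool (*SketchCompileExprCB) (void *expr_state);

/* Hypothetical per-backend provider state. */
typedef struct
{
    bool                load_attempted;
    SketchCompileExprCB compile_expr;   /* NULL if no provider is loaded */
} SketchProviderState;

/*
 * Try to set up the provider once. 'from_library' models the result of
 * loading the shared library; NULL means the library was not found.
 */
static bool
sketch_provider_init(SketchProviderState *st, SketchCompileExprCB from_library)
{
    if (!st->load_attempted)
    {
        st->load_attempted = true;
        st->compile_expr = from_library;
    }
    return st->compile_expr != NULL;
}

/*
 * Provider independent wrapper: a false result is not an error, it only
 * means the caller keeps using the interpreted code path.
 */
static bool
sketch_jit_compile_expr(SketchProviderState *st,
                        SketchCompileExprCB from_library,
                        void *expr_state)
{
    if (!sketch_provider_init(st, from_library))
        return false;
    return st->compile_expr(expr_state);
}

/* Stub provider callback used to exercise the success path. */
static bool
sketch_stub_compile(void *expr_state)
{
    (void) expr_state;
    return true;
}
```

The important property is that callers never need to know whether a provider exists; they call the wrapper and fall back to interpretation whenever it returns false.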
78+
79+
Which shared library is loaded is determined by the jit_provider GUC,
80+
defaulting to "llvmjit".
81+
82+
Cloistering code performing JIT into a shared library unfortunately
83+
also means that code doing JIT compilation for various parts of code
84+
has to be located separately from the code doing so without
85+
JIT. E.g. the JITed version of execExprInterp.c is located in
86+
jit/llvm/ rather than executor/.
87+
88+
89+
JIT Context
90+
-----------
91+
92+
For performance and convenience reasons it is useful to allow JITed
93+
functions to be emitted and deallocated together. It is e.g. very
94+
common to create a number of functions at query initialization time,
95+
use them during query execution, and then deallocate all of them
96+
together at the end of the query.
97+
98+
Lifetimes of JITed functions are managed via JITContext. Exactly one
99+
such context should be created for work in which all created JITed
100+
functions should have the same lifetime. E.g. there's exactly one
101+
JITContext for each query executed, in the query's EState. Only the
102+
release of a JITContext is exposed to the provider independent
103+
facility, as the creation of one is done on-demand by the JIT
104+
implementations.
105+
106+
Emitting individual functions separately is more expensive than
107+
emitting several functions at once, and emitting them together can
108+
provide additional optimization opportunities. To facilitate that the
109+
LLVM provider separates function definition from emitting them in an
110+
executable way.
111+
112+
Creating functions into the current mutable module (a module
113+
essentially is LLVM's equivalent of a translation unit in C) is done
114+
using
115+
extern LLVMModuleRef llvm_mutable_module(LLVMJitContext *context);
116+
in which it then can emit as much code using the LLVM APIs as it
117+
wants. Whenever a function actually needs to be called
118+
extern void *llvm_get_function(LLVMJitContext *context, const char *funcname);
119+
returns a pointer to it.
120+
121+
E.g. in the expression evaluation case this setup allows most
122+
functions in a query to be emitted during ExecInitNode(), delaying the
123+
function emission until the first time a function is actually
124+
used.
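
The separation of function definition from emission can be pictured
with the following sketch. It does not use LLVM at all: the sketch_*
names and the fixed-size table are assumptions made for illustration,
and "emitting" is reduced to flipping a flag for every pending
function in one batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_FUNCS 16

typedef int (*sketch_fn)(int);

typedef struct SketchFunc
{
    const char *name;
    sketch_fn   impl;
    bool        emitted;
} SketchFunc;

typedef struct SketchContext
{
    SketchFunc  funcs[SKETCH_MAX_FUNCS];
    int         nfuncs;
    int         npending;       /* defined but not yet emitted */
    int         emit_batches;   /* how often emission actually ran */
} SketchContext;

/* Define a function in the "mutable module"; nothing is emitted yet. */
void sketch_define(SketchContext *ctx, const char *name, sketch_fn impl)
{
    assert(ctx->nfuncs < SKETCH_MAX_FUNCS);
    ctx->funcs[ctx->nfuncs].name = name;
    ctx->funcs[ctx->nfuncs].impl = impl;
    ctx->funcs[ctx->nfuncs].emitted = false;
    ctx->nfuncs++;
    ctx->npending++;
}

/* Emit everything pending in one batch, then look the function up. */
sketch_fn sketch_get_function(SketchContext *ctx, const char *name)
{
    int         i;

    if (ctx->npending > 0)
    {
        for (i = 0; i < ctx->nfuncs; i++)
            ctx->funcs[i].emitted = true;
        ctx->npending = 0;
        ctx->emit_batches++;
    }
    for (i = 0; i < ctx->nfuncs; i++)
    {
        if (strcmp(ctx->funcs[i].name, name) == 0)
            return ctx->funcs[i].impl;
    }
    return NULL;
}

/* Two toy functions to define and look up. */
int sketch_inc(int x) { return x + 1; }
int sketch_dbl(int x) { return x * 2; }
```

Defining several functions and then asking for one of them triggers a
single emission batch; later lookups reuse the already emitted code.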
125+
126+
127+
Error Handling
128+
--------------
129+
130+
There are two aspects to error handling. Firstly, generated (LLVM IR)
131+
and emitted functions (mmap()ed segments) need to be cleaned up both
132+
after a successful query execution and after an error. This is done by
133+
registering each created JITContext with the current resource owner,
134+
and cleaning it up on error / end of transaction. If it is desirable
135+
to release resources earlier, jit_release_context() can be used.
136+
137+
The second, less pretty, aspect of error handling is OOM handling
138+
inside LLVM itself. The above resowner based mechanism takes care of
139+
cleaning up emitted code upon ERROR, but there's also the chance that
140+
LLVM itself runs out of memory. LLVM by default does *not* use any C++
141+
exceptions. Its allocations are primarily funneled through the
142+
standard "new" handlers, and some direct use of malloc() and
143+
mmap(). For the former a 'new handler' exists
144+
(http://en.cppreference.com/w/cpp/memory/new/set_new_handler); for the
145+
latter LLVM provides callbacks that get called upon failure
146+
(unfortunately mmap() failures are treated as fatal rather than OOM
147+
errors). What we've, for now, chosen to do, is to have two functions
148+
that LLVM using code must use:
149+
extern void llvm_enter_fatal_on_oom(void);
150+
extern void llvm_leave_fatal_on_oom(void);
151+
before interacting with LLVM code.
152+
153+
When a libstdc++ new or LLVM error occurs, the handlers set up by the
154+
above functions trigger a FATAL error. We have to use FATAL rather
155+
than ERROR, as we *cannot* reliably throw ERROR inside a foreign
156+
library without risking corrupting its internal state.
157+
158+
Users of the above sections do *not* have to use PG_TRY/CATCH blocks;
159+
the handlers are instead reset at the toplevel sigsetjmp() level.
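
The enter/leave discipline can be sketched as below. This is only a
model under stated assumptions: the sketch_* functions are stand-ins
rather than the real llvm_enter_fatal_on_oom() and
llvm_leave_fatal_on_oom(), the use of a nesting depth counter is an
assumption, and installing the handlers is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bookkeeping: how many protected sections are currently open. */
static int  sketch_fatal_depth = 0;
static bool sketch_handlers_installed = false;

/* Enter a protected section; install handlers on the outermost entry. */
void sketch_enter_fatal_on_oom(void)
{
    if (sketch_fatal_depth == 0)
        sketch_handlers_installed = true;
    sketch_fatal_depth++;
}

/* Leave a protected section; restore handlers on the outermost exit. */
void sketch_leave_fatal_on_oom(void)
{
    assert(sketch_fatal_depth > 0);
    sketch_fatal_depth--;
    if (sketch_fatal_depth == 0)
        sketch_handlers_installed = false;
}

/* Models the reset done at the toplevel after an ERROR was thrown. */
void sketch_reset_after_error(void)
{
    sketch_fatal_depth = 0;
    sketch_handlers_installed = false;
}
```

Because the toplevel reset clears the state unconditionally, a caller
that errors out between enter and leave does not need its own
PG_TRY/CATCH block to undo the enter call.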
160+
161+
Using a relatively small enter/leave protected section of code, rather
162+
than setting up these handlers globally, avoids negative interactions
163+
with extensions that might use C++ like e.g. postgis. As LLVM code
164+
generation should never execute arbitrary code, just setting these
165+
handlers temporarily ought to suffice.
166+
167+
168+
Type Synchronization
169+
--------------------
170+
171+
To be able to generate code performing tasks that are done in "interpreted"
172+
postgres, it obviously is required that code generation knows about at
173+
least a few postgres types. While it is possible to inform LLVM about
174+
type definitions by recreating them manually in C code, that is failure
175+
prone and labor intensive.
176+
177+
Instead there is one small file (llvmjit_types.c) which references each of
178+
the types required for JITing. That file is translated to bitcode at
179+
compile time, and loaded when LLVM is initialized in a backend.
180+
181+
That works very well to synchronize the type definitions; unfortunately
182+
it does *not* synchronize offsets as the IR level representation doesn't
183+
know field names. Instead required offsets are maintained as defines in
184+
the original struct definition. E.g.
185+
#define FIELDNO_TUPLETABLESLOT_NVALID 9
186+
int tts_nvalid; /* # of valid values in tts_values */
187+
While that still needs to be defined, it's only required for a
188+
relatively small number of fields, and it's bunched together with the
189+
struct definition, so it's easily kept synchronized.
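
To make the idea concrete, here is a small sketch of field-number
defines kept next to the struct members they describe. SketchSlot and
the FIELDNO_SKETCHSLOT_* names are invented for this example. The
offset table stands in for what a code generator derives from the
struct type when it addresses a field by index.

```c
#include <assert.h>
#include <stddef.h>

/* Each define sits directly above the member whose index it records. */
typedef struct SketchSlot
{
#define FIELDNO_SKETCHSLOT_FLAGS 0
    int         flags;
#define FIELDNO_SKETCHSLOT_NVALID 1
    int         nvalid;
#define FIELDNO_SKETCHSLOT_VALUES 2
    long       *values;
} SketchSlot;

/* Offsets by field index, as a generator would see them. */
static const size_t sketch_slot_offsets[] = {
    offsetof(SketchSlot, flags),
    offsetof(SketchSlot, nvalid),
    offsetof(SketchSlot, values)
};

/* Translate a field number into a byte offset within SketchSlot. */
size_t sketch_field_offset(int fieldno)
{
    assert(fieldno >= 0);
    assert((size_t) fieldno <
           sizeof(sketch_slot_offsets) / sizeof(sketch_slot_offsets[0]));
    return sketch_slot_offsets[fieldno];
}
```

If a member is added or moved without updating the neighbouring
define, the index points at the wrong field, which is why keeping the
define adjacent to the member matters.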
190+
191+
192+
Inlining
193+
--------
194+
195+
One big advantage of JITing expressions is that it can significantly
196+
reduce the overhead of postgres's extensible function/operator
197+
mechanism, by inlining the body of called functions / operators.
198+
199+
It obviously is undesirable to maintain a second implementation of
200+
commonly used functions, just for inlining purposes. Instead we take
201+
advantage of the fact that the clang compiler can emit LLVM IR.
202+
203+
The ability to do so allows us to get the LLVM IR for all operators
204+
(e.g. int8eq, float8pl etc), without maintaining two copies. These
205+
bitcode files get installed into the server's
206+
$pkglibdir/bitcode/postgres/
207+
Using existing LLVM functionality (for parallel LTO compilation),
208+
additionally an index over these is stored to
209+
$pkglibdir/bitcode/postgres.index.bc
210+
211+
Similarly extensions can install code into
212+
$pkglibdir/bitcode/[extension]/
213+
accompanied by
214+
$pkglibdir/bitcode/[extension].index.bc
215+
216+
just alongside the actual library. An extension's index will be used
217+
to look up symbols when located in the corresponding shared
218+
library. Symbols that are used inside the extension, when inlined,
219+
will be first looked up in the main binary and then the extension's.
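
The directory layout described above can be summarized by a small
helper that derives both locations from a module name. This is a
sketch only: sketch_bitcode_paths() is a hypothetical function, and
the pkglibdir value passed in by a caller is whatever the
installation uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<pkglibdir>/bitcode/<module>/" and
 * "<pkglibdir>/bitcode/<module>.index.bc". Both output buffers must be
 * len bytes long. Returns false if either path would be truncated.
 */
bool sketch_bitcode_paths(const char *pkglibdir, const char *module,
                          char *dir, char *index, size_t len)
{
    int         n1;
    int         n2;

    assert(strlen(module) > 0);
    n1 = snprintf(dir, len, "%s/bitcode/%s/", pkglibdir, module);
    n2 = snprintf(index, len, "%s/bitcode/%s.index.bc", pkglibdir, module);
    return n1 >= 0 && n2 >= 0 && (size_t) n1 < len && (size_t) n2 < len;
}
```

Passing "postgres" as the module name yields the locations used for
the server's own bitcode; passing an extension's name yields the
locations for that extension.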
220+
221+
222+
Caching
223+
-------
224+
225+
Currently it is not yet possible to cache generated functions, even
226+
though that'd be desirable from a performance point of view. The
227+
problem is that the generated functions commonly contain pointers into
228+
per-execution memory. The expression evaluation functionality needs to
229+
be redesigned a bit to avoid that. Basically all per-execution memory
230+
needs to be referenced as an offset to one block of memory stored in
231+
an ExprState, rather than absolute pointers into memory.
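
The following sketch shows why offsets would make generated code
reusable. It is an illustration, not the planned design: SketchStep
and sketch_run_step() are hypothetical, and a "compiled" step is
reduced to a pair of offsets into a per-execution memory block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A step that refers to its operands by offset, not by address. */
typedef struct SketchStep
{
    size_t      arg_off;        /* where the argument lives */
    size_t      result_off;     /* where the result is stored */
} SketchStep;

/* Run the step against one execution's memory block: result = arg + 1. */
void sketch_run_step(const SketchStep *step, char *exec_mem)
{
    long        value;

    memcpy(&value, exec_mem + step->arg_off, sizeof(value));
    value += 1;
    memcpy(exec_mem + step->result_off, &value, sizeof(value));
}
```

The same SketchStep can be run against any number of memory blocks,
whereas a step holding absolute pointers would be tied to the one
execution whose memory it points into.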
232+
233+
Once that is addressed, adding an LRU cache that's keyed by the
234+
generated LLVM IR will allow the use of optimized functions even for
235+
shorter functions.
236+
237+
A longer term project is to move expression compilation to the planner
238+
stage, allowing to tie
239+
240+
What to JIT
241+
===========
242+
243+
Currently expression evaluation and tuple deforming are JITed. Those
244+
were chosen because they commonly are major CPU bottlenecks in
245+
analytics queries, but are by no means the only potentially beneficial cases.
246+
247+
For JITing to be beneficial a piece of code first and foremost has to
248+
be a CPU bottleneck. But also importantly, JITing can only be
249+
beneficial if overhead can be removed by doing so. E.g. in the tuple
250+
deforming case the knowledge about the number of columns and their
251+
types can remove a significant number of branches, and in the
252+
expression evaluation case a lot of indirect jumps/calls can be
253+
removed. If neither of these is the case, JITing is a waste of
254+
resources.
255+
256+
Future avenues for JITing are tuple sorting, COPY parsing/output
257+
generation, and later compiling larger parts of queries.
258+
259+
260+
When to JIT
261+
===========
262+
263+
Currently there are a number of GUCs that influence JITing:
264+
265+
- jit_above_cost = -1, 0-DBL_MAX - all queries with a higher total cost
266+
get JITed, *without* optimization (expensive part), corresponding to
267+
-O0. This commonly already results in significant speedups if
268+
expression/deforming is a bottleneck (removing dynamic branches
269+
mostly).
270+
- jit_optimize_above_cost = -1, 0-DBL_MAX - all queries with a higher total cost
271+
get JITed, *with* optimization (expensive part).
272+
- jit_inline_above_cost = -1, 0-DBL_MAX - inlining is tried if the query has
273+
a higher cost.
274+
275+
Whenever a query's total cost is above these limits, JITing is
276+
performed.
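
The threshold logic can be sketched as follows. The flag names and
the settings struct are hypothetical, and the sketch assumes that -1
disables the corresponding step and that optimization and inlining
are only considered once JITing itself is performed.

```c
#include <assert.h>

/* Hypothetical flags describing what to do for a query. */
enum
{
    SKETCH_JIT_PERFORM = 1 << 0,
    SKETCH_JIT_OPTIMIZE = 1 << 1,
    SKETCH_JIT_INLINE = 1 << 2
};

/* Stand-ins for the three cost GUCs; -1 is assumed to mean "disabled". */
typedef struct SketchJitSettings
{
    double      above_cost;
    double      optimize_above_cost;
    double      inline_above_cost;
} SketchJitSettings;

/* Decide what to do for a query with the given total cost. */
int sketch_jit_flags(const SketchJitSettings *s, double total_cost)
{
    int         flags = 0;

    assert(total_cost >= 0);
    if (s->above_cost < 0 || total_cost <= s->above_cost)
        return 0;

    flags |= SKETCH_JIT_PERFORM;
    if (s->optimize_above_cost >= 0 && total_cost > s->optimize_above_cost)
        flags |= SKETCH_JIT_OPTIMIZE;
    if (s->inline_above_cost >= 0 && total_cost > s->inline_above_cost)
        flags |= SKETCH_JIT_INLINE;
    return flags;
}
```

With illustrative limits of 100000, 500000 and 500000, a query
costing 200000 would be JITed without optimization or inlining, and
one costing 600000 would get all three.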
277+
278+
Alternative costing models, e.g. by generating separate paths for
279+
parts of a query with lower cpu_* costs, are also a possibility, but
280+
it's doubtful the overhead of doing so is sufficiently low. Another
281+
alternative would be to count the number of times individual
282+
expressions are estimated to be evaluated, and perform JITing of these
283+
individual expressions.
284+
285+
The obvious seeming approach of JITing expressions individually after
286+
a number of executions turns out not to work too well. Primarily
287+
because emitting many small functions individually has significant
288+
overhead. Secondarily because the time till JITing occurs causes
289+
relative slowdowns that eat into the gain of JIT compilation.
