
CLUSTER: Improving PostgreSQL performance

03.2020

When designing a database application, the layout of data on disk is often neglected. However, the way data is clustered by PostgreSQL can have a major performance impact. Therefore, it makes sense to take a look at what can be done to improve speed and throughput. In this post you will learn one of the most important tricks.

PostgreSQL: To sort or not to sort

To demonstrate the importance of the on-disk layout I have created a simple test set:
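The setup can be reproduced with two statements along the following lines (t_random and its column generate_series match what a commenter uses further down; t_sorted is a name I am assuming for the ordered table):

-- 10 million rows, generated and stored in ascending order
CREATE TABLE t_sorted AS
    SELECT * FROM generate_series(1, 10000000);

-- the same 10 million rows, shuffled before they hit the disk
CREATE TABLE t_random AS
    SELECT * FROM generate_series(1, 10000000) ORDER BY random();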

Note that both data sets are absolutely identical. I have loaded 10 million rows into a simple table. However, in the first case the data has been sorted, then inserted. generate_series returns data in ascending order, and because the table is new, the data will be written to disk in that order.
In the second case I decided to shuffle the data before insertion. We are still talking about the same data set. However, it is not in the same order:
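A quick look at the first rows makes the difference visible (the values coming back from t_random will of course vary from run to run):

SELECT * FROM t_sorted LIMIT 5;   -- 1, 2, 3, 4, 5 in physical order
SELECT * FROM t_random LIMIT 5;   -- five arbitrary values, reflecting the shuffled insertion order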

In both cases the size on disk is the same. There are no changes in terms of space consumption, which can be an important factor as well.
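If you want to double-check that, pg_relation_size does the job:

SELECT pg_size_pretty(pg_relation_size('t_sorted')) AS sorted_size,
       pg_size_pretty(pg_relation_size('t_random')) AS random_size;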

Creating an index in PostgreSQL

Let us create an index on both tables:
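Roughly as follows (idx_random is the index name showing up in the comments further down; idx_sorted is an assumed counterpart for the ordered table):

CREATE INDEX idx_sorted ON t_sorted (generate_series);
CREATE INDEX idx_random ON t_random (generate_series);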

Even creating the index is already faster on sorted data for various reasons. However, creating initial indexes does not happen too often, so you should not worry too much.

In the next step we can already create optimizer statistics and make sure that all hint bits are set to ensure a fair performance comparison:
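A VACUUM ANALYZE on both tables achieves this, since VACUUM sets the hint bits and the visibility map while ANALYZE collects the optimizer statistics:

VACUUM ANALYZE t_sorted;
VACUUM ANALYZE t_random;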

Reading database blocks

Now that all the test data sets are in place we can run a simple test: Let us fetch 49000 rows from the sorted data set first:
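The query presumably looks like the one a commenter reproduces further down, just aimed at the sorted table; the numbers discussed next come from its EXPLAIN (analyze, buffers) output:

EXPLAIN (analyze, buffers)
SELECT *
FROM t_sorted
WHERE generate_series BETWEEN 1000 AND 50000;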

Not bad. We need 11.785 milliseconds to read the data. What is most important to consider here is that the number of 8k blocks needed is 138, which is not much. “shared hit” means that all the data has come from memory.

Let me run the same test again:
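That is, the same range query, this time against the shuffled table (this matches the statement quoted in the comment section below):

EXPLAIN (analyze, buffers)
SELECT *
FROM t_random
WHERE generate_series BETWEEN 1000 AND 50000;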

In this case the query took a bit longer: 13.4 ms. However, let us talk about the most important number here: the number of blocks needed to return this result. 18799 blocks. Wow. That is roughly 135 times more.

One could argue that the query is not really that much slower. This is true. However, in my example all data is coming from memory. Let us assume for a moment that data has to be read from disk because for some reason we get no cache hits. The situation would change dramatically. Let us assume that reading one block from disk takes 0.1 ms:

138 * 0.1 + 11.7 = 25.5 ms
vs.
18799 * 0.1 + 13.4 = 1893.3 ms

That is a major difference. This is why the number of blocks does make a difference – even if it might not appear to be the case at first glance. The lower your cache hit rates are, the bigger the problem will become.

There is one more aspect to consider in this example: if you want to read only a handful of rows, the on-disk layout does not make too much of a difference. However, if the subset of data contains thousands of rows, the way the data is ordered on disk does have an impact on performance.

CLUSTER: PostgreSQL comes to the rescue

The CLUSTER command was introduced many years ago to address exactly the issues I have just outlined. It allows you to organize data according to an index. Here is the syntax:
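From the PostgreSQL 12 documentation:

CLUSTER [VERBOSE] table_name [ USING index_name ]
CLUSTER [VERBOSE]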

URL: https://www.postgresql.org/docs/12/sql-cluster.html

Utilizing the CLUSTER command is easy. The following code snippet will show how you can do that:
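Given the table and index names used above, that boils down to a single statement:

-- rewrite t_random so that its physical order follows idx_random
CLUSTER t_random USING idx_random;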

To see what happens, I have executed the same query as before once again. However, there is something important to notice:

PostgreSQL has changed the execution plan. This happens due to stale statistics. Therefore, it is important to run ANALYZE to make sure that the optimizer has up-to-date information:
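For the freshly clustered table:

ANALYZE t_random;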

Once the new optimizer statistics are in place, the execution plan will be as expected again.

Maintaining order

If you have decided to cluster a table, it does NOT mean that the order on disk is maintained forever. If you run UPDATEs etc. frequently, the table might gradually lose its order again. Therefore, CLUSTER is especially useful if your data is rather static. It can also make sense to order data as you import it to ensure the desired physical order.
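As a side note, PostgreSQL remembers which index a table was last clustered on, so the operation can be repeated later without the USING clause, and a bare CLUSTER re-clusters every table that has been clustered before:

CLUSTER t_random;   -- re-cluster using the previously chosen index
CLUSTER;            -- re-cluster all previously clustered tables in the database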

Finally …

If you want to learn more about database performance and storage consider checking out my post about shrinking the storage footprint of PostgreSQL.


In order to receive regular updates on important changes in PostgreSQL, subscribe to our newsletter, or follow us on Facebook or LinkedIn.

5 responses to “CLUSTER: Improving PostgreSQL performance”

  1. Thanks for sharing! I like the idea in the docs to reduce the table fillfactor a little if the data is not static. It sounds like this could avoid requiring a re-cluster, or at least less often. Is it worth adding to the maintaining order section? Or is this a bad idea? Thanks again!

  2. I don't get it. If I am understanding it correctly, the last example is actually the worst performing of the whole article: 15 ms, when before CLUSTERing it was 13ms.

  3. Getting columns that are not part of the index would be a better way to demonstrate the advantages of clustering a table.

  4. Thanks for this great article !

    But why is the Buffers count so high on the index idx_random in the first place - since the DB is using an "Index Only Scan"?

    According to the documentation (postgres.org) only index pages are used in this case:
    "To solve this performance problem, PostgreSQL supports index-only scans, which can answer queries from an index alone without any heap access."

    In my case I got the following execution plan:

    user1db=> explain (analyze, buffers) SELECT *
    FROM t_random
    WHERE generate_series BETWEEN 1000 AND 50000;
    QUERY PLAN
    -------------------------------------------------------------------------------------------------------------------------------------
    Index Only Scan using idx_random on t_random (cost=0.43..1690.11 rows=50684 width=4) (actual time=0.022..9.955 rows=49001 loops=1)
    Index Cond: ((generate_series >= 1000) AND (generate_series <= 50000))
    Heap Fetches: 0
    Buffers: shared hit=18859 read=135
    Planning:
    Buffers: shared hit=19 read=3
    Planning Time: 0.344 ms
    Execution Time: 12.337 ms
    (8 rows)

    So 18859 buffers were accessed; the whole index has about 27323 leaf pages.

    # SELECT leaf_pages FROM pgstatindex('idx_random');
    leaf_pages
    ------------
    27323
    (1 row)

    So for reading about 50000 values (out of 10000000 values), most of the leaf nodes were read.

    Could you perhaps shed some light on that?

    Thanks

    • A very observant question!
      These buffers are not index buffers, but buffers from the visibility map.
      When PostgreSQL performs an index-only scan, it checks the visibility map if the block is all-visible so that it can skip the expensive row fetch.
      Now if the table is clustered, the next row is probably in the same block as the previous one, and the same visibility map entry applies.
      If the table is not clustered, PostgreSQL will often have to fetch a different visibility map entry to check the all-visible flag.
      The "buffers" statistics don't take duplicate access into account, so if PostgreSQL switches between the same visibility map buffers repeatedly, it will count them repeatedly.
      That is the reason for the high number.
      You can see that the execution time is not greatly affected, because all these buffers are cached in RAM.
