PostgreSQL 16: A Performance Peek
Very recently, the latest major version of one of the most popular database management systems got released: PostgreSQL v16.
The new release comes with many new features and improvements that also include performance optimizations for specific query types. Hence, the question arises which performance improvements to expect for standard workloads and whether there are even performance regressions?
We ran some benchmarks. And here are the results!
Table of Contents
- Chapter 1: Key Insights
- Chapter 2: PostgreSQL 16 Main Updates
- Chapter 3: Performance Results
- Chapter 4: Summary
- Appendix
Key Insights - Performance Results PostgreSQL v16
The latest PostgreSQL v16 release comes with performance improvements for specific query types. We run a set of four open-source benchmarks against PostgreSQL v15.4 and 16.0 to analyse performance improvements and performance regressions for standard workloads. The results in a nutshell:
- The throughput results show nearly no changes from 15.4 to 16.0.
- The read latency results show nearly no changes from 15.4 to 16.0.
- The read-heavy CRUD workload shows an improvement of the update latency of about 40%.
- PostgreSQL comes with new features and does not show any performance regressions which is not necessarily the case for new major database releases.
- The results do not show any significant performance improvements for the standard benchmarks, performance improvements for the specific query types are out of scope of this analysis.
- Before upgrading to PostgreSQL v16, we recommend application-specific performance measurements to validate that there are no performance regressions in your case.
PostgreSQL v16 - Main Updates
The following bullet points summarize the major announced changes for PostgreSQL v16.
Performance Improvements
- Query planner parallelization for FULL and RIGHT JOINS
- Faster DISTINCT and ORDER BY
- RIGHT OUTER JOINS improvements
- COPY commands up to 300% faster
- Client load balancing improvements
- CPU acceleration for Ascii und JSON strings
Further Updates
- Logical replication: streaming between PostgreSQL instances
- Further SQL/JSON commands: JSON_ARRAY(), JSON_ARRAYAGG(), IS JSON
- New PSQL specific commands
- Granular I/O metric monitoring
- Control and security improvements
The full changelog can be found in the official release notes.
Performance Results of PostgreSQL v16
The main goal of our performance measurements was the verification that there are no performance regressions for standard workloads between the new PostgreSQL v16.0 and the former version PostgreSQL v15.4.
First performance benchmarks comparing v15 and v16 were already done by Mark Callaghan. He uses the open source Sysbench benchmarking suite to run a set of micro-benchmarks against several recent versions of PostgreSQL. The results are available in two articles released on Mark's blog:
Mark Callaghan's main insights are, that
- For small servers (8 cores, 16G RAM NVMe SSD) insert queries for PostgreSQL 16.0 are ~5% lower compared to previous versions, otherwise there are neither significant improvements nor regressions.
- For medium servers (virtual machine with 15 cores, hyperthreads disabled, 120GB of RAM, NVMe SSD) PostgreSQL v16.0 is faster than v15.2. Improvements are visible for read and write operations, but more dominant for read operations.
We are following a similar approach using open-source benchmarking suites to generate an approved workload. We run the measurements with our approved benchANT framework for the following workloads:
- YCSB – A: CRUD workload with 95% read and 5% update queries
- YCSB – B: CRUD workload with 50% read and 50% update queries
- TPC-C: OLTP workload
- TPC-H: OLAP workload
Explicitly, we did not focus on the improved query types, but used standard workloads instead. The workloads are run without throughput limitation to focus on ops/s rather than latency under a stable throughput. The full workload specifications can be found in the corresponding appendix section.
The performance measurement runs were executed on the cloud infrastructure of the Open Telekom Cloud (OTC). We used a s3.2xlarge.4 Virtual Machine type (8vCores / 32GB RAM) with 200GB ultra-high I/O in a single-node environment. Even though it does not precisely fall into Mark Callaghan's small / medium categories, it is rather on the small end of the spectrum.
PostgreSQL's configurations were designed with PGTune.
All measurement scenarios were executed three times to reduce variances coming from cloud resources. The full benchmarking specifications can be found in the corresponding appendix section.
Throughput Results – PostgreSQL v16.0 vs PostgreSQL v15.4
- The YCSB measurements show slightly improved operations per second (~ 1%) for the PostgreSQL 16.
- The TPC-C results of PostgreSQL 16 show a decline of 4% of the transactions per second.
- The TPC-H results show also a small decline (~1%) of the new PostgreSQL 16 compared to the last stable PostgreSQL 15.4 version.
- All results show increases/declines which are in the range of measurement tolerance, especially on cloud resources.
Note: Please acknowledge the different result metrics for the different workload types: YCSB: operations per second, TPC-C: transactions per second, TPC-H: transactions per hour
Read Latency Results 95th Percentile – PostgreSQL v16.0 vs PostgreSQL v15.4
- The YCSB A results show slightly improved latency (lower is better!) for PostgreSQL 16, while the read latency for the YCSB B measurements are nearly identical.
- For the TPC-C, the measurement results for PostgreSQL 16 shows an increase of the read latency of about 4%.
- The TPC-H results show nearly no differences between the PostgreSQL 16 and PostgreSQL 15.4 versions.
Note: The TPC-C and TPC-H latency results are not solely for read operations, but for whole transactions. Latency results are given for the 95th percentile.
Write/Update Latency Results 95th Percentile – PostgreSQL v16.0 vs PostgreSQL v15.4
- The results for the update latency of the YCSB A (read-heavy) measurements show a massive improvement of the update latency (~ 40%).
- While the results for the YCSB B measurements are nearly identical.
Note: TPC-C and TPC-H do not separate read and update latency due to transactional bundling of the single operations.
Summary
In summary, the performance results verify earlier results gained by Mark Callaghan and indicate that there is no performance regression from version 15.4 to 16.0 for standard workloads. In addition, there are no significant performance improvements for standard workloads for the given server size.
The results suggest that for small servers and standard workload sizes migrating to PostgreSQL should not cause any harm from the performance side. For different and special workloads, we recommend to do specific benchmark before upgrading to newer database versions.
Future work from our side includes running benchmarks with a focus on improvements promised by the PostgreSQL 16 release notes. This will be subject to a larger study. Also, taking Mark Callaghan's results into account, it might be beneficial to look at the behaviour for larger servers.
Appendix – Benchmarking Specification
All measurement results can be found in our GitHub repository including all applied configs and a full audit of the benchmarking runs.
DBMS Configuration | V15 | V16 |
---|---|---|
DBMS | PostgreSQL | PostgreSQL |
Version | 15.4 | 16.0 |
Cluster Type | single node | single node |
Cloud Configuration | DBMS Instance | Benchmark Instance |
---|---|---|
Cloud | OTC | OTC |
Region | de-de | de-de |
Instance Type | s3.2xlarge.4 | s3.4xlarge.2 |
Instance Specs | 8 vCores/32 GB RAM | 16 vCores/32 GB RAM |
Storage Type | Ultra-high I/O | Common I/O |
Storage Size | 200 GB | 50GB GB |
OS | Ubuntu 20.04 | Ubuntu 20.04 |
Appendix – Workload & Database Specification
Workload 1 & 2: YCSB Read-heavy/Read-Update
PostgreSQL Configuration | Parameter |
---|---|
shared_buffers | 8GB |
effective_cache_size | 24GB |
maintenance_work_mem | 2GB |
checkpoint_completion_target | 0.9 |
wal_buffers | 16MB |
default_statistics_target | 100 |
random_page_cost | 1.1 |
effective_io_concurrency | 200 |
work_mem | 41943kB |
min_wal_size | 1GB |
max_wal_size | 4GB |
max_worker_processes | 8 |
max_parallel_workers_per_gather | 4 |
max_parallel_workers | 8 |
max_parallel_maintenance_workers | 4 |
YCSB Configuration | read-heavy | read-update |
---|---|---|
version | 0.17.0 | 0.17.0 |
runtime | 30m | 30m |
threads | 50 | 50 |
initial data | 50 GB | 50GB |
record size | 1 KB | 1 KB |
read proportion | 95% | 50% |
update | 5% | 50% |
request distribution | zipfian | zipfian |
Workload 3 (TPC-C) & 4 (TPC-H) Configuration
BenchBase Configuration | TPC-C | TPC-H |
---|---|---|
Version | master | master |
Runtime | 30m | 30m |
Terminals | 20 | 2 |
Scale Factor | 100 | 5 |
PostgreSQL Configuration | Parameter |
---|---|
shared_buffers | 8GB |
effective_cache_size | 24GB |
maintenance_work_mem | 2GB |
checkpoint_completion_target | 0.9 |
wal_buffers | 16MB |
default_statistics_target | 100 |
random_page_cost | 1.1 |
effective_io_concurrency | 200 |
work_mem | 41943kB |
min_wal_size | 2GB |
max_wal_size | 8GB |
max_worker_processes | 8 |
max_parallel_workers_per_gather | 4 |
max_parallel_workers | 8 |
max_parallel_maintenance_workers | 4 |
PostgreSQL Configuration | Parameter |
---|---|
shared_buffers | 8GB |
effective_cache_size | 24GB |
maintenance_work_mem | 2GB |
checkpoint_completion_target | 0.9 |
wal_buffers | 16MB |
default_statistics_target | 500 |
random_page_cost | 1.1 |
effective_io_concurrency | 200 |
work_mem | 20971kB |
min_wal_size | 4GB |
max_wal_size | 16GB |
max_worker_processes | 8 |
max_parallel_workers_per_gather | 4 |
max_parallel_workers | 8 |
max_parallel_maintenance_workers | 4 |