Database Workload Performance Impact: Part 1 - Read-Write Ratio.

What impact does the distribution of read and write operations have on a database management system?

Does such an influence exist? Is it always similar? How does a specific database behave when the workload changes?

We did what we do best (and fastest): we benchmarked.

9 workload variations - 2 distributed cloud databases - 3 key insights.

Here are the results.

The Benchmark Setting

Every software system and its users create individual workload patterns on the database. But what influence do the workloads and the individual workload parameters have on the throughput and latency of the database?

At the beginning of this database benchmarking series, we vary the distribution of the read and write workload. This mimics a production system in which the features of the application or the access patterns of its users have changed.

What impact do such changes have on the performance of a production distributed database?

Our database benchmarking measurements provide reliable, meaningful results.

Database

We look at 2 of the most popular NoSQL databases:

MongoDB (Community Edition) and
Apache Cassandra (Community Edition)

in a distributed 3-node cluster with full replication.

We use the databases in their vanilla configuration, without adjusted settings or workload-specific tuning.

Workload-specific tuning would certainly increase performance, but would affect the comparability of this "baseline benchmarking."

Cloud

As resources, we use

3 identical VMs (m5.large: 2 vCPUs, 8 GB RAM) on AWS EC2,
each with 100 GB of GP2 storage,
and Ubuntu 20.04 as the operating system.

After each measurement, the VMs are shut down and restarted to obtain reliable and comparable results.

The benchmarking instance runs on a separate VM of type c4.2xlarge.

Benchmark

As the benchmark suite, we use the Yahoo! Cloud Serving Benchmark (YCSB) in version 0.17.0 with the following parameters.

Record size: 5 kB (10 fields of 500 bytes each)
Initial data volume: 10 GB
Read-write distribution: variable
Read access pattern: ZIPFIAN (this will play a role in the discussion)

The complete workload parameters can be taken from the YCSB command:

Command line: -db site.ycsb.db.CassandraCQLClient -s -p hosts=172.31.37.80,172.31.39.116,172.31.44.43 -p cassandra.keyspace=ycsb -p cassandra.writeconsistencylevel=ONE -p workload=site.ycsb.workloads.CoreWorkload -p maxexecutiontime=1800 -threads 64 -p recordcount=2000000 -p operationcount=10000000 -p fieldcount=10 -p fieldlength=500 -p requestdistribution=zipfian -p insertorder=ordered -p readproportion=0.1 -p updateproportion=0.0 -p insertproportion=0.9 -p scanproportion=0.0 -p maxscanlength=1000 -p scanlengthdistribution=uniform -p core_workload_insertion_retry_limit=3 -p core_workload_insertion_retry_interval=3 -p insertstart=2000001 -t YCSB Client 0.17.0

The most important parameter in our scenario is the read-write distribution (readproportion=X / insertproportion=1-X), which we vary in 10% steps

from 90% write / 10% read
to 10% write / 90% read.

The write operations are pure write operations, without update operations.
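The parameter sweep described above can be sketched as follows. This is a minimal illustration in Python, not the tooling used for the measurements; the helper names ratio_steps and ycsb_ratio_properties are hypothetical, while the property names are taken from the YCSB command shown earlier.

```python
def ratio_steps():
    """Yield (read_share, insert_share) pairs from 10/90 to 90/10 in 10% steps."""
    for read_percent in range(10, 100, 10):
        yield read_percent / 100, (100 - read_percent) / 100


def ycsb_ratio_properties(read_share, insert_share):
    """Return the YCSB -p arguments that define one read-write mix."""
    properties = {
        "readproportion": read_share,
        "insertproportion": insert_share,
        "updateproportion": 0.0,  # writes are pure inserts, no updates
        "scanproportion": 0.0,
    }
    args = []
    for key, value in properties.items():
        args += ["-p", f"{key}={value:.1f}"]
    return args


# One argument list per workload variation, starting with 90% write / 10% read.
workloads = [ycsb_ratio_properties(r, w) for r, w in ratio_steps()]
for args in workloads:
    print(" ".join(args))
```

The sketch only produces the four proportion properties; all other parameters (hosts, record count, threads and so on) would stay constant across the nine variations.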

Unfortunately, a 100% read workload is currently not possible due to a bug in YCSB. For the sake of symmetry, we therefore also omitted the 100% write variant, which bears little relation to real-world workloads anyway.

The workload is also dimensioned in such a way that pure in-memory reads should be excluded. Due to the size, the data volumes cannot be kept completely in the main memory.
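The sizing argument can be checked with a short back-of-the-envelope calculation. The sketch below uses the field and record counts from the YCSB command and assumes 8 GB of RAM per node, as listed for the m5.large VMs. It counts raw payload only and ignores keys, storage overhead, compression and the records inserted during the run, so it is a rough estimate rather than an exact footprint.

```python
FIELD_COUNT = 10          # -p fieldcount=10
FIELD_LENGTH = 500        # -p fieldlength=500 (bytes per field)
RECORD_COUNT = 2_000_000  # -p recordcount=2000000
RAM_PER_NODE_GB = 8       # assumption: RAM of one m5.large VM

record_size_bytes = FIELD_COUNT * FIELD_LENGTH
initial_data_gb = RECORD_COUNT * record_size_bytes / 1e9

# With full replication across the 3 nodes, every node holds a complete copy.
data_per_node_gb = initial_data_gb
fits_in_memory = data_per_node_gb <= RAM_PER_NODE_GB

print(f"record size: {record_size_bytes} bytes")
print(f"initial data per node: {data_per_node_gb:.1f} GB")
print(f"fits in RAM: {fits_in_memory}")
```

Because the operating system and the database process also need memory, the share of the data that can actually be cached is smaller than the raw comparison suggests.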

Benchmark Execution

The benchmarks were run fully automated on the benchANT benchmarking platform. Each configuration was measured three times and the results were aggregated at the end. The performance measurement under load ran for 20 minutes per benchmark run.

As performance KPIs, we consider

the Throughput,
the Read Latency (95% quantile) and
the Write Latency (95% quantile).
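To make the latency KPIs concrete, here is a minimal sketch of how a 95% quantile can be computed from raw latency samples and how the values of three runs might be combined. YCSB reports the 95th percentile latency itself; the nearest-rank method and the aggregation by mean and standard deviation are assumptions for illustration, since the article does not state how the runs were aggregated. The sample values are hypothetical and are not measured data.

```python
import math
import statistics


def percentile_nearest_rank(samples, percent):
    """Return the nearest-rank percentile of a list of latency samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(percent / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]


def aggregate_runs(values):
    """Combine one KPI from repeated runs into a mean and a spread."""
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }


# Hypothetical latencies in milliseconds, for illustration only.
run_latencies = [
    [1, 2, 2, 3, 3, 3, 4, 4, 5, 50],
    [1, 1, 2, 2, 3, 3, 4, 5, 6, 40],
    [2, 2, 2, 3, 3, 4, 4, 4, 5, 60],
]
p95_per_run = [percentile_nearest_rank(run, 95) for run in run_latencies]
summary = aggregate_runs(p95_per_run)
print(p95_per_run, summary)
```

The example also shows why a high quantile is used instead of the average: a single slow request dominates the 95% quantile of each run, while it would be diluted in a mean.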

MongoDB - Performance Results

MongoDB is certainly the best-known and most popular NoSQL database, especially for small applications. With its master-slave architecture, MongoDB is said to have strengths for read-heavy workloads.

For readability, we name the configurations using a notation such as "mongodb-w80-r20". This means an 80% write share and a 20% read share of the workload on the MongoDB database.

MongoDB Throughput

MongoDB shows only minor throughput changes with write-heavy and balanced read-write ratios. The deviations are only in the range of ± 5%.
However, throughput increases almost steadily with increasing read share, confirming the prevailing assessment of MongoDB as a read-oriented database.
The 90% read workload has a throughput that is over 50% higher than the 90% write workload.

MongoDB DB Throughput - RW-Ratio-Variation

MongoDB Latency

This tendency can also be seen in the write latency. This gets better and better (smaller value) the more the read share increases.
In the read extreme case (10% write and 90% read) MongoDB shows a clear increase in performance for both write and read latency.
However, even small increases in the write share lead to a significant deterioration in the read latency.
This effect can probably be attributed to in-memory read operations under the selected ZIPFIAN access distribution, in which a few popular records are requested far more often than the rest. MongoDB keeps these records in memory in the absence of new write operations. As the share of write operations increases, this caching no longer delivers the same benefit. Without this in-memory effect, a continuous increase in read latency can even be observed.
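Why a ZIPFIAN access pattern favours caching can be illustrated with a short calculation. The sketch below compares the share of requests that hit the most popular keys under a Zipfian and a uniform distribution. It assumes an exponent of 0.99, the default Zipfian constant in YCSB, and a fixed key space of 2,000,000 records; it ignores that YCSB scrambles which keys are popular and that the key space grows with every insert. The function names are hypothetical.

```python
def zipf_top_share(num_keys, top_fraction, exponent=0.99):
    """Share of requests that hit the most popular `top_fraction` of keys."""
    top_keys = max(1, int(num_keys * top_fraction))
    weights_total = 0.0
    weights_top = 0.0
    for rank in range(1, num_keys + 1):
        weight = 1.0 / rank ** exponent  # popularity falls with the rank
        weights_total += weight
        if rank <= top_keys:
            weights_top += weight
    return weights_top / weights_total


def uniform_top_share(num_keys, top_fraction):
    """Same share when every key is equally likely to be requested."""
    top_keys = max(1, int(num_keys * top_fraction))
    return top_keys / num_keys


NUM_KEYS = 2_000_000  # -p recordcount=2000000
for fraction in (0.01, 0.10):
    zipfian = zipf_top_share(NUM_KEYS, fraction)
    uniform = uniform_top_share(NUM_KEYS, fraction)
    print(f"top {fraction:.0%} of keys: zipfian {zipfian:.0%}, uniform {uniform:.0%}")
```

Under the Zipfian distribution, a small part of the data set receives a disproportionate share of the reads, so a cache that holds only this part can already serve many requests from memory.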

MongoDB DB Write Latency 95 - RW-Ratio-Variation

MongoDB DB Read Latency 95 - RW-Ratio-Variation

Apache Cassandra - Performance Results

Apache Cassandra is one of the most popular NoSQL databases and is a wide-column store. Apache Cassandra is said to have excellent scalability and availability, especially for write-heavy workloads.

Apache Cassandra Throughput

Cassandra's throughput shows a distinctly different pattern than MongoDB's.
Throughput decreases as the read share increases up to a 50/50 distribution. With a further increase in the read share, however, throughput rises again in a roughly symmetrical manner.
The peak throughput values are more than 30% higher than the minimum.
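The comparison between peak and minimum throughput is a relative difference with the minimum as the baseline. A minimal sketch of the arithmetic, using hypothetical throughput values that are not taken from the measurements:

```python
def relative_increase_percent(baseline, value):
    """Return how much `value` exceeds `baseline`, in percent of `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (value - baseline) * 100 / baseline


# Hypothetical throughput values in ops/s, chosen only to show the arithmetic.
minimum_ops = 10_000
peak_ops = 13_000
gain = relative_increase_percent(minimum_ops, peak_ops)
print(f"peak is {gain:.0f}% above the minimum")
```

The choice of baseline matters: the same two values give a smaller percentage when the drop is expressed relative to the peak instead.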

Apache Cassandra DB Throughput - RW-Ratio-Variation

Apache Cassandra Latency

The write latency shows the same pattern. Here, however, that means the write latency is lowest for the 50/50 distribution and increases toward the more extreme read-write distributions.
Exactly the opposite is true for the read latency: for both a small and a large read share, it is almost half as high as in the mixed case.
This is to be expected with a low read share. With a high read share, this can only be explained by in-memory caching, since the ZIPFIAN access pattern concentrates reads on individual, cached records.

Apache Cassandra DB Write Latency 95 - RW-Ratio-Variation

Apache Cassandra DB Read Latency 95 - RW-Ratio-Variation

Key Insights

1. The Read-Write Ratio of the Workload Has an Influence on Performance.

This finding should not surprise IT experts interested in performance: the read-write ratio is bound to have a significant impact on (non-optimised) distributed databases.

At peak, we find differences, both in throughput and latency, of up to 50% at the same load. This is a substantial difference.

In production systems, the load can shift due to the introduction of new features or a change in user behaviour, and this can have a significant impact on the performance of the database.

2. Every Database Responds Differently to a Variation in the Read-Write Ratio.

How a particular database behaves under a particular variation of the read-write ratio is highly specific to that database.

With MongoDB, one can see a performance plateau with only small throughput and latency changes across mixed workloads. With Apache Cassandra, each incremental change has a 10-15% impact, and the trend does not run in a single direction but has a surprising inflection point.

How does a distributed Couchbase, or your production database, behave?

Do you know the answer?

3. The Read-Write Ratio of the Workload is Only a Small Piece of the Performance Puzzle.

Read-write ratio appears on paper to be one of the most important workload parameters, along with query complexity and the number and size of records.

However, even with these measurement scenarios, one recognizes the influence of other parameters such as the access pattern. The chosen ZIPFIAN distribution leads to a better than expected read performance for Apache Cassandra, as it allows many in-memory read operations to be executed.

But how would Cassandra perform with a UNIFORM distribution? Or with larger data sets? Or with more/less resources? Or more complex DB queries?

Each workload parameter can have a direct impact on database performance, as can the DBMS configuration and underlying resources.

For more up-to-date benchmarking data on performance & scalability, see our MongoDB vs Apache Cassandra Study.

Conclusion

The benchmark workload is a central issue in performance measurements. A realistic model of the workload makes it possible to base reliable decisions for production systems on well-designed benchmark measurements.

The workload variation of the read-write ratio will be the start of a longer series of investigations. There are too many aspects that are still unknown here.

What are your thoughts on the findings so far?

With which workload investigations should we continue the series?

What benchmark measurements would help you?

About benchANT

benchANT is a spin-off from the University of Ulm, Germany. The two technical co-founders have more than 20 years of combined research experience with distributed systems and performance engineering, especially in the area of database management systems.

With the benchANT platform, they have developed an automated benchmarking tool that enables every IT architect and system/database administrator to quickly and efficiently run cloud database benchmarks and make decisions based on performance measurements.

In addition, benchANT also advises on on-prem vs. cloud decisions and helps with resource selection and performance optimization - always based on reliable performance measurements.

For a much more detailed analysis, see our MongoDB vs Apache Cassandra Study.