benchANT Homepage

Performance Analysis: Apache Cassandra 4.0.0 Release

The Apache Cassandra project has published a new major release of its popular open-source, column-oriented NoSQL database management system (DBMS), the first in more than 6 years.

With this release, the project announces major improvements in quality and performance.

How fast is the new Cassandra 4.0.0? And how does it perform against the previous versions?

We ran benchmark measurements of throughput and latency.

Read our performance analysis!

Apache Cassandra Releases

Benchmarking Goals

  • Performance analysis of the current major release of Apache Cassandra 4.0.0
  • The Release Notes highlight the following improvements in addition to over 1000 bug fixes:
    • Up to 5x faster scaling at runtime (elasticity)
    • Auditing features
    • Improved stability
    • Improved data correctness
    • Privacy by Design
    • Improved performance
  • How does the performance (throughput & latency) compare to the previous stable releases 3.11.11, 3.0.25 and 2.2.19 based on a specific IoT use case?

Disclaimer

  • The benchmark results presented in the following are based on a scientifically established methodology, but only refer to a concrete use case with a specific database workload and are not generally valid.
  • In particular, the following parameters must be taken into account in order to be able to apply the results to other use cases:
    • On which (cloud) resources is Apache Cassandra run?
    • Which cluster size, replication factor and consistency parameters are chosen?
    • What other adjustments will be made in cassandra.yaml?
    • What workload is applied to the database (query types, read/write ratio, intensity, ...)?
  • All results were created with the benchANT platform and can be reproduced.

For more up-to-date benchmarking data on performance & scalability, see our MongoDB vs Apache Cassandra Study.

Key Findings

  • Cassandra 4.0.0 achieves the highest benchANT score of 14, followed by Cassandra 3.11.11 (13).
  • Cassandra 4.0.0 achieves 61% higher throughput compared to Cassandra 3.11.11.
  • Cassandra 4.0.0 achieves an 82% lower average READ latency compared to Cassandra 3.11.11.
  • Cassandra 4.0.0 shows a degraded average WRITE latency: at 7.6 ms it is roughly 210% of the Cassandra 3.11.11 value (3.6 ms), i.e. an increase of about 111%.
  • The older versions 2.2.19 and 3.0.25 differ only marginally in their performance from the 3.11.11 version.
benchANT Scores Apache Cassandra 4.0

Benchmarking Setup

Cloud Deployment

  • Cloud Provider: AWS EC2
  • VM Type: m5.large
  • Storage: GP2 (SSD)
  • OS: Ubuntu 20.04
  • Java: OpenJDK 8

DB Cluster Configuration

  • Cluster Size: 3 nodes
  • Replication Factor: 3
  • WRITE Consistency: ONE
  • READ Consistency: ONE
  • Memory Allocation: 75% of the available memory (6 GB)
  • Otherwise: "vanilla" (default) cassandra.yaml configuration
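
With a replication factor of 3 and consistency level ONE for both reads and writes, a read is not guaranteed to reach a replica that has already received the latest write. A minimal sketch of the usual overlap rule (read replicas + write replicas > replication factor) follows; the helper names are purely illustrative and are not part of any Cassandra API or driver.

```python
# Illustrative helpers only; these names do not exist in any Cassandra driver.

def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must acknowledge a request at a given level."""
    level = level.upper()
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {level}")


def reads_see_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when every read set overlaps every write set (R + W > RF)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


# Setup used in this benchmark: RF 3, READ ONE, WRITE ONE.
print(reads_see_latest_write("ONE", "ONE", 3))        # False
print(reads_see_latest_write("QUORUM", "QUORUM", 3))  # True
```

ONE/ONE trades consistency guarantees for lower latency and higher throughput, which is worth keeping in mind when comparing these results with benchmarks that use stricter consistency levels.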

Benchmark Configuration

  • Yahoo Cloud Serving Benchmark (YCSB)
  • Version 0.17.0
  • Workload: benchANT IoT workload
  • WRITE Ratio: 80%
  • READ Ratio: 20%
  • Initial data records: 2,000,000
  • Data record size: 5 KB
  • Threads: 64
  • Runtime: 30 minutes
  • Runs: 3
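
To get a feeling for the data volume this configuration implies, the following back-of-the-envelope sketch multiplies out the numbers above. It assumes that 5 KB means 5,000 bytes and ignores compression, storage overhead, and any records added during the run, so the figures are rough estimates only.

```python
# Values taken from the cluster and benchmark configuration above.
RECORDS = 2_000_000
RECORD_SIZE_BYTES = 5_000   # assumption: "5 KB" read as 5,000 bytes
REPLICATION_FACTOR = 3
NODES = 3

raw_bytes = RECORDS * RECORD_SIZE_BYTES               # one copy of the data set
replicated_bytes = raw_bytes * REPLICATION_FACTOR     # all replicas together
per_node_bytes = replicated_bytes / NODES             # average share per node

print(f"raw data set:       {raw_bytes / 1e9:.1f} GB")
print(f"with replication:   {replicated_bytes / 1e9:.1f} GB")
print(f"per node (average): {per_node_bytes / 1e9:.1f} GB")
```

Because the replication factor equals the cluster size, each node holds a full copy of the initial data set under these assumptions.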

Throughput

  • Cassandra 4.0.0 achieves by far the highest throughput with an average of 7205 ops/s.
  • The other Cassandra versions are very uniform in the range from 4507 ops/s (3.11.11) to 4671 ops/s (2.2.19).
  • All Cassandra releases examined achieve extremely stable results over the 3 benchmark measurement runs per configuration. The maximum coefficient of variation of Apache Cassandra 4.0.0 is only 1.3%.
Cassandra 4.0 throughput benchmarks
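
The coefficient of variation mentioned above relates the spread of the runs to their mean. A minimal sketch of the calculation, assuming the sample standard deviation is used (the article does not say which variant was applied) and using made-up run values rather than the measured data:

```python
from statistics import mean, stdev


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean, as a percentage."""
    return stdev(samples) / mean(samples) * 100


# Hypothetical throughput values (ops/s) for three runs; NOT the measured data.
example_runs = [7100.0, 7200.0, 7300.0]
cv = coefficient_of_variation(example_runs)
print(f"CV = {cv:.1f}%")
```

A low value means the individual runs lie close to their average, which is what makes the averaged throughput figures meaningful.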

WRITE Latency

  • The average WRITE latency of Cassandra 4.0.0 (7.6 ms) is more than 2x that of Cassandra 3.11.11 (3.6 ms).
  • The WRITE latency of 95% of the requests is below 13.1 ms for Cassandra 4.0.0 and below 7.8 ms for Cassandra 3.11.11. This corresponds to a deterioration of 67%.
  • The WRITE latency of 99% of the requests is below 75.9 ms for Cassandra 4.0.0 and below 29.7 ms for Cassandra 3.11.11. This corresponds to a deterioration of 155%. The best result for the WRITE latency (99th percentile) is achieved by Cassandra 3.0.25 with 19.0 ms.
Average Write Latency Benchmarks Apache Cassandra
Write Latency Benchmarks Apache Cassandra - 95th percentile
Write Latency Benchmarks Apache Cassandra - 99th percentile
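
The percentage figures in this section are relative changes against the Cassandra 3.11.11 baseline. A small sketch of that arithmetic, using only the rounded latency values quoted above (the function name is illustrative):

```python
def relative_change(baseline: float, new: float) -> float:
    """Change from baseline to new, as a percentage of the baseline."""
    return (new - baseline) / baseline * 100


# WRITE latencies in ms as quoted above: (Cassandra 3.11.11, Cassandra 4.0.0)
write_latency_ms = {
    "average": (3.6, 7.6),
    "p95": (7.8, 13.1),
    "p99": (29.7, 75.9),
}

for metric, (v3, v4) in write_latency_ms.items():
    print(f"{metric}: {relative_change(v3, v4):+.1f}%")
```

Note that a latency of about 2.1x the baseline corresponds to an increase of about 111%, not 210%. Small deviations from the percentages in the text are to be expected, since the published latencies are rounded.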

READ Latency

  • The average READ latency of Cassandra 4.0.0 (12 ms) is roughly 4.7x lower than that of Cassandra 3.11.11 (56.8 ms).
  • The READ latency of 95% of the requests is below 34.4 ms with Cassandra 4.0.0 and below 187.1 ms with Cassandra 3.11.11. This corresponds to an improvement of 81%.
  • The READ latency of 99% of the requests is below 95.0 ms for Cassandra 4.0.0 and below 331.9 ms for Cassandra 3.11.11. This corresponds to an improvement of 71%.
Average Read-Latency Benchmarks Apache Cassandra
Read latency benchmarks Apache Cassandra - 95th percentile
Read latency benchmarks Apache Cassandra - 99th percentile
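
Statements such as "95% of the requests are below X ms" describe latency percentiles. The following sketch shows one common way to compute them, the nearest-rank method, on made-up samples. It is meant as an illustration only; the benchmark tooling may bucket or approximate latencies differently.

```python
import math


def percentile_nearest_rank(samples, p):
    """Smallest sample such that at least p percent of all samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical latencies in ms (1.0 .. 100.0); NOT measured data.
latencies_ms = [float(i) for i in range(1, 101)]
print(percentile_nearest_rank(latencies_ms, 95))  # 95.0
print(percentile_nearest_rank(latencies_ms, 99))  # 99.0
```

High percentiles capture the slow tail of the distribution, which is why the 99th percentile can differ far more between versions than the average does.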

Performance Discussion

  • Cassandra 4.0.0 keeps the performance promises for throughput and READ latency and offers significantly better performance in these areas compared to the previous version 3.11.11.
  • However, this does not apply to WRITE latency, which on average doubles with the update to Cassandra 4.0.0. The deterioration is also clearly evident in the 95th and 99th percentiles.
  • The previous versions hardly differ from one another in performance, whereas the difference between them and Cassandra 4.0.0 is very pronounced for all 3 performance indicators.
  • READ-intensive workloads clearly benefit from the new Cassandra 4.0.0 version.
  • For WRITE-intensive workloads where latency is a critical service-level objective, a version update may pose a performance risk.
  • In general, our results are consistent with the results of the ScyllaDB blog post in terms of improved throughput and READ latency from Cassandra 3.11.11 to Cassandra 4.0.0.
    • The ScyllaDB results also show improved WRITE latency when Cassandra 4.0.0 is under heavy load.
  • However, the results cannot be compared 1:1, as the setup of ScyllaDB differs from ours in many ways. The most important differences are:
    • VM Type: significantly larger i3.4xlarge (16 vCPUs and 122 GiB)
    • Cloud Storage: faster NVMe with RAID 0
    • Java: Java 16 for Cassandra 4.0.0 (classified as "experimental" by the Cassandra developers)
    • Benchmark: cassandra-stress as benchmark and different workloads (100% read, 100% write, 50% read / 50% write mix)
    • Consistency: QUORUM
  • benchANT will soon also add cassandra-stress to the benchmark portfolio and will then enable the automated reproduction and extension of these results.

Next Steps

  • Our measurements cover only one individual use case. There are other possible benchmark scenarios:
    • Further YCSB workloads (read-heavy, read-only,...)
    • Further benchmarks (cassandra-stress, TSBS, ...)
    • Further cluster sizes, replication factors, consistency configurations
    • Further cloud providers and cloud resource sizes
    • Fine-grained adaptations of cassandra.yaml
    • Investigation of newer Java versions up to Java 16
  • You can benchmark all these scenarios independently with our benchANT platform.

Conclusion

The announced performance improvements can be partially confirmed for a selected use case with a specific IoT workload.

However, if you plan to update to Cassandra 4.0.0, you should run performance measurements for your specific workload, especially if you have a lot of write operations and WRITE latency is an important performance metric for you.

Contact us with any questions, problems or suggestions.