Cloud Database Benchmarking

The original database benchmarking of the 1990s and 2000s was limited to on-premise servers with single-node databases.

Cloud database benchmarking, on the other hand, looks at modern multi-node databases on distributed systems such as the public cloud, but also in private cloud environments.

It is more complex and multi-layered.

As a result, it is unfortunately no longer as popular.

But this must and will change in view of the coming data age.

What is Cloud Database Benchmarking?

Cloud database benchmarking adapts classic database benchmarking to distributed databases with multiple instances running on elastic resources such as cloud infrastructures.

The technological advancement of modern databases in the 2010s and the success story of cloud computing brought classical database benchmarking solutions to their performance limits. Adapted benchmarks and benchmarking tools became necessary, but were not easy to implement due to technological heterogeneity.

benchANT is the first software product that enables multi-cloud, multi-database and multi-benchmark cloud database benchmarking.

What are the Challenges of Cloud Database Benchmarking?

1. Technological Heterogeneity

For data-driven software applications, a wide variety of NoSQL and NewSQL databases are now being considered in addition to the classic relational databases. These databases differ greatly from each other technologically and do not offer a common data model like the family of SQL databases.

They differ not only in their data model and non-functional properties such as performance or consistency, but also in their interfaces, configuration options and drivers.

The same heterogeneity applies to the various public cloud offerings, which span a wide range of IaaS solutions from OpenStack-based offerings to proprietary platforms. Hybrid and private cloud solutions add to this diversity.

The challenge of cloud database benchmarking is not only the immense number of possible combinations of database technologies and cloud resources, but also running the same workload on sometimes very different technologies for a comparable end result.
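
One common way to run the same workload against very different technologies is to hide each database behind a small adapter interface. The following is a minimal sketch under that assumption, not a description of any particular tool: `DatabaseAdapter`, `InMemoryAdapter` and `run_workload` are hypothetical names, and the in-memory adapter merely stands in for a real driver.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class DatabaseAdapter(ABC):
    """Hypothetical common interface that hides driver differences."""

    @abstractmethod
    def insert(self, key: str, value: dict) -> None:
        ...

    @abstractmethod
    def read(self, key: str) -> Optional[dict]:
        ...


class InMemoryAdapter(DatabaseAdapter):
    """Stand-in for a real driver; keeps records in a plain dict."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def insert(self, key: str, value: dict) -> None:
        self._rows[key] = value

    def read(self, key: str) -> Optional[dict]:
        return self._rows.get(key)


def run_workload(db: DatabaseAdapter, n_records: int) -> int:
    """Run the identical insert-then-read workload; return successful reads."""
    for i in range(n_records):
        db.insert(f"user{i}", {"field0": i})
    return sum(1 for i in range(n_records) if db.read(f"user{i}") is not None)


successful_reads = run_workload(InMemoryAdapter(), 100)
```

Each additional database would get its own adapter class, while `run_workload` stays unchanged, which is what keeps the results comparable.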

2. Elastic Resources - Different Performance (The Cloud Impact)

In particular, the impact of the cloud infrastructure on the performance of a database setup is often completely underestimated.

Scientific research has shown significant differences between different cloud offerings and unexpected influences on the performance of distributed database systems. See Insight: AWS EC2 vs Open Telekom Cloud vs IONOS.

Consequently, not all clouds are the same.

Every cloud offering has strengths and weaknesses, which also differ depending on the database technology used.

A performance statement about a database or a cloud in isolation is not possible. The entire cloud database setup must always be considered and benchmarked in order to make reliable, meaningful decisions.

3. Millions of Possible Combinations

Whereas in the past a few databases were benchmarked on a dozen servers with similar workloads, the number of possible cloud database setups today runs to several hundred million.

Over 500 industry-ready database technologies, with numerous individual configuration options, combined with the numerous resources of the cloud providers make a complete benchmarking of all combinations impossible.

Exhaustive benchmarking would also make little sense given today's far more diverse workloads, from IoT to eCommerce, because only the individual workload pattern is relevant.
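
The combinatorial growth can be illustrated with a short calculation. Only the figure of 500 database technologies comes from the text above; every other dimension size and the one-hour run time are assumptions chosen purely for illustration.

```python
import math

# Size of each dimension of the setup space.
# Only "databases" is taken from the article; the rest are illustrative assumptions.
dimensions = {
    "databases": 500,
    "config_variants_per_database": 20,
    "cloud_providers": 10,
    "vm_types_per_provider": 50,
    "cluster_sizes": 5,
    "storage_types": 4,
}

# Every choice in one dimension can be combined with every choice in the others.
total_setups = math.prod(dimensions.values())

# Assumed duration of a single benchmark run, in hours.
hours_per_run = 1
years_sequential = total_setups * hours_per_run / (24 * 365)

print(f"{total_setups:,} possible setups")
print(f"about {years_sequential:,.0f} years if measured one after another")
```

Even with these modest assumptions the product reaches nine digits, which is why pre-selection guided by the individual workload is needed.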

4. Dynamic Cloud and Database Development

The world of cloud computing and modern databases is spinning rapidly.

Every month there are new cloud resource offerings, new DBMS versions with new features and configuration options.

These changes must be taken into account in benchmarking and the measurement results must be kept up to date. This requires continuous benchmarking.
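
A continuous benchmarking loop needs some rule for deciding when a stored result is out of date. The sketch below shows one possible rule, assuming each result records the DBMS version and the measurement date; the function name `needs_rerun`, the dictionary keys, the version strings and the 30-day limit are all hypothetical.

```python
from datetime import date, timedelta


def needs_rerun(result, current_dbms_version, today, max_age_days=30):
    """Return True if a stored result no longer reflects the current setup."""
    # A new DBMS release may change performance, so the old numbers are stale.
    outdated_version = result["dbms_version"] != current_dbms_version
    # Cloud offerings change over time, so results also expire by age.
    too_old = (today - result["measured_on"]) > timedelta(days=max_age_days)
    return outdated_version or too_old


stored = {"dbms_version": "4.1", "measured_on": date(2024, 1, 10)}

fresh = needs_rerun(stored, "4.1", today=date(2024, 1, 20))
after_upgrade = needs_rerun(stored, "4.2", today=date(2024, 1, 20))
after_two_months = needs_rerun(stored, "4.1", today=date(2024, 3, 10))
```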

Why is Cloud Database Benchmarking Important?

In short: without reliable, independent cloud database benchmarking, no efficient and risk-conscious IT decision is possible.

Technological diversity, constant innovation and growing data volumes are increasingly putting database management systems at the centre of software applications. The right IT decisions make the difference between success and technical failure.

Benchmarking achieves:

IT alignment: The right technology for the right purpose
Right sizing: Efficient cost and load management
Forward planning: Scalable and future-proof solutions for upcoming challenges

Especially in public cloud infrastructures, the potential of cloud database benchmarking is enormous: it reduces technical risks and costs and delivers more efficient solutions in terms of performance and scalability.

The Cloud Database Benchmarking Process

The cloud database benchmarking process follows the classic [benchmarking process][blog/database-benchmarking]. However, it differs in the specific details.

1. Framework Conditions & Goals

Besides performance and costs, there are other possible benchmarking objectives such as scalability, availability and elasticity KPIs.
Workload distribution needs to be analysed not only temporally but also spatially across regions.
Many modern applications such as IoT, AI or eCommerce are primarily at home in the cloud. Each application has its own specific workload that needs to be considered and modelled.
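
The points above suggest capturing each application workload in an explicit model before any measurement starts. The following sketch is one possible shape for such a model; the class `WorkloadModel`, its fields, the region names and the example values are hypothetical and only illustrate the temporal and spatial dimensions mentioned above.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class WorkloadModel:
    """Hypothetical description of one application workload."""

    name: str
    read_ratio: float                       # share of reads; writes are 1 - read_ratio
    hourly_ops_per_sec: Tuple[int, ...]     # temporal profile, one value per hour of the day
    region_share: Dict[str, float] = field(default_factory=dict)  # spatial profile

    def validate(self) -> None:
        """Raise ValueError if the model is internally inconsistent."""
        if not 0.0 <= self.read_ratio <= 1.0:
            raise ValueError("read_ratio must be between 0 and 1")
        if len(self.hourly_ops_per_sec) != 24:
            raise ValueError("expected 24 hourly load values")
        if abs(sum(self.region_share.values()) - 1.0) > 1e-9:
            raise ValueError("region shares must sum to 1")

    def peak_ops_per_sec(self) -> int:
        """Highest hourly load, useful as a sizing target."""
        return max(self.hourly_ops_per_sec)


# Example: a write-heavy IoT ingest workload with a daytime peak (values are made up).
iot = WorkloadModel(
    name="iot-ingest",
    read_ratio=0.1,
    hourly_ops_per_sec=tuple([2000] * 8 + [10000] * 12 + [4000] * 4),
    region_share={"eu-central": 0.7, "us-east": 0.3},
)
iot.validate()
```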

2. Identify Entities

There are many more offerings, especially in the cloud IaaS space, that need to be considered and pre-selected.
Because database and infrastructure have to be considered together, there is a large number of possible setup configurations.
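
Enumerating and pre-selecting setups can be sketched as a Cartesian product followed by a filter. The database names, VM sizes, vCPU counts, cluster sizes and the vCPU budget below are illustrative assumptions; the cloud names are the ones mentioned earlier in this article.

```python
from itertools import product

databases = ["PostgreSQL", "Cassandra", "MongoDB"]
clouds = ["AWS EC2", "Open Telekom Cloud", "IONOS"]
vm_sizes = {"small": 2, "medium": 4, "large": 8}  # assumed vCPUs per VM
cluster_sizes = [1, 3, 5]


def candidate_setups(max_total_vcpus):
    """Return all database/cloud/VM/cluster combinations within a vCPU budget."""
    setups = []
    for db, cloud, (vm, vcpus), nodes in product(
        databases, clouds, vm_sizes.items(), cluster_sizes
    ):
        # Pre-selection: drop setups that exceed the resource budget.
        if vcpus * nodes <= max_total_vcpus:
            setups.append(
                {"database": db, "cloud": cloud, "vm": vm, "nodes": nodes}
            )
    return setups


all_setups = candidate_setups(max_total_vcpus=1000)
shortlist = candidate_setups(max_total_vcpus=12)
```

Even this toy example yields 81 combinations before filtering, so a pre-selection criterion is applied before any measurement is scheduled.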

3. Measurement

An automated measurement procedure is practically unavoidable due to the large number of setups and the complexity of the measurement procedure.
Customer accounts and API keys are required for cloud access.
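
A minimal sketch of such an automated procedure is shown below. It assumes that API keys are supplied through environment variables and that a caller provides the function that actually deploys and measures one setup; the names `load_api_key`, `measure_all`, `fake_benchmark` and the `<PROVIDER>_API_KEY` naming scheme are hypothetical.

```python
import os
import time


def load_api_key(provider, env=os.environ):
    """Read a provider API key from the environment (variable name is an assumption)."""
    var = f"{provider.upper()}_API_KEY"
    key = env.get(var)
    if not key:
        raise RuntimeError(f"missing credential: set {var}")
    return key


def measure_all(setups, run_benchmark):
    """Run every setup in turn; a failing setup is recorded, not fatal."""
    results = []
    for setup in setups:
        started = time.perf_counter()
        try:
            metrics = run_benchmark(setup)
            status = "ok"
        except Exception as exc:  # keep the campaign going after one failure
            metrics, status = {"error": str(exc)}, "failed"
        results.append(
            {
                "setup": setup,
                "status": status,
                "metrics": metrics,
                "duration_s": time.perf_counter() - started,
            }
        )
    return results


def fake_benchmark(setup):
    """Placeholder for real deployment and measurement; values are made up."""
    if setup["nodes"] > 3:
        raise RuntimeError("quota exceeded")
    return {"ops_per_sec": 1000 * setup["nodes"]}


results = measure_all([{"nodes": 1}, {"nodes": 3}, {"nodes": 5}], fake_benchmark)
```

Recording failures alongside successful runs keeps a long measurement campaign from aborting halfway and leaves a trace of which setups need attention.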

4. Compare Results

More diverse KPIs and more measured setups mean that more data has to be analysed and compared.
Due to the cloud impact, measurement results can be significantly less stable than in classic database benchmarking. Modern data science methods are therefore needed to detect these fluctuations and account for them correctly.
Preparing the measurement data and enriching it with metadata is essential to keep the results comprehensible.
A multi-dimensional visualisation of the results is usually required.
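
As a simple stand-in for the data science methods mentioned above, repeated runs of the same setup can be summarised with a robust central value and a variability measure. The sketch below uses the coefficient of variation; the function name `summarize_runs`, the 10 % threshold and the sample throughput values are assumptions for illustration.

```python
import statistics


def summarize_runs(throughputs, cv_threshold=0.10):
    """Summarise repeated measurements of one setup (needs at least two runs)."""
    mean = statistics.mean(throughputs)
    stdev = statistics.stdev(throughputs)  # sample standard deviation
    cv = stdev / mean                      # coefficient of variation
    return {
        "median": statistics.median(throughputs),
        "mean": mean,
        "cv": cv,
        "stable": cv <= cv_threshold,
    }


# Two hypothetical setups with the same median but very different stability.
steady = summarize_runs([1000, 1010, 990, 1005, 995])
noisy = summarize_runs([1000, 600, 1400, 800, 1200])
```

Both example series share the same median, yet only the first would be treated as a reliable basis for comparison, which is why variability has to be reported next to the headline number.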

Conclusion

Cloud database benchmarking is much more than a further development of classic database benchmarking.

The distributed and elastic infrastructure has a significant influence on the database performance metrics and must be taken into account correctly.