Competitive Yet? A Glance at OVHcloud's new MongoDB DBaaS
Database-as-a-Service (DBaaS) has the potential to reduce required manpower, lower time-to-delivery and to reduce the total cost of ownership (TCO) of database services. As with most other cloud services, the DBaaS market worldwide is dominated by the American hyperscalers AWS, Microsoft Azure, and Google Cloud Platform (GCP). Relying solely on U.S. companies is not without risk for European customers, mainly due to the interplay of Cloud Act and GDPR.
Therefore, it is beneficial to keep an eye on the European market and to evaluate European alternatives such as OVHcloud's recently released MongoDB DBaaS. In this article, we evaluate this new European service against AWS DocumentDB and Azure CosmosDB MongoDB vCore. We investigate pricing, the performance regarding three relevant workloads as well as the resulting value for money.
Introduction
Database Management Systems (DBMS) are the backbone of every modern application. They play a crucial role in e-commerce, Internet of Things, machine learning, data analytics and many other fields. Unfortunately, their central role and their extensive configurability make operating a DBMS a challenging task that demands qualified personnel trained for the specific DBMS. Applying Database-as-a-Service (DBaaS) reduces required manpower, lowers time-to-delivery and has the potential to reduce the total cost of ownership (TCO) of database services. As with most other cloud services, the DBaaS market worldwide is dominated by the American hyperscalers AWS, Microsoft Azure, and Google Cloud Platform (GCP).
Over the last 20 years the database market has become increasingly scattered with many different technologies advertised for specific usage scenarios. Besides the world of classical, relational databases a whole new world of NoSQL databases has evolved. MongoDB is a pioneer in this domain and certainly one of the most popular NoSQL databases available on the market. AWS and Microsoft offer their MongoDB-compatible DBaaS.
For European customers, the use of American cloud providers is not without risk. On the one hand, the Cloud Act allows U.S. federal law enforcement to compel U.S.-based technology companies to provide requested data stored on servers regardless of whether the data are stored in the U.S. or on foreign soil. On the other hand, the European General Data Protection Regulation (GDPR) strictly regulates who is allowed to access what kind of data under what condition. Hence, European customers worried about that fact might consider using European alternatives such as OVHcloud.
OVHcloud is a France-based cloud provider offering over 80 IaaS and PaaS services in 40 datacenter locations in Europe, North America and APAC. They are the only European cloud provider in the IDC MarketScape: Worldwide Public Cloud Infrastructure as a Service 2022 Vendor Assessment, and have been named a “Major Player” in this study. Recently, they launched their MongoDB DBaaS. This whitepaper investigates performance, cost, and value for money characteristics of OVHcloud’s MongoDB offer for three different types of workloads. The results are compared to offers from AWS and Microsoft Azure.
About MongoDB
NoSQL databases promise to provide high performance and horizontal scalability for data intensive workloads. MongoDB is one of the most popular NoSQL databases on the market, with a large and active community of users and developers. DB-Engines currently ranks MongoDB as the most popular NoSQL database and the 5th most popular database overall. MongoDB is available as a free-of-charge Community Edition (CE), but also as an Enterprise Edition. Finally, many different cloud providers offer MongoDB as a service. Others offer DBaaS services that promise to be API-compatible with MongoDB to a certain degree.
MongoDB is applied for various data-intensive use cases including e-commerce, social media, mobile and gaming applications, financial applications, real-time analytics and Internet of Things.
MongoDB is built upon a distributed architecture that comes with two cluster modes: a replica set cluster targets high-availability, while a sharded cluster targets horizontal scalability and high-availability. A sharded cluster requires the additional nodes of type router and config. For more details see the official MongoDB documentation on sharded clusters. This report focuses solely on replica set clusters.
Competitor Selection
Currently, two different types of MongoDB DBaaS can be found on the market: those that offer a native MongoDB API and those that offer a MongoDB-compatible API. The goal of this study is to evaluate OVHcloud’s new MongoDB DBaaS offering with its native API against established API-compatible offers from major market players. A comparison of OVHcloud’s MongoDB against other native MongoDB DBaaS is beyond the scope of this article.
In contrast to MongoDB-native services, MongoDB-compatible ones offer a provider-specific subset of the native MongoDB API. In this evaluation we analyze the MongoDB-compatible API services of AWS DocumentDB and Azure CosmosDB for MongoDB vCore services and compare them to OVHcloud's offering.
It is important to note that two types of MongoDB offerings exist for Azure CosmosDB: the older, serverless variant with a MongoDB 4.4-compatible API and the comparatively new Azure CosmosDB MongoDB vCore with a v6 API. For the evaluation at hand, we chose the vCore variant.
DBaaS Configuration
Table 1 lists the three competitors as well as the flavors, locations, and versions used in this evaluation. The starting point for the evaluation is OVHcloud’s db2-30 flavor. Care has been taken to ensure a fair evaluation by choosing resource-equivalent set-ups. That is, the same number of virtual cores and the same amount of physical memory were chosen as for OVHcloud MongoDB. Here, we selected a node size with 8 cores and ~30GB RAM aiming at OVHcloud’s major customer group. All DBaaS offers are configured with the default storage backend and default IOPS configurations.
Property | OVHcloud MongoDB | AWS DocumentDB | Azure CosmosDB MongoDB vCore |
---|---|---|---|
Flavor | db2-30 | db.r6g.xlarge | M50 |
Storage per node | 250GB | n/a | 256GB |
API compatibility | MongoDB v6 | MongoDB v5 | MongoDB v6 |
API Type | native | compatible | compatible |
Cluster Type | Replica Set (3 nodes) | Replica Set (3 nodes) | Replica Set (2 nodes) |
Region | DE (Frankfurt) | eu-central-1 (Frankfurt) | westeurope |
For OVHcloud MongoDB we selected a set-up with one replica set of one primary and two secondaries without sharding. This set-up was mirrored for AWS DocumentDB. For Azure CosmosDB MongoDB vCore set-ups with three replicas are not possible. In this case, we chose to enable High-Availability leading to a replica set with one primary and one secondary.
For all competitors, we choose a disk size that is close to 250GB of capacity. For AWS DocumentDB it is not possible to define the disk size in advance. Storage capacity is scaled as needed.
Pricing and Costs
Besides governance aspects, technical features and performance, cost is the major factor when deciding for or against cloud providers. Here, not only the mere costs of an offering are decisive, but also the complexity of the pricing model. All cost-related data presented in this article was collected from the respective service providers between 20th and 24th January 2024.
Service | included | size/IOPS | egress | other cost drivers |
---|---|---|---|---|
OVHcloud MongoDB | compute, storage, backup | size is flexible, IOPS depend on disk size | included | n/a |
AWS DocumentDB | compute | I/O charged per million operations | charged extra | storage per GB, backup |
Azure CosmosDB MongoDB vCore | compute, storage, backup | not documented | charged extra | n/a |
Table 2 summarizes core aspects of the pricing models. In all cases, the selected service plan, e.g. db2-30 in the case of OVHcloud, determines the compute resources (vCores and RAM). DocumentDB does not link storage capacity to the plan, but adds storage capacity as needed. Accordingly, storage is charged on a per GB basis on top of the service plan. In all other cases the plan also determines the basic amount of storage, which can be increased as needed. When it comes to storage bandwidth (IOPS), the offers differ: for OVHcloud MongoDB, IOPS depend on disk size. DocumentDB does not have a fixed IOPS limit, but charges the user per million operations, no matter if read or write. For Azure CosmosDB MongoDB vCore, IOPS are not documented.
OVHcloud MongoDB is the only contender where network traffic is included in the baseline costs. All others charge extra, in particular for egress traffic. Backup is included in the baseline price for CosmosDB MongoDB vCore and for OVHcloud MongoDB. In the case of AWS DocumentDB backup is charged in addition to the baseline costs.
OVHcloud MongoDB only allows clusters with three nodes per replica set. Yet, it supports sharding to grow the overall capacity. DocumentDB does not support sharding, but allows cluster sizes from 1 to 15 nodes. Azure CosmosDB MongoDB vCore supports sharding, but only comes with cluster sizes of one or two nodes per shard. Of course, single node instances are cheaper than clustered set-ups. Yet, they provide less fault-tolerance and are not suited for high-availability set-ups.
The plot visualizes the total monthly cost per DBaaS used in this study. The values include the baseline costs for booking the offering shown in Table 1 and add 6% of network costs for outgoing traffic on top of it. The only exception is OVHcloud, which does not charge extra for egress traffic. The bar for DocumentDB already includes storage costs for 250GB of data. AWS’s price calculator recommends changing the set-up when costs for I/O operations exceed 25% of the baseline costs. The light yellow bar visualizes these 25% additional costs. Backup costs are omitted in the diagram, as they do not contribute significantly to the costs.
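The cost composition described above can be written down as a small calculation. The following Python sketch is illustrative only: the baseline and storage prices are hypothetical placeholders, not the prices collected for this study, and the function and parameter names are assumptions. Only the 6% egress surcharge and the 25% I/O threshold are taken from the text.

```python
def monthly_cost(baseline, egress_share=0.06, storage=0.0, io_share=0.0):
    """Estimate the total monthly cost in $.

    baseline     -- price of the booked plan (hypothetical input)
    egress_share -- egress surcharge as a fraction of the baseline
    storage      -- separately billed storage, if any
    io_share     -- I/O charges as a fraction of the baseline
    """
    return baseline * (1 + egress_share + io_share) + storage


# Hypothetical baseline of 1,000 $ per month, used only to show the arithmetic.
egress_included = monthly_cost(1000.0, egress_share=0.0)   # egress part of the plan
egress_charged = monthly_cost(1000.0)                      # 6% egress on top
io_at_threshold = monthly_cost(1000.0, storage=25.0, io_share=0.25)  # extra storage and I/O

print(egress_included, egress_charged, io_at_threshold)
```

With such a model, the lower and upper bound of a per-operation pricing scheme can be compared against flat-rate offers by varying `io_share` between 0 and 0.25.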
Evaluation Use Cases and Workloads
For the evaluation, we consider three different types of use cases. The read-only use case evaluates a caching scenario with a data set of 50GB and intensive read access to the database. It mainly evaluates the capabilities of CPU, memory, and network, not so much the disk. The second case, read-update, mixes reads and updates and matches best with the e-commerce domain. The third scenario, write-heavy, is taken from the IoT / observability world. In this case, 90% of all operations are insert operations and only 10% are reads.
To ensure a controllable evaluation environment, we use synthetic data and rely on established tools to generate the workload. More precisely, we use the Yahoo! Cloud Serving Benchmark (YCSB), an open-source, industry-standard benchmarking tool. All benchmarks are carried out with YCSB version 0.18.0-SNAPSHOT. Table 3 lists the relevant YCSB configurations per workload.
All three cases simulate 50 clients that concurrently operate on the database using documents of a size of 1,000 bytes. The read-only case starts off with 50GB of data in the database and then reads data based on a zipfian distribution; that is, some items are read a lot more often than others. The read-update case starts with 100GB. 50% of the operations are reads, the other 50% are updates by primary key. Both follow a zipfian distribution. The write-heavy case initially has 10GB of data in the database and mostly reads items that have recently been added.
In all cases, the default MongoDB client consistency settings have been applied: writeConcern=majority, readPreference=primary, readConcern=local.
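For readers who want to pin these settings explicitly, they can be expressed as options in a MongoDB connection string. The study does not state how the settings were passed to the client, so the following Python sketch is an assumption; the host name and database name are hypothetical placeholders.

```python
from urllib.parse import urlencode

# Consistency settings named in the text, as connection-string options.
options = {
    "w": "majority",              # write concern
    "readPreference": "primary",  # read from the primary only
    "readConcernLevel": "local",  # read concern
}

# Hypothetical endpoint; replace with the address of the DBaaS cluster.
uri = "mongodb://db.example.com:27017/ycsb?" + urlencode(options)
print(uri)
```

Spelling the options out makes a benchmark set-up independent of driver defaults, which may differ between driver versions.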
YCSB Configuration | read-only (memory-bound) | read-update (disk-bound) | write-heavy (disk-bound) |
---|---|---|---|
runtime [min] | 90 | 90 | 90 |
initial data [GB] | 50 | 100 | 10 |
document size [B] | 1,000 | 1,000 | 1,000 |
threads | 50 | 50 | 50 |
read requests [%] | 100 | 50 | 10 |
update requests [%] | 0 | 50 | 0 |
insert requests [%] | 0 | 0 | 90 |
read request distribution | zipfian | zipfian | latest |
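The configuration table can be translated into YCSB core workload properties. The sketch below is an approximation, not the property files used in the study: it assumes 1 GB = 10^9 bytes, assumes the 1,000-byte documents are split into 10 fields of 100 bytes each, and the helper name `ycsb_properties` is hypothetical.

```python
DOC_SIZE_BYTES = 1_000        # from the table
RUNTIME_SECONDS = 90 * 60     # 90 minutes
THREADS = 50

# One entry per workload column of the table.
WORKLOADS = {
    "read-only":    {"gb": 50,  "read": 1.0, "update": 0.0, "insert": 0.0, "dist": "zipfian"},
    "read-update":  {"gb": 100, "read": 0.5, "update": 0.5, "insert": 0.0, "dist": "zipfian"},
    "write-heavy":  {"gb": 10,  "read": 0.1, "update": 0.0, "insert": 0.9, "dist": "latest"},
}


def ycsb_properties(spec):
    """Map one workload description to YCSB core workload properties."""
    return {
        "recordcount": spec["gb"] * 10**9 // DOC_SIZE_BYTES,  # assumes decimal GB
        "fieldcount": 10,       # assumed split of the document
        "fieldlength": 100,     # 10 x 100 bytes = 1,000 bytes
        "threadcount": THREADS,
        "maxexecutiontime": RUNTIME_SECONDS,
        "readproportion": spec["read"],
        "updateproportion": spec["update"],
        "insertproportion": spec["insert"],
        "requestdistribution": spec["dist"],
    }


for name, spec in WORKLOADS.items():
    props = ycsb_properties(spec)
    print(name, props["recordcount"], props["requestdistribution"])
```

Under these assumptions, the initial data sets correspond to 50, 100, and 10 million documents respectively.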
Methodology
All benchmarks are executed with benchANT’s benchmarking platform that fully automates the entire benchmarking process to ensure deterministic and reproducible benchmarks (for more details, see the associated publications on Mowgli and benchANT). A single benchmark execution comprises the following steps carried out by the benchANT platform:
- deploy a new (MongoDB) DBaaS instance
- deploy a benchmark VM (16 vCores / 64 GB RAM) to run the YCSB on the same cloud provider in the same region
- trigger the LOAD phase and fill the database with the initial data set
- wait for 5 minutes as stabilization period
- trigger the RUN phase of 90 minutes for the actual workload
- collect and process the results and metadata
- tear down the DBaaS instance and the benchmark VM
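The sequence of steps above can be sketched as a small orchestration routine. This is not benchANT's platform code: all names are hypothetical, the actions are injected as plain callables, and the decision to always tear down resources, even after a failed step, is a design choice of this sketch.

```python
import time

# Step names mirror the list above; the callables behind them are hypothetical.
BEFORE_WAIT = ("deploy_dbaas", "deploy_vm", "load")
AFTER_WAIT = ("run", "collect")


def run_benchmark(actions, sleep=time.sleep, stabilization_s=5 * 60):
    """Execute one benchmark run and return the names of the completed steps."""
    completed = []
    try:
        for name in BEFORE_WAIT:
            actions[name]()
            completed.append(name)
        sleep(stabilization_s)  # 5-minute stabilization period
        completed.append("stabilize")
        for name in AFTER_WAIT:
            actions[name]()
            completed.append(name)
    finally:
        # Design choice of this sketch: release cloud resources even if a step fails.
        actions["teardown"]()
        completed.append("teardown")
    return completed


# Dry run with no-op actions and without actually sleeping.
noop_actions = {name: (lambda: None) for name in BEFORE_WAIT + AFTER_WAIT + ("teardown",)}
print(run_benchmark(noop_actions, sleep=lambda seconds: None))
```

Injecting the `sleep` function keeps the routine testable without waiting for the full stabilization period.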
Performance Evaluation
Read-only workload
The read-only workload represents one of the standard workloads of YCSB (Workload C) that simulates a caching application. This workload is mostly memory-bound, as 50GB of data is accessed based on a zipfian distribution. In consequence, most requests access cached data, since each DBaaS node comes with roughly 30GB of RAM. This results in a workload that mainly targets CPU and memory on the resource layer.
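Why a skewed access pattern keeps most reads in memory can be illustrated with a short calculation. The Python sketch below is an idealized model, not a measurement: it assumes a Zipf exponent of 0.99, a reduced item count to keep the run time short, and that the hottest 60% of the documents (roughly 30GB out of 50GB) fit entirely into memory, which overstates the cache available in a real deployment.

```python
def zipf_hit_fraction(n_items, hot_items, exponent=0.99):
    """Fraction of accesses that target the `hot_items` most popular items."""
    weight_hot = 0.0
    weight_all = 0.0
    for rank in range(1, n_items + 1):
        weight = rank ** -exponent  # Zipf weight of the item at this popularity rank
        weight_all += weight
        if rank <= hot_items:
            weight_hot += weight
    return weight_hot / weight_all


n = 1_000_000  # reduced item count, chosen for illustration
fraction = zipf_hit_fraction(n, hot_items=int(n * 0.6))
print(f"share of reads served by the hottest 60% of items: {fraction:.1%}")
```

Under these assumptions, the large majority of reads target documents that could be held in memory, which is consistent with the workload being memory-bound rather than disk-bound.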
The results show that OVHcloud is able to provide an average throughput of 11,756 ops/s, taking the second place right behind AWS DocumentDB with 11,819 ops/s. In comparison to Azure CosmosDB vCore, OVHcloud MongoDB provides 18% higher throughput.
Regarding latencies, the following charts show the average and 95th percentile read latency. Similar to the throughput results, AWS DocumentDB and OVHcloud MongoDB provide the best, i.e. the lowest read latency where OVHcloud MongoDB provides 14% lower latency compared to Azure.
The results for the 95th percentile (95% of the read requests were fully processed within the depicted time) show that Azure CosmosDB vCore provides the lowest 95th percentile latency, while OVHcloud MongoDB ranks third.
In order to compare the value for money per DBaaS offer, we put throughput and monthly costs in relation. The resulting chart shows that OVHcloud MongoDB provides the best value for money, namely 10.22 operations per $ and second. AWS DocumentDB ranks second with value for money between 8.85 and 6.35 operations per $ and per second. Azure CosmosDB MongoDB vCore ranks third. Overall, OVHcloud MongoDB is more than 75% more efficient than Azure CosmosDB MongoDB vCore.
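The value-for-money metric is simply throughput divided by monthly cost. The following Python sketch shows the calculation; the function names are assumptions, and the implied monthly cost is a back-of-the-envelope inference from the rounded figures quoted above, not a price reported in this study.

```python
def value_for_money(throughput_ops_s, monthly_cost_usd):
    """Operations per second delivered per $ of monthly cost."""
    return throughput_ops_s / monthly_cost_usd


def implied_monthly_cost(throughput_ops_s, ops_per_s_per_usd):
    """Invert the metric: which monthly cost yields the given value for money?"""
    return throughput_ops_s / ops_per_s_per_usd


# Figures quoted in the text for OVHcloud MongoDB in the read-only workload.
ovh_cost = implied_monthly_cost(11_756, 10.22)
print(f"implied monthly cost: about {ovh_cost:.0f} $")
print(f"check: {value_for_money(11_756, ovh_cost):.2f} ops/s per $")
```

Because the metric is a ratio, a cheaper offer can win on value for money even when it does not deliver the highest absolute throughput.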
Read-update workload
The read-update workload represents one of the standard workloads of YCSB (Workload A) that simulates, for instance, an update-intensive e-commerce application. When running this benchmark, the database is initialized with 100GB of data. In consequence, the read portion of the workload is disk-bound despite the zipfian distribution of reads. The update portion is disk-bound as well, as changes need to be stored persistently.
The following figure shows the throughput results for all three contenders. Please note that this is the combined throughput of read and update operations. Overall and not surprisingly, the numbers are significantly lower than for the memory-bound read-only case. As with the read-only case, AWS DocumentDB reaches the highest throughput numbers. Unlike before, OVHcloud MongoDB does not rank second, but last with only 71% of AWS DocumentDB’s throughput. Azure CosmosDB MongoDB vCore ranks second (-18.7%).
Regarding the results for read latency, the following charts show the average and 95th percentile for read latency. The ranking in both data sets is the same and we see that OVHcloud MongoDB offers the best numbers, with an average read latency slightly above 2ms and a 95th percentile of 5.5ms, followed by Azure CosmosDB MongoDB vCore. AWS DocumentDB ranks 3rd.
From the read latency results, we can conclude that the read part of the workload is probably not responsible for OVHcloud MongoDB’s low throughput numbers. Things look different for the update latencies. Regarding average update latencies, AWS DocumentDB ranks first. Azure CosmosDB MongoDB vCore and OVHcloud MongoDB rank second and third with 34% and 61% higher latencies.
Turning towards the 95th percentile for the update results, there is nothing surprising for AWS DocumentDB. Its 95th percentile is higher than the average, which is the usual behavior. The 95th percentile for Azure CosmosDB MongoDB vCore is significantly higher than both its own average and its competitors’ values. This is an indication that for Azure CosmosDB MongoDB vCore the write bandwidth was exhausted during the benchmark, explaining the low throughput in comparison to AWS DocumentDB.
The true surprise in this benchmark is the 95th percentile for OVHcloud MongoDB. It is even lower than that of AWS DocumentDB, so OVHcloud MongoDB ranks first for this metric. Further, the value is even lower than what was measured as average update latency for OVHcloud. While such a case is mathematically possible, it is rarely observed: 95% of all updates complete in less than 35ms, while an update takes 37ms on average. This means that update latencies are comparable to AWS DocumentDB for at least 95% of all updates, but a few operations amongst the remaining 5% take long enough to heavily inflate the average.
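How an average can exceed the 95th percentile is easiest to see with a constructed example. The latency values in the following Python sketch are synthetic and chosen only to reproduce the effect; they are not measurements from this study. The percentile uses the nearest-rank definition, which is an assumption about how the percentile is computed.

```python
import math
from statistics import mean


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile using the nearest-rank method."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]


# Synthetic data: 96 fast updates and 4 very slow outliers.
latencies_ms = [30.0] * 96 + [205.0] * 4

avg = mean(latencies_ms)
p95 = percentile_nearest_rank(latencies_ms, 95)
print(f"average: {avg} ms, 95th percentile: {p95} ms")
```

Four outliers out of one hundred operations are enough to lift the average above the 95th percentile, because the percentile ignores how slow the slowest 5% actually are.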
When it comes to value for money, Azure CosmosDB MongoDB vCore ranks last with only 1.71 operations per second and $. In its expensive configuration, AWS DocumentDB is only slightly better than Azure CosmosDB MongoDB vCore (1.71 operations per second and $). With its cheap configuration, it is slightly better than OVHcloud MongoDB and winner of this category. OVHcloud clearly achieves more throughput per $ than Azure CosmosDB MongoDB vCore. Despite its relatively low throughput, OVHcloud gets very close to AWS DocumentDB’s value for money.
Write-heavy workload
The write-heavy workload represents an IoT or monitoring use case. 90% of all operations are insert statements while the remaining 10% of statements query the database for recently added elements.
For Azure CosmosDB MongoDB vCore we measure 5,497 ops / s making it the winner for this KPI and workload. AWS DocumentDB ends up second with 4,326 ops / s (-21%). OVHcloud MongoDB is third in this evaluation with 3,841 ops / s (-30%). Despite the write-heaviness, this is a mixed workload and it is beneficial to look at both read and insert latencies individually.
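The relative differences quoted above follow directly from the throughput figures. A minimal Python sketch of the calculation, with a hypothetical helper name:

```python
def relative_difference(value, reference):
    """Difference of `value` to `reference` in percent of the reference."""
    return (value - reference) / reference * 100


# Throughput figures from the text, in ops/s.
throughput = {
    "Azure CosmosDB MongoDB vCore": 5497,
    "AWS DocumentDB": 4326,
    "OVHcloud MongoDB": 3841,
}

best = max(throughput.values())
for name, ops in throughput.items():
    print(f"{name}: {ops} ops/s ({relative_difference(ops, best):+.0f}%)")
```

The same helper can be applied to the latency and value-for-money figures to compare contenders against the best result per metric.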
The results for the read latency do not mirror the throughput results. OVHcloud MongoDB is best, closely followed by Azure CosmosDB MongoDB vCore and AWS DocumentDB.
Regarding insert latency, Azure CosmosDB MongoDB vCore is best for both 95th percentile and average case. OVHcloud MongoDB and AWS DocumentDB battle for second place. OVHcloud’s service shows better performance for the 95th percentile, while AWS DocumentDB is better on average.
Comparing both average and 95th percentile values for OVHcloud MongoDB and AWS DocumentDB and setting them in relation with the throughput results indicates that for both cases a few long running queries limit the maximum achievable throughput. This kind of hiccup is not surprising for a new service such as OVHcloud’s MongoDB offering, but unexpected for a long-standing, highly prolific service such as DocumentDB.
Regarding value for money, OVHcloud clearly achieves more throughput per $ compared to AWS DocumentDB. They are a tiny 0.11 operations / s / $ better than Azure CosmosDB MongoDB vCore.
Conclusion
Database management systems (DBMS) are the backbone of every modern application and Database-as-a-Service (DBaaS) may reduce total cost of ownership of database operations. MongoDB is the market leader in document-oriented DBMS and many different DBaaS offerings exist for it. These are either native MongoDB services, i.e. running software from MongoDB Inc., or compatible services providing a similar API.
In this whitepaper we compared OVHcloud’s new MongoDB service with API-compatible services offered by Microsoft Azure CosmosDB vCore and AWS DocumentDB. Our comparison covers general pricing as well as throughput, latencies, and value for money for three different use cases. We have not investigated potential issues that may arise when MongoDB-compatible competitors are not fully compatible.
Overall, we can state that despite being new on the market OVHcloud MongoDB is able to compete with both contenders. It scores with a simple pricing model with no hidden costs that also offers the cheapest price in the field for the investigated 8 cores / 30GB RAM set-up. This pricing puts OVHcloud MongoDB in one of the leading positions regarding value for money for all three workloads. Regarding high-availability, OVHcloud MongoDB supports replica sets with three replicas that MongoDB users are used to. This is different for Azure CosmosDB MongoDB vCore that only offers two replicas.
Performance-wise, OVHcloud MongoDB has its sweet spot in memory-bound reads. For update latencies, it is on par with AWS DocumentDB and Azure CosmosDB MongoDB vCore for the majority of operations. Yet overall throughput suffers from a few very slow operations. This is not surprising for a new service, and we expect improvements with updates on OVHcloud MongoDB’s side over the next few months, including the introduction of new DBaaS compute and network infrastructure resources based on their recently released 3rd generation (B3/C3/R3) instances as well as new saving plans to keep ahead of competitors in terms of value for money.
Disclaimer
The work presented in this document was commissioned by OVHcloud. OVHcloud chose the competitors, the test, and the database sizes. benchANT chose the most compatible configurations for the other tested DBaaS, ran the benchmarks, evaluated the results, and wrote the text.
About OVHcloud
OVHcloud is a global player and the leading European cloud provider operating over 450,000 servers within 40 data centers across 4 continents to reach 1.6 million customers in over 140 countries. Spearheading a trusted cloud and pioneering a sustainable cloud with the best performance-price ratio, the Group has been leveraging for over 20 years an integrated model that guarantees total control of its value chain: from the design of its servers to the construction and management of its data centers, including the orchestration of its fiber-optic network. This unique approach enables OVHcloud to independently cover all the uses of its customers so they can seize the benefits of an environmentally conscious model with a frugal use of resources and a carbon footprint reaching the best ratios in the industry. OVHcloud now offers customers the latest-generation solutions combining performance, predictable pricing, and complete data sovereignty to support their unfettered growth.
About benchANT
benchANT is a consulting and analytics firm specializing in comparative performance analysis of database management systems with a focus on cloud-hosted databases and Database-as-a-Service technologies. benchANT provides services to database vendors, cloud providers, and end users, taking the role of an unbiased analyst, researcher, and evaluator. The experiments described in this paper have been designed, executed, and written up by two of benchANT’s key employees.
Dr. Jörg Domaschka is one of benchANT’s co-founders. He has been trying to understand distributed systems for more than two decades. Performance engineering and benchmarking help him with that task and allow educating others on his findings.
Dr. Daniel Seybold is a co-founder and the CTO of benchANT. Daniel started his career as a researcher with a focus on distributed systems and databases. He has extensive experience in the field of database performance testing and has been working with NoSQL databases such as MongoDB, Cassandra and ScyllaDB for more than a decade.
Annex
The benchmarking process carried out by benchANT's automated benchmarking platform emphasizes full transparency and reproducibility based on established scientific concepts. The following figure depicts the main technical tasks carried out by the benchmarking platform for a single benchmark run, i.e. benchmarking one defined set-up such as OVHcloud MongoDB for the read-only workload. Each benchmark run is carried out on a fresh instance to avoid any caching effects at the memory, OS, or disk level from previous runs. The benchmark runs per cloud provider are scheduled sequentially to avoid creating noisy neighbors, and all benchmarks are executed during regular business hours.