The Price of Inertia: Why Legacy AWS RDS PostgreSQL Instances are Costing Your Business Performance

Selecting cloud database hardware often feels like an exercise in faith rather than empirical engineering. In this analysis, we explore the performance variables of Amazon RDS PostgreSQL across two major architectural families (x86-64 based Intel Xeon and ARM64-based AWS Graviton) as well as two CPU generations. Using a standardized BenchBase TPC-C workload, our results demonstrate that instances with newer CPU generations deliver measurable latency cuts and throughput gains at near-identical baseline costs. We also highlight that the observed performance leaps are markedly asymmetric, with more pronounced gains for the Graviton instances. If you are running legacy database instances, you are actively overpaying for underperformance. Our results also show that Graviton offerings currently yield the better performance-per-dollar ratio for AWS RDS PostgreSQL.

Overview

Selecting the optimal instance family for an enterprise Relational Database Service (RDS) deployment on Amazon Web Services (AWS) is routinely oversimplified. Unlike many other DBaaS vendors, AWS allows you to choose from a wide variety of instance types and architectures with significant price differences. Organizations frequently make multi-million dollar infrastructure commitments based on nominal cloud billing estimates or high-level compute unit specifications rather than empirical performance profiles. Yet, for database engines like PostgreSQL, the actual performance is determined not only by obvious parameters such as the number of virtual CPUs (vCPUs), but also by the specific CPU architecture. Micro-architectural differences, instruction set architectures, and cache topologies have a strong impact on transaction execution speeds, lock contention, and connection-handling overhead of the database. Optimizations of the database engine for the specific target CPU architecture also play a crucial role.

This technical post presents an empirical comparison of four general-purpose AWS RDS PostgreSQL instance options. We evaluate two generations of Intel x86-64 based compute (m6i, m7i) against two generations of AWS custom-engineered ARM64-based Graviton silicon (m7g, m8g). The goal is straightforward: isolate the impact of these CPU instance types to quantify absolute throughput capabilities, transactional latency reductions, and the resulting "Bang for Buck" metrics. For reproducibility and transparency, all data associated with this analysis (including metadata, configurations, raw measurement results, and monitoring data) can be accessed in our GitHub repository.

The Instance Types under Review

To evaluate the progression of database compute efficiency, we chose four instance specifications matching a standard deployment profile using the 2xlarge sizing tier: 8 vCPUs and 32 GB of RAM. The comparison includes two distinct paths of silicon evolution:

On the traditional x86-64 side, we examine the older db.m6i.2xlarge (priced at $1,477 per month at the time of writing), driven by 3rd gen. Intel Xeon Scalable processors (Ice Lake architecture), and its direct generational successor, the db.m7i.2xlarge (also priced at $1,477 per month), which uses 4th gen. Intel Xeon Scalable processors from the Sapphire Rapids family. The shift from Ice Lake to Sapphire Rapids introduces a higher Instructions-Per-Cycle execution profile, improved memory bus architectures, and updated hardware-assisted cryptographic and compression accelerators.

On the ARM64 side, we benchmark the custom AWS silicon line, starting with the db.m7g.2xlarge ($1,378 per month), powered by AWS Graviton3 processors, and compare it to the latest-generation db.m8g.2xlarge ($1,377 per month), driven by Graviton4 chips. Graviton3 uses Neoverse-V1 cores, while Graviton4 upgrades the architecture to Neoverse-V2 cores. This shift provides significant enhancements, including larger L2 caches per core, increased total memory bandwidth (DDR5), and advanced branch prediction pipelines.
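To make the price gap between the two families concrete, the following minimal sketch computes absolute and relative differences from the monthly prices quoted above. The dictionary and the function name price_delta are illustrative choices, not part of any benchANT tooling, and the prices are the at-time-of-writing figures from this article.

```python
# Monthly on-demand prices in USD, exactly as quoted in this article.
MONTHLY_PRICE_USD = {
    "db.m6i.2xlarge": 1477,
    "db.m7i.2xlarge": 1477,
    "db.m7g.2xlarge": 1378,
    "db.m8g.2xlarge": 1377,
}


def price_delta(baseline: str, candidate: str) -> tuple[int, float]:
    """Return (absolute USD difference, relative difference in percent).

    Negative values mean the candidate is cheaper than the baseline.
    """
    base = MONTHLY_PRICE_USD[baseline]
    cand = MONTHLY_PRICE_USD[candidate]
    return cand - base, (cand - base) / base * 100.0


# Compare each generation step and the cross-family step.
for baseline, candidate in [
    ("db.m6i.2xlarge", "db.m7i.2xlarge"),
    ("db.m7g.2xlarge", "db.m8g.2xlarge"),
    ("db.m7i.2xlarge", "db.m8g.2xlarge"),
]:
    usd, pct = price_delta(baseline, candidate)
    print(f"{baseline} -> {candidate}: {usd:+d} USD/month ({pct:+.2f}%)")
```

With the quoted prices, the generation steps within each family are essentially price-neutral, while the step from the current Intel tier to the current Graviton tier is roughly 100 USD per month cheaper.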

Methods and Measurements

To guarantee a fair benchmark environment, all four instances were configured identically except for the instance class. benchANT's benchmarking framework ensured that the testing parameters were strictly locked as follows to isolate the impact of the CPU architecture:

Compute Sizing: 8 vCPUs / 32 GB RAM (configured as db.m6i.2xlarge, db.m7i.2xlarge, db.m7g.2xlarge, and db.m8g.2xlarge).
Storage Engine Baseline: General Purpose SSD (gp2) storage with a provisioned size of 500 GB.
Database Engine: PostgreSQL version 17.5 (aarch64 and x86_64).

To execute the workload, we use the BenchBase benchmarking suite included in benchANT's framework. BenchBase is configured to run its implementation of the TPC-C benchmark, which is considered a representative industry-standard workload. The scale factor for the benchmark was tuned to generate a total footprint of approximately 20 GB of raw data, which leads to around 26 GB of data used on disk (including indexes, WAL, etc.). While this data set fully fits in memory, the evaluation still involves disk access, as the workload makes use of both insert and update operations.

The main target variables are throughput (measured by average transactions per second; higher is better), latency (measured via p95 latency; lower is better), and estimated monthly costs (based on on-demand pricing in the eu-central-1 region, excluding data transfer, backups, etc.; lower is better). We also calculated a derived value that captures the price performance (defined as transactions per second per dollar of monthly cost; higher is better).
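The target variables above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the p95 uses the nearest-rank method (BenchBase may compute percentiles differently), and price performance is taken as throughput divided by the monthly cost, following the definition above.

```python
def average_tps(completed_transactions: int, duration_s: float) -> float:
    """Average throughput: completed transactions divided by run time."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    return completed_transactions / duration_s


def p95_latency(latencies_ms: list[float]) -> float:
    """95th percentile latency using the nearest-rank method."""
    if not latencies_ms:
        raise ValueError("latencies_ms must not be empty")
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.95 * n), which avoids floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def price_performance(tps: float, monthly_cost_usd: float) -> float:
    """Throughput per dollar of monthly cost (higher is better)."""
    if monthly_cost_usd <= 0:
        raise ValueError("monthly_cost_usd must be positive")
    return tps / monthly_cost_usd


# Hypothetical sample values for illustration, NOT measured results.
sample_latencies_ms = [float(v) for v in range(1, 101)]
print(average_tps(completed_transactions=6000, duration_s=60.0))
print(p95_latency(sample_latencies_ms))
print(price_performance(tps=100.0, monthly_cost_usd=1477))
```

The p95 is used instead of the mean because it captures tail behavior: a run can have a low average latency while a noticeable share of transactions is still slow.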

Note: Standard synthetic benchmarks provide an essential baseline, but your production workloads possess unique structural characteristics, index designs, and query access patterns. benchANT is specialized in executing highly customized, isolated DBaaS benchmarks that mimic your exact production parameters. We help companies to eliminate speculative infrastructure over-provisioning.

Results and Analysis

First, we take a look at performance in terms of throughput and latency. The upper left corner represents the high-performance instances with a high throughput and low latencies. The plot indicates a correlation between throughput and latency – more powerful CPU architectures usually increase throughput and cut down latencies.

Figure 1: Throughput vs. Latency (p95)

Next, we look at the relationship between price and performance, in particular how newer instance types with almost identical pricing provide better performance results. Across both architectures, the newer compute tiers outperformed their predecessors. For x86-64, the db.m7i.2xlarge demonstrated a 13.7% increase in transactions per second (TPS) compared to the db.m6i.2xlarge, accompanied by a 14.5% reduction in p95 latency.

Figure 2: Monthly Costs vs. Throughput (higher is better)

For Graviton, in turn, the transition from the Graviton3 platform (db.m7g.2xlarge) to the Graviton4 platform (db.m8g.2xlarge) yielded a much larger gain: throughput increased by 45.5%, and p95 latency dropped by 28.7% relative to the db.m7g.2xlarge instance.
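The percentages reported here are relative changes against the older instance of each family. As a hedged sketch of that arithmetic, the helper below computes a relative change, and a second helper combines a reported throughput gain with the quoted monthly prices to estimate the gain in price performance. Both function names are illustrative; the inputs in the usage lines are the percentages and prices quoted in this article, and the baseline of 100 is a normalized placeholder rather than a measured value.

```python
def relative_change_pct(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("old must be non-zero")
    return (new - old) / old * 100.0


def price_performance_gain_pct(
    throughput_gain_pct: float, old_price_usd: float, new_price_usd: float
) -> float:
    """Gain in throughput-per-dollar implied by a throughput gain and a price change."""
    throughput_ratio = 1.0 + throughput_gain_pct / 100.0
    price_ratio = new_price_usd / old_price_usd
    return (throughput_ratio / price_ratio - 1.0) * 100.0


# Normalized placeholder baseline of 100 (not a measured TPS value).
print(round(relative_change_pct(100.0, 113.7), 1))

# Reported throughput gains combined with the quoted monthly prices.
print(round(price_performance_gain_pct(13.7, 1477, 1477), 1))  # m6i -> m7i
print(round(price_performance_gain_pct(45.5, 1378, 1377), 1))  # m7g -> m8g
```

Because prices barely move between generations, the price-performance gain within each family is almost identical to the throughput gain itself.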

Figure 3: Monthly Costs vs. Latency (p95) (lower is better)

While both families progressed, the performance enhancements within the Graviton ARM ecosystem were significantly more pronounced. The competitive pricing makes the asymmetric performance leaps of the Graviton instances particularly interesting, as can be seen in the price performance chart. Here, the db.m8g.2xlarge has taken the lead.

Figure 4: Price Performance Comparison (higher is better)

Takeaways and Strategic Recommendations

The data gathered via our standardized benchmarking methodology drives two unambiguous, actionable conclusions for database infrastructure planning:

Generational Upgrades Are Obligatory: As long as AWS prices newer silicon tiers at identical or near-identical levels to legacy generations, running database instances on legacy families (such as db.m6i) represents an ongoing operational inefficiency. Upgrading an active database cluster to its newer generational counterpart delivers a performance uplift at no additional instance cost. This translates into extra application headroom and a measurable reduction in query processing latency.
Graviton Wins the "Bang for Buck" Metric: For greenfield software rollouts or active migration tracks, AWS's current Graviton ARM64 offerings provide the most compelling performance-per-dollar ratio. The Graviton line combines a structural cost discount with generational performance scaling that can even surpass its x86-64 competition. In our analysis, the db.m8g instance provides not only convincing performance, but also the most "Bang for Buck".
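As a rough illustration of what a generational upgrade can look like in practice, the sketch below wraps the RDS ModifyDBInstance call as exposed by boto3 (modify_db_instance with DBInstanceIdentifier, DBInstanceClass, and ApplyImmediately). This is an assumption-laden sketch and not benchANT's procedure: the instance identifier "orders-db" is hypothetical, the DryRunRdsClient class is a stand-in that only records the request, and a real instance class change typically involves a short outage, so it should be rehearsed on a non-production copy first.

```python
from typing import Any


def request_instance_class_change(
    rds_client: Any,
    instance_id: str,
    target_class: str,
    apply_immediately: bool = False,
) -> dict:
    """Ask RDS to move an instance to another instance class.

    With apply_immediately=False, the change is deferred to the next
    maintenance window instead of being applied right away.
    """
    return rds_client.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        DBInstanceClass=target_class,
        ApplyImmediately=apply_immediately,
    )


class DryRunRdsClient:
    """Stand-in client (illustrative only): records the request, calls nothing."""

    def modify_db_instance(self, **kwargs: Any) -> dict:
        return {"DryRun": True, "Request": kwargs}


# "orders-db" is a hypothetical identifier used only for this example.
response = request_instance_class_change(
    DryRunRdsClient(), "orders-db", "db.m8g.2xlarge"
)
print(response["Request"])
```

To run this against AWS, the stand-in would be replaced with boto3.client("rds", region_name="eu-central-1"). Whether a given target class is available depends on the engine version and region, which should be checked before scheduling the change.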

Note: Selecting a core database tier using general baseline indicators leaves your bottom-line performance up to chance. benchANT can deploy advanced synthetic and production-playback testing suites tailored to your application's precise transaction mix. We conduct such analyses for our customers on a broad range of DBaaS offerings, including hyperscalers and EU-based DBaaS vendors – independent of the database or workload selected. Contact us today to audit your database fleet, validate architectural choices, and maximize your performance-per-dollar ratio.