benchANT Homepage

PostgreSQL Vector Search Extensions

Following a lot of hype surrounding standalone vector databases, many in the industry have been realizing that such specialized, dedicated engines do not always fit the bill. Instead, keeping your vectors next to your relational data is oftentimes more manageable and straightforward.

In a series of blog posts, we explore how suitable extensions turn PostgreSQL into a hybrid database that supports both relational data and vectors. In this first part, we look at the essential concepts of vector databases and at what the PostgreSQL ecosystem provides for this purpose. In the second part, we will dive into the performance characteristics of the different vector extensions and how they compare against other offerings. We discussed Benchmarking Vector Databases in an earlier blog post.

A Storage for Vector Embeddings

At its core, a vector embedding is a numerical representation of unstructured data such as text, images, or audio, stored as an array of floating-point numbers. Such embeddings are the output of machine learning models that map an entity to a vector representation. In the resulting high-dimensional space, “meaning” is represented by proximity: if two vectors are close together, the concepts they represent are semantically similar. Depending on the type of input, different embedding models are available. For instance, word embeddings assign a vector to each single word, so related words are mapped to embeddings that are close to each other. In contrast, sentence embeddings capture an entire sentence (or even a document) in a single vector. Again, if two vectors are close, the original sentences have similar content.
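To make "proximity" concrete, here is a minimal sketch that compares toy embeddings by cosine similarity. The four-dimensional vectors are invented for illustration and do not come from any real embedding model; real embeddings typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, made up for this example only.
embeddings = {
    "cat":   [0.9, 0.8, 0.1, 0.0],
    "dog":   [0.8, 0.9, 0.2, 0.1],
    "truck": [0.1, 0.0, 0.9, 0.8],
}

sim_cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
sim_cat_truck = cosine_similarity(embeddings["cat"], embeddings["truck"])

# The related pair scores higher than the unrelated pair.
print(sim_cat_dog > sim_cat_truck)
```

With these made-up values, "cat" and "dog" point in nearly the same direction, while "cat" and "truck" do not.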

Modern databases already allow finding similar strings with techniques for fuzzy string searching. For instance, distance-based metrics count the number of edit operations needed to transform a string into a target string, while phonetic algorithms index strings based on their pronunciation. However, these searches rank hits by similarity in spelling or sound, not by meaning. Here, vector embeddings provide an alternative that indexes entities by their semantics. Queries in the vector space allow for interesting use cases, including semantic searches (e.g., seeking entries by content), similarity searches (e.g., finding similar entries), and recommendation engines (e.g., suggesting related entries).
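As an illustration of the distance-based fuzzy matching mentioned above, the following sketch computes the classic Levenshtein edit distance. It is a textbook dynamic-programming version, not the implementation of any particular database.

```python
def levenshtein(source, target):
    """Minimum number of single-character insertions, deletions, and substitutions."""
    # previous[j] holds the distance between the processed prefix of source
    # and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]

print(levenshtein("kitten", "sitting"))  # 3
```

Note the limitation the text describes: "car" is only one edit away from "cat" but many edits away from "automobile", although the latter is the closer match in meaning.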

A key feature of a vector database is an indexing mechanism that enables fast retrieval of vectors and fast searches for similar vectors. Using an appropriate distance metric, a vector search turns a query for a given entry or a given search vector into a nearest neighbor search.
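Without an index, a nearest neighbor search has to compare the query against every stored vector. The sketch below shows this exact, brute-force baseline with Euclidean (L2) distance; the item ids and vectors are hypothetical.

```python
import heapq
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(query, items, k=2):
    """Exact k-nearest-neighbor search: scan all items, keep the k closest.

    items maps an id to its vector; the result is a list of (distance, id)
    pairs sorted by ascending distance.
    """
    scored = ((l2_distance(query, vector), key) for key, vector in items.items())
    return heapq.nsmallest(k, scored)

# Hypothetical two-dimensional data set.
items = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0], "d": [0.5, 0.0]}
result = nearest_neighbors([0.0, 0.1], items, k=2)
print([key for _, key in result])  # ['a', 'd']
```

The cost grows linearly with the number of stored vectors, which is why vector databases rely on approximate indexes for large data sets.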

Popular indexing strategies for approximate nearest neighbor (ANN) searches include tree-based methods, hash-based methods with locality-sensitive hashing, quantization-based methods with inverted file indexes (e.g., IVF-PQ), and graph-based methods (e.g., HNSW). Vector databases primarily use the latter two for indexing. In short, IVF-PQ (an inverted file index combined with product quantization) first partitions the vector space into clusters and then compresses the vectors within them. HNSW (Hierarchical Navigable Small World), in turn, uses multi-layered graphs that allow queries to “zoom in” on the relevant region of the vector space.
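The partitioning step of an inverted file index can be sketched in a few lines. The example below is a simplified illustration that stores the full vectors in each cluster and omits the product quantization step, so it is closer to a plain IVF index than to IVF-PQ. All function names and the `probes` parameter are assumptions made for this sketch.

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def train_centroids(vectors, k, iterations=5):
    """Tiny Lloyd's k-means, seeded with the first k vectors for determinism."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        buckets = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: l2(v, centroids[c]))
            buckets[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(b) if b else centroids[i] for i, b in enumerate(buckets)]
    return centroids

def build_ivf(vectors, centroids):
    """Inverted file: centroid index -> list of (vector id, vector)."""
    lists = {i: [] for i in range(len(centroids))}
    for vec_id, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        lists[nearest].append((vec_id, v))
    return lists

def ivf_search(query, centroids, lists, k=1, probes=1):
    """Scan only the `probes` clusters whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [item for c in order[:probes] for item in lists[c]]
    candidates.sort(key=lambda item: l2(query, item[1]))
    return [vec_id for vec_id, _ in candidates[:k]]

vectors = [[0.0, 0.0], [10.0, 10.0], [0.5, 0.2], [9.5, 10.2], [0.1, 0.9], [10.3, 9.7]]
centroids = train_centroids(vectors, k=2)
lists = build_ivf(vectors, centroids)
result = ivf_search([9.8, 9.9], centroids, lists, k=2, probes=1)
print(result)  # [1, 3]
```

Probing fewer clusters makes the search faster but approximate: a true neighbor that sits in a cluster that was not probed is missed, which is the recall trade-off behind ANN indexes.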

Enabling Vector Search for PostgreSQL

Although PostgreSQL does not natively support vectors, this feature can be added by a number of competing open source extensions: pgvector, pgvectorscale, and VectorChord. These extensions provide new vector data types, distance operators for quantifying the similarity between embeddings, and index types for efficient searches. Other extensions have been abandoned in favour of pgvector (e.g., pg_embedding) or VectorChord (e.g., pgvecto.rs).

pgvector

Written in C and the most widely supported of the three, the pgvector extension has become the de facto standard for vectors in PostgreSQL. It features rock-solid stability and broad cloud support (also see below). It provides a total of six different distance metrics, specialized vector types (e.g., for binary or sparse vectors), and two index types to choose from: HNSW and IVFFlat.
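As a rough sketch of what using pgvector looks like, the snippet below collects typical SQL statements as Python strings and formats a query vector in pgvector's text form. The table name `items`, the three-dimensional column, and the helper `to_vector_literal` are assumptions for illustration. Nothing is executed here; running the statements requires a PostgreSQL server with pgvector installed and a database driver, and `%s` is the placeholder style used by drivers such as psycopg.

```python
def to_vector_literal(values):
    """Format numbers in pgvector's text representation, e.g. '[1.0,2.0,3.0]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"

# Statements to enable the extension, create a table, and add an HNSW index
# for L2 distance. Table and column names are hypothetical.
SETUP_STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    "CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3))",
    "CREATE INDEX ON items USING hnsw (embedding vector_l2_ops)",
]

# <-> is pgvector's L2 distance operator; ordering by it yields nearest neighbors.
NEAREST_QUERY = "SELECT id FROM items ORDER BY embedding <-> %s::vector LIMIT 5"

query_parameter = to_vector_literal([3, 1, 2])
print(query_parameter)  # [3.0,1.0,2.0]
```

Other metrics use their own operator and operator class, for example `<=>` with `vector_cosine_ops` for cosine distance; the index is only used when the query operator matches the operator class it was built with.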

pgvectorscale

Developed by the Tiger Data team, this extension complements pgvector with a strong focus on performance and scalability. The core of pgvectorscale is an alternative index type called StreamingDiskANN. This technique moves the heavy lifting to disk and also applies a more efficient compression through quantization. As a result, it is significantly more storage-efficient than pgvector with HNSW and supports even larger vector data sets.
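To give a feeling for why quantization saves storage, the sketch below applies a naive one-bit-per-dimension quantization and compares raw sizes. This is a deliberately simple stand-in and not the compression scheme pgvectorscale actually uses; it also assumes 4-byte floats per dimension and ignores any per-row or index overhead.

```python
def binarize(vector):
    """Keep one bit per dimension: 1 if the component is positive, else 0."""
    bits = 0
    for component in vector:
        bits = (bits << 1) | (1 if component > 0 else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two quantized vectors."""
    return bin(a ^ b).count("1")

def raw_bytes(dimensions):
    """Payload size with one 4-byte float per dimension (assumption)."""
    return dimensions * 4

def quantized_bytes(dimensions):
    """Payload size with one bit per dimension, rounded up to whole bytes."""
    return (dimensions + 7) // 8

code_a = binarize([0.5, -0.2, 0.1])   # bits 1, 0, 1
code_b = binarize([0.4, -0.3, -0.1])  # bits 1, 0, 0
print(hamming(code_a, code_b))                   # 1
print(raw_bytes(1536), quantized_bytes(1536))    # 6144 192
```

The compressed codes are cheap to store and compare, but they lose information, so indexes built on quantized vectors commonly re-rank a candidate set with the full-precision vectors.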

VectorChord

VectorChord is the successor of the now deprecated pgvecto.rs extension (an earlier, Rust-based competitor to pgvector) and is designed for scalable, high-performance, and cost-effective vector search. VectorChord's index type employs IVF with an index that can be built externally through offloading. It also employs RaBitQ to significantly reduce computation costs when compressing vectors. Consequently, VectorChord is a potent contender for vector search at billion scale.

Comparison of Open Source PostgreSQL Vector Extensions

Table 1. Vector Extensions as of May 2026

| | pgvector | pgvectorscale | VectorChord |
|---|---|---|---|
| Primary Maintainer | Andrew Kane | Tiger Data | TensorChord |
| Latest Stable Release | v0.8.2 | 0.9.0 | 1.1.1 |
| GitHub Stars | 21.2k | 3k | 1.7k |
| License | PostgreSQL License | PostgreSQL License | AGPLv3 / ELv2 |
| Main Language | C | Rust | Rust |
| Primary Index Types | HNSW, IVFFlat | StreamingDiskANN | vchordrq (IVF+RaBitQ) |
| Vector Types | vector, halfvec, bit, sparsevec (HNSW only) | vector | vector, halfvec |
| Max Dimensions | 2,000 (vector), 4,000 (halfvec), 64,000 (bit) | 16,000 (vector) | 60,000 (vector) |
| Available Distance Metrics | L2, L1, cosine, negative dot product, Hamming (for bit), Jaccard (for bit) | L2, cosine, negative dot product | L2, cosine, negative dot product |
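For reference, the distance metrics named in the table can be written down in a few lines each. These are plain textbook definitions for illustration, not the extensions' optimized implementations. Hamming and Jaccard operate on bit vectors, represented here as lists of 0 and 1; returning 0.0 for two all-zero bit vectors is a choice made for this sketch.

```python
import math

def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def l1(a, b):
    """Taxicab (Manhattan) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 minus cosine similarity: 0 for same direction, 1 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

def negative_dot_product(a, b):
    """Negated inner product, so that smaller still means more similar."""
    return -sum(x * y for x, y in zip(a, b))

def hamming(a, b):
    """Number of positions where two bit vectors differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def jaccard_distance(a, b):
    """1 minus (shared set bits / bits set in either vector)."""
    union = sum(1 for x, y in zip(a, b) if x or y)
    if union == 0:
        return 0.0  # sketch-specific choice for two all-zero vectors
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    return 1.0 - intersection / union

print(l2([1, 2, 3], [4, 6, 3]))                      # 5.0
print(l1([1, 2, 3], [4, 6, 3]))                      # 7
print(negative_dot_product([1, 2, 3], [4, 6, 3]))    # -25
print(hamming([1, 0, 1, 1], [1, 1, 0, 1]))           # 2
print(jaccard_distance([1, 0, 1, 1], [1, 1, 0, 1]))  # 0.5
```

The dot product is negated because index scans order results by ascending distance, while a larger inner product means higher similarity.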

AlloyDB for PostgreSQL

Beyond the open source extensions discussed above, there is also AlloyDB, a Postgres-compatible engine by Google that provides vector search capabilities. Internally, AlloyDB's vector search is built upon the ScaNN (Scalable Nearest Neighbors) index, Google's approach to the approximate nearest neighbor problem. This approach is optimized for a small memory footprint and fast index creation, and it claims to outperform traditional HNSW indexing in both high-throughput writes and vector queries.

Vector-Enabled Cloud and DBaaS Offerings

If you don't want to manage your own vector-enabled PostgreSQL database, the major cloud players as well as many DBaaS providers have already integrated these extensions into their managed offerings:

  • AWS (RDS & Aurora): Support for pgvector.
  • Azure Database for PostgreSQL: Offers pg_diskann, a Microsoft-proprietary implementation of the DiskANN algorithm.
  • Google Cloud SQL: Support for pgvector with integrated Vertex AI embeddings pipelines.
  • Google AlloyDB for PostgreSQL: A managed Postgres-compatible service with support for ScaNN-based indexing.
  • Neon (A Databricks Company) Serverless Postgres: Support for pgvector.
  • Tiger Cloud by Tiger Data: Support for pgvectorscale.
  • VectorChord Cloud: Support for VectorChord.
  • OVHcloud Public Cloud Databases PostgreSQL: Support for pgvector and pgvectorscale.
  • Telecom Cloud RDS PostgreSQL: Support for pgvector.
  • StackIT PostgreSQL Flex: Support for pgvector.

This concludes the first part of our series on vector search extensions for PostgreSQL. In the following post, we will look into the performance characteristics of the extensions and explore potential performance optimizations.