· Rivestack Team

Getting Started with pgvector on Rivestack

pgvector
AI
tutorial

So you're building something with AI and someone told you that you need a "vector database." Maybe you looked at Pinecone, Weaviate, or Qdrant. Maybe you signed up for one, stared at the dashboard for ten minutes, and thought: "Do I really need another database?"

Good news: you probably don't.

pgvector is a PostgreSQL extension that adds vector similarity search to the database you already know. Same SQL. Same transactions. Same JOINs. Your embeddings live right next to your users, orders, and everything else. No syncing pipelines, no eventual consistency headaches, no new query language to learn.

This guide walks you through everything from enabling the extension to running production-ready similarity search. If you've ever written a SELECT statement, you're already 90% of the way there.

Why pgvector Instead of a Dedicated Vector Database?

This is the question everyone asks, so let's get it out of the way.

Dedicated vector databases are great if vectors are literally the only thing you store. But in practice, every AI application also needs relational data. Users, sessions, billing, metadata, access control. When your vectors live in one database and everything else lives in another, you end up building and maintaining a sync pipeline between them. And sync pipelines are where weekends go to die.

With pgvector, you get:

  • One database for everything. Vectors, relational data, full-text search, JSONB metadata. All queryable with SQL.
  • Transactional consistency. Insert a document and its embedding in the same transaction. No "the embedding will show up eventually" surprises.
  • JOINs that actually work. Find similar products, then join with inventory, pricing, and reviews. Try doing that across two databases without crying.
  • No new infrastructure. Your existing connection pooler, monitoring, backup strategy, and deployment pipeline all just work.
  • The PostgreSQL ecosystem. 30+ years of battle-tested reliability, tooling, and community knowledge. Your DBA already knows how to tune it.

The tradeoff? Dedicated vector databases can be faster for pure vector workloads at massive scale (billions of vectors). But if you're working with millions of vectors or fewer and you also need relational data, pgvector is almost certainly the right choice.

Setting Up pgvector

Every Rivestack database comes with pgvector pre-installed. You don't need to compile anything, mess with shared_preload_libraries, or sacrifice a rubber duck to the extension gods. Just run:

CREATE EXTENSION IF NOT EXISTS vector;

That's it. You now have vector superpowers.

Creating a Table with Vector Columns

Let's build a simple document search system. Each document has a title, some content, and an embedding vector:

CREATE TABLE documents (
  id BIGSERIAL PRIMARY KEY,
  title TEXT NOT NULL,
  content TEXT,
  embedding VECTOR(1536),
  created_at TIMESTAMPTZ DEFAULT now()
);

The VECTOR(1536) type stores a 1536-dimensional vector, which is the output size of OpenAI's text-embedding-3-small model. If you're using a different model, adjust the number:

Model Dimensions
OpenAI text-embedding-3-small 1536
OpenAI text-embedding-3-large 3072
Cohere embed-v4 1024
Google text-embedding-005 768
Sentence Transformers (all-MiniLM-L6-v2) 384

Pick the dimension that matches your embedding model. Using the wrong number will give you an error when inserting, which is actually a feature because it prevents you from accidentally mixing embeddings from different models. PostgreSQL looking out for you.
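That said, it's often nicer to catch a mismatch in application code before making a round trip to the database. A minimal Python sketch of that guard (`check_dims` is a hypothetical helper, not part of any library):

```python
# text-embedding-3-small outputs 1536 dimensions; adjust for your model.
EXPECTED_DIM = 1536

def check_dims(embedding, expected=EXPECTED_DIM):
    """Fail fast in application code instead of waiting for the
    database to reject the insert with a dimension mismatch error."""
    if len(embedding) != expected:
        raise ValueError(
            f"expected {expected} dimensions, got {len(embedding)}"
        )
    return embedding
```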

Inserting Embeddings

In practice, you'll generate embeddings in your application code (using the OpenAI API, a local model, etc.) and insert them alongside your data:

INSERT INTO documents (title, content, embedding)
VALUES (
  'Introduction to PostgreSQL',
  'PostgreSQL is a powerful open-source relational database...',
  '[0.021, -0.038, 0.114, ...]'::vector
);

A few things to note:

  • The vector is passed as a string representation of an array, then cast to vector using ::vector.
  • All dimensions must be present. If your model outputs 1536 dimensions, your vector string needs exactly 1536 numbers.
  • NULL embeddings are fine. You can insert a row now and add the embedding later when your pipeline processes it.
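If you're building the INSERT from application code, the embedding has to be serialized into pgvector's text format first. A minimal Python sketch, assuming the embedding is already a list of floats (`to_vector_literal` is a hypothetical helper):

```python
def to_vector_literal(embedding):
    """Format a list of floats as a pgvector text literal,
    e.g. [0.021, -0.038] -> '[0.021,-0.038]'.

    Pass the resulting string as a query parameter and cast it
    with ::vector on the SQL side.
    """
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"
```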

For bulk inserts, COPY is your best friend. It's significantly faster than individual INSERT statements. One caveat: vector literals contain commas, so the embedding column in your CSV file must be quoted.

COPY documents (title, content, embedding)
FROM '/path/to/data.csv' WITH (FORMAT csv);
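Generating a CSV that COPY will accept is straightforward with Python's csv module, which quotes the vector field automatically because it contains commas (`write_copy_csv` is a hypothetical helper):

```python
import csv

def write_copy_csv(path, rows):
    """Write (title, content, embedding_list) tuples to a CSV file
    suitable for COPY ... WITH (FORMAT csv).

    csv.writer quotes the vector field because it contains commas,
    which is exactly what the csv format of COPY expects.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for title, content, embedding in rows:
            vec = "[" + ",".join(str(x) for x in embedding) + "]"
            writer.writerow([title, content, vec])
```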

Running Your First Similarity Search

Here's the moment you've been waiting for. Let's find the 5 documents most similar to a query vector:

SELECT title, content,
       1 - (embedding <=> '[0.021, -0.038, ...]'::vector) AS similarity
FROM documents
ORDER BY embedding <=> '[0.021, -0.038, ...]'::vector
LIMIT 5;

The <=> operator computes cosine distance (not similarity). Lower distance means more similar. We subtract from 1 to get a similarity score between 0 and 1, which is easier to reason about.
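To make the operator concrete, here's the same computation in plain Python. This is a sketch of the math, not how pgvector implements it internally:

```python
import math

def cosine_distance(a, b):
    """What the <=> operator computes: 1 - cosine similarity.

    0 means the vectors point in the same direction (most similar);
    values near 1 mean they are unrelated.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)
```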

pgvector supports three distance operators:

Operator Distance Type Best For
<=> Cosine Normalized embeddings (most common)
<-> L2 (Euclidean) Raw, unnormalized vectors
<#> Inner product (negative) When you want dot product similarity

For most AI applications using pre-trained embedding models, cosine distance (<=>) is the right choice. The models typically output normalized vectors where direction matters more than magnitude.
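For unit-length vectors the operators are closely related: the inner product equals the cosine similarity, and the squared L2 distance equals twice the cosine distance, so all three produce the same ranking. A quick sketch of that math (illustrative helpers, not pgvector internals):

```python
import math

def dot(a, b):
    # Inner product; pgvector's <#> returns the negative of this.
    return sum(x * y for x, y in zip(a, b))

def l2(a, b):
    # Euclidean distance, as computed by <->
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cos_dist(a, b):
    # Cosine distance, as computed by <=>
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
```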

The Power of Combining Vectors with Relational Data

This is where pgvector really shines compared to standalone vector databases. Let's say you're building a product recommendation engine:

SELECT p.name, p.price, p.category, c.avg_rating,
       1 - (p.embedding <=> query.embedding) AS similarity
FROM products p
CROSS JOIN (SELECT embedding FROM products WHERE id = 42) query
LEFT JOIN product_stats c ON c.product_id = p.id
WHERE p.in_stock = true
  AND p.price BETWEEN 10 AND 100
  AND p.category = 'electronics'
ORDER BY p.embedding <=> query.embedding
LIMIT 10;

Try doing that in Pinecone. You'd need to query the vector database first, get the IDs back, then query your relational database for the rest. Two round trips, data that might be out of sync, and code that's twice as complex for no good reason.

Adding an HNSW Index (You Need This for Production)

Without an index, pgvector does a sequential scan, which means it compares your query vector against every single row. Fine for a few thousand rows. Absolutely not fine for 100K or more.

HNSW (Hierarchical Navigable Small World) is the go-to index type. It's an approximate nearest neighbor algorithm that trades a tiny bit of recall for massive speed improvements:

CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 200);

Let's break down those parameters because they actually matter:

m = 16 controls how many connections each node has in the graph. Higher values improve recall but use more memory and slow down inserts. 16 is a solid default. Go to 32 or 64 if you need recall above 0.99 and don't mind the memory cost.

ef_construction = 200 controls how thorough the index build process is. Higher values produce a better quality index but take longer to build. 200 is the sweet spot for most workloads. Don't go below 100 unless you enjoy debugging bad search results at 2 AM.

vector_cosine_ops matches the cosine distance operator (<=>). If you're using L2 distance, use vector_l2_ops; for inner product, use vector_ip_ops.

Tuning Search Quality

At query time, you can control the accuracy vs speed tradeoff with ef_search:

SET hnsw.ef_search = 100;

The default is 40, which typically gives around 0.95 recall (on average, 95% of the true nearest neighbors appear in the results), though the exact figure depends on your data and index parameters. Bump it to 100 or 200 for higher recall. Lower it to 20 if you need maximum speed and can tolerate slightly less accurate results.

You can set it per-session (SET) or per-transaction (SET LOCAL inside a transaction), so different parts of your application can use different values.
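To pick a value for your dataset, you can measure recall directly: run a query with the index, rerun it with `SET enable_indexscan = off;` to force an exact sequential scan as ground truth, and compare the returned IDs. The comparison itself is tiny (`recall_at_k` is a hypothetical helper):

```python
def recall_at_k(exact_ids, approx_ids):
    """Fraction of the true nearest neighbors (from an exact scan)
    that the approximate HNSW search actually returned."""
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)
```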

Monitoring Your Index

Once your index is built, keep an eye on it:

-- Check index size
SELECT pg_size_pretty(pg_relation_size('documents_embedding_idx'));

-- Check if queries are using the index
EXPLAIN ANALYZE
SELECT * FROM documents
ORDER BY embedding <=> '[0.021, ...]'::vector
LIMIT 10;

Look for an index scan on your HNSW index in the EXPLAIN output (e.g. "Index Scan using documents_embedding_idx"). If you see "Seq Scan" instead, PostgreSQL thinks the table is too small for the index or the query isn't using LIMIT. HNSW indexes only kick in when the query has an ORDER BY ... LIMIT clause.

Common Gotchas

A few things that trip people up, so you don't have to learn them the hard way:

Forgetting the LIMIT clause. Without LIMIT, PostgreSQL will do a sequential scan even if an HNSW index exists. Always include LIMIT in your similarity queries.

Mixing embedding dimensions. If you switch from text-embedding-3-small (1536d) to text-embedding-3-large (3072d), you need a new column with the right dimension. Old embeddings and new embeddings can't be compared.

Not setting ef_search. The default of 40 is fine for prototyping but you should benchmark different values for your specific dataset and recall requirements.

Indexing before bulk loading. Build the HNSW index after you've loaded your data, not before. Inserting into an existing HNSW index is much slower than building it from scratch.

What's Next

You now have everything you need to build vector search into your PostgreSQL application. From here, you might want to explore:

  • Hybrid search combining pgvector similarity with PostgreSQL's built-in full-text search for better results on text-heavy datasets
  • HNSW parameter tuning to find the right balance of speed, recall, and memory for your specific workload
  • Partitioning strategies for very large vector datasets (10M+ rows) where you can partition by tenant, date range, or category

On Rivestack, all of this runs on NVMe storage, which means your HNSW queries stay fast even when the index doesn't fit in memory. If you want to try it out, create a free instance and follow along with the examples above. The whole thing takes about five minutes.

Rivestack


Managed PostgreSQL with pgvector built in. 2,000 QPS vector search on NVMe. Vectors and relational data in one place. EU and US-East regions.

🇫🇷 Based in France, European Union

© 2026 Rivestack