Learn to implement pgvector in Postgres for semantic search. We cover vector embeddings, indexing, and performance tuning for production-ready AI apps.

Last month, our team spent three days refactoring a legacy search feature that relied on rigid keyword matching. We realized our users were getting zero results for synonyms, so we decided to pivot to semantic search. Instead of spinning up a separate vector database like Pinecone or Milvus, we leveraged our existing postgres infrastructure by installing the pgvector extension.
If you’re already running a standard relational database, adding vector embeddings directly into your schema is often the fastest way to get started. It keeps your data consistent and avoids the headache of managing cross-service synchronization.
First, you need the extension. If you're on a managed cloud instance like RDS or Supabase, it’s usually just a CREATE EXTENSION away. For local development, I use a standard Docker image:
SQL-- Enable the extension CREATE EXTENSION IF NOT EXISTS vector; -- Create your table CREATE TABLE documents ( id serial PRIMARY KEY, content text, embedding vector(1536) -- 1536 for OpenAI's text-embedding-3-small );
The 1536 dimension is crucial—it must match the output of your embedding model. If you change your model, you’ll have to rebuild your vector columns. Don't learn that the hard way like I did; keep your model versioning explicit in your application code.
Once your data is in, querying for similarity is surprisingly intuitive. pgvector supports standard operators like <=> for cosine distance.
SQLSELECT content FROM documents ORDER BY embedding <=> '[0.12, 0.05, ...]' LIMIT 5;
This query returns the top 5 documents closest to your input vector. It’s simple, but it hits a performance wall once you cross around 10,000 rows because it performs a sequential scan. That’s when you need to start looking into HNSW or IVFFlat indexes.
I initially thought IVFFlat was the go-to index. It’s easier to configure, but it often sacrifices too much recall for speed. After testing both, I found that HNSW (Hierarchical Navigable Small World) indexes provide a much better balance of speed and accuracy for most production workloads.
SQLCREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
Keep in mind that while HNSW speeds up your reads, it makes your INSERT and UPDATE operations significantly slower. If you have a high-write volume application, consider Postgres logical decoding for real-time CDC to offload the embedding generation and indexing to a background worker.
While semantic search is powerful, it’s not a silver bullet. Sometimes, you need exact keyword matches for IDs or product codes. We eventually implemented Hybrid Search in RAG Pipelines: Boosting Retrieval Accuracy to combine vector similarity with traditional full-text search.
If you’re building a chatbot or a Q&A system, don't forget to implement Semantic caching for RAG pipelines: Cut latency and costs. It’ll save you a significant amount of money on embedding API calls.
Does pgvector affect my database backups?
Yes, it increases the size of your storage and potentially the time it takes to perform vacuuming. Ensure your maintenance_work_mem is tuned high enough to handle large index builds.
Is it really production-ready? Absolutely. Many teams run massive vector datasets in Postgres now. The key is to treat it like any other index—monitor your cache hit ratios and watch your explain plans.
What's the biggest trade-off? The main trade-off is memory. HNSW indexes reside in RAM to be performant. If your embedding table grows to tens of millions of rows, you’ll need to scale your instance memory accordingly, which can get expensive.
Moving to semantic search using pgvector saved us from the operational complexity of managing a dedicated vector engine. We’re still tweaking our HNSW parameters to find the sweet spot between latency and recall, but for now, the simplicity of having everything in postgres is worth every byte.
AI coding assistants are changing how we write code, but are they worth the hype? Learn how to integrate LLMs into your workflow without losing your edge.