Mahamudul Hasan Rubel
HomeAboutProjectsSkillsExperienceBlogPhotosContact
Mahamudul Hasan Rubel

Senior Software Engineer crafting high-performance web applications and SaaS platforms.

Navigation

  • Home
  • About
  • Projects
  • Skills
  • Experience
  • Blog
  • Photos
  • Contact

Get in Touch

Available for senior/lead roles and consulting.

bd.mhrubel@gmail.comHire Me

© 2026 Mahamudul Hasan Rubel. All rights reserved.

Built with using Next.js 16 & Tailwind v4

Back to Blog
Tech NewsJune 21, 20263 min read

Implementing pgvector in Postgres for Semantic Search at Scale

Learn to implement pgvector in Postgres for semantic search. We cover vector embeddings, indexing, and performance tuning for production-ready AI apps.

postgrespgvectorvector-searchai-engineeringmachine-learningdatabasesNewsTrendsIndustry
White keyboard keys spelling 'search' on a bold red surface, conceptual design with copyspace.

Last month, our team spent three days refactoring a legacy search feature that relied on rigid keyword matching. We realized our users were getting zero results for synonyms, so we decided to pivot to semantic search. Instead of spinning up a separate vector database like Pinecone or Milvus, we leveraged our existing postgres infrastructure by installing the pgvector extension.

If you’re already running a standard relational database, adding vector embeddings directly into your schema is often the fastest way to get started. It keeps your data consistent and avoids the headache of managing cross-service synchronization.

Setting up your environment

First, you need the extension. If you're on a managed cloud instance like RDS or Supabase, it’s usually just a CREATE EXTENSION away. For local development, I use a standard Docker image:

SQL
-- Enable the extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Create your table
CREATE TABLE documents (
    id serial PRIMARY KEY,
    content text,
    embedding vector(1536) -- 1536 for OpenAI's text-embedding-3-small
);

The 1536 dimension is crucial—it must match the output of your embedding model. If you change your model, you’ll have to rebuild your vector columns. Don't learn that the hard way like I did; keep your model versioning explicit in your application code.

The power of pgvector similarity

Once your data is in, querying for similarity is surprisingly intuitive. pgvector supports standard operators like <=> for cosine distance.

SQL
SELECT content 
FROM documents 
ORDER BY embedding <=> '[0.12, 0.05, ...]' 
LIMIT 5;

This query returns the top 5 documents closest to your input vector. It’s simple, but it hits a performance wall once you cross around 10,000 rows because it performs a sequential scan. That’s when you need to start looking into HNSW or IVFFlat indexes.

Optimizing performance with indexing

I initially thought IVFFlat was the go-to index. It’s easier to configure, but it often sacrifices too much recall for speed. After testing both, I found that HNSW (Hierarchical Navigable Small World) indexes provide a much better balance of speed and accuracy for most production workloads.

SQL
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

Keep in mind that while HNSW speeds up your reads, it makes your INSERT and UPDATE operations significantly slower. If you have a high-write volume application, consider Postgres logical decoding for real-time CDC to offload the embedding generation and indexing to a background worker.

Bridging the gap: RAG and hybrid search

While semantic search is powerful, it’s not a silver bullet. Sometimes, you need exact keyword matches for IDs or product codes. We eventually implemented Hybrid Search in RAG Pipelines: Boosting Retrieval Accuracy to combine vector similarity with traditional full-text search.

If you’re building a chatbot or a Q&A system, don't forget to implement Semantic caching for RAG pipelines: Cut latency and costs. It’ll save you a significant amount of money on embedding API calls.

Common Questions

Does pgvector affect my database backups? Yes, it increases the size of your storage and potentially the time it takes to perform vacuuming. Ensure your maintenance_work_mem is tuned high enough to handle large index builds.

Is it really production-ready? Absolutely. Many teams run massive vector datasets in Postgres now. The key is to treat it like any other index—monitor your cache hit ratios and watch your explain plans.

What's the biggest trade-off? The main trade-off is memory. HNSW indexes reside in RAM to be performant. If your embedding table grows to tens of millions of rows, you’ll need to scale your instance memory accordingly, which can get expensive.

Moving to semantic search using pgvector saved us from the operational complexity of managing a dedicated vector engine. We’re still tweaking our HNSW parameters to find the sweet spot between latency and recall, but for now, the simplicity of having everything in postgres is worth every byte.

Back to Blog

Similar Posts

Close-up of a hand adjusting a chess clock during a timed chess game indoors.
Tech NewsJune 20, 20264 min read

Postgres logical decoding for real-time CDC: A Practical Guide

Master Postgres logical decoding for real-time CDC. Learn how to stream database changes effectively to build robust, event-driven architectures today.

Read more
Close-up of AI-assisted coding with menu options for debugging and problem-solving.
Tech News
June 20, 2026
4 min read

AI coding assistants: A grounded take on the current hype

AI coding assistants are changing how we write code, but are they worth the hype? Learn how to integrate LLMs into your workflow without losing your edge.

Read more
Runner's feet on track lane 5 at the finish line, symbolizing completion and success.
Tech NewsJune 20, 20264 min read

Bun runtime performance: Why it’s the shift you need

Bun runtime performance is changing how we build apps. Discover if this high-speed alternative to Node.js is ready for your production environment today.

Read more