Vector Store Integration

Wire a real vector store into your RAG pipeline. This lesson covers the store landscape and selection criteria, client lifecycle and connection management, metadata filtering as a quality lever, batched upserts with stable IDs, and when to choose framework abstractions versus direct vendor clients.


The Vector Store Landscape

Day 1 used an in-memory mock vector DB to keep the lesson focused. Real production systems use a dedicated vector store, and there are four categories to know.

**Managed SaaS.** Pinecone, Weaviate Cloud, Zilliz Cloud (managed Milvus). You hand over data, they handle scaling, replication, and uptime. The right choice when your team doesn't want to operate infrastructure. Expensive at scale but cheap for small-to-medium workloads when you account for your engineering time.

**Self-hosted dedicated.** Qdrant, Weaviate, Milvus, LanceDB. You run them on your own infrastructure (k8s, ECS, plain VMs). More work, but you own the operational characteristics: no vendor caps on namespace count, no surprise pricing changes, and your data stays in your VPC. Standard choice for security-conscious teams once you have one or two infra engineers.

**Embedded.** Chroma, LanceDB-as-library, FAISS. Run in-process with your application. Zero operational overhead. The right choice for prototypes, demos, single-machine apps, and read-heavy workloads where you can rebuild the index from source if it gets corrupted.

**Database extensions.** pgvector (Postgres), Atlas Vector Search (MongoDB), Elastic kNN, Redis Vector Search. Bolt vector search onto a database you already run. Performance is decent but rarely best-in-class. The big win is that your vectors and your relational data live in the same database, so filtering by user_id / org_id / status is just a SQL WHERE clause. For most line-of-business apps where the corpus is under 10M vectors, pgvector is the right answer for boring reasons.

**Selection criteria that actually matter:**

- **Metadata filtering quality.** This is the single biggest production differentiator and the next section's topic.
- **Hybrid search support.** Native BM25 + dense fusion (covered in Vector DB Intermediate Day 3) saves you building it yourself.
- **Operational footprint.** What's the smallest production deployment? What does it look like at 100M vectors?
- **Language SDK maturity.** A janky Python client is fine for prototyping; for a Go or TypeScript service you want a first-class SDK.
- **Cost model.** Per-vector storage, per-query, per-index, per-replica. Run the napkin math at your expected scale.

**The honest reality** is that for most projects, the answer is "whatever your team already runs." If you're on Postgres, use pgvector. If you're on AWS and want managed, use OpenSearch's vector mode or Pinecone. Don't change databases for a vector store unless you have a specific reason; the integration cost dominates the perf benchmarks for a long time.
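
The napkin math for the cost-model and footprint criteria can start with raw vector size. The sketch below assumes float32 embeddings (4 bytes per dimension) and a hypothetical 1536-dimension embedding model; `rawVectorBytes` and `toGiB` are illustrative helper names, not part of any vendor SDK. Index structures, metadata, and replicas add to this figure, so treat the result as a lower bound.

```javascript
// Raw storage estimate for dense vectors.
// Assumption: embeddings are stored as float32 (4 bytes per dimension).
// Index overhead, metadata, and replicas are NOT included.
const BYTES_PER_FLOAT32 = 4;

function rawVectorBytes(vectorCount, dimensions) {
  return vectorCount * dimensions * BYTES_PER_FLOAT32;
}

function toGiB(bytes) {
  return bytes / 1024 ** 3;
}

// Hypothetical corpus sizes; 1536 is an assumed embedding dimension.
for (const count of [1e6, 1e7, 1e8]) {
  const gib = toGiB(rawVectorBytes(count, 1536));
  console.log(`${count} vectors: ~${gib.toFixed(1)} GiB of raw float32 data`);
}
```

Multiply the result by your replica count and compare it against each vendor's pricing unit (per GB, per pod, per read unit) before looking at query benchmarks.
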
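// Same RAG pipeline, three different stores.
// Assumes `sql` (a postgres.js-style tagged template), `index` (a Pinecone
// index handle), and `client` (a Qdrant client) are initialized elsewhere.

// pgvector (Postgres extension)
async function searchPgvector(queryVec, k, userId) {
  // pgvector accepts the text form "[0.1,0.2,...]", which JSON.stringify produces.
  const vec = JSON.stringify(queryVec);
  // <=> is pgvector's cosine distance operator: smaller means closer.
  return await sql`
    SELECT id, text, embedding <=> ${vec}::vector AS distance
    FROM documents
    WHERE user_id = ${userId}
    ORDER BY distance ASC
    LIMIT ${k}
  `;
}

// Pinecone (managed SaaS)
async function searchPinecone(queryVec, k, userId) {
  return await index.query({
    vector: queryVec,
    topK: k,
    filter: { user_id: { $eq: userId } },
    includeMetadata: true,
  });
}

// Qdrant (self-hosted)
async function searchQdrant(queryVec, k, userId) {
  return await client.search("documents", {
    vector: queryVec,
    limit: k,
    filter: {
      must: [{ key: "user_id", match: { value: userId } }],
    },
    with_payload: true, // request the payload (metadata) explicitly
  });
}

// All three return ranked matches with metadata, scoped to one user.
// The response envelopes differ slightly, but the real differences are
// in operational characteristics, not in the application code shape.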
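
One way to keep the rest of the pipeline store-agnostic is a thin adapter per store that maps each response into one shape. The sketch below is a minimal illustration, not a vendor-provided API: the function names and the `{ id, score, metadata }` target shape are hypothetical, and the input shapes are assumptions based on the three queries above (rows with `id`, `text`, `distance` for pgvector; a `matches` array for Pinecone; an array of points with `payload` for Qdrant). It also assumes the Pinecone index and Qdrant collection use a cosine metric, so that scores are comparable with `1 - distance` from pgvector.

```javascript
// Normalize each store's response into { id, score, metadata }.
// Higher score means more similar in all three outputs.

function fromPgvector(rows) {
  // The query used <=> (cosine distance), so 1 - distance is cosine similarity.
  return rows.map((row) => ({
    id: String(row.id),
    score: 1 - row.distance,
    metadata: { text: row.text },
  }));
}

function fromPinecone(response) {
  // Assumed shape: { matches: [{ id, score, metadata }] }
  return (response.matches ?? []).map((match) => ({
    id: String(match.id),
    score: match.score,
    metadata: match.metadata ?? {},
  }));
}

function fromQdrant(points) {
  // Assumed shape: [{ id, score, payload }]
  return points.map((point) => ({
    id: String(point.id),
    score: point.score,
    metadata: point.payload ?? {},
  }));
}
```

With adapters like these, swapping stores means changing one search function and one adapter; prompt assembly and reranking code keep consuming the same shape.
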

Key Takeaways

  • Four categories: managed SaaS, self-hosted dedicated, embedded, database extensions
  • pgvector is the right answer for most line-of-business apps under 10M vectors
  • Metadata filtering quality is the biggest production differentiator
  • Hybrid search and SDK quality matter more than dense-search benchmarks
  • Default to whatever your team already runs; integration cost dominates perf

