# Vector DB Cost Models: A Buyer's Guide for 2026

> The vector DB market consolidated; capability differences across Pinecone, Weaviate, pgvector, and Cloudflare Vectorize are small. Cost model is the differentiator. A buyer's guide for 2026.

URL: https://agentsbooks.com/blog/vector-db-cost-models
Published: 2026-05-19T18:35:00Z
Category: Deep Dive
Tags: vector-db, memory, cost, spoke, p8

The vector-DB market has consolidated. Capability differences across the top 4 are small, so picking one is no longer capability-driven. It's *cost-model* driven.

## The top 4 in 2026

- **Pinecone** ([docs](https://docs.pinecone.io/guides/get-started/overview)) — managed, serverless.
- **Weaviate** ([docs](https://docs.weaviate.io/weaviate)) — open-source + managed cloud option.
- **pgvector** ([repo](https://github.com/pgvector/pgvector)) — Postgres extension.
- **Cloudflare Vectorize** ([docs](https://developers.cloudflare.com/vectorize/)) — edge-local.

(Honourable mentions: Qdrant, Milvus, MongoDB Atlas Vector Search. All are viable but have smaller market share. ChromaDB is increasingly used in dev environments but rarely in production.)

## The four cost models

### Pinecone — pay per query + storage

Per-namespace pricing with two components: a storage tier (~$0.33/GB/month) plus per-second pod time (which varies by pod size and replication). 1M vectors × 1536 dimensions × 4 bytes (float32) ≈ 6GB of raw vectors; on a single small pod that is ~$70/month at minimum, scaling with query volume.
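The ~6GB figure follows from simple arithmetic, sketched below. This is a minimal illustration that assumes float32 storage (4 bytes per value) and ignores index and metadata overhead; the function names are hypothetical, and the ~$0.33/GB/month rate is the one quoted above.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def raw_vector_gb(num_vectors: int, dims: int,
                  bytes_per_value: int = BYTES_PER_FLOAT32) -> float:
    """Raw embedding size in decimal GB (no index or metadata overhead)."""
    return num_vectors * dims * bytes_per_value / 1e9


def monthly_storage_usd(size_gb: float, usd_per_gb_month: float = 0.33) -> float:
    """Storage line item only; pod time is billed separately."""
    return size_gb * usd_per_gb_month


size_gb = raw_vector_gb(1_000_000, 1536)
print(f"{size_gb:.2f} GB")                            # 6.14 GB
print(f"${monthly_storage_usd(size_gb):.2f}/month")   # $2.03/month
```

At these numbers the storage line is only a couple of dollars, which suggests the ~$70/month floor is almost entirely pod time.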

**Right when:** small-to-medium scale, want zero-ops, query volume is the variable. Most agentic firms below ~50M vectors.

### Weaviate Cloud — pay per cluster

Managed cluster pricing plus storage, tiered by RAM. A 4GB cluster runs ~$100/month and holds ~2–4M vectors; the cost of bigger clusters scales linearly.

**Right when:** want hybrid search (vector + keyword) without a separate engine. Want the open-source escape hatch (you can self-host the same software if managed-cloud pricing changes).

### pgvector — pay for Postgres + your own ops

Free extension on a Postgres instance you already run. Cost is whatever your Postgres costs. At small scale on shared infrastructure: essentially free. At large scale (>10M vectors, high query rate): substantial Postgres compute.

**Right when:** already on Postgres, want a single data plane, willing to manage indexes + tuning. Especially right for early-stage when the vector store is one piece of a broader SQL workload.

### Cloudflare Vectorize — pay per query + storage, edge-local

Edge-local, usage-based pricing: ~$0.01 per million queried vector dimensions (billing is by dimension, not by vector), plus storage. Globally distributed.

**Right when:** need low-latency global queries (consumer-facing apps in particular). Want zero-ops + edge distribution. Pairs naturally with Cloudflare Workers for the agent runtime.

## Decision shortcuts

If you're already on Postgres at <10M vectors: **pgvector.** Single data plane wins.

If you're consumer-facing with global users: **Cloudflare Vectorize.** Latency wins.

If you want maximum ops simplicity and you're at small scale: **Pinecone.** Default for "I don't want to think about it."

If you want hybrid search + the option to escape to self-hosting: **Weaviate.**

If you're at >100M vectors and willing to operate it: re-evaluate. Self-hosted Weaviate or Milvus at that scale typically beats managed pricing.
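The shortcuts above can be sketched as a small function. This is illustrative only: the `Workload` fields and the order in which the rules are checked are assumptions (the shortcuts themselves are unordered), and Pinecone is used as the fallback because it is described as the low-ops default.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical description of a firm's needs; field names are illustrative.
    num_vectors: int
    on_postgres: bool = False
    global_consumer_facing: bool = False
    needs_hybrid_search: bool = False
    will_self_host: bool = False


def shortlist(w: Workload) -> str:
    """Apply the decision shortcuts in one assumed order of precedence."""
    if w.num_vectors > 100_000_000 and w.will_self_host:
        return "re-evaluate: self-hosted Weaviate or Milvus"
    if w.on_postgres and w.num_vectors < 10_000_000:
        return "pgvector"
    if w.global_consumer_facing:
        return "Cloudflare Vectorize"
    if w.needs_hybrid_search:
        return "Weaviate"
    return "Pinecone"  # default for "I don't want to think about it"


print(shortlist(Workload(num_vectors=2_000_000, on_postgres=True)))  # pgvector
```

A real decision would weigh several of these factors at once; the point of the sketch is only that each shortcut keys on one or two workload properties.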

## The cost components

For any vector DB, total cost = storage + query + index-build + replication. The four DBs above weight these differently:

| DB | Storage | Query | Index-build | Replication |
|---|---|---|---|---|
| Pinecone | Medium | Pod-bound | Auto | Replica pods |
| Weaviate | Medium | Cluster-bound | Auto | Replica cluster |
| pgvector | Low (Postgres) | CPU-bound | Manual | Postgres replica |
| Cloudflare Vectorize | Low | Per-query | Auto | Global by default |

The dominant cost for most agentic firms is *query*. Optimising for query patterns (right index type, right replica count, right filter usage) matters more than picking between the four.
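To see which component dominates for a given workload, the four-part breakdown can be modelled directly. The sketch below is a hedged illustration: the `MonthlyCost` class is hypothetical and the dollar amounts are made-up placeholders, not vendor quotes.

```python
from dataclasses import dataclass, asdict


@dataclass
class MonthlyCost:
    # All values in USD per month; the example figures below are hypothetical.
    storage: float
    query: float
    index_build: float
    replication: float

    def total(self) -> float:
        """Total cost = storage + query + index-build + replication."""
        return sum(asdict(self).values())

    def dominant(self) -> str:
        """Name of the largest cost component."""
        parts = asdict(self)
        return max(parts, key=parts.get)


example = MonthlyCost(storage=5.0, query=120.0, index_build=10.0, replication=40.0)
print(example.total())     # 175.0
print(example.dominant())  # query
```

Plugging in real invoice lines shows quickly whether query cost is in fact the one worth optimising first.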

## The hidden cost: embeddings

Vector DBs charge for *storing + querying* vectors. They don't generate them. Embedding generation is a separate cost (OpenAI text-embedding-3-small at ~$0.02/M tokens; open-source alternatives free if you host).

For a small firm with a ~500K-document corpus: one-time embedding cost of ~$50–200. For a firm re-embedding nightly, that cost recurs, so the monthly bill grows with corpus size.
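Embedding cost is tokens times price, which the sketch below makes explicit. It assumes the ~$0.02/M-token rate quoted above; the function name and the tokens-per-document figures are assumptions, chosen because 5,000 to 20,000 tokens per document is roughly what the $50–200 range implies at that rate.

```python
def embedding_cost_usd(num_docs: int, avg_tokens_per_doc: int,
                       usd_per_million_tokens: float = 0.02) -> float:
    """One full pass over the corpus at a flat per-token price."""
    total_tokens = num_docs * avg_tokens_per_doc
    return total_tokens / 1_000_000 * usd_per_million_tokens


# Assumed document lengths; real corpora should be measured with a tokenizer.
for tokens_per_doc in (5_000, 20_000):
    cost = embedding_cost_usd(500_000, tokens_per_doc)
    print(f"{tokens_per_doc} tokens/doc -> ${cost:.2f}")
# 5000 tokens/doc -> $50.00
# 20000 tokens/doc -> $200.00
```

For nightly re-embedding of the whole corpus, multiply the single-pass figure by the number of runs per month; embedding only changed documents reduces this proportionally.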

The [vector-db-cost-calculator](https://vector-db-cost-calculator.roei-020.workers.dev/) models all four DBs + embedding costs at varying corpus + query scale.

## FAQ

**Q: What about latency differences?**
A: p95 latencies for all 4 are <50ms at small-to-medium scale. Cloudflare wins at the global edge; the others tie. At very large scale (>100M vectors), latency curves diverge, so benchmark before committing.

**Q: Migration risk?**
A: Open-source options (Weaviate, pgvector) have lower migration risk by definition. Pinecone is the most locked-in but exports cleanly to other DBs if needed.

**Q: Embedding-model choice?**
A: Separate question. Most firms in 2026 use OpenAI text-embedding-3-large or Voyage AI's embedding models (the provider Anthropic recommends; they are not Anthropic's own). The cost difference is small; the quality difference on domain-specific tasks is measurable. Run an eval before committing.

**Q: How does this map to the [Pillar P8 essay](/blog/agent-memory-knowledge)?**
A: P8 covers the three memory layers + the RAG-vs-context decision tree. This spoke is the buyer-side guidance for the specific component that powers semantic memory.

---

*Want to model the cost for your firm? [Try the calculator →](https://vector-db-cost-calculator.roei-020.workers.dev/)*
