Why Vector Databases Are Now a Production Necessity
Generative AI adoption in enterprise has produced a new class of infrastructure requirement: the ability to search large collections of embeddings at low latency. A retrieval-augmented generation system that takes three seconds to find relevant context before generating a response is not a product. It is a demo.
Traditional relational databases handle vector similarity search poorly. PostgreSQL with pgvector works at small scale and is excellent for teams that already run Postgres and need basic semantic search. At 100 million vectors with sub-100ms p99 latency requirements, it starts to show its limits. Dedicated vector databases exist precisely to handle these workloads at production scale.
The question enterprises now face is not whether to adopt a vector database. It is which one, in what deployment model, and integrated how deeply into the existing data platform. The wrong answers to those questions create migration costs that tend to materialise eighteen months after the initial selection.
Vector database selection is not primarily a technical decision. It is a total cost of ownership decision with technical constraints. The cheapest option at 10 million vectors is often not the cheapest option at 500 million vectors. Factor in operational cost, not just licensing.
The Evaluation Dimensions That Actually Matter
Most vendor comparison articles for vector databases focus on benchmark query speeds. Benchmarks matter, but they are rarely the deciding factor for enterprise selection. The dimensions that actually drive enterprise decisions are more operational than algorithmic.
The Major Options: A Practical Assessment
This is not a comprehensive benchmark. It is an honest practitioner assessment of where each major option fits and where it does not. Benchmarks change with every release; architectural fit changes rarely.
Decision Framework: Which Option for Which Use Case
The Hidden Cost: Embedding Storage at Scale
Vector database costs are often underestimated because the vector count at project inception is not the vector count eighteen months later. A document management system that starts at 5 million vectors grows with every document processed. A recommendation system that embeds every product interaction can reach billions of vectors in a mature e-commerce business.
A float32 embedding at 1536 dimensions (the OpenAI ada-002 standard) takes approximately 6KB of storage per vector. At 100 million vectors, that is roughly 600GB of raw storage before indexing overhead. Most vector databases add 1.5x to 3x storage overhead for their index structures. Plan for 2TB at 100 million vectors and budget accordingly.
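The arithmetic above can be sketched as a small sizing helper. The 2x index-overhead multiplier is an assumption sitting in the middle of the 1.5x-3x range quoted; substitute the figure your candidate database documents.

```python
def storage_estimate_gb(num_vectors: int, dims: int = 1536,
                        bytes_per_component: int = 4,
                        index_overhead: float = 2.0) -> tuple[float, float]:
    """Return (raw_gb, total_gb) for a float32 embedding collection.

    index_overhead is a planning assumption: most vector databases add
    1.5x to 3x storage on top of the raw vectors for index structures.
    """
    raw_bytes = num_vectors * dims * bytes_per_component
    raw_gb = raw_bytes / 1e9
    return raw_gb, raw_gb * index_overhead

raw, total = storage_estimate_gb(100_000_000)
print(f"raw: {raw:.0f} GB, with index overhead: {total:.0f} GB")
```

At 100 million ada-002 vectors this lands at roughly 614GB raw, consistent with the "roughly 600GB" figure above, and about 1.2TB with a 2x index multiplier, which is why 2TB is a safe planning number at the 3x end of the range.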
Quantisation reduces this significantly. int8 quantisation reduces storage by 4x with typically 1 to 3% recall degradation. Binary quantisation reduces by 32x with higher recall loss but can be acceptable for certain use cases. Both Qdrant and Weaviate have strong quantisation support. Factor this into your cost model when comparing managed options.
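The per-vector savings are simple to compute. A quick comparison at 1536 dimensions, using the reduction factors above:

```python
# Per-vector storage at 1536 dimensions under each quantisation scheme.
DIMS = 1536

float32_bytes = DIMS * 4   # 6144 bytes: the ~6KB baseline
int8_bytes = DIMS * 1      # 1536 bytes: 4x smaller, typically 1-3% recall loss
binary_bytes = DIMS // 8   # 192 bytes: 32x smaller, higher recall loss

for name, size in [("float32", float32_bytes), ("int8", int8_bytes),
                   ("binary", binary_bytes)]:
    print(f"{name:>7}: {size:>5} bytes/vector ({float32_bytes // size}x reduction)")
```

Note that quantisation shrinks the vectors themselves; index structures still add their own overhead on top, so apply the multiplier from the previous section after quantising.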
Build a cost model at 3x, 10x, and 30x your current vector count before committing to a managed SaaS option. The per-vector pricing that looks reasonable at 5 million vectors often becomes the dominant line item in your AI infrastructure budget at 200 million. The vendor selection advisory work we do always includes a 3-year TCO model before any recommendation.
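A minimal sketch of that growth model follows. The per-million-vector price is a placeholder, not any vendor's actual rate, and the linear pricing is a simplification: real managed tiers are usually stepped, so treat this as a lower bound.

```python
def monthly_cost(num_vectors: int, price_per_million: float) -> float:
    """Linear per-vector pricing; real vendor tiers are usually stepped,
    so treat this as a lower-bound sketch, not a quote."""
    return num_vectors / 1_000_000 * price_per_million

CURRENT_VECTORS = 5_000_000
PRICE = 70.0  # USD per million vectors per month -- placeholder assumption

for multiple in (1, 3, 10, 30):
    n = CURRENT_VECTORS * multiple
    print(f"{multiple:>2}x ({n / 1e6:>5.0f}M vectors): "
          f"${monthly_cost(n, PRICE):>9,.0f}/month")
```

Even with flat pricing, a 30x growth path turns a modest line item into a dominant one; stepped enterprise tiers and egress fees typically make the real curve steeper.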
Integration Architecture: Where Vector Databases Sit
A vector database is a component of a larger AI data architecture, not a standalone system. How it integrates with the existing data platform determines operational overhead more than the choice of vendor.
The two integration questions that matter most: first, where do embeddings get generated and how do they flow into the vector store? Second, how is the vector store kept in sync when source data changes?
The embedding generation pipeline typically runs as part of the data lake processing layer: source document arrives, triggers an embedding job, vector is upserted into the store with its metadata. This requires a durable queue (Kafka, SQS) between the data platform and the vector store to handle backpressure when embedding throughput exceeds ingestion capacity.
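The shape of that pipeline can be sketched as follows. An in-memory queue stands in for the durable queue (Kafka, SQS), and both the embedding call and the vector-store client are stubs; none of the names below are a real client API.

```python
import hashlib
import queue

# Stand-in for a durable queue between the data platform and the
# vector store; in production this would be Kafka or SQS.
ingest_queue: "queue.Queue[dict]" = queue.Queue()
vector_store: dict[str, dict] = {}  # doc_id -> {"vector": ..., "metadata": ...}

def embed(text: str) -> list[float]:
    """Stub for the embedding model call (an API or a local model)."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:8]]  # fake 8-dimensional embedding

def process_one() -> None:
    """Consume one document event and upsert its vector with metadata.

    The queue decouples ingestion rate from embedding throughput: if
    embedding falls behind, events accumulate durably instead of
    being dropped.
    """
    event = ingest_queue.get()
    vector_store[event["doc_id"]] = {
        "vector": embed(event["text"]),
        "metadata": {"source": event["source"]},
    }
    ingest_queue.task_done()

ingest_queue.put({"doc_id": "doc-1", "text": "quarterly report", "source": "s3"})
process_one()
print(vector_store["doc-1"]["metadata"])  # {'source': 's3'}
```

The essential property is the upsert keyed on a stable document ID: reprocessing the same document replaces its vector rather than duplicating it, which matters for the synchronisation problem below.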
The synchronisation problem is harder. When a document is updated, its embedding changes. When a document is deleted, its vector must be removed. Systems that handle this correctly maintain an audit log of document-to-vector-ID mappings and run reconciliation jobs. Systems that handle this poorly accumulate stale vectors that degrade retrieval quality over time. This is not a vector database problem. It is a data engineering problem that manifests as a vector database problem.
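A reconciliation job of the kind described reduces to a set difference over the audit log. The in-memory mapping below is a stand-in for that log; the names are illustrative.

```python
def reconcile(source_doc_ids: set[str],
              doc_to_vector_ids: dict[str, str]) -> list[str]:
    """Return vector IDs whose source document no longer exists.

    Assumes an audit log mapping doc_id -> vector_id is maintained at
    ingestion time; without that mapping, stale vectors cannot be
    identified at all.
    """
    return [vec_id for doc_id, vec_id in doc_to_vector_ids.items()
            if doc_id not in source_doc_ids]

audit_log = {"doc-1": "vec-a", "doc-2": "vec-b", "doc-3": "vec-c"}
live_docs = {"doc-1", "doc-3"}  # doc-2 was deleted upstream

stale = reconcile(live_docs, audit_log)
print(stale)  # ['vec-b']
```

The stale IDs then feed a batched delete against the vector store. Running this on a schedule bounds how long a deleted document can keep polluting retrieval results.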
Evaluating a Vendor: What to Ask Before You Sign
Beyond the technical evaluation, enterprise vendor selection for infrastructure components involves contractual and operational questions that data scientists often skip. Before committing to any vector database vendor at enterprise scale, get answers to these questions in writing:
- Data export: What format is data exported in? How long does a full export take at your scale? Is there an API or only a web interface?
- Service level agreement: What is the SLA for query latency, not just uptime? What is the remediation process and compensation structure for SLA breaches?
- Version compatibility: What is the upgrade path between major versions? How many breaking changes occurred in the last three major releases?
- Support tier: Does enterprise support include a named technical account manager? What is the escalation path for production incidents at 3am?
- Security certifications: SOC 2 Type II, ISO 27001, and any sector-specific certifications your procurement process requires.
These questions are not bureaucratic. They are the questions you will wish you had asked when your vendor has a three-hour outage during a peak business period and the SLA document turns out to cover only uptime, not latency.
When to Consolidate Vector Search into an Existing Platform
Not every team needs a dedicated vector database. For teams in the right situation, the case for pgvector is stronger than its performance characteristics suggest. If you run Postgres, have a skilled DBA team, and your vector use case is internal with moderate scale, pgvector gives you ACID transactions, JOIN capability, a single operational stack, and familiar tooling. The operational cost savings over a dedicated vector database can justify a performance compromise that would be unacceptable for a customer-facing use case.
Similarly, if your data platform lives entirely within a single cloud provider, the native vector offerings from AWS (OpenSearch with k-NN), GCP (Vertex Matching Engine), and Azure (AI Search) are worth evaluating seriously. They are not the performance leaders but they have the lowest operational overhead for teams already deep in that cloud ecosystem. The deeper your cloud commitment, the more attractive the managed native option becomes.
The decision tree is simple: if your use case is internal, your scale is moderate, and you run Postgres, start with pgvector and evaluate dedicated options only when you hit limits. If your use case is customer-facing, high-volume, or involves sensitive data with strict residency requirements, evaluate dedicated options from the start. See our AI data strategy advisory for help structuring this decision within your existing platform architecture.
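That decision tree is small enough to encode directly. Thresholds like "moderate scale" are judgment calls rather than hard numbers, so the boolean inputs below are deliberately coarse:

```python
def recommend(customer_facing: bool, high_volume: bool,
              strict_residency: bool, runs_postgres: bool) -> str:
    """Encode the selection decision tree; 'moderate scale' and
    'high volume' are judgment calls, not hard thresholds."""
    if customer_facing or high_volume or strict_residency:
        return "evaluate dedicated vector databases from the start"
    if runs_postgres:
        return "start with pgvector; revisit when you hit limits"
    return "evaluate your cloud provider's managed native option first"

print(recommend(customer_facing=False, high_volume=False,
                strict_residency=False, runs_postgres=True))
```

The value of writing it down this way is that the branch order makes the priorities explicit: customer-facing, high-volume, or residency-constrained workloads override the convenience argument for consolidation.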
Summary: Making a Decision That Holds at Scale
The vendor you select in a two-week proof of concept is usually fine for the proof of concept. The problem surfaces when production scale is 10x the POC and the cost model changes, or when your security team reviews the data residency terms, or when the vendor has a major version release with breaking API changes.
Select based on your anticipated scale in two years, not your current scale. Prefer vendors with open storage formats and documented migration paths. Quantify TCO at 10x current volume before signing. And if you cannot answer the question "how do we migrate to a different vector database if this vendor is acquired next year," make sure you can before you go to production.
For teams building the infrastructure strategy around multiple AI data components, the enterprise data lake architecture guide covers how vector storage integrates with the broader data platform. The AI vendor selection advisory provides structured evaluation frameworks and reference architectures for these decisions.