## Introduction
Vector databases store high-dimensional embeddings and support approximate nearest neighbor (ANN) search. They are the retrieval backbone for RAG (Retrieval-Augmented Generation) systems, semantic search, and recommendation engines. This post compares the main options and covers practical usage.
## What Problems Vector Databases Solve
Traditional databases support exact string or numeric matching. Vector search finds items that are semantically similar even when they share no keywords.
```sql
-- Example: semantic search
-- Query: "laptop computer" should find "notebook PC", "MacBook Pro"
-- even though they share no common words

-- Traditional search: keyword matching
SELECT * FROM products WHERE name LIKE '%laptop%';
-- Misses: "MacBook Pro", "ThinkPad"

-- Vector search: similarity-based
-- Embed the query and find products with similar embeddings
```
## Embeddings
An embedding is a dense vector that represents semantic meaning. Similar concepts have vectors that are close in high-dimensional space.
```python
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return response.data[0].embedding  # 1536-dimensional vector

# Embed documents at indexing time
product_embedding = embed("Wireless mechanical keyboard with RGB backlight")
# [0.0234, -0.0891, 0.1203, ...]  # 1536 dimensions

# Embed query at search time
query_embedding = embed("keyboard for programmers")
# Similar vector to the product embedding
```
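"Close" is usually measured with cosine similarity: the cosine of the angle between two vectors, where 1.0 means identical direction. A minimal sketch in pure Python (toy 3-dimensional vectors stand in for real 1536-dimensional embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical direction -> 1.0; orthogonal (unrelated) -> 0.0
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0
```

Vector databases compute exactly this (or L2 distance / inner product) across millions of stored vectors, using ANN indexes to avoid a brute-force scan.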
## pgvector: PostgreSQL Extension
Add vector search to your existing PostgreSQL database. Best choice if you don’t want to add a new infrastructure component.
```sql
-- Install extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Add vector column (1536 dimensions for OpenAI text-embedding-3-small)
ALTER TABLE products ADD COLUMN embedding vector(1536);

-- Create HNSW index for fast ANN search
-- m = max connections per graph node, ef_construction = build-time search width;
-- higher values improve recall at the cost of build time and memory
CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- IVFFlat index (faster to build, slightly less accurate)
-- CREATE INDEX ON products USING ivfflat (embedding vector_cosine_ops)
-- WITH (lists = 100);
```
```python
import asyncpg
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def search_products(query: str, limit: int = 10) -> list[dict]:
    # Embed the query
    response = await client.embeddings.create(
        model="text-embedding-3-small",
        input=query,
    )
    query_embedding = response.data[0].embedding

    # asyncpg cannot encode a Python list as the vector type out of the box;
    # pass pgvector's text literal (or register pgvector.asyncpg.register_vector)
    embedding_param = str(query_embedding)

    # Search using cosine similarity
    conn = await asyncpg.connect("postgresql://localhost/mydb")
    try:
        results = await conn.fetch("""
            SELECT id, name, description,
                   1 - (embedding <=> $1::vector) AS similarity
            FROM products
            WHERE 1 - (embedding <=> $1::vector) > 0.7
            ORDER BY embedding <=> $1::vector
            LIMIT $2
        """, embedding_param, limit)
    finally:
        await conn.close()
    return [dict(r) for r in results]

# <=> cosine distance, <-> L2 distance, <#> negative inner product
```
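The embedding-as-text detail above is worth pinning down: pgvector's text input syntax is `[x1,x2,...]`. A tiny formatting helper (an illustration for this post, not part of pgvector itself) makes the wire format explicit:

```python
def to_pgvector_literal(embedding: list[float]) -> str:
    """Format a Python list as pgvector's text input syntax: '[x1,x2,...]'."""
    return "[" + ",".join(repr(x) for x in embedding) + "]"

# Passed as a query parameter and cast server-side: ... embedding <=> $1::vector
print(to_pgvector_literal([0.1, -0.2, 0.3]))  # [0.1,-0.2,0.3]
```

Alternatively, `pgvector.asyncpg.register_vector(conn)` from the `pgvector` Python package teaches the driver to encode lists and NumPy arrays directly.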
Pros: no additional infrastructure, full SQL power, transactional consistency.
Cons: slower than dedicated vector databases at scale (>10M vectors), no distributed sharding.
## Pinecone: Managed Vector Database
Fully managed, serverless vector database. No infrastructure to manage.
```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="your-api-key")

# Create index
pc.create_index(
    name="products",
    dimension=1536,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)

index = pc.Index("products")

# Upsert vectors with metadata
# (embedding_vector comes from an embedding model, as in the Embeddings section)
index.upsert(vectors=[
    {
        "id": "prod-123",
        "values": embedding_vector,
        "metadata": {
            "name": "Wireless Keyboard",
            "category": "electronics",
            "price": 79.99,
        },
    },
    # ...more vectors
])

# Query with metadata filtering
results = index.query(
    vector=query_embedding,
    top_k=10,
    filter={"category": "electronics", "price": {"$lte": 100}},
    include_metadata=True,
)

for match in results.matches:
    print(f"{match.id}: {match.score:.3f} - {match.metadata['name']}")
```
Pros: managed, scales automatically, fast at large scale.
Cons: cost, vendor lock-in, no SQL, data leaves your infrastructure.
## Weaviate: Open-Source Vector Database
Self-hosted or managed, supports hybrid search (vector + keyword BM25).
```python
import weaviate
from weaviate.classes.config import Property, DataType, Configure
from weaviate.classes.query import HybridFusion

client = weaviate.connect_to_local()

# Create a collection (like a table)
client.collections.create(
    name="Product",
    properties=[
        Property(name="name", data_type=DataType.TEXT),
        Property(name="description", data_type=DataType.TEXT),
        Property(name="price", data_type=DataType.NUMBER),
    ],
    vectorizer_config=Configure.Vectorizer.text2vec_openai(
        model="text-embedding-3-small"
    ),
)

collection = client.collections.get("Product")

# Insert objects (auto-vectorized by Weaviate)
collection.data.insert({
    "name": "Wireless Keyboard",
    "description": "Mechanical keyboard with RGB lighting",
    "price": 79.99,
})

# Hybrid search: combine vector and keyword search
results = collection.query.hybrid(
    query="keyboard for programmers",
    alpha=0.5,  # 0 = pure keyword (BM25), 1 = pure vector
    fusion_type=HybridFusion.RELATIVE_SCORE,
    limit=10,
)

for obj in results.objects:
    print(obj.properties["name"])

client.close()
```
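For intuition on what relative-score fusion does: each result list's scores are normalized to a common [0, 1] range, then blended as `alpha * vector + (1 - alpha) * keyword`. A toy sketch of the idea (an illustration, not Weaviate's exact implementation):

```python
def fuse(vector_scores: dict[str, float],
         keyword_scores: dict[str, float],
         alpha: float) -> dict[str, float]:
    """Toy relative-score fusion: min-max normalize each list, then blend."""
    def normalize(scores: dict[str, float]) -> dict[str, float]:
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {k: (v - lo) / span for k, v in scores.items()}

    v, k = normalize(vector_scores), normalize(keyword_scores)
    ids = set(v) | set(k)
    return {i: alpha * v.get(i, 0.0) + (1 - alpha) * k.get(i, 0.0) for i in ids}

# "b" appears in both lists, so it benefits from both signals
ranked = fuse({"a": 0.9, "b": 0.5}, {"b": 12.0, "c": 4.0}, alpha=0.5)
```

Normalizing first is the point: raw BM25 scores (unbounded) and cosine similarities (roughly 0 to 1) live on different scales, so summing them directly would let one signal dominate.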
Pros: open-source, hybrid search, multi-modal support.
Cons: more complex to operate than pgvector, less mature ecosystem than Pinecone.
## Choosing the Right Tool
| Requirement | Choice |
|---|---|
| Existing PostgreSQL infrastructure | pgvector |
| <5M vectors, simple use case | pgvector |
| >10M vectors, managed service | Pinecone |
| Hybrid keyword + semantic search | Weaviate |
| On-premise, full control | Weaviate or pgvector |
| Fastest time to production | Pinecone |
## Conclusion
pgvector is the right starting point for most teams — it requires no new infrastructure and integrates with existing PostgreSQL tooling. At scale (tens of millions of vectors), dedicated vector databases provide better performance. Weaviate’s hybrid search is compelling when you need to combine semantic relevance with keyword matching. For production RAG systems, start with pgvector and migrate to a dedicated vector database only when query latency becomes a problem.