Vector Databases (Pinecone, Milvus)
Storing and retrieving semantic data efficiently.
What Are Vector Databases?
[Diagram: Vector DB architecture]
Vector databases store and index high-dimensional vectors (embeddings) for fast similarity search. They are the core retrieval component of most RAG systems.
How They Work
- Embed: Convert your text into vectors using an embedding model (e.g., OpenAI text-embedding-3-small)
- Index: Store vectors in an optimized index structure (HNSW, IVF)
- Query: Convert the user's question into a vector, find the nearest neighbors
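The three steps above can be sketched end to end without any external service. This is a toy illustration only: the bag-of-words `embed` function and fixed `VOCAB` stand in for a real embedding model, and a plain list stands in for an optimized index.

```python
import math

# Toy "embedding": a word-count profile over a fixed vocabulary.
# A real system would call an embedding model instead.
VOCAB = ["rag", "vector", "database", "chunking", "retrieval"]

def embed(text):
    words = text.lower().split()
    return [words.count(w) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# "Index": here just a list of (document, vector) pairs;
# real databases use structures like HNSW or IVF for sublinear search.
docs = ["rag retrieval quality", "vector database indexing", "chunking strategy"]
index = [(doc, embed(doc)) for doc in docs]

# "Query": embed the question, then rank documents by similarity.
query_vec = embed("how does a vector database work")
results = sorted(index, key=lambda p: cosine(query_vec, p[1]), reverse=True)
print(results[0][0])  # the most similar document
```

The brute-force scan above is O(n) per query; index structures like HNSW trade a small amount of recall for much faster approximate search.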
Popular Options
- Pinecone: Managed, serverless. Great for getting started quickly.
- Milvus / Zilliz: Open-source, highly scalable. Good for large datasets.
- Weaviate: Open-source with built-in vectorization modules.
- ChromaDB: Lightweight, great for prototyping and local development.
- pgvector: PostgreSQL extension. Use your existing Postgres infrastructure.
Similarity Metrics
- Cosine Similarity: Most common. Measures angle between vectors. Best for text.
- Euclidean Distance: Measures straight-line distance. Good for spatial data.
- Dot Product: Fast computation. Works when vectors are normalized.
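The three metrics can be compared in a few lines of pure Python (the vectors here are illustrative). Note in particular that for unit-length vectors, the dot product equals cosine similarity, which is why dot product is only safe after normalization.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    # Angle-based similarity: dot product divided by both magnitudes.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    # Straight-line distance between the two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

a, b = [3.0, 4.0], [4.0, 3.0]
print(cosine(a, b))                      # 0.96
print(dot(normalize(a), normalize(b)))   # same value after normalization
print(euclidean(a, b))                   # sqrt(2) ≈ 1.4142
```

Some embedding APIs return pre-normalized vectors, in which case cosine and dot product give identical rankings; check your model's documentation before choosing a metric.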
Code Example
This example uses ChromaDB for local vector storage: embed documents with OpenAI, store them in Chroma, and query by semantic similarity.
```python
from openai import OpenAI
import chromadb

client = OpenAI()
chroma = chromadb.Client()

# Create a collection
collection = chroma.create_collection("docs")

# Add documents
docs = [
    "RAG improves LLM accuracy by providing relevant context.",
    "Vector databases store embeddings for fast similarity search.",
    "Chunking strategy significantly affects retrieval quality.",
]

# Embed and store
for i, doc in enumerate(docs):
    embedding = client.embeddings.create(
        input=doc, model="text-embedding-3-small"
    ).data[0].embedding

    collection.add(
        embeddings=[embedding],
        documents=[doc],
        ids=[f"doc_{i}"],
    )

# Query
query = "How does RAG work?"
query_embedding = client.embeddings.create(
    input=query, model="text-embedding-3-small"
).data[0].embedding

results = collection.query(query_embeddings=[query_embedding], n_results=2)
print(results["documents"])
```
Use Cases
- Semantic search over internal documentation
- Product recommendation engines
- Duplicate detection in support tickets
- Image search using CLIP embeddings
Common Mistakes
- Using the wrong similarity metric for your embedding model
- Not normalizing vectors when using dot product similarity
- Storing too many metadata fields, which slows down queries
- Choosing a managed solution when pgvector would suffice at your scale
Interview Insight
Relevance
High - Core RAG infrastructure