Semantic and vector search implementations for improved information retrieval.
Search Technologies
Vector Search (Semantic)
| Provider | Plan / Tier | Price | Notes |
|---|---|---|---|
| Pinecone | Serverless | $0.10/1k vectors | Pay per query |
| Weaviate | Standard | $0.00025/1k vectors | Cloud |
| Qdrant | Cloud | $0.004/1k vectors | Self-hosted option |
| Chroma | Local | Free | Open source |
| Pinecone | Starter | $70/mo | 1M vectors |
Hybrid Search
| Provider | Pricing | Best For |
|---|---|---|
| Algolia | $0-$500+/mo | General search |
| Elasticsearch | $0.10-$0.60/1k docs | Enterprise |
| Typesense | $25-$250/mo | Fast, open source |
Embedding Costs for Search
| Provider | Model | Per 1M Tokens |
|---|---|---|
| OpenAI | text-embedding-3-small | $0.02 |
| Cohere | embed-v3 | $0.10 |
| Google | Embedding Gecko | $0.10 |
| Hugging Face | Instructor | Free |
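The per-token prices above turn into a corpus-level cost with one multiplication. The following is a minimal sketch; the dictionary and the function name `embedding_cost` are illustrative, and the prices are simply copied from the table rather than fetched from any provider.

```python
# USD per 1M tokens, copied from the table above (not live prices).
EMBEDDING_PRICE_PER_1M = {
    "text-embedding-3-small": 0.02,
    "embed-v3": 0.10,
    "Embedding Gecko": 0.10,
    "Instructor": 0.0,  # listed as free in the table; hosting cost is not modeled
}


def embedding_cost(total_tokens: int, model: str) -> float:
    """Return the USD cost of embedding `total_tokens` tokens with `model`."""
    return total_tokens / 1_000_000 * EMBEDDING_PRICE_PER_1M[model]


if __name__ == "__main__":
    # 500M tokens, e.g. 1M documents at roughly 500 tokens each.
    for model in EMBEDDING_PRICE_PER_1M:
        print(f"{model}: ${embedding_cost(500_000_000, model):,.2f}")
```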
Search Cost Examples
1M Document Corpus
| Component | Usage | Monthly Cost |
|---|---|---|
| Embeddings (initial, one-time) | 500M tokens | $10 |
| Pinecone (1M vectors) | Serverless | ~$100 |
| LLM (query answering) | 10M tokens | $25 |
| Total | | ~$135 first month, ~$125/mo after |
10M Document Corpus
| Component | Usage | Monthly Cost |
|---|---|---|
| Embeddings | 5B tokens | $100 |
| Pinecone (10M vectors) | Standard | $500 |
| LLM | 100M tokens | $250 |
| Total | | ~$850/mo |
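The totals in the two corpus examples can be reproduced with a small calculator. This is a sketch under stated assumptions: the class name `CorpusEstimate` and its fields are made up for illustration, the embedding price is the $0.02 per 1M tokens listed for text-embedding-3-small, and the LLM price of $2.50 per 1M tokens is the rate implied by the LLM rows ($25 for 10M tokens).

```python
from dataclasses import dataclass


@dataclass
class CorpusEstimate:
    embedding_tokens: int    # tokens sent to the embedding model
    embedding_price: float   # USD per 1M tokens
    vector_db_cost: float    # USD per month, taken from the provider's plan
    llm_tokens: int          # LLM tokens per month
    llm_price: float         # USD per 1M tokens (assumed blended rate)

    def total(self) -> float:
        """Sum the three components shown in the tables above."""
        embed = self.embedding_tokens / 1_000_000 * self.embedding_price
        llm = self.llm_tokens / 1_000_000 * self.llm_price
        return embed + self.vector_db_cost + llm


if __name__ == "__main__":
    small = CorpusEstimate(500_000_000, 0.02, 100.0, 10_000_000, 2.50)
    large = CorpusEstimate(5_000_000_000, 0.02, 500.0, 100_000_000, 2.50)
    print(f"1M docs:  ${small.total():,.2f}")
    print(f"10M docs: ${large.total():,.2f}")
```

Keeping the vector database cost as a plain monthly figure avoids guessing at any provider's billing formula; swap in the number from the plan you actually use.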
Per-Query Costs
Embedding Lookup
| Corpus Size | Vector Ops | Cost/Query |
|---|---|---|
| 1M vectors | 1 | $0.0001 |
| 10M vectors | 1 | $0.001 |
| 100M vectors | 1 | $0.01 |
Full RAG Query (100 tokens in, 50 out)
| Model | Embedding | LLM | Total |
|---|---|---|---|
| GPT-4o-mini | $0.000002 | $0.00003 | $0.00003 |
| Claude Haiku | $0.00001 | $0.00007 | $0.00008 |
| GPT-4o | $0.000002 | $0.00013 | $0.00013 |
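A per-query figure like the ones above is the sum of one embedding call and one LLM call. The sketch below shows the arithmetic only. The function name `rag_query_cost` is illustrative, and the LLM prices in the usage example are hypothetical placeholders, not quotes for any model in the table, so the printed numbers will not necessarily match the rows above.

```python
from typing import Tuple


def rag_query_cost(
    query_tokens: int,
    llm_input_tokens: int,
    llm_output_tokens: int,
    embed_price: float,       # USD per 1M tokens
    llm_input_price: float,   # USD per 1M tokens
    llm_output_price: float,  # USD per 1M tokens
) -> Tuple[float, float, float]:
    """Return (embedding cost, LLM cost, total) in USD for one query."""
    embed = query_tokens / 1_000_000 * embed_price
    llm = (
        llm_input_tokens / 1_000_000 * llm_input_price
        + llm_output_tokens / 1_000_000 * llm_output_price
    )
    return embed, llm, embed + llm


if __name__ == "__main__":
    # Embedding price from the table above; LLM prices are placeholders.
    HYPOTHETICAL_LLM_INPUT_PRICE = 0.15
    HYPOTHETICAL_LLM_OUTPUT_PRICE = 0.60
    embed, llm, total = rag_query_cost(
        query_tokens=100,
        llm_input_tokens=100,  # in practice this also includes retrieved context
        llm_output_tokens=50,
        embed_price=0.02,
        llm_input_price=HYPOTHETICAL_LLM_INPUT_PRICE,
        llm_output_price=HYPOTHETICAL_LLM_OUTPUT_PRICE,
    )
    print(f"embedding ${embed:.6f}  llm ${llm:.6f}  total ${total:.6f}")
```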
Optimization Strategies
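- ANN algorithms - Use HNSW or IVF indexes for faster approximate nearest-neighbor search
- Dimension reduction - Use lower-dimensional embeddings; OpenAI text-embedding-3-small outputs 1536 dimensions by default and can be shortened with the dimensions parameter
- Caching - Cache results for frequent queries
- Tiered search - Run BM25 first, then fall back to vector search only for ambiguous queries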
- Embeddings - Embedding comparison
- RAG - Search-augmented generation
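The caching and tiered-search strategies listed above can be combined in one wrapper. This is a minimal sketch, not a production design: `make_tiered_search` is a hypothetical helper, the keyword and vector backends are passed in as plain callables (a real setup would plug in a BM25 engine and a vector database client), and the score threshold is an assumption you would tune on your own data.

```python
from functools import lru_cache
from typing import Callable, List, Tuple

Hit = Tuple[str, float]  # (document id, score)


def make_tiered_search(
    keyword_search: Callable[[str], List[Hit]],
    vector_search: Callable[[str], List[Hit]],
    min_keyword_score: float,
    cache_size: int = 1024,
) -> Callable[[str], Tuple[Hit, ...]]:
    """Build a cached search function that only pays for vector search
    when the keyword tier has no confident match.

    Both backends are assumed to return hits sorted by descending score.
    """

    @lru_cache(maxsize=cache_size)
    def search(query: str) -> Tuple[Hit, ...]:
        hits = keyword_search(query)
        if hits and hits[0][1] >= min_keyword_score:
            return tuple(hits)  # cheap tier was good enough
        return tuple(vector_search(query))  # ambiguous query: fall back

    return search


if __name__ == "__main__":
    vector_calls: List[str] = []

    def fake_keyword(query: str) -> List[Hit]:
        # Stand-in for a BM25 engine: matches only on a literal keyword.
        return [("doc-1", 9.0)] if "pricing" in query else []

    def fake_vector(query: str) -> List[Hit]:
        # Stand-in for a vector database query; records each paid call.
        vector_calls.append(query)
        return [("doc-7", 0.82)]

    search = make_tiered_search(fake_keyword, fake_vector, min_keyword_score=5.0)
    print(search("pinecone pricing"))            # answered by the keyword tier
    print(search("how much will this cost me"))  # falls back to vector search
    print(search("how much will this cost me"))  # repeated query, served from cache
    print("vector calls:", len(vector_calls))
```

Results are returned as tuples so they are safe to share between cache hits. The cache key is the raw query string, so normalizing case and whitespace before calling `search` would raise the hit rate.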