Search

Semantic and vector search implementations for improved information retrieval.

Search Technologies

Vector Search (Semantic)

| Provider | Tier | Price | Notes |
|---|---|---|---|
| Pinecone | Serverless | $0.10/1k vectors | Pay per query |
| Pinecone | Starter | $70/mo | 1M vectors |
| Weaviate | Standard | $0.00025/1k vectors | Cloud |
| Qdrant | Cloud | $0.004/1k vectors | Self-hosted option |
| Chroma | Local | Free | Open source |
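To illustrate what a vector store does under the hood, here is a minimal brute-force nearest-neighbor search over cosine similarity in pure Python. This is a sketch only; hosted providers like Pinecone use approximate indexes (HNSW, IVF) rather than a linear scan, and the `search`/`index` names here are illustrative.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=3):
    """Brute-force scan: score every stored vector, return the best matches.
    `index` is a list of (doc_id, vector) pairs."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in index]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]
```

A linear scan is O(n) per query, which is why the ANN indexes mentioned below matter once a corpus grows past a few hundred thousand vectors.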
Keyword Search (Full-Text)

| Provider | Pricing | Best For |
|---|---|---|
| Algolia | $0-$500+/mo | General search |
| Elasticsearch | $0.10-$0.60/1k docs | Enterprise |
| Typesense | $25-$250/mo | Fast, open source |
Embedding Models

| Provider | Model | Per 1M Tokens |
|---|---|---|
| OpenAI | text-embedding-3-small | $0.02 |
| Cohere | embed-v3 | $0.10 |
| Google | Embedding Gecko | $0.10 |
| Hugging Face | Instructor | Free |
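Embedding spend is simple to estimate from these rates: tokens divided by one million, times the per-million price. A minimal sketch (the function name is illustrative):

```python
def embedding_cost(total_tokens, price_per_million):
    """Dollar cost to embed `total_tokens` at a given $/1M-token rate."""
    return total_tokens / 1_000_000 * price_per_million

# Embedding ~500M tokens with text-embedding-3-small at $0.02/1M:
print(embedding_cost(500_000_000, 0.02))  # → 10.0
```

This matches the $10 initial-embedding line in the 1M-document example below.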

Search Cost Examples

1M Document Corpus

| Component | Usage | Monthly Cost |
|---|---|---|
| Embeddings (initial) | 500M tokens | $10 |
| Pinecone (1M vectors) | Serverless | ~$100 |
| LLM (query answering) | 10M tokens | $25 |
| **Total** | | ~$135/mo |

10M Document Corpus

| Component | Usage | Monthly Cost |
|---|---|---|
| Embeddings | 5B tokens | $100 |
| Pinecone (10M vectors) | Standard | $500 |
| LLM | 100M tokens | $250 |
| **Total** | | ~$850/mo |
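The arithmetic behind both examples can be sketched as a sum of three components. Rates are in $/1M tokens; the $2.50/1M blended LLM rate below is an assumption chosen to reproduce the $25 LLM line in the 1M-document table, not a published price.

```python
def monthly_rag_cost(embed_tokens, embed_rate, vector_store_fee,
                     llm_tokens, llm_rate):
    """Rough monthly bill: embedding tokens + flat vector-store fee
    + LLM usage. Token rates are $/1M tokens."""
    embeddings = embed_tokens / 1e6 * embed_rate
    llm = llm_tokens / 1e6 * llm_rate
    return embeddings + vector_store_fee + llm

# 1M-document example: 500M embed tokens, ~$100 Pinecone, 10M LLM tokens
print(monthly_rag_cost(500e6, 0.02, 100, 10e6, 2.50))  # → 135.0
```

Note the embedding line is a one-time indexing cost; on a steady-state month only the vector store and LLM lines recur.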

Per-Query Costs

Embedding Lookup

| Corpus Size | Vector Ops | Cost/Query |
|---|---|---|
| 1M vectors | 1 | $0.0001 |
| 10M vectors | 1 | $0.001 |
| 100M vectors | 1 | $0.01 |

Full RAG Query (100 tokens in, 50 out)

| Model | Embedding | LLM | Total |
|---|---|---|---|
| GPT-4o-mini | $0.000002 | $0.00003 | $0.00003 |
| Claude Haiku | $0.00001 | $0.00007 | $0.00008 |
| GPT-4o | $0.000002 | $0.00013 | $0.00013 |
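A per-query cost breaks into embedding the query plus LLM input and output tokens. A hedged sketch, with all rates as parameters in $/1M tokens; the example rates below ($0.02 embedding, $0.15 in / $0.60 out) are assumptions, so the result lands in the same ballpark as the table rather than matching it exactly:

```python
def rag_query_cost(tokens_in, tokens_out, embed_rate, in_rate, out_rate):
    """Per-query cost: embed the query, then pay for LLM input and
    output tokens. All rates are $/1M tokens."""
    embed = tokens_in / 1e6 * embed_rate
    llm = tokens_in / 1e6 * in_rate + tokens_out / 1e6 * out_rate
    return embed + llm

# 100 tokens in, 50 out, with the assumed rates above:
cost = rag_query_cost(100, 50, 0.02, 0.15, 0.60)
```

At these magnitudes the embedding term is negligible; output tokens dominate per-query spend.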

Optimization Strategies

  1. ANN algorithms - Use HNSW or IVF indexes for faster approximate search
  2. Dimension reduction - Use compact models such as text-embedding-3-small (1536 dimensions), or truncate dimensions where the model supports it
  3. Caching - Cache results for frequent queries to avoid repeat embedding and lookup costs
  4. Tiered search - Run cheap BM25 keyword search first, falling back to vector search for ambiguous queries
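Caching (strategy 3) needs almost no code. A minimal sketch using the standard library's `functools.lru_cache`; the search function and counter here are hypothetical stand-ins for a real embedding-plus-vector-store call:

```python
from functools import lru_cache

CALLS = 0  # counts how often we actually pay for embedding + vector lookup

@lru_cache(maxsize=1024)
def cached_search(query: str):
    """Stand-in search entry point: the expensive work runs only on a
    cache miss; identical repeat queries are served from memory."""
    global CALLS
    CALLS += 1
    return f"results for {query!r}"  # placeholder for real results

cached_search("pinecone pricing")
cached_search("pinecone pricing")  # cache hit: no new API spend
```

In production you would typically key the cache on a normalized query string (lowercased, whitespace-collapsed) and add a TTL so stale results expire.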
Related

  • Embeddings - embedding model comparison
  • RAG - search-augmented generation