Hugging Face

Open-source AI models, free inference, and the largest ML community.

Inference API (Pay-per-use)

Language Models

| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Llama 3.1 70B | $0.65 | $2.20 |
| Llama 3.1 8B | $0.22 | $0.88 |
| Mistral 7B | $0.24 | $0.24 |
| Mixtral 8x7B | $0.24 | $0.24 |

Embeddings

| Model | Price (per 1M tokens) |
|---|---|
| Instructor | Free |
| BGE | Free |

Inference Endpoints (Dedicated)

| Instance | Price per hour | GPU |
|---|---|---|
| small | $0.06 | T4 |
| medium | $0.35 | A10G |
| large | $1.01 | A100 |
| xlarge | $2.93 | A100 80GB |
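
Hourly prices are easier to compare with a monthly budget once they are converted to an always-on figure. The sketch below does that conversion for the four instance sizes listed above. It assumes a 30-day month and an endpoint that is never paused or scaled to zero; both are illustrative assumptions, not billing rules stated on this page.

```python
# Assumption: a 30-day month with the endpoint running the whole time.
HOURS_PER_MONTH = 24 * 30

# Hourly prices (USD) taken from the dedicated endpoint table.
hourly = {"small": 0.06, "medium": 0.35, "large": 1.01, "xlarge": 2.93}

# Multiply each hourly price by the number of hours in the assumed month.
monthly = {name: round(rate * HOURS_PER_MONTH, 2) for name, rate in hourly.items()}

for name, cost in monthly.items():
    print(f"{name}: ${cost:,.2f}/month")
```

An endpoint that runs only part of the day costs proportionally less, so the numbers here are best read as an upper bound for one replica.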

Free Tier

| Benefit | Limit |
|---|---|
| Requests per day | 1,000 |
| Rate limit | 30 RPM |
| Models | Popular ones |

Pro Tier

| Benefit | Value |
|---|---|
| Monthly price | $9 |
| Requests per day | 50,000 |
| Rate limit | 200 RPM |
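
A requests-per-minute limit translates into a minimum spacing between calls: 60 / 30 = 2 seconds on the free tier and 60 / 200 = 0.3 seconds on Pro. The sketch below is one way a client might pace itself to stay under such a limit. The `Throttle` class is hypothetical and not part of any Hugging Face library, and evenly spacing requests is a conservative choice, since the page does not say how the service measures its rate window.

```python
import time


class Throttle:
    """Space calls evenly so they stay under a requests-per-minute limit."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm  # seconds between consecutive requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed, then reserve that slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval


free = Throttle(rpm=30)   # free tier limit from the table above
pro = Throttle(rpm=200)   # Pro tier limit from the table above
print(free.min_interval, pro.min_interval)
```

Calling `free.wait()` before each request keeps a loop under 30 RPM. The daily request cap is separate and would need its own counter.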

Cost Examples

API-Based (1M requests)

| Model | Tokens per request | Cost |
|---|---|---|
| Llama 3.1 8B | 500 | $110 |
| GPT-4o-mini | 500 | $75 |
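
These totals follow from simple arithmetic: 1M requests at 500 tokens each is 500M tokens, and 500 × $0.22 = $110 for Llama 3.1 8B. That result holds only if every token is billed at the input rate, which appears to be the assumption behind the table. The sketch below reproduces the figure and shows how the total changes once some tokens are billed as output. The function name `api_cost` and the 400/100 token split are illustrative assumptions.

```python
def api_cost(requests, input_tokens, output_tokens, input_rate, output_rate):
    """Return the USD cost of `requests` calls.

    Token counts are per request; rates are USD per 1M tokens.
    """
    input_cost = requests * input_tokens / 1_000_000 * input_rate
    output_cost = requests * output_tokens / 1_000_000 * output_rate
    return input_cost + output_cost


# Llama 3.1 8B rates from the pricing table: $0.22 input, $0.88 output.
all_input = api_cost(1_000_000, 500, 0, 0.22, 0.88)

# Hypothetical split: 400 input tokens and 100 output tokens per request.
mixed = api_cost(1_000_000, 400, 100, 0.22, 0.88)

print(f"all tokens as input: ${all_input:.2f}")
print(f"400 in / 100 out:    ${mixed:.2f}")
```

Because output tokens cost four times as much as input tokens for this model, even a modest share of generated text raises the total noticeably.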

Self-Hosted (1M requests)

| Instance | Hours | Cost |
|---|---|---|
| A100 (large) | 100 | $101 |
| A100 80GB (xlarge) | 50 | $146 |
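
The cost of a dedicated endpoint is the hourly price multiplied by the hours it runs. The listed totals of $101 and $146 line up with 100 hours at the large rate ($1.01) and 50 hours at the xlarge rate ($2.93). The sketch below recomputes them and derives the cost per 1,000 requests and the sustained throughput each row implies. The function name `self_hosted_cost` is hypothetical, and the throughput figures are consequences of the table's hour counts, not measured values.

```python
# Hourly prices (USD) from the dedicated endpoint table.
HOURLY_RATE = {"small": 0.06, "medium": 0.35, "large": 1.01, "xlarge": 2.93}


def self_hosted_cost(instance, hours, requests):
    """Summarize the cost of serving `requests` calls in `hours` on one instance."""
    total = HOURLY_RATE[instance] * hours
    return {
        "total_usd": total,
        "usd_per_1k_requests": total / requests * 1000,
        "requests_per_hour": requests / hours,  # throughput the estimate relies on
    }


large = self_hosted_cost("large", 100, 1_000_000)
xlarge = self_hosted_cost("xlarge", 50, 1_000_000)

for name, summary in (("large", large), ("xlarge", xlarge)):
    print(name, {key: round(value, 4) for key, value in summary.items()})
```

Serving 1M requests in 100 hours requires 10,000 requests per hour from a single instance, so the estimate is only as good as that throughput assumption.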

Open Source Models (Free)

| Model | Type | License |
|---|---|---|
| Llama 3.1 8B | Chat | Llama 3.1 |
| Mistral 7B | Chat | Apache 2.0 |
| Gemma 2B | Chat | Gemma |
| Whisper Medium | Audio | Apache 2.0 |

Comparison

| Feature | Free Tier | Pro | Enterprise |
|---|---|---|---|
| Models | Popular | All | All + custom |
| Rate limit | 30 RPM | 200 RPM | Custom |
| Support | Community | Email | Dedicated |

Best For

  1. Experimentation and prototyping
  2. Open-source model evaluation
  3. Budget deployments
  4. Model fine-tuning
  5. Community models