Compare pricing for speech-to-text, text-to-speech, voice cloning, and audio processing APIs.
Speech-to-Text (STT)
| Vendor | Service | Price/min | Free Tier |
|---|---|---|---|
| OpenAI | Whisper API | $0.006 | 1 hour free |
| AssemblyAI | Speech-to-Text | $0.05-$0.15 | 10 hours |
| Deepgram | Nova-2 | $0.0043-$0.15 | 200 mins |
| Google | Cloud Speech-to-Text | $0.025-$0.15 | 60 mins |
| AWS Transcribe | Standard | $0.024-$0.10 | 1 hour |
| Microsoft | Azure AI Speech | $0.026-$0.10 | 1 hour |
| Speechmatics | Real-time | $0.04-$0.12 | 30 mins |
| Rev | AI Transcription | $0.05-$0.15 | Pay-per-use |
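To compare vendors for a given workload, it can help to subtract the free allowance before applying the per-minute rate. The sketch below does that for two rows of the table. The helper name `billable_cost` and the 1,000-minute workload are illustrative, and the free tier is assumed to be deducted once from the usage; the table does not say whether allowances are one-time or recurring.

```python
def billable_cost(minutes_used: float, price_per_min: float,
                  free_minutes: float = 0.0) -> float:
    """Cost in USD after subtracting a free allowance.

    Assumption: the free allowance is deducted once from this usage.
    """
    billable = max(minutes_used - free_minutes, 0.0)
    return round(billable * price_per_min, 2)


# Whisper API: $0.006/min with 1 hour (60 min) free, per the table.
whisper_cost = billable_cost(1000, 0.006, free_minutes=60)

# Deepgram Nova-2 at its low-end rate: $0.0043/min with 200 min free.
deepgram_cost = billable_cost(1000, 0.0043, free_minutes=200)

print(f"Whisper:  ${whisper_cost:.2f}")
print(f"Deepgram: ${deepgram_cost:.2f}")
```

Usage that stays under the free allowance comes out at zero cost, because the billable minutes are clamped at zero.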
Text-to-Speech (TTS)
| Vendor | Model / Plan | Price per 1M chars | Free Tier |
|---|---|---|---|
| OpenAI | TTS-1 | $15.00 | $5 credit |
| OpenAI | TTS-1 HD | $30.00 | $5 credit |
| ElevenLabs | Multilingual v2 | $300.00 | 10k chars |
| ElevenLabs | Language | $120.00 | 10k chars |
| ElevenLabs | English | $90.00 | 10k chars |
| Google Cloud | WaveNet | $16.00 | 1M chars |
| Google Cloud | Standard | $4.00 | 1M chars |
| AWS Polly | Neural | $16.00 | 5M chars |
| AWS Polly | Standard | $4.00 | 5M chars |
| Azure | Neural | $16.00 | 0.5M chars |
| Murf AI | Studio | $69/mo | 10k chars |
| Murf AI | API | $0.004/1k chars | 10k chars |
| WellSaid Labs | Creative | $49/mo | Pay-per-use |
| WellSaid Labs | API | $40/100k chars | Trial |
| Natural Reader | Pro | $99/yr | Pay-per-use |
Voice Cloning
| Vendor | Plan | Price/mo | Features |
|---|---|---|---|
| ElevenLabs | Pro | $330 | 30 custom voices |
| ElevenLabs | Starter | $99 | 10 custom voices |
| Resemble AI | Build | $99 | Unlimited voices |
| Resemble AI | Scale | $499 | + API, custom voices |
| Descript | Creator | $12 | 1 voice |
| Descript | Pro | $24 | 5 voices |
| Resemble AI | Basic | Free | 1 voice, limited use |
Audio Intelligence
| Vendor | Service | Price | Notes |
|---|---|---|---|
| AssemblyAI | Audio Intelligence | $0.05-$0.15/min | PII detection, topics |
| Deepgram | Audio Intelligence | $0.05/30 sec | Topics, entities |
| Google | Speech-to-Text + ML | $0.10-$0.30/min | Advanced features |
| OpenAI | Whisper + GPT-4 | $0.006 + token cost | Transcription + analysis |
Cost Estimator
Transcription (per hour)
| Service | Quality | Cost |
|---|---|---|
| Whisper | Standard | $0.36 |
| Deepgram Nova-2 | High | $0.26 |
| AssemblyAI | Standard | $3.00 |
| Rev | Human + AI | $2.00 |
| Speechmatics | Real-time | $2.40 |
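The hourly figures follow from the per-minute rates in the Speech-to-Text table: multiply the rate by 60. A minimal sketch of that arithmetic, using the low end of each quoted range (an assumption about how the estimates were derived) and illustrative names:

```python
MINUTES_PER_HOUR = 60

# Low-end per-minute rates (USD) copied from the Speech-to-Text table.
low_end_rate_per_min = {
    "Whisper": 0.006,
    "Deepgram Nova-2": 0.0043,
    "Speechmatics": 0.04,
}


def cost_per_hour(rate_per_min: float) -> float:
    """Cost in USD to transcribe one hour of audio at a per-minute rate."""
    return round(rate_per_min * MINUTES_PER_HOUR, 2)


hourly = {name: cost_per_hour(rate) for name, rate in low_end_rate_per_min.items()}

for name, cost in hourly.items():
    print(f"{name}: ${cost:.2f}/hour")
```

For vendors that quote a range, the high-end rate can be passed to the same function to get the upper bound.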
TTS (per 100k chars)
| Service | Quality | Cost |
|---|---|---|
| Polly Standard | Basic | $0.40 |
| Google Standard | Basic | $0.40 |
| ElevenLabs Multilingual | Premium | $30.00 |
| Murf API | Pro | $0.40 |
| WellSaid API | Pro | $40.00 |
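These estimates scale the per-1M-character prices from the Text-to-Speech table down to 100k characters. The sketch below shows the conversion for three rows of that table. The function name `tts_cost` is illustrative, and flat per-character billing with no minimums or volume discounts is an assumption.

```python
def tts_cost(chars: int, price_per_million: float) -> float:
    """Cost in USD to synthesize `chars` characters at a per-1M-character price."""
    return round(chars / 1_000_000 * price_per_million, 2)


CHARS = 100_000

polly_standard = tts_cost(CHARS, 4.00)             # AWS Polly Standard
openai_tts1 = tts_cost(CHARS, 15.00)               # OpenAI TTS-1
elevenlabs_multilingual = tts_cost(CHARS, 300.00)  # ElevenLabs Multilingual v2

print(f"Polly Standard:          ${polly_standard:.2f}")
print(f"OpenAI TTS-1:            ${openai_tts1:.2f}")
print(f"ElevenLabs Multilingual: ${elevenlabs_multilingual:.2f}")
```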
Real-Time Voice Agents
| Vendor | Price | Use Case |
|---|---|---|
| ElevenLabs | $0.30-$0.60/min | Conversational AI |
| Daily | $0.003-$0.005/sec | Real-time calls |
| Agora | $0.99-$3.99/1000 mins | VoIP |
| Twilio | $0.001/AI agent | Voice assistants |
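The vendors above quote prices in different units (per second, per minute, per 1,000 minutes), so a direct comparison needs a common basis. A minimal sketch that converts the low-end quotes to a per-minute price follows. The helper `per_minute` is hypothetical, and the Twilio row is left out because its unit does not map to audio duration.

```python
def per_minute(price: float, unit_seconds: float) -> float:
    """Convert a price quoted per `unit_seconds` of audio to USD per minute."""
    return round(price * 60 / unit_seconds, 6)


daily_low = per_minute(0.003, 1)          # Daily: quoted per second
agora_low = per_minute(0.99, 1000 * 60)   # Agora: quoted per 1,000 minutes
elevenlabs_low = per_minute(0.30, 60)     # ElevenLabs: already per minute

# The same helper handles the Deepgram Audio Intelligence quote ($0.05 per 30 sec).
deepgram_ai = per_minute(0.05, 30)

print(f"Daily:       ${daily_low}/min")
print(f"Agora:       ${agora_low}/min")
print(f"ElevenLabs:  ${elevenlabs_low}/min")
print(f"Deepgram AI: ${deepgram_ai}/min")
```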
Best For
| Need | Recommended |
|---|---|
| Cost efficiency | Whisper API |
| High accuracy | AssemblyAI, Deepgram Nova-2 |
| Real-time | Google Cloud Speech |
| Voice cloning | ElevenLabs |
| Enterprise | AWS Transcribe |
| Multilingual | ElevenLabs, Google WaveNet |
| Affordable TTS | Polly, Murf |
| Avatars + Audio | Descript |
Free Tier Comparison
| Service | Free Offering |
|---|---|
| Whisper | 1 hour |
| Deepgram | 200 mins |
| AssemblyAI | 10 hours |
| ElevenLabs | 10k chars, 3 voices |
| Murf | 10k chars |
| Google Cloud | 1M chars |