Sonic: The fastest, ultra-realistic generative voice API
The best API for real-time, interactive voice. Powered by our next-gen state space model.
Blazing fast
135 ms model latency
High throughput
Highly concurrent, low-cost inference with our first-of-its-kind state space model inference stack
Ultra-realistic
Human, emotional, expressive text-to-speech models built on an entirely new state space model architecture
Supports zero-shot voice cloning
Match prosody, inflection, and vocal characteristics with only 10 seconds of recorded speech
Multilingual (Alpha)
Supports seamless speech in English, Chinese, Japanese, Spanish, French, German, and Portuguese
Controllable (Alpha)
Adjust pitch, speed, emotion, and pronunciation
Samples
Conversational
Health Insurance Agent
Call Center
Gaming
Fortune Teller
Wizard
Media & Broadcasting
Ad Voiceover
News Anchor
Sports Commentator
Radio Host
Content
Beauty Vlogger
Yoga Instructor
Pricing
Free
$0/forever
Get started with Cartesia.
Includes:
- 10,000 characters per month
- Access to our API
- Must attribute Cartesia with a link to www.cartesia.ai in public content
Pro
$5/mo
Leverage Cartesia's advanced features for your projects.
Everything in Free, plus:
- 100,000 characters per month
- Instant voice cloning
- Output in all formats, including 44.1 kHz PCM
- 3 concurrent requests
- License for commercial use
Startup
$49/mo
The perfect plan for growing startups.
Everything in Pro, plus:
- 1,250,000 characters per month
- 5 concurrent requests
Scale
$299/mo
Higher caps for scaling your business.
Everything in Startup, plus:
- 8,000,000 characters per month
- 15 concurrent requests
Enterprise
Premium plan for large enterprises. Contact us for pricing.
Everything in Scale, plus:
- Dedicated Slack support
- Onboarding and migration support
- Custom limits
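To compare the tiers above for a given workload, here is a minimal Python sketch that encodes the published limits and picks the cheapest plan that fits. It is not part of any Cartesia SDK: the names `Plan`, `PLANS`, `cheapest_plan`, and `usd_per_million_chars` are hypothetical. The Free tier's concurrency limit is not stated on this page, so the sketch assumes it covers a single request at a time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    usd_per_month: int
    chars_per_month: int
    concurrent_requests: Optional[int]  # None: not stated on the pricing page
    commercial_license: bool


# Figures copied from the pricing list; ordered cheapest first.
# Enterprise is omitted because its price and limits are custom.
PLANS = [
    Plan("Free", 0, 10_000, None, False),
    Plan("Pro", 5, 100_000, 3, True),
    Plan("Startup", 49, 1_250_000, 5, True),
    Plan("Scale", 299, 8_000_000, 15, True),
]


def cheapest_plan(chars_needed: int,
                  concurrency_needed: int = 1,
                  commercial_use: bool = False) -> Optional[Plan]:
    """Return the cheapest listed plan that covers the workload, or None."""
    for plan in PLANS:
        if plan.chars_per_month < chars_needed:
            continue
        if commercial_use and not plan.commercial_license:
            continue
        # Assumption: an unstated concurrency limit means one request at a time.
        limit = plan.concurrent_requests if plan.concurrent_requests is not None else 1
        if limit < concurrency_needed:
            continue
        return plan
    return None  # beyond Scale: an Enterprise (custom limits) conversation


def usd_per_million_chars(plan: Plan) -> float:
    """Effective price per million characters if the full quota is used."""
    return plan.usd_per_month * 1_000_000 / plan.chars_per_month


if __name__ == "__main__":
    choice = cheapest_plan(200_000, concurrency_needed=4, commercial_use=True)
    print(choice.name if choice else "Enterprise")
```

A `None` result means the workload exceeds the Scale tier, which is where the Enterprise plan's custom limits apply.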
[REDACTED]