Sonic: The fastest, ultra-realistic generative voice API

The best API for real-time, interactive voice. Powered by our next-gen state space model.

Blazing fast

135 ms model latency

High throughput

Highly concurrent, low-cost inference with our first-of-its-kind state space model inference stack

Ultra-realistic

Human, emotional, expressive text-to-speech models built on an entirely new state space model architecture

Supports zero-shot voice cloning

Match prosody, inflection, and vocal characteristics with only 10 seconds of recorded speech

Controllable (Alpha)

Adjust pitch, speed, emotion, and pronunciation

Samples

Conversational

Health Insurance Agent

Call Center

Gaming

Fortune Teller

Wizard

Media & Broadcasting

Ad Voiceover

News Anchor

Sports Commentator

Radio Host

Content

Beauty Vlogger

Yoga Instructor

Pricing

Free

$0/forever

Get started with Cartesia.

Includes:

  • 10,000 characters per month
  • Access to our API
  • Must attribute Cartesia with a link to www.cartesia.ai in public content

Pro

$5/mo

Leverage Cartesia's advanced features for your projects.

Everything in Free, plus:

  • 100,000 characters per month
  • Instant voice cloning
  • Output in all formats, including 44.1 kHz PCM
  • 3 concurrent requests
  • License for commercial use

Startup

$49/mo

The perfect plan for growing startups.

Everything in Pro, plus:

  • 1,250,000 characters per month
  • 5 concurrent requests

Scale

$299/mo

Higher caps for scaling your business.

Everything in Startup, plus:

  • 8,000,000 characters per month
  • 15 concurrent requests

Enterprise

Premium plan for large enterprises. Contact us for pricing.

Everything in Scale, plus:

  • Dedicated Slack support
  • Onboarding and migration support
  • Custom limits
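
To make the tiers above easier to compare, here is a minimal Python sketch that picks the cheapest self-serve plan for a given monthly character volume and concurrency need. The prices, character caps, and concurrency limits are copied from the list above. Everything else is an assumption made for illustration: the names `Plan`, `PLANS`, `cheapest_plan`, and `usd_per_million_chars` are hypothetical, the Free plan's concurrency is not stated on this page so it is treated as a single request at a time, and a `None` result simply means "talk to sales about Enterprise".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    usd_per_month: int
    chars_per_month: int
    # None means the page does not state a concurrency limit for this plan.
    concurrent_requests: Optional[int]


# Figures copied from the pricing list; ordered from cheapest to most expensive.
PLANS = [
    Plan("Free", 0, 10_000, None),
    Plan("Pro", 5, 100_000, 3),
    Plan("Startup", 49, 1_250_000, 5),
    Plan("Scale", 299, 8_000_000, 15),
]


def cheapest_plan(chars_needed: int, concurrency_needed: int = 1,
                  commercial: bool = False) -> Optional[Plan]:
    """Return the cheapest listed plan that covers the need, or None (Enterprise)."""
    for plan in PLANS:
        if plan.chars_per_month < chars_needed:
            continue
        # The commercial-use license is listed starting at Pro.
        if commercial and plan.name == "Free":
            continue
        # Assumption: an unstated limit is treated as one request at a time.
        limit = plan.concurrent_requests if plan.concurrent_requests is not None else 1
        if limit < concurrency_needed:
            continue
        return plan
    return None  # Needs custom limits: contact sales about Enterprise.


def usd_per_million_chars(plan: Plan) -> float:
    """Effective price per million characters when the full quota is used."""
    return plan.usd_per_month * 1_000_000 / plan.chars_per_month


if __name__ == "__main__":
    choice = cheapest_plan(500_000, concurrency_needed=4, commercial=True)
    print(choice.name if choice else "Enterprise")
    for plan in PLANS[1:]:
        print(plan.name, round(usd_per_million_chars(plan), 2))
```

Under these assumptions, the effective price per million characters falls as the tiers go up, so the per-character saving is largest when moving from Pro to Startup and smaller from Startup to Scale.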
