How Cartesia Powers Retell's Voice Agents at Scale

About the Company
Retell AI is redefining call center automation. Founded in 2023 by former Google, Meta, and ByteDance engineers, Retell tackles one of the hardest problems in conversational AI: building LLM-powered voice agents that are reliable, secure, and capable of taking action - all with human-level latency.
Through its no-code platform, businesses can launch voice agents in minutes to handle calls 24/7 - answering questions, booking appointments, and qualifying leads. Today, over 3,000 companies manage 40 million+ calls on Retell.
As growth accelerated, Retell needed a voice partner that could scale with them - one that understood the unique technical demands of building truly conversational agents. Off-the-shelf providers couldn't deliver.
The Problem: When "Good Enough" Voice Quality Breaks Enterprise Workflows
Building conversational AI agents requires solving problems that didn't exist with traditional IVR systems. Legacy call scripting took months to design rigid flows. Modern LLM-based agents face the opposite challenge: they're creative with conversations, but require proper guardrails and reliability to work in production.
In healthcare, a wrong phone number means a missed appointment. In logistics, a garbled address means a failed delivery. In financial services, an unintelligible confirmation code means a frustrated customer and a lost transaction.
The reliability problem was just as bad. Traditional TTS providers had random outages. Performance degraded during peak hours. No real fallback systems. For businesses running 24/7 call centers, downtime wasn't just inconvenient - it was existential.
The Solution: A Voice Model Built for Production
Retell added Cartesia as a primary voice provider one quarter ago. The adoption was the fastest they'd ever seen from a new provider - driven by improvements in voice quality, latency, and accuracy across every dimension that mattered for their enterprise use cases.
What Cartesia delivered:
Fastest latency on the platform
Cartesia's Time-to-First-Audio (TTFA) is 2-3x faster than the next best provider on Retell's platform. For voice agents, this translates to conversations that feel natural rather than stilted - critical for maintaining engagement in high-stakes calls.
Best-in-class accuracy
Cartesia maintains a <0.1% error rate on Retell's platform. This isn't luck - it's architecture. Cartesia's model, powered by state space models rather than traditional transformers, has been specifically fine-tuned to handle the alphanumeric sequences that break other TTS systems:
Phone numbers: (415) 831-7577 → pronounced correctly, every time
Addresses: 1234 Market St, Suite 5B → no hallucinations
Confirmation codes: ABC-1245-XYZ → crystal clear
Enterprise reliability
99.9% uptime. Retell configured Cartesia as a primary voice provider and automatic failover - if any TTS system goes down, Cartesia kicks in seamlessly. According to Retell's team, "Cartesia is a favorite for our solutions engineering and forward deployed teams working with enterprise customers, where reliability is one of the most important factors."
Zero-friction integration
One API call. Retell's customers can add Cartesia voices to their agents in minutes - no re-engineering required.
Native multi-language support
40+ languages. Not English models poorly adapted for other languages, but actual native support built from the ground up.
The integration was straightforward. Retell embedded Cartesia directly into their platform - customers just flip a switch.
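For readers who want a concrete picture of two ideas from the list above, time-to-first-audio and automatic failover between TTS providers, here is a minimal Python sketch. It is an illustration only, not Retell's or Cartesia's actual implementation or API: the Provider type, the function names, and the two stub providers are all hypothetical, and a "provider" is assumed to be any callable that takes text and yields audio chunks as bytes.

```python
import time
from typing import Callable, Iterator, List, Tuple

# Hypothetical type (an assumption for this sketch): a provider is any
# callable that takes text and yields audio chunks as bytes.
Provider = Callable[[str], Iterator[bytes]]


def time_to_first_audio(provider: Provider, text: str) -> Tuple[float, bytes]:
    """Return (seconds until the first audio chunk arrived, that chunk)."""
    start = time.perf_counter()
    first_chunk = next(iter(provider(text)), None)
    if first_chunk is None:
        raise ValueError("provider produced no audio")
    return time.perf_counter() - start, first_chunk


def synthesize_with_failover(
    providers: List[Tuple[str, Provider]], text: str
) -> Tuple[str, bytes]:
    """Try providers in priority order; return (name, audio) from the first success."""
    errors = []
    for name, provider in providers:
        try:
            audio = b"".join(provider(text))
            if audio:
                return name, audio
            errors.append(f"{name}: empty audio")
        except Exception as exc:  # any provider error triggers the next fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS providers failed: " + "; ".join(errors))


# Stub providers for demonstration only; they do not call any real service.
def flaky_provider(text: str) -> Iterator[bytes]:
    raise ConnectionError("simulated outage")
    yield b""  # unreachable; its presence makes this function a generator


def backup_provider(text: str) -> Iterator[bytes]:
    for word in text.split():
        yield word.encode("utf-8")  # stand-in for real audio bytes


if __name__ == "__main__":
    chain = [("primary", flaky_provider), ("backup", backup_provider)]
    name, audio = synthesize_with_failover(chain, "code ABC 1245")
    print(f"served by {name}, {len(audio)} bytes")
```

In a real deployment the failover decision would usually also involve timeouts and health checks rather than only caught exceptions; those are left out here to keep the sketch short.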
The Results: Fastest Provider Adoption in Retell's History
"It's incredibly difficult to achieve the rare blend of latency, reliability, accuracy, and voice quality. Cartesia managed to do all four, which is why customers adopted them faster than any other voice model we've introduced."
Zexia Zhang, CEO of Retell AI
"Retell has been an incredible thought partner since day one given their expertise on enterprise voice agents. We're learning from them daily."
Karan Goel, CEO of Cartesia
What's Next: Scaling Production Voice AI
Powered by Cartesia's text-to-speech, Retell AI continues to expand across industries - healthcare, financial services, insurance, logistics, retail, and debt collection. Their customer base of 3,000+ companies includes enterprises like Asbury Auto, Anker, and Storagevault.
The bottom line: Building call centers that actually work in production requires more than natural-sounding voices. You need accuracy on edge cases, reliability under load, and infrastructure built for scale. We’re proud to support Retell in delivering this experience with our voices.
Retell users can now access Cartesia's newest model, Sonic 3, with support for 27 new languages, custom pronunciations, speed and volume controls, and even better accuracy and naturalness.


Build Your Voice Agent With Cartesia Sonic
Experience low latency and superior voice quality with Cartesia's voice AI technology, Sonic.
RESULTS
2-3x faster latency than next best provider on platform
<0.1% error rate on production calls
Fastest-ever adoption of a new voice provider
PRODUCTS
Text To Speech