Explore our advanced text to speech capabilities
Discover our text-to-speech technology in the playground or via the API for lifelike voices that follow your transcript accurately and give you full control over delivery.
No hallucinations
Our TTS technology enables accurate transcript handling, even with complex content like numbers, dates, or medical jargon.
Fine-grained control
Adjust pitch, speed, and emotion with our creator studio and API to create a personalized and engaging audio experience.
Fast response time
Experience blazing-fast 40ms model latency with our Sonic Turbo model, ensuring seamless real-time interactions.
Accurate, high-quality text to speech
Cartesia TTS accurately handles complex transcript elements such as names, phone numbers, confirmation numbers, medical terms, and industry jargon. It's a strong fit for voice AI agents in healthcare, insurance, banking, and similar industries.
Phone number
Date
Take full control of the voice delivery for your content, with complete expressiveness and slider controls for speed and emotions.
Surprised
Sad
You can fine-tune the audio by adding breaks or pauses using <break /> tags. You can specify the break or pause duration in seconds (s) or milliseconds (ms).
With 500ms break tags
With 300ms and 600ms break tags
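As a rough illustration of the break tags described above, the sketch below builds a transcript string with pauses between sentences. The helper names are hypothetical, and the `time` attribute is an assumption about the tag's exact syntax; the text above only states that durations are given in seconds (s) or milliseconds (ms), so check the API documentation before relying on it.

```python
def break_tag(duration, unit="ms"):
    """Build a pause tag, e.g. <break time="500ms" />.

    Assumption: the duration is passed in a `time` attribute.
    """
    if unit not in ("s", "ms"):
        raise ValueError("unit must be 's' or 'ms'")
    if duration <= 0:
        raise ValueError("duration must be positive")
    # Render whole numbers without a trailing ".0" (500.0 -> 500).
    value = int(duration) if float(duration).is_integer() else duration
    return f'<break time="{value}{unit}" />'


def join_with_breaks(sentences, duration=500, unit="ms"):
    """Join sentences into one transcript with the same pause between each."""
    tag = break_tag(duration, unit)
    return f" {tag} ".join(s.strip() for s in sentences)


transcript = join_with_breaks(
    ["Your order has shipped.", "It should arrive on Friday."], 500
)
print(transcript)
```

The resulting string can be pasted into the playground or sent as the input text of an API request.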
To spell out input text, you can wrap it in <spell> tags. This is particularly useful for pronouncing long numbers or identifiers, such as credit card numbers, phone numbers, or unique IDs.
Spell out numbers
Complex number and letter
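The spell tags described above can also be applied automatically before text is sent for synthesis. The sketch below is one possible approach, not part of the product: the function name and the matching rule are assumptions (it treats any unbroken run of six or more uppercase letters and digits that contains at least one digit as an identifier), and the closing `</spell>` tag is assumed to follow the usual paired-tag convention.

```python
import re

# Hypothetical rule: an identifier is an unbroken run of 6+ uppercase
# letters/digits that contains at least one digit.
ID_PATTERN = re.compile(r"\b(?=\w*\d)[A-Z0-9]{6,}\b")


def spell_out_identifiers(transcript):
    """Wrap identifier-like tokens in <spell> tags so they are read out
    character by character."""
    return ID_PATTERN.sub(lambda m: f"<spell>{m.group(0)}</spell>", transcript)


print(spell_out_identifiers("Your confirmation code is AB12CD34."))
```

Numbers written with separators, such as 415-555-0123, are split into short groups and will not match this rule; wrap those by hand or adjust the pattern to suit your data.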
What our customers say
Join the growing list of companies opting for Sonic.
"Cartesia's breakthrough voice technology significantly enhances our creative suite, giving creators the freedom to generate any voice they can imagine and furthering our goal of making it easy for anyone to create videos they're proud to share."
Gaurav Misra, Co-Founder and CEO of Captions
"Using Cartesia's generative voice API, Sonic, we’ve strengthened Cresta AI Agent to move beyond rigid scripts and towards delivering empathetic, human-like conversations that accurately represent customer brands and their commitment to excellent customer service. Sonic empowers Cresta AI Agent to resolve complex issues effortlessly, helping our customers gain real value from their AI investment and significantly improve their NPS and CSAT scores."
Tim Shi, Co-Founder & Chief Technology Officer at Cresta
"This partnership represents a transformative moment in enterprise AI adoption. By combining Rasa’s strengths in enterprise conversational AI with Cartesia's innovative voice technology, we're fundamentally changing how enterprises can deploy and scale AI assistants across their organizations."
Melissa Gordon, CEO of Rasa
How to use our text to speech creator studio and API
Step One
Visit Cartesia's website and sign up to access our TTS playground. Explore the documentation for API integration details.
Step Two
Select the desired language and voice settings. Use the creator studio or API to input text and generate audio in real time.
Step Three
Integrate the generated audio into your application, or export it to MP3, M4A, or other preferred audio formats.