Introducing Voice Changer: Transform Audio Your Way
October 30, 2024
Today, we're excited to launch Voice Changer, our new model that transforms the voice of any audio clip while maintaining its original delivery and emotion.
Select from our diverse library of studio-quality voices or clone your own, while retaining complete control over nuances like vocalization, emotion, and prosody.
Natural Speech Preservation
Voice Changer excels at preserving the qualities that make speech sound unique, natural, and engaging - exactly as it was in the original clip.
Here’s a narration of Alice in Wonderland Chapter 12, "Alice's Evidence," where we’ve changed the input to four different voices from our playground: Child, Movie Guy, Reading Lady, and Wizard.
Notice how each transformed clip maintains the original narrator's expressiveness, vocalizations (like sighs and exclamations), and prosody. This consistency creates seamless voice transitions – even when we’ve edited the narrator's voice to change five times within a single clip.
We were able to transform the original clip into all of the clips below, each with a different voice.
Perfect Control Over Delivery
While our Text to Speech product has contextual awareness, there may be times when you need precise control over how a script is read. Voice Changer gives you that control, allowing you to perfect every aspect of the delivery – from emotion to timing.
Compare these examples:
Use Cases
We're excited about the countless ways you can use Voice Changer:
- For creators, record speech exactly as you want it, then change it to the voice of your choice to make unique and emotive content.
- For gaming & entertainment, act out dialogs exactly the way you want, and use the character voice you love for each dialog to craft the perfect experience, all in studio quality.
- For listeners, transform your favorite audiobooks, lectures, and podcasts to bring every character to life with a different voice.
- For businesses, allow any employee to quickly create studio-quality, expressive, and realistic audio that fits your brand and your goals.
And by combining Sonic voice generation with Voice Changer, you can generate any speech and modify it to sound exactly the way you want.
Have an innovative use case in mind? Share it with us on X (formerly Twitter) @cartesia_ai or submit it to our developer showcase.
Getting Started
Voice Changer is available through our playground in three simple steps:
- Record yourself or upload a file
- Choose the voice you want to change to
- Generate high-quality audio in the new voice
Visit our developer portal for detailed implementation guides and API documentation.
Powered by State Space Models
Like all our models, Voice Changer is built on our pioneering work in state space model (SSM) architectures. This fundamentally different approach enables our models to process and generate high-resolution modalities like audio at high quality. State space models like S4 and Mamba, originally developed by our team, have compute costs that scale near-linearly with sequence length. Because audio produces very long sequences, that makes them ideal for high-quality voice generation.
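To give a feel for where that near-linear scaling comes from, here is a minimal sketch of a toy linear state space recurrence in Python. It is only an illustration: the function name `ssm_scan`, its parameters, and the diagonal state matrix are assumptions made for this example, and it is not the Voice Changer model, S4, or Mamba. The point is that each input sample updates a fixed-size state once, so the work grows in proportion to the length of the sequence.

```python
from typing import List, Sequence


def ssm_scan(a: Sequence[float], b: Sequence[float], c: Sequence[float],
             inputs: Sequence[float]) -> List[float]:
    """Run a toy diagonal linear state space model over a 1-D sequence.

    Hypothetical illustration only (not a production model):
      state update:  x[k] = a * x[k-1] + b * u[k]   (elementwise)
      output:        y[k] = sum(c * x[k])
    """
    state = [0.0] * len(a)  # fixed-size hidden state, independent of input length
    outputs: List[float] = []
    for u in inputs:  # one pass over the sequence: O(len(inputs) * len(a))
        state = [a_i * x_i + b_i * u for a_i, x_i, b_i in zip(a, state, b)]
        outputs.append(sum(c_i * x_i for c_i, x_i in zip(c, state)))
    return outputs


if __name__ == "__main__":
    # An impulse followed by silence decays geometrically through the state.
    print(ssm_scan(a=[0.5], b=[1.0], c=[1.0], inputs=[1.0, 0.0, 0.0]))
```

Doubling the number of input samples doubles the number of loop iterations, while the state stays the same size. Real SSM layers add learned parameters, discretization, and parallel scan or convolution tricks on top of this basic recurrence.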
Join Us on This Journey
We're excited to see how developers and creators will use Voice Changer to build new experiences.