Introducing our next 8 languages on Sonic Multilingual

Karan Goel

Today, we're excited to announce the Alpha Release of our next 8 languages on Sonic Multilingual: Hindi, Italian, Korean, Dutch, Polish, Russian, Swedish, and Turkish. This brings our total number of supported languages to 15. All the same features you love on Sonic English work on Sonic Multilingual as well: industry-leading latency, top-rated conversational quality, and voice design.

This marks a significant milestone in making our real-time, industry-leading voice generation accessible to anyone.

Expanding Sonic's Capabilities

Sonic Multilingual extends the capabilities of our state-of-the-art Sonic voice model, allowing developers and users to create lifelike speech applications across a diverse range of languages and regions. In addition to English, the 14 supported languages are:

German (de) - Deutsch
Portuguese (pt) - Português
Chinese (zh) - 中文
Japanese (ja) - 日本語
French (fr) - Français
Spanish (es) - Español
Hindi (hi) - हिन्दी
Italian (it) - Italiano
Korean (ko) - 한국어
Dutch (nl) - Nederlands
Polish (pl) - Polski
Russian (ru) - Русский
Swedish (sv) - Svenska
Turkish (tr) - Türkçe

How It Works

To see Sonic Multilingual in action, watch this live demo. In the video, you'll experience firsthand how effortlessly our model generates lifelike speech across multiple languages with low latency.

Getting started with Sonic Multilingual is simple:

Select the "sonic-multilingual" model in your application or via our API.
Choose your preferred language from the 14 available options.
Type the transcript you want the model to say.
Pick a voice from our diverse voice library. For the best performance, we recommend using voices that match the language of your transcript.
Hit the "Speak" button and experience the low-latency, lifelike voice generation.
Download the audio by clicking the "Download" button if you wish to save it.
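
For developers taking the API route, the first few steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not the actual API: only the model name "sonic-multilingual" and the language codes come from this post, while the function name build_request, the request field names, and the voice ID are hypothetical placeholders for illustration.

```python
# Language codes copied from the list of supported languages in this post.
SUPPORTED_LANGUAGES = {
    "de": "German", "pt": "Portuguese", "zh": "Chinese", "ja": "Japanese",
    "fr": "French", "es": "Spanish", "hi": "Hindi", "it": "Italian",
    "ko": "Korean", "nl": "Dutch", "pl": "Polish", "ru": "Russian",
    "sv": "Swedish", "tr": "Turkish",
}


def build_request(transcript: str, language: str, voice_id: str) -> dict:
    """Assemble a hypothetical request body (field names are illustrative)."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    if not transcript.strip():
        raise ValueError("transcript must not be empty")
    return {
        "model": "sonic-multilingual",  # model name as given in this post
        "language": language,           # one of the 14 codes above
        "transcript": transcript,       # the text the model should speak
        "voice_id": voice_id,           # hypothetical field and value
    }


# Example: a German transcript paired with a (placeholder) German voice.
request = build_request("Guten Tag!", "de", "example-german-voice")
```

Validating the language code before sending anything catches typos early, and pairing the transcript with a voice in the same language follows the recommendation in the steps above.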

Powered by State Space Models

Like all our models, Sonic Multilingual is built on our groundbreaking state space model (SSM) architecture. This fundamentally more efficient approach enables our models to process and generate high-resolution modalities like audio in real time. State space models like S4 and Mamba, originally developed by our team, have costs that scale near-linearly with sequence length, making them ideal for low-latency, high-quality voice generation.
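
To give a feel for why this kind of model scales well with sequence length, here is a textbook linear state space recurrence in Python. It is a generic illustration only, not Sonic's architecture or the S4 or Mamba parameterization; the function name ssm_scan and the scalar coefficients a, b, and c are assumptions chosen for clarity.

```python
def ssm_scan(a: float, b: float, c: float, inputs: list[float]) -> list[float]:
    """Run the recurrence x[k] = a * x[k-1] + b * u[k], y[k] = c * x[k].

    A scalar state is used for readability; practical models use
    vector-valued states and learned parameters.
    """
    state = 0.0
    outputs = []
    for u in inputs:
        # Constant work per step: update the fixed-size state, then read it out.
        state = a * state + b * u
        outputs.append(c * state)
    return outputs


# An impulse followed by silence decays geometrically through the state.
ys = ssm_scan(0.5, 1.0, 2.0, [1.0, 0.0, 0.0])
print(ys)  # [2.0, 1.0, 0.5]
```

Each step touches only the current input and a fixed-size state, so total work grows linearly with the number of steps, and each output is available as soon as its input arrives.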

Join Us on This Journey

If you're interested in partnering with us to build real-time conversational AI using Sonic, we'd love to hear from you.

Try It Now: Sign up and experience our new multilingual capabilities. Fill out our enterprise form for priority support, custom SLAs, volume discounts, and more.
Join Our Team: If you're passionate about bringing real-time multimodal intelligence to every device, we're hiring across every function.