Quora
Quora Introduces Audio to Poe with Cartesia
“Poe brings together the world's best AI, all in one place. With Cartesia's Sonic model, users can interact with a wide range of high-quality, human-like voices in multiple languages, enhancing their experience on our platform.”
— Spencer Chan, Head of Poe Product, Quora
About the company
Quora's mission to share and grow the world's knowledge has inspired the creation of Poe, a versatile chat platform that brings together the best AI models in one place. Poe enables users to engage with the latest state-of-the-art LLMs through natural, dynamic conversations, helping transform how people interact with AI.
Introduction
This week, we launched Sonic on Poe, bringing our text-to-speech technology to the platform's users. Sonic outputs audio responses in a wide range of unique voices and languages, so users can turn text into audio for a more immersive and entertaining experience.
Poe users can use Cartesia Sonic by sending text directly to the bot or by typing “@Cartesia” followed by the text within any conversation. The message is then converted into an audio response in the voice of their choice. Any Poe user can convert text-based content on Poe into high-quality audio using one of our 100+ default voices across 14 languages. To choose a voice, or to handle input in another language, add --voice [Voice Name] to the end of a message (e.g. 你好 --voice Chinese Commercial Woman).
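To make the message format concrete, here is a minimal Python sketch of how a message ending in --voice [Voice Name] could be split into the text to speak and the requested voice. The function name split_voice_flag and the parsing rules are illustrative assumptions; this is not how Poe or Cartesia actually process messages.

```python
from typing import Optional, Tuple


def split_voice_flag(message: str) -> Tuple[str, Optional[str]]:
    """Split a message into (text, voice_name).

    Hypothetical helper: assumes the voice name is everything after the
    last "--voice" in the message. Returns None for the voice when the
    flag is absent or has no name after it.
    """
    text, flag, voice = message.rpartition("--voice")
    if not flag:
        # No flag present; rpartition put the whole message in `voice`.
        return message.strip(), None
    voice = voice.strip()
    return text.strip(), voice or None


# The example from the text above, plus a message without the flag.
print(split_voice_flag("你好 --voice Chinese Commercial Woman"))
print(split_voice_flag("Good evening, listeners."))
```

Splitting on the last occurrence of the flag keeps any earlier text intact, which matches the instruction to put the flag at the end of the message.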
Here are 2 examples of how you can use this feature:
1) Nostalgic radio report with 1920's Radioman voice
2) Immersive Story with Wizardman voice
This new feature enables Poe users to bring their ideas to life through powerful voice capabilities, including:
Realistic Voices: With Cartesia Sonic, Poe provides natural, high-quality audio generation, offering over 100 default voices and support for custom voice creation.
Multilingual Support: Featuring diverse accents and support for 14 languages, Poe ensures that content can reach and resonate with a global audience.
Advanced Controllability: Creators often seek precise control over how phrases are delivered. Cartesia addresses this need with advanced features like SSML support for break tags and spell tags, as well as IPA support for accurate pronunciation control.
Enterprise Scalability: Cartesia is SOC 2 compliant and supports unlimited concurrency for its enterprise customers, guaranteeing uptime and performance for high volumes of users.
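As an illustration of the break and spell tags mentioned under Advanced Controllability, the short Python sketch below assembles a piece of text with a pause and a spelled-out code. The tag syntax shown (a break tag with a time attribute and a spell element) is an assumption based on common SSML-style markup, and the helper names pause and spell are hypothetical; check the Sonic documentation for the exact tags and attributes it accepts.

```python
from xml.sax.saxutils import escape


def pause(seconds: float) -> str:
    """Return a break tag for a pause of the given length (assumed syntax)."""
    return f'<break time="{seconds:g}s"/>'


def spell(text: str) -> str:
    """Wrap text in a spell tag so it is read character by character (assumed syntax)."""
    return f"<spell>{escape(text)}</spell>"


# Build a script that spells out a code, pauses, then continues.
script = (
    "Your confirmation code is " + spell("A7X9") + "."
    + pause(1)
    + " Please keep it safe."
)
print(script)
```

Escaping the spelled text keeps characters such as & from breaking the markup.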
"We're thrilled to partner with Quora as they build Poe into the central hub for AI interaction. Their platform perfectly embodies our vision of making knowledge more accessible through voice and multimodal AI. We're excited to bring natural speech capabilities to their millions of users and help define how people learn and create with AI."
— Karan Goel, CEO, Cartesia