Daily
Daily launches Daily Bots with Cartesia as Primary Voice Provider
“Cartesia Sonic is the best voice model today for real-time multimodal use cases. At Daily, we've been working extensively with open source developers and enterprise customers building with text to speech. It's exciting to see the innovations unlocked by Cartesia's state-of-the-art research, and its combination of high quality, flexibility, fast response, and reasonable cost. It's unlocking new voice AI use cases for Daily Bots developers building in experiences like customer support, appointment scheduling, and interacting with virtual personas. We couldn't be more excited to partner with Cartesia.”
Kwindla Hultman Kramer, Co-Founder and CEO, Daily
About the company
Daily is at the forefront of voice and video, providing infrastructure and SDKs for thousands of developers worldwide. Its industry-leading global edge network enables <13 ms first-hop latency to over 5 billion users, across 75 points of presence. Its track record of innovation since 2016 includes the first cross-platform WebRTC SDKs built on a Rust common core, ultra-low-latency mesh networking with end-to-end adaptive bandwidth management, and, most recently, the launch of Pipecat, the fastest-growing open source framework for voice AI.
Daily’s homage to Metal Gear highlights how Sonic can power real-time gaming conversations, with voices powered by Cartesia and orchestration and ultra-low-latency transport by Daily Bots. Play the demo here and read Kwindla’s X post here.
Introduction
Earlier this year, Daily launched Daily Bots, the open source cloud for voice AI. Developers can build adaptive voice AI agents with any LLM, using open source SDKs for the Web, iOS, Android, Python, and telephony, and run them at scale on Daily's infrastructure. On its first day, hundreds of developers signed up, reflecting pent-up demand for hosted agents built on the RTVI and Pipecat open standards.
Daily has optimized adaptive voice AI for a great conversational experience and enterprise reliability by working with Cartesia:
Ultra-Low Latency: Daily Bots delivers voice-to-voice latencies, including transport over the network, as fast as 500 ms, which is human conversational speed. The speed of Cartesia Sonic is part of what makes this possible: Cartesia consistently achieves a time-to-first-byte for voice inference of 180 ms or better, including network overhead.
Realistic Voices: With Cartesia Sonic, Daily delivers natural, responsive voice interactions in 13 languages, with support for customized voices.
Industry-Leading Research Team: “Cartesia’s team is highly regarded as pioneers in building highly efficient and performant real time models on State Space Models (SSMs). They’ve recruited an all-star team of AI researchers to push the boundaries of what’s possible with voice AI - from increasing realism, prompting/customization, and latency we didn’t think possible,” says Hultman Kramer.
“Daily has led the charge in real-time voice and video experiences since 2016, building a devoted developer community that’s pioneering the next generation of AI experiences. As the creators of Pipecat, a rapidly growing open-source voice framework, they've been a crucial partner for us as we’ve optimized Sonic for real-time applications. Daily Bots empowers developers with unparalleled workflow flexibility to build voice applications, and we're glad to partner with them as their primary voice provider.”
Karan Goel, CEO, Cartesia
What our customers say
Join the growing list of companies opting for Sonic.
"This partnership represents a transformative moment in enterprise AI adoption. By combining Rasa’s strengths in enterprise conversational AI with Cartesia's innovative voice technology, we're fundamentally changing how enterprises can deploy and scale AI assistants across their organizations."
Melissa Gordon, CEO of Rasa
"Together AI's mission has always been to provide developers with the most powerful and efficient tools for building AI applications. Cartesia is leading the charge of building efficient, multimodal models from first principles, starting with their Sonic TTS model. By integrating Sonic into our platform, we're enabling developers to create sophisticated multi-modal applications that leverage the most advanced and lowest latency voice model available today, all while maintaining the simplicity and reliability our users expect."
Vipul Ved Prakash, CEO, Together AI
"Internet applications weren’t built for how we’ll use computers in the future. Those computers will see, hear, and speak like we do. We’ll interact with them like we do with each other. We designed LiveKit’s Agents framework to make it easy to build applications for this new paradigm. Cartesia—pioneers of the SSM architecture—shared our belief that real-time, multimodal AI models would be at the center of computing, making them the perfect Agents launch partner."
Russ d'Sa, CEO and Co-Founder, LiveKit