Ego Revolutionizes Gaming with Interactive Characters Powered by Cartesia
"Gaming has always been where communities form - from my generation's World of Warcraft and RuneScape to today's Roblox and Minecraft. As games evolve into social platforms, AI characters need to feel genuinely human in both their responsiveness and emotional depth. Cartesia's technology, with its ultra-low latency, natural voices, and precise emotional control, helps us create truly immersive worlds where AI characters feel alive and authentic."
Peggy Wang, Co-Founder and CTO, Ego
About the Company
Ego is an AI-native simulation engine where users can create and share 3D animated characters, worlds, and game scripts using natural language. Ego’s vision is revolutionary for gaming - imagine being able to prompt the most popular games today like Minecraft or Animal Crossing into existence through simple commands.
Introduction
Ego’s founders bring deep expertise in both gaming and AI. CEO Vishnu Hari, a former Top 500 Overwatch player, previously led product teams at Facebook AI Applied Research (FAIR) and the Meta Horizon Scripting team. CTO Peggy Wang, who shipped ML algorithms for Meta's AR Avatar face tracking, brings experience from Stanford AI Research and Lyft Level 5 in autonomous behavior planning algorithms for robotics.
Partnering with Cartesia has accelerated Ego's vision of allowing gamers to spawn the 3D world they imagine, complete with AI agents and Non-Player Characters (NPCs) that display human-like behavior. This partnership was recently showcased with the launch of Thrall, a game mod for the Viking-themed survival game Valheim. Using Cartesia's ultra-low latency voice technology (<90 ms) and advanced voice design capabilities, Thrall responds to players with human-like reactions, emotional awareness, and natural interactions.
The Role of Voice AI in Immersive Gaming
Until recently, open-world gaming faced three major limitations:
NPCs were limited to rigid, pre-programmed behaviors without true agency
Creating game interactions required extensive coding expertise
Developing and scaling 3D assets and environments was prohibitively resource-intensive
The proliferation of generative AI has changed this landscape dramatically. Ego saw an opportunity to democratize immersive gaming by enabling anyone to create rich gaming experiences through natural language—similar to filling out a character sheet in Dungeons & Dragons.
Voice AI enhances two critical elements in modern gaming:
Non-Player Characters (NPCs): The computer-controlled characters that populate game worlds—from shopkeepers to quest-givers
AI Companions: Intelligent agents that can make decisions and interact like human players
Both require voices that can match their intelligence and adapt to emotion and context in real-time.
The Challenge
When evaluating voice providers, Ego needed to meet several critical requirements:
Achieving natural, real-time voice interactions between players and AI companions
Generating contextually appropriate emotional responses
Supporting multiple languages for international players
Creating distinct voices for different character personalities
Maintaining low latency for seamless gaming experiences
The Solution
Gaming demands voices that can match dynamic player interactions. Ego chose Cartesia's Sonic model for its distinctive capabilities:
Emotion-First Design: Unlike traditional text-to-speech models, Sonic enables AI companions to express a full spectrum of emotions, from curiosity to combat urgency, making every interaction feel authentic. Alternative providers typically pre-train some form of emotion recognition into their models, but their contextual awareness falls short. Cartesia gives customers fine-grained control over emotions without having to regenerate the voice repeatedly.
Gaming-Optimized Latency: Powered by Cartesia's state space model (SSM) architecture, Sonic delivers industry-leading 90 ms latency, which is essential for maintaining immersion in real-time gaming environments.
Unlimited Dynamic Character Voices: Cartesia's instant voice cloning needs just 10 seconds of audio to create a voice, so each AI character keeps its own unique voice characteristics. This enables Ego to create diverse, memorable characters that resonate with players while maintaining consistent personalities.
Global Gaming Community: Ego's games can reach a global audience through support for 14 additional languages and regional accents, ensuring authentic character interactions worldwide.
Voice Prompting: Cartesia's voice changer allows precise control over dialogue delivery. Developers can record a sample with the desired style, emotion, and prosody, then apply it to any character voice, which enables natural voice interactions without repeated studio sessions.
The Results
Ego's launch of Thrall represents a breakthrough in NPC design—players can use natural voice commands to direct it in performing any task a human player would do, from gathering resources to engaging in combat. The AI companion responds with contextually appropriate emotions, celebrating successful hunts or acknowledging commands with human-like reactions. Most importantly, players can customize Thrall's personality and behavioral traits, creating a truly personalized gaming companion.
The partnership represents a significant step toward Ego's vision of democratizing game creation and enabling truly interactive gaming experiences where AI characters feel alive and responsive.
"As an avid gamer, I've been amazed by Ego's breakthrough in creating truly responsive gaming experiences. Previously, NPCs and game characters were limited by rigid, pre-programmed behaviors. Sonic brings these characters to life with natural personalities and expressiveness. I'm excited to contribute to Ego's vision for the future of interactive gaming."
Karan Goel, CEO, Cartesia