
Tongyi Lab Unveils Next-Gen Voice Models That Respond Like Humans

Tongyi Lab's Voice AI Breakthrough: Speaking Human


In a significant advancement for voice technology, Tongyi Lab has launched Fun-CosyVoice3.5 and Fun-AudioGen-VD, two models that take their instructions in plain, natural language. Gone are the days of memorizing specific commands: now you can simply tell these systems what you need.

The Human Touch in Machine Speech

The real magic lies in how these models interpret requests. Want a villainous voice whispering threats? Or perhaps a cheerful barista taking your coffee order? Just say so. The system handles the rest, eliminating the technical jargon barrier that once separated creators from powerful voice tools.


Fun-CosyVoice3.5 brings impressive upgrades:

  • Supports four additional languages including Thai and Indonesian
  • Cuts pronunciation errors by nearly 70%
  • Reduces processing delays, with response times 35% faster than previous versions

The secret sauce is a pair of reinforcement learning techniques, DiffRO and GRPO (Group Relative Policy Optimization), which help the model pick up subtle speech patterns that most systems miss.
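For readers curious about GRPO, its core idea can be sketched in a few lines: several candidate outputs are sampled for the same input, each is scored, and every score is then normalized against the rest of its own group. The snippet below is only an illustration of that normalization step, not Tongyi Lab's training code. The reward values are made up, the function name is hypothetical, and the use of the population standard deviation is an assumption. DiffRO is not covered here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each sample relative to its own group (GRPO-style normalization).

    rewards: scores for several candidate outputs generated from one input.
    eps:     small constant that avoids division by zero when all scores match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation (an assumption)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four candidate renderings of the same sentence,
# e.g. from a scorer that penalizes mispronunciations (values are invented).
rewards = [0.9, 0.5, 0.7, 0.3]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f}  advantage={advantage:+.3f}")
```

Candidates that beat their group's average get a positive advantage and are reinforced, while below-average ones get a negative advantage. Because the baseline comes from the group itself, this step needs no separately trained value model.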

Meanwhile, Fun-AudioGen-VD transforms sound design:

  • Adjusts gender, emotion and even room acoustics on command
  • Creates everything from single voices to complex ambient scenes
  • Perfect for gaming environments or film dubbing workflows

Why This Matters Beyond Tech Circles

The implications stretch far beyond impressive demos. Film studios can prototype character voices instantly. Game developers might slash weeks off production schedules. Even virtual assistants could soon respond with emotional intelligence rather than robotic precision.

The technology arrives as demand grows: industry analysts project the voice synthesis market will double by 2028 as consumers embrace more natural digital interactions.

Key Points:

  • Natural commands replace technical parameters
  • Nearly 70% fewer pronunciation errors, notably on uncommon words/phrases
  • 35% faster response times than previous versions
  • New language support expands global accessibility
  • Emotional range control unlocks creative potential

