
Meituan's New AI Can Clone Voices with Stunning Accuracy

Meituan Breaks New Ground in Voice Cloning Technology

Meituan's LongCat team has open-sourced LongCat-AudioDiT, a new model for audio generation and voice cloning. The technology skips the intermediate steps that conventional text-to-speech systems rely on and works directly with sound waves to create eerily accurate voice clones.


A Radical New Approach

Traditional voice synthesis relies on multiple stages of processing, which can degrade quality. LongCat-AudioDiT takes a bold shortcut with just two core components:

  • Wav-VAE: This compressor shrinks audio dramatically while preserving quality. A recording sampled at 24kHz is represented with just 11.7 frames per second, without losing clarity.
  • Semantic-enhanced DiT: The model smartly blends text understanding with sound generation, catching subtle pronunciation details that often get lost in translation.
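
To put the Wav-VAE numbers in perspective, here is a quick back-of-the-envelope calculation. It uses only the two figures quoted above (24kHz audio, 11.7 frames per second); the function names are made up for illustration, and the result describes shortening along the time axis only, since the article does not say how much information each frame carries.

```python
# Back-of-the-envelope check of the compression figures quoted in the article.
SAMPLE_RATE_HZ = 24_000       # waveform samples per second (from the article)
LATENT_FRAME_RATE_HZ = 11.7   # compressed frames per second (from the article)


def samples_per_frame(sample_rate_hz: float, frame_rate_hz: float) -> float:
    """Number of raw waveform samples summarized by one compressed frame."""
    return sample_rate_hz / frame_rate_hz


def frames_for_clip(duration_s: float, frame_rate_hz: float) -> float:
    """Number of compressed frames needed to cover a clip of this length."""
    return duration_s * frame_rate_hz


ratio = samples_per_frame(SAMPLE_RATE_HZ, LATENT_FRAME_RATE_HZ)
print(f"about {ratio:.0f} samples per frame")
print(f"about {frames_for_clip(10, LATENT_FRAME_RATE_HZ):.0f} frames for a 10-second clip")
```

Each frame stands in for roughly 2,050 raw samples, so a 10-second clip that would otherwise be 240,000 samples long becomes a sequence of about 117 frames for the generator to model.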

Solving Persistent Problems

The team tackled two major voice cloning challenges head-on:

  1. Voice Drift Fix: Ever noticed how some AI voices seem to change character mid-sentence? The new dual constraint mechanism puts a stop to that instability.
  2. Natural Sound Boost: Their adaptive projection guidance acts like an intelligent filter, keeping the good parts of the audio signal while ditching the parts that make speech sound robotic.
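
The article does not spell out how adaptive projection guidance works inside LongCat-AudioDiT. As a rough illustration of the general idea, the sketch below shows one common form of projection-based guidance for diffusion models: the guidance direction is split into a part parallel to the model's conditional prediction and a part orthogonal to it, and the parallel part is down-weighted. The function name, parameters, and default values are assumptions for illustration, not the team's implementation.

```python
def dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def projected_guidance(cond, uncond, scale=4.0, parallel_weight=0.0):
    """Illustrative projection-based guidance step (hypothetical helper).

    cond / uncond are the model's predictions with and without the
    conditioning input. The guidance direction (cond - uncond) is split into
    components parallel and orthogonal to cond; the parallel component is
    scaled by parallel_weight before the update is applied.
    """
    diff = [c - u for c, u in zip(cond, uncond)]
    norm_sq = dot(cond, cond)
    coeff = dot(diff, cond) / norm_sq if norm_sq > 0 else 0.0
    parallel = [coeff * c for c in cond]
    orthogonal = [d - p for d, p in zip(diff, parallel)]
    update = [o + parallel_weight * p for o, p in zip(orthogonal, parallel)]
    return [c + (scale - 1.0) * u for c, u in zip(cond, update)]


# Toy 2-D example: keep only the orthogonal part of the guidance direction.
print(projected_guidance([1.0, 0.0], [0.0, 1.0], scale=3.0))
```

With parallel_weight set to 1.0 the function reduces to ordinary classifier-free guidance, which makes it easy to compare the two behaviors on the same inputs.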

Performance That Speaks for Itself

Independent tests show LongCat-AudioDiT setting new standards:

  • Achieves speaker-similarity scores of 0.818 for Chinese and 0.797 for challenging sentences
  • Keeps its word error rate low, at just 1.5% in English
  • Outperforms established models like Seed-TTS and CosyVoice3.5
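
For readers unfamiliar with the two metrics above, here is a minimal sketch of how they are typically computed. The article does not describe the evaluation pipeline, so treat this as an assumption: word error rate is shown as word-level edit distance divided by reference length, and similarity as the cosine similarity between two speaker-embedding vectors (the embeddings themselves would come from a separate speaker-verification model, which is not shown).

```python
import math


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# One wrong word out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

A 1.5% word error rate means that, on average, a speech recognizer transcribing the generated audio gets about 3 words wrong per 200 reference words. Cosine similarity ranges from -1 to 1, with higher values meaning the cloned voice's embedding sits closer to the original speaker's.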

The real kicker? It does all this using simpler training methods than competitors, proving that sometimes less really is more.

The technology is now available to developers worldwide through GitHub and HuggingFace.

Key Points:

  • Direct waveform modeling eliminates quality loss from intermediate steps
  • Roughly 2000x compression along the time axis (24kHz audio down to 11.7 frames per second) while maintaining audio fidelity
  • Top-tier performance in both Chinese and English voice cloning
  • Open-source availability encourages community development and innovation

