

Tencent ARC Unveils Open-Source AudioStory Model for Long-Form Audio Generation

Tencent's Applied Research Center (ARC) has publicly released AudioStory, a model that uses large language models (LLMs) to generate long-form narrative audio. The open-source project targets a known weak spot in text-to-audio technology: extended content, where maintaining temporal coherence and handling structural complexity are difficult.


Technical Framework and Capabilities

The model operates through a unified understanding and generation framework, enabling diverse applications including:

  • Video dubbing
  • Audio continuation
  • Long narrative synthesis

By integrating LLMs with audio generation systems, AudioStory maintains scene transition continuity and emotional tone consistency across extended timelines. Its instruction-following architecture decomposes complex narrative queries into chronologically ordered subtasks.
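The decomposition step can be pictured with a small sketch. This is not AudioStory's code or API: the `AudioEvent` class, the `build_timeline` helper, and the hand-written events are all hypothetical, and in the real system the LLM would produce the sub-events from the instruction. The sketch only shows what "chronologically ordered subtasks" might look like as data.

```python
from dataclasses import dataclass


@dataclass
class AudioEvent:
    """One sub-event of a narrative (hypothetical structure, not AudioStory's)."""
    caption: str
    duration_s: float


def build_timeline(events):
    """Assign start and end times to events in narrative order."""
    timeline = []
    cursor = 0.0
    for event in events:
        start = cursor
        cursor += event.duration_s
        timeline.append((start, cursor, event.caption))
    return timeline


# Hand-written stand-in for what an LLM planner might return
# for an instruction such as "dub a kitchen chase scene".
events = [
    AudioEvent("Mouse tiptoes across the kitchen floor", 4.0),
    AudioEvent("Cat crashes into a stack of pans", 3.0),
    AudioEvent("Frantic chase with comedic music", 8.0),
]

timeline = build_timeline(events)
for start, end, caption in timeline:
    print(f"{start:5.1f}s - {end:5.1f}s  {caption}")
```

Each tuple in `timeline` would then serve as one generation subtask, with neighboring segments sharing a boundary so that scene transitions stay continuous.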


Key Innovations

AudioStory introduces two breakthrough features:

  1. Decoupled bridging mechanism: Splits the collaboration between the LLM and the audio generator into specialized components
  2. End-to-end training: Unifies instruction interpretation with audio production for enhanced system synergy

The team has also released the AudioStory-10K benchmark dataset, which spans domains from animated soundscapes to natural sound narratives. In the team's comparative tests, AudioStory outperforms conventional text-to-audio models in both single-clip generation and extended narrative generation.

Practical Applications

Current implementations include:

  • Dubbing for classic animations (demonstrated with Tom and Jerry samples)
  • Text-based long audio generation
  • Multi-scene narrative construction

The project's GitHub repository contains inference code alongside extensive documentation of use cases.

Key Points:

🎧 Combines LLMs with audio generation for coherent long-form narratives
📊 Outperforms existing models in temporal coherence and instruction fidelity
🛠️ Open-sourced with 10K benchmark dataset for community development
🌐 Demonstrated applications in entertainment and media production
