Alibaba Unveils FunAudio-ASR with Breakthrough Noise Reduction

Alibaba's FunAudio-ASR Redefines Speech Recognition Standards

Alibaba Group's TONGYI Lab has introduced FunAudio-ASR, an end-to-end speech recognition model that improves accuracy in noisy environments through its Context module. The module reportedly reduces the hallucination rate from 78.5% to 10.7%, a drop of nearly 68 percentage points (roughly an 86% relative reduction).
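
The two hallucination rates quoted above can be compared in two ways: as an absolute drop in percentage points, or as a reduction relative to the starting rate. The short Python sketch below works through both. The function name `reduction` is purely illustrative and is not part of any FunAudio-ASR tooling; only the 78.5% and 10.7% figures come from the article.

```python
def reduction(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative reduction in percent)."""
    absolute = before_pct - after_pct          # difference between the two rates
    relative = absolute / before_pct * 100     # drop as a share of the starting rate
    return absolute, relative


# Figures quoted in the article: 78.5% before, 10.7% after.
absolute, relative = reduction(78.5, 10.7)
print(f"Absolute drop: {absolute:.1f} percentage points")
print(f"Relative reduction: {relative:.1f}%")
```

Run as written, this prints an absolute drop of 67.8 percentage points and a relative reduction of 86.4%, which is why the change is better described in percentage points than as a flat percentage.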

Technical Breakthroughs

The model was trained on tens of millions of hours of audio data and integrates large language models' semantic understanding capabilities. Testing shows superior performance compared to competitors like Seed-ASR and KimiAudio-8B in challenging scenarios including:

Far-field audio capture
High-noise environments
Multi-speaker situations

The system is particularly effective in settings such as business meetings and public spaces, where background noise traditionally degrades recognition quality.

Deployment Options

Recognizing diverse user needs, Alibaba offers:

Full version: Maximum accuracy for enterprise applications
FunAudio-ASR-nano: Lightweight version maintaining core functionality while reducing computational requirements

The nano variant enables cost-effective deployment across various hardware configurations without significant performance compromises.

Current Implementations

The technology already powers several real-world applications:

DingTalk's "AI Note-taking" feature
Video conferencing systems
DingTalk A1 hardware devices

Developers can access the API through Alibaba Cloud's BaiLian platform, facilitating seamless integration into existing systems.

Industry Impact

The launch represents a significant leap forward for:

Business communication tools
Accessibility technologies
AI-powered transcription services

By dramatically improving reliability in noisy conditions, FunAudio-ASR removes a major barrier to widespread speech recognition adoption.

Key Points:

Hallucination rate reduced from 78.5% to 10.7%, a drop of nearly 68 percentage points
The Context module drives the accuracy improvements in noisy conditions
Dual deployment options (full and nano) accommodate different resource requirements
Already implemented across Alibaba's business communication ecosystem
API availability accelerates third-party adoption
