ElevenLabs CEO: AI Voice Models to Commoditize Soon

ElevenLabs CEO Predicts Commoditization of AI Voice Models

At TechCrunch Disrupt 2025, Mati Staniszewski, co-founder and CEO of ElevenLabs, made a bold prediction: AI voice models will become commoditized within the next two to three years. While models are currently a competitive differentiator, Staniszewski believes the performance gaps between them will narrow significantly for mainstream languages and general voice styles.

[Image: AI-generated; licensing provider: Midjourney]

Short-Term Focus on Models, Long-Term on Products

Asked why ElevenLabs invests heavily in R&D if models may eventually become interchangeable, Staniszewski explained: "Today, models remain the biggest technical barrier. If AI voice sounds unnatural or unsmooth, user experience suffers." He highlighted ElevenLabs' breakthroughs in model architecture, particularly in emotional expression and multilingual prosody modeling, as key differentiators.

The company is already preparing for the post-model era. "Our long-term strategy isn't just being a model supplier," Staniszewski emphasized. "We're building complete 'AI + product' experiences." Much as Apple integrates hardware and software in its smartphones, ElevenLabs aims to use its proprietary models as engines powering high-value applications.

Multi-Modal Integration Emerges as Next Frontier

Looking one to two years ahead, Staniszewski anticipates that single-modal voice systems will rapidly converge into multi-modal platforms. "You'll generate audio and video simultaneously," he predicted, "or dynamically link large language models with voice engines during conversations." He cited Google's Veo 3 video generation model as evidence that cross-modal collaboration is the next technological frontier.

To position itself competitively, ElevenLabs is actively pursuing partnerships with third-party models and open-source communities. These collaborations explore embedding ElevenLabs' audio capabilities into broader AI ecosystems—potentially enabling immersive virtual humans, advanced smart customer service systems, or innovative interactive entertainment experiences.

Commoditization Signals Value Shift, Not Decline

Staniszewski rejects the notion that model commoditization spells industry decline. Instead, he sees it as a shift in value creation from underlying technology to application innovation. "Future companies will select models based on specific scenarios," he explained. "Different solutions for customer service versus game voice acting versus educational explanations."

The CEO emphasized that reliability, scalability, and scenario adaptability will surpass raw sound quality as primary decision factors. Accordingly, ElevenLabs is strengthening its API platform, developer toolchain, and industry-specific solutions—ensuring customers can integrate high-quality voices seamlessly into business workflows.

Key Points:

  • Commoditization timeline: AI voice models expected to become standardized commodities within 2-3 years
  • Strategic pivot: ElevenLabs transitioning from pure model development to integrated product solutions
  • Multi-modal future: Convergence of audio with video generation and LLMs emerging as next competitive battleground
  • Value migration: Industry focus shifting from technical superiority to application-specific implementations