Doubao Unveils Advanced Real-time Voice Model for AI Dialogue
Doubao has announced the launch of its new real-time voice large model, which it claims achieves a "cliff-like lead" in Chinese dialogue. The model marks a significant step up in artificial intelligence (AI) conversational capabilities and is now fully available in the Doubao App (version 7.2.0 Spring Edition), where users can expect a richer and more authentic voice communication experience.
According to reports, Doubao's real-time voice large model integrates speech understanding and generation into a single end-to-end dialogue system. This design is said to let the model excel in voice expressiveness, control, and emotional continuity, while offering low latency and support for interruption at any point in the conversation. The official statement says the technology improves not only the model's intelligence (IQ) but also its emotional intelligence (EQ), helping it better understand and express emotions.
The latest update introduces a real-time voice call feature that uses Doubao's large model to adjust dialogue pace, retroflex sounds, volume, and breathiness according to the scenario. The new voice function can also mimic various vocal tones, support multiple dialects, and hold conversations in English. It can even sing certain songs, raising the realism of human-machine dialogue to the point where it becomes difficult to tell human from machine.
Doubao's research and development team has stated that the new technology is built on an end-to-end framework that natively integrates the speech and text modalities for unified modeling. According to the team, this design not only optimizes speech recognition and generation but also gives the AI a richer "soul," improving its ability to communicate with humans.
The launch of Doubao's real-time voice large model for Chinese voice dialogue promises to give users a new kind of interactive experience and to advance the development of intelligent voice technology. The innovation is set to raise the standard for conversational AI, with interactions that feel increasingly natural and human-like.
Key Points
- Doubao's new voice model enhances AI conversational capabilities in Chinese.
- The model integrates speech understanding and generation for improved user experience.
- The real-time voice call feature adjusts dialogue pace, volume, and other vocal qualities to fit the scenario.
- New functionalities include mimicry of various vocal tones and singing abilities.
- The technology aims to bridge the gap between human and machine communication.