Doubao Unveils Advanced Visual Understanding ModelWelcome to AI DAMN! Discover the most amazing latest AI news, innovative AI products, and groundbreaking AI projects. From ChatGPT to cutting-edge models, we curate the AI developments that make you go 'DAMN!' - your daily dose of mind-blowing artificial intelligence.

Discover

Language

Account

Doubao Unveils Advanced Visual Understanding Model

At the Volcano Engine FORCE Power Conference on December 18, 2024, Volcano Engine announced a comprehensive upgrade to the Doubao large model family, introducing a groundbreaking visual understanding model.

Tan Dai, the president of Volcano Engine, highlighted that the daily token usage of the Doubao large model has surged to over 4 trillion tokens, a remarkable 33-fold increase since its launch in May. This significant growth underscores the model's widespread adoption across various application scenarios.

The newly launched visual understanding model enables users to input both text and image questions simultaneously. This capability enhances the model's understanding and allows it to provide accurate responses, simplifying the application development process and unlocking the potential of large models in diverse scenarios.

The visual understanding model is equipped with advanced content recognition capabilities. It can identify basic elements such as object categories and shapes in images, understand relationships between objects, spatial layouts, and the overall meaning of scenes. For instance, it can recognize shadows and apply natural knowledge to interpret visual data effectively.

Additionally, the model exhibits stronger understanding and reasoning abilities, allowing for better content recognition and facilitating complex logical calculations based on identified text and image information. This includes chart reasoning and physical reasoning, enhancing its application in analytical tasks.

Furthermore, the visual understanding model features refined visual description capabilities, enabling it to generate detailed descriptions of content presented in images. This functionality can support various forms of creative writing, including image creation and image poetry.

The visual understanding model holds promising application prospects in numerous fields such as education, tourism, and e-commerce. In education, for example, the model can assist students in optimizing essays and enhancing their scientific knowledge. In tourism, it can provide translations of foreign menus and explanations of architectural sites for travelers. In the realm of e-commerce, it can help merchants highlight product features, thus improving advertising effectiveness.

The usage cost of the visual understanding model is notably affordable, priced at 0.003 yuan per thousand tokens, which is 85% lower than the industry average. This pricing allows the processing of up to 284 images at 720P for every yuan spent, marking a significant advancement in visual understanding technology. Additionally, Volcano Engine offers up to 15,000 initial traffic supports for enterprises and developers, facilitating better utilization of this innovative technology.

During the conference, Volcano Engine not only launched the visual understanding model but also upgraded several other models. The comprehensive task handling capability of the Doubao general model pro has improved by 32% since May, with notable enhancements in reasoning, instruction following, coding, and mathematics. Furthermore, the Doubao video generation model is set to be available for external service in January 2025, with enterprises encouraged to make reservations for its use.

To further enhance enterprises' information acquisition and search recommendation capabilities, Volcano Engine introduced a comprehensive AI search service. This service aims to help businesses connect information effectively with user needs, thus facilitating the intelligent transformation of various industries.

Key Points

The daily token usage of the Doubao large model has reached 4 trillion, a 33-fold increase since May.
The newly launched visual understanding model supports simultaneous input of text and images, applicable in education, tourism, and e-commerce.
The usage cost is only 0.003 yuan per thousand tokens, significantly lower than the industry average.

Enjoyed this article?

Subscribe to our newsletter for the latest AI news, product reviews, and project recommendations delivered to your inbox weekly.

Weekly digestFree foreverUnsubscribe anytime

News

Merkel's AI Glasses Try-On Sparks German Business Rush

German Chancellor Angela Merkel's spontaneous test of Zhejiang Lingoport's AI glasses during her Hangzhou visit turned heads—and opened wallets. Several German executives immediately placed orders after witnessing the device's impressive real-time translation capabilities. The incident highlights China's growing tech influence and Germany's appetite for innovative partnerships.

February 28, 2026

AI technologySino-German relationsconsumer electronics

News

Bumble's New AI Tools Help You Shine Online

Dating app Bumble rolled out smart new features this week to help users put their best foot forward. An AI profile coach offers personalized tips to polish your bio, while a photo advisor helps pick your most flattering shots. The moves aim to boost matches by reducing awkward first impressions—because let's face it, writing about yourself is hard. While competitors race to add similar tech, privacy concerns linger as apps dig deeper into our personal data.

February 27, 2026

dating appsAI technologyonline privacy

News

Anthropic Bolsters AI Ambitions with Vercept Acquisition

AI powerhouse Anthropic has snapped up Seattle-based startup Vercept in a strategic move to strengthen its Claude Code ecosystem. While some founders transition to Anthropic, others voice disappointment over the product shutdown. The deal highlights the fierce competition for top AI talent as major players race to dominate emerging technologies.

February 26, 2026

AnthropicAI acquisitionsdeveloper tools

News

Wayve Drives Off with $1 Billion for AI-Powered Autonomous Cars

London-based AI startup Wayve just secured a massive $1.05 billion investment, led by SoftBank with backing from NVIDIA and Microsoft. The company's unique approach to self-driving technology - which mimics human learning rather than relying on expensive sensors - could revolutionize how cars navigate city streets. This funding marks a major vote of confidence in European AI innovation and signals growing excitement about 'embodied AI' applications.

February 25, 2026

autonomous vehiclesAI startupsSoftBank

News

Spotify's New AI Feature Turns Your Mood Into Music

Spotify is revolutionizing how we discover music with its new AI Playlist feature. Premium subscribers in select countries can now create personalized playlists simply by describing their mood or activity - no more endless scrolling. The tool understands complex requests like 'retro jogging tracks with an 80s neon vibe' and continuously improves results based on feedback. This innovation comes as Spotify increasingly bets on AI to stay ahead in the competitive streaming market.

February 24, 2026

Spotifymusic streamingAI technology

News

China's GLM-5 AI Model Breaks New Ground with Domestic Chip Support

Zhipu Technology's GLM-5 AI model has made waves with its latest upgrades, now fully supporting seven major Chinese chip platforms. The model boasts a staggering 744 billion parameters and leads globally in programming agent capabilities. While user demand temporarily overwhelmed servers, the company has responded with compensation measures. Key innovations include a dynamic attention mechanism and new reinforcement learning algorithms that significantly boost performance.

February 23, 2026

AI innovationChinese techmachine learning

Doubao Unveils Advanced Visual Understanding Model

Doubao Unveils Advanced Visual Understanding Model

Enjoyed this article?

Related Articles

Merkel's AI Glasses Try-On Sparks German Business Rush

Bumble's New AI Tools Help You Shine Online

Anthropic Bolsters AI Ambitions with Vercept Acquisition

Wayve Drives Off with $1 Billion for AI-Powered Autonomous Cars

Spotify's New AI Feature Turns Your Mood Into Music

China's GLM-5 AI Model Breaks New Ground with Domestic Chip Support

Popular Articles

TSMC Reports Record Revenue, AI Growth Fuels Optimism for 2025

Plaud AI Pro Launches with 30-Hour Battery and Smart Screen

SenseTime's New AI Model Outperforms GPT-5 in Spatial Intelligence

Silicon Flow Launches Enterprise MaaS Platform for AI Model Industrialization

ChatGPT Launches Instant Checkout for Seamless E-commerce

Main Pages

Content

Others