Xiaohongshu's Open-Source Multimodal Model Challenges Industry Leaders
Chinese social media platform Xiaohongshu has entered the AI race with the release of dots.vlm1, its first self-developed multimodal large model. The open-source system combines a 1.2B-parameter NaViT visual encoder with the DeepSeek V3 large language model, achieving performance comparable to proprietary models such as Google's Gemini 2.5 Pro.
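To make that composition concrete, here is a minimal, illustrative sketch of the encoder-projector-LLM pattern that vision-language models of this kind typically follow. The `VisionLanguageModel` class, the placeholder transformer layers, and all dimensions are assumptions for illustration only; this is not the dots.vlm1 implementation.

```python
import torch
import torch.nn as nn

class VisionLanguageModel(nn.Module):
    """Illustrative encoder + projector + LLM composition.

    The encoder stands in for a NaViT-style vision tower and the
    decoder for a DeepSeek-V3-scale language model; both are tiny
    placeholders here, not the real networks.
    """

    def __init__(self, vision_dim=1024, llm_dim=4096, vocab_size=32000):
        super().__init__()
        # Placeholder vision tower: contextualizes image patch tokens.
        self.vision_encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=vision_dim, nhead=8, batch_first=True),
            num_layers=2,
        )
        # Projector: aligns visual features with the LLM's embedding space.
        self.projector = nn.Linear(vision_dim, llm_dim)
        # Placeholder language model over the combined token sequence.
        self.llm = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=llm_dim, nhead=8, batch_first=True),
            num_layers=2,
        )
        self.text_embed = nn.Embedding(vocab_size, llm_dim)
        self.lm_head = nn.Linear(llm_dim, vocab_size)

    def forward(self, patch_tokens, text_ids):
        visual = self.projector(self.vision_encoder(patch_tokens))
        textual = self.text_embed(text_ids)
        # Visual tokens are prepended to text tokens, as in most VLMs.
        hidden = self.llm(torch.cat([visual, textual], dim=1))
        return self.lm_head(hidden)

model = VisionLanguageModel()
logits = model(torch.randn(1, 16, 1024), torch.randint(0, 32000, (1, 8)))
print(logits.shape)  # torch.Size([1, 24, 32000])
```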

Native Architecture Breaks New Ground
The model's standout feature is its fully self-developed architecture, trained from scratch rather than fine-tuned from an existing model. The NaViT encoder supports dynamic-resolution processing, letting it handle the variable sizes and aspect ratios of real-world images (a minimal sketch of this patching scheme follows the list below). Through dual supervision combining pure visual and text-visual training, the system demonstrates exceptional capability with non-standard content, including:
- Tables and charts
- Mathematical formulas
- Document structures
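As referenced above, the following is a rough sketch of what dynamic-resolution patching can look like, assuming NaViT-style fixed-size patches taken over an image at its native resolution. The `patchify_native_resolution` helper and the patch size of 16 are illustrative assumptions, not details taken from the released model.

```python
import torch

def patchify_native_resolution(image, patch=16):
    """Split an image into patch tokens at its native resolution.

    image: (C, H, W) tensor; H and W are assumed to be multiples of
    `patch` for brevity (a real pipeline would pad or crop first).
    Returns (num_patches, C * patch * patch): the sequence length
    varies with resolution instead of forcing a fixed square input.
    """
    c, h, w = image.shape
    tokens = (
        image.unfold(1, patch, patch)   # (C, H/p, W, p)
             .unfold(2, patch, patch)   # (C, H/p, W/p, p, p)
             .permute(1, 2, 0, 3, 4)    # (H/p, W/p, C, p, p)
             .reshape(-1, c * patch * patch)
    )
    return tokens

# Two images at different resolutions yield different sequence lengths.
wide = patchify_native_resolution(torch.randn(3, 224, 448))
tall = patchify_native_resolution(torch.randn(3, 336, 224))
print(wide.shape, tall.shape)  # torch.Size([392, 768]) torch.Size([294, 768])
```

Because no resizing to a fixed square happens, wide charts and tall documents keep their layout, which is one reason dynamic resolution helps with tables, formulas, and document structures.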
"We rebuilt our entire training pipeline," explained the Hi Lab team. "From data collection using our dots.ocr tool for PDF processing to manual rewriting of web-sourced text, every component was optimized for cross-modal understanding."
Benchmark Performance Analysis
In testing across international evaluation sets, dots.vlm1 posts strong results.
The model particularly shines in complex analytical tasks, solving Olympiad-level math problems and demonstrating strong STEM reasoning. While it trails slightly in advanced textual reasoning, its mathematical and coding performance matches that of leading LLMs.

Future Development Roadmap
The Hi Lab team outlined three key focus areas for future development:
- Data expansion: Scaling cross-modal training datasets
- Algorithm enhancement: Implementing reinforcement learning techniques
- Reasoning improvement: Boosting generalization capabilities
By open-sourcing dots.vlm1, Xiaohongshu aims to stimulate innovation in the multimodal AI space while establishing itself as a serious player in foundational model development.
Key Points:
- Xiaohongshu's first self-developed multimodal large model, released as open source
- Self-developed NaViT encoder natively handles dynamic-resolution input
- Matches proprietary models in 6/8 benchmark categories
- Exceptional performance on STEM and analytical tasks
- Planned enhancements through RL and data scaling


