AI Breakthroughs: Zhipu, DAMO, and 360 Release Cutting-Edge Models
Major AI Developments Reshape Tech Landscape
Zhipu Open-Sources GLM-4.5V Vision Model
Chinese AI company Zhipu has open-sourced GLM-4.5V, a visual reasoning model with 106 billion parameters that reportedly achieves state-of-the-art performance across 41 multimodal benchmarks. The model targets full-scenario applications, including image analysis, video understanding, and GUI tasks.

Key features include:
- A new "thinking mode" switch that lets users trade reasoning depth for efficiency
- Competitive pricing at ¥2 per million input tokens
- Superior performance in complex visual reasoning tasks
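
As a rough illustration of the listed input price, the sketch below estimates input-side cost at ¥2 per million tokens. The function name and the sample token counts are illustrative assumptions; output-token pricing is not stated here and is left out.

```python
def input_cost_yuan(input_tokens: int, price_per_million: float = 2.0) -> float:
    """Estimate input-side cost in yuan at a flat per-million-token price.

    price_per_million defaults to the 2 yuan per million input tokens quoted
    in the article; output tokens are not covered by this figure.
    """
    return input_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical request sizes, chosen only to show the scale of the cost.
    for tokens in (10_000, 250_000, 1_000_000):
        print(f"{tokens:>9} input tokens cost about {input_cost_yuan(tokens):.2f} yuan")
```

At this rate, a million input tokens costs ¥2, so typical single requests fall well under one yuan on the input side.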
Alibaba DAMO Advances Embodied Intelligence
At the World Robot Conference, Alibaba's research arm, DAMO Academy, unveiled three technologies:
- RynnVLA-001-7B: A vision-language-action model learning from first-person videos
- RynnEC: World understanding model analyzing scenes across 11 dimensions
- RynnRCP: Robot context protocol enabling complete sensor-to-action workflows
The technologies have been open-sourced on GitHub, with the aim of standardizing embodied intelligence development.
Apple Prepares GPT-5 Integration for Siri
Apple announced plans to upgrade its Apple Intelligence system with GPT-5 capabilities in upcoming iOS and macOS updates. Enhancements will include:
- Improved multilingual real-time translation
- Advanced screen content analysis
- First-time API access for third-party developers
The move signals Apple's commitment to staying competitive in the AI assistant space.
Mapping Gets Smarter with Gaode's AI Agent
Gaode (Amap), Alibaba's mapping service, launched what it describes as the world's first AI-native map agent, "Xiao Gao Teacher," featuring:
- End-to-end voice interaction with interruption capability
- Complex POI reasoning with multiple constraints
- Built on a Qwen model trained on 36 trillion tokens
The system represents a significant leap in spatial semantic understanding.
ByteDance Solves Subtitle Removal Challenge
The TikTok parent company introduced a solution based on a diffusion transformer (DiT) for seamless video subtitle removal, with:
- Pixel-perfect restoration technology
- Multilingual support including minority languages
- One-click "remove-translate-lip sync" workflow
The tool, offered through ByteDance's VolcEngine platform, is intended to streamline content localization.

Kunlun Wanwei Pushes Boundaries with Open Models
The gaming company made two significant contributions:
- Matrix-Game 2.0: Real-time generation of minute-long 25 fps videos without language prompts
- Matrix-3D: Conversion of a single image into a 360° navigable video (available on GitHub)
Both models demonstrate remarkable progress in generative AI applications.
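
To put the Matrix-Game 2.0 figures in perspective, the short calculation below works out how many frames a one-minute clip at 25 fps contains and the time available per frame. It assumes "real-time" means frames are produced at least as fast as they are played back; the function name is illustrative and not part of any released code.

```python
def frame_budget(duration_s: float, fps: float) -> tuple[int, float]:
    """Return (total frame count, per-frame time budget in milliseconds).

    Assumption: real-time generation means each frame must be ready within
    one playback interval, i.e. 1000 / fps milliseconds.
    """
    total_frames = round(duration_s * fps)
    ms_per_frame = 1000.0 / fps
    return total_frames, ms_per_frame


if __name__ == "__main__":
    frames, budget_ms = frame_budget(duration_s=60, fps=25)
    print(f"{frames} frames, about {budget_ms:.0f} ms available per frame")
```

Under that assumption, a one-minute clip amounts to 1,500 frames, each with a budget of roughly 40 ms.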
Key Points:
- Visual AI Leap: Zhipu's GLM-4.5V sets new standards for open-source vision models (106B params)
- Robotics Framework: Alibaba DAMO's trio of technologies could accelerate embodied intelligence development
- Consumer Upgrades: Apple's GPT-5 integration and Gaode's mapping agent showcase practical AI applications
- Content Tools: ByteDance and Kunlun Wanwei solutions address critical challenges in media production
- Open Source Momentum: Multiple major players releasing weights and code signals a trend toward industry collaboration



