AgentCPM-GUI: Open-Source LLM Agent for Mobile Apps

Product Introduction
AgentCPM-GUI is an open-source mobile agent powered by a large language model (LLM). It automates tasks in Chinese and English applications by analyzing screenshots and executing the corresponding actions on the device. Designed to boost productivity on mobile devices, it is strongest at handling complex workflows and supporting popular Chinese apps.

Key Features
- Advanced GUI Understanding: Pre-trained on a large-scale bilingual Android dataset to improve recognition of GUI components.
- Chinese App Optimization: Fine-tuned for more than 30 popular Chinese applications, such as Dianping and Amap.
- Enhanced Reasoning: Uses reinforcement fine-tuning (RFT) so the model reasons about a task before acting.
- Efficient Action Design: A compact JSON action format reduces the average action length to 9.7 tokens.
- Screenshot Input: Takes screen images directly as input, so it operates on what is visible on screen.
- Multi-App Adaptability: Works across a variety of Android applications.
- Easy Setup: Simple installation process with clear documentation.
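
To illustrate why a compact JSON format keeps actions short, here is a minimal sketch. The field names (`POINT`, `TYPE`, `STATUS`), the verbose comparison format, and the helper functions are illustrative assumptions, not the project's documented schema. The sketch compares character counts only; token counts depend on the model's tokenizer.

```python
import json

# Hypothetical action schema, for illustration only: the keys "POINT",
# "TYPE", and "STATUS" are assumptions, not the project's documented format.

def encode_action(action: dict) -> str:
    """Serialize an action as JSON with no optional whitespace."""
    return json.dumps(action, separators=(",", ":"), ensure_ascii=False)

def describe_action(raw: str) -> str:
    """Parse a compact action string and return a readable description."""
    action = json.loads(raw)
    if "POINT" in action:
        x, y = action["POINT"]
        return f"tap at ({x}, {y})"
    if "TYPE" in action:
        return f"type {action['TYPE']!r}"
    if "STATUS" in action:
        return f"task status: {action['STATUS']}"
    raise ValueError(f"unrecognized action: {raw}")

# A compact tap action next to a more verbose, pretty-printed equivalent.
tap = encode_action({"POINT": [480, 320]})
verbose = json.dumps(
    {"action_type": "click", "coordinate": {"x": 480, "y": 320}}, indent=2
)

print(tap)                    # {"POINT":[480,320]}
print(len(tap), len(verbose)) # the compact form is much shorter in characters
print(describe_action(tap))   # tap at (480, 320)
```

Short keys and the absence of whitespace are what keep each serialized action small. A real deployment would also need to validate the model's output before executing it on a device.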

Product Data
- Supported Languages: Chinese, English
- Action Token Length: Average 9.7 tokens
- Pre-trained Models: Available for download
- Supported Apps: 30+ Chinese applications including bilibili, Amap, Dianping





