Tsinghua University Unveils AutoDroid-V2 for Mobile AI Control

Tsinghua University's AutoDroid-V2 Launch

On December 24, 2024, Tsinghua University’s Institute for AI Industry Research (AIR) introduced AutoDroid-V2, an AI model for automating the control of mobile devices. It lets users issue commands in natural language and relies on small language models to carry them out.

Innovations in AI Automation

Unlike conventional systems that depend on large cloud-hosted language models (LLMs), AutoDroid-V2 takes a script-based approach that lets the device itself carry out user commands. Relying less on cloud services improves privacy and security, cuts users' data consumption, and lowers server-side operating costs, all of which could make mobile automation easier to deploy widely.

Background and Development

Recent advances in large language models and vision-language models have made it possible to control mobile devices through natural language commands, offering new ways to handle complex user tasks. However, conventional methods such as the "step-by-step GUI agent" approach, which queries the model before each GUI action, consume large amounts of data and raise privacy concerns. Both problems have hindered large-scale deployment.

The key innovation of AutoDroid-V2 is that it generates a multi-step script directly from a user command. A single script can cover several GUI operations, so the model is queried far less often and consumes fewer resources. Task scripts are generated and executed on the user’s device, and the model builds application documentation offline, in advance, which lays the groundwork for later script generation.
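The difference in query frequency can be illustrated with a toy simulation. This is a minimal sketch, not AutoDroid-V2's actual implementation: `FakeModel`, `PLANS`, `run_step_by_step`, and `run_script_based` are hypothetical names, and the "model" simply reads from a hard-coded plan so that only the number of queries differs between the two styles.

```python
from dataclasses import dataclass

# Hypothetical task plan: the GUI actions needed for one example command.
PLANS = {
    "send 'hello' to Alice": [
        ("tap", "Messages"),
        ("tap", "Alice"),
        ("type", "hello"),
        ("tap", "Send"),
    ],
}


@dataclass
class FakeModel:
    """Stand-in for a language model that counts how often it is queried."""
    queries: int = 0

    def next_action(self, task, history):
        """Step-by-step style: one query returns one action and a done flag."""
        self.queries += 1
        plan = PLANS[task]
        index = len(history)
        return plan[index], index == len(plan) - 1

    def write_script(self, task):
        """Script style: one query returns the whole action sequence."""
        self.queries += 1
        return list(PLANS[task])


def run_step_by_step(model, task):
    """Query the model before every GUI action until it reports completion."""
    history = []
    done = False
    while not done:
        action, done = model.next_action(task, history)
        history.append(action)  # on a real device this would drive the GUI
    return history


def run_script_based(model, task):
    """Query the model once, then execute the returned script locally."""
    executed = []
    for action in model.write_script(task):
        executed.append(action)  # on a real device this would drive the GUI
    return executed


TASK = "send 'hello' to Alice"
step_model, script_model = FakeModel(), FakeModel()
step_actions = run_step_by_step(step_model, TASK)
script_actions = run_script_based(script_model, TASK)

print("step-by-step queries:", step_model.queries)
print("script-based queries:", script_model.queries)
```

Both runs perform the same four actions, but the step-by-step loop queries the model once per action while the script-based run queries it once in total. A real system would also need to handle scripts that fail partway through, which this sketch leaves out.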

Performance Testing Results

In performance evaluations, AutoDroid-V2 was tested on 226 tasks across 23 mobile applications. Its task completion rate improved by 10.5% to 51.7% over baselines including AutoDroid and SeeClick. It also cut input and output token consumption to 1/43.5 and 1/5.8 of the baseline levels, respectively, and reduced model inference latency to between 1/5.7 and 1/13.4 of that of the earlier models. These results indicate that AutoDroid-V2 is both efficient and reliable in practical use.
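The "1/N" figures are easier to compare as percentage reductions. The short sketch below does that arithmetic for the reported ratios; the helper name `reduction_percent` and the dictionary labels are illustrative, not taken from the AutoDroid-V2 evaluation.

```python
def reduction_percent(divisor: float) -> float:
    """Percentage saved when a cost falls to 1/divisor of its baseline."""
    return (1.0 - 1.0 / divisor) * 100.0


# Divisors as reported in the article (cost drops to 1/divisor of baseline).
reported = {
    "input tokens": 43.5,
    "output tokens": 5.8,
    "latency (smallest gain)": 5.7,
    "latency (largest gain)": 13.4,
}

for name, divisor in reported.items():
    saved = reduction_percent(divisor)
    print(f"{name}: 1/{divisor} of baseline = {saved:.1f}% reduction")
```

By this calculation, input tokens fall by about 97.7%, output tokens by about 82.8%, and latency by roughly 82.5% to 92.5%.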

Implications for the Future

The launch of AutoDroid-V2 marks a notable step forward for on-device AI in mobile technology. By making natural language control more efficient and reducing dependence on cloud infrastructure, the model improves the user experience while addressing data privacy and operating costs.

Key Points

  1. AutoDroid-V2 is a new AI model from Tsinghua University that makes natural language control of mobile devices more efficient.
  2. By running on small language models, it reduces dependence on cloud services, which improves user privacy and security.
  3. In benchmark tests, AutoDroid-V2 achieved higher task completion rates with lower token consumption and latency than earlier approaches.

© 2024 - 2025 Summer Origin Tech
