Grab Develops AI Model for Southeast Asian Languages
Grab Tackles Language Recognition Challenges with Custom AI Model
Singapore-based super app company Grab has developed its own visual language model to address shortcomings in processing Southeast Asian languages, according to a recent technical blog post. The innovation comes as existing commercial solutions struggle with non-Latin scripts common across Grab's eight-country operational footprint.

The Compliance Challenge
Grab's platform, which offers ride-hailing, food delivery, and financial services across Singapore, Malaysia, Indonesia and neighboring countries, requires accurate document processing for customer verification. Traditional OCR systems proved inadequate when handling diverse identity documents written in regional scripts.
"We found commercial models made frequent errors with Southeast Asian languages," Grab engineers noted. "Even open-source visual language models lacked sufficient accuracy despite better efficiency."
Building a Specialized Solution
In 2025, Grab began developing its own visual large language model (VLLM), which encodes images into vector representations so that text can be extracted from them. The team selected Alibaba Cloud's Qwen2-VL 2B as the foundation model due to:
- Moderate model size
- Native Southeast Asian language support
- Dynamic handling of varied image resolutions
The company built specialized training data and adapted the model by:
- Extracting regional language content from Common Crawl
- Building synthetic data pipelines that render text in diverse fonts and on diverse backgrounds
- Applying low-rank adaptation (LoRA) fine-tuning techniques
The resulting model performed particularly well on Indonesian documents, while work on Thai and Vietnamese recognition continues.
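To make the low-rank adaptation step concrete, here is a minimal sketch of the general idea in plain Python. It is not Grab's code: the function names, the matrix sizes, the rank, and the scaling value `alpha` are all illustrative assumptions. The base weight matrix stays frozen, and only two small matrices are trained and added on top of it.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank.

    w: frozen base weight, shape (d_out, d_in)
    b: trainable matrix, shape (d_out, r)
    a: trainable matrix, shape (r, d_in)
    alpha: scaling hyperparameter (illustrative value, not from the article)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Count parameters trained by LoRA versus full fine-tuning for one layer."""
    return {"lora": r * (d_in + d_out), "full": d_in * d_out}


if __name__ == "__main__":
    rng = random.Random(0)
    d_out, d_in, r = 4, 6, 2  # toy sizes, chosen only for illustration
    w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
    a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
    # B starts at zero, so the adapter leaves the base model unchanged at first.
    b = [[0.0] * r for _ in range(d_out)]
    assert lora_effective_weight(w, a, b, alpha=4) == w
    print(trainable_params(1024, 1024, 8))
```

For a hypothetical 1024 by 1024 layer with rank 8, the adapter trains 16,384 parameters instead of 1,048,576, which is the sense in which this kind of fine-tuning keeps a small model cheap to specialize.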
Performance Breakthroughs
The customized solution demonstrates several advantages:
- Outperforms general OCR tools in accuracy
- Exceeds commercial LLMs' regional language capabilities
- Maintains lightweight efficiency through focused training
- Enables reliable compliance document processing
"Strategic use of high-quality data proves small specialized models can achieve both effectiveness and efficiency," Grab stated.
The company plans further model development to expand its document processing capabilities amid growing operational complexity.
Key Points:
📊 Commercial models underperform on Southeast Asian scripts, prompting Grab's custom solution
🔍 Visual LLM breakthrough improves ID/license processing accuracy
🚀 Continued development planned to handle more document types and languages



