Ant Group Introduces Cost-Efficient MoE Language Models

Ant Group's Ling team has introduced two new Mixture-of-Experts (MoE) large language models: Ling-Lite and Ling-Plus. The models, detailed in a technical paper published on the preprint server arXiv, are designed to cut training costs significantly while maintaining high performance, even on lower-spec hardware.

The Models: Ling-Lite and Ling-Plus

Ling-Lite has 16.8 billion parameters, of which 2.75 billion are activated per token; its larger counterpart, Ling-Plus, has 290 billion parameters with 28.8 billion activated. Notably, Ling-Plus, an MoE model at the 300-billion-parameter scale, achieves performance comparable to models trained on high-end Nvidia GPUs, despite being trained on domestically produced, lower-spec hardware.
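Because an MoE model routes each token through only a subset of its experts, compute per token scales with the activated parameters rather than the total. A quick sketch of the activation ratios implied by the figures above, using only the numbers reported in the article:

```python
# Fraction of parameters activated per token for each Ling model,
# using the parameter counts from the article (in billions).
models = {
    "Ling-Lite": (2.75, 16.8),    # (activated, total)
    "Ling-Plus": (28.8, 290.0),
}

for name, (active, total) in models.items():
    ratio = active / total
    print(f"{name}: {ratio:.1%} of parameters active per token")
# → Ling-Lite: 16.4% of parameters active per token
# → Ling-Plus: 9.9% of parameters active per token
```

So Ling-Plus, despite being roughly 17 times larger than Ling-Lite overall, activates under 10% of its parameters for any given token, which is the core of the MoE efficiency argument.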


Breaking Resource Barriers

Traditionally, training MoE models requires expensive high-performance GPUs like Nvidia's H100 and H800. This not only drives up costs but also limits accessibility due to chip shortages. To address these challenges, Ant Group's Ling team set an ambitious goal: scaling models without relying on high-end GPUs. Their innovative approach includes:

  • Dynamic parameter allocation: Optimizing resource usage during training.
  • Mixed-precision scheduling: Reducing computational overhead.
  • Upgraded training exception handling: Cutting interruption response time and compressing the verification cycle by over 50%.

Cost Efficiency and Performance

In experiments, the team pre-trained Ling-Plus on 9 trillion tokens. Training on 1 trillion tokens with high-performance hardware typically costs approximately 6.35 million RMB; Ant Group's optimized methods reduced this to around 5.08 million RMB, a saving of roughly 20%. Performance-wise, the models rival established systems such as Alibaba's Tongyi Qwen2.5-72B-Instruct and DeepSeek-V2.5-1210-Chat.
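The roughly 20% figure follows directly from the two reported cost numbers. A quick sanity check, using only values from the article:

```python
# Sanity-check the reported training-cost saving for Ling-Plus.
# Both figures are in millions of RMB per 1 trillion tokens of training.
baseline_cost = 6.35   # conventional high-performance hardware
optimized_cost = 5.08  # Ant Group's optimized, lower-spec setup

savings = (baseline_cost - optimized_cost) / baseline_cost
print(f"Relative saving: {savings:.1%}")
# → Relative saving: 20.0%
```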

Implications for AI Development

The success of these models could revolutionize the AI industry by providing a more cost-effective solution for developing large language models. By reducing reliance on Nvidia chips and enabling efficient training on lower-spec hardware, Ant Group is paving the way for broader adoption of advanced AI technologies in resource-constrained environments.

Key Points

  1. Ant Group introduced two MoE large language models: Ling-Lite (16.8B parameters) and Ling-Plus (290B parameters).
  2. These models achieve high performance using low-performance hardware, reducing training costs by nearly 20%.
  3. Innovations include dynamic parameter allocation, mixed-precision scheduling, and improved exception handling.
  4. The technology reduces reliance on Nvidia GPUs, offering a cost-effective alternative for AI development.
  5. The models' performance rivals established systems like Alibaba's Tongyi Qwen2.5 and DeepSeek-V2.5.

