Ant Group's BaiLing Team Open Sources Efficient AI Model

Amid fierce competition in AI development, Ant Group's BaiLing large model team has open-sourced Ring-flash-linear-2.0-128K, a model designed for programming tasks over ultra-long contexts. The release targets two areas in particular: efficient inference and long-context processing.

Hybrid Architecture Delivers Unprecedented Efficiency

The model combines a hybrid of linear and standard attention with a sparse MoE (Mixture of Experts) architecture. Of its 104B total parameters, only 6.1B are activated during inference (4.8B excluding embeddings). With this design, the system achieves:

  • Near-linear time complexity
  • Constant space complexity
  • Generation speeds exceeding 200 tokens/second at 128K context on H20 hardware
  • Everyday-use speeds roughly three times those of traditional models

The architecture is particularly optimized for resource-limited scenarios while maintaining performance comparable to 40B dense models.
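
The "constant space complexity" claim can be illustrated with a minimal sketch of generic causal linear attention. This is not the model's actual attention kernel, which the article does not describe; the feature map (`elu(x) + 1`), the class name `LinearAttentionState`, and the helper `kv_cache_floats` are all assumptions chosen for illustration. The point is that the running state has a fixed size regardless of how many tokens have been processed, while a standard attention KV cache grows with sequence length.

```python
import math
from typing import List

Vector = List[float]


def feature_map(x: Vector) -> Vector:
    """Positive feature map phi(x) = elu(x) + 1 (an assumed, commonly used choice)."""
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]


class LinearAttentionState:
    """Running state for causal linear attention.

    Keeps S = sum_t phi(k_t) v_t^T (d_k x d_v) and z = sum_t phi(k_t) (d_k),
    so memory does not depend on the number of tokens seen.
    """

    def __init__(self, d_k: int, d_v: int) -> None:
        self.S = [[0.0] * d_v for _ in range(d_k)]
        self.z = [0.0] * d_k

    def step(self, q: Vector, k: Vector, v: Vector) -> Vector:
        """Absorb one (k, v) pair, then return the attention output for q."""
        phi_k = feature_map(k)
        for i, ki in enumerate(phi_k):
            self.z[i] += ki
            for j, vj in enumerate(v):
                self.S[i][j] += ki * vj
        phi_q = feature_map(q)
        # Normalizer is strictly positive because phi(.) > 0 elementwise.
        denom = sum(qi * zi for qi, zi in zip(phi_q, self.z))
        return [
            sum(phi_q[i] * self.S[i][j] for i in range(len(phi_q))) / denom
            for j in range(len(v))
        ]

    def num_floats(self) -> int:
        """Number of floats held in the state (independent of sequence length)."""
        return len(self.S) * len(self.S[0]) + len(self.z)


def kv_cache_floats(seq_len: int, d_k: int, d_v: int) -> int:
    """Floats a standard attention KV cache would hold for one head."""
    return seq_len * (d_k + d_v)


state = LinearAttentionState(d_k=2, d_v=3)
for t in range(1000):
    state.step([0.1 * t, -0.5], [1.0, 0.2 * t], [1.0, 2.0, 3.0])

print(state.num_floats())            # 8, no matter how many steps ran
print(kv_cache_floats(1000, 2, 3))   # 5000, grows linearly with tokens
```

Each step also costs a fixed amount of work, which is where the near-linear total time comes from. A hybrid design like the one described still keeps some standard attention layers, so its real memory footprint is not captured by this toy comparison.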

Enhanced Training Yields Superior Reasoning Capabilities

Building upon the Ling-flash-base-2.0 foundation, the model underwent:

  • Additional training on 1T tokens of high-quality data
  • Stable supervised fine-tuning (SFT)
  • Multi-stage reinforcement learning (RL)

The training process overcame traditional instability issues in MoE long-chain reasoning through Ant's proprietary "Icepop" algorithm. Benchmark results demonstrate exceptional capabilities:

  • 86.98 on the AIME 2025 math competition benchmark
  • 90.23 on the CodeForces programming benchmark (reported as an Elo-based score)
  • Outperforms dense models under 40B parameters, such as Qwen3-32B, in logical reasoning and creative writing tasks

Long Context Handling Redefines Programming Efficiency

The model natively supports 128K context windows, expandable to 512K using YaRN extrapolation technology. Performance highlights include:

  • Prefill phase throughput nearly 5× higher than Qwen3-32B
  • Decoding phase achieving 10× acceleration
  • Maintains high accuracy even in 32K+ context programming tasks without "model leakage" issues

The system proves particularly effective for:

  • Front-end development
  • Structured code generation
  • Agent simulation scenarios
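
The 128K-to-512K extension mentioned above implies a YaRN scaling factor of 4. The sketch below only works out that arithmetic and packs it into a dictionary. The key names follow the `rope_scaling` convention used by some Hugging Face Transformers model configs; whether this model's configuration accepts exactly these keys is an assumption, as are the helper name `yarn_rope_scaling` and the reading of "K" as 1024 tokens.

```python
def yarn_rope_scaling(native_ctx: int, target_ctx: int) -> dict:
    """Build a YaRN-style rope scaling dict for extending the context window.

    Key names are assumed (borrowed from a common Transformers convention);
    check the model's own config before relying on them.
    """
    if target_ctx < native_ctx:
        raise ValueError("target context must be at least the native context")
    return {
        "rope_type": "yarn",
        "factor": target_ctx / native_ctx,
        "original_max_position_embeddings": native_ctx,
    }


# Assumes "128K" and "512K" mean 128 * 1024 and 512 * 1024 tokens.
config = yarn_rope_scaling(native_ctx=128 * 1024, target_ctx=512 * 1024)
print(config["factor"])  # 4.0
```

Because extrapolation stretches position encodings beyond what was seen in training, it is usually worth enabling only when inputs actually exceed the native window.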

Open Source Availability Accelerates Adoption

The BaiLing team has made the model available on:

  • Hugging Face
  • ModelScope

Support includes BF16/FP8 formats and easy integration with popular frameworks like Transformers, SGLang, and vLLM.

Technical documentation is available on arXiv (https://arxiv.org/abs/2510.19338).

Key Points:

  • Combines hybrid linear attention with MoE architecture
  • Achieves SOTA performance with only 6.1B activated parameters
  • Native 128K context support expandable to 512K
  • Sevenfold efficiency improvement over previous versions
  • Available now on Hugging Face and ModelScope
