Ant Group's BaiLing Team Open Sources Efficient AI Model

Amid fierce competition in AI development, Ant Group's BaiLing large model team has open-sourced Ring-flash-linear-2.0-128K, a model designed specifically for ultra-long-context programming applications. The release marks a significant advance in efficient AI inference and long-context processing.


Hybrid Architecture Delivers Unprecedented Efficiency

The model features an innovative hybrid linear + standard attention mechanism combined with a sparse MoE (Mixture of Experts) architecture. With 104B total parameters but only 6.1B activated at inference time (4.8B excluding embeddings), the system achieves:

  • Near-linear time complexity
  • Constant space complexity
  • Generation speeds exceeding 200 tokens/second at 128K context on H20 hardware
  • Roughly 3× faster generation in day-to-day use than conventional models

The architecture is particularly optimized for resource-limited scenarios while maintaining performance comparable to 40B dense models, as sketched below.
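To make the hybrid design concrete, here is a minimal PyTorch sketch of how linear-attention layers (near-linear time, constant-size decode state) can be interleaved with a small number of standard softmax-attention layers. The layer ratio, feature map, and dimensions below are illustrative assumptions, not the actual Ring-flash-linear-2.0 implementation.

```python
# Illustrative sketch only: interleaving linear attention with standard attention.
# The ratio, feature map, and shapes are assumptions, not the released architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAttention(nn.Module):
    """Causal linear attention: the n x n score matrix is replaced by running
    summaries, so time grows ~linearly with length and decode state stays constant."""
    def __init__(self, dim):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim, bias=False)
        self.out = nn.Linear(dim, dim, bias=False)

    def forward(self, x):
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k = F.elu(q) + 1, F.elu(k) + 1                    # positive feature map
        kv = torch.einsum("btd,bte->btde", k, v).cumsum(1)   # prefix sums of k_i v_i^T
        z = k.cumsum(1)                                      # prefix sums of k_i
        num = torch.einsum("btd,btde->bte", q, kv)
        den = (q * z).sum(-1, keepdim=True).clamp(min=1e-6)
        return self.out(num / den)

class SoftmaxAttention(nn.Module):
    """Standard causal attention, kept on a minority of layers for exact global mixing."""
    def __init__(self, dim, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        n = x.size(1)
        causal = torch.triu(torch.ones(n, n, dtype=torch.bool, device=x.device), 1)
        return self.attn(x, x, x, attn_mask=causal)[0]

def build_hybrid_stack(dim, n_layers, softmax_every=4):
    # Hypothetical ratio: one softmax-attention layer for every four linear ones.
    return nn.ModuleList(
        SoftmaxAttention(dim) if (i + 1) % softmax_every == 0 else LinearAttention(dim)
        for i in range(n_layers)
    )
```

In a production kernel the prefix sums would be computed recurrently, so decode memory stays constant regardless of context length; the sketch materializes them only for readability.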

Enhanced Training Yields Superior Reasoning Capabilities

Building upon the Ling-flash-base-2.0 foundation, the model underwent:

  • Additional training on 1T tokens of high-quality data
  • Stable supervised fine-tuning (SFT)
  • Multi-stage reinforcement learning (RL)

The training process overcame the instability that typically affects reinforcement learning of MoE models on long-chain reasoning, using Ant's proprietary "Icepop" algorithm. Benchmark results demonstrate exceptional capabilities:

  • 86.98 on the AIME 2025 math benchmark
  • 90.23 on the CodeForces programming benchmark
  • Outperforms comparable dense models such as Qwen3-32B on logical reasoning and creative-writing tasks


Long Context Handling Redefines Programming Efficiency

The model natively supports 128K context windows, expandable to 512K using YaRN extrapolation; a configuration sketch follows the lists below. Performance highlights include:

  • Prefill phase throughput nearly 5× higher than Qwen3-32B
  • Decoding phase achieving 10× acceleration
  • Maintains high accuracy even on 32K+ context programming tasks without "model leakage" issues

The system proves particularly effective for:

  • Front-end development
  • Structured code generation
  • Agent simulation scenarios
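As a rough illustration of the YaRN-based extension mentioned above, the following Transformers sketch shows how a user might request a larger RoPE scaling factor at load time. The repository id and the exact rope_scaling schema are assumptions; the official model card documents the supported settings.

```python
# Hedged sketch: extending the native 128K window toward ~512K with YaRN-style RoPE scaling.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "inclusionAI/Ring-flash-linear-2.0-128K"  # hypothetical repo id; check the model card

config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
# Key names ("rope_type" vs "type") and whether the released config exposes this knob
# are assumptions; follow the model card for the officially supported values.
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,                                  # 128K * 4 ≈ 512K
    "original_max_position_embeddings": 131072,
}

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True,
)
```

YaRN rescales the rotary position frequencies so the model can address positions beyond its training window, which is how a factor of 4 over the 128K base yields the quoted ~512K.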

Open Source Availability Accelerates Adoption

The BaiLing team has made the model available on:

    /div>">Hugging Face ">ModelScope ">">">">">">">">">">">Support includes BF16/FP8 formats and easy integration with popular frameworks like Transformers, SGLang, and vLLM."";"""Technical documentation is available on arXiv (https://arxiv.org/abs/2510.19338).""",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,""",,,,"",,"",,"",,"",,"",,"",,"",,"",,"",,"".'''''''''''''''',,,,,,,,,,,,''''',,,,,,,,''''',,,,,,,,''''',,,,,,,,''''',,,,,,,,''''',,,,,,,,''''',,,,,,,,''''',,,,,,,,''''',,,,,,,,''''',,,,,,,,''''',,,,,,,,''''',,,,,,,,''''','', '''Key Points:'''''- Combines hybrid linear attention with MoE architecture'- Achieves SOTA performance with only 6.1B activated parameters'- Native 128K context support expandable to 512K'- Sevenfold efficiency improvement over previous versions'- Available now on Hugging Face and ModelScope

