
Meituan's New AI Model Packs a Punch with Smart Parameter Trick

Rethinking How AI Models Grow

Many large AI models scale up by adding more "experts" - specialized sub-networks that each handle part of the work. But Meituan's LongCat team noticed this approach hits diminishing returns fast. Their solution? A clever workaround they call "Embedding Expansion", which puts the extra parameters into the embedding layer instead of into yet more experts.


The Numbers Behind the Magic

At first glance, LongCat-Flash-Lite seems massive - 68.5 billion parameters total. But here's where it gets interesting: during actual use, only 2.9 to 4.5 billion parameters activate at once. That's like having a sports car that only uses the horsepower it needs for each stretch of road.
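
To put those figures in proportion, here is a quick back-of-the-envelope calculation. It uses only the numbers quoted above; the function name is made up for illustration, and treating the 2.9 to 4.5 billion range as a per-token figure is an assumption based on how sparse models usually work.

```python
TOTAL_PARAMS_B = 68.5        # total parameters, in billions (from the article)
ACTIVE_RANGE_B = (2.9, 4.5)  # active parameters, in billions (from the article)

def active_fraction(active_b: float, total_b: float = TOTAL_PARAMS_B) -> float:
    """Share of the model's parameters that are in use at once."""
    return active_b / total_b

low, high = (active_fraction(a) for a in ACTIVE_RANGE_B)
print(f"Active share: {low:.1%} to {high:.1%}")  # Active share: 4.2% to 6.6%
```

In other words, roughly one parameter in twenty is doing work at any given moment.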

The secret sauce? Over 30 billion parameters dedicated to an N-gram embedding layer, which represents short runs of consecutive tokens rather than single tokens and is exceptionally good at picking up context clues. That pays off wherever exact phrasing matters, such as programming commands or technical jargon.
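
The article does not say how the N-gram embedding layer is built, so the following is only a toy sketch of the general idea: hash a short run of token ids to a row in a table and read the vector stored there. The table size, vector width, hash function and all names are assumptions chosen for illustration, not details of LongCat-Flash-Lite.

```python
import random

TABLE_SIZE = 1_000  # hypothetical; a real table would hold billions of parameters
DIM = 4             # hypothetical embedding width

random.seed(0)
table = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(TABLE_SIZE)]

def ngram_slot(ngram: tuple) -> int:
    """Map an n-gram of token ids to a table row with a simple polynomial hash."""
    h = 0
    for tok in ngram:
        h = (h * 1_000_003 + tok) % TABLE_SIZE
    return h

def ngram_embedding(token_ids: list, position: int, n: int = 2) -> list:
    """Look up the vector for the n-gram that ends at `position`."""
    start = max(0, position - n + 1)
    return table[ngram_slot(tuple(token_ids[start:position + 1]))]

tokens = [5, 7, 9, 5, 7]
# The bigram (5, 7) appears twice and maps to the same stored vector both times.
print(ngram_embedding(tokens, 1) == ngram_embedding(tokens, 4))  # True
```

The point of the sketch is that the lookup costs one hash and one table read, no matter how large the table grows.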


Engineering for Real-World Speed

All this clever architecture wouldn't mean much if the model crawled along. Meituan's engineers made sure that didn't happen with three key optimizations:

  • Smart Parameter Management: Nearly half the model's brainpower sits in its embedding layer, which works more like a quick dictionary lookup than heavy computation.
  • Custom Hardware Tricks: They built specialized caching (think of it as short-term memory) and fused operations together to cut down on processing delays.
  • Predictive Processing: The model drafts likely upcoming tokens ahead of time and then checks them, an approach usually called speculative decoding - like a chess player thinking several moves ahead.
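
The "guess ahead" idea in the last bullet can be illustrated with a toy draft-and-verify loop in the style of speculative decoding. This is a simplified sketch, not Meituan's implementation: the two "models" are stand-in functions, and every name here is hypothetical.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]

def speculative_step(context: List[int], draft: NextToken,
                     target: NextToken, k: int = 4) -> List[int]:
    """Draft k tokens cheaply, then keep the prefix the target model agrees with."""
    # 1. The cheap draft model proposes k tokens, one after another.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(context + proposed))

    # 2. The target model checks each proposal in order. Real systems verify
    #    all proposals in one batched pass; the loop here is for readability.
    accepted: List[int] = []
    for tok in proposed:
        expected = target(context + accepted)
        if tok != expected:
            accepted.append(expected)  # keep the target's own token and stop
            break
        accepted.append(tok)
    return accepted

# Stand-in models: the target counts upward modulo 10,
# and the draft agrees with it except right after a 3.
def target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10

def draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10

result = speculative_step([1], draft, target, k=4)
print(result)  # [2, 3, 4]
```

The accepted tokens are exactly what the target would have produced on its own, so the output does not change. The gain is that several tokens can be confirmed per expensive step whenever the draft guesses well.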

The payoff? Blazing speeds of 500-700 tokens per second (that's several paragraphs of text every second) and a context window of up to 256,000 tokens - perfect for analyzing lengthy reports or codebases.
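
For a feel of what that speed means in practice, here is a small calculation based on the quoted 500-700 tokens per second. The 1,000-token response length is an arbitrary example, and the function name is made up for illustration.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time needed to generate `num_tokens` at a steady decoding rate."""
    return num_tokens / tokens_per_second

for rate in (500, 700):
    secs = generation_seconds(1_000, rate)
    print(f"{rate} tokens/s: 1,000 tokens in {secs:.2f} s")
# 500 tokens/s: 1,000 tokens in 2.00 s
# 700 tokens/s: 1,000 tokens in 1.43 s
```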

Benchmark Buster

When put through its paces, LongCat-Flash-Lite surprised even its creators:

  • Specialized Tasks: Outperformed rivals in telecom, retail, and aviation scenarios on industry-standard tests.
  • Coding Prowess: Solved over half the problems in SWE-Bench (a tough coding challenge) and crushed terminal command tests with a score nearly double some competitors'.
  • General Smarts: Held its own against Google's Gemini 2.5 Flash-Lite on broad knowledge tests and tackled advanced math problems with ease.

The best part? Meituan has open-sourced everything - the model itself, detailed technical papers, even their custom inference engine. Developers can try it out today with a generous daily free allowance of 50 million tokens.

Key Points:

  • Meituan challenges conventional AI scaling with innovative "Embedding Expansion"
  • Model activates only 2.9B to 4.5B of its 68.5B parameters at a time for efficiency
  • Excels at technical domains like programming while maintaining broad competence
  • Open-source release includes weights, research papers, and optimized inference tools

