AI2's Molmo 2 Brings Open-Source Video Intelligence to Your Fingertips

A New Era of Open Video Intelligence

The Allen Institute for Artificial Intelligence (AI2) is shaking up the AI world again with its latest release: Molmo 2. This isn't just another language model - it's specifically designed to understand videos and images, and best of all, it's completely open-source.

What's Under the Hood?

Molmo 2 comes in several flavors:

  • Molmo2-4B & Molmo2-8B: Built on Alibaba's Qwen3 foundation
  • Molmo2-O-7B: A fully transparent version using AI2's own Olmo architecture

The package includes nine new datasets covering everything from multi-image analysis to video tracking - essentially giving developers the building blocks to create custom video understanding systems.

Why This Matters for Businesses

Ranjay Krishna, who leads perception research at AI2, explains what sets Molmo 2 apart: "These models don't just answer questions - they can pinpoint exactly when and where events happen in videos." Imagine asking "When did the player score?" and getting not just the answer but the exact timestamp.

The models pack some impressive capabilities:

  • Generating detailed video descriptions
  • Counting objects across frames
  • Spotting rare events in long footage

The Open-Source Advantage

In an industry where most powerful models are locked behind corporate walls, AI2's commitment to openness stands out. As analyst Bradley Shimmin notes: "For companies worried about data sovereignty or needing custom solutions, having full access to model weights and training data is invaluable."

The relatively compact size (4B-8B parameters) makes Molmo 2 practical for real-world deployment. Shimmin adds: "Enterprises are realizing bigger isn't always better - what matters is having control and understanding of your AI tools."

Try It Yourself

Curious developers can test drive Molmo 2 themselves. The complete project details are available at allenai.org/blog/molmo2.

Key Points:

  • Open access: Full model weights and training data available
  • Video smarts: Understands temporal events and spatial relationships
  • Developer friendly: Multiple size options balance capability with efficiency
  • Transparent AI: Complete visibility into how models were built

