Skip to main content

Jan's New AI Model Outshines Google Gemini in Long-Term Tasks

Jan's Breakthrough AI Model Sets New Standard for Reliability

In the race to create AI that doesn't just think but reliably acts, the open-source Jan team has pulled ahead with their latest release. Jan-v2-VL-Max isn't just another large language model - it's specifically engineered to solve one of artificial intelligence's most frustrating limitations: the tendency to veer off course during extended tasks.

Image

Solving the "Error Snowball" Problem

Anyone who's worked with AI assistants knows the frustration - small mistakes early in a process compound into complete failures later. Current multimodal agents struggle particularly with long sequences like automated UI operations or cross-application workflows. The Jan team calls this "error accumulation," where minor deviations become major derailments.

Their solution? A clever adaptation called RLVR (Reinforced Long-horizon Vision-Language Reasoning) technology. Built on LoRA architecture, this innovation maintains the Qwen3-VL-30B base model's capabilities while dramatically improving consistency. The result? An AI that can accurately complete dozens of steps without losing its way.

Benchmark-Busting Performance

The proof comes in specialized testing. In the "Hallucination-Decay Return" (HDR) benchmark - which measures how quickly an AI's performance degrades during prolonged tasks - Jan-v2-VL-Max leaves competitors eating its digital dust. It maintains stability where others falter, outperforming not just Google's Gemini 2.5 Pro but also DeepSeek R1.

Image

Designed for Real-World Use

The Jan team hasn't just built impressive tech - they've made it accessible:

  • Web interface: Upload images and test multi-step processes without coding
  • Local deployment: Optimized vLLM solution runs efficiently on consumer GPUs
  • Integration-ready: Developers can easily incorporate it into existing systems

The implications are significant for fields like UI automation, robotics, and multi-tool collaboration.

Why This Matters Now

As AI transitions from dazzling demos to daily tools, reliability becomes paramount. While competitors chase headline-grabbing capabilities, Jan focuses on making AI you can actually depend on when it matters most.

The model represents more than technical achievement—it signals a shift in priorities from "smart" to "steady," from flashy single responses to trustworthy extended performance.

Key Points:

  • 30-billion parameter multimodal model excels at long-term tasks
  • Solves "error accumulation" problem plaguing current AI agents
  • Outperforms Google Gemini 2.5 Pro in stability benchmarks
  • Offers both web interface and efficient local deployment
  • Marks shift toward reliability-focused AI development

Enjoyed this article?

Subscribe to our newsletter for the latest AI news, product reviews, and project recommendations delivered to your inbox weekly.

Weekly digestFree foreverUnsubscribe anytime

Related Articles

News

DeepSeek V4 Emerges: A Trillion-Parameter AI with Million-Token Memory

China's DeepSeek is preparing to unveil its V4 AI model, boasting groundbreaking capabilities that could reshape the industry. The trillion-parameter system features native multimodal processing and an unprecedented 1 million token context window - enough to digest entire books at once. In a strategic shift, DeepSeek prioritized optimization for domestic hardware partners like Huawei over foreign chipmakers, signaling China's growing AI independence. With internal testing already underway, the tech world eagerly awaits what could be a game-changing release.

February 26, 2026
Artificial IntelligenceDeepSeekAI Development
Alibaba Cloud Slashes AI Coding Costs With Budget-Friendly Bundles
News

Alibaba Cloud Slashes AI Coding Costs With Budget-Friendly Bundles

As AI agents drive up computing costs, Alibaba Cloud responds with an affordable solution. Their upgraded Coding Plan now includes four premium programming models and introduces family-friendly pricing starting at just 7.9 yuan for new users. This strategic move could democratize AI development tools for smaller teams and individual coders.

February 25, 2026
Cloud ComputingAI DevelopmentTech Pricing
News

Google's AI Crackdown: Developers Face Bans for Using Open-Source Tools

Google has sparked controversy by banning developers who use open-source AI tools like OpenClaw on its Antigravity platform. The tech giant appears to be tightening control over its AI ecosystem, leaving many developers frustrated and questioning the move's impact on innovation. While Google cites intellectual property concerns, critics argue this could stifle competition in the rapidly evolving AI landscape.

February 25, 2026
GoogleAI DevelopmentOpen Source
Moonshot AI's Kimi K2.5 Achieves Remarkable Profitability Milestone
News

Moonshot AI's Kimi K2.5 Achieves Remarkable Profitability Milestone

Moonshot AI's latest model, Kimi K2.5, has stunned the tech world by generating more revenue in its first 20 days than all of 2025 combined. The breakthrough comes primarily from overseas users and developers embracing its API services, propelling the company's valuation past $10 billion. Founder Yang Zhilin confirms the company is well-funded with no immediate IPO plans.

February 24, 2026
Artificial IntelligenceTech StartupsMachine Learning
News

Chinese AI Models Capture Global Spotlight During Lunar New Year

Chinese artificial intelligence models made waves internationally during the 2026 Spring Festival, capturing over 60% market share on OpenRouter's developer platform. Three domestic models - MiniMax M2.5, Kimi K2.5, and Zhipu GLM-5 - dominated the rankings by offering superior coding and automation capabilities at remarkably low costs. Their success highlights China's growing influence in AI productivity tools.

February 24, 2026
Artificial IntelligenceChinese TechDeveloper Tools
Google's Gemini 3.1 Pro Outshines Competitors With Breakthrough Reasoning Skills
News

Google's Gemini 3.1 Pro Outshines Competitors With Breakthrough Reasoning Skills

Google has unveiled Gemini 3.1 Pro, its most advanced AI model yet, showcasing remarkable improvements in logical reasoning and problem-solving. The new architecture delivers more than double the performance of its predecessor in critical tests, even surpassing GPT-5.2 in some benchmarks. Beyond raw power, Gemini 3.1 Pro introduces innovative multimodal capabilities, handling ultra-long contexts and generating visual representations of complex concepts.

February 24, 2026
AI InnovationGoogle TechMachine Learning