
Jan's New AI Model Outshines Google Gemini in Long-Term Tasks

Jan's Breakthrough AI Model Sets New Standard for Reliability

In the race to build AI that doesn't just think but reliably acts, the open-source Jan team has pulled ahead with its latest release. Jan-v2-VL-Max isn't just another large language model: it's engineered specifically to solve one of artificial intelligence's most frustrating limitations, the tendency to veer off course during extended tasks.


Solving the "Error Snowball" Problem

Anyone who's worked with AI assistants knows the frustration: small mistakes early in a process compound into complete failures later. Current multimodal agents struggle particularly with long sequences such as automated UI operations or cross-application workflows. The Jan team calls this "error accumulation," where minor deviations become major derailments.

Their solution is an adaptation the team calls RLVR (Reinforced Long-horizon Vision-Language Reasoning). Implemented as a LoRA adapter on top of the Qwen3-VL-30B base model, it preserves the base model's capabilities while dramatically improving consistency. The result is an AI that can accurately complete dozens of steps without losing its way.
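The article doesn't share implementation details, but the LoRA technique it names can be sketched in a few lines: a frozen base weight matrix is augmented with a trainable low-rank update, so fine-tuning touches only a small fraction of the parameters while the base model stays intact. The dimensions below are toy values for illustration, not Jan's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; in practice r is much smaller than the weight dimensions,
# which is what keeps the adapter cheap to train and store.
d_out, d_in, r, alpha = 8, 8, 2, 16

W = rng.standard_normal((d_out, d_in))      # frozen base weight (not trained)
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # zero-initialized, so the adapter starts inert

def adapted_forward(x):
    # Base path plus the low-rank update, scaled by alpha / r
    # following the standard LoRA convention.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)

# With B still zero, the adapted model matches the base model exactly,
# which is why LoRA preserves base capabilities at initialization.
assert np.allclose(adapted_forward(x), W @ x)

# The adapter adds far fewer parameters than the base weight holds.
assert A.size + B.size < W.size
```

Training then updates only `A` and `B`; merging the product back into `W` afterwards recovers a single weight matrix with no inference overhead.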

Benchmark-Busting Performance

The proof comes from specialized testing. On the "Hallucination-Decay Return" (HDR) benchmark, which measures how quickly an AI's performance degrades during prolonged tasks, Jan-v2-VL-Max maintains stability where others falter, outperforming not only Google's Gemini 2.5 Pro but also DeepSeek R1.


Designed for Real-World Use

The Jan team hasn't just built impressive tech - they've made it accessible:

  • Web interface: Upload images and test multi-step processes without coding
  • Local deployment: An optimized vLLM setup runs efficiently on consumer GPUs
  • Integration-ready: Developers can easily incorporate it into existing systems

The implications are significant for fields like UI automation, robotics, and multi-tool collaboration.

Why This Matters Now

As AI transitions from dazzling demos to daily tools, reliability becomes paramount. While competitors chase headline-grabbing capabilities, Jan focuses on making AI you can actually depend on when it matters most.

The model represents more than a technical achievement: it signals a shift in priorities from "smart" to "steady," from flashy single responses to trustworthy extended performance.

Key Points:

  • 30-billion parameter multimodal model excels at long-term tasks
  • Solves "error accumulation" problem plaguing current AI agents
  • Outperforms Google Gemini 2.5 Pro in stability benchmarks
  • Offers both web interface and efficient local deployment
  • Marks shift toward reliability-focused AI development


Related Articles

DeepSeek-V4 Set to Revolutionize Code Generation This February
News

DeepSeek is gearing up to launch its powerful new AI model, DeepSeek-V4, around Chinese New Year. The update promises major leaps in code generation and handling complex programming tasks, potentially outperforming competitors like Claude and GPT series. Developers can expect more organized responses and better reasoning capabilities from this innovative tool.

January 12, 2026
AI Development · Programming Tools · Machine Learning
Chinese AI Model Stuns Tech World with Consumer GPU Performance
News

Jiukun Investment's new IQuest-Coder-V1 series is turning heads in the AI community. This powerful code-generation model, running on a single consumer-grade GPU, outperforms industry giants like Claude and GPT-5.2 in coding tasks. Its unique 'code flow' training approach mimics real-world development processes, offering developers unprecedented creative possibilities while keeping hardware requirements surprisingly accessible.

January 4, 2026
AI Development · Machine Learning · Code Generation
Anthropic's Cowork: The AI Coding Assistant Built by AI in Just 10 Days
News

Anthropic has unveiled Cowork, a groundbreaking AI programming assistant developed primarily by its own Claude model in mere days. Designed to democratize coding, Cowork lets users complete tasks through simple voice commands, though Anthropic cautions about potential risks. The tool's rapid development showcases AI's growing capability to build itself.

January 14, 2026
AI Development · Programming Tools · Anthropic
South Korea's AI Dream Hits Snag as Firms Rely on Chinese Code
News

South Korea's ambitious plan to build a homegrown AI industry has hit turbulence after three finalists in a government-backed competition were found using Chinese open-source code. While companies defend their approach as standard practice, the revelations have sparked debate about what truly constitutes 'self-reliant' AI development in today's interconnected tech landscape.

January 14, 2026
AI Development · South Korea Tech · Open Source Controversy
Claude Code Goes Desktop: A Developer's New Best Friend
News

Anthropic has launched a desktop preview of Claude Code, bringing AI-assisted programming to developers' fingertips with a sleek graphical interface. The standout feature? Multi-session workflows powered by Git Worktree technology let developers juggle tasks without messy code conflicts. It's like having multiple coding assistants working simultaneously in isolated branches. The desktop version plays nice with cloud tools and automatically syncs your development environment, with no more npm headaches. Currently available for macOS and Windows (excluding ARM), this release transforms Claude from chatbot to full-fledged coding companion.

January 7, 2026
AI Development · Coding Tools · Productivity Tech
DeepSeek Finds Smarter AI Doesn't Need Bigger Brains
News

DeepSeek's latest research reveals a breakthrough in AI development: optimizing neural network architecture can boost reasoning abilities more effectively than simply scaling up model size. Their innovative 'Manifold-Constrained Hyper-Connections' approach improved complex reasoning accuracy by over 7% while adding minimal training costs, challenging the industry's obsession with ever-larger models.

January 4, 2026
AI Research · Machine Learning · Neural Networks