NavFoM: World's First Cross-Embodiment Navigation AI Model Launched

NavFoM: A Breakthrough in Unified Robot Navigation

In a significant advancement for robotics and AI, Galaxy General has partnered with research teams from Peking University, the University of Adelaide, and Zhejiang University to launch NavFoM (Navigation Foundation Model), the world's first cross-embodiment, full-scene panoramic navigation foundation model.

Unified Framework for Diverse Navigation Tasks

The model marks a paradigm shift by integrating diverse robot navigation tasks into a single framework. These include:

  • Vision-and-language navigation
  • Goal-oriented navigation
  • Visual tracking
  • Autonomous driving applications

Dr. Chen Wei, lead researcher at Galaxy General, explains: "NavFoM eliminates the need for specialized models for each navigation task. Our approach mirrors how humans use the same cognitive framework to navigate different environments."

Zero-Shot Operation Across Environments

One of NavFoM's most remarkable features is its full-scenario support. The model operates in both indoor and outdoor environments without prior knowledge or pre-built maps. This means:

  • No additional data collection needed for new environments
  • Immediate deployment in unseen locations
  • Reduced setup time and costs for implementation

The system achieves this through advanced machine learning techniques that allow it to generalize from its training to novel situations.

Multi-Task Support Through Natural Language

The model's multi-task support capabilities enable diverse functions through natural language instructions, including:

  • Target following
  • Autonomous navigation
  • Complex route planning

This flexibility allows various robotic platforms - from robotic dogs to drones and autonomous vehicles - to operate efficiently within the same framework.

Technical Innovations: TVI Tokens and BATS Strategy

The research team introduced two groundbreaking technical components:

  1. TVI Tokens (Temporal-Viewpoint-Indexed Tokens): Enable the model to understand the temporal sequences and directional information critical for navigation tasks.
  2. BATS strategy (Budget-Aware Token Sampling): Allows optimal performance even with limited computational resources, making the model practical for real-world applications.
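The article does not detail how BATS works internally; a minimal sketch of the general idea behind budget-aware token sampling - keep only the most relevant visual tokens within a fixed compute budget while preserving temporal order - with all function names, scores, and shapes illustrative rather than taken from NavFoM:

```python
import numpy as np

def budget_aware_token_sampling(tokens, scores, budget):
    """Keep the `budget` highest-scoring tokens, preserving temporal order.

    tokens: (T, D) array of per-frame visual tokens
    scores: (T,) relevance scores (e.g. attention-derived)
    budget: maximum number of tokens the model may attend to
    """
    if len(tokens) <= budget:
        return tokens
    # Indices of the top-`budget` scores, re-sorted into time order
    keep = np.sort(np.argsort(scores)[-budget:])
    return tokens[keep]

# Toy example: 8 frame tokens of dimension 3, budget of 4
tokens = np.arange(8)[:, None] * np.ones((8, 3))
scores = np.array([0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.4, 0.6])
sampled = budget_aware_token_sampling(tokens, scores, budget=4)
print(sampled.shape)  # (4, 3): frames 1, 3, 5, 7 survive, still in order
```

The key design point such a strategy exploits is that dropping low-relevance tokens shrinks the attention cost, which grows with sequence length, so a fixed budget keeps inference latency bounded on constrained hardware.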

The team compiled an unprecedented training dataset containing:

  • 8 million cross-task, cross-embodiment navigation samples
  • 4 million open-ended question-and-answer pairs

This represents twice the training volume of previous models in this field.

Future Applications and Development

The release of NavFoM opens new possibilities for robotics development. According to Professor Li Ming of Peking University: "Developers can now build specialized applications on this foundation model through transfer learning, significantly reducing development time while improving performance." Potential applications span:

  • Smart city infrastructure
  • Search and rescue operations
  • Industrial automation
  • Personal assistance robotics
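The transfer-learning workflow Professor Li describes typically means freezing the pretrained foundation model and training only a small task-specific head. A minimal PyTorch sketch of that pattern - the backbone here is a stand-in module, not the actual NavFoM architecture or API:

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained navigation backbone (illustrative only)
backbone = nn.Sequential(nn.Linear(128, 256), nn.ReLU())
for p in backbone.parameters():
    p.requires_grad = False  # freeze foundation-model weights

# Small task-specific head, e.g. 4 discrete navigation actions
head = nn.Linear(256, 4)
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(2, 128)        # batch of fused observation features
logits = head(backbone(x))     # gradients flow only into the head
loss = logits.sum()            # placeholder objective
loss.backward()
print(backbone[0].weight.grad is None, head.weight.grad is not None)
```

Because only the head's parameters receive gradients, fine-tuning needs far less data and compute than training from scratch, which is the development-time saving the quote refers to.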

The research team plans to release an open-source version of NavFoM later this year to accelerate innovation in the field.

Key Points:

🌟 First unified navigation model combining multiple robot tasks under one framework
🏞️ Zero-shot operation in both indoor/outdoor environments without prior mapping
💬 Natural language control enables intuitive human-machine interaction
💡 TVI Tokens & BATS strategy provide technical advantages in understanding and resource management
📊 Unprecedented training dataset with 12 million data points ensures robust performance
