Research · Apr 18, 2026

DeepMind releases Gemini Robotics-ER 1.6 with enhanced spatial reasoning for autonomous tasks

The upgraded model improves embodied reasoning capabilities including pointing, counting, and a new instrument-reading feature developed in collaboration with Boston Dynamics.

Trust: 59
Hype: Some hype

1 source

TL;DR
  • DeepMind introduced Gemini Robotics-ER 1.6, an upgraded reasoning-first model designed to help robots understand and interact with physical environments.
  • The model shows stronger spatial reasoning and multi-view understanding, and adds an instrument-reading function for gauges and sight glasses.
  • The model is available to developers immediately via the Gemini API and Google AI Studio, with benchmark comparisons showing improvements over previous versions.
  • Key capabilities include pointing for spatial reasoning, counting, success detection, and task planning through integration with tools like Google Search and vision-language-action models.

DeepMind has released Gemini Robotics-ER 1.6, a specialized reasoning model designed to help robots understand and navigate physical environments with greater precision. The model builds on embodied reasoning, the ability to connect digital intelligence with real-world action, and focuses on spatial understanding and multi-view perception, a core challenge in autonomous robotics.

The model's core capabilities center on spatial reasoning tasks essential for robotic systems. Pointing serves as a foundation for several reasoning patterns: robots can use points to detect and count objects, establish spatial relationships, map movement trajectories, and apply logical constraints to determine which items meet specific criteria. In demonstration tests, the system accurately counted tools by category and did not report items that were absent from the image.
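To make the pointing-then-counting pattern concrete, here is a minimal sketch of how a developer might post-process a pointing reply. It makes no API call. The reply format (a JSON list of labelled points with `[y, x]` coordinates normalised to 0-1000), the sample data, and the helper names `parse_points`, `count_by_label`, and `to_pixels` are all assumptions made for illustration, not details confirmed by the source.

```python
import json
from collections import Counter

# Hypothetical model reply. The format (a JSON list of labelled points with
# [y, x] coordinates normalised to 0-1000) is an assumption for this sketch.
SAMPLE_REPLY = """
[
  {"point": [412, 130], "label": "hammer"},
  {"point": [655, 702], "label": "pliers"},
  {"point": [230, 488], "label": "pliers"}
]
"""


def parse_points(reply: str) -> list[dict]:
    """Parse the reply and keep only well-formed point entries."""
    points = []
    for item in json.loads(reply):
        point = item.get("point")
        if isinstance(point, list) and len(point) == 2 and "label" in item:
            points.append(item)
    return points


def count_by_label(points: list[dict], requested: list[str]) -> dict[str, int]:
    """Count points per requested label; labels with no points count as zero."""
    seen = Counter(p["label"] for p in points)
    return {label: seen.get(label, 0) for label in requested}


def to_pixels(point: list[int], width: int, height: int) -> tuple[int, int]:
    """Convert a normalised [y, x] point to integer (x, y) pixel coordinates."""
    y, x = point
    return round(x / 1000 * width), round(y / 1000 * height)


points = parse_points(SAMPLE_REPLY)
counts = count_by_label(points, ["hammer", "pliers", "scissors"])
```

Asking for a category that is not in the image ("scissors" here) yields a count of zero rather than an error, which mirrors the no-false-positive behaviour described above.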

A newly developed instrument-reading capability extends the model's sensing abilities. Developed in partnership with Boston Dynamics, this feature enables robots to interpret complex analog gauges and sight glasses, a practical requirement for industrial and facility monitoring tasks where manual reading becomes a bottleneck.
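For readers unfamiliar with what reading an analog gauge involves, the underlying arithmetic is a linear interpolation from needle angle to scale value. The sketch below illustrates that arithmetic only. It is not a description of how the model works internally, and the function name `gauge_value` and its parameters are hypothetical.

```python
def gauge_value(
    needle_deg: float,
    min_deg: float,
    max_deg: float,
    min_value: float,
    max_value: float,
) -> float:
    """Map a needle angle to a reading on a linear gauge scale.

    min_deg and max_deg are the needle angles at the scale's two end marks;
    min_value and max_value are the readings printed at those marks.
    """
    if max_deg == min_deg:
        raise ValueError("scale end marks must be at different angles")
    fraction = (needle_deg - min_deg) / (max_deg - min_deg)
    # Clamp so a needle resting past an end stop reads as the end value.
    fraction = min(1.0, max(0.0, fraction))
    return min_value + fraction * (max_value - min_value)


# A needle halfway around a 270-degree, 0-10 bar scale.
reading = gauge_value(135, 0, 270, 0, 10)
```

Real gauges add complications this sketch ignores, such as nonlinear scales, multiple needles, and perspective distortion from the camera angle.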

The model functions as a high-level reasoning layer that coordinates with other systems. It natively supports tool calling, including Google Search for information retrieval and vision-language-action (VLA) models, specialized systems that map visual inputs to motor commands. This architecture lets it orchestrate multi-step tasks that require both reasoning and direct physical control.
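The orchestration pattern described here can be sketched as a small dispatcher: a planner emits steps, and each step is routed to a named tool. Everything in this example is hypothetical. The `Step` type, the tool names, and the two stub functions stand in for a real search call and a real VLA model, and nothing here reflects the actual Gemini API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    tool: str      # name of the tool to call
    argument: str  # plain-text argument passed to that tool


def search_stub(query: str) -> str:
    """Stand-in for an information-retrieval tool (hypothetical)."""
    return f"search results for: {query}"


def vla_stub(instruction: str) -> str:
    """Stand-in for a vision-language-action model (hypothetical)."""
    return f"executed: {instruction}"


TOOLS: dict[str, Callable[[str], str]] = {
    "search": search_stub,
    "vla": vla_stub,
}


def run_plan(plan: list[Step]) -> list[str]:
    """Run each step in order, skipping steps that name an unknown tool."""
    log = []
    for step in plan:
        tool = TOOLS.get(step.tool)
        if tool is None:
            log.append(f"skipped unknown tool: {step.tool}")
            continue
        log.append(tool(step.argument))
    return log


plan = [
    Step("search", "how to sort recycling"),
    Step("vla", "pick up the can"),
    Step("vla", "place it in the blue bin"),
]
log = run_plan(plan)
```

In a real system the plan would come from the reasoning model and could be revised after each step, for example when a success check fails. This sketch runs a fixed plan to keep the routing logic visible.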

Benchmark comparisons show measurable improvements against both the prior Robotics-ER version and the general-purpose Gemini 3.0 Flash model, particularly in spatial and physical reasoning dimensions. The model is now available through the Gemini API and Google AI Studio, with developer documentation and example code provided to facilitate implementation.

Sources
  1. Google DeepMind Blog. Gemini Robotics-ER 1.6: Powering real-world robotics tasks through enhanced embodied reasoning

Stories may contain errors. Dispatch is assembled with AI assistance and curated by human editors; despite the trust-score filter, mistakes happen. We correct publicly — every article links to its revision history. Nothing here is financial, legal, or medical advice. Verify before relying on any claim.

© 2026 Dispatch. No ads. No sponsorships. No paid placement. Reader-supported via Ko-fi.

Built by a person who cares about honest AI news.