
TL;DR
- Google DeepMind has launched Gemini Robotics On-Device, a lightweight AI model for local robot control without internet.
- The new model rivals cloud-based performance, enabling real-time, offline robotics tasks like unzipping bags and folding clothes.
- Developers can direct robots with natural language prompts and fine-tune new tasks from demonstrations using the new Gemini Robotics SDK.
- Robots using the model successfully tackled unseen tasks, including industrial assembly.
- The model runs on various hardware including ALOHA, Franka FR3, and Apollo humanoid robots.
- Competitors in the space include Nvidia, Hugging Face, and RLWRLD.
Google Unveils Gemini Robotics On-Device: AI That Runs Offline
Google DeepMind has unveiled Gemini Robotics On-Device, a new iteration of its robotics-focused language model that can execute complex tasks entirely offline on a robot’s local hardware, with no cloud connection needed.
The model is a follow-up to March’s Gemini Robotics release and allows robots to interpret and perform commands in natural language while processing everything locally.
“This is a pivotal moment in robotics—AI can now guide physical actions in real-world settings without constant server access,” said a DeepMind engineer during the product briefing.
What the New Gemini Robotics Model Can Do
The on-device version has been tested in real-world scenarios and, according to Google, performs close to the cloud-based Gemini Robotics model. The company has not named the competing models it benchmarked against, but it claims the on-device model outperforms them across standard robotics evaluation suites.
In demos, the model successfully powered robots performing household and industrial tasks, including:
- Unzipping and sorting bags
- Folding clothing
- Interacting with unfamiliar assembly lines
One notable test featured the dual-arm Franka FR3 robot completing an industrial assembly task with objects it had never seen before, an indicator of zero-shot adaptability.
Broad Hardware Compatibility: From ALOHA to Apollo
Originally trained for the ALOHA robot platform, Gemini Robotics On-Device has also been deployed on the dual-arm Franka FR3 and Apptronik’s Apollo humanoid.
Google stated that its model adapts across form factors with minimal tuning, thanks to generalist design principles built into Gemini’s architecture.
Training with the Gemini Robotics SDK: 50 to 100 Demos Required
To support developers, Google is releasing a Gemini Robotics SDK that lets them teach robots new tasks through demonstration-based learning. Developers can train a new task with as few as 50 to 100 demonstrations and test the results in simulation using the MuJoCo physics engine.
This enables streamlined fine-tuning and task generalization, all on-device.
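To make the idea of demonstration-based learning concrete, here is a toy sketch in plain Python. It does not use the Gemini Robotics SDK or MuJoCo, and none of the names in it (`Step`, `record_demo`, `fit_gain`, `rollout`) come from Google. It assumes a one-dimensional gripper, a hypothetical "expert" that moves toward a target with a fixed gain, and a linear policy fitted to 50 recorded demonstrations by least squares (a simple form of behavior cloning).

```python
import random
from dataclasses import dataclass


@dataclass
class Step:
    error: float   # observation: target position minus current position
    action: float  # velocity the expert commanded for that observation


def record_demo(rng, gain=0.5, steps=20):
    """Simulate one expert demonstration of moving a 1-D gripper to a target."""
    pos = rng.uniform(-1.0, 1.0)
    target = rng.uniform(-1.0, 1.0)
    demo = []
    for _ in range(steps):
        error = target - pos
        action = gain * error      # hypothetical expert: proportional controller
        demo.append(Step(error, action))
        pos += action              # toy dynamics: position integrates the action
    return demo


def fit_gain(demos):
    """Behavior cloning for a linear policy: least-squares fit of action = k * error."""
    num = sum(s.error * s.action for d in demos for s in d)
    den = sum(s.error * s.error for d in demos for s in d)
    return num / den


def rollout(gain, pos, target, steps=20):
    """Run the learned policy from a start position and return the final position."""
    for _ in range(steps):
        pos += gain * (target - pos)
    return pos


rng = random.Random(0)                          # fixed seed for repeatability
demos = [record_demo(rng) for _ in range(50)]   # low end of the 50-100 range
learned_gain = fit_gain(demos)                  # recovers the expert's gain
final_pos = rollout(learned_gain, pos=0.0, target=1.0)
print(f"learned gain: {learned_gain:.3f}, final position: {final_pos:.6f}")
```

A real robot policy maps camera images and language instructions to joint commands, so the fitting step is far more involved, but the workflow has the same shape: record demonstrations, fit a policy to them, then evaluate the policy on a task instance it was not shown.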
Growing Competitive Landscape in Robotics AI
Google’s latest launch lands in a rapidly growing field of robotics-focused foundation models:
- Nvidia is developing a full-stack platform for humanoid robotics foundation models.
- Hugging Face is combining open-source models with real hardware, pursuing a community-driven approach.
- South Korea’s RLWRLD is also building foundation models for robotic generalization, backed by Mirae Asset.
Gemini Robotics On-Device at a Glance
| Feature | Gemini Robotics On-Device | Source |
| --- | --- | --- |
| Offline Execution | Yes, fully local without cloud | Google DeepMind |
| Hardware Supported | ALOHA, Franka FR3, Apollo humanoid | Apptronik |
| Training Requirement | 50–100 demos via the Gemini Robotics SDK, with MuJoCo for simulation | DeepMind SDK |
| Notable Competitors | Nvidia, Hugging Face, RLWRLD | Nvidia, Hugging Face |
| Cloud vs. Local Performance | Near-parity with cloud Gemini Robotics | DeepMind |