Google’s DeepMind robotics team has introduced three significant advances aimed at improving how robots make decisions in real-world environments. Among these innovations is a new system for collecting training data, accompanied by what Google calls a “Robot Constitution.” Together, these developments are meant to help robots make quicker, smarter, and safer decisions while working alongside humans.
AutoRT: An advanced data collection system
DeepMind’s AutoRT is a data collection system designed to help robots understand their surroundings, adapt to new situations, and select appropriate tasks. It pairs a Visual Language Model (VLM) with a Large Language Model (LLM) working in tandem.
The VLM is responsible for perceiving the robot’s environment and identifying visible objects. Meanwhile, the LLM serves as the decision-maker, suggesting tasks that the robot can perform. This dual-model approach enables robots to make informed choices and navigate their surroundings more efficiently.
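The perceive-then-propose loop described above can be sketched in a few lines. Everything here is illustrative: the class names, stand-in functions, and hard-coded scene are assumptions for the sake of the example, not DeepMind’s API.

```python
# Hypothetical sketch of AutoRT's two-model pipeline: a VLM perceives the
# scene, then an LLM proposes tasks grounded in what was seen.
from dataclasses import dataclass


@dataclass
class Scene:
    objects: list[str]  # objects the VLM reports seeing


def vlm_describe(image) -> Scene:
    """Stand-in for the VLM: maps a camera image to a list of visible objects."""
    # A real system would run a vision-language model on the image here.
    return Scene(objects=["sponge", "table", "soda can"])


def llm_propose_tasks(scene: Scene) -> list[str]:
    """Stand-in for the LLM: suggests tasks grounded in the perceived scene."""
    # A real system would prompt a large language model with the scene description.
    return [f"pick up the {obj}" for obj in scene.objects if obj != "table"]


scene = vlm_describe(image=None)
tasks = llm_propose_tasks(scene)
# tasks now holds candidate actions grounded in the perceived objects
```

The point of the split is that each model does what it is good at: the VLM grounds the robot in what is physically present, and the LLM reasons over that grounding to suggest feasible tasks.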
Inspired by Asimov’s ‘Three Laws of Robotics’
The Robot Constitution, inspired by science fiction writer Isaac Asimov’s “Three Laws of Robotics,” functions as a set of safety guidelines for AI-powered robots. It instructs the LLM to avoid tasks involving humans, animals, sharp objects, and electrical appliances, protecting both the robots and the people around them.
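One way to picture the constitution’s role is as a filter applied to the LLM’s proposed tasks. The banned categories below follow the article, but the keyword lists and the filtering approach are assumptions; in practice the constitution is expressed as prompts to the LLM rather than a hand-coded filter.

```python
# Illustrative rule-based filter in the spirit of the "Robot Constitution".
# The categories come from the article; the specific keywords are assumptions.
BANNED_KEYWORDS = {
    "human", "person", "animal", "cat", "dog",   # living beings
    "knife", "scissors", "sharp",                # sharp objects
    "outlet", "toaster", "plug",                 # electrical appliances
}


def passes_constitution(task: str) -> bool:
    """Reject any proposed task that mentions a banned category."""
    words = (w.strip(".,") for w in task.lower().split())
    return not any(w in BANNED_KEYWORDS for w in words)


proposals = ["pick up the sponge", "hand the knife to the person"]
safe = [t for t in proposals if passes_constitution(t)]
# safe keeps only "pick up the sponge"
```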
Enhanced safety measures
DeepMind has taken additional precautions to enhance the safety of these robots. If the force on a robot’s joints exceeds a predetermined limit, the robot automatically comes to a halt. Human operators also have a physical kill switch that can deactivate a robot whenever necessary.
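The two safety layers just described amount to a simple stop condition that is checked continuously. A minimal sketch, assuming a per-joint force reading and a single threshold (the limit value here is invented, not DeepMind’s figure):

```python
# Hypothetical joint-force guard combining both safety layers from the article:
# an automatic force threshold and a human-operated kill switch.
FORCE_LIMIT_NM = 10.0  # newton-metres; illustrative value, not DeepMind's


def should_halt(joint_forces: list[float], kill_switch_pressed: bool) -> bool:
    """Stop if any joint exceeds the force limit or an operator hits the switch."""
    over_limit = any(abs(f) > FORCE_LIMIT_NM for f in joint_forces)
    return kill_switch_pressed or over_limit


halt_a = should_halt([2.0, 11.5, 3.0], kill_switch_pressed=False)  # force too high
halt_b = should_halt([1.0, 2.0], kill_switch_pressed=True)         # operator stop
halt_c = should_halt([1.0, 2.0, 3.0], kill_switch_pressed=False)   # normal operation
```

The design point is that the kill switch bypasses all software-level reasoning: it does not depend on the LLM or the constitution behaving correctly.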
Real-world testing and implementation
Over a span of seven months, Google deployed fifty-three AutoRT robots across four office buildings, conducting more than seventy-seven thousand trials to test the robots’ capabilities. Some of these robots were remotely controlled by human operators, while others followed predefined scripts or operated autonomously using Google’s Robotics Transformer 2 (RT-2) learning model.
The robots themselves have a practical design, each featuring a camera, a robot arm, and a mobile base. The VLM helps each robot understand its surroundings and identify objects, while the LLM guides its choice of tasks so that work is performed effectively and safely.
Introducing SARA-RT: A breakthrough in efficiency
One of the most significant breakthroughs in this endeavor is the introduction of Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT). This system has been developed to optimize Robotics Transformer (RT) models, making them more efficient in real-world applications.
The RT neural network architecture underpins DeepMind’s recent advances in robotic control, most notably the RT-2 model. The best-performing SARA-RT-2 models were 10.6 percent more accurate and 14 percent faster than their RT-2 counterparts when given a short history of images as input.
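DeepMind has described SARA-RT’s core trick, which it calls “up-training,” as converting a pretrained Transformer’s quadratic softmax attention into linear attention. The sketch below shows the generic linear-attention idea (attention computed as φ(Q)(φ(K)ᵀV) with a positive feature map φ), not SARA-RT’s actual method; the feature map and dimensions are assumptions for illustration.

```python
# Generic linear-attention sketch, not SARA-RT's implementation.
# By summarizing keys and values once, cost drops from O(n^2 * d) to O(n * d^2).
import numpy as np


def phi(x):
    """Simple positive feature map (an assumption; real choices vary)."""
    return np.maximum(x, 0) + 1e-6


def linear_attention(Q, K, V):
    Qp, Kp = phi(Q), phi(K)            # feature-mapped queries/keys, shape (n, d)
    KV = Kp.T @ V                      # (d, d): keys and values summarized once
    Z = Qp @ Kp.sum(axis=0)            # (n,): per-query normalizer
    return (Qp @ KV) / Z[:, None]      # (n, d): attention output


rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
out = linear_attention(Q, K, V)
```

Because the (d, d) summary `KV` does not grow with sequence length, the per-query cost stays constant as the input history gets longer, which is the kind of efficiency gain the article attributes to SARA-RT.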
A game-changer for robotics efficiency
The introduction of SARA-RT is being hailed as a game-changer for robotics. DeepMind describes it as the first scalable attention mechanism to significantly improve computational efficiency without compromising the quality of decision-making, a combination that could reshape how robots operate in real-world scenarios, making them more capable, efficient, and safe.
Google’s DeepMind robotics team’s recent advances in AI-driven robotics, including the “Robot Constitution” and the SARA-RT system, mark a significant step forward for the field. These innovations are expected to make robots operating in real-world environments smarter, safer, and more efficient. With extensive testing and impressive results already in hand, Google’s commitment to advancing robotics is evident, promising a future where robots and humans can collaborate seamlessly and safely.