DeepMind’s RoboCat AI
DeepMind has developed an AI model known as RoboCat that can perform a range of tasks using different models of robotic arms. Robots executing a variety of tasks isn’t particularly newsworthy in itself, but DeepMind says RoboCat is the first model able to adapt to and solve multiple tasks using different real-world robots.
RoboCat was inspired by Gato, another DeepMind AI model, which can analyze and act on text, images, and events. RoboCat was trained on images and action data collected from robots in both simulated and real-life environments. The data came from a mix of other robot-controlling models inside virtual environments, humans controlling robots, and previous iterations of RoboCat itself.
To train RoboCat, DeepMind researchers first collected between 100 and 1,000 demonstrations of a task or robot using a human-controlled robotic arm. Tasks ranged from picking up gears to stacking blocks. The researchers then fine-tuned RoboCat on each task, creating a specialized “spin-off” model that practiced the task an average of 10,000 times. The cycle was then repeated: the data generated by the spin-off models, together with the demonstration data, was used to train new versions of RoboCat.
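The cycle described above can be sketched in a few lines of Python. This is only an illustrative toy, not DeepMind’s code: the function names (train_generalist, fine_tune, practice, self_improvement_cycle) and the dictionary-based “models” are hypothetical placeholders, and no actual learning takes place. It shows only how demonstration data and self-generated practice data flow into the next version of the model.

```python
from collections import Counter

# Average number of practice runs per spin-off, as reported in the article.
PRACTICE_EPISODES = 10_000


def train_generalist(dataset, version):
    """Placeholder for training: records how many episodes exist per task."""
    per_task = Counter(task for task, _ in dataset)
    return {"version": version, "episodes_per_task": per_task}


def fine_tune(generalist, task, demos):
    """Placeholder for fine-tuning: returns a task-specific spin-off."""
    return {
        "parent_version": generalist["version"],
        "task": task,
        "num_demos": len(demos),
    }


def practice(spin_off, episodes=PRACTICE_EPISODES):
    """Placeholder for self-practice: emits one labelled episode per attempt."""
    return [(spin_off["task"], f"self-generated-{i}") for i in range(episodes)]


def self_improvement_cycle(dataset, generalist, new_task_demos):
    """One pass of the loop: fine-tune, practice, grow the dataset, retrain."""
    grown = list(dataset)  # copy so the caller's dataset is left unchanged
    for task, demos in new_task_demos.items():
        spin_off = fine_tune(generalist, task, demos)
        grown.extend((task, demo) for demo in demos)  # human demonstrations
        grown.extend(practice(spin_off))              # spin-off's own data
    next_generalist = train_generalist(grown, generalist["version"] + 1)
    return grown, next_generalist
```

Starting from an empty dataset and a single hypothetical task with 100 demonstrations, one call to self_improvement_cycle yields a version 1 model whose dataset holds 10,100 episodes for that task: the 100 human demonstrations plus the 10,000 practice runs.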
The final version of RoboCat was trained on a total of 253 tasks, and benchmarked on a set of 141 variations of these tasks, both in simulated environments and the real world. Notably, after observing 1,000 human-controlled demonstrations, RoboCat learned to operate different robotic arms.
While RoboCat was trained on four types of robots with two-pronged arms, the model managed to adapt to a more complex arm with a three-fingered gripper and twice as many controllable inputs. Its success rate is uneven, however: with 1,000 demonstrations in the training data, it ranged from a low of 13% to a high of 99% depending on the task. The success rate predictably decreases with fewer demonstrations. Still, DeepMind asserts that in some scenarios RoboCat was able to learn new tasks with as few as 100 demonstrations.
This development could signal a major shift in the field of robotics. Alex Lee, a research scientist at DeepMind and a co-contributor on the RoboCat team, expressed his belief that RoboCat could lower the barrier to solving new tasks in robotics. He said, “Provided with a limited number of demonstrations for a new task, RoboCat can be fine-tuned to the new tasks and in turn self-generate more data to improve even further.” The team’s future goal is to reduce the number of demonstrations needed to teach RoboCat a new task to fewer than 10.