Generative Attention Learning - A "GenerAL" Framework for High-Performance Multi-Fingered Grasping in Clutter

Bohan Wu, Iretiayo Akinola, Abhi Gupta, Feng Xu, Jacob Varley, David Watkins-Valls, and Peter Allen

Columbia Robotics Lab and Robotics at Google

Abstract

Generative Attention Learning (GenerAL) is a framework for high-DOF multi-fingered grasping that is not only robust to dense clutter and novel objects, but also effective with a variety of parallel-jaw and multi-fingered robot hands. The framework introduces a novel attention mechanism that substantially improves grasp success rate in clutter. Its generative nature allows it to learn full-DOF grasps, producing flexible end-effector positions and orientations as well as all finger joint angles of the hand. Trained purely in simulation, the framework closes the visual sim-to-real gap by using a single depth image as input, and closes the dynamics sim-to-real gap by circumventing continuous motor control with a direct mapping from pixel to Cartesian space inferred from the same depth image. Finally, the framework demonstrates inter-robot generality by achieving over 90% grasp success rates on cluttered scenes of novel objects using two multi-fingered robotic hand/arm systems with different degrees of freedom.
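The "direct mapping from pixel to Cartesian space" mentioned above amounts to deprojecting a selected depth pixel into a 3D point in the camera frame via the standard pinhole-camera model. Below is a minimal sketch of that deprojection, assuming known camera intrinsics; the function name and intrinsic values are illustrative assumptions, not taken from the GenerAL implementation.

```python
import numpy as np

def deproject_pixel(depth_image: np.ndarray,
                    u: int, v: int,
                    fx: float, fy: float,
                    cx: float, cy: float) -> np.ndarray:
    """Map a pixel (u, v) and its depth reading to a 3D point in the
    camera frame using the pinhole-camera model.

    depth_image -- H x W array of depth values in meters
    fx, fy      -- focal lengths in pixels
    cx, cy      -- principal point in pixels
    """
    z = depth_image[v, u]          # depth along the optical axis
    x = (u - cx) * z / fx          # back-project horizontal pixel offset
    y = (v - cy) * z / fy          # back-project vertical pixel offset
    return np.array([x, y, z])

# Usage with hypothetical intrinsics for a 640x480 depth camera.
depth = np.full((480, 640), 0.6)   # synthetic flat scene 0.6 m away
point = deproject_pixel(depth, u=320, v=240,
                        fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(point)                       # ~[0, 0, 0.6]: the optical center
```

Because this mapping is purely geometric, a grasp point chosen in image space can be converted to a Cartesian target without any learned dynamics model, which is what lets the policy sidestep continuous motor control.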


Video


Acknowledgement

This work was supported in part by a Google Research grant and National Science Foundation grants CMMI-1734557 and IIS-1527747.