Pixel-Attentive Policy Gradient for Multi-Fingered Grasping in Cluttered Scenes

Bohan Wu, Iretiayo Akinola, and Peter Allen

Abstract

Recent advances in on-policy reinforcement learning (RL) methods have enabled learning agents in virtual environments to master complex tasks with high-dimensional and continuous observation and action spaces. However, leveraging this family of algorithms in multi-fingered robotic grasping remains a challenge due to large sim-to-real fidelity gaps and the high sample complexity of on-policy RL algorithms. This work aims to bridge these gaps by first using RL to train, in simulation, a multi-fingered robotic grasping policy that operates in the pixel space of its input: a single depth image. Using a mapping from pixel space to Cartesian space according to the depth map, this method transfers to the real world with high fidelity and introduces a novel attention mechanism that substantially improves grasp success rate in cluttered environments. Finally, the direct-generative nature of this method allows learning of multi-fingered grasps that have flexible end-effector positions, orientations and rotations, as well as all degrees of freedom of the hand.
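
The abstract states that a grasp chosen in pixel space is mapped to Cartesian space using the depth map, but this page does not give the exact mapping. As a rough illustration only, the sketch below assumes a standard pinhole camera model; the Intrinsics class, the pixel_to_camera_xyz function, and all numeric values are hypothetical and are not taken from the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Hypothetical pinhole camera intrinsics, all in pixels."""
    fx: float  # focal length along the image x axis
    fy: float  # focal length along the image y axis
    cx: float  # principal point, x coordinate
    cy: float  # principal point, y coordinate


def pixel_to_camera_xyz(u, v, depth_m, k):
    """Back-project pixel (u, v) to a 3D point in the camera frame.

    depth_m is the depth-map value at (u, v), assumed here to be the
    distance along the optical axis in metres.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive (invalid or missing reading)")
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


# Illustrative values only: a 640x480 camera with a 500-pixel focal length.
camera = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)

# A pixel 100 columns right of the image centre, observed at 1 m depth.
grasp_point = pixel_to_camera_xyz(420, 240, 1.0, camera)
```

With these assumed values, the selected pixel back-projects to a point 0.2 m to the right of the optical axis at 1 m depth. A real system would additionally need the camera-to-robot transform to express this point in the robot's base frame.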

Pixel-Attentive Mechanism

Figure: Pixel-Attentive Mechanism.

Video (4 Minutes)

Acknowledgement

This work was supported in part by a Google Research grant and National Science Foundation grants CMMI-1734557 and IIS-1527747.