Publication: CoRR
Abstract: Proposing grasp poses for novel objects is an essential component of any robot manipulation task. Planning six-degree-of-freedom (6-DoF) grasps with a single camera, however, is challenging due to complex object shapes, incomplete object information, and sensor noise. In this paper, we present a 6-DoF contrastive grasp proposal network (CGPN) to infer 6-DoF grasps from a single-view depth image. First, an image encoder extracts a feature map from the input depth image, after which 3-DoF grasp regions are proposed from the feature map with a rotated region proposal network. Feature vectors within the proposed grasp regions are then extracted and refined into 6-DoF grasps. The proposed model is trained offline with synthetic grasp data. To improve robustness in the real world and bridge the simulation-to-real gap, we further introduce a contrastive learning module and various image processing techniques during training. CGPN can locate collision-free grasps of an object from a single-view depth image within 0.5 seconds. Experiments on a physical robot further demonstrate the effectiveness of the algorithm.
Bibtex:
@article{zhu20216dof,
  author     = {Zhu, Xinghao and Sun, Lingfeng and Fan, Yongxiang and Tomizuka, Masayoshi},
  title      = {6-DoF Contrastive Grasp Proposal Network},
  journal    = {CoRR},
  volume     = {abs/2103.15995},
  year       = {2021},
  eprinttype = {arXiv},
  eprint     = {2103.15995},
  timestamp  = {Wed, 07 Apr 2021 15:31:46 +0200},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}