Rhys Newbury | 6-DoF Contrastive Grasp Proposal Network

Abstract: Proposing grasp poses for novel objects is an essential component of any robot manipulation task. Planning six degrees of freedom (DoF) grasps with a single camera, however, is challenging due to complex object shapes, incomplete object information, and sensor noise. In this paper, we present a 6-DoF contrastive grasp proposal network (CGPN) to infer 6-DoF grasps from a single-view depth image. First, an image encoder extracts a feature map from the input depth image, after which 3-DoF grasp regions are proposed from the feature map with a rotated region proposal network. Feature vectors within the proposed grasp regions are then extracted and refined into 6-DoF grasps. The proposed model is trained offline with synthetic grasp data. To improve robustness in the real world and bridge the simulation-to-real gap, we further introduce a contrastive learning module and variant image processing techniques during training. CGPN can locate collision-free grasps of an object using a single-view depth image within 0.5 seconds. Experiments on a physical robot further demonstrate the effectiveness of the algorithm.

@article{zhu20216dof,
  author = {Zhu, Xinghao and Sun, Lingfeng and Fan, Yongxiang and Tomizuka, Masayoshi},
  title = {6-DoF Contrastive Grasp Proposal Network},
  journal = {CoRR},
  volume = {abs/2103.15995},
  year = {2021},
  eprinttype = {arXiv},
  eprint = {2103.15995},
  timestamp = {Wed, 07 Apr 2021 15:31:46 +0200},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}