Publication: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
Abstract: Object grasping is critical for many applications and is also a challenging computer vision problem. However, for cluttered scenes, current research suffers from insufficient training data and a lack of evaluation benchmarks. In this work, we contribute a large-scale grasp pose detection dataset with a unified evaluation system. Our dataset contains 97,280 RGB-D images with over one billion grasp poses. Meanwhile, our evaluation system directly reports whether a grasp is successful by analytic computation, and can therefore evaluate any kind of grasp pose without exhaustively labeling ground truth. In addition, we propose an end-to-end grasp pose prediction network that takes point cloud inputs and learns the approaching direction and operation parameters in a decoupled manner. A novel grasp affinity field is also designed to improve grasping robustness. We conduct extensive experiments to show that our dataset and evaluation system align well with real-world experiments and that our proposed network achieves state-of-the-art performance. Our dataset, source code and models are publicly available at www.graspnet.net.
Bibtex:
@inproceedings{fang2020graspnet,
  title     = {{GraspNet-1Billion}: A Large-Scale Benchmark for General Object Grasping},
  author    = {Fang, Hao-Shu and Wang, Chenxi and Gou, Minghao and Lu, Cewu},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages     = {11444--11453},
  year      = {2020}
}