
6-DoF grasp pose estimation based on instance reconstruction

  • Original Research Paper
  • Published in: Intelligent Service Robotics

Abstract

Grasping objects is a significant challenge for autonomous robotic manipulation in unstructured and cluttered environments. Despite recent advances in 6-DoF (six degrees of freedom) grasp pose estimation, most existing methods fail to differentiate points belonging to adjacent objects, particularly when objects are placed close together; the resulting imprecise grasp orientations often lead to collisions or failed grasps. To address these challenges, this paper proposes a semantic instance reconstruction grasp network (SIRGN) that efficiently generates accurate grasping configurations. First, foreground objects are reconstructed by an implicit semantic instance branch: each foreground point votes for its corresponding instance, thereby differentiating adjacent objects. Second, to improve the accuracy of the grasping orientation, the 3D rotation matrix is decomposed into two orthogonal unit vectors. The network is trained on the VGN simulated grasping dataset. Decluttering experiments show that SIRGN achieves grasp success rates of 89.5% in packed scenes and 78.1% in pile scenes. Experiments in both simulated and real environments demonstrate the effectiveness of the proposed method.
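The two-vector rotation decomposition mentioned in the abstract admits a simple geometric reading: the network predicts two 3-vectors, and a full rotation matrix is recovered by orthonormalizing them and completing a right-handed frame. The sketch below is a minimal illustration of that idea, assuming a Gram-Schmidt-style recovery; it is not the authors' implementation, and the function name and example inputs are hypothetical.

    # Illustrative sketch (assumed Gram-Schmidt recovery, not the paper's code):
    # map two predicted, possibly non-orthonormal 3-vectors to a rotation matrix.
    import numpy as np

    def rotation_from_two_vectors(a, b):
        """Recover an SO(3) matrix from two (possibly noisy) 3-vectors."""
        r1 = a / np.linalg.norm(a)            # first column: normalized a
        b_orth = b - np.dot(r1, b) * r1       # strip the component of b along r1
        r2 = b_orth / np.linalg.norm(b_orth)  # second column: unit, orthogonal to r1
        r3 = np.cross(r1, r2)                 # third column: right-handed completion
        return np.stack([r1, r2, r3], axis=1)

    # Usage: noisy, non-orthogonal predictions still yield a valid rotation.
    R = rotation_from_two_vectors(np.array([0.9, 0.1, 0.0]),
                                  np.array([0.1, 1.0, 0.1]))
    assert np.allclose(R @ R.T, np.eye(3))    # orthogonal
    assert np.isclose(np.linalg.det(R), 1.0)  # proper rotation (det = +1)

A parameterization of this kind stays continuous over SO(3), which is one reason two-vector outputs are often preferred over direct quaternion or Euler-angle regression.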



Acknowledgements

This work was supported by the National Natural Science Foundation of China (62272426, 62106238) and the Natural Science Foundation of Shanxi Province, China (202303021211153, 202203021212138).

Author information

Corresponding author

Correspondence to Huiyan Han.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Han, H., Wang, W., Han, X. et al. 6-DoF grasp pose estimation based on instance reconstruction. Intel Serv Robotics 17, 251–264 (2024). https://doi.org/10.1007/s11370-023-00489-z

