On the Virtualization of CUDA Based GPU Remoting on ARM and X86 Machines in the GVirtuS Framework

Montella, Raffaele; Giunta, Giulio; Laccetti, Giuliano; Lapegna, Marco; Palmieri, Carlo; Ferraro, Carmine; Pelliccia, Valentina; Hong, Cheol-Ho; Spence, Ivor; Nikolopoulos, Dimitrios S.

doi:10.1007/s10766-016-0462-1

Published: 13 October 2016

Volume 45, pages 1142–1163, (2017)

International Journal of Parallel Programming

Raffaele Montella ORCID: orcid.org/0000-0002-4767-2045¹,
Giulio Giunta¹,
Giuliano Laccetti²,
Marco Lapegna²,
Carlo Palmieri¹,
Carmine Ferraro¹,
Valentina Pelliccia¹,
Cheol-Ho Hong³,
Ivor Spence³ &
Dimitrios S. Nikolopoulos³

Abstract

The rapid development of diverse hardware platforms is unfolding along two fronts: on one side, the pursuit of exascale performance for big data processing and management; on the other, mobile and embedded devices for data collection and human-machine interaction. This has driven a highly hierarchical evolution of programming models. GVirtuS, the general virtualization system developed in 2009 and first introduced in 2010, provides a completely transparent layer between GPUs and VMs. This paper presents the latest achievements and developments of GVirtuS, which now supports CUDA 6.5, memory management and scheduling. Thanks to its new and improved remoting capabilities, GVirtuS now enables GPU sharing among physical and virtual machines based on x86 and ARM CPUs on local workstations, computing clusters and distributed cloud appliances.

Acknowledgments

This research has been supported mainly by the Grant Agreement number: 644312 - RAPID - H2020-ICT-2014/H2020-ICT-2014-1 “Heterogeneous Secure Multi-level Remote Acceleration Service for Low-Power Integrated Systems and Devices”, in part by the project IZS ME04/12 RC/C78C120017001 “Mapping Escherichia Coli and Salmonella pollution in mussel farm areas and model prediction comparisons”, in part by the University of Naples Parthenope - Department of Science and Technologies “Weather/marine extreme event simulation with Galaxy-ES (Earth System) scientific workflow engine and cloud computing tools” Research Project, and in part by the University of Naples Federico II - Department of Mathematics “Approcci Innovativi per la Risoluzione di Modelli di Interesse nelle Simulazioni Computazionali” Research Project Grant Agreement.

Author information

Authors and Affiliations

University of Naples Parthenope, Naples, Italy
Raffaele Montella, Giulio Giunta, Carlo Palmieri, Carmine Ferraro & Valentina Pelliccia
University of Naples Federico II, Naples, Italy
Giuliano Laccetti & Marco Lapegna
Queen’s University of Belfast, Belfast, Northern Ireland, UK
Cheol-Ho Hong, Ivor Spence & Dimitrios S. Nikolopoulos

Corresponding author

Correspondence to Raffaele Montella.

About this article

Cite this article

Montella, R., Giunta, G., Laccetti, G. et al. On the Virtualization of CUDA Based GPU Remoting on ARM and X86 Machines in the GVirtuS Framework. Int J Parallel Prog 45, 1142–1163 (2017). https://doi.org/10.1007/s10766-016-0462-1

Received: 13 March 2016
Accepted: 22 September 2016
Published: 13 October 2016
Issue Date: October 2017
DOI: https://doi.org/10.1007/s10766-016-0462-1
