Enabling Pipeline Parallelism in Heterogeneous Managed Runtime Environments via Batch Processing

Published in VEE 2022, Virtual Execution Environments, 2022

Recommended citation: Florin Blanaru, Athanasios Stratikopoulos, Juan Fumero, Christos Kotselidis. VEE 2022.

Abstract

Authors: Florin Blanaru, Athanasios Stratikopoulos, Juan Fumero, and Christos Kotselidis

During the last decade, managed runtime systems have been constantly evolving to become capable of exploiting underlying hardware accelerators, such as GPUs and FPGAs. Regardless of the programming language and their corresponding runtime systems, the majority of the work has been focusing on the compiler front trying to tackle the challenging task of how to enable just-in-time compilation and execution of arbitrary code segments on various accelerators. Besides this challenging task, another important aspect that defines both functional correctness and performance of managed runtime systems is that of automatic memory management. Although automatic memory management improves productivity by abstracting away memory allocation and maintenance, it hinders the capability of using specific memory regions, such as pinned memory, in order to perform data transfer times between the CPU and hardware accelerators.

In this paper, we introduce and evaluate a series of memory optimizations specifically tailored for heterogeneous managed runtime systems. In particular, we propose: (i) transparent and automatic “parallel batch processing” for overlapping data transfers and computation between the host and hardware accelerators in order to enable pipeline parallelism, and (ii) “off-heap pinned memory” in combination with parallel batch processing in order to increase the performance of data transfers without posing any on-heap overheads. These two techniques have been implemented in the context of the state-of-the-art open-source TornadoVM and their combination can lead up to 2.5x end-to-end performance speedup against sequential batch processing.

Recommended citation:

@inproceedings{10.1145/3516807.3516821,
  author = {Blanaru, Florin and Stratikopoulos, Athanasios and Fumero, Juan and Kotselidis, Christos},
  title = {Enabling Pipeline Parallelism in Heterogeneous Managed Runtime Environments via Batch Processing},
  year = {2022},
  isbn = {9781450392518},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3516807.3516821},
  doi = {10.1145/3516807.3516821},
  booktitle = {Proceedings of the 18th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments},
  pages = {58--71},
  numpages = {14},
  keywords = {Optimizations, GPUs, Memory Management, Virtual Machines, Heterogeneous Architectures, Data Transfers},
  location = {Virtual, Switzerland},
  series = {VEE 2022}
}