MPI-RICAL: Data-Driven MPI Distributed Parallelism Assistance with Transformers

Nadav Schneider, Tal Kadosh, Niranjan Hasabnis, Timothy Mattson, Yuval Pinter, Gal Oren

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Computational science has made rapid progress in recent years, leading to ever-increasing demand for supercomputing resources. For scientific applications that leverage such resources, the Message Passing Interface (MPI) plays a crucial role in enabling distributed memory parallelization across multiple nodes. However, parallelizing code with MPI manually, and specifically performing domain decomposition, is a challenging and error-prone task. In this paper, we address this problem by developing MPI-rical, a novel data-driven programming-assistance tool that helps programmers write domain-decomposition-based distributed memory parallelization code using MPI. Specifically, we leverage the Transformer architecture, the invention that led to advancements in the field of natural language processing (NLP), with a supervised language model to suggest MPI functions and their proper locations in the code on the fly. In addition to the novel model for MPI-based parallel programming, we also introduce MPICodeCorpus, the first publicly available corpus of MPI-based parallel programs, created by mining more than 15,000 open-source repositories on GitHub. Experimental results demonstrate the effectiveness of MPI-rical both on a dataset from MPICodeCorpus and, more importantly, on a compiled benchmark of MPI-based parallel programs for numerical computations that represent real-world scientific applications. Specifically, MPI-rical achieves F1 scores between 0.87 and 0.91 on these programs, demonstrating its accuracy in suggesting correct MPI functions at appropriate code locations. The source code used in this work, as well as other relevant sources, is available at: https://github.com/Scientific-Computing-Lab-NRCN/MPI-rical.
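To make the F1 figure in the abstract concrete, the following is a minimal sketch of how an F1 score could be computed when a tool suggests MPI functions at code locations. The abstract does not say how matches are counted, so the matching rule here (a suggestion counts only if both the function name and the line number agree with the reference) is an assumption, and the helper name `f1_score` and the sample data are hypothetical.

```python
def f1_score(predicted, reference):
    """F1 over sets of (mpi_function, line_number) pairs.

    Assumption: a prediction is correct only when both the function
    name and the line number match a reference pair exactly.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0  # also covers empty predictions or an empty reference
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference placement of MPI calls in a small program.
reference = [
    ("MPI_Init", 3),
    ("MPI_Comm_rank", 4),
    ("MPI_Comm_size", 5),
    ("MPI_Reduce", 12),
    ("MPI_Finalize", 15),
]

# Hypothetical suggestions: MPI_Reduce is placed one line too late.
predicted = [
    ("MPI_Init", 3),
    ("MPI_Comm_rank", 4),
    ("MPI_Comm_size", 5),
    ("MPI_Reduce", 13),
    ("MPI_Finalize", 15),
]

score = f1_score(predicted, reference)
print(f"F1 = {score:.2f}")
```

Under this strict rule, four of the five suggestions match, so precision and recall are both 4/5. A more lenient rule, such as accepting a suggestion within a few lines of the reference location, would give a different score for the same suggestions.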

Original language: English
Title of host publication: Proceedings of 2023 SC Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023
Publisher: Association for Computing Machinery
Pages: 2-10
Number of pages: 9
ISBN (Electronic): 9798400707858
DOIs
State: Published - 12 Nov 2023
Event: 2023 International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023 - Denver, United States
Duration: 12 Nov 2023 to 17 Nov 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2023 International Conference on High Performance Computing, Network, Storage, and Analysis, SC Workshops 2023
Country/Territory: United States
City: Denver
Period: 12/11/23 to 17/11/23

Keywords

  • Domain Decomposition
  • LLM
  • MPI
  • MPI-rical
  • MPICodeCorpus
  • SPT-Code
  • Transformer

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software
