TY - GEN
T1 - OMPGPT
T2 - 30th International Conference on Parallel and Distributed Computing, Euro-Par 2024
AU - Chen, Le
AU - Bhattacharjee, Arijit
AU - Ahmed, Nesreen
AU - Hasabnis, Niranjan
AU - Oren, Gal
AU - Vo, Vy
AU - Jannesari, Ali
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024/1/1
Y1 - 2024/1/1
N2 - Large language models (LLMs) such as ChatGPT have significantly advanced the field of Natural Language Processing (NLP). This trend led to the development of code-based large language models such as StarCoder, WizardCoder, and CodeLlama, which are trained extensively on vast repositories of code and programming languages. While the generic abilities of these code LLMs are helpful for many programmers in tasks like code generation, the area of high-performance computing (HPC) has a narrower set of requirements that make a smaller and more domain-specific model a smarter choice. This paper presents OMPGPT, a novel domain-specific model meticulously designed to harness the inherent strengths of language models for OpenMP pragma generation. Furthermore, we leverage prompt engineering techniques from the NLP domain to create Chain-of-OMP, an innovative strategy designed to enhance OMPGPT’s effectiveness. Our extensive evaluations demonstrate that OMPGPT outperforms existing large language models specialized in OpenMP tasks and maintains a notably smaller size, aligning it more closely with the typical hardware constraints of HPC environments. We consider our contribution as a pivotal bridge, connecting the advantage of language models with the specific demands of HPC tasks.
AB - Large language models (LLMs) such as ChatGPT have significantly advanced the field of Natural Language Processing (NLP). This trend led to the development of code-based large language models such as StarCoder, WizardCoder, and CodeLlama, which are trained extensively on vast repositories of code and programming languages. While the generic abilities of these code LLMs are helpful for many programmers in tasks like code generation, the area of high-performance computing (HPC) has a narrower set of requirements that make a smaller and more domain-specific model a smarter choice. This paper presents OMPGPT, a novel domain-specific model meticulously designed to harness the inherent strengths of language models for OpenMP pragma generation. Furthermore, we leverage prompt engineering techniques from the NLP domain to create Chain-of-OMP, an innovative strategy designed to enhance OMPGPT’s effectiveness. Our extensive evaluations demonstrate that OMPGPT outperforms existing large language models specialized in OpenMP tasks and maintains a notably smaller size, aligning it more closely with the typical hardware constraints of HPC environments. We consider our contribution as a pivotal bridge, connecting the advantage of language models with the specific demands of HPC tasks.
KW - HPC
KW - Large language model
KW - OpenMP
UR - https://www.scopus.com/pages/publications/85202597161
U2 - 10.1007/978-3-031-69577-3_9
DO - 10.1007/978-3-031-69577-3_9
M3 - Conference contribution
AN - SCOPUS:85202597161
SN - 9783031695766
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 121
EP - 134
BT - Euro-Par 2024
A2 - Carretero, Jesus
A2 - Garcia-Blas, Javier
A2 - Shende, Sameer
A2 - Brandic, Ivona
A2 - Olcoz, Katzalin
A2 - Schreiber, Martin
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 26 August 2024 through 30 August 2024
ER -