Malware poses a threat to computing systems worldwide, and security experts work to detect and classify it as accurately and quickly as possible. Because malware can use evasion techniques to bypass static analysis and security mechanisms, dynamic analysis methods are more useful for accurately analyzing its behavioral patterns. Previous studies showed that malware behavior can be represented by sequences of executed system calls and that machine learning algorithms can leverage such sequences for malware classification (also known as malware categorization). Accurate malware classification supports malware signature generation and thus benefits antivirus vendors; it is also valuable to organizational security experts, enabling them to mitigate malware attacks and respond to security incidents. In this paper, we propose an improved methodology for malware classification based on analyzing the sequences of system calls that malware invokes in a dynamic analysis environment. We show that adding an attention mechanism to an LSTM model improves classification accuracy, outperforming the state-of-the-art algorithm by up to 6%. We also show that the transformer architecture can be used to analyze very long sequences with significantly lower time complexity for training and prediction. Our proposed method can serve as the basis for a decision support system that assists security experts with malware categorization.
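
To make the attention idea concrete, the following is a minimal sketch of attention pooling over the per-step hidden states of a recurrent model, and not the architecture evaluated in this paper. It assumes the hidden states have already been produced by an LSTM run over a system call trace; the function names, the dot-product scoring rule, the `query` vector, and the toy values are all hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Collapse per-step hidden states into a single context vector.

    hidden_states: one vector per system call in the trace
                   (assumed here to be LSTM outputs).
    query:         hypothetical learned vector used to score each step.
    Returns (context_vector, attention_weights).
    """
    # Dot-product score between each hidden state and the query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Normalize the scores so the weights sum to 1 across time steps.
    weights = softmax(scores)
    # Weighted sum of the hidden states, one output value per dimension.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


# Toy stand-ins for LSTM hidden states of a three-call trace (assumed values).
states = [[0.1, 0.0], [0.9, 0.2], [0.0, 0.4]]
query = [2.0, 0.0]
context, weights = attention_pool(states, query)
```

In a classifier built along these lines, the context vector would be passed to a final layer that predicts the malware category, and the weights indicate which system calls in the trace contributed most to that vector.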