TY - GEN
T1 - Threat Intelligence Named Entity Recognition Based on Global Gated Feature Fusion
AU - Du, Chao
AU - Liu, Xuhong
AU - Miao, Lin
AU - Liu, Xiulei
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024/1/1
Y1 - 2024/1/1
N2 - Threat intelligence text contains a large amount of jargon, many abbreviations, technical details, and complex attack chain descriptions. As a result, named entity recognition in the threat intelligence domain has difficulty obtaining broad global contextual information and handling unknown threat intelligence entity words, which prevents it from effectively modeling long-distance dependencies in the text. To solve this problem, this paper proposes a threat entity recognition model based on global gated feature fusion. First, the model uses SecureBERT, pre-trained on an enhanced large-scale cybersecurity text corpus, to obtain dynamic word vectors. It then uses Cross-BiLSTM to capture long-distance dependencies in the sequence and obtain cross-contextual hidden-layer feature representations. Finally, a global gated feature fusion step fuses the local hidden states of the sequence with the global sentence representation to produce representationally rich global features. In comparative experiments against four baseline NER models, the model improves the F1 score on two threat intelligence datasets by 2.22% and 1.48%, respectively, and effectively recognizes threat intelligence entities.
AB - Threat intelligence text contains a large amount of jargon, many abbreviations, technical details, and complex attack chain descriptions. As a result, named entity recognition in the threat intelligence domain has difficulty obtaining broad global contextual information and handling unknown threat intelligence entity words, which prevents it from effectively modeling long-distance dependencies in the text. To solve this problem, this paper proposes a threat entity recognition model based on global gated feature fusion. First, the model uses SecureBERT, pre-trained on an enhanced large-scale cybersecurity text corpus, to obtain dynamic word vectors. It then uses Cross-BiLSTM to capture long-distance dependencies in the sequence and obtain cross-contextual hidden-layer feature representations. Finally, a global gated feature fusion step fuses the local hidden states of the sequence with the global sentence representation to produce representationally rich global features. In comparative experiments against four baseline NER models, the model improves the F1 score on two threat intelligence datasets by 2.22% and 1.48%, respectively, and effectively recognizes threat intelligence entities.
KW - feature extraction
KW - gating mechanism
KW - named entity recognition
KW - threat intelligence
UR - https://www.scopus.com/pages/publications/85207091547
U2 - 10.1109/IoTAAI62601.2024.10692655
DO - 10.1109/IoTAAI62601.2024.10692655
M3 - Conference contribution
AN - SCOPUS:85207091547
T3 - 2024 6th International Conference on Internet of Things, Automation and Artificial Intelligence, IoTAAI 2024
SP - 618
EP - 622
BT - 2024 6th International Conference on Internet of Things, Automation and Artificial Intelligence, IoTAAI 2024
PB - Institute of Electrical and Electronics Engineers
T2 - 6th International Conference on Internet of Things, Automation and Artificial Intelligence, IoTAAI 2024
Y2 - 26 July 2024 through 28 July 2024
ER -