TY - JOUR
T1 - TIRPClo
T2 - efficient and complete mining of time intervals-related patterns
AU - Harel, Omer
AU - Moskovitch, Robert
N1 - Funding Information:
The authors wish to thank Prof. Panagiotis Papapetrou and Prof. Diane J Cook for providing datasets for the evaluation. This research was partially funded by a grant of the Israeli Ministry of Science and Technology (Grant 8760441). Omer Harel was funded also by the Darom-Lachish scholarship of Kreitman School of Advanced Graduate Studies at Ben Gurion University (No. 1955129).
Publisher Copyright:
© 2023, The Author(s), under exclusive licence to Springer Science+Business Media LLC, part of Springer Nature.
PY - 2023/9/1
Y1 - 2023/9/1
AB - Mining frequent Time Intervals-Related Patterns (TIRPs) from series of symbolic time intervals offers a comprehensive framework for heterogeneous, multivariate temporal data analysis in various application domains. While gaining a growing interest in recent decades, the efficient mining of frequent TIRPs is still a high computational challenge which has also not yet been investigated in its full complexity. The majority of previous methods discover only the first instances of the TIRPs within each series of symbolic time intervals, whereas their re-occurring instances are ignored. This eventually results in an incomplete discovery of frequent TIRPs, a problem that lies also in the challenge of mining only the frequent closed TIRPs, which was only recently investigated for the first time. In this paper, we introduce TIRPClo—an efficient algorithm for the complete mining of either the entire set of frequent TIRPs, or only the frequent closed TIRPs. The algorithm proposes a non-ambiguous sequential representation of symbolic time intervals series through the intervals’ end-points, as well as a memory-efficient index and a novel method for data projection, due to which it is the first algorithm to guarantee a complete discovery of frequent closed TIRPs. The experimental evaluation conducted on eleven real-world and four synthetic datasets demonstrates that TIRPClo is up to 10 times faster when mining the entire set of frequent TIRPs, and up to more than 100 times faster when mining only the frequent closed TIRPs compared to four state-of-the-art methods, while also reporting lower memory measurements.
KW - Closed temporal pattern
KW - Frequent pattern mining
KW - Temporal knowledge discovery
KW - Time interval mining
UR - http://www.scopus.com/inward/record.url?scp=85163714041&partnerID=8YFLogxK
U2 - 10.1007/s10618-023-00944-6
DO - 10.1007/s10618-023-00944-6
M3 - Article
AN - SCOPUS:85163714041
SN - 1384-5810
VL - 37
SP - 1806
EP - 1857
JO - Data Mining and Knowledge Discovery
JF - Data Mining and Knowledge Discovery
IS - 5
ER -