Mining k maximal sequential patterns using a prefix-projected data tree

  • Lê Hoài Bắc, VNUHCM - University of Science
  • Nguyễn Thị Quyên

Abstract

This paper proposes a method called TMSP for sequential pattern mining. Because maximal patterns are compact representations of frequent patterns, TMSP mines them rather than the full set of frequent patterns. The main idea of TMSP is to mine the top-k frequent maximal sequential patterns whose length is no less than a minimum pattern length (min_l) and no greater than a maximum pattern length (max_l), where k is the desired number of maximal sequential patterns. With the proposed method, users do not need to tune a minimum support threshold (minsup) to perform the mining, a disadvantage of previous studies. Experimental results on real datasets show that TMSP is an efficient solution for mining sequential patterns. The results also show that TMSP is more memory-efficient than the maximal sequential pattern mining algorithm (MAXSP) and makes it easier for users to obtain the required number of patterns without adjusting minsup.
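
The terms used in the abstract (support, maximal pattern, length bounds, top-k) can be illustrated with a brute-force sketch. This is not the TMSP algorithm and does not use a prefix-projected tree; it only spells out the definitions on a toy database. Several things here are assumptions made for illustration: sequences are simplified to lists of single items, the function names are hypothetical, max_l is applied before maximality is checked, and an explicit `minsup` argument is kept because the abstract does not state how maximality is judged when no threshold is given.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def top_k_maximal(database, k, min_l, max_l, minsup=1):
    """Brute-force illustration only; exponential in sequence length."""
    # Candidates: every subsequence with at most max_l items of some input sequence.
    candidates = set()
    for seq in database:
        for length in range(1, max_l + 1):
            for idx in combinations(range(len(seq)), length):
                candidates.add(tuple(seq[i] for i in idx))

    # Frequent patterns under the illustrative threshold.
    supports = {p: support(p, database) for p in candidates}
    frequent = {p: s for p, s in supports.items() if s >= minsup}

    # Maximal: no strictly longer frequent pattern contains this one.
    maximal = [
        p for p in frequent
        if not any(len(q) > len(p) and is_subsequence(p, q) for q in frequent)
    ]

    # Apply the lower length bound, then rank by support (ties broken by pattern).
    in_range = [p for p in maximal if len(p) >= min_l]
    in_range.sort(key=lambda p: (-frequent[p], p))
    return [(p, frequent[p]) for p in in_range[:k]]


toy_db = [
    ("a", "b", "c"),
    ("a", "b", "d"),
    ("a", "c", "d"),
    ("b", "c"),
    ("a", "b"),
]
result = top_k_maximal(toy_db, k=2, min_l=2, max_l=3, minsup=2)
```

On this toy database the single items are all frequent but none is maximal, because each is contained in a frequent two-item pattern. A top-k method such as the one the abstract describes would avoid asking the user for `minsup` at all; the fixed threshold here is only a device for showing what "maximal" means.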

References

J. PEI, J. HAN, B. MORTAZAVI-ASL, H. PINTO, Q. CHEN, U. DAYAL, AND M.-C. HSU, “PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth,” Proc. 17th Int. Conf. Data Eng., 2001.

M. J. ZAKI, “SPADE: An efficient algorithm for mining frequent sequences,” Mach. Learn., vol. 42, no. 1–2, pp. 31–60, 2001.

J. AYRES, J. GEHRKE, T. YIU, AND J. FLANNICK, “Sequential pattern mining using a bitmap representation,” Proc. 8th ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., pp. 429–435, 2002.

P. FOURNIER-VIGER, A. GOMARIZ, T. GUENICHE, E. MWAMIKAZI, AND R. THOMAS, “TKS: Efficient mining of top-k sequential patterns,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 8346 LNAI, no. PART 1, pp. 109–120, 2013.

P. TZVETKOV, X. YAN, AND J. HAN, “TSP: Mining top-k closed sequential patterns,” Knowl. Inf. Syst., vol. 7, no. 4, pp. 438–457, 2005.

K. SOHINI AND V. PURUSHOTHAMA RAJU, “Mining Top-k Closed Sequential Patterns in Sequential Databases,” IOSR J. Comput. Eng., vol. 15, no. 4, pp. 20–23, 2013.

P. FOURNIER-VIGER, C. W. WU, AND V. S. TSENG, “Mining maximal sequential patterns without candidate maintenance,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 8346 LNAI, no. PART 1, pp. 169–180, 2013.

P. FOURNIER-VIGER, C. W. WU, A. GOMARIZ, AND V. S. TSENG, “VMSP: Efficient vertical mining of maximal sequential patterns,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 8436 LNAI, pp. 83–94, 2014.

M. J. ZAKI AND W. MEIRA JR., Data mining and analysis: fundamental concepts and algorithms. Cambridge University Press, New York, 2014.

M.-T. TRAN, B. LE, AND B. VO, “Combination of dynamic bit vectors and transaction information for mining frequent closed sequences efficiently,” Eng. Appl. Artif. Intell., vol. 38, pp. 183–189, 2015.

Published
2016-07-06
Section
Articles