An Efficient Model for Mining High Utility Itemsets

  • Đậu Hải Phong, Thang Long University
  • Nguyễn Mạnh Hùng

Abstract

High utility itemset mining is an important research topic in data mining because it takes into account both the profit and the quantity of the items in each transaction. Most high utility itemset mining algorithms, such as UP-Growth [9], Udepth [5], Two-Phase [3], and PB (Projection-Based) [8], use the TWU (Transaction Weight Utility) model to prune candidates. However, the number of candidate itemsets these algorithms generate is still enormous. In this paper, we propose a new model, Candidate Weight Utility (CWU), and an algorithm based on it, HP (High Projection), to reduce the number of candidate itemsets. Experimental results show that our algorithm generates fewer candidates and performs better than Two-Phase [3] and PB [8].
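For readers unfamiliar with TWU-based pruning, the following is a minimal Python sketch of the idea as it is commonly described for Two-Phase [3]: the utility of a transaction is the sum of quantity times unit profit over its items, the TWU of an itemset is the sum of the utilities of all transactions containing it, and an itemset whose TWU falls below the minimum utility threshold can be discarded together with all its supersets. The toy transactions, the profit table, the threshold, and all function names are hypothetical illustrations. This is not the CWU model or the HP algorithm proposed in the paper, which the abstract does not define.

```python
# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1}


def transaction_utility(tx, profit):
    """Utility of a whole transaction: sum of quantity * unit profit."""
    return sum(qty * profit[item] for item, qty in tx.items())


def twu(itemset, transactions, profit):
    """Transaction-weighted utility: total utility of every transaction
    that contains all items of `itemset`. It is an upper bound on the
    utility of the itemset and of any of its supersets."""
    items = set(itemset)
    return sum(
        transaction_utility(tx, profit)
        for tx in transactions
        if items.issubset(tx)
    )


def utility(itemset, transactions, profit):
    """Exact utility of `itemset`: only the itemset's own items are counted
    in each transaction that contains it."""
    items = set(itemset)
    return sum(
        sum(tx[item] * profit[item] for item in items)
        for tx in transactions
        if items.issubset(tx)
    )


def prune_items_by_twu(transactions, profit, min_util):
    """Keep only the single items whose TWU reaches the threshold."""
    all_items = sorted({item for tx in transactions for item in tx})
    return [i for i in all_items if twu({i}, transactions, profit) >= min_util]


if __name__ == "__main__":
    min_util = 15  # hypothetical minimum utility threshold
    for item in sorted(profit):
        print(item, twu({item}, transactions, profit),
              utility({item}, transactions, profit))
    print(prune_items_by_twu(transactions, profit, min_util))
```

In this toy data, item c survives the TWU test even though its exact utility is far below the threshold. That gap between the upper bound and the real utility is the reason TWU-based methods keep many candidates that later turn out not to be high utility, which is the problem the abstract sets out to reduce.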

References

R. Chan, Q. Yang, and Y. Shen, “Mining high utility itemsets”. In Proc. of Third IEEE Int'l Conf. on Data Mining, pp. 19-26, 2003.

R. Agrawal and R. Srikant, "Fast algorithms for mining association rules in large databases", Proc. 20th Int'l Conf. Very Large Data Bases (VLDB'94), pp. 487-499, September 1994.

Y. Liu, W. Liao, and A. N. Choudhary, "A two-phase algorithm for fast discovery of high utility itemsets", Proc. 9th Pacific-Asia Conf. Knowledge Discovery and Data Mining (PAKDD'05), pp. 689-695, May 2005.

H. Yao, H. J. Hamilton, and C. Butz, "A foundational approach to mining itemset utilities from databases", Proc. 4th SIAM Int'l Conf. Data Mining (SDM'04), pp. 482-486, April 2004.

Wei Song, Yu Liu, Jinhong Li, “Vertical Mining for High Utility Itemsets”, IEEE International Conference on Granular Computing, 2012.

C. F. Ahmed, S. K. Tanbeer, B.-S. Jeong, and Y.-K. Lee, "HUC-Prune: an efficient candidate pruning technique to mine high utility patterns", Appl. Intell., vol. 34, pp. 181-198, April 2011.

Y. Liu, W. Liao, and A. Choudhary, “A fast high utility itemsets mining algorithm”. In: Proceedings of the Utility-Based Data Mining Workshop, pp. 90–99, 2005.

Guo-Cheng Lan, Tzung-Pei Hong, Vincent S. Tseng, “An efficient projection-based indexing approach for mining high utility itemsets”, Knowl Inf Syst (2014) 38:85–107, Springer-Verlag London 2013.

Vincent S. Tseng, Cheng-Wei Wu, Bai-En Shie, Philip S. Yu, “UP-Growth: An Efficient Algorithm for High Utility Itemset Mining”, KDD’10, July 25–28, Washington, DC, USA, 2010.

A. Erwin, R. P. Gopalan, and N. R. Achuthan, "CTU-Mine: an efficient high utility itemset mining algorithm using the pattern growth approach", Proc. 7th IEEE Int'l Conf. Computer and Information Technology (CIT'07), pp. 71-76, October 2007.

IBM Quest Data Mining Project, Quest Synthetic Data Generation Code. Available at http://www.almaden.ibm.com/cs/quest/syndata.html

UCI Machine Learning Repository. Available at https://archive.ics.uci.edu/ml/datasets

Special issue No. 13 (33)
Published
2015-09-17
Section
Article