Parallel Mining for High Utility Itemsets Mining by Efficient Data Structure

Nguyen Manh Hung, Dau Hai Phong


Mining high utility itemsets in transaction databases is an important data mining task that is widely applied in many areas. Many algorithms have been proposed recently, but most of them identify high utility itemsets by first generating candidate sets from an overestimate of their utility and then calculating the exact utility of each candidate. As a result, the number of candidate itemsets is much larger than the actual number of high utility itemsets. In this paper, we introduce the Retail Transaction-Weighted Utility (RTWU) structure and propose two algorithms: the EAHUI-Miner algorithm and the parallel PEAHUI-Miner algorithm. Both were evaluated experimentally against two of the most efficient existing algorithms, EFIM and FHM. The results show that our algorithms perform better on sparse datasets.
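To illustrate the candidate-generation problem described above, the following minimal Python sketch overestimates itemset utility with the standard transaction-weighted utility (TWU) and then computes exact utilities. It is not EAHUI-Miner or the RTWU structure; the toy database, the unit profits, and the function names (`transaction_utility`, `utility`, `twu`, `mine`) are assumptions made for illustration, and the enumeration is deliberately brute force.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
profit = {"a": 5, "b": 2, "c": 1}


def transaction_utility(t):
    """Total utility of one transaction (quantity * unit profit, summed)."""
    return sum(q * profit[i] for i, q in t.items())


def utility(itemset, db):
    """Exact utility: utility of the itemset's items in transactions containing all of them."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )


def twu(itemset, db):
    """TWU: sum of whole-transaction utilities, an upper bound on exact utility."""
    return sum(transaction_utility(t) for t in db if all(i in t for i in itemset))


def mine(db, min_util):
    """Brute-force two-phase mining: keep candidates by TWU, then check exact utility."""
    items = sorted({i for t in db for i in t})
    candidates, high = [], []
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            if twu(itemset, db) >= min_util:
                candidates.append(itemset)
                if utility(itemset, db) >= min_util:
                    high.append(itemset)
    return candidates, high


candidates, high = mine(transactions, min_util=10)
print(len(candidates), "candidates,", len(high), "high utility itemsets")
```

On this toy data with a minimum utility of 10, the overestimate admits four candidates while only two of them are actually high utility itemsets, which is the gap that the abstract attributes to candidate-based methods.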

DOI: 10.32913/rd-ict.vol3.no14.519


High Utility Mining, EAHUI-Miner, PEAHUI-Miner, Utility List


