Mining Interesting Infrequent Itemsets from Very Large Data based on MapReduce Framework

T Ramakrishnudu 1,*, R B V Subramanyam 1

1. Dept. of CSE, National Institute of Technology, Warangal, 506004, India

* Corresponding author.


Received: 20 Oct. 2014 / Revised: 10 Feb. 2015 / Accepted: 15 Mar. 2015 / Published: 8 Jun. 2015

Index Terms

Data Mining, Association Rule, Frequent Itemset, Infrequent Itemset, Hadoop, MapReduce


Mining frequent and infrequent itemsets from a given dataset is one of the most important tasks in data mining. When frequent and infrequent itemsets are mined together, the infrequent itemsets become very important because they contain many valuable negative association rules. Mining frequent itemsets is highly expensive if the minimum threshold is low, whereas mining infrequent itemsets is highly expensive if the minimum threshold is high. When the dataset is very large, both the memory usage and the computational cost of mining infrequent itemsets are very high. In addition, the memory and CPU resources of a single processor are not enough to handle very large datasets. Parallel and distributed computing are effective approaches for handling large datasets. In this paper, we propose a method based on the Hadoop MapReduce model that can handle massive datasets when mining infrequent itemsets. Experiments are performed on an 8-node cluster with a synthetic dataset. The performance study shows that the proposed method is efficient in handling very large datasets.
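
As a rough illustration of the general idea, and not the algorithm proposed in the paper, support counting can be phrased as a map step that emits every k-item subset of a transaction and a reduce step that sums the emitted counts. The sketch below runs in a single Python process instead of on Hadoop. The function names, the parameter `min_support` (a relative support threshold), and the sample transactions are assumptions made for illustration only, and no interestingness measure is applied.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, k):
    """Map step: emit (itemset, 1) for every k-item subset of one transaction."""
    items = sorted(set(transaction))  # deduplicate and fix a canonical item order
    return [(itemset, 1) for itemset in combinations(items, k)]


def reduce_phase(pairs):
    """Reduce step: sum the emitted counts for each itemset key."""
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    return counts


def split_by_support(transactions, k, min_support):
    """Split the observed k-itemsets into frequent and infrequent ones.

    min_support is a hypothetical relative threshold in [0, 1]. Itemsets that
    never occur in any transaction are not emitted by the map step, so they
    appear in neither result.
    """
    pairs = [pair for t in transactions for pair in map_phase(t, k)]
    counts = reduce_phase(pairs)
    n = len(transactions)
    frequent = {s: c for s, c in counts.items() if c / n >= min_support}
    infrequent = {s: c for s, c in counts.items() if c / n < min_support}
    return frequent, infrequent


if __name__ == "__main__":
    # Made-up sample data, not taken from the paper's experiments.
    sample = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
    frequent, infrequent = split_by_support(sample, k=2, min_support=0.5)
    print("frequent:", frequent)
    print("infrequent:", infrequent)
```

In a real Hadoop job the map and reduce functions would run on separate nodes over partitions of the dataset; here they are ordinary function calls so that the data flow is easy to follow.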

Cite This Paper

T Ramakrishnudu, R B V Subramanyam, "Mining Interesting Infrequent Itemsets from Very Large Data based on MapReduce Framework", International Journal of Intelligent Systems and Applications (IJISA), vol.7, no.7, pp.44-49, 2015. DOI:10.5815/ijisa.2015.07.06


