RECENT SURVEY OF BIG DATA ANALYTICS FOR MAPREDUCE FREQUENT ITEM MINING

Sri Vasavi College, Erode Self-Finance Wing 3rd February 2017 National Conference on Computer and Communication NCCC’17

Format: Volume 5, Issue 1, No 10, 2017

Copyright: All Rights Reserved ©2017

Year of Publication: 2017

Author: S. Prakash, M. Inbavel, Dr. P. Siva Prakasam

Reference: IJCS-205

Abstract

Frequent Itemset Mining (FIM) is one of the best-known techniques for extracting knowledge from data. The combinatorial explosion inherent in FIM methods becomes even more problematic when they are applied to Big Data. Fortunately, recent advances in parallel programming already provide good tools to tackle this problem. However, these tools come with their own technical challenges, e.g. balanced data distribution and inter-communication costs. In this paper, we analyze the applicability of FIM techniques on the MapReduce platform. We propose a Confabulation-Based Parallel FIM approach, called CBP-FIM-DP, using the MapReduce programming model. FIM algorithms extract and analyze historical datasets to support decision making. The purpose of Big Data mining is to go beyond the usual request-response processing, market basket analysis, or uncovering of hidden relationships, and to implement data mining algorithms that run in parallel at very large scale. Compared with results derived from mining conventional datasets, unveiling the huge volume of interconnected heterogeneous big data has the potential to maximize our knowledge in the target domain. Our experiments show the scalability of our methods.
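
As a rough illustration of how frequent itemset counting maps onto the MapReduce model discussed in the abstract, the following Python sketch simulates one counting pass in a single process. It is not the CBP-FIM-DP method, whose details the abstract does not give. The function names (`map_phase`, `reduce_phase`), the toy transactions, and the support threshold are all hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations


def map_phase(partition, k):
    """Mapper: count every k-itemset locally within one data partition."""
    counts = Counter()
    for transaction in partition:
        # Sort and de-duplicate so each itemset has one canonical tuple form.
        items = sorted(set(transaction))
        for itemset in combinations(items, k):
            counts[itemset] += 1
    return counts


def reduce_phase(partial_counts, min_support):
    """Reducer: sum per-partition counts and keep itemsets meeting min_support."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return {itemset: n for itemset, n in total.items() if n >= min_support}


# Hypothetical toy data, already split into two partitions as a cluster would do.
partitions = [
    [["bread", "milk"], ["bread", "butter", "milk"]],
    [["butter", "milk"], ["bread", "milk"]],
]

# Each partition is mapped independently; the reducer merges the partial counts.
frequent_pairs = reduce_phase((map_phase(p, 2) for p in partitions), min_support=2)
print(frequent_pairs)
```

In a real deployment the mappers would run on separate nodes, so how transactions are divided among partitions directly affects the load balance and communication cost that the abstract names as challenges.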


Keywords

Hadoop, Frequent Item Mining, MapReduce, Parallel Algorithm, CBP-FIM

This work is licensed under a Creative Commons Attribution 3.0 Unported License.   
