EPGA: Enhanced Pattern Growth Algorithm for Sequential Pattern Mining in Hadoop-Mapreduce Framework for Big Data

Authors

  • Sujit R Wakchaure, Dr. Rajeev G Vishwakarma

DOI:

https://doi.org/10.17762/msea.v71i4.2240

Abstract

Mining frequent itemsets plays a crucial part in mining associations, relations, causality, and other significant data mining activities, because Frequent Itemset Mining (FIM) is an essential step in uncovering association rules. Owing to high memory consumption, high I/O overhead, and low processing speed, some conventional frequent itemset mining techniques cannot successfully handle very large datasets made up of many small files. As the size of the data increases, single-machine FIM therefore requires a significant amount of time and memory. To address this, a new implementation technique named the Enhanced Pattern Growth Algorithm (EPGA) is proposed. The technique relies on a MapReduce parallel environment and is designed to mine frequent itemsets in order to produce association rules. The method is validated on real-time big datasets of various sizes, on various nodes in the cluster, with speedup and reliability as the evaluation criteria. The findings indicate that the suggested method is both practicable and reasonable, and that it has the potential to enhance the overall performance and effectiveness of the Apriori and FP-Growth algorithms in order to meet the requirements of big data association rule mining.
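As a rough illustration of the general idea of counting itemset support in a map/reduce style, the following minimal Python sketch may help. It is not the authors' EPGA and does not use Hadoop; it runs in a single process, and the function names (map_phase, reduce_phase, frequent_itemsets, speedup), the sample transactions, and the absolute min_support threshold are all assumptions made for illustration. The speedup helper uses the common definition of single-node time divided by parallel time, which the abstract does not spell out.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, k):
    """Hypothetical mapper: emit (itemset, 1) for each k-item subset of one transaction."""
    items = sorted(set(transaction))  # sort so the same itemset always has the same key
    return [(itemset, 1) for itemset in combinations(items, k)]


def reduce_phase(pairs):
    """Hypothetical reducer: sum the counts emitted for each itemset key."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def frequent_itemsets(transactions, k, min_support):
    """Keep the k-itemsets whose support count is at least min_support."""
    pairs = [pair for t in transactions for pair in map_phase(t, k)]
    counts = reduce_phase(pairs)
    return {itemset: c for itemset, c in counts.items() if c >= min_support}


def speedup(t_single, t_parallel):
    """Common speedup definition (an assumption here): single-node time / parallel time."""
    return t_single / t_parallel


# Illustrative data only; not taken from the paper.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread"],
]
result = frequent_itemsets(transactions, k=2, min_support=2)
```

In a real Hadoop job the mapper outputs would be shuffled by key across nodes before reduction; here a single in-memory list stands in for that step.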

Published

2022-08-19

How to Cite

Sujit R Wakchaure, Dr. Rajeev G Vishwakarma. (2022). EPGA: Enhanced Pattern Growth Algorithm for Sequential Pattern Mining in Hadoop-Mapreduce Framework for Big Data. Mathematical Statistician and Engineering Applications, 71(4), 12372–12387. https://doi.org/10.17762/msea.v71i4.2240

Section

Articles