Analyzing the Performance of ECLAT Algorithm for Large Datasets by Comparing K-means and Gaussian Mixture Model

Authors

  • Nandar Lin, Computer Engineering and Information Technology, Yangon Technological University, Yangon, Insein, 11012, Myanmar
  • Thanda Win, Computer Engineering and Information Technology, Yangon Technological University, Yangon, Insein, 11012, Myanmar

Keywords:

Frequent Itemset Mining, Support Items, ECLAT, K-means, Gaussian Mixture Model

Abstract

Frequent Itemset Mining (FIM) is a technique that turns historical data into useful information by identifying beneficial patterns. The ECLAT algorithm performs a depth-first search, intersecting the transaction ID (TID) sets of the k-th level itemsets to compute the support of each itemset. When searching for the best-selling products, ECLAT consumes a great deal of memory and processing time because of the enormous number of transaction ID sets. To overcome these problems, a clustering method is combined with the ECLAT algorithm before the support items are retrieved. Datasets ranging from 100,000 to 400,000 description items were used to retrieve the support items of the best-selling goods. For K-means clustering, the optimal number of clusters is k = 8, with a silhouette score of 0.59. For the Gaussian Mixture Model, the optimal number of clusters is k = 14, also with a silhouette score of 0.59, across the 100,000 to 400,000 data items. After the same product items have been clustered, the ECLAT algorithm retrieves the support items using a minimum support value of 0.00001. According to the experimental results, the Gaussian Mixture Model not only offers more flexibility in clustering the same items but also reduces memory usage and execution time. These results indicate that the Gaussian Mixture Model improves the performance of the ECLAT algorithm more than K-means does.
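
The TID-set intersection that ECLAT relies on can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `eclat`, the toy transactions, and the 0.5 support threshold are assumptions chosen for illustration, and the clustering step described in the abstract is left out.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def eclat(transactions: List[List[str]], min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every frequent itemset with its relative support (hypothetical helper)."""
    n = len(transactions)
    min_count = min_support * n

    # Vertical layout: map each item to the set of transaction IDs containing it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            tidsets.setdefault(item, set()).add(tid)

    frequent: Dict[FrozenSet[str], float] = {}

    def extend(prefix: FrozenSet[str], candidates: List[Tuple[str, Set[int]]]) -> None:
        # Depth-first search: grow the prefix by one item at a time.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids) / n
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                # Support of the larger itemset is the size of the TID-set intersection.
                shared = tids & other_tids
                if shared and len(shared) >= min_count:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


# Toy data (invented for illustration only).
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
result = eclat(transactions, min_support=0.5)
for itemset, support in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), support)
```

Because every TID set is held in memory and intersected repeatedly, the cost grows with the number of transactions, which is the overhead the clustering step is meant to reduce.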

Published

2025-05-06

How to Cite

Nandar Lin, & Thanda Win. (2025). Analyzing the Performance of ECLAT Algorithm for Large Datasets by Comparing K-means and Gaussian Mixture Model. International Journal of Computer (IJC), 55(1), 1–12. Retrieved from https://ijcjournal.org/index.php/InternationalJournalOfComputer/article/view/2341

Section

Articles