
A Disk-Based Mining Algorithm for Frequent Pattern Discovery from Big Data in Distributed Computing Environments

Kawuu W. Lin,
Sheng-Hao Chung,
Chun-Yuan Hsiao,
Chun-Cheng Lin,
Pei-Ling Chen

Abstract


In distributed computing environments, mining frequent patterns across multiple computing nodes can greatly improve mining efficiency. However, memory limitations may interrupt the kernel and the computing nodes when frequent-pattern (FP) trees are built recursively, as in the FP-growth algorithm. In this paper, we propose disk-based FP-tree generation and node-based clustering mechanisms to solve the insufficient-memory problem. Results from empirical evaluations show that the proposed method delivers excellent scalability.

Keywords


Data mining; Frequent pattern mining; Clustering; Distributed computing

Citation Format:
Kawuu W. Lin, Sheng-Hao Chung, Chun-Yuan Hsiao, Chun-Cheng Lin, Pei-Ling Chen, "A Disk-Based Mining Algorithm for Frequent Pattern Discovery from Big Data in Distributed Computing Environments," Journal of Internet Technology, vol. 17, no. 6, pp. 1259-1268, Nov. 2016.


Published by Executive Committee, Taiwan Academic Network, Ministry of Education, Taipei, Taiwan, R.O.C.
JIT Editorial Office, Office of Library and Information Services, National Dong Hwa University
No. 1, Sec. 2, Da Hsueh Rd., Shoufeng, Hualien 974301, Taiwan, R.O.C.
Tel: +886-3-931-7314  E-mail: jit.editorial@gmail.com