A Disk-Based Mining Algorithm for Frequent Pattern Discovery from Big Data in Distributed Computing Environments
Abstract
In distributed computing environments, mining frequent patterns across multiple computing nodes can greatly improve mining efficiency. However, limited memory may interrupt the kernel and the computing nodes when a frequent-pattern (FP) tree is built recursively or the FP-growth algorithm is run. In this paper, we propose disk-based FP-tree generation and node-based clustering mechanisms to solve the insufficient-memory problem. Results from empirical evaluations show that the proposed method delivers excellent scalability.
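As an illustration of the general idea only, the sketch below builds an FP-tree whose nodes are rows in a SQLite table instead of in-memory objects, so the tree can grow beyond available RAM when a file path is supplied. The abstract does not describe the authors' storage layout or clustering mechanism, so the table schema, the function names `build_disk_fp_tree` and `item_support`, and the use of SQLite are all assumptions made for this example.

```python
import sqlite3
from collections import Counter


def build_disk_fp_tree(transactions, min_support, db_path=":memory:"):
    """Build an FP-tree stored as rows in SQLite (hypothetical layout).

    Pass a file path as db_path to keep the tree on disk; the default
    ":memory:" is only convenient for small experiments.
    """
    # Pass 1: count the support of each item and keep the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    conn = sqlite3.connect(db_path)
    conn.execute("DROP TABLE IF EXISTS node")
    conn.execute(
        "CREATE TABLE node ("
        "id INTEGER PRIMARY KEY, parent INTEGER, item TEXT, count INTEGER)"
    )
    # The index makes the child lookup during insertion cheap.
    conn.execute("CREATE INDEX idx_child ON node(parent, item)")
    root_id = conn.execute(
        "INSERT INTO node (parent, item, count) VALUES (NULL, NULL, 0)"
    ).lastrowid

    # Pass 2: insert each transaction along a path from the root, with
    # items ordered by descending support (ties broken by item name).
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        parent = root_id
        for item in items:
            row = conn.execute(
                "SELECT id FROM node WHERE parent = ? AND item = ?",
                (parent, item),
            ).fetchone()
            if row is None:
                parent = conn.execute(
                    "INSERT INTO node (parent, item, count) VALUES (?, ?, 1)",
                    (parent, item),
                ).lastrowid
            else:
                conn.execute(
                    "UPDATE node SET count = count + 1 WHERE id = ?", (row[0],)
                )
                parent = row[0]
    conn.commit()
    return conn, frequent


def item_support(conn, item):
    """Sum the counts of every tree node that carries the given item."""
    return conn.execute(
        "SELECT COALESCE(SUM(count), 0) FROM node WHERE item = ?", (item,)
    ).fetchone()[0]
```

Summing the node counts for an item recovers that item's support, which is a quick consistency check on the stored tree. A real disk-based miner would also need conditional-tree construction and batched writes, which this sketch leaves out.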
Keywords
Data mining; Frequent pattern mining; Clustering; Distributed computing
Citation Format:
Kawuu W. Lin, Sheng-Hao Chung, Chun-Yuan Hsiao, Chun-Cheng Lin, Pei-Ling Chen, "A Disk-Based Mining Algorithm for Frequent Pattern Discovery from Big Data in Distributed Computing Environments," Journal of Internet Technology, vol. 17, no. 6, pp. 1259-1268, Nov. 2016.
Published by Executive Committee, Taiwan Academic Network, Ministry of Education, Taipei, Taiwan, R.O.C.
JIT Editorial Office, Office of Library and Information Services, National Dong Hwa University
No. 1, Sec. 2, Da Hsueh Rd., Shoufeng, Hualien 974301, Taiwan, R.O.C.
Tel: +886-3-931-7314 E-mail: jit.editorial@gmail.com