PageFile: The Return of Classical Page Storage Structure on MapReduce Framework
Abstract
Over the last decade, the MapReduce framework has been applied in many research projects and has proved effective for processing large-scale data. However, it has mostly been used to manage unstructured and semi-structured data, and many MapReduce-based database systems have abandoned the classical page storage structure. As a result, current MapReduce systems pay little attention to massive structured data. In this paper, we propose PageFile, a hybrid page-based storage structure on the MapReduce framework. Compared with Hive's RCFile, it offers faster query processing and better disk space utilization. Moreover, we build a "multiple reduced B-Trees" structure on top of PageFile, which performs very well on single-column-value and small-range queries.
Keywords
Page structure; Hybrid store; MapReduce framework; Multiple reduced B-Trees
Citation Format:
Ye-Feng Li, Jia-Jin Le, De-Hua Chen, Mei Wang, Bin Zhang, "PageFile: The Return of Classical Page Storage Structure on MapReduce Framework," Journal of Internet Technology, vol. 18, no. 1, pp. 65-75, Jan. 2017.
Published by Executive Committee, Taiwan Academic Network, Ministry of Education, Taipei, Taiwan, R.O.C
JIT Editorial Office, Office of Library and Information Services, National Dong Hwa University
No. 1, Sec. 2, Da Hsueh Rd., Shoufeng, Hualien 974301, Taiwan, R.O.C.
Tel: +886-3-931-7314 E-mail: jit.editorial@gmail.com