
PageFile: The Return of Classical Page Storage Structure on MapReduce Framework

Ye-Feng Li,
Jia-Jin Le,
De-Hua Chen,
Mei Wang,
Bin Zhang

Abstract


The MapReduce framework has been widely adopted in research over the last decade and has proved effective for processing large-scale data. However, it has mostly been used to manage unstructured and semi-structured data, and many MapReduce-based database systems have abandoned the classical page storage structure. As a result, current MapReduce systems pay little attention to massive structured data. In this paper, we propose PageFile, a hybrid page-based storage structure on the MapReduce framework. Compared to Hive's RCFile, it offers faster query processing and better disk space utilization. Moreover, we build a "multiple reduced B-Trees" structure on top of PageFile, which performs well on single-column value queries and small-range queries.

Keywords


Page structure; Hybrid store; MapReduce framework; Multiple reduced B-Trees

Citation Format:
Ye-Feng Li, Jia-Jin Le, De-Hua Chen, Mei Wang, Bin Zhang, "PageFile: The Return of Classical Page Storage Structure on MapReduce Framework," Journal of Internet Technology, vol. 18, no. 1, pp. 65-75, Jan. 2017.

Published by Executive Committee, Taiwan Academic Network, Ministry of Education, Taipei, Taiwan, R.O.C.
JIT Editorial Office, No. 1, Sec. 2, Da Hsueh Rd. Shoufeng, Hualien 97401, Taiwan, R.O.C.
Tel: +886-3-931-7017  E-mail: jit.editorial@gmail.com