
Efficient Top-k Similarity Join of Massive Time Series Using MapReduce

Dehua Chen,
Changgan Shen,
Yue Li,
Jiajin Le,
Chunming Rong

Abstract


Top-k similarity join of time series, which finds the k most similar pairs of time series records, is a primitive operation used by many time series data analysis applications. Computing such a join is challenging today, because many modern applications generate massive amounts of time series data, and a single centralized machine cannot perform a top-k similarity join over a large time series database efficiently. In this paper, we investigate how to perform the top-k similarity join of massive time series in parallel using MapReduce over a large cluster of commodity machines. Our proposed MapReduce-based algorithm consists of four steps; it takes as input a set of time series records and outputs an ordered list of the top-k closest pairs. To improve the efficiency of the join, we propose several techniques. We first introduce an efficient distance function for time series based on LSH (Locality-Sensitive Hashing) to speed up pairwise similarity comparison. We next propose all-pair partitioning methods that minimize the amount of data transferred between the map and reduce functions. Moreover, we use a serial computation strategy to parallelize the computation of the local top-k closest pairs in each partition. Our performance study confirms the effectiveness and scalability of our MapReduce algorithms.
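
The general idea of computing local top-k closest pairs per partition and then merging them can be illustrated with a minimal single-process Python sketch. It is not the paper's four-step algorithm: the abstract does not specify the partitioning scheme or the LSH-based distance, so the round-robin partitioning, the plain Euclidean distance, and the names `top_k_join` and `local_top_k` are all assumptions made only for illustration.

```python
import heapq
from itertools import combinations
from math import dist


def local_top_k(scored_pairs, k):
    """Keep the k smallest (distance, id_a, id_b) tuples of one partition pair."""
    return heapq.nsmallest(k, scored_pairs)


def top_k_join(series, k, num_partitions=2):
    """Return the k closest pairs as sorted (distance, id_a, id_b) tuples.

    series: dict mapping a record id to an equal-length sequence of floats.
    Partitioning is round-robin (an assumption, not the paper's method), and
    Euclidean distance stands in for the paper's LSH-based distance.
    """
    ids = sorted(series)
    # "Map" phase: assign every record to one partition.
    parts = [ids[i::num_partitions] for i in range(num_partitions)]

    local_results = []
    # "Reduce" phase: each partition pair (p, q), p <= q, is handled once,
    # so every record pair is compared in exactly one place.
    for p in range(num_partitions):
        for q in range(p, num_partitions):
            if p == q:
                candidates = combinations(parts[p], 2)
            else:
                candidates = ((a, b) for a in parts[p] for b in parts[q])
            scored = (
                (dist(series[a], series[b]), *sorted((a, b)))
                for a, b in candidates
            )
            local_results.extend(local_top_k(scored, k))

    # Final merge: the global top-k is contained in the union of local top-k lists.
    return heapq.nsmallest(k, local_results)


if __name__ == "__main__":
    data = {"a": [0, 0], "b": [0, 1], "c": [5, 5], "d": [5, 7]}
    for distance, left, right in top_k_join(data, k=2):
        print(left, right, distance)
```

Because each pair of records falls into exactly one partition pair, merging the local lists yields the same answer as a brute-force comparison of all pairs; in a real MapReduce job each partition pair would be processed by a separate reducer.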

Keywords


Massive time series; Top-k similarity join; MapReduce; Parallel computation

Citation Format:
Dehua Chen, Changgan Shen, Yue Li, Jiajin Le, Chunming Rong, "Efficient Top-k Similarity Join of Massive Time Series Using MapReduce," Journal of Internet Technology, vol. 15, no. 6, pp. 1025-1032, Nov. 2014.


Published by Executive Committee, Taiwan Academic Network, Ministry of Education, Taipei, Taiwan, R.O.C
JIT Editorial Office, Office of Library and Information Services, National Dong Hwa University
No. 1, Sec. 2, Da Hsueh Rd., Shoufeng, Hualien 974301, Taiwan, R.O.C.
Tel: +886-3-931-7314  E-mail: jit.editorial@gmail.com