Research on a Hot Topic Detection System Based on News Comments (基於新聞評論的熱點話題發現系統研究)

程軍軍 (Jun-Jun Cheng), 劉雲 (Yun Liu)

Abstract


In the current era of information explosion, extracting valuable information from a specific medium is a practical problem. This paper analyzes the characteristics of news reports and proposes a set of algorithms for hot topic detection, which aim to identify the topics that are currently most actively discussed; the resulting system has been implemented. The paper focuses on the three functional modules of the algorithm: pre-processing, clustering, and post-clustering processing (heat scoring). A similarity formula is selected to suit the characteristics of the data, the proposed heat scoring formula is tested, and the heat scoring algorithm is compared with another method. Finally, experiments on the data stored in the database verify that the proposed algorithms are effective and reasonable.
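The abstract names the three modules but does not give the similarity or heat scoring formulas, so the following is only a minimal sketch of how such a pipeline could fit together. Everything specific in it is an assumption made for illustration, not the authors' method: whitespace tokenization with a toy stopword list, cosine similarity over term counts, single-pass clustering with a fixed threshold, and a heat score equal to the total comment count of a cluster. All function names are hypothetical.

```python
import math
from collections import Counter

# Illustrative stopword list only; a real system would use a full list
# and, for Chinese text, a word segmenter instead of whitespace splitting.
STOPWORDS = {"the", "a", "of", "in", "on", "to", "and", "is"}

def preprocess(text):
    """Pre-processing: lowercase, split on whitespace, drop stopwords."""
    return Counter(w for w in text.lower().split() if w not in STOPWORDS)

def cosine(a, b):
    """Cosine similarity of two term-count vectors (assumed similarity formula)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def cluster(vectors, threshold=0.3):
    """Clustering: single-pass; join the most similar cluster or start a new one."""
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)  # accumulate term counts
            best["members"].append(i)
    return clusters

def rank_hot_topics(items, threshold=0.3):
    """Heat scoring (assumed): sum of comment counts per cluster, hottest first.

    items is a list of (text, comment_count) pairs.
    """
    vectors = [preprocess(text) for text, _ in items]
    scored = [
        (sum(items[i][1] for i in c["members"]), c["members"])
        for c in cluster(vectors, threshold)
    ]
    return sorted(scored, key=lambda s: s[0], reverse=True)

items = [
    ("stock market falls sharply", 10),
    ("stock market rebounds", 5),
    ("football final tonight", 8),
]
print(rank_hot_topics(items))
```

In this toy input the two stock market items fall into one cluster and outrank the football item on combined comment count. The threshold of 0.3 is arbitrary; in practice it would be tuned on the data, in the same spirit as the similarity formula selection described in the abstract.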

Keywords


Data Mining; Text Clustering; Hot Topic Detection; Temperature Sorting

Citation Format:
程軍軍 (Jun-Jun Cheng), 劉雲 (Yun Liu), "基於新聞評論的熱點話題發現系統研究," Journal of Internet Technology, vol. 9, no. 5, pp. 433-436, Dec. 2008.


Published by Executive Committee, Taiwan Academic Network, Ministry of Education, Taipei, Taiwan, R.O.C
JIT Editorial Office, Office of Library and Information Services, National Dong Hwa University
No. 1, Sec. 2, Da Hsueh Rd., Shoufeng, Hualien 974301, Taiwan, R.O.C.
Tel: +886-3-931-7314  E-mail: jit.editorial@gmail.com