Content Enrichment Using Linked Open Data for News Classification

Hsin-Chang Yang, Yu-Chih Wang

Abstract


In the Web era, people increasingly rely on the Web rather than traditional media such as newspapers to receive news. However, the volume of news generated online is so large that it hinders readers from finding the news that interests them. Most common newswire sites still classify news manually, which costs considerable human effort and may produce inconsistent results. Over the past decades, text classification has been an active topic and has received attention from scholars in areas such as natural language processing, information retrieval, and machine learning. Various classification algorithms and models have been developed to tackle this problem. Meanwhile, Tim Berners-Lee proposed the concept of linked data in 2006, and linked open data (LOD) resources have been widely constructed since then. In this study, we incorporate LOD into a news classification system. We collected four datasets to evaluate classification accuracy across various text lengths, with and without LOD. Three classification algorithms, namely k-nearest neighbors, support vector machines, and decision trees, were used to classify the news. The experimental results show that linked open data can improve news classification accuracy, especially for short texts or small datasets.
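
The idea of enriching short news texts with linked open data before classification can be illustrated with a minimal sketch. It is not the authors' implementation: the `TOY_LOD` table, the helper names (`tokenize`, `enrich`, `cosine`, `knn_classify`), and the two training headlines are all hypothetical and invented for illustration, and a plain dictionary stands in for a real LOD source. The sketch uses a k-nearest-neighbors classifier with cosine similarity over bag-of-words counts, one of the three algorithm families named in the abstract.

```python
import math
from collections import Counter

# Hypothetical stand-in for a linked open data lookup (entity -> related terms).
# A real system would query an LOD source; these entries are invented.
TOY_LOD = {
    "messi": ["football", "sport", "athlete"],
    "nasdaq": ["stock", "finance", "market"],
    "ecb": ["bank", "finance", "euro"],
}


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def enrich(text, lod=TOY_LOD):
    """Return the tokens plus the related terms of every token found in the lookup."""
    tokens = tokenize(text)
    extra = [term for tok in tokens for term in lod.get(tok, [])]
    return tokens + extra


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(text, train, k=1, use_lod=True):
    """Return (majority label among the k most similar documents, best similarity)."""
    prep = enrich if use_lod else tokenize
    query = Counter(prep(text))
    scored = sorted(
        ((cosine(query, Counter(prep(doc))), label) for doc, label in train),
        reverse=True,
    )
    top = scored[:k]
    votes = Counter(label for _, label in top)
    return votes.most_common(1)[0][0], top[0][0]


# Two invented training headlines, one per category.
train = [
    ("messi scores twice", "sports"),
    ("nasdaq closes higher", "business"),
]

with_lod = knn_classify("ecb raises rates", train, use_lod=True)
without_lod = knn_classify("ecb raises rates", train, use_lod=False)
```

In this toy setup the query headline shares no words with either training headline, so its best similarity without enrichment is zero and the returned label is an arbitrary tie-break. With enrichment, the added term "finance" links the query to the business headline. This mirrors, on a very small scale, why enrichment might help most for short texts or small datasets.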


Citation Format:
Hsin-Chang Yang, Yu-Chih Wang, "Content Enrichment Using Linked Open Data for News Classification," Journal of Internet Technology, vol. 21, no. 5, pp. 1397-1407, Sep. 2020.

Published by Executive Committee, Taiwan Academic Network, Ministry of Education, Taipei, Taiwan, R.O.C
JIT Editorial Office, Office of Library and Information Services, National Dong Hwa University
No. 1, Sec. 2, Da Hsueh Rd., Shoufeng, Hualien 974301, Taiwan, R.O.C.
Tel: +886-3-931-7314  E-mail: jit.editorial@gmail.com