Multi-label Scientific Document Classification

Tariq Ali
Sohail Asghar

Abstract


Identifying labels for scientific documents is a significant research area with numerous applications, such as digital libraries. Authors typically assign one or more categories to their documents manually. These categories are organized in a tree-structured taxonomy, such as the ACM CCS. The problem becomes more complex when a document belongs to multiple categories, and manual assignment becomes harder as the number of expected labels increases. Moreover, existing schemes fall short of high accuracy on real scientific document datasets. One way to handle multi-label classification is to transform the problem into single-label classification; another is to adapt the algorithm itself to handle multiple labels. Our research focuses on the transformation approach. We propose a solution inspired by the particle swarm optimization algorithm that assigns a label from the taxonomy. A set of similarity measures for document relatedness, used in the proposed approach, is also evaluated. The proposed solution is evaluated on two document datasets retrieved from J.UCS and ACM, achieving an average accuracy of 77 percent in comparison with state-of-the-art algorithms.
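To illustrate what converting a multi-label problem into single-label problems can look like, here is a minimal sketch of one common transformation, binary relevance, which builds one yes/no dataset per label. This is not necessarily the transformation used in the paper, and it does not include the particle swarm optimization step; the function name `binary_relevance`, the document ids, and the example labels are assumptions made for illustration.

```python
def binary_relevance(docs):
    """Split a multi-label dataset into one binary dataset per label.

    docs: list of (doc_id, set_of_labels) pairs (hypothetical input format).
    Returns a dict mapping each label to a list of (doc_id, 0 or 1) pairs,
    where 1 means the document carries that label.
    """
    # Collect every label that appears anywhere in the dataset.
    all_labels = sorted(set().union(*(labels for _, labels in docs)))
    # For each label, mark each document as a positive (1) or negative (0) example.
    return {
        label: [(doc_id, int(label in labels)) for doc_id, labels in docs]
        for label in all_labels
    }


# Illustrative documents; the labels are example category names, not data from the paper.
docs = [
    ("d1", {"Information systems", "Computing methodologies"}),
    ("d2", {"Computing methodologies"}),
    ("d3", {"Software and its engineering"}),
]
datasets = binary_relevance(docs)
```

Each entry in `datasets` can then be handed to any single-label (binary) classifier, and the per-label predictions are combined to recover the set of labels for a document.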


Citation Format:
Tariq Ali, Sohail Asghar, "Multi-label Scientific Document Classification," Journal of Internet Technology, vol. 19, no. 6, pp. 1707-1716, Nov. 2018.


Published by Executive Committee, Taiwan Academic Network, Ministry of Education, Taipei, Taiwan, R.O.C
JIT Editorial Office, Office of Library and Information Services, National Dong Hwa University
No. 1, Sec. 2, Da Hsueh Rd., Shoufeng, Hualien 974301, Taiwan, R.O.C.
Tel: +886-3-931-7314  E-mail: jit.editorial@gmail.com