Unsupervised Hindi word sense disambiguation using graph-based centrality measures

Prajna Jha, Shreya Agarwal, Ali Abbas, Satyendr Singh, Tanveer Jahan Siddiqui

Abstract


Word sense disambiguation (WSD) plays a key role in many natural language processing applications. In this paper, we propose a novel unsupervised method for the targeted Hindi WSD task. First, we create a weighted graph whose nodes correspond to the synsets of the target word and of its neighboring context words. The edges in the graph represent the semantic relations between these synsets in the Hindi WordNet hierarchy. A path-based similarity measure, namely the Leacock-Chodorow similarity measure, is used to assign weights to the edges. An unsupervised weighted graph-based centrality algorithm is then used to identify the correct sense of the target word in a given context. The performance of the proposed algorithm is measured on 20 ambiguous Hindi nouns using four different graph-based centrality measures. We observed a maximum accuracy of 66.92% using the PageRank centrality measure, which is significantly better than that of earlier reported graph-based Hindi WSD algorithms evaluated on the same dataset.
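The pipeline described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy taxonomy `PARENT` and sense inventory `SENSES` are hypothetical stand-ins for Hindi WordNet, English labels are used for readability, edges are drawn only between synsets of different words, and only one of the four centrality measures (PageRank) is shown. The Leacock-Chodorow weight is computed in its usual form, -log(len / (2 * D)), where len is the number of nodes on the shortest path and D is the maximum taxonomy depth. This is not the authors' implementation.

```python
import math
from itertools import combinations

# Hypothetical toy taxonomy (child -> parent), standing in for Hindi WordNet.
PARENT = {
    "entity": None,
    "finance_concept": "entity",
    "natural_object": "entity",
    "bank_finance": "finance_concept",
    "money": "finance_concept",
    "loan": "finance_concept",
    "bank_river": "natural_object",
    "river": "natural_object",
}

# Hypothetical sense inventory: word -> candidate synsets.
SENSES = {
    "bank": ["bank_finance", "bank_river"],
    "money": ["money"],
    "loan": ["loan"],
}


def ancestors(node):
    """Return the chain node, parent, ..., root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def lch_similarity(a, b, max_depth):
    """Leacock-Chodorow similarity: -log(path_nodes / (2 * max_depth))."""
    up_a, up_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:  # lowest common ancestor
            edges = steps_a + up_b.index(node)
            break
    return -math.log((edges + 1) / (2.0 * max_depth))


def weighted_pagerank(weights, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected weighted graph."""
    nodes = sorted({n for edge in weights for n in edge})
    strength = {n: 0.0 for n in nodes}
    for (u, v), w in weights.items():
        strength[u] += w
        strength[v] += w
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for (u, v), w in weights.items():
            # Each node passes rank to a neighbor in proportion to edge weight.
            new[v] += damping * rank[u] * w / strength[u]
            new[u] += damping * rank[v] * w / strength[v]
        rank = new
    return rank


def disambiguate(target, context):
    """Pick the target synset with the highest PageRank score."""
    max_depth = max(len(ancestors(n)) for n in PARENT)
    weights = {}
    for w1, w2 in combinations([target] + context, 2):
        for s1 in SENSES[w1]:
            for s2 in SENSES[w2]:
                weights[(s1, s2)] = lch_similarity(s1, s2, max_depth)
    rank = weighted_pagerank(weights)
    return max(SENSES[target], key=lambda s: rank[s])


print(disambiguate("bank", ["money", "loan"]))
```

In this toy setting the financial sense of "bank" sits closer in the taxonomy to "money" and "loan" than the river sense does, so it receives heavier edges and a higher centrality score. Swapping `weighted_pagerank` for another centrality function would be one way to compare measures, as the paper does with four of them.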

Keywords


Hindi WordNet; Natural language processing; Path-based similarity measure; Weighted graph-based centrality measures; Word sense disambiguation


DOI: http://doi.org/10.11591/ijai.v13.i4.pp4957-4964


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
