Unsupervised Hindi word sense disambiguation using graph-based centrality measures
Abstract
The task of word sense disambiguation (WSD) plays a key role in multiple applications of natural language processing. In this paper, we propose a novel unsupervised method for the targeted Hindi WSD task. First, we create a weighted graph in which the nodes correspond to the synsets of the target word and of its neighboring context words. The edges in the graph represent the semantic relations between these synsets in the Hindi WordNet hierarchy. A path-based similarity measure, the Leacock-Chodorow similarity measure, is used to assign weights to the edges. An unsupervised weighted graph-based centrality algorithm is then used to identify the correct sense of the target word in a given context. The performance of the proposed algorithm is measured on 20 ambiguous Hindi nouns using four different graph-based centrality measures. We observed a maximum accuracy of 66.92% using the PageRank centrality measure, which is significantly better than that of earlier reported graph-based Hindi WSD algorithms evaluated on the same dataset.
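The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy taxonomy, the English sense labels, the function names, the damping factor, and the choice to connect only senses of different words are all assumptions made for illustration, and a real system would query Hindi WordNet instead of the hypothetical `PARENT` table.

```python
import math

# Hypothetical toy hypernym taxonomy (child -> parent); a stand-in for Hindi WordNet.
PARENT = {
    "entity": None,
    "abstraction": "entity",
    "object": "entity",
    "institution": "abstraction",
    "possession": "abstraction",
    "landform": "object",
    "bank_financial": "institution",
    "bank_river": "landform",
    "money": "possession",
    "loan": "possession",
}

def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path

# Taxonomy depth D, counted in nodes (the root has depth 1).
MAX_DEPTH = max(len(path_to_root(n)) for n in PARENT)

def path_length(a, b):
    """Number of nodes on the shortest path from a to b via their lowest common ancestor."""
    index_b = {n: i for i, n in enumerate(path_to_root(b))}
    for i, n in enumerate(path_to_root(a)):
        if n in index_b:
            return i + index_b[n] + 1
    raise ValueError("nodes share no common ancestor")

def lch_similarity(a, b):
    """Leacock-Chodorow similarity: -log(path_length / (2 * D))."""
    return -math.log(path_length(a, b) / (2.0 * MAX_DEPTH))

def build_graph(senses_by_word):
    """Connect senses of different words (an assumption), weighting edges by LCH similarity."""
    weights = {s: {} for senses in senses_by_word.values() for s in senses}
    words = list(senses_by_word)
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            for s1 in senses_by_word[w1]:
                for s2 in senses_by_word[w2]:
                    sim = lch_similarity(s1, s2)
                    weights[s1][s2] = sim
                    weights[s2][s1] = sim
    return weights

def weighted_pagerank(weights, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected weighted graph."""
    nodes = list(weights)
    strength = {n: sum(weights[n].values()) for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        rank = {
            n: (1.0 - damping) / len(nodes)
            + damping * sum(rank[m] * w / strength[m] for m, w in weights[n].items())
            for n in nodes
        }
    return rank

def disambiguate(target, senses_by_word):
    """Pick the sense of `target` with the highest centrality score."""
    scores = weighted_pagerank(build_graph(senses_by_word))
    return max(senses_by_word[target], key=scores.get), scores

# Hypothetical context: the target word "bank" appears near "money" and "loan".
context = {
    "bank": ["bank_financial", "bank_river"],
    "money": ["money"],
    "loan": ["loan"],
}
best_sense, scores = disambiguate("bank", context)
print(best_sense)
```

In this toy context the financial sense lies closer to both context senses in the taxonomy, so it receives heavier edges and a higher PageRank score than the river sense. Swapping `weighted_pagerank` for another centrality function would give the kind of comparison across centrality measures that the abstract reports.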
Keywords
Hindi WordNet; Natural language processing; Path-based similarity measure; Weighted graph-based centrality measures; Word sense disambiguation
DOI: http://doi.org/10.11591/ijai.v13.i4.pp4957-4964
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).