Financial text embeddings for the Russian language: a global vectors-based approach
Abstract
The article presents a software implementation of a linguistic embedding method for the Russian language, based on the global vectors for word representation (GloVe) model. GloVe produces word vectors that reflect the semantic and syntactic properties of words. The resulting vector model can be used in various natural language processing (NLP) tasks, such as machine translation and text clustering. The article describes the architecture of software that implements a GloVe-like algorithm for Russian-language financial texts, along with the mechanisms used to train the model and to compute the word vectors. Testing with typical classification methods demonstrated that the developed program generates accurate vector representations of Russian-language texts and is effective in various NLP tasks. This work is one of the first studies devoted to a software implementation of the GloVe method for the Russian language using learning algorithms based on sparse matrices.
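For readers unfamiliar with GloVe, the following is a minimal sketch of the standard objective from the original GloVe formulation, not the authors' implementation. It stores co-occurrence counts sparsely as a dictionary of non-zero entries and evaluates the weighted least-squares loss over those entries only. The function names, the window size, the 1/distance weighting, and the defaults `x_max=100` and `alpha=0.75` are assumptions taken from the commonly cited GloVe setup; the abstract does not state which values the described software uses.

```python
import math
from collections import defaultdict


def build_cooccurrence(tokens, window=2):
    """Return (vocab, counts) where counts is a sparse dict {(i, j): x_ij}.

    Each pair within `window` positions contributes 1/distance, symmetrically.
    Only non-zero entries are stored, which is what makes the matrix sparse.
    """
    vocab = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    ids = [vocab[token] for token in tokens]

    counts = defaultdict(float)
    for pos, wi in enumerate(ids):
        for offset in range(1, window + 1):
            ctx = pos + offset
            if ctx >= len(ids):
                break
            wj = ids[ctx]
            counts[(wi, wj)] += 1.0 / offset
            counts[(wj, wi)] += 1.0 / offset
    return vocab, dict(counts)


def glove_weight(x, x_max=100.0, alpha=0.75):
    """GloVe weighting function f(x): damps rare pairs, caps frequent ones at 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0


def glove_loss(counts, W, W_ctx, b, b_ctx, x_max=100.0, alpha=0.75):
    """Sum of f(x_ij) * (w_i . w~_j + b_i + b~_j - log x_ij)^2 over non-zero x_ij."""
    total = 0.0
    for (i, j), x in counts.items():
        dot = sum(a * c for a, c in zip(W[i], W_ctx[j]))
        diff = dot + b[i] + b_ctx[j] - math.log(x)
        total += glove_weight(x, x_max, alpha) * diff * diff
    return total


# Toy usage on a four-token Russian sentence fragment (illustrative data only).
vocab, counts = build_cooccurrence(["банк", "повысил", "ставку", "банк"], window=2)
dim = 2
W = [[0.0] * dim for _ in vocab]
W_ctx = [[0.0] * dim for _ in vocab]
b = [0.0] * len(vocab)
b_ctx = [0.0] * len(vocab)
loss = glove_loss(counts, W, W_ctx, b, b_ctx)
```

Training would minimise this loss with a gradient method (the reference GloVe implementation uses AdaGrad) over the word vectors and biases. Because the sum runs only over stored pairs, the cost per pass scales with the number of non-zero co-occurrences rather than with the square of the vocabulary size.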
Keywords
Global vectors; Linguistic embedding; Machine learning; Natural language processing; Russian language
Full Text: PDF
DOI: http://doi.org/10.11591/ijai.v14.i1.pp692-701
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).