A deep learning based technique for plagiarism detection: a comparative study

Hambi El Mostafa, Faouzia Benabbou

Abstract


Easy access to resources on the web has democratized access to information, but it has also given rise to serious plagiarism problems. Many plagiarism techniques have been identified in the literature, but plagiarism of ideas is still the most difficult to detect, because it combines several text manipulations at the same time. A few strategies have been proposed to perform semantic plagiarism detection, but there are still numerous challenges to overcome. Unlike existing state-of-the-art reviews, this study gives an overview of the different proposals for plagiarism detection based on deep learning algorithms. The main goal of these approaches is to provide high-quality vector representations of words or sentences. In this paper, we propose a comparative study based on a set of criteria: vector representation method, level of treatment, similarity method, and dataset. One result of this study is that most of the research works at word granularity and uses the word2vec method for word vector representation, which is sometimes unsuitable for preserving the meaning of whole sentences. Each technique has strengths and weaknesses; however, none is quite mature for semantic plagiarism detection.
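
The limitation of word-level representations noted above can be illustrated with a minimal sketch. It assumes one common way of building a sentence vector from word vectors, namely averaging them, and compares sentences with cosine similarity; the abstract does not say that the surveyed approaches do exactly this. The three-dimensional vectors and the names `TOY_VECTORS`, `sentence_vector` and `cosine_similarity` are hypothetical and chosen only for illustration; real word2vec embeddings would be learned from a corpus.

```python
import math

# Hypothetical toy embeddings (assumption): stand-ins for learned word2vec vectors.
TOY_VECTORS = {
    "dog": [0.9, 0.1, 0.0],
    "bites": [0.1, 0.8, 0.2],
    "man": [0.7, 0.2, 0.3],
}


def sentence_vector(tokens):
    """Build a sentence vector by averaging the word vectors of its tokens."""
    vectors = [TOY_VECTORS[token] for token in tokens]
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Same words, opposite meaning: averaging discards word order.
s1 = ["dog", "bites", "man"]
s2 = ["man", "bites", "dog"]
similarity = cosine_similarity(sentence_vector(s1), sentence_vector(s2))
print(f"similarity = {similarity:.6f}")
```

Because averaging ignores word order, the two sentences receive essentially identical vectors even though their meanings differ. This is one way a word-granularity representation can fail to keep the meaning of a whole sentence.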


Keywords


Plagiarism, Deep Learning, Preprocessing, Doc2vec, Neural network


DOI: http://doi.org/10.11591/ijai.v9.i1.pp81-90

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.