A benchmark of health insurance fraud detection using machine learning techniques

Ossama Cherkaoui, Houda Anoun, Abderrahim Maizate


Health insurance fraud is a complex problem with a significant financial impact. With the recent availability of large volumes of data and growth in computing power, machine learning techniques have become the preferred method for fraud detection. However, the main difficulty facing researchers in this field is the lack of real data sets and the absence of reliable fraud labels. Most published studies test fraud detection algorithms on aggregated provider-level or simulated data, which may not deliver accurate results. The present study aims to provide a more accurate assessment of fraud detection methods by using real, detailed health insurance claims data to compare six of the most common supervised classification algorithms, including neural networks, together with two categorical feature preparation methods. The study was conducted under the guidance of insurance experts, who provided the fraud label inference rules and reviewed the results. This paper provides a comprehensive description of the benchmarking process and an interpretation of the results. The results show that supervised classification can be used effectively to detect health insurance fraud, improving detection accuracy by a factor of 4.2 (84% recall for a positive rate of 20%).
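The reported factor of 4.2 is consistent with dividing recall by the positive rate (0.84 / 0.20), which is commonly called lift over random selection. That reading is an interpretation of the abstract's figures, not a formula the abstract states. The Python sketch below shows how such a figure could be computed from model scores; the function name `recall_and_lift` and the toy scores and labels are illustrative assumptions and are not taken from the study.

```python
def recall_and_lift(scores, labels, positive_rate):
    """Flag the top `positive_rate` share of claims by score.

    Returns (recall, lift), where lift = recall / share of claims flagged.
    Hypothetical helper for illustration; not the authors' code.
    """
    n_claims = len(scores)
    n_flagged = round(n_claims * positive_rate)
    # Rank claim indices from most to least suspicious.
    ranked = sorted(range(n_claims), key=lambda i: scores[i], reverse=True)
    flagged = ranked[:n_flagged]
    caught = sum(labels[i] for i in flagged)   # frauds among flagged claims
    total_fraud = sum(labels)                  # frauds in the whole set
    recall = caught / total_fraud
    lift = recall / (n_flagged / n_claims)
    return recall, lift


# Toy data (made up): 10 claims, 3 of them fraudulent (label 1).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
toy_recall, toy_lift = recall_and_lift(toy_scores, toy_labels, 0.20)

# The abstract's own figures under this reading: 84% recall at a 20% positive rate.
abstract_lift = round(0.84 / 0.20, 2)
print(abstract_lift)  # 4.2
```

Under this reading, reviewing only the 20% of claims with the highest scores would surface 84% of the fraudulent ones, against the 20% expected from picking claims at random.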



Keywords: Anomaly detection; Fraud detection; Health insurance fraud; Machine learning; Supervised classification



DOI: http://doi.org/10.11591/ijai.v13.i2.pp1925-1934



This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
