Sampling methods in handling imbalanced data for Indonesia health insurance dataset

Felix Indra Kurniadi; Kartika Purwandari; Ajeng Wulandari; Syarifah Diana Permai

doi:10.11591/ijai.v13.i1.pp348-357

Abstract

Health insurance fraud is one of the most frequently occurring types of fraud and has become a concern for every insurer. According to data from the Indonesian General Insurance Association, or Asosiasi Asuransi Umum Indonesia (AAUI), the private insurance industry suffered losses of up to billions of rupiah throughout 2018 due to fraudulent acts. Fraud detection in Indonesia faces two problems: the current system is highly vulnerable and detection is still done manually, and the data are imbalanced, as often occurs in fraud cases. In this research, we applied several sampling methods, using several machine learning algorithms as baselines. The results show that the instance hardness threshold algorithm combined with extreme gradient boosting gives the best performance in all cases. This indicates that the method can reduce bias and achieve better generalization.
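
To illustrate the idea behind hardness-based undersampling, the following is a minimal sketch in plain Python, not the authors' implementation. It makes several assumptions that are not in the paper: hardness is approximated by the fraction of a sample's k nearest neighbours that carry a different label (a k-disagreeing-neighbours measure), the data are synthetic, and the function names `knn_hardness` and `undersample_hard_majority` are hypothetical. Library implementations such as `InstanceHardnessThreshold` in imbalanced-learn instead estimate hardness from cross-validated classifier probabilities, and the classifier stage (extreme gradient boosting in the paper) is omitted here.

```python
import math
import random


def knn_hardness(X, y, k=5):
    """Hardness of each sample: fraction of its k nearest neighbours with a different label."""
    hardness = []
    for i, xi in enumerate(X):
        # Distances from sample i to every other sample, nearest first.
        dists = sorted((math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i)
        neighbours = [j for _, j in dists[:k]]
        disagree = sum(1 for j in neighbours if y[j] != y[i])
        hardness.append(disagree / k)
    return hardness


def undersample_hard_majority(X, y, majority_label, k=5):
    """Keep all minority samples and only the easiest majority samples (balanced output)."""
    hardness = knn_hardness(X, y, k)
    majority = [i for i, label in enumerate(y) if label == majority_label]
    minority = [i for i, label in enumerate(y) if label != majority_label]
    n_keep = max(len(minority), 1)
    # Lowest hardness first, so the hardest majority samples are the ones dropped.
    keep_majority = sorted(majority, key=lambda i: hardness[i])[:n_keep]
    keep = sorted(keep_majority + minority)
    return [X[i] for i in keep], [y[i] for i in keep]


# Synthetic, illustrative data only: 90 "legitimate" claims (0) and 10 "fraud" claims (1).
rng = random.Random(0)
X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(90)]
X += [(rng.gauss(2, 1), rng.gauss(2, 1)) for _ in range(10)]
y = [0] * 90 + [1] * 10

X_res, y_res = undersample_hard_majority(X, y, majority_label=0)
print(y_res.count(0), y_res.count(1))  # prints "10 10"
```

The resampled set could then be passed to any baseline classifier. Dropping the hardest majority samples removes those that overlap most with the minority class, which is one plausible reason such a method can reduce bias toward the majority class.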

Keywords

Health insurance fraud; Machine learning; Sampling methods

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES).
