A survey of missing data imputation techniques: statistical methods, machine learning models, and GAN-based approaches

Rifaa Sadegh, Ahmed Mohameden, Mohamed Lemine Salihi, Mohamedade Farouk Nanne

Abstract


Handling missing data well is critical to data analysis across diverse domains. This survey evaluates traditional statistical, machine learning, and generative adversarial network (GAN)-based imputation methods, emphasizing their strengths, limitations, and applicability to different data types and to the three missing data mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). GAN-based models, including the generative adversarial imputation network (GAIN), the view imputation generative adversarial network (VIGAN), and SolarGAN, are highlighted for their adaptability and effectiveness on complex datasets such as images and time series. Despite challenges such as high computational demands, GANs outperform conventional methods at capturing non-linear dependencies. Future work includes optimizing GAN architectures for a broader range of data types and exploring hybrid models to improve imputation accuracy and scalability in real-world applications.
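
The three missing data mechanisms named in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the helper names `make_masks` and `mean_impute`, the two-column synthetic dataset, the 30% missing rate, and the deterministic threshold rule used for MAR and MNAR are all assumptions chosen for illustration. It hides values in the second column under each mechanism and measures how far simple mean imputation shifts the column mean.

```python
import numpy as np

def make_masks(x, rate=0.3, seed=0):
    """Hypothetical helper: boolean masks (True = missing) for column 1 of x."""
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    # MCAR: missingness is independent of every value in the data.
    mcar = rng.random(n) < rate
    # MAR: missingness in column 1 depends only on the observed column 0
    # (simplified here to a hard threshold on column 0).
    mar = x[:, 0] > np.quantile(x[:, 0], 1 - rate)
    # MNAR: missingness in column 1 depends on the hidden value itself.
    mnar = x[:, 1] > np.quantile(x[:, 1], 1 - rate)
    return {"MCAR": mcar, "MAR": mar, "MNAR": mnar}

def mean_impute(col, mask):
    """Hypothetical helper: fill masked entries with the mean of observed ones."""
    out = col.copy()
    out[mask] = col[~mask].mean()
    return out

# Synthetic data (an assumption): two correlated Gaussian columns.
rng = np.random.default_rng(42)
x0 = rng.normal(size=2000)
x1 = 0.8 * x0 + 0.6 * rng.normal(size=2000)
x = np.column_stack([x0, x1])

masks = make_masks(x)
true_mean = x[:, 1].mean()
# Bias of the imputed column mean relative to the fully observed mean.
bias = {name: mean_impute(x[:, 1], m).mean() - true_mean
        for name, m in masks.items()}

for name, b in bias.items():
    print(f"{name}: bias of imputed mean = {b:+.3f}")
```

In this toy setting the imputed mean stays close to the true mean under MCAR, while it is pulled downward under MAR and MNAR because the largest values are the ones hidden. This is one reason a method's suitability depends on the missingness mechanism.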


Keywords


Data imputation; Generative adversarial networks; Machine learning; Missing data; Statistical methods


DOI: http://doi.org/10.11591/ijai.v14.i4.pp2876-2888


Copyright (c) 2025 Institute of Advanced Engineering and Science

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES).
