Choosing allowability boundaries for describing objects in subject areas

Musulmon Lolaev, Shavkat Madrakhimov, Kodirbek Makharov, Doniyor Saidov

Abstract


Anomaly detection is a promising research problem; it can serve as a stand-alone task or as a preprocessing step before other fundamental data mining tasks. This article proposes a method for detecting specific errors that involves subject-area experts in building the knowledge base. The method rests on the hypothesis that outliers lie close to the logical boundaries of intervals derived from pairs of features, and that the ranges of these intervals vary across domains. We construct the intervals from the values of feature pairs. While forming knowledge in a specific field, a domain specialist checks the logical allowability of each object against the ranges of the intervals. If an object is a logical outlier, the specialist ignores or corrects it. We present the general algorithm for forming the database with the proposed method as pseudo-code, and we compare its results with those of existing methods.
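
The abstract does not state how the intervals are built from feature pairs, so the following Python sketch is only one possible reading and not the authors' algorithm. It assumes that each pair of features is summarized by the ratio of the two values, that the allowable interval for a pair is the minimum-to-maximum range of that ratio over a trusted reference sample, and that objects falling outside the interval (or within an optional margin of its boundary) are handed to the specialist. The names `pair_intervals`, `flag_for_review`, `margin`, and the sample data are hypothetical.

```python
from itertools import combinations


def pair_intervals(reference, eps=1e-12):
    """Build a (lo, hi) ratio interval for every feature pair.

    Assumption: the pair statistic is the ratio feature_i / feature_j,
    computed over rows the expert already trusts.
    """
    n_features = len(reference[0])
    intervals = {}
    for i, j in combinations(range(n_features), 2):
        ratios = [row[i] / row[j] for row in reference if abs(row[j]) > eps]
        if ratios:
            intervals[(i, j)] = (min(ratios), max(ratios))
    return intervals


def flag_for_review(obj, intervals, margin=0.0, eps=1e-12):
    """Return the feature pairs for which `obj` should go to the expert.

    A pair is flagged when its ratio lies outside the interval, or inside
    it but within `margin` (a fraction of the interval width) of a boundary.
    """
    flagged = []
    for (i, j), (lo, hi) in intervals.items():
        if abs(obj[j]) <= eps:
            flagged.append((i, j))  # ratio undefined: let the expert decide
            continue
        ratio = obj[i] / obj[j]
        band = margin * (hi - lo)
        if ratio < lo + band or ratio > hi - band:
            flagged.append((i, j))
    return flagged


# Hypothetical reference rows: (height in cm, weight in kg).
reference = [[170.0, 70.0], [180.0, 80.0], [160.0, 55.0]]
intervals = pair_intervals(reference)

plausible = flag_for_review([175.0, 75.0], intervals)  # ratio inside the interval
suspicious = flag_for_review([175.0, 7.5], intervals)  # likely a misplaced decimal point
print(plausible, suspicious)
```

In this reading the specialist only inspects the flagged pairs and then ignores or corrects the object; because the abstract notes that interval ranges differ between domains, the intervals would be rebuilt from a reference sample for each subject area.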

Keywords


Data cleaning; Dirty data; Invalid objects; Machine learning; Outliers; Preprocessing; Valid intervals



DOI: http://doi.org/10.11591/ijai.v13.i1.pp329-336



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).
