Enhancing data retrieval efficiency in large-scale javascript object notation datasets by using indexing techniques
Abstract
The use of javascript object notation (JSON) format as a not only structured query language (NoSQL) storage solution has grown in popularity, but has presented technical challenges, particularly in indexing large-scale JSON files. This has resulted in slow data retrieval, especially for larger datasets. In this study, we propose the use of JSON datasets to preserve data in resource survey processes. We conducted experiments on a 32-gigabyte dataset containing 1,000,000 transactions in JSON format and implemented two indexing methods, dense and sparse, to improve retrieval efficiency. Additionally, we determined the optimal range of segment sizes for the indexing methods. Our findings revealed that adopting dense indexing reduced data retrieval time from 15,635 milliseconds to 55 milliseconds in one-to-one data retrieval, and from 38,300 milliseconds to 1 millisecond in the absence of keywords. In contrast, using sparse indexing reduced data retrieval time from 33,726 milliseconds to 36 milliseconds in one-to-many data retrieval and from 47,203 milliseconds to 0.17 milliseconds when keywords were not found. Furthermore, we discovered that the optimal segment size range was between 20,000 and 200,000 transactions for both dense and sparse indexing.
Keywords
Dense indexing; Javacript object notation; Large scale dataset; Not only structured query language; Sparse indexing
Full Text:
PDFDOI: http://doi.org/10.11591/ijai.v13.i2.pp2342-2353
Refbacks
- There are currently no refbacks.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).