An improved real time detection transformer method for retail product detection
Abstract
The main challenge in retail product detection is intra-class variation, because some products share similar yet distinct characteristics. This study addresses object detection under intra-class variation in retail environments by modifying the architecture of the Real-Time Detection Transformer (RT-DETR) model. Each fusion block in the Contextual Cross-Feature Module (CCFM) originally contains two convolutions; the modification adds one convolution layer to each CCFM fusion block. A custom dataset was constructed to reflect the wide range of products commonly found in retail outlets. On this dataset, the modified model was evaluated with the mean Average Precision (mAP) metric and achieved an mAP0.5 of 99.5% and an mAP0.5:0.95 of 88.2%. The modified model outperforms the original model: mAP0.5:0.95 improved by 2.5%, precision by 1.3%, and recall by 0.1%. Although mAP0.5 remained unchanged, the gains in the other metrics suggest that the modifications to RT-DETR can improve object detection performance, particularly under intra-class variation in retail products.
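The two reported metrics differ only in the Intersection over Union (IoU) thresholds at which a predicted box counts as a match, which may explain how mAP0.5 can stay flat while mAP0.5:0.95 improves. The sketch below is illustrative only and assumes the common COCO-style convention of ten thresholds from 0.50 to 0.95 in steps of 0.05. The function names and example boxes are hypothetical and are not taken from the paper.

```python
# Illustrative sketch: names and example boxes are assumptions, not from the paper.

def iou(box_a, box_b):
    """Intersection over Union for two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes yield no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Assumed COCO-style thresholds averaged by mAP0.5:0.95: 0.50, 0.55, ..., 0.95.
THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

def thresholds_passed(pred, truth):
    """Count the thresholds at which the prediction counts as a match."""
    score = iou(pred, truth)
    return sum(score >= t for t in THRESHOLDS)

truth = (0, 0, 100, 100)
loose = (0, 0, 100, 62)  # IoU = 0.62 with the ground-truth box
tight = (0, 0, 100, 92)  # IoU = 0.92 with the ground-truth box

print(thresholds_passed(loose, truth))  # matches at 0.50, 0.55, 0.60
print(thresholds_passed(tight, truth))  # matches at 0.50 through 0.90
```

Both example predictions count as correct at the 0.5 threshold, so they contribute equally to mAP0.5. Only the tighter box continues to match at the stricter thresholds, which is the kind of localization gain that mAP0.5:0.95 is sensitive to.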
Keywords
Detection transformer; Fusion block; Intra-class variation; Object detection; Retail product
DOI: http://doi.org/10.11591/ijai.v14.i5.pp%25p
Copyright (c) 2025 Andi Wahyu Maulana, Suryo Adhi Wibowo
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES).