Activity recognition based on spatio-temporal features with transfer learning
Abstract
Human action recognition has emerged as a significant area of study due to its diverse applications. This research investigates convolutional neural network (CNN) architectures for extracting spatio-temporal features from 2D images. Using pre-trained residual network 50 (ResNet50) and Visual Geometry Group 16 (VGG16) networks through transfer learning, intricate human actions can be discerned more effectively. These networks extract and merge spatio-temporal features, which are then used to train a support vector machine (SVM) classifier. The refined approach yielded an accuracy of 89.71% on the UCF-101 dataset. On the UCF YouTube action dataset, activities such as basketball playing and cycling were successfully identified using the ResNet50 and VGG16 models. Despite variations in frame dimensions, 3D CNN models demonstrated notable proficiency in video classification. The training phase achieved an accuracy of 95.6%. Such advances in leveraging pre-trained neural networks offer promising prospects for enhancing human activity recognition, especially in areas such as personal security and senior care.
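The fusion and classification steps described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not state how frame-level features are pooled, how the two networks' features are merged, or which SVM kernel is used, so average pooling over frames, concatenation, and a linear one-vs-rest SVM are all assumptions here. The tiny hand-written vectors are placeholders for real ResNet50 and VGG16 outputs, and the function names are hypothetical.

```python
# Toy sketch: fuse two per-video descriptors and train a linear SVM.
# The vectors below are invented placeholders, NOT real ResNet50/VGG16 features.

def temporal_mean_pool(frame_features):
    """Average per-frame vectors into one video-level vector (assumed pooling rule)."""
    count = len(frame_features)
    return [sum(column) / count for column in zip(*frame_features)]


def fuse(features_a, features_b):
    """Merge two descriptors by concatenation (assumed fusion rule)."""
    return list(features_a) + list(features_b)


def train_linear_svm(samples, labels, classes, epochs=50, lr=0.1, reg=0.01):
    """One-vs-rest linear SVM fitted by subgradient descent on the hinge loss."""
    dim = len(samples[0])
    model = {}
    for cls in classes:
        weights, bias = [0.0] * dim, 0.0
        for _ in range(epochs):
            for x, label in zip(samples, labels):
                target = 1.0 if label == cls else -1.0
                score = sum(w * v for w, v in zip(weights, x)) + bias
                violated = target * score < 1.0
                # L2 shrinkage always applies; the hinge term only inside the margin.
                weights = [
                    w - lr * (reg * w - (target * v if violated else 0.0))
                    for w, v in zip(weights, x)
                ]
                if violated:
                    bias += lr * target
        model[cls] = (weights, bias)
    return model


def predict(model, x):
    """Return the class whose one-vs-rest score is highest."""
    def score(cls):
        weights, bias = model[cls]
        return sum(w * v for w, v in zip(weights, x)) + bias
    return max(model, key=score)


# Each entry: (placeholder "ResNet50" frame features,
#              placeholder "VGG16" frame features, activity label).
videos = [
    ([[1.0, 0.0], [0.8, 0.2]], [[1.0, 0.0], [1.0, 0.0]], "basketball"),
    ([[0.9, 0.1], [0.7, 0.1]], [[0.9, 0.1], [0.9, 0.1]], "basketball"),
    ([[0.0, 1.0], [0.2, 0.8]], [[0.0, 1.0], [0.0, 1.0]], "cycling"),
    ([[0.1, 0.9], [0.1, 0.7]], [[0.1, 0.9], [0.1, 0.9]], "cycling"),
]
fused = [fuse(temporal_mean_pool(a), temporal_mean_pool(b)) for a, b, _ in videos]
labels = [label for _, _, label in videos]
model = train_linear_svm(fused, labels, classes=["basketball", "cycling"])
print([predict(model, x) for x in fused])
```

In practice the placeholder vectors would be replaced by activations taken from the pre-trained networks, and the hand-rolled trainer by a library SVM; the sketch only shows how two feature streams can be combined into one descriptor before classification.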
Keywords
Convolutional neural network; Deep learning; Human action recognition; Multiclass classification; ResNet50; Support vector machine; Transfer learning
DOI: http://doi.org/10.11591/ijai.v13.i2.pp2102-2110
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).