Learning assistance module based on a small language model
Abstract
This paper presents the development of a low-cost learning assistant embedded in an NVIDIA Jetson Xavier board that combines speech and gesture recognition with a language model for fully offline operation. Using the Phi-3 Mini (3.8B) large language model (LLM) and the Whisper (base) model for automatic speech recognition, a compact and efficient learning assistant is obtained that provides a general set of answers on a given topic. The system achieved an average processing time of 0.108 seconds per character, a speech transcription efficiency of 94.75%, average scores of 9.5/10 for accuracy and 8.5/10 for consistency of the generated responses, and full recognition of the hand-raising gesture when held for at least 2 seconds, even without fully extended fingers. The prototype is built around a graphical interface capable of responding to voice commands and generating dynamic interactions from the user's detected gestures, representing a significant advance toward comprehensive and accessible human-machine interface solutions.
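To illustrate the voice-to-answer pipeline the abstract describes, the following is a minimal sketch, assuming the openai-whisper package for transcription and llama-cpp-python with a quantized Phi-3 Mini GGUF file for offline generation; the model file name, quantization, and prompt format are illustrative assumptions, not details taken from the paper.

# Minimal sketch of the offline voice-to-answer pipeline (assumptions noted above).
import whisper
from llama_cpp import Llama

# Load the Whisper "base" model named in the abstract.
asr = whisper.load_model("base")

# Load a local Phi-3 Mini (3.8B); path and quantization are hypothetical.
llm = Llama(model_path="phi-3-mini-4k-instruct-q4.gguf", n_ctx=2048)

def answer_from_audio(wav_path: str) -> str:
    """Transcribe a recorded question and generate an answer fully offline."""
    question = asr.transcribe(wav_path)["text"].strip()
    # Phi-3 instruct chat format.
    prompt = f"<|user|>\n{question}<|end|>\n<|assistant|>\n"
    out = llm(prompt, max_tokens=256, stop=["<|end|>"])
    return out["choices"][0]["text"].strip()

print(answer_from_audio("question.wav"))

The 2-second hand-raise check could likewise be sketched as below; the paper does not name a vision library, so MediaPipe Hands is assumed here purely for illustration, with the wrist position standing in for whatever raised-hand criterion the authors actually use.

# Hypothetical hand-raise detector with the 2-second hold from the abstract.
import time
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1)
raised_since = None  # timestamp when the raised hand was first seen
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    raised = False
    if result.multi_hand_landmarks:
        lm = result.multi_hand_landmarks[0].landmark
        # Wrist above the frame midline counts as "raised" (image y grows
        # downward), so detection works without fully extended fingers.
        raised = lm[0].y < 0.5
    if raised:
        raised_since = raised_since or time.time()
        if time.time() - raised_since >= 2.0:  # 2-second threshold
            print("Hand-raise confirmed")
            raised_since = None
    else:
        raised_since = None

Gating the gesture on a sustained 2-second hold, as the reported results suggest, filters out transient arm movements that a single-frame check would misread as intentional commands.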
Keywords
Artificial intelligence; Deep learning; Embedded system; Large language model; Learning assistant; Small language model
Full Text: PDF
DOI: http://doi.org/10.11591/ijai.v14.i5.pp4202-4210
Copyright (c) 2025 Marco Antonio Jinete, Robinson Jiménez-Moreno, Anny Astrid Espitia-Cubillos
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES).