Ensemble reverse knowledge distillation: training a robust model using weak models
Abstract
To ensure that artificial intelligence (AI) remains aligned with humans, AI models need to be developed and supervised by humans. Unfortunately, an AI model may come to exceed human capabilities; aligning such a model is commonly referred to as superalignment. This raises the question of whether humans can still supervise a model stronger than themselves, a problem encapsulated in the concept of weak-to-strong generalization. To address this issue, we introduce ensemble reverse knowledge distillation (ERKD), which leverages two weaker models to supervise a stronger, more robust model, offering a potential path for humans to manage superalignment. ERKD enables the stronger model to reach its best performance with the assistance of only the two weaker models. We trained a stronger EfficientNet model under the supervision of weaker convolutional neural network (CNN) models. With this method, the EfficientNet model outperformed the same model trained with the standard transfer learning (STL) method, and it also outperformed a model supervised by a single weaker model. Finally, ERKD-trained EfficientNet models can outperform EfficientNet models that are one or even two scaling levels stronger.
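The abstract does not spell out the exact ERKD objective, so the sketch below shows only one plausible formulation for illustration: two weaker CNN teachers produce softened predictions, their average serves as the distillation target for the stronger EfficientNet student, and this is blended with the ordinary cross-entropy loss. The names erkd_loss and make_weak_cnn, the temperature and alpha values, and the tiny stand-in teacher architecture are assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torchvision import models

    def erkd_loss(student_logits, teacher_logits_list, labels,
                  temperature=2.0, alpha=0.5):
        # Hypothetical ERKD loss: the soft target is the average of the
        # weaker teachers' softened distributions, blended with hard labels.
        teacher_probs = torch.stack(
            [F.softmax(t / temperature, dim=1) for t in teacher_logits_list]
        ).mean(dim=0)
        distill = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=1),
            teacher_probs,
            reduction="batchmean",
        ) * (temperature ** 2)
        ce = F.cross_entropy(student_logits, labels)
        return alpha * distill + (1.0 - alpha) * ce

    def make_weak_cnn(num_classes=10):
        # Tiny stand-in CNN used here only to represent a "weak" teacher.
        return nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, num_classes),
        )

    weak_teachers = [make_weak_cnn(), make_weak_cnn()]
    student = models.efficientnet_b0(num_classes=10)  # stronger student

    x = torch.randn(4, 3, 224, 224)          # dummy image batch
    y = torch.randint(0, 10, (4,))            # dummy labels
    with torch.no_grad():                     # teachers act as frozen supervisors
        teacher_logits = [t(x) for t in weak_teachers]
    loss = erkd_loss(student(x), teacher_logits, y)
    loss.backward()

In this sketch the two weak teachers supervise jointly by contributing to a single ensemble target; training a student with only one teacher, or with plain cross-entropy (as in STL), corresponds to the baselines the abstract compares against.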
Keywords
EfficientNet; Ensemble learning; Knowledge distillation; Transfer learning; Weak-to-strong
DOI: http://doi.org/10.11591/ijai.v14.i5.pp%25p
Copyright (c) 2025 Christopher Gavra Reswara, Tjeng Wawan Cenggoro
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938
This journal is published by the Institute of Advanced Engineering and Science (IAES).