FaceSynth: text-to-face generation using CLIP and its variants with generative adversarial networks

Priyadharsini Ravisankar, Shruthi Dhanvanth, Vaishnave Jenane Padmanabhan

Abstract


In recent years, generative AI has advanced rapidly, especially in generative adversarial networks (GANs). GANs generate novel images that were not seen during training, and successive improvements include StyleGAN, StyleGAN2, and StyleGAN2 with adaptive discriminator augmentation (StyleGAN2-ADA). Contrastive language-image pre-training (CLIP), by OpenAI, is a vision-language model trained to associate text with images. Recently, new CLIP variants have been developed, such as metadata-curated language-image pre-training (MetaCLIP), released by Facebook and trained on a larger dataset, and Multilingual-CLIP, which adapts CLIP to multiple languages. We compare CLIP and its variants in text-to-face synthesis with a custom StyleGAN2-ADA model and a pre-trained StyleGAN2 model. Our training-free algorithm starts with an initial image latent code and iteratively manipulates it to match a given text description. It does so by minimizing the distance between the text and image embeddings in the multi-modal embedding space of the CLIP models. Our comparison showed that MetaCLIP outperformed the other models in learned perceptual image patch similarity (LPIPS) and in how closely the synthesized image matched the prompt. CLIP produced the most realistic images, with the best Fréchet inception distance (FID) score, while Multilingual-CLIP offered a choice of input text language and still generated acceptable images.
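The training-free latent optimization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance metric, optimizer, or step count, so cosine distance, finite-difference gradient descent with step halving, and all names below (GENERATOR, IMAGE_ENCODER, optimize_latent) are assumptions made for illustration. Small random linear maps stand in for the frozen StyleGAN2 generator and the frozen CLIP image encoder so that the example runs with the Python standard library alone.

```python
import math
import random

LATENT_DIM, IMAGE_DIM, EMBED_DIM = 8, 16, 4


def make_matrix(rows, cols, rng):
    """Random matrix used as a toy stand-in for a frozen network."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def cosine_distance(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


_rng = random.Random(0)
# Hypothetical stand-ins: NOT StyleGAN2 and NOT CLIP, just fixed linear maps.
GENERATOR = make_matrix(IMAGE_DIM, LATENT_DIM, _rng)
IMAGE_ENCODER = make_matrix(EMBED_DIM, IMAGE_DIM, _rng)


def loss(latent, text_embedding):
    """Distance between the text embedding and the generated image's embedding."""
    image = matvec(GENERATOR, latent)
    image_embedding = matvec(IMAGE_ENCODER, image)
    return cosine_distance(image_embedding, text_embedding)


def numerical_gradient(latent, text_embedding, eps=1e-5):
    """Central finite differences; a real system would use autograd instead."""
    grad = []
    for i in range(len(latent)):
        up, down = list(latent), list(latent)
        up[i] += eps
        down[i] -= eps
        grad.append((loss(up, text_embedding) - loss(down, text_embedding)) / (2 * eps))
    return grad


def optimize_latent(text_embedding, steps=100, lr=0.1, seed=1):
    """Iteratively update only the latent code; both 'networks' stay frozen."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    current = loss(latent, text_embedding)
    history = [current]
    for _ in range(steps):
        grad = numerical_gradient(latent, text_embedding)
        candidate = [z - lr * g for z, g in zip(latent, grad)]
        candidate_loss = loss(candidate, text_embedding)
        if candidate_loss < current:
            latent, current = candidate, candidate_loss  # accept the step
        else:
            lr *= 0.5  # reject the step and try a smaller one next time
        history.append(current)
    return latent, history


TEXT_EMBEDDING = [1.0, 0.0, 0.0, 0.0]  # hypothetical prompt embedding
final_latent, loss_history = optimize_latent(TEXT_EMBEDDING)
print(f"distance: {loss_history[0]:.3f} -> {loss_history[-1]:.3f}")
```

Only the latent code changes during the loop, which is what makes the approach training-free: swapping CLIP for MetaCLIP or Multilingual-CLIP would, under this sketch, amount to replacing the encoder that produces the text and image embeddings.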

Keywords


CLIP; Generative adversarial network; MetaCLIP; Multilingual-CLIP; StyleGAN; Text-to-face generation; Text-to-image generation

DOI: http://doi.org/10.11591/ijai.v14.i5.pp3588-3598


Copyright (c) 2025 Institute of Advanced Engineering and Science

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

IAES International Journal of Artificial Intelligence (IJ-AI)
ISSN/e-ISSN 2089-4872/2252-8938 
This journal is published by the Institute of Advanced Engineering and Science (IAES).
