Text To Face Using AttnGAN

Getting Started

  1. Overview
  2. Installations
  3. Training
  4. Run
  5. Validation
  6. Credits

PyTorch implementation for reproducing the Text to Face (T2F) using AttnGAN results from our paper Development and Deployment of a Generative Model-Based Framework for Text to Photorealistic Image Generation.

Dependencies

  • Python 3.6+
  • PyTorch 1.0+

In addition, please add the project folder to PYTHONPATH and pip install the following packages (a one-shot install command follows this list):

  • python-dateutil
  • easydict
  • pandas
  • torchfile
  • nltk
  • scikit-image
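
A minimal setup sketch, assuming the commands are run from the repository root and that pip targets the same Python 3.6+ environment as PyTorch:

```bash
# put the project folder on PYTHONPATH, then install the listed packages
export PYTHONPATH=$(pwd):$PYTHONPATH
pip install python-dateutil easydict pandas torchfile nltk scikit-image
```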

Data

  1. Download our preprocessed metadata for birds & CelebA and save them to data/
  2. Download the birds image data and extract it to data/birds/
  3. For faces, download the CelebA dataset and extract the images to data/face/ (the expected layout is sketched below)
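
With the steps above, the data directory should end up looking roughly like this (exact metadata file names depend on the downloads):

```
data/
├── <preprocessed metadata for birds & CelebA>
├── birds/   # bird images extracted here
└── face/    # CelebA images extracted here
```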
  • Pre-train DAMSM models:
    • For the bird dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0
    • For the face dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/face.yml --gpu 1
  • Train AttnGAN models:
    • For the birds dataset: python main.py --cfg cfg/bird_attn2.yml --gpu 2
    • For the CelebA dataset: python main.py --cfg cfg/face_attn2.yml --gpu 3
  • The *.yml files are example configuration files for training/evaluating our models; a sketch of how such a file can be loaded follows this list.
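
As a rough illustration of how a *.yml config can be consumed with the easydict dependency listed above (this assumes PyYAML is available; the actual keys are whatever the config file defines, not names taken from this repository):

```python
# Sketch of a config loader: yaml parses the file, EasyDict gives
# attribute-style access (cfg.SECTION.FIELD) to the nested dict.
import yaml
from easydict import EasyDict

def load_cfg(path):
    """Parse a YAML config file into a nested EasyDict."""
    with open(path) as f:
        return EasyDict(yaml.safe_load(f))

cfg = load_cfg("cfg/bird_attn2.yml")
print(list(cfg.keys()))  # inspect the top-level sections of the config
```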

Pretrained Model

Run (Sampling)

  • Run python main.py --cfg cfg/eval_bird.yml --gpu 1 to generate examples from captions in files listed in "./data/birds/example_filenames.txt". Results are saved to DAMSMencoders/.
  • Change the eval_*.yml files to generate images from other pre-trained models.
  • Input your own sentences in "./data/birds/example_captions.txt" if you want to generate images from customized sentences.
  • To generate images for all captions in the validation dataset, change B_VALIDATION to True in the eval_*.yml file, and then run python main.py --cfg cfg/eval_bird.yml --gpu 1.
  • We compute the FID score for models trained on CelebA using #; a minimal sketch of the metric itself follows this list.
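
For reference, FID compares Gaussians fitted to Inception-v3 features of the real and generated image sets. The sketch below is the standard definition of the metric, not this repository's exact evaluation script:

```python
# Frechet Inception Distance between N(mu1, sigma1) and N(mu2, sigma2),
# the Gaussians fitted to Inception activations of each image set:
# FID = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2*(sigma1 @ sigma2)^(1/2))
import numpy as np
from scipy import linalg

def fid(mu1, sigma1, mu2, sigma2):
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    covmean = covmean.real  # drop tiny imaginary parts from sqrtm
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```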

Examples generated by AttnGAN [Blog]

CelebA example

Creating an API

Evaluation code embedded into a callable containerized API is included in the eval/ folder.
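
A hypothetical client call is sketched below; the host, port, route, and payload shape are assumptions for illustration only, since the actual interface is defined by the code in eval/:

```python
# Hypothetical request against the containerized evaluation API.
# Endpoint and JSON schema are placeholders, not taken from this repo.
import requests

resp = requests.post(
    "http://localhost:8080/generate",  # assumed endpoint
    json={"caption": "a smiling young woman with long black hair"},
)
resp.raise_for_status()
with open("generated_face.png", "wb") as f:
    f.write(resp.content)  # assumes the API responds with raw image bytes
```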

Citing

If you find Text to Face (T2F) using AttnGAN useful in your research, please consider citing:

Text To Face using AttnGAN

@article{PANDE2021,
  title    = {Development and Deployment of a Generative Model-Based Framework for Text to Photorealistic Image Generation},
  journal  = {Neurocomputing},
  year     = {2021},
  issn     = {0925-2312},
  doi      = {10.1016/j.neucom.2021.08.055},
  url      = {https://www.sciencedirect.com/science/article/pii/S092523122101239X},
  author   = {Sharad Pande and Srishti Chouhan and Ritesh Sonavane and Rahee Walambe and George Ghinea and Ketan Kotecha},
  keywords = {text-to-image, text-to-face, face synthesis, GAN, AttnGAN},
  abstract = {The task of generating photorealistic images from their textual descriptions is quite challenging. Most existing tasks in this domain are focused on the generation of images such as flowers or birds from their textual description, especially for validating the generative models based on Generative Adversarial Network (GAN) variants and for recreational purposes. However, such work is limited in the domain of photorealistic face image generation and the results obtained have not been satisfactory. This is partly due to the absence of concrete data in this domain and the large number of highly specific features/attributes involved in face generation compared to birds or flowers. In this paper, we propose an Attention Generative Adversarial Network (AttnGAN) for fine-grained text-to-face generation that enables attention-driven multi-stage refinement by employing a Deep Attentional Multimodal Similarity Model (DAMSM). Through extensive experimentation on the CelebA dataset, we evaluated our approach using the Frechet Inception Distance (FID) score. The output files for the Face2Text Dataset are also compared with those of the T2F GitHub project. According to the visual comparison, AttnGAN generated higher-quality images than T2F. Additionally, we compare our methodology with existing approaches with a specific focus on the CelebA dataset and demonstrate that our approach achieves a better FID score, facilitating more realistic image generation. The application of such an approach can be found in criminal identification, where faces are generated from the textual description given by an eyewitness. Such a method can bring consistency and eliminate the individual biases of an artist drawing faces from the description given by the eyewitness. Finally, we discuss the deployment of the models on a Raspberry Pi to test how effective the models would be on a standalone device to facilitate portability and timely task completion.}
}

AttnGAN

@inproceedings{Tao18attngan,
  author    = {Tao Xu and Pengchuan Zhang and Qiuyuan Huang and Han Zhang and Zhe Gan and Xiaolei Huang and Xiaodong He},
  title     = {AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks},
  booktitle = {{CVPR}},
  year      = {2018}
}

