Text To Face Using AttnGAN

Getting Started

  1. Overview
  2. Installations
  3. Training
  4. Run
  5. Validation
  6. Credits

PyTorch implementation for reproducing the Text to Face (T2F) using AttnGAN results from our paper Development and Deployment of a Generative Model-Based Framework for Text to Photorealistic Image Generation.

Dependencies

  • Python 3.6+
  • PyTorch 1.0+

In addition, please add the project folder to PYTHONPATH and pip install the following packages (a one-shot install command follows this list):

  • python-dateutil
  • easydict
  • pandas
  • torchfile
  • nltk
  • scikit-image
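
A minimal setup sketch, assuming the commands are run from the repository root and that pip targets the same Python 3.6+ environment as PyTorch:

```bash
# put the project folder on PYTHONPATH, then install the listed packages
export PYTHONPATH=$(pwd):$PYTHONPATH
pip install python-dateutil easydict pandas torchfile nltk scikit-image
```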

Data

  1. Download our preprocessed metadata for birds & CelebA and save them to data/
  2. Download the birds image data and extract it to data/birds/
  3. For faces, download the CelebA dataset and extract the images to data/face/ (the expected layout is sketched below)
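
With the steps above, the data directory should end up looking roughly like this (exact metadata file names depend on the downloads):

```
data/
├── <preprocessed metadata for birds & CelebA>
├── birds/   # bird images extracted here
└── face/    # CelebA images extracted here
```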
  • Pre-train DAMSM models:
    • For the bird dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/bird.yml --gpu 0
    • For the face dataset: python pretrain_DAMSM.py --cfg cfg/DAMSM/face.yml --gpu 1
  • Train AttnGAN models:
    • For the birds dataset: python main.py --cfg cfg/bird_attn2.yml --gpu 2
    • For the CelebA dataset: python main.py --cfg cfg/face_attn2.yml --gpu 3
  • The *.yml files are example configuration files for training/evaluating our models; a sketch of how such a file can be loaded follows this list.
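
As a rough illustration of how a *.yml config can be consumed with the easydict dependency listed above (this assumes PyYAML is available; the actual keys are whatever the config file defines, not names taken from this repository):

```python
# Sketch of a config loader: yaml parses the file, EasyDict gives
# attribute-style access (cfg.SECTION.FIELD) to the nested dict.
import yaml
from easydict import EasyDict

def load_cfg(path):
    """Parse a YAML config file into a nested EasyDict."""
    with open(path) as f:
        return EasyDict(yaml.safe_load(f))

cfg = load_cfg("cfg/bird_attn2.yml")
print(list(cfg.keys()))  # inspect the top-level sections of the config
```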

Pretrained Model

Run (Sampling)

  • Run python main.py --cfg cfg/eval_bird.yml --gpu 1 to generate examples from captions in files listed in "./data/birds/example_filenames.txt". Results are saved to DAMSMencoders/.
  • Change the eval_*.yml files to generate images from other pre-trained models.
  • Input your own sentences in "./data/birds/example_captions.txt" if you want to generate images from customized sentences.
  • To generate images for all captions in the validation dataset, change B_VALIDATION to True in the eval_*.yml file, and then run python main.py --cfg cfg/eval_bird.yml --gpu 1.
  • We compute the FID score for models trained on CelebA using #; a minimal sketch of the metric itself follows this list.
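
For reference, FID compares Gaussians fitted to Inception-v3 features of the real and generated image sets. The sketch below is the standard definition of the metric, not this repository's exact evaluation script:

```python
# Frechet Inception Distance between N(mu1, sigma1) and N(mu2, sigma2),
# the Gaussians fitted to Inception activations of each image set:
# FID = ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2*(sigma1 @ sigma2)^(1/2))
import numpy as np
from scipy import linalg

def fid(mu1, sigma1, mu2, sigma2):
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    covmean = covmean.real  # drop tiny imaginary parts from sqrtm
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```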

Examples generated by AttnGAN [Blog]

CelebA example

Creating an API

Evaluation code embedded into a callable containerized API is included in the eval/ folder.
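
A hypothetical client call is sketched below; the host, port, route, and payload shape are assumptions for illustration only, since the actual interface is defined by the code in eval/:

```python
# Hypothetical request against the containerized evaluation API.
# Endpoint and JSON schema are placeholders, not taken from this repo.
import requests

resp = requests.post(
    "http://localhost:8080/generate",  # assumed endpoint
    json={"caption": "a smiling young woman with long black hair"},
)
resp.raise_for_status()
with open("generated_face.png", "wb") as f:
    f.write(resp.content)  # assumes the API responds with raw image bytes
```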

Citing

If you find Text to Face (T2F) using AttnGAN useful in your research, please consider citing:

Text To Face using AttnGAN

@article{PANDE2021,
  title    = {Development and Deployment of a Generative Model-Based Framework for Text to Photorealistic Image Generation},
  journal  = {Neurocomputing},
  year     = {2021},
  issn     = {0925-2312},
  doi      = {10.1016/j.neucom.2021.08.055},
  url      = {https://www.sciencedirect.com/science/article/pii/S092523122101239X},
  author   = {Sharad Pande and Srishti Chouhan and Ritesh Sonavane and Rahee Walambe and George Ghinea and Ketan Kotecha},
  keywords = {text-to-image, text-to-face, face synthesis, GAN, AttnGAN},
  abstract = {The task of generating photorealistic images from their textual descriptions is quite challenging. Most existing tasks in this domain are focused on the generation of images such as flowers or birds from their textual description, especially for validating the generative models based on Generative Adversarial Network (GAN) variants and for recreational purposes. However, such work is limited in the domain of photorealistic face image generation and the results obtained have not been satisfactory. This is partly due to the absence of concrete data in this domain and the large number of highly specific features/attributes involved in face generation compared to birds or flowers. In this paper, we propose an Attention Generative Adversarial Network (AttnGAN) for fine-grained text-to-face generation that enables attention-driven multi-stage refinement by employing a Deep Attentional Multimodal Similarity Model (DAMSM). Through extensive experimentation on the CelebA dataset, we evaluated our approach using the Frechet Inception Distance (FID) score. The output files for the Face2Text Dataset are also compared with those of the T2F GitHub project. According to the visual comparison, AttnGAN generated higher-quality images than T2F. Additionally, we compare our methodology with existing approaches with a specific focus on the CelebA dataset and demonstrate that our approach achieves a better FID score, facilitating more realistic image generation. The application of such an approach can be found in criminal identification, where faces are generated from the textual description given by an eyewitness. Such a method can bring consistency and eliminate the individual biases of an artist drawing faces from the description given by the eyewitness. Finally, we discuss the deployment of the models on a Raspberry Pi to test how effective the models would be on a standalone device to facilitate portability and timely task completion.}
}

AttnGAN

@inproceedings{Tao18attngan,
  author    = {Tao Xu and Pengchuan Zhang and Qiuyuan Huang and Han Zhang and Zhe Gan and Xiaolei Huang and Xiaodong He},
  title     = {AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks},
  booktitle = {{CVPR}},
  year      = {2018}
}

