hyeonsieun/Text-to-Image_Generation

Text-to-Image_Generation

T2 Image Model

This project was set as the final project assignment for the 2023 2nd OUTTA AI Bootcamp, where I served as the overall leader. This bootcamp is operated by OUTTA, a non-profit AI education organization where I hold the position of president.

This project generates images from text input. The goal is to implement a simple text-to-image (T2I) generation model using conditional GANs.
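For reference, the standard conditional GAN objective (Mirza and Osindero, 2014) can be written as below, where $x$ is a real image, $c$ is the text condition (for example, a caption embedding), $z$ is a noise vector, $G$ is the generator, and $D$ is the discriminator. This is the textbook formulation only; the losses actually used in this repository may differ, so refer to the project guideline for the exact details.

```latex
\min_G \max_D \;
\mathbb{E}_{(x, c) \sim p_{\text{data}}} \big[ \log D(x \mid c) \big]
+ \mathbb{E}_{z \sim p_z,\; c \sim p_{\text{data}}} \big[ \log \big( 1 - D(G(z \mid c) \mid c) \big) \big]
```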

I created this project together with other OUTTA members, and we evaluated the submissions to select the top-performing teams.

If you're interested in undertaking this project yourself, you can download the skeleton code from here.

This repository contains the solution for the project.

For a more detailed explanation about this project, please refer to the uploaded '2023_final_project_guideline.pdf'.

To run this project, you need to modify the 'network.py' and 'train.py' files; changing the other files is not recommended.

A brief explanatory video about this project is available at the following link.

This project was designed primarily to run in the Google Colab environment.

MM-CelebA-HQ-Dataset

The dataset can be downloaded from here.

You can see the source of the dataset at the following link.

Command for data preprocessing:

python preproc_datasets_celeba_zip_train.py --source=./multimodal_celeba_hq.zip \
                                            --dest train_data_6cap.zip --width 256 --height 256 \
                                            --transform center-crop --emb_dim 512
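As a rough illustration of what a center crop means for the `--transform center-crop` option, the sketch below computes the largest centered square of an image. It is only a sketch under assumptions: the function name `center_crop_box` is hypothetical, and the actual logic inside `preproc_datasets_celeba_zip_train.py` may round or resize differently.

```python
from typing import Tuple


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square.

    Hypothetical helper for illustration; not taken from the repository.
    """
    side = min(width, height)          # the square side is the shorter edge
    left = (width - side) // 2         # equal margins on the left and right
    top = (height - side) // 2         # equal margins on the top and bottom
    return left, top, left + side, top + side


if __name__ == "__main__":
    # A 1024x768 image keeps its full height and loses 128 px on each side.
    print(center_crop_box(1024, 768))
```

The resulting square would then be resized to the requested `--width` and `--height` (256 x 256 in the command above), for example with Pillow's `Image.crop(box)` followed by `Image.resize((256, 256))`.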

The zip file ./multimodal_celeba_hq.zip is expected to have the following structure:

./multimodal_celeba_hq.zip
  ├── image
  │   ├── 0.jpg
  │   ├── 1.jpg
  │   ├── 2.jpg
  │   └── ...
  └── celea-caption
      ├── 0.txt
      ├── 1.txt
      ├── 2.txt
      └── ...
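Before preprocessing, it can help to confirm that every image has a matching caption file. The following is a minimal sketch, assuming the folder names shown in the tree above (`image` and `celea-caption`) and that images and captions share the same numeric file stem; the function name `find_unpaired` is hypothetical and not part of this repository.

```python
import zipfile
from pathlib import PurePosixPath


def find_unpaired(zip_path, image_dir="image", caption_dir="celea-caption"):
    """Return (images_without_caption, captions_without_image) as sorted stems.

    Hypothetical helper for illustration; folder names follow the tree above.
    """
    image_ids, caption_ids = set(), set()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            path = PurePosixPath(name)
            # Match on the parent folder name so an extra top-level
            # wrapper folder inside the zip is tolerated.
            if path.parent.name == image_dir and path.suffix == ".jpg":
                image_ids.add(path.stem)
            elif path.parent.name == caption_dir and path.suffix == ".txt":
                caption_ids.add(path.stem)
    return sorted(image_ids - caption_ids), sorted(caption_ids - image_ids)


if __name__ == "__main__":
    missing_captions, missing_images = find_unpaired("./multimodal_celeba_hq.zip")
    print("images without caption:", missing_captions[:10])
    print("captions without image:", missing_images[:10])
```

Two empty lists indicate that the archive matches the expected layout.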

Measure FID and IS

To measure the Fréchet Inception Distance (FID) and the Inception Score (IS), run the notebook 'Evaluate_FID_and_IS.ipynb'.
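For intuition, the Inception Score is defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is the class distribution an Inception classifier predicts for a generated image x, and p(y) is the average of those distributions over all generated images. The sketch below computes that formula from a list of already-predicted probability vectors. It is an illustration only: the function name `inception_score` is hypothetical, it omits the Inception network itself, and the notebook's implementation may differ (for example by averaging over several splits). FID is not shown because it additionally requires the mean and covariance of Inception features.

```python
import math
from typing import Sequence


def inception_score(probs: Sequence[Sequence[float]]) -> float:
    """Compute exp(mean KL(p(y|x) || p(y))) from per-image class probabilities.

    Hypothetical helper for illustration; each row of `probs` must sum to 1.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y): the average prediction over all images.
    marginal = [sum(p[k] for p in probs) / n for k in range(num_classes)]
    total_kl = 0.0
    for p in probs:
        for k in range(num_classes):
            # Terms with p[k] == 0 contribute nothing to the KL divergence.
            if p[k] > 0:
                total_kl += p[k] * (math.log(p[k]) - math.log(marginal[k]))
    return math.exp(total_kl / n)


if __name__ == "__main__":
    confident_and_diverse = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    print(inception_score(confident_and_diverse))
```

The score ranges from 1 (every image gets the same prediction) up to the number of classes (confident predictions spread evenly over all classes), and higher is better.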

Reference

This implementation is based on LAFITE, StackGAN++, and AttnGAN.