
HPML

Distributed training of GANs

[slowed_down_looped_once.gif - training progress animation]

OVERVIEW:

We implement different training strategies for a Generator that produces fake abstract-art images.
We profile the different training methods to identify the bottlenecks in our training pipeline.
We then apply training strategies that address these bottlenecks, improving training speed while maintaining the quality of our GANs.
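To make the bottleneck analysis concrete, below is a minimal sketch of how a few training steps might be profiled with torch.profiler, assuming a PyTorch loop like the one in dcgan.py; the train_step helper and dataloader names are illustrative, not the repo's actual ones.

```python
import torch
from torch.profiler import profile, ProfilerActivity

def profile_training(train_step, dataloader, device="cuda", num_steps=10):
    """Run a handful of training steps under the profiler and print a summary."""
    activities = [ProfilerActivity.CPU]
    if device == "cuda" and torch.cuda.is_available():
        activities.append(ProfilerActivity.CUDA)
    with profile(activities=activities, record_shapes=True) as prof:
        for i, batch in enumerate(dataloader):
            train_step(batch)  # one generator/discriminator update (hypothetical helper)
            if i + 1 >= num_steps:
                break
    # Sort by total device time to see which ops dominate the step.
    sort_key = "cuda_time_total" if ProfilerActivity.CUDA in activities else "cpu_time_total"
    print(prof.key_averages().table(sort_by=sort_key, row_limit=10))
```

A summary table dominated by data-loading ops points to an input-pipeline bottleneck, while one dominated by convolution kernels points to compute.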

ARCHITECTURE:

[Architecture diagram]
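Since the main script is dcgan.py, the Generator presumably follows the standard DCGAN transposed-convolution stack; the sketch below shows that textbook architecture with the paper's usual sizes (nz=100, ngf=64, nc=3), which may differ from the exact values used in this repo.

```python
import torch.nn as nn

class Generator(nn.Module):
    """DCGAN-style generator: latent vector (N, nz, 1, 1) -> RGB image (N, nc, 64, 64)."""

    def __init__(self, nz=100, ngf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # latent vector z -> 4x4 feature map
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # 4x4 -> 8x8
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # 8x8 -> 16x16
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # 16x16 -> 32x32
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # 32x32 -> 64x64 RGB image scaled to [-1, 1]
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)
```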

REPOSITORY:

GDLoss.png - Plot of the Generator and Discriminator losses of the GAN (see the plotting sketch after this list).

animation.gif - GIF showing the progression of generated images over 200 epochs.

cpu.out - Output file produced by running cpu.sbatch.

cpu.sbatch - Sbatch file to run training on CPU.

dcgan.py - Main Python file containing the training logic; submitted via the sbatch scripts.

rtx.out - Output file produced by running rtx.sbatch.

rtx.sbatch - Sbatch file to run training on an RTX 8000 GPU.

slowed_down_looped_once.gif - animation.gif slowed down and looped once for easier viewing.

v100.out - Output file produced by running v100.sbatch.

v100.sbatch - Sbatch file to run training on a V100 GPU.

v100_GDLoss.png - Plot of the Generator and Discriminator losses of the GAN trained on the V100 GPU.

v100animation.gif - GIF showing the progression of generated images over 200 epochs on the V100 GPU.
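GDLoss.png and v100_GDLoss.png suggest the generator and discriminator losses are collected during training and then plotted; below is a minimal sketch of how such a figure might be produced with matplotlib, assuming two lists of per-iteration losses (the function and argument names are hypothetical).

```python
import matplotlib.pyplot as plt

def plot_gd_loss(g_losses, d_losses, out_path="GDLoss.png"):
    """Plot generator and discriminator loss curves and save the figure."""
    plt.figure(figsize=(10, 5))
    plt.title("Generator and Discriminator Loss During Training")
    plt.plot(g_losses, label="G")
    plt.plot(d_losses, label="D")
    plt.xlabel("Iterations")
    plt.ylabel("Loss")
    plt.legend()
    plt.savefig(out_path)
```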

HOW TO RUN:

1. Run the following command for GPU - sbatch rtx.sbatch (RTX 8000) or sbatch v100.sbatch (V100)

2. Run the following command for CPU - python dcgan.py --device cpu (or sbatch cpu.sbatch)
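The CPU command implies dcgan.py exposes a --device flag; below is a minimal sketch of how that selection might be wired up with argparse (everything beyond --device itself is illustrative, not the repo's exact code).

```python
import argparse
import torch

parser = argparse.ArgumentParser(description="DCGAN training")
parser.add_argument("--device", default="cuda", choices=["cuda", "cpu"],
                    help="device to train on")
args = parser.parse_args()

device = torch.device(args.device)
# Fall back to CPU if CUDA was requested but is not available.
if device.type == "cuda" and not torch.cuda.is_available():
    device = torch.device("cpu")

# Models and tensors are then moved to `device`, e.g. netG = Generator().to(device)
```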

RESULTS:

[Results image]
