
HPML

Distributed training of GANs

[slowed_down_looped_once.gif - training progress animation]

OVERVIEW:

We implement different training strategies for a Generator that produces fake abstract-art images.
We profile the different training methods to identify the bottlenecks in our training pipeline.
We then apply training strategies that address these bottlenecks, improving training speed while maintaining the quality of our GANs.
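To make the bottleneck analysis concrete, below is a minimal sketch of how a few training steps might be profiled with torch.profiler, assuming a PyTorch loop like the one in dcgan.py; the train_step helper and dataloader names are illustrative, not the repo's actual ones.

```python
import torch
from torch.profiler import profile, ProfilerActivity

def profile_training(train_step, dataloader, device="cuda", num_steps=10):
    """Run a handful of training steps under the profiler and print a summary."""
    activities = [ProfilerActivity.CPU]
    if device == "cuda" and torch.cuda.is_available():
        activities.append(ProfilerActivity.CUDA)
    with profile(activities=activities, record_shapes=True) as prof:
        for i, batch in enumerate(dataloader):
            train_step(batch)  # one generator/discriminator update (hypothetical helper)
            if i + 1 >= num_steps:
                break
    # Sort by total device time to see which ops dominate the step.
    sort_key = "cuda_time_total" if ProfilerActivity.CUDA in activities else "cpu_time_total"
    print(prof.key_averages().table(sort_by=sort_key, row_limit=10))
```

A summary table dominated by data-loading ops points to an input-pipeline bottleneck, while one dominated by convolution kernels points to compute.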

ARCHITECTURE:

[Architecture diagram]
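Since the main script is dcgan.py, the Generator presumably follows the standard DCGAN transposed-convolution stack; the sketch below shows that textbook architecture with the paper's usual sizes (nz=100, ngf=64, nc=3), which may differ from the exact values used in this repo.

```python
import torch.nn as nn

class Generator(nn.Module):
    """DCGAN-style generator: latent vector (N, nz, 1, 1) -> RGB image (N, nc, 64, 64)."""

    def __init__(self, nz=100, ngf=64, nc=3):
        super().__init__()
        self.main = nn.Sequential(
            # latent vector z -> 4x4 feature map
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # 4x4 -> 8x8
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # 8x8 -> 16x16
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # 16x16 -> 32x32
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # 32x32 -> 64x64 RGB image scaled to [-1, 1]
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),
            nn.Tanh(),
        )

    def forward(self, z):
        return self.main(z)
```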

REPOSITORY:

GDLoss.png - Plot of the Generator and Discriminator losses of the GAN (see the plotting sketch after this list).

animation.gif - GIF showing the progression of generated images over 200 epochs.

cpu.out - Output file produced by running cpu.sbatch.

cpu.sbatch - Sbatch file to run training on CPU.

dcgan.py - Main Python file containing the training logic; submitted via the sbatch scripts.

rtx.out - Output file produced by running rtx.sbatch.

rtx.sbatch - Sbatch file to run training on an RTX 8000 GPU.

slowed_down_looped_once.gif - animation.gif slowed down and looped once for easier viewing.

v100.out - Output file produced by running v100.sbatch.

v100.sbatch - Sbatch file to run training on a V100 GPU.

v100_GDLoss.png - Plot of the Generator and Discriminator losses of the GAN trained on the V100 GPU.

v100animation.gif - GIF showing the progression of generated images over 200 epochs on the V100 GPU.
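GDLoss.png and v100_GDLoss.png suggest the generator and discriminator losses are collected during training and then plotted; below is a minimal sketch of how such a figure might be produced with matplotlib, assuming two lists of per-iteration losses (the function and argument names are hypothetical).

```python
import matplotlib.pyplot as plt

def plot_gd_loss(g_losses, d_losses, out_path="GDLoss.png"):
    """Plot generator and discriminator loss curves and save the figure."""
    plt.figure(figsize=(10, 5))
    plt.title("Generator and Discriminator Loss During Training")
    plt.plot(g_losses, label="G")
    plt.plot(d_losses, label="D")
    plt.xlabel("Iterations")
    plt.ylabel("Loss")
    plt.legend()
    plt.savefig(out_path)
```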

HOW TO RUN:

1. Run the following command for GPU - sbatch rtx.sbatch (RTX 8000) or sbatch v100.sbatch (V100)

2. Run the following command for CPU - python dcgan.py --device cpu (or sbatch cpu.sbatch)
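The CPU command implies dcgan.py exposes a --device flag; below is a minimal sketch of how that selection might be wired up with argparse (everything beyond --device itself is illustrative, not the repo's exact code).

```python
import argparse
import torch

parser = argparse.ArgumentParser(description="DCGAN training")
parser.add_argument("--device", default="cuda", choices=["cuda", "cpu"],
                    help="device to train on")
args = parser.parse_args()

device = torch.device(args.device)
# Fall back to CPU if CUDA was requested but is not available.
if device.type == "cuda" and not torch.cuda.is_available():
    device = torch.device("cpu")

# Models and tensors are then moved to `device`, e.g. netG = Generator().to(device)
```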

RESULTS:

[Results image]
