
Solo-Synth-GAN (Zero Shot Image-to-Video) - Train only on one image


Documentation | Paper (Updated Soon) | Code of Conduct | Developer's Guide

Solo-Synth GAN: Train only on one image

(Demo: a single input image and the lightning animation video generated from it.)

This is the open-source repository for the project Solo-Synth GAN, one of the latest zero-shot generative adversarial network techniques for image-to-video generation, trained on only a SINGLE IMAGE. For a concise summary of our research, please refer to our paper (link to be added once the paper is published).

Table of Contents

1. Overview
2. Repo Structure
3. Model Architecture
4. Image-2-Video Zero Shot
5. Parameter Tuning with Examples
6. Tested Environments
7. Contributing to Solo-Synth-GAN
8. Additional Data
9. Acknowledgements

1. Overview

In this work, we propose and examine a new technique for training Generative Adversarial Networks (GANs) on a single image and generating a video from that image alone. The GAN is not trained on ANY OTHER IMAGE.

Our approach involves training the model iteratively on different resolutions of the original image, gradually increasing the resolution as training progresses. As the resolution increases, we augment the generator's capacity by adding additional convolutional layers. At each stage, only the most recently added convolutional layers are trained with a higher learning rate, while previously added layers are trained with a smaller learning rate.
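To make that schedule concrete, here is a minimal, self-contained sketch of such a progressive training loop. It illustrates the idea only and is not the repository's actual code: the layer widths, learning rates, and placeholder loss are all assumptions.

```python
# Illustrative sketch of the progressive training loop described above.
# NOT the repository's actual code: layer widths, learning rates, and the
# placeholder loss are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

BASE_LR, LR_SCALE, STAGES = 5e-4, 0.1, 3  # hypothetical defaults

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.stages = nn.ModuleList()

    def grow(self):
        # Each new stage adds convolutional capacity for the next resolution.
        self.stages.append(nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 3, 3, padding=1)))

    def forward(self, z):
        x = z
        for i, stage in enumerate(self.stages):
            if i > 0:  # move up one resolution before the next stage
                x = F.interpolate(x, scale_factor=2, mode="bilinear",
                                  align_corners=False)
            x = x + stage(x)  # residual refinement at this scale
        return x

g = Generator()
z = torch.randn(1, 3, 4, 4)  # fixed coarse input; output grows with stages
for s in range(STAGES):
    g.grow()
    # The newest stage trains at the full learning rate; earlier stages are
    # only gently fine-tuned at a scaled-down rate.
    groups = [{"params": st.parameters(),
               "lr": BASE_LR if i == len(g.stages) - 1 else BASE_LR * LR_SCALE}
              for i, st in enumerate(g.stages)]
    opt = torch.optim.Adam(groups)
    loss = g(z).mean()  # placeholder for the real adversarial loss
    opt.zero_grad(); loss.backward(); opt.step()
```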

2. Repo Structure

📦home 
├─ .github
│  └─ workflows
│     └─ python-app.yml
├─ .gitignore
├─ Examples
│  └─ marinabaysands copy.jpg
├─ Images
│  └─ marinabaysands.jpg
├─ LICENSE
├─ README.md
├─ Results
│  └─ results.py
├─ Solo-Synth-GAN
│  └─ functions.py
├─ Trained_Dataset
│  └─ saved_timelines.py
└─ main.py

2.1 Bill of Materials & Licenses

Software Bill of Materials (SBOM)

This table provides a list of licenses associated with the dependencies used in this project. The SBOM is included in the ssgan.bom file. Each entry in the table corresponds to a dependency used in the project, along with its associated license.

| License No | License Name |
| ---------- | ------------ |
| 0 | MPL-2.0 |
| 1 | mpmath |
| 2 | pandas |
| 3 | Apache-2.0 |
| 4 | torchvision |
| 5 | opencv-python |
| 6 | scipy |
| 7 | sympy |
| 8 | types-python-dateutil |
| 9 | requests |
| 10 | chardet |
| 11 | kiwisolver |
| 12 | MIT |
| 13 | Unlicense |
| 14 | BSD-3-Clause |
| 15 | sortedcontainers |
| 16 | GPL-3.0-or-later |
| 17 | HPND |
| 18 | fqdn |
| 19 | cycler |
| 20 | fsspec |
| 21 | AGPL-3.0 |
| 22 | jsonpointer |
| 23 | torch |
| 24 | Python-2.0 |
| 25 | python-dateutil |
| 26 | BSD-2-Clause |
| 27 | numpy |
| 28 | ISC |
| 29 | contourpy |

3. Model Architecture

First, clone the repository and enter the main folder so that you can install the dependencies:

git clone https://github.com/PrateekJannu/Solo-Synth-GAN-v1.0

cd Solo-Synth-GAN-v1.0

Our model architecture is based on PyTorch and is compatible with Python 3.8, 3.9, and 3.10. For installation, please run:

pip install -r requirements.txt

3.1 Model Architecture on GPU (NVIDIA A100, V100, NCasV3/6, TPU)

Our model architecture needs the CUDA toolkit to run. To install the CUDA-enabled PyTorch build, please run:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
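Before training, it is worth verifying that PyTorch actually sees the GPU. A quick check using standard PyTorch calls (nothing repo-specific):

```python
# Sanity check that PyTorch can see a CUDA device before training.
import torch

if torch.cuda.is_available():
    print("CUDA OK:", torch.cuda.get_device_name(0))
else:
    print("No CUDA device found; training requires a GPU.")
```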

3.2 Unconditional Generation (a GPU is required for training and generation)

To train a model with the default parameters from our paper (to be published), execute:

python main.py --gpu 0 --train_mode generation --input_name Images/marinabaysands.jpg

Training a single model typically takes about 8 minutes on an NVIDIA® V100 Tensor Core GPU.

(Figure: the original image alongside generated samples.)

3.3 Customization Options

To modify learning rate scaling or the number of trained stages, you can adjust parameters as follows:

python main.py --gpu 0 --train_mode generation --input_name Images/colusseum.png --lr_scale 0.5

or

python main.py --gpu 0 --train_mode generation --input_name Images/colusseum.png --train_stages 7
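Both flags feed the stage-wise schedule described in the overview. As a purely hypothetical illustration of what --lr_scale 0.5 means for the per-stage learning rates (the repository's exact rule may differ):

```python
# Purely hypothetical illustration of --lr_scale 0.5: earlier stages get a
# damped learning rate; only the newest stage trains at the full base rate.
base_lr, lr_scale, train_stages = 5e-4, 0.5, 7
for stage in range(train_stages):
    lr = base_lr if stage == train_stages - 1 else base_lr * lr_scale
    print(f"stage {stage}: lr = {lr:.1e}")
```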

3.4 Unconditional Generation (Arbitrary Sizes)

For generating images of arbitrary sizes, use the following command:

python main.py --gpu 0 --train_mode retarget --input_name Images/colusseum.png

4. Image-2-Video Zero Shot

To train an image-to-video model from a single image, use the commands below. First, train the GAN on just one image, specifically for video generation; here we use Images/balloons.png from our directory:

python main.py --gpu 0 --train_mode animation --input_name Images/balloons.png

After training on the single image, generate videos from the model by running the evaluation script, replacing the ... below with the directory name of the trained model. Enjoy!

python evaluate_model.py --gpu 0 --model_dir TrainedModels/balloons/...
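If you want to post-process the generated frames yourself, a minimal sketch using opencv-python (already in the requirements) is shown below. The directory and frame-naming pattern are assumptions for illustration, not the repository's documented output layout:

```python
# Hypothetical post-processing: stitch generated frames into an .mp4 with
# opencv-python. The directory and frame naming below are assumptions,
# not the repository's documented output layout.
import glob
import cv2

frames = sorted(glob.glob("TrainedModels/balloons/gen_samples/frame_*.png"))
height, width = cv2.imread(frames[0]).shape[:2]
writer = cv2.VideoWriter("balloons.mp4",
                         cv2.VideoWriter_fourcc(*"mp4v"),
                         10,                 # frames per second (assumed)
                         (width, height))
for path in frames:
    writer.write(cv2.imread(path))
writer.release()
```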
(Figure: the original image alongside the generated lightning animation videos.)

5. Parameter Tuning with Examples

5.1 Parameter Descriptions

The following table provides a comprehensive overview of the parameters used in the project. They configure and control various aspects of the training process.

By understanding the purpose and usage of each parameter, you can tailor your training settings and experiment with different configurations to achieve the desired outcomes.


| Parameter | Description | Example |
| --------- | ----------- | ------- |
| --input_name | Input image name for training | --input_name input_image.png |
| --naive_img | Naive input image (harmonization or editing) | --naive_img naive_image.png |
| --gpu | Which GPU to use | --gpu 0 |
| --train_mode | Mode of training | --train_mode generation |
| --lr_scale | Scaling of learning rate for lower stages | --lr_scale 0.12 |
| --train_stages | Number of stages to use for training | --train_stages 9 |
| --fine_tune | Whether to fine-tune on a given image | --fine_tune |
| --model_dir | Model to be used for fine-tuning (harmonization or editing) | --model_dir model_directory |
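For reference, these flags map naturally onto an argparse interface. The sketch below simply mirrors the table; the types, defaults, and choices are assumptions rather than the repository's actual definitions:

```python
# Hypothetical argparse mirror of the table above; the types, defaults,
# and choices are assumptions, not taken from the repository's source.
import argparse

p = argparse.ArgumentParser(description="Solo-Synth-GAN training options")
p.add_argument("--input_name", required=True,
               help="input image name for training")
p.add_argument("--naive_img",
               help="naive input image (harmonization or editing)")
p.add_argument("--gpu", type=int, default=0, help="which GPU to use")
p.add_argument("--train_mode", default="generation",
               choices=["generation", "retarget", "animation"],
               help="mode of training")
p.add_argument("--lr_scale", type=float, default=0.1,
               help="scaling of learning rate for lower stages")
p.add_argument("--train_stages", type=int, default=6,
               help="number of stages to use for training")
p.add_argument("--fine_tune", action="store_true",
               help="whether to fine-tune on a given image")
p.add_argument("--model_dir",
               help="model to be used for fine-tuning")

args = p.parse_args(["--input_name", "Images/marinabaysands.jpg"])
print(args.train_mode, args.train_stages)
```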

6. Tested Environments

  • Jupyter Notebook
  • Google Colab
  • Kaggle Code
  • Jupyter Lab
  • Jupyter Lite
  • Databricks Notebook (Since version 0.1.4a0)
  • Jupyter Extension for Visual Studio Code (Since version 0.1.4a0)
  • Most web applications compatible with IPython kernels. (Since version 0.1.4a0)
  • DataCamp Workspace (Since version 0.1.4a0)
  • ...feel free to raise an issue for more environments.

7. Contributing to Solo-Synth-GAN

The Solo-Synth-GAN project welcomes, and depends on, contributions from developers and users in the open source community. Please see the Contributing Guide for information on how you can help.

8. Additional Data (Coming Soon!)

The User-Studies folder contains raw images used for conducting user studies.

9. Acknowledgements


For more details, please refer to the paper (to be published soon), and feel free to reach out with any questions or feedback!
