Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models to Learn Any Unseen Style

This repository is the official implementation of Specialist Diffusion (CVPR 2023).

Haoming Lu, Hazarapet Tunanyan, Kai Wang, Shant Navasardyan, Zhangyang Wang, Humphrey Shi

Paper | Project

We present Specialist Diffusion, a style-specific personalized text-to-image model. It is plug-and-play with existing diffusion models and other personalization techniques. It outperforms the latest few-shot personalization alternatives for diffusion models, such as Textual Inversion and DreamBooth, at learning highly sophisticated styles with ultra-sample-efficient tuning.

Set up the environment

First, install prerequisites with:

conda env create -f environment.yml
conda activate sd

Then, set up the configuration for accelerate with:

accelerate config
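
Before training, it can help to confirm that the environment is active and that accelerate is reachable. The snippet below is a minimal sketch and not part of this repository: the helper name check_environment and the requirements it probes are assumptions (the actual dependencies are defined in environment.yml).

```python
import importlib.util
import shutil


def check_environment(packages=("accelerate",), executables=("accelerate",)):
    """Return a dict mapping each requirement to True (found) or False (missing).

    The default package and executable names are illustrative assumptions;
    adjust them to match what environment.yml actually installs.
    """
    status = {}
    for name in packages:
        # find_spec returns None when a top-level package is not importable.
        status[f"package:{name}"] = importlib.util.find_spec(name) is not None
    for exe in executables:
        # shutil.which returns None when the executable is not on PATH.
        status[f"executable:{exe}"] = shutil.which(exe) is not None
    return status


if __name__ == "__main__":
    for requirement, found in check_environment().items():
        print(f"{requirement}: {'ok' if found else 'MISSING'}")
```

If anything is reported as MISSING, re-run conda activate sd before continuing.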

Train a model

An example call:

accelerate launch train.py --config='configs/train_default.json'
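
Since training is driven by a JSON file, one way to try a different setting is to write a modified copy of the default config and pass that to --config. The following is a minimal sketch under stated assumptions: write_config_variant is a hypothetical helper, not part of this repository, and the "seed" key in the usage comment is only a placeholder, because the actual keys in configs/train_default.json are not listed here.

```python
import json
from pathlib import Path


def write_config_variant(base_path, overrides, out_path):
    """Copy a JSON config, apply top-level overrides, and save it to out_path.

    Only top-level keys are replaced; nested dictionaries are not merged.
    Returns the resulting config as a dict.
    """
    config = json.loads(Path(base_path).read_text())
    config.update(overrides)
    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(config, indent=2))
    return config


# Example usage (the "seed" key is a placeholder, not a documented option):
# write_config_variant("configs/train_default.json",
#                      {"seed": 0},
#                      "configs/train_variant.json")
```

The variant can then be used in the same way as the default, for example with accelerate launch train.py --config='configs/train_variant.json'.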

Evaluate a model

An example call:

accelerate launch eval.py --config='configs/eval_default.json'
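
The training and evaluation calls can also be chained from a single script. This is a rough sketch, not something shipped with the repository: launch_command and train_then_eval are hypothetical helper names, and whether eval_default.json already points at the checkpoint produced by training is not stated here, so check the config before relying on it.

```python
import subprocess


def launch_command(script, config_path):
    """Build the accelerate launch command shown in this README as an argument list."""
    return ["accelerate", "launch", script, f"--config={config_path}"]


def train_then_eval(train_config="configs/train_default.json",
                    eval_config="configs/eval_default.json",
                    dry_run=False):
    """Run train.py and then eval.py; with dry_run=True only print the commands."""
    commands = [
        launch_command("train.py", train_config),
        launch_command("eval.py", eval_config),
    ]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if a step fails,
            # so evaluation is skipped when training does not finish.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    train_then_eval(dry_run=True)
```

Running with dry_run=True first is a cheap way to confirm the paths before starting a long job.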

Plug-and-Play

Combination of our model and Textual Inversion. The text prompts used for generation are listed at the top, the styles of the respective datasets at the bottom, and the methods used to train the models on the left. By integrating Textual Inversion with our model, the results capture even richer details without losing the style.

BibTeX

If you use our work in your research, please cite our publication:

@InProceedings{Lu_2023_CVPR,
    author    = {Lu, Haoming and Tunanyan, Hazarapet and Wang, Kai and Navasardyan, Shant and Wang, Zhangyang and Shi, Humphrey},
    title     = {Specialist Diffusion: Plug-and-Play Sample-Efficient Fine-Tuning of Text-to-Image Diffusion Models To Learn Any Unseen Style},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {14267-14276}
}
