
RuntimeError: Input type (c10::Half) and bias type (float) should be the same #7

Open
densechen opened this issue Aug 18, 2023 · 8 comments


@densechen

It works fine when we generate images for the first time. However, the following error is raised when we generate images a second time.

Traceback (most recent call last):
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/gradio/routes.py", line 321, in run_predict
    output = await app.blocks.process_api(
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/gradio/blocks.py", line 1006, in process_api
    result = await self.call_function(fn_index, inputs, iterator, request)
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/gradio/blocks.py", line 847, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "app.py", line 123, in infer
    images = pipe_refiner(prompt=prompt, negative_prompt=negative, image=images, num_inference_steps=steps, strength=refiner_strength, generator=g).images
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_img2img.py", line 998, in __call__
    image = self.vae.decode(latents / self.vae.config.scaling_factor, return_dict=False)[0]
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
    return method(self, *args, **kwargs)
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/diffusers/models/autoencoder_kl.py", line 270, in decode
    decoded = self._decode(z).sample
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/diffusers/models/autoencoder_kl.py", line 256, in _decode
    z = self.post_quant_conv(z)
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/docker/software/anaconda3/envs/r3d/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (c10::Half) and bias type (float) should be the same
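For context, the error is a plain dtype mismatch inside the conv/linear kernels: the latents arrive in half precision while the `post_quant_conv` parameters are still float32. A minimal reproduction with a stand-in fp32 layer (`layer` below is hypothetical, not the pipeline's module; assumes a recent PyTorch):

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 4)                      # parameters default to float32
x = torch.randn(2, 4, dtype=torch.float16)   # half-precision input, like the latents

try:
    layer(x)                                 # fp16 input vs fp32 weight/bias
except RuntimeError as e:
    print("dtype mismatch:", e)
```

PyTorch does not silently upcast here; either the input or the parameters must be converted so both sides agree.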
@TonyLianLong
Owner

What is your setting? It works on my end.

To create a public link, set `share=True` in `launch()`.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:30<00:00,  1.62it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:13<00:00,  1.15it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:23<00:00,  2.10it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:08<00:00,  1.85it/s]

@TonyLianLong
Owner

Seems that the decoder VAE on the refiner is somehow not fp16. Did you change any config? You can also disable refiner to see if it still happens.
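One quick way to confirm this is to inspect the parameter dtypes of the suspect component. A minimal sketch with a stand-in module (with the real pipeline you would inspect `pipe_refiner.vae` instead; `vae` below is hypothetical):

```python
import torch
import torch.nn as nn

# Stand-in for the refiner's VAE; nn.Conv2d parameters default to float32.
vae = nn.Conv2d(4, 4, 1)

dtypes = {p.dtype for p in vae.parameters()}
print(dtypes)  # {torch.float32} -> the component is not fp16
```

If the set contains `torch.float32`, that component will clash with fp16 latents exactly as in the traceback above.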

@densechen
Author

densechen commented Aug 18, 2023 via email

@TonyLianLong
Owner

Seems like it still works if the models are not offloaded.

$ OFFLOAD_BASE=false OFFLOAD_REFINER=false python app.py 
Loading model stabilityai/stable-diffusion-xl-base-1.0
Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:01<00:00,  3.92it/s]
Loading model stabilityai/stable-diffusion-xl-refiner-1.0
Loading pipeline components...:  20%|███████████████████████████████                                                                                                                            | 1/5 [00:00<00:01,  2.55it/s]
Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:01<00:00,  4.18it/s]
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:06<00:00,  7.44it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:02<00:00,  7.47it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:06<00:00,  7.70it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 15/15 [00:01<00:00,  7.72it/s]

@TonyLianLong
Owner

Since you loaded custom weights, it's possible that somehow fp32 weights are loaded. You probably want to check whether you have fp16 weights (https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/tree/main/vae).

If you loaded fp32 weights, you can add refiner.to(torch.float16) to cast the model parameters to fp16.
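A minimal sketch of that cast, using a stand-in `nn.Module` since the real SDXL refiner cannot be loaded here (`refiner` below is hypothetical):

```python
import torch
import torch.nn as nn

# Stand-in for a pipeline component that was loaded with fp32 weights
# (e.g. the refiner's VAE).
refiner = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 4))
assert next(refiner.parameters()).dtype == torch.float32

# Cast every parameter and buffer to half precision, mirroring
# refiner.to(torch.float16) on the real pipeline.
refiner.to(torch.float16)
assert next(refiner.parameters()).dtype == torch.float16
```

Alternatively, in recent diffusers versions, passing `torch_dtype=torch.float16, variant="fp16"` to `from_pretrained` should select the fp16 checkpoint up front, so no cast is needed afterwards.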

@densechen
Author

> Seems that the decoder VAE on the refiner is somehow not fp16. Did you change any config? You can also disable refiner to see if it still happens.

I have disabled the refiner via `PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512 SDXL_MODEL_DIR=pretrained_models OFFLOAD_BASE=false ENABLE_REFINER=false python app.py`, but that failed to solve the issue.

@densechen
Author

It seems like that. A minor modification may be required to make this repo more robust.

@TonyLianLong
Owner

Since I could not reproduce this, could you show me your diff?
