Batch size has weird effects on temp images generated by zoom_enhance #199

Open
3 tasks done
Sporking opened this issue Sep 19, 2023 · 1 comment
Labels
bug Something isn't working

Comments

Sporking commented Sep 19, 2023

Due diligence

  • I checked for similar issues and couldn't find any.
  • My WebUI and Unprompted are both up-to-date.
  • I disabled my other extensions but the problem persists.

Describe the bug

I am seeing strange effects when batch size is changed in the WebUI.

Prompt: "an ugly man[after][zoom_enhance _alt include_original replacement="blue face" denoising_max=0.8][/after]"
Seed: 797457345
Sampling Method: DPM++ 3M SDE Karras
Sampling Steps: 10
Batch Count: 1
Batch Size: <see below>
Width: 512
Height: 512
Checkpoint: absolutereality_v16.safetensors ("Absolute Reality", downloadable from civitai.com). I imagine that similar effects could probably occur for any checkpoint. I am not using SDXL.

Also, "Unprompted: Synchronize with main seed" is enabled.

Try this with Batch Size==1, then Batch Size==2, then Batch Size==3.

Some terminology: call the first image saved into txt2img/<date> (the one generated from the exact seed) "A", the second "B", and the third "C". Denote a portion of image A that has been clipped out by zoom_enhance and saved in img2img-images/<date> as "A1", "A2", "A3", etc. Every "A1" refers to the same image, which differs from "A2" and from "A3".

So with that understood, what I expect to see in img2img-images/<date> (images listed in the order they appear, by filename):

When Batch size==1: A1
When Batch size==2: A1 B1 (A1 should be identical with A1 produced by batch size==1)
When Batch size==3: A1 B1 C1 (A1 and B1 should be identical with A1 and B1 produced by batch size==2)

What I actually see in img2img-images/<date>:

When Batch size==1: A1 (Note by the way that A1 looks basically identical to a clipped portion of A: no "blue face")
When Batch size==2: A2 B1 A3 B2 (Yes, they are generated twice! With different images from each other and from batch size==1! Blue is evident in all images)
When Batch size==3: A4 C1 B3 A5 B4 C2 (Again, generated twice, with still different images from batch size==1 or 2. Blue evident in all images)

No two of the generated images are the same. I expected that, at least for the first image generated (the A image, which uses the same seed each time), zoom_enhance would detect and clip out exactly the same face region regardless of batch size and apply identical processing to it, producing identical images. That isn't what happens.

One thing that may be relevant: the higher I set the batch size, the larger the discontinuity I see on the right and bottom sides of each image. With batch size=1 there seems to be no discontinuity. With batch size=2 a subtle discontinuity appears at perhaps 1/30th of the image width on the right side in the first two images, and at perhaps twice that in the next two images. At batch size=3 it is even larger in the first three images (appearing on the right and bottom) and larger still in the next three (in that case the img2img algorithm seems to have decided to imagine a corner of a blurry room in the background behind the foreground face). I am calling this a "discontinuity", but it actually looks to me as if the image has been shifted left by varying amounts as new pixels are supplied on the right (note the decreasing distance between the man's ear on the left and the left edge of the image).

I have seen some evidence in larger images (for other, more complex prompts) that these clipped portions are taken by an algorithm that "wraps around" the original source image. For instance, if a clip region ran from -2, -2 to 8, 8 on a source image of size 100, 100, then the part of the clipped image that maps to coordinates less than zero would be taken from the range 98..100 x 98..100 of the source image, rather than being set to a fixed color (e.g. black or gray) as I would have expected. This results in images within img2img-images/<date> that wrap around the original source image (toroidal wrapping), sometimes dramatically. I think this is what is causing the discontinuities, although I can't explain why they seem to grow larger with increasing batch size. In such cases it would seem better to clip the clipping region to the bounds of the source image before taking pixels from it (so a region like "-2, -2 to 8, 8" becomes "0, 0 to 8, 8"), to avoid wrap-around effects like these. I think similar wrap-around effects can also happen on the right and bottom sides of the source image, if the right and/or bottom side of the clipping region exceeds the right and/or bottom bounds of the source image.
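To make the suspected behavior concrete, here is a minimal sketch of the two cropping strategies described above, using the same -2, -2 to 8, 8 region on a 100x100 image. This is not Unprompted's actual code. The function names (`clamp_box`, `crop_wrapping`, `crop_clamped`) are hypothetical, and the image is modeled as a plain list of rows so the example has no dependencies.

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive
Box = Tuple[int, int, int, int]


def clamp_box(box: Box, width: int, height: int) -> Box:
    """Clip a crop box to the bounds of a width x height image."""
    left, top, right, bottom = box
    left, right = max(0, left), min(width, right)
    top, bottom = max(0, top), min(height, bottom)
    if right <= left or bottom <= top:
        raise ValueError("crop box lies entirely outside the image")
    return left, top, right, bottom


def crop_wrapping(img: List[list], box: Box) -> List[list]:
    """Crop with modulo indexing: out-of-range coordinates wrap around."""
    left, top, right, bottom = box
    h, w = len(img), len(img[0])
    return [[img[y % h][x % w] for x in range(left, right)]
            for y in range(top, bottom)]


def crop_clamped(img: List[list], box: Box) -> List[list]:
    """Crop after clipping the box to the image, so nothing wraps."""
    h, w = len(img), len(img[0])
    left, top, right, bottom = clamp_box(box, w, h)
    return [row[left:right] for row in img[top:bottom]]


# 100x100 test image where every pixel stores its own (x, y) coordinate
image = [[(x, y) for x in range(100)] for y in range(100)]
box = (-2, -2, 8, 8)

wrapped = crop_wrapping(image, box)
clamped = crop_clamped(image, box)

print(wrapped[0][0])               # (98, 98): pulled from the opposite corner
print(clamped[0][0])               # (0, 0): starts at the image edge
print(len(wrapped), len(clamped))  # 10 8
```

The clamped crop comes out smaller than the requested region (8x8 instead of 10x10), so any later step that pastes the result back would need to use the clamped coordinates rather than the original ones.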

Note the growing discontinuities on the right side of the "A1", "A2", and "A4" images, for instance. Also notice the decreasing distance of the ear on the left of the image to the left edge of the image in these cases.

batch size==1: Source image A, and clipped image A1, A1 size = 512x512 (why?)
00088-797457345
00056-797457345

batch size==2: Source image A, and clipped images A2 and A3, A2 and A3 size = 1024x1024 (why?)
00089-797457345
00057-797457345
00059-797457345

batch size==3: Source image A, and clipped images A4 and A5, A4 and A5 size = 1024x1024 (why?)
00091-797457345
00061-797457345
00064-797457345
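For anyone who wants to check the sizes above without opening each file, the following is a small sketch that reads the width and height from a PNG header using only the standard library. It assumes the images were saved as PNG. The helper names and the folder passed to `report_sizes` are hypothetical and not part of the WebUI or Unprompted.

```python
import struct
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size_from_bytes(data: bytes):
    """Return (width, height) from the first 24 bytes of a PNG file."""
    # Layout: 8-byte signature, 4-byte chunk length, b"IHDR",
    # then width and height as big-endian unsigned 32-bit integers.
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def png_size(path):
    """Read only the header of the file at `path` and return its size."""
    with open(path, "rb") as f:
        return png_size_from_bytes(f.read(24))


def report_sizes(folder):
    """Print the name and size of each PNG in `folder`, sorted by name."""
    for path in sorted(Path(folder).glob("*.png")):
        print(path.name, png_size(path))
```

Calling `report_sizes` on the img2img-images/<date> folder after each run would list the clipped images in filename order next to their dimensions, which makes it easier to see at which batch size the output switches from 512x512 to 1024x1024.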

Open mysteries:
1: Why does the generated source image A differ between batch size==1, 2, or 3? It has the same seed in each case!
2: Why does batch size greater than 1 produce doubled images in img2img-images/<date>? There is only one face in each source image, so zoom_enhance should only find one face for each, and the number of clipped images should be the same as the number of source images (which should be the same as the batch size). Is some kind of double-processing occurring?
3: Why is there a discontinuity on the right side and bottom side of the first (but not the second) of the clipped images in img2img-images/<date>? Is bounds checking and wraparound being handled correctly?
4: Why do the sizes of the generated images in img2img-images/<date> seem to differ depending on batch size?
5: Does this have anything to do with the problems reported in #152? I still can't get the clipped images from img2img-images/<date> stitched back into the original source images as they should be. Also, include_original has no effect.

Thanks in advance for any insight on these questions.

Log output

There were no error messages in log output.

Unprompted version

v9.16.1

WebUI version

v1.6.0

@Sporking Sporking added the bug Something isn't working label Sep 19, 2023
Sporking (Author) commented

Note that using _alt on zoom_enhance seems to make no difference in these results.
