
Switch to GuernikaCore #403

Open · czkoko wants to merge 51 commits into base: main
Conversation

@czkoko commented Feb 17, 2024

  • New Scheduler: LCM, DPM++ SDE Karras, Euler Ancestral, etc.

  • New Model: LCM, SDXL Turbo, SDXL Lightning, etc.

  • New inpainting support for any model: a dedicated inpainting model gives the best results; with a regular model, reduce the strength.

  • New painting interface: the stroke size can be controlled.

  • New Control: ControlNet (SD1.5 and SDXL) and T2I-Adapter (SD1.5 and SDXL). Put them in the same folder as ControlNet. For SDXL, T2I-Adapter is recommended: it is smaller, faster, and gives better picture quality.

  • New latent image decoder: previews the generation process without affecting generation speed.

  • Style prompt: A variety of stylized presets suitable for SDXL.

  • Variable Resolution: for SD1.5 and SDXL, set the resolution completely freely and free up your storage space.

  • Textual Inversion: You can add embeddings when using GuernikaModelConverter.

  • Weighted prompts: increase weight with (prompt), reduce weight with [prompt].

  • No additional ControlledUnet.mlmodelc is required.

  • More convenient model conversion tool.

  • Tip: disabling CFG when converting LCM, Turbo, or Lightning models can cut generation time in half, with memory use only about 60% of the original.

  • Note: it is no longer compatible with previously converted models.
    New conversion tool: GuernikaModelConverter.dmg

  • For model conversion settings and other questions, please refer to the Discord support channel.

  • The basic functions are complete, and there are probably many minor problems, but I'm just a photographer who likes programming,
    so the code will be imperfect and unprofessional. Please help continue to improve it.
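The weighted-prompt syntax in the list above ((prompt) to increase weight, [prompt] to reduce it) is commonly implemented as a per-bracket multiplier. A minimal illustrative sketch, assuming a 1.1 factor per bracket level as in other Stable Diffusion front ends (GuernikaKit's actual factor may differ):

```python
def token_weight(token: str) -> tuple[str, float]:
    """Strip nested () / [] from a token and return (text, weight).

    Each enclosing () multiplies the weight by 1.1 (assumed factor);
    each enclosing [] divides it by 1.1.
    """
    weight = 1.0
    while token.startswith("(") and token.endswith(")"):
        token, weight = token[1:-1], weight * 1.1
    while token.startswith("[") and token.endswith("]"):
        token, weight = token[1:-1], weight / 1.1
    return token, weight
```

For example, "((cat))" yields the text "cat" with roughly 1.21x weight, while "[dog]" yields "dog" with roughly 0.91x.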


- New Scheduler:
  LCM, DPM++ 2M Karras, DPM++ SDE Karras, Euler Ancestral

- New Model:
  LCM, SDXL Turbo

- New Control:
  T2I-Adapter
@gdbing (Collaborator) commented Feb 18, 2024

To be clear, does this break compatibility with existing converted coreml models?

@czkoko (Author) commented Feb 18, 2024

To be clear, does this break compatibility with existing converted coreml models?

There is a way to stay compatible with the old models, but it would limit some of the new features, so if we make the change, it is completely incompatible.

@godly-devotion (Collaborator) commented
Thanks for your work, this looks promising. Unfortunately I currently don't have enough time to review and verify that all the features work correctly, especially since there are a number of differences from ml-stable-diffusion. If someone wants to check and make some changes, please feel free to do so.

@gdbing (Collaborator) commented Feb 21, 2024

It looks like the executable bit was erroneously set on every file in this PR and should be unset. Fixing it has the added benefit of reducing Files Changed from 95 to 16:

git diff-tree --no-commit-id --name-only -r fb463dccf09da7275d2df3016a21fc6aef9e4991 | tr \\n \\0 | xargs -0 chmod -x
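A throwaway demonstration of what the command above does, with stand-in files rather than the real repo. The NUL-delimiting (tr '\n' '\0' plus xargs -0) is what makes it safe for paths containing spaces, such as "Mochi Diffusion/...":

```shell
set -e
rm -rf /tmp/modebits && mkdir -p /tmp/modebits && cd /tmp/modebits
touch 'a.swift' 'Mochi b.swift'
chmod +x 'a.swift' 'Mochi b.swift'   # simulate the stray executable bit
# stand-in for the `git diff-tree ... --name-only` output:
printf 'a.swift\nMochi b.swift\n' | tr '\n' '\0' | xargs -0 chmod -x
ls -l
```

After running, neither file carries the executable bit, including the one with a space in its name.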

Review threads were opened on the following files:
- Mochi Diffusion/Support/Extensions.swift
- Mochi Diffusion/Views/SettingsView.swift
- Mochi Diffusion/Views/InspectorView.swift
- Mochi Diffusion/Support/Functions.swift
- Mochi Diffusion/Model/SDControlNet.swift
- Mochi Diffusion/Model/Scheduler.swift
- Mochi Diffusion/Support/ImageController.swift
- Mochi Diffusion/Support/ImageGenerator.swift
- Mochi Diffusion/Views/JobQueueView.swift
@gdbing (Collaborator) commented Feb 22, 2024

I suggested a few nitpicking code changes, but I think the real questions about this PR are much broader.

  • does this seamlessly reproduce all existing Mochi functionality?
    • I haven't begun testing it myself; I've compiled this branch but haven't converted any suitable models yet
  • how do we handle a compatibility-breaking change like this?
    • we don't want users to auto-update Mochi because Sparkle prompted them to, and then lose the use of their models
  • are there disadvantages or reasons not to move to Guernika?
    • beyond the obvious "it's a fork maintained by one guy"

- more Scheduler
- New painting interface
- New inpainting support for any model
- Variable Resolution
- etc.
@czkoko (Author) commented Feb 23, 2024

@gdbing
I haven't been following these past few days, so I hadn't seen your suggestions, but I have now finished all the work. Please check the code I just submitted.

@da-z's comment was marked as off-topic.

godly-devotion and others added 4 commits February 23, 2024 16:22
Co-authored-by: Graham Bing <gdbing@users.noreply.github.com>
Co-authored-by: Graham Bing <gdbing@users.noreply.github.com>
Co-authored-by: Graham Bing <gdbing@users.noreply.github.com>
@czkoko (Author) commented Feb 24, 2024

@godly-devotion @gdbing
After repeated testing, the new code basically runs perfectly, but the following problems still need to be improved:

  • Some code is not fault-tolerant.

  • Whether a model supports ControlNet or T2I-Adapter is only determined when the pipeline is loaded, not when the drop-down box is displayed. GuernikaModelConverter can make a model compatible with both at the same time.

  • StartingImageView.swift: after dropping an image with a different resolution onto ImageWellView, MaskEditorView may display incorrectly. A drop event is needed to clear the previous image's state.

  • The show-image-preview button currently does nothing: low-quality preview decoding, which does not affect performance, is always on. Should a toggle for high-quality decoding be added, or should the show-image-preview button be removed?

@godly-devotion (Collaborator) commented

  • Whether a model supports ControlNet or T2I-Adapter is only determined when the pipeline is loaded, not when the drop-down box is displayed. GuernikaModelConverter can make a model compatible with both at the same time.

How does the Guernika app (not the converter) deal with this scenario? Does it silently fail if it's set but not available?

  • The show-image-preview button currently does nothing: low-quality preview decoding, which does not affect performance, is always on. Should a toggle for high-quality decoding be added, or should the show-image-preview button be removed?

Is the image preview functionality unavailable with GuernikaCore, or does the existing code just need to be adapted for it to work?

Also do you mind resolving the remaining comments that @gdbing made as well? Again, impressive work!

czkoko and others added 2 commits February 25, 2024 17:30
Co-authored-by: Graham Bing <gdbing@users.noreply.github.com>
Co-authored-by: Graham Bing <gdbing@users.noreply.github.com>
Add an option for high-quality preview images (which affects performance); default to low-quality previews, which do not affect performance.
@czkoko (Author) commented Feb 25, 2024


@godly-devotion The problems you mentioned above have been fixed.

There was a problem in the latest PR with SDXL, SDXL Turbo, and SD1.5; SDXL Lightning was normal.
reduceMemory needs to be forced on to make variable resolution work.
@gdbing (Collaborator) commented Feb 26, 2024

Variable size is a really cool hack, and I definitely think the functionality is worth the hackiness. But it can be cleaned up and made a little less hacky. Most importantly, Mochi shouldn't destructively modify user (model) data, even if it's reversible.

Currently VAEDecoder.mlmodelc/coremldata.bin, VAEEncoder.mlmodelc/coremldata.bin, VAEDecoder.mlmodelc/model.mil, VAEEncoder.mlmodelc/model.mil, and VAEEncoder.mlmodelc/metadata.json are overwritten with image dimension data. The model.mil files are backed up but never restored by Mochi; I think the backup is vestigial debug code left over from developing the vaeDeSDXL() functions.

If this were the only way to implement this I wouldn't object, but Mochi could instead copy and link the model files to a new /tmp/ folder, and modify and read them from there without changing the original model data.

I've implemented this change myself and am testing it out; if I think the code is good I may submit it as a PR to https://github.com/czkoko/MochiDiffusion/tree/GuernikaCore, unless there's a better way to offer code changes to an open PR. I could just attach a patch file.

self.pipeline = modelresource as! StableDiffusionMainPipeline
}

self.pipeline?.reduceMemory = true//reduceMemory
@gdbing (Collaborator):
Suggested change
self.pipeline?.reduceMemory = true//reduceMemory
self.pipeline?.reduceMemory = reduceMemory

@czkoko (Author) replied:

reduceMemory must be forced on here, otherwise img2img cannot use variable resolution: reduceMemory unloads the VAEEncoder so that the new input shape is read. Unloading the VAEEncoder directly is not public API.

@gdbing (Collaborator):

Does this also apply to ControlNets and T2IAdapters?

@czkoko (Author) replied:
Judging from the GuernikaKit source code, reduceMemory unloads ControlNets and T2IAdapters if necessary.
I compared the completion time of the first step with reduceMemory on and off; it is almost the same.
Switching resolution in txt2img does not affect speed.
Switching resolution in img2img requires reloading the pipeline, which takes 1-2 seconds.

@SpiraMira commented Mar 7, 2024

@godly-devotion - is it safe to assume that this branch will not be merged into main? Congratulations to @czkoko on this great integration effort, but I am a little worried about the level of support for GuernikaKit. It seems to have only one developer, and changes are few and far between. Am I wrong?

gdbing and others added 4 commits March 6, 2024 23:51
to match typo in Schedulers package
- pipeline invalidated by changes to
  - model
  - controlnets
  - computeUnit
  - reduceMemory setting

- the hack which allows variable inputSize requires that the pipeline
  be invalidated when starting image changes size
  - also updates VAEEncoder inputSize
  - also unloads VAEEncoder from memory with reduceMemory = true
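The invalidation rules in the commit notes above (reload the pipeline when the model, controlnets, compute unit, reduceMemory setting, or input size changes) can be sketched as a cache keyed by a hash of the configuration. This is an illustrative Python sketch, not the PR's actual Swift code:

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen dataclasses are hashable, so they can key the cache
class PipelineConfig:
    model: str
    controlnets: tuple   # names of active controlnets / adapters
    compute_unit: str
    reduce_memory: bool
    size: tuple          # (width, height) of the starting image, if any

cached: dict = {}

def load_pipeline(cfg: PipelineConfig) -> str:
    """Return a cached pipeline; any change to cfg produces a new cache key."""
    key = hash(cfg)
    if key not in cached:
        cached[key] = f"pipeline({cfg.model})"  # stand-in for the real pipeline load
    return cached[key]
```

With this shape, flipping any single field (e.g. reduce_memory) misses the cache and forces a reload, which matches the behavior the commits describe.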
@vzsg (Contributor) commented Mar 7, 2024


To be fair, Apple's ml-stable-diffusion library – that Mochi is using on main – has been even more stagnant.

@SpiraMira commented

I agree. But there have been at least 16 commits on Apple's ml-stable-diffusion library since release 1.1.0 in September (and arguably a larger community of interested parties and contributors). GuernikaKit is chock-full of good stuff (which I am inspecting for my own purposes). Breaking model compatibility is a concern, but I need to dive deeper into the Guernika model conversion scripts before I feel comfortable...

- turn functions hackVAE and modifyInputSize into SDModel methods
- include size in pipelineHash
- include presence of inputImage in pipelineHash
- move hackVAE call from ImageGenerator to ImageController
- when models are loaded (or reloaded) variable size models are copied
  to a temp folder and their data is loaded from there, so that Mochi
  can edit files required for variable size image generation without
  affecting the original model data
- model data is hard linked, except for the files which will be edited
- temp folder is created in same volume as models so that models can be 
  hard linked
    - if models folder is in System volume, uses
      FileMonitor.temporaryDirectory
    - otherwise creates temp folder in models/../
- reload models when model folder is changed in settings
- cleanup temp folder on app quit or reloading models
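The copy-and-hard-link scheme in the commit notes above can be demonstrated with throwaway paths (not the app's real layout): large read-only files are hard-linked into the temp folder, costing no extra disk space, and only the files that will be edited are real copies, so the original model data is never modified.

```shell
set -e
rm -rf /tmp/hl_demo
mkdir -p /tmp/hl_demo/model /tmp/hl_demo/tmpcopy
printf 'weights'  > /tmp/hl_demo/model/weights.bin
printf 'original' > /tmp/hl_demo/model/model.mil
# hard link the file that stays read-only (same inode, no extra space)
ln /tmp/hl_demo/model/weights.bin /tmp/hl_demo/tmpcopy/weights.bin
# real copy for the file that will be edited (separate inode)
cp /tmp/hl_demo/model/model.mil /tmp/hl_demo/tmpcopy/model.mil
printf 'patched' > /tmp/hl_demo/tmpcopy/model.mil   # edit only the copy
cat /tmp/hl_demo/model/model.mil    # prints "original": untouched
```

Hard links only work within one volume, which is why the commit creates the temp folder on the same volume as the models (falling back to the system temporary directory when the models live on the System volume).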
@gdbing (Collaborator) commented Mar 13, 2024

I've created another PR pointed at czkoko/GuernikaCore: czkoko#4

It attempts to address most of the issues which I and others have raised here.

@gdbing (Collaborator) commented Mar 15, 2024

Something odd I've found: if I use the variable-size hack to resize an SDXL model so that one dimension is at least 1024 and the other is greater than 1024, it consistently crashes on the final processing step.

I can generate 896x1536 and 1024x1024, but if I try 1024x1152, it will crash.

These are the only crashes I'm still seeing, and I can reproduce this going back to the earliest variable size hack builds, so it seems to be something inherent to the hack.

Does Guernika itself allow you to generate 1152x1152 SDXL images?

@czkoko (Author) commented Mar 15, 2024


Yes, I found this problem a long time ago: the width and height cannot both exceed 1024.
I'm sure it's caused by coremldata.bin. The Guernika app does not have this problem: it modifies coremldata.bin every time the resolution changes, compiling fixed shape information into it.
I tried that scheme but failed: modifying the shape information in binary mode caused a CoreML read error, and I don't know how the author of Guernika modified it.
Later I found the current solution: flexible shape information is compiled into coremldata.bin. The advantage is that coremldata.bin only needs to be replaced once; the disadvantage is that the width and height cannot both exceed 1024.

@SpiraMira commented

Hi - the hack only touches the .mil files. Why is coremldata.bin also being backed up, renamed, and copied? I've searched the code here and don't see where coremldata.bin comes into play. Also, is there a description of what coremldata.bin contains? I know it's a product of the CoreML compiler, but I can't find any decent documentation on it (beyond it containing some metadata...)

thanks in advance.

@czkoko (Author) commented Mar 17, 2024


coremldata.bin stores the model information shown in the screenshot below, and the CoreML framework reads it to verify the model.
[screenshot of coremldata.bin contents]

@SpiraMira commented

Thanks - so it's a binary version of the InputSchema and OutputSchema latent descriptions from metadata.json, I guess.

  • how did you get that picture?
  • is it being backed up just as a precaution? (I don’t think the app modifies it in any way)

@lost-illusion commented

@czkoko Hi!
I'm trying to test your PR: I cloned the branch and converted a model like this:
[screenshot]

On my M1 Max (24-core GPU) I waited three hours for the model to load the first time, and nothing happened. Maybe I'm doing something wrong? I tried both branches, main and GuernikaCore.

@czkoko (Author) commented May 8, 2024

@lost-illusion
Reconvert the model using CPU & GPU; the ANE does not support a resolution as high as 1024x1024.

@lost-illusion commented

Okay, thank you. Something like this?
[screenshot]
It would be good to have an example, to simplify testing.

@czkoko (Author) commented May 8, 2024

@lost-illusion Recommended settings: [screenshot]

More instructions for use: https://discord.com/channels/1068185566782423092/1068187870537449532/1208434036574396477

@lost-illusion replied

ty
