Improved metadata, second attempt #406

Open
wants to merge 3 commits into
base: main

Conversation

Dalchrome

I have improved and added to the metadata capabilities of MochiDiffusion so metadata can be ported between apps via the saved images.

I have kept the existing metadata and added several other fields, including Xmp.exif.UserComment. This is a JSON-formatted string that follows the A1111 API conventions, so it is compatible with CivitAI, the main Python generators, and popular Mac apps, and it makes linking to other apps easier later.

The main file added is MetadataHelper.swift, which contains my ported code merged with the existing metadata code. This file is easy to maintain, and I can see plenty to add once the transition to Guernika is complete; I have left placeholders ready for the new schedulers.

The other change amends the imagedata function in SDImage.swift so that it calls for the metadata.

@Dalchrome
Author

I just revised the variable names to begin with lowercase letters, as recommended by SwiftLint.

@gdbing
Collaborator

gdbing commented Mar 17, 2024

I’ll take a closer look at this later but I have some quick initial questions you might be able to answer

  • is the new metadata format documented anywhere?
  • does Mochi still correctly handle metadata from images generated with previous versions of the app?
  • for that matter, I’m assuming that we’re replacing the old metadata with the new, is that correct?

@Dalchrome
Author

The original Metadata remains unchanged, the old code is just moved to the new metadata function. The main additional metadata (exifMetadata) is modeled on the output files from Draw Things and A1111, and was tested with the import functions of Draw Things (DT), Prompt Writer, and image editing software. I haven't found any 'official' spec, but apart from "c" and "uc" for the positive and negative prompts, the fields follow the A1111 API JSON conventions, with room to add custom fields. The additional metadata formats are there to be human-readable and to give the same information, for wider compatibility with whatever photo editing or browsing software is being used.
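To make the description above concrete, here is a minimal sketch of how such a JSON UserComment string might be produced. It is not the code in this PR. Only the "c" and "uc" keys are named in this thread; the type name `UserCommentPayload`, the function `userCommentJSON`, and the remaining key names (`cfg_scale`, `sampler_name`, and so on, borrowed from A1111 API parameter names) are assumptions for illustration.

```swift
import Foundation

// Hypothetical payload type. Only "c" and "uc" come from this thread;
// every other key name is an assumption modeled on A1111 API parameters.
struct UserCommentPayload: Codable, Equatable {
    let c: String            // positive prompt
    let uc: String           // negative prompt
    let seed: UInt32
    let steps: Int
    let cfgScale: Double
    let samplerName: String

    enum CodingKeys: String, CodingKey {
        case c, uc, seed, steps
        case cfgScale = "cfg_scale"
        case samplerName = "sampler_name"
    }
}

// Encodes the payload as a single JSON string suitable for a text tag.
// Sorted keys keep the output stable between runs.
func userCommentJSON(_ payload: UserCommentPayload) throws -> String {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.sortedKeys]
    let data = try encoder.encode(payload)
    return String(decoding: data, as: UTF8.self)
}
```

Using `Codable` with explicit `CodingKeys` keeps the Swift property names lowerCamelCase while the emitted keys stay in whatever spelling the importing apps expect.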

@gdbing
Collaborator

gdbing commented Mar 22, 2024

The original Metadata remains unchanged, the old code is just moved to the new metadata function.

This isn't right. What I mean is that the metadata from a random image generated with the main branch looks like this

Include in Image: portrait of a man submerged in honey; Exclude from Image: ; Model: juggernautXL-v6_SDXL_original_8bit_768x1024; Steps: 14; Guidance Scale: 4.0; Seed: 1963368986; Size: 768x1024; Scheduler: PNDM; ML Compute Unit: CPU & GPU; Generator: Mochi Diffusion 5.1

and the metadata of the same image generated with your branch looks like this

c:portrait of a man submerged in honey, uc:, seed:2963653646, Guidance Scale:4.0, Sampler:PNDM, Steps:14, Model:juggernautXL-v6_SDXL_original_8bit_768x1024, upscaler:, Size:768x1024

It's fine to change this. The point of this PR is that there is a way of formatting metadata that has become the de facto standard, used by e.g. CivitAI and A1111, and that Mochi should change to adhere to it. Becoming slightly less human-readable ("Include in Image" vs "c") seems like a reasonable tradeoff.
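For reference while comparing the two strings, the main-branch format quoted above is a "Key: value" list joined with "; ". A rough sketch of splitting it into fields follows; `parseLegacyMetadata` is a hypothetical helper, not Mochi's actual parser, and it assumes no value contains "; " itself.

```swift
import Foundation

// Hypothetical helper: splits "Key: value; Key: value" into a dictionary.
// Assumption: values never contain "; ", so a prompt that does would be cut short.
func parseLegacyMetadata(_ text: String) -> [String: String] {
    var fields: [String: String] = [:]
    for entry in text.components(separatedBy: "; ") {
        // Split on the first ": " only, so values may contain colons.
        guard let separator = entry.range(of: ": ") else { continue }
        let key = String(entry[..<separator.lowerBound])
        let value = String(entry[separator.upperBound...])
            .trimmingCharacters(in: .whitespaces)
        fields[key] = value
    }
    return fields
}
```

The comma-separated string from the PR branch is harder to split the same way, because prompts routinely contain commas. That is one practical reason to carry the structured fields in a JSON tag as well.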

I'm interested in this PR, and agree with its broad goal of aligning with an existing metadata format, but I need to know more about the format we're trying to adhere to before I can review this and affirm that this PR accomplishes its goal. If there's not a documented spec, show some practical examples of what the target is. Ideally you could link to the equivalent code in a1111 or comfy, but if you've been just examining the metadata of images generated by other apps, then paste in some examples of that.

@gdbing gdbing self-requested a review March 22, 2024 21:06
Collaborator

@gdbing gdbing left a comment

There are also some fixes and improvements I'd like to see

  • merge in, or branch from the latest origin/main, to fix merge conflicts
  • images generated with the new metadata disappear when you relaunch Mochi, because the Generated field was removed from metadata, which causes createSDImageFromURL() to return nil for them
    • in general, I don't think there's any need to remove preexisting fields in order to bring metadata into conformance, I would imagine that they are just ignored by other applications and websites
  • duplicate code
    • MetadataHelper.swift can probably be removed altogether. The goal can be accomplished by modifying SDImage.metadata() instead of creating a new function.
  • NB there's no need to exhaustively check every case for enums with a sensible rawValue, like Scheduler
    • see SDImage.swift line 136: \(Metadata.scheduler.rawValue): \(scheduler.rawValue); \
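The last two points can be illustrated with a small sketch. The `Scheduler` enum below is a stand-in with assumed case names and raw values, and `metadataDescription` is a hypothetical function rather than the real `SDImage.metadata()`. It shows the raw value supplying the display string directly, and the preexisting Generator field being kept in the output.

```swift
// Stand-in for Mochi's Scheduler enum; the cases and raw values are assumptions.
enum Scheduler: String {
    case pndm = "PNDM"
    case dpmSolverMultistep = "DPM-Solver++"
}

// Hypothetical description builder. The scheduler's rawValue is interpolated
// directly, so no switch over every case is needed, and the Generator field
// is retained for readers that rely on it.
func metadataDescription(prompt: String, scheduler: Scheduler, generator: String) -> String {
    "Include in Image: \(prompt); Scheduler: \(scheduler.rawValue); Generator: \(generator)"
}
```

With this shape, adding a new scheduler case only requires giving it a raw value; the metadata code does not change.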


2 participants