Hello,
I am evaluating the BLIP-2 model for a use case where I need to assess the similarity between text and images. For the initial round of experiments, I used the image-text matching notebook.
Below are the results for the Image-Text Contrastive (ITC) and Image-Text Matching (ITM) heads:
Inputs:
Input Image:
Input Text:
In this image, a person is depicted wearing a white and black t-shirt and black socks. The individual is standing on a green surface, with a white wall in the background.
ITM Score: The image and text are matched with a probability of 99.878%.
ITC Score: The cosine similarity between the image feature and text feature is 0.4622.
Why is there such a large difference between the ITM and ITC scores?
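For context, my understanding is that the two numbers are different kinds of quantity: the ITM score is a softmax probability from a two-class (no match / match) head, while the ITC score is a raw cosine similarity between embeddings. The sketch below is not the notebook's code. It only illustrates the two formulas in plain Python, and the logits and feature vectors are made-up values, not outputs of the model.

```python
import math


def itm_probability(logits):
    """Softmax over [no_match, match] logits; returns P(match)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical values, chosen only for illustration.
itm_logits = [-3.0, 3.7]        # [no_match, match]
image_feat = [0.9, 0.1, 0.4]    # stand-in for an image embedding
text_feat = [0.2, 0.8, 0.5]     # stand-in for a text embedding

itm_score = itm_probability(itm_logits)
itc_score = cosine_similarity(image_feat, text_feat)

print(f"ITM probability:       {itm_score:.5f}")
print(f"ITC cosine similarity: {itc_score:.4f}")
```

If that reading is right, a moderate gap between the two logits already pushes the softmax close to 1, whereas a cosine similarity is not a probability and does not have to be near 1 for a matching pair, so the two scores may not be directly comparable.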