How to realize multi-image correlation in a VQA task? #200

Open
fansticOne opened this issue Jan 31, 2024 · 4 comments

Comments

@fansticOne

In a VQA task, I want to input two images and ask a question about both of them. How can I do this?

@LukeForeverYoung
Collaborator

You can pass a list of images and place the same number of "<|image|>" placeholders in your prompt, one per image.
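As a minimal sketch of this advice, the prompt can be assembled so that the number of "<|image|>" placeholders always equals the number of images. The helper name `build_multi_image_prompt` and the file names are hypothetical, and the template is the one quoted further down in this thread; none of this is the project's own API.

```python
IMAGE_PLACEHOLDER = "<|image|>"


def build_multi_image_prompt(question: str, num_images: int) -> str:
    """Hypothetical helper: emit one placeholder per image, then the question."""
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    placeholders = IMAGE_PLACEHOLDER * num_images
    return (
        f"USER: {placeholders}{question}\n"
        "Answer the question using a single word or phrase. ASSISTANT:"
    )


# Illustrative file names only; substitute your own images.
image_paths = ["first.jpg", "second.jpg"]
prompt = build_multi_image_prompt(
    "Do the two dogs have the same color?", len(image_paths)
)
print(prompt)
```

Deriving the placeholder count from `len(image_paths)` keeps the prompt and the image list in step when images are added or removed.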

@fansticOne
Author

fansticOne commented Feb 4, 2024

I passed a list of two images and modified the prompt accordingly. After preprocessing, image_tensor has a batch size of 2, while input_ids has a batch size of 1. When I run model.generate(), I do get a result, but it is wrong. Am I misunderstanding something?

@LukeForeverYoung
Collaborator

> I passed a list of two images and modified the prompt accordingly. After preprocessing, image_tensor has a batch size of 2, while input_ids has a batch size of 1. When I run model.generate(), I do get a result, but it is wrong. Am I misunderstanding something?

Could you provide an example and the incorrect response generated by the owl? Btw, the owl has not been trained on SFT data that includes multiple images. Therefore, it is reasonable to expect that it might fail in some cases.
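Before sharing a failing case, it may help to rule out a simple mismatch between the images and the prompt. The sketch below assumes that each "<|image|>" placeholder consumes one entry along the first dimension of the image tensor, which is what the advice above implies but is not confirmed here. The function name and the example shape are illustrative assumptions.

```python
IMAGE_PLACEHOLDER = "<|image|>"


def check_image_prompt_alignment(prompt: str, image_batch_shape: tuple) -> None:
    """Raise if the stacked image count differs from the placeholder count."""
    num_images = image_batch_shape[0]  # first dimension = number of images
    num_placeholders = prompt.count(IMAGE_PLACEHOLDER)
    if num_images != num_placeholders:
        raise ValueError(
            f"{num_images} image(s) but {num_placeholders} placeholder(s) in prompt"
        )


prompt = (
    "USER: <|image|><|image|>Do the two dogs have the same color?\n"
    "Answer the question using a single word or phrase. ASSISTANT:"
)
# Assumed shape (num_images, channels, height, width); the sizes are illustrative.
check_image_prompt_alignment(prompt, (2, 3, 448, 448))  # passes silently
```

If this check passes, a single prompt paired with two stacked images is consistent with that assumption, and a wrong answer would more likely come from the model itself than from the input layout.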

@fansticOne
Author

Here are the two images I passed:
[image: 1664356777209_m_11]
[image: 1664356777209_m_17]
The prompt is:
'USER: <|image|><|image|>{}\nAnswer the question using a single word or phrase. ASSISTANT:'.format('Does the dog in the first picture have same color with the dog in the second picture?')
The response generated by the owl is 'Yes'.
