how to realize multi-image correlation in vqa task? #200
Comments
You can pass a list of images and place the same number of "<|image|>" placeholders in your prompt.
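As a rough illustration of the suggestion above, here is a minimal sketch that builds a prompt with one "<|image|>" placeholder per image and checks that the counts match before anything is sent to the model. Only the placeholder string comes from this thread. The helper names, the "USER: ... ASSISTANT:" layout, and the file names are hypothetical, so swap in whatever prompt template your checkpoint actually expects.

```python
IMAGE_TOKEN = "<|image|>"  # placeholder string quoted in this thread


def build_multi_image_prompt(question: str, num_images: int) -> str:
    """Return a prompt with one image placeholder per image, then the question.

    The "USER: ... ASSISTANT:" layout is an assumption for illustration only;
    it is not taken from the project.
    """
    if num_images < 1:
        raise ValueError("num_images must be at least 1")
    placeholders = IMAGE_TOKEN * num_images
    return f"USER: {placeholders}{question} ASSISTANT:"


def check_placeholder_count(prompt: str, images: list) -> None:
    """Raise if the number of placeholders differs from the number of images."""
    found = prompt.count(IMAGE_TOKEN)
    if found != len(images):
        raise ValueError(
            f"prompt has {found} placeholder(s) but {len(images)} image(s) were given"
        )


# Hypothetical file names; load and preprocess them with the project's own tools.
image_paths = ["left.jpg", "right.jpg"]
prompt = build_multi_image_prompt(
    "What is different between the two images?", len(image_paths)
)
check_placeholder_count(prompt, image_paths)
print(prompt)
```

Running the check before calling model.generate() makes a mismatch between the image list and the prompt fail loudly instead of silently producing a wrong answer.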
I passed a list of two images and modified the prompt accordingly. After preprocessing, image_tensor has a batch size of 2, while input_ids has a batch size of 1. When I run model.generate(), I do get a result, but it is wrong. Am I misunderstanding something?
Could you provide an example, along with the incorrect response the owl generated? By the way, the owl has not been trained on SFT data that includes multiple images, so it is reasonable to expect it to fail in some cases.
In a VQA task, I want to input two images and ask a question about both of them. How can I do this?