cls_token problem with image. #207

Open
evertonipx opened this issue Feb 14, 2024 · 3 comments

Comments

@evertonipx

evertonipx commented Feb 14, 2024

With a text-only prompt, mPLUG-Owl2 works fine. But when I include an image, I get this error:

File "C:\py projects\IPXCopilot_OWLVersion\mplug_owl2\model\visual_encoder.py", line 117, in forward
    if self.cls_token :
RuntimeError: Boolean value of Tensor with more than one value is ambiguous

If I change that line to `if self.cls_token is not None:`, I get this error instead:

File "C:\py projects\IPXCopilot_OWLVersion\mplug_owl2\model\visual_encoder.py", line 123, in forward
    embeddings = embeddings + get_abs_pos(self.position_embedding,embeddings.size(1))
RuntimeError: The size of tensor a (1024) must match the size of tensor b (1049600) at non-singleton dimension 2

Is anyone else seeing the same problem? It worked fine before the update.
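For readers hitting the first error, here is a minimal sketch of why `if self.cls_token :` fails once the attribute holds a multi-element tensor. It assumes the class token is a learnable tensor of shape (1, 1, 1024); the names `truthiness_check` and `presence_check` are hypothetical helpers for illustration, not part of mPLUG-Owl2.

```python
import torch

# Assumption: a stand-in for the encoder's class token, shape (1, 1, hidden_size).
cls_token = torch.nn.Parameter(torch.zeros(1, 1, 1024))


def truthiness_check(token):
    """Mimic `if self.cls_token:` and report what happens (hypothetical helper)."""
    try:
        # bool() on a tensor with more than one element raises RuntimeError.
        return bool(token)
    except RuntimeError as err:
        return f"RuntimeError: {err}"


def presence_check(token):
    """Mimic `if self.cls_token is not None:`, which never inspects the values."""
    return token is not None


truthiness_result = truthiness_check(cls_token)  # a string describing the RuntimeError
presence_result = presence_check(cls_token)      # True, since the token exists

print(truthiness_result)
print(presence_result)
```

The identity test against `None` only asks whether the token exists, so it avoids the ambiguity. As the second traceback shows, it does not by itself resolve the position-embedding size mismatch further down in `forward`.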

evertonipx changed the title from "cls_token wproblem with image." to "cls_token problem with image." on Feb 14, 2024
@findalexli

Following as well; I'm getting this exact issue.

@jiaqixuac

It seems that the updated code does not handle cls_token well.
See 54b508a.
If you change `if embeddings.shape[1] != self.num_patches:` to `if self.cls_token is None and embeddings.shape[1] != self.num_patches:`, it works.
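A rough sketch of why the extra guard might matter, under some assumptions: the sizes below (1024 patches, hidden size 1024) are inferred from the traceback, where 1049600 = 1025 × 1024 is consistent with a position table covering 1024 patches plus one class token. The helper names `needs_resize_original` and `needs_resize_guarded` are hypothetical and only mirror the two conditions quoted above; they are not the repository's code.

```python
import torch

# Assumed sizes, inferred from the error message rather than read from the repo.
num_patches = 1024   # e.g. a 32 x 32 patch grid
hidden_size = 1024   # matches "tensor a (1024)" at dimension 2

patch_embeds = torch.zeros(1, num_patches, hidden_size)
cls_token = torch.zeros(1, 1, hidden_size)

# With a class token prepended, the sequence length is num_patches + 1.
embeddings = torch.cat([cls_token, patch_embeds], dim=1)


def needs_resize_original(embeddings, num_patches):
    """Original condition: fires whenever the sequence length differs."""
    return embeddings.shape[1] != num_patches


def needs_resize_guarded(embeddings, num_patches, cls_token):
    """Suggested condition: only consider resizing when there is no class token."""
    return cls_token is None and embeddings.shape[1] != num_patches


print(tuple(embeddings.shape))
print(needs_resize_original(embeddings, num_patches))
print(needs_resize_guarded(embeddings, num_patches, cls_token))
```

Under these assumptions the original condition is always true when a class token is present, because the sequence is one element longer than `num_patches`, so the position-embedding resize path runs even for a standard-size image. The guarded condition skips that path whenever a class token exists.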

@vateye

vateye commented Feb 18, 2024

Fixed.
