Hi, thanks for the great work. I tried the following code snippet with the internlm-xcomposer2-vl-7b model for a QA task with two input images:
import os.path as osp
import torch

# model, tokenizer, and image_folder_dir are already set up as usual.
images = [osp.join(image_folder_dir, "COCO_val2014_000000143961.jpg"),
          osp.join(image_folder_dir, "COCO_val2014_000000274538.jpg")]
image1 = model.encode_img(images[0])  # visual embeddings for each image
image2 = model.encode_img(images[1])
image = torch.cat((image1, image2), dim=0)
query = "First picture:<ImageHere>, second picture:<ImageHere>. Describe the subject of these two pictures?"
response, _ = model.interleav_wrap_chat(tokenizer, query, image, history=[], meta_instruction=True)
(Here meta_instruction is a required positional argument; I'm not sure whether it should be set to True or False.)
However, I realized that the returned response is actually {'inputs_embeds': wrap_embeds}.
How should I proceed from here to get the decoded text output?
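For reference, here is my current guess at the next step, based on what the model's chat() method appears to do internally. The generate() call accepting inputs_embeds, the im_mask keyword, and '[UNUSED_TOKEN_145]' as the end-of-turn token are all assumptions on my part from reading the modeling file, so please correct me if any of this is wrong:

# Unverified sketch of the decoding step, mirroring what chat() seems
# to do internally. Assumes generate() accepts inputs_embeds like a
# standard HuggingFace model; im_mask and '[UNUSED_TOKEN_145]' are
# guesses taken from the modeling file.
inputs, im_mask = model.interleav_wrap_chat(
    tokenizer, query, image, history=[], meta_instruction=True)
inputs = {k: v.to(model.device) for k, v in inputs.items() if torch.is_tensor(v)}
outputs = model.generate(
    **inputs,
    im_mask=im_mask.to(model.device),  # assumption: marks image token positions
    max_new_tokens=512,
    do_sample=False,
)
decoded = tokenizer.decode(outputs[0].cpu().tolist(), skip_special_tokens=True)
decoded = decoded.split('[UNUSED_TOKEN_145]')[0]  # trim if end-of-turn marker survives decoding
print(decoded)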
Thanks in advance!