Output of interleav_wrap_chat #301

Open
wlin-at opened this issue May 2, 2024 · 0 comments
wlin-at commented May 2, 2024

Hi, thanks for the great work. I tried the following code snippet with the internlm-xcomposer2-vl-7b model for a QA task with two input images.

import os.path as osp
import torch

# image_folder_dir, model, and tokenizer are already defined/loaded at this point.
# Encode each image separately, then concatenate the embeddings along the batch dimension.
images = [osp.join(image_folder_dir, "COCO_val2014_000000143961.jpg"),
          osp.join(image_folder_dir, "COCO_val2014_000000274538.jpg")]
image1 = model.encode_img(images[0])
image2 = model.encode_img(images[1])
image = torch.cat((image1, image2), dim=0)
query = """First picture:<ImageHere>, second picture:<ImageHere>. Describe the subject of these two pictures?"""
response, _ = model.interleav_wrap_chat(tokenizer, query, image, history=[], meta_instruction=True)

(Here meta_instruction is a required positional argument; I am not sure whether it should be set to True or False.)
However, I realized that the returned response is actually the dict {'inputs_embeds': wrap_embeds} rather than generated text.
How should I proceed from there to get the decoded text output?
Thanks in advance!
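For context, what I would have expected is something along the lines of the usual Hugging Face pattern of feeding inputs_embeds into generate and decoding the result. This is only a minimal sketch of that idea, assuming the model exposes the standard GenerationMixin generate() that accepts inputs_embeds, and that response is the dict returned above; the generation settings are placeholders, not values from this repository:

# Sketch only -- assumes a standard GenerationMixin-style generate() that
# accepts inputs_embeds; max_new_tokens / do_sample are placeholder choices.
with torch.no_grad():
    output_ids = model.generate(
        inputs_embeds=response["inputs_embeds"],
        max_new_tokens=512,
        do_sample=False,
    )
decoded = tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
print(decoded)

Is this the intended way to continue after interleav_wrap_chat, or is there a dedicated helper in the repo for decoding from the wrapped embeddings?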
