Multi-image inference #71

g-h-chen · 2024-03-07T08:46:53Z

Thanks for your great work! LLaMA-VID supports single-image input and video input, but does it support multi-image input? What's the quickest way to adapt to this input?

Thanks in advance!

yanwei-li · 2024-04-01T04:16:38Z

In current version, we do not support multi-image input. But you can support it by using multi-image instruction data like MIMIC-IT for instruction tuning.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi-image inference #71

Multi-image inference #71

g-h-chen commented Mar 7, 2024

yanwei-li commented Apr 1, 2024

Multi-image inference #71

Multi-image inference #71

Comments

g-h-chen commented Mar 7, 2024

yanwei-li commented Apr 1, 2024