According to the README at https://github.com/OpenGVLab/InternVideo/tree/main/InternVideo1/Downstream/Video-Text-Retrieval, the zero-shot retrieval results are obtained by running `./zeroshot_scripts/eval_msrvtt.sh`, which executes `main_task_retrieval.py`. But in `main_task_retrieval.py`, I find that the model is CLIP4Clip, not ViCLIP. How can I conduct zero-shot video-text retrieval experiments with the pretrained ViCLIP?
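For context, here is the kind of evaluation I have in mind, as a minimal sketch. The recall computation is standard for CLIP-style dual encoders; the `load_viclip`, `encode_video`, and `encode_text` names in the usage outline are hypothetical placeholders, since I could not find the corresponding ViCLIP API in this folder:

```python
# Sketch of zero-shot text-to-video retrieval evaluation for a CLIP-style
# dual encoder such as ViCLIP. The metric computation below is concrete;
# how the embeddings are produced depends on the ViCLIP checkpoint and
# wrapper (the encode_* calls in the comments are placeholders, not the
# repo's actual API).
import torch
import torch.nn.functional as F

def retrieval_metrics(video_embs: torch.Tensor, text_embs: torch.Tensor):
    """Compute R@1/5/10 for text-to-video retrieval.

    video_embs: (N, D) video embeddings, one per candidate video.
    text_embs:  (N, D) text embeddings; text i is the caption of video i.
    """
    v = F.normalize(video_embs, dim=-1)
    t = F.normalize(text_embs, dim=-1)
    sims = t @ v.T  # (N, N): similarity of each caption to every video
    # Rank of the ground-truth video for each caption (0 = retrieved first).
    order = sims.argsort(dim=-1, descending=True)
    targets = torch.arange(len(sims), device=sims.device).unsqueeze(1)
    ranks = (order == targets).float().argmax(dim=-1)
    return {f"R@{k}": (ranks < k).float().mean().item() * 100 for k in (1, 5, 10)}

# Usage outline (pseudocode; the exact loading/encoding API depends on
# the ViCLIP release):
#   model = load_viclip(checkpoint_path)  # hypothetical loader
#   video_embs = torch.stack([model.encode_video(frames) for frames in videos])
#   text_embs  = torch.stack([model.encode_text(cap) for cap in captions])
#   print(retrieval_metrics(video_embs, text_embs))
```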