A paper list covering large multi-modality models, parameter-efficient finetuning, vision-language pretraining, and conventional image-text matching, intended to give preliminary insight.
tutorial
awesome-list
vision-and-language
video-text-recognition
cross-modal-retrieval
visual-semantic-embedding
image-text-matching
video-text-retrieval
image-text-retrieval
multimodal-pretraining
large-language-models
large-vision-language-models
memory-efficient-tuning
parameter-efficient-fine-tuning
large-vision-models
Updated Mar 9, 2024