vision-language-pretraining

This repository provides a comprehensive collection of research papers focused on multimodal representation learning, all of which have been cited and discussed in the survey just accepted https://dl.acm.org/doi/abs/10.1145/3617833 .

cross-modal multimodal-deep-learning multimodal-datasets transformer-models multimodal-pre-trained-model vision-language-pretraining multimodal-applications multimodal-pretext

Updated Oct 19, 2023

megvii-research / protoclip

Star

📍 Official pytorch implementation of paper "ProtoCLIP: Prototypical Contrastive Language Image Pretraining" (IEEE TNNLS)

self-supervised-learning contrastive-learning vision-language-pretraining

Updated Nov 8, 2023
Python

Surrey-UPLab / Recognize-Any-Regions

Star

Recognize Any Regions

open-world object-detection zero-shot instance-segmentation auto-labeling vision-language-pretraining open-vocabulary vision-language-model multimodal-representation-learning vision-foundation-model vision-language-foundation-model

Updated Nov 22, 2023
Python

yiren-jian / BLIText

Star

[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training

multimodal-deep-learning vision-language-transformer vision-language-pretraining

Updated Dec 5, 2023
Python

LooperXX / ManagerTower

Star

Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning

vision-language multi-modal-learning vision-language-pretraining vision-language-learning

Updated Dec 12, 2023
Python

omipan / svl_adapter

Star

SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models

self-supervised-learning vision-language-pretraining

Updated Jan 11, 2024
Python

YyzHarry / vlm-fairness

Star

Demographic Bias of Vision-Language Foundation Models in Medical Imaging

medical-imaging fairness subpopulation algorithmic-fairness bias-mitigation ood-generalization foundation-models vision-language-pretraining vision-language-model

Updated Feb 23, 2024
Python

deepseek-ai / DeepSeek-VL

Star

DeepSeek-VL: Towards Real-World Vision-Language Understanding

foundation-models vision-language-pretraining vision-language-model

Updated Apr 24, 2024
Python

Improve this page

Add a description, image, and links to the vision-language-pretraining topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the vision-language-pretraining topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

vision-language-pretraining

Here are 28 public repositories matching this topic...

ttengwang / VLMixer

Sense-GVT / DeCLIP

vgthengane / Continual-CLIP

TencentARC / FLM

alinlab / b2t

sail-sg / ptp

ChenDelong1999 / ITRA

ArrowLuo / SegCLIP

TXH-mercury / COSA

adarobustness / adaptation_robustness

Zoky-2020 / SGA

ahmdtaha / distributed_sigmoid_loss

marslanm / Multimodality-Representation-Learning

megvii-research / protoclip

Surrey-UPLab / Recognize-Any-Regions

yiren-jian / BLIText

LooperXX / ManagerTower

omipan / svl_adapter

YyzHarry / vlm-fairness

deepseek-ai / DeepSeek-VL

Improve this page

Add this topic to your repo