A state-of-the-art open visual language model | multimodal pre-trained model
Commanding robots using only language-model prompts
Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"
Scene and animal attribute retrieval from camera trap data with domain-adapted vision-language models
Chain of Images for Intuitively Reasoning
[NAACL 2024] Official implementation of the paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"
Data and code for the paper "IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models"
Code for the paper "Towards Concept-based Interpretability of Skin Lesion Diagnosis using Vision-Language Models", ISBI 2024.
Universal Adversarial Perturbations for Vision-Language Pre-trained Models
CLI for converting UForm models to CoreML.