AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
Updated May 30, 2024 - Python
LLaVA-Plus: Large Language and Vision Assistants that Plug and Learn to Use Skills
A Framework of Small-scale Large Multimodal Models
A collection of resources on applications of multi-modal learning in medical imaging.
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
[CVPR 2024 Highlight] OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation
Embed arbitrary modalities (images, audio, documents, etc.) into large language models.
Open Platform for Embodied Agents
This repo contains evaluation code for the paper "Are We on the Right Way for Evaluating Large Vision-Language Models?"
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
An open-source implementation of LLaVA-NeXT.
Evaluation framework for the paper "VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding?"
A curated list of awesome Multimodal studies.
The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
A curated collection of multi-modal large language model papers and projects, along with popular training strategies, e.g., PEFT, LoRA.
The official implementation of "Instruction-Guided Visual Masking"
Code for "Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning"
MixEval, a ground-truth-based dynamic benchmark derived from off-the-shelf benchmark mixtures. It evaluates LLMs with a model ranking that closely tracks Chatbot Arena (0.96 correlation) while running locally and quickly (6% of the time and cost of running MMLU), and its queries are updated every month to avoid contamination.