
large-multimodal-models

Here are 19 public repositories matching this topic...

OpenAdapt

MixEval

MixEval is a ground-truth-based dynamic benchmark derived from off-the-shelf benchmark mixtures. It evaluates LLMs with a highly capable model ranking (0.96 correlation with Chatbot Arena) while running locally and quickly (6% of the time and cost of running MMLU), and its queries are updated every month to avoid contamination.

  • Updated May 31, 2024
  • Python
