Examining how large language models (LLMs) perform across various synthetic regression tasks when given (input, output) examples in their context, without any parameter updates.
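As a rough illustration (not code from the repository itself), such an evaluation typically serializes (input, output) pairs into the prompt and asks the model to complete the output for a held-out query, relying purely on in-context learning. The linear task `y = 2x + 1` and the prompt format below are hypothetical choices for this sketch:

```python
# Minimal sketch of building a few-shot prompt for in-context regression.
# The task function and "Input:/Output:" format are illustrative assumptions.
import random

def make_regression_prompt(n_examples: int = 20, seed: int = 0) -> tuple[str, float]:
    """Build a few-shot prompt for a synthetic linear task y = 2x + 1."""
    rng = random.Random(seed)
    lines = []
    for _ in range(n_examples):
        x = round(rng.uniform(-10, 10), 2)
        y = round(2 * x + 1, 2)  # ground-truth function the LLM must infer from context
        lines.append(f"Input: {x}\nOutput: {y}")
    # Query point: the model predicts its output from the in-context examples
    # alone, with no parameter updates.
    x_query = round(rng.uniform(-10, 10), 2)
    lines.append(f"Input: {x_query}\nOutput:")
    return "\n".join(lines), 2 * x_query + 1

prompt, expected = make_regression_prompt()
print(prompt)
print(f"# expected completion ~ {expected}")
```

The model's completion can then be parsed as a number and scored against the ground truth, e.g. with mean squared error over many query points.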
A benchmark for prompt injection detection systems.
LLM-KG-Bench is a framework and task collection for the automated benchmarking of Large Language Models (LLMs) on Knowledge Graph (KG)-related tasks.
FM-Leaderboard-er lets you create a leaderboard to find the best LLM/prompt for your own business use case, based on your own data, tasks, and prompts.