Evaluate your LLM's response with Prometheus and GPT4 💯
This is the repository for the survey of Bias and Fairness in Information Retrieval (IR) with LLMs.
Code and data for Koo et al.'s ACL 2024 paper "Benchmarking Cognitive Biases in Large Language Models as Evaluators"
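As an illustrative sketch (not taken from any of the repositories listed above), an LLM-as-evaluator setup typically asks a judge model such as GPT-4 to score another model's response against a rubric. The model name, prompt wording, and scoring scale below are assumptions for demonstration only.

```python
# Minimal LLM-as-judge sketch: score a model response with GPT-4 via the OpenAI API.
# The judge model, rubric, and 1-5 scale are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

JUDGE_PROMPT = """You are an impartial evaluator.
Rate the RESPONSE to the QUESTION on a 1-5 scale for factual accuracy and helpfulness.
Reply with a single integer.

QUESTION: {question}
RESPONSE: {response}
"""

def judge(question: str, response: str) -> int:
    """Return the judge model's 1-5 score for a response."""
    completion = client.chat.completions.create(
        model="gpt-4o",  # assumed judge model; any chat model can be substituted
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(question=question, response=response)}],
        temperature=0,
    )
    return int(completion.choices[0].message.content.strip())

if __name__ == "__main__":
    print(judge("What is the capital of France?", "The capital of France is Paris."))
```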