
LLM Evaluation

This project evaluates the performance of Large Language Models on different NLP tasks in combination with various prompts.

Environment

Create a virtualenv and install the requirements:

make virtualenv
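
If you prefer not to use make, the equivalent manual steps are roughly as follows (a sketch that assumes the dependencies live in a requirements.txt file, which this README does not confirm):

python -m venv .venv                # create the virtual environment
source .venv/bin/activate           # activate it
pip install -r requirements.txt     # install dependencies (assumed file name)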

Then pull the data:

dvc pull

Note that for DVC to work you need access to the Mantis AWS account.
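
If you have been granted access, one common way to expose the credentials to DVC is through the standard AWS environment variables. This sketch assumes the DVC remote is an S3 bucket, which the README does not state explicitly:

export AWS_ACCESS_KEY_ID=<your-access-key>        # credentials issued by Mantis
export AWS_SECRET_ACCESS_KEY=<your-secret-key>
dvc pull                                          # should now reach the remote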

Data

To be filled

About

Code that accompanies the PyData New York (2022) talk: Addressing the sensitivity of Large language models
