
How to Check LLM's Performance #362

Answered by ttthree
DevanshuBrahmbhatt asked this question in Q&A
The answer is "it depends on your scenario": you need to pick a metric and define the dataset(s) to run it on and calculate it.
For example, if you're using the LLM for classification or named entity recognition, you can use simple, classical metrics like accuracy or % match. These usually have a ground truth to compare against.
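
As a rough illustration of the classical case, here is a minimal sketch of an accuracy / % match calculation. The function name `accuracy`, the lowercase/strip normalization, and the sample predictions and labels are all assumptions made up for this example, not part of any particular library or of the original answer.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], ground_truth: Sequence[str]) -> float:
    """Fraction of predictions that match the label after light normalization."""
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not ground_truth:
        raise ValueError("dataset is empty")
    # Assumption: case and surrounding whitespace should not count as a mismatch.
    matches = sum(
        p.strip().lower() == g.strip().lower()
        for p, g in zip(predictions, ground_truth)
    )
    return matches / len(ground_truth)


# Hypothetical classification outputs and labels, for illustration only.
preds = ["positive", "Negative ", "neutral", "positive"]
labels = ["positive", "negative", "negative", "positive"]
print(accuracy(preds, labels))  # 3 of 4 match -> 0.75
```

Whether to normalize before comparing depends on the task; for named entity recognition you would more likely compare extracted spans than whole strings.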

If you're using the LLM for more complex scenarios, like building a chatbot, then a new set of metrics needs to be defined, things like:

  • Whether the chatbot answers questions based on facts and the provided context (people may call this groundedness).
  • How intelligent/smart the answer looks, e.g. whether it is relevant to the question, creative, and able to impress the user.
  • How eff…
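
To make the groundedness idea from the list above a bit more concrete, here is a very crude sketch that scores an answer by how many of its words also appear in the provided context. This is a swapped-in lexical-overlap proxy, not how groundedness is usually measured in practice (that typically involves human or model-based judgment); the function names and the sample strings are hypothetical.

```python
import re
from typing import Set


def token_set(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_groundedness(answer: str, context: str) -> float:
    """Share of the answer's distinct tokens that also occur in the context.

    Assumption: word overlap is only a rough stand-in for groundedness; it
    cannot detect paraphrases or contradictions.
    """
    answer_tokens = token_set(answer)
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & token_set(context)) / len(answer_tokens)


# Hypothetical context and answers, for illustration only.
context = "The Eiffel Tower is in Paris and was completed in 1889."
grounded = "The Eiffel Tower was completed in 1889."
ungrounded = "It is located in Berlin."

print(overlap_groundedness(grounded, context))
print(overlap_groundedness(ungrounded, context))
```

A score like this is cheap to run over a whole dataset, which makes it usable as a first filter before spending effort on more careful judgments.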

Replies: 3 comments 3 replies

Answer selected by 0mza987