Benchmark existing techniques using evaluation harness #7625

mrm1001 · 2024-05-02T08:42:10Z

Context on benchmark work

Goal 1: give users practical guidance on which techniques to try on their dataset or use case.
Goal 2: show that there is no “silver bullet” solution; the best technique depends on the dataset and use case, and Haystack can support them all.
Goal 3: showcase the advanced evaluation/experimentation API (the most advanced compared to competitors).
This is not a research paper, so it should not be too “academic” (i.e. not too restrictive in the metrics or datasets used, and not meant to be peer-reviewed or submitted to an academic conference).
Datasets

mrm1001 added the labels "P1" (High priority, add to the next sprint) and "topic:benchmark" on May 2, 2024

mrm1001 assigned davidsbatista May 3, 2024

mrm1001 mentioned this issue May 22, 2024

Create benchmarks for RAG on a multiple industry datasets #7728

Open