Scores differ substantially from the officially reported numbers when evaluating chatglm3 and qwen1.5 on ceval #2818
Unanswered
hnxtcyj123 asked this question in Q&A
Replies: 0 comments
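I evaluated chatglm3 and qwen1.5 on ceval, but the scores I get are noticeably lower than the officially reported ones.
For example, when evaluating chatglm3, my evaluation script was:
With this setup the ceval score is 65.75, whereas the officially reported score is 69. When I set --n_shot to 5, the score actually dropped further (62.93). What could be causing this? Is there something wrong with my script settings?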