About evaluating the llama3 base model #3447

Closed · 1 task done
QingChengLineOne opened this issue Apr 25, 2024 · 5 comments
Labels
solved This problem has been already solved.

Comments

@QingChengLineOne

Reminder

  • I have read the README and searched the existing issues.

Reproduction

CUDA_VISIBLE_DEVICES=0,1 python ../../src/evaluate.py \
    --model_name_or_path /public/model/Meta-Llama-3-8B \
    --template llama3 \
    --finetuning_type lora \
    --task /public/zzy/LLaMA-Factory/evaluation/mmlu \
    --split validation \
    --lang en \
    --n_shot 5 \
    --batch_size 4
Evaluation results:
Average: 23.74
STEM: 24.75
Social Sciences: 22.15
Humanities: 22.97
Other: 25.51

Expected behavior

Why is the average only 23.74?

System Info

No response

Others

No response

@hiyouga added the pending (This problem is yet to be addressed.) label on Apr 25, 2024
@codemayq
Collaborator

Try using the fewshot template. My results are:
Average: 61.67
STEM: 48.18
Social Sciences: 74.15
Humanities: 55.64
Other: 70.67
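
For reference, applying that suggestion to the reproduction command from the report above would look roughly like this; only --template changes, the other flags and paths are the reporter's own:

CUDA_VISIBLE_DEVICES=0,1 python ../../src/evaluate.py \
    --model_name_or_path /public/model/Meta-Llama-3-8B \
    --template fewshot \
    --finetuning_type lora \
    --task /public/zzy/LLaMA-Factory/evaluation/mmlu \
    --split validation \
    --lang en \
    --n_shot 5 \
    --batch_size 4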

@codemayq added the solved (This problem has been already solved.) label and removed the pending (This problem is yet to be addressed.) label on Apr 28, 2024
@QingChengLineOne
Author

Try using the fewshot template. My results are: Average: 61.67 STEM: 48.18 Social Sciences: 74.15 Humanities: 55.64 Other: 70.67

Could you share the script file you used?

@QingChengLineOne
Author

Try using the fewshot template. My results are: Average: 61.67 STEM: 48.18 Social Sciences: 74.15 Humanities: 55.64 Other: 70.67

So in other words, if I want to evaluate the llama3 base model, I should set --template to fewshot, and use llama3 for the instruct version?

@codemayq
Collaborator

python ./src/evaluate.py \
    --model_name_or_path /media/codingma/LLM/llama3/Meta-Llama-3-8B \
    --template fewshot \
    --task mmlu \
    --split validation \
    --lang en \
    --n_shot 5 \
    --batch_size 4
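
Conversely, for the instruct model the evaluation presumably keeps --template llama3, as asked above; a minimal sketch under that assumption (the -Instruct path below is a placeholder, not one from this thread):

python ./src/evaluate.py \
    --model_name_or_path /path/to/Meta-Llama-3-8B-Instruct \
    --template llama3 \
    --task mmlu \
    --split validation \
    --lang en \
    --n_shot 5 \
    --batch_size 4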

@QingChengLineOne
Author

python ./src/evaluate.py --model_name_or_path /media/codingma/LLM/llama3/Meta-Llama-3-8B --template fewshot --task mmlu --split validation --lang en --n_shot 5 --batch_size 4

Thank you very much!
