About evaluating the llama3 base model #3447

Closed · 1 task done
QingChengLineOne opened this issue Apr 25, 2024 · 5 comments
Labels
solved This problem has been already solved.

Comments

@QingChengLineOne

Reminder

  • I have read the README and searched the existing issues.

Reproduction

CUDA_VISIBLE_DEVICES=0,1 python ../../src/evaluate.py \
    --model_name_or_path /public/model/Meta-Llama-3-8B \
    --template llama3 \
    --finetuning_type lora \
    --task /public/zzy/LLaMA-Factory/evaluation/mmlu \
    --split validation \
    --lang en \
    --n_shot 5 \
    --batch_size 4
Evaluation results:
Average: 23.74
STEM: 24.75
Social Sciences: 22.15
Humanities: 22.97
Other: 25.51

Expected behavior

Why is the average only 23.74?

System Info

No response

Others

No response

@hiyouga added the pending (This problem is yet to be addressed.) label on Apr 25, 2024
@codemayq
Collaborator

Try using the fewshot template. My results are:
Average: 61.67
STEM: 48.18
Social Sciences: 74.15
Humanities: 55.64
Other: 70.67
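
For reference, applying that suggestion to the reproduction command from the report above would look roughly like this; only --template changes, the other flags and paths are the reporter's own:

CUDA_VISIBLE_DEVICES=0,1 python ../../src/evaluate.py \
    --model_name_or_path /public/model/Meta-Llama-3-8B \
    --template fewshot \
    --finetuning_type lora \
    --task /public/zzy/LLaMA-Factory/evaluation/mmlu \
    --split validation \
    --lang en \
    --n_shot 5 \
    --batch_size 4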

@codemayq added the solved (This problem has been already solved.) label and removed the pending (This problem is yet to be addressed.) label on Apr 28, 2024
@QingChengLineOne
Author

Try using the fewshot template. My results are: Average: 61.67 STEM: 48.18 Social Sciences: 74.15 Humanities: 55.64 Other: 70.67

Could you share the script file you used?

@QingChengLineOne
Author

Try using the fewshot template. My results are: Average: 61.67 STEM: 48.18 Social Sciences: 74.15 Humanities: 55.64 Other: 70.67

So in other words, if I want to evaluate the llama3 base model, I should set --template to fewshot, and use llama3 for the instruct version?

@codemayq
Collaborator

python ./src/evaluate.py \
    --model_name_or_path /media/codingma/LLM/llama3/Meta-Llama-3-8B \
    --template fewshot \
    --task mmlu \
    --split validation \
    --lang en \
    --n_shot 5 \
    --batch_size 4
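
Conversely, for the instruct model the evaluation presumably keeps --template llama3, as asked above; a minimal sketch under that assumption (the -Instruct path below is a placeholder, not one from this thread):

python ./src/evaluate.py \
    --model_name_or_path /path/to/Meta-Llama-3-8B-Instruct \
    --template llama3 \
    --task mmlu \
    --split validation \
    --lang en \
    --n_shot 5 \
    --batch_size 4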

@QingChengLineOne
Author

python ./src/evaluate.py --model_name_or_path /media/codingma/LLM/llama3/Meta-Llama-3-8B --template fewshot --task mmlu --split validation --lang en --n_shot 5 --batch_size 4

Thank you very much!
