My evaluation scores differ from the documented long-text evaluation results by about 20 points; I cannot reproduce the documented scores.
Should the `max_seq_len` and `max_out_len` parameters be modified in any way?
Other information
No response
For optimal performance, it is advisable to set `max_seq_len` to the highest value feasible, such as 32768 or even higher if possible. As for `max_out_len`, it typically has a preset default value in the dataset configuration; you can adjust it to 256 or simply retain the default.
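The advice above can be sketched as a change to the reporter's config (a hypothetical variant of the posted `vllm_chatglm2_6b_32k.py`; the parameter values here are the maintainer's suggestions, not verified reproductions of the documented scores):

```python
from opencompass.models import VLLM

models = [
    dict(
        type=VLLM,
        abbr='chatglm2-6b-32k-vllm',
        path='THUDM/chatglm2-6b-32k',
        # Suggested: 256, or drop this key to fall back to the
        # dataset config's default.
        max_out_len=256,
        # Suggested: match the model's 32k context window rather
        # than truncating long-text inputs at 4096.
        max_seq_len=32768,
        batch_size=32,
        generation_kwargs=dict(temperature=0),
        run_cfg=dict(num_gpus=1, num_procs=1),
    )
]
```

With `max_seq_len=4096`, LongBench/LEval inputs beyond 4096 tokens are truncated, which plausibly accounts for a large score gap on long-context datasets.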
Prerequisite
Type
I'm evaluating with the officially supported tasks/models/datasets.
Environment
python 3.10.1
OpenCompass 0.2.3
vllm 0.2.3
Reproduces the problem - code/configuration sample
configs/models/chatglm/vllm_chatglm2_6b_32k.py
from opencompass.models import VLLM

models = [
    dict(
        type=VLLM,
        abbr='chatglm2-6b-32k-vllm',
        path='THUDM/chatglm2-6b-32k',
        max_out_len=512,
        max_seq_len=4096,
        batch_size=32,
        generation_kwargs=dict(temperature=0),
        run_cfg=dict(num_gpus=1, num_procs=1),
    )
]
Reproduces the problem - command or script
python run.py --model vllm_chatglm2_6b_32k --datasets longbench leval
Reproduces the problem - error message
My evaluation scores differ from the documented long-text evaluation results by about 20 points; I cannot reproduce the documented scores.
Other information
No response