
[Question]: chatglmv2 fails to initialize correctly #8352

Closed
yidu0924 opened this issue Apr 30, 2024 · 3 comments
Labels: question (Further information is requested)

Comments

@yidu0924

Please describe your question

Some weights of ChatGLMv2ForCausalLM were not initialized from the model checkpoint at /home/.paddlenlp/models/THUDM/chatglm2-6b and are newly initialized: ['encoder.layers.2.self_attention.key.weight', 'encoder.layers.0.self_attention.value.bias', 'encoder.layers.0.self_attention.key.weight', 'encoder.layers.11.self_attention.key.weight', 'encoder.layers.15.self_attention.query.weight', 'encoder.layers.24.self_attention.value.bias', 'encoder.layers.7.self_attention.key.weight', 'encoder.layers.24.self_attention.key.weight', 'encoder.layers.19.self_attention.query.weight', 'encoder.layers.11.self_attention.query.weight', 'encoder.layers.20.self_attention.key.bias', 'encoder.layers.23.self_attention.query.bias', 'encoder.layers.25.self_attention.query.weight', 'encoder.layers.4.self_attention.key.bias', 'encoder.layers.6.self_attention.value.bias', 'encoder.layers.16.self_attention.value.bias', 'encoder.layers.17.self_attention.key.bias', 'encoder.layers.21.self_attention.query.weight', 'encoder.layers.24.self_attention.query.weight', 'encoder.layers.26.self_attention.query.bias', 'encoder.layers.23.self_attention.key.bias', 'encoder.layers.23.self_attention.query.weight', 'encoder.layers.21.self_attention.key.weight', 'encoder.layers.25.self_attention.key.weight', 'encoder.layers.27.self_attention.value.bias', 'encoder.layers.2.self_attention.key.bias', 'encoder.layers.25.self_attention.value.weight', 'encoder.layers.20.self_attention.value.bias', 'encoder.layers.18.self_attention.key.weight', 'encoder.layers.12.self_attention.query.bias', 'encoder.layers.14.self_attention.query.bias', 'encoder.layers.5.self_attention.key.bias', 'encoder.layers.24.self_attention.value.weight', 'encoder.layers.17.self_attention.query.weight', 'encoder.layers.7.self_attention.value.weight', 'encoder.layers.18.self_attention.value.weight', 'encoder.layers.22.self_attention.query.weight', 'encoder.layers.12.self_attention.key.weight', 'encoder.layers.17.self_attention.value.bias', 'encoder.layers.13.self_attention.query.bias', 'encoder.layers.22.self_attention.key.bias', 'encoder.layers.1.self_attention.key.bias', 'encoder.layers.5.self_attention.key.weight', 'encoder.layers.26.self_attention.query.weight', 'encoder.layers.12.self_attention.query.weight', 'encoder.layers.0.self_attention.query.weight', 'encoder.layers.16.self_attention.query.weight', 'encoder.layers.27.self_attention.query.bias', 'encoder.layers.3.self_attention.query.weight', 'encoder.layers.25.self_attention.key.bias', 'encoder.layers.1.self_attention.query.weight', 'encoder.layers.5.self_attention.value.bias', 'encoder.layers.21.self_attention.query.bias', 'encoder.layers.17.self_attention.value.weight', 'encoder.layers.10.self_attention.key.weight', 'encoder.layers.22.self_attention.key.weight', 'encoder.layers.19.self_attention.key.bias', 'encoder.layers.24.self_attention.query.bias', 'encoder.layers.24.self_attention.key.bias', 'encoder.layers.21.self_attention.key.bias', 'encoder.layers.22.self_attention.query.bias', 'encoder.layers.6.self_attention.key.weight', 'encoder.layers.4.self_attention.value.bias', 'encoder.layers.13.self_attention.query.weight', 'encoder.layers.11.self_attention.query.bias', 'encoder.layers.2.self_attention.value.weight', 'encoder.layers.9.self_attention.key.bias', 'encoder.layers.26.self_attention.key.bias', 'encoder.layers.2.self_attention.query.weight', 'encoder.layers.3.self_attention.value.weight', 'encoder.layers.15.self_attention.value.bias', 'encoder.layers.22.self_attention.value.bias', 
'encoder.layers.27.self_attention.key.weight', 'encoder.layers.13.self_attention.value.weight', 'encoder.layers.1.self_attention.value.weight', 'encoder.layers.27.self_attention.query.weight', 'encoder.layers.14.self_attention.query.weight', 'encoder.layers.9.self_attention.query.weight', 'encoder.layers.25.self_attention.query.bias', 'encoder.layers.12.self_attention.value.weight', 'encoder.layers.4.self_attention.query.weight', 'encoder.layers.17.self_attention.query.bias', 'encoder.layers.14.self_attention.value.weight', 'encoder.layers.10.self_attention.query.weight', 'encoder.layers.18.self_attention.query.weight', 'encoder.layers.3.self_attention.query.bias', 'encoder.layers.8.self_attention.query.bias', 'encoder.layers.2.self_attention.value.bias', 'encoder.layers.9.self_attention.query.bias', 'encoder.layers.27.self_attention.value.weight', 'encoder.layers.1.self_attention.value.bias', 'encoder.layers.10.self_attention.query.bias', 'encoder.layers.7.self_attention.value.bias', 'encoder.layers.9.self_attention.value.bias', 'encoder.layers.27.self_attention.key.bias', 'encoder.layers.5.self_attention.query.weight', 'encoder.layers.17.self_attention.key.weight', 'encoder.layers.25.self_attention.value.bias', 'encoder.layers.8.self_attention.query.weight', 'encoder.layers.19.self_attention.query.bias', 'encoder.layers.22.self_attention.value.weight', 'encoder.layers.12.self_attention.value.bias', 'encoder.layers.20.self_attention.query.weight', 'encoder.layers.12.self_attention.key.bias', 'encoder.layers.26.self_attention.value.bias', 'encoder.layers.0.self_attention.value.weight', 'encoder.layers.8.self_attention.value.weight', 'encoder.layers.11.self_attention.value.bias', 'encoder.layers.7.self_attention.query.bias', 'encoder.layers.23.self_attention.key.weight', 'encoder.layers.21.self_attention.value.weight', 'encoder.layers.14.self_attention.key.weight', 'encoder.layers.9.self_attention.value.weight', 'encoder.layers.8.self_attention.key.weight', 'encoder.layers.7.self_attention.key.bias', 'encoder.layers.13.self_attention.key.bias', 'encoder.layers.6.self_attention.query.weight', 'encoder.layers.11.self_attention.key.bias', 'encoder.layers.3.self_attention.key.weight', 'encoder.layers.15.self_attention.value.weight', 'encoder.layers.3.self_attention.key.bias', 'encoder.layers.9.self_attention.key.weight', 'encoder.layers.16.self_attention.key.weight', 'encoder.layers.10.self_attention.key.bias', 'encoder.layers.1.self_attention.query.bias', 'encoder.layers.5.self_attention.value.weight', 'encoder.layers.20.self_attention.query.bias', 'encoder.layers.18.self_attention.query.bias', 'encoder.layers.20.self_attention.key.weight', 'encoder.layers.14.self_attention.value.bias', 'encoder.layers.13.self_attention.key.weight', 'encoder.layers.4.self_attention.value.weight', 'encoder.layers.7.self_attention.query.weight', 'encoder.layers.16.self_attention.value.weight', 'encoder.layers.10.self_attention.value.bias', 'encoder.layers.21.self_attention.value.bias', 'encoder.layers.23.self_attention.value.weight', 'encoder.layers.26.self_attention.key.weight', 'encoder.layers.18.self_attention.value.bias', 'encoder.layers.6.self_attention.query.bias', 'encoder.layers.8.self_attention.value.bias', 'encoder.layers.18.self_attention.key.bias', 'encoder.layers.4.self_attention.query.bias', 'encoder.layers.3.self_attention.value.bias', 'encoder.layers.4.self_attention.key.weight', 'encoder.layers.20.self_attention.value.weight', 'encoder.layers.8.self_attention.key.bias', 
'encoder.layers.19.self_attention.value.bias', 'encoder.layers.11.self_attention.value.weight', 'encoder.layers.6.self_attention.value.weight', 'encoder.layers.0.self_attention.query.bias', 'encoder.layers.5.self_attention.query.bias', 'encoder.layers.2.self_attention.query.bias', 'encoder.layers.15.self_attention.key.weight', 'encoder.layers.0.self_attention.key.bias', 'encoder.layers.26.self_attention.value.weight', 'encoder.layers.19.self_attention.key.weight', 'encoder.layers.13.self_attention.value.bias', 'encoder.layers.19.self_attention.value.weight', 'encoder.layers.1.self_attention.key.weight', 'encoder.layers.23.self_attention.value.bias', 'encoder.layers.15.self_attention.query.bias', 'encoder.layers.14.self_attention.key.bias', 'encoder.layers.6.self_attention.key.bias', 'encoder.layers.16.self_attention.query.bias', 'encoder.layers.10.self_attention.value.weight', 'encoder.layers.15.self_attention.key.bias', 'encoder.layers.16.self_attention.key.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

The weights file is the downloaded model_state.pdparams, but the model does not initialize correctly, so it cannot produce correct predictions.
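For context, the loading step is essentially the following (a minimal sketch; the local cache path is taken from the warning above, and AutoModelForCausalLM is assumed as the entry point since the exact script is not included in the report):

from paddlenlp.transformers import AutoModelForCausalLM

# Assumed reproduction: load the ChatGLMv2 checkpoint from the local
# cache path that appears in the warning above (adjust to your location).
model = AutoModelForCausalLM.from_pretrained("/home/.paddlenlp/models/THUDM/chatglm2-6b")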

yidu0924 added the question (Further information is requested) label Apr 30, 2024
@w5688414
Contributor

What environment are you using? I tested it and didn't see any problem:

Python 3.9.16 (main, Dec  7 2022, 01:11:58) 
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from paddlenlp.transformers import AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("THUDM/chatglm2-6b")
(…)/community/THUDM/chatglm2-6b/config.json: 100%|███████████████████████████████████████████████████████████| 885/885 [00:00<00:00, 141kB/s]
[2024-04-30 14:16:22,828] [    INFO] - We are using <class 'paddlenlp.transformers.chatglm_v2.modeling.ChatGLMv2ForCausalLM'> to load 'THUDM/chatglm2-6b'.
[2024-04-30 14:16:22,828] [    INFO] - Loading configuration file /root/.paddlenlp/models/THUDM/chatglm2-6b/config.json
(…)y/THUDM/chatglm2-6b/model_state.pdparams: 100%|██████████████████████████████████████████████████████| 12.5G/12.5G [02:54<00:00, 71.4MB/s]
[2024-04-30 14:19:18,048] [    INFO] - Loading weights file from cache at /root/.paddlenlp/models/THUDM/chatglm2-6b/model_state.pdparams
[2024-04-30 14:19:30,138] [    INFO] - Loaded weights file from disk, setting weights to model.
W0430 14:19:30.150593  2094 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.0, Driver API Version: 11.8, Runtime API Version: 11.8
W0430 14:19:30.174417  2094 gpu_resources.cc:164] device: 0, cuDNN Version: 8.6.
[2024-04-30 14:19:45,617] [    INFO] - All model checkpoint weights were used when initializing ChatGLMv2ForCausalLM.

[2024-04-30 14:19:45,618] [    INFO] - All the weights of ChatGLMv2ForCausalLM were initialized from the model checkpoint at THUDM/chatglm2-6b.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMv2ForCausalLM for predictions without further training.
[2024-04-30 14:19:45,642] [    INFO] - Generation config file not found, using a generation config created from the model config.

@yidu0924
Author

Hello, my versions are as follows:
paddle: 2.6.1
paddlenlp: 2.6.1
cuda: 12.0
ubuntu: 22.04
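For reference, these versions can be printed directly (a minimal sketch; paddle.version.cuda() reports the CUDA version Paddle was built against, which may differ from the system driver version):

import sys
import paddle
import paddlenlp

# Print the interpreter and framework versions relevant to this issue.
print("python   :", sys.version.split()[0])
print("paddle   :", paddle.__version__)
print("paddlenlp:", paddlenlp.__version__)
print("cuda     :", paddle.version.cuda())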

@yidu0924
Author

Solved, thanks! Downgrading the Python version to 3.9.16 fixed it.
