Update README for NPU inference #661

Open · wants to merge 1 commit into main

Conversation

wangshuai09

I've verified ChatGLM2-6B inference on a HUAWEI Ascend NPU device:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
# Load the model onto the Ascend NPU instead of a GPU
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, device='npu')
model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])  # "Hello"
print(response)
response, history = model.chat(tokenizer, "今天的天气怎么样", history=[])  # "How is the weather today?"
print(response)

Outputs:

Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:25<00:00,  3.71s/it]
你好👋!我是人工智能助手 ChatGLM2-6B,很高兴见到你,欢迎问我任何问题。 (Hello 👋! I am the AI assistant ChatGLM2-6B. Nice to meet you; feel free to ask me anything.)
抱歉,作为一个人工智能语言模型,我没有实时的气象数据或访问权限。建议您查看当地的天气预报或使用天气应用程序来获取最准确的天气信息。 (Sorry, as an AI language model I do not have real-time weather data or access to it. I suggest checking your local forecast or a weather app for the most accurate information.)
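
For completeness, a quick environment check before loading the model can confirm that the Ascend device is visible to PyTorch. This is a minimal sketch, not part of this PR; it assumes the torch_npu (Ascend PyTorch adapter) package and the CANN toolkit are installed:

import torch
import torch_npu  # Ascend adapter; registers the "npu" device type with PyTorch

# Fail early if no Ascend NPU is visible to the process
assert torch.npu.is_available(), "No Ascend NPU detected; check the CANN / torch_npu installation"
print("Visible NPUs:", torch.npu.device_count())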

This PR updates the README with Ascend NPU support instructions to help people who want to use Ascend devices for ChatGLM2-6B inference.
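
If the README addition also mirrors the repository's streaming example, a hedged sketch using the same NPU-loaded model could look like the following (stream_chat is part of ChatGLM2-6B's bundled modeling code; nothing below is taken from this PR's diff):

# Streaming generation on the NPU-loaded model; assumes `model` and `tokenizer`
# were created as in the snippet above with device='npu'
history = []
for response, history in model.stream_chat(tokenizer, "你好", history=history):  # "Hello"
    pass  # `response` grows incrementally on each yield
print(response)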
