Update README for NPU inference #661

Open · wants to merge 1 commit into main

Conversation

wangshuai09

I've verified ChatGLM2-6B inference on a HUAWEI Ascend NPU device:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
# Load the model onto the Ascend NPU instead of a GPU
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True, device='npu')
model = model.eval()
response, history = model.chat(tokenizer, "你好", history=[])  # "Hello"
print(response)
response, history = model.chat(tokenizer, "今天的天气怎么样", history=[])  # "How is the weather today?"
print(response)

Outputs:

Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:25<00:00,  3.71s/it]
你好👋!我是人工智能助手 ChatGLM2-6B,很高兴见到你,欢迎问我任何问题。 (Hello 👋! I am the AI assistant ChatGLM2-6B. Nice to meet you; feel free to ask me anything.)
抱歉,作为一个人工智能语言模型,我没有实时的气象数据或访问权限。建议您查看当地的天气预报或使用天气应用程序来获取最准确的天气信息。 (Sorry, as an AI language model I do not have real-time weather data or access to it. I suggest checking your local forecast or a weather app for the most accurate information.)
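
For completeness, a quick environment check before loading the model can confirm that the Ascend device is visible to PyTorch. This is a minimal sketch, not part of this PR; it assumes the torch_npu (Ascend PyTorch adapter) package and the CANN toolkit are installed:

import torch
import torch_npu  # Ascend adapter; registers the "npu" device type with PyTorch

# Fail early if no Ascend NPU is visible to the process
assert torch.npu.is_available(), "No Ascend NPU detected; check the CANN / torch_npu installation"
print("Visible NPUs:", torch.npu.device_count())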

This PR updates the README with Ascend NPU support instructions to help people who want to use Ascend devices for ChatGLM2-6B inference.
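
If the README addition also mirrors the repository's streaming example, a hedged sketch using the same NPU-loaded model could look like the following (stream_chat is part of ChatGLM2-6B's bundled modeling code; nothing below is taken from this PR's diff):

# Streaming generation on the NPU-loaded model; assumes `model` and `tokenizer`
# were created as in the snippet above with device='npu'
history = []
for response, history in model.stream_chat(tokenizer, "你好", history=history):  # "Hello"
    pass  # `response` grows incrementally on each yield
print(response)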
