Replies: 1 comment 1 reply
-
Debugging can be tedious. You can compare the results after each op with the reference implementation, or open a PR so that more people can take a look.

From a quick look, the following code looks incorrect — the last argument of `ggml_view_2d` is a byte offset, but `split_point` is an element count:

```diff
+ // create two view tensors representing the two halves of the split
+ int64_t split_point = cur->ne[0] / 2;
+ struct ggml_tensor * x0 = ggml_cont(ctx, ggml_view_2d(ctx, cur, split_point, cur->ne[1], cur->nb[1], 0));
+ struct ggml_tensor * x1 = ggml_cont(ctx, ggml_view_2d(ctx, cur, split_point, cur->ne[1], cur->nb[1], split_point));
```
-
Hello everyone, I'm currently adding support for the chatglm3-6b model to llama.cpp. I've done a quick check of the inference algorithm and didn't find any issues; however, the inference results are all over the place.
How should I go about debugging this? Are there any tricks for pinpointing the issue when inference goes wrong?
My log and code changes are attached below:
main.log
0001-feature-add-chatglm-support.patch.txt