
cannot use RWKV models #121

Open
rozek opened this issue Nov 9, 2023 · 16 comments

Comments

@rozek

rozek commented Nov 9, 2023

I just tried to use the current version of "llama-node" with the "rwkv.cpp" backend and failed.

The link in the docs where I should be able to download RWKV models leads nowhere.

Since I could not find pre-quantized models anywhere, I followed the instructions found in the rwkv.cpp repo to download, convert and quantize the 1.5B and 0.1B models - I even uploaded them to HuggingFace.

Then, I copied the example found in your docs, added a path to my quantized model, changed the template, and tried to run the result.

Unfortunately, I got nothing but an error message:

llama.cpp: loading model from /Users/andreas/rozek/AI/RWKV/RWKV-5-World-0.1B-v1-20230803-ctx4096-Q4_1.bin
error loading model: unknown (magic, version) combination: 67676d66, 00000065; is this really a GGML file?
llama_init_from_file: failed to load model
node:internal/process/promises:288
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[Error: Failed to initialize LLama context from file: /Users/andreas/rozek/AI/RWKV/RWKV-5-World-0.1B-v1-20230803-ctx4096-Q4_1.bin] {
  code: 'GenericFailure'
}

Node.js v18.17.0

Do you have any idea what could be wrong?
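For what it's worth, the (magic, version) pair from the error message can be decoded with a few lines of standalone Python (a sketch unrelated to llama-node's own code; the synthetic header below just reproduces the two values from the log):

```python
import os
import struct
import tempfile

def read_header(path):
    """Return the first two little-endian uint32s of a model file."""
    with open(path, "rb") as f:
        magic, version = struct.unpack("<II", f.read(8))
    return magic, version

# Build a synthetic 8-byte header matching the two values from the
# error message above (the real model file is not reproduced here).
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(struct.pack("<II", 0x67676D66, 0x65))
    path = tmp.name

magic, version = read_header(path)
# 0x67676d66 spells "ggmf" in ASCII; 0x65 is 101 decimal.
print(f"magic=0x{magic:08x} ({struct.pack('>I', magic).decode('ascii')!r}), version={version}")
# → magic=0x67676d66 ('ggmf'), version=101
os.unlink(path)
```

So the header is not one that llama.cpp's GGML loader recognizes, which suggests the wrong backend is trying to open the file.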

@rozek
Author

rozek commented Nov 9, 2023

I just learned that RWKV-5 models are not yet supported by rwkv.cpp.

So I tried RWKV-4 instead - took the .pth model and converted it to .bin following the docs. Unfortunately, however, the result is the same:

llama.cpp: loading model from /Users/andreas/rozek/AI/RWKV/RWKV-4-World-0.1B-v1-20230520-ctx4096.bin
error loading model: unknown (magic, version) combination: 67676d66, 00000065; is this really a GGML file?
llama_init_from_file: failed to load model
node:internal/process/promises:288
            triggerUncaughtException(err, true /* fromPromise */);
            ^

[Error: Failed to initialize LLama context from file: /Users/andreas/rozek/AI/RWKV/RWKV-4-World-0.1B-v1-20230520-ctx4096.bin] {
  code: 'GenericFailure'
}

Node.js v18.17.0

Using the same model with rwkv.cpp itself (python python/generate_completions.py /rwkv/RWKV-4-World-0.1B-v1-20230520-ctx4096.bin) works, however

@yorkzero831
Collaborator

Hi there, could you please check your ggml version? It may not work if you are using a recent ggml version.

@rozek
Author

rozek commented Nov 9, 2023

How do I check my GGML version? I'm using the current version of rwkv.cpp.

@rozek
Author

rozek commented Nov 9, 2023

I just found a section in the rwkv.cpp README.md which says:

⚠️ Python API was restructured on 2023-09-20, you may need to change paths/package names in your code when updating rwkv.cpp.

Could this be the reason for the misbehaviour?

@rozek
Author

rozek commented Nov 10, 2023

FYI: I just used the version of rwkv.cpp from Sept 20th (before they restructured the Python API) and tried again - with the same results.

Which means: no, the API restructuring is not the reason the RWKV model fails to load.

@rozek
Author

rozek commented Nov 11, 2023

FYI: going back to the latest rwkv.cpp commit before "update ggml" fails because the resulting code cannot be compiled.

Thus, actually testing whether "llama-node" works with RWKV means going back to commit "update ggml" (8db73b1) and manually reverting any GGML-related changes.

Damn...

Not being a C++ developer, I have to give up here - I'll mention this problem in rwkv.cpp as well (see issue 144); let's see who will be able to fix it.

@saharNooby

Hi!

The module rwkv-cpp in llama-node explicitly points to a specific version of rwkv.cpp: rwkv.cpp @ 363dfb1. In turn, this version of rwkv.cpp explicitly points to a specific version of ggml: ggml @ 00b49ec. For all of this to work, I highly recommend not using the newest or arbitrary versions of the packages and sticking to the ones that are explicitly referenced - that way, everything should be compatible with each other.
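A checkout sequence that pins both commits might look like this (a sketch: the repository URL is an assumption, and whether a plain submodule update lands exactly on ggml @ 00b49ec should be verified against the actual source llama-node references):

```shell
# Sketch: pin rwkv.cpp and its bundled ggml to the commits referenced above.
# The repository URL is an assumption - use whatever source llama-node's
# rwkv-cpp module actually points to.
git clone https://github.com/saharNooby/rwkv.cpp
cd rwkv.cpp
git checkout 363dfb1
# Move the ggml submodule to the commit recorded at this revision
# (expected to be 00b49ec, per the comment above):
git submodule update --init --recursive
```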

@saharNooby

If it helps with debugging: for some reason it is llama.cpp that loads the RWKV file, not rwkv.cpp:

llama.cpp: loading model from /Users/andreas/rozek/AI/RWKV/RWKV-4-World-0.1B-v1-20230520-ctx4096.bin

@rozek
Author

rozek commented Nov 11, 2023

That was quick - thank you very much.

Unfortunately, I cannot get rwkv.cpp @ 363dfb1 to compile.

Unless I manage to find out why, I may have to wait for RWKV-5 support.

Nevertheless, thank you very much for your effort!

@rozek
Author

rozek commented Nov 11, 2023

FYI: I managed to compile rwkv.cpp again - my mistake was to git reset --hard only rwkv.cpp itself, but not the included ggml submodule...

Now I'm trying to use it - a first attempt with the current version of llama-node failed with the same error message as before.

Let's see what the detail "llama.cpp: loading model" means.

@rozek
Author

rozek commented Nov 11, 2023

Ok, I think I have to give up - now RWKV crashes with

Unsupported file version 101
/Users/runner/work/llama-node/llama-node/packages/rwkv-cpp/rwkv-sys/rwkv.cpp/rwkv.cpp:195: version == RWKV_FILE_VERSION
[Sat, 11 Nov 2023 13:57:54 +0000 - WARN - tokenizers::tokenizer::serialization] - Warning: Token '                        ' was expected to have ID '50254' but was given ID 'None'
  [... the same warning repeats for the whitespace tokens with IDs 50255 through 50275 ...]
[Sat, 11 Nov 2023 13:57:54 +0000 - WARN - tokenizers::tokenizer::serialization] - Warning: Token '  ' was expected to have ID '50276' but was given ID 'None'
[Sat, 11 Nov 2023 13:57:54 +0000 - INFO - rwkv_node_cpp::context] - AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 | 
zsh: segmentation fault  node RWKV.mjs

I installed llama-node using

npm install llama-node
npm install @llama-node/rwkv-cpp

which seems to be wrong anyway, as the RWKV inference example refers to a file (20B_tokenizer.json) that is only found within the node_modules/llama-node folder and should not have to be referenced from there

@yorkzero831
Collaborator

@rozek I think this is because your RWKV model was quantized with the wrong version of rwkv.cpp; you could make one last attempt at quantizing the model file with rwkv.cpp @ 363dfb1.
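For reference, a re-quantization run with that pinned revision might look roughly like this (a sketch: the script paths, file names, and the float16/Q4_1 arguments are assumptions - check the rwkv.cpp README at commit 363dfb1 for the exact invocation):

```shell
# Inside an rwkv.cpp checkout at commit 363dfb1 (with the matching ggml
# submodule). Script paths and format arguments are assumptions; verify
# against the README at that commit before running.
python rwkv/convert_pytorch_to_ggml.py \
    RWKV-4-World-0.1B-v1-20230520-ctx4096.pth \
    RWKV-4-World-0.1B-v1-20230520-ctx4096.bin float16
python rwkv/quantize.py \
    RWKV-4-World-0.1B-v1-20230520-ctx4096.bin \
    RWKV-4-World-0.1B-v1-20230520-ctx4096-Q4_1.bin Q4_1
```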

@yorkzero831
Collaborator

FYI: only rwkv-4-raven has been tested

@rozek
Author

rozek commented Nov 11, 2023

Well, I meanwhile used rwkv.cpp @ 363dfb1 with ggml @ 00b49ec, as mentioned above.

But, as described before

  • the example refers to a file from node_modules (which means that simply installing llama-node with its rwkv backend is not sufficient)
  • it crashes with a "segmentation fault" (oh god, I stopped using C/C++ over 2 decades ago exactly because of these meaningless error messages)

@yorkzero831
Collaborator

and it has been tested with "llama-node": "^0.1.6" - lol, maybe too old
