
whatever I try, no model loads #65

Open
simteraplications opened this issue Apr 22, 2023 · 22 comments
Labels
bug Something isn't working

Comments

@simteraplications

I downloaded the models from the link provided on the version 1.05 release page, but whatever I try, it always says "couldn't load model". I tried the ggml-model-q4_0.bin file, but nothing loads, on both Windows and Mac. It doesn't give me a proper error message; it just says "couldn't load model".

@ItsPi3141
Owner

You need to download the q4_1 file, not q4_0.

@simteraplications
Author

I used the following link: https://huggingface.co/Pi3141/alpaca-7b-native-enhanced/blob/main/ggml-model-q4_1.bin
It doesn't work; it just says it can't load.
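
A quick sanity check for a download like this is to look at the file's magic bytes: ggml-family models start with a four-byte magic that identifies the container revision, and a loader built for an older revision will refuse a newer file. A minimal sketch, assuming a shell with xxd available and the filename from the link above:

# Dump the first four bytes of the model file. ggml-family files start
# with a uint32 magic ('ggml', 'ggmf', or 'ggjt'); it is stored
# little-endian, so xxd prints the letters reversed, e.g. "tjgg" for ggjt.
xxd -l 4 ggml-model-q4_1.bin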

@ItsPi3141 added the bug label on Apr 23, 2023
@ItsPi3141
Owner

> I tried so many models and they either fail to load or they never write anything at all. I used Kobold and the models work fine, so I dunno what I'm doing wrong. I like this tool a lot, but it never actually worked for me.

Where exactly did you get the models from?

@ItsPi3141
Owner

> from the link on the releases page
> https://huggingface.co/Pi3141

And you're using q4_1, right?

@penghe2021

Maybe you can show the terminal log if you are using Mac or Linux; that will make things clearer.

@ItsPi3141
Owner

> That one works. I guess it's just really slow? Also, it doesn't seem to take into account other stuff running on my PC, because it's running at 100%, my music now has these little skips in the audio, and my PC is unstable.
>
> I don't remember the Kobold UI being so extreme; I could multitask with other stuff. Also, Kobold shows me the tokens being read in real time, which was really good feedback that it was doing something, but with Alpaca Electron I can't tell if the window is stuck or if it's actually doing anything. I really wish there was some text down there that said "Processing characters: 1 of 5000" or something like that. It would improve the usability by 200%.
>
> This is just kinda annoying to look at, and it doesn't tell me anything; it just made me assume it was frozen.

I'll consider adding the characters-processed counter. Most of this stuff is down to llama.cpp, though. I have no control over the CPU usage; I'm just making the frontend for it.
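
One knob that does exist at the llama.cpp level is the thread count: the stock main binary accepts a -t/--threads flag, so fewer cores are left saturated. A minimal sketch, assuming a llama.cpp build from around this time; the model path and thread count are illustrative, not from the thread:

# Run llama.cpp interactively with a reduced thread count so other
# programs stay responsive.
./main -m ./models/ggml-model-q4_1.bin -t 4 -i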

@ItsPi3141
Owner

> I think you should add it, or you are going to get more people reporting the models as broken.

Actually, I can't. llama.cpp doesn't show how many tokens of the prompt have been processed.

What I'll do about people reporting that the model is broken is make it a rule that people cannot open an issue unless they have waited at least 1 hour for a response from the model, to make sure that it's not just their computer.

Because if a model can't be loaded, the app will notify you. It only freezes in rare edge cases.

@simteraplications
Author

I tried all these models and none of them work; everything just says "couldn't load model". How do I find the terminal logs? I am using the macOS arm64 build.

@ItsPi3141
Owner

> Bruh, nobody is ever gonna wait one hour; they will just find another tool.

Yeah, good luck to them finding a different tool that's faster than llama.cpp. If it takes that long for llama.cpp to run for them, then their CPU specs are probably not good, so it would also make sense that they wouldn't have a GPU, or that their GPU wouldn't be powerful enough.

@simteraplications
Author

Where can I find the terminal logs on Mac?

@penghe2021

> Where can I find the terminal logs on Mac?

Sorry, I didn't test it on Mac before. I just assumed that when we run the command in a terminal, it would display some info like this:

//> llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 3 (mostly Q4_1)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =  59.11 KB
llama_model_load_internal: mem required  = 6612.57 MB (+ 1026.00 MB per state)

//> llama_init_from_file: kv self size  = 1024.00 MB

system_info: n_threads = 8 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 
main: interactive mode on.
Reverse prompt: 'User:'
Reverse prompt: '### Instruction:
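
On the packaged macOS build, double-clicking the .app gives no console, but launching the executable inside the bundle from Terminal keeps stdout/stderr attached, so output like the log above stays visible. A sketch under the assumption that the app lives in /Applications; the exact bundle and binary names may differ:

# Run the app's binary directly instead of double-clicking the .app,
# so llama.cpp's load messages print to the Terminal window.
# The path below is an assumption, not confirmed from this thread.
"/Applications/Alpaca Electron.app/Contents/MacOS/Alpaca Electron"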

@ItsPi3141
Owner

> Sorry, I didn't test it on Mac before. I just assumed that when we run the command in a terminal, it would display some info like this: [log quoted above]

That's normal; it's loading the model. Give it some time.

@skidd-level-100

Hey, I had the same problem on Linux (Fedora Silverblue 38), and when I tried to compile it myself, it worked!
I'm also guessing this is the same issue as:
#24
#51
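
For anyone who wants to try the same route, running an Electron app from source usually follows the standard npm workflow; this is a sketch under that assumption, not confirmed build instructions for this repo:

# Clone and run from source with Node.js installed. The repo URL and
# npm scripts are assumed to follow standard Electron conventions.
git clone https://github.com/ItsPi3141/alpaca-electron
cd alpaca-electron
npm install
npm start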

@tinfoil-hat-net

> from the link on the releases page
> https://huggingface.co/Pi3141
>
> And you're using q4_1, right?

What's the difference between q4_1.bin, q4_2.bin, q4_3.bin, etc.?
