
whatever I try, no model loads #65

Open
simteraplications opened this issue Apr 22, 2023 · 22 comments
Labels
bug Something isn't working

Comments

@simteraplications

I downloaded the models from the link provided on the version 1.05 release page, but whatever I try, it always says "couldn't load model". I tried the ggml-model-q4_0.bin file, but nothing loads, on both Windows and Mac. It doesn't give me a proper error message; it just says "couldn't load model".

@ItsPi3141
Owner

You need to download the q4_1 file, not q4_0.

@simteraplications
Author

I used the following link: https://huggingface.co/Pi3141/alpaca-7b-native-enhanced/blob/main/ggml-model-q4_1.bin
It doesn't work; it just says it can't load.
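
A quick sanity check for a download like this is to look at the file's magic bytes: ggml-family models start with a four-byte magic that identifies the container revision, and a loader built for an older revision will refuse a newer file. A minimal sketch, assuming a shell with xxd available and the filename from the link above:

# Dump the first four bytes of the model file. ggml-family files start
# with a uint32 magic ('ggml', 'ggmf', or 'ggjt'); it is stored
# little-endian, so xxd prints the letters reversed, e.g. "tjgg" for ggjt.
xxd -l 4 ggml-model-q4_1.bin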

@ItsPi3141 added the bug label on Apr 23, 2023
@ItsPi3141
Owner

> I tried so many models and they either fail to load or they never write anything at all. I used Kobold and the models work fine, so I dunno what I'm doing wrong. I like this tool a lot, but it never actually worked for me.

Where exactly did you get the models from?

@ItsPi3141
Owner

> from the link on the releases page
> https://huggingface.co/Pi3141

And you're using q4_1, right?

@penghe2021

Maybe you can show the terminal log if you are using Mac or Linux; that will make things clearer.

@ItsPi3141
Owner

> That one works. I guess it's just really slow? Also, it doesn't seem to take into account other stuff running on my PC, because it's running at 100%, my music now has these little skips in the audio, and my PC is unstable.
>
> I don't remember the Kobold UI being so extreme; I could multitask with other stuff. Also, Kobold shows me the tokens being read in real time, which was really good feedback that it was doing something, but with Alpaca Electron I can't tell if the window is stuck or if it's actually doing anything. I really wish there was some text down there that said "Processing characters: 1 of 5000" or something like that. It would improve the usability by 200%.
>
> This is just kinda annoying to look at, and it doesn't tell me anything; it just made me assume it was frozen.

I'll consider adding the characters-processed counter. Most of this stuff is down to llama.cpp, though. I have no control over the CPU usage; I'm just making the frontend for it.
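
One knob that does exist at the llama.cpp level is the thread count: the stock main binary accepts a -t/--threads flag, so fewer cores are left saturated. A minimal sketch, assuming a llama.cpp build from around this time; the model path and thread count are illustrative, not from the thread:

# Run llama.cpp interactively with a reduced thread count so other
# programs stay responsive.
./main -m ./models/ggml-model-q4_1.bin -t 4 -i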

@ItsPi3141
Owner

> I think you should add it, or you are going to get more people reporting the models as broken.

Actually, I can't. llama.cpp doesn't show how many tokens of the prompt have been processed.

What I'll do about people reporting that the model is broken is make it a rule that people cannot open an issue unless they have waited at least 1 hour for a response from the model, to make sure that it's not just their computer.

Because if a model can't be loaded, the app will notify you. It only freezes in rare edge cases.

@simteraplications
Author

I tried all these models and none of them work; everything just says "couldn't load model". How do I find the terminal logs? I am using the macOS arm64 build.

@ItsPi3141
Owner

> Bruh, nobody is ever gonna wait one hour; they will just find another tool.

Yeah, good luck to them finding a different tool that's faster than llama.cpp. If it takes that long for llama.cpp to run for them, then their CPU specs are probably not good, so it would also make sense that they wouldn't have a GPU, or that their GPU wouldn't be powerful enough.

@simteraplications
Author

Where can I find the terminal logs on Mac?

@penghe2021

> Where can I find the terminal logs on Mac?

Sorry, I didn't test it on Mac before. I just assumed that when we run the command in a terminal, it would display some info like this:

//> llama_model_load_internal: format     = ggjt v1 (latest)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 3 (mostly Q4_1)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size =  59.11 KB
llama_model_load_internal: mem required  = 6612.57 MB (+ 1026.00 MB per state)

//> llama_init_from_file: kv self size  = 1024.00 MB

system_info: n_threads = 8 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 | 
main: interactive mode on.
Reverse prompt: 'User:'
Reverse prompt: '### Instruction:
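
On the packaged macOS build, double-clicking the .app gives no console, but launching the executable inside the bundle from Terminal keeps stdout/stderr attached, so output like the log above stays visible. A sketch under the assumption that the app lives in /Applications; the exact bundle and binary names may differ:

# Run the app's binary directly instead of double-clicking the .app,
# so llama.cpp's load messages print to the Terminal window.
# The path below is an assumption, not confirmed from this thread.
"/Applications/Alpaca Electron.app/Contents/MacOS/Alpaca Electron"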

@ItsPi3141
Owner

> Sorry, I didn't test it on Mac before. I just assumed that when we run the command in a terminal, it would display some info like this: [log quoted above]

That's normal; it's loading the model. Give it some time.

@skidd-level-100

Hey, I had the same problem on Linux (Fedora Silverblue 38), and when I tried to compile it myself, it worked!
I'm also guessing this is the same issue as:
#24
#51
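
For anyone who wants to try the same route, running an Electron app from source usually follows the standard npm workflow; this is a sketch under that assumption, not confirmed build instructions for this repo:

# Clone and run from source with Node.js installed. The repo URL and
# npm scripts are assumed to follow standard Electron conventions.
git clone https://github.com/ItsPi3141/alpaca-electron
cd alpaca-electron
npm install
npm start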

@tinfoil-hat-net

> from the link on the releases page
> https://huggingface.co/Pi3141
>
> And you're using q4_1, right?

What's the difference between q4_1.bin, q4_2.bin, q4_3.bin, etc.?
