
7B 13B 30B Comparisons #37

Open
enzyme69 opened this issue Apr 11, 2023 · 5 comments
Labels
question Further information is requested

Comments

@enzyme69

I am testing a few models on my machine, M2 Mac.

At first I tried 13B. It's slightly slow, but not bad — about 5-7 words per second — and the answers are actually pretty good. It's not ChatGPT yet, as I could not get a proper answer about Blender Python, but it's pretty good at general Q&A.

I thought 7B would be faster, but somehow its responses and answers are disappointing. I deleted it right away.

30B... is a bit too slow for this machine.

I wonder how we can refine a model to make it run faster and stay more precise on topic?

@DogVanDog

"How we can refine a model" - I think this depends on the model you're using, not on the program. Whether the author can refine all models, I'm not sure.
"Precise on topic" will only happen if parameters like "temp" and "top-p" are exposed as controls in this program (like in other text-generation UI tools). For example, if you run your models through a console tool, you have the ability to control base parameters like these.
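For anyone curious what those parameters actually do under the hood, here is a minimal sketch of temperature + top-p (nucleus) sampling in plain Python. This is an illustration of the general technique, not code from this app; the function name and defaults are my own:

```python
import math
import random

def sample_token(logits, temperature=0.8, top_p=0.9, rng=random.random):
    """Pick a token index from raw logits using temperature + top-p sampling."""
    # Temperature: divide logits before softmax; lower values sharpen the
    # distribution (more "precise"), higher values flatten it (more "creative").
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]  # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-p: keep the smallest set of tokens whose cumulative probability
    # reaches top_p, discarding the unlikely tail.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break

    # Renormalize over the kept tokens and draw one at random.
    kept_total = sum(probs[i] for i in kept)
    r = rng() * kept_total
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

With a low temperature and small top-p, the sampler collapses toward always picking the single most likely token, which is why those knobs control how on-topic the output stays.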

@enzyme69
Author

Hmm... yes, I need to investigate the model more, but I am pretty happy with 13B. It seems pretty smart, just under ChatGPT.

[Screenshot: 2023-04-11 at 11:54:20 pm]

However, this 7B model is definitely broken:
https://huggingface.co/Pi3141/alpaca-lora-7B-ggml/blob/main/ggml-model-q4_1.bin
https://huggingface.co/Pi3141/alpaca-lora-7B-ggml/resolve/main/ggml-model-q4_1.bin

@enzyme69
Author

Is there a recommendation for a Llama or Alpaca model that's the most creative / better at coding?

@enzyme69
Author

[Screenshot: 2023-04-12 at 10:06:20 pm]

On many occasions, if we ask questions that begin with the same words, it repeats the same answer without even thinking. It's a bug.
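One common mitigation for this kind of looping (used by llama.cpp-style samplers as a "repeat penalty"; I don't know whether this app applies it) is to down-weight recently generated tokens before sampling. A minimal sketch, with a hypothetical function name:

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.1):
    """Discourage tokens that already appeared in the recent context window.

    Follows the common convention: positive logits are divided by the
    penalty and negative logits are multiplied by it, so a repeated
    token always loses score relative to fresh tokens.
    """
    out = list(logits)
    for t in set(recent_tokens):
        if out[t] > 0:
            out[t] /= penalty
        else:
            out[t] *= penalty
    return out
```

The sampler would call this on the logits at every step, passing the last N generated token ids, before picking the next token.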

@mlbrnm

mlbrnm commented Apr 17, 2023

Are there any compatible models other than the basic 7b/13b/30b from here?

https://huggingface.co/Pi3141

I've got 30b running on my 5900x/64GB RAM desktop and it's actually pretty usable - maybe 2-3 words per second. I wasn't sure how well that would work.

Curious if there is anything else I can try out. I don't know that much about LLMs, but most don't seem to be in this ggml/.bin format. I searched GGML on HuggingFace, but none of them (at least none that I'm interested in) seem to work. I assume they're the "old format" the model loader references.

EDIT: Actually, I've found one that works: https://huggingface.co/verymuchawful/Alpacino-13b-ggml
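For what it's worth, loaders from that era tell the formats apart by the file's leading magic number. A rough sketch of such a check — the magic values below are the ones llama.cpp historically used (unversioned "ggml", versioned "ggmf", and the newer mmap-able "ggjt"); treat them as an assumption and check your loader's source:

```python
import struct

# Magic values used by llama.cpp-era loaders (assumption; verify against your loader).
GGML_MAGICS = {
    0x67676D6C: "ggml (old, unversioned)",
    0x67676D66: "ggmf (versioned)",
    0x67676A74: "ggjt (newer, mmap-able)",
}

def detect_ggml_format(header: bytes) -> str:
    """Classify a model file by its first 4 bytes (a little-endian uint32 magic)."""
    if len(header) < 4:
        return "not a ggml file"
    (magic,) = struct.unpack("<I", header[:4])
    return GGML_MAGICS.get(magic, "unknown format")
```

Reading the first few bytes of a .bin that refuses to load would at least tell you whether it's an older ggml variant or not a ggml file at all.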

@ItsPi3141 ItsPi3141 added the question Further information is requested label Apr 23, 2023