Support for Zephyr and other "StableLmForCausalLM" models? #1649
Here is yet another badass model, @minhthuc2502. I'd love to help create a converter, but I'm not an expert. It's the 1.6B version of Zephyr: https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b. It kicks ass for its size. The only other small models with a context size of over 4,000 are … Currently, the only reasonable option to build a chat application with …
[Benchmark tests and charts were collapsed in this excerpt; the legend labeled the CTranslate2-converted models with "ct2" in their names.]
To maybe save you a few minutes, I've gathered the following information:
Based on this snippet, hopefully it wouldn't be too complicated to create a converter for it.
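Whatever the snippet's details turn out to be, the core of such a converter is mapping Hugging Face checkpoint tensor names onto the target spec's names. Here's a rough sketch of that renaming step; the target names on the right are purely hypothetical, not CTranslate2's actual spec layout:

```python
import re

# Purely illustrative mapping: the spec names on the right are
# hypothetical placeholders, not CTranslate2's real variable names.
FIXED = {
    "model.embed_tokens.weight": "embeddings.weight",
    "model.norm.weight": "output_norm.gamma",
    "model.norm.bias": "output_norm.beta",
    "lm_head.weight": "projection.weight",
}

# Per-layer tensors follow a regular pattern, so a few regex rules cover them.
LAYER_RULES = [
    (re.compile(r"^model\.layers\.(\d+)\.self_attn\.(q|k|v|o)_proj\.weight$"),
     r"layer_\1.attention.\2_proj.weight"),
    (re.compile(r"^model\.layers\.(\d+)\.mlp\.(gate|up|down)_proj\.weight$"),
     r"layer_\1.ffn.\2_proj.weight"),
    (re.compile(r"^model\.layers\.(\d+)\.(input|post_attention)_layernorm\.(weight|bias)$"),
     r"layer_\1.\2_norm.\3"),
]

def map_weight_name(hf_name: str) -> str:
    """Translate one Hugging Face tensor name to a (hypothetical) spec name."""
    if hf_name in FIXED:
        return FIXED[hf_name]
    for pattern, replacement in LAYER_RULES:
        if pattern.match(hf_name):
            return pattern.sub(replacement, hf_name)
    raise KeyError(f"no rule for tensor {hf_name!r}")
```

Beyond renaming, a real StableLM loader would also have to handle the architecture's quirks (e.g. partial rotary embeddings and LayerNorm with biases), which is where most of the actual converter work lies.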
Any plans to support conversion of `StableLmForCausalLM` models? I've noticed that they're very good; for example, the new Zephyr model here:
https://huggingface.co/stabilityai/stablelm-zephyr-3b
Amazing performance for a 3B model, much better than Phi-2 IMHO. Support was added to Transformers in version 4.38.0:
https://github.com/huggingface/transformers/releases/tag/v4.38.0
Here's the link to a description of the model architecture to help:
https://huggingface.co/docs/transformers/v4.38.2/en/model_doc/stablelm
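Since StableLM support only landed in Transformers 4.38.0, any conversion script would want to verify the installed version before trying to load the architecture. A minimal, dependency-free version gate (the helper names here are my own, not from any library):

```python
def parse_version(version: str) -> tuple:
    # Keep only the leading numeric release segments,
    # e.g. "4.38.0.dev0" -> (4, 38, 0).
    parts = []
    for piece in version.split("."):
        if piece.isdigit():
            parts.append(int(piece))
        else:
            break
    return tuple(parts)

# StableLM (StableLmForCausalLM) was introduced in Transformers 4.38.0.
MIN_TRANSFORMERS = (4, 38, 0)

def supports_stablelm(version: str) -> bool:
    """Return True if the given transformers version should include StableLM."""
    return parse_version(version) >= MIN_TRANSFORMERS
```

In practice you would feed it `transformers.__version__` and fail early with a clear message if the check returns `False`.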