error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024
The exact same settings and quantization work for 7B and 13B. Here is my .env:
MODEL_PATH = ""
# if MODEL_PATH is "", default llama.cpp/gptq models
# will be downloaded to: ./models
# Example ggml path:
#MODEL_PATH = "./models/llama-2-7b-chat.ggmlv3.q4_0.bin"
MODEL_PATH = "./models/llama-2-70b-chat.ggmlv3.q4_0.bin"
#MODEL_PATH = "./models/llama-2-13b-chat.ggmlv3.q4_0.bin"

# options: llama.cpp, gptq, transformers
BACKEND_TYPE = "llama.cpp"

# only for transformers bitsandbytes 8 bit
LOAD_IN_8BIT = False

MAX_MAX_NEW_TOKENS = 2048
DEFAULT_MAX_NEW_TOKENS = 1024
MAX_INPUT_TOKEN_LENGTH = 4000

DEFAULT_SYSTEM_PROMPT = ""
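For context on the reported shape mismatch: it is consistent with Llama-2-70B using grouped-query attention, where the model has 64 query heads but only 8 key/value heads, so the `wk` projection is 8192 × 1024 rather than the 8192 × 8192 a loader without GQA support expects. A minimal sketch of that arithmetic (the head counts here come from the published Llama-2-70B architecture, not from this repo):

```python
# Llama-2-70B attention dimensions (published architecture values,
# assumed here for illustration).
hidden_size = 8192   # model embedding width
n_heads = 64         # query heads
n_kv_heads = 8       # key/value heads (grouped-query attention)
head_dim = hidden_size // n_heads  # 128

# The query projection keeps the full width; key/value projections
# shrink to n_kv_heads * head_dim under GQA.
wq_shape = (hidden_size, n_heads * head_dim)     # (8192, 8192)
wk_shape = (hidden_size, n_kv_heads * head_dim)  # (8192, 1024)

print("wq:", wq_shape, "wk:", wk_shape)
```

The 7B and 13B models use standard multi-head attention (n_kv_heads == n_heads), which is why the same loader handles them without complaint.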
@Dougie777 the .env looks good to me. The error might be specific to the 70B model.
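If it helps narrow this down: ggmlv3 70B checkpoints needed explicit grouped-query-attention support in llama.cpp, which older builds exposed as a `-gqa` flag (and llama-cpp-python as an `n_gqa=8` keyword argument). This is a hedged sketch against those older GGML-era CLIs; the exact flag name and whether it is still required depend on the version in use:

```shell
# Hypothetical invocation for a GGML-era llama.cpp build; check
# `./main --help` on your version to confirm the flag exists.
./main -m ./models/llama-2-70b-chat.ggmlv3.q4_0.bin -gqa 8 -p "Hello"
```

If the backend in this repo constructs the model without passing that parameter through, the 70B load would fail with exactly this kind of tensor-shape error while 7B/13B work fine.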