
[BUG] Question about Qwen models with weight quantization #408

Open
1 task
Cesilina opened this issue May 15, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@Cesilina

Before you submit an issue, please search for existing issues to avoid duplicates.

Issue description:
The models directory in LightLLM includes qwen_wquant, and I'd like to know which versions of the Qwen models this code supports. I downloaded Qwen-7b-chat-AWQ and Qwen1.5-7b-chat-AWQ locally, and both fail with: AttributeError: 'QwenTransformerLayerWeightQuantized' object has no attribute 'q_weight_'
Please provide a clear and concise description of your issue.

Steps to reproduce:

Please list the steps to reproduce the issue, such as:

  1. command 0
  2. command 2
  3. command 3
  4. See error

Expected behavior:

Please describe what you expected to happen.

Error logging:

If applicable, please copy and paste the error message or stack trace here. Use code blocks for better readability.

Environment:

Please provide information about your environment, such as:

  • Using container

  • OS: (Ubuntu 14.04, CentOS7)

  • GPU info:

    • nvidia-smi (e.g. NVIDIA-SMI 525.116.04 Driver Version: 525.116.04 CUDA Version: 12.0)
    • Graphics cards: (e.g. 4090x8)
  • Python: (e.g. CPython3.9)

    • currently, only python>=3.9 is supported
  • LightLLm: (git commit-hash)

    • for container: docker run --entrypoint cat --rm ghcr.io/modeltc/lightllm:main /lightllm/.git/refs/heads/main
  • openai-triton: pip show triton

Additional context:

Please add any other context or screenshots about the issue here.

Language:

Please use English as much as possible for better communication.

@Cesilina Cesilina added the bug Something isn't working label May 15, 2024
@hiworldwzj
Collaborator

@Cesilina There is currently no way to directly load this kind of pre-quantized weights (e.g. AWQ checkpoints). The supported approach is to load the fp16 weights and quantize them to int4 at load time.
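For reference, the "load fp16, then quantize to int4" idea can be sketched as follows. This is an illustrative, self-contained example, not LightLLM's actual implementation: the function names and the symmetric, group-wise scheme here are assumptions for demonstration.

```python
import numpy as np

def quantize_int4_groupwise(w_fp16, group_size=128):
    """Quantize a 2-D fp16 weight matrix to symmetric int4, one scale per group.

    w_fp16: array of shape (out_features, in_features);
    in_features must be divisible by group_size.
    Returns (q, scales): int4 values stored in an int8 array, plus fp32 scales
    of shape (out_features, in_features // group_size).
    """
    out_f, in_f = w_fp16.shape
    assert in_f % group_size == 0
    w = w_fp16.astype(np.float32).reshape(out_f, in_f // group_size, group_size)
    # Symmetric int4 range is [-8, 7]; derive the scale from the max-abs per group.
    scales = np.abs(w).max(axis=-1, keepdims=True) / 7.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid division by zero
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q.reshape(out_f, in_f), scales.squeeze(-1)

def dequantize_int4_groupwise(q, scales, group_size=128):
    """Reconstruct an fp32 approximation of the original weights."""
    out_f, in_f = q.shape
    qg = q.reshape(out_f, in_f // group_size, group_size).astype(np.float32)
    return (qg * scales[..., None]).reshape(out_f, in_f)
```

A pre-quantized AWQ checkpoint, by contrast, already stores packed int4 tensors and scales on disk, so it cannot be fed through a path that expects fp16 weights as input — which is consistent with the AttributeError above.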

@hiworldwzj
Collaborator

Moreover, there are many different per-channel and per-group quantization schemes, so you may need to modify the code yourself to support a specific one.
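The practical difference between those schemes is mainly the granularity of the scale tensors, which a loader must match exactly. A minimal sketch (toy shapes, not LightLLM internals):

```python
import numpy as np

# Toy weight of shape (out_features, in_features).
out_f, in_f, group_size = 8, 256, 64
w = np.random.randn(out_f, in_f).astype(np.float32)

# Per-channel: one scale per output row -> shape (out_f,)
per_channel_scales = np.abs(w).max(axis=1) / 7.0

# Per-group: one scale per group_size slice of the input
# dimension -> shape (out_f, in_f // group_size)
per_group_scales = np.abs(
    w.reshape(out_f, in_f // group_size, group_size)
).max(axis=-1) / 7.0
```

A checkpoint produced under one scheme stores scale tensors with one of these shapes, so loading code written for the other scheme will not find the attributes it expects.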
